Responsible AI Development: Complying with the “Safety” Core Principle

When it comes to the responsible development of AI, there are many attributes (35, by my count) that I have labeled "core principles" and that need to be considered. To be clear, this does not mean that all of the core principles are relevant to every AI application; they are not. But working through all of them can be important for a developer seeking to demonstrate compliance with best practices and to limit or eliminate liability.

One of the core principles is "safety." Because it is a broad term, it is important to clarify how it should be applied. So let's take a closer look.

At a high level, “safety” refers to the developer maintaining policies and procedures that minimize the possibility of the AI behaving in an unintended manner. These policies and procedures control identified risks through four primary undertakings:

(1) physical and software-based controls that monitor and limit access to sensitive areas related to application development (e.g., the source code);

(2) secure development practices (e.g., documented change management, data flows, and data sources, along with the ability to detect unauthorized access and/or tampering; see the sketch after this list);

(3) maintaining a current development asset inventory (e.g., personnel, devices, systems, and facilities); and

(4) periodic audits of the above.
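
To make item (2) concrete, here is a minimal sketch of one tamper-detection technique: recording SHA-256 hashes of source files in a baseline manifest and later comparing the tree against that baseline. This is an illustration under stated assumptions, not a prescribed control; the `src` directory, `manifest.json` file, and function names are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def hash_file(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(source_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its hash (the baseline)."""
    return {
        str(p.relative_to(source_dir)): hash_file(p)
        for p in sorted(source_dir.rglob("*"))
        if p.is_file()
    }


def detect_tampering(source_dir: Path, manifest_path: Path) -> list[str]:
    """Compare the current tree against a saved baseline; report anomalies."""
    baseline = json.loads(manifest_path.read_text())
    current = build_manifest(source_dir)
    findings = []
    for rel, digest in current.items():
        if rel not in baseline:
            findings.append(f"added (unauthorized?): {rel}")
        elif baseline[rel] != digest:
            findings.append(f"modified: {rel}")
    for rel in baseline:
        if rel not in current:
            findings.append(f"deleted: {rel}")
    return findings


if __name__ == "__main__":
    src = Path("src")  # hypothetical source tree
    manifest = Path("manifest.json")  # hypothetical baseline location
    if not manifest.exists():
        manifest.write_text(json.dumps(build_manifest(src), indent=2))
        print("Baseline manifest written.")
    else:
        for finding in detect_tampering(src, manifest):
            print(finding)
```

In practice, a developer would pair a check like this with access logs and signed commits, but even a simple baseline comparison generates the kind of documented evidence that items (2) and (4) contemplate.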

To increase the likelihood of successfully demonstrating compliance with the "safety" core principle, the developer should show that its practices address all four of these elements, not just a select few.
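
One way to operationalize the "all four, not a select few" point is to track evidence for each element explicitly. The sketch below is a hypothetical checklist structure; the field names are illustrative and not drawn from any standard.

```python
from dataclasses import dataclass, fields


@dataclass
class SafetyEvidence:
    """One flag per undertaking; True means documented evidence exists."""
    access_controls: bool      # (1) physical and software-based controls
    secure_development: bool   # (2) change management, tamper detection
    asset_inventory: bool      # (3) current inventory of development assets
    periodic_audits: bool      # (4) audits of items (1)-(3)


def demonstrates_safety(evidence: SafetyEvidence) -> bool:
    # Compliance requires every element, not a select few.
    return all(getattr(evidence, f.name) for f in fields(evidence))
```

A compliance report built on a structure like this fails unless every element is evidenced, which mirrors the requirement stated above.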

***Postscript***

August 31, 2022: The second draft of the NIST AI Risk Management Framework states that a trustworthy AI system is one that is "valid and reliable, safe, fair and bias managed, secure and resilient, accountable and transparent, explainable and interpretable, and privacy-enhanced." This is only a partial view of what "trustworthy" means; to get the full picture, each of its component attributes must be dissected. Take, for example, the "safe" attribute and the discussion above analyzing its composition, then fold that analysis back into the definition of "trustworthy." Doing so makes it easier to appreciate just how layered the quality of "trustworthy" is, and it makes the term more useful as a guideline for actually delivering this type of AI.