Redefining the NIST Definition of AI “Trustworthiness”

October 25, 2020
By Eran Kahana

According to NIST, an AI application’s trustworthiness value is derived from the following variables:

1. Accuracy
2. Explainability
3. Resiliency (robustness*)
4. Safety
5. Reliability
6. Objectivity (unbiased*)
7. Security (risk managed*)

For starters, this list can be shortened from seven variables to four: Explainability, Safety, Objectivity and Security. We start by carefully examining item number 2, Explainability. This variable ties directly to explainable AI (XAI), which, when properly designed, delivers Perfect information. (For more on this concept, see my post The Role of Explainable AI (XAI) in Regulating AI Behavior: Delivery of “Perfect” Information.) In a nutshell, Perfect information is the key. It is information that is (1) relevant, (2) easily understood, and (3) not prone to misrepresentation. At a minimum, these characteristics capture variables 1 (Accuracy), 3 (Resiliency) and 5 (Reliability), folding them neatly into number 2.

Let’s not stop there; this list can be shortened even more. Take variable number 6, Objectivity. On its own, Objectivity is superfluous, as it is mostly already achieved through the three elements that make up Perfect information. So now we’re down to three variables: Explainability, Safety and Security.

And we can reduce this list even further. Safety and Security overlap, even if incompletely, so it is not necessary to have two separate variables. Furthermore, when information is easily understood and is not prone to misrepresentation, it is fair to say that it contains within it safety and security features. This means that we can treat safety and security as a subset within the definition of Perfect information. And so we’re down to a single variable: Explainability.
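For readers who like to see the bookkeeping, the three reductions described above can be sketched in a few lines of Python. This is only an illustration of the argument, not a scoring method: the names `NIST_VARIABLES`, `REDUCTION_STEPS` and `reduce_variables` are hypothetical and do not come from NIST or ISO.

```python
# Illustrative only: these names and structures are hypothetical,
# not part of any NIST or ISO publication.

NIST_VARIABLES = {
    1: "Accuracy",
    2: "Explainability",
    3: "Resiliency",
    4: "Safety",
    5: "Reliability",
    6: "Objectivity",
    7: "Security",
}

# Each step folds a set of variables into a surviving one, as argued in the text.
REDUCTION_STEPS = [
    ({1, 3, 5}, 2),  # captured by the characteristics of Perfect information
    ({6}, 2),        # Objectivity: mostly achieved through Perfect information
    ({4, 7}, 2),     # Safety and Security: a subset of Perfect information
]


def reduce_variables(variables, steps):
    """Return the surviving variable names before and after each folding step."""
    remaining = dict(variables)
    history = [[remaining[k] for k in sorted(remaining)]]
    for folded, target in steps:
        if target not in remaining:
            raise ValueError(f"target variable {target} was already folded away")
        for number in folded:
            remaining.pop(number)
        history.append([remaining[k] for k in sorted(remaining)])
    return history


history = reduce_variables(NIST_VARIABLES, REDUCTION_STEPS)
for stage, names in enumerate(history):
    print(f"after step {stage}: {', '.join(names)}")
```

Running the sketch lists seven variables at step 0, then four, then three, and finally Explainability alone, which mirrors the sequence of reductions in the preceding paragraphs.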

All of this flows into my view that a properly deployed XAI, the essence of NIST’s variable 2, is the key to determining whether, and how much, a particular AI application is trustworthy. Not only is there no need for additional metrics; their existence is potentially harmful, as it is likely to contribute to analytical dissonance and yield inaccurate assessments of what is trustworthy and to what degree.

Of course, “trustworthy” is a dynamic term and needs to match the type of AI application being deployed. This is where the concept of iterative liability (which I wrote about here) comes into play. I won’t go into much detail on this now, but I’ll leave you with the following thought: Trustworthiness needs to sync with iterative liability. As the concentric circles of subsequent versions of the AI application propagate, how well does the XAI engine maintain its integrity?

________

* ISO/IEC JTC 1/SC 42 Artificial Intelligence defines AI “trustworthiness” with the three variables marked with an asterisk above.

***Postscript***

April 22, 2021: Reinforcement, supervised, and unsupervised ML can each negatively affect the trustworthy character of an AI app. But from an iterative liability perspective, unsupervised learning is more prone to strain trustworthiness. Part of the challenge here is that, by definition, outcomes are relatively more difficult to predict in these apps. Part of the solution could be drawn from proper use of XAI, specifically the audit function I described in Mitigating Liability with XAI: The Case for Standardization. But even with that process in place, the trustworthiness feature remains vulnerable to a relatively more nuanced damages measurement. It is therefore necessary to understand and accept that an iterative liability model is required for examining harm resulting from these types of ML apps. This, in turn, suggests that certain unsupervised ML apps should only be used by licensed entities, a requirement that could help manage attendant liability issues. Of course, the license structure (what are the criteria, who grants it, how is it managed, etc.) is a complex undertaking, but I think a useful starting point would be to look at the certification principles found in standard-setting organizations (e.g., ISO).