Stanford Law’s Michelle Mello, Professor of Law and Health Policy, testified before the United States Senate Committee on Finance for the full committee hearing on “Artificial Intelligence and Health Care: Promise and Pitfalls.” Below is her video testimony (beginning at 23:34) and written testimony.
“I have the extraordinary privilege of being part of a group of ethicists, data scientists, and physicians at Stanford University—long a leading hub of AI innovation—that is directly involved in governing how healthcare AI tools are used in patient care. I have studied patient safety, healthcare quality regulation, and data ethics for more than two decades. I apply that expertise in our team’s evaluations of all AI tools proposed for use in Stanford Health Care facilities, which care for over 1 million patients per year, and our recommendations about whether and how they can be used safely and effectively. I would like to share the three most important things we’ve learned so far.
First, while hospitals are starting to recognize the need to vet AI tools before use, most healthcare organizations don’t have robust review processes yet. Some, like Stanford, have plentiful resources to draw on; others don’t. All need help. Although as a lawyer I know that more law isn’t always the answer, in this case there is much that Congress could do to help.
Second, to be effective, governance can’t focus only on the algorithm. It must also encompass how the algorithm is integrated into clinical workflow. By “workflow,” I mean how physicians, nurses, and other staff interact with each other, the AI tool, the patient, and other systems. Currently, conversations about regulating healthcare AI mostly focus on the AI tool itself—for example, is its output biased? How often does it make wrong predictions or misclassify things? These things matter. But it is equally important to consider how medical professionals will interact with the tool. A key area of inquiry is the expectations placed on physicians and nurses to evaluate whether AI output is accurate for a given patient, given the information readily at hand and the time they will realistically have. For example, large language models like ChatGPT are employed to compose summaries of clinic visits and doctors’ and nurses’ notes, and to draft replies to patients’ emails. Developers trust that doctors and nurses will carefully edit those drafts before they’re submitted—but will they? Research on human-computer interaction shows that humans are prone to automation bias: we tend to over-rely on computerized decision support tools and fail to catch errors and intervene where we should.
Therefore, regulation and governance should address not only the algorithm, but also how the adopting organization will use and monitor it. To take a simple analogy, if we want to avoid motor vehicle accidents, we can’t just set design standards for cars. Road safety features, driver’s licensing requirements, and rules of the road all play important roles in keeping people safe.
Third, because the success of AI tools depends on the adopting organization’s ability to support them through vetting and monitoring, the federal government should establish standards for organizational readiness and responsibility to use healthcare AI tools, as well as for the tools themselves. As countless historical examples of medical innovations have shown, having good intentions isn’t enough to protect against harm. The community needs some guardrails and guidance…”