AI in Government and Governing AI: A Discussion with Stanford’s RegLab


Joining Pam and Rich for this discussion are Professor Daniel Ho and RegLab Fellow Christie Lawrence, JD ’24 (MPP, Harvard Kennedy School of Government).

Dan is the founding director of Stanford’s RegLab (Regulation, Evaluation, and Governance Lab), which builds high-impact partnerships for data science and responsible AI in the public sector. The RegLab has an extensive track record partnering with government agencies like the Environmental Protection Agency, Internal Revenue Service, the U.S. Department of Labor, and Santa Clara County on prototyping and evaluating AI tools to make government more fair, efficient, and transparent. Building on this work, the RegLab also helps agencies strengthen AI governance and operationalize trustworthy AI principles.

Christie, a third-year JD student, worked with RegLab and Stanford’s Innovation Clinic on projects to advise DOL on responsible AI development practices and to support Prof. Ho’s work on the National AI Advisory Committee, which advises the White House on AI policy. In this interview, we’ll learn about several RegLab projects—and the importance of helping government develop smart AI policy and solutions.


Transcript

Daniel Ho: We’ve seen rapid acceleration of where technology is going, most vividly, as people saw about a year and a half ago, with the release of something like ChatGPT. At the same time, we’ve seen voluminous evidence of the kinds of risks that people aren’t aware of, ranging from bias to privacy harms, where large models, for instance, can leak people’s social security numbers, to concerns even about catastrophic forms of risk.

Pam Karlan: This is Stanford Legal, where we look at the cases, questions, conflicts, and legal stories that affect us all every day. I’m Pam Karlan with Rich Ford. Please subscribe or follow this feed on your favorite podcast app. That way you’ll have access to all of our new episodes as soon as they’re available.

Today we’re taping in front of a live audience during Stanford’s admissions weekend. And our guests are my colleague, Dan Ho, who’s a professor here at the law school and also a professor of political science and of computer science, and director, among other things, of the Stanford RegLab. And Christie Lawrence, who is a member of the class of ’24 and has been very involved in working on RegLab issues here at Stanford.

Rich Ford: So, Dan, Christie, thanks for being on the show. You know, it’s hard to pick up a newspaper these days without reading something about AI, whether it’s a threat to the world that’s going to take all of our jobs or something that’s going to usher us into some great utopian future. And you’ve been working on these issues quite a bit. Can you tell us what’s going on with the relationship between AI and governance and how people at Stanford are working on those issues?

Daniel Ho: Well, thanks so much for having us on, Rich and Pam. Podcast fame has always eluded me, so I’m really happy to be able to join you. To take a cut at your question, Rich: There are several things that have been going on. One is that we’ve seen rapid acceleration of where technology is going, most vividly, as people saw about a year and a half ago, with the release of something like ChatGPT. At the same time, we’ve seen voluminous evidence of the kinds of risks that people aren’t aware of, ranging from bias to privacy harms, where large models, for instance, can leak people’s social security numbers, to concerns even about catastrophic forms of risk. Look no further than legal practice, where one lawyer, who probably very much regrets it at this point, is now referred to as the ChatGPT lawyer; he submitted a brief that had about a half dozen bogus case citations and was sanctioned by the court.

So we’re all very much grappling with the rise of AI, and what’s been going on in AI policy and governance is a response, both within the European Union, but then also very much here in the United States, where in November the Biden administration issued one of the largest executive orders to date around AI, which announced a kind of “whole of government” approach. There are about 150 things that agencies have to do to address things like bioweapons risk, employment discrimination, hiring into the government, and immigration reform to bring AI talent into the country. And that’s been followed up also by an implementation memo from the Office of Management and Budget.

Rich Ford: So you started the RegLab about five years ago. Tell us a little bit about what the RegLab is, what you do, and how your work at Stanford’s kind of been on the cutting edge of these issues around AI.

Daniel Ho: Yeah, thanks, Rich. We started the RegLab around five years ago, really as we were working with government agencies and realizing that, just as we saw the technology really taking off, government was increasingly behind in actually bringing in the expertise and having the digital infrastructure to really think about the implications for government itself. So just to give you some examples to illustrate that: One survey shows that less than one percent of AI PhDs consider a career in public service. The Government Accountability Office had a report showing that 80% of funding for information technology goes to legacy technology, to support code, for instance, that was built in 1959 for a lot of large claims agencies.

And in 2016, the Government Accountability Office reported that the Department of Defense was still storing nuclear missile safety codes on eight-inch floppy disks. So when government can’t actually have modern forms of technology and understand this technology, the worry for us has been that that poses a really significant risk. And so we founded the RegLab really to bring together lawyers, technologists, social scientists, and a range of expertise across campus to really think about these issues. The thing that I like to say is that government cannot govern AI if it doesn’t understand AI.

Rich Ford: So can you give us some examples? I mean, now I have this image of people in government trying to program their VCR the way my mother does. And, you know, using floppy disks, and you’re working with them in order to bring them into the 21st century. Can you give us an example of a specific agency you worked with?

Daniel Ho: Sure. We started a collaboration around four years ago with the Internal Revenue Service, which faces an annual tax gap above 500 billion dollars, the difference between taxes owed and taxes paid. And they had been really trying to think about the implications of this technology for better understanding where tax evasion is actually happening. And so we got brought in, and there was a wonderful student here who led this project, who’s now teaching at Princeton, really to bring in ways of thinking about machine learning that could identify tax evasion much more effectively.

At the same time that we were doing this work, everyone recognized that it was going to be really important to think about safeguards as you’re thinking about forms of machine learning in tax administration. The problem that the IRS faced, like many other federal agencies, was that it didn’t have basic information about race and ethnicity. On your 1040, your individual income tax return, you don’t fill out demographic attributes that reveal race and ethnicity. So it took us about a year to develop a framework for how the IRS could assess whether or not the introduction of these kinds of systems was generating disparities.

And what we then discovered after building out that framework was a pretty disturbing finding, which is that Black taxpayers were audited at about three to five times the rate of non-Black taxpayers, a gap that had little to do with differences in underreporting. Within that study, we really tried to drill down on what could potentially be causing that gap. I’m happy to say that the study was released last year, a couple of weeks before now-Commissioner Werfel had his confirmation hearing. He was peppered with questions about the Stanford study and committed at that point in time to figuring out what the corrective action plan would be. Later last year, in the fall, the IRS announced an overhaul of how it was going to audit around the earned income tax credit, which is a social support program, particularly for lower-income taxpayers who have dependents. And that is really the kind of audit attention that has played a pretty large role in creating this disparity across taxpayers.

Rich Ford: Wow, that’s fascinating. So, I mean, people don’t usually think about the relationship between AI and high technology and social justice issues, but this is a wonderful example of how they dovetail and how your work is doing some great things.

Daniel Ho: Can I say one thing on that? Because I think one of the really important conversations has been: we worry so much about how forms of AI can generate disparities, but here it was actually the turn toward thinking about AI that led to a degree of transparency into what was going on with legacy systems.

So these were not fancy forms of AI that the IRS was using. What we were documenting were disparities in the existing audit selection system. So that’s both the sort of potential optimism and also the anxiety that people have about the ways in which these systems can operate.

And then, you know, the kind of overarching thing that I think is so important is that the filing of a tax return, or the filing for an unemployment insurance check, is for most citizens the major touch point with the federal government. And if the federal government can’t get that interaction right, it can contribute to the really significant erosion of public trust that we’ve seen.

Rich Ford: Now Christie, you’ve been involved with AI matters during your time at Stanford Law School. Can you tell us a bit about your work there and give us the student’s perspective?

Christie Lawrence: Yeah, so first, thanks so much for letting me be a part of this wonderful podcast. I remember, years ago around this same weekend, sitting where all of you in the audience are, because I actually watched the taping of this podcast, and I still decided to come here.

For several years before law school, I was working on AI policy. I have a background in international relations and national security and was involved in some of the earlier discussions about what the U.S. federal government should be doing with AI. But there were a lot of policy questions where I really felt like I had to defer to the lawyers. So, you know, if the United States wants to work with France and Germany and the United Kingdom on using AI for pandemic prevention, what are we going to do about the fact that we have different data governance regimes? Or, how should we think about different intellectual property regimes and whether AI should be an “inventor”?

So I really wanted to come to law school and be able to explore that, and I’ve been able to do that at the RegLab. For example, Dan and other professors have biweekly or monthly lunches where law students come together with computer science students, and we’re really able to bridge the gap here at Stanford between policy, law, and technology in a way that I just don’t think other universities really can. And an example of how impactful this work has been, which to our surprise was very humbling: we worked on a paper that looked at how federal government agencies are actually implementing various requirements in the executive orders and laws. There was a lot of ambiguity in those executive orders that we really systematically documented; there’s not a lot of clarity around definitions of AI, or about what it means to be an agency. We released this paper, and you don’t always know how things are going to land. But all of a sudden we were hearing that there was a Congressional hearing about this work, folks were citing our paper, and we had individuals from executive branch agencies saying, thank you for this, and can we look at the underlying research.

Rich Ford: That’s great. So you’ve worked with several different government agencies. Turning to some of the social justice issues: Pam, you had some experience with AI issues as well. Can you tell us a little about that?

Pam Karlan: Yeah, so I took a leave of absence from Stanford to serve at the Justice Department in the Civil Rights Division, and I put together a working group there to deal with issues of AI and algorithmic discrimination. It turns out that it spanned the entire range of projects and programs that DOJ works on. So there’s algorithmic discrimination and the use of AI in the way that people are targeted for advertisements. And one of the things there that’s amazing is that when you have all of this data, even if you tell people you can’t use race, other pieces of the data can reconstruct race. And so the question is, how do you deal with that? And there were questions about bail and detention hearings and the like, because there are proprietary systems of risk assessment. If there were a judge deciding whether to let somebody out on bail or not, and the judge said, you know, I’m going to let you out, Christie, because you’re a woman, but I’m going to keep you in, Dan, because you’re a man, everybody would understand that violates the Constitution.

But the algorithms take race and sex into account, and if they don’t, they’re less accurate in predicting who’s going to commit a crime while out on pretrial release. And so trying to grapple with all of these issues was something that we started to do across the entire Civil Rights Division at DOJ.

Daniel Ho: I mean, DOJ was really ahead of the curve here. I think part of what you see in the recent AI executive order is a push to have more agencies do that kind of strategic planning. And going back to what Christie was saying about really seeing that intersection between law and technology, there’s so much fluency that has to be developed across these fields.

We see a lot of technological solutions, particularly in the algorithmic fairness space, that may be plainly illegal. We see a lot of policy proposals that might just be technically impossible to achieve. Take a lot of the demands for forms of explanation: the science simply doesn’t yet support putting a huge degree of reliance on that as the policy solution. And so I think what we’re really trying to do here, through initiatives like the RegLab and the course offerings, is to bring those two forms of expertise together.

Pam Karlan: Can I jump in with just a plug for another one of your papers that isn’t, on its face, just about AI, and that’s the work that you’ve done with Isaac, who’s another student here, a third year now, I guess, on IPAs, Intergovernmental Personnel Act designations. Could you say a word or two about that? Because I think that’s kind of fascinating.

Daniel Ho: Sure. Well, one of the real challenges I think for government is this sort of talent gap. For lots of areas of science and emerging technology, it can be very difficult for government to actually bring that talent in. And this is a paper with law students and one of our other colleagues, Professor Anne Joseph O’Connell, where we document the ways in which government agencies have really creatively used this obscure act known as the Intergovernmental Personnel Act, which allows you to essentially assign academics and folks from qualified nonprofits into these kinds of positions for term-limited periods of time and really do the kind of strategic planning that, Pam, you’re talking about at DOJ Civil Rights. For instance, ARPA-H, an innovation agency stood up for health innovation, was largely created by folks who came into the government under IPA status. And there are tons of examples like this; the National Science Foundation, for instance, has about 12% of its workforce under the Intergovernmental Personnel Act. It’s something that we’ve now recognized, too, is lacking at the state level, and so we’ve been very much working on creating those kinds of talent exchange mechanisms at the state level. We’re really happy to see that Senator Padilla here in California has actually introduced legislation, after Stanford worked with his team, to have that mechanism for California as well.

Rich Ford: So Dan and Christie, you also did some really fascinating work with the Department of Labor. Can you tell us a little bit about that work?

Daniel Ho: Sure. I’ll turn it to Christie to talk about some of the more policy- and law-oriented work. The way this began was actually also coming out of a policy practicum, which I think was actually the topic of the original podcast that you attended, Christie, when you were an admitted student. We had a practicum that brought together about 30 students, 15 law students and 15 computer science students, to develop a report for the Administrative Conference of the United States to understand how technology and AI was being used within the federal government.

And at that point in time, we had talked to the head of the Office of Medicare Hearings and Appeals, who was then brought in at the Department of Labor. And when she started up this unit, she really wanted to engage Stanford in thinking about how to use AI in an assistive way, given the really significant concerns that exist when you have large volumes of claims adjudication. In law school we focus a lot on Article III. The lesser-known fact is that there’s far more adjudication that actually happens in agencies. An agency like the Social Security Administration can adjudicate half a million cases annually, and there have been really significant pain points in the accuracy and consistency of these kinds of determinations.

And so we got brought in really to help build out a system that would allow these claims examiners to sift through volumes of medical records to find the material that would support the award of workers’ compensation, which is basically cash assistance and medical care for federal workers who are injured on the job. After doing that, we showed, in a kind of trial, how it could improve the accuracy of claims determinations. But that then really led us to thinking more broadly about the implications for how the Department of Labor should think about these kinds of potential AI solutions, given the really significant potential risks. And I think I’ll turn it over to Christie to talk a little bit about the work that we did there.

Christie Lawrence: I should also say that I’m speaking in my personal capacity and not on behalf of any federal entity. But basically, because of the amazing work that RegLab had already been doing with the department on thinking through how to improve their claims adjudication, we wanted to help them think through how they could operationalize principles. I’m sure that a lot of you who’ve been following AI have heard people say, AI should be fair, it should be accurate, it should be explainable. There’s been a proliferation of principles, by some estimates over a hundred different sets, that those who are developing and using AI should follow. But that’s not exactly very actionable.

So we really wanted to think through what would be concrete practices, ways that an agency that’s actually trying to make sure its use of AI is trustworthy can do so. In the claims adjudication space, it’s been said that it’s sort of like benefits roulette: sometimes the largest predictor of the outcome of your claim is simply who your adjudicator was.

So there are a lot of errors that we know about in the existing system, as Dan was pointing to earlier with the IRS. There is a possibility with AI for us to actually correct those errors, not replicate them in society. So, how would you do that in this space? What the RegLab worked with the Department of Labor to do is think through: can we get a bunch of senior claims examiners to look at some existing claims, re-adjudicate them, find the errors, and point to the really important information that would be predictive of, or actually related to, the outcome that we would want in an adjudication? Can we correct that and make sure we have really high-quality data for the AI tool to be trained on? So it’s these sorts of practices of thinking through the pipeline of the development of this AI tool and what can actually be done to correct and minimize these errors.

Daniel Ho: If I could follow up on that, I think one of the key things here is that the really easy thing to do would just be to grab a bunch of historical data and try to train something on top of it. But that is exactly the recipe for propagating well-known historical errors into the future.

And so that, I think, was one of the things that we really pushed on. Two other things just to note here. I think a key principle that we tend to operate by is really thinking about how these kinds of systems can empower humans, and not to be in the mindset where you’re thinking about these things as replacing human discretion, because that discretion can be so important in these kinds of settings.

The other place where law comes in as a really important element in these kinds of collaborations is that these aren’t technical problems alone. Jen Pahlka, in her book Recoding America, talks about all that ails the unemployment insurance system, which 46 million Americans relied upon during the pandemic and which essentially collapsed in its early months. Timeliness of unemployment insurance checks was about 97 percent prior to the pandemic and dropped to about 56 percent right in the early months of the pandemic, when Americans were turning to these checks and were really in dire need. And part of the challenge here is actually the way in which we have designed this benefit system itself and how we administer these kinds of entitlement programs.

So, she has a chapter where she talks about the guy who describes himself as the “new guy.” And, you know, Marina Nitze, who’s interviewing him, goes and says, Well, you keep describing yourself as the new guy. Why do you say that? And he says, Well, I only have 17 years of experience adjudicating these claims, and it’s the people who’ve been at this agency for 25 years who really know the ropes and know how to actually adjudicate these claims.

And so it’s part of this very human problem: as California and other states brought in thousands of new employees to try to do this work, what you were actually doing was taking the people with 25 years of experience off the line to resolve some of these issues. And that’s the area that’s really ripe for potential simplification, reform, and improvement, and it isn’t purely a technical problem.

Christie Lawrence: If I could add on to that, this came up in the Department of Labor context and it’s come up in other work that the RegLab has done. But, you know, procurement: right now, the government needs to be leveraging AI, but, as Senator Gary Peters has said, around half of the AI that’s used in government is actually procured from private companies.

And there’s so much research on how government procurement, particularly in technology, is just really languishing. It’s why you have the IRS and other agencies using systems that are decades old. And so when you’re thinking about what the government should be doing to make sure it’s leveraging AI, it sometimes still requires looking at, well, actually maybe the problem is that there’s a risk-averse culture in procurement, or that there are misconceptions about what the law actually requires. And there are things that can be done, like quality assurance surveillance plans, or procurement officials reaching out more proactively to companies to learn about what AI can do. But it’s not really purely a technical problem.

Rich Ford: This is really fascinating work, and it makes me more optimistic about the future of AI than I was before. If you could just tell us a couple of things about what the future holds, and we’ll wrap it up.

Christie Lawrence: I think that governments in the United States and across the world have really recognized the importance of figuring out what to do with AI. So I think that we are optimistic, at least I am optimistic, about the most recent November executive order and what that’s going to bring to the government. I think they recently announced that there’s going to be a hiring surge of AI technologists.

Daniel Ho: Yeah, I’d say, I’d maybe add two things to that.

I think we’re going to have to pay a lot of attention to how this executive order is actually implemented, because a lot of the work that Christie did shows some of the difficulties, when you don’t have the requisite talent, of actually implementing it with fidelity in a way that really works. And one of the really emerging questions embedded within the executive order is how open an innovation ecosystem we are likely to have. Are we going to see regulation that basically closes down the innovation ecosystem, where the concerns about risk motivate a kind of quasi-licensing regime in which only a small number of companies are able to develop the frontier models? Or will we see something more by way of an open innovation ecosystem, which has traditionally characterized AI? We’ll also likely see the pickup of legislative proposals after the election year; Senator Schumer has been quite involved in that. But in the interim, the last thing I’ll point out in terms of what’s in store is that, as a result of the election year, we’re seeing a lot of action in the states. California recently had its legislative calendar close, and there were 30-plus proposals on AI regulation alone. And just as in other areas, like consumer privacy, California may actually end up leading the way in the short run if the federal government doesn’t act with comprehensive legislation.

Rich Ford: Professor Dan Ho, Christie Lawrence, thanks so much for joining us here on Stanford Legal.

Pam Karlan: We have time for a couple of questions if folks want to ask questions. So, there’s a question over here.

Student question 1: What challenges, if any, do you see for procuring AI or like improved technology with government resource constraints, like financial constraints?

Daniel Ho: I think in a nutshell, the challenge is that if you don’t have folks with technical expertise within government, it’s going to be really hard to have responsible procurement. The worry is that it will be hard to tell the difference, as one agency told us, between AI snake oil and something that is actually going to solve problems. And so I think this is why there’s been such an emphasis on the so-called AI hiring surge, to bring in folks who have the kind of blended expertise that Christie or others who’ve been working on these issues can bring, to actually think about how to move the procurement system in a way that really responds to the technology. Part of this is really a challenge of legacy laws. We have a procurement system that’s been built up with a kind of technology acquisition in mind where you’re procuring for an agency, say, hardware that you expect to be around on a five-year cycle. And that’s just not the way it works. AI adapts, and AI can break in all sorts of ways. And so I think that’s why the technical talent is actually quite important. But Christie, you’ve spent a heck of a lot of time thinking about these issues.

Christie Lawrence: Yeah, I’m happy to talk about this pretty much any time. I completely agree with everything Dan just said. I’m actually a little bit optimistic about federal procurement right now, because there’s been so much attention on AI, and so many people pointing out that the government relies on procurement to actually get a lot of its AI solutions, that I think it might be the type of attention we need to move the needle in a lot of these places. And I think a big hindrance is, sure, there’s not very flexible law, and there are problems with the budget cycle, and there are probably problems with legacy IT. But a big problem is actually that procurement has not historically been the function you go to if you want to be promoted to, like, the most senior position in an agency.

And I think that there’s actually a recognition right now that, particularly with AI, those functions need to be recognized for their strategic and mission-critical importance, and that they need to be part of the discussions with the Chief AI Officers, et cetera.

Daniel Ho: I had the benefit of teaching in one of the modules that was required under the AI Training Act, where procurement officials were supposed to be trained on AI, and so we worked with OMB and GSA to design this course. I taught one of these modules, and I thought I was going to show up to this online webinar and there’d be like 32 procurement officials. And I think, to illustrate what Christie is saying about the amount of attention, 2,300 federal employees showed up to this online webinar that I was teaching.

This was completely different from what I was imagining. And so there’s a level of attention to this that I think is actually really good. And, you know, if channeled in the right way, it could lead to a lot of improvements, not just in the procurement of AI, but actually in updating what most people see as a pretty outdated Federal Acquisition Regulation.

There’s a question back…

Student question 2: So as these technologies continue to develop, it sounds like the builders, the computer scientists and data scientists, are on one team, and then the lawyers, the legal team, is on another side, and they work together. Is there a world where there’s more of an intersection between the two, versus just someone building this tool and then other people stepping in to ask, hey, is this okay, is this what we want? In the future, will there be, sorry, legal artificial intelligence scientists, something like that? Is that something that’s inevitable? Is it something that we have to try to build as a society? How do you see that intersection developing in the future, if you have any thoughts on that?

Daniel Ho: I definitely think the wrong way to think about the future is as two different teams, especially if they’re oppositional in nature, which has sometimes happened, right? Where the lawyers are more on the compliance side of things and come in at the tail end and are trying to check off a bunch of compliance boxes on whether the technology that’s been built out sort of works.

And I think that’s very much what we’re striving to build here: fluencies across these fields. Now, if you were to rewind five years, some folks would say, Oh, we should teach everybody how to code. That’s not the right approach. In fact, what foundation models teach us is that a lot of coding skills have actually been largely assisted by things like GitHub Copilot.

But I think a wonderful illustration of it is that policy practicum from 2018 that had half computer scientists and half law students, where over the course of the year that we worked together as a team, it ended up being that the computer scientists were asking some of the most astute legal questions and some of the law students were asking some of the most astute technical questions. And it’s that kind of fluency, which doesn’t necessarily mean that law students have to go into the guts of a system and rebuild the code base or something like that, but really developing the understanding to get this right.

Pam Karlan: This is Stanford Legal. If you’ve been enjoying the show, tell a friend and please leave us a rating or review on your favorite podcast app. It’ll help us improve and get new listeners to discover the show. I’m Pam Karlan, along with Rich Ford, and we’ll see you next time.