Racism in Property Deeds: Stanford Team Develops AI Tool to Identify and Map Racial Covenants

Stanford Law’s Daniel Ho and computer science/law student Mirac Suzgun discuss the enduring impact of racially restrictive covenants in real estate with host Rich Ford. Though unenforceable since 1948, these clauses are a lingering reminder of housing segregation and racism in the United States, as Professor Ho’s own experience of discovering a covenant barring Asians from purchasing his home highlights. The conversation also looks at legislative efforts to remove the covenants and an innovative AI tool developed by Stanford’s RegLab that helps counties identify and redact these covenants, streamlining the process while preserving the historical record.
This episode originally aired on October 24, 2024.
Transcript
Dan Ho: We estimate that as of 1950, one in every four properties is covered by a racial covenant. And the other finding to highlight is that precisely because this was at a time when there was a lot of development, when agricultural land was being turned into residences, it was a really small number of developers that bore the disproportionate responsibility for these racial covenants. Our estimate is that at the deed level, 10 developers are responsible for close to a third of all of these covenants.
Rich Ford: This is Stanford Legal, where we look at the cases, questions, conflicts, and legal stories that affect us all every day. I’m Rich Ford. Please subscribe or follow the show on your favorite podcast app. That way you’ll have access to every episode as soon as it’s available.
Today, we’re speaking with Professor Dan Ho and a Stanford law student, Mirac Suzgun, about the issue of racial covenants in real estate, particularly in Santa Clara County, and the work that Dan and the RegLab have done on developing an AI tool to deal with racially restrictive covenants. So, Dan, of course, is my colleague here at the law school. He’s a faculty director at Stanford’s RegLab, which partners with government agencies to design and evaluate programs, policies, and technologies that modernize government. Mirac is a RegLab graduate student fellow who’s working on his law and computer science degrees. And he was one of the co-authors, along with Dan, of a newly published report and tool about racial covenants. Welcome to the show.
Dan Ho: Thanks so much for having us, Rich.
Mirac Suzgun: Thank you for having us.
Rich Ford: So, let’s jump right in with a brief history of racial covenants. Everyone who went to law school remembers the famous case of Shelley v. Kraemer, which held that racially restrictive covenants were unenforceable under constitutional law. And I think a lot of people thought that’s the end of that issue. They’re gone. But that’s not quite true, is it?
Dan Ho: Yeah, that’s quite right. In fact, when I moved to Palo Alto from San Francisco in 2015, in the large set of documents that you sign when you purchase a home, one of them was a community covenant that said that the property shall not be used or occupied by any person of African, Japanese, or Chinese or any Mongolian descent, except in the capacity of a servant to a white person.
And Rich, you’re quite right that these racial covenants, which barred certain racial groups from purchasing homes, while held unenforceable in 1948 in Shelley v. Kraemer, still persist in deed records across the country, and our paper really responds to recent efforts, particularly a 2021 California law that is trying to do something about it by identifying, redacting, and creating a historical registry of these covenants.
Rich Ford: Yeah, so tell us a little bit about that law. So, we have these racially restrictive covenants, and they are unenforceable after 1948, but they’re still there, and so you buy a house in Santa Clara County, and there’s a covenant that would restrict people like you from occupying the property, and so you have to sign this even though you’re there, in violation, technically, of this covenant. It’s stigmatizing and disturbing to see these. And so California has passed a law to get rid of them.
Dan Ho: That’s right. In fact, it has had several legislative efforts. Prior to 2021, California had a legislative provision that allowed homeowners to undergo an individual legal petition process to try to redact the racial covenant from their deed record. But as in other states that have tried this, the number of homeowners who actually pursue that process is vanishingly small. And this has been the modal reaction by about 12 jurisdictions that have tried to do something about this. In 2021, in part in recognition of that, California decided to pursue a more proactive approach, and this law, AB 1466, mandates that each of the 58 counties in California establish proactive programs to identify and redact these racial covenants. And the challenge therein is that these deed records, in the case of Santa Clara County, can go back to 1850. So, Santa Clara counted up and said, we have 24 million of these documents, 84 million pages. And then figuring out how to come up with an implementation plan to engage in this process becomes a real challenge.
Rich Ford: Right. So, counties are faced with a legal obligation to go through and redact all of these documents. And they’re doing it by hand? Individuals going through with a Sharpie? What are they doing?
Dan Ho: Prior to when we started this collaboration, they were doing it by hand. They had a team of folks going through and reading up…I think it was close to 10,000 pages. No, close to 100,000 pages…that they read to identify 400 of these racial covenants. And that was obviously not going to be scalable given the sheer magnitude of the deed records.
Rich Ford: So then tell us about the RegLab project and what your key findings were and what you’ve developed in order to help with this problem.
Dan Ho: We had reached out to the county based on a number of other collaborations RegLab has had with Santa Clara County to see if we could actually help. And that’s really where Mirac became involved because he was simultaneously enrolled in a course on anti-discrimination law and algorithmic fairness.
And we started to have a conversation with the county, and they were very enthusiastic about potential alternatives to solving this rather than simply trying to brute force this approach. And so, we formalized a collaboration whereby we secured access to the deed records and really undertook the process of developing a machine learning model using large language models to be able to do this in a much more effective way.
Rich Ford: Wow, and tell us a little more about your findings as well … just how widespread these covenants are in California generally and then how your new model is going to help get rid of these covenants and help the counties fulfill their new legal obligation.
Mirac Suzgun: The AI tool was created in partnership with Santa Clara County’s clerk recorder’s office, and that partnership has been a vital part of this project. At the heart of this kind of collaboration, as Professor Ho mentioned, is the recognition that counties like Santa Clara are dealing with a massive volume of historical property records. And as Professor Ho mentioned, Santa Clara alone has over 24 million documents dating back more than a century. And based on our calculations, we realized that manually reviewing these records would be overwhelming and almost totally infeasible. It would require over 86,500 person hours. So to address this, we sought to develop a scalable solution that could efficiently identify racially restrictive covenants.
We basically had three primary stages. In the first stage, what we do is convert an image of a property deed to text using one of the open source models that allows us to do OCR, Optical Character Recognition. And then once we have the transcribed text, what we do is actually use an open source language model, namely Mistral, to basically identify the portion of the text that contains unlawful discriminatory language.
And in the third stage, what we do is highlight the unlawful language and then extract the property address. So when I put it in this way it sounds very easy perhaps for our audience, but unfortunately each step of the whole pipeline has its own issues and difficulties. For instance, in the first stage, since we are dealing with historical deeds, many deeds contain lots of scanning artifacts, and it’s sometimes hard to get a clean, spell-checked text out of them.
In the second step, though, it is important for our AI model to look at the entirety of the text, not just do a simple keyword search. For instance, you might say, “Well, why don’t we use a simple keyword search to identify and find racially restrictive covenants? Maybe we can just look at the deeds that contain the word white or Caucasian?” But this unfortunately yields lots of false positives. For instance, the term Caucasian might refer to different things in different contexts, and in some cases it might be used innocuously. Or if you think about the term white, it might be referring to the surname of an individual, or it might be a street name, or it might be just referring to the color of a fence around a house. So, it is really important for us to look at the entirety of the text, not just one part, and for that reason we realized quite early on that a simple approach based on keyword matching would not allow us to identify all the racially restrictive covenants. It would instead produce lots of false positives.
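The false-positive problem with keyword matching can be illustrated with a minimal Python sketch. The four sample sentences are invented for illustration (they are not taken from the county's records), and the regular expression is only an assumed stand-in for "a simple keyword search," not the baseline the team actually evaluated.

```python
import re

# Naive baseline: flag any deed text containing "white" or "Caucasian".
KEYWORDS = re.compile(r"\b(white|caucasian)\b", re.IGNORECASE)


def keyword_flag(text: str) -> bool:
    """Return True if the text contains one of the keywords."""
    return KEYWORDS.search(text) is not None


# Hypothetical sample sentences: (text, is_actually_a_racial_covenant)
samples = [
    ("Grantor John White conveys Lot 4 to the grantee.", False),       # surname
    ("The parcel fronts on White Oak Avenue.", False),                 # street name
    ("A white picket fence shall mark the northern boundary.", False), # fence color
    ("Said lot shall not be occupied by any person not of the Caucasian race.", True),
]

flags = [keyword_flag(text) for text, _ in samples]

# Count texts the keyword search flags even though they are not covenants.
false_positives = sum(
    1 for flagged, (_, is_covenant) in zip(flags, samples)
    if flagged and not is_covenant
)

print(flags)            # every sample is flagged
print(false_positives)  # three of the four flags are wrong
```

In this toy set the keyword search flags all four sentences, yet only one is a covenant, which is why the surrounding context has to be read as a whole.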
And perhaps, finally, in the third stage, what we are doing is also providing the county with a very useful tool that allows us to see the highlighted unlawful language along with the extracted property address, and we cannot emphasize enough how important that is, especially for the county to map these property addresses in their record. And after all this process is done, we send the documents to the county counsel for final review, and it is important to highlight that at the end of the day, the county counsel is expected to review and affirmatively approve each document, so there is always a human in the loop, and this is important to emphasize.
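The three stages and the human review step described above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not RegLab's implementation: `run_pipeline`, `ReviewItem`, and the three `toy_*` functions are hypothetical names, and the toy functions are trivial placeholders standing in for the real OCR model, the fine-tuned language model, and the address extractor.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ReviewItem:
    deed_id: str
    text: str
    span: Tuple[int, int]            # character offsets of the flagged language
    address: Optional[str]
    approved: Optional[bool] = None  # set only by a human reviewer, never by the model


def run_pipeline(
    deed_images: Dict[str, bytes],
    ocr: Callable[[bytes], str],
    detect: Callable[[str], Optional[Tuple[int, int]]],
    extract_address: Callable[[str], Optional[str]],
) -> List[ReviewItem]:
    """Stage 1: image to text. Stage 2: find the covenant span. Stage 3: highlight and extract the address."""
    queue = []
    for deed_id, image in deed_images.items():
        text = ocr(image)
        span = detect(text)
        if span is None:
            continue  # nothing flagged, so nothing goes to legal review
        queue.append(ReviewItem(deed_id, text, span, extract_address(text)))
    return queue


# --- Placeholder stand-ins (assumptions, for demonstration only) ---
def toy_ocr(image: bytes) -> str:
    return image.decode("utf-8")  # pretend the "image" already holds its text


def toy_detect(text: str) -> Optional[Tuple[int, int]]:
    start = text.find("shall not be occupied by")
    if start == -1:
        return None
    end = text.find(".", start)
    return (start, len(text) if end == -1 else end + 1)


def toy_address(text: str) -> Optional[str]:
    match = re.search(r"\b\d+ [A-Z][a-z]+ (?:Street|Avenue)\b", text)
    return match.group(0) if match else None


deeds = {
    "deed-1": b"Lot 3 at 9 Oak Avenue. A white fence marks the boundary.",
    "deed-2": b"Lot 7 at 12 Elm Street. Said lot shall not be occupied by "
              b"any person not of the Caucasian race.",
}
review_queue = run_pipeline(deeds, toy_ocr, toy_detect, toy_address)
for item in review_queue:
    start, end = item.span
    print(item.deed_id, item.address, repr(item.text[start:end]))
```

The design point is that the pipeline only builds a review queue. The `approved` field stays unset until a person signs off, which mirrors the human-in-the-loop requirement described in the conversation.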
Rich Ford: No, it’s amazing. So you’re beginning with just an insurmountable task. And you’re saying it’s already saved 86,000 person hours of labor just in Santa Clara County alone. It’s cheaper than anything else available. Did I see 2 percent of what it would cost to use a different type of AI model? And it’s extremely accurate. If I’m hearing this right, now the counties can simply just scan the documents into a computer and then wait, and the program is going to redact all of the racially restrictive language?
Dan Ho: Most of that is spot on, Rich. I think one of the things that is notable here is that when Mirac talked about integrating this into the county review process, AB 1466 actually does have a statutory requirement that county counsel ultimately review all of the provisions to sign off on them and then to re-record them.
So, one thing that we found really useful here, essentially what the model does is it flags and finds this sort of proverbial needle in the haystack. And then you can do the human review that’s legislatively mandated for a much, much smaller set of documents.
Rich Ford: I see, so you’re not going through everything, now it’s going to filter out all the ones that need to be reviewed.
Dan Ho: Exactly. It’s statutorily required that there still is legal review, so it’s not as if the model is actually itself taking the place of that human judgment.
Rich Ford: I see, okay. But it’s coming close…So let’s talk a little bit about why the problem is so widespread. Maybe we should take our listeners back to why we have so many racially restrictive covenants, why there are so many documents that need to be reviewed, and perhaps even a little bit of the history of racially restrictive covenants for listeners who may not know that history, so we can walk through how we got to where we are and why this problem looked to be so insurmountable until you came up with this AI tool.
Dan Ho: Well, gosh, Rich, I feel like I should be flipping the script here and interviewing you because you are much more of an expert in this domain than I am. But in a nutshell: Racial covenants existed in the 19th century, but really became much more prevalently used in the early parts of the 20th century after the Supreme Court found racially exclusionary zoning to be unconstitutional in the Buchanan decision in 1917.
So prior to that, you could have zoning that basically publicly meant segregated neighborhoods. The Supreme Court finds that to be unconstitutional, and then what happens is the mechanism of discrimination in many ways migrates towards a form of private action, in the form of these deed covenants. And deed covenants, what’s notable about them is while they’re a kind of private transaction, they run with the land and therefore they bind every subsequent purchaser of the home. And that’s why these have stuck around for such a long period of time.
In 1948, as you started off mentioning, that’s when the Supreme Court finds these covenants to be unenforceable. And then later on in 1968 in the Fair Housing Act, they’re declared illegal. But one of the really interesting things here is there’s a fascinating volume by professors Rick Brooks and Carol Rose that really talks about the persistence of racial covenants where they were being utilized even after Shelley v. Kraemer, some even after 1968. And one of the arguments by professors Brooks and Rose is that they really, in a sense, even when not legally enforceable, served as a kind of signaling function as to the kind of community that this is. They tell this sort of story of, in 2002, of a Richmond man who refused to sell his home to an African American woman and pointed to the racial covenant, claiming he simply didn’t realize that these were not enforceable. And that was in 2002.
Rich Ford: Wow. So, we have a progression from state-sponsored race segregation that’s held unconstitutional in 1917, to these private covenants, which, listeners might think are, two individuals who get together and agree on something, but in fact, they are replicating the private…or the public zoning in a lot of instances. And the important part of them running with the land is that when you buy a piece of property, you don’t have the option to get out from underneath this racial covenant. It’s a restriction that you’re bound by. That’s why it’s still in there when you’re buying your house because there’s no easy legal mechanism to avoid it without getting everyone who has the kind of a mutually reciprocal relationship to everyone to agree to get rid of it. So, they’re there in perpetuity.
Mirac Suzgun: And it might be useful to acknowledge the roles of two other actors in the enforcement of racial covenants. The first one is the real estate industry at the time. For instance, institutions like the National Association of Real Estate Boards enforced racial covenants through racial steering. And they also made adherence to these racial covenants part of their ethical code, which further reinforced segregation in the housing market. And we were able to see some signs of the role of the National Association of Real Estate Boards in our analysis as well. And the second actor that should be mentioned is unfortunately the federal government itself. Federal programs such as the Federal Housing Administration at the time required racial covenants for mortgage insurance, and they embedded segregation in housing markets across the country, and unfortunately these also played an important role in the rise and prevalence of racial covenants at the time.
Rich Ford: So we have a kind of public private partnership where it’s really hard to disentangle the importance of state action as opposed to private action, and that’s why even when they’re formally unenforceable, they’re still serving the signaling function. The real estate agents engaged in racial steering, the individuals who are getting the signal that, this is a white neighborhood, and other people need not apply.
So you’ve developed this amazing tool, and now we’ve got some information about how widespread these racial covenants are and where we find them. Could you tell us a little bit about that?
Dan Ho: Yeah first, just echoing what Mirac said earlier, the core contribution here is a model that really can offset a huge amount of time that would otherwise be spent on human review. Los Angeles County, for instance, contracted for $8 million to try to complete this process in seven years, and this model really makes it possible to do this kind of a process at scale and in a much shorter period of time. But then there’s some really rich and interesting historical findings. For instance, until doing a scan like this, we really had no idea how pervasive racial covenants were in Santa Clara County. And what we were able to do here is to actually geolocate these deed records essentially by retrieving the County assessor and surveyors’ maps and matching them to the deed document. And then we were able to identify that actually right at the peak period when these racial covenants were being used, from 1920 to 1950, is when the housing units double in Santa Clara County. It’s when a lot of development is happening. And so even though we found around 7,500 deed records, many of these deed records actually cover potentially hundreds of properties. And so what that meant was this really sobering finding, which is that we estimate that as of 1950, one in every four properties is covered by a racial covenant.
And the other finding to highlight is that precisely because this was at a time when there was a lot of development, when agricultural land was being turned into sort of residences, it was a really small number of developers that bore the disproportionate responsibility for these racial covenants. Our estimate is that at the deed level, 10 developers are responsible for close to a third of all of these covenants. And there’s some really interesting historical work trying to understand how much responsibility do folks bear. Is this all just driven by market forces and the like, and what our evidence really suggests is that there may be a lot more agency by a small number of individuals to potentially change the way in which housing discrimination and segregation occurs in the county.
The sort of poignant counter example is one developer who was responsible for 2,700 homes in Palo Alto alone, Joseph Eichler, and he is mainly known for the unique style of his houses, but he was someone who adamantly resisted putting racial covenants in the deed record and so … and showed that actually he could survive and this was not simply driven by market pressure. And so what this really corroborates is this sort of notion that there is real agency and individual responsibility by the small number of developers operating in Santa Clara County at that time.
Rich Ford: Yeah, okay. That’s…it’s all so interesting. And the bit about Eichler is great too. It even burnishes Eichler’s reputation a little bit more that he resisted this.
Dan Ho: We found a really great historical piece of scholarship about Eichler and how he really tried to both lead by example and was actually involved in some policy efforts for housing reform in the state, but he’s such a great counter example, right? Because some people will excuse parties who are implicated in this by saying that any developer who didn’t put these deeds in would have lost business…and as a result would not have been able to survive. And it’s that market pressure that, in a sense, one could argue absolves them of a kind of responsibility. But really here it’s a very small number of developers responsible for very large portions of the county.
Rich Ford: So, a small number of people, they’re driving the market rather than the market driving them, perhaps, and that’s a lot of … I do think that’s a powerful story to be told about all of this, that this is starting with government, it’s being picked up with the FHA, with various other collaborations, a small number of developers, a group of real estate agents. And do we know that the typical homebuyer demanded racial segregation? I don’t think we do know that. What we know is that they got it because this group, a relatively small number of people, decided that they should have it. But then, of course, by the 60s, it’s baked into expectations. And, yeah, then everyone is demanding it, like “there goes the neighborhood” if a nonwhite family moves in.
Dan Ho: Yeah, and then the other thing that some people say is, it’s been unenforceable since 1948, so should we really be … who cares? Yeah. Who cares about doing this? And one of the things that’s really striking is when you look at the terms that are being used, they are some of the most noxious ways to describe demographic groups. And it is really extraordinary when you go through this and you look at … “Provision one: There shall be a septic tank. Provision two: There shall be no animal husbandry on this lot. Provision three: No Asians and Mongolians on this property,” and, yeah, in a sense the ordinariness of how you walk over from a septic tank to racial exclusion is really striking.
Rich Ford: Right? So there’s a way in which they’re just describing it as regulation of nuisances. Like, “Oh, no, we don’t want these things,” and whole groups of people are being lumped into that. The banality of it is another feature, that at some point it just sounds … and I’m sure it’s true that to people it just started to sound like common sense actuarial work. Like, “We’re just assessing risks, and these things bring down property values. So we’re going to…”
Dan Ho: And it is quite literally encoded into, like, the FHA underwriting rules.
Rich Ford: So, we’re learning a lot and some sobering things about the nature of real estate development in California during this period of time. We’re also getting these obnoxious deed restrictions taken out, so that subsequent buyers won’t go through what you went through and find themselves being required to sign these kinds of objectionable documents. What’s the next step?
Dan Ho: One of the things that I think is quite important here is that it isn’t exclusively a redaction process. There have been some concerns …anyway, Carol Rose at one time noted the concern that if we just go through and take these things out, we could be quite literally erasing the historical record. And so I think what’s quite notable and important about AB 1466 is that there is a requirement to actually retain the unredacted version that really opens up this line of historical inquiry. The other thing I should note is that there have been really fantastic efforts in other jurisdictions to actually try to crowdsource this, where some projects like the Mapping Prejudice project in Minneapolis actually utilized thousands of volunteers to actually read and review these deed records.
Those are amazing initiatives, but unfortunately for the 58 counties in California, they may not have that kind of an amazing volunteer base to be able to do this, and that’s where this technology can really help.
Rich Ford: This is already happening, or has happened, in Santa Clara County. Are there other California counties that, I imagine, are pounding on your door trying to get a hold of this technology?
Dan Ho: One of the things we’re committed to is really doing this in a kind of open science approach. So, we’ve actually made the model available for anyone to be able to use. And we, of course, welcome being in touch with other jurisdictions that are trying to do this because one of the really careful things that Mirac did when originally curating the training data was to make sure that there was a diversity of racial covenants in place, sourcing them from seven other counties across the country, so that the model would actually be one that could be used not just in a single county, but really across a number of different jurisdictions.
Rich Ford: Ah, so that’s another piece of the work of the AI and all the work that you’ve done, making it quite different than, as you were saying, a keyword search, that this is something that’s actually taught the AI to find the wide variety of types of racially restrictive covenants. And so it could be scaled up to be used all over the country. I imagine there are other jurisdictions who might be interested. Is there anything that other jurisdictions could do to use this technology in their own searches for racially restrictive covenants?
Mirac Suzgun: At the moment, our tool is available both on our website and also on a machine learning platform known as Hugging Face, but we welcome any feedback, any opportunity for collaboration. If there are any counties or cities or towns that would be interested in working with us, we will be quite honored. So, they should feel free to reach out to us.
Rich Ford: This was a pleasure to read, both because it’s personally interesting and because it’s inspiring, a rare piece of good news these days. I’ll take all the good news I can get.
Dan Ho: It’s a weird thing when the good news is you’ve discovered these horrific, historic documents. Really weird. Especially once we got to the one-in-four finding, we were like, wow, this is quite the finding. But on the other hand, incredibly disturbing.
Rich Ford: Right, the history is worse than we thought, but the good news is we can discover it and do something about it today. This is an inspiring story and it’s a wonderful example of the way technology can be used in the service of social justice. Also a great example of things going on at Stanford Law School at the intersection of law and technology. So thanks so much for being on the show.
This is Stanford Legal. If you’re enjoying the show, please tell a friend and leave us a rating or review on your favorite podcast app. Your feedback will help us to improve the show and to help new listeners discover us. I’m Rich Ford. See you next time.