Genetic Privacy in the Age of Facebook, And the Fourth Amendment

Last week, a group of scientists reported (Science: subscription required) that they were able to identify “anonymous” donors in genetic information studies using little more than Internet sleuthing. The study was widely reported in the popular press, including a short piece in Wired where CLB’s own Hank Greely was quoted. I’m interested in, and often disappointed by, how both law and genetics are covered in the popular press, and unfortunately, nothing in this press cycle disabused me of those feelings. The headlines ranged from the baldly wrong (“Anonymous Genealogy? Think Again, Everyone Can Be Traced”) to the fear-mongering (“Patients can be ID’d in ‘anonymous’ public genetics databases”); reporting glossed over critical facts; and some articles relied on interested or one-sided interview subjects. But worse, perhaps, is that none of the reporting captured, or even discussed, what I think are the two most critical strands of concern for the public: (1) what genetic privacy means in the age of Facebook, and (2) how that analysis affects the Fourth Amendment.

Before that, though, it’s important to understand how last week’s research consortium “unmasked” anonymous DNA profiles. First, they obtained sequence information of the Y-chromosome, the male-determining chromosome, for 911 individuals whose last names were known. Those individuals’ last names were known because they had put that information online themselves–specifically on a website called FamilyTreeDNA. Second, they compared that sequence information to “anonymous” sequence information from two large online databases, Ysearch and SMGF. I put “anonymous” in scare quotes because that information isn’t purely anonymous. In many instances, it also contained the individuals’ state of residency, date of birth, and other information. Third, when the researchers found a match–or a close match–between two sequences, they reasonably inferred, in some instances, that the individuals shared a last name. Why? Because, like Y-chromosomes themselves, last names are passed down from father to son (at least typically). And lastly, when the last names were particularly rare, cross-referencing those last names with information like state of residency and dates of birth could, in some instances, uniquely identify the previously anonymous individual.
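For readers who think better in code, here is a rough Python sketch of the matching logic just described. To be clear, this is my own toy illustration, not the researchers' code or their actual method: the repeat counts, surnames, people, and the one-mismatch threshold are all invented, and the function names (`mismatches`, `infer_surname`, `candidates`) are hypothetical. It only shows the shape of the inference: find the closest surname-labeled Y-chromosome profile, borrow its surname, then narrow by state and birth year.

```python
from dataclasses import dataclass

# Everything below is made up for illustration: the marker values, the
# surnames, the people, and the mismatch threshold are all assumptions.

@dataclass(frozen=True)
class Person:
    name: str
    surname: str
    state: str
    birth_year: int

def mismatches(a, b):
    """Count the shared markers whose repeat counts differ."""
    shared = a.keys() & b.keys()
    if not shared:
        return float("inf")  # nothing to compare, so treat as no match
    return sum(1 for marker in shared if a[marker] != b[marker])

def infer_surname(anon_profile, labeled_profiles, max_mismatches=1):
    """Return the surname of the closest labeled profile, or None."""
    best = min(
        labeled_profiles,
        key=lambda rec: mismatches(anon_profile, rec[1]),
        default=None,
    )
    if best is None or mismatches(anon_profile, best[1]) > max_mismatches:
        return None
    return best[0]

def candidates(surname, state, birth_year, population):
    """Cross-reference an inferred surname with the other metadata."""
    return [
        p for p in population
        if p.surname == surname and p.state == state
        and p.birth_year == birth_year
    ]

# Profiles that relatives posted online with their surnames attached.
labeled = [
    ("Quillfeather", {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}),
    ("Smith", {"DYS19": 15, "DYS390": 22, "DYS391": 10, "DYS393": 12}),
]

# An "anonymous" profile that came with a state and a birth year.
anon = {"DYS19": 14, "DYS390": 24, "DYS391": 10, "DYS393": 13}

population = [
    Person("A. Quillfeather", "Quillfeather", "UT", 1950),
    Person("B. Quillfeather", "Quillfeather", "OH", 1950),
    Person("C. Smith", "Smith", "UT", 1950),
]

surname = infer_surname(anon, labeled)
matches = candidates(surname, "UT", 1950, population) if surname else []
print(surname, [p.name for p in matches])
```

Note how much work the rare surname does in this toy example. If the closest profile had belonged to a Smith, the last step would usually return far too many candidates to single anyone out.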

So what does this mean for genetic privacy? Well, in many ways, nothing different from the privacy concerns that currently surround online social networks. That is, even if you choose to opt out of putting your information online, that doesn’t stop one of your relatives from saying things that could identify you. If I choose not to participate in Facebook, for example, that doesn’t stop my sister from posting my personal information on her wall or tagging me in photos. Even if she only refers to me as “her brother,” someone with the rather bromidic insight that we might share a last name could nonetheless, with minimal internet searching, figure out that I’m the person she’s talking about. And that’s precisely what happened here. Some individuals posted their gene sequences with their last names online, and the research consortium, spurred by their insight that “Surnames are paternally inherited in most human societies, resulting in their cosegregation with Y-chromosome haplotypes,” effectively “Googled” that sequence information to identify previously anonymous individuals.

I won’t rehash the numerous privacy concerns these issues raise. More than enough has been written on the subject; a quick Google Scholar search for Facebook privacy and the law turned up at least 170,000 articles, roughly fifty times the number of days Facebook has even existed. But in short: privacy is shrinking, it’s easier for other people to invade it, society seems to be less troubled when it happens, and there’s precious little we can do about it.

Those last points have particular salience, however, in the Fourth Amendment context. It has long been a chestnut of Fourth Amendment law that what the public can legally search for itself, the government can also search without a warrant. Thus, because anyone can fly a plane over your backyard, the government can do so, too. Because the public cannot wiretap a phone booth, the government similarly cannot. The exception, of course, is where “the Government uses a device that is not in general public use.” Thus, while it may be legally permissible to use advanced thermal-imaging technology to spy on your neighbor’s house, because such a device is not in “public use,” the government must first obtain a warrant to use it.

Together, these concerns have broad implications for DNA privacy and the Fourth Amendment. Currently, there are several “hot button” cases exploring whether individuals convicted of no crime–but often arrested–can have their DNA samples taken from them without a warrant. Many of these cases turn on the notion that the warrantless collection of samples constitutes an unconstitutional “seizure.” But what if no collection were needed? What if the DNA profile of a close family member, perhaps already on file with the government, is a close match with that found at a crime scene? Or what if that family member’s DNA profile is, as in the recent study in Science, simply available online? In other words, does warrantless, online genetic sleuthing violate the Fourth Amendment?

I think the legal answer is no. No law prohibits me, or any other private citizen, from trying to identify you based on your relatives’ publicly available genetic sequences. (Laws like GINA and HIPAA may, in some circumstances, prohibit me from publicly disclosing that information, but nothing prevents me from engaging in the exercise for my own amusement.) So, in that sense, genetic sleuthing is something that the public can do. The public concern the recent study in Science raises is that it provides an exemplary model of how to do so. It’s a play-by-play account of how to unmask hapless “anonymous” participants in genetic studies by way of their unwitting, overeager family members. The concern with the study, therefore, seems to be more about how easy it is for anyone to de-anonymize genetic information than about the mere fact, as popular reporting emphasized, that some sophisticated researchers could do it.

An even better Fourth Amendment question, however, is whether genetic sleuthing is a new technology, “not in general public use.” After all, it appears that, even though the data sets the authors used had been online for years, no one had tried their technique until recently. But their technique, as discussed above, was little more than creative internet searching. And that, I think, counsels in favor of answering the question in the negative. Internet sleuthing is unquestionably in “general public use.” Who, reading this post, cannot honestly confess to using the internet for information about someone’s looks, past employers, or relationship status? Genetic profiles seem, for better or for worse, a logical extension of that human desire to Google our acquaintances.

To put, perhaps, a finer point on all of this, several states, including California, have already begun the process of familial DNA searching to further law enforcement efforts. Whether this violates some unarticulated spirit, as opposed to letter, of the Fourth Amendment, time will tell. As Judge Kozinski famously remarked about today’s shrinking realm of privacy, “Welcome to the fish bowl.” In an age of widespread social sharing–from restaurant reviews to personal anecdotes about cancer–genetic stalking, by your friends or the police, is not far behind.

Jake Sherkow, CLB Fellow

1 Response to Genetic Privacy in the Age of Facebook, And the Fourth Amendment
  1. I think you’re right about the Fourth Amendment (non)implications, though this is another example of how technology is changing difficult searches into much easier searches.

    The implications for research ethics, though, are greater. I do think this is just one more example of how we CANNOT promise research participants confidentiality, privacy, or anonymity. Any data set rich enough to be useful for research will (almost) always be rich enough to allow some of the people in it to be re-identified – through the demographic information on them, the phenotypic/health information on them, or the genetic information on them. Real anonymity in research collection is an occasional, and decreasingly common, event.

    I worry that people who agree to be research subjects don’t really know this (investigators certainly don’t go out of their way to make sure the possible participants understand it!). They aren’t likely to be harmed – but if they are harmed (or even learn that their identity was compromised without harm to them), their anger will be all the greater for the fact that they didn’t really understand what they were getting into.
