Multi-Ethnic Chromosomes: Taking Genetics and Race (or Ethnicity) to a New Level

Discussions about race and genetics have been around for decades, with many people feeling queasy (at least) about the possible uses of DNA to say anything about “race,” in part because of greatly different understandings of what “race” and “ethnicity” are (cultural, biological, both, other, none of the above) as well as broad social exaggeration of the power of DNA. Two fairly recent – and fascinating – scientific articles demonstrate how race or ethnicity is being imputed to parts of chromosomes and then used to make findings about humans and human history.

First, a little genetics background on “recombination” (skip this paragraph if you already understand it). As eggs and sperm develop, they go through the process of meiosis, which reduces their DNA from two copies each of 22 different chromosomes (plus two copies of the X chromosome in women and an X and a Y in men) to one copy each of 23 different chromosomes. But the chromosomes do not just neatly separate, with one member of each pair going in a different direction. Instead, they go through a process of “recombination.” They break apart into several chunks and come back together. So, as an example, consider the copy of chromosome 4 you got from your father. He had two copies of chromosome 4, one from his father and one from his mother. Call them 4P (paternal) and 4M (maternal). [Note that if we call them “F” and “M”, as I started to do, we have an ironic ambiguity – does F stand for “father” or “female”; does M stand for “mother” or “male” – using “P” avoids that problem.] Without recombination, the chromosome 4 you got from him would be the same as the one he got from one of his parents, with no contribution from your other paternal grandparent. Instead, his two copies of chromosome 4 break apart into, say, four pieces each, which we can call 4P(a), (b), (c), (d), and 4M(a), (b), (c), (d). The pieces then come back together and one sperm gets, say, a chromosome 4 that is 4P(a), M(b), M(c), and P(d), while another one gets 4M(a), P(b), P(c), and M(d). How they are put back together seems to be random. Each chromosome is a patchwork of the chromosomes of his father and mother, just as their chromosomes were patchworks of the chromosomes of their fathers and mothers.
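For readers who like to see this in code, here is a minimal sketch of the four-piece example above. It is a toy model, not real biology: it assumes exactly four pieces, and it assumes each piece is independently equally likely to come from 4P or 4M (real crossovers happen at a small number of break points, so neighboring stretches tend to travel together). The function name `recombine` and the fixed seed are made up for illustration.

```python
import random

# The four hypothetical pieces of chromosome 4 from the example above.
SEGMENTS = ["a", "b", "c", "d"]


def recombine(rng):
    """Return two complementary patchwork chromosomes built from 4P and 4M.

    Toy assumption: each piece is drawn from P or M independently with
    equal probability; the second chromosome gets whatever the first did not.
    """
    first, second = [], []
    for seg in SEGMENTS:
        source = rng.choice(["P", "M"])
        other = "M" if source == "P" else "P"
        first.append(f"{source}({seg})")
        second.append(f"{other}({seg})")
    return first, second


rng = random.Random(4)  # fixed seed only so that reruns give the same patchwork
sperm_one, sperm_two = recombine(rng)
print("one sperm gets chromosome 4:", ", ".join(sperm_one))
print("another gets chromosome 4: ", ", ".join(sperm_two))
```

Running it with different seeds gives different patchworks, which is the point: without recombination only two outcomes (all P or all M) would be possible.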

These chunks of chromosomes typically include millions of base pairs of DNA, enough so that (often) they have variations that are more commonly found in some human populations than in others. Assume, for example, that your father is “totally” East Asian and your mother is “purely” European (the scare quotes are there because the concept of being “totally” or “purely” genetically one ethnicity is ultimately meaningless, but let it ride for now). Now let 4P and 4M stand for your own two copies of chromosome 4, the one from your father and the one from your mother. The chromosome chunks 4P(a), (b), (c), and (d) will therefore be identifiable as most likely from an East Asian source; 4M(a), (b), (c), and (d) will be seen as most likely from a European source. Note that any child of yours could get from you, by chance, a chromosome 4 that looked 100% East Asian, 100% European, or, more likely, various percentages of each in between. In that sense, these chromosome “blocks” could be said to have an “ethnicity,” in that they show genetic variations that make it most likely that they came from someone of a particular ethnicity. If a child with East Asian and European parents were to end up mating with a child with African and Native American parents, and if (for example) in both cases those chromosome 4s had broken into and recombined as four units, the resulting baby would have a pair of chromosome 4s that might, between them, carry bits and pieces identifiably from people of all four ethnicities.
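To put rough numbers on “by chance,” the sketch below enumerates every patchwork possible under the same toy model: four pieces of equal length, each independently East Asian or European with equal probability. Those assumptions are mine, chosen only to match the four-piece example, and real chromosomes do not behave this tidily.

```python
from collections import Counter
from itertools import product

# Every way to label four equal-length pieces with one of two sources
# (toy assumption: pieces are independent and equally likely either way).
patchworks = list(product(["East Asian", "European"], repeat=4))

# Count how many patchworks have 0, 1, 2, 3, or 4 East Asian pieces.
counts = Counter(p.count("East Asian") for p in patchworks)

for k in sorted(counts):
    percent = k * 25
    print(f"{percent:3d}% East Asian: {counts[k]} of {len(patchworks)} patchworks")
```

Under these assumptions only one patchwork in sixteen looks entirely East Asian and one in sixteen entirely European; the mixed outcomes make up the rest.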

Now, back to applications. In October 2011, Carlos Bustamante (disclosure – a Stanford colleague and friend) presented a paper at the International Congress of Human Genetics on “the genome of the Taino.” The Taino were a Native American people in the Caribbean who are generally (but not universally) viewed as having disappeared as a result of European diseases, European abuses, and European (and African) intermarriage. The Bustamante group looked at genetic samples from Puerto Ricans and was able to distinguish blocks of chromosomes that were of European and African ancestry. The remaining 10 to 15 percent of the DNA was of Native American and, they believe, largely Taino origin, and they concluded that it was the living remainder of the Taino genome. No one person carried much of the Taino genome, but the 10 to 15 percent that many people carried came from different bits of different chromosomes. Those bits could be pieced together to form a (largely) Taino genome, even though no single such Taino genome exists in any one person.
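The piecing-together idea can be sketched with made-up numbers. Everything below is hypothetical: the chromosome length of 100 units, the three people, and their segment positions are invented for illustration, and this is not the Bustamante group's actual method. The only point is that segments that are small in each person can jointly cover much more of a chromosome.

```python
def merged_coverage(segments):
    """Merge overlapping (start, end) intervals and return the total length covered."""
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(segments):
        if current_end is None or start > current_end:
            # Gap before this segment: bank the finished run and start a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps or touches the current run: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered


CHROMOSOME_LENGTH = 100  # hypothetical units, not base pairs

# Hypothetical Native American segments found in three different people.
people = {
    "person_1": [(0, 10), (40, 45)],
    "person_2": [(5, 30)],
    "person_3": [(28, 60), (90, 100)],
}

for name, segs in people.items():
    print(name, "covers", merged_coverage(segs) / CHROMOSOME_LENGTH)

all_segments = [seg for segs in people.values() for seg in segs]
print("together they cover", merged_coverage(all_segments) / CHROMOSOME_LENGTH)
```

Adding more people to the hypothetical sample would, in this picture, fill in more of the gaps.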

The findings, reported in Nature News here, provoked controversy largely because the first version of the news article had called the Taino “extinct,” greatly upsetting people who consider themselves Taino.  Nature News “corrected” the  headline from “Breathing Life into an Extinct Ethnicity” to  “Rebuilding the Genome of a Hidden Ethnicity”  [emphasis added] and both Nature and Carlos apologized for any offense they may have caused.  That controversy is fascinating in itself, but I want to point out how this result was only possible through attributing “race” or “ethnicity” (or, perhaps more properly, a racial or ethnic source) to blocks of chromosomes.

The same kind of thing was done more recently in another, broader, piece of genetic history. This week, Nature published an article by David Reich (first author), Andrés Ruiz-Linares (last author), and 32 others (we law folks just boggle at multiple-author papers) on the genetic history of Native Americans. Here’s the article and here are stories from the New York Times and Los Angeles Times about it (the first two may be behind paywalls; if so, sorry!). (Another disclosure – I provided some advice on an ethics issue to the two main authors.) The researchers looked at 493 DNA samples from 52 distinct Native American populations in North and South America and concluded that the genetic evidence showed at least three waves of migration. The first wave provided the ancestors for almost all Native Americans, but not quite all. Their Inuit/Aleut samples had DNA that was about half from that first migration, but about half from another migration. And the samples from their only population that spoke a Na-Dene language, the Chipewyan of Canada, showed that 90 percent of their DNA came from the first migration but 10 percent came from a third migration. This “three migration” finding supports a controversial linguistic theory that groups all Native American languages into the same three supergroups: Inuit/Aleut, Na-Dene, and “everyone else”.

It’s fascinating work, though, as the authors agree, more work, with more samples (especially more samples from Na-Dene speaking populations), needs to be done.  For the purposes of this post, though, the interesting thing is that they didn’t look at all the genomes of all their samples – they only looked at the chromosome blocks that were of Native American origin, ignoring blocks of chromosomes in their samples that seemed to be of European, African, or other ancestry.  Overall, they found non-Native American admixture of about 8.5%; by subtracting that out (in various ways) they were able to get more power from the samples they had.

What genetics can tell us about the race or ethnicity of people remains controversial.  It is striking, though, that genetic research is using the race or ethnicity of chunks of chromosomes to make, or improve, discoveries.  I don’t know what it means – but I sure find it interesting.

Hank Greely