(Including White People)
In the post-genomic age, our genes dictate our health. Now, an at-home 23andMe DNA testing kit might warn you of your chances of hair loss, or perhaps prenatal screening will reveal your baby’s risk for an inherited disease. We take it for granted that there are genes linked to every trait or disease, but what we don’t often think about is how those associations are made. As the authors of a disturbing commentary in Cell Press revealed recently, there’s a huge problem in the way scientists have been doing so.
Geneticists make those associations by drawing DNA from samples of people meant to represent a larger human population. Unfortunately, as discovered by Sarah Tishkoff, Ph.D., co-author of the commentary and human evolutionary geneticist at the University of Pennsylvania, about 78 percent of that data is drawn from people of European ancestry.
“That’s just disgraceful,” she tells Inverse, en route to Oregon and San Diego to rally scientists to increase diversity in research. “We’ve got to do something about that.”
A white bias means that the conclusions scientists can draw from the data may be applicable only to people of European descent. These incomplete surveys of health and disease risk could have dire consequences for people of all ethnicities, Europeans included.
“It doesn’t just benefit the different ethnic groups you might be studying,” says Tishkoff.
A Pervasive Diversity Problem
Everything we know about the genes linked to disease risk comes from genome-wide association studies, or GWAS. In those studies, researchers sequence the genomes of thousands of participants, then look for associations between idiosyncrasies in the standard human genome sequence (known as single-nucleotide polymorphisms, or SNPs) and certain traits.
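To make the idea of an "association" concrete, here is a minimal sketch of how a single SNP might be tested, assuming a simple comparison of allele counts between people with a trait (cases) and people without it (controls). The function name and the counts are hypothetical, and real GWAS pipelines use more elaborate models; this only illustrates the basic logic.

```python
import math

def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Assumes every row and column total is nonzero.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + control_alt, case_ref + control_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if the allele were unrelated to the trait
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: the alternate allele is more common in cases than controls
chi2, p = allelic_chi_square(case_alt=300, case_ref=700,
                             control_alt=200, control_ref=800)
print(f"chi-square = {chi2:.2f}, p-value = {p:.2e}")
```

A GWAS repeats a test like this across a very large number of SNPs, which is why researchers conventionally demand a very small p-value (commonly 5e-8) before calling an association significant.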
Since 2008, all published GWAS have been cataloged in a database jointly maintained by American and European institutions. Tishkoff and her co-authors examined the ethnic breakdown of all the studies published there, hoping to see a shift in the European bias documented in previous analyses.
“We thought maybe it would be better. Some people looked at this a couple of years ago,” she says. “But it hadn’t really. Just a little bit, but not much.”
In 2009, 96 percent of people represented in GWAS were of European descent. In 2016, the proportion was 81 percent. Now, it’s 78 percent, a small improvement but not enough. People of European descent do not make up 78 percent of the human population, and yet the conclusions drawn from the GWAS data are meant to benefit all people. The rest of the people represented in GWAS are 10 percent Asian, 2 percent African, 1 percent Hispanic, and less than 1 percent all other ethnic groups.
Why This Matters
The issue with this lack of diversity is that people of non-European ancestry may carry different genes associated with the same diseases being investigated in Europeans. A cancer screening method devised with that data, for example, might not catch a cancer-linked SNP carried only by Asians.
Right now, we don’t know what ethnicity-specific associations we are missing. “Until people start increasing the numbers of ethnically diverse individuals in these studies, we’re not going to find out,” says Tishkoff.
She learned first-hand the value of doing so when she discovered a gene linked to human skin color in a landmark 2018 study involving only 1,600 African people. “Now, they are finding that gene plays a role in melanoma risk,” she says. Such a gene could be used to identify skin cancer risk in people of all ethnicities. “That’s why it’s really informative to look across diverse ethnic groups,” she continues.
Tishkoff also points out that a white bias in the data used to diagnose people of other ethnicities will make existing race-based health disparities worse than they already are. Already, health trends are sharply divided along racial lines for socioeconomic reasons: black and Hispanic populations suffer from air pollution produced by white communities, and the HIV mortality rate of black patients is vastly higher than that of their white counterparts.
“My biggest concern — I am really worried — is that we just don’t know how well these are going to translate across ethnic groups,” says Tishkoff. “And if they don’t translate well — and the preliminary studies suggest that they won’t — then we could be giving people wrong information.”
How This Happens
The basis of these disparities lies in the GWAS themselves and the people involved in conducting them. There are a number of reasons why the sample population of a GWAS might skew heavily European, says Tishkoff. Some are more understandable than others, though none are excusable.
The first has to do with who is doing the research, and where. “A lot of this research is done in the US and in Europe, and you’re going to study the populations that are most common,” says Tishkoff.
Then there are the people in non-European populations themselves, who may not want to take part in GWAS for a number of reasons, a big one being distrust. “Fair, based on past injustices and bad practices,” says Tishkoff. American syphilis tests on Guatemalan prisoners and the gross exploitation of the black patient Henrietta Lacks are just a few examples that come to mind.
The most complex is the question of funding. Historically, says Tishkoff, the National Institutes of Health would penalize proposals for studies on genetically heterogeneous populations, that is, people who weren’t all of the same ancestry. The accepted reasoning is that studies on people with diverse ancestry tend to give false positive results, which Tishkoff says “has been known for a long time” but can be corrected to a high degree by statistical methods. These studies look even worse to funders because sample sizes are generally smaller in minority groups, so they have less statistical power.
“You may not get your grant money if you do that,” she says.
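The statistical power point can be illustrated with a rough calculation. The sketch below assumes a simple two-sided comparison of allele frequencies between equal-sized case and control groups, a normal approximation, and independent alleles within each person; the function name, frequencies, and sample sizes are all hypothetical and are not taken from any study mentioned here.

```python
import math
from statistics import NormalDist

def approx_power(freq_cases, freq_controls, n_per_group, alpha=5e-8):
    """Approximate power to detect an allele-frequency difference.

    Uses a normal approximation and ignores the negligible opposite tail.
    alpha defaults to the conventional genome-wide significance level.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Each person carries two copies of the SNP, so there are 2 * n alleles per group
    alleles = 2 * n_per_group
    std_err = math.sqrt(freq_cases * (1 - freq_cases) / alleles
                        + freq_controls * (1 - freq_controls) / alleles)
    z_effect = abs(freq_cases - freq_controls) / std_err
    return norm.cdf(z_effect - z_crit)

# Same hypothetical effect, two different (hypothetical) sample sizes
for n in (1_000, 20_000):
    print(n, round(approx_power(0.30, 0.25, n), 3))
```

Under these assumptions, the same genetic effect that is almost certain to be detected in the larger sample is very unlikely to be detected in the smaller one, which is the disadvantage small minority cohorts face when proposals are compared.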
How to Fix It
Fortunately, the scientific community has been receptive to the call for greater diversity. “People are appreciative,” says Tishkoff. The NIH now has an Africa-focused initiative to support genomic research on African populations in African institutions, and some of its branches and projects mandate inclusion of minorities and women.
Tishkoff’s call for diversity was recently echoed by Joyce Tung, 23andMe’s vice president of research, who wrote in STAT: “Social, cultural, economic, and political barriers separate researchers and research funding from the individuals who need it most.” 23andMe has one of the world’s largest genetic data sets of people of African descent and now (with client permission) is opening up that data to scientists.
This diversity issue extends far beyond the data, and will require the work of scientists, policymakers, and laypeople alike to correct. Inclusion begins with supporting people of diverse ancestries, especially in environments where they are underrepresented.
Tishkoff brings it back to the lab: “There have to be more ethnically diverse people in the human genetics field,” she says. “I think people trust people who are more like them.”