Personal Genomes Could Soon Be Public Information
Gene detectives ID ‘anonymous’ men in registry
In a genetic sleuthing study, gene detectives identified about 12% of seemingly anonymous men, using public genealogy records to triangulate on their last names.
Scientific sleuths identified 12% of "anonymous" men in a genetics registry with publicly available genealogy records. The report raises privacy concerns in an era when more and more genetics data is becoming public.
The registries contain the genomes, or full genetic profiles, of those who volunteered to have their genes analyzed. Since the completion of human genome efforts a decade ago, increasing numbers of such human gene maps have appeared in research registries, even as the price of completing one has fallen to a few thousand dollars. Popular genealogical or ancestry-tracing efforts have also started using genes tied to family names, making information about genes that run in particular families public.
"Everyone benefits from more genomic information finding its way into research, offering clues to rare diseases and other ailments," says Yaniv Erlich of the Whitehead Institute for Biomedical Research in Cambridge, Mass., who headed the demonstration effort reported Thursday in the journal Science. "But we wanted to illuminate the issue of potential privacy issues," he says, not to stop the flood of genomic information helping researchers, but to close off avenues for privacy abuse.
In the study, the team started with men in the international 1000 Genomes Project, an anonymous registry of genomes. They next tied male genes in the registry to the last names of men on publicly available ancestry-tracing sites, and finally used age and location information available in the project data to identify about 50 men. Fairly common last names tied to distinct genes in ancestry site records served as the best pointers to the identities of the otherwise anonymous men in the project. The study does not name them, although a test run did identify genetic luminaries such as human genome pioneer Craig Venter of the J. Craig Venter Institute in Rockville, Md., whose information is online without any disguise.
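The three steps described above (match male-line markers to a surname, then narrow by age and location) can be sketched in a few lines of Python. This is only an illustrative toy, not the researchers' actual method: the marker values, names, record layout, and the `infer_surnames` and `narrow_candidates` helpers are all hypothetical, and the exact-match scoring is an assumption made for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenealogyRecord:
    surname: str
    markers: tuple  # hypothetical male-line marker values from an ancestry site


@dataclass(frozen=True)
class PublicRecord:
    name: str
    surname: str
    age: int
    region: str


def infer_surnames(markers, genealogy, min_matches):
    """Return surnames whose markers agree with the query at min_matches or more positions."""
    hits = set()
    for rec in genealogy:
        matches = sum(a == b for a, b in zip(markers, rec.markers))
        if matches >= min_matches:
            hits.add(rec.surname)
    return hits


def narrow_candidates(surnames, age, region, public_records):
    """Keep only people with a candidate surname and the given age and region."""
    return [
        p for p in public_records
        if p.surname in surnames and p.age == age and p.region == region
    ]


# Toy data: every value here is invented for illustration.
genealogy = [
    GenealogyRecord("Alder", (12, 24, 14, 11)),
    GenealogyRecord("Birch", (13, 23, 15, 10)),
]
public_records = [
    PublicRecord("Sam Alder", "Alder", 52, "North"),
    PublicRecord("Tom Alder", "Alder", 31, "North"),
    PublicRecord("Lee Birch", "Birch", 52, "North"),
]

anonymous_markers = (12, 24, 14, 10)  # markers from an "anonymous" registry entry
surnames = infer_surnames(anonymous_markers, genealogy, min_matches=3)
candidates = narrow_candidates(surnames, age=52, region="North",
                               public_records=public_records)
print([p.name for p in candidates])  # prints ['Sam Alder']
```

Even in this toy, the surname alone leaves two candidates; it is the added age and region that single out one person, which is why hiding age information reduces the risk.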
"In one sense, this is a unique situation," applying only to male genes, says geneticist Martin Bobrow of the United Kingdom’s University of Cambridge, who was not part of the study. "But in another sense, the critical factor is that many people are putting chunks of their personal DNA sequence into databases which are not well secured from public access."
The 2008 Genetic Information Nondiscrimination Act (GINA), a federal law, bars insurers and employers from discriminating on the basis of genetic tests. And National Institutes of Health (NIH) officials have hidden age information from the "1000 Genomes Project" used in the demonstration, in response to the report released Thursday. Erlich suggests that better computing ability, combined with more and more genetic information, raises a need to prevent yet-unforeseen privacy abuses while preserving the medical and social benefits. In his own lab, for example, Erlich notes such databases have told parents carrying genes for dangerous rare diseases about their fertility risks.
"I would think someone whose privacy alarm bells would be set off by this is already unlikely to be a participant in a study that made their genome data available," says Princeton genome expert Leonid Kruglyak. "But it is something that will need to be noted in informed consent forms," he says, and perhaps considered in non-discrimination law.
"My genome is all out there and I’ve suffered no ill effects," says Venter, who was an adviser to lawmakers behind the 2008 law. "I would actually encourage people to put more of their genetic information online."
Article from: usatoday.com
How the genetics community addresses these issues is crucial to how large-scale genetic studies will proceed. Although research participants are already sometimes told that their data might not remain private — as the CEPH study participants were — the fact that their identities could be revealed would seem a remote risk to them, as that has only recently become possible. It is now imperative that participants fully understand that it is unlikely that their identities can be kept hidden if their genetic data are revealed. Some participants might welcome this, such as those with an interest in genealogy. Others — perhaps those with stigmatized diseases, for instance — might not.
Source: Nature.com
Moving data behind a controlled-access barrier lessens their utility to science and to society at large. But researchers need to show the public that they are acting as careful stewards of the data entrusted to them. Erlich argues that the solution is to make sure that participants understand what they’re signing up for, and to adopt laws that adequately protect people against the misuse of their genetic information.
Geneticists are brainstorming other proposals for balancing data sharing with the need to protect the privacy of research subjects. One is to move more data behind a controlled-access barrier, but to authorize trusted users to access the data from many studies, rather than having to obtain it piecemeal from different studies, as researchers must do today. There are logistical barriers to this — for instance, ensuring compatibility across databases. And it is debatable whether such restrictions might do more harm than good.
But if controlled access is not the right solution, it is up to the research community, in consultation with the public, to devise a better one. A solution should come sooner, rather than later, because this latest revelation of a privacy loophole will be far from the last.