PGP volunteers note: it's hard to hide your APOE status

When James Watson's genome sequence was publicly released earlier this year, Watson famously kept only one region of his DNA a secret - the region encoding the APOE gene, which contains common variants that contribute substantially to the risk of late-onset Alzheimer's, and also affect predisposition to other diseases.

A recent article in the European Journal of Human Genetics shows something that shouldn't have come as a surprise to anyone familiar with human genetics: simply removing the APOE gene was not enough to prevent someone from inferring whether or not Watson carries the riskier versions of this gene, because other markers around the gene can also indirectly convey this information through the magic of linkage disequilibrium.
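For readers who want to see what linkage disequilibrium means in practice, here is a minimal sketch of the standard r² statistic between two biallelic sites, computed from phased haplotypes. The function name and the 0/1 allele coding are my own illustrative assumptions, and this is not the method used in the paper.

```python
def r_squared(haplotypes):
    """Return r^2 between two biallelic sites.

    haplotypes: list of (a, b) tuples, one per phased chromosome, where
    a and b are 0/1 allele codes at the two sites (hypothetical coding).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n    # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n    # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                             # a monomorphic site carries no information
    return d * d / denom
```

An r² near 1 means that knowing the allele at one site tells you almost everything about the other, which is why markers flanking a redacted gene can still give the game away.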

The authors kindly don't reveal Watson's APOE status, and in fact note that they warned Watson prior to publishing their paper so that he had time to take appropriate actions. He has since responded by removing an additional 2 million bases around the APOE gene from his public sequence.

That action largely removes the possibility of inferring his risk genotype using linkage - in fact, the authors note with dry Australian understatement that the removal of 2 million bases is "likely excessive". Watson could have used linkage information from the HapMap project to delineate the smallest required region, but apparently decided that overkill was the best policy.

It's worth noting that once we have complete genome sequences from sufficient individuals it will be straightforward to determine which DNA positions provide linkage-based information about a particular risk polymorphism (in a specific population, at least). That would allow the clean excision of only those bases that are absolutely required, thus having a smaller impact on research into the rest of the genome. (Of course, that relies on at least some people releasing their APOE sequence into the public domain, even if it turns out to carry the riskier version - I guess it's lucky for us we have anonymous genome sequencing projects like 1000 Genomes.)
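As a rough sketch of what "clean excision" could look like, suppose we already have, for one population, an r² value between each nearby position and the risk variant. The smallest stretch to redact is then the span covering every position above some cutoff. The function name, the input format and the 0.2 threshold are all hypothetical choices for illustration, not values from the paper or from HapMap.

```python
def redaction_window(ld_by_position, risk_position, threshold=0.2):
    """Return (start, end) of the smallest interval covering the risk
    variant and every position whose r^2 with it meets the threshold.

    ld_by_position: dict mapping base position -> r^2 with the risk variant
    threshold: assumed cutoff for "informative"; 0.2 is an arbitrary example
    """
    informative = [pos for pos, r2 in ld_by_position.items() if r2 >= threshold]
    informative.append(risk_position)  # the variant itself must always be removed
    return min(informative), max(informative)


# Toy example with made-up positions and r^2 values.
example_ld = {100: 0.05, 200: 0.6, 300: 0.9, 450: 0.3, 900: 0.01}
window = redaction_window(example_ld, risk_position=320)
```

A stricter threshold widens the window, so the choice is a trade-off between privacy and how much sequence stays available to researchers.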

The whole episode must be raising questions in the minds of some of the Personal Genome Project volunteers as they consider the prospect of releasing their own genome sequences to the world (participant number 8 has already raised the prospect of redacting his APOE sequence, while Misha Angrist is reserving the right to hold back, well, anything). Are there genes they should be hiding? If so, how much sequence do they need to delete? Ultimately, how do projects like the PGP reconcile the desire for partial genome privacy with the need to get sequences out there in the public domain to further genomic research?

Mind you, given the quality of the sequence data released so far, they probably don't need to worry too much for the moment...

Dale R Nyholt, Chang-En Yu, Peter M Visscher (2008). On Jim Watson's APOE status: genetic information is hard to hide. European Journal of Human Genetics. DOI: 10.1038/ejhg.2008.198
