Lupski, J.R., et al. (2010). Whole-genome sequencing in a patient with Charcot-Marie-Tooth neuropathy. New England Journal of Medicine advance online 10.1056/nejmoa0908094
Two new papers out today - the first ever studies to employ whole-genome sequencing for disease gene discovery - neatly illustrate both the promise and the challenges that lie ahead for clinical and personal genomics.
Despite having entire genome sequences from four individuals, the researchers could only narrow down the list of candidate disease-causing genes to a shortlist of four - and it was only with the addition of large-scale sequence data from an additional two unrelated patients that the most likely gene could be identified (this result was published in a separate paper in November last year).
The basic problem here is that we're still extremely bad at differentiating between mutations causing serious disease and perfectly benign polymorphisms - each of us has a genome littered with genetic variants that look like nasty mutations but have little or no effect on health. In fact, Lupski's genome illustrates this nicely: one of the mutations causing his disease is a premature stop codon that disrupts the function of a gene - but his genome also contains an additional 120 stop codons disrupting other genes, presumably without severe health effects.
There are some ominous implications here for personal genomics as we move into the whole-genome sequencing era. If it's hard to find a severe disease mutation using four complete genomes, how much more difficult will it be to interpret variants with much more subtle effects on health using only one genome (i.e. your own)? What will we do with rare, potentially serious-looking variants found in an individual's genome but nowhere else?
Fantastic post! This is precisely the problem with the rush to market with these tools and why Pollack will write his article. The future of this stuff for predicting common disease risk will require 100,000 genomes in most cases. We have a long way to go. And you highlight some of the problems very nicely.
Thanks for the no hype post!
I think disease identification by genome sequencing reflects an inflection point in the maturation of the applications of DNA sequencing. Perhaps, historically, 2010 will be noted for these publications in the context of the other genomes coming online monthly.
From Nick Wade's NYTimes article on these papers:
"About 2,000 sites on the human genome have been statistically linked with various diseases, but in many cases the sites are not inside working genes, suggesting there may be some conceptual flaw in the statistics."
Heh - I'm right in the middle of blogging that exact same paragraph.
Seems like someone's been listening a little too hard to David Goldstein?
Also, Wade's comment, "less than a dozen genomes had been decoded, all of healthy people," has folks in St. Louis apoplectic.
Brilliant and informative post especially important in view of the NY Times article on the subject. I found the reader comments on the article almost as interesting as the article itself. People are very worried about their privacy and are also becoming savvy enough to realize we need a lot more data.
I think whole genome sequence based GWAS will truly shine when it is cheap enough to do it with at least 1,000 cases and 1,000 controls.
I suspect this will take another five years to get there. Two to three years to get the genome sequenced at less than $1,000. Another two to three years to develop the computational algorithms and infrastructure and to actually conduct the case control study.
I think you're being a little pessimistic - I'm expecting to see the $1,000 (reagent cost) genome in 2011, and the infrastructure required to map and analyse 2,000 genomes is already available; it's just expensive to put together.
One thing missing from your analysis (which is well above average) is some discussion about any hope of eventual "cost effectiveness".
Personally, I see very little hope there. The NYT states that an entire sequenced genome can now be yours for "only" $50,000. What they neglect to mention is the cost of then pulling any useful information out of the incredibly huge number of data points.
And the subsequent cost of putting any actual information to use - creating some kind of functional therapy. Designer RNAs, possibly, or specific protein blockers?
None of that is cheap, or ever will be.
So what they're working on, really, is NOT a technology which will bring broad benefits to the health of mankind - but a new way to extend the lifespans of the extremely wealthy.
Personally, I'm not in the mood at the moment to help them out.
I have CMT. I have waited for treatments all my life. Please help CMT patients.