Navigenics has announced in the industry publication In Sequence (subscription only) that it plans to add gene sequencing to its personal genomics service. This would make it the first of the "Big Three" personal genomics companies (Navigenics, 23andMe and deCODEme) to offer analysis of rare as well as common genetic variants.
The move into sequencing has always been inevitable for the personal genomics industry. Currently all three of the major players in the affordable personal genomics field (as opposed to Knome's high-end service) use chip-based technology to analyse up to a million common sites of variation, known as SNPs, scattered throughout the genome. SNP chips provide remarkable insight into common variants (that is, variations with a frequency of 5% or greater in the general population), but they don't provide any real information about rarer variants - particularly those with a frequency of less than 1%.
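The frequency bands mentioned above can be made concrete with a small sketch. The 5% and 1% cut-offs are the ones used in this post; the function name and the "low-frequency" label for the band in between are assumptions added purely for illustration.

```python
def frequency_class(freq):
    """Bin a variant by its population frequency (a fraction from 0.0 to 1.0)."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    if freq >= 0.05:
        return "common"         # the range SNP chips cover well
    if freq < 0.01:
        return "rare"           # the range SNP chips largely miss
    return "low-frequency"      # assumed label for the 1-5% band


# Illustrative frequencies only, not real variants.
for f in (0.30, 0.02, 0.001):
    print(f, frequency_class(f))
```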
It has become increasingly clear over the last few years that common variants play a disappointingly small role in most common diseases, as SNP chip studies with ever-larger sample sizes have consistently failed to find the majority of disease-causing variation. Rather it appears likely that a substantial proportion of disease risk lurks in individually rare, large-effect variants - found in just a small fraction of the population, each contributing a sizeable increase in disease risk. Most of these variants will never be picked up by SNP chip technology, so new approaches will be required to find them - and that's where sequencing comes in. By determining the complete DNA code within a set of target genes, sequencing identifies common and rare variants alike.
Until now, the shift of the personal genomics industry into the sequencing market has been held back by two major barriers: cost, and the difficulty of interpreting rare variants. The first barrier is dropping with alarming speed, but the second is still a major challenge - and one that will pose some serious dilemmas for Navigenics and other companies as they launch their sequencing ventures.
Of course, these aren't new dilemmas: molecular diagnostics labs have for decades faced the challenge of determining whether or not a novel mutation is disease-causing, in the context of both rare Mendelian diseases (like muscular dystrophy) and, especially, complex diseases such as breast cancer (BRCA1 mutation analysis is a particularly subtle art that probably warrants its own post). Navigenics will thus be taking advantage of the experience and the databases of a company called Correlagen Diagnostics, which already offers sequencing-based tests for a range of known disease-causing genes. I don't know enough about Correlagen to comment on their expertise, but it certainly makes sense for personal genomics companies to team up with experienced molecular diagnostics teams as they face the challenges of the sequencing era.
Navigenics will initially restrict the complexity of the problem by focusing on a set of known disease genes, and will draw on Correlagen's database to see if any new variants they find in a client's genes are known to be associated with diseases in other patients. However, most of the possible disease-causing variants they find will be completely novel - such is the nature of rare variants - and their disease-causing status will thus need to be predicted de novo. Navigenics' solution is roughly laid out in the In Sequence article:
In many cases, though, a rare gene variant will never have been seen before and, thus, be more difficult to interpret. Based on the variant's properties, like its evolutionary conservation, or whether it results in an amino acid change, Navigenics will attempt to assign it a probability score that predicts its clinical relevance. "And that's a really, really hard problem," Stephan said.
What is needed, he said, is sequencing-based genome-wide association studies. "What you ideally would want to do is take thousands of people with a complex genetic disease and thousands of people without one, sequence their entire exomes, and look for hotspots of accumulation of rare variants in certain regions of the genome [of cases vs. controls, where] the specific variants look like they have some sort of functional consequences."
"Then, the next time a person comes through the door, you can start to informatically stratify the loci that you see variants in ... based on sequencing all these genomes."
I think Stephan is under-estimating the sample sizes required for these studies to be effective - we're talking hundreds of thousands of whole genomes, at least - but the overall message is on-target, and it's not good news for personal genomics customers expecting to find out what their genome means right now. It's going to take a long time and a tremendous amount of work before de novo functional prediction becomes a reliable proposition.
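The case-versus-control comparison Stephan describes can be sketched as a very simple gene-level "burden" scan: for each gene, count how many cases and how many controls carry at least one rare variant, then ask whether carriers are over-represented among cases. Stephan does not name a statistical method; the one-sided Fisher's exact test below is only a simple stand-in, and the function names, gene labels and counts are all hypothetical toy values rather than real data.

```python
from math import comb


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """Probability of seeing at least `case_carriers` carriers among the cases
    if carrier status were unrelated to disease (hypergeometric upper tail)."""
    carriers = case_carriers + control_carriers
    total = n_cases + n_controls
    denom = comb(total, n_cases)
    upper = min(carriers, n_cases)
    tail = sum(
        comb(carriers, k) * comb(total - carriers, n_cases - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denom


def burden_scan(counts, n_cases, n_controls):
    """counts maps gene -> (case_carriers, control_carriers).
    Returns (gene, p_value) pairs, most case-enriched gene first."""
    results = [
        (gene, fisher_one_sided(a, n_cases, b, n_controls))
        for gene, (a, b) in counts.items()
    ]
    return sorted(results, key=lambda pair: pair[1])


# Hypothetical toy counts: rare-variant carriers per gene in 2,000 cases
# and 2,000 controls. These numbers are invented for illustration.
toy_counts = {"GENE_A": (40, 12), "GENE_B": (15, 14), "GENE_C": (3, 5)}
for gene, p in burden_scan(toy_counts, 2000, 2000):
    print(gene, p)
```

A sketch like this also hints at why sample size matters so much: when each gene has only a handful of carriers, even a strong enrichment in cases is hard to distinguish from chance.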
Of course, that's not going to stop personal genomics companies from staking out claims in the sequencing arena, and from offering risk predictions from rare variants - however provisional and imperfect - to customers. 23andMe has long expressed interest in a sequencing approach, although co-founder Linda Avey is coy about the company's ambitions in the In Sequence article:
"23andMe is closely following the next-generation sequencing field and will offer an expanded service when the data quality, balanced by the cost, of these offerings meets our criteria," said Linda Avey, co-founder of 23andMe, in an e-mail message. Once the company decides to include sequencing analysis in its service, "we will examine any and all sequencing companies in determining which would work best with our platform," she said.
Navigenics will apparently be offering whole-exome sequencing (analysis of the protein-coding regions of all genes in the genome) some time next year, and complete genome sequencing at some stage after that. You can bet that 23andMe's desire to remain in the lead of the personal genomics industry will ensure that Navigenics will not be alone; at the same time, the whole-genome sequencing services offered by industry pioneer Knome and other emerging players will be dropping to affordable levels. When you throw in the current obscene rate of change in the sequencing technology sphere, this is likely to turn into a chaotic and fascinating race.
Let's not forget that thus far, all the SNPs found to be associated with diseases are only markers in linkage disequilibrium (LD) with the actual causal variants. No matter how much fine mapping you conduct around your SNPs of interest, sequencing is the most efficient way to find these causal variants.