Software company 5AM Solutions has just launched a neat little Firefox plug-in for customers of consumer genomics company 23andMe.
The idea is very simple:
- Download your raw data from 23andMe (or use one of the files from me or my colleagues at Genomes Unzipped);
- Install the plug-in from here and point it to your 23andMe data;
- Browse to a website discussing one of the genetic variants included on the 23andMe chip, and you'll see highlights around the rsID of any variant on the page (rsIDs are unique codes assigned by dbSNP to most of the common variants targeted by personal genomics companies);
- Mouse over the rsID and your own genotype for that SNP will appear.
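For the curious, the steps above boil down to a lookup table plus a pattern match. Here is a minimal Python sketch of that idea, not the plug-in's actual code (which I haven't seen). It assumes the usual 23andMe raw data layout of tab-separated rsid, chromosome, position and genotype columns with `#` comment lines; the function names `load_genotypes` and `annotate` are made up for illustration.

```python
import re
from typing import Dict, Iterable, List, Tuple

# rsIDs are "rs" followed by digits; \b keeps us from matching inside longer words.
RSID_PATTERN = re.compile(r"\brs\d+\b")


def load_genotypes(lines: Iterable[str]) -> Dict[str, str]:
    """Build an rsID -> genotype map from 23andMe-style raw data lines.

    Assumes tab-separated columns: rsid, chromosome, position, genotype.
    Comment lines (starting with '#') and blank lines are skipped.
    """
    genotypes: Dict[str, str] = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # malformed line; ignore rather than fail
        rsid, _chromosome, _position, genotype = fields[:4]
        genotypes[rsid] = genotype.strip()
    return genotypes


def annotate(text: str, genotypes: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (rsID, genotype) pairs for each distinct rsID in the text
    that is also present in the genotype map, in order of first appearance."""
    found: List[Tuple[str, str]] = []
    seen = set()
    for match in RSID_PATTERN.finditer(text):
        rsid = match.group(0)
        if rsid in genotypes and rsid not in seen:
            seen.add(rsid)
            found.append((rsid, genotypes[rsid]))
    return found
```

In a browser the same two pieces would run against the page text, with the genotype shown in a tooltip rather than returned in a list.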
For any 23andMe user who's ever come across a variant on PubMed and wondered what their own genotype was, then gone through the process of logging into 23andMe and checking, the value of this tool is immediately obvious.
Here's a screenshot using my own data:
SNPtips creator Andrew Evans has a blog post up explaining the rationale behind the project. I spoke to Evans by email earlier this week, and he told me that future plans for the tool include development for Chrome, extension to data-sets from other companies such as deCODEme and Navigenics, and provision for viewing data from multiple individuals (which will be useful for those with multiple genotyped family members, or for groups like Genomes Unzipped).
As more people gain access to increasingly comprehensive information about their own genomes, online tools will become essential for navigating the data rapids. This is a small but very useful step in that direction.
I love this idea. Is it 23andMe only, or does it work with deCODEme too?
Currently 23andMe only, but they have plans to extend it to other companies soon. Of course, for the impatient and informatics-savvy deCODEme customer it would be easy to reformat your deCODEme file as a 23andMe file, which would work just as well.
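For anyone wanting to try the reformatting, here is a rough Python sketch. It rests on assumptions I haven't verified against a current export: that the deCODEme file is a CSV with the header `Name,Variation,Chromosome,Position,Strand,YourCode`, and that a 23andMe-style file is tab-separated rsid, chromosome, position and genotype. The function name `decodeme_to_23andme` is made up, and whether the plug-in accepts the output is untested. Since 23andMe data is reported on the + strand, the sketch complements any genotype whose strand column is `-`.

```python
import csv
from typing import Iterable, Iterator

# Complement table for flipping a genotype to the opposite strand.
# Characters outside A/C/G/T (e.g. '-' for no-calls) pass through unchanged.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def decodeme_to_23andme(rows: Iterable[str]) -> Iterator[str]:
    """Yield 23andMe-style lines from deCODEme-style CSV lines.

    ASSUMED input header: Name,Variation,Chromosome,Position,Strand,YourCode
    Output columns (tab-separated): rsid, chromosome, position, genotype
    """
    reader = csv.DictReader(rows)
    yield "# rsid\tchromosome\tposition\tgenotype"
    for row in reader:
        genotype = row["YourCode"].strip()
        if row["Strand"].strip() == "-":
            # Report on the + strand, as 23andMe does.
            genotype = genotype.translate(COMPLEMENT)
        yield "\t".join(
            [row["Name"], row["Chromosome"], row["Position"], genotype]
        )
```

To convert a file, something like `out.writelines(line + "\n" for line in decodeme_to_23andme(open(path, newline="")))` should do, with `path` and `out` being your own input path and output file handle.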
Beyond that - Luke has imputed everyone in Genomes Unzipped onto HapMap 2, and I'm planning to try those files out later this week.
Neat tool! Re the AT/GC flip tweeted about, is that referring to the cases where the common use of the SNP is the opposite of the result provided by 23andMe? As in your (coincidental?) example - in dbSNP and most papers the LCT SNP is C>T (so you would be TT, lactose persistent)?
I guess I'll have to wait till the Chrome extension comes out then.
Who uses IE or Firefox anymore??
Keith: yes, that's correct. Users will just have to learn how to do strand flipping in their heads - the only problem will be A/T and C/G SNPs, but fortunately these are reasonably rare (0.75% of SNPs on the 23andMe v2 chip and 1.5% of SNPs on the v3 chip). Alternatively, geneticists could agree to adopt a consistent standard for strand representation (cue hysterical laughter).
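To make the A/T and C/G problem concrete, here is a small Python sketch (the helper names are my own, purely illustrative). Flipping strands means swapping each base for its complement. For most SNPs you can tell which strand a report uses because the flipped alleles don't match the expected pair, but for A/T and C/G SNPs the flipped alleles are the same pair, so the genotype alone can't tell you.

```python
# Watson-Crick complements: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def flip_strand(genotype: str) -> str:
    """Return the genotype as it would read on the opposite strand.

    Characters that are not A/C/G/T (e.g. '-' for no-calls) are kept as-is.
    """
    return "".join(COMPLEMENT.get(base, base) for base in genotype)


def is_strand_ambiguous(allele1: str, allele2: str) -> bool:
    """True for A/T and C/G SNPs, whose two alleles are complements
    of each other, so the allele pair looks identical on either strand."""
    return COMPLEMENT.get(allele1) == allele2
```

So a genotype reported as AA on one strand reads as TT on the other, which is easy to spot for an A/G SNP (TT is impossible on the A/G strand) but not for an A/T SNP.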
"Who uses IE or Firefox anymore??"
20.6% and 42.0% of ScienceBlogs readers, respectively. Chrome comes in third with a 17.9% market share. But that said, Chrome support is the first thing I asked about. :-)
Re: reverse complementing SNPs
In general, Illumina GWAS chips (unlike Affymetrix's) purposefully include only A/C, A/G, C/T and G/T SNPs, so the small number of A/T and C/G SNPs reported by 23andMe comes from their custom component on top. The fact that strand flipping is not a problem here is therefore an artifact of the technology used.
To take the bait (and as Daniel knows full well), the obvious method would be to report only on the forward strand, but that implies a fixed genome assembly. As it is, a large chunk of lab work is still done on assembly NCBI36, as that is what HapMap's data has been reported on. (And people don't know how to cope with alternate named haplotypes in GRCh37, but that's a different story.)
The alternative to fixing the assembly is to fix which allele to report first by sequence context - Illumina's version of that is online.
Yes, I should have specified that the AT/GC problem will be more of an issue for other chips (e.g. Navigenics, which last time I checked used an Affy 6.0 array).
And all fair points on the strandedness issue; strandedness has long been the bane of many a geneticist's existence, but I agree that's because it's a genuinely difficult problem to solve.
The Illumina GWAS chips do not assay A/T and C/G SNPs because it is cheaper not to - or, if you prefer, they have chosen to double the SNP density by not doing so:
Yes, I should get out more.
Hi everyone -
Yes, Chrome is definitely high on the port list, as Daniel points out. So far, it's the top vote in our informal survey from the release. Safari is a distant second, and IE is last with zero votes.
On the strand issues - these break down into two categories, basically - which strand the SNP is reported on, and whether the SNP is prone to ambiguity (A/T and C/G). Each probably calls for a different solution. 23andMe data is normalized to the + strand always, so we've taken the tack in this first release to just report it as-is, and assume that people will do whatever they normally do with their raw 23andMe data anyway (flip strands mentally, look up in dbSNP, etc.). We are looking at ways, however, to make this friendlier in future releases - at a minimum, we could flag the A/T and C/G SNPs so people have some warning when dealing with them. For strand orientation, we are contemplating ways to show dbSNP orientation alongside the native data, etc. This is a bit tricky because it either requires pre-processing of raw data or real-time API lookup, each of which has disadvantages.
deCODEme will probably be the platform we support next, just because it's one we are familiar with (so if you want Navigenics, or any other platform, *please* point us to some good data files to use for testing! email firstname.lastname@example.org if you don't mind). deCODEme conveniently puts strand orientation per SNP right in the file, even though not everything is normalized to the + strand.
Do stay tuned to SNPTips and 5AM Solutions - we have more cool things in the works...
Increasingly, SNPs are included on chips because they are the best tags for known effects, or because they are rare SNPs that cannot be tagged any other way, meaning that the GC/AT rate continues to rise. The Illumina 1M chip has 0.4% GC/AT SNPs, whereas the 2.5M has about 2.5%. Plus, Illumina genotyping is far from the only technology that calls variants, and sequencing calls are going to get increasingly common.
Now is the time to acknowledge that times have changed - with a stable genome, there is no good reason to not report something on the forward strand, and the future will thank you for doing so.