Imagine If 'Genomics Denialists' Get Their Grubby Hands on the Top Sekrit Microbiome Emails

I've stayed away from the CRU Swifthack brouhaha, largely because the experts are far better at debunking this denialist crap than I am. But, if there were such a thing as 'genomic denialists', they would probably have a field day if they were to get their hands on what we say.

For example, I'm involved in the Human Microbiome Project and we sometimes use phrases like 'bad data' or 'mediocre data.' I'm not talking about when something goes obviously wrong (e.g., controls fail), but the recognition that the technologies can fall short: technologies designed for genomics where every nucleotide is economically sequenced multiple times have enough errors that the precision of a single sequence is low (error rates can range from 1 to 5 bases in a 500 base pair read). Throw in chimerism (the creation, through PCR errors, of hybrid sequences that do not exist in the sample), and if you generate enough sequences, you will get some 'garbage' data (see what I mean?).
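To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It is not the project's pipeline: the function names, the coverage values, and the assumptions that errors are independent and that a simple majority vote decides the consensus base are all illustrative. It only shows why a technology can be fine when every nucleotide is read several times, yet imprecise when you have to trust a single read.

```python
from math import comb


def per_base_error(errors_per_read: int, read_length: int = 500) -> float:
    """Convert 'N errors in a read of read_length bases' into a per-base error rate."""
    return errors_per_read / read_length


def majority_wrong(p: float, coverage: int) -> float:
    """Probability that more than half of `coverage` reads are wrong at one position.

    Assumes (hypothetically) that reads err independently with probability p.
    This is an upper bound on a majority-vote consensus error, because the
    wrong reads would also have to agree on the same wrong base.
    """
    need = coverage // 2 + 1  # smallest number of wrong reads that outvotes the rest
    return sum(
        comb(coverage, k) * p**k * (1 - p) ** (coverage - k)
        for k in range(need, coverage + 1)
    )


if __name__ == "__main__":
    for errors in (1, 5):  # the 1-5 errors per 500 bp range quoted above
        p = per_base_error(errors)
        for coverage in (1, 5, 9):  # illustrative coverage depths, not HMP values
            print(f"{errors} errors/500 bp, {coverage}x coverage: "
                  f"per-position error <= {majority_wrong(p, coverage):.2e}")
```

At 1x coverage the bound is just the raw per-base error rate, which is the single-read situation described above; with redundant coverage it drops by orders of magnitude.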

Does this mean the Microbiome project data are junk? No, we just have to be cautious in our interpretation of the data--we can't treat every nucleotide in every sequence as 'canonical'. It certainly does not mean that genomics as a field is fundamentally flawed (feel free to insert joke about why genomics is fundamentally flawed...). But our discussions, both in terms of lingo and the occasional bout of dark humor, could be selectively edited to make us sound very bad.

We also do worry about how to present our findings (doesn't everyone?). When we were trying to figure out how to work with 454 data (i.e., what is an accurate sequencing read), we had a choice of two presentation styles:

1) We could say that we don't know what we're doing and that we generate shitty data, followed with "Damn, we suck ass."

2) We could say that we've solved similar problems with Sanger technology (true) and we are using a similar procedure to solve the 454 problems (we are, and doing so rather well, thank you).

Number 2 is the way to go here. It is obviously proof of the Grand Genome Konspiracy. But our private discussions about how to present these data could be selectively edited to sound very damning**. And if someone had an axe to grind, or a denialist claim to make, selective editing could be used to create a 'controversy.'

Just something to keep in mind when you hear about the wellspring of evil that is the email list of the Climate Research Unit of East Anglia (AAAIIIIEEEE!!!!).

**Given that creationists have tried to claim that Stephen J. Gould supported various creationist claims, this level of absurdity by denialists is nothing new.

Thanks for the very nice post. I've made similar points to people I know on the issue, though as someone who is not involved in research I didn't have as good a specific example as you did.

Dissecting genome data at the level of SNPs (single nucleotide polymorphisms) is pretty tough even with the latest technology. The accuracy will definitely get better, but it won't be 100%.

But is nature that perfect? Why is there a certain error rate for viral polymerases? Do we really have to be that perfect?

Nobody is perfect. Do we have to "trash" human beings just for not being perfect? No, human life is meaningful.

No genome data (DNA reading) is perfect. Do we need to trash the inaccurate data? No, the data is meaningful.

technologies designed for genomics where every nucleotide is economically sequenced multiple times have enough errors that the precision of a single sequence is low...

Nice post. One caveat is that while genomics has multiple sequencings, the CRU data is one temperature data set. While I can't say I'm overly informed, some of their methodology is clearly flawed in its lack of accounting for error; one of the better examples of this is treating historic tree ring data the same cross-regionally.

For me personally, it's not so much that global warming isn't real, it is that the politicization of it ignores the uncertainty. We could be making very costly mistakes no matter what we do (even nothing) and we need to acknowledge this fact.

By Mango Punch (not verified) on 18 Dec 2009

Nice post. One caveat is that while genomics has multiple sequencings, the CRU data is one temperature data set.

Other independent temperature datasets produced by the NCDC and GISS show similar results (but, ironically, slightly more warming over the last 10 years).

While I can't say I'm overly informed, some of their methodology is clearly flawed in its lack of accounting for error; one of the better examples of this is treating historic tree ring data the same cross-regionally.

I would like to know what you mean by this. If you have any specific references, I would appreciate them.