Looking for Darwin with a Bad Pair of Eyes

By evolgen on July 11, 2007.

Bad tests for natural selection are bad at detecting selection.

Austin Hughes has published a fairly critical review of some methods used to detect natural selection in protein coding sequences. His attack on current methods for detecting natural selection is threefold. First, he claims that comparing non-synonymous to synonymous substitutions (see here) does not allow one to differentiate between adaptive evolution and relaxed selective constraint. Second, he argues that comparing polymorphism and divergence of synonymous and non-synonymous sites (see here) does not allow one to differentiate between adaptive and demographic explanations for departures from neutrality. And, third, he declares that both of these approaches are flawed because they assume that selection will act on protein coding sequences by fixing multiple amino acid substitutions since the divergence of the two sequences being compared.

I don't think you'll find disagreement from anyone that comparing non-synonymous and synonymous substitutions per site (dN and dS, respectively) is a pretty poor way to detect selection. I'd argue that it's poor not because it cannot be used to tell the difference between positive selection and relaxed constraint, but because it has such low power. Technically, dN>dS is evidence for natural selection, but there are other implementations of the test which compare dN/dS across multiple branches of a phylogeny; the latter approaches would yield false positives if relaxed selective constraint leads to an elevated dN along a particular lineage. And Hughes has a valid point that this test (along with some others) also assumes that natural selection will fix multiple amino acid changes; a violation of this assumption makes the test even less powerful.
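To make the dN versus dS comparison concrete, here is a minimal sketch in Python. It assumes you have already counted synonymous and non-synonymous sites and differences for a pair of aligned coding sequences, and it applies a Jukes-Cantor correction for multiple hits. The function names and the counts are made up for illustration; this is a simple counting-style estimate, not the likelihood machinery behind the branch-based tests.

```python
import math


def jukes_cantor(p):
    """Correct a proportion of differing sites for multiple hits (Jukes-Cantor)."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (dN, dS, dN/dS) from pre-computed site and difference counts."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("dS is zero, so the ratio is undefined")
    return dn, ds, dn / ds


# Hypothetical counts for one pair of sequences (not real data).
dn, ds, ratio = dn_ds(nonsyn_diffs=10, nonsyn_sites=700,
                      syn_diffs=30, syn_sites=300)
print(f"dN = {dn:.4f}, dS = {ds:.4f}, dN/dS = {ratio:.3f}")
```

A ratio well below 1, as with these invented counts, is what purifying selection looks like. A gene can have a handful of adaptive substitutions and still show dN/dS far below 1 across its whole length, which is the low-power problem.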

Not only does Hughes criticize the conservative dN/dS test as prone to yielding too many false positives, but some of his criticisms of the McDonald-Kreitman (MK) test (which compares dN and dS with synonymous and non-synonymous polymorphisms) also don't lead to the conclusions he'd like you to believe. I'll discuss some of those criticisms in a subsequent post, but I'd like to include one of them here. The MK test measures within-species polymorphism by counting the number of nucleotide sites that vary within the sample. This does not present a complete picture of nucleotide polymorphism; it's common to also measure the average number of differences between all pairs of sequences. Hughes correctly points out that deleterious mutations may be segregating as rare polymorphisms, which would elevate the amount of non-synonymous polymorphism in the data. Rather than leading to incorrect inferences of natural selection, this would actually make the test more conservative, because it would take an even greater excess of non-synonymous differences between species to reject the null hypothesis and infer natural selection.
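The point about rare deleterious polymorphisms can be checked with a toy calculation. The sketch below treats the MK test as a one-sided Fisher's exact test on the 2x2 table of fixed versus polymorphic, non-synonymous versus synonymous counts. That is one common way to run it, but the function name and all counts here are invented for illustration.

```python
from math import comb


def mk_excess_divergence_p(dn, ds, pn, ps):
    """One-sided Fisher's exact P for an excess of non-synonymous fixed differences.

    dn, ds: non-synonymous and synonymous fixed differences between species.
    pn, ps: non-synonymous and synonymous polymorphic sites within species.
    """
    total = dn + ds + pn + ps
    nonsyn = dn + pn   # column margin: all non-synonymous changes
    fixed = dn + ds    # row margin: all fixed differences
    # Hypergeometric tail: probability of dn or more non-synonymous changes
    # among the fixed differences, given the margins.
    tail = sum(comb(nonsyn, k) * comb(total - nonsyn, fixed - k)
               for k in range(dn, min(nonsyn, fixed) + 1))
    return tail / comb(total, fixed)


# Hypothetical counts (not from any real data set).
p_base = mk_excess_divergence_p(dn=7, ds=17, pn=2, ps=42)
# Same divergence, but with extra rare non-synonymous polymorphisms segregating.
p_inflated = mk_excess_divergence_p(dn=7, ds=17, pn=12, ps=42)
print(f"P without extra polymorphism: {p_base:.4f}")
print(f"P with extra polymorphism:    {p_inflated:.4f}")
```

Adding non-synonymous polymorphisms while holding divergence fixed pushes the P-value up, so it gets harder, not easier, to reject neutrality in favor of adaptive evolution.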

The Achilles heel of Hughes's article, however, is that he attacks only a subset of the approaches used to detect natural selection. The article does not mention any tests that use polymorphism data, other than the McDonald-Kreitman test. Analyses that look at the site frequency spectrum of DNA sequence polymorphism (see here) or haplotype blocks are able to detect recent selection events even if only a single nucleotide is under selection. By excluding a large swath of tests from his article, Hughes allows himself to attack a straw man of current approaches toward detecting natural selection. Ironically, he devotes a sizable chunk of his article toward defending Kimura's neutral theory against historical attacks that were based on a misunderstanding of the model.
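As one familiar example of a site frequency spectrum test (my choice of illustration; the post does not single one out), here is a rough sketch of Tajima's D. It contrasts the two polymorphism summaries mentioned earlier: the number of segregating sites and the average number of pairwise differences. The input is assumed to be a list of equal-length, gap-free aligned sequences, and the toy alignments are made up.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Count alignment columns with more than one nucleotide state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    pi = mean_pairwise_differences(seqs)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy alignments: every variant is a singleton, versus every variant
# at intermediate frequency.
rare_variants = ["TAAA", "ATAA", "AATA", "AAAT", "AAAA", "AAAA"]
balanced_variants = ["AAAA"] * 3 + ["TTTT"] * 3
print(tajimas_d(rare_variants))
print(tajimas_d(balanced_variants))
```

An excess of rare variants gives a negative D, which is the signature expected after a recent selective sweep (or population growth), and nothing in the calculation cares whether the sites are coding or non-coding.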

Hughes concludes that codon-based approaches toward detecting selection on DNA sequences are flawed and that we must use new techniques to detect natural selection in non-coding regions. He also includes a fair bit of text defending the importance of transcriptional regulatory regions in adaptive evolution (this part actually offers some solid criticisms of the Hoekstra and Coyne article reviewed here). But if he had performed an adequate survey of techniques that use polymorphism data to detect natural selection, he might have realized that they have the power to identify selection in non-coding regions. Even though it would be cool to be able to use gene expression data to detect natural selection, we're still missing the appropriate algorithms for such an analysis.

Hughes AL. 2007. Looking for Darwin in all the wrong places: the misguided quest for positive selection at the nucleotide sequence level. Heredity, in press. doi:10.1038/sj.hdy.6801031
