The Frailty of Nearly Neutral Hypotheses


Mike Lynch has been getting a fair bit of hype recently for his nearly neutral model of genome evolution (see here and here). The nearly neutral theory riffs off the idea that the ability of natural selection to purge deleterious mutations and fix advantageous mutations depends on the effective population size of the population in which the mutations arise. From here, the nearly neutral theory predicts that more slightly deleterious mutations and fewer slightly advantageous mutations will fix in small populations compared to large populations (see here and here for previous posts on this topic).
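To put rough numbers on that prediction, here is a minimal sketch using Kimura's diffusion approximation for the fixation probability of a new mutation. It is an illustration, not anything taken from Lynch's papers: the function names, the selection coefficients (plus or minus 1e-4), and the population sizes are all my own assumptions, and I assume a diploid population with additive selection whose effective size equals its census size N.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Kimura's diffusion approximation for a single new mutation
    (initial frequency 1/(2N)) in a diploid population.

    Assumes additive selection with coefficient s and an effective
    population size equal to the census size n (an illustrative assumption).
    """
    if s == 0:
        return 1.0 / (2 * n)  # strictly neutral case
    # (1 - exp(-2s)) / (1 - exp(-4Ns)), written with expm1 for accuracy
    return math.expm1(-2 * s) / math.expm1(-4 * n * s)


def relative_fixation_rate(s: float, n: int) -> float:
    """Fixation probability scaled by the neutral expectation 1/(2N).

    A value near 1 means the mutation behaves as if it were neutral.
    """
    return fixation_probability(s, n) * 2 * n


# Hypothetical selection coefficients and population sizes
for n in (100, 10_000, 100_000):
    deleterious = relative_fixation_rate(-1e-4, n)
    advantageous = relative_fixation_rate(1e-4, n)
    print(f"N={n:>7}: deleterious {deleterious:.3g}, advantageous {advantageous:.3g}")
```

Under these assumptions, both kinds of mutation fix at close to the neutral rate in the smallest population, while in the largest one the slightly deleterious mutation is all but guaranteed to be purged and the slightly advantageous one fixes far more often than a neutral mutation would.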

Lynch argues that the evolution of genomes can be best understood in a nearly neutral framework rather than a purely neutral or adaptive one (see here for a review by Dan Hartl). He claims that features of eukaryotic genomes (i.e., introns, transcriptional regulatory regions, and large genomes in general) are slightly deleterious because they increase the size of mutational targets, which can lead to a decrease in fitness if they get hit with a mutation. These features evolved (despite their fitness costs) because eukaryotes have small populations relative to prokaryotes, which allowed these slightly deleterious features to fix. Furthermore, microbial eukaryotes tend to have larger population sizes than multicellular eukaryotes, and microbial eukaryotes have more streamlined genomes, on average, than multicellular ones.

While Lynch's model deals with trends observed over multiple comparisons, some people are using nearly neutral models to explain the evolution of individual genomic features. I'll take a look at how nearly neutral explanations are invoked in the evolution of repetitive sequences in the human genome (Gherman et al 2007) and the evolution of protein coding sequences (the Hughes paper I discussed previously). I question the appropriateness of the demographic explanations proposed by these authors and present alternative explanations that account for the same data equally well.

Demographic versus Molecular Explanations of Transposable Element Distributions

Human genomes are loaded with junk -- DNA sequences with no function. A large fraction of that junk is made up of transposable elements (many of which spread throughout genomes by getting transcribed into RNA, then reverse transcribed back into DNA and inserted in another spot in the genome) and processed pseudogenes (transcribed genes which have been reverse transcribed and inserted into the genome). A smaller fraction of the junk consists of fragments of the mitochondrial genome that have inserted themselves into the nuclear genome (these are known as numts).

A paper in the pipeline at PLoS Genetics reports on the temporal distribution of numt insertion events (see here for a press release and here for TR Gregory's smack down of the crappy coverage). The authors find that a disproportionate number of the numts in the human genome were inserted about 54 million years ago. This date corresponds to both the approximate divergence time of new world monkeys from old world monkeys and apes (see here for a phylogeny) and to the Paleocene-Eocene boundary. Previous research showed that there was also a burst of transposable element insertions (specifically Alu elements) around this same time.

Both numts and transposable elements, for the most part, range in fitness costs to the host organism from neutral to deleterious. Recall that the nearly neutral theory predicts that slightly deleterious mutations have a higher probability of fixation in small populations. The authors of the paper, led by Nicholas Katsanis, conclude that, because independent non-functional sequences experienced a simultaneous increase in the rate of insertion, the ancestral populations along the lineage to humans went through a population bottleneck during this time. That decrease in population size allowed for a relaxation of selection against these insertions.

This nearly neutral explanation for the current genomic distribution of Alus and numts makes a nice just-so story, but it suffers from the same limitations as adaptationist stories. First of all, what makes the bottleneck along the catarrhine lineage (apes and old-world monkeys) different from other bottlenecks -- allowing for the accumulation in this bottleneck but not others? Does it have something to do with the Paleocene-Eocene boundary? If so, did other lineages experience the same burst in the accumulation of transposable elements and other junk?

It seems like a purely molecular explanation is just as plausible as the nearly neutral one. It is known that organisms evolve mechanisms to suppress the activity of transposable elements. This differs from purging transposon insertions via natural selection; natural selection involves a decrease in fitness of individuals carrying insertions, which means they don't pass them on to subsequent generations. The molecular mechanism is independent of natural selection -- transposable elements are simply kept from replicating themselves throughout genomes. It's hard to deny that there was a burst of numt and Alu insertions in the human lineage around 54 million years ago, but the demographic explanation is as much of a spandrel as adaptive hypotheses for the evolution of this genomic feature.

Nearly Neutral Explanations of the McDonald-Kreitman Test

The second paper was discussed here a couple of days ago. Briefly, in this paper Austin Hughes argues that current approaches toward detecting natural selection in DNA sequences are flawed. My previous critique pointed out that Hughes neglects a large subset of methods used to detect evidence for natural selection, focusing on only two approaches. Here, I will point out that his criticism of one of those approaches contains faulty reasoning regarding nearly neutral expectations.

Hughes criticizes the McDonald-Kreitman (MK) test for its inability to distinguish between adaptive and demographic explanations for departures from neutrality. This test compares polymorphism and divergence at synonymous and non-synonymous sites using a 2x2 contingency table (reviewed here). An excess of non-synonymous differences between species, relative to non-synonymous polymorphisms within one species, can be explained by invoking either adaptive or nearly neutral mechanisms. The common interpretation is that excess non-synonymous differences were fixed by selection (this is the adaptive explanation). Alternatively, if there was a population bottleneck at some point in a species' evolutionary history, slightly deleterious non-synonymous differences may have been fixed, which would also result in an excess of non-synonymous differences between the species.
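For readers who haven't seen the test laid out, here is a minimal sketch of the 2x2 table and one common way of testing it (a two-sided Fisher's exact test), along with the neutrality index. The counts are hypothetical numbers I made up for illustration, not data from Hughes or any other paper, and the function names are my own. Note that the test only tells you the ratios differ; it says nothing about whether the cause is adaptive or demographic, which is the whole point of the argument here.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small tolerance guards against floating-point ties
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


def neutrality_index(dn: int, ds: int, pn: int, ps: int) -> float:
    """(Pn/Ps) / (Dn/Ds); values below 1 indicate an excess of
    non-synonymous fixed differences."""
    return (pn / ps) / (dn / ds)


# Hypothetical counts, for illustration only
dn, ds = 20, 30  # fixed differences: non-synonymous, synonymous
pn, ps = 5, 40   # polymorphisms: non-synonymous, synonymous

# Rows are site classes; columns are (fixed, polymorphic)
p_value = fisher_exact_two_sided(dn, pn, ds, ps)
print(f"NI = {neutrality_index(dn, ds, pn, ps):.4f}, p = {p_value:.4g}")
```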

Hughes points out that MK tests performed on data from D. melanogaster identify more amino acid changes fixed by selection than tests performed on human data. He offers the following nearly neutral explanation:

The high rate of 'positive selection' detected by the MK test in Drosophila can be explained by fixation of slightly deleterious mutations during a bottleneck in the process of speciation (Ohta, 1993). The level of nucleotide diversity in D. melanogaster is at least five times as great as that in the human species, indicating a much larger long-term effective population size in the former than in the latter (Li and Sadler, 1991). With an origin in Sub-Saharan Africa, this species was largely unaffected by Pleistocene glaciation, a major cause of bottlenecks in species of the North Temperate zones (Hughes and Hughes, 2007). Given a large effective population size for a long time, the nearly neutral theory predicts that slightly deleterious mutations will have a good chance of being purged by purifying selection. Thus, the highly effective purifying selection within D. melanogaster, by lowering Pn [the frequency of non-synonymous polymorphisms], causes Dn [the frequency of non-synonymous fixed differences] to appear large by comparison.

As cool as this nearly neutral explanation is, it's not the only one we can pull out of our ass. How do you like this one: because Drosophila have historically larger effective population sizes than hominids they have been able to fix more slightly advantageous amino acid mutations. That leads to an excess of amino acid substitutions along the lineage leading to D. melanogaster. What makes my nearly neutral hypothesis any better or worse than Hughes's hypothesis? You see, two can play at this just-so-story game.

Conclusions

If you're going to attack adaptationist explanations of molecular evolution for lacking rigor, you should present a reasoned explanation with empirical evidence. That's not to say that the abundance of TEs in the human genome that were inserted about 54 million years ago isn't the result of a population bottleneck. But where is the evidence for that bottleneck? And if you're going to present one nearly neutral hypothesis as an alternative to an adaptationist story, you should also be aware that there are slightly adaptationist explanations for the same data. As I've said before, a demographic explanation requires the same quality of evidence as an adaptationist one, and, without such evidence, is open to the same ridicule as a spandrel-ridden selectionist story.


Gherman A, Chen PE, Teslovich T, Stankiewicz P, Withers M, et al. 2007. Population bottlenecks as a potential major shaping force of human genome architecture. PLoS Genet. In press. doi:10.1371/journal.pgen.0030119.eor

Hughes AL. 2007. Looking for Darwin in all the wrong places: the misguided quest for positive selection at the nucleotide sequence level. Heredity In press doi:10.1038/sj.hdy.6801031

Lynch M. 2006. The origins of eukaryotic gene structure. Mol Biol Evol 23:450-468. doi:10.1093/molbev/msj050

Lynch M. 2007. The frailty of adaptive hypotheses for the origins of organismal complexity. Proc Natl Acad Sci USA 104:8597-8604 doi:10.1073/pnas.0702207104


I completely agree with your take on poor "just-so" stories being possible on all sides, quite true. I would say that all hypotheses are "just-so" in some extremely oversimplified way, especially when faced with uncertain histories and contingencies.

However, I think a larger point of the nearly neutral view is that nearly neutral forces affect everything in the genome while adaptationist hypotheses must have a necessarily narrowed focus. So that to a rough approximation, nearly neutral hypotheses are more clearly related to a general null model than are adaptationist hypotheses. Gherman et al.'s bottleneck should leave more signature in the genome than just numts.

I don't know much about molecular mechanisms of TE suppression, but how can your molecular alternative exist outside of a population genetics context, whether via selection or otherwise? I don't understand.

You are correct that my alternative does not exist entirely outside of a popgen context. If a suppressor of TE activity were to emerge via mutation, it would be selectively advantageous. But this differs from selection against the actual insertion events (i.e., removal, via purifying selection, of individuals carrying polymorphic insertions).

As for the specific mechanisms, I'm a bit hazy as well. I do know of suppressors of P-element activity in Drosophila, but I don't know about Alus or numts in primates.