Inferring Demographic History Using Multiple Loci

One of the drums I beat around here pertains to inferring demographic history using molecular markers (i.e., DNA data). I've been known to go off on people who make claims about ancestral population sizes based on studies of a single locus or gene. You see, studying a single locus only gives you the evolutionary history of that locus. There is no way to untangle the effects of natural selection from those of demography without examining multiple loci.

The coalescent is a popular statistical framework for studying DNA sequence polymorphism. Combining Bayesian analysis with coalescent theory has led to some advances in inferring demographic history. Alexei Drummond and colleagues developed what they call a Bayesian Skyline Plot, which allows you to infer historical changes in population size. On his blog, Drummond describes an extension to the Bayesian Skyline Plot that takes advantage of data from multiple loci. Here's what they found:

These results demonstrate the essential role of multi-locus data in recovering complex population dynamics. Multi-locus data from a small number of individuals can precisely recover past bottlenecks in population size which can not be characterized with a single locus. However typical data sets used today are probably too small for obtaining precise estimates of population history and providing information on past bottlenecks.

The example given is one of multiple bottlenecks. When looking at a single locus, one can only detect one bottleneck event. However, examining multiple loci allows one to detect bottleneck events that occurred prior to the most recent event. I'd be hesitant to infer any bottleneck events from a single locus for the reasons I gave above.
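
To make the single-locus problem concrete, here is a toy simulation in Python. It is not the Bayesian Skyline Plot or anything from Drummond's software; it is just a minimal sketch of the standard neutral coalescent, assuming a constant population size and unlinked loci. The function names (simulate_tmrca, expected_tmrca) are made up for illustration. It draws the time to the most recent common ancestor (TMRCA) for one locus, then for many independent loci, with time measured in units of 2N generations.

```python
import random
import statistics


def simulate_tmrca(n_samples, rng):
    """Draw one TMRCA for n_samples lineages at a single neutral locus.

    Assumes the standard constant-size coalescent, with time measured
    in units of 2N generations (an assumption of this sketch).
    """
    t = 0.0
    k = n_samples
    while k > 1:
        # With k lineages there are k*(k-1)/2 pairs that could coalesce,
        # so the waiting time to the next event is exponential at that rate.
        rate = k * (k - 1) / 2
        t += rng.expovariate(rate)
        k -= 1
    return t


def expected_tmrca(n_samples):
    """Theoretical mean TMRCA under the same model: 2 * (1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n_samples)


rng = random.Random(42)  # fixed seed so the run is repeatable
n_samples = 10

# One locus is a single draw from a wide distribution.
single_locus = simulate_tmrca(n_samples, rng)

# Many unlinked loci share the same demography but have independent genealogies.
many_loci = [simulate_tmrca(n_samples, rng) for _ in range(2000)]

print("single-locus TMRCA:", round(single_locus, 3))
print("mean across loci:  ", round(statistics.mean(many_loci), 3))
print("theoretical mean:  ", round(expected_tmrca(n_samples), 3))
print("std. dev. per locus:", round(statistics.stdev(many_loci), 3))
```

The per-locus standard deviation comes out on the same order as the mean, even though every locus experienced the same (constant) population history. Any one genealogy is a single noisy draw, which is why averaging over independent loci is what lets you say something about demography rather than about the quirks of one gene. The sketch ignores selection entirely; a selected locus would depart from this distribution for reasons that have nothing to do with population size.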
