Evolution, Information, Hamlet, and Improbability

Over the last couple of days, Dr. Michael Egnor, an anti-evolution neurosurgeon who recently signed on to the Discovery Institute's list of "scientists" who doubt evolution, has created quite a stir here at Scienceblogs. Quite a few Sciencebloggers have already weighed in on his specific arguments, with PZ and Orac leading the charge. They've already dealt with the basics of his information theory arguments, so I'm going to focus mostly on his basic appeal to complexity - and in particular on his analogy between Shakespeare and life.

In a recent comment on Pharyngula, Egnor writes:

Typographical errors can alter a text in ways that occasionally change the meaning of the text, and in that sense may add new information. But we would never ascribe a long complex text (Shakespeare's 'Hamlet') solely to an accumulation of typographical errors. The limits to which non-intelligent variation can produce complexity are real.

What is the statistical boundary at which one may say that an event (eg generation of a meaningful English text) can't happen without intelligent agency? Some mathematicians have tried to define it. Emile Borel suggested that it was between one chance in 10^50 and one chance in 10^150. Because of the large combinatorial space of the English alphabet, that's merely a standard English sentence.

This accords with ordinary experience. We accept slight word variations as typos, but not complete sentences.

This is pretty standard anti-evolutionist rhetoric. It's also a pretty inapt (or should that be inept) analogy, because it implicitly assumes something that simply is not the case - that complexity is the product, rather than the byproduct, of evolution. Complexity has come about, as Richard Dawkins pointed out in a lecture I attended last week, because increasing complexity has allowed organisms to survive and reproduce in environments where all the simple ways of living were already taken.

To borrow an analogy from Dawkins, both a virus and a rhinoceros have been designed (by natural selection) to do the same thing: make more copies of themselves. A virus is an exceptionally efficient self-replicator; a rhino is an exceptionally inefficient self-replicator. Both do, however, manage to succeed at self-replication well enough that there are still viruses and rhinos out there, making more viruses and rhinos.

Life did not set out on a mission to become more complex. Life has gotten more complex because increased complexity has proven to be a successful method of facilitating self-replication.

Let's look at Egnor's main question from this perspective:

There's no question that a random process (meaning 'one not directed by an intelligent agent') can produce some information, however it is measured. But the question is how much, and the question really matters.

At this point, the first thing that anyone who wants to answer Egnor's question needs to know is how to define and measure "information." That's really a non-trivial issue. People have talked about different measures of information, and about how biologically meaningful they are, and so forth. Once that is settled, there are still more issues that would need to be hammered out - such as how to measure the information content of any particular organism, and whether a beneficial change could result in a loss of information, and so on.

Even after addressing all of those things, though, we wouldn't really be any closer to addressing Egnor's question, because it really isn't well-formulated. If the only "goal" of life is to make more life, then the interesting question is not whether a non-intelligence-driven process can produce the total information that we see today. The question is simply whether the process can move in small steps over a large period of time from whatever the starting point was (I won't pretend to know that) to what we see today.

Going back to the Shakespeare analogy, the question isn't whether one can get Shakespeare through the hypothetical language process. The question is whether one can get something complex and meaningful by making small changes to something that started out as a simple and meaningful statement.


Nice post! One bone to pick, though. I'm sure, as a zoologist, that you already know this, but I think that this sentence is a little misleading "Life has gotten more complex because increased complexity has proven to be a successful method of facilitating self-replication."

I don't think it's true that increasing complexity has proven itself as a more efficient means of facilitating reproduction. I think that, in terms of complexity, what we have is a Red Queen scenario. Some photosynthetic bacteria found it advantageous to evolve multicellularity (presumably to capture more sunlight). So that kind of puts out of business all the organisms that eat things one cell at a time (I know, I'm vastly oversimplifying here). So heterotrophs decide that it's in their best interest to evolve multicellularity too. So the proto-plants evolve into plants, and the sponges evolve into fish, and the water-based plants evolve into land-based plants, and fish evolve into reptiles, etc., etc.

But the fact is, simplicity will always be the primary strategy for reproduction. The bacteria/archaea will always vastly outnumber the eukaryota, in both sheer numbers and sheer biomass. Hence, I don't think that you can honestly say that complexity is a viable means to consistently increase reproductive fitness. We complex organisms all fill a niche, but that niche could go away tomorrow based solely on the vagaries of 'simple' organisms' decisions.

If you look closely, you will find that I never said that increasing complexity is a more efficient means of facilitating reproduction. What I said is that increasing complexity has proven to be successful - and all that "successful" means in an evolutionary context is that it works well enough to keep working.

There are several rudimentary flaws in Egnor's common typographical error analogy:

English uses an alphabet of 26 letters plus multiple punctuation marks. DNA uses an "alphabet" of 4 "letters"/base pairs.

English vocabulary is too extensive to compile in a portable dictionary. DNA uses only 64 "words"/codons, each only 3 base pairs long. You could print a codon dictionary on a quarter page, with room to spare!

Also, with DNA, EVERY combination of 3 "letters"/BPs is a valid "word"/codon (though some are less optimal than others).

English typos rarely generate valid words, because English has dramatically more letters than DNA, and because, unlike DNA codons, the overwhelming majority of combinations of English letters are meaningless. English is an inefficient language in this way.
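A quick count makes the size difference concrete. This is only a minimal sketch: the four-letter alphabet "ACGT" is the usual shorthand for the DNA bases, while the helper name `all_triplets` is made up for illustration and punctuation is ignored on the English side.

```python
from itertools import product

DNA_ALPHABET = "ACGT"      # the four DNA "letters"
ENGLISH_LETTERS = 26       # assumption: letters only, no punctuation or spaces

def all_triplets(alphabet):
    """Return every possible 3-letter string over the given alphabet."""
    return ["".join(t) for t in product(alphabet, repeat=3)]

codons = all_triplets(DNA_ALPHABET)

# Every one of these triplets is a codon in the genetic code.
print(len(codons))            # 4**3 = 64
# Number of possible three-letter English strings, most of them not words.
print(ENGLISH_LETTERS ** 3)   # 26**3 = 17576
```

So a random three-letter change in DNA always lands on some codon, while a random three-letter string in English lands in a space of 17,576 combinations, most of which mean nothing.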

Obviously, this makes it completely juvenile to suggest that probabilities of viable genetic mutations are equivalent to monkeys hammering on typewriters, or typos, or whatever.

The next major flaw in the analogy is a lack of specification of the selection pressure. Instead of English, let's pretend we're dealing with typos in a 4-character language with a small, efficient vocabulary. What is the mechanism for determining which sentences are "better" than others?

Make the language radically simpler and more efficient than English, establish a "typo" mechanism for imperfect replication, and set up a selection system for differentially replicating the results based on inherited traits -- now we're in action! You would expect to see the selected traits more strongly expressed in future versions. We've got a name for that process.
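Here is a rough sketch of that setup, not a model of real biology. Everything specific in it is an assumption made up for illustration: the `fitness` function (which arbitrarily favors the letters G and C), the typo rate, the population size, and the rule that the better-scoring half of each generation gets copied twice.

```python
import random

ALPHABET = "ACGT"  # a 4-character language

def fitness(seq):
    # Hypothetical selection pressure: this "environment" happens to favor G and C.
    return sum(ch in "GC" for ch in seq)

def copy_with_typos(seq, rng, rate=0.01):
    # Imperfect replication: each letter has a small chance of being retyped at random.
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else ch for ch in seq)

def mean_fitness(population):
    return sum(fitness(s) for s in population) / len(population)

def run(generations=100, pop_size=100, length=30, seed=1):
    """Return the mean fitness of each generation (pop_size is assumed even)."""
    rng = random.Random(seed)
    population = ["".join(rng.choice(ALPHABET) for _ in range(length))
                  for _ in range(pop_size)]
    history = [mean_fitness(population)]
    for _ in range(generations):
        # Differential replication: only the better-scoring half leaves copies.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        population = [copy_with_typos(p, rng) for p in parents for _ in range(2)]
        history.append(mean_fitness(population))
    return history

history = run()
print(f"mean fitness: start {history[0]:.1f}, end {history[-1]:.1f}")
```

Note that the code never names a target sequence. Sequences are only ranked against whatever else is in the population at the time.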

Finally, recognize that evolution isn't directed towards "Hamlet" or any other goal, just toward being better than the current competition. Over time, this leads to results that conform more optimally to selection pressures.

Basically, English is a crappy analogy for DNA. Egnor uses this analogy as a misleading straw man.

By Josh Spaulding on 27 Feb 2007

Josh,

I don't think English is such a terrible analogy for DNA, but I don't think that the analogy helps the ID cause, either.

Instead of focusing on grammaticality, we can think about viability. The analogy for viability for sentences is understandability: can we understand what the sentence means? English sentences that are technically ungrammatical can still, in many cases, be understood. If you misspell a word, or get a tense wrong, or make a punctuation mistake, it is still possible to understand the sentence.

The exact boundary between "understandable sentence" and "gibberish" is, of course, pretty fuzzy. It depends on the person reading the sentence and his tolerance for errors. The same sort of fuzziness is in the concept of "viability", where the environment is the arbiter of whether something is viable or not.

I think it would be an interesting experiment to see how far mutation and sexual reproduction can go in "evolving" English sentences. Rather than shooting for a specific sentence, such as a line from Hamlet, we could do the following:

Start with a bunch of copies of a single "viable" sentence. It doesn't matter what, maybe "To be or not to be, that is the question." Then continually repeat the following sequence of steps:

(1) Introduce "mutations" by randomly changing some small fraction of the letters in the current collection of viable sentences. A mutation could be: substituting one letter for another, deleting one letter, or adding one letter.

(2) Cull the "nonviable" sentences, the ones that are gibberish.

(3) Make copies of the viable sentences.

This is sort of analogous to asexual reproduction; sexual reproduction makes things a lot more complicated.
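The three steps above could be sketched roughly as follows. This is a toy under loud assumptions: "viable" is reduced to "every word appears in a tiny hand-picked vocabulary" (the `VOCAB` set is invented for this example), sentences are lowercase with no punctuation, and the mutation rate and population size are arbitrary.

```python
import random
import string

# Hypothetical stand-in for "understandable": every word must be in this tiny vocabulary.
VOCAB = {"to", "be", "or", "not", "that", "is", "the", "question", "a", "no",
         "so", "do", "it", "bet", "on", "tot", "too", "he", "she", "this",
         "hat", "nor", "for", "in", "as", "at", "we", "me", "by"}
ALPHABET = string.ascii_lowercase + " "

def is_viable(sentence):
    words = sentence.split()
    return len(words) > 0 and all(w in VOCAB for w in words)

def mutate(sentence, rng, rate=0.02):
    """Step 1: substitute, delete, or insert a letter at a small fraction of positions."""
    out = []
    for ch in sentence:
        if rng.random() < rate:
            kind = rng.choice(("substitute", "delete", "insert"))
            if kind == "substitute":
                out.append(rng.choice(ALPHABET))
            elif kind == "insert":
                out.append(ch)
                out.append(rng.choice(ALPHABET))
            # "delete": append nothing, so the letter is dropped
        else:
            out.append(ch)
    return "".join(out)

def evolve(seed_sentence, generations=50, pop_size=200, seed=0):
    if not is_viable(seed_sentence):
        raise ValueError("the starting sentence must itself be viable")
    rng = random.Random(seed)
    population = [seed_sentence] * pop_size
    for _ in range(generations):
        mutated = [mutate(s, rng) for s in population]      # step 1: mutate
        survivors = [s for s in mutated if is_viable(s)]    # step 2: cull gibberish
        if not survivors:
            break  # everything died; keep the last viable generation
        population = [rng.choice(survivors) for _ in range(pop_size)]  # step 3: copy
    return population

population = evolve("to be or not to be that is the question")
print(len(set(population)), "distinct viable sentences in the final generation")
```

A real version of the experiment would need a much better viability test, since a word list cannot tell whether a sentence is actually understandable.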

The whole Shakespeare argument is also flawed because it assumes that the result of the process was also its goal. [Cue sound of head hitting keyboard.] There was no process, natural or otherwise, that set out with the a priori goal of creating humans or anything else complex. It's an easy trap to look back and say "gee -- humans are really, really complicated" and wonder how we got here through such a complicated route. But we would have said the same thing if we as a species were something entirely different than we are now. Anyone who argues that something is too complex to have been the result of an essentially limitless set of parallel processes would also have to explain why all the other potential complex results didn't get the stamp-of-intelligent-approval. Natural selection explains this handily, particularly when one acknowledges the enormously high degree of parallelism fueling the big experiments that result in evolution. Also, I all too often get the impression that anti-natural-selection-istas think of evolution like some sort of directed vector, pointing "up" (whatever that means). My standpoint is that it is what it is; it just is.