A defense of ENCODE?

Dan Graur has snarled at the authors of a paper defending ENCODE. How could I then resist? I read the offending paper, and I have to say something that will weaken my own reputation as a snarling attack dog: it does make a few good points. But it's mostly using some valid criticisms to defend an indefensible position.

Here's the abstract.

In its last round of publications in September 2012, the Encyclopedia Of DNA Elements (ENCODE) assigned a biochemical function to most of the human genome, which was taken up by the media as meaning the end of ‘Junk DNA’. This provoked a heated reaction from evolutionary biologists, who among other things claimed that ENCODE adopted a wrong and much too inclusive notion of function, making its dismissal of junk DNA merely rhetorical. We argue that this criticism rests on misunderstandings concerning the nature of the ENCODE project, the relevant notion of function and the claim that most of our genome is junk. We argue that evolutionary accounts of function presuppose functions as ‘causal roles’, and that selection is but a useful proxy for relevant functions, which might well be unsuitable to biomedical research. Taking a closer look at the discovery process in which ENCODE participates, we argue that ENCODE’s strategy of biochemical signatures successfully identified activities of DNA elements with an eye towards causal roles of interest to biomedical research. We argue that ENCODE’s controversial claim of functionality should be interpreted as saying that 80 % of the genome is engaging in relevant biochemical activities and is very likely to have a causal role in phenomena deemed relevant to biomedical research. Finally, we discuss ambiguities in the meaning of junk DNA and in one of the main arguments raised for its prevalence, and we evaluate the impact of ENCODE’s results on the claim that most of our genome is junk.

To simplify it further, they're arguing that there are different definitions of "function", and the criticisms of ENCODE rely greatly on the idea that evolution provides the best and most complete test of functionality, while ignoring biomedical functionality.

This is where I shock everyone: I can see that point and even agree with it. After all, if we're arguing that traits with small effects on fitness are going to be invisible to selection, we have to recognize that there are genetic elements that do have some effect, positive or negative, on the individual, but that aren't a consequence of selection. So there almost certainly are genetic elements that exist and are relevant to people's health -- they might even be very common -- but were never shaped by the effects of selection on the population. An evolution test that identifies elements that have been subject to selection is always going to miss those elements that are invisible to selection.

However, the argument still fails for ENCODE. The authors argue that hypothetically, 80% of the genome has a "causal role in phenomena deemed relevant to biomedical research". Having an allele that shaves ten years off my expected lifespan may be no big deal from the perspective of evolution, but hey, it matters a heck of a lot to me, so a research program that identified phenomena relevant to my health and well-being is something I would support, even if it didn't involve evolution.

But did ENCODE do that?

I don't think so, and the authors don't point to any specifics. Postulating biomedical causality imposes a whole new set of requirements on the testing, and "a protein sticks to this sequence sometimes in some cells" is such a loose criterion that it doesn't help us recognize biomedical relevance, either.

They quote this bit from ENCODE:

The mission of the Encyclopedia of DNA Elements (ENCODE) Project is to enable the scientific and medical communities to interpret the human genome sequence and apply it to understand human biology and improve health.

(ENCODE Consortium 2011, p.1, emphasis added)

Likewise, in the introduction of the 2012 round of papers the authors explicitly frame their enterprise as providing "an important resource for the study of human biology and disease" (ENCODE Consortium 2012, p. 57). This is particularly important because what is relevant to contemporary medicine is not necessarily evolutionarily relevant.

I've already said I agree with that in principle. However, it's just moving the goalposts to yet another position in which ENCODE fails to score. Where, in ENCODE's assessment of the 'function' of sequences, was there an assay to test its effect on human health or biology?

Here's ENCODE's definition of function.

Operationally, we define a functional element as a discrete genome segment that encodes a defined product (for example, protein or non-coding RNA) or displays a reproducible biochemical signature (for example, protein binding, or a specific chromatin structure).

Showing that a segment of DNA binds a protein is at least as far from showing a health effect as it is from showing evolutionary significance. The authors are simply floundering about, waving their hands, admitting that ENCODE failed to meet the functional criteria of evolutionary biologists, but maybe it meets the functional criteria of some other discipline, like medicine. Yeah, that's the ticket, they were helping doctors, or maybe veterinarians, or maybe transhumanists or bioweapons designers. They didn't actually have any tests for utility in these other domains, but if we fish around enough, maybe we can find an excuse somewhere.

Don't believe me? Here's their diagram explaining the phenomenon.

[Figure: the authors' diagram, labeled "explanandum"]

It doesn't matter what biological activity we're looking at, there is an 'explanandum' under which we can assign it a function. (Philosophers: I'm a bit confused here. Isn't an 'explanandum' the thing that needs to be explained, and an 'explanans' the explanation for it, so the terms are a bit off? My pretentious Latin is a bit rusty.)

So even if it's an unknown explanation, even if the experiment wasn't designed to test it, even if the generated data doesn't help solve any of the problems we're using as a justification, ENCODE's ass is still covered. How convenient!


Germain P-L, Ratti E, Boem F (2014) Junk or functional DNA? ENCODE and the function controversy. Biol Philos 29:807–831.

Dear PZ Myers,
Having had stones thrown at us and been likened to creationists, I must say that I really appreciate that you took the time to read the paper and could appraise the arguments reasonably.
As the introduction mentions, our primary aim was not to "cover ENCODE's ass" (there's no denial that the press releases were extremely problematic) but to address the specific criticism that it used the wrong notion of function. The motivation was what we saw as a deeper disagreement among biologists (going beyond semantics) and which you correctly spotted. Many others didn't, which means that we could have written the paper better.
As you noted, the deeper issue is whether it is actually the case that a lot of the genome matters to biomedical research. I take this as an open question, but to be sure had ENCODE found that biochemical signatures are limited to a tiny proportion of the genome, I would have taken this as an argument against the hypothesis, and hence the current observations do offer some support to the idea that much of the genome might be making a relevant difference (though not an evolutionary one). On this point, I find puzzling your claim that "It doesn’t matter what biological activity we’re looking at, there is an ‘explanandum’ under which we can assign it a function." Don't you think that there are explananda that we would all agree are totally irrelevant?
Pierre-Luc
PS: Your Latin is correct. We meant it as "role in explaining the Explanandum", which might have been expressed more simply by saying Explanans, but for the fact that there is an ambiguity as to whether Explanans refers to the premise itself or to its logical position in the explanation. We wanted to make sure that it's understood as relative to the Explanandum, although I take your point that this representation also has its ambiguities.

By Pierre-Luc Germain on 27 Mar 2015