Science and its interpretation are wonderful. Today I saw a post on Twitter from @LAbizar, referencing a @GEN post stating that 8.2% of human DNA is functional, with a link to a GEN article: "Surprise: Only 8.2% of Human DNA Is Functional." The GEN writeup cited a PLoS Genetics article, "8.2% of the Human Genome Is Constrained: Variation in Rates of Turnover across Functional Element Classes in the Human Lineage," released today.
How much of the human genome is functional?
In 2012, the ENCODE (Encyclopedia of DNA Elements) project published a landmark summary, "An integrated encyclopedia of DNA elements in the human genome," from nine years of work measuring the ways in which DNA structure and its interactions with proteins such as transcription factors might contribute to the regulation of genes. In the paper's abstract the team stated that "These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions." Because only a very small fraction of the genome (~1%) encodes protein sequences, a long-standing question has been: what does the other 99% do? ENCODE data demonstrated that much of this DNA participates in biochemistry in some way. Many lauded the work as a tour de force, and the resources it contributed have been significant.
Others disagreed with ENCODE's findings. One of the first criticisms was in an article entitled "On the Immortality of Television Sets: “Function” in the Human Genome According to the Evolution-Free Gospel of ENCODE." In the abstract the authors reject the ENCODE thesis, claiming that the evolutionarily constrained regions of the genome make up less than 10% of the genome's total DNA. They close their abstract by generating additional controversy: "The ENCODE results were predicted by one of its authors to necessitate the rewriting of textbooks. We agree, many textbooks dealing with marketing, mass-media hype, and public relations may well have to be rewritten." This work is followed up with a more detailed analysis, cited above, of the fraction of the genome that is evolutionarily constrained in mammals. It provides a precise extrapolated measurement: 8.2% (with a range of 7.1-9.2%) of the genome is conserved in mammals.
So, what's the correct answer?
As usual, the correct answer depends on the kind of question you are asking, how you define function, and what kind of attention you seek. It also depends on how you interpret what others say. For example, the ENCODE data describe interactions between proteins and DNA, and count transcribed bases as functional. The proteins can be contained in nucleosomes that pack DNA, or can be factors that bind promoters or operators to activate or inhibit gene expression. We also know that large-scale DNA structure is important and contributes to the activity of enhancers. Finally, non-coding genes - herein defined as those that are transcribed to produce different kinds of non-coding RNA - are much more abundant than we once thought, hence 80% of the genome can be annotated as functional. The ENCODE team found that 15-20% of the genome can bind something or be accessible, and many times there are correlations between different measurements that indicate a useful role for these regions of DNA.
On the other hand, the fraction of the genome that is conserved between different mammalian species is small, and it is even smaller if you examine non-mammalian species. However, a challenge with using species conservation as the rule for defining function is that it does not accommodate continual evolution very well. After all, we [humans] are not mice. Indeed, the senior author of the PLoS paper commented in the GEN article:
“This is in large part a matter of different definitions of what is 'functional' DNA,” says joint senior author Chris Ponting, Ph.D., of the MRC Functional Genomics Unit at Oxford University. “We don't think our figure is actually too different from what you would get looking at ENCODE's bank of data using the same definition for functional DNA.”
So, it's a matter of definition - and interpretation. When the PLoS title is read closely, it simply says that 8.2% of the genome is constrained in functional element classes, not that only 8.2% of the genome is functional, as the GEN article states - which might be about seeking attention.
As a physician-scientist who has studied the human brain for over 40 years, I find it patently absurd to suggest that less than 10% of the human genome is functional. What about "Human Accelerated Regions" or the phenomenal expansion of the cerebral cortex in humans and our cognitive and behavioral repertoire compared to any other species on this planet?
If you think it's patently absurd, then you haven't bothered to consider the evidence that suggests most of our genomes are, in fact, non-functional. That could certainly be wrong, but it's far from patently absurd.
Check out Larry Moran's Sandwalk blog if you're interested in actually understanding, rather than just being dismissive.
"The ENCODE team found that 80% of the genome can bind something or be accessible, and many times there are correlations between different measurements that indicate a useful role for these regions of DNA." - This statement is incorrect. I'm one of the authors on the paper. The majority of the 80% is transcription, largely non-polyA+ RNAs. Protein-DNA binding and accessibility account for at most 15-20% of the genome, which is in fact closer to the 10% that is being proposed in the new paper. Please correct this. A lot of this is clarified in our recent response paper http://www.pnas.org/content/early/2014/04/18/1318948111.short
Thanks for the feedback, Anshul. I made the recommended changes.
All DNA is critical whether it's functional or indirectly functional.
The ENCODE project continues. While 80% of the genome possesses protein binding sites, this is not yet proof of functionality. It is incredibly close to it, though, as proteins are highly specific to their binding motifs. While 10% of the genome produces proteins or houses known control regions like centromeres, telomeres, and cis-acting elements, not a great deal is known about what regulates interfering RNA and the thousands of nucleic products that govern cell biochemistry. Very little is known about how epigenetics is governed... it is dang spooky how the cell has a life of its own apart from genetics.
It is a great time to be alive and see the advances in genomics. Personally, I hope that all of the genome is functional. Job security for generations :) and of course, exciting times for molecular biology.