One codon, two amino acids - the genetic code has a Shift key

Living things, from bacteria to humans, depend on a workforce of proteins to carry out essential tasks within their cells. Proteins are chains of amino acids that are strung together according to instructions encoded within that most important of molecules - DNA.

The strings of "letters" that make up DNA correspond to chains of amino acids. They are read in threes, with each three-letter combination representing one of many amino acids. Until now, scientists believed that this relationship was unambiguous - within any single genome, every three-letter combination maps to one and only one amino acid. This strict one-to-one relationship is a tenet of genetics, but new research shows that it's not an absolute one.

A team of American scientists have found a surprising exception to this rule, within a sea microbe called Euplotes crassus. In its genome, one particular triplet of DNA letters can stand for one of two different amino acids - cysteine or selenocysteine - even within the same gene. It all depends on context. This is the first time that such dual-coding has been spotted in the genes of any living thing.

Genetics 101

Before I go any further, it's probably a good idea to have a quick primer on the genetic code for non-scientists. Anyone with prior knowledge of genetics can skip ahead to the next section. DNA is a chain of four molecules called nucleotides - adenine, cytosine, guanine and thymine, represented by the letters A, C, G and T. These sequences are transcribed into a similar molecule called messenger RNA (mRNA), which contains three of the same nucleotides, but replaces thymine with uracil (U). It's the information coded by mRNA that is finally translated into proteins.

Proteins are built from 20 different amino acids, chained together in various combinations. In mRNA, every set of three letters corresponds to a specific amino acid. These three-letter combinations are called "codons", the genetic equivalent of words. For example, the codon CCC (three cytosines in a row) corresponds to the amino acid proline, while AAA (three adenines) corresponds to lysine. And some codons act as full-stops, indicating that the amino acid chain has come to an end.
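For readers who think in code, here is a minimal Python sketch of the two steps described above: swapping T for U, then reading the message three letters at a time. The table holds only a handful of the 64 codons, and the function names and the example sequence are made up for illustration.

```python
# Partial codon table: a few of the 64 codons, chosen for illustration only.
CODON_TABLE = {
    "AUG": "Met",
    "CCC": "Pro",
    "AAA": "Lys",
    "UGU": "Cys",
    "UGC": "Cys",
    "UAA": "Stop",
    "UAG": "Stop",
    "UGA": "Stop",
}


def transcribe(dna):
    """Rewrite a DNA coding sequence as mRNA by replacing T with U."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA one codon (three letters) at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


# Hypothetical 12-letter sequence: Met, Pro, Lys, then a stop codon.
print(translate(transcribe("ATGCCCAAATGA")))  # ['Met', 'Pro', 'Lys']
```

Real cells do this with molecular machinery rather than a lookup table, of course, but the table captures the logic: one codon in, one amino acid out.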

This genetic code is almost universal. The same codons almost always match up to the same amino acids in tiny bacteria, tall trees and thoughtful humans. There are a few deviations from the universal template, but even then, the differences are relatively minor. Think about computer keyboards - almost all have the same configuration of keys for various letters and symbols, but some will have the @ key in a different place. 

The genetic code is redundant, so that several codons represent the same single amino acid, but there are no ambiguities. There are no examples of a single codon within any genome that represents more than one amino acid. That is, until now.
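The difference between redundancy and ambiguity can be sketched in a few lines of Python. The table below is a small excerpt of the standard code, and the function name is invented for this example. Several codons point to the same amino acid, but each codon appears only once as a key, so it can never point to two.

```python
from collections import defaultdict

# Excerpt of the standard code: every codon (key) has exactly one meaning.
CODONS = {
    "CCU": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
    "AAA": "Lys", "AAG": "Lys",
    "UGU": "Cys", "UGC": "Cys",
}


def codons_per_amino_acid(table):
    """Group codons by the amino acid they encode (many codons, one meaning)."""
    groups = defaultdict(list)
    for codon, amino_acid in table.items():
        groups[amino_acid].append(codon)
    return dict(groups)


groups = codons_per_amino_acid(CODONS)
print(groups["Pro"])  # ['CCU', 'CCC', 'CCA', 'CCG']
print(groups["Lys"])  # ['AAA', 'AAG']
```

A plain lookup table like this cannot represent a codon with two meanings, which is exactly the assumption that the new finding breaks.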

The Euplotes crassus Code

Anton Turanov, Alexey Lobanov and Vladimir Gladyshev from the University of Nebraska have discovered that in Euplotes crassus, the UGA codon can mean either cysteine or selenocysteine, depending on its location in the gene.

In the universal code, UGA is a stop signal, but many species use it to signify selenocysteine, an amino acid that isn't represented in the universal code. This alternative translation of UGA into selenocysteine hinges on a structure called a SECIS element. The SECIS is part of the mRNA molecule itself but sits outside the region that actually codes for amino acids. It's like a genetic Shift key - its presence changes the meaning of UGA codons that sit before it.
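The Shift-key idea can be written as a toy Python rule for those species that switch UGA between "stop" and selenocysteine. This is a deliberately simplified sketch: the function names, the three-entry table and the `has_secis` flag are all assumptions made for illustration, and a real message would still need a genuine stop codon at its end.

```python
def decode_uga(has_secis):
    """Toy rule: with a SECIS element, UGA means selenocysteine (Sec); without, stop."""
    return "Sec" if has_secis else "Stop"


def translate(codons, has_secis):
    """Translate a list of codons, consulting the SECIS flag only for UGA."""
    table = {"AUG": "Met", "CCC": "Pro", "AAA": "Lys"}
    protein = []
    for codon in codons:
        amino_acid = decode_uga(has_secis) if codon == "UGA" else table[codon]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


message = ["AUG", "UGA", "AAA"]  # hypothetical three-codon message
print(translate(message, has_secis=False))  # ['Met']
print(translate(message, has_secis=True))   # ['Met', 'Sec', 'Lys']
```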

What makes E. crassus unique is the fact that its UGA codons can mean either selenocysteine or cysteine - a choice between two amino acids rather than one amino acid and a stop signal.

Turanov and Lobanov analysed the microbe's tRNAs - molecules with one end that recognises a specific codon and another that sticks to its corresponding amino acid. These are the decoders that translate strings of codons into strings of amino acids. It turned out that E. crassus has different tRNAs that recognise UGA - one of these matches the codon with cysteine and another matches it with selenocysteine.

Turanov and Lobanov also purified a protein from E. crassus called TR1. Its mRNA has a SECIS element and five UGA codons, and the duo found that the first four of these are translated into cysteines and the fifth into selenocysteine. Location is all-important when it comes to working out which interpretation comes out on top. When Turanov and Lobanov added lots of UGA codons at sites throughout the TR1 gene, they found the vast majority were translated into cysteines. Only those inserted at the end of the gene, within its final 20 codons and near the SECIS element, were interpreted as selenocysteines.

So the SECIS element, in its Shift-key role, affects the fate of nearby UGAs. To confirm that, Turanov and Lobanov replaced the entire SECIS element in the TR1 gene with an equivalent element from a different gene and a different species. They found that this new SECIS element had a wider zone of influence; when it was introduced, UGA codons that sat outside the final 20 were translated into selenocysteines instead of cysteines.
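The position rule from these experiments can be sketched as a toy Python model. The 20-codon window comes from the result described above. Everything else is an assumption made for illustration: the function name, the 100-codon gene, the positions of the inserted UGAs, and the 60-codon window used to mimic the swapped-in SECIS element (the size of its real zone of influence isn't given here). The real mechanism is biochemical and, as far as I can tell, not yet worked out.

```python
def decode_uga_positions(codons, window=20):
    """Map the index of each UGA codon to 'Sec' or 'Cys'.

    A UGA is read as selenocysteine only if it lies within the final
    `window` codons of the gene (near the SECIS element); otherwise it
    is read as cysteine.
    """
    total = len(codons)
    result = {}
    for index, codon in enumerate(codons):
        if codon == "UGA":
            in_window = index >= total - window
            result[index] = "Sec" if in_window else "Cys"
    return result


# Hypothetical 100-codon gene with UGA inserted at three positions.
gene = ["AAA"] * 100
for position in (10, 50, 95):
    gene[position] = "UGA"

print(decode_uga_positions(gene))             # {10: 'Cys', 50: 'Cys', 95: 'Sec'}
# Hypothetical wider zone of influence, standing in for the swapped SECIS.
print(decode_uga_positions(gene, window=60))  # {10: 'Cys', 50: 'Sec', 95: 'Sec'}
```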

So in E. crassus, the UGA codon is not tied to a single fate - it has a choice. It can be interpreted in two different ways, depending on its location and that of the SECIS element that influences it. One codon, two amino acids - it's a unique set-up and further proof that the genetic code, universal though it almost is, is open to expansion and evolutionary change.

Reference: A. A. Turanov, A. V. Lobanov, D. E. Fomenko, H. G. Morrison, M. L. Sogin, L. A. Klobutcher, D. L. Hatfield, V. N. Gladyshev (2009). Genetic Code Supports Targeted Insertion of Two Amino Acids by One Codon. Science, 323 (5911), 259-261. DOI: 10.1126/science.1164748

Now for ctrl and alt...

Paper should be out today.

Yeah, SECIS elements are fairly common - check out the paragraph where I introduce them. I do mention that they're used by lots of species. But only ever to swap between a selenocysteine and a stop - not two aminos.

So if the SECIS element directs the different tRNAs in E. crassus to a UGA, thereby translating it into one of two aminos, what is the mechanism in SECIS that decides which tRNA goes with which UGA?

Suggesting that I go and read the paper is acceptable.

I don't know what I think is cooler - the phenomenon itself, or that we're in a period when science can actually study these processes in enough detail to discover it. Either way, yet another elegant and clear account of some exciting science.

Million-dollar question, isn't it? I don't think they know yet. I think I covered pretty much everything that was in the paper in the write-up, but feel free to have a look to see if I missed anything.

Doc Spurt - agreed wholeheartedly. The only thing that makes me a bit sad about this paper is that it's really very cool, but it's bordering on being too obscure. By which I mean that it's pretty challenging to get across to people who don't know much genetics why exactly this is so cool. Papers like these are really bloody hard to write up - how much detail can you keep in without turning off a non-specialist reader?

Larry Morgan has a post that talks about this story. Apparently E. coli has been known to do something similar (with a different codon and different amino acids) for quite some time. The claim that this is a new phenomenon comes from the original paper, so I guess the reviewers and editors at Science didn't catch it either.

Yeah it's interesting actually - the original paper is very clear about the fact that they think it's new. The Scientist also has a piece on this:

But the paper's findings might not be all that novel, cautioned Yale University microbiologist Dieter Soll. He noted that in some Candida species, the CUG codon is translated as both leucine and serine, even in the same gene, albeit by a single ambiguous tRNA rather than two separate tRNAs as Gladyshev's team found in E. crassus. "This already showed that you can have the same codon in one gene [encode] two amino acids," Soll told The Scientist. "Really, it simply shows that nature has more than one way of doing the same thing."

Gladyshev disagrees. "It's really a completely different situation," he said. In Candida, the tRNA doesn't discriminate between amino acids, he noted. Rather, it randomly inserts either serine or leucine, "whereas in our case there is a specific insertion of one amino acid or the other, depending on the presence and availability of the RNA element."