Publishing Original Research on Blogs - Part 6

Previous entries:

This post is part of a series exploring the evolution of a duplicated gene in the genus Drosophila. Links to the previous posts are above. Part 6 of this series (Evolutionary Relationships) can be found below.

Evolutionary Relationships

While we were probing the outgroup genomes for copies of aldolase genes using TBLASTX (Examining the Outgroups), we discovered that there are two excellent matches to aldolase genes in the honeybee, Apis mellifera, genome. However, there is only one match in the mosquito, Anopheles gambiae, genome. In this entry, we'll take a look at the evolutionary relationships of Drosophila, Anopheles, and Apis and formulate hypotheses for the different numbers of matching sequences.

[Figure: insect tree (insect_tree.gif)]

Both Drosophila and mosquitoes are dipterans, while bees are hymenopterans. Therefore, Drosophila and Anopheles are more closely related to each other than either is to Apis. That means we expect Drosophila genomes to be more similar to Anopheles genomes than to Apis genomes. So, it's surprising that Apis mellifera has two matches to aldolase in its genome, while Anopheles gambiae has only one.

To address the issue of multiple matches in the A. mellifera genome, we should figure out what exactly those two matches are. There are two possible explanations. First, there may in fact be two aldolase genes in the A. mellifera genome. Second, there may be a single aldolase gene in the A. mellifera genome that has two annotated transcripts. The two transcripts may arise via alternative splicing of a single gene, much as we see in D. melanogaster.

To test these two hypotheses, we can examine the Genbank entries for the two A. mellifera sequences (XM_623339 and XM_001121298). The entry for XM_623339 contains the following information in its annotation:

Apis mellifera similar to Aldolase CG6058-PF, isoform F (LOC550785), mRNA.

While the entry for XM_001121298 includes this:

Apis mellifera similar to Aldolase CG6058-PA, isoform A (LOC725455), mRNA.

The two A. mellifera sequences were annotated as similar to the same D. melanogaster gene (CG6058, or Aldolase), but each A. mellifera sequence matches a different D. melanogaster splice-form. That suggests the two sequences from A. mellifera are alternative splice-forms of the same gene.
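The comparison of the two annotation lines can also be done programmatically. Here is a minimal Python sketch that pulls the D. melanogaster gene ID, the isoform letter, and the A. mellifera locus name out of the two definition lines quoted above. The regular expression and the function name `parse_defline` are my own, written to fit these two lines only; they are not a general GenBank parser.

```python
import re

# Definition lines copied from the two GenBank entries quoted above.
DEFLINES = {
    "XM_623339": "Apis mellifera similar to Aldolase CG6058-PF, "
                 "isoform F (LOC550785), mRNA.",
    "XM_001121298": "Apis mellifera similar to Aldolase CG6058-PA, "
                    "isoform A (LOC725455), mRNA.",
}

# Hypothetical pattern tailored to these two lines (an assumption, not a
# documented GenBank format): gene name, CG identifier, isoform letter, locus.
PATTERN = re.compile(
    r"similar to (?P<name>\w+) (?P<gene>CG\d+)-P(?P<isoform>[A-Z]), "
    r"isoform (?P=isoform) \((?P<locus>LOC\d+)\)"
)


def parse_defline(defline):
    """Return the annotated gene, isoform, and locus as a dict."""
    match = PATTERN.search(defline)
    if match is None:
        raise ValueError("unexpected definition line: " + defline)
    return match.groupdict()


parsed = {acc: parse_defline(line) for acc, line in DEFLINES.items()}

# Do both sequences point at the same D. melanogaster gene and isoform?
same_dmel_gene = len({p["gene"] for p in parsed.values()}) == 1
same_isoform = len({p["isoform"] for p in parsed.values()}) == 1

for acc, fields in parsed.items():
    print(acc, fields["gene"], fields["isoform"], fields["locus"])
print("same D. melanogaster gene:", same_dmel_gene)
print("same isoform:", same_isoform)
```

Both entries resolve to CG6058 but to different isoforms, which is the pattern described above.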

We can also take advantage of the A. mellifera genome browser on the NCBI webpage. Here is a screenshot of the Genbank entry for XM_623339:

[Screenshot: Genbank entry for XM_623339 (amel_XM_62339_genbank_sm.gif)]

The red arrow points to a link that will take you to a browser for the chunk of DNA in which this sequence is located. That chunk of DNA is known as NW_001253175, and, if you also examine the Genbank entry for XM_001121298, you'll find that NW_001253175 contains both A. mellifera aldolase sequences. In order to find XM_623339 and XM_001121298, however, we'll have to use their synonyms (LOC550785 and LOC725455); these synonyms are given in the Genbank entry for each sequence. When you visit the browser for NW_001253175, enter "LOC550785" into the search box, as shown below:

[Screenshot: browser for NW_001253175 (amel_NW_001253175_sm.gif)]

This will allow you to visualize the region in which LOC550785 (aka, XM_623339) is located:

[Screenshot: region containing LOC550785 (amel_LOC550785.gif)]

This sequence spans positions 271,541 to 291,134 in this chunk of the A. mellifera genome. If the two A. mellifera sequences are alternative splice-forms of the same gene, they should be located in the same region. On the other hand, if they are distinct genes, they should be located in different regions of the genome. The other sequence (LOC725455) is found in the same chunk (NW_001253175). If we search for it in the browser, here is what we see:

[Screenshot: region containing LOC725455 (amel_LOC725455.gif)]

The coordinates of this sequence are 266,908-270,253, which do not overlap those of the other sequence (271,541-291,134). Therefore, the two A. mellifera sequences are located in different parts of the genome (although in close proximity to one another). They are distinct genes, not alternative splice-forms of the same gene.
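The coordinate comparison is simple enough to check by hand, but here is a short Python sketch of the same test for anyone who wants to repeat it on other pairs of sequences. It assumes both sets of coordinates are inclusive positions on the same chunk (NW_001253175); the `Interval` type and the two helper functions are names I made up for illustration.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """An annotated span on a contig, with inclusive coordinates (assumed)."""
    name: str
    start: int
    end: int


def overlaps(a, b):
    """True if the two intervals share at least one position."""
    return a.start <= b.end and b.start <= a.end


def gap(a, b):
    """Number of positions strictly between two intervals (0 if they touch or overlap)."""
    first, second = sorted([a, b], key=lambda iv: iv.start)
    return max(0, second.start - first.end - 1)


# Coordinates as reported in the NW_001253175 browser views above.
loc550785 = Interval("LOC550785", 271_541, 291_134)
loc725455 = Interval("LOC725455", 266_908, 270_253)

print("overlap:", overlaps(loc550785, loc725455))
print("positions between them:", gap(loc550785, loc725455))
```

Non-overlap is the only criterion applied here, the same one used in the text; the gap is reported just to show how close together the two sequences sit.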

We conclude that Drosophila species and A. mellifera both have two aldolase genes, while A. gambiae has only one. This leaves an important question unresolved: did the same duplication event give rise to the duplicate aldolase genes in A. mellifera and Drosophila? If that's the case, then A. gambiae would have lost one of those genes. Alternatively, an ancestral aldolase gene may have been duplicated independently in Drosophila and in the honeybee. We will attempt to answer this question in a subsequent post by reconstructing the phylogeny of these genes.
