During these past couple of weeks, we've been comparing mitochondrial DNA sequences from humans and great apes, in order to see how similar the sequences are.
Last week, I got distracted by finding a copy of a human mitochondrial genome that somehow got out of a mitochondrion and ended up stuck right inside chromosome 17! The existence of this extra mitochondrial sequence probably complicates some genetic analyses. One of my readers also asked an interesting question: do apes have a similar mitochondrial sequence in their equivalent of chromosome 17, and how does it compare? We will come back to that question later on.
For now, I'm going to return to the original question and finish our first comparison of human mitochondrial DNA sequences with those from the great apes.
To compare human and ape mitochondria, I used the NCBI web server to run blastn and compare a human mitochondrial reference sequence with great ape sequences (see how it's done). Last week, I showed some of the blast results. These results contained a graph illustrating where the ape sequences matched the human sequence.
Now, we're going to take this a step further.
I went through the blast results, found which parts of the sequences matched each other, and found the number of bases that matched in each region. Then, I added up the number of matching bases for each ape mitochondrial sequence and divided by the number of bases in the human mitochondrial sequence to calculate the percent of matching bases for the entire mitochondrial genome. I also made a map and put the matching sequences in order to make sure that I didn't count any sections twice.
For example, for Pan paniscus (Bonobo chimp), I get something like this:
Matching region   number matching
total matching bases from the Bonobo chimp = 15,008 bases
The human mitochondrial sequence is 16,571 bases in size, so for Pan paniscus, we have:
15,008 bases matching out of 16,571, or 90.6%
I'm not doing anything fancy to compensate for gaps or gap distance here, just looking at the percent of matching bases across the mitochondrial genome.
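If you'd like to repeat this bookkeeping for your own blast results, here is a minimal Python sketch of the same arithmetic. The function name and the (start, end, matching bases) layout are my own assumptions for illustration, and the two regions in the example are made up. Only the totals 15,008 and 16,571 come from the results above.

```python
def percent_matching(regions, genome_length):
    """Sum matching bases over non-overlapping regions of the human sequence.

    regions: list of (start, end, matching_bases) tuples, using 1-based
             inclusive coordinates on the human sequence (assumed layout).
    Returns (total_matching_bases, percent_of_genome).
    """
    ordered = sorted(regions)  # put the regions in map order
    # Refuse overlapping regions so that no section gets counted twice.
    for (s1, e1, _), (s2, e2, _) in zip(ordered, ordered[1:]):
        if s2 <= e1:
            raise ValueError(f"regions {s1}-{e1} and {s2}-{e2} overlap")
    total = sum(matching for _, _, matching in ordered)
    return total, 100.0 * total / genome_length


# Made-up regions, just to show the input format.
example = [(1, 100, 90), (101, 200, 80)]
print(percent_matching(example, 200))

# The bonobo total from this post: 15,008 matching bases out of 16,571.
print(round(100.0 * 15008 / 16571, 1))
```

Like the hand calculation, this does nothing to compensate for gaps. It only adds up matching bases and divides by the length of the human sequence.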
For Pan troglodytes (Chimpanzee): 15,030 bases out of 16,571, or 90.7%
And for Gorilla gorilla (Gorilla), we have: 14,448 out of 16,571, or 87.2%
Our first question has been answered - at least for the mitochondrial genome. Our mitochondrial DNA is a little over 90 percent similar to that of Bonobo chimps and Chimpanzees and 87% similar to Gorillas.
Of course, this isn't the whole story. There's a bit more information here that we can uncover. If we compare the mitochondrial DNA from the apes to each other, we can make a graph that shows the relationships between all four species: Humans, Bonobos, Chimps, and Gorillas.
Here's your assignment for next week: compare the mitochondrial sequences from Bonobos to Chimps and Gorillas, and the mitochondrial sequence from the chimp to the sequence from the Gorilla. Next Friday, we'll use the sequences and construct a simple tree to look at all four relationships.
I'm wondering: if you compare the mitochondrial sequence to species that aren't primates, is there still high similarity?
Second, the genetic difference between humans and other primates is not so big, so is it surprising that the mitochondrial sequences are closely related?
Furthermore, I want to tell you that I like your blog. Excellent!
I'm glad you like it and you've asked a good question.
Stay tuned, we'll take a look in a week or two.
Do you know if there is a simple way to use BLAST to compare NON-matching mito DNA between species? e.g., if I compare a lemur to a lungfish to get one set of non-matching DNA, compare the same lemur to a salamander to get a second set, and then compare these two sets of DNA to each other for similarity?
I think so. If a complete mitochondrial genome sequence is available for a species, you could make a map like this:
and identify the regions that match (****) between two species.
Then you could use the sequence coordinates to pull out the regions that didn't match (1-100, 150-300, and 400-500). You can use BLAST to compare those "unmatching sequences" to whatever species you want.
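Here is a small Python sketch of that coordinate step. It assumes the matching regions are written as 1-based, inclusive (start, end) pairs. The function names are just for illustration, and the matching regions 101-149 and 301-399 on a 500-base sequence are inferred from the unmatched ranges in the example above.

```python
def unmatched_regions(matched, length):
    """Return the 1-based inclusive ranges NOT covered by any matching region."""
    gaps = []
    next_start = 1  # first position not yet accounted for
    for start, end in sorted(matched):
        if start > next_start:
            gaps.append((next_start, start - 1))
        next_start = max(next_start, end + 1)
    if next_start <= length:
        gaps.append((next_start, length))
    return gaps


def extract(sequence, regions):
    """Pull the bases for each 1-based inclusive region out of a sequence string."""
    return [sequence[start - 1:end] for start, end in regions]


print(unmatched_regions([(101, 149), (301, 399)], 500))
# [(1, 100), (150, 300), (400, 500)]
```

The pieces returned by `extract` could then be saved in FASTA format and used as BLAST queries against whatever species you want.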
Thanks - that works!
I have a short motif from human Alu sequence. How can we find out sequence similarities in primates?
This is the motif: ATCGAGACCATCCCGGCTAAAA
Please let me know, Thanks
I know a couple of ways.
First, you can use blastn. But since the sequence is pretty short, you'll have to adjust the parameters. Remove the low complexity filter, increase the E value as high as it will go, and use the smallest word size you can find.
I did this against the chimp genome at the NCBI and got over 10,000 results and some are exact matches.
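I ran my search through the NCBI web page, but if you have the command-line BLAST+ programs installed, the same kinds of adjustments might look something like the sketch below. The file name `motif.fasta` and the database name `chimp_genome` are hypothetical placeholders, and the word size and E value are only illustrative, not the exact settings from my search. The script just builds and prints the command; it doesn't run blastn.

```python
import shlex


def short_blastn_command(query_fasta, database, word_size=7, evalue=1000):
    """Build a blastn command line tuned for a short query sequence."""
    return [
        "blastn",
        "-task", "blastn-short",        # blastn mode intended for short queries
        "-query", query_fasta,
        "-db", database,
        "-word_size", str(word_size),   # small word size so short matches can seed
        "-dust", "no",                  # turn off the low complexity filter
        "-evalue", str(evalue),         # permissive E value threshold
        "-outfmt", "6",                 # tabular output, one hit per line
    ]


# Hypothetical file and database names, for illustration only.
cmd = short_blastn_command("motif.fasta", "chimp_genome")
print(shlex.join(cmd))
```

To actually run the search, you could pass the list to `subprocess.run(cmd, check=True)` on a machine where BLAST+ and a formatted database are available.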
The second thing I would try is the UCSC genome browser. Pick the chimp genome (or whatever primate you want), type Alu in the text window, and push the return key. You'll see where Alus are located on all the chimp chromosomes.