tags: researchblogging.org, new species, insects, American cockroach, Periplaneta americana, DNA barcoding, Brenda Tan, Matt Cost, Mark Stoeckle, Rockefeller University, American Museum of Natural History, AMNH
Moving overseas has been a challenge, but worst of all for me has been the fact that my writing has suffered. I still read scientific papers and science news stories, but have been unable to find the time necessary to write these stories for you. Hopefully, my life is returning to some semblance of predictability, which means I can now start working again. I have several half-finished stories that I am working on and will be publishing over the next few days. The first story I want to share with you is about a simple high school DNA barcoding project that yielded an astonishing discovery: a new species that has been living in one of the largest urban areas in the world, New York City.
As a New Yorker, I am both surprised and not surprised at the same time by the discovery of a new species of cockroach hiding in our cabinets and showers and running around under our feet. I mean, where else would a new species of pest insect most likely be found?
Like an episode from the popular television series, CSI: NY, two high school seniors sought to identify hundreds of specimens that they had collected throughout Manhattan. Their goal? To identify the species by analyzing a small portion of their DNA using a technique known as "DNA barcoding." As a method for quickly identifying species, DNA barcoding has become increasingly accepted over the past six years.
The two "DNAHouse investigators" made a number of surprising discoveries using DNA barcoding, including mislabeled food items, and -- most astonishing of all -- the discovery of a species of cockroach that is new to science. The insect, which looks like the American cockroach, Periplaneta americana, a widespread pest in NYC and other large cities, turned out to have a different "DNA barcode" from that species.
A DNA "barcode" is a short nucleotide sequence shared between organisms. Although the identity of the "barcode" gene is not standardized as yet, a 648-base-pair region of the mitochondrial cytochrome c oxidase subunit I (CO1) gene is typically used as a DNA "barcode" for most eukaryotes. This genetic region is ideal because it is nearly universal, it is small and easily sequenced using current technology, and it contains large nucleotide variation between species (but relatively small variation within a species). Additionally, as of 2009, there were more than 620,000 known CO1 sequences from over 58,000 species of animals -- larger than databases available for any other gene. These features allow for direct sequence comparisons and analyses between different species.
"It's genetically distinct from all the other cockroaches in the database," said DNAHouse investigator Brenda Tan. Ms. Tan, a senior at Manhattan's Trinity School, worked on the project, along with fellow classmate Matt Cost.
"[Closely-related] species don't differ [by] more than one percent, [while] this cockroach is four percent different," agreed Professor Mark Stoeckle. "This suggests it is a new species of cockroach."
Professor Stoeckle, a medical doctor who conducts genomic and DNA barcoding research at Rockefeller University, supervised Ms. Tan and Mr. Cost.
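For readers wondering what "four percent different" means in practice, here is a minimal sketch of the arithmetic, not the students' actual pipeline. It assumes two barcode sequences that have already been aligned to the same length and simply counts the positions where they disagree. The function name and the 25-base toy fragments are invented for illustration; published barcode analyses often use a corrected distance (such as Kimura 2-parameter) rather than this raw count.

```python
def percent_divergence(seq_a: str, seq_b: str) -> float:
    """Uncorrected percent difference between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip alignment gaps and ambiguous bases rather than count them.
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Invented toy fragments (a real CO1 barcode is about 648 bases long).
reference = "ACGTACGTACGTACGTACGTACGTA"
specimen = "ACGTACGTACCTACGTACGTACGTA"

print(percent_divergence(reference, specimen))  # 1 mismatch in 25 bases -> 4.0
```

On a full-length barcode, four percent works out to roughly 26 differing bases out of 648.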
Further investigation is essential before it can be determined whether the students' discovery is a new species or a subspecies. If their finding is confirmed, it is traditional that the discoverers -- Ms. Tan and Mr. Cost in this case -- will be granted naming rights for the new species.
This analysis is a continuation of the original 2008 study, also conducted by two Trinity students, that found that one-quarter of food fish that they purchased at restaurants and markets had been mislabeled -- often by replacing expensive fish (like tuna) with cheaper species (like tilapia), triggering a public furor in NYC that came to be known as "sushi-gate."
To conduct this study, Ms. Tan and Mr. Cost collected a total of 217 specimens between November 2008 and March 2009. They rummaged around in supermarkets, streets and in New York apartments, including that of Professor Stoeckle, where they found the new cockroach species.
"The superintendent of the apartment building was surprised when we wanted to save rather than squash the cockroach," remarked Ms. Tan.
After the specimens were collected, they were photographed and labeled before their DNA was isolated and sequenced by scientists at the American Museum of Natural History.
After the DNA was sequenced, Ms. Tan and Mr. Cost analyzed the sequences by matching them to known ones. Their collected specimens yielded 170 usable genetic codes that were matched to known DNA sequences for 95 different animal species. These DNA sequences are stored in GenBank and in the Barcode of Life Database (BOLD).
GenBank is an open access, annotated database of all publicly available DNA sequences and their protein translations. This database, established in 1979, is maintained by the National Center for Biotechnology Information (NCBI) as part of the International Nucleotide Sequence Database Collaboration, or INSDC. Containing more than 65 billion nucleotide bases in more than 61 million sequences, this remarkable database doubles in size roughly every 18 months.
BOLD is a newer database that is maintained by the Biodiversity Institute of Ontario at the University of Guelph, Canada, where DNA barcoding was pioneered. So far, scientists the world over have DNA barcoded over 750,000 individual specimens from more than 65,000 species. Their ultimate goal is a reference library of barcodes for all animals and plants on Earth.
These DNA databases are publicly accessible, so all researchers have to do is enter a DNA sequence to compare it to those stored there.
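To give a feel for what such a comparison involves, here is a toy sketch of a best-match lookup. It is not how GenBank or BOLD actually search (they use far more sophisticated alignment tools over millions of records); it only assumes a small in-memory "library" of pre-aligned, equal-length sequences. The species labels, sequences, and function names are all invented placeholders.

```python
from typing import Dict, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def best_match(query: str, library: Dict[str, str]) -> Tuple[str, float]:
    """Return the library entry most similar to the query, with its score."""
    if not library:
        raise ValueError("reference library is empty")
    best_name = max(library, key=lambda name: percent_identity(query, library[name]))
    return best_name, percent_identity(query, library[best_name])


# Hypothetical reference library: made-up 20-base sequences, not real barcodes.
toy_library = {
    "species_A": "ACGTACGTACGTACGTACGT",
    "species_B": "ACGTTCGTACGAACGTACGT",
    "species_C": "TTGTACGAACGTACCTACGG",
}

query = "ACGTACGTACGTACGTACGA"
name, score = best_match(query, toy_library)
print(name, score)  # species_A 95.0
```

A specimen whose best match still falls well short of the usual within-species similarity, as with the cockroach in this story, is the kind of result that prompts a closer look.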
The students were astonished to find that DNA was ubiquitous.
"We may think we live in a sterile, urban environment seemingly untouched by nature," Mr. Cost marvels. "We imagine objects are purified and cleansed in order to pass into our personal world with evidence of their original source all but erased. But DNA is amazingly resilient to damage through all the processing to which it is subjected. We got usable DNA from 151 of 217 of the items tested -- including dried soup mix, dog biscuits, beef jerky, butter, a feather lying on the sidewalk, a dried bit of horse manure from Central Park, even a feather duster."
Ms. Tan and Mr. Cost found that DNA is very durable.
"[A]fter we realized that DNA was, indeed, omnipresent, an important question arose: How much abuse can this genetic material take before it becomes unintelligible or even unrecognizable?" Ms. Tan asked.
"Could we find decipherable DNA in a piece of cooked meat? A piece of cheese? A highly processed dog treat?" Ms. Tan continued. "What we found was astonishing. Few specific conditions proved able to destroy the DNA consistently."
Canned foods were the one exception. Processed at high temperatures, canned foods contain DNA that was broken into tiny pieces, often making contents identification impossible.
The students were also impressed by the precision and power of the DNA barcoding technique.
"You could have a filet of fish, just the stuff you might throw on your grill, and an expert who spent his whole life [working with it] couldn't tell you what it was by looking at it," Mr. Cost observed. "But with this, it's so simple."
After the specimen was identified from its DNA "barcode", learning more about each species was easy.
"Learning the species name was like finding a key that opened a new book," Ms. Tan explained. "It's exciting to learn still more after you know a species name. For example, 'dried shredded squid' turned out to be jumbo flying squid (Dosidicus gigas). We looked up jumbo flying squid and found it grows to 100 lbs, swims at depths up to 2,000 feet, travels in large schools containing hundreds of individuals, and hunts in cooperative packs like wolves. This gave us new thoughts about the oceans and about calamari salad."
Most results were expected, but the pair also made some unexpected discoveries.
"There were a lot of surprises," said Mr. Cost. "We tested 'buffalo mozzarella' cheese and found it is made from the milk of Water Buffaloes. We asked some adults who have ordered it on restaurant menus and they didn't know that."
They ran across a few other surprises, too: sixteen percent of food items examined were mislabeled, including venison dog treats that were made of beef, dried shark meat that turned out to be Nile perch, sturgeon caviar that was Mississippi paddlefish and sheep's milk cheese that was made of cow's milk -- a potentially dangerous labeling error for those with food allergies.
But the biggest surprise of all was the cockroach whose genetic code did not appear in any of the DNA sequence databases.
"By appearance it looks like the American cockroach but it is genetically different from other American cockroaches in the databases," the two DNAHouse investigators said.
Ms. Tan and Mr. Cost got help on their work from the American Museum of Natural History and Rockefeller University in New York.
Both Ms. Tan and Mr. Cost graduate at the end of the 2010 school year. Ms. Tan plans to pursue biology in college next year, while Mr. Cost will study music.
[NOTE: Even though this was a fun story for me to write, and I absolutely love encouraging students to get involved in science, I have to tell you that I am suspicious of these data. Those of you who are familiar with molecular phylogenies will notice a few oddities in the phylogenetic tree above. First, the position of humans relative to birds implies that humans and birds are each other's closest relatives, while "other mammals," reptiles and fish are more distantly related to humans and birds. In short, this figure claims paraphyly in the evolution of mammals! Another oddity is the claimed relationship between "other arthropods" and insects, which, again, suggests that "other arthropods" are paraphyletic. Second, the branches of this tree lack any bootstrap data so it is impossible to distinguish real data from noise. DNA "barcodes" do not provide sufficient resolution to describe evolutionary relationships between organisms higher than the generic or family level -- contrary to what is implied by the students' phylogenetic tree. Basically, this phylogenetic tree is not publishable in any peer-reviewed journal for many reasons, not the least of which are the strange evolutionary relationships that it claims. It really bothers me that none of the journalists covering this story noticed these irregularities, leaving an unemployed (and presumably unemployable) evolutionary biologist to point this out.]
Musante, S. (2010). DNA Barcoding Investigations Bring Biology to Life. BioScience, 60 (1), 14-14 DOI: 10.1525/bio.2010.60.1.4
Press Release [PDF] (quotes).
Further proof that New York is the most diversified city in the US.
Ah. To safely write about NY roaches, you had to move to Germany. ;)
Lots of interesting stuff from a relatively simple school project. It's amazing to consider just how far DNA testing has progressed in such a short time, that this sort of thing can now be done by school kids.
To safely write about NY roaches, you had to move to Germany.
She even brought one with us. It got as far as the bathroom before we stomped it.
She even brought one with us. It got as far as the bathroom before we stomped it.
That was just the kamikaze decoy. The buggers have now established a base, and are busily breeding reinforcements. None of you is safe…
In 1991 I attended an American Society of Ichthyologists and Herpetologists meeting in NYC. A couple of the guys and I ate lunch at a nearby Chinese restaurant. When the food came, there was a dead cockroach on one of the plates. Called the waiter over. He grunted something, flicked the cockroach into a napkin, and that was that.
I'm just amazed that genetic testing technology has gotten to the point that high school students can have access to it for science projects. Granted, these two are obviously pretty advanced high school science students but still, pretty impressive.
Does their incorrect phylogenetic tree impact negatively on their discovery of the new cockroach species? Or is that a separate issue? (Sorry if this is a "dumb" question, I'm an enthusiast but have very little actual knowledge on this subject.)
Dear Grrl Scientist,
I appreciate the interest in Brenda and Matt's DNA exploration. Regarding the unrooted neighbor-joining tree, this was included to help illustrate diversity of specimens. It is not an assertion about higher-level relationships; indeed the coloring highlights several of the anomalies you note.
In answer to Jeff Knapp's question, the vagaries of the NJ tree do not affect the assessment of COI divergence among cockroach specimens. As you may have seen, the COI character differences are illustrated in a separate figure on the site.
I think what they're calling "DNA barcodes" might just actually be sequence. How else would they be able to say one barcode was 4% different than another? I assumed they used microsatellites because of the barcode word but perhaps I was wrong to do so. Sequencing a highly conserved gene would be simpler than microsatellite analysis I think. Certainly easier to understand.
Jesop: i edited the piece to include information that answers your question. the gene region used as the "barcode" was the mitochondrial gene CO1 -- which is probably the best-known gene region in the animal kingdom.
Does their incorrect phylogenetic tree impact negatively on their discovery of the new cockroach species
Just to explain why it isn't a problem, different genes evolve at different rates. So to infer higher level taxonomic information (i.e. the stuff the tree above gets wrong), one needs something that evolves slowly. If it's too quick, then any pattern in the data is obscured by the large number of mutations.
But a gene that evolves slowly enough to see the higher level information isn't any good for looking at closely related species, as there isn't enough time for any mutations to accumulate. Hence, a faster gene has to be used. Microsatellites (mentioned by Jessop) have been used for between-species comparisons, but they're more often used to look at within-species dynamics, because they can evolve really quickly.
Just to throw my opinion in - the paper isn't available yet, so we can't see the details, but designating a species just on DNA divergence looks suspect to me: there's nothing in biology to say that divergence of more than a certain amount is enough to create a new species. OTOH, it is suggestive, and dissection of the beastie will give a better idea about what's happening. As would a wider survey of the New York cockroaches: it may be that COI is acting strangely, or that the population is diverse, or something else odd has happened (e.g. introgression from another species).
One advantage to having a couple of cats in Houston is that they catch and play with the tree roaches that get into one's residence; by the time the cats get bored, the roach has been slowed down to the point of being easily captured and flushed. My former girlfriend's Abyssinian cat, however, liked to catch roaches and set them loose in bed at 3 a.m., inevitably necessitating our immediate stripping of the bedclothes in search of the interloper. Re speciation, the recent issue of Nature on biodiversity in crisis (Nov. 19, 2009) carried an article pointing out that the two-barred flasher butterfly, a favorite of Lower Rio Grande Valley butterfly watchers, consists of ten species indistinguishable visually as adults (but with distinctive caterpillars)(see p. 272 in article titled "On the origin of bar codes" by Nick Lane). We undoubtedly have missed a lot of diversity with just the naked eye.
So, what they are saying is that soon cockroaches will be talking and singing à la Joe's Apartment?
So Awesome! Great article. But seriously, give them a break. They're in frackin' HIGH SCHOOL! You should not expect them to be producing publishable material, yet they, two HIGH SCHOOL students, have discovered a NEW SPECIES! I think one can forgive their phylogenetic tree. Sheesh.
Mr. E: i apologize if i sound overly harsh towards the students. i limited my comments exclusively to the data and the data presentation because that is what deserves a critical eye. this is science, after all, and scientific rigor is critical to preserving the credibility of the discipline. Ms. Tan and Mr. Cost are not their data, and they should feel proud of their dedication and of their accomplishment.
Since NYC is already a "common garden" for cockroaches, the trick will now be to raise the two mitochondrial genotypes together, and see if they're reproductively isolated or morphologically different. Some of the recent results with Radix and Emys suggest that knowledge of DNA similarities doesn't necessarily simplify the stories that can be told about relationships.
I am amazed at the level of study that these students have been involved in. I assume, Mark Stoeckle, that you are their adviser or science teacher and am impressed that you have led them to the DNA barcoding procedure for analysis.
Can you be cloned for my son's school?