Wikipedia, just like an Organism: clock genes wiki pages

The October issue of the Journal of Biological Rhythms came in late last week - the only scientific journal I get in hard-copy these days. Along with several other interesting articles, one that immediately drew my attention was Clock Gene Wikis Available: Join the 'Long Tail' by John B. Hogenesch and Andrew I. Su (J Biol Rhythms 2008 23: 456-457.), especially since John Hogenesch and I talked about it in May at the SRBR meeting.

Now some of you may be quick to make a connection between this article and its author Andrew Su and A Gene Wiki for Community Annotation of Gene Function, published in PLoS Biology back in July, where one of the authors is also Andrew Su. And you would be right - it's the same person and the two articles are quite related.

In the PLoS Biology article, they write:

A loose organization of Wikipedia editors has spearheaded the creation and expansion of several thousand articles related to molecular and cellular biology (the "MCB Wikiproject"), including many gene-specific pages. These articles vary widely in quality, format, and completeness, ranging from relatively complete encyclopedic entries (e.g., "enzyme," "oxidative phosphorylation," and "RNA interference") to very short collections of information called "stubs" (e.g., "amphinase" and "glomus cell"). As an example of the collaborative writing process, the article on RNAi has been edited 708 times by 232 unique editors since its initial creation in October 2002. On the subject of human genes, generally only the most well-characterized of genes and proteins have highly developed entries (e.g., "HSP90" and "NF-κB").

In principle, a comprehensive gene wiki could have naturally evolved out of the existing Wikipedia framework, and as described above, the beginnings of this process were already underway. However, we hypothesized that growth could be greatly accelerated by systematic creation of gene page stubs, each of which would contain a basal level of gene annotation harvested from authoritative sources. Here we describe an effort to automatically create such a foundation for a comprehensive gene wiki. Moreover, we demonstrate that this effort has begun the positive-feedback loop between readers, contributors, and page utility, which will promote its long-term success.

In the JBR paper, the authors focus on the development of Wikipedia pages describing genes involved in circadian rhythms, probably the first genes to be done comprehensively there, as an example for others as to how to do this kind of thing:

Why use Wikipedia for this? First, Google and Wikipedia have already become scientific research tools. When you Google an unfamiliar gene you usually end up at common sites of gene annotation such as the National Center for Biotechnology Information. Though these sites have expert curators who do the best they can, they are usually not domain experts and are so overloaded that they frequently fall behind in accurately summarizing the literature. (It's actually amazing what they accomplish given available resources.) For confirmation, research your favorite gene. Using Wikipedia will allow our community to build and evolve living, up-to-date summaries on the function of important genes in the circadian network. Check out the pages on Arntl (http://en.wikipedia.org/wiki/ARNTL) and Rev-erb-alpha (http://en.wikipedia.org/wiki/Rev-ErbA_alpha). Second, in part due to Wikipedia's past success, its pages appear near the top of search engine lists such as Google, and consequently attract viewers. Finally, our field competes with other disciplines for the best and the brightest young scientists. These people use Wikipedia. High quality pages on annotated clock genes will attract their attention, and attract them to our field.

Importantly, the gene pages need not be extremely long. What is much more important is that they be well referenced. See, for instance, the Wikipedia pages they mention: those for the ARNTL gene (also known as Bmal1 or Mop3) and Rev-ErbA alpha (I have written about some of these genes before, e.g., Lithium, Circadian Clocks and Bipolar Disorder, Tau Mutation in Context and The Lark-Mouse and the Prometheus-Mouse, if you want more background). That is all that is needed - if I wanted to be silly, I could say that since genes are small, their wiki pages need to be small as well. But that is only half-silly, really.

This is just like in the real world. Genes don't really do anything. They are coded descriptions of parts in a catalog. To explain a biological function, one needs to go from genes to their mRNAs to proteins, then to look at protein modifications and how multiple proteins interact with each other. Then see how such protein interactions affect the behavior of a cell. Then see how the altered behavior of a cell affects the entire tissue and how the changes in that tissue affect distant organs. Finally, one gets to explain the function once one understands how a collection of organs, interacting with the external environment, results in changes in biochemistry, development, physiology or behavior of the organism, and how this function evolved.

In the same way, gene pages on Wikipedia are not supposed to be stand-alone. Knowing everything about a clock gene does not mean one knows anything about circadian rhythm generation and modulation (not to mention its evolution). The value is in links - to all the other clock genes, to genes that do similar things (e.g., other transcription factors or nuclear receptors), to primary literature on the proteins coded by these genes and their interactions, and to higher-level functions, e.g., the Circadian Rhythms page and links within.

Some would ask - Why Wikipedia (I know, there are still some people out there who don't like it):

What's the downside? The major criticism is poor annotation. Actually, we argue that no annotation is worse than poor annotation, as the latter tends towards self-correction by provoking experts to intervene. In fact, a recent study concluded that Wikipedia was as accurate as Encyclopedia Britannica, and unlike Britannica, growing at a rate of 1500 articles per day (Giles, 2006). Another potential downside is non-consensual or controversial entries. We would argue that these are better addressed in real time via Wikipedia than in journal articles, where they remain fixed for years. Wikipedia even has tools to deal with controversial topics (for examples, see entries on "Intelligent Design," evolution, "Swift-boating," or climate change).

And, I'd argue, clock gene pages are not as contentious as those on climate change or creationism. Very few Wikipedia pages are so controversial as to be continuously suspect. Almost all of the pages are on non-controversial subjects, written and edited by experts on the topic, and are as reliable as, or better than, anything else one can find out there, not to mention the fastest to get updated once new information comes in.

The effort is starting with a focus on mammalian genes, for obvious reasons of medical relevance and the existence of a wealth of information. But there is just as much, if not more, information on Drosophila clock genes. And comparative analysis of clock genes in a variety of organisms is the key to understanding circadian function and its evolution, so if your strength is in other old or emerging model organisms (did you see Japanese quail on that list?!), don't hesitate to add the pages and information on those.

Finally, I'd like to urge you to contribute - I know that many chronobiologists read this blog (though most are silent types who never comment). It will take 30-60 minutes of your time to make or edit a page on a gene (or a higher-level process) in circadian biology, and this effort will have a much bigger audience and much broader impact than all of your peer-reviewed papers put together. It's worth your time even if it will probably have no effect on your getting tenure. But the tenure committee is not your only audience - there are researchers around the world (many in developing countries), teachers, students and lay readers who will be affected by your contribution in much more lasting and important ways than the inner circle of your department. Isn't this why you are doing science in the first place?

If you want to discuss this more, come to ScienceOnline09, where John Hogenesch, one of the authors of the JBR article, will demonstrate Wiki Genes, answer questions, and deeply internalize your suggestions ;-)

References:

John B. Hogenesch and Andrew I. Su, Clock Gene Wikis Available: Join the 'Long Tail', J Biol Rhythms 2008 23: 456-457.

Jon W. Huss, Camilo Orozco, James Goodale, Chunlei Wu, Serge Batalov, Tim J. Vickers, Faramarz Valafar, Andrew I. Su (2008). A Gene Wiki for Community Annotation of Gene Function PLoS Biology, 6 (7) DOI: 10.1371/journal.pbio.0060175

Speaking of Wikipedia's accuracy issues, the only studies I'm aware of evaluating its articles and comparing them to other projects (like the Encyclopædia Britannica) are fairly old — 2005 vintage. Anyone know about more recent investigations, or have "insight" (i.e., funding) on conducting one?

I sincerely doubt if "the best and the brightest young scientists ... use Wikipedia", and if they did they wouldn't remain "the best and the brightest" for very long.

For up-to-the-wiki-picosecond information on Wikipediot Culture, you may visit The Wikipedia Review.

Jon Awbrey

"the best and the brightest young scientists ... use Wikipedia" - they do.

"....they wouldn't remain "the best and the brightest" for very long..." which is what will happen to the dinosaurs stuck in their last Millennium mindset.

Is Wikipedia (like democracy) perfect? No. But (like democracy) it is better than the alternatives. And (like democracy) the more we participate, the better it gets. It's that simple, naysayers notwithstanding.

Ignore Jon Awbrey, he's just a banned user who trolls Wikipedia-related blogs and news articles occasionally.

I've personally found the gene-article push to be a very interesting development, and it is certainly a good idea for researchers to help link and improve what's available out there.

As a Wikipedian, I extend the offer to help out any researcher who'd like to contribute but wants guidance in editing Wikipedia.

Bora,

From what you say, I would guess that you are familiar with a small corner of Wikipedia, one that is fortunate enough to be watched over by a pre-established community of people who are competent in the pertinent fields. I have known a few such wiki-paradises in Wikipedia, but I have learned that they are very fragile and ephemeral to the max. No part of Wikipedia is safe when some economic, industrial, or political interest group discovers an interest in it. When that happens, you will discover to your dismay that the Wikipedia Way of doing things affords none of the protections against information warping that have taken centuries to evolve in the Real World.

Jon Awbrey

I have addressed that point in the post - read it.

@Blake, I did stumble upon a study similar to the Giles Nature one which compared Wikipedia to "authoritative sources", though this time on history topics. The findings were much less rosy in this study. Bottom line, I think the jury's still out on where Wikipedia falls in the spectrum of accuracy and completeness relative to other sources. It's worth noting that there are also other systems besides the Gene Wiki that have been recently published (Wikiproteins, wikipathways, wikigene, etc), so it's clear that other models of community intelligence are being explored. To echo Coturnix, all of them empower you to make the resource better, and the best way to vote your support for a model that you like is with your participation...

I saw you mention some aspects of the problems with Wikipedia, but I did not see anything approaching a full appreciation of their character and intransigence.

But perhaps I can distill the remainder of my reflections down to this single piece of advice:

Any scientific resource worth creating in one social intellectual technical environment (SITE) is worth distributing through a number of different SITEs, ones that observe a variety of different protocols. This provides the resource with a measure of redundancy and robustness in case some of the SITEs crash for whatever reasons.

But you probably already knew that ...

Many Regards,

Jon Awbrey

Regarding the reliability/accuracy of Wikipedia, may I offer a couple of links that are worth reading?

http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_31#The_Unb…

http://wikipediareview.com/index.php?showtopic=20565

Within that second link, you'll discover that, at least regarding the 100 Wikipedia articles about the 100 U.S. Senators, their articles are substantially wrong (on average) for 1.63 hours per day.

And, when I say "wrong" I mean things like placing John McCain's birthplace and the Panama Canal in the state of Florida -- an edit that was viewed by about 90,000 visitors, none of whom bothered to correct it. Another Senator was purported to have "participated in kinky sex adventures" during his high school years, a defamatory comment that slipped through about 3,100 page views without correction. Sen. Bob Menendez of New Jersey's article indicated that his divorce was the result of his "cheating" and "infidelity". These edits persisted for about 151 hours (over 6 days) on Wikipedia. Do you imagine that these egregious errors would have persisted so on a reputable encyclopedia project?

Note: Nihiltres (commenting above) is also a blog & news surfer who makes lots of sycophantic comments in favor of people being absorbed without question into the communion of Wikipedia, leaving all doubts at the door. Such boosterism is to be feared, I would think. Some of us are actually producing documented statistics about the troubles of Wikipedia. Which would you prefer? Platitudes or statistics?

If you enjoyed this post, please do us a favor and Digg the study of the U.S. Senators' vandalism:

http://digg.com/politics/McCain_raped_wife_Obama_a_nudist_and_Hillary_h…