Seriously. Just getting around to technorati claiming. Move along, nothing to see here. Watch for a lengthy post on scientific publishing later tonight or tomorrow. 59tbcg4wsi
I wrote this up on the request of a colleague who heard my talk recently on open data. I'm posting it here for comment and adding some hyperlinks... Moving from a Web of documents to a Web of data (or of Linked Open Data) is an oft-cited goal in the sciences. The Web of data would allow us to link together disparate information from unrelated disciplines, run powerful queries, and get precise answers to complex, data-driven questions. It's an undoubtedly desirable extension of the way that the existing networks increase the value of documents and computers through connectivity - Metcalfe's…
I was in a roundtable yesterday talking about Health IT with a bunch of very smart people in the Bay Area. It was sort of a briefing of ourselves and others about the real issues underpinning what it would take to generate real disruptive innovation in health technology and health costs. The vast majority of the conversation centered on payment reform, which is outside my ambit. But we did spend some time talking about health data standards, and the problem of getting standards that are so geared to the existing market-dominant companies that they actually freeze out new market entrants. My…
Following on to yesterday's post, where I wrote about the four functions that traditional publishers claim as their space (registration, certification, dissemination, preservation), I want to revisit an argument I made last week at the British Library. In my slides, I argued that the web brings us at least three additional functions: integration, annotation, and federation. I wanted to get this argument out onto the web and get some feedback... Let's start with integration. The article no longer sits on a piece of dead tree, inside a journal formatted by date and volume and page number. It…
I spoke last week at an event at the British Library about the future of the scientific article. It was a lively event - lots of FriendFeed and Twitter reactions - and it got me thinking a lot about the way we use publication in science. In my conversations with research staff and leaders at the BL, I ran across this statement. Publishers frequently claim four functions: registration (when was an idea stated?), certification (is the idea original, has it been "proved" to satisfactory peer review?), dissemination (delivery), and preservation of the record. The journal thus provides for both…
Just a quick hit - I'm digging out after a wonderful break from work - but this deserves notice... Since 2004, WisconsinView has made aerial photography and satellite imagery of Wisconsin available to the public for free over the web. As part of the AmericaView consortium, WisconsinView supports access and use of these imagery collections through education, workforce development, and research. Starting June 30, 2009, WisconsinView is making available all of its more than 6 Terabytes of imagery data under the new CC0 Protocol provided by Creative Commons. The CC0 (pronounced CC-Zero) Protocol…
There's an interesting tweet about attribution in the data web. And it raises a tension I run into a lot but haven't seen a lot written about: the shifting nature of what the word "attribution" means. We have a fairly common understanding of attribution in our daily lives: "credit where credit is due" is mine, and it tends to be what most people think. This holds whether one is a musician, a scientist, a teacher, or anyone who does creative or innovative work. We like getting credit for our work. No problem there. This idea of attribution encompasses the idea that we should get credit for our…
I'm at the Seed - Council on Competitiveness State of Innovation Summit. I was thinking about live blogging, but find that doing so makes it hard for me to think about what people are actually saying. There's a webcast if you're interested. As far as conferences go, it's a good one. Rock stars on the stage (E.O. Wilson is a hero of mine) and interesting conversations about innovation. But I'm frustrated, as I often am at "innovation" conferences. What follows is a bit of a rant directed less at this event, which as I said is a good one, than at the conversation I hear all the time about…
I'm happy to say that I'll be doing a forum at the British Library on July 22, called Scientific Findings in a Digital World: What is the Genuine Article? There's a Nature Network group you can join to participate in the creation of the agenda. This is pretty cool. The British Library is a legendary institution, and has some personal resonance for me too - my dad wrote a big chunk of his dissertation in the reading room there. I'll make a few introductory comments and then do my best Oprah impersonation.
Paul Miller and I recorded a chat last week that's now online as a podcast from Cloud of Data. Paul is a smart guy and it was a fun interview. We first met when he was working with Talis, which is a very progressive company in the UK (they sponsored some of the development of the PDDL and currently host data in the public domain for free in the Talis Connected Commons) but he's now out freelancing. Check out the podcast and let me know your comments.
So, I was supposed to go up to Montreal and Ottawa the past couple of days, but a run of miserable luck with planes made it unworkable (it's complicated). Instead, I tried to record a presentation and get it onto the web so we could play it for them, and then take questions by Skype. That also didn't work. However, we did succeed in the end in getting the video online. So if you're interested in what I say when I talk to the libraries, but haven't been to one of the conferences where I've spoken, take a look.
As noted on the Creative Commons blog, the folks at Digg have converted to CC0 (replacing a multiyear use of a different public domain legal tool). This is very cool on lots of levels. But Daniel Burka of Digg said it best, so I'll make this a short post by simply quoting him... This is good for the internet and good for society. He's talking about the public domain, and he's right.
This was in the comments from my blog post on Pfizer's semi-open innovation. I don't normally highlight comments like this, but sometimes you have to give credit where credit is due. Why deal with Pfizer in the first place? Anything you might find they'll keep and you're SOL. We have a compound library that started from 1.4 million cmpds from Chemdiv, Chembridge, Maybridge and Tripos. I talked them into using our exclusion criteria (developed by my old buddies from Pharmacia - we all got Pfired when Pfizer took over Kazoo) and got rid of all the junk we didn't want (1 million). From there…
I ran into Virginia Acha last week at the NESTA event in London, but she didn't tell me about this! Derek Lowe at In the Pipeline notes that Pfizer is apparently allowing external companies to screen against their internal library. But I'm told that Pfizer has been meeting with several other (mostly smaller) companies, offering their (entire?) compound library as a screening resource. As I understand it, you need to come to them with a reasonably formatted HTS assay, and there's a fee in the high hundreds of thousands to run the screen. This isn't all the way towards open innovation. In a…
Open Knowledge Foundation have released a short guide to open data as part of the open data commons project. I have my philosophical disagreements with OKF on some issues - and they with me! - but they're the kind of disagreements that come from people on the same side of the fence. We all want open data, and we want it now. Moments like this are good to step back and focus on our agreements. We agree that data is a little weird, and that we need more research on how to best treat the law around the data. We agree that public sector information needs to be free - in fact, Rufus Pollock has…
(note - I have edited this post to add in Rufus Pollock, who I left out primarily because I wasn't sure he would endorse the ideas in this post - Peter notes that he was not only at the meeting but essential, so I'm happy to add these edits!) Peter Murray-Rust has posted some essential reading for anyone interested in open data in the sciences. He follows onto Cameron Neylon's post whose title I have quoted in my own title here. Peter summarizes an informal summit meeting held last week in the UK by a group of folks interested in open science and open data, including Rufus Pollock of the…
Lately I've been spending a fair amount of time talking to the folks at NESTA in the UK. There's a lot of interest in how the kinds of legal and technical infrastructures we're building at Creative Commons might work at scale in the UK, and yesterday NESTA hosted me and James Boyle (founder of Creative Commons, and a guiding force in our science work from the very beginning) at an event labeled Open Innovation and Intellectual Property, jointly hosted by the Wellcome Trust and Creative Commons. It was an interesting day. It was one of the few times I've had the scope of topic to cover all the…
I was invited to join a meeting last week in New York to kick off something called the "Concept Web Alliance." It's an emerging nonprofit hoping to stimulate the emergence of lots and lots of marked-up content from the life sciences, and it's claiming the mantle of open access. The potential value of a concept web is essentially the same idea as the semantic web, but with a little more savvy about branding - we can have a computable web of data linked into the literature, and a better way of asking very precise questions of a massively complex data space if the information has more structure…
Quick hit at the end of the week. I've got a couple of posts I'm trying to finish up and post next week. But it's worth noting that the new H1N1 is sequenced and available under the same open access terms as the rest of the NCBI data and contents. All that misery and expense and illness, from this short series of letters. Nature's one hell of a programmer. 1 atggatgtca atccgactct acttttccta aaaattccag cgcaaaatgc cataagcacc 61 acattccctt atactggaga tcctccatac agccatggaa caggaacagg atacaccatg 121 gacacagtaa acagaacaca ccaatactca gaaaagggaa agtggacgac aaacacagag 181…
The Tropical Disease Initiative has released a "kernel" for open source drug discovery. It's been published in both Nature Biotechnology (ugh, subscription required) and in PLoS Neglected Tropical Diseases (yay, open access fulltext under CC-BY). I am not steeped enough in the reality of drug discovery to make believable statements about how much this means for those actually on the ground looking for cures. I do know that drug targets are only the first step on a really long road to drugs in patients, and Derek Lowe has written a very informative post on this topic over at In the…