I need to lift the iron curtain between this blog and my workplace. I beg your indulgence for one post. As those who read Bora's interview with me know, I discontinued my previous blog Caveat Lector because I was informed that it was causing significant distress to individuals in my workplace. In my best judgment, I could not continue to blog there in any capacity without it appearing that I had simply brushed off the problems I caused. I took those problems very seriously indeed, as the closure of CavLec bears witness. When I came to ScienceBlogs, I intentionally structured Book of Trogool…
Did you miss the tidbits? I rather did. Data in climate science, and the problem of standardslessness: One database to rule them all, track global temperatures. Congratulations to Duke, the latest open-access mandate success! Paolo Mangiafico, on Open Access at Duke University. Not all governments are on the open-data bandwagon: When public records are less than public. See also NARA’s Digital Partnership Agreements, featuring the extreme difficulty of paying for large digitization programs without restricting access, at least initially. Which datasets merit preservation? Bryan Lawrence offers…
Saying that large-scale storage is all that's necessary for data curation is like saying that empty bookshelves are all that's necessary for a library.
Word on the street is that the NSF is planning to ask all grant applicants to submit data-management plans, possibly (though not certainly) starting this fall. Fellow SciBlings the Reveres believe this heralds a new era of open data. I'm not so sanguine, at least not yet. Open data may be the eventual goal; I certainly hope it is. At this juncture, though, the NSF would be stupid to issue a blanket demand for it, and I rather suspect the NSF is not stupid. Part of the problem, of course, is that many disciplinary cultures are simply not ready for even the idea of open data. If the NSF were to…
Having made it back at last from Scotland despite the ash cloud, and overcome jetlag and (some) to-do list explosion, I finally have leisure to reflect a bit on UKSG 2010. My dominant takeaway is that nearly everyone in the scholarly-publishing ecosystem—publishers and librarians alike—is finally aware that we can't keep kicking the journal-cost can down the street any longer. Serials expenditures cannot and will not continue at their current level, much less increase. When I think back to the last talk I gave to an audience of publishers, I see that a lot has changed just in my own demeanor…
My husband and I have been stranded by the ash cloud from Iceland. We are well-housed thanks to good friends and the strength of weak ties, so there is no need to worry about us. With luck, we'll be able to get home Tuesday the 27th. Blogging will continue to be sporadic until we're home. I couldn't let Yale's shortsighted decision to free-ride on open access pass without comment, however. This has always been a danger for gold open access: that libraries would protect their toll-access collection budgets by choosing to free-ride on others' support of open-access journals. It is wrong for any…
I'm just back from lunch, after giving my UKSG talk first thing this morning. Here are slides plus notes: Who owns our work? (notes) I'm aware that some of the notes are cut off owing to font size; I'll fix that as soon as I have a free minute. I also have slides only up, but this deck is a little more gnomic than some I've done, so I don't know how useful that'll be. I'm having a wonderful time, and am very grateful to UKSG for the invitation (as well as their wonderful hospitality). You can follow the sessions via a cadre of brilliant livebloggers at UKSG's LiveSerials blog. Next Monday I'…
I am off to bonnie Scotland tomorrow for the UK Serials Group conference. I'll also be jogging down to Bath to meet some of the fine people at UKOLN and talk data. There's a tremendous amount happening rather fast around serials at present; I wish I had time to blog it all, but I don't—I have a class to give tonight and a little more packing to do. See you soon! And if you're coming to UKSG, please do say hello.
Dan Cohen has an extraordinarily worthwhile post recounting his talk at the Shape of Things to Come conference at Virginia (which I kept my eye on via Twitter; it looked like a good 'un). I see no point in rehashing his post; Dan knows whereof he speaks and expresses himself with a lucidity I can't match. I did want to pick up on one piece toward the end, because it has implications for library and archival systems design: Christine Madsen has made this weekend the important point that the separation of interface and data makes sustainability models easier to imagine (and suggests a new role…
Not good at organizing your thoughts, much less your research notes? Think publishing your data should be as easy as falling off the couch? Yes, well, me too. So I've built a new site to do it all for you, and I'm calling it Curatr. Built on all the shiniest and most proprietary technologies, from HyperCard to Flash. Automatically builds the most appropriate storage and interaction models based on computerized analysis of provided data. No documentation needed! Auto-organizing. Never touch metadata again! Can be managed by a single graduate student in two hours a week without any prior…
Tuesday seems a good day for tidbits. (I am head-down in my UKSG presentation and class stuff at the moment, so kindly forgive posting slowness.) One argument I rarely see made for open access that should perhaps be made more often is that it reduces friction in both accessing and providing information. Want to reduce the overhead of responding to FOIA requests? Post the information online. Data, data, we love data! Data is at the heart of new science ecosystem and Preserving the Data Harvest. Oh, and if you hadn't noticed, The Data Singularity is Here. Some good lay-level explanations of…
This is my blog post for Ada Lovelace Day, on which we celebrate technical achievement by women. I'm writing it the day before, and setting it to post at midnight. I hope someone is writing a biography of Henriette Avram. I will be first in line to buy it. I desperately want to know how she did what she did. Her achievement is generally, and appropriately, recognized as a technical one: designer and implementer of the MARC (MAchine Readable Cataloging) format still in use in hundreds of thousands of libraries worldwide. If that had been all: dayenu, it would have been enough. For all its…
I was reading the latest issue of the Journal of Digital Information today, and I found myself wishing I could turn the Readability bookmarklet loose on half its PDF-only articles. I'm sorry, authors. I know you tried, but those PDFs are terrible-looking. Times New Roman, really? (The one in Arial is the worst, though.) Could we discuss your line-height and why it's not tall enough? Line-length, and why it's too long? Sniff at me for an ex-typesetter if you like (I am an ex-typesetter, as it happens), but the on-the-ground reality is that I didn't read as much of those articles as I'd have…
OASPA is starting to get its act together, posting a concise summary of its membership procedures and making a new procedure for complaints relevant to the quality measures OASPA wishes to maintain among its members. I think OASPA is right not to offer to police every OA journal in existence. There isn't enough money in the world. It's also a clever stance that invites additional membership. It's not perfect, however. OASPA had a choice to make between complete transparency—of accusers, of accused, of the process—and the sort of hush-hush under-wraps procedures that invite elevated eyebrows.…
First, a small warning: I am having an extremely crowded and busy week, so blogging here (even the catchup I need to do to the many excellent comments on the Battle of the Opens post) will suffer. Something for folks to chew on in the meantime: can anybody explain to me what this tool (if it is a tool) actually does? I clicked over thinking it might be a good thing to add to a tidbits post, but I confess myself wholly flummoxed by the jargon therein. Any ideas, anyone? Especially anyone with a health-care background?
John Dupuis asks some provocative questions; I thought I'd take a stab at answering them, and I encourage fellow SciBlings to do likewise. I quite agree with John when he says that the ferment over publishing models disguises a larger question, "the role of scholarly and professional societies in a changing publishing and social networking landscape." My own history with professional societies, I think, bears this out nicely. John asks first: What societies do you belong to? I belong to the American Society for Information Science and Technology. I was a member of the American Library…
I'm committed to a lot of different kinds of "open." This means that I can and do engage in tremendous acts of hair-splitting and pilpul with regard to them. "Gratis" versus "libre" open access? Free-speech versus free-beer software code? I'm your librarian; let's sit down and have that discussion. Unfortunately, out there in the wild I find a tremendous amount of misunderstanding about various flavors of open, sometimes coming from otherwise perfectly respectable communications outlets. (Pro tip: If you're not completely sure you understand, please find someone to ask. A librarian is a good…
We have a guestblogger today! At my request, Peggy Schaeffer kindly sent me the following introduction to Dryad, which I reproduce as I received it (save for minor formatting details). I will happily pass any questions in the comments on to Peggy for response. ---- Dryad is a repository for data underlying scientific publications, with an initial focus on evolution, ecology, and related fields. It's not an institutional repository, or one focused on only a single type of data -- it's designed for the multitudes of data underlying published articles that would otherwise be scattered…
One of the truisms in data curation is "well, of course we don't let sensitive data out into the wild woolly world." We hold sensitive data internally. If we must let it out, we anonymize it; sometimes we anonymize it just on general principles. We're not as dumb as the Google engineers, after all. Only it turns out that data anonymization can be frighteningly easy to reverse-engineer. We've had some high-profile examples, such as the AOL search-data fiasco and the ongoing brouhaha over Netflix data. Paul Ohm's working paper on the topic is a great way to get up to speed. We librarians are…
I interrupt your regularly-scheduled blog to ask for some help... comments closed on this post so that you'll comment where it'll do the most good. --- Apologies for duplication, and please forward/repost as appropriate... We are working on comparing four digital-repository software packages (DSpace, ePrints, Fedora, and Zentity) in hopes of helping libraries and other institutions select the most appropriate software for their requirements. Read more about our project at http://blogs.lib.purdue.edu/rep/. We invite anyone who has recently embarked upon planning for a digital repository to tell…