Tidbits, 16 September 2009

The Book of Trogool turns another page...

  • Social scientists and medical researchers, pay attention to this: "Anonymized" data really isn't—and here's why not. If informaticists aren't starting to run similar analyses on their own "anonymized" data, they should be. This is a serious concern.
  • One for the humanists: the rather vaguely named Scholarly Communication Institute Report from Virginia. The theme was using spatial data in the humanities.
  • From my SciBling Christina: Anybody can code… but should you? Peer review is for more than published papers. Holding your code close to your chest probably means you're writing unnecessarily bad code. Trust me. I write a lot of bad code.
  • The data tell the story. Government data in this case, but imagine what could be done with research data! Imagine!
  • What is the scientific paper? A sensible outsider's view. Money quote for our purposes: "Like it or not, science increasingly depends on data being published in public machine readable formats."

Personal note: I may be a little scarce around these parts for the next little while. I have three presentations to give in the next six weeks, none of them the same, none of them finished yet. In fact, two of them are but gleams in the back of my cerebellum. This is eating most of my off-work time at present.

Hope your Hump Day was fruitful.



wrt anonymized data - this article is absolutely not surprising to anyone who does a lot of searching! The problem comes when you need a realistic data set to experiment with. What do you do? There are new synthetic data sets - but how realistic they are depends on the care taken when creating them.
I do what I can and what I need to ethically to protect participants in my studies, but if I use a quote from them, odds are they'll be recognized by people who know them.