There have been a number of piercing calls for training of data professionals (of various stripes) in the last year or so. Schools of information have been answering: Illinois, North Carolina, others.
Honestly, I'm getting a sinking feeling in my stomach. If I were to label it, the label would go something like "where are all these newly-minted data professionals going to work?"
My stomach sinks worse when I realize that quite a few of the calls are coming from the same people and organizations who uttered piercing calls for the establishment of institutional repositories in the early 2000s. Libraries did as they were bid; the results were at best mediocre (and that's a generous assessment). The callers have not, to the best of my knowledge and belief, acknowledged any error in the call they made, much less any of the waste and damage caused. So… we're going to trust these same people on a similar leap into the half-known?
The larger question is how we move data professionals into the research enterprise. It's an analogous question to others that have surfaced in libraries: moving librarians into the classroom to teach more than Booleans, for example. We'll hear some of the same things from the people we want to help: "a solution in search of a problem," notably, as well as "how can you possibly understand my research if you're not just like me?"
(My answer is what it's always been: "I don't have to understand your specific data to tell you that keeping data on CD-ROMs in a shoebox under your desk is a bad idea.")
I've seen one answer I like: internships. GSLIS at Illinois moves its data-curation students into data-related internships once they graduate. They beat the bushes for research organizations looking for the kind of help their graduates provide. In so doing, they ease their people into jobs, raise the profile of their program, and raise the profile of information professionals as research partners generally. This is smart business. I go further: I believe it wholly irresponsible to have a data-curation instruction program targeted at librarians and information professionals without such an internship program.
Training scientists is another question, of course; I don't think it's quite as necessary to do internships in the well-accepted informatics fields. It probably can't hurt, though.
Grant funders: I'd like to see some bribes happening. Make money available to grantees to hire on data professionals. The wording of such grants will be tricky—you don't want them hiring just another developer—but I'm sure you can do it. Likewise, fund the internships I just described! Finally, any research you can fund that demonstrates good outcomes from the presence of data professionals can only help.
Institutions: I don't know; I truly don't. Some days I believe that data management can only happen on the level of the individual research lab. Some days I believe that data can only survive if institutions tackle the problem. Some days I believe both, and my head hurts.
We all of us need to avoid some obvious pitfalls, however. The maverick-manager pitfall familiar to libraries from the IR disappointment is one: data curation for an entire research institution cannot become the exclusive purview of one or a handful of supposed data professionals, especially when they have no budget, no developers, one server at most, and no institutional network.
Flooding the job market is another. Data professionals will lose what little credibility among researchers we have if dozens of us wind up applying to every open job. That leads to perhaps the shortest road to deprofessionalization in history! Let's not do it. One way to avoid it may be to bite the bullet about incoming qualifications: perhaps we need to sigh and say "no science BA, no enrollment in this program; MAs and Ph.Ds in science preferred."
That slams the door on me, incidentally, and I wouldn't be happy about that. But if it means that newly-minted professionals have obvious job-market value, then that's what we have to do.
Finally, let's not get quite so exercised yet about who does what work; we risk "I stubbed my toe! Call in a specialist!" syndrome. Let's focus on the work to be done. Work has a marvelous way of getting done, when it has to be, even by people who aren't "professionals" and don't have "professional" training. I am not a professional programmer. I don't have the least hint of a degree in computer science, software engineering, or anything else. I still write code, because the code won't write itself. If similar processes are how data curation turns out to happen, that's fine with me.
Not least because then I won't have "professional" doors slammed in my face.
That's a common problem. We heard for years about the great shortages of hard science and engineering graduates, but many of those I know (at Ph.D. level) faced either a lot of competition when looking for a job or a prolonged stay in underpaid postdoc positions. Note that in industry, the salary of the Chief Legal Officer (not to mention the CEO or CFO) often exceeds that of the Vice President for Engineering/Research, whose job it is to create the products whose sale produces the money for the company. As they say, 'Talent is cheap.'
The supply of science Ph.D.s seems to be set by the need of universities for cheap teaching assistant labor, or later, cheap research assistant and post-doc labor.
For the data specialists, it may be the need of the relevant university departments to keep their student numbers up to justify their existence.
The names Bowen and Sosa are not spoken around me save with fulminous curses. :)
It's true in libraryland, too; my last year in library school, I spilled a lot of pixels calling shenanigans on ALA's loud trumpeting of a "librarian shortage."
I have zero interest in seeing this happen again -- not least because the people who are brave enough to enter data-curation programs now are self-selected, probably the cream of any crop we're likely to muster. We do not want to lose these people because we didn't think hard enough about placing them where they can do good work!
I hope this isn't a stupid question, but what is a "Data Professional"? I tried Googling the phrase, and it doesn't seem to be commonly used.
Not a stupid question at all; in fact, I think I just coined the term.
The reason I don't use "data curator" is probably the Alma Swan report that posits four different kinds of "person who works with research data," without actually positing an umbrella term for all four!
I don't necessarily agree with Swan's taxonomy -- in fact, in all honesty I don't agree with it at all! -- but it does seem likely that there will be a few different niches. Which, again, raises the question of an umbrella term.
Also, I think I'm using the term "professional" because it connotes getting paid specifically to do this work.
I'm far from clear that anyone will get paid on a professional level specifically to do this work. To do this work alongside other work, almost certainly. To get paid a pittance (e.g. as part of a graduate assistantship), absolutely; that's what's happening now.
Whether this particular problem will create an actual professional job market -- I don't know, and neither does anyone else, so why do we even have professional training (in the sense of "training for a profession") at this point?
NSF made a similar error a few years ago with the Partnerships for Enhancing Expertise in Taxonomy (PEET) program. They correctly noted that the dwindling number of professional taxonomists was an issue, but misdiagnosed the cause. It wasn't a lack of trained people but a lack of positions.
So every year a few dozen highly trained young PEET taxonomists are dumped into a job market that doesn't exist, left to compete with each other and with older taxonomists who've been laid off from their downsized jobs. It's a terrible situation.
"One way to avoid it may be to bite the bullet about incoming qualifications: perhaps we need to sigh and say "no science BA, no enrollment in this program; MAs and Ph.Ds in science preferred.""
This kind of goes against the business interests of the schools though -- schools which increasingly (are being asked to) act like businesses.
It's not a coincidence that, in the case of the "there's going to be a librarian shortage soon" issue (aka "I'll gladly take your money for a hamburger today that I insist will be ready tomorrow"), the easiest place to find claims of an impending librarian shortage is on the web sites of the schools that will give you a librarian degree.
True enough, Jonathan.
Perhaps part of the answer is insisting upon placement assessment, then.
Not sure if you saw this poster from the Illinois folks:
"Analyzing Data Curation Job Descriptions"
Andrew Treloar tells me (no URL yet) that the Australian DART project had specific money for something like internships, some of which went on to become real jobs.
Is it proven that research data curation needs librarians of any kind? No. Is it proven that scientists can curate their data naturally, without any kind of additional help? No. Is it likely that scientists, left to their own devices, will not write job descriptions for data librarians? Pretty much, I guess. Is it likely that institutions will need or want to create data repositories at an institutional or sub-institutional level? At the risk of bringing down Trogool wrath, I think so. Will such institutions want something like a data librarian? I think so. Can I guarantee it? No.
But personally, if I were setting out on a library track right now, I would want a data-oriented arrow in my professional quiver!
Nic, thank you, that's thought-provoking!
Chris, it doesn't surprise me somehow that Australia did something sensible.
What scares me is that institutions will want to create data repositories, but they'll do it as badly as they did IRs. We'll see.
I agree about proto-librarians needing to pick up data arrows; I'm just not at all sure the demonstrated need justifies narrow professional programs at this point.