What's Missing From This Art Project?

Via Bora on FriendFeed, a cute little art project from MIT that takes a name, scans the Web for mentions of that name, and produces a color-coded bar categorizing the various mentions of that name. Here's what you get if you put my name in:


You can click on it for a bigger image, which makes the labels easier to read (these are screencaps edited in GIMP, because in true MIT Media Lab fashion, the whole site is a Flash thing with no way to link directly to anything). It's nice, and all, but there's something a little bit funny about it. Something... missing. Let's see if we can't illuminate the problem by putting in a few more names:




(Again, click for larger versions.)

What's missing is that there's no category for science, or anything science-related. Which leads to absurdities like a large-ish bar corresponding to "Sports" for Richard Feynman, or "Fashion" for Niels Bohr. You can watch it processing the results, and it completely skips most scientific or science-related terms (like "electron" or "atom" or "physicist"), while assigning others to categories in an essentially random manner ("physics" gets coded as "Education" or sometimes "Medicine"; Feynman's "Sports" bar is largely due to phrases like "winner of the Nobel Prize in Physics").
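For the programming-inclined, here's a rough sketch of how that kind of miscoding can happen. I have no idea how the project actually works internally, so this is a guess at the general approach, not its real code: a fixed keyword table with no science category. Only the "physics" to "Education" mapping comes from what I saw on screen; the `KEYWORD_TO_CATEGORY` table, the `categorize` function, and the "winner" and "prize" entries are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical keyword table. Only "physics" -> "Education" reflects the
# behavior described above; "winner" and "prize" are guesses at what might
# push a Nobel laureate into "Sports".
KEYWORD_TO_CATEGORY = {
    "physics": "Education",
    "winner": "Sports",
    "prize": "Sports",
}


def categorize(text):
    """Count category hits; words with no table entry are silently skipped."""
    counts = Counter()
    for raw in text.lower().split():
        word = raw.strip('.,;:"()')
        category = KEYWORD_TO_CATEGORY.get(word)
        if category is not None:
            counts[category] += 1
    return counts


profile = categorize("Physicist and winner of the Nobel Prize in Physics")
print(profile)
```

With a table like this, "physicist" falls through entirely, and the sentence as a whole scores more "Sports" than anything else, because there is simply nowhere for the science to go.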

If you want a graphical illustration of the Two Cultures problem, it's right here. Even though this is a technological art project, from MIT no less, it doesn't occur to the artist that science or science-related subjects ought to be included in the range of subject categories for human activities. Science isn't something that normal people do, so there's no need to code for it.

And this is why I rant about the lack of respect for science in the rest of the academy.

(Now, to be fair, the project slights a lot of humanities disciplines as well-- "Jean-Paul Sartre" also merits a substantial "Sports" bar, as does Friedrich Nietzsche, both of which I find faintly hilarious. But there are at least some halfway sensible categories into which humanities-related content can be put-- "books" and "art," for example. The total absence of science-related categories is deeply annoying, though.)


Nietzsche did play football for Germany against the ancient Greeks.

The problem I found was that a lot of hits weren't for me, and there's no way of telling it it's got the wrong me.

Even when it has appropriate categories available, its classification sucks. "Division" leads to 'sports' and "drug" to 'illegal'. This makes for amusing profiles for my medical colleagues...

The lack of anything science is particularly interesting given that "music" and "musical" are two different categories. What?

A further experiment: Teresa Nielsen Hayden's profile has about five times as much "sports" as "politics", and a surprisingly small "online" bar.

It also combines common names. My wife's family has a common Italian surname and I had tons of hits for my niece - including her spouses and death date. She's 15.

At MIT of all places. I hear that science is kind of a big deal there.

The omission would be more annoying if the project weren't silly from the get-go. No attempt at making sure the names actually correspond to what is being searched, crazy attribution of keywords to categories, etc. Seems to be more about looking cool than being useful.

"...merits a substantial 'Sports' bar, as does Friedrich Nietzsche, both of which I find faintly hilarious..."

"Nietzsche did play football for Germany against the ancient Greeks."

Soccer football I believe it was. The ancient Greeks considered American football undignified. The ball was aspherical and they objected to wearing shoulder pads.

Doesn't seem to be working for me. "No digital traces" for my name is maybe not too surprising. Same thing for "Pericles" kind of surprised me, but I figured maybe it has some problem with single names. But "no digital traces" for Barack Obama is just not believable.

My particular research interests, with a lot of "charge" and "violation," gave me a very high "illegal" rating.

Mine only touches on 7 items, one of which, sports, doesn't interest me at all. I guess it's my rants about it that it picked up.


Entering "Digital Cuttlefish", I am not quite certain how to interpret the results, but watching the program work was entertaining. Nice to see a bunch of positive comments and no negative ones (although I know they exist), but the coolest was the quotes. Apparently I am "a fucking genius".

I had no idea Gardner's multiple intelligences had gotten so specific. I would have thought that covered under bodily/kinesthetic, for the most part.

The stuff ya learn.

All I can get from it (even after doing an F11) is "Increase your window size, please." This is followed by the obnoxious comments "Remember this is an art piece, not a tool," and "If you're on a 7" netbook, you might be out of luck." Fail. (For the record, it's a 9" netbook.)

One of the aims of this art experiment is to show the errors in data-mining "caused by the inability to separate data from multiple owners of the same name," among other things.

from http://personas.media.mit.edu/

I see there's also no category for "games" (as distinct from "sports"), thus making my profile just as much an exercise in miscategorization.

But this makes it a poor example of a "Two Cultures" problem. If the designer didn't think of science *or* games as human activities, then he's just got a lousy category set, not a disrespect for science.

My more direct objection is that it's impossible to learn anything about me from the final display. All the digested information is sterilized away. If each band brought up some kind of link list, or even a tag cloud, then you might be able to glean something *about* my books, my "online" activity, my contributions to education (?), my sports activity (??) (I guess "Capture the Flag with Stuff" is a sport...)

By Andrew Plotkin (not verified) on 23 Aug 2009

At least 2 of the 18 are actually me (using my real name). "Warrior" counts several times in sports and military even though in context it's a place name (i.e. the Black Warrior basin). Using my usual screen name all 13 items were me; several were comments on Consumerist posts. Now I'm curious how it picks the references.

By marciepooh (not verified) on 24 Aug 2009

Their data-mining algorithms are all fkd into a cocked hat. There's no way that my list (yes, I have nothing better to do today) of John Connor, Lewis Carroll, Fairy Godmother, Erasmus, Tooth Fairy, Mark Twain, Bilbo Baggins, James Joyce and Ingmar Bergman should all show substantial amounts of sports participation, but they all do.

By Kate from Iowa (not verified) on 24 Aug 2009