The machine that identifies images from brain activity alone

Modern brain-scanning technology allows us to measure a person's brain activity on the fly and visualise the various parts of their brain as they switch on and off. But imagine being able to literally see what someone else is thinking - to be able to convert measurements of brain activity into actual images.

It's a scene reminiscent of the 'operators' in The Matrix, but this technology may soon stray from the realm of science-fiction into that of science-fact. Kendrick Kay and colleagues from the University of California, Berkeley have created a decoder that can accurately work out the one image from a large set that an observer is looking at, based solely on a scan of their brain activity.

The machine is still a while away from being a full-blown brain-reader. Rather than reconstructing what the onlooker is viewing from scratch, it can only select the most likely fit from a set of possible images. Even so, it's no small feat, especially since the set of possible pictures is both very large and completely new to the viewer. And while previous similar studies used very simple images like gratings, Kay's decoder has the ability to recognise actual photos.

Training the brain-reader

To begin with, Kay calibrated the machine with two willing subjects - himself and his research partner, Thomas Naselaris. The duo painstakingly worked their way through 1,750 photos while lying in an fMRI scanner, a machine that measures blood flow in the brain to detect both active and inactive regions. The scanner focused on three sites (V1, V2 and V3) within the part of the brain that processes images - the visual cortex (right).

The neurons in the visual cortex are all triggered by slightly different features in the things we see. All of them have different 'receptive fields'; that is to say that they respond to slightly different sections within a person's field of vision. Some are also tuned to specific orientations, such as horizontal or vertical objects, while others fire depending on 'spatial frequency', a measurement that roughly corresponds to how busy and detailed a part of a scene is.

By measuring these responses with the fMRI scanner, Kay created a sophisticated model that could predict how each small section of the visual cortex would respond to different images.

Kay and Naselaris tested the model by looking at 120 brand-new images and, once again, recording their brain activity throughout. To account for 'noisy' variations in the fMRI scans, they averaged the readouts from 13 trials before feeding the results into the decoder.

The duo then showed the 120 new images to the decoder itself, which used its model to predict the pattern of brain activity that each one would trigger. The programme paired up the closest matches between the predicted and actual brain patterns and guessed the order of the images that Kay and Naselaris had looked at.
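
To make the matching step concrete, here is a minimal Python sketch of identification by pattern matching. It is not Kay's decoder. The "predicted" patterns are random numbers standing in for the output of a real model of the visual cortex, the "measured" scans are simulated by adding Gaussian noise to them, and Pearson correlation is an assumed measure of how close two patterns are. The names pearson, identify and simulate, and every default parameter value, are hypothetical choices made for illustration.

```python
import random
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def identify(measured, predicted):
    """Return the index of the predicted pattern that best matches `measured`."""
    scores = [pearson(measured, pattern) for pattern in predicted]
    return max(range(len(predicted)), key=scores.__getitem__)


def simulate(n_images=120, n_voxels=100, n_trials=13, noise_sd=3.0, seed=0):
    """Fraction of images identified correctly in a toy simulation.

    All numbers here are invented; nothing is taken from the real study.
    """
    rng = random.Random(seed)
    # Stand-in for the model's predictions: one voxel pattern per candidate image.
    predicted = [[rng.gauss(0, 1) for _ in range(n_voxels)]
                 for _ in range(n_images)]
    correct = 0
    for i, pattern in enumerate(predicted):
        # Each simulated scan is the predicted pattern plus independent noise.
        trials = [[v + rng.gauss(0, noise_sd) for v in pattern]
                  for _ in range(n_trials)]
        # Average the repeated scans voxel by voxel before matching.
        averaged = [sum(column) / n_trials for column in zip(*trials)]
        if identify(averaged, predicted) == i:
            correct += 1
    return correct / n_images
```

Calling simulate(n_trials=13) and simulate(n_trials=1) gives a rough feel for why averaging repeated scans makes the matching easier, though the figures it produces say nothing about the real experiment.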


It was incredibly successful, correctly identifying 92% of the images that Kay looked at and 72% of those viewed by Naselaris. Obviously, using the average of 13 scans is a bit of a cheat, and if the machine were ever to decode brain patterns in real-time, it would need to do so based on a single trial. Fortunately, Kay found that this is entirely feasible, albeit with further tweaking. Even when he fed the decoder with fMRI data from a single trial, it still managed to correctly pick out 51% and 32% of the images respectively.

The decoder could even cope with much larger sets of images. When Kay repeated the experiment with a set of 1,000 pictures to choose from, the machine's success rate (using Kay's brain patterns) only fell from 92% to 82%. Based on these results, the decoder would still have a one in ten chance of picking the right image from a library of 100 billion images. That's over a hundred times greater than the number of pictures currently indexed by Google's image search.

Obviously, the technology is still in its infancy - we are still a while away from a real-time readout of a person's thoughts and dreams based on the activity of their neurons. But to Kay, his experiments are proof that such a device is possible and could be a reality sooner than we think.

Reference: Kay, K.N., Naselaris, T., Prenger, R.J., Gallant, J.L. (2008). Identifying natural images from human brain activity. Nature DOI: 10.1038/nature06713


Where is this published?

Nature, in the issue that comes out tomorrow. When the paper is published online, it will go into the DOI database and I can stick up a citation using the Research Blogging network.

Cool, thanks! Good post!

Very cool, thanks for sharing!

The first thing I thought of wasn't The Matrix. I grant that it's been a long time since I've watched The Matrix, but from what I recall this technology is less Matrix and more Final Fantasy: The Spirits Within. In the latter film, the main character's dreams are recorded for later review. In The Matrix, it was my understanding that the digital readout observed by the operators is a reflection of what is happening in the matrix. True, the crew's mind was plugged into the matrix, but the operators could observe events in the program beyond the crew's virtual perceptions.

Um... Anyway, definitely looking forward to seeing how this research progresses.

Reference now up. And you know, the point about the Matrix did run through my head while I was writing this, but it was 1 in the morning and the part of my head that contains the sci-fi references was asleep and I was sorta counting on people being awesomed by the machine and not noticing. :-p

I wrote an article about fMRIs and fear conditioning last year, and consider myself somewhat literate on the subject, but this simply blows my mind. I have to remain skeptical until I look at the primary data. And even then...ha

This is off-topic, for which I apologise -- but I figured, if anyone can help on this, it would be Ed Yong or readers here. ;-)

My husband is going to be teaching a graduate seminar on clear writing and communication. For illustration purposes, he needs to dig up a really badly written physics paper. Not bad research -- just impenetrable and unclear writing. My field is biology and computing, and I've already been able to give him examples of that for biology and medicine, but nearly all physics papers are jargon-filled, unclear and impenetrable as far as I'm concerned, so it's difficult for me to select one or two.

Sorry about sticking this on an unrelated thread, again.
But would you be able to make any nominations for this?

By Luna_the_cat (not verified) on 06 Mar 2008

Sadly, I'm not sure I can - there's a reason that there aren't any physics posts on this blog and that's because anything physics that doesn't have the name "Feynman" or "Davies" on it is as impenetrable to me as it is to you. Any readers have suggestions?

Good luck to your husband though - a worthy cause indeed. For his info, I'd highlight this paper which I wrote about here as a really nice example of a very well written paper that still retains academic integrity while being a good read. And, obviously, On the Origin of Species, which is arguably the most important biological text of the last several centuries and can be read by a schoolkid.

There seem to be papers like this, inferring what someone is looking at from brain imaging data, coming out almost every other day now. The gee-whizzery gets a bit tedious. This stuff may be technically impressive (see how clever we are to get a signal out of this terribly noisy data!), but scientifically it is positively pernicious. These sorts of studies tell us nothing about neural perceptual processing that we did not already know, and, much worse, by pulling off these tricks, which will in fact only work under highly constrained and unnatural viewing conditions, they perpetuate and reinforce the simplistic, widespread, but quite false idea, deeply entrenched in popular consciousness, that visual perception is basically just a matter of opening the eyes and letting the information flow in along the optic nerve for the brain to "process" into visual experience (like turning meat into sausages), and that visual cortex is some sort of inner screen where the scene in front of us is redisplayed for the delectation of the inner eye.

The very fact that such "experiments" are done (and presented as contributions to visual science), let alone the fact that they quickly give rise to fantasies about mind-reading machines that will also work on memories and fantasies, shows that even smart people who know a bit about the neuroscience of vision are not immune to being misled in this way.

But in fact there is all sorts of evidence that visual perception just does not work like that. It crucially, and centrally, involves all sorts of top-down processes driven not only by the nature of the stimulus, but, even more, by the organism's current condition and its needs and purposes. Seeing is not like making a video of whatever happens to be in front of our eyes; it has a real biological purpose, it is about actively seeking and finding the information we need (or might need) in the plethora of available visual evidence around us. The relevant processes include not only descending, modulatory signals within the brain (though there is lots and lots of that going on at every stage of visual processing, LGN, visual cortex, and just about everywhere else on the visual pathways), but also directed eye movements, which are absolutely necessary to normal vision, and which, in natural viewing conditions, cause the image on the retina (never mind its momentary reflection in visual cortex) to jiggle about, and to change radically several times per second.

The sorts of fMRI experiments described in the blog post can only be made to work by reducing all this visual activity to a minimum by such things as having the subjects lying motionless within an MRI machine and staring fixedly at decontextualized test pictures that are of no purposive significance whatsoever (that is, the fact that they are looking at a flower instead of a rhinoceros, or whatever, makes absolutely no difference at all to the subject's plans, purposes, and ongoing behavior). Of course you cannot eliminate the top-down processes altogether - the subject would not be able to actually see without them - but if they are dampened down sufficiently the evidence for them in the data can simply be discarded as irrelevant noise, leaving you with a nice clean, and thoroughly deceptive, demonstration of how perception is just like photography with the visual cortex as the film.