Neuroscientific methods are the unsung heroes of science journalism. Every now and then the media flash with buzzwords like “brain-controlled machines” and “mind reading” (“Scientists finally can READ your DIRTY THOUGHTS!!11”); but how exactly is it done? And while we’re at it, which methods are behind mind-boggling futuristic projects like controlling virtual reality with just your eyes or determining whether a suspect was really present at a crime scene? Questions upon questions! (hint: you will find answers in this post).
Eye-tracking and virtual reality
As the name suggests, eye tracking measures the activity of your eyes. Where do you look first? When and how often do you blink? What makes your pupils dilate? Does your gaze stay longer on her ample bosom or her beautiful eyes? By answering these (and many more) questions, eye tracking allows us to find out whether a person is attentive and focused or drowsy and bored, whether a website is user-friendly, or whether a patient has an autism spectrum disorder -- this technology has applications in a variety of areas, and the list keeps expanding.
Unlike in the late 1890s, modern eye tracking techniques do not include numbing your eyes with cocaine; today, the common method to measure your eyes’ behaviour is shining near-infrared light onto the eye and comparing the position of its reflection to the position of the pupil (while the position of the reflected light stays the same, the pupil moves with respect to it). Combined with the position of the head, this data can be extrapolated to determine the direction of the gaze and the points it lingers at. Typical eye tracking measures include fixations (when your eye pauses on something that caught your attention), their length and the time needed to arrive at them, saccades (when your eye leaves the fixation and moves to the next interesting point) and the resulting complete visual path (what did you look at, in which order and for how long). There are two ways to measure all that jazz: remote and head-mounted eye-trackers. The former are normally attached to a computer screen and are used to capture eye movements while watching a movie or performing some visual attention task. The latter are attached - surprise - to a person’s head in the form of glasses which look like they belong to some futuristic Lady Gaga outfit; they allow the test subjects to move around freely.
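The split of a gaze recording into fixations and saccades can be sketched in a few lines. Below is a minimal toy version of a velocity-threshold classifier (the so-called I-VT approach); the sample rate, the 30 deg/s threshold and the gaze coordinates are illustrative assumptions, not values from any particular eye tracker.

```python
# Toy velocity-threshold (I-VT) classification of gaze samples.
# Samples are (x, y) gaze positions in degrees of visual angle.

def classify_gaze(samples, rate_hz=60.0, threshold_deg_s=30.0):
    """Label each gaze sample 'fixation' or 'saccade' by point-to-point velocity."""
    dt = 1.0 / rate_hz
    labels = ["fixation"]  # first sample has no preceding velocity
    for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
        velocity = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 / dt  # deg/s
        labels.append("saccade" if velocity > threshold_deg_s else "fixation")
    return labels

# A steady gaze followed by a rapid jump to a new point of interest:
gaze = [(10.0, 5.0), (10.1, 5.0), (10.1, 5.1), (18.0, 9.0), (18.1, 9.0)]
print(classify_gaze(gaze))
# -> ['fixation', 'fixation', 'fixation', 'saccade', 'fixation']
```

Runs of consecutive “fixation” labels are then merged into single fixations, whose count, duration and order give exactly the visual-path measures described above.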
Eye tracking is a jack of all trades (and master of all): it is used in fields ranging from marketing research (think product design, placement & packaging) to neuroscience (think early screening for Alzheimer’s, attention and memory research and much more). But there is one particularly futuristic area employing it that will really float your nerdy boat: virtual reality. Eye tracking is becoming a hot topic in the VR community, and for good reason: it could help us create fully immersive VR experiences on less powerful devices, or let us influence the (virtual) world around us using just our eyes.
So here is how VR and eye tracking produce their love child, better user experience. There is this thing called the fovea, a small dimple in the very center of the retina responsible for sharp vision. When we look at something, we only see a tiny part of our environment in sharp detail (the stuff inside our foveal visual field); our peripheral vision, in turn, sees colours and movement, albeit rather blurry and less detailed. To create the impression that we see large parts of the environment (and not only one fragment of it) in sharp detail, our brain just uses our previous experience and memory to fill in the blurry details. If there is a blurred watermelon in your peripheral vision, your brain will just project an in-focus version of this watermelon from memory and voila - it’s much clearer. And then there is a thing called foveated rendering, a technique about to be used in VR, which would drastically reduce computing workload by taking advantage of how the brain interprets the world and imitating what the eyes already do. First, eye tracking tells the program where exactly you are looking at any given moment. Foveated rendering utilizes this information and renders the area outside the fovea in lower resolution and less detail, while using more of the available computing power to sharpen the foveal area and make it even more realistic. As you move your eyes, the virtual environment around you gets continuously updated so that the center of your vision stays sharp. As with vision in the real reality, your brain just creates the illusion of an overall sharp image by predicting what the periphery would look like based on what it already knows about it. While it sounds undeniably cool, it is still being worked on, as it is difficult to develop a way to track your eyes that is both inexpensive and small enough.
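The core decision foveated rendering makes can be sketched very simply: for each screen tile, pick a render quality from its distance to the current gaze point. The tier radii (in degrees of visual angle) and the scale factors below are made-up illustrative values, not numbers from any real renderer.

```python
# Toy foveated-rendering decision: full detail near the gaze point,
# progressively coarser rendering further out in the periphery.

def quality_for_tile(tile_center, gaze, full_radius=5.0, mid_radius=15.0):
    """Return a resolution scale factor for one screen tile (1.0 = full detail)."""
    dx = tile_center[0] - gaze[0]
    dy = tile_center[1] - gaze[1]
    eccentricity = (dx * dx + dy * dy) ** 0.5  # angular distance from gaze
    if eccentricity <= full_radius:   # foveal region: render sharp
        return 1.0
    if eccentricity <= mid_radius:    # near periphery: half resolution
        return 0.5
    return 0.25                       # far periphery: quarter resolution

gaze = (0.0, 0.0)  # looking at the screen centre
for tile in [(1.0, 1.0), (10.0, 0.0), (30.0, 20.0)]:
    print(tile, quality_for_tile(tile, gaze))
# -> (1.0, 1.0) 1.0 / (10.0, 0.0) 0.5 / (30.0, 20.0) 0.25
```

Every time the eye tracker reports a new gaze position, the same lookup is re-run for all tiles, which is why the centre of your vision stays sharp as you look around.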
Tech giants also didn’t wait too long before jumping on the bandwagon: just recently Google bought a major eye-tracking startup; rumor has it that it is working on a wireless VR headset blending augmented and virtual reality. Make a menu selection by holding your gaze on the option! Look at something long enough and zoom in on the picture! Tour the Louvre and get information about a painting by blinking at it 4 times! Use your eyes as a mouse! Praise our new Lord and Saviour Eye Tracking! It looks like what began as tracking eye movements in reading in 1879 has come far enough to re-shape our understanding of the reality we live in.
Functional magnetic resonance imaging (fMRI) and mind reading
fMRI is like modern art: everyone has heard of it, but no one really understands what it’s supposed to be. Fear not, it will become clear really soon! Basically, fMRI is based on the fact that blood which contains a lot of oxygen behaves differently in a magnetic field than blood containing little oxygen. And as it happens, the more active a brain area is, the more oxygen it consumes; and the more oxygen is needed, the more blood rushes to the active area to provide it. fMRI can pick up the increased blood flow and oxygen consumption to pinpoint greater brain activity in the resulting picture. This type of imaging is called blood oxygenation level dependent (BOLD) imaging. This method is non-invasive (you just need to lie in a very loud scanner tube for a bit) and provides quite a good level of spatial resolution (the temporal one is not very breathtaking, as there is a 3-6 second delay between the actual activation and the subsequent demand for oxygenated blood); this makes it a go-to tool for a yuuuuge variety of studies.
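That sluggish delay between neural activity and the BOLD signal can be made concrete with a tiny numerical sketch. The gamma-shaped curve below is an illustrative stand-in for the canonical hemodynamic response function, chosen only so that a neural event at t = 0 produces a blood-flow peak a few seconds later.

```python
# Toy hemodynamic response: a brief burst of neural activity at t = 0
# produces a BOLD signal that peaks only seconds afterwards, which is
# what limits fMRI's temporal resolution.
import math

def bold_response(t):
    """Toy gamma-shaped response to a neural impulse at t = 0 (peaks near 5 s)."""
    return 0.0 if t < 0 else t ** 5 * math.exp(-t)

times = [i * 0.5 for i in range(0, 31)]        # sample 0 .. 15 s
values = [bold_response(t) for t in times]
peak_time = times[values.index(max(values))]
print(f"neural event at t = 0 s, BOLD peak at t = {peak_time} s")
# -> neural event at t = 0 s, BOLD peak at t = 5.0 s
```

So even a millisecond-scale neural event smears out into a response that takes seconds to peak, which is why fMRI tells you *where* much better than *when*.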
fMRI can be used in two ways: it is either employed to see how the brain activates while doing some simple task (e.g. trials of doing nothing alternate with trials containing a simple memory/visual/moral judgement/”erotica or ice-cream” task) or it is utilized to get an impression of what the brain is like when we are not performing an explicit task (like when we are daydreaming while idly looking out of the bus window, or smiling and nodding while pretending to listen to coworkers rambling about their children). The latter, called resting-state fMRI, showed us that there is actually a lot going on even when we don’t really do anything: it identified different networks governing our brain, allowed us to see how the connectivity between brain regions changes in psychiatric diseases or in different states of consciousness, and a lot more.
Now that we have an idea of how it works1, what cool stuff can you do with it? I’m trying to resist the temptation to say “read your mind!” (and hand over a tinfoil hat), so I will use the more accurate scientific term of “decoding thoughts based on your brain activity”.
Decoding techniques don’t just search for the area that, say, responds to seeing a face; instead, they start by looking at the whole-brain activation pattern connected to seeing this particular face. Then, after establishing specific activation patterns for a bazillion images, a computer algorithm called a “pattern classifier” is fed all of them (plus the pictures they are associated with). This way the classifier learns the relationship between images and the brain activity caused by seeing them; in the end it knows that this exact pattern of activity is most likely to be caused by a cat or, say, by fighting babies. Once the program has seen enough samples, it can start looking at fMRI scans and trying to decipher what a person is looking at or thinking about. It started out small: in the early studies, scientists were able to tell which object category (shoes, bottles, scissors, etc.) participants were looking at2.
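The train-then-decode loop above can be sketched with a toy nearest-template classifier. Everything here is synthetic: the six-“voxel” patterns, the categories and the noise level are stand-ins for real multi-voxel fMRI data, and real decoders use far fancier machine learning than nearest-template matching.

```python
# Toy "pattern classifier": learn a mean activation pattern per category,
# then label a new scan by whichever learned template it is closest to.
import random

random.seed(0)

def make_pattern(template, noise=0.3):
    """Simulate one noisy whole-brain activation pattern for a stimulus."""
    return [v + random.gauss(0.0, noise) for v in template]

templates = {            # idealized 6-"voxel" activation patterns per category
    "cat":    [1.0, 0.2, 0.9, 0.1, 0.8, 0.0],
    "shoe":   [0.1, 1.0, 0.0, 0.9, 0.2, 0.7],
    "bottle": [0.5, 0.1, 0.1, 0.2, 1.0, 0.9],
}

# "Training": average several noisy scans per category into a learned template.
learned = {
    label: [sum(col) / 5 for col in zip(*[make_pattern(t) for _ in range(5)])]
    for label, t in templates.items()
}

def decode(pattern):
    """Classify a new scan as the category with the nearest learned template."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(learned, key=lambda label: dist(pattern, learned[label]))

print(decode(make_pattern(templates["cat"])))   # should recover "cat"
```

The crucial point the sketch illustrates: the decoder never understands “cat”; it only learns which blob of activity tends to co-occur with which label, which is why it needs so many training samples per brain.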
Soon after, the decoding mechanism took big steps forward: first, it was used to identify which of 120 images people were looking at3 -- a task far more complex than guessing the broad object category -- and then researchers developed a classifier that was able to produce primitive movies reflecting the videos participants were watching4. Since then this technology has been used for all possible kinds of things -- visual scene imagery5, working memory (which object am I holding in mind?)6 and intentions7 (do I press this or that button?). Classifying intentions is a much harder problem than decoding images, though. We can group objects by colour or shape, but how do we categorize intentions? Apart from that, another problem is generalizability. So far, decoders are built for individual brains, and a standardized mind-reader which can be used in law enforcement or marketing is not gonna happen in the next couple of years. At the current stage, to quote John-Dylan Haynes (a man working hard on classifier research and also (bragging) my professor (/bragging)): “The best way to find out what someone is going to do is to ask them.”