Data quality can affect key dependent variables in eyetracking.
Data quality: boring, I know, but really important!
We’ve just published a paper on eyetracking data quality that in some ways parallels a debate that has been happening about brain connectivity in autism. Several recent papers (see here) suggested that people with autism may show reduced connectivity between different areas of the brain. So there was great shock and horror when a paper raised the possibility that this may be not because there is a genuine difference between the brains of typical people and people with autism, but rather because people with autism may simply tend to move slightly more while they are in the scanner. That article argued that the extra movement alone may create the impression of reduced connectivity between brain areas, without any genuine difference being present.
Eyetracking, in which you present still images or movie clips to participants and track where on screen they are looking, is a powerful and versatile method that is widely used in infant psychology. Most researchers use the built-in software provided by eyetracker manufacturers, which sets out, probably deliberately, to make researchers feel very confident in their results. When you view a live replay of the gaze data, for example, the software applies large amounts of smoothing and interpolation to give the impression that the tracking looks sensible and accurate.
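To give a feel for what that kind of tidying does, here is a minimal sketch, not any manufacturer's actual algorithm, of how linear interpolation plus a moving-average filter can make gappy, noisy gaze data look clean on replay. The column values, noise level, and window size are illustrative assumptions.

```python
# Minimal sketch: how interpolation + smoothing hide dropout and jitter.
# Not a real manufacturer's pipeline; parameters are illustrative.
import numpy as np
import pandas as pd

def prettify(gaze_x: pd.Series, window: int = 9) -> pd.Series:
    """Fill missing samples and smooth, as a live-replay view might."""
    filled = gaze_x.interpolate(limit_direction="both")               # bridge dropped samples
    return filled.rolling(window, center=True, min_periods=1).mean()  # smooth out jitter

# Synthetic raw horizontal gaze: a noisy fixation around x = 400 px with a tracking loss
rng = np.random.default_rng(0)
raw = pd.Series(400 + rng.normal(0, 15, 300))
raw[100:140] = np.nan                                                 # 40 samples lost

pretty = prettify(raw)
print("raw:       std %.1f px, %d missing samples" % (raw.std(), raw.isna().sum()))
print("displayed: std %.1f px, %d missing samples" % (pretty.std(), pretty.isna().sum()))
```

The "displayed" trace has no gaps and far less sample-to-sample noise, even though nothing about the underlying recording has improved.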
In fact, if you visualise the raw data that eyetrackers record, you see that we shouldn’t necessarily have that much confidence at all. One problem in particular is that the same eyetracker, working in identical conditions, can record very good quality data from one child and very poor quality data from another. In the paper we identify two problems in particular: low precision and low robustness. We also show that, on average, worse quality data tends to be recorded from younger (relative to older) participants, from fidgetier participants (as opposed to those who sit very still), and later (relative to earlier) in a testing session.
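As a rough illustration of these two measures, here is a minimal sketch that computes them from a stream of (x, y) gaze samples, with NaN marking a lost sample. The definitions follow common usage (RMS sample-to-sample precision, proportion of valid samples); the exact formulas in the paper may differ.

```python
# Minimal sketch of two eyetracking data-quality measures.
# Assumes gaze is given as numpy arrays of pixel coordinates, NaN = lost sample.
import numpy as np

def precision_rms(x: np.ndarray, y: np.ndarray) -> float:
    """RMS sample-to-sample displacement; larger values mean noisier (lower precision)."""
    dx, dy = np.diff(x), np.diff(y)
    d2 = dx**2 + dy**2                    # pairs spanning a dropout become NaN and are ignored
    return float(np.sqrt(np.nanmean(d2)))

def robustness(x: np.ndarray, y: np.ndarray) -> float:
    """Proportion of samples for which the tracker returned valid gaze at all."""
    valid = ~(np.isnan(x) | np.isnan(y))
    return float(valid.mean())

# Example: a steady fixation with 5 px noise and 10% dropped samples
rng = np.random.default_rng(1)
x = 512 + rng.normal(0, 5, 1000)
y = 384 + rng.normal(0, 5, 1000)
x[::10] = np.nan
print("precision (RMS): %.1f px, robustness: %.0f%%" % (precision_rms(x, y), 100 * robustness(x, y)))
```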
This would be OK if it were a random source of error, unpredictable measurement noise that did not systematically influence results. In fact, though, as we also show, data quality shows strong relationships with several key dependent variables. See the example below, in which we take a sample of real data recorded while a participant viewed a face (top left) and analyse it for the proportion of time spent looking at the eyes versus other areas of the face (top right). We then manipulate the data to simulate the effect of low precision (bottom left) and repeat the same analysis (bottom right). It appears that the bottom sample shows less looking to the eyes, but in fact the only difference between the two samples is data quality. We also conduct a number of other analyses looking at how data quality affects measures such as fixation duration and various reaction time measures.
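Here is a minimal sketch of the kind of simulation described above: degrade precision by adding Gaussian noise to gaze samples and see how the measured proportion of looking inside a small "eyes" area of interest changes. The AOI rectangle, the noise level, and the synthetic gaze stream are illustrative assumptions, not the values or data used in the paper.

```python
# Minimal sketch: simulated low precision shifts an AOI-based dependent variable.
# AOI coordinates, noise levels, and gaze samples are illustrative assumptions.
import numpy as np

def prop_in_aoi(x, y, aoi):
    """Proportion of valid samples falling inside a rectangular AOI (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = aoi
    valid = ~(np.isnan(x) | np.isnan(y))
    inside = (x >= x0) & (x <= x1) & (y >= y0) & (y <= y1)
    return inside[valid].mean()

rng = np.random.default_rng(2)
eyes_aoi = (440, 300, 580, 360)                     # small region around the eyes

# Synthetic gaze clustered tightly on the eyes region
x = 510 + rng.normal(0, 10, 2000)
y = 330 + rng.normal(0, 10, 2000)

clean = prop_in_aoi(x, y, eyes_aoi)
noisy = prop_in_aoi(x + rng.normal(0, 40, x.size),  # same gaze, lower precision
                    y + rng.normal(0, 40, y.size), eyes_aoi)
print(f"looking at eyes: {clean:.0%} (clean) vs {noisy:.0%} (simulated low precision)")
```

The underlying looking behaviour is identical in both cases; only the added measurement noise differs, yet the noisy version appears to show substantially less looking at the eyes.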
[Figure: real gaze data over a face (top left) and the eyes-vs-rest analysis (top right); the same data with simulated low precision (bottom left) and the repeated analysis (bottom right).]