Wednesday, November 11, 2015

positive control analyses and checking data quality

I've long advocated that much pain can be avoided if analysis of a new dataset is begun with positive control analyses: classifying (or whatever) something that should produce a very strong signal in easy-to-identify regions. Button-presses often are good controls (particularly if the button boxes were in different hands): when classifying whether the left-hand or right-hand was moved (totally ignoring the experimental conditions), do you find motor areas? Classifying the presence or absence of visual stimuli is also a good, strong control. Once a positive control is found, some of the analysis choices and data quality checks can be run on the control analysis then carried over to the target analysis, reducing the chance of exploiting too many experimenter degrees of freedom.

Another reason to start with positive control analyses is simply to identify problems in the dataset. If the control analysis fails in a particular person, why? Were the event timings mislabeled? Movement too high? Preprocessing failed? I'd be very worried about interpreting the results of a subtle cognitive task in a person whose data is of too poor a quality to support classifying something as strong as hand movements.

The rest of this post is an example of what positive control analyses can look like, and how their results compare to measures of general dataset quality. Several practiCal fMRI posts were very useful for thinking about how to visualize the image quality, particularly this one describing temporal SNR and this one showing examples of high-quality images.

average the fMRI timeseries for each voxel

First, run-wise mean images. These are simply the average of the functional timeseries for each voxel, each run and person separately. I calculated these on the images after preprocessing, but before voxelwise normalization. This is evaluating the quality of the images as they "arrive" for MVPA; in this case, after motion-correction, slice-time correction, and spatial normalization to an anatomic template. We thus expect the slices to look fairly similar in all people (because of the normalization), basically like fuzzy anatomical images.

The images below show two slices (one coronal, one axial) of the mean fMRI image for four runs in six people from two datasets (people in rows, runs in columns). The first image shows a dataset with fairly decent spatial normalization, the second, not-so-good spatial normalization (the images should enlarge if clicked).

A dataset with fairly decent spatial normalization. While the images vary a bit (e.g., subject 17 is rather "pointy" compared to the others), they are all orientated correctly and capture the same brain structures.
A dataset with not-so-good spatial normalization. Note that each person is pretty consistently normalized with themselves (i.e., the images in the four runs within each person are similar), but vary quite a bit between people. For example, sub7's brain looks "short", and when viewed in 3d, is clearly tilted.

In my estimation, analysis should not proceed on this dataset: spatial normalization needs to be improved, or analysis should be performed in subject (native) space.

standard deviation of the fMRI timeseries for each voxel

As described by practiCal fMRI, images of the standard deviation of the fMRI timeseries are useful for spotting motion or other artifacts; see his post for more details. Basically, dimmer is better for these images, and we want to be able to see some brain structure. As with the mean images, these are simply calculating the standard deviation of each voxel's timeseries, separately within each run, using the post-preprocessing functional images. All image voxels were included, not just those in the brain mask, to allow spotting of blurry edges and ghosts.

The top part of this image are the standard deviations. This follows the convention of the mean images: subjects in the rows, four runs in the columns, with coronal slices first, then axial slices, both of the same 3d image. All images have the same color scaling, so brightness can be compared between runs and people.

Subject 34 is the best of these three people: the images for the four runs are pretty equally dark, but the brain outline and structure are visible. Subject 37 has the second and first runs much brighter and blurrier than the third and fourth runs; the first run in subject 36 is also brighter and blurrier than the others. These runs had more movement artifacts, reflected here as higher standard deviation.

The bottom part of this image is the accuracy from a positive control searchlight analysis in these same three people. In this case, the control analysis was classifying whether a particular image was from a cue or target/response trial segment, and we expect visual and motor areas to classify. (If you're curious, it was leave-one-run-out cross-validation within each person, linear SVM, c=1, 3-voxel radius searchlights, two balanced classes.) The overlay is color-scaled to show voxels with accuracy of 0.6 as red, 1 (perfect) as brightest yellow, not showing voxels with accuracy less than 0.6 (chance = 0.5). (I used knitr to make all the images in this post; see this demo for similar code.)

The accuracies and standard deviation are consistent in these images: sub34 has the lowest standard deviation (and highest temporal SNR, though this isn't shown here) and highest classification accuracy; sub36 and sub37 have fewer high-classifying searchlights. The relationship between image quality in these diagnostic tests and control classification accuracy is not always this clear, but I have seen it pretty often, and it should exist; by definition, the control classification should succeed in people with decent image quality. If it does not, the dataset should be checked for errors, such as mislabeled event timing files.

There's no magic threshold for image quality, nor perfect strategy for recovering signal from high-movement runs. But I would be very hesitant to continue analyzing a person without clear signal in the control analysis, particularly if they stand out in the mean and standard deviation images.

No comments:

Post a Comment