Another reason to start with positive control analyses is simply to identify problems in the dataset. If the control analysis fails in a particular person, why? Were the event timings mislabeled? Movement too high? Preprocessing failed? I'd be very worried about interpreting the results of a subtle cognitive task in a person whose data is of too poor a quality to support classifying something as strong as hand movements.
The rest of this post is an example of what positive control analyses can look like, and how their results compare to measures of general dataset quality. Several practiCal fMRI posts were very useful for thinking about how to visualize the image quality, particularly this one describing temporal SNR and this one showing examples of high-quality images.
First, run-wise mean images. These are simply the average of the functional timeseries for each voxel, calculated separately for each run and person. I calculated these on the images after preprocessing, but before voxelwise normalization. This evaluates the quality of the images as they "arrive" for MVPA; in this case, after motion correction, slice-time correction, and spatial normalization to an anatomical template. We thus expect the slices to look fairly similar in all people (because of the normalization), basically like fuzzy anatomical images.
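As a rough illustration of the calculation (not the code used for the figures in this post), here is a minimal Python/NumPy sketch. It assumes each run has already been loaded as a 4D array with time as the last axis; the function name `runwise_mean_images`, the run labels, and the synthetic data are all hypothetical.

```python
import numpy as np

def runwise_mean_images(run_data):
    """Return one 3D mean image per run.

    run_data: dict mapping a run label to a 4D array (x, y, z, time).
    The array layout is an assumption; adjust the axis if time is stored elsewhere.
    """
    return {run: np.asarray(img, dtype=float).mean(axis=-1)
            for run, img in run_data.items()}

# Synthetic stand-in for one person's four preprocessed runs.
rng = np.random.default_rng(0)
runs = {f"run{i}": 100 + rng.normal(size=(4, 5, 3, 20)) for i in range(1, 5)}

mean_images = runwise_mean_images(runs)

# Pull out one axial and one coronal slice for plotting (indices are arbitrary here).
axial_slice = mean_images["run1"][:, :, 1]
coronal_slice = mean_images["run1"][:, 2, :]
```

Plotting the same two slices for every run and person, in a grid, gives a quick way to compare normalization quality across the dataset.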
The images below show two slices (one coronal, one axial) of the mean fMRI image for four runs in six people from two datasets (people in rows, runs in columns). The first image shows a dataset with fairly decent spatial normalization; the second shows a dataset with not-so-good spatial normalization.
In my estimation, analysis should not proceed on the second dataset: spatial normalization needs to be improved, or analysis should be performed in subject (native) space.
As described by practiCal fMRI, images of the standard deviation of the fMRI timeseries are useful for spotting motion or other artifacts; see his post for more details. Basically, dimmer is better for these images, and we want to be able to see some brain structure. As with the mean images, these are simply the standard deviation of each voxel's timeseries, calculated separately within each run from the post-preprocessing functional images. All image voxels were included, not just those in the brain mask, to allow spotting of blurry edges and ghosts.
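The standard deviation images can be sketched the same way. This is a minimal Python/NumPy example under the same assumption as before (a 4D array per run, time on the last axis), not the code behind the figures. The function names and synthetic runs are hypothetical, and the temporal SNR helper uses the common mean-over-standard-deviation definition, which may differ in detail from the one in the practiCal fMRI post.

```python
import numpy as np

def runwise_sd_image(run_img):
    """Voxelwise standard deviation over time for one run (x, y, z, time)."""
    return np.asarray(run_img, dtype=float).std(axis=-1, ddof=1)

def runwise_tsnr_image(run_img):
    """Temporal SNR, taken here as voxelwise mean / standard deviation.

    Voxels with zero standard deviation (e.g. empty background) are set to 0.
    """
    img = np.asarray(run_img, dtype=float)
    mean = img.mean(axis=-1)
    sd = img.std(axis=-1, ddof=1)
    return np.divide(mean, sd, out=np.zeros_like(mean), where=sd > 0)

# Synthetic runs: same mean signal, different amounts of temporal noise.
rng = np.random.default_rng(1)
quiet_run = 100 + rng.normal(scale=1.0, size=(4, 4, 3, 50))
noisy_run = 100 + rng.normal(scale=5.0, size=(4, 4, 3, 50))

sd_quiet = runwise_sd_image(quiet_run)
sd_noisy = runwise_sd_image(noisy_run)
tsnr_quiet = runwise_tsnr_image(quiet_run)
tsnr_noisy = runwise_tsnr_image(noisy_run)
```

No brain mask is applied, matching the description above: keeping every voxel is what lets edge blurring and ghosting show up in the images.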
Subject 34 is the best of these three people: the images for the four runs are pretty equally dark, but the brain outline and structure are visible. Subject 37 has the first and second runs much brighter and blurrier than the third and fourth runs; the first run in subject 36 is also brighter and blurrier than the others. These runs had more movement artifacts, reflected here as higher standard deviation.
The bottom part of this image is the accuracy from a positive control searchlight analysis in these same three people. In this case, the control analysis was classifying whether a particular image was from a cue or target/response trial segment, and we expect visual and motor areas to classify. (If you're curious, it was leave-one-run-out cross-validation within each person, linear SVM, c=1, 3-voxel radius searchlights, two balanced classes.) The overlay is color-scaled so that voxels with accuracy of 0.6 are red and 1 (perfect) is brightest yellow; voxels with accuracy less than 0.6 are not shown (chance = 0.5). (I used knitr to make all the images in this post; see this demo for similar code.)
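To make the cross-validation scheme concrete, here is a minimal Python/NumPy sketch of leave-one-run-out classification for a single searchlight, plus the 0.6 display threshold. Important assumption: to keep the example dependency-free it uses a nearest-centroid classifier as a stand-in, not the linear SVM used for the actual analysis. The function names, array shapes, and synthetic data are all hypothetical.

```python
import numpy as np

def leave_one_run_out_accuracy(X, y, runs):
    """Mean accuracy over folds, each fold holding out one run.

    X: (n_examples, n_voxels) values for one searchlight.
    y: class label per example; runs: run label per example.
    Classifier: nearest class centroid (a stand-in, NOT a linear SVM).
    """
    X, y, runs = np.asarray(X, dtype=float), np.asarray(y), np.asarray(runs)
    fold_accuracies = []
    for test_run in np.unique(runs):
        test = runs == test_run
        train = ~test
        classes = np.unique(y[train])
        # Class centroids are estimated from the training runs only.
        centroids = np.stack([X[train & (y == c)].mean(axis=0) for c in classes])
        # Euclidean distance from every test example to every centroid.
        dists = np.linalg.norm(X[test][:, None, :] - centroids[None, :, :], axis=2)
        predicted = classes[dists.argmin(axis=1)]
        fold_accuracies.append(np.mean(predicted == y[test]))
    return float(np.mean(fold_accuracies))

def threshold_for_overlay(acc_map, lower=0.6):
    """Set accuracies below `lower` to NaN so they are left out of the overlay."""
    acc_map = np.asarray(acc_map, dtype=float)
    return np.where(acc_map >= lower, acc_map, np.nan)

# Synthetic example: 4 runs, 10 examples per class per run, 20 voxels.
rng = np.random.default_rng(2)
runs = np.repeat(np.arange(4), 20)
y = np.tile(np.repeat([0, 1], 10), 4)
X_noise = rng.normal(size=(80, 20))   # no class information
X_signal = X_noise.copy()
X_signal[y == 1, :5] += 2.0           # class difference in 5 voxels

acc_signal = leave_one_run_out_accuracy(X_signal, y, runs)
acc_noise = leave_one_run_out_accuracy(X_noise, y, runs)
overlay = threshold_for_overlay([acc_noise, acc_signal, 0.5, 0.6])
```

Because the classes are balanced within every run, chance is 0.5 in each fold, so a searchlight with no signal should hover around that value and drop out of the thresholded overlay.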
The accuracies and standard deviation are consistent in these images: sub34 has the lowest standard deviation (and highest temporal SNR, though this isn't shown here) and highest classification accuracy; sub36 and sub37 have fewer high-classifying searchlights. The relationship between image quality in these diagnostic tests and control classification accuracy is not always this clear, but I have seen it pretty often, and it should exist; by definition, the control classification should succeed in people with decent image quality. If it does not, the dataset should be checked for errors, such as mislabeled event timing files.
There's no magic threshold for image quality, nor perfect strategy for recovering signal from high-movement runs. But I would be very hesitant to continue analyzing a person without clear signal in the control analysis, particularly if they stand out in the mean and standard deviation images.