Neuroskeptic has a great post on this topic, listing some of the researcher degrees of freedom in analyzing a hypothetical fMRI experiment:
"Let's assume a very simple fMRI experiment. The task is a facial emotion visual response. Volunteers are shown 30 second blocks of Neutral, Fearful and Happy faces during a standard functional EPI scanning. We also collect a standard structural MRI as required to analyze that data."What are some of the options for analyzing this with MVPA? This is not an exhaustive list by any stretch, just the first few that came to mind.
temporal compression
- Average the volumes to one per block. Which volumes to include in the average (e.g. to account for the hemodynamic lag)? See the sketch below.
- Create parameter estimate images (PEIs) (i.e. fit a linear model and do MVPA on the beta weights), one per block. The linear model could be canonical or individualized.
- Average the volumes to one per run. Calculate the averages from the block files or all at once from the raw images.
- Create one PEI for each run.
- Analyze individual volumes (first volume in each block, second volume in each block, etc).
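For concreteness, here is a minimal sketch of the first option (averaging the volumes within each block, shifting the window to account for the hemodynamic lag). Every name and number below (TR, block length, lag, array shapes) is made up for illustration, not taken from the hypothetical experiment.

```python
import numpy as np

# Toy block-averaging sketch; TR, block length, lag, and shapes are assumptions.
n_vols, n_voxels = 300, 5000
tr = 2.0                                      # seconds per volume
block_onsets = np.array([0.0, 60.0, 120.0])   # block onsets, in seconds
block_len, lag = 30.0, 4.0                    # 30 s blocks, ~4 s hemodynamic lag

# stand-in data, already flattened to volumes x voxels
bold = np.random.default_rng(0).standard_normal((n_vols, n_voxels))

examples = []
for onset in block_onsets:
    first = int(round((onset + lag) / tr))              # shift the window forward by the lag
    last = int(round((onset + block_len + lag) / tr))
    examples.append(bold[first:last].mean(axis=0))      # one averaged example per block
X = np.vstack(examples)                                 # examples x voxels matrix for MVPA
```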
- the "default": linear svm, c=1.
- a linear svm, but fit the c.
- a nonlinear svm (which type?).
- a different classifier (random forest, naive bayes, ....).
- correlation-based
- linear discriminants (multiple options)
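In scikit-learn terms, several of the classifier choices above look roughly like this (a sketch only; which, if any, is appropriate depends on the dataset):

```python
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

default_svm = SVC(kernel="linear", C=1.0)              # the "default": linear SVM, C=1
tuned_svm = GridSearchCV(SVC(kernel="linear"),         # linear SVM, but fitting C
                         param_grid={"C": [0.01, 0.1, 1.0, 10.0]}, cv=5)
nonlinear_svm = SVC(kernel="rbf")                      # a nonlinear SVM (RBF is one of several kernels)
other_classifiers = [RandomForestClassifier(),         # entirely different classifiers
                     GaussianNB(),
                     LinearDiscriminantAnalysis()]
```

Each of these would then be trained and tested under whichever cross-validation scheme is chosen next.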
cross-validation scheme
- on the runs (leave-one-run-out)
- on a combination of runs (first two runs out, next two out, etc)
- ignoring the runs (ten-fold, leave-three-examples-out, etc)
- on the subjects (leave-one-subject-out)
- on the runs, but including multiple subjects
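These partitioning schemes map onto standard cross-validation splitters; here is a sketch, assuming a run (and, where relevant, subject) label is known for every example:

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, LeavePGroupsOut, StratifiedKFold

runs = np.repeat(np.arange(6), 10)                # e.g. 6 runs with 10 examples each (illustrative)

leave_one_run_out = LeaveOneGroupOut()            # used as cv.split(X, y, groups=runs)
leave_two_runs_out = LeavePGroupsOut(n_groups=2)  # combinations of runs held out together
ten_fold = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)  # ignores the run structure
# Leave-one-subject-out is the same LeaveOneGroupOut splitter, but with
# groups set to the subject labels instead of the run labels.
```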
spatial scope
- whole-brain
- ROI (anatomical, functional, hybrid)
- searchlight (which radius? which shape? how to combine across subjects?)
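A bare-bones searchlight sketch, just to make the radius and shape choices concrete. Real analyses would normally use a dedicated package; everything here (data, coordinates, radius) is a stand-in for illustration.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_examples, n_voxels = 60, 200
X = rng.standard_normal((n_examples, n_voxels))    # stand-in examples x voxels data
y = rng.integers(0, 2, n_examples)                 # stand-in class labels
coords = rng.uniform(0, 20, (n_voxels, 3))         # stand-in i,j,k voxel coordinates

radius = 5.0                                       # searchlight radius, in voxels
accuracy_map = np.zeros(n_voxels)
for center in range(n_voxels):
    dist = np.linalg.norm(coords - coords[center], axis=1)
    sphere = dist <= radius                        # spherical neighborhood; other shapes are possible
    accuracy_map[center] = cross_val_score(LinearSVC(C=1.0), X[:, sphere], y, cv=5).mean()
```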
other choices
- resize the voxels?
- scale (normalize) the data? Across voxels within an example, or across examples? Center, normalize the variance, remove linear trends, remove nonlinear trends?
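And a sketch of a few of these scaling and detrending choices (illustrative only; in practice anything fit to the data, such as the StandardScaler below, should be fit on the training set alone to avoid peeking):

```python
import numpy as np
from scipy.signal import detrend
from sklearn.preprocessing import StandardScaler

X = np.random.default_rng(0).standard_normal((60, 5000))   # stand-in examples x voxels matrix

# Scale across voxels within each example (row-wise z-scoring).
X_within_example = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

# Scale each voxel across examples (column-wise centering and unit variance).
X_across_examples = StandardScaler().fit_transform(X)

# Remove a linear trend across the examples (e.g. over time within a run).
X_detrended = detrend(X, axis=0, type="linear")
```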
Simmons JP, Nelson LD, & Simonsohn U (2011). False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological science, 22 (11), 1359-66 PMID: 22006061
Thanks, that is a great list! You should also add feature selection/extraction to this.
The good news is that, at least in my experience, most of these options usually have very little influence on the results. Averaging volumes within a block tends to produce higher accuracy, but the variance scales as well, so the pattern of results is usually similar. I think the biggest effect comes from feature selection, and perhaps also the choice of classifier (although that is less of a problem).
Good post. I'd also add preprocessing steps to the list, like spatial smoothing with various kernel sizes.
We often collect one or two pilot subjects that we know will not be part of the final dataset, and use them to play around with many of these variables. Then we make choices about all of the analysis options before collecting the "real" data.
Very nice indeed, as always. Perhaps there should be some standard template that contains flags for the methods a researcher used in the analysis, i.e. the protocol of the analysis. This would save the reader the time needed to decode what was done.
As to the false positive(s) issue, I have seen a study (published in Neuroimaging) in which the authors used a few feature selection methods and a few classifiers (for each subject), then selected the results that gave the highest accuracy. Their excuse is that they used a 7T MRI to obtain high-resolution fMRI, which means higher levels of noise. Would anyone consider this case a "false positive"?
I've seen versions of that as well (I'm not sure which exact paper you mean, which is not a good sign). I suppose it could be done properly, if you were very careful about hold-out sets and data peeking, but it seems a very risky strategy. And 7T data is no excuse. :)