UPDATE 24 April 2020: I realized that there was at least one other possible large contributor to the difference between the runs: randomness in how I chose the not-button examples. I'll fix this and see what happens to the SVM results. I previously found extremely similar GLM results between fmriprep versions 1.1.2 and 1.3.2, so the variability shown here may well be due entirely to the randomness in how I set up the classification.
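For concreteness, the fix is just to make the not-button selection deterministic, so the identical events go into the classification for both preprocessing versions. A minimal sketch of the idea (the event lists and seed are hypothetical placeholders, not the actual DMCC code):

```python
# Minimal sketch, assuming the not-button examples are subsampled to match the number
# of button examples: fix the random seed (or save the chosen events) so the identical
# events are used when classifying both fmriprep preprocessing runs.
# The event lists below are hypothetical placeholders, not the actual DMCC events.
import numpy as np

button_events     = ["ev01", "ev03", "ev07", "ev10"]
not_button_events = ["ev02", "ev04", "ev05", "ev06", "ev08", "ev09", "ev11", "ev12"]

rng = np.random.default_rng(seed=20200424)            # fixed seed: same subset every run
keep = rng.choice(len(not_button_events), size=len(button_events), replace=False)
matched_not_button = [not_button_events[i] for i in sorted(keep)]
```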
Here's another post in the category of "things that surprised me"; not really a problem, nor anything new, but more variability than I expected. Background: for various file-management reasons we had to run DMCC13benchmark through fmriprep (1.3.2) twice for two people. Same BIDS input, etc. (even the same hardware, I think), but run twice. Some of the fmriprep components are not deterministic (see e.g. here and here), so a bit of variability between runs is expected.
Knowing that the preprocessing is not identical, I reran the QC and a few simple analyses on the second set of images to make sure that everything still seemed ok. I expected a bit of change in shape along the edge of the brain (from the two runs of skull stripping, etc.), but not enough to change the parcel-level analysis results - hence my surprise when those changes did occur.
Here are standard deviation images for the same person, four runs (first row: the Cued task-switching task; second row: the Sternberg task; first three columns: the first (AP encoding) run; last three columns: the second (PA encoding) run), from the two times fmriprep was run (red numbers).
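For reference, these QC images are just the voxel-wise standard deviation over time for each preprocessed run; a hedged sketch with nibabel (the file name is a hypothetical placeholder, not an actual DMCC13benchmark path):

```python
# Hedged sketch of how a standard deviation QC image like those above can be made:
# the voxel-wise standard deviation over time for one preprocessed run.
# The file name is a hypothetical placeholder.
import nibabel as nib

bold = nib.load("sub-xxxx_task-Cuedts_run-1_desc-preproc_bold.nii.gz")
data = bold.get_fdata()                                  # 4d array: x, y, z, time
sd_img = nib.Nifti1Image(data.std(axis=-1), bold.affine) # 3d SD-over-time image
nib.save(sd_img, "sd_Cuedts_run-1.nii.gz")
```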
One of my test analyses is a classification within individual people: was the person responding to the task (right-hand button press) or not? I expect motor/somatomotor parcels to classify above chance, but also some visual and frontoparietal parcels (since the attention demands and visual input also vary between response and non-response periods). Not really relevant here, but these are linear SVMs, c=1 with default scaling, chance=.5, Schaefer 400 parcels x 7 networks parcellation, simple averaging of each event's frames (separately) for temporal compression, cross-validating on the tasks (Cuedts and Sternberg).
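For concreteness, here is a minimal sketch of the parcel-wise classification scheme (not the actual analysis code; the example matrix, labels, and task vector are hypothetical placeholders for one parcel):

```python
# Minimal sketch of the classification scheme described above, assuming each parcel's
# voxels have already been averaged over each event's frames (temporal compression),
# giving one example per event. Linear SVM, C=1, leave-one-task-out cross-validation
# (train on Cuedts, test on Sternberg, and vice versa); chance is 0.5.
# 'examples', 'labels', and 'task' are hypothetical placeholders for one parcel.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
examples = rng.normal(size=(40, 120))           # 40 events x 120 voxels in this parcel
labels   = np.tile([0, 1], 20)                  # button press or not, one label per event
task     = np.repeat(["Cuedts", "Stern"], 20)   # which task each event came from

accs = cross_val_score(SVC(kernel="linear", C=1), examples, labels,
                       groups=task, cv=LeaveOneGroupOut())
print(accs.mean())          # this parcel's accuracy; repeat for each of the 400 parcels
```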
Above are the parcel-wise accuracies for the same person and the two preprocessing runs. Only parcels with accuracy of .6 or better are colored, with hotter colors indicating higher accuracies (up to a maximum of .76 for the first preprocessing run and .71 for the second). Many parcels in the same areas have similar accuracies both times, but the two maps are not identical; even some of the highest-classifying parcels shift a bit between the two preprocessing runs - below are the same two maps, thresholded at .65:
Again, this is not necessarily a problem, and there's no reason to think that one set of results is closer to the truth than the other. But the amount of variability between presumed-equivalent preprocessing runs was greater than I expected, and may surprise you as well. I haven't yet run this analysis on the surface version, but I plan to, and will try to update this post afterwards. The other person whose data was run twice shows a similar amount of variation.
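If you want to put a number on this sort of between-run variability in your own data, a simple hedged sketch (the accuracy vectors here are made-up placeholders, not the values plotted above):

```python
# Hedged sketch for quantifying agreement between two parcel-wise accuracy maps
# (one value per Schaefer parcel from each preprocessing run). The accuracy vectors
# are made-up placeholders, not the values shown in the figures.
import numpy as np

rng = np.random.default_rng(0)
acc_run1 = rng.uniform(0.4, 0.76, size=400)           # placeholder parcel-wise accuracies
acc_run2 = acc_run1 + rng.normal(0, 0.03, size=400)   # placeholder second-run accuracies

r = np.corrcoef(acc_run1, acc_run2)[0, 1]             # overall similarity of the two maps
max_diff = np.abs(acc_run1 - acc_run2).max()          # largest parcel-wise change
n_cross = np.sum((acc_run1 >= 0.65) != (acc_run2 >= 0.65))  # parcels crossing the .65 threshold
print(f"r = {r:.2f}, max |difference| = {max_diff:.2f}, parcels crossing .65: {n_cross}")
```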
Have you run your SVM classification on the same fMRIPrep output twice? Just wondering if there's any randomness involved in that processing step that could help explain it.
Yes, that's possible as well; I suspect my error of not ensuring the same events were used in each version will prove the biggest factor.