Tuesday, February 17, 2015

research blogging: concatenating vs. averaging timepoints

A little bit of the (nicely described) methods in Shen et al. 2014 ("Decoding the individual finger movements ..." citation below) caught my eye: they report better results when concatenating the images from adjacent time points instead of averaging (or analyzing each independently). The study was straightforward: classifying which finger (or thumb) did a button press. They got good accuracies classifying single trials, with both searchlights and anatomical ROIs. There's a lot of nice methodological detail, including how they defined the ROIs in individual participants, and enough description of the permutation testing to tell that they followed what I'd call a dataset-wise scheme (nice to see!).

But what I want to highlight here is a pretty minor part of the paper: during preliminary analyses they classified the button presses three ways: from individual images (single timepoints; the image acquired during 1 TR), from the average of two adjacent images (e.g., averaging the images collected 3 and 4 TR after a button press), and from the concatenation of two adjacent images (e.g., concatenating the images collected 3 and 4 TR after the button press). Concatenation gave the best results (they don't specify how much better).

Concretely, concatenation sends more voxels to the classifier each time: if an ROI has 100 voxels, concatenating two adjacent images means that each example has 200 voxels (the 100 ROI voxels at timepoint 1 and the 100 ROI voxels at timepoint 2). The classifier doesn't "know" that this is actually 100 voxels at two timepoints; it "sees" 200 unique voxels. Shen et al. used linear SVM (c=1), which generally handles large numbers of voxels well; doubling ROI sizes might hurt the performance of other classifiers.
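The difference in example shapes is easy to sketch with a toy numpy example (all sizes made up; tp1/tp2 stand in for hypothetical single-TR patterns, not real data):

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_voxels = 40, 100  # made-up sizes: 40 trials, 100-voxel ROI

# hypothetical data: one ROI pattern per trial at each of two adjacent timepoints
tp1 = rng.normal(size=(n_trials, n_voxels))
tp2 = rng.normal(size=(n_trials, n_voxels))

averaged = (tp1 + tp2) / 2            # one 100-voxel example per trial
concatenated = np.hstack((tp1, tp2))  # one 200-voxel example per trial

print(averaged.shape)      # (40, 100)
print(concatenated.shape)  # (40, 200)
```

Either way the classifier still gets one example per trial; only the feature count changes.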

I haven't tried concatenating timepoints; my usual procedure is averaging (or fitting a HRF-type model). But I know others have also had success with concatenation; feel free to comment if you have any experience (good or bad).

Shen, G., Zhang, J., Wang, M., Lei, D., Yang, G., Zhang, S., & Du, X. (2014). Decoding the individual finger movements from single-trial functional magnetic resonance imaging recordings of human brain activity. European Journal of Neuroscience, 39(12), 2071-2082. DOI: 10.1111/ejn.12547


  1. Using concatenation (or averaging) you end up with a smaller sample size, for both the training and the testing sets, and fMRI data already suffer from small sample sizes. So is getting higher performance via concatenation, or even averaging, legitimate? Or are we just tweaking the analysis of already small datasets to get higher performance? Moreover, if one is sure that a single volume (one TR) represents the response to a single stimulus, then it may not be a good choice to concatenate, or average, two or more volumes. If we had a large sample size with a high amount of noise, as in EEG, that would make sense, but even then I am not so sure.


    1. We need to do some comparison tests, Rawi! Perhaps I'll email the authors and see if they'll share some of the details from their experience. (Or if anyone else has tried?)

      I don't find it too surprising to get better accuracies with concatenation, given that they used linear SVM (c=1). With linear SVMs, accuracy will often increase (non-linearly) as more weakly informative voxels are added (I posted a bit about that at http://mvpa.blogspot.com/2013/09/linear-svm-behavior-discontinuous.html). In their dataset, doubling the voxels (by concatenation) must have increased the number of informative voxels enough to counteract the increase in dimensionality.
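That linear-SVM behavior can be sketched in a toy simulation (not the authors' data; every size and effect strength below is made up, and scikit-learn's LinearSVC stands in for their SVM): when each voxel carries a weak class difference, doubling the number of such voxels tends to raise cross-validated accuracy even though the dimensionality doubles too.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_trials = 100
labels = np.repeat([0, 1], n_trials // 2)  # two "finger" classes

def simulate(n_voxels, effect=0.2):
    """Toy data: every voxel is weakly informative (small shift plus noise)."""
    X = rng.normal(size=(n_trials, n_voxels))
    X[labels == 1] += effect  # weak per-voxel class difference
    return X

# 100 voxels (one timepoint) vs. 200 voxels (as if two timepoints concatenated)
acc_100 = cross_val_score(LinearSVC(C=1), simulate(100), labels, cv=5).mean()
acc_200 = cross_val_score(LinearSVC(C=1), simulate(200), labels, cv=5).mean()
print(round(acc_100, 2), round(acc_200, 2))
```

Of course, this only works out if the second timepoint's voxels actually add information; concatenating pure noise would just dilute the signal.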

      Interpretability is another issue. I agree that if you think your effect is only in one TR, then concatenation (and averaging) is a bad idea. But when the effect is spread over multiple TRs (as it usually is), I have no problem with temporal compression (averaging, model-fitting) when the aim is to get one example per trial (block, whatever). And I think I have no problem with concatenation, either, if it has the same aim as temporal compression (summarizing task-related effects within a particular time window), and that's what you're interested in. Trying to go smaller after an analysis, like looking into the classifier weights, is always problematic, but especially so when the weights span both time and space. But interpreting the concatenated window as a whole is, I think, fine.