MVPA Meanderings

Wednesday, February 22, 2017

a methodological tour de force: "The effect of spatial resolution on decoding accuracy in fMRI multivariate pattern analysis"

"The effect of spatial resolution on decoding accuracy in fMRI multivariate pattern analysis" by Gardumi et al. (full citation below) is an impressive methodological tour de force: comprehensive analyses clearly described (even their group-level permutation scheme!). Some of its themes are similar to those by Coutanche, Solomon, & Thompson-Schill (2016) which I posted about yesterday: understanding the spatial scale of fMRI information.

The approach in Gardumi et al. (2016) is different than that of Coutanche et al. (2016): they started with 7T images acquired with 1.1 mm isotropic voxels, then reconstructed the images at 2.2 and 3.3 mm effective resolution as well, by zero-padding the k-space images, as illustrated in their Figure 1, below.

A neat part of this approach is that the resulting images have the same voxel size, but lower effective resolution, making it possible to directly compare analyses including the same number of voxels (which is good, since MVPA performance generally interacts with the number of voxels). Changing the effective resolution this way also avoids the issues related to differences between acquiring 1.1 and 3.3 mm voxels (e.g., movement sensitivity): only a single scanning session was used for each person.

Another interesting aspect is that they had two classification tasks: decoding the speaker or the spoken vowel (see paper for details; they used auditory stimuli and single-subject-defined auditory anatomical ROIs). One of their results summary figures is below (lots more detail in the paper!), showing group-level average accuracy for the two classifications at each effective resolution. As an aside, the x-axis is the number of voxels included, picked from univariate tests (n most active voxels from training set GLM): accuracy increased for both until around 1000 voxels were included, then leveled off (again, see the paper for details), which matches my general experience of plateauing performance (e.g.).

Anyway, Figure 6 (plus other tests that they describe) shows that smaller voxels generally did better for their vowel decoding classification, but not for speaker decoding. In the discussion Gardumi et al. (2016) ties this to previous literature findings "that informative voxels in the auditory cortex are widely distributed for vowel decoding, while more clustered for speaker decoding."

Yesterday I wrote that I'm not "convinced that it's safe to infer information spatial resolution from voxel resolution" ... am I convinced by Gardumi e al.? Yes, I think so. Below is my cartoon for how it could work. The blue squares are the brain region, the white circles informative parts, and the red squares voxels at two different sizes. Suppose that you need around a quarter of the voxel to be informative for its activity to be biased (and so contribute to a classification): this is much easier to obtain with small voxels than large ones if the informative parts are widely distributed (left), but about as easy to obtain with both small and large voxels if the informative parts are clustered (right).

So, I now am thinking that it can sometimes be valid to make inferences about the spatial distribution of information from comparisons across voxel resolutions. The way in which the different voxel resolutions are obtained strikes me as very important: I have a lot more reservations about inferences when the resolutions are generated by different acquisition sequences than by k-space zeroing. And perhaps some of my change of heart is due to the different mental models I have of "widely distributed" or "clustered" information as opposed to "coarse" or "fine-grained" spatial resolution. Both of my cartoons above have 10 informative bits (circles): would you describe the one on left as fine-grained and the one on the right as coarse-grained?

Gardumi A, Ivanov D, Hausfeld L, Valente G, Formisano E, & Uludağ K (2016). The effect of spatial resolution on decoding accuracy in fMRI multivariate pattern analysis. NeuroImage, 132, 32-42 PMID: 26899782

Tuesday, February 21, 2017

interesting approach: "A meta-analysis of fMRI decoding: Quantifying influences on human visual population codes"

A (fairly) recent paper from Coutanche, Solomon, & Thompson-Schill, "A meta-analysis of fMRI decoding: Quantifying influences on human visual population codes" (full citation below), has an interesting approach to the effort to understand the spatial scale at which information is present in fMRI signals.

Coutanche, Solomon, & Thompson-Schill 2016 describes a meta-analysis of visual MVPA studies, the details of which (and most findings) I won't get into here. But I do want to highlight their use of the different spatial resolutions (acquired voxel size) across studies to get at spatial resolution. In their words,

"Multi-voxel decoding should be optimized (all else being equal) when the voxel size of acquired data matches the spatial resolution (i.e., granularity) of a region's information-containing patterns. We hypothesized that if V1 holds a more fine-grained map of information than later visual regions, employing larger voxels should not benefit decoding in V1, but may benefit decoding in post-V1 regions (through greater signal-to-noise at the scale of these patterns). .... Naturally, at a certain point, increasing the voxel size is expected to impair performance for any region."

They found that,

"The results of our regression analyses supported this: using larger voxels improved decoding in V2 (B=0.01, p=0.049), unlike V1 (B=0.002, p=0.451). ... These findings are consistent with later visual regions holding coarser multi-voxel codes, while V1 relies on fine-grained patterns."

The "all else being equal" does a lot of work, since there are major interactions between acquisition parameters and the signal-to-noise in the resulting functional images (I'm far from convinced that using voxels around 2 mm isotropic or smaller is a good idea for general task fMRI, but that's another topic!). But if I take as a starting assumption that we have equally good signal-to-noise across a sensible range of voxel sizes, do I accept that decoding should then be optimized "when the voxel size of acquired data matches the spatial resolution (i.e., granularity) of a region's information-containing patterns"?

The idea that, "at a certain point, increasing the voxel size is expected to impair performance for any region", strikes me as plausible: if the voxels are large enough to encompass the entire region, only the average activity of the region as a whole can be used in the analysis, losing any information contained in within-region activation patterns. However, no brain region exists in a vacuum - they are surrounded by other brain structures - and fMRI voxels don't have sharply-defined edges, so in practice, too-large voxels will have signal from adjacent regions, and the combination of regions might have quite a lot of information.

Matching the voxel size to the spatial resolution might indeed optimize decoding if the brain was a fixed grid (so the voxels could be aligned to coincide perfectly with the grid), but I'm not convinced that it's a useful aim for actual fMRI datasets: even if both the voxels and spatial resolution was at 1 mm isotropic, the chance that the voxels would align with the brain grid seems vanishingly small. Setting the voxel size to something like 2 mm seems better, with the aim of having each voxel contain at least one of the 1 mm information units (in other words, setting the voxels larger than the true spatial resolution).

Overall, I accept that idea that voxel size could be used as a marker of spatial resolution in the abstract: a fixed (unmoving, not surrounded by other regions) region and equally good signal-to-noise across the range of voxel sizes. In actual fMRI datasets, I'm not as convinced that it's safe to infer information spatial resolution from voxel resolution, but it is an intriguing idea.

UPDATE 22 Feb 2017: see musings on Gardumi et al. (2016).

Coutanche, M., Solomon, S., & Thompson-Schill, S. (2016). A meta-analysis of fMRI decoding: Quantifying influences on human visual population codes Neuropsychologia, 82, 134-141 DOI: 10.1016/j.neuropsychologia.2016.01.018

Monday, January 23, 2017

getting afni sub-bricks by name in knitr

We've been using afni for GLMs, and using knitr to generate various summary documents (my demo for plotting NIfTI images in knitr is here).

A very nice feature of afni is that the 4th dimension in the statistics output files (even NIfTIs) can be assigned labels. The labels are viewed by the afni command 3dinfo -verb filename.nii.gz, which prints a list including the "sub-brick" (afni-speak for the 4th dimension) names and (0-based) indices, such as:

  -- At sub-brick #102 'Bng#5_Coef' datum type is float:   -13.6568 to    14.6825  
  -- At sub-brick #103 'Bng#5_Tstat' datum type is float:   -5.06214 to    4.55812 statcode = fitt; statpar = 1153  
  -- At sub-brick #104 'Bng#6_Coef' datum type is float:   -9.25095 to    13.3367  
  -- At sub-brick #105 'Bng#6_Tstat' datum type is float:   -4.56816 to    3.55517 statcode = fitt; statpar = 1153  
  -- At sub-brick #106 'Bng_Fstat' datum type is float:      0 to    5.56191 statcode = fift; statpar = 7 1153

What if I want to display the Bng_Fstat image? It's possible to hard-code the corresponding number (107, in this case: R is 1-based) in the knitr R code, but that's dangerous, as the indexing could change. But we can run 3dinfo -label2index label dataset from within knitr to look up the indexes by name:

 img <- readNIfTI(fname, reorient=FALSE);  
 brick.num <- as.numeric(system2("/usr/local/pkg/linux_openmp_64/3dinfo", args=paste("-label2index Bng_Fstat", fname), stdout=TRUE));  
 if (!is.na(brick.num)) { make.plots(in.img=img[,,,1,brick.num+1]); }

That the make.plots function is from my demo, and I provide the full path to 3dinfo in the system2 call to avoid command-not-found issues. Also note the indexing: afni-created images are often read by oro.nifti with an extra dimension (the sub-bricks are in the 5th dimension rather than the 4th).
dim(img)
[1] 75 90 75 1 152
Also, note again that it's brick.num+1: afni has a sub-brick #0, but R does not have an index 0!

Wednesday, October 19, 2016

note on the subcortical part of CIFTIs

As people working with HCP CIFTI images know, the cortex is represented by (surface) vertices, but subcortical structures as voxels. As I explained before, you can use the wb_command -cifti-separate to make a volumetric NIfTI of a subcortical structure (say AMYGDALA_LEFT). The HCP minimally preprocessing pipelines create the CIFTI versions of each (task, in my case) run and also volumetric NIfTI versions. So my question: is timecourse of a voxel in a subcortical structure the same when taken out of an unpacked CIFTI and a NIfTI? Answer: no.

This screenshot shows the volumetric NIfTI (red stripe),
AMYGDALA_LEFT via wb_command -cifti-separate (green stripe), and the _AtlasSubcortical_s2.nii.gz image (yellow stripe) for one of our HCP pipeline minimally-processed task fMRI runs.

The _AtlasSubcortical_s2.nii.gz image was created by the pipeline, and looks to be the subcortical (volumetric) part of the CIFTI: the value at this voxel is identical (green and yellow). The value for the voxel in the volumetric NIfTI (red) is a bit different, however.

Note that I don't think this mismatch is an error: parcel-constrained smoothing is done during the HCP pipelines that make the CIFTIs (e.g., Glasser et al. 2013), and I suspect that this smoothing accounts for the different voxel values. However, it's good to be aware of the additional processing: if you're doing a volumetric analysis with HCP-type preprocessing, it matters whether you take the subcortical structures out of the CIFTI or the NIfTI.

Friday, October 14, 2016

an update and some musings on motion regressors

In case you don't follow practiCal fMRI (you should!), his last two posts describe a series of tests exploring whether or not the multiband (AKA simultaneous multi-slice) sequence is especially liable to respiration artifacts: start here, then this one. Read his posts for the details; I think a takeaway for us non-physicists is that the startlingly-strong respiration signal I (and others) have been seeing in multiband sequence timecourses and motion regressors is not from the multiband itself, but rather that respiration and other motion-type signals are a much bigger deal when voxels are small (e.g., 2 mm isotropic).

This week I dove into the literature on motion regression, artifact correction, etc. Hopefully I'll do some research blogging about a few papers, but here I'll muse about one specific question: how many motion regressors should we use (as regressors of no interest) for our task GLMs? 6? 12? 24? This is one of those questions I hadn't realized was a question until running into people using more than 6 motion regressors (the 6 (x,y,z,roll,pitch,yaw) come from the realignment during preprocessing; transformations of these values are used to make the additional regressors).

Using more than 6 motion regressors seems more common in the resting state and functional connectivity literature than for task fMRI (Power et al. 2015, and Bright & Murphy 2015 , for example). I found a few (only a few) task papers mentioning more than 6 motion regressors, such as Johnstone et al. 2006, who mention testing "several alternative covariates of no interest derived from the estimated motion parameters", but they "lent no additional insight or sensitivity", and Lund et al. 2005, who concluded that including 24 regressors was better than none.

Out of curiosity, we ran a person through an afni TENT GLM (FIR model) using 6 (left) and 24 (right) motion regressors. This is a simple control analysis: all trials from two runs (one in blue, the other orange), averaging coefficients within my favorite left motor Gordon parcel 45 (there were button pushes in the trials). It's hard to tell the difference between the model with 6 and 24 regressors: both are similar and reasonable; at least in this test, the extra regressors didn't have much of an effect.

My thinking is that sticking with the usual practice of 6 regressors of no interest is sensible for task fMRI: adding regressors of no interest uses more degrees of freedom in the model, risks compounding the influence of task-linked motion, and hasn't been shown superior. But any other opinions or experiences?

Saturday, October 1, 2016

multiband acquisition sequence testing: timecourses 2

In a previous post I showed timecourses from the same person, doing the HCP MOTOR task, collected with a multiband 4 and a multiband 8 sequence (see previous posts for details). We ran another test person ("999010") through the same task, with the same two sequences (but not the MB0 control).

This plot shows the average activation (post-preprocessing) in the same SMmouthL Gordon parcel as before, but with the respiration recording as background (light blue), instead of the motion regressors. As before, since this is a mouth parcel, the activation should be strongest to the green "tongue" movement blocks. The respiration recording, events, and average activation are temporally aligned to the actual TR (x-axis), not shifted to account for the hemodynamic delay.

It is clear in both the MB8 and MB4 recordings that task activation is present, but it is perhaps a bit clearer with MB4. The little "jiggles" in the timecourse are present in all four runs, and look to be related to respiration, though not perfectly aligned with respiration. We're switching our ongoing MB8 imaging study to MB4 in the hopes of improving signal-to-noise, but respiration-related effects still look to be prominent in the MB4 BOLD, so dealing with them is an ongoing project.

Monday, September 5, 2016

respiration and movement: more experiences

Annika Linke, a postdoc at SDSU, contacted me, reporting that she's also seen oscillations in multiband acquisitions, with a Siemens Trio and Prisma, plus a GE Discovery. She kindly sent examples of her findings, and allowed me to share and describe a few of them here. More examples can be downloaded here and here.

Annika ran a multiband 8 sequence on a GE 3T Discovery MR750 scanner (UCSD CFMRI; TR = 800 msec, mb_factor 8, 32 channel head coil, 2 x 2 x 2 mm voxels; AP - PA encoding directions), and also saw prominent oscillations in the motion regressors, linked to respiration:

The subject was sleeping in this scan, and periodically stopped breathing. The movement in the motion regressors stopped and starts with the abnormal breathing periods, very similar to the traces from the purposeful breath-holding experiment we ran. I was also struck by the size of the oscillations in the movement regressors: somewhere between 0.5 and 1 mm, which neatly matches the size of the larger oscillations we've seen. Annika has results for an awake adult and toddlers, all of whom show oscillations (particularly in the A-P (y) and I-S (z) axes); see this pdf. These comparisons suggest the oscillation magnitude is not directly related to participant weight: the toddler and adult magnitudes are similar, though toddlers of course are smaller and have a faster respiration rate.

Here are some motion regressors Annika collected on a different scanner (Siemens 3T Prisma at Robarts Research Institute, Western University; CMRR release 10b, VD13D, 64 channel head coil, 3 x 3 x 3 mm voxels), at different MB factors. The oscillation is in all runs, though lowest amplitude with MB1.

Finally, here's a set of motion regressors collected on a Trio at MB2 (ipat2 acceleration, 2 mm voxels, TR=1300 msec, AP/PA phase encoding directions, Robarts Research Institute, Western University). Each subplot is a run from a different person. All again have oscillations mostly in the y and z directions, though the magnitude is less than the MB8 plots above.

Annika's results make it clear that the magnitude of the oscillations in the motion regressors is not due to some weird fluke with our scanner, but rather some aspect of the multiband sequences (or some related factor, such as reconstruction, etc.); hopefully these additional examples will help us understand what's going on.