Friday, April 21, 2017

task fMRI motion censoring (scrubbing) #1: categorizing

Motion, whether caused by head movement, other movement, breathing, or something else, is one of the banes of fMRI. Motion artifacts are a huge issue for resting state fMRI, but not only there: motion causes big problems in task fMRI as well. The best thing to do, of course, is to minimize movement during acquisition, by consistent head positioning and bracing with pads (or other systems). But no system is perfect (or able to eliminate breathing and heartbeats), so we need to consider motion in the analyses. Here (as usual, though it's certainly not perfect) I'll use the motion traces (by which I mean the x, y, z, roll, pitch, yaw values produced during realignment and often used as nuisance regressors) as a proxy for motion.

Before deciding on any sort of censoring scheme for a study, it's good to look at the motion from all of the people, to get an idea of general movement categories. This post will show some runs I've decided are representative; exemplars of different sorts of movement. For background, these are individual runs from a cognitive task fMRI study, mostly with an MB4 acquisition scheme (details here).

All of these plots have vertical grey lines at one-minute intervals; the runs are around 12 minutes long. The horizontal green lines show the timing of the three task blocks present in each run; tasks were presented at random times and of varying durations during these blocks. The top pane has the translation (mm) and rotation (degrees) from the Movement_Regressors.txt file produced during (HCP-style) preprocessing. The second pane has the enorm and FD versions of the same motion traces, in mm.
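For reference, here's a rough sketch of how the enorm and FD traces can be computed from the six realignment columns in R; the file name, the column order (x, y, z translation in mm, then rotations in degrees), and the 50 mm head-radius convention for FD are assumptions to check against your own data.

 # read the six realignment parameters (any additional columns are ignored)
 mot <- read.table("Movement_Regressors.txt")[,1:6];
 # frame-to-frame differences (first frame set to zero)
 d <- apply(as.matrix(mot), 2, function(x) { c(0, diff(x)); });
 # afni-style enorm: euclidean norm of the parameter derivatives
 enorm <- sqrt(rowSums(d^2));
 # Power-style FD: rotations (degrees) converted to mm on a 50 mm sphere, then summed
 d.mm <- d;
 d.mm[,4:6] <- d[,4:6] * pi/180 * 50;
 fd <- rowSums(abs(d.mm));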

I'll start with really nice traces, then work through to some that are not so nice, illustrating our qualitative categorization. I think it's useful to "calibrate your eyes" in this way to have a baseline understanding of some of the data characteristics before starting serious analyses or data manipulations.

Best possible: freakishly smooth: not even 0.5 mm translation over the entire 12 minute run; the little jiggles are probably related to breathing, and are also incredibly regular.

Not perfect, but very good; isolated spiky movement. This trace has very little drifting, almost entirely regular oscillations. This is the sort of movement that seems exactly suited to motion censoring: quite nice, except for a few short periods. (The frames censored with a threshold of FD > 0.9 are marked by red x.)
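(Flagging those frames is then just a comparison against the FD vector from the sketch above; the 0.9 mm cutoff is the threshold used in these plots.)

 censored.frames <- which(fd > 0.9);   # frames marked with red x in the plots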
 

The next category is traces with prominent oscillations, but otherwise pretty clean (not terribly spiky or drifting), and fairly consistent in magnitude and frequency across the run. We'll be using these types of runs without censoring in our analyses (at least for now).

Finally, there are the runs of more questionable quality and utility: numerous spikes, drifting, and/or changes in oscillation magnitude. Frames to be censored at FD > 0.9 are marked, but that threshold is only designed to detect spikes. Slow drifts have generally been considered less problematic for task fMRI than spikes, and we have comparatively few drifts in this dataset, regardless.


Spiking and drifting are fairly familiar in motion traces; oscillations, less so. (Though I'm sure this sort of movement existed prior to SMS!) It is certainly possible that the oscillation changes (e.g., third image in the last set, second in the previous pair) reflect changes in respiration rate (perhaps at least somewhat due to entraining to the task timing), which could affect BOLD in all sorts of problematic ways, and for extended periods. We're actively looking into ways to quantify these sorts of effects and minimize (or at least understand) their impacts, but I don't think there are any simple answers. We have respiration and pulse recordings for most runs, but haven't yet worked with those in detail.

Tuesday, March 21, 2017

upcoming events: PRNI and OHBM

June will be busy for me: I'll be attending PRNI in Toronto, then on to Vancouver for OHBM. Here's a bit of a sales pitch; hope to see many of you there!

PRNI (Pattern Recognition in NeuroImaging) is a great little conference, focused on machine learning applications in neuroimaging (lots of fMRI, but also EEG, MEG, etc.). It has aspects of both engineering conferences, with proceedings (you can submit a short paper - and there's still time; the deadline isn't until 14 April) and psychology conferences (you can submit a short abstract for poster presentation). We're still collecting tutorial proposals, but have a good lineup of invited speakers: Randy McIntosh, Janaina Mourao-Miranda, Rajeev Raizada, and Irina Rish. Besides the website (prni.org), we put announcements on facebook and twitter (@prniworkshop).

At OHBM I'll be speaking at the PR4NI tutorial on Sunday, "A new MVPA-er’s guide to fMRI datasets". Here's the abstract: "fMRI datasets have properties which make the application of machine learning (pattern recognition) techniques challenging – and exciting! This talk will introduce some of the properties most relevant for MVPA, particularly the strong temporal and spatial dependencies inherent in BOLD imaging. These dependencies mean that some fMRI experimental designs are more suitable for MVPA than others, due, for example, to how the tasks are distributed within scanner runs. I will also introduce some of the necessary analysis choices, such as how to summarize the response in time (e.g., convolving with an HRF), which brain areas to include, and feature selection techniques."

I also organized a symposium, which will be Tuesday morning, "High resolution fMRI via multiband (SMS) acquisition: opportunities and limitations". This symposium isn't about MVPA, but rather the practicalities of high-resolution fMRI: working with fMRI datasets with small voxels (say, 2 mm or so isotropic) and multiband acquisitions is different from working with single-shot fMRI datasets with 3 mm voxels. I will put up a separate post with talk and speaker details soon - I think it'll be a great session. Finally, I'll be presenting a poster sometime, about MVPA-ish twin similarity analyses using parts of the HCP dataset.

Wednesday, March 1, 2017

adjusting my mental model: movement correlation after preprocessing

It's time to adjust my mental model of the fMRI signal: there's a lot more correlation with movement in the timecourses after preprocessing than I'd expected. That movement really affects fMRI is not at all new, of course, and is why including the motion regressors as covariates in GLMs is standard. But I'd pictured that after preprocessing (assuming it went well and included realignment and spatial normalization) the correlation with movement left in the voxel timecourses should be pretty low (something like normally distributed, centered on 0, ranging between -0.2 and 0.2), without much spatial structure (i.e., maybe some ringing, but fairly uniformly present over the brain). Asking around, I think this is a fairly common mental model, but it looks to be quite wrong.

For exploring, I simply used afni's 3dTcorr1D program to correlate the timecourse of every voxel in several preprocessed task fMRI datasets with each of the six motion regressors (generated during preprocessing). 3dTcorr1D makes an image with 6 entries in the 4th dimension (sub-brick, in afni-speak), one for each of the 6 motion columns; the value in each voxel is the Pearson correlation between that voxel's timecourse and the movement column. I plotted these correlations on brains, and made histograms to summarize the distribution.
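For anyone wanting to stay in R, here is a rough equivalent of that 3dTcorr1D step using oro.nifti (as in my knitr demos); the file names are placeholders, no brain mask or censoring is applied, and the time dimension is assumed to be the fourth.

 library(oro.nifti);
 img <- readNIfTI("preprocessed_run.nii.gz", reorient=FALSE);    # x by y by z by time
 mot <- as.matrix(read.table("Movement_Regressors.txt")[,1:6]);  # six motion regressors
 img.dims <- dim(img);
 # reshape to voxels in rows, time in columns
 vox <- matrix(img@.Data, nrow=prod(img.dims[1:3]), ncol=img.dims[4]);
 # voxels x 6 matrix of Pearson correlations (constant voxels outside the brain give NA)
 r <- cor(t(vox), mot);
 r.img <- array(r, dim=c(img.dims[1:3], 6));  # back into brain shape, one volume per regressor
 hist(r[,1], breaks=50);   # distribution of correlations with the first regressor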

The correlations are much higher than I expected, even in people with very little movement. Here's an example; more follow. Below are the motion regressors from a single task run (about 12.5 minutes long; HCP-style preprocessing; MB8 protocol), the correlation with each motion regressor, and a (density-style) histogram of the voxel-wise correlations. Color scaling for this and all brain images is from 1 (hottest) to -1 (coolest), not showing correlations between -0.25 and 0.25.


If my expectation (correlations normally distributed, centered on 0, ranging between -0.2 and 0.2) was right, there shouldn't be any color on these images at all, but there's clearly quite a bit: many voxels correlate around 0.5 with roll and pitch (the 4th and 5th brain rows are mostly "hot" colors), and around -0.5 with x, y, and z (the first three rows are mostly "cool" colors). There's some structure to the peak correlations (e.g., a hotter strip along the left side in slice 46), which may correspond with sulci or large vessels, but it's rather speckly. Note that this is a pretty low motion subject overall: less than 2 mm drift over the 12 minute run, and only 4 volumes marked for censoring (I didn't censor before the correlation).

Looking at other people and datasets, including from non-SMS acquisitions with larger voxels and longer TRs, it appears that correlations around 0.5 are pretty common: this isn't just some sort of weird effect that only shows up with high-resolution acquisitions. For another example, these are the histograms and motion regressors for four runs from one person included in this study (acquired with 4 mm isotropic voxels; run duration 7.6 min, TR 2.5 sec, SPM preprocessing). The corresponding brain images are below the jump.



So, really visible motion (which at least sometimes is linked to respiration) in the voxel activity timecourses (such as here) is to be expected. Unfortunately, the correlation is not always (or even usually) uniform across the brain or grey matter, such as below (just the correlation with x and y translation). It also looks like very little (under a mm) motion is needed to induce large correlations.
What to do? Well, adjust our mental models of how much correlation with movement is left in the activation timeseries after preprocessing: there's quite a bit. I'll be exploring further, particularly isolating the task windows (since I work with task, not resting state, datasets): how are the correlations during tasks? I'm not at all sure that applying a global signal regression-type step would be beneficial, given the lack of homogeneity across the brain (though I know there are at least a few reports on using it with task data). Censoring high-movement trials (i.e., not including them) is likely sensible. Interestingly, I've found similar MVPA performance in multiple cases with temporal compression by averaging and by fitting a model (PEIs), which would not have been my guess looking at these correlation levels. Perhaps averaging across enough timepoints and trials balances out some of the residual motion effects? I am concerned, however, about respiration (and motion) effects remaining in the timecourses: it's clear that some people adjust their breathing to task timing, and we don't want to be interpreting a breath-holding effect as due to motivation.

Any other thoughts or experiences? Are you surprised by these correlation levels, or is it what you've already known?



Wednesday, February 22, 2017

a methodological tour de force: "The effect of spatial resolution on decoding accuracy in fMRI multivariate pattern analysis"

"The effect of spatial resolution on decoding accuracy in fMRI multivariate pattern analysis" by Gardumi et al. (full citation below) is an impressive methodological tour de force: comprehensive analyses clearly described (even their group-level permutation scheme!). Some of its themes are similar to those by Coutanche, Solomon, & Thompson-Schill (2016) which I posted about yesterday: understanding the spatial scale of fMRI information.

The approach in Gardumi et al. (2016) is different from that of Coutanche et al. (2016): they started with 7T images acquired with 1.1 mm isotropic voxels, then reconstructed the images at 2.2 and 3.3 mm effective resolution as well, by zeroing out the periphery of the k-space data, as illustrated in their Figure 1, below.
A neat part of this approach is that the resulting images have the same voxel size, but lower effective resolution, making it possible to directly compare analyses including the same number of voxels (which is good, since MVPA performance generally interacts with the number of voxels). Changing the effective resolution this way also avoids the issues related to differences between acquiring 1.1 and 3.3 mm voxels (e.g., movement sensitivity): only a single scanning session was used for each person.
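For what it's worth, my rough 2-D mental model of the k-space trick looks like the sketch below (the matrix size and "keep" fraction are made up, just to show the idea; the paper of course does this properly on the acquired 3-D data).

 # lower the effective resolution of a 2-D image by zeroing the outer (high-frequency)
 # part of k-space; the image keeps the same matrix size but becomes blurrier
 img <- matrix(rnorm(128*128), nrow=128);   # stand-in for one slice
 kspace <- fft(img);                        # DC term at [1,1]; high frequencies near the middle indices
 n <- nrow(kspace);
 k.ind <- c(0:(n/2 - 1), -(n/2):-1);        # signed spatial-frequency index for each row/column
 keep <- 0.5;                               # keep the central 50% of k-space (roughly doubles the voxel size)
 mask <- outer(abs(k.ind) <= keep*n/2, abs(k.ind) <= keep*n/2, "&");
 kspace[!mask] <- 0;
 img.lowres <- Re(fft(kspace, inverse=TRUE))/length(kspace);   # same grid, lower effective resolution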

Another interesting aspect is that they had two classification tasks: decoding the speaker or the spoken vowel (see paper for details; they used auditory stimuli and single-subject-defined auditory anatomical ROIs). One of their results summary figures is below (lots more detail in the paper!), showing group-level average accuracy for the two classifications at each effective resolution. As an aside, the x-axis is the number of voxels included, picked from univariate tests (n most active voxels from training set GLM): accuracy increased for both until around 1000 voxels were included, then leveled off (again, see the paper for details), which matches my general experience of plateauing performance (e.g.).

Anyway, Figure 6 (plus other tests that they describe) shows that smaller voxels generally did better for their vowel decoding classification, but not for speaker decoding. In the discussion Gardumi et al. (2016) ties this to previous literature findings "that informative voxels in the auditory cortex are widely distributed for vowel decoding, while more clustered for speaker decoding."

Yesterday I wrote that I'm not "convinced that it's safe to infer information spatial resolution from voxel resolution" ... am I convinced by Gardumi et al.? Yes, I think so. Below is my cartoon for how it could work. The blue squares are the brain region, the white circles informative parts, and the red squares voxels at two different sizes. Suppose that you need around a quarter of the voxel to be informative for its activity to be biased (and so contribute to a classification): this is much easier to obtain with small voxels than large ones if the informative parts are widely distributed (left), but about as easy to obtain with both small and large voxels if the informative parts are clustered (right).
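A toy version of the cartoon in R, in case it helps (all of the numbers are made up): ten informative 1-unit cells are placed in a 20x20-unit region, either scattered or clustered, and we count how many voxels reach the 25% informative threshold with 2-unit versus 5-unit voxels.

 set.seed(1);
 region <- 20;      # region is region x region units
 n.units <- 10;     # informative cells, as in the cartoon
 make.map <- function(type) {   # 1 = informative cell, 0 = uninformative
   m <- matrix(0, region, region);
   if (type == "scattered") { m[sample(region^2, n.units)] <- 1; }
   if (type == "clustered") { m[1:5, 1:2] <- 1; }   # a 5x2 block in one corner
   return(m);
 }
 count.biased <- function(m, vox) {   # voxels whose informative fraction is at least 25%
   starts <- seq(1, region, by=vox);
   ct <- 0;
   for (rw in starts) { for (cl in starts) {
     if (mean(m[rw:(rw+vox-1), cl:(cl+vox-1)]) >= 0.25) { ct <- ct + 1; }
   } }
   return(ct);
 }
 for (type in c("scattered", "clustered")) {
   m <- make.map(type);
   cat(type, ": 2-unit voxels:", count.biased(m, 2), " 5-unit voxels:", count.biased(m, 5), "\n");
 }

If I've reasoned through it right, the large voxels pick up essentially nothing when the informative cells are scattered, while both voxel sizes pick up at least a voxel or two when the cells are clustered, which is the intuition behind the cartoon.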

So, I am now thinking that it can sometimes be valid to make inferences about the spatial distribution of information from comparisons across voxel resolutions. The way in which the different voxel resolutions are obtained strikes me as very important: I have a lot more reservations about inferences when the resolutions are generated by different acquisition sequences than by k-space zeroing. And perhaps some of my change of heart is due to the different mental models I have of "widely distributed" or "clustered" information as opposed to "coarse" or "fine-grained" spatial resolution. Both of my cartoons above have 10 informative bits (circles): would you describe the one on the left as fine-grained and the one on the right as coarse-grained?


Gardumi A, Ivanov D, Hausfeld L, Valente G, Formisano E, & Uludağ K (2016). The effect of spatial resolution on decoding accuracy in fMRI multivariate pattern analysis. NeuroImage, 132, 32-42. PMID: 26899782

Tuesday, February 21, 2017

interesting approach: "A meta-analysis of fMRI decoding: Quantifying influences on human visual population codes"

A (fairly) recent paper from Coutanche, Solomon, & Thompson-Schill, "A meta-analysis of fMRI decoding: Quantifying influences on human visual population codes" (full citation below), has an interesting approach to the effort to understand the spatial scale at which information is present in fMRI signals.  

Coutanche, Solomon, & Thompson-Schill (2016) describe a meta-analysis of visual MVPA studies, the details of which (and most findings) I won't get into here. But I do want to highlight their use of the different spatial resolutions (acquired voxel sizes) across studies to get at the spatial scale of information. In their words,
"Multi-voxel decoding should be optimized (all else being equal) when the voxel size of acquired data matches the spatial resolution (i.e., granularity) of a region's information-containing patterns. We hypothesized that if V1 holds a more fine-grained map of information than later visual regions, employing larger voxels should not benefit decoding in V1, but may benefit decoding in post-V1 regions (through greater signal-to-noise at the scale of these patterns). .... Naturally, at a certain point, increasing the voxel size is expected to impair performance for any region."
They found that,
"The results of our regression analyses supported this: using larger voxels improved decoding in V2 (B=0.01, p=0.049), unlike V1 (B=0.002, p=0.451). ... These findings are consistent with later visual regions holding coarser multi-voxel codes, while V1 relies on fine-grained patterns."
The "all else being equal" does a lot of work, since there are major interactions between acquisition parameters and the signal-to-noise in the resulting functional images (I'm far from convinced that using voxels around 2 mm isotropic or smaller is a good idea for general task fMRI, but that's another topic!). But if I take as a starting assumption that we have equally good signal-to-noise across a sensible range of voxel sizes, do I accept that decoding should then be optimized "when the voxel size of acquired data matches the spatial resolution (i.e., granularity) of a region's information-containing patterns"?

The idea that, "at a certain point, increasing the voxel size is expected to impair performance for any region", strikes me as plausible: if the voxels are large enough to encompass the entire region, only the average activity of the region as a whole can be used in the analysis, losing any information contained in within-region activation patterns. However, no brain region exists in a vacuum - they are surrounded by other brain structures - and fMRI voxels don't have sharply-defined edges, so in practice, too-large voxels will have signal from adjacent regions, and the combination of regions might have quite a lot of information.

Matching the voxel size to the spatial resolution might indeed optimize decoding if the brain were a fixed grid (so the voxels could be aligned to coincide perfectly with the grid), but I'm not convinced that it's a useful aim for actual fMRI datasets: even if both the voxels and the spatial resolution were at 1 mm isotropic, the chance that the voxels would align with the brain grid seems vanishingly small. Setting the voxel size to something like 2 mm seems better, with the aim of having each voxel contain at least one of the 1 mm information units (in other words, setting the voxels larger than the true spatial resolution).

Overall, I accept the idea that voxel size could be used as a marker of spatial resolution in the abstract: a fixed (unmoving, not surrounded by other regions) region and equally good signal-to-noise across the range of voxel sizes. In actual fMRI datasets, I'm not as convinced that it's safe to infer information spatial resolution from voxel resolution, but it is an intriguing idea.

UPDATE 22 Feb 2017: see musings on Gardumi et al. (2016).

Coutanche, M., Solomon, S., & Thompson-Schill, S. (2016). A meta-analysis of fMRI decoding: Quantifying influences on human visual population codes. Neuropsychologia, 82, 134-141. DOI: 10.1016/j.neuropsychologia.2016.01.018

Monday, January 23, 2017

getting afni sub-bricks by name in knitr

We've been using afni for GLMs, and using knitr to generate various summary documents (my demo for plotting NIfTI images in knitr is here).

A very nice feature of afni is that the 4th dimension in the statistics output files (even NIfTIs) can be assigned labels. The labels can be viewed with the afni command 3dinfo -verb filename.nii.gz, which prints a list including the "sub-brick" (afni-speak for the 4th dimension) names and (0-based) indices, such as:

  -- At sub-brick #102 'Bng#5_Coef' datum type is float:   -13.6568 to    14.6825  
  -- At sub-brick #103 'Bng#5_Tstat' datum type is float:   -5.06214 to    4.55812 statcode = fitt; statpar = 1153  
  -- At sub-brick #104 'Bng#6_Coef' datum type is float:   -9.25095 to    13.3367  
  -- At sub-brick #105 'Bng#6_Tstat' datum type is float:   -4.56816 to    3.55517 statcode = fitt; statpar = 1153  
  -- At sub-brick #106 'Bng_Fstat' datum type is float:      0 to    5.56191 statcode = fift; statpar = 7 1153  

What if I want to display the Bng_Fstat image? It's possible to hard-code the corresponding number (107, in this case: R is 1-based) in the knitr R code, but that's dangerous, as the indexing could change. But we can run 3dinfo -label2index label dataset from within knitr to look up the indices by name:

 library(oro.nifti);   # for readNIfTI
 img <- readNIfTI(fname, reorient=FALSE);  
 # look up the 0-based index of the sub-brick labeled Bng_Fstat via afni's 3dinfo
 brick.num <- as.numeric(system2("/usr/local/pkg/linux_openmp_64/3dinfo", args=paste("-label2index Bng_Fstat", fname), stdout=TRUE));  
 # brick.num+1 converts afni's 0-based sub-brick index to R's 1-based indexing
 if (!is.na(brick.num)) { make.plots(in.img=img[,,,1,brick.num+1]); }  

Note that the make.plots function is from my demo, and I provide the full path to 3dinfo in the system2 call to avoid command-not-found issues. Also note the indexing: afni-created images are often read by oro.nifti with an extra dimension (the sub-bricks are in the 5th dimension rather than the 4th). 
dim(img)
[1]  75  90  75   1 152

Also, note again that it's brick.num+1: afni has a sub-brick #0, but R does not have an index 0!