MVPA Meanderings: 2020

Tuesday, December 15, 2020

DMCC13benchmark: public again

Apologies: I've had to temporarily delete DMCC13benchmark from OpenNeuro. We plan to upload a new version ASAP. The images and associated data in the new version will be the same, and I will keep the same dataset name.

The problem was with the subject ID codes: some of the DMCC participants were members of the Young Adult HCP, and we released DMCC13benchmark with those HCP ID codes. In consultation with members of the HCP we decided that it would be most appropriate to release data with unique ID codes instead. The subject key (linking the subject IDs released on openneuro to the actual HCP ID codes) is available via the HCP ConnectomeDatabase for people who have accepted the HCP data use terms (ConnectomeDB00026, titled "DMCC (Dual Mechanisms of Cognitive Control) subject key").

Again, I apologize for any inconvenience. We are working to upload a corrected version of DMCC13benchmark soon. Please contact me with questions or if you need something sooner; I will add links to this post when DMCC13benchmark is again public.

UPDATE 4 January 2021: The corrected version of DMCC13benchmark is now available as openneuro dataset ds003452, doi: 10.18112/openneuro.ds003452.v1.0.1. We ask anyone using the dataset to use this version (all subject IDs starting with "f") rather than the previous. Every file should be identical between the versions, excepting the subject IDs.

I again apologize for any inconvenience this change may have caused you, and urge you to contact me with any questions or concerns.

Wednesday, November 11, 2020

DMCC GLMs: afni TENTzero, knots, and HDR curves, #2

This post is the second in a series, which begins with this post. In that first post I included examples of the estimated HDR curves for a positive control analysis, one of which is copied here for reference.

What are the axis labels for the estimated HDR curves? How did we generate them? What made us think about changing them?

At the first level the (curve) axis label question has a short answer: y-axes are average (across subjects) beta coefficient estimates, x-axes are knots. But this answer leads to another question: how do knots translate into time? Answering this question is more involved, because we can't translate knots to time without knowing both the TR and the afni 3dDeconvolve command. The TR is easy: 1.2 seconds. Listing the relevant afni commands is also easy:

3dDeconvolve \  
 -local_times \  
 -x1D_stop \  
 -GOFORIT 5 \  
 -input 'lpi_scale_blur4_tfMRI_AxcptBas1_AP.nii.gz lpi_scale_blur4_tfMRI_AxcptBas2_PA.nii.gz' \  
 -polort A \  
 -float \  
 -censor movregs_FD_mask.txt \  
 -num_stimts 3 \  
 -stim_times_AM1 1 f1031ax_Axcpt_baseline_block.txt 'dmBLOCK(1)' -stim_label 1 ON_BLOCKS \  
 -stim_times 2 f1031ax_Axcpt_baseline_blockONandOFF.txt 'TENTzero(0,16.8,8)' -stim_label 2 ON_blockONandOFF \  
 -stim_times 3 f1031ax_Axcpt_baseline_allTrials.txt 'TENTzero(0,21.6,10)' -stim_label 3 ON_TRIALS \  
 -ortvec motion_demean_baseline.1D movregs \  
 -x1D X.xmat.1D \  
 -xjpeg X.jpg \  
 -nobucket  
 
3dREMLfit \
 -matrix X.xmat.1D \
 -GOFORIT 5 \
 -input 'lpi_scale_blur4_tfMRI_AxcptBas1_AP.nii.gz lpi_scale_blur4_tfMRI_AxcptBas2_PA.nii.gz' \
 -Rvar stats_var_f1031ax_REML.nii.gz \
 -Rbuck STATS_f1031ax_REML.nii.gz \
 -fout \
 -tout \
 -nobout \
 -verb

Unpacking these commands is not so easy, but I will try to do so here for the most relevant parts; please comment if you spot anything not quite correct in my explanation.

First, the above commands are for the Axcpt task, baseline session, subject f1031ax (aside: their data, including the event timing text files named above, is in DMCC13benchmark). The 3dDeconvolve command has the part describing the knots and HDR estimation; I included the 3dREMLfit call for completeness, because the plotted estimates are from the Coef sub-bricks of the STATS image (produced by -Rbuck).

Since this is a control GLM in which all trials are given the same label there are only three stimulus time series: BLOCKS, blockONandOFF, and TRIALS (this is a mixed design; see introduction). The TRIALS part is what generates the estimated HDR curves plotted above, as defined by TENTzero(0,21.6,10).

We chose TENTzero instead of TENT for the HDR estimation because we do not expect anticipatory activity (the trial responses should start at zero) in this task. (see afni message board, e.g., here and here) To include the full response we decided to model the trial duration plus at least 14 seconds (not "exactly" 14 seconds because we want the durations to be a multiple of the TENT duration). My understanding is that it's not a problem if more time than needed is included in the duration; if too long the last few knots should just approximate zero. I doubt you'd often want to use a duration much shorter than that of the canonical HRF (there's probably some case when that would be useful, but for the DMCC we want to model the entire response).

From the AFNI 3dDeconvolve help:

'TENT(b,c,n)': n parameter tent function expansion from times b..c after stimulus time [piecewise linear] [n must be at least 2; time step is (c-b)/(n-1)]

You can also use 'TENTzero' and 'CSPLINzero',which means to eliminate the first and last basis functions from each set. The effect of these omissions is to force the deconvolved HRF to be zero at t=b and t=c (to start and and end at zero response). With these 'zero' response models, there are n-2 parameters (thus for 'TENTzero', n must be at least 3).

The first TENTzero parameter is straightforward: b=0 to start at the event onset.

For the trial duration (c), we need to do some calculations. Here, the example is the DMCC Axcpt task, which has a trial duration of 5.9 seconds, so the modeled duration should be at least 14 + 5.9 = 19.9 seconds. That's not the middle value in the TENTzero command, though.

For these first analyses we decided to have the TENTs span two TRs (1.2 * 2 = 2.4 seconds), in the (mostly) mistaken hope it would improve our signal/noise, and also to make the file sizes and GLM estimation speeds more manageable (which it does). Thus, c=21.6, the shortest multiple of our desired TENT duration greater than the modeled duration (19.9/2.4 = 8.29, rounded up to 9; 9*2.4 = 21.6).

Figuring out n requires a bit of mind bending but less calculation; I think the slide below (#5 in afni06_decon.pdf) helps: in tent function deconvolution, the n basis functions are divided into n-1 intervals. In the above calculations I rounded 8.29 up to 9; we want to model 9 intervals (of 2.4 s each). Thus, n = 10 basis functions gives us the needed n-1 = 9 intervals.

Recall that since we're using TENTzero, the first and last tent functions are eliminated. Thus, having n=10 means that we will get 10-2=8 beta coefficients ("knots") out of the GLM. These, finally, are the x-axes in the estimated GLM curves above: the knots at which the HDR estimates are made, of which there are 8 for Axcpt.

(Aside: 0-based afni can make some confusion for those of us more accustomed to 1-based systems, especially with TENTzero. The top plots put the value labeled as #0_Coef at 1; a point is added at 0=0 since TENTzero sets the curves to start at zero. I could also have added a zero for the last knot (9, for this Axcpt example), but did not.)

These GLMs seem to be working fine, producing sensible HDR estimates. We've been using these results in many analyses (multiple lab members with papers in preparation!), including the fMRIprep and HCP pipeline comparisons described in previous posts. We became curious, though, if we could set the knot duration to the TR, rather than twice the TR (as done here): would the estimated HDR then follow the trial timing even more precisely? In the next posts(s) I'll describe how we changed the GLMs to have the knot duration match the TR, and a (hopefully interesting) hiccup we had along the way.

UPDATE 4 January 2021: Corrected DMCC13benchmark openneuro links.

Tuesday, November 10, 2020

DMCC GLMs: afni TENTzero, knots, and HDR curves, #1

One of the things we've been busy with during the pandemic-induced data collection pause is reviewing the DMCC (Dual Mechanisms of Cognitive Control; more descriptions here and here) project GLMs: did we set them up and are we interpreting them optimally?

We decided that the models were not quite right, and are rerunning all of them. This post (and its successors) will describe how the original GLMs were specified, why we thought a change was needed, what the change is, and how the new results look. The posts are intended to summarize and clarify the logic and process for myself, but also for others, as I believe many people find modeling BOLD responses (here, with afni 3dDeconvolve TENTs) confusing.

I won't just start explaining the DMCC GLMs however, because they're complex: the DMCC has a mixed design (task trials embedded in long blocks), and we're estimating both event-related ("transient") and block-related ("sustained") responses. If you're not familiar with them, Petersen and Dubis (2012) review mixed block/event fMRI designs, and Figure 1 from Visscher, et. al (2003) (left) illustrates the idea. In addition to the event and block regressors we have "blockONandOFF" regressors in the DMCC, which are intended to capture the transition between the control and task periods (see Dosenbach et. al (2006)).

We fit many separate GLMs for the DMCC, some as controls and others to capture effects of interest. For these posts I'll describe one of the control analyses (the "ONs") to make it a bit more tractable. The ON GLMs include all three effect types (event (transient), block (sustained), blockONandOFF (block start and stop)), but do not distinguish between the trial types; all trials (events) are given the same label ("on"). The ON GLMs are a positive control; they should show areas with changes in activation between task and rest (30 seconds of fixation cross before and after task blocks). For example, there should be positive, HRF-type responses in visual areas, because all of our task trials include visual stimuli.

Below is an example of the estimated responses from our original ONs GLMs (generated by R knitr code similar to this). The GLMs were fit for every voxel individually, then averaged within the Schaefer, et al. (2018) parcels (400x7 version); these are results for four visual parcels and volumetric preprocessing. These are (robust) mean and SEM over 13 participants (DMCC13benchmark). The results have multiple columns both because there are four DMCC tasks (Axcpt (AX-CPT), Cuedts (Cued task-switching), Stern (Sternberg working memory), and Stroop (color-word variant)), and because of the mixed design, so we generate estimates for both event/transient (lines) and sustained (blue bars) effects (blockONandOFF are not included here).

The line graphs show the modeled response at each time point ("knot"). Note that the duration (and so number of estimated knots) for each task varies; e.g., Stroop has the shortest tasks and Stern the longest. (We set the modeled response to trial length + 14 seconds.)

The double-peak response shape for Axcpt, Cuedts, and Stern is expected, as the trials have two stimuli slides separated by a 4 second ITI; the Stroop response should resemble a canonical HRF since each trial has a single stimulus and is short. The task timing and some sample trials are shown below (this is Figure 1 in a preprint (accepted version) which has much more task detail; see also the DMCC55B description).

So far, so good: the estimated responses for each task have a sensible shape, reflecting both the HRF and task timing. But what exactly is along the x-axis for the curves? And how did we fit these GLMs, and then decide to change them a bit? ... stay tuned.

later posts in this series: #2

UPDATE 4 January 2021: Corrected DMCC13benchmark openneuro links.

Monday, July 27, 2020

overlaying in Connectome Workbench 1.4.2

This post answers a question posted on my Workbench introductory tutorial: Rui asked how to visualize the common regions for two overlays. I've found two ways of viewing overlap.

The first is to reduce the Opacity (circled in red) of the top-most Layer. In this screen capture I have a blueish ROI on top of a yellow one. (I scribbled the colors over their corresponding rows in the Layers toolbox; blue is listed above - over - the yellow. Click the Layers On and off if you're not sure which is which.) Compare the appearance of the ROIs when the Opacity is set to 1.0 for both (left) versus 0.7 for the top (blue, right): the borders of both ROIs are visible on the right side, and the blue is less opaque all over (not just in the area that overlaps).

The second method is to set one (or both) ROIs to Outline (border) mode. Here, I set the top (blue) ROI to "Outline Only" in the Overlay and Map Settings dialog box (click the little wrench button to bring up the dialog).

Tuesday, June 16, 2020

tip: running FSL commands from within R

I very much believe that analyses should use existing/established programs/packages/functions as much as possible, preferably, the original version of the program in a transparent way. But I also want to preserve a record of the analysis, and be able to rerun it if needed, while minimizing the chance of difficult-to-identify errors like of clicking the wrong setting in a GUI.

I'm increasingly convinced that a good strategy is to do file management-type things in the script, build the needed (FSL, afni, etc.) command, then send the command directly to the target program (FLS, afni, etc.), rather than use a specialized interface package. There are at least two big advantages to building the actual commands: first, few (if any) dependencies or new syntax are introduced, second, the executed commands are immediately and transparently recoverable. Note that this advice is for analysis scripts, not full-scale pipelines or software.

For example, I needed to run FSL flirt to register around twenty images. This is the only step requiring FSL, and my script is in R. I have a few options, including the FLIRT GUI, generating a text file of the flirt commands and running it at the command line, typing the commands directly at the command line, or running fsl from within R using wrapper functions (e.g., fslr).

This code shows the solution I'm advocating: using R functions for the file-wrangling and command-building parts, and then using system2 to "send" the command to fsl for running. This strategy can be used not only with fsl, but also afni, wb_command, and other command-line programs.

 fsl.path <- "/usr/local/pkg/fsl6.0/bin/";   # path to fsl flirt
 in.path <- "/data/";  # path to the input images 
 ref.path <- "/data/";  # path to the reference images  
 out.path <- "/scratch3/registrations/";   # where to write files  
 sub.ids <- c(102008, 107321, 814649, 849971);  # subject IDs
 
 for (sid in 1:length(sub.ids)) {   # sid <- 1;   
  # make the file names
  fname.in <- paste0(in.path, sub.ids[sid], "/T1w/T1w_acpc_dc_restore.nii.gz");   
  fname.ref <- paste0(ref.path, sub.ids[sid], "/T1w/T1w_acpc_dc_restore.nii.gz");    
  fname.out <- paste0(out.path, sub.ids[sid], "_T1w.nii.gz");  
  fname.mat <- paste0(out.path, sub.ids[sid], "_T1w.txt");  
    
  # check if the input files exist (and output files do not), then run fsl if so
  if (file.exists(fname.in) & file.exists(fname.ref) & !file.exists(fname.out) & !file.exists(fname.mat)) {  
   system2(paste0(fsl.path, "flirt"), args=paste("-in", fname.in, "-ref", fname.ref, "-out", fname.out, "-omat", fname.mat, "-dof 6"),  
       stdout=TRUE, env="FSLOUTPUTTYPE=NIFTI_GZ");  
  }  
 }

The system2 function "sends" the command to fsl and can be used to check what was done:

paste0(fsl.path, "flirt") shows which program was run and its location on disk.
args= has the options sent to the flirt program, as a string. Copying out just the paste(...) part and running it in R will print it for inspection (example below).
stdout=TRUE tells R to print fsl's messages as it runs (useful to spot any errors)
env="FSLOUTPUTTYPE=NIFTI_GZ" sets the fsl environment variable FSLOUTPUTTYPE to NIFTI_GZ. Depending on your system and fsl installation you may need to set more (or no) environment variables.

Specifically, for the first participant (sid 1), the code generates this, which could be run without R:

/usr/local/pkg/fsl6.0/bin/flirt -in /data/102008/T1w/T1w_acpc_dc_restore.nii.gz -ref /data/102008/T1w/T1w_acpc_dc_restore.nii.gz -out /scratch3/registrations/102008_T1w.nii.gz -omat /scratch3/registrations/102008_T1w.txt -dof 6

Wednesday, May 13, 2020

"Pattern similarity analyses of frontoparietal task coding" update

A recently-published paper of ours (preprint; "Pattern Similarity Analyses of FrontoParietal Task Coding: Individual Variation and Genetic Influences") uses correlational MVPA/RSA-type methods to look at heritability effects in HCP task fMRI data. A primary motivation was methods development for the DMCC, which we're analyzing for patterns in individuals and twin pairs across time.

Since publishing, the HCP has updated their list of subjects with major issues, including some in our analyses. I wondered if these problematic subjects could have affected the results, so reran the analyses without them (advertisement: this sort of update is drastically easier if you use knitr or another report-generating system for your results). I'm pleased to say that the omitting these flagged subjects had a negligible impact on the results - it turns out that I'd already excluded most of them from the key analyses because of missing behavioral data.

Both the published and updated versions of the supplemental are on the osf site for the paper. Versions of my favorite figure before and after omitting the flagged participants are after the jump.

observation: variability from running fmriprep multiple times

UPDATE 13 May 2020: I'm now quite sure that the high variability wasn't due to the multiple fmriprep runs, but rather the way I picked the training/testing examples. I'm leaving this post here, but put most of it "below the fold" since my initial interpretation was incorrect. I will eventually release code and more details.

UPDATE 24 April 2020: I realized that there was at least one other possible big contributing factor to the difference between the runs: randomness in how I chose the not-button examples. I'll fix this and see what happens to the SVM results. I previously found extremely similar GLM results between fmriprep versions 1.1.2 and 1.3.2, so the variability here is very possibly only from my randomness in setting up the classification.

Getting started with Connectome Workbench 1.4.2

Here is an updated version of my introductory tutorial for the Connectome Workbench, written for Workbench 1.4.2. This new version of Workbench has some nice features (only a few of which are described here!), so I suggest you try this one rather than a previous version. I have a few other posts on Workbench and HCP-related topics, mostly linked from the "Connectome Workbench: 1st steps" post.

downloading the program

Connectome Workbench has versions for Windows, Mac OS, and Linux; just click the download link for your OS (Linux types can also install Workbench via NeuroDebian).

On Windows and Mac you don't "install" Workbench, but just unzip the download then double-click the executable. On my windows box I put the download into d:\Workbench\, unzipped it, and got a bunch of subdirectories. Navigate through until you find wb_view; in my case it's at D:\Workbench\workbench-windows64-v1.4.2\bin_windows64\wb_view.exe. Double-click wb_view to start the program. If you don't want to navigate to this directory each time to start Workbench, make a shortcut to wb_view.exe and put it on your desktop or in a handy menu.

Aside: wb_command.exe is in the same directory as wb_view.exe. wb_command.exe is not a GUI program (nothing much will happen if you double-click it!), but is useful for various functions; see this post and this documentation for more.

getting images to plot

Workbench doesn't come with any images, so this tutorial will use ones from my knitr tutorials, available at https://osf.io/w7zkc/. Download all of the files under "knitr tutorials", the "surface (GIFTI) brain plotting" and "volumetric (NIfTI) brain plotting" subdirectories. (The .rnw files should be kept in separate directories if you're going to compile the knitr tutorials.)

Aside: Only the files from the osf site are needed for this tutorial, but you will likely want more anatomic underlays than these. I suggest underlays from the HCP S1200 release for volumetric (MNI) and fsLR surfaces; I converted the fsaverage5 surfaces from our local FreeSurfer installation, but FreeSurfer provides many more than the pair I put into osf.

seeing blank brains

Open the Workbench GUI (e.g., by double-clicking wb_view.exe). A command window will open (just ignore it), as well as a interactive box prompting you to open a spec or scene file. Click the Skip button to load the main program. Note: spec and scene files are very useful, and a big reason to use Workbench, because they let you save collections of images and visualizations, which can save a massive amount of time. I won't cover them in this tutorial, though.

Since we skipped loading anything, Workbench opens with a blank screen. We want to first open images to use as underlays: a NIfTI volumetric underlay to plot volumetric blobs on, and GIFTI surface files to plot surface blobs on (see this post for a bit more about file types).

Select File -> Open File from the top menus to open a standard file selection dialog. Navigate to where you put the images from "surface (GIFTI) brain plotting", and change the Files of type drop-down to "Surface Files (*.surf.gii)", then select and open the four fsaverage5 .surf.gii files (two hemispheres * two inflations). Open the Open File dialog again, navigate to where you put the files from "volumetric (NIfTI) brain plotting", and set the Files of type drop-down to "Volume Files (*.nii *.nii.gz)", then select and open S1200_AverageT1w_81x96x81.nii.gz.

Aside: Connectome Workbench works with both fsLR and fsaverage5 surfaces (the two types in the gifti tutorial files), but not at the same time.

All of the underlay images are now loaded in Workbench, but we need to tell it to display them like we want: let's put the surfaces on the first tab and volumes on the second.

The above images show the settings to display the volumes and surfaces (click to enlarge). The first tab probably was already set for multiple surfaces ("Montage"), and likely now shows four wrinkly (pial) brains. Since both pial and inflated fsaverage5 surfaces were loaded, which are shown where can be adjusted with the Montage Selection part of the toolbar (highlighted in red). Try adjusting how many and which surfaces are viewed by changing these settings. You can click and drag the hemispheres around with the mouse; use the buttons in the Orientation part of the menu to reset.

Click on the second tab (probably labeled "All"), then choose Volume in the View part of the menu (circled in red above). The top menus and tab title should change, and a single slice of the volume be displayed. Try adjusting the number and spacing of the volume images. For example, show more slices by clicking the Montage On button (right red arrow), and adjust the image arrangement with the values in the Montage menu boxes (left of the vertical red line). The height of the slices is changed in the A: (axial) box of the Slice Indices/Coords menu (right of vertical red line). The crosshairs can be turned on and off with the button at the bottom left of the Slice Plane menu (left red arrow); the labels on and off with the adjacent LARP and XYZ buttons.

adding overlays

Now that we have arranged underlay images, let's add something on top. Overlays are opened in the same way as the underlay images, via File -> Open File. The Open File window is probably still in the nifti tutorial directory and set to Files of type "Volume", so select the two images in that directory that are not already loaded: continuousOverlay.nii.gz and Schaefer2018_400x7_81x96x81.nii.gz. Workbench won't show the overlays right away, but don't worry - they were opened.

To load the surface overlays use the File -> Open File menu option again (it doesn't matter which Workbench tab you're on), navigate to the directory with the gifti tutorial files, and change the Files of Type: box to "Metric Files (*.func.gii *shape.gii). Select the two fsaverage5 files (fsaverage5_STATS_L.func.gii and fsaverage5_STATS_R.func.gii) and click Open.

Workbench will pop up a query box like this, one for each hemisphere. Select the matching Cortex for each file, and click OK. As with the volume overlays the appearance of Workbench won't change, but the files were read.

To see the images we just loaded, we need to turn the proper overlay layers “On” the surface and volume tabs. The Overlay ToolBox settings control the loading and appearance of the overlay images and work similarly for surfaces and volumes.

On the surface (Montage) tab select the two overlays (one for Left and one for Right; order does not matter) in the File boxes, then click the rows On, as pointed out by red arrows. The color scaling is rather mysterious, but just leave it for now.

These STATS overlays were generated by an afni GLM, and contain multiple named slots. Which statistic is displayed on each hemisphere can be controlled by the Map boxes highlighted in green (note that this number is 1-based while afni uses 0-based, so the numbers shown with 3dinfo are one less). While Workbench lets you show different maps (statistics, in this case) on each hemisphere, it's more usual to want to see the same map on both hemispheres. The Yoke options (between the red and green arrows) will link the two hemispheres together: switch both from "Off" to "I", and when you change one hemisphere's map the other should as well.

On the Volume tab, note that the last row in the Overlay ToolBox is S1200_AverageT1w_81x96x81.nii.gz, the anatomic underlay. This is good - we want to have the anatomy under the overlays. Select one of the volumetric overlay files in each of the other rows using the File dropdown menu (red arrows), and click them On and off.

The two volume overlays in the tutorial dataset are 3d - only one image, rather than 4d (3d plus time or statistics) - so the Map and Yoke options are disabled. Workbench used greyscale for the overlays, but if you look closely and click the layers On and off you can see that the layers are in the order of the Overlay ToolBox File rows: Schaefer (parcel mask) on top, then the continuous image (speckly), then the anatomy. Change which file is listed in each row to change the stacking.

change the color scaling

Schaefer2018_400x7_81x96x81.nii.gz is the Schaefer 400 parcel x 7 network parcellation; let’s add some colors to the parcels. On the Volume tab and click the continuousOverlay.nii.gz layer off, so that only the Schaefer parcels are shown on the anatomy. Now click the wrench button in the Schaefer layer's row (green arrow below); the Overlay and Map Settings window should appear.

Change the settings in the window to match this image and see how the histogram and brains change. The histogram at the right shows the range of the data (1:400 for the Schaefer 400x7), how many voxels with each value, and what color they're assigned with the current color palette and scaling settings. In this case we can see that lower-numbered parcels are given greens, and plotted on the left hemisphere. Switch through the different Palettes and range options to see how the appearance shifts.

What if we like this coloring, but only want to show the right hemisphere parcels? The settings below are one way to accomplish it; try switching the Show Data Inside Thresholds and Show Data Outside Thresholds selection to show the left hemisphere.

Click the Close button at the lower right to close the volume's Overlay and Map Settings window, and switch to the surface tab. Show Map 3 ("ON_BLOCKS#0_Tstat") on each hemisphere, then click the wrench in one of the overlay's rows to open the Overlay and Map Settings window again (I clicked the one for the right hemisphere). You'll see that the histogram looks quite different: the t statistics are both positive and negative, roughly normal, with a lot of values around 0.

Can you make it show only values above 1 and below -1, using a typical warm colors for positive, cool for negative scheme, with red and blue corresponding to 1 and -1, respectively? Below is one solution.

Note that only the right hemisphere has the coloring scheme - since I started by clicking the wrench in the right hemisphere's row. I could close this Overlay and Map Settings window, then click the wrench in the left hemisphere's row and set everything again, but there's an easier way: click the Apply to Files button (green arrow) to open the Copy Palette Color ... window, then click both fsaverage5_STATS Metrics on. Click OK, then Close. Both hemispheres should now have the same coloring scheme.

Finally, switch through the different surface Maps - they will have have the new coloring scheme. This happened because the Apply to All Maps option (just above the green arrow in the previous picture) was checked in the Overlay and Map Settings window. If you uncheck this option, changes will only affect the Map you have shown when the wrench was clicked.

parting comments

This tutorial introduces some basic Workbench functionality, but it has many, many more features. The Workbench tutorial is a good next step to see what else it can do. Good luck!

Friday, March 13, 2020

volume and surface brain plotting knitr tutorial

Here is the second of my pair of posts introducing my updated knitr brain plotting tutorials. The first tutorial describes setting up RStudio for knitr compilation, with base R graphics examples - start there. This post adds a pair of brain plotting tutorials, one for volumetric images and the other for surfaces. The source (knitrIntro_NIfTI.rnw, knitrIntro_gifti.rnw) and image files needed for compilation can be downloaded from the blog osf site, "knitr tutorials" section. The compiled pdfs are included as well, at knitrIntro_NIfTI.pdf and knitrIntro_gifti.pdf.

The surface and volume examples are roughly parallel, covering plotting both continuous statistical overlays and parcellations. The example code includes assigning values to parcels (e.g., to show the results of an analysis), adjusting the plot appearance, and doing math with the images in R. Please read both the text in the pdf and the code comments.

The brain images in the knitrs (a few of which are above) are added with a pair of functions I wrote: plot.volume() and plot.surface(). These are updated versions of the functions in previous posts on this blog, with changes to improve the appearance of the surface images and make the color scaling parameters for volumes more similar to those for surfaces.

My intention is that the plotting functions and usage examples in these knitrs will enable beginners to quickly start making useful (and attractive!) knitr documents to summarize, explore, and describe their own analyses. Please let me know if you encounter any bugs, have a suggestion for an improvement, or have a new feature suggestion.

UPDATE 31 March 2023: I uploaded corrected versions of niftiPlottingFunctions.R and knitrIntro_NIfTI.pdf to osf and the volumetric screenshot above. The correction was needed because of a bug in my plot.volume() function causing the negative colors to be reversed: instead of values near zero being dark blue, they were cyan. Major thanks to Tan Nguyen for spotting the unexpected coloring! If you downloaded a copy of niftiPlottingFunctions.R before 30 March 2023, please replace it with this version.

UPDATE 29 June 2021: Changed the volumetric screenshot to show the new (diverging) cool colors (and a legend); niftiPlottingFunctions.R and giftiPlottingFunctions.R now use the same color scales.

Monday, March 9, 2020

introductory knitr tutorial

This is a new introductory knitr tutorial. I have posted two previous knitr tutorials, but this supersedes the first: I have now split the image plotting (NIfTI and GIFTI) into its own tutorial. This post describes setting up RStudio and knitr and compiling the tutorial .rnw, which can be downloaded from the (new!) blog osf site.

I'm a big fan of R and knitr; I now use knitr to create nearly all of my analysis-summary documents, even those with "brain blob" images, figures, and tables. The files can be complex, such as this supplemental information.

This post contains a knitr tutorial in the form of an example knitr-created document, and the source needed to recreate it. My intention is that this will be a "starter kit", containing examples of the basic formatting needed to quickly start using knitr. The file is also a starter kit for base R graphics ... I use base R for nearly everything, rather than ggplot2 or the tidyverse.

What does knitr do? Yihui has many demonstrations on his web site. I use knitr to create pdf files presenting, summarizing, and interpreting analysis results. Part of the demo pdf is in the image at left to give the idea: I have several paragraphs of explanatory text above a figure. This entire pdf was created from a knitr .rnw source file, which contains LaTeX text and R code blocks.

Previously, I'd make Word documents describing an analysis, copy-pasting figures and screenshots as needed, and manually formatting tables. Besides time, a big drawback of this system is human memory ... "how exactly did I calculate these figures?." I tried including links to the source R files and notes about thresholds, etc, but often missed some key detail, which I'd then have to reverse-engineer. knitr avoids that problem: I can look at the document's .rnw source code and immediately see which NIfTI image is displayed, which directory contains the plotted data, etc.

In addition to (human) memory and reproducibility benefits, the time saved by using knitr instead of Word for analysis summary documents is substantial. Need to change a parameter and rerun an analysis? With knitr there's no need to spend hours updating the images: just change the file names and parameters in the knitr document and recompile. Similarly, the color scaling or displayed slices can be changed easily.

Using knitr is relatively painless: if you use RStudio. There is still a bit of a learning curve, especially if you want fancy formatting in the text parts of the document, since it uses LaTeX syntax (this tutorial contains enough to get going with, though). But RStudio takes care of all of the interconnections: simply click the "Compile PDF" button (blue arrow) ... and it does!

installation and preparation

If you haven't already, first install RStudio and then the knitr package (use the GUI or install.packages("knitr")). If you're not using LaTeX already, I suggest installing TinyTeX next, which may be the only LaTeX compiler you need (it seems to vary with operating system; see troubleshooting notes below). TinyTeX is ease to install in R: first the tinytex package (use the GUI or install.packages("tinytex")), then tinytex::install_tinytex().

RStudio defaults .rnw files to Sweave, but this tutorial .rnw is in knitr, so you MUST change the RStudio setting. To do this, go through Tools then Global Options in the top RStudio menus to bring up the Options dialog box, as shown here. Click on the Sweave icon, then tell it to Weave Rnw files using knitr (marked with yellow arrow). If you installed tinytex, you should also check the Use tinytex when compiling .tex files box (marked with blue).

Then click OK to close the dialog box.

testing

I suggest you next test if everything is working. Start a blank document by selecting New File -> R Sweave (yes, "Sweave"). Type something between the \begin{document} and \end{document} lines, like shown here. Then save the file with the .rnw suffix; I strongly suggest saving it in an empty directory, as many temporary files will be created during compilation. Finally, click the Compile PDF button, and a new window containing the PDF should appear.

I named the test file test.rnw and saved it in an empty directory, but after compilation the directory has six files: the original test.rnw, the compiled test.pdf, plus test.log, test.syntex.gz, test.tex, and test-concordance.tex. (When compiling files with images a subdirectory with image files will also be created; enabling caching will lead to another subdirectory.) Only the .rnw and .pdf files need to be kept; every time the .rnw is compiled all the same other files will be recreated.

If the test file compiled successfully, you can try the base R graphics demo: download knitrIntro_baseRgraphics.rnw from the osf site, save it locally into its own directory, compile it, and you should have a pdf very similar to the one I posted.

troubleshooting

Starting with a very simple test document (like my hello world example above) is a good way to determine if the problem is with the knitr/R/LaTeX installation or a particular file. Invalid LaTeX code (e.g., _ instead of \textunderscore) or invalid R syntax (e.g., x <- c(1,);) can make odd error messages, but will only affect individual files.

The RStudio default for .rnw must be changed to knitr (see above); if it is not the file will not be compiled, plus some strange side effects. Error messages may include "LaTeX Error: File `ae.sty' not found." and "==> Fatal error occurred, no output PDF file produced!". Also, at least one \SweaveOpts line is usually added to the source .rnw, such as below. To correct, you must both change the global options to knitr AND delete all lines of code starting with \Sweave from your source file.

Sometimes it seems necessary to install MiKTeX (set it to 'Always install missing packages on-the-fly') as well as tinytex. If you have suggestions for which operating systems/situations require more than tinytext, please share them.

UPDATE 7 April 2020: added the note about needing MiKTeX as well as TinyTeX.

UPDATE 5 November 2020: added a link to the knitr tutorial on surface and volume brain image plotting.

UPDATE 2 September 2021: rearranged and expanded the installation, testing, and troubleshooting sections.

Wednesday, February 26, 2020

when making QC (or most any) images, don't mask the brain ...

Some preprocessing pipelines/programs mask out the space around the brain, while others don't. Masking makes the files a bit smaller and a lot neater looking, but also makes it harder to spot artifacts or other errors, as was highlighted for me just now when comparing results across pipelines.

These two images are from button-pressing (positive control; should have a hot "blob" of left hand motor-ish activity) GLMs for the same person, two different tasks (Axcpt and Cuedts), three different sessions (B, P, R). All tasks are performed within each session, but each session is collected on a different day - the two B from the same day, the two P, etc. These two sets of images are the same, but from two different pipelines:

Spot the error? There was a coil error on the day the P ("proactive") session was acquired, which is much easier to spot in the second (non-masked fmriprep) output. Those were GLM results; here's how it looks in the standard deviation images (standard deviation of each voxel over all the frames in each run), from each pipeline, for the (problematic) proactive session:

It's clear in both versions that something went wrong, but MUCH easier to see the artifact when the brain has not been masked.

For completeness, here's the surface version of the standard deviation from fmriprep preprocessing; I don't have the HCP pipeline version handy at the moment but expect it to look similar. Brain masking is of course not optional for surfaces so we can't look "around" the brain, but the "striping" pattern is abnormal in this image (reflecting the artifact instead of the brain structure).

I've often recommended against masking the brain in working files to make artifacts and oddities easier to spot, but the difference masking makes in this case is especially striking. (And this session will not be included in any real analyses!)

Monday, January 13, 2020

working with surfaces: musings and a suggestion

First, some musings about surface datasets. I'm not convinced that surface analysis is the best strategy for human fMRI, particularly task-based fMRI. I don't dispute that the cortex is a sheet/ribbon/surface; what I'm skeptical about is whether fMRI resolution is (or could be) anywhere close to good enough to accurately keep track of the surface over hours of scanning. (More grey matter musings here.)

I also have concerns about how exactly to do the surface interpolation: MRI collects volumes, from which surfaces must be interpolated, and there are multiple algorithms for finding the surface (in the anatomical images) and performing the transformation (including of the corresponding fMRI images). None of these are remotely trivial procedures, and I have a general bias against adding complexity (and experimenter degrees of freedom) to preprocessing pipelines/analyses.

For a practical issue, quality control and interpreting results on the surface is more difficult than in the volume. I prefer omitting masking during volumetric analysis, which allows checking if the "blobs" fall in an anatomically-sensible way (e.g., are the results stronger in the grey matter or the ventricles?). This type of judgment is not possible on the surface, because it only includes grey matter. For a more extreme example, in the image below the top row is from buggy code that randomized the vertex assignments, while the second row are results with a weak effect, but correct assignments (ignore the different underlay surfaces). The two images don't look all that different on the surface, but a parallel error in the volume would make salt-and-pepper all over the 3d frame: something clearly "not brain-shaped" and so easier to spot as a mistake.

Even with those reservations, I often need to use surface images (and the benchmark GLM results look reasonable, so it does seem to work ok); here are some suggestions if you need to as well.

Key: surfaces are surfaces (made up of vertices) and volumes are volumes (made up of voxels), do not transform back and forth. If you want surface GLM results, conduct GLMs on vertex timeseries; for volume GLM results, conduct GLMs on voxel timeseries.

It is possible to quickly make a surface from a volume (e.g., with a couple of wb_command -volume-to-surface-mapping commands), but that is a very rough process, and should be used only for making an illustration (e.g., for a poster, though even that's not great - see below), NOT for data that will be analyzed further. Instead, generating the surfaces should be done during preprocessing (e.g., with fmriprep), so that you have a pair of gifti timeseries files (one for each hemisphere) in your chosen output space (e.g., subject space, fsaverage5) along with the preprocessed 4d NIfTI file (e.g., subject space, MNI). (Note: I have found it easiest to work with surfaces as gifti files; cifti files can be separated into a pair of giftis and a NIfTI.)

Why this emphasis on keeping surfaces as surfaces and volumes as volumes? The main reason is related to the fuzziness and complexity in creating the surfaces. In both the fmriprep and HCP surface preprocessing pipelines vertices and voxels do not correspond at the end: you can't find a voxel timeseries to match each vertex timeseries (or the reverse). Instead, the vertex timeseries are a weighted mix of voxels determined to correspond to each bit of cortical ribbon, perhaps followed by parcel-constrained (or other) smoothing. It may be theoretically possible to undo all that weighting and interpolation, but no documented method or program currently exists, as far as I know. More generally, information is lost when transforming volumes to surfaces (and the reverse), so it should be done as rarely as possible, and the "rarest" possible is once (during preprocessing).

What can you do with giftis? Pretty much everything you can do with NIfTIs (plotting, extracting stats, running GLMs, etc.), though a bit of tweaking may be needed. For example, we use afni for GLMs, and many of its functions can now take gifti images. I use R for nearly all gifti plotting and manipulation, as in this tutorial, which includes my plotting function, and turning vertex vectors into surface images.

This snippet of code illustrates working with giftis in R (calculating vertex-wise mean, standard deviation, and tsnr for QC): in.fname is the path to a gifti timeseries (such as produced by fmriprep). It is read with readGIfTI from the gifti library, than rearranged into a vertex x frame matrix. Once in that form it's trivial to calculate vertex-wise statistics, which this code writes out in plain text; they could also be plotted directly.

      in.img <- readGIfTI(in.fname)$data;  # just save the vertex timeseries part  
      in.tbl <- matrix(unlist(in.img), nrow=length(in.img), byrow=TRUE); # now vertices in rows, timepoints in columns  
   
      out.tbl <- array(NA, c(ncol(in.tbl), 3));  # one row for each vertex  
      colnames(out.tbl) <- c("sd", "mean", "tsnr");  
   
      out.tbl[,1] <- apply(in.tbl, 2, "sd");  
      out.tbl[,2] <- apply(in.tbl, 2, "mean");  
      out.tbl[,3] <- out.tbl[,2]/out.tbl[,1];  # tsnr: mean/sd  
      write.table(out.tbl, out.fname);

I try to carry this "surfaces as surfaces and volumes as volumes" strategy throughout the process, to keep the distinction obvious. For example, if a parcel-average timecourse was derived from surface data, show it next to a surface illustration of the parcel. If a GLM was run in the volume, show the beta maps on volumetric slices. Many people (myself included!) have made surface versions of volume data for a final presentation because they look "prettier". This was perhaps acceptable when only volumetric analyses were possible - we knew the surface image was just an illustration. But now that entire analyses can be carried out in the surface or volume I think we should stop this habit and show the data as it actually is - volumes can be "pretty", too!

MVPA Meanderings