Friday, January 10, 2025

intro to working with volume and surface brain data

When preparing to update my surface (gifti) tutorial knitr, I realized that some of the confusion I was trying to address wasn't due to the code so much as the general differences between volume and surface datasets. This post tries to fill in some of those gaps for people using my R scripts. Much of the information in this post is general, but there are many topics I am not going to cover at all - what I describe here is not the only way to work with fMRI brain data; there are many other file formats and software packages.

first, volumes.

Volumetric brain data files come first, both historically (all fMRI analyses were originally volumetric) and fundamentally: we (including our brains) exist in three dimensions, as does the data that comes off the scanner (often as DICOM files, which neuroimagers usually promptly convert to NIfTI, e.g. via dcm2niix when making a BIDS dataset).

Consider a standard anatomical image, such as the HCP 1200 Group Average, which makes a good default underlay for adult MNI datasets (S1200_AverageT1w_81x96x81.nii.gz is a downsampled version used in my tutorials). Opening S1200_AverageT1w_81x96x81.nii.gz with MRIcron shows slices for each of the three planes (below left), and the schematic at right shows how the data comes out of a NIfTI image:


This image's data is a big cube subdivided into a grid of little cubes (voxels; volumetric pixels), each 2.4 mm on a side. It has 81 voxels along the i axis, 96 along the j, and 81 along the k, for 81*96*81 = 629,856 voxels total. There is a number in every one of these 629,856 voxels: the voxels fill the entire cube, not just the brain part in the middle. Thus, when you plot slices through a NIfTI it hopefully makes a brain-shaped picture, but that's not guaranteed - you can put any shape you want into a NIfTI file. (And hence why you should always include a "does it look like a brain" QC step.)
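To make the indexing concrete, here's a base R sketch (no NIfTI reading involved): a toy 3d array with the same dimensions as the tutorial image, showing that a volume is just numbers at [i, j, k] positions.

```r
# toy "volume": a 3d array with the same dimensions as the tutorial NIfTI
img <- array(0, dim=c(81, 96, 81));   # all voxels start at 0
length(img);                          # 629856: the voxels fill the whole cube
img[25, 56, 37] <- 605.887;           # set a single voxel via [i, j, k] indexing
sum(img != 0);                        # 1: only that voxel is nonzero
```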

The code below opens the NIfTI in R with the splendid RNifti package and shows us the same information as the MRIcron screen capture above: the voxel under the cursor (at 25, 56, 37; red scribble) has value 605.887 (green scribble). (Warning: these match because both R and MRIcron are 1-based; if you're using something 0-based, like afni or python, you will need to shift the indices.)
 library(RNifti); # package for NIfTI image reading https://github.com/jonclayden/RNifti  
   
 fname <- paste0(in.path, "S1200_AverageT1w_81x96x81.nii.gz");  # https://osf.io/6hdxv/ ; in.path: directory holding the downloaded files  
 if (file.exists(fname)) { under.img <- readNifti(fname); } else { stop(paste("missing:", fname)); }  
   
 str(under.img)  
 # 'niftiImage' num [1:81, 1:96, 1:81] 0 0 0 0 0 0 0 0 0 0 ...  
 # - attr(*, "pixdim")= num [1:3] 2.4 2.4 2.4  
 # - attr(*, "pixunits")= chr [1:2] "mm" "Unknown"  
 # - attr(*, ".nifti_image_ptr")=<externalptr>   
 # - attr(*, ".nifti_image_ver")= int 2  
   
 under.img[25,56,37]  
 # [1] 605.887  
Note that under.img is a 3d array of numbers with various attributes; its pixdim and pixunits match those in the MRIcron "NIfTI Header Information" dialog in the screen capture (open with Window -> Information). There are many more NIfTI header fields; the header information is what makes it possible for software like MRIcron to match the 3d array of numbers in the NIfTI with actual space. 
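As a concrete illustration of the 1-based vs. 0-based warning above (using the coordinates from the screen capture):

```r
# the voxel displayed at [25, 56, 37] in R or MRIcron (1-based indexing) ...
ijk.1based <- c(25, 56, 37);
# ... is [24, 55, 36] in 0-based software such as afni or python
ijk.0based <- ijk.1based - 1;
ijk.0based   # 24 55 36
```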

We don't need to implement every NIfTI header field or work out warping and realigning on our own, thankfully! The key is to think of the volumetric images as a 3d (or 4d, if functional timeseries - lots of 3d images stuck together along the 4th dimension) array of numbers with important properties attached. If you have multiple NIfTI files you can treat them as normal arrays of numbers, so long as all the files have matching properties (same voxel sizes, orientation, etc. - compatible headers). Thankfully, all the main fMRI software programs have functions for making image properties match; here's my afni example.
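A sketch of the "normal arrays of numbers" idea, with toy arrays standing in for two readNifti() results (assuming the real images have compatible headers):

```r
# two toy arrays standing in for NIfTI images on the same grid
img1 <- array(rnorm(81*96*81), dim=c(81, 96, 81));
mask <- array(0, dim=c(81, 96, 81));
mask[30:50, 40:60, 30:50] <- 1;          # a box-shaped "ROI"

stopifnot(all(dim(img1) == dim(mask)));  # sanity check: same grid, or the math is invalid
mean(img1[mask == 1]);                   # mean of img1 within the mask voxels
```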

why surfaces?

I wrote above that neuroimaging data (and our brains!) start in 3d volume space; why would we want to think about it any other way? Well, when doing fMRI the goal is often to analyze the cortical grey matter BOLD signal, and we know the human cortex is physically a folded sheet (cortical "ribbon"): a convoluted surface.

Figure 2 from doi:10.1007/s00429-012-0411-8, PMID: 22488096 (below) shows a cortical surface (lots more info in the paper and elsewhere); the blue scribble I added to pane c follows the grey matter ribbon.


In principle, if the preprocessing algorithms find this grey matter ribbon we can concentrate our analyses there and have better precision and stronger results. There are a lot of complications in finding and transforming the grey matter ribbon of any particular person into a surface (a few of my musings; Andrew Jahn's), but we won't explore any of those here. The explanations below use standard images from FreeSurfer; to get a surface version of a new dataset I recommend preprocessing with software (e.g., fmriprep) which will output gifti files (or whatever format you want) directly (don't, for example, preprocess completely as volumes and then convert to surfaces for analysis).

introduction to gifti files

A single NIfTI file is enough for an entire brain, but not a single gifti file: gifti files come in paired sets (see also). The "paired" is because each gifti file has info for one hemisphere only: for the whole cortex you need a pair of files, one left and one right. And the "sets" is because gifti files come in multiple types, labeled by the bit of the filename before the .gii. Here we'll only use two types: .surf.gii surface "underlay" images, and .label.gii parcel labels.

For my gifti knitr tutorial I shared several sets of paired gifti files; here we'll look at the Schaefer2018 parcellation fsaverage5 ones in R, using the gifti library's readGIfTI function. (I converted the original files to gifti using FreeSurfer's mris_convert function.)

First, read four files: a pair each of .surf.gii and .label.gii:
 library(gifti);  # for readGIfTI  https://github.com/muschellij2/gifti  
   
 # surface "underlay anatomy"  
 fname <- paste0(in.path, "fsaverage5_lh.inflated_avg.surf.gii");  # https://osf.io/nxpa7/  
 if (file.exists(fname)) { surf.L <- readGIfTI(fname); } else { stop("can't find fsaverage5_lh.inflated_avg.surf.gii"); }  
 fname <- paste0(in.path, "fsaverage5_rh.inflated_avg.surf.gii");  # https://osf.io/24y9q/  
 if (file.exists(fname)) { surf.R <- readGIfTI(fname); } else { stop("can't find fsaverage5_rh.inflated_avg.surf.gii"); }  
    
 # Schaefer parcellation "overlay" surfaces for each hemisphere   
 fname <- paste0(in.path, "Schaefer2018_400Parcels_7Networks_order_10K_L.label.gii");  # https://osf.io/7wk8v/  
 if (file.exists(fname)) { parcels.L <- readGIfTI(fname); } else { stop(paste("missing:", fname)); }  
 fname <- paste0(in.path, "Schaefer2018_400Parcels_7Networks_order_10K_R.label.gii");  # https://osf.io/7tms8/  
 if (file.exists(fname)) { parcels.R <- readGIfTI(fname); } else { stop(paste("missing:", fname)); }   
   

The next code looks a bit at the list objects made with readGIfTI(). For plotting we only need the $data fields, but I suggest investigating the other entries as well to gain a sense of the type of information in them.

 str(surf.L$data);  
 # List of 2  
 # $ pointset: num [1:10242, 1:3] -0.685 22.359 41.099 9.658 -36.773 ...  
 # $ triangle: int [1:20480, 1:3] 0 0 0 0 0 3 3 3 4 4 ...  
   
 str(surf.R$data)  
 # List of 2  
 # $ pointset: num [1:10242, 1:3] -5.74 8.43 43.5 12.6 -40.33 ...  
 # $ triangle: int [1:20480, 1:3] 0 0 0 0 0 3 3 3 4 4 ...  
   
 str(parcels.L$data)  
 # List of 1  
 # $ labels: int [1:10242, 1] 57 81 49 180 40 132 14 22 0 143 ...  
   
 str(parcels.R$data)  
 # List of 1  
 # $ labels: int [1:10242, 1] 61 81 46 110 111 200 21 95 104 119 ...  
Notice that the two surf objects each have two component arrays ($pointset and $triangle), while each parcel object has only a vector of integers. The parcel vectors' length is 10,242, which matches the number of rows in the surface pointset arrays; for plotting, the gifti data sizes must match like this. Why 10,242? That's the number of vertices per hemisphere in fsaverage5; other surface spaces have different numbers of vertices.
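Some quick bookkeeping on a label vector is often useful at this point. This is a sketch with made-up labels; with the real files, substitute parcels.L$data[[1]] for the fake labels vector:

```r
set.seed(1);
labels <- sample(0:200, 10242, replace=TRUE);  # fake labels; 0 = vertex not in any parcel
length(labels);       # 10242: must equal the number of pointset rows
sum(labels == 60);    # how many vertices fall in parcel 60
range(labels);        # smallest and largest label present
```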

Triangles and vertices? These terms come from how surface meshes can be represented in computer science: points (vertices, orange arrow) connected by edges of specific lengths, making lots of little triangles, as in this part of the FreeSurfer website brain I enlarged:
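As a quick consistency check on the mesh counts from str() above: each hemisphere mesh is topologically a closed sphere, so the vertex, edge, and face counts should satisfy Euler's formula V - E + F = 2.

```r
V <- 10242;    # vertices: rows of $pointset
F <- 20480;    # triangular faces: rows of $triangle
E <- 3*F/2;    # each triangle has 3 edges; each edge is shared by 2 triangles
V - E + F      # 2, as expected for a closed sphere-topology mesh
```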


using and plotting giftis

The key idea when working with gifti images using my scripts is that we manipulate the vectors of vertex-wise data, not the .surf.gii files (nor any other spatial information). In a given surface space (here, fsaverage5) the location of each vertex as ordered in the 10,242-element vectors is fixed, so we can manipulate its value as needed, as shown in the next examples.

This code produces the image below. The left hemisphere is blank: all its vertices (vals.L) are 0. For the right I again started with a vector of 10,242 0s, but then changed its 1,800th entry to 1, producing a brain picture with a red splotch at the location of vertex 1800. 
 vals.L <- rep(0, 10242); # start with all vertices 0  
 map.L <- gifti.map(surf.L$data$pointset, surf.L$data$triangle, vals.L); # map vals.L onto left surface  
   
 vals.R <- rep(0, 10242); # start with all vertices 0  
 vals.R[1800] <- 1; # change vertex 1800 to 1  
 map.R <- gifti.map(surf.R$data$pointset, surf.R$data$triangle, vals.R);  # map vals.R onto right surface  
   
 # plot lateral and medial of both hemispheres  
 plot.surface(map.L, ttl="left hemisphere, blank (all vertices 0)", pos.lims=c(0.5,5.5), which.surface='fsaverage5');   
 plot.surface(map.R, ttl="right hemisphere, vertex 1800 in red", pos.lims=c(0.5,5.5), which.surface='fsaverage5');   


And a final example (more are in the knitr tutorial and other demo code): the next code plots (Schaefer 400x7) parcel 60 on the left hemisphere, and all 200 parcels on the right hemisphere. How to know which of the 10,242 vertices correspond to left hemisphere parcel 60? The key is parcels.L (created above from Schaefer2018_400Parcels_7Networks_order_10K_L.label.gii), which has a 10,242-element vector of parcel labels: we just need to find the 60s.

 vals.L <- rep(0, 10242); # start with all vertices 0  
 inds <- which(parcels.L$data[[1]] == 60); # find the indices of the 60s in the 10,242-element vector parcels.L$data[[1]]  
 vals.L[inds] <- 4;  # change indices in vals.L for parcel 60 to 4  
 map.L <- gifti.map(surf.L$data$pointset, surf.L$data$triangle, vals.L);  # map vals.L onto left surface  
 map.R <- gifti.map(surf.R$data$pointset, surf.R$data$triangle, parcels.R$data[[1]]); # map all parcels.R vertices onto right surface  
   
 # plot lateral and medial of both hemispheres  
 plot.surface(map.L, ttl="left hemisphere, parcel 60 in orange", pos.lims=c(0.5,5.5), which.surface='fsaverage5'); # plot lateral and medial, left  
 plot.surface(map.R, ttl="right hemisphere, parcels 1 to 200", pos.lims=c(0.5,200.5), which.surface='fsaverage5'); # plot lateral and medial, right
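A common extension of the parcel-plotting idea above is to paint each parcel with a per-parcel statistic. Here's a sketch with made-up data: labels stands in for parcels.L$data[[1]], and stat.per.parcel for whatever statistic you computed (e.g., mean activity per parcel); every vertex gets the value of its parcel, producing a vector that can be passed to gifti.map() like vals.L above.

```r
set.seed(1);
labels <- sample(0:200, 10242, replace=TRUE);   # stand-in parcel labels; 0 = no parcel
stat.per.parcel <- rnorm(200);                  # made-up statistic, one value per parcel

vals <- rep(0, 10242);   # start with all vertices 0, as in the examples above
for (p in 1:200) { vals[labels == p] <- stat.per.parcel[p]; }
# vals is now vertex-wise and can be mapped with gifti.map() as before
```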