Wednesday, April 22, 2015

ROI-based MVPA: surface and volume comparisons, using HCP data

How should we do a "standard" ROI-based MVPA: on the surface or in the volume? There are a few mass-univariate (e.g., those cited in the introduction of Glasser et al., 2013) and searchlight-based comparisons in the literature, but not many, and not for larger ROIs (send some pointers if I just missed 'em).

This post describes a direct comparison: carrying out the same ROI-based task-classification MVPA on both surface and volume data. It uses the working memory task fMRI data collected in the HCP, which should be some of the best-preprocessed data available now, with the Gordon et al. (2014) communities as ROIs. Since the Gordon analyses were also done on the surface, I expect these masks to be especially suited for surface analyses. The Gordon parcellation is available perfectly aligned to the HCP volume and surface data, so no transformations (resampling, warping, etc.) are needed.

My intent was to design these analyses to be as "apples-to-apples" as possible, allowing direct comparisons between the surface and volume results. The analyses in this post were carried out in two (non-intersecting) sets of unrelated people (190 in group 1, 160 in group 2) from the 500-subjects HCP data release. I used ten-fold cross-validation in each group (leave-19-out for group 1; leave-16-out for group 2), determining the cross-validation folds ahead of time, so the same people made up the training and testing sets for both the volume and surface analyses and all four pairwise classifications (my intent being to eliminate as many sources of variation unrelated to the surface-vs-volume difference as possible). All classifications were with linear SVM, c=1, on the HCP-provided cope images ("parameter estimate images", in my usual parlance), one per person per class (here, 0BK_FACE, 0BK_PLACE, 2BK_FACE, 2BK_PLACE).
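For concreteness, here's a minimal sketch of what one cross-validation fold for one ROI and one pairwise classification could look like, using the e1071 package's linear SVM. The e1071 package, the roi.data data frame, and the other object names are illustrative assumptions, not the actual analysis code:

 library(e1071);   # provides svm(); assumed here, since the post doesn't name the SVM implementation
 # roi.data: hypothetical data frame, one row per cope image, with columns sub.id (subject ID),
 # target (class label string, e.g. "0BK_FACE"), and the ROI's voxel (or vertex) values in columns v1, v2, ...
 # test.subs: the subject IDs making up this fold's test set (fixed ahead of time)
 do.fold <- function(roi.data, test.subs, class.a, class.b) {
   fold.data <- subset(roi.data, target %in% c(class.a, class.b));   # keep only the two classes being classified
   train.rows <- which(!(fold.data$sub.id %in% test.subs));
   test.rows <- which(fold.data$sub.id %in% test.subs);
   vox.cols <- grep("^v", colnames(fold.data));    # the columns holding the voxel/vertex values
   fit <- svm(x=fold.data[train.rows, vox.cols], y=factor(fold.data$target[train.rows]),
              kernel="linear", cost=1, scale=FALSE);    # linear SVM, c=1
   preds <- predict(fit, fold.data[test.rows, vox.cols]);
   mean(as.character(preds) == fold.data$target[test.rows]);    # this fold's accuracy
 }

Since the folds were fixed ahead of time, the same test.subs list goes into the surface and volume versions of each fold.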

The analyses consisted of four pairwise classifications of the working memory task data, for each group of subjects and each type of image (surface or volume). Briefly, the working memory task was a blocked version of the n-back, with blocks of either 0-back or 2-back, performed with different categories of pictures. I can thus classify both the picture type used in the block (here, face or place) and the n-back level (0 or 2), and make sensible predictions (since these are well-understood tasks), such as that the Visual community should classify face vs. place much better than 0-back vs. 2-back.

These busy graphs show the results. The first graph has the results for the Group 1 subjects, and the second, the Group 2 subjects (i.e., the replications: the same analyses in the two groups of people). The accuracies from the volume data are plotted as solid lines with dots; those from the surface data, as dashed lines with xs. The abbreviations along the x-axis are the 13 communities in the Gordon2014 parcellation. There are two points for each community: left hemisphere on the left side of the line, right hemisphere on the right. The area below 0.6 accuracy is shaded because accuracies in that range are unlikely to be meaningful (i.e., can be interpreted as chance).



The classification results are sensible: the Visual community classifies face vs. place extremely well; SMmouth doesn't classify anything; FrontoParietal classifies 0-back vs. 2-back. There's some variation in the details between the two replications (e.g., right-hemisphere None classifies better than left in Group 1, but about the same in Group 2), but the basic pattern of which communities do which classifications is the same in both.

For the surface and volume comparisons, my main impression of the results is of similarity: the accuracies produced from the surfaces and volumes tend to be very close for each community and classification (in the graphs, the dashed and solid lines of each color follow each other). The difference between the surface and volume versions looks smaller than the difference between Group 1 and Group 2 (the replications), making me conclude that the surface and volume versions classified about the same - basically equivalent results either way.

Should this be surprising? Perhaps not: I used the same communities in each analysis, so it's reassuring that they reported basically the same information in the surface and volume versions. The volumetric Gordon community masks closely follow the grey matter (as do the surface reconstructions, of course), so we'd hope they capture the same brain areas. The results would presumably vary more if "lumpier" volumetric ROIs were projected onto the surface.


The number of voxels (volume) in each community is larger than the number of vertices (surface), particularly for larger communities, as shown here (grey line is x=y). This has implications for classification accuracy when using linear SVMs (like I did here), since they can be more likely to detect information when there are more weakly-informative features: more voxels could give an advantage to the volume-based analyses, if they were (even weakly) informative.

We might guess that the surface version would do better than the volume, if the volume included uninformative voxels that weren't assigned a surface vertex, but that doesn't seem to have happened here (perhaps because of the tight grey matter alignment in the volume masks), at least not enough to reduce the accuracy. We might also guess that the surface version would be worse than the volume, for example if preprocessing caused some activity to appear to be in adjacent non-grey-matter areas (which then wouldn't be included in the surface version). {The volumetric BOLD signal is blurred in space during preprocessing (e.g., motion correction and spatial normalization), since preprocessing is usually (including for the HCP) done volumetrically, before the conversion to surfaces.} But in this case, the surface and volume versions came out about the same.

So, can we answer the question I posed at the beginning of this post? How should we do a "standard" ROI-based MVPA: on the surface or in the volume? One set of comparisons can't answer the question definitively, of course, and the HCP data is unusual in many respects (e.g. multiband acquisition, small voxels, specialized preprocessing pipelines). There wasn't an advantage to doing the analyses on the surface here, or much of a disadvantage, since the HCP provides surface versions of the copes. But I'd be hesitant to recommend doing MVPA with fairly large ROIs (like the Gordon communities) on the surface in a new study, particularly if you had to generate the surface files yourself: that's a lot of extra work (making the surface versions) for what might be very little benefit.


Many more comparisons are needed to provide general guidance for when analyses should be done on the surface or in the volume. I'm particularly interested in trying more comparisons with dilated ROIs: does accuracy improve if voxels in adjacent non-grey matter are included in the mask? If so, we may be better off analyzing the volume. If you run or run across more comparisons, please let me know, and I'll add them here.

The MVPA results here were done in R, similarly to the ROI-based demo. I followed the steps here to read the functional data out of the surface (cifti) files, using Guillaume Flandin's MATLAB library to read the extracted gifti files and get the values for the vertices corresponding to each Gordon parcel. The post is hopefully detailed enough to allow running similar comparisons; I can provide details to replicate exactly if needed.
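The per-parcel extraction at the end is just indexing, once the vertex values and the Gordon parcel labels are in R; a minimal sketch, with hypothetical object names:

 # surf.vals: vertices x subjects matrix of cope values for one hemisphere
 # parcel.labels: vector with one Gordon parcel label per vertex, in the same vertex order
 parcel.rows <- which(parcel.labels == this.parcel);   # this.parcel: the label of the parcel wanted
 parcel.vals <- surf.vals[parcel.rows, ];              # the vertices in this parcel, ready for classification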

Monday, April 13, 2015

format conversion: 4dfp to NIfTI, plus setting handedness and headers

This post is a tutorial explaining how to convert fMRI datasets from 4dfp to NIfTI format. Most people not at Washington University in St Louis probably won't encounter a 4dfp dataset, since the format isn't in wide use. But much of the post isn't 4dfp-specific, and goes over ways to check (and correct) the orientation and handedness of voxel arrays. The general steps and alignment tests here should apply, regardless of which image format is being converted to NIfTI.

Converting images between formats (and software programs) is relatively easy ... ensuring that the orientation is set properly in the converted images is sometimes not, and often irritating. But it is very, very worth your time to make sure that the initial image conversion is done properly; sorting out orientation at the end of an analysis is much harder than doing it right from the start.

To understand the issues, remember that NIfTI (and 4dfp, and most other image formats) stores two types of information: the 3d or 4d array of voxel values, and how to interpret those values (such as how to align the voxel array to anatomical space, the size of the voxels in mm, etc.), which is stored in a header. This post has more background information, as does this one. The NIfTI file specification allows several different ways of specifying alignment, and both left- and right-handed voxel arrays. Flexibility is good, I guess, but can lead to great confusion, because different programs have different strategies for interpreting the header fields. Thus, the exact same NIfTI image can look different (e.g. left/right flipped, shifted off center) when opened in different programs. That's what we want to avoid: we want to create NIfTI images that will be read with proper alignment into most programs.

Step 1:  Read the image into R as a plain voxel array

I use the R4dfp R package to read 4dfp images into R, and the oro.nifti package to read and write NIfTI images in R. The R4dfp package won't run in Windows, so it needs to be run on a Linux box, a Mac, or within NeuroDebian (oro.nifti runs on all platforms).

 library(R4dfp);    # provides R4dfp.Load
 in.4dfp <- R4dfp.Load(paste0(in.path, img.name));   # read the 4dfp image
 vol <- in.4dfp[,,,];    # extract the voxel array

The 4dfp image img.name at location in.path is read into R with the first command, and the voxel array extracted with the second. You can check properties of the 4dfp image by checking its fields, such as in.4dfp$dims and in.4dfp$center. The dimensions of the extracted voxel array should match the input 4dfp fields (in.4dfp$dims == dim(vol)).
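For example (a few quick checks; the printed values are of course image-specific):

 in.4dfp$dims      # the voxel array dimensions stored in the 4dfp headers
 in.4dfp$center    # the center/origin fields, used for the NIfTI header later
 dim(vol)          # should match in.4dfp$dims
 in.4dfp$dims == dim(vol)    # all TRUE if the extracted array matches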

Step 2:  Determine the handedness of the input image and its alignment

Now, look at the extracted voxel array: how is it orientated? My strategy is to open the original image in the program whose display I consider to be the ground truth, then ensure the matrix in R matches that orientation. For 4dfp images, the "ground truth" program is fidl; often it's the program which created the input image. I then look at the original image, find a slice with a clear asymmetry, and display that slice in R with image().



Here's slice k=23 (green arrow) of the first timepoint in the image (blue arrow). The greyscale image is the view in fidl, which we consider true: the left anterior structure (white arrow) is "pointy". The R code image(vol[,,23,1]) displays this same slice, using hot colors (bottom right). We can see that it's the correct slice (I truncated the side a bit in the screen shot), but the "pointy" section is at the bottom right of the image instead of the top left. So, the voxel array needs to be flipped left/right and anterior/posterior.

Step 3:  Convert and flip the array as needed

This doFlip R function reorders the voxel array. It takes and returns a 3d array, so it should be run on each timepoint individually. It takes a minute or so to run, but I haven't bothered speeding it up, since it only needs to be run once per input image. If you have a faster version, please share!

 # flip the data matrix in the specified axis direction(s). Be very careful!!!  
 doFlip <- function(inArray, antPost=FALSE, leftRight=FALSE) {  
   if (antPost == FALSE & leftRight == FALSE) { stop("no flips specified"); }  
   if (length(dim(inArray)) != 3) { stop("not three dimensions in the array"); }  
   outvol <- array(NA, dim(inArray));  
   IMAX <- dim(inArray)[1];  
   JMAX <- dim(inArray)[2];  
   KMAX <- dim(inArray)[3];  
   for (k in 1:KMAX) {  
     for (j in 1:JMAX) { # i <- 1; j <- 1; k <- 1;  
       for (i in 1:IMAX) {  
         if (antPost == TRUE & leftRight == FALSE) { outvol[i,j,k] <- inArray[i,(JMAX-j+1), k]; }  
         if (antPost == TRUE & leftRight == TRUE) { outvol[i,j,k] <- inArray[(IMAX-i+1),(JMAX-j+1), k]; }  
         if (antPost == FALSE & leftRight == TRUE) { outvol[i,j,k] <- inArray[(IMAX-i+1), j, k]; }  
       }  
     }  
   }  
   return(outvol);  
 }  
  # put the array to right-handed coordinate system, timepoint-by-timepoint.  
  outvox <- array(NA, dim(vol));  
  for (sl in 1:dim(vol)[4]) { outvox[,,,sl] <- doFlip(vol[,,,sl], TRUE, TRUE); }  
  image(outvox[,,23,1])  

The last three lines of code call the function (TRUE, TRUE specifies flipping both left/right and anterior/posterior), putting the flipped timepoint images into the new array outvox. Now, when we look at the same slice and timepoint with the image command (image(outvox[,,23,1])), it is orientated correctly.
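For what it's worth, here's a sketch of a vectorized alternative (doFlipFast is my name for it, not something from the R4dfp or oro.nifti packages); it relies on R's ability to index an array with reversed index vectors, and should give the same output as doFlip, much faster:

 doFlipFast <- function(inArray, antPost=FALSE, leftRight=FALSE) {
   if (antPost == FALSE & leftRight == FALSE) { stop("no flips specified"); }
   if (length(dim(inArray)) != 3) { stop("not three dimensions in the array"); }
   i.inds <- if (leftRight) { dim(inArray)[1]:1 } else { 1:dim(inArray)[1] };   # reversed i indices if flipping left/right
   j.inds <- if (antPost) { dim(inArray)[2]:1 } else { 1:dim(inArray)[2] };     # reversed j indices if flipping anterior/posterior
   return(inArray[i.inds, j.inds, , drop=FALSE]);
 }
 # all(doFlipFast(vol[,,,1], TRUE, TRUE) == doFlip(vol[,,,1], TRUE, TRUE))   # should be TRUE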

Step 4:  Write out as a NIfTI image, confirming header information

A 3d or 4d array can be written out as a NIfTI image without specifying the header information (e.g. with writeNIfTI(nifti(outvox), fname), which uses the oro.nifti defaults). Such a "minimal header" image can be read in and out of R just fine, but will almost certainly not be displayed properly when used as an overlay image in standard programs (e.g., on a template anatomy in MRIcroN) - it will probably be out of alignment. The values in the NIfTI header fields are what programs like MRIcroN and the Workbench interpret to get the registration right.

Some of the NIfTI header fields are still mysterious to me, so I usually work out the correct values for a particular dataset iteratively: writing a NIfTI with my best guess for the correct values, then checking the registration, then adjusting header fields if needed.

  out.nii <- nifti(outvox, datatype=64, dim=dim(outvox), srow_x=c(3,0,0,in.4dfp$center[1]),   
      srow_y=c(0,3,0,in.4dfp$center[2]), srow_z=c(0,0,3,in.4dfp$center[3]))   # 3x3x3 mm voxels; origin from the 4dfp center fields  
  out.nii@sform_code <- 1;   # mark the srow_* fields as valid for alignment  
  out.nii@xyzt_units <- 2;   # for voxels in mm  
  pixdim(out.nii)[1:8] <- c(1, 3,3,3, dim(outvox)[4], 1,1,1);   
   # 1st slot is qfactor, then mm, then size of 4th dimension.   
  writeNIfTI(out.nii, paste0(out.path, img.name))   # writeNIfTI adds the .nii.gz extension  

These lines of code write out a 4d NIfTI with header values for my example image. The input image has 3x3x3 mm voxels, which is directly entered in the code, and the origin is read out of the input fields (you can also type the numbers in directly, such as srow_x=c(3,0,0,72.3)). The image below shows how to confirm that the output NIfTI image is orientated properly: the value of the voxel under the crosshairs is the same (blue arrows) in the original 4dfp image in fidl, the NIfTI in R, and the NIfTI in MRIcroN. The red arrows point out that the voxel's i,j,k coordinates are the same in MRIcroN and R, and the left-right flipping matches (yellow arrow).
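One way to do part of that confirmation without leaving R is to read the new NIfTI back in and check its header fields and a voxel value (a quick sketch; readNIfTI is from oro.nifti):

 chk.nii <- readNIfTI(paste0(out.path, img.name), reorient=FALSE);   # add the .nii.gz extension if your readNIfTI version needs it
 print(chk.nii);        # oro.nifti prints a summary of the header fields
 chk.nii@srow_x;        # should be the srow_x values set above
 chk.nii[19,4,23,1];    # spot-check the voxel value against fidl and MRIcroN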



Here's another example of what a correctly-aligned file looks like: the NIfTI output image is displayed here as an overlay on a template anatomy in MRIcroN (overlay is red) and the Workbench (overlay is yellow). Note that the voxel value is correct (blue), and that the voxel's x,y,z coordinates (purple) match between the two programs. If you look closely, the voxel's i,j,k coordinates (blue) are 19,4,23 in MRIcroN (like they were in R), but 18,3,22 in the Workbench. This is ok: Workbench starts counting from 0, R and MRIcroN from 1.



concluding remarks

It's straightforward to combine the R code snippets above into loops that convert an entire dataset. My strategy is usually to step through the first image carefully - like in this post - then convert the images for the rest of the runs and participants in a loop (i.e., everyone with the same scanning parameters). I'll then open several other images, confirming they are aligned properly.
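Here's a sketch of what such a loop could look like, just wrapping the snippets above; sub.ids, run.ids, and the file-name pattern are placeholders for however your dataset is organized (in.path and out.path are as before):

 library(R4dfp);
 library(oro.nifti);
 for (sub.id in sub.ids) {       # e.g. sub.ids <- c("1001", "1002", ...); run.ids <- 1:4;
   for (run.id in run.ids) {
     img.name <- paste0(sub.id, "_run", run.id);    # placeholder 4dfp file-name pattern
     in.4dfp <- R4dfp.Load(paste0(in.path, img.name));
     vol <- in.4dfp[,,,];
     outvox <- array(NA, dim(vol));
     for (sl in 1:dim(vol)[4]) { outvox[,,,sl] <- doFlip(vol[,,,sl], TRUE, TRUE); }
     out.nii <- nifti(outvox, datatype=64, dim=dim(outvox), srow_x=c(3,0,0,in.4dfp$center[1]),
         srow_y=c(0,3,0,in.4dfp$center[2]), srow_z=c(0,0,3,in.4dfp$center[3]));
     out.nii@sform_code <- 1;
     out.nii@xyzt_units <- 2;
     pixdim(out.nii)[1:8] <- c(1, 3,3,3, dim(outvox)[4], 1,1,1);
     writeNIfTI(out.nii, paste0(out.path, img.name));
   }
 }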

This last image is a concrete example of what you don't want to see: the functional data (colored overlay) is offset from the anatomical template (greyscale). Don't panic if you see this: dive back into the headers.

Friday, April 3, 2015

below-chance classification accuracy: meaningful here?

Below-chance accuracy ... always exciting. It showed up today in an interesting way in a set of control analyses. For framing, this is a bit of the working memory task data from the HCP: 0-back and 2-back task blocks, using pictures of faces or places as the stimuli. This gives four examples (volumetric parameter estimate images) per person: 0-back with faces, 0-back with places, 2-back with faces, and 2-back with places.

The classification shown here was with linear SVM, two classes (all balanced, so chance is 0.5), with leave-16-subjects-out cross-validation. The cross-validation is a bit unusual since we're aiming for generalizability across people: I trained the classifiers on one pair of stimulus classes in the training people, then tested on a different pair of stimulus classes in the testing people. For example, one cross-validation fold is training on 0-back vs 2-back face in 144 people, then testing 0-back vs 2-back place with the 16 left-out people.
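A sketch of one such fold (the object names and the use of the e1071 linear SVM are illustrative assumptions, as in the previous post, not the actual analysis code):

 library(e1071);
 # roi.data: hypothetical data frame, one row per parameter estimate image, with columns sub.id,
 # target ("0BK_FACE", "0BK_PLACE", "2BK_FACE", "2BK_PLACE"), and voxel values v1, v2, ...
 train.data <- subset(roi.data, !(sub.id %in% test.subs) & target %in% c("0BK_FACE", "2BK_FACE"));
 test.data <- subset(roi.data, (sub.id %in% test.subs) & target %in% c("0BK_PLACE", "2BK_PLACE"));
 vox.cols <- grep("^v", colnames(roi.data));
 nback <- function(lbls) { substr(lbls, 1, 3); }    # "0BK" or "2BK": the label actually being classified
 fit <- svm(x=train.data[,vox.cols], y=factor(nback(train.data$target)), kernel="linear", cost=1, scale=FALSE);
 mean(as.character(predict(fit, test.data[,vox.cols])) == nback(test.data$target));   # this fold's accuracy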

Anyway, an ROI-based classification analysis was performed on 6 anatomic clusters (C1:C6), which are along the x-axis in the graphs. The left-side graph shows a positive control-type analysis: as we expected, face vs. place is classified extremely well with these ROIs, but 0-back vs. 2-back is classified at chance. The right-side graph shows some nonsensical, negative control-type analyses, all of which we expected to classify around chance. These are nonsense because we're training and testing on different classifications: for example, training a classifier to distinguish face vs place, then testing with all face stimuli, some of which were from 0-back blocks, others of which were from 2-back blocks.


The striking pattern is that the blue and green lines are quite far below chance, particularly in clusters C1 and C2, which classified face vs place nearly perfectly (i.e., in the left-side graph).

These ROIs classify face vs place very well. When trained on face vs place, but tested with 0-back vs. 2-back (red and purple lines), they classified basically at chance. This makes sense, because the classifiers learned meaningful ways to distinguish face and place during training, but then were tested with all face or all place stimuli, which they presumably would have classified according to picture type, not n-back. This gives a confusion matrix like the one shown here: all face test examples properly classified as face, for 10/20 = 0.5 accuracy.
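As a toy illustration of that arithmetic (the counts and the label mapping are hypothetical, not the actual confusion matrices):

 # 20 test examples, all face pictures: 10 from 0-back blocks and 10 from 2-back blocks.
 # the face-vs-place classifier calls every test example "face", which here maps onto the 0-back label.
 true.class <- rep(c("0back", "2back"), each=10);
 predicted <- rep("0back", 20);
 table(true.class, predicted);      # a single predicted column: everything labeled 0back
 mean(true.class == predicted);     # 10/20 = 0.5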

Now, the classifiers did not learn to classify 0-back vs 2-back in these ROIs (left-side graph above). To get the observed below-chance accuracies, the confusion matrices would need to be something like this. Why? It is sort of plausible that the classifier could "know" that the place test examples are not face, and so split them equally between the two classes. But if the classifier properly classifies all the 2-back face examples here (as needed to get the below-chance accuracy), why wasn't 0-back vs 2-back properly classified before?

I'll keep looking at this, and save some of the actual confusion matrices to see exactly how the below-chance accuracies are being generated. It's not quite clear to me yet, but the striking pattern in the below-chance accuracies here makes me think that they might actually carry some meaning in this case, and perhaps give some more general insights. Any thoughts?