Friday, April 26, 2013

caret: plot a NIfTI image

Ah, Caret. Such beautiful images, such a learning curve. In the hopes of saving others some frustration, this tutorial shows how to use caret to make a nice image out of a nifti file - the nifti made in the previous post and available for download here. I'm using caret version 5.65, Jan 27, 2012.


The first step is to get caret and an anatomic template. I had no difficulty registering then downloading the files from the download page; my only confusion was that you do not (at least on Windows) need to install the program: you simply unzip it to wherever you want it to sit on your hard drive, then double-click the exe to start the program. Specifically, I unzipped the caret download to d:/caret/, so start it by double-clicking d:/caret/caret/bin_windows64/caret5.exe in the file explorer.

But, to do anything useful we need to download an atlas, not just the program. A caret-saavy coworker suggested that the atlas_Conte69_74k_pals is appropriate for displaying my SPM8-then-R-manipulated MNI-normalized files. The interface is rather unusual, but you should be able to download that atlas on this SumsDB page. Make sure you get the set that is made for caret 5, not the Workbench. The download is a zip archive containing a bunch of files. I expanded the zip into d:/caret/atlas_Conte69_74k_pals/; make note of where the files are unzipped.

getting a blank brain

Start up caret (double-click or whatever, as explained above). On my windows box it first opens a little dos window, then a GUI with a picture of caret's friendly carrot. Now we need a blank brain to put the nifti image's clusters onto. The example nifti is aligned to the MNI template, SPM8-style, so we can tell caret that when we're loading in the blank brain.

In caret, select File, then Open Spec File ... Navigate to the place where you put the atlas_Conte69_74k_pals files, and select pals.L.atlas.74k_pals.spec. Note the .L. in the middle of the filename: we're loading the left hemisphere blank brain only. To get the right, do the same steps, but pick the .R. file. It's ok that the nifti has both hemispheres, if all goes well, caret will put the correct clusters onto the correct hemispheres.

Now another window pops up. This lists some of the things that are linked (or whatever) to the .spec file we just opened. There are a few things that need to be set in this big window: which blank brains to load (flat, inflated, fiducial) and the atlas space we want to use.

In this case, the nifti image is MNI, so I set the Space box to SPM5 (it doesn't have an SPM8 option but SPM5 is compatible). Then, check the boxes in the Coordinate Files spot for the blank brains you want to have available. In the example here, I picked all of the Conte69 ones except FLAT, because I don't want to see a flat map. Now click Load.

With luck, the option screen will disappear and you will see a blank brain in the main caret window, as below.

changing the main viewing window

The main viewing window has buttons to rotate the brain, whether it is blank or has blobs mapped on.

The basic ones are circled at left. Clicking the little M L A P etc. buttons causes the brain to rotate to particular views - I clicked the L ("lateral") button to have the side view shown here.

The "Model" dropbox lets you pick which blank brain you want to see - how inflated. In this screenshot we are looking at the FIDUCIAL model - the most "bumpy" one in this set of blank brains. Change the model to INFLATED or VERY INFLATED to see inflated models. Note that these are the maps we picked by selecting the check-boxes on the previous screen (above picture - the screen that appears when you load a .spec file).

overlaying a NIfTI image

Now that we have the blank brain we can put our clusters on it. I'm assuming here that we have the clusters in a properly-formatted image file - that the header information with the orientation information has been set correctly. I'm also assuming that we indicated the space matching our nifti when we loaded the blank brain.
  1. In the caret window, click Attributes, then Map Volume(s) to Surface(s).
  2. Leave "Data Mapping Type" as "Metric (Functional) or Surface Shape Data" and click Next.
  3. Click the "Add Volumes From Disk ..." button and navigate to the image you want to overlay, such as fakeBrain.nii.gz. After you select it, its name will appear in the big white box with the word FILE. Click the Next button.
  4. In the next window ("Spec File and Surface Selection") click the "Map to Caret ..." button, then OK in the little window that pops up. Now the big white box should say "Caret (1 surfaces)". Click the Next button.
  5. Click the Next button on the "Data File Naming" screen without changing anything.
  6. Select "METRIC_INTERPOLATED_VOXEL" on the "Mapping Algorithm" screen, then click the Next button.
  7. Click Finish then Close.
The big window will disappear, showing the blank brain in the caret window again ... and it's still blank. Don't panic: caret doesn't show the blobs until you tell it to.

Click the little D/C button in the main caret window (to the right of the buttons that rotate the image). This brings up a big "Display Control" window.
Select the "Overlay/Underlay - Surface" page (like in this screenshot) if it didn't open to that page.

Set the "Primary Overlay" "Data Type" to "Metric" (as highlighted in green). This should cause the "Display Column" and "Threshold Column" boxes to automatically have the name of the image we loaded - fakeBrain.nii.gz. Click the "Close" button to close the window.

If following these steps, you should now see a colorful group of voxels on the medial wall of the template (click the "M" button in the viewing window to rotate). The "Metric Settings" page of the "Display Control" window lets you adjust the scale, set a different palette, and show a color key.


If you select an image as the Primary Overlay in the Display Control but no blobs appear, check that the orientation information is set properly in overlay image, such as by opening up the image in another program and checking if it is orientated as expected. Caret won't necessarily make error messages when it can't determine where to put an overlay.

Caret has many, many more features than those used in this tutorial. Some of the options chosen here may not be suitable for your data; be sure you check the images to confirm the clusters appear in the expected locations.

UPDATE (20 September 2013): directions for creating this overlay in the Connectome Workbench (successor to caret) are posted here.

working with NIfTI files in R: oro.nifti code

In the previous post I introduced a few ideas and conventions useful for working with NIfTI files (and most other neuroimaging formats). In this post I'll show how to read and write NIfTI files using R code, specifically, oro.nifti methods. The R code is available here, along with the nifti file it generates.

creating the NIfTI file

Before we write the NIfTI file, we need some data. For this demo, I copied the location of a few clusters from a real fMRI image, both the i,j,k coordinates (which cells of the 3d array correspond to the clusters) and the header information that puts those coordinates in the correct location in MNI space (copied from the spm-generated .nii containing the clusters).

Here is the NIfTI header information in the SPM8-generated image I used as the basis of the example, as shown in MRIcroN. I know that the image has been spatially normalized to match the MNI template.

In R, I start by making an empty 3d array to hold my data:  
img <- array(0, c(40,48,34));
I picked this size by reading off the i, j, k space dimensions of my template image.Then, we can set some numbers in the array to represent "activation clusters". See the code for the full version, but this just consists of change array values, like img[21,15,19] <- 1;.

Now, we have a 3d array of numbers.  The nifti function in the oro.nifti package collects the array and the header information to make a nifti that will be displayed properly. Again, see the code for the commented version, but here are the options I believe are correct for this image:
out.img <- nifti(img, datatype=64, dim=dim(img), 
srow_x=c(4,0,0,-78), srow_y=c(0,4,0,-112), srow_z=c(0,0,4,-50),
qoffset_x=-78, qoffset_y=-112, qoffset_z=-50, xyzt_units=2);
out.img@sform_code <- 4; 

out.img@qform_code <- 4;
pixdim(out.img)[1:8] <- c(1, 4,4,4,1, 1,1,1);  
# pixdim: qFactor, size in mm, num of frames, 1,1,1

Once the header information has been set, we create the image:
writeNIfTI(out.img, "d:/temp/fakeBrain");

reading the NIfTI file

Now we can read the nifti file back into R in.img <- readNIfTI("d:/temp/fakeBrain.nii.gz", reorient=FALSE); Once it's in R you can refer to the voxels by array location, as when creating the image (see the code for examples)

And, the image can be viewed in mricron (or other image viewers):

Above, you can see that the value at particular array indices match in R and MRIcroN. We can also display the clusters on the MNI template image, as below:

Notice that the X,Y,Z coordinates at the top of the window now refer to the template underlay, not the coordinates of the overlay. The voxel values at the bottom are correct, however.


Always check the orientation of images, particularly after manipulating them in R. It is exceptionally easy for them to get left/right flipped, from handedness problems, software incompatibilities, incorrect header settings, etc.

update 2 April 2015: A HopStat and Jump Away has a post with many examples using oro.nifti functions, illustrating functions and procedures I didn't cover here.

update 21 January 2016: Changed the example code to have datatype=64 and just one row for the pixdim call.

update 29 August 2019: John Muschelli's neurobase vignette describes working with oro.nifti, plus the wrapped neurobase functions. RNifti can be MUCH faster for reading large NIfTI images.

working with NIfTI files in R: background

As I've written about before, I generally work with NIfTI-formatted fMRI images, often in R, using the oro.nifti package.

Recently I've had some difficulty with my written nifti-format images: they were not being read properly in some software packages (particularly caret and afni). After much frustration, I think I tracked the problem down to setting some of the nifti header fields incorrectly. In the hopes of reducing others' frustration, and perhaps eliciting some suggestions, in this set of posts I will demonstrate what I believe is proper way to read and write nifti images in R. As always, please share your experiences, particularly if you think a different strategy is more robust.

thinking about brain image data

First, some framing. We're all used to seeing brain image data shown like this (in MRIcroN): a three-dimensional picture. Most neuroimaging software uses this type of visualization; we open a nifti (or whatever format) image, and see a brain-shaped picture. But what is in a nifti file? It is sometimes surprising to people that the nifti file for a single brain image just stores a 3d array of numbers, plus a "header" file that contains information about how the numbers should be turned into a picture; how the numbers in the array relate to the real world.

This picture is made by assigning a color to every value in the 3d array; the picture turns out brain-shaped because the numbers in the array are brain-shaped. If this isn't obvious, consider this picture:

This is my sketch of a 3d array. Each array entry (voxel) has a number, giving the amount of "activation" (or whatever) in that particular voxel. Programs like MRIcroN assign a color to the number in each box when they make the brain image plots like the top picture, using the header information to translate the 3d array into actual space. (If 3d arrays still are not clear, do a bit of google searching before reading on; it's critical for understanding how to work with nifti images in R.)

at least two sets of coordinates

We always need to consider at least two sets of coordinates with neuroimaging data, and be able to move between them when needed. One set gives the location of the voxel in the 3d array, while the other gives the location of the voxel in an aligned space (e.g. MNI, or the distance from an anatomic landmark). (This is sometimes referred to as i,j,k vs. x,y,z coordinates, particularly in afni.)

The two sets of coordinates are shown in my MRIcroN screen image above: the yellow bar highlights the X=10, Y=17, Z=21 boxes, while the red bar indicates the -42,-48,30 coordinates. This means that the crosshairs are showing the voxel at location 10,17,21 in the 3d array, but location 42,-48,30 in the anatomical space specified in the nifti header. This voxel has a value of 496 (green bar).

All this means that the same 3d array of numbers could be plotted in many ways just by changing the header information. For example, in the next post I'll include R code to write nifti files. Here I wrote a 3d array twice, once setting the voxel size in the header to the true value, 4x4x4mm (blue). But then I wrote the same 3d array again, but setting the voxel size in the header to 3x3x3 mm (red). I then opened up the ch2bet template in MRIcroN and overlaid both images.

As you can see, MRIcroN plotted both images without complaint, but the 'blob' locations and sizes do not match: the red clusters are smaller and shifted towards the posterior left.


Before describing reading and writing NIfTI files in R I wanted to introduce some of the underlying concepts: image data is stored as a 3d array*, with real-world orientating information in the header fields.

*I keep referring to 3d arrays for simplicity. But fMRI data is often stored as 4d arrays, or even higher. In this case the 4th dimension is time (usually TR). But the concepts are the same - we have the large array holding the voxel values, then the header with the orientating information.

Thursday, April 18, 2013

below-chance accuracy: case studies

Reading my previous post, I realized it is missing some concrete examples/case studies: why did I give some of that advice? So here are two from my experience; I'd love to hear others - put them in the comments or send them to me directly (I'll post them if you'd like, with or without your name).

Also, I am not particularly concerned if a few subjects have somewhat below-chance accuracy. There are no hard-and-fast rules, but I'd say that if you have at least a dozen subjects, if one or two classify around 0.45 while the others are above 0.6 I wouldn't be too concerned (assuming chance = 0.5 and within-subjects analysis). If some are classifying at 0.2 and others at 0.9, I'd be much more worried, or if something like a third of the subjects are below-chance.

first: check the dataset

This is my first piece of advice because of a project I was involved in several years ago. Early analyses of the dataset produced the strangest pattern of results I'd (still) ever seen. Most of the accuracies were not below-chance but implausibly above-chance: accuracies I'd expect to see on some sort of primary motor task, not the subtle cognitive effect we were actually testing. Also, accuracy went with the number of voxels in the ROI: larger ROIs classified better.

It took a lot of checking to find the problem. We weren't even really sure that there was a problem; the results just "felt" too good, and the relationship between ROI size and accuracy was worrisome. We finally tracked down that the problem was that some of the examples were mislabeled. Another researcher had generated PEIs (beta-weight images), which I then used as the MVPA input. I labeled which PEI was which according to my notes ... but my notes were wrong. Once we realized this and fixed the labels everything made much, much more sense. But this took weeks to sort out.

While not a case of straight below-chance accuracy, this experience made me convinced that just about anything can happen if something goes wrong in the data preparation (preprocessing, labeling, temporal compression, etc.). And it is very, very easy to have something go wrong in the data preparation since it is so complex. This is partly why I'm keen to perform a quality-checking classification: first classifying something that is not of interest (and so not impacting the experimenter degrees-of-freedom too much) but that should have a very strong signal (like a movement). Hopefully, this procedure will catch some dataset-level errors - if it fails, something is not right.

aiming for stability

I'm currently working with a dataset with a very complex design. I started by trying to classify button-presses (which finger) in motor areas, within-subjects. Classification was possible, but highly unstable: huge variability on cross-validation folds and between different subjects (some people classified perfectly, others very below-chance). Accordingly, the group-level results (e.g. t-test for accuracy > chance) were lousy.

I then tried adjusting some of the analysis choices to see if the button-push classification could be made more stable. In this case, I changed from giving the classifier individual examples (e.g. first button-push, second button-push, etc - number of examples matching the number of button-pushes) to classifying averaged examples (e.g. averaging all button "a" pushes in the first run together). This gave me far fewer examples to send to the classifier (just one per button per run), but accuracy and stability improved drastically. We later followed the same averaging strategy for the real analysis.

In this case, I suspect the individual examples were just too noisy for adequate classification; averaging the examples improved the signal more than reducing the number of examples hurt the classifier (a linear SVM for this particular project).

below-chance classification accuracy

This topic comes up repeatedly with MVPA, so I thought I'd start collecting some of my thoughts on the blog. As an introduction, what follows started as a version of what I posted to the pyMVPA message list some time ago. This topic has been discussed multiple times on mailing lists; see the thread at . Googling "below-chance accuracy" also brings up some useful links.

My opinion in short: Sorting out below-chance accuracy is really vexing, but I am highly skeptical that it represents anti-learning (or other task-related information) in the case of MVPA. Impossible? No, but I'd want to see very impressive documentation before I'd accept such a conclusion.

"below-chance accuracy"?

By "below-chance accuracy" I specifically mean classification accuracy that is worse than it should be, such as some subjects classifying at 0.3 accuracy when chance is 0.5. It can even turn up in permutation tests: the permutation test null distribution nicely centered on chance and reasonably normal but the true-labeled accuracy far into the left tail (and so significantly below chance).

I don't have a complete explanation for this (and would be very interested to see one), but tend to think it has to do with data that doesn't make a linear-svm-friendly shape in hyperspace. Often we don't have a huge number of examples in MVPA datasets, which also can make classification results unstable. We're so often on the edge of what the algorithms can handle  (tens of examples and thousands of voxels, complex correlation structures, etc.) that we see a lot of strange things.

what to do?

FIRST: Check the dataset. Are there mislabeled examples (it's happened to me!) or faulty preprocessing? Look at the preprocessed data. Anything strange in the voxel timecourses? Basically, go through the data processing streams, spot-checking as much as possible to make sure everything going into the classifier seems sensible.

Then look at stability, such as the performance over cross-validation folds and subjects. How much variation is there across the cross-validation folds? Things seem to often go better when all the cross-validation folds have fairly similar accuracies (0.55, 0.6, 0.59, ...) rather than widely variable ones (0.5, 0.75, 0.6, ...).

Often, I've found the most below-chance accuracy difficulties in cases where stability is poor: a great deal of variation in accuracy over cross-validation folds, large changes in accuracy with small changes in the classification (or preprocessing) procedure. If this seems to be happening, it can be sensible to try reasonable changes to the classification procedure to see if things improve. For example, try leaving two or three runs out instead of just one for the cross-validation, since having a small testing set can make a lot of variance in the cross-validation folds.Or, perhaps try a different temporal compression scheme (e.g. average more timepoints or repetitions together).


I am very aware that this type of "trying" can be dangerous - the experimenter degrees of freedom starts exploding. But there often seems little choice. In practice, a particular MVPA should start with a set of well-considered choices (which classifier? which cross-validation scheme? how to do scaling? how to do temporal compression?). But these choices are usually based far more on guesswork than objective criteria. Until such object criteria exist, I have the most faith in a classification result when it appears in several reasonable analysis schemes. 

In practice, we should think carefully and attempt to design the best-possible analyses scheme before looking at the data. (And this assumes a very concrete experimental hypothesis is in place! Make sure you have one!) If we start seeing signs that the scheme is not good (like high variability or below-chance accuracy) we should try several other reasonable schemes, attempting to find something that yields better consistency.

Ideally, we do this "optimization" on a separate part of the dataset (e.g. a subset of the subjects), or something other than the actual hypothesis. For example, suppose the experiment was done to test some cognitive hypothesis, and the people made responses with a button box. We can then use the button-pushing for optimizing some aspects of the procedure: Can we classify which button was pushed/when the buttons were pushed in motor and somatosensory areas? If not, something is seriously wrong with the dataset (labeling, preprocessing, etc) and that should be fixed (so button classification is possible) before trying to attempt the actual classification we care about. This is of course not a perfect procedure, but can catch some serious problems, and reduces looking at the real data a little bit (since we don't actually care about the button-pushes or motor cortex).

what not to do

You should certainly not simply reverse the accuracy on below-chance subjects (i.e. so 0.4 becomes 0.6 if chance is 0.5), nor omit below-chance subjects. That is likely to cause nearly any dataset (even just noise) to classify well , particularly since fMRI data is usually so variable. No double-dipping! There may be cases when such actions are proper, but they need to be very, very tightly controlled, and certainly not done at the first sign of below-chance accuracy.

UPDATE [6 May 2016]: see this post about Jamalabadi et. al (2016), for a possible explanation of how below-chance accuracy occurs.

Wednesday, April 3, 2013

announcement: "Searchlight analysis: promise, pitfalls, and potential"

I'm pleased to announce that my recently-accepted Comment for NeuroImage, entitled "Searchlight analysis: promise, pitfalls, and potential" is now online ( I plan to do a series of posts with expanded versions of some of the figures and supplemental material over the next weeks (it's that cool!).

It doesn't look like the supplemental material is officially online yet; let me know if you can't wait or have difficulty accessing the manuscript and I'll send it to you directly. I'm of course also interested in any feedback.