Tuesday, March 25, 2014

NIfTI, CIFTI, GIFTI in the HCP and Workbench: a primer

The HCP is releasing preprocessed data in both volumetric NIfTI and surface/volumetric CIFTI formats. Working with the HCP files, or doing much of anything with the Workbench, requires navigating a plethora of *.nii and *.gii files. In this post I'll explain why we need all these files, and how they relate to each other. Disclaimer: I'm writing this as a primer from the viewpoint of someone familiar with volumetric fMRI data analysis; it is not at all a full description of everything the files can be used for. Also, though I'm referring to the HCP and Workbench, these file formats are used by other projects and software.

For a starting point, consider how we work with volumetric NIfTI files. Neuroimagers often think about volumetric NIfTI files as storing functional data in a 4d matrix (x, y, z, and time). Libraries such as oro.nifti make reading NIfTI files fairly easy: they create a 3d or 4d matrix of voxel values, plus an object with the header information.
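To make the "matrix plus header" picture concrete, here is a minimal sketch in Python (standard library only) that builds a synthetic NIfTI-1 header in memory and reads the image dimensions back out. The byte offsets follow the NIfTI-1 header layout, but the helper names (make_header, read_dims) and the example dimensions are made up for illustration; for real work you would use a library such as oro.nifti rather than parsing bytes by hand.

```python
import struct

HEADER_SIZE = 348  # a NIfTI-1 header is 348 bytes long


def make_header(dims):
    """Build a synthetic little-endian NIfTI-1 header with only the fields read below."""
    hdr = bytearray(HEADER_SIZE)
    struct.pack_into("<i", hdr, 0, HEADER_SIZE)       # sizeof_hdr at byte 0
    dim = [len(dims)] + list(dims) + [1] * (7 - len(dims))
    struct.pack_into("<8h", hdr, 40, *dim)            # dim[0..7] at byte 40
    struct.pack_into("<4s", hdr, 344, b"n+1\x00")     # magic string at byte 344
    return bytes(hdr)


def read_dims(hdr):
    """Return the image dimensions stored in the header, e.g. (x, y, z, time)."""
    if struct.unpack_from("<i", hdr, 0)[0] != HEADER_SIZE:
        raise ValueError("not a little-endian NIfTI-1 header")
    dim = struct.unpack_from("<8h", hdr, 40)
    n_dims = dim[0]                                   # dim[0] holds the number of dimensions
    return dim[1:1 + n_dims]


# Hypothetical 4d functional run: a 91 x 109 x 91 voxel grid with 200 timepoints.
header = make_header((91, 109, 91, 200))
print(read_dims(header))
```

The voxel values themselves follow the header in the file; the header is what tells a reader how to fold that flat run of numbers back into a 3d or 4d matrix.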

While you can get an idea of the anatomy by looking at slices of the 4d functional data matrix, analyses generally rely on having a 3d matrix of anatomical data (binary mask of regions, anatomic scan, etc) perfectly aligned to the 4d functional data. So, the 4d NIfTI file doesn't contain everything we need: we get some alignment information out of the header (the qform and sform fields, etc), but also need the registered 3d anatomical data. For a concrete example, I had to provide two files for the little ROI-based analysis demo: the dataset (4d NIfTI with preprocessed BOLD) and the ROI mask (binary 3d NIfTI showing the voxels corresponding to the anatomical region of interest). I also had to state that the dataset was normalized to the MNI anatomical atlas, so that we can overlay the data on the correct anatomical template.
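As a toy illustration of why the mask and the dataset must share a grid, here is a sketch (plain Python, nested lists standing in for the matrices) that averages a 4d BOLD array over the voxels selected by a 3d binary mask. The function name and the tiny 2 x 2 x 1 volume are invented for this example; note that matching dimensions are necessary but not sufficient, since two images can have the same shape and still not be aligned.

```python
def roi_mean_timecourse(bold, mask):
    """Average the BOLD signal over the masked voxels, separately at each timepoint.

    bold is indexed [x][y][z][t]; mask is indexed [x][y][z] and holds 0 or 1.
    """
    n_x, n_y, n_z = len(mask), len(mask[0]), len(mask[0][0])
    if (len(bold), len(bold[0]), len(bold[0][0])) != (n_x, n_y, n_z):
        raise ValueError("mask and BOLD grids differ, so they cannot be aligned")
    n_t = len(bold[0][0][0])
    totals = [0.0] * n_t
    n_voxels = 0
    for x in range(n_x):
        for y in range(n_y):
            for z in range(n_z):
                if mask[x][y][z]:
                    n_voxels += 1
                    for t in range(n_t):
                        totals[t] += bold[x][y][z][t]
    if n_voxels == 0:
        raise ValueError("the mask selects no voxels")
    return [total / n_voxels for total in totals]


# Toy 2 x 2 x 1 volume with 3 timepoints; the mask keeps the two voxels with y == 0.
bold = [[[[1.0, 2.0, 3.0]], [[10.0, 20.0, 30.0]]],
        [[[3.0, 4.0, 5.0]], [[0.0, 0.0, 0.0]]]]
mask = [[[1], [0]],
        [[1], [0]]]
print(roi_mean_timecourse(bold, mask))  # prints [2.0, 3.0, 4.0]
```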

Now, on to CIFTI. CIFTI-2 files follow the NIfTI-2 file format specification (CIFTI-2 is a "flavor" of NIfTI-2, so both use the *.nii file extension), and both consist of a data matrix and headers. In the case of the HCP data, the functional timecourses are in the data matrix part of *.dtseries.nii CIFTI files. Like NIfTI volume files, the CIFTI file contains information about where voxels are, though this information is stored in a different place (in the extension containing the CIFTI XML). But, paralleling how you need an anatomic file to figure out exactly where the voxels in a volumetric NIfTI lie, you need other files (not just the CIFTI) to tell you where the surface vertices lie, and how they're connected (the "triangles", etc). Aside: While I wrote "surface vertices" in this paragraph, note that the HCP CIFTIs store both surface vertices (for the cortical sheet) and volumetric voxels (for sub-cortical structures).
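One way to picture what the CIFTI XML provides is as a lookup from a position in the data matrix to either a surface vertex or a voxel. The sketch below is a heavily simplified toy model of that idea in Python, not a parser for real CIFTI files: the BrainModel class and locate function are hypothetical, although the offset-plus-index-list layout and the structure names are modeled on the brain model entries in the CIFTI-2 XML.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class BrainModel:
    """Toy stand-in for one brain model entry in the CIFTI XML."""
    structure: str
    index_offset: int                                   # first data index owned by this model
    vertex_indices: Optional[List[int]] = None          # set for surface models
    voxel_ijk: Optional[List[Tuple[int, int, int]]] = None  # set for volume models

    @property
    def index_count(self):
        items = self.vertex_indices if self.vertex_indices is not None else self.voxel_ijk
        return len(items)


def locate(models, index):
    """Map a data index to (structure, 'vertex' or 'voxel', vertex number or ijk)."""
    for model in models:
        local = index - model.index_offset
        if 0 <= local < model.index_count:
            if model.vertex_indices is not None:
                return (model.structure, "vertex", model.vertex_indices[local])
            return (model.structure, "voxel", model.voxel_ijk[local])
    raise IndexError(f"index {index} is not covered by any brain model")


# Invented example: two tiny cortical surfaces (some vertices left out, as happens
# for the medial wall) followed by two subcortical voxels.
models = [
    BrainModel("CIFTI_STRUCTURE_CORTEX_LEFT", 0, vertex_indices=[0, 2, 5]),
    BrainModel("CIFTI_STRUCTURE_CORTEX_RIGHT", 3, vertex_indices=[1, 4]),
    BrainModel("CIFTI_STRUCTURE_THALAMUS_LEFT", 5, voxel_ijk=[(40, 50, 30), (41, 50, 30)]),
]
print(locate(models, 1))  # a left-hemisphere surface vertex
print(locate(models, 6))  # a subcortical voxel
```

Notice that the lookup returns a vertex number, not a position in space: to know where vertex 2 of the left cortex actually sits, you still need a surface file.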

These "other files" are not a single file but multiple; as many as necessary. Having all of these files is akin to having multiple ROI files available for an analysis: you won't use each ROI in each analysis, just the ones corresponding to the anatomical area (or whatever) you need for a particular test. The "other files" for the HCP are not just ROIs, but can also be underlying anatomy at different inflation levels, maps of tissue types, etc.

For example, at left is a screenshot showing some of the "other files" provided for each HCP person in the released datasets. These files are from /100307_Q3/MNINonLinear/Native/: the maps are in subject space. Many files with similar names are in /100307_Q3/MNINonLinear/fsaverage_LR32k/: maps of the same structures/types, but aligned to the MNI template anatomy (specifically, the 32k Conte69 mesh; see page 112 of Glasser et al. 2013).

And now we're encountering GIFTI files: many of the "other files" are in GIFTI format, with the extension *.gii. The naming of the "other files" (the last bit before the .gii) in the HCP tends to follow the CARET conventions, and gives a hint as to what sort of information they contain:

*.surf.gii, "gifti surface files", contain only vertex coordinates and triangles (which vertices are connected). The HCP *.surf.gii files are mostly structures that you might want to overlay data onto, such as 100307.L.inflated.native.surf.gii (left hemisphere, inflated) and 100307.L.midthickness.native.surf.gii (left hemisphere, not inflated at all, but rather halfway through the thickness of the cortical ribbon).

*.func.gii and *.shape.gii, "metric files", contain data values for every vertex. Essentially, these are data arrays whose indices correspond to a surface file - you need a matching surface file to know where in the brain to put the data stored in a metric file. For example, a metric file from the HCP release is 100307.L.corrThickness.native.shape.gii: the cortical thickness at each vertex.
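The relationship between the two kinds of file can be sketched with plain Python lists. This is a toy under stated assumptions, not how GIFTI files are actually read: the four-vertex "surface", the thickness values, and the function names are all invented, and the point is only that a metric is a list of numbers whose positions match the vertex list of a surface.

```python
# What a surface file provides: where each vertex is, and which vertices form triangles.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
triangles = [(0, 1, 2), (1, 3, 2)]

# What a metric file provides: one value per vertex, in the same vertex order.
thickness = [2.1, 2.8, 3.4, 2.5]


def thickest_vertex(vertices, metric):
    """Return (vertex index, xyz coordinates, value) for the largest metric value."""
    if len(vertices) != len(metric):
        raise ValueError("metric and surface have different numbers of vertices")
    index = max(range(len(metric)), key=metric.__getitem__)
    return index, vertices[index], metric[index]


def neighbours(triangles, vertex):
    """Vertices sharing a triangle with the given vertex, from the surface topology."""
    return sorted({v for tri in triangles if vertex in tri for v in tri if v != vertex})


print(thickest_vertex(vertices, thickness))
print(neighbours(triangles, 2))
```

Without the vertices list, the metric only says that "vertex 2" has the largest value; the coordinates and the neighbours both come from the surface.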

For an example of how these files work together, my tutorial on plotting a NIfTI image with the Workbench uses the wb_command -volume-to-surface-mapping command to create .shape.gii files aligned to Conte69.*.midthickness.32k_fs_LR.surf.gii. The data from the volumetric NIfTI (e.g. searchlight accuracies at each voxel) is stored (by vertex) in .shape.gii files, but a .shape.gii file by itself isn't enough to plot the data properly on a surface: you need an aligned .surf.gii file as well. Paralleling how you need an aligned anatomy to properly overlay a volumetric NIfTI ROI, you need an aligned .surf.gii to know how to properly locate the data from a metric file.
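As a rough sketch of what such a call looks like, the Python snippet below only assembles the command line and prints it, without running anything. The volume and output file names are hypothetical placeholders, and choosing -trilinear as the mapping method is an assumption made for this example; check the wb_command help for the options that suit your data.

```python
import shlex


def volume_to_surface_cmd(volume, surface, metric_out, method="-trilinear"):
    """Assemble (but do not run) a wb_command -volume-to-surface-mapping call."""
    return ["wb_command", "-volume-to-surface-mapping",
            volume, surface, metric_out, method]


# Hypothetical file names: a volumetric image in, a left-hemisphere metric file out.
cmd = volume_to_surface_cmd(
    "searchlight_acc.nii.gz",                       # volumetric NIfTI to map
    "Conte69.L.midthickness.32k_fs_LR.surf.gii",    # surface the volume is aligned to
    "searchlight_acc.L.shape.gii",                  # metric file to create
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

With the Workbench installed and real files in place, the same list could be handed to subprocess.run; the resulting .shape.gii would then be displayed on the same midthickness surface (or another surface with matching vertices).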

Whew! Hopefully this primer helps explain why so many files are released with the HCP data, and a bit about how they work together. For additional information see the Workbench Glossary, as well as Glasser et al. 2013. If you've found any references particularly useful that I haven't already linked to, please send them along and I'll add links.

I want to end this post with a BIG thank you to Tim Coalson, who patiently (and repeatedly) walked me through these file types and how they relate to each other.

UPDATE 17 November 2016: Emma Robinson describes the various surface files, including the MSM related ones.
