Friday, February 27, 2026

detrending and normalizing timecourses: afni 3dDetrend and 3dDeconvolve in R

This post introduces an expanded and updated demo of how detrending and normalizing ("scaling") of individual voxel (or vertex) timecourses works with afni 3dDeconvolve and 3dDetrend commands. 

In my original post I showed how to duplicate what 3dDetrend -normalize -polort 2 does with R code, not to replace the afni function, but to understand what it is doing. This expanded version simplifies the old code a bit, but more importantly, adds sections explaining another common method of preparing timecourses for analysis: using 3dDeconvolve with -num_stimts 0 -polort A -errts (plus censoring, motion regressors, etc.; see below) to create the residual error timeseries. 

The compiled demo is afni3dDetrend3dDeconvolve_R.pdf, a knitr file; its source code (with many comments) and the files required for compilation are in a section of my osf site, DOI https://doi.org/10.17605/OSF.IO/NU324 (please include that DOI in any citation). 

starting point: a voxel's timecourse

The examples use a left motor grey matter voxel from a preprocessed (with fmriprep 1.3.2) fMRI task run from the DMCC55B dataset; I chose it arbitrarily. The plot below is directly from the preprocessed nifti; this voxel has values around 6900, and the timecourse vector is length 540. DMCC55B's TR was 1.2 s, so this is a 10.8 minute-long run. The grey vertical lines are at one-minute intervals, with TR (frame) number along the x-axis and BOLD amplitude along the y-axis. (See the knitr .rnw for details, plotting code, etc.; this blog post just has a few highlights.)

raw (preprocessed) voxel timecourse
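A minimal R sketch of a plot like this (made-up values stand in for the real voxel data; the actual plotting code is in the knitr):

```r
# sketch of the raw timecourse plot; tc.raw stands in for the real 540-entry voxel vector
tc.raw <- 6900 + rnorm(540, sd=20);   # made-up values around 6900, as described above
plot(tc.raw, type='l', xlab="TR (frame)", ylab="BOLD", main="raw (preprocessed) voxel timecourse");
abline(v=seq(50, 500, by=50), col='grey');   # one-minute intervals: 60 s / 1.2 s TR = 50 frames
```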

normalizing and detrending

Scaling alone isn't usually sufficient to prepare fMRI timeseries for analysis; we also need at least a bit of detrending. There's no universally correct degree or type of detrending to use. I generally recommend a modest amount of detrending before parcel-averaging types of analyses, specifically 3dDetrend -normalize -polort 2 . 

In the plot below, the same voxel timecourse as above is plotted after normalizing only (tc.np0, black, from 3dDetrend -normalize -polort 0), or with detrending at polort 2 (blue, tc.np2), and the more aggressive polort 5 (green, tc.np5). 

same timecourse, after 3dDetrend -normalize -polort 0, 2, or 5
Notice that the spiky parts of the timecourses are pretty much the same in all three versions, but the slower changes vary more; e.g., in the first minute the no-detrending line (tc.np0) is furthest from zero, the tc.np2 line is closer, and the tc.np5 line closest. 

It's sensible that larger polort numbers have more of an effect on the timecourse's shape, since, as explained in the afni help for 3dDetrend, -polort ppp gives "the Legendre polynomials of order up to and including 'ppp' in the list of vectors to remove", so larger -polort numbers mean removing more complex trends. The R code below shows how to do this type of normalizing and detrending; Legendre() is from Gregor Gorjanc, and requires the orthopolynom and polynom R packages.

 # R commands for 3dDetrend -normalize -polort 2  
 lm.out <- lm(tc.raw ~ Legendre(x=seq(tc.raw), n=2)); # lm with two Legendre polynomials (polort 2)  
 tmp <- residuals(lm.out);  # extract the residuals  
 tc.Rnp2 <- (tmp-mean(tmp))/sqrt(sum((tmp-mean(tmp))^2));  # normalize the residuals, afni-style   
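The -normalize step scales the (detrended) timecourse to unit sum-of-squares; a quick self-contained check of that afni-style normalization, with a random vector standing in for the residuals:

```r
# check that afni-style normalization gives a zero-mean, unit sum-of-squares vector
tmp <- rnorm(540);   # stand-in for the detrended residuals
tc.norm <- (tmp - mean(tmp)) / sqrt(sum((tmp - mean(tmp))^2));
abs(sum(tc.norm^2) - 1) < 1e-10;   # TRUE: sum of squares is 1
```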

residual error via 3dDeconvolve

It's common to include the realignment parameters as nuisance regressors and to censor high-motion frames prior to fMRI timecourse analyses, which can be done with 3dDeconvolve. Skipping rather a lot of explanations from the full demo, the afni command is:

 # errts.fname is the file made by 3dDeconvolve, from which the single voxel timecourse was extracted:  
 system2(paste0(afni.path, "3dDeconvolve"),   
     args=paste0("-input '", scale.fname, "' -polort A -float ",  
           "-censor '", c.fname, "' -num_stimts 0 ",  
           "-ortvec '", mot.fname, "' moveregs ",  
           "-nobucket -errts ", errts.fname), stdout=TRUE);  

where -num_stimts 0 means not to include any events in the model, -errts that we want afni to write the residual error time series from the "full model fit to the input data" into the file errts.fname (in this case, a .nii.gz), and -polort A that afni should set the polort level according to the run length (here, that gives 5).
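The -polort A rule, as I understand the 3dDeconvolve help, sets the degree from the run duration as 1 + floor(duration/150 s); a sketch of the calculation for this run:

```r
# afni's automatic polort (my reading of the 3dDeconvolve help): 1 + floor(seconds/150)
n.trs <- 540;  tr <- 1.2;   # this run: 540 frames, TR 1.2 s
polort.A <- 1 + floor((n.trs * tr) / 150);
polort.A;   # 5 for this 648-second run, matching the demo
```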

Below is the 3dDetrend -normalize -polort 5 (tc.np5) timecourse again in green, with the new (-errts) version from the 3dDeconvolve command in pink: 

same voxel timecourse, 3dDeconvolve errts over scale(tc.np5), censored frames marked

The errts timecourse is highly correlated with the np5 version, which makes sense, since both included polort 5 detrending. They're not perfectly correlated, though: the 3dDeconvolve command also did censoring and included the motion regressors. I don't have a simple way to describe the differences in the lines; they're clearly very similar, but not identical; sometimes one is more extreme or spiky, sometimes the other.

The errts image has 0 in the censored frames. This is obvious in a 4d nifti (entire frame filled with 0s), but ambiguous in a single voxel (or vertex) timecourse like this (in the plot the censored frames are circled on the tc.errts timecourse, squared on the np5 version). For some analyses in R (e.g., averaging frames after an event for temporal compression) it'd be sensible to use NA for the censored frames.
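A sketch of swapping the censored-frame 0s for NA (censor.TRs as in the demo; a random vector stands in for the real errts voxel):

```r
censor.TRs <- c(300, 316, 431, 448, 497);   # censored frames, from the censor file
tc.errts <- rnorm(540);  tc.errts[censor.TRs] <- 0;   # stand-in for the real errts timecourse
tc.errts[censor.TRs] <- NA;   # NA is safer than 0 for averaging-type analyses in R
mean(tc.errts, na.rm=TRUE);   # na.rm (or similar) is then needed in later calculations
```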

This R code matches the 3dDeconvolve calculations:

 # made in startup code chunk; download at https://osf.io/nu324/files/c6nax  
 mot.fname <- paste0(demo.path, "sub-f1027ao_ses-wave1bas_task-Stroop_6regressors_demean.txt");   
 mot.tbl <- read.delim(mot.fname, sep=" ", header=FALSE); # 540 x 6  
   
 # made in startup code chunk; download at https://osf.io/nu324/files/bjrk8  
 c.fname <- paste0(demo.path, "sub-f1027ao_ses-wave1bas_task-Stroop_FD_mask0.5.txt");  # 0 1 censor file  
 censor.vec <- read.table(c.fname, header=FALSE)[,1];  
 censor.TRs <- which(censor.vec == 0);  # [1] 300 316 431 448 497  
   
 # first, remove censored frames from the motion regressors and input timecourse  
 # The input timecourse tc.scale is the voxel from file scale.fname: the input bold.fname after  
 # scaling with 3dcalc -expr 'min(200, a/b*100)*step(a)*step(b)'   
 scale.vec <- tc.scale[-censor.TRs];  
 col1.vec <- mot.tbl$V1[-censor.TRs]; # demeaned trans_x  
 col2.vec <- mot.tbl$V2[-censor.TRs];   
 col3.vec <- mot.tbl$V3[-censor.TRs];   
 col4.vec <- mot.tbl$V4[-censor.TRs];   
 col5.vec <- mot.tbl$V5[-censor.TRs];   
 col6.vec <- mot.tbl$V6[-censor.TRs];   
   
 # fit the lm, polort 5, on the censored scale.vec and including 6 censored motion regressors  
 lm.out <- lm(scale.vec ~ Legendre(x=seq(scale.vec), n=5) + col1.vec + col2.vec + col3.vec + col4.vec + col5.vec + col6.vec);   
 tc.Rerrts <- residuals(lm.out);  # extract the residuals  
   
 # put 0s back in where the censored frames were taken out.  
 for (i in 1:length(censor.TRs)) { tc.Rerrts <- append(tc.Rerrts, 0, (censor.TRs[i]-1)); }  
   
 # the R version matches the 3dDeconvolve errts version   
 cor(tc.errts, tc.Rerrts); # almost perfect  
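The 3dcalc scaling expression mentioned in the comments above can be mimicked in R; a toy sketch (a is the bold value, b the voxel mean; afni's step(x) is 1 when x > 0, else 0):

```r
# toy version of 3dcalc -expr 'min(200, a/b*100)*step(a)*step(b)'
a <- c(6900, 7200, 0);   # toy bold values (the last, 0, is outside the brain mask)
b <- 6950;               # toy voxel mean
scaled <- pmin(200, a / b * 100) * (a > 0) * (b > 0);   # percent of mean, capped at 200
```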

musings

Working out the R code which matched the 3dDeconvolve -errts demystified it for me; the 3dDetrend -normalize -polort 2 detrending and normalizing ("np2") I've treated as a default (for timecourse-averaging type analyses) is closer to these 3dDeconvolve residuals ("errts") than I'd thought. 

Am I going to change my default detrending method? Is the 3dDeconvolve errts "better" than the 3dDetrend np2 for an analysis like this? I think incorporating the censoring (to NA, not 0) and polort-picking calculation (which gave 5 in this demo) from 3dDeconvolve (instead of using 2 regardless of run length) would be sensible, modest improvements.

I'm less confident about whether to add in the motion regressors. My sense was that including these would somehow "account for" or "clean up" any motion effects not "corrected" by preprocessing. And including the six realignment parameter columns does change the timecourse produced by the model a bit ... but it's not much different if the actual realignment parameters or random ones are included, making me suspect the change is more due to the change in model degrees of freedom than actually "fixing" the head motion.

Here's code for the random-motion-regressor models:

 # permute numbers in each motion regressor separately  
 lm.out <- lm(scale.vec ~ Legendre(x=seq(scale.vec), n=5) + sample(col1.vec) + sample(col2.vec) + sample(col3.vec) + sample(col4.vec) + sample(col5.vec) + sample(col6.vec));   
 tc.test1 <- residuals(lm.out);  # extract the residuals  
 for (i in 1:length(censor.TRs)) { tc.test1 <- append(tc.test1, 0, (censor.TRs[i]-1)); } # put 0s back in  
   
 # random numbers for the motion regressors  
 ct <- length(scale.vec);  # how long to make each fake motion regressor column  
 lm.out <- lm(scale.vec ~ Legendre(x=seq(scale.vec), n=5) + rnorm(ct) + rnorm(ct) + rnorm(ct) + rnorm(ct) + rnorm(ct) + rnorm(ct));   
 tc.test2 <- residuals(lm.out);  # extract the residuals  
 for (i in 1:length(censor.TRs)) { tc.test2 <- append(tc.test2, 0, (censor.TRs[i]-1)); } # put 0s back in  

3dDeconvolve errts timecourse plus permuted motion regressor columns (tc.test1)

3dDeconvolve errts timecourse plus random-number motion regressor columns (tc.test2)


Thursday, January 15, 2026

so long, windows

I've always had windows on my work computers, and have built up quite a bit of "muscle memory" over the (ahem) decades for how to use it; the hard part is deciding what analysis to do or which image to look at, not how to open the image. I didn't want to change my computer setup, but am unwilling to use windows 11, given all its privacy & AI intrusions and limitations. The last few months I've switched over to linux, and it's ... pretty much fine; everything is working more-or-less like it was before.

My hope is that this post may smooth the path for other neuroscience folks considering a linux switch but hesitant or unsure how to go about it. For framing, I am most assuredly not a linux guru; I started with some experience using linux servers managed by others, but zero desire to swap my trusty mousepad for a command prompt or to spend days/weeks/months relearning how to do routine work tasks. But you'd now have to look fairly closely to notice that the software changed on my work computer in the last six months, and I consider that a good thing.

os: zorin linux

There are a lot of "flavors" of linux, and choosing one is daunting. Since my goal was to keep my computers as windows-ish as possible (and not fiddle with settings) I chose Zorin 18 pro, and highly recommend it. I wondered whether hardware would be a hassle (my desktop computer has two monitors, two hard drives, a USB webcam, keyboard, and my beloved vintage Fingerworks iGesture mousepad), but everything just worked, no problems whatsoever. (I've since also installed Zorin linux to dual-boot with windows on my laptop, also without hardware difficulty.)

My desktop computer has two drives: a smaller one for the operating system, and a larger for file storage. I had the zorin installation program reformat the smaller drive, but left the larger unchanged; it's still formatted NTFS as it was for windows. I was worried the mixed drive formats would cause trouble but haven't had any, nor any speed issues. 

Zorin comes with an array of menu/desktop appearance layouts: some mimic windows, others mac os or other linux versions. I picked a windows style, and then tweaked the start menu, colors, etc. to my preference. The oddest thing I changed from defaults was the "desktop environment", from wayland to x11, mostly because I wanted a picture gallery screensaver (screensavers are apparently a contentious topic in linux circles). Changing the start menu was non-intuitive: via the (installed) System Tools -> Main Menu program, not via settings or right-clicking on the start menu itself. 

software

Many programs have linux versions (R, zoom, etc.) and so are no problem (install from the Zorin Software collection directly or the programs' website; clicking the start menu and typing a program's name brings up an installer in many cases), but others present more of a challenge. 

The biggest hurdle for me was OneNote: LibreOffice (comes with Zorin) is fine, but doesn't have a OneNote equivalent. I first tried Logseq, but ended up going with Joplin. Which to use is definitely a matter of personal style and preference (e.g., my notes are organized hierarchically and I don't like tags); in my case some of the individual page layouts and formatting got scrambled in the conversion, but the critical text, images, and page organization all came through fine, and I can make new pages without difficulty:


I have a lot of Powerpoint and Word documents; LibreOffice has been managing them fine so far, but wanting Microsoft Office to be available "just in case" is the primary reason I installed Zorin linux on my laptop as dual-boot instead of removing windows entirely. 

In no particular order, here are some of the programs I used on windows and what I'm using instead on linux (I didn't list programs which are available for both, such as RStudio). Many came with Zorin, others I installed via its Software "store".  

  • Notepad++ -> NotepadNext
  • TigerVNC -> Remmina
  • Snipping Tool -> Gradia
  • Foxit & Sumatra (pdfs) -> qpdfview (has tabs; for knitr compiling), Document Viewer (for highlighting & notes), LibreOffice Draw (for complex editing)
  • WinMerge -> Meld
  • WinSCP -> FileZilla 
  • File Manager -> Dolphin (for navigation), Files (GNOME Nautilus; for mounting smb and turtle git)
  • tortoise git -> turtle git
  • MS Office -> LibreOffice 26.2*
  • MS OneNote -> Joplin
  • 7-zip -> File Roller

remaining headaches

I'm still puzzled why some file operations (smb mounting, turtle git) work differently in Files and Dolphin. I prefer Dolphin's navigation-tree layout and right-click options, but can only access smb-mounted files from Rstudio if I login via Files.


MRIcron and mango work fine for nifti and dicom image viewing, but I start the programs by double-clicking on their executables; getting pretty desktop icons or start menu items for them is tricky (you can't just right-click on the executable and make a working shortcut). Programs installed from the Software store don't have this configuration problem, so I assume it's something related to how I did the installation.

FileZilla often seems slower than winscp, despite connecting to the same servers. I think FileZilla disconnects more completely, requiring it to pause and reconnect when I, e.g., double-click a text file for viewing. There may be a setting for this? I had to change several FileZilla defaults, including increasing the Timeout time and changing the Double-click action on files to View/Edit. Setting file associations is still tricky; it doesn't always "see" installed programs, despite messing with the Flatseal permissions. 


* Zorin 18 came with LibreOffice 25 preinstalled. I find version 26 matches my (older windows-style) instincts more closely, especially after setting the "UI Mode" to "Standard Toolbar" and Icons to "Colibre" (Options -> LibreOffice -> Appearance). I did the upgrade in the Terminal, following the uninstall and install commands from this post. The terminal commands don't appear to change the GUI, but after the flatpak install command finishes, the start menu commands open the new version of LibreOffice, with all the file associations, etc. updated properly without rebooting (!).  [added 5 February 2026]


Wednesday, October 1, 2025

apparent motion also at 7T

WUSTL (ahem, WashU) recently set up a 7T scanner (Siemens Terra; 8Tx32Rx_Head_C head coil), and we've started a bit of piloting and exploring what we can do with it. I've been working through multiple aspects: the BOLD images themselves look different because of the stronger magnet, we're trying some new-to-me multi-echo sequences, collecting some noRF and phase images, and even the BIDS and fMRIPrep parts have taken some script updating. I think I'm getting enough of a handle on things now (thanks for help via neurostars, Chris Markiewicz and Taylor Salo!) to share some impressions, but it's all still very much in process.

We've had two pilot sessions so far, each with a different participant and somewhat different sequences. We're most interested in task fMRI, but that's tricky at the moment since the scanner doesn't (yet) have a way to present stimuli, nor to record responses. Eventually I want to do sequence comparisons with the reward-possible DMCC proactive Cued Task-Switching paradigm (and probably other tasks), like in my OHBM 2023 poster. But to get started I asked the second pilot participant to do a self-paced version of the HCP Motor task: blocks of right finger tapping, left finger tapping, right toe wiggling, left toe wiggling, tongue moving, with a bit of rest in between and a few deep breaths before and after each movement block to serve as onset/offset markers. (I'm not going to discuss the movement analysis parts yet, but so far I'm encouraged.)

Both pilot sessions included a pair (PA/AP) of runs with an acquisition similar-ish to what we used in the DMCC at 3T: 2.4 mm iso voxels, MB4, TR 1.2 s. But the 7T allows more acceleration, so they added in-plane acceleration (GRAPPA 2) and collected 4 echoes instead of just 1.

Below are the realignment parameters (from the fmriprep derivative _desc-confounds_timeseries.tsv files) for those runs from the two participants we've had so far. In both cases the grey vertical lines are at one-minute intervals; the first participant (TB7T1)'s runs were about 10.5 minutes and he alternated periods of regular and slow/deep breaths; the second participant (TD7T1)'s runs were about 6.5 minutes each and he did the blocked breathing-motor task.


TB7T1 clearly has much more apparent motion than TD7T1; the periods of regular and deep breathing are obvious, not only in the realignment parameters but also in movies of the BOLD run. TB7T1_run-44_echo-1_bold.avi is the "rawdata" (before preprocessing) version of echo 1; TB7T1_run-44_space-MNI152NLin2009cAsym_desc-preproc_bold.avi is after preprocessing (with fMRIPrep 25.1.0 using Tedana to optimally combine the echoes). The brain sort of looks like it's "jumping" with the breaths in the raw movie; perhaps more like expanding and contracting in the preprocessed version. 

The TB7T1 participant is the same individual as in some of our previous acquisition tests (at 3T); clearly, using 7T doesn't mean we can forget about apparent motion. ... I wouldn't have forgotten about apparent motion regardless since it's a favorite topic of mine, as is probably obvious since I'm starting off this (hopefully) series of posts about our 7T piloting with it.

Each of these sessions included multiple different acquisition sequences; in all cases TB7T1 had more obvious apparent motion than did TD7T1. TB7T1 run 28 had the largest apparent motion; it also had smaller voxels and a longer TR than the other examples.


Finally, the TB7T1 run 44 motion is also striking in the greyplot version created by fmriprep (left); TD7T1 run 24 is below, right:


Wednesday, June 18, 2025

OHBM 2025 (let's chat about crescents!)

I'll be at OHBM 2025 next week (want to meet up? drop me a line), presenting a poster about the crescent artifacts ("'Crescent' artifacts with multiband fMRI acquisitions: appearance, causes, and consequences"). The poster pdf is available via OHBM's system, or at https://osf.io/pzj39.

My blog post last December covers some of the same information, but preparing the poster refined my thoughts a bit. I still don't have as clear a sense as I'd like of exactly how much the artifact affects fMRI signal quality, but am confident that there is enough likelihood of a substantial negative impact that they shouldn't be ignored. 

It seems to be a given in the MR physics literature that Nyquist ghosts appreciably degrade EPI signal to noise. Quantifying the impact in GLM results is difficult, however, especially after preprocessing, smoothing, in group analyses (artifact location and intensity varies across people), and when runs of different encoding directions are analyzed together. Qualitatively, I have seen crescents in single-subject statistical images (average activation, GLMs), but that is of course not the typical analysis. It could be informative to run a few comparison tests; for example, is the effect strength in frontal parcels (i.e., those in the path of the crescent artifact) worse in a group of participants with the artifact than without? Or in PA than AP encoding runs?

While the degree of impact is uncertain, I think there's enough evidence to recommend that crescent artifacts be one of the criteria for choosing acquisition protocols: select fMRI acquisition parameters so that crescent artifacts are minimized and/or appear in brain areas of low theoretical interest. Since the crescent artifact likely reduces BOLD signal quality somewhat, and is more often prominent in people with smaller brains, there's a risk of bias if an experimentally-important participant characteristic (e.g., age, gender) is associated with differences in head size; extra care should be taken in these cases.

Monday, March 10, 2025

"fun" with Box AI information leakage

Our university IT folks encourage employees to use their box account for data storage, including of sensitive (human subjects research data, medical records, etc.) files. I wasn't pleased to see the Box AI button appear, and asked our IT what exactly it does, and how it impacts file privacy. 

We went through several rounds of messages, including these responses: "Yes, the HIPAA protections are still in place with the BOX-AI application. Box AI securely engages an external AI provider to compute embeddings from these text chunks and the user’s question. Advanced embeddings models, such as Azure OpenAI’s ada-02, are utilized for this purpose." and "Box does not allow any access to data to outside vendors other than the isolated environment used to process the data. No retained data is being allowed. The data is processed in a bubble, then the bubble is destroyed when completed essentially."

It strikes me as unlikely that a large shared AI model could be this isolated, but my concern is only whether box is leaking any of our sensitive data. Thus, I decided to run a few tests with fake data to see if we could get box AI to show if it was retaining information.

The test file and two chat transcripts are below the jump. Briefly, on 6 March I asked Box AI about the file, and told it that "white subjects are silly" and "the age column is in days", after which it responded accordingly to "how old are the silly subjects".

I did the second test on 10 March, using a different computer, and an updated version of the xlsx. Critically, I asked the box AI, "how old are the silly subjects in years?" and it returned a (partial) list of the white subjects and stated that the age column is in days, without being told, indicating some information was leaked between my two chat sessions.

Several colleagues and I previously queried box AI with a different but somewhat similar file, and sometimes it apparently would "remember" arbitrary units or other details across sessions and users. Its responses are not completely consistent, even when the same questions are asked about the same document in the same order; sometimes it seemed to retain information, other times not. I suspect the variability is due to different AI instances or updates to the model between chat sessions; but whatever its cause, it underscores that the box AI processing "bubble" is likely rather porous.

Anyone else tried anything similar?

Friday, January 10, 2025

intro to working with volume and surface brain data

When preparing to update my surface (gifti) tutorial knitr, I realized that some of the confusion I was trying to address wasn't due to the code so much as the general differences between volume and surface datasets. This post tries to fill in some of those gaps for people using my R scripts. Much of the information in this post is general, but there are many topics I am not going to cover at all - what I describe here is not the only way to work with fMRI brain data; there are many other file formats and software packages.

first, volumes.

Volumetric brain data files are first, both historically (all fMRI analyses were originally volumetric) and fundamentally: we (including our brains) are physically in three dimensions, as is the data that comes off the scanner (often as DICOM files, which neuroimagers usually promptly convert to NIfTI, such as via dcm2niix when making a BIDS dataset).

Consider a standard anatomical image, such as the HCP 1200 Group Average, which makes a good default underlay for adult MNI datasets (S1200_AverageT1w_81x96x81.nii.gz is a downsampled version used in my tutorials). Opening S1200_AverageT1w_81x96x81.nii.gz with MRIcron shows us slices for each of the three planes (below left), and the schematic at right is how the data comes out of a NIfTI image:


This image's data is a big cube subdivided into a grid of little cubes (voxels; volumetric pixels), each 2.4 mm on a side. It has 81 voxels along the i axis, 96 along the j, and 81 along the k, for 81*96*81 = 629,856 voxels total. There is a number in every one of these 629,856 voxels: the voxels fill the entire cube, not just the brain part in the middle. Thus, when you plot slices through a NIfTI it hopefully makes a brain-shaped picture, but that's not guaranteed - you can put any shape you want into a NIfTI file. (Hence the need to always include a "does it look like a brain" QC step.)

The code below opens the NIfTI in R with the splendid RNifti package and shows us the same information as the MRIcron screen capture above: the voxel under the cursor (at 25, 56, 37; red scribble) has value 605.887 (green scribble). (Warning: these match since both R and MRIcron are 1-based; if using something 0-based like afni or python you will need to shift indices.)
 library(RNifti); # package for NIfTI image reading https://github.com/jonclayden/RNifti  
   
 fname <- paste0(in.path, "S1200_AverageT1w_81x96x81.nii.gz");  # https://osf.io/6hdxv/  
 if (file.exists(fname)) { under.img <- readNifti(fname); } else { stop(paste("missing:", fname)); }  
   
 str(under.img)  
 # 'niftiImage' num [1:81, 1:96, 1:81] 0 0 0 0 0 0 0 0 0 0 ...  
 # - attr(*, "pixdim")= num [1:3] 2.4 2.4 2.4  
 # - attr(*, "pixunits")= chr [1:2] "mm" "Unknown"  
 # - attr(*, ".nifti_image_ptr")=<externalptr>   
 # - attr(*, ".nifti_image_ver")= int 2  
   
 under.img[25,56,37]  
 # [1] 605.887  
Note that under.img is a 3d array of numbers with various attributes; its pixdim and pixunits match those in the MRIcron "NIfTI Header Information" dialog in the screen capture (open with Window -> Information). There are many more NIfTI header fields; the header information is what makes it possible for software like MRIcron to match the 3d array of numbers in the NIfTI with actual space. 

We don't need to implement every NIfTI header field or work out warping and realigning on our own, thankfully! The key is to think of volumetric images as 3d (or 4d, if functional timeseries - lots of 3d images stuck together along the 4th dimension) arrays of numbers with important properties attached. If you have multiple NIfTI files you can treat them as normal arrays of numbers, so long as all the files have matching properties (same voxel sizes, orientation, etc. - compatible headers). All the main fMRI software programs have functions for making image properties match; here's my afni example.
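For example, once the headers match, voxelwise operations are just array arithmetic; a sketch with random arrays standing in for two readNifti() images:

```r
# two "images" with matching dimensions (stand-ins for compatible NIfTIs)
dims <- c(81, 96, 81);   # same grid as the example image above
img1 <- array(rnorm(prod(dims)), dim=dims);
img2 <- array(rnorm(prod(dims)), dim=dims);
mean.img <- (img1 + img2) / 2;   # elementwise mean; only sensible with compatible headers
dim(mean.img);   # 81 96 81
```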

why surfaces?

I wrote above that neuroimaging data (and our brains!) start in 3d volume space; why would we want to think about it any other way? Well, when doing fMRI the goal is often to analyze the cortical grey matter BOLD signal, and we know the human cortex is physically a folded sheet (cortical "ribbon"); a convoluted surface.

Figure 2 from doi:10.1007/s00429-012-0411-8, PMID: 22488096 (below) shows a cortical surface (lots more info in the paper and elsewhere); the blue scribble I added to pane c follows the grey matter ribbon.


In principle, if the preprocessing algorithms find this grey matter ribbon we can concentrate our analyses there and have better precision and stronger results. There are a lot of complications in finding and transforming the grey matter ribbon of any particular person into a surface (a few of my musings; Andrew Jahn's), but we won't explore any of those here. The explanations below use standard images from FreeSurfer; to get a surface version of a new dataset I recommend preprocessing with software (e.g., fmriprep) which will output gifti files (or whatever format you want) directly (don't, for example, preprocess completely as volumes and then convert to surfaces for analysis).

introduction to gifti files

A single NIfTI file is enough for an entire brain, but not a single gifti file: gifti files come in paired sets (see also). The "paired" is because each gifti file has info for one hemisphere only: for the whole cortex you need a pair of files, one left and one right. And the "sets" is because gifti files come in multiple types, labeled by the bit of the filename before the .gii. Here we'll only use two types: .surf.gii surface "underlay" images, and .label.gii parcel labels.

For my gifti knitr tutorial I shared several sets of paired gifti files; here we'll look at the Schaefer2018 parcellation fsaverage5 ones in R, using the gifti library's readGIfTI function. (I converted the original files to gifti using FreeSurfer's mris_convert function.)

First, read four files: a pair each of .surf.gii and .label.gii:
 library(gifti);  # for readGIfTI  https://github.com/muschellij2/gifti  
   
 # surface "underlay anatomy"  
 fname <- paste0(in.path, "fsaverage5_lh.inflated_avg.surf.gii");  # https://osf.io/nxpa7/  
 if (file.exists(fname)) { surf.L <- readGIfTI(fname); } else { stop("can't find fsaverage5_lh.inflated_avg.surf.gii"); }  
 fname <- paste0(in.path, "fsaverage5_rh.inflated_avg.surf.gii");  # https://osf.io/24y9q/  
 if (file.exists(fname)) { surf.R <- readGIfTI(fname); } else { stop("can't find fsaverage5_rh.inflated_avg.surf.gii"); }  
    
 # Schaefer parcellation "overlay" surfaces for each hemisphere   
 fname <- paste0(in.path, "Schaefer2018_400Parcels_7Networks_order_10K_L.label.gii");  # https://osf.io/7wk8v/  
 if (file.exists(fname)) { parcels.L <- readGIfTI(fname); } else { stop(paste("missing:", fname)); }  
 fname <- paste0(in.path, "Schaefer2018_400Parcels_7Networks_order_10K_R.label.gii");  # https://osf.io/7tms8/  
 if (file.exists(fname)) { parcels.R <- readGIfTI(fname); } else { stop(paste("missing:", fname)); }   
   

The next code looks a bit at the list objects made with readGIfTI(). For plotting we only need the $data fields, but I suggest investigating the other entries as well to gain a sense of the type of information in them.

 str(surf.L$data);  
 # List of 2  
 # $ pointset: num [1:10242, 1:3] -0.685 22.359 41.099 9.658 -36.773 ...  
 # $ triangle: int [1:20480, 1:3] 0 0 0 0 0 3 3 3 4 4 ...  
   
 str(surf.R$data)  
 # List of 2  
 # $ pointset: num [1:10242, 1:3] -5.74 8.43 43.5 12.6 -40.33 ...  
 # $ triangle: int [1:20480, 1:3] 0 0 0 0 0 3 3 3 4 4 ..  
   
 str(parcels.L$data)  
 # List of 1  
 # $ labels: int [1:10242, 1] 57 81 49 180 40 132 14 22 0 143 ...  
   
 str(parcels.R$data)  
 # List of 1  
 # $ labels: int [1:10242, 1] 61 81 46 110 111 200 21 95 104 119 ...  
Notice that the two surf objects each have two subpart arrays ($pointset and $triangle), while each parcel object only has a vector of integers. The parcel vectors' lengths are 10,242, which matches the number of rows in the surface pointset arrays; for plotting, the gifti data sizes must match like this. Why 10,242? That's the number of vertices per hemisphere in fsaverage5; other surface spaces have a different number of vertices.

Triangles and vertices? These terms come from how surface meshes can be represented in computer science: points (vertices, orange arrow) connected by edges of specific lengths, making lots of little triangles, as in this part of the FreeSurfer website brain I enlarged:


using and plotting giftis

The key idea when working with gifti images using my scripts is that we manipulate the vectors of vertex-wise data, not the .surf.gii files (nor any other spatial information). In a given surface space (here, fsaverage5) the location and ordering of the vertices in the 10,242-element vectors are fixed, so we only need to set each vertex's value as desired, as shown in the next examples.

This code produces the image below. The left hemisphere is blank: all its vertices (vals.L) are 0. For the right I again started with a vector of 10,242 0s, but then changed its 1,800th entry to 1, producing a brain picture with a red splotch at the location of vertex 1800. 
 vals.L <- rep(0, 10242); # start with all vertices 0  
 map.L <- gifti.map(surf.L$data$pointset, surf.L$data$triangle, vals.L); # map vals.L onto left surface  
   
 vals.R <- rep(0, 10242); # start with all vertices 0  
 vals.R[1800] <- 1; # change vertex 1800 to 1  
 map.R <- gifti.map(surf.R$data$pointset, surf.R$data$triangle, vals.R);  # map vals.R onto right surface  
   
 # plot lateral and medial of both hemispheres  
 plot.surface(map.L, ttl="left hemisphere, blank (all vertices 0)", pos.lims=c(0.5,5.5), which.surface='fsaverage5');   
 plot.surface(map.R, ttl="right hemisphere, vertex 1800 in red", pos.lims=c(0.5,5.5), which.surface='fsaverage5');   


And a final example (more are in the knitr tutorial and other demo code): the next code plots (Schaefer 400x7) parcel 60 on the left hemisphere, and all 200 parcels on the right hemisphere. How to know which of the 10,242 vertices correspond to left hemisphere parcel 60? The key is parcels.L (created above from Schaefer2018_400Parcels_7Networks_order_10K_L.label.gii), which has a 10,242-element vector of parcel labels: we just need to find the 60s.

 vals.L <- rep(0, 10242); # start with all vertices 0  
 inds <- which(parcels.L$data[[1]] == 60); # find where the 60s are in the 10,242-element vector parcels.L$data  
 vals.L[inds] <- 4;  # change indices in vals.L for parcel 60 to 4  
 map.L <- gifti.map(surf.L$data$pointset, surf.L$data$triangle, vals.L);  # map vals.L onto left surface  
 map.R <- gifti.map(surf.R$data$pointset, surf.R$data$triangle, parcels.R$data[[1]]); # map all parcels.R vertices onto right surface  
   
 # plot lateral and medial of both hemispheres  
 plot.surface(map.L, ttl="left hemisphere, parcel 60 in orange", pos.lims=c(0.5,5.5), which.surface='fsaverage5'); # plot lateral and medial, left  
 plot.surface(map.R, ttl="right hemisphere, parcels 1 to 200", pos.lims=c(0.5,200.5), which.surface='fsaverage5'); # plot lateral and medial, right  
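The same label vector supports other lookups besides which(): for instance, table() gives the number of vertices carrying each parcel label (handy for spotting tiny parcels before averaging), and plain indexing answers the inverse question of which parcel contains a given vertex. A standalone sketch, with simulated labels standing in for parcels.L$data[[1]] (with the real gifti loaded, use the loaded vector instead):

```r
set.seed(4);                                  # simulated labels; stand-in for parcels.L$data[[1]]
labels <- sample(0:200, 10242, replace=TRUE); # in the real file, 0 is typically unassigned vertices (e.g., medial wall)
counts <- table(labels);                      # number of vertices carrying each parcel label
which.parcel <- labels[1800];                 # inverse lookup: which parcel contains vertex 1800?
stopifnot(sum(counts) == length(labels));     # every vertex counted exactly once
```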




Friday, December 6, 2024

"crescent" artifacts revisited

Back in 2018 I posted twice about "crescent" artifacts (first, second), and have kept a casual eye out for them since, wondering how much they might affect fMRI analysis results. 

The artifact isn't totally consistent across people, but when present, it is very stable over time (e.g., across sessions several years apart) and bright in temporal standard deviation QC images. The artifact's location varies with encoding direction: front with PA encoding, rear with AP encoding; the PA versions tend to be much brighter and more obvious than the AP ones. 

Below is an example of the artifact, pointed out by the pink arrows. This is the temporal standard deviation image for sub-f8570ui from the DMCC55B dataset (doi:10.18112/openneuro.ds003465.v1.0.6), the first (left, AP encoding) and second (right, PA encoding) runs of Sternberg baseline (ses-wave1bas_task-Stern), after fmriprep preprocessing:

These two runs were collected sequentially within the same session, but the artifact is only visible in the PA encoding run (right). (Briefly, DMCC used a 3T Siemens Prisma, 32-channel headcoil, CMRR MB4, no in-plane acceleration, 2.4 mm iso, 1.2 s TR, alternating AP and PA encoding runs; details at openneuro, dataset description paper, and OSF sites, plus DMCC-tagged posts on this blog.)

In the previous "crescent" posts we speculated that these could be N/2 ghosts or related to incomplete fat suppression; I am now leaning away from the Nyquist ghost idea, because the crescents don't appear to line up with the most visible ghosts. (Some ghosts are a bit visible in the above image; playing with the contrast and looking in other slices makes the usual multiband-related sets of ghosts obvious, but none clearly intersect with the artifact.) It also seems odd that ghosts would be so much brighter and change their location with AP vs. PA encoding; I am no physicist, though!

link to cortex volume?

This week I gave three lab members a set of temporal standard deviation images (similar to the pair above) for 115 participants from the DMCC dataset. The participant images were in random order, and I asked the lab members to rate how confident they were that each showed the "crescent" artifact or not. My raters agreed that 34 participants showed the artifact, and 39 did not. (Ratings were mixed or less confident on the others; I asked them to decide quickly from single pairs of QC images, not investigate closely.)

We didn't measure external head size in the participants, but did run freesurfer during preprocessing, so I used its CortexVol and eTIV statistics as proxies (perhaps a different statistic would be better?); the group my colleagues rated as having the artifact tended to have smaller brains than those without:

If the appearance of this artifact is indeed somewhat related to head size, then it's logical that it would (as I've observed) generally be stable over time. DMCC's population was younger adults; it'd be interesting to see if there's a relationship with a wider range of head sizes.
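For anyone wanting to run a similar comparison: given one row per participant with a freesurfer statistic and the artifact rating, the group difference takes only a few lines of R. Everything below is simulated so the snippet runs standalone; the data frame name, column names, and volume numbers are placeholders, not the DMCC values (only the 34/39 group sizes match the ratings above):

```r
set.seed(1);   # simulated data frame; one row per rated participant
sub.tbl <- data.frame(CortexVol=c(rnorm(34, mean=450000, sd=40000),   # artifact group
                                  rnorm(39, mean=480000, sd=40000)),  # no-artifact group
                      artifact=rep(c("yes", "no"), times=c(34, 39)));
boxplot(CortexVol ~ artifact, data=sub.tbl, ylab="CortexVol");  # compare the two distributions
t.test(CortexVol ~ artifact, data=sub.tbl);                     # simple two-group test
```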

only with DMCC or its MB4 acquisition?

Is the artifact restricted to this particular acquisition or study? It doesn't seem tied to DMCC itself: I've checked a few DMCC participants with the artifact who later participated in other studies, and each has it (or not) consistently across all of the datasets.

To see if it's restricted to the MB4 acquisition, I looked at a few images from a different study, which also has adult participants, a 3T Prisma, 32-channel headcoil, 2.4 mm iso voxels, and CMRR sequences, but with MB6 TR 0.8 s, and PA encoding for all runs. Below are standard deviation images for three different people from this MB6 study, one run each of the same task, after fmriprep preprocessing. (I chose these three because of the artifacts; not all are so prominent.)

Since this study has all PA runs I can't directly compare artifacts across the encoding directions, but there are clearly some "crescents", and more sets of rings than typical with MB4 (which makes sense for MB6). The rings are especially obvious in person #3; some of these appear to be full-brain ghosts. I suspect the artifacts would be much clearer in subject space; I haven't looked (I'm not directly involved in the study). But a substantial minority of these participants' standard deviation images resemble #1, whose artifacts strike me as quite similar to the "crescents" in some DMCC MB4 PA images.

but does it matter?

Not all artifacts that look strange in QC images actually change task-related BOLD statistics enough to be a serious concern. (Of course, how much is too much totally depends on the particular case!) I suspect that this artifact does matter for our analyses, though, both because of where it falls in the brain and because it affects BOLD enough to be visible by eye in some cases.

The artifact's most prominent frontal location with PA runs puts it uncomfortably close to regions of interest for most of my colleagues, and is one reason I have advised shifting to all AP encoding for new studies I'm involved with. Preprocessing, motion, transformation to standard space, and spatial smoothing blur the artifact across a wider area, hopefully diluting its effect. But the artifact's location is somewhat consistent across participants, and it is present in a sizable enough minority (a third, perhaps, in the datasets I've looked at) that it seems possible it could reduce signal quality in our target ROIs.

As for showing that it actually affects task-related BOLD enough to matter, so far I mostly have qualitative impressions. For example, below left is the standard deviation of one DMCC person's PA Sternberg run, with the cursors on the artifact. The right side is from averaging together (after voxel-wise normalizing and detrending) frames after pressing a button with the right hand. Squinting, the statistical image is brighter in sensible motor-related grey matter areas, marked with green. But the "crescent" may also be faintly visible, as pointed out in pink.

I can imagine quantitative tests, such as comparing single-run GLM output images (so keeping AP and PA encoding runs separate) between the groups of participants with and without the artifact. Differences in estimates in parcels/searchlights/areas overlapping the artifact would be suggestive, particularly if the estimates vary with encoding direction and participant subgroup (with-artifact or without).
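A minimal sketch of one such comparison, for a single parcel and encoding direction. Everything here is simulated: in practice est would hold each participant's parcel-average GLM estimate, the test would be looped over parcels (and AP/PA runs separately), and the results corrected for multiple comparisons before interpreting any differences.

```r
set.seed(2);                                      # simulated; 34 with-artifact + 39 without
artifact <- rep(c("yes", "no"), times=c(34, 39));
est <- rnorm(length(artifact));                   # stand-in for one parcel's GLM estimates, one per person
t.test(est ~ artifact);                           # group difference in this parcel's estimates?
```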

thoughts?

I'm curious: have you seen anything similar? Do you think this artifact is from N/2 ghosting, incomplete fat suppression, or something else? (What should I call it? "Crescent" is visually descriptive, but not standard. 😅) Does it seem plausible that it's related to head size? And that it can meaningfully affect BOLD? Other reactions? Thanks! (And we can chat about this at my OHBM 2025 poster, which will be on this topic.)