Tuesday, June 16, 2020

tip: running FSL commands from within R

I very much believe that analyses should use existing/established programs/packages/functions as much as possible, preferably, the original version of the program in a transparent way. But I also want to preserve a record of the analysis, and be able to rerun it if needed, while minimizing the chance of difficult-to-identify errors like of clicking the wrong setting in a GUI.

I'm increasingly convinced that a good strategy is to do file management-type things in the script,  build the needed (FSL, afni, etc.) command, then send the command directly to the target program (FLS, afni, etc.), rather than use a specialized interface package. There are at least two big advantages to building the actual commands: first, few (if any) dependencies or new syntax are introduced, second, the executed commands are immediately and transparently recoverable. Note that this advice is for analysis scripts, not full-scale pipelines or software.

For example, I needed to run FSL flirt to register around twenty images. This is the only step requiring FSL, and my script is in R. I have a few options, including the FLIRT GUI, generating a text file of the flirt commands and running it at the command line, typing the commands directly at the command line, or running fsl from within R using wrapper functions (e.g., fslr).

This code shows the solution I'm advocating: using R functions for the file-wrangling and command-building parts, and then using system2 to "send" the command to fsl for running. This strategy can be used not only with fsl, but also afni, wb_command, and other command-line programs.

 fsl.path <- "/usr/local/pkg/fsl6.0/bin/";   # path to fsl flirt
 in.path <- "/data/";  # path to the input images 
 ref.path <- "/data/";  # path to the reference images  
 out.path <- "/scratch3/registrations/";   # where to write files  
 sub.ids <- c(102008, 107321, 814649, 849971);  # subject IDs
 for (sid in 1:length(sub.ids)) {   # sid <- 1;   
  # make the file names
  fname.in <- paste0(in.path, sub.ids[sid], "/T1w/T1w_acpc_dc_restore.nii.gz");   
  fname.ref <- paste0(ref.path, sub.ids[sid], "/T1w/T1w_acpc_dc_restore.nii.gz");    
  fname.out <- paste0(out.path, sub.ids[sid], "_T1w.nii.gz");  
  fname.mat <- paste0(out.path, sub.ids[sid], "_T1w.txt");  
  # check if the input files exist (and output files do not), then run fsl if so
  if (file.exists(fname.in) & file.exists(fname.ref) & !file.exists(fname.out) & !file.exists(fname.mat)) {  
   system2(paste0(fsl.path, "flirt"), args=paste("-in", fname.in, "-ref", fname.ref, "-out", fname.out, "-omat", fname.mat, "-dof 6"),  
       stdout=TRUE, env="FSLOUTPUTTYPE=NIFTI_GZ");  

The system2 function "sends" the command to fsl and can be used to check what was done:
  • paste0(fsl.path, "flirt") shows which program was run and its location on disk.
  • args= has the options sent to the flirt program, as a string. Copying out just the paste(...) part and running it in R will print it for inspection (example below).
  • stdout=TRUE tells R to print fsl's messages as it runs (useful to spot any errors)
  • env="FSLOUTPUTTYPE=NIFTI_GZ" sets the fsl environment variable FSLOUTPUTTYPE to NIFTI_GZ. Depending on your system and fsl installation you may need to set more (or no) environment variables.
Specifically, for the first participant (sid 1), the code generates this, which could be run without R:
/usr/local/pkg/fsl6.0/bin/flirt -in /data/102008/T1w/T1w_acpc_dc_restore.nii.gz -ref /data/102008/T1w/T1w_acpc_dc_restore.nii.gz -out /scratch3/registrations/102008_T1w.nii.gz -omat /scratch3/registrations/102008_T1w.txt -dof 6