Friday, December 19, 2014

tutorial: knitr for neuroimagers

NOTICE: this post was replaced by a newer version 13 March 2020. I will leave this post here, but strongly suggest you follow the new version instead.


I'm a big fan of using R for my MVPA, and have become an even bigger fan over the last year because of knitr. I now use knitr to create nearly all of my analysis-summary documents, even those with "brain blob" images, figures, and tables. This post contains a knitr tutorial in the form of an example knitr-created document, and the source needed to recreate it.


What does knitr do? Yihui has many demonstrations on his web site. I use knitr to create pdf files presenting, summarizing, and interpreting analysis results. Part of the demo pdf is in the image at left to give the idea: I have several paragraphs of explanatory text above a series of overlaid brain images, along with graphs and tables. This entire pdf was created from a knitr .rnw source file, which contains LaTeX text and R code blocks.

Previously, I'd make Word documents describing an analysis, copy-pasting figures and screenshots as needed, and manually formatting tables. Besides time, a big drawback of this system is human memory ... "how exactly did I calculate these figures?." I tried including links to the source R files and notes about thresholds, etc, but often missed some key detail, which I'd then have to reverse-engineer. knitr avoids that problem: I can look at the document's .rnw source code and immediately see which NIfTI image is displayed, which directory contains the plotted data, etc.

In addition to (human) memory and reproducibility benefits, the time saved by using knitr instead of Word for analysis summary documents is substantial. Need to change a parameter and rerun an analysis? With knitr there's no need to spend hours updating the images: just change the file names and parameters in the knitr document and recompile. Similarly, the color scaling or displayed slices can be changed easily.

Using knitr is relatively painless: if you use RStudio. There is still a bit of a learning curve, especially if you want fancy formatting in the text parts of the document, since it uses LaTeX syntax. But RStudio takes care of all of the interconnections: simply click the "Compile PDF" button (yellow arrow) ... and it does! I generally don't use RStudio, except for knitr, which I only do in RStudio.


 to run the demo

We successfully tested this demo file on Windows, MacOS, and Ubuntu, always using RStudio, but with whichever LaTeX compiler was recommended for the system.

Software-wise, first install RStudio, then install a LaTeX compiler; if you'll only be using LaTeX with R I suggest using TinyTeX. Within R, you'll, need the knitr and oro.nifti packages (plus tinytex, if using).


Now, download the files needed for the demo (listed below). These are mostly the low-resolution NIfTI files I've used in previous tutorials, with a new anatomic underlay image, and the knitr .rnw demo file itself. Put all of the image files into a single directory. When knitr compiles it produces many intermediate files, so it is often best to put each .rnw file into its own directory. For example, put all of the image files into c:/temp/demo/, then brainPlotsDemo.rnw into c:/temp/demo/knitr/.

Next, open brainPlotsDemo.rnw in RStudio. The RStudio GUI tab menu should look like in the screenshot above, complete with a Compile PDF button. But don't click the button yet! Instead, go through Tools then Global Options in the top RStudio menus to bring up the Options dialog box, as shown here. Click on the Sweave icon, then tell it to Weave Rnw files using knitr (marked with yellow arrow). Then click Ok to close the dialog box, and everything should be ready. In my experience, RStudio just finds the LaTeX installation - you don't need to set the paths yourself.

In the first code block, change the path to point to where you put the image files. Finally, click the Compile PDF button! RStudio should bring up a  running Compile PDF log, finishing with opening the finished pdf in a separate window. A little reload pdf button also appears to the right of the Compile PDF button (red arrow at left). If the pdf viewer doesn't open itself, try clicking this button to reload.

Good luck!

NOTICE: this post was replaced by a newer version 13 March 2020. I will leave this post here, but strongly suggest you follow the new version instead.


UPDATE 15 January 2019: Here is a post with code for plotting surface (gifti) images in knitr (and R).

UPDATE 31 May 2019: Added a copy of the full text of brainPlotsDemo.rnw in below the jump. Note that this includes the make.plots function to plot a row of NIfTI images.

UPDATE 19 July 2019: Added recommendation to use TinyTeX as the LaTeX compiler. After updating R (3.6.0) and RStudio (to 1.2.1335) on my windows computer I was unable to compile to PDF (compilation would start but not complete, leaving empty .synctex(busy) files) with my previous MikTeX setup; switching to TinyTeX fixed the problems. Also slightly edited the demo text and graphs.


NOTICE: this post was replaced by a newer version 13 March 2020. I will leave this post here, but strongly suggest you follow the new version instead.


 \documentclass{article}  
  \addtolength{\oddsidemargin}{-1.25in}  
  \addtolength{\evensidemargin}{-1.25in}  
  \addtolength{\textwidth}{2.5in}  
  \addtolength{\topmargin}{-.875in}  
  \addtolength{\textheight}{1.75in}  
 \begin{document}  
   
 <<startup, echo=FALSE, message=FALSE>>=  
   
 library(oro.nifti); # for NIfTI image reading  
 rm(list=ls());  # clear out R's memory, just in case  
   
 in.path <- "d:/svnFiles/demoCode/knitRdemo/"; # where the images to plot are located  
   
 # load the underlay (anatomical) image.  
 # I generated conteMNI_transformed.nii.gz from Conte69_AverageT1w.nii.gz, provided with the Conte69 atlas  
 # http://brainvis.wustl.edu/wiki/index.php//Caret:Atlases/Conte69_Atlas  
 fname <- paste0(in.path, "conteMNI_transformed.nii.gz");  
 if (file.exists(fname) == FALSE) { stop(paste("missing:", fname)); }  
 under.img <- readNIfTI(fname, reorient=FALSE); # read in the NIfTI. reorient=FALSE here, as well as when loading the functional images  
   
 # this sequence determines which slices will be shown in make.plots. its length should be 10, matching layout(matrix(1:10, 1:10));   
 plot.slices <- seq(from=12, to=30, by=2);   
   
   
 # little function to round the output to 'digits' digits and removing the leading zero.  
 # this function is called in make.plots to shorten numbers for display  
 doformat <- function (in.val, digits=3) {  # in.val <- 0.0034912 # digits <- 3; in.val <- 0.00003;  
  if (is.na(in.val)) {   
   out.val <- "";  
  } else {  
   if (in.val < 0) {out.val <- gsub("^-0", "-", round(in.val, digits)); }  
   if (in.val > 0) { out.val <- gsub("^0", "", round(in.val, digits)); }  
   if (in.val == 0) { out.val <- "0"; }  
   if (out.val == "") {  
    if (digits == 2) { out.val <- "<.01"; }  
    if (digits == 3) { out.val <- "<.001"; }  
    if (digits == 4) { out.val <- "<.0001"; }  
    if (digits == 5) { out.val <- "<.00001"; }  
   }  
   if (out.val == "-") {  
    if (digits == 2) { out.val <- "<-.01"; }  
    if (digits == 3) { out.val <- "<-.001"; }  
    if (digits == 4) { out.val <- "<-.0001"; }  
    if (digits == 5) { out.val <- "<-.00001"; }  
   }  
  }  
    
  return(out.val)  
 }  
   
 # function to plot overlay in.img onto anatomy under.img using typical neuroimaging conventions.  
 # 10 slices shown. positive values hot colors, negative values cool colors.   
 make.plots <- function(in.img, plot.lim, top.ttl=" ", tclr='white', neg.center=0, pos.center=0, second.ttl=" ") {    
  # in.img <- img ; plot.lim <- c(-0.15,0.15); neg.center <- -0.07; pos.center <- 0.07; top.ttl <- "top"; second.ttl <- " "; tclr <- 'white';  
  # in.img <- img ; plot.lim <- c(0.5,0.5); neg.center <- 0; pos.center <- 0; top.ttl <- "top"; second.ttl <- " "; tclr <- 'white';  
    
  # do a bit of error-catching  
  if (pos.center < 0) { stop("pos.center must be greater than zero"); }  
  if (neg.center > 0) { stop("neg.center must be less than zero"); }  
  if (length(plot.lim) != 2) { stop("plot.lim must be two numbers, the first smaller than the second, like c(1,2)"); }  
  if (plot.lim[1] > plot.lim[2]) { stop("plot.lim must be two numbers, the first smaller than the second, like c(1,2)"); }  
    
    
  if (second.ttl == " ") { rng.at <- 0.87 } else { rng.at <- 0.79; } # put the range label at the right spot, depending on the titles  
  # generate the color-scaling label  
  if (plot.lim[1] == plot.lim[2]) { clr.lbl <- paste("clr:", doformat(plot.lim[2],2)); } # ROI-mode, so just list one value  
  if (plot.lim[1] >= 0 & plot.lim[2] > 0) { # positive scaling only  
   clr.lbl <- paste0("clr: ", doformat(plot.lim[1],2), " to ", doformat(plot.lim[2],2));  
  }  
  if (plot.lim[1] < 0 & plot.lim[2] <= 0) { # negative scaling only  
   clr.lbl <- paste0("clr: ", doformat(plot.lim[2],2), " to ", doformat(plot.lim[1],2));  
  }  
  if (plot.lim[1] < 0 & plot.lim[2] > 0) {  # both positive and negative scaling, so need ot show neg.center and pos.center  
   clr.lbl <- paste0("clr: ", doformat(plot.lim[2],2), " to ", doformat(pos.center,2), "; ", doformat(neg.center,2), " to ", doformat(plot.lim[1],2));  
  }  
    
  # start making the image  
  layout(matrix(1:10, 1:10));  # 10 brain slices will be displayed side-by-side  
  par(oma=rep(0,4), mar=rep(0,4), bg="black");  # set the plots to have a black background and very narrow margins   
  for (i in plot.slices) {  # i <- 1; # i <- max(plot.slices)  
   # NOTE : zlim truncates values outside this range. to match mricron behavior, set any values > plot.lim[2] == plot.lim[2].  
   image(under.img@.Data[,,i], col=gray(0:64/64), xlab="", ylab="", axes=FALSE, useRaster=TRUE); # draw the underlay  
   plt.data <- in.img@.Data[,,i];  # get the values for the overlay  
   
   # plot the text labels.  
   if (i == max(plot.slices)) {   
    text(x=0, y=0.94, labels=top.ttl, col=tclr, pos=4, cex=0.9);   
    text(x=0, y=0.87, labels=second.ttl, col=tclr, pos=4, cex=0.9);   
    text(x=0, y=rng.at, labels=paste0("rng: ", doformat(min(in.img, na.rm=TRUE),2), " to ",  
                    doformat(max(in.img, na.rm=TRUE),2)), col=tclr, pos=4, cex=0.9);   
    text(x=0, y=0.05, labels=clr.lbl, col=tclr, pos=4, cex=0.9);   
   } else {  
    text(x=0, y=0.95, labels=paste0("k=",i), col=tclr, pos=4, cex=0.9);   
   }  
     
   # draw the overlay, depending on the color scaling  
   plotted <- FALSE;  
   # ROI mask type plot; only one color  
   if (plot.lim[1] == plot.lim[2]) {  # just plot voxels with this value (or bigger), using one color only.  
    inds <- which(plt.data > plot.lim[2]);  
    if (length(inds) > 0) { # only plot the overlay if there is something to plot on this slice  
     plt.data2 <- array(NA, dim(in.img[,,i])); # get rid of everything else  
     plt.data2[inds] <- plot.lim[2];  
     image(plt.data2, col='blue', useRaster=TRUE, add=TRUE);  
    }  
    plotted <- TRUE;  
   }  
     
   # plot with a range of colors  
   if (!plotted & plot.lim[2] > 0) {  # positive scaling  
    if (plot.lim[1] > 0) { use.min <- plot.lim[1]; } else { use.min <- pos.center; }  
    plt.data[which(plt.data > plot.lim[2])] <- plot.lim[2];  
    image(plt.data, col=heat.colors(64), zlim=c(use.min, plot.lim[2]), useRaster=TRUE, add=TRUE);   
   }  
   if (!plotted & plot.lim[1] < 0) {  # negative scaling  
    if (plot.lim[2] < 0) { use.min <- plot.lim[2]; } else { use.min <- neg.center; }  
    plt.data[which(plt.data < plot.lim[1])] <- plot.lim[1];  
    image(plt.data, col=cm.colors(64), zlim=c(plot.lim[1], use.min), useRaster=TRUE, add=TRUE);   
   }  
  }  
 }  
   
 @  
   
 \section*{brainPlotsDemo.rnw}  
 \noindent source file: \texttt{d:/svnFiles/demoCode/knitRdemo/knitr/brainPlotsDemo.rnw} \par  
 \noindent compiled \today\ \par  
 \noindent This file was written by Jo Etzel (\texttt{jetzel@wustl.edu}), started in December 2014. It may be adapted for personal use, but should be cited rather than redistributed. \par  
 \vspace{0.3 cm}   
 \noindent The purpose of this document is to demonstrate using knitr, particularly to display ``brain blob'' images, but also graphs, tables, and text. The key function for plotting NIfTI images is the \texttt{make.plots} function in the \texttt{startup} code block. I (Jo Etzel) wrote the code, functions, and text in this document but didn't create knitr! The \texttt{knitr} package itself is described at \texttt{http://yihui.name/knitr/}, and was created by Yihui Xie. \par  
 \vspace{0.3 cm}   
 \noindent The \texttt{make.plots} function in this file plots a thresholded overlay image onto a brain anatomy underlay image. The function \textbf{requires} both the overlay and underlay images to be in NIfTI format, with \textbf{exactly the same} voxel size and bounding box (i.e. perfectly aligned voxel arrays). It is common for anatomical images to be at a higher resolution than functional, but such images will not be plotted properly with this function. I recommend generating an underlay anatomy image which matches the functional space for each project; see http://mvpa.blogspot.com/2016/05/resampling-images-with-afni-3dresample.html for a resampling tutorial. \par  
 \vspace{0.3 cm}   
 \noindent This demo should be accompanied by example underlay and overlay images. These files should be saved locally, then their location specified in the \texttt{in.path} variable in the \texttt{startup} code block. I suggest you use RStudio to compile knitr files; it runs seamlessly on my Windows machine. I aim to use universally-valid LaTeX syntax in this demo, so if any of it gives you errors, please let me know. \par  
   
 \vspace{0.3 cm}   
 <<code1, cache=TRUE, dev='pdf', echo=FALSE, fig.height=1, fig.width=8, fig.align='center'>>=  
   
 # display a ROI mask. This is a binary image, so set both entries of plot.lim a bit smaller than the values indicating ROI voxels  
 fname <- paste0(in.path, "roi.nii.gz");  
 if (file.exists(fname) == FALSE) { stop(paste("missing:", fname)); }  
 img <- readNIfTI(fname, reorient=TRUE); # read in the NIfTI. reorient=TRUE here, as well as when loading the underlay image  
 make.plots(in.img=img, plot.lim=c(0.5,0.5), top.ttl="a ROI mask");  
   
 # display a searchlight accuracy mask. This image has continuous positive values; I only want to display values from 0.6 to 1.  
 fname <- paste0(in.path, "searchlightAccuracies_rad2.nii.gz");  
 if (file.exists(fname) == FALSE) { stop(paste("missing:", fname)); }  
 img <- readNIfTI(fname, reorient=TRUE); # read in the NIfTI. reorient=TRUE here, as well as when loading the underlay image  
 make.plots(in.img=img, plot.lim=c(0.6,1), top.ttl="searchlight map");  
   
 # display an image with positive and negative values. The values range from -7.95 to 11.6 (max(img, na.rm=TRUE)).  
 # to display the extreme positive and negative values but not the ones near zero, set plot.lim, neg.center, and pos.center.  
 # while not required, it is often best to set the plot.lim, neg.center, and pos.center to symmetrical values (as here), so that  
 # the color scaling is visually consistent for positive and negative values.  
 fname <- paste0(in.path, "dataset.nii.gz");  
 if (file.exists(fname) == FALSE) { stop(paste("missing:", fname)); }  
 img <- readNIfTI(fname, reorient=TRUE); # read in the NIfTI. reorient=TRUE here, as well as when loading the underlay image  
 img <- img[,,,2]; # this is a 4d image; we only want to display one of them.  
 make.plots(in.img=img, plot.lim=c(-5,5), neg.center=-1, pos.center=1, top.ttl="functional data");  
   
   
 @  
   
 \vspace{0.3 cm}   
 \noindent The color scaling in \texttt{make.plots} follows common neuroimaging conventions: positive values are hot, negative are cool. The function sets values more extreme than the values sent in \texttt{plot.lim} (and displayed as \texttt{clr} on the far-right slice) to the extreme value. For example, if the function is called with \texttt{plot.lim=c(0.5, 0.7)}, a voxel with a value of 0.8 will be colored bright yellow (most extreme hot color), while one with a value of 0.4 will \emph{not} be plotted at all. Likewise, when the color scale includes both positive and negative values (e.g. \texttt{plot.lim=c(-1, 1)}), values more extreme than the first value will be given most extreme negative color (e.g. a voxel with a value of -2 will be colored cyan). Values around zero can be omitted (not plotted) by setting \texttt{neg.center} and \texttt{pos.center}. For example, if \texttt{make.plots} is called with \texttt{plot.lim=c(-1, 1), neg.center=-0.5, pos.center=0.5}, voxels with values between -0.5 and 0.5 will not be plotted; ones with values -0.5 to -1 will be plotted with cool colors, and voxels with 0.5 to 1 will be plotted with hot values (more extreme values are always plotted, so a voxel of 1.5 will be plotted bright yellow). \par  
 \vspace{0.3 cm}   
 \noindent The images are labeled the slice number (in the \emph{i},\emph{j},\emph{k} sense: the voxel array, not anatomic space), with the displayed slices set in \texttt{plot.slices}. The right-most slice gives a title for the plot, along with the range (\texttt{rng}) of values in the overlay image as a whole, not just the displayed slices. The color scaling (\texttt{clr}) is also listed. \par  
 \vspace{0.3 cm}   
 \noindent Note that with \texttt{cache=TRUE} (as it is in \texttt{code1}) you need to make a change in the code block for it to re-execute the code; otherwise it will just read from the cache. So, changing something in the startup code block only will not change the plotted image. Adding a blank line or space (meaningless change) is sufficent to trigger knitr to re-execute the code block. If the cache seems to be causing problems, set \texttt{cache=FALSE} and/or delete everything in the knitr directory (except the \texttt{.rnw}!) and recompile. \par  
   
   
 \vspace{0.2 cm}   
 \section*{two non-fMRI plots, side-by-side}  
 This is two plots, side-by-side, to demonstrate using \texttt{layout} and a little bit of base R graphics. \par  
 \vspace{0.1 cm}   
 <<code2, cache=TRUE, echo=FALSE, dev='pdf', fig.height=3, fig.width=7, fig.align='center'>>=  
 layout(matrix(1:2, c(1,2)))  
 par(mar=c(2.5, 2, 2, 0.75), mgp=c(1.1, 0.2, 0), tcl=-0.3)   
 # mar: c(bottom, left, top, right), number of lines of margin. default is c(5, 4, 4, 2) + 0.1.  
 # mgp: margin line (in mex units) for the axis title, axis labels and axis line. mgp[1] is title, mgp[2:3] is axis. default is c(3, 1, 0).  
 # The length of tick marks as a fraction of the height of a line of text. The default value is -0.5; setting tcl = NA sets tck = -0.01 which is S' default.  
   
 # this makes a fancy bar plot  
 set.seed(4901); # to reproduce the random numbers  
   
 num.x <- 5; # how many x-axis values there are (condition labels, whatever)  
 a.vals <- abs(rnorm(num.x))*2; # num.x random positive numbers, making up the values for condition a  
 b.vals <- abs(rnorm(num.x))*2;  # and some for condition b  
   
 a.ebs <- abs(rnorm(num.x)); # error bar lengths (calculated elsewhere, if this was real data)  
 b.ebs <- abs(rnorm(num.x));   
   
 # set plot display-type parameters  
 a.col <- 'salmon'; # color for the a columns  
 b.col <- 'lightblue'   
 oset <- 0.3; # x offset for the bars  
 x.ttl <- "x-axis label"  
 y.ttl <- "y-axis label"  
 ttl <- "plot title";  
 x.lim <- c(0, num.x+1);   
 y.lim <- c(0, max(c(a.vals+a.ebs, b.vals+b.ebs))+0.2); # probably just put in the limits for real data, but here, calculate them from the random nums  
   
 plot(x=0, y=0, col='white', xlim=x.lim, ylim=y.lim, xlab=x.ttl, ylab=y.ttl, main="", xaxt='n', yaxs='i');  # start with a blank plot  
 mtext(ttl, side=3, cex=1, line=0.1);  # add the title (can also use main=ttl in plot(), but mtext gives more spacing control)  
 axis(side=1, at=1:num.x, labels=paste0("val", 1:num.x)); # put the x-axis on with text labels  
 # add the rectangles and error bars  
 for (i in 1:num.x) {   # i <- 1  
  rect(xleft=i-oset, ybottom=0, xright=i, ytop=a.vals[i], col=a.col, border=NA); # colored bar  
  arrows(x0=i-(oset/2), y0=a.vals[i], x1=i-(oset/2), y1=a.vals[i]+a.ebs[i], angle=90, length=0.03);  # error bar up  
  arrows(x0=i-(oset/2), y0=a.vals[i], x1=i-(oset/2), y1=a.vals[i]-a.ebs[i], angle=90, length=0.03);  # error bar down  
    
  rect(xleft=i, ybottom=0, xright=i+oset, ytop=b.vals[i], col=b.col, border=NA)  
  arrows(x0=i+(oset/2), y0=b.vals[i], x1=i+(oset/2), y1=b.vals[i]+b.ebs[i], angle=90, length=0.03);  
  arrows(x0=i+(oset/2), y0=b.vals[i], x1=i+(oset/2), y1=b.vals[i]-b.ebs[i], angle=90, length=0.03);  
 }  
   
 legend(x='topleft', fill=c(a.col, b.col), legend=c("a type", "b type"), bty='n');  
 box(); # put the line around the plot again, since it was interrupted by the bars and legend  
   
   
 # and a simple plot of some points for the right side.  
 c.vals <- c(0.5, 2.25, 4.5, 5, 1.2);  
 plot(x=1:length(c.vals), y=c.vals); # plotting parameters guessed by R based on the plotting call  
 points(x=1:length(a.vals), y=a.vals, col='red', pch=2, cex=2);  # added points, but a.vals[1] barely shown since < 0.5  
   
 @  
   
 \vspace{0.3 cm}   
 \noindent A table of the plotted values, with \texttt{echo=TRUE}. Fancier tables can be made with \texttt{xtable}, if needed. \par  
 <<code3, cache=TRUE>>=  
   
 # knitr has nice syntax coloring when showing code  
 tbl <- data.frame(paste0("val", 1:5), a.vals, b.vals, c.vals);  
 colnames(tbl) <- c("case", " a.vals", " b.vals", "c.vals");  
 print(tbl);  
   
 print("text for the example");  
   
 @  
   
 \noindent The same, except \texttt{echo=FALSE}. \par  
 <<code3b, echo=FALSE, cache=TRUE>>=  
   
 # knitr has nice syntax coloring when showing code  
 tbl <- data.frame(paste0("val", 1:5), a.vals, b.vals, c.vals);  
 colnames(tbl) <- c("case", " a.vals", " b.vals", "c.vals");  
 print(tbl);  
   
 print("text for the example");  
   
 @  
   
   
 \end{document}  
   

1 comment: