Friday, August 5, 2022

Tracking US universities' post-Dobbs research privacy guarantees

Many types of human-subjects research collect information related to pregnancy (e.g., date of last menstrual cycle for a circadian rhythm study; a pregnancy test before imaging; questionnaires during high-risk pregnancies). After Dobbs v. Jackson Women's Health Organization, this data could expose participants to legal risk if they were charged with something like obtaining an abortion or endangering a pregnancy.

NIH Certificates of Confidentiality protect participants' information from disclosure, but have exceptions in "limited circumstances":
*Disclosure of identifiable, sensitive information (i.e., information, physical documents, or biospecimens) protected by a Certificate of Confidentiality must be done when such disclosure is required by other applicable Federal, State, or local laws. [emphasis in original]

The question, then, is whether abortion-related lawsuits would fall under the Certificate of Confidentiality's limited circumstances: could researchers be required to disclose data? If so, participants must be informed of the risk during consent, and researchers must consider whether some data can be ethically collected.

These issues apply to all US researchers, but I am not aware of any official NIH-level guidance, nor of any university that has issued a formal opinion. Many universities want to avoid abortion-related publicity and so are not making public statements, but at the same time they are trying to quietly reassure faculty/staff/students that medical treatment and privacy are the same as they were pre-Dobbs.

I believe that this public/private message disconnect cannot and should not continue indefinitely, and that proactive privacy guarantees are less potentially harmful to participants than a lawsuit or court-forced data release. My motivation for this page is that openness is generally good, and that since universities and other institutions tend to adopt each other's policies, if a few take action, others will likely follow.

This page is meant to track what researchers are told about the privacy of their pregnancy-related data. Have you asked whether your university would require release of data in the case of an abortion-related lawsuit? If so, what was the response? Was the response formal and citeable, such as a memo or HRPO/IRB guidance?

I started the table with my own understanding of the situation at my university. Please send me (via email, twitter, or a comment on this post) what is happening at your institution. If there was a formal communication, please include its URL. Notes such as "asked HRPO 15 July, no response yet" are also welcome. I promise not to include your name/contact info unless you explicitly request otherwise (blog comments can be made anonymously).

Thank You!

State | Institution | Date | Formal? | Status/Notes
MO | Washington University in St. Louis | 2 Aug 2022 | no | verbal communication that pregnancy-related research records will not be released in an abortion lawsuit, regardless of whether the NIH Certificate of Confidentiality is sufficient protection.


Friday, July 15, 2022

research in the United States after the fall of Roe v. Wade

This post is an essay-style, expanded version of messages I’ve posted on twitter (@JosetAEtzel) over the last few weeks, responding to the Dobbs v. Jackson Women's Health Organization decision overturning Roe v. Wade in the United States, and to Missouri’s subsequent trigger law outlawing abortion except in dire emergencies. I hoped these issues would rapidly become outdated, but unfortunately that is not the case; if anything they are compounding, and I very much fear no end is in sight. I am not willing to be silent on the topic of protecting participants, or on university ethics more generally.

I am a staff scientist at Washington University in St. Louis, Missouri, USA, and have been here twelve years now. It’s been a good place to do research, and I have great colleagues. I work with data collected on humans, mostly task fMRI. I generally spend my time at work on analysis and hunting for missing or weird images in our datasets, but over the last few weeks I’ve spent a substantial amount of time hunting for pregnancy-related information in our procedures and datasets, and seeking answers to how the legal changes affect us and our participants.

Our fMRI consenting protocols require the use of screening forms that ask whether the participant is currently pregnant; high-risk studies (PET-MR) require that a pregnancy test be performed immediately before the scan. These signed and dated screening forms are retained indefinitely by the imaging center at the hospital and/or our lab. Imaging studies routinely include pregnancy questions in the phone screening to determine eligibility.

An additional source of pregnancy information in our datasets comes via studies using passive sensing data collection (e.g., via an app installed on participants’ phones). These can include GPS and other forms of tracking, which could, for example, show whether the participant spent time at a place where abortions are provided or searched for abortion information. Previous data breaches have happened with this type of research software, and the collection of any GPS or other tracking information raises serious privacy concerns, but my focus here is the security of this data after it is in the researchers’ hands.

We need guarantees that we will never be asked to release this data, even in the (appalling but not totally unprecedented) case that someone is charged with abortion and we are asked by a court to disclose whether the participant said that yes, they were pregnant on a particular date.

NIH Certificates of Confidentiality protect participants’ information from disclosure, but have exceptions in “limited circumstances”. “*Disclosure of identifiable, sensitive information (i.e., information, physical documents, or biospecimens) protected by a Certificate of Confidentiality must be done when such disclosure is required by other applicable Federal, State, or local laws.” At Washington University in St. Louis (as of 11 July 2022) we are being told it might not be sufficient to rely upon the Certificate of Confidentiality; that it is not "bulletproof" for state-level abortion-related lawsuits. University counsel here is still investigating, as I assume are those elsewhere.

I have been hoping that Washington University in St. Louis and other research universities would promise to protect participant (and patient) pregnancy-related information, announcing that they would fight attempts to force disclosure in any abortion-related lawsuit. So far, this has not occurred. Universities often have strong law departments and a pronounced influence on their communities, both as large employers and as venerable, respected institutions. Ethics-based statements that some laws will not be complied with could have an outsized influence, and serve as a brake on those pushing the enforcement and passage of ever more extreme abortion-related laws.

Since we currently lack pregnancy-related data confidentiality guarantees, in our group we have begun efforts to lessen the chances of our participants incurring extra risks from being in our studies – or even from being *asked* to be in our studies. Reducing our collection of potentially sensitive information to the absolute minimum is one step: even if subpoenaed or otherwise requested, we will not have potentially harmful records to disclose. Concretely, we have changed our screening procedures (the change has now been approved) such that the participant is asked whether any of a group of several exclusion criteria apply, only one of which is pregnancy (rather than asking about pregnancy separately). The participant then does not have to verbally state that they are pregnant, nor does the experimenter note which of the exclusion criteria was met.

Participants will still need to complete the screening form immediately before scanning, but presumably anyone who reaches this stage will respond that they are not pregnant; if they are pregnant, the scan is cancelled and the screening form destroyed. This procedure reduces risk if we assume that recording “no, not pregnant” on a particular date poses less potential legal trouble for the participant than a “yes, pregnant” response, which hopefully is the case. However, it is not unimaginable that an abortion lawsuit could include proof from elsewhere that the participant was pregnant on a particular date before the experiment, in which case their statement (or test result, in the case of studies requiring one) of not being pregnant on the experiment date could be relevant and damaging. At this time we can’t avoid using the forms with the pregnancy questions, but we may start warning participants in advance that they will have to respond to a pregnancy question, and that we can’t guarantee their response will be kept private and used only in the context of the experiment.

The impact of the Dobbs decision (and in our case, Missouri state abortion trigger laws) on non-reproduction-related human subjects research is only a small subset of the harm from these laws, of course, but it is a new risk US-based researchers should consider. Human subjects protections are not trivial and must not be brushed aside, even if we hope no more abortion-related legal actions will occur. As scientists, our ethics, honor, and integrity require us to follow not just the letter but also the spirit of guidelines like the Declaration of Helsinki; we must work towards the best and strongest possible participant protections.

I hope that this essay has caused you to consider what data you are collecting, whether it puts your participants at new legal risk, and what you can do to minimize such risk in the short and long terms. Immediate actions such as changing how pregnancy is asked about or stopping collection of especially sensitive information seem to me the minimum ethically appropriate action; stronger, legally-binding guarantees of confidentiality may be needed soon for many types of human subjects research to continue responsibly in the United States.


UPDATE 3 August 2022: Our screening changes were approved, so I edited the relevant text and added a link to the tweet showing the approved version.

Yesterday I tweeted that a source I trust (and in a position to know) told me that Washington University in St. Louis counsel/administration told them that pregnancy-related research records will not be released in an abortion-related lawsuit, regardless of whether the NIH Certificate of Confidentiality is sufficient protection. That is good news, but I am troubled that it came via word of mouth; my source said I shouldn’t “hold out” for an official statement. It is hard to be confident without something concrete; even a technically-phrased memo or HRPO website note would be encouraging. It seems that we are being asked to act as if nothing has changed post-Dobbs, and trust that everything will be fine, but that's an awfully big ask for issues this consequential.

Tuesday, July 12, 2022

What happened in this fMRI run?

This is one of those occurrences (artifacts?) that is difficult to google, but perhaps someone will recognize it or have a guess.

This run is from a session in which a person completed four fMRI runs of a task sequentially. They did not get out of the scanner between these runs, nothing was changed in the protocol, and no one entered the scanner room. Later participants (with the same protocol, scanner, etc.) have been fine. This study uses CMRR MB4 acquisitions, so we have an SBRef image for each run; the artifact is the same in the SBRef and the functional run.

Runs 1 (not shown), 2, and 4 are normal, but run 3 is much darker than the others and has an obvious ghost-ish artifact. Here are the DICOMs from each run's SBRef, with mango allowed to adjust the contrast in each:


And here they are again, with contrast set to 1-15000 in all three images:


The functional run's DICOMs are also dark and have the prominent artifact; here's a frame:


When the run is viewed as a movie in mango, the blood flow, small head movements, etc. are plainly visible and look typical. The artifact does not appreciably shift or change over the course of the run, other than appearing to follow the (small) overt head motions (when the head nodded a bit, the artifact shifted in approximately the same way). The two surrounding runs (2 & 4) are typical in all frames (no sign of the artifact).

Given that this artifact is in the DICOMs, it's not introduced by preprocessing, and I am assuming this run is unusable. I'd like an explanation, though, if nothing else so that we can take steps to reduce the chance of a recurrence. Our best guess at this time is some sort of transient machine fault, but that's not an especially satisfactory explanation.

Any ideas? Thanks!


update 13 July 2022:

In response to Ben and Renzo's suggestions, I skimmed through the DICOM headers for fields with large differences between the three runs; if there are particular fields to look for, please let me know (this is a Siemens Prisma); I am not fluent in DICOM headers! The most obvious differences are in the fields below (collected into a table for easier comparison), which I believe are related to intensity display (windowing), but I'm not sure whether they report a setting or something determined from the image after it was acquired.

Field (DICOM tag) | run 2 (typical) | run 3 (dark/artifact) | run 4 (typical)
Largest Image Pixel Value (0028,0107) | 32238 | 3229 | 31787
Window Center (0028,1050) | 7579 | 1218 | 7423
Window Width (0028,1051) | 16269 | 3298 | 15912
Window Center & Width Explanation (0028,1055) | Algo1 | Algo1 | Algo1
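In case it is useful, here is a minimal R sketch of how header fields like these could be pulled out programmatically for comparison across runs. It assumes the oro.dicom package and uses hypothetical file paths; the field names are as they appear in oro.dicom's dictionary, which may differ slightly from the display names above.

library(oro.dicom);   # readDICOMFile() reads a single DICOM file into header + image parts

# hypothetical paths: one SBRef DICOM file from each run
sbref.files <- c(run2="/path/to/run2_SBRef.dcm",
                 run3="/path/to/run3_SBRef.dcm",
                 run4="/path/to/run4_SBRef.dcm");

# pull one named field out of a parsed header data.frame (columns include name and value)
get.field <- function(hdr, field) { hdr$value[hdr$name == field][1]; }

fields <- c("LargestImagePixelValue", "WindowCenter", "WindowWidth");
tbl <- t(sapply(sbref.files, function(fn) {
  hdr <- readDICOMFile(fn)$hdr;
  sapply(fields, function(f) get.field(hdr, f));
}));
print(tbl);   # one row per run, one column per field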

 

And here's yet another view from the three functional runs, in which I played with the contrast a bit. There's definitely a difference in which structures are brightest between the three.


 

Monday, June 13, 2022

now available: DMCC55B rejected structurals and ratings

We received several requests for the rejected DMCC55B structural images, since comparing images of different quality can be useful for training. Thanks to the assistance of Rachel Brough, we have now released T1 and T2 images for the 13 DMCC55B people whose initial structural scans we rated as poor quality, as well as our ratings for both sets of images (the initial rejected images and the better repeated ones).

The rejected structural images (with session name “doNotUse”) and ratings are in a new sub-component (postPublication_rejectedStructurals) of the DMCC55B supplemental site, rather than with the released dataset on openneuro, to avoid confusion about which images should be used for processing (use the previously-released ones available at openneuro).

Wednesday, June 8, 2022

troubleshooting run-level failed preprocessing

Sometimes preprocessed images for a few runs are just ... wrong. (These failures can be hard to find without looking, which is one of the reasons I strongly suggest including visual summaries in your QC procedures; make sure you have one that works for your study population and run types.)

Here's an example of a run with "just wrong" preprocessing that I posted on neurostars: the images are all of the same person and imaging session, but one of the runs came out of the preprocessing seriously distorted.


And here's another: the images from the run at the lower right are clearly tilted and squashed compared to the others:

The above images are temporal means made after fmriprep preprocessing, including transformation to the MNI template anatomy (i.e., _space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz);  see this post for more details and code. 
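(If you don't want to click through, here is a minimal sketch of the temporal mean calculation, assuming the RNifti package and a placeholder filename for one preprocessed run; the slice number is arbitrary.)

library(RNifti);   # readNifti()

# placeholder filename for one fmriprep-preprocessed BOLD run (a 4d image: x, y, z, time)
bold <- readNifti("sub-01_task-xyz_run-1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz");

mean.img <- apply(bold, 1:3, mean);    # average across time, leaving a 3d image

# quick look at one axial slice; a full QC summary would show several slices per run
image(mean.img[,,40], col=gray(0:64/64), useRaster=TRUE, xaxt="n", yaxt="n");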

How to troubleshoot this type of partial failed preprocessing?

First, note that this is not due to the entire preprocessing pipeline failing: we have the expected set of derivative images for the participant, and everything looks fine for most runs. This suggests that the problem is not with the pipeline itself, but rather that something is unusual about this particular run or the images that went into preprocessing it.

fmriprep (and I think most preprocessing pipelines?) is not fully deterministic: if you run the same script twice with the same input files and settings the output images will not be exactly the same. They should be quite similar, but not identical. We have found that sometimes simply rerunning the preprocessing will correct the type of sporadic single-run normalization/alignment failures shown above.

If repeating the preprocessing doesn't fix the failure, or you want to investigate more before rerunning, I suggest checking the "raw" (before preprocessing; as close to the images coming off the scanner as practicable) images for oddities. Simply looking at all the images, comparing those from the run with failed preprocessing against the other (successful) runs from the same session, can often make the problem apparent.

Look at the actual functional images of the problematic run (e.g., by loading the DICOMs into mango and viewing as a movie): do you see anything strange? If the problematic run seems to have more-or-less the same orientation, movement, visible blood vessels, etc. as the non-problematic runs for the person, it is unlikely that the source of the problem is the functional run itself and you should keep looking. (If the functional run itself is clearly unusual/full of artifacts, it is most likely simply unusable and should be marked as missing.)

If the functional run seems fine, look at all of the other images used in preprocessing, especially any fieldmaps and single-band reference (SBRef) images. Depending on your acquisition you may have one or more fieldmaps and SBRef images per run, or per set of runs. For the DMCC we use CMRR multiband sequences, so we have an SBRef image for every functional run, plus a fieldmap for each session. Both the fieldmaps and SBRef images are used in preprocessing, but differently, and if either has artifacts they will affect the preprocessing.

How an artifact in the fieldmap or SBRef affects the preprocessing can be difficult to predict; both can cause similar-looking failures. In the two examples above, the first was due to artifacts in the fieldmaps, the second in the SBRef.

This is a screen capture of three SBRef images from the session shown in the second example. The numbers identify the scans and are in temporal order; image 41 is the SBRef for the affected run (a "tilted" brain); 23 for the correctly-preprocessed run above it. There are dark bands in parts of scan 41 (red), and it looks a bit warped compared to the other two; below is how they look in coronal slices:


All SBRef images look rather odd (here, there's some expected ear dropout (yellow) and encoding direction stretching), but the key is to compare the images for runs in which preprocessing was successful (23 and 32) with those for which it failed (41). The SBRef for 41 is obviously different (red): black striping in the front, and extra stretching in the lowest slices. This striping and stretching in the SBRef (probably from movement) translated to tilting in the preprocessed output above.
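If you prefer to script this kind of side-by-side check rather than use a viewer, here is a small R sketch. It assumes the SBRef images have been converted to NIfTI and uses hypothetical filenames and an arbitrary intensity ceiling; fixing zlim across the panels keeps overall brightness differences visible instead of being hidden by per-image autoscaling.

library(RNifti);

# hypothetical NIfTI versions of the three SBRef images (scans 23, 32, and 41)
fnames <- c("scan23_sbref.nii.gz", "scan32_sbref.nii.gz", "scan41_sbref.nii.gz");

layout(matrix(1:3, nrow=1));    # three panels, one SBRef each
for (fn in fnames) {
  img <- readNifti(fn);
  sl <- img[, round(dim(img)[2]/2), ];    # a middle coronal slice
  # cap and fix the color scaling so the same intensity maps to the same grey in every panel
  image(pmin(sl, 15000), col=gray(0:64/64), zlim=c(0, 15000), useRaster=TRUE,
        xaxt="n", yaxt="n", main=fn);
}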

What to do about it?

Ideally, you won't collect SBRef images or fieldmaps with strange artifacts; the scanner will work properly and participants will be still. If the images are checked during acquisition it is often possible to repeat problematic scans or fix incorrect settings. This is of course the best solution!

However, sometimes an artifact or movement is not apparent during data collection, the scan can't be repeated, or you are working with an existing dataset and so are stuck with problematic scans. In these cases, I suggest doing something like in this post: look at all of the scans from the session (and other sessions if relevant) and try to determine the extent and source of the problem. 

In the case of the fieldmap artifacts, every fieldmap from the scanning session was affected, but fieldmaps for the same person from two other sessions were fine. We "fixed" the failed preprocessing by building a new BIDS dataset, swapping out bad fieldmaps for good ones and changing the filenames accordingly. Before trying this I checked that the person's head position, distortion, etc. were quite similar between the runs. I do not really recommend this type of image swapping; things can go badly wrong. But it is something to consider if you have similar artifact-filled and good images from the same person and acquisition. With the SBRef image we have another option: SBRef images are not required, so we could delete poor ones (here, scan 41) and repeat the preprocessing without them. 
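For what it's worth, here is a rough R sketch of the fieldmap-swapping bookkeeping, not of our actual scripts: it assumes a standard BIDS layout with per-session fmap directories and IntendedFor fields in the fieldmap JSON sidecars, and the subject/session labels and paths are hypothetical. Work on a copy of the dataset, and confirm first that head position and distortion are similar between the sessions.

library(jsonlite);   # fromJSON(), write_json()

good <- "bids_copy/sub-01/ses-2/fmap";   # session with clean fieldmaps
bad  <- "bids_copy/sub-01/ses-1/fmap";   # session whose fieldmaps have artifacts

# copy the good session's fieldmap files into the bad session, renamed to match it
for (fn in list.files(good, full.names=TRUE)) {
  file.copy(fn, file.path(bad, gsub("ses-2", "ses-1", basename(fn))), overwrite=TRUE);
}

# the copied JSON sidecars' IntendedFor entries still point at the good session's runs; update them
for (jf in list.files(bad, pattern="\\.json$", full.names=TRUE)) {
  sidecar <- fromJSON(jf);
  sidecar$IntendedFor <- gsub("ses-2", "ses-1", sidecar$IntendedFor);
  write_json(sidecar, jf, auto_unbox=TRUE, pretty=TRUE);
}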

Neither workaround (swapping or deleting) should be used lightly or often, but they have produced adequate-quality images for us in a few cases. To evaluate the results, I look very closely at the resulting preprocessed images and BOLD timecourses, checking the anatomy and comparing the positive control analysis (see also) results for runs/sessions with normal and workaround preprocessing. For example, confirm that the changes in BOLD timed to your stimuli in particular visual parcels are essentially the same in runs with the different preprocessing; if different visual parcels show the stimulus-evoked changes in BOLD depending on the preprocessing, the workaround was not successful.
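As a concrete (hypothetical) version of that parcel-level check: the sketch below assumes the RNifti package, a parcellation image resampled to match the preprocessed BOLD, and placeholder filenames and parcel number.

library(RNifti);

# the same run preprocessed two ways (placeholder filenames)
bold.orig <- readNifti("run1_preproc_original.nii.gz");      # normal preprocessing
bold.swap <- readNifti("run1_preproc_workaround.nii.gz");    # workaround preprocessing
parcels   <- readNifti("Schaefer400x7_MNI_2mm.nii.gz");      # parcellation, same space and resolution

p.id <- 4;                          # hypothetical id of a visual parcel with a clear task response
vox  <- which(parcels == p.id);     # linear indices of that parcel's voxels

vol.to.mat <- function(img) { matrix(img, ncol=dim(img)[4]); }   # reshape 4d (x,y,z,t) to voxels x timepoints

# parcel-average timecourse under each preprocessing
tc.orig <- colMeans(vol.to.mat(bold.orig)[vox,]);
tc.swap <- colMeans(vol.to.mat(bold.swap)[vox,]);

plot(tc.orig, type='l', xlab="TR", ylab="parcel mean BOLD");
lines(tc.swap, col='blue');
cor(tc.orig, tc.swap);    # should be very high; stimulus-locked changes should appear in both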

Wednesday, September 15, 2021

"The Dual Mechanisms of Cognitive Control Project": post-publication analyses

The recently-published "The Dual Mechanisms of Cognitive Control Project" paper (preprint, publisher) describes the motivations and components of the project as a whole, but also contains several analyses of its task fMRI data. The supplemental information for the manuscript has the R (knitr) code and input files needed to generate the results and figures in the manuscript.

The supplemental information now has more results than those included in the paper, however: versions of the analyses using different sets of participants, different parcels, and different ways of estimating the HDR curves (GLMs), which I will briefly describe here.

For example, Figure 5 (below) shows the estimated HDR curves for each task (high - low control demand) in the Baseline (blue, BAS) and Reactive (green, REA) sessions. The grey shading marks the target window, when we expect the control demand difference to be greatest. Of interest is that the estimates are greater in the target window for all tasks and sessions, with the Baseline session estimates larger than those from the Reactive session (see the manuscript for more explanation and framing).

The top version (supplemental file) includes 80 participants in the estimates (some related), averages the estimates from a set of 35 parcels (from the Schaefer 400x7 parcellation) found to be particularly sensitive to DMCC tasks, and uses GLMs estimating one knot for every 2 TRs.

The second version (supplemental file) shares the theoretically interesting aspects: curves mostly peak in the target (grey) area, blue curves mostly above green. There are many differences, though: the second graph is from a post-publication analysis using the DMCC55B participants (55 unrelated people; 48 of whom are in both 55B and the 80-participant set), the set of 32 Schaefer 400x7 parcels approximating the Core Multiple Demand network (12 parcels are in both sets), and GLMs estimating one knot for every TR.

It is reassuring to see that the analysis results are generally consistent despite these fairly substantial changes to the inputs. Sometimes results can look great but are due to a statistical fluke or overfitting; in these cases, small changes to the analysis that shouldn't matter (e.g., removing or replacing several participants) often make large changes in the results. The opposite occurred here: fairly substantial changes to the parcels, participants, and (to a lesser extent) GLMs led to generally matching results.

The paper's osf site now contains results files for all the different ways to set up the analyses, within the "postPublication_coreMD" and "postPublication_1TRpK" subdirectories. The variations:

  • 80 or 55 participants. Files for analyses using the DMCC55B participants have a "DMCC55B" suffix; files for the original set of 80 participants have either no suffix or a "DMCC80" suffix.
  • 35 or 32 parcels. The set of 35 parcels identified via DMCC data are referred to as the 35-megaparcel or "parcels35"; the 32 parcels approximating the core MD are referred to as "core32".
  • GLMs with 1 or 2 TRs per knot. The original analyses all used GLMs with 2 TRs per knot ("2TRpK"); the 1 TR per knot GLMs are abbreviated "1TRpK", including in the file names.

Tuesday, September 7, 2021

approximately matching different parcellations

We wanted to use the core MD (multiple demand) regions described in Assem et al., 2020 in some analyses but ran into a difficulty: Assem2020's core MD regions are defined in terms of HCP MMP1.0 (Multimodal Parcellation, MMP) parcels and the fsLR surface (HCP fsLR32K), but for the project we wanted to use the Schaefer et al., 2018 400x7 (400 parcels by 7 networks) parcellation and the fsaverage5 surface. Is it possible to approximate the core MD regions with a set of Schaefer 400x7 parcels? How to determine which set produces the best approximation? This post describes our answer, as well as the resulting set of Schaefer 400x7 parcels; its logic and code should be adaptable to other parcellations.

Surface calculations can be complex because vertices do not correspond to a fixed area in the same way that voxels do (e.g., a cubical "searchlight" of  eight voxels will have the same volume no matter which voxel it's centered on, but the surface area covered by the circle of a vertex's neighbors will vary across the brain according to the degree of folding at that vertex). I decided to work with parcellations defined in the same space (here, fsLR), and match at the vertex level. Matching at the vertex level has some implications, including that all vertices are equally important for determining the degree of correspondence between the parcellations; vertices are not weighted by the surface area of their neighborhood. This has the advantage of being independent of factors like surface inflation, but may not be sensible in all cases.

The approximation procedure is iterative, and uses the Dice coefficient to quantify how well two collections of vertices match; a larger Dice coefficient is better. This use was inspired by the parcellation comparisons in Lawrence et al., 2021, and I adapted their calculation code. The MMP parcellation is symmetric but the Schaefer is not, so each hemisphere was run separately.

Start: Make a vector (length 32,492, the number of vertices in an fsLR hemisphere) with 1s in vertices belonging to a core MD MMP parcel and 0 elsewhere. This vector does not change.
List all Schaefer 400x7 parcels with one or more vertices overlapping a core MD MMP parcel. (This starts with the most complete possible coverage of core MD vertices with Schaefer parcels.)

Iterative Steps:
Step 1: Make a vector (also length 32,492) with 1s in the vertices of all the listed Schaefer parcels and 0 elsewhere. Calculate the Dice coefficient between the two vectors (core MD and Schaefer). 

For each listed Schaefer parcel, make a vector with 1s in the vertices of all BUT ONE of the listed Schaefer parcels and 0 elsewhere. Calculate the Dice coefficient between these two vectors (core MD and Schaefer-1 subset). 

Step 2: Compare the Dice coefficient of each subset to that of the entire list. Form a new list of Schaefer parcels, keeping only those whose removal made the fit worse (i.e., drop the parcel from the list if the Dice coefficient was higher without the parcel).

Repeat Steps 1 and 2 until removing any additional Schaefer parcel makes the fit worse.
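For concreteness, here is a compact R sketch of these steps for one hemisphere. The label vectors mmp.lbl and schaefer.lbl (one integer per fsLR vertex) and core.md.ids (the MMP parcel numbers making up core MD) are assumed to already be loaded; these names are mine, and the full working code is in the .rnw file linked at the end of this post.

# dice coefficient between two binary (0/1) vectors of the same length
dice <- function(a, b) { 2*sum(a == 1 & b == 1) / (sum(a == 1) + sum(b == 1)); }

target <- ifelse(mmp.lbl %in% core.md.ids, 1, 0);        # core MD vertices; this vector never changes

keep <- setdiff(unique(schaefer.lbl[target == 1]), 0);   # start: every Schaefer parcel overlapping core MD

repeat {
  full.dice <- dice(target, ifelse(schaefer.lbl %in% keep, 1, 0));   # Step 1: fit of the whole list
  # fit of each leave-one-out subset
  loo.dice <- sapply(keep, function(p) dice(target, ifelse(schaefer.lbl %in% setdiff(keep, p), 1, 0)));
  if (!any(loo.dice > full.dice)) { break; }    # removing any remaining parcel makes the fit no better: done
  keep <- keep[loo.dice <= full.dice];          # Step 2: drop parcels whose removal improved the fit
}
sort(keep);    # the Schaefer 400x7 parcels approximating core MD in this hemisphere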


Using this procedure, we obtained a set of 32 Schaefer 400x7 parcels (IDs 99, 127, 129, 130, 131, 132, 137, 140, 141, 142, 148, 163, 165, 182, 186, 300, 332, 333, 334, 335, 336, 337, 340, 345, 349, 350, 351, 352, 354, 361, 365, 387) as best approximating the core MD described by Assem et al., 2020 (colors are just to show boundaries):


The approximation seems reasonable, and we plan to use this set of parcels (and the procedure that found them) in future analyses. 

The R (knitr) code to create the above figure, calculate Dice, and find the parcel set is in Schaefer400x7approxAssem2020.rnw, along with the compiled version.