Wednesday, June 2, 2021

DMCC55B supplemental as tutorial: questionnaire data

The previous post introduces the DMCC55B dataset, with pointers to the dataset, documentation, and why you might want to work with it. As mentioned at the end of that post, the description manuscript is accompanied by files intended to be both an introduction to working with DMCC55B data specifically, and more general tutorials for some common analyses. This series of posts will describe the tutorial aspects and methodological details of the DMCC55B supplemental files, starting with the questionnaire data.

Separately from the fMRI sessions, DMCC participants complete up to 28 individual difference questionnaires (described in the "Behavioral Session" section of the manuscript and the DMCC NDA site). Sharing questionnaire data is difficult, since different groups often use different versions of the same questionnaires, and critical details like response ranges and scoring algorithms may not be stated. Several projects are working on standard formats for questionnaire data; as an NIH-funded project, the DMCC is using the format developed by the NIMH Data Archive (NDA). For DMCC55B, the questionnaire data is released under derivatives/questionnaires/, with one (NDA-format) csv file per questionnaire.

The behavioralSession_individualDifferences.rnw file contains (knitr) code to read these NDA-format csv files, then calculate and display both group and single-subject summary statistics for each. This code should be useful for anyone reading NDA-format questionnaire data files, though with the warning that frequent consultation with the NDA Data Dictionary is required. There is unfortunately quite a bit of variability between the structure definitions, even in fundamentals such as the code used for missing values. behavioralSession_individualDifferences.rnw does not calculate all possible summary statistics, only those currently used in the DMCC.
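Because the missing-value code can differ from one structure to the next, it helps to pass it in explicitly rather than hard-coding it. The following is a minimal Python sketch of that idea, not the code in behavioralSession_individualDifferences.rnw (which is knitr/R). The file contents, the extra structure-name row above the column names, the subjectkey column, and the -9 missing code are all assumptions made for illustration; check the Data Dictionary entry for each structure before relying on any of them.

```python
import csv
import io

# Hypothetical file contents: the two-row header and the -9 missing code
# are assumptions for illustration, not taken from the DMCC55B release.
EXAMPLE_CSV = """dospert,01
subjectkey,rt_1,rt_2
sub01,7,3
sub02,-9,5
"""


def read_nda_csv(handle, missing_codes, n_prefix_rows=1):
    """Read an NDA-style csv into a list of dicts, mapping missing codes to None."""
    for _ in range(n_prefix_rows):
        next(handle)  # skip the row(s) above the column names
    rows = []
    for record in csv.DictReader(handle):
        rows.append({key: (None if value in missing_codes else value)
                     for key, value in record.items()})
    return rows


# Values are kept as strings; convert to numbers only after missing codes are removed.
rows = read_nda_csv(io.StringIO(EXAMPLE_CSV), missing_codes={"-9", ""})
print(rows[1]["rt_1"])
```

Making the missing codes an argument means the same reader can be reused across questionnaires whose definitions disagree on this point.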

Here is an example of the summaries in the compiled behavioralSession_individualDifferences.pdf, for the DOSPERT (Domain-Specific Risk-Taking) questionnaire. The NDA names this questionnaire dospert01, so that name is also used for the DMCC55B derivative files. In the case of dospert01, we were able to match our questions with those in the Dictionary, so both individual item responses and summary scores are included. 

Above left is a small bit of the released dospert01_DMCC55B.csv file. The contents are in accord with the NDA Data Dictionary (above right), except for the first few columns: GUIDs, HCP IDs, sex, and other restricted/sensitive information are omitted. We can still use the Dictionary definitions to interpret the other columns; for example, the first participant's answer of 7 to rt_1 corresponds to a response of "Extremely Likely" to the item "Admitting that your tastes are different from those of a friend."
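Looking up what a coded response means can be automated once the Dictionary's value notes are in hand. Here is a small Python sketch, assuming the notes are written as "code = label" pairs separated by semicolons. That layout and the NOTES string are illustrative assumptions; only the pairing of 7 with "Extremely Likely" comes from the example above.

```python
# Illustrative value notes in a "code = label" style; only the
# 7 = "Extremely Likely" pairing comes from the post, the rest is assumed.
NOTES = "1 = Extremely Unlikely; 7 = Extremely Likely"


def parse_value_notes(notes):
    """Turn 'code = label; code = label' text into a {code: label} dict."""
    labels = {}
    for part in notes.split(";"):
        code, sep, label = part.partition("=")
        if sep:  # ignore fragments without an '=' sign
            labels[code.strip()] = label.strip()
    return labels


labels = parse_value_notes(NOTES)
# Codes stay as strings so they match values read straight from a csv file.
print(labels.get("7", "unlabeled"))
```

Codes that are absent from the notes fall back to "unlabeled" instead of raising an error, which makes gaps in a definition easy to spot.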

Scoring the DOSPERT produces five measures, which are reported as group means (above left) and for individual participants (above right); behavioralSession_individualDifferences.pdf presents similar summaries for each scored questionnaire.
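The general pattern of scoring per participant and then summarizing over the group can be sketched in a few lines of Python. This is not the scoring used in behavioralSession_individualDifferences.rnw: the subscale names, the item-to-subscale mapping, the responses, the use of sums, and the rule that any missing item makes a subscale score missing are all hypothetical choices for illustration.

```python
from statistics import mean

# Hypothetical item-to-subscale mapping and responses, for illustration only;
# real scoring rules assign specific items to each measure.
SUBSCALES = {"scale_a": ["rt_1", "rt_2"], "scale_b": ["rt_3", "rt_4"]}
RESPONSES = {
    "sub01": {"rt_1": 7, "rt_2": 3, "rt_3": 2, "rt_4": 4},
    "sub02": {"rt_1": 5, "rt_2": 5, "rt_3": None, "rt_4": 6},
}


def score_participant(answers, subscales):
    """Sum items per subscale; a subscale with any missing item scores None."""
    scores = {}
    for name, items in subscales.items():
        values = [answers.get(item) for item in items]
        scores[name] = None if None in values else sum(values)
    return scores


def group_means(all_scores, subscales):
    """Mean of each subscale over participants with a non-missing score."""
    means = {}
    for name in subscales:
        valid = [s[name] for s in all_scores.values() if s[name] is not None]
        means[name] = mean(valid) if valid else None
    return means


scores = {sub: score_participant(ans, SUBSCALES) for sub, ans in RESPONSES.items()}
print(scores["sub02"])
print(group_means(scores, SUBSCALES))
```

Keeping the single-subject scores separate from the group summary mirrors the two kinds of output described above, and makes it clear which participants were dropped from each group mean.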

Caution: behavioralSession_individualDifferences.rnw has code to parse NDA-format csv files and produce group score summaries. However, its scoring code and the Dictionary definitions should be carefully reviewed before using it with non-DMCC datasets, to make sure the desired calculations are being performed.
