## Tuesday, September 7, 2021

### approximately matching different parcellations

We want to use the the core MD (multiple demand) regions described in Assem et al., 2020 in some analyses but ran into a difficulty: Assem2020's core MD regions are defined in terms of HCP MMP1.0 (Multimodal Parcellation, MMP) parcels and the fsLR (HCP, fsLR32K surface), but for the project we wanted to use Schaefer et al., 2018 400x7 (400 parcels by 7 networks) parcellation and the fsaverage5 surface. Is it possible to approximate the core MD regions with a set of Schaefer 400x7 parcels? How to determine which set produces the best approximation? This post describes our answer, as well as the resulting set of Schaefer 400x7 parcels; its logic and code should be adaptable to other parcellations.

Surface calculations can be complex because vertices do not correspond to a fixed area in the same way that voxels do (e.g., a cubical "searchlight" of  eight voxels will have the same volume no matter which voxel it's centered on, but the surface area covered by the circle of a vertex's neighbors will vary across the brain according to the degree of folding at that vertex). I decided to work with parcellations defined in the same space (here, fsLR), and match at the vertex level. Matching at the vertex level has some implications, including that all vertices are equally important for determining the degree of correspondence between the parcellations; vertices are not weighted by the surface area of their neighborhood. This has the advantage of being independent of factors like surface inflation, but may not be sensible in all cases.

The approximation procedure is iterative, and uses the Dice coefficient to quantify how well two collections of vertices match; a larger Dice coefficient is better. This use was inspired by the parcellation comparisons in Lawrence et al., 2021, and I adapted their calculation code. The MMP parcellation is symmetric but the Schaefer is not, so each hemisphere was run separately.

Start: Make a vector (length 32,492 since fsLR) with 1s in vertices belonging to a core MD MMP parcel and 0 elsewhere. This vector does not change.
List all Schaefer 400x7 parcels with one or more vertex overlapping a core MD MMP parcel. (This starts with the most complete possible coverage of core MD vertices with Schaefer parcels.)

Iterative Steps:
Step 1: Make a vector (also length 32,492) with 1s in the vertices of all the listed Schaefer parcels and 0 elsewhere. Calculate the Dice coefficient between the two vectors (core MD and Schaefer).

For each listed Schaefer parcel, make a vector with 1s in the vertices of all BUT ONE of the listed Schaefer parcels and 0 elsewhere. Calculate the Dice coefficient between these two vectors (core MD and Schaefer-1 subset).

Step 2: Compare the Dice coefficient of each subset to that of the entire list. Form a new list of Schaefer parcels, keeping only those whose removal made the fit worse (i.e., drop the parcel from the list if the Dice coefficient was higher without the parcel).

Repeat Steps 1 and 2 until removing any additional Schaefer parcel makes the fit worse.

Using this procedure, we obtained a set of 32 Schaefer 400x7 parcels (IDs 99, 127, 129, 130, 131, 132, 137, 140, 141, 142, 148, 163, 165, 182, 186, 300, 332, 333, 334, 335, 336, 337, 340, 345, 349, 350, 351, 352, 354, 361, 365, 387) as best approximating the core MD described by Assem, et al. 2020 (colors are just to show boundaries):

The approximation seems reasonable, and we plan to use this set of parcels (and the procedure that found them) in future analyses.

The R (knitr) code to create the above figure, calculate Dice, and find the parcel set is in Schaefer400x7approxAssem2020.rnw, along with the compiled version.