Extracting surface data in VertexWiseR
Charly Billaud, Junhong Yu
2024-12-12
VertexWiseR_surface_extraction.Rmd
As of version 1.2.0, four functions in VertexWiseR do surface data extraction and synthesis: SURFvextract(), FSLRvextract(), CAT12vextract() and HIPvextract().
Surface extraction consists in reading through a preprocessing pipeline’s subjects directory, collating the surface data (for a chosen vertex-wise measures, e.g. thickness), and summarising it into one compact matrix R object, with N rows per subject and M columns per vertex values.
The functions save such objects as a .rds file. The file contains the R surface matrix and, with the default subj_ID setting, an appended list of the corresponding subjects ID. This file can be shared across any device with R and all VertexWiseR statistical analyses functions can be run on these, without the need to access the initially preprocessed data.
Extracting cortical surface data: from FreeSurfer
SURFvextract() extracts cortical surface data from a preprocessed FreeSurfer subjects directory (Fischl 2012).
The function makes use of internal FreeSurfer functions to resample every participant’s individual surface to fsaverage5 or fsaverage6. Therefore, it requires FreeSurfer to be installed and set in the environment where R is run and cannot be automatically run here.
For demonstration, we provide a subsample of 2 participants from the SPRENG dataset (Spreng et al. 2022), after preprocessing their surface data using FreeSurfer’s default recon-all pipeline. Certain files that were not needed (volumes, surfaces, label files) were removed to minimise the subsample’s size.
The subsample (29.6 MB) can be downloaded from the repository as follows (unzip can be changed to untar if errors occur):
# download and unzip the surface data directory
download.file(url = "https://github.com/CogBrainHealthLab/VertexWiseR/blob/main/inst/demo_data/spreng_surf_data_freesurfer.zip?raw=TRUE",
destfile = "spreng_surf_data_freesurfer.zip",
mode='wb')
unzip("spreng_surf_data_freesurfer.zip")
We give the following code as example:
SPRENG_CTv = SURFvextract(sdirpath = 'spreng_surf_data_freesurfer',
filename = "SPRENG_CTv.rds",
template='fsaverage5',
measure = 'thickness',
subj_ID = FALSE)
- The following arguments can be used:
- sdirpath: Path to the preprocessed subjects directory (will be used to define the SUBJECTS_DIR variable automatically)
- filename: Name of the saved .rds output (can include a specific path to it)
- template: The surface template space in which to extract the data, which can be ‘fsaverage5’ (default) or ‘fsaverage6’
- measure: The name of the surface-based measure of interest computed in FreeSurfer. That includes cortical thickness (‘thickness’), surface curvature (‘curv’), depth/height of vertex (‘sulc’), surface area (‘area’), and ‘volume’ (for freesurfer 7.4.1 or later). Default is ‘thickness’.
- subj_ID Whether to obtain a list object containing both subject ID and data matrix instead of just the matrix (TRUE OR FALSE)
An example of surface matrix object, extracted from FreeSurfer preprocessing of the all site 1 participants is available on VertexWiseR online repository:
SPRENG_CTv = readRDS(file = url("https://github.com/CogBrainHealthLab/VertexWiseR/blob/main/inst/demo_data/SPRENG_CTv_site1.rds?raw=TRUE"))
dim(SPRENG_CTv)
## [1] 238 20484
What dim(SPRENG_CTv) shows is that the matrix object contains the surface values of 238 participants, each with 20484 thickness values which correspond to the vertices of fsaverage5, both left-to-right hemispheres.
When the subj_ID argument is set to TRUE, the object returned is not a matrix on its own but a list containing both the matrix and an array listing the subject IDs from the directory. In our example: * SPRENG_CTv[[1]] will be the list of subject IDs * SPRENG_CTv[[2]] will be the matrix object
Extracting cortical surface data: from HCP and fMRIprep
FSLRvextract() extracts cortical data in FSLR32k surface space from Human Connectome Project (HCP) (Van Essen et al. 2013) or fMRIprep (Esteban et al. 2019) preprocessing output directories. FSLRvextract() requires the HCP workbench to be installed, and uses the ciftiTools R package to read the .dscalar.nii files.
For demonstration, we provide a subsample of 2 participants from the SPRENG dataset (Spreng et al. 2022), after preprocessing their surface data using fMRIprep. The latter outputs fslr32k surface data when using the “–cifti-output” option (Esteban et al. 2019). Other anatomical files were removed and only the dscalar.nii and associated json files were preserved, to minimise its size.
The subsample (14.1 MB) can be downloaded from the repository as follows (unzip can be changed to untar if errors occur):
# download and unzip the surface data directory
download.file(url = "https://github.com/CogBrainHealthLab/VertexWiseR/blob/main/inst/demo_data/spreng_surf_data_fmriprep.zip?raw=TRUE",
destfile = "spreng_surf_data_fmriprep.zip",
mode='wb')
unzip("spreng_surf_data_fmriprep.zip")
FSLRvextract() gets the data from .dscalar.nii files associated with the specified measure (e.g. thickness, curv), and can extract it as follows (specify your own connectome workbench path):
dat_fslr32k=FSLRvextract(sdirpath="spreng_surf_data/",
wb_path="path/to/workbench",
filename="dat_fslr32k.rds",
dscaler="_space-fsLR_den-91k_thickness.dscalar.nii",
subj_ID = FALSE,
silent=FALSE)
- The following arguments can be used:
- sdirpath: Path to the preprocessed subjects directory
- wb_path: Path to the HCP workbench directory
- filename: Name of the saved .rds output (can include a specific path to it)
- dscaler: Suffix of the dscaler surface files. Because these files are named differently depending on the preprocessing pipeline, the user needs to specify what they are in the dataset.
- subj_ID Whether to obtain a list object containing both subject ID and data matrix instead of just the matrix (TRUE OR FALSE). Default is TRUE.
- silent: Whether to silence messages from the process (TRUE or FALSE). Default is FALSE.
Accordingly, the dat_fslr32k matrix will contain 2 rows (for 2 participants) and 64,984 columns (the subject’s cortical thickness values in every vertex of the fslr32k surface).
Extracting cortical surface data: from CAT12
CAT12vextract() was implemented in version 1.2.0 and can extract surface preprocessed with the CAT12 (Gaser et al. 2024) surface based morphometry pipeline. The function requires reticulate (Ushey, Allaire, and Tang 2023) to run.
For demonstration, we provide again a subsample of 2 participants from the SPRENG dataset (Spreng et al. 2022), after preprocessing their surface data using CAT12’s SBM pipeline (segmentation and resampling without smoothing), and extracting different possible surface measures. Only the 32k mesh .gii and .dat files were kept.
The subsample (~13 MB) can be downloaded from the repository as follows (unzip can be changed to untar if errors occur):
# download and unzip the surface data directory
download.file(url = "https://github.com/CogBrainHealthLab/VertexWiseR/blob/main/inst/demo_data/spreng_surf_data_cat12.zip?raw=TRUE",
destfile = "spreng_surf_data_cat12.zip",
mode='wb')
unzip("spreng_surf_data_cat12.zip")
CAT12vextract() extracts surface data resampled to 32k meshes (with or without smoothing) and can be used as in this example:
CATsurf=CAT12vextract(sdirpath="SPRENG_CAT12_subsample",
filename='thickness.rds',
measure='thickness',
subj_ID = TRUE,
silent = FALSE)
- The following arguments can be used:
- sdirpath: Path to the preprocessed subjects directory (will be used to define the SUBJECTS_DIR variable automatically). We recommend that the directory follows the BIDS structure for accuracy.
- filename: Name of the saved .rds output (can include a specific path to it)
- measure: The name of the surface-based measure of interest computed in CAT12. That includes ‘thickness’, ‘depth’, ‘fractaldimension’, ‘gyrification’, and ‘toroGI20mm’.
- subj_ID Whether to obtain a list object containing both subject ID and data matrix instead of just the matrix (TRUE OR FALSE)
Accordingly, the matrix will contained 2 rows (for 2 participants) and 64,984 columns (the subject’s cortical thickness values in every vertex of the 32k mesh):
#The surface object contains the surface matrix and the list of subjects ID (and session number if applicable)
names(CATsurf)
## [1] "sub_list" "surf_obj"
#The surface matrix dimensions:
dim(CATsurf$surf_obj)
## [1] 2 64984
Extracting hippocampal surface data: from HippUnfold
HIPvextract() extracts cortical data in CITI168 surface space from the HippUnfold preprocessing pipeline (DeKraker et al. 2023). As opposed to the other two functions, HIPvextract() does not require any system requirement.
For demonstration, we provide a subsample of 2 participants from the Fink dataset (Fink et al. 2021), after preprocessing their surface data using HippUnfold, keeping all output .gii files. The subsample (11.3 MB) can be downloaded from the repository as follows (untar can be changed to unzip if errors occur):
# download and unzip the surface data directory
download.file(url = "https://github.com/CogBrainHealthLab/VertexWiseR/blob/main/inst/demo_data/fink_surf_data_hippunfold.zip?raw=TRUE",
destfile = "fink_surf_data_hippunfold.zip",
mode = "wb")
untar("fink_surf_data_hippunfold.zip")
To extract and collate the data of the two participants, HIPvextract() can be run as follows:
hipp_surf=HIPvextract(sdirpath="fink_surf_data",
filename="hippocampal_surf.rds",
measure="thickness",
subj_ID = TRUE)
- The following arguments can be used:
- sdirpath: Path to the preprocessed subjects directory
- filename: Name of the saved .rds output (can include a specific path to it)
- measure: The name of the surface-based measure of interest computed in HippUnfold. That includes ‘thickness’,‘curvature’,‘gyrification’ and ‘surfarea’. Default is ‘thickness’
- subj_ID: Whether to obtain a list object containing both subject ID and data matrix instead of just the matrix (TRUE OR FALSE). Default is TRUE.
Note that when subjects directories have multiple sessions, the matrix object will contain N rows per participant and per session.
hipp_surf[[1]]
## [1] "sub-season101_ses-1" "sub-season101_ses-2" "sub-season101_ses-3"
## [4] "sub-season102_ses-1" "sub-season102_ses-2" "sub-season102_ses-3"
Here the matrix has 6 rows for 2 particiants with 3 sessions each; and 14,524 columns (the hippocampal thickness values in every vertex of the CITI168 surface).
dim(hipp_surf[[2]])
## [1] 6 14524