Data preprocessing and QC

Procedure

fMRI data was preprocessed in fMRIPrep (see description below for specifics). Used MRIQC (v0.14.2) individual and group reports to inspect image quality for anatomical and functional images. Link to tracking sheet.

More documentation and details on my fmriprep and MRIQC usage in the notebook Tutorial: BIDS, fMRIPrep, MRIQC

Excluded subjects

  • sub-142 [D142] was excluded from analysis due to excessive movement (max FD 11mm. at txA, consistent other movements up to 5-6mm at txA, 3.8mm at txB).
  • sub-147 [D142] was excluded due to FOV issue at txB (a fair amount of brain outside FOV). This wasn’t a problem for the task data that I’m aware of, so it was likely caused by him moving out of the FOV between sequences.

fMRIPrep

Boilerplate from fmriprep reports:

Results included in this manuscript come from preprocessing performed using fMRIPprep 1.1.8 (@fmriprep1; @fmriprep2; RRID:SCR_016216), which is based on Nipype 1.1.3 (@nipype1; @nipype2; RRID:SCR_002502).

Anatomical data preprocessing
A total of 2 T1-weighted (T1w) images were found within the input BIDS dataset. All of them were corrected for intensity non-uniformity (INU) using N4BiasFieldCorrection [@n4, ANTs 2.2.0]. A T1w-reference map was computed after registration of 2 T1w images (after INU-correction) using mri_robust_template [FreeSurfer 6.0.1, @fs_template]. The T1w-reference was then skull-stripped using antsBrainExtraction.sh (ANTs 2.2.0), using OASIS as target template. Brain surfaces were reconstructed using recon-all [FreeSurfer 6.0.1, RRID:SCR_001847, @fs_reconall], and the brain mask estimated previously was refined with a custom variation of the method to reconcile ANTs-derived and FreeSurfer-derived segmentations of the cortical gray-matter of Mindboggle [RRID:SCR_002438, @mindboggle]. Spatial normalization to the ICBM 152 Nonlinear Asymmetrical template version 2009c [@mni, RRID:SCR_008796] was performed through nonlinear registration with antsRegistration [ANTs 2.2.0, RRID:SCR_004757, @ants], using brain-extracted versions of both T1w volume and template. Brain tissue segmentation of cerebrospinal fluid (CSF), white-matter (WM) and gray-matter (GM) was performed on the brain-extracted T1w using fast [FSL 5.0.9, RRID:SCR_002823, @fsl_fast].

Functional data preprocessing
For each of the 2 BOLD runs found per subject (across all tasks and sessions), the following preprocessing was performed. First, a reference volume and its skull-stripped version were generated using a custom methodology of fMRIPrep. A deformation field to correct for susceptibility distortions was estimated based on fMRIPrep’s fieldmap-less approach. The deformation field is that resulting from co-registering the BOLD reference to the same-subject T1w-reference with its intensity inverted [@fieldmapless1; @fieldmapless2]. Registration is performed with antsRegistration (ANTs 2.2.0), and the process regularized by constraining deformation to be nonzero only along the phase-encoding direction, and modulated with an average fieldmap template [@fieldmapless3]. Based on the estimated susceptibility distortion, an unwarped BOLD reference was calculated for a more accurate co-registration with the anatomical reference. The BOLD reference was then co-registered to the T1w reference using bbregister (FreeSurfer) which implements boundary-based registration [@bbr]. Co-registration was configured with nine degrees of freedom to account for distortions remaining in the BOLD reference. Head-motion parameters with respect to the BOLD reference (transformation matrices, and six corresponding rotation and translation parameters) are estimated before any spatiotemporal filtering using mcflirt [FSL 5.0.9, @mcflirt]. BOLD runs were slice-time corrected using 3dTshift from AFNI [@afni, RRID:SCR_005927]. The BOLD time-series (including slice-timing correction when applied) were resampled onto their original, native space by applying a single, composite transform to correct for head-motion and susceptibility distortions. These resampled BOLD time-series will be referred to as preprocessed BOLD in original space, or just preprocessed BOLD. Automatic removal of motion artifacts using independent component analysis [ICA-AROMA, @aroma] was performed on the preprocessed BOLD on MNI space time-series after a spatial smoothing with an isotropic, Gaussian kernel of 6mm FWHM (full-width half-maximum). Corresponding “non-aggresively” denoised runs were produced after such smoothing. Additionally, the “aggressive” noise-regressors were collected and placed in the corresponding confounds file. The BOLD time-series were resampled to MNI152NLin2009cAsym standard space, generating a preprocessed BOLD run in MNI152NLin2009cAsym space. Several confounding time-series were calculated based on the preprocessed BOLD: framewise displacement (FD), DVARS and three region-wise global signals. FD and DVARS are calculated for each functional run, both using their implementations in Nipype [following the definitions by @power_fd_dvars]. The three global signals are extracted within the CSF, the WM, and the whole-brain masks. Additionally, a set of physiological regressors were extracted to allow for component-based noise correction [*CompCor*, @compcor]. Principal components are estimated after high-pass filtering the preprocessed BOLD time-series (using a discrete cosine filter with 128s cut-off) for the two CompCor variants: temporal (tCompCor) and anatomical (aCompCor). Six tCompCor components are then calculated from the top 5% variable voxels within a mask covering the subcortical regions. This subcortical mask is obtained by heavily eroding the brain mask, which ensures it does not include cortical GM regions. For aCompCor, six components are calculated within the intersection of the aforementioned mask and the union of CSF and WM masks calculated in T1w space, after their projection to the native space of each functional run (using the inverse BOLD-to-T1w transformation). The head-motion estimates calculated in the correction step were also placed within the corresponding confounds file. The BOLD time-series, were resampled to surfaces on the following spaces: fsaverage5. All resamplings can be performed with a single interpolation step by composing all the pertinent transformations (i.e. head-motion transform matrices, susceptibility distortion correction when available, and co-registrations to anatomical and template spaces). Gridded (volumetric) resamplings were performed using antsApplyTransforms (ANTs), configured with Lanczos interpolation to minimize the smoothing effects of other kernels [@lanczos]. Non-gridded (surface) resamplings were performed using mri_vol2surf (FreeSurfer).

Many internal operations of fMRIPrep use Nilearn 0.4.2 [@nilearn, RRID:SCR_001362], mostly within the functional processing workflow. For more details of the pipeline, see the section corresponding to workflows in fMRIPrep’s documentation.

MRIQC

Methods described on the readthedocs site. I ran the classifier on the T1w images and inspected the images flagged as likely problems (based on quality, i.e. classifier probability metric > .50):

    118_ses-txA
    122_ses-txB
    130_ses-txA
    130_ses-txB
    137_ses-txB
    142_ses-txA
    142_ses-txB
    148_ses-txA
    148_ses-txB

Apart from sub-142 (who I had already decided to drop), none of the anatomical images were poor enough quality in my eyes to drop given that I am only using them for registration purposes and the Freesurfer bbregister for BOLD-T1w registration seems to be pretty robust.

  • Note that for a few subjects (sub-142, sub-147), I had already used the T1w from one of their sessions as the anatomical image for both sessions because of breathing/coughing/movement artifacts (which means I swapped out the bad one for the okay one when I did the BIDS dataset conversion and DICOM import).

Pre-GIFT steps

Reorganize data

Within /data/derivatives/, make a new directory for the GIFT analyses and outputs:

$ cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/
$ mkdir -p gift/{analyses-spr-2019, data-smoothed, data-denoised}

data-denoised/ is for the non-aggressively denoised images from ICA AROMA. Note: Did not actually end up using the denoised images because the number of components they generated was way too high (>100) and my computer couldn’t handle it. I can fix this in future analyses by setting the number of ICs to estimate rather than asking GIFT to estimate it from the data.

data-smoothed/ is for the non-denoised images, which will be smoothed in a later step.

We will move all of the preprocessed functional images from fmriprep here because when we smooth them later, SPM will dump the smoothed images into the original folder and I don’t want those mixed in with the fmriprep derivatives directories.1

# this command finds files up to 4 subfolders down ending in the specified string/extension,
# then it copies those files to a new location while replicating the origin directory structure

$  cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/fmriprep
$  find . -type f -depth 4 -name "*smoothAROMAnonaggr_preproc.nii.gz" -print0 | cpio -pmd0 /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-denoised/
  
  $  cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/fmriprep
$  find . -type f -depth 4 -name "*task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii" -print0 | cpio -pmd0 /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/

GIFT requires files to be organized within individual subject folders, with different folders within the subject folder for each session if multiple (here, ses-txA and ses-txB). It will not find files in subdirectories within the session folder, so we need to move each .nii file up one level (i.e., to data-smoothed/sub-101/txA instead of data-smoothed/sub-101/txA/func). I just did this by drag & dropping, couldn’t figure out an efficient command line way to do it.

Move the excluded subjects to their own separate folder:

$  cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed # or /data-denoised
$  mv 'sub-142' 'sub-147' /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/excluded-subjects

Finally, if using the denoised images, unzip the .nii.gz files (N/A if using the non-denoised images) so that GIFT can work with them.

$ gunzip -r data-denoised/ *.nii.gz

This recursively looks for files ending in *.nii.gz within data-denoised/, and uses gunzip to extract them.

Smooth images

GIFT suggests that images are smoothed prior to running the gICA (however they also note, _“if your goal is to examine potential artifacts in your data, then you may not want to smooth, since the artifacts can be”hidden" by the smoothing process.“_) The fmriprep does not perform spatial smoothing so this needs to happen if using the non-denoised preprocessed images. The ICA-AROMA non-aggressively denoised images have already been smoothed with an isotropic 6mm FWHM Gaussian kernel.

For the non-denoised images, I chose to use a 4mm FWHM Gaussian kernel because we may be interested in smaller subcortical structures.

SPM smooth

Use the batch interface of SPM to apply only the spatial smoothing module to preprocessed images. Make a new folder batch within /gift/analyses-spr-2019 for all of the batch scripts we’ll be using (both SPM and GIFT).

For smoothing, you need to modify two scripts: spm_smooth_rest.m and spm_smooth_rest_job.m. Both are generated when you use the batch editor in the SPM12 GUI to set up smoothing on one participant, then select “save batch and script” to save the script from the batch editor GUI as a template that you can edit.

The way the SPM batch interface works, the _job file contains inputs and parameters, such as a list of files to apply the function to and other parameters. When you run spm_smooth_rest.m, it calls on spm_smooth_rest_job.m.

First, open Matlab (I’m using 2016b).

Then, get the filepaths for all of the images to be smoothed using the script below in Matlab.2

%% Make a variable called "inputs" 
%  containing paths to all files ending in '*task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii' 
%  (these images are the non-denoised outputs from fmriprep)

% This script recursively gets the names and paths of all the files ending in the specified extension
% then it puts those into a structure with cells "folder" and "name".
% We then get those cells out of the structure and combine the two using 'strcat' to get the complete filepath

cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed;
dirData = dir('**/*task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii');

folder = cell2table(transpose({dirData.folder}));
names = cell2table(transpose({dirData.name}));

names.Properties.VariableNames = {'filename'};
folder.Properties.VariableNames = {'path'};

files = strcat(folder.path,'/',names.filename);

Open the cell array files (should be 76x1 cell) by double-clicking on it, and copy the contents of the first column.

spm_smooth_rest_job.m

Open spm_smooth_rest_job.m and paste the contents of files into the place where SPM wants you to tell it which data to work on (matlabbatch{1}.spm.spatial.smooth.data = {''}).

Also change the FWHM to “[4 4 4]” and the prefix to “4mmSmoothed_”3:

matlabbatch{1}.spm.spatial.smooth.data = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/sub-101/ses-txA/sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii';
    '/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/sub-101/ses-txB/sub-101_ses-txB_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii'}; % all files you want to smooth should be listed in this field, separated by ;
matlabbatch{1}.spm.spatial.smooth.fwhm = [4 4 4];
matlabbatch{1}.spm.spatial.smooth.dtype = 0;
matlabbatch{1}.spm.spatial.smooth.im = 0;
matlabbatch{1}.spm.spatial.smooth.prefix = '4mmSmoothed_';
spm_smooth_rest.m

You shouldn’t need to edit anything here apart from changing nrun = X; to nrun = 1;.

% List of open inputs
nrun = 1;
jobfile = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/spm_smooth_rest_job.m'};
jobs = repmat(jobfile, 1, nrun);
inputs = cell(0, nrun);
for crun = 1:nrun
end
spm('defaults', 'FMRI');
spm_jobman('run', jobs, inputs{:});

Finally, type run('/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/spm_smooth_rest.m') in the Matlab command window, hit enter, and wait for it to complete smoothing for all of the files. This will take a while (about a minute or two per subject on my Mac) so be patient.

AFNI 3dresample

For the non-denoised images, GIFT fails when it tries to create a group mask before running the PCA4 due to the fact that some of the images have slightly different dimensions (ugh), I’m guessing due to slight protocol changes around the time when the scanner was updated. This doesn’t seem to be an issue for the ICA-AROMA denoised images for some reason.

Used AFNI’s 3dinfo and 3dresample tools, as well as the error/warnings log from the BIDS validator (which gave some warnings about some subjects/sessions having different dimensions) to identify affected sessions and to standardize dataset dimensions.s

Note that, apparently, small differences in dimensions (e.g., 1 or 2 voxels) aren’t a problem, as this only seems to affect the subjects/sessions with the following warning:

    Type:       Warning
  File:     sub-119_ses-txA_task-rest_bold.nii
    Location:       data/sub-119/ses-txA/func/sub-119_ses-txA_task-rest_bold.nii
    Reason:      The most common set of dimensions is: 92,80,29,180 (voxels), This file has the dimensions: 88,84,29,180 (voxels). The most common resolution is: 2.61mm x 2.61mm x 4.38mm x 2.00s, This file has the resolution: 2.50mm x 2.50mm x 4.38mm x 2.00s.

How did I figure this out? I tried running GIFT on a bunch of different subsets of the data (e.g., first third of subjects, last third) to try to narrow down which ones were causing it to crash. Then once I had an idea of around where they were in the dataset, I looked in the BIDS validator log, guessed that I could start with the people with the larger differences, moved their subject folders out of derivatives/gift/data-smoothed/, and tried to run GIFT again. It ran, which confirmed my hypothesis.

First, install AFNI: https://afni.nimh.nih.gov/pub/dist/doc/htmldoc/background_install/install_instructs/steps_mac.html Don’t need to install netpbm or R (already have R, don’t need netpbm) but did need to open XQuartz and > “Check for Updates” > install any updates.

To run AFNI from Terminal, type: source ~/.bashrc afni (see discussion of error message & solution here: https://afni.nimh.nih.gov/afni/community/board/read.php?1,158888,158908#msg-158908)

Within /data/derivatives/, make a new directory for the AFNI outputs:

$ cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/
$ mkdir -p afni/{data-3dresample}

# if you need to copy ALL of the subjects' files over (this preserves directory structure):
# $ cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/
# $  find . -type f -depth 4 -name "4mmSmoothed.nii*" -print0 | cpio -pmd0  /Users/sarenseeley/Desktop/restingstate/data/derivatives/afni/data-3dresample/

# to copy to flat folder (only do this if you need to copy ALL of the subjects' files over)
# cp `find . -name "4mm*"` /Users/sarenseeley/Desktop/restingstate/data/derivatives/afni/data-3dresample   

Move the 4mmSmoothed* files from sub-101/ses-txA, sub-117/ses-txA, sub-119/ses-txA, sub-121/ses-txA and sub-122/ses-txB into derivatives/afni/data-3dresample. AFNI should be able to cope with subdirectories (I think?) but I haven’t figured that out yet. So, dump everything into the same data folder.

Note: In AFNI, a “dataset” means the file itself (e.g., 4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii), not a folder containing subdirectories.

3dinfo

Documentation: https://afni.nimh.nih.gov/pub/dist/doc/program_help/3dinfo.html

Running 3dinfo will show information about the “dataset”, including dimensions:

$ 3dinfo 4mmSmoothed_sub-117_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc_3dresampled.nii

Example output:

++ 3dinfo: AFNI version=AFNI_19.0.24 (Mar  8 2019) [64-bit]
Dataset File:    4mmSmoothed_sub-117_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc_3dresampled.nii
Identifier Code: XYZ_qc1srKfckc7Z7t_aWfiPVQ  Creation Date: Sat Mar 16 16:28:37 2019
Template Space:  TLRC
Dataset Type:    Echo Planar (-epan)
Byte Order:      LSB_FIRST {assumed} [this CPU native = LSB_FIRST]
Storage Mode:    NIFTI
Storage Space:   216,270,000 (216 million) bytes
Geometry String: "MATRIX(-2.609,0,0,96,0,-2.609,0,132,0,0,4.375,-78):75,89,45"
Data Axes Tilt:  Plumb
Data Axes Orientation:
  first  (x) = Left-to-Right
  second (y) = Posterior-to-Anterior
  third  (z) = Inferior-to-Superior   [-orient LPI]
R-to-L extent:   -97.066 [R] -to-    96.000 [L] -step-     2.609 mm [ 75 voxels]
A-to-P extent:   -97.592 [A] -to-   132.000 [P] -step-     2.609 mm [ 89 voxels]
I-to-S extent:   -78.000 [I] -to-   114.500 [S] -step-     4.375 mm [ 45 voxels]
Number of time steps = 180  Time step = 1.00000s  Origin = 0.00000s
  -- At sub-brick #0 '?' datum type is float:     -4.60834 to       573.885
  -- At sub-brick #1 '?' datum type is float:     -4.66345 to       662.353
  -- At sub-brick #2 '?' datum type is float:     -4.29606 to       568.889
** For info on all 180 sub-bricks, use '3dinfo -verb' **`
3dresample

Documentation: https://afni.nimh.nih.gov/pub/dist/doc/program_help/3dresample.html

3dresample -master allows you to change the dimensions of a specified “dataset” (aka NIFTI file) to match another “dataset” (aka NIFTI file). So this is what we want to do to fix the issue with some subjects/sessions having slightly different dimensions that causes GIFT to crash.

General usage: 3dresample -master [dataset you want to use as template, ending in .nii for NIFTI files] -prefix [what you want the new dataset to be named, ending in .nii for NIFTI files] -input [dataset to be resampled, ending in .nii for NIFTI files]

Run the below lines:

$ 3dresample -master 4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii -prefix 4mmSmoothed_sub-117_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc_3dresampled.nii  -input 4mmSmoothed_sub-117_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii
$ 3dresample -master 4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii -prefix 4mmSmoothed_sub-119_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc_3dresampled.nii  -input 4mmSmoothed_sub-119_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii
$ 3dresample -master 4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii -prefix 4mmSmoothed_sub-121_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc_3dresampled.nii  -input 4mmSmoothed_sub-121_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii
$ 3dresample -master 4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii -prefix 4mmSmoothed_sub-122_ses-txB_task-rest_bold_space-MNI152NLin2009cAsym_preproc_3dresampled.nii  -input 4mmSmoothed_sub-122_ses-txB_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii

Then move the new files (*_3rdresampled.nii) to their respective folders. Because GIFT is going to look for files starting with *4mmSmooth, delete the original smoothed files so the session folder only has two files: (1) the non-smoothed preprocessed file and (2) the new 3dresampled file. Note: I ended up removing the _3rdresampled suffix once those files were in their folders within derivatives/gift/ because it was making it more complicated to edit some batch scripts.

Now you are ready to run GIFT! (The PCA part, anyway…)

random AFNI bits

AFNI’s GUI is amazingly confusing.

Look for datasets recursively:

$ cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/afni
$ afni -R data-3dresample

Regressors (single-subject)

Don’t technically need the single-subject design matrix for resting state data as GIFT does not use the design matrix when running the PCA/ICA. However, you could use the design matrix to provide temporal regressors which can be used for temporal sorting of components (on the single-subject level) in order to identify components that are particularly correlated with motion parameters or CSF/white matter signal fluctuations.

If you have task data, you DO need a design matrix for temporal sorting in order to identify task-related components.

Create regressors files

Based on https://neurostars.org/t/confounds-from-fmriprep-which-one-would-you-use-for-glm/326/4, I want to set up the design matrix to include the the motion parameters X, Y, Z, RotX, RotY, and RotZ as well as GlobalSignal, CSF and WhiteMatter. These columns are located in the sub-*_ses-tx*_task-rest_bold_confounds.tsv files that fmriprep generates (one per subject/session, within /func).

Note: It is not necessary to use the “NonSteadyStateOutliers” column because I set the GIFT batch script to exclude the first 3 volumes (based on MRIQC group report, this is the max number of dummy volumes for any subject).

Each subject and session needs a .txt or .mat file containing the regressors.

The confounds.tsv files list one row per volume. 180 volumes = 180 rows. So we need to specify for SPM that these are in scans (aka volumes) rather than seconds.

First, we need to use R convert the TSV files to CSV (easier to work with), create new files with only the columns we want to include as multiple regressors for the SPM design matrix, and save those new files as .txt files to /derivatives/gift/analyses-spr-2019/batch/regressors/.

library(tidyverse)
filter <- dplyr::filter
select <- dplyr::select

# get a list of all of the confounds.tsv files
filenames <- list.files(path = "~/Desktop/restingstate/data/derivatives/fmriprep", pattern = "confounds.tsv", full.names = TRUE, recursive = TRUE)
dir <- "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/regressors"

# convert tsv to csv files
lapply(filenames, function(f) {
  df = read_tsv(f)
  write.csv(df, gsub("tsv", "csv", f), row.names=FALSE)
})

# get a list of the confounds.csv files
filenames.csv <- list.files(path = "~/Desktop/restingstate/data/derivatives/fmriprep", pattern = "confounds.csv", full.names = TRUE, recursive = TRUE)
# this means we now have both TSV and CSV versions of the confounds files, so can delete TSVs

# here's what I modified from something I found online...
# first, create an object with the directory name
the_dir <- "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/regressors"

check_create_dir <- function(the_dir) {
  if (!dir.exists(the_dir)) {
  dir.create(the_dir, recursive = TRUE) }
} # a function to check if the directory exists

# check to see if the directory exists (make it if it doesn't)
check_create_dir(the_dir)

# read in files (selected columns only) and write new files
for (file in filenames.csv) {
  # read in the csv
  the_data <- read_csv(file, col_names = TRUE, col_types=cols_only(
  GlobalSignal  = col_number(),
  CSF   = col_number(),
  WhiteMatter   = col_number(),
  X = col_number(),
  Y = col_number(),
  Z = col_number(),
  RotX  = col_number(),
  RotY  = col_number(),
  RotZ  = col_number())) 
  # remove first 3 rows since we are dropping the first 3 TRs and starting with the 4th volume
  the_data <- the_data[-c(1:3), ]
  # write the csv to a new file
  write_csv(the_data, file.path(paste0(the_dir, "/", basename(file))), col_names = FALSE) 
  # col_names must be false, SPM doesn't want headers in the txt files
  # but this means you need to make sure you double-check the order of the regressors if/when labeling them later
}

# rename the files with the suffix "_regressors.txt" vs. "_confounds.csv.txt":
filenames.txt <- list.files(path = "~/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/regressors", pattern = "confounds.csv", full.names = TRUE, recursive = TRUE)
  sapply(filenames.txt,FUN=function(file){
      file.rename(from=file,to=sub(pattern="confounds.csv",replacement="regressors.txt",file))
})

All of the new _regressors.txt files we just generated are located in Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/regressors

SPM design matrices

One design matrix (SPM.mat) per subject, per session. Scripts are located in /data/derivatives/gift/analyses-spr-19/batch/multi_regressor_jobs

spm_1stleveldesign_rest_job_[subj#]tx[A/B]_job.m

This file will look like the following, except here only the first (MNI152NLin2009cAsym_preproc.nii,4) and last (MNI152NLin2009cAsym_preproc.nii,180) file are shown below.

Note: For 4D NIFTI files, SPM requires specifying each 3D file individually using the comma and number following the nii suffix, e.g., *preproc.nii,4 refers to the fourth volume. If you are using the file selector GUI, it only shows you the single un-numbered file by default. In order to select all of the relevant volumes, specify 4:180 in the filter field.

%-----------------------------------------------------------------------
% Job saved on 22-Mar-2019 12:32:49 by cfg_util (rev $Rev: 6460 $)
% spm SPM - SPM12 (6685)
% cfg_basicio BasicIO - Unknown
%-----------------------------------------------------------------------
matlabbatch{1}.spm.stats.fmri_spec.dir = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/sub-101/ses-txA'};
matlabbatch{1}.spm.stats.fmri_spec.timing.units = 'scans';
matlabbatch{1}.spm.stats.fmri_spec.timing.RT = 2;
matlabbatch{1}.spm.stats.fmri_spec.timing.fmri_t = 16;
matlabbatch{1}.spm.stats.fmri_spec.timing.fmri_t0 = 8;
%%
matlabbatch{1}.spm.stats.fmri_spec.sess.scans = {
                                                 '/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/sub-101/ses-txA/4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii,4'
                                                 '/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/sub-101/ses-txA/4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii,180'
                                                 };
%%
matlabbatch{1}.spm.stats.fmri_spec.sess.cond = struct('name', {}, 'onset', {}, 'duration', {}, 'tmod', {}, 'pmod', {}, 'orth', {});
matlabbatch{1}.spm.stats.fmri_spec.sess.multi = {''};
matlabbatch{1}.spm.stats.fmri_spec.sess.regress = struct('name', {}, 'val', {});
matlabbatch{1}.spm.stats.fmri_spec.sess.multi_reg = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/regressors/sub-101_ses-txA_task-rest_bold_regressors.txt'};
matlabbatch{1}.spm.stats.fmri_spec.sess.hpf = 128;
matlabbatch{1}.spm.stats.fmri_spec.fact = struct('name', {}, 'levels', {});
matlabbatch{1}.spm.stats.fmri_spec.bases.hrf.derivs = [1 1];
matlabbatch{1}.spm.stats.fmri_spec.volt = 1;
matlabbatch{1}.spm.stats.fmri_spec.global = 'None';
matlabbatch{1}.spm.stats.fmri_spec.mthresh = 0.8;
matlabbatch{1}.spm.stats.fmri_spec.mask = {''};
matlabbatch{1}.spm.stats.fmri_spec.cvi = 'AR(1)';
spm_1stleveldesign_rest_job_all_job.m

I combined all of the individual spm_1stleveldesign_rest_job_[subj#]tx[A/B]_job.m scripts into one single SPM job in order to avoid running individual scripts if I don’t need to.

spm_1stleveldesign_rest_job_all.m
% List of open inputs
nrun = 1; % enter the number of runs here
jobfile = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/multi_regressor_jobs/spm_1stleveldesign_rest_job_all_job.m'};
jobs = repmat(jobfile, 1, nrun);
inputs = cell(0, nrun);
for crun = 1:nrun
end
spm('defaults', 'FMRI');
spm_jobman('run', jobs, inputs{:});

Usage: run('/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/multi_regressor_jobs/spm_1stleveldesign_rest_job_all.m')

This generates individual design matrix SPM.mat files and puts them into their appropriate subject/session folders.

Regressors (group-level)

A design matrix with group-level regressors/covariates is used for post-ICA analyses in SPM. Note: this design matrix is created later, after the gICA is completed.

Create group-level file

Data (one value per subject/session for each variable) come from two places: the group BOLD report generated by MRIQC, and the master dataset. For the SPR poster analysis, I created a new dataset containing mean framewise displacement and two subsets of other variables from the master dataset - one containing the measure total scores, and the other containing demographic and health-related variables.

# read in master dataset
data <- readRDS("~/Dropbox/GLASS Lab/OT Study/data/master-dataset/ot-fmri_master-dataset_020719.rds")

# read in MRQC group stats to get mean FD 
mriqc <- read_tsv("~/Desktop/restingstate/data/derivatives/mriqc/group_bold.tsv", col_names = TRUE, col_types=cols_only(
  bids_name = col_character(),
  fd_mean   = col_number()))

# split up mriqc by session
mriqc_A <- dplyr::filter(mriqc, grepl("ses-txA",bids_name))
mriqc_B <- dplyr::filter(mriqc, grepl("ses-txB",bids_name))

# then rename for merging with master dataset variables
mriqc_A <- rename(mriqc_A, ID = bids_name, 
       fd_mean_txA = fd_mean)

mriqc_B <- rename(mriqc_B, ID = bids_name, 
       fd_mean_txB = fd_mean)

mriqc_A <- mutate_if(tibble::as_tibble(mriqc_A), 
                                is.character, 
                                stringr::str_replace_all, pattern = "sub-", replacement = "D")
mriqc_A <- mutate_if(tibble::as_tibble(mriqc_A), 
                                is.character, 
                                stringr::str_replace_all, pattern = "_ses-txA_task-rest_bold", replacement = "")

mriqc_B <- mutate_if(tibble::as_tibble(mriqc_B), 
                                is.character, 
                                stringr::str_replace_all, pattern = "sub-", replacement = "D")
mriqc_B <- mutate_if(tibble::as_tibble(mriqc_B), 
                                is.character, 
                                stringr::str_replace_all, pattern = "_ses-txB_task-rest_bold", replacement = "")

mean_fd <- left_join(mriqc_A, mriqc_B, by="ID")

# add the fd_mean variables to a subset of variables from the master dataset

# make a subset of measure total scores
tots <- data[, grep('^tot|ID', names(data))]

# get a list of variable names to pick out the ones I want
names <- names(data)
cat(names, sep="\n")

# make a subset of other variables in master dataset
group <- subset(data, select=c(ID,
                               sex_m,
                               age,
                               age_yrs,
                               knowbefore_y,
                               caretaker_y,
                               education,
                               highestdegree,
                               education_other,
                               employment,
                               householdsize,
                               householdsize_adult,
                               householdsize_income,
                               ethnicity_hisp,
                               race,
                               race_other,
                               learnaboutstudy,
                               meds_y,
                               meds_rxonly,
                               meds_psychoactive,
                               meds_hrt,
                               meds_hormone_opiate,
                               meds_total,
                               majorhealthprobs_y,
                               majorhealthprobs_what,
                               postmenopausal_y,
                               n_pregnancies,
                               n_livebirths,
                               n_nursed,
                               alcohol,
                               smoking_y,
                               smoking_howlong,
                               smoking_perday,
                               caffeine_perday,
                               exercise,
                               timesincedeath,
                               yrs_together,
                               group,
                               group_20mFU,
                               timetorest_A,
                               timetorest_B,
                               timeto20mFU,
                               timeto20mFU_mos,
                               post_sroe_1_A,
                               post_sroe_2_A,
                               post_sroe_3_A,
                               post_sroe_1_B,
                               post_sroe_2_B,
                               post_sroe_3_B,
                               rs2254298,
                               rs2268498,
                               rs53576,
                               rs2254298_A.carrier,
                               rs2268498_C.carrier,
                               rs53576_A.carrier))

# merge all the group data
rest_data <- left_join(group, tots, by="ID") %>% left_join(., mean_fd)

# replace "D" prefix in ID to "sub-" to be BIDS-compliant
rest_data <- rest_data %>% 
  mutate(ID = str_replace(ID, "D", "sub-"))

# save as a .rds file and a csv file
write_csv(rest_data, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/group_covariates.csv")
saveRDS(rest_data, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/group_covariates.rds")

# and save a version without sub-142, sub-147 (excluded from GIFT analyses)
# this is what I will use to set up the model specification in SPM
rest_data_no_subs142147 <- filter(rest_data, !grepl("sub-142|sub-147",ID)) 
write_csv(rest_data, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/group_covariates_no-142-147.csv")

I was also saving some of these individually for use with MANCOVAN…

# save the variables of interest individually for use in MANCOVAN toolbox 
age <- as_tibble(rest_data_no_subs142147$age_yrs)
write_csv(age, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/age.txt", col_names = FALSE)

sex <- as_tibble(rest_data_no_subs142147$sex_m)
write_csv(sex, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/sex.txt", col_names = FALSE)

group <- as_tibble(rest_data_no_subs142147$group)
write_csv(group, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/group.txt", col_names = FALSE)

bdi <- as_tibble(rest_data_no_subs142147$tot_bdi)
write_csv(bdi, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/bdi.txt", col_names = FALSE)

ysl <- as_tibble(rest_data_no_subs142147$tot_ysl)
write_csv(ysl, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/ysl.txt", col_names = FALSE)

icg <- as_tibble(rest_data_no_subs142147$tot_icg)
write_csv(icg, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/icg.txt", col_names = FALSE)

meds_totaln <- as_tibble(rest_data_no_subs142147$meds_total)
write_csv(meds_totaln, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/meds_totaln.txt", col_names = FALSE)

timesincedeath <- as_tibble(rest_data_no_subs142147$timesincedeath)
write_csv(timesincedeath, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/timesincedeath.txt", col_names = FALSE)

timetorest_A <- as_tibble(rest_data_no_subs142147$timetorest_A)
write_csv(timetorest_A, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/timetorest_txA.txt", col_names = FALSE)

timetorest_B <- as_tibble(rest_data_no_subs142147$timetorest_B)
write_csv(timetorest_B, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/timetorest_txB.txt", col_names = FALSE)

meanFD_txA <- as_tibble(rest_data_no_subs142147$fd_mean_txA)
write_csv(meanFD_txA, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/meanFD_txA.txt", col_names = FALSE)

meanFD_txB <- as_tibble(rest_data_no_subs142147$fd_mean_txB)
write_csv(meanFD_txB, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/meanFD_txB.txt", col_names = FALSE)

From these group data, can also do things like comparing how much people moved on average by group:

Mean FD is not significantly different between treatments or groups, and is not correlated with total ICG score, BDI score, or pre-scan STAI (state anxiety). It is also not significantly associated with age.

IMPORTANT

Will be noted again below, but make sure that when you’re importing and using the group covariates file to set up a design matrix, the subject order matches the order in which the files are listed in the SPM design, as well as matching any renumbering of subjects that GIFT has done (e.g., what GIFT says is subject 38 is the 38th file – NOT sub-038).

For each design matrix, you will likely need a different group covariates file with the rows ordered the way that the files are ordered in SPM (e.g., if you are entering subjects for whom group=0 as your first group, and for whom group=1 as your second group, all of the files for subjects in the first group will be listed first in the design matrix).

(Don’t need to do anything about this now, but just be really aware of it when you get to the point of running group SPM stats with these covariates later).

GIFT (Group ICA of fMRI Toolbox)

“It is a MATLAB toolbox which implements multiple algorithms for independent component analysis and blind source separation of group (and single subject) functional magnetic resonance imaging data.” Download and documentation available here: http://mialab.mrn.org/software/gift/index.html

For the poster analyses, I looked at treatment A alone.

Note: txA is referred to as “session 1” in GIFT (it labels sessions and subjects based on alphanumerical order of file/folder names).

Note:If you want to compare two sessions (or two groups), you should run them in a single ICA so that the component numbers match.

Data reduction

Run PCA & ICA #1

Use batch file Input_data_subjects_restingstate_txA.m to input data and initiate the analysis. Parameters used for the PCA and ICA data reduction steps are described in /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__ica_parameter_info.mat.

Wait [hours] for it to finish.

Open GIFT by entering gift in the Matlab command line. Click the “Display GUI” button to look at the identified ICs in different ways (group, single-subject, means, timecourses, only specific ICs, etc.) You can edit display defaults, such as Z threshold to use for display, using the menu in the upper left corner of the window.

Write a summary of results to HTML by clicking the “Results Summary” button. View the HTML report, which shows ICASSO stability estimate plots as well as component spatial maps, timecourses, and spectra.

Identify noise ICs

Based on spatial distribution, power spectra (including fALFF) and ICASSO stability estimates.

Allen et al., 2011:

2.5.1 RSN selection We identified a subset of C1 components considered to be RSNs (as opposed to physiological artifacts) by inspecting the aggregate SMs and average power spectra (see below; Figure 1, step 3).Four viewers rated the components from 0 (definite artifact) to 1 (certain RSN) based on expectations that RSNs should exhibit peak activations in gray matter, low spatial overlap with known vascular, ventricular, motion, and susceptibility artifacts, and TCs dominated by low frequency fluctuations (Cordes et al., 2000). To facilitate evaluation, spectra were characterized with two metrics used previously to classify components (Robinson et al., 2009): dynamic range, the difference between the peak power and minimum power at frequencies to the right of the peak, and low frequency to high frequency power ratio, the ratio of the integral of spectral power below 0.10 Hz to the integral of power between 0.15 and 0.25 Hz (Figure 3).

FYI:

ALFF is defined as the total power within the frequency range between 0.01 and 0.1 Hz, and thus indexes the strength or intensity of LFO. f/ALFF is defined as the power within the low-frequency range (0.01-0.1 Hz) divided by the total power in the entire detectable frequency range, and represents the relative contribution of specific LFO to the whole frequency range (Zuo et al., 2010).

You can also use the “Component Labeller” tool in GIFT to see how closely your ICs resemble established functional RSNs using the Stanford templates. From the GIFT manual:

3.12.8 Component Labeler Option is provided to label components given the templates of interest. We provide templates in icatb/icatb_templates/RSN.zip file and description about the templates in icatb/icatb_templates/RSN.txt. Each component is correlated with the given templates and best template is selected based on the maximum correlation value.

Good ICs should have higher correlation values. Note that the labeller tries to correlate all of the voxels across the brain so the correlation values will likely be moderate, not very high.

Remove noise ICs

I removed the following ICs:

  • Eyeballs: 23, 38, 40, 41, 44
  • Motion: 3, 5, 10, 35
  • CSF: 8, 26, 34 (my notes are unclear here - may have removed 2 and 22 as well)
  • Other: 1, 4, 12, 17, 21, 27

If I wasn’t sure, I erred on the side of caution and left it in. Hence leaving in some that I couldn’t tell if they were physiological artifact or legit subcortical stuff.

After selecting the components to remove, GIFT will rewrite the data. At this point, GIFT creates generic subject and session numbers based on how many of each there are and the order in which it encounters the files (so for these data, subjects are labelled Sub_001 through Sub_038, and the session folder that was txA is now called 1). I created a file that links the original subject IDs and GIFT rewritten data IDs (key_link-giftIDs-studyIDs.csv) located within /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_data. This will be important later.

New data were written to /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_data.

Run ICA #2

GIFT was run a second time on the noise-components-removed data. Results and outputs of this analysis are located within /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_results.

Parameters used for the PCA and ICA data reduction steps are described in /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_results/gift_txA_noisecompsremoved__ica_parameter_info.mat. This time, I asked it to use 30 as the number of ICs vs. estimating from the data.

Pick ICs

Based on the same criteria as used to determine noise components (stability, separation, and other ICASSO metrics, spatial distribution, power, fALFF), I selected six ICs to focus on.

I did notice that the fALFF scores overall were lower than those described in the Allen et al. (2011) paper, and decreased from those seen in the first run of the ICA. All of the below components had good stability and separation.

  • IC1: Labelled as precuneus network, majority seems to be in cuneus, r = .31 with the RNS template precuneus network, fALFF = 3.4.
  • IC5: Labelled as DMN/mPFC/PCC, r = .26, fALFF = 2.89.
  • IC9: Labelled as R CEN/dlPFC, r = .27, fALFF = 2.17.
  • IC12: Labelled as DMN/MTL, r = .39, fALFF = 2.81.
  • IC20: Labelled as DMN/mPFC/PCC, r = .40, fALFF = 1.25.

I also looked at IC2, which was labelled as SN/aI/dACC but wasn’t too convinced it was that so dropped it (potential MR artifact?). Neurosynth decoder says the t map is most similar to primary motor cortex and a lot of somatosensory and movement-related terms. For future analyses, since there was no SN IC that was identified, may want to do a semi-blind ICA using the SN template for spatial constraint.

There were a number of other components that seemed decent and correlated with sensorimotor, language, auditory, visual, and visuospatial networks.

The fact that GIFT identified multiple ICs related to a single given network (e.g., DMN, visual) suggests that I probably could have gone down to 20 so that there wasn’t as much splitting of networks. OTOH, it’s interesting to see how the different DMN components relate to different mental functions (per comparison to Neurosynth meta-analytic maps) given that there is evidence for parcellation of these large-scale networks.

SPM stats

You can do this either via the “SPM Stats” button in GIFT or just fire up SPM. The first step is to run a one-sample T test which tests the effect against null. This creates a thresholded mask that can be used to mask subsequent analyses to constrain tests to just those voxels where a particular component showed supra-threshold signal. Note: you may or may not want to use the mask in later analyses. It depends: if you want to look just for intranetwork effects, then use it. If you are interested in areas outside the IC/network, then don’t.

The next step was a two-sample t test on the individual images for each component with mean framewise displacement, age, sex, time since death, psychoactive medications (coded 0/1), and BDI score included as covariates. The t test compared CG and NCG. Note that voxelwise tests are testing for group differences in the spatial maps.

Results

At my exploratory p < .001 level there were a number of tiny clusters that showed up in one or both contrasts. Not too interesting. However, the CG > NCG contrast for IC1 did reveal a small cluster in the left dorsal precentral gyrus (-23 -20 75, k = 17) that was significantly more correlated with the IC1 timecourse (I think? not 100% clear on interpretation) in the CG group than the NCG group, T = 6.83, Z = 5.26, voxelwise p uncorrected = < .001, pFDR = .028, pFWE = .004 (.027 for clusterwise).

CG > NCG: precentral gyrus

The precentral gyrus is involved in movement and motor imagery, but may also play a role in person-specific patterns of brain activity (Thornton & Mitchell, 2017; coordinates not listed) and reward outcome in addiction (fMRI meta analysis: Luijten et al., 2017 JAMA; coordinates extremely close to ours for the addicted > control contrast, see their table e5).

marsbar (ROI analysis)

To look further at the precentral gyrus, I used marsbar to create a functional ROI (a mask of the cluster) and extract parameter estimates per subject. Then read these data into R to conduct subsequent analyses.

install.packages("R.matlab")
library(R.matlab)
library(tidyverse)
library(ggpubr)
library(psych)

# read mat file with beta weights from CG > NCG contrast in the precentral gyrus cluster from IC1
roidata <- readMat("/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_results/gift_txA_noisecompsremoved__marsbar_roi_results/cg_ncg_-23_-20_75_comp01_betas.mat", 1)

betas <- as.tibble(roidata[[1]][[2]][[1]]) %>% rename(roi_betas_CGvNCG_comp01_precentralgyrus = V1)
beta_sub <- as.tibble(roidata[[1]][[1]][[1]]) %>% rename(fname = V1)

betas <-bind_cols(beta_sub, betas) %>% mutate(fname = str_replace(fname, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_results/gift_txA_noisecompsremoved__scaling_components_files/sub-", "file")) %>% mutate(fname = str_replace(fname, "_component_ica_ses-1_.nii", ""))

key <- read_csv("/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_results/key_link-giftIDs-studyIDs.csv")

betas <- left_join(betas, key, by="fname")

# now ready to merge betas with data
data <- readRDS("~/Dropbox/GLASS Lab/OT Study/data/master-dataset/ot-fmri_master-dataset_020719.rds")

data_spr19 <- left_join(data, betas, by="ID")

Correlations:

# first used YSL item 10 as the sort of prototypic index of yearning..
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_10) # "Feeling of wanting back is so strong it is indescribable..."

# then looked at YSL total
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ysl) # not significantly correlated

At MFO’s suggestion, looked separately at affective and cognitive aspects of yearning:

# subset YSL into cognitive and affective subscales and calculate totals
ysl_cog <- subset(data_spr19, select=c("ysl_2","ysl_4", "ysl_8", "ysl_11", "ysl_14"))
data_spr19$tot_ysl_cog <- rowSums(ysl_cog, na.rm=TRUE) 
describe(data_spr19$tot_ysl_cog)

ysl_aff <- subset(data_spr19, select=c("ysl_6","ysl_7", "ysl_10", "ysl_16", "ysl_21"))
data_spr19$tot_ysl_aff <- rowSums(ysl_aff, na.rm=TRUE) 
describe(data_spr19$tot_ysl_aff)

# is the relationship with precentral gyrus specific to affective items, or not?
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ysl_cog) # r = .31, p = .06
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ysl_aff) # r = .40, p = .01

Are these correlations significantly different from each other?

For the below results:
  • Null hypothesis is that the true difference in overlapping correlations IS equal to 0.
  • Alternative hypothesis is that the true difference in correlations IS NOT equal to 0.
  • Conclusions based on Steiger’s (1980) z and Zou’s (2007) confidence interval.

    library(cocor)
    # correlation of postcentral gyrus betas (j) and YSL-cog (k):
    jk <- corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ysl_cog) # r = .31, p = .058
    # correlation of postcentral gyrus betas (j) and YSL-aff (h):
    jh <- corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ysl_aff) # r = .40, p = .012
    # correlation of YSL-cog (k) and YSL-aff (h), dropping D142 & D147:
    data_spr19 <- filter(data_spr19, !grepl("D142|D147",ID))
                                        
    kh <- corr.test(data_spr19$tot_ysl_cog, data_spr19$tot_ysl_aff) # r = .89, p < .001
    cocor.dep.groups.overlap(r.jk=+(jk[[1]]), r.jh=+(jh[[1]]), r.kh=+(kh[[1]]), n=38, alternative="two.sided", alpha=0.05, conf.level=0.95, null.value=0)
    
      Results of a comparison of two overlapping correlations based on dependent groups
    
    Comparison between r.jk = 0.31 and r.jh = 0.4023
    Difference: r.jk - r.jh = -0.0923
    Related correlation: r.kh = 0.8939
    Group size: n = 38
    Null hypothesis: r.jk is equal to r.jh
    Alternative hypothesis: r.jk is not equal to r.jh (two-sided)
    Alpha: 0.05
    
    pearson1898: Pearson and Filon's z (1898)
      z = -1.3201, p-value = 0.1868
      Null hypothesis retained
    
    hotelling1940: Hotelling's t (1940)
      t = -1.3044, df = 35, p-value = 0.2006
      Null hypothesis retained
    
    williams1959: Williams' t (1959)
      t = -1.3042, df = 35, p-value = 0.2007
      Null hypothesis retained
    
    olkin1967: Olkin's z (1967)
      z = -1.3201, p-value = 0.1868
      Null hypothesis retained
    
    dunn1969: Dunn and Clark's z (1969)
      z = -1.2796, p-value = 0.2007
      Null hypothesis retained
    
    hendrickson1970: Hendrickson, Stanley, and Hills' (1970) modification of Williams' t (1959)
      t = -1.3044, df = 35, p-value = 0.2006
      Null hypothesis retained
    
    steiger1980: Steiger's (1980) modification of Dunn and Clark's z (1969) using average correlations
      z = -1.2752, p-value = 0.2022
      Null hypothesis retained
    
    meng1992: Meng, Rosenthal, and Rubin's z (1992)
      z = -1.2738, p-value = 0.2028
      Null hypothesis retained
      95% confidence interval for r.jk - r.jh: -0.2687 0.0570
      Null hypothesis retained (Interval includes 0)
    
    hittner2003: Hittner, May, and Silver's (2003) modification of Dunn and Clark's z (1969) using a backtransformed average Fisher's (1921) Z procedure
      z = -1.2748, p-value = 0.2024
      Null hypothesis retained
    
    zou2007: Zou's (2007) confidence interval
      95% confidence interval for r.jk - r.jh: -0.2510 0.0538
      Null hypothesis retained (Interval includes 0)

    Null hypothesis retained: the correlations of YSL-Aff and YSL-Cog with the precentral gyrus parameter estimates are NOT significantly different from each other.

    However, I realized that correlating the YSL and beta weights is a circular analysis: (1) ICG and YSL totals are highly correlated at .84 (p < .001), (2) the ICG was used to determine CG vs. NCG group, so (3) the voxels that emerged in the t test for group are biased towards being related to YSL totals. To account for this, need to include tot_icg in the linear model.

    I also included BISBAS Reward Responsiveness in the model to try to control for overall trait approach bias. Selected BISBAS-RR because it was the index of approach bias that differed significantly between NCG and CG groups, whereas Drive and Fun-Seeking did not (BIS subscale also differed between groups, with CG being higher in BIS than NCG).

    t.test(data_spr19$tot_bisbas_basrr ~ data_spr19$group) # CG < NCG, p = .027
    
        Welch Two Sample t-test
    
    data:  data_spr19$tot_bisbas_basrr by data_spr19$group
    t = 2.3895, df = 18.611, p-value = 0.02763
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     0.2307184 3.5258034
    sample estimates:
    mean in group NCG  mean in group CG 
             16.47826          14.60000 
    t.test(data_spr19$tot_bisbas_basdr ~ data_spr19$group) # no sig diff
    
        Welch Two Sample t-test
    
    data:  data_spr19$tot_bisbas_basdr by data_spr19$group
    t = 0.9539, df = 26.927, p-value = 0.3486
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -0.9410345  2.5758171
    sample estimates:
    mean in group NCG  mean in group CG 
             9.217391          8.400000 
    t.test(data_spr19$tot_bisbas_basfun ~ data_spr19$group) # no sig diff
    
        Welch Two Sample t-test
    
    data:  data_spr19$tot_bisbas_basfun by data_spr19$group
    t = 0.99494, df = 23.163, p-value = 0.33
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -0.8595628  2.4537657
    sample estimates:
    mean in group NCG  mean in group CG 
            10.130435          9.333333 
    t.test(data_spr19$tot_bisbas_bis ~ data_spr19$group) # CG > NCG, p = .044
    
        Welch Two Sample t-test
    
    data:  data_spr19$tot_bisbas_bis by data_spr19$group
    t = -2.1799, df = 21.698, p-value = 0.0404
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -5.9977837 -0.1471438
    sample estimates:
    mean in group NCG  mean in group CG 
             18.26087          21.33333 
    t.test(data_spr19$gAAT_bias_death_A ~ data_spr19$group) # no sig diff
    
        Welch Two Sample t-test
    
    data:  data_spr19$gAAT_bias_death_A by data_spr19$group
    t = 1.5305, df = 32.228, p-value = 0.1356
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -12.55152  88.50607
    sample estimates:
    mean in group NCG  mean in group CG 
             30.47727          -7.50000 
    t.test(data_spr19$gAAT_bias_neutral_A ~ data_spr19$group) # no sig diff
    
        Welch Two Sample t-test
    
    data:  data_spr19$gAAT_bias_neutral_A by data_spr19$group
    t = 1.2019, df = 33.912, p-value = 0.2377
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -19.89916  77.49007
    sample estimates:
    mean in group NCG  mean in group CG 
             51.79545          23.00000 
    t.test(data_spr19$gAAT_bias_spouse_A ~ data_spr19$group) # no sig diff
    
        Welch Two Sample t-test
    
    data:  data_spr19$gAAT_bias_spouse_A by data_spr19$group
    t = 1.0843, df = 33.073, p-value = 0.2861
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -21.91864  71.95198
    sample estimates:
    mean in group NCG  mean in group CG 
             52.25000          27.23333 
    t.test(data_spr19$gAAT_bias_stranger_A ~ data_spr19$group) # no sig diff
    
        Welch Two Sample t-test
    
    data:  data_spr19$gAAT_bias_stranger_A by data_spr19$group
    t = 0.41214, df = 34.156, p-value = 0.6828
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -43.74911  66.01275
    sample estimates:
    mean in group NCG  mean in group CG 
             29.43182          18.30000 
    t.test(data_spr19$gAAT_bias_whoto_A ~ data_spr19$group) # no sig diff
    
        Welch Two Sample t-test
    
    data:  data_spr19$gAAT_bias_whoto_A by data_spr19$group
    t = 0.57733, df = 30.814, p-value = 0.5679
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -40.64803  72.73591
    sample estimates:
    mean in group NCG  mean in group CG 
             42.47727          26.43333 

    Interestingly, the NCG group is actually higher in Reward Responsiveness than CG on average.

    plot(data_spr19$tot_bisbas_basrr ~ data_spr19$group)

    Regression models

    summary(c)
    
    
    Call:
    lm(formula = roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + 
        tot_bisbas_basrr + tot_ysl_aff, data = data_spr19)
    
    Residuals:
         Min       1Q   Median       3Q      Max 
    -16.9493  -1.5333   0.0369   1.7970   8.2247 
    
    Coefficients:
                     Estimate Std. Error t value Pr(>|t|)  
    (Intercept)       1.58123    5.93663   0.266   0.7916  
    tot_icg          -0.07472    0.09574  -0.780   0.4405  
    tot_bisbas_basrr -0.50907    0.33163  -1.535   0.1340  
    tot_ysl_aff       0.47217    0.22359   2.112   0.0421 *
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    
    Residual standard error: 4.398 on 34 degrees of freedom
    Multiple R-squared:  0.2212,    Adjusted R-squared:  0.1524 
    F-statistic: 3.218 on 3 and 34 DF,  p-value: 0.0348

    Model c, which includes YSL-Affective (but not YSL-Cognitive or YSL total) scores, is predictive of precentral gyrus activation controlling for overall grief severity and reward responsiveness.

    Regression plot

    summary(c)
    
    
    Call:
    lm(formula = roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + 
        tot_bisbas_basrr + tot_ysl_aff, data = data_spr19)
    
    Residuals:
         Min       1Q   Median       3Q      Max 
    -16.9493  -1.5333   0.0369   1.7970   8.2247 
    
    Coefficients:
                     Estimate Std. Error t value Pr(>|t|)  
    (Intercept)       1.58123    5.93663   0.266   0.7916  
    tot_icg          -0.07472    0.09574  -0.780   0.4405  
    tot_bisbas_basrr -0.50907    0.33163  -1.535   0.1340  
    tot_ysl_aff       0.47217    0.22359   2.112   0.0421 *
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    
    Residual standard error: 4.398 on 34 degrees of freedom
    Multiple R-squared:  0.2212,    Adjusted R-squared:  0.1524 
    F-statistic: 3.218 on 3 and 34 DF,  p-value: 0.0348
    ggplotRegression(c)

    Wordcloud

    Follows the approach from the Bethlehem et al. (2017) oxytocin ICA paper: > “To better characterize the components showing an oxytocin-related effect on connectivity we used the decoder function in NeuroSynth34 to compare the whole-brain component maps with large-scale automated meta-analysis maps within NeuroSynth. The top 100 terms (excluding terms for brain regions) ranked by the correlation strength between the component map and the meta-analytic map were visualized as a word cloud using the wordcloud library in R, with the size of the font scaled by correlation strength.”

    To do this, I first uploaded the component spatial maps to Neurosynth and used the “decoder” function to view the terms associated with the correlations between the component map and meta-analytic map. I copied these terms and their correlation values to CSV files and removed all terms related to brain regions (such as “dlpfc”, “gyrus”, or “anterior posterior”).

    Then

    library(wordcloud)
    library(RColorBrewer)
    library(viridis) # use the "viridis" package for color palette (viridis palettes are colorblind-safe, many ggplot2 and RColorBrewer palettes are not)
    
    # demo the package using code from RColorBrewer
    n = 8 # change this to however many bins you want to see, a high number like 200 will show more of a spectrum
    image(
      1:n, 1, as.matrix(1:n),
      col = viridis(n, option = "D"),
      xlab = "viridis n", ylab = "", xaxt = "n", yaxt = "n", bty = "n"
    )
  • Import data from MATLAB marsbar ROI output

    plot(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_ysl)

    library(psych) corr.test(rest_data_no_subs142147\(fd_mean_txA, rest_data_no_subs142147\)tot_icg)

    corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_1) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_2) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_3) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_4) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_5) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_6) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_7) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_8) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_9) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_10) # “Feeling of wanting back is so strong it is indescribable…” corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_11) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_12) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_13) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_14) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_15) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_16) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_17) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_18) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_19) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_20) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)ysl_21)

    corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_gcq_approp) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_gcq_blame) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_gcq_cherish) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_gcq_future) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_gcq_life) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_gcq_others) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_gcq_self) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)gcq_24) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)gcq_30) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_gcq_threat) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_gcq_world)

    corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_ecrrs_global_anx) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_ecrrs_global_avoid) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_ecrrs_spouse_anx) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_ecrrs_spouse_avoid)

    corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_ucla_loneliness) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_bisbas_basdr) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_bisbas_basfun) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_bisbas_basrr) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_bisbas_bis)

    corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_ppss_fa) corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)tot_ppss_fr)

    corr.test(data_spr19\(roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19\)gAAT_bias_whoto_A)

    data_spr19\(gAAT_bias_spouse_A <- data_spr19\)gAAT_RT_spouse_push_A - data_spr19\(gAAT_RT_spouse_pull_A data_spr19\)gAAT_bias_whoto_A <- data_spr19\(gAAT_RT_whoto_push_A - data_spr19\)gAAT_RT_whoto_pull_A data_spr19\(gAAT_bias_stranger_A <- data_spr19\)gAAT_RT_stranger_push_A - data_spr19\(gAAT_RT_stranger_pull_A data_spr19\)gAAT_bias_death_A <- data_spr19\(gAAT_RT_death_push_A - data_spr19\)gAAT_RT_death_pull_A data_spr19\(gAAT_bias_neutral_A <- data_spr19\)gAAT_RT_neutral_push_A - data_spr19$gAAT_RT_neutral_pull_A

    
    library(ggplot2)
    
    # Color by groups and facet
    #::::::::::::::::::::::::::::::::::::::::::::::::::::
    sp <- ggscatter(p, x = "roi_betas_CGvNCG_comp01_precentralgyrus", y = "ysl_10",
       color = "group", palette = "jco",
       add = "reg.line", conf.int = TRUE)
    sp + stat_cor(aes(color = group), label.x = 3)
    
    stat_cor(data = p, method = "pearson")
    
    p <- na.omit(subset(data_spr19, select=c("roi_betas_CGvNCG_comp01_precentralgyrus", "ysl_10", "group"))) # remove two people who were dropped from the imaging analysis
    
    # scatterplot
    sp <- ggscatter(p, x = "roi_betas_CGvNCG_comp01_precentralgyrus", y = "ysl_10",
       add = "reg.line",  # Add regression line
       add.params = list(color = "blue", fill = "lightgray"), # Customize reg. line
       conf.int = TRUE # Add confidence interval
       )
    # Add correlation coefficient
    sp + stat_cor(method = "pearson", label.x = 3, label.y = 6)
    
    # Use R2 instead of R
    ggscatter(p, x = "roi_betas_CGvNCG_comp01_precentralgyrus", y = "ysl_10", add = "reg.line",
              add.params = list(color = "blue", fill = "lightgray"), # Customize reg. line
              conf.int = TRUE, color="group") +
     stat_cor(
       aes(label = paste(..rr.label.., ..p.label.., sep = "~`,`~")),
      label.x = 3
    )
    
    corr.test(data_spr19$ysl_10, data_spr19$icg_2)
    plot(data_spr19$ysl_10, data_spr19$tot_icg)

    # Regression: predict ROI activation from

    
    
    a <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + tot_bisbas_basrr, data = data_spr19)
    b <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + tot_bisbas_basrr + tot_ysl_aff, data = data_spr19)
    anova(a,b)
    summary(a)
    summary(b)
    ggplotRegression(f)
    
    c <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + tot_bisbas_basrr + tot_ysl_cog, data = data_spr19)
    anova(a,c)
    summary(c)
    
    z <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + gAAT_spouse_v_str_A + tot_ysl_aff, data = data_spr19)
    summary(z)
    
    ggsave(ggplotRegression(b))
    ggsave("~/Desktop/regression.png", plot = last_plot(), device = NULL, path = NULL,
      scale = 1, width = 8, units = c("in"), dpi = 600)
    
    
    ggplotRegression <- function (fit) {
    ggplot(fit$model, aes_string(x = names(fit$model)[4], y = names(fit$model)[1])) + 
      geom_point() +
      stat_smooth(method = "lm", col = "blue") +
      labs(title = paste("R^2 = ",signif(summary(fit)$r.squared, 2),
                         "\nIntercept = ",signif(fit$coef[[1]],3 ),
                         "\nSlope =",signif(fit$coef[[4]], 2),
                         "\np =",signif(summary(fit)$coef[4,4], 2))) +
        xlab("Yearning in Situations of Loss - Affective Subscale") +
        ylab("ROI at -23, -20, 75 (betas)")
    }
    
    
    library(apaTables)
    apa.reg.table(b, filename="~/Dropbox/Dissertation/SPR-2019-poster/regressiontable.doc", table.number=1)
    
    ## Using behavioral data
    # create a variable for spouse - stranger
    neuvstr <- group <- subset(data_spr19, select=c(ID,
                                   gAAT_RT_spouse_pull_A,
                                   gAAT_RT_stranger_pull_A))
    
    spvstr$gAAT_spouse_v_str_A <- spvstr$gAAT_RT_spouse_pull_A-spvstr$gAAT_RT_stranger_pull_A
    
    data_spr19 <- left_join(data_spr19, spvstr, by = "ID")
    
    e <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + gAAT_bias_spouse_A, data = data_spr19)
    f <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg +  gAAT_bias_spouse_A + tot_ysl_aff, data = data_spr19)
    anova(e,f)
    summary(e)
    summary(f)
    
    g <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg +  gAAT_bias_spouse_A + tot_ysl_cog, data = data_spr19)
    anova(e,g)
    summary(g)
    
    data_spr19 <- left_join(data_spr19, mean_fd, by="ID")
    
    ####
    # QUESTIONS:
    #  INCLUDE COVARS IN REGRESSION MODEL?
    #  HOW TO TEST SPECIFICITY TO YEARNING VS. OTHER CG SX?
      
    corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$gAAT_spouse_v_str)
    plot(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$gAAT_spouse_v_str)
    corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_blame)
    corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_cherish)
    corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_future)
    corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_life)
    corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_others)
    corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_self)
    corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_threat)
    corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_world)

    Footnotes


    1. I’m sure there’s probably a way to modify this in SPM defaults but didn’t want to get too far into the weeds with that.

    2. Note: dir only searches recursively in Matlab 2016b and later. In earlier versions, it will only look for files in the specified directory and ignore any subdirectories.

    3. Ideally this would be a suffix rather than a prefix to be more consistent with BIDS naming standards, but so far haven’t figured that out.

    4. See error message here: https://sourceforge.net/p/icatb/mailman/message/31511448/
      I tried to address this by creating my own group mask and telling GIFT to use that instead of making its own, but ran into the same issue of the mask differing in dimensions from some of the data. Instructions below (adapted from http://www.jpeelle.net/mri/misc/creating_explicit_mask.html):

      How to make a group brain mask
      GIFT allows you to provide a user-specified mask, otherwise it will calculate one for you by default. Copy all of the BOLD mask files generated by fmriprep: $ find /Users/sarenseeley/Desktop/restingstate/data/derivatives/fmriprep -name "*task-rest_bold_space-MNI152NLin2009cAsym_brainmask.nii.gz" -depth 4 | xargs -I {} cp {} /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/mask Unzip them:

      $ cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/mask
      $ gunzip *brainmask.nii.gz

      gunzip --keep will keep the .gz files (by default they’re removed).
      We need to smooth them because having smoothed the images, some of the voxels will now be outside of the mask that fmriprep calculated on the unsmoothed images.
      Smooth them with a 4mm Gaussian kernel:
      In Matlab, do the same as above to get file names and add them to spm_smooth_mask_job.m. In spm_smooth_mask.m, change the line jobfile = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/spm_smooth_rest_job.m'}; to jobfile = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/spm_smooth_mask_job.m'};

      %% in matlab:
      run('/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/spm_smooth_mask.m')
      %% in matlab, edit spm_smooth_mask_job.m accordingly:
      matlabbatch{1}.spm.util.imcalc.input = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/mask/4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_brainmask.nii,1'
        '/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/mask/4mmSmoothed_sub-101_ses-txB_task-rest_bold_space-MNI152NLin2009cAsym_brainmask.nii,1'};
      %%
      matlabbatch{1}.spm.util.imcalc.output = 'group_bold_mask';
      matlabbatch{1}.spm.util.imcalc.outdir = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/mask'};
      matlabbatch{1}.spm.util.imcalc.expression = 'all(X)';
      matlabbatch{1}.spm.util.imcalc.var = struct('name', {}, 'value', {});
      matlabbatch{1}.spm.util.imcalc.options.dmtx = 1;
      matlabbatch{1}.spm.util.imcalc.options.mask = 0;
      matlabbatch{1}.spm.util.imcalc.options.interp = 1;
      matlabbatch{1}.spm.util.imcalc.options.dtype = 4;
      %% in matlab:
      run('/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/imcalc_group_mask.m')
    ---
title: "Analyses for SPR 2019 poster"
author: "Saren Seeley"
date: "Last updated 03-29-2019"
output: 
  html_notebook:
    toc: yes
    number_sections: no
    toc_float: yes
    theme: "default"
---
# Data preprocessing and QC
## Procedure
fMRI data was preprocessed in `fMRIPrep` (see description below for specifics). Used `MRIQC` (v0.14.2) individual and group reports to inspect image quality for anatomical and functional images. [Link to tracking sheet](https://docs.google.com/spreadsheets/d/17nQm0frBYUSJgH3Hg3yN6K_2ZjzWiib9hdFk0j9x7uM/edit?usp=sharing). 

More documentation and details on my `fmriprep` and `MRIQC` usage in the notebook [*Tutorial: BIDS, fMRIPrep, MRIQC*](http://rpubs.com/sarenseeley/463941)

### Excluded subjects

<ul><li>**sub-142** [D142] was excluded from analysis due to excessive movement (max FD 11mm. at txA, consistent other movements up to 5-6mm at txA, 3.8mm at txB).
<li>**sub-147** [D142] was excluded due to FOV issue at txB (a fair amount of brain outside FOV). This wasn't a problem for the task data that I'm aware of, so it was likely caused by him moving out of the FOV between sequences. </li></li></ul>

### fMRIPrep
Boilerplate from `fmriprep` reports:

>Results included in this manuscript come from preprocessing
performed using *fMRIPprep* 1.1.8 (@fmriprep1; @fmriprep2; RRID:SCR_016216),
which is based on *Nipype* 1.1.3 (@nipype1; @nipype2; RRID:SCR_002502).
<p>
<p>**Anatomical data preprocessing** <br>
A total of 2 T1-weighted (T1w) images were found within the input
BIDS dataset.
All of them were corrected for intensity non-uniformity (INU)
using `N4BiasFieldCorrection` [@n4, ANTs 2.2.0].
A T1w-reference map was computed after registration of
2 T1w images (after INU-correction) using
`mri_robust_template` [FreeSurfer 6.0.1, @fs_template].
The T1w-reference was then skull-stripped using `antsBrainExtraction.sh`
(ANTs 2.2.0), using OASIS as target template.
Brain surfaces were reconstructed using `recon-all` [FreeSurfer 6.0.1,
RRID:SCR_001847, @fs_reconall], and the brain mask estimated
previously was refined with a custom variation of the method to reconcile
ANTs-derived and FreeSurfer-derived segmentations of the cortical
gray-matter of Mindboggle [RRID:SCR_002438, @mindboggle].
Spatial normalization to the ICBM 152 Nonlinear Asymmetrical
template version 2009c [@mni, RRID:SCR_008796] was performed
through nonlinear registration with `antsRegistration`
[ANTs 2.2.0, RRID:SCR_004757, @ants], using
brain-extracted versions of both T1w volume and template.
Brain tissue segmentation of cerebrospinal fluid (CSF),
white-matter (WM) and gray-matter (GM) was performed on
the brain-extracted T1w using `fast` [FSL 5.0.9, RRID:SCR_002823,
@fsl_fast].
<p>**Functional data preprocessing** <br>
For each of the 2 BOLD runs found per subject (across all
tasks and sessions), the following preprocessing was performed.
First, a reference volume and its skull-stripped version were generated
using a custom methodology of *fMRIPrep*.
A deformation field to correct for susceptibility distortions was estimated
based on *fMRIPrep*'s *fieldmap-less* approach.
The deformation field is that resulting from co-registering the BOLD reference
to the same-subject T1w-reference with its intensity inverted [@fieldmapless1;
@fieldmapless2].
Registration is performed with `antsRegistration` (ANTs 2.2.0), and
the process regularized by constraining deformation to be nonzero only
along the phase-encoding direction, and modulated with an average fieldmap
template [@fieldmapless3].
Based on the estimated susceptibility distortion, an
unwarped BOLD reference was calculated for a more accurate
co-registration with the anatomical reference.
The BOLD reference was then co-registered to the T1w reference using
`bbregister` (FreeSurfer) which implements boundary-based registration [@bbr].
Co-registration was configured with nine degrees of freedom to account
for distortions remaining in the BOLD reference.
Head-motion parameters with respect to the BOLD reference
(transformation matrices, and six corresponding rotation and translation
parameters) are estimated before any spatiotemporal filtering using
`mcflirt` [FSL 5.0.9, @mcflirt].
BOLD runs were slice-time corrected using `3dTshift` from
AFNI  [@afni, RRID:SCR_005927].
The BOLD time-series (including slice-timing correction when applied)
were resampled onto their original, native space by applying
a single, composite transform to correct for head-motion and
susceptibility distortions.
These resampled BOLD time-series will be referred to as *preprocessed
BOLD in original space*, or just *preprocessed BOLD*.
Automatic removal of motion artifacts using independent component analysis
[ICA-AROMA, @aroma] was performed on the *preprocessed BOLD on MNI space*
time-series after a spatial smoothing with an isotropic, Gaussian kernel
of 6mm FWHM (full-width half-maximum).
Corresponding "non-aggresively" denoised runs were produced after such
smoothing.
Additionally, the "aggressive" noise-regressors were collected and placed
in the corresponding confounds file.
The BOLD time-series were resampled to MNI152NLin2009cAsym standard space,
generating a *preprocessed BOLD run in MNI152NLin2009cAsym space*.
Several confounding time-series were calculated based on the
*preprocessed BOLD*: framewise displacement (FD), DVARS and
three region-wise global signals.
FD and DVARS are calculated for each functional run, both using their
implementations in *Nipype* [following the definitions by @power_fd_dvars].
The three global signals are extracted within the CSF, the WM, and
the whole-brain masks.
Additionally, a set of physiological regressors were extracted to
allow for component-based noise correction [*CompCor*, @compcor].
Principal components are estimated after high-pass filtering the
*preprocessed BOLD* time-series (using a discrete cosine filter with
128s cut-off) for the two *CompCor* variants: temporal (tCompCor)
and anatomical (aCompCor).
Six tCompCor components are then calculated from the top 5% variable
voxels within a mask covering the subcortical regions.
This subcortical mask is obtained by heavily eroding the brain mask,
which ensures it does not include cortical GM regions.
For aCompCor, six components are calculated within the intersection of
the aforementioned mask and the union of CSF and WM masks calculated
in T1w space, after their projection to the native space of each
functional run (using the inverse BOLD-to-T1w transformation).
The head-motion estimates calculated in the correction step were also
placed within the corresponding confounds file.
The BOLD time-series, were resampled to surfaces on the following
spaces: *fsaverage5*.
All resamplings can be performed with *a single interpolation
step* by composing all the pertinent transformations (i.e. head-motion
transform matrices, susceptibility distortion correction when available,
and co-registrations to anatomical and template spaces).
Gridded (volumetric) resamplings were performed using `antsApplyTransforms` (ANTs),
configured with Lanczos interpolation to minimize the smoothing
effects of other kernels [@lanczos].
Non-gridded (surface) resamplings were performed using `mri_vol2surf`
(FreeSurfer).<p>
Many internal operations of *fMRIPrep* use
*Nilearn* 0.4.2 [@nilearn, RRID:SCR_001362],
mostly within the functional processing workflow.
For more details of the pipeline, see [the section corresponding
to workflows in *fMRIPrep*'s documentation](https://fmriprep.readthedocs.io/en/latest/workflows.html "FMRIPrep's documentation").

### MRIQC
Methods described on the [readthedocs site](https://mriqc.readthedocs.io/en/stable/). I ran [the classifier](https://mriqc.readthedocs.io/en/stable/classifier.html) on the T1w images and inspected the images flagged as likely problems (based on quality, i.e. classifier probability metric > .50): 

<ul>118_ses-txA<br>
122_ses-txB<br>
130_ses-txA<br>
130_ses-txB<br>
137_ses-txB<br>
142_ses-txA<br>
142_ses-txB<br>
148_ses-txA<br>
148_ses-txB<br></ul>

Apart from sub-142 (who I had already decided to drop), none of the anatomical images were poor enough quality in my eyes to drop given that I am only using them for registration purposes and the Freesurfer `bbregister` for BOLD-T1w registration seems to be pretty robust. 

* Note that for a few subjects (sub-142, sub-147), I had already used the T1w from *one* of their sessions as the anatomical image for *both* sessions because of breathing/coughing/movement artifacts (which means I swapped out the bad one for the okay one when I did the BIDS dataset conversion and DICOM import). 

# Pre-GIFT steps

## Reorganize data

Within `/data/derivatives/`, make a new directory for the GIFT analyses and outputs: 
```{unix}
$ cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/
$ mkdir -p gift/{analyses-spr-2019, data-smoothed, data-denoised}
```

`data-denoised/` is for the non-aggressively denoised images from ICA AROMA. _**Note: Did not actually end up using the denoised images because the number of components they generated was way too high (>100) and my computer couldn't handle it. I can fix this in future analyses by setting the number of ICs to estimate rather than asking GIFT to estimate it from the data.**_

`data-smoothed/` is for the non-denoised images, which will be smoothed in a later step.

We will move all of the preprocessed functional images from `fmriprep` here because when we smooth them later, SPM will dump the smoothed images into the original folder and I don't want those mixed in with the `fmriprep` derivatives directories.[^1] 

```{unix}
# this command finds files up to 4 subfolders down ending in the specified string/extension,
# then it copies those files to a new location while replicating the origin directory structure

$  cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/fmriprep
$  find . -type f -depth 4 -name "*smoothAROMAnonaggr_preproc.nii.gz" -print0 | cpio -pmd0 /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-denoised/
  
  $  cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/fmriprep
$  find . -type f -depth 4 -name "*task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii" -print0 | cpio -pmd0 /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/
```

GIFT requires files to be organized within individual subject folders, with different folders within the subject folder for each session if multiple (here, `ses-txA` and `ses-txB`). It will **not** find files in subdirectories within the session folder, so we need to move each `.nii` file up one level (i.e., to `data-smoothed/sub-101/txA` instead of `data-smoothed/sub-101/txA/func`).  I just did this by drag & dropping, couldn't figure out an efficient command line way to do it.

Move the excluded subjects to their own separate folder:
```{unix}
$  cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed # or /data-denoised
$  mv 'sub-142' 'sub-147' /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/excluded-subjects
```

Finally, if using the denoised images, unzip the `.nii.gz` files (N/A if using the non-denoised images) so that GIFT can work with them. 
```
$ gunzip -r data-denoised/ *.nii.gz
```
This recursively looks for files ending in *.nii.gz within `data-denoised/`, and uses `gunzip` to extract them.

## Smooth images
GIFT suggests that images are smoothed prior to running the gICA (however they also note, _"if your goal is to examine potential artifacts in your data, then you may not want to smooth, since the artifacts can be "hidden" by the smoothing process."_) The `fmriprep` does not perform spatial smoothing so this needs to happen if using the non-denoised preprocessed images. The ICA-AROMA non-aggressively denoised images have already been smoothed with an isotropic 6mm FWHM Gaussian kernel. 

For the non-denoised images, I chose to use a 4mm FWHM Gaussian kernel because we may be interested in smaller subcortical structures.

### SPM smooth
Use the batch interface of SPM to apply only the spatial smoothing module to preprocessed images. Make a new folder `batch` within `/gift/analyses-spr-2019` for all of the batch scripts we'll be using (both SPM and GIFT).

For smoothing, you need to modify **two** scripts: `spm_smooth_rest.m` and `spm_smooth_rest_job.m`. Both are generated when you use the batch editor in the SPM12 GUI to set up smoothing on one participant, then select "*save batch and script*" to save the script from the batch editor GUI as a template that you can edit.

The way the SPM batch interface works, the `_job` file contains inputs and parameters, such as a list of files to apply the function to and other parameters. When you run `spm_smooth_rest.m`, it calls on `spm_smooth_rest_job.m`.

First, open Matlab (I'm using 2016b).

Then, get the filepaths for all of the images to be smoothed using the script below in Matlab.[^2]

```
%% Make a variable called "inputs" 
%  containing paths to all files ending in '*task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii' 
%  (these images are the non-denoised outputs from fmriprep)

% This script recursively gets the names and paths of all the files ending in the specified extension
% then it puts those into a structure with cells "folder" and "name".
% We then get those cells out of the structure and combine the two using 'strcat' to get the complete filepath

cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed;
dirData = dir('**/*task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii');

folder = cell2table(transpose({dirData.folder}));
names = cell2table(transpose({dirData.name}));

names.Properties.VariableNames = {'filename'};
folder.Properties.VariableNames = {'path'};

files = strcat(folder.path,'/',names.filename);
```

Open the cell array `files` (should be 76x1 cell) by double-clicking on it, and copy the contents of the first column.

##### spm_smooth_rest_job.m
Open `spm_smooth_rest_job.m` and paste the contents of `files` into the place where SPM wants you to tell it which data to work on (`matlabbatch{1}.spm.spatial.smooth.data = {''}`). 

Also change the FWHM to "[4 4 4]" and the prefix to "4mmSmoothed_"[^3]:

```
matlabbatch{1}.spm.spatial.smooth.data = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/sub-101/ses-txA/sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii';
    '/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/sub-101/ses-txB/sub-101_ses-txB_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii'}; % all files you want to smooth should be listed in this field, separated by ;
matlabbatch{1}.spm.spatial.smooth.fwhm = [4 4 4];
matlabbatch{1}.spm.spatial.smooth.dtype = 0;
matlabbatch{1}.spm.spatial.smooth.im = 0;
matlabbatch{1}.spm.spatial.smooth.prefix = '4mmSmoothed_';
```
##### spm_smooth_rest.m
You shouldn't need to edit anything here apart from changing `nrun = X;` to `nrun = 1;`. 
```
% List of open inputs
nrun = 1;
jobfile = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/spm_smooth_rest_job.m'};
jobs = repmat(jobfile, 1, nrun);
inputs = cell(0, nrun);
for crun = 1:nrun
end
spm('defaults', 'FMRI');
spm_jobman('run', jobs, inputs{:});
```

Finally, type `run('/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/spm_smooth_rest.m')` in the Matlab command window, hit enter, and wait for it to complete smoothing for all of the files. This will take a while (about a minute or two per subject on my Mac) so be patient.


## AFNI 3dresample 

For the non-denoised images, GIFT fails when it tries to create a group mask before running the PCA[^4] due to the fact that some of the images have slightly different dimensions (ugh), I'm guessing due to slight protocol changes around the time when the scanner was updated. This doesn't seem to be an issue for the ICA-AROMA denoised images for some reason.

Used AFNI's `3dinfo` and `3dresample` tools, as well as the error/warnings log from the BIDS validator (which gave some warnings about some subjects/sessions having different dimensions) to identify affected sessions and to standardize dataset dimensions.s

Note that, apparently, small differences in dimensions (e.g., 1 or 2 voxels) aren't a problem, as this only seems to affect the subjects/sessions with the following warning:
```
	Type:		Warning
  File:		sub-119_ses-txA_task-rest_bold.nii
	Location:		data/sub-119/ses-txA/func/sub-119_ses-txA_task-rest_bold.nii
	Reason:		 The most common set of dimensions is: 92,80,29,180 (voxels), This file has the dimensions: 88,84,29,180 (voxels). The most common resolution is: 2.61mm x 2.61mm x 4.38mm x 2.00s, This file has the resolution: 2.50mm x 2.50mm x 4.38mm x 2.00s.
```
*How did I figure this out? I tried running GIFT on a bunch of different subsets of the data (e.g., first third of subjects, last third) to try to narrow down which ones were causing it to crash. Then once I had an idea of around where they were in the dataset, I looked in the BIDS validator log, guessed that I could start with the people with the larger differences, moved their subject folders out of derivatives/gift/data-smoothed/, and tried to run GIFT again. It ran, which confirmed my hypothesis.*

First, install AFNI: https://afni.nimh.nih.gov/pub/dist/doc/htmldoc/background_install/install_instructs/steps_mac.html
*Don't need to install netpbm or R (already have R, don't need netpbm) but did need to open XQuartz and > "Check for Updates" > install any updates.*

To run AFNI from Terminal, type:
`source ~/.bashrc afni` (see discussion of error message & solution here: https://afni.nimh.nih.gov/afni/community/board/read.php?1,158888,158908#msg-158908)

Within `/data/derivatives/`, make a new directory for the AFNI outputs: 
```{unix}
$ cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/
$ mkdir -p afni/{data-3dresample}

# if you need to copy ALL of the subjects' files over (this preserves directory structure):
# $ cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/
# $  find . -type f -depth 4 -name "4mmSmoothed.nii*" -print0 | cpio -pmd0  /Users/sarenseeley/Desktop/restingstate/data/derivatives/afni/data-3dresample/

# to copy to flat folder (only do this if you need to copy ALL of the subjects' files over)
# cp `find . -name "4mm*"` /Users/sarenseeley/Desktop/restingstate/data/derivatives/afni/data-3dresample   

```

Move the `4mmSmoothed*` files from `sub-101/ses-txA`, `sub-117/ses-txA`, `sub-119/ses-txA`, `sub-121/ses-txA` and `sub-122/ses-txB` into `derivatives/afni/data-3dresample`. AFNI *should* be able to cope with subdirectories (I think?) but I haven't figured that out yet. So, dump everything into the same data folder. 

**Note: In AFNI, a "dataset" means the file itself (e.g., `4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii`), not a folder containing subdirectories.**

##### 3dinfo
Documentation: https://afni.nimh.nih.gov/pub/dist/doc/program_help/3dinfo.html

Running `3dinfo` will show information about the "dataset", including dimensions:
```
$ 3dinfo 4mmSmoothed_sub-117_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc_3dresampled.nii
```
Example output:
```
++ 3dinfo: AFNI version=AFNI_19.0.24 (Mar  8 2019) [64-bit]
Dataset File:    4mmSmoothed_sub-117_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc_3dresampled.nii
Identifier Code: XYZ_qc1srKfckc7Z7t_aWfiPVQ  Creation Date: Sat Mar 16 16:28:37 2019
Template Space:  TLRC
Dataset Type:    Echo Planar (-epan)
Byte Order:      LSB_FIRST {assumed} [this CPU native = LSB_FIRST]
Storage Mode:    NIFTI
Storage Space:   216,270,000 (216 million) bytes
Geometry String: "MATRIX(-2.609,0,0,96,0,-2.609,0,132,0,0,4.375,-78):75,89,45"
Data Axes Tilt:  Plumb
Data Axes Orientation:
  first  (x) = Left-to-Right
  second (y) = Posterior-to-Anterior
  third  (z) = Inferior-to-Superior   [-orient LPI]
R-to-L extent:   -97.066 [R] -to-    96.000 [L] -step-     2.609 mm [ 75 voxels]
A-to-P extent:   -97.592 [A] -to-   132.000 [P] -step-     2.609 mm [ 89 voxels]
I-to-S extent:   -78.000 [I] -to-   114.500 [S] -step-     4.375 mm [ 45 voxels]
Number of time steps = 180  Time step = 1.00000s  Origin = 0.00000s
  -- At sub-brick #0 '?' datum type is float:     -4.60834 to       573.885
  -- At sub-brick #1 '?' datum type is float:     -4.66345 to       662.353
  -- At sub-brick #2 '?' datum type is float:     -4.29606 to       568.889
** For info on all 180 sub-bricks, use '3dinfo -verb' **`
```

##### 3dresample
Documentation: https://afni.nimh.nih.gov/pub/dist/doc/program_help/3dresample.html

`3dresample -master` allows you to change the dimensions of a specified "dataset" (aka NIFTI file) to match another "dataset" (aka NIFTI file). So this is what we want to do to fix the issue with some subjects/sessions having slightly different dimensions that causes GIFT to crash.

General usage: `3dresample -master [dataset you want to use as template, ending in .nii for NIFTI files] -prefix [what you want the new dataset to be named, ending in .nii for NIFTI files] -input [dataset to be resampled, ending in .nii for NIFTI files]`

Run the below lines:
```
$ 3dresample -master 4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii -prefix 4mmSmoothed_sub-117_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc_3dresampled.nii  -input 4mmSmoothed_sub-117_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii
$ 3dresample -master 4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii -prefix 4mmSmoothed_sub-119_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc_3dresampled.nii  -input 4mmSmoothed_sub-119_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii
$ 3dresample -master 4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii -prefix 4mmSmoothed_sub-121_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc_3dresampled.nii  -input 4mmSmoothed_sub-121_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii
$ 3dresample -master 4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii -prefix 4mmSmoothed_sub-122_ses-txB_task-rest_bold_space-MNI152NLin2009cAsym_preproc_3dresampled.nii  -input 4mmSmoothed_sub-122_ses-txB_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii
```
Then move the new files (`*_3rdresampled.nii`) to their respective folders. Because GIFT is going to look for files starting with `*4mmSmooth`, delete the original smoothed files so the session folder only has two files: (1) the non-smoothed preprocessed file and (2) the new 3dresampled file. *Note: I ended up removing the `_3rdresampled` suffix once those files were in their folders within `derivatives/gift/` because it was making it more complicated to edit some batch scripts.*

Now you are ready to run GIFT! (The PCA part, anyway...)

##### random AFNI bits
AFNI's GUI is *amazingly* confusing.

Look for datasets recursively:
```{unix}
$ cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/afni
$ afni -R data-3dresample
```


## Regressors (single-subject)
Don't technically need the single-subject design matrix for resting state data as GIFT does not use the design matrix when running the PCA/ICA. However, you could use the design matrix to provide temporal regressors which can be used for temporal sorting of components (on the single-subject level) in order to identify components that are particularly correlated with motion parameters or CSF/white matter signal fluctuations.

If you have task data, you DO need a design matrix for temporal sorting in order to identify task-related components.

### Create regressors files
Based on https://neurostars.org/t/confounds-from-fmriprep-which-one-would-you-use-for-glm/326/4, I want to set up the design matrix to include the the motion parameters `X`, `Y`, `Z`, `RotX`, `RotY`, and `RotZ` as well as `GlobalSignal`, `CSF` and `WhiteMatter`. These columns are located in the `sub-*_ses-tx*_task-rest_bold_confounds.tsv` files that fmriprep generates (one per subject/session, within `/func`).

*Note: It is not necessary to use the "NonSteadyStateOutliers" column because I set the GIFT batch script to exclude the first 3 volumes (based on MRIQC group report, this is the max number of dummy volumes for any subject).* 

Each subject and session needs a .txt or .mat file containing the regressors. 

The `confounds.tsv` files list one row per volume. 180 volumes = 180 rows. So we need to specify for SPM that these are in scans (aka volumes) rather than seconds.

First, we need to use R convert the TSV files to CSV (easier to work with), create new files with only the columns we want to include as multiple regressors for the SPM design matrix, and save those new files as `.txt` files to `/derivatives/gift/analyses-spr-2019/batch/regressors/`. 

```{r}
library(tidyverse)
filter <- dplyr::filter
select <- dplyr::select

# get a list of all of the confounds.tsv files
filenames <- list.files(path = "~/Desktop/restingstate/data/derivatives/fmriprep", pattern = "confounds.tsv", full.names = TRUE, recursive = TRUE)
dir <- "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/regressors"

# convert tsv to csv files
lapply(filenames, function(f) {
  df = read_tsv(f)
  write.csv(df, gsub("tsv", "csv", f), row.names=FALSE)
})

# get a list of the confounds.csv files
filenames.csv <- list.files(path = "~/Desktop/restingstate/data/derivatives/fmriprep", pattern = "confounds.csv", full.names = TRUE, recursive = TRUE)
# this means we now have both TSV and CSV versions of the confounds files, so can delete TSVs

# here's what I modified from something I found online...
# first, create an object with the directory name
the_dir <- "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/regressors"

check_create_dir <- function(the_dir) {
  if (!dir.exists(the_dir)) {
  dir.create(the_dir, recursive = TRUE) }
} # a function to check if the directory exists

# check to see if the directory exists (make it if it doesn't)
check_create_dir(the_dir)

# read in files (selected columns only) and write new files
for (file in filenames.csv) {
  # read in the csv
  the_data <- read_csv(file, col_names = TRUE, col_types=cols_only(
  GlobalSignal	= col_number(),
  CSF	= col_number(),
  WhiteMatter	= col_number(),
  X	= col_number(),
  Y	= col_number(),
  Z	= col_number(),
  RotX	= col_number(),
  RotY	= col_number(),
  RotZ	= col_number())) 
  # remove first 3 rows since we are dropping the first 3 TRs and starting with the 4th volume
  the_data <- the_data[-c(1:3), ]
  # write the csv to a new file
  write_csv(the_data, file.path(paste0(the_dir, "/", basename(file))), col_names = FALSE) 
  # col_names must be false, SPM doesn't want headers in the txt files
  # but this means you need to make sure you double-check the order of the regressors if/when labeling them later
}

# rename the files with the suffix "_regressors.txt" vs. "_confounds.csv.txt":
filenames.txt <- list.files(path = "~/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/regressors", pattern = "confounds.csv", full.names = TRUE, recursive = TRUE)
  sapply(filenames.txt,FUN=function(file){
      file.rename(from=file,to=sub(pattern="confounds.csv",replacement="regressors.txt",file))
})

```

All of the new `_regressors.txt` files we just generated are located in `Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/regressors`

### SPM design matrices

One design matrix (`SPM.mat`) per subject, per session. Scripts are located in `/data/derivatives/gift/analyses-spr-19/batch/multi_regressor_jobs`


##### spm_1stleveldesign_rest_job_[subj#]tx[A/B]_job.m

This file will look like the following, except here only the first (`MNI152NLin2009cAsym_preproc.nii,4`) and last (`MNI152NLin2009cAsym_preproc.nii,180`) file are shown below. 

_Note: For 4D NIFTI files, SPM requires specifying each 3D file individually using the comma and number following the `nii` suffix, e.g., `*preproc.nii,4` refers to the fourth volume. If you are using the file selector GUI, it only shows you the single un-numbered file by default. In order to select all of the relevant volumes, specify `4:180` in the filter field._

```{matlab}
%-----------------------------------------------------------------------
% Job saved on 22-Mar-2019 12:32:49 by cfg_util (rev $Rev: 6460 $)
% spm SPM - SPM12 (6685)
% cfg_basicio BasicIO - Unknown
%-----------------------------------------------------------------------
matlabbatch{1}.spm.stats.fmri_spec.dir = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/sub-101/ses-txA'};
matlabbatch{1}.spm.stats.fmri_spec.timing.units = 'scans';
matlabbatch{1}.spm.stats.fmri_spec.timing.RT = 2;
matlabbatch{1}.spm.stats.fmri_spec.timing.fmri_t = 16;
matlabbatch{1}.spm.stats.fmri_spec.timing.fmri_t0 = 8;
%%
matlabbatch{1}.spm.stats.fmri_spec.sess.scans = {
                                                 '/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/sub-101/ses-txA/4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii,4'
                                                 '/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/sub-101/ses-txA/4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_preproc.nii,180'
                                                 };
%%
matlabbatch{1}.spm.stats.fmri_spec.sess.cond = struct('name', {}, 'onset', {}, 'duration', {}, 'tmod', {}, 'pmod', {}, 'orth', {});
matlabbatch{1}.spm.stats.fmri_spec.sess.multi = {''};
matlabbatch{1}.spm.stats.fmri_spec.sess.regress = struct('name', {}, 'val', {});
matlabbatch{1}.spm.stats.fmri_spec.sess.multi_reg = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/regressors/sub-101_ses-txA_task-rest_bold_regressors.txt'};
matlabbatch{1}.spm.stats.fmri_spec.sess.hpf = 128;
matlabbatch{1}.spm.stats.fmri_spec.fact = struct('name', {}, 'levels', {});
matlabbatch{1}.spm.stats.fmri_spec.bases.hrf.derivs = [1 1];
matlabbatch{1}.spm.stats.fmri_spec.volt = 1;
matlabbatch{1}.spm.stats.fmri_spec.global = 'None';
matlabbatch{1}.spm.stats.fmri_spec.mthresh = 0.8;
matlabbatch{1}.spm.stats.fmri_spec.mask = {''};
matlabbatch{1}.spm.stats.fmri_spec.cvi = 'AR(1)';

```

##### spm_1stleveldesign_rest_job_all_job.m
I combined all of the individual `spm_1stleveldesign_rest_job_[subj#]tx[A/B]_job.m` scripts into one single SPM job in order to avoid running individual scripts if I don't need to.

##### spm_1stleveldesign_rest_job_all.m

```{matlab}
% List of open inputs
nrun = 1; % enter the number of runs here
jobfile = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/multi_regressor_jobs/spm_1stleveldesign_rest_job_all_job.m'};
jobs = repmat(jobfile, 1, nrun);
inputs = cell(0, nrun);
for crun = 1:nrun
end
spm('defaults', 'FMRI');
spm_jobman('run', jobs, inputs{:});
```


Usage: `run('/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/multi_regressor_jobs/spm_1stleveldesign_rest_job_all.m')`

This generates individual design matrix `SPM.mat` files and puts them into their appropriate subject/session folders.


## Regressors (group-level)

A design matrix with group-level regressors/covariates is used for post-ICA analyses in SPM. _**Note: this design matrix is created later, after the gICA is completed.**_

### Create group-level file
Data (one value per subject/session for each variable) come from two places: the group BOLD report generated by `MRIQC`, and the master dataset. For the SPR poster analysis, I created a new dataset containing mean framewise displacement and two subsets of other variables from the master dataset - one containing the measure total scores, and the other containing demographic and health-related variables.

```{r}
# read in master dataset
data <- readRDS("~/Dropbox/GLASS Lab/OT Study/data/master-dataset/ot-fmri_master-dataset_020719.rds")

# read in MRQC group stats to get mean FD 
mriqc <- read_tsv("~/Desktop/restingstate/data/derivatives/mriqc/group_bold.tsv", col_names = TRUE, col_types=cols_only(
  bids_name	= col_character(),
  fd_mean	= col_number()))

# split up mriqc by session
mriqc_A <- dplyr::filter(mriqc, grepl("ses-txA",bids_name))
mriqc_B <- dplyr::filter(mriqc, grepl("ses-txB",bids_name))

# then rename for merging with master dataset variables
mriqc_A <- rename(mriqc_A, ID = bids_name, 
       fd_mean_txA = fd_mean)

mriqc_B <- rename(mriqc_B, ID = bids_name, 
       fd_mean_txB = fd_mean)

mriqc_A <- mutate_if(tibble::as_tibble(mriqc_A), 
                                is.character, 
                                stringr::str_replace_all, pattern = "sub-", replacement = "D")
mriqc_A <- mutate_if(tibble::as_tibble(mriqc_A), 
                                is.character, 
                                stringr::str_replace_all, pattern = "_ses-txA_task-rest_bold", replacement = "")

mriqc_B <- mutate_if(tibble::as_tibble(mriqc_B), 
                                is.character, 
                                stringr::str_replace_all, pattern = "sub-", replacement = "D")
mriqc_B <- mutate_if(tibble::as_tibble(mriqc_B), 
                                is.character, 
                                stringr::str_replace_all, pattern = "_ses-txB_task-rest_bold", replacement = "")

mean_fd <- left_join(mriqc_A, mriqc_B, by="ID")

# add the fd_mean variables to a subset of variables from the master dataset

# make a subset of measure total scores
tots <- data[, grep('^tot|ID', names(data))]

# get a list of variable names to pick out the ones I want
names <- names(data)
cat(names, sep="\n")

# make a subset of other variables in master dataset
group <- subset(data, select=c(ID,
                               sex_m,
                               age,
                               age_yrs,
                               knowbefore_y,
                               caretaker_y,
                               education,
                               highestdegree,
                               education_other,
                               employment,
                               householdsize,
                               householdsize_adult,
                               householdsize_income,
                               ethnicity_hisp,
                               race,
                               race_other,
                               learnaboutstudy,
                               meds_y,
                               meds_rxonly,
                               meds_psychoactive,
                               meds_hrt,
                               meds_hormone_opiate,
                               meds_total,
                               majorhealthprobs_y,
                               majorhealthprobs_what,
                               postmenopausal_y,
                               n_pregnancies,
                               n_livebirths,
                               n_nursed,
                               alcohol,
                               smoking_y,
                               smoking_howlong,
                               smoking_perday,
                               caffeine_perday,
                               exercise,
                               timesincedeath,
                               yrs_together,
                               group,
                               group_20mFU,
                               timetorest_A,
                               timetorest_B,
                               timeto20mFU,
                               timeto20mFU_mos,
                               post_sroe_1_A,
                               post_sroe_2_A,
                               post_sroe_3_A,
                               post_sroe_1_B,
                               post_sroe_2_B,
                               post_sroe_3_B,
                               rs2254298,
                               rs2268498,
                               rs53576,
                               rs2254298_A.carrier,
                               rs2268498_C.carrier,
                               rs53576_A.carrier))

# merge all the group data
rest_data <- left_join(group, tots, by="ID") %>% left_join(., mean_fd)

# replace "D" prefix in ID to "sub-" to be BIDS-compliant
rest_data <- rest_data %>% 
  mutate(ID = str_replace(ID, "D", "sub-"))

# save as a .rds file and a csv file
write_csv(rest_data, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/group_covariates.csv")
saveRDS(rest_data, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/group_covariates.rds")

# and save a version without sub-142, sub-147 (excluded from GIFT analyses)
# this is what I will use to set up the model specification in SPM
rest_data_no_subs142147 <- filter(rest_data, !grepl("sub-142|sub-147",ID)) 
write_csv(rest_data, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/group_covariates_no-142-147.csv")
```

I was also saving some of these individually for use with MANCOVAN...
```{r}
# save the variables of interest individually for use in MANCOVAN toolbox 
age <- as_tibble(rest_data_no_subs142147$age_yrs)
write_csv(age, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/age.txt", col_names = FALSE)

sex <- as_tibble(rest_data_no_subs142147$sex_m)
write_csv(sex, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/sex.txt", col_names = FALSE)

group <- as_tibble(rest_data_no_subs142147$group)
write_csv(group, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/group.txt", col_names = FALSE)

bdi <- as_tibble(rest_data_no_subs142147$tot_bdi)
write_csv(bdi, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/bdi.txt", col_names = FALSE)

ysl <- as_tibble(rest_data_no_subs142147$tot_ysl)
write_csv(ysl, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/ysl.txt", col_names = FALSE)

icg <- as_tibble(rest_data_no_subs142147$tot_icg)
write_csv(icg, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/icg.txt", col_names = FALSE)

meds_totaln <- as_tibble(rest_data_no_subs142147$meds_total)
write_csv(meds_totaln, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/meds_totaln.txt", col_names = FALSE)

timesincedeath <- as_tibble(rest_data_no_subs142147$timesincedeath)
write_csv(timesincedeath, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/timesincedeath.txt", col_names = FALSE)

timetorest_A <- as_tibble(rest_data_no_subs142147$timetorest_A)
write_csv(timetorest_A, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/timetorest_txA.txt", col_names = FALSE)

timetorest_B <- as_tibble(rest_data_no_subs142147$timetorest_B)
write_csv(timetorest_B, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/timetorest_txB.txt", col_names = FALSE)

meanFD_txA <- as_tibble(rest_data_no_subs142147$fd_mean_txA)
write_csv(meanFD_txA, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/meanFD_txA.txt", col_names = FALSE)

meanFD_txB <- as_tibble(rest_data_no_subs142147$fd_mean_txB)
write_csv(meanFD_txB, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/group-covariates/meanFD_txB.txt", col_names = FALSE)
```

From these group data, can also do things like comparing how much people moved on average by group:
```{r}
library(psych)

# comparing mean FD between treatments (A/B) and groups:
t.test(rest_data_no_subs142147$fd_mean_txA, rest_data_no_subs142147$fd_mean_txB,
       alternative = "two.sided", paired = TRUE, var.equal = FALSE) # not significantly different

t.test(rest_data_no_subs142147$fd_mean_txA, rest_data_no_subs142147$fd_mean_txB,
       alternative = "two.sided", paired = TRUE, var.equal = FALSE) # not significantly different
  
t.test(fd_mean_txA ~ group, data=rest_data_no_subs142147) # not significantly different

t.test(fd_mean_txB ~ group, data=rest_data_no_subs142147) # not significantly different (p < .08) though the NCG group overall has a higher mean FD at txB (.37 vs .27) 

corr.test(rest_data_no_subs142147$fd_mean_txA, rest_data_no_subs142147$tot_icg) 
corr.test(rest_data_no_subs142147$fd_mean_txB, rest_data_no_subs142147$tot_icg) 
corr.test(rest_data_no_subs142147$fd_mean_txA, rest_data_no_subs142147$tot_bdi) 
corr.test(rest_data_no_subs142147$fd_mean_txB, rest_data_no_subs142147$tot_bdi) 
corr.test(rest_data_no_subs142147$fd_mean_txA, rest_data_no_subs142147$tot_pre_stai_A) 
corr.test(rest_data_no_subs142147$fd_mean_txB, rest_data_no_subs142147$tot_pre_stai_B) 
corr.test(rest_data_no_subs142147$fd_mean_txA, rest_data_no_subs142147$age_yrs)
corr.test(rest_data_no_subs142147$fd_mean_txB, rest_data_no_subs142147$age_yrs) 
```
Mean FD is not significantly different between treatments or groups, and is not correlated with total ICG score, BDI score, or pre-scan STAI (state anxiety). It is also not significantly associated with age.

### IMPORTANT 
Will be noted again below, but make sure that when you're importing and using the group covariates file to set up a design matrix, the subject order matches the order in which the files are listed in the SPM design, as well as matching any renumbering of subjects that GIFT has done (e.g., what GIFT says is subject 38 is the 38th file -- NOT sub-038).  

For each design matrix, you will likely need a different group covariates file with the rows ordered the way that the files are ordered in SPM (e.g., if you are entering subjects for whom `group`=0 as your first group, and for whom `group`=1 as your second group, all of the files for subjects in the first group will be listed first in the design matrix).

(Don't need to do anything about this now, but just be really aware of it when you get to the point of running group SPM stats with these covariates later).


# GIFT (Group ICA of fMRI Toolbox)
*"It is a MATLAB toolbox which implements multiple algorithms for independent component analysis and blind source separation of group (and single subject) functional magnetic resonance imaging data."* Download and documentation available here: http://mialab.mrn.org/software/gift/index.html 

For the poster analyses, I looked at treatment A alone.

_**Note: txA is referred to as "session 1" in GIFT (it labels sessions and subjects based on alphanumerical order of file/folder names).**_

_**Note:If you want to compare two sessions (or two groups), you should run them in a single ICA so that the component numbers match.**_

## Data reduction

### Run PCA & ICA #1
Use batch file `Input_data_subjects_restingstate_txA.m` to input data and initiate the analysis. Parameters used for the PCA and ICA data reduction steps are described in `/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__ica_parameter_info.mat`. 

Wait [hours] for it to finish.

Open GIFT by entering `gift` in the Matlab command line. Click the "Display GUI" button to look at the identified ICs in different ways (group, single-subject, means, timecourses, only specific ICs, etc.) You can edit display defaults, such as Z threshold to use for display, using the menu in the upper left corner of the window.

Write a summary of results to HTML by clicking the "Results Summary" button. View the HTML report, which shows ICASSO stability estimate plots as well as component spatial maps, timecourses, and spectra.

### Identify noise ICs
Based on spatial distribution, power spectra (including fALFF) and ICASSO stability estimates.

Allen et al., 2011:

>2.5.1 RSN selection
We identified a subset of C1 components considered to be RSNs (as opposed to physiological artifacts) by inspecting the aggregate SMs and average power spectra (see below; Figure 1, step 3).Four viewers rated the components from 0 (definite artifact) to 1 (certain RSN) based on expectations that RSNs should exhibit peak activations in gray matter, low spatial overlap with known vascular, ventricular, motion, and susceptibility artifacts, and TCs dominated by low frequency fluctuations (Cordes et al., 2000). To facilitate evaluation, spectra were characterized with two metrics used previously to classify components (Robinson et al., 2009): dynamic range, the difference between the peak power and minimum power at frequencies to the right of the peak, and low frequency to high frequency power ratio, the ratio of the integral of spectral power below 0.10 Hz to the integral of power between 0.15 and 0.25 Hz (Figure 3). 

_FYI:_ 

> ALFF is defined as the total power within the frequency range between 0.01 and 0.1 Hz, and thus indexes the strength or intensity of LFO. f/ALFF is defined as the power within the low-frequency range (0.01-0.1 Hz) divided by the total power in the entire detectable frequency range, and represents the relative contribution of specific LFO to the whole frequency range (Zuo et al., 2010).

You can also use the "Component Labeller" tool in GIFT to see how closely your ICs resemble established functional RSNs using the Stanford templates. From the GIFT manual:

> 3.12.8 Component Labeler
Option is provided to label components given the templates of interest. We provide templates in icatb/icatb_templates/RSN.zip file and description about the templates in icatb/icatb_templates/RSN.txt. Each component is correlated with the given templates and best template is selected based on the maximum correlation value.

Good ICs should have higher correlation values. Note that the labeller tries to correlate all of the voxels across the brain so the correlation values will likely be moderate, not very high.

### Remove noise ICs
I removed the following ICs:

* Eyeballs: 23, 38, 40, 41, 44
* Motion: 3, 5, 10, 35
* CSF: 8, 26, 34 (my notes are unclear here - may have removed 2 and 22 as well)
* Other: 1, 4, 12, 17, 21, 27

If I wasn't sure, I erred on the side of caution and left it in. Hence leaving in some that I couldn't tell if they were physiological artifact or legit subcortical stuff. 

After selecting the components to remove, GIFT will rewrite the data. At this point, GIFT creates generic subject and session numbers based on how many of each there are and the order in which it encounters the files (so for these data, subjects are labelled `Sub_001` through `Sub_038`, and the session folder that was `txA` is now called `1`). I created a file that links the original subject IDs and GIFT rewritten data IDs (`key_link-giftIDs-studyIDs.csv`) located within `/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_data`. This will be important later.

New data were written to `/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_data`.

### Run ICA #2
GIFT was run a second time on the noise-components-removed data. Results and outputs of this analysis are located within `/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_results`. 

Parameters used for the PCA and ICA data reduction steps are described in `/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_results/gift_txA_noisecompsremoved__ica_parameter_info.mat`. This time, I asked it to use 30 as the number of ICs vs. estimating from the data.

### Pick ICs
Based on the same criteria as used to determine noise components (stability, separation, and other ICASSO metrics, spatial distribution, power, fALFF), I selected six ICs to focus on.

I did notice that the fALFF scores overall were lower than those described in the Allen et al. (2011) paper, and decreased from those seen in the first run of the ICA. All of the below components had good stability and separation.

* IC1: Labelled as precuneus network, majority seems to be in cuneus, r = .31 with the RNS template precuneus network, fALFF = 3.4.
* IC5: Labelled as DMN/mPFC/PCC, r = .26, fALFF = 2.89. 
* IC9: Labelled as R CEN/dlPFC, r = .27, fALFF = 2.17.
* IC12: Labelled as DMN/MTL, r = .39, fALFF = 2.81.
* IC20: Labelled as DMN/mPFC/PCC, r = .40, fALFF = 1.25.

I also looked at IC2, which was labelled as SN/aI/dACC but wasn't too convinced it was that so dropped it (potential MR artifact?). Neurosynth decoder says the t map is most similar to primary motor cortex and a lot of somatosensory and movement-related terms. _**For future analyses, since there was no SN IC that was identified, may want to do a semi-blind ICA using the SN template for spatial constraint.**_

There were a number of other components that seemed decent and correlated with sensorimotor, language, auditory, visual, and visuospatial networks. 

The fact that GIFT identified multiple ICs related to a single given network (e.g., DMN, visual) suggests that I probably could have gone down to 20 so that there wasn't as much splitting of networks. OTOH, it's interesting to see how the different DMN components relate to different mental functions (per comparison to Neurosynth meta-analytic maps) given that there is evidence for parcellation of these large-scale networks.

## SPM stats
You can do this either via the "SPM Stats" button in GIFT or just fire up SPM. The first step is to run a one-sample T test which tests the effect against null. This creates a thresholded mask that can be used to mask subsequent analyses to constrain tests to just those voxels where a particular component showed supra-threshold signal. _*Note: you may or may not want to use the mask in later analyses. It depends: if you want to look just for intranetwork effects, then use it. If you are interested in areas outside the IC/network, then don't.*_

The next step was a two-sample t test on the individual images for each component with mean framewise displacement, age, sex, time since death, psychoactive medications (coded 0/1), and BDI score included as covariates. The t test compared CG and NCG. Note that voxelwise tests are testing for group differences in the spatial maps.

### Results
At my exploratory _p_ < .001 level there were a number of tiny clusters that showed up in one or both contrasts. Not too interesting. However, the CG > NCG contrast for IC1 did reveal a small cluster in the left dorsal precentral gyrus (-23 -20 75, _k_ = 17) that was significantly more correlated with the IC1 timecourse (I think? not 100% clear on interpretation) in the CG group than the NCG group, _T_ = 6.83, _Z_ = 5.26, voxelwise _p_ uncorrected = < .001, _pFDR_ = .028, _pFWE_ = .004 (.027 for clusterwise). 

#### CG > NCG: precentral gyrus
The precentral gyrus is involved in movement and motor imagery, but may also play a role in person-specific patterns of brain activity (Thornton & Mitchell, 2017; coordinates not listed) and reward outcome in addiction (fMRI meta analysis: Luijten et al., 2017 JAMA; coordinates extremely close to ours for the addicted > control contrast, see their table e5).

# marsbar (ROI analysis)
To look further at the precentral gyrus, I used marsbar to create a functional ROI (a mask of the cluster) and extract parameter estimates per subject. Then read these data into R to conduct subsequent analyses.

```{r}
install.packages("R.matlab")
library(R.matlab)
library(tidyverse)
library(ggpubr)
library(psych)

# read mat file with beta weights from CG > NCG contrast in the precentral gyrus cluster from IC1
roidata <- readMat("/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_results/gift_txA_noisecompsremoved__marsbar_roi_results/cg_ncg_-23_-20_75_comp01_betas.mat", 1)

betas <- as.tibble(roidata[[1]][[2]][[1]]) %>% rename(roi_betas_CGvNCG_comp01_precentralgyrus = V1)
beta_sub <- as.tibble(roidata[[1]][[1]][[1]]) %>% rename(fname = V1)

betas <-bind_cols(beta_sub, betas) %>% mutate(fname = str_replace(fname, "/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_results/gift_txA_noisecompsremoved__scaling_components_files/sub-", "file")) %>% mutate(fname = str_replace(fname, "_component_ica_ses-1_.nii", ""))

key <- read_csv("/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/results_no-sub-142_drop-3-vols_regressors_intnorm_txA/gift_txA__noise-comps-removed_results/key_link-giftIDs-studyIDs.csv")

betas <- left_join(betas, key, by="fname")

# now ready to merge betas with data
data <- readRDS("~/Dropbox/GLASS Lab/OT Study/data/master-dataset/ot-fmri_master-dataset_020719.rds")

data_spr19 <- left_join(data, betas, by="ID")
```

Correlations:
```{r}
# first used YSL item 10 as the sort of prototypic index of yearning..
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_10) # "Feeling of wanting back is so strong it is indescribable..."

# then looked at YSL total
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ysl) # not significantly correlated
```

At MFO's suggestion, looked separately at affective and cognitive aspects of yearning:

```{r}
# subset YSL into cognitive and affective subscales and calculate totals
ysl_cog <- subset(data_spr19, select=c("ysl_2","ysl_4", "ysl_8", "ysl_11", "ysl_14"))
data_spr19$tot_ysl_cog <- rowSums(ysl_cog, na.rm=TRUE) 
describe(data_spr19$tot_ysl_cog)

ysl_aff <- subset(data_spr19, select=c("ysl_6","ysl_7", "ysl_10", "ysl_16", "ysl_21"))
data_spr19$tot_ysl_aff <- rowSums(ysl_aff, na.rm=TRUE) 
describe(data_spr19$tot_ysl_aff)

# is the relationship with precentral gyrus specific to affective items, or not?
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ysl_cog) # r = .31, p = .06
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ysl_aff) # r = .40, p = .01
```

Are these correlations significantly different from each other?

For the below results:
<li>Null hypothesis is that the true difference in overlapping correlations IS equal to 0.
<li>Alternative hypothesis is that the true difference in correlations IS NOT equal to 0.
<li>Conclusions based on Steiger's (1980) <i>z</i> and Zou's (2007) confidence interval.

```{r}
library(cocor)

# correlation of postcentral gyrus betas (j) and YSL-cog (k):
jk <- corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ysl_cog) # r = .31, p = .058

# correlation of postcentral gyrus betas (j) and YSL-aff (h):
jh <- corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ysl_aff) # r = .40, p = .012

# correlation of YSL-cog (k) and YSL-aff (h), dropping D142 & D147:
data_spr19 <- filter(data_spr19, !grepl("D142|D147",ID))
                                    
kh <- corr.test(data_spr19$tot_ysl_cog, data_spr19$tot_ysl_aff) # r = .89, p < .001

cocor.dep.groups.overlap(r.jk=+(jk[[1]]), r.jh=+(jh[[1]]), r.kh=+(kh[[1]]), n=38, alternative="two.sided", alpha=0.05, conf.level=0.95, null.value=0)
```
Null hypothesis retained: the correlations of YSL-Aff and YSL-Cog with the precentral gyrus parameter estimates are NOT significantly different from each other. 

However, I realized that correlating the YSL and beta weights is a circular analysis: (1) ICG and YSL totals are highly correlated at .84 (p < .001), (2) the ICG was used to determine CG vs. NCG group, so (3) the voxels that emerged in the t test for group are biased towards being related to YSL totals. To account for this, need to include `tot_icg` in the linear model.

I also included BISBAS Reward Responsiveness in the model to try to control for overall trait approach bias. Selected BISBAS-RR because it was the index of approach bias that differed significantly between NCG and CG groups, whereas Drive and Fun-Seeking did not (BIS subscale also differed between groups, with CG being higher in BIS than NCG). 

```{r}
# compute bias scores from gAAT RT data as index of behavioral approach/avoid bias for each stimulus category
# positive scores = approach bias, negative scores = avoid bias
data_spr19$gAAT_bias_spouse_A <- data_spr19$gAAT_RT_spouse_push_A - data_spr19$gAAT_RT_spouse_pull_A
data_spr19$gAAT_bias_whoto_A <- data_spr19$gAAT_RT_whoto_push_A - data_spr19$gAAT_RT_whoto_pull_A
data_spr19$gAAT_bias_stranger_A <- data_spr19$gAAT_RT_stranger_push_A - data_spr19$gAAT_RT_stranger_pull_A
data_spr19$gAAT_bias_death_A <- data_spr19$gAAT_RT_death_push_A - data_spr19$gAAT_RT_death_pull_A
data_spr19$gAAT_bias_neutral_A <- data_spr19$gAAT_RT_neutral_push_A - data_spr19$gAAT_RT_neutral_pull_A

# t tests
t.test(data_spr19$tot_bisbas_basrr ~ data_spr19$group) # CG < NCG, p = .027
t.test(data_spr19$tot_bisbas_basdr ~ data_spr19$group) # no sig diff
t.test(data_spr19$tot_bisbas_basfun ~ data_spr19$group) # no sig diff
t.test(data_spr19$tot_bisbas_bis ~ data_spr19$group) # CG > NCG, p = .044
t.test(data_spr19$gAAT_bias_death_A ~ data_spr19$group) # no sig diff
t.test(data_spr19$gAAT_bias_neutral_A ~ data_spr19$group) # no sig diff
t.test(data_spr19$gAAT_bias_spouse_A ~ data_spr19$group) # no sig diff
t.test(data_spr19$gAAT_bias_stranger_A ~ data_spr19$group) # no sig diff
t.test(data_spr19$gAAT_bias_whoto_A ~ data_spr19$group) # no sig diff
```

Interestingly, the NCG group is actually higher in Reward Responsiveness than CG on average.  

```{r}
plot(data_spr19$tot_bisbas_basrr ~ data_spr19$group)
```

## Regression models
```{r}
# YSL total
a <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + tot_bisbas_basrr, data = data_spr19)
b <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + tot_bisbas_basrr + tot_ysl, data = data_spr19)
summary(a)
summary(b)
anova(a,b)

# YSL - affective items
c <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + tot_bisbas_basrr + tot_ysl_aff, data = data_spr19)
summary(c)
anova(a,c)

# YSL - cognitive items
d <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + tot_bisbas_basrr + tot_ysl_cog, data = data_spr19)
summary(d)
anova(a,d)

```
Model `c`, which includes YSL-Affective (but not YSL-Cognitive or YSL total) scores, is predictive of precentral gyrus activation controlling for overall grief severity and reward responsiveness. 

## Regression plot
```{r}
# make a function to plot a regression line w/stats from a model w/multiple predictors
# got this from https://sejohnston.com/2012/08/09/a-quick-and-easy-function-to-plot-lm-results-in-r/
# note that if you want to graph other predictors, you have to change the indexing to reflect that coefficient 
ggplotRegression <- function (fit) {
ggplot(fit$model, aes_string(x = names(fit$model)[4], y = names(fit$model)[1])) + 
  geom_point() +
  stat_smooth(method = "lm", col = "blue") +
  labs(title = paste("R^2 = ",signif(summary(fit)$r.squared, 2),
                     "\nIntercept = ",signif(fit$coef[[1]],3 ),
                     "\nSlope =",signif(fit$coef[[4]], 2),
                     "\np =",signif(summary(fit)$coef[4,4], 2))) +
    xlab("Yearning in Situations of Loss - Affective Subscale") +
    ylab("ROI at -23, -20, 75 (betas)")
}

summary(c)
ggplotRegression(c)

ggplot(c, aes(y = roi_betas_CGvNCG_comp01_precentralgyrus, x = tot_ysl_aff)) + 
  geom_point() +
  stat_smooth(method = "lm", col = "red")

ggsave(ggplotRegression(c))
ggsave("~/Dropbox/Dissertation/SPR-2019-poster/regression_ysl-aff.png", plot = last_plot(), device = NULL, path = NULL,
  scale = 1, width = 8, units = c("in"), dpi = 600)
```

## Wordcloud
Follows the approach from the Bethlehem et al. (2017) oxytocin ICA paper: 
>  "To better characterize the components showing an oxytocin-related effect on connectivity we used the decoder function in NeuroSynth34 to compare the whole-brain component maps with large-scale automated meta-analysis maps within NeuroSynth. The top 100 terms (excluding terms for brain regions) ranked by the correlation strength between the component map and the meta-analytic map were visualized as a word cloud using the wordcloud library in R, with the size of the font scaled by correlation strength."

To do this, I first uploaded the component spatial maps to Neurosynth and used the "decoder" function to view the terms associated with the correlations between the component map and meta-analytic map. I copied these terms and their correlation values to CSV files and removed all terms related to brain regions (such as "dlpfc", "gyrus", or "anterior posterior").

Then
```{r}
library(wordcloud)
library(RColorBrewer)
library(viridis) # use the "viridis" package for color palette (viridis palettes are colorblind-safe, many ggplot2 and RColorBrewer palettes are not)

# demo the package using code from RColorBrewer
n = 8 # change this to however many bins you want to see, a high number like 200 will show more of a spectrum
image(
  1:n, 1, as.matrix(1:n),
  col = viridis(n, option = "D"),
  xlab = "viridis n", ylab = "", xaxt = "n", yaxt = "n", bty = "n"
)
```


# Import data from MATLAB marsbar ROI output


plot(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ysl)


library(psych)
corr.test(rest_data_no_subs142147$fd_mean_txA, rest_data_no_subs142147$tot_icg)


corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_1) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_2)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_3) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_4) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_5)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_6) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_7) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_8)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_9) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_10) # "Feeling of wanting back is so strong it is indescribable..."
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_11) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_12) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_13) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_14)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_15) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_16) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_17) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_18) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_19) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_20) 
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$ysl_21)

corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_gcq_approp)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_gcq_blame)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_gcq_cherish)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_gcq_future)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_gcq_life)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_gcq_others)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_gcq_self)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$gcq_24)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$gcq_30)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_gcq_threat)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_gcq_world)

corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ecrrs_global_anx)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ecrrs_global_avoid)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ecrrs_spouse_anx)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ecrrs_spouse_avoid)

corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ucla_loneliness)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_bisbas_basdr)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_bisbas_basfun)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_bisbas_basrr)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_bisbas_bis)

corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ppss_fa)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$tot_ppss_fr)



corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$gAAT_bias_whoto_A)

data_spr19$gAAT_bias_spouse_A <- data_spr19$gAAT_RT_spouse_push_A - data_spr19$gAAT_RT_spouse_pull_A
data_spr19$gAAT_bias_whoto_A <- data_spr19$gAAT_RT_whoto_push_A - data_spr19$gAAT_RT_whoto_pull_A
data_spr19$gAAT_bias_stranger_A <- data_spr19$gAAT_RT_stranger_push_A - data_spr19$gAAT_RT_stranger_pull_A
data_spr19$gAAT_bias_death_A <- data_spr19$gAAT_RT_death_push_A - data_spr19$gAAT_RT_death_pull_A
data_spr19$gAAT_bias_neutral_A <- data_spr19$gAAT_RT_neutral_push_A - data_spr19$gAAT_RT_neutral_pull_A





```{r}

library(ggplot2)

# Color by groups and facet
#::::::::::::::::::::::::::::::::::::::::::::::::::::
sp <- ggscatter(p, x = "roi_betas_CGvNCG_comp01_precentralgyrus", y = "ysl_10",
   color = "group", palette = "jco",
   add = "reg.line", conf.int = TRUE)
sp + stat_cor(aes(color = group), label.x = 3)

stat_cor(data = p, method = "pearson")

p <- na.omit(subset(data_spr19, select=c("roi_betas_CGvNCG_comp01_precentralgyrus", "ysl_10", "group"))) # remove two people who were dropped from the imaging analysis

# scatterplot
sp <- ggscatter(p, x = "roi_betas_CGvNCG_comp01_precentralgyrus", y = "ysl_10",
   add = "reg.line",  # Add regression line
   add.params = list(color = "blue", fill = "lightgray"), # Customize reg. line
   conf.int = TRUE # Add confidence interval
   )
# Add correlation coefficient
sp + stat_cor(method = "pearson", label.x = 3, label.y = 6)

# Use R2 instead of R
ggscatter(p, x = "roi_betas_CGvNCG_comp01_precentralgyrus", y = "ysl_10", add = "reg.line",
          add.params = list(color = "blue", fill = "lightgray"), # Customize reg. line
          conf.int = TRUE, color="group") +
 stat_cor(
   aes(label = paste(..rr.label.., ..p.label.., sep = "~`,`~")),
  label.x = 3
)

corr.test(data_spr19$ysl_10, data_spr19$icg_2)
plot(data_spr19$ysl_10, data_spr19$tot_icg)
```

## # Regression: predict ROI activation from 
```{r}


a <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + tot_bisbas_basrr, data = data_spr19)
b <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + tot_bisbas_basrr + tot_ysl_aff, data = data_spr19)
anova(a,b)
summary(a)
summary(b)
ggplotRegression(f)

c <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + tot_bisbas_basrr + tot_ysl_cog, data = data_spr19)
anova(a,c)
summary(c)

z <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + gAAT_spouse_v_str_A + tot_ysl_aff, data = data_spr19)
summary(z)

ggsave(ggplotRegression(b))
ggsave("~/Desktop/regression.png", plot = last_plot(), device = NULL, path = NULL,
  scale = 1, width = 8, units = c("in"), dpi = 600)


ggplotRegression <- function (fit) {
ggplot(fit$model, aes_string(x = names(fit$model)[4], y = names(fit$model)[1])) + 
  geom_point() +
  stat_smooth(method = "lm", col = "blue") +
  labs(title = paste("R^2 = ",signif(summary(fit)$r.squared, 2),
                     "\nIntercept = ",signif(fit$coef[[1]],3 ),
                     "\nSlope =",signif(fit$coef[[4]], 2),
                     "\np =",signif(summary(fit)$coef[4,4], 2))) +
    xlab("Yearning in Situations of Loss - Affective Subscale") +
    ylab("ROI at -23, -20, 75 (betas)")
}


library(apaTables)
apa.reg.table(b, filename="~/Dropbox/Dissertation/SPR-2019-poster/regressiontable.doc", table.number=1)

```





```{r}
## Using behavioral data
# create a variable for spouse - stranger
neuvstr <- group <- subset(data_spr19, select=c(ID,
                               gAAT_RT_spouse_pull_A,
                               gAAT_RT_stranger_pull_A))

spvstr$gAAT_spouse_v_str_A <- spvstr$gAAT_RT_spouse_pull_A-spvstr$gAAT_RT_stranger_pull_A

data_spr19 <- left_join(data_spr19, spvstr, by = "ID")

e <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg + gAAT_bias_spouse_A, data = data_spr19)
f <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg +  gAAT_bias_spouse_A + tot_ysl_aff, data = data_spr19)
anova(e,f)
summary(e)
summary(f)

g <- lm(roi_betas_CGvNCG_comp01_precentralgyrus ~ tot_icg +  gAAT_bias_spouse_A + tot_ysl_cog, data = data_spr19)
anova(e,g)
summary(g)

data_spr19 <- left_join(data_spr19, mean_fd, by="ID")

####
# QUESTIONS:
#  INCLUDE COVARS IN REGRESSION MODEL?
#  HOW TO TEST SPECIFICITY TO YEARNING VS. OTHER CG SX?
  
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$gAAT_spouse_v_str)
plot(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data_spr19$gAAT_spouse_v_str)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_blame)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_cherish)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_future)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_life)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_others)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_self)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_threat)
corr.test(data_spr19$roi_betas_CGvNCG_comp01_precentralgyrus, data$tot_gcq_world)
```








































# Footnotes
[^1]: I'm sure there's probably a way to modify this in SPM defaults but didn't want to get too far into the weeds with that.
[^2]: Note: `dir` only searches recursively in Matlab 2016b and later. In earlier versions, it will only look for files in the specified directory and ignore any subdirectories.
[^3]: Ideally this would be a suffix rather than a prefix to be more consistent with BIDS naming standards, but so far haven't figured that out.
[^4]: See error message here: https://sourceforge.net/p/icatb/mailman/message/31511448/ <br>I tried to address this by creating my own group mask and telling GIFT to use that instead of making its own, but ran into the same issue of the mask differing in dimensions from some of the data. Instructions below (adapted from http://www.jpeelle.net/mri/misc/creating_explicit_mask.html):<p>
**How to make a group brain mask**<br>
GIFT allows you to provide a user-specified mask, otherwise it will calculate one for you by default. 
Copy all of the BOLD mask files generated by fmriprep: `$ find /Users/sarenseeley/Desktop/restingstate/data/derivatives/fmriprep -name "*task-rest_bold_space-MNI152NLin2009cAsym_brainmask.nii.gz" -depth 4 | xargs -I {} cp {} /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/mask`
Unzip them:
```
$ cd /Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/mask
$ gunzip *brainmask.nii.gz
```
`gunzip --keep` will keep the .gz files (by default they're removed).<br>
We need to smooth them because having smoothed the images, some of the voxels will now be outside of the mask that fmriprep calculated on the unsmoothed images.<br>
Smooth them with a 4mm Gaussian kernel:<br>
In Matlab, do the same as above to get file names and add them to `spm_smooth_mask_job.m`. In `spm_smooth_mask.m`, change the line `jobfile = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/spm_smooth_rest_job.m'};` to `jobfile = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/spm_smooth_mask_job.m'};`
```
%% in matlab:
run('/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/spm_smooth_mask.m')
```
```
%% in matlab, edit spm_smooth_mask_job.m accordingly:
matlabbatch{1}.spm.util.imcalc.input = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/mask/4mmSmoothed_sub-101_ses-txA_task-rest_bold_space-MNI152NLin2009cAsym_brainmask.nii,1'
  '/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/mask/4mmSmoothed_sub-101_ses-txB_task-rest_bold_space-MNI152NLin2009cAsym_brainmask.nii,1'};
%%
matlabbatch{1}.spm.util.imcalc.output = 'group_bold_mask';
matlabbatch{1}.spm.util.imcalc.outdir = {'/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/data-smoothed/mask'};
matlabbatch{1}.spm.util.imcalc.expression = 'all(X)';
matlabbatch{1}.spm.util.imcalc.var = struct('name', {}, 'value', {});
matlabbatch{1}.spm.util.imcalc.options.dmtx = 1;
matlabbatch{1}.spm.util.imcalc.options.mask = 0;
matlabbatch{1}.spm.util.imcalc.options.interp = 1;
matlabbatch{1}.spm.util.imcalc.options.dtype = 4;
```
```
%% in matlab:
run('/Users/sarenseeley/Desktop/restingstate/data/derivatives/gift/analyses-spr-2019/batch/imcalc_group_mask.m')
```