About

Overview

This ADA uses data from the project “The effect of intranasal oxytocin on neural functioning in widow(er)s” (#1503744420, PI: Mary-Frances O’Connor, Co-PI: Brian Arizmendi). The parent study was a double-blind, randomized crossover study designed to address conflicting accounts of motivated behavior in complicated grief using a novel approach-avoidance task variant (grief AAT; gAAT) that allows for comparison between participant-provided photos and nomothetic grief stimuli. Participants were 40 English-speaking adults ages 55-80 who had experienced the death of a spouse or long-term romantic partner between six months and three years prior to their participation.

Task

Participants attended two identical experimental sessions, at which they received one of two intranasal sprays (treatment A/B) and then took part in an fMRI scan, which involved a structural sequence, the gAAT, and a resting state sequence.

For the gAAT, five stimulus categories were presented: (1) deceased spouse, (2) living loved one [attachment figure], (3) stranger, (4) generic grief-related scenes, and (5) neutral scenes. Photos provided by the participant were scanned, resized to a standard 750-pixel height and width, and placed in colored frames to match the stimuli used for the non-idiographic categories (i.e., stranger, grief-related, neutral). Photos of the stranger were sex-matched to the participant’s partner.

During the task, participants were instructed to follow rules based on an irrelevant characteristic of the stimuli, namely, the color of the frame around the photo (e.g., “push the joystick when the frame is blue, pull when the frame is yellow”). The task was completed in two runs of approximately seven minutes each. In the second run, the instructions were reversed (e.g., “now pull the joystick for blue-framed photos”). Order of instructions was counterbalanced across participants. Stimuli were presented via Inquisit 4 (2014), in a pseudorandomized order determined by a genetic algorithm to optimize statistical power and psychological validity (Wager & Nichols, 2003). Each trial lasted 3000 ms, with 2500 ms allowed for response. The intertrial interval was 500 ms.

Participants completed the task using an MRI-compatible joystick. The stimuli were animated so that a joystick push would make the image smaller (as if they were pushing it away from them) and a joystick pull would make the image larger (as if they were bringing it nearer to them).

Variables

The primary dependent variable of interest for the present analysis is reaction time bias, computed as RTpush - RTpull. Negative bias scores (faster to push than to pull) indicate a relative avoidance bias, while positive scores (faster to pull than to push) indicate a relative approach bias. Because the two runs were identical in every way except for the instructions given, each trial had both a push RT and a pull RT (e.g., if run 1 trial 1 was a push trial, run 2 trial 1 was the matching pull trial, so subtracting the run 2 trial 1 RT from the run 1 trial 1 RT gives the bias score; this is further elaborated in the data cleaning script below), unless the participant failed to reverse the instructions.
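As a toy illustration of the pairing logic (hypothetical RT values; the actual computation is shown in the data cleaning script below):

# hypothetical example: run 1 trial 1 was a PUSH trial, run 2 trial 1 the matching PULL trial
rt_push <- 812              # ms, run 1 trial 1 (hypothetical)
rt_pull <- 745              # ms, run 2 trial 1 (hypothetical)
bias <- rt_push - rt_pull   # RTpush - RTpull = 67 ms; positive, so a relative approach bias on this trial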

Main predictors of interest included trial number (in order to establish whether people’s response bias changed over the 7-minute run), stimulus (neutral, spouse, stranger, living loved one, death), and condition (treatment A/B - oxytocin or placebo [I’m still blinded]).

Participants also completed a number of self-report measures (state and trait) and provided biological samples for OXTR genotyping and cortisol assay, some of which were used as predictors.

Equations

Level 1:

We first performed multilevel modeling (MLM) to estimate the effect of trial number (i.e., the random effect of time) on response bias (push RT - pull RT) on a given trial. The level-1 model estimated, for each participant j (j = 1–35), a regression line predicting that participant’s response bias in milliseconds on each trial i from the trial number. This model was represented as follows:
biasij = b0j + b1j(trialnumberij) + rij

where b0j, the intercept, is interpreted as participant j’s mean response bias for a given trial; b1j, the slope, represents the effect of trial number on response bias for participant j (a positive coefficient indicates that response bias increased across trials for that participant); and rij is the residual error term.

Level 2:

The level-2 models estimated the average effects for the entire sample. The models were represented as follows:

b0j = γ00 + μ0j

b1j = γ10 + μ1j

where the intercept, γ00, is interpreted as the average response bias for the entire sample; γ10 is the average effect of trial number on response bias (a positive coefficient indicates that response bias increased across trials for the sample as a whole), and is thus the primary estimate of interest; and μ0j and μ1j are the residual error terms.

Adding the fixed effect of stimulus type resulted in the following model:

b0j = γ00 + γ01(stimulusi) + μ0j

b1j = γ10 + γ11(stimulusi) + μ1j
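
For reference, a minimal sketch of how these models could be fit in R with lme4 (the package choice and the variable names bias_t, trialnum, stim, and ID are assumptions based on the dataset built below; the actual model-fitting code is not part of this section):

library(lme4)
# level-1/level-2 model above: random intercept and random slope for trial number, by participant
m_trial <- lmer(bias_t ~ trialnum + (trialnum | ID), data = mlm_bx4)
# adding stimulus type to both the intercept and slope equations implies a stimulus
# main effect plus a stimulus-by-trial-number interaction
m_stim <- lmer(bias_t ~ trialnum * stim + (trialnum | ID), data = mlm_bx4)
summary(m_stim)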

Data cleaning

Note: I went back to the raw data due to some inconsistencies that I found in the OT_ADAA_Omnibus.xlsx file from BA.

Pre-R steps:

  1. Removed all .iqdat files under 25KB or so in order to filter out pilot data and false starts.
  2. Removed non-study data (i.e., tests and OT undergrad files).
  3. Organized raw .iqdat files for each visit by run 1/run 2 (two separate folders) based on date/time stamp.

    In Terminal, within each of the folders in turn…

  4. Changed file extensions to .tsv (tab-separated) from .iqdat: find . -iname "*.iqdat" -exec bash -c 'mv "$0" "${0%\.iqdat}.tsv"' {} \;
  5. Merged the files: cat *.tsv > OT_gAAT_run-1.tsv (or OT_gAAT_run-2.tsv). Alternatively: mlr --tsvlite cat *.tsv > OT_gAAT_run-1.tsv (or OT_gAAT_run-2.tsv).
      5a. mlr --tsvlite drops the header from all but the first file, so column names are not repeated throughout the dataset (cat repeats the header line from every file; those rows are filtered out in R below).
Notes:
  • D117: Had 3 visits due to scanner technical issues. Use "_b" and "_c" as visit 1 and visit 2, respectively.
  • D114: Has multiple files from visit 2. Use the last two.
  • D135: Use files from 6/06/16 and 6/17/18 (scanner issues on 06/13/16). Use "_b" and "_c" as visit 1 and visit 2, respectively.
  • D147: Visit 1 run 2 is partial data for some reason (I do have notes that she switched back to PUSH-YELLOW during run 2, but we did not re-run it because we were running way over time).

Creating the MLM dataset

Import data

library(tidyverse)
# make sure the dplyr versions of filter/select are used (avoid masking by other packages)
filter <- dplyr::filter
select <- dplyr::select
# read in the data (only importing a subset of columns)
raw_r1 <- data.frame(read_tsv("~/Desktop/OT-gAAT_MLM-data/Run1_tsv/OT_gAAT_run-1.tsv", col_types=cols_only(
  date = col_integer(),
  time = col_character(),
  subject = col_character(),
  blockcode = col_character(),
  blocknum = col_character(),
  trialcode = col_character(),
  values.trialcode = col_character(),
  values.stimulus = col_character(),
  values.initialresponse = col_character(),
  values.RT = col_number()
)))
Warning: number of columns of result is not a multiple of vector length (arg 1)
154 parsing failures.
  row  col        expected    actual  file
  327  date       an integer  date    '~/Desktop/OT-gAAT_MLM-data/Run1_tsv/OT_gAAT_run-1.tsv'
  327  values.RT  a number    .       '~/Desktop/OT-gAAT_MLM-data/Run1_tsv/OT_gAAT_run-1.tsv'
  653  date       an integer  date    '~/Desktop/OT-gAAT_MLM-data/Run1_tsv/OT_gAAT_run-1.tsv'
  653  values.RT  a number    .       '~/Desktop/OT-gAAT_MLM-data/Run1_tsv/OT_gAAT_run-1.tsv'
  974  date       an integer  date    '~/Desktop/OT-gAAT_MLM-data/Run1_tsv/OT_gAAT_run-1.tsv'
  ...
See problems(...) for more details.
raw_r2 <- data.frame(read_tsv("~/Desktop/OT-gAAT_MLM-data/Run2_tsv/OT_gAAT_run-2.tsv", col_types=cols_only(
  date = col_integer(),
  time = col_character(),
  subject = col_character(),
  blockcode = col_character(),
  blocknum = col_character(),
  trialcode = col_character(),
  values.trialcode = col_character(),
  values.stimulus = col_character(),
  values.initialresponse = col_character(),
  values.RT = col_number()
)))
Warning: number of columns of result is not a multiple of vector length (arg 1)
154 parsing failures.
  row  col        expected    actual  file
  324  date       an integer  date    '~/Desktop/OT-gAAT_MLM-data/Run2_tsv/OT_gAAT_run-2.tsv'
  324  values.RT  a number    .       '~/Desktop/OT-gAAT_MLM-data/Run2_tsv/OT_gAAT_run-2.tsv'
  643  date       an integer  date    '~/Desktop/OT-gAAT_MLM-data/Run2_tsv/OT_gAAT_run-2.tsv'
  643  values.RT  a number    .       '~/Desktop/OT-gAAT_MLM-data/Run2_tsv/OT_gAAT_run-2.tsv'
  954  date       an integer  date    '~/Desktop/OT-gAAT_MLM-data/Run2_tsv/OT_gAAT_run-2.tsv'
  ...
See problems(...) for more details.
# header line repeats when I used `cat` to merge, so get those out of there
raw_r1 <- raw_r1 %>% filter(!is.na(date))
raw_r2 <- raw_r2 %>% filter(!is.na(date))
unique(raw_r1$subject)
 [1] "D101"   "D101_2" "D102"   "D102_B" "D103"   "D103_b" "D104"   "D104_b" "D105"   "D105_b" "D107"   "D107_b" "D110"   "D110_b" "D113"   "D113_b" "D114"   "D114_b" "D115"  
[20] "D115_b" "D117_b" "D117_c" "test"   "D118_b" "D119"   "D119_b" "D120"   "D120_b" "D121"   "D121_b" "D122"   "D122_b" "D123"   "D123_b" "D125"   "D125_b" "D126"   "D126_b"
[39] "D127"   "D127_b" "D128"   "D128_b" "D129"   "D129_b" "D130"   "D131"   "D131_b" "D132"   "D132_b" "D133"   "D133_b" "D134"   "D134_b" "D135"   "D135_c" "D137"   "D137_b"
[58] "D138"   "D138_b" "D139"   "D139_b" "D140"   "D140_b" "D141"   "D141_b" "D142"   "D144"   "D144_b" "D145"   "D145_b" "D146"   "D146_b" "D147"   "D147_b" "D148"   "D148_b"
# D118 is showing up as "test" for some reason (checked against the .iqdat file), and D130_b is missing
# both of D130's and D142's visits were entered as "D130"/"D142", so change the subject ID for the visit on 052416 to D130_b and the one on 111516 to D142_b
raw_r1$subject[raw_r1$date == 020916] <- "D118"
raw_r1$subject[raw_r1$date == 052416] <- "D130_b"
raw_r1$subject[raw_r1$date == 111516] <- "D142_b"
# fixing some other stuff in a similar vein
raw_r1$subject[raw_r1$subject == "D101_2"] <- "D101_b"
raw_r1$subject[raw_r1$subject == "D102_B"] <- "D102_b"
raw_r1$subject[raw_r1$subject == "D117_b"] <- "D117"
raw_r1$subject[raw_r1$subject == "D117_c"] <- "D117_b"
raw_r1$subject[raw_r1$subject == "D135_c"] <- "D135_b"
unique(raw_r1$subject) # 78 unique - that is correct
 [1] "D101"   "D101_b" "D102"   "D102_b" "D103"   "D103_b" "D104"   "D104_b" "D105"   "D105_b" "D107"   "D107_b" "D110"   "D110_b" "D113"   "D113_b" "D114"   "D114_b" "D115"  
[20] "D115_b" "D117"   "D117_b" "D118"   "D118_b" "D119"   "D119_b" "D120"   "D120_b" "D121"   "D121_b" "D122"   "D122_b" "D123"   "D123_b" "D125"   "D125_b" "D126"   "D126_b"
[39] "D127"   "D127_b" "D128"   "D128_b" "D129"   "D129_b" "D130"   "D130_b" "D131"   "D131_b" "D132"   "D132_b" "D133"   "D133_b" "D134"   "D134_b" "D135"   "D135_b" "D137"  
[58] "D137_b" "D138"   "D138_b" "D139"   "D139_b" "D140"   "D140_b" "D141"   "D141_b" "D142"   "D142_b" "D144"   "D144_b" "D145"   "D145_b" "D146"   "D146_b" "D147"   "D147_b"
[77] "D148"   "D148_b"
unique(raw_r2$subject)
 [1] "D101"     "D101_2Y"  "D102"     "D102_B"   "D103"     "D103_b"   "D104"     "D104_b"   "D105"     "D105_b"   "D107"     "D107_B_2" "D110"     "D110_b"   "D113"     "D113_b"  
[17] "D114"     "D114_b"   "D115"     "D115_b"   "D117_b"   "D117_c"   "D118"     "D118_b"   "D119"     "D119_b"   "D120"     "D120_b"   "D121"     "D121_b"   "D122"     "D122_b"  
[33] "D123"     "D123_b"   "D125"     "D125_b"   "D126"     "D126_B"   "D127"     "D127_b"   "D128"     "D128_b"   "D129"     "D129_b"   "D130"     "D130_b"   "D131"     "D131_b"  
[49] "D132"     "D132_b"   "D133"     "D133_b"   "D134"     "D134_b"   "D135"     "D135_c"   "D137"     "D137_b"   "D138"     "D138_b"   "D139"     "D139_b"   "D140"     "D140_b"  
[65] "D141"     "D141_b"   "D142"     "D144"     "D144_b"   "D145"     "D145_b"   "D146"     "D146_b"   "D147"     "D147_b"   "D148"     "D148_b"  
# some similar issues to fix
raw_r2$subject[raw_r2$subject == "D101_2Y"] <- "D101_b"
raw_r2$subject[raw_r2$subject == "D102_B"] <- "D102_b"
raw_r2$subject[raw_r2$subject == "D107_B_2"] <- "D107_b"
raw_r2$subject[raw_r2$subject == "D117_b"] <- "D117"
raw_r2$subject[raw_r2$subject == "D117_c"] <- "D117_b"
raw_r2$subject[raw_r2$subject == "D126_B"] <- "D126_b"
raw_r2$subject[raw_r2$subject == "D135_c"] <- "D135_b"
raw_r2$subject[raw_r2$date == 111516] <- "D142_b"
unique(raw_r2$subject) # 78 unique - that is correct
 [1] "D101"   "D101_b" "D102"   "D102_b" "D103"   "D103_b" "D104"   "D104_b" "D105"   "D105_b" "D107"   "D107_b" "D110"   "D110_b" "D113"   "D113_b" "D114"   "D114_b" "D115"  
[20] "D115_b" "D117"   "D117_b" "D118"   "D118_b" "D119"   "D119_b" "D120"   "D120_b" "D121"   "D121_b" "D122"   "D122_b" "D123"   "D123_b" "D125"   "D125_b" "D126"   "D126_b"
[39] "D127"   "D127_b" "D128"   "D128_b" "D129"   "D129_b" "D130"   "D130_b" "D131"   "D131_b" "D132"   "D132_b" "D133"   "D133_b" "D134"   "D134_b" "D135"   "D135_b" "D137"  
[58] "D137_b" "D138"   "D138_b" "D139"   "D139_b" "D140"   "D140_b" "D141"   "D141_b" "D142"   "D142_b" "D144"   "D144_b" "D145"   "D145_b" "D146"   "D146_b" "D147"   "D147_b"
[77] "D148"   "D148_b"
str(raw_r1)
'data.frame':   25746 obs. of  10 variables:
 $ date                  : int  101515 101515 101515 101515 101515 101515 101515 101515 101515 101515 ...
 $ time                  : chr  "15:38:09" "15:38:09" "15:38:09" "15:38:09" ...
 $ subject               : chr  "D101" "D101" "D101" "D101" ...
 $ blockcode             : chr  "AATpracticeB" "AATpracticeB" "AATpracticeB" "AATpracticeB" ...
 $ blocknum              : chr  "2" "2" "2" "2" ...
 $ trialcode             : chr  "practicetrial_B" "practiceincrease" "practiceincrease" "practiceincrease" ...
 $ values.trialcode      : chr  "practicetrial_B" "practicetrial_B" "practicetrial_B" "practicetrial_B" ...
 $ values.stimulus       : chr  "grayrectangle_Y.jpg" "grayrectangle_Y.jpg" "grayrectangle_Y.jpg" "grayrectangle_Y.jpg" ...
 $ values.initialresponse: chr  "PULL" "PULL" "PULL" "PULL" ...
 $ values.RT             : num  1411 1411 1411 1411 1411 ...
str(raw_r2)
'data.frame':   25457 obs. of  10 variables:
 $ date                  : int  101515 101515 101515 101515 101515 101515 101515 101515 101515 101515 ...
 $ time                  : chr  "15:48:12" "15:48:12" "15:48:12" "15:48:12" ...
 $ subject               : chr  "D101" "D101" "D101" "D101" ...
 $ blockcode             : chr  "AATpracticeA" "AATpracticeA" "AATpracticeA" "AATpracticeA" ...
 $ blocknum              : chr  "2" "2" "2" "2" ...
 $ trialcode             : chr  "practicetrial_A" "practicedecrease" "practicedecrease" "practicedecrease" ...
 $ values.trialcode      : chr  "practicetrial_A" "practicetrial_A" "practicetrial_A" "practicetrial_A" ...
 $ values.stimulus       : chr  "grayrectangle_B.jpg" "grayrectangle_B.jpg" "grayrectangle_B.jpg" "grayrectangle_B.jpg" ...
 $ values.initialresponse: chr  "PUSH" "PUSH" "PUSH" "PUSH" ...
 $ values.RT             : num  1472 1472 1472 1472 1472 ...
# take out ITIs
unique(raw_r1$trialcode)
 [1] "practicetrial_B"      "practiceincrease"     "endincrease_practice" "InterTrialInterval"   "practicetrial_A"      "practicedecrease"     "enddecrease_practice"
 [8] "AAT_1"                "AAT_5"                "AAT_6"                "AAT_3"                "AAT_7"                "AAT_4"                "AAT_8"               
[15] "AAT_9"                "AAT_2"                "AAT_10"               "AAT_11"               "AAT_12"              
mlm_r1 <- filter(raw_r1, grepl("^A",trialcode))
unique(mlm_r1$trialcode)
 [1] "AAT_1"  "AAT_5"  "AAT_6"  "AAT_3"  "AAT_7"  "AAT_4"  "AAT_8"  "AAT_9"  "AAT_2"  "AAT_10" "AAT_11" "AAT_12"
unique(raw_r2$trialcode)
 [1] "practicetrial_A"      "practicedecrease"     "enddecrease_practice" "InterTrialInterval"   "practicetrial_B"      "practiceincrease"     "endincrease_practice"
 [8] "AAT_1"                "AAT_5"                "AAT_6"                "AAT_3"                "AAT_7"                "AAT_4"                "AAT_8"               
[15] "AAT_9"                "AAT_2"                "AAT_10"               "AAT_11"               "AAT_12"              
mlm_r2 <- filter(raw_r2, grepl("^A",trialcode))
unique(mlm_r2$trialcode)
 [1] "AAT_1"  "AAT_5"  "AAT_6"  "AAT_3"  "AAT_7"  "AAT_4"  "AAT_8"  "AAT_9"  "AAT_2"  "AAT_10" "AAT_11" "AAT_12"
# add a "trialnum" variable, grouped by ID
mlm_r1 <- mlm_r1 %>% group_by(subject) %>% mutate(trialnum = row_number())
View(mlm_r2)
mlm_r2 <- mlm_r2 %>% group_by(subject) %>% mutate(trialnum = row_number())
View(mlm_r2)
mlm_r1 %>% group_by(trialnum) %>% count(trialnum) # everyone has 144 trials, hooray!
mlm_r2 %>% group_by(trialnum) %>% count(trialnum) # everyone has 144 trials, except one person who has 177
mlm_r2 %>% group_by(subject) %>% count() # that person is D122_b
# no clue why that happened, although I do vaguely remember we had one session where the task seemed to go on and on and we force quit it?
# remove trials 145-177 for D122_b
mlm_r2 <- filter(mlm_r2, trialnum <=144)
mlm_r2 %>% group_by(trialnum) %>% count(trialnum)
# now everyone has 144 trials
# add a column for "run"
mlm_r1$run <- "1"
mlm_r2$run <- "2"
# add a column for "visit"
mlm_r1$visit <- as.factor(ifelse(grepl("_b$", mlm_r1$subject), "2", "1"))
mlm_r2$visit <- as.factor(ifelse(grepl("_b$", mlm_r2$subject), "2", "1"))
mlm_r1 %>% group_by(subject) %>% count(visit) # all looks good
mlm_r2 %>% group_by(subject) %>% count(visit) # all looks good
# now remove the "_b" from visit 2 IDs
mlm_r1 <- mlm_r1 %>% ungroup() %>%
  mutate(subject = str_replace(subject, "_b", ""))
unique(mlm_r1$subject)
 [1] "D101" "D102" "D103" "D104" "D105" "D107" "D110" "D113" "D114" "D115" "D117" "D118" "D119" "D120" "D121" "D122" "D123" "D125" "D126" "D127" "D128" "D129" "D130" "D131" "D132"
[26] "D133" "D134" "D135" "D137" "D138" "D139" "D140" "D141" "D142" "D144" "D145" "D146" "D147" "D148"
mlm_r2 <- mlm_r2 %>% ungroup() %>%
  mutate(subject = str_replace(subject, "_b", ""))
unique(mlm_r2$subject)
 [1] "D101" "D102" "D103" "D104" "D105" "D107" "D110" "D113" "D114" "D115" "D117" "D118" "D119" "D120" "D121" "D122" "D123" "D125" "D126" "D127" "D128" "D129" "D130" "D131" "D132"
[26] "D133" "D134" "D135" "D137" "D138" "D139" "D140" "D141" "D142" "D144" "D145" "D146" "D147" "D148"
# merge data from the 2 runs 
mlm_bx <- bind_rows(mlm_r1, mlm_r2)
# split up by visit
mlm_v1 <- mlm_bx %>% filter(visit == "1")
mlm_v2 <- mlm_bx %>% filter(visit == "2")
mlm_v1 %>% group_by(trialnum) %>% count(trialnum) # everyone has 144 trials, hooray!
mlm_v2 %>% group_by(trialnum) %>% count(trialnum) # everyone has 144 trials, hooray!
# add columns for "cond" at visit 1 and visit 2 [condition: A or B]
# load in the randomization data
randomize <- readRDS("~/Dropbox/GLASS Lab/OT Study/data/cleaned-data/randomization_clean.rds")
head(randomize)
mlm_v1 <- mlm_v1 %>%
  rename("ID" = subject)
mlm_v2 <- mlm_v2 %>%
  rename("ID" = subject)
# get the IDs for everyone who got treatment A at visit 1 
IDs_v1_txA <- randomize %>% filter(tx_v1 == "A")
View(IDs_v1_txA)
# if ID is in txA_list (everyone who got treatment A at visit 1), make cond_v1 = "A", else "B"; cond_v2 is the reverse
txA_list <- IDs_v1_txA$ID
mlm_v1 <- mlm_v1 %>%
  mutate(cond_v1 = as.factor(ifelse(ID %in% txA_list, "A", "B")),
         cond_v2 = as.factor(ifelse(ID %in% txA_list, "B", "A")))
head(mlm_v1)
# people who were A at visit 1 should be B at visit 2, and vice versa
mlm_v2 <- mlm_v2 %>%
  mutate(cond_v2 = as.factor(ifelse(ID %in% txA_list, "B", "A")),
         cond_v1 = as.factor(ifelse(ID %in% txA_list, "A", "B")))
levels(mlm_v1$cond_v1)
[1] "A" "B"
levels(mlm_v2$cond_v2)
[1] "A" "B"
head(mlm_v1)
  
IDs_v1_txA$ID
 [1] "D101" "D105" "D109" "D110" "D114" "D115" "D116" "D118" "D119" "D120" "D123" "D124" "D126" "D128" "D132" "D135" "D136" "D137" "D138" "D140" "D141" "D145" "D148" "D149" "D117"
mlm_v1 %>% group_by(cond_v1) %>% count(ID) # IDs and cond match txA_list
mlm_v2 %>% group_by(cond_v2) %>% count(ID) # IDs and cond match txA_list
mlm_bx <- bind_rows(mlm_v1,mlm_v2)
View(mlm_bx)
# save it
saveRDS(mlm_bx, "~/Dropbox/2018Fall/FSHD617C/mlm_bx.rds")
# rename some variables
colnames(mlm_bx)
 [1] "date"                   "time"                   "ID"                     "blockcode"              "blocknum"               "trialcode"              "values.trialcode"      
 [8] "values.stimulus"        "values.initialresponse" "values.RT"              "trialnum"               "run"                    "visit"                  "cond_v1"               
[15] "cond_v2"               
mlm_bx <- mlm_bx %>% rename("push_pull" = values.initialresponse,
                  "gAAT_RT" = values.RT)
head(mlm_bx)
# add a new variable for condition
mlm_bx1 <- mlm_bx %>%
  mutate(cond = as.factor(case_when(visit == "1" ~ as.character(cond_v1),
                                    visit == "2" ~ as.character(cond_v2))))
View(mlm_bx1) # check that it coded the "cond" column correctly      
# add a new variable for stimulus category
# in values.stimulus:
# 1 = spouse, 2 = living loved one/WHOTO, 3 = stranger, 4 = nomothetic death-related, 5 = neutral images
sort(unique(mlm_bx1$values.stimulus))
 [1] "1B_1.jpg" "1B_2.jpg" "1B_3.jpg" "1Y_1.jpg" "1Y_2.jpg" "1Y_3.jpg" "2B_1.jpg" "2B_2.jpg" "2B_3.jpg" "2Y_1.jpg" "2Y_2.jpg" "2Y_3.jpg" "3B_1.jpg" "3B_2.jpg" "3B_3.jpg" "3Y_1.jpg"
[17] "3Y_2.jpg" "3Y_3.jpg" "4B_1.jpg" "4B_2.jpg" "4B_3.jpg" "4Y_1.jpg" "4Y_2.jpg" "4Y_3.jpg" "5B_1.jpg" "5B_2.jpg" "5B_3.jpg" "5Y_1.jpg" "5Y_2.jpg" "5Y_3.jpg"
mlm_bx2 <- mlm_bx1 %>%
  mutate(stim = as.factor(case_when(
    values.stimulus %in% c("1B_1.jpg", "1B_2.jpg", "1B_3.jpg", "1Y_1.jpg", "1Y_2.jpg", "1Y_3.jpg") ~ "spouse",
    values.stimulus %in% c("2B_1.jpg", "2B_2.jpg", "2B_3.jpg", "2Y_1.jpg", "2Y_2.jpg", "2Y_3.jpg") ~ "living",
    values.stimulus %in% c("3B_1.jpg", "3B_2.jpg", "3B_3.jpg", "3Y_1.jpg", "3Y_2.jpg", "3Y_3.jpg") ~ "stranger",
    values.stimulus %in% c("4B_1.jpg", "4B_2.jpg", "4B_3.jpg", "4Y_1.jpg", "4Y_2.jpg", "4Y_3.jpg") ~ "death",
    values.stimulus %in% c("5B_1.jpg", "5B_2.jpg", "5B_3.jpg", "5Y_1.jpg", "5Y_2.jpg", "5Y_3.jpg") ~ "neutral",
    TRUE ~ NA_character_
  )))
levels(mlm_bx2$stim)
[1] "death"    "living"   "neutral"  "spouse"   "stranger"
mlm_bx2$stim <- relevel(mlm_bx2$stim, ref = "neutral") # make neutral the reference level
                       
# save it
saveRDS(mlm_bx2, "~/Dropbox/2018Fall/FSHD617C/mlm_bx2.rds")

Bias scores

Split the data by run, flag and drop extreme RTs (below the 1st or above the 99th percentile), pair each run 1 trial with the matching run 2 trial, and compute the bias score as push RT minus pull RT.

mlm_r1.1 <- mlm_bx2 %>% filter(run == "1")
tail(mlm_r1.1) # does it line up?
mlm_r2.1 <- mlm_bx2 %>% filter(run == "2")
tail(mlm_r2.1) 
p99r1 <- quantile(mlm_r1.1$gAAT_RT, 0.99) 
p01r1 <- quantile(mlm_r1.1$gAAT_RT, 0.01) 
p99r1 # 99th percentile RTs >= 1713.38ms 
    99% 
1713.38 
p01r1 # 1st percentile RTs <= 472ms
 1% 
472 
p99r2 <- quantile(mlm_r2.1$gAAT_RT, 0.99) 
p01r2 <- quantile(mlm_r2.1$gAAT_RT, 0.01) # 1st percentile RTs <= 463ms
p99r2 # 99th percentile RTs >= 1712.45ms 
    99% 
1712.45 
p01r2 # 1st percentile RTs <= 463ms
 1% 
463 
mlm_r1.2 <- mlm_r1.1 %>%
  mutate(outlier_RT = ifelse(gAAT_RT <= p01r1 | gAAT_RT >= p99r1, 1, 0))
table(mlm_r1.2$outlier_RT) 

    0     1 
11005   227 
227/11005 # 227 trials will be dropped, 11005 will be retained...about 2% of trials.
[1] 0.02062699
mlm_r2.2 <- mlm_r2.1 %>%
  mutate(outlier_RT = ifelse(gAAT_RT <= p01r2 | gAAT_RT >= p99r2, 1, 0))
table(mlm_r2.2$outlier_RT) 

    0     1 
11002   230 
230/11002 # 230 trials will be dropped, 11002 will be retained...about 2% of trials.
[1] 0.02090529
mlm_r1r2 <- bind_cols(mlm_r1.2, mlm_r2.2) # 11232 trials total.
head(mlm_r1r2) # make sure that trials in the two runs line up post-merge
tail(mlm_r1r2) # make sure that trials in the two runs line up post-merge
# get rid of any trials with outliers in either run1 or run2 
# (need run1/run2 to be the same length so that the trials match up)
mlm_r1r2.1 <- mlm_r1r2 %>% filter(outlier_RT == 0 & outlier_RT1 == 0) 
count(mlm_r1r2.1) # 10790 obs.
10790/11232 # 96% of trials retained = 4% of trials dropped. That feels acceptable.
[1] 0.9606481
push_pull <- mlm_r1r2.1 %>% filter(push_pull == "PUSH" & push_pull1 == "PULL") %>% mutate(bias = (gAAT_RT-gAAT_RT1))
pull_push <- mlm_r1r2.1 %>% filter(push_pull == "PULL" & push_pull1 == "PUSH") %>% mutate(bias = (gAAT_RT1-gAAT_RT))
mlm_pp <- bind_rows(push_pull,pull_push)
View(mlm_pp)

Transformations

library(psych)
describe(mlm_pp$bias) # skewness is 0.1, kurtosis is 1.69
   vars    n  mean     sd median trimmed    mad   min  max range skew kurtosis   se
X1    1 9880 11.83 246.66      8    9.97 192.74 -1157 1119  2276  0.1     1.69 2.48
hist(mlm_pp$bias)

qqnorm(mlm_pp$bias)

ggplot(mlm_pp, aes(sample=bias))+stat_qq()+stat_qq_line() # not a good look at all...

# install.packages("bestNormalize")
# what's the best transformation for my data?
library(bestNormalize)
orderNorm_bias <- bestNormalize(mlm_pp$bias) # suggested orderNorm transform as best
boxcox  did not work;  Error in estimate_boxcox_lambda(x, ...) : x must be positive
exp_x  did not work;  Error in exp_x(standardize = TRUE, warn = TRUE, x = c(26, -187, 45, 782,  : 
  Transformation finite for less than 3 x values
Ties in data, Normal distribution not guaranteed

mlm_pp$bias_t <- orderNorm_bias$x.t # pull x.t from orderNorm_bias and assign it back into dataset
describe(mlm_pp$bias_t)
   vars    n mean sd median trimmed mad   min  max range skew kurtosis   se
X1    1 9880    0  1      0       0   1 -3.61 3.89   7.5    0    -0.02 0.01
hist(mlm_pp$bias_t)

qqnorm(mlm_pp$bias_t)

ggplot(mlm_pp, aes(sample=bias_t))+stat_qq()+stat_qq_line() # so much better!

# back-transformation, just to show how to do it:
p <- predict(orderNorm_bias)
x2 <- predict(orderNorm_bias, newdata = p, inverse = TRUE) # x2 is the back-transformed data
all.equal(x2, mlm_pp$bias_t) 
[1] "Mean relative difference: 0.9955923"
mlm_bx3 <- mlm_pp
saveRDS(mlm_bx3, "~/Dropbox/2018Fall/FSHD617C/mlm_bx3.rds")

Checking the data

table <- mlm_bx3 %>% group_by(ID) %>% group_by(cond) %>% count(ID) 
table
# missing data...
# cond A: D101 (3 trials), D141 (18 trials), D148 (19 trials)
# cond B:  D147 (1 trial)
describe(table$n)
   vars  n   mean    sd median trimmed  mad min max range  skew kurtosis   se
X1    1 78 126.67 28.35    134  133.11 5.93   1 144   143 -3.58    12.05 3.21
ggplot(table, aes(sample=n))+stat_qq()+stat_qq_line() # can see those 4 on the QQ plot

# where are the trials getting dropped?
mlm_r1r2 %>% group_by(ID) %>% group_by(cond) %>% count(ID) # everyone has 144 here
mlm_r1r2.1 %>% group_by(ID) %>% group_by(cond) %>% count(ID) 
# after getting rid of outliers, most people missing a few trials (D107 cond A/B and D113 cond A missing >20)
mlm_r1r2.1 %>% filter(ID=="D101" & cond=="A")
mlm_r1r2.1 %>% filter(ID=="D141" & cond=="A")
mlm_r1r2.1 %>% filter(ID=="D148" & cond=="A")
mlm_r1r2.1 %>% filter(ID=="D147" & cond=="B")
# so trials are getting dropped because they didn't effectively reverse the instructions or they reverted to the old instructions
# in essence, this removes error trials
# create a "drop" variable for these 4, just in case...
mlm_bx3 <- mlm_bx3 %>% mutate(low_n_trials = ifelse(ID == "D101" | ID == "D141" | ID == "D147" | ID == "D148", 1, 0))
# check whether the rows line up
notlinedup <- mlm_bx3 %>% filter(stim != stim1)
notlinedup
notlinedup <- mlm_bx3 %>% filter(trialcode != trialcode1)
notlinedup
notlinedup <- mlm_bx3 %>% filter(trialnum != trialnum1)
notlinedup
# 0 rows returned; all the rows match
head(mlm_bx3)
saveRDS(mlm_bx3, "~/Dropbox/2018Fall/FSHD617C/mlm_bx3.rds")

Add time-invariant variables

The next step is to add demographics and other variables that we might want to include in the model but that do not change over time. Looking at the codebook for the master dataset, these include:

Demographics:

  • age
  • age_yrs
  • sex_m
  • ethnicity_hisp
  • race
  • timesincedeath

Genetic variables:

  • rs2254298
  • rs2268498
  • rs53576
  • rs2254298_A.carrier
  • rs2268498_C.carrier
  • rs53576_A.carrier

Self-report measures:

  • tot_icg [grief]
  • tot_bdi [depression]
  • tot_bgq [brief grief questionnaire with impairment item]
  • tot_ecrrs_global_anx [anxious attachment - global]
  • tot_ecrrs_global_avoid [avoidant attachment - global]
  • tot_ecrrs_spouse_anx [anxious attachment - spouse]
  • tot_ecrrs_spouse_avoid [avoidant attachment - spouse]
  • tot_ysl [yearning for the deceased]
# load in master dataset
master <- readRDS("~/Dropbox/GLASS Lab/OT Study/data/cleaned-data/selfreports_oxtr_cortisol_random_AB_bx_clean.rds") 
# subset the selected variables
mlm_add <- subset(master, select=c(ID,age, age_yrs, sex_m, ethnicity_hisp, race, timesincedeath, rs2254298, rs2268498, rs53576, rs2254298_A.carrier, rs2268498_C.carrier, rs53576_A.carrier, tot_bdi, tot_bgq, tot_ecrrs_global_anx,tot_ecrrs_global_avoid, tot_ecrrs_spouse_anx, tot_ecrrs_spouse_avoid, tot_ysl, tot_icg))
head(mlm_add)
# merge into the behavioral dataset
mlm_bx4 <- left_join(mlm_bx3, mlm_add, by="ID")
head(mlm_bx4)
# save it
saveRDS(mlm_bx4, "~/Dropbox/2018Fall/FSHD617C/mlm_bx4.rds")

Exploratory analysis and checking assumptions

The first step is to get an overview of the entire dataset using summarytools.

library(summarytools)
# overview the entire dataset
# this will look like TOTAL crap in the console (notebook output)
# use view(dfSummary(mlm_bx4)) from the console to see a much more readable and attractive output
dfSummary(mlm_bx4)
Data Frame Summary   
mlm_bx4     
N: 9880   
-------------------------------------------------------------------------------------------------------------------------------------------------------
No   Variable                  Stats / Values                     Freqs (% of Valid)      Text Graph                               Valid      Missing  
---- ------------------------- ---------------------------------- ----------------------- ---------------------------------------- ---------- ---------
1    date                      mean (sd) : 63183.18 (35369.28)    73 distinct val.                          :                      9880       0        
     [integer]                 min < med < max :                                          : .               :                      (100%)     (0%)     
                               11416 < 52316 < 121515                                     : : . :           :                                          
                               IQR (CV) : 69500 (0.56)                                    : : : :   : .   : :                                          
                                                                                          : : : : . : :   : :                                          

2    time                      1. 12:36:02                        144 ( 1.5%)                                                      9880       0        
     [character]               2. 14:24:39                        144 ( 1.5%)                                                      (100%)     (0%)     
                               3. 15:06:35                        144 ( 1.5%)                                                                          
                               4. 13:46:36                        142 ( 1.4%)                                                                          
                               5. 16:32:30                        142 ( 1.4%)                                                                          
                               6. 14:12:22                        141 ( 1.4%)                                                                          
                               7. 16:29:40                        141 ( 1.4%)                                                                          
                               8. 11:18:48                        140 ( 1.4%)                                                                          
                               9. 13:07:05                        140 ( 1.4%)                                                                          
                               10. 13:15:27                       140 ( 1.4%)                                                                          
                               [ 68 others ]                      8462 (85.8%)            IIIIIIIIIIIIIIII                                             

3    ID                        1. D110                            284 ( 2.9%)                                                      9880       0        
     [character]               2. D120                            283 ( 2.9%)                                                      (100%)     (0%)     
                               3. D123                            282 ( 2.9%)                                                                          
                               4. D132                            281 ( 2.8%)                                                                          
                               5. D127                            279 ( 2.8%)                                                                          
                               6. D146                            277 ( 2.8%)                                                                          
                               7. D102                            275 ( 2.8%)                                                                          
                               8. D130                            275 ( 2.8%)                                                                          
                               9. D138                            274 ( 2.8%)                                                                          
                               10. D131                           273 ( 2.8%)                                                                          
                               [ 29 others ]                      7097 (71.8%)            IIIIIIIIIIIIIIII                                             

4    blockcode                 1. AAT4                            894 ( 9.0%)             IIIIIIIIIII                              9880       0        
     [character]               2. AAT6                            894 ( 9.0%)             IIIIIIIIIII                              (100%)     (0%)     
                               3. AAT1                            892 ( 9.0%)             IIIIIIIIIII                                                  
                               4. AAT5                            889 ( 9.0%)             IIIIIIIIIII                                                  
                               5. AAT7                            886 ( 9.0%)             IIIIIIIIIII                                                  
                               6. AAT3                            873 ( 8.8%)             IIIIIIIIIII                                                  
                               7. AAT10                           843 ( 8.5%)             IIIIIIIIII                                                   
                               8. AAT9                            839 ( 8.5%)             IIIIIIIIII                                                   
                               9. AAT8                            834 ( 8.4%)             IIIIIIIIII                                                   
                               10. AAT2                           804 ( 8.1%)             IIIIIIIIII                                                   
                               [ 2 others ]                       1232 (12.5%)            IIIIIIIIIIIIIIII                                             

5    blocknum                  1. 154                             74 ( 0.8%)                                                       9880       0        
     [character]               2. 71                              74 ( 0.8%)                                                       (100%)     (0%)     
                               3. 15                              73 ( 0.7%)                                                                           
                               4. 22                              73 ( 0.7%)                                                                           
                               5. 102                             72 ( 0.7%)                                                                           
                               6. 107                             72 ( 0.7%)                                                                           
                               7. 117                             72 ( 0.7%)                                                                           
                               8. 124                             72 ( 0.7%)                                                                           
                               9. 125                             72 ( 0.7%)                                                                           
                               10. 130                            72 ( 0.7%)                                                                           
                               [ 134 others ]                     9154 (92.9%)            IIIIIIIIIIIIIIII                                             

6    trialcode                 1. AAT_4                           894 ( 9.0%)             IIIIIIIIIII                              9880       0        
     [character]               2. AAT_6                           894 ( 9.0%)             IIIIIIIIIII                              (100%)     (0%)     
                               3. AAT_1                           892 ( 9.0%)             IIIIIIIIIII                                                  
                               4. AAT_5                           889 ( 9.0%)             IIIIIIIIIII                                                  
                               5. AAT_7                           886 ( 9.0%)             IIIIIIIIIII                                                  
                               6. AAT_3                           873 ( 8.8%)             IIIIIIIIIII                                                  
                               7. AAT_10                          843 ( 8.5%)             IIIIIIIIII                                                   
                               8. AAT_9                           839 ( 8.5%)             IIIIIIIIII                                                   
                               9. AAT_8                           834 ( 8.4%)             IIIIIIIIII                                                   
                               10. AAT_2                          804 ( 8.1%)             IIIIIIIIII                                                   
                               [ 2 others ]                       1232 (12.5%)            IIIIIIIIIIIIIIII                                             

7    values.trialcode          1. AAT_4                           894 ( 9.0%)             IIIIIIIIIII                              9880       0        
     [character]               2. AAT_6                           894 ( 9.0%)             IIIIIIIIIII                              (100%)     (0%)     
                               3. AAT_1                           892 ( 9.0%)             IIIIIIIIIII                                                  
                               4. AAT_5                           889 ( 9.0%)             IIIIIIIIIII                                                  
                               5. AAT_7                           886 ( 9.0%)             IIIIIIIIIII                                                  
                               6. AAT_3                           873 ( 8.8%)             IIIIIIIIIII                                                  
                               7. AAT_10                          843 ( 8.5%)             IIIIIIIIII                                                   
                               8. AAT_9                           839 ( 8.5%)             IIIIIIIIII                                                   
                               9. AAT_8                           834 ( 8.4%)             IIIIIIIIII                                                   
                               10. AAT_2                          804 ( 8.1%)             IIIIIIIIII                                                   
                               [ 2 others ]                       1232 (12.5%)            IIIIIIIIIIIIIIII                                             

8    values.stimulus           1. 3Y_3.jpg                        418 ( 4.2%)             I                                        9880       0        
     [character]               2. 3Y_2.jpg                        413 ( 4.2%)             I                                        (100%)     (0%)     
                               3. 3Y_1.jpg                        411 ( 4.2%)             I                                                            
                               4. 2B_3.jpg                        386 ( 3.9%)             I                                                            
                               5. 2B_2.jpg                        382 ( 3.9%)                                                                          
                               6. 2B_1.jpg                        371 ( 3.8%)                                                                          
                               7. 1B_3.jpg                        362 ( 3.7%)                                                                          
                               8. 1B_2.jpg                        336 ( 3.4%)                                                                          
                               9. 3B_1.jpg                        336 ( 3.4%)                                                                          
                               10. 1B_1.jpg                       334 ( 3.4%)                                                                          
                               [ 20 others ]                      6131 (62.1%)            IIIIIIIIIIIIIIII                                             

9    push_pull                 1. PULL                            4953 (50.1%)            IIIIIIIIIIIIIIII                         9880       0        
     [character]               2. PUSH                            4927 (49.9%)            IIIIIIIIIIIIIII                          (100%)     (0%)     

10   gAAT_RT                   mean (sd) : 794.92 (203.04)        959 distinct val.         : .                                    9880       0        
     [numeric]                 min < med < max :                                            : :                                    (100%)     (0%)     
                               473 < 756 < 1712                                           . : : :                                                      
                               IQR (CV) : 251 (0.26)                                      : : : : .                                                    
                                                                                          : : : : : : .                                                

11   trialnum                  mean (sd) : 72.87 (41.54)          144 distinct val.       : . . : . . : . . :                      9880       0        
     [integer]                 min < med < max :                                          : : : : : : : : : :                      (100%)     (0%)     
                               1 < 73 < 144                                               : : : : : : : : : :                                          
                               IQR (CV) : 72 (0.57)                                       : : : : : : : : : :                                          
                                                                                          : : : : : : : : : :                                          

12   run                       1. 1                               9880 (100.0%)           IIIIIIIIIIIIIIII                         9880       0        
     [character]                                                                                                                   (100%)     (0%)     

13   visit                     1. 1                               4683 (47.4%)            IIIIIIIIIIIIII                           9880       0        
     [factor]                  2. 2                               5197 (52.6%)            IIIIIIIIIIIIIIII                         (100%)     (0%)     

14   cond_v1                   1. A                               5014 (50.7%)            IIIIIIIIIIIIIIII                         9880       0        
     [factor]                  2. B                               4866 (49.2%)            IIIIIIIIIIIIIII                          (100%)     (0%)     

15   cond_v2                   1. A                               4866 (49.2%)            IIIIIIIIIIIIIII                          9880       0        
     [factor]                  2. B                               5014 (50.7%)            IIIIIIIIIIIIIIII                         (100%)     (0%)     

16   cond                      1. A                               4787 (48.4%)            IIIIIIIIIIIIIII                          9880       0        
     [factor]                  2. B                               5093 (51.5%)            IIIIIIIIIIIIIIII                         (100%)     (0%)     

17   stim                      1. neutral                         1882 (19.1%)            IIIIIIIIIIIII                            9880       0        
     [factor]                  2. death                           1790 (18.1%)            IIIIIIIIIIIII                            (100%)     (0%)     
                               3. living                          2101 (21.3%)            IIIIIIIIIIIIIII                                              
                               4. spouse                          1906 (19.3%)            IIIIIIIIIIIII                                                
                               5. stranger                        2201 (22.3%)            IIIIIIIIIIIIIIII                                             

18   outlier_RT                mean (sd) : 0 (0)                  0.00 : 9880 (100.0%)    IIIIIIIIIIIIIIII                         9880       0        
     [numeric]                 min < med < max :                                                                                   (100%)     (0%)     
                               0 < 0 < 0                                                                                                               
                               IQR (CV) : 0 (NaN)                                                                                                      

19   date1                     mean (sd) : 63183.18 (35369.28)    73 distinct val.                          :                      9880       0        
     [integer]                 min < med < max :                                          : .               :                      (100%)     (0%)     
                               11416 < 52316 < 121515                                     : : . :           :                                          
                               IQR (CV) : 69500 (0.56)                                    : : : :   : .   : :                                          
                                                                                          : : : : . : :   : :                                          

20   time1                     1. 12:45:25                        144 ( 1.5%)                                                      9880       0        
     [character]               2. 14:34:34                        144 ( 1.5%)                                                      (100%)     (0%)     
                               3. 15:15:51                        144 ( 1.5%)                                                                          
                               4. 13:55:57                        142 ( 1.4%)                                                                          
                               5. 16:42:40                        142 ( 1.4%)                                                                          
                               6. 14:22:28                        141 ( 1.4%)                                                                          
                               7. 16:40:38                        141 ( 1.4%)                                                                          
                               8. 11:28:07                        140 ( 1.4%)                                                                          
                               9. 13:15:44                        140 ( 1.4%)                                                                          
                               10. 13:26:09                       140 ( 1.4%)                                                                          
                               [ 68 others ]                      8462 (85.8%)            IIIIIIIIIIIIIIII                                             

21   ID1                       1. D110                            284 ( 2.9%)                                                      9880       0        
     [character]               2. D120                            283 ( 2.9%)                                                      (100%)     (0%)     
                               3. D123                            282 ( 2.9%)                                                                          
                               4. D132                            281 ( 2.8%)                                                                          
                               5. D127                            279 ( 2.8%)                                                                          
                               6. D146                            277 ( 2.8%)                                                                          
                               7. D102                            275 ( 2.8%)                                                                          
                               8. D130                            275 ( 2.8%)                                                                          
                               9. D138                            274 ( 2.8%)                                                                          
                               10. D131                           273 ( 2.8%)                                                                          
                               [ 29 others ]                      7097 (71.8%)            IIIIIIIIIIIIIIII                                             

22   blockcode1                1. AAT4                            894 ( 9.0%)             IIIIIIIIIII                              9880       0        
     [character]               2. AAT6                            894 ( 9.0%)             IIIIIIIIIII                              (100%)     (0%)     
                               3. AAT1                            892 ( 9.0%)             IIIIIIIIIII                                                  
                               4. AAT5                            889 ( 9.0%)             IIIIIIIIIII                                                  
                               5. AAT7                            886 ( 9.0%)             IIIIIIIIIII                                                  
                               6. AAT3                            873 ( 8.8%)             IIIIIIIIIII                                                  
                               7. AAT10                           843 ( 8.5%)             IIIIIIIIII                                                   
                               8. AAT9                            839 ( 8.5%)             IIIIIIIIII                                                   
                               9. AAT8                            834 ( 8.4%)             IIIIIIIIII                                                   
                               10. AAT2                           804 ( 8.1%)             IIIIIIIIII                                                   
                               [ 2 others ]                       1232 (12.5%)            IIIIIIIIIIIIIIII                                             

23   blocknum1                 1. 154                             74 ( 0.8%)                                                       9880       0        
     [character]               2. 71                              74 ( 0.8%)                                                       (100%)     (0%)     
                               3. 15                              73 ( 0.7%)                                                                           
                               4. 22                              73 ( 0.7%)                                                                           
                               5. 102                             72 ( 0.7%)                                                                           
                               6. 107                             72 ( 0.7%)                                                                           
                               7. 117                             72 ( 0.7%)                                                                           
                               8. 124                             72 ( 0.7%)                                                                           
                               9. 125                             72 ( 0.7%)                                                                           
                               10. 130                            72 ( 0.7%)                                                                           
                               [ 134 others ]                     9154 (92.9%)            IIIIIIIIIIIIIIII                                             

24   trialcode1                1. AAT_4                           894 ( 9.0%)             IIIIIIIIIII                              9880       0        
     [character]               2. AAT_6                           894 ( 9.0%)             IIIIIIIIIII                              (100%)     (0%)     
                               3. AAT_1                           892 ( 9.0%)             IIIIIIIIIII                                                  
                               4. AAT_5                           889 ( 9.0%)             IIIIIIIIIII                                                  
                               5. AAT_7                           886 ( 9.0%)             IIIIIIIIIII                                                  
                               6. AAT_3                           873 ( 8.8%)             IIIIIIIIIII                                                  
                               7. AAT_10                          843 ( 8.5%)             IIIIIIIIII                                                   
                               8. AAT_9                           839 ( 8.5%)             IIIIIIIIII                                                   
                               9. AAT_8                           834 ( 8.4%)             IIIIIIIIII                                                   
                               10. AAT_2                          804 ( 8.1%)             IIIIIIIIII                                                   
                               [ 2 others ]                       1232 (12.5%)            IIIIIIIIIIIIIIII                                             

25   values.trialcode1         1. AAT_4                           894 ( 9.0%)             IIIIIIIIIII                              9880       0        
     [character]               2. AAT_6                           894 ( 9.0%)             IIIIIIIIIII                              (100%)     (0%)     
                               3. AAT_1                           892 ( 9.0%)             IIIIIIIIIII                                                  
                               4. AAT_5                           889 ( 9.0%)             IIIIIIIIIII                                                  
                               5. AAT_7                           886 ( 9.0%)             IIIIIIIIIII                                                  
                               6. AAT_3                           873 ( 8.8%)             IIIIIIIIIII                                                  
                               7. AAT_10                          843 ( 8.5%)             IIIIIIIIII                                                   
                               8. AAT_9                           839 ( 8.5%)             IIIIIIIIII                                                   
                               9. AAT_8                           834 ( 8.4%)             IIIIIIIIII                                                   
                               10. AAT_2                          804 ( 8.1%)             IIIIIIIIII                                                   
                               [ 2 others ]                       1232 (12.5%)            IIIIIIIIIIIIIIII                                             

26   values.stimulus1          1. 3Y_1.jpg                        427 ( 4.3%)             I                                        9880       0        
     [character]               2. 3Y_3.jpg                        417 ( 4.2%)             I                                        (100%)     (0%)     
                               3. 3Y_2.jpg                        398 ( 4.0%)             I                                                            
                               4. 2B_3.jpg                        391 ( 4.0%)             I                                                            
                               5. 2B_1.jpg                        384 ( 3.9%)             I                                                            
                               6. 2B_2.jpg                        364 ( 3.7%)                                                                          
                               7. 1B_1.jpg                        349 ( 3.5%)                                                                          
                               8. 1B_2.jpg                        344 ( 3.5%)                                                                          
                               9. 1B_3.jpg                        339 ( 3.4%)                                                                          
                               10. 5B_3.jpg                       332 ( 3.4%)                                                                          
                               [ 20 others ]                      6135 (62.1%)            IIIIIIIIIIIIIIII                                             

27   push_pull1                1. PULL                            4927 (49.9%)            IIIIIIIIIIIIIII                          9880       0        
     [character]               2. PUSH                            4953 (50.1%)            IIIIIIIIIIIIIIII                         (100%)     (0%)     

28   gAAT_RT1                  mean (sd) : 769.98 (197.98)        957 distinct val.         :                                      9880       0        
     [numeric]                 min < med < max :                                            : :                                    (100%)     (0%)     
                               464 < 727 < 1709                                           . : :                                                        
                               IQR (CV) : 236 (0.26)                                      : : : : .                                                    
                                                                                          : : : : : . .                                                

29   trialnum1                 mean (sd) : 72.87 (41.54)          144 distinct val.       : . . : . . : . . :                      9880       0        
     [integer]                 min < med < max :                                          : : : : : : : : : :                      (100%)     (0%)     
                               1 < 73 < 144                                               : : : : : : : : : :                                          
                               IQR (CV) : 72 (0.57)                                       : : : : : : : : : :                                          
                                                                                          : : : : : : : : : :                                          

30   run1                      1. 2                               9880 (100.0%)           IIIIIIIIIIIIIIII                         9880       0        
     [character]                                                                                                                   (100%)     (0%)     

31   visit1                    1. 1                               4683 (47.4%)            IIIIIIIIIIIIII                           9880       0        
     [factor]                  2. 2                               5197 (52.6%)            IIIIIIIIIIIIIIII                         (100%)     (0%)     

32   cond_v11                  1. A                               5014 (50.7%)            IIIIIIIIIIIIIIII                         9880       0        
     [factor]                  2. B                               4866 (49.2%)            IIIIIIIIIIIIIII                          (100%)     (0%)     

33   cond_v21                  1. A                               4866 (49.2%)            IIIIIIIIIIIIIII                          9880       0        
     [factor]                  2. B                               5014 (50.7%)            IIIIIIIIIIIIIIII                         (100%)     (0%)     

34   cond1                     1. A                               4787 (48.4%)            IIIIIIIIIIIIIII                          9880       0        
     [factor]                  2. B                               5093 (51.5%)            IIIIIIIIIIIIIIII                         (100%)     (0%)     

35   stim1                     1. neutral                         1882 (19.1%)            IIIIIIIIIIIII                            9880       0        
     [factor]                  2. death                           1790 (18.1%)            IIIIIIIIIIIII                            (100%)     (0%)     
                               3. living                          2101 (21.3%)            IIIIIIIIIIIIIII                                              
                               4. spouse                          1906 (19.3%)            IIIIIIIIIIIII                                                
                               5. stranger                        2201 (22.3%)            IIIIIIIIIIIIIIII                                             

36   outlier_RT1               mean (sd) : 0 (0)                  0.00 : 9880 (100.0%)    IIIIIIIIIIIIIIII                         9880       0        
     [numeric]                 min < med < max :                                                                                   (100%)     (0%)     
                               0 < 0 < 0                                                                                                               
                               IQR (CV) : 0 (NaN)                                                                                                      

37   bias                      mean (sd) : 11.83 (246.66)         1342 distinct val.                :                              9880       0        
     [numeric]                 min < med < max :                                                  : :                              (100%)     (0%)     
                               -1157 < 8 < 1119                                                   : :                                                  
                               IQR (CV) : 260 (20.85)                                             : : .                                                
                                                                                              . : : : : .                                              

38   bias_t                    mean (sd) : 0 (1)                  1342 distinct val.              : .                              9880       0        
     [numeric]                 min < med < max :                                                  : :                              (100%)     (0%)     
                               -3.61 < 0 < 3.89                                                 : : : .                                                
                               IQR (CV) : 1.35 (2539.72)                                        : : : :                                                
                                                                                            . : : : : : .                                              

39   low_n_trials              mean (sd) : 0.06 (0.24)            0.00 : 9289 (94.0%)     IIIIIIIIIIIIIIII                         9880       0        
     [numeric]                 min < med < max :                  1.00 :  591 ( 6.0%)     I                                        (100%)     (0%)     
                               0 < 0 < 1                                                                                                               
                               IQR (CV) : 0 (3.96)                                                                                                     

40   age                       mean (sd) : 25229.12 (2370.06)     39 distinct val.                    :                            9880       0        
     [numeric]                 min < med < max :                                                      :                            (100%)     (0%)     
                               20845 < 25597 < 29101                                        :     . . :     .                                          
                               IQR (CV) : 3573 (0.09)                                     : :     : : :   : :                                          
                                                                                          : : . : : : : : : :                                          

41   age_yrs                   mean (sd) : 69.12 (6.49)           39 distinct val.                    :                            9880       0        
     [numeric]                 min < med < max :                                                      :                            (100%)     (0%)     
                               57.11 < 70.13 < 79.73                                        :     . . :     .                                          
                               IQR (CV) : 9.79 (0.09)                                     : :     : : :   : :                                          
                                                                                          : : . : : : : : : :                                          

42   sex_m                     1. female                          7250 (73.4%)            IIIIIIIIIIIIIIII                         9880       0        
     [ordered, factor]         2. male                            2630 (26.6%)            IIIII                                    (100%)     (0%)     

43   ethnicity_hisp            1. not Hispanic or Latino          9449 (95.6%)            IIIIIIIIIIIIIIII                         9880       0        
     [ordered, factor]         2. Hispanic or Latino               431 ( 4.4%)                                                     (100%)     (0%)     

44   race                      1. White                           9724 (98.4%)            IIIIIIIIIIIIIIII                         9880       0        
     [factor]                  2. Other (specify)                  156 ( 1.6%)                                                     (100%)     (0%)     

45   timesincedeath            mean (sd) : 460.85 (245.61)        36 distinct val.          :                                      9880       0        
     [numeric]                 min < med < max :                                            :                                      (100%)     (0%)     
                               151 < 411 < 1095                                             :   :                                                      
                               IQR (CV) : 320 (0.53)                                      . :   :                                                      
                                                                                          : : . : : . .   : .                                          

46   rs2254298                 1. GG                              7005 (72.8%)            IIIIIIIIIIIIIIII                         9627       253      
     [factor]                  2. A/G                             2622 (27.2%)            IIIII                                    (97.44%)   (2.56%)  

47   rs2268498                 1. C/T                             4325 (43.8%)            IIIIIIIIIIIIIIII                         9880       0        
     [character]               2. CC                              2636 (26.7%)            IIIIIIIII                                (100%)     (0%)     
                               3. TT                              2919 (29.5%)            IIIIIIIIII                                                   

48   rs53576                   1. A/G                             5207 (52.7%)            IIIIIIIIIIIIIIII                         9880       0        
     [character]               2. GG                              4673 (47.3%)            IIIIIIIIIIIIII                           (100%)     (0%)     

49   rs2254298_A.carrier       1. 0                               7005 (72.8%)            IIIIIIIIIIIIIIII                         9627       253      
     [ordered, factor]         2. 1                               2622 (27.2%)            IIIII                                    (97.44%)   (2.56%)  

50   rs2268498_C.carrier       1. 0                               2919 (29.5%)            IIIIII                                   9880       0        
     [ordered, factor]         2. 1                               6961 (70.5%)            IIIIIIIIIIIIIIII                         (100%)     (0%)     

51   rs53576_A.carrier         1. 0                               4673 (47.3%)            IIIIIIIIIIIIII                           9880       0        
     [ordered, factor]         2. 1                               5207 (52.7%)            IIIIIIIIIIIIIIII                         (100%)     (0%)     

52   tot_bdi                   mean (sd) : 10.47 (7.69)           22 distinct val.          :                                      9880       0        
     [numeric]                 min < med < max :                                            :                                      (100%)     (0%)     
                               0 < 10 < 28                                                : : .   .                                                    
                               IQR (CV) : 12 (0.73)                                       : : : : : . :                                                
                                                                                          : : : : : : : . : :                                          

53   tot_bgq                   mean (sd) : 9.56 (2.26)            6.00 :  958 ( 9.7%)     IIIII                                    9880       0        
     [numeric]                 min < med < max :                  7.00 :  809 ( 8.2%)     IIII                                     (100%)     (0%)     
                               6 < 9 < 15                         8.00 : 1079 (10.9%)     IIIIII                                                       
                               IQR (CV) : 2 (0.24)                9.00 : 2640 (26.7%)     IIIIIIIIIIIIIIII                                             
                                                                  10.00 : 1997 (20.2%)    IIIIIIIIIIII                                                 
                                                                  11.00 :  495 ( 5.0%)    III                                                          
                                                                  12.00 :  591 ( 6.0%)    III                                                          
                                                                  13.00 :  508 ( 5.1%)    III                                                          
                                                                  14.00 :  528 ( 5.3%)    III                                                          
                                                                  15.00 :  275 ( 2.8%)    I                                                            

54   tot_ecrrs_global_anx      mean (sd) : 3.24 (1.45)            16 distinct val.          :     .                                9880       0        
     [numeric]                 min < med < max :                                            :     :                                (100%)     (0%)     
                               1 < 3.33 < 6.33                                            . :     :   :                                                
                               IQR (CV) : 2.33 (0.45)                                     : : : . :   :   :                                            
                                                                                          : : : : : : : . : :                                          

55   tot_ecrrs_global_avoid    mean (sd) : 3.42 (1.07)            22 distinct val.                :                                9880       0        
     [numeric]                 min < med < max :                                                  : :                              (100%)     (0%)     
                               1 < 3.5 < 6.33                                                   . : :                                                  
                               IQR (CV) : 1.17 (0.31)                                         . : : :                                                  
                                                                                          : : : : : : : : . .                                          

56   tot_ecrrs_spouse_anx      mean (sd) : 2.22 (1.53)            12 distinct val.        :                                        9880       0        
     [numeric]                 min < med < max :                                          :                                        (100%)     (0%)     
                               1 < 1.33 < 6                                               :                                                            
                               IQR (CV) : 2.33 (0.69)                                     :                                                            
                                                                                          : : . : : :     . .                                          

57   tot_ecrrs_spouse_avoid    mean (sd) : 2.13 (1.15)            16 distinct val.        :                                        9880       0        
     [numeric]                 min < med < max :                                          :                                        (100%)     (0%)     
                               1 < 1.67 < 5.33                                            :                                                            
                               IQR (CV) : 1.83 (0.54)                                     : .   .                                                      
                                                                                          : : . : : : .   .                                            

58   tot_ysl                   mean (sd) : 67.32 (18.45)          31 distinct val.            .       : :                          9880       0        
     [numeric]                 min < med < max :                                              : . .   : :                          (100%)     (0%)     
                               31 < 69 < 100                                                . : : : . : :   :                                          
                               IQR (CV) : 32 (0.27)                                       : : : : : : : : . :                                          
                                                                                          : : : : : : : : : :                                          

59   tot_icg                   mean (sd) : 23.07 (12.16)          28 distinct val.            :     .                              9880       0        
     [numeric]                 min < med < max :                                          :   :     :                              (100%)     (0%)     
                               4 < 22 < 51                                                : . : : : :                                                  
                               IQR (CV) : 13 (0.53)                                       : : : : : : : . .                                            
                                                                                          : : : : : : : : : :                                          
-------------------------------------------------------------------------------------------------------------------------------------------------------
 # no missing data on any of the variables, except the rs2254298 genotype (one participant; confirmed in the missing-data check below)

Descriptives

library(psych)
library(stargazer)

Please cite as: 

 Hlavac, Marek (2018). stargazer: Well-Formatted Regression and Summary Statistics Tables.
 R package version 5.2.1. https://CRAN.R-project.org/package=stargazer 
library(summarytools)
# descriptives for the time-invariant (person-level) variables will be wrong if we use mlm_bx4, since each person's values are repeated across their trials
# so going back to the person-level mlm_add object (one row per participant)...
# note that n = 40 overall, but n = 39 with usable RT data
# D149's behavioral data were dropped (the task was run outside the scanner), so remove that case
mlm_add <- filter(mlm_add, !grepl("D149",ID))
unique(mlm_add$ID) # it worked
 [1] "D101" "D103" "D102" "D104" "D105" "D107" "D110" "D114" "D115" "D113" "D118" "D119" "D121" "D117" "D120" "D122" "D123" "D125" "D127" "D130" "D132" "D128" "D134" "D126" "D133"
[26] "D135" "D131" "D129" "D137" "D138" "D140" "D139" "D141" "D144" "D145" "D142" "D147" "D146" "D148"
nums <- select_if(mlm_add, is.numeric) # just get the numeric variables
view(dfSummary(nums)) 
stargazer(nums, type="text", title="Descriptives", digits=2, out="table1.txt", notes = "Note: 'stargazer' makes nicer looking tables, but doesn't provide skewness or kurtosis statistics, so we'll use 'psych' for that.") 

Descriptives
======================================================================================================================================
Statistic                                       N                 Mean             St. Dev.             Min                Max        
--------------------------------------------------------------------------------------------------------------------------------------
age                                             39             25,309.79           2,382.79            20,845             29,101      
age_yrs                                         39               69.34               6.53              57.11              79.73       
timesincedeath                                  39               468.64             251.59              151               1,095       
tot_bdi                                         39               10.54               7.70                0                  28        
tot_bgq                                         39                9.59               2.29                6                  15        
tot_ecrrs_global_anx                            39                3.29               1.45               1.00               6.33       
tot_ecrrs_global_avoid                          39                3.40               1.10               1.00               6.33       
tot_ecrrs_spouse_anx                            39                2.17               1.54               1.00               6.00       
tot_ecrrs_spouse_avoid                          39                2.10               1.15               1.00               5.33       
tot_ysl                                         39               67.85              18.79                31                100        
tot_icg                                         39               23.38              12.63                4                  51        
--------------------------------------------------------------------------------------------------------------------------------------
Note: 'stargazer' makes nicer looking tables, but doesn't provide skewness or kurtosis statistics, so we'll use 'psych' for that.     
describe(nums)
                       vars  n     mean      sd   median  trimmed     mad      min      max   range  skew kurtosis     se
age                       1 39 25309.79 2382.79 25640.00 25366.85 2659.78 20845.00 29101.00 8256.00 -0.27    -1.01 381.55
age_yrs                   2 39    69.34    6.53    70.25    69.50    7.29    57.11    79.73   22.62 -0.27    -1.01   1.05
timesincedeath            3 39   468.64  251.59   411.00   442.36  198.67   151.00  1095.00  944.00  1.02    -0.02  40.29
tot_bdi                   4 39    10.54    7.70    10.00    10.06    8.90     0.00    28.00   28.00  0.52    -0.78   1.23
tot_bgq                   5 39     9.59    2.29     9.00     9.48    1.48     6.00    15.00    9.00  0.43    -0.38   0.37
tot_ecrrs_global_anx      6 39     3.29    1.45     3.33     3.23    1.98     1.00     6.33    5.33  0.34    -0.84   0.23
tot_ecrrs_global_avoid    7 39     3.40    1.10     3.50     3.39    0.99     1.00     6.33    5.33  0.07     0.14   0.18
tot_ecrrs_spouse_anx      8 39     2.17    1.54     1.33     1.96    0.49     1.00     6.00    5.00  1.08    -0.10   0.25
tot_ecrrs_spouse_avoid    9 39     2.10    1.15     1.67     1.98    0.99     1.00     5.33    4.33  0.88    -0.20   0.18
tot_ysl                  10 39    67.85   18.79    70.00    67.88   19.27    31.00   100.00   69.00 -0.08    -1.14   3.01
tot_icg                  11 39    23.38   12.63    22.00    22.79   10.38     4.00    51.00   47.00  0.36    -0.76   2.02

Frequencies

# which variables are factors?
mlm_add %>% select_if(is.factor) %>% colnames()
[1] "sex_m"               "ethnicity_hisp"      "race"                "rs2254298"           "rs2254298_A.carrier" "rs2268498_C.carrier" "rs53576_A.carrier"  
# hmm, rs2268498 and rs53576 should also be factors, so fix that now
mlm_add <- mlm_add %>% 
  mutate(rs2268498 = recode_factor(rs2268498, "C/T" = "C/T", "CC" = "CC", "TT" = "TT"),
         rs53576 = recode_factor(rs53576, "A/G" = "A/G", "AA" = "AA", "GG" = "GG"))
mlm_add %>% select_if(is.factor) %>% colnames()
[1] "sex_m"               "ethnicity_hisp"      "race"                "rs2254298"           "rs2268498"           "rs53576"             "rs2254298_A.carrier" "rs2268498_C.carrier"
[9] "rs53576_A.carrier"  
# function to count frequencies for all factor variables
# (the original for-loop version didn't work: count() was applied to the whole data
#  frame rather than one column at a time, and nothing was printed inside the loop;
#  tabling each factor column does the job)
countAll <- function(basedata)
{
    facs <- sapply(basedata, is.factor)
    facdata <- basedata[ ,facs, drop = FALSE]
    lapply(facdata, table)
}
countAll(mlm_add)
mlm_add %>% count(sex_m) # 28 females, 11 males
mlm_add %>% count(race) # 38 white, 1 another race not listed
mlm_add %>% count(ethnicity_hisp) # 37 non-Hispanic/Latino, 2 Hispanic/Latino
mlm_add %>% count(rs2254298) # 22 GG, 10 A/G, 1 NA/missing (0 AA)
mlm_add %>% count(rs2268498) # 17 C/T, 10 CC, 12 TT
mlm_add %>% count(rs53576) # 21 A/G, 18 GG (0 AA)
mlm_add %>% count(rs2254298_A.carrier) # yes = 10, no = 28, NA = 1
mlm_add %>% count(rs2268498_C.carrier) # yes = 27, no = 12
mlm_add %>% count(rs53576_A.carrier) # yes = 21, no = 18

Checking distribution shapes and normality

Got a preview of this above when we used summarytools, but here’s a closer look at some histograms and QQ plots.

# a function to produce histograms for all numeric variables in a dataframe
histAll <- function(basedata)
{
    nums <- sapply((basedata), is.numeric)
    numdata <- basedata[ ,nums]
    par(mfrow=c(4,4))
    for(i in 1:length(numdata))
    {
        hist(numdata[,i], xlab=NULL, main=names(numdata[i]))
    }
}
histAll(as.data.frame(mlm_bx4))
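
The chunk above only draws the histograms; for the QQ plots mentioned above, a companion function along the same lines would work. This is just a sketch (qqAll is not part of the original analysis) using base R's qqnorm()/qqline().

# companion to histAll(): normal QQ plots for all numeric variables (sketch, not run in the original)
qqAll <- function(basedata)
{
    nums <- sapply(basedata, is.numeric)
    numdata <- basedata[ ,nums, drop = FALSE]
    par(mfrow=c(4,4))
    for(i in 1:length(numdata))
    {
        qqnorm(numdata[,i], main=names(numdata[i]))
        qqline(numdata[,i]) # reference line through the first and third quartiles
    }
}
qqAll(as.data.frame(mlm_bx4))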

Centering L2 predictors

# pull out only the total score variables, age, timesincedeath
gcvars <- as.data.frame(mlm_bx4[, grep('^tot|age|timesincedeath|ID', names(mlm_bx4))])
gcvars1 <- gcvars[!duplicated(gcvars$ID), ]
stargazer(gcvars1, type="text", median = T, digits=3)

===================================================================
Statistic              N     Mean    St. Dev.   Min   Median  Max  
-------------------------------------------------------------------
age                    39 25,309.790 2,382.786 20,845 25,640 29,101
age_yrs                39   69.342     6.528   57.110 70.247 79.729
timesincedeath         39  468.641    251.589   151    411   1,095 
tot_bdi                39   10.538     7.701     0      10     28  
tot_bgq                39   9.590      2.291     6      9      15  
tot_ecrrs_global_anx   39   3.291      1.449   1.000  3.333  6.333 
tot_ecrrs_global_avoid 39   3.402      1.101   1.000  3.500  6.333 
tot_ecrrs_spouse_anx   39   2.171      1.543   1.000  1.333  6.000 
tot_ecrrs_spouse_avoid 39   2.098      1.152   1.000  1.667  5.333 
tot_ysl                39   67.846    18.790     31     70    100  
tot_icg                39   23.385    12.630     4      22     51  
-------------------------------------------------------------------
gmean_age <- mean(gcvars1$age)
gmean_age_yrs <- mean(gcvars1$age_yrs)
gmean_time <- mean(gcvars1$timesincedeath)
gmean_bdi <- mean(gcvars1$tot_bdi)
gmean_bgq <- mean(gcvars1$tot_bgq)
gmean_ecrrs_g_anx <- mean(gcvars1$tot_ecrrs_global_anx)
gmean_ecrrs_g_av <- mean(gcvars1$tot_ecrrs_global_avoid)
gmean_ecrrs_s_anx <- mean(gcvars1$tot_ecrrs_spouse_anx)
gmean_ecrrs_s_av <- mean(gcvars1$tot_ecrrs_spouse_avoid)
gmean_icg <- mean(gcvars1$tot_icg)
gmean_ysl <- mean(gcvars1$tot_ysl)
mlm_bx5 <- mlm_bx4 %>% 
  mutate(gmean_age = age - gmean_age, 
         gmean_age_yrs = age_yrs - gmean_age_yrs, 
         gmean_timesincedeath = timesincedeath - gmean_time, 
         gmean_bdi = tot_bdi - gmean_bdi, 
         gmean_bgq = tot_bgq - gmean_bgq, 
         gmean_ecrrs_global_anx = tot_ecrrs_global_anx - gmean_ecrrs_g_anx, 
         gmean_ecrrs_global_avoid = tot_ecrrs_global_avoid - gmean_ecrrs_g_av, 
         gmean_ecrrs_spouse_anx = tot_ecrrs_spouse_anx - gmean_ecrrs_s_anx, 
         gmean_ecrrs_spouse_avoid = tot_ecrrs_spouse_avoid - gmean_ecrrs_s_av, 
         gmean_ysl = tot_ysl - gmean_ysl, 
         gmean_icg = tot_icg - gmean_icg)
stargazer(as.data.frame(mlm_bx5), type="text") # note that the means of the grand-mean-centered variables are not exactly zero here - the grand means come from one row per person, but in the trial-level data each person is weighted by their number of trials

=========================================================================
Statistic                  N      Mean     St. Dev.     Min        Max   
-------------------------------------------------------------------------
date                     9,880 63,183.180 35,369.280   11,416    121,515 
gAAT_RT                  9,880  794.920    203.041      473       1,712  
trialnum                 9,880   72.873     41.538       1         144   
outlier_RT               9,880   0.000      0.000        0          0    
date1                    9,880 63,183.180 35,369.280   11,416    121,515 
gAAT_RT1                 9,880  769.980    197.980      464       1,709  
trialnum1                9,880   72.873     41.538       1         144   
outlier_RT1              9,880   0.000      0.000        0          0    
bias                     9,880   11.828    246.662     -1,157     1,119  
bias_t                   9,880   0.0004     0.999      -3.612     3.888  
low_n_trials             9,880   0.060      0.237        0          1    
age                      9,880 25,229.120 2,370.058    20,845    29,101  
age_yrs                  9,880   69.121     6.493      57.110    79.729  
timesincedeath           9,880  460.855    245.608      151       1,095  
tot_bdi                  9,880   10.467     7.688        0         28    
tot_bgq                  9,880   9.558      2.258        6         15    
tot_ecrrs_global_anx     9,880   3.241      1.450      1.000      6.333  
tot_ecrrs_global_avoid   9,880   3.416      1.070      1.000      6.333  
tot_ecrrs_spouse_anx     9,880   2.218      1.532      1.000      6.000  
tot_ecrrs_spouse_avoid   9,880   2.127      1.150      1.000      5.333  
tot_ysl                  9,880   67.317     18.448       31        100   
tot_icg                  9,880   23.066     12.156       4         51    
gmean_age                9,880  -80.671   2,370.058  -4,464.795 3,791.205
gmean_age_yrs            9,880   -0.221     6.493     -12.232    10.387  
gmean_timesincedeath     9,880   -7.786    245.608    -317.641   626.359 
gmean_bdi                9,880   -0.071     7.688     -10.538    17.462  
gmean_bgq                9,880   -0.032     2.258      -3.590     5.410  
gmean_ecrrs_global_anx   9,880   -0.050     1.450      -2.291     3.043  
gmean_ecrrs_global_avoid 9,880   0.015      1.070      -2.402     2.932  
gmean_ecrrs_spouse_anx   9,880   0.047      1.532      -1.171     3.829  
gmean_ecrrs_spouse_avoid 9,880   0.029      1.150      -1.098     3.235  
gmean_ysl                9,880   -0.529     18.448    -36.846    32.154  
gmean_icg                9,880   -0.318     12.156    -19.385    27.615  
-------------------------------------------------------------------------
# test that hypothesis with the non-duplicated data
gcvars2 <- gcvars1 %>% 
  mutate(gmean_age = age - gmean_age, 
         gmean_age_yrs = age_yrs - gmean_age_yrs, 
         gmean_timesincedeath = timesincedeath - gmean_time, 
         gmean_bdi = tot_bdi - gmean_bdi, 
         gmean_bgq = tot_bgq - gmean_bgq, 
         gmean_ecrrs_global_anx = tot_ecrrs_global_anx - gmean_ecrrs_g_anx, 
         gmean_ecrrs_global_avoid = tot_ecrrs_global_avoid - gmean_ecrrs_g_av, 
         gmean_ecrrs_spouse_anx = tot_ecrrs_spouse_anx - gmean_ecrrs_s_anx, 
         gmean_ecrrs_spouse_avoid = tot_ecrrs_spouse_avoid - gmean_ecrrs_s_av, 
         gmean_ysl = tot_ysl - gmean_ysl, 
         gmean_icg = tot_icg - gmean_icg)
stargazer(as.data.frame(gcvars2), type="text") # yes - with one row per person, the centered means are exactly zero

=====================================================================
Statistic                N     Mean    St. Dev.     Min        Max   
---------------------------------------------------------------------
age                      39 25,309.790 2,382.786   20,845    29,101  
age_yrs                  39   69.342     6.528     57.110    79.729  
timesincedeath           39  468.641    251.589     151       1,095  
tot_bdi                  39   10.538     7.701       0         28    
tot_bgq                  39   9.590      2.291       6         15    
tot_ecrrs_global_anx     39   3.291      1.449     1.000      6.333  
tot_ecrrs_global_avoid   39   3.402      1.101     1.000      6.333  
tot_ecrrs_spouse_anx     39   2.171      1.543     1.000      6.000  
tot_ecrrs_spouse_avoid   39   2.098      1.152     1.000      5.333  
tot_ysl                  39   67.846    18.790       31        100   
tot_icg                  39   23.385    12.630       4         51    
gmean_age                39   0.000    2,382.786 -4,464.795 3,791.205
gmean_age_yrs            39   0.000      6.528    -12.232    10.387  
gmean_timesincedeath     39   0.000     251.589   -317.641   626.359 
gmean_bdi                39   0.000      7.701    -10.538    17.462  
gmean_bgq                39   0.000      2.291     -3.590     5.410  
gmean_ecrrs_global_anx   39   0.000      1.449     -2.291     3.043  
gmean_ecrrs_global_avoid 39   0.000      1.101     -2.402     2.932  
gmean_ecrrs_spouse_anx   39   0.000      1.543     -1.171     3.829  
gmean_ecrrs_spouse_avoid 39   -0.000     1.152     -1.098     3.235  
gmean_ysl                39   0.000     18.790    -36.846    32.154  
gmean_icg                39   0.000     12.630    -19.385    27.615  
---------------------------------------------------------------------
saveRDS(mlm_bx5, "~/Dropbox/2018Fall/FSHD617C/mlm_bx5.rds")
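
For reference, the eleven hand-written gmean_* calculations above could be collapsed into a single mutate(across()) call. This is only a sketch and assumes a newer dplyr than was used here (one that provides across() and cur_column()); note also that the .names pattern keeps the tot_ prefix (e.g., gmean_tot_bdi) rather than the shorter names used above.

# sketch: grand-mean centering of the person-level (L2) predictors in one step
l2_vars <- c("age", "age_yrs", "timesincedeath", "tot_bdi", "tot_bgq",
             "tot_ecrrs_global_anx", "tot_ecrrs_global_avoid",
             "tot_ecrrs_spouse_anx", "tot_ecrrs_spouse_avoid", "tot_ysl", "tot_icg")
grand_means <- gcvars1 %>% summarise(across(all_of(l2_vars), mean)) # one row per person, so these are the true grand means
mlm_bx5_alt <- mlm_bx4 %>% 
  mutate(across(all_of(l2_vars), ~ .x - grand_means[[cur_column()]], .names = "gmean_{.col}"))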

Evaluate missing data

mlm_bx5 %>% 
  select_if(function(x) any(is.na(x))) %>% 
  summarise_all(funs(mean(is.na(.)))) -> mlm_bx5_NAs # or use sum() instead of mean() to find n missing values
mlm_bx5_NAs <- gather(mlm_bx5_NAs) # create long data that lists variables with any NAs, and % of data missing
mlm_bx5_NAs

One person is missing rs2254298 genotype (rs2254298 and rs2254298_A.carrier), which we already knew about.
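
To double-check that this really is a single participant, a quick sketch (assuming the trial-level mlm_bx5 keeps the ID column, as mlm_bx4 does):

# sketch: which participant is missing the rs2254298 genotype?
mlm_bx5 %>% filter(is.na(rs2254298)) %>% distinct(ID)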

Scale reliabilities

This dataset does not include the item-level data, but we can get that from the master dataset we loaded in earlier.

library(psych)
mlm_scales <- filter(master, !grepl("D149",ID))
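
Before the scale-by-scale output below, here is a compact cross-check that pulls only the raw alpha for each scale. This is a sketch: the item ranges are the same ones used in the subset() calls below, and psych::alpha()'s $total$raw_alpha is the coefficient reported at the top of each output block.

# sketch: raw alpha for each self-report scale in one vector
scale_items <- list(
  bdi = subset(mlm_scales, select=c(bdi_1c:bdi_21c)),
  bgq = subset(mlm_scales, select=c(bgq_1:bgq_5)),
  icg = subset(mlm_scales, select=c(icg_1:icg_19)),
  ysl = subset(mlm_scales, select=c(ysl_1:ysl_21))
)
sapply(scale_items, function(items) psych::alpha(items)$total$raw_alpha)
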
## BDI
mlm_tot_bdi <- subset(mlm_scales, select=c(bdi_1c:bdi_21c))  
psych::alpha(mlm_tot_bdi)

Reliability analysis   
Call: psych::alpha(x = mlm_tot_bdi)

  raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
       0.9       0.9    0.97      0.31 9.4 0.023 0.51 0.37     0.32

 lower alpha upper     95% confidence boundaries
0.85 0.9 0.94 

 Reliability if an item is dropped:
        raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
bdi_1c       0.89      0.90    0.97      0.31 8.9    0.024 0.036  0.32
bdi_2c       0.89      0.90    0.97      0.31 9.0    0.024 0.035  0.32
bdi_3c       0.90      0.90    0.97      0.32 9.3    0.023 0.034  0.33
bdi_4c       0.89      0.89    0.97      0.30 8.5    0.025 0.034  0.30
bdi_5c       0.89      0.90    0.97      0.30 8.7    0.024 0.036  0.31
bdi_6c       0.90      0.91    0.97      0.32 9.6    0.023 0.031  0.34
bdi_7c       0.89      0.90    0.97      0.30 8.7    0.024 0.035  0.30
bdi_8c       0.89      0.90    0.97      0.31 8.8    0.024 0.036  0.32
bdi_9c       0.89      0.90    0.97      0.31 9.1    0.024 0.035  0.33
bdi_10c      0.89      0.90    0.97      0.30 8.6    0.025 0.035  0.30
bdi_11c      0.89      0.90    0.97      0.32 9.3    0.023 0.035  0.33
bdi_12c      0.88      0.89    0.97      0.29 8.2    0.027 0.033  0.30
bdi_13c      0.89      0.90    0.97      0.31 9.1    0.024 0.035  0.32
bdi_14c      0.89      0.90    0.97      0.30 8.7    0.024 0.035  0.31
bdi_15c      0.89      0.90    0.96      0.30 8.5    0.026 0.035  0.30
bdi_16c      0.89      0.90    0.97      0.31 8.9    0.024 0.037  0.31
bdi_17c      0.89      0.90    0.97      0.31 9.0    0.024 0.035  0.32
bdi_18c      0.90      0.91    0.97      0.33 9.8    0.022 0.033  0.35
bdi_19c      0.89      0.90    0.97      0.31 8.9    0.024 0.033  0.31
bdi_20c      0.89      0.90    0.97      0.30 8.6    0.025 0.035  0.30
bdi_21c      0.91      0.91    0.97      0.32 9.5    0.021 0.033  0.33

 Item statistics 
         n raw.r std.r r.cor r.drop mean   sd
bdi_1c  38  0.58  0.60  0.59   0.53 0.42 0.50
bdi_2c  39  0.55  0.56  0.55   0.47 0.36 0.63
bdi_3c  39  0.41  0.46  0.45   0.36 0.21 0.47
bdi_4c  39  0.75  0.75  0.75   0.71 0.79 0.70
bdi_5c  39  0.66  0.69  0.68   0.62 0.41 0.50
bdi_6c  39  0.29  0.35  0.33   0.25 0.10 0.38
bdi_7c  38  0.64  0.68  0.68   0.59 0.32 0.66
bdi_8c  38  0.63  0.63  0.62   0.56 0.32 0.53
bdi_9c  38  0.50  0.54  0.53   0.46 0.13 0.34
bdi_10c 39  0.70  0.70  0.70   0.66 0.67 0.62
bdi_11c 39  0.47  0.46  0.45   0.40 0.54 0.60
bdi_12c 39  0.87  0.85  0.85   0.84 0.64 0.84
bdi_13c 38  0.56  0.52  0.51   0.49 0.66 0.75
bdi_14c 39  0.63  0.65  0.65   0.58 0.38 0.63
bdi_15c 38  0.79  0.73  0.74   0.71 0.55 0.72
bdi_16c 39  0.60  0.60  0.58   0.53 0.77 0.67
bdi_17c 39  0.55  0.57  0.56   0.50 0.23 0.48
bdi_18c 39  0.27  0.27  0.24   0.20 0.64 0.67
bdi_19c 38  0.63  0.61  0.60   0.58 0.63 0.63
bdi_20c 39  0.74  0.70  0.70   0.69 0.67 0.66
bdi_21c 39  0.44  0.36  0.34   0.32 1.18 1.14

Non missing response frequency for each item
           0    1    2    3 miss
bdi_1c  0.58 0.42 0.00 0.00 0.03
bdi_2c  0.72 0.21 0.08 0.00 0.00
bdi_3c  0.82 0.15 0.03 0.00 0.00
bdi_4c  0.33 0.56 0.08 0.03 0.00
bdi_5c  0.59 0.41 0.00 0.00 0.00
bdi_6c  0.92 0.05 0.03 0.00 0.00
bdi_7c  0.79 0.11 0.11 0.00 0.03
bdi_8c  0.71 0.26 0.03 0.00 0.03
bdi_9c  0.87 0.13 0.00 0.00 0.03
bdi_10c 0.38 0.59 0.00 0.03 0.00
bdi_11c 0.51 0.44 0.05 0.00 0.00
bdi_12c 0.54 0.33 0.08 0.05 0.00
bdi_13c 0.50 0.34 0.16 0.00 0.03
bdi_14c 0.69 0.23 0.08 0.00 0.00
bdi_15c 0.58 0.29 0.13 0.00 0.03
bdi_16c 0.36 0.51 0.13 0.00 0.00
bdi_17c 0.79 0.18 0.03 0.00 0.00
bdi_18c 0.44 0.51 0.03 0.03 0.00
bdi_19c 0.45 0.47 0.08 0.00 0.03
bdi_20c 0.44 0.46 0.10 0.00 0.00
bdi_21c 0.41 0.15 0.28 0.15 0.00
## BGQ
mlm_tot_bgq <- subset(mlm_scales, select=c(bgq_1:bgq_5))
psych::alpha(mlm_tot_bgq)

Reliability analysis   
Call: psych::alpha(x = mlm_tot_bgq)

  raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
      0.71      0.72    0.73      0.34 2.5 0.071  1.9 0.46     0.31

 lower alpha upper     95% confidence boundaries
0.57 0.71 0.85 

 Reliability if an item is dropped:
      raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
bgq_1      0.62      0.63    0.60      0.29 1.7    0.097 0.030  0.32
bgq_2      0.56      0.55    0.51      0.23 1.2    0.112 0.014  0.29
bgq_3      0.72      0.73    0.73      0.41 2.8    0.074 0.028  0.32
bgq_4      0.67      0.67    0.68      0.34 2.1    0.085 0.052  0.32
bgq_5      0.73      0.73    0.73      0.40 2.7    0.070 0.033  0.35

 Item statistics 
       n raw.r std.r r.cor r.drop mean   sd
bgq_1 39  0.76  0.76  0.72   0.57  1.9 0.72
bgq_2 39  0.85  0.86  0.88   0.74  2.1 0.62
bgq_3 39  0.49  0.56  0.38   0.30  2.0 0.51
bgq_4 39  0.69  0.68  0.55   0.47  1.8 0.71
bgq_5 39  0.61  0.57  0.38   0.34  1.7 0.76

Non missing response frequency for each item
         1    2    3 miss
bgq_1 0.28 0.49 0.23    0
bgq_2 0.15 0.62 0.23    0
bgq_3 0.13 0.74 0.13    0
bgq_4 0.33 0.49 0.18    0
bgq_5 0.46 0.36 0.18    0
## ICG
mlm_tot_icg <- subset(mlm_scales, select=c(icg_1:icg_19))
psych::alpha(mlm_tot_icg)

Reliability analysis   
Call: psych::alpha(x = mlm_tot_icg)

  raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
      0.92      0.92    0.96      0.37  11 0.017  1.2 0.66     0.36

 lower alpha upper     95% confidence boundaries
0.89 0.92 0.96 

 Reliability if an item is dropped:
       raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
icg_1       0.92      0.91    0.96      0.36  10    0.018 0.036  0.35
icg_2       0.92      0.92    0.96      0.37  11    0.018 0.039  0.37
icg_3       0.91      0.91    0.96      0.36  10    0.019 0.036  0.35
icg_4       0.92      0.91    0.96      0.37  10    0.018 0.041  0.35
icg_5       0.92      0.92    0.96      0.39  12    0.017 0.040  0.39
icg_6       0.92      0.91    0.96      0.37  11    0.018 0.038  0.35
icg_7       0.91      0.91    0.96      0.36  10    0.019 0.037  0.35
icg_8       0.92      0.91    0.96      0.36  10    0.018 0.039  0.35
icg_9       0.92      0.91    0.96      0.36  10    0.018 0.040  0.35
icg_10      0.92      0.91    0.96      0.37  11    0.018 0.041  0.34
icg_11      0.93      0.92    0.96      0.40  12    0.017 0.034  0.40
icg_12      0.92      0.92    0.96      0.40  12    0.016 0.039  0.41
icg_13      0.91      0.91    0.96      0.36  10    0.019 0.039  0.33
icg_14      0.93      0.92    0.96      0.39  12    0.016 0.037  0.40
icg_15      0.92      0.92    0.96      0.39  11    0.017 0.039  0.38
icg_16      0.92      0.91    0.96      0.37  11    0.018 0.041  0.35
icg_17      0.92      0.91    0.96      0.37  11    0.018 0.039  0.36
icg_18      0.92      0.92    0.96      0.38  11    0.017 0.042  0.38
icg_19      0.92      0.91    0.95      0.37  10    0.018 0.040  0.35

 Item statistics 
        n raw.r std.r r.cor r.drop mean   sd
icg_1  39  0.82  0.81  0.81   0.80 1.41 0.88
icg_2  39  0.65  0.64  0.63   0.60 1.79 0.95
icg_3  39  0.83  0.82  0.82   0.80 1.21 1.20
icg_4  39  0.72  0.72  0.71   0.67 2.54 1.14
icg_5  39  0.43  0.43  0.39   0.36 1.87 0.92
icg_6  39  0.73  0.71  0.71   0.68 1.46 1.19
icg_7  39  0.80  0.79  0.79   0.77 1.64 1.16
icg_8  39  0.76  0.76  0.76   0.72 1.36 1.09
icg_9  39  0.75  0.76  0.76   0.72 0.72 0.76
icg_10 39  0.69  0.71  0.70   0.65 0.90 0.94
icg_11 39  0.24  0.28  0.23   0.19 0.26 0.55
icg_12 39  0.37  0.38  0.33   0.30 0.72 0.92
icg_13 39  0.82  0.81  0.81   0.78 1.97 1.27
icg_14 39  0.37  0.39  0.38   0.31 0.67 0.96
icg_15 39  0.47  0.49  0.48   0.42 0.33 0.84
icg_16 39  0.69  0.68  0.66   0.64 0.77 1.22
icg_17 39  0.68  0.67  0.66   0.63 0.85 1.09
icg_18 39  0.56  0.56  0.55   0.50 0.92 1.09
icg_19 39  0.73  0.72  0.72   0.69 2.00 1.08

Non missing response frequency for each item
          0    1    2    3    4 miss
icg_1  0.15 0.38 0.36 0.10 0.00    0
icg_2  0.08 0.31 0.38 0.21 0.03    0
icg_3  0.38 0.21 0.28 0.08 0.05    0
icg_4  0.08 0.08 0.28 0.36 0.21    0
icg_5  0.05 0.28 0.46 0.15 0.05    0
icg_6  0.26 0.28 0.26 0.15 0.05    0
icg_7  0.18 0.31 0.26 0.21 0.05    0
icg_8  0.26 0.31 0.28 0.13 0.03    0
icg_9  0.46 0.36 0.18 0.00 0.00    0
icg_10 0.41 0.36 0.15 0.08 0.00    0
icg_11 0.79 0.15 0.05 0.00 0.00    0
icg_12 0.54 0.26 0.15 0.05 0.00    0
icg_13 0.13 0.26 0.28 0.18 0.15    0
icg_14 0.59 0.21 0.18 0.00 0.03    0
icg_15 0.82 0.08 0.08 0.00 0.03    0
icg_16 0.64 0.13 0.10 0.08 0.05    0
icg_17 0.54 0.18 0.21 0.05 0.03    0
icg_18 0.49 0.23 0.15 0.13 0.00    0
icg_19 0.08 0.26 0.33 0.26 0.08    0
## YSL
mlm_tot_ysl <- subset(mlm_scales, select=c(ysl_1:ysl_21))
psych::alpha(mlm_tot_ysl)

Reliability analysis   
Call: psych::alpha(x = mlm_tot_ysl)

  raw_alpha std.alpha G6(smc) average_r S/N    ase mean   sd median_r
      0.96      0.96    0.98      0.52  22 0.0094  3.2 0.89     0.55

 lower alpha upper     95% confidence boundaries
0.94 0.96 0.98 

 Reliability if an item is dropped:
       raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
ysl_1       0.96      0.96    0.98      0.55  24   0.0093 0.018  0.57
ysl_2       0.95      0.95    0.98      0.51  21   0.0100 0.026  0.54
ysl_3       0.95      0.95    0.98      0.51  21   0.0101 0.026  0.54
ysl_4       0.96      0.96    0.98      0.52  21   0.0098 0.026  0.55
ysl_5       0.95      0.95    0.98      0.51  21   0.0100 0.026  0.54
ysl_6       0.95      0.95    0.98      0.51  21   0.0101 0.026  0.54
ysl_7       0.96      0.96    0.98      0.52  21   0.0097 0.025  0.55
ysl_8       0.95      0.95    0.98      0.51  21   0.0102 0.025  0.54
ysl_9       0.95      0.95    0.98      0.51  21   0.0102 0.025  0.54
ysl_10      0.95      0.95    0.98      0.51  21   0.0101 0.025  0.53
ysl_11      0.95      0.95    0.98      0.51  21   0.0101 0.025  0.54
ysl_12      0.96      0.96    0.98      0.52  22   0.0095 0.026  0.57
ysl_13      0.96      0.96    0.98      0.53  23   0.0092 0.024  0.57
ysl_14      0.95      0.95    0.98      0.51  21   0.0100 0.026  0.53
ysl_15      0.96      0.96    0.98      0.53  23   0.0092 0.024  0.57
ysl_16      0.96      0.96    0.98      0.53  22   0.0095 0.026  0.57
ysl_17      0.95      0.96    0.98      0.52  21   0.0099 0.025  0.54
ysl_18      0.96      0.96    0.98      0.52  22   0.0097 0.026  0.55
ysl_19      0.95      0.95    0.98      0.51  21   0.0099 0.026  0.54
ysl_20      0.95      0.96    0.98      0.52  21   0.0099 0.025  0.55
ysl_21      0.95      0.95    0.98      0.51  21   0.0102 0.024  0.54

 Item statistics 
        n raw.r std.r r.cor r.drop mean   sd
ysl_1  39  0.37  0.39  0.38   0.33  4.4 0.74
ysl_2  39  0.79  0.80  0.79   0.77  3.6 1.10
ysl_3  39  0.85  0.85  0.85   0.83  2.4 0.99
ysl_4  39  0.73  0.74  0.73   0.71  2.8 0.90
ysl_5  39  0.79  0.79  0.79   0.76  2.9 1.14
ysl_6  39  0.82  0.82  0.82   0.79  2.9 1.20
ysl_7  39  0.74  0.73  0.73   0.70  3.2 1.63
ysl_8  39  0.84  0.83  0.82   0.82  2.5 1.50
ysl_9  39  0.84  0.83  0.83   0.81  2.7 1.40
ysl_10 39  0.82  0.82  0.82   0.80  2.6 1.35
ysl_11 39  0.83  0.82  0.82   0.80  2.9 1.32
ysl_12 39  0.65  0.65  0.65   0.61  2.7 1.28
ysl_13 39  0.57  0.57  0.57   0.53  3.7 1.25
ysl_14 39  0.80  0.80  0.79   0.78  3.7 1.21
ysl_15 39  0.56  0.57  0.56   0.52  3.9 1.24
ysl_16 39  0.58  0.60  0.58   0.55  4.5 0.76
ysl_17 39  0.75  0.74  0.74   0.72  3.5 1.14
ysl_18 39  0.71  0.71  0.70   0.68  3.1 1.20
ysl_19 39  0.77  0.77  0.76   0.74  3.4 1.12
ysl_20 39  0.76  0.76  0.76   0.73  3.3 1.36
ysl_21 39  0.84  0.84  0.84   0.82  3.2 1.38

Non missing response frequency for each item
          1    2    3    4    5 miss
ysl_1  0.00 0.00 0.15 0.33 0.51    0
ysl_2  0.03 0.15 0.28 0.31 0.23    0
ysl_3  0.26 0.21 0.44 0.10 0.00    0
ysl_4  0.08 0.28 0.46 0.15 0.03    0
ysl_5  0.08 0.36 0.26 0.21 0.10    0
ysl_6  0.18 0.18 0.28 0.31 0.05    0
ysl_7  0.23 0.15 0.15 0.10 0.36    0
ysl_8  0.36 0.21 0.18 0.08 0.18    0
ysl_9  0.28 0.15 0.31 0.10 0.15    0
ysl_10 0.28 0.23 0.21 0.18 0.10    0
ysl_11 0.15 0.23 0.31 0.13 0.18    0
ysl_12 0.21 0.28 0.23 0.18 0.10    0
ysl_13 0.08 0.08 0.23 0.26 0.36    0
ysl_14 0.05 0.10 0.28 0.21 0.36    0
ysl_15 0.03 0.15 0.21 0.15 0.46    0
ysl_16 0.00 0.00 0.15 0.15 0.69    0
ysl_17 0.05 0.13 0.33 0.26 0.23    0
ysl_18 0.10 0.26 0.21 0.33 0.10    0
ysl_19 0.05 0.13 0.36 0.26 0.21    0
ysl_20 0.10 0.23 0.21 0.21 0.26    0
ysl_21 0.18 0.10 0.26 0.26 0.21    0
## ECRRS-global anxiety
mlm_tot_ecrrs_global_anx <- subset(mlm_scales,  select=c(ecrrs_global_7c:ecrrs_global_9c))
psych::alpha(mlm_tot_ecrrs_global_anx)

Reliability analysis   
Call: psych::alpha(x = mlm_tot_ecrrs_global_anx)

  raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
       0.9       0.9    0.86      0.74 8.6 0.029  3.3 1.4     0.75

 lower alpha upper     95% confidence boundaries
0.84 0.9 0.95 

 Reliability if an item is dropped:
                raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
ecrrs_global_7c      0.88      0.88    0.79      0.79 7.3    0.038    NA  0.79
ecrrs_global_8c      0.82      0.82    0.69      0.69 4.4    0.059    NA  0.69
ecrrs_global_9c      0.86      0.86    0.75      0.75 6.1    0.045    NA  0.75

 Item statistics 
                 n raw.r std.r r.cor r.drop mean  sd
ecrrs_global_7c 39  0.89  0.89  0.80   0.76  3.6 1.6
ecrrs_global_8c 39  0.93  0.93  0.89   0.84  3.1 1.6
ecrrs_global_9c 39  0.91  0.91  0.84   0.79  3.2 1.6

Non missing response frequency for each item
                   1    2    3    4    5    6    7 miss
ecrrs_global_7c 0.08 0.23 0.18 0.26 0.13 0.08 0.05    0
ecrrs_global_8c 0.13 0.31 0.23 0.18 0.03 0.10 0.03    0
ecrrs_global_9c 0.15 0.28 0.08 0.23 0.18 0.08 0.00    0
## ECRRS-global avoid
mlm_tot_ecrrs_global_avoid <- subset(mlm_scales,  select=c(ecrrs_global_1r:ecrrs_global_4r, ecrrs_global_5c, ecrrs_global_6c))
psych::alpha(mlm_tot_ecrrs_global_avoid)

Reliability analysis   
Call: psych::alpha(x = mlm_tot_ecrrs_global_avoid)

  raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
      0.84      0.85    0.88      0.48 5.6 0.038  3.4 1.1      0.5

 lower alpha upper     95% confidence boundaries
0.77 0.84 0.92 

 Reliability if an item is dropped:
                raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
ecrrs_global_1r      0.84      0.85    0.88      0.53 5.7    0.041 0.029  0.50
ecrrs_global_2r      0.78      0.78    0.81      0.42 3.7    0.052 0.030  0.40
ecrrs_global_3r      0.80      0.80    0.82      0.45 4.1    0.048 0.033  0.47
ecrrs_global_4r      0.84      0.84    0.88      0.51 5.3    0.040 0.043  0.52
ecrrs_global_5c      0.83      0.84    0.84      0.51 5.2    0.042 0.023  0.51
ecrrs_global_6c      0.79      0.81    0.82      0.46 4.3    0.051 0.035  0.50

 Item statistics 
                 n raw.r std.r r.cor r.drop mean   sd
ecrrs_global_1r 39  0.57  0.64  0.53   0.46  2.0 0.93
ecrrs_global_2r 39  0.87  0.89  0.89   0.80  3.1 1.32
ecrrs_global_3r 39  0.80  0.83  0.82   0.70  2.9 1.42
ecrrs_global_4r 39  0.69  0.68  0.56   0.53  4.3 1.59
ecrrs_global_5c 39  0.74  0.69  0.65   0.58  4.0 1.72
ecrrs_global_6c 39  0.83  0.80  0.78   0.72  4.2 1.71

Non missing response frequency for each item
                   1    2    3    4    5    6    7 miss
ecrrs_global_1r 0.33 0.44 0.18 0.03 0.03 0.00 0.00    0
ecrrs_global_2r 0.10 0.26 0.36 0.08 0.18 0.03 0.00    0
ecrrs_global_3r 0.15 0.31 0.28 0.08 0.13 0.05 0.00    0
ecrrs_global_4r 0.08 0.05 0.18 0.10 0.41 0.10 0.08    0
ecrrs_global_5c 0.05 0.15 0.26 0.15 0.13 0.18 0.08    0
ecrrs_global_6c 0.05 0.13 0.23 0.10 0.26 0.13 0.10    0
## ECRRS-spouse anxiety
mlm_tot_ecrrs_spouse_anx <- subset(mlm_scales,  select=c(ecrrs_spouse_7c:ecrrs_spouse_9c))
psych::alpha(mlm_tot_ecrrs_spouse_anx)

Reliability analysis   
Call: psych::alpha(x = mlm_tot_ecrrs_spouse_anx)

  raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
      0.86      0.86    0.84      0.67 6.1 0.042  2.2 1.5     0.63

 lower alpha upper     95% confidence boundaries
0.77 0.86 0.94 

 Reliability if an item is dropped:
                raw_alpha std.alpha G6(smc) average_r  S/N alpha se var.r med.r
ecrrs_spouse_7c      0.92      0.92    0.86      0.86 11.9    0.025    NA  0.86
ecrrs_spouse_8c      0.69      0.69    0.53      0.53  2.2    0.099    NA  0.53
ecrrs_spouse_9c      0.77      0.77    0.63      0.63  3.3    0.074    NA  0.63

 Item statistics 
                 n raw.r std.r r.cor r.drop mean  sd
ecrrs_spouse_7c 39  0.82  0.81  0.63   0.60  2.2 1.8
ecrrs_spouse_8c 39  0.93  0.94  0.93   0.84  2.3 1.8
ecrrs_spouse_9c 39  0.89  0.90  0.87   0.77  2.1 1.6

Non missing response frequency for each item
                   1    2    3    4    5    6    7 miss
ecrrs_spouse_7c 0.59 0.15 0.03 0.08 0.08 0.03 0.05    0
ecrrs_spouse_8c 0.54 0.18 0.00 0.13 0.08 0.05 0.03    0
ecrrs_spouse_9c 0.59 0.18 0.00 0.13 0.03 0.08 0.00    0
## ECRRS-spouse avoid
mlm_tot_ecrrs_spouse_avoid <- subset(mlm_scales,  select=c(ecrrs_spouse_1r:ecrrs_spouse_4r, ecrrs_spouse_5c, ecrrs_spouse_6c))
psych::alpha(mlm_tot_ecrrs_spouse_avoid)

Reliability analysis   
Call: psych::alpha(x = mlm_tot_ecrrs_spouse_avoid)

  raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
      0.91      0.93    0.95      0.68  13 0.024  2.1 1.2     0.67

 lower alpha upper     95% confidence boundaries
0.86 0.91 0.95 

 Reliability if an item is dropped:
                raw_alpha std.alpha G6(smc) average_r  S/N alpha se var.r med.r
ecrrs_spouse_1r      0.87      0.89    0.91      0.63  8.5    0.034 0.024  0.60
ecrrs_spouse_2r      0.88      0.91    0.92      0.66  9.6    0.030 0.021  0.65
ecrrs_spouse_3r      0.88      0.90    0.92      0.65  9.1    0.031 0.025  0.62
ecrrs_spouse_4r      0.89      0.92    0.94      0.69 11.0    0.029 0.029  0.65
ecrrs_spouse_5c      0.92      0.94    0.95      0.75 15.2    0.021 0.021  0.76
ecrrs_spouse_6c      0.89      0.92    0.94      0.71 12.2    0.030 0.036  0.76

 Item statistics 
                 n raw.r std.r r.cor r.drop mean  sd
ecrrs_spouse_1r 39  0.94  0.96  0.97   0.91  2.0 1.2
ecrrs_spouse_2r 39  0.87  0.90  0.91   0.81  1.9 1.1
ecrrs_spouse_3r 39  0.90  0.92  0.93   0.86  1.8 1.0
ecrrs_spouse_4r 39  0.82  0.84  0.81   0.74  2.0 1.4
ecrrs_spouse_5c 39  0.77  0.72  0.63   0.61  2.7 1.9
ecrrs_spouse_6c 39  0.83  0.80  0.76   0.73  2.3 1.6

Non missing response frequency for each item
                   1    2    3    4    5    6    7 miss
ecrrs_spouse_1r 0.44 0.31 0.13 0.10 0.00 0.03 0.00    0
ecrrs_spouse_2r 0.44 0.41 0.05 0.08 0.00 0.03 0.00    0
ecrrs_spouse_3r 0.51 0.33 0.05 0.08 0.03 0.00 0.00    0
ecrrs_spouse_4r 0.51 0.23 0.10 0.10 0.03 0.00 0.03    0
ecrrs_spouse_5c 0.38 0.23 0.03 0.18 0.05 0.10 0.03    0
ecrrs_spouse_6c 0.49 0.23 0.00 0.13 0.13 0.03 0.00    0

Drop the 4 people with low trial counts

data <- filter(mlm_bx5, !grepl("D101|D141|D147|D148",ID))
unique(data$ID)
 [1] "D102" "D103" "D104" "D105" "D107" "D110" "D113" "D114" "D115" "D117" "D118" "D119" "D120" "D121" "D122" "D123" "D125" "D126" "D127" "D128" "D129" "D130" "D131" "D132" "D133"
[26] "D134" "D135" "D137" "D138" "D139" "D140" "D142" "D144" "D145" "D146"

OLS estimates for exploratory purposes

# getting regular regression estimates one person at a time for exploratory purposes:
lmEst <- by(data, data$ID, function(x) summary(lm(bias_t ~ trialnum, data=x)))
lmEst
data$ID: D102

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.48996 -0.58245  0.06197  0.68084  2.64005 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.291e-01  1.136e-01  -1.137    0.257
trialnum    -2.241e-05  1.359e-03  -0.016    0.987

Residual standard error: 0.9253 on 273 degrees of freedom
Multiple R-squared:  9.965e-07, Adjusted R-squared:  -0.003662 
F-statistic: 0.000272 on 1 and 273 DF,  p-value: 0.9869

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D103

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.2719 -0.5849 -0.0927  0.3886  2.9673 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.082364   0.102043   0.807    0.420
trialnum    -0.001906   0.001211  -1.573    0.117

Residual standard error: 0.8135 on 262 degrees of freedom
Multiple R-squared:  0.009358,  Adjusted R-squared:  0.005577 
F-statistic: 2.475 on 1 and 262 DF,  p-value: 0.1169

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D104

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.34415 -0.69203 -0.01214  0.68738  2.23131 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.1884022  0.1162064  -1.621    0.106
trialnum     0.0006281  0.0013690   0.459    0.647

Residual standard error: 0.9107 on 260 degrees of freedom
Multiple R-squared:  0.000809,  Adjusted R-squared:  -0.003034 
F-statistic: 0.2105 on 1 and 260 DF,  p-value: 0.6468

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D105

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.3081 -0.7206 -0.0130  0.6973  3.2083 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0087356  0.1191993  -0.073    0.942
trialnum     0.0001354  0.0014298   0.095    0.925

Residual standard error: 0.9797 on 269 degrees of freedom
Multiple R-squared:  3.331e-05, Adjusted R-squared:  -0.003684 
F-statistic: 0.008962 on 1 and 269 DF,  p-value: 0.9247

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D107

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.85010 -0.44853 -0.01855  0.41297  3.11259 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.060853   0.101464  -0.600    0.549
trialnum    -0.001334   0.001215  -1.098    0.273

Residual standard error: 0.731 on 222 degrees of freedom
Multiple R-squared:  0.0054,    Adjusted R-squared:  0.0009193 
F-statistic: 1.205 on 1 and 222 DF,  p-value: 0.2735

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D110

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.67052 -0.56069 -0.05879  0.55996  2.49168 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.364749   0.096444  -3.782  0.00019 ***
trialnum     0.001562   0.001150   1.359  0.17533    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8085 on 282 degrees of freedom
Multiple R-squared:  0.006504,  Adjusted R-squared:  0.002981 
F-statistic: 1.846 on 1 and 282 DF,  p-value: 0.1753

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D113

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.5435 -0.7334 -0.0605  0.6933  3.8484 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.156910   0.143946    1.09    0.277
trialnum    -0.001568   0.001723   -0.91    0.364

Residual standard error: 1.158 on 255 degrees of freedom
Multiple R-squared:  0.003239,  Adjusted R-squared:  -0.0006703 
F-statistic: 0.8285 on 1 and 255 DF,  p-value: 0.3636

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D114

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.50969 -0.65882 -0.00707  0.71733  2.52092 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.025504   0.120929  -0.211    0.833
trialnum     0.000752   0.001435   0.524    0.601

Residual standard error: 0.9673 on 264 degrees of freedom
Multiple R-squared:  0.001039,  Adjusted R-squared:  -0.002745 
F-statistic: 0.2745 on 1 and 264 DF,  p-value: 0.6007

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D115

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.9324 -0.7991  0.0765  0.8424  3.6521 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.029711   0.148768  -0.200    0.842
trialnum    -0.000728   0.001758  -0.414    0.679

Residual standard error: 1.191 on 268 degrees of freedom
Multiple R-squared:  0.0006395, Adjusted R-squared:  -0.00309 
F-statistic: 0.1715 on 1 and 268 DF,  p-value: 0.6791

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D117

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.64965 -0.58759  0.01891  0.64528  2.42830 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.1011772  0.1241156   0.815    0.416
trialnum    0.0005858  0.0014759   0.397    0.692

Residual standard error: 1.006 on 265 degrees of freedom
Multiple R-squared:  0.000594,  Adjusted R-squared:  -0.003177 
F-statistic: 0.1575 on 1 and 265 DF,  p-value: 0.6918

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D118

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.5708 -0.6143  0.0243  0.5573  3.5952 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.147982   0.115324  -1.283    0.201
trialnum     0.001085   0.001367   0.794    0.428

Residual standard error: 0.9324 on 258 degrees of freedom
Multiple R-squared:  0.002435,  Adjusted R-squared:  -0.001432 
F-statistic: 0.6298 on 1 and 258 DF,  p-value: 0.4282

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D119

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.46893 -0.28725 -0.04541  0.27308  1.41940 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)   
(Intercept) -0.1848698  0.0580857  -3.183  0.00163 **
trialnum     0.0009938  0.0006790   1.464  0.14452   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4621 on 263 degrees of freedom
Multiple R-squared:  0.008078,  Adjusted R-squared:  0.004307 
F-statistic: 2.142 on 1 and 263 DF,  p-value: 0.1445

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D120

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.23591 -0.59528  0.02358  0.66195  2.25539 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.013139   0.106695   0.123    0.902
trialnum    0.001517   0.001279   1.186    0.237

Residual standard error: 0.8919 on 281 degrees of freedom
Multiple R-squared:  0.004979,  Adjusted R-squared:  0.001438 
F-statistic: 1.406 on 1 and 281 DF,  p-value: 0.2367

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D121

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.43694 -0.91284  0.06214  0.88456  2.63851 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0733386  0.1441131   0.509    0.611
trialnum    0.0007977  0.0016891   0.472    0.637

Residual standard error: 1.109 on 250 degrees of freedom
Multiple R-squared:  0.0008913, Adjusted R-squared:  -0.003105 
F-statistic: 0.223 on 1 and 250 DF,  p-value: 0.6372

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D122

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.56956 -0.41297  0.06257  0.47489  2.13686 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.363954   0.087539   4.158 4.36e-05 ***
trialnum    -0.002274   0.001048  -2.170   0.0309 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7109 on 262 degrees of freedom
Multiple R-squared:  0.01766,   Adjusted R-squared:  0.01391 
F-statistic:  4.71 on 1 and 262 DF,  p-value: 0.03089

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D123

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.40012 -0.56537 -0.00043  0.66146  2.34616 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0491546  0.1061556   0.463    0.644
trialnum    0.0006319  0.0012798   0.494    0.622

Residual standard error: 0.8798 on 280 degrees of freedom
Multiple R-squared:  0.0008698, Adjusted R-squared:  -0.002698 
F-statistic: 0.2438 on 1 and 280 DF,  p-value: 0.6219

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D125

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.8243 -0.7796  0.1010  0.8776  2.5089 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)  
(Intercept) -0.104281   0.149377  -0.698   0.4858  
trialnum     0.003034   0.001758   1.726   0.0855 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.171 on 251 degrees of freedom
Multiple R-squared:  0.01174,   Adjusted R-squared:  0.007799 
F-statistic: 2.981 on 1 and 251 DF,  p-value: 0.08549

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D126

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.59811 -0.84432  0.03587  0.79812  2.77645 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.116054   0.146348   0.793    0.429
trialnum    -0.001164   0.001716  -0.678    0.498

Residual standard error: 1.132 on 251 degrees of freedom
Multiple R-squared:  0.001829,  Adjusted R-squared:  -0.002148 
F-statistic: 0.4599 on 1 and 251 DF,  p-value: 0.4983

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D127

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.34833 -0.46407  0.01749  0.49797  2.14017 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0083635  0.0926001   0.090    0.928
trialnum    0.0002109  0.0011159   0.189    0.850

Residual standard error: 0.7748 on 277 degrees of freedom
Multiple R-squared:  0.0001289, Adjusted R-squared:  -0.003481 
F-statistic: 0.03571 on 1 and 277 DF,  p-value: 0.8503

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D128

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.88746 -0.77939  0.01618  0.90018  2.92400 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.036492   0.144944   0.252    0.801
trialnum    0.001748   0.001781   0.982    0.327

Residual standard error: 1.122 on 246 degrees of freedom
Multiple R-squared:  0.003902,  Adjusted R-squared:  -0.0001475 
F-statistic: 0.9636 on 1 and 246 DF,  p-value: 0.3273

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D129

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.29761 -0.68312 -0.07219  0.64877  2.57404 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)  
(Intercept)  0.271839   0.116233   2.339   0.0201 *
trialnum    -0.001223   0.001378  -0.887   0.3756  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9355 on 270 degrees of freedom
Multiple R-squared:  0.002909,  Adjusted R-squared:  -0.0007842 
F-statistic: 0.7876 on 1 and 270 DF,  p-value: 0.3756

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D130

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.96670 -0.73599  0.05166  0.64512  2.42892 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept) -0.286183   0.107700  -2.657  0.00834 **
trialnum     0.002860   0.001289   2.218  0.02737 * 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8939 on 273 degrees of freedom
Multiple R-squared:  0.0177,    Adjusted R-squared:  0.01411 
F-statistic:  4.92 on 1 and 273 DF,  p-value: 0.02737

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D131

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.2194 -0.5982 -0.0545  0.6498  2.7764 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0791588  0.1059716   0.747    0.456
trialnum    -0.0003126  0.0012645  -0.247    0.805

Residual standard error: 0.8565 on 271 degrees of freedom
Multiple R-squared:  0.0002255, Adjusted R-squared:  -0.003464 
F-statistic: 0.06111 on 1 and 271 DF,  p-value: 0.8049

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D132

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.1408 -0.7653  0.1364  0.8599  2.3351 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.013177   0.136249   0.097    0.923
trialnum    0.001955   0.001622   1.206    0.229

Residual standard error: 1.121 on 279 degrees of freedom
Multiple R-squared:  0.005182,  Adjusted R-squared:  0.001616 
F-statistic: 1.453 on 1 and 279 DF,  p-value: 0.229

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D133

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.4840 -0.8749 -0.1888  0.8248  3.3248 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.1745814  0.1439426  -1.213    0.226
trialnum    -0.0008207  0.0017355  -0.473    0.637

Residual standard error: 1.142 on 240 degrees of freedom
Multiple R-squared:  0.000931,  Adjusted R-squared:  -0.003232 
F-statistic: 0.2236 on 1 and 240 DF,  p-value: 0.6367

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D134

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.2731 -0.5992 -0.0479  0.7295  3.3435 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0536221  0.1424674   0.376    0.707
trialnum    -0.0004335  0.0016789  -0.258    0.796

Residual standard error: 1.138 on 262 degrees of freedom
Multiple R-squared:  0.0002544, Adjusted R-squared:  -0.003561 
F-statistic: 0.06666 on 1 and 262 DF,  p-value: 0.7965

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D135

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.78490 -0.53797  0.01393  0.67350  2.64560 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept) 0.053305   0.123706   0.431   0.6669  
trialnum    0.003670   0.001462   2.510   0.0127 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9873 on 261 degrees of freedom
Multiple R-squared:  0.02356,   Adjusted R-squared:  0.01982 
F-statistic: 6.298 on 1 and 261 DF,  p-value: 0.01269

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D137

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.6075 -0.9216 -0.0313  0.7593  3.2136 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.2405934  0.1503080  -1.601    0.111
trialnum     0.0008816  0.0017989   0.490    0.624

Residual standard error: 1.231 on 270 degrees of freedom
Multiple R-squared:  0.0008888, Adjusted R-squared:  -0.002812 
F-statistic: 0.2402 on 1 and 270 DF,  p-value: 0.6245

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D138

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.90652 -0.57527 -0.05418  0.52103  2.44687 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
(Intercept) -0.2472722  0.1014525  -2.437   0.0154 *
trialnum     0.0003618  0.0012118   0.299   0.7655  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8344 on 272 degrees of freedom
Multiple R-squared:  0.0003276, Adjusted R-squared:  -0.003348 
F-statistic: 0.08913 on 1 and 272 DF,  p-value: 0.7655

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D139

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.89378 -0.76546  0.05778  0.67061  2.69126 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.1587970  0.1281068    1.24    0.216
trialnum    -0.0001376  0.0015243   -0.09    0.928

Residual standard error: 1.025 on 260 degrees of freedom
Multiple R-squared:  3.136e-05, Adjusted R-squared:  -0.003815 
F-statistic: 0.008153 on 1 and 260 DF,  p-value: 0.9281

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D140

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.3323 -0.6421  0.0875  0.6323  3.0150 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.03402    0.12900   0.264    0.792
trialnum    -0.00110    0.00152  -0.724    0.470

Residual standard error: 1.031 on 251 degrees of freedom
Multiple R-squared:  0.002084,  Adjusted R-squared:  -0.001892 
F-statistic: 0.5241 on 1 and 251 DF,  p-value: 0.4698

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D142

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.9667 -0.7474  0.1062  0.7347  2.7858 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.046481   0.128445   0.362    0.718
trialnum    -0.001103   0.001506  -0.732    0.465

Residual standard error: 1.014 on 271 degrees of freedom
Multiple R-squared:  0.001975,  Adjusted R-squared:  -0.001707 
F-statistic: 0.5364 on 1 and 271 DF,  p-value: 0.4646

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D144

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.4566 -0.6576 -0.1585  0.7136  3.0419 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.154162   0.137451  -1.122    0.263
trialnum    -0.000492   0.001635  -0.301    0.764

Residual standard error: 1.077 on 258 degrees of freedom
Multiple R-squared:  0.0003507, Adjusted R-squared:  -0.003524 
F-statistic: 0.0905 on 1 and 258 DF,  p-value: 0.7638

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D145

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.9882 -1.0491  0.1342  0.9403  3.2810 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0368251  0.1526175   0.241    0.810
trialnum    -0.0003689  0.0018315  -0.201    0.841

Residual standard error: 1.246 on 267 degrees of freedom
Multiple R-squared:  0.0001519, Adjusted R-squared:  -0.003593 
F-statistic: 0.04056 on 1 and 267 DF,  p-value: 0.8405

----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D146

Call:
lm(formula = bias_t ~ trialnum, data = x)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.2402 -0.7404 -0.0087  0.6633  3.1953 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.075070   0.122868   0.611    0.542
trialnum    -0.002039   0.001469  -1.388    0.166

Residual standard error: 1.014 on 275 degrees of freedom
Multiple R-squared:  0.00696,   Adjusted R-squared:  0.003349 
F-statistic: 1.928 on 1 and 275 DF,  p-value: 0.1662

Stem plot for the intercepts based on the OLS estimates:

int <- by(data, data$ID, function(x) coefficients(lm(bias_t ~ trialnum, data=x))[[1]])
int
data$ID: D102
[1] -0.1290878
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D103
[1] 0.08236442
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D104
[1] -0.1884022
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D105
[1] -0.008735585
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D107
[1] -0.06085291
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D110
[1] -0.3647491
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D113
[1] 0.1569096
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D114
[1] -0.02550412
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D115
[1] -0.02971102
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D117
[1] 0.1011772
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D118
[1] -0.1479823
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D119
[1] -0.1848698
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D120
[1] 0.01313922
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D121
[1] 0.07333861
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D122
[1] 0.3639537
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D123
[1] 0.04915464
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D125
[1] -0.1042809
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D126
[1] 0.116054
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D127
[1] 0.008363529
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D128
[1] 0.0364918
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D129
[1] 0.2718394
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D130
[1] -0.2861826
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D131
[1] 0.07915881
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D132
[1] 0.01317668
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D133
[1] -0.1745814
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D134
[1] 0.05362205
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D135
[1] 0.05330519
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D137
[1] -0.2405934
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D138
[1] -0.2472722
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D139
[1] 0.158797
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D140
[1] 0.03402106
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D142
[1] 0.04648122
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D144
[1] -0.1541616
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D145
[1] 0.03682513
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D146
[1] 0.07506976
summary(int)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-0.36470 -0.13850  0.01318 -0.01496  0.07420  0.36400 
stem(int, scale=2)

  The decimal point is 1 digit(s) to the left of the |

  -3 | 6
  -3 | 
  -2 | 95
  -2 | 4
  -1 | 98755
  -1 | 30
  -0 | 6
  -0 | 331
   0 | 111344
   0 | 55557888
   1 | 02
   1 | 66
   2 | 
   2 | 7
   3 | 
   3 | 6
hist(int)

Stem plot for the slopes based on the OLS estimates

slope <- by(data, data$ID, function(x) coefficients(lm(bias_t ~ trialnum, data=x))[[2]])
slope
data$ID: D102
[1] -2.241218e-05
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D103
[1] -0.001905679
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D104
[1] 0.0006280884
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D105
[1] 0.0001353558
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D107
[1] -0.001334097
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D110
[1] 0.001561949
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D113
[1] -0.001568244
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D114
[1] 0.0007520295
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D115
[1] -0.0007279879
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D117
[1] 0.000585758
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D118
[1] 0.001084571
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D119
[1] 0.000993783
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D120
[1] 0.001516656
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D121
[1] 0.0007976749
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D122
[1] -0.002274025
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D123
[1] 0.0006318915
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D125
[1] 0.003034465
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D126
[1] -0.001163756
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D127
[1] 0.0002108626
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D128
[1] 0.001748003
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D129
[1] -0.001222927
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D130
[1] 0.002859666
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D131
[1] -0.0003125823
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D132
[1] 0.001955351
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D133
[1] -0.0008207093
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D134
[1] -0.0004334724
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D135
[1] 0.003669641
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D137
[1] 0.0008815924
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D138
[1] 0.000361766
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D139
[1] -0.0001376385
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D140
[1] -0.001100133
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D142
[1] -0.001103152
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D144
[1] -0.000491994
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D145
[1] -0.0003688567
----------------------------------------------------------------------------------------------------------------------------------------- 
data$ID: D146
[1] -0.002038851
summary(slope)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
-0.0022740 -0.0009604  0.0001354  0.0001824  0.0009377  0.0036700 
stem(slope, scale=2)

  The decimal point is 3 digit(s) to the left of the |

  -2 | 30
  -1 | 96
  -1 | 32211
  -0 | 875
  -0 | 44310
   0 | 124
   0 | 666889
   1 | 01
   1 | 567
   2 | 0
   2 | 9
   3 | 0
   3 | 7
hist(slope)

Note that these slope estimates are all extremely close to zero, consistent with the lack of an effect of time (trial number).

R-square for OLS models

rsq <- by(data, data$ID, function(x) summary(lm(bias_t ~ trialnum, data=x))$r.squared)
summary(rsq)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
9.960e-07 4.724e-04 1.829e-03 4.105e-03 5.291e-03 2.356e-02 
stem(rsq, scale=2)

  The decimal point is 3 digit(s) to the left of the |

   0 | 000122334668999908
   2 | 014929
   4 | 024
   6 | 50
   8 | 14
  10 | 7
  12 | 
  14 | 
  16 | 77
  18 | 
  20 | 
  22 | 6
cor(int, slope)
[1] -0.475133

The correlation between intercepts and slopes is -0.475: people who show less avoidance bias overall (higher intercepts) tend to have lower slopes, i.e., they drift less toward approach (or more toward avoidance) over the course of the task.
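
A scatterplot of the per-person OLS estimates (a sketch, using the int and slope objects computed above) makes this negative relationship easier to see:

# Plot each person's OLS intercept against their OLS slope, with a least-squares reference line
plot(unlist(int), unlist(slope), xlab = "OLS intercept", ylab = "OLS slope (trialnum)")
abline(lm(unlist(slope) ~ unlist(int)))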

Graphing multilevel data for exploratory purposes

library(lattice)
xyplot(bias_t ~ trialnum | ID, data = data, as.table=T) # too many people/hard to see, try random subset

ids <- sample(unique(data$ID), 12)
ids
 [1] "D125" "D128" "D145" "D114" "D115" "D129" "D117" "D131" "D123" "D134" "D132" "D118"
temp <- data[data$ID %in% ids, ]
xyplot(bias_t ~ trialnum | ID, data = temp, as.table=T) 

# add fit line(s)
xyplot(bias_t ~ trialnum | ID, data = temp, type=c("p","r")) # linear fit

xyplot(bias_t ~ trialnum | ID, data = temp, type=c("p","smooth")) # curvilinear fit

# graph of the raw data for each person
interaction.plot(data$trialnum, data$ID, data$bias_t)

# generate linear fitted values for each person and then plot all together in one plot
fit <- by(data, data$ID, function(x) fitted.values(lm(bias_t ~ trialnum, data=x)))
fit2 <- unlist(fit)
interaction.plot(data$trialnum, data$ID, fit2, xlab="trial number", ylab="RT bias (transformed)")

There does not seem to be either a linear or curvilinear effect of trial number (i.e., time) on RT bias in the task.
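
To formalize that visual impression, one option (a sketch, not run here) is to fit a quadratic trend in trial number with a random intercept and inspect the linear and quadratic terms:

library(nlme)  # also loaded in the NHST section below
# Quadratic (curvilinear) trend in trial number, random intercept per person
m_quad <- lme(bias_t ~ poly(trialnum, 2), data = data, random = ~ 1 | ID, method = "ML")
summary(m_quad)$tTable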

Analyses (NHST)

Remember: in these models, the "effect of time" refers to trial number (trialnum). It is NOT the dataset's time variable, which records the time of day at which the task was run.

Unconditional means model

library(nlme)
# unconditional means model
m1 <- lme(bias_t ~ 1, random = ~ 1 | ID, data=data, method="ML")
summary(m1)
Linear mixed-effects model fit by maximum likelihood
 Data: data 
       AIC      BIC    logLik
  26249.82 26271.23 -13121.91

Random effects:
 Formula: ~1 | ID
        (Intercept)  Residual
StdDev:   0.1244563 0.9906337

Fixed effects: bias_t ~ 1 
                   Value  Std.Error   DF     t-value p-value
(Intercept) -0.001588482 0.02341911 9254 -0.06782845  0.9459

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max 
-3.545388737 -0.657743071  0.002789368  0.676201102  3.889482930 

Number of Observations: 9289
Number of Groups: 35 
# ICC
iVar <- as.numeric(VarCorr(m1)[1,1])
residVar <- as.numeric(VarCorr(m1)[2,1])
icc1 <- iVar/(iVar+residVar) 
icc1
[1] 0.0155384

The overall mean bias score (transformed) is about -0.002, with a standard error of 0.023; the between-person standard deviation of the intercepts is 0.124 and the residual (within-person) standard deviation is 0.991.

The ICC is about 0.016, meaning that roughly 1.6% of the variance in bias is attributable to differences between people.
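
As a sanity check against the printed Random effects table (which reports standard deviations, not variances), the same ICC can be recomputed by hand from the printed values:

# Square the printed StdDevs to get variances, then take the ratio
0.1244563^2 / (0.1244563^2 + 0.9906337^2)  # ~0.0155, matching icc1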

Add fixed and random intercept and slope of time

# model with fixed and random intercept and slope of time (intercepts implied)
m2 <- lme(bias_t ~ trialnum, data = data, random = ~ trialnum | ID, method="ML")
summary(m2) 
Linear mixed-effects model fit by maximum likelihood
 Data: data 
       AIC      BIC    logLik
  26254.44 26297.26 -13121.22

Random effects:
 Formula: ~trialnum | ID
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev      Corr  
(Intercept) 0.112903390 (Intr)
trialnum    0.000529973 0.112 
Residual    0.990386738       

Fixed effects: bias_t ~ trialnum 
                   Value   Std.Error   DF    t-value p-value
(Intercept) -0.015938869 0.028255586 9253 -0.5640962  0.5727
trialnum     0.000196016 0.000263655 9253  0.7434570  0.4572
 Correlation: 
         (Intr)
trialnum -0.577

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max 
-3.540840497 -0.657596342  0.001668977  0.677196004  3.894566704 

Number of Observations: 9289
Number of Groups: 35 
anova(m2)
            numDF denDF   F-value p-value
(Intercept)     1  9253 0.0272704  0.8688
trialnum        1  9253 0.5527283  0.4572

The average bias at trial number = 0 is -0.016, and bias becomes very slightly (and nonsignificantly) more approach-oriented with each additional trial (b = 0.0002). There is no significant effect of trial number (time) on RT bias score, t = 0.74, p = .457.

# ICC
iVar <- as.numeric(VarCorr(m2)[1,1])
residVar <- as.numeric(VarCorr(m2)[2,1])
icc2 <- iVar/(iVar+residVar)
icc2
[1] 0.999978

Taken at face value, this would mean that 99.99% of the variance in bias is between-person after controlling for time, which cannot be reconciled with the ICC of 0.016 from the unconditional means model above (where less than 2% of the variance was between-person). The discrepancy is an indexing artifact rather than a real change in the variance structure: for m2, VarCorr() returns three rows -- (Intercept), trialnum, and Residual -- so [2,1] picks up the tiny random-slope variance instead of the residual variance, and the ratio reduces to intercept variance over (intercept + slope) variance, which is essentially 1. It is not evidence that controlling for time shifts variance to the between-person level.
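
To make the fix explicit (a sketch, not re-run here): for m2 the residual variance sits in row 3 of the VarCorr table, and with a random slope this ratio strictly applies at trialnum = 0:

# Corrected indexing: VarCorr(m2) rows are (Intercept), trialnum, Residual
iVar     <- as.numeric(VarCorr(m2)[1, 1])  # intercept variance
residVar <- as.numeric(VarCorr(m2)[3, 1])  # residual variance (row 3, not row 2)
iVar / (iVar + residVar)  # ~0.013 from the printed SDs, in line with icc1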

Adding stimulus as a fixed effect

There were 5 different types of stimuli, presented in a pseudorandomized order (consistent across participants and sessions) determined by genetic algorithm. Everyone saw the same stimulus categories in the same order, so stimulus is treated as a fixed effect. The categories of images were: Death (generic), Deceased (personal photo), Living loved one (personal photo), Stranger (sex-matched to spouse), and Neutral (objects).

# add the fixed effect of stim and compare its fit to m2 by inspecting the t and f tests
m3 <- lme(bias_t ~ trialnum + stim, data = data, random = ~ trialnum | ID, method="ML")
summary(m3)
Linear mixed-effects model fit by maximum likelihood
 Data: data 
      AIC      BIC    logLik
  26222.9 26294.27 -13101.45

Random effects:
 Formula: ~trialnum | ID
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev       Corr  
(Intercept) 0.1132056534 (Intr)
trialnum    0.0005257672 0.108 
Residual    0.9882755468       

Fixed effects: bias_t ~ trialnum + stim 
                   Value  Std.Error   DF   t-value p-value
(Intercept)  -0.01666628 0.03734155 9249 -0.446320  0.6554
trialnum      0.00006687 0.00026759 9249  0.249892  0.8027
stimdeath    -0.05312693 0.03423609 9249 -1.551782  0.1207
stimliving    0.04449279 0.03262355 9249  1.363824  0.1727
stimspouse    0.11774054 0.03318866 9249  3.547614  0.0004
stimstranger -0.05577654 0.03227777 9249 -1.728017  0.0840
 Correlation: 
             (Intr) trilnm stmdth stmlvn stmsps
trialnum     -0.533                            
stimdeath    -0.537  0.171                     
stimliving   -0.529  0.123  0.517              
stimspouse   -0.484  0.063  0.497  0.518       
stimstranger -0.534  0.123  0.523  0.541  0.523

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max 
-3.539701896 -0.662289657  0.002436611  0.671269781  3.967555161 

Number of Observations: 9289
Number of Groups: 35 

Still NO effect of time. There is a significant fixed effect of spouse, t = 3.55, p = .0004, and a nonsignificant trend for stranger, t = -1.73, p = .084. These coefficients are contrasts against the omitted reference category (neutral): people tended to show more approach bias to photos of their spouse (across conditions), with a parameter estimate of 0.12 (SE = 0.03), and tended toward greater avoidance bias for photos of the sex-matched stranger.

intervals(m3, which="fixed")
Approximate 95% confidence intervals

 Fixed effects:
                     lower          est.       upper
(Intercept)  -0.0898403107 -1.666628e-02 0.056507748
trialnum     -0.0004574916  6.686771e-05 0.000591227
stimdeath    -0.1202155468 -5.312693e-02 0.013961677
stimliving   -0.0194359045  4.449279e-02 0.108421484
stimspouse    0.0527044697  1.177405e-01 0.182776618
stimstranger -0.1190276408 -5.577654e-02 0.007474562
attr(,"label")
[1] "Fixed effects:"

95% CIs don’t cross zero for spouse, but do for all other stimulus types.
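
For a direct test of one category against another (e.g., spouse vs. stranger, rather than each category vs. the neutral reference), one option is to relevel the stimulus factor and refit (a sketch, not run here; the level name "spouse" is assumed from the coefficient labels above):

# Make spouse the reference level so the remaining coefficients are contrasts against it
data$stim2 <- relevel(factor(data$stim), ref = "spouse")
m3b <- lme(bias_t ~ trialnum + stim2, data = data, random = ~ trialnum | ID, method = "ML")
summary(m3b)$tTable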

iVar <- as.numeric(VarCorr(m3)[1,1])
residVar <- as.numeric(VarCorr(m3)[2,1])
icc3 <- iVar/(iVar+residVar)
icc3 # Same ICC as m2
[1] 0.9999784

This matches model 2 exactly, and for the same reason noted above: with the random slope in the model, this ratio is intercept variance over intercept-plus-slope variance rather than a true ICC.

anova(m2, m3) # compare this model with the first one
   Model df      AIC      BIC    logLik   Test  L.Ratio p-value
m2     1  6 26254.44 26297.26 -13121.22                        
m3     2 10 26222.90 26294.27 -13101.45 1 vs 2 39.53877  <.0001

Comparing the model with the fixed effect of stimulus (m3) to the model without it (m2), m3 fits better (p < .0001), as evidenced by lower AIC and BIC and a higher log-likelihood.

Take out fixed and random effect of time

Because there's no significant effect of time, it makes sense to drop it from the model. Now we're basically doing regression, but accounting for whatever interdependence exists. Note that these results will NOT match the results of the regressions, because the regressions use medians of the bias scores rather than means.

# without fixed or random effect of time
m4 <- lme(bias_t ~ stim, data = data, random = ~ 1 | ID, method="ML")
summary(m4)
Linear mixed-effects model fit by maximum likelihood
 Data: data 
       AIC      BIC    logLik
  26217.69 26267.65 -13101.85

Random effects:
 Formula: ~1 | ID
        (Intercept)  Residual
StdDev:   0.1244692 0.9884897

Fixed effects: bias_t ~ stim 
                   Value  Std.Error   DF   t-value p-value
(Intercept)  -0.01079980 0.03153608 9250 -0.342458  0.7320
stimdeath    -0.05495775 0.03367448 9250 -1.632029  0.1027
stimliving    0.04329407 0.03234999 9250  1.338302  0.1808
stimspouse    0.11715733 0.03312053 9250  3.537302  0.0004
stimstranger -0.05692371 0.03200718 9250 -1.778467  0.0754
 Correlation: 
             (Intr) stmdth stmlvn stmsps
stimdeath    -0.519                     
stimliving   -0.541  0.506              
stimspouse   -0.528  0.495  0.515       
stimstranger -0.546  0.512  0.533  0.520

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max 
-3.543324753 -0.662107099  0.003549832  0.673528310  3.963270484 

Number of Observations: 9289
Number of Groups: 35 

Fixed effects are still more or less the same - significant for spouse, marginal for stranger - although the parameter estimates and test statistics are slightly different.

intervals(m4, which="fixed")
Approximate 95% confidence intervals

 Fixed effects:
                   lower        est.       upper
(Intercept)  -0.07260082 -0.01079980 0.051001227
stimdeath    -0.12094940 -0.05495775 0.011033896
stimliving   -0.02010198  0.04329407 0.106690123
stimspouse    0.05225126  0.11715733 0.182063398
stimstranger -0.11964795 -0.05692371 0.005800537
attr(,"label")
[1] "Fixed effects:"

As above, spouse is the only stimulus category whose 95% CI doesn't cross zero, supporting a significant effect of spouse on bias.

Is m4 any better than m3?

anova(m3,m4)
   Model df      AIC      BIC    logLik   Test   L.Ratio p-value
m3     1 10 26222.90 26294.27 -13101.45                         
m4     2  7 26217.69 26267.65 -13101.85 1 vs 2 0.7920774  0.8514

Nope. Models do not significantly differ (p = 0.851), though the AIC and BIC values are a little smaller for m4 than m3. So let’s forget time and go with m4…

Adding condition as a fixed effect

Participants completed the task at two different sessions, where they were randomized to receive treatment A at one and treatment B at the other (order counterbalanced across participants). The two treatments are oxytocin vs. placebo.

# add the fixed effect of condition and check its fit by inspecting the t and F tests. 
m5 <- lme(bias_t ~ stim + cond, data = data, random = ~ 1 | ID, method="ML")
summary(m5)
Linear mixed-effects model fit by maximum likelihood
 Data: data 
       AIC      BIC    logLik
  26219.67 26276.77 -13101.84

Random effects:
 Formula: ~1 | ID
        (Intercept)  Residual
StdDev:   0.1244747 0.9884885

Fixed effects: bias_t ~ stim + cond 
                   Value  Std.Error   DF   t-value p-value
(Intercept)  -0.01230112 0.03317411 9249 -0.370805  0.7108
stimdeath    -0.05494483 0.03367637 9249 -1.631554  0.1028
stimliving    0.04326388 0.03235236 9249  1.337271  0.1812
stimspouse    0.11715898 0.03312228 9249  3.537166  0.0004
stimstranger -0.05694662 0.03200925 9249 -1.779068  0.0753
condB         0.00299612 0.02053218 9249  0.145923  0.8840
 Correlation: 
             (Intr) stmdth stmlvn stmsps stmstr
stimdeath    -0.495                            
stimliving   -0.512  0.506                     
stimspouse   -0.502  0.494  0.515              
stimstranger -0.518  0.512  0.533  0.520       
condB        -0.310  0.003 -0.006  0.000 -0.005

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max 
-3.544942380 -0.661684418  0.005083279  0.672392393  3.961649422 

Number of Observations: 9289
Number of Groups: 35 

The fixed effect of condition B is not significant, t = 0.15, p = 0.884.

anova(m4, m5)
   Model df      AIC      BIC    logLik   Test    L.Ratio p-value
m4     1  7 26217.69 26267.65 -13101.85                          
m5     2  8 26219.67 26276.76 -13101.84 1 vs 2 0.02130728  0.8839

The two models are not significantly different, but the one with the fixed effect of condition (m5) has slightly higher AIC/BIC.

Adding the stimulus*condition interaction

We might expect that people's tendency to approach or avoid would differ depending on which treatment they got, but only for certain stimuli.

# add the stim*cond interaction and check its fit by inspecting the t and F tests. 
m6 <- lme(bias_t ~ stim*cond, data = data, random = ~ 1 | ID, method="ML")
summary(m6)
Linear mixed-effects model fit by maximum likelihood
 Data: data 
       AIC   BIC    logLik
  26222.36 26308 -13099.18

Random effects:
 Formula: ~1 | ID
        (Intercept)  Residual
StdDev:    0.124494 0.9882046

Fixed effects: bias_t ~ stim * cond 
                         Value  Std.Error   DF    t-value p-value
(Intercept)         0.01640033 0.03934944 9245  0.4167868  0.6768
stimdeath          -0.11111299 0.04756084 9245 -2.3362283  0.0195
stimliving          0.04131019 0.04600610 9245  0.8979286  0.3692
stimspouse          0.06476556 0.04686377 9245  1.3819963  0.1670
stimstranger       -0.09233079 0.04545796 9245 -2.0311251  0.0423
condB              -0.05431038 0.04696981 9245 -1.1562826  0.2476
stimdeath:condB     0.11261285 0.06734696 9245  1.6721298  0.0945
stimliving:condB    0.00494200 0.06470579 9245  0.0763764  0.9391
stimspouse:condB    0.10467698 0.06624022 9245  1.5802632  0.1141
stimstranger:condB  0.07047422 0.06401636 9245  1.1008783  0.2710
 Correlation: 
                   (Intr) stmdth stmlvn stmsps stmstr condB  stmd:B stml:B stmsp:B
stimdeath          -0.590                                                         
stimliving         -0.610  0.505                                                  
stimspouse         -0.599  0.496  0.513                                           
stimstranger       -0.618  0.511  0.528  0.519                                    
condB              -0.598  0.495  0.511  0.502  0.517                             
stimdeath:condB     0.417 -0.706 -0.357 -0.350 -0.361 -0.697                      
stimliving:condB    0.434 -0.359 -0.711 -0.364 -0.376 -0.726  0.506               
stimspouse:condB    0.424 -0.351 -0.363 -0.707 -0.367 -0.709  0.495  0.515        
stimstranger:condB  0.439 -0.363 -0.375 -0.368 -0.710 -0.734  0.512  0.533  0.520 

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max 
-3.517164674 -0.660296590  0.006315667  0.672822601  3.934464966 

Number of Observations: 9289
Number of Groups: 35 

Interesting. With the interaction in the model, the simple effects of death (p = .020) and stranger (p = .042) are now significant (these are the effects within condition A, the reference), and there is a marginal but non-significant death*condition B interaction (t = 1.67, p = .094). None of the other interaction terms approach significance.

anova(m6,m4,m5)
   Model df      AIC      BIC    logLik   Test  L.Ratio p-value
m6     1 12 26222.36 26308.00 -13099.18                        
m4     2  7 26217.69 26267.65 -13101.85 1 vs 2 5.332051  0.3767
m5     3  8 26219.67 26276.76 -13101.84 2 vs 3 0.021307  0.8839
intervals(m6)
Approximate 95% confidence intervals

 Fixed effects:
                         lower         est.       upper
(Intercept)        -0.06069172  0.016400326  0.09349237
stimdeath          -0.20429254 -0.111112992 -0.01793344
stimliving         -0.04882335  0.041310191  0.13144373
stimspouse         -0.02704831  0.064765560  0.15657943
stimstranger       -0.18139044 -0.092330795 -0.00327115
condB              -0.14633201 -0.054310379  0.03771125
stimdeath:condB    -0.01933096  0.112612852  0.24455667
stimliving:condB   -0.12182734  0.004941997  0.13171134
stimspouse:condB   -0.02509855  0.104676978  0.23445251
stimstranger:condB -0.05494441  0.070474223  0.19589286
attr(,"label")
[1] "Fixed effects:"

 Random Effects:
  Level: ID 
                    lower     est.     upper
sd((Intercept)) 0.0931069 0.124494 0.1664618

 Within-group standard error:
    lower      est.     upper 
0.9740020 0.9882046 1.0026143 

The model with the interaction is not significantly better than either the model with just the fixed effect of condition or the model with no effect of condition at all, so it makes sense to go with the simplest model, which includes only stimulus as a fixed effect.

Adding other plausible predictors

There are a few other variables that we might expect to impact approach/avoidance bias, namely grief severity, OXTR genotype, time since death, and attachment style (avoidant, anxious).

# use grand-mean-centered scores for the continuous person-level predictors for interpretability
m7 <- lme(bias_t ~ stim + rs53576_A.carrier + rs2268498_C.carrier + gmean_ecrrs_global_anx + gmean_ecrrs_global_avoid + gmean_age + gmean_icg, data = data, random = ~ 1 | ID, method="ML")
summary(m7)
Linear mixed-effects model fit by maximum likelihood
 Data: data 
       AIC      BIC    logLik
  26227.59 26320.36 -13100.79

Random effects:
 Formula: ~1 | ID
        (Intercept)  Residual
StdDev:   0.1198702 0.9884902

Fixed effects: bias_t ~ stim + rs53576_A.carrier + rs2268498_C.carrier + gmean_ecrrs_global_anx +      gmean_ecrrs_global_avoid + gmean_age + gmean_icg 
                               Value  Std.Error   DF   t-value p-value
(Intercept)              -0.01203754 0.03405637 9250 -0.353459  0.7238
stimdeath                -0.05487972 0.03368552 9250 -1.629178  0.1033
stimliving                0.04330158 0.03236055 9250  1.338098  0.1809
stimspouse                0.11722059 0.03313178 9250  3.538010  0.0004
stimstranger             -0.05691502 0.03201784 9250 -1.777603  0.0755
rs53576_A.carrier.L       0.00525467 0.04347813   28  0.120858  0.9047
rs2268498_C.carrier.L     0.00193315 0.04822525   28  0.040086  0.9683
gmean_ecrrs_global_anx   -0.01559904 0.01800716   28 -0.866269  0.3937
gmean_ecrrs_global_avoid -0.00361422 0.02238085   28 -0.161487  0.8729
gmean_age                 0.00000470 0.00001033   28  0.455159  0.6525
gmean_icg                -0.00073388 0.00229617   28 -0.319611  0.7516
 Correlation: 
                         (Intr) stmdth stmlvn stmsps stmstr r53576 r22684 gmn_crrs_glbl_n gmn_crrs_glbl_v gmen_g
stimdeath                -0.481                                                                                 
stimliving               -0.500  0.506                                                                          
stimspouse               -0.489  0.495  0.515                                                                   
stimstranger             -0.506  0.512  0.533  0.520                                                            
rs53576_A.carrier.L       0.248  0.001  0.000 -0.002 -0.001                                                     
rs2268498_C.carrier.L    -0.406  0.000  0.000  0.001  0.001 -0.646                                              
gmean_ecrrs_global_anx    0.017  0.000  0.000  0.000  0.000 -0.143  0.054                                       
gmean_ecrrs_global_avoid -0.078 -0.002 -0.002 -0.001 -0.001 -0.111  0.120 -0.049                                
gmean_age                -0.026  0.002  0.002  0.004  0.003 -0.187  0.171  0.304           0.142                
gmean_icg                -0.021  0.000  0.001  0.002  0.003  0.150  0.055 -0.468          -0.179          -0.263

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max 
-3.540273271 -0.660509033  0.004325344  0.672841199  3.966239659 

Number of Observations: 9289
Number of Groups: 35 
anova(m4,m7)
   Model df      AIC      BIC    logLik   Test  L.Ratio p-value
m4     1  7 26217.69 26267.65 -13101.85                        
m7     2 13 26227.59 26320.37 -13100.80 1 vs 2 2.104686  0.9098

None of the added fixed effects are significant and the AIC/BIC values are higher, so take ’em out.

Final model

# variance explained by the entire model 
fitted <- fitted(m4)
data$fitted <- as.vector(fitted)
rsq <- lm(bias ~ fitted, data=data) 
summary(rsq)

Call:
lm(formula = bias ~ fitted, data = data)

Residuals:
     Min       1Q   Median       3Q      Max 
-1170.66  -128.74    -2.77   127.93  1115.32 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   11.850      2.532    4.68 2.91e-06 ***
fitted       272.525     19.581   13.92  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 244 on 9287 degrees of freedom
Multiple R-squared:  0.02043,   Adjusted R-squared:  0.02033 
F-statistic: 193.7 on 1 and 9287 DF,  p-value: < 2.2e-16

A case of significant, but not meaningful: The R2 for the entire model is .02, indicating that 2% of the variance in bias is explained by the model. So this model is not very good at predicting approach/avoidance bias from the fixed effect of stimulus within person. Oh well.
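For comparison, the same pseudo-R2 could be computed on the transformed outcome that m4 actually models; a sketch, not run above:

rsq_t <- lm(bias_t ~ fitted, data = data) # same fitted values, but on the scale m4 models
summary(rsq_t)$r.squared # likely in the same ballpark (~.02)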

Check final model assumptions

Checking normality of the residuals:

# checking normality of residuals
# level 1:
qqnorm(m4, ~resid(.), id=0.00075)

plot(m4, resid(., type="p") ~ trialnum, id=0.0025)

The QQ plot looks pretty okay, albeit with a few outliers. D113 is the person who falls furthest off the line, and does so at both ends. The residuals are also fairly evenly distributed across time, though many fall outside the +/- 2 range.

# the following bit wants IDs to be numeric, so take out the "D" from each ID
data <- data %>% mutate(numID = as.numeric(str_replace(ID, "D", "")))
# level 2:
qqnorm(m4, ~ranef(.), id=0.25)

ranInt <- ranef(m4)[1] # create object with the random intercept estimates
ranInt <- unlist(ranInt) 
names(ranInt) <- NULL 
ID <- unlist(subset(data, !duplicated(data$numID), select="numID")) # create an ID variable
plot(ranInt ~ ID)

ranSlope <- ranef(m4)[2] # create an object with the random slopes
Error in `[.data.frame`(ranef(m4), 2) : undefined columns selected

Getting an error here (Error in [.data.frame(ranef(m4), 2) : undefined columns selected). This is because m4 no longer has a random slope: time was dropped, so ranef(m4) contains only the random intercept column and there is no second column to select.
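A quick way to confirm what ranef() returns for this model; a sketch:

str(ranef(m4)) # a data frame with one column per random effect
ncol(ranef(m4)) # 1: only the random intercept, so there is no column 2 to select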

Checking homoscedasticity:

# checking homoscedasticity
# level 1:
data$resid <- as.numeric(residuals(m4)) # create an object with the Level-1 residuals
plot(data$resid ~ data$stim)

# level 2:
library(lattice)
stim <- unlist(subset(data, !duplicated(data$numID), select="stim"))
Unknown or uninitialised column: 'numID'.Length of logical index must be 1 or 9289, not 0
xyplot(ranInt ~ stim)

xyplot(ranSlope ~ stim)
Error in eval(expr, envir, enclos) : object 'ranSlope' not found

Note that again there are a lot of residuals falling outside of the +/-2 range.

Also, there's something weird about the L2 plot. It previously showed only 3 of 5 stimulus types and now isn't showing anything at all; the "Unknown or uninitialised column: 'numID'" error above suggests the stim object being plotted was never rebuilt correctly.

The xyplot error is because the ranSlope object was never created (see the error in the output of the previous chunk); as noted above, m4 has no random slope.

Analyses (Bayesian)

Unconditional means model

First step is to look at the unconditional means model. We know from the exploratory stuff in previous sections that bias_t is normally distributed, so we can tell brms to use normal likelihood (Gaussian).

summary(b1N)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: bias_t ~ 1 + (1 | ID) 
   Data: data (Number of observations: 9289) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Group-Level Effects: 
~ID (Number of levels: 35) 
              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)     0.13      0.02     0.10     0.17       1184 1.00

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    -0.00      0.02    -0.05     0.05       1712 1.00

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     0.99      0.01     0.98     1.01       4000 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

Rhat = 1 across the board, so that’s good.

I think these plots look okay? b_Intercept and SD_ID__Intercept plots are not exactly on top of each other, but do have a lot of overlap in their distributions.

pp_check(b1N, nsamples=20)  # posterior predictive check using normal likelihood

pp_check(b1N, type="error_hist", nsamples=20, binwidth=1) # error histogram (histogram of prediction residuals)

The observed distribution is a very good fit to the predicted data (based on the normal distribution). The error histogram also shows that residuals are normally distributed around 0. See this page for some helpful documentation.

# Check ICC to get sense of how variance is distributed between- and within- people
iSd <- as.numeric(summary(b1N)$random$ID[1,1])
iVar <- iSd^2
residSd <- as.numeric(summary(b1N)$spec_pars[[1,1]])
residVar <- residSd^2
icc1 <- iVar/(iVar+residVar) 
icc1 # this is the unconditional means model - just happens in this dataset that it's a nuisance model
[1] 0.01749842

For the unconditional means model, the ICC is 0.017, meaning that only about 2% of the variance in bias is due to people on average performing differently from each other on the task.
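The same quantity can also be computed draw by draw from the posterior, which yields a credible interval for the ICC rather than a single point estimate; a sketch, assuming brms's default parameter names:

post <- posterior_samples(b1N, pars = c("sd_ID__Intercept", "sigma"))
icc_draws <- post$sd_ID__Intercept^2 / (post$sd_ID__Intercept^2 + post$sigma^2)
quantile(icc_draws, c(.025, .5, .975)) # posterior median and 95% credible interval for the ICC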

Time as fixed and random effect

summary(b2) 
 Family: skew_normal 
  Links: mu = identity; sigma = identity; alpha = identity 
Formula: bias_t ~ 1 + trialnum + (1 + trialnum | ID) 
   Data: data (Number of observations: 9289) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Group-Level Effects: 
~ID (Number of levels: 35) 
                        Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)               0.12      0.03     0.07     0.17       1651 1.00
sd(trialnum)                0.00      0.00     0.00     0.00        510 1.01
cor(Intercept,trialnum)     0.22      0.47    -0.75     0.95       2352 1.00

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    -0.02      0.03    -0.07     0.04       1819 1.00
trialnum      0.00      0.00    -0.00     0.00       4000 1.00

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     0.99      0.01     0.98     1.01       4000 1.00
alpha    -0.02      0.35    -0.64     0.60       4000 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

Note: Estimate = mean of the posterior distribution for each parameter. Group-level effects: when trialnum is 0, the standard deviation of the intercepts is 0.12. There is essentially no random effect of time: the slopes are not at all variable (people do not differ from each other in how their bias changes over trials). Zero is a credible value for the intercept-slope correlation, because its 95% credible interval includes zero.

Note that Rhat for sd(trialnum) in the random effects is 1.01. Per the brms documentation, this does not seem to indicate a convergence issue, given the lack of error messages and the fact that the Rhat for the group-level effect in their example is also 1.01: "If 'Rhat' is considerably greater than 1, the algorithm has not yet converged and it is necessary to run more iterations and / or set stronger priors."

Population-level effects: There is no fixed effect of time, either, given that the upper and lower CI limits are -0.0, 0.0.

The posterior distributions for sd_ID__trialnum and cor_ID__Intercept__trialnum are heavily skewed.

pp_check(b2, nsamples=20)

pp_check(b2, type="error_hist", nsamples=20, binwidth=1) 

Again, the posterior predictive check shows that the observed distribution is a very good fit to the predicted data. The error histogram also shows that residuals are distributed roughly normally around 0.

SKIPPING THE CHUNK BELOW… I can’t get sjstats to install (issue with compiling the glmmTMB package - freaking dependencies are the bane of my existence with all these different packages), and no point in running cross validation on a model that we’re going to drop anyway due to lack of both fixed/random effects of time.

# install.packages("sjstats")
# library(sjstats)
# equi_test(b2) # is there a fixed effect?
# waic(b1N, b2) # does WAIC think we should keep time? If so, given there is no fixed effect, this is due to the random effect accounting for variance.

# b1N <- add_ic(b1N, ic="R2")
# b2 <- add_ic(b2, ic="R2") 
# mean(b1N$R2)
# mean(b2$R2) # does R2 thinks we should keep time? 

# kfold(b1SN, b2, K=10) # use CV to check for over-fitting
# this takes too long, so reduce n(folds) or only perform for final model

Fixed effect of stimulus

b3 <- brm(bias_t ~ 1 + stim + (1 |ID), data=data, family=skew_normal(), control=list(adapt_delta = .95))
saveRDS(b3, "mlm_ada1_b3.rds")
summary(b3) 
# b3 <- readRDS("mlm_ada1_b3.rds")

Similar to the results of the NHST analysis, there is a notable fixed effect of spouse (estimate = 0.12, est. error = 0.03, 95% CI[0.05,0.18]) on bias, such that people show more approach bias on trials where the stimulus is their spouse.

plot(b3)

marginal_effects(b3)

The high density intervals illustrate the positive effect of spouse on bias, which we also see in the marginal effects plot. In the analysis above, the 95% CI for death and stranger stimuli barely crossed zero, which corresponds with what we see in the marginal effects plot, where the upper limit of the error bar is very close to zero (or below). This also corresponds with the effects we saw in NHST, where there were significant or marginally significant effects of death and stranger in addition to spouse.

mean(b3$R2)
[1] 0.02043414

In both models, we have explained about 2 percent of the variance (1.6% vs. 2.0%).

Adding the stimulus*condition interaction (no time)

One of the original study hypotheses was that there would be an interaction (actually, a 3-way interaction with group, but not going to go there for ADA #1; also we may be underpowered for that) between stimulus and condition, such that people might display different response biases for different stimulus categories in condition A vs. B, or vice versa.

summary(b4) 
 Family: skew_normal 
  Links: mu = identity; sigma = identity; alpha = identity 
Formula: bias_t ~ 1 + stim * cond + (1 | ID) 
   Data: data (Number of observations: 9289) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Group-Level Effects: 
~ID (Number of levels: 35) 
              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)     0.13      0.02     0.10     0.17       1088 1.00

Population-Level Effects: 
                   Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept              0.02      0.04    -0.06     0.10       1357 1.00
stimdeath             -0.11      0.05    -0.20    -0.02       1815 1.00
stimliving             0.04      0.05    -0.05     0.13       1791 1.00
stimspouse             0.06      0.05    -0.03     0.16       2030 1.00
stimstranger          -0.09      0.05    -0.18    -0.00       2068 1.00
condB                 -0.05      0.05    -0.15     0.04       1546 1.00
stimdeath:condB        0.11      0.07    -0.02     0.24       1760 1.00
stimliving:condB       0.01      0.06    -0.12     0.13       1959 1.00
stimspouse:condB       0.11      0.07    -0.02     0.24       2012 1.00
stimstranger:condB     0.07      0.06    -0.06     0.20       2009 1.00

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     0.99      0.01     0.98     1.00       4000 1.00
alpha    -0.05      0.35    -0.64     0.58       3415 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

Got some warnings:

Rejecting initial value: Gradient evaluated at the initial value is not finite. Stan can't start sampling from this initial value. Rejecting initial value: Log probability evaluates to log(0), i.e. negative infinity. Stan can't start sampling from this initial value.

Apparently setting priors may help with this.

Here, we see notable effects of death and stranger stimuli, but not condition nor its interaction with stimulus type.

plot(b4)

marginal_effects(b4) 

pp_check(b4, nsamples=20)

pp_check(b4, type="error_hist", nsamples=20, binwidth=1) # residuals seem about the same

# equi_test(b4) 

Nothing terribly new here, except for the marginal effects. Looking at the plot for stim, the distributions for spouse vs. stranger and for spouse vs. death don't seem to overlap, and living loved one and death just barely overlap. Unsurprisingly, people tend to show more approach bias for spouse and living loved one photos compared to death and stranger photos. The third marginal effects plot illustrates the lack of condition or stimulus*condition interaction effects.

b4 <- add_ic(b4, ic="R2") 
mean(b3$R2)
[1] 0.02043414
mean(b4$R2) # essentially no additional variance explained
[1] 0.02156434

The variance accounted for shows a negligible increase with the stimulus*condition interaction in the model (from 0.020 to 0.022).

Using priors

# First find out what default priors are. 
get_prior(bias_t ~ 1 + stim*cond + (1|ID), data=data, family=skew_normal())

We didn’t get here yet, so I’m not too sure what I’m looking at. To be continued…
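As a placeholder until we work through get_prior()'s output, here is a minimal sketch of how priors could be passed to brm(); the specific distributions below are illustrative assumptions, not recommendations:

# hypothetical weakly informative priors, for illustration only
priors <- c(set_prior("normal(0, 1)", class = "b"),
            set_prior("student_t(3, 0, 1)", class = "sd"))
# b5 <- brm(bias_t ~ 1 + stim*cond + (1 | ID), data = data, family = skew_normal(),
#           prior = priors, control = list(adapt_delta = .95))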

---
title: "MLM ADA #1"
author: "Saren Seeley"
date: "10-08-2018"
output:
  html_notebook:
    number_sections: no
    theme: paper
    toc: yes
    toc_float: yes
  html_document:
    df_print: paged
    toc: yes
---
## About
### Overview
This ADA uses data from the project “The effect of intranasal oxytocin on neural functioning in widow(er)s” (#1503744420, PI: Mary-Frances O'Connor, Co-PI: Brian Arizmendi). It was a double-blind randomized crossover study, designed to address conflicting accounts of motivated behavior in complicatedgrief using a novel approach-avoidance task variant (grief AAT; gAAT) that allows for comparison between participant-provided photos and nomothetic grief stimuli. Participants were 40 English-speaking adults ages 55-80, who experienced the death of a spouse or long-term romantic partner between six months and three years prior to their participation. 

### Task
Participants attended two identical experimental sessions, at which they received one of two intranasal sprays (treatment A/B) and then took part in an fMRI scan, which involved a structural sequence, the gAAT, and a resting state sequence.

For the gAAT, five stimulus categories were presented: (1) deceased spouse, (2) living loved one [attachment figure], (3) stranger, (4) generic grief-related scenes, and (5) neutral scenes. Photos provided by the participant were scanned and resized to a standard 750-pixel height and width, and framed in colored frames to match the stimuli used for non-idiographic categories (i.e., stranger, grief-related, neutral). Photos of the stranger were sex-matched to the participant's partner. 

During the task, participants were instructed to follow rules based on an irrelevant characteristic of the stimuli, namely, the color of the frame around the photo (e.g., _"push the joystick when the frame is blue, pull when the frame is yellow"_). The task was completed in two runs of approximately seven minutes each. In the second run, the instructions were reversed (e.g. _"now pull the joystick for blue framed photos"_). Order of instructions was counterbalanced across participants. Stimuli were presented via Inquisit 4 (2014), in a pseudorandomized order determined by genetic algorithm to optimize statistical power and psychological validity (Wager & Nichols, 2003). Each trial last 3000ms, with 2500 ms allowed for response. Intertrial interval was 500ms.

Participants completed the task using an MRI-compatible joystick. The stimuli were animated so that a joystick push would make the image smaller (as if they were pushing it away from them) and a joystick pull would make the image larger (as if they were bringing it nearer to them).

### Variables
The primary dependent variable of interest for the present analysis is **reaction time bias**, computed as RT<sub>push</sub>-RT<sub>pull</sub>. Negative bias scores (faster to push than pull) indicate relative avoidance bias, while positive scores (faster to pull than to push) indicate relative approach bias. Because the two runs were identical in every way except for the instruction provided, each trial had both a push RT and a pull RT (e.g., run1 trial1 was push, run2 trial1 was pull, so subtract RT on run2 trial 1 from run1 trial1 to get the difference - this is further elaborated in the data cleaning script below), unless they failed to reverse the instructions.

Main predictors of interest included **trial number** (in order to establish whether peoples' response bias changed over the 7-minute run), **stimulus** (neutral, spouse, stranger, living loved one, death), and **condition** (treatment A/B - oxytocin or placebo [I'm still blinded]). 

Participants also completed a number of self-report measures (state and trait) and provided biological samples for OXTR genotyping and cortisol assay, some of which were used as predictors.

## Equations
#### Level 1:
We first performed MLM to estimate the effect of trial number (i.e., the random effect of time) on response bias (push RT-pull RT) on a given trial. The level-1 model estimated, for each participant _j_ (_j_ = 1–35), a regression line that predicted each participant’s response bias in milliseconds on each trial _i_ from the trial number. This model was represented as follows: 
<blockquote><center><font size=4>
[bias<i><sub>i</sub></i>]<sub>j</sub> = <i>b</i><sub>0<i><sub>j</sub></i></sub> + <i>b</i><sub>1<i><sub>j</sub></i></sub> [trialnumber<i><sub>i</sub></i>] + <i>r<sub>ij</sub></i>
</font></center>
</blockquote>

where <i>b</i><sub>0<i><sub>j</sub></i></sub>, the intercept, is interpreted as participant _j_’s mean response bias for a given trial; <i>b</i><sub>1<i><sub>j</sub></i></sub>, the slope, represents the effect of trial number on response bias for each participant _j_ (positive coefficients represent the predictive magnitude of trial number on response bias for a given participant); and <i>r<sub>ij</sub></i> is the residual error term.

#### Level 2:
The level-2 models estimated the average effects for the entire sample. The models were represented as follows: 

<blockquote><center><font size=4>
<i>b</i><sub>0</sub><sub>j</sub> = γ<sub>00</sub> + μ<sub>0<sub>j</sub></sub>
<p>
<i>b</i><sub>1<sub><i>j</i></sub></sub> = γ<sub>10</sub> +  μ<sub>1<sub>j</sub></sub>
</font></center></blockquote>

where the intercept, γ<sub>00</sub>, is interpreted as the average response bias for the entire sample; γ<sub>10</sub> represents the average effect of trial number on response bias (positive coefficients represent the predictive magnitude of trial number on response bias for the sample as a whole), and is thus the primary estimate of interest; and μ<sub>0<sub>j</sub></sub> and μ<sub>1<sub>j</sub></sub> are the residual error terms. 

Adding the fixed effect of stimulus type resulted in the following model:

<blockquote><center><font size=4>
<i>b</i><sub>0</sub><sub>j</sub> = γ<sub>00</sub> + γ<sub>01</sub>(stimulus<sub>i</sub>) + μ<sub>0<sub>j</sub></sub>
<p>
<i>b</i><sub>1<sub><i>j</i></sub></sub> = γ<sub>10</sub> + γ<sub>11</sub>(stimulus<sub>i</sub>) + μ<sub>1<sub>j</sub></sub></font></center></blockquote>


## Data cleaning
<i>Note: Went back to the raw data due to some inconsistencies I found in the `OT_ADAA_Omnibus.xlsx` file from BA.</i>

<b>Pre-R steps</b>:

1. Removed all .iqdat files under 25KB or so in order to filter out pilot data and false starts.<br>
2. Removed non-study data (i.e., tests and OT undergrad files). <br>
3. Organized raw .iqdat files for each visit by run 1/run 2 (two separate folders) based on date/time stamp. <p>
<i>In Terminal, within each of the folders in turn...</i><p>
4. Changed file extensions to .tsv (tab-separated) from .iqdat: `find . -iname "*.iqdat" -exec bash -c 'mv "$0" "${0%\.iqdat}.tsv"' {} \;`<br>
5. Merged the files: `cat *.tsv > OT_gAAT_run-1.tsv` (or `OT_gAAT_run-2.tsv`)
<strike>5. Merged the files: `mlr --tsvlite cat *.tsv > OT_gAAT_run-1.tsv` (or `OT_gAAT_run-2.tsv`)
<ul>5a. `mlr --tsvlite` drops the header from all but the first file, so column names are not repeated throughout the dataset.</ul>
<ul>5b. This requires Miller, which can be installed via Homebrew on OSX (http://johnkerl.org/miller/doc/build.html)</ul></strike> <i>tried to make this work but no luck</i>

<i>Notes:<br>
<ul><li>D117: Had 3 visits due to scanner technical issues. Use "_b" and "_c" as visit 1 and visit 2 respectively.</li>
<li>D114: Has multiple files from visit 2. Use the last two.</li>
<li>D135: Use files from 6/06/16 and 6/17/18 (scanner issues on 06/13/16). Use "_b" and "_c" as visit 1 and visit 2 respectively.</li>
<li>D147: Visit 1 run 2 is partial data for some reason (I do have notes that she switched back to <i>PUSH-YELLOW</i> during run 2, but did not re-run because running way over time.)</li></ul></i>
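As an alternative to the Terminal merge in step 5 (`cat *.tsv > ...`), the same merge could be done in R, which also sidesteps the repeated-header issue handled in the import chunk below (a sketch; the folder path is the one used in that chunk):
```{r}
# sketch: merge the per-session Inquisit files in R instead of with `cat`
library(tidyverse)
run1_files <- list.files("~/Desktop/OT-gAAT_MLM-data/Run1_tsv",
                         pattern = "\\.tsv$", full.names = TRUE)
raw_r1_alt <- purrr::map_dfr(run1_files, read_tsv,
                             col_types = cols(.default = col_character()))
```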

## Creating the MLM dataset
### Import data
```{r}
library(tidyverse)
filter <- dplyr::filter
select <- dplyr::select

# read in the data (only importing a subset of columns)
raw_r1 <- data.frame(read_tsv("~/Desktop/OT-gAAT_MLM-data/Run1_tsv/OT_gAAT_run-1.tsv", col_types=cols_only(
  date = col_integer(),
  time = col_character(),
  subject = col_character(),
  blockcode = col_character(),
  blocknum = col_character(),
  trialcode = col_character(),
  values.trialcode = col_character(),
  values.stimulus = col_character(),
  values.initialresponse = col_character(),
  values.RT = col_number()
)))

raw_r2 <- data.frame(read_tsv("~/Desktop/OT-gAAT_MLM-data/Run2_tsv/OT_gAAT_run-2.tsv", col_types=cols_only(
  date = col_integer(),
  time = col_character(),
  subject = col_character(),
  blockcode = col_character(),
  blocknum = col_character(),
  trialcode = col_character(),
  values.trialcode = col_character(),
  values.stimulus = col_character(),
  values.initialresponse = col_character(),
  values.RT = col_number()
)))

# header line repeats when I used `cat` to merge, so get those out of there
raw_r1 <- raw_r1 %>% filter(!is.na(date))
raw_r2 <- raw_r2 %>% filter(!is.na(date))

unique(raw_r1$subject)
# D118 is showing up as "test" for some reason (checked against the .iqdat file), and D130_b is missing
# both of D130's and D142's visits were entered as "D130"/"D142", so change the subject ID for the 052416 visit to D130_b and the 111516 visit to D142_b
raw_r1$subject[raw_r1$date == 020916] <- "D118"
raw_r1$subject[raw_r1$date == 052416] <- "D130_b"
raw_r1$subject[raw_r1$date == 111516] <- "D142_b"

# fixing some other stuff in a similar vein
raw_r1$subject[raw_r1$subject == "D101_2"] <- "D101_b"
raw_r1$subject[raw_r1$subject == "D102_B"] <- "D102_b"
raw_r1$subject[raw_r1$subject == "D117_b"] <- "D117"
raw_r1$subject[raw_r1$subject == "D117_c"] <- "D117_b"
raw_r1$subject[raw_r1$subject == "D135_c"] <- "D135_b"

unique(raw_r1$subject) # 78 unique - that is correct

unique(raw_r2$subject)
# some similar issues to fix
raw_r2$subject[raw_r2$subject == "D101_2Y"] <- "D101_b"
raw_r2$subject[raw_r2$subject == "D102_B"] <- "D102_b"
raw_r2$subject[raw_r2$subject == "D107_B_2"] <- "D107_b"
raw_r2$subject[raw_r2$subject == "D117_b"] <- "D117"
raw_r2$subject[raw_r2$subject == "D117_c"] <- "D117_b"
raw_r2$subject[raw_r2$subject == "D126_B"] <- "D126_b"
raw_r2$subject[raw_r2$subject == "D135_c"] <- "D135_b"
raw_r2$subject[raw_r2$date == 111516] <- "D142_b"


unique(raw_r2$subject) # 78 unique - that is correct

str(raw_r1)
str(raw_r2)
```

```{r}
# take out ITIs
unique(raw_r1$trialcode)
mlm_r1 <- filter(raw_r1, grepl("^A",trialcode))
unique(mlm_r1$trialcode)

unique(raw_r2$trialcode)
mlm_r2 <- filter(raw_r2, grepl("^A",trialcode))
unique(mlm_r2$trialcode)

# add a "trialnum" variable, grouped by ID
mlm_r1 <- mlm_r1 %>% group_by(subject) %>% mutate(trialnum = row_number())
View(mlm_r2)
mlm_r2 <- mlm_r2 %>% group_by(subject) %>% mutate(trialnum = row_number())
View(mlm_r2)

mlm_r1 %>% group_by(trialnum) %>% count(trialnum) # everyone has 144 trials, hooray!
mlm_r2 %>% group_by(trialnum) %>% count(trialnum) # everyone has 144 trials except one person, who has 177
mlm_r2 %>% group_by(subject) %>% count() # that person is D122_b
# no clue why that happened, although I do vaguely remember we had one session where the task seemed to go on and on and we force quit it?
# remove trials 145-177 for D122_b

mlm_r2 <- filter(mlm_r2, trialnum <=144)
mlm_r2 %>% group_by(trialnum) %>% count(trialnum)
# now everyone has 144 trials

# add a column for "run"
mlm_r1$run <- "1"
mlm_r2$run <- "2"

# add a column for "visit"
mlm_r1$visit <- as.factor(ifelse(grepl("*_b", mlm_r1$subject), "2", "1"))
mlm_r2$visit <- as.factor(ifelse(grepl("*_b", mlm_r2$subject), "2", "1"))

mlm_r1 %>% group_by(subject) %>% count(visit) # all looks good
mlm_r2 %>% group_by(subject) %>% count(visit) # all looks good

# now remove the "_b" from visit 2 IDs
mlm_r1 <- mlm_r1 %>% ungroup() %>%
  mutate(subject = str_replace(subject, "_b", ""))
unique(mlm_r1$subject)

mlm_r2 <- mlm_r2 %>% ungroup() %>%
  mutate(subject = str_replace(subject, "_b", ""))
unique(mlm_r2$subject)

# merge data from the 2 runs 
mlm_bx <- bind_rows(mlm_r1, mlm_r2)

# split up by visit
mlm_v1 <- mlm_bx %>% filter(visit == "1")
mlm_v2 <- mlm_bx %>% filter(visit == "2")

mlm_v1 %>% group_by(trialnum) %>% count(trialnum) # everyone has 144 trials, hooray!
mlm_v2 %>% group_by(trialnum) %>% count(trialnum) # everyone has 144 trials, hooray!
```

```{r}
# add columns for "cond" at visit 1 and visit 2 [condition: A or B]
# load in the randomization data
randomize <- readRDS("~/Dropbox/GLASS Lab/OT Study/data/cleaned-data/randomization_clean.rds")
head(randomize)

mlm_v1 <- mlm_v1 %>%
  rename("ID" = subject)

mlm_v2 <- mlm_v2 %>%
  rename("ID" = subject)

# get the IDs for everyone who got treatment A at visit 1 
IDs_v1_txA <- randomize %>% filter(tx_v1 == "A")
View(IDs_v1_txA)

# ifelse statement: if ID = any of those listed in txA_list, make cond_v1 = "A", else make it "B"
mlm_v1 <- mlm_v1 %>%
  mutate(cond_v1 = as.factor(ifelse(ID == "D101" | ID == "D105" | ID == "D109" | ID == "D110" | ID == "D114" | ID == "D115" | ID == "D116" | ID == "D118" | ID == "D119" | ID == "D120" | ID == "D123" | ID == "D124" | ID == "D126" | ID == "D128" | ID == "D132" | ID == "D135" | ID == "D136" | ID == "D137" | ID == "D138" | ID == "D140" | ID == "D141" | ID == "D145" | ID == "D148" | ID == "D149" | ID == "D117", "A", "B")), cond_v2 = as.factor(ifelse(ID == "D101" | ID == "D105" | ID == "D109" | ID == "D110" | ID == "D114" | ID == "D115" | ID == "D116" | ID == "D118" | ID == "D119" | ID == "D120" | ID == "D123" | ID == "D124" | ID == "D126" | ID == "D128" | ID == "D132" | ID == "D135" | ID == "D136" | ID == "D137" | ID == "D138" | ID == "D140" | ID == "D141" | ID == "D145" | ID == "D148" | ID == "D149" | ID == "D117", "B", "A")))
head(mlm_v1)

# ifelse statement: if ID = any of those listed in txA_list, make cond_v2 = "B", else make it "A" (people who were A at visit 1 should be B for visit 2, and vice versa)
mlm_v2 <- mlm_v2 %>%
  mutate(cond_v2 = as.factor(ifelse(ID == "D101" | ID == "D105" | ID == "D109" | ID == "D110" | ID == "D114" | ID == "D115" | ID == "D116" | ID == "D118" | ID == "D119" | ID == "D120" | ID == "D123" | ID == "D124" | ID == "D126" | ID == "D128" | ID == "D132" | ID == "D135" | ID == "D136" | ID == "D137" | ID == "D138" | ID == "D140" | ID == "D141" | ID == "D145" | ID == "D148" | ID == "D149" | ID == "D117", "B", "A")), cond_v1 = as.factor(ifelse(ID == "D101" | ID == "D105" | ID == "D109" | ID == "D110" | ID == "D114" | ID == "D115" | ID == "D116" | ID == "D118" | ID == "D119" | ID == "D120" | ID == "D123" | ID == "D124" | ID == "D126" | ID == "D128" | ID == "D132" | ID == "D135" | ID == "D136" | ID == "D137" | ID == "D138" | ID == "D140" | ID == "D141" | ID == "D145" | ID == "D148" | ID == "D149" | ID == "D117", "A", "B")))

levels(mlm_v1$cond_v1)
levels(mlm_v2$cond_v2)

head(mlm_v1)
  
IDs_v1_txA$ID
mlm_v1 %>% group_by(cond_v1) %>% count(ID) # IDs and cond match txA_list
mlm_v2 %>% group_by(cond_v2) %>% count(ID) # IDs and cond match txA_list

mlm_bx <- bind_rows(mlm_v1,mlm_v2)
View(mlm_bx)

# save it
saveRDS(mlm_bx, "~/Dropbox/2018Fall/FSHD617C/mlm_bx.rds")
```
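The same condition assignment can be written more compactly with `%in%`; a sketch, assuming the hard-coded ID list above matches `IDs_v1_txA$ID` (which is what the counts above were checking):
```{r}
# sketch: assign condition by membership in the treatment-A-at-visit-1 list
txA_ids <- IDs_v1_txA$ID
mlm_v1_alt <- mlm_v1 %>%
  mutate(cond_v1 = as.factor(ifelse(ID %in% txA_ids, "A", "B")),
         cond_v2 = as.factor(ifelse(ID %in% txA_ids, "B", "A")))
mlm_v2_alt <- mlm_v2 %>%
  mutate(cond_v1 = as.factor(ifelse(ID %in% txA_ids, "A", "B")),
         cond_v2 = as.factor(ifelse(ID %in% txA_ids, "B", "A")))
```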

```{r}
# rename some variables
colnames(mlm_bx)
mlm_bx <- mlm_bx %>% rename("push_pull" = values.initialresponse,
                  "gAAT_RT" = values.RT)
head(mlm_bx)

# add a new variable for condition
mlm_bx1 <- mlm_bx %>% mutate(cond = as.factor(ifelse(cond_v1 == "A" & visit == "1", "A",
                              ifelse(cond_v2 == "A" & visit == "2", "A",
                                    ifelse(cond_v1 == "B" & visit == "1", "B",
                                           ifelse(cond_v2 == "B" & visit == "2", "B", NA))))))

View(mlm_bx1) # check that it coded the "cond" column correctly      

# add a new variable for stimulus category
# in values.stimulus:
# 1 = spouse, 2 = living loved one/WHOTO, 3 = stranger, 4 = nomothetic death-related, 5 = neutral images

sort(unique(mlm_bx1$values.stimulus))

mlm_bx2 <- mlm_bx1 %>%
  mutate(stim = as.factor(ifelse(values.stimulus == "1B_1.jpg"|values.stimulus == "1B_2.jpg"|values.stimulus == "1B_3.jpg"|values.stimulus == "1Y_1.jpg"|values.stimulus == "1Y_2.jpg"|values.stimulus == "1Y_3.jpg", "spouse", 
                       ifelse(values.stimulus == "2B_1.jpg"|values.stimulus == "2B_2.jpg"|values.stimulus == "2B_3.jpg"|values.stimulus == "2Y_1.jpg"|values.stimulus == "2Y_2.jpg"|values.stimulus == "2Y_3.jpg", "living",
                              ifelse(values.stimulus == "3B_1.jpg"|values.stimulus == "3B_2.jpg"|values.stimulus == "3B_3.jpg"|values.stimulus == "3Y_1.jpg"|values.stimulus == "3Y_2.jpg"|values.stimulus == "3Y_3.jpg", "stranger",
                                     ifelse(values.stimulus == "4B_1.jpg"|values.stimulus == "4B_2.jpg"|values.stimulus == "4B_3.jpg"|values.stimulus == "4Y_1.jpg"|values.stimulus == "4Y_2.jpg"|values.stimulus == "4Y_3.jpg", "death",
                                            ifelse(values.stimulus == "5B_1.jpg"|values.stimulus == "5B_2.jpg"|values.stimulus == "5B_3.jpg"|values.stimulus == "5Y_1.jpg"|values.stimulus == "5Y_2.jpg"|values.stimulus == "5Y_3.jpg", "neutral", NA)))))))

levels(mlm_bx2$stim)
mlm_bx2$stim <- relevel(mlm_bx2$stim, ref = "neutral") # make neutral the reference level
                       
# save it
saveRDS(mlm_bx2, "~/Dropbox/2018Fall/FSHD617C/mlm_bx2.rds")
```
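Since the category is encoded in the first character of each filename (see the key in the comment above), the same recode can also be done with a small lookup table; a sketch:
```{r}
# sketch: map the leading digit of values.stimulus to its category label
stim_key <- c("1" = "spouse", "2" = "living", "3" = "stranger",
              "4" = "death", "5" = "neutral")
mlm_bx2_alt <- mlm_bx1 %>%
  mutate(stim = factor(stim_key[substr(values.stimulus, 1, 1)]),
         stim = relevel(stim, ref = "neutral"))
```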

### Bias scores
Separate out push minus pull, then subtract run1 from run2.
```{r}

mlm_r1.1 <- mlm_bx2 %>% filter(run == "1")
tail(mlm_r1.1) # does it line up?
mlm_r2.1 <- mlm_bx2 %>% filter(run == "2")
tail(mlm_r2.1) 

p99r1 <- quantile(mlm_r1.1$gAAT_RT, 0.99) 
p01r1 <- quantile(mlm_r1.1$gAAT_RT, 0.01) 
p99r1 # 99th percentile RTs >= 1713.48ms 
p01r1 # 1st percentile RTs <= 472ms

p99r2 <- quantile(mlm_r2.1$gAAT_RT, 0.99) 
p01r2 <- quantile(mlm_r2.1$gAAT_RT, 0.01) # 1st percentile RTs <= 463ms
p99r2 # 99th percentile RTs >= 1712.45ms 
p01r2 # 1st percentile RTs <= 463ms

mlm_r1.2 <- mlm_r1.1 %>%
  mutate(outlier_RT = ifelse(gAAT_RT <= p01r1 | gAAT_RT >= p99r1, 1, 0))
table(mlm_r1.2$outlier_RT) 
227/11005 # 227 trials will be dropped, 11005 will be retained...about 2% of trials.

mlm_r2.2 <- mlm_r2.1 %>%
  mutate(outlier_RT = ifelse(gAAT_RT <= p01r2 | gAAT_RT >= p99r2, 1, 0))
table(mlm_r2.2$outlier_RT) 
230/11002 # 230 trials will be dropped, 11002 will be retained...about 2% of trials.

mlm_r1r2 <- bind_cols(mlm_r1.2, mlm_r2.2) # 11232 trials total.
head(mlm_r1r2) # make sure that trials in the two runs line up post-merge
tail(mlm_r1r2) # make sure that trials in the two runs line up post-merge

# get rid of any trials with outliers in either run1 or run2 
# (need run1/run2 to be the same length so that the trials match up)
mlm_r1r2.1 <- mlm_r1r2 %>% filter(outlier_RT == "0" & outlier_RT1 == "0") 
count(mlm_r1r2.1) # 10790 obs.
10790/11232 # 96% of trials retained = 4% of trials dropped. That feels acceptable.

push_pull <- mlm_r1r2.1 %>% filter(push_pull == "PUSH" & push_pull1 == "PULL") %>% mutate(bias = (gAAT_RT-gAAT_RT1))
pull_push <- mlm_r1r2.1 %>% filter(push_pull == "PULL" & push_pull1 == "PUSH") %>% mutate(bias = (gAAT_RT1-gAAT_RT))

mlm_pp <- bind_rows(push_pull,pull_push)

View(mlm_pp)
```
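A quick tabulation of the run-1/run-2 response pairings makes explicit what the two filters above keep and drop (trials where the participant responded in the same direction in both runs are the ones that fall out); a sketch:
```{r}
# sketch: how do responses on the same trial pair up across the two runs?
mlm_r1r2.1 %>% count(push_pull, push_pull1)
```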

### Transformations
```{r}
library(psych)
describe(mlm_pp$bias) # skewness is 0.1, kurtosis is 1.69
hist(mlm_pp$bias)
qqnorm(mlm_pp$bias)
ggplot(mlm_pp, aes(sample=bias))+stat_qq()+stat_qq_line() # not a good look at all...

# install.packages("bestNormalize")
# what's the best transformation for my data?
library(bestNormalize)

orderNorm_bias <- bestNormalize(mlm_pp$bias) # suggested orderNorm transform as best
mlm_pp$bias_t <- orderNorm_bias$x.t # pull x.t from orderNorm_bias and assign it back into dataset

describe(mlm_pp$bias_t)
hist(mlm_pp$bias_t)
qqnorm(mlm_pp$bias_t)
ggplot(mlm_pp, aes(sample=bias_t))+stat_qq()+stat_qq_line() # so much better!

# back-transformation, just to show how to do it:
p <- predict(orderNorm_bias)
    x2 <- predict(orderNorm_bias, newdata = p, inverse = TRUE) # x2 is the back-transformed data
    all.equal(x2, mlm_pp$bias_t) 

mlm_bx3 <- mlm_pp
saveRDS(mlm_bx3, "~/Dropbox/2018Fall/FSHD617C/mlm_bx3.rds")
```

### Checking the data
```{r}
table <- mlm_bx3 %>% group_by(ID) %>% group_by(cond) %>% count(ID) 
table
# missing data...
# cond A: D101 (3 trials), D141 (18 trials), D148 (19 trials)
# cond B:  D147 (1 trial)

describe(table$n)
ggplot(table, aes(sample=n))+stat_qq()+stat_qq_line() # can see those 4 on the QQ plot

# where are the trials getting dropped?
mlm_r1r2 %>% group_by(ID) %>% group_by(cond) %>% count(ID) # everyone has 144 here
mlm_r1r2.1 %>% group_by(ID) %>% group_by(cond) %>% count(ID) 
# after getting rid of outliers, most people missing a few trials (D107 cond A/B and D113 cond A missing >20)

mlm_r1r2.1 %>% filter(ID=="D101" & cond=="A")
mlm_r1r2.1 %>% filter(ID=="D141" & cond=="A")
mlm_r1r2.1 %>% filter(ID=="D148" & cond=="A")
mlm_r1r2.1 %>% filter(ID=="D147" & cond=="B")
# so trials are getting dropped because they didn't effectively reverse the instructions or they reverted to the old instructions
# in essence, this removes error trials

# create a "drop" variable for these 4, just in case...
mlm_bx3 <- mlm_bx3 %>% mutate(low_n_trials = ifelse(ID == "D101" | ID == "D141" | ID == "D147" | ID == "D148", 1, 0))

# check whether the rows line up
notlinedup <- mlm_bx3 %>% filter(stim != stim1)
notlinedup
notlinedup <- mlm_bx3 %>% filter(trialcode != trialcode1)
notlinedup
notlinedup <- mlm_bx3 %>% filter(trialnum != trialnum1)
notlinedup
# 0 rows returned; all the rows match

head(mlm_bx3)
saveRDS(mlm_bx3, "~/Dropbox/2018Fall/FSHD617C/mlm_bx3.rds")
```

### Add time-invariant variables
Next step is to add demographics and other variables that we might potentially want to include in the model but that do not change over time. Looking at the codebook for the master dataset, these include:

Demographics:

* `age`
* `age_yrs`
* `sex_m`
* `ethnicity_hisp`
* `race`
* `timesincedeath`

Genetic variables:

* `rs2254298`
* `rs2268498`
* `rs53576`
* `rs2254298_A.carrier`
* `rs2268498_C.carrier`
* `rs53576_A.carrier`

Self-report measures:

* `tot_icg` [grief]
* `tot_bdi` [depression]
* `tot_bgq` [brief grief questionnaire with impairment item]
* `tot_ecrrs_global_anx` [anxious attachment - global]
* `tot_ecrrs_global_avoid` [avoidant attachment - global]
* `tot_ecrrs_spouse_anx` [anxious attachment - spouse]
* `tot_ecrrs_spouse_avoid` [avoidant attachment - spouse]
* `tot_ysl` [yearning for the deceased]


```{r}
# load in master dataset
master <- readRDS("~/Dropbox/GLASS Lab/OT Study/data/cleaned-data/selfreports_oxtr_cortisol_random_AB_bx_clean.rds") 

# subset the selected variables
mlm_add <- subset(master, select=c(ID,age, age_yrs, sex_m, ethnicity_hisp, race, timesincedeath, rs2254298, rs2268498, rs53576, rs2254298_A.carrier, rs2268498_C.carrier, rs53576_A.carrier, tot_bdi, tot_bgq, tot_ecrrs_global_anx,tot_ecrrs_global_avoid, tot_ecrrs_spouse_anx, tot_ecrrs_spouse_avoid, tot_ysl, tot_icg))
head(mlm_add)

# merge into the behavioral dataset
mlm_bx4 <- left_join(mlm_bx3, mlm_add, by="ID")
head(mlm_bx4)

# save it
saveRDS(mlm_bx4, "~/Dropbox/2018Fall/FSHD617C/mlm_bx4.rds")
```


## Exploratory analysis and checking assumptions

First step is to overview the entire dataset using `summarytools`. 
```{r}
library(summarytools)

# overview the entire dataset
# this will look like TOTAL crap in the console (notebook output)
# use view(dfSummary(mlm_bx4)) from the console to see a much more readable and attractive output
dfSummary(mlm_bx4)
 # no missing data on any of the variables!
```

### Descriptives
```{r}
library(psych)
library(stargazer)
library(summarytools)

# descriptives for the time-invariant variables are going to be wrong if we use mlm_bx4 (since observations are duplicated)
# so going back to the mlm_add object...
# note that n = 40, but n = 39 with RT data
# D149's behavioral data was dropped (ran outside of scanner), so drop his case

mlm_add <- filter(mlm_add, !grepl("D149",ID))
unique(mlm_add$ID) # it worked
nums <- select_if(mlm_add, is.numeric) # just get the numeric variables
view(dfSummary(nums)) 
stargazer(nums, type="text", title="Descriptives", digits=2, out="table1.txt", notes = "Note: 'stargazer' makes nicer looking tables, but doesn't provide skewness or kurtosis statistics, so we'll use 'psych' for that.") 

describe(nums)
```

### Frequencies
```{r}
# which variables are factors?
mlm_add %>% select_if(is.factor) %>% colnames()

# hmm, rs2268498 and rs53576 should also be factors, so fix that now
mlm_add <- mlm_add %>% mutate(rs2268498 = recode_factor(rs2268498, "C/T" = "C/T", "CC" = "CC", "TT" = "TT"), rs53576 = recode_factor(rs53576, "A/G" = "A/G", "AA" = "AA", "GG" = "GG"))
mlm_add %>% select_if(is.factor) %>% colnames()

# function to count frequencies for variables that are factors...this isn't working
# countAll <- function(basedata)
# {
#	facs <- sapply(basedata, is.factor)
#	facdata <- basedata[ ,facs]
#	for(i in 1:length(facdata))
#	{
#		count(facdata)
#	 }
# }
# countAll(mlm_bx4)

mlm_add %>% count(sex_m) # 28 females, 11 males
mlm_add %>% count(race) # 38 white, 1 another race not listed
mlm_add %>% count(ethnicity_hisp) # 37 non-Hispanic/Latino, 2 Hispanic/Latino
mlm_add %>% count(rs2254298) # 28 GG, 10 A/G, 1 NA/missing (0 AA)
mlm_add %>% count(rs2268498) # 17 C/T, 10 CC, 12 TT
mlm_add %>% count(rs53576) # 21 A/G, 18 GG (0 AA)
mlm_add %>% count(rs2254298_A.carrier) # yes = 10, no = 28, NA = 1
mlm_add %>% count(rs2268498_C.carrier) # yes = 27, no = 12
mlm_add %>% count(rs53576_A.carrier) # yes = 21, no = 18
```
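A working version of what the commented-out `countAll()` helper above was aiming for (a sketch):
```{r}
# sketch: print a frequency table for every factor column in a data frame
count_factors <- function(basedata) {
  facs <- names(basedata)[sapply(basedata, is.factor)]
  for (v in facs) {
    cat("\n", v, "\n")
    print(table(basedata[[v]], useNA = "ifany"))
  }
}
count_factors(mlm_add)
```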

### Checking distribution shapes and normality
Got a preview of this above when we used `summarytools`, but here's a closer look at some histograms and QQ plots.
```{r}
# a function to produce histograms for all numeric variables in a dataframe
histAll <- function(basedata)
{
	nums <- sapply((basedata), is.numeric)
	numdata <- basedata[ ,nums]
	par(mfrow=c(4,4))
	for(i in 1:length(numdata))
	{
		hist(numdata[,i], xlab=NULL, main=names(numdata[i]))
	}
}

histAll(as.data.frame(mlm_bx4))

```

### Centering L2 predictors

```{r}
# pull out only the total score variables, age, timesincedeath
gcvars <- as.data.frame(mlm_bx4[, grep('^tot|age|timesincedeath|ID', names(mlm_bx4))])
gcvars1 <- gcvars[!duplicated(gcvars$ID), ]
stargazer(gcvars1, type="text", median = T, digits=3)

gmean_age <- mean(gcvars1$age)
gmean_age_yrs <- mean(gcvars1$age_yrs)
gmean_time <- mean(gcvars1$timesincedeath)
gmean_bdi <- mean(gcvars1$tot_bdi)
gmean_bgq <- mean(gcvars1$tot_bgq)
gmean_ecrrs_g_anx <- mean(gcvars1$tot_ecrrs_global_anx)
gmean_ecrrs_g_av <- mean(gcvars1$tot_ecrrs_global_avoid)
gmean_ecrrs_s_anx <- mean(gcvars1$tot_ecrrs_spouse_anx)
gmean_ecrrs_s_av <- mean(gcvars1$tot_ecrrs_spouse_avoid)
gmean_icg <- mean(gcvars1$tot_icg)
gmean_ysl <- mean(gcvars1$tot_ysl)

mlm_bx5 <- mlm_bx4 %>% mutate(gmean_age = age-gmean_age, gmean_age_yrs = age_yrs-gmean_age_yrs, gmean_timesincedeath = timesincedeath-gmean_time, gmean_bdi = tot_bdi-gmean_bdi, gmean_bgq = tot_bgq-gmean_bgq, gmean_ecrrs_global_anx = tot_ecrrs_global_anx-gmean_ecrrs_g_anx, gmean_ecrrs_global_avoid = tot_ecrrs_global_avoid-gmean_ecrrs_g_av, gmean_ecrrs_spouse_anx = tot_ecrrs_spouse_anx-gmean_ecrrs_s_anx, gmean_ecrrs_spouse_avoid = tot_ecrrs_spouse_avoid-gmean_ecrrs_s_av, gmean_ysl = tot_ysl-gmean_ysl, gmean_icg = tot_icg-gmean_icg)
stargazer(as.data.frame(mlm_bx5), type="text") # note that means of the grand-mean-centered variables are not exactly zero - people contribute different numbers of trials, so the trial-level means are weighted unevenly

# test that hypothesis with the non-duplicated data
gcvars2 <- gcvars1 %>% mutate(gmean_age = age-gmean_age, gmean_age_yrs = age_yrs-gmean_age_yrs, gmean_timesincedeath = timesincedeath-gmean_time, gmean_bdi = tot_bdi-gmean_bdi, gmean_bgq = tot_bgq-gmean_bgq, gmean_ecrrs_global_anx = tot_ecrrs_global_anx-gmean_ecrrs_g_anx, gmean_ecrrs_global_avoid = tot_ecrrs_global_avoid-gmean_ecrrs_g_av, gmean_ecrrs_spouse_anx = tot_ecrrs_spouse_anx-gmean_ecrrs_s_anx, gmean_ecrrs_spouse_avoid = tot_ecrrs_spouse_avoid-gmean_ecrrs_s_av, gmean_ysl = tot_ysl-gmean_ysl, gmean_icg = tot_icg-gmean_icg)
stargazer(as.data.frame(gcvars2), type="text") # yes, the latter are zero

saveRDS(mlm_bx5, "~/Dropbox/2018Fall/FSHD617C/mlm_bx5.rds")
```

### Evaluate missing data 
```{r}
mlm_bx5 %>% 
  select_if(function(x) any(is.na(x))) %>% 
  summarise_all(funs(mean(is.na(.)))) -> mlm_bx5_NAs # or use sum() instead of mean() to find n missing values

mlm_bx5_NAs <- gather(mlm_bx5_NAs) # create long data that lists variables with any NAs, and % of data missing
mlm_bx5_NAs
```

One person is missing rs2254298 genotype (`rs2254298` and `rs2254298_A.carrier`), which we already knew about.

### Scale reliabilities
This dataset does not include the item-level data, but we can get that from the master dataset we loaded in earlier.
```{r}
library(psych)
mlm_scales <- filter(master, !grepl("D149",ID))

## BDI
mlm_tot_bdi <- subset(mlm_scales, select=c(bdi_1c:bdi_21c))  
psych::alpha(mlm_tot_bdi)

## BGQ
mlm_tot_bgq <- subset(mlm_scales, select=c(bgq_1:bgq_5))
psych::alpha(mlm_tot_bgq)

## ICG
mlm_tot_icg <- subset(mlm_scales, select=c(icg_1:icg_19))
psych::alpha(mlm_tot_icg)

## YSL
mlm_tot_ysl <- subset(mlm_scales, select=c(ysl_1:ysl_21))
psych::alpha(mlm_tot_ysl)

## ECRRS-global anxiety
mlm_tot_ecrrs_global_anx <- subset(mlm_scales,  select=c(ecrrs_global_7c:ecrrs_global_9c))
psych::alpha(mlm_tot_ecrrs_global_anx)

## ECRRS-global avoid
mlm_tot_ecrrs_global_avoid <- subset(mlm_scales,  select=c(ecrrs_global_1r:ecrrs_global_4r, ecrrs_global_5c, ecrrs_global_6c))
psych::alpha(mlm_tot_ecrrs_global_avoid)

## ECRRS-spouse anxiety
mlm_tot_ecrrs_spouse_anx <- subset(mlm_scales,  select=c(ecrrs_spouse_7c:ecrrs_spouse_9c))
psych::alpha(mlm_tot_ecrrs_spouse_anx)

## ECRRS-spouse avoid
mlm_tot_ecrrs_spouse_avoid <- subset(mlm_scales,  select=c(ecrrs_spouse_1r:ecrrs_spouse_4r, ecrrs_spouse_5c, ecrrs_spouse_6c))
psych::alpha(mlm_tot_ecrrs_spouse_avoid)

```

### Drop the 4 people with low trial counts
```{r}
data <- filter(mlm_bx5, !grepl("D101|D141|D147|D148",ID))
unique(data$ID)
```

## OLS estimates for exploratory purposes

```{r}
# getting regular regression estimates one person at a time for exploratory purposes:
lmEst <- by(data, data$ID, function(x) summary(lm(bias_t ~ trialnum, data=x)))
lmEst
```

### Stem plot for the intercepts based on the OLS estimates:
```{r}
int <- by(data, data$ID, function(x) coefficients(lm(bias_t ~ trialnum, data=x))[[1]])
int

summary(int)
stem(int, scale=2)
hist(int)
```
### Stem plot for the slopes based on the OLS estimates
```{r}
slope <- by(data, data$ID, function(x) coefficients(lm(bias_t ~ trialnum, data=x))[[2]])
slope

summary(slope)
stem(slope, scale=2)
hist(slope)
```
Note that these are all extremely close to zero - probably because of the lack of time effect.

### R-square for OLS models
```{r}

rsq <- by(data, data$ID, function(x) summary(lm(bias_t ~ trialnum, data=x))$r.squared)
summary(rsq)
stem(rsq, scale=2)

cor(int, slope)
```
The correlation between intercepts and slopes is -0.475, so people who show less avoidance bias overall change less over the course of the task.
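
To eyeball that relationship, here is a quick base-R scatterplot of the per-person OLS estimates (just a sketch using the `int` and `slope` objects created above):
```{r}
# per-person OLS intercepts vs. slopes, with a simple fit line to show the negative association
plot(unlist(int), unlist(slope),
     xlab = "OLS intercept (bias_t at trialnum = 0)",
     ylab = "OLS slope (change in bias_t per trial)")
abline(lm(unlist(slope) ~ unlist(int)))
```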
 
## Graphing multilevel data for exploratory purposes
```{r}
library(lattice)
xyplot(bias_t ~ trialnum | ID, data = data, as.table=T) # too many people/hard to see, try random subset

ids <- sample(unique(data$ID), 12)
ids
temp <- data[data$ID %in% ids, ]
xyplot(bias_t ~ trialnum | ID, data = temp, as.table=T) 

# add fit line(s)
xyplot(bias_t ~ trialnum | ID, data = temp, type=c("p","r")) # linear fit
xyplot(bias_t ~ trialnum | ID, data = temp, type=c("p","smooth")) # curvilinear fit

# graph of the raw data for each person
interaction.plot(data$trialnum, data$ID, data$bias_t)

# generate linear fitted values for each person and then plot all together in one plot
fit <- by(data, data$ID, function(x) fitted.values(lm(bias_t ~ trialnum, data=x)))
fit2 <- unlist(fit)
interaction.plot(data$trialnum, data$ID, fit2, xlab="trial number", ylab="RT bias (transformed)")
```
There does not seem to be either a linear or curvilinear effect of trial number (i.e., time) on RT bias in the task.


## Analyses (NHST)
<center><i>Remember, the effect of time refers to `trialnum` (this is NOT the `time` variable; `time` refers to the time of day when the task was run).</i></center>

### Unconditional means model
```{r}
library(nlme)

# unconditional means model
m1 <- lme(bias_t ~ 1, random = ~ 1 | ID, data=data, method="ML")
summary(m1)

# ICC
iVar <- as.numeric(VarCorr(m1)[1,1])
residVar <- as.numeric(VarCorr(m1)[2,1])
icc1 <- iVar/(iVar+residVar) 
icc1
```
The overall mean bias score (transformed) is about -0.002, with a standard deviation of 0.023.

The ICC is about 0.016, meaning that roughly 1.6% of the variance in bias is explained by people being different from one another.
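
For reference, the calculation in the chunk above is just

$$\mathrm{ICC} = \frac{\tau_{00}}{\tau_{00} + \sigma^2},$$

where $\tau_{00}$ is the between-person (random intercept) variance and $\sigma^2$ is the within-person residual variance.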

### Add fixed and random intercept and slope of time
```{r}
# model with fixed and random intercept and slope of time (intercepts implied)
m2 <- lme(bias_t ~ trialnum, data = data, random = ~ trialnum | ID, method="ML")
summary(m2) 
anova(m2)
```
The average bias at trial number 0 is -0.016, and bias shifts very slightly toward approach with each additional trial, but the change is a nonsignificant fraction of a percent. There is no significant effect of trial number (time) on RT bias score, _t_ = 0.74, _p_ = 0.457.

```{r}
# ICC
# Note: with a random slope in the model, VarCorr(m2) has three rows
# ((Intercept), trialnum, Residual), so the residual variance is row 3, not row 2
iVar <- as.numeric(VarCorr(m2)[1,1])
residVar <- as.numeric(VarCorr(m2)[3,1])
icc2 <- iVar/(iVar+residVar)
icc2
```
After controlling for time, the split of variance between and within people barely changes: the ICC should again be close to the ~1.6% from the unconditional means model, which is consistent with the lack of a significant effect of time. (The baffling 99.99% figure I originally got here came from `VarCorr(m2)[2,1]`, which in a random-slope model is the variance of the trialnum slopes rather than the residual variance, so essentially nothing but between-person variance ended up in the denominator.)

### Adding stimulus as a fixed effect
There were 5 different types of stimuli, presented in a randomized order (consistent across participants/sessions) determined by genetic algorithm. Everyone saw the same stimuli in the same order, so it is a fixed effect. The different categories of images were: Death (generic), Deceased (personal photo), Living loved one (personal photo), Stranger (sex-matched to spouse), and Neutral (objects).
```{r}
# add the fixed effect of stim and compare its fit to m2 by inspecting the t and f tests
m3 <- lme(bias_t ~ trialnum + stim, data = data, random = ~ trialnum | ID, method="ML")
summary(m3)
```
Still no effect of time. There is a significant fixed effect of spouse, _t_ = 3.55, _p_ = .0004, with a parameter estimate of 0.12 (_SE_ = 0.03), and a marginal, non-significant effect of stranger, _t_ = -1.73, _p_ = .084. This indicates that people tended to show more approach bias to photos of their spouse (across conditions), and somewhat greater avoidance bias to photos of the sex-matched stranger.
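
One note on reading those dummy-coded effects: `lme()` contrasts each stimulus category against whatever the reference level of the factor happens to be (the first level, typically alphabetical, unless set explicitly). A quick sketch to check this, and to re-set the reference if desired (the level name "neutral" below is an assumption, not verified against the data):
```{r}
# which category is the reference (first level) for the stim dummy codes?
levels(factor(data$stim))
table(data$stim)

# if we wanted the neutral category as the explicit reference (assuming that level name):
# data$stim <- relevel(factor(data$stim), ref = "neutral")
```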

```{r}
intervals(m3, which="fixed")
```
95% CIs don't cross zero for spouse, but do for all other stimulus types.

```{r}
iVar <- as.numeric(VarCorr(m3)[1,1])
residVar <- as.numeric(VarCorr(m3)[3,1]) # Residual row is row 3 here (intercept, trialnum slope, Residual)
icc3 <- iVar/(iVar+residVar)
icc3 # Same ICC as m2
```
The ICC is identical to that in model 2. 

```{r}
anova(m2, m3) # compare this model with the first one
```
Comparing the model with the fixed effect of stimulus (m3) to the model without it (m2), m3 fits better (_p_ < .0001), as evidenced by its lower AIC and BIC and higher log-likelihood.

### Take out fixed and random effect of time
Because there's no significant effect of time, it makes sense to drop it from the model. Now we're basically doing regression, but accounting for whatever interdependence exists. <font color="red">Note that these results will _NOT_ match the results of the regressions, because the regressions use medians of the bias scores rather than means.</font>

```{r}
# without fixed or random effect of time
m4 <- lme(bias_t ~ stim, data = data, random = ~ 1 | ID, method="ML")
summary(m4)
```
Fixed effects are still more or less the same - significant for spouse, marginal for stranger - although the parameter estimates and test statistics are slightly different.

```{r}
intervals(m4, which="fixed")
```
As above, spouse is the only stimulus category whose 95% CIs don't cross zero, supporting a significant effect of spouse on bias.

Is m4 any better than m3?
```{r}
anova(m3,m4)
```
Nope. The models do not significantly differ (_p_ = 0.851), though the AIC and BIC values are a little smaller for m4 than for m3. So let's forget time and go with m4...

### Adding condition as a fixed effect
Participants completed the task at two different sessions, where they were randomized to receive treatment A at one and treatment B at the other (order counterbalanced across participants). The two treatments are oxytocin vs. placebo.
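
Before adding condition to the model, a quick sanity check that every person contributes trials under both treatments (a minimal sketch):
```{r}
# trial counts per person under each treatment; every row should have non-zero counts in both columns
with(data, table(ID, cond))
```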
```{r}
# add the fixed effect of condition and check its fit by inspecting the t and F tests.
m5 <- lme(bias_t ~ stim + cond, data = data, random = ~ 1 | ID, method="ML")
summary(m5)
```
The fixed effect of condition B is not significant, _t_ = 0.15, _p_ = 0.884.

```{r}
anova(m4, m5)
```
The two models are not significantly different, but the one with the fixed effect of condition (m5) has slightly higher AIC/BIC. 

### Adding the stimulus*condition interaction

We might expect that people's tendency to approach or avoid could differ depending on which treatment they got, but only for certain stimuli.
```{r}
# add the stimulus*condition interaction and check its fit by inspecting the t and F tests.
m6 <- lme(bias_t ~ stim*cond, data = data, random = ~ 1 | ID, method="ML")
summary(m6)
```
Interesting. With the interaction in the model, there is a non-significant trend toward an interaction between the death stimuli and condition B.

```{r}
anova(m6,m4,m5)
intervals(m6)
```
The model with the interaction is not significantly better than either the model with just the fixed effect of condition or the one with no effect of condition at all, so it makes sense to go with the simplest model, which includes only stimulus as a fixed effect.

### Adding other plausible predictors
There are a few other variables that we might expect to impact approach/avoidance bias, namely grief severity, OXTR genotype, time since death, and attachment style (avoidant, anxious). 
```{r}
# used group mean centered scores for the additional variables for interpretability
m7 <- lme(bias_t ~ stim + rs53576_A.carrier + rs2268498_C.carrier + gmean_ecrrs_global_anx + gmean_ecrrs_global_avoid + gmean_age + gmean_icg, data = data, random = ~ 1 | ID, method="ML")
summary(m7)
anova(m4,m7)
```
None of the added fixed effects are significant and the AIC/BIC values are higher, so take 'em out. 

### Final model

```{r}
# variance explained by the entire model 
fitted <- fitted(m4)
data$fitted <- as.vector(fitted)
rsq <- lm(bias_t ~ fitted, data=data) # regress the observed (transformed) bias scores on the model's fitted values to get a pseudo-R^2
summary(rsq)
```
A case of significant, but not meaningful: The R<sup>2</sup> for the entire model is .02, indicating that 2% of the variance in bias is explained by the model. So this model is not very good at predicting approach/avoidance bias from the fixed effect of stimulus within person. Oh well.
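
The same pseudo-R<sup>2</sup> can also be read off directly as the squared correlation between the observed and fitted values (a sketch, equivalent to the simple regression above):
```{r}
# squared correlation between observed (transformed) bias and the model-fitted values
cor(data$bias_t, data$fitted)^2
```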

### Check final model assumptions

Checking normality of the residuals:
```{r}
# checking normality of residuals

# level 1:
qqnorm(m4, ~resid(.), id=0.00075)
plot(m4, resid(., type="p") ~ trialnum, id=0.0025)
```
The QQ plot looks pretty okay, albeit with a few outliers. D113 is the person who falls furthest off the line, and does so at both ends. The residuals are also fairly evenly distributed across time, though many fall outside of the +/- 2 range.

```{r}
# the following bit wants IDs to be numeric, so take out the "D" from each ID
library(stringr) # for str_replace(), in case it isn't already attached
data <- data %>% mutate(numID = as.numeric(str_replace(ID, "D", "")))

# level 2:
qqnorm(m4, ~ranef(.), id=0.25)
ranInt <- ranef(m4)[1] # create object with the random intercept estimates
ranInt <- unlist(ranInt) 
names(ranInt) <- NULL 
ID <- unlist(subset(data, !duplicated(data$numID), select="numID")) # create an ID variable
plot(ranInt ~ ID)

# m4 includes only a random intercept (random = ~ 1 | ID), so ranef(m4) has a
# single column of intercept estimates and there are no random slopes to plot

```
Note that `ranef(m4)` contains only the random intercepts, because m4 has no random slope (`random = ~ 1 | ID`); trying to index a second column out of it is what produced the <font color="red">`undefined columns selected`</font> error I got stuck on for way too long. So only the random intercepts are examined at level 2.

Checking homoscedasticity:
```{r}
# checking homoscedasticity

# level 1:
data$resid <- as.numeric(residuals(m4)) # create an object with the Level-1 residuals
plot(data$resid ~ data$stim)

# level 2:
library(lattice)
stim <- unlist(subset(data, !duplicated(data$numID), select="stim"))
xyplot(ranInt ~ stim)
# (no random slopes in m4, so only the random intercepts are checked at level 2)
```
Note that again there are a lot of residuals falling outside of the +/-2 range.

Also, the L2 plot isn't very informative: the `stim` vector created here just takes the stimulus category from each person's first non-duplicated row, so only the categories that happen to land in those rows show up (which is why it showed 3 of the 5 types). Since stimulus varies within person rather than between people, it isn't really a level-2 variable to check the random intercepts against.


## Analyses (Bayesian)

### Unconditional means model
First step is to look at the unconditional means model. We know from the exploratory work in previous sections that `bias_t` is normally distributed, so we can tell `brms` to use a normal (Gaussian) likelihood.

```{r}
library(brms)
b1N <- brm(bias_t ~ 1 + (1|ID), data=data, family=gaussian()) 
saveRDS(b1N, "mlm_ada1_b1N.rds")
summary(b1N)
```
Rhat = 1 across the board, so that's good.

```{r}
plot(b1N) # check trace plots
```
I think these plots look okay? The chains for b_Intercept and sd_ID__Intercept are not exactly on top of one another, but their distributions overlap heavily.

```{r}
pp_check(b1N, nsamples=20)  # posterior predictive check using normal likelihood
pp_check(b1N, type="error_hist", nsamples=20, binwidth=1) # error histogram (histogram of prediction residuals)
```
The observed distribution is a very good fit to the predicted data (based on the normal distribution).
The error histogram also shows that residuals are normally distributed around 0. See [this page](https://cran.r-project.org/web/packages/bayesplot/vignettes/graphical-ppcs.html) for some helpful documentation.

```{r}
# check ICC to get sense of how variance is distributed between- and within- people
iSd <- as.numeric(summary(b1N)$random$ID[1,1]) # finds 1st row, 1st column of random effects inside the summary of b1N
iVar <- iSd^2
residSd <- as.numeric(summary(b1N)$spec_pars[[1,1]])
residVar <- residSd^2
icc1 <- iVar/(iVar+residVar) 
icc1 # this is the unconditional means model - it just happens that in this dataset it's a nuisance model
```
For the unconditional means model, the ICC is 0.017, meaning that only about 2% of the variance in bias is attributable to people, on average, performing differently from one another on the task.
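
Because brms gives us full posteriors for both standard deviations, we can also get a posterior distribution for the ICC itself rather than a single point estimate. A sketch using `posterior_samples()` (the parameter names assume the default brms naming for this model):
```{r}
# posterior distribution of the ICC, computed draw by draw
post <- posterior_samples(b1N, pars = c("sd_ID__Intercept", "sigma"))
icc_post <- post$sd_ID__Intercept^2 / (post$sd_ID__Intercept^2 + post$sigma^2)
quantile(icc_post, c(.025, .5, .975))
hist(icc_post, main = "Posterior of the ICC", xlab = "ICC")
```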

### Time as fixed and random effect
```{r}
## check whether we need time as a control variable
b2 <- brm(bias_t ~ 1 + trialnum + (1 + trialnum|ID), data=data, family=skew_normal(), control=list(adapt_delta = .95))
saveRDS(b2, "mlm_ada1_b2.rds")
summary(b2) 
# b2 <- readRDS("mlm_ada1_b2.rds")
```
<i>Note: Estimate = mean of posterior distribution for each of the parameters.</i>
**Group-level effects**:
When trialnum is 0, the standard deviation of the intercepts is 0.12.
There is no random effect of time: the slopes are not at all variable (people are not different from each other).
Zero is a credible value for the correlation, because the 95% CIs (credible intervals) cross zero.

Note that Rhat for sd(trialnum) in the random effects is 1.01. Per the [brms documentation](https://github.com/paul-buerkner/brms), this does not seem to indicate a convergence problem - there are no warnings, and the Rhat for the group-level effect in their example is also 1.01: <i>"If 'Rhat' is considerably greater than 1, the algorithm has not yet converged and it is necessary to run more iterations and / or set stronger priors."</i>

**Population-level effects**:
There is no fixed effect of time, either, given that the upper and lower CI limits are -0.0, 0.0.
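
As a side note on the Rhat point above: rather than reading Rhat values off the summary one at a time, we can scan them all at once. A sketch using the `rhat()` method that brms provides for fitted models:
```{r}
# pull all Rhat values for b2 and look at the largest ones
r2hat <- rhat(b2)
round(sort(r2hat, decreasing = TRUE), 3)  # largest first; values well above 1.01 would be a concern
```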

```{r}
plot(b2)
```
The posteriors for sd_ID__trialnum and cor_ID__Intercept__trialnum are heavily skewed.
```{r}
pp_check(b2, nsamples=20)
pp_check(b2, type="error_hist", nsamples=20, binwidth=1) 
```
Again, the posterior predictive check shows that the predicted distributions (based on the fitted model) are a very good fit to the observed data.
The error histogram also shows that the residuals are roughly normally distributed around 0.

SKIPPING THE CHUNK BELOW...
I can't get sjstats to install (issue with compiling the glmmTMB package - freaking dependencies are the bane of my existence with all these different packages), and no point in running cross validation on a model that we're going to drop anyway due to lack of both fixed/random effects of time.
```{r}
# install.packages("sjstats")
# library(sjstats)
# equi_test(b2) # is there a fixed effect?
# waic(b1N, b2) # does WAIC think we should keep time? If so, given there is no fixed effect, this is due to the random effect accounting for variance.

# b1N <- add_ic(b1N, ic="R2")
# b2 <- add_ic(b2, ic="R2") 
# mean(b1N$R2)
# mean(b2$R2) # does R2 thinks we should keep time? 

# kfold(b1N, b2, K=10) # use CV to check for over-fitting
# this takes too long, so reduce n(folds) or only perform for final model
```

### Fixed effect of stimulus
```{r}
b3 <- brm(bias_t ~ 1 + stim + (1 |ID), data=data, family=skew_normal(), control=list(adapt_delta = .95))
saveRDS(b3, "mlm_ada1_b3.rds")
summary(b3) 
# b3 <- readRDS("mlm_ada1_b3.rds")
```
Similar to the results of the NHST analysis, there is a notable fixed effect of spouse (estimate = 0.12, est. error = 0.03, 95% CI[0.05,0.18]) on bias, such that people show more approach bias on trials where the stimulus is their spouse.
```{r}
plot(b3)
marginal_effects(b3)
```
The high density intervals illustrate the positive effect of spouse on bias, which we also see in the marginal effects plot. In the analysis above, the 95% CI for death and stranger stimuli barely crossed zero, which corresponds with what we see in the marginal effects plot, where the upper limit of the error bar is very close to zero (or below). This also corresponds with the effects we saw in NHST, where there were significant or marginally significant effects of death and stranger in addition to spouse.

```{r}
pp_check(b3, nsamples=20)
pp_check(b3, type="error_hist", nsamples=20, binwidth=1) # looks like we have improved the residuals; if we rerun this random sampling at least some of the time we no longer see the lower end outliers
# equi_test(b3) # TRY THIS CODE WHEN I GET sjstats TO INSTALL (what does the ROPE tell us about the size of the effect?)
# waic(b2, b3) # TRY THIS CODE WHEN I GET sjstats TO INSTALL
```

```{r}
b2 <- add_ic(b2, ic="R2") 
b3 <- add_ic(b3, ic="R2") 
mean(b2$R2)
mean(b3$R2)
```
In both models, we have explained about 2 percent of the variance (1.6% vs. 2.0%).
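
For reference, the same quantity is available directly from `bayes_R2()`, which also reports an estimation error and credible interval for R<sup>2</sup> (a sketch):
```{r}
# Bayesian R^2 with posterior SD and 95% CI for each model
bayes_R2(b2)
bayes_R2(b3)
```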

### Adding the stimulus*condition interaction (no time)

One of the original study hypotheses was that there would be an interaction (actually, a 3-way interaction with group, but not going to go there for ADA #1; also we may be underpowered for that) between stimulus and condition, such that people might display different response biases for different stimulus categories in condition A vs. B, or vice versa.

```{r}
b4 <- brm(bias_t ~ 1 + stim*cond + (1|ID), data=data, family=skew_normal(), control=list(adapt_delta = .95))
saveRDS(b4, "mlm_ada1_b4.rds")
summary(b4) 
# b4 <- readRDS("mlm_ada1_b4.rds")
```
Got some warnings:

`Rejecting initial value:`
  `Gradient evaluated at the initial value is not finite.`
  `Stan can't start sampling from this initial value.`
`Rejecting initial value:`
  `Log probability evaluates to log(0), i.e. negative infinity.`
  `Stan can't start sampling from this initial value.`
  
Apparently setting priors may help with this (see the sketch at the end of the "Using priors" section below).

Here, we see notable effects of death and stranger stimuli, but not condition nor its interaction with stimulus type.
  
```{r}
plot(b4)
marginal_effects(b4) 

pp_check(b4, nsamples=20)
pp_check(b4, type="error_hist", nsamples=20, binwidth=1) # residuals seem about the same
# equi_test(b4) 
```
Nothing terribly new here, except for the marginal effects. Looking at the plot for `stim`, the intervals for spouse vs. stranger and for spouse vs. death don't seem to overlap, and those for living loved one vs. death just barely overlap. Unsurprisingly, people tend to show more approach bias for spouse and living loved one photos compared to death and stranger photos. The third marginal effects plot illustrates the lack of condition or stimulus*condition interaction effects.
```{r}
b4 <- add_ic(b4, ic="R2") 
mean(b3$R2)
mean(b4$R2)
```
The variance accounted for shows a negligible increase with the stimulus*condition interaction in the model (from 0.020 to 0.022).

### Using priors
```{r}
# First find out what default priors are. 
get_prior(bias_t ~ 1 + stim*cond + (1|ID), data=data, family=skew_normal())
```
We didn't get here yet, so I'm not too sure what I'm looking at. <font color="red"><i>To be continued...</i></font>
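
Since we haven't covered priors yet, here is only a sketch of how they might eventually be specified for b4, in case it helps with the initialization warnings above. The distributions and scales below are placeholders (weakly informative guesses), not recommendations:
```{r}
# Hypothetical weakly informative priors; the specific distributions/scales are assumptions
myPriors <- c(
  set_prior("normal(0, 1)", class = "b"),         # all population-level (fixed) effects
  set_prior("student_t(3, 0, 1)", class = "sd")   # group-level (ID) standard deviations
)

b4p <- brm(bias_t ~ 1 + stim*cond + (1|ID), data = data, family = skew_normal(),
           prior = myPriors, control = list(adapt_delta = .95))
summary(b4p)
```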