This project uses behavioral and self-report data from the study “The effect of intranasal oxytocin on neural functioning in widow(er)s” (#1503744420, PI: Mary-Frances O’Connor, Co-PI: Brian Arizmendi). It was a double-blind randomized crossover study using an intranasal oxytocin manipulation and approach-avoidance task variant (grief AAT; gAAT) to investigate conflicting accounts of motivated behavior in people with complicated grief. Enrolled participants were older adults who had experienced the death of a spouse or long-term romantic partner between 6-36 months prior to their participation. Stratified sampling was used to ensure a range of grief symptom severity scores was represented in the sample.
Participants attended two identical experimental sessions, at which they received one of two intranasal sprays (oxytocin or placebo) and then took part in an fMRI scan, during which time they completed the task.
In the gAAT, five stimulus categories were presented:
Photos provided by the participant were scanned,resized, and framed with blue/yellow frames to match the other stimuli used in the task. Photos of the stranger were sex-matched to the participant’s partner.
Participants completed the task using a joystick. The stimuli were animated so that a joystick push would make the image smaller (as if they were pushing it away) and a joystick pull would make the image larger (as if they were bringing it closer).
Participants were instructed to respond during the task to the color of a photo’s frame, e.g., “push the joystick when the frame is BLUE, pull when the frame is YELLOW”). Once participants had completed the first run of the task (144 trials), they completed a second run that was identical to the first except for the instructions, which were reversed (e.g. “now push the joystick for YELLOW and pull for BLUE”). Order of instructions was counterbalanced across participants. Stimuli were presented via Inquisit 4 (2014), in a pseudorandomized order determined by genetic algorithm to optimize statistical power and psychological validity (Wager & Nichols, 2003). Each trial last 3000ms, with 2500 ms allowed for response. Intertrial interval was 500ms.
These are the primary aims for the project:
We also wanted to test whether any observed moderation of oxytocin effects grief severity would be specific to certain types of stimuli. However, we likely lack power to detect a three-way interaction in the existing sample.
Note that the experimental design for the present study is not identical to the studies below.
In both papers, bias scores were computed as (median RTpush - median RTpull) in each stimulus category.
ID - participant IDtreatment - treatment received (oxytocin or placebo)direction - direction of their response (push or pull)stimulus - the type of image they saw (spouse, living loved one, stranger, generic grief-related, neutral)grief_severity - continuous measure of self-reported complicated grief symptomsgroup - complicated grief (CG) or non-complicated grief (NCG), based on grief_severity >25gAAT_RT - reaction time in msgAAT_bias - difference between push and pull reaction times (positive values indicate approach bias, negative values indicate avoidance bias)Aim 1: To identify whether grief severity moderates behavioral response to specific types of stimuli, comparing high-severity (CG group) and low-severity (Non-CG group) participants.
library(tidyverse)
library(afex)
library(emmeans)
filter <- dplyr::filter
select <- dplyr::select
# read in behavioral data (long dataset containing trial-level reaction times)
data_long <- readRDS("~/Dropbox/GLASS Lab/OT gAAT for meeting 5-10-19/bx_long_noOut.rds")
# read in master dataset and subset variables to add to behavioral dataset
# varstoadd <- readRDS("~/Dropbox/GLASS Lab/OT Study/data/master-dataset/ot-fmri_master-dataset_020719.rds")
# varstoadd <- subset(varstoadd, select=c(ID, # participant ID
# sex_m, # sex (female as reference category)
# age_yrs, # age in years
# ethnicity_hisp, # ethnicity (non-Hispanic as reference category)
# race, # race
# timesincedeath, # time since partner's death and baseline survey completion
# yrs_together, # relationship length in years
# group, # CG or NCG, based on tot_icg >25 as threshold for CG
# tx_v1, # whether they received oxytocin or placebo at first session
# tot_icg # grief severity (Inventory of Complicated Grief)
# ))
# saveRDS(varstoadd, "~/Dropbox/GLASS Lab/OT gAAT for meeting 5-10-19/varstoadd.rds")
# already saved the subset, so just read in the subset:
varstoadd <- readRDS("~/Dropbox/GLASS Lab/OT gAAT for meeting 5-10-19/varstoadd.rds")
########## MODEL 1 ##########
# Model 1 requires a stacked/long dataset containing the following, at minimum:
## one variable `ID` with participant IDs
## one numeric variable `gAAT_RT` with TRIAL-LEVEL reaction times
## one factor `stimulus` with 5 levels (spouse, stranger, loved one, grief scenes, neutral scenes)
## one factor `treatment` with 2 levels (oxytocin, placebo)
## one factor `direction` with 2 levels (push, pull)
## one factor `group` with 2 factors (CG, NCG) *OR* one numeric variable `grief_score` with individual grief severity scores (time-invariant)
data_long <- left_join(data_long, varstoadd, ID="ID") # varstoadd = subset of master dataset
data_placebo <- data_long %>% rename(treatment = cond,
stimulus = stim,
direction = push_pull,
grief_severity = tot_icg) %>%
mutate(treatment = recode(treatment,
A = "placebo",
B = "oxytocin")) %>%
filter(treatment == "placebo")
head(data_placebo)
# Model 1: RM-ANOVA with two within-subjects factors (stimulus, push or pull) and one between-subjects factor (group)
m1 <- aov_ez(id = "ID", dv = "gAAT_RT", data_placebo, between = "group", within = c("stimulus", "direction"), fun_aggregate = median) # or use between = "grief_severity" for grief severity as continuous vs. categorical
summary(m1)
# use the `emmeans` package to conduct pairwise comparisons
# (emmeans is the same thing as lsmeans and pmmeans, just different terms)
# need to double-check that I'm doing this correctly...
emmeans(m1, ~c(direction, stimulus), contr = "pairwise", adjust="holm")
########## MODEL 2 ##########
# Model 2 requires a stacked/long dataset containing the following, at minimum:
## one variable `ID` with participant IDs
## one numeric variable `gAAT_bias` with participants' push-pull difference scores for each stimulus x treatment level
## e.g., median push RT to death stimuli under placebo MINUS median pull RT to death stimuli under placebo
## one factor `stimulus` with 5 levels (spouse, stranger, loved one, grief scenes, neutral scenes)
## one factor `treatment` with 2 levels (oxytocin, placebo)
## one factor `direction` with 2 levels (push, pull)
## one factor `group` with 2 factors (CG, NCG) *OR* one numeric variable `grief_score` with individual grief severity scores (time-invariant)
data_bias <- readRDS("~/Dropbox/GLASS Lab/OT gAAT for meeting 5-10-19/bias_long.rds")
data_bias <- left_join(data_bias, varstoadd, by="ID")
# NOTE that this is the dataset that calculates bias by subtracting the median pull reaction time in each category from the median push reaction time in that category (NOT the trial-by-trial bias subtracting run1 from run2)
data_bias_placebo <- data_bias %>%
rename(treatment = cond,
gAAT_bias = bias) %>%
mutate(treatment = recode(treatment,
A = "placebo",
B = "oxytocin")) %>%
filter(treatment == "placebo")
# Model 2: RM-ANOVA with one within-subjects factor (stimulus) and one between-subjects factor (group)
m2 <- aov_ez(id = "ID", dv = "gAAT_bias", data_bias_placebo, between = "group", within = c("stimulus")) # or use between = "grief_severity" for grief severity as continuous vs. categorical
summary(m2)
# use the `emmeans` package to conduct pairwise comparisons
# (emmeans is the same thing as lsmeans and pmmeans, just different terms)
# need to double-check that I'm doing this correctly...
emmeans(m2, ~stimulus, contr = "pairwise", adjust="holm")
emmeans(m2, ~group, contr = "pairwise", adjust="holm")
# additional covariates such as participant age, task run (first or second), etc. may be included in final model
library(tidyverse)
filter <- dplyr::filter
select <- dplyr::select
########## MODEL 1 ##########
# Model 1 requires a stacked/long dataset containing the following, at minimum:
## one variable `ID` with participant IDs
## one numeric variable `gAAT_RT` with TRIAL-LEVEL reaction times
## one factor `stimulus` with 5 levels (spouse, stranger, loved one, grief scenes, neutral scenes)
## one factor `treatment` with 2 levels (oxytocin, placebo)
## one factor `direction` with 2 levels (push, pull)
## one factor `run` with 2 levels (first, second)
## one factor `group` with 2 factors (CG, NCG) *OR* one numeric variable `grief_score` with individual grief severity scores (time-invariant)
# Model 1:
# Multiple linear regression model specifying three-way interaction effect on median push and pull reaction times
m1 <- lm(gAAT_RT ~ stimulus*direction, data=data_placebo) # or use `grief_severity` instead of `group`
summary(m1)
########## MODEL 2 ##########
# Model 2 requires a stacked/long dataset containing the following, at minimum:
## one variable `ID` with participant IDs
## one numeric variable `gAAT_bias` with participants' push-pull difference scores for each stimulus x treatment level
## e.g., median push RT to death stimuli under placebo MINUS median pull RT to death stimuli under placebo
## one factor `stimulus` with 5 levels (spouse, stranger, loved one, grief scenes, neutral scenes)
## one factor `treatment` with 2 levels (oxytocin, placebo)
## one factor `direction` with 2 levels (push, pull)
## one factor `group` with 2 factors (CG, NCG) *OR* one numeric variable `grief_score` with individual grief severity scores (time-invariant)
# Model 2:
# Multiple linear regression model specifying two-way interaction effect on bias scores
m2 <- lm(gAAT_bias ~ stimulus*group, data=data_bias_placebo) # or use `grief_severity` instead of `group`
summary(m2)
# additional covariates such as participant age, task run (first or second), etc. may be included in final model
library(tidyverse)
library(nlme)
filter <- dplyr::filter
select <- dplyr::select
########## MODEL 1 ##########
# Model 1 requires a stacked/long dataset containing the following, at minimum:
## one variable `ID` with participant IDs
## one numeric variable `gAAT_RT` with TRIAL-LEVEL reaction times
## one factor `stimulus` with 5 levels (spouse, stranger, loved one, grief scenes, neutral scenes)
## one factor `treatment` with 2 levels (oxytocin, placebo)
## one factor `direction` with 2 levels (push, pull)
## one factor `run` with 2 levels (first, second)
## one factor `group` with 2 factors (CG, NCG) *OR* one numeric variable `grief_score` with individual grief severity scores (time-invariant)
# Model 1:
# Mixed model specifying three-way interaction effect on median push and pull reaction times
m1 <- lme(gAAT_RT ~ stimulus*direction*group, data = data_placebo, random = ~ 1|ID, method="REML") # or use `grief_severity` instead of `group`
summary(m1)
########## MODEL 2 ##########
# Model 2 requires a stacked/long dataset containing the following, at minimum:
## one variable `ID` with participant IDs
## one numeric variable `gAAT_bias` with participants' push-pull difference scores for each stimulus x treatment level
## e.g., median push RT to death stimuli under placebo MINUS median pull RT to death stimuli under placebo
## one factor `stimulus` with 5 levels (spouse, stranger, loved one, grief scenes, neutral scenes)
## one factor `treatment` with 2 levels (oxytocin, placebo)
## one factor `direction` with 2 levels (push, pull)
## one factor `group` with 2 factors (CG, NCG) *OR* one numeric variable `grief_score` with individual grief severity scores (time-invariant)
# Model 2:
# Mixed model specifying two-way interaction effect on bias scores
m2 <- lme(gAAT_bias ~ stimulus*group, data = data_bias_placebo, random = ~ 1|ID, method="REML") # or use `grief_severity` instead of `group`
summary(m2)
# additional covariates such as participant age, task run (first or second), etc. may be included in final model
This is what I did for Emily’s class - I think we all agreed that this is unneccessarily complicated given that we are not interested in trial-level changes in reaction time over the course of the task. In these analyses, I ultimately ended up removing time (trial #) as a predictor anyway since there was no fixed or random effect of time.
In this approach, we also end up dropping more participants because four people failed to reverse the instructions on their second run of the task so I could not subtract, say, RT on run 2 trial #30 (“PULL”) from RT on run 1 trial #30 (“PUSH”) because they pushed (or pulled) on that trial on both runs.
Aim 2: To identify whether bereaved individuals show different behavioral responses when grief-related stimuli are idiographic (personal photos of spouse) versus nomothetic (generic grief-related scenes).
This should be a planned contrast within one of the Aim 1 analyses, I think. There is a way to specify contrasts in nlme using the contrasts = argument but I need some guidance as to how to correctly set up the list it’s expecting.
This is how you do it for the ANOVA using emmeans:
?aov_ez
The S3 object returned per default can be directly passed to emmeans::emmeans for further analysis. This allows to test any type of contrasts that might be of interest independent of whether or not this contrast involves between-subject variables, within-subject variables, or a combination thereof. The general procedure to run those contrasts is the following (see Examples for a full example):
(1) Estimate an afex_aov object with the function returned here. For example: x <- aov_car(dv ~ a*b + (id/c), d)
(2) Obtain a emmGrid-class object by running emmeans on the afex_aov object from step 1 using the factors involved in the contrast. For example: r <- emmeans(x, ~a:c)
(3) Create a list containing the desired contrasts on the reference grid object from step 2. For example:
con1 <- list(a_x = c(-1, 1, 0, 0, 0, 0), b_x = c(0, 0, -0.5, -0.5, 0, 1))
(4) Test the contrast on the reference grid using contrast. For example: contrast(r, con1)
To control for multiple testing p-value adjustments can be specified. For example the Bonferroni-Holm correction: contrast(r, con1, adjust = "holm")
Or… is the test for Aim 2 a separate test? I may be thinking too much of fMRI but was thinking of two additional factors, one in which spouse trials = 1 and generic grief trials = -1, and a second one in which spouse trials = -1 and generic grief trials = 1, which could be used as predictors to test the effect of spouse > generic grief (or versa).
Aim 3: To identify whether effects of intranasal oxytocin differ in high-severity [CG group] versus low-severity [Non-CG group] participants.
As in Aim 1, this could involve any of the three potential tests (RM-ANOVA, linear regression, linear mixed effects), except that the data for this analysis includes data from both placebo and oxytocin sessions and the two-level factor treatment is included as an additional predictor.
If using push and pull RTs rather than bias scores, this would potentially yield a 4-way interaction (stimulus*direction*treatment*group) which is not so desirable from a statistical power and interpretability standpoint. There isn’t likely to be one, so I think I would first run the most complex model, then drop predictors that don’t significantly improve the model fit to the data (as judged by AIC, BIC, log likelihood stats). The other option would be to test push and pull in separate models and correct your alpha for the number of tests, but this means that the models aren’t conditioned on both directions being included - not clear what impact this has and how important it is.
library(tidyverse)
library(jtools)
# load in data
varstoadd <- readRDS("~/Dropbox/GLASS Lab/OT Study/data/master-dataset/ot-fmri_master-dataset_020719.rds")
varstoadd <- subset(varstoadd, select=c(ID, # participant ID
sex_m, # sex (female as reference category)
age_yrs, # age in years
ethnicity_hisp, # ethnicity (non-Hispanic as reference category)
race, # race
timesincedeath, # time since partner's death and baseline survey completion
yrs_together, # relationship length in years
group, # CG or NCG, based on tot_icg >25 as threshold for CG
tx_v1, # whether they received oxytocin or placebo at first session
tot_icg # grief severity (Inventory of Complicated Grief)
))
data_bias <- readRDS("~/Dropbox/GLASS Lab/OT gAAT for meeting 5-10-19/bias_long.rds")
data_bias <- left_join(data_bias, varstoadd, by="ID")
data_long <- readRDS("~/Dropbox/GLASS Lab/OT gAAT for meeting 5-10-19/bx_long_noOut.rds")
data_long <- left_join(data_long, varstoadd, by="ID")
# panel by stimulus
panelbystim <- ggplot(data_bias, aes(fill=cond, y=bias, x=group)) +
geom_boxplot(aes(fill=cond), outlier.alpha = 0.3)
p1 <- panelbystim + facet_grid(. ~ stimulus) + theme(strip.text.x = element_text(size=12, color="red",
face="bold.italic"),
strip.text.y = element_text(size=12, color="red",
face="bold.italic")) + geom_hline(yintercept=0, linetype="solid", color="black", size=.5) + xlab("Group") + ylab("Avoid bias Approach bias")
p1
# ggsave("~/Desktop/p1.png", units="in", width=6.5, height=4)
# while (!is.null(dev.list())) dev.off()
# panel by group
panelbystim <- ggplot(data_bias, aes(fill=cond, y=bias, x=stimulus)) +
geom_boxplot(aes(fill=cond), outlier.alpha = 0.3)
p2 <- panelbystim + facet_grid(. ~ group) + theme(strip.text.x = element_text(size=12, color="red",
face="bold.italic"),
strip.text.y = element_text(size=12, color="red",
face="bold.italic")) + geom_hline(yintercept=0, linetype="solid", color="black", size=.5) + xlab("Group") + ylab("Avoid bias Approach bias")
p2
# ggsave("~/Desktop/p2.png", units="in", width=6.5, height=4)
# while (!is.null(dev.list())) dev.off()
# three-way interaction
m <- lm(bias ~ stimulus*cond*tot_icg, data=data_bias)
summary(m)
Call:
lm(formula = bias ~ stimulus * cond * tot_icg, data = data_bias)
Residuals:
Min 1Q Median 3Q Max
-248.321 -43.880 -0.662 39.729 266.863
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 31.5257 22.9905 1.371 0.1711
stimulusliving 23.2035 32.5135 0.714 0.4759
stimulusneutral -0.9685 32.5135 -0.030 0.9763
stimulusspouse 28.9937 32.5135 0.892 0.3731
stimulusstranger 8.2273 32.5135 0.253 0.8004
condB -42.0367 32.5135 -1.293 0.1969
tot_icg -1.8163 0.8676 -2.094 0.0370 *
stimulusliving:condB -19.8412 45.9811 -0.432 0.6663
stimulusneutral:condB 15.9719 45.9811 0.347 0.7285
stimulusspouse:condB 16.8846 45.9811 0.367 0.7137
stimulusstranger:condB 27.6590 45.9811 0.602 0.5479
stimulusliving:tot_icg 0.8932 1.2269 0.728 0.4671
stimulusneutral:tot_icg 0.6291 1.2269 0.513 0.6084
stimulusspouse:tot_icg 0.9065 1.2269 0.739 0.4605
stimulusstranger:tot_icg 0.2014 1.2269 0.164 0.8697
condB:tot_icg 2.1973 1.2269 1.791 0.0741 .
stimulusliving:condB:tot_icg -0.0194 1.7351 -0.011 0.9911
stimulusneutral:condB:tot_icg -1.2411 1.7351 -0.715 0.4749
stimulusspouse:condB:tot_icg -0.6886 1.7351 -0.397 0.6917
stimulusstranger:condB:tot_icg -1.4459 1.7351 -0.833 0.4052
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 67.54 on 370 degrees of freedom
Multiple R-squared: 0.1146, Adjusted R-squared: 0.06916
F-statistic: 2.521 on 19 and 370 DF, p-value: 0.0004856
interact_plot(m, pred="tot_icg", modx = "cond", plot.points=T, jitter=0.1, point.shape=T)
The scripts below generate the data files used in the analyses above.
Pre-R steps:
In Terminal, within each of the folders in turn…
find . -iname "*.iqdat" -exec bash -c 'mv "$0" "${0%\.iqdat}.tsv"' {} \;cat *.tsv > OT_gAAT_run-1.tsv (or OT_gAAT_run-2.tsv)library(tidyverse)
filter <- dplyr::filter
select <- dplyr::select
# read in the data (only importing a subset of columns)
raw_r1 <- as_data_frame(read_tsv("~/Dropbox/GLASS Lab/OT Study/data/raw-data/OT-gAAT-behavioral-data/Run1_tsv/OT_gAAT_run-1.tsv", col_types=cols_only(
date = col_integer(),
time = col_character(),
subject = col_character(),
blockcode = col_character(),
blocknum = col_character(),
trialcode = col_character(),
values.trialcode = col_character(),
values.stimulus = col_character(),
values.initialresponse = col_character(),
values.RT = col_number()
)))
raw_r2 <- as_tibble(read_tsv("~/Dropbox/GLASS Lab/OT Study/data/raw-data/OT-gAAT-behavioral-data/Run2_tsv/OT_gAAT_run-2.tsv", col_types=cols_only(
date = col_integer(),
time = col_character(),
subject = col_character(),
blockcode = col_character(),
blocknum = col_character(),
trialcode = col_character(),
values.trialcode = col_character(),
values.stimulus = col_character(),
values.initialresponse = col_character(),
values.RT = col_number()
)))
# header line repeats when I used `cat` to merge, so get those out of there
raw_r1 <- raw_r1 %>% filter(!is.na(date))
raw_r2 <- raw_r2 %>% filter(!is.na(date))
unique(raw_r1$subject)
# D118 is showing up as "test" for some reason (checked against the .iqdat file), and D130_b is missing
# both of D130's and D142's visits were entered as "D130"/"D142", so change the subject ID for the visit on 052416 to D130_b and on
raw_r1$subject[raw_r1$date == 020916] <- "D118"
raw_r1$subject[raw_r1$date == 052416] <- "D130_b"
raw_r1$subject[raw_r1$date == 111516] <- "D142_b"
# fixing some other stuff in a similar vein
raw_r1$subject[raw_r1$subject == "D101_2"] <- "D101_b"
raw_r1$subject[raw_r1$subject == "D102_B"] <- "D102_b"
raw_r1$subject[raw_r1$subject == "D117_b"] <- "D117"
raw_r1$subject[raw_r1$subject == "D117_c"] <- "D117_b"
raw_r1$subject[raw_r1$subject == "D135_c"] <- "D135_b"
length(unique(raw_r1$subject)) # 78 unique - that is correct
unique(raw_r2$subject)
# some similar issues to fix
raw_r2$subject[raw_r2$subject == "D101_2Y"] <- "D101_b"
raw_r2$subject[raw_r2$subject == "D102_B"] <- "D102_b"
raw_r2$subject[raw_r2$subject == "D107_B_2"] <- "D107_b"
raw_r2$subject[raw_r2$subject == "D117_b"] <- "D117"
raw_r2$subject[raw_r2$subject == "D117_c"] <- "D117_b"
raw_r2$subject[raw_r2$subject == "D126_B"] <- "D126_b"
raw_r2$subject[raw_r2$subject == "D135_c"] <- "D135_b"
raw_r2$subject[raw_r2$date == 111516] <- "D142_b"
length(unique(raw_r2$subject)) # 78 unique - that is correct
str(raw_r1)
str(raw_r2)
# take out ITIs
unique(raw_r1$trialcode)
mlm_r1 <- filter(raw_r1, grepl("^A",trialcode))
unique(mlm_r1$trialcode)
unique(raw_r2$trialcode)
mlm_r2 <- filter(raw_r2, grepl("^A",trialcode))
unique(mlm_r2$trialcode)
# add a "trialnum" variable, grouped by ID
mlm_r1 <- mlm_r1 %>% group_by(subject) %>% mutate(trialnum = row_number())
View(mlm_r1)
mlm_r2 <- mlm_r2 %>% group_by(subject) %>% mutate(trialnum = row_number())
View(mlm_r2)
mlm_r1 %>% group_by(subject) %>% count() # everyone has 144 trials
mlm_r2 %>% group_by(subject) %>% count() # everyone has 144 trials, except D122_b has 177
# no clue why that happened, although I do vaguely remember we had one session where the task seemed to go on and on and we force quit it?
# remove trials 145-177 for D122_b:
mlm_r2 <- filter(mlm_r2, trialnum <=144)
mlm_r2 %>% group_by(subject) %>% count()
# now everyone has 144 trials
# add a column for "run"
mlm_r1$run <- "1"
mlm_r2$run <- "2"
# add a column for "visit"
mlm_r1$visit <- as.factor(ifelse(grepl("*_b", mlm_r1$subject), "2", "1"))
mlm_r2$visit <- as.factor(ifelse(grepl("*_b", mlm_r2$subject), "2", "1"))
mlm_r1 %>% group_by(subject, visit, run) %>% count() # all looks good: everyone has 144 trials for each run at each visit
mlm_r2 %>% group_by(subject, visit, run) %>% count() # all looks good: everyone has 144 trials for each run at each visit
# now remove the "_b" from visit 2 IDs
mlm_r1 <- mlm_r1 %>% ungroup() %>%
mutate(subject = str_replace(subject, "_b", ""))
unique(mlm_r1$subject)
mlm_r2 <- mlm_r2 %>% ungroup() %>%
mutate(subject = str_replace(subject, "_b", ""))
unique(mlm_r2$subject)
# merge data from the 2 runs
bx <- bind_rows(mlm_r1, mlm_r2)
# split up by visit
bx_v1 <- bx %>% filter(visit == "1")
bx_v2 <- bx %>% filter(visit == "2")
bx_v1 %>% group_by(subject, run) %>% count() # everyone has 144 trials for each run at visit 1
bx_v2 %>% group_by(subject, run) %>% count() # everyone has 144 trials for each run at visit 2
# add columns for "cond" at visit 1 and visit 2 [condition: A/placebo or B/oxytocin]
# load in the randomization data
randomize <- readRDS("~/Dropbox/GLASS Lab/OT Study/data/cleaned-data/randomization.rds")
str(randomize)
bx_v1 <- bx_v1 %>%
rename("ID" = subject)
bx_v2 <- bx_v2 %>%
rename("ID" = subject)
# get the IDs for everyone who got treatment A at visit 1
IDs_v1_txA <- randomize %>% filter(tx_v1 == "A")
txA_list <- IDs_v1_txA$ID
txA_list
# ifelse statement: if ID = any of those listed in txA_list, make cond_v1 = "A", else make it "B"
bx_v1 <- bx_v1 %>%
mutate(cond_v1 = as.factor(ifelse(ID == "D101" | ID == "D105" | ID == "D109" | ID == "D110" | ID == "D114" | ID == "D115" | ID == "D116" | ID == "D118" | ID == "D119" | ID == "D120" | ID == "D123" | ID == "D124" | ID == "D126" | ID == "D128" | ID == "D132" | ID == "D135" | ID == "D136" | ID == "D137" | ID == "D138" | ID == "D140" | ID == "D141" | ID == "D145" | ID == "D148" | ID == "D149" | ID == "D117", "A", "B")), cond_v2 = as.factor(ifelse(ID == "D101" | ID == "D105" | ID == "D109" | ID == "D110" | ID == "D114" | ID == "D115" | ID == "D116" | ID == "D118" | ID == "D119" | ID == "D120" | ID == "D123" | ID == "D124" | ID == "D126" | ID == "D128" | ID == "D132" | ID == "D135" | ID == "D136" | ID == "D137" | ID == "D138" | ID == "D140" | ID == "D141" | ID == "D145" | ID == "D148" | ID == "D149" | ID == "D117", "B", "A")))
# ifelse statement: if ID = any of those listed in txA_list, make cond_v2 = "B", else make it "A" (people who were A at visit 1 should be B for visit 2, and vice versa)
bx_v2 <- bx_v2 %>%
mutate(cond_v2 = as.factor(ifelse(ID == "D101" | ID == "D105" | ID == "D109" | ID == "D110" | ID == "D114" | ID == "D115" | ID == "D116" | ID == "D118" | ID == "D119" | ID == "D120" | ID == "D123" | ID == "D124" | ID == "D126" | ID == "D128" | ID == "D132" | ID == "D135" | ID == "D136" | ID == "D137" | ID == "D138" | ID == "D140" | ID == "D141" | ID == "D145" | ID == "D148" | ID == "D149" | ID == "D117", "B", "A")), cond_v1 = as.factor(ifelse(ID == "D101" | ID == "D105" | ID == "D109" | ID == "D110" | ID == "D114" | ID == "D115" | ID == "D116" | ID == "D118" | ID == "D119" | ID == "D120" | ID == "D123" | ID == "D124" | ID == "D126" | ID == "D128" | ID == "D132" | ID == "D135" | ID == "D136" | ID == "D137" | ID == "D138" | ID == "D140" | ID == "D141" | ID == "D145" | ID == "D148" | ID == "D149" | ID == "D117", "A", "B")))
levels(bx_v1$cond_v1)
levels(bx_v2$cond_v2)
head(bx_v1)
txA_list
bx_v1 %>% group_by(cond_v1) %>% count(ID) # IDs and cond match txA_list *note that some IDs from the randomization dataset are listed in txA but not in the behavioral data. This is because some participants were dropped or never ended up completing their experimental session: D109 (dropped after 1st session), D116 and D136 (never enrolled), D124 (incidental finding, excluded after T1MPRAGE).
bx_v2 %>% group_by(cond_v2) %>% count(ID) # IDs and cond match txA_list
bx_long <- bind_rows(bx_v1,bx_v2)
View(bx_long)
# rename some variables
colnames(bx_long)
bx_long <- bx_long %>% rename("push_pull" = values.initialresponse,
"gAAT_RT" = values.RT)
head(bx_long)
# add a new variable for condition
bx_long <- bx_long %>% mutate(cond = as.factor(ifelse(cond_v1 == "A" & visit == "1", "A",
ifelse(cond_v2 == "A" & visit == "2", "A",
ifelse(cond_v1 == "B" & visit == "1", "B",
ifelse(cond_v2 == "B" & visit == "2", "B", NA))))))
View(bx_long) # check that it coded the "cond" column correctly
# add a new variable for stimulus category
# in values.stimulus:
# 1 = spouse, 2 = living loved one/WHOTO, 3 = stranger, 4 = nomothetic death-related, 5 = neutral images
sort(unique(bx_long$values.stimulus))
bx_long <- bx_long %>%
mutate(stim = as.factor(ifelse(values.stimulus == "1B_1.jpg"|values.stimulus == "1B_2.jpg"|values.stimulus == "1B_3.jpg"|values.stimulus == "1Y_1.jpg"|values.stimulus == "1Y_2.jpg"|values.stimulus == "1Y_3.jpg", "spouse",
ifelse(values.stimulus == "2B_1.jpg"|values.stimulus == "2B_2.jpg"|values.stimulus == "2B_3.jpg"|values.stimulus == "2Y_1.jpg"|values.stimulus == "2Y_2.jpg"|values.stimulus == "2Y_3.jpg", "living",
ifelse(values.stimulus == "3B_1.jpg"|values.stimulus == "3B_2.jpg"|values.stimulus == "3B_3.jpg"|values.stimulus == "3Y_1.jpg"|values.stimulus == "3Y_2.jpg"|values.stimulus == "3Y_3.jpg", "stranger",
ifelse(values.stimulus == "4B_1.jpg"|values.stimulus == "4B_2.jpg"|values.stimulus == "4B_3.jpg"|values.stimulus == "4Y_1.jpg"|values.stimulus == "4Y_2.jpg"|values.stimulus == "4Y_3.jpg", "death",
ifelse(values.stimulus == "5B_1.jpg"|values.stimulus == "5B_2.jpg"|values.stimulus == "5B_3.jpg"|values.stimulus == "5Y_1.jpg"|values.stimulus == "5Y_2.jpg"|values.stimulus == "5Y_3.jpg", "neutral", NA)))))))
levels(bx_long$stim)
bx_long$stim <- relevel(bx_long$stim, ref = "neutral") # make neutral the reference level
# save it
saveRDS(bx_long, "~/Dropbox/GLASS Lab/OT gAAT for meeting 5-10-19/bx_long.rds")
Outlier removal follows conventions from other gAAT papers, which exclude trials with RTs that are equal to or above the 99th percentile (calculated from the sample as a whole) or equal to or below 1st percentile.
# calculate percentiles separately for placebo/oxytocin conditions
bx_pl <- bx_long %>% filter(cond == "A")
bx_ot <- bx_long %>% filter(cond == "B")
p99pl <- quantile(bx_pl$gAAT_RT, 0.99) # 1716.76 ms
p01pl <- quantile(bx_pl$gAAT_RT, 0.01) # 463 ms
p99ot <- quantile(bx_ot$gAAT_RT, 0.99) # 1711.07 ms
p01ot <- quantile(bx_ot$gAAT_RT, 0.01) # 473 ms
# identify placebo condition outliers
bx_pl <- bx_pl %>%
mutate(outlier_RT = ifelse(gAAT_RT <= p01pl | gAAT_RT >= p99pl, 1, 0))
table(bx_pl$outlier_RT)
227/11005 # about 2% of trials are outliers
# identify oxytocin condition outliers
bx_ot <- bx_ot %>%
mutate(outlier_RT = ifelse(gAAT_RT <= p01ot | gAAT_RT >= p99ot, 1, 0))
table(bx_ot$outlier_RT)
229/11003 # about 2% of trials are outliers
bx_long_noOut <- bind_rows(bx_pl, bx_ot) %>% filter(outlier_RT == 0)
dim(bx_long)
dim(bx_long_noOut) # after outliers removed, there are 22008 trials total of the original 22464: 22464 - (227+229) = 22008
saveRDS(bx_long_noOut, "~/Dropbox/GLASS Lab/OT gAAT for meeting 5-10-19/bx_long_noOut.rds")
library(psych)
# how much missing data is there?
ntrials <- bx_long_noOut %>% group_by(ID, cond) %>% count() # don't group by push_pull because peopled pulled/pushed on different numbers of trials, for ex. responses in the wrong direction
describe(ntrials$n/288) # everyone has at least 90% of the 288 trials expected at each visit (no more than 10% trials dropped because of outliers or missed responses)
describe(bx_long_noOut$gAAT_RT) # skewness is 1.17, kurtosis is 1.7
hist(bx_long_noOut$gAAT_RT) # distribution is skewed, as usual for RT data...
qqnorm(bx_long_noOut$gAAT_RT) # and skewness is reflected in QQ plot
# creating a wide dataset with the bias variables, which will then be converted back to a long dataset
# The arguments to spread():
# - data: Data object
# - key: Name of column containing the new column names
# - value: Name of column containing values
data_wide <- spread(bx_long_noOut, key=stim, value=gAAT_RT)
colnames(data_wide)
View(data_wide)
# subset by condition and direction
data_pushA <- data_wide %>% select(ID, cond, push_pull, visit, neutral, death, living, spouse, stranger) %>% filter(cond == "A" & push_pull == "PUSH") %>% group_by(ID) %>% mutate(neutralApush = median(neutral, na.rm = TRUE), deathApush = median(death, na.rm = TRUE), spouseApush = median(spouse, na.rm=TRUE), livingApush = median(living, na.rm=TRUE), strangerApush = median(stranger, na.rm=TRUE))
data_pullA <- data_wide %>% select(ID, cond, push_pull, visit, neutral, death, living, spouse, stranger) %>% filter(cond == "A" & push_pull == "PULL") %>% group_by(ID) %>% mutate(neutralApull = median(neutral, na.rm = TRUE), deathApull = median(death, na.rm = TRUE), spouseApull = median(spouse, na.rm=TRUE), livingApull = median(living, na.rm=TRUE), strangerApull = median(stranger, na.rm=TRUE))
data_pushB <- data_wide %>% select(ID, cond, push_pull, visit, neutral, death, living, spouse, stranger) %>% filter(cond == "B" & push_pull == "PUSH") %>% group_by(ID) %>% mutate(neutralBpush = median(neutral, na.rm = TRUE), deathBpush = median(death, na.rm = TRUE), spouseBpush = median(spouse, na.rm=TRUE), livingBpush = median(living, na.rm=TRUE), strangerBpush = median(stranger, na.rm=TRUE))
data_pullB <- data_wide %>% select(ID, cond, push_pull, visit, neutral, death, living, spouse, stranger) %>% filter(cond == "B" & push_pull == "PULL") %>% group_by(ID) %>% mutate(neutralBpull = median(neutral, na.rm = TRUE), deathBpull = median(death, na.rm = TRUE), spouseBpull = median(spouse, na.rm=TRUE), livingBpull = median(living, na.rm=TRUE), strangerBpull = median(stranger, na.rm=TRUE))
# summarize by ID
data_pushA1 <- data_pushA[!duplicated(data_pushA$ID), ] %>% select(-c(neutral, death, living, spouse, stranger, visit, cond, push_pull))
data_pushB1 <- data_pushB[!duplicated(data_pushB$ID), ] %>% select(-c(neutral, death, living, spouse, stranger, visit, cond, push_pull))
data_pullA1 <- data_pullA[!duplicated(data_pullA$ID), ] %>% select(-c(neutral, death, living, spouse, stranger, visit, cond, push_pull))
data_pullB1 <- data_pullB[!duplicated(data_pullB$ID), ] %>% select(-c(neutral, death, living, spouse, stranger, visit, cond, push_pull))
# merge the 4 subsets together
join1 <- left_join(data_pushA1, data_pushB1, by="ID")
join2 <- left_join(join1, data_pullA1, by = "ID")
bias <- left_join(join2, data_pullB1, by = "ID")
head(bias)
bias <- bias %>% mutate(bias_neu_A = neutralApush - neutralApull,
bias_dea_A = deathApush - deathApull,
bias_spo_A = spouseApush - spouseApull,
bias_str_A = strangerApush - strangerApull,
bias_liv_A = livingApush - livingApull,
bias_neu_B = neutralBpush - neutralBpull,
bias_dea_B = deathBpush - deathBpull,
bias_spo_B = spouseBpush - spouseBpull,
bias_str_B = strangerBpush - strangerBpull,
bias_liv_B = livingBpush - livingBpull
)
# drop the condition/stim specific variables
bias <- bias %>% select(-c(neutralApush, deathApush, spouseApush, livingApush, strangerApush, neutralBpush, deathBpush, spouseBpush, livingBpush, strangerBpush, neutralApull, deathApull, spouseApull, livingApull, strangerApull, neutralBpull, deathBpull, spouseBpull, livingBpull, strangerBpull))
colnames(bias)
# turn it into a long dataset
bias_long <- gather(bias,
key = "stim",
value = "bias", -c(ID))
head(bias_long)
# create new columns for stimulus category and condition
bias_long$cond <- as.factor(ifelse(grepl("*_A", bias_long$stim), "A", "B"))
bias_long$stimulus <- as.factor(ifelse(grepl("*neu*", bias_long$stim), "neutral",
ifelse(grepl("*dea*", bias_long$stim), "death",
ifelse(grepl("*spo*", bias_long$stim), "spouse",
ifelse(grepl("*str*", bias_long$stim), "stranger",
ifelse(grepl("*liv*", bias_long$stim), "living", NA))))))
bias_long <- bias_long %>% select(-stim) # no longer need the "stim" variable
describe(bias_long$bias) # skewness is -.01, kurtosis is 1.34, M = 14.04, SD = 70.01
hist(bias_long$bias) # normal distribution
qqnorm(bias_long$bias) # definitely some outliers
# check out the outliers
## function from https://stackoverflow.com/questions/12866189/calculating-the-outliers-in-r
outfun <- function(x) {
abs(x-mean(x,na.rm=TRUE)) > 3*sd(x,na.rm=TRUE)
}
# add a variable for outlier = T/F
bias_long$outlier <- outfun(bias_long$bias)
table(bias_long$outlier) # 4 observations where outlier = TRUE (>3SD)
# which are these?
bias_long %>% filter(outlier==TRUE)
### this is another way to do it using boxplot.stats
# outlier_values <- boxplot.stats(medians_long$bias)$out
# boxplot(data_long$bias, main="Outliers", boxwex=0.3)
# mtext(paste("Outlying values: ", paste(round(outlier_values, digits=2), collapse=", ")), cex=0.9, side=1)
saveRDS(bias_long, "~/Dropbox/GLASS Lab/OT gAAT for meeting 5-10-19/bias_long.rds")
The four extreme outliers (>3 SD) are:
To calculate trial-level bias (aka “compatibility”) scores: separate push minus pull trials, then subtract RT on that trial at run1 from RT on that trial at run2.
bx_long <- readRDS("~/Dropbox/GLASS Lab/OT gAAT for meeting 5-10-19/bx_long.rds")
bx_r1.1 <- bx_long %>% filter(run == "1")
bx_r2.1 <- bx_long %>% filter(run == "2")
tail(bx_r1.1)
tail(bx_r2.1)
# run1 should line up with run2: "stim" should match (i.e, spouse trials with spouse trials) but push_pull should say PUSH at one run and PULL at the other
# calculate percentiles for outlier removal (run1 and run2)
p99r1 <- quantile(bx_r1.1$gAAT_RT, 0.99)
p01r1 <- quantile(bx_r1.1$gAAT_RT, 0.01)
p99r1 # 99th percentile RTs = 1713.38ms
p01r1 # 1st percentile RTs = 472ms
p99r2 <- quantile(bx_r2.1$gAAT_RT, 0.99)
p01r2 <- quantile(bx_r2.1$gAAT_RT, 0.01)
p99r2 # 99th percentile RTs = 1712.45ms
p01r2 # 1st percentile RTs = 463ms
bx_r1.2 <- bx_r1.1 %>%
mutate(outlier_RT = ifelse(gAAT_RT <= p01r1 | gAAT_RT >= p99r1, 1, 0))
table(bx_r1.2$outlier_RT)
227/11005 # 227 trials will be dropped, 11005 will be retained...about 2% of trials.
bx_r2.2 <- bx_r2.1 %>%
mutate(outlier_RT = ifelse(gAAT_RT <= p01r2 | gAAT_RT >= p99r2, 1, 0))
table(bx_r2.2$outlier_RT)
230/11002 # 230 trials will be dropped, 11002 will be retained...about 2% of trials.
bx_r1r2 <- bind_cols(bx_r1.2, bx_r2.2)
dim(bx_r1r2) # after outliers removed, there are 11232 trials total
# make sure that trials in the two runs line up post-merge
head(bx_r1r2)
tail(bx_r1r2)
# get rid of any trials with outliers in either run1 or run2
# (need run1/run2 to be the same length so that the trials match up)
bx_r1r2.1 <- bx_r1r2 %>% filter(outlier_RT == "0" & outlier_RT1 == "0")
count(bx_r1r2.1) # 10790 obs.
10790/11232 # ~96% of trials retained = 4% data excluded
push_pull <- bx_r1r2.1 %>% filter(push_pull == "PUSH" & push_pull1 == "PULL") %>% mutate(bias = (gAAT_RT-gAAT_RT1))
pull_push <- bx_r1r2.1 %>% filter(push_pull == "PULL" & push_pull1 == "PUSH") %>% mutate(bias = (gAAT_RT1-gAAT_RT))
bx_pp <- bind_rows(push_pull,pull_push)
head(bx_pp) # can see from this that D101 (and possibly others) has only a few trials, because they failed to reverse the instructions (i.e., push_pull and push_pull1 columns have the same values instead of PUSH at one and PULL at the other)
# in essence, this removes their "error" trials
bx_pp %>% group_by(ID, cond) %>% count() # yep, this is visible in their *very low* trialcounts:
# D101 A n = 3
# D141 A n = 18
# D147 B n = 1
# D148 A n = 19
As we see from the last part of the script above, 4 participants have extremely low trial counts at one visit or another, due to failure to reverse the instructions. This will be addressed next.
# create a "drop" variable for these 4
bx_pp <- bx_pp %>% mutate(low_n_trials = ifelse(ID == "D101" & cond == "A" | ID == "D141" & cond == "A"| ID == "D148" & cond == "A" | ID == "D147" & cond == "B", 1, 0))
# check whether the rows line up
notlinedup <- bx_pp %>% filter(stim != stim1)
notlinedup
notlinedup <- bx_pp %>% filter(trialcode != trialcode1)
notlinedup
notlinedup <- bx_pp %>% filter(trialnum != trialnum1)
notlinedup
notlinedup <- bx_pp %>% filter(push_pull == push_pull1)
notlinedup
# 0 rows returned; all the rows match. Thus we conclude the rows from the two runs line up.
saveRDS(bx_pp, "~/Dropbox/GLASS Lab/OT gAAT for meeting 5-10-19/bx_long_trialbytrialBias.rds")