This week mainly involved building on the exploratory analyses I started last week.

Goals

Specifically, my goals are:

- to try out statistical analyses, in particular applying them to the data I calculated last week
- to start finalising my report and putting it all in an R Markdown file, knitting the doc every so often to ensure it works!

Load packages

library(tidyverse)

## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --

## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.2     v dplyr   1.0.6
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1

## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(readspss)
library(plotrix)
library(gt)
library(ggpubr)
library(jmv)
library(rstatix)

## 
## Attaching package: 'rstatix'

## The following object is masked from 'package:stats':
## 
##     filter

library(here)

## here() starts at C:/Users/miche/Documents/Coding-R/Learning logs

library(readxl)
library(psych)

## 
## Attaching package: 'psych'

## The following objects are masked from 'package:jmv':
## 
##     pca, reliability

## The following object is masked from 'package:plotrix':
## 
##     rescale

## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha

library(gtsummary)

## #BlackLivesMatter

Loading data

data <- read.sav("Humiston & Wamsley 2019 data.sav")

Cleaning data

cleandata <- data %>% 
  filter(exclude == "no")

Question 1: Is the effect of TMR mediated by bias type?

To remind everyone of my progress from last week, I developed two different plots.

One plot compared the mean bias levels for race and gender bias across time.

Here we found that both types of bias showed similar trends across time; however, race bias showed a smaller increase at the 1-week delay time point.

biasdata <- data.frame(
  condition = factor(c("Gender", "Gender", "Gender", "Gender", "Race", "Race", "Race", "Race")),
  # set the level order inside factor() so the time points stay in chronological order
  time = factor(c("Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week"),
                levels = c("Baseline", "Prenap", "Postnap", "1-week")),
  bias_av = c(0.49767, 0.3357758, 0.3389568, 0.4857734, 0.533979, 0.1080363, 0.2803619, 0.3292815),
  se = c(0.07186912, 0.11089910, 0.12809562, 0.08824424, 0.1051369, 0.1394592, 0.1035415, 0.1031192)
)

ggplot(data = biasdata, aes(
  x = factor(time, levels = c("Baseline", "Prenap", "Postnap", "1-week")), 
  y = bias_av,
  colour = condition,
  group = condition)) +
  geom_line() +
  geom_errorbar(aes(
    x= time,
    ymin=bias_av-se,
    ymax=bias_av+se),
width=0.1, colour="grey", alpha= 0.9) +
  labs(x = "time", title = "Bias change for bias type")

The other plot looked specifically at race bias, comparing the cued and uncued conditions.

racebiasdata2 <- data.frame(
  condition = factor(c("Cued", "Cued", "Cued", "Cued", "Uncued", "Uncued", "Uncued", "Uncued")),
  # set the level order inside factor() so the time points stay in chronological order
  time = factor(c("Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week"),
                levels = c("Baseline", "Prenap", "Postnap", "1-week")),
  bias_av = c(0.533979, 0.1080363, 0.2803619, 0.3292815, 0.7215597, 0.3168438, 0.3243954, 0.5191932),
  se = c(0.1051369, 0.1394592, 0.1035415, 0.1031192, 0.1193954, 0.1462797, 0.1344576, 0.1267464)
)

ggplot(data = racebiasdata2, aes(
  x = factor(time, levels = c("Baseline", "Prenap", "Postnap", "1-week")), 
  y = bias_av,
  colour = condition,
  group = condition)) +
  geom_line() +
  geom_errorbar(aes(
    x= time,
    ymin=bias_av-se,
    ymax=bias_av+se),
width=0.1, colour="grey", alpha= 0.9) +
  labs(x = "time", title = "Race Bias Change")

Here we observed that there doesn’t appear to be much difference between cued and uncued conditions for race bias.

So far, the plots suggest that TMR is unsuccessful, regardless of bias type.

Statistical analyses

Now, I’ll try applying statistical analyses!

Firstly, I will attempt to use the stat_compare_means function, by applying it to the plot.

racevsgender <- list(c("race", "gender"))
  
ggplot(data = biasdata, aes(
  x = factor(time, levels = c("Baseline", "Prenap", "Postnap", "1-week")), 
  y = bias_av,
  colour = condition,
  group = condition)) +
  geom_line() +
  geom_errorbar(aes(
    x= time,
    ymin=bias_av-se,
    ymax=bias_av+se),
width=0.1, colour="grey", alpha= 0.9) +
  labs(x = "time", title = "Bias change for bias type") +
  stat_compare_means(comparisons = racevsgender, method = "t.test")

## Warning: Computation failed in `stat_signif()`:
## missing value where TRUE/FALSE needed

Hm, it’s coming up with a warning: “Computation failed in stat_signif(): missing value where TRUE/FALSE needed”.

Not quite sure what this means! I’ll rewatch Jenny’s Q and A, and maybe I’ll figure it out.
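
A possible explanation, which I haven’t confirmed: stat_compare_means() has to run the t-test on the rows of the data being plotted, and biasdata only contains one summarised mean per condition and time point. The comparisons list also names “race” and “gender”, which aren’t values on the x-axis (the x-axis is time). The sketch below uses base R and made-up numbers (raw_scores is hypothetical, not the study data) to show that a t-test can’t run on single means but can run on raw observations.

```r
# One summarised mean per group: not enough data for a Welch t-test,
# so we catch the error and keep its message
single_means <- tryCatch(
  t.test(0.1080363, 0.3357758),
  error = function(e) conditionMessage(e)
)
single_means

# Hypothetical raw scores (simulated, NOT the study data)
set.seed(1)
raw_scores <- data.frame(
  bias_type = rep(c("Gender", "Race"), each = 15),
  score = c(rnorm(15, mean = 0.34, sd = 0.4), rnorm(15, mean = 0.11, sd = 0.4))
)

# With one row per observation the test runs normally
raw_test <- t.test(score ~ bias_type, data = raw_scores)
raw_test$p.value
```

If that guess is right, the plot would need the raw participant-level data (one row per score) rather than the table of means.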

In the meantime, I’ll move on to doing t-tests.

A t-test compares two means, so I’ll need to run a separate t-test at each time point (baseline, prenap, postnap and 1-week delay) to compare the types of bias.

Additionally, I could compare the means between each time point, to determine if bias significantly changes between time points for each type of bias.

I’ll start off with race vs gender bias for prenap.

race vs gender

prenap_race <- cleandata %>%
  filter(Cue_condition == "race")

prenap_gender <- cleandata %>%
  filter(Cue_condition == "gender")

t.test(prenap_race$preIATcued, prenap_gender$preIATcued)

Oh no! This doesn’t appear to work either.

I’m getting an error that there are not enough 'x' observations. When I check the environment, for some reason prenap_race and prenap_gender both have 0 observations!

In the data, I assumed it was coded as “race” and “gender”, but perhaps I have to use the numeric codes instead, i.e. race = 1, gender = 2.

Let’s try that.

prenap_race <- cleandata %>% 
  filter(Cue_condition == 1)

prenap_gender <- cleandata %>%
  filter(Cue_condition == 2)

t.test(prenap_race$preIATcued, prenap_gender$preIATcued)

Hm, they still have 0 observations.

After looking back at my other learning logs and Rmarkdowns, I think I found the problem.

Weirdly, the entries look different depending on where I view the data. In the csv file, the data is coded in 1’s and 0’s.

However, in the loaded data it is actually coded in words, specifically:

- “race cue played”
- “gender cue played”

That’s why there were no observations when I used “race” and “gender”: those entries didn’t exist!

Let’s do it again.
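
A quick way to avoid guessing is to ask R which values a column actually contains before filtering on it. Below is a minimal sketch with a made-up data frame (toy is hypothetical; only the column name and the two labels are copied from my data).

```r
# Hypothetical stand-in for cleandata (NOT the study data)
toy <- data.frame(
  Cue_condition = factor(c("race cue played", "gender cue played",
                           "race cue played", "gender cue played"))
)

# List the distinct values, and count the rows for each
levels(toy$Cue_condition)
table(toy$Cue_condition)

# Filtering on a value that does not exist silently returns zero rows
n_wrong <- nrow(toy[toy$Cue_condition == "race", , drop = FALSE])
n_right <- nrow(toy[toy$Cue_condition == "race cue played", , drop = FALSE])
c(wrong = n_wrong, right = n_right)
```

Running levels() or table() on the real column first would have shown the full labels straight away.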

prenap_race <- cleandata %>% 
  filter(Cue_condition == "race cue played")

prenap_gender <- cleandata %>%
  filter(Cue_condition == "gender cue played")

t.test(prenap_race$preIATcued, prenap_gender$preIATcued)

## 
##  Welch Two Sample t-test
## 
## data:  prenap_race$preIATcued and prenap_gender$preIATcued
## t = -1.2782, df = 28.572, p-value = 0.2115
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.5923922  0.1369133
## sample estimates:
## mean of x mean of y 
## 0.1080363 0.3357758

Tada! It worked!!

The p-value is 0.2115, much bigger than 0.05. According to this analysis, we cannot say there is a significant difference between race and gender bias types at the prenap time point.

However, this is not of much concern as this is before TMR occurs.

Now we will repeat for post-nap.

prenap_race <- cleandata %>% 
  filter(Cue_condition == "race cue played")

prenap_gender <- cleandata %>%
  filter(Cue_condition == "gender cue played")

t.test(prenap_race$postIATcued, prenap_gender$postIATcued)

## 
##  Welch Two Sample t-test
## 
## data:  prenap_race$postIATcued and prenap_gender$postIATcued
## t = -0.35575, df = 26.385, p-value = 0.7249
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.3969202  0.2797304
## sample estimates:
## mean of x mean of y 
## 0.2803619 0.3389568

For this analysis, p= 0.7249 > 0.05. Again, there is no significant difference!

When comparing these results to the plot, this is not that surprising. However, there may potentially be an effect at the 1 week delay time point.

prenap_race <- cleandata %>% 
  filter(Cue_condition == "race cue played")

prenap_gender <- cleandata %>%
  filter(Cue_condition == "gender cue played")

t.test(prenap_race$weekIATcued, prenap_gender$weekIATcued)

## 
##  Welch Two Sample t-test
## 
## data:  prenap_race$weekIATcued and prenap_gender$weekIATcued
## t = -1.153, df = 28.924, p-value = 0.2583
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.4341075  0.1211237
## sample estimates:
## mean of x mean of y 
## 0.3292815 0.4857734

Following this analysis, we can confirm there is no significant difference between bias types at the week delay time point as p = 0.2583 > 0.05.
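
Since the three t-tests above only differ in the column being compared, they could be run in one go. This is a sketch on simulated data (sim is hypothetical and only borrows the column names from cleandata). It also applies a Holm correction with p.adjust(), which is my own addition, because running several tests increases the chance of a false positive.

```r
# Simulated stand-in for cleandata (NOT the study data)
set.seed(42)
n <- 31
sim <- data.frame(
  Cue_condition = rep(c("race cue played", "gender cue played"), length.out = n),
  preIATcued  = rnorm(n, mean = 0.2, sd = 0.5),
  postIATcued = rnorm(n, mean = 0.3, sd = 0.5),
  weekIATcued = rnorm(n, mean = 0.4, sd = 0.5)
)

time_cols <- c("preIATcued", "postIATcued", "weekIATcued")

# Welch t-test (the t.test() default) for each time point
p_values <- sapply(time_cols, function(col) {
  t.test(sim[[col]] ~ sim$Cue_condition)$p.value
})

# Holm-adjusted p-values for the three comparisons
results <- data.frame(
  time_point = time_cols,
  p = p_values,
  p_holm = p.adjust(p_values, method = "holm")
)
results
```

The adjusted p-values can only be equal to or larger than the raw ones, so a non-significant result stays non-significant after correction.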

Regardless, I will now try to do statistical analyses for the second plot which focuses on just race.

Race uncued vs cued

This time, I will try using the jmv package to run t-tests.

Like before, I will run a t-test for each time point.

Using the jmv package is similar to the previous method: you indicate which variable’s means you are comparing and how you are dividing the data. Here I indicate that I want to compare the means of base_IAT_race for each cue condition.

ttestIS(formula = base_IAT_race ~ Cue_condition, data = cleandata)

## 
##  INDEPENDENT SAMPLES T-TEST
## 
##  Independent Samples T-Test                                             
##  ---------------------------------------------------------------------- 
##                                    Statistic    df          p           
##  ---------------------------------------------------------------------- 
##    base_IAT_race    Student's t    -1.182656    29.00000    0.2465512   
##  ----------------------------------------------------------------------

The p-value calculated is 0.247 (rounded), which is larger than 0.05. Therefore we can say there is no significant difference between cued and uncued conditions for race bias. Again, this isn’t that concerning as this is prior to TMR.
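
One thing to keep in mind when comparing the two approaches: the jmv output is labelled Student’s t, while t.test() reported a Welch test, so the two methods are not running exactly the same test by default. As far as I understand, Student’s t assumes equal variances and has df = n - 2 (29 here, which fits 31 participants), whereas Welch’s version adjusts the df downwards. A sketch in base R with simulated scores (sim_iat and its group sizes of 16 and 15 are hypothetical):

```r
# Simulated stand-in for cleandata (NOT the study data)
set.seed(7)
sim_iat <- data.frame(
  Cue_condition = rep(c("race cue played", "gender cue played"), times = c(16, 15)),
  base_IAT_race = rnorm(31, mean = 0.6, sd = 0.5)
)

# Welch: the t.test() default, df is usually not a whole number
welch <- t.test(base_IAT_race ~ Cue_condition, data = sim_iat)

# Student: equal variances assumed, df = n - 2
student <- t.test(base_IAT_race ~ Cue_condition, data = sim_iat, var.equal = TRUE)

c(welch_df = unname(welch$parameter), student_df = unname(student$parameter))
```

Setting var.equal = TRUE in t.test() should therefore give results that are comparable with the jmv tables.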

Now we repeat for prenap.

ttestIS(formula = pre_IAT_race ~ Cue_condition, data = cleandata)

## 
##  INDEPENDENT SAMPLES T-TEST
## 
##  Independent Samples T-Test                                            
##  --------------------------------------------------------------------- 
##                                   Statistic    df          p           
##  --------------------------------------------------------------------- 
##    pre_IAT_race    Student's t    -1.028076    29.00000    0.3124129   
##  ---------------------------------------------------------------------

Again, we find no significant result, and again this isn’t concerning as TMR has not been implemented at this stage.

Now, post-nap.

ttestIS(formula = post_IAT_race ~ Cue_condition, data = cleandata)

## 
##  INDEPENDENT SAMPLES T-TEST
## 
##  Independent Samples T-Test                                              
##  ----------------------------------------------------------------------- 
##                                    Statistic     df          p           
##  ----------------------------------------------------------------------- 
##    post_IAT_race    Student's t    -0.2637360    29.00000    0.7938484   
##  -----------------------------------------------------------------------

Unfortunately, there’s no significant difference for this time point either! This indicates that at the post-nap time point, there is no significant difference in implicit bias levels between the cued and uncued conditions, which suggests that TMR does not have an effect post-nap.

However, let’s see if there is an effect at the one-week delay time point.

ttestIS(formula = week_IAT_race ~ Cue_condition, data = cleandata)

## 
##  INDEPENDENT SAMPLES T-TEST
## 
##  Independent Samples T-Test                                             
##  ---------------------------------------------------------------------- 
##                                    Statistic    df          p           
##  ---------------------------------------------------------------------- 
##    week_IAT_race    Student's t    -1.175013    29.00000    0.2495474   
##  ----------------------------------------------------------------------

Again, there is no significant difference! (p=0.25 > 0.05)

Moving on from question 1, let’s get a start on question 2!

Question 2: Does the duration of cue sound influence the effect of TMR?

This question looks at whether the length of time the cue is played for during the nap influences the effectiveness of TMR.

Perhaps, the longer the cue is played for, the stronger the TMR effect will be?

cue_minutesdescribe <- cleandata %>% 
  select(cue_minutes)

describe(cue_minutesdescribe)

##    vars  n  mean    sd median trimmed   mad min max range skew kurtosis   se
## X1    1 31 21.56 10.82     19   21.52 11.86 2.5  44  41.5 0.11     -0.8 1.94

I thought of this question because the cue duration ranges from 2.5 minutes to 44 minutes! That’s a very large range (41.5 minutes), considering that the study is looking at the effect of reactivating memories during sleep using cues. How could a memory be reactivated as effectively if the cue is only played for 2.5 minutes, compared to 44?

Therefore, this question aims to address if there is a correlation between duration of cue, and bias change.

Selecting relevant variables

Firstly, I select the relevant variables to calculate the descriptive statistics.

I then mutate these variables to calculate differential bias change. I got this equation from the original paper.

cueduration_data <- cleandata %>% 
  select(ParticipantID, baseIATcued, weekIATcued, baseIATuncued, weekIATuncued, cue_minutes) %>% 
  mutate(cued_differential = baseIATcued - weekIATcued,
         uncued_differential = baseIATuncued - weekIATuncued,
         diff_bias_change = cued_differential - uncued_differential)
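
To check which direction the score runs in, here is a small worked example with made-up numbers (example is hypothetical, not a real participant). Each differential is baseline minus 1-week, so a drop in bias gives a positive number, and diff_bias_change is positive when bias dropped more in the cued condition than in the uncued one.

```r
# One made-up participant (NOT from the study data)
example <- data.frame(
  baseIATcued = 0.6, weekIATcued = 0.2,      # cued bias drops by 0.4
  baseIATuncued = 0.6, weekIATuncued = 0.5   # uncued bias drops by 0.1
)

example$cued_differential <- example$baseIATcued - example$weekIATcued
example$uncued_differential <- example$baseIATuncued - example$weekIATuncued
example$diff_bias_change <- example$cued_differential - example$uncued_differential

example$diff_bias_change
```

For this participant the score comes out at about 0.3, i.e. the cued bias fell by roughly 0.3 more than the uncued bias.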

Plot

I then use ggplot to make a scatter plot to visualise the data.

I include a line of best fit using geom_smooth

ggplot(data = cueduration_data, aes(
  x = cue_minutes,
  y = diff_bias_change
))+
  geom_point()+
  geom_smooth(method = "lm", 
              se = FALSE)+ 
  scale_x_continuous(expand = c(0,0),limits = c(0,50))+ 
  scale_y_continuous(expand = c(0,0),limits = c(-2,1.5))+
  labs(title = "Fig 2", 
       x = "Cue duration (minutes)",
       y = "Differential bias change")+
  theme_bw()

## `geom_smooth()` using formula 'y ~ x'

Looking at the plot, there seems to be a slight decreasing trend, but it is not very dramatic. It doesn’t look very convincing that there is an effect.

Therefore, let’s try to apply some statistical analysis to determine if there is!

Statistical analysis

To do the analysis, I need to use the cor.test function.

Let’s give that a go.

cor.test(cueduration_data$cue_minutes, cueduration_data$diff_bias_change)

## 
##  Pearson's product-moment correlation
## 
## data:  cueduration_data$cue_minutes and cueduration_data$diff_bias_change
## t = -0.99925, df = 29, p-value = 0.3259
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.5041878  0.1837791
## sample estimates:
##        cor 
## -0.1824417

This test shows a correlation of -0.182; however, it is not significant (p = 0.326).

This would be a good suggestion for future research: to control cue duration, or to conduct a quasi-experiment with a larger sample size so the spread of data is better. The data I used for this analysis wasn’t ideal, as there were few participants with a really short cue duration compared to other durations.
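
As a sanity check on the output above, the t statistic and p-value can be recovered from the correlation alone using t = r * sqrt(df) / sqrt(1 - r^2). A minimal sketch in base R, plugging in the values printed by cor.test():

```r
r <- -0.1824417   # correlation reported by cor.test()
n <- 31           # participants, so df = n - 2 = 29
df <- n - 2

# t statistic for testing whether the correlation differs from 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

round(c(t = t_stat, p = p_value), 4)
```

Both numbers line up with the cor.test() output, which is reassuring that I’m reading it correctly.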

The direction of the correlation needs careful reading. TMR is meant to reduce bias, and each differential is calculated as baseline minus 1-week, so a positive differential bias change means bias fell more in the cued condition than in the uncued condition. A negative correlation would therefore mean that longer cue durations go with a smaller cued advantage, which is the opposite of what I expected.

In the plot, the line of best fit slopes slightly downwards as cue duration increases, but since the correlation is not significant I can’t read much into it. Perhaps with a larger sample size and a better spread of durations, future researchers could find an effect.

Next steps

The next step is to rewatch the Q and A from this week to get an even better understanding of statistical analyses.

I also need to start my third question which is looking at if changes in procedure may have affected the results of the presented study.

Learning Log 8

Michelle

24/07/2021