Week 10 coding goals

Quite a lot of goals this week since it’s the last week before everything is done, but I’ve managed to finish up almost everything.

Here were the goals for Week 10:

Fix up issues with plotting exploratory question 1
Finalise exploratory analysis question 2 and 3
Work out how to write functions

Challenges and successes

1. Fix up issues with plotting exploratory question 1

At the end of last week, I had left exploratory question 1 as a set of boxplots comparing men’s and women’s reactions to TMR overall. However, this was not optimal for a few reasons.

The original plot aggregated the cued and uncued IAT scores. Given that cue condition (cued vs uncued) is the independent variable, we couldn’t pool these scores together. So, rather than attempt to look only at the gender variable, I decided to look at the effect of TMR overall and split the cued and uncued conditions. I also changed the plot from a boxplot to a line graph, since a boxplot would need four boxes at each time point, which would be difficult to interpret.

First, I had to split the data into two datasets according to gender and select the cued and uncued IAT scores. I did this by filtering the overall dataset by gender and selecting the variables we wanted to look at. Then, using summarise_all(), I calculated the mean of each variable so that I had points to plot on the line graph.

#load packages
library(tidyverse)

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

## ✓ ggplot2 3.3.4     ✓ purrr   0.3.4
## ✓ tibble  3.1.2     ✓ dplyr   1.0.6
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(ggplot2)
library(readspss)

#read data
data <- read.sav("Humiston & Wamsley 2019 data.sav")

#remove excluded participants
cleandata <- data %>%
  filter(exclude == "no")

#filter data according to gender
explore_male <- cleandata %>%
  filter(General_1_Sex == "Male") %>%
  select(baseIATcued, baseIATuncued, preIATcued, preIATuncued, postIATcued, postIATuncued, weekIATcued, weekIATuncued) 

explore_3a_male <- explore_male %>%
  summarise_all(list(mean = mean))

explore_female <- cleandata %>%
  filter(General_1_Sex == "Female") %>%
  select(baseIATcued, baseIATuncued, preIATcued, preIATuncued, postIATcued, postIATuncued, weekIATcued, weekIATuncued) 

explore_3a_female <- explore_female %>%
  summarise_all(list(mean = mean))
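A possible shortcut for the step above, sketched on made-up numbers rather than the real dataset: group_by() plus summarise(across()) gives the means for both genders in one table, so the data doesn't have to be filtered twice. The toy tibble and the name toy_means are placeholders for illustration only.

```r
library(dplyr)

# Toy stand-in for cleandata (values are invented for illustration)
toy <- tibble(
  General_1_Sex = c("Male", "Male", "Female", "Female"),
  baseIATcued   = c(0.2, 0.4, 0.6, 0.8),
  baseIATuncued = c(0.1, 0.3, 0.5, 0.7)
)

# One row per gender, one "_mean" column per IAT variable
toy_means <- toy %>%
  group_by(General_1_Sex) %>%
  summarise(across(everything(), mean, .names = "{.col}_mean"))

toy_means
```

The column names come out the same as with summarise_all(list(mean = mean)), for example baseIATcued_mean, so later code that refers to those names would still work.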

Then, I had to work out how to put this into a dataframe. I initially drew this out on a piece of paper so I could figure out the columns. Here is a photo of my notebook:

After visualising it, I came up with this tibble with the following 4 columns. I think this could be neater, but I will look into that after I’ve finished my exploratory analyses.

#exploratory analysis question 1 dataframe
df_explore_3a <- tibble(
    condition = c(rep("cued", 8), rep("uncued", 8)),
    time = rep(c("Baseline", "Prenap", "Postnap", "1-week"),4),
    gender = rep(c(rep("Male",4), rep("Female", 4)),2),
    bias_av = c(explore_3a_male$baseIATcued_mean,
                explore_3a_male$preIATcued_mean,
                explore_3a_male$postIATcued_mean, 
                explore_3a_male$weekIATcued_mean,
                explore_3a_female$baseIATcued_mean,
                explore_3a_female$preIATcued_mean,
                explore_3a_female$postIATcued_mean,
                explore_3a_female$weekIATcued_mean,
                explore_3a_male$baseIATuncued_mean,
                explore_3a_male$preIATuncued_mean,
                explore_3a_male$postIATuncued_mean,
                explore_3a_male$weekIATuncued_mean,
                explore_3a_female$baseIATuncued_mean,
                explore_3a_female$preIATuncued_mean,
                explore_3a_female$postIATuncued_mean,
                explore_3a_female$weekIATuncued_mean))
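One way this could be made neater, as a rough sketch on made-up data: pivot_longer() can split column names like baseIATcued into a time part and a condition part, so the same four columns come out without typing each mean by hand. The toy values and the names toy and toy_long are assumptions for illustration, and only two of the four timepoints are included to keep it short.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for cleandata (values are invented for illustration)
toy <- tibble(
  General_1_Sex = c("Male", "Male", "Female", "Female"),
  baseIATcued   = c(0.2, 0.4, 0.6, 0.8),
  baseIATuncued = c(0.1, 0.3, 0.5, 0.7),
  preIATcued    = c(0.3, 0.5, 0.5, 0.7),
  preIATuncued  = c(0.2, 0.2, 0.4, 0.6)
)

toy_long <- toy %>%
  # "baseIATcued" becomes time = "base", condition = "cued"
  pivot_longer(
    cols = -General_1_Sex,
    names_to = c("time", "condition"),
    names_pattern = "(.*)IAT(.*)",
    values_to = "bias"
  ) %>%
  # average within each gender x condition x time cell
  group_by(gender = General_1_Sex, condition, time) %>%
  summarise(bias_av = mean(bias), .groups = "drop")

toy_long
```

The time values here are the raw prefixes ("base", "pre"), so they would still need relabelling to "Baseline", "Prenap" and so on before plotting.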

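However, when I tried to put this into a graph with the following code, it didn’t work. Because the x axis is a factor, ggplot2 grouped the data by every combination of the discrete variables (time, gender, and condition), so each group contained only one point and there was nothing to join into a line. I needed to specify the grouping myself.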

  ggplot(df_explore_3a,
         aes(x = factor(time, levels = c("Baseline", "Prenap", "Postnap", "1-week")),
             y = bias_av,
             colour = gender,
             lty = condition)) +
  geom_line()

## geom_path: Each group consists of only one observation. Do you need to adjust
## the group aesthetic?

I then asked on Slack how I could produce the graph that I wanted, and Jenny R replied telling me to create a new column in my dataframe that combined gender and condition, which I did with mutate().

df_explore_3a_2 <- df_explore_3a %>%
  mutate(gender_cue = case_when(
    gender == "Female" & condition == "cued" ~ "cued_Female",
    gender == "Female" & condition == "uncued" ~ "uncued_Female",
    gender == "Male" & condition == "cued" ~ "cued_Male",
    gender == "Male" & condition == "uncued" ~ "uncued_Male"))

From there, I put it into ggplot and mapped gender to the colour of the line and condition to the line type (the lty aesthetic), with the new gender_cue column as the group. In the end, this was a success!

ggplot(df_explore_3a_2,
       aes(x = factor(time, levels = c("Baseline", "Prenap", "Postnap", "1-week")),
           y = bias_av,
           colour = gender,
           group = gender_cue,
           lty = condition
           ))+
  geom_line()+
  scale_colour_brewer(palette = "Set2") + 
  labs(x = "", 
       y = "Average D600 Bias Score", 
       caption = "Fig 6. Male and female average D600 scores at each IAT timepoint") +
  theme_classic()
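As an aside, the combined grouping doesn't strictly need its own column. A minimal sketch, assuming a dataframe shaped like the one above but filled with made-up numbers (toy_df and p are placeholder names): interaction() can build the gender-by-condition grouping directly inside aes().

```r
library(ggplot2)

# Toy data with the same four columns as df_explore_3a (values are invented)
toy_df <- data.frame(
  condition = c(rep("cued", 8), rep("uncued", 8)),
  time      = rep(c("Baseline", "Prenap", "Postnap", "1-week"), 4),
  gender    = rep(c(rep("Male", 4), rep("Female", 4)), 2),
  bias_av   = seq(0.1, 1.6, by = 0.1)
)

# Fix the order of the timepoints on the x axis
toy_df$time <- factor(toy_df$time,
                      levels = c("Baseline", "Prenap", "Postnap", "1-week"))

p <- ggplot(toy_df,
            aes(x = time,
                y = bias_av,
                colour = gender,
                linetype = condition,
                # one line per gender x condition combination
                group = interaction(gender, condition))) +
  geom_line() +
  theme_classic()

p
```

This should draw four lines, one for each combination of gender and condition, which is the same grouping that the gender_cue column provides.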

I then managed to do a series of t-tests to compare the men’s and women’s scores in the cued and uncued conditions at each timepoint using the t.test() function, which runs Welch’s t-test by default.

#cued t-tests
base_cued_test <- t.test(explore_male$baseIATcued, explore_female$baseIATcued)
pre_cued_test <- t.test(explore_male$preIATcued, explore_female$preIATcued)
post_cued_test <- t.test(explore_male$postIATcued, explore_female$postIATcued)
week_cued_test <- t.test(explore_male$weekIATcued, explore_female$weekIATcued)

#uncued t-tests
base_uncued_test <- t.test(explore_male$baseIATuncued,
                         explore_female$baseIATuncued)
pre_uncued_test <- t.test(explore_male$preIATuncued, 
                        explore_female$preIATuncued)
post_uncued_test <- t.test(explore_male$postIATuncued,
                         explore_female$postIATuncued)
week_uncued_test <- t.test(explore_male$weekIATuncued,
                         explore_female$weekIATuncued)
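Since the eight tests above all follow the same pattern, they could probably be run in a loop and collected into one table. This is only a sketch on simulated data: toy_male, toy_female, vars, and results are made-up names, and only two variables are included.

```r
set.seed(1)

# Simulated stand-ins for explore_male and explore_female
toy_male <- data.frame(
  baseIATcued = rnorm(10, mean = 0.5, sd = 0.3),
  preIATcued  = rnorm(10, mean = 0.4, sd = 0.3)
)
toy_female <- data.frame(
  baseIATcued = rnorm(12, mean = 0.4, sd = 0.3),
  preIATcued  = rnorm(12, mean = 0.4, sd = 0.3)
)

vars <- c("baseIATcued", "preIATcued")

# Run one Welch t-test per variable and keep the key numbers
results <- do.call(rbind, lapply(vars, function(v) {
  test <- t.test(toy_male[[v]], toy_female[[v]])
  data.frame(
    variable = v,
    t  = unname(test$statistic),
    df = unname(test$parameter),
    p  = test$p.value
  )
}))

results
```

With several tests on the same data, it may also be worth thinking about a correction for multiple comparisons, for example with p.adjust().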

2. Finalise exploratory analysis question 2 and 3

Since question 1 ended up being so complicated, I tried to do something simpler for question 2: do women have less gender bias at baseline?

For this, I created a boxplot (since I didn’t have any before). Since all the variables were neatly arranged, I didn’t need to create my own dataframe and went straight into the graph.

Using the entire dataset, I set the x axis as the gender, the y axis as the bias scores, and the colour of the boxplot according to gender. I also added geom_jitter() so that we can clearly see the distribution of participants.

This one was a success because it was much easier than the first one.

cleandata %>%
  ggplot(
    aes(x = General_1_Sex,
        y = base_IAT_gen,
        fill = General_1_Sex)) +
  geom_boxplot(alpha = 0.2)+
  geom_jitter(aes(colour = General_1_Sex))+
  labs(x = "", 
       y = "D600 Bias Score", 
       caption = "Fig 7. male and female implicit bias levels at baseline") +
  theme_classic()

I then used t-tests again to compare the two means:

#filter gender
female_base_gen <- cleandata %>%
  filter(General_1_Sex == "Female")

male_base_gen <- cleandata %>%
  filter(General_1_Sex == "Male")

#statistical analysis
genderbias_gender <- t.test(female_base_gen$base_IAT_gen,
                            male_base_gen$base_IAT_gen)
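The filtering step could likely be skipped with the formula interface of t.test(), which splits the outcome by a grouping column. A small sketch on simulated data (toy, by_formula, and by_vectors are made-up names) comparing the two approaches:

```r
set.seed(2)

# Simulated stand-in for cleandata: 12 women and 10 men
toy <- data.frame(
  General_1_Sex = rep(c("Female", "Male"), times = c(12, 10)),
  base_IAT_gen  = rnorm(22, mean = 0.4, sd = 0.3)
)

# Formula interface: outcome ~ grouping variable
by_formula <- t.test(base_IAT_gen ~ General_1_Sex, data = toy)

# Same test written with two separate vectors, as in the code above
by_vectors <- t.test(
  toy$base_IAT_gen[toy$General_1_Sex == "Female"],
  toy$base_IAT_gen[toy$General_1_Sex == "Male"]
)

by_formula$p.value
by_vectors$p.value
```

Both calls run the same Welch test, so the two p-values should match.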

As for the third exploratory analysis question, I haven’t decided on it yet, but I think I want to look at two continuous variables.

3. Work out how to write functions

After some advice from Jenny S, I was able to make my function work!

This is what my function looked like last week.

#the function
function_implicit_bias_av_sd <- function(time_race, time_gen) {
  implicit_bias_time %>% 
  select(
    time_race,
    time_gen) %>% 
  summarise(
    IBaverage = mean(rbind(time_race, time_gen)),
    IBsd = sd(rbind(time_race, time_gen))
            )
}

#running the function
function_implicit_bias_av_sd(time_race = base_IAT_race, time_gen = base_IAT_gen)

However, this wasn’t working because R was getting confused at the rbind() step. Instead of combining time_race and time_gen as double variables, it was putting them together as character variables. So, with Jenny’s help, we changed the code to this:

#selecting variables of interest
implicit_bias_time <- cleandata %>%
  select(base_IAT_race, base_IAT_gen, 
         pre_IAT_race, pre_IAT_gen, 
         post_IAT_race, post_IAT_gen, 
         week_IAT_race, week_IAT_gen)

#the function
function_implicit_bias_av_sd <- function(time_race, time_gen) {
  
  implicit_bias_time %>% 
    select(all_of(time_race), all_of(time_gen)) %>%
    summarise(IB_mean = mean(c_across(everything())),
              IB_sd = sd(c_across(everything())))
}
#running the function
BaselineIB <- function_implicit_bias_av_sd( 
  time_race = "base_IAT_race", 
  time_gen = "base_IAT_gen")

PreNapIB <- function_implicit_bias_av_sd(
  time_race = "pre_IAT_race", 
  time_gen = "pre_IAT_gen")

PostNapIB <- function_implicit_bias_av_sd(
  time_race = "post_IAT_race", 
  time_gen = "post_IAT_gen")

OWIB <- function_implicit_bias_av_sd(
  time_race = "week_IAT_race", 
  time_gen = "week_IAT_gen")

With all_of() selecting the columns by name and c_across() combining their values into one vector, R was able to treat time_race and time_gen as doubles, and so the function worked!
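To check my understanding of what the function is calculating, here is a base R sketch of the same idea on made-up numbers. It is not the version we wrote with Jenny: pooled_mean_sd and its data argument are hypothetical names, and it uses unlist() instead of c_across() to pool the two columns into one numeric vector.

```r
# Toy stand-in for implicit_bias_time (values are invented)
toy_bias <- data.frame(
  base_IAT_race = c(0.2, 0.4, 0.6),
  base_IAT_gen  = c(0.1, 0.3, 0.5)
)

# Hypothetical helper: pool two columns, then take the mean and sd
pooled_mean_sd <- function(data, time_race, time_gen) {
  # column names arrive as strings, so they can index the data directly
  values <- unlist(data[c(time_race, time_gen)], use.names = FALSE)
  data.frame(IB_mean = mean(values), IB_sd = sd(values))
}

toy_result <- pooled_mean_sd(
  toy_bias,
  time_race = "base_IAT_race",
  time_gen  = "base_IAT_gen"
)

toy_result
```

Passing the data in as an argument means the helper doesn't depend on an object called implicit_bias_time existing in the environment, which might make it easier to reuse.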

How my thinking has changed over Term 2

Over the term, my thinking has changed most significantly about science in general and the open science movement. At the beginning of the term, while I had understood a little bit about the reproducibility crisis, at the end of PSYC2001 and 3001, I was led to believe that this issue was slowly being addressed with increased emphasis on replication studies. However, this course has taught me that the solution is not that simple, and that even replication studies need to be taken with a grain of salt.

This course also made R and coding much more approachable. Prior to this term, I had attempted to learn code on my own outside of uni, and it was incredibly difficult to retain. However, Danielle’s tutorials along with the learn-it-yourself approach made the entire experience feel more solidly ingrained in my mind. Additionally, the process of learning to code taught me to be much more patient and forgiving to myself if I didn’t understand something straight away.

Finally, this course has also taught me a lot about the value of group work. Like many others, I have often dreaded group work, but this was one of the best assignments I have ever done. Everyone in our group was so responsive, and for the first time, we didn’t really take a divide and conquer approach. We all had coding sessions where we coded together and, as a result, were able to ask each other questions and work to help each other. This meant all our problems were solved pretty quickly because all four heads were put to solving one question before collectively moving forward. As a result, we got our work done really quickly, while also producing a great final presentation! Thus, when everyone’s goals and conditions were clearly set out from the beginning, group work was pleasant, and even preferable to individual assignments. Even after the group aspect ended, we still asked each other for help on our verification reports, which was also very sweet.

Lastly, I’d like to thank Jenny R, Jenny S, and Kate for teaching this course! This was one of my favourite courses that I’ve done across all four years of uni. It was challenging, but approachable, and I felt like I was on my own learning journey with Jenny, Jenny, and Kate as my spiritual guides. Thank you all for being so approachable and helpful, and for encouraging us to challenge ourselves throughout the term. Hope you all get rest, raises, and have time to eat some yummy food throughout the holidays.

Week 10 Learning Log

Katherine Wong

07/08/2021