This week's coding goals

  1. Import data into RStudio
  2. Reproduce demographic statistics as shown in the report's Table 1 (Participant Characteristics)
  3. Create and reproduce Table 1

How did I go?

We had a group meeting this week where we aimed to import our data into RStudio and get started on reproducing the demographic statistics.

Goal 1: Import data into RStudio

Load the relevant packages

library(tidyverse)
library(janitor)
library(dplyr)

Load the data:

library(remotes)
install_github("JanMarvin/readspss")

I had issues with the read.sav and read_sav functions, as they kept coming up with the error: "function not found". Loading the packages below before calling read.sav fixed this:

library(haven)
library(readspss)
replicationdata <- read.sav("Humiston & Wamsley 2019 data.sav")
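
A "function not found" style error (in R it usually reads "could not find function") generally means the package that supplies the function has not been installed or attached yet. As a minimal sketch, assuming the haven package is installed, the example below writes a small toy data frame to a temporary .sav file and reads it back with a namespace-qualified call. The toy data and the temporary file are hypothetical stand-ins for the real data file.

```r
# Sketch only: a toy round trip through an SPSS file, assuming haven is installed
library(haven)

# Hypothetical toy data standing in for the real data set
toy <- data.frame(id = 1:3, age = c(19, 20, 21))

# Write the toy data to a temporary .sav file
tmp <- tempfile(fileext = ".sav")
haven::write_sav(toy, tmp)

# The haven:: prefix makes it explicit which package supplies read_sav()
toy_back <- haven::read_sav(tmp)
print(toy_back)
```

Writing the call as `haven::read_sav()` (or `readspss::read.sav()`) also shows at a glance which package each of the two similarly named functions comes from.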

Goal 2: Reproduce demographic statistics

  • All of the coding is here
  • I encountered many challenges which I have outlined later in this learning log

Exclude participants from the data

cleandata <- replicationdata %>%    #removing participants who were excluded 
  filter(exclude == "no")
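
The filter above only works if `exclude` really holds the text value "no". A minimal sketch of how one might check this before filtering, using toy data: the "yes" value is a hypothetical placeholder, and the real column may use different labels.

```r
library(dplyr)

# Hypothetical toy data; only the "no" value is taken from the filter above
toy <- tibble(
  id = 1:5,
  exclude = factor(c("no", "yes", "no", "no", "yes"))
)

# Tabulate the values that the column actually contains
toy %>% count(exclude)

# Filtering on a value that exists keeps the matching rows
toy_clean <- toy %>% filter(exclude == "no")

# Filtering on a value that never occurs (here the number 0) keeps nothing
zero_rows <- toy %>% filter(exclude == 0)

nrow(toy_clean)
nrow(zero_rows)
```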

Calculate age average

ageaverage <- cleandata %>% #calculating average age including sd using cleaned data
  select(General_1_Age) %>%
  summarise(ageaverage = mean(General_1_Age), 
          agesd = sd(General_1_Age))

print(ageaverage)

Calculate average ESS score

ESS = Epworth Sleepiness Scale

ESS <- cleandata %>% 
  select(Epworth_total) %>% 
  summarise(ESSaverage = mean(Epworth_total), 
            ESSsd = sd(Epworth_total))

print(ESS)
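
The age and ESS calculations follow the same pattern, so they could be combined. A minimal sketch, assuming dplyr 1.0 or later, that uses `across()` to compute the mean and SD of several columns in one `summarise()` call. The toy values are made up, and `na.rm = TRUE` is an added assumption that the original code does not make.

```r
library(dplyr)

# Hypothetical toy values; the column names match the ones used above
toy <- tibble(
  General_1_Age = c(18, 19, 20, 21),
  Epworth_total = c(12, 14, 16, 18)
)

# One summarise() call; output columns are named <column>_<function>
toy_summary <- toy %>%
  summarise(across(
    c(General_1_Age, Epworth_total),
    list(
      average = ~ mean(.x, na.rm = TRUE),
      sd = ~ sd(.x, na.rm = TRUE)
    )
  ))

print(toy_summary)
```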

Calculate SSS

SSS = Stanford Sleepiness Scale

First attempt: this keeps coming up with an error

SSSerror <- cleandata %>% 
  select(AlertTest_1_Feel) %>% 
  summarise(SSSaverage = mean(AlertTest_1_Feel),
            SSSsd = sd(AlertTest_1_Feel))

Second attempt: this runs, but gives values that do not match the report

SSStrial <- replicationdata %>% 
  select(AlertTest_1_Feel, 
         AlertTest_2_Feel, 
         AlertTest_3_Feel, 
         AlertTest_4_Feel) %>% 
  drop_na() %>% 
  summarise(SSStrialaverage = mean(rbind(AlertTest_1_Feel, AlertTest_2_Feel, AlertTest_3_Feel, AlertTest_4_Feel)),
             SSStrialsd = sd(rbind(AlertTest_1_Feel, AlertTest_2_Feel, AlertTest_3_Feel, AlertTest_4_Feel)))

print(SSStrial)

Another attempt: using mutate to create a new variable (success!)

Create new variable SSSvalue

cleandata <- cleandata %>% 
  mutate(
    # AlertTest_1_Feel is categorical, with levels such as:
    #   "1 - Feeling active, vital alert, or wide awake"
    #   "2 - Functioning at high levels, but not at peak; able to concentrate"
    #   "3 - Awake, but relaxed; responsive but not fully alert"
    #   "4 - Somewhat foggy, let down"
    #   "5 - Foggy; losing interest in remaining awake; slowed down"
    # as.numeric() takes only the vector itself; on a factor it returns the
    # integer level codes, which match the scores when levels are in scale order
    SSSvalue = as.numeric(AlertTest_1_Feel)
  )

SSS <- cleandata %>% 
  select(SSSvalue) %>% 
  summarise(SSSaverage = mean(SSSvalue),
            SSSsd = sd(SSSvalue))
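
Converting a categorical variable to numbers can be done in more than one way. The sketch below compares two options on a toy factor: taking the integer level codes, and reading the leading digit out of the label text. The toy responses are hypothetical. The second option does not depend on the order of the factor levels, which may make it a safer choice if the levels are ever stored out of order.

```r
# The five labels quoted in this log
sss_labels <- c(
  "1 - Feeling active, vital alert, or wide awake",
  "2 - Functioning at high levels, but not at peak; able to concentrate",
  "3 - Awake, but relaxed; responsive but not fully alert",
  "4 - Somewhat foggy, let down",
  "5 - Foggy; losing interest in remaining awake; slowed down"
)

# Hypothetical toy responses stored as a factor
toy_sss <- factor(sss_labels[c(2, 3, 3, 5)], levels = sss_labels)

# Option 1: integer level codes (relies on levels being in scale order)
codes <- as.numeric(toy_sss)

# Option 2: keep only the text before the first space, then convert
from_label <- as.numeric(sub(" .*$", "", as.character(toy_sss)))

mean(codes)
mean(from_label)
```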

Calculate baseline implicit bias

BIB <- cleandata %>% 
  select(
    base_IAT_race,
    base_IAT_gen) %>% 
  summarise(
    BIBaverage = mean(rbind(base_IAT_race, base_IAT_gen)), 
    BIBsd = sd(rbind(base_IAT_race, base_IAT_gen))
            )

print(BIB)

Calculate Prenap implicit bias

PrenapIB <- cleandata %>% 
  select(
    pre_IAT_race,
    pre_IAT_gen) %>% 
  summarise(
    PrenapIBaverage = mean(
      rbind(
        pre_IAT_race, 
        pre_IAT_gen)
      ),
    PrenapIBsd = sd(
      rbind(
        pre_IAT_race, 
        pre_IAT_gen))
            )

print(PrenapIB)

Calculate Postnap implicit bias

PostnapIB <- cleandata %>% 
  select(
    post_IAT_race,
    post_IAT_gen) %>% 
  summarise(
    PostnapIBaverage = mean(
      rbind(
        post_IAT_race,
        post_IAT_gen
      )),
    PostnapIBsd = sd(
      rbind(
        post_IAT_race,
        post_IAT_gen
      ))
  )

print(PostnapIB)

Calculate one-week delay implicit bias

OWDIB <- cleandata %>% 
  select(
    week_IAT_race,
    week_IAT_gen) %>% 
  summarise(
    OWDIBaverage = mean(
      rbind(
        week_IAT_race,
        week_IAT_gen
      )
    ),
    OWDIBsd = sd(
      rbind(
        week_IAT_race,
        week_IAT_gen
      )
    )
  )

print(OWDIB)
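
The four implicit bias calculations above repeat the same steps with different column prefixes. A minimal sketch of how they could be wrapped in one function: `pooled_iat()` is a hypothetical helper (not part of any package), the toy values are made up, and `c()` is used in place of `rbind()` because both pool the same values.

```r
library(dplyr)

# Hypothetical toy values; the column names follow the pattern used above
toy <- tibble(
  base_IAT_race = c(0.2, 0.4, 0.6),
  base_IAT_gen  = c(0.4, 0.6, 0.8),
  pre_IAT_race  = c(0.1, 0.3, 0.5),
  pre_IAT_gen   = c(0.3, 0.5, 0.7)
)

# Hypothetical helper: pool the two IAT columns for one time point,
# then return the mean and SD of the pooled values
pooled_iat <- function(data, prefix) {
  pooled <- c(
    data[[paste0(prefix, "_IAT_race")]],
    data[[paste0(prefix, "_IAT_gen")]]
  )
  tibble(timepoint = prefix, average = mean(pooled), sd = sd(pooled))
}

# Apply the helper to each prefix and stack the results into one table
iat_summary <- bind_rows(lapply(c("base", "pre"), pooled_iat, data = toy))

print(iat_summary)
```

With the real data, the prefixes would presumably be "base", "pre", "post" and "week".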

Calculate sex (% male)

Male <- cleandata %>% 
  select(General_1_Sex) %>% 
  tally(General_1_Sex == "Male") 

Male_percentage <- Male/31 #31 as the clean data set has 31 participants

print(Male_percentage)

Calculate average Cue played during nap (% racial cue)

Napcue <- cleandata %>% 
  select(Cue_condition) %>% 
  tally(Cue_condition == "race cue played")

racialcue_percentage <- Napcue/31 #31 as the clean data set has 31 participants

print(racialcue_percentage)
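
Both percentages above divide a count by a hard-coded 31. As a minimal sketch, the mean of a TRUE/FALSE comparison gives the same proportion without typing the sample size by hand. The toy data are made up, and "gender cue played" is a hypothetical label for the other cue condition.

```r
library(dplyr)

# Hypothetical toy data; "gender cue played" is an assumed label
toy <- tibble(
  General_1_Sex = c("Male", "Female", "Female", "Male"),
  Cue_condition = c("race cue played", "gender cue played",
                    "race cue played", "race cue played")
)

# mean() of a logical vector is the proportion of TRUE values
toy_props <- toy %>%
  summarise(
    male_proportion = mean(General_1_Sex == "Male"),
    racialcue_proportion = mean(Cue_condition == "race cue played")
  )

print(toy_props)
```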

Goal 3: Create and reproduce Table 1

After doing some research, I found three main packages for creating a table:

  • kableExtra
  • magick
  • gt

I decided to go with gt because it seemed to produce the cleanest table. If it wasn't what I wanted, I would try kableExtra, and then magick.

I wasn't sure how to input the information, but I decided to just create a new dataframe using the statistics that were reproduced in the earlier section.

Recreate Table 1: Participant characteristics

Install relevant packages

#install.packages("kableExtra")
#install.packages("magick")
#install.packages("gt")

Load relevant packages

library(knitr)
library(kableExtra)
library(magick)
library(gt)
library(glue)

Create a dataframe for Table 1 (Participant characteristics)

table1 <- tibble(
  Characteristics = c("Age (yrs)", "ESS", "SSS", "Baseline implicit bias", "Prenap implicit bias", "Postnap implicit bias", "One-week delay implicit bias", "Sex (% male)", "Cue played during nap (% racial cue)"),
  Mean = c(19.5, 15.3, 2.81, 0.557, 0.257, 0.278, 0.399, 0.484, 0.548),
  SD = c(1.23, 2.83, 0.749, 0.406, 0.478, 0.459, 0.425, NA, NA)
)

Create and format a table

table1 %>% 
  gt() %>% 
  tab_header(
    title = "Participant characteristics")

Challenges and successes

  • I encountered many problems!
  • I had issues with the read.sav and read_sav functions, as they kept coming up with an error
  • I started with the chunk below but it kept coming up with an error: "function not found".

    library(remotes)
    install_github("JanMarvin/readspss")
    library(readspss)
    library(dplyr)
    replicationdata <- read.sav("Humiston & Wamsley 2019 data.sav")
  • After consulting Google, I found that most people recommended using the [haven] package:

    library(haven)
    library(readspss)
    replicationdata <- read.sav("Humiston & Wamsley 2019 data.sav")
  • After solving the import problem and loading the SPSS data, I then had issues trying to exclude participants. I kept getting "0 observations" for my [cleandata] variable when I first tried doing this:

    cleandata <- data %>% 
      filter(exclude==0)
  • After trying various solutions, I was finally able to correctly exclude participants without getting "NA" or "0 observations":

    cleandata <- replicationdata %>%    #removing participants who were excluded 
      filter(exclude == "no")
  • When calculating the SSS mean and SD, I had issues producing the relevant values. It kept coming up with the below error:

  • I realised that this is because the SSS has been inputted as a categorical variable, e.g. "1 - Feeling active, vital alert, or wide awake". I had to change this variable into numerical values in order to calculate means and standard deviations:

    cleandata <- cleandata %>% #Create new variable SSSvalue
      mutate(
        # as.numeric() takes only the vector itself; on a factor it returns
        # the integer level codes, which match the scores "1 - ..." to "5 - ..."
        # when the levels are in scale order
        SSSvalue = as.numeric(AlertTest_1_Feel)
      )

    SSS <- cleandata %>% 
      select(SSSvalue) %>% 
      summarise(SSSaverage = mean(SSSvalue), #Calculate mean and SD of new variable SSSvalue
                SSSsd = sd(SSSvalue))

Successes

  • After all the drama with reproducing the participant characteristics/statistics, creating the table was fairly simple. I created a dataframe including all the relevant values needed for Table 1. I then created the table using gt():

    table1 %>% 
      gt() %>% 
      tab_header(
        title = "Participant characteristics")

Next steps

  • Find easier and better ways to format our data into a table, so that it looks more similar to what is seen in our replication study
  • Find out how to round our figures to 2 decimal places, as is seen in our replication study
  • Start reproducing our next figure!
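
For the rounding step, one possible starting point is gt's `fmt_number()`, which controls how many decimal places are displayed. A minimal sketch, assuming the gt package is installed: the toy values are made up and are not the study's statistics.

```r
library(dplyr)
library(gt)

# Hypothetical toy values with more decimals than we want to display
toy_table <- tibble(
  Characteristics = c("Variable A", "Variable B"),
  Mean = c(1.23456, 2.34567),
  SD = c(0.12345, 0.23456)
)

# fmt_number() changes how the numbers are displayed in the table;
# the underlying values in toy_table are left unrounded
toy_gt <- toy_table %>%
  gt() %>%
  tab_header(title = "Participant characteristics") %>%
  fmt_number(columns = c(Mean, SD), decimals = 2)

toy_gt
```

Formatting at the table stage, instead of rounding the data first, keeps the full-precision values available for any later calculations.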