This week's coding goals

This week, my goal was to start thinking of questions for my exploratory analyses and code the descriptive statistics, figure and statistics for at least one question.

The Questions

I brainstormed a couple of questions, like:

Is there an effect of gender on changes in implicit biases?
- Or an alternate question: Are men more biased than women?
Does the number of cues that participants are exposed to influence changes in implicit bias?
- (The number of cues ranges from 37 to 660)
Does the difference in procedure (between the original study and the replication study) matter? That is, does being offered course credit (vs. cash) significantly change implicit biases?
Does the number of years spent in university affect average bias scores at each IAT timepoint, averaged over the cued and uncued conditions?

I'm still not sure which 3 questions I'll choose, but I will definitely choose the last question.

So I started with that question:

Does the number of years spent in university affect students' bias scores?
I've operationalised this question as the average bias score recorded at each of the 4 IAT timepoints, averaged over the cued and uncued conditions.
My graph will be a line graph with 4 differently coloured lines, denoting the number of years spent in university (0, 1, 2, or 3). The x-axis will show the 4 IAT timepoints (baseline, prenap, postnap, and one-week delay), and the y-axis will show the average bias scores.
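
To make the planned figure concrete, here is a rough sketch of how such a line graph could be built with ggplot2. The data frame `plot_sketch` and its column names are made up for illustration, and the `bias_av` values are synthetic placeholders, not results from the study.

```r
library(ggplot2)

# Placeholder data only: bias_av is a synthetic sequence, NOT study results.
plot_sketch <- data.frame(
  uni_years = factor(rep(0:3, each = 4)),
  time = factor(
    rep(c("Baseline", "Prenap", "Postnap", "1-week"), times = 4),
    levels = c("Baseline", "Prenap", "Postnap", "1-week")
  ),
  bias_av = seq_len(16) / 20
)

# One coloured line per number of years at university
bias_plot <- ggplot(plot_sketch,
                    aes(x = time, y = bias_av,
                        colour = uni_years, group = uni_years)) +
  geom_line() +
  geom_point() +
  labs(x = "IAT timepoint", y = "Average bias score",
       colour = "Years at university")

print(bias_plot)
```

Setting `group = uni_years` is what tells ggplot2 to join the points within each year group, since the x-axis is a factor rather than a number.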

How did I go?

Preliminaries

First, I loaded the packages and read in the data.

Load packages

library(tidyverse)

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

## ✓ ggplot2 3.3.4     ✓ purrr   0.3.4
## ✓ tibble  3.1.2     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(readspss) 
library(ggplot2)
library(janitor)

## 
## Attaching package: 'janitor'

## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test

library(plotrix)
library(gt)

Read in the data

cleandata <- read_csv("cleandata.csv")

## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   .default = col_character(),
##   General_1_Age = col_double(),
##   General_1_EnglishYrs = col_double(),
##   General_1_CaffCups = col_double(),
##   General_1_CaffHrsAgo = col_double(),
##   General_1_UniYears = col_double(),
##   Epworth_total = col_double(),
##   AlertTest_1_Concentr_1 = col_double(),
##   AlertTest_1_Refresh_1 = col_double(),
##   AlertTest_2_Concentr_1 = col_double(),
##   AlertTest_2_Refresh_1 = col_double(),
##   AlertTest_3_Concentr_1 = col_double(),
##   AlertTest_3_Refresh_1 = col_double(),
##   AlertTest_4_Concentr_1 = col_double(),
##   AlertTest_4_Refresh_1 = col_double(),
##   Total_sleep = col_double(),
##   Wake_amount = col_double(),
##   NREM1_amount = col_double(),
##   NREM2_amount = col_double(),
##   SWS_amount = col_double(),
##   REM_amount = col_double()
##   # ... with 26 more columns
## )
## ℹ Use `spec()` for the full column specifications.

Descriptive statistics

First, I have to calculate the mean and standard errors for my plot later on.

Attempt 1

I calculated the average bias scores by first selecting the variables I needed from my data file using select(). Then I used summarise() with across() and contains() to compute the mean of every variable containing "IAT". Finally, head() displays the result.

avgbias <- cleandata %>% 
  select(baseIATcued, baseIATuncued, preIATcued, preIATuncued, postIATcued, postIATuncued,weekIATcued, weekIATuncued, General_1_UniYears) 

avgbias <- avgbias %>% 
  summarise(across(contains("IAT"), list(mean = mean)))

head(avgbias)

## # A tibble: 1 x 8
##   baseIATcued_mean baseIATuncued_mean preIATcued_mean preIATuncued_mean
##              <dbl>              <dbl>           <dbl>             <dbl>
## 1            0.518              0.595           0.211             0.302
## # … with 4 more variables: postIATcued_mean <dbl>, postIATuncued_mean <dbl>,
## #   weekIATcued_mean <dbl>, weekIATuncued_mean <dbl>

To calculate the standard errors, I passed the means I had just calculated to std.error() from the plotrix package.
However, it kept returning NA for every standard error.

se_bias <- std.error(avgbias)
  
head(se_bias)

##   baseIATcued_mean baseIATuncued_mean    preIATcued_mean  preIATuncued_mean 
##                 NA                 NA                 NA                 NA 
##   postIATcued_mean postIATuncued_mean 
##                 NA                 NA

Attempt 2

I tried again, but now I decided to calculate means and standard errors for each of the 4 conditions (number of years spent at Uni).

0 years at Uni bias change:

First, I selected participants that have only attended uni for 0 years, using filter()

year0uni_participants <- cleandata %>%
  filter(General_1_UniYears == "0")

I calculated average bias levels and standard error for this group, using the same method as earlier:

year0uni <- year0uni_participants %>% 
  select(baseIATcued, baseIATuncued, preIATcued, preIATuncued, postIATcued, postIATuncued,weekIATcued, weekIATuncued) 

year0uni_avgbias <- year0uni %>% 
  summarise(across(contains("IAT"), list(mean = mean)))

print(year0uni_avgbias)

## # A tibble: 1 x 8
##   baseIATcued_mean baseIATuncued_mean preIATcued_mean preIATuncued_mean
##              <dbl>              <dbl>           <dbl>             <dbl>
## 1            0.672              0.455           0.408             0.269
## # … with 4 more variables: postIATcued_mean <dbl>, postIATuncued_mean <dbl>,
## #   weekIATcued_mean <dbl>, weekIATuncued_mean <dbl>

year0uni_se <- std.error(year0uni_avgbias)
head(year0uni_se)

##   baseIATcued_mean baseIATuncued_mean    preIATcued_mean  preIATuncued_mean 
##                 NA                 NA                 NA                 NA 
##   postIATcued_mean postIATuncued_mean 
##                 NA                 NA

However, the standard errors all came back as NA again. I tried adding na.rm = TRUE to remove any missing values, but that made no difference:

year0uni_se <- std.error(year0uni_avgbias, na.rm = TRUE)
head(year0uni_se)

##   baseIATcued_mean baseIATuncued_mean    preIATcued_mean  preIATuncued_mean 
##                 NA                 NA                 NA                 NA 
##   postIATcued_mean postIATuncued_mean 
##                 NA                 NA

Then I realised I was passing the wrong object to std.error(). year0uni_avgbias holds a single mean per column, and a standard error cannot be computed from one value. Passing the raw scores in year0uni instead gives the standard errors:

year0uni_se <- std.error(year0uni)
head(year0uni_se)

##   baseIATcued baseIATuncued    preIATcued  preIATuncued   postIATcued 
##     0.1124953     0.1401144     0.1231403     0.1006117     0.1505249 
## postIATuncued 
##     0.1604736
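
To illustrate why the earlier attempts returned NA, here is a small base R sketch. It assumes the usual definition of the standard error of the mean (standard deviation divided by the square root of n), which is my understanding of what std.error() computes. The helper `se_mean()` and the `scores` vector are hypothetical and not part of plotrix or the study data.

```r
# Hypothetical helper (not part of plotrix): standard error of the mean
se_mean <- function(x) sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))

# Made-up scores for illustration only, not data from the study
scores <- c(0.2, 0.4, 0.6, 0.8)

# Several raw observations: sd() is defined, so the SE is a number
se_raw <- se_mean(scores)

# A single summarised mean: sd() of one value is NA, so the SE is NA too
se_of_mean <- se_mean(mean(scores))

print(se_raw)
print(se_of_mean)
```

Note also that head() returns only the first six elements of a named vector, which is why the printouts above stop at postIATuncued and do not show the two week-delay columns.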

Now I have to do the same for Years 1, 2, and 3.

Year 1

First, I selected participants that have only attended uni for 1 year, using filter()

year1uni_participants <- cleandata %>%
  filter(General_1_UniYears == "1")

Then I calculated the average bias levels for that group:

year1uni <- year1uni_participants %>% 
  select(baseIATcued, baseIATuncued, preIATcued, preIATuncued, postIATcued, postIATuncued,weekIATcued, weekIATuncued) 

year1uni_avgbias <- year1uni %>% 
  summarise(across(contains("IAT"), list(mean = mean)))

head(year1uni_avgbias)

## # A tibble: 1 x 8
##   baseIATcued_mean baseIATuncued_mean preIATcued_mean preIATuncued_mean
##              <dbl>              <dbl>           <dbl>             <dbl>
## 1            0.509              0.742           0.305             0.390
## # … with 4 more variables: postIATcued_mean <dbl>, postIATuncued_mean <dbl>,
## #   weekIATcued_mean <dbl>, weekIATuncued_mean <dbl>

Then I calculated the standard errors from that group's raw scores:

year1uni_se <- std.error(year1uni)
head(year1uni_se)

##   baseIATcued baseIATuncued    preIATcued  preIATuncued   postIATcued 
##     0.2331138     0.2359056     0.3046492     0.2841019     0.1672415 
## postIATuncued 
##     0.2697327

Year 2

First, I selected participants that have only attended uni for 2 years, using filter()

year2uni_participants <- cleandata %>%
  filter(General_1_UniYears == "2")

Then I calculated the average bias levels for that group:

year2uni <- year2uni_participants %>% 
  select(baseIATcued, baseIATuncued, preIATcued, preIATuncued, postIATcued, postIATuncued,weekIATcued, weekIATuncued) 

year2uni_avgbias <- year2uni %>% 
  summarise(across(contains("IAT"), list(mean = mean)))

head(year2uni_avgbias)

## # A tibble: 1 x 8
##   baseIATcued_mean baseIATuncued_mean preIATcued_mean preIATuncued_mean
##              <dbl>              <dbl>           <dbl>             <dbl>
## 1            0.340              0.719         -0.0141             0.371
## # … with 4 more variables: postIATcued_mean <dbl>, postIATuncued_mean <dbl>,
## #   weekIATcued_mean <dbl>, weekIATuncued_mean <dbl>

Then I calculated the standard errors from that group's raw scores:

year2uni_se <- std.error(year2uni)
head(year2uni_se)

##   baseIATcued baseIATuncued    preIATcued  preIATuncued   postIATcued 
##    0.09853969    0.15985863    0.20114450    0.13174385    0.13812838 
## postIATuncued 
##    0.14299955

Year 3

First, I selected participants that have only attended uni for 3 years, using filter()

year3uni_participants <- cleandata %>%
  filter(General_1_UniYears == "3")

Then I calculated the average bias levels for that group:

year3uni <- year3uni_participants %>% 
  select(baseIATcued, baseIATuncued, preIATcued, preIATuncued, postIATcued, postIATuncued,weekIATcued, weekIATuncued) 

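# Corrected to use year3uni. The original run piped year1uni by mistake,
# which is why the means printed below repeat the Year 1 values.
year3uni_avgbias <- year3uni %>% 
  summarise(across(contains("IAT"), list(mean = mean)))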

head(year3uni_avgbias)

## # A tibble: 1 x 8
##   baseIATcued_mean baseIATuncued_mean preIATcued_mean preIATuncued_mean
##              <dbl>              <dbl>           <dbl>             <dbl>
## 1            0.509              0.742           0.305             0.390
## # … with 4 more variables: postIATcued_mean <dbl>, postIATuncued_mean <dbl>,
## #   weekIATcued_mean <dbl>, weekIATuncued_mean <dbl>

Then I calculated the standard errors from that group's raw scores:

year3uni_se <- std.error(year3uni)
head(year3uni_se)

##   baseIATcued baseIATuncued    preIATcued  preIATuncued   postIATcued 
##     0.1240071     0.1158314     0.1314291     0.2613532     0.1547907 
## postIATuncued 
##     0.1680495
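
Repeating the same filter/select/summarise steps for each year group invites copy-and-paste slips. As an alternative, here is a sketch of how group_by() could produce the means and standard errors for every year group in one call. The `toy` data frame holds made-up scores that only mirror the column names used in this post, and `se_mean()` is a hypothetical helper.

```r
library(dplyr)

# Made-up scores for illustration only; column names mirror the post's data
toy <- data.frame(
  General_1_UniYears = c(0, 0, 0, 1, 1, 1),
  baseIATcued = c(0.2, 0.4, 0.6, 0.1, 0.3, 0.5),
  baseIATuncued = c(0.5, 0.7, 0.9, 0.2, 0.4, 0.6)
)

# Hypothetical helper: standard error of the mean
se_mean <- function(x) sd(x) / sqrt(length(x))

# One row per year group, with a _mean and _se column for each IAT variable
by_year <- toy %>%
  group_by(General_1_UniYears) %>%
  summarise(across(contains("IAT"), list(mean = mean, se = se_mean)))

print(by_year)
```

Because the grouping variable does the splitting, there is only one summarise() call to get right, and the result is already in a single table.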

Create the data tibble

Because I want to see how bias changes over time for each number of years at university, I planned a line graph, which first needs the means collected into one data frame:

biasdata <- data.frame(
  condition = factor(c("0", "0", "0", "0", "1", "1", "1", "1", "2", "2", "2", "2", "3", "3", "3", "3")),
  time = factor(c("Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week")),
  levels = c("Baseline", "Prenap", "Postnap", "1-week"),
  bias_av = c(year0uni$

## Error: <text>:6:0: unexpected end of input
## 4:   levels = c("Baseline", "Prenap", "Postnap", "1-week"),
## 5:   bias_av = c(year0uni$  
##   ^

Halfway through constructing the tibble, I realised that I first need a variable for each IAT timepoint that averages the cued and uncued conditions, i.e. one IAT score per timepoint averaged over the cued/uncued conditions. Use mutate()?
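
Separately from the averaging problem, the unfinished data.frame() call above places `levels =` outside factor(), so data.frame() would treat it as a column named levels rather than as the ordering of the timepoints. Here is a sketch of the skeleton with the levels inside factor(). The name `biasdata_sketch` is made up, and `bias_av` is deliberately left as NA because the real means are not yet calculated.

```r
timepoints <- c("Baseline", "Prenap", "Postnap", "1-week")

biasdata_sketch <- data.frame(
  # Years at university, each repeated once per timepoint
  condition = factor(rep(c("0", "1", "2", "3"), each = 4)),
  # levels = ... goes inside factor() so the timepoints keep this order
  time = factor(rep(timepoints, times = 4), levels = timepoints),
  # Placeholder: the real means would be filled in once calculated
  bias_av = NA_real_
)

print(levels(biasdata_sketch$time))
```

Without the explicit levels, factor() would typically sort the labels alphabetically, which would put "1-week" first on the x-axis.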

Attempt 3

I have to start again from the beginning:

exploratorydata1 <- read.csv("cleandata.csv")

Create an IAT score for each timepoint, averaged over the cued/uncued conditions, using rowwise() and mutate():

mutated_exploratorydata1 <- exploratorydata1 %>% 
  rowwise() %>% 
  mutate(baselineIAT = mean(c(baseIATcued, baseIATuncued)),
         prenapIAT = mean(c(preIATcued, preIATuncued)),
         postnapIAT = mean(c(postIATcued, postIATuncued)),
         weekIAT = mean(c(weekIATcued, weekIATuncued)))

I used glimpse() to check whether the new variables were calculated properly. They were: there are now 105 columns rather than 101.

glimpse(mutated_exploratorydata1)

## Rows: 31
## Columns: 105
## Rowwise: 
## $ ParticipantID          <chr> "ub6", "ub7", "ub8", "ub9", "ub11", "ub13", "ub…
## $ exclude                <chr> "no", "no", "no", "no", "no", "no", "no", "no",…
## $ cue_presented          <chr> "yes", "yes", "yes", "yes", "yes", "yes", "yes"…
## $ heard_cue_report       <chr> "no", "no", "no", "no", "no", "no", "no", "no",…
## $ heard_cue_exit         <chr> "no", "unsure", "no", "no", "no", "no", "no", "…
## $ predicted_cue          <chr> "no", "no", "no", "suspected", "no", "no", "no"…
## $ Cue_condition          <chr> "race cue played", "race cue played", "gender c…
## $ Counterbias_order      <chr> "racial training first", "gender training first…
## $ Sound_assignment       <chr> "machR and descG", "machG and descR", "machR an…
## $ IAT1_order             <chr> "EATF-SATF", "SATF-EATF", "EATF-SATF", "SATF-EA…
## $ IAT234_order           <chr> "SATS-EATS", "EATS-SATS", "SATS-EATS", "EATS-SA…
## $ IAT_order              <chr> "ES, SESESE", "SE, ESESES", "ES, SESESE", "SE, …
## $ compensation           <chr> "cash", "cash", "cash", "cash", "course credit"…
## $ General_1_Age          <int> 21, 21, 20, 21, 19, 20, 18, 18, 18, 18, 19, 19,…
## $ General_1_Sex          <chr> "Female", "Female", "Female", "Male", "Female",…
## $ General_1_Race         <chr> "White", "White", "White", "White", "White", "W…
## $ General_1_English      <chr> "Yes", "Yes", "Yes", "Yes", "No", "Yes", "Yes",…
## $ General_1_EnglishYrs   <int> NA, NA, NA, NA, 12, NA, NA, NA, NA, NA, NA, NA,…
## $ General_1_Caffeine     <chr> "Yes", "Yes", "No", "No", "No", "No", "No", "Ye…
## $ General_1_CaffCups     <int> 1, 1, NA, NA, NA, NA, NA, 1, NA, 1, NA, NA, 1, …
## $ General_1_CaffHrsAgo   <dbl> 2.0, 3.0, NA, NA, NA, NA, NA, 5.5, NA, 2.0, NA,…
## $ General_1_SleepDisor   <chr> "No", "No", "No", "Yes", "No", "No", "No", "No"…
## $ General_1_MentalDiso   <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ General_1_Meds         <chr> "No", "Yes", "No", "No", "Yes", "Yes", "No", "N…
## $ General_1_MedList      <chr> "", "20 mg prozac every day", "", "", "birth co…
## $ General_1_University   <chr> "Furman University", "Furman University", "Furm…
## $ General_1_UniYears     <int> 3, 3, 2, 3, 0, 2, 0, 0, 0, 0, 1, 0, 0, 0, 1, 2,…
## $ Demo_1_Ethnic          <chr> "Not Hispanic or Latino", "Not Hispanic or Lati…
## $ Demo_1_Racial          <chr> "White", "White", "White", "White", "White", "W…
## $ Demo_1_Gender          <chr> "Female", "Female", "Female", "Male", "Female",…
## $ Demo_1_NonParticipat   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ Epworth_1_Read         <chr> "slight chance of dozing", "slight chance of do…
## $ Epworth_1_TV           <chr> "slight chance of dozing", "slight chance of do…
## $ Epworth_1_Public       <chr> "slight chance of dozing", "no chance of dozing…
## $ Epworth_1_Passenger    <chr> "moderate chance of dozing", "slight chance of …
## $ Epworth_1_LyingDown    <chr> "high chance of dozing", "slight chance of dozi…
## $ Epworth_1_Talking      <chr> "no chance of dozing", "no chance of dozing", "…
## $ Epworth_1_Lunch        <chr> "no chance of dozing", "no chance of dozing", "…
## $ Epworth_1_Traffic      <chr> "no chance of dozing", "no chance of dozing", "…
## $ Epworth_total          <int> 16, 12, 13, 10, 16, 12, 20, 15, 16, 16, 11, 21,…
## $ AlertTest_1_Concentr_1 <int> 80, 60, 60, 70, 70, 40, 80, 80, 80, 50, 90, 30,…
## $ AlertTest_1_Refresh_1  <int> 60, 70, 60, 30, 60, 40, 80, 60, 40, 30, 90, 40,…
## $ AlertTest_1_Feel       <chr> "2 - Functioning at high levels, but not at pea…
## $ AlertTest_2_Concentr_1 <int> 70, 60, 40, 60, 80, 40, 80, 80, 60, 40, 90, 20,…
## $ AlertTest_2_Refresh_1  <int> 70, 60, 30, 30, 60, 40, 70, 60, 40, 30, 70, 30,…
## $ AlertTest_2_Feel       <chr> "3 - Awake, but relaxed; responsive but not ful…
## $ AlertTest_3_Concentr_1 <int> NA, 60, 40, 80, NA, 60, 100, 80, 70, NA, 70, 50…
## $ AlertTest_3_Refresh_1  <int> NA, 70, 50, 70, NA, 80, 100, 90, 90, NA, 100, 7…
## $ AlertTest_3_Feel       <chr> NA, "2 - Functioning at high levels, but not at…
## $ AlertTest_4_Concentr_1 <int> 80, 60, 40, NA, NA, 70, 90, 100, 70, 80, 100, 3…
## $ AlertTest_4_Refresh_1  <int> 90, 50, 30, NA, NA, 80, 90, 80, 60, 80, 100, 70…
## $ AlertTest_4_Feel       <chr> "1 - Feeling active, vital alert, or wide awake…
## $ S1_ExitQ_1_sound       <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S1_ExitQ_1_soundaffect <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S1_ExitQ_2_sound       <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S1_ExitQ_3_sound       <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S1_ExitQ_4_sound       <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S1_ExitQ_4_soundaffect <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S1_ExitQ_5_sound       <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S1_ExitQ_5_soundaffect <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S2_ExitQ_1_sound       <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S2_ExitQ_1_soundaffect <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S2_ExitQ_2_sound       <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S2_ExitQ_3_sound       <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S2_ExitQ_4_sound       <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S2_ExitQ_4_soundaffect <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S2_ExitQ_5_sound       <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ S2_ExitQ_5_soundaffect <chr> "No", "No", "No", "No", "No", "No", "No", "No",…
## $ Total_sleep            <int> 65, 66, 80, 62, 51, 81, 81, 67, 79, 71, 68, 85,…
## $ Wake_amount            <int> 25, 24, 10, 28, 39, 9, 9, 23, 11, 19, 22, 5, 16…
## $ NREM1_amount           <int> 10, 9, 5, 5, 11, 4, 3, 4, 4, 2, 15, 4, 4, 7, 3,…
## $ NREM2_amount           <dbl> 20.0, 52.0, 15.0, 15.5, 22.0, 23.0, 31.0, 50.0,…
## $ SWS_amount             <int> 12, 19, 24, 24, 16, 36, 37, 4, 23, 25, 7, 31, 4…
## $ REM_amount             <int> 23, 0, 17, 17, 2, 18, 9, 9, 24, 11, 10, 16, 0, …
## $ SWSxREM                <int> 276, 0, 408, 408, 32, 648, 333, 36, 552, 275, 7…
## $ cue_minutes            <dbl> 9.5, 12.0, 15.5, 16.0, 15.0, 25.0, 23.0, 3.0, 1…
## $ baseIATcued            <dbl> 0.57544182, 0.09911241, 0.20577365, 0.35314196,…
## $ baseIATuncued          <dbl> 0.60953653, 0.64396538, 1.52435622, 0.13108478,…
## $ preIATcued             <dbl> 0.55905291, -0.13380639, 0.51077026, -0.0293319…
## $ preIATuncued           <dbl> 0.21462144, 0.33985028, 0.37990232, -0.94209553…
## $ postIATcued            <dbl> 0.681910146, 0.044634805, -0.002583615, -0.2459…
## $ postIATuncued          <dbl> 0.46728694, -0.05686262, 0.68243589, 0.94970369…
## $ weekIATcued            <dbl> 0.20377367, 0.45873715, 0.39859469, 0.92341592,…
## $ weekIATuncued          <dbl> 0.68277422, -0.01070460, 0.71187286, 0.20212832…
## $ postnap_change_cued    <dbl> 0.1228572, 0.1784412, -0.5133539, -0.2166575, 0…
## $ postnap_change_uncued  <dbl> 0.25266550, -0.39671291, 0.30253356, 1.89179922…
## $ week_change_cued       <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ week_change_uncued     <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ diff_biaschange_cued   <dbl> 0.371668148, -0.359624732, -0.192821041, -0.570…
## $ diff_biaschange_uncued <dbl> -0.07323769, 0.65466998, 0.81248335, -0.0710435…
## $ diff_biaschange        <dbl> 0.44490584, -1.01429471, -1.00530440, -0.499230…
## $ base_IAT_race          <dbl> 0.57544182, 0.09911241, 1.52435622, 0.13108478,…
## $ base_IAT_gen           <dbl> 0.60953653, 0.64396538, 0.20577365, 0.35314196,…
## $ pre_IAT_race           <dbl> 0.55905291, -0.13380639, 0.37990232, -0.9420955…
## $ pre_IAT_gen            <dbl> 0.21462144, 0.33985028, 0.51077026, -0.02933191…
## $ post_IAT_race          <dbl> 0.68191015, 0.04463480, 0.68243589, 0.94970369,…
## $ week_IAT_race          <dbl> 0.20377367, 0.45873715, 0.71187286, 0.20212832,…
## $ post_IAT_gen           <dbl> 0.467286940, -0.056862624, -0.002583615, -0.245…
## $ week_IAT_gen           <dbl> 0.68277422, -0.01070460, 0.39859469, 0.92341592…
## $ filter_.               <chr> "Selected", "Selected", "Selected", "Selected",…
## $ cues_total             <dbl> 142.5, 180.0, 232.5, 240.0, 225.0, 375.0, 345.0…
## $ baselineIAT            <dbl> 0.5924892, 0.3715389, 0.8650649, 0.2421134, 0.3…
## $ prenapIAT              <dbl> 0.38683717, 0.10302195, 0.44533629, -0.48571372…
## $ postnapIAT             <dbl> 0.57459854, -0.00611391, 0.33992614, 0.35185714…
## $ weekIAT                <dbl> 0.44327395, 0.22401627, 0.55523378, 0.56277212,…

Now, I have to repeat the process to find means and standard errors:

0 years at Uni bias change:

Select participants who have attended uni for 0 years:

year0uni_participants <- mutated_exploratorydata1 %>%
  filter(General_1_UniYears == "0")

Calculate the average bias levels:

year0uni <- year0uni_participants %>% 
  select(baselineIAT, prenapIAT, postnapIAT, weekIAT)

year0uni_avgbias <- year0uni %>% 
  summarise(year0uni, mean = "mean")

head(year0uni_avgbias)

## # A tibble: 6 x 5
##   baselineIAT prenapIAT postnapIAT weekIAT mean 
##         <dbl>     <dbl>      <dbl>   <dbl> <chr>
## 1       0.310    0.0320     0.0751  0.0560 mean 
## 2       0.867    0.122      0.806  -0.173  mean 
## 3       0.646    0.611      0.535   0.142  mean 
## 4       0.805    0.305     -0.627   0.230  mean 
## 5       0.418    0.624      0.0990  0.0556 mean 
## 6       0.564    0.403      0.262   0.750  mean

year0uni_se <- std.error(year0uni)
head(year0uni_se)

## baselineIAT   prenapIAT  postnapIAT     weekIAT 
##  0.08727692  0.06171902  0.11709737  0.09872838
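
The summarise() printout above has one row per participant rather than a single row of means. Two things contribute: the `mean = "mean"` argument only adds a column holding the literal text "mean", and the data is still rowwise, so summarise() works on one row at a time. Here is a sketch, on made-up values, of how ungroup() changes the result. The `toy_rowwise` data frame is hypothetical.

```r
library(dplyr)

# Made-up scores for illustration only; column names mirror the post's data
toy_rowwise <- data.frame(
  baselineIAT = c(0.3, 0.5, 0.7),
  prenapIAT = c(0.1, 0.2, 0.6)
) %>%
  rowwise()

# Still rowwise: one "mean" per row, i.e. just the original values
per_row <- toy_rowwise %>%
  summarise(across(contains("IAT"), list(mean = mean)))

# After ungroup(): a single row with one mean per column
overall <- toy_rowwise %>%
  ungroup() %>%
  summarise(across(contains("IAT"), list(mean = mean)))

print(per_row)
print(overall)
```

std.error() is not affected in the same way, because it is applied to each whole column of the selected data.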

1 year at Uni bias change:

Select participants who have attended uni for 1 year:

year1uni_participants <- mutated_exploratorydata1 %>%
  filter(General_1_UniYears == "1")

print(year1uni_participants)

## # A tibble: 4 x 105
## # Rowwise: 
##   ParticipantID exclude cue_presented heard_cue_report heard_cue_exit
##   <chr>         <chr>   <chr>         <chr>            <chr>         
## 1 ub25          no      yes           no               unsure        
## 2 ub31          no      yes           no               no            
## 3 ub34          no      yes           no               no            
## 4 ub45          no      yes           no               no            
## # … with 100 more variables: predicted_cue <chr>, Cue_condition <chr>,
## #   Counterbias_order <chr>, Sound_assignment <chr>, IAT1_order <chr>,
## #   IAT234_order <chr>, IAT_order <chr>, compensation <chr>,
## #   General_1_Age <int>, General_1_Sex <chr>, General_1_Race <chr>,
## #   General_1_English <chr>, General_1_EnglishYrs <int>,
## #   General_1_Caffeine <chr>, General_1_CaffCups <int>,
## #   General_1_CaffHrsAgo <dbl>, General_1_SleepDisor <chr>,
## #   General_1_MentalDiso <chr>, General_1_Meds <chr>, General_1_MedList <chr>,
## #   General_1_University <chr>, General_1_UniYears <int>, Demo_1_Ethnic <chr>,
## #   Demo_1_Racial <chr>, Demo_1_Gender <chr>, Demo_1_NonParticipat <chr>,
## #   Epworth_1_Read <chr>, Epworth_1_TV <chr>, Epworth_1_Public <chr>,
## #   Epworth_1_Passenger <chr>, Epworth_1_LyingDown <chr>,
## #   Epworth_1_Talking <chr>, Epworth_1_Lunch <chr>, Epworth_1_Traffic <chr>,
## #   Epworth_total <int>, AlertTest_1_Concentr_1 <int>,
## #   AlertTest_1_Refresh_1 <int>, AlertTest_1_Feel <chr>,
## #   AlertTest_2_Concentr_1 <int>, AlertTest_2_Refresh_1 <int>,
## #   AlertTest_2_Feel <chr>, AlertTest_3_Concentr_1 <int>,
## #   AlertTest_3_Refresh_1 <int>, AlertTest_3_Feel <chr>,
## #   AlertTest_4_Concentr_1 <int>, AlertTest_4_Refresh_1 <int>,
## #   AlertTest_4_Feel <chr>, S1_ExitQ_1_sound <chr>,
## #   S1_ExitQ_1_soundaffect <chr>, S1_ExitQ_2_sound <chr>,
## #   S1_ExitQ_3_sound <chr>, S1_ExitQ_4_sound <chr>,
## #   S1_ExitQ_4_soundaffect <chr>, S1_ExitQ_5_sound <chr>,
## #   S1_ExitQ_5_soundaffect <chr>, S2_ExitQ_1_sound <chr>,
## #   S2_ExitQ_1_soundaffect <chr>, S2_ExitQ_2_sound <chr>,
## #   S2_ExitQ_3_sound <chr>, S2_ExitQ_4_sound <chr>,
## #   S2_ExitQ_4_soundaffect <chr>, S2_ExitQ_5_sound <chr>,
## #   S2_ExitQ_5_soundaffect <chr>, Total_sleep <int>, Wake_amount <int>,
## #   NREM1_amount <int>, NREM2_amount <dbl>, SWS_amount <int>, REM_amount <int>,
## #   SWSxREM <int>, cue_minutes <dbl>, baseIATcued <dbl>, baseIATuncued <dbl>,
## #   preIATcued <dbl>, preIATuncued <dbl>, postIATcued <dbl>,
## #   postIATuncued <dbl>, weekIATcued <dbl>, weekIATuncued <dbl>,
## #   postnap_change_cued <dbl>, postnap_change_uncued <dbl>,
## #   week_change_cued <int>, week_change_uncued <int>,
## #   diff_biaschange_cued <dbl>, diff_biaschange_uncued <dbl>,
## #   diff_biaschange <dbl>, base_IAT_race <dbl>, base_IAT_gen <dbl>,
## #   pre_IAT_race <dbl>, pre_IAT_gen <dbl>, post_IAT_race <dbl>,
## #   week_IAT_race <dbl>, post_IAT_gen <dbl>, week_IAT_gen <dbl>,
## #   filter_. <chr>, cues_total <dbl>, baselineIAT <dbl>, prenapIAT <dbl>,
## #   postnapIAT <dbl>, weekIAT <dbl>

Calculate the average bias levels:

year1uni <- year1uni_participants %>% 
  select(baselineIAT, prenapIAT, postnapIAT, weekIAT)

year1uni_avgbias <- year1uni %>% 
  summarise(across(contains("IAT"), list(mean = mean)))

head(year1uni_avgbias)

## # A tibble: 4 x 4
##   baselineIAT_mean prenapIAT_mean postnapIAT_mean weekIAT_mean
##              <dbl>          <dbl>           <dbl>        <dbl>
## 1            0.665         0.906           0.375         0.433
## 2            0.527         0.216           0.891         0.765
## 3            0.401        -0.0223          0.0634        0.291
## 4            0.909         0.291           0.160         0.516

year1uni_se <- std.error(year1uni)
head(year1uni_se)

## baselineIAT   prenapIAT  postnapIAT     weekIAT 
##   0.1087993   0.1977530   0.1848361   0.0993104

2 years at Uni bias change:

Select participants who have attended uni for 2 years:

year2uni_participants <- mutated_exploratorydata1 %>%
  filter(General_1_UniYears == "2")

print(year2uni_participants)

## # A tibble: 10 x 105
## # Rowwise: 
##    ParticipantID exclude cue_presented heard_cue_report       heard_cue_exit
##    <chr>         <chr>   <chr>         <chr>                  <chr>         
##  1 ub8           no      yes           no                     no            
##  2 ub13          no      yes           no                     no            
##  3 ub32          no      yes           maybe, unsure, unclear no            
##  4 ub36          no      yes           no                     no            
##  5 ub38          no      yes           no                     no            
##  6 ub40          no      yes           no                     no            
##  7 ub41          no      yes           no                     no            
##  8 ub43          no      yes           no                     no            
##  9 ub44          no      yes           no                     no            
## 10 ub48          no      yes           no                     no            
## # … with 100 more variables: predicted_cue <chr>, Cue_condition <chr>,
## #   Counterbias_order <chr>, Sound_assignment <chr>, IAT1_order <chr>,
## #   IAT234_order <chr>, IAT_order <chr>, compensation <chr>,
## #   General_1_Age <int>, General_1_Sex <chr>, General_1_Race <chr>,
## #   General_1_English <chr>, General_1_EnglishYrs <int>,
## #   General_1_Caffeine <chr>, General_1_CaffCups <int>,
## #   General_1_CaffHrsAgo <dbl>, General_1_SleepDisor <chr>,
## #   General_1_MentalDiso <chr>, General_1_Meds <chr>, General_1_MedList <chr>,
## #   General_1_University <chr>, General_1_UniYears <int>, Demo_1_Ethnic <chr>,
## #   Demo_1_Racial <chr>, Demo_1_Gender <chr>, Demo_1_NonParticipat <chr>,
## #   Epworth_1_Read <chr>, Epworth_1_TV <chr>, Epworth_1_Public <chr>,
## #   Epworth_1_Passenger <chr>, Epworth_1_LyingDown <chr>,
## #   Epworth_1_Talking <chr>, Epworth_1_Lunch <chr>, Epworth_1_Traffic <chr>,
## #   Epworth_total <int>, AlertTest_1_Concentr_1 <int>,
## #   AlertTest_1_Refresh_1 <int>, AlertTest_1_Feel <chr>,
## #   AlertTest_2_Concentr_1 <int>, AlertTest_2_Refresh_1 <int>,
## #   AlertTest_2_Feel <chr>, AlertTest_3_Concentr_1 <int>,
## #   AlertTest_3_Refresh_1 <int>, AlertTest_3_Feel <chr>,
## #   AlertTest_4_Concentr_1 <int>, AlertTest_4_Refresh_1 <int>,
## #   AlertTest_4_Feel <chr>, S1_ExitQ_1_sound <chr>,
## #   S1_ExitQ_1_soundaffect <chr>, S1_ExitQ_2_sound <chr>,
## #   S1_ExitQ_3_sound <chr>, S1_ExitQ_4_sound <chr>,
## #   S1_ExitQ_4_soundaffect <chr>, S1_ExitQ_5_sound <chr>,
## #   S1_ExitQ_5_soundaffect <chr>, S2_ExitQ_1_sound <chr>,
## #   S2_ExitQ_1_soundaffect <chr>, S2_ExitQ_2_sound <chr>,
## #   S2_ExitQ_3_sound <chr>, S2_ExitQ_4_sound <chr>,
## #   S2_ExitQ_4_soundaffect <chr>, S2_ExitQ_5_sound <chr>,
## #   S2_ExitQ_5_soundaffect <chr>, Total_sleep <int>, Wake_amount <int>,
## #   NREM1_amount <int>, NREM2_amount <dbl>, SWS_amount <int>, REM_amount <int>,
## #   SWSxREM <int>, cue_minutes <dbl>, baseIATcued <dbl>, baseIATuncued <dbl>,
## #   preIATcued <dbl>, preIATuncued <dbl>, postIATcued <dbl>,
## #   postIATuncued <dbl>, weekIATcued <dbl>, weekIATuncued <dbl>,
## #   postnap_change_cued <dbl>, postnap_change_uncued <dbl>,
## #   week_change_cued <int>, week_change_uncued <int>,
## #   diff_biaschange_cued <dbl>, diff_biaschange_uncued <dbl>,
## #   diff_biaschange <dbl>, base_IAT_race <dbl>, base_IAT_gen <dbl>,
## #   pre_IAT_race <dbl>, pre_IAT_gen <dbl>, post_IAT_race <dbl>,
## #   week_IAT_race <dbl>, post_IAT_gen <dbl>, week_IAT_gen <dbl>,
## #   filter_. <chr>, cues_total <dbl>, baselineIAT <dbl>, prenapIAT <dbl>,
## #   postnapIAT <dbl>, weekIAT <dbl>

Calculate the average bias levels:

year2uni <- year2uni_participants %>% 
  select(baselineIAT, prenapIAT, postnapIAT, weekIAT)

year2uni_avgbias <- year2uni %>% 
  summarise(across(contains("IAT"), list(mean = mean)))

head(year2uni_avgbias)

## # A tibble: 6 x 4
##   baselineIAT_mean prenapIAT_mean postnapIAT_mean weekIAT_mean
##              <dbl>          <dbl>           <dbl>        <dbl>
## 1            0.865          0.445          0.340        0.555 
## 2            0.606          0.197          0.624        0.839 
## 3            0.442         -0.378          0.311        0.950 
## 4            0.503          0.398          0.199        0.631 
## 5           -0.219         -0.273          0.0524       0.0274
## 6            0.437         -0.176          0.269        0.402

year2uni_se <- std.error(year2uni)
head(year2uni_se)

## baselineIAT   prenapIAT  postnapIAT     weekIAT 
##   0.1082313   0.1509254   0.1050117   0.1107171

3 years at Uni bias change:

Select participants who have attended uni for 3 years:

year3uni_participants <- mutated_exploratorydata1 %>%
  filter(General_1_UniYears == "3")

print(year3uni_participants)

## # A tibble: 6 x 105
## # Rowwise: 
##   ParticipantID exclude cue_presented heard_cue_report heard_cue_exit
##   <chr>         <chr>   <chr>         <chr>            <chr>         
## 1 ub6           no      yes           no               no            
## 2 ub7           no      yes           no               unsure        
## 3 ub9           no      yes           no               no            
## 4 ub35          no      yes           no               no            
## 5 ub42          no      yes           no               no            
## 6 ub49          no      yes           no               no            
## # … with 100 more variables: predicted_cue <chr>, Cue_condition <chr>,
## #   Counterbias_order <chr>, Sound_assignment <chr>, IAT1_order <chr>,
## #   IAT234_order <chr>, IAT_order <chr>, compensation <chr>,
## #   General_1_Age <int>, General_1_Sex <chr>, General_1_Race <chr>,
## #   General_1_English <chr>, General_1_EnglishYrs <int>,
## #   General_1_Caffeine <chr>, General_1_CaffCups <int>,
## #   General_1_CaffHrsAgo <dbl>, General_1_SleepDisor <chr>,
## #   General_1_MentalDiso <chr>, General_1_Meds <chr>, General_1_MedList <chr>,
## #   General_1_University <chr>, General_1_UniYears <int>, Demo_1_Ethnic <chr>,
## #   Demo_1_Racial <chr>, Demo_1_Gender <chr>, Demo_1_NonParticipat <chr>,
## #   Epworth_1_Read <chr>, Epworth_1_TV <chr>, Epworth_1_Public <chr>,
## #   Epworth_1_Passenger <chr>, Epworth_1_LyingDown <chr>,
## #   Epworth_1_Talking <chr>, Epworth_1_Lunch <chr>, Epworth_1_Traffic <chr>,
## #   Epworth_total <int>, AlertTest_1_Concentr_1 <int>,
## #   AlertTest_1_Refresh_1 <int>, AlertTest_1_Feel <chr>,
## #   AlertTest_2_Concentr_1 <int>, AlertTest_2_Refresh_1 <int>,
## #   AlertTest_2_Feel <chr>, AlertTest_3_Concentr_1 <int>,
## #   AlertTest_3_Refresh_1 <int>, AlertTest_3_Feel <chr>,
## #   AlertTest_4_Concentr_1 <int>, AlertTest_4_Refresh_1 <int>,
## #   AlertTest_4_Feel <chr>, S1_ExitQ_1_sound <chr>,
## #   S1_ExitQ_1_soundaffect <chr>, S1_ExitQ_2_sound <chr>,
## #   S1_ExitQ_3_sound <chr>, S1_ExitQ_4_sound <chr>,
## #   S1_ExitQ_4_soundaffect <chr>, S1_ExitQ_5_sound <chr>,
## #   S1_ExitQ_5_soundaffect <chr>, S2_ExitQ_1_sound <chr>,
## #   S2_ExitQ_1_soundaffect <chr>, S2_ExitQ_2_sound <chr>,
## #   S2_ExitQ_3_sound <chr>, S2_ExitQ_4_sound <chr>,
## #   S2_ExitQ_4_soundaffect <chr>, S2_ExitQ_5_sound <chr>,
## #   S2_ExitQ_5_soundaffect <chr>, Total_sleep <int>, Wake_amount <int>,
## #   NREM1_amount <int>, NREM2_amount <dbl>, SWS_amount <int>, REM_amount <int>,
## #   SWSxREM <int>, cue_minutes <dbl>, baseIATcued <dbl>, baseIATuncued <dbl>,
## #   preIATcued <dbl>, preIATuncued <dbl>, postIATcued <dbl>,
## #   postIATuncued <dbl>, weekIATcued <dbl>, weekIATuncued <dbl>,
## #   postnap_change_cued <dbl>, postnap_change_uncued <dbl>,
## #   week_change_cued <int>, week_change_uncued <int>,
## #   diff_biaschange_cued <dbl>, diff_biaschange_uncued <dbl>,
## #   diff_biaschange <dbl>, base_IAT_race <dbl>, base_IAT_gen <dbl>,
## #   pre_IAT_race <dbl>, pre_IAT_gen <dbl>, post_IAT_race <dbl>,
## #   week_IAT_race <dbl>, post_IAT_gen <dbl>, week_IAT_gen <dbl>,
## #   filter_. <chr>, cues_total <dbl>, baselineIAT <dbl>, prenapIAT <dbl>,
## #   postnapIAT <dbl>, weekIAT <dbl>

calculate avg bias levels:

year3uni <- year3uni_participants %>% 
  select(baselineIAT, prenapIAT, postnapIAT, weekIAT)

year3uni_avgbias <- year3uni %>% 
  summarise(across(contains("IAT"), list(mean = mean)))

head(year3uni_avgbias)

## # A tibble: 6 x 4
##   baselineIAT_mean prenapIAT_mean postnapIAT_mean weekIAT_mean
##              <dbl>          <dbl>           <dbl>        <dbl>
## 1            0.592         0.387          0.575          0.443
## 2            0.372         0.103         -0.00611        0.224
## 3            0.242        -0.486          0.352          0.563
## 4            0.924         0.453          0.204          0.538
## 5            0.740         0.508          0.0421        -0.467
## 6            0.387         0.0875         0.376          0.249

year3uni_se <- std.error(year3uni)
head(year3uni_se)

## baselineIAT   prenapIAT  postnapIAT     weekIAT 
##   0.1049749   0.1508470   0.0898536   0.1562415

Create dataframe

When I tried to create this dataframe, it kept throwing an error saying that the columns had differing numbers of rows. At this stage, I decided to wait for the Q&A session on Tuesday and see whether anything covered there would help.

biasdata <- data.frame(
  condition = factor(c("0", "0", "0", "0", "1", "1", "1", "1", "2", "2", "2", "2", "3", "3", "3", "3")),
  time = factor(c("Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week")),
  levels = c("Baseline", "Prenap", "Postnap", "1-week"),
  bias_av = c(year0uni_avgbias$baselineIAT_mean, year0uni_avgbias$prenapIAT_mean, year0uni_avgbias$postnapIAT_mean, year0uni_avgbias$weekIAT_mean, 
              year1uni_avgbias$baselineIAT_mean, year1uni_avgbias$prenapIAT_mean, year1uni_avgbias$postnapIAT_mean, year1uni_avgbias$weekIAT_mean, 
              year2uni_avgbias$baselineIAT_mean, year2uni_avgbias$prenapIAT_mean, year2uni_avgbias$postnapIAT_mean, year2uni_avgbias$weekIAT_mean, 
              year3uni_avgbias$baselineIAT_mean, year3uni_avgbias$prenapIAT_mean, year3uni_avgbias$postnapIAT_mean, year3uni_avgbias$weekIAT_mean))

## Warning: Unknown or uninitialised column: `baselineIAT_mean`.

## Warning: Unknown or uninitialised column: `prenapIAT_mean`.

## Warning: Unknown or uninitialised column: `postnapIAT_mean`.

## Warning: Unknown or uninitialised column: `weekIAT_mean`.
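One possible explanation for the row mismatch, judging from the "# Rowwise:" flag in the printed tibble above: on a rowwise data frame, summarise() returns one row per participant rather than a single row of means, which would match the 6-row output of head(year3uni_avgbias). The sketch below uses a made-up toy tibble to show the difference (toy, rowwise_means and overall_means are hypothetical names, and the numbers are invented). It illustrates the idea under that assumption and is not a confirmed diagnosis of the real data.

```r
library(dplyr)

# Toy data standing in for the real participants (values are made up)
toy <- tibble(
  baselineIAT = c(0.5, 0.7, 0.3),
  prenapIAT   = c(0.2, 0.4, 0.0)
) %>%
  rowwise()   # mimics the "# Rowwise:" flag in the printed tibble

# On rowwise data, summarise() works one row at a time: 3 rows come back
rowwise_means <- toy %>%
  summarise(across(contains("IAT"), list(mean = mean)))

# After ungroup(), summarise() collapses each column: 1 row comes back
overall_means <- toy %>%
  ungroup() %>%
  summarise(across(contains("IAT"), list(mean = mean)))

nrow(rowwise_means)
nrow(overall_means)
```

If the real data behave the same way, adding ungroup() before summarise() should give one mean per timepoint, so the 16 values in bias_av would line up with the 16 condition and time labels.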

Attempt 4

After the Q&A where Jenny went through how to do descriptive statistics quickly and easily, I tried to do the same:

I decided to calculate the means and standard errors for each of the 4 IAT timepoints, grouped by the number of years at university. I also calculated the standard error from its formula (the standard deviation divided by the square root of n) rather than with a dedicated function:

year_summary_baseline <- mutated_exploratorydata1 %>% 
  group_by(General_1_UniYears) %>% 
  summarise(mean = mean(baselineIAT),
            sd = sd(baselineIAT),
            n = n(),
            se = sd/sqrt(n))

year_summary_prenap <- mutated_exploratorydata1 %>% 
  group_by(General_1_UniYears) %>% 
  summarise(mean = mean(prenapIAT),
            sd = sd(prenapIAT),
            n = n(),
            se = sd/sqrt(n))

year_summary_postnap <- mutated_exploratorydata1 %>% 
  group_by(General_1_UniYears) %>% 
  summarise(mean = mean(postnapIAT),
            sd = sd(postnapIAT),
            n = n(),
            se = sd/sqrt(n))

year_summary_week <- mutated_exploratorydata1 %>% 
  group_by(General_1_UniYears) %>% 
  summarise(mean = mean(weekIAT),
            sd = sd(weekIAT),
            n = n(),
            se = sd/sqrt(n))
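The four near-identical blocks above could probably be collapsed into one by reshaping the timepoint columns into long format first. This is a minimal sketch on made-up numbers: toy, timepoint, bias and year_summary_long are hypothetical names, and the ungroup() call assumes the real data may still be rowwise.

```r
library(dplyr)
library(tidyr)

# Toy data: 2 participants per year group, invented scores
toy <- tibble(
  General_1_UniYears = c(0, 0, 1, 1),
  baselineIAT = c(0.4, 0.6, 0.5, 0.9),
  prenapIAT   = c(0.2, 0.4, 0.1, 0.3),
  postnapIAT  = c(0.3, 0.1, 0.4, 0.2),
  weekIAT     = c(0.5, 0.3, 0.6, 0.4)
)

year_summary_long <- toy %>%
  ungroup() %>%                       # drop any rowwise/grouping structure
  pivot_longer(
    cols = c(baselineIAT, prenapIAT, postnapIAT, weekIAT),
    names_to = "timepoint",           # which IAT column the score came from
    values_to = "bias"                # the score itself
  ) %>%
  group_by(General_1_UniYears, timepoint) %>%
  summarise(mean = mean(bias),
            sd = sd(bias),
            n = n(),
            se = sd / sqrt(n),        # standard error from the formula
            .groups = "drop")

print(year_summary_long)
```

The result has one row per combination of year group and timepoint, which is also the shape that ggplot() expects for a line graph.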

Now that the means and standard errors have been calculated, I have to put the descriptive statistics into a table for each timepoint (i.e. create 4 tables). I used gt() from the gt package to create each table.

year_summary_baseline %>% 
  gt() %>% 
  tab_header(title = md("**Baseline IAT timepoint**")) %>% 
  fmt_number(
    columns = vars(mean, sd, se),
    decimals = 2
  ) %>% 
  cols_label(General_1_UniYears = "Number of Years at University", 
             mean = "Mean", 
             sd = "SD", 
             n = "n",
             se = "SE")

## Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
## * please use `columns = c(...)` instead

## Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
## * please use `columns = c(...)` instead

Baseline IAT timepoint
Number of Years at University	Mean	SD	n	SE
0	0.56	0.29	11	0.09
1	0.63	0.22	4	0.11
2	0.53	0.34	10	0.11
3	0.54	0.26	6	0.10

year_summary_prenap %>% 
  gt() %>% 
  tab_header(title = md("**Prenap IAT timepoint**")) %>% 
  fmt_number(
    columns = vars(mean, sd, se),
    decimals = 2
  ) %>% 
  cols_label(General_1_UniYears = "Number of Years at University", 
             mean = "Mean", 
             sd = "SD", 
             n = "n",
             se = "SE")

## Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
## * please use `columns = c(...)` instead

## Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
## * please use `columns = c(...)` instead

Prenap IAT timepoint
Number of Years at University	Mean	SD	n	SE
0	0.34	0.20	11	0.06
1	0.35	0.40	4	0.20
2	0.18	0.48	10	0.15
3	0.18	0.37	6	0.15

year_summary_postnap %>% 
  gt() %>% 
  tab_header(title = md("**Postnap IAT timepoint**")) %>% 
  fmt_number(
    columns = vars(mean, sd, se),
    decimals = 2
  ) %>% 
  cols_label(General_1_UniYears = "Number of Years at University", 
             mean = "Mean", 
             sd = "SD", 
             n = "n",
             se = "SE")

## Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
## * please use `columns = c(...)` instead

## Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
## * please use `columns = c(...)` instead

Postnap IAT timepoint
Number of Years at University	Mean	SD	n	SE
0	0.25	0.39	11	0.12
1	0.37	0.37	4	0.18
2	0.29	0.33	10	0.11
3	0.26	0.22	6	0.09

year_summary_week %>% 
  gt() %>% 
  tab_header(title = md("**One-week delay IAT timepoint**")) %>% 
  fmt_number(
    columns = vars(mean, sd, se),
    decimals = 2
  ) %>% 
  cols_label(General_1_UniYears = "Number of Years at University", 
             mean = "Mean", 
             sd = "SD", 
             n = "n",
             se = "SE")

## Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
## * please use `columns = c(...)` instead

## Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
## * please use `columns = c(...)` instead

One-week delay IAT timepoint
Number of Years at University	Mean	SD	n	SE
0	0.30	0.33	11	0.10
1	0.50	0.20	4	0.10
2	0.56	0.35	10	0.11
3	0.26	0.38	6	0.16

Visualisation

Now that the descriptive statistics have been calculated, it's time to create the figure. First, I have to create the dataframe for the figure.
- condition = defines the data points for each of the 4 lines in the graph; one for each year spent in uni
- time = defines the x-axis, and the 4 different IAT timepoints
- levels also helps define the x-axis
- bias_av defines the data points for the average bias scores
- stderror defines the data points for the standard errors

biasdata <- data.frame(
  condition = factor(c("0", "0", "0", "0", "1", "1", "1", "1", "2", "2", "2", "2", "3", "3", "3", "3")),
  time = factor(c("Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week")),
  levels = c("Baseline", "Prenap", "Postnap", "1-week"),
  bias_av = c(0.5633872, 0.3387359, 0.2454916, 0.2977948,
              0.6255864, 0.3477496, 0.3722320, 0.5012622,
              0.5295660, 0.1786688, 0.2875954, 0.5551314,
              0.5428986, 0.1754848, 0.2571501, 0.2583117),
  stderror = c(0.08727692, 0.06171902, 0.1170974, 0.09872838,
               0.6255864, 0.19775303, 0.1848361, 0.09931040,
               0.10823134, 0.15092541, 0.1050117, 0.11071705,
               0.10497492, 0.15084704, 0.0898536, 0.15624148
  ))

head(biasdata)

##   condition     time   levels   bias_av   stderror
## 1         0 Baseline Baseline 0.5633872 0.08727692
## 2         0   Prenap   Prenap 0.3387359 0.06171902
## 3         0  Postnap  Postnap 0.2454916 0.11709740
## 4         0   1-week   1-week 0.2977948 0.09872838
## 5         1 Baseline Baseline 0.6255864 0.62558640
## 6         1   Prenap   Prenap 0.3477496 0.19775303
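Two things stand out in the hand-typed version above. The fifth stderror value (0.6255864) is identical to that group's mean, whereas the baseline table reports an SE of 0.11 for the 1-year group. Also, levels sits outside factor(), so it becomes its own column and time keeps its default alphabetical order. Building the dataframe from the summary tables avoids copying numbers by hand. The sketch below is one way this might look, shown for two timepoints only. The stand-in tables use the rounded values from the gt tables above, and biasdata_sketch is a hypothetical name.

```r
library(dplyr)

# Stand-ins for two of the summary tables (rounded values from the gt tables)
year_summary_baseline <- tibble(
  General_1_UniYears = 0:3,
  mean = c(0.56, 0.63, 0.53, 0.54),
  se   = c(0.09, 0.11, 0.11, 0.10)
)
year_summary_prenap <- tibble(
  General_1_UniYears = 0:3,
  mean = c(0.34, 0.35, 0.18, 0.18),
  se   = c(0.06, 0.20, 0.15, 0.15)
)

biasdata_sketch <- bind_rows(
  Baseline = year_summary_baseline,   # argument names become the "time" labels
  Prenap   = year_summary_prenap,
  .id = "time"
) %>%
  mutate(
    # levels goes inside factor() so the x-axis keeps chronological order
    time = factor(time, levels = c("Baseline", "Prenap")),
    condition = factor(General_1_UniYears)
  ) %>%
  select(condition, time, bias_av = mean, stderror = se)

print(biasdata_sketch)
```

With the real (unrounded) summary tables, the same pattern extended to all four timepoints should give the 16 rows needed for the figure.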

Attempt 1

Now that the dataframe is created, the figure can be built. I used ggplot() to create it. However, for some reason only the standard error bars showed up: the lines connecting the timepoints didn't appear, and the legend for the 4 conditions was missing.

fig_1 <- ggplot(data = biasdata, 
                aes(x = time, y = bias_av, fill = condition))+
  geom_line()+
  geom_errorbar(aes(x = time, ymin = bias_av - stderror, ymax = bias_av + stderror), 
                width=0.1, 
                colour="grey", 
                alpha= 0.9) +
   ylim(0.0, 0.7) +
  labs(x = "", 
       y = "D600 Bias Score", 
       caption = "Fig 3. Average D600 scores at each IAT timepoint") +
  theme_bw()

print(fig_1)

## geom_path: Each group consists of only one observation. Do you need to adjust
## the group aesthetic?

Attempt 2

I tried defining the x-axis better, by first defining the variable to be used, and the levels too.
The lines also needed to be defined better: first by colour (gives each condition its own line colour) and then by group (tells ggplot which points belong to the same line).
I also changed the theme to classic, because I didn't like the grid lines in the plot.

fig_1.2 <- ggplot(data = biasdata, 
                  aes(x = factor(time, levels = c("Baseline", "Prenap", "Postnap", "1-week")), 
                      y = bias_av, 
                      colour = condition, 
                      group = condition)) +
  geom_line()+
  geom_errorbar(aes(x = time, ymin = bias_av - stderror, ymax = bias_av + stderror), 
                width=0.1, 
                colour="grey", 
                alpha= 0.9) +
   ylim(0.0, 0.7) +
  labs(x = "", 
       y = "D600 Bias Score", 
       caption = "Fig 3. Average D600 scores at each IAT timepoint",
       title = "Do the number of years spent in university affect students' bias scores?") +
  theme_classic()

print(fig_1.2)

Attempt 3

However, I realised that the legend title for my plot didn't make sense. I first tried setting it with group = inside labs(), which didn't change anything. Setting colour = instead changed the legend title, because the legend belongs to the colour aesthetic.

fig_1.3 <- ggplot(data = biasdata, 
                  aes(x = factor(time, levels = c("Baseline", "Prenap", "Postnap", "1-week")), 
                      y = bias_av, 
                      colour = condition, 
                      group = condition)) +
  geom_line()+
  geom_errorbar(aes(x = time, ymin = bias_av - stderror, ymax = bias_av + stderror), 
                width=0.1, 
                colour="grey", 
                alpha= 0.9) +
   ylim(0.0, 0.7) +
  labs(x = "", 
       y = "D600 Bias Score", 
       caption = "Fig 3. Average D600 scores at each IAT timepoint",
       title = "Do the number of years spent in university affect students' bias scores?",
       colour = "Number of years in university") +
  theme_classic()

print(fig_1.3)

Interestingly, it seems that those who spent 1 year in uni are the least affected by the TMR procedure, maintaining a relatively high bias score across the 4 timepoints. Meanwhile, 2- and 3-year students have similarly low bias scores before the nap, but 2-year students show the fastest recovery, surpassing 1-year students at the one-week timepoint. In contrast, 0- and 3-year students maintain low bias scores even one week later, suggesting that the TMR procedure may have had a greater impact on them than on the other year groups.

Statistics !

I have to compare means between conditions (number of years spent in university) at each IAT timepoint, so I thought I would use ANOVA. I used the method that Jenny showed in the Q&A session. However, I have no idea how to interpret it.

Timepoint 1: baseline

m1 <- aov(mean ~ General_1_UniYears, data = year_summary_baseline)

summary(m1)

##                    Df   Sum Sq  Mean Sq F value Pr(>F)
## General_1_UniYears  1 0.001240 0.001240   0.594  0.522
## Residuals           2 0.004177 0.002088

Timepoint 2: prenap

m2 <- aov(mean ~ General_1_UniYears, data = year_summary_prenap)

summary(m2)

##                    Df   Sum Sq  Mean Sq F value Pr(>F)
## General_1_UniYears  1 0.021703 0.021703   7.291  0.114
## Residuals           2 0.005954 0.002977

Timepoint 3: postnap

m3 <- aov(mean ~ General_1_UniYears, data = year_summary_postnap)

summary(m3)

##                    Df   Sum Sq  Mean Sq F value Pr(>F)
## General_1_UniYears  1 0.000123 0.000123   0.025  0.888
## Residuals           2 0.009703 0.004852

Timepoint 4: one-week delay

m4 <- aov(mean ~ General_1_UniYears, data = year_summary_week)

summary(m4)

##                    Df  Sum Sq Mean Sq F value Pr(>F)
## General_1_UniYears  1 0.00021 0.00021   0.006  0.943
## Residuals           2 0.06459 0.03230

I'm not great with statistics, but I'm assuming Pr(>F) is the p-value associated with each F-statistic. Since all four p-values are greater than 0.05, there is no evidence here to suggest that the number of years spent in university affects students' bias scores.
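A note on the models above: each aov() call is fitted to a 4-row summary table (one mean per year group) with General_1_UniYears treated as a number, which is why the output shows 1 and 2 degrees of freedom. A one-way ANOVA comparing groups is usually fitted to one row per participant, with the grouping variable as a factor. The sketch below shows that shape on simulated scores. toy and m1_sketch are hypothetical names; the group sizes (11, 4, 10 and 6) come from the tables above, while the scores are random and are not the study's data.

```r
set.seed(1)  # makes the simulated scores reproducible

# One row per simulated participant; scores are random, not real data
toy <- data.frame(
  General_1_UniYears = rep(c(0, 1, 2, 3), times = c(11, 4, 10, 6)),
  baselineIAT = rnorm(31, mean = 0.55, sd = 0.3)
)

# factor() makes aov() compare 4 groups instead of fitting a straight line
m1_sketch <- aov(baselineIAT ~ factor(General_1_UniYears), data = toy)

summary(m1_sketch)
```

With the real data, the equivalent call would presumably be aov(baselineIAT ~ factor(General_1_UniYears), data = mutated_exploratorydata1), giving 3 and 27 degrees of freedom if all 31 participants have scores.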

Challenges and successes

Most of these have been documented above; however, I'm not sure if I did my statistics right. Hopefully, at the next Q&A session, I'll have a chance to ask questions about this.

Next steps

I now need to decide which two other questions I'll include in my verification report, and then carry out the coding to answer them. I also need to do more research on the statistics side of this, as statistics is definitely not one of my strengths.

Learning Log 8

Jade Gurtala

24/07/2021
