Week 7 coding goals

With the group project graphs and tables mostly complete, this week’s goals were about understanding R better and cleaning up my code for the verification report:

  • Learn to use the gt package by recreating Table 1 myself
  • Learn to build tibbles and dataframes that refer to values by variable name instead of hard-coding the numbers

Challenges and successes

1. Learn to use the gt package

I decided to attempt to better understand tables and the creation of descriptive statistics since most of the first few tables were completed by my group members.

Here is the table that I was trying to recreate.

First, I had to create the descriptive statistics. So, I ran the usual code where I loaded the packages, read the data from the original OSF file, and removed excluded participants. Then I used a dataframe Jade put together for Table 1. This was made from the descriptive statistics which we learnt to calculate a few weeks ago.

Next came the gt package itself. First, we name the table and tell it to use the tibble1 dataframe, which we pipe into gt(). Then we give the table a title with tab_header(). Because the original table rounds each value to 2 decimal places, we do the same for the Mean and SD columns by specifying decimals = 2 in fmt_number(). The “Mean” and “SD” column headings come straight from the tibble, and cols_label() blanks out the heading of the first column. Finally, tab_source_note() adds a note under the whole table that explains how the IAT values are calculated.

#load packages
library(readspss) #package to read the original datafile from OSF
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.4     ✓ purrr   0.3.4
## ✓ tibble  3.1.2     ✓ dplyr   1.0.6
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(gt)
library(plotrix)

#read data
data <- read.sav("Humiston & Wamsley 2019 data.sav")

#remove excluded participants
cleandata <- data %>%
  filter(exclude=="no")

#use dataframe
tibble1 <- tibble( #3 columns
  Characteristics = c("Age (yrs)", "ESS", "SSS", "Baseline implicit bias", "Prenap implicit bias", "Postnap implicit bias", "One-week delay implicit bias", "Sex (% male)", "Cue played during nap (% racial cue)"), #label
  Mean = c(19.5, 15.3, 2.81, 0.557, 0.257, 0.278, 0.399, 0.484, 0.548),
  SD = c(1.23, 2.83, 0.749, 0.406, 0.478, 0.459, 0.425, NA, NA)
)

print(tibble1)
## # A tibble: 9 x 3
##   Characteristics                        Mean     SD
##   <chr>                                 <dbl>  <dbl>
## 1 Age (yrs)                            19.5    1.23 
## 2 ESS                                  15.3    2.83 
## 3 SSS                                   2.81   0.749
## 4 Baseline implicit bias                0.557  0.406
## 5 Prenap implicit bias                  0.257  0.478
## 6 Postnap implicit bias                 0.278  0.459
## 7 One-week delay implicit bias          0.399  0.425
## 8 Sex (% male)                          0.484 NA    
## 9 Cue played during nap (% racial cue)  0.548 NA
table1 <- tibble1 %>%
  gt() %>% #create a gt table from the tibble
  tab_header(title = md("Table 1. Participant characteristics.")) %>% #title
  fmt_number(columns = c(Mean, SD), decimals = 2) %>% #round to 2 decimal places; c() replaces vars(), which is deprecated since gt 0.3.0
  tab_source_note(source_note = "Implicit bias values are the average of D600 score for each timepoint") %>% #note under the table
  cols_label(Characteristics = "") #blank heading for the first column
print(table1)
## <div id="eoqtrfzwhz" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
##   <style>/* gt's default CSS rules for #eoqtrfzwhz omitted */</style>
##   <table class="gt_table">
##   <thead class="gt_header">
##     <tr>
##       <th colspan="3" class="gt_heading gt_title gt_font_normal gt_bottom_border" style>Table 1. Participant characteristics.</th>
##     </tr>
##     
##   </thead>
##   <thead class="gt_col_headings">
##     <tr>
##       <th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1"></th>
##       <th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">Mean</th>
##       <th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">SD</th>
##     </tr>
##   </thead>
##   <tbody class="gt_table_body">
##     <tr><td class="gt_row gt_left">Age (yrs)</td>
## <td class="gt_row gt_right">19.50</td>
## <td class="gt_row gt_right">1.23</td></tr>
##     <tr><td class="gt_row gt_left">ESS</td>
## <td class="gt_row gt_right">15.30</td>
## <td class="gt_row gt_right">2.83</td></tr>
##     <tr><td class="gt_row gt_left">SSS</td>
## <td class="gt_row gt_right">2.81</td>
## <td class="gt_row gt_right">0.75</td></tr>
##     <tr><td class="gt_row gt_left">Baseline implicit bias</td>
## <td class="gt_row gt_right">0.56</td>
## <td class="gt_row gt_right">0.41</td></tr>
##     <tr><td class="gt_row gt_left">Prenap implicit bias</td>
## <td class="gt_row gt_right">0.26</td>
## <td class="gt_row gt_right">0.48</td></tr>
##     <tr><td class="gt_row gt_left">Postnap implicit bias</td>
## <td class="gt_row gt_right">0.28</td>
## <td class="gt_row gt_right">0.46</td></tr>
##     <tr><td class="gt_row gt_left">One-week delay implicit bias</td>
## <td class="gt_row gt_right">0.40</td>
## <td class="gt_row gt_right">0.42</td></tr>
##     <tr><td class="gt_row gt_left">Sex (% male)</td>
## <td class="gt_row gt_right">0.48</td>
## <td class="gt_row gt_right">NA</td></tr>
##     <tr><td class="gt_row gt_left">Cue played during nap (% racial cue)</td>
## <td class="gt_row gt_right">0.55</td>
## <td class="gt_row gt_right">NA</td></tr>
##   </tbody>
##   <tfoot class="gt_sourcenotes">
##     <tr>
##       <td class="gt_sourcenote" colspan="3">Implicit bias values are the average of D600 score for each timepoint</td>
##     </tr>
##   </tfoot>
##   
## </table>
## </div>

Luckily, this turned out pretty well because I could refer to my group members’ tables and walk through their code slowly, one step at a time, until I understood it. So I would say this was a success, not because I came up with the table myself, but because I was able to use my group members’ code to walk myself through their steps.

However, as you can see, I’m having an issue knitting. In my R Console, the table automatically shows up and looks great, but in my knit it appears as raw HTML rather than a rendered table, and I’m not sure how to address this.
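A possible cause, offered as a guess rather than a confirmed diagnosis: calling print() on a gt table inside a knitted chunk seems to write out the table's HTML source as text, whereas leaving the table name on its own line lets knitr render it. A minimal sketch, assuming an R Markdown document knitted to HTML; demo_df and demo_tbl are made-up names used only for illustration.

```r
library(gt)

# Hypothetical two-row table, used only for illustration
demo_df <- data.frame(Characteristics = c("Age (yrs)", "ESS"),
                      Mean = c(19.5, 15.3))
demo_tbl <- gt(demo_df)

# In a knitted chunk, end with the object name on its own line
# instead of print(demo_tbl), so that knitr can render the table
demo_tbl
```

If this is the cause, replacing print(table1) with table1 in the chunk above would be the equivalent change.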

Here is what a screenshot of my console looks like:

2. Learn to create tibbles with the variable name instead of the value itself.

All of my plots so far have been made with dataframes that look like this:

data4 <- data.frame(
  condition = factor(c("cued", "cued", "cued", "cued", "uncued", "uncued", "uncued", "uncued")),
  time = factor(c("Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week"),
                levels = c("Baseline", "Prenap", "Postnap", "1-week")), #levels belongs inside factor() to order the timepoints
  bias_av = c(0.52, 0.21, 0.31, 0.40, 0.60, 0.30, 0.25, 0.40),
  se = c(0.06, 0.09, 0.08, 0.07, 0.08, 0.08, 0.09, 0.08)
)

head(data4)
##   condition     time bias_av   se
## 1      cued Baseline    0.52 0.06
## 2      cued   Prenap    0.21 0.09
## 3      cued  Postnap    0.31 0.08
## 4      cued   1-week    0.40 0.07
## 5    uncued Baseline    0.60 0.08
## 6    uncued   Prenap    0.30 0.08

I wanted to try to construct a dataframe that looks a bit cleaner and refers to each value by its variable name rather than typing in the numbers.

This proved quite difficult. To demonstrate what I did, I will first show the code we used to calculate those values in the first place. (This is from my Figure 3 code.)

#selecting variables that show biases
choosing_bias <- cleandata %>%
  select(baseIATcued, baseIATuncued, preIATcued, preIATuncued, postIATcued, postIATuncued, weekIATcued, weekIATuncued)

#calculate averages
bias_av <- choosing_bias %>%
  summarise(cued_baseline_av = mean(baseIATcued),
            cued_pre_av = mean(preIATcued),
            cued_post_av = mean(postIATcued),
            cued_week_av = mean(weekIATcued),
            uncued_baseline_av = mean(baseIATuncued),
            uncued_pre_av = mean(preIATuncued),
            uncued_post_av = mean(postIATuncued),
            uncued_week_av = mean(weekIATuncued))
  
print(bias_av)
##   cued_baseline_av cued_pre_av cued_post_av cued_week_av uncued_baseline_av
## 1        0.5175814   0.2108864    0.3068241    0.3999553          0.5954932
##   uncued_pre_av uncued_post_av uncued_week_av
## 1     0.3024484       0.248543      0.3988819
SE_bias <- std.error(choosing_bias) #calculated using the plotrix package, see learning log 6
print(SE_bias)
##   baseIATcued baseIATuncued    preIATcued  preIATuncued   postIATcued 
##    0.06522755    0.08030357    0.09232149    0.07937978    0.07984374 
## postIATuncued   weekIATcued weekIATuncued 
##    0.08578681    0.06954221    0.08388760

I had noticed that Jade managed to use variable names in her table instead of typed-in values by using the dollar sign $. Here, Jade has put the $ between the name of the dataset the value comes from and the name of the variable.

table1 <- tibble( #3 columns
  Characteristics = c("Age (yrs)", "ESS", "SSS", "Baseline implicit bias", "Prenap implicit bias", "Postnap implicit bias", "One-week delay implicit bias", "Sex (% male)", "Cue played during nap (% racial cue)"), #label
  Mean = c(ageaverage$ageaverage, ESS$ESSaverage, SSS$SSSaverage, BIB$BIBaverage, PrenapIB$PrenapIBaverage, PostnapIB$PostnapIBaverage, OWDIB$OWDIBaverage, Male_percentage$n, racialcue_perentage$n), #data$variable
  SD = c(ageaverage$agesd, ESS$ESSsd, SSS$SSSsd, BIB$BIBsd, PrenapIB$PrenapIBsd, PostnapIB$PostnapIBsd, OWDIB$OWDIBsd, NA, NA)
)

print(table1)
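To make the data$variable pattern concrete, here is a minimal sketch of how an object like ageaverage could be built and then read with $. The toy dataframe and its Age column are made up for illustration; only the names ageaverage and agesd come from Jade's code, and this is not necessarily how she calculated them.

```r
library(dplyr)

# Made-up ages, used only for illustration
toy <- data.frame(Age = c(18, 19, 20, 21))

# summarise() returns a one-row dataframe with named columns
ageaverage <- toy %>%
  summarise(ageaverage = mean(Age), agesd = sd(Age))

# $ pulls a single column out of that dataframe by name
ageaverage$ageaverage
ageaverage$agesd
```

Because the result of summarise() is a dataframe, each summary value is a column, which is what $ needs.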

However, when I attempted to do it this way, it didn’t work. R returned “Error in SE_bias$baseIATcued : $ operator is invalid for atomic vectors”.

data4 <- data.frame(
  condition = factor(c("cued", "cued", "cued", "cued", "uncued", "uncued", "uncued", "uncued")),
  time = factor(c("Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week"),
                levels = c("Baseline", "Prenap", "Postnap", "1-week")), #levels belongs inside factor() to order the timepoints
  bias_av = c(bias_av$cued_baseline_av, bias_av$cued_pre_av, bias_av$cued_post_av, bias_av$cued_week_av, bias_av$uncued_baseline_av, bias_av$uncued_pre_av, bias_av$uncued_post_av, bias_av$uncued_week_av),
  se = c(SE_bias$baseIATcued, SE_bias$preIATcued, SE_bias$postIATcued, SE_bias$weekIATcued, SE_bias$baseIATuncued, SE_bias$preIATuncued, SE_bias$postIATuncued, SE_bias$weekIATuncued) #this line triggers the error
)

head(data4)

I later realised this was because SE_bias is not a dataframe. std.error() returned a named numeric vector (an atomic vector), so the values are not columns that $ can reach, even though each one has a name.
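The error message points at the type of SE_bias: its printed output above looks like a named numeric vector rather than a dataframe. Here is a minimal sketch of ways to pull values out of such a vector by name. SE_demo and SE_df are made-up names, and the two values are copied from the printed SE_bias output.

```r
# A named numeric vector with the same shape as SE_bias
# (first two entries only, copied from the printed output)
SE_demo <- c(baseIATcued = 0.06522755, baseIATuncued = 0.08030357)

# [[ ]] extracts one element by name and drops the name
SE_demo[["baseIATcued"]]

# [ ] with a vector of names picks several at once, in the order given;
# unname() removes the names from the result
unname(SE_demo[c("baseIATuncued", "baseIATcued")])

# If $ access is preferred, the vector can be turned into a one-row dataframe
SE_df <- as.data.frame(as.list(SE_demo))
SE_df$baseIATcued
```

With the real object, something like SE_bias[["baseIATcued"]] in place of SE_bias$baseIATcued would follow the same pattern.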

So, I attempted to re-do this dataframe with a new technique for calculating the SE. Instead of using the values given by the plotrix package, I tried the formula that Jenny showed us in the Q&A, se = sd/sqrt(n).

#calculate SE attempt 1
SE_bias1 <- choosing_bias %>%
  summarise(cued_baseline_se = sd/sqrt(baseIATcued))

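However, this came up with the error message “non-numeric argument to binary operator”, because sd on its own is the function rather than a number, and baseIATcued is the column of scores rather than n. In attempt two, I specified exactly which variable each part of the operation should refer to. While this did provide a value, it is not the correct SE: it divides by the square root of the mean score instead of the square root of the number of participants.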

#calculate SE attempt 2
SE_bias2 <- choosing_bias %>%
  summarise(cued_baseline_se = sd(baseIATcued)/sqrt(mean(baseIATcued)))
  
print(SE_bias2)
##   cued_baseline_se
## 1        0.5048037

So this was a half failure. I managed to get the variables to work for the means, but not for the SEs.

So, my question is: how can I use a variable name for SEs in the dataframe?
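One possible answer, sketched under the assumption that the columns contain no missing values: calculate the SEs inside summarise() with sd(x)/sqrt(n()), where n() is dplyr's count of rows. The result is a dataframe, so $ then works the same way it does for the means. The toy_bias data and the names SE_toy and se_values are made up for illustration.

```r
library(dplyr)

# Made-up scores standing in for choosing_bias (illustration only)
toy_bias <- data.frame(baseIATcued = c(0.2, 0.4, 0.6, 0.8),
                       baseIATuncued = c(0.1, 0.5, 0.5, 0.9))

# SE = sd / sqrt(n); n() counts the rows being summarised
SE_toy <- toy_bias %>%
  summarise(cued_baseline_se = sd(baseIATcued) / sqrt(n()),
            uncued_baseline_se = sd(baseIATuncued) / sqrt(n()))

# The result is a one-row dataframe, so $ can reach each SE by name
se_values <- c(SE_toy$cued_baseline_se, SE_toy$uncued_baseline_se)
se_values
```

Note that n() counts every row, so if a column did contain NAs the divisor would need to be sum(!is.na(x)) instead.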

Next steps on my coding journey

From next week, I want to start working on my verification report. This means cleaning up the code and beginning my exploratory analysis.

Goals for next week:

  • Learn to replace the SE number values with the variable name
  • Clean up code
  • Start exploratory analysis