This week’s coding goals

  1. Attend the Q&A session to learn how to fix working directory issues
  2. Get started on reproducing descriptive statistics from the main study of my group’s research article
  3. Work with my group and resolve any issues as best as possible

Achieving the goals

Here are the steps I took to start reproducing the descriptive statistics.

  1. Load packages and have R read data
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.4     v purrr   0.3.4
## v tibble  3.1.2     v dplyr   1.0.7
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(dplyr)
library(ggplot2)
data <- read.csv("beliefsuperiority_all.csv")
data <- filter(data,Q62 == 1)
  2. Remove participants who failed the attention check items and store the result in a separate dataframe
data_attn= filter(data,AC_a==3) %>% 
  filter(AC_b==5)
  3. Remove the attention check items from the dataframe, which reduces the number of variables in data_attn from 65 to 63
data_attn=dplyr::select(data_attn,-starts_with('AC'))
  4. Reverse score items 2, 4, 5, 7, 10, 11, 13, 16, 18 and 19 on the dogmatism scale
  5. Calculate the mean scores for items on the dogmatism scale

But this is where I ran into some issues…

Challenges and successes

I tried to create a new dataframe for the dogmatism scores and reverse score certain items, but an error message popped up saying ‘Argument 2 must be named, not unnamed’. So as a group, we looked back at the original researchers’ code and figured out that I had to install a package called ‘car’, which helps to transform data.

install.packages("car")
library(car)

I then ran the chunk again, but this time the last line of code, which calculates the mean of the dogscale scores, produced an error.

data_attn$Q37_2 = recode(data_attn$Q37_2, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_4 = recode(data_attn$Q37_4, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_5 = recode(data_attn$Q37_5, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_7 = recode(data_attn$Q37_7, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_10 = recode(data_attn$Q37_10, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_11 = recode(data_attn$Q37_11, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_13 = recode(data_attn$Q37_13, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_16 = recode(data_attn$Q37_16, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_18 = recode(data_attn$Q37_18, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_19 = recode(data_attn$Q37_19, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')

dogscale=dplyr::select(data_attn,starts_with('Q37'))
data_attn$meanD=rowMeans(dogscale,na.rm = TRUE)

After consulting the original code, I found that the researchers had already converted each factor column (the x in their function) into a character and then into a numeric. So I added that code and ran the chunk again, and this time it worked and produced the histogram.
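To check my understanding of why the conversion goes through a character first, here is a minimal sketch in base R. The factor below is made-up toy data, not a column from the study, and the names x, codes and values are only for illustration.

```r
# Toy factor standing in for one survey column (made-up values, not the study data)
x <- factor(c("9", "1", "5"))

# as.numeric() on a factor returns the internal level codes, not the labels
codes <- as.numeric(x)

# Converting to character first recovers the numbers that were originally entered
values <- as.numeric(as.character(x))

print(codes)   # 3 1 2
print(values)  # 9 1 5
```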

library(car)
## Loading required package: carData
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode
## The following object is masked from 'package:purrr':
## 
##     some
data[] <- lapply(data, function(x) {
    if(is.factor(x)) as.numeric(as.character(x)) else x
})

data_attn$Q37_2 = recode(data_attn$Q37_2, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_4 = recode(data_attn$Q37_4, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_5 = recode(data_attn$Q37_5, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_7 = recode(data_attn$Q37_7, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_10 = recode(data_attn$Q37_10, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_11 = recode(data_attn$Q37_11, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_13 = recode(data_attn$Q37_13, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_16 = recode(data_attn$Q37_16, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_18 = recode(data_attn$Q37_18, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')
data_attn$Q37_19 = recode(data_attn$Q37_19, '1=9; 2=8; 3=7; 4=6; 6=4; 7=3; 8=2; 9=1')

dogscale=dplyr::select(data_attn,starts_with('Q37'))
data_attn$meanD=rowMeans(dogscale,na.rm = TRUE)

hist(data_attn$meanD)
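Looking at the ten recode lines, I think the same reverse scoring could also be written more compactly, because on a 1 to 9 scale reversing an item is the same as subtracting it from 10. This is only a sketch of an alternative in base R, not the researchers' method. The toy dataframe and the names toy, reverse_items and dog_cols are made up for illustration, and it assumes the columns are already numeric.

```r
# Toy data standing in for data_attn (made-up responses on a 1-9 scale)
toy <- data.frame(
  Q37_1 = c(1, 5, 9),
  Q37_2 = c(1, 5, 9),
  Q37_4 = c(2, 6, NA)
)

# Items to reverse score; only the ones present in the toy data are listed here
reverse_items <- c("Q37_2", "Q37_4")

# On a 1-9 scale, reversing is the same as subtracting from 10 (1 -> 9, 5 -> 5, 9 -> 1)
toy[reverse_items] <- 10 - toy[reverse_items]

# Row means across all Q37 columns, ignoring missing answers
dog_cols <- grep("^Q37", names(toy), value = TRUE)
toy$meanD <- rowMeans(toy[dog_cols], na.rm = TRUE)

print(toy)
```

With the real data, reverse_items would hold all ten reverse-scored column names, so one line would replace the ten separate recode calls.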

Along the way, I made a lot of careless mistakes, such as referring to a dataframe by a different name from the one it has in the global environment. But working with my group really helped to solve these problems!

The next stage

Next, I plan to reproduce the mean scores of the belief superiority scale, as well as the descriptive statistics of the participants and the pilot study. I also hope to learn more about the functions the researchers used. I’m really looking forward to working with my group again!