Hello everyone! I hope everyone has had a good Flexi Week despite the lockdown.

Goals

My main goal for this week was to work on Fig 1 from Nichols et al.

our progress for Nichols et al.

loading libraries

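library(tidyverse) # loads readr (read_csv), dplyr and ggplot2, among others
library(janitor) # data-cleaning helpers such as clean_names()
library(dplyr) # used for mutating variables (already loaded by tidyverse)
library(ggplot2) # used to plot the columns (already loaded by tidyverse)
library(ggdist) # used for the raincloud plot (stat_halfeye)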

reading Nichols csv

data1 <- read_csv("data/Nichols_et_al_data.csv")

cleaning up variables

We used the dplyr package to make the data much neater. rename was used to clean up the variable names so they look more like the original. We also used filter and select to narrow the data down to the rows and variables we will be using for Fig 1.

claimpercent was stored as a decimal proportion, so we used mutate to multiply it by 100 and turn it into a percentage. Then, we used as_tibble to make the data easier to look at.

data1 <- data1 %>%  
  filter(include == 0) %>% 
  rename(cond = con,
         claimpercent = claim,
         claimmoney = moneyclaim,
         CT_practice = `completion time (practice included)`,
         CT_payments = `completion time (payments only)`,
         religiosity = relig,
         religion = Religion) %>% 
  select(site, claimpercent, cond, id) %>% 
  mutate(claimpercent = claimpercent * 100) %>% 
  as_tibble()

fig 1 - data by condition

rename each condition

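Before creating a new claimpercent variable that gives each condition a mean value, we had to renumber the conditions so that the religious prime is the reference category (0). The lines below map 4 to 0, 3 to 1 and 1 to 3, and leave 2 unchanged.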

data1$cond[data1$cond==4] <- 0 # Make religious prime the reference category
data1$cond[data1$cond==1] <- 4 # Park 1 at 4 (now free) for the moment: R runs these lines in order,
data1$cond[data1$cond==3] <- 1 # so moving 1 straight to 3 would get it changed again by this line
data1$cond[data1$cond==4] <- 3 # Move the parked values (originally 1) to 3
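
Because the four assignments above depend on the order they run in, a lookup vector is one way to do the same renumbering in a single step. This is only a sketch on a made-up cond vector (not the real data), and the name recode_map is hypothetical. The mapping is the one the four lines end up with: 4 to 0, 3 to 1, 2 stays 2, and 1 to 3.

```r
# Toy condition codes standing in for data1$cond (made-up values, not the real data)
cond <- c(1, 2, 3, 4, 4, 1)

# Hypothetical lookup: names are the old codes, values are the new codes
recode_map <- c("1" = 3, "2" = 2, "3" = 1, "4" = 0)

# Every value is looked up against the ORIGINAL codes, so order no longer matters
cond_new <- unname(recode_map[as.character(cond)])

print(cond_new)
```

Since each value is only ever translated once, there is no need to park a code at a temporary number.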

mutating claimpercent

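We used mutate from the dplyr package to create “claimpercent2”, which is each participant's claimpercent divided by the number of participants in their condition (numberOf). geom_col stacks, and so adds up, every value that shares an x position, so the claimpercent2 values within a condition add up to that condition's mean rather than to the total of all of its values.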

data1 <- data1 %>% 
  mutate(numberOf = (cond == 0) * 100 + (cond == 1 | cond == 2) * 103 + (cond == 3) * 102) %>% 
  mutate(claimpercent2 = claimpercent / numberOf)
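
The group sizes above are typed in by hand. As a sketch of an alternative, assuming the data has one row per participant, dplyr can count the rows in each condition with n(). The toy tibble and its values below are made up purely for illustration, and the names toy and check are hypothetical.

```r
library(dplyr)

# Made-up stand-in for data1: one row per participant
toy <- tibble(
  cond = c(0, 0, 1, 1, 1, 2),
  claimpercent = c(10, 30, 0, 50, 100, 40)
)

toy <- toy %>%
  group_by(cond) %>%
  mutate(
    numberOf = n(),                          # rows in this condition
    claimpercent2 = claimpercent / numberOf  # each row's share of the condition mean
  ) %>%
  ungroup()

# Summing claimpercent2 within a condition gives that condition's mean,
# which is what geom_col ends up drawing when it stacks the values
check <- toy %>%
  group_by(cond) %>%
  summarise(stacked = sum(claimpercent2), mean = mean(claimpercent))

print(check)
```

Counting inside the pipeline means the numbers cannot drift out of step with the filter step if the included rows ever change.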

plotting fig 1 - data by condition

We used “stat_halfeye” from the ggdist package to create the cloud parts of the graph, and combined it with geom_col to get the bar underneath. We had to adjust the widths so that both would fit at each position on the x axis.

“as.numeric(cond)+.5” was used to move the columns so they would fit with the cloud plot.

In the ggdist::stat_halfeye section, adjust scales the smoothing bandwidth of the density (smaller values give a bumpier cloud), and width limits how much room the cloud takes up so there is space for the column. “.width” and “point_colour” control the interval and point that are part of the default cloud plot; we set them to 0 and NA to hide them, since they are not in Fig 1.

We used “coord_flip()” from ggplot2 to flip the graph to look like the original Figure 1. “ylab” was used to rename the y axis.

fig1_condition_rp <- ggplot(data1, aes(x = cond)) +
  geom_col(
    aes(x = as.numeric(cond)+.5, y = claimpercent2),
    width = .3
  ) +
  ggdist::stat_halfeye(
    aes(y = claimpercent),
    adjust = .5,
    width = .5,
    .width = 0,
    point_colour = NA
  ) +
  coord_flip() +
  ylab("Percent Claimed")

So far, our plot is looking like this:

Challenges

Originally we had a huge problem combining geom_col and stat_halfeye: the cloud plot was “squished”. We eventually realised that geom_col was plotting the sum of all the values in each condition instead of the mean, which stretched the Percent Claimed axis and made the cloud plot look smaller.

We also had a hard time getting geom_col to plot the averages, but we got there in the end.

Currently, I am having trouble changing the x-axis labels from the condition numbers to the condition names, and Google has not been much help. I also do not know how to get the column underneath the cloud plot instead of above it without messing up all of the values.
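
One possible approach to both problems, sketched on made-up data rather than the real data set: scale_x_continuous can swap the numeric breaks for text labels, and shifting the column to a slightly lower x value (instead of adding .5) should place it on the other side of the cloud. Only “Religious prime” for condition 0 comes from this post. The other three labels are placeholders, and the names toy, cond_labels and fig1_sketch are hypothetical. I have not checked this against the real data.

```r
library(ggplot2)
library(ggdist)

# Made-up stand-in for data1 after the recoding and mutate steps
set.seed(1)
toy <- data.frame(
  cond = rep(0:3, each = 20),
  claimpercent = runif(80, min = 0, max = 100)
)
toy$claimpercent2 <- toy$claimpercent / 20  # 20 rows per condition in this toy data

# Placeholder names: only "Religious prime" for 0 comes from this post,
# the other three are hypothetical and need replacing with the real ones
cond_labels <- c("Religious prime", "Condition 1", "Condition 2", "Condition 3")

fig1_sketch <- ggplot(toy, aes(x = cond)) +
  geom_col(
    aes(x = cond - .2, y = claimpercent2),  # shift the bar to the other side of the cloud
    width = .3
  ) +
  ggdist::stat_halfeye(
    aes(y = claimpercent, group = cond),    # one cloud per condition
    adjust = .5,
    width = .5,
    .width = 0,
    point_colour = NA
  ) +
  scale_x_continuous(breaks = 0:3, labels = cond_labels) +  # numbers become names
  coord_flip() +
  ylab("Percent Claimed")
```

Because only the drawing position of the column changes, the claimpercent and claimpercent2 values themselves stay untouched.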

Also, in each of my R Markdown learning logs, R cannot find the function “ggplot” even though I have loaded the libraries, which stops me from producing the graphs through R Markdown itself.
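
One possible cause, assuming the library() calls were run in the console rather than inside the .Rmd file: knitting with the Knit button in RStudio runs the document in a fresh R session, so only packages loaded in a chunk of the document itself are available. A minimal sketch of what a chunk near the top of each learning log could contain:

```r
# Body of a chunk placed near the top of the .Rmd file, before any plotting chunk
library(ggplot2)  # provides ggplot()
library(dplyr)    # provides mutate(), filter() and the pipe
library(ggdist)   # provides stat_halfeye()

# Quick check that ggplot can now be found from inside the document
ggplot_found <- exists("ggplot")
print(ggplot_found)
```

If the check prints TRUE when the document is knitted, the plotting chunks further down should be able to find ggplot as well.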

Next Steps

  • Adding final touches to the first graph of Fig 1 (e.g., confidence interval, labels, colours)
  • Creating the second graph of Fig 2