knitr::opts_chunk$set(message = FALSE, warning = FALSE)

What were the goals for this week?

This week's goals were to keep learning GitHub, in particular using pull requests to work effectively as a team. My personal goal was to finish Figure 2, which consists of 3 plots.

How did I achieve these goals?

The first thing I did was have a look at Fig 2.

We see that it consists of 3 plots, with the same y variable in all 3. My first thought was to use ggplot2's faceting functions to build all 3 plots at once. However, after inspecting the code, I realized this would probably be too difficult with my current knowledge of R, because the plots are not split by the values of a single variable. Instead, each plot has a different variable on the x-axis, which I did not know how to handle with facets, so I focused on building each plot individually.
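For reference, one way faceting could be made to work here is to reshape the data first, so that the three x variables are stacked into a single column. The sketch below is only an illustration on made-up toy data (the `toy` tibble and its values are assumptions, not the study data); it uses `pivot_longer()` and `facet_wrap()` with free x scales.

```r
library(tidyverse)

# Toy data standing in for the cleaned study data (all values are made up)
set.seed(1)
toy <- tibble(
  cond         = factor(rep(c("Religious", "Control"), each = 20)),
  religiosity  = sample(0:4, 40, replace = TRUE),
  ritual       = sample(0:6, 40, replace = TRUE),
  affil        = sample(0:1, 40, replace = TRUE),
  claimpercent = runif(40, 0, 50)
)

# Stack the three x variables into one column, so there is a single
# variable ("predictor") to facet by
toy_long <- toy %>%
  pivot_longer(c(religiosity, ritual, affil),
               names_to = "predictor", values_to = "value")

facet_plot <- ggplot(toy_long, aes(value, claimpercent, colour = cond)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ predictor, scales = "free_x") # each panel keeps its own x range

facet_plot
```

Building the plots individually, as I did, gives more control over each panel's title and axis label, so this is an alternative rather than a replacement.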

With figure A, there were a few things I needed to do to the data before I started building the plot. However, I did not know this and began by simply taking the data and putting it into a plot.

library(tidyverse)
library(janitor)

data1 <- read.csv("Nichols_et_al_data.csv")


#Cleaning up the data for use

dataA <- data1 %>%                                                          
  filter(include == 0) %>% 
  rename(cond = con,
         claimpercent = claim,
         claimmoney = moneyclaim,
         CT_practice = completion.time..practice.included.,
         CT_payments = completion.time..payments.only.,
         religiosity = relig,
         religion = Religion) %>% 
  select(cond:ritual, -claimmoney, -sex, -age, -Religion.Text, -religion, -starts_with("CT")) %>% 
  as_tibble() 



# Creating plot A.


plotA <- ggplot(dataA, aes(religiosity, claimpercent, colour = cond)) +
  geom_smooth(method="lm")

plot(plotA)

As we can see, I was only getting one line instead of the 4 separate lines (one per condition) that I was expecting. After looking at the data, I realized that this was because condition was stored as a number (1, 2, 3 or 4), so ggplot2 treated colour as a continuous variable and did not split the data into groups. I needed to turn the condition column into a factor so that each value would be treated as a category rather than a number. Looking through the OSF repo, I found code that recoded the condition variable to work with the figure and also named each factor level for me.

dataA <- dataA %>% 
  mutate(religiosity = abs(religiosity - 5), # reverse coding
         ritual = abs(ritual - 7),
         claimpercent = claimpercent * 100) #turning this into a percentage value

# Re-order conditions to: religious, secular, noise, and control.
# The assignments run top to bottom, so their order matters: recoding in
# plain numeric order would change some values twice and corrupt the column.

dataA$cond[dataA$cond==4] <- 0 # Make religious prime the reference category
dataA$cond[dataA$cond==1] <- 4 # Park control at 4 (now free) so the next line cannot touch it
dataA$cond[dataA$cond==3] <- 1 # Secular becomes 1
dataA$cond[dataA$cond==4] <- 3 # Control moves from its temporary 4 to 3


# treatment variable
dataA$cond <- factor(dataA$cond,levels= c(0,1,2,3),
                labels = c("Religious", "Secular", "Noise","Control"))

plotA <- ggplot(dataA, aes(religiosity, claimpercent, colour = cond)) +
  geom_smooth(method="lm")

plot(plotA)
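The step-by-step recoding above only works because of the order of the lines. As a minimal sketch of an order-independent alternative, `factor()` can map the original codes straight to their labels in one call. The `raw_cond` vector is a made-up example using the same 1 to 4 coding, not the study data.

```r
# Hypothetical raw codes, using the same 1-4 coding as the original cond column
raw_cond <- c(4, 1, 3, 2, 1, 4)

# Map each original code directly to its label; no intermediate values needed.
# The first level (4, Religious) becomes the reference category.
cond <- factor(raw_cond,
               levels = c(4, 3, 2, 1),
               labels = c("Religious", "Secular", "Noise", "Control"))

levels(cond)
as.character(cond)
```

The level order here (4, 3, 2, 1) is read off from the recoding above: 4 ends up as Religious, 3 as Secular, 2 as Noise and 1 as Control.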

After looking at this plot, I realized a few things were off. The first thing I needed to do was change the theme of the graph.

plotA <- ggplot(dataA, aes(religiosity, claimpercent, color = cond)) +
  geom_smooth(method = "lm") + #method = "lm" creates a straight line of best fit
  theme_light() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())

plot(plotA)

The next thing was to change the y-axis limits to 0, 50 in order to match the plot in the paper. However, using ylim did not work so well…

plotA <- ggplot(dataA, aes(religiosity, claimpercent, color = cond)) +
  geom_smooth(method = "lm") + #method = "lm" creates a straight line of best fit
  theme_light() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  ylim(c(0, 50))

plot(plotA)

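After consulting with JennyS, it turned out that ylim() does not simply zoom in on the graph: it sets the limits of the y scale, so any observations outside 0 to 50 are removed before geom_smooth() fits its lines, which changes the lines themselves. To get around this, I used coord_cartesian() instead, which zooms the view without discarding any data.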

plotA <- ggplot(dataA, aes(religiosity, claimpercent, color = cond)) +
  geom_smooth(method = "lm") + #method = "lm" creates a straight line of best fit
  theme_light() + #Gives white background to plot
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) + #Removes gridlines from plot
  coord_cartesian(ylim = c(0, 50)) #Zooms the y-axis to 0-50 without dropping any data
  
plot(plotA)

Voila!
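To make the difference between ylim() and coord_cartesian() concrete, here is a small sketch on made-up data (the `toy` data frame is an assumption for illustration only). The two plots show the two behaviours, and the plain lm() calls reproduce what each one effectively fits.

```r
library(ggplot2)

# Toy data (made up): y = x^2, so the points with x > 7 lie above y = 50
toy <- data.frame(x = 1:10, y = (1:10)^2)

# ylim() sets scale limits: out-of-range rows are dropped before the fit
p_ylim <- ggplot(toy, aes(x, y)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  ylim(0, 50)

# coord_cartesian() only zooms the view: every row still feeds the fit
p_zoom <- ggplot(toy, aes(x, y)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  coord_cartesian(ylim = c(0, 50))

# The same two situations reproduced with plain lm(), to compare slopes
slope_all  <- unname(coef(lm(y ~ x, data = toy))["x"])
slope_kept <- unname(coef(lm(y ~ x, data = subset(toy, y <= 50)))["x"])
c(all_rows = slope_all, rows_within_limits = slope_kept)
```

On this toy data the slope falls from 11 (all rows) to 8 (only rows within the limits), which is the kind of silent change ylim() can introduce.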

The remaining issue with this plot was that some of the lines were a bit off compared to the paper. JennyS had the same issue, and we were both unsure where the discrepancy was coming from. There was a bit more to do with the graph, but I decided to get started on the second plot for now. This one was extremely easy, as it was essentially the same as plotA with ritual on the x-axis.

plotB <- ggplot(dataA, aes(ritual, claimpercent, color = cond)) +
  geom_smooth(method = "lm") +
  theme_light() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  coord_cartesian(ylim = c(0, 50))

plot(plotB)

Again, some of the lines in this plot did not look quite right compared to the plot in the paper, but I moved on to the 3rd plot. This plot was a bit more difficult, as the x-axis actually consists of factor levels. When I used as.factor() on affil, my plot failed to draw any lines at all, possibly because ggplot2 then grouped the data by both cond and affil, leaving each group with only one x value. So I settled for keeping affil as a numeric variable and setting the x scale to a discrete scale with values 0 and 1. I can change the tick marks later so they look like they do in the paper.

plotC <- ggplot(dataA, aes(affil, claimpercent, color = cond)) +
  stat_smooth(method = "lm") +
  theme_light() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  coord_cartesian(ylim = c(0, 50), xlim = c(0, 1)) +
  scale_x_discrete(limits = c(0, 1))

plot(plotC)
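Two other ways of handling the 0/1 x-axis are sketched below on made-up data (the `toy` data frame is an assumption, as is the idea that affil is coded 0/1). The first keeps affil numeric and only sets the tick positions with scale_x_continuous(), which avoids passing numeric limits to a discrete scale (recent ggplot2 versions may warn about that). The second converts affil to a factor and adds group = cond, on the assumption that the missing lines were caused by the default grouping.

```r
library(ggplot2)

# Toy data (made up); affil is assumed to be coded 0/1 as in the text
set.seed(2)
toy <- data.frame(
  cond         = factor(rep(c("Religious", "Control"), each = 20)),
  affil        = rep(c(0, 1), times = 20),
  claimpercent = runif(40, 0, 50)
)

# Option 1: keep affil numeric and only change where the ticks go
p_numeric <- ggplot(toy, aes(affil, claimpercent, colour = cond)) +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  scale_x_continuous(breaks = c(0, 1)) +
  coord_cartesian(ylim = c(0, 50))

# Option 2: treat affil as a factor, but group by cond only, so each
# group spans both affil levels and lm has two x positions to fit through
p_factor <- ggplot(toy, aes(factor(affil), claimpercent,
                            colour = cond, group = cond)) +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  coord_cartesian(ylim = c(0, 50))
```

I have not checked either option against the real data, so they are starting points to try rather than confirmed fixes.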

Again, the lines did not exactly match the ones in the paper. That’s an issue for next week however :). For now, I worked on prettying up my graphs to roughly match the paper by changing the axis labels, adding titles and hiding the standard error bands (se = FALSE), and then placed them side by side using the gridExtra package.

library(gridExtra)

plotA <- ggplot(dataA, aes(religiosity, claimpercent, color = cond)) +
  geom_smooth(method = "lm", se = FALSE) + #method = "lm" creates a straight line of best fit
  theme_light() + #Gives white background to plot
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) + #Removes gridlines from plot
  coord_cartesian(ylim = c(0, 50)) + #Zooms the y-axis to 0-50 without dropping any data
  labs(x = "Religiosity", y = "Percentage claimed", title = "Condition*Religiosity") + #axis labels and title
  theme(plot.title = element_text(hjust = 0.5), legend.position = "none")  #centres title text and removes legend
  


plotB <- ggplot(dataA, aes(ritual, claimpercent, color = cond)) +
  geom_smooth(method = "lm", se = FALSE) +
  theme_light() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  coord_cartesian(ylim = c(0, 50)) +
  labs(x = "Ritual frequency", y = "Percentage claimed", title = "Condition*Ritual frequency") +
  theme(plot.title = element_text(hjust = 0.5), legend.position = "none")

plotC <- ggplot(dataA, aes(affil, claimpercent, color = cond)) +
  stat_smooth(method = "lm", se = FALSE) +
  theme_light() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  coord_cartesian(ylim = c(0, 50), xlim = c(0, 1)) +
  scale_x_discrete(limits = c(0, 1)) +
  labs(x = "Religious affiliation", y = "Percentage claimed", title = "Condition*Religious affiliation") +
  theme(plot.title = element_text(hjust = 0.5)) 

grid.arrange(plotA, plotB, plotC, ncol = 3)
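The three plot definitions above repeat almost the same code. One possible way to cut the repetition is a small helper function, sketched below. `make_panel()` and its arguments are hypothetical names introduced for illustration, and the `toy` data frame is made up; the column names `claimpercent` and `cond` are assumed to match the cleaned data.

```r
library(ggplot2)

# Hypothetical helper: builds one panel of the figure for a given x variable
make_panel <- function(data, x_var, x_label, show_legend = FALSE) {
  ggplot(data, aes(.data[[x_var]], claimpercent, colour = cond)) +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
    theme_light() +
    theme(panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          plot.title = element_text(hjust = 0.5),
          legend.position = if (show_legend) "right" else "none") +
    coord_cartesian(ylim = c(0, 50)) +
    labs(x = x_label, y = "Percentage claimed",
         title = paste0("Condition*", x_label))
}

# Toy data (made up) to show the helper in use
set.seed(3)
toy <- data.frame(
  cond         = factor(rep(c("Religious", "Control"), each = 20)),
  religiosity  = sample(0:4, 40, replace = TRUE),
  ritual       = sample(0:6, 40, replace = TRUE),
  claimpercent = runif(40, 0, 50)
)

panelA <- make_panel(toy, "religiosity", "Religiosity")
panelB <- make_panel(toy, "ritual", "Ritual frequency", show_legend = TRUE)
```

The resulting panels can be passed to grid.arrange() in the same way as plotA, plotB and plotC, and a change to the theme then only needs to be made in one place.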

What are the goals for next week?

Next week I will work on the formatting, as there are still a few kinks to iron out, but I am fairly happy with where it’s at already. The biggest issue is figuring out why my lines differ from those in the paper. I also need to check in on my group mates to make sure they are making good progress. GitHub makes this easy, and they have been making consistent progress so far.