Goals

  1. Figure out what kind of plot Fig 1 is.
  2. Start the code for Fig 1.

Figuring out Fig 1

As I had no idea what Fig 1 was, I turned to the R Markdown file provided on the OSF. Unfortunately, I was unable to knit it to work out which code pertained to Fig 1. When I brought this up with my team, we found that the code in the R Markdown did not refer to Fig 1 at all.

For the sake of not wasting my efforts, I will include what I got up to when trying to figure out what Fig 1 was, but just know that the following code will not be used for the rest of my project.

Skip to the next goal, "Starting code for Fig 1", if you want to see what the relevant code will be.

loading libraries

Loading libraries that Nichols et al. used so I can identify the functions they used.

I could not install “glmmADMB” for this version of R, and lsmeans is being phased out, so I have to use emmeans instead.

library(tidyverse)
library(dplyr)
library(yarrr)
library(psych)
library(emmeans)
library(janitor)

reading the csv

nichols <- read_csv("data/nichols_et_al_data.csv")

test running the r script from r markdown

# create one data frame per site (as done in the R Markdown)
USA <- nichols[nichols$site=="USA",]
CZ <- nichols[nichols$site=="CZ",]
JP <- nichols[nichols$site=="JP",]
sites <- c("USA", "CZ", "JP")
# histogram of percentage of money claimed dishonestly
par(mfrow=c(1,1))
hist(nichols$claim, breaks = seq(0,100,5), freq = F,xlab = 'Claim', ylab = 'Probability',
     main = 'Histogram with Normal PDF',col="grey78", border="deepskyblue1", lty=1,
     ylim=c(0,0.05),xlim=c(0,100))

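The histogram is not complete: I am still figuring out the functions and need to work out what the “sit” tibble is.

I think the par(mfrow = ...) call (line 42 of the script) sets up the plotting rows, and the first line of the hist() call (line 43) creates the histogram and labels the x and y axes. The next line (line 44) sets the title and colours, and the last (line 45) specifies the x and y limits.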
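To check that reading, here is a minimal sketch of the same calls on simulated values rather than the real data. The vector `fake_claim` is made up for illustration, and the comments reflect my understanding of base R rather than anything stated by Nichols et al.

```r
# Simulated stand-in for nichols$claim (percentages between 0 and 100).
set.seed(1)
fake_claim <- runif(200, min = 0, max = 100)

# par(mfrow = c(rows, cols)) sets the grid of panels on the plotting device.
# c(1, 1) means one plot per page; it does not create rows of data.
old_par <- par(mfrow = c(1, 1))

# breaks = seq(0, 100, 5) gives 20 bins, each 5 wide.
# freq = FALSE puts density on the y axis, so the bar areas sum to 1.
h <- hist(fake_claim, breaks = seq(0, 100, 5), freq = FALSE,
          xlab = "Claim", ylab = "Probability",
          main = "Histogram of simulated claims",
          col = "grey78", border = "deepskyblue1", lty = 1,
          ylim = c(0, 0.05), xlim = c(0, 100))

par(old_par)  # restore the previous layout

length(h$counts)     # number of bins
sum(h$density * 5)   # total area under the bars
```

hist() returns the bin counts and densities as well as drawing the plot, which makes it easy to inspect what each argument changed.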

# histograms of claims per site
par(mfrow=c(1,3))
for (i in 1:3){ 
  sit = eval(parse(text = sites[i])) 
  hist(sit$claim, breaks = seq(0,100,5), freq = F, 
       main = parse(text = sites[i]), col="grey78", border="deepskyblue1",
       lty=1, ylim=c(0,0.1))} 

Not sure what “i” in the for loop (line 56 of the script) is doing. According to the help pages, parse() returns an unevaluated expression, which eval() then evaluates.

Also an uncompleted plot
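A small sketch of what the loop seems to be doing, using a toy data frame with invented values. The split() version at the end is my own alternative, not something from the Nichols et al. script.

```r
# Toy stand-in for the real data (values are made up).
toy <- data.frame(
  site  = c("USA", "USA", "CZ", "CZ", "JP", "JP"),
  claim = c(10, 30, 20, 40, 50, 70)
)
USA <- toy[toy$site == "USA", ]
sites <- c("USA", "CZ", "JP")

# i is the loop counter: in for (i in 1:3) it takes the values 1, 2, 3 in turn,
# so sites[i] is "USA", then "CZ", then "JP".
# parse(text = "USA") turns the string into an unevaluated expression;
# eval() then evaluates it, which looks up the object named USA.
sit <- eval(parse(text = sites[1]))
identical(sit, USA)

# Alternative without eval(parse()): split the data by site once,
# then pick each piece by name inside the loop.
by_site <- split(toy, toy$site)
old_par <- par(mfrow = c(1, 3))
for (s in sites) {
  hist(by_site[[s]]$claim, breaks = seq(0, 100, 5), freq = FALSE,
       main = s, xlab = "Claim", col = "grey78", border = "deepskyblue1")
}
par(old_par)
```

So “sit” is just the data frame for whichever site the loop is currently on.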

pirate plot (not sure if this is part of the same plot?)
show between-site differences in the percentage of money claimed, age, religiousness, and the frequency of ritual behavior

pirateplot is from the yarrr library.

I am not sure what this plot is referring to. I am also still figuring out the specific functions, but Nichols et al. have labelled each line of code.

formula = claim ~ site shows the distribution of claims based on site.

{par(mfrow=c(2,2))
  {pirateplot(formula = claim ~ site,
              data = nichols,
              theme = 0,
              main = "",
              pal = "basel", # basel color palette
              bean.b.o = 1, # Bean border
              bean.f.o = .005, # Bean fill
              point.o = .7, # Points
              inf.f.o = .7, # Inference fill
              inf.b.o = .96, # Inference border
              avg.line.o = 1, # Aver line
              bar.f.o = .2, # Bar
              inf.f.col = "white", # Inf fill col
              inf.b.col = "black", # Inf border col
              avg.line.col = "black", # avg line col
              point.pch = 21,
              point.bg = "white",
              point.cex = 1,
              point.lwd = 2,
              bean.lwd = 2)}
  
  
  {pirateplot(formula = age ~ site,
              data = nichols,
              theme = 0,
              main = "",
              pal = "basel", # basel color palette
              bean.b.o = 1, # Bean border
              bean.f.o = .005, # Bean fill
              point.o = .7, # Points
              inf.f.o = .7, # Inference fill
              inf.b.o = .96, # Inference border
              avg.line.o = 1, # Aver line
              bar.f.o = .2, # Bar
              inf.f.col = "white", # Inf fill col
              inf.b.col = "black", # Inf border col
              avg.line.col = "black", # avg line col
              point.pch = 21,
              point.bg = "white",
              point.cex = 1,
              point.lwd = 2,
              bean.lwd = 2)}
  
  {pirateplot(formula = relig ~ site,
              data = nichols,
              theme = 0,
              main = "",
              pal = "basel", # basel color palette
              bean.b.o = 1, # Bean border
              bean.f.o = .005, # Bean fill
              point.o = .7, # Points
              inf.f.o = .7, # Inference fill
              inf.b.o = .96, # Inference border
              avg.line.o = 1, # Aver line
              bar.f.o = .2, # Bar
              inf.f.col = "white", # Inf fill col
              inf.b.col = "black", # Inf border col
              avg.line.col = "black", # avg line col
              point.pch = 21,
              point.bg = "white",
              point.cex = 1,
              point.lwd = 2,
              bean.lwd = 2)}
  
  {pirateplot(formula = ritual ~ site,
              data = nichols,
              theme = 0,
              main = "",
              pal = "basel", # basel color palette
              bean.b.o = 1, # Bean border
              bean.f.o = .005, # Bean fill
              point.o = .7, # Points
              inf.f.o = .7, # Inference fill
              inf.b.o = .96, # Inference border
              avg.line.o = 1, # Aver line
              bar.f.o = .2, # Bar
              inf.f.col = "white", # Inf fill col
              inf.b.col = "black", # Inf border col
              avg.line.col = "black", # avg line col
              point.pch = 21,
              point.bg = "white",
              point.cex = 1,
              point.lwd = 2,
              bean.lwd = 2)}}
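
The four calls above differ only in the variable on the left of `~ site`. As a rough sketch of how that repetition could be looped, the code below builds each formula with reformulate(). It uses base R boxplot() as a stand-in for pirateplot() so that it runs without yarrr, and the `toy` data are invented.

```r
# Invented data with the same column names as the plots above.
set.seed(2)
toy <- data.frame(
  site   = rep(c("USA", "CZ", "JP"), each = 20),
  claim  = runif(60, 0, 100),
  age    = round(runif(60, 18, 60)),
  relig  = sample(1:5, 60, replace = TRUE),
  ritual = sample(1:5, 60, replace = TRUE)
)

outcomes <- c("claim", "age", "relig", "ritual")

old_par <- par(mfrow = c(2, 2))  # 2 x 2 grid, one panel per outcome
for (v in outcomes) {
  # reformulate() builds the formula from strings, e.g. claim ~ site
  f <- reformulate("site", response = v)
  # boxplot() is only a stand-in here; a pirateplot(formula = f, ...) call
  # with the shared styling arguments would take its place
  boxplot(f, data = toy, main = v)
}
par(old_par)
```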

Starting code for Fig 1

After consulting with my team (yay caterpillar coding!), we wondered if Fig 1 was a combination of an area plot and a bar plot. I decided to see if I was able to recreate such a plot with ggplot.

loading libraries

library(tidyverse)
library(janitor)
library(dplyr)
library(ggplot2)

I renamed the variables so they would be easier to type in the R script. The code initially did not work because I had not loaded dplyr, so I could not use the mutate function. That emphasized the importance of loading the correct libraries for me.

I also only selected the variables that mattered for Fig 1, namely the site, the condition and the percentage of dishonestly claimed earnings.

data1 <- nichols %>%  # start from the data read in earlier as nichols
  filter(include == 0) %>% 
  rename(cond = con,
         claimpercent = claim,
         claimmoney = moneyclaim,
         CT_practice = `completion time (practice included)`,
         CT_payments = `completion time (payments only)`,
         religiosity = relig,
         religion = Religion) %>% 
  select(site, claimpercent, cond)
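
To see the filter, rename and select steps in isolation, here is a minimal sketch on a toy data frame. The values are made up, and only a few of the original column names are included. As far as I know, library(tidyverse) attaches dplyr as well, so either library call should be enough.

```r
library(dplyr)

# Toy data with some of the original column names (values are made up).
toy <- data.frame(
  site    = c("USA", "CZ", "JP", "USA"),
  con     = c(1, 2, 1, 2),
  claim   = c(10, 25, 40, 5),
  relig   = c(3, 1, 4, 2),
  include = c(0, 0, 1, 0)
)

toy_small <- toy %>%
  filter(include == 0) %>%            # keep only rows where include is 0
  rename(cond = con,                  # rename() takes new_name = old_name
         claimpercent = claim) %>%
  select(site, claimpercent, cond)    # keep only the three columns needed

names(toy_small)
nrow(toy_small)
```

Because select() drops every other column, only the variables that are kept actually need renaming.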

graphing Fig 1 - data by condition

fig1_condition <- ggplot(data = data1) +
  geom_col(aes(x = claimpercent, y = cond)) +
  geom_area(aes(x = claimpercent, y = cond)) +
  scale_x_continuous(
    name = "Percent Claimed",
    labels = c(0, 20, 40, 60, 80, 100),
    breaks = c(0, 20, 40, 60, 80, 100)) +
  scale_y_discrete(
    name = NULL) +
  ggtitle(label = "Data By Condition") 

I tried creating the first Fig 1 plot by combining geom_area and geom_col, but that did not turn out right. However, I might have entered the code incorrectly.

The output for this chunk will not display because R reported that it could not find the function “ggplot” for some reason, and I have not been able to insert an image of the graph, so I apologise that I cannot show it here.

I also need to name the conditions, and change claim percent so the numbers and labels line up. But, the bars seem right so far (except they are without the error bars).
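
One thing worth checking is what the bars are meant to show. geom_col() draws one segment per row and stacks them, so on raw data the bar length is a total rather than a mean. Below is a sketch that assumes the bars in Fig 1 are condition means with standard-error bars. That is my assumption, the data are simulated, and the condition labels are hypothetical placeholders.

```r
library(ggplot2)

# Simulated data; the real condition names are not known here.
set.seed(3)
toy <- data.frame(
  cond = rep(c(1, 2), each = 30),
  claimpercent = c(runif(30, 0, 60), runif(30, 20, 100))
)
# Hypothetical labels for the two conditions.
toy$cond <- factor(toy$cond, levels = c(1, 2),
                   labels = c("Condition A", "Condition B"))

# One row per condition: mean and standard error of the mean.
means <- aggregate(claimpercent ~ cond, data = toy, FUN = mean)
ses <- aggregate(claimpercent ~ cond, data = toy,
                 FUN = function(x) sd(x) / sqrt(length(x)))
summary_df <- data.frame(cond = means$cond,
                         mean = means$claimpercent,
                         se   = ses$claimpercent)

# Bars for the means, error bars for mean +/- 1 SE, flipped to horizontal.
p <- ggplot(summary_df, aes(x = cond, y = mean)) +
  geom_col(fill = "grey60") +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = 0.2) +
  scale_y_continuous(name = "Percent Claimed",
                     breaks = seq(0, 100, 20), limits = c(0, 100)) +
  scale_x_discrete(name = NULL) +
  coord_flip() +
  ggtitle("Data By Condition")

p
```

Summarising first and plotting second keeps the plotting code simple, and printing summary_df makes it easy to compare the numbers with the published figure.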

I had to include breaks because when I ran my code, R told me that my labels and breaks had to be the same length. To be honest, I am not sure what the difference is.
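
As far as I understand it, breaks are the positions on the axis where tick marks are drawn, and labels are the text printed at those positions, which is why there must be one label per break. A minimal sketch with invented data (the names `tick_positions` and `tick_text` are mine):

```r
library(ggplot2)

# Invented data just to have something to draw.
toy <- data.frame(claimpercent = c(5, 20, 35, 60, 80, 95),
                  n = c(3, 5, 2, 4, 1, 6))

tick_positions <- c(0, 20, 40, 60, 80, 100)   # breaks: where the ticks go
tick_text <- paste0(tick_positions, "%")      # labels: what is printed there

p <- ggplot(toy, aes(x = claimpercent, y = n)) +
  geom_point() +
  scale_x_continuous(name = "Percent Claimed",
                     breaks = tick_positions,
                     labels = tick_text,
                     limits = c(0, 100))

p
```

If the labels are just the break values themselves, the labels argument can be left out and ggplot will print the break values.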

Obviously, this graph has a long way to go! I need to brush up on data visualization because I am not even entirely sure I plotted this correctly.

Challenges

As mentioned above, I had issues trying to figure out what kind of plot Fig 1 was. Using the R Markdown provided was a complete bust, unfortunately.

I still need to figure out which ggplot functions I can use to create the Fig 1 plot correctly, or if I can even use ggplot! I also need to brush up on ggplot since I am not sure if I am using it correctly.

Next steps forward

  1. Revising ggplot so I can properly plot Fig 1.
  2. Figuring out which ggplot functions I can use for Fig 1, or whether I can use ggplot at all.