Verification report

Part 1

Summary

Our group was group 4, and our paper was Replicating and extending the effects of auditory religious cues on dishonest behavior by Nichols and colleagues (2020). The aim of the paper was to replicate an experiment from a previous study on religious priming by Lang and colleagues (2016). The paper targeted the lack of replications of religious priming studies. It also aimed to improve on Lang and colleagues' experimental design, both by improving the generalizability of the results through more diverse samples from different geographical areas, and by investigating the effect of religious socialization, identity, and ritual participation on the efficacy of religious primes. The problem this paper aimed to address was the criticism of religious priming studies that had previously shown statistically significant results providing evidence for a priming effect. One of these criticisms is the lack of replication studies, with one paper arguing that meta-analyses are no replacement for replications. Another is that such studies are usually performed on WEIRD samples (Western, Educated, Industrialized, Rich, and Democratic). This study aimed to address these issues by including not only participants from the USA, but also participants from Japan and the Czech Republic.

The researchers used a computer-based task named the Dot Task to measure the effect of religious primes, in this case an auditory cue. The task consists of 200 trials, each briefly displaying an image with dots on a vertically bisected screen. To proceed to the next trial, participants had to select which side of the monitor they believed contained more dots. The task was skewed so that the left side of the monitor contained more dots 60% of the time. Selecting the left side of the monitor awarded the participant $0.005 USD, whereas selecting the right side awarded $0.05 USD. The researchers justified this discrepancy by informing participants that answering accurately for the right side was more difficult than for the left. Participants were instructed to be as accurate as possible; however, incorrect answers still awarded participants the allocated amount for the side they chose. Dishonesty was measured by counting the number of times a participant chose the right side when the left side was the correct answer. Participants were randomly allocated to one of four conditions, which determined the type of audio played during the task: control (no music), white noise, secular music, or religious music. The choice of music for the secular and religious conditions varied between locations to account for the predominant religion in the area. After the task was completed, participants were given a survey asking them to rate their self-perceived religiosity, their ritual attendance frequency, and their religious organization affiliation, and to indicate which religious tradition they associated with. Participants in the music conditions also rated how secular/religious and how profane/sacred the cue was, as well as rating it on a number of aspects such as fast, slow, boring, exciting, happy, and sad.

The researchers found that participants from the USA were the most dishonest (mean of 40.56% dishonest answers, SD = 34.28%), with Japanese participants the second most dishonest (mean of 29.88%, SD = 28.43%) and Czech participants the most honest (mean of 11.69%, SD = 13.72%). The researchers also found no significant differences in cheating between the four conditions, replicating findings from the original paper. The researchers then used moderation models based on the self-ratings of religiosity, ritual frequency, and religious affiliation. There was no significant interaction between self-reported religiosity scores and the conditions; however, the researchers found that a one-point increase on the 7-point self-reported ritual frequency measure (0 - Never, 6 - More than once a week) was associated with a decrease of roughly 4 percent in the dishonesty measure in the religious condition. This effect was not found in any of the other conditions. Lastly, the researchers found a strong association between reporting an affiliation with the religious tradition of the stimuli used at the site and lower dishonesty measures. In the religious condition, affiliation predicted on average a 26% decrease in dishonest claims (95% CI = [-40.12, -12.06]). This effect was much smaller in the three other conditions. This demonstrates that affiliation with the tradition of the religious cue predicts less dishonesty in the Dot Task; however, affiliation has no effect when the religious cue is not present.

This study is important for several reasons. Firstly, it demonstrates the importance of replication, not only for religious priming studies but for scientific research in general. This paper was unable to replicate one of the effects from Lang and colleagues' (2016) paper, namely an interaction between the religious condition and self-rated religiosity, but it was able to replicate the interactions of ritual frequency and religious affiliation with the religious condition. Replication studies immensely increase the credibility of scientific research, particularly during the replication crisis currently affecting science, and psychology in particular. This study also provides further evidence for the use of religious cues to decrease dishonest decision making. The results could pave the way for future studies that generalize these findings to broader samples and to experimental designs with greater external validity.

Reaction

I was surprised to see such a large difference in honesty measures across the different locations sampled. Initially, I pondered whether this could be due to religious differences between the locations. As mentioned in the study, one survey found that 78% of Czech participants did not believe it was necessary to believe in God to be moral, compared to 55% and 56% of Japanese and US respondents, respectively. I also wondered whether the design of the Dot Task made it harder for people of different cultures to understand how to cheat. However, further down the discussion section, it was revealed that due to the lab guidelines at the Czech location, participants there were also awarded course credit for their participation. This may have further incentivized Czech participants to answer honestly in order to receive the points necessary for course credit. I felt it was a little disingenuous to report this towards the middle of the discussion section, rather than in the procedure or participants section, as many readers may skip that portion of the paper.

The most interesting part of the paper to me was the fact that the religious condition actually had a significant effect on certain participants. Keep in mind that I come from a non-religious background; however, when I looked over the materials used in the experiment, I was unable to distinguish what made the cue in the religious condition any more religious than the music used in the secular condition. For instance, in the USA, the religious cue was Bach BWV 147, a church cantata. The secular cue was Bach BWV 140, which is also a church cantata. The cues were also matched in terms of tempo, mood, and other subjective measures that may have affected participants in unintended ways. Perhaps the piece selection would make much more sense to me if I came from a religious background, but in any case, this part of the paper confuses me. In a good way, I guess.

I would say that future work in this area needs to consider the diversity of religions in the population being sampled. This concern was raised particularly for the Japanese sample, where many religions are practiced within the population; however, I felt the authors could have done more to specifically address this point. In future studies, exclusion criteria could be used to limit participants to those who are either Christian or non-religious. These criteria could also be adapted to other religions, such as Buddhism. It would be interesting to see whether changing the religion changes the effect size, considering that many religions have differing views on morality.

Part 2: Verification

The things we needed to reproduce were the participant demographics, Table 1, Figure 1, and Figure 2.

Demographics

For the demographics, we needed to calculate the mean and SD of participants' age at each site. We also needed to count how many participants of each sex there were. We first loaded in our packages and data.

library(tidyverse)
library(gt)

dat <- read.csv("Nichols_et_al_data.csv")

We then selected only the variables we needed.

dat <- dat %>% 
  select(sex, site, include, age)

We used count() to tally how many of each value of the sex variable there were, and placed the result in a dataset called datSex.

datSex <- dat %>% 
  count(sex)

After this, we created a dataset named datSite that was the same as the original dataset, except filtering only for participants who were not excluded. We had to do this after counting the sex of participants, because for some reason the authors had kept the excluded participants in their calculations of sex.

datSite <- dat %>% 
  filter(include == 0) 

Next, we grouped datSite by site, and calculated the mean and SD of age for each site. We used na.rm = TRUE so that only non-NA values were used in the calculations. Note that we did not actually drop the NA rows: some participants were missing age and sex data, so their rows did not affect the mean and SD, but they did count towards n.

datSite <- datSite %>% 
  group_by(site) %>% 
  summarise(mean = mean(age, na.rm = TRUE),
            sd = sd(age, na.rm = TRUE),
            n = n())

We then relabelled the values to make our tables readable.

datSex$sex[datSex$sex == 0] <- "Female"
datSex$sex[datSex$sex == 1] <- "Male"

datSite$site[datSite$site == 1] <- "USA"
datSite$site[datSite$site == 2] <- "Czech Republic"
datSite$site[datSite$site == 3] <- "Japan"

Lastly, we made a simple gt table to lay out our data. We used fmt_number to round our data to two decimal places.

gt1 <- gt(datSite) 
gt2 <- gt(datSex)


gt1 %>% fmt_number(
    columns = c("mean", "sd"),
    decimals = 2,
    use_seps = FALSE
  )
site            mean   sd    n
USA             25.54  9.84  123
Czech Republic  24.40  3.37  128
Japan           19.79  0.90  157
gt2
sex     n
Female  228
Male    227
NA      5

Table 1

The first thing to do was to load in our packages. We used the gt package for the bulk of our visualization for this table.

library(tidyverse)
library(dplyr)
library(gt)
data1 <- read.csv("Nichols_et_al_data.csv")

Below is the standard header of code we included in nearly every R script we used. It accounts for the paper's exclusion criteria with filter(), renames variables to make them easier to read, and selects only the variables we would use for our plots.

data1 <- data1 %>% 
  filter(include == 0) %>% 
  rename(cond = con,
         claimpercent = claim,
         claimmoney = moneyclaim,
         CT_practice = completion.time..practice.included.,
         CT_payments = completion.time..payments.only.,
         religiosity = relig,
         religion = Religion) %>% 
  select(-CT_practice, -CT_payments, -CT, -CT_cheat, -Religion.Text, -affil_cong)

Some variables needed to be created out of existing variables, as the authors combined several aspects of the music into single categories, for instance tempo and negativity/positivity. After reading the OSF repo, we found that these were simply averages of other variables and coded them according to the repo.

There was still some confusion, however. We're still not sure why tempo is coded by taking the absolute value of slow minus seven, given that slow is a 5-point scale from 1-5. This is unlikely to be resolved without contacting the authors themselves.

data1 <- data1 %>% 
  mutate(
    negativity = (distressing + irritating + boring + sad)/4,
    positivity = (interesting + pleasant + exciting + relaxing + happy)/5,
    impact = (deep + powerful)/2,
    tempo = (fast + abs(slow-7))/2
  ) 
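
To make the puzzle concrete, here is what abs(slow - 7) does to a 1-5 scale, compared with standard reverse coding (our illustration, not the authors' code):

abs(1:5 - 7) # 6 5 4 3 2 - lands one point above the 1-5 range
6 - (1:5)    # 5 4 3 2 1 - standard reverse coding of a 1-5 scale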

So, now that we had the new variables, we needed to find the descriptive statistics for them (i.e. the values in the table): means, standard deviations, confidence intervals and Cohen's d values.

We started by using the group_by() and summarise() functions, but quickly realised this was not going to work: while it allowed us to visualise the descriptives, it did not create new values that we could feed into a table.

data1 %>% 
  group_by(cond) %>%
  summarise(mean_claim = mean(claimpercent, na.rm = TRUE),
            mean_sac = mean(sacred, na.rm = TRUE),
            mean_neg = mean(negativity, na.rm = TRUE),
            mean_pos = mean(positivity, na.rm = TRUE),
            mean_imp = mean(impact, na.rm = TRUE),
            mean_temp = mean(tempo, na.rm = TRUE),
            SD_claim = sd(claimpercent, na.rm = TRUE),
            SD_sac = sd(sacred, na.rm = TRUE),
            SD_neg = sd(negativity, na.rm = TRUE),
            SD_pos = sd(positivity, na.rm = TRUE),
            SD_imp = sd(impact, na.rm = TRUE),
            SD_temp = sd(tempo, na.rm = TRUE)
  )
## # A tibble: 4 x 13
##    cond mean_claim mean_sac mean_neg mean_pos mean_imp mean_temp SD_claim SD_sac
##   <int>      <dbl>    <dbl>    <dbl>    <dbl>    <dbl>     <dbl>    <dbl>  <dbl>
## 1     1      0.224   NaN      NaN      NaN      NaN       NaN       0.239  NA   
## 2     2      0.294     3.26     2.58     1.38     1.76      3.99    0.311   1.19
## 3     3      0.303     4.32     1.64     2.72     2.73      2.78    0.310   1.35
## 4     4      0.273     5.17     1.82     2.54     2.91      2.60    0.299   1.39
## # ... with 4 more variables: SD_neg <dbl>, SD_pos <dbl>, SD_imp <dbl>,
## #   SD_temp <dbl>

So, we consulted the authors' code again, which did prove to be helpful despite being confusingly laid out and containing some outdated code. Nevertheless, we came up with the following long piece of code to obtain values for the descriptives (means, SDs and CIs) of each of the six variables (% claimed, sacredness, negativity, positivity, tempo, impact) in the table, for each of the four conditions. The code below is therefore partitioned into six parts, one per variable. For each variable, the code is further partitioned into the four conditions, though the control group only appears for the percent claimed variable, because the music ratings do not apply to it (you can't rate the tempo or impact of no sound). Then, within each condition there are six lines of code that compute six new values (mean, SD, number of responses, SE, and the upper and lower CI limits).

Before we dive into how each descriptive value was calculated, we first want to explain how each value was labelled. Each new value was named descriptive + condition number + variable, where the control group is (1), the white noise group is (2), the secular music group is (3), and the religious music group is (4). So, let's take 'sd3s' as an example. This is the standard deviation (sd) of the sacredness variable (s) in the secular music group (3). Or, for example, 'mean2i' is the mean of the impact variable (i) for the white noise group (2).

For means, we used the mean() function to take the mean of a specific variable. You can see we are taking the variable claimpercent from the data1 dataset - the $ operator allows us to specify which variable of a dataset we are passing to our function - and only taking the values for which the condition is equal to "1", in other words, the control condition. na.rm = TRUE removes any NA values, and then we multiply the result by 100 to get a percentage value.

The SD is calculated using the same code, replacing the mean function with sd.

The next descriptive in the table from the original paper is a confidence interval, which requires the standard error, which in turn requires us to calculate n (the number of responses in the control group for the percent claimed variable). To do this, the authors' code used the length() function, which gets the length of a vector (in this case the list of responses in the control group), though the n() function could also have been used. The !is.na() removed NA values from this calculation; it uses the ! operator, otherwise known as NOT, to include only values that were NOT NA.

Then, to calculate the standard error, which we need for the upper and lower CIs, we took the SD value that we just calculated above and divided it by the square root (sqrt) of n (also just calculated above). With the standard error in hand, we could plug it into the formula for our upper and lower 95% confidence interval limits. Thus, we obtained descriptives (mean, SD, CIs) for each variable across each group.

# Percent claimed variable
# Control Group (1)
mean1c <- mean(data1$claimpercent[data1$cond=="1"], na.rm = TRUE)*100 # mean
sd1c <- sd(data1$claimpercent[data1$cond=="1"], na.rm = TRUE)*100 # SD
n1c <- length(data1$cond[data1$cond=="1" & !is.na(data1$cond)]) # number of responses
se1c <- sd1c/sqrt(n1c) # standard error
lCI1c <- mean1c - (1.96*se1c) # lower 95% CI
uCI1c <- mean1c + (1.96*se1c) # upper 95% CI
# White Noise Group (2)
mean2c <- mean(data1$claimpercent[data1$cond=="2"], na.rm = TRUE)*100 # mean
sd2c <- sd(data1$claimpercent[data1$cond=="2"], na.rm = TRUE)*100 # SD
n2c <- length(data1$cond[data1$cond=="2" & !is.na(data1$cond)]) # number of responses
se2c <- sd2c/sqrt(n2c) # standard error
lCI2c <- mean2c - (1.96*se2c) # lower 95% CI
uCI2c <- mean2c + (1.96*se2c) #upper 95% CI
# Secular Group (3)
mean3c <- mean(data1$claimpercent[data1$cond=="3"], na.rm = TRUE)*100 # mean
sd3c <- sd(data1$claimpercent[data1$cond=="3"], na.rm = TRUE)*100 # SD
n3c <- length(data1$cond[data1$cond=="3" & !is.na(data1$cond)]) # number of responses
se3c <- sd3c/sqrt(n3c) # standard error
lCI3c <- mean3c - (1.96*se3c) # lower 95% CI
uCI3c <- mean3c + (1.96*se3c) #upper 95% CI
# Religious Group (4)
mean4c <- mean(data1$claimpercent[data1$cond=="4"], na.rm = TRUE)*100 # mean
sd4c <- sd(data1$claimpercent[data1$cond=="4"], na.rm = TRUE)*100 # SD
n4c <- length(data1$cond[data1$cond=="4" & !is.na(data1$cond)]) # number of responses
se4c <- sd4c/sqrt(n4c) # standard error
lCI4c <- mean4c - (1.96*se4c) # lower 95% CI
uCI4c <- mean4c + (1.96*se4c) #upper 95% CI
# Sacredness variable
# White Noise Group (2)
mean2s <- mean(data1$sacred[data1$cond=="2"], na.rm = TRUE) # mean
sd2s <- sd(data1$sacred[data1$cond=="2"], na.rm = TRUE) # SD
n2s <- length(data1$cond[data1$cond=="2" & !is.na(data1$cond)]) # number of responses
se2s <- sd2s/sqrt(n2s) # standard error
lCI2s <- mean2s - (1.96*se2s) # lower 95% CI
uCI2s <- mean2s + (1.96*se2s) #upper 95% CI
# Secular Group (3)
mean3s <- mean(data1$sacred[data1$cond=="3"], na.rm = TRUE) # mean
sd3s <- sd(data1$sacred[data1$cond=="3"], na.rm = TRUE) # SD
n3s <- length(data1$cond[data1$cond=="3" & !is.na(data1$cond)]) # number of responses
se3s <- sd3s/sqrt(n3s) # standard error
lCI3s <- mean3s - (1.96*se3s) # lower 95% CI
uCI3s <- mean3s + (1.96*se3s) #upper 95% CI
# Religious Group (4)
mean4s <- mean(data1$sacred[data1$cond=="4"], na.rm = TRUE) # mean
sd4s <- sd(data1$sacred[data1$cond=="4"], na.rm = TRUE) # SD
n4s <- length(data1$cond[data1$cond=="4" & !is.na(data1$cond)]) # number of responses
se4s <- sd4s/sqrt(n4s) # standard error
lCI4s <- mean4s - (1.96*se4s) # lower 95% CI
uCI4s <- mean4s + (1.96*se4s) #upper 95% CI
# Negativity variable
# White Noise Group (2)
mean2n <- mean(data1$negativity[data1$cond=="2"], na.rm = TRUE) # mean
sd2n <- sd(data1$negativity[data1$cond=="2"], na.rm = TRUE) # SD
n2n <- length(data1$cond[data1$cond=="2" & !is.na(data1$cond)]) # number of responses
se2n <- sd2n/sqrt(n2n) # standard error
lCI2n <- mean2n - (1.96*se2n) # lower 95% CI
uCI2n <- mean2n + (1.96*se2n) #upper 95% CI
# Secular Group (3)
mean3n <- mean(data1$negativity[data1$cond=="3"], na.rm = TRUE) # mean
sd3n <- sd(data1$negativity[data1$cond=="3"], na.rm = TRUE) # SD
n3n <- length(data1$cond[data1$cond=="3" & !is.na(data1$cond)]) # number of responses
se3n <- sd3n/sqrt(n3n) # standard error
lCI3n <- mean3n - (1.96*se3n) # lower 95% CI
uCI3n <- mean3n + (1.96*se3n) #upper 95% CI
# Religious Group (4)
mean4n <- mean(data1$negativity[data1$cond=="4"], na.rm = TRUE) # mean
sd4n <- sd(data1$negativity[data1$cond=="4"], na.rm = TRUE) # SD
n4n <- length(data1$cond[data1$cond=="4" & !is.na(data1$cond)]) # number of responses
se4n <- sd4n/sqrt(n4n) # standard error
lCI4n <- mean4n - (1.96*se4n) # lower 95% CI
uCI4n <- mean4n + (1.96*se4n) #upper 95% CI
# Positivity variable
# White Noise Group (2)
mean2p <- mean(data1$positivity[data1$cond=="2"], na.rm = TRUE) # mean
sd2p <- sd(data1$positivity[data1$cond=="2"], na.rm = TRUE) # SD
n2p <- length(data1$cond[data1$cond=="2" & !is.na(data1$cond)]) # number of responses
se2p <- sd2p/sqrt(n2p) # standard error
lCI2p <- mean2p - (1.96*se2p) # lower 95% CI
uCI2p <- mean2p + (1.96*se2p) #upper 95% CI
# Secular Group (3)
mean3p <- mean(data1$positivity[data1$cond=="3"], na.rm = TRUE) # mean
sd3p <- sd(data1$positivity[data1$cond=="3"], na.rm = TRUE) # SD
n3p <- length(data1$cond[data1$cond=="3" & !is.na(data1$cond)]) # number of responses
se3p <- sd3p/sqrt(n3p) # standard error
lCI3p <- mean3p - (1.96*se3p) # lower 95% CI
uCI3p <- mean3p + (1.96*se3p) #upper 95% CI
# Religious Group (4)
mean4p <- mean(data1$positivity[data1$cond=="4"], na.rm = TRUE) # mean
sd4p <- sd(data1$positivity[data1$cond=="4"], na.rm = TRUE) # SD
n4p <- length(data1$cond[data1$cond=="4" & !is.na(data1$cond)]) # number of responses
se4p <- sd4p/sqrt(n4p) # standard error
lCI4p <- mean4p - (1.96*se4p) # lower 95% CI
uCI4p <- mean4p + (1.96*se4p) #upper 95% CI
# Tempo variable
# White Noise Group (2)
mean2t <- mean(data1$tempo[data1$cond=="2"], na.rm = TRUE) # mean
sd2t <- sd(data1$tempo[data1$cond=="2"], na.rm = TRUE) # SD
n2t <- length(data1$cond[data1$cond=="2" & !is.na(data1$cond)]) # number of responses
se2t <- sd2t/sqrt(n2t) # standard error
lCI2t <- mean2t - (1.96*se2t) # lower 95% CI
uCI2t <- mean2t + (1.96*se2t) #upper 95% CI
# Secular Group (3)
mean3t <- mean(data1$tempo[data1$cond=="3"], na.rm = TRUE) # mean
sd3t <- sd(data1$tempo[data1$cond=="3"], na.rm = TRUE) # SD
n3t <- length(data1$cond[data1$cond=="3" & !is.na(data1$cond)]) # number of responses
se3t <- sd3t/sqrt(n3t) # standard error
lCI3t <- mean3t - (1.96*se3t) # lower 95% CI
uCI3t <- mean3t + (1.96*se3t) #upper 95% CI
# Religious Group (4)
mean4t <- mean(data1$tempo[data1$cond=="4"], na.rm = TRUE) # mean
sd4t <- sd(data1$tempo[data1$cond=="4"], na.rm = TRUE) # SD
n4t <- length(data1$cond[data1$cond=="4" & !is.na(data1$cond)]) # number of responses
se4t <- sd4t/sqrt(n4t) # standard error
lCI4t <- mean4t - (1.96*se4t) # lower 95% CI
uCI4t <- mean4t + (1.96*se4t) #upper 95% CI
# Impact variable
# White Noise Group (2)
mean2i <- mean(data1$impact[data1$cond=="2"], na.rm = TRUE) # mean
sd2i <- sd(data1$impact[data1$cond=="2"], na.rm = TRUE) # SD
n2i <- length(data1$cond[data1$cond=="2" & !is.na(data1$cond)]) # number of responses
se2i <- sd2i/sqrt(n2i) # standard error
lCI2i <- mean2i - (1.96*se2i) # lower 95% CI
uCI2i <- mean2i + (1.96*se2i) #upper 95% CI
# Secular Group (3)
mean3i <- mean(data1$impact[data1$cond=="3"], na.rm = TRUE) # mean
sd3i <- sd(data1$impact[data1$cond=="3"], na.rm = TRUE) # SD
n3i <- length(data1$cond[data1$cond=="3" & !is.na(data1$cond)]) # number of responses
se3i <- sd3i/sqrt(n3i) # standard error
lCI3i <- mean3i - (1.96*se3i) # lower 95% CI
uCI3i <- mean3i + (1.96*se3i) #upper 95% CI
# Religious Group (4)
mean4i <- mean(data1$impact[data1$cond=="4"], na.rm = TRUE) # mean
sd4i <- sd(data1$impact[data1$cond=="4"], na.rm = TRUE) # SD
n4i <- length(data1$cond[data1$cond=="4" & !is.na(data1$cond)]) # number of responses
se4i <- sd4i/sqrt(n4i) # standard error
lCI4i <- mean4i - (1.96*se4i) # lower 95% CI
uCI4i <- mean4i + (1.96*se4i) #upper 95% CI
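
For reference, the same descriptives can be computed much more compactly with pivot_longer() and group_by(); the following is a sketch under our own naming, not the authors' code. Note that n here counts non-missing responses per variable, which can differ slightly from counting every row in the condition as above, and claimpercent is still a proportion at this point (multiply by 100 for the table).

desc <- data1 %>%
  pivot_longer(c(claimpercent, sacred, negativity, positivity, tempo, impact),
               names_to = "variable", values_to = "value") %>%
  group_by(cond, variable) %>%
  summarise(mean = mean(value, na.rm = TRUE),
            sd = sd(value, na.rm = TRUE),
            n = sum(!is.na(value)), # non-missing responses per variable
            .groups = "drop") %>%
  mutate(se = sd / sqrt(n),
         lCI = mean - 1.96 * se, # lower 95% CI
         uCI = mean + 1.96 * se) # upper 95% CI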

The last descriptive measurement in Table 1 was the Cohen's d column, measuring the effect size for each of the six variables, comparing each condition to the religious condition. We calculated it with the formula below, which uses a simple pooled standard deviation (matching the code that follows):

d = |M1 - M2| / sqrt((SD1^2 + SD2^2) / 2)

As such, the lines of code below calculate the effect size (Cohen's d) for each variable in the religious music condition in comparison with each of the other conditions. The abs() function was added to display all the effect sizes as absolute values, as done in the original table. Similarly to the other values, the Cohen's d values are labelled in the following manner: d stands for Cohen's d, the number refers to which comparison between conditions is being measured (1 = relig vs sec, 2 = relig vs noise, 3 = relig vs control), and the final letter represents which of the six variables is being measured (e.g. n for negativity).

# computing Cohen's d
# % claimed
#relig vs sec
d1c <- abs((mean4c-mean3c)/sqrt((sd4c^2+sd3c^2)/2))
#relig vs noise
d2c <- abs((mean4c-mean2c)/sqrt((sd4c^2+sd2c^2)/2))
#relig vs control
d3c <- abs((mean4c-mean1c)/sqrt((sd4c^2+sd1c^2)/2))
# sacredness
#relig vs sec
d1s <- abs((mean4s-mean3s)/sqrt((sd4s^2+sd3s^2)/2))
#relig vs noise
d2s <- abs((mean4s-mean2s)/sqrt((sd4s^2+sd2s^2)/2))
# negativity
#relig vs sec
d1n <- abs((mean4n-mean3n)/sqrt((sd4n^2+sd3n^2)/2))
#relig vs noise
d2n <- abs((mean4n-mean2n)/sqrt((sd4n^2+sd2n^2)/2))
# positivity
#relig vs sec
d1p <- abs((mean4p-mean3p)/sqrt((sd4p^2+sd3p^2)/2))
#relig vs noise
d2p <- abs((mean4p-mean2p)/sqrt((sd4p^2+sd2p^2)/2))
# tempo
#relig vs sec
d1t <- abs((mean4t-mean3t)/sqrt((sd4t^2+sd3t^2)/2))
#relig vs noise
d2t <- abs((mean4t-mean2t)/sqrt((sd4t^2+sd2t^2)/2))
# impact
#relig vs sec
d1i <- abs((mean4i-mean3i)/sqrt((sd4i^2+sd3i^2)/2))
#relig vs noise
d2i <- abs((mean4i-mean2i)/sqrt((sd4i^2+sd2i^2)/2))
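
Since every comparison applies the same formula, a small helper function (our suggestion, not the authors' code) could replace the repeated arithmetic:

# Cohen's d with the simple pooled SD used throughout this section
cohens_d <- function(m1, m2, s1, s2) {
  abs((m1 - m2) / sqrt((s1^2 + s2^2) / 2))
}
# e.g. d1c <- cohens_d(mean4c, mean3c, sd4c, sd3c)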

Now that we had worked out the values for each dataset, we wanted to visualise them in a table. This ensured that we had all the correct values for each group. Firstly, we placed the data into a tibble, starting with the religious group.

We then listed the characteristics that this table focused on, such as % claimed and sacredness, before placing the variable names under the appropriate columns. After doing so, we used the mutate_if() function to round the values to 2 decimal places; the predicate is.numeric specifies that this only applies to columns with numeric values. Finally, we used the gt() function to render it as a table, and used cols_label() to put the group name at the head of the first column.

#Making the tables
#Make religious table
table1 <- tibble(
  characteristics = c("% claimed", "Sacredness", "Negativity", "Positivity", "Tempo", "Impact"),
  M = c(mean4c, mean4s, mean4n, mean4p, mean4t, mean4i),
  SD = c(sd4c, sd4s, sd4n, sd4p, sd4t, sd4i),
  lCI = c(lCI4c, lCI4s, lCI4n, lCI4p, lCI4t, lCI4i),
  uCI = c(uCI4c, uCI4s, uCI4n, uCI4p, uCI4t, uCI4i),
  d = c("-", "-", "-", "-", "-", "-")
) 
table1 %>% mutate_if(is.numeric, ~round(., 2)) %>%
  gt() %>%
  cols_label(characteristics = "Religious") 
Religious M SD lCI uCI d
% claimed 27.33 29.92 21.52 33.14 -
Sacredness 5.17 1.39 4.90 5.44 -
Negativity 1.82 0.65 1.70 1.95 -
Positivity 2.54 0.85 2.37 2.70 -
Tempo 2.60 0.81 2.45 2.76 -
Impact 2.91 1.09 2.70 3.12 -

We then applied this to the secular, white noise and control groups. The c() function combines its arguments into a vector, which enables us to place the variables in the order we want.

#Make secular table
table2 <- tibble(
  characteristics = c("% claimed", "Sacredness", "Negativity", "Positivity", "Tempo", "Impact"),
  M = c(mean3c, mean3s, mean3n, mean3p, mean3t, mean3i),
  SD = c(sd3c, sd3s, sd3n, sd3p, sd3t, sd3i),
  lCI = c(lCI3c, lCI3s, lCI3n, lCI3p, lCI3t, lCI3i),
  uCI = c(uCI3c, uCI3s, uCI3n, uCI3p, uCI3t, uCI3i),
  d = c(d1c, d1s, d1n, d1p, d1t, d1i)
)
table2 %>% mutate_if(is.numeric, ~round(., 2)) %>%
  gt() %>%
  cols_label(characteristics = "Secular")
Secular M SD lCI uCI d
% claimed 30.32 30.98 24.33 36.30 0.10
Sacredness 4.32 1.35 4.06 4.58 0.62
Negativity 1.64 0.51 1.55 1.74 0.31
Positivity 2.72 0.78 2.56 2.87 0.22
Tempo 2.78 0.82 2.62 2.94 0.22
Impact 2.73 1.13 2.51 2.95 0.16
#Make white noise table
table3 <- tibble(
  characteristics = c("% claimed", "Sacredness", "Negativity", "Positivity", "Tempo", "Impact"),
  M = c(mean2c, mean2s, mean2n, mean2p, mean2t, mean2i),
  SD = c(sd2c, sd2s, sd2n, sd2p, sd2t, sd2i),
  lCI = c(lCI2c, lCI2s, lCI2n, lCI2p, lCI2t, lCI2i),
  uCI = c(uCI2c, uCI2s, uCI2n, uCI2p, uCI2t, uCI2i),
  d = c(d2c, d2s, d2n, d2p, d2t, d2i)
)
table3 %>% mutate_if(is.numeric, ~round(., 2)) %>%
  gt() %>%
  cols_label(characteristics = "White Noise")
White Noise M SD lCI uCI d
% claimed 29.40 31.09 23.39 35.40 0.07
Sacredness 3.26 1.19 3.04 3.49 1.48
Negativity 2.58 0.96 2.39 2.76 0.93
Positivity 1.38 0.62 1.26 1.50 1.56
Tempo 3.99 0.72 3.85 4.13 1.80
Impact 1.76 0.93 1.58 1.94 1.13
#Make control group table
table4 <- tibble(
  characteristics = c("% claimed", "Sacredness", "Negativity", "Positivity", "Tempo", "Impact"),
  M = c(mean1c, NA, NA, NA, NA, NA),
  SD = c(sd1c, NA, NA, NA, NA, NA),
  lCI = c(lCI1c, NA, NA, NA, NA, NA),
  uCI = c(uCI1c, NA, NA, NA, NA, NA),
  d = c(d3c, NA, NA, NA, NA, NA)
)
table4 %>% mutate_if(is.numeric, ~round(., 2)) %>%
  gt() %>%
  cols_label(characteristics = "Control Group")
Control Group M SD lCI uCI d
% claimed 22.38 23.89 17.69 27.06 0.18
Sacredness NA NA NA NA NA
Negativity NA NA NA NA NA
Positivity NA NA NA NA NA
Tempo NA NA NA NA NA
Impact NA NA NA NA NA

After this, we needed to merge the tables together. We found that using data.frame() was the easiest approach: it let us bind the four tables together column-wise into one data frame.

data_tables <- data.frame(Religious = table1, 
                          Secular = table2,
                          WhiteNoise = table3,
                          Control = table4)

Building upon that, we used the gt() function from the gt package. This is an extremely versatile package that let us build attractive tables very simply, using the wide array of functions it includes. By using tab_source_note(source_note = ...), we were able to place text at the bottom of the table, similar to the original. The function tab_spanner(label = ..., columns = c(...)) allowed us to place a label that spans a given set of columns, which enabled us to label the columns belonging to each group.

gt_tbl <- gt(data_tables) %>% 
  tab_source_note(
    source_note = "M = Mean; SD = Standard Deviation; CI = 95% Confidence Intervals. Cohen's d represents the effect size of comparisons between musical conditions.") %>%
  tab_spanner(
    label = "Religious (n = 102)",
    columns = c(Religious.M, Religious.SD, Religious.lCI, Religious.uCI, Religious.d)
  ) %>%
  tab_spanner(
    label = "Secular (n = 103)",
    columns = c(Secular.M, Secular.SD, Secular.lCI, Secular.uCI, Secular.d)
  ) %>%
  tab_spanner(
    label = "White Noise (n = 103)",
    columns = c(WhiteNoise.M, WhiteNoise.SD, WhiteNoise.lCI, WhiteNoise.uCI, WhiteNoise.d)
  ) %>%
  tab_spanner(
    label = "Control (n = 100)",
    columns = c(Control.M, Control.SD, Control.lCI, Control.uCI, Control.d)
  )

Moving on from this, we needed to name the columns appropriately with M, SD, lCI, uCI and d. This was completed using the function cols_label(), where the original name of the column is overridden and the text in the quotation marks ("") sets the new name.

We needed to ensure that all values were rounded to two decimal places. This was achieved using fmt_number() on columns with numeric values. This was difficult for the cells initially marked with a "-", as they were not calculated by the researchers; by going back to the original table construction and changing the "-" into NA, fmt_number() finally worked. This was applied to each column, and use_seps was set to FALSE as we did not want any digit group separators. The exact call is: fmt_number(columns = ..., decimals = 2, use_seps = FALSE).

Next on the list was hiding the columns that repeated the % claimed, sacredness, negativity, positivity, tempo and impact characteristics. These occurred because each group initially had its own separate table. We achieved this using cols_hide(columns = c(...)), with the columns we wanted to hide passed into the argument.

Finally, the last thing to do was change the NAs in our gt table to "-" to visualise it similarly to Nichols and colleagues. This was done using fmt_missing() applied to every column; the missing_text argument replaced the NAs with "-". And now our work was finally done!

gt_tbl %>%  cols_label(Religious.characteristics = NULL, 
             Religious.M = "M",
             Religious.SD = "SD",
             Religious.lCI = "lCI",
             Religious.uCI = "uCI",
             Religious.d = "d",
             Secular.characteristics = "Secular",
             Secular.M = "M",
             Secular.SD = "SD",
             Secular.lCI = "lCI",
             Secular.uCI = "uCI",
             Secular.d = "d",
             WhiteNoise.characteristics = "White Noise",
             WhiteNoise.M = "M",
             WhiteNoise.SD = "SD",
             WhiteNoise.lCI = "lCI",
             WhiteNoise.uCI = "uCI",
             WhiteNoise.d = "d",
             Control.characteristics = "Control",
             Control.M = "M",
             Control.SD = "SD",
             Control.lCI = "lCI",
             Control.uCI = "uCI",
             Control.d = "d") %>% 
  fmt_number(
      columns = c(Religious.M, Religious.SD, Religious.lCI, Religious.uCI, Secular.M, Secular.SD, Secular.lCI,
                  Secular.uCI, Secular.d, WhiteNoise.M, WhiteNoise.SD, WhiteNoise.lCI, WhiteNoise.uCI,
                  WhiteNoise.d, Control.M, Control.SD, Control.lCI, Control.uCI, Control.d),
      decimals = 2,
      use_seps = FALSE
  ) %>% 
  cols_hide(
    columns = c(Secular.characteristics, WhiteNoise.characteristics, Control.characteristics)
  ) %>% 
  fmt_missing(
    columns = everything(),
    missing_text = "-"
  )
                Religious (n = 102)           Secular (n = 103)             White Noise (n = 103)        Control (n = 100)
                M     SD    lCI   uCI   d     M     SD    lCI   uCI   d     M     SD    lCI   uCI   d    M     SD    lCI   uCI   d
% claimed       27.33 29.92 21.52 33.14 -     30.32 30.98 24.33 36.30 0.10  29.40 31.09 23.39 35.40 0.07 22.38 23.89 17.69 27.06 0.18
Sacredness      5.17  1.39  4.90  5.44  -     4.32  1.35  4.06  4.58  0.62  3.26  1.19  3.04  3.49  1.48 -     -     -     -     -
Negativity      1.82  0.65  1.70  1.95  -     1.64  0.51  1.55  1.74  0.31  2.58  0.96  2.39  2.76  0.93 -     -     -     -     -
Positivity      2.54  0.85  2.37  2.70  -     2.72  0.78  2.56  2.87  0.22  1.38  0.62  1.26  1.50  1.56 -     -     -     -     -
Tempo           2.60  0.81  2.45  2.76  -     2.78  0.82  2.62  2.94  0.22  3.99  0.72  3.85  4.13  1.80 -     -     -     -     -
Impact          2.91  1.09  2.70  3.12  -     2.73  1.13  2.51  2.95  0.16  1.76  0.93  1.58  1.94  1.13 -     -     -     -     -
M = Mean; SD = Standard Deviation; CI = 95% Confidence Intervals. Cohen's d represents the effect size of comparisons between musical conditions.

Figure 1

We first loaded in the packages and data required for the figure.

library(tidyverse) #loads readr (for read_csv) and the wrangling packages
library(ggplot2)   #used to plot column
library(ggdist)    #used for raincloud plot
library(gridExtra) #combines both plots
data2 <- read_csv("Nichols_et_al_data.csv")

claimpercent had to be multiplied by 100 to become a percentage value, since it was stored as a proportion beforehand. Then, we used as_tibble() to make the data easier to look at.

data2 <- data2 %>%  
  filter(include == 0) %>% 
  rename(cond = con,
         claimpercent = claim,
         claimmoney = moneyclaim,
         CT_practice = `completion time (practice included)`,
         CT_payments = `completion time (payments only)`,
         religiosity = relig,
         religion = Religion) %>% 
  select(site, claimpercent, id, cond) %>% 
  mutate(claimpercent = claimpercent * 100) %>% 
  as_tibble()

We reordered the cond and site variables to code them in the same order as the original paper. This code was taken from the authors' code. At first we were confused about how they had reordered the conditions: why did they reorder cond==4 twice, and why did they not reorder cond==2? After playing around with the code, we realized that R reads the code line by line, so if we changed cond==1 to 3, then on the next line it would be changed straight back to 1. Furthermore, cond==2 did not need to be changed, as it keeps the same value either way.

# data by condition
data2$cond[data2$cond==4] <- 0 # Make religious prime the reference category
data2$cond[data2$cond==1] <- 4 # This is in a weird order as R reads the code line by line, so if we go from top to bottom, 
data2$cond[data2$cond==3] <- 1 # we're changing the number twice which screws up our dataframe
data2$cond[data2$cond==4] <- 3
# data by site
data2$site[data2$site==1] <- 0 
data2$site[data2$site==3] <- 1 
# makes USA = 0, Japan = 1, Czech Republic = 2
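
An order-independent alternative (a sketch, not the authors' code) would be dplyr's recode(), which maps all old values to new ones in a single pass, so nothing can be overwritten by a later line. The net mapping for cond is 1 -> 3, 2 -> 2, 3 -> 1, 4 -> 0, and for site 1 -> 0, 3 -> 1, with 2 unchanged:

# equivalent to the assignment lines above, but evaluated all at once
data2 <- data2 %>% 
  mutate(cond = recode(cond, `1` = 3, `2` = 2, `3` = 1, `4` = 0),
         site = recode(site, `1` = 0, `3` = 1))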

We used mutate() from the dplyr package to create claimpercent2, which is the original claimpercent value divided by the number of participants in that participant's condition. We did this because geom_col() takes the sum of all the values when visualizing data, whereas we wanted the columns to represent the means. We used conditional statements to determine how much to divide each claimpercent by: the code works out which condition the participant was in, then divides the claimpercent value for that row by the number of participants in that condition, so that when the scores are all added up we are left with the mean. Using the | (or) operator and the == (equal to) operator, we can construct code that produces the right divisor only when the correct conditions are met.

# data by condition
data2 <- data2 %>% 
  mutate(numberOf = (cond == 0) * 100 + (cond == 1 | cond == 2) * 103 + (cond == 3) * 102) %>% 
  mutate(claimpercent2 = claimpercent / numberOf)
# data by site
data2 <- data2 %>% 
  mutate(numberOf2 = (site == 0) * 123 + (site == 2) * 127 + (site == 1) * 156) %>% 
  mutate(claimpercent3 = claimpercent / numberOf2)
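
The hard-coded group sizes could also be derived from the data itself; a sketch (our suggestion, not the authors' code), assuming every included row should count towards its group's n, which may differ by a participant or two from the hard-coded values above:

data2 <- data2 %>% 
  group_by(cond) %>% 
  mutate(claimpercent2 = claimpercent / n()) %>% # divide by the group's own n
  ungroup() %>% 
  group_by(site) %>% 
  mutate(claimpercent3 = claimpercent / n()) %>% 
  ungroup()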

We created group_cond so that geom_errorbar() (in the next chunk) could plot the error bars. group_by() from dplyr creates a grouped tibble so that any data wrangling we apply to the dataset is done in relation to the variables we have grouped by; for this part, we grouped by condition.

n = n() assigns n the number of rows in each group.

drop_na() from tidyr (part of the tidyverse) "drops" rows that contain missing values.

Then we created two new variables, lowerCI and upperCI, by calculating the standard error (se) and then computing lowerCI and upperCI from the variables we had just created.

Then, we ungrouped the data frame.

We had to change the cond column of group_cond (and the site column of group_site) into a factor so it would align with our plot later. Without this, the error bars would not align with the cond and site bars, due to ggplot being finicky about discrete versus continuous variables.

# data by condition
group_cond <- data2 %>% 
  group_by(cond) %>% 
  summarise(mean = mean(claimpercent, na.rm = TRUE),
            sd = sd(claimpercent, na.rm = TRUE),
            n = n()) %>% 
  drop_na() %>% 
  mutate(se = sd / sqrt(n),
         lowerCI = mean - qt(1 - (0.05/2), n - 1) * se,
         upperCI = mean + qt(1 - (0.05/2), n - 1) * se) %>% 
  ungroup()
group_cond$cond <- factor(group_cond$cond)
# data by site
group_site <- data2 %>% 
  group_by(site) %>% 
  summarise(mean = mean(claimpercent, na.rm = TRUE),
            sd = sd(claimpercent, na.rm = TRUE),
            n = n()) %>% 
  drop_na() %>% 
  mutate(se = sd / sqrt(n),
         lowerCI = mean - qt(1 - (0.05/2), n - 1) * se,
         upperCI = mean + qt(1 - (0.05/2), n - 1) * se) %>% 
  ungroup()
group_site$site <- factor(group_site$site)
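
One thing worth noting (our observation): the Table 1 code earlier used the 1.96 normal approximation for the CIs, while this chunk uses the exact t critical value from qt(). For groups of around 100 participants the two are nearly identical:

qt(1 - (0.05/2), df = 99) # ~1.98, t critical value for n = 100
qnorm(1 - (0.05/2))       # ~1.96, normal approximation used for Table 1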

Next, we changed both cond and site to factors. We changed them at this point because doing so earlier would have prevented the calculations above, which only work on continuous variables.

We used stat_halfeye() from the ggdist package to create the cloud parts of the graph, and we combined it with geom_col() from ggplot2 to get the bar underneath. We had to adjust the widths so both could fit on their respective x axes.

In the ggdist::stat_halfeye() section, adjust controlled the smoothing of the density, and width scaled the cloud down so we could leave room for the column. .width and point_colour were part of the original cloud plot, but we set them to 0 and NA since the interval and point are not in Fig 1.

We used coord_flip() from ggplot2 to flip the graph to look like the original Figure 1; ylab was used to rename the y axis.

We used geom_errorbar() from ggplot2 to plot the error bars, assigning the lower and upper CIs computed from the SE in the previous chunk. We adjusted the width to fit the columns.

expand_limits() from ggplot2 ensured that the limits included x = 0 and y = 0 to better match the original graph.

For scale_x_discrete() and scale_y_continuous(), expand made sure the x and y axes started at zero. We used labels to name each value, and breaks so that labels and breaks were the same length. name = NULL removed the x axis label, like the original figure, and name = "Percent Claimed" relabelled the y axis.

theme_classic() created a white background for the graph (instead of grids).

theme(legend.position = "none") removed the legend generated by the fill mapping, and scale_fill_manual() allowed us to add the exact colours used in the original plot; the colours were obtained by using a colour-picking tool on the original figure and taking the hex codes.

ggtitle() from ggplot2 was used to create the "A." title in the corner of the graph, and facet_grid() allowed us to put the title "Data by _____" in its own box, like in the original.

# data by condition
data2$cond <- factor(data2$cond)
fig1_cond <- ggplot(data2, aes(x = cond)) +
  geom_col(
    aes(y = claimpercent2, fill = cond),
    width = .3
  ) +
  ggdist::stat_halfeye(
    aes(y = claimpercent, fill = cond),
    adjust = .5,
    width = .4,
    .width = 0,
    point_colour = NA,
    position = position_nudge(x = 0.17, y = 0)
  ) +
  geom_errorbar(data = group_cond,
                aes(ymin = lowerCI, ymax = upperCI),
                show.legend = NA,
                width = .3
                ) +
  coord_flip() +
  expand_limits(x = 0, y = 0) +
  scale_x_discrete(name = NULL,
                     expand = c(0,0),
                     breaks = c(0, 1, 2, 3),
                     labels = c("Control", "Noise", "Secular", "Religious")
  ) +
  scale_y_continuous(name = "Percent Claimed",
                     expand = c(0,0),
                     breaks = c(0, 20, 40, 60, 80, 100)
                     ) +
  ggtitle(
    label = "Data by condition"
  ) +
  theme_classic() +
  theme(legend.position = "none") +
  scale_fill_manual(values = c("#adadad","#336ca3","#77c2ec","#ecb234")) +
  ggtitle(label = "A.") +
  facet_grid(. ~ "Data by condition")
plot(fig1_cond)

# data by site
data2$site <- factor(data2$site)
fig1_site <- ggplot(data2, aes(x = site)) +
  geom_col(
    aes(y = claimpercent3, fill = site),
    width = .3
  ) +
  ggdist::stat_halfeye(
    aes(y = claimpercent, fill = site),
    adjust = .5,
    width = .3,
    .width = 0,
    point_colour = NA,
    position = position_nudge(x = 0.2, y = 0)
  ) +
  geom_errorbar(data = group_site,
                aes(ymin = lowerCI, ymax = upperCI),
                width = .3
  ) +
  coord_flip() +
  expand_limits(x = 0, y = 0) +
  scale_x_discrete(name = NULL,
                     expand = c(0,0), 
                     breaks = c(0, 1, 2),
                     labels = c("USA","Japan","Czech Rep.")
                     ) +
  scale_y_continuous(name = "Percent Claimed",
                     expand = c(0,0),
                     breaks = c(0, 20, 40, 60, 80, 100)
                     ) +
  theme_classic() +
  theme(legend.position = "none") +
  scale_fill_manual(values = c("#fec8bc","#f2667c","#d75b5c")) +
  ggtitle(label = "B.") +
  facet_grid(. ~ "Data by site") 
plot(fig1_site)

We then plotted the two graphs side by side using grid.arrange() from the gridExtra package.

grid.arrange(fig1_cond, fig1_site, ncol = 2)

Figure 2

We cleaned up the data first, just as for all the other figures/tables. After working on the code for a while we realized we would need to reverse-code some variables, namely religiosity and ritual. The abs() function takes the absolute value of the expression inside, so when combined with the subtraction, it reverses the order of the scale.

library(gmodels)
data3 <- read.csv("Nichols_et_al_data.csv")
#Cleaning up the data for use
dataA <- data3 %>%                                                          
  filter(include == 0) %>% 
  
  rename(cond = con,
         claimpercent = claim,
         claimmoney = moneyclaim,
         CT_practice = completion.time..practice.included.,
         CT_payments = completion.time..payments.only.,
         religiosity = relig,
         religion = Religion) %>% 
  
  mutate(religiosity = abs(religiosity - 5),
         ritual = abs(ritual - 7),              # reverse coding
         claimpercent = claimpercent * 100,     #turning this into a percentage value
         affil = as.factor(affil_cong)) %>%      
         
  
  select(cond:ritual, -claimmoney, -sex, -age, -Religion.Text, -religion, -starts_with("CT")) %>% # selecting only the columns required
  
  as_tibble() 

We also reordered and factored the conditions as per the author’s code.

# Re-order conditions to: religious, secular, noise, and control
dataA$cond[dataA$cond==4] <- 0 # Make religious prime the reference category
dataA$cond[dataA$cond==1] <- 4 # This is in a weird order as R reads the code line by line, so if we go from top to bottom, 
dataA$cond[dataA$cond==3] <- 1 # we're changing the number twice which screws up our dataframe
dataA$cond[dataA$cond==4] <- 3
# labelling the levels of the treatment variable
dataA$cond <- factor(dataA$cond, levels= c(0,1,2,3),
                labels = c("Religious", "Secular", "Noise","Control"))

The next piece of code creates a dataset containing only the participants from dataA who are in the religious condition. This was done in order to plot the SE ribbon in the first two plots, as it only appears for the religious condition.

dataB <- dataA %>% 
  filter(cond == "Religious")

In order to plot the third panel, we needed to work out the CI limits for all conditions, based on whether participants were religiously affiliated or not. To create a dataframe that grouped these variables together, we used group_by(). From there, we calculated the mean, SD, and number of participants for each grouping, and then applied a formula to calculate the lower and upper CI limits. We then relabelled the factor levels to either "Non-affiliated" or "Affiliated". We also found that many participants did not have a value for the affil variable; to filter these participants out, we used the drop_na() function.

#Make a new dataframe that contains CI limits for all conditions based on whether participants are religiously affiliated or not
groupA <- dataA %>% 
  group_by(cond, affil) %>% 
  summarise(mean = mean(claimpercent, na.rm = TRUE),
            sd = sd(claimpercent, na.rm = TRUE),
            n = n()) %>% 
  drop_na() %>% 
  mutate(se = sd / sqrt(n),
         lowerCI = mean - qt(1 - (0.05/2), n - 1) * se,
         upperCI = mean + qt(1 - (0.05/2), n - 1) * se) %>% 
  ungroup()
  
# labelling the levels of religious affiliation 
groupA$affil <- factor(groupA$affil, levels = c(0, 1), 
                       labels = c("Non-affiliated", "Affiliated"))

Figuring out how to code the plots took a bit of time. The first challenge was getting 4 lines to appear, as our initial attempts with geom_smooth() failed. We worked out that we needed to turn the cond variable into a factor rather than a numeric variable, which made the 4 lines appear.

figA <- ggplot(dataA, aes(religiosity, claimpercent, color = cond)) +
  geom_smooth(method = "lm")

figA

After that, the next thing to figure out was how to get the y limits to reach 50.

figA <- ggplot(dataA, aes(religiosity, claimpercent, color = cond)) +
  geom_smooth(method = "lm") +
  ylim(c(0, 50))

figA

When we tried using the ylim() function, the lines changed completely. After reviewing the documentation, we learned that ylim() does not simply zoom the graph: it removes data points outside the limits before statistics are computed, which is what changed the geom_smooth lines. We found that coord_cartesian() instead sets limits by zooming, without losing any data points. We then used functions such as labs(), theme_light(), theme(), and scale_color_manual() to make the graph look similar to the paper's. The last thing to figure out was how to put the title into a grey box. This grey box typically only appears on graphs that use faceting, which is normally used to show multiple panels, so we applied facet_grid() to one plot at a time to make the grey box appear.

figA <- ggplot(dataA, aes(religiosity, claimpercent, color = cond)) +
  geom_smooth(method = "lm", se = FALSE, size = 1.5) + #method = "lm" creates a straight line of best fit
  geom_smooth(data = dataB, size = 0, se = TRUE, method = "lm", fill = "#fbf1d8") + #this creates the SE for the religious condition
  theme_light() + #Gives white background to plot
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(), legend.position = "none") + #Removes gridlines from plot, removes legend
  coord_cartesian(ylim = c(0, 50)) + #Sets y limit to 50
  labs(x = "Religiosity", y = "Percentage claimed") + #axis labels 
  facet_grid(. ~ "Condition*Religiosity") + # title, we use facet_grid to put it in a grey box
  scale_color_manual(values = c("#dea520", "#5ab3e4", "#094689", "#a4a4a4")) #change the colours of the lines to fit source

figA

The second plot was mostly identical to the first. Note that for the first two plots we used two geom_smooth() calls, as one of them existed simply to plot the SE ribbon (using se = TRUE) for the religious condition.

figB <- ggplot(dataA, aes(ritual, claimpercent, color = cond)) +
  geom_smooth(method = "lm", se = FALSE, size = 1.5) +
  geom_smooth(data = dataB, size = 0, se = TRUE, method = "lm", fill = "#fbf1d8") +
  theme_light() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(), legend.position = "none") +
  coord_cartesian(ylim = c(0, 50)) +
  labs(x = "Ritual frequency", y = "Percentage claimed") +
  facet_grid(. ~ "Condition*Ritual frequency") +
  scale_x_continuous(breaks = seq(0, 6, 1)) +  #adjusting the tick marks so 7 marks appear
  scale_color_manual(values = c("#dea520", "#5ab3e4", "#094689", "#a4a4a4"))

figB

The third plot took the most time to figure out. This was partly due to needing to plot the CI limits, as explained above. The next biggest issue was figuring out how to use geom_line(). Our initial attempts failed to produce any lines. After scouring the geom_line() documentation, we realized that we needed to use the group aesthetic to tell geom_line() which points to connect. This took a while to work out because the documentation's example used (group = group), which did not make much sense at first, since they were grouping by a variable named "group". The next thing to figure out was how to keep the geoms from overlapping. After some research, we found position_dodge(), which shifts each line by a set distance so they do not overlap.

figC <- ggplot(groupA, aes(affil, mean, color = cond)) +
  geom_line(aes(group = cond), position = position_dodge(width = 0.5), size = 1.5) + #group our lines by condition, we use position dodge so nothing overlaps
  geom_errorbar(data = groupA, aes(ymin = lowerCI, ymax = upperCI), width = 0.2, position = position_dodge(width = 0.5), size = 1.3) +
  theme_light() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  ylim(0, 50) +
  scale_x_discrete() +
  labs(x = "Religious affiliation", y = "Percentage claimed") +
  facet_grid(. ~ "Condition*Religious affiliation") +
  scale_color_manual(values = c("#dea520", "#5ab3e4", "#094689", "#a4a4a4"))

figC

We knew that displaying 3 graphs at once would be a bit difficult, since faceting would not work for us because each plot used different variables. After some research, we came across the gridExtra package, whose grid.arrange() function allowed us to plot the 3 graphs side by side.

grid.arrange(figA, figB, figC, ncol = 3)

Part 3: Exploration

Question 1: Did the religion of the participants affect their religiosity ratings of the music?

For this question, I wanted to see whether the religion of participants had an effect on how "Religious" they rated the music to be. The authors commented in the discussion that participants from different locations had differing views on religion and morality, so I wanted to see whether the religiousness of participants could actually affect their perception of religious music. The religiousness of the music was rated on a 7-point scale, with 1 being the lowest and 7 the highest rating of religiousness. The first thing I needed to do was load in my packages and data.

library(tidyverse)
library(gt)
library(ggsci)

data4 <- read.csv('Nichols_et_al_data.csv')

data4 <- data4 %>% 
  rename(cond = con,
         religion = Religion) %>% 
  select(include, cond, religion, religious) %>% 
  filter(include == 0) %>% 
  drop_na() %>% 
  as_tibble()

The next thing I did was reorder the conditions so that they were consistent with the paper.

# Re-order conditions to: religious, secular, noise, and control

data4$cond[data4$cond==4] <- 0 # Make religious prime the reference category
data4$cond[data4$cond==1] <- 4 # This is in a weird order as R reads the code line by line, so if we go from top to bottom, 
data4$cond[data4$cond==3] <- 1 # we're changing the number twice which screws up our dataframe
data4$cond[data4$cond==4] <- 3

This piece of code kept only the participants who were in the religious condition, so that I could gather religiousness ratings for the religious cue only. I then grouped the table by religion, and looked at the mean, SD, and number of participants for each religion. As you can see in the table, there were no great differences between any of the religious or nonreligious groups in their ratings. The religious participants tended to score slightly higher, but not by much. I decided not to run an ANOVA as the group sizes differed greatly from one another.

corr1 <- data4 %>% 
  filter(cond == 0) %>% 
  group_by(religion) %>% 
  summarise(n = n(),  # Calculating number of participants, mean, and SD
            mean = mean(religious, na.rm = TRUE),
            sd = sd(religious, na.rm = TRUE)) %>% 
  mutate(se = sd / sqrt(n),  #95% CI formula
         lowerCI = mean - qt(1 - (0.05/2), n - 1) * se,
         upperCI = mean + qt(1 - (0.05/2), n - 1) * se) %>% 
  drop_na()

gt1 <- gt(corr1)

gt1 <- gt1 %>%    
  tab_header(   # Creating title and subtitle
    title = md("**Average religiousness ratings of cue by religion of participant**"),
    subtitle = "Religious condition"
  ) %>% 
  tab_row_group(  # Creating row groups
    label = md("*Religious*"),
    rows = c(3, 4, 5, 9, 8)
  ) %>% 
  tab_row_group(
    label = md("*Nonreligious*"),
    rows = c(1, 2, 7)
  ) %>% 
  tab_row_group(
    label = md("*Other*"),
    rows = c(6, 10)
  ) %>% 
  cols_width(
    n ~ px(90),
    mean:upperCI ~ px(70)
  ) %>% 
  cols_label( # Prettying up the column labels
    religion = "Religion",
    mean = "Mean",
    sd = "Standard Deviation",
    se = "Standard Error",
    lowerCI = "95% Lower CI",
    upperCI = "95% Upper CI"
  ) %>% 
  row_group_order( # Placing the row groups in correct order
    groups = c("*Religious*", "*Nonreligious*", "*Other*")
  )
gt1
Average religiousness ratings of cue by religion of participant
Religious condition
Religion n Mean Standard Deviation Standard Error 95% Lower CI 95% Upper CI
Religious
Buddhism 9 5.444444 1.130388 0.3767961 4.575551 6.313338
Christian 25 5.280000 1.275408 0.2550817 4.753537 5.806463
Hindu 3 5.000000 2.645751 1.5275252 -1.572411 11.572411
Other 12 4.750000 1.912875 0.5521995 3.534617 5.965383
Shinto 2 6.000000 0.000000 0.0000000 6.000000 6.000000
Nonreligious
Agnosticism 11 4.545455 1.694912 0.5110352 3.406797 5.684112
Atheisim 10 4.600000 1.955050 0.6182412 3.201441 5.998559
No Religion 22 5.272727 1.241421 0.2646718 4.722312 5.823142
Other
Missing 4 6.500000 1.000000 0.5000000 4.908777 8.091223
Unknown 2 5.000000 0.000000 0.0000000 5.000000 5.000000
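If a formal test were still wanted despite the unequal group sizes, a nonparametric option such as the Kruskal-Wallis test (which makes fewer assumptions than an ANOVA) would be one route. A minimal sketch, not part of the original analysis:

# Rank-based test across religion groups in the religious condition (a sketch):
kruskal.test(religious ~ religion, data = filter(data4, cond == 0))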

I did the same for the secular condition. There was one row where the religion was blank, which I dropped from the table. In this condition, religiousness ratings were around 1-2 points lower across the board. Again, there was no real difference between religious and nonreligious participants in their ratings.

# Constructing descriptive tables 
corr2 <- data4 %>% 
  filter(cond == 1) %>% 
  group_by(religion) %>% 
  summarise(n = n(),
            mean = mean(religious, na.rm = TRUE),
            sd = sd(religious, na.rm = TRUE)) %>% 
  mutate(se = sd / sqrt(n),
         lowerCI = mean - qt(1 - (0.05/2), n - 1) * se,
         upperCI = mean + qt(1 - (0.05/2), n - 1) * se) %>% 
  drop_na() 
corr2 <- corr2[-c(1), ] # drop the row with a blank religion

gt2 <- gt(corr2)

gt2 <- gt2 %>% 
  tab_header(
    title = md("**Average religiousness ratings of cue by religion of participant**"),
    subtitle = "Secular condition"
  ) %>% 
  tab_row_group(
    label = md("*Religious*"),
    rows = c(3, 4, 5, 9, 8)
  ) %>% 
  tab_row_group(
    label = md("*Nonreligious*"),
    rows = c(1, 2, 7)
  ) %>% 
  tab_row_group(
    label = md("*Other*"),
    rows = c(6, 10)
  ) %>% 
  cols_width(
    n ~ px(90),
    mean:upperCI ~ px(70)
  ) %>% 
  cols_label(
    religion = "Religion",
    mean = "Mean",
    sd = "Standard Deviation",
    se = "Standard Error",
    lowerCI = "95% Lower CI",
    upperCI = "95% Upper CI"
  ) %>% 
  row_group_order(
    groups = c("*Religious*", "*Nonreligious*", "*Other*")
  ) 
gt2
Average religiousness ratings of cue by religion of participant
Secular condition
Religion n Mean Standard Deviation Standard Error 95% Lower CI 95% Upper CI
Religious
Buddhism 9 4.222222 0.6666667 0.2222222 3.709777 4.734668
Christian 30 3.966667 1.5643294 0.2856062 3.382536 4.550797
Hindu 2 2.000000 1.4142136 1.0000000 -10.706205 14.706205
Other 10 3.300000 0.9486833 0.3000000 2.621353 3.978647
Shinto 5 3.400000 1.5165751 0.6782330 1.516923 5.283077
Nonreligious
Agnosticism 10 3.700000 1.0593499 0.3349959 2.942187 4.457813
Atheisim 7 3.428571 2.1491970 0.8123201 1.440896 5.416247
No Religion 19 4.052632 1.3933845 0.3196643 3.381042 4.724221
Other
Missing 2 4.000000 0.0000000 0.0000000 4.000000 4.000000
Unknown 5 4.200000 1.3038405 0.5830952 2.581068 5.818932

After this, I wanted to draw a graph plotting the means and 95% CIs for religious and non-religious participants, to determine whether there was any significant difference in how the two categories rated the music. I filtered the original dataset for the religious affiliations (Christian, Buddhism, Hindu, Shinto, and Other) and for the religious condition, then created variables holding the sum, mean, number of participants, and SD. I found the number of participants by running the tally() function. I placed these variables into a dataframe, then did the same for non-religious participants, filtering for the Agnosticism, Atheisim, and No Religion levels. Once I had my religious and non-religious dataframes, I used rbind() to bind the two rows together into one table named finalTable1. I then calculated the 95% CIs and plotted the result as a column graph, with errorbars showing the CI limits. I used the ggsci package to generate a Simpsons-based palette. As you can see in the graph, there was no significant difference between how non-religious and religious participants rated the religiousness of the music in the religious cue condition.

# Constructing graph for religious condition
data5 <- data4 %>% 
  filter(religion == 'Christian' | religion == 'Buddhism' | 
         religion == 'Hindu' | religion == 'Shinto' | religion == 'Other', 
         cond == 0)

religSum1 = sum(data5$religious)  
religM1 = mean(data5$religious)
religN1 = tally(data5)$n # tally() returns a one-row data frame; $n extracts the count itself
religSD1 = sd(data5$religious)

religTable1 <- data.frame(Status = "Religious", Sum = religSum1, mean = religM1, SD = religSD1, n = religN1)

data6 <- data4 %>% 
  filter(religion == 'Agnosticism' | religion == 'Atheisim' | religion == 'No Religion', cond == 0)

noReligSum1 = sum(data6$religious)
noReligM1 = mean(data6$religious)
noReligN1 = tally(data6)$n # extract the count as a plain number
noReligSD1 = sd(data6$religious)

noReligTable1 <- data.frame(Status = "Non-religious", Sum = noReligSum1, mean = noReligM1,  
                           SD = noReligSD1, n = noReligN1)

finalTable1 = rbind(religTable1, noReligTable1) # combining the two dataframes together

finalTable1 <- finalTable1 %>% 
  mutate(se = SD / sqrt(n),
         lowerCI = mean - qt(1 - (0.05/2), n - 1) * se,
         upperCI = mean + qt(1 - (0.05/2), n - 1) * se)

finalPlot1 <- ggplot(finalTable1, aes(Status, mean, fill = Status)) +
  geom_col() +
  geom_errorbar(aes(ymin = lowerCI, ymax = upperCI)) +
  ylab("Religiosity rating") + #Customizing x and y labels
  xlab("Religious status") +
  facet_grid(. ~ "Religious condition - religiousness ratings by religion") +
  theme_light() + # creates white background
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(), legend.position = "none") + #removes gridlines, removes legends
  scale_fill_simpsons()

plot(finalPlot1)
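As an aside, the manual sum/mean/tally/rbind route above could be collapsed into a single pipeline. A sketch of that alternative (same summary, minus the Sum column; finalTable1_alt is just an illustrative name):

relig_levels    <- c("Christian", "Buddhism", "Hindu", "Shinto", "Other")
nonrelig_levels <- c("Agnosticism", "Atheisim", "No Religion")

finalTable1_alt <- data4 %>% 
  filter(cond == 0, religion %in% c(relig_levels, nonrelig_levels)) %>% 
  mutate(Status = if_else(religion %in% relig_levels, "Religious", "Non-religious")) %>% 
  group_by(Status) %>% 
  summarise(n = n(), mean = mean(religious), SD = sd(religious), .groups = "drop") %>% 
  mutate(se = SD / sqrt(n),
         lowerCI = mean - qt(1 - (0.05/2), n - 1) * se,
         upperCI = mean + qt(1 - (0.05/2), n - 1) * se)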

I then ran the same code again, this time filtering for participants in the secular condition. This graph shows even less of a difference between the two groups.

#Constructing graph for secular condition
data7 <- data4 %>% 
  filter(religion == 'Christian' | religion == 'Buddhism' | 
         religion == 'Hindu' | religion == 'Shinto' | religion == 'Other', # 'Other' must be capitalized to match the data, or that group is silently dropped
         cond == 1)

religSum2 = sum(data7$religious)  
religM2 = mean(data7$religious)
religN2 = tally(data7)$n # extract the count as a plain number
religSD2 = sd(data7$religious)

religTable2 <- data.frame(Status = "Religious", Sum = religSum2, mean = religM2,  SD = religSD2, n = religN2)

data8 <- data4 %>% 
  filter(religion == 'Agnosticism' | religion == 'Atheisim' | religion == 'No Religion', cond == 1)

noReligSum2 = sum(data8$religious)
noReligM2 = mean(data8$religious)
noReligN2 = tally(data8)$n
noReligSD2 = sd(data8$religious)

noReligTable2 <- data.frame(Status = "Non-religious", Sum = noReligSum2, mean = noReligM2,  
                            SD = noReligSD2, n = noReligN2)

finalTable2 = rbind(religTable2, noReligTable2) # combining the two dataframes together

finalTable2 <- finalTable2 %>% 
  mutate(se = SD / sqrt(n),
         lowerCI = mean - qt(1 - (0.05/2), n - 1) * se,
         upperCI = mean + qt(1 - (0.05/2), n - 1) * se)

finalPlot2 <- ggplot(finalTable2, aes(Status, mean, fill = Status)) +
  geom_col() +
  geom_errorbar(aes(ymin = lowerCI, ymax = upperCI)) +
  ylab("Religiosity rating") + #Customizing x and y labels
  xlab("Religious status") +
  facet_grid(. ~ "Secular condition - religiousness ratings by religion") +
  theme_light() + # creates white background
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(), legend.position = "none") + #removes gridlines, removes legends
  scale_fill_simpsons()

plot(finalPlot2)

Question 2: Does age play a role in religiosity ratings?

For this question, I wanted to look at whether participants were more likely to be religious the older they were. This would help determine whether the age distribution of the sample could introduce a confound into the experiment. To do this, I first constructed scatterplots of the correlation between age and religiosity ratings, split by site, to see whether there were differences based on where the participants were located. I used ggscatter() for the scatterplots, and Kendall’s rank correlation test to get a measure of correlation (which ggscatter labels R). I used Kendall’s correlation because the data do not come from a normal distribution (determined by running a Shapiro-Wilk test on the ages of participants at each location, and at all locations combined). The scatterplots also show the p-value, so anything below 0.05 signifies a significant correlation. It is also worth noting that I plotted the Japanese scatterplot twice: there was a single participant aged 15, around 5 years younger than most other participants, and I wanted to see whether the correlation coefficient would change drastically with that participant removed.

I also created a function named scatter_func() to reduce the lines of code I needed to write. It takes in the data, the coordinates at which to display the p-value and correlation coefficient, and the title for the graph, and returns a scatterplot. This saved me a lot of lines of code and makes the script much easier to read and understand overall.

library(tidyverse)
library(ggpubr)
library(ggeasy)
library(ggsci)

# Function code
scatter_func <- function(input, coord1, coord2, title1) {
  output <- ggscatter(input, x = "age", y = "religiosity", 
          color = "age",
          add = "reg.line", conf.int = TRUE,  # adds regression line
          cor.coef = TRUE, cor.method = "kendall", cor.coef.coord = c(coord1, coord2), #displaying the p value and R coefficient 
          position = position_jitter(),
          ylim = c(0.7, 5.3),
          xlab = "Age", ylab = "Religiosity",
          title = title1)
  
  return(output)
}

#read in data
data9 <- read.csv("Nichols_et_al_data.csv")

cor1 <- data9 %>% 
  rename(religiosity = relig) %>% 
  filter(include == 0) %>% 
  select(age, religiosity, site) %>% 
  as_tibble()

#scatterplot for all locations
scatter_func(cor1, 50, 5, "All locations")
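The Shapiro-Wilk checks mentioned above were along these lines (a sketch of the idea, not the exact code I ran):

# Normality checks that motivated Kendall's correlation (a sketch):
shapiro.test(cor1$age)                     # all locations combined
tapply(cor1$age, cor1$site, shapiro.test)  # each site separately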

#filtering through each site and making a plot for each
cor2 <- cor1 %>% 
  filter(site == 1)

scatter_func(cor2, 50, 5, "USA")

cor3 <- cor1 %>% 
  filter(site == 2)

scatter_func(cor3, 30, 5, "Czech Republic")

cor4 <- cor1 %>% 
  filter(site == 3)

scatter_func(cor4, 16, 5, "Japan")

cor5 <- cor4 %>% 
  filter(age > 16)

scatter_func(cor5, 21, 1, "Japan > 16")

As we can see in the plots, there is a significant negative correlation between age and religiosity when looking at all locations together. However, I am not sure this is an accurate statistical test, as the sample is heavily biased towards younger people. Furthermore, Japan’s age range was only 15-23, and the Japanese sample had a much higher mean religiosity rating than the other sites, further skewing results. Japan’s sociocultural factors also come into play, with many citizens taking part in religious and cultural festivals despite not actively practicing any religion. When looking at each location individually, there is no significant correlation between age and religiosity. I am also unsure whether Kendall’s rank correlation test was appropriate here, or whether a correlation test was appropriate at all given this biased sample.

Given these shortcomings of my analysis, I decided to explore the data further and see whether there was a difference in religiosity between older and younger age groups. I split the sample into two groups: Young participants were those aged 21 or below, whilst Old participants were those over 21. I chose this cutoff because 21 was the median age of the sample, and it is commonly treated as a milestone of adulthood in some countries, such as the USA. I did this by labeling the sites, splitting up the participants using case_when(), and then making a table grouped by site and age group. I calculated the number of participants, mean, SD, SE, and 95% CI limits for each site and age group, and created a table using gt().
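A quick one-liner confirms the median used for the split (not in the original code):

median(cor1$age, na.rm = TRUE)  # the split point; 21, as stated above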

# Labeling sites

cor1$site[cor1$site == 1] <- "USA"
cor1$site[cor1$site == 2] <- "Czech Republic"
cor1$site[cor1$site == 3] <- "Japan"

#Splitting participants up into young and old age groups
cor6 <- cor1 %>% 
  mutate(ageGroup = case_when(age <= 21 ~ "Young", age > 21 ~ "Old"))


group1 <- cor6 %>% 
  group_by(site, ageGroup) %>% 
  summarise(n = n(),
            mean = mean(religiosity, na.rm = TRUE),
            sd = sd(religiosity, na.rm = TRUE)) %>% 
  mutate(se = sd / sqrt(n),
         lowerCI = mean - qt(1 - (0.05/2), n - 1) * se,
         upperCI = mean + qt(1 - (0.05/2), n - 1) * se) %>% 
  drop_na()

gt3 <- gt(group1)

gt3 <- gt3 %>% 
  tab_header( # Adding title
    title = md("**Religiousness of participants by site and age group**")
  ) %>% 
  row_group_order( # Changing row group order
    groups = c("USA", "Czech Republic", "Japan")
  ) %>% 
  cols_align( # Adjusting alignment 
    align = "center",
    columns = sd
  ) %>% 
  cols_width( # Changing column width
    mean:upperCI ~ px(70)
  ) %>% 
  cols_label( # Changing column labels
    ageGroup = "Age Group (Young <= 21)",
    mean = "Mean",
    sd = "Standard Deviation",
    se = "Standard Error",
    lowerCI = "95% Lower CI",
    upperCI = "95% Upper CI"
  ) 

gt3
Religiousness of participants by site and age group
Age Group (Young <= 21) n Mean Standard Deviation Standard Error 95% Lower CI 95% Upper CI
USA
Old 62 3.161290 1.4048341 0.17841411 2.804529 3.518051
Young 60 2.733333 1.1026419 0.14235046 2.448491 3.018176
Czech Republic
Old 103 3.164948 1.2134696 0.11956672 2.927788 3.402108
Young 23 2.952381 1.0235326 0.21342131 2.509772 3.394990
Japan
Old 6 4.666667 0.5163978 0.21081851 4.124740 5.208593
Young 149 3.778523 0.9645447 0.07901858 3.622373 3.934674

Looking at these results, the USA seemed to have the greatest difference in religiosity between young and old participants. I did not take the results of the Japanese sample too seriously due to the discrepancy in size between its two groups. I used a two-sample t-test via the t.test() function to see whether the difference in the USA sample was significant. I did this by creating datasets containing only USA participants, split into old and young groups, and then passed these to t.test() with the argument var.equal = FALSE to use Welch’s t-test, as I could not assume the variances of the two groups would be equal.

usaOld <- cor6 %>% 
  filter(ageGroup == "Old", site == "USA")

usaYoung <- cor6 %>% 
  filter(ageGroup == "Young", site == "USA")
t.test(usaOld$religiosity, usaYoung$religiosity, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  usaOld$religiosity and usaYoung$religiosity
## t = 1.875, df = 115.14, p-value = 0.06333
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.02414422  0.88005820
## sample estimates:
## mean of x mean of y 
##  3.161290  2.733333

With a two-sample Welch’s t-test (p = .063), we are unable to reject the null hypothesis at the .05 level.
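Since the p-value is borderline, an effect size would be a useful complement. A minimal sketch of Cohen's d for the same comparison (not part of the original analysis, and using a simple unweighted pooled SD):

# Rough Cohen's d for the old vs. young USA comparison (a sketch):
pooled_sd <- sqrt((var(usaOld$religiosity, na.rm = TRUE) +
                   var(usaYoung$religiosity, na.rm = TRUE)) / 2)
(mean(usaOld$religiosity, na.rm = TRUE) -
    mean(usaYoung$religiosity, na.rm = TRUE)) / pooled_sd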

I then plotted my results using bar graphs, with error bars showing the 95% CIs. I placed the number of participants in each group above each bar so that readers can see where the difference in the Japan group may be coming from. Overall, there was no significant difference in religiosity between age groups across all locations.

agePlot <- ggplot(group1, aes(site, mean, fill = ageGroup)) +
  geom_bar( # Plot out bar plots
    stat = "identity",
    position = position_dodge(),
    width = 0.7,
    ) +
  geom_errorbar( #Add 95% CI limits as error bars
    aes(ymin = lowerCI, ymax = upperCI), width = 0.2, position = position_dodge(0.7)
  ) +
  theme_light() +
  ylab("Mean religiosity") + #Change x and y axis labels
  xlab("Site") +
  facet_grid( . ~ "Mean religiosity of age groups across sites") + #Adds title in grey element
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) + # Remove grid lines
  scale_fill_simpsons() + #Add simpsons colour scheme
  guides(fill = guide_legend(title = "Age Groups")) + #Change legend title
  geom_text( # displays number of participants
    aes(label = n, group = ageGroup),
    position = position_dodge(1), 
    vjust = -0.3, size = 3.5 #adjusting the position so it is readable
  )

agePlot

Question 3: Is the dishonesty distribution linear or parabolic?

For this question, my assumption was that the distribution of dishonesty measures would be parabolic (U-shaped), with a large number of participants either not cheating at all or cheating on almost all of the trials. The result of this analysis could be used to design better honesty tasks in the future, with greater variety in the amount of cheating done throughout the task. I first used a histogram and density plot to view the density of dishonesty measures across all sites, and used geom_vline() to plot a vertical line at the mean of the dishonesty measure. Once again, I used a function, plot_func(), to keep my code cleaner.

library(tidyverse)
library(sm)
library(ggeasy)

#Function code
plot_func <- function(data, title){
  output <- ggplot(data, aes(x = claimpercent)) +
    geom_histogram(aes(y = after_stat(density)), color = "black", fill = "#E64B35B2", binwidth = 5) + # after_stat(density) is the current spelling of ..density..; binwidth sets the bars to be 5% wide
    geom_density(alpha = .3, fill="#4DBBD5B2") +
    geom_vline(aes(xintercept = mean(claimpercent)), linetype = "dashed") + #vline shows the mean of the dishonesty measure
    xlab("Dishonesty measure") +
    ylab("Density") +
    ggtitle(title)
  return(output)
}

data10 <- read.csv("Nichols_et_al_data.csv")

data10 <- data10 %>% 
  rename(cond = con,
         claimpercent = claim) %>% 
  filter(include == 0) %>% 
  select(claimpercent, site) %>% 
  mutate(claimpercent = claimpercent * 100) %>% # convert the proportion to a percentage
  drop_na()


plot_func(data10, "Density of dishonesty measures for all participants")

As you can see, my suspicions were mostly confirmed. However, a much larger proportion of participants did not cheat or cheated very little, compared to those who cheated on most or all of the trials. I then plotted three more graphs, one per site, to see whether this parabolic trend held at every site.
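As a quick aside before the per-site plots: the “parabolic or not” question can also be tested formally. A sketch using Hartigan's dip test for multimodality (this assumes the diptest package, which the original analysis did not use):

# Formal multimodality check (a sketch; requires install.packages("diptest")):
library(diptest)
dip.test(data10$claimpercent)  # a small p-value indicates more than one mode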

#Filtering for USA participants
site1 <- data10 %>% 
  filter(site == 1)

plot_func(site1, "Density of dishonesty measures for USA participants")

#Filtering for Czech participants
site2 <- data10 %>% 
  filter(site == 2)

plot_func(site2, "Density of dishonesty measures for Czech Republic participants")

#Filtering for Japanese participants
site3 <- data10 %>% 
  filter(site == 3)

plot_func(site3, "Density of dishonesty measures for Japanese participants")

Finally, I decided to compare all 3 sites in one plot, using the sm.density.compare function from the sm package.

# Comparing the 3 density plots

site.f <- factor(data10$site, levels = c(1,2,3),
                 labels = c("USA", "Czech Republic", "Japan")) # Creating factor levels for the sites

sm.density.compare(data10$claimpercent, site.f, xlab = "Dishonestly claimed percentage",
                   xlim = c(0, 100)) # plot the three site densities on one set of axes
title(main = "Density plot of dishonesty measures")

colfill <- c(2:(1 + length(levels(site.f)))) # one fill colour per site, starting at colour 2 to match the plot
legend("topright", levels(site.f), fill = colfill)

As you can see, the parabolic trend persisted at every site apart from the Czech Republic. The USA had by far the highest proportion of participants who cheated on every trial. There are a few reasons why I believe most participants tended to cheat a little rather than not at all. I suspect many participants tested whether cheating would have any consequence, and whether they really would get paid for it, but decided it would not be worth the risk of being called out for cheating on every trial, possibly believing they would not be paid if caught. Many participants were probably also flagged as “cheating” when in reality they simply made an error of judgement.

These graphs also show how inconsistent the Czech sample is with the other groups. Whilst I do believe environmental and sociocultural factors come into play to a certain extent, I am fairly certain that awarding the Czech participants course credit for taking part, on top of the monetary reward, had a large effect on their dishonesty measures, and thus introduced a confound into the experiment.

Part 4: Recommendations

Codebook

The first recommendation is to include a codebook in the OSF repo. A lot of variables had extremely similar names, such as relig, religion, and religiosity. Furthermore, the coding schemes for variables such as sex and site were ambiguous, and we had to use the count() function to match the numbers against the paper to be confident we were interpreting them correctly. Some variables simply didn’t make sense at all, such as CT and CT_cheat. We knew that CT referred to completion time, but it was given as a decimal value, and we could not figure out what CT_cheat meant or how it differed from CT, although the values differed for all participants. We spent over an hour at the beginning of the project just deciphering the variables, and we were still unable to figure everything out. A codebook would make it clear how the variables were manipulated and why certain calculations were performed. This would simply involve listing each variable and describing what exactly it means, how it is coded, and its relationship to other variables; a sketch of what this could look like appears below.

An example of confusing variable names.
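To make this concrete, here is a hypothetical sketch of a minimal codebook; the descriptions and codings below are illustrative guesses rather than verified facts about the dataset:

# A hypothetical codebook skeleton (entries are illustrative, not verified):
codebook <- tribble(
  ~variable, ~description,                ~coding,
  "cond",    "Experimental condition",    "e.g. 1 = control, 2 = noise, 3 = secular, 4 = religious",
  "relig",   "Self-reported religiosity", "Likert scale; state the range and anchors",
  "CT",      "Completion time",           "state the unit (this is what we could not decipher)"
)
write.csv(codebook, "codebook.csv", row.names = FALSE)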

The code itself!

For some reason, even though you took the effort to use an Open Science Framework repository, you failed to include the most important part for enabling people to reproduce your analysis: the code itself! What you did include helped, and we would certainly have struggled more without it, but the plots and tables it produced weren’t the actual ones from the paper! This was very confusing to us; it would make much more sense to put this extra code in a separate document whilst leaving the original analysis code intact. Beyond making our lives hard (for obvious reasons), when we were unable to reproduce the plots we could not tell whether it was an error on our part or whether something had been done to the data to produce fabricated results. It rather defeats the purpose of open science if there are massive barriers to actually reproducing the figures and results in the paper.

We can’t tell why there’s a difference between your figure 2 and our figure 2

Package versions

Another improvement would be to record which versions of the packages were used. The version of R was included, which is good, but since functions from packages make up the bulk of the code, it makes sense to record the package versions too. This can be done with sessionInfo(), which lists the packages in use and their versions, or, to take it to the next level, with packrat, which stores all the packages used in a private library in the project directory and lets any future reproducer build the packages from source.
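Either option takes only a line or two; a sketch:

# Record package versions at the end of the analysis script:
sessionInfo()

# Or snapshot the project's packages into a private library with packrat:
packrat::init()      # creates the private library in the project directory
packrat::snapshot()  # records the exact versions currently in use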

We didn’t run into any trouble here, but it is highly likely that at some point one of the packages will change to the point where some of the code no longer functions as intended, making tools like packrat essential to reproducibility.

Transparency

Overall, the paper lacked transparency in many areas. There was no real rationale as to why certain plots were used, or why they were graphed in a certain way. This isn’t to say that the figures and tables were difficult to read, or that we thought you were deliberately showing only graphs with interesting results, but a rationale would help readers trust the figures they are being shown. There was also a lack of transparency about how the scales for the music ratings were selected. Ratings for the religiousness/sacredness of the music were on a 7-point scale, but every other aspect was coded on a 5-point scale. This was extremely confusing for us, and I imagine for most other readers of the paper too.

Differences in scales used

Another transparency issue was how the participants were recruited. There were clearly several different recruitment methods used, judging by the age variance between groups and the fact that the Czech sample was awarded participation points on top of the monetary reward. Recruitment is a crucial part of psychological research, as it is where sampling bias is often introduced. In Japan, the sample was not representative of the general Japanese populace, as only 7 years separated the youngest and oldest participants. An understanding of how recruitment differed between the locations would be beneficial not only to those attempting to reproduce the analysis, but also to anyone trying to replicate the experiment altogether.

Code readability

In your code (which, again, was not the code actually used in the paper), there were a lot of style choices that made it difficult to decipher. The first was the object names used.

Now, these names actually make a lot of sense once you understand them. However, there were no comments explaining how the naming scheme worked; the only way to figure it out was to read all of the code and think about what it was doing and how that related to the name. You could save a lot of people a lot of time simply by spending a little time on your comments! This could also save your future self, when you’re trying to decipher code you wrote a year (or a night) ago!

Another interesting style choice was the use of empty space (or lack thereof).

In many expressions there was a lack of spaces between the parts of the equation, making them difficult to read. A lot of the time, a line of code would keep going until it hit the character limit before wrapping, rather than being broken at a sensible point so that the eye does not have to travel so far. These are crucial elements of making code easy to read and understand. On top of this, entire sections of code were often written without a single comment, or with only one comment stating the overall purpose, which made it extremely difficult to understand what each line was doing (a hypothetical before-and-after follows below).

What is going on here?
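Here is the kind of hypothetical before-and-after we mean (the data frame and variable names are invented for illustration):

# Before: no spaces, no comments, an opaque name (hypothetical)
m1<-lm(claim~relig+age,data=dot_task_data);summary(m1)

# After: spacing, a descriptive name, and a comment explaining intent
religiosity_model <- lm(
  claim ~ relig + age,  # does religiosity or age predict dishonest claims?
  data = dot_task_data  # hypothetical data frame name
)
summary(religiosity_model)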

We would recommend referring to basic coding best-practice blogs and instructional materials such as this one to help brush up your R code so that it is nice and readable for everyone, including yourself!