Verification

I undertook the verification task with Victor, Jessica and Sam. Against Jenny and Kate’s advice, we decided to divide and conquer the plots and table we were to reproduce.

We had to first recreate the demographic statistics, figures and tables:

Data were collected at universities across three sites: the USA, the Czech Republic, and Japan. A total of 460 (228 females) adults were randomly assigned to one of four conditions: religious, secular, white noise, or control (no sound). Due to self-reported suspicion on the goals of the experiment and previous experience with studies using the Dots Game, a total of 52 participants were excluded from the analysis. In support of this decision, supplementary analyses of the incorrectly claimed higher-paying sides in the Dots game showed that the excluded participants had 5.5 higher odds of incorrectly claiming all money from the higher-paying side (100% claimed) compared to participants in the included sample (see S1 Table G in S1 File for the analysis of the full sample). Five additional participants had missing data on crucial variables such as sex and age. Data included in the analyses therefore comprise: 123 American participants (Mage = 25.5, SD = 9.8), 128 Czech participants (Mage = 24.4, SD = 3.4), and 157 Japanese participants (Mage = 19.8, SD = 0.9). Within the four experimental conditions, there were: 100 participants in the control condition, 103 participants in the white-noise condition, 103 participants in the secular condition, and 102 participants in the religious condition.

Figure 1

Figure 2

Table 1

I took on Figure 1 (with Victor’s assistance), Victor tackled Figure 2, and Sam and Jessica worked on Table 1.

We created a codebook to make sense of the variables in the csv.

include: 0 for included, 1 for excluded
site: 1 for USA, 2 for Czech Republic, 3 for Japan
id: id of participant
cond: 1 for control, 2 for white noise, 3 for secular, 4 for religious
claimpercent: percentage claimed incorrectly 
claimmoney: how much money was made from the task
sex: 0 = female, 1 = male, NA = missing data
age: age
religiosity: 1 - 5, least to most self-rating of religiosity
affil: 0 nonaffiliated, 1 affiliated
religion: religion
ritual: self-rated ritual frequency

This would prove to be helpful, especially when we would find out that some variables (e.g., condition) were inconsistent with those used for the plots.

Before recreating the demographic statistics and plots, we had to load the correct libraries, and read the csv.

library(tidyverse)
library(dplyr) #used for cleaning up data
library(janitor) #used to read csv
library(gt) #used to create the table
library(ggplot2) #used to plot graphs for Fig 1 and Fig 2
library(ggdist) #used for raincloud plot in Fig 1
library(gridExtra) #combines plots for Fig 1 and Fig 2
# Data for demographic statistics
demographics <- read_csv("~/Important/University Subjects/PSYC3361/RFiles/verification_report/Nichols_et_al_dataset_V2.0.csv")

# Data for Fig 1
data1 <- read_csv("~/Important/University Subjects/PSYC3361/RFiles/verification_report/Nichols_et_al_dataset_V2.0.csv")

# Data for Fig 2
data2 <- read_csv("~/Important/University Subjects/PSYC3361/RFiles/verification_report/Nichols_et_al_dataset_V2.0.csv")

# Data for Table 1
data3 <- read_csv("~/Important/University Subjects/PSYC3361/RFiles/verification_report/Nichols_et_al_dataset_V2.0.csv")

Then, we could rename the variables, to align with the code used for the plots/table.

data1 <- data1 %>%  
  filter(include == 0) %>% 
  rename(cond = con,
         claimpercent = claim,
         claimmoney = moneyclaim,
         CT_practice = `completion time (practice included)`,
         CT_payments = `completion time (payments only)`,
         religiosity = relig,
         religion = Religion)

For our individual plots, we will also use the select() function from the dplyr package to use the relevant variables.

Demographic statistics

We needed to calculate the number of males and females, as well as the mean age and SD, and number of each site. We also had to relabel the sex and sites with demoSex$sex[demoSex$sex == 0] <- "Female". gt() was used to visualise the data in a table. Additionally, na.rm = TRUE skips all the NA values, fmt_number() changes the values to 1 decimal place and use_seps() removes any digit separators.

demographics <- demographics %>% 
  select(sex, site, include, age) 

demoSex <- demographics %>% 
  count(sex)

demoSite <- demographics %>% 
  filter(include == 0) 

demoSite <- demoSite %>% 
  group_by(site) %>% 
  summarise(mean = mean(age, na.rm = TRUE),
            sd = sd(age, na.rm = TRUE),
            n = n())

demoSex$sex[demoSex$sex == 0] <- "Female"
demoSex$sex[demoSex$sex == 1] <- "Male"

demoSite$site[demoSite$site == 1] <- "USA"
demoSite$site[demoSite$site == 2] <- "Czech Republic"
demoSite$site[demoSite$site == 3] <- "Japan"

gt(demoSite) %>%
  fmt_number(
    columns = c("mean", "sd"),
    decimals = 1,
    use_seps = FALSE)
site mean sd n
USA 25.5 9.8 123
Czech Republic 24.4 3.4 128
Japan 19.8 0.9 157
gt(demoSex)
sex n
Female 228
Male 227
NA 5

Figure 1

Before recreating Figure 1, I used the select() function to create a smaller table with the relevant variables. The mutate function was used to make claimpercent a percentage instead of a decimal.

data1 <- data1 %>% 
  select(site, claimpercent, cond) %>% 
  mutate(claimpercent = claimpercent * 100)

My first step was to figure out what kind of plot Figure 1 was so I could create it using the ggplot2 package.

My first course of action was to check the R Markdown file provided by Nichols and colleagues. I was unable to knit the file because of outdated packages, so I instead tried running the code line by line to see if the code related to Figure 1. However, Victor was able to run the whole R Markdown File and discovered that the file only contained new plots that were not in the paper.

However, we figured out that the columns in Figure 1 was just geom_col() combined with another function, Victor and I tried to find if we could combine geom_area() with with geom_col().

fig1_areacol <- ggplot(data = data1) +
  geom_col(aes(x = claimpercent, y = cond)) +
  geom_area(aes(x = claimpercent, y = cond)) +
  scale_x_continuous(
    name = "Percent Claimed",
    labels = c(0, 20, 40, 60, 80, 100),
    breaks = c(0, 20, 40, 60, 80, 100)) +
  scale_y_discrete(name = NULL) +
  ggtitle(label = "Data By Condition")

plot(fig1_areacol)

Clearly, this was not the right way to go about it. After asking for help on Slack, Jenny S recommended to Google “raincloud plots”.

I came across this blog which showed that ggdist::stat_halfeye() could be used to create the density plot.

fig1_raincloud <- ggplot(data1, aes(x = cond, y = claimpercent)) +
  ggdist::stat_halfeye(
    adjust = .5,
    width = .5,
    .width = 0,
    point_colour = NA
  ) +
  coord_flip()

plot(fig1_raincloud)

Now we have the density plot! I then combined ggdist::stat_halfeye() with geom_col().

fig1_densitycol <- ggplot(data1, aes(x = cond, y = claimpercent)) +
  geom_col(
    aes(x = as.numeric(cond)+.5),
    width = .3
  ) +
  ggdist::stat_halfeye(
    adjust = .5,
    width = .5,
    .width = 0,
    point_colour = NA
  ) +
  coord_flip()

plot(fig1_densitycol)

For some reason, the density plots were “squished” because the claimpercent values were way too high. It was then we realised that geom_col() was summing all of the values rather than taking the means from each condition.

Before calculating the means, we had to rename the conditions so that they better aligned with the original code. We copied the code from their R Markdown file to reorder the conditions, where they made religious (cond = 4) a reference category.

For “Data by Condition”, this made 1 = “Control”, 2 = “White Noise”, 3 = “Secular”, and 4 = “Religious”, and for “Data by Site”, 0 = “USA”, 1 = “Japan”, and 2 = “Czech Republic”.

# data by condition
data1$cond[data1$cond==4] <- 0 
data1$cond[data1$cond==1] <- 4 
data1$cond[data1$cond==3] <- 1 
data1$cond[data1$cond==4] <- 3

# data by site
data1$site[data1$site==1] <- 0 
data1$site[data1$site==3] <- 1 

Then, we calculated the means by multiplying each condition/site with the amount of values they had, then dividing claimpercent by the number of values. We had to use the mutate function from dplyr to create a new variable.

# data by condition

data1 <- data1 %>% 
  mutate(numberOf = (cond == 0) * 100 + (cond == 1 | cond == 2) * 103 + (cond == 3) * 102) %>% 
  mutate(claimpercent2 = claimpercent / numberOf)

# data by site

data1 <- data1 %>% 
  mutate(numberOf2 = (site == 0) * 123 + (site == 2) * 127 + (site == 1) * 156) %>% 
  mutate(claimpercent3 = claimpercent / numberOf2)

And our efforts paid off!

fig1_densitycol2 <- ggplot(data1, aes(x = cond)) +
  geom_col(
    aes(x = as.numeric(cond)+.5, y = claimpercent2),
    width = .3
  ) +
  ggdist::stat_halfeye(
    aes(y = claimpercent),
    adjust = .5,
    width = .5,
    .width = 0,
    point_colour = NA
  ) +
  coord_flip()

plot(fig1_densitycol2)

First of all, we changed both cond and site to a factor with as.numeric(). We had to change it before we plotted but after mutating all the variables we needed because the variables would not calculate a factor. We later use factor() instead, because we realised as.numeric() was the wrong function.

In the ggdist::stat_halfeye section, adjust changed the height so the column and density plot could fit in the graph. .width and point_colour wwere part of the code from the blog, but they were removed removed since they are not in Fig 1.

Then, coord_flip() from ggplot2 was used to flip the graph to match Figure 1.

All we had left to do was:

  • label the x and y axis, as well as the x and y axis values,
  • move the columns below the density plot,
  • add title and subtitle,
  • add the error bars,
  • add the colours,
  • do the same for the “Data by Site” figure.

To calculate the error bars, we “group_cond” and “group_site” so geom_error(). group_by() creates a tibble so we can summarise() to make a new data frame.

The mean was calculated with mean = mean(claimpercent), while n = n() counts the number of rows in the dataframe.

Like na.rm = TRUE, drop_na() from tidyverse removes rows with missing values.

To calculate the confidence intervals, we created new variables “lowerCI” and “upperCI” by calculating se and then plugging it into the CI formula.

“group_cond” and “group_site” were turned into a factor so it would align with our plot later. This is because cond and site will be changed into factors later, so if “group_cond” and “group_site” are not factors, the error bars will not align with the columns.

group_cond <- data1 %>% 
  group_by(cond) %>% 
  summarise(mean = mean(claimpercent, na.rm = TRUE),
            sd = sd(claimpercent, na.rm = TRUE),
            n = n()) %>% 
  drop_na() %>% 
  mutate(se = sd / sqrt(n),
         lowerCI = mean - qt(1 - (0.05/2), n - 1) * se,
         upperCI = mean + qt(1 - (0.05/2), n - 1) * se) %>% 
  ungroup()

group_cond$cond <- as.factor(group_cond$cond)

# data by site
group_site <- data1 %>% 
  group_by(site) %>% 
  summarise(mean = mean(claimpercent, na.rm = TRUE),
            sd = sd(claimpercent, na.rm = TRUE),
            n = n()) %>% 
  drop_na() %>% 
  mutate(se = sd / sqrt(n),
         lowerCI = mean - qt(1 - (0.05/2), n - 1) * se,
         upperCI = mean + qt(1 - (0.05/2), n - 1) * se) %>% 
  ungroup()

group_site$site <- factor(group_site$site)

The final Figure 1 plot needed the following:

  1. Create error bars

geom_errorbar() from ggplot2 plotted the error bars by using the SE and “lowerCI” and “upperCI” from the previous chunk. Then, the width was adjusted so the error bars matched the column width.

  1. Rename the x-axis, y-axis, as well as the axis values.

For scale_x_discrete() and scale_y_continuous(), the arguments did the following:

  • expand = included 0 in the x and y axis,
  • label = named each value,
  • break = made the labels and breaks the same length,
  • name = NULL removed the x axis label (as Figure 1 did not have one) and name = "Percent Claimed" relabeled the y axis title.
  1. Add the aesthetics of the Figure 1 plot.

theme(legend.position = "none") from ggplot2 removed the legend, while scale_fill_manual from the same package added the exact colours used in the original plot, which were obtained by using a colour picking site.

To recreate the title aesthetics, ggtitle() from ggplot2 created the “A.” and “B.” title, while facet_grid() put the “Data by _____” title in a box.

The code for fig_cond and fig_site are the same - just with different colours and labels!

# data by condition

data1$cond <- factor(data1$cond)

fig1_cond <- ggplot(data1, aes(x = cond)) +
  geom_col(
    aes(y = claimpercent2, fill = cond),
    width = .3
  ) +
  ggdist::stat_halfeye(
    aes(y = claimpercent, fill = cond),
    adjust = .5,
    width = .4,
    .width = 0,
    point_colour = NA,
    position = position_nudge(x = 0.17, y = 0)
  ) +
  geom_errorbar(data = group_cond,
                aes(ymin = lowerCI, ymax = upperCI),
                show.legend = NA,
                width = .3
                ) +
  coord_flip() +
  expand_limits(x = 0, y = 0) +
  scale_x_discrete(name = NULL,
                     expand = c(0,0),
                     breaks = c(0, 1, 2, 3),
                     labels = c("Control", "Noise", "Secular", "Religious")
  ) +
  scale_y_continuous(name = "Percent Claimed",
                     expand = c(0,0),
                     breaks = c(0, 20, 40, 60, 80, 100)
                     ) +
  ggtitle(
    label = "Data by condition"
  ) +
  theme_classic() +
  theme(legend.position = "none") +
  scale_fill_manual(values = c("#adadad","#336ca3","#77c2ec","#ecb234")) +
  ggtitle(label = "A.") +
  facet_grid(. ~ "Data by condition")

plot(fig1_cond)

Then, all we had to do was use grid.arrange() from gridExtra to combine the two plots!

grid.arrange(fig1_site, fig1_cond, ncol = 2)

We really appreciated the power of Google for Figure 1 - without that raincloud plot blog we would have never figured out how to plot this!

Figure 2

Like we did for Figure 1, we had to select the relevant variables for Figure 2. To match with Nichols’ code, we had to reverse code “ritual” by subtracting 7 (the number of points on the scale) from ritual and taking the absolute value.

As previously done, cond was renamed and changed to a factor.

dataA$cond[dataA$cond==4] <- 0 
dataA$cond[dataA$cond==1] <- 4 
dataA$cond[dataA$cond==3] <- 1 
dataA$cond[dataA$cond==4] <- 3

dataA$cond <- factor(dataA$cond, levels= c(0,1,2,3),
                labels = c("Religious", "Secular", "Noise","Control"))

dataB was created to take the data from all participants in dataA that are in the religious condition, which is needed to plot the SE.

dataB <- dataA %>% 
  filter(cond == "Religious")

We also had to calculate the CI limits by calculating the mean, sd and n for each grouped variable, then applying the CI limit formula.

groupA <- dataA %>% 
  group_by(cond, affil) %>% 
  summarise(mean = mean(claimpercent, na.rm = TRUE),
            sd = sd(claimpercent, na.rm = TRUE),
            n = n()) %>% 
  drop_na() %>% 
  mutate(se = sd / sqrt(n),
         lowerCI = mean - qt(1 - (0.05/2), n - 1) * se,
         upperCI = mean + qt(1 - (0.05/2), n - 1) * se) %>% 
  ungroup()
  
# labelling the levels of religious affiliation 
groupA$affil <- factor(groupA$affil, levels = c(0, 1), 
                       labels = c("Non-affiliated", "Affiliated"))

For the first Figure 2 plot, method = "lm" from ggplot2 drew a line of best fit. geom_smooth (also from ggplot2) was used to plot figA and figB, while coord_cartesian() was used to set the ylimits without losing data points.

figA <- ggplot(dataA, aes(religiosity, claimpercent, color = cond)) +
  geom_smooth(method = "lm", se = FALSE, size = 1.5) +
  geom_smooth(data = dataB, size = 0, se = TRUE, method = "lm", fill = "#fbf1d8") + 
  theme_light() + 
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(), legend.position = "none") + 
  coord_cartesian(ylim = c(0, 50)) + 
  labs(x = "Religiosity", y = "Percentage claimed") + 
  facet_grid(. ~ "Condition*Religiosity") + 
  scale_color_manual(values = c("#dea520", "#5ab3e4", "#094689", "#a4a4a4")) 

The second plot had the same code as the first plot. As a note, figA and figB needed two geom_smooth() to plot the SE line for only the religious condition, as Figure 2 did.

While figA and figB were the same, figC differed because they had CI limits. As such, geom_line() was used to create the connecting lines in figC, and we used position_dodge to move each line to stop them from overlapping.

figC <- ggplot(groupA, aes(affil, mean, color = cond)) +
  geom_line(aes(group = cond), position = position_dodge(width = 0.5), size = 1.5) + 
  geom_errorbar(data = groupA, aes(ymin = lowerCI, ymax = upperCI), width = 0.2, position = position_dodge(width = 0.5), size = 1.3) +
  theme_light() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  ylim(0, 50) +
  scale_x_discrete() +
  labs(x = "Religious affiliation", y = "Percentage claimed") +
  facet_grid(. ~ "Condition*Religious affiliation") +
  scale_color_manual(values = c("#dea520", "#5ab3e4", "#094689", "#a4a4a4"))

Then, we combined figA, figB and figC.

grid.arrange(figA, figB, figC, ncol = 3)

Caterpillar coding was especially helpful for Figure 2. Victor had trouble with geom_line() but Sam was able to point out a mistake he might have made, which helped Victor solve the problem!

Table 1

Again, we selected the relevant variables.

data3 <- data3 %>% 
  select(-CT_practice, -CT_payments, -CT, -CT_cheat, -Religion_Text, -affil_cong)

The first course of action was to create variables that were not in the csv: negativity, positivity, impact and tempo.

We first looked at the codebook to figure out what the variables were, what they were measuring and how they were calculated from the existing variables in the csv. Doing this revealed that the variables were the averages of specific variables in the csv, so we calculated the averages of the variables listed.

data3 <- data3 %>% 
  mutate(
    negativity = (distressing + irritating + boring + sad)/4,
    positivity = (interesting + pleasant + exciting + relaxing + happy)/5,
    impact = (deep + powerful)/2,
    tempo = (fast + abs(slow-7))/2
  ) 

Then, we calculated the mean and SD of the new variables, as well as variables claimpercent and sacred. However, these values did not calculate new variables that we could put into a table.

data3 %>% 
  group_by(cond) %>%
  summarise(mean_claim = mean(claimpercent, na.rm = TRUE),
            mean_sac = mean(sacred, na.rm = TRUE),
            mean_neg = mean(negativity, na.rm = TRUE),
            mean_pos = mean(positivity, na.rm = TRUE),
            mean_imp = mean(impact, na.rm = TRUE),
            mean_temp = mean(tempo, na.rm = TRUE),
            SD_claim = sd(claimpercent, na.rm = TRUE),
            SD_sac = sd(sacred, na.rm = TRUE),
            SD_neg = sd(negativity, na.rm = TRUE),
            SD_pos = sd(positivity, na.rm = TRUE),
            SD_imp = sd(impact, na.rm = TRUE),
            SD_temp = sd(tempo, na.rm = TRUE)
  )

Therefore, we had to create new individual means, SDs, SEs and CI limits for each variable from each condition.

To do this, we selected the variable from the data, and selected the condition with $. Claimpercent had to be multiplied by 100 to make it a percentage value.

To calculate n, we used length() (which was used in the paper’s R Markdown) which takes the list of responses in the group, though in hindsight we could have also used n(). !is.na was used to remove the NA values.

SE was calculated with the SE formula, which was then plugged into the CI limits formula to create the upper and lower CI limits.

We will also explain how the variables were named. The values were named in the order of descriptive statistic/condition/variable. For example, ‘mean1c’ refers to the ‘mean’ of ‘claimpercent’ (c) in the ‘control group’ (1). For reference, the conditions were: control = 1, white noise = 2, secular = 3, and religious = 4.

For conciseness, I will only show the code for the descriptive statistics of claimpercent variable in the control group, but the code remains the same for the other variables. In hindsight, we should have created a function instead since the code amounted to 150 lines.

# Percent claimed variable
# Control Group (1)
mean1c <- mean(data3$claimpercent[data3$cond=="1"], na.rm = TRUE)*100 # mean
sd1c <- sd(data3$claimpercent[data3$cond=="1"], na.rm = TRUE)*100 # SD
n1c <- length(data3$cond[data3$cond=="1" & !is.na(data3$cond)]) # number of responses
se1c <- sd1c/sqrt(n1c) # standard error
lCI1c <- mean1c - (1.96*se1c) # lower 95% CI
uCI1c <- mean1c + (1.96*se1c) # upper 95% CI

After calculating the descriptive statistics, we needed to calculate Cohen’s d using the formula:

We also only took absolute values because Table 1 did not include any negative Cohen’s d values.

The formatting of the value names were the same as the descriptive statistics - d refers to Cohen’s d, the number refers to the condition, and the letter refers to the variable.

Again, we only showed parts of the code due to its long length, but the formatting of the code was the same for each Cohen’s d values.

# computing Cohen's d
# % claimed
#relig vs sec
d1c <- abs((mean4c-mean3c)/sqrt((sd4c^2+sd3c^2)/2))
#relig vs noise
d2c <- abs((mean4c-mean2c)/sqrt((sd4c^2+sd2c^2)/2))

Now that we had all of the values, we were now able to put it into a table.

We first put the variables into a tibble, with the characteristics including the 6 variables. is.numeric was used to specify that only columns with numeric values will be mutated, and used ~round(., 2) to round the values to 2 decimal places.

Then, gt() placed all the variables into a gt table and labelled the first column with cols_label().

As a note, only the code for the religious table is shown. However, Table 4 differs as we changed the ‘-’ to ’NA’s.

# Religious table
table1 <- tibble(
  characteristics = c("% claimed", "Sacredness", "Negativity", "Positivity", "Tempo", "Impact"),
  M = c(mean4c, mean4s, mean4n, mean4p, mean4t, mean4i),
  SD = c(sd4c, sd4s, sd4n, sd4p, sd4t, sd4i),
  lCI = c(lCI4c, lCI4s, lCI4n, lCI4p, lCI4t, lCI4i),
  uCI = c(uCI4c, uCI4s, uCI4n, uCI4p, uCI4t, uCI4i),
  d = c("-", "-", "-", "-", "-", "-")
) 
table1 %>% mutate_if(is.numeric, ~round(., 2)) %>%
  gt() %>%
  cols_label(characteristics = "Religious")
Religious M SD lCI uCI d
% claimed 27.33 29.92 21.52 33.14 -
Sacredness 5.17 1.39 4.90 5.44 -
Negativity 1.82 0.65 1.70 1.95 -
Positivity 2.54 0.85 2.37 2.70 -
Tempo 2.60 0.81 2.45 2.76 -
Impact 2.91 1.09 2.70 3.12 -
Secular M SD lCI uCI d
% claimed 30.32 30.98 24.33 36.30 0.10
Sacredness 4.32 1.35 4.06 4.58 0.62
Negativity 1.64 0.51 1.55 1.74 0.31
Positivity 2.72 0.78 2.56 2.87 0.22
Tempo 2.78 0.82 2.62 2.94 0.22
Impact 2.73 1.13 2.51 2.95 0.16
White Noise M SD lCI uCI d
% claimed 29.40 31.09 23.39 35.40 0.07
Sacredness 3.26 1.19 3.04 3.49 1.48
Negativity 2.58 0.96 2.39 2.76 0.93
Positivity 1.38 0.62 1.26 1.50 1.56
Tempo 3.99 0.72 3.85 4.13 1.80
Impact 1.76 0.93 1.58 1.94 1.13
Control Group M SD lCI uCI d
% claimed 22.38 23.89 17.69 27.06 0.18
Sacredness NA NA NA NA NA
Negativity NA NA NA NA NA
Positivity NA NA NA NA NA
Tempo NA NA NA NA NA
Impact NA NA NA NA NA

To combine the individual tables, we used data.frame to create a data frame including all four tables.

data_tables <- data.frame(Religious = table1, 
                          Secular = table2,
                          WhiteNoise = table3,
                          Control = table4)

We were now almost done! We just had to change the aesthetics of the table to match Table 1.

tab_source_note() moved the source note to the bottom of the table and tab_spanner() was used to label each column by each condition group.

gt_tbl <- gt(data_tables) %>% 
  tab_source_note(
    source_note = "M = Mean; SD = Standard Deviation; CI = 95% Confidence Intervals. Cohen's d represents the effect size of comparisons between musical conditions.") %>%
  tab_spanner(
    label = "Religious (n = 102)",
    columns = c(Religious.M, Religious.SD, Religious.lCI, Religious.uCI, Religious.d)
  ) %>%
  tab_spanner(
    label = "Secular (n = 103)",
    columns = c(Secular.M, Secular.SD, Secular.lCI, Secular.uCI, Secular.d)
  ) %>%
  tab_spanner(
    label = "White Noise (n = 103)",
    columns = c(WhiteNoise.M, WhiteNoise.SD, WhiteNoise.lCI, WhiteNoise.uCI, WhiteNoise.d)
  ) %>%
  tab_spanner(
    label = "Control (n = 100)",
    columns = c(Control.M, Control.SD, Control.lCI, Control.uCI, Control.d)
  )

Finally, we renamed all the column labels and rounded the values to 2 decimal places. cols_hide() removed any repeating characteristics in columns.

One problem was that fmt_number() from gt would not work with the ‘-’ values in the table, so we temporarily changed ‘-’ to ‘NA’, then at the end we changed ‘NA’ back to ‘-’ by using fmt_missing, also from gt.

gt_tbl %>% cols_label(Religious.characteristics = NULL, 
             Religious.M = "M",
             Religious.SD = "SD",
             Religious.lCI = "lCI",
             Religious.uCI = "uCI",
             Religious.d = "d",
             Secular.characteristics = "Secular",
             Secular.M = "M",
             Secular.SD = "SD",
             Secular.lCI = "lCI",
             Secular.uCI = "uCI",
             Secular.d = "d",
             WhiteNoise.characteristics = "White Noise",
             WhiteNoise.M = "M",
             WhiteNoise.SD = "SD",
             WhiteNoise.lCI = "lCI",
             WhiteNoise.uCI = "uCI",
             WhiteNoise.d = "d",
             Control.characteristics = "Control",
             Control.M = "M",
             Control.SD = "SD",
             Control.lCI = "lCI",
             Control.uCI = "uCI",
             Control.d = "d")%>% 
  fmt_number(
    columns = c(Religious.M, Religious.SD, Religious.lCI, Religious.uCI, Secular.M, Secular.SD, Secular.lCI, Secular.uCI, Secular.d, WhiteNoise.M, WhiteNoise.SD, WhiteNoise.lCI, WhiteNoise.uCI, WhiteNoise.d, Control.M, Control.SD, Control.lCI, Control.uCI, Control.d),
    decimals = 2,
    use_seps = FALSE) %>% 
  cols_hide(
    columns = c(Secular.characteristics, WhiteNoise.characteristics, Control.characteristics)) %>% 
  fmt_missing(
  columns = everything())
M Religious (n = 102) Secular (n = 103) White Noise (n = 103) Control (n = 100)
M SD lCI uCI d M SD lCI uCI d M SD lCI uCI d M SD lCI uCI d
% claimed 27.33 29.92 21.52 33.14 - 30.32 30.98 24.33 36.30 0.10 29.40 31.09 23.39 35.40 0.07 22.38 23.89 17.69 27.06 0.18
Sacredness 5.17 1.39 4.90 5.44 - 4.32 1.35 4.06 4.58 0.62 3.26 1.19 3.04 3.49 1.48
Negativity 1.82 0.65 1.70 1.95 - 1.64 0.51 1.55 1.74 0.31 2.58 0.96 2.39 2.76 0.93
Positivity 2.54 0.85 2.37 2.70 - 2.72 0.78 2.56 2.87 0.22 1.38 0.62 1.26 1.50 1.56
Tempo 2.60 0.81 2.45 2.76 - 2.78 0.82 2.62 2.94 0.22 3.99 0.72 3.85 4.13 1.80
Impact 2.91 1.09 2.70 3.12 - 2.73 1.13 2.51 2.95 0.16 1.76 0.93 1.58 1.94 1.13
M = Mean; SD = Standard Deviation; CI = 95% Confidence Intervals. Cohen's d represents the effect size of comparisons between musical conditions.
  missing_text = "-"

Now we were done with Table 1!

The lack of a codebook/comprehensive R Markdown file made Table 1 especially hard! This made us really appreciate the power of rubber duck coding.