This week I hope to begin the reproducibility project and play around with the COVID paper data. Specifically, I hope to reproduce a table from my group's COVID paper.
For reference, here is a photo of the table I intend to reproduce:
One thing I certainly do not know how to do is organise the emotions into two subgroups in the table. I only know how to create two new data frames - positive emotions and negative emotions - based on the variables in the original data set. So for now, let’s work with positive emotions only.
Now that I have the frequencies of each positive emotion grouped into one data frame, I can use the summarise and across functions to acquire the means and SDs of each positive emotion.
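# Mean and SD of every positive-emotion column, ignoring missing values
pos_des <- pos_emot %>%
  summarise(
    across(.cols = everything(),
           .fns = list(M = ~ mean(.x, na.rm = TRUE),
                       SD = ~ sd(.x, na.rm = TRUE))))
pos_des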
## f_calm_M f_calm_SD f_qui_M f_qui_SD f_app_M f_app_SD f_int_M f_int_SD
## 1 2.444929 0.8656906 2.432139 0.8747724 2.404788 0.933181 2.279045 0.8270466
## f_cont_M f_cont_SD f_hap_M f_hap_SD f_rela_M f_rela_SD f_pea_M f_pea_SD
## 1 2.151885 0.9388588 2.134133 0.8042611 2.129103 0.8860399 2.049342 0.9457163
## f_ener_M f_ener_SD f_aff_M f_aff_SD f_amu_M f_amu_SD f_acc_M f_acc_SD
## 1 1.89819 0.8040365 1.888018 0.8607027 1.871287 0.7194135 1.83557 0.8667226
## f_joy_M f_joy_SD f_pro_M f_pro_SD f_reli_M f_reli_SD f_exc_M f_exc_SD
## 1 1.705069 0.8975795 1.667053 0.9719912 1.478114 0.8819814 1.463719 0.7853215
The descriptive statistics seem to match those in the table. Now that I have the values, it is just a matter of organising the data so that the apa_table function produces a table resembling the one in the paper. After much deliberation, this is the best process I could come up with. Essentially, I rename the mean variables to the names of each emotion and create two new data frames - one with a long version of the names and means, the other with a long version of the SDs. I will combine the two data frames later.
# Give the mean columns readable emotion names (SD columns keep their f_ prefix)
pos_des <- pos_des %>%
  rename(Calm = f_calm_M, Quiet = f_qui_M, Appreciative = f_app_M,
         Interested = f_int_M, Content = f_cont_M, Happy = f_hap_M,
         Relaxed = f_rela_M, Peaceful = f_pea_M, Energetic = f_ener_M,
         Affectionate = f_aff_M, Amused = f_amu_M, Accomplished = f_acc_M,
         Joyful = f_joy_M, Proud = f_pro_M, Relieved = f_reli_M,
         Excited = f_exc_M)
# Long data frame of emotion names and means
pos_pivot1 <- pos_des %>%
  select(-starts_with("f_")) %>%
  pivot_longer(cols = everything(),
               names_to = "Positive Emotion",
               values_to = "M")
# Long data frame of SDs only (names dropped)
pos_pivot2 <- pos_des %>%
  select(starts_with("f_")) %>%
  pivot_longer(cols = everything(),
               names_to = NULL,
               values_to = "SD")
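An alternative to building two separate long data frames is to let pivot_longer split each column name into an emotion code and a statistic. The sketch below is only an illustration under assumptions: toy_des is a made-up two-emotion stand-in for pos_des as it looks before the rename step, and the pattern assumes every column is named f_<code>_M or f_<code>_SD.

```r
library(tidyr)

# Hypothetical one-row summary using the same naming scheme as pos_des
# before renaming (two emotions only, values rounded from the output above).
toy_des <- data.frame(
  f_calm_M = 2.44, f_calm_SD = 0.87,
  f_qui_M  = 2.43, f_qui_SD  = 0.87
)

# The first capture group becomes the "emotion" column; ".value" tells
# pivot_longer to turn the second group (M or SD) into separate columns.
toy_long <- pivot_longer(
  toy_des,
  cols = everything(),
  names_to = c("emotion", ".value"),
  names_pattern = "f_(.*)_(M|SD)"
)
toy_long
```

With this approach the emotion column holds the short codes (calm, qui), so they would still need recoding to the full emotion names before the table is printed.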
At this point, I decided to try to produce the 95% confidence intervals for each mean. Confidence intervals are easy to produce using the t.test function. However, I decided to calculate them manually so that I can create a variable for each CI limit.
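# Means and SDs in separate one-row data frames
pos_emot_mean <- pos_emot %>%
  summarise(across(.cols = everything(), .fns = list(M = ~ mean(.x, na.rm = TRUE))))
pos_emot_sd <- pos_emot %>%
  summarise(across(.cols = everything(), .fns = list(SD = ~ sd(.x, na.rm = TRUE))))
# n is fixed at the full sample size; a column with missing values would have a smaller n
n <- 945; xbar <- pos_emot_mean; s <- pos_emot_sd
# Critical t value for a 95% CI with n - 1 degrees of freedom
tstar <- qt(0.975, n - 1)
# Columns are matched by position, so xbar and s must list the emotions in the same order
pos_ll <- xbar - tstar * (s / sqrt(n))
pos_ul <- xbar + tstar * (s / sqrt(n))
# Row 1 = lower limits, row 2 = upper limits
pos_CI <- bind_rows(pos_ll, pos_ul)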
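As a sanity check on the manual formula, it can be compared against t.test for a single column. This is a minimal sketch on simulated ratings, not the study data: the vector x and its 0 to 4 range are assumptions made purely for illustration.

```r
# Simulated ratings standing in for one emotion column (not the study data).
set.seed(1)
x <- sample(0:4, size = 945, replace = TRUE)

# Manual 95% CI: mean plus or minus the critical t value times the standard error.
n <- sum(!is.na(x))
tstar <- qt(0.975, df = n - 1)
manual_ci <- mean(x) + c(-1, 1) * tstar * sd(x) / sqrt(n)

# CI reported by a one-sample t.test, stripped of its attributes.
ttest_ci <- as.numeric(t.test(x)$conf.int)

# The two intervals should agree up to floating-point error.
all.equal(manual_ci, ttest_ci)
```

Counting n from the non-missing values, as done here, keeps the interval correct for any column that has missing responses.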
Again, we must reorganise the data into long format for the table. At this stage, I am not sure how to create one variable called “CI Limits” that has both limits for each emotion (e.g. [1.17, 2.92]). For now, I must have two variables - one for the lower limit, one for the upper limit.
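# One row per limit: rows 1-16 are lower limits, rows 17-32 are upper limits
table_CI_long <- pos_CI %>%
  pivot_longer(
    cols = everything())
table_ll <- table_CI_long %>%
  slice(1:16)
table_ul <- table_CI_long %>%
  slice(17:32)
# Binding unnamed vectors yields the default names ...1 and ...2
table_CI <- bind_cols(table_ll$value, table_ul$value)
table_CI <- table_CI %>%
  rename("Lower CI Limit" = ...1, "Upper CI Limit" = ...2)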
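One possible way to get a single “CI Limits” column without typing each value by hand is to paste the two limits together with sprintf, which works on whole columns at once. The sketch below is illustrative only: toy_CI and its column names lower and upper are hypothetical, with values copied from the first two rows of the final table in this post.

```r
# Hypothetical stand-in for the two CI limit columns (first two emotions only).
toy_CI <- data.frame(
  lower = c(2.39, 2.38),
  upper = c(2.50, 2.49)
)

# sprintf is vectorised, so every row is formatted in one call;
# "%.2f" keeps two decimal places, including trailing zeros.
toy_CI$`CI Limits` <- sprintf("[%.2f, %.2f]", toy_CI$lower, toy_CI$upper)
toy_CI
```

The new column is a character vector, so the numeric limits are worth keeping around in case they are needed for further calculations.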
Now to bind all the variables together into one data frame and print the table:
pos_des_table <- bind_cols(pos_pivot1, pos_pivot2, table_CI)
apa_table(pos_des_table, caption = "Mean Frequencies of Emotions",
          note = "N = 945. CI = 95% confidence interval")
| Positive Emotion | M | SD | Lower CI Limit | Upper CI Limit |
|---|---|---|---|---|
| Calm | 2.44 | 0.87 | 2.39 | 2.50 |
| Quiet | 2.43 | 0.87 | 2.38 | 2.49 |
| Appreciative | 2.40 | 0.93 | 2.35 | 2.46 |
| Interested | 2.28 | 0.83 | 2.23 | 2.33 |
| Content | 2.15 | 0.94 | 2.09 | 2.21 |
| Happy | 2.13 | 0.80 | 2.08 | 2.19 |
| Relaxed | 2.13 | 0.89 | 2.07 | 2.19 |
| Peaceful | 2.05 | 0.95 | 1.99 | 2.11 |
| Energetic | 1.90 | 0.80 | 1.85 | 1.95 |
| Affectionate | 1.89 | 0.86 | 1.83 | 1.94 |
| Amused | 1.87 | 0.72 | 1.83 | 1.92 |
| Accomplished | 1.84 | 0.87 | 1.78 | 1.89 |
| Joyful | 1.71 | 0.90 | 1.65 | 1.76 |
| Proud | 1.67 | 0.97 | 1.61 | 1.73 |
| Relieved | 1.48 | 0.88 | 1.42 | 1.53 |
| Excited | 1.46 | 0.79 | 1.41 | 1.51 |
Note. N = 945. CI = 95% confidence interval
As I discovered last week, the apa_table function does not produce an APA format table when used in an HTML markdown document. However, if I run the exact same script through an APA template markdown doc with a “docx” output, I get the following table:
This week I managed to successfully reproduce all the values in Table 1 from my group’s COVID article. I also managed to create a table which somewhat resembles the one from the paper. However, I struggled to reorganise the data to fit the apa_table function, and I could not figure out how to efficiently create a single CI variable with both limits without typing in each value manually. I also still have no idea how to create a table with all emotions grouped into “positive” and “negative” like the original table.
The next step for me is to figure out how to refine the table I created to make it look even more like the one from the paper.