Goals

This week I hope to begin the reproducibility project and play around with the COVID paper data. Specifically, I hope to reproduce a table from my group’s COVID paper.

For reference, here is a photo of the table I intend to reproduce:

Progress

One thing I certainly do not know how to do is organise the emotions into two subgroups in the table. I only know how to create two new data frames - positive emotions and negative emotions - based on the variables in the original data set. So for now, let’s work with positive emotions only.

Now that I have the frequencies of each positive emotion grouped into one data frame, I can use the summarise and across functions to acquire the means and SDs of each positive emotion.

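# Mean and SD of every positive-emotion column, ignoring missing values.
# na.rm is set inside each function rather than passed through across().
pos_des <- pos_emot %>%
  summarise(
    across(everything(),
           list(M = ~ mean(.x, na.rm = TRUE), SD = ~ sd(.x, na.rm = TRUE))))
pos_des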
##   f_calm_M f_calm_SD  f_qui_M  f_qui_SD  f_app_M f_app_SD  f_int_M  f_int_SD
## 1 2.444929 0.8656906 2.432139 0.8747724 2.404788 0.933181 2.279045 0.8270466
##   f_cont_M f_cont_SD  f_hap_M  f_hap_SD f_rela_M f_rela_SD  f_pea_M  f_pea_SD
## 1 2.151885 0.9388588 2.134133 0.8042611 2.129103 0.8860399 2.049342 0.9457163
##   f_ener_M f_ener_SD  f_aff_M  f_aff_SD  f_amu_M  f_amu_SD f_acc_M  f_acc_SD
## 1  1.89819 0.8040365 1.888018 0.8607027 1.871287 0.7194135 1.83557 0.8667226
##    f_joy_M  f_joy_SD  f_pro_M  f_pro_SD f_reli_M f_reli_SD  f_exc_M  f_exc_SD
## 1 1.705069 0.8975795 1.667053 0.9719912 1.478114 0.8819814 1.463719 0.7853215

The descriptive statistics seem to match those in the table. Now that I have the values, it is just a matter of organising the data properly so that when I run it through the apa_table function, it will produce a table which resembles the one in the paper. After much deliberation, this is the best process I could come up with. Essentially, I rename the mean variables to the names of each emotion and create two new data frames - one with a long version of the names and means, the other with a long version of the SDs. I will combine the two data frames later.

pos_des <- pos_des %>%
  rename(Calm = f_calm_M, Quiet = f_qui_M, Appreciative = f_app_M,
         Interested = f_int_M, Content = f_cont_M, Happy = f_hap_M,
         Relaxed = f_rela_M, Peaceful = f_pea_M, Energetic = f_ener_M,
         Affectionate = f_aff_M, Amused = f_amu_M, Accomplished = f_acc_M,
         Joyful = f_joy_M, Proud = f_pro_M, Relieved = f_reli_M,
         Excited = f_exc_M)

pos_pivot1 <- pos_des %>%
  select(-starts_with("f_")) %>%
  pivot_longer(cols = everything(), 
               names_to = "Positive Emotion",
               values_to = "M")
  
pos_pivot2 <- pos_des %>%
  select(starts_with("f_")) %>%
  pivot_longer(cols = everything(),
               names_to = NULL,
               values_to = "SD")
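As a possible shortcut, the two pivots could be collapsed into one by using the special ".value" name in pivot_longer. The sketch below uses a made-up two-emotion summary (toy_des is a hypothetical stand-in, not the real pos_des) and assumes the columns still carry their original _M and _SD suffixes, i.e. it would run before the rename step.

```r
library(dplyr)
library(tidyr)

# Made-up one-row summary in the same shape as the summarise(across(...)) output
toy_des <- tibble(
  f_calm_M = 2.44, f_calm_SD = 0.87,
  f_qui_M  = 2.43, f_qui_SD  = 0.87
)

# names_pattern splits each column name into two pieces:
#   piece 1 (e.g. "f_calm") goes into the "emotion" column
#   piece 2 ("M" or "SD") becomes a column name because of ".value"
toy_long <- toy_des %>%
  pivot_longer(
    cols = everything(),
    names_to = c("emotion", ".value"),
    names_pattern = "(.*)_(M|SD)$"
  )

toy_long
```

The result has one row per emotion with M and SD side by side, so no second pivot or bind_cols is needed for these two columns. The emotion labels would still have to be recoded to their full names afterwards.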

At this point, I decided to try to produce the 95% confidence intervals for each mean. Confidence intervals are easy to produce using the t.test function. However, I decided to calculate them manually so that I can create a variable for each CI limit.

# Means and SDs in separate one-row data frames (missing values ignored)
pos_emot_mean <- pos_emot %>%
  summarise(across(everything(), list(M = ~ mean(.x, na.rm = TRUE))))

pos_emot_sd <- pos_emot %>%
  summarise(across(everything(), list(SD = ~ sd(.x, na.rm = TRUE))))

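# 95% CI for each mean: xbar +/- tstar * s / sqrt(n)
n <- 945
xbar <- pos_emot_mean
s <- pos_emot_sd
tstar <- qt(0.975, df = n - 1)          # two-sided 95% critical value
pos_ll <- xbar - tstar * (s / sqrt(n))  # lower limits
pos_ul <- xbar + tstar * (s / sqrt(n))  # upper limits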

pos_CI <- bind_rows(pos_ll, pos_ul)
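As a sanity check on the manual formula, the limits for a single variable can be compared against the interval reported by t.test. This is a minimal sketch on simulated values (x is invented here and is not one of the real emotion columns). One caveat worth keeping in mind: t.test drops missing values and uses the number of non-missing observations, so a fixed n = 945 would only match it for columns with no missing data.

```r
# Simulated ratings standing in for one emotion column (not the real data)
set.seed(1)
x <- sample(0:4, size = 200, replace = TRUE)

n     <- length(x)
xbar  <- mean(x)
s     <- sd(x)
tstar <- qt(0.975, df = n - 1)

# Manual limits: xbar +/- tstar * s / sqrt(n)
manual_ci <- c(xbar - tstar * s / sqrt(n), xbar + tstar * s / sqrt(n))

# Limits from a one-sample t-test; as.numeric() drops the conf.level attribute
ttest_ci <- as.numeric(t.test(x)$conf.int)

all.equal(manual_ci, ttest_ci)
```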

Again, we must reorganise the data into long format for the table. At this stage, I am not sure how to create one variable called “CI Limits” that has both limits for each emotion (e.g. [1.17, 2.92]). For now, I must have two variables - one for the lower limit, one for the upper limit.

table_CI_long <- pos_CI %>%
  pivot_longer(
    cols = everything())

table_ll <- table_CI_long %>%
  slice(1:16)

table_ul <- table_CI_long %>%
  slice(17:32)

table_CI <- bind_cols(table_ll$value, table_ul$value)

table_CI <- table_CI %>%
  rename("Lower CI Limit" = ...1, "Upper CI Limit" = ...2)
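One possible way to combine the two limits into a single text column is sprintf inside mutate, which formats each pair of numbers without any manual typing. The sketch below uses a made-up two-row data frame (toy_CI and the column name "95% CI" are hypothetical choices, not taken from the paper).

```r
library(dplyr)

# Made-up limits in the same shape as table_CI (two numeric columns)
toy_CI <- tibble(
  `Lower CI Limit` = c(2.39, 2.38),
  `Upper CI Limit` = c(2.50, 2.49)
)

# sprintf is vectorised, so each row gets its own "[lower, upper]" string
# with both limits rounded to two decimal places
toy_CI <- toy_CI %>%
  mutate(`95% CI` = sprintf("[%.2f, %.2f]", `Lower CI Limit`, `Upper CI Limit`))

toy_CI$`95% CI`
```

Because the new column is character rather than numeric, it is best created as the last step, after any calculations on the limits are finished.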

Now to bind all the variables together into one data frame and print the table:

pos_des_table <- bind_cols(pos_pivot1, pos_pivot2, table_CI)

apa_table(pos_des_table, caption = "Mean Frequencies of Emotions",
          note = "N = 945. CI = 95% confidence interval")
Mean Frequencies of Emotions

Positive Emotion   M      SD     Lower CI Limit   Upper CI Limit
Calm               2.44   0.87   2.39             2.50
Quiet              2.43   0.87   2.38             2.49
Appreciative       2.40   0.93   2.35             2.46
Interested         2.28   0.83   2.23             2.33
Content            2.15   0.94   2.09             2.21
Happy              2.13   0.80   2.08             2.19
Relaxed            2.13   0.89   2.07             2.19
Peaceful           2.05   0.95   1.99             2.11
Energetic          1.90   0.80   1.85             1.95
Affectionate       1.89   0.86   1.83             1.94
Amused             1.87   0.72   1.83             1.92
Accomplished       1.84   0.87   1.78             1.89
Joyful             1.71   0.90   1.65             1.76
Proud              1.67   0.97   1.61             1.73
Relieved           1.48   0.88   1.42             1.53
Excited            1.46   0.79   1.41             1.51

Note. N = 945. CI = 95% confidence interval

 

As I discovered last week, the apa_table function does not produce an APA-format table when used in an HTML markdown document. However, running the exact same script through an APA template markdown doc with “docx” output produces the following table:


Challenges and Successes

This week I successfully reproduced all the values in Table 1 from my group’s COVID article. I also created a table which somewhat resembles the one from the paper. However, I struggled to reorganise the data to fit the apa_table function, and I could not figure out how to efficiently create a single CI variable containing both limits without typing in each value manually. I also still have no idea how to create a table with all emotions grouped into “positive” and “negative” like the original table.

Next Step

The next step for me is to figure out how to refine the table I created so that it looks even more like the one from the paper.