WEEK 6 LEARNING LOG
Goals:
My goal this week was to reproduce Table 3 and to take on Jenny’s challenge of making the graph even BETTER!
Challenges:
I tried two different ways of improving the graph… neither of which was overly successful. First, I tried adding a reference line to our bar chart that would display the mean “disobedience to lockdown measures” (totalling the ‘adhere’ variables). I used “geom_hline” (the ggplot2 counterpart of base R’s “abline”), adding the following code to the graph: “+ geom_hline(yintercept = 23.3, col = "red") + geom_hline(yintercept = 39.9, col = "blue")”. This was the product:
There were two issues with this. Firstly, it was very tricky to add these lines to the key/legend. Secondly, the lines could be misunderstood as the average across all variables for those who thought they did and did not have COVID. We can’t add an average line across all variables, as they are measuring different things; however, if the line only depicts the average for the adhere variables, it could be misunderstood.
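One possible way around the legend problem, as a minimal sketch rather than the code from our project: put the reference lines in their own small data frame and map their colour inside aes(), so that ggplot2 builds the legend entries itself. The bar data, the column names and the legend labels below are made up for illustration; only the two values 23.3 and 39.9 come from the attempt described above.

```r
library(ggplot2)

# Hypothetical bar data, invented only so the sketch runs on its own
bars <- data.frame(
  variable   = rep(c("Adhere A", "Adhere B"), each = 2),
  group      = rep(c("Group 1", "Group 2"), times = 2),
  percentage = c(20, 38, 26, 42)
)

# One row per reference line; the labels are placeholders
ref_lines <- data.frame(
  line_label = c("Mean line at 23.3", "Mean line at 39.9"),
  yintercept = c(23.3, 39.9)
)

p <- ggplot(bars, aes(x = variable, y = percentage, fill = group)) +
  geom_col(position = "dodge") +
  # Mapping colour inside aes() is what creates the legend entries
  geom_hline(
    data = ref_lines,
    aes(yintercept = yintercept, colour = line_label),
    linetype = "dashed"
  ) +
  scale_colour_manual(
    name   = "Reference lines",
    values = c("Mean line at 23.3" = "red", "Mean line at 39.9" = "blue")
  )

print(p)
```

This only addresses the legend; it does not fix the second issue, that readers might take the lines as averages across all variables.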
So I tried another idea to improve our graph. I considered adding percentage labels to each variable to make the graph easier to read. I used the following code: “geom_text(aes(label = Percentage), vjust = -1.5, size = 3.5, colour = "black")”. I was hoping for something like this (found on Google):
However, because our bar chart has so many variables, the text was too small, and we also couldn’t centre the text above each bar.
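For the centring problem, a commonly used approach with side-by-side bars is to give geom_text the same dodge position as the bars. The sketch below assumes our bars were dodged by group, which may not match the real chart exactly, and it uses made-up data and column names (only “Percentage” mirrors the label variable in the code above).

```r
library(ggplot2)

# Hypothetical data, invented only so the sketch runs on its own
bars <- data.frame(
  variable   = rep(c("Adhere A", "Adhere B"), each = 2),
  group      = rep(c("Group 1", "Group 2"), times = 2),
  Percentage = c(20, 38, 26, 42)
)

# Share one dodge object so bars and labels are shifted by the same amount
dodge <- position_dodge(width = 0.9)

p_labels <- ggplot(bars, aes(x = variable, y = Percentage, fill = group)) +
  geom_col(position = dodge) +
  geom_text(
    aes(label = Percentage),
    position = dodge,   # without this, both labels sit at the centre of the x category
    vjust    = -0.5,
    size     = 3.5,
    colour   = "black"
  )

print(p_labels)
```

This would not solve the text being too small with so many variables; flipping the chart with coord_flip() or labelling only selected bars might help there, but I have not tested either on our graph.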
After both ideas failed, I was feeling a bit like this:
So I moved on to completing Table 3.
Successes:
I started working on the code for the table, getting the correct output and renaming variables. Bart and Eddie were then able to finish off the table by rearranging the variables and adding chart element titles, with added code using “as.factor”. And now Table 3 is complete!!
Here is the code I developed for the table:
library('haven')
library('ggplot2')
library('tidyverse')
library('janitor')
library('gt')
library('gtsummary')
# Spread the variables into columns, drop the haven labels, and turn numeric codes into factors
table3 <- covid_3 %>%
  pivot_wider(names_from = vars, values_from = values) %>%
  zap_labels() %>%
  mutate_if(is.numeric, as.factor)
table3$Ever_covid <- as.factor(table3$Ever_covid)
table3$Ever_covid <- factor(table3$Ever_covid, labels = c("Think have not had COVID-19", "Think have had COVID-19"))
table3$Adhere_shop_groceries <- factor(table3$Adhere_shop_groceries, labels = c("On one or fewer days in the last week, n = 2389", "On two or more days in the last week, n = 3760"))
table3$Adhere_meet_friends <- factor(table3$Adhere_meet_friends, labels = c("Not at all in the last week, n = 5271", "On one or more days in the last week, n = 878"))
table3$Adhere_shop_other <- factor(table3$Adhere_shop_other, labels = c("Not at all in the last week, n = 1833", "On one or more days in the last week, n = 4316"))
table3$Sx_covid_nomissing <- factor(table3$Sx_covid_nomissing, labels = c("Did not correctly identify symptoms, n = 2390", "Correctly identified common symptoms, n = 3632"))
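# Stack the characteristics into long format, combine count and percentage into one cell,
# then spread the two COVID-19 belief groups into their own columns
table3 <- table3 %>%
  pivot_longer(Adhere_meet_friends:Sx_covid_nomissing, names_to = "Participant Characteristics", values_to = "Levels") %>%
  na.omit() %>%
  mutate(Real = paste0(number, "(", percentage, ")")) %>%
  select(-number, -percentage) %>%
  pivot_wider(id_cols = -Real, names_from = Ever_covid, values_from = Real)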
print(table3)
Bart then took this code and finished off the table.
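To show what the last few steps of that pipeline do, here is a small sketch on made-up data. The toy data frame and its values are invented for illustration; only the column names (Ever_covid, Levels, number, percentage, Real) mirror the code above. It pastes the count and percentage into one cell and then spreads the two groups into columns.

```r
library(dplyr)
library(tidyr)

# Toy data, invented for illustration only
toy <- data.frame(
  Ever_covid = c("No", "Yes", "No", "Yes"),
  Levels     = c("Level 1", "Level 1", "Level 2", "Level 2"),
  number     = c(10, 5, 20, 15),
  percentage = c(33.3, 25.5, 66.7, 74.5)
)

toy_wide <- toy %>%
  # Combine count and percentage into a single text cell, e.g. "10(33.3)"
  mutate(Real = paste0(number, "(", percentage, ")")) %>%
  select(-number, -percentage) %>%
  # One column per group, one row per level
  pivot_wider(names_from = Ever_covid, values_from = Real)

print(toy_wide)
```

The result has one row per level and one column for each Ever_covid group, which is the layout Table 3 needs.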
Next steps:
Next, I will work on the presentation, specifically the part about what we have learned about reproducibility. I will continue to communicate frequently with the group and help other members when needed.