Calculate the Efficacy vs. Disease; Explain your results.
The vaccine's efficacy appears to be much lower for adults over 50 than for adults under 50: about 0.744 for the under-50 group versus about -0.696 for the over-50 group. I assume this is because health concerns tend to accumulate with age. Since the virus is more dangerous to people with certain health conditions, I assume that is the reason for the negative efficacy in the over-50 group compared with the under-50 group.
#creates a new dataframe from a csv file instead of an xlsx file;
#skip=2 skips the first two lines of the file, and the column names come from names_list
new_data <- link2 %>%
  read_csv(skip=2,col_names=names_list)
## Rows: 14 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Age, pop_no_vax
## dbl (4): ...1, pop_full_vax, sev_no_vax_p100k, sev_full_vax_p100k
## lgl (1): Efficacy
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#name the row for percentage of population below 50
new_data$Age[2]<-'percentage_<50'
#name the row for percentage of population above 50
new_data$Age[4]<-'percentage_>50'
#recreate the Efficacy column, which was read in empty (all NA), and fill it with 0
new_data['Efficacy']<- 0
#efficacy for the <50 row: 1 - (vaccinated rate / unvaccinated rate)
new_data['Efficacy'][1,] <- 1-(new_data['sev_full_vax_p100k'][1,]/new_data['sev_no_vax_p100k'][1,])
#efficacy for the >50 row, using the same formula
new_data['Efficacy'][3,] <- 1-(new_data['sev_full_vax_p100k'][3,]/new_data['sev_no_vax_p100k'][3,])
#keep the first four rows and drop the first column (the unnamed index, ...1)
new_data <- new_data[1:4,2:7]
new_data
## # A tibble: 4 × 6
##   Age            pop_no_vax pop_full_vax sev_no_vax_p100k sev_full_vax_p1… Efficacy
##   <chr>          <chr>             <dbl>            <dbl>            <dbl>    <dbl>
## 1 <50            1116834         3501118               43               11    0.744
## 2 percentage_<50 0.233              0.73               NA               NA    0
## 3 >50            186078          2133516              171              290   -0.696
## 4 percentage_>50 0.079             0.904               NA               NA    0
From your calculation of efficacy vs. disease, are you able to compare the rate of severe cases in unvaccinated individuals to that in vaccinated individuals?
The rates of severe cases in vaccinated and unvaccinated individuals can be compared because both are expressed per 100,000 people. Each rate is calculated by dividing the number of severe cases by the group's population and multiplying by 100,000, which is done precisely so that groups of different sizes can be compared. Efficacy is derived from these two rates: one minus the vaccinated rate divided by the unvaccinated rate.
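As a minimal sketch of that efficacy calculation, the two rates per 100,000 can be copied from the table above and combined directly. The data frame name `rates` and its `efficacy` column are chosen here for illustration only and are not part of the original analysis.

```r
# Severe-case rates per 100,000, copied from the table above
rates <- data.frame(
  age = c("<50", ">50"),
  sev_no_vax_p100k = c(43, 171),
  sev_full_vax_p100k = c(11, 290)
)

# Efficacy = 1 - (rate in vaccinated / rate in unvaccinated)
rates$efficacy <- 1 - rates$sev_full_vax_p100k / rates$sev_no_vax_p100k

# Rounds to 0.744 for the <50 group and -0.696 for the >50 group
round(rates$efficacy, 3)
```

A value above zero means the vaccinated group had the lower rate of severe cases; a negative value means the vaccinated group had the higher rate.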
The comparison that really needs to be made is not about the efficacy of the vaccine, because the data is really showing how the decisions we make early in life start to affect us by age 50.
This would be a good time for a public health blitz campaign on healthy diets throughout our lives, since, in my view, many of these health complications stem from unhealthy eating habits.
#removes percent signs from the percentages so they can be converted to numeric
new_data<-new_data %>%
mutate_at(.vars = c("pop_no_vax", 'pop_full_vax'),
.funs = gsub,
pattern = '%',
replacement = "")
#modifies the data in the pop_no_vax and pop_full_vax columns by removing commas
#so the values can be converted to numeric
new_data<-new_data %>%
mutate_at(.vars = c("pop_no_vax", 'pop_full_vax'),
.funs = gsub,
pattern = ',',
replacement = "")
#changes data from character to numeric
new_data[, 2:3] <- sapply(new_data[, 2:3], as.numeric)
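The two `gsub` passes and the numeric conversion above could also be combined into a single step in base R. The sketch below is only an illustration: the input strings in `raw` are hypothetical examples of values containing commas and percent signs, not values taken from the original file.

```r
# Hypothetical raw strings containing a thousands separator and a percent sign
raw <- c("1,116,834", "23.3%")

# The character class [%,] matches either symbol, so one gsub call removes both
clean <- as.numeric(gsub("[%,]", "", raw))
clean
```

Note that stripping the percent sign leaves the number as written (23.3 here), so a percentage stored this way would still need to be divided by 100 to become a proportion.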
Do you have enough information to calculate the total population? What does this total population represent?
Yes, there is enough information to calculate the total population above whatever minimum age is eligible for the vaccine. I am not certain whether this country, like the United States, does not allow children under the age of 12 to get the vaccine.
To calculate the population covered by the data set, add the vaccinated and unvaccinated counts within each age group (under 50 and over 50), then add that group's two percentages of the population. Dividing the summed count by the summed percentage gives the estimated total population for that age group.
The unaccounted-for people in the population are more than likely people who are not in the country. A small portion might also be people who have passed away from other causes. The percentage also seems too small to account for children aged 12 and under, so I am assuming they are left out of the data.
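The arithmetic described above can be sketched as follows, using the counts and population shares from the table above. The data frame `pop` and its column names are assumptions made for this illustration.

```r
# Counted people (unvaccinated + fully vaccinated) and their combined share
# of the population for each age group, taken from the table above
pop <- data.frame(
  age = c("<50", ">50"),
  counted = c(1116834 + 3501118, 186078 + 2133516),
  share = c(0.233 + 0.73, 0.079 + 0.904)
)

# Estimated total population = counted people / share of population they represent
pop$estimated_total <- pop$counted / pop$share

# People not covered by either vaccination category
pop$unaccounted <- round(pop$estimated_total - pop$counted)

pop[, c("age", "counted", "unaccounted")]
```

Under these inputs the unaccounted counts come to 177429 for the under-50 group and 40115 for the over-50 group.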
#create new column that adds up the total population
new_data<-new_data %>%
mutate(Total_Pop = as.numeric(pop_no_vax)+ as.numeric(pop_full_vax))
#reorder the columns so that Total_Pop comes right after the two population columns
new_data <- new_data[,c(1,2,3,7,4,5,6)]
new_data
## # A tibble: 4 × 7
##   Age            pop_no_vax pop_full_vax Total_Pop sev_no_vax_p100k sev_full_vax_p1…
##   <chr>               <dbl>        <dbl>     <dbl>            <dbl>            <dbl>
## 1 <50               1116834      3501118   4617952               43               11
## 2 percentage_<50      0.233         0.73     0.963               NA               NA
## 3 >50                186078      2133516   2319594              171              290
## 4 percentage_>50      0.079        0.904     0.983               NA               NA
## # … with 1 more variable: Efficacy <dbl>
#create a new column for the unaccounted for population
new_data<-new_data %>%
mutate(Unaccounted =c(round(Total_Pop[1]/Total_Pop[2]-Total_Pop[1],0),1-Total_Pop[2],round(Total_Pop[3]/Total_Pop[4]-Total_Pop[3],0),1-Total_Pop[4]))
#rearrange the columns to a more intuitive order
new_data <- new_data[,c(1,2,3,4,8,5,6,7)]
new_data
## # A tibble: 4 × 8
##   Age            pop_no_vax pop_full_vax Total_Pop Unaccounted sev_no_vax_p100k
##   <chr>               <dbl>        <dbl>     <dbl>       <dbl>            <dbl>
## 1 <50               1116834      3501118   4617952      177429               43
## 2 percentage_<50      0.233         0.73     0.963      0.0370               NA
## 3 >50                186078      2133516   2319594       40115              171
## 4 percentage_>50      0.079        0.904     0.983      0.0170               NA
## # … with 2 more variables: sev_full_vax_p100k <dbl>, Efficacy <dbl>