Questions

  1. Do you have enough information to calculate the total population? What does this total population represent?
  2. Calculate the Efficacy vs. Disease; Explain your results.
  3. From your calculation of efficacy vs. disease, are you able to compare the rate of severe cases in unvaccinated individuals to that in vaccinated individuals?

Question

Calculate the Efficacy vs. Disease; Explain your results.

It appears that the vaccine's efficacy is lower for adults over 50 than for adults under 50: the calculation gives 0.744 for the under-50 group and -0.696 for the over-50 group. I would assume this is because we tend to have more health concerns as we age. Since the virus is more deadly to people with certain health conditions, I would assume this explains the negative efficacy for the over-50 group, compared with the positive value for the group under 50.

#creates a new data frame from a csv file, instead of an xlsx file
new_data <- link2 %>%
  #skips the first two lines of the csv file and applies the new header names from names_list
  read_csv(skip=2,col_names=names_list)
## Rows: 14 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Age, pop_no_vax
## dbl (4): ...1, pop_full_vax, sev_no_vax_p100k, sev_full_vax_p100k
## lgl (1): Efficacy
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#name the row for percentage of population below 50
new_data$Age[2]<-'percentage_<50'
#name the row for percentage of population above 50
new_data$Age[4]<-'percentage_>50'
#reset the Efficacy column, which was read in empty (all NA), to 0
new_data['Efficacy']<- 0
#efficacy for the under 50 group: 1 - (vaccinated rate / unvaccinated rate)
new_data['Efficacy'][1,] <- 1-(new_data['sev_full_vax_p100k'][1,]/new_data['sev_no_vax_p100k'][1,])
#efficacy for the over 50 group, using the same formula
new_data['Efficacy'][3,] <- 1-(new_data['sev_full_vax_p100k'][3,]/new_data['sev_no_vax_p100k'][3,])
#keep the four data rows and drop the first, unnamed column (...1)
new_data <- new_data[1:4,2:7]
new_data
## # A tibble: 4 × 6
##   Age            pop_no_vax pop_full_vax sev_no_vax_p100k sev_full_vax_p1… Efficacy
##   <chr>          <chr>             <dbl>            <dbl>            <dbl>    <dbl>
## 1 <50            1116834     3501118                   43               11    0.744
## 2 percentage_<50 0.233             0.73                NA               NA    0    
## 3 >50            186078      2133516                  171              290   -0.696
## 4 percentage_>50 0.079             0.904               NA               NA    0

Question

From your calculation of efficacy vs. disease, are you able to compare the rate of severe cases in unvaccinated individuals to that in vaccinated individuals?

The efficacy and the rate of severe cases in vaccinated and unvaccinated individuals can be compared because both are ratios. Efficacy is derived from the severe cases per 100 thousand in the vaccinated and unvaccinated groups. The rate of severe cases is calculated by dividing the number of cases by the population and multiplying by a hundred thousand. This is done precisely so that groups of different sizes can be compared.
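As a minimal sketch of the two ratios described above, the helper functions below (`rate_per_100k` and `efficacy` are names assumed here for illustration, not part of the original analysis) recompute the efficacy values from the rates in the table above. The case count and population passed to `rate_per_100k` are hypothetical and only show the arithmetic.

```r
# Severe cases per 100,000 people: cases / population * 100,000.
rate_per_100k <- function(cases, population) {
  cases / population * 1e5
}

# Efficacy vs. severe disease: 1 - (vaccinated rate / unvaccinated rate).
efficacy <- function(rate_vax, rate_unvax) {
  1 - rate_vax / rate_unvax
}

# Hypothetical numbers, only to illustrate the rate formula.
example_rate <- rate_per_100k(cases = 50, population = 200000)

# Rates per 100k copied from the table above.
eff_under_50 <- efficacy(rate_vax = 11, rate_unvax = 43)
eff_over_50  <- efficacy(rate_vax = 290, rate_unvax = 171)

round(c(under_50 = eff_under_50, over_50 = eff_over_50), 3)
```

Rounded to three decimals, the two values agree with the Efficacy column computed earlier (0.744 and -0.696).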

The comparison that really needs to be made is not about the efficacy of the vaccine, as the data is really showing how our decisions in early life start to affect us at age 50.

This would be a good time for a public health blitz campaign on healthy diets throughout our lives, as most of the health complications stem from unhealthy eating habits.

#removes percent signs from the percentages so they can be turned into numeric values
new_data<-new_data %>%
  mutate_at(.vars = c("pop_no_vax", 'pop_full_vax'),
            .funs = gsub,
            pattern = '%',
            replacement = "")
#modifies the data in the pop_no_vax and pop_full_vax columns by removing commas
#so the values can be converted to numeric
new_data<-new_data %>%
  mutate_at(.vars = c("pop_no_vax", 'pop_full_vax'),
            .funs = gsub,
            pattern = ',',
            replacement = "")
#changes data from character to numeric
new_data[, 2:3] <- sapply(new_data[, 2:3], as.numeric)

Question

Do you have enough information to calculate the total population? What does this total population represent?

Yes, there is enough information to calculate the total population above whatever minimum age is allowed to get the vaccine. I'm not certain whether they, like the United States, do not allow children under the age of 12 to get the vaccine.

To calculate the population of the data set, work one age group at a time (under 50, then over 50): add the group's vaccinated and unvaccinated populations, then add their percentages of the population. Divide the sum of the populations by the sum of the percentages to get the estimated total for that group.
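The calculation can be sketched on its own as follows. This is an illustrative version only: the data frame `counts` and its column names are assumptions made for this example, with the counts and population shares copied from the table above.

```r
# Counts and population shares copied from the table above.
counts <- data.frame(
  age          = c("<50", ">50"),
  pop_no_vax   = c(1116834, 186078),
  pop_full_vax = c(3501118, 2133516),
  pct_no_vax   = c(0.233, 0.079),
  pct_full_vax = c(0.73, 0.904)
)

# People accounted for in each age group, and the share they represent.
counts$accounted     <- counts$pop_no_vax + counts$pop_full_vax
counts$pct_accounted <- counts$pct_no_vax + counts$pct_full_vax

# Estimated size of each age group: accounted people / accounted share.
counts$est_total <- round(counts$accounted / counts$pct_accounted)

# People in the estimated total who appear in neither population column.
counts$unaccounted <- counts$est_total - counts$accounted

counts[, c("age", "accounted", "pct_accounted", "est_total", "unaccounted")]
```

Adding the two `est_total` values gives the estimated total population covered by the data set.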

The unaccounted-for people in the population are more than likely people who are not in the country. A small portion might also be people who have passed away from other causes. Also, the percentage seems too small to account for children aged 12 and under, so I am assuming they are left out of the data.

#create new column that adds up the total population 
new_data<-new_data %>%
  mutate(Total_Pop = as.numeric(pop_no_vax)+ as.numeric(pop_full_vax))
#reorder columns put total_pop after the two population columns 
new_data <- new_data[,c(1,2,3,7,4,5,6)]
new_data
## # A tibble: 4 × 7
##   Age             pop_no_vax pop_full_vax   Total_Pop sev_no_vax_p100k sev_full_vax_p1…
##   <chr>                <dbl>        <dbl>       <dbl>            <dbl>            <dbl>
## 1 <50            1116834      3501118     4617952                   43               11
## 2 percentage_<50       0.233        0.73        0.963               NA               NA
## 3 >50             186078      2133516     2319594                  171              290
## 4 percentage_>50       0.079        0.904       0.983               NA               NA
## # … with 1 more variable: Efficacy <dbl>
#create a new column for the unaccounted for population
new_data<-new_data %>%
  mutate(Unaccounted = c(
    round(Total_Pop[1]/Total_Pop[2] - Total_Pop[1], 0),
    1 - Total_Pop[2],
    round(Total_Pop[3]/Total_Pop[4] - Total_Pop[3], 0),
    1 - Total_Pop[4]
  ))
#rearrange the columns to a more intuitive order
new_data <- new_data[,c(1,2,3,4,8,5,6,7)]
new_data
## # A tibble: 4 × 8
##   Age             pop_no_vax pop_full_vax   Total_Pop Unaccounted sev_no_vax_p100k
##   <chr>                <dbl>        <dbl>       <dbl>       <dbl>            <dbl>
## 1 <50            1116834      3501118     4617952     177429                    43
## 2 percentage_<50       0.233        0.73        0.963      0.0370               NA
## 3 >50             186078      2133516     2319594      40115                   171
## 4 percentage_>50       0.079        0.904       0.983      0.0170               NA
## # … with 2 more variables: sev_full_vax_p100k <dbl>, Efficacy <dbl>