Israeli Population research
Who is eligible for vaccine?
Persons 12 years or older
source: https://govextra.gov.il/ministry-of-health/covid19-vaccine/en-covid19-vaccination-information/
How many are eligible for vaccine?
28% are under 14 years old. So we know that atleast 72% (1.00 - 0.28) of the population is eligible.
This would equate roughly to [9,053,000 * 0.72] = 6,518,160 eligible persons
load libraries
library(tidyverse)## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.4 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 2.0.1 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(stringr)Reading and Cleaning Data
il_vax <- read_csv("https://raw.githubusercontent.com/man-of-moose/masters_607/main/homework_4/efficacy_data_clean.csv")## New names:
## * `` -> ...3
## * `` -> ...5
## Rows: 5 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (6): Age, Population %, ...3, Severe Cases, ...5, Efficacy
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Inspect the table
il_vaxFill in age column using fill()
This replaces any Null values with the value above it.
il_vax <- il_vax %>% fill(Age)Remove the first row
This row does not contain any useful information. It’s better to remove, and to rename columns
il_vax <- il_vax[-1,]
colnames(il_vax) <- c("age","not_vax","vax","severe_no_vax","severe_vax","efficacy")il_vaxTemporarily split out population count values with their percent counterparts
vac_percentages <- il_vax[-c(1,3),] %>%
select(age,not_vax,vax)
il_vax <- il_vax[c(1,3),]vac_percentagesil_vaxuse gather() function to convert il_vax and vac_percentages to long
First we will convert vac_percentaes to long
vac_percentages <- vac_percentages %>%
gather("vax_status","age_pop_percent",2:3) %>%
mutate(
age_pop_percent = as.double(age_pop_percent)
)vac_percentagesThen we can use the same techniques to convert il_vax to long. Here we use gather twice, once for vax_status, and another for severe cases.
il_vax <- il_vax %>%
gather("vax_status","pop_count",2:3) %>%
gather("severe_cases","severe_count",2:3)il_vaxRemove redundant rows
Using gather() on severe_cases was important for making this dataframe long, but it also created a number of redundant rows. We use filter to choose only rows where vax_status matches severe_cases (vax -> vax, or no_vax -> no_vax)
We can then remove the column severe_cases, because severe_count holds the actual value we care about.
We can finally create a new column which is the proportion of severe_cases to total group population.
il_vax <- il_vax %>%
filter((vax_status=="not_vax" & severe_cases=="severe_no_vax") |
(vax_status=="vax" & severe_cases=="severe_vax")) %>%
select(age,vax_status,pop_count,severe_count) %>%
mutate(
pop_count = as.integer(pop_count),
severe_count = as.integer(severe_count),
severe_count_prop = severe_count / pop_count
)il_vaxMerge vax_table and vac_percentages
We can now combine back in the population vaccination percentages by age, captured in the vax_percentages table.
il_vax <- left_join(il_vax,vac_percentages,by=c("age","vax_status")) %>%
select(age,
vax_status,
pop_count,
age_pop_percent,
severe_count,
severe_count_prop)il_vaxQuestions
Do you have enough information to calculate total population? What does this total population represent?
Yes, we can calculate the population of “individuals eligible for vaccination in Israel” based on the data that we have here.
calculate_total_population <- function(age_group, data) {
temp_table <- data %>% filter(age==age_group)
pop_count_sum <- sum(temp_table$pop_count)
age_pop_percent_sum <- sum(temp_table$age_pop_percent)
return(pop_count_sum/age_pop_percent_sum)
}calculate_total_population("<50",il_vax) +
calculate_total_population(">50",il_vax)## [1] 7155090
This does not match our expected population of 6.5M for total eligible voters shown at the start of this assignment (and sourced from wikipedia). However, that statistic was generated using the proportion of people ages 0-14, where for this problem we are actually interested in the proportion of people ages 0-12.
Calculate the Efficacy vs Disease; Explain your results
In order to keep our data tidy, we need to create a new table for efficacy. We can always join this table to the other one later.
efficacy_table <- tibble(
age = unique(il_vax$age)
)calculate_efficacy <- function(age_group, table) {
vax_prop <- table %>%
filter(age==age_group & vax_status=="vax") %>%
.$severe_count_prop
no_vax_prop <- table %>%
filter(age==age_group & vax_status=="not_vax") %>%
.$severe_count_prop
ret <- 1-(vax_prop/no_vax_prop)
return(ret)
}efficacy_table <- efficacy_table %>%
mutate(efficacy = calculate_efficacy(age,il_vax))efficacy_tableAs we can see from the above efficacy_table, vaccines are incredibly effective for both age groups. Interpreting the above data, we can say that:
- vaccines prevent 91% of severe cases for individuals under 50 years of age.
- vaccines prevent 85% of severe cases for individuals over 50 years of age.
From your calculation of efficacy vs disease, are you able to compare the rate of severe cases in unvaccinted individuals to that in vaccinated individuals?
Yes, by comparing the proportions of [severe_count / pop_count] between vaccinated and unvaccinated peoples we can see the difference in severe cases rate.
under_50_not_vax_severe_proportion <- il_vax %>%
filter(age=="<50", vax_status=="not_vax") %>%
.$severe_count_prop
under_50_vax_severe_proportion <- il_vax %>%
filter(age=="<50", vax_status=="vax") %>%
.$severe_count_prop
# comparing difference in proportions
under_50_not_vax_severe_proportion / under_50_vax_severe_proportion## [1] 12.25445
In the under-50 bracket, unvaccinated people end up with severe cases at a rate 12 times greater than vaccinated people.
over_50_not_vax_severe_proportion <- il_vax %>%
filter(age==">50", vax_status=="not_vax") %>%
.$severe_count_prop
over_50_vax_severe_proportion <- il_vax %>%
filter(age==">50", vax_status=="vax") %>%
.$severe_count_prop
# comparing difference in proportions
over_50_not_vax_severe_proportion / over_50_vax_severe_proportion## [1] 6.760814
In the over-50 bracket, unvaccinated people end up with severe cases at a rate 6 times greater than vaccinated people.