Import the Data

I used the read.csv() function to read the data from a csv file.

vaccineData <- read.csv("https://raw.githubusercontent.com/juliaDataScience-22/cuny-fall-23/manage-acquire-data/vaccine_data.csv")

Format the Data

To format the data, I used the types of data as the column names. I renamed the first and last columns, and then I filled in the age groups so that every row had an age group. To make the data easier to work with, I split it into counts and percentages.
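The fill-in and split steps can be sketched in base R as follows. This is only an illustration: the small `raw` data frame is a hypothetical stand-in for the raw file layout (the real column names and layout are not shown here), and `fill_down()` is a helper defined just for this sketch.

```r
# Hypothetical stand-in for the raw layout: the age group appears only on
# the first row of each pair, and counts and percentages alternate by row.
raw <- data.frame(
  age       = c("<50", NA, ">50", NA),
  not_vax   = c("1116834", "23.3%", "186078", "7.9%"),
  fully_vax = c("3501118", "73.0%", "2133516", "90.4%"),
  stringsAsFactors = FALSE
)

# Carry the last non-missing value forward.
# Assumes the first element of x is not missing.
fill_down <- function(x) {
  idx <- cumsum(!is.na(x))  # position of the most recent non-missing value
  x[!is.na(x)][idx]
}

raw$age <- fill_down(raw$age)

# Rows containing a "%" sign hold percentages; the others hold counts.
is_percent <- grepl("%", raw$not_vax, fixed = TRUE)
counts   <- raw[!is_percent, ]
percents <- raw[is_percent, ]

# Convert the text columns to numbers, dropping the "%" sign where present.
for (col in c("not_vax", "fully_vax")) {
  counts[[col]]   <- as.numeric(counts[[col]])
  percents[[col]] <- as.numeric(sub("%", "", percents[[col]], fixed = TRUE))
}
```

After this, `counts` and `percents` each have one row per age group, which is the shape a reshaping function such as pivot_longer() expects.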

Next, I used pivot_longer() to reshape each of the two parts into a tidy table. After this, I combined the tables into a final table. I calculated the missing percentages (for the per 100K rows) and the efficacy vs. severe disease. To determine the missing percentages, I used the following formula:

(number of people) / (100,000 people) x 100

To determine the efficacy vs. severe disease, I used the following formula:

1 - (% fully vaxed severe cases per 100K / % not vaxed severe cases per 100K)
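As a rough check, the two formulas can be applied to the per 100K values from Table 1 in base R. The data frame and column names below are made up for this sketch. It also assumes the percentages are rounded to two decimals before the division, which appears to be what reproduces the values in the table.

```r
# Severe cases per 100K, taken from Table 1 (column names are illustrative).
per_100k <- data.frame(
  age       = c("<50", ">50"),
  not_vax   = c(43, 171),
  fully_vax = c(11, 290)
)

# Percent = (number of people) / (100,000 people) x 100, rounded to 2 decimals.
per_100k$not_vax_pct   <- round(per_100k$not_vax   / 100000 * 100, 2)
per_100k$fully_vax_pct <- round(per_100k$fully_vax / 100000 * 100, 2)

# Efficacy vs. severe disease = 1 - (fully vaxed % / not vaxed %).
# Assumption: the rounded percentages are used in the ratio.
per_100k$efficacy <- round(1 - per_100k$fully_vax_pct / per_100k$not_vax_pct, 2)

print(per_100k)
```

With the rounded percentages this gives 0.75 for the under-50 group and -0.71 for the over-50 group. Using the unrounded per 100K values instead (1 - 11/43 and 1 - 290/171) gives about 0.74 and -0.70, so the rounding step has a small effect on the result.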

Table 1
Vaccination Data and Severe Cases for Israel

| age | categories | numbers | percents | efficacy_vs_severe_disease |
|-----|------------|---------|----------|----------------------------|
| <50 | Not Vax | 1116834 | 23.30 | 0.75 |
| <50 | Fully Vax | 3501118 | 73.00 | 0.75 |
| <50 | Not Vax per 100K | 43 | 0.04 | 0.75 |
| <50 | Fully Vax per 100K | 11 | 0.01 | 0.75 |
| >50 | Not Vax | 186078 | 7.90 | -0.71 |
| >50 | Fully Vax | 2133516 | 90.40 | -0.71 |
| >50 | Not Vax per 100K | 171 | 0.17 | -0.71 |
| >50 | Fully Vax per 100K | 290 | 0.29 | -0.71 |

Question 1: Do you have enough information to calculate the total population? What does this total population represent?

There is enough information to calculate the total population. First, you can figure out the population under 50 and the population over 50. Then, you can add them together. To determine the population under 50, do the following calculations:

(Not Vaxed) / (% People Not Vaxed) = 1116834 / 0.233 = 4793278

(Fully Vaxed) / (% People Fully Vaxed) = 3501118 / 0.73 = 4796052

Find the average of those two values for an approximate population: 4794665 people

To determine the population over 50, do the following calculations:

(Not Vaxed) / (% People Not Vaxed) = 186078 / 0.079 = 2355418

(Fully Vaxed) / (% People Fully Vaxed) = 2133516 / 0.904 = 2360084

Find the average of those two values for an approximate population: 2357751 people

Then, add the two numbers together: 4794665 + 2357751 = 7152416 people
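The population arithmetic above can be reproduced with a few lines of base R. This is a minimal sketch: `estimate_population()` is a helper name invented for the example, and the counts and percentages are copied from Table 1.

```r
# Estimate a group's total population from a count and the percentage
# of the population that the count represents.
estimate_population <- function(count, percent) {
  count / (percent / 100)
}

# Under 50: average the estimates from the not vaxed and fully vaxed rows.
under50 <- mean(c(
  estimate_population(1116834, 23.3),
  estimate_population(3501118, 73.0)
))

# Over 50: same approach.
over50 <- mean(c(
  estimate_population(186078, 7.9),
  estimate_population(2133516, 90.4)
))

total <- under50 + over50
round(total)
```

The two estimates for each age group differ slightly because the percentages in the table are rounded, which is why averaging them is a reasonable way to get an approximate population. The rounded total comes to 7152416 people.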

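The total population represents everyone covered by the data: people who were not vaxed, partially vaxed, and fully vaxed. The not vaxed and fully vaxed percentages add up to less than 100% in each age group (96.3% and 98.3%), and the remainder is presumably the partially vaxed.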

Question 2: Calculate the Efficacy vs. Disease. Explain your results.

For the group younger than 50:

vaccineData$efficacy_vs_severe_disease[1]
## [1] 0.75

For the group older than 50:

vaccineData$efficacy_vs_severe_disease[5]
## [1] -0.71

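The positive number (0.75) indicates that, in the group younger than 50, the rate of severe cases among fully vaccinated people was about 75% lower than among unvaccinated people, so the vaccine was more efficacious against severe disease for this group. The negative number (-0.71) indicates that, in the group older than 50, the rate of severe cases among fully vaccinated people was about 71% higher than among unvaccinated people, so by this calculation the vaccine was less efficacious against severe disease for this group.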

Question 3: From your calculation of efficacy vs. disease, are you able to compare the rate of severe cases in unvaccinated individuals to that in vaccinated individuals?

I can only compare the rates within each age group, since the vaccine appeared to be more effective in the younger population. In the younger population, the rate of severe cases was higher among people without a vaccination (43 vs. 11 per 100K). In the older population, the rate of severe cases was higher among people who were fully vaccinated (290 vs. 171 per 100K). One difference is that the total population of people under 50 was about two times the total population of people over 50, which skews any comparison across the two groups. Also, the older population is more susceptible to serious disease because of age and possibly other pre-existing conditions that can exacerbate Covid. Therefore, it is challenging to compare the rates of severe cases for unvaccinated and vaccinated people from this data.