I used the read.csv() function to read the data from a CSV file.
vaccineData <- read.csv("https://raw.githubusercontent.com/juliaDataScience-22/cuny-fall-23/manage-acquire-data/vaccine_data.csv")
To format the data, I used the types of data as the column names. I renamed the first and last columns, and then I filled in the age groups so that every row had an age group. To sort through the data, I split it into counts and percentages.
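As a rough illustration of the tidying step that uses pivot_longer(), the sketch below reshapes a small wide table of counts into a long one. The counts are the ones reported in Table 1; the data frame name `counts_wide` and the column names `not_vax` and `fully_vax` are assumptions made for this example and are not the names in the original file.

```r
library(tidyr)

# Hypothetical wide layout: one row per age group, one column per category.
counts_wide <- data.frame(
  age       = c("<50", ">50"),
  not_vax   = c(1116834, 186078),
  fully_vax = c(3501118, 2133516)
)

# Reshape so each row holds one age group / category pair.
counts_long <- pivot_longer(
  counts_wide,
  cols      = c(not_vax, fully_vax),
  names_to  = "categories",
  values_to = "numbers"
)

print(counts_long)
```

The percentages part can be reshaped the same way, after which the two long tables can be joined on the age group and category.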
Next, I used pivot_longer() to make the two parts tidy tables. After this, I combined the tables into a final table. I calculated the missing percentages and the efficacy vs. severe disease. To determine the missing percentages, I used the following formula:
(number of people) / (100,000 people) × 100
To determine the efficacy vs. severe disease, I used the following formula:
1 - (% fully vaxed severe cases per 100K / % not vaxed severe cases per 100K)
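The two formulas can be checked with a short base R sketch that uses the severe cases per 100K reported in Table 1. It assumes the efficacy was computed from percentages already rounded to two decimals, because that reproduces the values in the table; the variable names are made up for this example.

```r
# Severe cases per 100K from Table 1 (order: <50, >50).
not_vax_per_100k   <- c(43, 171)
fully_vax_per_100k <- c(11, 290)

# Percentage of a 100,000-person group, rounded to two decimals.
pct_not_vax   <- round(not_vax_per_100k   / 100000 * 100, 2)
pct_fully_vax <- round(fully_vax_per_100k / 100000 * 100, 2)

# Efficacy vs. severe disease: 1 - (fully vaxed rate / not vaxed rate).
# Assumption: the rounded percentages are used, as the table values suggest.
efficacy <- round(1 - pct_fully_vax / pct_not_vax, 2)

print(data.frame(age = c("<50", ">50"), pct_not_vax, pct_fully_vax, efficacy))
```

With unrounded rates the under-50 value would come out slightly lower (1 - 11/43 is about 0.744), so the rounding order matters at the second decimal.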
Table 1: Vaccination Data and Severe Cases for Israel

| age | categories | numbers | percents | efficacy_vs_severe_disease |
|---|---|---|---|---|
| <50 | Not Vax | 1116834 | 23.30 | 0.75 |
| <50 | Fully Vax | 3501118 | 73.00 | 0.75 |
| <50 | Not Vax per 100K | 43 | 0.04 | 0.75 |
| <50 | Fully Vax per 100K | 11 | 0.01 | 0.75 |
| >50 | Not Vax | 186078 | 7.90 | -0.71 |
| >50 | Fully Vax | 2133516 | 90.40 | -0.71 |
| >50 | Not Vax per 100K | 171 | 0.17 | -0.71 |
| >50 | Fully Vax per 100K | 290 | 0.29 | -0.71 |
There is enough information to calculate the total population. First, you can figure out the population under 50 and the population over 50. Then, you can add them together. To determine the population under 50, do the following calculations:
(Not Vaxed) / (% People Not Vaxed) = 1116834 / 0.233 = 4793278

(Fully Vaxed) / (% People Fully Vaxed) = 3501118 / 0.73 = 4796052
Find the average of those two values for an approximate population: 4794665 people
To determine the population over 50, do the following calculations:
(Not Vaxed) / (% People Not Vaxed) = 186078 / 0.079 = 2355418

(Fully Vaxed) / (% People Fully Vaxed) = 2133516 / 0.904 = 2360084

Find the average of those two values for an approximate population: 2357751 people

Then, add the two numbers together: 4794665 + 2357751 = 7152416 people
The total population represents people who were not vaxed, partially vaxed, and fully vaxed.
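The population estimate above can be reproduced with a few lines of base R. This is a sketch that takes the counts and percentages from Table 1; the helper function `estimate_population()` is a name introduced here for illustration.

```r
# Estimate a group's population by dividing each count by its share of the
# group and averaging the two estimates.
estimate_population <- function(not_vax, pct_not_vax, fully_vax, pct_fully_vax) {
  mean(c(not_vax / (pct_not_vax / 100), fully_vax / (pct_fully_vax / 100)))
}

# Counts and percentages from Table 1.
pop_under_50 <- estimate_population(1116834, 23.3, 3501118, 73.0)
pop_over_50  <- estimate_population(186078, 7.9, 2133516, 90.4)

total_population <- pop_under_50 + pop_over_50

# Roughly 4.79 million, 2.36 million, and 7.15 million people.
print(round(c(pop_under_50, pop_over_50, total_population)))
```

The results can differ from the hand calculation by a person or so, depending on whether the intermediate values are rounded before averaging.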
For the group younger than 50:
vaccineData$efficacy_vs_severe_disease[1]
## [1] 0.75
For the group older than 50:
vaccineData$efficacy_vs_severe_disease[5]
## [1] -0.71
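The positive number means the rate of severe cases was lower among fully vaccinated people than among unvaccinated people, so the vaccine appeared efficacious against severe disease for the group younger than 50. The negative number means the rate of severe cases was higher among fully vaccinated people, so the vaccine did not appear efficacious against severe disease for the group older than 50.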
I can only compare the rates within each age group, since the vaccine appeared to be more effective in the younger population. In the younger population, unvaccinated people had more severe cases per 100K. In the older population, fully vaccinated people had more severe cases per 100K. One difference is that the total population of people under 50 was about twice the total population of people over 50, which skewed the results. Also, the older population is more susceptible to serious disease because of age and possibly other pre-existing conditions that can exacerbate Covid. Therefore, it is challenging to compare the rates of severe cases for unvaccinated and vaccinated people from this data.