Problem Statement

With COVID-19 vaccines widely available in California and the push for individuals to get vaccinated, there has been concern about the relationship between age and vaccination status. Specifically, there is worry that, of the 58 counties in California, those with a low median age might be less likely to have a high proportion of fully vaccinated individuals. Our team will investigate the correlation, if any, between median age and the proportion of fully vaccinated individuals. We will use the California Vaccine Progress Dashboard Data and the California Demographics Data to analyze this relationship at the county level and to determine which counties are in need of more support.

Methods

We used two datasets for this project: the California Vaccine Progress Dashboard Data and the California Demographics Data.

The California Vaccine Progress Dashboard Data was accessed through the California Department of Public Health, and its sources include the California Immunization Registry and the American Community Survey’s 2015-2019 5-Year data. We filtered the data to the most recent date (9/14/21) to obtain an accurate and current assessment of vaccination status. The other variables we selected were: zip code, county, age 12+ population, fully vaccinated (counts), and whether or not the data had been redacted. This final variable would enable us to impute county-level averages of fully vaccinated counts and so obtain a more complete picture.
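The filtering and column selection described above can be sketched as follows. This is a minimal illustration on toy rows, not the code used for the analysis; the field names (`as_of_date`, `zip_code`, `age12_plus_population`, `fully_vaccinated`, `redacted`) and all values are hypothetical stand-ins for the dashboard's actual columns.

```python
from datetime import date

# Toy rows standing in for the dashboard export.
# ASSUMPTION: field names and values are hypothetical, not the real schema.
rows = [
    {"as_of_date": date(2021, 9, 7), "zip_code": "00001", "county": "Alpha",
     "age12_plus_population": 1000, "fully_vaccinated": 580, "redacted": False},
    {"as_of_date": date(2021, 9, 14), "zip_code": "00001", "county": "Alpha",
     "age12_plus_population": 1000, "fully_vaccinated": 600, "redacted": False},
    {"as_of_date": date(2021, 9, 14), "zip_code": "00002", "county": "Beta",
     "age12_plus_population": 40, "fully_vaccinated": None, "redacted": True},
]

# Keep only the most recent reporting date.
latest = max(r["as_of_date"] for r in rows)

# Columns retained for the analysis (the date is dropped once it is constant).
KEEP = ("zip_code", "county", "age12_plus_population",
        "fully_vaccinated", "redacted")

snapshot = [{k: r[k] for k in KEEP} for r in rows if r["as_of_date"] == latest]
```

Taking the maximum date from the data, rather than hard-coding 9/14/21, keeps the step reusable if a newer export is loaded.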

The source of the California Demographics Data is unknown, and the years/dates covered by the data were not provided. The dataset includes information about race, gender, family size, number of households, and other characteristics. We selected only the county and median age columns from this dataset.

In our subsetted dataset of 1764 observations, we noticed that 5 counties were listed as NA. Instead of excluding those observations, we looked up their zip codes and manually filled in the county names. Moreover, we noticed that there were many observations where the age 12+ population was less than the number of fully vaccinated. To address this situation, we decided to first exclude those observations from the calculation of the county-level average of fully vaccinated people. Once we had imputed the county-level averages for observations with missing data, we added the excluded observations back into the dataset and calculated a county-level summary mean.
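One way to read the imputation procedure above is sketched below. It is an illustration under stated assumptions rather than the original code: the field names and toy values are hypothetical, a missing count is represented as `None`, and a county is assumed to have at least one usable observation from which to compute its mean.

```python
from collections import defaultdict
from statistics import mean

# Toy zip-level records. ASSUMPTION: names and values are hypothetical.
records = [
    {"zip_code": "A1", "county": "Alpha", "age12_plus_population": 1000, "fully_vaccinated": 600},
    {"zip_code": "A2", "county": "Alpha", "age12_plus_population": 500, "fully_vaccinated": 400},
    {"zip_code": "A3", "county": "Alpha", "age12_plus_population": 300, "fully_vaccinated": None},  # missing
    {"zip_code": "A4", "county": "Alpha", "age12_plus_population": 100, "fully_vaccinated": 150},   # count > population
]


def is_inconsistent(r):
    """True when more people are reported fully vaccinated than live there."""
    return (r["fully_vaccinated"] is not None
            and r["fully_vaccinated"] > r["age12_plus_population"])


# Step 1: county means from observations that are present and consistent.
valid_counts = defaultdict(list)
for r in records:
    if r["fully_vaccinated"] is not None and not is_inconsistent(r):
        valid_counts[r["county"]].append(r["fully_vaccinated"])
county_mean = {county: mean(counts) for county, counts in valid_counts.items()}

# Step 2: fill missing counts with the county mean. Inconsistent rows were
# only left out of the mean; they stay in the dataset unchanged.
# (A county with no usable rows would raise KeyError in this sketch.)
imputed = [
    {**r, "fully_vaccinated": county_mean[r["county"]]}
    if r["fully_vaccinated"] is None else dict(r)
    for r in records
]
```

Leaving the inconsistent rows out of the mean keeps them from inflating the imputed values, while retaining them afterwards preserves every zip code in the county totals.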

Next, we created an aggregated dataset, grouping by county and creating 4 columns: total age 12+ population, average fully-vaccinated, percentage of fully vaccinated, and proportion of fully vaccinated.
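The county-level aggregation can be sketched as below. The report does not state the formula behind the percentage and proportion columns, so this sketch assumes the proportion is the summed fully vaccinated count divided by the summed age 12+ population; the field names and toy values are likewise hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Toy zip-level rows after imputation. ASSUMPTION: names and values are hypothetical.
zip_rows = [
    {"county": "Alpha", "age12_plus_population": 1000, "fully_vaccinated": 600},
    {"county": "Alpha", "age12_plus_population": 1000, "fully_vaccinated": 400},
    {"county": "Beta", "age12_plus_population": 500, "fully_vaccinated": 100},
]

# Group the zip-level rows by county.
groups = defaultdict(list)
for r in zip_rows:
    groups[r["county"]].append(r)

county_summary = {}
for county, rs in groups.items():
    total_pop = sum(r["age12_plus_population"] for r in rs)
    total_vax = sum(r["fully_vaccinated"] for r in rs)
    # ASSUMPTION: proportion = total fully vaccinated / total age 12+ population.
    proportion = total_vax / total_pop
    county_summary[county] = {
        "total_age12_plus_population": total_pop,
        "avg_fully_vaccinated": mean(r["fully_vaccinated"] for r in rs),
        "prop_fully_vaccinated": proportion,
        "pct_fully_vaccinated": 100 * proportion,
    }
```

Under this definition the percentage is simply the proportion scaled by 100, so the two columns carry the same information in different units.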

Finally, we joined the datasets, using “county” as a key. This would allow us to then assess the relationship between median age and fully vaccinated percentage/proportion by county.
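A join on county can be sketched as follows. The structures, field names, and values here are hypothetical; the sketch also lists counties that appear in only one dataset, since spelling or whitespace differences in county names are a common reason for rows to drop out of a join silently.

```python
# Toy inputs. ASSUMPTION: names and values are hypothetical.
vaccination = {
    "Alpha": {"pct_fully_vaccinated": 50.0},
    "Beta": {"pct_fully_vaccinated": 20.0},
}
demographics = [
    {"county": "Alpha ", "median_age": 38.5},  # trailing space on purpose
    {"county": "Gamma", "median_age": 41.0},
]

# Normalize the key before matching so stray whitespace does not break the join.
median_age = {d["county"].strip(): d["median_age"] for d in demographics}

# Inner join: keep counties present in both datasets.
joined = [
    {"county": county, "median_age": median_age[county], **stats}
    for county, stats in vaccination.items()
    if county in median_age
]

# Counties present in only one dataset, worth checking by hand.
unmatched = sorted(set(vaccination) ^ set(median_age))
```

With all 58 counties present in both datasets, `unmatched` would be empty and `joined` would have 58 rows.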

Results

Figure 1

The scatterplot indicates the absence of a linear relationship between median age and percent of fully vaccinated by county.

Table 1

Table 1 displays the county median age and the average percent of persons fully vaccinated against COVID-19 for all 58 counties in California.

Figure 2

With the exception of Marin County, the bubble chart shows a positive trend: as the average fully vaccinated count and the total population increase, the percent of fully vaccinated increases as well.

Figure 3

The ordered presentation of counties in the bar chart facilitates the identification of the counties with the lowest and highest average percentage of persons fully vaccinated against COVID-19.

Discussion

The results of our preliminary analysis suggest that there is little to no correlation between median age and the proportion of fully vaccinated individuals, as can be seen from Figure 1.
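The visual impression from a scatterplot can be complemented with a Pearson correlation coefficient. The report does not say which statistic, if any, was computed, so the helper below (`pearson_r`, a hypothetical name) is only a sketch of one option, and the toy values are not taken from the data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Toy county-level values. ASSUMPTION: illustrative numbers only.
toy_median_age = [30.0, 35.0, 40.0, 45.0]
toy_pct_vaccinated = [55.0, 48.0, 61.0, 52.0]
r = pearson_r(toy_median_age, toy_pct_vaccinated)
```

A value of `r` near 0 would be consistent with the absence of a linear relationship, while values near 1 or -1 would indicate a strong positive or negative one. Pearson's r only captures linear association, so it would not rule out a nonlinear pattern.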

From Table 1, we noticed that although Marin County had the highest percent of fully vaccinated individuals and Lassen County the lowest, the median ages of both counties fall in the middle-aged range and do not differ notably. The bubble chart (Figure 2) generally suggests that counties with smaller populations are less likely to have a high proportion of fully vaccinated individuals. Lastly, from the bar chart (Figure 3), we identified 17 counties with the lowest percentages of fully vaccinated individuals (< 50%). These counties may be more vulnerable to COVID-19, so it is vital to increase surveillance and provide the necessary support to reduce the transmission of SARS-CoV-2 in these areas.
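Flagging the counties below the 50% threshold amounts to a filter followed by a sort. The sketch below uses hypothetical county names and percentages, and assumes a strict comparison so that a county at exactly 50% is not flagged.

```python
# Toy county percentages. ASSUMPTION: names and values are hypothetical.
county_pct = {"Alpha": 72.4, "Beta": 48.9, "Gamma": 35.0, "Delta": 50.0}

THRESHOLD = 50.0  # percent fully vaccinated

# Counties strictly below the threshold, lowest percentage first.
below_threshold = sorted(
    ((county, pct) for county, pct in county_pct.items() if pct < THRESHOLD),
    key=lambda item: item[1],
)
```

Sorting from the lowest percentage upward puts the counties most in need of support at the top of the list, mirroring the ordered bar chart.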

Overall, based on our investigation, we suggest that Lassen County and all other counties in which fewer than 50% of individuals are fully vaccinated be the main recipients of extra aid and support to increase vaccination rates.

In the future, we hope to explore the relationships between the proportion of fully vaccinated individuals and other factors such as race, gender, and home ownership versus renting (an indicator of socioeconomic status, SES).