Project Milestone 5

Authors

Usah Dutson and Jillian Kadota Tomlinson

Problem Statement

In our research on COVID-19 vaccinations and flu rates in California, we specifically investigate their correlation within the Hispanic population. Focusing on Hispanics is essential because of their known health disparities, and understanding their vaccination and flu trends is crucial to addressing these inequities. Given the disproportionate impact of COVID-19 on minority communities, including Hispanics, our project contributes to public health equity by focusing on a demographic that requires targeted interventions for comprehensive healthcare access. The project aims to analyze and visualize the correlation between COVID-19 vaccination rates and flu infection risks across counties in California.

Methods

We used two primary data sources: the California Flu Data, which includes demographic information, and the COVID-19 Vaccine Data, both covering all counties in California. We first imported and cleaned the flu and vaccination datasets. The two flu datasets were processed by handling missing values, recoding variables, and combining them into one. Similarly, the vaccination data were cleaned by addressing missing and erroneous values and ensuring compatibility with the flu data. We then aggregated the data and generated descriptive statistics. Finally, the flu and vaccination datasets were joined. The results are presented using visualizations, including interactive tables and plots, to explore patterns and correlations between vaccination and flu rates.
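
The cleaning and joining steps described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the code used for the project. All field names (county, quarter, race_ethnicity, new_infections, population, fully_vaccinated), the sample values, and the choice to drop incomplete rows are assumptions made for the example.

```python
# Hypothetical records; field names and values are illustrative only.
flu_rows = [
    {"county": "Alpha", "quarter": "2023-01-01",
     "race_ethnicity": "Hispanic (any race)",
     "new_infections": 120, "population": 40000},
    {"county": "Beta", "quarter": "2023-01-01",
     "race_ethnicity": "Hispanic (any race)",
     "new_infections": None, "population": 25000},  # missing value
]
vax_rows = [
    {"county": " alpha ", "quarter": "2023-01-01",
     "race_ethnicity": "Hispanic (any race)", "fully_vaccinated": 26000},
    {"county": "Beta", "quarter": "2023-01-01",
     "race_ethnicity": "Hispanic (any race)", "fully_vaccinated": 15000},
]


def clean_key(row):
    """Build a join key, normalizing county spelling so both sources match."""
    return (row["county"].strip().title(), row["quarter"], row["race_ethnicity"])


def join_flu_vax(flu_rows, vax_rows):
    """Inner-join flu and vaccination rows and derive comparable rates."""
    vax_by_key = {
        clean_key(r): r for r in vax_rows
        if r.get("fully_vaccinated") is not None
    }
    joined = []
    for flu in flu_rows:
        # Assumption: rows with missing counts or no population are dropped.
        if flu["new_infections"] is None or not flu["population"]:
            continue
        vax = vax_by_key.get(clean_key(flu))
        if vax is None:
            continue
        county, quarter, race = clean_key(flu)
        joined.append({
            "county": county,
            "quarter": quarter,
            "race_ethnicity": race,
            "flu_per_100k": flu["new_infections"] / flu["population"] * 100_000,
            "pct_fully_vaccinated": vax["fully_vaccinated"] / flu["population"] * 100,
        })
    return joined


joined = join_flu_vax(flu_rows, vax_rows)
```

Normalizing the join key before matching is what "ensuring compatibility" amounts to in this sketch: without it, differences in spacing or capitalization between the two sources would silently drop counties from the joined data.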

Results

1) Table 1. Interactive Table displaying flu and vaccination data across all counties in California for different levels of race/ethnicity by quarter.

Interpretation: We have chosen to display the flu and vaccination data across all counties in California for the different levels of race/ethnicity by quarter. Groups in which less than 70% of the population has been vaccinated against COVID-19 are flagged in yellow, and the table is sorted by the risk of flu per 100,000 population. We have also included a column specifying whether the infection risk for a race/ethnicity category in a given quarter is above the average infection risk for that race/ethnicity across all quarters of data. We note that the Hispanic (any race) and Multiracial categories consistently have less than 70% of the population vaccinated and are among the groups with the highest flu risk per 100,000 in every quarter of data.
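
The flagging logic behind Table 1 can be sketched as follows. This is a hedged illustration in Python, not the code that produced the table. The field names, group labels, and sample values are hypothetical; only the 70% threshold and the comparison against each group's average across quarters come from the description above.

```python
from collections import defaultdict

# Hypothetical summary rows; labels and values are illustrative only.
rows = [
    {"race_ethnicity": "Group A", "quarter": "2022-10-01",
     "flu_per_100k": 200.0, "pct_vaccinated": 65.0},
    {"race_ethnicity": "Group A", "quarter": "2023-01-01",
     "flu_per_100k": 400.0, "pct_vaccinated": 66.0},
    {"race_ethnicity": "Group B", "quarter": "2022-10-01",
     "flu_per_100k": 150.0, "pct_vaccinated": 80.0},
    {"race_ethnicity": "Group B", "quarter": "2023-01-01",
     "flu_per_100k": 250.0, "pct_vaccinated": 82.0},
]


def flag_rows(rows, vax_threshold=70.0):
    """Add the two flag columns and sort by flu risk, highest first."""
    # Average flu risk per race/ethnicity across all quarters of data.
    risks = defaultdict(list)
    for r in rows:
        risks[r["race_ethnicity"]].append(r["flu_per_100k"])
    group_mean = {g: sum(v) / len(v) for g, v in risks.items()}

    flagged = [
        {
            **r,
            "low_vaccination": r["pct_vaccinated"] < vax_threshold,
            "above_group_average": r["flu_per_100k"] > group_mean[r["race_ethnicity"]],
        }
        for r in rows
    ]
    return sorted(flagged, key=lambda r: r["flu_per_100k"], reverse=True)


table = flag_rows(rows)
```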

2) Plot 1. Flu infection rate per 100,000 across all counties grouped by race and quarter.

Interpretation: This bar graph shows that some races/ethnicities, including Alaska Native, White (Non-Hispanic), and Hispanic (any race), have proportionally higher infection rates across all quarters of data. It also shows that the quarter starting on 1/1/2023 (the medium blue part of each bar) accounts for the highest infection rates across all races/ethnicities.
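
The grouped rates shown in Plot 1 can be sketched as an aggregation over counties. The Python example below is an illustration under stated assumptions, not the project's code: the field names and values are hypothetical, and pooling counts across counties before computing the rate (a population-weighted rate) is one reasonable choice that the report itself does not specify.

```python
from collections import defaultdict

# Hypothetical county-level rows; names and values are illustrative only.
county_rows = [
    {"county": "Alpha", "race_ethnicity": "Group A", "quarter": "2023-01-01",
     "new_infections": 30, "population": 10000},
    {"county": "Beta", "race_ethnicity": "Group A", "quarter": "2023-01-01",
     "new_infections": 10, "population": 30000},
    {"county": "Alpha", "race_ethnicity": "Group A", "quarter": "2022-10-01",
     "new_infections": 5, "population": 10000},
]


def rate_by_race_and_quarter(rows):
    """Pool counties, then compute infections per 100,000 for each group."""
    sums = defaultdict(lambda: [0, 0])  # key -> [infections, population]
    for r in rows:
        key = (r["race_ethnicity"], r["quarter"])
        sums[key][0] += r["new_infections"]
        sums[key][1] += r["population"]
    return {
        key: infections / population * 100_000
        for key, (infections, population) in sums.items()
    }


rates = rate_by_race_and_quarter(county_rows)
```

Pooling before dividing keeps small counties from dominating the result, which would happen if county-level rates were simply averaged.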

3) Plot 2. Flu rate by county for Hispanic (any race) in the quarter starting on 1/1/2023.

Interpretation: Table 1 and Plot 1 suggest that flu rates are fairly consistent across races but highest in the quarter starting in January 2023. We have therefore chosen to focus on the Hispanic (any race) population in additional visualizations, to determine whether there is any correlation between COVID-19 vaccination rates and flu within a given race in the quarter with the highest infection rates. From this plot, we see that the flu rate for Hispanic (any race) populations ranges from a low of 32.37K per 100,000 in San Mateo County to a high of 34.92K per 100,000 in Calaveras County.

4) Plot 3. Flu rate by county for Hispanic (any race) in the quarter starting on 1/1/2023 vs. proportion of the population that is fully vaccinated for COVID-19.

Interpretation: To see whether there is any correlation between the rate of flu and the proportion of the population that is vaccinated, we plot the proportion of the population that is vaccinated against the rate of flu. Within Hispanic populations in the quarter starting on 1/1/2023, there may be a pattern of lower flu infection rates per 100,000 among populations with higher proportions who are fully vaccinated against COVID-19, as shown by the descending blue line of best fit.
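
A descending line of best fit corresponds to a negative slope and a negative correlation coefficient. The Python sketch below shows how both could be computed with an ordinary least-squares straight line. It is an illustration only: the data points are made up, and a straight-line fit is an assumption that may differ from the smoothing method used for the plot.

```python
def least_squares_fit(xs, ys):
    """Return (slope, intercept, pearson_r) for a straight-line fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, pearson_r


# Hypothetical county values: percent fully vaccinated vs. flu per 100,000.
pct_vaccinated = [50.0, 60.0, 70.0, 80.0]
flu_per_100k = [340.0, 335.0, 330.0, 320.0]

slope, intercept, pearson_r = least_squares_fit(pct_vaccinated, flu_per_100k)
```

A slope below zero means the fitted flu rate falls as the vaccinated proportion rises. The magnitude of pearson_r indicates how tightly the counties follow that line, which the slope alone does not show.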

Discussion

Our findings offer insight into the correlation between COVID-19 vaccination rates and flu infection risks within the Hispanic population in California. The interactive table (Table 1) highlights disparities in vaccination coverage, particularly among the Hispanic (any race) and Multiracial categories, where less than 70% of the population is vaccinated. These groups consistently exhibit higher flu risks per 100,000 population, emphasizing the vulnerability of certain demographic segments. Plot 1 illustrates a marked increase in flu infections per 100,000 across all races in the quarter starting on 1/1/2023. Plot 2, which examines the Hispanic (any race) population in the same quarter, reveals that flu rates vary by county. Plot 3 suggests a potential inverse correlation between COVID-19 vaccination rates and flu infections in Hispanic populations: higher vaccination rates may be associated with lower flu risks. This pattern is consistent with a protective effect of vaccination, although a correlation alone does not establish one.