Project Milestone 6

Authors

Hui Zou, Reneta Hermiz, Samar Robleh

Problem Statement

When COVID-19 vaccines started to become available in 2021, there was a global initiative to ensure full vaccination through a series of administrations over a period of time. In 2022, as people emerged from extended lockdowns to return to workplaces and social events, other viruses, like the flu, also re-emerged. Comparing full COVID-19 vaccination rates in 2022 with flu rates in the same populations could provide insight into the degree of flu protection in these populations. We will review data from 57 counties in California to determine whether additional support is needed to encourage flu protection.

Methods

The California datasets used in this project contain information about flu cases, flu severity, COVID vaccination status, location (by county), demographics (including age), and time period. The data in this report cover records from 57 counties in California in 2022. Our group chose to focus on three key indicators: flu incidence rate, severe flu rate, and full COVID vaccination rate.

Flu and COVID vaccination datasets were aggregated separately and then joined to produce a single dataset. All data were filtered for cases occurring in 2022 and stratified by county and age group. The reported age groups are 18-49, 50-64, and 65+. Flu incidence rate was calculated as the sum of all new infections for each county and age group, expressed as a percentage of the total population. Severe flu rate was calculated as the maximum cumulative number of severe flu cases for each county and age group, expressed as a percentage of the total population. Full COVID vaccination rate was calculated as the maximum cumulative number of fully vaccinated individuals for each county and age group, expressed as a percentage of the total estimated population. After the join, records with missing data were excluded and are not reported.
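
The aggregation and join described above can be sketched as follows. This is a minimal illustration in Python, not the code used for the project: every field name and every number in the toy records is hypothetical, and the sketch assumes the population value is constant within each county and age group.

```python
from collections import defaultdict

# Toy records. All field names and values are hypothetical, not project data.
flu_rows = [
    {"county": "A", "age_group": "18-49", "year": 2022,
     "new_infections": 120, "cumulative_severe": 3, "population": 10000},
    {"county": "A", "age_group": "18-49", "year": 2022,
     "new_infections": 80, "cumulative_severe": 5, "population": 10000},
    {"county": "A", "age_group": "18-49", "year": 2021,
     "new_infections": 999, "cumulative_severe": 99, "population": 10000},
    {"county": "B", "age_group": "65+", "year": 2022,
     "new_infections": 50, "cumulative_severe": 4, "population": 5000},
]
vax_rows = [
    {"county": "A", "age_group": "18-49", "year": 2022,
     "cumulative_fully_vaccinated": 6000, "estimated_population": 10000},
    {"county": "A", "age_group": "18-49", "year": 2022,
     "cumulative_fully_vaccinated": 7000, "estimated_population": 10000},
]


def summarize_flu(rows, year=2022):
    """Flu incidence and severe flu rate (%) per (county, age group)."""
    groups = defaultdict(lambda: {"new": 0, "severe": 0, "pop": None})
    for row in rows:
        if row["year"] != year:
            continue  # keep only the study year
        group = groups[(row["county"], row["age_group"])]
        group["new"] += row["new_infections"]  # sum of new infections
        group["severe"] = max(group["severe"], row["cumulative_severe"])
        group["pop"] = row["population"]  # assumed constant within a group
    return {
        key: {
            "flu_incidence_pct": 100 * g["new"] / g["pop"],
            "severe_flu_pct": 100 * g["severe"] / g["pop"],
        }
        for key, g in groups.items()
    }


def summarize_vax(rows, year=2022):
    """Full vaccination rate (%) per (county, age group)."""
    best = {}
    for row in rows:
        if row["year"] != year:
            continue
        key = (row["county"], row["age_group"])
        rate = 100 * row["cumulative_fully_vaccinated"] / row["estimated_population"]
        best[key] = max(best.get(key, 0.0), rate)  # maximum cumulative count
    return best


def join_indicators(flu, vax):
    """Inner join: groups missing from either summary are dropped."""
    return {key: {**flu[key], "full_vax_pct": vax[key]}
            for key in flu.keys() & vax.keys()}


joined = join_indicators(summarize_flu(flu_rows), summarize_vax(vax_rows))
print(joined)
```

In this toy input, county B has flu records but no vaccination records, so the inner join drops it. This mirrors the exclusion of records with missing data.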

Analysis was performed by generating grouped box plots of each indicator across age groups for all counties, and by generating scatter plots of full COVID vaccination rate against flu incidence rate and against severe flu rate for all counties.
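
As a rough illustration of what a grouped box plot summarizes, the sketch below computes a five-number summary per age group. The values are made up for illustration, and the quartile method is an assumption: plotting libraries may use a different quantile definition and different rules for whiskers and outliers.

```python
from statistics import quantiles


def five_number_summary(values):
    """Minimum, quartiles, and maximum of a list of rates."""
    ordered = sorted(values)
    # n=4 returns the three quartile cut points; "inclusive" treats the
    # data as the full population rather than a sample.
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return {"min": ordered[0], "q1": q1, "median": median,
            "q3": q3, "max": ordered[-1]}


# Hypothetical full vaccination rates (%) per county, keyed by age group.
rates_by_age = {
    "18-49": [55.0, 60.0, 65.0, 70.0, 75.0],
    "65+": [80.0, 85.0, 90.0, 95.0, 100.0],
}

summaries = {age: five_number_summary(rates)
             for age, rates in rates_by_age.items()}
for age, summary in summaries.items():
    print(age, summary)
```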

Results

Table 1

This table displays the flu incidence rate, severe flu rate, and full COVID vaccination rate by county and age group in California.

Figure 1: Box Plot

This plot shows the distribution of flu incidence rate, severe flu rate, and full COVID vaccination rate across the three age groups. Within each age group, the flu incidence rate is generally higher than the severe flu rate, which is expected because severe cases are a subset of all cases. The full COVID vaccination rate varies considerably, which might be due to factors such as vaccine availability, public health policies, and population attitudes towards vaccination.

Figure 2: Scatter plot with Regression Line

This scatter plot shows the relationship between full COVID vaccination rates and flu incidence rates across all California counties, with a fitted regression line. The line has a positive slope, which suggests a slight upward trend in flu incidence rates as the full vaccination rate increases. This does not imply a causal relationship. Among the youngest group (18-49), the points are spread relatively evenly, suggesting that other factors might be influencing flu incidence in this age group. In comparison, the older age groups (50-64 and 65+) tend to have higher vaccination rates, as seen in the density of data points. This could indicate effective vaccination campaigns targeting older individuals or higher vaccine uptake due to the increased health risks associated with flu in these age groups.
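
For readers who want to see how a regression slope of this kind is obtained, here is a minimal sketch. It assumes an ordinary least-squares fit of flu incidence on vaccination rate, which the report does not state explicitly, and the data points are invented for illustration only.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares slope and intercept, plus Pearson correlation r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical values: full vaccination rate (%) and flu incidence rate (%).
vax_rate = [20.0, 40.0, 60.0, 80.0]
flu_incidence = [1.0, 1.5, 2.0, 2.5]

slope, intercept, r = fit_line(vax_rate, flu_incidence)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={r:.3f}")
```

A positive slope here only describes how the two rates move together across counties. It says nothing about why, which is the reason the text cautions against a causal reading.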

Figure 3: Scatter plot with Color Gradient

This scatter plot shows the relationship between full COVID vaccination rates and severe flu rates across California counties. Full vaccination rates range broadly, from just over 20% to nearly 100%, with a concentration of data points between 60% and 100%. Severe flu rates are generally below 0.2%, with a sparse distribution of higher rates indicated by darker colored points. There is no apparent correlation between higher vaccination rates and lower severe flu rates. Some outliers with high severe flu rates occur even in areas with high vaccination rates, hinting at the influence of factors other than vaccination.

Discussion

The analysis and visualizations together offer a multifaceted view of flu incidence, severe flu, and full COVID vaccination across age groups in California counties, and they highlight the complexity of the interplay between the three indicators. While the flu incidence rate remains generally higher than the severe flu rate across all age groups, the full COVID vaccination rate shows considerable variability. The positive slope of the regression line suggests a slight increase in flu incidence rates with higher vaccination rates. However, this reflects only a correlation, which could be influenced by confounding variables. There is no clear pattern indicating that higher COVID vaccination rates are related to lower severe flu rates. Outliers with high severe flu rates in areas of high vaccination coverage underscore the potential influence of other variables, such as socioeconomic factors, access to healthcare, or genetic predispositions to severe flu outcomes.

In conclusion, these plots show that full COVID vaccination does not imply low flu incidence or a low severe flu rate, contrary to what we hypothesized. Other factors, such as education, age, and location, could play significant roles. A deeper statistical analysis, possibly including multivariate regression or geographic information system (GIS) mapping, could provide further insight into these relationships and help public health officials tailor interventions more effectively.