Infectious Disease Outbreak in California

Problem Statement

Our team is tasked with understanding the progression of a disease outbreak in California to identify which populations are most affected and how best to allocate prevention and treatment resources. The dataset provides detailed demographic data, including age, sex, and race, as well as geographic information about counties within California. This data, combined with health-related metrics such as diagnoses, severity of illness, and epidemiological week, offers a comprehensive view of the outbreak’s impact across different populations.

The goal is to determine if certain demographic or geographic groups are disproportionately affected by the disease over time. Insights from this analysis will help prioritize and target public health interventions, ensuring that prevention and treatment efforts are directed where they are needed most. This approach aims to mitigate the outbreak’s impact and address health inequities effectively.

Methods

This analysis used simulated morbidity data for California counties, including demographic details, together with 2023 population data. The primary objective was to analyze disparities in infection rates by county and age group to enable informed resource allocation. Morbidity data came from the simulated datasets sim_novelid_CA and sim_novelid_LA, and California population data came from ca_pop_2023. These datasets were joined to calculate case rates per 100,000 population and to analyze demographic and geographic disparities in infection rates.

The datasets were standardized for consistency, including aligning column names to snake case, reconciling race values to standardized categories, and ensuring date formats were uniform. Columns not required for analysis were removed. The morbidity datasets were joined with the population dataset, ensuring consistent county naming for proper alignment during the merge. The combined dataset was grouped by county and age_category, and total new infections, total population, and infection rates were calculated. Infection rates were standardized to per 100,000 population to facilitate comparisons across strata.
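
The standardization, join, and rate calculation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the code used for the analysis: the field names new_infections and pop, the helper function names, and the toy rows are all hypothetical, and only county and age_category are taken from the text. The sketch also sums each table by county and age group before joining, which differs slightly from the order described above; this is done so that a population count is not added once per morbidity row.

```python
import re
from collections import defaultdict


def to_snake_case(name: str) -> str:
    """Convert a column label such as 'Age Category' or 'newInfections' to snake_case."""
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())  # split camelCase words
    name = re.sub(r"[^0-9A-Za-z]+", "_", name)  # spaces and punctuation become underscores
    return name.strip("_").lower()


def normalize_county(county: str) -> str:
    """Map labels like 'Imperial County' and ' imperial ' to the same join key."""
    key = county.strip().lower()
    if key.endswith(" county"):
        key = key[: -len(" county")]
    return key.strip()


def rates_per_100k(morbidity_rows, population_rows):
    """Aggregate each table by (county, age_category), then join and compute rates."""
    cases = defaultdict(int)
    for row in morbidity_rows:
        cases[(normalize_county(row["county"]), row["age_category"])] += row["new_infections"]

    population = defaultdict(int)
    for row in population_rows:
        population[(normalize_county(row["county"]), row["age_category"])] += row["pop"]

    results = []
    for (county, age), total_cases in cases.items():
        total_pop = population.get((county, age), 0)
        if total_pop == 0:
            continue  # no matching population stratum: skip instead of dividing by zero
        results.append({
            "county": county,
            "age_category": age,
            "total_new_infections": total_cases,
            "total_pop": total_pop,
            "rate_per_100k": round(total_cases * 100_000 / total_pop, 1),
        })
    return results


# Hypothetical raw column labels, cleaned to snake_case.
print([to_snake_case(label) for label in ["County", "Age Category", "newInfections"]])

# Toy rows with made-up numbers, purely to exercise the functions.
morbidity = [
    {"county": "Imperial County", "age_category": "65+", "new_infections": 30},
    {"county": "imperial", "age_category": "65+", "new_infections": 20},
    {"county": "Kern", "age_category": "18-49", "new_infections": 10},
]
population = [
    {"county": "Imperial", "age_category": "65+", "pop": 1_000},
    {"county": "Kern County", "age_category": "18-49", "pop": 4_000},
]

for row in rates_per_100k(morbidity, population):
    print(row)
```

Strata with no matching population row are skipped in this sketch. In practice it would be worth listing them, since an unmatched county name usually points to a naming inconsistency that the standardization step missed.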

A summary table was created to display infection rates, total new cases, and population counts by county and age group. Rows were sorted by infection rate in descending order so that the county and age group combinations with the highest rates appear first. A bar plot visualized infection rates by county and age group, with age categories color-coded so that disparities between groups are easy to compare. Together, the table and plot are intended to identify the populations and regions most impacted by the outbreak.
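
As a rough illustration of how such a summary table can be ordered and laid out, the sketch below sorts county and age group rows by rate and prints them as fixed-width text. The function names, field names, and placeholder rows are assumptions made for the example. The numbers are invented and are not results from this analysis.

```python
def summary_table(rows, top_n=None):
    """Sort strata by rate, highest first, optionally keeping only the top_n rows."""
    ordered = sorted(rows, key=lambda r: r["rate_per_100k"], reverse=True)
    return ordered if top_n is None else ordered[:top_n]


def format_table(rows):
    """Render the rows as fixed-width text with thousands separators."""
    lines = [
        f"{'County':<12}{'Age group':<11}{'New cases':>11}{'Population':>12}{'Rate per 100k':>15}"
    ]
    for r in rows:
        lines.append(
            f"{r['county']:<12}{r['age_category']:<11}"
            f"{r['total_new_infections']:>11,}{r['total_pop']:>12,}{r['rate_per_100k']:>15,.1f}"
        )
    return "\n".join(lines)


# Placeholder rows with made-up values; they are not results from the analysis.
rows = [
    {"county": "County A", "age_category": "0-17",
     "total_new_infections": 120, "total_pop": 4_000, "rate_per_100k": 3_000.0},
    {"county": "County B", "age_category": "65+",
     "total_new_infections": 900, "total_pop": 2_000, "rate_per_100k": 45_000.0},
    {"county": "County A", "age_category": "65+",
     "total_new_infections": 500, "total_pop": 2_500, "rate_per_100k": 20_000.0},
]

print(format_table(summary_table(rows)))
```

Passing top_n, for example summary_table(rows, top_n=10), keeps only the highest-rate rows, which is one way to keep the focus on the most affected strata.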

Footnote: Case rates are per 100,000 population within each county and age group, 2023.

Results

The table lists, for each California county and age group, the total number of new COVID-19 cases in 2023, the total population, and the infection rate per 100,000 population. Rows are sorted in descending order, so the highest infection rates appear at the top. Notably, Imperial County shows the highest infection rate in the 65+ age group, at 66,593 cases per 100,000 population, followed by 58,216 cases per 100,000 population for the 18–49 age group in the same county. Among the 65+ population, Kings, Tulare, and Kern counties also report high infection rates, though lower than Imperial County.
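
To put the size of these rates in context, a rate per 100,000 can be restated as cases per 100 residents by dividing by 1,000. The short sketch below does this for the two Imperial County figures quoted above. The function name is hypothetical, and reading the result as the share of residents infected assumes that each case is a distinct person, which the text does not state.

```python
def rate_to_percent(rate_per_100k: float) -> float:
    """Convert a rate per 100,000 population to cases per 100 residents."""
    return rate_per_100k / 1_000


# Rates quoted in the Results text for Imperial County.
for label, rate in [("Imperial, 65+", 66_593), ("Imperial, 18-49", 58_216)]:
    print(f"{label}: {rate:,} per 100,000 = {rate_to_percent(rate):.1f} cases per 100 residents")
```

On that reading, the two rates correspond to roughly 66.6 and 58.2 cases per 100 residents over the year.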

The graph plots infection rates per 100,000 population (y-axis) by California county (x-axis), with bars grouped by age category: 65+, 50–64, 18–49, and 0–17. It visually confirms that Imperial County has the highest new infection rates across all age groups.

Discussion

The table and graph summarize new COVID-19 infection rates in California counties by age group for 2023. The findings highlight that older adults (65+) consistently experience the highest infection rates, with Imperial County leading at 66,593 cases per 100,000 population. This pattern may reflect the heightened vulnerability of older populations to infectious diseases due to factors such as weakened immune systems or pre-existing conditions.

Additionally, the younger 18–49 age group in Imperial County shows a notably high infection rate of 58,216 cases per 100,000 population, well above the rates observed in other counties. This could indicate increased transmission within this group due to factors such as greater mobility, higher levels of social interaction, or potential barriers to vaccination and healthcare access.

Intervention and Resource Allocation

The elevated infection rates in the 65+ age group underscore the importance of interventions such as vaccination drives, improved access to healthcare, and preventive education tailored to older populations. The high infection rates among the 18–49 age group in Imperial County suggest a need to investigate vaccination rates, vaccine hesitancy, and healthcare barriers in this demographic. Strengthened public health messaging and outreach could address these challenges. Imperial County, given its consistently high infection rates across all age groups, requires targeted public health efforts, including increased testing, awareness campaigns, and community engagement to reduce further transmission.

In conclusion, the results highlight the need for age-specific interventions focused on Imperial County to manage and reduce COVID-19 infection rates effectively.