Milestone_6

Authors

Leah Gibbs & Lucien LaFerr

A Novel Respiratory Infectious Disease Outbreak in California

Problem Statement

The California Department of Public Health is monitoring a simulated outbreak of a novel respiratory infectious disease among California residents during 2023–2024. Weekly surveillance data include counts of new infections and disease severity, stratified by age group, sex, race/ethnicity, and geographic area, and summarized by health officer region. Characterizing how the outbreak unfolded across regions and identifying populations that experienced a higher burden of disease are essential for informing timely public health response.

The objective of this analysis is to describe the course of the outbreak from June 2023 through January 2024 and to assess whether incidence and severity differed across demographic and geographic groups. By examining cumulative incidence rates and severe disease burden across health officer regions, this report aims to highlight populations that may benefit from targeted prevention, treatment, or resource allocation. These findings are intended to support surveillance planning and equitable distribution of public health resources during an active outbreak.

Methods

Three datasets were used in this analysis. Datasets one (sim_novelid_CA) and two (sim_novelid_LACounty) are from the California Department of Public Health and contain weekly surveillance data on cases and case severity from May 29, 2023 to December 25, 2023. Both datasets stratify case data by age, race/ethnicity, and sex. Dataset one covers all California counties except Los Angeles County, and dataset two covers Los Angeles County only. Dataset three (ca_pop_2023) is from the California Department of Finance and provides 2023 population estimates for California by county and several demographic variables.

We first joined datasets one and two to create a complete outbreak morbidity dataset for the entire state of California. Cleaning for this step involved converting column names to snake case and standardizing them across datasets, converting dates from character values to date values, standardizing county names, and recoding race/ethnicity values from numeric to character values so that both datasets matched. Next, we cleaned dataset three so that it could be joined to the strata of interest in the morbidity dataset. This cleaning included renaming race/ethnicity values and recoding age categories to match the morbidity dataset.
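The cleaning and joining steps above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the code used for this analysis. The column names (`week_date`, `county`, `race_ethnicity`, `new_infections`), the date format, and the numeric race/ethnicity codes are all hypothetical placeholders.

```python
import re
from datetime import datetime

# Hypothetical numeric-to-label recode; the real codebook may differ.
RACE_LABELS = {
    "1": "White, Non-Hispanic",
    "2": "Black, Non-Hispanic",
}


def to_snake_case(name):
    """Convert a column name such as 'Week Date' or 'newInfections' to snake_case."""
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name.strip())
    return re.sub(r"[^0-9a-zA-Z]+", "_", name).strip("_").lower()


def clean_row(row, date_format="%Y-%m-%d", default_county=None):
    """Standardize one surveillance record (a dict of strings)."""
    cleaned = {to_snake_case(key): value for key, value in row.items()}
    # Character dates become date values.
    cleaned["week_date"] = datetime.strptime(cleaned["week_date"], date_format).date()
    # County names are trimmed and title-cased so both sources match.
    county = cleaned.get("county") or default_county
    cleaned["county"] = county.strip().title()
    # Numeric race/ethnicity codes become labels; existing labels pass through.
    race = str(cleaned["race_ethnicity"]).strip()
    cleaned["race_ethnicity"] = RACE_LABELS.get(race, race)
    cleaned["new_infections"] = int(cleaned["new_infections"])
    return cleaned


def combine(ca_rows, la_rows):
    """Stack the two cleaned sources into one statewide list of records."""
    statewide = [clean_row(row) for row in ca_rows]
    statewide += [clean_row(row, default_county="Los Angeles") for row in la_rows]
    return statewide
```

Cleaning each source with the same function before stacking keeps the two halves of the statewide dataset on one schema, which is what makes the later join to the population data possible.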

To identify the regions of California most impacted by the outbreak, we added a variable for weekly cumulative incidence and subset the data by health officer region.
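One way to compute a weekly rate per 100,000 residents for each health officer region is sketched below. It assumes the case records already carry a region label and that a region-level population lookup is available. The field names `region`, `week`, and `new_infections` are illustrative, not the names used in the analysis.

```python
from collections import defaultdict


def weekly_incidence_by_region(case_rows, region_population, per=100_000):
    """Return {(region, week): new infections per `per` residents}."""
    weekly_cases = defaultdict(int)
    for row in case_rows:
        # Sum infections over all strata reported for the same region and week.
        weekly_cases[(row["region"], row["week"])] += row["new_infections"]
    return {
        (region, week): cases / region_population[region] * per
        for (region, week), cases in weekly_cases.items()
    }


def subset_region(rates, region):
    """Keep only the weekly rates for one health officer region."""
    return {week: rate for (r, week), rate in rates.items() if r == region}
```

Summing counts before dividing, rather than averaging stratum-level rates, keeps each regional rate weighted by the actual number of infections.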

Cumulative Incidence per 100,000 Residents by Race/Ethnicity and Age Category, June 2023–January 2024
| Health Officer Region | Race/Ethnicity | Age Category | Total New Infections | Population (2023) | Cumulative Incidence per 100,000 |
|---|---|---|---|---|---|
| Central California | White, Non-Hispanic | 65+ | 100740 | 8812928 | 1143.1 |
| Central California | American Indian or Alaska Native, Non-Hispanic | 65+ | 1526 | 145142 | 1051.4 |
| Central California | Black, Non-Hispanic | 65+ | 6725 | 639685 | 1051.3 |
| Greater Sierra Sacramento | Black, Non-Hispanic | 65+ | 6804 | 688510 | 988.2 |
| Central California | White, Non-Hispanic | 18-49 | 134456 | 14216879 | 945.7 |
| Central California | Asian, Non-Hispanic | 65+ | 13570 | 1545722 | 877.9 |
| Central California | Hispanic (any race) | 65+ | 51568 | 5957704 | 865.6 |
| Greater Sierra Sacramento | American Indian or Alaska Native, Non-Hispanic | 65+ | 954 | 110639 | 862.3 |
| Central California | Black, Non-Hispanic | 18-49 | 22417 | 2661071 | 842.4 |
| Greater Sierra Sacramento | Black, Non-Hispanic | 18-49 | 19463 | 2358294 | 825.3 |
| Greater Sierra Sacramento | White, Non-Hispanic | 65+ | 86617 | 10774453 | 803.9 |
| Central California | American Indian or Alaska Native, Non-Hispanic | 18-49 | 2682 | 333715 | 803.7 |
| Central California | Native Hawaiian or Pacific Islander, Non-Hispanic | 65+ | 489 | 63240 | 773.2 |
| Greater Sierra Sacramento | Asian, Non-Hispanic | 65+ | 13641 | 1791025 | 761.6 |
| Central California | Hispanic (any race) | 18-49 | 260337 | 35672599 | 729.8 |
| Greater Sierra Sacramento | American Indian or Alaska Native, Non-Hispanic | 18-49 | 1675 | 233802 | 716.4 |
| Central California | Native Hawaiian or Pacific Islander, Non-Hispanic | 18-49 | 1274 | 180265 | 706.7 |
| Greater Sierra Sacramento | Hispanic (any race) | 65+ | 13617 | 1943948 | 700.5 |
| Central California | Multiracial (two or more of above races), Non-Hispanic | 65+ | 2481 | 356159 | 696.6 |
| Greater Sierra Sacramento | White, Non-Hispanic | 18-49 | 115459 | 17128244 | 674.1 |
| Greater Sierra Sacramento | Native Hawaiian or Pacific Islander, Non-Hispanic | 65+ | 732 | 113739 | 643.6 |
| Central California | Multiracial (two or more of above races), Non-Hispanic | 18-49 | 10807 | 1704287 | 634.1 |
| Greater Sierra Sacramento | Hispanic (any race) | 18-49 | 65496 | 10830098 | 604.8 |
| Central California | Asian, Non-Hispanic | 18-49 | 30292 | 5072375 | 597.2 |
| Greater Sierra Sacramento | Native Hawaiian or Pacific Islander, Non-Hispanic | 18-49 | 2017 | 347975 | 579.6 |
| Greater Sierra Sacramento | Asian, Non-Hispanic | 18-49 | 31796 | 5753476 | 552.6 |
| Greater Sierra Sacramento | Multiracial (two or more of above races), Non-Hispanic | 65+ | 2158 | 407743 | 529.3 |
| Greater Sierra Sacramento | Multiracial (two or more of above races), Non-Hispanic | 18-49 | 12055 | 2410250 | 500.2 |
| Central California | White, Non-Hispanic | 50-64 | 36439 | 7719775 | 472.0 |
| Central California | American Indian or Alaska Native, Non-Hispanic | 50-64 | 699 | 156488 | 446.7 |
| Central California | Black, Non-Hispanic | 50-64 | 4204 | 953839 | 440.7 |
| Greater Sierra Sacramento | Black, Non-Hispanic | 50-64 | 4024 | 956970 | 420.5 |
| Greater Sierra Sacramento | American Indian or Alaska Native, Non-Hispanic | 50-64 | 453 | 118110 | 383.5 |
| Central California | Hispanic (any race) | 50-64 | 38714 | 10169984 | 380.7 |
| Central California | Asian, Non-Hispanic | 50-64 | 6354 | 1877453 | 338.4 |
| Greater Sierra Sacramento | White, Non-Hispanic | 50-64 | 30750 | 9283601 | 331.2 |
| Central California | Multiracial (two or more of above races), Non-Hispanic | 50-64 | 1378 | 421011 | 327.3 |
| Greater Sierra Sacramento | Hispanic (any race) | 50-64 | 9957 | 3164604 | 314.6 |
| Greater Sierra Sacramento | Asian, Non-Hispanic | 50-64 | 6568 | 2142906 | 306.5 |
| Greater Sierra Sacramento | Native Hawaiian or Pacific Islander, Non-Hispanic | 50-64 | 445 | 146816 | 303.1 |
| Central California | Native Hawaiian or Pacific Islander, Non-Hispanic | 50-64 | 255 | 86056 | 296.3 |
| Central California | White, Non-Hispanic | 0-17 | 21120 | 7513625 | 281.1 |
| Central California | American Indian or Alaska Native, Non-Hispanic | 0-17 | 379 | 147374 | 257.2 |
| Central California | Black, Non-Hispanic | 0-17 | 3442 | 1447390 | 237.8 |
| Greater Sierra Sacramento | Multiracial (two or more of above races), Non-Hispanic | 50-64 | 1338 | 574120 | 233.1 |
| Greater Sierra Sacramento | Black, Non-Hispanic | 0-17 | 2630 | 1173381 | 224.1 |
| Greater Sierra Sacramento | Native Hawaiian or Pacific Islander, Non-Hispanic | 0-17 | 253 | 131347 | 192.6 |
| Greater Sierra Sacramento | White, Non-Hispanic | 0-17 | 14838 | 7920934 | 187.3 |
| Greater Sierra Sacramento | American Indian or Alaska Native, Non-Hispanic | 0-17 | 196 | 105152 | 186.4 |
| Central California | Hispanic (any race) | 0-17 | 45211 | 25151478 | 179.8 |
| Central California | Asian, Non-Hispanic | 0-17 | 4422 | 2606573 | 169.6 |
| Greater Sierra Sacramento | Asian, Non-Hispanic | 0-17 | 4036 | 2545100 | 158.6 |
| Central California | Native Hawaiian or Pacific Islander, Non-Hispanic | 0-17 | 140 | 89683 | 156.1 |
| Greater Sierra Sacramento | Hispanic (any race) | 0-17 | 10120 | 6918921 | 146.3 |
| Central California | Multiracial (two or more of above races), Non-Hispanic | 0-17 | 2396 | 1693654 | 141.5 |
| Greater Sierra Sacramento | Multiracial (two or more of above races), Non-Hispanic | 0-17 | 2298 | 2095352 | 109.7 |
Note:
Rates are calculated as total new infections divided by the 2023 population for each health officer region, race/ethnicity, and age group, multiplied by 100,000. Cells highlighted in red indicate values greater than the overall mean cumulative incidence rate across all groups.
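As a worked check of the rate calculation in the note, the short Python sketch below reproduces the first two rows of the table from their infection counts and populations. The function name is illustrative only.

```python
def cumulative_incidence_per_100k(total_new_infections, population):
    """Total new infections divided by population, scaled to 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return total_new_infections / population * 100_000


# Central California, White, Non-Hispanic, 65+
print(round(cumulative_incidence_per_100k(100740, 8812928), 1))  # 1143.1
# Central California, American Indian or Alaska Native, Non-Hispanic, 65+
print(round(cumulative_incidence_per_100k(1526, 145142), 1))  # 1051.4
```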

Results

Weekly trends in cumulative incidence for this outbreak were similar across all health officer regions in California, with synchronous increases and declines over the course of the outbreak. Across regions, cumulative incidence peaked between August 28 and September 18, 2023. The mean weekly cumulative incidence rate during the outbreak period was 402.45 cases per 100,000 population. From August 21 through October 23, 2023, weekly cumulative incidence rates in all health officer regions exceeded this mean.

Despite these shared patterns, two regions experienced a disproportionately high burden of disease: Central California and Greater Sierra Sacramento. These regions consistently had higher cumulative incidence rates than other health officer regions in the state.

To identify priority populations for focused intervention, we examined cumulative incidence within these two regions by race/ethnicity and age group. The highest cumulative incidence rates were observed among adults aged 65 years and older in Central California, with White, non-Hispanic individuals most affected, followed by American Indian or Alaska Native, non-Hispanic individuals, and then Black, non-Hispanic individuals. The fourth-highest cumulative incidence rate was observed among Black, non-Hispanic adults aged 65 years and older in the Greater Sierra Sacramento region. Overall, 32 race/ethnicity and age-specific subgroups within Central California and Greater Sierra Sacramento had cumulative incidence rates exceeding the statewide mean.
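The comparison against the mean of 402.45 per 100,000 can be illustrated with a small Python sketch. It uses five rows copied from the table rather than the full set, so the count it produces applies to this sample only. The tuple layout and function name are assumptions made for illustration.

```python
MEAN_RATE = 402.45  # mean rate per 100,000 reported in the text

# (region, race/ethnicity, age category, cumulative incidence per 100,000)
sample_rows = [
    ("Central California", "White, Non-Hispanic", "65+", 1143.1),
    ("Greater Sierra Sacramento", "Black, Non-Hispanic", "65+", 988.2),
    ("Central California", "White, Non-Hispanic", "50-64", 472.0),
    ("Greater Sierra Sacramento",
     "American Indian or Alaska Native, Non-Hispanic", "50-64", 383.5),
    ("Central California", "Hispanic (any race)", "0-17", 179.8),
]


def groups_above(rows, threshold):
    """Return the subgroups whose rate exceeds the threshold, highest rate first."""
    flagged = [row for row in rows if row[3] > threshold]
    return sorted(flagged, key=lambda row: row[3], reverse=True)


for region, race, age, rate in groups_above(sample_rows, MEAN_RATE):
    print(f"{region} | {race} | {age} | {rate}")
```

Three of the five sample rows exceed the threshold. Applying the same filter to all rows of the table gives the subgroup count reported above.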

Discussion

Overall, the weekly cumulative incidence trends shown in Visualization 1 indicate that the outbreak followed a similar temporal pattern across California health officer regions, with rapid increases during the summer of 2023, peaks in late August through mid-September, and steady declines into the fall. While the timing of the outbreak was consistent statewide, disease burden was not evenly distributed. Central California and Greater Sierra Sacramento experienced consistently higher cumulative incidence rates than other regions, indicating a disproportionate share of infections during the outbreak period.

The cumulative incidence table further demonstrates that within these higher-burden regions, incidence varied by age group and race/ethnicity. Adults aged 65 years and older experienced the highest cumulative incidence, particularly in Central California, with notable differences across racial and ethnic groups. Similar patterns were observed in Greater Sierra Sacramento, where older adult populations also exhibited elevated incidence. Together, these findings highlight meaningful geographic and demographic variation in disease burden and underscore the importance of region-specific surveillance and targeted public health planning rather than reliance on statewide averages alone.