This analysis examines how socioeconomic conditions relate to the burden of chronic disease across the United States. The Centers for Disease Control and Prevention’s Chronic Disease Indicators (CDI) dataset aggregates and standardizes public health indicators across all U.S. states and territories.
The original CDI dataset contains 309,215 observations and 34 variables. The variable data types are as follows:
Missing data occurs across several variables:
All logical variables appear to be empty placeholders and will be excluded from this analysis.
The DataValueFootnote variable documents the reason a DataValue entry is missing. Because it is populated only when data are suppressed or unavailable, it is empty in rows where a value is present.
Overall, 100,019 rows are missing a DataValue, and 90,271 of those occur in observations stratified by race/ethnicity. According to DataValueFootnote, these values are suppressed because sample sizes were too small to report, which makes direct comparison between these groups unreliable.
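The missingness tally described above can be sketched as a grouped count. This is a minimal illustration and not the code used for this analysis: the rows are invented, and the grouping column name StratificationCategory1 is an assumption about how the stratification type is labelled in the CDI file.

```python
from collections import Counter

# Toy rows standing in for CDI records. All values are invented for illustration.
# "StratificationCategory1" is an assumed column name for the stratification type.
rows = [
    {"StratificationCategory1": "Race/Ethnicity", "DataValue": "", "DataValueFootnote": "Data suppressed"},
    {"StratificationCategory1": "Race/Ethnicity", "DataValue": None, "DataValueFootnote": "Data suppressed"},
    {"StratificationCategory1": "Race/Ethnicity", "DataValue": "11.4", "DataValueFootnote": ""},
    {"StratificationCategory1": "Sex", "DataValue": "", "DataValueFootnote": "Data not available"},
    {"StratificationCategory1": "Overall", "DataValue": "9.7", "DataValueFootnote": ""},
]


def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or str(value).strip() == ""


def missing_by_group(records, value_col, group_col):
    """Count records with a missing value_col, grouped by group_col."""
    counts = Counter()
    for record in records:
        if is_missing(record.get(value_col)):
            counts[record.get(group_col)] += 1
    return counts


missing_counts = missing_by_group(rows, "DataValue", "StratificationCategory1")
total_missing = sum(missing_counts.values())

for group, count in missing_counts.most_common():
    print(f"{group}: {count} of {total_missing} missing rows")
```

Grouping the same count by DataValueFootnote instead would show which suppression reasons account for the missing values.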
After removing empty variables and filtering for relevant observations, the cleaned dataset contains 106,338 observations and 22 variables. The variable data types are as follows:
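A cleaning step of this kind can be sketched as two small operations: dropping columns that are empty in every row, then filtering rows with a predicate. The rows below are invented, and the filter criteria (2019 onward, non-missing DataValue) are hypothetical; the actual criteria for "relevant observations" are not stated here.

```python
def is_empty(value):
    """Treat None and blank strings as empty."""
    return value is None or str(value).strip() == ""


def drop_empty_columns(records):
    """Remove columns that are empty in every record."""
    if not records:
        return []
    columns = list(records[0].keys())
    keep = [c for c in columns if any(not is_empty(r.get(c)) for r in records)]
    return [{c: r.get(c) for c in keep} for r in records]


def filter_rows(records, predicate):
    """Keep only the records for which predicate(record) is true."""
    return [r for r in records if predicate(r)]


# Toy rows; values are invented. "Response" plays the role of an
# all-empty placeholder column.
raw = [
    {"YearStart": 2019, "Topic": "Diabetes", "DataValue": "10.2", "Response": None},
    {"YearStart": 2015, "Topic": "Diabetes", "DataValue": "9.8", "Response": None},
    {"YearStart": 2020, "Topic": "Diabetes", "DataValue": "", "Response": None},
]

cleaned = drop_empty_columns(raw)
# Hypothetical relevance filter: recent years with a reported value.
cleaned = filter_rows(
    cleaned,
    lambda r: r["YearStart"] >= 2019 and not is_empty(r["DataValue"]),
)

print(len(cleaned), "rows,", len(cleaned[0]), "columns")
```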
Life expectancy at birth declined nationwide between 2019 and 2020 (Figure 1). Hawaii maintained the highest life expectancy, while Mississippi remained the lowest.
Figure 1. Life expectancy at birth declined in the United States from 2019 to 2020.
National mortality rates for major chronic diseases increased between 2019 and 2021 (Figure 3). Heart disease remained the leading cause throughout the period. Mortality from diabetes and coronary heart disease also increased modestly, while stroke showed only a slight increase.
Figure 3. National age-adjusted mortality rates (per 100,000) for major chronic diseases between 2019 and 2021.
COPD hospitalizations decreased from 2019 to 2020 for both men and
women, followed by a modest increase in 2021 (Figure 4). Heart failure
hospitalizations followed the same pattern at a lower overall rate. Men
consistently had higher rates than women across all years.
Figure 4. Age-adjusted hospitalization rates among Medicare beneficiaries aged 65 years and older for COPD and heart failure, stratified by sex.
Across all four years, states with higher uninsured prevalence tended to have higher rates of diabetes (Figure 5). The positive association was consistent from 2019 through 2022.
Figure 5. Association between state-level diabetes prevalence and lack of health insurance between 2019 and 2022. Points represent state values. The blue line shows the fitted linear trend with 95% confidence intervals.
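The per-year linear trend in a plot like Figure 5 can be approximated by fitting an ordinary least squares line separately for each year. The sketch below assumes simple unweighted OLS and uses invented state values; it does not reproduce the figure's data or its confidence bands.

```python
from collections import defaultdict


def ols_fit(xs, ys):
    """Return (slope, intercept) of the least squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# (year, uninsured %, diabetes prevalence %) for hypothetical states.
records = [
    (2019, 8.0, 9.0), (2019, 12.0, 11.0), (2019, 16.0, 13.0),
    (2020, 7.0, 9.5), (2020, 11.0, 10.5), (2020, 15.0, 12.5),
]

# Group the points by year, then fit one line per year.
by_year = defaultdict(lambda: ([], []))
for year, uninsured, diabetes in records:
    by_year[year][0].append(uninsured)
    by_year[year][1].append(diabetes)

fits = {year: ols_fit(xs, ys) for year, (xs, ys) in by_year.items()}

for year in sorted(fits):
    slope, intercept = fits[year]
    print(f"{year}: slope={slope:.3f}, intercept={intercept:.3f}")
```

A positive slope in every year corresponds to the consistent positive association described above. Comparing slopes across years is also how a change in the strength of the relationship would show up.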
Across all three years, states with higher uninsured rates tended to have higher mortality rates (Figure 6). The positive association was modest in 2019 and 2020, and appeared stronger in 2021, when the slope of the relationship steepened.
Figure 6. Association between mortality and lack of health insurance across U.S. states, 2019–2021. Points represent state values. The blue line shows the fitted linear trend with 95% confidence intervals.
Social Determinants of Health
Key socioeconomic indicators showed mixed changes between 2019 and 2021 (Figure 2). The uninsured rate improved, but the share of people lacking routine checkups increased, poverty levels remained stable, and unemployment rose. Poverty and unemployment data are missing for 2020 and 2022.
Figure 2. Social determinants of health indicators across U.S. states and territories. Points represent individual states (jittered for visibility). The solid line indicates the national average.