Estimating the Preventable Burden of Dementia in Cameron County: A Quasi-Causal Modeling Approach

Abstract and Overview

Objective:

To calculate point prevalence estimates for smaller geographic units, using multiple imputation to account for missing data, and to use those estimates to construct population attributable fractions (PAF), potential impact fractions (PIF), and the potential change in dementia cases associated with a reduction in risk factor prevalence.

Introduction:

Alzheimer’s Disease and Related Dementias (ADRD) are increasingly recognized as a global health crisis owing to the growing size of the aging population. As modifiable risk factor reduction across the lifespan becomes a front-line intervention to mitigate dementia onset, public health programming needs increasingly granular epidemiological approaches to evaluate outcomes and to ensure that prevention efforts by state and local health departments align with known risk data (such as the risk factors identified by the 2024 Lancet Commission) and with the conditions of highest burden within communities. With such approaches, health departments can better contextualize community efforts in ADRD prevention and establish high-yield best practices.

Thus, to support public health programming for ADRD, a “proof-of-concept” pipeline was established to produce modifiable risk factor prevalence estimates within smaller geographic units (e.g. counties), identify which risk factors are most highly correlated, approximate the effects of modest improvement in prevalence (simulating small-scale but effective outreach impact), and ultimately calculate potentially preventable cases of dementia in that area as a result. This work aims to improve quantification of public health programming outcomes in small and large communities. In turn, health departments can meaningfully improve health outcomes in a cost-effective fashion, support policy development through analysis, and build community trust through contextually relevant outreach and programming.

Design:

We estimated the prevalence and relative impact of eleven (11) modifiable dementia risk factors in Cameron County, Texas, using 2023 Texas BRFSS data. To address missing data, multiple imputation was used (predictive mean matching, m = 20) rather than multiyear BRFSS pooling to assess a greater number of Lancet Commission modifiable risk factors within the data; checks for convergence confirmed that distributions of imputed categorical data matched those of the original data. Survey-weighted totals were pooled across imputations and stratified by age group for risk factors identified as relevant. We computed prevalence proportions and applied principal component analysis (PCA) to a tetrachoric correlation matrix to estimate shared variance (communalities, H²).

Results:

Prevalence point estimates were built for each age group. Midlife populations (i.e., 45 to 64 years of age) represented the majority of prevention opportunities, with high cholesterol, high blood pressure, and obesity present in >49% of those in midlife. Interrelated risk clustering was also observed: the highest communalities (the variance a risk factor shares with the other risk factors, following Lee et al. 2022a) were noted for diabetes (H² = 79.2%), excessive alcohol consumption (H² = 78.9%), and depression (H² = 75.4%). Finally, when a modest (10%) reduction in prevalence across all eleven risk factors was considered, an estimated 352 dementia cases could have been prevented.

Foreword

In 1999, Dr. Gladys Maestre wrote:

“…people who choose neuroscience as their field of endeavor face a special burden because neural problems are not high on the list of public health priorities, little funding is available, the public does not [apply pressure] to solve these problems, [and] researchers …are often secretive about their work.

To compensate for these difficulties, the successful researcher searches for creative solutions such as strategic alliances [involving two or more teams that share resources and information to their mutual benefit].” (Maestre 1999)

Nearly 30 years later, with tremendous advances in the field of Alzheimer’s Disease and Related Dementias (ADRD) risk factor reduction and prevention, these words remain as relevant as ever, if not more so, in the face of an aging global population and a renewed interest in ADRD programming in public health.

Thus, in the spirit of these words, this white paper and the work within it have been developed to empower public health professionals in organizations large and small, in the hope that open-access, functional work built to solve problems at home can foster strategic alliances that solve problems across the world.

Background

This work focuses on using readily available “microdata” like county-level survey data found within state BRFSS datasets and U.S. Census Bureau data to estimate how much dementia prevalence could be reduced in a specific area if common modifiable risk factors could be reduced.

It does so through:

  • Imputation of missing risk factor data through the use of Predictive Mean Matching (“pmm”) via the “mice” package in R.

  • Estimation of age-stratified prevalence through the use of the “survey” package in R.

    • A version stratified by age and binary race/ethnicity (including race/ethnicity as a predictor) was also created, but its estimates are questionable because of limited data granularity and stability.

  • Calculation of Population Attributable Fractions (PAFs), based heavily on the methodology in the Lancet 2024 supplementary material (Livingston, n.d.), through use of the “psych” package in R.

  • Calculation of Potential Impact Fractions (PIFs), and adjustments thereof, in line with the work of Ma’u et al. (Ma’u et al. 2025) and Lee et al. (Lee et al. 2022b).

  • Projection of preventable dementia burden from these fractions, as in Ma’u et al. (Ma’u et al. 2025) and Lee et al. (Lee et al. 2022b).
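
The equations behind the last three steps are not reproduced in this section. As orientation only, and assuming the standard forms found in the cited literature (Levin's formula for the PAF, a weight of one minus the communality, and a PIF for a proportional reduction in prevalence), they can be written as below. The symbols are notation introduced here, not taken from the source: P_i is the prevalence of risk factor i, RR_i its relative risk for dementia, H_i² its communality, r the proportional reduction in prevalence (0.10 for the scenario in the abstract), and N the number of prevalent dementia cases in the area. The exact adjustments used in this pipeline may differ.

```latex
\begin{align}
\mathrm{PAF}_i &= \frac{P_i\,(RR_i - 1)}{1 + P_i\,(RR_i - 1)} \\
w_i &= 1 - H_i^{2} \\
\mathrm{PAF}_{\text{overall}} &= 1 - \prod_{i=1}^{11} \bigl(1 - w_i\,\mathrm{PAF}_i\bigr) \\
\mathrm{PIF}_i &= \frac{\bigl(P_i - (1 - r)\,P_i\bigr)\,(RR_i - 1)}{1 + P_i\,(RR_i - 1)} = r\,\mathrm{PAF}_i \\
\mathrm{PIF}_{\text{overall}} &= 1 - \prod_{i=1}^{11} \bigl(1 - w_i\,\mathrm{PIF}_i\bigr) \\
\text{Preventable cases} &\approx \mathrm{PIF}_{\text{overall}} \times N
\end{align}
```

Under these forms, a risk factor that is both common and strongly associated with dementia has a large PAF, while a high communality shrinks its contribution because much of its effect overlaps with other risk factors.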

In turn, local and state health departments are better able to identify the risk factors that contribute most to ADRD prevalence and to approximate the prospective effect of outreach and programming across communities by calculating the number of dementia cases those efforts may prevent. As such, policy and the effects of public health programming in small and large organizations can be further quantified.
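
To make the calculation of preventable cases concrete, here is a minimal sketch in Python (the pipeline itself is described as using R). It assumes Levin's formula for the PAF, a communality weight of one minus H², and a proportional-reduction PIF; the exact adjustments in the pipeline may differ. Every input below, including the risk factor names, prevalences, relative risks, communalities, and the dementia case count, is a hypothetical placeholder and not a Cameron County estimate.

```python
from math import prod


def levin_paf(prevalence, relative_risk):
    """Population attributable fraction for one binary risk factor (Levin's formula)."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def proportional_pif(prevalence, relative_risk, reduction):
    """Potential impact fraction when prevalence falls by a proportion `reduction`."""
    new_prevalence = prevalence * (1.0 - reduction)
    numerator = (prevalence - new_prevalence) * (relative_risk - 1.0)
    return numerator / (1.0 + prevalence * (relative_risk - 1.0))


def combine_weighted(fractions, communalities):
    """Combine per-factor fractions, down-weighting each by its communality."""
    return 1.0 - prod(1.0 - (1.0 - h2) * f for f, h2 in zip(fractions, communalities))


# Hypothetical inputs for illustration only (assumption, not source data).
risk_factors = [
    {"name": "factor_a", "prevalence": 0.50, "rr": 1.5, "h2": 0.60},
    {"name": "factor_b", "prevalence": 0.20, "rr": 2.0, "h2": 0.50},
]
reduction = 0.10                  # the 10% scenario described in the abstract
hypothetical_dementia_cases = 1000  # placeholder case count

pafs = [levin_paf(f["prevalence"], f["rr"]) for f in risk_factors]
pifs = [proportional_pif(f["prevalence"], f["rr"], reduction) for f in risk_factors]
h2s = [f["h2"] for f in risk_factors]

overall_paf = combine_weighted(pafs, h2s)
overall_pif = combine_weighted(pifs, h2s)
preventable_cases = overall_pif * hypothetical_dementia_cases

print(f"Overall weighted PAF: {overall_paf:.4f}")
print(f"Overall weighted PIF: {overall_pif:.4f}")
print(f"Potentially preventable cases: {preventable_cases:.1f}")
```

Keeping the per-factor fractions separate from the combination step makes it easy to see which risk factors drive the total and how much the communality weighting discounts each one.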

Multiple Imputation

To address problems common to survey data in smaller areas, such as missing data or non-response, it was necessary to strengthen the dataset through imputation: the replacement of a missing value with a reasonable one, chosen according to a specific process, idea, or methodology.
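
As an illustration of the idea behind predictive mean matching, the sketch below fills each missing value with an observed value borrowed from a "donor" whose predicted value is close to that of the incomplete record. It is written in Python with a single predictor and is a simplification: the “mice” implementation used in this work also draws the regression parameters at random before matching, which this sketch omits. The toy ages, responses, and the choice of five donors are hypothetical; only m = 20 comes from the design described above.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(xs, ys, rng, donors=5):
    """Return a completed copy of ys, filling None entries from nearby donors."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope = fit_line([x for x, _ in observed], [y for _, y in observed])
    # Pair each observed value with its predicted mean.
    donor_pool = [(intercept + slope * x, y) for x, y in observed]
    completed = []
    for x, y in zip(xs, ys):
        if y is not None:
            completed.append(y)  # observed values are never altered
            continue
        target = intercept + slope * x
        nearest = sorted(donor_pool, key=lambda pair: abs(pair[0] - target))[:donors]
        completed.append(rng.choice(nearest)[1])  # borrow a real observed value
    return completed


# Hypothetical toy data: age and a yes/no (1/0) survey response with gaps.
ages = [25, 31, 38, 44, 47, 52, 58, 63, 67, 72, 76, 81]
has_condition = [0, 0, None, 0, 1, None, 1, 0, 1, 1, None, 1]

rng = random.Random(2023)  # fixed seed so the sketch is reproducible
imputed_sets = [pmm_impute(ages, has_condition, rng) for _ in range(20)]  # m = 20
print(imputed_sets[0])
```

Because every imputed value is copied from an actual respondent, the filled-in data can only contain values that already occur in the survey, which is one reason the method is often applied to categorical and ordinal items.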

While there is a great deal of nuance surrounding multiple imputation (Buuren and Groothuis-Oudshoorn 2011; Jia and Wu 2022; Memon, Wamala, and Kabano 2023), especially for categorical variables such as answer choices in a survey, it is possible to check whether a method “converges”, that is, whether the imputed values generally match the distribution of the original dataset. For the Cameron County 2023 BRFSS data, the imputed distributions were similar to the original under nearly all of the methods available through “mice” for categorical and ordinal data.
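
One simple numerical version of such a distribution check is sketched below in Python: it compares the share of each answer category in the observed responses with the share in each completed dataset and reports the largest gap. This is an assumed, illustrative check, not the diagnostic used in this work, and the category labels and data are hypothetical.

```python
from collections import Counter


def category_shares(values):
    """Proportion of records falling in each category."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def max_share_gap(observed, completed_sets):
    """Largest absolute difference in any category share, across all imputations."""
    observed_shares = category_shares(observed)
    worst = 0.0
    for completed in completed_sets:
        imputed_shares = category_shares(completed)
        for category in set(observed_shares) | set(imputed_shares):
            gap = abs(observed_shares.get(category, 0.0) - imputed_shares.get(category, 0.0))
            worst = max(worst, gap)
    return worst


# Hypothetical example: six observed answers and two completed datasets of eight.
observed_answers = ["never", "never", "former", "current", "never", "former"]
completed_datasets = [
    ["never"] * 4 + ["former"] * 3 + ["current"] * 1,
    ["never"] * 4 + ["former"] * 2 + ["current"] * 2,
]

gap = max_share_gap(observed_answers, completed_datasets)
print(f"Largest category share gap: {gap:.3f}")
```

A small gap suggests the imputation has not distorted the answer distribution; what counts as small enough is a judgment call that depends on the sample size and the variable.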


Discussion

as can the dollars-to-dollars benefits of investment into public health initiatives on a prevalence-based basis


References

Buuren, Stef Van, and Karin Groothuis-Oudshoorn. 2011. “mice: Multivariate Imputation by Chained Equations in R.” Journal of Statistical Software 45 (3). https://doi.org/10.18637/jss.v045.i03.
Jia, Fan, and Wei Wu. 2022. “A Comparison of Multiple Imputation Strategies to Deal with Missing Nonnormal Data in Structural Equation Modeling.” Behavior Research Methods 55 (6): 3100–3119. https://doi.org/10.3758/s13428-022-01936-y.
Lee, Mark, Eric Whitsel, Christy Avery, Timothy M. Hughes, Michael E. Griswold, Sanaz Sedaghat, Rebecca F. Gottesman, Thomas H. Mosley, Gerardo Heiss, and Pamela L. Lutsey. 2022b. “Variation in Population Attributable Fraction of Dementia Associated With Potentially Modifiable Risk Factors by Race and Ethnicity in the US.” JAMA Network Open 5 (7): e2219672. https://doi.org/10.1001/jamanetworkopen.2022.19672.
———. 2022a. “Variation in Population Attributable Fraction of Dementia Associated With Potentially Modifiable Risk Factors by Race and Ethnicity in the US.” JAMA Network Open 5 (7): e2219672. https://doi.org/10.1001/jamanetworkopen.2022.19672.
Livingston, Gill. n.d. “The Lancet 2024 Supplementary Appendix.” https://www.thelancet.com/cms/10.1016/S0140-6736(24)01296-0/attachment/95b06bf4-f411-4c87-b960-00b474cdd26f/mmc1.pdf.
Ma’u, Etuini, Naaheed Mukadam, Gill Livingston, Sebastian Walsh, Susanne Röhr, Carol Brayne, Gary Cheung, and Sarah Cullum. 2025. “Estimating the Impact of Risk Factor Reduction on Dementia Prevalence in New Zealand.” Alzheimer’s & Dementia 21 (7). https://doi.org/10.1002/alz.70440.
Maestre, Gladys. 1999. “Strategic Alliances in Neuroscience,” August. https://research.ebsco.com/c/y5wonk/search/details/bh53z2awff?db=a9h.
Memon, Shaheen Mz., Robert Wamala, and Ignace H. Kabano. 2023. “A Comparison of Imputation Methods for Categorical Data.” Informatics in Medicine Unlocked 42: 101382. https://doi.org/10.1016/j.imu.2023.101382.