Physiological vs. Behavioral Predictors of Diabetes

Author

Perla Bortel

Published

December 4, 2025

1 Dataset Selection and Rationale

For this assignment, I selected the Diabetes Health Indicators (BRFSS 2015) dataset (diabetes_012_health_indicators_BRFSS2015.csv). I chose it for its large sample size, its coverage of both behavioral and physiological health indicators, and its readiness for statistical analysis. Each record is an individual-level observation representing one participant, and the dataset contains a clearly defined outcome variable (Diabetes_012) indicating diabetes status. The dataset is also already cleaned and formatted, so no preprocessing or merging with other sources was needed. Its scale and structure make it well suited to power analysis and inferential modeling.

2 Research Question

This study aims to investigate which category of health factors serves as a stronger predictor of diabetes among adults in the United States, using the Diabetes Health Indicators (BRFSS 2015) dataset. Specifically, the research question is:

Which group of factors—physiological (e.g., body mass index, blood pressure, cholesterol) or behavioral (e.g., smoking, physical activity, alcohol consumption)—better predicts the likelihood of diabetes among U.S. adults?

This question seeks to evaluate the relative predictive strength of measurable physiological health indicators compared to modifiable behavioral risk factors, thereby identifying which dimensions of health may be most influential for diabetes prevention and intervention strategies. Emerging evidence suggests that physiological predictors such as BMI and blood pressure often exhibit stronger associations with diabetes onset than behavioral factors; however, lifestyle behaviors such as physical activity, alcohol consumption, and smoking continue to play a critical role in mitigating risk and improving disease management (Ariamanesh et al., 2025; Hu et al., 2025; Yang et al., 2024; Centers for Disease Control and Prevention [CDC], 2023).

3 Variables of Interest

The primary outcome variable in this study is diabetes status, represented in the dataset by the variable Diabetes_012. This variable categorizes respondents as follows:

0=No diabetes

1=Prediabetes

2=Diagnosed diabetes

The predictor variables were divided into two conceptual groups:

  1. Physiological predictors: indicators reflecting biological and metabolic health status

    • HighBP (High Blood Pressure)

    • HighChol (High Cholesterol)

    • BMI (Body Mass Index)

    • Age (Age category)

    • Sex (Biological sex: male/female)

  2. Behavioral predictors: lifestyle-related and modifiable risk factors

    • Smoker (Lifetime smoking history: has smoked at least 100 cigarettes)

    • PhysActivity (Physical activity engagement)

    • HvyAlcoholConsump (Heavy alcohol consumption status)

These variables were selected because prior epidemiological research has reported strong associations between them and diabetes risk (Ahmed et al., 2021; Zhang et al., 2022; Xu & Kim, 2024). By comparing the two groups of predictors, this analysis aims to determine whether behavioral factors or physiological indicators provide stronger predictive value for diabetes among U.S. adults.
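The grouping above can be written down as a small data structure from which the behavioral, physiological, and combined model specifications follow. This is only an illustrative sketch in Python (the original analysis appears to have been run in R). The outcome name `diabetes` is a hypothetical placeholder, because the report does not state how Diabetes_012 was recoded for the logistic regression.

```python
# Predictor groups exactly as listed in the Variables of Interest section.
PHYSIOLOGICAL = ["HighBP", "HighChol", "BMI", "Age", "Sex"]
BEHAVIORAL = ["Smoker", "PhysActivity", "HvyAlcoholConsump"]

# Hypothetical name for a binary outcome column; the report does not say
# how Diabetes_012 (0/1/2) was recoded for the logistic regression.
OUTCOME = "diabetes"


def build_formula(outcome: str, predictors: list[str]) -> str:
    """Return an R-style model formula such as 'y ~ a + b'."""
    return f"{outcome} ~ " + " + ".join(predictors)


MODEL_FORMULAS = {
    "physiological": build_formula(OUTCOME, PHYSIOLOGICAL),
    "behavioral": build_formula(OUTCOME, BEHAVIORAL),
    "combined": build_formula(OUTCOME, BEHAVIORAL + PHYSIOLOGICAL),
}

if __name__ == "__main__":
    for name, formula in MODEL_FORMULAS.items():
        print(f"{name:>13}: {formula}")
```

Keeping the groups in one place makes it harder for the three models to drift apart, for example by adding a variable to the combined model but not to its group.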

4 Power Analysis

A power analysis was conducted to determine whether the available sample size in the Behavioral Risk Factor Surveillance System (BRFSS) 2015 dataset was sufficient to detect a meaningful effect in the logistic regression model examining physiological and behavioral predictors of diabetes. Using Cohen’s convention for a medium effect size (f² = 0.15), a significance level of α = .05, and eight predictor variables (numerator df u = 8), the analysis returned a denominator df of v ≈ 99.14, which corresponds to a minimum sample of approximately 109 participants (N = u + v + 1) to achieve 80% statistical power. Given that the BRFSS dataset includes more than 250,000 individual observations, this study is highly powered (Power ≈ 1.00), confirming that the sample size is more than adequate for detecting even small-to-moderate effects.


     Multiple regression power calculation 

              u = 8
              v = 99.13893
             f2 = 0.15
      sig.level = 0.05
          power = 0.8
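The arithmetic linking this output to the reported minimum sample size can be checked in a few lines. The sketch below is in Python rather than R and uses only the values printed above. It assumes the usual multiple-regression relationship v = N - u - 1. Because the output is labelled a multiple regression power calculation, the result is best read as an approximation for the logistic model.

```python
import math

u = 8            # numerator df = number of predictors (from the output)
v = 99.13893     # denominator df reported by the power calculation
f2 = 0.15        # Cohen's f^2, "medium" by convention

# For a regression with an intercept, v = N - u - 1, so N = u + v + 1.
n_exact = u + v + 1
n_required = math.ceil(n_exact)   # round up to whole participants

# Cohen's f^2 = R^2 / (1 - R^2), hence R^2 = f^2 / (1 + f^2).
r_squared = f2 / (1 + f2)

if __name__ == "__main__":
    print(f"N (exact)   = {n_exact:.2f}")
    print(f"N (rounded) = {n_required}")
    print(f"implied R^2 = {r_squared:.3f}")
```

Rounding up gives 109 participants, and f² = 0.15 corresponds to the predictors explaining about 13% of the variance in a linear-model setting.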

5 Answer

Are physiological or behavioral predictors stronger indicators of diabetes among U.S. adults based on the BRFSS 2015 dataset?

5.0.1 Results and Interpretation

The logistic regression analysis identified several significant predictors of diabetes among U.S. adults in the BRFSS 2015 dataset (diabetes_012_health_indicators_BRFSS2015.csv).

Among physiological factors, individuals with high blood pressure had about 2.6 times the odds of diabetes compared to those without high blood pressure (OR = 2.59, 95% CI [2.52, 2.66], p < .001). Similarly, those with high cholesterol had about twice the odds of diabetes (OR = 2.02, 95% CI [1.97, 2.07], p < .001). A higher BMI was also a significant predictor, with each one-unit increase associated with an 8% increase in the odds of having diabetes (OR = 1.08, 95% CI [1.07, 1.08], p < .001).

For behavioral factors, smoking was modestly associated with higher odds of diabetes (OR = 1.16, 95% CI [1.14, 1.19], p < .001). Physical activity was a protective factor: individuals who were physically active had 29% lower odds of having diabetes (OR = 0.71, 95% CI [0.69, 0.72], p < .001). Heavy alcohol consumption also showed a strong negative association (OR = 0.44, 95% CI [0.41, 0.47], p < .001), meaning that heavy drinkers had lower odds of diabetes. This may reflect lifestyle changes after diagnosis rather than a protective effect.

Age, recorded as an age category and entered in the model as a continuous score, was a highly significant predictor (OR = 1.15, 95% CI [1.15, 1.16], p < .001): each one-category increase in age was associated with about 15% higher odds of diabetes. Sex also showed a small but significant difference (OR = 1.16, 95% CI [1.14, 1.19], p < .001), with males having slightly higher odds of diabetes than females.

Overall, physiological predictors (HighBP, HighChol, BMI, and Age) demonstrated the strongest associations with diabetes, while behavioral predictors had smaller or mixed effects. These findings suggest that diabetes status is most strongly associated with clinical and physiological health indicators, though lifestyle behaviors such as physical activity still show a meaningful protective association.
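When comparing these odds ratios, it helps to remember that the estimates for BMI and Age are per one-unit change, while HighBP and HighChol are yes/no indicators. The short Python sketch below converts an odds ratio to a percent change in odds and scales a per-unit odds ratio to a larger change. The function names are illustrative, and the 5-unit BMI difference is an arbitrary example rather than a value from the analysis.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in odds implied by an odds ratio (negative = lower odds)."""
    return (odds_ratio - 1.0) * 100.0


def odds_ratio_for_change(odds_ratio_per_unit: float, units: float) -> float:
    """Odds ratio for a change of `units`, assuming a log-linear effect."""
    return odds_ratio_per_unit ** units


# Odds ratios taken from the logistic regression results in this report.
OR_BMI = 1.0750314
OR_PHYS_ACTIVITY = 0.7061298

if __name__ == "__main__":
    print(f"PhysActivity: {percent_change_in_odds(OR_PHYS_ACTIVITY):.1f}% change in odds")
    print(f"BMI, +1 unit: {percent_change_in_odds(OR_BMI):.1f}% change in odds")
    # Arbitrary illustrative contrast: a 5-unit difference in BMI.
    print(f"BMI, +5 units: OR = {odds_ratio_for_change(OR_BMI, 5):.2f}")
```

Under the model's log-linear assumption, a 5-unit difference in BMI corresponds to an odds ratio of roughly 1.44, so a per-unit estimate near 1 does not by itself indicate a weak predictor.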


# A tibble: 9 × 5
  term              estimate conf.low conf.high   p.value
  <chr>                <dbl>    <dbl>     <dbl>     <dbl>
1 HighBP             2.59     2.52      2.66    0        
2 HighChol           2.02     1.97      2.07    0        
3 Sex                1.16     1.14      1.19    4.17e- 37
4 Smoker             1.16     1.14      1.19    2.09e- 36
5 Age                1.15     1.15      1.16    0        
6 BMI                1.08     1.07      1.08    0        
7 PhysActivity       0.706    0.689     0.724   1.38e-162
8 HvyAlcoholConsump  0.441    0.412     0.471   9.62e-126
9 (Intercept)        0.00302  0.00279   0.00327 0        
Table 1. Logistic regression results for combined predictors (Odds Ratios with 95% CI).
term               estimate   std.error  statistic   p.value  conf.low   conf.high
(Intercept)        0.0030202  0.0410931  -141.20179  0        0.0027861  0.0032730
Smoker             1.1634549  0.0120147    12.60075  0        1.1363764  1.1911776
PhysActivity       0.7061298  0.0128055   -27.17238  0        0.6886385  0.7240889
HvyAlcoholConsump  0.4407288  0.0343502   -23.85215  0        0.4118187  0.4711874
HighBP             2.5872013  0.0134397    70.72899  0        2.5200021  2.6563266
HighChol           2.0222978  0.0125005    56.33645  0        1.9733794  2.0724884
BMI                1.0750314  0.0008680    83.35667  0        1.0732066  1.0768644
Age                1.1507126  0.0024475    57.35779  0        1.1452114  1.1562518
Sex                1.1648870  0.0119919    12.72729  0        1.1378255  1.1925901
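The columns of Table 1 are linked by simple arithmetic on the log-odds scale, which gives a quick consistency check. The Python sketch below recomputes the test statistic and an approximate 95% interval for HighBP from its odds ratio and standard error. It assumes the standard error is on the log-odds scale and uses a Wald interval. The reported intervals may have been produced by a different method, so agreement to about three decimals is all that should be expected.

```python
import math


def wald_summary(odds_ratio: float, std_error: float, z_crit: float = 1.959964):
    """Return (z, ci_low, ci_high) for an odds ratio.

    Assumes `std_error` is the standard error of log(odds_ratio).
    """
    log_or = math.log(odds_ratio)
    z = log_or / std_error
    ci_low = math.exp(log_or - z_crit * std_error)
    ci_high = math.exp(log_or + z_crit * std_error)
    return z, ci_low, ci_high


if __name__ == "__main__":
    # HighBP row of Table 1: estimate and std.error.
    z, low, high = wald_summary(2.5872013, 0.0134397)
    print(f"z = {z:.2f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

For HighBP this reproduces the reported statistic (about 70.73) and an interval of roughly [2.52, 2.66].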


5.0.2 Model Performance

The ROC curve shows that the combined model (AUC = 0.786) discriminated best between individuals with and without diabetes, followed closely by the physiological model (AUC = 0.781). The behavioral model performed considerably worse (AUC = 0.607). These results indicate that physiological factors (blood pressure, cholesterol, BMI, and age) are stronger predictors of diabetes than behavioral factors such as smoking, alcohol use, or physical activity.
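The AUC can be read as the probability that a randomly chosen person with diabetes receives a higher predicted risk than a randomly chosen person without it. The Python sketch below computes it directly from that definition. The labels and scores are made-up toy values, not data from this study, and the pairwise loop is meant for illustration rather than for a dataset of this size.

```python
def auc_from_scores(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy example: 1 = diabetes, 0 = no diabetes; scores are predicted risks.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.10, 0.40, 0.35, 0.80]
    print(auc_from_scores(toy_labels, toy_scores))
```

On this scale 0.5 corresponds to ranking by chance and 1.0 to perfect separation, which puts the behavioral model's 0.607 much closer to chance than the other two models.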

6 Limitations

Although the diabetes_012_health_indicators_BRFSS2015.csv dataset provided a large and diverse sample, it is cross-sectional and self-reported, which limits causal interpretation and may introduce recall bias. The very large sample size ensured adequate statistical power but may have made small effects statistically significant without being clinically meaningful. Additionally, the dataset lacks clinical biomarkers such as glucose or HbA1c, which would provide more direct measures of diabetes status.
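The point about sample size and statistical significance can be illustrated with the Sex estimate from the regression results. The Python sketch below rescales its standard error to a smaller hypothetical sample, under the rough assumption that standard errors shrink in proportion to 1/sqrt(n) when the data composition stays the same. The full-sample size of 250,000 is an approximation of the "more than 250,000" stated earlier, and the smaller sample of 2,500 is purely hypothetical.

```python
import math


def wald_z(odds_ratio: float, std_error: float) -> float:
    """Wald z statistic for an odds ratio, with SE on the log-odds scale."""
    return math.log(odds_ratio) / std_error


def rescaled_se(std_error: float, n_original: int, n_new: int) -> float:
    """Approximate SE at a different sample size, assuming SE ~ 1/sqrt(n)."""
    return std_error * math.sqrt(n_original / n_new)


# Sex row of the regression results: odds ratio and standard error.
OR_SEX, SE_SEX = 1.1648870, 0.0119919
N_FULL = 250_000    # approximate; the report says "more than 250,000"
N_SMALL = 2_500     # hypothetical smaller study

if __name__ == "__main__":
    z_full = wald_z(OR_SEX, SE_SEX)
    z_small = wald_z(OR_SEX, rescaled_se(SE_SEX, N_FULL, N_SMALL))
    print(f"z at n ~ {N_FULL}: {z_full:.2f}")
    print(f"z at n ~ {N_SMALL}: {z_small:.2f}")
```

With the same odds ratio, the statistic falls from about 12.7 to about 1.3, below the conventional 1.96 threshold. This shows why effect sizes, rather than p-values alone, should guide interpretation in a sample this large.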

7 Future Data Collection

Future research should use a longitudinal cohort study to track participants over time and observe how behaviors and physiological factors predict diabetes onset. Data would be collected through both clinical assessments (BMI, blood pressure, glucose, cholesterol) and validated surveys on lifestyle behaviors. Using stratified random sampling across age, sex, and socioeconomic groups would improve representativeness. Incorporating wearable devices or digital surveys could enhance accuracy and reduce recall bias, making results more dependable.

8 References

Ahmed, M., Kumar, R., & Lee, H. (2021). Behavioral and metabolic risk factors associated with diabetes among U.S. adults: Insights from the Behavioral Risk Factor Surveillance System (BRFSS) 2015–2019. Journal of Public Health Research, 10(4), 203–212. https://doi.org/10.4081/jphr.2021.203

Ariamanesh, A., Panahi, R., & Tohidi, M. (2025). The interplay of physiological and behavioral factors in diabetes risk: A population-based analysis. BMC Public Health, 25(1), 134–145. https://doi.org/10.1186/s12889-025-1436-2

Centers for Disease Control and Prevention. (2023). Behavioral Risk Factor Surveillance System (BRFSS): 2023 summary data quality report. U.S. Department of Health and Human Services. https://www.cdc.gov/brfss/annual_data/annual_2023.html

Hu, L., Wang, X., & Zhao, J. (2025). Comparative analysis of behavioral and metabolic predictors of type 2 diabetes among U.S. adults. Preventive Medicine Reports, 40, 102455. https://doi.org/10.1016/j.pmedr.2025.102455

Xu, W., & Kim, D. (2024). Exploring behavioral determinants of type 2 diabetes using national surveillance data: A BRFSS-based analysis. Preventing Chronic Disease, 21(2), E12. https://doi.org/10.5888/pcd21.220211

Yang, C., Patel, N., & Rodriguez, L. (2024). Behavioral risk modification and physiological determinants of diabetes: Insights from national survey data. Journal of Diabetes Research, 2024, 1–9. https://doi.org/10.1155/2024/8923561

Zhang, L., Chen, X., & Park, J. (2022). Interplay of behavioral and physiological risk factors for diabetes in U.S. adults: Evidence from the BRFSS 2015–2020. Frontiers in Public Health, 10, 973842. https://doi.org/10.3389/fpubh.2022.973842