Multiple Logistic Regression
Multiple logistic regression analysis of diabetes complications
Step 1: Generate the data
Study objective: To investigate the association between various risk factors and the occurrence of diabetes complications (yes/no) using multiple logistic regression.
- The data set contains information on 1000 patients with diabetes.
- The variables are age (in years), body mass index (BMI), blood pressure (BP), cholesterol level, smoking status (yes/no), and the presence of diabetes complications (yes/no). All values are randomly generated.
set.seed(123) # For reproducibility
n <- 1000
age <- rnorm(n, mean=55, sd=10)
bmi <- rnorm(n, mean=28, sd=5)
bp <- rnorm(n, mean=130, sd=15)
cholesterol <- rnorm(n, mean=200, sd=30)
smoking_status <- rbinom(n, 1, 0.3) # 30% smokers
# Logistic model to generate diabetes complications
logit_prob <- -5 + 0.02*age + 0.1*bmi + 0.005*bp + 0.004*cholesterol + 1.0*smoking_status
prob_diabetes_complications <- 1 / (1 + exp(-logit_prob))
diabetes_complications <- rbinom(n, 1, prob_diabetes_complications)
Create a data frame
diabetes_data <- data.frame(
  age = age,
  bmi = bmi,
  bp = bp,
  cholesterol = cholesterol,
  smoking_status = factor(smoking_status, labels=c("No", "Yes")),
  diabetes_complications = factor(diabetes_complications, labels=c("No", "Yes"))
)
head(diabetes_data)
## age bmi bp cholesterol smoking_status diabetes_complications
## 1 49.39524 23.02101 122.3259 195.4908 No Yes
## 2 52.69823 22.80022 133.5541 190.1673 No Yes
## 3 70.58708 27.91010 121.8762 156.5550 Yes Yes
## 4 55.70508 27.33912 148.2884 179.0815 No Yes
## 5 56.29288 15.25329 132.6120 277.9547 Yes No
## 6 72.15065 33.20287 120.7710 198.8775 Yes Yes
- (Optional) Export the data to a CSV file
- Since the variables are already stored as factors, we do not need to convert them again (the mutate step is not needed).
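A minimal sketch of the optional export step, assuming base R `write.csv()` and a temporary file. The file path and the three-row `demo_data` frame are illustrative stand-ins, not part of the original workflow. It also shows that a CSV file stores factors as plain text, so the factor conversion only matters if the data are later read back from the file rather than used directly from memory.

```r
# Hypothetical three-row stand-in for diabetes_data (values copied from head() above)
demo_data <- data.frame(
  age = c(49.39524, 52.69823, 70.58708),
  bmi = c(23.02101, 22.80022, 27.91010),
  smoking_status = factor(c("No", "No", "Yes"), levels = c("No", "Yes")),
  diabetes_complications = factor(c("Yes", "Yes", "Yes"), levels = c("No", "Yes"))
)

# Export to a temporary CSV file (replace csv_path with a real path to keep the file)
csv_path <- tempfile(fileext = ".csv")
write.csv(demo_data, csv_path, row.names = FALSE)

# Reading the file back: text columns become factors only if requested
reloaded <- read.csv(csv_path, stringsAsFactors = TRUE)
str(reloaded)
```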
Step 2: Load the required libraries
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.1 ✔ stringr 1.5.2
## ✔ ggplot2 4.0.0 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## here() starts at D:/DrPH/SEM1/Multivariable Analysis/Asignment Prof KIM/MLogR
Step 3: Estimation
Univariable logistic regression
For age
options(scipen = 999) # avoid scientific notation in output
modlog.age <- glm(diabetes_complications ~ age, data = diabetes_data, family = binomial(link = "logit"))
summary(modlog.age)
##
## Call:
## glm(formula = diabetes_complications ~ age, family = binomial(link = "logit"),
## data = diabetes_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.836200 0.377519 -2.215 0.026761 *
## age 0.026078 0.006826 3.820 0.000133 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1302.2 on 999 degrees of freedom
## Residual deviance: 1287.2 on 998 degrees of freedom
## AIC: 1291.2
##
## Number of Fisher Scoring iterations: 4
## # A tibble: 2 × 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.433 0.378 -2.21 0.0268 0.206 0.906
## 2 age 1.03 0.00683 3.82 0.000133 1.01 1.04
Interpretation: For each additional year of age, the odds of having diabetes complications increase by a factor of exp(0.026) = 1.03 (95% CI: 1.01, 1.04). This is a crude (unadjusted) estimate, because age is the only predictor in this model.
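The odds ratio and an approximate confidence interval can be reproduced by hand from the coefficient table above. This is a sketch using a Wald interval (estimate ± 1.96 standard errors on the log-odds scale), which is an assumption: the code that produced the tibble is not shown, and a profile-likelihood interval would differ slightly.

```r
# Estimate and standard error for age, copied from the summary output above
beta_age <- 0.026078
se_age <- 0.006826

# Odds ratio: exponentiate the log-odds coefficient
or_age <- exp(beta_age)

# Approximate 95% Wald confidence interval, built on the log-odds scale
z <- qnorm(0.975)
ci_age <- exp(beta_age + c(-1, 1) * z * se_age)

round(c(OR = or_age, lower = ci_age[1], upper = ci_age[2]), 2)
```

Rounded to two decimals, this gives 1.03 (1.01, 1.04), in line with the tibble. The same three lines apply to any other coefficient in this section.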
For BMI
modlog.bmi <- glm(diabetes_complications ~ bmi, data = diabetes_data, family = binomial(link = "logit"))
summary(modlog.bmi)
##
## Call:
## glm(formula = diabetes_complications ~ bmi, family = binomial(link = "logit"),
## data = diabetes_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.21317 0.39569 -5.593 0.00000002229953 ***
## bmi 0.10071 0.01415 7.117 0.00000000000111 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1302.2 on 999 degrees of freedom
## Residual deviance: 1247.5 on 998 degrees of freedom
## AIC: 1251.5
##
## Number of Fisher Scoring iterations: 4
## # A tibble: 2 × 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.109 0.396 -5.59 2.23e- 8 0.0499 0.236
## 2 bmi 1.11 0.0142 7.12 1.11e-12 1.08 1.14
Interpretation: For each additional unit increase in BMI, the odds of having diabetes complications increase by a factor of exp(0.101) = 1.106 (95% CI: 1.08, 1.14). This is a crude (unadjusted) estimate, because BMI is the only predictor in this model.
For Blood Pressure
modlog.bp <- glm(diabetes_complications ~ bp, data = diabetes_data, family = binomial(link = "logit"))
summary(modlog.bp)
##
## Call:
## glm(formula = diabetes_complications ~ bp, family = binomial(link = "logit"),
## data = diabetes_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.779754 0.588108 1.326 0.185
## bp -0.001441 0.004503 -0.320 0.749
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1302.2 on 999 degrees of freedom
## Residual deviance: 1302.1 on 998 degrees of freedom
## AIC: 1306.1
##
## Number of Fisher Scoring iterations: 4
## # A tibble: 2 × 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 2.18 0.588 1.33 0.185 0.690 6.93
## 2 bp 0.999 0.00450 -0.320 0.749 0.990 1.01
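Interpretation: For each additional unit increase in blood pressure, the odds of having diabetes complications change by a factor of exp(-0.001) = 0.999 (95% CI: 0.990, 1.01). This is a negligible decrease that is not statistically significant (p = 0.749). It is a crude (unadjusted) estimate, because blood pressure is the only predictor in this model.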
For Cholesterol
modlog.chol <- glm(diabetes_complications ~ cholesterol, data = diabetes_data, family = binomial(link = "logit"))
summary(modlog.chol)
##
## Call:
## glm(formula = diabetes_complications ~ cholesterol, family = binomial(link = "logit"),
## data = diabetes_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.057273 0.447296 0.128 0.898
## cholesterol 0.002686 0.002223 1.208 0.227
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1302.2 on 999 degrees of freedom
## Residual deviance: 1300.7 on 998 degrees of freedom
## AIC: 1304.7
##
## Number of Fisher Scoring iterations: 4
## # A tibble: 2 × 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 1.06 0.447 0.128 0.898 0.441 2.55
## 2 cholesterol 1.00 0.00222 1.21 0.227 0.998 1.01
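Interpretation: For each additional unit increase in cholesterol level, the odds of having diabetes complications increase by a factor of exp(0.003) = 1.003 (95% CI: 0.998, 1.01), which is not statistically significant (p = 0.227). This is a crude (unadjusted) estimate, because cholesterol is the only predictor in this model.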
For Smoking Status
modlog.smoke <- glm(diabetes_complications ~ smoking_status, data = diabetes_data, family = binomial(link = "logit"))
summary(modlog.smoke)
##
## Call:
## glm(formula = diabetes_complications ~ smoking_status, family = binomial(link = "logit"),
## data = diabetes_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.36408 0.07567 4.811 0.0000014982 ***
## smoking_statusYes 0.92607 0.16425 5.638 0.0000000172 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1302.2 on 999 degrees of freedom
## Residual deviance: 1267.4 on 998 degrees of freedom
## AIC: 1271.4
##
## Number of Fisher Scoring iterations: 4
## # A tibble: 2 × 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 1.44 0.0757 4.81 0.00000150 1.24 1.67
## 2 smoking_statusYes 2.52 0.164 5.64 0.0000000172 1.84 3.51
Interpretation: Smokers have exp(0.926) = 2.524 times the odds of having diabetes complications compared to non-smokers (95% CI: 1.84, 3.51). This is a crude (unadjusted) estimate, because smoking status is the only predictor in this model.
Multivariable logistic regression
We include only the variables that were statistically significant in the univariable analysis and are clinically relevant (age, BMI, and smoking status).
without interaction terms
modlog.multi <- glm(diabetes_complications ~ age + bmi + smoking_status, data = diabetes_data, family = binomial(link = "logit"))
summary(modlog.multi)
##
## Call:
## glm(formula = diabetes_complications ~ age + bmi + smoking_status,
## family = binomial(link = "logit"), data = diabetes_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.797968 0.557167 -6.817 0.00000000000932 ***
## age 0.023893 0.007142 3.345 0.000822 ***
## bmi 0.101781 0.014576 6.983 0.00000000000289 ***
## smoking_statusYes 0.990752 0.169755 5.836 0.00000000533555 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1302.2 on 999 degrees of freedom
## Residual deviance: 1199.1 on 996 degrees of freedom
## AIC: 1207.1
##
## Number of Fisher Scoring iterations: 4
## # A tibble: 4 × 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.0224 0.557 -6.82 9.32e-12 0.00741 0.0659
## 2 age 1.02 0.00714 3.35 8.22e- 4 1.01 1.04
## 3 bmi 1.11 0.0146 6.98 2.89e-12 1.08 1.14
## 4 smoking_statusYes 2.69 0.170 5.84 5.34e- 9 1.94 3.78
Interpretation: After adjusting for all other variables in the model, each additional year of age increases the odds of diabetes complications by a factor of exp(0.024) = 1.024, each additional unit increase in BMI increases the odds by a factor of exp(0.102) = 1.107, and smokers have exp(0.991) = 2.694 times the odds of having diabetes complications compared to non-smokers.
with interaction terms (age and BMI)
modlog.interact <- glm(diabetes_complications ~ age + bmi + smoking_status + age:bmi, data = diabetes_data, family = binomial(link = "logit"))
summary(modlog.interact)
##
## Call:
## glm(formula = diabetes_complications ~ age + bmi + smoking_status +
## age:bmi, family = binomial(link = "logit"), data = diabetes_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.1322178 2.3194910 -1.782 0.0748 .
## age 0.0300109 0.0418060 0.718 0.4728
## bmi 0.1139454 0.0832113 1.369 0.1709
## smoking_statusYes 0.9903857 0.1698123 5.832 0.00000000547 ***
## age:bmi -0.0002223 0.0014962 -0.149 0.8819
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1302.2 on 999 degrees of freedom
## Residual deviance: 1199.0 on 995 degrees of freedom
## AIC: 1209
##
## Number of Fisher Scoring iterations: 4
## # A tibble: 5 × 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.0160 2.32 -1.78 7.48e-2 0.000161 1.46
## 2 age 1.03 0.0418 0.718 4.73e-1 0.950 1.12
## 3 bmi 1.12 0.0832 1.37 1.71e-1 0.953 1.32
## 4 smoking_statusYes 2.69 0.170 5.83 5.47e-9 1.94 3.78
## 5 age:bmi 1.000 0.00150 -0.149 8.82e-1 0.997 1.00
Interpretation: The interaction term between age and BMI was not statistically significant (p = 0.882), indicating that the effect of age on diabetes complications does not differ significantly across different BMI levels in this dataset.
with interaction terms (age and smoking status)
modlog.interact2 <- glm(diabetes_complications ~ age + bmi + smoking_status + age:smoking_status, data = diabetes_data, family = binomial(link = "logit"))
summary(modlog.interact2)
##
## Call:
## glm(formula = diabetes_complications ~ age + bmi + smoking_status +
## age:smoking_status, family = binomial(link = "logit"), data = diabetes_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.655834 0.587255 -6.225 0.00000000048067 ***
## age 0.021160 0.008004 2.644 0.0082 **
## bmi 0.102051 0.014588 6.996 0.00000000000264 ***
## smoking_statusYes 0.282053 0.965880 0.292 0.7703
## age:smoking_statusYes 0.013173 0.017735 0.743 0.4576
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1302.2 on 999 degrees of freedom
## Residual deviance: 1198.5 on 995 degrees of freedom
## AIC: 1208.5
##
## Number of Fisher Scoring iterations: 4
## # A tibble: 5 × 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.0258 0.587 -6.23 4.81e-10 0.00804 0.0805
## 2 age 1.02 0.00800 2.64 8.20e- 3 1.01 1.04
## 3 bmi 1.11 0.0146 7.00 2.64e-12 1.08 1.14
## 4 smoking_statusYes 1.33 0.966 0.292 7.70e- 1 0.197 8.78
## 5 age:smoking_statusYes 1.01 0.0177 0.743 4.58e- 1 0.979 1.05
Interpretation: The interaction term between age and smoking status was not statistically significant (p = 0.458), indicating that the effect of age on diabetes complications does not differ significantly between smokers and non-smokers in this dataset.
with interaction terms (bmi and smoking status)
modlog.interact3 <- glm(diabetes_complications ~ age + bmi + smoking_status + bmi:smoking_status, data = diabetes_data, family = binomial(link = "logit"))
summary(modlog.interact3)
##
## Call:
## glm(formula = diabetes_complications ~ age + bmi + smoking_status +
## bmi:smoking_status, family = binomial(link = "logit"), data = diabetes_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.724888 0.591973 -6.292 0.000000000313 ***
## age 0.023949 0.007142 3.353 0.000799 ***
## bmi 0.099043 0.016399 6.040 0.000000001544 ***
## smoking_statusYes 0.643954 0.978902 0.658 0.510645
## bmi:smoking_statusYes 0.012851 0.035782 0.359 0.719488
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1302.2 on 999 degrees of freedom
## Residual deviance: 1198.9 on 995 degrees of freedom
## AIC: 1208.9
##
## Number of Fisher Scoring iterations: 4
## # A tibble: 5 × 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.0241 0.592 -6.29 3.13e-10 0.00743 0.0758
## 2 age 1.02 0.00714 3.35 7.99e- 4 1.01 1.04
## 3 bmi 1.10 0.0164 6.04 1.54e- 9 1.07 1.14
## 4 smoking_statusYes 1.90 0.979 0.658 5.11e- 1 0.273 12.8
## 5 bmi:smoking_statusYes 1.01 0.0358 0.359 7.19e- 1 0.945 1.09
Interpretation: The interaction term between BMI and smoking status was not statistically significant (p = 0.719), indicating that the effect of BMI on diabetes complications does not differ significantly between smokers and non-smokers in this dataset.
Conclusion: Since none of the interaction terms were statistically significant, we will proceed with the simpler model without interaction terms (modlog.multi) for further inference and prediction.
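The conclusions above rely on the Wald p-values from `summary()`. A likelihood-ratio test of nested models is another way to judge whether an interaction term improves the fit. The sketch below regenerates the Step 1 data so that it runs on its own, then compares the models with and without the age-by-BMI interaction. The likelihood-ratio test and the object names `m_main` and `m_int` are additions for illustration, not part of the original analysis.

```r
# Regenerate the simulated data exactly as in Step 1
set.seed(123)
n <- 1000
age <- rnorm(n, mean = 55, sd = 10)
bmi <- rnorm(n, mean = 28, sd = 5)
bp <- rnorm(n, mean = 130, sd = 15)
cholesterol <- rnorm(n, mean = 200, sd = 30)
smoking_status <- rbinom(n, 1, 0.3)
logit_prob <- -5 + 0.02*age + 0.1*bmi + 0.005*bp + 0.004*cholesterol + 1.0*smoking_status
diabetes_complications <- rbinom(n, 1, 1 / (1 + exp(-logit_prob)))
diabetes_data <- data.frame(
  age, bmi, bp, cholesterol,
  smoking_status = factor(smoking_status, labels = c("No", "Yes")),
  diabetes_complications = factor(diabetes_complications, labels = c("No", "Yes"))
)

# Nested models: without and with the age-by-BMI interaction
m_main <- glm(diabetes_complications ~ age + bmi + smoking_status,
              data = diabetes_data, family = binomial(link = "logit"))
m_int <- glm(diabetes_complications ~ age + bmi + smoking_status + age:bmi,
             data = diabetes_data, family = binomial(link = "logit"))

# Likelihood-ratio (deviance) test of the interaction term
lrt <- anova(m_main, m_int, test = "Chisq")
lrt

# AIC offers a complementary comparison: lower values indicate a better trade-off
AIC(m_main, m_int)
```

The same comparison can be repeated for the other two interaction models by changing the extra term in `m_int`.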
Step 4: Inference
For log odds and odds ratio
##
## Call:
## glm(formula = diabetes_complications ~ age + bmi + smoking_status,
## family = binomial(link = "logit"), data = diabetes_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.797968 0.557167 -6.817 0.00000000000932 ***
## age 0.023893 0.007142 3.345 0.000822 ***
## bmi 0.101781 0.014576 6.983 0.00000000000289 ***
## smoking_statusYes 0.990752 0.169755 5.836 0.00000000533555 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1302.2 on 999 degrees of freedom
## Residual deviance: 1199.1 on 996 degrees of freedom
## AIC: 1207.1
##
## Number of Fisher Scoring iterations: 4
## # A tibble: 4 × 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.0224 0.557 -6.82 9.32e-12 0.00741 0.0659
## 2 age 1.02 0.00714 3.35 8.22e- 4 1.01 1.04
## 3 bmi 1.11 0.0146 6.98 2.89e-12 1.08 1.14
## 4 smoking_statusYes 2.69 0.170 5.84 5.34e- 9 1.94 3.78
Interpretation: After adjusting for all other variables in the model, each additional year of age increases the odds of diabetes complications by a factor of 1.02 (95% CI: 1.01, 1.04), each additional unit increase in BMI increases the odds by a factor of 1.11 (95% CI: 1.08, 1.14), and smokers have 2.69 (95% CI: 1.94, 3.78) times the odds of having diabetes complications compared to non-smokers.
Step 5: Prediction
## # A tibble: 10 × 10
## diabetes_complications age bmi smoking_status .fitted .resid .hat
## <fct> <dbl> <dbl> <fct> <dbl> <dbl> <dbl>
## 1 Yes 49.4 23.0 No -0.275 1.30 0.00301
## 2 Yes 52.7 22.8 No -0.218 1.27 0.00287
## 3 Yes 70.6 27.9 Yes 1.72 0.574 0.00475
## 4 Yes 55.7 27.3 No 0.316 1.05 0.00153
## 5 No 56.3 15.3 Yes 0.0903 -1.22 0.0128
## 6 Yes 72.2 33.2 Yes 2.30 0.438 0.00392
## 7 Yes 59.6 29.2 No 0.603 0.934 0.00178
## 8 Yes 42.3 40.1 No 1.29 0.696 0.00799
## 9 No 48.1 31.4 No 0.551 -1.42 0.00264
## 10 No 50.5 25.8 No 0.0321 -1.19 0.00197
## # ℹ 3 more variables: .sigma <dbl>, .cooksd <dbl>, .std.resid <dbl>
## # A tibble: 10 × 10
## diabetes_complications age bmi smoking_status .fitted .resid .hat
## <fct> <dbl> <dbl> <fct> <dbl> <dbl> <dbl>
## 1 Yes 49.4 23.0 No 0.432 1.30 0.00301
## 2 Yes 52.7 22.8 No 0.446 1.27 0.00287
## 3 Yes 70.6 27.9 Yes 0.848 0.574 0.00475
## 4 Yes 55.7 27.3 No 0.578 1.05 0.00153
## 5 No 56.3 15.3 Yes 0.523 -1.22 0.0128
## 6 Yes 72.2 33.2 Yes 0.909 0.438 0.00392
## 7 Yes 59.6 29.2 No 0.646 0.934 0.00178
## 8 Yes 42.3 40.1 No 0.785 0.696 0.00799
## 9 No 48.1 31.4 No 0.634 -1.42 0.00264
## 10 No 50.5 25.8 No 0.508 -1.19 0.00197
## # ℹ 3 more variables: .sigma <dbl>, .cooksd <dbl>, .std.resid <dbl>
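The two tables above appear to differ only in the `.fitted` column: the first is on the log-odds (link) scale and the second on the probability (response) scale. The code that produced them is not shown, so the following is a base R sketch of the same idea using `predict()`. It regenerates the Step 1 data so that it runs on its own; `eta_hat` and `p_hat` are names introduced here for illustration.

```r
# Regenerate the simulated data exactly as in Step 1
set.seed(123)
n <- 1000
age <- rnorm(n, mean = 55, sd = 10)
bmi <- rnorm(n, mean = 28, sd = 5)
bp <- rnorm(n, mean = 130, sd = 15)
cholesterol <- rnorm(n, mean = 200, sd = 30)
smoking_status <- rbinom(n, 1, 0.3)
logit_prob <- -5 + 0.02*age + 0.1*bmi + 0.005*bp + 0.004*cholesterol + 1.0*smoking_status
diabetes_complications <- rbinom(n, 1, 1 / (1 + exp(-logit_prob)))
diabetes_data <- data.frame(
  age, bmi, bp, cholesterol,
  smoking_status = factor(smoking_status, labels = c("No", "Yes")),
  diabetes_complications = factor(diabetes_complications, labels = c("No", "Yes"))
)

# Final model without interaction terms
final_model <- glm(diabetes_complications ~ age + bmi + smoking_status,
                   data = diabetes_data, family = binomial(link = "logit"))

# Fitted values on the log-odds (link) scale
eta_hat <- predict(final_model, type = "link")

# Fitted values on the probability (response) scale
p_hat <- predict(final_model, type = "response")

# The two scales are related through the inverse logit, plogis()
head(data.frame(eta_hat, p_hat, check = plogis(eta_hat)), 10)
```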
Manual calculation for the first observation
# Final model chosen in Step 3 (no interaction terms)
final_model <- modlog.multi
# Coefficients (named b to avoid masking the coef() function)
b <- coef(final_model)
# First observation values
obs1 <- diabetes_data[1, ]
# Linear predictor (log-odds); smoking enters as 1 for "Yes" and 0 for "No"
linear_predictor <- b[1] + b[2]*obs1$age + b[3]*obs1$bmi +
  b[4]*ifelse(obs1$smoking_status == "Yes", 1, 0)
# Probability calculation (inverse logit)
probability <- 1 / (1 + exp(-linear_predictor))
probability
## (Intercept)
## 0.4317638
Interpretation: For the first patient in the dataset, the predicted probability of having diabetes complications is approximately 0.432 (43.2%).
Step 6: Confounding and mediation
Check correlation between age and BMI (numerical variables)
## [1] 0.08647944
Correlation matrix of the numerical variables
cor_matrix <- diabetes_data %>%
  dplyr::select(age, bmi, bp, cholesterol) %>%
  cor(use = "complete.obs")
cor_matrix
## age bmi bp cholesterol
## age 1.00000000 0.086479441 -0.01932954 -0.002994710
## bmi 0.08647944 1.000000000 0.02650333 -0.007029076
## bp -0.01932954 0.026503334 1.00000000 0.050560850
## cholesterol -0.00299471 -0.007029076 0.05056085 1.000000000
Visualize the correlation matrix
Interpretation: The correlation between age and BMI is low (r = 0.09), indicating that there is weak to no linear relationship between these two variables in this dataset. Therefore, we can include both in the model without worrying about multicollinearity, and it is unlikely that BMI strongly confounds the relationship between age and diabetes complications.
Check the association of each variable with the outcome (diabetes complications)
Display the results
slr.age.bmi.smoke %>%
  mutate(model = c('b0', 'age', 'b0', 'bmi', 'b0', 'smoking_statusYes')) %>%
  dplyr::select(model, everything())
## # A tibble: 6 × 6
## model term estimate std.error statistic p.value
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 b0 (Intercept) -0.836 0.378 -2.21 2.68e- 2
## 2 age .x 0.0261 0.00683 3.82 1.33e- 4
## 3 b0 (Intercept) -2.21 0.396 -5.59 2.23e- 8
## 4 bmi .x 0.101 0.0142 7.12 1.11e-12
## 5 b0 (Intercept) 0.364 0.0757 4.81 1.50e- 6
## 6 smoking_statusYes .xYes 0.926 0.164 5.64 1.72e- 8
Interpretation: All three variables (age, BMI, and smoking status) show significant associations with diabetes complications in the univariable analyses. This suggests that they are important predictors and should be included in the multivariable model. Since none of these variables show strong correlations with each other, it is unlikely that confounding is a major concern in this analysis.
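One common way to quantify confounding is to compare each crude coefficient with its adjusted counterpart. The sketch below does this with the estimates printed in the outputs above. The 10% change threshold is a widely used rule of thumb and an assumption here, not a criterion stated in this analysis.

```r
# Crude (univariable) and adjusted (modlog.multi) log-odds, copied from the outputs above
crude <- c(age = 0.026078, bmi = 0.10071, smoking_statusYes = 0.92607)
adjusted <- c(age = 0.023893, bmi = 0.101781, smoking_statusYes = 0.990752)

# Percentage change in each coefficient after adjustment
pct_change <- 100 * (adjusted - crude) / crude
round(pct_change, 1)

# Flag coefficients that move by more than 10% (rule-of-thumb threshold)
abs(pct_change) > 10
```

All three coefficients change by less than 10% after adjustment, which is consistent with the reading that confounding among these predictors is limited.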
Step 7: Model checking
Load library
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
##
## lift
Create predicted classes based on a 0.5 threshold
Create confusion matrix
confusionMatrix(as.factor(final.m.prob$pred.class), diabetes_data$diabetes_complications, positive = "Yes")
## Confusion Matrix and Statistics
##
## Reference
## Prediction No Yes
## No 113 79
## Yes 243 565
##
## Accuracy : 0.678
## 95% CI : (0.648, 0.7069)
## No Information Rate : 0.644
## P-Value [Acc > NIR] : 0.01301
##
## Kappa : 0.2171
##
## Mcnemar's Test P-Value : < 0.0000000000000002
##
## Sensitivity : 0.8773
## Specificity : 0.3174
## Pos Pred Value : 0.6993
## Neg Pred Value : 0.5885
## Prevalence : 0.6440
## Detection Rate : 0.5650
## Detection Prevalence : 0.8080
## Balanced Accuracy : 0.5974
##
## 'Positive' Class : Yes
##
Interpretation: The confusion matrix shows that the model has an accuracy of approximately 67.8%, with a sensitivity of 87.7% and a specificity of 31.7%. This indicates that the model is quite good at identifying patients with diabetes complications (high sensitivity) but less effective at correctly identifying those without complications (low specificity). Depending on the clinical context, further refinement of the model or adjustment of the classification threshold may be necessary to improve performance.
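The code that creates the predicted classes is not shown above. As a sketch of that step, and of the threshold adjustment mentioned in the interpretation, the following base R code classifies patients at a chosen cut-off and computes sensitivity, specificity, and accuracy. The helper `classify_at()` is a hypothetical name introduced for illustration, and the Step 1 data are regenerated so that the block runs on its own.

```r
# Regenerate the simulated data exactly as in Step 1
set.seed(123)
n <- 1000
age <- rnorm(n, mean = 55, sd = 10)
bmi <- rnorm(n, mean = 28, sd = 5)
bp <- rnorm(n, mean = 130, sd = 15)
cholesterol <- rnorm(n, mean = 200, sd = 30)
smoking_status <- rbinom(n, 1, 0.3)
logit_prob <- -5 + 0.02*age + 0.1*bmi + 0.005*bp + 0.004*cholesterol + 1.0*smoking_status
diabetes_complications <- rbinom(n, 1, 1 / (1 + exp(-logit_prob)))
diabetes_data <- data.frame(
  age, bmi, bp, cholesterol,
  smoking_status = factor(smoking_status, labels = c("No", "Yes")),
  diabetes_complications = factor(diabetes_complications, labels = c("No", "Yes"))
)

final_model <- glm(diabetes_complications ~ age + bmi + smoking_status,
                   data = diabetes_data, family = binomial(link = "logit"))
p_hat <- predict(final_model, type = "response")

# Hypothetical helper: confusion-matrix summary at a given probability threshold
classify_at <- function(threshold) {
  pred <- factor(ifelse(p_hat >= threshold, "Yes", "No"), levels = c("No", "Yes"))
  tab <- table(Prediction = pred, Reference = diabetes_data$diabetes_complications)
  c(threshold = threshold,
    sensitivity = tab["Yes", "Yes"] / sum(tab[, "Yes"]),
    specificity = tab["No", "No"] / sum(tab[, "No"]),
    accuracy = sum(diag(tab)) / sum(tab))
}

# Compare the default 0.5 cut-off with two stricter cut-offs
round(t(sapply(c(0.5, 0.6, 0.7), classify_at)), 3)
```

Raising the threshold trades sensitivity for specificity, so the choice of cut-off depends on whether missed complications or false alarms are more costly in the clinical setting.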
Check for linearity of covariates with logit
Load library
## Warning: package 'mfp' was built under R version 4.5.2
## Loading required package: survival
##
## Attaching package: 'survival'
## The following object is masked from 'package:caret':
##
## cluster
lin.age.bmi <- mfp(diabetes_complications ~ fp(age) + fp(bmi) + smoking_status,
                   data = diabetes_data, family = binomial(link = "logit"),
                   verbose = TRUE)
##
## Variable Deviance Power(s)
## ------------------------------------------------
## Cycle 1
## bmi
## 1251.692
## 1199.066 1
## 1199.066 1
## 1199.065 -2 1
##
## smoking_statusYes
## 1236.408
## 1199.066 1
##
##
##
## age
## 1210.496
## 1199.066 1
## 1198.847 2
## 1197.198 -2 -2
##
##
## Tansformation
## shift scale
## bmi 0 10
## smoking_statusYes 0 1
## age 0 100
##
## Fractional polynomials
## df.initial select alpha df.final power1 power2
## bmi 4 1 0.05 1 1 .
## smoking_statusYes 1 1 0.05 1 1 .
## age 4 1 0.05 1 1 .
##
##
## Transformations of covariates:
## formula
## age I((age/100)^1)
## bmi I((bmi/10)^1)
## smoking_status smoking_status
##
##
## Deviance table:
## Resid. Dev
## Null model 1302.164
## Linear model 1199.066
## Final model 1199.066
##
## Call:
## glm(formula = diabetes_complications ~ I((bmi/10)^1) + smoking_status +
## I((age/100)^1), family = binomial(link = "logit"), data = diabetes_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.7980 0.5572 -6.817 0.00000000000932 ***
## I((bmi/10)^1) 1.0178 0.1458 6.983 0.00000000000289 ***
## smoking_statusYes 0.9908 0.1698 5.836 0.00000000533555 ***
## I((age/100)^1) 2.3893 0.7142 3.345 0.000822 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1302.2 on 999 degrees of freedom
## Residual deviance: 1199.1 on 996 degrees of freedom
## AIC: 1207.1
##
## Number of Fisher Scoring iterations: 4
Interpretation: The fractional polynomial analysis supports a linear relationship between both age and BMI and the log-odds of diabetes complications. The selected power for both variables is 1, which means that a straight line (\(x^1\)) fits the data as well as the more complex transformations that were considered. Because the relationship is linear, we do not need to transform these variables and can include them in the logistic regression model as they are.
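The verbose output lists, for each covariate, the deviance of the linear fit and of the best fractional-polynomial fits. As a rough check of the conclusion, the deviance difference between the linear term and the best two-power fit can be referred to a chi-squared distribution. The sketch assumes 3 degrees of freedom for that comparison and reads the deviances off the output above, so it approximates the selection procedure inside `mfp()` and does not reproduce it exactly.

```r
# Deviances copied from the verbose output above
dev_linear <- c(age = 1199.066, bmi = 1199.066)  # power 1 (linear)
dev_fp2 <- c(age = 1197.198, bmi = 1199.065)     # best two-power fit

# Improvement in deviance from allowing a non-linear shape
dev_diff <- dev_linear - dev_fp2

# Chi-squared p-values on 3 degrees of freedom (assumed for two-power vs linear)
p_nonlinear <- pchisq(dev_diff, df = 3, lower.tail = FALSE)
round(cbind(dev_diff, p_nonlinear), 3)
```

Both p-values are well above 0.05, so there is no evidence that a non-linear transformation of age or BMI fits better than the linear term.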
Step 8: Diagnostics plot
Interpretation: The diagnostic plots show no obvious patterns suggesting non-linearity. There are a few potential outliers, but they do not appear to have a significant influence on the model. Because the outcome is binary, the residuals are not expected to be normally distributed or homoscedastic, so these plots are best read as a screen for outliers and influential observations. Overall, the model appears adequate.
Influential observations
## [1] "dfb.1_" "dfb.age" "dfb.bmi" "dfb.sm_Y" "dffit" "cov.r" "cook.d"
## [8] "hat"
Identify observations with high Cook’s distance
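The code behind the following table is not shown, and the cut-off it used is not stated. A minimal base R sketch of one way to flag such observations uses `influence.measures()` with a Cook's distance cut-off of 4/n. That cut-off is a common rule of thumb and an assumption here, so the flagged rows need not match the table. The Step 1 data are regenerated so that the block runs on its own.

```r
# Regenerate the simulated data exactly as in Step 1
set.seed(123)
n <- 1000
age <- rnorm(n, mean = 55, sd = 10)
bmi <- rnorm(n, mean = 28, sd = 5)
bp <- rnorm(n, mean = 130, sd = 15)
cholesterol <- rnorm(n, mean = 200, sd = 30)
smoking_status <- rbinom(n, 1, 0.3)
logit_prob <- -5 + 0.02*age + 0.1*bmi + 0.005*bp + 0.004*cholesterol + 1.0*smoking_status
diabetes_complications <- rbinom(n, 1, 1 / (1 + exp(-logit_prob)))
diabetes_data <- data.frame(
  age, bmi, bp, cholesterol,
  smoking_status = factor(smoking_status, labels = c("No", "Yes")),
  diabetes_complications = factor(diabetes_complications, labels = c("No", "Yes"))
)

final_model <- glm(diabetes_complications ~ age + bmi + smoking_status,
                   data = diabetes_data, family = binomial(link = "logit"))

# Influence statistics: DFBETAs, DFFITS, covariance ratio, Cook's distance, leverage
infl <- influence.measures(final_model)
colnames(infl$infmat)

# Flag observations whose Cook's distance exceeds 4/n (assumed cut-off)
cutoff <- 4 / nrow(diabetes_data)
flagged <- which(infl$infmat[, "cook.d"] > cutoff)
head(round(infl$infmat[flagged, , drop = FALSE], 4))
```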
## dfb.1_ dfb.age dfb.bmi dfb.sm_Y dffit
## 5 -0.0594546125 -0.013760500 0.095001730423 -0.06653561 -0.12715827
## 9 -0.0036668251 0.030754523 -0.034136540333 0.02103548 -0.06662158
## 13 0.0287719914 0.019330487 -0.049539410183 -0.02409077 0.06889973
## 25 0.0771688848 -0.024049945 -0.073232852550 -0.03014678 0.09259093
## 26 0.0403897472 -0.062474750 0.012708518816 -0.02039296 0.07550210
## 29 -0.0243170607 0.047803601 -0.021702139261 0.02157175 -0.06960049
## 33 -0.0074937617 -0.038155329 0.038617929701 0.02202888 -0.06888745
## 41 -0.0433039754 0.024800426 0.035611298199 -0.08036276 -0.10537456
## 51 0.0302424402 -0.016125885 -0.026021688661 -0.08823483 -0.10178356
## 53 0.0256254063 -0.002166441 -0.033137362615 -0.08829956 -0.10246213
## 55 0.0263571000 0.011569229 -0.057346000895 0.01938892 -0.07752616
## 56 0.0624278374 -0.071405249 -0.026992630749 0.01894098 -0.09312697
## 64 -0.0024765226 0.047029675 -0.051640543801 0.02059966 -0.08477320
## 72 0.0832766087 -0.098035618 -0.010709360763 -0.02670270 0.10994829
## 73 0.0090593588 -0.054651847 0.040253018050 -0.08597737 -0.11929707
## 84 0.0377687885 0.031634070 -0.073302182685 -0.02674058 0.09171378
## 87 0.0164852277 -0.058759061 0.034005868007 -0.08675925 -0.11946518
## 88 0.0391918009 -0.019769929 -0.044786369011 0.01928679 -0.07154832
## 90 0.0061733527 0.036065248 -0.043268020184 0.04784627 0.07924954
## 91 0.0848531673 -0.044083088 -0.084074853529 0.01496162 -0.10934298
## 97 0.0791171146 -0.103721099 -0.018815503197 0.01849341 -0.11829082
## 104 0.0585049925 -0.011517174 -0.060016996515 -0.02770489 0.07808634
## 108 0.0449211285 -0.063159470 0.007348289352 -0.02136874 0.07612787
## 120 -0.0748294882 0.033276610 0.070774239114 -0.06662538 -0.11548341
## 122 0.0797264747 -0.038831401 -0.062543601010 -0.02969689 0.08988129
## 123 0.0260769865 0.024494648 -0.069473177330 0.01897824 -0.08921042
## 136 0.0065789490 -0.049856990 0.030064171963 0.02203569 -0.07359092
## 140 0.0468197530 -0.055465987 -0.002487085289 -0.02236398 0.07048256
## 143 0.0425936602 -0.059100982 0.006584327447 -0.02115850 0.07261121
## 145 0.0358577160 -0.058166111 0.014619531848 -0.01967485 0.07182520
## 146 -0.0298989047 0.018205674 0.023543671421 -0.08334318 -0.10218926
## 147 0.0398092461 -0.039167877 -0.016951071598 0.05197936 0.07588149
## 149 0.0588442092 -0.103605283 0.019025975925 -0.08815882 -0.14288937
## 159 0.0027456343 -0.053445966 0.047769397602 -0.08503917 -0.12114988
## 161 0.0448052898 -0.049227113 -0.024053856596 0.01979964 -0.07568435
## 164 0.0943647052 -0.155363837 0.009978694732 0.01904539 -0.16396028
## 169 0.0277103200 0.024224304 -0.052775234620 -0.02423263 0.07264669
## 175 -0.0152108423 0.029647153 -0.007681956767 -0.08671213 -0.10346525
## 179 0.0219031793 0.018622702 -0.048069730471 0.05225514 0.08101861
## 187 0.0020600581 -0.048228270 0.034879140917 0.02208916 -0.07397214
## 195 0.0693652101 -0.053957476 -0.034103918658 -0.02695120 0.08054690
## 198 0.0200122488 0.054334162 -0.079910755756 -0.08814365 -0.13406576
## 202 0.0098092418 -0.057787054 0.033239660704 0.02207295 -0.08033060
## 207 -0.0450501036 0.029129766 0.033836409105 -0.08028843 -0.10587811
## 216 0.0758226752 -0.078707957 -0.038399280066 0.01778459 -0.10308053
## 217 0.0912043340 -0.015192359 -0.100779221031 -0.03281726 0.11471181
## 219 0.0786245487 -0.051676024 -0.048763432672 -0.02874054 0.08747657
## 223 -0.0285665197 -0.038551332 0.069731654212 0.02042252 -0.08568228
## 231 0.0259219358 -0.087571483 0.039551078465 0.02208404 -0.10607868
## 234 0.0542111888 -0.019370863 -0.055841529499 0.05913453 0.09461803
## 244 0.0606051811 -0.045645819 -0.030267662714 -0.02585371 0.07267663
## 248 -0.0268439714 0.057644900 -0.027756719794 0.02153043 -0.07846011
## 249 0.0208581566 0.032611617 -0.070106389594 0.01918866 -0.09194029
## 255 0.0884981235 -0.073696805 -0.060594144695 0.01595973 -0.11020328
## 257 0.0064457152 0.043350099 -0.042637535263 -0.02109147 0.07258503
## 260 -0.0424008183 0.058922115 -0.006862063876 0.02157026 -0.07303982
## 261 0.0196567805 0.020424437 -0.046719262609 -0.08841155 -0.10899426
## 265 0.0697677772 -0.108829685 -0.000981009138 0.01966567 -0.12061391
## 273 0.0456490046 -0.035326258 -0.028683753625 -0.08784789 -0.10605961
## 278 0.0285276876 -0.002845517 -0.046392622493 0.01967038 -0.06922296
## 286 0.0134328922 -0.011570409 -0.007307304224 -0.08790483 -0.09948174
## 288 -0.0049929931 0.062583090 -0.045709127549 -0.02001701 0.08599440
## 289 0.0768438131 -0.005102569 -0.090938183933 -0.03119519 0.10434887
## 307 0.1075469544 -0.089340206 -0.051846700001 -0.03129327 0.11676762
## 309 0.0019803460 -0.053200378 0.039859677881 0.02213573 -0.07924040
## 310 0.0587224720 -0.096156836 0.002064485612 0.02021131 -0.10899499
## 312 0.0187150916 -0.041812873 0.014597468457 -0.08750916 -0.10807664
## 313 -0.0189853445 0.077794759 -0.048927206463 -0.08837825 -0.13396596
## 322 0.0103028065 0.040749114 -0.053461853558 0.05048495 0.08955506
## 327 0.0339110289 0.004020864 -0.060498537222 0.01894705 -0.07952394
## 332 -0.0200876063 0.051438675 -0.031229147877 0.02140447 -0.07592914
## 336 -0.0927371536 0.070310859 0.059675366135 -0.06187064 -0.12284673
## 338 -0.0412626364 0.001445270 0.055329031520 -0.07839313 -0.10936270
## 340 -0.0063186863 -0.001194817 0.009838454548 -0.08637283 -0.09934483
## 343 0.0263529474 -0.075743149 0.027426836187 0.02181642 -0.09315769
## 345 0.0534406641 0.001576445 -0.065700332953 -0.02763650 0.08130034
## 347 -0.0239752587 -0.011879375 0.044414848646 -0.08227025 -0.10719248
## 349 0.0583447915 -0.038604395 -0.033950839810 -0.02589303 0.07013564
## 353 -0.0281200591 0.013298615 0.025830741013 -0.08335600 -0.10196855
## 354 0.0278642341 -0.004904459 -0.043476684723 0.01979292 -0.06739643
## 358 0.0519474777 0.002348921 -0.064428211019 -0.02741474 0.08015976
## 359 -0.0674779177 0.070208738 0.018830513588 0.01914386 -0.08017726
## 371 0.0534477567 -0.113021894 0.025732149114 0.02119108 -0.12559214
## 374 -0.0205578060 -0.032935191 0.052295950198 0.02158272 -0.07262499
## 380 -0.0363781180 0.074866730 -0.031117358896 0.02157295 -0.09226627
## 385 0.0461770064 -0.025214914 -0.039161056408 -0.08762281 -0.10578707
## 386 -0.0200183638 -0.050643552 0.069084046177 0.02123510 -0.09205411
## 388 -0.0205823131 0.048712031 -0.018684248248 -0.08697779 -0.11159512
## 389 0.0051881693 0.019884326 -0.035952659824 0.02076579 -0.06407994
## 392 0.1066685503 -0.095765683 -0.044489750903 -0.03080859 0.11858049
## 395 0.0030302623 0.026238553 -0.039115734216 0.02074795 -0.06792017
## 396 0.0011787421 -0.063151331 0.050753506653 0.02220334 -0.09083038
## 401 -0.0267036156 -0.003719957 0.040296447882 -0.08227757 -0.10506390
## 403 -0.0011670753 0.028623198 -0.035563542128 0.02094326 -0.06661295
## 413 0.0199038098 0.028692793 -0.046611229634 -0.02292537 0.06912788
## 414 0.0336481722 0.002129164 -0.048308806620 -0.08799173 -0.10682912
## 416 0.0687467299 -0.106433985 0.016781969411 -0.02332233 0.11605932
## 422 0.0386554039 -0.012602120 -0.050979319636 0.01908390 -0.07396269
## 432 0.0421991897 0.020098396 -0.068255047235 -0.02679014 0.08473924
## 436 -0.0240675538 0.044996301 -0.019308037310 0.02159412 -0.06696531
## 437 -0.0351727246 -0.024729977 0.065825122919 0.01981129 -0.07682384
## 439 0.0464022483 -0.061905294 -0.004086164522 0.05180609 0.08776164
## 444 0.0236213059 0.005257703 -0.037542278590 -0.08833385 -0.10392539
## 448 -0.0299578144 -0.031682534 0.064915281759 0.02053172 -0.07935661
## 449 0.0398167475 -0.074486501 0.024855552418 -0.01929654 0.08732540
## 450 -0.0467830957 0.061013644 -0.002620209516 0.02144763 -0.07391290
## 453 0.0368115369 -0.019856481 -0.041405333692 0.01950731 -0.06936922
## 454 0.0623934557 -0.032331485 -0.064651700806 0.01733276 -0.08997287
## 456 0.0275586194 -0.060227054 0.020199259279 0.04262695 0.07968510
## 458 0.0253642411 0.024164237 -0.068166815513 0.01905659 -0.08810504
## 459 0.0272677056 -0.010390670 -0.037330319930 0.02001436 -0.06422138
## 462 0.0080215678 0.016009300 -0.036144155528 0.02068043 -0.06331700
## 468 0.0726840970 -0.081250374 -0.031630357026 0.01825056 -0.10256805
## 472 0.0533102471 -0.008667426 -0.055748174951 -0.02693961 0.07393994
## 473 0.0401789224 -0.035419721 -0.021074012470 0.05260297 0.07589007
## 476 0.0531931545 0.010834981 -0.074206953006 -0.02819178 0.08888738
## 479 0.0521408899 0.020992879 -0.082490064894 -0.02865674 0.09738132
## 482 0.0264584408 -0.014390797 -0.032326277984 0.02020068 -0.06215377
## 484 0.0383255177 -0.004601803 -0.058261704922 0.01882777 -0.07816617
## 490 0.0150230672 0.018211439 -0.048044025266 0.02012471 -0.07171115
## 491 0.0168399933 0.005128740 -0.037875939756 0.02035996 -0.06317653
## 493 0.0083818178 -0.056774951 0.034271365107 0.02209013 -0.07989279
## 495 0.0638807476 -0.034726577 -0.045090086448 -0.02718443 0.07509361
## 499 0.0428303165 -0.016184762 -0.053286955929 0.01880781 -0.07645362
## 500 0.0235457599 -0.031322511 -0.002160145156 -0.08801984 -0.10318639
## 513 0.0481037124 -0.041063953 -0.036520630685 0.01920780 -0.07618101
## 519 -0.0252262169 0.043640640 -0.007409478708 -0.08603885 -0.10814624
## 523 -0.0300692987 0.026390688 0.015884208879 -0.08395415 -0.10279823
## 535 0.0704195857 -0.092108857 -0.018030998048 0.01894247 -0.10792563
## 537 0.0417956699 -0.059569224 -0.000006526297 -0.08814827 -0.11417496
## 540 0.0125311555 0.009422924 -0.036038867468 0.02054855 -0.06226941
## 545 0.0267873587 -0.040162301 -0.007773579248 0.02091441 -0.06476571
## 554 0.0342907581 -0.011023825 -0.046463300745 0.01943389 -0.07043842
## 556 -0.0713256767 0.055706840 0.044325712351 -0.07351642 -0.11569718
## 567 0.0554616845 -0.026641827 -0.050548376645 -0.08672349 -0.10914097
## 572 0.0676980667 -0.059835659 -0.026250601870 -0.02626017 0.08129500
## 583 -0.0672134780 0.043555370 0.050391745850 -0.07346685 -0.11282481
## 591 0.1666304232 -0.127321955 -0.096194380423 -0.03707237 0.17087617
## 598 -0.0003327305 0.093874286 -0.100105203325 0.01913831 -0.14414752
## 599 0.0175618567 0.043155715 -0.065761661268 0.05377780 0.10120763
## 605 -0.0270206480 -0.030008249 0.058861337085 0.02104542 -0.07494638
## 608 0.0002010670 0.062620180 -0.060667857750 -0.08879298 -0.12978287
## 615 0.0277663842 -0.056783281 0.016592444345 -0.08777930 -0.11473132
## 616 -0.0998881669 0.091696270 0.048881431196 -0.06216445 -0.13209360
## 623 0.0460835672 -0.039370087 -0.035363538744 0.01933803 -0.07461792
## 629 0.0507551372 -0.054172089 -0.008962935854 -0.02329357 0.07078196
## 631 -0.0689030411 0.019541344 0.075873690178 -0.06760794 -0.11600365
## 632 -0.0576182607 0.060658345 0.013552262563 0.02056326 -0.07278294
## 633 0.0273397802 0.024880605 -0.052904446828 -0.02420473 0.07292346
## 636 -0.0578375462 0.064923555 0.009608582173 0.02071021 -0.07600603
## 637 0.0371130062 0.006922990 -0.067730517758 0.01852520 -0.08534186
## 649 0.0501922807 -0.034166539 -0.046075188256 0.01874464 -0.07808379
## 654 -0.0267619537 0.062116312 -0.032246458488 0.02147940 -0.08343200
## 656 0.0799830612 -0.033410798 -0.068080128728 -0.03005873 0.09176486
## 661 0.0565696422 -0.039893031 -0.049342329967 0.01830364 -0.08293562
## 664 -0.0423028377 -0.006623206 0.064540283251 -0.07699873 -0.11344234
## 665 0.0932842263 -0.097706124 -0.024512881675 -0.02850039 0.11293826
## 666 -0.0024148915 -0.007509754 0.010562172802 -0.08658744 -0.09970431
## 668 0.0241418228 0.007998771 -0.050813878566 0.01969745 -0.07232223
## 676 -0.0200321472 0.090191934 -0.069100462711 0.02071590 -0.12208912
## 688 0.0421373770 0.024703654 -0.081749772622 0.05980763 0.11269915
## 690 0.0555668846 -0.081173152 0.010311923634 -0.02235250 0.09231197
## 692 0.0055059555 0.036944404 -0.043198428746 0.04766353 0.07941830
## 693 -0.0077897249 -0.049139283 0.049834060474 0.02203327 -0.08077339
## 700 -0.0454532025 0.060874879 -0.004400066396 0.02149659 -0.07407385
## 702 0.0615659022 -0.062215774 -0.015754520499 -0.02492132 0.07937805
## 707 0.0151420088 -0.019066722 -0.002427279773 -0.08783268 -0.10043439
## 708 0.0122822316 0.043523104 -0.068787271776 0.01959371 -0.09495367
## 710 0.0614975028 -0.033997886 -0.061813750065 0.01750931 -0.08858709
## 716 0.0156320015 -0.050842741 0.018249576739 0.02176754 -0.07172948
## 722 0.0422640489 -0.064495330 0.012142170223 -0.02066551 0.07726931
## 726 0.0652087597 0.016459724 -0.095815961785 -0.03058958 0.10881681
## 727 0.0307047805 0.026800771 -0.059229296399 -0.02502794 0.07859245
## 732 0.0245715205 0.039822159 -0.063415224925 -0.02473673 0.08634977
## 733 0.0407388993 -0.071284652 0.020601891197 -0.01977446 0.08389185
## 742 0.0067840959 -0.044074970 0.024125814393 0.02194999 -0.06799240
## 744 0.0784243337 -0.028577138 -0.090207110758 0.01508095 -0.10869618
## 746 0.1020783810 -0.097522842 -0.036581924018 -0.02997945 0.11684661
## 747 0.0553088483 -0.134996019 0.054159464275 -0.08809233 -0.17446262
## 749 0.1126112393 -0.108482724 -0.060052211731 0.01432510 -0.13608010
## 756 0.0422907929 0.007434641 -0.075366876233 0.01796199 -0.09160321
## 759 0.0855715497 -0.087239158 -0.024121148132 -0.02778611 0.10351371
## 766 0.0126706222 0.040498940 -0.048194285611 -0.02226810 0.07506735
## 771 0.0765231470 -0.070049941 -0.028363429830 -0.02723327 0.09038342
## 773 0.0739390648 -0.018837596 -0.093449294580 0.01521961 -0.10904378
## 775 0.0623894069 0.017610454 -0.093100714963 -0.03019559 0.10646803
## 781 0.0282865042 -0.037768140 -0.012190078189 0.02074642 -0.06428168
## 785 -0.0234376164 0.057537817 -0.023270789444 -0.08706442 -0.11644718
## 788 0.0221436499 -0.063564833 0.037729020977 -0.01567033 0.08082556
## 789 0.0601277614 -0.057905201 -0.026812059078 0.05727007 0.09545303
## 791 0.0274558290 -0.018170629 -0.030047939612 0.02023689 -0.06203557
## 792 -0.0509300494 0.034315943 0.036917850015 -0.07894614 -0.10765921
## 801 0.0324869229 0.017896329 -0.053139440859 -0.02474450 0.07156697
## 808 0.0540334167 -0.101493230 0.013744603206 0.02077948 -0.11392223
## 811 -0.0595140843 0.093467558 -0.016324399264 0.02132187 -0.10292837
## 812 0.0649207380 -0.063833224 -0.018708469587 -0.02546636 0.08167560
## 827 -0.0443301723 0.009077702 0.052185122429 -0.07821453 -0.10817642
## 830 -0.0125620895 0.065905381 -0.055905600543 0.02079082 -0.09819000
## 831 0.0295358261 -0.048867986 -0.003156921166 0.02096957 -0.07005768
## 833 0.0383501802 0.038479830 -0.080610375210 -0.02731426 0.10000450
## 834 -0.0045201333 0.057335030 -0.041373310025 -0.01967427 0.07996844
## 839 0.0436435444 -0.001134236 -0.068954579808 0.01815231 -0.08635707
## 842 -0.0300702500 0.086419729 -0.035308539590 -0.01563319 0.09836433
## 850 0.0541960831 0.002625725 -0.067719148811 -0.02784140 0.08302747
## 851 0.0750616597 0.032519171 -0.124580686477 -0.03303123 0.13754525
## 860 0.0123104807 0.070374742 -0.076079026086 -0.02449667 0.11110394
## 863 0.0574391081 -0.041604375 -0.038836685240 0.05800280 0.09198569
## 867 -0.0784124039 0.072740364 0.037640402648 -0.07294191 -0.12227985
## 868 0.0108314780 -0.050481965 0.024663239822 0.02193364 -0.07265611
## 875 0.0603381911 -0.069848262 -0.025610647374 0.01909120 -0.09144649
## 877 0.0917544845 -0.029562742 -0.097622356241 -0.07702365 -0.12862646
## 878 0.0112414822 0.080544524 -0.084316513911 -0.02503870 0.12302632
## 884 0.0242039280 -0.064359447 0.028796197378 -0.08734700 -0.12096099
## 886 0.0531819448 -0.023403585 -0.050537456810 -0.08688593 -0.10860734
## 887 0.0211306402 0.044570085 -0.072031622674 -0.08820463 -0.12638704
## 888 0.0729725166 -0.023085713 -0.068478918321 -0.02950685 0.08828817
## 891 0.0753781030 -0.040556645 -0.055015929605 -0.02885798 0.08518023
## 900 0.0266945545 -0.011028784 -0.035914921997 0.02008003 -0.06347091
## 901 0.0142997292 0.049207902 -0.077095942202 0.01923946 -0.10354464
## 910 0.0508879455 -0.010718804 -0.069679094731 0.01774373 -0.08792618
## 913 0.0323987630 -0.003919430 -0.050720002780 0.01936392 -0.07243909
## 914 -0.0475700360 0.078034335 -0.009863492330 -0.08430864 -0.12552815
## 921 0.0641318535 -0.007880550 -0.080560613454 0.06272312 0.11350616
## 923 -0.0401137860 -0.006149051 0.061073761581 -0.07789077 -0.11208213
## 930 -0.0024882513 -0.027061735 0.029519122397 -0.08561272 -0.10625941
## 936 0.0650547062 -0.021689819 -0.059124339041 -0.02822694 0.08012375
## 937 -0.0299455442 0.025832415 0.016252489038 -0.08393770 -0.10270653
## 940 -0.0162998375 -0.052115651 0.065087296261 0.02156927 -0.09064962
## 950 0.0506980280 0.061033820 -0.138538839132 0.01452705 -0.15786656
## 953 0.0051558190 -0.013180049 0.005623361646 -0.08720926 -0.09987979
## 955 0.0524879990 -0.015289789 -0.067469748600 0.01775201 -0.08700179
## 956 0.0591587452 -0.064950552 -0.009926256549 -0.02425376 0.08010417
## 961 0.0245354397 0.035444912 -0.059204189579 -0.02440770 0.08133723
## 967 -0.0313324923 0.080168328 -0.043444976705 0.02140469 -0.10147881
## 972 -0.0632584832 0.062625174 0.020057887333 0.01963738 -0.07455618
## 975 0.0095564068 0.040321978 -0.043891977058 -0.02155600 0.07192686
## 976 0.0085816454 0.047664806 -0.057765364880 -0.08865234 -0.12176772
## 977 0.0014144205 0.033275676 -0.034035729627 -0.08830675 -0.10887315
## 980 0.0253061902 -0.006451708 -0.038422898307 0.02005102 -0.06422146
## 984 0.0761082051 -0.019052946 -0.076580903691 -0.03026700 0.09403175
## 988 0.0770815595 -0.034737179 -0.082454256888 0.01557689 -0.10443103
## 992 0.0086053142 0.046517889 -0.056691804732 -0.08864507 -0.12088871
## 997 0.0364223946 -0.049906970 -0.011748798480 0.02050950 -0.07238784
## 1000 -0.0117153327 0.005705746 0.010602145425 -0.08595414 -0.09951017
## cov.r cook.d hat
## 5 1.0119988 0.003601324 0.012826119
## 9 0.9999273 0.001150827 0.002640252
## 13 1.0024883 0.001079011 0.003656786
## 25 1.0018276 0.002283288 0.004854630
## 26 1.0048428 0.001211051 0.005189871
## 29 1.0016424 0.001154229 0.003370873
## 33 1.0028842 0.001058160 0.003822832
## 41 1.0028488 0.003012658 0.006088440
## 51 0.9956379 0.004287956 0.003487367
## 53 0.9956837 0.004345316 0.003533694
## 55 0.9985961 0.001791482 0.002892721
## 56 0.9989761 0.002729400 0.003883274
## 64 1.0008581 0.001934549 0.004006751
## 72 1.0059925 0.002945603 0.007969413
## 73 1.0010118 0.004579346 0.006173645
## 84 1.0038429 0.002026305 0.005697737
## 87 1.0002835 0.004786782 0.005901280
## 88 0.9979507 0.001549620 0.002416006
## 90 1.0110216 0.001156067 0.009522798
## 91 0.9967725 0.004740574 0.004178540
## 97 1.0006044 0.004579081 0.005952020
## 104 1.0016957 0.001514058 0.003909350
## 108 1.0044427 0.001254465 0.005018610
## 120 1.0097466 0.002971430 0.010617433
## 122 1.0017711 0.002126821 0.004664578
## 123 0.9991227 0.002431729 0.003700935
## 136 1.0021925 0.001280174 0.003827329
## 140 1.0032868 0.001096425 0.004101361
## 143 1.0040945 0.001136418 0.004620655
## 145 1.0046985 0.001081284 0.004882323
## 146 1.0010911 0.003045802 0.005154070
## 147 1.0087436 0.001092161 0.007656896
## 149 0.9990035 0.008421112 0.006847845
## 159 1.0019552 0.004539856 0.006678775
## 161 0.9986145 0.001689992 0.002795995
## 164 1.0063176 0.008675978 0.011495484
## 169 1.0028433 0.001202441 0.004039966
## 175 0.9991002 0.003530179 0.004525919
## 179 1.0096181 0.001250032 0.008569067
## 187 1.0026894 0.001264663 0.004056525
## 195 1.0022368 0.001587160 0.004275955
## 198 0.9964065 0.008346082 0.005473704
## 202 1.0028465 0.001530678 0.004521491
## 207 1.0029489 0.003035414 0.006165338
## 216 0.9986370 0.003600005 0.004357272
## 217 1.0024486 0.003831399 0.006492772
## 219 1.0020305 0.001962481 0.004620783
## 223 1.0097810 0.001422361 0.008944072
## 231 1.0052300 0.002764792 0.007309941
## 234 1.0094265 0.001821118 0.009197069
## 244 1.0019638 0.001257260 0.003680635
## 248 1.0024629 0.001472866 0.004240228
## 249 0.9996094 0.002541980 0.004011473
## 255 0.9975814 0.004577666 0.004452386
## 257 1.0043334 0.001124516 0.004739698
## 260 1.0037291 0.001170263 0.004468883
## 261 0.9959653 0.004978951 0.003951482
## 265 1.0022023 0.004430513 0.006752356
## 273 0.9948751 0.005036142 0.003541381
## 278 0.9982440 0.001405806 0.002359225
## 286 0.9971518 0.003632278 0.003731329
## 288 1.0069344 0.001543650 0.007051476
## 289 1.0024961 0.002988320 0.005867441
## 307 1.0036368 0.003795953 0.007169507
## 309 1.0035019 0.001437093 0.004750850
## 310 1.0017693 0.003468963 0.005845984
## 312 0.9983437 0.004140958 0.004557339
## 313 1.0005006 0.006448653 0.006871932
## 322 1.0123694 0.001505085 0.011088050
## 327 0.9982698 0.001945691 0.002919429
## 332 1.0016854 0.001416269 0.003773274
## 336 1.0133156 0.003203768 0.013536086
## 338 1.0041352 0.003128308 0.006949272
## 340 0.9988942 0.003225480 0.004219319
## 343 1.0030344 0.002184803 0.005408091
## 345 1.0020903 0.001635494 0.004262857
## 347 1.0021063 0.003265460 0.005876004
## 349 1.0017058 0.001171311 0.003426937
## 353 1.0010507 0.003035710 0.005124956
## 354 0.9982394 0.001321756 0.002262330
## 358 1.0020849 0.001580814 0.004190136
## 359 1.0099875 0.001210616 0.008793110
## 371 1.0051550 0.004317956 0.008487880
## 374 1.0050036 0.001097477 0.005096794
## 380 1.0045926 0.001993472 0.006109159
## 385 0.9943912 0.005192943 0.003424050
## 386 1.0094690 0.001701714 0.009080595
## 388 0.9997185 0.004133349 0.005218637
## 389 0.9992681 0.001097168 0.002324828
## 392 1.0042201 0.003853065 0.007569137
## 395 0.9995575 0.001232193 0.002611761
## 396 1.0056086 0.001844554 0.006567203
## 401 1.0018824 0.003136344 0.005649409
## 403 0.9997641 0.001162333 0.002594454
## 413 1.0030434 0.001058980 0.003907365
## 414 0.9948791 0.005127536 0.003582331
## 416 1.0094147 0.003036640 0.010431370
## 422 0.9979482 0.001675175 0.002542652
## 432 1.0029509 0.001735029 0.004843464
## 436 1.0014637 0.001064389 0.003146621
## 437 1.0088531 0.001121473 0.007787415
## 439 1.0115063 0.001455989 0.010333207
## 444 0.9957323 0.004487114 0.003622658
## 448 1.0082228 0.001228659 0.007494252
## 449 1.0072325 0.001588065 0.007317896
## 450 1.0043764 0.001172083 0.004845266
## 453 0.9980197 0.001435003 0.002317246
## 454 0.9974132 0.002781042 0.003275861
## 456 1.0137557 0.001125772 0.011725463
## 458 0.9991142 0.002359627 0.003634606
## 459 0.9982391 0.001182843 0.002097557
## 462 0.9990793 0.001080822 0.002237806
## 468 0.9990302 0.003467710 0.004450694
## 472 1.0016589 0.001331059 0.003641631
## 473 1.0085061 0.001098367 0.007490497
## 476 1.0026419 0.001980643 0.004966502
## 479 1.0033881 0.002402858 0.005836075
## 482 0.9982873 0.001093890 0.002001588
## 484 0.9980261 0.001898723 0.002785624
## 490 0.9990021 0.001450434 0.002676330
## 491 0.9986221 0.001109245 0.002125755
## 493 1.0029280 0.001504745 0.004530058
## 495 1.0016025 0.001385343 0.003690936
## 499 0.9978444 0.001824372 0.002651019
## 500 0.9970378 0.004014440 0.003906636
## 513 0.9980984 0.001777472 0.002696030
## 519 1.0001319 0.003718485 0.005158232
## 523 1.0008506 0.003133495 0.005099228
## 535 1.0001646 0.003691898 0.005156728
## 537 0.9968569 0.005270884 0.004472297
## 540 0.9988075 0.001059579 0.002119304
## 545 0.9991660 0.001132033 0.002337008
## 554 0.9980553 0.001483245 0.002381001
## 556 1.0071134 0.003231351 0.008970328
## 567 0.9934885 0.006032266 0.003410770
## 572 1.0026557 0.001590095 0.004498743
## 583 1.0066631 0.003074359 0.008529970
## 591 1.0037723 0.010959881 0.010615573
## 598 1.0055161 0.006205770 0.009835068
## 599 1.0135993 0.001974635 0.012627665
## 605 1.0064768 0.001125626 0.006097931
## 608 0.9982894 0.006744673 0.005814310
## 615 0.9984349 0.004808469 0.004974367
## 616 1.0153796 0.003706639 0.015558182
## 623 0.9981231 0.001689682 0.002618037
## 629 1.0029060 0.001127092 0.003950658
## 631 1.0094337 0.003031320 0.010440608
## 632 1.0062763 0.001056769 0.005843125
## 633 1.0028884 0.001210794 0.004076859
## 636 1.0065072 0.001162680 0.006181091
## 637 0.9983013 0.002301441 0.003248991
## 649 0.9978254 0.001920446 0.002733517
## 654 1.0028258 0.001680146 0.004706291
## 656 1.0017690 0.002239605 0.004779965
## 661 0.9977392 0.002231375 0.002973878
## 664 1.0052520 0.003286064 0.007781272
## 665 1.0051657 0.003259245 0.007703795
## 666 0.9987275 0.003289533 0.004188926
## 668 0.9985296 0.001526999 0.002591558
## 676 1.0055765 0.003936812 0.008494725
## 688 1.0128361 0.002603972 0.012641677
## 690 1.0061752 0.001882056 0.006982542
## 692 1.0111400 0.001159474 0.009622550
## 693 1.0048921 0.001421730 0.005546493
## 700 1.0042029 0.001186430 0.004767057
## 702 1.0031031 0.001469364 0.004576033
## 707 0.9973056 0.003681036 0.003824645
## 708 1.0003957 0.002629673 0.004456477
## 710 0.9974612 0.002668625 0.003212309
## 716 1.0010929 0.001277361 0.003303831
## 722 1.0049466 0.001275047 0.005356421
## 726 1.0035087 0.003174676 0.006610033
## 727 1.0031498 0.001431390 0.004547866
## 732 1.0042840 0.001713655 0.005579735
## 733 1.0063962 0.001478403 0.006595229
## 742 1.0014171 0.001105885 0.003192141
## 744 0.9968814 0.004633679 0.004171704
## 746 1.0046182 0.003643685 0.007662184
## 747 1.0045467 0.011251535 0.011221831
## 749 0.9981234 0.007761561 0.006132074
## 756 0.9982456 0.002746981 0.003584836
## 759 1.0044113 0.002683975 0.006721401
## 766 1.0040861 0.001230665 0.004770996
## 771 1.0031540 0.002015059 0.005288809
## 773 0.9970605 0.004614504 0.004239833
## 775 1.0034956 0.003002281 0.006456602
## 781 0.9989330 0.001130096 0.002254992
## 785 1.0001670 0.004502649 0.005674103
## 788 1.0083906 0.001278024 0.007693606
## 789 1.0106156 0.001811287 0.010084360
## 791 0.9983011 0.001088095 0.001998393
## 792 1.0037461 0.003055267 0.006651422
## 801 1.0024797 0.001180949 0.003818844
## 808 1.0031022 0.003645899 0.006738998
## 811 1.0093223 0.002246912 0.009604181
## 812 1.0030930 0.001575404 0.004715407
## 827 1.0040689 0.003049916 0.006841947
## 830 1.0026601 0.002538651 0.005555234
## 831 0.9995717 0.001323539 0.002735777
## 833 1.0045907 0.002440337 0.006593282
## 834 1.0061497 0.001327637 0.006207263
## 839 0.9979973 0.002417423 0.003225624
## 842 1.0127429 0.001870264 0.011823437
## 850 1.0021696 0.001714484 0.004402141
## 851 1.0053302 0.005489908 0.009325549
## 860 1.0082369 0.002808465 0.009378832
## 863 1.0094958 0.001697586 0.009095400
## 867 1.0084633 0.003574867 0.010191156
## 868 1.0016607 0.001276577 0.003563900
## 875 0.9989985 0.002605225 0.003792864
## 877 0.9885312 0.014036379 0.003474076
## 878 1.0099720 0.003477395 0.011206431
## 884 0.9997142 0.005114067 0.005781541
## 886 0.9936133 0.005898995 0.003408719
## 887 0.9960857 0.007274316 0.004946013
## 888 1.0017539 0.002036553 0.004559418
## 891 1.0017517 0.001864835 0.004367039
## 900 0.9982575 0.001149948 0.002062839
## 901 1.0007936 0.003202174 0.005123195
## 910 0.9977068 0.002576263 0.003237675
## 913 0.9981499 0.001573130 0.002508058
## 914 1.0033681 0.004661382 0.007589747
## 921 1.0111894 0.002744511 0.011494342
## 923 1.0046922 0.003257121 0.007402095
## 930 1.0000310 0.003574156 0.005008189
## 936 1.0016143 0.001618191 0.004003053
## 937 1.0008489 0.003126662 0.005092953
## 940 1.0083091 0.001687474 0.008210981
## 950 1.0011512 0.009907081 0.008611966
## 953 0.9981278 0.003434874 0.004021041
## 955 0.9976310 0.002523963 0.003168121
## 956 1.0035151 0.001474434 0.004811331
## 961 1.0038084 0.001510675 0.005028918
## 967 1.0048591 0.002505363 0.006825363
## 972 1.0080780 0.001064364 0.007117925
## 975 1.0040233 0.001114400 0.004542227
## 976 0.9971833 0.006106797 0.004997945
## 977 0.9975589 0.004443170 0.004370057
## 980 0.9982953 0.001178185 0.002109445
## 984 1.0019208 0.002361830 0.004981769
## 988 0.9969608 0.004160514 0.003955404
## 992 0.9971562 0.006001357 0.004939506
## 997 0.9991554 0.001468025 0.002753956
## 1000 0.9992471 0.003168574 0.004340472
Interpretation: The influence diagnostics list a number of observations as potentially influential, but their Cook’s distances are small. The largest value in the output above is about 0.014 (observation 877), well below the commonly used cutoff of 1, so no single observation appears to dominate the model’s estimates. The listed observations should still be checked to determine whether they are data entry errors, outliers, or valid extreme cases. A sensitivity analysis that refits the model without them shows how much the estimates depend on these points.
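The sensitivity analysis mentioned above can be sketched as follows. This is a minimal example under stated assumptions: it regenerates the simulated data from Step 1, assumes the final model contains age, BMI, and smoking status (the predictors shown in the regression table), and uses a Cook's distance cutoff of 4/n, which is a common rule of thumb and not necessarily the rule used to produce the listing above.

```r
# Recreate the simulated data from Step 1 (same seed and same draw order).
set.seed(123)
n <- 1000
age <- rnorm(n, mean = 55, sd = 10)
bmi <- rnorm(n, mean = 28, sd = 5)
bp <- rnorm(n, mean = 130, sd = 15)
cholesterol <- rnorm(n, mean = 200, sd = 30)
smoking_status <- rbinom(n, 1, 0.3)
logit_prob <- -5 + 0.02 * age + 0.1 * bmi + 0.005 * bp +
  0.004 * cholesterol + 1.0 * smoking_status
diabetes_complications <- rbinom(n, 1, 1 / (1 + exp(-logit_prob)))
diabetes_data <- data.frame(
  age = age,
  bmi = bmi,
  smoking_status = factor(smoking_status, labels = c("No", "Yes")),
  diabetes_complications = factor(diabetes_complications, labels = c("No", "Yes"))
)

# Assumed final model: age, BMI and smoking status as predictors.
model_formula <- diabetes_complications ~ age + bmi + smoking_status
final_model <- glm(model_formula, family = binomial, data = diabetes_data)

# Assumption: flag observations whose Cook's distance exceeds 4/n.
cooks_d <- cooks.distance(final_model)
flagged <- cooks_d > 4 / n

# Refit the same model without the flagged observations.
reduced_model <- glm(model_formula, family = binomial,
                     data = diabetes_data[!flagged, ])

# Put the odds ratios from both fits side by side.
or_compare <- cbind(
  all_data = exp(coef(final_model)),
  without_flagged = exp(coef(reduced_model))
)
cat("Observations flagged:", sum(flagged), "\n")
print(round(or_compare, 3))
```

If the two columns are close, the conclusions do not hinge on the flagged observations. Large shifts would call for a closer look at those records before reporting the model.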
Step 9: Model presentation
Regression table
final_model %>%
  tbl_regression(
    exponentiate = TRUE,
    conf.level = 0.95,
    label = list(
      age ~ "Age (years)",
      bmi ~ "Body Mass Index (BMI)",
      smoking_status ~ "Smoking Status (Yes vs No)"
    )
  ) %>%
  as_gt() %>%
  gt::tab_header(
    title = "Multiple Logistic Regression Analysis of Diabetes Complications",
    subtitle = "Odds Ratios with 95% Confidence Intervals"
  ) %>%
  # --- Adding model fit statistics ---
  gt::tab_source_note(
    source_note = paste0(
      "Note: Model fit statistics - AIC: ", round(AIC(final_model), 2),
      ", BIC: ", round(BIC(final_model), 2),
      ", Log-Likelihood: ", round(logLik(final_model), 2)
    )
  )

Multiple Logistic Regression Analysis of Diabetes Complications
Odds Ratios with 95% Confidence Intervals
| Characteristic | OR | 95% CI | p-value |
|---|---|---|---|
| Age (years) | 1.02 | 1.01, 1.04 | <0.001 |
| Body Mass Index (BMI) | 1.11 | 1.08, 1.14 | <0.001 |
| Smoking Status (Yes vs No) | | | |
| No | — | — | |
| Yes | 2.69 | 1.94, 3.78 | <0.001 |
Abbreviations: CI = Confidence Interval, OR = Odds Ratio
Note: Model fit statistics - AIC: 1207.07, BIC: 1226.7, Log-Likelihood: -599.53
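As a quick consistency check, the AIC and BIC in the note can be reproduced from the reported log-likelihood using AIC = -2·logLik + 2k and BIC = -2·logLik + k·log(n). The sketch below assumes k = 4 estimated parameters (intercept, age, BMI, smoking status) and n = 1000 patients. Because the log-likelihood is typed in from the rounded value in the note, the results can differ from the reported figures in the last digit.

```r
# Values taken from the table note and the study description.
log_lik <- -599.53  # reported log-likelihood (already rounded)
k <- 4              # assumed: intercept + age + BMI + smoking status
n <- 1000           # number of patients in the data set

# Information criteria computed from the log-likelihood.
aic <- -2 * log_lik + 2 * k
bic <- -2 * log_lik + log(n) * k

print(round(c(AIC = aic, BIC = bic), 2))
```

BIC is larger than AIC here because its penalty per parameter, log(1000) ≈ 6.9, exceeds the AIC penalty of 2.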
Step 10: Interpretation of the final model table
Interpretation:
- The final multiple logistic regression model indicates that age, BMI, and smoking status are significant predictors of diabetes complications.
- Holding the other predictors constant, each additional year of age is associated with about a 2% increase in the odds of complications (OR = 1.02, 95% CI: 1.01 to 1.04), and each one-unit increase in BMI with about an 11% increase in the odds (OR = 1.11, 95% CI: 1.08 to 1.14).
- Smokers have about 2.7 times the odds of complications compared with non-smokers (OR = 2.69, 95% CI: 1.94 to 3.78).
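- AIC, BIC, and the log-likelihood are relative measures: they are useful for comparing this model with competing models fitted to the same data, but they do not by themselves show that the fit is adequate. Conclusions about fit should rest on the goodness-of-fit and diagnostic checks rather than on these statistics alone.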
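The percentage changes quoted in the interpretation follow from (OR − 1) × 100, and an odds ratio for a difference of several units is the per-unit odds ratio raised to that power. The sketch below illustrates both, and also computes Wald and profile-likelihood confidence intervals, which can differ slightly in the last reported digit. It is an illustration under assumptions: the data are regenerated as in Step 1, and the model formula is inferred from the predictors shown in the regression table.

```r
# Recreate the simulated data from Step 1 (same seed and same draw order).
set.seed(123)
n <- 1000
age <- rnorm(n, mean = 55, sd = 10)
bmi <- rnorm(n, mean = 28, sd = 5)
bp <- rnorm(n, mean = 130, sd = 15)
cholesterol <- rnorm(n, mean = 200, sd = 30)
smoking_status <- rbinom(n, 1, 0.3)
logit_prob <- -5 + 0.02 * age + 0.1 * bmi + 0.005 * bp +
  0.004 * cholesterol + 1.0 * smoking_status
diabetes_complications <- rbinom(n, 1, 1 / (1 + exp(-logit_prob)))
diabetes_data <- data.frame(
  age = age,
  bmi = bmi,
  smoking_status = factor(smoking_status, labels = c("No", "Yes")),
  diabetes_complications = factor(diabetes_complications, labels = c("No", "Yes"))
)

# Assumed final model: age, BMI and smoking status as predictors.
final_model <- glm(diabetes_complications ~ age + bmi + smoking_status,
                   family = binomial, data = diabetes_data)

# Odds ratios (intercept dropped) and the implied percent change in odds.
or <- exp(coef(final_model))[-1]
pct_change <- (or - 1) * 100

# Wald intervals (symmetric on the log-odds scale).
wald_ci <- exp(confint.default(final_model))[-1, ]
# Profile-likelihood intervals (older R versions need the MASS package for this).
profile_ci <- exp(suppressMessages(confint(final_model)))[-1, ]

or_summary <- data.frame(
  OR = or,
  pct_change = pct_change,
  wald_low = wald_ci[, 1], wald_high = wald_ci[, 2],
  profile_low = profile_ci[, 1], profile_high = profile_ci[, 2]
)
print(round(or_summary, 3))

# Odds ratio for a 10-year age difference: exp(10 * beta_age) = OR_age^10.
or_age_10 <- exp(10 * coef(final_model)["age"])
print(round(or_age_10, 3))
```

Reporting the odds ratio for a clinically meaningful difference, such as 10 years of age or 5 BMI units, is often easier to read than the per-unit value.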