In a previous article, we discussed how to construct a logistic regression model in R (see previous article). However, we did not discuss how to assess whether the logistic regression model fit the data well. In this article, we’ll go over several goodness of fit tests that help determine whether a logistic regression model does a good job of fitting the data.
We will use the diabetes.csv example data from the previous article. You can download the data here. There are 768 observations in the diabetes.data data frame.
library("psych")
library("gmodels")
## Import file
diabetes.data <- read.csv("diabetes.csv", header = TRUE)
head(diabetes.data)
## Pregnancies Glucose BloodPressure SkinThickness Insulin BMI
## 1 6 148 72 35 0 33.6
## 2 1 85 66 29 0 26.6
## 3 8 183 64 0 0 23.3
## 4 1 89 66 23 94 28.1
## 5 0 137 40 35 168 43.1
## 6 5 116 74 0 0 25.6
## DiabetesPedigreeFunction Age Outcome
## 1 0.627 50 1
## 2 0.351 31 0
## 3 0.672 32 1
## 4 0.167 21 0
## 5 2.288 33 1
## 6 0.201 30 0
### Descriptive statistics for diabetes.data
describe(diabetes.data)
## vars n mean sd median trimmed mad min
## Pregnancies 1 768 3.85 3.37 3.00 3.46 2.97 0.00
## Glucose 2 768 120.89 31.97 117.00 119.38 29.65 0.00
## BloodPressure 3 768 69.11 19.36 72.00 71.36 11.86 0.00
## SkinThickness 4 768 20.54 15.95 23.00 19.94 17.79 0.00
## Insulin 5 768 79.80 115.24 30.50 56.75 45.22 0.00
## BMI 6 768 31.99 7.88 32.00 31.96 6.82 0.00
## DiabetesPedigreeFunction 7 768 0.47 0.33 0.37 0.42 0.25 0.08
## Age 8 768 33.24 11.76 29.00 31.54 10.38 21.00
## Outcome 9 768 0.35 0.48 0.00 0.31 0.00 0.00
## max range skew kurtosis se
## Pregnancies 17.00 17.00 0.90 0.14 0.12
## Glucose 199.00 199.00 0.17 0.62 1.15
## BloodPressure 122.00 122.00 -1.84 5.12 0.70
## SkinThickness 99.00 99.00 0.11 -0.53 0.58
## Insulin 846.00 846.00 2.26 7.13 4.16
## BMI 67.10 67.10 -0.43 3.24 0.28
## DiabetesPedigreeFunction 2.42 2.34 1.91 5.53 0.01
## Age 81.00 60.00 1.13 0.62 0.42
## Outcome 1.00 1.00 0.63 -1.60 0.02
Currently, the data frame has a variable called Pregnancies that contains the number of pregnancies each subject has had. We are only interested in whether or not a subject has ever been pregnant, so we’ll create a binary variable called group. We then add labels to the 0s and 1s and call the new data frame data1. There were a total of 657 (85.5%) subjects with a history of pregnancy and 111 (14.5%) subjects with no history of pregnancy.
#### Create a binary pregnancy indicator (0 = no pregnancies, 1 = one or more pregnancies)
diabetes.data$group[diabetes.data$Pregnancies == 0] <- 0
diabetes.data$group[diabetes.data$Pregnancies > 0] <- 1
CrossTable(diabetes.data$group)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 768
##
##
## | 0 | 1 |
## |-----------|-----------|
## | 111 | 657 |
## | 0.145 | 0.855 |
## |-----------|-----------|
##
##
##
##
### Create factors
data1 <- within(diabetes.data, {
group <- factor(group, labels = c("No history of pregnancies", "History of pregnancies"))
})
CrossTable(data1$group)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 768
##
##
## | No history of pregnancies | History of pregnancies |
## |---------------------------|---------------------------|
## | 111 | 657 |
## | 0.145 | 0.855 |
## |---------------------------|---------------------------|
##
##
##
##
### Compare datasets
head(diabetes.data)
## Pregnancies Glucose BloodPressure SkinThickness Insulin BMI
## 1 6 148 72 35 0 33.6
## 2 1 85 66 29 0 26.6
## 3 8 183 64 0 0 23.3
## 4 1 89 66 23 94 28.1
## 5 0 137 40 35 168 43.1
## 6 5 116 74 0 0 25.6
## DiabetesPedigreeFunction Age Outcome group
## 1 0.627 50 1 1
## 2 0.351 31 0 1
## 3 0.672 32 1 1
## 4 0.167 21 0 1
## 5 2.288 33 1 0
## 6 0.201 30 0 1
head(data1)
## Pregnancies Glucose BloodPressure SkinThickness Insulin BMI
## 1 6 148 72 35 0 33.6
## 2 1 85 66 29 0 26.6
## 3 8 183 64 0 0 23.3
## 4 1 89 66 23 94 28.1
## 5 0 137 40 35 168 43.1
## 6 5 116 74 0 0 25.6
## DiabetesPedigreeFunction Age Outcome group
## 1 0.627 50 1 History of pregnancies
## 2 0.351 31 0 History of pregnancies
## 3 0.672 32 1 History of pregnancies
## 4 0.167 21 0 History of pregnancies
## 5 2.288 33 1 No history of pregnancies
## 6 0.201 30 0 History of pregnancies
We create a crude logistic regression model to evaluate the association of the subject’s age (Age) with history of pregnancy (group).
Based on the crude logistic regression model, a 1-unit increase in age was associated with a 7% increase in the odds of having a history of pregnancy (OR = 1.07; 95% CI: 1.05, 1.10), which is statistically significant.
logit1 <- glm(group ~ Age, data = data1, family = binomial(link = "logit"))
summary(logit1)
##
## Call:
## glm(formula = group ~ Age, family = binomial(link = "logit"),
## data = data1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.9478 0.2902 0.4966 0.6630 0.7500
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.33970 0.39173 -0.867 0.386
## Age 0.06972 0.01343 5.193 2.06e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 634.53 on 767 degrees of freedom
## Residual deviance: 597.14 on 766 degrees of freedom
## AIC: 601.14
##
## Number of Fisher Scoring iterations: 5
confint(logit1)
## 2.5 % 97.5 %
## (Intercept) -1.13328100 0.40600173
## Age 0.04484305 0.09761406
### Exponentiate the coefficients
exp(coef(logit1))
## (Intercept) Age
## 0.711984 1.072211
exp(confint(logit1))
## 2.5 % 97.5 %
## (Intercept) 0.3219751 1.500805
## Age 1.0458637 1.102537
We can add potential confounders such as BMI, glucose, and skin thickness to the logistic regression model. This will generate model estimates that adjust for these other predictors.
logit2 <- glm(group ~ Age + BMI + Glucose + SkinThickness, data = data1, family = binomial(link = "logit"))
summary(logit2)
##
## Call:
## glm(formula = group ~ Age + BMI + Glucose + SkinThickness, family = binomial(link = "logit"),
## data = data1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.3549 0.2632 0.4737 0.6239 1.2282
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.516255 0.626257 2.421 0.015472 *
## Age 0.083301 0.014808 5.625 1.85e-08 ***
## BMI -0.053471 0.015487 -3.453 0.000555 ***
## Glucose -0.005492 0.003615 -1.519 0.128700
## SkinThickness 0.007768 0.007413 1.048 0.294694
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 634.53 on 767 degrees of freedom
## Residual deviance: 578.30 on 763 degrees of freedom
## AIC: 588.3
##
## Number of Fisher Scoring iterations: 6
confint(logit2)
## 2.5 % 97.5 %
## (Intercept) 0.296789712 2.755963522
## Age 0.055821246 0.114006339
## BMI -0.084232809 -0.023425961
## Glucose -0.012607513 0.001577382
## SkinThickness -0.006861665 0.022262463
### Exponentiate the coefficients
exp(coef(logit2))
## (Intercept) Age BMI Glucose SkinThickness
## 4.5551360 1.0868687 0.9479330 0.9945226 1.0077981
exp(confint(logit2))
## 2.5 % 97.5 %
## (Intercept) 1.3455323 15.7361958
## Age 1.0574087 1.1207592
## BMI 0.9192172 0.9768463
## Glucose 0.9874716 1.0015786
## SkinThickness 0.9931618 1.0225121
In our motivating example, we have two models: (1) a crude model and (2) an adjusted model. The crude model only has the outcome (pregnancy = yes/no) and the predictor of interest (Age). The adjusted model has the outcome and predictor of interest along with other covariates (BMI, Glucose, and SkinThickness). We can assess which of these two models fits the data better using different types of tests. One of these tests is the Hosmer-Lemeshow (HL) goodness of fit (GOF) test.
There are several R packages that you can use to perform the HL GOF test:
* hoslem.test function from the ResourceSelection package
* performance_hosmer function from the performance package
* hltest function from the largesamplehl package
Note: I noticed that when I used data1 in the hoslem.test function, I got p-values that were very small (e.g., \(10^{-14}\)). This does not occur when I use diabetes.data in the function. I believe this happens because the outcome variable group is a factor in data1. Therefore, when using the hoslem.test function, make sure to use the unfactored (numeric 0/1) outcome variable.
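If your outcome has already been converted to a factor, one workaround is to convert it back to a numeric 0/1 variable before calling hoslem.test. A minimal sketch (it assumes, as above, that "No history of pregnancies" is the first factor level, so as.numeric() codes the two levels as 1 and 2):
library(ResourceSelection)
### Convert the factored outcome back to numeric 0/1 before testing
hoslem.test(as.numeric(data1$group) - 1, fitted(logit1), g = 10)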
You can compare models if one is nested within the other. For our example, the crude model is nested within the adjusted model:

\[\text{logit}(p_{i}) = \beta_{0} + \beta_{1}(\text{Age}_{i})\]

\[\text{logit}(p_{i}) = \beta_{0} + \beta_{1}(\text{Age}_{i}) + \beta_{2}(\text{BMI}_{i}) + \beta_{3}(\text{Glucose}_{i}) + \beta_{4}(\text{SkinThickness}_{i})\]

where \(p_{i}\) is the probability of having a history of pregnancy.
We can compare both models using the likelihood ratio test to see which one fits the data better. We will use the lrtest function from the lmtest package. The null hypothesis is that the simpler (crude) model fits the data just as well as the adjusted model; the alternative hypothesis is that the adjusted model fits the data better.
library(lmtest)
### Comparing the crude model (logit1) to the adjusted model (logit2)
lrtest(logit1, logit2)
## Likelihood ratio test
##
## Model 1: group ~ Age
## Model 2: group ~ Age + BMI + Glucose + SkinThickness
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 2 -298.57
## 2 5 -289.15 3 18.846 0.0002942 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The likelihood ratio test is statistically significant (p < 0.001), suggesting that the adjusted model with four covariates (Age, BMI, Glucose, and SkinThickness) fits the data significantly better than the crude model.
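As a sanity check, the same likelihood ratio test can be performed with base R’s anova function (no extra package needed); it should reproduce the chi-squared statistic and p-value above.
### Equivalent likelihood ratio test using base R
anova(logit1, logit2, test = "Chisq")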
The Hosmer-Lemeshow GOF test evaluates whether the logistic regression model is a good fit for the data. The null hypothesis for the Hosmer-Lemeshow GOF test is that the model is a good fit of the data; the alternative hypothesis is that the model is not a good fit.
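To build intuition for what the packaged functions compute, here is a minimal hand-rolled sketch of the HL statistic: it bins subjects by deciles of predicted risk and compares observed and expected event counts across bins. (This is a sketch for intuition only; tie-handling of the quantile breaks may make it differ slightly from hoslem.test.)
### Hand-rolled HL statistic (sketch for intuition only)
p_hat <- fitted(logit1)                  ### predicted probabilities
y <- diabetes.data$group                 ### numeric 0/1 outcome
bins <- cut(p_hat, breaks = unique(quantile(p_hat, probs = seq(0, 1, 0.1))), include.lowest = TRUE)
obs <- tapply(y, bins, sum)              ### observed events per bin
expt <- tapply(p_hat, bins, sum)         ### expected events per bin
n <- tapply(y, bins, length)
hl <- sum((obs - expt)^2 / (expt * (1 - expt / n)))
pchisq(hl, df = length(obs) - 2, lower.tail = FALSE)   ### df = g - 2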
library("ResourceSelection")
library("largesamplehl")
library("PredictABEL")
library("performance")
### Method 1:
hl_gof <- hoslem.test(diabetes.data$group, fitted(logit1), g = 10) ### Use the unfactored data for the outcomes
hl_gof
##
## Hosmer and Lemeshow goodness of fit (GOF) test
##
## data: diabetes.data$group, fitted(logit1)
## X-squared = 22.446, df = 8, p-value = 0.004154
### Method 2:
performance_hosmer(logit1, n_bins = 10)
## # Hosmer-Lemeshow Goodness-of-Fit Test
##
## Chi-squared: 22.446
## df: 8
## p-value: 0.004
### Method 3:
hltest(logit1, G = 10)
##
## Modified Hosmer-Lemeshow test for large samples
## H0: epsilon<=epsilon0 vs. Ha: epsilon>epsilon0
## In-sample goodness of fit.
##
## data: Model evaluated on 768 observations
## HL statistic = 22.446, dof = 8.0000000, lambda = 0.0057656, p-value =
## 0.004179
## alternative hypothesis: true epsilon is greater than 0.002739948
## 95 percent confidence interval:
## 0.07177937 Inf
## sample estimates:
## epsilonHat
## 0.1371472
### Plot the predicted against the observed:
plotCalibration(data = data1, cOutcome = 9, predRisk = fitted(logit1), groups= 10)
## $Table_HLtest
## total meanpred meanobs predicted observed
## [0.755,0.780) 135 0.762 0.119 102.81 16
## 0.780 38 0.780 0.184 29.63 7
## [0.791,0.814) 94 0.797 0.234 74.94 22
## [0.814,0.834) 65 0.819 0.246 53.21 16
## [0.834,0.852) 64 0.838 0.359 53.63 23
## [0.852,0.884) 78 0.864 0.487 67.36 38
## [0.884,0.915) 75 0.898 0.467 67.37 35
## [0.915,0.939) 78 0.926 0.513 72.20 40
## [0.939,0.964) 68 0.949 0.559 64.55 38
## [0.964,0.995] 73 0.977 0.452 71.29 33
##
## $Chi_square
## [1] 2370.819
##
## $df
## [1] 8
##
## $p_value
## [1] 0
Based on the HL GOF test, the crude model does not appear to fit the data well (p = 0.004). We repeat the same three methods, and the calibration plot, for the adjusted model (logit2).
### Method 1:
hl_gof <- hoslem.test(diabetes.data$group, fitted(logit2), g = 10) ### Use the unfactored data for the outcomes
hl_gof
##
## Hosmer and Lemeshow goodness of fit (GOF) test
##
## data: diabetes.data$group, fitted(logit2)
## X-squared = 20.848, df = 8, p-value = 0.007561
### Method 2:
performance_hosmer(logit2, n_bins = 10)
## # Hosmer-Lemeshow Goodness-of-Fit Test
##
## Chi-squared: 20.848
## df: 8
## p-value: 0.008
### Method 3:
hltest(logit2, G = 10)
##
## Modified Hosmer-Lemeshow test for large samples
## H0: epsilon<=epsilon0 vs. Ha: epsilon>epsilon0
## In-sample goodness of fit.
##
## data: Model evaluated on 768 observations
## HL statistic = 20.848, dof = 8.0000000, lambda = 0.0057656, p-value =
## 0.007603
## alternative hypothesis: true epsilon is greater than 0.002739948
## 95 percent confidence interval:
## 0.06237691 Inf
## sample estimates:
## epsilonHat
## 0.1293432
### Plot the predicted against the observed:
plotCalibration(data = data1, cOutcome = 9, predRisk = fitted(logit2), groups= 10)
## $Table_HLtest
## total meanpred meanobs predicted observed
## [0.436,0.736) 77 0.668 0.545 51.42 42
## [0.736,0.787) 77 0.766 0.247 58.95 19
## [0.787,0.813) 77 0.800 0.195 61.58 15
## [0.813,0.836) 77 0.824 0.182 63.44 14
## [0.836,0.860) 76 0.848 0.276 64.44 21
## [0.860,0.889) 77 0.876 0.403 67.48 31
## [0.889,0.917) 77 0.903 0.390 69.55 30
## [0.917,0.945) 77 0.932 0.377 71.74 29
## [0.945,0.968) 77 0.956 0.455 73.65 35
## [0.968,0.999] 76 0.984 0.421 74.77 32
##
## $Chi_square
## [1] 3463.833
##
## $df
## [1] 8
##
## $p_value
## [1] 0
The adjusted model similarly shows evidence of poor fit (p = 0.008).
For large samples, there is a modified version of the Hosmer-Lemeshow test. To perform this test, we will need to install the largesamplehl package, which can be found here. (Note: I had trouble installing it using the install.packages() command. Instead, I used the install_github function from the devtools package: install_github("gnattino/largesamplehl").)
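For reference, here is the installation route that worked for me (a sketch; it assumes devtools is already installed):
### Install largesamplehl from GitHub, since install.packages() did not work for me
# install.packages("devtools")   ### uncomment if devtools is not installed
devtools::install_github("gnattino/largesamplehl")
library("largesamplehl")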
### Crude model
hltest(logit1)
##
## Modified Hosmer-Lemeshow test for large samples
## H0: epsilon<=epsilon0 vs. Ha: epsilon>epsilon0
## In-sample goodness of fit.
##
## data: Model evaluated on 768 observations
## HL statistic = 22.446, dof = 8.0000000, lambda = 0.0057656, p-value =
## 0.004179
## alternative hypothesis: true epsilon is greater than 0.002739948
## 95 percent confidence interval:
## 0.07177937 Inf
## sample estimates:
## epsilonHat
## 0.1371472
### Adjusted model
hltest(logit2)
##
## Modified Hosmer-Lemeshow test for large samples
## H0: epsilon<=epsilon0 vs. Ha: epsilon>epsilon0
## In-sample goodness of fit.
##
## data: Model evaluated on 768 observations
## HL statistic = 20.848, dof = 8.0000000, lambda = 0.0057656, p-value =
## 0.007603
## alternative hypothesis: true epsilon is greater than 0.002739948
## 95 percent confidence interval:
## 0.06237691 Inf
## sample estimates:
## epsilonHat
## 0.1293432
Currently, many users consider the Hosmer-Lemeshow GOF test to be obsolete. A common replacement, studied by Hosmer and colleagues, is an omnibus goodness of fit test based on the unweighted sum of squared errors (the le Cessie–van Houwelingen test). To perform this test, we fit the model with the lrm function from the rms package and then apply the residuals function with type = "gof".
The results from the crude model:
library("rms")
logit1.res <- lrm(group ~ Age, data = data1, y = TRUE, x = TRUE)
residuals(logit1.res, type = "gof")
## Sum of squared errors Expected value|H0 SD
## 8.965968e+01 9.089012e+01 2.857065e-01
## Z P
## -4.306652e+00 1.657438e-05
Based on the omnibus GOF test for the crude model (p < 0.0001), the specified model does not fit the data well.
The results from the adjusted model:
logit2.res <- lrm(group ~ Age + BMI + Glucose + SkinThickness, data = data1, y = TRUE, x = TRUE)
residuals(logit2.res, type = "gof")
## Sum of squared errors Expected value|H0 SD
## 8.537840e+01 8.821907e+01 5.065439e-01
## Z P
## -5.607947e+00 2.047404e-08
Based on the omnibus GOF test for the adjusted model (p < 0.00001), the specified model does not fit the data well.
Let’s look at a logistic regression model where the model specification fits the data well. I’m using the evans.csv data from Kleinbaum and Klein’s Logistic Regression text. This is a study of male patients in Evans County who were followed for 7 years to see whether they developed coronary heart disease. You can download the data here. The variables include chd (coronary heart disease, the outcome), cat (catecholamine level), age, chl (cholesterol), smk (smoking status), ecg (ECG abnormality), dbp (diastolic blood pressure), sbp (systolic blood pressure), hpt (hypertension), and two interaction terms (CATxHPT and CATxCHL).
data2 <- read.csv("evans.csv", header = TRUE)
head(data2)
## X patientid chd cat age chl smk ecg dbp sbp hpt CATxHPT CATxCHL
## 1 1 31 0 0 43 159 1 0 74 128 0 0 0
## 2 2 51 1 1 56 201 1 1 112 164 1 1 201
## 3 3 71 0 1 64 179 1 0 100 200 1 1 179
## 4 4 74 0 0 49 243 1 0 82 145 0 0 0
## 5 5 91 0 0 46 252 1 0 88 142 0 0 0
## 6 6 111 1 0 52 179 1 1 80 128 0 0 0
The outcome is coronary heart disease (chd), a dichotomous variable. We can construct a logistic regression model to see which factors (age, cholesterol, smoking status, catecholamine level, and ECG abnormality) were significantly associated with coronary heart disease:
logit3 <- glm(chd ~ age + chl + smk + cat + ecg, data = data2, family = binomial(link = "logit"))
summary(logit3)
##
## Call:
## glm(formula = chd ~ age + chl + smk + cat + ecg, family = binomial(link = logit),
## data = data2)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.1270 -0.5275 -0.4028 -0.2995 2.5166
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -6.765311 1.131393 -5.980 2.24e-09 ***
## age 0.032534 0.015143 2.148 0.03168 *
## chl 0.009401 0.003233 2.908 0.00364 **
## smk 0.822366 0.304303 2.702 0.00688 **
## cat 0.775393 0.332986 2.329 0.01988 *
## ecg 0.414322 0.292467 1.417 0.15659
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 438.31 on 607 degrees of freedom
## Residual deviance: 402.50 on 602 degrees of freedom
## AIC: 414.5
##
## Number of Fisher Scoring iterations: 5
confint(logit3)
## 2.5 % 97.5 %
## (Intercept) -9.030820529 -4.58458025
## age 0.002628888 0.06215569
## chl 0.003042452 0.01575518
## smk 0.246243987 1.44532854
## cat 0.118407908 1.42810972
## ecg -0.168689141 0.98130992
### Exponentiate the coefficients
exp(coef(logit3))
## (Intercept) age chl smk cat ecg
## 0.001153089 1.033068529 1.009445464 2.275878222 2.171444262 1.513344220
exp(confint(logit3))
## 2.5 % 97.5 %
## (Intercept) 0.0001196643 0.01020803
## age 1.0026323461 1.06412801
## chl 1.0030470852 1.01587995
## smk 1.2792116470 4.24324599
## cat 1.1257032016 4.17080775
## ecg 0.8447714673 2.66794875
The results of the HL GOF test indicate that the logistic regression model is a good fit to the observed data (p = 0.062). Additionally, the plot of the predicted against the observed probabilities shows that the model is neither over- nor underestimating the probabilities.
### Method 1:
hl_gof <- hoslem.test(data2$chd, fitted(logit3), g = 10)
hl_gof
##
## Hosmer and Lemeshow goodness of fit (GOF) test
##
## data: data2$chd, fitted(logit3)
## X-squared = 14.874, df = 8, p-value = 0.06165
### Method 2:
performance_hosmer(logit3, n_bins = 10)
## # Hosmer-Lemeshow Goodness-of-Fit Test
##
## Chi-squared: 14.874
## df: 8
## p-value: 0.062
## Summary: model seems to fit well.
### Method 3:
hltest(logit3, G = 10)
##
## Modified Hosmer-Lemeshow test for large samples
## H0: epsilon<=epsilon0 vs. Ha: epsilon>epsilon0
## In-sample goodness of fit.
##
## data: Model evaluated on 608 observations
## HL statistic = 14.874, dof = 8.0000000, lambda = 0.0045644, p-value =
## 0.06182
## alternative hypothesis: true epsilon is greater than 0.002739948
## 95 percent confidence interval:
## 0 Inf
## sample estimates:
## epsilonHat
## 0.1063262
### Plot the predicted against the observed:
plotCalibration(data = data2, cOutcome = 3, predRisk = fitted(logit3), groups= 10)
## $Table_HLtest
## total meanpred meanobs predicted observed
## [0.0153,0.0384) 61 0.030 0.000 1.84 0
## [0.0384,0.0551) 61 0.047 0.033 2.85 2
## [0.0551,0.0672) 61 0.062 0.082 3.77 5
## [0.0672,0.0764) 61 0.072 0.148 4.38 9
## [0.0764,0.0972) 60 0.086 0.117 5.15 7
## [0.0972,0.1105) 61 0.103 0.115 6.28 7
## [0.1105,0.1308) 61 0.120 0.131 7.33 8
## [0.1308,0.1608) 61 0.145 0.066 8.86 4
## [0.1608,0.2275) 61 0.190 0.115 11.60 7
## [0.2275,0.4820] 60 0.316 0.367 18.95 22
##
## $Chi_square
## [1] 14.817
##
## $df
## [1] 8
##
## $p_value
## [1] 0.0628
The results of the omnibus GOF test also indicate that the logistic regression model is a good fit to the observed data (p = 0.369).
logit3.res <- lrm(chd ~ age + chl + smk + cat + ecg, data = data2, y = TRUE, x = TRUE)
residuals(logit3.res, type = "gof")
## Sum of squared errors Expected value|H0 SD
## 58.0457420 58.4394289 0.4378980
## Z P
## -0.8990381 0.3686324
Logistic regression models are useful when you have a dichotomous outcome variable. However, the model may not fit the data well. Goodness of fit tests help determine whether the logistic regression model is correctly specified for your data. We demonstrated a logistic regression model that was poorly specified using the diabetes.csv data, and one that was correctly specified using the evans.csv data. You should not always assume that the logistic regression model is the best specification for your data. When the GOF results indicate a poor fit, you may want to reconsider the covariates in the model or try a different specification (e.g., a probit model).
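For example, a probit model swaps the logit link for the probit link while keeping the same linear predictor. A minimal sketch using the adjusted model’s covariates (illustrative only; its fit would still need to be assessed with the tests above):
### A probit alternative to logit2 (illustrative sketch)
probit2 <- glm(group ~ Age + BMI + Glucose + SkinThickness, data = data1, family = binomial(link = "probit"))
summary(probit2)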
The text that I used to understand the different types of tests for assessing the quality of a logistic regression model is Kleinbaum DG and Klein M. Logistic Regression: A Self-Learning Text; the book’s website contains useful information that complements the text.
Equally helpful is a website by Jonathan Bartlett called The Stats Geek, where he succinctly describes and provides code for performing the Hosmer-Lemeshow GOF test in R.
I learned about the modified Hosmer-Lemeshow GOF test when I reviewed the largesamplehl package, which can be found here.
Several StackExchange threads were helpful in writing this tutorial: link1, link2, and link3.
These tutorials are a work in progress, so they may contain errors. I highly recommend you seek out statistical consult if you plan on running any analysis meant for publication. Any comments and suggestions can be sent to: internal.validity.blog@gmail.com