http://www.umass.edu/statdata/statdata/data/lowbwt.txt
SOURCE: Hosmer and Lemeshow (2000) Applied Logistic Regression: Second Edition. These data are copyrighted by John Wiley & Sons Inc. and must be acknowledged and used accordingly. Data were collected at Baystate Medical Center, Springfield, Massachusetts during 1986.
## Fix cases 10 and 39 to make them identical to Dr. Orav's dataset
lbw <- read.table("http://www.umass.edu/statdata/statdata/data/lowbwt.dat", head = T, skip = 4)
lbw[c(10,39),"BWT"] <- c(2655,3035)
## Change variable names to lower cases
names(lbw) <- tolower(names(lbw))
head(lbw)
id low age lwt race smoke ptl ht ui ftv bwt
1 85 0 19 182 2 0 0 0 1 0 2523
2 86 0 33 155 3 0 0 0 0 3 2551
3 87 0 20 105 1 1 0 0 0 1 2557
4 88 0 21 108 1 1 0 0 1 2 2594
5 89 0 18 107 1 1 0 0 1 0 2600
6 91 0 21 124 3 0 0 0 0 0 2622
## Recoding
lbw <- within(lbw, {
## Relabel race
race.cat <- factor(race, levels = 1:3, labels = c("White","Black","Other"))
## Categorize ftv (frequency of visit)
ftv.cat <- cut(ftv, breaks = c(-Inf, 0, 2, Inf), labels = c("None","Normal","Many"))
ftv.cat <- relevel(ftv.cat, ref = "Normal")
## Dichotomize ptl
preterm <- factor(ptl >= 1, levels = c(FALSE,TRUE), labels = c("0","1+"))
## Categorize smoke ht ui
smoke <- factor(smoke, levels = 0:1, labels = c("No","Yes"))
ht <- factor(ht, levels = 0:1, labels = c("No","Yes"))
ui <- factor(ui, levels = 0:1, labels = c("No","Yes"))
})
logit.full <- glm(low ~ age + lwt + smoke + ht + ui + ftv.cat + race.cat + preterm, data = lbw, family = binomial)
logit.null <- glm(low ~ 1, data = lbw, family = binomial)
logit.full
Call: glm(formula = low ~ age + lwt + smoke + ht + ui + ftv.cat + race.cat +
preterm, family = binomial, data = lbw)
Coefficients:
(Intercept) age lwt smokeYes htYes uiYes ftv.catNone
0.4461 -0.0355 -0.0150 0.7681 1.8067 0.7252 0.2582
ftv.catMany race.catBlack race.catOther preterm1+
0.8033 1.1906 0.7530 1.2528
Degrees of Freedom: 188 Total (i.e. Null); 178 Residual
Null Deviance: 235
Residual Deviance: 196 AIC: 218
summary(logit.full)
Call:
glm(formula = low ~ age + lwt + smoke + ht + ui + ftv.cat + race.cat +
preterm, family = binomial, data = lbw)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.684 -0.810 -0.502 0.816 2.187
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.44611 1.26160 0.35 0.7236
age -0.03551 0.03842 -0.92 0.3553
lwt -0.01497 0.00697 -2.15 0.0317 *
smokeYes 0.76813 0.42380 1.81 0.0699 .
htYes 1.80673 0.71053 2.54 0.0110 *
uiYes 0.72520 0.46427 1.56 0.1183
ftv.catNone 0.25824 0.39945 0.65 0.5180
ftv.catMany 0.80332 0.71606 1.12 0.2619
race.catBlack 1.19059 0.53674 2.22 0.0265 *
race.catOther 0.75304 0.45949 1.64 0.1012
preterm1+ 1.25279 0.46763 2.68 0.0074 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 234.67 on 188 degrees of freedom
Residual deviance: 195.50 on 178 degrees of freedom
AIC: 217.5
Number of Fisher Scoring iterations: 4
exp(coef(logit.full))
(Intercept) age lwt smokeYes htYes uiYes ftv.catNone ftv.catMany
1.5622 0.9651 0.9851 2.1557 6.0905 2.0651 1.2946 2.2329
race.catBlack race.catOther preterm1+
3.2890 2.1234 3.5001
confint(logit.full)
2.5 % 97.5 %
(Intercept) -2.00843 2.966732
age -0.11268 0.038664
lwt -0.02948 -0.001962
smokeYes -0.05784 1.614551
htYes 0.44858 3.286032
uiYes -0.19642 1.637396
ftv.catNone -0.52029 1.054060
ftv.catMany -0.64396 2.209601
race.catBlack 0.14082 2.262345
race.catOther -0.14050 1.672051
preterm1+ 0.34686 2.193097
exp(confint(logit.full))
2.5 % 97.5 %
(Intercept) 0.1342 19.428
age 0.8934 1.039
lwt 0.9710 0.998
smokeYes 0.9438 5.026
htYes 1.5661 26.737
uiYes 0.8217 5.142
ftv.catNone 0.5943 2.869
ftv.catMany 0.5252 9.112
race.catBlack 1.1512 9.606
race.catOther 0.8689 5.323
preterm1+ 1.4146 8.963
library(epicalc)
logistic.display(logit.full)
Logistic regression predicting low
crude OR(95%CI) adj. OR(95%CI) P(Wald's test) P(LR-test)
age (cont. var.) 0.95 (0.89,1.01) 0.97 (0.9,1.04) 0.355 0.351
lwt (cont. var.) 0.99 (0.97,1) 0.99 (0.97,1) 0.032 0.023
smoke: Yes vs No 2.02 (1.08,3.78) 2.16 (0.94,4.95) 0.07 0.068
ht: Yes vs No 3.37 (1.02,11.09) 6.09 (1.51,24.52) 0.011 0.009
ui: Yes vs No 2.58 (1.14,5.83) 2.07 (0.83,5.13) 0.118 0.122
ftv.cat: ref.=Normal 0.514
None 1.84 (0.95,3.59) 1.29 (0.59,2.83) 0.518
Many 2.34 (0.66,8.28) 2.23 (0.55,9.09) 0.262
race.cat: ref.=White 0.055
Black 2.33 (0.94,5.77) 3.29 (1.15,9.42) 0.027
Other 1.89 (0.96,3.74) 2.12 (0.86,5.23) 0.101
preterm: 1+ vs 0 4.32 (1.92,9.73) 3.5 (1.4,8.75) 0.007 0.007
Log-likelihood = -97.7515
No. of observations = 189
AIC value = 217.5031
anova(logit.full, test = "LRT")
Analysis of Deviance Table
Model: binomial, link: logit
Response: low
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL 188 235
age 1 2.76 187 232 0.0966 .
lwt 1 4.79 186 227 0.0286 *
smoke 1 4.24 185 223 0.0394 *
ht 1 7.20 184 216 0.0073 **
ui 1 3.91 183 212 0.0481 *
ftv.cat 2 1.65 181 210 0.4373
race.cat 2 7.26 179 203 0.0265 *
preterm 1 7.36 178 196 0.0067 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(car)
Anova(logit.full, type = 3, test = "LR")
Analysis of Deviance Table (Type III tests)
Response: low
LR Chisq Df Pr(>Chisq)
age 0.87 1 0.3513
lwt 5.15 1 0.0233 *
smoke 3.32 1 0.0683 .
ht 6.80 1 0.0091 **
ui 2.40 1 0.1215
ftv.cat 1.33 2 0.5141
race.cat 5.79 2 0.0554 .
preterm 7.36 1 0.0067 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Effect of a given predictor after other variables are typical values.
library(effects)
plot(allEffects(logit.full), ylim = c(0,1))
More complex model for interaction plot
The model itself is not very meaningful. This is for demonstration purpose only.
logit.full.int <- glm(low ~ age*lwt + smoke + ht + ui + age*ftv.cat + race.cat*preterm, data = lbw, family = binomial)
Anova(logit.full.int, type = 3, test = "LR")
Analysis of Deviance Table (Type III tests)
Response: low
LR Chisq Df Pr(>Chisq)
age 0.41 1 0.5233
lwt 0.23 1 0.6325
smoke 3.37 1 0.0665 .
ht 6.23 1 0.0126 *
ui 3.30 1 0.0691 .
ftv.cat 10.55 2 0.0051 **
race.cat 3.45 2 0.1777
preterm 4.27 1 0.0388 *
age:lwt 0.00 1 0.9991
age:ftv.cat 10.62 2 0.0049 **
race.cat:preterm 0.06 2 0.9698
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Continuous x Continuous interaction
plot(effect("age:lwt", logit.full.int))
plot(effect("age:lwt", logit.full.int, xlevels = list(lwt = c(100,150,200,250))), multiline = T)
Continuous x Categorical interaction
plot(effect("age:ftv.cat", logit.full.int), multiline = TRUE)
Categorical x Categorical interaction
Use x.var and z.var to override the default X-axis variable and grouping variable assignment. The default is the predictor with the largest number of levels or values for x.var, which is not useful in categorical x categorical interactions.
plot(effect(c("race.cat*preterm"), logit.full.int), multiline = TRUE)
plot(effect(c("race.cat*preterm"), logit.full.int), x.var = "preterm", z.var = "race.cat", multiline = TRUE)