The data set Caesar in vcdExtra gives a 3 × 2 frequency table classifying 251 women who gave birth by Caesarian section by Infection (three levels: none, Type 1, Type2) and Risk, whether Antibiotics were used, and whether the Caesarian section was Planned or not. Infection is a natural response variable. In this exercise, consider only the binary outcome of infection vs. no infection.
Fit the main-effects logit model for the binary response Infect. Note that with the data in the form of a frequency data frame you will need to use weights=Freq in the call to glm(). (It might also be convenient to reorder the levels of the factors so that “No” is the baseline level for each.)
Use summary () or car::Anova () to test the terms in this model.
Interpret the coefficients in the fitted model in terms of their effect on the odds of infection.
Make one or more effects plots for this model, showing separate terms, or their combinations.
library(vcd)
## Loading required package: grid
library(vcdExtra)
## Loading required package: gnm
library(effects)
## Loading required package: carData
##
## Attaching package: 'carData'
## The following object is masked from 'package:vcdExtra':
##
## Burt
## lattice theme set by effectsTheme()
## See ?effectsTheme for details.
data("Caesar", package="vcdExtra")
Caesar.df <- as.data.frame(Caesar)
str(Caesar.df)
## 'data.frame': 24 obs. of 5 variables:
## $ Infection : Factor w/ 3 levels "Type 1","Type 2",..: 1 2 3 1 2 3 1 2 3 1 ...
## $ Risk : Factor w/ 2 levels "Yes","No": 1 1 1 2 2 2 1 1 1 2 ...
## $ Antibiotics: Factor w/ 2 levels "Yes","No": 1 1 1 1 1 1 2 2 2 2 ...
## $ Planned : Factor w/ 2 levels "Yes","No": 1 1 1 1 1 1 1 1 1 1 ...
## $ Freq : num 0 1 17 0 1 1 11 17 30 4 ...
#(a)
Caesar.df$Infect <- as.numeric(Caesar.df$Infection %in% c("Type 1", "Type 2"))
#reorder variable levels
Caesar.df$Risk <- factor(Caesar.df$Risk, levels(Caesar.df$Risk)[c(2,1)])
Caesar.df$Antibiotics <- factor(Caesar.df$Antibiotics, levels(Caesar.df$Antibiotics)[c(2,1)])
Caesar.df$Planned <- factor(Caesar.df$Planned, levels(Caesar.df$Planned)[c(2,1)])
model1 <- glm(Infect ~ Risk + Antibiotics + Planned, family = binomial, data = Caesar.df, weights = Freq)
# (b)
summary(model1)
##
## Call:
## glm(formula = Infect ~ Risk + Antibiotics + Planned, family = binomial,
## data = Caesar.df, weights = Freq)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -6.7471 -0.4426 0.0000 3.2338 5.4201
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.7935 0.4785 -1.658 0.0972 .
## RiskYes 1.8270 0.4364 4.186 2.84e-05 ***
## AntibioticsYes -3.0011 0.4593 -6.535 6.37e-11 ***
## PlannedYes -0.9064 0.4084 -2.219 0.0265 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 300.85 on 16 degrees of freedom
## Residual deviance: 236.36 on 13 degrees of freedom
## AIC: 244.36
##
## Number of Fisher Scoring iterations: 6
anova(model1)
## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: Infect
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev
## NULL 16 300.85
## Risk 1 4.104 15 296.75
## Antibiotics 1 55.163 14 241.59
## Planned 1 5.230 13 236.36
# (c) Our results indicate that the odds of infection decrease by 95% when an antibiotic is present, and decrease by 60% when it is planned.
effects <- exp(coef(model1)) - 1
effects.perc <- paste(round(100*effects, 2), "%", sep="")
effects.perc.abs <- paste(round(100*abs(effects), 2), "%", sep="")
effects.perc
## [1] "-54.77%" "521.52%" "-95.03%" "-59.6%"
#(d) Our results indicate that if there was risk factor present but antibiotics given, the probability of being infected can also decrease a lot, compared to when no antibiotics were given.
plot(model1)
model1.inter <- update(model1, . ~ . + Risk:Antibiotics)
plot(allEffects(model1.inter))
Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.