Exercise 7.5 The data set Caesar in vcdExtra gives a 3 × 2 frequency table classifying 251 women who gave birth by Caesarian section by Infection (three levels: none, Type 1, Type2) and Risk, whether Antibiotics were used, and whether the Caesarian section was Planned or not. Infection is a natural response variable. In this exercise, consider only the binary outcome of infection vs. no infection.

data("Caesar", package = "vcdExtra")
Caesar.df <- as.data.frame(Caesar)
Caesar.df$Infect <- as.numeric(Caesar.df$Infection %in% c('Type 1', 'Type 2'))
str(Caesar.df)
## 'data.frame':    24 obs. of  6 variables:
##  $ Infection  : Factor w/ 3 levels "Type 1","Type 2",..: 1 2 3 1 2 3 1 2 3 1 ...
##  $ Risk       : Factor w/ 2 levels "Yes","No": 1 1 1 2 2 2 1 1 1 2 ...
##  $ Antibiotics: Factor w/ 2 levels "Yes","No": 1 1 1 1 1 1 2 2 2 2 ...
##  $ Planned    : Factor w/ 2 levels "Yes","No": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Freq       : num  0 1 17 0 1 1 11 17 30 4 ...
##  $ Infect     : num  1 1 0 1 1 0 1 1 0 1 ...
Caesar.df$Risk <- factor(Caesar.df$Risk, levels(Caesar.df$Risk)[c(2,1)])
Caesar.df$Antibiotics <- factor(Caesar.df$Antibiotics, levels(Caesar.df$Antibiotics)[c(2,1)])
Caesar.df$Planned <- factor(Caesar.df$Planned, levels(Caesar.df$Planned)[c(2,1)])
str(Caesar.df)
## 'data.frame':    24 obs. of  6 variables:
##  $ Infection  : Factor w/ 3 levels "Type 1","Type 2",..: 1 2 3 1 2 3 1 2 3 1 ...
##  $ Risk       : Factor w/ 2 levels "No","Yes": 2 2 2 1 1 1 2 2 2 1 ...
##  $ Antibiotics: Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 1 1 1 1 ...
##  $ Planned    : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
##  $ Freq       : num  0 1 17 0 1 1 11 17 30 4 ...
##  $ Infect     : num  1 1 0 1 1 0 1 1 0 1 ...
head(Caesar.df)
##   Infection Risk Antibiotics Planned Freq Infect
## 1    Type 1  Yes         Yes     Yes    0      1
## 2    Type 2  Yes         Yes     Yes    1      1
## 3      None  Yes         Yes     Yes   17      0
## 4    Type 1   No         Yes     Yes    0      1
## 5    Type 2   No         Yes     Yes    1      1
## 6      None   No         Yes     Yes    1      0
  1. Fit the main-effects logit model for the binary response Infect. Note that with the data in the form of a frequency data frame you will need to use weights=Freq in the call to glm(). (It might also be convenient to reorder the levels of the factors so that “No” is the baseline level for each.)
Caesar.logistic <- glm(Infect ~ Risk + Antibiotics + Planned, data = Caesar.df, family = binomial, weights=Freq)
  1. Use summary () or car::Anova () to test the terms in this model.
summary(Caesar.logistic)
## 
## Call:
## glm(formula = Infect ~ Risk + Antibiotics + Planned, family = binomial, 
##     data = Caesar.df, weights = Freq)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -6.7471  -0.4426   0.0000   3.2338   5.4201  
## 
## Coefficients:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)     -0.7935     0.4785  -1.658   0.0972 .  
## RiskYes          1.8270     0.4364   4.186 2.84e-05 ***
## AntibioticsYes  -3.0011     0.4593  -6.535 6.37e-11 ***
## PlannedYes      -0.9064     0.4084  -2.219   0.0265 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 300.85  on 16  degrees of freedom
## Residual deviance: 236.36  on 13  degrees of freedom
## AIC: 244.36
## 
## Number of Fisher Scoring iterations: 6
  1. Interpret the coefficients in the fitted model in terms of their effect on the odds of infection.
effect <- exp(coef(Caesar.logistic)) - 1
effect.perc <- paste(round(100*effect, 2), "%", sep="")
effect.perc.abs <- paste(round(100*abs(effect), 2), "%", sep="")
effect.perc
## [1] "-54.77%" "521.52%" "-95.03%" "-59.6%"

Hence The explanation of each factor is:

With all other factors constant, if there is presence of risk factors, the odds of being infected will increase by 521.52%, which is a significant increase.

With all other factors constant, if antibiotics was given, the odds of being infected will decrease by 95.03%, which is a significant decrease.

With all other factors constant, if Caesar section was planned, the odds of being infected will decreast by 59.6%, which is a big decrease.

  1. Make one or more effects plots for this model, showing separate terms, or their combinations.
library(effects)
## Warning: package 'effects' was built under R version 3.4.4
## Loading required package: carData
## Warning: package 'carData' was built under R version 3.4.4
## lattice theme set by effectsTheme()
## See ?effectsTheme for details.
plot(allEffects(Caesar.logistic), rows=1, cols=3)

From seperate effect plot, it confirmed that:

If there is presence of risk factors while keeping all other control factors constant at means, there is increased probability of being infected.

If there is antibiotics given while keeping all other control factors constant at means, the probability of being infected decreased a lot. Same conclusion is also true for the case when Caesar section was planned.

Caesar.logistic.inter <- update(Caesar.logistic, . ~ . + Risk:Antibiotics)
plot(allEffects(Caesar.logistic.inter))

From combination effect plot, we find that if there was risk factor presented but antibiotics given, the probability of being infected can also decrease a lot, compared with the case where risk factor was presented but no antibiotics were given.