Problem:
Arbuthnot’s data on the sex ratio of births in London was examined in Chapter 3. Use a binomial logistic regression model to assess whether the proportion of male births varied with the variables Year, Plague, and Mortality in the Arbuthnot data set. Produce effect plots for the terms in this model. What do you conclude?
Compare your results with a null model whereas you only have the intercept of the curve.
#install.packages("carData")
#install.packages("survey")
#install.packages("~/Downloads/effects_4.0-2.tgz", repos = NULL, type = .Platform$pkgType)
library(effects)
Logit Model
library(HistData)
data(Arbuthnot, package="HistData")
#Arbuthnot
#str(Arbuthnot)
#cbind(Arbuthnot$Males, Arbuthnot$Females)
mod <- glm(cbind(Males, Females) ~ Year + Plague + Mortality, data=Arbuthnot, family=binomial)
summary(mod)
##
## Call:
## glm(formula = cbind(Males, Females) ~ Year + Plague + Mortality,
## family = binomial, data = Arbuthnot)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.1845 -0.9956 -0.0053 0.8503 3.7136
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.067e-02 3.091e-01 -0.132 0.8953
## Year 8.282e-05 1.931e-04 0.429 0.6680
## Plague 1.907e-06 1.135e-06 1.681 0.0928 .
## Mortality -1.858e-06 9.260e-07 -2.007 0.0448 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 169.74 on 81 degrees of freedom
## Residual deviance: 156.31 on 78 degrees of freedom
## AIC: 963.84
##
## Number of Fisher Scoring iterations: 3
Logistic Regression model shows that proportion of male birth vary by Motality and somewhat by Plague. Plague is directly proportional to male/female ration while mortality is inversly proportional to the male/female ratio
Effect Plots
e <- allEffects(mod,partial.residuals= TRUE)
plot(e, ylab="Males/Females",type="response",residuals.pch=1.5,rows=1, cols=3)
Effect plots visually show how male to female ratio vary by plague and mortality.
Comparison with the NULL model
modn <- glm(cbind(Males, Females) ~ 1, data=Arbuthnot, family=binomial)
anova(modn,mod, test="Chisq")
## Analysis of Deviance Table
##
## Model 1: cbind(Males, Females) ~ 1
## Model 2: cbind(Males, Females) ~ Year + Plague + Mortality
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 81 169.74
## 2 78 156.31 3 13.429 0.003795 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The summary of the test shows that the logistic regression model fits signficantly better than the null model