RMD 5 will focus on an introduction to logistic regression. Main takeaways should be when and how to use logistic regression.
To start, the formula for logistic regression is: log(p/1-p) = β0 + β1(var1)
library(faraway)
data(orings)
This is a cool plot from Dr. Knudson’s Logistic Intro.
plot(damage/6 ~ temp, data = orings, ylim = c(0,1), xlab = "Temperature at Launch", ylab = "Probability of O-ring damage")
The plot tells us a few things. First, as the temperature rises O-ring damage decreases and secondly, we should not launch below 55 degrees.
shuttlemod <- glm(cbind(damage,6-damage) ~ temp, family = binomial, data = orings)
summary(shuttlemod)
##
## Call:
## glm(formula = cbind(damage, 6 - damage) ~ temp, family = binomial,
## data = orings)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.9529 -0.7345 -0.4393 -0.2079 1.9565
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 11.66299 3.29626 3.538 0.000403 ***
## temp -0.21623 0.05318 -4.066 4.78e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 38.898 on 22 degrees of freedom
## Residual deviance: 16.912 on 21 degrees of freedom
## AIC: 33.675
##
## Number of Fisher Scoring iterations: 6
log(p/1-p) = 11.663 + -.2162*temp
plot((damage)/6 ~ jitter(temp), data = orings, ylim = c(0,1), xlim= c(25,85), xlab = "Temperature at launch", ylab = "Probability of O-Ring Damage")
xs<- seq(25,85,1)
ys <- ilogit(11.663-.2162*xs)
lines(xs,ys)
This plots our model for our data. We can see that the lower the temp, the higher chance of O-ring damage.
We can interpret regression equations by talking about the odds p/1-p. The odds of o-ring failure change by a factor of exp(.2162) for every degree the temp drops.
Just like Poisson regression we want to interpret the exponentiated coefficents. So, we want to calculate the confidence interval for a coefficent and the exponentiate the endpoints
forbeta <- coef(shuttlemod)[2] + 1.96*c(-1,1)*.05318
exp(forbeta)#first way
## [1] 0.7258104 0.8940435
CI <- confint(shuttlemod)
## Waiting for profiling to be done...
exp(CI)
## 2.5 % 97.5 %
## (Intercept) 263.8010254 1.372889e+08
## temp 0.7170161 8.867617e-01
df = n - # of betas
pchisq(deviance(shuttlemod), lower.tail = FALSE, df =1)
## [1] 3.914754e-05
How do we interpret this: H0: The model does not exhibit a lack of fit
So, in this example we can say that the model does exhibit a lack of fit and we can reject the null hypothesis
mod2 <- glm(cbind(damage, 6-damage) ~1, data = orings, family = binomial)
dsmall <- deviance(mod2)
dbig <- deviance(shuttlemod)
teststat <- dsmall-dbig
pchisq(teststat, df = 1, lower.tail = FALSE)
## [1] 2.747351e-06
```