Binomial Regression Model

Used when the response variable is binomially distributed, i.e. it counts k successes in n trials.

For a binomial response we need two pieces of information about each observation:

y: number of successes, and n: number of trials

OR y: number of successes, and n-y: number of failures
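In R, `glm()` accepts a binomial response in either of these forms. A minimal sketch on made-up data (the vectors `x`, `n` and `y` are hypothetical, not taken from these notes):

```r
# Hypothetical grouped data: y successes out of n trials at each value of x
x <- c(1, 2, 3, 4, 5)
n <- c(10, 10, 10, 10, 10)
y <- c(1, 3, 4, 7, 9)

# Form 1: two-column matrix of (successes, failures)
fit_counts <- glm(cbind(y, n - y) ~ x, family = binomial)

# Form 2: proportion of successes, with the number of trials as weights
fit_props <- glm(y / n ~ x, family = binomial, weights = n)

# Both forms describe the same likelihood, so the estimates should agree
all.equal(coef(fit_counts), coef(fit_props))
```

The two-column form is the one that matches the "successes and failures" description above.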

Example: Challenger Disaster

This dataset examines the failure of O-rings (seals in the space shuttle's solid rocket boosters) at different launch temperatures.

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.6     v dplyr   1.0.8
## v tidyr   1.2.0     v stringr 1.4.0
## v readr   2.1.2     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(faraway)
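The rest of these notes use a fitted object called `logitmod`, whose definition is not shown here. The sketch below is one plausible reconstruction, assuming the data are the `orings` data frame from faraway (columns `temp` and `damage`, with 6 O-rings per launch). Treat the formula as an assumption, not as the original code.

```r
library(faraway)

# orings: one row per shuttle mission flown before Challenger;
# temp = launch temperature (degrees F), damage = damaged O-rings out of 6
data(orings)
head(orings)

# Assumed reconstruction of the model the notes call `logitmod`:
# successes = damaged rings, failures = 6 - damage
logitmod <- glm(cbind(damage, 6 - damage) ~ temp,
                family = binomial(link = "logit"), data = orings)
coef(logitmod)
```

The coefficient on `temp` is on the log-odds scale: a negative value means the odds of O-ring damage fall as temperature rises.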

Linking function in Binomial Regression

The link function describes how the linear predictor (the linear combination of the predictors) is related to the mean of the response, here the probability of success.

In binomial regression, the three common link functions are Logit, Probit and Complementary Log-Log.

The link is chosen when the GLM is fitted, through the family argument.
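To make the three links concrete, the sketch below evaluates each one at a few probabilities and checks it against the link stored in R's `binomial()` family object. The vector `p` is just illustrative.

```r
# Each link maps a probability p in (0, 1) onto the whole real line
p <- c(0.1, 0.5, 0.9)

logit   <- log(p / (1 - p))    # log-odds, same as qlogis(p)
probit  <- qnorm(p)            # inverse of the standard normal CDF
cloglog <- log(-log(1 - p))    # complementary log-log

# The same transformations as stored in R's family objects
binomial(link = "logit")$linkfun(p)
binomial(link = "probit")$linkfun(p)
binomial(link = "cloglog")$linkfun(p)
```

Logit and probit are symmetric around p = 0.5, while the complementary log-log link is not. To use a different link in a fit, pass it to the family, for example `family = binomial(link = "probit")`.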

Inference

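Inference works much like in Poisson or logistic regression. The fitted model is compared to a saturated model (i.e. a perfect fit to the data) rather than to a null model. This changes the way we interpret the p-value: a small p-value is evidence of lack of fit, not evidence that a predictor matters.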

# upper-tail chi-squared probability of the residual deviance
pchisq(deviance(logitmod), df.residual(logitmod), lower.tail = FALSE)
## [1] 0.7164099

recall:

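H_0: our model (less complex) is adequate

H_A: the saturated model (more complex) is needed

Since the p-value is high (about 0.72), we fail to reject H_0. There is no evidence of lack of fit, so we say that the model fits the data well.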
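To show what this test compares, the sketch below computes the residual deviance by hand, as twice the log-likelihood gap between the fitted model and the saturated model. The data and the helper `xlogy()` are hypothetical and only serve the illustration.

```r
# Hypothetical grouped data (not from the notes)
x <- c(1, 2, 3, 4, 5, 6)
n <- rep(20, 6)
y <- c(2, 5, 8, 11, 15, 18)

fit <- glm(cbind(y, n - y) ~ x, family = binomial)

# Fitted counts under our model; the saturated model reproduces y exactly
yhat <- n * fitted(fit)

# Treat 0 * log(0) as 0 so groups with y = 0 or y = n do not produce NaN
xlogy <- function(a, b) ifelse(a == 0, 0, a * log(a / b))

# Residual deviance: twice the log-likelihood gap to the saturated model
dev_by_hand <- 2 * sum(xlogy(y, yhat) + xlogy(n - y, n - yhat))

# Degrees of freedom: number of groups minus number of coefficients
df_resid <- length(y) - length(coef(fit))

# Goodness-of-fit p-value, same recipe as the pchisq() call above
pchisq(dev_by_hand, df_resid, lower.tail = FALSE)
```

The chi-squared approximation behind this p-value is only reasonable when the numbers of trials n are not too small.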

Confidence Intervals

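Another method of interpreting our model. If the confidence interval for a coefficient includes zero, the data are consistent with that predictor having no effect, and you could drop it and build something else. Here the interval for temp lies entirely below zero, so temperature has a significant negative effect.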

library(MASS)
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
confint(logitmod)
## Waiting for profiling to be done...
##                 2.5 %    97.5 %
## (Intercept)  5.575195 18.737598
## temp        -0.332657 -0.120179
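The intervals printed above are profile-likelihood intervals. For comparison, a simpler Wald interval can be built by hand as estimate plus or minus a normal quantile times the standard error. The sketch below uses hypothetical data, not the O-ring model.

```r
# Hypothetical grouped data (not from the notes)
x <- c(1, 2, 3, 4, 5, 6)
n <- rep(20, 6)
y <- c(2, 5, 8, 11, 15, 18)
fit <- glm(cbind(y, n - y) ~ x, family = binomial)

est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))   # standard errors of the coefficients
z   <- qnorm(0.975)            # normal quantile for a 95% interval

wald <- cbind(lower = est - z * se, upper = est + z * se)
wald

# confint.default() computes the same Wald interval
confint.default(fit)
```

Wald and profile-likelihood intervals usually agree closely in large samples. With small samples the profile version is generally preferred.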

Odds