This week we focused more on poisson regression and several criterias for model selection and adding/dropping variables. I’ll walk through a few.
## visits complaints residency gender revenue
## Min. : 879 Min. : 0.000 N:24 F:12 Min. :206.4
## 1st Qu.:2036 1st Qu.: 1.000 Y:20 M:32 1st Qu.:235.9
## Median :2384 Median : 2.000 Median :258.5
## Mean :2386 Mean : 3.341 Mean :260.1
## 3rd Qu.:2813 3rd Qu.: 5.000 3rd Qu.:282.3
## Max. :3763 Max. :11.000 Max. :334.9
## hours
## Min. : 589
## 1st Qu.:1207
## Median :1512
## Mean :1417
## 3rd Qu.:1642
## Max. :1917
You can conduct a drop-in-deviance test a couple ways, I’ll show the automatic way with anova testing and the longer way with finding the test stat. To interpret, a low pvalue would indicate that the more complex model is a significant improvement over the null model. You must use nested models. As you can see, computing by hand returns a longer decimal vs rounded for the anova test.
mod0 <- glm(complaints ~ visits + factor(residency), data = esdcomp, family = "poisson")
mod1 <- glm(complaints ~ visits + factor(residency) + factor(gender), data = esdcomp, family = "poisson")
teststat <- deviance(mod0) - deviance(mod1)
pchisq(teststat, df = 1, lower.tail = FALSE)
## [1] 0.7049961
anova(mod0, mod1, test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: complaints ~ visits + factor(residency)
## Model 2: complaints ~ visits + factor(residency) + factor(gender)
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 41 50.882
## 2 40 50.739 1 0.14333 0.705
For a lack of fit test, a low pvalue would indicate a lack of fit. As you can see, this model does not have a lack of fit.
teststat <- summary(mod0)$deviance
wdf <- nrow(esdcomp) - length(coef(mod0))
pchisq(teststat, df = wdf, lower.tail= FALSE)
## [1] 0.138621
We also brushed up on creating confidence intervals. How to do it manually is below.
visitsCI <- exp(.0008101 + c(-1, 1)*0.0001429*1.96)
residencyCI <- exp(-0.3121729 + c(-1, 1)*0.1725024*1.96)
visitsCI
## [1] 1.000530 1.001091
residencyCI
## [1] 0.5219009 1.0262710
This week I had some fun with the cats package. My new favorite function is called “here kitty” which returns a random cat and a cat noise. Enjoy.
here_kitty()
## mrrrrowwww