Week 4

This week we focused more on poisson regression and several criterias for model selection and adding/dropping variables. I’ll walk through a few.

##      visits       complaints     residency gender    revenue     
##  Min.   : 879   Min.   : 0.000   N:24      F:12   Min.   :206.4  
##  1st Qu.:2036   1st Qu.: 1.000   Y:20      M:32   1st Qu.:235.9  
##  Median :2384   Median : 2.000                    Median :258.5  
##  Mean   :2386   Mean   : 3.341                    Mean   :260.1  
##  3rd Qu.:2813   3rd Qu.: 5.000                    3rd Qu.:282.3  
##  Max.   :3763   Max.   :11.000                    Max.   :334.9  
##      hours     
##  Min.   : 589  
##  1st Qu.:1207  
##  Median :1512  
##  Mean   :1417  
##  3rd Qu.:1642  
##  Max.   :1917

Drop-In-Deviance Test

You can conduct a drop-in-deviance test a couple ways, I’ll show the automatic way with anova testing and the longer way with finding the test stat. To interpret, a low pvalue would indicate that the more complex model is a significant improvement over the null model. You must use nested models. As you can see, computing by hand returns a longer decimal vs rounded for the anova test.

mod0 <- glm(complaints ~ visits + factor(residency), data = esdcomp, family = "poisson")
mod1 <- glm(complaints ~ visits + factor(residency) + factor(gender), data = esdcomp, family = "poisson")
teststat <- deviance(mod0) - deviance(mod1)
pchisq(teststat, df = 1, lower.tail = FALSE)

## [1] 0.7049961

anova(mod0, mod1, test = "Chisq")

## Analysis of Deviance Table
## 
## Model 1: complaints ~ visits + factor(residency)
## Model 2: complaints ~ visits + factor(residency) + factor(gender)
##   Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1        41     50.882                     
## 2        40     50.739  1  0.14333    0.705

Lack of Fit Test

For a lack of fit test, a low pvalue would indicate a lack of fit. As you can see, this model does not have a lack of fit.

teststat <- summary(mod0)$deviance
wdf <- nrow(esdcomp) - length(coef(mod0))
pchisq(teststat, df = wdf, lower.tail= FALSE)

## [1] 0.138621

Confidence Interval

We also brushed up on creating confidence intervals. How to do it manually is below.

visitsCI <- exp(.0008101 + c(-1, 1)*0.0001429*1.96)
residencyCI <- exp(-0.3121729 + c(-1, 1)*0.1725024*1.96)
visitsCI

## [1] 1.000530 1.001091

residencyCI

## [1] 0.5219009 1.0262710

My Learning

This week I had some fun with the cats package. My new favorite function is called “here kitty” which returns a random cat and a cat noise. Enjoy.

here_kitty()

## mrrrrowwww

Week 4

StatsWithSarah

3/7/2021

Drop-In-Deviance Test

Lack of Fit Test

Confidence Interval

My Learning