Week 4

#Handout (Ch 4 Intro): https://www.dropbox.com/s/mffcxfvxe5j2vo0/Intro.pdf?dl=0
library(faraway)
data(gala)


# glm() works much like lm(), but you must give it "family = ", e.g. poisson for counts or binomial for logistic regression.
# summary() works on a glm() fit much as it does on an lm() fit.
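
# Sketch of the glm() call described above. Assumption: pmod1 is never defined in
# these notes, so the formula below is only a guess at the model used in class.
library(faraway); data(gala)
pmod1 <- glm(Species ~ Area + Elevation + Nearest + Scruz + Adjacent,
             family = poisson, data = gala)
summary(pmod1)   # printout notes that the dispersion parameter is taken to be 1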

#Dispersion parameter (fixed at 1 for Poisson): forces the variance to equal the mean.

#Deviance is the generalization of RSS (residual sum of squares, or sum of squared residuals). 
#Sometimes you’ll see this called the G-statistic. Small deviance indicates the model fits well.
#Large deviance indicates bad predictions.

# Goodness of fit with lm(): does the model fit? Judged using the residual sum of squares.
# GOF for glm(): use the residual deviance as the test stat. df is calculated from n.
# df = n - (# of betas) = n - (k+1)
# pchisq(teststat, df, lower.tail = FALSE)  <-- a small p-value suggests lack of fit
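
# Sketch of the GOF test above. Assumption: pmod1 and its formula are a guess,
# since the notes never define the model.
library(faraway); data(gala)
pmod1 <- glm(Species ~ Area + Elevation + Nearest + Scruz + Adjacent,
             family = poisson, data = gala)
gof_stat <- deviance(pmod1)      # residual deviance is the test stat
gof_df   <- df.residual(pmod1)   # n - (k+1)
gof_p    <- pchisq(gof_stat, gof_df, lower.tail = FALSE)
gof_p                            # upper-tail chi-squared p-value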

# halfnorm(residuals(model)) --> is there a trend? Look for outliers

##dispersion parameter: 
#How different is our variance from our mean? We want the estimate to be close to 1.
#dp <- sum(residuals(pmod1, type="pearson")^2) / pmod1$df.residual
#summary(pmod1, dispersion=dp)  <-- supply the dispersion to correct the standard errors
#Estimates do not change; only the SEs, z values and p-values do.
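
# Sketch of the dispersion estimate above. Assumption: the pmod1 formula is a
# guess, and se_plain / se_adj are names made up here for illustration.
library(faraway); data(gala)
pmod1 <- glm(Species ~ Area + Elevation + Nearest + Scruz + Adjacent,
             family = poisson, data = gala)
dp <- sum(residuals(pmod1, type = "pearson")^2) / df.residual(pmod1)
dp                                   # well above 1 would indicate overdispersion
se_plain <- summary(pmod1)$coefficients[, "Std. Error"]
se_adj   <- summary(pmod1, dispersion = dp)$coefficients[, "Std. Error"]
se_adj / se_plain                    # every SE is scaled by sqrt(dp)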

#Interpreting coefficients
#Increasing the predictor by 1 unit (holding the others fixed) multiplies the mean number of species by exp(Beta_1).
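
# Sketch of that interpretation. Assumption: the pmod1 formula is a guess,
# and mult_effect is a name made up here.
library(faraway); data(gala)
pmod1 <- glm(Species ~ Area + Elevation + Nearest + Scruz + Adjacent,
             family = poisson, data = gala)
mult_effect <- exp(coef(pmod1))   # multiplicative change in the mean per 1-unit increase
mult_effect                       # values above 1 raise the mean, below 1 lower it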

#Confidence interval: confint(modelname, level = 0.95) <-- not given dp, so it builds the CI under the assumption mean = var.
# If dp > 1, that CI will be too narrow.
# confint(modelname, level, dispersion = dp) is probably not valid: confint() does not appear to accept a dispersion argument.
# Instead do it manually by multiplying each SE by sqrt(dp).

#We estimated beta but want to interpret exp(beta), because that gives the impact on the mean. Similarly, we want a corresponding CI for exp(beta).
# Just exponentiate the endpoints of the dispersion-adjusted CI for beta.

#Stat theory: our beta estimates are approximately normally distributed. It is not necessarily true that exp(beta_hat) is normally distributed.
# So make the CI for the betas first, then transform the endpoints for the exponentiated version.
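
# Sketch of the manual CI above. Assumptions: the pmod1 formula is a guess, and a
# normal (Wald) quantile is used here; the class may have used a different quantile.
library(faraway); data(gala)
pmod1 <- glm(Species ~ Area + Elevation + Nearest + Scruz + Adjacent,
             family = poisson, data = gala)
dp     <- sum(residuals(pmod1, type = "pearson")^2) / df.residual(pmod1)
est    <- coef(pmod1)
se_adj <- sqrt(dp) * sqrt(diag(vcov(pmod1)))   # Poisson SE times sqrt(dp)
z      <- qnorm(0.975)                         # 95% two-sided
ci_beta <- cbind(lower = est - z * se_adj, upper = est + z * se_adj)
ci_exp  <- exp(ci_beta)                        # transform endpoints, not the SE
ci_exp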

#Drop in Deviance Tests:
#How much does the deviance drop as we go to a more complicated model?
# deviance(modelname)
#Deviance plays the role of the residual sum of squares.
# If the drop in deviance is large, we want to go with the more complicated model.
# Need to check whether it's a true Poisson model (mean = var); that decides which setting below applies.

#Setting 1: Mean = Var
# Test stat = drop in deviance; compare to Chi-Squared with df = (residual df of smaller model) - (residual df of larger model), upper tail
#Can do this with anova!
#anova(mod1,mod2, test ="Chisq")
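
# Sketch of the Setting 1 test done by hand and with anova(). Assumptions: the pmod1
# formula is a guess, and pmod_small (dropping Nearest and Scruz) is a made-up reduced model.
library(faraway); data(gala)
pmod1 <- glm(Species ~ Area + Elevation + Nearest + Scruz + Adjacent,
             family = poisson, data = gala)
pmod_small <- update(pmod1, . ~ . - Nearest - Scruz)
drop_dev <- deviance(pmod_small) - deviance(pmod1)        # drop in deviance
drop_df  <- df.residual(pmod_small) - df.residual(pmod1)  # number of betas dropped
drop_p   <- pchisq(drop_dev, drop_df, lower.tail = FALSE)
drop_p
anova(pmod_small, pmod1, test = "Chisq")                  # same test in one call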

#Setting 2: Mean =/= Var
#Test stat: (drop in dev)/(dp*df), where df = # Betas being dropped and dp comes from the larger model
#Now use an F-distribution with df1 = # Betas being dropped, df2 = n - # Betas in the larger model
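
# Sketch of the Setting 2 F test by hand. Assumptions: the pmod1 formula is a guess,
# and pmod_small (dropping Nearest and Scruz) is a made-up reduced model.
library(faraway); data(gala)
pmod1 <- glm(Species ~ Area + Elevation + Nearest + Scruz + Adjacent,
             family = poisson, data = gala)
pmod_small <- update(pmod1, . ~ . - Nearest - Scruz)
dp       <- sum(residuals(pmod1, type = "pearson")^2) / df.residual(pmod1)  # from larger model
drop_dev <- deviance(pmod_small) - deviance(pmod1)
drop_df  <- df.residual(pmod_small) - df.residual(pmod1)
f_stat   <- drop_dev / (dp * drop_df)
f_p      <- pf(f_stat, drop_df, df.residual(pmod1), lower.tail = FALSE)
c(F = f_stat, p = f_p)
# Refitting both models with family = quasipoisson and calling
# anova(small, large, test = "F") is expected to give a matching F statistic.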

#drop1(model, test="F") to see which would be the best variable to delete


#residual plot
#plot(residuals(pmod1)~predict(pmod1,type="link"),xlab="log lambda", ylab="deviance residuals")



#Homework: https://www.dropbox.com/s/vaktl73l6tknvqe/PoissonHW.pdf?dl=0

Cheyanne Simpson

2/28/2021