What does regression do?

Steps in regression

Aide: what do I mean by “multilevel” or “hierarchical”?

What is it?

  • Many names for the same / similar issue
    • repeated measures
    • blocking
    • random effects / mixed effects models (see the sketch below)
    • multilevel models
    • hierarchical models
  • Measurements taken on the same thing multiple times, or on similar experimental units, are likely to be more similar to each other than expected by chance
  • Violates the assumption of independence in regression/ANOVA

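These models are fit in R with packages such as lme4 or nlme rather than lm(). A minimal sketch with lme4 (using lme4’s built-in sleepstudy data, which is not part of these notes):

library(lme4)

# repeated measures: many Reaction times per Subject,
# modeled with a random intercept for each Subject
lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)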

Multilevel Examples:

  • Education studies
    • students in class rooms in schools in districts in states
    • (multiple students per class, multiple classes per school, multiple schools per district)
  • Health care studies
    • patients on hospital floors in hospitals in cities
  • Animal studies
    • repeated measures on mice in cages
    • (multiple measurements per mouse, multiple mice per cage)

Regression models in R using lm()

Basic R regression

  • “lm()” = “linear model”
  • regression, t-tests, ANOVA, ANCOVA are all linear models
  • all are fit with the lm() function (see the t.test() comparison below)
  • (Welch’s correction is only available via t.test(); a similar effect can be achieved with gls())
  • “gls()” = generalized least squares
  • “glm()” = generalized linear model
lm.mass <- lm(fat.percent ~ log(mass.female),
             data = milk3)

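A quick sketch of the “t-tests are linear models” point, using R’s built-in sleep data (not the milk data):

# two-sample t-test without the Welch correction
t.test(extra ~ group, data = sleep, var.equal = TRUE)

# the same test as a linear model: the group coefficient has the
# same t statistic (up to sign) and the same p-value
summary(lm(extra ~ group, data = sleep))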

Model summary

  • Look at model w/ summary()
  • (we can get just the coefficients/slopes with coef() )
  • (another useful function is arm::display(); it has shorter output)
  • (recall that you can save these summary outputs and access them as lists with a $; see the sketch below)
summary(lm.mass)
## 
## Call:
## lm(formula = fat.percent ~ log(mass.female), data = milk3)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -8.34  -4.18  -1.05   1.22  15.74 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        20.515      3.249    6.31  1.7e-07 ***
## log(mass.female)   -1.752      0.457   -3.83  0.00044 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.96 on 40 degrees of freedom
## Multiple R-squared:  0.268,  Adjusted R-squared:  0.25 
## F-statistic: 14.7 on 1 and 40 DF,  p-value: 0.000443

We’ll unpack what each of these is

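A minimal sketch of saving the summary and accessing its pieces with $:

lm.sum <- summary(lm.mass)
lm.sum$coefficients   # just the coefficient table
lm.sum$r.squared      # the R-squared reported above (0.268)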

Plot model w/ summary output

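A minimal sketch of this plot (raw data plus the fitted line from the summary above):

plot(fat.percent ~ log(mass.female), data = milk3)
abline(lm.mass, col = 2)  # intercept = 20.5, slope = -1.75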

Plot model w/ equations

Add location of intercept

Coefficient plot

What is this?

  • arm::coefplot()
  • NOTE: by convention, axes are flipped! (a minimal call is sketched below)

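A minimal call might look like this (the intercept is omitted by default and can be added with intercept = TRUE):

# plot each coefficient estimate with its uncertainty bars
arm::coefplot(lm.mass)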

Compare coefficient plot to raw data

par(mfrow = c(1, 2), mar = c(5, 2, 7, 0.25))

# left panel: coefficient plot, intercept included
arm::coefplot(lm.mass, intercept = TRUE,
              mar = c(5, 6, 8, 1),
              cex.pts = 2,
              col.pts = 1:2,
              pch.pts = 16:17,
              lwd = 3)
mtext(text = "y = m*x + b",
      side = 1, adj = 1, line = -2)
mtext(text = "fat = slope*log(mass) + intercept",
      side = 1, adj = 1, line = -1)

# right panel: make.plot() is a custom helper (defined elsewhere)
# that plots the raw data
make.plot()

# mark the intercept: where the fitted line crosses x = 0
y. <- coef(lm.mass)[1]
abline(v = 0, col = 2, lwd = 2)
arrows(x0 = 1, x1 = 0,
       col = 2, lwd = 2,
       y0 = y.)
lab. <- round(coef(lm.mass)[1], 2)
text(x = 1.71, y = y.,
     label = lab.)

Linear regression as model fitting

Fit 2 models

Null model (Ho): flat line

  • Null hypothesis: ????
  • Null model: “fat.percent ~ 1”
  • “~1” means fit a model
    • w/ slope = 0
    • intercept = mean of response variable
lm.null <- lm(fat.percent ~ 1,
              data = milk3)

Alternative model: fat.percent ~ log(mass.female)


lm.mass <- lm(fat.percent ~ log(mass.female),
              data = milk3)

Plot both models

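A minimal sketch of this plot, overlaying both models on the raw data:

plot(fat.percent ~ log(mass.female), data = milk3)
abline(lm.mass, col = 2)            # alternative model: fitted slope
abline(h = coef(lm.null), col = 4)  # null model: flat line at the mean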

What is the intercept of the null model?

The mean fat value is 8.6

summary(milk3[,"fat.percent"])
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.70    3.75    6.60    8.58   11.90   27.00

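We can confirm this by extracting the null model’s single coefficient:

coef(lm.null)  # (Intercept) = mean of fat.percent (~8.58)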

Test model: significance test

Nested Models

Ha: y = -1.75*log(mass) + 20.5
Ho: y = 0*log(mass) + 8.6

  • Ha has TWO parameters (aka coefficients, betas)
    • Intercept = 20.515
    • Slope = -1.75
  • Ho has ONE parameter
    • Intercept = 8.6
    • (Slope = 0, so it doesn’t count)
  • The hypothesis test can be formulated as a comparison of 2 “nested models”
    • Applies to t-tests, ANOVA, ANCOVA
    • A major difference between hypothesis testing w/ p-values and IT-AIC is that AIC doesn’t require models to be nested!

Hypothesis test of nested models in R

  • anova()
  • carries out an F test for linear models
  • “likelihood ratio test” for GLMs/GLMMs
    • GLMM = generalized linear mixed models
    • Models w/ random effects, blocking, repeated measures, etc.
# NOTE: the order of models w/in anova() doesn't matter
anova(lm.null, 
      lm.mass)
## Analysis of Variance Table
## 
## Model 1: fat.percent ~ 1
## Model 2: fat.percent ~ log(mass.female)
##   Res.Df  RSS Df Sum of Sq    F  Pr(>F)    
## 1     41 1943                              
## 2     40 1422  1       521 14.7 0.00044 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  • Output of anova() is similar to the content of summary()
  • I like to emphasize the testing of nested models (see the by-hand check of the F statistic below)
  • GLMs and GLMMs often can only be tested with anova()

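The F statistic in this table can be reproduced by hand from the two residual sums of squares (RSS): F = ((1943 - 1422)/1)/(1422/40), which is about 14.7:

rss.null <- sum(resid(lm.null)^2)  # 1943
rss.mass <- sum(resid(lm.mass)^2)  # 1422
((rss.null - rss.mass) / 1) / (rss.mass / 40)  # F = ~14.7, as above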

Examine model: IT-AIC

IT-AIC

  • “Information Theory - Akaike’s Information Criterion”
  • “Information Theory” is a general approach to investigating models
  • AIC is one particular information criterion
    • Others: BIC, WAIC, FIC, DIC (Bayesian), …
    • seems like every statistician has one now
  • AICc: correction for small sample sizes
    • Should always use AICc
  • qAIC: correction for “overdispersion”
    • For GLMs (e.g. logistic regression)

AIC in base R

  • Note: no AICc command in base R!
  • Other packages can do AICc and other ICs
  • df = number of parameters (see the by-hand check below)
AIC(lm.null, 
    lm.mass)
##         df AIC
## lm.null  2 284
## lm.mass  3 273

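Here df is the number of parameters, and AIC = -2*log-likelihood + 2*df. A quick check for lm.mass (df = 3: intercept, slope, and residual standard deviation):

logLik(lm.mass)                           # log-likelihood (df = 3)
-2 * as.numeric(logLik(lm.mass)) + 2 * 3  # reproduces AIC = 273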

AIC table

  • Lower is “better”
  • Absolute values have NO MEANING
  • AICs are only interpretable relative to other AICs
  • focus: “delta AIC” (dAIC)
  • AIC goes down if a model fits the data better
  • BUT - every additional parameter adds a “penalty” of +2
library(bbmle)

AICtab(lm.null,
       lm.mass,
       base = T)
##         AIC   dAIC  df
## lm.mass 273.1   0.0 3 
## lm.null 284.2  11.1 2

AICc

  • AICc is a correction for small sample size
  • Doesn’t make much difference here
library(bbmle)

ICtab(lm.null,
      lm.mass,
      type = c("AICc"),
      base = T)
##         AICc  dAICc df
## lm.mass 273.7   0.0 3 
## lm.null 284.5  10.8 2

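AICc can also be computed by hand from AIC: AICc = AIC + 2k(k + 1)/(n - k - 1), where k = number of parameters and n = sample size. For lm.mass (k = 3, n = 42):

k <- 3
n <- nrow(milk3)  # 42 observations
AIC(lm.mass) + (2 * k * (k + 1)) / (n - k - 1)  # ~273.7, matching ICtab()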

What about R^2?

Hypothesis testing:

  • p-values: tell you if a predictor is “significantly” associated w/ a response
  • Do not tell you if the EFFECT SIZE is large
    • p can be very very small
    • but the slope can also be very small

IT-AIC

  • AIC tells you if one model fits the data better than another
  • AIC doesn’t tell you if either model fits the data very well

R^2

  • Tells you how much variation in the data is explained by the model
  • whether you use p-values or AIC, you should report R^2 (see below)

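In base R, R^2 comes from the model summary:

summary(lm.mass)$r.squared      # 0.268
summary(lm.mass)$adj.r.squared  # 0.25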

Null vs Alternative model

  • Slope of the Alt model is highly significant
  • But R^2 isn’t huge
  • Lots of scatter around the line
  • Other predictors might improve the fit