23/2/2021

Back to the beginning

(Test score depends on IQ)

  • \[ GPA = \alpha + \beta \cdot IQ + \epsilon \]

  • \(\alpha\) and \(\beta\) are the intercept and the slope of the line (when talking about a population)

A new notation for linear models

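  • \[ GPA = \alpha + \beta \cdot IQ + \epsilon \]
  • \(\alpha\) is more usually written \(\beta_0\), and the slope \(\beta\) becomes \(\beta_1\)
  • So the more general form, for observation \(i\) with response \(y_i\) and predictor \(x_i\), is
  • \[ y_i = \beta_0 + \beta_1x_i +\epsilon_i \]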
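
As a minimal sketch of the general form above, the following Python snippet simulates data from \(y_i = \beta_0 + \beta_1x_i + \epsilon_i\) and recovers the intercept and slope by ordinary least squares. The parameter values, the function names and the choice of least squares as the estimation method are assumptions made for illustration; they are not taken from the lecture.

```python
import random


def simulate_linear_model(n, beta0, beta1, sigma, seed=1):
    """Draw n observations from y_i = beta0 + beta1 * x_i + eps_i."""
    rng = random.Random(seed)
    # Hypothetical predictor values (IQ-like scores between 80 and 130).
    xs = [rng.uniform(80.0, 130.0) for _ in range(n)]
    # eps_i ~ N(0, sigma^2), drawn independently for each observation.
    ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


def fit_least_squares(xs, ys):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Illustrative (made-up) population values for beta0, beta1 and sigma.
xs, ys = simulate_linear_model(n=200, beta0=1.0, beta1=0.02, sigma=0.3)
b0_hat, b1_hat = fit_least_squares(xs, ys)
print(f"estimated intercept {b0_hat:.3f}, estimated slope {b1_hat:.4f}")
```

The estimates should land near the values used in the simulation, but not exactly on them, because each \(y_i\) includes its own error term \(\epsilon_i\).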

The error structure of a linear model

We assume that the errors \(\epsilon_i\) are independent and identically distributed, with mean zero and constant variance:

\[ E[\epsilon_i] = 0 \]

\[ var[\epsilon_i] = \sigma^2 \]

For linear models we additionally assume that the errors are normally distributed, so

\[ \epsilon_i \sim N(0,\sigma^2) \]
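
To make the error assumptions concrete, here is a small Python sketch that draws independent errors from \(N(0,\sigma^2)\) and compares their sample mean and sample variance with \(0\) and \(\sigma^2\). The value \(\sigma = 2\), the sample size and the function name are assumptions chosen for illustration.

```python
import random
import statistics


def simulate_errors(n, sigma, seed=1):
    """Draw n independent errors, each from N(0, sigma^2)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


# Hypothetical values: 20,000 errors with sigma = 2, so sigma^2 = 4.
errors = simulate_errors(n=20_000, sigma=2.0)
sample_mean = statistics.fmean(errors)
sample_variance = statistics.variance(errors)
print(f"sample mean {sample_mean:.3f} (theory: 0)")
print(f"sample variance {sample_variance:.3f} (theory: 4)")
```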

A generalized linear model is made up of

A linear predictor \(\eta\)

\[ \eta_i = \beta_0 + \beta_1x_{1i} + \ldots + \beta_px_{pi} \]

and two functions

  • the link function that describes how the mean \(\mu_i\) depends on the linear predictor \[ g(\mu_i) = \eta_i \]

  • the variance (or error) function that describes how the variance \(var(Y_i)\) depends on the mean \[ var(Y_i) = \phi V(\mu_i) \] where the dispersion parameter \(\phi\) is a constant
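
The three ingredients above can be sketched in a few lines of Python. The example uses a Poisson model with a log link, for which \(V(\mu) = \mu\) and \(\phi = 1\); this particular model, the coefficient values and all function names are assumptions chosen for illustration and do not come from the lecture.

```python
import math


def linear_predictor(betas, xs):
    """eta_i = beta_0 + beta_1 * x_1i + ... + beta_p * x_pi for one observation."""
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))


def log_link(mu):
    """Link function g: maps the mean to the linear predictor scale."""
    return math.log(mu)


def inverse_log_link(eta):
    """Inverse link: maps the linear predictor back to the mean."""
    return math.exp(eta)


def poisson_variance_function(mu):
    """V(mu) = mu: for a Poisson response the variance equals the mean."""
    return mu


def glm_mean_and_variance(betas, xs, inverse_link, variance_function, phi=1.0):
    """Return (mu_i, var(Y_i)) with g(mu_i) = eta_i and var(Y_i) = phi * V(mu_i)."""
    eta = linear_predictor(betas, xs)
    mu = inverse_link(eta)
    return mu, phi * variance_function(mu)


# Hypothetical coefficients (beta_0, beta_1) and one predictor value.
mu, var = glm_mean_and_variance(
    betas=[0.5, 0.2],
    xs=[2.0],
    inverse_link=inverse_log_link,
    variance_function=poisson_variance_function,
)
print(f"mean {mu:.4f}, variance {var:.4f}")
```

Swapping in a different inverse link and variance function gives a different member of the family without touching the linear predictor.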

Normal general linear models as a special case

This is easiest to understand if we write the familiar linear model in the same form. Taken from https://statmath.wu.ac.at/courses/heather_turner/glmCourse_001.pdf

For a general linear model with \(\epsilon_i \sim N(0,\sigma^2)\), the linear predictor is

\[ \eta_i = \beta_0 + \beta_1x_{1i} + \ldots + \beta_px_{pi} \]

the link function is the identity

\[ g(\mu_i) = \mu_i\]

and the variance function is

\[ V(\mu_i) = 1\]

i.e. the variance doesn’t change with the mean (normal distribution), and the dispersion parameter is \(\phi = \sigma^2\).
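
As a rough illustration of this special case, the Python sketch below evaluates the mean and variance of a normal linear model at several predictor values. With the identity link the mean equals the linear predictor, and with \(V(\mu_i) = 1\) the variance stays at \(\sigma^2\) whatever the mean is. The coefficients, \(\sigma\), the predictor values and the function names are assumptions made for illustration.

```python
def linear_predictor(betas, xs):
    """eta_i = beta_0 + beta_1 * x_1i + ... + beta_p * x_pi for one observation."""
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))


def identity_inverse_link(eta):
    """Identity link: g(mu) = mu, so mu = eta."""
    return eta


def constant_variance_function(mu):
    """V(mu) = 1: the variance does not depend on the mean."""
    return 1.0


def normal_model_mean_and_variance(betas, xs, sigma):
    """Return (mu_i, var(Y_i)) for a normal linear model with phi = sigma^2."""
    eta = linear_predictor(betas, xs)
    mu = identity_inverse_link(eta)
    phi = sigma ** 2
    return mu, phi * constant_variance_function(mu)


# Hypothetical intercept, slope and error standard deviation.
betas = [1.0, 0.02]
sigma = 0.3
for iq in (90.0, 110.0, 130.0):
    mu, var = normal_model_mean_and_variance(betas, [iq], sigma)
    print(f"IQ {iq:.0f}: mean {mu:.2f}, variance {var:.2f}")
```

The mean changes with the predictor while the variance is the same on every line, which is the constant-variance assumption of the ordinary linear model.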

Some possible generalized linear models

Next time

Doing a generalized linear model in R.