Lecture 1 - Linear Regression Review

Regression vs Classification

  • Regression: \(f(X)=E[Y|X]\)
    • conditional expectation of Y given X
  • Classification: \(f(X)=Pr[Y=\text {label}|X]\)
    • conditional probability that y takes on a given label, given X
  • why conditional expectations?
    • \(E[Y|X]\) minimizes the mean squared error
    • the residual \(\epsilon = Y - E[Y|X]\) satisfies \(E[\epsilon |X]=0\), so it is uncorrelated with any function of X
    • we have broken Y into a component explained by X, and another component that is orthogonal to X
  • linear regression goal: find the best linear approximation of \(E[Y|X]\), i.e. the line \(\alpha + \beta X\) that minimizes the mean squared error between the predicted and actual values of Y across the observed points; the model is \(E[Y|X]=\alpha + \beta X\)
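
A short sketch of why the conditional expectation minimizes mean squared error. The symbols \(m(X)=E[Y|X]\) and \(g(X)\) (any other predictor built from X) are notation introduced here for illustration, not taken from the lecture. Expanding the square around \(m(X)\):

\[ E[(Y-g(X))^2] = E[(Y-m(X))^2] + 2E[(Y-m(X))(m(X)-g(X))] + E[(m(X)-g(X))^2] \]

The cross term is zero because \(\epsilon = Y-m(X)\) has \(E[\epsilon |X]=0\) and \(m(X)-g(X)\) is a function of X. That leaves

\[ E[(Y-g(X))^2] = E[(Y-m(X))^2] + E[(m(X)-g(X))^2] \ge E[(Y-m(X))^2] \]

with equality when \(g(X)=m(X)\).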

Ordinary Least Squares

  • estimate linear regression using OLS, which finds the values of parameters to minimize prediction errors
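    • choose \(\alpha, \beta\) to minimize the Residual Sum of Squares (RSS) \[ RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \quad \hat{y}_i = \hat{\alpha} + \hat{\beta} x_i \] where \[ \hat{\beta}=\frac {cov(x,y)}{var(x)}, \quad \hat{\alpha} = \bar{y} - \hat{\beta} \bar{x} \]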
  • key assumption behind OLS: \(E[\epsilon|X]=0\)
    • i.e. the part of Y not explained by X is effectively random: everything else in the world that explains Y (aside from X) is uncorrelated with X (aka no omitted variable bias!)
  • key assumption behind hypothesis testing in OLS: one individual’s error term cannot tell me anything about another individual’s error term
    • i.e. no correlation of \(\epsilon\) across individuals in our sample; if this fails, we tend to overestimate the degree to which including X in the model explains the variation of Y
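
The closed-form OLS estimates can be checked numerically. The following is a minimal sketch in Python with NumPy (the lecture itself works in R, so the language choice is an assumption made here). The data values and the helper names `ols_simple` and `rss` are made up for illustration.

```python
import numpy as np

# Hypothetical toy data, invented for this example only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])


def ols_simple(x, y):
    """Closed-form OLS for the model y = alpha + beta * x."""
    x_bar, y_bar = x.mean(), y.mean()
    # Slope = cov(x, y) / var(x); the 1/(n-1) factors cancel in the ratio.
    beta_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    # Intercept makes the fitted line pass through (x_bar, y_bar).
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def rss(x, y, alpha, beta):
    """Residual sum of squares for a candidate line alpha + beta * x."""
    return float(np.sum((y - (alpha + beta * x)) ** 2))


alpha_hat, beta_hat = ols_simple(x, y)
residuals = y - (alpha_hat + beta_hat * x)

print("alpha_hat:", alpha_hat, "beta_hat:", beta_hat)
print("RSS at the OLS estimates:", rss(x, y, alpha_hat, beta_hat))
print("RSS at a nearby line:", rss(x, y, alpha_hat + 0.1, beta_hat))
```

By construction the fitted residuals sum to zero and are uncorrelated with x in the sample, which is the sample counterpart of \(E[\epsilon|X]=0\).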

Linear Regression in R

  • use “binscattering” in R (a binned scatterplot: group X into bins and plot the mean of Y in each bin) to produce more readable figures when there is a lot of data
  • the underlying relationship stays the same: the linear OLS fit is essentially unchanged between the original and binned data
  • easier to visualize whether the data should be modelled linearly, quadratically, etc.
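
The binning idea can be sketched as follows. This is an illustrative Python/NumPy version, not the R routine used in the lecture. The function name `binscatter_points`, the choice of equal-count bins, and the simulated data are all assumptions made here.

```python
import numpy as np


def binscatter_points(x, y, n_bins=20):
    """Return the mean of x and the mean of y within equal-count bins of x."""
    order = np.argsort(x)  # sort observations by x
    x_bins = np.array_split(x[order], n_bins)
    y_bins = np.array_split(y[order], n_bins)
    x_means = np.array([b.mean() for b in x_bins])
    y_means = np.array([b.mean() for b in y_bins])
    return x_means, y_means


# Simulated data (hypothetical): a linear relationship plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = 1.0 + 2.0 * x + rng.normal(size=10_000)

x_means, y_means = binscatter_points(x, y, n_bins=20)

# np.polyfit with degree 1 returns [slope, intercept].
slope_raw, intercept_raw = np.polyfit(x, y, 1)
slope_bin, intercept_bin = np.polyfit(x_means, y_means, 1)

print("slope on raw data:   ", slope_raw)
print("slope on binned data:", slope_bin)
```

Plotting `x_means` against `y_means` gives 20 points instead of 10,000. In this simulation the two slopes come out close to each other, though they need not match exactly.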

Group Means

  • linear regression is also a useful tool to compute group means
  • computing group means
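    • regress Y on a full set of group (bin) indicators without an intercept: each coefficient is then that group’s mean, so the hypothesis test on each coefficient is a meaningful test about that bin (with an intercept, the coefficients would instead be differences from the omitted group)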
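
To see that a no-intercept regression on group indicators reproduces the group means, here is a small Python/NumPy sketch. The group labels and outcome values are invented for illustration, and the lecture's own examples are in R.

```python
import numpy as np

# Hypothetical toy data: a group label and an outcome for six individuals.
groups = np.array(["a", "a", "b", "b", "b", "c"])
y = np.array([1.0, 3.0, 2.0, 4.0, 6.0, 10.0])

labels = np.unique(groups)  # sorted unique group labels

# Design matrix with one indicator column per group and no intercept column.
D = (groups[:, None] == labels[None, :]).astype(float)

# Least-squares coefficients of y on the indicators.
coef, *_ = np.linalg.lstsq(D, y, rcond=None)

# Group means computed directly, for comparison.
group_means = np.array([y[groups == g].mean() for g in labels])

for g, c, m in zip(labels, coef, group_means):
    print(g, "coefficient:", c, "group mean:", m)
```

Each coefficient matches the corresponding group mean, so a test on a coefficient is a test about that group's mean.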

Multiple Linear Regression

  • multiple input variables, \(X_1 \text { and } X_2\)
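
With two inputs the model is commonly written \(Y=\alpha + \beta_1 X_1 + \beta_2 X_2 + \epsilon\) (this form is assumed here, since the notes do not spell it out). A minimal Python/NumPy sketch on simulated data follows. The coefficient values and variable names are made up, and the lecture's own code would be in R.

```python
import numpy as np

# Simulated data (hypothetical): x2 is correlated with x1 by construction.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 3.0 * x2 + rng.normal(scale=0.1, size=n)

# Design matrix: a column of ones for the intercept, then x1 and x2.
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha_hat, beta1_hat, beta2_hat = coef

# For comparison: regress y on x1 alone (np.polyfit returns [slope, intercept]).
slope_short, intercept_short = np.polyfit(x1, y, 1)

print("multiple regression:", alpha_hat, beta1_hat, beta2_hat)
print("slope on x1 when x2 is omitted:", slope_short)
```

In this simulation the regression with both inputs recovers coefficients near the values used to generate the data. Dropping \(X_2\) shifts the slope on \(X_1\) because \(X_2\) is correlated with \(X_1\), which is the omitted variable bias mentioned under the OLS assumptions.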

Lecture 2 - Experiments