The model that we have considered so far is:

\[Y = \alpha + \beta X + \epsilon, \quad E(\epsilon|X)=0\]

The population parameters of interest are \(\alpha\) and \(\beta\). Parameters are fixed (non-random) quantities whose values are unknown.

The term \(\epsilon\) is a stochastic error term (i.e. a random variable) that is unobservable and can never be recovered from the data. \(\epsilon\) is not a parameter; it is a random variable. The type of estimation we learned in this class covers only how to estimate parameters (fixed unknown quantities), NOT random variables. So thinking that we are interested in estimating \(\epsilon\) is simply wrong.
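To see why \(\epsilon\) cannot be recovered from the data, note that \(\epsilon = Y - \alpha - \beta X\) can only be computed if \(\alpha\) and \(\beta\) are known. The short simulation below is only an illustrative sketch: the parameter values, the sample size, the distributions and the helper `implied_errors` are all assumptions made up for this example.

```python
import random

random.seed(1)

# Hypothetical "true" parameters; in practice these are unknown.
ALPHA, BETA = 1.0, 2.0

# Simulate a tiny data set from Y = ALPHA + BETA * X + eps.
xs = [random.uniform(0.0, 10.0) for _ in range(5)]
eps = [random.gauss(0.0, 1.0) for _ in range(5)]
ys = [ALPHA + BETA * x + e for x, e in zip(xs, eps)]

def implied_errors(a, b):
    """Errors implied by the data (xs, ys) if the parameters were (a, b)."""
    return [y - a - b * x for x, y in zip(xs, ys)]

# Only with the true parameters do we get the original errors back.
recovered = implied_errors(ALPHA, BETA)

# Any other candidate pair implies a different error series,
# and the data alone cannot tell us which series is the true one.
wrong = implied_errors(0.5, 2.1)

for e, r, w in zip(eps, recovered, wrong):
    print(f"true eps = {e: .3f}   with true (alpha, beta) = {r: .3f}   with a wrong guess = {w: .3f}")
```

Since only \(X\) and \(Y\) are observed, each candidate pair \((\alpha, \beta)\) implies its own error series, which is why the errors themselves are not something we estimate.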

Interpretations of the slope parameter

There are two equivalent interpretations of \(\beta\).

  1. Consider the equation \(Y = \alpha + \beta X + \epsilon\) and differentiate it with respect to \(X\). This gives \(\frac{dY}{dX} = \beta + \frac{d\epsilon}{dX}\), that is, \[\beta = \frac{dY}{dX} - \frac{d\epsilon}{dX}\] Since we assumed \(E(\epsilon|X)=0\) (which implies that \(\epsilon\) and \(X\) are uncorrelated), we treat \(\epsilon\) as not depending on \(X\), so \(\frac{d\epsilon}{dX} = 0\), i.e. \(\epsilon\) does not change with \(X\). Hence \(\beta = \frac{dY}{dX}\). Put differently, \(\Delta Y = \beta \Delta X\), so changing \(X\) by one unit changes \(Y\) by \(\beta\) units, keeping everything else the same.

  2. Consider again the linear equation above and take expectations to obtain \[E(Y|X)=\alpha + \beta X + E(\epsilon|X) =\alpha + \beta X\] Differentiating this expression with respect to \(X\) we see that \(\beta = \frac{dE(Y|X)}{dX}\), which means that changing \(X\) by one unit changes \(Y\) by \(\beta\) units on average.
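The two interpretations can be checked numerically. The sketch below is only an illustration under assumed values: the parameters `ALPHA` and `BETA`, the standard normal error, the sample size `N` and the helper functions are hypothetical choices, not part of the model above. It first holds \(\epsilon\) fixed and moves \(X\) by one unit, then compares Monte Carlo estimates of \(E(Y|X)\) at two values of \(X\) one unit apart.

```python
import random

random.seed(0)

# Hypothetical "true" parameter values, chosen only for this illustration.
ALPHA, BETA = 2.0, 0.5
N = 100_000  # number of simulated draws per value of X

def draw_y(x):
    """Draw one Y from Y = ALPHA + BETA * x + eps, with eps ~ N(0, 1) drawn independently of x."""
    eps = random.gauss(0.0, 1.0)
    return ALPHA + BETA * x + eps

def mean_y_given_x(x, n=N):
    """Monte Carlo estimate of E(Y | X = x)."""
    return sum(draw_y(x) for _ in range(n)) / n

# Interpretation 1: keep eps the same and change X by one unit.
eps_fixed = 0.3
y_at_3 = ALPHA + BETA * 3.0 + eps_fixed
y_at_4 = ALPHA + BETA * 4.0 + eps_fixed
ceteris_paribus_change = y_at_4 - y_at_3  # equals BETA up to rounding

# Interpretation 2: compare the average of Y at X = 4 and at X = 3.
average_change = mean_y_given_x(4.0) - mean_y_given_x(3.0)  # close to BETA

print(f"change in Y holding eps fixed: {ceteris_paribus_change:.4f}")
print(f"change in E(Y|X):              {average_change:.4f}")
```

In the first calculation the change equals \(\beta\) exactly because \(\epsilon\) cancels out; in the second, individual draws of \(Y\) differ by more or less than \(\beta\), and only their average moves by approximately \(\beta\).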

So you can either say:

\(\beta\) shows by how much \(Y\) changes when \(X\) changes by 1 unit while keeping everything else the same

OR

\(\beta\) shows by how much \(Y\) changes when \(X\) changes by 1 unit on average

If your answer includes neither “while keeping everything else the same” nor “on average”, it is incomplete.