Econ 107 Lab Exercise 2: Multi-variable Regression Analysis

Several Important Things

  • 1 Do not forget to attach your do-file (10 points total: 6 points for the do-file, 4 points for the document)

  • 2 Please do not help other groups when you have finished (this is a quiz)

  • 3 Check your email on my screen before you leave

Omitted Variable Bias

True model:

\[ y=\beta_0+\beta_1 x+\beta_2 z + u \]

Omitted variable model:

\[ y=\beta_0+\beta_1 x+v \]

where \( v=\beta_2 z + u \)

Omitted Variable Bias

Unbiased estimator:

\[ \hat{\beta}_1 \rightarrow \beta_1 \]

Omitted variable biased estimator:

\[ \hat{\beta}_1 \rightarrow \beta_1 + \beta_2\,\rho_{xz}\frac{\sigma_z}{\sigma_x} \]
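
The limit of the omitted-variable estimator follows from the two models above. A brief sketch, assuming \( u \) is uncorrelated with \( x \) (an assumption the slides leave implicit):

\[ \hat{\beta}_1 \rightarrow \frac{\mathrm{Cov}(x,y)}{\mathrm{Var}(x)} = \frac{\mathrm{Cov}(x,\ \beta_0+\beta_1 x+\beta_2 z+u)}{\mathrm{Var}(x)} = \beta_1 + \beta_2\frac{\mathrm{Cov}(x,z)}{\mathrm{Var}(x)} = \beta_1 + \beta_2\,\rho_{xz}\frac{\sigma_z}{\sigma_x} \]

The last step uses \( \mathrm{Cov}(x,z)=\rho_{xz}\sigma_x\sigma_z \) and \( \mathrm{Var}(x)=\sigma_x^2 \). The bias term vanishes only if \( \beta_2=0 \) (z does not affect y) or \( \rho_{xz}=0 \) (z is uncorrelated with x).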

Solution

Control variable selection (choose the best subset of variables):

  • 1 Regress the dependent variable on the independent variable, adding one control variable at a time.

  • 2 Check how the coefficient on the independent variable changes after adding each control variable (compare with the regression without control variables).

  • 3 Choose the control variables that produce a large change.
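
The steps above might be sketched in a do-file roughly as follows. The variable names y, x, z1, z2 and z3 are hypothetical placeholders, and the sketch assumes a data set containing them is already in memory; which change counts as "large" is still a judgment call.

```stata
* Hypothetical variable names: y (dependent variable), x (independent
* variable of interest), z1 z2 z3 (candidate control variables).
* Assumes the data are already loaded.

* Regression without any control variable; keep the coefficient on x
quietly regress y x, vce(robust)
scalar b_base = _b[x]
display "no control" _col(14) "b_x = " %8.4f scalar(b_base)

* Add one control at a time and compare with the no-control regression
foreach z of varlist z1 z2 z3 {
    quietly regress y x `z', vce(robust)
    display "`z'" _col(14) "b_x = " %8.4f _b[x] "   change = " %8.4f (_b[x] - scalar(b_base))
}
```

Controls that barely move the coefficient on x are unlikely to be an important source of omitted variable bias; those that move it a lot are candidates to keep.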

Data set

Go to the Stock and Watson textbook data website:

http://wps.aw.com/aw_stock_ie_3/178/45691/11696965.cw/

\( \rightarrow \) Data for Empirical Exercises and Test Bank

\( \rightarrow \) Download College Distance Data in Stata format.

Also download the data description so that you understand the meaning of every variable in this data set.
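
Once the file is downloaded, a do-file could start along these lines. The file name CollegeDistance.dta and the variable names ed and dist are assumptions here; check them against the downloaded file and the data description.

```stata
* Assumed file name; adjust the path to where the download was saved
clear all
use "CollegeDistance.dta", clear

* List all variables with their labels and storage types
describe

* Quick look at the two main variables (names assumed to be ed and dist)
summarize ed dist
```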

Question Part

Does lowering the cost of higher education improve student outcomes?

  • Data Analysis: Run a regression of years of completed education (ED) on distance to the nearest college (Dist), where Dist is measured in tens of miles.

  • Run several regressions, each adding the controls for one category of control variables (demographic characteristics, family background, etc.), and then a single regression that controls for all categories at the same time. Do not simply throw every available variable into one regression. Instead, think about what you would like to control for and decide which variables are most appropriate to use (find 3 control variables).
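
One possible layout for these regressions is sketched below. The variable names (ed, dist, female, black, hispanic, dadcoll, momcoll, incomehi) are assumed from the data description, and the grouping into categories is purely illustrative; the choice of the three controls is left to your own analysis.

```stata
* Assumes the College Distance data are in memory and that the
* variable names below match the data description (check with describe).

* Illustrative categories of control variables
local demog  female black hispanic
local family dadcoll momcoll incomehi

* Baseline: years of education on distance (tens of miles)
regress ed dist, vce(robust)

* One category of controls at a time
regress ed dist `demog', vce(robust)
regress ed dist `family', vce(robust)

* All chosen categories together
regress ed dist `demog' `family', vce(robust)
```

The option vce(robust) requests heteroskedasticity-robust standard errors; the slides do not specify this, so treat it as a suggestion.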

Question Part

  • Output: Use the outreg2 command to create a table of results.

  • Write-Up: Does the distance from a university actually impact the years of education according to your results? In your most comprehensive regression, explain what elements remain in the error term and whether omitted variable bias exists. Argue that your estimated \( \hat{\beta}_1 \) is centered around the true \( \beta_1 \).
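
For the output step, a minimal outreg2 sketch might look like this. The output file name lab2_results.doc and the column titles are hypothetical, and the variable names are assumed from the data description.

```stata
* outreg2 is a user-written command; install it once if needed:
* ssc install outreg2

* First column: replace any existing table file
regress ed dist, vce(robust)
outreg2 using lab2_results.doc, replace ctitle(Baseline)

* Further columns: append to the same table
regress ed dist female black hispanic, vce(robust)
outreg2 using lab2_results.doc, append ctitle(Demographics)
```

Each regression becomes one column of the table, which makes it easy to see how the coefficient on dist changes as controls are added.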

Do Not Forget

Email subject: 

Econ 107+Assignment name+your name+your student id

Email: Name+ID of all the group members

Assignment email: econ107lab@gmail.com