Computer Assignment #4

Introduction

The purpose of this assignment is to give you more hands-on experience running and interpreting regressions and doing F-tests for joint significance. (Hint: the command anova(mod.u, mod.r) will run that F-test, where mod.u is the unrestricted model and mod.r is the restricted one… check the lecture slides for more details.)
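Here is a minimal sketch of what that F-test is doing, using simulated data rather than any of the assignment datasets. The variable names x1, x2, x3 and the sample size are made up for illustration; only anova(mod.u, mod.r) comes from the hint above.

```r
# Simulated data: x1, x2, x3 are placeholder regressors (an assumption for illustration)
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- 1 + 0.5 * x1 + rnorm(n)   # x2 and x3 have no true effect in this simulation

mod.u <- lm(y ~ x1 + x2 + x3)   # unrestricted model
mod.r <- lm(y ~ x1)             # restricted model: drops x2 and x3

# F-test of H0: the coefficients on x2 and x3 are both zero
ftest <- anova(mod.u, mod.r)
print(ftest)

# The same statistic by hand:
# F = [(SSR_r - SSR_u) / q] / [SSR_u / df_u], where q = number of restrictions
ssr.r  <- sum(resid(mod.r)^2)
ssr.u  <- sum(resid(mod.u)^2)
q      <- df.residual(mod.r) - df.residual(mod.u)
f.hand <- ((ssr.r - ssr.u) / q) / (ssr.u / df.residual(mod.u))
f.hand
```

The hand-computed value should match the F column of the anova() table, which is a handy sanity check when you set up your own restricted and unrestricted models.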

Format

Please turn in a hard copy of your script and a separate document with your answers to the questions. For questions that ask you to estimate a model, copy/paste a summary table of your model using stargazer.

Grading

This assignment will be graded on a $\checkmark$/$\checkmark +$ basis. Completing the assignment gets you a $\checkmark$ (worth 85%) and getting the hardest part right gets you a $\checkmark +$ (worth 100%). Incomplete work is worth 0%.

The assignment is due at the beginning of class next Thursday (Nov 10); you may also submit it by email before class. (Late submissions will be docked 20 points.)

The Assignment

Step 0: set up your workspace

Breathe deeply, brew some coffee, and create a new project with its own folder named “HW4” (or whatever you want to call it). Type Ctrl + Shift + n to open up a new script and save it (name it something creative like “script.R”… something you’ll remember).

Step 1: gather the data

From the Wooldridge textbook data, find the datasets below:

WAGE1
WAGE2
GPA2

Copy them into your working directory (the folder on your computer that R is operating in). Each question below names the dataset it uses. You don’t have to do this, but I’m going to copy each dataset into its own object instead of using Wooldridge’s default object name, data, which every call to load() overwrites.

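# Each Wooldridge .RData file loads two objects: data (the dataset) and desc (its description).
# Copy both to new names right away, because the next load() overwrites them.
load("wage1.RData")
wage <- data
wage.desc <- desc

load("wage2.RData")
wage2 <- data
wage2.desc <- desc

load("gpa2.RData")
gpa <- data
gpa.desc <- desc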
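If you want to see why the copying matters, here is a small self-contained sketch. It builds a stand-in .RData file in a temporary folder (the file name example.RData and its contents are made up; I am only assuming the same two-object layout, data and desc, that the code above relies on) and then loads it the same way.

```r
# Build a stand-in .RData file with objects named data and desc (contents are invented)
data <- data.frame(wage = c(3.1, 3.2, 6.0), educ = c(11, 12, 16))
desc <- data.frame(variable = c("wage", "educ"),
                   label    = c("hourly wage", "years of education"))
tmp <- file.path(tempdir(), "example.RData")
save(data, desc, file = tmp)
rm(data, desc)

# load() restores the objects under their saved names and
# (invisibly) returns those names as a character vector
loaded <- load(tmp)
print(loaded)

# Copy to descriptive names before another load() replaces data and desc
example      <- data
example.desc <- desc
str(example)
```

Printing what load() returns is a quick way to confirm which objects a file actually contains before you start copying them.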

Step 2: answer the questions

We’re doing questions C1 and C2 from chapter 5 and C2 and C3 from chapter 6.

Chapter 5 questions

C1. Use the wage1 data.

(i) Estimate the equation \[wage = \beta_0 + \beta_1 educ + \beta_2 exper + \beta_3 tenure + u.\] Save the residuals and plot a histogram of them. (Hint: if your model is named m1.1, then qplot(m1.1$residuals) will make the histogram; qplot() comes from the ggplot2 package.)
(ii) Repeat part (i) using $\log(wage)$ as the dependent variable.
(iii) Which model looks like it satisfies assumption MLR.6 (normality of the errors) better?
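To get a feel for the comparison in parts (i) and (ii), here is a sketch on simulated data, not on wage1. The data-generating process (log wages linear in educ with normal errors) and all the numbers in it are assumptions chosen for illustration. I use base R’s hist() so the example runs without extra packages; qplot() from the hint would work the same way.

```r
set.seed(42)
n    <- 500
educ <- sample(8:18, n, replace = TRUE)
# Hypothetical process: log(wage) is linear in educ with normal errors
wage <- exp(0.5 + 0.09 * educ + rnorm(n, sd = 0.4))

m.level <- lm(wage ~ educ)        # dependent variable in levels
m.log   <- lm(log(wage) ~ educ)   # dependent variable in logs

# Side-by-side histograms of the residuals
op <- par(mfrow = c(1, 2))
hist(resid(m.level), breaks = 30, main = "Residuals: wage",      xlab = "residual")
hist(resid(m.log),   breaks = 30, main = "Residuals: log(wage)", xlab = "residual")
par(op)

# Sample skewness as a rough numeric companion to the plots
# (close to 0 for a symmetric distribution)
skew  <- function(e) mean((e - mean(e))^3) / sd(e)^3
skews <- c(level = skew(resid(m.level)), log = skew(resid(m.log)))
skews
```

A histogram with a long right tail and a skewness well above zero is a sign that the normality assumption is a poor fit; what you find in the real data is for you to judge.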

C2. Use the gpa2 data.

(i) Using all 4,137 observations, estimate the model \[colgpa = \beta_0 + \beta_1 hsperc + \beta_2 sat + u\] and report the results.
(ii) Repeat (i) using only the first 2,070 observations. (Hint: data = gpa[1:2070,] in your call to lm() will accomplish this.)
(iii) Find the ratio of the standard errors on $hsperc$ from parts (i) and (ii). Compare this result with equation 5.10 in the textbook, which states:

\[se(\hat{\beta}_j) \approx c_j/\sqrt{n},\]

where $c_j$ is a constant that doesn’t matter for the ratio you’re calculating since it will cancel out.
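Here is a sketch of the same comparison on simulated data, so you can see the mechanics before running it on gpa2. The sample size of 4,000 and the model y ~ x are assumptions for illustration only. Since $c_j$ cancels, the formula above predicts that the ratio of the full-sample standard error to the subsample standard error is roughly the square root of (subsample size / full sample size).

```r
set.seed(7)
n   <- 4000
x   <- rnorm(n)
y   <- 2 + 0.5 * x + rnorm(n)
sim <- data.frame(y, x)

m.full <- lm(y ~ x, data = sim)                 # all n observations
m.half <- lm(y ~ x, data = sim[1:(n / 2), ])    # first n/2 observations only

# Pull the standard error on x out of each coefficient table
se.full <- summary(m.full)$coefficients["x", "Std. Error"]
se.half <- summary(m.half)$coefficients["x", "Std. Error"]

ratio.obs  <- se.full / se.half      # what the two regressions give
ratio.pred <- sqrt((n / 2) / n)      # what se ~ c_j / sqrt(n) predicts
c(observed = ratio.obs, predicted = ratio.pred)
```

The two numbers will not match exactly because the approximation ignores sampling variation in the estimated error variance and in the regressor’s spread, but they should be close.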

Chapter 6 questions

C2. Use the wage1 data.

(i) Use OLS to estimate the model \[\log(wage) = \beta_0 + \beta_1 educ + \beta_2 exper + \beta_3 exper^2 + u\] and report the results.
(ii) Is $exper^2$ statistically significant at the 1% level?
(iii) Using the approximation \[\%\Delta\widehat{wage} \approx 100(\hat{\beta}_2 + 2\hat{\beta}_3 exper)\Delta exper,\] find the approximate return to the fifth year of experience. Repeat for the 12th year of experience.
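The approximation is easy to wrap in a small function. This is only a sketch: the function name approx.return and the coefficient values below are hypothetical and are not estimates from wage1. I am also assuming one common reading of “the return to the fifth year”, namely the change when going from 4 to 5 years, so the formula is evaluated at exper = 4.

```r
# Approximate % change in wage from d.exper more years of experience,
# evaluated at the starting level exper: 100 * (b2 + 2 * b3 * exper) * d.exper
approx.return <- function(b2, b3, exper, d.exper = 1) {
  100 * (b2 + 2 * b3 * exper) * d.exper
}

# Hypothetical coefficients, chosen only to illustrate the arithmetic
b2 <- 0.05     # stands in for the coefficient on exper
b3 <- -0.001   # stands in for the coefficient on exper^2

r5  <- approx.return(b2, b3, exper = 4)    # moving from 4 to 5 years
r12 <- approx.return(b2, b3, exper = 11)   # moving from 11 to 12 years
c(fifth = r5, twelfth = r12)
```

With your own fitted model you would replace b2 and b3 with the corresponding entries of coef(); the exact coefficient names depend on how you wrote the formula in lm().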

C3. (with modification) Use the wage2 data.

(i) Estimate the model \[\log(wage) = \beta_0 + \beta_1 educ + \beta_2 exper + \beta_3 educ \times exper + u\] and report the results.
(ii) Re-estimate the model in part (i) using demeaned data. (Hint: data = mutate_each(wage2, funs(scale(., scale = F))) will demean every column; mutate_each() and funs() come from dplyr. Because this also demeans wage, log(wage) is no longer defined for below-average wages, so use the dataset’s lwage variable as the dependent variable in this part.) Report the results alongside the original model using stargazer. What values changed? What values didn’t change? (E.g. did the $R^2$ change? Did the intercept change? etc.)
(iii) What is the effect of an additional year of experience for someone with an average amount of education?
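Here is a sketch of the raw-versus-demeaned comparison on simulated data, so it does not stand in for your wage2 results. The data-generating process and every number in it are assumptions for illustration. I use base R’s scale() directly instead of the dplyr call from the hint, which should do the same centering.

```r
set.seed(3)
n     <- 1000
educ  <- round(rnorm(n, mean = 13, sd = 2))
exper <- round(runif(n, 1, 25))
# Hypothetical process with an educ-by-exper interaction
lwage <- 5.5 + 0.06 * educ + 0.01 * exper + 0.002 * educ * exper + rnorm(n, sd = 0.4)
sim   <- data.frame(lwage, educ, exper)

# Subtract each column's mean (scale = FALSE centers without rescaling)
sim.dm <- as.data.frame(scale(sim, scale = FALSE))

m.raw <- lm(lwage ~ educ * exper, data = sim)      # educ * exper expands to both main effects plus educ:exper
m.dm  <- lm(lwage ~ educ * exper, data = sim.dm)

# Coefficients side by side
cbind(raw = coef(m.raw), demeaned = coef(m.dm))

# Partial effect of exper at the sample mean of educ, computed from the raw model
effect.at.mean <- coef(m.raw)["exper"] + coef(m.raw)["educ:exper"] * mean(sim$educ)
effect.at.mean
```

Comparing effect.at.mean with the coefficient table is a useful way to check your interpretation of the demeaned model’s coefficients.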