\(~\) \(~\) \(~\)

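(a) Use OLS to estimate the parameters of the model \(\log w = \beta_1 + \beta_2\,\text{educ} + \beta_3\,\text{exper} + \beta_4\,\text{exper}^2 + \beta_5\,\text{smsa} + \beta_6\,\text{south} + \varepsilon\). Give an interpretation to the estimated \(\beta_2\) coefficient.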

\(~\)

data <- read.table("./Week 4 - Test.txt", header = TRUE, sep = ",")
data$exper2 <- data$exper^2
reg <- lm(logw ~ educ + exper + exper2 + smsa + south, data = data)
summary(reg)
## 
## Call:
## lm(formula = logw ~ educ + exper + exper2 + smsa + south, data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.71487 -0.22987  0.02268  0.24898  1.38552 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.6110144  0.0678950  67.914  < 2e-16 ***
## educ         0.0815797  0.0034990  23.315  < 2e-16 ***
## exper        0.0838357  0.0067735  12.377  < 2e-16 ***
## exper2      -0.0022021  0.0003238  -6.800 1.26e-11 ***
## smsa         0.1508006  0.0158360   9.523  < 2e-16 ***
## south       -0.1751761  0.0146486 -11.959  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3813 on 3004 degrees of freedom
## Multiple R-squared:  0.2632, Adjusted R-squared:  0.2619 
## F-statistic: 214.6 on 5 and 3004 DF,  p-value: < 2.2e-16

\(~\)

The estimated \(\beta_2\) coefficient indicates that each additional year of education is accompanied by an increase of 0.0815797 in the predicted value of log wage, which corresponds to an increase of roughly 8% in the predicted wage itself, with all other variables remaining constant. This does not, however, necessarily mean that there is a causal relation between the two variables, given the possibility of endogeneity in the model.
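
The "roughly 8%" reading uses the approximation that a change in log wage equals a proportional change in wage, which is only accurate for small coefficients. As a minimal sketch, with the coefficient copied by hand from the summary output above (the variable names are illustrative), the approximate and exact percentage effects can be compared as follows:

```r
# Coefficient on educ, copied from the OLS summary output above
b_educ <- 0.0815797

# Approximate effect on wage: 100 * beta (valid for small coefficients)
approx_pct <- 100 * b_educ

# Exact effect implied by the log-linear form: 100 * (exp(beta) - 1)
exact_pct <- 100 * (exp(b_educ) - 1)

round(c(approximate = approx_pct, exact = exact_pct), 2)
```

The exact effect comes out at about 8.5%, slightly above the 8.2% given by the approximation.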

\(~\)

(b) OLS may be inconsistent in this case as educ and exper may be endogenous. Give a reason why this may be the case. Also indicate whether the estimate in part (a) is still useful.

\(~\)

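This may be the case because factors like motivation, work ethic, efficiency and intelligence can affect both the education level and the wage of a person, but are not present in the model. They end up in the error term, which makes educ correlated with ε. Since people who stay in school longer tend to start working later, exper may be affected in the same way. The coefficients of part (a) thus lose their usefulness as estimates of causal effects, although they still describe the association between the variables and serve as a benchmark for the 2SLS estimates.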

\(~\)

(c) Give a motivation why age and age2 can be used as instruments for exper and exper2.

\(~\)

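Age is directly related to years of working experience, so it is relevant as an instrument. It is also plausibly exogenous: a person's age is not determined by unobserved factors such as ability or motivation, and it is assumed to affect wage only through experience rather than directly. The same reasoning applies to age2 as an instrument for exper2.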
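
The two requirements used in this argument can be stated compactly. Writing \(z\) for an instrument (here age or age2) and \(x\) for the endogenous regressor it is meant to replace (exper or exper2), a notation introduced here only for illustration, the instrument has to satisfy

\[
\operatorname{Cov}(z, x) \neq 0 \quad \text{(relevance)}, \qquad \operatorname{Cov}(z, \varepsilon) = 0 \quad \text{(exogeneity)}.
\]

Relevance can be checked from the data in a first-stage regression, while exogeneity has to be argued, and with more instruments than endogenous regressors it can be partly tested.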

\(~\)

(d) Run the first-stage regression for educ for the two-stage least squares estimation of the parameters in the model above when age, age2, nearc, dadeduc, and momeduc are used as additional instruments. What do you conclude about the suitability of these instruments for schooling?

\(~\)

data$age2 <- data$age^2
reg2 <- lm(formula = educ ~ age + age2 + smsa + south + nearc + daded + momed, data = data)
summary(reg2)
## 
## Call:
## lm(formula = educ ~ age + age2 + smsa + south + nearc + daded + 
##     momed, data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.2777  -1.5450  -0.2224   1.6957   7.2250 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -5.652354   3.976343  -1.421 0.155277    
## age          0.989610   0.278714   3.551 0.000390 ***
## age2        -0.017019   0.004838  -3.518 0.000441 ***
## smsa         0.529566   0.101504   5.217 1.94e-07 ***
## south       -0.424851   0.091037  -4.667 3.19e-06 ***
## nearc        0.264554   0.099085   2.670 0.007626 ** 
## daded        0.190443   0.015611  12.199  < 2e-16 ***
## momed        0.234515   0.017028  13.773  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.326 on 3002 degrees of freedom
## Multiple R-squared:  0.2466, Adjusted R-squared:  0.2448 
## F-statistic: 140.4 on 7 and 3002 DF,  p-value: < 2.2e-16

\(~\)

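Each of the added variables (age, age2, nearc, daded and momed) is individually significant at the 1% level, which indicates that they are relevant instruments for education. The strongest ones are the education of the father and the education of the mother, with t-values of 12.2 and 13.8. Note that this regression only speaks to the relevance of the instruments, not to their exogeneity.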
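
Individual t-values do not show the joint strength of the instruments. A common way to summarize it is an F-test that compares the first-stage regression above (unrestricted) with a regression of educ on the exogenous regressors smsa and south only (restricted). With \(RSS_u\) and \(RSS_r\) denoting the residual sums of squares of these two regressions (symbols introduced here for illustration), 5 excluded instruments and 3002 residual degrees of freedom, the statistic is

\[
F = \frac{(RSS_r - RSS_u)/5}{RSS_u/3002}.
\]

A frequently used rule of thumb treats instruments as weak when this statistic falls below 10. This statistic should correspond to the weak-instruments line for educ, on 5 and 3002 degrees of freedom, in the diagnostic tests of part (e).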

\(~\)

(e) Estimate the parameters of the model for log wage using two-stage least squares where you correct for the endogeneity of education and experience. Compare your result to the estimate in part (a).

\(~\)

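library(ivreg)  # provides ivreg() with the three-part formula: exogenous | endogenous | instruments
reg3 <- ivreg(logw ~ smsa + south | educ + exper + exper2 | age + age2 + nearc + daded + momed, data = data)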
summary(reg3)
## 
## Call:
## ivreg(formula = logw ~ smsa + south | educ + exper + exper2 | 
##     age + age2 + nearc + daded + momed, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.7494 -0.2360  0.0266  0.2498  1.3468 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.4169039  0.1154208  38.268  < 2e-16 ***
## educ         0.0998429  0.0065738  15.188  < 2e-16 ***
## exper        0.0728669  0.0167134   4.360 1.35e-05 ***
## exper2      -0.0016393  0.0008381  -1.956   0.0506 .  
## smsa         0.1349370  0.0167695   8.047 1.21e-15 ***
## south       -0.1589869  0.0156854 -10.136  < 2e-16 ***
## 
## Diagnostic tests:
##                            df1  df2 statistic p-value    
## Weak instruments (educ)      5 3002   145.511 < 2e-16 ***
## Weak instruments (exper)     5 3002  1257.258 < 2e-16 ***
## Weak instruments (exper2)    5 3002  1098.430 < 2e-16 ***
## Wu-Hausman                   2 3002     5.709 0.00335 ** 
## Sargan                       2   NA     3.702 0.15705    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3844 on 3004 degrees of freedom
## Multiple R-Squared: 0.2512,  Adjusted R-squared: 0.2499 
## Wald test: 175.9 on 5 and 3004 DF,  p-value: < 2.2e-16

\(~\)

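The coefficient of education rises from 0.0816 under OLS to 0.0998 under 2SLS, while the coefficient of experience falls from 0.0838 to 0.0729 and that of exper2 moves from -0.0022 to -0.0016, where it is only marginally significant (p-value 0.0506). This suggests that OLS was underestimating the effect of education and overestimating the effect of experience. The 2SLS standard errors are larger than the OLS ones, so the estimates are less precise.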
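
To put numbers on this comparison, the sketch below takes the estimates and standard errors of the endogenous regressors, copied by hand from the two summary outputs above, and computes the relative change of each coefficient and the ratio of the standard errors. The object names are illustrative and not part of the original analysis.

```r
# Estimates and standard errors copied from the OLS and 2SLS summaries above
ols     <- c(educ = 0.0815797, exper = 0.0838357, exper2 = -0.0022021)
tsls    <- c(educ = 0.0998429, exper = 0.0728669, exper2 = -0.0016393)
se_ols  <- c(educ = 0.0034990, exper = 0.0067735, exper2 = 0.0003238)
se_tsls <- c(educ = 0.0065738, exper = 0.0167134, exper2 = 0.0008381)

# Relative change of each 2SLS estimate with respect to its OLS counterpart
rel_change <- (tsls - ols) / ols

# How much larger the 2SLS standard errors are than the OLS ones
se_ratio <- se_tsls / se_ols

round(cbind(ols, tsls, rel_change, se_ratio), 4)
```

The education coefficient is about 22% larger under 2SLS and the experience coefficient about 13% smaller, while the standard errors roughly double.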

\(~\)

(f) Perform the Sargan test for validity of the instruments. What is your conclusion?

\(~\)

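As seen in the diagnostic tests of item (e), the Sargan test results in a statistic of 3.702 and a p-value of 0.15705. At the 5% significance level we therefore do not reject the null hypothesis that the instruments are exogenous. This supports the validity of the instruments, although failing to reject the null does not prove it.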
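
As a check on this conclusion, the p-value can be reproduced from the chi-squared distribution. The sketch below assumes the statistic copied from the output above and 2 degrees of freedom (5 instruments minus 3 endogenous regressors, matching the df1 column of the Sargan line). The variable names are illustrative.

```r
# Sargan statistic copied from the diagnostic tests in part (e)
sargan_stat <- 3.702

# Degrees of freedom: number of instruments minus number of endogenous regressors
df_sargan <- 5 - 3

# Upper-tail probability of the chi-squared distribution
p_value <- pchisq(sargan_stat, df = df_sargan, lower.tail = FALSE)

# 5% critical value for comparison with the statistic
crit_5pct <- qchisq(0.95, df = df_sargan)

round(c(p_value = p_value, critical_value = crit_5pct), 3)
```

The statistic lies well below the 5% critical value of about 5.99, in line with the reported p-value of roughly 0.157.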

\(~\)