Question 1
Question 1a
For this question, we estimate the coefficients \(\beta_0\), \(\beta_1\), and \(\beta_2\) of a linear regression of wage on education and IQ.
\[wage_i=\beta_0+\beta_1educ_i+\beta_2IQ_i+u_i\]
library(wooldridge)
m <- lm(wage~educ+IQ,wage2)
summary(m)
Call:
lm(formula = wage ~ educ + IQ, data = wage2)
Residuals:
Min 1Q Median 3Q Max
-860.29 -251.00 -35.31 203.98 2110.38
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -128.8899 92.1823 -1.398 0.162
educ 42.0576 6.5498 6.421 2.15e-10 ***
IQ 5.1380 0.9558 5.375 9.66e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 376.7 on 932 degrees of freedom
Multiple R-squared: 0.1339, Adjusted R-squared: 0.132
F-statistic: 72.02 on 2 and 932 DF, p-value: < 2.2e-16
The estimated equation is: \[ \widehat{wage} = \underset{(92.18)}{-128.89}+\underset{(6.55)}{42.06}educ+\underset{(0.96)}{5.14}IQ,\] with 935 observations (standard errors in parentheses).
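The numbers in the reported equation can be pulled straight from the fitted model rather than copied by hand from the printed summary. The following is a minimal sketch, assuming the `wooldridge` package is installed; the name `coef_table` is introduced here for illustration only.

```r
library(wooldridge)

# Same model as above: wage regressed on education and IQ
m <- lm(wage ~ educ + IQ, data = wage2)

# Coefficient table: estimates, standard errors, t values, p-values
coef_table <- coef(summary(m))

# Estimates and standard errors rounded to two decimals
round(coef_table[, c("Estimate", "Std. Error")], 2)

# Number of observations used in the fit
nobs(m)
```

Rounding from the stored values avoids the truncation errors that creep in when the figures are shortened by hand.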
Question 4 (a)
First we define the variables as:
The variable \(urban\) =1 if individual lives in an urban area, and 0 otherwise
The variable \(urban*educ\) is the interaction of the dummy with education.
For this question, we model \(wage\) with a linear regression that includes the urban dummy and its interaction with education.
\[wage=\beta_0+\beta_1IQ+\beta_2educ+\beta_3urban+\beta_4(urban*educ)+u\]
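To see what the dummy and the interaction term each contribute, it can help to write the model separately for the two values of \(urban\). This is only a restatement of the equation above, treating \(urban\) as a 0/1 indicator:
\[urban=0:\quad wage=\beta_0+\beta_1IQ+\beta_2educ+u\]
\[urban=1:\quad wage=(\beta_0+\beta_3)+\beta_1IQ+(\beta_2+\beta_4)educ+u\]
So \(\beta_3\) shifts the intercept for urban individuals, and \(\beta_4\) is the difference in the return to a year of education between urban and rural individuals.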
Question 4 (b)
For this question we estimate the extended linear regression equation by OLS (standard errors in parentheses):
\[\widehat{wage}_i=\underset{(147.457)}{72.758}+\underset{(0.938)}{5.262}IQ_i+\underset{(11.206)}{17.350}educ_i-\underset{(166.970)}{245.112}urban_i+\underset{(12.378)}{30.240}(urban*educ)_i\]
library(wooldridge)
model1 <- lm(wage~IQ+educ+urban+(urban*educ),data=wage2)
summary(model1)
Call:
lm(formula = wage ~ IQ + educ + urban + (urban * educ), data = wage2)
Residuals:
Min 1Q Median 3Q Max
-852.0 -245.6 -35.9 192.3 2068.6
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 72.7583 147.4571 0.493 0.6218
IQ 5.2622 0.9384 5.607 2.71e-08 ***
educ 17.3502 11.2055 1.548 0.1219
urban -245.1125 166.9702 -1.468 0.1424
educ:urban 30.2400 12.3783 2.443 0.0148 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 369.2 on 930 degrees of freedom
Multiple R-squared: 0.1698, Adjusted R-squared: 0.1662
F-statistic: 47.55 on 4 and 930 DF, p-value: < 2.2e-16
\(n=935\) observations
\(R^{2}= 0.170\)
Question 4 (c)
Restricted versus Unrestricted Models
Restricted: \[wage=\beta_0+\beta_1IQ+\beta_2educ+u\]
Unrestricted:
\[wage=\beta_0+\beta_1IQ+\beta_2educ+\beta_3urban+\beta_4(urban*educ)+u\]
Null Hypothesis
\(H_0 : \beta_3=\beta_4=0\)
Alternative Hypothesis
\(H_1 : \beta_3\neq 0\) or \(\beta_4\neq 0\) (at least one of the two coefficients is nonzero)
\(F=\frac{(SSR_r-SSR_{ur})/q}{SSR_{ur}/(n-k-1)} \sim F(q,n-k-1)\)
\(q=2,\) \(n=935\), \(k=4\)
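library(wooldridge)
# Restricted model: no urban terms
modelr <- lm(wage ~ IQ + educ, data = wage2)
# Unrestricted model: urban dummy and its interaction with education
modelur <- lm(wage ~ IQ + educ + urban + (urban * educ), data = wage2)
# deviance() returns the sum of squared residuals for an lm fit
SSR_R <- deviance(modelr)
SSR_UR <- deviance(modelur)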

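Since both models have the same total sum of squares, the SSR form of the statistic is equivalent to the \(R^2\) form, which can be evaluated from the \(R^2\) values reported in the two regression outputs above:
\(F_{calc}=\frac{(R^2_{ur}-R^2_r)/q}{(1-R^2_{ur})/(n-k-1)}=\frac{(0.1698-0.1339)/2}{(1-0.1698)/(935-4-1)}\)
\(F_{calc}\approx 20.1\)
\(F_{crit}=F(0.05,2,930)=3.005\)
We reject \(H_0\) because \(F_{calc}>F_{crit}\)
We reject the null hypothesis that living in an urban area has no effect on wage at the 5% significance level.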
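The hand calculation can be cross-checked in R. The sketch below assumes the `wooldridge` package is installed and refits the two models; the names `q`, `df_ur`, `F_calc`, `F_crit`, `p_value`, and `f_table` are introduced here for illustration. It computes the statistic from the two sums of squared residuals, looks up the 5% critical value with `qf()`, and compares the result with the F test that `anova()` reports for nested models.

```r
library(wooldridge)

# Restricted and unrestricted models, as defined above
modelr  <- lm(wage ~ IQ + educ, data = wage2)
modelur <- lm(wage ~ IQ + educ + urban + (urban * educ), data = wage2)

# Sums of squared residuals
SSR_R  <- deviance(modelr)
SSR_UR <- deviance(modelur)

q     <- 2                     # number of restrictions (beta_3 = beta_4 = 0)
df_ur <- df.residual(modelur)  # n - k - 1 in the unrestricted model

# F statistic: [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)]
F_calc <- ((SSR_R - SSR_UR) / q) / (SSR_UR / df_ur)

# 5% critical value and p-value from the F(q, n - k - 1) distribution
F_crit  <- qf(0.95, df1 = q, df2 = df_ur)
p_value <- pf(F_calc, df1 = q, df2 = df_ur, lower.tail = FALSE)

# Built-in nested-model comparison; its F column should match F_calc
f_table <- anova(modelr, modelur)

F_calc
F_crit
p_value
f_table
```

Taking the degrees of freedom from `df.residual()` instead of typing them in guards against miscounting \(k\).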
Question 4 (d)
First we calculate the average IQ, which is approximately 101.282:
avg_IQ <- mean(wage2$IQ, na.rm=TRUE)

Now we can derive the predicted wage for each individual, writing \(x\) for years of education:
Scenario 1 (lives in a rural area, \(urban=0\), average IQ, education \(x\))
\[\widehat{wage}=72.758+(5.262\times 101.282)+(17.350\times x)-(245.112\times 0)+30.240(0\times x)\]
\[\widehat{wage}=605.704+17.350x\]
Scenario 2 (lives in an urban area, \(urban=1\), average IQ, education \(x\))
\[\widehat{wage}=72.758+(5.262\times 101.282)+(17.350\times x)-(245.112\times 1)+30.240(1\times x)\]
\[\widehat{wage}=360.592+47.590x\]
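Setting the two predicted wages equal and solving for \(x\):
\[605.704+17.350x=360.592+47.590x\]
\[245.112=30.240x\]
\[x=8.106\]
The two individuals have the same predicted wage at 8.106 years of education. Below that level the rural individual has the higher predicted wage, and above it the urban individual does.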
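The break-even level does not depend on IQ: the IQ term is identical in both scenarios and cancels, leaving \(x=-\hat\beta_3/\hat\beta_4\). A minimal sketch of that calculation from the fitted coefficients follows, assuming the `wooldridge` package is installed; `b`, `x_star`, `new_data`, and `preds` are names introduced here for illustration. The result should agree with the hand calculation up to rounding.

```r
library(wooldridge)

# Unrestricted model from Question 4 (b)
model1 <- lm(wage ~ IQ + educ + urban + (urban * educ), data = wage2)
b <- coef(model1)

# Break-even education: the urban intercept shift is offset by the
# extra urban return to education when urban + (educ:urban) * x = 0
x_star <- -b[["urban"]] / b[["educ:urban"]]
x_star

# Check: predicted wages at average IQ and x_star for rural (0) and urban (1)
avg_IQ   <- mean(wage2$IQ, na.rm = TRUE)
new_data <- data.frame(IQ = avg_IQ, educ = x_star, urban = c(0, 1))
preds    <- predict(model1, newdata = new_data)
preds
```

The two entries of `preds` should coincide, which confirms that the predicted wage lines cross at `x_star`.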