Question 4

  1. The classical errors-in-variables (CEV) assumption is that the measurement error has zero mean and is uncorrelated with the unobserved explanatory variable:

                          Cov(tvhours*, e) = 0 and E(e) = 0,
                          where tvhours = tvhours* + e

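CEV is unlikely to hold here because the measurement error is probably related to motheduc and fatheduc. More highly educated parents may spend less time with their children and so observe less of their viewing, in which case the reported tvhours understates tvhours*. If parental education is also related to actual viewing time, e is then correlated with tvhours* as well, which violates the CEV condition.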
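The contrast between an error that satisfies CEV and one that depends on parental education can be sketched with simulated data. This is only an illustration: every variable, coefficient, and sample size below is hypothetical and is not taken from the data set used in the question.

```r
# Stylized simulation: CEV-compliant error vs. an error tied to parental education.
# All names and parameter values are hypothetical.
set.seed(42)
n <- 100000

motheduc <- rnorm(n, mean = 12, sd = 2)                     # hypothetical parental education
tv_true  <- 4 - 0.2 * (motheduc - 12) + rnorm(n, sd = 1)    # true hours (tvhours*), assumed to fall with education

e_cev <- rnorm(n, mean = 0, sd = 0.5)                       # error independent of everything (CEV holds)
e_dep <- -0.3 * (motheduc - 12) + rnorm(n, sd = 0.5)        # more educated parents report fewer hours

# Sample covariances between the true regressor and each error
cov_cev <- cov(tv_true, e_cev)   # close to 0
cov_dep <- cov(tv_true, e_dep)   # clearly non-zero, so CEV fails

print(c(cov_cev = cov_cev, cov_dep = cov_dep))
```

In this setup the second covariance is non-zero only because both tvhours* and the error move with motheduc, which is the mechanism argued in the answer.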

Question C2

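  1. With KWW added to the regression, the estimated return to education is 5.76% (educ coefficient 0.0576), compared with 6.54% without it (educ coefficient 0.0654).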
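library(wooldridge)  # assumed source of the wage2 and twoyear data sets
data_c2 <- wage2
# Baseline wage equation
model_1 <- lm(log(wage) ~ educ + exper + tenure + married + south + urban + black, data = data_c2)
# Same equation with KWW added
model_2 <- lm(log(wage) ~ educ + exper + tenure + married + south + urban + black + KWW, data = data_c2)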
summary(model_1)
## 
## Call:
## lm(formula = log(wage) ~ educ + exper + tenure + married + south + 
##     urban + black, data = data_c2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.98069 -0.21996  0.00707  0.24288  1.22822 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.395497   0.113225  47.653  < 2e-16 ***
## educ         0.065431   0.006250  10.468  < 2e-16 ***
## exper        0.014043   0.003185   4.409 1.16e-05 ***
## tenure       0.011747   0.002453   4.789 1.95e-06 ***
## married      0.199417   0.039050   5.107 3.98e-07 ***
## south       -0.090904   0.026249  -3.463 0.000558 ***
## urban        0.183912   0.026958   6.822 1.62e-11 ***
## black       -0.188350   0.037667  -5.000 6.84e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3655 on 927 degrees of freedom
## Multiple R-squared:  0.2526, Adjusted R-squared:  0.2469 
## F-statistic: 44.75 on 7 and 927 DF,  p-value: < 2.2e-16
summary(model_2)
## 
## Call:
## lm(formula = log(wage) ~ educ + exper + tenure + married + south + 
##     urban + black + KWW, data = data_c2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.04494 -0.21931 -0.00048  0.24163  1.26464 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.358797   0.113600  47.172  < 2e-16 ***
## educ         0.057628   0.006838   8.428  < 2e-16 ***
## exper        0.012228   0.003241   3.773 0.000172 ***
## tenure       0.011072   0.002456   4.507 7.40e-06 ***
## married      0.189461   0.039077   4.848 1.46e-06 ***
## south       -0.091601   0.026156  -3.502 0.000484 ***
## urban        0.175545   0.027032   6.494 1.36e-10 ***
## black       -0.164267   0.038530  -4.263 2.22e-05 ***
## KWW          0.005028   0.001819   2.764 0.005820 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3642 on 926 degrees of freedom
## Multiple R-squared:  0.2587, Adjusted R-squared:  0.2523 
## F-statistic: 40.39 on 8 and 926 DF,  p-value: < 2.2e-16
  1. With both KWW and IQ added to the regression, the estimated return to education is 4.98% (educ coefficient 0.0498).
# model_1 (the baseline) is already fitted above; add KWW and IQ together
model_3 <- lm(log(wage) ~ educ + exper + tenure + married + south + urban + black + KWW + IQ, data = data_c2)
summary(model_3)
## 
## Call:
## lm(formula = log(wage) ~ educ + exper + tenure + married + south + 
##     urban + black + KWW + IQ, data = data_c2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.05704 -0.21621  0.00824  0.23725  1.24895 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.175644   0.127776  40.506  < 2e-16 ***
## educ         0.049837   0.007262   6.863 1.24e-11 ***
## exper        0.012752   0.003231   3.947 8.51e-05 ***
## tenure       0.010925   0.002446   4.467 8.92e-06 ***
## married      0.192145   0.038909   4.938 9.35e-07 ***
## south       -0.082029   0.026222  -3.128  0.00181 ** 
## urban        0.175823   0.026910   6.534 1.06e-10 ***
## black       -0.130399   0.039901  -3.268  0.00112 ** 
## KWW          0.003826   0.001852   2.066  0.03913 *  
## IQ           0.003118   0.001013   3.079  0.00214 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3625 on 925 degrees of freedom
## Multiple R-squared:  0.2662, Adjusted R-squared:  0.2591 
## F-statistic: 37.28 on 9 and 925 DF,  p-value: < 2.2e-16
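The returns to education quoted above come from multiplying the educ coefficient by 100. A minimal sketch of that arithmetic, using the three educ coefficients reported in the regression output, is given here; the exact percentage change 100*(exp(b) - 1) is added for comparison and is not part of the original answer. The vector and label names are illustrative.

```r
# educ coefficients copied from the regression output (labels are illustrative)
educ_coef <- c(no_proxy = 0.065431, with_KWW = 0.057628, with_KWW_IQ = 0.049837)

# Usual log-level approximation: 100 * coefficient
approx_pct <- 100 * educ_coef

# Exact percentage change implied by a log-level model
exact_pct <- 100 * (exp(educ_coef) - 1)

print(round(rbind(approx_pct, exact_pct), 2))
```

For coefficients this small the two versions differ only slightly, so reporting 100 times the coefficient is a reasonable shortcut.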

In the part ii regression, KWW and IQ are each individually significant at the 5% level (p-values 0.039 and 0.002).

((0.2662-0.2526)/2)/((1-0.2662)/925)
## [1] 8.571818

Since F = 8.57 exceeds the 5% critical value of about 3.00 for an F(2, 925) distribution, the two variables are jointly significant.
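The R-squared form of the F statistic can be checked against its critical value and p-value directly. The sketch below uses only the R-squared values and degrees of freedom reported in the output above; the variable names are illustrative.

```r
# Joint F-test for KWW and IQ from the reported R-squared values
r2_restricted   <- 0.2526  # model without KWW and IQ
r2_unrestricted <- 0.2662  # model with KWW and IQ
q      <- 2                # number of exclusion restrictions
df_res <- 925              # residual degrees of freedom in the unrestricted model

f_stat  <- ((r2_unrestricted - r2_restricted) / q) / ((1 - r2_unrestricted) / df_res)
f_crit  <- qf(0.95, df1 = q, df2 = df_res)                        # 5% critical value
p_value <- pf(f_stat, df1 = q, df2 = df_res, lower.tail = FALSE)  # upper-tail p-value

print(c(F = f_stat, crit_5pct = f_crit, p_value = p_value))
```

With the fitted models available, `anova(model_1, model_3)` should give the same test without rounding the R-squared values first.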

Question C8

  1. The sample mean of stotal is 0.047 and the standard deviation is 0.854
data_c8 <- twoyear
stotal_mean <- mean(data_c8$stotal)  # avoid masking base::mean()
stotal_sd <- sd(data_c8$stotal)      # avoid masking stats::sd()
stotal_mean
## [1] 0.04748291
stotal_sd
## [1] 0.8535441
  1. Both variables are statistically significant: their p-values are far below 0.05. Holding univ fixed, a one-unit increase in jc is associated with a 0.075 increase in stotal; holding jc fixed, a one-unit increase in univ is associated with a 0.165 increase in stotal.
model_1 <- lm(stotal ~ jc + univ, data = data_c8)
summary(model_1)
## 
## Call:
## lm(formula = stotal ~ jc + univ, data = data_c8)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.0298 -0.4457  0.1220  0.4522  2.4846 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.295005   0.013196 -22.356  < 2e-16 ***
## jc           0.074767   0.012170   6.143 8.53e-10 ***
## univ         0.164644   0.004091  40.246  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7667 on 6760 degrees of freedom
## Multiple R-squared:  0.1934, Adjusted R-squared:  0.1932 
## F-statistic: 810.5 on 2 and 6760 DF,  p-value: < 2.2e-16
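To make the coefficient interpretation concrete, the fitted equation can be evaluated at a few values of jc and univ. The sketch below hard-codes the estimates from the output above; the helper `predict_stotal()` and the example inputs are hypothetical and chosen only for illustration.

```r
# Fitted equation from the output: stotal = -0.295005 + 0.074767*jc + 0.164644*univ
# predict_stotal() is a hypothetical helper, not part of the original analysis.
predict_stotal <- function(jc, univ) {
  -0.295005 + 0.074767 * jc + 0.164644 * univ
}

base     <- predict_stotal(jc = 0, univ = 0)  # intercept only
two_jc   <- predict_stotal(jc = 2, univ = 0)  # two units of jc
one_univ <- predict_stotal(jc = 0, univ = 1)  # one unit of univ

# Differences from the baseline reproduce the slope interpretation
print(c(effect_2_jc = two_jc - base, effect_1_univ = one_univ - base))
```

Each difference is just the relevant coefficient times the change in that regressor, with the other regressor held fixed.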
  1. problems in the dataset

  2. problems in the dataset

  v. problems in the dataset

  1. problems in the dataset