Chapter5

data <- wooldridge::econmath
head(data, 10)

##    age work study econhs colgpa hsgpa acteng actmth act mathscr male calculus
## 1   23 15.0  10.0      0 3.4909 3.355     24     26  27      10    1        1
## 2   23  0.0  22.5      1 2.1000 3.219     23     20  24       9    1        0
## 3   21 25.0  12.0      0 3.0851 3.306     21     24  21       8    1        1
## 4   22 30.0  40.0      0 2.6805 3.977     31     28  31      10    0        1
## 5   22 25.0  15.0      1 3.7454 3.890     28     31  32       8    1        1
## 6   22  0.0  30.0      0 3.0555 3.500     25     30  28      10    1        1
## 7   22 20.0  25.0      1 2.1666 3.000     15     19  18       9    0        1
## 8   22 20.0  15.0      0 3.2544 3.770     28     30  32       9    1        1
## 9   22 28.0   7.0      0 3.1298 3.927     28     28  30       6    0        0
## 10  21 22.5  25.0      0 2.2424 2.770     18     19  17       9    0        1
##    attexc attgood fathcoll mothcoll score
## 1       0       0        1        1 84.43
## 2       0       0        0        1 57.38
## 3       1       0        0        1 66.39
## 4       0       1        1        1 81.15
## 5       0       1        0        1 95.90
## 6       1       0        0        1 83.61
## 7       0       1        0        0 76.23
## 8       1       0        1        1 84.43
## 9       1       0        0        1 79.51
## 10      0       1        0        0 46.72

Min and max of score

Because score is a percentage, its logical minimum is 0 and its logical maximum is 100. In the data, however, the minimum is 19.53 and the maximum is 98.44.

min(data$score)

## [1] 19.53

max(data$score)

## [1] 98.44

score ~ colgpa + actmth + acteng

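MLR.6 cannot hold for the error term u because score is bounded between 0 and 100, while a normally distributed variable can take any real value, so score (and therefore u) cannot be normally distributed. The consequence is that the usual t statistic for H0: β3 = 0 does not have an exact t distribution. With a sample this large, the t test is still approximately valid under MLR.1 through MLR.5.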

model <- lm(score ~ colgpa + actmth + acteng, data = data)
summary(model)

## 
## Call:
## lm(formula = score ~ colgpa + actmth + acteng, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -39.855  -6.215   0.444   6.812  32.670 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 16.17402    2.80044   5.776 1.09e-08 ***
## colgpa      12.36620    0.71506  17.294  < 2e-16 ***
## actmth       0.88335    0.11220   7.873 1.11e-14 ***
## acteng       0.05176    0.11106   0.466    0.641    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.35 on 810 degrees of freedom
##   (42 observations deleted due to missingness)
## Multiple R-squared:  0.3972, Adjusted R-squared:  0.395 
## F-statistic: 177.9 on 3 and 810 DF,  p-value: < 2.2e-16
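
The t statistic and p-value for acteng can be rebuilt by hand from the numbers printed above. This is only a sketch: the estimate, standard error and degrees of freedom are copied from the output, and the variable names are made up for illustration. It also compares the t-based p-value with the one from the standard normal distribution, which is the large-sample approximation.

```r
# Values copied from the acteng row of the summary(model) output above
est_acteng <- 0.05176   # coefficient estimate
se_acteng  <- 0.11106   # standard error
df_resid   <- 810       # residual degrees of freedom

# t statistic for H0: beta3 = 0
t_stat <- est_acteng / se_acteng

# Two-sided p-value from the t distribution with 810 df
p_t <- 2 * pt(-abs(t_stat), df = df_resid)

# Two-sided p-value from the standard normal (large-sample approximation)
p_normal <- 2 * pnorm(-abs(t_stat))

round(c(t = t_stat, p_t = p_t, p_normal = p_normal), 3)
```

With 810 degrees of freedom the two p-values are almost identical, so it makes little practical difference which reference distribution is used here.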

Argument

For the statement “You cannot trust the p-value because clearly the error term in the equation cannot have a normal distribution”:

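I think the statement is too strong. Without MLR.6 we lose exact finite-sample inference, but with MLR.1 through MLR.4 satisfied the OLS estimators are still unbiased, and with the addition of MLR.5 OLS is the best linear unbiased estimator, which is a finite-sample property. In addition, with more than 800 observations the OLS estimators are approximately normally distributed, so the reported p-value can be trusted as a large-sample approximation.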
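
The large-sample argument can be illustrated with a small Monte Carlo sketch. It does not use the econmath data: the regressors x1 and x2, the uniform error, the seed and the number of replications are all assumptions chosen for illustration. The error is bounded, so it cannot be normal, and the true coefficient on x2 is zero, so the share of rejections at the 5% level estimates the actual size of the usual t test.

```r
# Monte Carlo sketch: size of the usual t test with bounded, non-normal errors
set.seed(123)     # arbitrary seed, for reproducibility only
n_obs  <- 800     # roughly the sample size in the regression above
n_reps <- 1000    # number of simulated samples (assumed)

reject <- logical(n_reps)
for (r in seq_len(n_reps)) {
  x1 <- rnorm(n_obs)
  x2 <- rnorm(n_obs)
  u  <- runif(n_obs, min = -1, max = 1)   # bounded error, not normal
  y  <- 1 + 0.5 * x1 + 0 * x2 + u         # true coefficient on x2 is zero
  fit  <- lm(y ~ x1 + x2)
  p_x2 <- summary(fit)$coefficients["x2", "Pr(>|t|)"]
  reject[r] <- p_x2 < 0.05                # reject H0 at the 5% level?
}

# Share of samples in which a true H0 is rejected
rejection_rate <- mean(reject)
rejection_rate
```

If the rejection rate comes out close to 0.05, the t test behaves as intended even though MLR.6 fails in the simulated model, which is the point of the argument above.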

Chapter5_FI

NGUYEN BAO QUYNH TRANG

2023-11-05