library(dplyr)
library(wooldridge)
library(car)
library(sandwich)
library(lmtest)
data("jtrain")
Some reasons why the unobserved factors in the error terms might be correlated with grant:
Selection Bias: Firms with higher or lower scrap rates might be more likely to receive a grant, leading to a correlation with the error term.
Omitted Variables: Unobserved factors like management quality or business performance that affect both the grant decision and scrap rate.
Endogenous grant allocation: If grants are not assigned at random, the award decision may depend on firm characteristics that also affect the scrap rate.
Reverse Causality: A firm’s scrap rate could influence whether it receives a grant.
Unobservable Firm Characteristics: Unmeasured factors like organizational culture or employee motivation that affect both grant status and scrap rate.
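The omitted-variable and selection stories above can be illustrated with a small simulation. This is only a sketch on made-up data: the variable `quality`, the coefficients, and the sample size are hypothetical and are not taken from jtrain. It assumes that low-quality firms are more likely to receive a grant and also have higher scrap rates.

```r
# Hypothetical simulation (not jtrain data): an unobserved factor drives
# both grant receipt and the scrap rate.
set.seed(1)
n <- 5000
quality <- rnorm(n)                            # unobserved management quality
grant   <- as.integer(quality + rnorm(n) < 0)  # weaker firms more likely to get a grant
lscrap  <- 0.5 - 0.25 * grant - 0.8 * quality + rnorm(n)  # assumed true grant effect: -0.25

# Regression that omits quality: grant picks up the effect of low quality
b_naive <- coef(lm(lscrap ~ grant))["grant"]

# Regression that controls for quality (infeasible in practice, since it is unobserved)
b_ctrl <- coef(lm(lscrap ~ grant + quality))["grant"]

c(naive = unname(b_naive), controlled = unname(b_ctrl))
```

In this setup the naive coefficient is pushed upward, away from the assumed true effect of -0.25, because grant is correlated with the omitted factor.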
jtrain_88 <- jtrain %>%
  filter(year == 1988) %>%
  filter(!is.na(scrap))
summary(lm(log(scrap) ~ grant, data = jtrain_88))
##
## Call:
## lm(formula = log(scrap) ~ grant, data = jtrain_88)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.4043 -0.9536 -0.0465  0.9636  2.8103
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   0.4085     0.2406   1.698   0.0954 .
## grant         0.0566     0.4056   0.140   0.8895
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.423 on 52 degrees of freedom
## Multiple R-squared: 0.0003744, Adjusted R-squared: -0.01885
## F-statistic: 0.01948 on 1 and 52 DF, p-value: 0.8895
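The coefficient on grant is 0.0566 with a p-value of 0.8895, so the estimate is small, positive, and statistically indistinguishable from zero. This simple regression provides no evidence that receiving a job training grant lowers a firm's scrap rate.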
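To see how little this regression says about the grant effect, a 95% confidence interval can be computed by hand. The sketch below simply reuses the estimate, standard error, and residual degrees of freedom printed in the output above; the variable names are chosen here for illustration.

```r
# Values copied from the summary() output of the simple regression
b_grant  <- 0.0566
se_grant <- 0.4056
df_resid <- 52

# Two-sided 95% confidence interval for the coefficient on grant
crit <- qt(0.975, df = df_resid)
ci   <- b_grant + c(-1, 1) * crit * se_grant
ci

# Same interval expressed as an approximate percentage change in scrap
100 * (exp(ci) - 1)
```

The interval runs from roughly -0.76 to 0.87 in log points, so it is consistent with both large reductions and large increases in the scrap rate.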
model <- lm(log(scrap) ~ grant + lscrap_1, data = jtrain_88)
summary(model)
##
## Call:
## lm(formula = log(scrap) ~ grant + lscrap_1, data = jtrain_88)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1.9146 -0.1763  0.0057  0.2308  1.5991
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.02124    0.08910   0.238   0.8126
## grant       -0.25397    0.14703  -1.727   0.0902 .
## lscrap_1     0.83116    0.04444  18.701   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5127 on 51 degrees of freedom
## Multiple R-squared: 0.8728, Adjusted R-squared: 0.8678
## F-statistic: 174.9 on 2 and 51 DF, p-value: < 2.2e-16
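Adding lscrap_1, the log of the 1987 scrap rate, changes the estimated effect of grant from 0.0566 to -0.254: holding the lagged scrap rate fixed, firms that received a grant have a scrap rate roughly 22% lower (100 * (exp(-0.254) - 1) is about -22.4). The lagged scrap rate absorbs most of the variation in the current scrap rate (R-squared rises from 0.0004 to 0.873), which cuts the standard error on grant from 0.406 to 0.147. The coefficient on grant is not statistically significant at the 5% level against a two-sided alternative (p = 0.0902), but against the one-sided alternative that the grant lowers the scrap rate the p-value is 0.0902 / 2, or about 0.045, so it is marginally significant at the 5% level.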
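The one-sided and two-sided p-values for grant, and the percentage interpretation of its coefficient, can be checked directly from the printed output. This is a small sketch that hard-codes the t statistic, coefficient, and degrees of freedom reported above; the variable names are illustrative.

```r
# Values copied from the summary() output of the model with lscrap_1
b_grant  <- -0.25397
t_grant  <- -1.727
df_resid <- 51

# Two-sided p-value (H1: coefficient differs from 0), as reported by summary()
p_two_sided <- 2 * pt(-abs(t_grant), df = df_resid)

# One-sided p-value (H1: coefficient is negative, i.e. the grant lowers scrap)
p_one_sided <- pt(t_grant, df = df_resid)

# Percentage change in scrap implied by the log-level dummy coefficient
pct_effect <- 100 * (exp(b_grant) - 1)

c(two_sided = p_two_sided, one_sided = p_one_sided, pct_effect = pct_effect)
```

Because the t statistic is negative, the one-sided p-value is half the two-sided one, which is why the conclusion at the 5% level depends on the alternative hypothesis.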
linearHypothesis(model, "lscrap_1 = 1")
##
## Linear hypothesis test:
## lscrap_1 = 1
##
## Model 1: restricted model
## Model 2: log(scrap) ~ grant + lscrap_1
##
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)
## 1     52 17.197
## 2     51 13.404  1     3.793 14.432 0.0003885 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value is approximately 0.0004, so we reject the null hypothesis that the parameter on log(scrap87) equals 1 at any conventional significance level.
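With a single restriction, the F statistic from linearHypothesis() is the square of the t statistic for the same null hypothesis. The sketch below reproduces the test by hand from the coefficient and standard error printed in the regression output; small differences from the reported F come from rounding in the printed values.

```r
# Values copied from the summary() output of the model with lscrap_1
b_lscrap  <- 0.83116
se_lscrap <- 0.04444
df_resid  <- 51

# t statistic for H0: coefficient on lscrap_1 equals 1
t_stat <- (b_lscrap - 1) / se_lscrap

# Equivalent F statistic and two-sided p-value
f_stat  <- t_stat^2
p_value <- 2 * pt(-abs(t_stat), df = df_resid)

c(t = t_stat, F = f_stat, p = p_value)
```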
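# vcovHC() returns the HC3 robust covariance matrix; robust standard errors are the square roots of its diagonal
robust_se <- vcovHC(model, type = "HC3")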
robust_se
##              (Intercept)        grant     lscrap_1
## (Intercept)  0.013461225 -0.007871283 -0.008085249
## grant       -0.007871283  0.023323961  0.004125846
## lscrap_1    -0.008085249  0.004125846  0.007779360
Heteroskedasticity-robust standard errors remain valid whether or not the error variance is constant. Taking square roots of the diagonal of the HC3 covariance matrix gives robust standard errors of about 0.153 for grant (versus 0.147 with the usual formula) and about 0.088 for lscrap_1 (versus 0.044), so the standard error on grant rises only slightly while the one on lscrap_1 roughly doubles. The robust t statistics are about -1.66 for grant and 9.42 for lscrap_1. The key takeaway is that grant remains statistically insignificant at the 5% level against a two-sided alternative, and lscrap_1 remains highly significant.
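The robust standard errors, t statistics, and p-values discussed above can be recovered from the printed covariance matrix. The sketch below retypes the matrix and coefficients from the output so that it runs on its own; with the fitted model in memory, `coeftest(model, vcov. = robust_se)` from lmtest should report the same quantities.

```r
# HC3 covariance matrix and coefficients copied from the output above
nm <- c("(Intercept)", "grant", "lscrap_1")
V <- matrix(c( 0.013461225, -0.007871283, -0.008085249,
              -0.007871283,  0.023323961,  0.004125846,
              -0.008085249,  0.004125846,  0.007779360),
            nrow = 3, byrow = TRUE, dimnames = list(nm, nm))
b <- c(0.02124, -0.25397, 0.83116)
names(b) <- nm

# Robust standard errors are the square roots of the diagonal
se_robust <- sqrt(diag(V))

# Robust t statistics and two-sided p-values (51 residual degrees of freedom)
t_robust <- b / se_robust
p_robust <- 2 * pt(-abs(t_robust), df = 51)

round(cbind(estimate = b, se = se_robust, t = t_robust, p = p_robust), 4)
```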
linearHypothesis(model, "lscrap_1 = 1", vcov. = robust_se)
##
## Linear hypothesis test:
## lscrap_1 = 1
##
## Model 1: restricted model
## Model 2: log(scrap) ~ grant + lscrap_1
##
## Note: Coefficient covariance matrix supplied.
##
##   Res.Df Df      F  Pr(>F)
## 1     52
## 2     51  1 3.6644 0.06121 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
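In this case the p-value is 0.061, so at the 5% significance level we fail to reject the null hypothesis that the parameter on log(scrap87) equals 1, although we would reject it at the 10% level. With the robust standard error, the earlier strong rejection disappears.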
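The robust F statistic can be reproduced by hand as a Wald statistic for a single restriction: the squared distance of the estimate from 1, divided by its robust variance. This sketch reuses the coefficient and the lscrap_1 diagonal entry of the HC3 matrix printed earlier; the variable names are illustrative.

```r
# Coefficient on lscrap_1 and its HC3 robust variance, copied from the output
b_lscrap <- 0.83116
v_lscrap <- 0.007779360

# Wald (F) statistic for H0: coefficient equals 1, with 1 and 51 degrees of freedom
f_robust <- (b_lscrap - 1)^2 / v_lscrap
p_robust <- pf(f_robust, df1 = 1, df2 = 51, lower.tail = FALSE)

c(F = f_robust, p = p_robust)
```

The estimate is the same as in the non-robust test; only the variance in the denominator changes, which is what moves the p-value from about 0.0004 to about 0.06.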