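library(R.matlab)  # readMat()
library(magrittr)  # %>% and %<>%
library(censReg)   # censReg()
# Load the MATLAB data and convert it to a data frame
data <- readMat("C:/Users/dorgo/Documents/R/matlabdata.mat")
data %<>% as.data.frame()
# Tobit model; censReg() censors from below at zero by default
tobit_model <- censReg(formula = I ~ AGE + MARRIED + WOMAN + KIDS + Y,
                       data = data)
b <- tobit_model %>% summary()
tobit_coef <- coef(b) %>% as.data.frame()
# censReg() reports log(sigma), so exponentiate to recover sigma
sigma <- tobit_coef["logSigma", 1] %>% exp()
# Covariate values in coefficient order: intercept, AGE, MARRIED, WOMAN, KIDS, Y
values <- list(1, 28, 1, 1, 1, 0.7)
a <- sapply(1:6, function(x) tobit_coef[x, 1] * values[[x]])
x_beta <- sum(a)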
The linear index \(x\beta\) is -1.0320593 and \(\sigma\) is 0.9478728, so the expected value \(\Phi(x\beta/\sigma)\,x\beta + \sigma\,\phi(x\beta/\sigma)\) is 0.0664937, which is about 665 NIS of expenditure on insurance.
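The arithmetic behind this number can be sketched in base R. This is a minimal sketch, assuming the reported figure is the unconditional mean of a Tobit outcome censored from below at zero; the function name tobit_expected_value is hypothetical, and the two inputs are simply the values quoted in the text rather than re-estimated from the data.

```r
# Values quoted in the text (taken as given; not re-estimated here).
x_beta <- -1.0320593
sigma  <- 0.9478728

# Unconditional mean of a Tobit outcome censored from below at zero:
#   E[I | x] = pnorm(x_beta / sigma) * x_beta + sigma * dnorm(x_beta / sigma)
# (hypothetical helper name, introduced only for this illustration)
tobit_expected_value <- function(x_beta, sigma) {
  z <- x_beta / sigma
  pnorm(z) * x_beta + sigma * dnorm(z)
}

expected_I <- tobit_expected_value(x_beta, sigma)
print(expected_I)
```

The result should land close to the 0.0665 reported above, up to rounding of the inputs.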
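Now we will look at the output of an OLS regression. The intent is to fit it only on the positive I’s (data_1), but the call below passes data rather than data_1, so the output shown comes from the full sample: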
data_1 <- data[data$I>0,]
linear_model <- lm(formula = I~AGE+MARRIED+WOMAN+KIDS,
data = data)
linear_model %>% summary()
##
## Call:
## lm(formula = I ~ AGE + MARRIED + WOMAN + KIDS, data = data)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1.3445 -0.4646 -0.2411  0.2389  3.3834
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.214314   0.051767  -4.140 3.56e-05 ***
## AGE          0.017514   0.001341  13.063  < 2e-16 ***
## MARRIED      0.116904   0.036353   3.216  0.00131 **
## WOMAN       -0.103529   0.026879  -3.852  0.00012 ***
## KIDS         0.084615   0.018685   4.529 6.15e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7641 on 3236 degrees of freedom
## Multiple R-squared: 0.1343, Adjusted R-squared: 0.1332
## F-statistic: 125.5 on 4 and 3236 DF, p-value: < 2.2e-16
We can see the “attenuation bias” we talked about in class: the OLS \(\beta\)’s are biased toward 0.
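A small simulation can illustrate the direction of this bias. This is only a sketch on made-up data, not the insurance data set: the sample size, the true slope of 2, and all variable names are assumptions chosen for the illustration.

```r
set.seed(1)

# Simulated data (an assumption for illustration; not the insurance data).
n <- 5000
x <- rnorm(n)
y_star <- 2 * x + rnorm(n)  # latent outcome, true slope = 2
y <- pmax(y_star, 0)        # observed outcome, censored from below at zero

# OLS on the censored outcome ignores the pile-up of zeros.
ols_censored <- lm(y ~ x)
slope_censored <- unname(coef(ols_censored)["x"])
print(slope_censored)
```

In this setup roughly half of the observations are censored, and the fitted slope should come out well below the true value of 2, which is the pull toward 0 described above.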