b

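library(R.matlab)  # readMat()
library(magrittr)  # %>% and %<>%
library(censReg)   # censReg()

data <- readMat("C:/Users/dorgo/Documents/R/matlabdata.mat")
data %<>% as.data.frame()
tobit_model <- censReg(formula = I ~ AGE + MARRIED + WOMAN + KIDS + Y,
                       data = data)
b <- tobit_model %>% summary()
tobit_coef <- coef(b) %>% as.data.frame()
# censReg reports log(sigma), so exponentiate to recover sigma
sigma <- tobit_coef["logSigma", 1] %>% exp()
# Regressor values in coefficient order: intercept, AGE, MARRIED, WOMAN, KIDS, Y
values <- list(1, 28, 1, 1, 1, 0.7)
a <- sapply(1:6, function(x) tobit_coef[x, 1] * values[[x]])
x_beta <- a %>% sum()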

With \(x\beta = -1.0320593\) and \(\sigma = 0.9478728\), the expected expenditure \(E[I \mid x] = \Phi(x\beta/\sigma)\,x\beta + \sigma\,\phi(x\beta/\sigma)\) is 0.0664937, which is about 665 NIS of expenditure on insurance.
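
As a check on the arithmetic, the expected value of a variable censored at zero can be computed directly from the two numbers reported above. This is a minimal sketch in base R, assuming the standard Tobit expectation \(E[I \mid x] = \Phi(x\beta/\sigma)\,x\beta + \sigma\,\phi(x\beta/\sigma)\); the names `z` and `expected_I` are illustrative and do not appear in the original code.

```r
# Values copied from the text above
x_beta <- -1.0320593   # linear index x * beta
sigma  <- 0.9478728    # estimated sigma

# Standardized index
z <- x_beta / sigma

# Expected value of the outcome censored at 0 (assumed formula, see lead-in):
# P(I > 0) * x_beta + sigma * normal density at z
expected_I <- pnorm(z) * x_beta + sigma * dnorm(z)
expected_I
```

The result should come out close to the 0.0665 quoted above.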

c

Now we look at the output of an OLS regression run only on the observations with positive I:

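# Keep only the observations with positive I
data_1 <- data[data$I > 0, ]
# Fit on data_1 (the summary printed below was run with data = data)
linear_model <- lm(formula = I ~ AGE + MARRIED + WOMAN + KIDS,
                   data = data_1)
linear_model %>% summary()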
## 
## Call:
## lm(formula = I ~ AGE + MARRIED + WOMAN + KIDS, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.3445 -0.4646 -0.2411  0.2389  3.3834 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.214314   0.051767  -4.140 3.56e-05 ***
## AGE          0.017514   0.001341  13.063  < 2e-16 ***
## MARRIED      0.116904   0.036353   3.216  0.00131 ** 
## WOMAN       -0.103529   0.026879  -3.852  0.00012 ***
## KIDS         0.084615   0.018685   4.529 6.15e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7641 on 3236 degrees of freedom
## Multiple R-squared:  0.1343, Adjusted R-squared:  0.1332 
## F-statistic: 125.5 on 4 and 3236 DF,  p-value: < 2.2e-16

We can see the “attenuation bias” we talked about in class: the OLS estimates of \(\beta\) are biased toward 0.
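
The direction of this bias can be illustrated with a small simulation. This is only a sketch on made-up data, not on the insurance data set: the sample size, seed, intercept, and true slope of 1 are all hypothetical choices. It fits OLS both to the censored outcome and to the positive observations only, and compares the slopes with the true one.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
n <- 5000                            # hypothetical sample size
x <- rnorm(n)
true_slope <- 1                      # hypothetical true coefficient
y_star <- -0.5 + true_slope * x + rnorm(n)  # latent outcome
y <- pmax(y_star, 0)                 # observed outcome, censored at 0

# OLS on all observations, zeros included
slope_censored <- coef(lm(y ~ x))[[2]]

# OLS on the observations with a positive outcome only
pos <- y > 0
slope_positive <- coef(lm(y[pos] ~ x[pos]))[[2]]

c(true = true_slope, censored = slope_censored, positive = slope_positive)
```

Both OLS slopes should fall below the true slope, which is the pattern described above.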