install.packages("wooldridge")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.3'
## (as 'lib' is unspecified)
library(wooldridge)
data("smoke")
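# OLS with cigs (cigarettes smoked per day) as the dependent variable,
# so the coefficients below are in cigarettes per day, not in probabilities.
model <- lm(cigs ~ lcigpric + lincome + educ + age + agesq + restaurn + white, data = smoke)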
summary(model)
## 
## Call:
## lm(formula = cigs ~ lcigpric + lincome + educ + age + agesq + 
##     restaurn + white, data = smoke)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -15.772  -9.330  -5.907   7.945  70.275 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -2.682435  24.220730  -0.111  0.91184    
## lcigpric    -0.850904   5.782321  -0.147  0.88305    
## lincome      0.869014   0.728764   1.192  0.23344    
## educ        -0.501753   0.167168  -3.001  0.00277 ** 
## age          0.774502   0.160516   4.825 1.68e-06 ***
## agesq       -0.009069   0.001748  -5.188 2.70e-07 ***
## restaurn    -2.865621   1.117406  -2.565  0.01051 *  
## white       -0.559236   1.459461  -0.383  0.70169    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 13.41 on 799 degrees of freedom
## Multiple R-squared:  0.05291,    Adjusted R-squared:  0.04461 
## F-statistic: 6.377 on 7 and 799 DF,  p-value: 2.588e-07
# (i) Are there any important differences between the two sets of standard errors?
# Answering this requires heteroskedasticity-robust standard errors alongside the usual OLS ones reported by summary(). A linear probability model is heteroskedastic by construction, because Var(y | x) = p(x)[1 - p(x)], so the robust standard errors are the appropriate ones for inference. If the robust standard errors are noticeably larger (for instance on the cigarette price coefficient), the usual ones understate the uncertainty. A difference is important when it changes an inference: a coefficient may be significant with the usual standard error but not with the robust one.
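# A minimal sketch of the comparison in (i), not necessarily the textbook's exact procedure.
# Assumptions: the binary outcome smokes is constructed here as cigs > 0, and the robust
# standard errors are the basic White (HC0) sandwich form computed by hand in base R.
# The names lpm, se_usual and se_robust are illustrative.
library(wooldridge)
data("smoke")
smoke$smokes <- as.integer(smoke$cigs > 0)  # assumed definition of the smoking indicator
lpm <- lm(smokes ~ lcigpric + lincome + educ + age + agesq + restaurn + white, data = smoke)
X <- model.matrix(lpm)                      # n x k regressor matrix, including the intercept
u <- resid(lpm)                             # OLS residuals
bread <- solve(crossprod(X))                # (X'X)^{-1}
meat <- crossprod(X * u)                    # sum over i of u_i^2 * x_i x_i'
se_usual <- sqrt(diag(vcov(lpm)))           # usual OLS standard errors
se_robust <- sqrt(diag(bread %*% meat %*% bread))  # HC0 robust standard errors
round(cbind(usual = se_usual, robust = se_robust), 4)
# Reading the two columns side by side shows whether any coefficient's significance changes.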
# (ii) Holding other factors fixed, if education increases by four years, what happens to the estimated probability of smoking?
# -0.029*4 = -0.116, using the educ coefficient of -0.029 from the estimated equation in part (v). Holding other factors fixed, four more years of education lowers the estimated probability of smoking by about 0.116, that is, 11.6 percentage points.
# (iii) At what point does another year of age reduce the probability of smoking?
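# A short worked sketch for (iii). With a quadratic in age, the marginal effect of age is
# b_age + 2 * b_agesq * age, which turns negative once age exceeds -b_age / (2 * b_agesq).
# The coefficient values are the age and age^2 estimates quoted in part (v); the variable
# names are illustrative.
b_age <- 0.020
b_agesq <- -0.00026
turning_point <- -b_age / (2 * b_agesq)
round(turning_point, 2)  # 38.46
# Under these estimates, another year of age reduces the smoking probability after about age 38.5.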
# (iv) Interpret the coefficient on the binary variable restaurn (a dummy variable equal to one if the person lives in a state with restaurant smoking restrictions).
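# A hedged sketch for (iv). In a linear probability model, the coefficient on a dummy variable
# is the difference in the predicted probability between the two groups, holding the other
# regressors fixed. Assumption: smokes is constructed as cigs > 0; lpm_iv and b_restaurn are
# illustrative names.
library(wooldridge)
data("smoke")
smoke$smokes <- as.integer(smoke$cigs > 0)  # assumed definition of the smoking indicator
lpm_iv <- lm(smokes ~ lcigpric + lincome + educ + age + agesq + restaurn + white, data = smoke)
b_restaurn <- coef(lpm_iv)[["restaurn"]]
100 * b_restaurn  # percentage-point difference in smoking probability when restaurn = 1
# A negative value means residents of states with restaurant smoking restrictions are
# estimated to be less likely to smoke, other factors equal.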
# (v) Person number 206 in the data set has the following characteristics: cigpric = 67.44, income = 6,500, educ = 16, age = 77, restaurn = 0, white = 0, and smokes = 0. Compute the predicted probability of smoking for this person and comment on the result.
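# lcigpric and lincome are natural logs: log(67.44) = 4.211 and log(6500) = 8.780.
# smokes_hat = 0.656 - 0.069*4.211 + 0.012*8.780 - 0.029*16 + 0.020*77 - 0.00026*5929 = 0.0052
# The predicted probability is about 0.5%, essentially zero, which is consistent with this person being a nonsmoker (smokes = 0).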
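# A small check of the part (v) prediction in base R. Assumptions: the logs are natural logs,
# matching how lcigpric and lincome are defined, and the coefficient on log(cigpric) is -0.069.
# The restaurn and white terms drop out because both equal 0 for this person. The names b, x
# and p_hat are illustrative.
b <- c(intercept = 0.656, lcigpric = -0.069, lincome = 0.012,
       educ = -0.029, age = 0.020, agesq = -0.00026)
x <- c(1, log(67.44), log(6500), 16, 77, 77^2)  # regressor values for person 206
p_hat <- sum(b * x)                             # fitted value of the linear probability model
round(p_hat, 4)  # 0.0052
# Nothing in a linear probability model keeps fitted values inside [0, 1], so a prediction this
# close to zero is worth checking.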