Estimar el modelo lineal: cigs=f(cigpric, lcigpric, income, lincome, age, agesq, educ, white, restaurn)

options(scipen = 99999)
library(lmtest)
load("C:/Users/mende/Downloads/Jairo Rodrigo Mendez Carrillo - smoke.RData")
modelo_smoke<-lm(cigs~cigpric+lcigpric+income+lincome+age+agesq+educ+white+restaurn , data=data)
print(modelo_smoke)
## 
## Call:
## lm(formula = cigs ~ cigpric + lcigpric + income + lincome + age + 
##     agesq + educ + white + restaurn, data = data)
## 
## Coefficients:
##   (Intercept)        cigpric       lcigpric         income        lincome  
##  340.80437460     2.00226767  -115.27346445    -0.00004619     1.40406118  
##           age          agesq           educ          white       restaurn  
##    0.77835901    -0.00915035    -0.49478062    -0.53105164    -2.64424135

Presente la un resumen del modelo con stargazer

library(stargazer)
stargazer(modelo_smoke,type = "html",title = "resumen del modelo" )
resumen del modelo
Dependent variable:
cigs
cigpric 2.002
(1.493)
lcigpric -115.273
(85.424)
income -0.00005
(0.0001)
lincome 1.404
(1.708)
age 0.778***
(0.161)
agesq -0.009***
(0.002)
educ -0.495***
(0.168)
white -0.531
(1.461)
restaurn -2.644**
(1.130)
Constant 340.804
(260.016)
Observations 807
R2 0.055
Adjusted R2 0.044
Residual Std. Error 13.413 (df = 797)
F Statistic 5.169*** (df = 9; 797)
Note: p<0.1; p<0.05; p<0.01

Verifique si la matriz de Varianza-Covarianza del modelo es escalar, de no ser escalar indique la fuente del problema.

Prueba de White

library(lmtest)
prueba_white<-bptest(modelo_smoke,data = data)
print(prueba_white)
## 
##  studentized Breusch-Pagan test
## 
## data:  modelo_smoke
## BP = 33.525, df = 9, p-value = 0.0001082

Como 0.0001082 < 0.05 ,se rechaza la H0, por lo tanto hay evidencia de que la varianza de los residuos no es homocedástica, Hay evidencia de que la varianza de los residuos es Heterocedástica

Prueba del Multiplicador de Lagrange (Breusch Godfrey)

Autocorrelación de 2º Orden:

library(lmtest)
prueba_LM<-bgtest(modelo_smoke,order = 2)
print(prueba_LM)
## 
##  Breusch-Godfrey test for serial correlation of order up to 2
## 
## data:  modelo_smoke
## LM test = 0.26889, df = 2, p-value = 0.8742

/ya que el p-value = 0.8742 > 0.05 , No se rechaza H0, No hay evidencia de autocorrelación, no siguen autocorrelación de orden “2”

Autocorrelación de 1º orden (prueba de Durbin Watson)

library(car)
## Warning: package 'car' was built under R version 4.2.3
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.2.3
durbinWatsonTest(model = modelo_smoke)
##  lag Autocorrelation D-W Statistic p-value
##    1    -0.009243664      2.017442   0.884
##  Alternative hypothesis: rho != 0

ya que el p-value = 0.938 > 0.05,No se rechaza H0, No hay evidencia de autocorrelación, no siguen autocorrelación de 1° orden

Obtenga el Estimador HAC apropiado y denominelo vcov_HAC

options(scipen = 99999)
library(lmtest)
#Sin corregir:
coeftest(modelo_smoke)
## 
## t test of coefficients:
## 
##                   Estimate     Std. Error t value     Pr(>|t|)    
## (Intercept)  340.804374604  260.015587269  1.3107     0.190334    
## cigpric        2.002267667    1.492831189  1.3413     0.180220    
## lcigpric    -115.273464445   85.424315195 -1.3494     0.177585    
## income        -0.000046194    0.000133491 -0.3460     0.729402    
## lincome        1.404061178    1.708165841  0.8220     0.411340    
## age            0.778359013    0.160555612  4.8479 0.0000015001 ***
## agesq         -0.009150353    0.001749292 -5.2309 0.0000002158 ***
## educ          -0.494780616    0.168180198 -2.9420     0.003356 ** 
## white         -0.531051635    1.460721806 -0.3636     0.716287    
## restaurn      -2.644241351    1.129998690 -2.3400     0.019528 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
options(scipen = 99999)
library(lmtest)
library(sandwich)
## Warning: package 'sandwich' was built under R version 4.2.3
#Corregido
vcov_HAC<-vcovHC(modelo_smoke,type = "HC0") 

coeftest(modelo_smoke,vcov. = vcov_HAC)
## 
## t test of coefficients:
## 
##                   Estimate     Std. Error t value        Pr(>|t|)    
## (Intercept)  340.804374604  278.565072832  1.2234        0.221530    
## cigpric        2.002267667    1.602727983  1.2493        0.211927    
## lcigpric    -115.273464445   91.344424868 -1.2620        0.207331    
## income        -0.000046194    0.000115593 -0.3996        0.689540    
## lincome        1.404061178    1.228970726  1.1425        0.253602    
## age            0.778359013    0.136944678  5.6837 0.0000000184838 ***
## agesq         -0.009150353    0.001451548 -6.3039 0.0000000004804 ***
## educ          -0.494780616    0.162968371 -3.0361        0.002475 ** 
## white         -0.531051635    1.361907703 -0.3899        0.696691    
## restaurn      -2.644241351    1.038254938 -2.5468        0.011058 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(robustbase)
library(stargazer)
modelo_robust<-lmrob(cigs~cigpric+lcigpric+income+lincome+age+agesq+educ+white+restaurn, data = data)
## Warning in lmrob.S(x, y, control = control): S-estimated scale == 0: Probably
## exact fit; check your data
## Warning in lmrob.fit(x, y, control, init = init): initial estim. 'init' not
## converged -- will be return()ed basically unchanged
robust.se<-sqrt(diag(vcov_HAC))

usando la libreria stargazer, presente el modelo “original” y el modelo “corregido” en una sola tabla (consulte la documentación de stargazer)

stargazer(modelo_smoke,modelo_smoke,se=list(NULL,robust.se),
          type ="html", title = "comparacion de modelo",column.labels = c("original","corregido") )
comparacion de modelo
Dependent variable:
cigs
original corregido
(1) (2)
cigpric 2.002 2.002
(1.493) (1.603)
lcigpric -115.273 -115.273
(85.424) (91.344)
income -0.00005 -0.00005
(0.0001) (0.0001)
lincome 1.404 1.404
(1.708) (1.229)
age 0.778*** 0.778***
(0.161) (0.137)
agesq -0.009*** -0.009***
(0.002) (0.001)
educ -0.495*** -0.495***
(0.168) (0.163)
white -0.531 -0.531
(1.461) (1.362)
restaurn -2.644** -2.644**
(1.130) (1.038)
Constant 340.804 340.804
(260.016) (278.565)
Observations 807 807
R2 0.055 0.055
Adjusted R2 0.044 0.044
Residual Std. Error (df = 797) 13.413 13.413
F Statistic (df = 9; 797) 5.169*** 5.169***
Note: p<0.1; p<0.05; p<0.01