1. Estimar el modelo lineal.

options(scipen = 9999)
load("C:/Users/JOSE CASTRO/Downloads/smoke.RData")
modelo_cigs <- lm(cigs ~ cigpric + lcigpric + income + lincome + 
                    age + agesq + educ + white + restaurn, 
                  data = data)
print(modelo_cigs)
## 
## Call:
## lm(formula = cigs ~ cigpric + lcigpric + income + lincome + age + 
##     agesq + educ + white + restaurn, data = data)
## 
## Coefficients:
##   (Intercept)        cigpric       lcigpric         income        lincome  
##  340.80437460     2.00226767  -115.27346445    -0.00004619     1.40406118  
##           age          agesq           educ          white       restaurn  
##    0.77835901    -0.00915035    -0.49478062    -0.53105164    -2.64424135

2. Presente la un resumen del modelo con stargazer

library(stargazer)
suppressWarnings(
  stargazer(modelo_cigs, 
            type = "text",
            title = "Modelo de Cigarrillos",
            header = FALSE)
)
## 
## Modelo de Cigarrillos
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                                cigs            
## -----------------------------------------------
## cigpric                        2.002           
##                               (1.493)          
##                                                
## lcigpric                     -115.273          
##                              (85.424)          
##                                                
## income                       -0.00005          
##                              (0.0001)          
##                                                
## lincome                        1.404           
##                               (1.708)          
##                                                
## age                          0.778***          
##                               (0.161)          
##                                                
## agesq                        -0.009***         
##                               (0.002)          
##                                                
## educ                         -0.495***         
##                               (0.168)          
##                                                
## white                         -0.531           
##                               (1.461)          
##                                                
## restaurn                     -2.644**          
##                               (1.130)          
##                                                
## Constant                      340.804          
##                              (260.016)         
##                                                
## -----------------------------------------------
## Observations                    807            
## R2                             0.055           
## Adjusted R2                    0.044           
## Residual Std. Error      13.413 (df = 797)     
## F Statistic           5.169*** (df = 9; 797)   
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01

3. Verifique si la matriz de Varianza-Covarianza del modelo es escalar

Prueba de White (Breusch-Pagan)

library(lmtest)
white_test <- bptest(modelo_cigs, 
                     ~ I(cigpric^2) + I(lcigpric^2) + I(income^2) + 
                       I(lincome^2) + I(age^2) + I(agesq^2) + 
                       I(educ^2) + I(white^2) + I(restaurn^2),
                     data = data)
print(white_test)
## 
##  studentized Breusch-Pagan test
## 
## data:  modelo_cigs
## BP = 26.323, df = 9, p-value = 0.001809

p-value = 0.0018 < 0.05 → HAY heterocedasticidad La matriz Var-Cov NO es escalar por heterocedasticidad

Prueba del Multiplicador de Lagrange (Breusch Godfrey)

Autocorrelación de 2º Orden:

prueba_LM <- bgtest(modelo_cigs, order = 2)
print(prueba_LM)
## 
##  Breusch-Godfrey test for serial correlation of order up to 2
## 
## data:  modelo_cigs
## LM test = 0.26889, df = 2, p-value = 0.8742

p-value = 0.8742 > 0.05 → NO hay autocorrelación de orden 2

Autocorrelación de 1º orden (prueba de Durbin Watson)

library(car)
## Cargando paquete requerido: carData
prueba_DW <- durbinWatsonTest(modelo_cigs)
print(prueba_DW)
##  lag Autocorrelation D-W Statistic p-value
##    1    -0.009243664      2.017442   0.888
##  Alternative hypothesis: rho != 0

p-value = 0.92 > 0.05 → NO hay autocorrelación de orden 1

La matriz de Varianza-Covarianza NO es escalar. La fuente del problema es la heterocedasticidad, detectada mediante la prueba de Breusch-Pagan con p-value = 0.0018 < 0.05.

4. Obtenga el Estimador HAC apropiado y denominelo vcov_HAC

Sin corregir

options(scipen = 9999)
library(lmtest)
#Sin corregir:
coeftest(modelo_cigs)
## 
## t test of coefficients:
## 
##                   Estimate     Std. Error t value     Pr(>|t|)    
## (Intercept)  340.804374604  260.015587269  1.3107     0.190334    
## cigpric        2.002267667    1.492831189  1.3413     0.180220    
## lcigpric    -115.273464445   85.424315195 -1.3494     0.177585    
## income        -0.000046194    0.000133491 -0.3460     0.729402    
## lincome        1.404061178    1.708165841  0.8220     0.411340    
## age            0.778359013    0.160555612  4.8479 0.0000015001 ***
## agesq         -0.009150353    0.001749292 -5.2309 0.0000002158 ***
## educ          -0.494780616    0.168180198 -2.9420     0.003356 ** 
## white         -0.531051635    1.460721806 -0.3636     0.716287    
## restaurn      -2.644241351    1.129998690 -2.3400     0.019528 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Corregido (usando un estimador HAC)

library(lmtest)
library(sandwich)

# Corregido
vcov_HAC <- vcovHC(modelo_cigs, type = "HC0")
coeftest(modelo_cigs, vcov. = vcov_HAC)
## 
## t test of coefficients:
## 
##                   Estimate     Std. Error t value        Pr(>|t|)    
## (Intercept)  340.804374604  278.565072885  1.2234        0.221530    
## cigpric        2.002267667    1.602727983  1.2493        0.211927    
## lcigpric    -115.273464445   91.344424879 -1.2620        0.207331    
## income        -0.000046194    0.000115593 -0.3996        0.689540    
## lincome        1.404061178    1.228970726  1.1425        0.253602    
## age            0.778359013    0.136944678  5.6837 0.0000000184838 ***
## agesq         -0.009150353    0.001451548 -6.3039 0.0000000004804 ***
## educ          -0.494780616    0.162968371 -3.0361        0.002475 ** 
## white         -0.531051635    1.361907703 -0.3899        0.696691    
## restaurn      -2.644241351    1.038254938 -2.5468        0.011058 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5. usando la libreria stargazer, presente el modelo “original” y el modelo “corregido” en una sola tabla (consulte la documentación de stargazer)

library(stargazer)

# Extraer errores estándar corregidos
se_HAC <- sqrt(diag(vcov_HAC))

suppressWarnings(
  stargazer(modelo_cigs, modelo_cigs,
            se = list(NULL, se_HAC),
            column.labels = c("Original", "Corregido"),
            type = "text",
            title = "Modelo Cigarrillos: Original vs HAC Corregido",
            align = TRUE)
)
## 
## Modelo Cigarrillos: Original vs HAC Corregido
## ===========================================================
##                                    Dependent variable:     
##                                ----------------------------
##                                            cigs            
##                                   Original      Corregido  
##                                     (1)            (2)     
## -----------------------------------------------------------
## cigpric                            2.002          2.002    
##                                   (1.493)        (1.603)   
##                                                            
## lcigpric                          -115.273      -115.273   
##                                   (85.424)      (91.344)   
##                                                            
## income                            -0.00005      -0.00005   
##                                   (0.0001)      (0.0001)   
##                                                            
## lincome                            1.404          1.404    
##                                   (1.708)        (1.229)   
##                                                            
## age                               0.778***      0.778***   
##                                   (0.161)        (0.137)   
##                                                            
## agesq                            -0.009***      -0.009***  
##                                   (0.002)        (0.001)   
##                                                            
## educ                             -0.495***      -0.495***  
##                                   (0.168)        (0.163)   
##                                                            
## white                              -0.531        -0.531    
##                                   (1.461)        (1.362)   
##                                                            
## restaurn                          -2.644**      -2.644**   
##                                   (1.130)        (1.038)   
##                                                            
## Constant                          340.804        340.804   
##                                  (260.016)      (278.565)  
##                                                            
## -----------------------------------------------------------
## Observations                        807            807     
## R2                                 0.055          0.055    
## Adjusted R2                        0.044          0.044    
## Residual Std. Error (df = 797)     13.413        13.413    
## F Statistic (df = 9; 797)         5.169***      5.169***   
## ===========================================================
## Note:                           *p<0.1; **p<0.05; ***p<0.01