Panel datasets let us observe the behavior of a group of entities over time. The entities can be firms, states, countries, or individuals. A panel dataset looks like this:

set.seed(12345)
data.frame(Pais = c(1,1,1,2,2,2), Anio = c(2001,2002,2003,2001,2002,2003), Y = rnorm(6), X1 = runif(6), X2 = floor(runif(6,0,100)))
##   Pais Anio          Y          X1 X2
## 1    1 2001  0.5855288 0.735684952 17
## 2    1 2002  0.7094660 0.001136587 95
## 3    1 2003 -0.1093033 0.391203335 45
## 4    2 2001 -0.4534972 0.462494654 32
## 5    2 2002  0.6058875 0.388143982 96
## 6    2 2003 -1.8179560 0.402485142 70
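
Before modelling, it can help to confirm that the data really has this structure: one row per entity and period. The following is a minimal base R sketch of that check on the toy layout above; the names `panel_demo`, `counts`, `no_duplicates` and `is_balanced` are illustrative and not part of the original material.

```r
# Same layout as the example above: one row per (entity, period) pair.
set.seed(12345)
panel_demo <- data.frame(
  Pais = c(1, 1, 1, 2, 2, 2),
  Anio = c(2001, 2002, 2003, 2001, 2002, 2003),
  Y    = rnorm(6)
)

# Cross-tabulate entity against period: each cell counts the rows for that pair.
counts <- table(panel_demo$Pais, panel_demo$Anio)

# A well-formed panel has no duplicated (entity, period) pair ...
no_duplicates <- all(counts <= 1)
# ... and it is balanced when every entity is observed in every period.
is_balanced <- all(counts == 1)

n_entities <- nrow(counts)  # number of entities (n)
n_periods  <- ncol(counts)  # number of periods (T)
```

With real data the same cross-tabulation shows at a glance which entities are missing periods, which is often the first thing to inspect when a panel turns out to be unbalanced.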

To work with panel data, with either fixed or random effects, we will use the plm package (Croissant and Millo, 2008), which provides estimation functions and methods suited to this kind of data. The code below installs foreign if it is missing, loads plm and the other packages used in this section (WDI, wbstats, tidyverse, gplots), and downloads World Bank indicators to build the panels. Packages other than foreign must already be installed, for example with install.packages("plm"):

if(!require('foreign')){
    install.packages("foreign")
}
## Loading required package: foreign
library("foreign")
library(WDI)
library(wbstats)
library(tidyverse)
## Warning: package 'lubridate' was built under R version 4.3.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.5.0     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
gob_data <- wb_data(country = c("MX","EC","Ca"), indicator = "NY.GDP.PCAP.CD", start_date=2013, end_date=2023)
panel <- select(gob_data, country, date, NY.GDP.PCAP.CD)
panel
## # A tibble: 30 × 3
##    country  date NY.GDP.PCAP.CD
##    <chr>   <dbl>          <dbl>
##  1 Canada   2013         52635.
##  2 Canada   2014         50956.
##  3 Canada   2015         43596.
##  4 Canada   2016         42316.
##  5 Canada   2017         45129.
##  6 Canada   2018         46549.
##  7 Canada   2019         46374.
##  8 Canada   2020         43562.
##  9 Canada   2021         52515.
## 10 Canada   2022         55522.
## # ℹ 20 more rows
library(gplots)
## 
## Attaching package: 'gplots'
## The following object is masked from 'package:stats':
## 
##     lowess
library(plm)
## 
## Attaching package: 'plm'
## The following objects are masked from 'package:dplyr':
## 
##     between, lag, lead
eco_data<-wb_data(country = c("AR","US","MX","CN"), indicator = c("NY.GDP.MKTP.KD.ZG","SL.UEM.TOTL.ZS",
"AG.LND.AGRI.ZS", "AG.LND.ARBL.ZS", "EG.ELC.ACCS.ZS", "SP.POP.GROW", "GB.XPD.RSDV.GD.ZS"))
# Keep four reference years and declare the panel structure (entity, time).
# eco_data is subset directly because the models below use every indicator.
panel <- subset(eco_data, date %in% c(2000, 2010, 2015, 2020))
panel <- pdata.frame(panel, index=c("country", "date"))
panel                     
##                    iso2c iso3c       country date AG.LND.AGRI.ZS AG.LND.ARBL.ZS
## Argentina-2000        AR   ARG     Argentina 2000       46.95819       10.09979
## Argentina-2010        AR   ARG     Argentina 2010       46.13891       13.88652
## Argentina-2015        AR   ARG     Argentina 2015       44.11877       14.74011
## Argentina-2020        AR   ARG     Argentina 2020       43.02927       15.35021
## China-2000            CN   CHN         China 2000       55.69458       12.67972
## China-2010            CN   CHN         China 2010       56.18347       12.79608
## China-2015            CN   CHN         China 2015       55.78313       12.23564
## China-2020            CN   CHN         China 2020       55.46265       11.60626
## Mexico-2000           MX   MEX        Mexico 2000       54.69791       11.78271
## Mexico-2010           MX   MEX        Mexico 2010       52.37120       11.79711
## Mexico-2015           MX   MEX        Mexico 2015       50.76262       11.34186
## Mexico-2020           MX   MEX        Mexico 2020       49.96939       10.32485
## United States-2000    US   USA United States 2000       45.23058       19.14097
## United States-2010    US   USA United States 2010       44.49251       17.24164
## United States-2015    US   USA United States 2015       44.24403       17.12451
## United States-2020    US   USA United States 2020       44.36337       17.24386
##                    EG.ELC.ACCS.ZS GB.XPD.RSDV.GD.ZS NY.GDP.MKTP.KD.ZG
## Argentina-2000           95.68047           0.43884        -0.7889989
## Argentina-2010           98.82000           0.56104        10.1253982
## Argentina-2015           99.68903           0.62262         2.7311598
## Argentina-2020          100.00000           0.54154        -9.9004848
## China-2000               96.74506           0.89316         8.4900934
## China-2010               99.70000           1.71372        10.6358711
## China-2015              100.00000           2.05701         7.0413289
## China-2020              100.00000           2.40666         2.2386384
## Mexico-2000              98.00713           0.30613         5.0292840
## Mexico-2010              99.23670           0.49485         4.9713346
## Mexico-2015              99.00000           0.42943         2.7023234
## Mexico-2020              99.40000           0.29638        -8.6515868
## United States-2000      100.00000           2.61984         4.0771595
## United States-2010      100.00000           2.71445         2.7088567
## United States-2015      100.00000           2.78700         2.7063696
## United States-2020      100.00000           3.46777        -2.7678025
##                    SL.UEM.TOTL.ZS SP.POP.GROW
## Argentina-2000             15.000   1.1332770
## Argentina-2010              7.710   0.2555824
## Argentina-2015              7.583   1.0780013
## Argentina-2020             11.460   0.9700540
## China-2000                  3.260   0.7879566
## China-2010                  4.530   0.4829597
## China-2015                  4.650   0.5814561
## China-2020                  5.000   0.2380409
## Mexico-2000                 2.650   1.5845508
## Mexico-2010                 5.300   1.3265790
## Mexico-2015                 4.310   1.1670088
## Mexico-2020                 4.440   0.7272438
## United States-2000          3.990   1.1127690
## United States-2010          9.630   0.8296167
## United States-2015          5.280   0.7362173
## United States-2020          8.050   0.9643479
# Heterogeneity plot by country (mean GDP growth per country)
plotmeans(panel$NY.GDP.MKTP.KD.ZG ~ panel$country,xlab="Country",ylab="GDP growth (%)",
             mean.labels=TRUE, digits=-3,
             col="red",connect=TRUE, main="Heterogeneity across countries")

# Heterogeneity plot by year (mean GDP growth per year)
plotmeans(panel$NY.GDP.MKTP.KD.ZG ~ panel$date,xlab="Year",ylab="GDP growth (%)",
             mean.labels=TRUE, digits=-3,
             col="red",connect=TRUE, main="Heterogeneity across years")

Model

pooled<-plm(NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS+
AG.LND.AGRI.ZS+AG.LND.ARBL.ZS+EG.ELC.ACCS.ZS+SP.POP.GROW+GB.XPD.RSDV.GD.ZS , data=panel ,model = "pooling")

summary(pooled)
## Pooling Model
## 
## Call:
## plm(formula = NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS + AG.LND.AGRI.ZS + 
##     AG.LND.ARBL.ZS + EG.ELC.ACCS.ZS + SP.POP.GROW + GB.XPD.RSDV.GD.ZS, 
##     data = panel, model = "pooling")
## 
## Balanced Panel: n = 4, T = 4, N = 16
## 
## Residuals:
##    Min. 1st Qu.  Median 3rd Qu.    Max. 
## -8.1043 -1.9711  1.2967  2.3740  5.8281 
## 
## Coefficients:
##                    Estimate Std. Error t-value Pr(>|t|)
## (Intercept)       108.49114  161.77765  0.6706   0.5193
## SL.UEM.TOTL.ZS     -0.11244    0.70155 -0.1603   0.8762
## AG.LND.AGRI.ZS      1.07183    0.65102  1.6464   0.1341
## AG.LND.ARBL.ZS      2.14456    1.21025  1.7720   0.1102
## EG.ELC.ACCS.ZS     -1.82557    1.44565 -1.2628   0.2384
## SP.POP.GROW        -4.69836    4.18512 -1.1226   0.2906
## GB.XPD.RSDV.GD.ZS  -1.70636    2.19119 -0.7787   0.4561
## 
## Total Sum of Squares:    512.67
## Residual Sum of Squares: 236.86
## R-Squared:      0.53799
## Adj. R-Squared: 0.22998
## F-statistic: 1.74666 on 6 and 9 DF, p-value: 0.21687
within<-plm(NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS+
AG.LND.AGRI.ZS+AG.LND.ARBL.ZS+EG.ELC.ACCS.ZS+SP.POP.GROW+GB.XPD.RSDV.GD.ZS , data=panel ,model = "within")

summary(within)
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS + AG.LND.AGRI.ZS + 
##     AG.LND.ARBL.ZS + EG.ELC.ACCS.ZS + SP.POP.GROW + GB.XPD.RSDV.GD.ZS, 
##     data = panel, model = "within")
## 
## Balanced Panel: n = 4, T = 4, N = 16
## 
## Residuals:
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -4.98771 -2.38596  0.41446  1.92653  5.64725 
## 
## Coefficients:
##                   Estimate Std. Error t-value Pr(>|t|)  
## SL.UEM.TOTL.ZS    -0.60572    0.97119 -0.6237  0.55579  
## AG.LND.AGRI.ZS     3.29170    1.43542  2.2932  0.06167 .
## AG.LND.ARBL.ZS     0.47688    2.68288  0.1777  0.86477  
## EG.ELC.ACCS.ZS     0.75577    3.26340  0.2316  0.82455  
## SP.POP.GROW       -3.77042    7.00843 -0.5380  0.60995  
## GB.XPD.RSDV.GD.ZS -3.72381    7.02411 -0.5301  0.61505  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    401.22
## Residual Sum of Squares: 144.53
## R-Squared:      0.63977
## Adj. R-Squared: 0.099435
## F-statistic: 1.77603 on 6 and 6 DF, p-value: 0.25126
walhus<-plm(NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS+
AG.LND.AGRI.ZS+AG.LND.ARBL.ZS+EG.ELC.ACCS.ZS+SP.POP.GROW+GB.XPD.RSDV.GD.ZS , data=panel ,model = "random", random.method = "walhus")

summary(walhus)
## Oneway (individual) effect Random Effect Model 
##    (Wallace-Hussain's transformation)
## 
## Call:
## plm(formula = NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS + AG.LND.AGRI.ZS + 
##     AG.LND.ARBL.ZS + EG.ELC.ACCS.ZS + SP.POP.GROW + GB.XPD.RSDV.GD.ZS, 
##     data = panel, model = "random", random.method = "walhus")
## 
## Balanced Panel: n = 4, T = 4, N = 16
## 
## Effects:
##                  var std.dev share
## idiosyncratic 19.349   4.399     1
## individual     0.000   0.000     0
## theta: 0
## 
## Residuals:
##    Min. 1st Qu.  Median 3rd Qu.    Max. 
## -8.1043 -1.9711  1.2967  2.3740  5.8281 
## 
## Coefficients:
##                    Estimate Std. Error z-value Pr(>|z|)  
## (Intercept)       108.49114  161.77765  0.6706  0.50246  
## SL.UEM.TOTL.ZS     -0.11244    0.70155 -0.1603  0.87267  
## AG.LND.AGRI.ZS      1.07183    0.65102  1.6464  0.09968 .
## AG.LND.ARBL.ZS      2.14456    1.21025  1.7720  0.07640 .
## EG.ELC.ACCS.ZS     -1.82557    1.44565 -1.2628  0.20666  
## SP.POP.GROW        -4.69836    4.18512 -1.1226  0.26159  
## GB.XPD.RSDV.GD.ZS  -1.70636    2.19119 -0.7787  0.43614  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    512.67
## Residual Sum of Squares: 236.86
## R-Squared:      0.53799
## Adj. R-Squared: 0.22998
## Chisq: 10.4799 on 6 DF, p-value: 0.10584
nerlove<-plm(NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS+
AG.LND.AGRI.ZS+AG.LND.ARBL.ZS+EG.ELC.ACCS.ZS+SP.POP.GROW+GB.XPD.RSDV.GD.ZS , data=panel ,model = "random", random.method = "nerlove")

summary(nerlove)
## Oneway (individual) effect Random Effect Model 
##    (Nerlove's transformation)
## 
## Call:
## plm(formula = NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS + AG.LND.AGRI.ZS + 
##     AG.LND.ARBL.ZS + EG.ELC.ACCS.ZS + SP.POP.GROW + GB.XPD.RSDV.GD.ZS, 
##     data = panel, model = "random", random.method = "nerlove")
## 
## Balanced Panel: n = 4, T = 4, N = 16
## 
## Effects:
##                   var std.dev share
## idiosyncratic   9.033   3.006 0.028
## individual    308.516  17.565 0.972
## theta: 0.9148
## 
## Residuals:
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -5.44445 -2.22648  0.45774  2.91280  4.41861 
## 
## Coefficients:
##                    Estimate Std. Error z-value Pr(>|z|)  
## (Intercept)       -93.54726  235.48687 -0.3973  0.69118  
## SL.UEM.TOTL.ZS     -0.47168    0.82125 -0.5743  0.56574  
## AG.LND.AGRI.ZS      2.77116    1.11836  2.4779  0.01322 *
## AG.LND.ARBL.ZS      1.39124    1.94063  0.7169  0.47343  
## EG.ELC.ACCS.ZS     -0.50775    2.27171 -0.2235  0.82314  
## SP.POP.GROW        -4.94041    5.70743 -0.8656  0.38670  
## GB.XPD.RSDV.GD.ZS  -1.43001    4.94935 -0.2889  0.77264  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    402.03
## Residual Sum of Squares: 162.37
## R-Squared:      0.59612
## Adj. R-Squared: 0.32687
## Chisq: 13.2839 on 6 DF, p-value: 0.038742
# Libraries
#-----------------------------------
library(plm)
library(car)
## Loading required package: carData
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode
## The following object is masked from 'package:purrr':
## 
##     some
#library(ggstatsplot)
library(gplots)
library(foreign)
library(lmtest)
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
# Stock and Watson traffic fatality panel (48 US states, 1982-1988)
imp_alc_accide_trans <- read.dta('http://fmwww.bc.edu/ec-p/data/stockwatson/fatality.dta')
# Fatality rate: deaths per inhabitant
imp_alc_accide_trans$tasa_mort <- imp_alc_accide_trans$allmort/imp_alc_accide_trans$pop
# Log of per capita personal income
imp_alc_accide_trans$logpbi <- log(imp_alc_accide_trans$perinc)
# Minimum legal drinking age, rounded down and limited to the range 18-21
imp_alc_accide_trans$edadminima <- pmin(pmax(floor(imp_alc_accide_trans$mlda), 18), 21)
reg_imp_mort_82 <- lm(tasa_mort ~ beertax, 
                      data = imp_alc_accide_trans[imp_alc_accide_trans$year==1982,])
reg_imp_mort_88 <- lm(tasa_mort ~ beertax, 
                      data = imp_alc_accide_trans[imp_alc_accide_trans$year==1988,])
summary(reg_imp_mort_82)
## 
## Call:
## lm(formula = tasa_mort ~ beertax, data = imp_alc_accide_trans[imp_alc_accide_trans$year == 
##     1982, ])
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -9.356e-05 -4.480e-05 -1.068e-05  2.295e-05  2.172e-04 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 2.010e-04  1.391e-05  14.455   <2e-16 ***
## beertax     1.485e-05  1.884e-05   0.788    0.435    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.705e-05 on 46 degrees of freedom
## Multiple R-squared:  0.01332,    Adjusted R-squared:  -0.008126 
## F-statistic: 0.6212 on 1 and 46 DF,  p-value: 0.4347
summary(reg_imp_mort_88)
## 
## Call:
## lm(formula = tasa_mort ~ beertax, data = imp_alc_accide_trans[imp_alc_accide_trans$year == 
##     1988, ])
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -7.293e-05 -3.603e-05 -7.132e-06  3.994e-05  1.358e-04 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.859e-04  1.060e-05  17.540   <2e-16 ***
## beertax     4.388e-05  1.645e-05   2.668   0.0105 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.903e-05 on 46 degrees of freedom
## Multiple R-squared:  0.134,  Adjusted R-squared:  0.1152 
## F-statistic: 7.118 on 1 and 46 DF,  p-value: 0.0105
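# Caution: each difference below has 48 values (one per state), but it is assigned
# to a 336-row data frame, so R recycles it 7 times. The slope is unaffected, but
# the regression reports 334 residual degrees of freedom instead of 46, which
# makes its standard errors too small.
imp_alc_accide_trans$dif_mort_88_82 <- imp_alc_accide_trans$tasa_mort[imp_alc_accide_trans$year==1988] - 
  imp_alc_accide_trans$tasa_mort[imp_alc_accide_trans$year==1982]
imp_alc_accide_trans$dif_imp_88_82  <- imp_alc_accide_trans$beertax[imp_alc_accide_trans$year==1988] - 
  imp_alc_accide_trans$beertax[imp_alc_accide_trans$year==1982]
reg_dif_88_82 <- lm(dif_mort_88_82 ~ dif_imp_88_82,
                    data = imp_alc_accide_trans)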
summary(reg_dif_88_82)
## 
## Call:
## lm(formula = dif_mort_88_82 ~ dif_imp_88_82, data = imp_alc_accide_trans)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -1.227e-04 -9.619e-06  9.212e-06  2.229e-05  6.774e-05 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -7.204e-06  2.251e-06  -3.201   0.0015 ** 
## dif_imp_88_82 -1.041e-04  1.548e-05  -6.723 7.72e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.869e-05 on 334 degrees of freedom
## Multiple R-squared:  0.1192, Adjusted R-squared:  0.1166 
## F-statistic:  45.2 on 1 and 334 DF,  p-value: 7.719e-11
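
A first-difference regression between two years uses one observation per entity, so with 48 states it should leave 46 residual degrees of freedom. The sketch below illustrates this on simulated data (not the fatality file); every name in it (`alpha`, `tax_82`, `tax_88`, `dif`, `reg_dif`) and the true slope of -0.5 are assumptions made only for the illustration.

```r
set.seed(1)
n_states <- 48

# Simulated two-period panel (hypothetical data).
# alpha is a time-invariant state effect that is correlated with the tax.
alpha  <- rnorm(n_states)
tax_82 <- 0.5 * alpha + rnorm(n_states)
tax_88 <- tax_82 + rnorm(n_states, sd = 0.5)
y_82   <- alpha - 0.5 * tax_82 + rnorm(n_states, sd = 0.1)
y_88   <- alpha - 0.5 * tax_88 + rnorm(n_states, sd = 0.1)

# One row per state: taking differences removes alpha from the equation.
dif <- data.frame(d_y = y_88 - y_82, d_tax = tax_88 - tax_82)
reg_dif <- lm(d_y ~ d_tax, data = dif)

df.residual(reg_dif)     # 46 = 48 states - 2 estimated coefficients
coef(reg_dif)[["d_tax"]] # estimate of the slope on the differenced tax
```

Storing the differences in their own data frame, rather than as columns of the full panel, avoids any silent recycling of the shorter vectors.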

Fixed effects

reg_imp_mort_fe <-plm(tasa_mort ~ beertax, 
                      data=imp_alc_accide_trans,
                      model = 'within')
summary(reg_imp_mort_fe)
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = tasa_mort ~ beertax, data = imp_alc_accide_trans, 
##     model = "within")
## 
## Balanced Panel: n = 48, T = 7, N = 336
## 
## Residuals:
##        Min.     1st Qu.      Median     3rd Qu.        Max. 
## -5.8696e-05 -8.2838e-06 -1.2701e-07  7.9545e-06  8.9780e-05 
## 
## Coefficients:
##            Estimate  Std. Error t-value Pr(>|t|)    
## beertax -6.5587e-05  1.8785e-05 -3.4915 0.000556 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    1.0785e-07
## Residual Sum of Squares: 1.0345e-07
## R-Squared:      0.040745
## Adj. R-Squared: -0.11969
## F-statistic: 12.1904 on 1 and 287 DF, p-value: 0.00055597
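
The "within" label refers to the transformation behind the estimator: each variable is expressed as a deviation from its entity mean, which removes any time-invariant entity effect. As a minimal sketch on simulated data (all names and parameter values here are illustrative assumptions), the slope from the demeaned regression coincides with the slope from a regression that includes one dummy per entity:

```r
set.seed(42)
n <- 5        # entities
periods <- 4  # time periods per entity
id <- rep(seq_len(n), each = periods)

# Entity effect correlated with the regressor; true slope is 1.5.
alpha <- rnorm(n, sd = 2)[id]
x <- alpha + rnorm(n * periods)
y <- alpha + 1.5 * x + rnorm(n * periods)

# Within transformation: subtract each entity's own mean (ave() defaults to mean).
x_w <- x - ave(x, id)
y_w <- y - ave(y, id)
b_within <- coef(lm(y_w ~ x_w - 1))[["x_w"]]

# Least squares with one dummy variable per entity (LSDV).
b_lsdv <- coef(lm(y ~ x + factor(id) - 1))[["x"]]

all.equal(b_within, b_lsdv)  # the two slopes agree up to numerical precision
```

The point estimates match, but the naive `lm` on demeaned data does not account for the n estimated entity means in its degrees of freedom, which is one reason to rely on a dedicated panel routine for inference.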

Next we run the pooling test, which requires estimating the regression over the whole study period:

reg_imp_mort <- lm(tasa_mort ~ beertax, 
                   data = imp_alc_accide_trans)
pFtest(reg_imp_mort_fe, reg_imp_mort)
## 
##  F test for individual effects
## 
## data:  tasa_mort ~ beertax
## F = 52.179, df1 = 47, df2 = 287, p-value < 2.2e-16
## alternative hypothesis: significant effects

The p-value of the pooling test is below 5%, so we reject the null hypothesis that a single pooled regression is adequate: the state-specific effects are jointly significant. The panel structure therefore matters, and we should estimate panel data models.
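
The F statistic of this test compares the residual sum of squares of the pooled regression with that of the fixed effects regression, using n - 1 and N - n - k degrees of freedom (47 and 287 in the output above, with n = 48, N = 336 and k = 1). The sketch below reproduces the calculation by hand on simulated data; the variable names and the simulated design are assumptions for illustration only.

```r
set.seed(7)
n <- 6; periods <- 5
N <- n * periods
k <- 1  # number of slope coefficients
id <- rep(seq_len(n), each = periods)

x <- rnorm(N)
y <- rnorm(n, sd = 2)[id] + 0.8 * x + rnorm(N)

pooled_fit <- lm(y ~ x)               # restricted: common intercept
lsdv_fit   <- lm(y ~ x + factor(id))  # unrestricted: one intercept per entity

rss_pooled <- sum(resid(pooled_fit)^2)
rss_fe     <- sum(resid(lsdv_fit)^2)

df1 <- n - 1      # restrictions: all entity intercepts equal
df2 <- N - n - k  # residual degrees of freedom of the unrestricted model
f_stat  <- ((rss_pooled - rss_fe) / df1) / (rss_fe / df2)
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)
```

The same number is returned by `anova(pooled_fit, lsdv_fit)`, which is a convenient cross-check.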

reg_imp_mort_re <- plm(tasa_mort ~ beertax, 
                       data=imp_alc_accide_trans, 
                       index=c("state", "year"), 
                       model="random")
summary(reg_imp_mort_re)
## Oneway (individual) effect Random Effect Model 
##    (Swamy-Arora's transformation)
## 
## Call:
## plm(formula = tasa_mort ~ beertax, data = imp_alc_accide_trans, 
##     model = "random", index = c("state", "year"))
## 
## Balanced Panel: n = 48, T = 7, N = 336
## 
## Effects:
##                     var   std.dev share
## idiosyncratic 3.605e-10 1.899e-05 0.119
## individual    2.660e-09 5.158e-05 0.881
## theta: 0.8622
## 
## Residuals:
##        Min.     1st Qu.      Median     3rd Qu.        Max. 
## -4.7109e-05 -1.2005e-05 -2.1530e-06  9.1011e-06  9.6435e-05 
## 
## Coefficients:
##                Estimate  Std. Error z-value Pr(>|z|)    
## (Intercept)  2.0671e-04  9.9971e-06 20.6773   <2e-16 ***
## beertax     -5.2016e-06  1.2418e-05 -0.4189   0.6753    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    1.2648e-07
## Residual Sum of Squares: 1.2642e-07
## R-Squared:      0.00052508
## Adj. R-Squared: -0.0024674
## Chisq: 0.175467 on 1 DF, p-value: 0.6753

Hausman test (to decide between fixed and random effects)

phtest(reg_imp_mort_fe, reg_imp_mort_re)
## 
##  Hausman Test
## 
## data:  tasa_mort ~ beertax
## chisq = 18.353, df = 1, p-value = 1.835e-05
## alternative hypothesis: one model is inconsistent

p-value = 1.835e-05 < 0.05, so we reject H0: the consistent estimator is the fixed effects (FE) one.
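
With a single regressor, the Hausman statistic reduces to the squared difference between the two slope estimates divided by the difference of their variances. As a rough check, the sketch below plugs in the rounded coefficients and standard errors printed in the two summaries above; the variable names are illustrative, and small discrepancies from the reported statistic come from rounding.

```r
# Estimates and standard errors copied from the summaries above.
b_fe  <- -6.5587e-05; se_fe <- 1.8785e-05  # fixed effects (within)
b_re  <- -5.2016e-06; se_re <- 1.2418e-05  # random effects (Swamy-Arora)

# Hausman statistic for one coefficient: chi-squared with 1 degree of freedom.
hausman <- (b_fe - b_re)^2 / (se_fe^2 - se_re^2)
p_value <- pchisq(hausman, df = 1, lower.tail = FALSE)

round(hausman, 2)  # close to the chisq = 18.353 reported by phtest
```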

We estimate the fixed effect of each cross-sectional unit

summary(fixef(reg_imp_mort_fe))
##      Estimate Std. Error t-value  Pr(>|t|)    
## AL 3.4776e-04 3.1336e-05 11.0980 < 2.2e-16 ***
## AZ 2.9099e-04 9.2539e-06 31.4452 < 2.2e-16 ***
## AR 2.8227e-04 1.3212e-05 21.3636 < 2.2e-16 ***
## CA 1.9682e-04 7.4007e-06 26.5943 < 2.2e-16 ***
## CO 1.9934e-04 8.0371e-06 24.8019 < 2.2e-16 ***
## CT 1.6154e-04 8.3913e-06 19.2506 < 2.2e-16 ***
## DE 2.1700e-04 7.7457e-06 28.0159 < 2.2e-16 ***
## FL 3.2095e-04 2.2151e-05 14.4890 < 2.2e-16 ***
## GA 4.0022e-04 4.6403e-05  8.6249 4.435e-16 ***
## ID 2.8086e-04 9.8767e-06 28.4368 < 2.2e-16 ***
## IL 1.5160e-04 7.8478e-06 19.3176 < 2.2e-16 ***
## IN 2.0161e-04 8.8672e-06 22.7364 < 2.2e-16 ***
## IA 1.9337e-04 1.0222e-05 18.9176 < 2.2e-16 ***
## KS 2.2544e-04 1.0863e-05 20.7528 < 2.2e-16 ***
## KY 2.2601e-04 8.0462e-06 28.0893 < 2.2e-16 ***
## LA 2.6305e-04 1.6266e-05 16.1714 < 2.2e-16 ***
## ME 2.3697e-04 1.6006e-05 14.8045 < 2.2e-16 ***
## MD 1.7712e-04 8.2458e-06 21.4800 < 2.2e-16 ***
## MA 1.3679e-04 8.6477e-06 15.8178 < 2.2e-16 ***
## MI 1.9931e-04 1.1663e-05 17.0888 < 2.2e-16 ***
## MN 1.5804e-04 9.3628e-06 16.8797 < 2.2e-16 ***
## MS 3.4485e-04 2.0936e-05 16.4717 < 2.2e-16 ***
## MO 2.1814e-04 9.2523e-06 23.5764 < 2.2e-16 ***
## MT 3.1172e-04 9.4413e-06 33.0170 < 2.2e-16 ***
## NE 1.9555e-04 1.0551e-05 18.5342 < 2.2e-16 ***
## NV 2.8769e-04 8.1056e-06 35.4922 < 2.2e-16 ***
## NH 2.2232e-04 1.4114e-05 15.7512 < 2.2e-16 ***
## NJ 1.3719e-04 7.3328e-06 18.7089 < 2.2e-16 ***
## NM 3.9040e-04 1.0154e-05 38.4492 < 2.2e-16 ***
## NY 1.2910e-04 7.5629e-06 17.0696 < 2.2e-16 ***
## NC 3.1872e-04 2.5173e-05 12.6609 < 2.2e-16 ***
## ND 1.8542e-04 1.0193e-05 18.1912 < 2.2e-16 ***
## OH 1.8032e-04 1.0193e-05 17.6910 < 2.2e-16 ***
## OK 2.9326e-04 1.8429e-05 15.9133 < 2.2e-16 ***
## OR 2.3096e-04 8.1175e-06 28.4526 < 2.2e-16 ***
## PA 1.7102e-04 8.6477e-06 19.7759 < 2.2e-16 ***
## RI 1.2126e-04 7.7533e-06 15.6395 < 2.2e-16 ***
## SC 4.0348e-04 3.5479e-05 11.3724 < 2.2e-16 ***
## SD 2.4739e-04 1.4121e-05 17.5195 < 2.2e-16 ***
## TN 2.6020e-04 9.1624e-06 28.3983 < 2.2e-16 ***
## TX 2.5602e-04 1.0853e-05 23.5889 < 2.2e-16 ***
## UT 2.3137e-04 1.5453e-05 14.9721 < 2.2e-16 ***
## VT 2.5116e-04 1.3973e-05 17.9751 < 2.2e-16 ***
## VA 2.1874e-04 1.4664e-05 14.9170 < 2.2e-16 ***
## WA 1.8181e-04 8.2328e-06 22.0836 < 2.2e-16 ***
## WV 2.5809e-04 1.0767e-05 23.9707 < 2.2e-16 ***
## WI 1.7184e-04 7.7457e-06 22.1848 < 2.2e-16 ***
## WY 3.2491e-04 7.2328e-06 44.9219 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Fixed effects as deviations from their mean

summary(fixef(reg_imp_mort_fe, type ="dmean"))
##       Estimate  Std. Error  t-value  Pr(>|t|)    
## AL  1.1006e-04  3.1336e-05   3.5121 0.0005161 ***
## AZ  5.3283e-05  9.2539e-06   5.7579 2.188e-08 ***
## AR  4.4560e-05  1.3213e-05   3.3726 0.0008470 ***
## CA -4.0891e-05  7.4007e-06  -5.5254 7.380e-08 ***
## CO -3.8373e-05  8.0371e-06  -4.7744 2.876e-06 ***
## CT -7.6170e-05  8.3913e-06  -9.0773 < 2.2e-16 ***
## DE -2.0705e-05  7.7457e-06  -2.6731 0.0079465 ** 
## FL  8.3242e-05  2.2151e-05   3.7579 0.0002076 ***
## GA  1.6252e-04  4.6403e-05   3.5023 0.0005348 ***
## ID  4.3153e-05  9.8767e-06   4.3692 1.744e-05 ***
## IL -8.6107e-05  7.8478e-06 -10.9721 < 2.2e-16 ***
## IN -3.6099e-05  8.8672e-06  -4.0710 6.058e-05 ***
## IA -4.4338e-05  1.0222e-05  -4.3376 1.997e-05 ***
## KS -1.2266e-05  1.0863e-05  -1.1291 0.2597802    
## KY -1.1696e-05  8.0462e-06  -1.4536 0.1471402    
## LA  2.5344e-05  1.6266e-05   1.5581 0.1203234    
## ME -7.3922e-07  1.6006e-05  -0.0462 0.9631970    
## MD -6.0588e-05  8.2458e-06  -7.3478 2.102e-12 ***
## MA -1.0092e-04  8.6477e-06 -11.6700 < 2.2e-16 ***
## MI -3.8397e-05  1.1663e-05  -3.2922 0.0011185 ** 
## MN -7.9666e-05  9.3628e-06  -8.5087 9.901e-16 ***
## MS  1.0715e-04  2.0936e-05   5.1178 5.669e-07 ***
## MO -1.9571e-05  9.2523e-06  -2.1152 0.0352734 *  
## MT  7.4016e-05  9.4413e-06   7.8396 8.909e-14 ***
## NE -4.2162e-05  1.0551e-05  -3.9962 8.188e-05 ***
## NV  4.9978e-05  8.1056e-06   6.1659 2.373e-09 ***
## NH -1.5390e-05  1.4114e-05  -1.0904 0.2764610    
## NJ -1.0052e-04  7.3328e-06 -13.7083 < 2.2e-16 ***
## NM  1.5269e-04  1.0154e-05  15.0382 < 2.2e-16 ***
## NY -1.0861e-04  7.5629e-06 -14.3610 < 2.2e-16 ***
## NC  8.1009e-05  2.5173e-05   3.2180 0.0014386 ** 
## ND -5.2288e-05  1.0193e-05  -5.1299 5.345e-07 ***
## OH -5.7386e-05  1.0193e-05  -5.6301 4.288e-08 ***
## OK  5.5549e-05  1.8428e-05   3.0143 0.0028056 ** 
## OR -6.7445e-06  8.1175e-06  -0.8309 0.4067439    
## PA -6.6691e-05  8.6477e-06  -7.7120 2.049e-13 ***
## RI -1.1645e-04  7.7533e-06 -15.0194 < 2.2e-16 ***
## SC  1.6577e-04  3.5479e-05   4.6724 4.582e-06 ***
## SD  9.6834e-06  1.4121e-05   0.6858 0.4934227    
## TN  2.2490e-05  9.1624e-06   2.4546 0.0146996 *  
## TX  1.8308e-05  1.0853e-05   1.6869 0.0927111 .  
## UT -6.3395e-06  1.5453e-05  -0.4102 0.6819363    
## VT  1.3451e-05  1.3973e-05   0.9627 0.3365174    
## VA -1.8963e-05  1.4664e-05  -1.2931 0.1970014    
## WA -5.5897e-05  8.2328e-06  -6.7895 6.460e-11 ***
## WV  2.0380e-05  1.0767e-05   1.8929 0.0593813 .  
## WI -6.5871e-05  7.7457e-06  -8.5042 1.021e-15 ***
## WY  8.7205e-05  7.2328e-06  12.0568 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Breusch-Godfrey/Wooldridge test for serial correlation

pbgtest(reg_imp_mort_fe)
## 
##  Breusch-Godfrey/Wooldridge test for serial correlation in panel models
## 
## data:  tasa_mort ~ beertax
## chisq = 50.668, df = 7, p-value = 1.068e-08
## alternative hypothesis: serial correlation in idiosyncratic errors

p-value = 1.068e-08 < 0.05, so we reject H0: the errors are serially correlated.

The standard errors therefore need to be corrected. Serial correlation does not change the estimated coefficients, but it does change their standard errors and hence their significance. The call below passes plm's vcovHC to coeftest; with its default settings this is the Arellano estimator clustered by entity, which is robust to heteroskedasticity and to serial correlation within each state (a Newey-West estimator is available separately as vcovNW). In R we run:

coeftest(reg_imp_mort_fe, vcov. = vcovHC)
## 
## t test of coefficients:
## 
##            Estimate  Std. Error t value Pr(>|t|)  
## beertax -6.5587e-05  2.8837e-05 -2.2744  0.02368 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The beer tax is significant at the 5% level.
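
To make the idea of a cluster-robust correction concrete, the sketch below computes the textbook cluster-by-entity sandwich standard error for a within regression with one regressor, on simulated data with AR(1) errors. It is only an illustration of the formula under stated assumptions: the helper `ar1`, all variable names and the parameter values are hypothetical, and plm may apply options not reproduced here.

```r
set.seed(3)
n <- 30; periods <- 7
id <- rep(seq_len(n), each = periods)

# Hypothetical helper: AR(1) errors for one entity.
ar1 <- function(len, rho) {
  e <- rnorm(len)
  for (t in 2:len) e[t] <- rho * e[t - 1] + e[t]
  e
}

x <- rnorm(n * periods) + rnorm(n)[id]
u <- unlist(lapply(seq_len(n), function(i) ar1(periods, 0.7)))
y <- rnorm(n)[id] - 1 * x + u  # true slope is -1

# Within transformation and slope estimate.
x_w <- x - ave(x, id)
y_w <- y - ave(y, id)
b   <- sum(x_w * y_w) / sum(x_w^2)
res <- y_w - b * x_w

# Classical standard error (assumes independent, homoskedastic errors).
se_classic <- sqrt(sum(res^2) / (n * periods - n - 1) / sum(x_w^2))

# Cluster-robust standard error: sum the scores x * residual within each
# entity first, so that correlation over time inside an entity is allowed.
score_by_id <- tapply(x_w * res, id, sum)
se_cluster  <- sqrt(sum(score_by_id^2)) / sum(x_w^2)
```

Comparing `se_classic` with `se_cluster` shows how much the assumed error structure matters for inference while the point estimate `b` stays the same.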

Cross-sectional dependence test

pcdtest(reg_imp_mort_fe)
## 
##  Pesaran CD test for cross-sectional dependence in panels
## 
## data:  tasa_mort ~ beertax
## z = 5.436, p-value = 5.449e-08
## alternative hypothesis: cross-sectional dependence

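p-value = 5.449e-08 < 0.05, so we reject H0 of cross-sectional independence: the errors of different states are correlated. The standard errors should therefore also be robust to this dependence, first by clustering on time and then with the Driscoll-Kraay estimator (vcovSCC):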

coeftest(reg_imp_mort_fe, vcov. = vcovHC, cluster = 'time')
## 
## t test of coefficients:
## 
##            Estimate  Std. Error t value  Pr(>|t|)    
## beertax -6.5587e-05  9.4573e-06 -6.9351 2.691e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The beer tax remains significant at the 5% level.

coeftest(reg_imp_mort_fe, vcov. = vcovSCC)
## 
## t test of coefficients:
## 
##            Estimate  Std. Error t value  Pr(>|t|)    
## beertax -6.5587e-05  1.0628e-05 -6.1712 2.303e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Panel <- read.dta("http://dss.princeton.edu/training/Panel101.dta")
colnames(Panel)
## [1] "country" "year"    "y"       "y_bin"   "x1"      "x2"      "x3"     
## [8] "opinion" "op"

OLS regression

ols <- lm(y ~x1, data = Panel)
summary(ols)
## 
## Call:
## lm(formula = y ~ x1, data = Panel)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -9.546e+09 -1.578e+09  1.554e+08  1.422e+09  7.183e+09 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 1.524e+09  6.211e+08   2.454   0.0167 *
## x1          4.950e+08  7.789e+08   0.636   0.5272  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.028e+09 on 68 degrees of freedom
## Multiple R-squared:  0.005905,   Adjusted R-squared:  -0.008714 
## F-statistic: 0.4039 on 1 and 68 DF,  p-value: 0.5272

Fixed effects

pan_fe <-plm(y ~ x1, data=Panel, model = 'within')
summary(pan_fe)
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = y ~ x1, data = Panel, model = "within")
## 
## Balanced Panel: n = 7, T = 10, N = 70
## 
## Residuals:
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -8.63e+09 -9.70e+08  5.40e+08  0.00e+00  1.39e+09  5.61e+09 
## 
## Coefficients:
##      Estimate Std. Error t-value Pr(>|t|)  
## x1 2475617827 1106675594   2.237  0.02889 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    5.2364e+20
## Residual Sum of Squares: 4.8454e+20
## R-Squared:      0.074684
## Adj. R-Squared: -0.029788
## F-statistic: 5.00411 on 1 and 62 DF, p-value: 0.028892
# Equivalent fixed effects fit with one dummy per country (LSDV)
fe_dum <-lm(y ~ x1 + factor(country) - 1, data=Panel)
Panel$yhat <- fitted(fe_dum)  # fitted values of y

scatterplot(yhat~x1|country, data=Panel, boxplots=FALSE, 
            xlab="x1", ylab="yhat",smooth=FALSE)
abline(lm(y~x1, data = Panel),lwd=3, col="red")

Pooling test

pFtest(pan_fe, ols)
## 
##  F test for individual effects
## 
## data:  y ~ x1
## F = 2.9655, df1 = 6, df2 = 62, p-value = 0.01307
## alternative hypothesis: significant effects

p-value = 0.013 < 0.05, so we reject H0: the fixed effects are significant. In other words, the countries have different intercepts.

We fit a random effects model and compare it with the fixed effects model using the Hausman test:

pan_re <- plm(y ~ x1, data=Panel, 
              index=c("country", "year"), model="random")
phtest(pan_fe, pan_re)
## 
##  Hausman Test
## 
## data:  y ~ x1
## chisq = 3.674, df = 1, p-value = 0.05527
## alternative hypothesis: one model is inconsistent

p-value = 0.055 > 0.05, so at the 5% level we fail to reject H0 and use the random effects model. The result is borderline: at the 10% level the test would favor fixed effects.

Breusch-Godfrey/Wooldridge test for serial correlation

pbgtest(pan_re)
## 
##  Breusch-Godfrey/Wooldridge test for serial correlation in panel models
## 
## data:  y ~ x1
## chisq = 10.274, df = 10, p-value = 0.4168
## alternative hypothesis: serial correlation in idiosyncratic errors

p-value = 0.42 > 0.05, so we fail to reject H0: there is no evidence of serial correlation.

Cross-sectional dependence test

pcdtest(pan_re)
## 
##  Pesaran CD test for cross-sectional dependence in panels
## 
## data:  y ~ x1
## z = 1.5264, p-value = 0.1269
## alternative hypothesis: cross-sectional dependence

p-value = 0.13 > 0.05, so we fail to reject H0: there is no evidence of dependence across the cross-sectional units.