PANEL

Panel data sets let us observe the behavior of a group of entities over time. These entities can be firms, states, countries, or individuals. A panel data set looks like this:

set.seed(12345)
data.frame(Pais = c(1,1,1,2,2,2), Anio = c(2001,2002,2003,2001,2002,2003), Y = rnorm(6), X1 = runif(6), X2 = floor(runif(6,0,100)))

##   Pais Anio          Y          X1 X2
## 1    1 2001  0.5855288 0.735684952 17
## 2    1 2002  0.7094660 0.001136587 95
## 3    1 2003 -0.1093033 0.391203335 45
## 4    2 2001 -0.4534972 0.462494654 32
## 5    2 2002  0.6058875 0.388143982 96
## 6    2 2003 -1.8179560 0.402485142 70
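Before estimating anything, it helps to confirm that a data frame really has panel structure: each entity-period pair should appear at most once, and in a balanced panel every entity is observed in every period. A minimal base-R sketch follows, using a data frame shaped like the one above; the names `example_panel` and `describe_panel` are hypothetical and introduced only for illustration.

```r
# Toy panel with the same layout as the example above (values are arbitrary).
set.seed(12345)
example_panel <- data.frame(
  Pais = c(1, 1, 1, 2, 2, 2),
  Anio = c(2001, 2002, 2003, 2001, 2002, 2003),
  Y    = rnorm(6)
)

# Hypothetical helper: summarise panel dimensions and check the structure.
describe_panel <- function(data, id, time) {
  counts <- table(data[[id]], data[[time]])  # observations per entity-period cell
  list(
    n        = nrow(counts),      # number of entities
    T        = ncol(counts),      # number of periods
    unique   = all(counts <= 1),  # no duplicated entity-period pairs
    balanced = all(counts == 1)   # every entity observed in every period
  )
}

info <- describe_panel(example_panel, id = "Pais", time = "Anio")
str(info)
```

A duplicated pair or an empty cell in `counts` is worth resolving before the data are passed to an estimation routine.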

To work with panel data, under both fixed and random effects, we will use the plm package (Croissant and Millo, 2008), which provides estimation functions and methods suited to this kind of data. The code below installs (if needed) and loads foreign, which is used later to read Stata files, and then loads the remaining packages; any package that is missing, plm included, can be installed the same way with install.packages():

if(!require('foreign')){
    install.packages("foreign")
}

## Loading required package: foreign

library("foreign")

library(WDI)
library(wbstats)
library(tidyverse)

## Warning: package 'lubridate' was built under R version 4.3.3

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.5.0     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

gob_data <- wb_data(country = c("MX","EC","CA"), indicator = "NY.GDP.PCAP.CD", start_date=2013, end_date=2023)

panel <- select(gob_data, country, date, NY.GDP.PCAP.CD)
panel

## # A tibble: 30 × 3
##    country  date NY.GDP.PCAP.CD
##    <chr>   <dbl>          <dbl>
##  1 Canada   2013         52635.
##  2 Canada   2014         50956.
##  3 Canada   2015         43596.
##  4 Canada   2016         42316.
##  5 Canada   2017         45129.
##  6 Canada   2018         46549.
##  7 Canada   2019         46374.
##  8 Canada   2020         43562.
##  9 Canada   2021         52515.
## 10 Canada   2022         55522.
## # ℹ 20 more rows

library(gplots)

## 
## Attaching package: 'gplots'

## The following object is masked from 'package:stats':
## 
##     lowess

library(plm)

## 
## Attaching package: 'plm'

## The following objects are masked from 'package:dplyr':
## 
##     between, lag, lead

eco_data<-wb_data(country = c("AR","US","MX","CN"), indicator = c("NY.GDP.MKTP.KD.ZG","SL.UEM.TOTL.ZS",
"AG.LND.AGRI.ZS", "AG.LND.ARBL.ZS", "EG.ELC.ACCS.ZS", "SP.POP.GROW", "GB.XPD.RSDV.GD.ZS"))

# Keep four years only; every indicator column is retained because the models below use them.
panel <- subset(eco_data, date %in% c(2000, 2010, 2015, 2020))
panel <- pdata.frame(panel, index=c("country", "date"))
panel

##                    iso2c iso3c       country date AG.LND.AGRI.ZS AG.LND.ARBL.ZS
## Argentina-2000        AR   ARG     Argentina 2000       46.95819       10.09979
## Argentina-2010        AR   ARG     Argentina 2010       46.13891       13.88652
## Argentina-2015        AR   ARG     Argentina 2015       44.11877       14.74011
## Argentina-2020        AR   ARG     Argentina 2020       43.02927       15.35021
## China-2000            CN   CHN         China 2000       55.69458       12.67972
## China-2010            CN   CHN         China 2010       56.18347       12.79608
## China-2015            CN   CHN         China 2015       55.78313       12.23564
## China-2020            CN   CHN         China 2020       55.46265       11.60626
## Mexico-2000           MX   MEX        Mexico 2000       54.69791       11.78271
## Mexico-2010           MX   MEX        Mexico 2010       52.37120       11.79711
## Mexico-2015           MX   MEX        Mexico 2015       50.76262       11.34186
## Mexico-2020           MX   MEX        Mexico 2020       49.96939       10.32485
## United States-2000    US   USA United States 2000       45.23058       19.14097
## United States-2010    US   USA United States 2010       44.49251       17.24164
## United States-2015    US   USA United States 2015       44.24403       17.12451
## United States-2020    US   USA United States 2020       44.36337       17.24386
##                    EG.ELC.ACCS.ZS GB.XPD.RSDV.GD.ZS NY.GDP.MKTP.KD.ZG
## Argentina-2000           95.68047           0.43884        -0.7889989
## Argentina-2010           98.82000           0.56104        10.1253982
## Argentina-2015           99.68903           0.62262         2.7311598
## Argentina-2020          100.00000           0.54154        -9.9004848
## China-2000               96.74506           0.89316         8.4900934
## China-2010               99.70000           1.71372        10.6358711
## China-2015              100.00000           2.05701         7.0413289
## China-2020              100.00000           2.40666         2.2386384
## Mexico-2000              98.00713           0.30613         5.0292840
## Mexico-2010              99.23670           0.49485         4.9713346
## Mexico-2015              99.00000           0.42943         2.7023234
## Mexico-2020              99.40000           0.29638        -8.6515868
## United States-2000      100.00000           2.61984         4.0771595
## United States-2010      100.00000           2.71445         2.7088567
## United States-2015      100.00000           2.78700         2.7063696
## United States-2020      100.00000           3.46777        -2.7678025
##                    SL.UEM.TOTL.ZS SP.POP.GROW
## Argentina-2000             15.000   1.1332770
## Argentina-2010              7.710   0.2555824
## Argentina-2015              7.583   1.0780013
## Argentina-2020             11.460   0.9700540
## China-2000                  3.260   0.7879566
## China-2010                  4.530   0.4829597
## China-2015                  4.650   0.5814561
## China-2020                  5.000   0.2380409
## Mexico-2000                 2.650   1.5845508
## Mexico-2010                 5.300   1.3265790
## Mexico-2015                 4.310   1.1670088
## Mexico-2020                 4.440   0.7272438
## United States-2000          3.990   1.1127690
## United States-2010          9.630   0.8296167
## United States-2015          5.280   0.7362173
## United States-2020          8.050   0.9643479

# Heterogeneity across countries: mean GDP growth by country
plotmeans(panel$NY.GDP.MKTP.KD.ZG ~ panel$country,xlab="Country",ylab="GDP growth (%)",
             mean.labels=TRUE, digits=-3,
             col="red",connect=TRUE, main="Heterogeneity across countries")

# Heterogeneity across years: mean GDP growth by year
plotmeans(panel$NY.GDP.MKTP.KD.ZG ~ panel$date,xlab="Year",ylab="GDP growth (%)",
             mean.labels=TRUE, digits=-3,
             col="red",connect=TRUE, main="Heterogeneity across years")
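The two plots show, for each country and for each year, the mean of GDP growth together with an interval around it. A rough base-R sketch of the numbers behind them is given here, using the growth figures printed in the table above. The data frame `growth` and the helper `mean_ci` are illustrative names, and the t-based 95% interval is an assumption about what is usually reported, not necessarily the exact computation inside `plotmeans`.

```r
# GDP growth (NY.GDP.MKTP.KD.ZG) copied from the panel printed above.
growth <- data.frame(
  country = rep(c("Argentina", "China", "Mexico", "United States"), each = 4),
  date    = rep(c(2000, 2010, 2015, 2020), times = 4),
  gdp_growth = c(-0.7889989, 10.1253982, 2.7311598, -9.9004848,
                  8.4900934, 10.6358711, 7.0413289,  2.2386384,
                  5.0292840,  4.9713346, 2.7023234, -8.6515868,
                  4.0771595,  2.7088567, 2.7063696, -2.7678025)
)

# Hypothetical helper: group mean with a t-based confidence interval.
mean_ci <- function(x, level = 0.95) {
  n    <- length(x)
  half <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(mean = mean(x), lower = mean(x) - half, upper = mean(x) + half)
}

# One row per group, with columns mean / lower / upper.
by_country <- t(sapply(split(growth$gdp_growth, growth$country), mean_ci))
by_year    <- t(sapply(split(growth$gdp_growth, growth$date), mean_ci))

round(by_country, 2)
round(by_year, 2)
```

With only four observations per group the intervals are wide, so differences between group means should be read with caution.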

Model

pooled<-plm(NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS+
AG.LND.AGRI.ZS+AG.LND.ARBL.ZS+EG.ELC.ACCS.ZS+SP.POP.GROW+GB.XPD.RSDV.GD.ZS , data=panel ,model = "pooling")

summary(pooled)

## Pooling Model
## 
## Call:
## plm(formula = NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS + AG.LND.AGRI.ZS + 
##     AG.LND.ARBL.ZS + EG.ELC.ACCS.ZS + SP.POP.GROW + GB.XPD.RSDV.GD.ZS, 
##     data = panel, model = "pooling")
## 
## Balanced Panel: n = 4, T = 4, N = 16
## 
## Residuals:
##    Min. 1st Qu.  Median 3rd Qu.    Max. 
## -8.1043 -1.9711  1.2967  2.3740  5.8281 
## 
## Coefficients:
##                    Estimate Std. Error t-value Pr(>|t|)
## (Intercept)       108.49114  161.77765  0.6706   0.5193
## SL.UEM.TOTL.ZS     -0.11244    0.70155 -0.1603   0.8762
## AG.LND.AGRI.ZS      1.07183    0.65102  1.6464   0.1341
## AG.LND.ARBL.ZS      2.14456    1.21025  1.7720   0.1102
## EG.ELC.ACCS.ZS     -1.82557    1.44565 -1.2628   0.2384
## SP.POP.GROW        -4.69836    4.18512 -1.1226   0.2906
## GB.XPD.RSDV.GD.ZS  -1.70636    2.19119 -0.7787   0.4561
## 
## Total Sum of Squares:    512.67
## Residual Sum of Squares: 236.86
## R-Squared:      0.53799
## Adj. R-Squared: 0.22998
## F-statistic: 1.74666 on 6 and 9 DF, p-value: 0.21687

within<-plm(NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS+
AG.LND.AGRI.ZS+AG.LND.ARBL.ZS+EG.ELC.ACCS.ZS+SP.POP.GROW+GB.XPD.RSDV.GD.ZS , data=panel ,model = "within")

summary(within)

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS + AG.LND.AGRI.ZS + 
##     AG.LND.ARBL.ZS + EG.ELC.ACCS.ZS + SP.POP.GROW + GB.XPD.RSDV.GD.ZS, 
##     data = panel, model = "within")
## 
## Balanced Panel: n = 4, T = 4, N = 16
## 
## Residuals:
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -4.98771 -2.38596  0.41446  1.92653  5.64725 
## 
## Coefficients:
##                   Estimate Std. Error t-value Pr(>|t|)  
## SL.UEM.TOTL.ZS    -0.60572    0.97119 -0.6237  0.55579  
## AG.LND.AGRI.ZS     3.29170    1.43542  2.2932  0.06167 .
## AG.LND.ARBL.ZS     0.47688    2.68288  0.1777  0.86477  
## EG.ELC.ACCS.ZS     0.75577    3.26340  0.2316  0.82455  
## SP.POP.GROW       -3.77042    7.00843 -0.5380  0.60995  
## GB.XPD.RSDV.GD.ZS -3.72381    7.02411 -0.5301  0.61505  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    401.22
## Residual Sum of Squares: 144.53
## R-Squared:      0.63977
## Adj. R-Squared: 0.099435
## F-statistic: 1.77603 on 6 and 6 DF, p-value: 0.25126
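The within estimator removes each entity's time-invariant level by subtracting the entity mean from every variable, which yields the same slopes as a regression with one dummy per entity. The sketch below checks this equivalence in base R on simulated data; all object names and the data-generating values are made up for illustration.

```r
set.seed(1)
n_units   <- 5
n_periods <- 6
id    <- rep(seq_len(n_units), each = n_periods)
alpha <- rnorm(n_units, sd = 3)[id]                 # entity-specific intercepts
x     <- 0.5 * alpha + rnorm(n_units * n_periods)   # regressor correlated with alpha
y     <- alpha + 2 * x + rnorm(n_units * n_periods)

# Within transformation: subtract the entity mean from each variable.
x_dm <- x - ave(x, id)
y_dm <- y - ave(y, id)
b_within <- coef(lm(y_dm ~ x_dm - 1))[["x_dm"]]

# Least-squares dummy variables: one intercept per entity.
b_lsdv <- coef(lm(y ~ x + factor(id) - 1))[["x"]]

# Pooled OLS ignores the entity intercepts.
b_pooled <- coef(lm(y ~ x))[["x"]]

c(within = b_within, lsdv = b_lsdv, pooled = b_pooled)
```

The slopes from the two fixed-effects routes coincide. The standard errors from the demeaned `lm()` do not, because that fit does not count the entity means as estimated parameters; `plm` applies the degrees-of-freedom correction.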

walhus<-plm(NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS+
AG.LND.AGRI.ZS+AG.LND.ARBL.ZS+EG.ELC.ACCS.ZS+SP.POP.GROW+GB.XPD.RSDV.GD.ZS , data=panel ,model = "random", random.method = "walhus")

summary(walhus)

## Oneway (individual) effect Random Effect Model 
##    (Wallace-Hussain's transformation)
## 
## Call:
## plm(formula = NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS + AG.LND.AGRI.ZS + 
##     AG.LND.ARBL.ZS + EG.ELC.ACCS.ZS + SP.POP.GROW + GB.XPD.RSDV.GD.ZS, 
##     data = panel, model = "random", random.method = "walhus")
## 
## Balanced Panel: n = 4, T = 4, N = 16
## 
## Effects:
##                  var std.dev share
## idiosyncratic 19.349   4.399     1
## individual     0.000   0.000     0
## theta: 0
## 
## Residuals:
##    Min. 1st Qu.  Median 3rd Qu.    Max. 
## -8.1043 -1.9711  1.2967  2.3740  5.8281 
## 
## Coefficients:
##                    Estimate Std. Error z-value Pr(>|z|)  
## (Intercept)       108.49114  161.77765  0.6706  0.50246  
## SL.UEM.TOTL.ZS     -0.11244    0.70155 -0.1603  0.87267  
## AG.LND.AGRI.ZS      1.07183    0.65102  1.6464  0.09968 .
## AG.LND.ARBL.ZS      2.14456    1.21025  1.7720  0.07640 .
## EG.ELC.ACCS.ZS     -1.82557    1.44565 -1.2628  0.20666  
## SP.POP.GROW        -4.69836    4.18512 -1.1226  0.26159  
## GB.XPD.RSDV.GD.ZS  -1.70636    2.19119 -0.7787  0.43614  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    512.67
## Residual Sum of Squares: 236.86
## R-Squared:      0.53799
## Adj. R-Squared: 0.22998
## Chisq: 10.4799 on 6 DF, p-value: 0.10584

nerlove<-plm(NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS+
AG.LND.AGRI.ZS+AG.LND.ARBL.ZS+EG.ELC.ACCS.ZS+SP.POP.GROW+GB.XPD.RSDV.GD.ZS , data=panel ,model = "random", random.method = "nerlove")

summary(nerlove)

## Oneway (individual) effect Random Effect Model 
##    (Nerlove's transformation)
## 
## Call:
## plm(formula = NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS + AG.LND.AGRI.ZS + 
##     AG.LND.ARBL.ZS + EG.ELC.ACCS.ZS + SP.POP.GROW + GB.XPD.RSDV.GD.ZS, 
##     data = panel, model = "random", random.method = "nerlove")
## 
## Balanced Panel: n = 4, T = 4, N = 16
## 
## Effects:
##                   var std.dev share
## idiosyncratic   9.033   3.006 0.028
## individual    308.516  17.565 0.972
## theta: 0.9148
## 
## Residuals:
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -5.44445 -2.22648  0.45774  2.91280  4.41861 
## 
## Coefficients:
##                    Estimate Std. Error z-value Pr(>|z|)  
## (Intercept)       -93.54726  235.48687 -0.3973  0.69118  
## SL.UEM.TOTL.ZS     -0.47168    0.82125 -0.5743  0.56574  
## AG.LND.AGRI.ZS      2.77116    1.11836  2.4779  0.01322 *
## AG.LND.ARBL.ZS      1.39124    1.94063  0.7169  0.47343  
## EG.ELC.ACCS.ZS     -0.50775    2.27171 -0.2235  0.82314  
## SP.POP.GROW        -4.94041    5.70743 -0.8656  0.38670  
## GB.XPD.RSDV.GD.ZS  -1.43001    4.94935 -0.2889  0.77264  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    402.03
## Residual Sum of Squares: 162.37
## R-Squared:      0.59612
## Adj. R-Squared: 0.32687
## Chisq: 13.2839 on 6 DF, p-value: 0.038742
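The `theta` reported in each random-effects summary is the fraction of the entity mean that is subtracted from every variable (quasi-demeaning). For a balanced one-way panel it follows from the two variance components as theta = 1 - sqrt(var_idiosyncratic / (var_idiosyncratic + T * var_individual)). With theta = 0 the estimator collapses to pooled OLS, which is why the Wallace-Hussain results above coincide with the pooling model; as theta approaches 1 it approaches the within estimator. A small sketch that recomputes theta from the printed variance components (the function name `re_theta` is hypothetical):

```r
# Quasi-demeaning weight for a balanced one-way random-effects model.
re_theta <- function(var_idios, var_indiv, n_periods) {
  1 - sqrt(var_idios / (var_idios + n_periods * var_indiv))
}

# Variance components taken from the two summaries above (T = 4).
theta_walhus  <- re_theta(var_idios = 19.349, var_indiv = 0.000,   n_periods = 4)
theta_nerlove <- re_theta(var_idios = 9.033,  var_indiv = 308.516, n_periods = 4)

c(walhus = theta_walhus, nerlove = theta_nerlove)
```

The large gap between the two values shows how sensitive the random-effects fit is to the variance-component method when there are only four entities and four periods.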

# Libraries
#-----------------------------------
library(plm)
library(car)

## Loading required package: carData

## 
## Attaching package: 'car'

## The following object is masked from 'package:dplyr':
## 
##     recode

## The following object is masked from 'package:purrr':
## 
##     some

#library(ggstatsplot)
library(gplots)
library(foreign)
library(lmtest)

## Loading required package: zoo

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

imp_alc_accide_trans <- read.dta('http://fmwww.bc.edu/ec-p/data/stockwatson/fatality.dta')

imp_alc_accide_trans$tasa_mort <- imp_alc_accide_trans$allmort/imp_alc_accide_trans$pop

imp_alc_accide_trans$logpbi <- log(imp_alc_accide_trans$perinc)

imp_alc_accide_trans$edadminima <- ifelse(imp_alc_accide_trans$mlda<19, 18,
                                          ifelse(imp_alc_accide_trans$mlda>=19 & imp_alc_accide_trans$mlda<20,
                                                 19,
                                                 ifelse(imp_alc_accide_trans$mlda>=20 & imp_alc_accide_trans$mlda<21,
                                                        20,21)))

reg_imp_mort_82 <- lm(tasa_mort ~ beertax, 
                      data = imp_alc_accide_trans[imp_alc_accide_trans$year==1982,])
reg_imp_mort_88 <- lm(tasa_mort ~ beertax, 
                      data = imp_alc_accide_trans[imp_alc_accide_trans$year==1988,])

summary(reg_imp_mort_82)

## 
## Call:
## lm(formula = tasa_mort ~ beertax, data = imp_alc_accide_trans[imp_alc_accide_trans$year == 
##     1982, ])
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -9.356e-05 -4.480e-05 -1.068e-05  2.295e-05  2.172e-04 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 2.010e-04  1.391e-05  14.455   <2e-16 ***
## beertax     1.485e-05  1.884e-05   0.788    0.435    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.705e-05 on 46 degrees of freedom
## Multiple R-squared:  0.01332,    Adjusted R-squared:  -0.008126 
## F-statistic: 0.6212 on 1 and 46 DF,  p-value: 0.4347

summary(reg_imp_mort_88)

## 
## Call:
## lm(formula = tasa_mort ~ beertax, data = imp_alc_accide_trans[imp_alc_accide_trans$year == 
##     1988, ])
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -7.293e-05 -3.603e-05 -7.132e-06  3.994e-05  1.358e-04 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.859e-04  1.060e-05  17.540   <2e-16 ***
## beertax     4.388e-05  1.645e-05   2.668   0.0105 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.903e-05 on 46 degrees of freedom
## Multiple R-squared:  0.134,  Adjusted R-squared:  0.1152 
## F-statistic: 7.118 on 1 and 46 DF,  p-value: 0.0105

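# Caution: each right-hand side below has 48 values (one per state), but the
# data frame has 336 rows, so R recycles them 7 times. The regression then
# reuses every state's difference 7 times: the slope is unaffected, but the
# standard errors are understated (334 residual degrees of freedom, not 46).
imp_alc_accide_trans$dif_mort_88_82 <- imp_alc_accide_trans$tasa_mort[imp_alc_accide_trans$year==1988] - 
  imp_alc_accide_trans$tasa_mort[imp_alc_accide_trans$year==1982]
imp_alc_accide_trans$dif_imp_88_82  <- imp_alc_accide_trans$beertax[imp_alc_accide_trans$year==1988] - 
  imp_alc_accide_trans$beertax[imp_alc_accide_trans$year==1982]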

reg_dif_88_82 <- lm(dif_mort_88_82 ~ dif_imp_88_82,
                    data = imp_alc_accide_trans)
summary(reg_dif_88_82)

## 
## Call:
## lm(formula = dif_mort_88_82 ~ dif_imp_88_82, data = imp_alc_accide_trans)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -1.227e-04 -9.619e-06  9.212e-06  2.229e-05  6.774e-05 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -7.204e-06  2.251e-06  -3.201   0.0015 ** 
## dif_imp_88_82 -1.041e-04  1.548e-05  -6.723 7.72e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.869e-05 on 334 degrees of freedom
## Multiple R-squared:  0.1192, Adjusted R-squared:  0.1166 
## F-statistic:  45.2 on 1 and 334 DF,  p-value: 7.719e-11
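The 334 residual degrees of freedom in the output above indicate that the 48 state-level differences were recycled across all 336 rows of the data frame. One way to avoid this is to build a separate data frame with one row per unit before running the difference regression. The sketch below does so in base R; the helper `diff_between_years` and the toy values are hypothetical, chosen only to show the mechanics.

```r
# Hypothetical helper: one row per unit with the change in each variable
# between two periods t0 and t1.
diff_between_years <- function(data, id, time, vars, t0, t1) {
  d0 <- data[data[[time]] == t0, c(id, vars)]
  d1 <- data[data[[time]] == t1, c(id, vars)]
  m  <- merge(d0, d1, by = id, suffixes = c("_t0", "_t1"))  # match units explicitly
  out <- m[id]
  for (v in vars) {
    out[[paste0("d_", v)]] <- m[[paste0(v, "_t1")]] - m[[paste0(v, "_t0")]]
  }
  out
}

# Toy data with the same column names as the fatality data (made-up values).
toy <- data.frame(
  state     = rep(c("A", "B", "C"), each = 2),
  year      = rep(c(1982, 1988), times = 3),
  tasa_mort = c(2.0, 1.5, 3.0, 3.2, 1.0, 0.4),
  beertax   = c(0.5, 1.0, 0.2, 0.1, 0.3, 0.9)
)

changes <- diff_between_years(toy, id = "state", time = "year",
                              vars = c("tasa_mort", "beertax"),
                              t0 = 1982, t1 = 1988)
changes

# The difference regression now uses exactly one observation per unit.
fit_diff <- lm(d_tasa_mort ~ d_beertax, data = changes)
df.residual(fit_diff)
```

Applied to `imp_alc_accide_trans` with `id = "state"`, the same call would produce 48 rows, so the regression would have 46 residual degrees of freedom.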

Fixed effects

reg_imp_mort_fe <-plm(tasa_mort ~ beertax, 
                      data=imp_alc_accide_trans,
                      model = 'within')
summary(reg_imp_mort_fe)

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = tasa_mort ~ beertax, data = imp_alc_accide_trans, 
##     model = "within")
## 
## Balanced Panel: n = 48, T = 7, N = 336
## 
## Residuals:
##        Min.     1st Qu.      Median     3rd Qu.        Max. 
## -5.8696e-05 -8.2838e-06 -1.2701e-07  7.9545e-06  8.9780e-05 
## 
## Coefficients:
##            Estimate  Std. Error t-value Pr(>|t|)    
## beertax -6.5587e-05  1.8785e-05 -3.4915 0.000556 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    1.0785e-07
## Residual Sum of Squares: 1.0345e-07
## R-Squared:      0.040745
## Adj. R-Squared: -0.11969
## F-statistic: 12.1904 on 1 and 287 DF, p-value: 0.00055597

Let us run the pooling test, for which we need to fit the regression model over the whole period under study:

reg_imp_mort <- lm(tasa_mort ~ beertax, 
                   data = imp_alc_accide_trans)
pFtest(reg_imp_mort_fe, reg_imp_mort)

## 
##  F test for individual effects
## 
## data:  tasa_mort ~ beertax
## F = 52.179, df1 = 47, df2 = 287, p-value < 2.2e-16
## alternative hypothesis: significant effects

The p-value of the pooling test is below 5%, so we reject the pooling hypothesis and conclude that the panel structure matters: the states differ in their intercepts (individual effects). We should therefore estimate panel data models.
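The statistic behind this test compares the residual sum of squares of the pooled regression (restricted, one common intercept) with that of the fixed-effects regression (unrestricted, one intercept per unit). A base-R sketch on simulated data, with made-up names and values, computes it by hand and checks it against `anova()` for the two nested `lm` fits:

```r
set.seed(2)
n_units   <- 6
n_periods <- 5
id <- factor(rep(seq_len(n_units), each = n_periods))
x  <- rnorm(n_units * n_periods)
y  <- rnorm(n_units, sd = 2)[as.integer(id)] + 1.5 * x + rnorm(n_units * n_periods)

pooled <- lm(y ~ x)        # restricted: common intercept
lsdv   <- lm(y ~ x + id)   # unrestricted: one intercept per unit

rss_r <- sum(resid(pooled)^2)
rss_u <- sum(resid(lsdv)^2)
df1   <- n_units - 1        # number of restrictions (equal intercepts)
df2   <- df.residual(lsdv)  # N - n - K

f_stat  <- ((rss_r - rss_u) / df1) / (rss_u / df2)
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)

# Same statistic from the built-in comparison of nested models.
anova_f <- anova(pooled, lsdv)$F[2]
c(by_hand = f_stat, anova = anova_f, p_value = p_value)
```

The degrees of freedom in the output above follow the same pattern: df1 = 48 - 1 = 47 and df2 = 336 - 48 - 1 = 287.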

reg_imp_mort_re <- plm(tasa_mort ~ beertax, 
                       data=imp_alc_accide_trans, 
                       index=c("state", "year"), 
                       model="random")
summary(reg_imp_mort_re)

## Oneway (individual) effect Random Effect Model 
##    (Swamy-Arora's transformation)
## 
## Call:
## plm(formula = tasa_mort ~ beertax, data = imp_alc_accide_trans, 
##     model = "random", index = c("state", "year"))
## 
## Balanced Panel: n = 48, T = 7, N = 336
## 
## Effects:
##                     var   std.dev share
## idiosyncratic 3.605e-10 1.899e-05 0.119
## individual    2.660e-09 5.158e-05 0.881
## theta: 0.8622
## 
## Residuals:
##        Min.     1st Qu.      Median     3rd Qu.        Max. 
## -4.7109e-05 -1.2005e-05 -2.1530e-06  9.1011e-06  9.6435e-05 
## 
## Coefficients:
##                Estimate  Std. Error z-value Pr(>|z|)    
## (Intercept)  2.0671e-04  9.9971e-06 20.6773   <2e-16 ***
## beertax     -5.2016e-06  1.2418e-05 -0.4189   0.6753    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    1.2648e-07
## Residual Sum of Squares: 1.2642e-07
## R-Squared:      0.00052508
## Adj. R-Squared: -0.0024674
## Chisq: 0.175467 on 1 DF, p-value: 0.6753

Hausman test (to decide between fixed and random effects)

phtest(reg_imp_mort_fe, reg_imp_mort_re)

## 
##  Hausman Test
## 
## data:  tasa_mort ~ beertax
## chisq = 18.353, df = 1, p-value = 1.835e-05
## alternative hypothesis: one model is inconsistent

p-value = 1.835e-05 < 0.05. We reject H0, so the consistent estimator is the fixed-effects (within) estimator.
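With a single slope, the Hausman statistic reduces to the squared difference between the two estimates divided by the difference of their variances. The sketch below recomputes it from the coefficients and standard errors printed in the fixed-effects and random-effects summaries above; the variable names are illustrative, and small discrepancies with the reported value come from rounding in the printed output.

```r
# Estimates of the beertax coefficient copied from the summaries above.
b_fe  <- -6.5587e-05   # within (fixed effects)
se_fe <-  1.8785e-05
b_re  <- -5.2016e-06   # random effects (Swamy-Arora)
se_re <-  1.2418e-05

# Hausman statistic for one coefficient: chi-squared with 1 degree of freedom.
hausman <- (b_fe - b_re)^2 / (se_fe^2 - se_re^2)
p_value <- pchisq(hausman, df = 1, lower.tail = FALSE)

c(chisq = hausman, p_value = p_value)
```

The result should be close to the chisq = 18.353 reported by `phtest`. A large value means the two estimators disagree by more than sampling noise, which points to correlation between the individual effects and the regressor.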

We estimate the fixed effect of each cross-sectional unit

summary(fixef(reg_imp_mort_fe))

##      Estimate Std. Error t-value  Pr(>|t|)    
## AL 3.4776e-04 3.1336e-05 11.0980 < 2.2e-16 ***
## AZ 2.9099e-04 9.2539e-06 31.4452 < 2.2e-16 ***
## AR 2.8227e-04 1.3212e-05 21.3636 < 2.2e-16 ***
## CA 1.9682e-04 7.4007e-06 26.5943 < 2.2e-16 ***
## CO 1.9934e-04 8.0371e-06 24.8019 < 2.2e-16 ***
## CT 1.6154e-04 8.3913e-06 19.2506 < 2.2e-16 ***
## DE 2.1700e-04 7.7457e-06 28.0159 < 2.2e-16 ***
## FL 3.2095e-04 2.2151e-05 14.4890 < 2.2e-16 ***
## GA 4.0022e-04 4.6403e-05  8.6249 4.435e-16 ***
## ID 2.8086e-04 9.8767e-06 28.4368 < 2.2e-16 ***
## IL 1.5160e-04 7.8478e-06 19.3176 < 2.2e-16 ***
## IN 2.0161e-04 8.8672e-06 22.7364 < 2.2e-16 ***
## IA 1.9337e-04 1.0222e-05 18.9176 < 2.2e-16 ***
## KS 2.2544e-04 1.0863e-05 20.7528 < 2.2e-16 ***
## KY 2.2601e-04 8.0462e-06 28.0893 < 2.2e-16 ***
## LA 2.6305e-04 1.6266e-05 16.1714 < 2.2e-16 ***
## ME 2.3697e-04 1.6006e-05 14.8045 < 2.2e-16 ***
## MD 1.7712e-04 8.2458e-06 21.4800 < 2.2e-16 ***
## MA 1.3679e-04 8.6477e-06 15.8178 < 2.2e-16 ***
## MI 1.9931e-04 1.1663e-05 17.0888 < 2.2e-16 ***
## MN 1.5804e-04 9.3628e-06 16.8797 < 2.2e-16 ***
## MS 3.4485e-04 2.0936e-05 16.4717 < 2.2e-16 ***
## MO 2.1814e-04 9.2523e-06 23.5764 < 2.2e-16 ***
## MT 3.1172e-04 9.4413e-06 33.0170 < 2.2e-16 ***
## NE 1.9555e-04 1.0551e-05 18.5342 < 2.2e-16 ***
## NV 2.8769e-04 8.1056e-06 35.4922 < 2.2e-16 ***
## NH 2.2232e-04 1.4114e-05 15.7512 < 2.2e-16 ***
## NJ 1.3719e-04 7.3328e-06 18.7089 < 2.2e-16 ***
## NM 3.9040e-04 1.0154e-05 38.4492 < 2.2e-16 ***
## NY 1.2910e-04 7.5629e-06 17.0696 < 2.2e-16 ***
## NC 3.1872e-04 2.5173e-05 12.6609 < 2.2e-16 ***
## ND 1.8542e-04 1.0193e-05 18.1912 < 2.2e-16 ***
## OH 1.8032e-04 1.0193e-05 17.6910 < 2.2e-16 ***
## OK 2.9326e-04 1.8429e-05 15.9133 < 2.2e-16 ***
## OR 2.3096e-04 8.1175e-06 28.4526 < 2.2e-16 ***
## PA 1.7102e-04 8.6477e-06 19.7759 < 2.2e-16 ***
## RI 1.2126e-04 7.7533e-06 15.6395 < 2.2e-16 ***
## SC 4.0348e-04 3.5479e-05 11.3724 < 2.2e-16 ***
## SD 2.4739e-04 1.4121e-05 17.5195 < 2.2e-16 ***
## TN 2.6020e-04 9.1624e-06 28.3983 < 2.2e-16 ***
## TX 2.5602e-04 1.0853e-05 23.5889 < 2.2e-16 ***
## UT 2.3137e-04 1.5453e-05 14.9721 < 2.2e-16 ***
## VT 2.5116e-04 1.3973e-05 17.9751 < 2.2e-16 ***
## VA 2.1874e-04 1.4664e-05 14.9170 < 2.2e-16 ***
## WA 1.8181e-04 8.2328e-06 22.0836 < 2.2e-16 ***
## WV 2.5809e-04 1.0767e-05 23.9707 < 2.2e-16 ***
## WI 1.7184e-04 7.7457e-06 22.1848 < 2.2e-16 ***
## WY 3.2491e-04 7.2328e-06 44.9219 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Fixed effects expressed as deviations from their mean

summary(fixef(reg_imp_mort_fe, type ="dmean"))

##       Estimate  Std. Error  t-value  Pr(>|t|)    
## AL  1.1006e-04  3.1336e-05   3.5121 0.0005161 ***
## AZ  5.3283e-05  9.2539e-06   5.7579 2.188e-08 ***
## AR  4.4560e-05  1.3213e-05   3.3726 0.0008470 ***
## CA -4.0891e-05  7.4007e-06  -5.5254 7.380e-08 ***
## CO -3.8373e-05  8.0371e-06  -4.7744 2.876e-06 ***
## CT -7.6170e-05  8.3913e-06  -9.0773 < 2.2e-16 ***
## DE -2.0705e-05  7.7457e-06  -2.6731 0.0079465 ** 
## FL  8.3242e-05  2.2151e-05   3.7579 0.0002076 ***
## GA  1.6252e-04  4.6403e-05   3.5023 0.0005348 ***
## ID  4.3153e-05  9.8767e-06   4.3692 1.744e-05 ***
## IL -8.6107e-05  7.8478e-06 -10.9721 < 2.2e-16 ***
## IN -3.6099e-05  8.8672e-06  -4.0710 6.058e-05 ***
## IA -4.4338e-05  1.0222e-05  -4.3376 1.997e-05 ***
## KS -1.2266e-05  1.0863e-05  -1.1291 0.2597802    
## KY -1.1696e-05  8.0462e-06  -1.4536 0.1471402    
## LA  2.5344e-05  1.6266e-05   1.5581 0.1203234    
## ME -7.3922e-07  1.6006e-05  -0.0462 0.9631970    
## MD -6.0588e-05  8.2458e-06  -7.3478 2.102e-12 ***
## MA -1.0092e-04  8.6477e-06 -11.6700 < 2.2e-16 ***
## MI -3.8397e-05  1.1663e-05  -3.2922 0.0011185 ** 
## MN -7.9666e-05  9.3628e-06  -8.5087 9.901e-16 ***
## MS  1.0715e-04  2.0936e-05   5.1178 5.669e-07 ***
## MO -1.9571e-05  9.2523e-06  -2.1152 0.0352734 *  
## MT  7.4016e-05  9.4413e-06   7.8396 8.909e-14 ***
## NE -4.2162e-05  1.0551e-05  -3.9962 8.188e-05 ***
## NV  4.9978e-05  8.1056e-06   6.1659 2.373e-09 ***
## NH -1.5390e-05  1.4114e-05  -1.0904 0.2764610    
## NJ -1.0052e-04  7.3328e-06 -13.7083 < 2.2e-16 ***
## NM  1.5269e-04  1.0154e-05  15.0382 < 2.2e-16 ***
## NY -1.0861e-04  7.5629e-06 -14.3610 < 2.2e-16 ***
## NC  8.1009e-05  2.5173e-05   3.2180 0.0014386 ** 
## ND -5.2288e-05  1.0193e-05  -5.1299 5.345e-07 ***
## OH -5.7386e-05  1.0193e-05  -5.6301 4.288e-08 ***
## OK  5.5549e-05  1.8428e-05   3.0143 0.0028056 ** 
## OR -6.7445e-06  8.1175e-06  -0.8309 0.4067439    
## PA -6.6691e-05  8.6477e-06  -7.7120 2.049e-13 ***
## RI -1.1645e-04  7.7533e-06 -15.0194 < 2.2e-16 ***
## SC  1.6577e-04  3.5479e-05   4.6724 4.582e-06 ***
## SD  9.6834e-06  1.4121e-05   0.6858 0.4934227    
## TN  2.2490e-05  9.1624e-06   2.4546 0.0146996 *  
## TX  1.8308e-05  1.0853e-05   1.6869 0.0927111 .  
## UT -6.3395e-06  1.5453e-05  -0.4102 0.6819363    
## VT  1.3451e-05  1.3973e-05   0.9627 0.3365174    
## VA -1.8963e-05  1.4664e-05  -1.2931 0.1970014    
## WA -5.5897e-05  8.2328e-06  -6.7895 6.460e-11 ***
## WV  2.0380e-05  1.0767e-05   1.8929 0.0593813 .  
## WI -6.5871e-05  7.7457e-06  -8.5042 1.021e-15 ***
## WY  8.7205e-05  7.2328e-06  12.0568 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Breusch-Godfrey/Wooldridge test for serial correlation

pbgtest(reg_imp_mort_fe)

## 
##  Breusch-Godfrey/Wooldridge test for serial correlation in panel models
## 
## data:  tasa_mort ~ beertax
## chisq = 50.668, df = 7, p-value = 1.068e-08
## alternative hypothesis: serial correlation in idiosyncratic errors

p-value = 1.068e-08 < 0.05. We reject H0: the idiosyncratic errors are serially correlated.

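In any case, it is worth correcting the estimates with a robust estimator of the covariance matrix. Serial correlation does not change the estimated coefficients, but it does change the standard errors and therefore the significance of the coefficients. The call below uses vcovHC, which for plm models defaults to Arellano's estimator clustered by group and is robust to heteroskedasticity and serial correlation; plm also provides vcovNW for a Newey-West correction. In R we run: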

coeftest(reg_imp_mort_fe, vcov. = vcovHC)

## 
## t test of coefficients:
## 
##            Estimate  Std. Error t value Pr(>|t|)  
## beertax -6.5587e-05  2.8837e-05 -2.2744  0.02368 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The beer tax is significant at the 5% level.
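To see why the robust standard error differs from the conventional one, the sketch below computes both by hand for a within regression with a single regressor, on simulated data with serially correlated errors. It illustrates the idea of clustering by unit (summing the scores within each unit before squaring); it is not meant to reproduce the exact small-sample adjustments that `vcovHC` may apply, and all names and values are made up.

```r
set.seed(3)
n_units   <- 30
n_periods <- 7
id <- rep(seq_len(n_units), each = n_periods)

# AR(1) errors within each unit, so residuals are serially correlated.
e <- unlist(lapply(seq_len(n_units), function(i) {
  as.numeric(arima.sim(list(ar = 0.7), n = n_periods))
}))
x <- rnorm(n_units)[id] + rnorm(n_units * n_periods)
y <- rnorm(n_units)[id] - 1 * x + e

# Within transformation and slope estimate.
x_dm <- x - ave(x, id)
y_dm <- y - ave(y, id)
u    <- resid(lm(y_dm ~ x_dm - 1))
sxx  <- sum(x_dm^2)

# Conventional standard error, using the within-model degrees of freedom.
df_within <- length(y) - n_units - 1
se_conv   <- sqrt(sum(u^2) / df_within / sxx)

# Cluster-robust standard error: scores are summed within each unit first,
# which allows arbitrary correlation among a unit's errors.
scores_by_unit <- tapply(x_dm * u, id, sum)
se_cluster     <- sqrt(sum(scores_by_unit^2) / sxx^2)

c(conventional = se_conv, clustered = se_cluster)
```

The coefficient is the same in both cases; only the estimated uncertainty changes.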

Test for cross-sectional dependence

pcdtest(reg_imp_mort_fe)

## 
##  Pesaran CD test for cross-sectional dependence in panels
## 
## data:  tasa_mort ~ beertax
## z = 5.436, p-value = 5.449e-08
## alternative hypothesis: cross-sectional dependence

p-value = 5.449e-08 < 0.05. We reject H0: there is cross-sectional dependence, so the standard errors should be robust to it.

coeftest(reg_imp_mort_fe, vcov. = vcovHC, cluster = 'time')

## 
## t test of coefficients:
## 
##            Estimate  Std. Error t value  Pr(>|t|)    
## beertax -6.5587e-05  9.4573e-06 -6.9351 2.691e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The beer tax remains significant at the 5% level.

coeftest(reg_imp_mort_fe, vcov. = vcovSCC)

## 
## t test of coefficients:
## 
##            Estimate  Std. Error t value  Pr(>|t|)    
## beertax -6.5587e-05  1.0628e-05 -6.1712 2.303e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Panel <- read.dta("http://dss.princeton.edu/training/Panel101.dta")

colnames(Panel)

## [1] "country" "year"    "y"       "y_bin"   "x1"      "x2"      "x3"     
## [8] "opinion" "op"

OLS regression

ols <- lm(y ~x1, data = Panel)
summary(ols)

## 
## Call:
## lm(formula = y ~ x1, data = Panel)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -9.546e+09 -1.578e+09  1.554e+08  1.422e+09  7.183e+09 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 1.524e+09  6.211e+08   2.454   0.0167 *
## x1          4.950e+08  7.789e+08   0.636   0.5272  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.028e+09 on 68 degrees of freedom
## Multiple R-squared:  0.005905,   Adjusted R-squared:  -0.008714 
## F-statistic: 0.4039 on 1 and 68 DF,  p-value: 0.5272

Fixed effects

pan_fe <-plm(y ~ x1, data=Panel, model = 'within')
summary(pan_fe)

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = y ~ x1, data = Panel, model = "within")
## 
## Balanced Panel: n = 7, T = 10, N = 70
## 
## Residuals:
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -8.63e+09 -9.70e+08  5.40e+08  0.00e+00  1.39e+09  5.61e+09 
## 
## Coefficients:
##      Estimate Std. Error t-value Pr(>|t|)  
## x1 2475617827 1106675594   2.237  0.02889 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    5.2364e+20
## Residual Sum of Squares: 4.8454e+20
## R-Squared:      0.074684
## Adj. R-Squared: -0.029788
## F-statistic: 5.00411 on 1 and 62 DF, p-value: 0.028892

fe_dum <-lm(y ~ x1 + factor(country) - 1, data=Panel)
Panel$yhat <- fe_dum$fitted  # fitted values of y

scatterplot(yhat~x1|country, data=Panel, boxplots=FALSE, 
            xlab="x1", ylab="yhat",smooth=FALSE)
abline(lm(y~x1, data = Panel),lwd=3, col="red")

Pooling test

pFtest(pan_fe, ols)

## 
##  F test for individual effects
## 
## data:  y ~ x1
## F = 2.9655, df1 = 6, df2 = 62, p-value = 0.01307
## alternative hypothesis: significant effects

p-value = 0.013 < 0.05. We reject H0: the fixed effects are significant, that is, the countries have different intercepts.

We fit a random-effects model and compare it with the fixed-effects model using the Hausman test:

pan_re <- plm(y ~ x1, data=Panel, 
              index=c("country", "year"), model="random")
phtest(pan_fe, pan_re)

## 
##  Hausman Test
## 
## data:  y ~ x1
## chisq = 3.674, df = 1, p-value = 0.05527
## alternative hypothesis: one model is inconsistent

p-value = 0.055 > 0.05. We do not reject H0 at the 5% level, so we use the random-effects model.

Breusch-Godfrey/Wooldridge test for serial correlation

pbgtest(pan_re)

## 
##  Breusch-Godfrey/Wooldridge test for serial correlation in panel models
## 
## data:  y ~ x1
## chisq = 10.274, df = 10, p-value = 0.4168
## alternative hypothesis: serial correlation in idiosyncratic errors

p-value = 0.417 > 0.05. We do not reject H0: there is no evidence of serial correlation.

Test for cross-sectional dependence

pcdtest(pan_re)

## 
##  Pesaran CD test for cross-sectional dependence in panels
## 
## data:  y ~ x1
## z = 1.5264, p-value = 0.1269
## alternative hypothesis: cross-sectional dependence

p-value = 0.127 > 0.05. We do not reject H0: there is no evidence of cross-sectional dependence.
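For a balanced panel, the Pesaran CD statistic is built from the pairwise correlations between the residual series of different units: CD = sqrt(2T / (N(N - 1))) times the sum of the correlations over all pairs i < j, which is approximately standard normal under independence. A base-R sketch on simulated residuals follows. The matrix `res` is made up, and `cor()` centers each series, which is assumed here to be harmless because model residuals have mean zero within each unit.

```r
set.seed(4)
n_units   <- 7
n_periods <- 10

# Residuals arranged as a T x N matrix, one column per unit (simulated here).
res <- matrix(rnorm(n_units * n_periods), nrow = n_periods, ncol = n_units)

rho    <- cor(res)               # N x N matrix of pairwise correlations
rho_ij <- rho[upper.tri(rho)]    # keep each pair i < j once

cd      <- sqrt(2 * n_periods / (n_units * (n_units - 1))) * sum(rho_ij)
p_value <- 2 * pnorm(abs(cd), lower.tail = FALSE)  # two-sided normal p-value

c(cd = cd, p_value = p_value)
```

With 7 units there are 21 pairs. When the correlations are mostly of one sign the sum grows and the statistic moves away from zero, which is the pattern the test detects.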

2024-03-27