Panel data sets let us observe the behavior of a group of entities over time. These entities can be firms, states, countries, or individuals. A panel data set looks like this:
set.seed(12345)
data.frame(Pais = c(1,1,1,2,2,2), Anio = c(2001,2002,2003,2001,2002,2003), Y = rnorm(6), X1 = runif(6), X2 = floor(runif(6,0,100)))
## Pais Anio Y X1 X2
## 1 1 2001 0.5855288 0.735684952 17
## 2 1 2002 0.7094660 0.001136587 95
## 3 1 2003 -0.1093033 0.391203335 45
## 4 2 2001 -0.4534972 0.462494654 32
## 5 2 2002 0.6058875 0.388143982 96
## 6 2 2003 -1.8179560 0.402485142 70
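A quick structural check on a panel like the one above is whether it is balanced, that is, whether every entity is observed in every period. A minimal base-R sketch that rebuilds the same identifier columns (the name toy_panel is only illustrative):

```r
# Toy panel with the same identifiers as above (toy_panel is an illustrative name)
toy_panel <- data.frame(Pais = c(1, 1, 1, 2, 2, 2),
                        Anio = c(2001, 2002, 2003, 2001, 2002, 2003))

# Cross-tabulate entity by period: a balanced panel has exactly one row per cell
obs_counts  <- table(toy_panel$Pais, toy_panel$Anio)
is_balanced <- all(obs_counts == 1)

n_entities <- nrow(obs_counts)   # number of entities (n)
n_periods  <- ncol(obs_counts)   # number of periods (T)
```

The same counts reappear later in the plm summaries as n, T and N.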
To work with panel data, with both fixed and random effects, we will use the plm package (Croissant and Millo, 2008), which provides estimation functions and methods for this kind of data. The code below installs foreign if it is missing and loads the supporting packages; plm is loaded further down and can be installed the same way with install.packages("plm"):
if(!require('foreign')){
install.packages("foreign")
}
## Loading required package: foreign
library("foreign")
library(WDI)
library(wbstats)
library(tidyverse)
## Warning: package 'lubridate' was built under R version 4.3.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.5.0 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
gob_data <- wb_data(country = c("MX","EC","CA"), indicator = "NY.GDP.PCAP.CD", start_date=2013, end_date=2023)
panel <- select(gob_data, country, date, NY.GDP.PCAP.CD)
panel
## # A tibble: 30 × 3
## country date NY.GDP.PCAP.CD
## <chr> <dbl> <dbl>
## 1 Canada 2013 52635.
## 2 Canada 2014 50956.
## 3 Canada 2015 43596.
## 4 Canada 2016 42316.
## 5 Canada 2017 45129.
## 6 Canada 2018 46549.
## 7 Canada 2019 46374.
## 8 Canada 2020 43562.
## 9 Canada 2021 52515.
## 10 Canada 2022 55522.
## # ℹ 20 more rows
library(gplots)
##
## Attaching package: 'gplots'
## The following object is masked from 'package:stats':
##
## lowess
library(plm)
##
## Attaching package: 'plm'
## The following objects are masked from 'package:dplyr':
##
## between, lag, lead
eco_data<-wb_data(country = c("AR","US","MX","CN"), indicator = c("NY.GDP.MKTP.KD.ZG","SL.UEM.TOTL.ZS",
"AG.LND.AGRI.ZS", "AG.LND.ARBL.ZS", "EG.ELC.ACCS.ZS", "SP.POP.GROW", "GB.XPD.RSDV.GD.ZS"))
# Keep four benchmark years; every indicator column is retained for the models below
panel <- subset(eco_data, date %in% c(2000, 2010, 2015, 2020))
panel <- pdata.frame(panel, index=c("country", "date"))
panel
## iso2c iso3c country date AG.LND.AGRI.ZS AG.LND.ARBL.ZS
## Argentina-2000 AR ARG Argentina 2000 46.95819 10.09979
## Argentina-2010 AR ARG Argentina 2010 46.13891 13.88652
## Argentina-2015 AR ARG Argentina 2015 44.11877 14.74011
## Argentina-2020 AR ARG Argentina 2020 43.02927 15.35021
## China-2000 CN CHN China 2000 55.69458 12.67972
## China-2010 CN CHN China 2010 56.18347 12.79608
## China-2015 CN CHN China 2015 55.78313 12.23564
## China-2020 CN CHN China 2020 55.46265 11.60626
## Mexico-2000 MX MEX Mexico 2000 54.69791 11.78271
## Mexico-2010 MX MEX Mexico 2010 52.37120 11.79711
## Mexico-2015 MX MEX Mexico 2015 50.76262 11.34186
## Mexico-2020 MX MEX Mexico 2020 49.96939 10.32485
## United States-2000 US USA United States 2000 45.23058 19.14097
## United States-2010 US USA United States 2010 44.49251 17.24164
## United States-2015 US USA United States 2015 44.24403 17.12451
## United States-2020 US USA United States 2020 44.36337 17.24386
## EG.ELC.ACCS.ZS GB.XPD.RSDV.GD.ZS NY.GDP.MKTP.KD.ZG
## Argentina-2000 95.68047 0.43884 -0.7889989
## Argentina-2010 98.82000 0.56104 10.1253982
## Argentina-2015 99.68903 0.62262 2.7311598
## Argentina-2020 100.00000 0.54154 -9.9004848
## China-2000 96.74506 0.89316 8.4900934
## China-2010 99.70000 1.71372 10.6358711
## China-2015 100.00000 2.05701 7.0413289
## China-2020 100.00000 2.40666 2.2386384
## Mexico-2000 98.00713 0.30613 5.0292840
## Mexico-2010 99.23670 0.49485 4.9713346
## Mexico-2015 99.00000 0.42943 2.7023234
## Mexico-2020 99.40000 0.29638 -8.6515868
## United States-2000 100.00000 2.61984 4.0771595
## United States-2010 100.00000 2.71445 2.7088567
## United States-2015 100.00000 2.78700 2.7063696
## United States-2020 100.00000 3.46777 -2.7678025
## SL.UEM.TOTL.ZS SP.POP.GROW
## Argentina-2000 15.000 1.1332770
## Argentina-2010 7.710 0.2555824
## Argentina-2015 7.583 1.0780013
## Argentina-2020 11.460 0.9700540
## China-2000 3.260 0.7879566
## China-2010 4.530 0.4829597
## China-2015 4.650 0.5814561
## China-2020 5.000 0.2380409
## Mexico-2000 2.650 1.5845508
## Mexico-2010 5.300 1.3265790
## Mexico-2015 4.310 1.1670088
## Mexico-2020 4.440 0.7272438
## United States-2000 3.990 1.1127690
## United States-2010 9.630 0.8296167
## United States-2015 5.280 0.7362173
## United States-2020 8.050 0.9643479
# Heterogeneity plot by country
plotmeans(panel$NY.GDP.MKTP.KD.ZG ~ panel$country, xlab="Country", ylab="GDP growth (annual %)",
mean.labels=TRUE, digits=2,
col="red", connect=TRUE, main="Heterogeneity across countries")
# Heterogeneity plot by year
plotmeans(panel$NY.GDP.MKTP.KD.ZG ~ panel$date, xlab="Year", ylab="GDP growth (annual %)",
mean.labels=TRUE, digits=2,
col="red", connect=TRUE, main="Heterogeneity across years")
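plotmeans draws the mean of the outcome within each group. As a rough cross-check, the country means can be computed directly. This sketch retypes the growth rates (rounded to three decimals) from the panel printed above instead of downloading them; the name growth is only illustrative:

```r
# GDP growth (annual %) for 2000, 2010, 2015 and 2020, copied (rounded) from the printed panel
growth <- data.frame(
  country = rep(c("Argentina", "China", "Mexico", "United States"), each = 4),
  value   = c(-0.789, 10.125, 2.731, -9.900,
               8.490, 10.636, 7.041,  2.239,
               5.029,  4.971, 2.702, -8.652,
               4.077,  2.709, 2.706, -2.768)
)

# Mean growth per country: the quantity plotted for each group
country_means <- tapply(growth$value, growth$country, mean)
round(country_means, 2)
```

Since these means are single-digit percentages, labels rounded to thousands would all collapse to zero, which is worth keeping in mind when choosing the digits argument.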
Model
pooled<-plm(NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS+
AG.LND.AGRI.ZS+AG.LND.ARBL.ZS+EG.ELC.ACCS.ZS+SP.POP.GROW+GB.XPD.RSDV.GD.ZS , data=panel ,model = "pooling")
summary(pooled)
## Pooling Model
##
## Call:
## plm(formula = NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS + AG.LND.AGRI.ZS +
## AG.LND.ARBL.ZS + EG.ELC.ACCS.ZS + SP.POP.GROW + GB.XPD.RSDV.GD.ZS,
## data = panel, model = "pooling")
##
## Balanced Panel: n = 4, T = 4, N = 16
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -8.1043 -1.9711 1.2967 2.3740 5.8281
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## (Intercept) 108.49114 161.77765 0.6706 0.5193
## SL.UEM.TOTL.ZS -0.11244 0.70155 -0.1603 0.8762
## AG.LND.AGRI.ZS 1.07183 0.65102 1.6464 0.1341
## AG.LND.ARBL.ZS 2.14456 1.21025 1.7720 0.1102
## EG.ELC.ACCS.ZS -1.82557 1.44565 -1.2628 0.2384
## SP.POP.GROW -4.69836 4.18512 -1.1226 0.2906
## GB.XPD.RSDV.GD.ZS -1.70636 2.19119 -0.7787 0.4561
##
## Total Sum of Squares: 512.67
## Residual Sum of Squares: 236.86
## R-Squared: 0.53799
## Adj. R-Squared: 0.22998
## F-statistic: 1.74666 on 6 and 9 DF, p-value: 0.21687
within<-plm(NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS+
AG.LND.AGRI.ZS+AG.LND.ARBL.ZS+EG.ELC.ACCS.ZS+SP.POP.GROW+GB.XPD.RSDV.GD.ZS , data=panel ,model = "within")
summary(within)
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS + AG.LND.AGRI.ZS +
## AG.LND.ARBL.ZS + EG.ELC.ACCS.ZS + SP.POP.GROW + GB.XPD.RSDV.GD.ZS,
## data = panel, model = "within")
##
## Balanced Panel: n = 4, T = 4, N = 16
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -4.98771 -2.38596 0.41446 1.92653 5.64725
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## SL.UEM.TOTL.ZS -0.60572 0.97119 -0.6237 0.55579
## AG.LND.AGRI.ZS 3.29170 1.43542 2.2932 0.06167 .
## AG.LND.ARBL.ZS 0.47688 2.68288 0.1777 0.86477
## EG.ELC.ACCS.ZS 0.75577 3.26340 0.2316 0.82455
## SP.POP.GROW -3.77042 7.00843 -0.5380 0.60995
## GB.XPD.RSDV.GD.ZS -3.72381 7.02411 -0.5301 0.61505
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 401.22
## Residual Sum of Squares: 144.53
## R-Squared: 0.63977
## Adj. R-Squared: 0.099435
## F-statistic: 1.77603 on 6 and 6 DF, p-value: 0.25126
walhus<-plm(NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS+
AG.LND.AGRI.ZS+AG.LND.ARBL.ZS+EG.ELC.ACCS.ZS+SP.POP.GROW+GB.XPD.RSDV.GD.ZS , data=panel ,model = "random", random.method = "walhus")
summary(walhus)
## Oneway (individual) effect Random Effect Model
## (Wallace-Hussain's transformation)
##
## Call:
## plm(formula = NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS + AG.LND.AGRI.ZS +
## AG.LND.ARBL.ZS + EG.ELC.ACCS.ZS + SP.POP.GROW + GB.XPD.RSDV.GD.ZS,
## data = panel, model = "random", random.method = "walhus")
##
## Balanced Panel: n = 4, T = 4, N = 16
##
## Effects:
## var std.dev share
## idiosyncratic 19.349 4.399 1
## individual 0.000 0.000 0
## theta: 0
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -8.1043 -1.9711 1.2967 2.3740 5.8281
##
## Coefficients:
## Estimate Std. Error z-value Pr(>|z|)
## (Intercept) 108.49114 161.77765 0.6706 0.50246
## SL.UEM.TOTL.ZS -0.11244 0.70155 -0.1603 0.87267
## AG.LND.AGRI.ZS 1.07183 0.65102 1.6464 0.09968 .
## AG.LND.ARBL.ZS 2.14456 1.21025 1.7720 0.07640 .
## EG.ELC.ACCS.ZS -1.82557 1.44565 -1.2628 0.20666
## SP.POP.GROW -4.69836 4.18512 -1.1226 0.26159
## GB.XPD.RSDV.GD.ZS -1.70636 2.19119 -0.7787 0.43614
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 512.67
## Residual Sum of Squares: 236.86
## R-Squared: 0.53799
## Adj. R-Squared: 0.22998
## Chisq: 10.4799 on 6 DF, p-value: 0.10584
nerlove<-plm(NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS+
AG.LND.AGRI.ZS+AG.LND.ARBL.ZS+EG.ELC.ACCS.ZS+SP.POP.GROW+GB.XPD.RSDV.GD.ZS , data=panel ,model = "random", random.method = "nerlove")
summary(nerlove)
## Oneway (individual) effect Random Effect Model
## (Nerlove's transformation)
##
## Call:
## plm(formula = NY.GDP.MKTP.KD.ZG ~ SL.UEM.TOTL.ZS + AG.LND.AGRI.ZS +
## AG.LND.ARBL.ZS + EG.ELC.ACCS.ZS + SP.POP.GROW + GB.XPD.RSDV.GD.ZS,
## data = panel, model = "random", random.method = "nerlove")
##
## Balanced Panel: n = 4, T = 4, N = 16
##
## Effects:
## var std.dev share
## idiosyncratic 9.033 3.006 0.028
## individual 308.516 17.565 0.972
## theta: 0.9148
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -5.44445 -2.22648 0.45774 2.91280 4.41861
##
## Coefficients:
## Estimate Std. Error z-value Pr(>|z|)
## (Intercept) -93.54726 235.48687 -0.3973 0.69118
## SL.UEM.TOTL.ZS -0.47168 0.82125 -0.5743 0.56574
## AG.LND.AGRI.ZS 2.77116 1.11836 2.4779 0.01322 *
## AG.LND.ARBL.ZS 1.39124 1.94063 0.7169 0.47343
## EG.ELC.ACCS.ZS -0.50775 2.27171 -0.2235 0.82314
## SP.POP.GROW -4.94041 5.70743 -0.8656 0.38670
## GB.XPD.RSDV.GD.ZS -1.43001 4.94935 -0.2889 0.77264
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 402.03
## Residual Sum of Squares: 162.37
## R-Squared: 0.59612
## Adj. R-Squared: 0.32687
## Chisq: 13.2839 on 6 DF, p-value: 0.038742
# Libraries
#-----------------------------------
library(plm)
library(car)
## Loading required package: carData
##
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
##
## recode
## The following object is masked from 'package:purrr':
##
## some
#library(ggstatsplot)
library(gplots)
library(foreign)
library(lmtest)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
imp_alc_accide_trans <- read.dta('http://fmwww.bc.edu/ec-p/data/stockwatson/fatality.dta')
imp_alc_accide_trans$tasa_mort <- imp_alc_accide_trans$allmort/imp_alc_accide_trans$pop
imp_alc_accide_trans$logpbi <- log(imp_alc_accide_trans$perinc)
imp_alc_accide_trans$edadminima <- ifelse(imp_alc_accide_trans$mlda<19, 18,
ifelse(imp_alc_accide_trans$mlda>=19 & imp_alc_accide_trans$mlda<20,
19,
ifelse(imp_alc_accide_trans$mlda>=20 & imp_alc_accide_trans$mlda<21,
20,21)))
reg_imp_mort_82 <- lm(tasa_mort ~ beertax,
data = imp_alc_accide_trans[imp_alc_accide_trans$year==1982,])
reg_imp_mort_88 <- lm(tasa_mort ~ beertax,
data = imp_alc_accide_trans[imp_alc_accide_trans$year==1988,])
summary(reg_imp_mort_82)
##
## Call:
## lm(formula = tasa_mort ~ beertax, data = imp_alc_accide_trans[imp_alc_accide_trans$year ==
## 1982, ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.356e-05 -4.480e-05 -1.068e-05 2.295e-05 2.172e-04
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.010e-04 1.391e-05 14.455 <2e-16 ***
## beertax 1.485e-05 1.884e-05 0.788 0.435
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.705e-05 on 46 degrees of freedom
## Multiple R-squared: 0.01332, Adjusted R-squared: -0.008126
## F-statistic: 0.6212 on 1 and 46 DF, p-value: 0.4347
summary(reg_imp_mort_88)
##
## Call:
## lm(formula = tasa_mort ~ beertax, data = imp_alc_accide_trans[imp_alc_accide_trans$year ==
## 1988, ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.293e-05 -3.603e-05 -7.132e-06 3.994e-05 1.358e-04
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.859e-04 1.060e-05 17.540 <2e-16 ***
## beertax 4.388e-05 1.645e-05 2.668 0.0105 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.903e-05 on 46 degrees of freedom
## Multiple R-squared: 0.134, Adjusted R-squared: 0.1152
## F-statistic: 7.118 on 1 and 46 DF, p-value: 0.0105
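# Change in the mortality rate and in the beer tax between 1982 and 1988 (one value per state).
# Caution: each 48-element difference is recycled over all 336 rows, so the regression below
# sees every state 7 times; the coefficients are unaffected, but the standard errors are understated.
imp_alc_accide_trans$dif_mort_88_82 <- imp_alc_accide_trans$tasa_mort[imp_alc_accide_trans$year==1988] -
imp_alc_accide_trans$tasa_mort[imp_alc_accide_trans$year==1982]
imp_alc_accide_trans$dif_imp_88_82 <- imp_alc_accide_trans$beertax[imp_alc_accide_trans$year==1988] -
imp_alc_accide_trans$beertax[imp_alc_accide_trans$year==1982]
reg_dif_88_82 <- lm(dif_mort_88_82 ~ dif_imp_88_82,
data = imp_alc_accide_trans)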
summary(reg_dif_88_82)
##
## Call:
## lm(formula = dif_mort_88_82 ~ dif_imp_88_82, data = imp_alc_accide_trans)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.227e-04 -9.619e-06 9.212e-06 2.229e-05 6.774e-05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -7.204e-06 2.251e-06 -3.201 0.0015 **
## dif_imp_88_82 -1.041e-04 1.548e-05 -6.723 7.72e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.869e-05 on 334 degrees of freedom
## Multiple R-squared: 0.1192, Adjusted R-squared: 0.1166
## F-statistic: 45.2 on 1 and 334 DF, p-value: 7.719e-11
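The regression above reports 334 residual degrees of freedom because the 48 state-level differences were stored back into the 336-row data frame and repeated for every year. A sketch of building one row per unit instead, on simulated data (sim, wide and all numeric values are hypothetical, not the fatality data):

```r
set.seed(1)
# Simulated long panel: 5 units observed in 1982 and 1988 (hypothetical data)
sim <- data.frame(state = rep(1:5, each = 2),
                  year  = rep(c(1982, 1988), times = 5),
                  x     = runif(10))
sim$y <- 2 - 0.5 * sim$x + rnorm(10, sd = 0.1)

# One row per state: put the two years side by side, then difference
wide <- merge(sim[sim$year == 1982, ], sim[sim$year == 1988, ],
              by = "state", suffixes = c("_82", "_88"))
wide$dif_y <- wide$y_88 - wide$y_82
wide$dif_x <- wide$x_88 - wide$x_82

# Regression on the state-level differences: degrees of freedom reflect 5 units
reg_dif <- lm(dif_y ~ dif_x, data = wide)

# Repeating every row 7 times mimics the recycling: same slope, smaller standard error
rep7    <- wide[rep(seq_len(nrow(wide)), times = 7), ]
reg_rep <- lm(dif_y ~ dif_x, data = rep7)
se_dif  <- summary(reg_dif)$coefficients["dif_x", "Std. Error"]
se_rep  <- summary(reg_rep)$coefficients["dif_x", "Std. Error"]
```

Comparing se_dif with se_rep shows how much the repeated rows understate the uncertainty, while coef(reg_dif) and coef(reg_rep) coincide.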
Fixed effects
reg_imp_mort_fe <-plm(tasa_mort ~ beertax,
data=imp_alc_accide_trans,
model = 'within')
summary(reg_imp_mort_fe)
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = tasa_mort ~ beertax, data = imp_alc_accide_trans,
## model = "within")
##
## Balanced Panel: n = 48, T = 7, N = 336
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -5.8696e-05 -8.2838e-06 -1.2701e-07 7.9545e-06 8.9780e-05
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## beertax -6.5587e-05 1.8785e-05 -3.4915 0.000556 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 1.0785e-07
## Residual Sum of Squares: 1.0345e-07
## R-Squared: 0.040745
## Adj. R-Squared: -0.11969
## F-statistic: 12.1904 on 1 and 287 DF, p-value: 0.00055597
Let's run the poolability test, for which we need the regression model estimated over the whole period under study:
reg_imp_mort <- lm(tasa_mort ~ beertax,
data = imp_alc_accide_trans)
pFtest(reg_imp_mort_fe, reg_imp_mort)
##
## F test for individual effects
##
## data: tasa_mort ~ beertax
## F = 52.179, df1 = 47, df2 = 287, p-value < 2.2e-16
## alternative hypothesis: significant effects
The p-value of the poolability test is below 5%, so we reject the pooling hypothesis: the state-specific intercepts differ significantly, and pooled OLS, which ignores the panel structure, is not adequate. We should estimate panel data models.
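The F test above compares the residual sum of squares of pooled OLS with that of the fixed-effects fit. A minimal sketch of that computation on simulated data, using lm with unit dummies in place of plm (all names and numbers here are illustrative):

```r
set.seed(42)
n_units <- 6
n_time  <- 8
sim <- data.frame(id = factor(rep(1:n_units, each = n_time)),
                  x  = rnorm(n_units * n_time))
alpha <- rnorm(n_units, sd = 2)                 # unit-specific intercepts
sim$y <- alpha[as.integer(sim$id)] + 1.5 * sim$x + rnorm(nrow(sim))

pooled_fit <- lm(y ~ x, data = sim)             # restricted: common intercept
lsdv_fit   <- lm(y ~ x + id, data = sim)        # unrestricted: one intercept per unit

rss_r <- sum(resid(pooled_fit)^2)
rss_u <- sum(resid(lsdv_fit)^2)
df1 <- n_units - 1                              # number of restrictions (n - 1)
df2 <- df.residual(lsdv_fit)                    # N - n - K

f_stat  <- ((rss_r - rss_u) / df1) / (rss_u / df2)
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)
```

With 48 states, 336 observations and one regressor, the same formulas give df1 = 47 and df2 = 287, matching the degrees of freedom in the output above.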
reg_imp_mort_re <- plm(tasa_mort ~ beertax,
data=imp_alc_accide_trans,
index=c("state", "year"),
model="random")
summary(reg_imp_mort_re)
## Oneway (individual) effect Random Effect Model
## (Swamy-Arora's transformation)
##
## Call:
## plm(formula = tasa_mort ~ beertax, data = imp_alc_accide_trans,
## model = "random", index = c("state", "year"))
##
## Balanced Panel: n = 48, T = 7, N = 336
##
## Effects:
## var std.dev share
## idiosyncratic 3.605e-10 1.899e-05 0.119
## individual 2.660e-09 5.158e-05 0.881
## theta: 0.8622
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -4.7109e-05 -1.2005e-05 -2.1530e-06 9.1011e-06 9.6435e-05
##
## Coefficients:
## Estimate Std. Error z-value Pr(>|z|)
## (Intercept) 2.0671e-04 9.9971e-06 20.6773 <2e-16 ***
## beertax -5.2016e-06 1.2418e-05 -0.4189 0.6753
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 1.2648e-07
## Residual Sum of Squares: 1.2642e-07
## R-Squared: 0.00052508
## Adj. R-Squared: -0.0024674
## Chisq: 0.175467 on 1 DF, p-value: 0.6753
Hausman test (to decide between fixed and random effects)
phtest(reg_imp_mort_fe, reg_imp_mort_re)
##
## Hausman Test
##
## data: tasa_mort ~ beertax
## chisq = 18.353, df = 1, p-value = 1.835e-05
## alternative hypothesis: one model is inconsistent
p-value = 1.8e-05 < 0.05: we reject H0, so the consistent estimator is the fixed-effects one.
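With a single regressor, the Hausman statistic reduces to the squared gap between the two estimates divided by the difference of their variances. A small sketch that plugs in the beertax estimates and standard errors printed in the two summaries above (the variable names are illustrative):

```r
# Estimates and standard errors of beertax copied from the two summaries above
b_fe <- -6.5587e-05; se_fe <- 1.8785e-05   # within (fixed effects)
b_re <- -5.2016e-06; se_re <- 1.2418e-05   # random effects (Swamy-Arora)

# With one coefficient: H = (b_fe - b_re)^2 / (Var(b_fe) - Var(b_re))
h_stat  <- (b_fe - b_re)^2 / (se_fe^2 - se_re^2)
p_value <- pchisq(h_stat, df = 1, lower.tail = FALSE)
round(h_stat, 2)
```

The result should land close to the chisq = 18.353 reported by phtest, up to the rounding of the printed inputs.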
We estimate the fixed effects for each cross-sectional unit
summary(fixef(reg_imp_mort_fe))
## Estimate Std. Error t-value Pr(>|t|)
## AL 3.4776e-04 3.1336e-05 11.0980 < 2.2e-16 ***
## AZ 2.9099e-04 9.2539e-06 31.4452 < 2.2e-16 ***
## AR 2.8227e-04 1.3212e-05 21.3636 < 2.2e-16 ***
## CA 1.9682e-04 7.4007e-06 26.5943 < 2.2e-16 ***
## CO 1.9934e-04 8.0371e-06 24.8019 < 2.2e-16 ***
## CT 1.6154e-04 8.3913e-06 19.2506 < 2.2e-16 ***
## DE 2.1700e-04 7.7457e-06 28.0159 < 2.2e-16 ***
## FL 3.2095e-04 2.2151e-05 14.4890 < 2.2e-16 ***
## GA 4.0022e-04 4.6403e-05 8.6249 4.435e-16 ***
## ID 2.8086e-04 9.8767e-06 28.4368 < 2.2e-16 ***
## IL 1.5160e-04 7.8478e-06 19.3176 < 2.2e-16 ***
## IN 2.0161e-04 8.8672e-06 22.7364 < 2.2e-16 ***
## IA 1.9337e-04 1.0222e-05 18.9176 < 2.2e-16 ***
## KS 2.2544e-04 1.0863e-05 20.7528 < 2.2e-16 ***
## KY 2.2601e-04 8.0462e-06 28.0893 < 2.2e-16 ***
## LA 2.6305e-04 1.6266e-05 16.1714 < 2.2e-16 ***
## ME 2.3697e-04 1.6006e-05 14.8045 < 2.2e-16 ***
## MD 1.7712e-04 8.2458e-06 21.4800 < 2.2e-16 ***
## MA 1.3679e-04 8.6477e-06 15.8178 < 2.2e-16 ***
## MI 1.9931e-04 1.1663e-05 17.0888 < 2.2e-16 ***
## MN 1.5804e-04 9.3628e-06 16.8797 < 2.2e-16 ***
## MS 3.4485e-04 2.0936e-05 16.4717 < 2.2e-16 ***
## MO 2.1814e-04 9.2523e-06 23.5764 < 2.2e-16 ***
## MT 3.1172e-04 9.4413e-06 33.0170 < 2.2e-16 ***
## NE 1.9555e-04 1.0551e-05 18.5342 < 2.2e-16 ***
## NV 2.8769e-04 8.1056e-06 35.4922 < 2.2e-16 ***
## NH 2.2232e-04 1.4114e-05 15.7512 < 2.2e-16 ***
## NJ 1.3719e-04 7.3328e-06 18.7089 < 2.2e-16 ***
## NM 3.9040e-04 1.0154e-05 38.4492 < 2.2e-16 ***
## NY 1.2910e-04 7.5629e-06 17.0696 < 2.2e-16 ***
## NC 3.1872e-04 2.5173e-05 12.6609 < 2.2e-16 ***
## ND 1.8542e-04 1.0193e-05 18.1912 < 2.2e-16 ***
## OH 1.8032e-04 1.0193e-05 17.6910 < 2.2e-16 ***
## OK 2.9326e-04 1.8429e-05 15.9133 < 2.2e-16 ***
## OR 2.3096e-04 8.1175e-06 28.4526 < 2.2e-16 ***
## PA 1.7102e-04 8.6477e-06 19.7759 < 2.2e-16 ***
## RI 1.2126e-04 7.7533e-06 15.6395 < 2.2e-16 ***
## SC 4.0348e-04 3.5479e-05 11.3724 < 2.2e-16 ***
## SD 2.4739e-04 1.4121e-05 17.5195 < 2.2e-16 ***
## TN 2.6020e-04 9.1624e-06 28.3983 < 2.2e-16 ***
## TX 2.5602e-04 1.0853e-05 23.5889 < 2.2e-16 ***
## UT 2.3137e-04 1.5453e-05 14.9721 < 2.2e-16 ***
## VT 2.5116e-04 1.3973e-05 17.9751 < 2.2e-16 ***
## VA 2.1874e-04 1.4664e-05 14.9170 < 2.2e-16 ***
## WA 1.8181e-04 8.2328e-06 22.0836 < 2.2e-16 ***
## WV 2.5809e-04 1.0767e-05 23.9707 < 2.2e-16 ***
## WI 1.7184e-04 7.7457e-06 22.1848 < 2.2e-16 ***
## WY 3.2491e-04 7.2328e-06 44.9219 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Fixed effects as deviations from their mean
summary(fixef(reg_imp_mort_fe, type ="dmean"))
## Estimate Std. Error t-value Pr(>|t|)
## AL 1.1006e-04 3.1336e-05 3.5121 0.0005161 ***
## AZ 5.3283e-05 9.2539e-06 5.7579 2.188e-08 ***
## AR 4.4560e-05 1.3213e-05 3.3726 0.0008470 ***
## CA -4.0891e-05 7.4007e-06 -5.5254 7.380e-08 ***
## CO -3.8373e-05 8.0371e-06 -4.7744 2.876e-06 ***
## CT -7.6170e-05 8.3913e-06 -9.0773 < 2.2e-16 ***
## DE -2.0705e-05 7.7457e-06 -2.6731 0.0079465 **
## FL 8.3242e-05 2.2151e-05 3.7579 0.0002076 ***
## GA 1.6252e-04 4.6403e-05 3.5023 0.0005348 ***
## ID 4.3153e-05 9.8767e-06 4.3692 1.744e-05 ***
## IL -8.6107e-05 7.8478e-06 -10.9721 < 2.2e-16 ***
## IN -3.6099e-05 8.8672e-06 -4.0710 6.058e-05 ***
## IA -4.4338e-05 1.0222e-05 -4.3376 1.997e-05 ***
## KS -1.2266e-05 1.0863e-05 -1.1291 0.2597802
## KY -1.1696e-05 8.0462e-06 -1.4536 0.1471402
## LA 2.5344e-05 1.6266e-05 1.5581 0.1203234
## ME -7.3922e-07 1.6006e-05 -0.0462 0.9631970
## MD -6.0588e-05 8.2458e-06 -7.3478 2.102e-12 ***
## MA -1.0092e-04 8.6477e-06 -11.6700 < 2.2e-16 ***
## MI -3.8397e-05 1.1663e-05 -3.2922 0.0011185 **
## MN -7.9666e-05 9.3628e-06 -8.5087 9.901e-16 ***
## MS 1.0715e-04 2.0936e-05 5.1178 5.669e-07 ***
## MO -1.9571e-05 9.2523e-06 -2.1152 0.0352734 *
## MT 7.4016e-05 9.4413e-06 7.8396 8.909e-14 ***
## NE -4.2162e-05 1.0551e-05 -3.9962 8.188e-05 ***
## NV 4.9978e-05 8.1056e-06 6.1659 2.373e-09 ***
## NH -1.5390e-05 1.4114e-05 -1.0904 0.2764610
## NJ -1.0052e-04 7.3328e-06 -13.7083 < 2.2e-16 ***
## NM 1.5269e-04 1.0154e-05 15.0382 < 2.2e-16 ***
## NY -1.0861e-04 7.5629e-06 -14.3610 < 2.2e-16 ***
## NC 8.1009e-05 2.5173e-05 3.2180 0.0014386 **
## ND -5.2288e-05 1.0193e-05 -5.1299 5.345e-07 ***
## OH -5.7386e-05 1.0193e-05 -5.6301 4.288e-08 ***
## OK 5.5549e-05 1.8428e-05 3.0143 0.0028056 **
## OR -6.7445e-06 8.1175e-06 -0.8309 0.4067439
## PA -6.6691e-05 8.6477e-06 -7.7120 2.049e-13 ***
## RI -1.1645e-04 7.7533e-06 -15.0194 < 2.2e-16 ***
## SC 1.6577e-04 3.5479e-05 4.6724 4.582e-06 ***
## SD 9.6834e-06 1.4121e-05 0.6858 0.4934227
## TN 2.2490e-05 9.1624e-06 2.4546 0.0146996 *
## TX 1.8308e-05 1.0853e-05 1.6869 0.0927111 .
## UT -6.3395e-06 1.5453e-05 -0.4102 0.6819363
## VT 1.3451e-05 1.3973e-05 0.9627 0.3365174
## VA -1.8963e-05 1.4664e-05 -1.2931 0.1970014
## WA -5.5897e-05 8.2328e-06 -6.7895 6.460e-11 ***
## WV 2.0380e-05 1.0767e-05 1.8929 0.0593813 .
## WI -6.5871e-05 7.7457e-06 -8.5042 1.021e-15 ***
## WY 8.7205e-05 7.2328e-06 12.0568 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
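The individual effects listed above can be read as each unit's mean outcome minus the slope times its mean regressor. A minimal sketch of that idea on simulated data (hypothetical names and values), cross-checked against a regression with one dummy per unit:

```r
set.seed(7)
sim <- data.frame(id = factor(rep(c("A", "B", "C"), each = 5)),
                  x  = rnorm(15))
sim$y <- c(1, 3, 5)[as.integer(sim$id)] - 2 * sim$x + rnorm(15, sd = 0.5)

# Within transformation: subtract each unit's mean from y and x
sim$y_dm <- sim$y - ave(sim$y, sim$id)
sim$x_dm <- sim$x - ave(sim$x, sim$id)
b_within <- unname(coef(lm(y_dm ~ x_dm - 1, data = sim)))

# Individual effects: unit mean of y minus slope times unit mean of x
fe_levels <- tapply(sim$y, sim$id, mean) - b_within * tapply(sim$x, sim$id, mean)
fe_dmean  <- fe_levels - mean(fe_levels)        # deviations from their average

# Cross-check: dummy-variable regression with one intercept per unit
lsdv <- lm(y ~ x + id - 1, data = sim)
```

Here b_within equals the coefficient on x in lsdv, and fe_levels equals its dummy coefficients; fe_dmean is the same set of effects centered at zero.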
Breusch-Godfrey/Wooldridge test for serial correlation
pbgtest(reg_imp_mort_fe)
##
## Breusch-Godfrey/Wooldridge test for serial correlation in panel models
##
## data: tasa_mort ~ beertax
## chisq = 50.668, df = 7, p-value = 1.068e-08
## alternative hypothesis: serial correlation in idiosyncratic errors
p-value = 1.1e-08 < 0.05: we reject H0. The errors are serially correlated.
It is therefore advisable to correct the inference with a robust covariance matrix. Serial correlation does not change the estimated coefficients, but it does affect the standard errors and hence the significance of the coefficients. The call below uses vcovHC, which for plm models defaults to the Arellano estimator clustered by group and is robust to heteroskedasticity and serial correlation (a Newey-West correction is available in plm as vcovNW). In R we run:
coeftest(reg_imp_mort_fe, vcov. = vcovHC)
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## beertax -6.5587e-05 2.8837e-05 -2.2744 0.02368 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The beer tax is significant at the 5% level.
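The idea behind group-clustered (Arellano-type) standard errors is a sandwich formula in which the score contributions are summed within each unit before being squared. A rough base-R sketch for one regressor on simulated data, without any small-sample correction (all names and values are hypothetical, and this is not a reproduction of plm's internals):

```r
set.seed(99)
n_units <- 20
n_time  <- 6
id <- rep(1:n_units, each = n_time)
x  <- rnorm(n_units * n_time)
# Unit-specific slopes make the composite error heteroskedastic and correlated within unit
slope_i <- rep(rnorm(n_units), each = n_time)
y <- 1 + (0.5 + slope_i) * x + rnorm(n_units * n_time)

# Within transformation, then OLS slope and residuals
x_dm <- x - ave(x, id)
y_dm <- y - ave(y, id)
b    <- sum(x_dm * y_dm) / sum(x_dm^2)
u    <- y_dm - b * x_dm

# Sandwich: bread is (X'X)^-1, meat sums (X_i' u_i)^2 over units
bread <- 1 / sum(x_dm^2)
meat  <- sum(tapply(x_dm * u, id, sum)^2)
se_cluster <- sqrt(bread * meat * bread)

# Conventional standard error for comparison (assumes i.i.d. errors)
se_plain <- sqrt(sum(u^2) / (length(u) - n_units - 1) * bread)
```

The point estimate b is the same under both approaches; only the standard error changes, which is why the coefficient in the robust table above is identical to the one in the original summary.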
Cross-sectional dependence test
pcdtest(reg_imp_mort_fe)
##
## Pesaran CD test for cross-sectional dependence in panels
##
## data: tasa_mort ~ beertax
## z = 5.436, p-value = 5.449e-08
## alternative hypothesis: cross-sectional dependence
p-value = 5.4e-08 < 0.05: we reject H0. There is dependence across the cross-sectional units, so we recompute the standard errors clustering by time period.
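For a balanced panel, the Pesaran CD statistic scales the sum of the pairwise residual correlations between units by sqrt(2T / (N(N-1))) and compares it with a standard normal. A minimal sketch on simulated residuals (the function name pesaran_cd and the data are hypothetical):

```r
set.seed(123)
n_units <- 10
n_time  <- 50

# CD statistic for a T x N matrix of residuals (balanced panel)
pesaran_cd <- function(res) {
  n_t <- nrow(res)
  n_u <- ncol(res)
  rho <- cor(res)                          # pairwise correlations between units
  sqrt(2 * n_t / (n_u * (n_u - 1))) * sum(rho[upper.tri(rho)])
}

# Hypothetical residuals: independent units vs. units sharing a common shock
res_indep  <- matrix(rnorm(n_time * n_units), n_time, n_units)
res_common <- res_indep + rnorm(n_time)    # same shock added to every unit in each period

cd_indep  <- pesaran_cd(res_indep)
cd_common <- pesaran_cd(res_common)
p_common  <- 2 * pnorm(abs(cd_common), lower.tail = FALSE)
```

A shock shared by all units in a period pushes the pairwise correlations up together, so cd_common ends up far in the tail while cd_indep behaves like a draw from a standard normal.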
coeftest(reg_imp_mort_fe, vcov. = vcovHC, cluster = 'time')
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## beertax -6.5587e-05 9.4573e-06 -6.9351 2.691e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The beer tax remains significant at the 5% level.
# Driscoll-Kraay standard errors, robust to cross-sectional and serial correlation
coeftest(reg_imp_mort_fe, vcov. = vcovSCC)
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## beertax -6.5587e-05 1.0628e-05 -6.1712 2.303e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Panel <- read.dta("http://dss.princeton.edu/training/Panel101.dta")
colnames(Panel)
## [1] "country" "year" "y" "y_bin" "x1" "x2" "x3"
## [8] "opinion" "op"
OLS regression
ols <- lm(y ~x1, data = Panel)
summary(ols)
##
## Call:
## lm(formula = y ~ x1, data = Panel)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.546e+09 -1.578e+09 1.554e+08 1.422e+09 7.183e+09
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.524e+09 6.211e+08 2.454 0.0167 *
## x1 4.950e+08 7.789e+08 0.636 0.5272
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.028e+09 on 68 degrees of freedom
## Multiple R-squared: 0.005905, Adjusted R-squared: -0.008714
## F-statistic: 0.4039 on 1 and 68 DF, p-value: 0.5272
Fixed effects
pan_fe <-plm(y ~ x1, data=Panel, model = 'within')
summary(pan_fe)
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = y ~ x1, data = Panel, model = "within")
##
## Balanced Panel: n = 7, T = 10, N = 70
##
## Residuals:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -8.63e+09 -9.70e+08 5.40e+08 0.00e+00 1.39e+09 5.61e+09
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## x1 2475617827 1106675594 2.237 0.02889 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 5.2364e+20
## Residual Sum of Squares: 4.8454e+20
## R-Squared: 0.074684
## Adj. R-Squared: -0.029788
## F-statistic: 5.00411 on 1 and 62 DF, p-value: 0.028892
# Least-squares dummy variable (LSDV) model: one intercept per country
fe_dum <- lm(y ~ x1 + factor(country) - 1, data=Panel)
Panel$yhat <- fitted(fe_dum) # fitted values of y
scatterplot(yhat~x1|country, data=Panel, boxplots=FALSE,
xlab="x1", ylab="yhat",smooth=FALSE)
abline(lm(y~x1, data = Panel),lwd=3, col="red") # pooled OLS line for comparison
Poolability test
pFtest(pan_fe, ols)
##
## F test for individual effects
##
## data: y ~ x1
## F = 2.9655, df1 = 6, df2 = 62, p-value = 0.01307
## alternative hypothesis: significant effects
p-value = 0.013 < 0.05: we reject H0. The fixed effects are significant; that is, the intercept differs across countries.
We fit a random-effects model and compare it with the fixed-effects model using the Hausman test:
pan_re <- plm(y ~ x1, data=Panel,
index=c("country", "year"), model="random")
phtest(pan_fe, pan_re)
##
## Hausman Test
##
## data: y ~ x1
## chisq = 3.674, df = 1, p-value = 0.05527
## alternative hypothesis: one model is inconsistent
p-value = 0.055 > 0.05: we narrowly fail to reject H0 at the 5% level, so we use the random-effects model.
Breusch-Godfrey/Wooldridge test for serial correlation
pbgtest(pan_re)
##
## Breusch-Godfrey/Wooldridge test for serial correlation in panel models
##
## data: y ~ x1
## chisq = 10.274, df = 10, p-value = 0.4168
## alternative hypothesis: serial correlation in idiosyncratic errors
p-value = 0.42 > 0.05: we fail to reject H0. There is no evidence of serial correlation.
Cross-sectional dependence test
pcdtest(pan_re)
##
## Pesaran CD test for cross-sectional dependence in panels
##
## data: y ~ x1
## z = 1.5264, p-value = 0.1269
## alternative hypothesis: cross-sectional dependence
p-value = 0.13 > 0.05: we fail to reject H0. There is no evidence of dependence across the cross-sectional units.