Panel Data

**Traffic Fatality Data:** Data for \(n=48\) (state) entities, where each entity is observed in \(T=7\) (year) time periods, for total of 336 observations

library(haven)
fatality <- read_dta("http://wps.aw.com/wps/media/objects/11422/11696965/datasets3e/datasets/fatality.dta")
data <- fatality %>% mutate(vfrall=10000*mrall)%>%
  mutate( da18 = (mlda<19))%>%
  mutate( da19 = (mlda>=19)*(mlda<20))%>%
  mutate( da20 = (mlda>=20)*(mlda<21))%>%
  mutate( da21 = (mlda>=21))%>%
  mutate( incperc = perinc/1000)%>%
  mutate(lincperc = log(incperc))%>%
  mutate( vmilespd = vmiles/1000)%>%
  mutate( frmall = mrall/(vmiles/100000))%>%
  mutate( jailcom = ((jaild+comserd)>0))
data %>% filter(year==1982 | year==1988) %>%
  group_by(year) %>%
  ggplot(., aes(beertax,vfrall)) + geom_point(color="aquamarine4", size = 2) +
  facet_wrap(~year, nrow=1) +
  labs(x="Beer Tax", y="Fatality Rate") +
  stat_smooth(method="lm", se=FALSE) + geom_point()

Figure 10.1 \[\widehat{FatalityRate}_{1982}=2.01^{***}+0.15BeerTax_{1982}\] \[\widehat{FatalityRate}_{1988}=1.86^{***}+0.44^{*}BeerTax_{1988}\] Both plot shows positive relationship between the fatality rate and the beer tax. Higher beer taxes are associated with more traffic fatalities, obviously false.

coef_df <- function(x) {
  sc <- coef(summary(x))
  colnames(sc) <- c("est", "se", "t", "P")
  data.frame(coef=rownames(sc), sc)
}
data %>% filter(year==1982 | year==1988) %>%
  group_by(year) %>%
  do(mod = lm(vfrall ~ beertax, data = .)) %>% 
  do(coef_df(.$mod))
## Source: local data frame [4 x 5]
## Groups: <by row>
## 
## # A tibble: 4 × 5
##          coef       est        se          t            P
## *      <fctr>     <dbl>     <dbl>      <dbl>        <dbl>
## 1 (Intercept) 2.0103813 0.1390785 14.4550158 1.010417e-18
## 2     beertax 0.1484604 0.1883681  0.7881395 4.346580e-01
## 3 (Intercept) 1.8590729 0.1059887 17.5402993 5.284569e-22
## 4     beertax 0.4387546 0.1644538  2.6679501 1.050311e-02

Panel with two time periods

\[\widehat{FatalityRate}_{it}=\beta_0+\beta_1BeerTax_{it}+\beta_2Z_i+u_{it}\]where \(Z_i\) is a variable that determines the fatality rate in the \(i\)th state

 diff <- data %>% 
  filter(year==1982 | year==1988) %>%
  mutate(dbtax=beertax-lag(beertax), dvfrall=vfrall-lag(vfrall)) %>%
  filter(year==1988) %>%
  select(dbtax, dvfrall)
coef_df(lm(dvfrall~dbtax, data=diff))
##                    coef        est         se         t          P
## (Intercept) (Intercept) -0.0720371 0.06064401 -1.187868 0.24098262
## dbtax             dbtax -1.0409726 0.41722785 -2.494974 0.01624848
ggplot(diff, aes(dbtax, dvfrall)) + geom_point() +
  labs(x="Change in Beer Tax", y="Change in Fatality Rate") +
  stat_smooth(method="lm", se=FALSE) 

\[\widehat{\Delta FatalityRate}_{88}=-0.072 -1.041^{*} \Delta BeerTax_{88}\] where \(\Delta BeerTax_{88}=BeerTax_{1988}-Beertax{1982}\)

Estimated effect of a change in the real beer tax is negative. Increase in the real beer tax by 1 dollar per case reduces the traffic fatality rate by 1.04 deaths per 10000 people.

summary(data$vfrall)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.8212  1.6240  1.9560  2.0400  2.4180  4.2180

Average fatality rate is about 2 in whole sample. Estimated effect is quite large.

Fixed Effects

\[Y_{it}=\beta_0+\beta_1 X_{it}+\beta_2Z_i+u_{it}\] where $Z_i $ is unobserved varibale that varies from one state to the next but does not change over time. We want to estimate \(\beta_1\). \[Y_{it}=\alpha_i+\beta_1 X_{it}+u_{it}\]where \(\alpha_i=\beta_0+\beta_2Z_i\)

LS dummy variable estimation

\[D_{ij}=\begin{cases}1&i=j\\0&else \end{cases}\]so \[Y_{it}=\beta_0+\beta_1 X_{it}+\sum_{j=2}^{n}\gamma_kD_{i,k}+u_{it}\]

Within group(entity demeaned)

First Difference

library(plm)
fixed <- plm(vfrall ~ beertax, data = data, index=c("state", "year"), model="within")
summary(fixed)
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = vfrall ~ beertax, data = data, model = "within", 
##     index = c("state", "year"))
## 
## Balanced Panel: n=48, T=7, N=336
## 
## Residuals :
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -0.58700 -0.08280 -0.00127  0.07950  0.89800 
## 
## Coefficients :
##         Estimate Std. Error t-value Pr(>|t|)    
## beertax -0.65587    0.18785 -3.4915 0.000556 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    10.785
## Residual Sum of Squares: 10.345
## R-Squared:      0.040745
## Adj. R-Squared: 0.034803
## F-statistic: 12.1904 on 1 and 287 DF, p-value: 0.00055597

Fixed time and individ effects

Can be carried out through specification effect="twoways"

library(plm)
fixed2ways <- plm(vfrall ~ beertax, data = data, index=c("state", "year"), model="within", effect="twoways")
summary(fixed2ways)
## Twoways effects Within Model
## 
## Call:
## plm(formula = vfrall ~ beertax, data = data, effect = "twoways", 
##     model = "within", index = c("state", "year"))
## 
## Balanced Panel: n=48, T=7, N=336
## 
## Residuals :
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -0.59600 -0.08100  0.00143  0.08230  0.83900 
## 
## Coefficients :
##         Estimate Std. Error t-value Pr(>|t|)   
## beertax -0.63998    0.19738 -3.2424 0.001328 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    10.29
## Residual Sum of Squares: 9.9193
## R-Squared:      0.036065
## Adj. R-Squared: 0.030161
## F-statistic: 10.5133 on 1 and 281 DF, p-value: 0.001328

Including time effect has little impact.

Full fledged empirical investigation

Alcohol taxes are only one way to discourage drinking. Other potential omitted vars

models <- list (column1=list(formula = vfrall~beertax, 
                             model   = "pooling",
                             effect  = NULL),
                column2=list(formula = vfrall~beertax,
                             model   = "within",
                             effect  = "individual"),
                column3=list(formula = vfrall~beertax,
                             model   = "within",
                             effect  = "twoways"),
                column4=list(formula = vfrall~beertax+da18+da19+da20+jailcom+vmilespd+unrate+lincperc,
                             model   = "within",
                             effect  = "twoways"),
                column5=list(formula = vfrall~beertax+da18+da19+da20+jailcom+vmilespd,
                             model   = "within",
                             effect  = "twoways"),
                column6=list(formula = vfrall~beertax+mlda+jailcom+vmilespd+unrate+lincperc,
                             model   = "within",
                             effect  = "twoways"))

In order to pass the parameter values to lapply we need following wrap function

fitplm <- function(l,data=data){
  lm1 <- plm(formula = l$formula, 
             data=data, 
             index=c("state", "year"),
             model=l$model,
             effect=l$effect)
  return(lm1)
}
Table10 <- lapply(models, FUN=fitplm, data)

To extract coefficients

lapply(Table10, coef_df)
## $column1
##                    coef       est         se         t             P
## (Intercept) (Intercept) 1.8533079 0.04356713 42.539126 6.945570e-137
## beertax         beertax 0.3646054 0.06216983  5.864668  1.082171e-08
## 
## $column2
##            coef        est      se         t            P
## beertax beertax -0.6558736 0.18785 -3.491475 0.0005559707
## 
## $column3
##            coef        est        se         t          P
## beertax beertax -0.6399799 0.1973768 -3.242427 0.00132804
## 
## $column4
##                    coef          est          se          t            P
## beertax         beertax -0.445344278 0.168742644 -2.6391922 8.788044e-03
## da18TRUE       da18TRUE  0.028445686 0.065755403  0.4325985 6.656480e-01
## da19               da19 -0.017953134 0.039940124 -0.4495012 6.534268e-01
## da20               da20  0.032021278 0.045107845  0.7098827 4.783828e-01
## jailcomTRUE jailcomTRUE  0.038330985 0.059860503  0.6403385 5.224896e-01
## vmilespd       vmilespd  0.008228133 0.008821499  0.9327364 3.517803e-01
## unrate           unrate -0.063259817 0.011159051 -5.6689243 3.650903e-08
## lincperc       lincperc  1.815770633 0.380139170  4.7765944 2.913626e-06
## 
## $column5
##                    coef         est         se          t            P
## beertax         beertax -0.68992185 0.20029324 -3.4445589 0.0006612469
## da18TRUE       da18TRUE -0.01037189 0.07898700 -0.1313113 0.8956250632
## da19               da19 -0.07570187 0.04723572 -1.6026404 0.1101620173
## da20               da20 -0.10014508 0.05136217 -1.9497829 0.0522181705
## jailcomTRUE jailcomTRUE  0.08528440 0.07172473  1.1890515 0.2354450999
## vmilespd       vmilespd  0.01743608 0.01056823  1.6498587 0.1001135286
## 
## $column6
##                    coef          est          se          t            P
## beertax         beertax -0.456466622 0.166021090 -2.7494496 6.365104e-03
## mlda               mlda -0.002156749 0.017761354 -0.1214293 9.034397e-01
## jailcomTRUE jailcomTRUE  0.038981429 0.059683119  0.6531399 5.142117e-01
## vmilespd       vmilespd  0.008978657 0.008738391  1.0274954 3.050901e-01
## unrate           unrate -0.062694409 0.011096757 -5.6497955 4.008737e-08
## lincperc       lincperc  1.786435571 0.362595093  4.9268057 1.445374e-06