Econometrics with R

regression with panel data
Author
Affiliation

HKUST, SOSC

Published

October 8, 2022

1 Load Packages And The Dataset

Code
require("pacman")
p_load(AER, tidyverse, ggthemes, plm)

data(Fatalities)

2 Traffic Deaths And Alcohol Taxes

If there are effects of alcohol taxes and drunk driving laws on road fatalities and, if present, how strong these effects are.

2.1 regression

Code
Fatalities %>% 
  mutate(fatal_rate = fatal / pop * 10000) -> data
Code
data %>% 
  filter(year %in% c(1982, 1988)) %>% 
  nest(data = - year) %>% 
  mutate(
    fit = map(data, ~ lm(fatal_rate ~ beertax, .)),
    ttest = map(fit, ~ coeftest(., vcov. = vcovHC, type = "HC1"))
    ) -> model_82_88
Code
model_82_88$ttest[1]
[[1]]

t test of coefficients:

            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.01038    0.14957 13.4408   <2e-16 ***
beertax      0.14846    0.13261  1.1196   0.2687    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
model_82_88$ttest[2]
[[1]]

t test of coefficients:

            Estimate Std. Error t value  Pr(>|t|)    
(Intercept)  1.85907    0.11461 16.2205 < 2.2e-16 ***
beertax      0.43875    0.12786  3.4314  0.001279 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
data %>% 
  filter(year == 1982) %>% 
  ggplot(aes(beertax, fatal_rate)) +
  geom_point() +
  geom_smooth(method = "lm")
`geom_smooth()` using formula 'y ~ x'

Code
data %>% 
  filter(year == 1988) %>% 
  ggplot(aes(beertax, fatal_rate)) +
  geom_point() +
  geom_smooth(method = "lm")
`geom_smooth()` using formula 'y ~ x'

The regression result is contrary to our expectations.

2.2 before and after comparisons

\(FatalityRate_{i,t}\) = \(\beta_{0}\) + \(\beta_{1}\)\(BeerTax_{i,t}\) + \(\beta_{2}\)\(Z_{i}\) + \(\mu_{i,t}\)

Z are specific characteristics that differ bettween states but are constant over time.

\(FatalityRate_{i,1982}\) = \(\beta_{0}\) + \(\beta_{1}\)\(BeerTax_{i,1982}\) + \(\beta_{2}\)\(Z_{i}\) + \(\mu_{i,1982}\)

\(FatalityRate_{i,1988}\) = \(\beta_{0}\) + \(\beta_{1}\)\(BeerTax_{i,1988}\) + \(\beta_{2}\)\(Z_{i}\) + \(\mu_{i,1988}\)

We can eliminate the \(Z_{i}\) by regressing the difference in the fatality rate between 1988 and 1982 on the difference in beer tax between those years.

Code
data %>% 
  filter(year %in% c(1982, 1988)) %>% 
  nest(-state) %>% 
  mutate(
    diff_fatal_rate = map(data, ~ filter(., year == 1988)$fatal_rate - filter(., year == 1982)$fatal_rate) |> unlist(),
    diff_beertax = map(data, ~ filter(., year == 1988)$beertax - filter(., year == 1982)$beertax) |> unlist()
    ) -> diff_data
Code
diff_data %>% 
  lm(diff_fatal_rate ~ diff_beertax, .) %>% 
  coeftest(., vcov. = vcovHC, type = "HC1")

t test of coefficients:

              Estimate Std. Error t value Pr(>|t|)   
(Intercept)  -0.072037   0.065355 -1.1022 0.276091   
diff_beertax -1.040973   0.355006 -2.9323 0.005229 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
diff_data %>% 
  ggplot(aes(diff_beertax, diff_fatal_rate)) +
  geom_point(color = "blue") +
  geom_smooth(method = "lm", color = "red") +
  xlab("Change in beer tax") +
  ylab("Change in fataility rate") +
  theme_gdocs()
`geom_smooth()` using formula 'y ~ x'

The estimated coefficient on beer tax is now negative and significantly different from zero at 5%. It means that raising the beer tax by $1 causes traffic fatalities to decrease by 1.04 per 10000 people. This outcome is likely to be a consequence of omitting factors in the single-year regression.

2.3 fixed effects regression

\(Y_{it}\) = \(\beta_{0}\) + \(\beta_{1}\)\(X_{it}\) + \(\beta_{2}\)\(Z_{i}\) + \(\mu_{it}\)

\(Z_{i}\) are unobserved time-invariant heterogeneities across the entities \(i = x_{1}, x_{2}, \dots, x_{n}\). We aim to estimate \(\beta_{1}\), the effect on \(Y_{i}\) of a change in \(X_{i}\) holding constant \(Z_{i}\).

Code
data %>% 
  lm(fatal_rate ~ beertax + state - 1, .) -> fatal_fe_lm_mod 

summary(fatal_fe_lm_mod)

Call:
lm(formula = fatal_rate ~ beertax + state - 1, data = .)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.58696 -0.08284 -0.00127  0.07955  0.89780 

Coefficients:
        Estimate Std. Error t value Pr(>|t|)    
beertax -0.65587    0.18785  -3.491 0.000556 ***
stateal  3.47763    0.31336  11.098  < 2e-16 ***
stateaz  2.90990    0.09254  31.445  < 2e-16 ***
statear  2.82268    0.13213  21.364  < 2e-16 ***
stateca  1.96816    0.07401  26.594  < 2e-16 ***
stateco  1.99335    0.08037  24.802  < 2e-16 ***
statect  1.61537    0.08391  19.251  < 2e-16 ***
statede  2.17003    0.07746  28.016  < 2e-16 ***
statefl  3.20950    0.22151  14.489  < 2e-16 ***
statega  4.00223    0.46403   8.625 4.43e-16 ***
stateid  2.80861    0.09877  28.437  < 2e-16 ***
stateil  1.51601    0.07848  19.318  < 2e-16 ***
statein  2.01609    0.08867  22.736  < 2e-16 ***
stateia  1.93370    0.10222  18.918  < 2e-16 ***
stateks  2.25441    0.10863  20.753  < 2e-16 ***
stateky  2.26011    0.08046  28.089  < 2e-16 ***
statela  2.63051    0.16266  16.171  < 2e-16 ***
stateme  2.36968    0.16006  14.805  < 2e-16 ***
statemd  1.77119    0.08246  21.480  < 2e-16 ***
statema  1.36788    0.08648  15.818  < 2e-16 ***
statemi  1.99310    0.11663  17.089  < 2e-16 ***
statemn  1.58042    0.09363  16.880  < 2e-16 ***
statems  3.44855    0.20936  16.472  < 2e-16 ***
statemo  2.18137    0.09252  23.576  < 2e-16 ***
statemt  3.11724    0.09441  33.017  < 2e-16 ***
statene  1.95545    0.10551  18.534  < 2e-16 ***
statenv  2.87686    0.08106  35.492  < 2e-16 ***
statenh  2.22318    0.14114  15.751  < 2e-16 ***
statenj  1.37188    0.07333  18.709  < 2e-16 ***
statenm  3.90401    0.10154  38.449  < 2e-16 ***
stateny  1.29096    0.07563  17.070  < 2e-16 ***
statenc  3.18717    0.25173  12.661  < 2e-16 ***
statend  1.85419    0.10193  18.191  < 2e-16 ***
stateoh  1.80321    0.10193  17.691  < 2e-16 ***
stateok  2.93257    0.18428  15.913  < 2e-16 ***
stateor  2.30963    0.08117  28.453  < 2e-16 ***
statepa  1.71016    0.08648  19.776  < 2e-16 ***
stateri  1.21258    0.07753  15.640  < 2e-16 ***
statesc  4.03480    0.35479  11.372  < 2e-16 ***
statesd  2.47391    0.14121  17.519  < 2e-16 ***
statetn  2.60197    0.09162  28.398  < 2e-16 ***
statetx  2.56016    0.10853  23.589  < 2e-16 ***
stateut  2.31368    0.15453  14.972  < 2e-16 ***
statevt  2.51159    0.13973  17.975  < 2e-16 ***
stateva  2.18745    0.14664  14.917  < 2e-16 ***
statewa  1.81811    0.08233  22.084  < 2e-16 ***
statewv  2.58088    0.10767  23.971  < 2e-16 ***
statewi  1.71836    0.07746  22.185  < 2e-16 ***
statewy  3.24913    0.07233  44.922  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1899 on 287 degrees of freedom
Multiple R-squared:  0.9931,    Adjusted R-squared:  0.992 
F-statistic: 847.8 on 49 and 287 DF,  p-value: < 2.2e-16
Code
data %>% 
  mutate(fatal_rate_mean = fatal_rate - ave(fatal_rate, state), 
         beertax = beertax - ave(beertax, state)) %>% 
  lm(fatal_rate_mean ~ beertax - 1, .) %>% 
  coeftest(., vcov. = vcovHC, type = "HC1")

t test of coefficients:

        Estimate Std. Error t value  Pr(>|t|)    
beertax -0.65587    0.18815 -3.4858 0.0005559 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
data %>% 
  plm(fatal_rate ~ beertax, ., index = c("state", "year"), model = "within") %>% 
  coeftest(., vcov. = vcovHC, type = "HC1")

t test of coefficients:

        Estimate Std. Error t value Pr(>|t|)  
beertax -0.65587    0.28880  -2.271  0.02388 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
data %>% 
  plm(fatal_rate ~ beertax, ., index = c("state", "year"), model = "within", effect = "twoways") %>% 
  coeftest(., vcov. = vcovHC, type = "HC1")

t test of coefficients:

        Estimate Std. Error t value Pr(>|t|)  
beertax -0.63998    0.35015 -1.8277  0.06865 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1