Lecture 4 - Panel Data

EC3133

Introduction

Application: Traffic Deaths and Alcohol Taxes

## [1] TRUE
## [1] 336  34
## Classes 'pdata.frame' and 'data.frame':  336 obs. of  34 variables:
##  $ state       : Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ year        : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ spirits     : 'pseries' Named num  1.37 1.36 1.32 1.28 1.23 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ unemp       : 'pseries' Named num  14.4 13.7 11.1 8.9 9.8 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ income      : 'pseries' Named num  10544 10733 11109 11333 11662 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ emppop      : 'pseries' Named num  50.7 52.1 54.2 55.3 56.5 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ beertax     : 'pseries' Named num  1.54 1.79 1.71 1.65 1.61 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ baptist     : 'pseries' Named num  30.4 30.3 30.3 30.3 30.3 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ mormon      : 'pseries' Named num  0.328 0.343 0.359 0.376 0.393 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ drinkage    : 'pseries' Named num  19 19 19 19.7 21 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ dry         : 'pseries' Named num  25 23 24 23.6 23.5 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ youngdrivers: 'pseries' Named num  0.212 0.211 0.211 0.211 0.213 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ miles       : 'pseries' Named num  7234 7836 8263 8727 8953 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ breath      : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ jail        : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 2 2 2 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ service     : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 2 2 2 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ fatal       : 'pseries' Named int  839 930 932 882 1081 1110 1023 724 675 869 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ nfatal      : 'pseries' Named int  146 154 165 146 172 181 139 131 112 149 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ sfatal      : 'pseries' Named int  99 98 94 98 119 114 89 76 60 81 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ fatal1517   : 'pseries' Named int  53 71 49 66 82 94 66 40 40 51 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ nfatal1517  : 'pseries' Named int  9 8 7 9 10 11 8 7 7 8 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ fatal1820   : 'pseries' Named int  99 108 103 100 120 127 105 81 83 118 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ nfatal1820  : 'pseries' Named int  34 26 25 23 23 31 24 16 19 34 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ fatal2124   : 'pseries' Named int  120 124 118 114 119 138 123 96 80 123 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ nfatal2124  : 'pseries' Named int  32 35 34 45 29 30 25 36 17 33 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ afatal      : 'pseries' Named num  309 342 305 277 361 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ pop         : 'pseries' Named num  3942002 3960008 3988992 4021008 4049994 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ pop1517     : 'pseries' Named num  209000 202000 197000 195000 204000 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ pop1820     : 'pseries' Named num  221553 219125 216724 214349 212000 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ pop2124     : 'pseries' Named num  290000 290000 288000 284000 263000 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ milestot    : 'pseries' Named num  28516 31032 32961 35091 36259 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ unempus     : 'pseries' Named num  9.7 9.6 7.5 7.2 7 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ emppopus    : 'pseries' Named num  57.8 57.9 59.5 60.1 60.7 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  $ gsp         : 'pseries' Named num  -0.0221 0.0466 0.0628 0.0275 0.0321 ...
##   ..- attr(*, "names")= chr [1:336] "al-1982" "al-1983" "al-1984" "al-1985" ...
##   ..- attr(*, "index")=Classes 'pindex' and 'data.frame':    336 obs. of  2 variables:
##   .. ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   .. ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...
##  - attr(*, "index")=Classes 'pindex' and 'data.frame':   336 obs. of  2 variables:
##   ..$ state: Factor w/ 48 levels "al","az","ar",..: 1 1 1 1 1 1 1 2 2 2 ...
##   ..$ year : Factor w/ 7 levels "1982","1983",..: 1 2 3 4 5 6 7 1 2 3 ...

The following provides a summary of the state and year variables:

##      state       year   
##  al     :  7   1982:48  
##  az     :  7   1983:48  
##  ar     :  7   1984:48  
##  ca     :  7   1985:48  
##  co     :  7   1986:48  
##  ct     :  7   1987:48  
##  (Other):294   1988:48

The variable state is a factor variable with 48 levels (one for each of the 48 contiguous US states) and the variable year takes 7 values. This gives \(7 \times 48 = 336\) observations in total. Because all the variables are observed for all entities and over all time periods, we say that this panel is balanced. If there were missing data for at least one entity in at least one time period, the panel would be unbalanced.
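Whether a panel is balanced can be checked by counting the observations in each entity-year cell. The following is a minimal sketch in base R on a small made-up panel; the data frame `toy`, its columns and the helper `is_balanced()` are purely illustrative and not part of the Fatalities data.

```r
# Toy panel: 3 entities observed over 4 years (illustrative data only)
toy <- expand.grid(year = 2001:2004, id = c("a", "b", "c"))

# Hypothetical helper: a panel is balanced if every entity-year
# combination occurs exactly once
is_balanced <- function(df) {
  cells <- table(df$id, df$year)
  all(cells == 1)
}

is_balanced(toy)        # full panel with 3 x 4 = 12 rows
is_balanced(toy[-1, ])  # same panel with one entity-year row dropped
```

For panel data handled with the plm package, `is.pbalanced()` offers a comparable check.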

Now we will look at the fatality rates for two different years.

We will look at the following estimated regression functions: \[ \begin{aligned} \widehat{FatalityRate} &= 2.01 + 0.15 \times BeerTax \quad \text{(1982 data)} \\ \widehat{FatalityRate} &= 1.86 + 0.44 \times BeerTax \quad \text{(1988 data)} \end{aligned} \]

Run the code below to see these estimation results:
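The chunk that produced this output is not reproduced in these notes. A sketch that should give comparable results is shown here under several assumptions: the data come from the AER package, the fatality rate is defined as fatalities per 10,000 inhabitants, and the robust standard errors are of type HC1. The object names are illustrative.

```r
# Assumes the packages AER (data), lmtest and sandwich are installed.
pkgs <- c("AER", "lmtest", "sandwich")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs) {
  data("Fatalities", package = "AER")

  # Assumed definition: traffic fatalities per 10,000 inhabitants
  Fatalities$fatal_rate <- Fatalities$fatal / Fatalities$pop * 10000

  # Separate cross-sectional regressions for 1982 and 1988
  fatal1982_mod <- lm(fatal_rate ~ beertax,
                      data = subset(Fatalities, year == "1982"))
  fatal1988_mod <- lm(fatal_rate ~ beertax,
                      data = subset(Fatalities, year == "1988"))

  # Coefficient tests with heteroskedasticity-robust (HC1) standard errors
  print(lmtest::coeftest(fatal1982_mod,
                         vcov. = sandwich::vcovHC(fatal1982_mod, type = "HC1")))
  print(lmtest::coeftest(fatal1988_mod,
                         vcov. = sandwich::vcovHC(fatal1988_mod, type = "HC1")))
}
```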

## 
## t test of coefficients:
## 
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.01038    0.14957 13.4408   <2e-16 ***
## beertax      0.14846    0.13261  1.1196   0.2687    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## t test of coefficients:
## 
##             Estimate Std. Error t value  Pr(>|t|)    
## (Intercept)  1.85907    0.11461 16.2205 < 2.2e-16 ***
## beertax      0.43875    0.12786  3.4314  0.001279 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

“Before” and “After” Comparisons
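Suppose the fatality rate depends on the beer tax and on state-specific factors \(Z_i\) that do not change between 1982 and 1988. Under that assumption, the differencing argument can be sketched as follows:

\[ \begin{aligned} FatalityRate_{i1982} &= \beta_0 + \beta_1 BeerTax_{i1982} + \beta_2 Z_i + u_{i1982} \\ FatalityRate_{i1988} &= \beta_0 + \beta_1 BeerTax_{i1988} + \beta_2 Z_i + u_{i1988} \end{aligned} \]

Subtracting the first equation from the second removes both \(\beta_0\) and \(\beta_2 Z_i\):

\[ FatalityRate_{i1988} - FatalityRate_{i1982} = \beta_1 (BeerTax_{i1988} - BeerTax_{i1982}) + u_{i1988} - u_{i1982} \]

In practice an intercept is often added to the differenced regression to allow for a common change in the fatality rate that is unrelated to the beer tax.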

This regression model, where the difference in the fatality rate between 1988 and 1982 is regressed on the difference in the beer tax between those years, yields an estimate for \(\beta_1\) that is robust to a possible bias due to omission of \(Z_i\), as these influences are eliminated from the model. Next we will estimate a regression model based on the differenced data and we will plot the estimated regression function.

## 
## t test of coefficients:
## 
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept)  -0.072037   0.065355 -1.1022 0.276091   
## diff_beertax -1.040973   0.355006 -2.9323 0.005229 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
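Why differencing helps can also be illustrated on simulated data. The sketch below is not based on the Fatalities data: it assumes a made-up two-period panel in which an unobserved, time-invariant factor `z` is correlated with the regressor, and the true slope is set to \(-1\). All names and parameter values are illustrative.

```r
set.seed(1)
n <- 200                      # number of entities (illustrative)
z <- rnorm(n)                 # unobserved, time-invariant factor Z_i

# The regressor is correlated with Z_i in both periods
x1 <- z + rnorm(n)
x2 <- z + rnorm(n)

# True model: beta_1 = -1, beta_2 = 2
y1 <- 1 - 1 * x1 + 2 * z + rnorm(n, sd = 0.1)
y2 <- 1 - 1 * x2 + 2 * z + rnorm(n, sd = 0.1)

# Cross-sectional regression in period 2 omits Z_i and is biased
b_cross <- coef(lm(y2 ~ x2))[["x2"]]

# Regression on differences: Z_i drops out
dy <- y2 - y1
dx <- x2 - x1
b_diff <- coef(lm(dy ~ dx))[["dx"]]

c(cross_section = b_cross, differenced = b_diff)
```

In this design the cross-sectional slope is pulled far away from \(-1\) by the omitted factor, while the slope from the differenced regression stays close to the true value.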

Fixed Effects Regression: Estimation and Inference

\[ Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it} = \alpha_i + \beta_1 X_{it} + u_{it} \] where \(\alpha_i = \beta_0 + \beta_2 Z_{i}\) is the fixed effect of entity \(i\) and the above model is called the fixed effects model.

The variation in the \(\alpha_i\) comes from \(Z_i\).

Taking averages over time on both sides of the fixed effects model gives

\[ \frac{1}{T} \sum\limits_{t=1}^T Y_{it} = \beta_1 \frac{1}{T} \sum\limits_{t=1}^T X_{it} + \alpha_i + \frac{1}{T} \sum\limits_{t=1}^T u_{it}, \]

that is,

\[ \bar{Y}_{i} = \beta_1 \bar{X}_{i} + \alpha_i + \bar{u}_{i}. \]

Subtracting this from the fixed effects model eliminates \(\alpha_i\):

\[ \begin{aligned} Y_{it} - \bar{Y}_i &= \beta_1 ( X_{it} - \bar{X}_i) + ( u_{it} - \bar{u}_i) \\ \tilde{Y}_{it} &= \beta_1 \tilde{X}_{it} + \tilde{u}_{it} \end{aligned} \]

If the following assumptions hold, then

Fixed Effects Regression: Application to Traffic Deaths

The simple fixed effects model for estimation of the relation between traffic fatality rates and the beer taxes is

\[ FatalityRate_{it} = \beta_1 BeerTax_{it} + StateFixedEffects + u_{it} \]

This is a regression of the traffic fatality rate on the beer tax and 48 binary regressors, one for each state.

We can simply use the function lm() to obtain an estimate of \(\beta_1\).

## 
## Call:
## lm(formula = fatal_rate ~ beertax + state - 1, data = Fatalities)
## 
## Coefficients:
## beertax  stateal  stateaz  statear  stateca  stateco  statect  statede  
## -0.6559   3.4776   2.9099   2.8227   1.9682   1.9933   1.6154   2.1700  
## statefl  statega  stateid  stateil  statein  stateia  stateks  stateky  
##  3.2095   4.0022   2.8086   1.5160   2.0161   1.9337   2.2544   2.2601  
## statela  stateme  statemd  statema  statemi  statemn  statems  statemo  
##  2.6305   2.3697   1.7712   1.3679   1.9931   1.5804   3.4486   2.1814  
## statemt  statene  statenv  statenh  statenj  statenm  stateny  statenc  
##  3.1172   1.9555   2.8769   2.2232   1.3719   3.9040   1.2910   3.1872  
## statend  stateoh  stateok  stateor  statepa  stateri  statesc  statesd  
##  1.8542   1.8032   2.9326   2.3096   1.7102   1.2126   4.0348   2.4739  
## statetn  statetx  stateut  statevt  stateva  statewa  statewv  statewi  
##  2.6020   2.5602   2.3137   2.5116   2.1874   1.8181   2.5809   1.7184  
## statewy  
##  3.2491

It is also possible to estimate \(\beta_1\) by applying OLS to the demeaned data, that is, to run the regression

\[ \widetilde{FatalityRate}_{it} = \beta_1 \widetilde{BeerTax}_{it} + \tilde{u}_{it} \]

## 
## Call:
## lm(formula = fatal_rate ~ beertax - 1, data = fatal_demeaned)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.58696 -0.08284 -0.00127  0.07955  0.89780 
## 
## Coefficients:
##         Estimate Std. Error t value Pr(>|t|)    
## beertax  -0.6559     0.1739  -3.772 0.000191 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1757 on 335 degrees of freedom
## Multiple R-squared:  0.04074,    Adjusted R-squared:  0.03788 
## F-statistic: 14.23 on 1 and 335 DF,  p-value: 0.0001913
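That OLS on entity-demeaned data and OLS with one dummy per entity give the same slope can be checked on a small simulated panel. This is a minimal sketch in base R with made-up data; the variable names and parameter values are illustrative and unrelated to the Fatalities data.

```r
set.seed(42)
n_id <- 10                                  # entities (illustrative)
n_t  <- 5                                   # time periods (illustrative)
id    <- factor(rep(seq_len(n_id), each = n_t))
alpha <- rep(rnorm(n_id, sd = 2), each = n_t)   # entity fixed effects
x     <- alpha + rnorm(n_id * n_t)              # regressor correlated with alpha
y     <- alpha - 0.5 * x + rnorm(n_id * n_t, sd = 0.3)

# (a) Least squares with one dummy per entity, no intercept
b_dummy <- coef(lm(y ~ x + id - 1))[["x"]]

# (b) OLS on entity-demeaned data; ave() returns the group mean for each row
y_dm <- y - ave(y, id)
x_dm <- x - ave(x, id)
b_within <- coef(lm(y_dm ~ x_dm - 1))[["x_dm"]]

all.equal(b_dummy, b_within)   # compares the two slope estimates
```

Note that lm() applied to demeaned data does not account for the estimated entity means in its degrees of freedom, so the standard errors it reports should not be used for inference as they stand.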

The function ave() computes group averages. We use it to obtain state-specific averages of the fatality rate and the beer tax. Alternatively, one may use plm() from the package of the same name.

As for lm(), we have to specify the regression formula and the data to be used in our call of plm(). Additionally, a vector with the names of the entity and time ID variables must be passed to the argument index. For Fatalities, the ID variable for entities is named state and the time ID variable is year.

Since the fixed effects estimator is also called the within estimator, we set model = "within".

The function coeftest() can then be used to obtain inference based on robust standard errors.
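The plm() call described above might look as follows. This is a sketch rather than the exact chunk used for these notes: it assumes the data come from the AER package, that the fatality rate is defined as fatalities per 10,000 inhabitants, and that HC1-type robust standard errors are wanted. The object name `fatal_fe_mod` is illustrative.

```r
# Assumes the packages AER (data), plm, lmtest and sandwich are installed.
pkgs <- c("AER", "plm", "lmtest", "sandwich")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs) {
  data("Fatalities", package = "AER")

  # Assumed definition: traffic fatalities per 10,000 inhabitants
  Fatalities$fatal_rate <- Fatalities$fatal / Fatalities$pop * 10000

  # Within (entity fixed effects) estimator
  fatal_fe_mod <- plm::plm(fatal_rate ~ beertax,
                           data  = Fatalities,
                           index = c("state", "year"),
                           model = "within")

  # Robust standard errors via the vcovHC method for plm objects
  print(lmtest::coeftest(fatal_fe_mod,
                         vcov. = sandwich::vcovHC(fatal_fe_mod, type = "HC1")))
}
```

For plm objects, vcovHC() by default clusters by entity, which is one reason the robust standard error differs from the one reported by lm() on the demeaned data.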

## 
## t test of coefficients:
## 
##         Estimate Std. Error t value Pr(>|t|)  
## beertax -0.65587    0.28880  -2.271  0.02388 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The estimated coefficient is again \(-0.6559\). Note that plm() uses the entity-demeaned OLS algorithm and thus does not report dummy coefficients. The estimated regression function is \[ \widehat{FatalityRate} = -0.66 \times BeerTax + StateFixedEffects. \]

The coefficient on BeerTax is negative and significant. The interpretation is that the estimated reduction in traffic fatalities due to an increase in the real beer tax by $1 is 0.66 per 10000 people, which is still pretty high. Although including state fixed effects eliminates the risk of a bias due to omitted factors that vary across states but not over time, we suspect that there are other omitted variables that vary over time and thus cause bias.

Regression with Time Fixed Effects

We estimate the model with both state and time fixed effects using both lm() and plm().
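A sketch of the two estimation routes is given here. It assumes the data come from the AER package and that the fatality rate is defined as fatalities per 10,000 inhabitants; the object names are illustrative. The lm() formula is the one shown in the output, while the plm() arguments are an assumption about how the model was specified.

```r
# Assumes the packages AER (data) and plm are installed.
pkgs <- c("AER", "plm")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs) {
  data("Fatalities", package = "AER")

  # Assumed definition: traffic fatalities per 10,000 inhabitants
  Fatalities$fatal_rate <- Fatalities$fatal / Fatalities$pop * 10000

  # (a) lm() with state and year dummies, no intercept
  fatal_tefe_lm <- lm(fatal_rate ~ beertax + state + year - 1,
                      data = Fatalities)

  # (b) plm() with entity and time effects ("twoways")
  fatal_tefe_plm <- plm::plm(fatal_rate ~ beertax,
                             data   = Fatalities,
                             index  = c("state", "year"),
                             model  = "within",
                             effect = "twoways")

  # Compare the coefficient on beertax from both approaches
  print(c(lm  = coef(fatal_tefe_lm)[["beertax"]],
          plm = coef(fatal_tefe_plm)[["beertax"]]))
}
```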

## 
## Call:
## lm(formula = fatal_rate ~ beertax + state + year - 1, data = Fatalities)
## 
## Coefficients:
##  beertax   stateal   stateaz   statear   stateca   stateco   statect   statede  
## -0.63998   3.51137   2.96451   2.87284   2.02618   2.04984   1.67125   2.22711  
##  statefl   statega   stateid   stateil   statein   stateia   stateks   stateky  
##  3.25132   4.02300   2.86242   1.57287   2.07123   1.98709   2.30707   2.31659  
##  statela   stateme   statemd   statema   statemi   statemn   statems   statemo  
##  2.67772   2.41713   1.82731   1.42335   2.04488   1.63488   3.49146   2.23598  
##  statemt   statene   statenv   statenh   statenj   statenm   stateny   statenc  
##  3.17160   2.00846   2.93322   2.27245   1.43016   3.95748   1.34849   3.22630  
##  statend   stateoh   stateok   stateor   statepa   stateri   statesc   statesd  
##  1.90762   1.85664   2.97776   2.36597   1.76563   1.26964   4.06496   2.52317  
##  statetn   statetx   stateut   statevt   stateva   statewa   statewv   statewi  
##  2.65670   2.61282   2.36165   2.56100   2.23618   1.87424   2.63364   1.77545  
##  statewy  year1983  year1984  year1985  year1986  year1987  year1988  
##  3.30791  -0.07990  -0.07242  -0.12398  -0.03786  -0.05090  -0.05180
## 
## t test of coefficients:
## 
##         Estimate Std. Error t value Pr(>|t|)  
## beertax -0.63998    0.35015 -1.8277  0.06865 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Before discussing the outcomes we convince ourselves that state and year are of the class factor.

## [1] "pseries" "factor"
## [1] "pseries" "factor"

The lm() function converts factors into dummies automatically. Since we exclude the intercept by adding -1 to the right-hand side of the regression formula, lm() estimates coefficients for \(n+(T-1)=48+6=54\) binary variables (6 year dummies and 48 state dummies). Again, plm() only reports the estimated coefficient on BeerTax.
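The count of dummy variables can be verified by inspecting the design matrix that lm() would build. The sketch below uses placeholder factors with the same dimensions as the panel (48 entities, 7 years); the entity labels are made up for illustration.

```r
# Factor layout of a balanced panel with 48 entities and 7 years
state <- factor(rep(sprintf("s%02d", 1:48), each = 7))
year  <- factor(rep(1982:1988, times = 48))

# Design matrix that lm() would build for `~ state + year - 1`
X <- model.matrix(~ state + year - 1)

ncol(X)                           # state dummies plus year dummies
sum(grepl("^year", colnames(X)))  # year dummies only
```

One year dummy is dropped because the full set of state dummies and the full set of year dummies each sum to one, which would cause perfect multicollinearity.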

The estimated regression function is \[ \widehat{FatalityRate} = -0.64 \times BeerTax + StateEffects + TimeFixedEffects. \]

The result \(-0.64\) is close to the estimated coefficient of \(-0.66\) for the regression model including only entity fixed effects. Unsurprisingly, the coefficient is less precisely estimated but still significantly different from zero at the 10 percent level.

We conclude that the estimated relationship between traffic fatalities and the real beer tax is not affected by omitted variable bias due to factors that are constant either over time or across states.

FE Assumptions

In the fixed effects regression model \[ Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it}, \quad i=1,\ldots,n, \; t=1,\ldots,T \] we assume the following:

  1. The error term \(u_{it}\) has conditional mean zero, that is, \(E(u_{it} | X_{i1}, X_{i2}, \ldots, X_{iT}) = 0\).

  2. \((X_{i1}, X_{i2}, \ldots, X_{iT}, u_{i1}, u_{i2}, \ldots, u_{iT})\), \(i=1,\ldots,n\), are i.i.d. draws from their joint distribution.

  3. Large outliers are unlikely, i.e., \((X_{it},u_{it})\) have nonzero finite fourth moments.

  4. There is no perfect multicollinearity.

When there are multiple regressors, \(X_{it}\) is replaced by \(X_{1,it}, X_{2,it}, \ldots, X_{k,it}\).

Recall from our last lecture what these assumptions mean: