The panel data set measures cigarette consumption in US states. It covers 48 states observed in 1985 and 1995, for 96 observations in total. Below are the variables in this data set:
state - factor indicating state
year - factor indicating year
cpi - consumer price index
population - state population
packs - number of packs per capita
income - state personal income (total, nominal)
tax - average state, federal and average local excise taxes for fiscal year
price - average price during fiscal year, including sales tax
taxs - average excise taxes for fiscal year, including sales tax
# summary statistics
stargazer(df, type = "text",
          title = "Summary Statistics of CigarettesSW")
##
## Summary Statistics of CigarettesSW
## ==================================================================
## Statistic N Mean St. Dev. Min Max
## ------------------------------------------------------------------
## cpi 96 1.300 0.225 1.076 1.524
## population 96 5,168,866.000 5,442,345.000 478,447 31,493,524
## packs 96 109.182 25.871 49.272 197.994
## income 96 99,878,736.000 120,541,138.000 6,887,097 771,470,144
## tax 96 42.684 16.138 18.000 99.000
## price 96 143.448 43.887 84.968 240.850
## taxs 96 48.326 19.332 21.268 112.633
## ------------------------------------------------------------------
# order factor levels by descending frequency so the bars are sorted
size <- function(x) {
  factor(x, levels = names(sort(table(x), decreasing = TRUE)))
}
# number of observations per year
ggplot(data = df,
       aes(x = size(year))) +
  geom_bar() +
  xlab("Year") + ylab("Frequency") +
  theme_minimal()
# number of observations per state
ggplot(data = df,
       aes(x = size(state))) +
  geom_bar() +
  xlab("State") + ylab("Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90))
A higher average tax is expected to lead to lower cigarette consumption, since taxes raise the price smokers pay and leave them with less real income to spend.
# build linear model
lm_mod <- lm(data = df, formula = packs ~ tax)
summary(lm_mod)
##
## Call:
## lm(formula = packs ~ tax, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -56.252 -8.771 -0.432 7.309 78.843
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 153.1210 5.7806 26.489 < 2e-16 ***
## tax -1.0294 0.1268 -8.121 1.78e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 19.94 on 94 degrees of freedom
## Multiple R-squared: 0.4123, Adjusted R-squared: 0.4061
## F-statistic: 65.95 on 1 and 94 DF, p-value: 1.779e-12
As expected, a one-unit increase in the average tax is associated with a decrease of approximately one pack per capita (-1.03). The result is also statistically significant (p < 0.001).
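As a quick check on this interpretation, the fitted line can be evaluated by hand from the coefficients reported above. The sketch below uses base R only; the helper name `predict_packs` is introduced here purely for illustration, and the tax values are the minimum, mean and maximum from the summary statistics table.

```r
# Coefficients copied from the pooled OLS output above
b0 <- 153.1210   # intercept
b1 <- -1.0294    # slope on tax

# Hypothetical helper: fitted packs per capita at a given tax level
predict_packs <- function(tax) b0 + b1 * tax

# Evaluate at the minimum, mean and maximum tax from the summary statistics
tax_values <- c(min = 18, mean = 42.684, max = 99)
fitted_packs <- predict_packs(tax_values)
print(round(fitted_packs, 2))
```

At the mean tax the fitted value is about 109.18, which matches the sample mean of packs. This is expected, because an OLS line with an intercept passes through the point of means.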
We could potentially improve our model by including state fixed effects for these reasons:
cultural and social factors in each state might vary
different regulations between each state
demographic factors might also be different
# build linear model with state FE
lm_mod_fe <- lm(data = df, formula = packs ~ tax + state)
summary(lm_mod_fe)
##
## Call:
## lm(formula = packs ~ tax + state, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.383 -3.273 0.000 3.273 12.383
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 147.31704 5.58752 26.365 < 2e-16 ***
## tax -1.05565 0.05936 -17.783 < 2e-16 ***
## stateAR 21.29549 7.30661 2.915 0.005440 **
## stateAZ -8.23157 7.31651 -1.125 0.266273
## stateCA -22.78515 7.29550 -3.123 0.003060 **
## stateCO -9.95698 7.28390 -1.367 0.178133
## stateCT 8.28586 7.39463 1.121 0.268184
## stateDE 28.01213 7.28517 3.845 0.000361 ***
## stateFL 10.40133 7.31260 1.422 0.161518
## stateGA -1.18166 7.28855 -0.162 0.871902
## stateIA 5.37204 7.31028 0.735 0.466075
## stateID -17.68783 7.28467 -2.428 0.019061 *
## stateIL 6.59088 7.31558 0.901 0.372215
## stateIN 23.46723 7.28662 3.221 0.002323 **
## stateKS -2.37417 7.28662 -0.326 0.746002
## stateKY 56.30436 7.32761 7.684 7.58e-10 ***
## stateLA 9.16431 7.28420 1.258 0.214567
## stateMA 10.58753 7.39981 1.431 0.159107
## stateMD -1.11454 7.29912 -0.153 0.879292
## stateME 19.17528 7.31841 2.620 0.011799 *
## stateMI 29.16363 7.51985 3.878 0.000326 ***
## stateMN 6.55661 7.34922 0.892 0.376857
## stateMO 16.04277 7.28420 2.202 0.032578 *
## stateMS 0.72205 7.28436 0.099 0.921461
## stateMT -12.54930 7.28372 -1.723 0.091476 .
## stateNC 15.90162 7.32442 2.171 0.035011 *
## stateND -0.84283 7.33434 -0.115 0.909002
## stateNE -1.43099 7.30545 -0.196 0.845548
## stateNH 73.12992 7.28855 10.034 2.87e-13 ***
## stateNJ 6.55084 7.34532 0.892 0.377023
## stateNM -32.08034 7.28366 -4.404 6.09e-05 ***
## stateNV 17.92813 7.30111 2.456 0.017821 *
## stateNY 8.17853 7.39981 1.105 0.274683
## stateOH 13.34268 7.28517 1.831 0.073370 .
## stateOK 13.34647 7.28752 1.831 0.073381 .
## stateOR 9.68669 7.31841 1.324 0.192036
## statePA 6.33239 7.29912 0.868 0.390046
## stateRI 28.18485 7.41051 3.803 0.000411 ***
## stateSC -1.16843 7.30545 -0.160 0.873614
## stateSD -4.24200 7.28517 -0.582 0.563162
## stateTN 13.59669 7.28662 1.866 0.068291 .
## stateTX -0.31153 7.32842 -0.043 0.966272
## stateUT -47.22360 7.28548 -6.482 5.00e-08 ***
## stateVA -3.86661 7.33091 -0.527 0.600370
## stateVT 27.13432 7.28462 3.725 0.000523 ***
## stateWA -3.36250 7.41326 -0.454 0.652219
## stateWI 7.22054 7.33788 0.984 0.330150
## stateWV 5.95008 7.28372 0.817 0.418106
## stateWY 5.17149 7.29387 0.709 0.481815
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.284 on 47 degrees of freedom
## Multiple R-squared: 0.9608, Adjusted R-squared: 0.9207
## F-statistic: 23.99 on 48 and 47 DF, p-value: < 2.2e-16
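With state fixed effects included, each state coefficient is an intercept shift relative to the omitted baseline state (the one missing from the list, presumably AL): some states consume more packs per capita than the baseline and others fewer at the same tax level. The tax slope is common to all states and remains negative (-1.06), with a much smaller standard error (0.059 versus 0.127). The \(R^2\) of the model has also increased, from 0.41 to 0.96.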
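To make the role of the state coefficients concrete, the sketch below computes fitted values for three states at the same tax level, using coefficients copied from the output above. It assumes that the omitted baseline level is AL (inferred from its absence in the coefficient list), and the tax level of 40 is a hypothetical value chosen only for illustration.

```r
# Coefficients copied from the state fixed-effects output above
intercept <- 147.31704   # baseline state (assumed to be AL, the omitted level)
b_tax     <- -1.05565    # common slope on tax

# Intercept shifts relative to the baseline state
state_shift <- c(AL = 0, KY = 56.30436, UT = -47.22360)

# Fitted packs per capita for each state at the same tax level
tax_level <- 40          # hypothetical value chosen for illustration
fitted <- intercept + state_shift + b_tax * tax_level
print(round(fitted, 2))
```

The gap between any two states is the difference in their shifts and does not depend on `tax_level`, because the model has no state-by-tax interaction.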
We also add year to the OLS model to obtain a two-way fixed-effects specification, which is discussed further alongside the PLM models.
lm_mod_fe_2 <- lm(data = df, formula = packs ~ tax + state + year)
summary(lm_mod_fe_2)
##
## Call:
## lm(formula = packs ~ tax + state + year, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.004 -3.171 0.000 3.171 10.004
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 138.84014 5.30835 26.155 < 2e-16 ***
## tax -0.67473 0.10844 -6.222 1.34e-07 ***
## stateAR 17.58152 6.43074 2.734 0.008854 **
## stateAZ -12.67564 6.46816 -1.960 0.056106 .
## stateCA -25.45159 6.38849 -3.984 0.000240 ***
## stateCO -10.33790 6.34416 -1.630 0.110034
## stateCT 0.09608 6.75811 0.014 0.988719
## stateDE 27.05983 6.34902 4.262 9.94e-05 ***
## stateFL 6.23026 6.45341 0.965 0.339382
## stateGA 0.53248 6.36197 0.084 0.933660
## stateIA 1.37238 6.44461 0.213 0.832307
## stateID -18.46872 6.34712 -2.910 0.005556 **
## stateIL 2.21030 6.46465 0.342 0.733979
## stateIN 24.80045 6.35457 3.903 0.000308 ***
## stateKS -3.70739 6.35457 -0.583 0.562460
## stateKY 61.44679 6.50996 9.439 2.47e-12 ***
## stateLA 8.59293 6.34532 1.354 0.182283
## stateMA 2.20729 6.77701 0.326 0.746126
## stateMD -4.16190 6.40228 -0.650 0.518883
## stateME 14.60425 6.47533 2.255 0.028911 *
## stateMI 17.16465 7.20446 2.383 0.021385 *
## stateMN 0.27143 6.59074 0.041 0.967328
## stateMO 16.61415 6.34532 2.618 0.011923 *
## stateMS 1.37279 6.34593 0.216 0.829690
## stateMT -12.73976 6.34346 -2.008 0.050501 .
## stateNC 20.85358 6.49799 3.209 0.002427 **
## stateND -6.36617 6.53520 -0.974 0.335085
## stateNE -5.04973 6.42634 -0.786 0.436021
## stateNH 71.41579 6.36197 11.225 9.08e-15 ***
## stateNJ 0.45612 6.57623 0.069 0.945004
## stateNM -32.08034 6.34323 -5.057 7.25e-06 ***
## stateNV 14.69031 6.40985 2.292 0.026542 *
## stateNY -0.20171 6.77701 -0.030 0.976384
## stateOH 12.39038 6.34902 1.952 0.057097 .
## stateOK 11.82279 6.35804 1.860 0.069359 .
## stateOR 5.11566 6.47533 0.790 0.433568
## statePA 3.28503 6.40228 0.513 0.610335
## stateRI 19.42369 6.81594 2.850 0.006526 **
## stateSC 2.45031 6.42634 0.381 0.704742
## stateSD -5.19430 6.34902 -0.818 0.417503
## stateTN 14.92991 6.35457 2.349 0.023148 *
## stateTX -5.50156 6.51302 -0.845 0.402649
## stateUT -48.27113 6.35024 -7.601 1.15e-09 ***
## stateVA 1.46627 6.52237 0.225 0.823125
## stateVT 26.37248 6.34694 4.155 0.000140 ***
## stateWA -12.21889 6.82590 -1.790 0.080025 .
## stateWI 1.50674 6.54846 0.230 0.819041
## stateWV 5.75962 6.34346 0.908 0.368631
## stateWY 7.64747 6.38227 1.198 0.236962
## year1995 -10.85336 2.71596 -3.996 0.000231 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.343 on 46 degrees of freedom
## Multiple R-squared: 0.9709, Adjusted R-squared: 0.9399
## F-statistic: 31.31 on 49 and 46 DF, p-value: < 2.2e-16
The fixed-effects model helps control for unobserved heterogeneity at the state and year levels. Below is the equation for this model which returns the number of packs consumed by state (i) in time period (t):
\[packs_{it}=\beta_0+\beta_1 \cdot tax_{it}+\alpha_i+\gamma_t+\varepsilon_{it}\]
\(\beta_1\) represents the estimated change in the number of packs consumed for a one-unit change in the average tax, after controlling for state and year fixed effects.
\(\alpha_i\) and \(\gamma_t\) capture state fixed effects and year fixed effects, respectively.
# build FE model with plm
fe_mod <- plm(formula = packs ~ tax,
              data = df,
              index = c("state", "year"),
              model = "within")
summary(fe_mod)
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = packs ~ tax, data = df, model = "within", index = c("state",
## "year"))
##
## Balanced Panel: n = 48, T = 2, N = 96
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -12.3834 -3.2727 0.0000 3.2727 12.3834
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## tax -1.055649 0.059361 -17.784 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 19271
## Residual Sum of Squares: 2493.4
## R-Squared: 0.87061
## Adj. R-Squared: 0.73847
## F-statistic: 316.252 on 1 and 47 DF, p-value: < 2.22e-16
The tax coefficient (-1.056) is identical to the estimate from the OLS model with state dummies and close to the estimate from our first OLS model (-1.029).
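The match with the dummy-variable OLS model is not a coincidence: regressing on one dummy per unit and regressing unit-demeaned data give the same slope. The sketch below illustrates this in base R on a small simulated panel. All names (`toy`, `unit`, `x`, `y`) and all numbers are hypothetical and unrelated to the cigarette data.

```r
set.seed(1)

# Toy balanced panel: 5 hypothetical units observed in 2 periods
toy <- data.frame(
  unit = factor(rep(letters[1:5], each = 2)),
  x    = runif(10, 20, 60)
)
unit_effect <- c(a = 10, b = -5, c = 0, d = 20, e = -15)
toy$y <- 150 - 1 * toy$x +
  unname(unit_effect[as.character(toy$unit)]) + rnorm(10)

# (1) Dummy-variable OLS: one dummy per unit
b_lsdv <- coef(lm(y ~ x + unit, data = toy))[["x"]]

# (2) Within estimator: subtract unit means, then OLS without intercept
toy$y_dm <- toy$y - ave(toy$y, toy$unit)
toy$x_dm <- toy$x - ave(toy$x, toy$unit)
b_within <- coef(lm(y_dm ~ 0 + x_dm, data = toy))[["x_dm"]]

print(c(lsdv = b_lsdv, within = b_within))
```

The two slopes agree up to floating-point error. Only the reported \(R^2\) differs between the two approaches, since the within model measures fit on the demeaned data.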
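Now, we will create a two-way FE model to estimate how changes in the average tax and the year (1985 versus 1995) are associated with changes in the number of packs consumed, while accounting for unobserved state-specific factors. The within transformation subtracts each state's mean over time, which removes \(\beta_0\) and \(\alpha_i\); the year effect remains and is estimated through a 1995 dummy:
\[ packs_{it}-\overline{packs}_i=\beta_1 \cdot (tax_{it}-\overline{tax}_i)+(\gamma_t-\bar{\gamma})+(\varepsilon_{it}-\bar{\varepsilon}_i) \]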
The time effect appears to matter here: because we only observe two time periods (1985 and 1995) ten years apart, it reduces to a single 1995 dummy that absorbs any nationwide change in consumption over the decade.
fe_mod_2 <- plm(formula = packs ~ tax + as.factor(year),
                data = df,
                index = c("state", "year"),
                model = "within")
summary(fe_mod_2)
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = packs ~ tax + as.factor(year), data = df, model = "within",
## index = c("state", "year"))
##
## Balanced Panel: n = 48, T = 2, N = 96
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -10.0041 -3.1711 0.0000 3.1711 10.0041
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## tax -0.67473 0.10844 -6.2222 1.344e-07 ***
## as.factor(year)1995 -10.85336 2.71596 -3.9961 0.0002306 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 19271
## Residual Sum of Squares: 1850.9
## R-Squared: 0.90396
## Adj. R-Squared: 0.80165
## F-statistic: 216.473 on 2 and 46 DF, p-value: < 2.22e-16
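After controlling for state and time effects, a one-unit increase in the average tax is associated with a decrease of 0.675 packs per capita (roughly one pack when rounded), which is smaller in magnitude than the one-way estimate of 1.056.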
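With only two periods, a two-way fixed-effects regression is equivalent to regressing the 1995-minus-1985 change in the outcome on the change in the regressor: the slope is the same, and the intercept of the differenced regression equals the 1995 dummy coefficient. The base R sketch below checks this on a simulated panel. The data and all names (`toy`, `unit`, `x`, `y`) are hypothetical and are not the cigarette data.

```r
set.seed(2)

# Toy balanced panel: 6 hypothetical units, two periods (labelled 1985 and 1995)
n <- 6
toy <- data.frame(
  unit = factor(rep(seq_len(n), times = 2)),
  year = factor(rep(c(1985, 1995), each = n)),
  x    = runif(2 * n, 20, 60)
)
toy$y <- 140 - 0.7 * toy$x + rep(rnorm(n, sd = 10), times = 2) -
  10 * (toy$year == "1995") + rnorm(2 * n)

# (1) Two-way fixed effects via unit and year dummies
fit_fe <- lm(y ~ x + unit + year, data = toy)

# (2) First differences: 1995 value minus 1985 value for each unit
first <- toy[toy$year == "1985", ]
last  <- toy[toy$year == "1995", ]
stopifnot(all(first$unit == last$unit))  # rows must be aligned by unit
fd <- data.frame(dy = last$y - first$y, dx = last$x - first$x)
fit_fd <- lm(dy ~ dx, data = fd)

print(rbind(
  two_way_fe = coef(fit_fe)[c("x", "year1995")],
  first_diff = coef(fit_fd)[c("dx", "(Intercept)")]
))
```

The same two-period structure explains why the residuals in the fixed-effects outputs are symmetric around zero: each unit contributes one positive and one negative residual of equal size.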
stargazer(lm_mod_fe, lm_mod_fe_2, fe_mod_2, type = "text",
column.labels = c("State", "State|Time", "State|Time (PLM)"))
##
## ===========================================================================================
## Dependent variable:
## -----------------------------------------------------------------------
## packs
## OLS panel
## linear
## State State|Time State|Time (PLM)
## (1) (2) (3)
## -------------------------------------------------------------------------------------------
## tax -1.056*** -0.675*** -0.675***
## (0.059) (0.108) (0.108)
##
## stateAR 21.295*** 17.582***
## (7.307) (6.431)
##
## stateAZ -8.232 -12.676*
## (7.317) (6.468)
##
## stateCA -22.785*** -25.452***
## (7.295) (6.388)
##
## stateCO -9.957 -10.338
## (7.284) (6.344)
##
## stateCT 8.286 0.096
## (7.395) (6.758)
##
## stateDE 28.012*** 27.060***
## (7.285) (6.349)
##
## stateFL 10.401 6.230
## (7.313) (6.453)
##
## stateGA -1.182 0.532
## (7.289) (6.362)
##
## stateIA 5.372 1.372
## (7.310) (6.445)
##
## stateID -17.688** -18.469***
## (7.285) (6.347)
##
## stateIL 6.591 2.210
## (7.316) (6.465)
##
## stateIN 23.467*** 24.800***
## (7.287) (6.355)
##
## stateKS -2.374 -3.707
## (7.287) (6.355)
##
## stateKY 56.304*** 61.447***
## (7.328) (6.510)
##
## stateLA 9.164 8.593
## (7.284) (6.345)
##
## stateMA 10.588 2.207
## (7.400) (6.777)
##
## stateMD -1.115 -4.162
## (7.299) (6.402)
##
## stateME 19.175** 14.604**
## (7.318) (6.475)
##
## stateMI 29.164*** 17.165**
## (7.520) (7.204)
##
## stateMN 6.557 0.271
## (7.349) (6.591)
##
## stateMO 16.043** 16.614**
## (7.284) (6.345)
##
## stateMS 0.722 1.373
## (7.284) (6.346)
##
## stateMT -12.549* -12.740*
## (7.284) (6.343)
##
## stateNC 15.902** 20.854***
## (7.324) (6.498)
##
## stateND -0.843 -6.366
## (7.334) (6.535)
##
## stateNE -1.431 -5.050
## (7.305) (6.426)
##
## stateNH 73.130*** 71.416***
## (7.289) (6.362)
##
## stateNJ 6.551 0.456
## (7.345) (6.576)
##
## stateNM -32.080*** -32.080***
## (7.284) (6.343)
##
## stateNV 17.928** 14.690**
## (7.301) (6.410)
##
## stateNY 8.179 -0.202
## (7.400) (6.777)
##
## stateOH 13.343* 12.390*
## (7.285) (6.349)
##
## stateOK 13.346* 11.823*
## (7.288) (6.358)
##
## stateOR 9.687 5.116
## (7.318) (6.475)
##
## statePA 6.332 3.285
## (7.299) (6.402)
##
## stateRI 28.185*** 19.424***
## (7.411) (6.816)
##
## stateSC -1.168 2.450
## (7.305) (6.426)
##
## stateSD -4.242 -5.194
## (7.285) (6.349)
##
## stateTN 13.597* 14.930**
## (7.287) (6.355)
##
## stateTX -0.312 -5.502
## (7.328) (6.513)
##
## stateUT -47.224*** -48.271***
## (7.285) (6.350)
##
## stateVA -3.867 1.466
## (7.331) (6.522)
##
## stateVT 27.134*** 26.372***
## (7.285) (6.347)
##
## stateWA -3.363 -12.219*
## (7.413) (6.826)
##
## stateWI 7.221 1.507
## (7.338) (6.548)
##
## stateWV 5.950 5.760
## (7.284) (6.343)
##
## stateWY 5.171 7.647
## (7.294) (6.382)
##
## year1995 -10.853***
## (2.716)
##
## as.factor(year)1995 -10.853***
## (2.716)
##
## Constant 147.317*** 138.840***
## (5.588) (5.308)
##
## -------------------------------------------------------------------------------------------
## Observations 96 96 96
## R2 0.961 0.971 0.904
## Adjusted R2 0.921 0.940 0.802
## Residual Std. Error 7.284 (df = 47) 6.343 (df = 46)
## F Statistic 23.991*** (df = 48; 47) 31.312*** (df = 49; 46) 216.473*** (df = 2; 46)
## ===========================================================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
All three models produce a negative tax coefficient of similar order. Rounded to the nearest whole number, each implies a decrease of about one pack of cigarettes per capita for a one-unit increase in the average tax. However, the time FE does appear to be important here: the two-way FE models give a coefficient that is smaller in absolute value (-0.675 versus -1.056), regardless of the estimation method used (OLS with dummies or PLM).