Here begins the next addendum to the start of my econometrics
personal review and consolidation efforts.
Scattered package recall and Sisyphean buttresses hanging off wispy
model assumptions and estimators benefit no one, save a
half-effort conversation void of any depth beyond a first
impression (certainly the domestic machinery of the dual mandate stands
to gain nothing from inefficient product recall). So, we begin with
holistic recall of fixed effects, random effects, correlated random
effects, a variety of Wooldridge tests, and 2SLS.
To begin, we incept a panel austere in all but its pixelated form.
setwd("~/Downloads/")
library(readxl)
gasdata <- read_excel("GasDemand.xlsx")
# with the data loaded, create the panel
library(plm)
gasdata.panel <- pdata.frame(gasdata, index=c("COUNTRY", "YEAR"))
That is excessively simple: just install the plm package and set the cross-sectional
and time IDs.
Onwards to random effects and fixed effects estimation, though
intuition should already hint in one direction or the other. Recall
that \(a_i\) in one-way fixed effects
captures only time-invariant heterogeneity, so in the gas example
(testing causal effects of gas price, per-capita income, and automobiles
per capita on gas consumption), we can brainstorm some type of time-invariant
unobserved heterogeneity, which demeaning eliminates because a constant
never deviates from its own time mean. Think: the culture of each individual
(denoted \(i\), in this case the
country), where different histories may predispose certain countries to
a greater cultural appreciation of the automobile. An
argument can be made for many such factors. Such is the use of formal testing.
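To make the demeaning argument concrete, here is a minimal base-R sketch on simulated data. The data, the `demean` helper, and every variable name are hypothetical and not part of the gas panel. It suggests that subtracting each unit's time mean removes \(a_i\) entirely, and that the demeaned regression reproduces the dummy-variable (LSDV) slope.

```r
# Simulated panel (hypothetical data, NOT GasDemand.xlsx)
set.seed(1)
n_id <- 5   # cross-sectional units
n_t  <- 8   # time periods per unit
id   <- rep(seq_len(n_id), each = n_t)

a_i <- rep(rnorm(n_id, sd = 2), each = n_t)           # time-invariant effect
x   <- 0.5 * a_i + rnorm(n_id * n_t)                  # regressor correlated with a_i
y   <- 1 + 2 * x + a_i + rnorm(n_id * n_t, sd = 0.1)  # true slope is 2

# Within transformation: subtract each unit's own time mean
demean <- function(v, g) v - ave(v, g)

# a_i is constant within a unit, so its demeaned version is zero
max_abs_demeaned_a <- max(abs(demean(a_i, id)))

# Slope from the demeaned regression vs. slope from unit dummies (LSDV)
beta_within <- coef(lm(demean(y, id) ~ 0 + demean(x, id)))[[1]]
beta_lsdv   <- coef(lm(y ~ x + factor(id)))[["x"]]
beta_pooled <- coef(lm(y ~ x))[["x"]]  # ignores a_i altogether

c(within = beta_within, lsdv = beta_lsdv, pooled = beta_pooled)
```

The two fixed-effects routes should agree to numerical precision, while pooled OLS would be expected to drift from the true slope because `x` is built to correlate with `a_i`.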
We begin with FE and RE estimation.
# estimation with random effects
re_lm <- plm(LGASPCAR ~ factor(YEAR) + LGASPCAR_1 + LINCOMEP + LRPMG + LCARPCAP, model = "random", data = gasdata.panel) # don't put the dataset in quotes!
fe_lm <- plm(LGASPCAR ~ factor(YEAR) + LGASPCAR_1 + LINCOMEP + LRPMG + LCARPCAP, model = "within", data = gasdata.panel)
# make a table
library(stargazer)
##
## Please cite as:
## Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
stargazer(re_lm, fe_lm, type = "text", title = "RE vs. FE Estimation", column.labels = c("Random Effects", "Fixed Effects"))
##
## RE vs. FE Estimation
## =========================================================
##                  Dependent variable:
##                  ----------------------------------------
##                  LGASPCAR
##                  Random Effects Fixed Effects
##                  (1)            (2)
## ---------------------------------------------------------
## factor(YEAR)1962 -0.039**       -0.027*
##                  (0.018)        (0.016)
##
## factor(YEAR)1963 -0.022         -0.006
##                  (0.018)        (0.016)
##
## factor(YEAR)1964 -0.003         0.021
##                  (0.018)        (0.017)
##
## factor(YEAR)1965 -0.036**       0.002
##                  (0.018)        (0.018)
##
## factor(YEAR)1966 0.006          0.047**
##                  (0.018)        (0.018)
##
## factor(YEAR)1967 -0.005         0.048**
##                  (0.018)        (0.019)
##
## factor(YEAR)1968 -0.002         0.060***
##                  (0.018)        (0.020)
##
## factor(YEAR)1969 -0.025         0.050**
##                  (0.018)        (0.022)
##
## factor(YEAR)1970 0.002          0.081***
##                  (0.019)        (0.023)
##
## factor(YEAR)1971 -0.003         0.087***
##                  (0.019)        (0.024)
##
## factor(YEAR)1972 -0.007         0.094***
##                  (0.019)        (0.025)
##
## factor(YEAR)1973 0.005          0.116***
##                  (0.019)        (0.027)
##
## factor(YEAR)1974 -0.078***      0.040
##                  (0.019)        (0.028)
##
## factor(YEAR)1975 0.012          0.115***
##                  (0.019)        (0.026)
##
## factor(YEAR)1976 -0.009         0.104***
##                  (0.019)        (0.028)
##
## factor(YEAR)1977 -0.005         0.112***
##                  (0.019)        (0.029)
##
## factor(YEAR)1978 0.003          0.124***
##                  (0.020)        (0.030)
##
## LGASPCAR_1       0.851***       0.646***
##                  (0.021)        (0.032)
##
## LINCOMEP         0.117***       0.022
##                  (0.024)        (0.057)
##
## LRPMG            -0.131***      -0.115***
##                  (0.021)        (0.026)
##
## LCARPCAP         -0.097***      -0.204***
##                  (0.018)        (0.026)
##
## Constant         0.398***
##                  (0.088)
##
## ---------------------------------------------------------
## Observations     324            324
## R2               0.971          0.953
## Adjusted R2      0.969          0.947
## F Statistic      10,244.260***  274.201*** (df = 21; 285)
## =========================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
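At first glance the random effects column looks more appealing (a higher \(R^2\), a significant income coefficient), but only naivete would pick an estimator on those grounds.
The Hausman test settles the matter. Under H0, the individual effects are uncorrelated with the regressors, so random effects is both consistent and efficient; under the alternative, only fixed effects remains consistent.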
phtest(re_lm, fe_lm) # tiny p-value rejects the null in favor of fixed effects
##
## Hausman Test
##
## data: LGASPCAR ~ factor(YEAR) + LGASPCAR_1 + LINCOMEP + LRPMG + LCARPCAP
## chisq = 111.96, df = 21, p-value = 2.088e-14
## alternative hypothesis: one model is inconsistent
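For reference, the statistic behind this output is usually written as below. This is a sketch in standard textbook notation, not a transcription of plm's internals; \(\hat{\beta}_{FE}\) and \(\hat{\beta}_{RE}\) denote the coefficient vectors on the \(k\) regressors common to both models.

```latex
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'
    \left[ \widehat{\mathrm{Var}}(\hat{\beta}_{FE}) - \widehat{\mathrm{Var}}(\hat{\beta}_{RE}) \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
    \;\overset{a}{\sim}\; \chi^2_k
```

Here \(k = 21\) (17 year dummies plus 4 slopes), which matches the df = 21 reported above.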
A brief aside to note one of the coolest facts I’ve encountered yet
in metrics: the special cases of random effects, and how the \(\theta\) parameter can be used to our benefit. Random
effects estimation takes a form similar to (I’m recalling
from memory):
\(y_{it} - \theta\bar{y}_i = \beta_0(1 - \theta) + \beta_1(x_{it} - \theta\bar{x}_i) + (v_{it} - \theta\bar{v}_i)\),
where \(v_{it} = a_i + u_{it}\) is the composite error and bars denote time means for unit \(i\).
Thus, even a second grader’s intuition might reveal that the magnitude of demeaning (denoted by \(\theta\)) hints towards either of the special cases (with POLS being \(\theta = 0\) and FE being \(\theta = 1\)). This is easily ascertained:
summary(re_lm)
## Oneway (individual) effect Random Effect Model
## (Swamy-Arora's transformation)
##
## Call:
## plm(formula = LGASPCAR ~ factor(YEAR) + LGASPCAR_1 + LINCOMEP +
## LRPMG + LCARPCAP, data = gasdata.panel, model = "random")
##
## Balanced Panel: n = 18, T = 18, N = 324
##
## Effects:
## var std.dev share
## idiosyncratic 0.0022869 0.0478220 0.811
## individual 0.0005315 0.0230540 0.189
## theta: 0.5608
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -2.2586e-01 -2.2223e-02 -9.6028e-05 2.0881e-02 3.6006e-01
##
## Coefficients:
## Estimate Std. Error z-value Pr(>|z|)
## (Intercept) 0.3984494 0.0876392 4.5465 5.455e-06 ***
## factor(YEAR)1962 -0.0385783 0.0176110 -2.1906 0.02848 *
## factor(YEAR)1963 -0.0222068 0.0176510 -1.2581 0.20835
## factor(YEAR)1964 -0.0034390 0.0177469 -0.1938 0.84635
## factor(YEAR)1965 -0.0356561 0.0178174 -2.0012 0.04537 *
## factor(YEAR)1966 0.0058505 0.0179202 0.3265 0.74407
## factor(YEAR)1967 -0.0049888 0.0180280 -0.2767 0.78199
## factor(YEAR)1968 -0.0016206 0.0181563 -0.0893 0.92888
## factor(YEAR)1969 -0.0246161 0.0183538 -1.3412 0.17986
## factor(YEAR)1970 0.0015481 0.0185636 0.0834 0.93354
## factor(YEAR)1971 -0.0027906 0.0186953 -0.1493 0.88134
## factor(YEAR)1972 -0.0069512 0.0189236 -0.3673 0.71337
## factor(YEAR)1973 0.0050870 0.0192565 0.2642 0.79165
## factor(YEAR)1974 -0.0775993 0.0191534 -4.0515 5.090e-05 ***
## factor(YEAR)1975 0.0119128 0.0191531 0.6220 0.53396
## factor(YEAR)1976 -0.0089419 0.0193267 -0.4627 0.64360
## factor(YEAR)1977 -0.0049710 0.0194675 -0.2553 0.79845
## factor(YEAR)1978 0.0026702 0.0196564 0.1358 0.89194
## LGASPCAR_1 0.8512078 0.0214263 39.7272 < 2.2e-16 ***
## LINCOMEP 0.1171286 0.0240233 4.8756 1.085e-06 ***
## LRPMG -0.1305157 0.0211988 -6.1568 7.425e-10 ***
## LCARPCAP -0.0973954 0.0175236 -5.5579 2.730e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 29.367
## Residual Sum of Squares: 0.84094
## R-Squared: 0.97136
## Adj. R-Squared: 0.96937
## Chisq: 10244.3 on 21 DF, p-value: < 2.22e-16
# theta = 0.56: a bit more than half of each country mean is removed, so RE sits closer to FE than to pooled OLS
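The reported \(\theta\) can be reproduced by hand. A minimal sketch, assuming the usual balanced-panel formula \(\theta = 1 - \sqrt{\sigma_u^2 / (\sigma_u^2 + T\sigma_a^2)}\) and plugging in the variance components printed in the summary above (the variable names are my own):

```r
# Variance components copied from the "Effects" block of summary(re_lm)
sigma2_u <- 0.0022869  # idiosyncratic variance
sigma2_a <- 0.0005315  # individual (country) variance
n_periods <- 18        # T in the balanced panel

# Quasi-demeaning weight: 0 gives pooled OLS, 1 gives fixed effects
theta <- 1 - sqrt(sigma2_u / (sigma2_u + n_periods * sigma2_a))
round(theta, 4)  # agrees with the theta of 0.5608 in the summary
```

The formula also shows why \(\theta\) moves towards 1 as \(T\) grows or as the individual variance dominates: the ratio under the square root shrinks towards zero.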
It is important to note that here lies a placeholder for a future excerpt on correlated random effects. But the midnight oil burns and conciseness takes priority.
Standard diagnostic tests apply:
library(lmtest)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
bptest(fe_lm) # H0: homoskedastic errors
##
## studentized Breusch-Pagan test
##
## data: fe_lm
## BP = 64.332, df = 21, p-value = 2.763e-06
# errors are heteroskedastic
# test for serial correlation in the errors: Ljung-Box, ACFs, and Breusch-Godfrey may be applicable here, but Wooldridge's test is designed for FE panels. There is much to learn yet
# pwartest() ships with plm, so no extra package is needed (the wooldridge package only supplies textbook datasets)
pwartest(fe_lm)
##
## Wooldridge's test for serial correlation in FE panels
##
## data: fe_lm
## F = 0.090437, df1 = 1, df2 = 304, p-value = 0.7638
## alternative hypothesis: serial correlation
# fails to reject H0, so no evidence of serial correlation
With heteroskedastic but serially uncorrelated errors, cluster-robust standard errors still apply (they remain valid even without serial correlation). Thus the final result takes this form.
# generate cluster-robust errors
cov1 <- vcovHC(fe_lm, type = "sss", cluster = "group")
robust_se <- sqrt(diag(cov1))
# final output table
stargazer(fe_lm, title = "Gas Consumption Estimation", column.labels = "Fixed Effects", se = list(robust_se, NULL), type = "text")
##
## Gas Consumption Estimation
## ============================================
## Dependent variable:
## ---------------------------
## LGASPCAR
## Fixed Effects
## --------------------------------------------
## factor(YEAR)1962 -0.027
## (0.026)
##
## factor(YEAR)1963 -0.006
## (0.024)
##
## factor(YEAR)1964 0.021
## (0.016)
##
## factor(YEAR)1965 0.002
## (0.024)
##
## factor(YEAR)1966 0.047
## (0.038)
##
## factor(YEAR)1967 0.048
## (0.030)
##
## factor(YEAR)1968 0.060*
## (0.036)
##
## factor(YEAR)1969 0.050*
## (0.027)
##
## factor(YEAR)1970 0.081**
## (0.038)
##
## factor(YEAR)1971 0.087**
## (0.039)
##
## factor(YEAR)1972 0.094**
## (0.041)
##
## factor(YEAR)1973 0.116**
## (0.048)
##
## factor(YEAR)1974 0.040
## (0.052)
##
## factor(YEAR)1975 0.115**
## (0.051)
##
## factor(YEAR)1976 0.104**
## (0.051)
##
## factor(YEAR)1977 0.112**
## (0.054)
##
## factor(YEAR)1978 0.124**
## (0.051)
##
## LGASPCAR_1 0.646***
## (0.078)
##
## LINCOMEP 0.022
## (0.077)
##
## LRPMG -0.115***
## (0.037)
##
## LCARPCAP -0.204***
## (0.065)
##
## --------------------------------------------
## Observations 324
## R2 0.953
## Adjusted R2 0.947
## F Statistic 274.201*** (df = 21; 285)
## ============================================
## Note: *p<0.1; **p<0.05; ***p<0.01
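For anyone without GasDemand.xlsx, the robust-error step can be tried on a stand-in. The sketch below assumes that the `Gasoline` data bundled with plm is close enough to serve as a substitute (its column names are lower case) and fits a deliberately simplified model without the lag or year dummies, so its numbers will not match the table above. `coeftest()` from lmtest is used to put conventional and cluster-robust inference side by side.

```r
library(plm)
library(lmtest)

# Assumption: plm's bundled Gasoline data stands in for GasDemand.xlsx
data("Gasoline", package = "plm")
gas <- pdata.frame(Gasoline, index = c("country", "year"))

# Simplified one-way FE model (no lagged dependent variable, no year dummies)
fe <- plm(lgaspcar ~ lincomep + lrpmg + lcarpcap,
          data = gas, model = "within")

# Covariance matrix clustered by country, same call as in the text
vc <- vcovHC(fe, type = "sss", cluster = "group")

coeftest(fe)              # conventional standard errors
coeftest(fe, vcov. = vc)  # cluster-robust standard errors
```

Comparing the two printouts shows how much the significance stars depend on the choice of covariance estimator, which is the same pattern visible between the first and final tables here.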
To be continued with words on 2SLS and instrument assumptions. Wu-Hausman, Sargan, weak instruments test, and exclusion restrictions will be touched on.