Handout 017: Panel Data

This handout introduces/extends estimation procedures for panel data. Note that the Bailey “Computing Corners” will still be useful.

Overview

We begin with an example of something that we know won’t work: a simple, pooled cross-sectional analysis. We fit a standard OLS regression in which we regress an indicator of military expenditure (wdi_expmilgdp, military expenditure as percent of GDP) on real per capita GDP, Economic Globalization (an index created by a social scientist named Axel Dreher), and regime type (Polity score).

out.plain <- lm(wdi_expmilgdp~I(wdi_gdppcpppcon/1000)+dr_eg+p_polity2,data=data)

# The I() operator lets us make transformations to the data 
# directly from the regression line. Here, I'm rescaling
# Real Per Capita GDP because the coefficients are easier to interpret
# when this variable is in thousands than in single dollars

# Note wdi_gdppcpppcon, which is the World Development Indicator's
# GDP Per Capita
# at Purchasing Power Parity
# in Constant (2011) dollars...
# which makes for an ugly but precise variable name

This is just for illustration, though. Since we know that these are panel data (look at the data using something like head(data[,1:10]) to see for yourself!), we should begin thinking about how that knowledge changes our modeling strategy. For instance, we can use the least-squares dummy variable (LSDV) method, which adds a dummy variable for each country:

out.lsdv <- lm(wdi_expmilgdp~I(wdi_gdppcpppcon/1000)+dr_eg+p_polity2+as.factor(cname),data=data)

(Why these variables? We might think that military expenditure is affected by societies’ wealth, regime type, and integration into the global economy.)
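
In equation form, the difference between the two models is only in the intercept. The notation below is an assumption of this sketch, not something defined elsewhere in the handout: \(\text{Mil}_{it}\) is military expenditure for country \(i\) in year \(t\), and \(\beta_1\) is the coefficient on per capita GDP. The pooled model gives every country the same intercept \(\beta_0\), while the dummy-variable model gives each country its own intercept \(\alpha_i\):

\[
\begin{aligned}
\text{Pooled:}\quad \text{Mil}_{it} &= \beta_0 + \beta_1\,\text{GDP}_{it} + \beta_2\,\text{EG}_{it} + \beta_3\,\text{Polity}_{it} + \varepsilon_{it} \\
\text{Country dummies:}\quad \text{Mil}_{it} &= \alpha_i + \beta_1\,\text{GDP}_{it} + \beta_2\,\text{EG}_{it} + \beta_3\,\text{Polity}_{it} + \varepsilon_{it}
\end{aligned}
\]

Because \(\alpha_i\) absorbs everything about a country that does not change over time, the \(\beta\)s in the second model are estimated only from variation within countries.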

And now we compare them:

stargazer(out.plain,out.lsdv,
          type="html",
          column.labels = c("Plain","With Fixed Effects"),
          title="Comparing Pooled and Fixed-Effects",
          omit="as.factor",
          covariate.labels = c("Per Capita GDP in 1000s", "Economic Globalization", "Polity 2 Score"),
          notes=c("Fixed effects estimated but not shown in Fixed Effects column"),
          add.lines = list(c("Fixed effects?", "No", "Yes")),
          dep.var.labels = "Military Expenditure"
          )

**Comparing Pooled and Fixed-Effects**

	Dependent variable:

	Military Expenditure
	Plain	With Fixed Effects
	(1)	(2)

Per Capita GDP in 1000s	0.027^***	-0.034^***
	(0.004)	(0.012)

Economic Globalization	-0.007^*	-0.033^***
	(0.004)	(0.006)

Polity 2 Score	-0.101^***	-0.024
	(0.009)	(0.017)

Constant	2.664^***	3.702^***
	(0.176)	(0.502)


Fixed effects?	No	Yes
Observations	2,806	2,806
R²	0.077	0.427
Adjusted R²	0.076	0.396
Residual Std. Error	2.580 (df = 2802)	2.086 (df = 2665)
F Statistic	78.432^*** (df = 3; 2802)	14.161^*** (df = 140; 2665)

Note:	*p<0.1; **p<0.05; ***p<0.01
	Fixed effects estimated but not shown in Fixed Effects column

Integrating Two-Way Fixed Effects

We suspect that time, and not just country, might be a source of unexplained variation. Accordingly, we seek to use two-way fixed effects, including controls for both year and country.
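
Written out, the two-way model adds a year-specific intercept \(\gamma_t\) alongside the country-specific intercept \(\alpha_i\). The symbols here are an assumption of this sketch (with \(\text{Mil}_{it}\) standing for military expenditure in country \(i\) and year \(t\), and \(\beta_1\) the coefficient on per capita GDP):

\[
\text{Mil}_{it} = \alpha_i + \gamma_t + \beta_1\,\text{GDP}_{it} + \beta_2\,\text{EG}_{it} + \beta_3\,\text{Polity}_{it} + \varepsilon_{it}
\]

Here \(\alpha_i\) soaks up anything constant within a country, and \(\gamma_t\) soaks up anything that hits all countries in the same year, such as a global recession or the end of the Cold War.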

This is easy.

We begin by replicating the one-way fixed effects from earlier to verify that plm and LSDV are interchangeable. We then proceed to add the two-way model.

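# With model="within", the as.factor(cname) term is redundant: the within
# transformation already removes country means, which is why only the three
# substantive coefficients are reported for this model
out.plm1way <- plm(wdi_expmilgdp~I(wdi_gdppcpppcon/1000)+dr_eg+p_polity2+as.factor(cname),data=data,
                   index=c("ccode"),model="within")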

## These series are constants and have been removed: version, arda_isnatpct, cpds_lmo, p_sf, scip_ameantst, vi_ext, vi_nmw, vi_rag, vi_ram, vi_rcbg, vi_rcbm, vi_rsg, vi_rsm, wdi_ebrdpngnfl

# Note that ccode is a country code -- there are many, I'm just using this for convenience
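
Why should the within estimator reproduce the dummy-variable coefficients? It subtracts each country's mean from every variable, which removes anything constant within a country, and that is exactly what the country dummies absorb. The following is a minimal sketch in base R on simulated data. Everything in it (the names unit, x, y, and all parameter values) is made up for illustration and has nothing to do with the handout's data set, and it does not use plm itself.

```r
# Simulated panel: 30 units observed for 10 periods each (illustrative values only)
set.seed(17)
n_units <- 30
n_years <- 10
unit  <- rep(seq_len(n_units), each = n_years)
alpha <- rnorm(n_units, sd = 3)[unit]          # unit-specific intercepts
x <- 0.5 * alpha + rnorm(n_units * n_years)    # regressor correlated with the unit effect
y <- 1 + alpha - 0.3 * x + rnorm(n_units * n_years)

# Pooled OLS ignores the unit effects, so its slope is biased upward by construction
pooled <- lm(y ~ x)

# Dummy-variable approach: one dummy per unit
lsdv <- lm(y ~ x + factor(unit))

# De-meaning: subtract each unit's own mean, then regress without an intercept
y_dm <- y - ave(y, unit)
x_dm <- x - ave(x, unit)
demeaned <- lm(y_dm ~ x_dm - 1)

b_pooled   <- unname(coef(pooled)["x"])
b_lsdv     <- unname(coef(lsdv)["x"])
b_demeaned <- unname(coef(demeaned)["x_dm"])

# The last two slopes agree; the pooled slope does not
print(c(pooled = b_pooled, lsdv = b_lsdv, demeaned = b_demeaned))

# The R-squared values differ: the dummy-variable fit gets credit for the dummies,
# while the de-meaned fit only reports variation explained within units
r2_lsdv     <- summary(lsdv)$r.squared
r2_demeaned <- summary(demeaned)$r.squared
print(c(r2_lsdv = r2_lsdv, r2_demeaned = r2_demeaned))
```

One caveat on this sketch: the standard errors from the naive de-meaned regression do not account for the degrees of freedom used up by estimating the unit means, so only the coefficients should be compared.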

out.plm2way <- plm(wdi_expmilgdp~I(wdi_gdppcpppcon/1000)+dr_eg+p_polity2+as.factor(cname),data=data,
                   index=c("ccode","year"),model="within",effect="twoways")

## These series are constants and have been removed: version, arda_isnatpct, cpds_lmo, p_sf, scip_ameantst, vi_ext, vi_nmw, vi_rag, vi_ram, vi_rcbg, vi_rcbm, vi_rsg, vi_rsm, wdi_ebrdpngnfl
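
To make concrete what effect="twoways" is doing, here is a minimal base R sketch on simulated data. All names and values are hypothetical, and the simple double de-meaning formula used here is exact only for a balanced panel (every unit observed in every year), which is an assumption of the sketch rather than a property of the handout's data.

```r
# Simulated balanced panel: 25 units by 8 years (illustrative values only)
set.seed(18)
n_units <- 25
n_years <- 8
unit  <- rep(seq_len(n_units), each = n_years)
year  <- rep(seq_len(n_years), times = n_units)
alpha <- rnorm(n_units)[unit]    # unit effects
gamma <- rnorm(n_years)[year]    # year effects shared by all units
x <- alpha + gamma + rnorm(n_units * n_years)
y <- alpha + 2 * gamma + 0.5 * x + rnorm(n_units * n_years)

# Two-way fixed effects with explicit dummies for both unit and year
twoway_dummies <- lm(y ~ x + factor(unit) + factor(year))

# Two-way de-meaning: remove unit means and year means, add back the grand mean
# (this shortcut is exact only because the panel is balanced)
demean2 <- function(v) v - ave(v, unit) - ave(v, year) + mean(v)
y_dm2 <- demean2(y)
x_dm2 <- demean2(x)
twoway_demeaned <- lm(y_dm2 ~ x_dm2 - 1)

b_dummies  <- unname(coef(twoway_dummies)["x"])
b_demeaned <- unname(coef(twoway_demeaned)["x_dm2"])

# Both routes give the same slope on x
print(c(dummies = b_dummies, demeaned = b_demeaned))
```

With an unbalanced panel the dummy-variable regression remains valid, but the one-line de-meaning shortcut above no longer matches it exactly.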

And now we report the results:

stargazer(out.plain,out.lsdv,out.plm1way,out.plm2way,
          type="html",
          column.labels = c("Plain","LSDV","De-Meaned","Two Ways"),
          title="Comparing Pooled and Fixed-Effects",
          omit="as.factor",
          covariate.labels = c("Per Capita GDP in 1000s", "Economic Globalization", "Polity 2 Score"),
          notes=c("Fixed effects estimated but not shown in Fixed Effects column"),
          add.lines = list(c("Fixed effects?", "No", "Country","Country","Country and Year")), # Note how this has changed!
          dep.var.labels = "Military Expenditure"
          )

**Comparing Pooled and Fixed-Effects**

	Dependent variable:

	Military Expenditure
	OLS		panel linear
	Plain	LSDV	De-Meaned	Two Ways
	(1)	(2)	(3)	(4)

Per Capita GDP in 1000s	0.027^***	-0.034^***	-0.034^***	-0.012
	(0.004)	(0.012)	(0.012)	(0.013)

Economic Globalization	-0.007^*	-0.033^***	-0.033^***	-0.015^*
	(0.004)	(0.006)	(0.006)	(0.008)

Polity 2 Score	-0.101^***	-0.024	-0.024	-0.003
	(0.009)	(0.017)	(0.017)	(0.017)

Constant	2.664^***	3.702^***
	(0.176)	(0.502)


Fixed effects?	No	Country	Country	Country and Year
Observations	2,806	2,806	2,806	2,806
R²	0.077	0.427	0.024	0.002
Adjusted R²	0.076	0.396	-0.027	-0.060
Residual Std. Error	2.580 (df = 2802)	2.086 (df = 2665)
F Statistic	78.432^*** (df = 3; 2802)	14.161^*** (df = 140; 2665)	22.151^*** (df = 3; 2665)	1.442 (df = 3; 2643)

Note:	*p<0.1; **p<0.05; ***p<0.01
	Fixed effects estimated but not shown in Fixed Effects column

In this very stripped-down model, you can see how our substantive interpretation of the data changes radically based on our modeling choices. In a pooled model, increased GDP per capita promotes military spending; when we control for country-specific effects, it decreases military expenditure; and when we also control for period effects, it has no statistically significant effect (although our estimate of \(\beta_1\) remains negative). Indeed, the only result that holds across these specifications is that more economic globalization is associated with lower military expenditure, and even that is significant only at the 0.1 level in the pooled and two-way models. To be even more skeptical, I suspect part of that result is driven by EU countries, which are highly “globalized” and have lower military expenditure as a percent of GDP than other countries.

Please note: this is a toy analysis, not one that you should use as dispositive (“I learned in class that regime type only weakly affects military spending!”). A proper analysis would take way more time and effort to overcome problems that, by now, should be second nature. But the fact that we asked the same question with three different methods and got three different answers should be terrifying to you! It’s super easy to trick yourself into thinking that you have the right answer … when you might just have the wrong method.

Over the next term, we will begin to think about what modeling strategy is most appropriate. But these results should force you to think hard about appropriate choices.

Paul Musgrave

12/8/2016
