Panel Data

Author

Christopher P. Adams

Introduction

The first part of the book considers the value of exogenous changes in variables to help predict the effect of a policy change. The second part of the book illustrates how economic theory can be used explicitly to predict the effect. This part argues that repeated measurement provides valuable information for predicting the effect of a policy change.

Repeated measurement of the same individual allows us to infer the individual treatment effect and uncover unobserved characteristics of the individual. This is of particular interest with panel data sets. In microeconomic panel data sets, such as the National Longitudinal Survey of Youth, we observe a large number of individuals across many years. This chapter works through the standard approaches of difference in difference and fixed effects. These methods are useful when a subset of the individuals observed in the data face differing levels of exposure to the policy over time.

First Differences

flowchart TD
  T(Time) ----> X(X_t)
  T ----> U(U_t)
  X -- b_i --> Y(Y_t)
  U -- d_i --> Y
  U ----> X
Figure 1: Confounded time graph.

Consider data generated according to Figure 1. We are interested in the causal effect of \(X\) on \(Y\) which is denoted \(b_i\). Note that this effect may vary with the individual denoted \(i\). Now consider that we observe two different \(X\)’s for the same individual. Say two different education levels. By observing the same individual with two different education levels we can estimate the individual effect on the outcome of interest. We can measure the difference in income associated with the increase in education. We can measure the causal effect of education on income for each individual.

However, if we observe the same individual with two different education levels then we observe the same individual at two different times. Unobserved characteristics of the individual may also change between the two time periods. There is both an effect of education and of Time on \(Y\). The time effect is denoted \(d_i\) in the graph.

Figure 1 shows that the relationship between \(X\) and \(Y\) is confounded. In the terminology of Judea Pearl, there is a backdoor relationship that works through Time. That is, you can follow a line directly from \(X\) to \(Y\), and you can also follow a line “backwards” from \(X\) to \(Time\), then from \(Time\) to \(U\), and on to \(Y\). If the effect of \(Time\) on both \(X\) and \(U\) is 1, then the coefficient from the regression of \(Y\) on \(X\) is equal to \(b_i + d_i\). In order to estimate \(b_i\) we need to find a way to estimate \(d_i\).
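The claim that the regression picks up \(b_i + d_i\) can be checked with a small noise-free sketch. The values \(b = 3\) and \(d = -4\) and all variable names below are illustrative assumptions, and the setup simply lets Time move both \(X\) and \(U\) one for one.

```r
# Noise-free illustration: Time shifts both X and U by one unit.
b <- 3    # assumed causal effect of X on Y
d <- -4   # assumed effect of U on Y
time <- rep(c(0, 1), each = 50)  # two periods, 50 observations each
X <- time   # effect of Time on X is 1
U <- time   # effect of Time on U is 1
Y <- b * X + d * U
slope <- unname(coef(lm(Y ~ X))[2])
slope   # equals b + d rather than b
```

Because \(X\) and \(U\) move together, the regression cannot tell them apart and the slope combines the two effects.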

This section estimates the policy effect by taking first differences in the outcome and the policy variables.

First Difference Model

Consider that we observe an outcome of interest for individual \(i\) at two time periods \(t \in \{1, 2\}\).

\[ y_{it} = a + b x_{it} + d \upsilon_{it} + \epsilon_{it} \tag{1}\]

where the outcome of interest \(y_{it}\) is determined by the policy variable \(x_{it}\) and some unobserved time effect \(\upsilon_{it}\).

In Chapter 2 we accounted for this additional variable by including it in the regression. Here we can account for the unobserved time effect by taking first differences.

\[ y_{i2} - y_{i1} = b (x_{i2} - x_{i1}) + d (\upsilon_{i2} - \upsilon_{i1}) + \epsilon_{i2} - \epsilon_{i1} \tag{2}\]

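If we assume that the unobserved time term changes by 1 between the two periods (\(\upsilon_{i2} - \upsilon_{i1} = 1\)), then the term \(d (\upsilon_{i2} - \upsilon_{i1})\) is the constant \(d\), and we can add an intercept to the differenced regression to measure the time effect.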

Simulated Panel Data

Consider the simulated panel data set created below. There are 1,000 simulated individuals observed over two time periods. Note that \(x\) is increasing, at least weakly, over time. What do you get if you run OLS of \(y\) on \(x\)?[1]

set.seed(123456789)
N <- 1000  # 1000 individuals
T <- 2    # 2 time periods
a <- -2
b <- 3
d <- -4
u <- c(5,6)  # unobserved characteristic in the two periods.
x1 <- runif(N)
x2 <- x1 + runif(N)  # change in X over time.
x <- as.matrix(rbind(x1,x2))  # creates a matrix T x N.
e <- matrix(rnorm(N*T),nrow = T)  
# additional unobserved characteristic
y <- a + b*x + d*u + e
# matrix with T rows and N columns.

OLS Estimation of First Differences

The simulated data allows us to observe \(x\) change for the same individual. Given that we are interested in the effect of this change in \(x\), we could look at first differences. That is, we could regress the change in \(y\) on the change in \(x\).

diffy <- y[2,] - y[1,]
diffx <- x[2,] - x[1,]
lm1 <- lm(diffy ~ diffx)

Table 1 presents the results of the first difference approach. It does a reasonably good job of recovering the true value of \(b = 3\) when the intercept term is included: the estimate of 2.75 is within two standard errors of 3. The intercept captures the time effect and is close to \(d = -4\).

Table 1: OLS of first differences.
              Estimate    Std. Error   t value     Pr(>|t|)
(Intercept)   -3.770994   0.0901336    -41.83780   0
diffx          2.748104   0.1541591     17.82642   0
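For comparison, the sketch below contrasts pooled OLS (stacking the matrices into vectors and ignoring the panel structure) with the first difference regression. It rebuilds a simulated data set along the lines of the one above so that it runs on its own; the object names `pooled`, `fd`, `b_pooled` and `b_fd` are introduced here for illustration.

```r
set.seed(123456789)
N <- 1000
a <- -2; b <- 3; d <- -4
u <- c(5, 6)                         # unobserved time term in periods 1 and 2
x1 <- runif(N)
x2 <- x1 + runif(N)                  # x rises over time
x <- rbind(x1, x2)                   # 2 x N matrix
e <- matrix(rnorm(N * 2), nrow = 2)
y <- a + b * x + d * u + e           # u recycles down columns: row 1 gets 5, row 2 gets 6

# Pooled OLS: stack both periods and ignore which period each row comes from.
pooled <- lm(as.vector(y) ~ as.vector(x))

# First differences with an intercept for the time effect.
diffy <- y[2, ] - y[1, ]
diffx <- x[2, ] - x[1, ]
fd <- lm(diffy ~ diffx)

b_pooled <- unname(coef(pooled)[2])
b_fd <- unname(coef(fd)[2])
c(pooled = b_pooled, first_diff = b_fd)
```

In this setup the pooled slope is pulled well below \(b\), because \(x\) rises over time while the time effect \(d\) is negative. The differenced regression separates the two.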

Difference in Difference

Consider a case where we observe two types of individuals in the data. For example, let it be the case that for all individuals in the first time period we observe them with only a high school education. Then for the second time period, say ten years later, we observe two different groups. The first group has a college education. We call this the treated group, while the second group remains with a high school education. We call these people the non-treated group. For this second group, we can estimate the impact of time on the outcome of interest. For this group we can estimate the impact of the ten years on income. Therefore, if the effect of time is the same for both the treated group and the non-treated group we can disentangle the effect of time and the policy variable. We can disentangle the effect of the ten years and attending college on income. Of course, it is not at all obvious that the effect of time would be the same for both groups.

This section presents the difference in difference estimator and illustrates the estimator with simulated data.

Difference in Difference Estimator

In algebra, we have the following derivation of the difference in difference estimator. Assume that the observed outcome of interest for individual \(i\) in time \(t\) is determined by the treatment according to the following function.

\[ y_{it}(x_{it}) = a + b x_{it} + d \upsilon_t + \epsilon_{it} \tag{3}\]

where \(x_{it} \in \{0, 1\}\) denotes if the individual receives the treatment. The equation says that individual \(i\)’s outcome in time \(t\) is determined by both whether or not she has received the treatment and an unobserved term \(\upsilon_t \in \{0, 1\}\) that changes with time. There is also another unobserved term \(\epsilon_{it}\). We will assume that this second unobserved term has a mean of 0 and is independent of everything else that is going on. It is “noise.”

Consider observing outcomes \(y_{it}\) for a person in the treated group in two different periods (\(t \in \{0, 1\}\)). The first period is prior to the individual receiving the treatment. Say prior to going to college. The second period is after the individual receives the treatment and the person goes to college. From above, if we compare the expected outcomes from the two cases we get the following result.

\[ \mathbb{E}(y_{i1} | x_{i1} = 1, \upsilon_{1} = 1) - \mathbb{E}(y_{i0} | x_{i0} = 0, \upsilon_{0} = 0) = b + d \tag{4}\]

where \(y_{i1}\) is the outcome observed in time period 1 for individual \(i\) and \(x_{i1}\) indicates the treatment received by individual \(i\) in period 1.

For an individual in the non-treated group, we have the following equation.

\[ \mathbb{E}(y_{i1} | x_{i1} = 0, \upsilon_{1} = 1) - \mathbb{E}(y_{i0} | x_{i0} = 0, \upsilon_{0} = 0) = d \tag{5}\]

For this group, the change in the time period does not change the treatment that they receive. For this group, the only change in outcomes is due to time.

Now we can compare the two differences. Equation 4 considers two conditional expectations. The first conditions on the treatment level equal to 1. This only occurs for individuals in the treated group, in the time period in which they are treated. This is the group that attends college. Equation 5 gives the difference in the conditional expectation for the non-treated group across the two time periods. Thus, we can estimate \(b\) by subtracting the second difference from the first difference.

The analog estimator is as follows.

\[ \beta_{did} = \frac{\sum_{i=1}^N \mathbb{1}(x_{i1}=1) (y_{i1} - y_{i0})}{\sum_{i=1}^N \mathbb{1}(x_{i1}=1)} - \frac{\sum_{i=1}^N \mathbb{1}(x_{i1}=0)(y_{i1} - y_{i0})}{\sum_{i=1}^N \mathbb{1}(x_{i1}=0)} \tag{6}\]

where \(\mathbb{1}()\) is an indicator function; it is 1 if the statement in the parentheses is true and 0 otherwise. The difference in difference estimator is the average difference in outcomes for the treated less the average difference in outcomes for the non-treated.
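As a minimal sketch of the analog estimator, the code below computes the difference in average changes directly and then recovers the same number from a regression of the change in outcome on the treatment dummy. The simulated data, the seed and the names `did_means` and `did_reg` are assumptions made for illustration; the intercept \(a\) is omitted because it differences out.

```r
set.seed(1)
N <- 500
b <- 3; d <- -4
treat <- runif(N) < 0.5                # TRUE for the treated group
y0 <- rnorm(N)                         # period 0: nobody is treated
y1 <- b * treat + d + rnorm(N)         # period 1: time effect d plus treatment effect b
dy <- y1 - y0

# Average change for the treated less average change for the non-treated.
did_means <- mean(dy[treat]) - mean(dy[!treat])

# The same number from a regression of the change on the treatment dummy.
did_reg <- unname(coef(lm(dy ~ treat))[2])
c(means = did_means, regression = did_reg)
```

The regression form is convenient because it also delivers a standard error for the estimate.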

Difference in Difference Estimator in R

The difference in difference estimator is used to separate out the effect of time (\(u\)) from the treatment (\(x\)). This function creates a table with the two differences.

f_did <- function(Y,treat) {
  y1 <- Y[1,]
  y2 <- Y[2,]
  did <- matrix(NA,3,3)  
  # creates the difference in difference matrix
  did[1,1] <- mean(y2[treat==1]) 
  # calculates the average outcomes for each
  did[2,1] <- mean(y1[treat==1])
  did[1,2] <- mean(y2[treat==0])
  did[2,2] <- mean(y1[treat==0])
  did[3,1] <- did[1,1] - did[2,1]  
  # calculates the differences.
  did[3,2] <- did[1,2] - did[2,2]
  did[1,3] <- did[1,1] - did[1,2]
  did[2,3] <- did[2,1] - did[2,2]
  did[3,3] <- did[3,1] - did[3,2]
  row.names(did) <- c("Period 2", "Period 1", "Diff")
  colnames(did) <- c("Treated", "Not Treated", "Diff")
  return(did)
}

In the simulated data there are two groups, a treated group and a non-treated group. For the treated group, the change in the outcome \(y\) between the two periods is determined by both \(x\) and \(u\). For the non-treated group, the change in \(y\) is determined only by \(u\). Of course, both groups are also affected by the random term \(e\).

set.seed(123456789)
treat <- runif(N) < 0.5
x2 <- x1 + treat  
# this time the change is 1 for the treated group.
x <- rbind(x1,x2)
y <- a + b*x + d*u + e
did1 <- f_did(y,treat)
Table 2: Difference in difference estimator on simulated data.
            Treated       Not Treated   Diff
Period 2    -22.1573619   -23.660074     1.502712
Period 1    -21.2527451   -19.767096    -1.485649
Diff         -0.9046169    -3.892978     2.988361

The difference in difference estimator gives an estimate of 2.99, where the true value is 3. Table 2 presents the estimator, where the four cells in the top left are the mean of \(y\) for each case. Can you work out the bootstrapped standard errors for the estimator?
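One way to approach the bootstrap question is to resample individuals with replacement, keeping both periods for each individual together, and recompute the estimator on each draw. The sketch below is one possible approach under stated assumptions, not the author's solution: it uses a fresh simulation rather than the data behind Table 2, and the helper `did_est` and the number of draws `K` are hypothetical choices.

```r
set.seed(123456789)
N <- 1000
a <- -2; b <- 3; d <- -4
u <- c(5, 6)
x1 <- runif(N)
treat <- runif(N) < 0.5                    # treated group indicator
x <- rbind(x1, x1 + treat)                 # x rises by 1 for the treated
e <- matrix(rnorm(N * 2), nrow = 2)
y <- a + b * x + d * u + e

# Difference in difference from a 2 x N outcome matrix.
did_est <- function(Y, treat) {
  dy <- Y[2, ] - Y[1, ]
  mean(dy[treat]) - mean(dy[!treat])
}

K <- 500                                   # number of bootstrap draws (an assumption)
boot <- numeric(K)
for (k in 1:K) {
  idx <- sample(N, N, replace = TRUE)      # resample individuals (columns)
  boot[k] <- did_est(y[, idx, drop = FALSE], treat[idx])
}
est <- did_est(y, treat)
se_boot <- sd(boot)                        # bootstrap standard error
c(estimate = est, std_error = se_boot)
```

Resampling whole columns preserves the within-individual link between the two periods, which is what the differencing relies on.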

Effect of a Minimum Wage Increase in New Jersey

What happens if the minimum wage is increased? Economic theory gives a standard prediction. If the minimum wage is increased above the equilibrium wage, the quantity of labor demanded will fall and the quantity of labor supplied will increase. In fact, a minimum wage increase may actually harm more low-wage workers than it helps. While per-hour pay may increase, the number of hours may decrease. At least in theory.

What actually happens if the minimum wage is increased? This section uses the difference in difference estimator with data from Card and Krueger (1994) to determine what actually happened in New Jersey.

Data from Card and Krueger (1994)

Card and Krueger (1994) surveyed restaurants before and after the state of New Jersey increased the minimum wage. In 1991, the US federal minimum wage was $4.25/hour. In April of 1992, the state of New Jersey increased its minimum wage above the federal level to $5.05/hour. In order to see how much a minimum wage increase led to a decrease in labor demand, Card and Krueger (1994) surveyed restaurants in New Jersey before and after the change. The following code imports the Card and Krueger (1994) data. Note that there is a labeling issue with two restaurants receiving the same label.

x <- read.csv("cardkrueger.csv", as.is = TRUE)
x1 <- cbind(x$SHEET,x$EMPFT,x$EMPPT,x$STATE,1)
# SHEET is the firm ID.
# this includes the initial employment levels.
# column of 1s added to represent the initial time period.
colnames(x1) <- c("SHEET","FT","PT","STATE","TIME")
# FT - fulltime, PT - parttime.
x1 <- as.data.frame(x1)
x1$SHEET <- as.numeric(as.character(x1$SHEET))
x1[x1$SHEET==407,]$SHEET <- c(4071,4072)
# there is an issue with the labeling in the data.
# two firms with the same label.
x2 <- cbind(x$SHEET,x$EMPFT2,x$EMPPT2,x$STATE,2)
# the second period of data.
colnames(x2) <- c("SHEET","FT","PT","STATE","TIME")
x2 <- as.data.frame(x2)
x2$SHEET <- as.numeric(as.character(x2$SHEET))
# a number of variables are changed into "factors"
# as.numeric(as.character()) changes them back into numbers.
x2[x2$SHEET==407,]$SHEET <- c(4071,4072)
x3 <- rbind(x1,x2)  # putting both periods together.
colnames(x3) <- c("SHEET","FT","PT","STATE","TIME")
x3$FT <- as.numeric(as.character(x3$FT))
x3$PT <- as.numeric(as.character(x3$PT))
# non-numeric entries become NA; R warns "NAs introduced by coercion".
Figure 2: Histogram of New Jersey wage rate prior to the minimum wage increase. Note that the minimum wage changes from $4.25 to $5.05 (represented by the two vertical lines).

Figure 2 shows that the minimum wage change will have some bite. A fairly large number of firms pay exactly the minimum of $4.25 per hour, and most firms pay less than the new minimum of $5.05.

Difference in Difference Estimates

The concern with just comparing employment in New Jersey before and after the minimum wage change is that other factors may have also changed between the two time periods. Card and Krueger (1994) use difference in difference to account for the time effects. The authors propose using restaurants in the neighboring state of Pennsylvania as the non-treated group. They argue that restaurants in Pennsylvania are not impacted by the New Jersey law change. In addition, these restaurants are similar enough to the New Jersey restaurants that other changes in the economy will be the same between the two states. The restaurants in New Jersey and Pennsylvania will have the same “time effect” on average.

To see what happens we can follow the procedure presented above.

Y <- rbind(x3$FT[x3$TIME==1] + x3$PT[x3$TIME==1],
           x3$FT[x3$TIME==2] + x3$PT[x3$TIME==2])
index_na <- is.na(colSums(Y))==0
Y1 <- Y[,index_na]
treat <- x3$STATE[x3$TIME==1]==1
treat1 <- treat[index_na]
did2 <- f_did(Y1,treat1)
Table 3: Difference in difference estimates of the effect of minimum wage law changes in New Jersey.
            Treated      Not Treated   Diff
Period 2    26.6047619   27.473684     -0.8689223
Period 1    26.4015873   29.710526     -3.3089390
Diff         0.2031746   -2.236842      2.4400167

The results suggest that the minimum wage increase did not reduce employment in New Jersey. Table 3 presents the difference in difference estimates on the total count of employees, both full time and part time.[2] In the table we actually see a very small increase in average employment in New Jersey (about 0.2 employees per restaurant) from before to after the law change. Meanwhile, in Pennsylvania average employment falls by about 2.2 employees over the same time period. The net result is an increase of about 2.4 employees per restaurant associated with the change in the minimum wage law.

The result seems counter-intuitive, at least counter to standard economic theory. The result is heavily reliant on the assumption that the change in employment in Pennsylvania is a good proxy for the time effect in New Jersey. Is that assumption reasonable?

Fixed Effects

In data sets with more time periods we can use the fixed effects model. As above, it is assumed the time effect is the same for each individual, irrespective of the treatment. Some subset of the sample is treated and they receive the treatment in some subset of the time periods. Usually, there is a pre-treatment and a post-treatment period.

This section presents the general model with individual and time fixed effects.

Fixed Effects Estimator

As with the difference in difference, in the fixed effect model the time effect is assumed to be the same for everyone. This model allows individuals to have different outcomes that are persistent through time. In the restaurant example, we expect some restaurants to have higher employment in both periods, relative to other restaurants. However, with only two time periods we cannot account for these differences. With more pre-treatment time periods we can attempt to measure differences between restaurants. Accounting for such differences may be particularly important if Pennsylvania and New Jersey tend to have different types of restaurants.[3]

The general model has the observed outcome as a function of two fixed effects. The first is an individual effect that is allowed to vary across individuals but is persistent through time. The second is a time effect that varies across time but is persistent through the cross section of individuals. There is also a treatment effect that is assumed to affect a subset of individuals in a subset of time periods and an unobserved characteristic that varies across individuals and time.

\[ y_{it}(x_{it}) = \alpha_i + \beta x_{it} + \gamma_t + \epsilon_{it} \tag{7}\]

where

\[ x_{it} = \left \{\begin{array}{l} 1 \mbox{ if } i \mbox{ is treated and } t > T_0\\ 0 \mbox{ otherwise.} \end{array} \right. \tag{8}\]

where \(T_0\) is the last pre-treatment period (the treatment is in place from period \(T_0 + 1\) onwards), \(y_{it}\) is the outcome, \(\alpha_i\) is the fixed effect for individual \(i\), \(\beta\) measures the treatment effect, \(\gamma_t\) is the “time effect,” \(x_{it}\) is the indicator of the treatment and \(\epsilon_{it}\) is the unobserved term for each individual and time period. We are interested in estimating \(\beta\).

Consider our restaurant example but this time we observe employment levels for the restaurants for a number of years prior to the minimum wage increase in New Jersey. We can allow different restaurants to be big or small leading to a different number of employees on average. We can also allow all restaurants to be hit by changes to general economic conditions. Lastly, we assume that only restaurants in New Jersey are impacted by the increase in the minimum wage in New Jersey.

Nuisance Parameter

A nice feature of the fixed effects model is that it accounts for differences between the treated and non-treated groups that are not due to the treatment itself. The \(\alpha_i\) parameter is allowed to be different for each individual \(i\). While allowing a lot of flexibility, this parameter is also a nuisance. It is a nuisance parameter in the econometrics sense. It cannot be consistently estimated.[4] It does not converge to the true value as the number of individuals gets large.[5] In addition, it may not be of direct relevance. As mentioned above, we are interested in estimating \(\beta\) not \(\alpha_i\). It is also a nuisance parameter in that it makes the model difficult to estimate.

A simple solution is to not include the individual dummies in the estimator. The problem with this approach is that it makes the estimator noisier, because the individual effects are absorbed into the error term. It may also lead to a biased estimate if there is some systematic difference between the treated group and the non-treated group.
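The bias from dropping the individual dummies can be illustrated with a small simulation. In the sketch below the treated individuals are assumed to have individual effects that are higher by 2 on average; that gap, the sample sizes and all variable names are assumptions chosen for illustration.

```r
set.seed(42)
N <- 200
n_periods <- 5                            # four pre-treatment periods, one treated period
beta <- 3
group <- runif(N) < 0.5                   # treated individuals
alpha <- 2 * group + rnorm(N)             # assumed: treated individuals have higher alpha
gamma <- runif(n_periods)
id <- rep(1:N, each = n_periods)
period <- rep(1:n_periods, times = N)
x <- as.numeric(group[id] & period == n_periods)   # treatment switches on in the last period
y <- alpha[id] + gamma[period] + beta * x + rnorm(N * n_periods)

# With individual and time dummies.
with_fe <- unname(coef(lm(y ~ x + factor(period) + factor(id)))["x"])
# With time dummies only.
no_fe <- unname(coef(lm(y ~ x + factor(period)))["x"])
c(with_dummies = with_fe, without_dummies = no_fe)
```

Without the individual dummies, the persistent gap between the two groups is attributed to the treatment, so the estimate is pushed above the true value of 3.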

Adjusted Fixed Effects Estimator

One nice solution is to do adjusted estimation. In the first step, the pre-treatment data is used to estimate the \(\alpha_i\) parameter and then the estimates from the first step are used to estimate the treatment effect parameter of interest. The pre-treatment outcomes are averaged.

\[ \frac{1}{T_0}\sum_{t=1}^{T_0} y_{it} = \alpha_i + \frac{1}{T_0}\sum_{t=1}^{T_0} \gamma_t + \frac{1}{T_0}\sum_{t=1}^{T_0} \epsilon_{it} \tag{9}\]

If we substitute that estimate back into Equation 7 we get the following equation. Note that the nuisance parameter has been removed (it has been replaced with an estimate).

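\[ y_{it} - \bar{y}_i = \beta x_{it} + (\gamma_t - \bar{\gamma}) + \epsilon_{it} - \bar{\epsilon}_{i} \tag{10}\]

where \(\bar{y}_i\), \(\bar{\gamma}\) and \(\bar{\epsilon}_i\) are the averages of the outcome, the time effect and the unobserved term for individual \(i\) over the pre-treatment period. Note that the time effect does not drop out: it remains, re-centered by \(\bar{\gamma}\), and is estimated with time dummies. Equation 10 gives the basis for the adjusted fixed effects estimator.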

A potential issue is that we have added extra noise to the estimator (\(\bar{\epsilon}_i\)). Under standard assumptions, the longer the pre-treatment period the smaller this problem is.[6]
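A quick simulation shows how the added noise shrinks with the length of the pre-treatment period. The sketch assumes independent standard normal errors, in which case the standard deviation of \(\bar{\epsilon}_i\) is \(1/\sqrt{T_0}\); the helper `sd_of_mean` and the grid of \(T_0\) values are illustrative.

```r
set.seed(7)
N <- 5000                                    # number of simulated individuals (an assumption)
sd_of_mean <- function(T0) {
  eps <- matrix(rnorm(N * T0), nrow = T0)    # T0 pre-treatment errors per individual
  sd(colMeans(eps))                          # spread of the averaged error across individuals
}
T0_grid <- c(2, 5, 10, 50)
sims <- sapply(T0_grid, sd_of_mean)
cbind(T0 = T0_grid, simulated = sims, theory = 1 / sqrt(T0_grid))
```

With only a couple of pre-treatment periods the averaged error is nearly as noisy as a single draw, while with many periods it is close to zero.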

Two Step Fixed Effects Estimator

Another approach is a two step estimator. In the first step, the individual fixed effects are estimated for each individual separately in the pre-treatment period. The individual fixed effects estimator is simply the mean of the residual of the regression of the outcome variable on the time-dummies.

\[ \hat{\alpha}_i = \frac{1}{T_0} \sum_{t=1}^{T_0} (y_{it} - \frac{1}{N}\sum_{j=1}^{N} y_{jt}) \tag{11}\]

The second step regresses the outcome less the fixed effect on the policy variables of interest.

\[ y_{it} - \hat{\alpha}_i = \gamma_t + \beta x_{it} + (\alpha_i - \hat{\alpha}_i) + \epsilon_{it} \tag{12}\]

Equation 12 forms the basis for the estimator with the outcome and the “error” term adjusted by netting out the estimated fixed effect.

This estimator highlights the estimation problem. If the pre-treatment period is not very long, then each \(\hat{\alpha}_i\) will not be well estimated. Under standard assumptions these estimates are unbiased, which implies that in some cases the inaccuracy of the estimates is fine, while in other cases it may be problematic.

Also note the similarity and difference between the two step estimator and the adjusted estimator.[7]
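In a balanced panel with no missing values, the first-step estimate in Equation 11 is the individual's pre-treatment mean less the grand pre-treatment mean, so it differs from the adjusted estimator's first step by the same constant for everyone. The sketch below checks this numerically on simulated pre-treatment data; the dimensions and names are assumptions for illustration.

```r
set.seed(3)
N <- 50
T0 <- 6
alpha <- runif(N)
gamma <- runif(T0)
y0 <- matrix(rnorm(N * T0), nrow = T0) +     # noise
  matrix(alpha, T0, N, byrow = TRUE) +       # individual effects, one per column
  gamma                                      # time effects recycle down the rows

alpha_adj <- colMeans(y0)                    # adjusted: pre-treatment mean (Equation 9)
alpha_two <- colMeans(y0 - rowMeans(y0))     # two step: mean deviation from period means (Equation 11)

gap <- alpha_adj - alpha_two
range(gap)                                   # the same constant for every individual
mean(y0)                                     # that constant is the grand pre-treatment mean
```

A constant shift in the first step is absorbed by the intercept in the second step, which is consistent with the two approaches giving the same treatment estimate on balanced data.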

Fixed Effects Estimator in R

The following fixed effects estimator takes data in the form of panel data matrices and converts them into vectors for use with lm(). The function takes advantage of R’s ability to quickly create a large number of dummy variables using as.factor(). Note that the function has the option to include fixed effects only in the time dimension. It also does not require a treatment variable.

f_fe <- function(Y, X=NULL, cross=TRUE) {
  Y <- as.matrix(Y)
  T <- dim(Y)[1]
  N <- dim(Y)[2]
  XT <- matrix(rep(c(1:T),N), nrow=T)
  # creates a T x N matrix with numbers 1 to T in each column
  y <- as.vector(Y)
  t <- as.vector(XT)
  # set up for different cases
  if (cross) { # create cross-section dummies
    XC <- t(matrix(rep(c(1:N),T), nrow=N))
    # creates a T x N matrix with 1 to N in each row
    c <- as.vector(XC)
  }
  if (is.null(X)==0) { # create treatment variable
    X <- as.matrix(X)
    treat <- as.vector(X)
  }
  # estimator
  if (cross & is.null(X)==0) { # standard case
    lm1 <- lm(y ~ treat + as.factor(t) + as.factor(c))
  } 
  else {
    if (is.null(X)==0) { # no cross-section
      lm1 <- lm(y ~ treat + as.factor(t))
    } 
    else { # no treatment
      lm1 <- lm(y ~ as.factor(t))
    }
  }
  return(lm1)
}
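To see how the index matrices line up with the stacked outcome vector, the sketch below repeats the same construction on a tiny panel with two periods and three individuals. The names `T_small`, `N_small` and `Y_small` are hypothetical and used only for this illustration.

```r
T_small <- 2
N_small <- 3
XT <- matrix(rep(1:T_small, N_small), nrow = T_small)       # each column runs 1 to T
XC <- t(matrix(rep(1:N_small, T_small), nrow = N_small))    # each row runs 1 to N
t_index <- as.vector(XT)   # 1 2 1 2 1 2
c_index <- as.vector(XC)   # 1 1 2 2 3 3
Y_small <- matrix(1:6, nrow = T_small)   # column i holds individual i's two periods
data.frame(y = as.vector(Y_small), t = t_index, c = c_index)
```

Because as.vector() stacks a matrix column by column, each individual's periods stay together in the stacked data, with the time index cycling fastest.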

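The simulated panel data set has 100 individuals observed over 10 time periods. About half of the individuals are treated and the treatment occurs in a single time period. Note that the code stores the treated period in the first row of the matrices, so rows 2 to 10 are the pre-treatment periods.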

set.seed(123456789)
N <- 100
T <- 10
alpha <- runif(N)
gamma <- runif(T)
beta <- 3
epsilon <- matrix(rnorm(N*T),nrow=T)
treat <- runif(N) < 0.5
y <- t(matrix(rep(alpha,T),nrow = N)) + gamma + epsilon
y[1,] <- y[1,] + beta*treat
treat1 <- matrix(0,T,N)
treat1[1,] <- treat

We can compare the different estimators on the simulated data.

# standard estimator
lm1 <- f_fe(y, treat1)
# No individual fixed effects estimator
lm2 <- f_fe(y, treat1, cross = FALSE)
# Adjusted estimator
y0 <- y[2:T,]  # pre-treatment outcomes.
alphahat <- colMeans(y0) # calculate alpha.
y2 <- y - t(matrix(rep(alphahat,T),nrow=N))  
# adjust outcome.
lm3 <- f_fe(y2, treat1, cross = FALSE)
# Two step estimator
lm4 <- f_fe(y0, cross = FALSE)  # adjust for time effects.
y0_res <- matrix(lm4$residuals, nrow=T-1)
alpha_hat <- colMeans(y0_res) # calculate alpha.
y3 <- y - t(matrix(rep(alpha_hat,T),nrow=N)) # adjust outcome
lm5 <- f_fe(y3, treat1, cross=FALSE)
Table 4: Results from the various fixed effects estimators on simulated data where the true value is 3. The models are the standard estimator (1), the estimator where the individual dummies are dropped (2), the adjusted estimator (3), and the two-step estimator (4). Standard errors are in parentheses.
               (1)         (2)         (3)         (4)
(Intercept)    1.005       1.056       0.078       1.089
               (0.352)     (0.157)     (0.144)     (0.144)
treat          2.998       3.060       2.998       2.998
               (0.215)     (0.214)     (0.195)     (0.195)
Num.Obs.       1000        1000        1000        1000
R2             0.458       0.340       0.378       0.378
R2 Adj.        0.391       0.333       0.372       0.372
AIC            2980.6      2978.9      2796.9      2796.9
BIC            3525.4      3037.8      2855.8      2855.8
Log.Lik.       -1379.315   -1477.443   -1386.465   -1386.465
RMSE           0.96        1.06        0.97        0.97

Table 4 provides a nice comparison between the four approaches. The true parameter is 3 and the standard model gives a good estimate (2.998). The model in which the individual dummies are dropped gives a somewhat less accurate estimate (3.060). The adjusted and two-step models give the same estimate as the standard model.

Effect of a Federal Minimum Wage Increase

In 2007, Congress passed legislation that raised the federal minimum wage in three steps, from $5.15 to $7.25 per hour by July 2009. The results from the difference in difference analysis of Card and Krueger (1994) suggest that this change will have little or no effect on employment. However, the approach makes quite strong assumptions and looks at the impact on restaurants rather than on individuals.

The NLSY97 data seems well suited to analyzing the impact of the federal minimum wage increase. The individuals in the data are in their mid to late 20s when the changes occur. At least some proportion of these individuals are likely to work in jobs that pay minimum wage or would be impacted by the changes. The analysis here follows Currie and Fallick (1996), who use NLSY79 to analyze the impact of minimum wage changes in 1979 and 1980.

NLSY97

The following code imports a data set that I have created from NLSY97.[8] It then creates two matrix panels, one for income and one for hours worked. For each, the rows are time periods and the columns are individuals.

x <- read.csv("NLSY97Panel.csv",as.is=TRUE)
x_names <- read.csv("NLSY97Panel_names.csv",
                    header=FALSE,as.is = TRUE)
colnames(x) <- as.vector(x_names[,1])
# create two matrices, with 18 (years) 
# rows 8984 (individuals) columns
# one for income and one for hours worked.
year <- c(1997:2014)
year1 <- c("97","98","99","00","01","02","03",
           "04","05","06","07","08",
           "09","10","11","12","13","14")
# below we need both versions of year.
W <- Y <- matrix(NA,18,8984)
for (i in 1:18) {
  hrs_name <- paste("CVC_HOURS_WK_YR_ALL_",
                    year1[i],"_XRND",sep="")
  # paste() is used to concatenate strings.
  inc_name <- paste("YINC_1700_",year[i],sep="")
  if (hrs_name %in% colnames(x)) {
    Y[i,] <- ifelse(x[,colnames(x)==hrs_name] >= 0,
                    x[,colnames(x)==hrs_name],NA)
  }
  # %in% asks whether something is an element of a set.
  if (inc_name %in% colnames(x)) {
    W[i,] <- ifelse(x[,colnames(x)==inc_name] >= 0,
                    x[,colnames(x)==inc_name],NA)
  }
}
rate_07 <- W[11,]/Y[11,]  
# calculates the wage rate for each person.
x$treat <- ifelse(rate_07<7.26 | W[11,]==0 | Y[11,]==0,1,0)
# treated if earn less than 7.26/hour or no earnings in 2007.

The data includes information on almost 9,000 individuals who are tracked across 18 years. For each individual we know their income and hours worked for the year. From this we can calculate their average hourly wage rate. We can then identify the individuals who earned less than $7.26 an hour in 2007, or who had no income or no hours that year. We will call this group the treated group. Individuals earning more than this are assumed to be unaffected by the policy change. These are the non-treated group. How reasonable is this assignment? Can you re-do this analysis with a different assignment to treated and non-treated?
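To experiment with other assignments, the rule can be wrapped in a small function with the cutoff as an argument. The helper `assign_treat` and the toy income and hours vectors below are hypothetical and stand in for the 2007 rows of the income and hours matrices.

```r
# Hypothetical helper: treated if the implied hourly wage is below the cutoff,
# or if income or hours are zero in the base year.
assign_treat <- function(income, hours, cutoff = 7.26) {
  rate <- income / hours
  ifelse(rate < cutoff | income == 0 | hours == 0, 1, 0)
}

# Toy data for five people (not taken from NLSY97).
income <- c(10000, 30000, 0, 12000, NA)
hours <- c(2000, 2000, 500, 0, 1000)

treat_base <- assign_treat(income, hours)               # 1 0 1 1 NA
treat_low <- assign_treat(income, hours, cutoff = 4)    # 0 0 1 1 NA
rbind(treat_base, treat_low)
```

Individuals with missing income or hours receive NA rather than being assigned to either group, which matches how ifelse() treats missing values.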

Fixed Effects Estimators of the Minimum Wage

We can use the standard fixed effects estimator to determine the effect of the minimum wage change on hours worked. The code creates a treatment variable. Note that the pre-treatment period is the first ten years (1997 to 2006), while the post-treatment period is the last eight years (2007 to 2014).

N <- 600 # reduce the size for computational reasons.
T <- 18
y1 <- Y[,1:N]
treat1 <- matrix(0,T,N)
for (i in 11:T) {
  treat1[i,] <- x$treat[1:N]
}
lm1 <- f_fe(y1, treat1)

As discussed above, the individual dummy variables are a nuisance to estimate. You can see this by changing the number of individuals used in the estimation. As the number increases the computation takes longer and longer.

It is a lot less computationally burdensome to estimate the “adjusted” fixed effects estimator. The outcome is “adjusted” by differencing out the average hours worked for each individual in the pre-treatment period.

N <- 8984  # full data set
y0 <- Y[1:10,1:N]  # pre-treatment
alpha_hat <- colMeans(y0, na.rm = TRUE)
y1 <- Y - t(matrix(rep(alpha_hat,T), nrow = N))
treat1 <- matrix(0,T,N)
for (i in 11:T) {
  treat1[i,] <- x$treat[1:N]
}
lm2 <- f_fe(y1, treat1, cross=FALSE)

We can also estimate the two step fixed effects estimator in order to compare the results.

lm3 <- f_fe(y0, cross=FALSE)
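# note: this subtracts the raw coefficients, so for periods 2 to 10 only the
# difference from period 1 is removed, not the full period mean. To remove
# the full period means, subtract coefficients[1] + c(0, coefficients[-1]).
y0_res <- y0 - matrix(rep(lm3$coefficients,N),nrow=10)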
alpha_hat <- colMeans(y0_res, na.rm = TRUE) # calculate alpha.
y3 <- Y - t(matrix(rep(alpha_hat,T),nrow=N)) # adjust outcome
lm4 <- f_fe(y3, treat1, cross=FALSE)

We can compare results across the three estimators. The estimates are similar. They show a fairly substantial reduction in hours associated with the increase in the minimum wage.

Table 5: Fixed effects estimate of the effect of increasing the minimum wage on hours worked. The models are the full fixed effects model with a subset of 600 individuals (1), the full data set but adjusted to average out the individual fixed effects (2), and the two-step estimator (3). Standard errors are in parentheses.
               (1)           (2)            (3)
(Intercept)    821.538       -722.012       25.935
               (185.236)     (8.313)        (8.287)
treat          -296.386      -271.147       -274.277
               (40.853)      (7.821)        (7.797)
Num.Obs.       8297          127107         127107
R2             0.542         0.275          0.283
R2 Adj.        0.505         0.274          0.283
AIC            134051.8      2054608.9      2053828.7
BIC            138392.5      2054804.0      2054023.8
Log.Lik.       -66407.919    -1027284.467   -1026894.360
RMSE           724.08        782.98         780.58

Table 5 presents the estimates with the standard estimator and a subset of the individuals, as well as the adjusted and two-step estimators with all individuals. The estimated reduction of roughly 270 to 300 hours per year is about 14% of the hours of a full-time person working 2,000 hours, and a much larger share for an average individual in the sample. That said, it is not clear that these workers are worse off, because the wage increase was substantial.

Are Workers Better Off?

a <- mean(x[x$treat==1,]$YINC_1700_2007, na.rm = TRUE)
b <- mean(x[x$treat==1,]$CVC_HOURS_WK_YR_ALL_07_XRND, 
          na.rm = TRUE)
c <- a/b  # 2007 wage rate
d <- mean(x[x$treat==1,]$YINC_1700_2010, na.rm = TRUE)  
# 2010 income
e <- mean(x[x$treat==1,]$CVC_HOURS_WK_YR_ALL_10_XRND, 
          na.rm = TRUE)  
# 2010 hours
d - c*(e + 270)
[1] 4206.07
# actual less counter-factual

The minimum wage increase leads to lower hours but higher incomes. To see what happens to the average treated person, compare their actual income in 2010 to their estimated counterfactual income in 2010. Their actual average income in 2010 is $12,548, and their actual average hours worked are 1,034. According to our estimates, if the minimum wage had not increased they would have worked 1,304 hours. However, they would have earned less. On average their wage prior to the minimum wage increase was $6.40 per hour. Assuming that is the counter-factual wage, the minimum wage change increased their income by about $4,200 per year. According to this analysis, the policy increased income for the treated group despite lowering their hours worked. What criticisms would you have of this analysis?
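The arithmetic behind the comparison can be reproduced from the rounded figures quoted above. The variable names below are introduced for illustration, and the small gap from the 4206.07 reported earlier comes from rounding the wage and the hours estimate.

```r
income_2010 <- 12548     # actual average 2010 income of the treated group
hours_2010 <- 1034       # actual average 2010 hours
hours_lost <- 270        # estimated reduction in hours (Table 5, rounded)
wage_2007 <- 6.40        # average wage before the increase, rounded

# Counterfactual: more hours, but paid at the old wage.
counterfactual_income <- wage_2007 * (hours_2010 + hours_lost)
gain <- income_2010 - counterfactual_income
gain   # 4202.4
```

The calculation holds the counterfactual wage fixed at its 2007 level, which is the key assumption to scrutinize.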

Discussion and Further Reading

The last part of the book presents the third major approach to estimating the policy effects of interest: repeated measurement. This chapter considers panel data. In this data, we observe outcomes for a large number of the same individuals over a number of time periods.

This data allows us to estimate treatment effects by comparing outcomes before and after the treatment. The problem with this comparison is that time may have caused other changes to occur. The observed difference is affected by both the treatment and time. The panel structure suggests a solution. If there are some individuals who were unaffected by the treatment then only time affected the difference in outcomes. We can use these non-treated individuals to measure the impact of time and difference it out. The classic paper is Card and Krueger (1994).

If we have enough time periods we can account for heterogeneity in the observed outcomes. We can use fixed effects to account for unobserved differences in outcomes between individuals. The next chapter considers panel data with some of the assumptions relaxed. In particular, it considers the synthetic control approach of Abadie, Diamond, and Hainmueller (2010). Chapter 11 accounts for heterogeneity with a mixture model. The chapter revisits Card and Krueger (1994).

References

Abadie, Alberto, Alexis Diamond, and Jens Hainmueller. 2010. Synthetic control methods for comparative case studies: estimating the effect of California’s tobacco control program. Journal of the American Statistical Association 105 (490): 493–505.
Card, David, and Alan B. Krueger. 1994. Minimum wages and employment: a case study of the fast-food industry in New Jersey and Pennsylvania. The American Economic Review 84 (4): 772–93.
Currie, Janet, and Bruce C. Fallick. 1996. The minimum wage and the employment of youth: evidence from the NLSY. Journal of Human Resources 31 (2): 404–24.

Footnotes

  1. You need to transform both \(x\) and \(y\) from matrices to vectors in order to run the regression.

  2. Card and Krueger (1994) present results on a measure called full-time equivalence. It is unclear how that measure is calculated.

  3. Chapter 11 uses a mixture model to account for this heterogeneity.

  4. See Appendix A for a discussion of this property.

  5. It can be consistently estimated as the number of time periods gets large, but microeconometric panels tend not to have a large number of time periods.

  6. The average \(\bar{\epsilon}_i\) will tend to zero. See Chapter 1 and Appendix A.

  7. An alternative approach to dealing with the nuisance parameter is to assume that it has a particular distribution. This approach is generally called “random effects.”

  8. The data is available here: https://sites.google.com/view/microeconometricswithr/table-of-contents.