Econ 107 LAB session: Week 9

Ran Wang

Introduction of Panel Data

Data

  • Cross section

  • Time series

  • Panel data

What is panel data?

1 Cross Section Data

\( (y_i,x_i),i=1,...,N \) (Different entities at one time point)

\[ y_i=\beta_0+\beta_1x_i+u_i \]

e.g.

  • Students' midterm scores

  • GDP in different states

2 Time Series Data

\( (y_t,x_t),t=1,...,T \) (One entity at different time points)

\[ y_t=\beta_0+\beta_1x_t+u_t \]

e.g.

  • Daily price of one stock

  • GDP of the US over time

3 Panel Data

\( (y_{it},x_{it}),i=1,...,N,t=1,...,T \) (Different entities at different time points)

\[ y_{it}=\beta_0+\beta_1x_{it}+u_{it} \]

e.g.

  • Average personal income in each state over the last 10 years (state \( i \) and year \( t \))
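The indexing by entity \( i \) and time \( t \) can be made concrete with a small sketch. It assumes a balanced panel stored in "long" format, one row per \( (i,t) \) pair; the sizes and the random values are made up for illustration.

```python
import numpy as np

# Hypothetical balanced panel: N = 3 entities observed at T = 4 time points.
N, T = 3, 4

# Long format: one row per (entity i, time t) pair, ordered entity by entity.
entity = np.repeat(np.arange(1, N + 1), T)   # 1,1,1,1,2,2,2,2,3,3,3,3
time = np.tile(np.arange(1, T + 1), N)       # 1,2,3,4,1,2,3,4,1,2,3,4

rng = np.random.default_rng(0)
x = rng.normal(size=N * T)                   # made-up values of x_it

# A cross section is the slice at one time point (N observations);
# a time series is the slice for one entity (T observations).
cross_section = x[time == 1]
time_series = x[entity == 1]
```

Read this way, cross-section data and time series data are both special cases of a panel: fix \( t \) to get the former, fix \( i \) to get the latter.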

Panel Data Models (\( i \): entity, \( t \): time point):

Pooled Regression

\[ y_{it}=\alpha+\beta x_{it}+u_{it} \]

For each entity:

\[ y_{1t}=\alpha+\beta x_{1t}+u_{1t} \] \[ y_{2t}=\alpha+\beta x_{2t}+u_{2t} \]

For each time point:

\[ y_{i1}=\alpha+\beta x_{i1}+u_{i1} \] \[ y_{i2}=\alpha+\beta x_{i2}+u_{i2} \]
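Pooled regression amounts to stacking all \( N \times T \) observations and running ordinary least squares with a single intercept and slope. The sketch below illustrates this on simulated data; the sample sizes and the "true" values of \( \alpha \) and \( \beta \) are assumptions chosen for the example, not values from the lab.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 50, 10
alpha, beta = 1.0, 2.0                  # hypothetical true parameters

x = rng.normal(size=(N, T))             # x[i, t]
u = rng.normal(scale=0.1, size=(N, T))  # error term u[i, t]
y = alpha + beta * x + u                # same alpha and beta for every i and t

# Pooled OLS: stack all N*T observations and ignore the panel structure.
X = np.column_stack([np.ones(N * T), x.ravel()])
coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
alpha_hat, beta_hat = coef
```

Because the simulated intercept really is common to all entities and time points, `alpha_hat` and `beta_hat` land close to the values used to generate the data.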

Fixed Effect Regression

\[ y_{it}=\alpha+\beta x_{it}+u_{it} \]

The intercept \( \alpha \) is no longer a single constant: it is allowed to vary with some factor, such as the entity or the time point.

1 Entity-Fixed Effect Regression

\[ y_{it}=\alpha_i+\beta x_{it}+u_{it} \]

The intercept \( \alpha_i \) differs across entities but is constant over time.

For each entity:

\[ y_{1t}=\alpha_1+\beta x_{1t}+u_{1t} \] \[ y_{2t}=\alpha_2+\beta x_{2t}+u_{2t} \]

For each time point:

\[ y_{i1}=\alpha_i+\beta x_{i1}+u_{i1} \] \[ y_{i2}=\alpha_i+\beta x_{i2}+u_{i2} \]
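One common way to estimate \( \beta \) in the entity-fixed effect model is the "within" transformation: subtracting each entity's time average from \( y_{it} \) and \( x_{it} \) removes \( \alpha_i \), because \( \alpha_i \) does not change over time. The sketch below uses simulated data in which \( \alpha_i \) is correlated with \( x_{it} \); all parameter values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 200, 5
beta = -1.0                                   # hypothetical true slope

alpha_i = rng.normal(size=(N, 1))             # one intercept per entity
x = 2.0 * alpha_i + rng.normal(size=(N, T))   # x is correlated with alpha_i
y = alpha_i + beta * x + rng.normal(scale=0.1, size=(N, T))

def ols_slope(xv, yv):
    """OLS slope from regressing yv on xv with a common intercept."""
    xc = xv - xv.mean()
    return float(xc @ (yv - yv.mean()) / (xc @ xc))

# Pooled OLS ignores alpha_i, so the omitted entity effect biases the slope.
beta_pooled = ols_slope(x.ravel(), y.ravel())

# Within transformation: demean each entity's row over time, then run OLS.
x_within = x - x.mean(axis=1, keepdims=True)
y_within = y - y.mean(axis=1, keepdims=True)
beta_fe = float((x_within * y_within).sum() / (x_within ** 2).sum())
```

In this simulation `beta_fe` is close to the true slope, while `beta_pooled` is pulled away from it because the entity intercepts act as an omitted variable.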

2 Time-Fixed Effect Regression

\[ y_{it}=\alpha_t+\beta x_{it}+u_{it} \]

The intercept \( \alpha_t \) differs across time points but is the same for every entity.

For each entity:

\[ y_{1t}=\alpha_t+\beta x_{1t}+u_{1t} \] \[ y_{2t}=\alpha_t+\beta x_{2t}+u_{2t} \]

For each time point:

\[ y_{i1}=\alpha_1+\beta x_{i1}+u_{i1} \] \[ y_{i2}=\alpha_2+\beta x_{i2}+u_{i2} \]
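The time-fixed effect model can be estimated either by including one indicator (dummy) variable per time point or by subtracting each time point's cross-entity average from \( y_{it} \) and \( x_{it} \). The sketch below, on simulated data with assumed parameter values, computes the slope both ways to illustrate that they agree.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 100, 20
beta = 0.5                                     # hypothetical true slope

alpha_t = rng.normal(size=(1, T))              # one intercept per time point
x = alpha_t + rng.normal(size=(N, T))          # x[i, t], correlated with alpha_t
y = alpha_t + beta * x + rng.normal(scale=0.1, size=(N, T))

# (1) Dummy-variable form: regress y on x and one indicator per time point.
# ravel() is entity-major, so stacked row i*T + t belongs to time point t.
time_dummies = np.tile(np.eye(T), (N, 1))      # shape (N*T, T)
X = np.column_stack([x.ravel(), time_dummies])
coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
beta_lsdv = coef[0]
alpha_t_hat = coef[1:]                         # estimated time intercepts

# (2) Demeaned form: subtract each time point's average across entities.
x_d = x - x.mean(axis=0, keepdims=True)
y_d = y - y.mean(axis=0, keepdims=True)
beta_demeaned = (x_d * y_d).sum() / (x_d ** 2).sum()
```

The dummy-variable form also returns an estimate of each \( \alpha_t \), whereas the demeaned form only returns the slope.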

3 Entity Time-Fixed Effect Regression

\[ y_{it}=\alpha_i+\alpha_t+\beta x_{it}+u_{it} \]

The intercept has two parts: \( \alpha_i \) differs across entities and \( \alpha_t \) differs across time points.

For each entity:

\[ y_{1t}=\alpha_1+\alpha_t+\beta x_{1t}+u_{1t} \] \[ y_{2t}=\alpha_2+\alpha_t+\beta x_{2t}+u_{2t} \]

For each time point:

\[ y_{i1}=\alpha_i+\alpha_1+\beta x_{i1}+u_{i1} \] \[ y_{i2}=\alpha_i+\alpha_2+\beta x_{i2}+u_{i2} \]
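With both entity and time effects, a balanced panel (every entity observed at every time point) allows a two-way demeaning: subtract the entity average and the time average, then add back the overall average. The sketch below assumes a balanced panel and made-up parameter values; for unbalanced panels this simple formula does not apply and dummy variables are the safer route.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 100, 10
beta = 1.5                                    # hypothetical true slope

alpha_i = rng.normal(size=(N, 1))             # entity effects
alpha_t = rng.normal(size=(1, T))             # time effects
x = alpha_i + alpha_t + rng.normal(size=(N, T))
y = alpha_i + alpha_t + beta * x + rng.normal(scale=0.1, size=(N, T))

def two_way_demean(z):
    """Remove entity and time means from a balanced (N, T) array."""
    return (z
            - z.mean(axis=1, keepdims=True)   # entity averages over time
            - z.mean(axis=0, keepdims=True)   # time averages over entities
            + z.mean())                       # overall average, added back once

x_dd = two_way_demean(x)
y_dd = two_way_demean(y)
beta_twfe = float((x_dd * y_dd).sum() / (x_dd ** 2).sum())
```

After the transformation both \( \alpha_i \) and \( \alpha_t \) drop out, so a regression of the demeaned \( y \) on the demeaned \( x \) without an intercept gives the slope.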

Summary

  • Pooled Regression Model:

\[ y_{it}=\alpha+\beta x_{it}+u_{it} \]

  • Entity-Fixed Effect Model:

\[ y_{it}=\alpha_i+\beta x_{it}+u_{it} \]

  • Time-Fixed Effect Model:

\[ y_{it}=\alpha_t+\beta x_{it}+u_{it} \]

  • Entity Time-Fixed Effect Model:

\[ y_{it}=\alpha_i+\alpha_t+\beta x_{it}+u_{it} \]

Question Part

Data: seatbelt.dta

The federal government has encouraged states to institute mandatory seat belt laws to reduce the number of fatalities and serious injuries. In this exercise you will investigate how effective these laws are in increasing seat belt use and reducing fatalities. On the textbook website you will find a data file Seatbelts that contains a panel of data from 50 US states plus the District of Columbia for the years 1983 through 1997.

  • (a) Estimate the effect of seat belt use on fatalities by regressing \( FatalityRate \) on \( sb\_usage \), \( speed65 \), \( speed70 \), \( ba08 \), \( drinkage21 \), \( ln(income) \) and \( age \). Does the estimated regression suggest that increased seat belt use reduces fatalities?

  • (b) Do the results change when you add state fixed effects? Provide an intuitive explanation for why the results changed.

  • (c) Do the results change when you add time fixed effects plus state fixed effects?

  • (d) Which regression specification, (a), (b) or (c), is most reliable? Explain why.

  • (e) Using the results in (c), discuss the size of the coefficient on sb_usage. Is it large? Small? How many lives would be saved if seat belt use increased from 52% to 90%? (The average number of traffic miles per year per state in the sample is 41,447 million.)
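The arithmetic behind the lives-saved question can be sketched as follows. The sketch assumes that \( FatalityRate \) is measured in fatalities per million traffic miles and that sb_usage is a fraction between 0 and 1; both are assumptions to check against the data description. The coefficient value is a placeholder, not an estimate: substitute your own coefficient from the regression with state and time fixed effects.

```python
def fatalities_change(coef_sb_usage, usage_from, usage_to, traffic_miles_millions):
    """Predicted change in annual fatalities for one state.

    Assumes the fatality rate is fatalities per million traffic miles,
    so (change in rate) x (million traffic miles) = change in fatalities.
    """
    delta_rate = coef_sb_usage * (usage_to - usage_from)
    return delta_rate * traffic_miles_millions

# HYPOTHETICAL placeholder coefficient, for illustration only.
placeholder_coef = -0.004

change = fatalities_change(placeholder_coef, 0.52, 0.90, 41_447)
lives_saved = -change   # a negative change in fatalities means lives saved
```

The result is per year for a state with the sample-average traffic miles, and it scales linearly with whatever coefficient is plugged in.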

  • (f) There are two ways that mandatory seat belt laws are enforced: “Primary” enforcement means that a police officer can stop a car and ticket the driver if the officer observes an occupant not wearing a seat belt; “secondary” enforcement means that a police officer can write a ticket if an occupant is not wearing a seat belt, but must have another reason to stop the car. In the data set, primary is a binary variable for primary enforcement and secondary is a binary variable for secondary enforcement. Run a regression of \( sb\_usage \) on \( primary \), \( secondary \), \( speed65 \), \( speed70 \), \( ba08 \), \( drinkage21 \), \( ln(income) \), and \( age \), including state and time fixed effects in the regression. Does primary enforcement lead to more seat belt use? What about secondary enforcement?

  • (g) In 2000, New Jersey changed from secondary enforcement to primary enforcement. Estimate the number of lives saved per year by making this change. (New Jersey had 63,000 million traffic miles in 1997.)
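One way to set up this calculation is to chain the two regressions. The symbols here are introduced for illustration and are not from the exercise: \( \hat{\gamma}_{primary} \) and \( \hat{\gamma}_{secondary} \) denote the estimated coefficients on \( primary \) and \( secondary \) in the seat belt usage regression, and \( \hat{\beta}_{sb} \) denotes the estimated coefficient on sb_usage in the fatality rate regression with state and time fixed effects. Assuming \( FatalityRate \) is measured per million traffic miles:

\[ \Delta sb\_usage=\hat{\gamma}_{primary}-\hat{\gamma}_{secondary} \]

\[ \Delta FatalityRate=\hat{\beta}_{sb}\times\Delta sb\_usage \]

\[ \text{Lives saved per year}\approx-\Delta FatalityRate\times 63{,}000 \]

The minus sign converts a predicted fall in fatalities into a positive number of lives saved.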