Ran Wang
Cross section
Time series
Panel data

Cross section
\( (y_i,x_i),\ i=1,\dots,N \) (different entities at one time point)
\[ y_i=\beta_0+\beta_1x_i+u_i \]
e.g.
Students' midterm scores
GDP in different states

Time series
\( (y_t,x_t),\ t=1,\dots,T \) (one entity at different time points)
\[ y_t=\beta_0+\beta_1x_t+u_t \]
e.g.
Daily price of one stock
GDP of US

Panel data
\( (y_{it},x_{it}),\ i=1,\dots,N,\ t=1,\dots,T \) (different entities at different time points)
\[ y_{it}=\beta_0+\beta_1x_{it}+u_{it} \]
e.g.
GDP in different states over several years

Pooled Regression
\[ y_{it}=\alpha+\beta x_{it}+u_{it} \]
For each entity:
\[ y_{1t}=\alpha+\beta x_{1t}+u_{1t} \] \[ y_{2t}=\alpha+\beta x_{2t}+u_{2t} \]
For each time point:
\[ y_{i1}=\alpha+\beta x_{i1}+u_{i1} \] \[ y_{i2}=\alpha+\beta x_{i2}+u_{i2} \]
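The pooled regression can be sketched in a few lines of Python. This is only an illustration on synthetic data: the panel sizes, the true values of \( \alpha \) and \( \beta \), and the helper name `ols_intercept_slope` are assumptions made for the example, not part of the notes.

```python
import random

def ols_intercept_slope(x, y):
    """Simple OLS of y on x with an intercept; returns (alpha_hat, beta_hat)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    beta_hat = sxy / sxx
    return y_bar - beta_hat * x_bar, beta_hat

# Hypothetical synthetic panel: the same alpha and beta for every i and t.
random.seed(0)
N, T = 50, 10
TRUE_ALPHA, TRUE_BETA = 1.0, 2.0
panel = []  # rows of (i, t, x_it, y_it)
for i in range(N):
    for t in range(T):
        x_it = random.gauss(0, 1)
        u_it = random.gauss(0, 0.5)
        panel.append((i, t, x_it, TRUE_ALPHA + TRUE_BETA * x_it + u_it))

# Pooling: ignore the i and t labels and stack all N*T observations.
xs = [row[2] for row in panel]
ys = [row[3] for row in panel]
alpha_hat, beta_hat = ols_intercept_slope(xs, ys)
print(alpha_hat, beta_hat)
```

Because the entity and time labels are never used, the estimate is the same as treating the panel as one large cross section.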
Fixed Effect Regression
\[ y_{it}=\alpha+\beta x_{it}+u_{it} \]
The intercept \( \alpha \) is allowed to vary with some factor: the entity, the time point, or both.
1 Entity-Fixed Effect Regression
\[ y_{it}=\alpha_i+\beta x_{it}+u_{it} \]
The intercept \( \alpha_i \) differs across entities but is constant over time.
For each entity:
\[ y_{1t}=\alpha_1+\beta x_{1t}+u_{1t} \] \[ y_{2t}=\alpha_2+\beta x_{2t}+u_{2t} \]
For each time point:
\[ y_{i1}=\alpha_i+\beta x_{i1}+u_{i1} \] \[ y_{i2}=\alpha_i+\beta x_{i2}+u_{i2} \]
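One common way to estimate \( \beta \) in the entity-fixed effect model is to subtract each entity's own time average from its observations, which removes \( \alpha_i \). The sketch below illustrates this on synthetic data in which \( \alpha_i \) is correlated with \( x_{it} \); the data-generating values and the helper name `slope` are assumptions chosen for the example.

```python
import random

random.seed(0)
N, T = 50, 10
TRUE_BETA = 2.0

# Hypothetical data: alpha_i enters both x_it and y_it, so pooling is biased.
rows = []  # (entity, x_it, y_it)
for i in range(N):
    alpha_i = random.gauss(0, 2)
    for t in range(T):
        x_it = alpha_i + random.gauss(0, 1)
        y_it = alpha_i + TRUE_BETA * x_it + random.gauss(0, 0.5)
        rows.append((i, x_it, y_it))

def slope(xs, ys):
    """OLS slope of ys on xs (regression with an intercept)."""
    n = len(xs)
    xb, yb = sum(xs) / n, sum(ys) / n
    sxy = sum((a - xb) * (b - yb) for a, b in zip(xs, ys))
    return sxy / sum((a - xb) ** 2 for a in xs)

# Pooled estimate: ignores alpha_i entirely.
beta_pooled = slope([r[1] for r in rows], [r[2] for r in rows])

# Entity-demeaned estimate: subtract each entity's time average.
x_dm, y_dm = [], []
for i in range(N):
    xi = [r[1] for r in rows if r[0] == i]
    yi = [r[2] for r in rows if r[0] == i]
    x_bar, y_bar = sum(xi) / T, sum(yi) / T
    x_dm += [v - x_bar for v in xi]
    y_dm += [v - y_bar for v in yi]
beta_fe = slope(x_dm, y_dm)

print(beta_pooled, beta_fe)
```

In this setup the pooled slope is pushed away from the true value because it picks up the omitted \( \alpha_i \), while the demeaned regression recovers something close to it.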
2 Time-Fixed Effect Regression
\[ y_{it}=\alpha_t+\beta x_{it}+u_{it} \]
The intercept \( \alpha_t \) differs across time points but is the same for all entities.
For each entity:
\[ y_{1t}=\alpha_t+\beta x_{1t}+u_{1t} \] \[ y_{2t}=\alpha_t+\beta x_{2t}+u_{2t} \]
For each time point:
\[ y_{i1}=\alpha_1+\beta x_{i1}+u_{i1} \] \[ y_{i2}=\alpha_2+\beta x_{i2}+u_{i2} \]
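A short worked step shows why \( \alpha_t \) need not be estimated directly. The bar notation for averages across entities within a period is introduced here for illustration and does not appear elsewhere in the notes. Averaging the model over \( i \) at a fixed \( t \) gives
\[ \bar{y}_t=\alpha_t+\beta\bar{x}_t+\bar{u}_t,\qquad \bar{y}_t=\frac{1}{N}\sum_{i=1}^{N}y_{it} \]
and subtracting this from the original equation removes \( \alpha_t \):
\[ y_{it}-\bar{y}_t=\beta(x_{it}-\bar{x}_t)+(u_{it}-\bar{u}_t) \]
So \( \beta \) can be estimated by OLS on the period-demeaned data.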
3 Entity Time-Fixed Effect Regression
\[ y_{it}=\alpha_i+\alpha_t+\beta x_{it}+u_{it} \]
The intercept has an entity component \( \alpha_i \) and a time component \( \alpha_t \).
For each entity:
\[ y_{1t}=\alpha_1+\alpha_t+\beta x_{1t}+u_{1t} \] \[ y_{2t}=\alpha_2+\alpha_t+\beta x_{2t}+u_{2t} \]
For each time point:
\[ y_{i1}=\alpha_i+\alpha_1+\beta x_{i1}+u_{i1} \] \[ y_{i2}=\alpha_i+\alpha_2+\beta x_{i2}+u_{i2} \]
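With both kinds of effects and a balanced panel (every entity observed in every period), one way to remove \( \alpha_i \) and \( \alpha_t \) together is to subtract the entity average and the period average and add back the overall average. The sketch below applies this to synthetic data; the panel sizes, the true \( \beta \), and the helper names `two_way_demean` and `slope_through_origin` are assumptions for illustration only.

```python
import random

random.seed(1)
N, T = 40, 12
TRUE_BETA = 2.0

# Hypothetical balanced panel with entity effects a_i and time effects a_t,
# both correlated with x_it.
a_i = [random.gauss(0, 2) for _ in range(N)]
a_t = [random.gauss(0, 2) for _ in range(T)]
x = [[a_i[i] + a_t[t] + random.gauss(0, 1) for t in range(T)] for i in range(N)]
y = [[a_i[i] + a_t[t] + TRUE_BETA * x[i][t] + random.gauss(0, 0.5)
      for t in range(T)] for i in range(N)]

def two_way_demean(z):
    """z[i][t] minus entity mean, minus period mean, plus overall mean."""
    n, t_len = len(z), len(z[0])
    row_mean = [sum(row) / t_len for row in z]
    col_mean = [sum(z[i][t] for i in range(n)) / n for t in range(t_len)]
    grand = sum(row_mean) / n
    return [[z[i][t] - row_mean[i] - col_mean[t] + grand
             for t in range(t_len)] for i in range(n)]

def slope_through_origin(xd, yd):
    """OLS slope without intercept; valid because the inputs have mean zero."""
    num = sum(a * b for rx, ry in zip(xd, yd) for a, b in zip(rx, ry))
    den = sum(a * a for rx in xd for a in rx)
    return num / den

beta_twfe = slope_through_origin(two_way_demean(x), two_way_demean(y))
print(beta_twfe)
```

The shortcut relies on the panel being balanced; with missing observations, including entity and period dummy variables in the regression is the safer route.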
Pooled: \[ y_{it}=\alpha+\beta x_{it}+u_{it} \]
Entity-fixed effect: \[ y_{it}=\alpha_i+\beta x_{it}+u_{it} \]
Time-fixed effect: \[ y_{it}=\alpha_t+\beta x_{it}+u_{it} \]
Entity time-fixed effect: \[ y_{it}=\alpha_i+\alpha_t+\beta x_{it}+u_{it} \]
Data: seatbelt.dta
The federal government has encouraged states to institute mandatory seat belt laws to reduce the number of fatalities and serious injuries. In this exercise you will investigate how effective these laws are in increasing seat belt use and reducing fatalities. On the textbook website you will find a data file Seatbelts that contains a panel of data from 50 US states plus the District of Columbia for the years 1983 through 1997.
(a) Estimate the effect of seat belt use on fatalities by regressing \( FatalityRate \) on \( sb\_usage \), \( speed65 \), \( speed70 \), \( ba08 \), \( drinkage21 \), \( \ln(income) \) and \( age \). Does the estimated regression suggest that increased seat belt use reduces fatalities?
(b) Do the results change when you add state fixed effects? Provide an intuitive explanation for why the results changed.
(c) Do the results change when you add time fixed effects plus state fixed effects?
(d) Which regression specification (a), (b) or (c) is most reliable? Explain why.
(e) Using the results in (c), discuss the size of the coefficient on \( sb\_usage \). Is it large? Small? How many lives would be saved if seat belt use increased from 52% to 90%? (The average number of traffic miles per year per state in the sample is 41,447 million.)
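The arithmetic behind part (e) can be sketched as a small function. The function name `lives_saved` and the coefficient value in the usage line are placeholders chosen for illustration, not estimates from the data; substitute the coefficient on sb_usage from your own regression, and express traffic miles in the same units as the denominator of FatalityRate.

```python
def lives_saved(beta_sb, usage_from, usage_to, avg_traffic_miles):
    """Predicted fatalities avoided per year in a state with average traffic.

    beta_sb: estimated coefficient on sb_usage (change in the fatality rate
             per unit change in the usage share); expected to be negative.
    usage_from, usage_to: seat belt usage as shares, e.g. 0.52 and 0.90.
    avg_traffic_miles: average traffic miles per year per state, in the same
             units as the denominator of the fatality rate.
    """
    change_in_rate = beta_sb * (usage_to - usage_from)
    # A negative change in the fatality rate means lives saved.
    return -change_in_rate * avg_traffic_miles

# Placeholder coefficient (hypothetical, NOT an estimate from seatbelt.dta).
example = lives_saved(beta_sb=-0.004, usage_from=0.52, usage_to=0.90,
                      avg_traffic_miles=41447)
print(example)
```

The result is per state per year; multiplying by the number of states in the panel gives a rough national figure under the same assumptions.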