Jonathan Kropko
February 13, 2017
TSCS data contains
Variables may be related to each other over time in a way that is different from how they are related cross-sectionally.
The challenge is to simulate data while controlling BOTH the over-time and cross-sectional slopes.
Within one time point, we want the slope to be \( \beta \); within one case, we want the slope to be \( \gamma \). This gives the system of equations
\[ \begin{cases} y_{it} = \alpha_t + \beta x_{it} + \varepsilon_{it},\\ y_{it} = \alpha_i + \gamma x_{it} + \delta_{it}, \end{cases} \]
which, provided \( \beta \neq \gamma \), solves to
\[ x_{it} = \frac{(\alpha_i - \alpha_t) + (\delta_{it} - \varepsilon_{it})}{\beta - \gamma} \]
and
\[ y_{it} = \frac{(\beta \alpha_i - \gamma \alpha_t) + (\beta \delta_{it} - \gamma \varepsilon_{it})}{\beta - \gamma}. \]
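For readers who want the intermediate step, the elimination can be sketched as follows. Setting the two right-hand sides equal removes \( y_{it} \):
\[ \alpha_t + \beta x_{it} + \varepsilon_{it} = \alpha_i + \gamma x_{it} + \delta_{it} \;\Longrightarrow\; (\beta - \gamma)\, x_{it} = (\alpha_i - \alpha_t) + (\delta_{it} - \varepsilon_{it}). \]
Multiplying the second equation by \( \beta \) and the first by \( \gamma \), then subtracting, removes \( x_{it} \) instead:
\[ (\beta - \gamma)\, y_{it} = (\beta \alpha_i - \gamma \alpha_t) + (\beta \delta_{it} - \gamma \varepsilon_{it}). \]
Dividing each line by \( \beta - \gamma \) gives the expressions for \( x_{it} \) and \( y_{it} \).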
I wrote a function, twsimdata(), to generate \( x_{it} \) and \( y_{it} \) from these equations. The data are saved in twdata.csv.
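To make the construction concrete, here is a minimal sketch of how data could be drawn from the two closed-form expressions. It is not the twsimdata() function itself, whose internals are not shown here. The helper name sim_tw_sketch, the normal distributions, the standard deviations, and the values \( \beta = 1 \), \( \gamma = -1 \) are all assumptions chosen for illustration; the 30 cases and 30 time points match the ranges in the summary of twdata.csv.

```r
# Hypothetical helper (NOT the author's twsimdata()): draw intercepts and
# errors, then plug them into the closed-form solutions for x and y.
sim_tw_sketch <- function(n_case = 30, n_time = 30, beta = 1, gamma = -1,
                          sd_alpha = 1, sd_noise = 0.05) {
  stopifnot(beta != gamma)  # the solution divides by (beta - gamma)
  d <- expand.grid(case = seq_len(n_case), time = seq_len(n_time))
  alpha_i <- rnorm(n_case, sd = sd_alpha)[d$case]  # one intercept per case
  alpha_t <- rnorm(n_time, sd = sd_alpha)[d$time]  # one intercept per time point
  eps   <- rnorm(nrow(d), sd = sd_noise)  # error in the within-time equation
  delta <- rnorm(nrow(d), sd = sd_noise)  # error in the within-case equation
  d$x <- ((alpha_i - alpha_t) + (delta - eps)) / (beta - gamma)
  d$y <- ((beta * alpha_i - gamma * alpha_t) +
          (beta * delta - gamma * eps)) / (beta - gamma)
  # Keep the latent pieces so the two equations can be checked afterwards.
  d$alpha_i <- alpha_i
  d$alpha_t <- alpha_t
  d$eps <- eps
  d$delta <- delta
  d
}

set.seed(1)
sim <- sim_tw_sketch()

# Both equations of the system hold for every row (up to floating point).
eq_time <- all.equal(sim$y, sim$alpha_t + 1 * sim$x + sim$eps)
eq_case <- all.equal(sim$y, sim$alpha_i + (-1) * sim$x + sim$delta)

# Fixed-effects regressions: time dummies target beta, case dummies target gamma.
b_hat <- coef(lm(y ~ x + factor(time), data = sim))["x"]
g_hat <- coef(lm(y ~ x + factor(case), data = sim))["x"]
```

One caveat about this sketch: because \( \varepsilon_{it} \) and \( \delta_{it} \) also enter \( x_{it} \), the fitted slopes b_hat and g_hat are only approximately \( \beta \) and \( \gamma \). The gap shrinks as the noise becomes small relative to the variation in the intercepts, which is why sd_noise is set well below sd_alpha here.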
twdata <- read.csv("twdata.csv")
summary(twdata[,1:4])
      case           time            y                  x
 Min.   : 1.0   Min.   : 1.0   Min.   :-1.66622   Min.   :-0.50828
 1st Qu.: 8.0   1st Qu.: 8.0   1st Qu.:-0.51042   1st Qu.:-0.06943
 Median :15.5   Median :15.5   Median :-0.03604   Median : 0.07913
 Mean   :15.5   Mean   :15.5   Mean   :-0.04012   Mean   : 0.07839
 3rd Qu.:23.0   3rd Qu.:23.0   3rd Qu.: 0.41392   3rd Qu.: 0.23062
 Max.   :30.0   Max.   :30.0   Max.   : 1.85683   Max.   : 0.71479