The idea here is to show that, even if there is no trend in the effectiveness of teachers, fixed effects models will be biased if there is a relationship between effectiveness and leaving the profession.
The basic case: no change over time, and no attrition
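This is the basic situation: 1000 teachers, who all join the profession at the same time. Each starts with a random initial effectiveness (uniform between -1 and 1) that changes every year by a teacher-specific random slope (uniform between -0.1 and 0.1), so the average change across teachers is zero. There is no attrition from the sample over 5 years.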
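# One row per teacher: a random yearly slope and a random starting effectiveness
data <- data.frame(slope = runif(1000, min = -0.1, max = 0.1),
                   year1 = runif(1000, min = -1, max = 1))
# Each later year adds the teacher's own slope to the previous year
data$year2 <- data$year1 + data$slope
data$year3 <- data$year2 + data$slope
data$year4 <- data$year3 + data$slope
data$year5 <- data$year4 + data$slope
head(data)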
Plotting these lines, we can see how they look (first 20 teachers).
# Empty plot frame, then one trajectory per teacher
plot(NA, xlim = c(1, 5), ylim = c(-1.5, 1.5), xlab = "Year", ylab = "Effectiveness")
for (i in 1:20) {
  lines(x = 1:5, y = unlist(data[i, 2:6]))
}
So now we can look at the association between years of experience and effectiveness using a fixed effects model.
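# Stack the five year columns into one long vector
y <- do.call(c, data[, 2:6])
t <- rep(1:5, each = 1000)   # year of experience
x <- rep(1:1000, times = 5)  # teacher ID (entered as a numeric variable here)
summary(lm(y ~ t + x))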
Call:
lm(formula = y ~ t + x)

Residuals:
     Min       1Q   Median       3Q      Max
-1.33236 -0.53618 -0.00945  0.53604  1.39595

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.223e-04  2.492e-02   0.021    0.983
t           -1.615e-03  6.041e-03  -0.267    0.789
x           -3.424e-05  2.959e-05  -1.157    0.247

Residual standard error: 0.6041 on 4997 degrees of freedom
Multiple R-squared:  0.0002821,  Adjusted R-squared:  -0.0001181
F-statistic: 0.7049 on 2 and 4997 DF,  p-value: 0.4942
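Note that in the call above the teacher ID x enters as a single numeric covariate. A fixed effects specification in the usual sense gives each teacher their own intercept, for example by wrapping the ID in factor(). The following is a minimal sketch under that assumption, not the analysis reported above: the names sim, long, fit_fe, and beta_t, the seed, and the smaller sample of 200 teachers are all illustrative choices.

```r
set.seed(42)  # hypothetical seed, only so this sketch is reproducible
n <- 200      # assumption: fewer teachers than above so factor(id) stays cheap

# Same data-generating process as above: random start, random linear slope
sim <- data.frame(slope = runif(n, min = -0.1, max = 0.1),
                  year1 = runif(n, min = -1, max = 1))
for (k in 2:5) {
  sim[[paste0("year", k)]] <- sim[[paste0("year", k - 1)]] + sim$slope
}

# Long format: one row per teacher-year
long <- data.frame(id = rep(seq_len(n), times = 5),
                   t  = rep(1:5, each = n),
                   y  = unlist(sim[, paste0("year", 1:5)], use.names = FALSE))

# One intercept per teacher via factor(id); t carries the experience trend
fit_fe <- lm(y ~ t + factor(id), data = long)
beta_t <- unname(coef(fit_fe)["t"])

# With a balanced panel and exactly linear trajectories, the within-teacher
# trend estimate equals the average of the individual slopes
beta_t
mean(sim$slope)
```

In this balanced, attrition-free setting the two printed numbers should agree up to floating-point error, which is a useful baseline before any teachers are removed.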
OK, so far so good: the coefficient on t is close to zero, as expected. Now we can add some attrition to the data, at first purely at random. Once someone leaves this cohort, they leave for good.
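A minimal sketch of what such random attrition could look like is given below. The 10% yearly leaving probability, the seed, and the names sim, p_leave, and gone are assumptions for illustration and are not taken from the results reported here. A teacher who has left is set to NA in that year and in every later year.

```r
set.seed(1)     # hypothetical seed
n <- 1000
p_leave <- 0.1  # assumed yearly probability of leaving, unrelated to effectiveness

# Same data-generating process as in the basic case
sim <- data.frame(slope = runif(n, min = -0.1, max = 0.1),
                  year1 = runif(n, min = -1, max = 1))
for (k in 2:5) {
  sim[[paste0("year", k)]] <- sim[[paste0("year", k - 1)]] + sim$slope
}

gone <- rep(FALSE, n)  # TRUE once a teacher has left the cohort
for (k in 2:5) {
  # Teachers still present leave before year k with probability p_leave;
  # teachers who are already gone stay gone
  gone <- gone | (runif(n) < p_leave)
  sim[gone, paste0("year", k)] <- NA
}

# Share of the cohort still observed in each year
colMeans(!is.na(sim[, paste0("year", 1:5)]))
```

Carrying the gone flag forward across years is what makes leaving permanent: a later draw can never bring a teacher back.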
Plotting these lines again, we can see how they look (first 20 teachers).
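# Empty plot frame, then one trajectory per teacher (lines stop where a teacher leaves)
plot(NA, xlim = c(1, 5), ylim = c(-1.5, 1.5), xlab = "Year", ylab = "Effectiveness")
for (i in 1:20) {
  lines(x = 1:5, y = unlist(data[i, 2:6]))
}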
So now we can look again at the association between years of experience and effectiveness using a fixed effects model.
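# Re-stack the year columns; t and x are reused from the earlier model
y <- do.call(c, data[, 2:6])
summary(lm(y ~ t + x))  # lm() drops the rows where y is NA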
Call:
lm(formula = y ~ t + x)

Residuals:
     Min       1Q   Median       3Q      Max
-1.31077 -0.52038 -0.02866  0.52850  1.42266

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  4.836e-02  2.763e-02   1.750  0.08014 .
t           -1.085e-02  6.965e-03  -1.558  0.11932
x           -1.044e-04  3.379e-05  -3.091  0.00201 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5928 on 3677 degrees of freedom
  (1320 observations deleted due to missingness)
Multiple R-squared:  0.003203,  Adjusted R-squared:  0.002661
F-statistic: 5.908 on 2 and 3677 DF,  p-value: 0.002743
Again, this is fine: the coefficient on t is still not significantly different from zero, so random attrition has not introduced a spurious trend. Now we can introduce the dependency on effectiveness. For the sake of simplicity, we will say that the probability of leaving is 20% per year for those with below-average effectiveness and 5% per year for those with above-average effectiveness.
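As a rough sketch of this rule, the code below applies the 20% and 5% yearly probabilities just described. How "below average" is measured is an assumption here: a teacher's effectiveness in the previous year is compared with 0, the population mean at entry. The seed and the names sim, full, gone, and p_leave are likewise illustrative.

```r
set.seed(2)  # hypothetical seed
n <- 1000

# Same data-generating process as in the basic case
sim <- data.frame(slope = runif(n, min = -0.1, max = 0.1),
                  year1 = runif(n, min = -1, max = 1))
for (k in 2:5) {
  sim[[paste0("year", k)]] <- sim[[paste0("year", k - 1)]] + sim$slope
}
full <- sim  # complete trajectories, kept to decide each year's leaving risk

gone <- rep(FALSE, n)  # TRUE once a teacher has left the cohort
for (k in 2:5) {
  prev <- full[[paste0("year", k - 1)]]
  # Assumption: "below average" means previous-year effectiveness below 0
  p_leave <- ifelse(prev < 0, 0.20, 0.05)
  gone <- gone | (runif(n) < p_leave)
  sim[gone, paste0("year", k)] <- NA
}

# Mean observed effectiveness by year among teachers still present
colMeans(sim[, paste0("year", 1:5)], na.rm = TRUE)
```

Because less effective teachers leave faster under this rule, the mean of the observed values should drift upward over the years even though the average teacher does not improve.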