library( dplyr )
library( scales )
library( stargazer )
library( pander )
# STARGAZER OUTPUT
#
# Use:
#
# s.type="text"
#
# while running chunks interactively
# to see table output.
# This sets it to "html"
# when knitting the file.
s.type="html"
URL <- "https://github.com/DS4PS/cpp-524-sum-2021/blob/main/labs/data/counterfactuals.csv?raw=true"
d <- read.csv( URL )
summary(d)
## ability group time
## Min. :-4.9520 control :360 time0:165
## 1st Qu.:-0.9458 high.ses :112 time1:165
## Median : 0.4322 treatment:188 time2:165
## Mean : 0.6381 time3:165
## 3rd Qu.: 2.2840
## Max. : 6.5428
## [1] TRUE
dm <- filter( d,
group %in% c("treatment") &
time %in% c("time1","time2") )
dm$post.dummy <- ifelse( dm$time=="time2", 1, 0 )
m <- lm( ability ~ post.dummy, data=dm )
stargazer( m, type=s.type,
omit.stat=c("f","ser","adj.rsq"),
intercept.top=TRUE, intercept.bottom=FALSE,
digits=2 )

| | Dependent variable: |
| | ability |
| Constant | -0.72*** |
| | (0.15) |
| post.dummy | 2.91*** |
| | (0.21) |
| Observations | 94 |
| R2 | 0.67 |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
What is the average score for kids in the treatment group in period 1 of the study? What is the average score for the same kids in period 2?
Note, you can check your answers against the group means in Table 2 above.
ANSWER:
## [1] -0.7206225
## [1] 2.189249
Explain what it means when b0 is statistically significant in the reflexive model. Explain what it means when b1 is statistically significant.
ANSWER:
b0: b0 is the average score when post.dummy is zero, i.e. the treatment group's mean at time 1 (-0.72). If b0 is significant, the model supports the claim that this pre-treatment mean IS NOT zero.
b1: This is the coefficient on the post-test dummy, the change in the treatment group's mean from time 1 to time 2 (T2-T1). If b1 is significant, the model supports the claim that ability changed between the two periods. In the reflexive design this change is read as a positive treatment effect, but only under the assumption that ability would not have changed without treatment.
What is the effect size according to this model?
ANSWER:
The treatment has an effect size of 2.91 (T2-T1 = 2.19 - (-0.72)). This suggests that kids who participate in treatment gain, on average, 2.91 points of ability between time 1 and time 2.
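To make the reading of b0 and b1 concrete, here is a minimal sketch on made-up scores (the `toy` data frame and its values are hypothetical, not the lab data). It checks that, in a regression on a single dummy, the intercept is the mean of the pre group and the slope is the post mean minus the pre mean.

```r
# Toy pre/post scores (hypothetical values, not the lab data)
toy <- data.frame(
  ability    = c(-1.0, -0.5, -0.7, 2.0, 2.4, 2.2),
  post.dummy = c(0, 0, 0, 1, 1, 1)
)

m.toy <- lm( ability ~ post.dummy, data=toy )

# Group means computed directly
pre.mean  <- mean( toy$ability[ toy$post.dummy == 0 ] )
post.mean <- mean( toy$ability[ toy$post.dummy == 1 ] )

b0.toy <- unname( coef(m.toy)[1] )   # intercept: mean when post.dummy = 0
b1.toy <- unname( coef(m.toy)[2] )   # slope: post mean minus pre mean

# Both comparisons should agree up to floating point error
all.equal( b0.toy, pre.mean )
all.equal( b1.toy, post.mean - pre.mean )
```

The same logic is why the coefficients in the table can be checked against the group means.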
Test for zero trend assumption: C1=C2
dm <- filter( d,
group %in% c("control") &
time %in% c("time1","time2") )
dm$post.dummy <- ifelse( dm$time=="time2", 1, 0 )
m <- lm( ability ~ post.dummy, data=dm )
stargazer( m, type=s.type,
omit.stat=c("f","ser","adj.rsq"),
intercept.top=TRUE, intercept.bottom=FALSE,
digits=2 )

| | Dependent variable: |
| | ability |
| Constant | -0.65*** |
| | (0.12) |
| post.dummy | 1.28*** |
| | (0.17) |
| Observations | 180 |
| R2 | 0.24 |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
What is the average score for kids in the control group in period 1 of the study? What is the average score for the same kids in period 2?
ANSWER:
## [1] -0.6467743
## [1] 0.6358013
Which coefficient represents the test for whether we observe a secular trend in student achievement gains independent of the treatment? What is the decision rule?
ANSWER:
b1 (post.dummy) represents the test for whether we observe a secular trend. It equals C2-C1 = 0.64 - (-0.65) = 1.28. Decision rule: if b1 is significantly different from zero, we reject the zero trend assumption (C1=C2). Here b1 is significant (p<0.01), so there is an observable secular trend in ability.
What does this model tell us about the appropriateness of the reflexive model?
ANSWER:
Due to the presence of secular trends and the absence of a control group, this reflexive model IS NOT an appropriate representation of the results of this study.
dm <- filter( d,
group %in% c("treatment","control") &
time=="time2" )
dm$treat.dummy <- ifelse( dm$group=="treatment", 1, 0 )
m <- lm( ability ~ treat.dummy, data=dm )
stargazer( m, type=s.type,
omit.stat=c("f","ser","adj.rsq"),
intercept.top=TRUE, intercept.bottom=FALSE,
digits=2 )

| | Dependent variable: |
| | ability |
| Constant | 0.64*** |
| | (0.12) |
| treat.dummy | 1.55*** |
| | (0.20) |
| Observations | 137 |
| R2 | 0.31 |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
What is the average score for kids in the control group in the study? What is the average score for the kids in the treatment group?
Note, you can check your answers against the group means in Table 2 above.
ANSWER:
## [1] 0.6358013
## [1] 2.189249
What is the effect size identified by the model?
ANSWER:
The effect size identified by the model is 1.55.
What is the identifying assumption of this model? Or stated differently, what must be true in order for the post-test only estimator to be appropriate?
ANSWER:
In order for the post-test only model to be valid, the assumption is that the treatment and control group do not differ in ability at period one.
According to the model below is the assumption met? How can you tell?
ANSWER:
dm <- filter( d,
group %in% c("treatment","control") &
time=="time1" )
dm$treat.dummy <- ifelse( dm$group=="treatment", 1, 0 )
m <- lm( ability ~ treat.dummy, data=dm )
stargazer( m, type=s.type,
omit.stat=c("f","ser","adj.rsq"),
intercept.top=TRUE, intercept.bottom=FALSE,
digits=2 )

| | Dependent variable: |
| | ability |
| Constant | -0.65*** |
| | (0.12) |
| treat.dummy | -0.07 |
| | (0.20) |
| Observations | 137 |
| R2 | 0.001 |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
According to the model, the assumption is met. b1 (treat.dummy), which measures the difference T1-C1, is -0.07 with a standard error of 0.20 and is NOT statistically significant. We cannot reject the hypothesis that the two groups have the same mean at time period one.
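The significance call can be checked by hand from the table. The sketch below is an approximation: it uses the rounded estimate and standard error as printed, and `t_from_table` is a hypothetical helper written for this illustration, not a function from stargazer or base R.

```r
# Approximate t-test from a reported coefficient and standard error.
# t_from_table is a hypothetical helper for illustration only.
t_from_table <- function( estimate, std.error, n, k ) {
  df      <- n - k                            # residual degrees of freedom
  t.stat  <- estimate / std.error
  p.value <- 2 * pt( -abs(t.stat), df=df )    # two-sided p-value
  list( t=t.stat, df=df, p=p.value )
}

# treat.dummy at time1: estimate -0.07, SE 0.20, 137 observations, 2 coefficients
pre.diff <- t_from_table( estimate=-0.07, std.error=0.20, n=137, k=2 )

pre.diff$t                                          # the t-statistic
abs( pre.diff$t ) < qt( 0.975, df=pre.diff$df )     # TRUE means "fail to reject T1 = C1"
```

Because the inputs are rounded to two digits, the result only approximates what `summary(m)` would report for the fitted model.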
Total Gains: T2-T1
Trend: C2-C1
DID Estimator: [ gains - trend ] = [ (T2-T1) - (C2-C1) ]
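As a worked example of the estimator, the four group means printed earlier in this lab can be plugged straight into the formula. This is a sketch of the arithmetic only; the variable names are chosen here for illustration.

```r
# Group means as printed earlier in this lab
T1 <- -0.7206225   # treatment, time1
T2 <-  2.189249    # treatment, time2
C1 <- -0.6467743   # control, time1
C2 <-  0.6358013   # control, time2

gains <- T2 - T1          # total gains in the treatment group
trend <- C2 - C1          # secular trend measured by the control group
did   <- gains - trend    # difference-in-differences estimate

round( c( gains=gains, trend=trend, did=did ), 2 )
```

The gains and trend values should match the post.dummy coefficients from the treatment-only and control-only models above.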
dm <- filter( d, group %in% c("treatment","control") &
time %in% c("time1","time2") )
dm$treat.dummy <- ifelse( dm$group=="treatment", 1, 0 )
dm$post.dummy <- ifelse( dm$time=="time2", 1, 0 )
dm$treat.post.dummy <- dm$treat.dummy * dm$post.dummy
summary(dm)
## ability group time treat.dummy post.dummy
## Min. :-4.4451 high.ses : 0 time0: 0 Min. :0.0000 Min. :0.0
## 1st Qu.:-0.6869 treatment: 94 time1:137 1st Qu.:0.0000 1st Qu.:0.0
## Median : 0.1033 control :180 time2:137 Median :0.0000 Median :0.5
## Mean : 0.2483 time3: 0 Mean :0.3431 Mean :0.5
## 3rd Qu.: 1.1537 3rd Qu.:1.0000 3rd Qu.:1.0
## Max. : 4.8738 Max. :1.0000 Max. :1.0
## treat.post.dummy
## Min. :0.0000
## 1st Qu.:0.0000
## Median :0.0000
## Mean :0.1715
## 3rd Qu.:0.0000
## Max. :1.0000
m <- lm( ability ~ treat.dummy + post.dummy + treat.post.dummy,
data=dm)
stargazer( m, type=s.type,
omit.stat=c("f","ser","adj.rsq"),
intercept.top=TRUE, intercept.bottom=FALSE,
digits=2 )

| | Dependent variable: |
| | ability |
| Constant | -0.65*** |
| | (0.12) |
| treat.dummy | -0.07 |
| | (0.20) |
| post.dummy | 1.28*** |
| | (0.16) |
| treat.post.dummy | 1.63*** |
| | (0.28) |
| Observations | 274 |
| R2 | 0.48 |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
b0 <- m$coefficients[1] %>% as.numeric() %>% round(2)
b1 <- m$coefficients[2] %>% as.numeric() %>% round(2)
b2 <- m$coefficients[3] %>% as.numeric() %>% round(2)
b3 <- m$coefficients[4] %>% as.numeric() %>% round(2)
b0 # C1
## [1] -0.65
## [1] -0.72
## [1] 0.63
## [1] 2.19
## [1] 0.56
## [1] 1.63
Model: y = b0 + b1(x1) + b2(x2) + b3(x1)(x2)
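The values printed above appear to be the group means implied by the model, although only the first line of the code that produced them survives in this section. The following is a sketch of one way to reproduce them, assuming the rounded coefficients from the table; the names C1, T1, C2, T2 and CF are labels chosen for this illustration.

```r
# Rounded coefficients from the diff-in-diff table above
b0 <- -0.65   # Constant
b1 <- -0.07   # treat.dummy
b2 <-  1.28   # post.dummy
b3 <-  1.63   # treat.post.dummy

C1 <- b0                    # control, time1
T1 <- b0 + b1               # treatment, time1
C2 <- b0 + b2               # control, time2
T2 <- b0 + b1 + b2 + b3     # treatment, time2
CF <- b0 + b1 + b2          # counterfactual: treatment group at time2 without treatment

round( c( C1=C1, T1=T1, C2=C2, T2=T2, CF=CF, effect=T2-CF ), 2 )
```

Each group mean is the model evaluated at one combination of the two dummies, and the effect is the gap between T2 and the counterfactual.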
Are the treatment and control groups equivalent prior to the intervention? How do you know?
ANSWER:
b1 (treat.dummy) measures the difference between the groups when post.dummy is zero, i.e. T1-C1 = -0.07. According to the table displaying coefficients for the diff-in-diff model, this is not statistically significant. Therefore, we find no evidence of a difference in measured ability between the treatment and control groups prior to the intervention; we treat them as equivalent.
Do we observe secular trends (gains independent of the treatment)? How do you know?
ANSWER:
b2 (post.dummy) measures the change over time when treat.dummy is zero, i.e. C2-C1 = 1.28. According to the table, this IS statistically significant. Therefore, we can conclude that secular trends are evident.
What is the effect size in this model (gains from the treatment)?
ANSWER:
The effect size in this model is calculated by the equation: e = (T2-T1) - (C2-C1). This is also found by T2-CF, where CF is the counterfactual (the treatment group's expected score at time 2 without treatment).
## [1] 1.63
## [1] 1.63
What does statistical significance of b3 represent? In other words, which contrast is being tested?
ANSWER:
b3 represents the difference in difference. It takes the gains of the treatment group over time (T2-T1) and subtracts the gains of the control group over the same period (C2-C1), which is why it is called a difference in difference. Subtracting the control group's gains accounts for secular trends. The contrast being tested is whether (T2-T1) - (C2-C1) is zero, or equivalently whether T2 differs from the counterfactual. Since b3 is statistically significant, the model supports the claim that the treatment IS successful.
Do the reflexive and diff-in-diff models generate the same results (approximately)? Why?
ANSWER:
No. The reflexive model showed a much larger effect (2.91 versus 1.63). This is because it did not use a control group, so the secular trend (1.28) is counted as part of the treatment effect. In both cases the effect was significant, but the size of the effect was much different.
Do the post-test only and diff-in-diff models generate the same results (approximately)? Why?
ANSWER:
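Approximately, yes. The post-test only model showed a slightly smaller effect (1.55 versus 1.63). The post-test only estimate is T2-C2, which differs from the diff-in-diff estimate only by the pre-treatment gap T1-C1 (-0.07). Because the groups were nearly equivalent at time 1, the two estimates are close. In both cases the effect was significant.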
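The relationship between the three estimators can be sketched from the diff-in-diff coefficients alone. This uses the rounded values from the treatment versus control table, so small rounding differences from the separately fitted models are expected; the variable names are chosen for this illustration.

```r
# Rounded diff-in-diff coefficients (treatment vs. control, time1 vs. time2)
b1 <- -0.07   # treat.dummy:      T1 - C1
b2 <-  1.28   # post.dummy:       C2 - C1
b3 <-  1.63   # treat.post.dummy: (T2 - T1) - (C2 - C1)

reflexive.est <- b2 + b3   # T2 - T1: treatment effect plus secular trend
posttest.est  <- b1 + b3   # T2 - C2: treatment effect plus pre-existing gap

round( c( reflexive=reflexive.est, posttest=posttest.est, did=b3 ), 2 )
```

The reflexive estimate overshoots by the size of the trend, while the post-test only estimate is off only by the pre-treatment gap between the groups.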
dm <- filter( d, group %in% c("treatment","high.ses") &
time %in% c("time1","time2") )
dm$treat.dummy <- ifelse( dm$group=="treatment", 1, 0 )
dm$pre.dummy <- ifelse( dm$time=="time1", 1, 0 )
dm$post.dummy <- ifelse( dm$time=="time2", 1, 0 )
dm$treat.post.dummy <- dm$treat.dummy * dm$post.dummy
m <- lm( ability ~ treat.dummy + post.dummy + treat.post.dummy,
data=dm)
stargazer( m, type=s.type,
omit.stat=c("f","ser","adj.rsq"),
intercept.top=TRUE, intercept.bottom=FALSE,
digits=2 )

| | Dependent variable: |
| | ability |
| Constant | 1.44*** |
| | (0.19) |
| treat.dummy | -2.16*** |
| | (0.24) |
| post.dummy | 1.71*** |
| | (0.27) |
| treat.post.dummy | 1.20*** |
| | (0.34) |
| Observations | 150 |
| R2 | 0.68 |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
Are the pre-treatment differences (C1=T1?) different in this model versus the previous diff-in-diff? Why or why not?
ANSWER:
Yes. In this model b1 is -2.16 with p<0.01, versus -0.07 (not significant) in the previous diff-in-diff. This indicates a statistically significant difference in pre-treatment ability between the treatment group and the high SES group.
Does the diff-in-diff model require that study groups are equivalent prior to treatment to generate valid results?
ANSWER:
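No. Any pre-treatment differences in levels are captured by b1 and subtracted out. What the model requires is that the two groups would have followed the same trend without treatment.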
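A small simulation can illustrate why level differences drop out. The sketch below uses a made-up data set (`toy`, with hypothetical cell means) and shows that shifting every treatment-group score by a constant leaves the interaction coefficient unchanged.

```r
# Hypothetical cell means: the treatment group starts 2 points lower
cells <- expand.grid( treat.dummy=c(0,1), post.dummy=c(0,1) )
cells$mean <- with( cells,
  1.0 - 2.0*treat.dummy + 1.5*post.dummy + 1.2*treat.dummy*post.dummy )

# Three observations per cell, spread symmetrically around the cell mean
toy <- cells[ rep( 1:4, each=3 ), ]
toy$ability <- toy$mean + rep( c(-0.5, 0, 0.5), times=4 )

m.original   <- lm( ability ~ treat.dummy * post.dummy, data=toy )
did.original <- unname( coef(m.original)[ "treat.dummy:post.dummy" ] )

# Shift every treatment-group score by a constant and refit
toy$shifted <- toy$ability + 5 * toy$treat.dummy
m.shifted   <- lm( shifted ~ treat.dummy * post.dummy, data=toy )
did.shifted <- unname( coef(m.shifted)[ "treat.dummy:post.dummy" ] )

c( original=did.original, shifted=did.shifted )
```

Only the treat.dummy coefficient absorbs the shift; the interaction term depends on differences in gains, not on starting levels.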
Is the secular trend identified by this model different from the previous diff-in-diff (approximately)? Why or why not?
ANSWER:
The estimated trends are 1.28 and 1.71. Both are significant, so the presence of secular trends is validated in both models, although the estimates are slightly different. I am confident that secular trends are present, and both models account for them by subtracting them out of the treatment effect.
The treatment effects from this model are approximately the same as the previous diff-in-diff model, even though they use very different comparison groups. Why does this model still work using the high SES group?
ANSWER:
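Again, both models measure pre-treatment differences in ability and changes over time separately from the treatment effect. The diff-in-diff only needs the comparison group to follow the same trend as the treatment group, not to start at the same level. Because the secular trends are similar (1.28 and 1.71), the high SES group still provides a valid counterfactual, and the treatment shows an effect in both models (1.63 and 1.20).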
What is the identification assumption of the diff-in-diff model?
ANSWER:
An assumption of the diff-in-diff is that the comparison group measures expected changes independent of treatment. This means we would expect to see the same trend in the treatment group IF they did not undergo treatment (parallel trends).
Does the high SES group model secular trend appropriately?
Test: are the study group trend lines parallel prior to the intervention?
dm <- filter( d, group %in% c("treatment","high.ses") &
time %in% c("time0","time1") )
dm$treat.dummy <- ifelse( dm$group=="treatment", 1, 0 )
dm$pre.dummy <- ifelse( dm$time=="time0", 1, 0 )
dm$post.dummy <- ifelse( dm$time=="time1", 1, 0 )
dm$treat.post <- dm$treat.dummy * dm$post.dummy
m <- lm( ability ~ treat.dummy + post.dummy + treat.post,
data=dm)
stargazer( m, type=s.type,
omit.stat=c("f","ser","adj.rsq"),
intercept.top=TRUE, intercept.bottom=FALSE,
digits=2 )

| | Dependent variable: |
| | ability |
| Constant | -0.02 |
| | (0.21) |
| treat.dummy | -1.73*** |
| | (0.27) |
| post.dummy | 1.46*** |
| | (0.30) |
| treat.post | -0.43 |
| | (0.38) |
| Observations | 150 |
| R2 | 0.51 |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
Does the control group model secular trend appropriately?
Test: are the study group trend lines parallel prior to the intervention?
dm <- filter( d, group %in% c("treatment","control") &
time %in% c("time0","time1") )
dm$treat.dummy <- ifelse( dm$group=="treatment", 1, 0 )
dm$post.dummy <- ifelse( dm$time=="time1", 1, 0 )
dm$treat.post <- dm$treat.dummy * dm$post.dummy
m <- lm( ability ~ treat.dummy + post.dummy + treat.post,
data=dm)
stargazer( m, type=s.type,
omit.stat=c("f","ser","adj.rsq"),
intercept.top=TRUE, intercept.bottom=FALSE,
digits=2 )

| | Dependent variable: |
| | ability |
| Constant | -1.76*** |
| | (0.12) |
| treat.dummy | 0.001 |
| | (0.21) |
| post.dummy | 1.11*** |
| | (0.17) |
| treat.post | -0.08 |
| | (0.29) |
| Observations | 274 |
| R2 | 0.18 |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
Which coefficient captures the parallel lines test? Do we want it to be significant or not?
ANSWER:
The treat.post coefficient captures the parallel lines test. For the comparison group to measure secular trends correctly, we do not want this coefficient to be significant. In both cases it is not significant (-0.43 for the high SES group, -0.08 for the control group). We find no evidence against parallel pre-intervention trends, so we can conclude that both the high SES group and the control group model the secular trend appropriately.
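To show what the parallel lines test picks up, here is a sketch on made-up pre-period data (the `pre` data frame and its cell means are hypothetical). In the first fit both groups gain the same amount between time0 and time1; in the second the treatment group is given an extra 0.8 gain before any intervention.

```r
# Hypothetical pre-period data: 3 observations per group and period
pre <- data.frame(
  group = rep( c("comparison","treatment"), each=6 ),
  time  = rep( rep( c("time0","time1"), each=3 ), times=2 )
)
pre$treat.dummy <- ifelse( pre$group=="treatment", 1, 0 )
pre$post.dummy  <- ifelse( pre$time=="time1", 1, 0 )
pre$treat.post  <- pre$treat.dummy * pre$post.dummy

# Parallel case: both groups gain 1.1, the treatment group starts 1.7 lower
pre$ability <- 0 - 1.7*pre$treat.dummy + 1.1*pre$post.dummy +
  rep( c(-0.4, 0, 0.4), times=4 )

# Non-parallel case: the treatment group gains an extra 0.8 before treatment
pre$ability.np <- pre$ability + 0.8*pre$treat.post

m.par <- lm( ability    ~ treat.dummy + post.dummy + treat.post, data=pre )
m.np  <- lm( ability.np ~ treat.dummy + post.dummy + treat.post, data=pre )

gap.par <- unname( coef(m.par)["treat.post"] )   # difference in pre-period gains
gap.np  <- unname( coef(m.np)["treat.post"] )

round( c( parallel=gap.par, non.parallel=gap.np ), 2 )
```

A treat.post coefficient near zero is what parallel pre-intervention trends look like; a large, significant coefficient would signal that the comparison group is a poor stand-in for the counterfactual trend.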