id group grade gender english_learner SES test_scores
1 1 treatment 4 male EL Low 50.13043
2 2 treatment 4 female EL High 70.18136
3 3 treatment 4 female EL Low 53.62838
4 4 treatment 5 male non-EL Low 86.81122
5 5 treatment 4 female non-EL High 95.83271
6 6 treatment 5 male EL Low 23.18895
id group grade gender
Min. : 1.0 Length:1000 Length:1000 Length:1000
1st Qu.: 250.8 Class :character Class :character Class :character
Median : 500.5 Mode :character Mode :character Mode :character
Mean : 500.5
3rd Qu.: 750.2
Max. :1000.0
english_learner SES test_scores
Length:1000 Length:1000 Min. : 7.948
Class :character Class :character 1st Qu.:40.046
Mode :character Mode :character Median :57.169
Mean :57.204
3rd Qu.:75.423
Max. :99.452
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.00 0.00 0.00 0.43 1.00 1.00
Call:
matchit(formula = formula, data = df, method = "nearest")
Summary of Balance for All Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.4321 0.4284 0.1235 0.9891 0.0346
grade4 0.5372 0.4825 0.1098 . 0.0548
grade5 0.4628 0.5175 -0.1098 . 0.0548
genderfemale 0.4837 0.4860 -0.0045 . 0.0022
gendermale 0.5163 0.5140 0.0045 . 0.0022
english_learnerEL 0.5070 0.5228 -0.0317 . 0.0158
english_learnernon-EL 0.4930 0.4772 0.0317 . 0.0158
SESHigh 0.5070 0.4860 0.0420 . 0.0210
SESLow 0.4930 0.5140 -0.0420 . 0.0210
test_scores 57.1438 57.2502 -0.0051 0.9195 0.0207
eCDF Max
distance 0.0761
grade4 0.0548
grade5 0.0548
genderfemale 0.0022
gendermale 0.0022
english_learnerEL 0.0158
english_learnernon-EL 0.0158
SESHigh 0.0210
SESLow 0.0210
test_scores 0.0508
Summary of Balance for Matched Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.4321 0.4316 0.0185 1.0187 0.0072
grade4 0.5372 0.5372 0.0000 . 0.0000
grade5 0.4628 0.4628 0.0000 . 0.0000
genderfemale 0.4837 0.4977 -0.0279 . 0.0140
gendermale 0.5163 0.5023 0.0279 . 0.0140
english_learnerEL 0.5070 0.4977 0.0186 . 0.0093
english_learnernon-EL 0.4930 0.5023 -0.0186 . 0.0093
SESHigh 0.5070 0.4884 0.0372 . 0.0186
SESLow 0.4930 0.5116 -0.0372 . 0.0186
test_scores 57.1438 58.4044 -0.0602 0.9556 0.0266
eCDF Max Std. Pair Dist.
distance 0.0512 0.0204
grade4 0.0000 0.0047
grade5 0.0000 0.0047
genderfemale 0.0140 0.9028
gendermale 0.0140 0.9028
english_learnerEL 0.0093 0.5489
english_learnernon-EL 0.0093 0.5489
SESHigh 0.0186 0.4559
SESLow 0.0186 0.4559
test_scores 0.0651 0.5835
Sample Sizes:
Control Treated
All 570 430
Matched 430 430
Unmatched 140 0
Discarded 0 0
Observations:
id group grade gender english_learner SES test_scores
1 0 0 0 0 0 0 0
id group grade gender english_learner SES test_scores distance weights subclass
1 1 1 4 male EL Low 50.13043 0.4407498 1 1
2 2 1 4 female EL High 70.18136 0.4551507 1 112
3 3 1 4 female EL Low 53.62838 0.4373067 1 223
4 4 1 5 male non-EL Low 86.81122 0.3955390 1 334
5 5 1 4 female non-EL High 95.83271 0.4665651 1 376
6 6 1 5 male EL Low 23.18895 0.3927080 1 387
[1] 0.4407498 0.4551507 0.4373067 0.3955390 0.4665651 0.3927080
Call:
lm(formula = test_scores ~ group, data = matched_data)
Residuals:
Min 1Q Median 3Q Max
-50.457 -16.852 0.142 17.866 41.880
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 58.404 1.022 57.157 <2e-16 ***
group -1.261 1.445 -0.872 0.383
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.19 on 858 degrees of freedom
Multiple R-squared: 0.0008862, Adjusted R-squared: -0.0002783
F-statistic: 0.761 on 1 and 858 DF, p-value: 0.3832
[1] "The estimated ATT is: -1.26065595035959"
The ATT is estimated to be -1.261, with a p-value of 0.383.
This result suggests that the treatment effect is not statistically significant at conventional levels (e.g., 0.05) given the fairly high p-value. The negative sign of the ATT indicates that the treatment group has a lower test score compared to the control group, albeit this difference is not statistically significant.
Next, it would be prudent to assess the quality of the matches to ensure the robustness of the ATT estimate. We typically look at the balance of covariates using balance tables and diagnostic plots.
Balance Measures
Type Diff.Un Diff.Adj
distance Distance 0.1235 0.0185
grade_5 Binary -0.0548 0.0000
gender_male Binary 0.0022 0.0140
english_learner_non-EL Binary 0.0158 -0.0093
SES_Low Binary -0.0210 -0.0186
test_scores Contin. -0.0051 -0.0602
Sample sizes
Control Treated
All 570 430
Matched 430 430
Unmatched 140 0
Findings:
Overall, the matching process seems to have done a good job in balancing the covariates between the treatment and control groups, setting a solid ground for a reliable estimation of the treatment effects.
Given that our ATT estimate was not statistically significant, it might be beneficial to conduct sensitivity analyses to assess the robustness of our findings. Sensitivity analyses could involve varying the specifications of propensity score model or matching method and seeing how ATT estimate changes.
We can include interaction terms or polynomial terms of the covariates in our propensity score model to check if that leads to a different ATT estimate.
Call:
matchit(formula = group ~ grade + gender + english_learner +
SES + test_scores + grade:english_learner, data = df, method = "nearest")
Summary of Balance for All Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.4324 0.4282 0.1341 0.9555 0.0392
grade4 0.5372 0.4825 0.1098 . 0.0548
grade5 0.4628 0.5175 -0.1098 . 0.0548
genderfemale 0.4837 0.4860 -0.0045 . 0.0022
gendermale 0.5163 0.5140 0.0045 . 0.0022
english_learnerEL 0.5070 0.5228 -0.0317 . 0.0158
english_learnernon-EL 0.4930 0.4772 0.0317 . 0.0158
SESHigh 0.5070 0.4860 0.0420 . 0.0210
SESLow 0.4930 0.5140 -0.0420 . 0.0210
test_scores 57.1438 57.2502 -0.0051 0.9195 0.0207
eCDF Max
distance 0.0796
grade4 0.0548
grade5 0.0548
genderfemale 0.0022
gendermale 0.0022
english_learnerEL 0.0158
english_learnernon-EL 0.0158
SESHigh 0.0210
SESLow 0.0210
test_scores 0.0508
Summary of Balance for Matched Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.4324 0.4320 0.0132 0.9998 0.0045
grade4 0.5372 0.5488 -0.0233 . 0.0116
grade5 0.4628 0.4512 0.0233 . 0.0116
genderfemale 0.4837 0.5140 -0.0605 . 0.0302
gendermale 0.5163 0.4860 0.0605 . 0.0302
english_learnerEL 0.5070 0.5279 -0.0419 . 0.0209
english_learnernon-EL 0.4930 0.4721 0.0419 . 0.0209
SESHigh 0.5070 0.4860 0.0419 . 0.0209
SESLow 0.4930 0.5140 -0.0419 . 0.0209
test_scores 57.1438 57.1949 -0.0024 0.9419 0.0197
eCDF Max Std. Pair Dist.
distance 0.0233 0.0156
grade4 0.0116 0.1073
grade5 0.0116 0.1073
genderfemale 0.0302 1.1122
gendermale 0.0302 1.1122
english_learnerEL 0.0209 0.4233
english_learnernon-EL 0.0209 0.4233
SESHigh 0.0209 0.1721
SESLow 0.0209 0.1721
test_scores 0.0419 0.6848
Sample Sizes:
Control Treated
All 570 430
Matched 430 430
Unmatched 140 0
Discarded 0 0
[1] "The estimated ATT is: -0.0511595186192628"
Call:
matchit(formula = group ~ grade + gender + english_learner +
SES + test_scores + I(test_scores^2) + I(test_scores^3),
data = df, method = "nearest")
Summary of Balance for All Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.4357 0.4257 0.2148 0.8212 0.0596
grade4 0.5372 0.4825 0.1098 . 0.0548
grade5 0.4628 0.5175 -0.1098 . 0.0548
genderfemale 0.4837 0.4860 -0.0045 . 0.0022
gendermale 0.5163 0.5140 0.0045 . 0.0022
english_learnerEL 0.5070 0.5228 -0.0317 . 0.0158
english_learnernon-EL 0.4930 0.4772 0.0317 . 0.0158
SESHigh 0.5070 0.4860 0.0420 . 0.0210
SESLow 0.4930 0.5140 -0.0420 . 0.0210
test_scores 57.1438 57.2502 -0.0051 0.9195 0.0207
I(test_scores^2) 3703.1829 3753.9354 -0.0205 0.9746 0.0207
I(test_scores^3) 262443.3773 268528.9501 -0.0250 1.0049 0.0207
eCDF Max
distance 0.1208
grade4 0.0548
grade5 0.0548
genderfemale 0.0022
gendermale 0.0022
english_learnerEL 0.0158
english_learnernon-EL 0.0158
SESHigh 0.0210
SESLow 0.0210
test_scores 0.0508
I(test_scores^2) 0.0508
I(test_scores^3) 0.0508
Summary of Balance for Matched Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.4357 0.4350 0.0146 1.0026 0.0051
grade4 0.5372 0.5233 0.0280 . 0.0140
grade5 0.4628 0.4767 -0.0280 . 0.0140
genderfemale 0.4837 0.4860 -0.0047 . 0.0023
gendermale 0.5163 0.5140 0.0047 . 0.0023
english_learnerEL 0.5070 0.4977 0.0186 . 0.0093
english_learnernon-EL 0.4930 0.5023 -0.0186 . 0.0093
SESHigh 0.5070 0.5070 0.0000 . 0.0000
SESLow 0.4930 0.4930 0.0000 . 0.0000
test_scores 57.1438 56.8244 0.0152 1.0216 0.0104
I(test_scores^2) 3703.1829 3657.5326 0.0185 1.0317 0.0104
I(test_scores^3) 262443.3773 257300.5649 0.0212 1.0333 0.0104
eCDF Max Std. Pair Dist.
distance 0.0302 0.0178
grade4 0.0140 0.5504
grade5 0.0140 0.5504
genderfemale 0.0023 0.9354
gendermale 0.0023 0.9354
english_learnerEL 0.0093 0.8466
english_learnernon-EL 0.0093 0.8466
SESHigh 0.0000 0.4698
SESLow 0.0000 0.4698
test_scores 0.0372 0.8684
I(test_scores^2) 0.0372 0.8569
I(test_scores^3) 0.0372 0.8264
Sample Sizes:
Control Treated
All 570 430
Matched 430 430
Unmatched 140 0
Discarded 0 0
[1] "The estimated ATT is: 0.319389087250886"
Balance Assessment
ATT (Average Treatment Effect on the Treated) Assessment
Discussion and Conclusion
Upon assessing the balance tables and ATT estimates, we can see that both the polynomial terms model and the interaction terms model have achieved a good balance in the matched data, with SMDs close to zero for most covariates.
Moreover, both these models offer less extreme ATT estimates compared to the primary model, which might imply a more nuanced control of the confounding variables due to the inclusion of interaction and polynomial terms.
Given that both the polynomial and interaction models provide a good balance, the choice between them could be influenced by theoretical considerations and the substantive meanings of the ATT estimates. For instance:
Given the balance achieved and the ATT estimates, one might lean towards the polynomial terms model as it not only assures a good balance but also an ATT that signifies a positive treatment effect, which might be theoretically more plausible or desirable depending on the context of our study.
Therefore, considering both the balance diagnostics and the ATT estimates, the polynomial terms model appears to be the better model. However, it is crucial to align this choice with the theoretical underpinnings of your study and the substantive meanings of the treatment effects.