This dataset comprises student-level data from around 17,000 students. The variables are: outcome.test.score (a financial proficiency score and the outcome of interest in this study), treatment, school, is.female, father.attended.secondary.school, mother.attended.secondary.school, failed.at.least.one.school.year, family.receives.cash.transfer, has.computer.with.internet.at.home, is.unemployed, has.some.form.of.income, saves.money.for.future.purchases, intention.to.save.index, makes.list.of.expenses.every.month, negotiates.prices.or.payment.methods, and financial.autonomy.index.
Summary Statistics
# A tibble: 2 × 3
  treatment `mean(outcome.test.score)`     n
      <int>                      <dbl> <int>
1         0                       56.2  8405
2         1                       60.5  8894
  mean(outcome.test.score)
1                 4.329958
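A summary like the one above can be produced with a grouped mean and a subtraction. The following is a minimal sketch, assuming the dplyr package (suggested by the tibble output, but not named in the text); the `students` data frame and its values are made up for illustration and are not the study data.

```r
library(dplyr)

# Toy stand-in for the student-level data (values are invented for illustration).
students <- data.frame(
  treatment = c(0, 0, 0, 1, 1, 1),
  outcome.test.score = c(50, 55, 60, 58, 62, 66)
)

# Mean score and group size in each treatment arm.
by.arm <- students %>%
  group_by(treatment) %>%
  summarize(mean.score = mean(outcome.test.score), n = n())

# Raw difference in means: treated minus control.
raw.diff <- by.arm$mean.score[by.arm$treatment == 1] -
  by.arm$mean.score[by.arm$treatment == 0]
```

Note that this raw difference comes without a standard error, so by itself it says nothing about how precisely the effect is estimated.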
It looks like this program increased student financial proficiency on average: students who received the treatment scored over 4 points higher than those who did not (60.5 vs. 56.2).
To estimate and summarize conditional average treatment effects (CATE), you can fix the propensity score at 0.5: this is a randomized controlled trial (RCT), so assignment to the treatment or control group was random and the probability of treatment is known by design. Because the RCT was clustered at the school level, the model includes school as a clustering variable, so that sampling units are drawn at the school level rather than the student level.
The model lets you compute a doubly robust estimate of the average treatment effect (ATE). The benefit appears clear: the standard error is small relative to the estimate.
A very simple way to see which variables appear to matter for treatment effects is to inspect variable importance. Variables that rank higher were used more often when the model split the data, which suggests that the treatment effect varies along them. In lay terms, these are the characteristics to look at first when asking for whom the program works best; a high ranking alone does not show that changing a variable would change the effect. The five highest-ranked variables are:
[1] "financial.autonomy.index"           "intention.to.save.index"
[3] "is.female"                          "family.receives.cash.transfer"
[5] "has.computer.with.internet.at.home"
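The workflow described above (fixed propensity score, school-level clustering, doubly robust ATE, variable importance) can be sketched as follows. This assumes the grf package, which the text does not name, and runs on simulated data: the covariates `x1`, `x2`, `x3`, the number of schools, and the effect sizes are all hypothetical and chosen only so the example runs on its own.

```r
library(grf)

set.seed(1)

# Simulated stand-in data: 40 schools with 25 students each (hypothetical sizes).
n.schools <- 40
school <- rep(seq_len(n.schools), each = 25)
n <- length(school)

# Treatment is assigned at the school level, half of the schools treated.
W.school <- sample(rep(0:1, each = n.schools / 2))
W <- W.school[school]

# Hypothetical covariates and an outcome whose treatment effect varies with x1.
X <- cbind(x1 = rnorm(n), x2 = rnorm(n), x3 = rbinom(n, 1, 0.5))
Y <- 55 + 4 * W + 2 * W * X[, "x1"] + rnorm(n, sd = 5)

# Causal forest with the propensity score fixed at 0.5 (known from the RCT design)
# and schools as clusters, so whole schools are drawn when subsampling.
cf <- causal_forest(X, Y, W, W.hat = 0.5, clusters = school, num.trees = 500)

# Doubly robust ATE: a named vector with "estimate" and "std.err".
ate <- average_treatment_effect(cf)

# Variable importance reflects how often each variable was used for splitting.
varimp <- variable_importance(cf)
ranked.vars <- colnames(X)[order(varimp, decreasing = TRUE)]
```

Here `ate[["estimate"]]` plus or minus 1.96 times `ate[["std.err"]]` gives an approximate 95% confidence interval, and `ranked.vars` lists the covariates from most to least important.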