Summary
This notebook simulates the estimated yearly learning gains and
yearly payment for a single IP. I first use the draft evaluation design
and other inputs to simulate the outcome of each data collection round.
I then use these simulated data to simulate yearly payments based on a
simple payment formula.
We have opted to use a simulation approach rather than a typical
power calculation approach because a) typical power calculations
focus on ensuring that results will be statistically significant,
which is not a concern here, b) typical power calculations don’t account
for the effect of payment formulas, and c) it would be difficult to
account for the intricacies of the complicated yearly evaluation design
using power calculations.
Overall, we find that the current evaluation design (and most
plausible evaluation designs which measure yearly marginal impact on
learning) leads to extremely high expected payments. On
average, investors would be paid 5x what they should be paid. The high
expected payment is due to the high uncertainty of yearly estimated
learning gains and the implicit minimum payment of 0. (We assume that
investors won’t be required to ever pay the outcome payers.)
We also find that payments are extremely variable.
For example, there is a 7.1% chance that payment will be 10x what they
should be paid.
Changes to the evaluation design and payment formula
have a much greater impact on expected payments than minor changes to
targets. A final takeaway from these simulations is that the
design
To fix these issues, we recommend instead using a
cohort-based evaluation design which does not seek to measure marginal
yearly impact on learning. At the end of this page, we
reproduce the simulations using an alternate evaluation design which
still produces annual estimates.
Alternatively, shifting to an overall, rather than yearly,
evaluation approach would both significantly reduce cost and
increase evaluation precision. This would also allow DIB
partners to focus less on technical issues and more on overall targets.
As these simulations show, if an annual payment formula is used, DIB
partners need to understand the sampling error of estimates, the correlation
of estimates across payment rounds, and the interaction of these things
with the payment formula, or else they will likely be very surprised by the
results. Shifting to an overall evaluation approach would mean that DIB
partners would not need to understand these detailed technical
issues.
Current Evaluation Design
The table below shows the current draft evaluation design.

Simulation Approach and Inputs
To simulate yearly estimated learning gains and yearly payments, I
first simulate each year + grade data collection point (i.e. each cell
in the evaluation design figure above) using the hypothesized true
effect and the estimated variance of the measured effect.
For example, the red cell in the figure above represents the
difference in mean learning levels between treatment and control schools
for year 2, grade 2 students \(\Delta
\bar{y}_{g2,y2}\). If we believe that the true effect of the
intervention over 2 years is T, the standard error of this
measurement is se, and our estimate is normally distributed, then
our estimate of this quantity has the distribution
\[\widehat{\Delta \bar{y}_{g2,y2}} \sim
N(T, se^2)\]
We can then simulate estimated learning gains and payments by drawing
multiple values from the distribution of measured learning gains for
each year + grade data collection round.
True Effect Sizes and Targets by Year
I assume that the true effect sizes vary by grade but not by
year.
- .15 in grade 1
- .05 in grade 2
- .025 in grade 3
I assume that the targets are based on the true effects. This means
that the targets by year (assuming equal sample from each grade)
are:
- Year 1 target = (.15+.025)/2 = .0875
- Years 2 and 3 target = (.15+.05)/2 = .1
- Year 4 target = (.05+.025)/2 = .0375
Accounting for Measurement Sampling Error
If we take a random sample of J schools with K students per school to
estimate mean learning levels y, then the variance of the estimate of
the mean (i.e. the square of the standard error) of y is:
\[ V(\bar{y}) = SE^2=
\sigma_y^2\left(\frac{\rho}{J}+\frac{(1-\rho)}{JK}\right)=\sigma_y^2(A+B)
\]
where \(\sigma_y^2\) is the
variance of the outcome variable, \(\rho\) is the intra-class correlation
(ICC), \(A = \rho/J\), and \(B = (1-\rho)/(JK)\).
For each grade + year data collection point, we calculate
\[
\Delta \bar{y}=\bar{y}_{treat}-\bar{y}_{control}
\]
Since we sample schools in treatment and control independently, the
two terms on the right are independent of each other and thus the
variance of their difference is just the sum of their variances.
Since we are calculating the standard error of standardized effect sizes,
we take \(\sigma\) to be 1. In
addition, when estimating effects, we will use covariates to reduce
variance. After setting \(\sigma\) to
1 and taking into account the effects of covariates, our variance is
\[ V(\Delta \bar{y}) =
2A(1-R_J^2)+2B(1-R_K^2) \]
Where:
- \(R_J^2\) is the R squared from a
regression of the outcome on the school-level covariates (e.g. UDISE
information). I use .2 in the calculations below but this is just a
guess.
- \(R_K^2\) is the R squared from a
regression of the outcome on the student-level covariates. I use .2 in
the calculations below but this is just a guess.
In addition, while beginning of year grade + year data collection
points are all independent since schools are resampled, data collection
points at the end of year are correlated across grades since data
collection will happen at the same schools for two grades. We account
for this by drawing from a multivariate normal distribution with a
correlation of .5 across grades. (Note that the estimate of the
correlation is based on my intuition as estimating this would require
detailed analysis of a dataset with multiple years of learning
levels.)
Calculations
library(MASS)
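# sampling inputs
J <- 65 # Number of schools per arm
K <- 25 # Number of students per school
rho <- 0.15 # Intra-class correlation (ICC)
rsj <- .2 # R squared from school-level covariates
rsk <- .2 # R squared from student-level covariates
rsl <- .63 # (not used in the calculations below)
corr <- .5 # Correlation of end-of-year estimates across grades
Sigma <- matrix(c(1,corr, corr, 1), ncol = 2) # Correlation matrix passed to mvrnorm()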
# variance calculations
A <- rho/J
B <- (1-rho)/(J*K)
var <- 2*A*(1-rsj)+2*B*(1-rsk)
se = var^.5
# Effect size and target inputs
effect_year1 = .15
effect_year2 = .05
effect_year3 = .025
target_year1 = .0875
target_year2 = .1
target_year3 = .0375
# Simulations
# note that g2y0e indicates that the draw is for grade 2, year 0, end of year
num_sims <- 100000
# year 0
g2y0e <- rnorm(n = num_sims, mean = .2, sd = se)
# year 1
g1y1b <- rnorm(n = num_sims, mean = 0, sd = se)
y1e <- mvrnorm(n = num_sims, mu = c(.225, .15), Sigma = Sigma)
g3y1e <- y1e[,1]
g1y1e <- y1e[,2]
# year 2
g1y2b <- rnorm(n = num_sims, mean = 0, sd = se)
y2e <- mvrnorm(n = num_sims, mu = c(.2, .15), Sigma = Sigma)
g2y2e <- y2e[,1]
g1y2e <- y2e[,2]
# year 3
g1y3b <- rnorm(n = num_sims, mean = 0, sd = se)
y3e <- mvrnorm(n = num_sims, mu = c(.2, .15), Sigma = Sigma)
g2y3e <- y3e[,1]
g1y3e <- y3e[,2]
# year 4
y4e <- mvrnorm(n = num_sims, mu = c(.225, .2), Sigma = Sigma)
g3y4e <- y4e[,1]
g2y4e <- y4e[,2]
# Estimated yearly effects (inputs to the payment calculation)
est_effect_y1 <- ((g3y1e-g2y0e)+(g1y1e-g1y1b))/2
est_effect_y2 <- ((g2y2e-g1y1e)+(g1y2e-g1y2b))/2
est_effect_y3 <- ((g2y3e-g1y2e)+(g1y3e-g1y3b))/2
est_effect_y4 <- ((g3y4e-g2y4e)+(g2y4e-g1y3e))/2
# Check that mean estimated effect is similar to targets
print("Mean estimated effect y1 vs target of .0875:")
[1] "Mean estimated effect y1 vs target of .0875:"
round(mean(est_effect_y1),4)
[1] 0.0898
print("Mean estimated effect y2 vs target of .1:")
[1] "Mean estimated effect y2 vs target of .1:"
round(mean(est_effect_y2),4)
[1] 0.1008
print("Mean estimated effect y3 vs target of .1:")
[1] "Mean estimated effect y3 vs target of .1:"
round(mean(est_effect_y3),4)
[1] 0.1001
print("Mean estimated effect y4 vs target of .0375:")
[1] "Mean estimated effect y4 vs target of .0375:"
round(mean(est_effect_y4),4)
[1] 0.0384
library(tidyverse)
# Calculate payment assuming minimum payment of 0
payment_y1 <- pmax(0, est_effect_y1/(4*.0875))
payment_y2 <- pmax(0, est_effect_y2/(4*.1))
payment_y3 <- pmax(0, est_effect_y3/(4*.1))
payment_y4 <- pmax(0, est_effect_y4/(4*.0375))
total_payment = payment_y1+payment_y2+payment_y3+payment_y4
mean(total_payment)
[1] 5.39108
# histogram of total payment
ggplot(tibble(p = total_payment), aes(x = p)) +
geom_histogram(binwidth = .2) +
geom_vline(xintercept = 1, color = "red")+
labs(x = "Total Payment", y = "Frequency", caption = "Red line indicates what the investors should be paid", title = "Expected Payment w/ Current Evaluation Design")

# For completeness, calculate expected payment with no minimum
payment_y1_nomin <- est_effect_y1/(4*.0875)
payment_y2_nomin <- est_effect_y2/(4*.1)
payment_y3_nomin <- est_effect_y3/(4*.1)
payment_y4_nomin <- est_effect_y4/(4*.0375)
total_payment_nomin = payment_y1_nomin+payment_y2_nomin+payment_y3_nomin+payment_y4_nomin
mean(total_payment_nomin)
[1] 1.014725
# histogram of total payment
ggplot(tibble(p = total_payment_nomin), aes(x = p)) +
geom_histogram(binwidth = .2) +
geom_vline(xintercept = 1, color = "red")+
labs(x = "Total Payment", y = "Frequency", title = "Expected Payment with Potential Negative Payments", caption = "Red line indicates what the investors should be paid")

Alternate Evaluation Design
The code below simulates measured learning gains and payments using
the same assumptions but with a slightly different, cohort-based
evaluation design.

# first cohort
g1y0b <- rnorm(n = num_sims, mean = 0, sd = se)
g2y1e <- rnorm(n = num_sims, mean = .2, sd = se)
est_effect_c1 <- g2y1e-g1y0b
# second cohort
g1y1b <- rnorm(n = num_sims, mean = 0, sd = se)
g2y2e <- rnorm(n = num_sims, mean = .2, sd = se)
est_effect_c2 <- g2y2e-g1y1b
# third cohort
g1y2b <- rnorm(n = num_sims, mean = 0, sd = se)
g2y3e <- rnorm(n = num_sims, mean = .2, sd = se)
est_effect_c3 <- g2y3e-g1y2b
# fourth cohort
g1y3b <- rnorm(n = num_sims, mean = 0, sd = se)
g2y4e <- rnorm(n = num_sims, mean = .2, sd = se)
est_effect_c4 <- g2y4e-g1y3b
payment_y1 <- pmax(0, est_effect_c1/.8)
payment_y2 <- pmax(0, est_effect_c2/.8)
payment_y3 <- pmax(0, est_effect_c3/.8)
payment_y4 <- pmax(0, est_effect_c4/.8)
total_payment = payment_y1+payment_y2+payment_y3+payment_y4
mean(total_payment)
[1] 1.003594
# histogram of total payment
ggplot(tibble(p = total_payment), aes(x = p)) +
geom_histogram(binwidth = .02) +
geom_vline(xintercept = 1, color = "red")+
labs(x = "Total Payment", y = "Frequency")

# For completeness, calculate expected payment with no minimum
payment_y1_nomin <- est_effect_c1/.8
payment_y2_nomin <- est_effect_c2/.8
payment_y3_nomin <- est_effect_c3/.8
payment_y4_nomin <- est_effect_c4/.8
total_payment_nomin = payment_y1_nomin+payment_y2_nomin+payment_y3_nomin+payment_y4_nomin
mean(total_payment_nomin)
[1] 1.000516
# histogram of total payment
ggplot(tibble(p = total_payment_nomin), aes(x = p)) +
geom_histogram(binwidth = .02) +
geom_vline(xintercept = 1, color = "red")+
labs(x = "Total Payment", y = "Frequency", title = "Expected Payment with Potential Negative Payments")

---
title: "B2SOF Sample Simulations"
output: html_notebook
---

# Summary

This notebook simulates the estimated yearly learning gains and yearly payment for a single IP. I first use the draft evaluation design and other inputs to simulate the outcome of each data collection round. I then use these simulated data to simulate yearly payments based on a simple payment formula.

We have opted to use a simulation approach rather than a typical power calculation approach because a) typical power calculations focus on ensuring that results will be statistically significant, which is not a concern here, b) typical power calculations don't account for the effect of payment formulas, and c) it would be difficult to account for the intricacies of the complicated yearly evaluation design using power calculations.

**Overall, we find that the current evaluation design (and most plausible evaluation designs which measure yearly marginal impact on learning) leads to extremely high expected payments.** On average, investors would be paid 5x what they should be paid. The high expected payment is due to the high uncertainty of yearly estimated learning gains and the implicit minimum payment of 0. (We assume that investors won't be required to ever pay the outcome payers.)

**We also find that payments are extremely variable**. For example, there is a 7.1% chance that payment will be 10x what they should be paid.

**Changes to the evaluation design and payment formula have a much greater impact on expected payments than minor changes to targets**. A final takeaway from these simulations is that the design

**To fix these issues, we recommend instead using a cohort-based evaluation design which does not seek to measure marginal yearly impact on learning**. At the end of this page, we reproduce the simulations using an alternate evaluation design which still produces annual estimates.

**Alternatively, shifting to an overall, rather than yearly, evaluation approach would both significantly reduce cost and increase evaluation precision**. This would also allow DIB partners to focus less on technical issues and more on overall targets. As these simulations show, if an annual payment formula is used, DIB partners need to understand the sampling error of estimates, the correlation of estimates across payment rounds, and the interaction of these things with the payment formula, or else they will likely be very surprised by the results. Shifting to an overall evaluation approach would mean that DIB partners would not need to understand these detailed technical issues.

# Current Evaluation Design

The table below shows the current draft evaluation design.

![](images/paste-0A07A1BA.png)

# Simulation Approach and Inputs

To simulate yearly estimated learning gains and yearly payments, I first simulate each year + grade data collection point (i.e. each cell in the evaluation design figure above) using the hypothesized true effect and the estimated variance of the measured effect.

For example, the red cell in the figure above represents the difference in mean learning levels between treatment and control schools for year 2, grade 2 students $\Delta \bar{y}_{g2,y2}$. If we believe that the true effect of the intervention over 2 years is T, the standard error of this measurement is se, and our estimate is normally distributed, then our estimate of this quantity has the distribution

$$\widehat{\Delta \bar{y}_{g2,y2}} \sim N(T, se^2)$$

We can then simulate estimated learning gains and payments by drawing multiple values from the distribution of measured learning gains for each year + grade data collection round.
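
As a minimal sketch of this step for a single cell, the snippet below draws simulated estimates around an assumed true effect. The values of `true_effect` and `se_cell` are illustrative assumptions, and the variable names are hypothetical rather than taken from the calculations later in this notebook.

```r
set.seed(1)            # fixed seed so the sketch is reproducible
true_effect <- 0.2     # assumed two-year true effect T (illustrative)
se_cell     <- 0.07    # assumed standard error of the estimate (illustrative)
n_draws     <- 10000

# Each draw is one simulated estimate of the treatment-control difference
draws <- rnorm(n = n_draws, mean = true_effect, sd = se_cell)

# The simulated estimates should centre on T with spread close to se_cell
round(c(mean = mean(draws), sd = sd(draws)), 3)
```

Note that `rnorm()` is parameterised by the standard deviation, not the variance, so the standard error is passed directly as `sd`.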

## True Effect Sizes and Targets by Year

I assume that the true effect sizes vary by grade but not by year.

-   .15 in grade 1
-   .05 in grade 2
-   .025 in grade 3

I assume that the targets are based on the true effects. This means that the targets by year (assuming equal sample from each grade) are:

-   Year 1 target = (.15+.025)/2 = .0875
-   Years 2 and 3 target = (.15+.05)/2 = .1
-   Year 4 target = (.05+.025)/2 = .0375
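
The target arithmetic above can be sketched as follows. The mapping of grades to years in `grades_by_year` is read off the three formulas in the list, and the object names are hypothetical.

```r
# True effects by grade, as assumed in the text
effect_by_grade <- c(g1 = 0.15, g2 = 0.05, g3 = 0.025)

# Grades entering each yearly target, read off the arithmetic above
grades_by_year <- list(
  y1 = c("g1", "g3"),
  y2 = c("g1", "g2"),
  y3 = c("g1", "g2"),
  y4 = c("g2", "g3")
)

# Target = simple average of the grade effects measured that year
# (this relies on the equal-sample-per-grade assumption)
targets <- sapply(grades_by_year, function(g) mean(effect_by_grade[g]))
targets
```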

## Payment Formula

I assume the payment in each year = estimated effect / (4\*target), with a minimum of 0. So if the estimated effect is equal to the target in all 4 years, the total payment is 1.
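
To illustrate why a minimum of 0 matters, the sketch below uses the standard closed form for the mean of a normal variable floored at zero: if $X \sim N(\mu, s^2)$, then $E[\max(0, X)] = \mu\,\Phi(\mu/s) + s\,\phi(\mu/s)$. The helper name `expected_floored` and the standard errors in `s_vals` are illustrative assumptions, not values from the evaluation design.

```r
# Expected value of max(0, X) for X ~ N(mu, s^2)
expected_floored <- function(mu, s) {
  mu * pnorm(mu / s) + s * dnorm(mu / s)
}

target <- 0.1                    # yearly target from the text
s_vals <- c(0.01, 0.1, 0.5, 1)   # illustrative standard errors (assumed)

# Expected yearly payment under payment = max(0, estimate) / (4 * target),
# when the true effect equals the target
expected_payment <- expected_floored(target, s_vals) / (4 * target)
data.frame(se = s_vals, expected_payment = expected_payment)
```

When the standard error is small relative to the target, the expected yearly payment is close to 0.25 (one quarter of the total). As the standard error grows, the floor removes the negative draws but keeps the positive ones, so the expected payment rises even though the true effect is unchanged.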

## Accounting for Measurement Sampling Error

If we take a random sample of J schools with K students per school to estimate mean learning levels y, then the variance of the estimate of the mean (i.e. the square of the standard error) of y is:

$$ V(\bar{y}) = SE^2= \sigma_y^2\left(\frac{\rho}{J}+\frac{(1-\rho)}{JK}\right)=\sigma_y^2(A+B) $$

where $\sigma_y^2$ is the variance of the outcome variable, $\rho$ is the intra-class correlation (ICC), $A = \rho/J$, and $B = (1-\rho)/(JK)$.

For each grade + year data collection point, we calculate

$$
\Delta \bar{y}=\bar{y}_{treat}-\bar{y}_{control}
$$

Since we sample schools in treatment and control independently, the two terms on the right are independent of each other and thus the variance of their difference is just the sum of their variances.
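
Written out, and assuming both arms share the same $J$, $K$, $\rho$ and $\sigma_y$, this step is:

$$
V(\Delta \bar{y}) = V(\bar{y}_{treat}) + V(\bar{y}_{control}) = 2\sigma_y^2(A+B)
$$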

Since we are calculating the standard error of standardized effect sizes, we take $\sigma$ to be 1. In addition, when estimating effects, we will use covariates to reduce variance. After setting $\sigma$ to 1 and taking into account the effects of covariates, our variance is

$$ V(\Delta \bar{y}) = 2A(1-R_J^2)+2B(1-R_K^2) $$

Where:

1.  $R_J^2$ is the R squared from a regression of the outcome on the school-level covariates (e.g. UDISE information). I use .2 in the calculations below but this is just a guess.
2.  $R_K^2$ is the R squared from a regression of the outcome on the student-level covariates. I use .2 in the calculations below but this is just a guess.
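
As a worked example of the variance formula, the sketch below plugs in the sampling inputs used in the calculations section (65 schools per arm, 25 students per school, an ICC of .15, and .2 for both R squared values). The variable names are hypothetical.

```r
J   <- 65     # schools per arm
K   <- 25     # students per school
rho <- 0.15   # intra-class correlation
r2_school  <- 0.2   # R_J^2 (a guess, per the text)
r2_student <- 0.2   # R_K^2 (a guess, per the text)

A <- rho / J               # between-school component
B <- (1 - rho) / (J * K)   # within-school component

# Variance and standard error of the treatment-control difference
var_diff <- 2 * A * (1 - r2_school) + 2 * B * (1 - r2_student)
se_diff  <- sqrt(var_diff)
round(c(A = A, B = B, var = var_diff, se = se_diff), 5)
```

With these inputs the standard error of a single treatment-control difference is roughly 0.067 standard deviations, and the between-school term accounts for most of the variance.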

In addition, while beginning of year grade + year data collection points are all independent since schools are resampled, data collection points at the end of year are correlated across grades since data collection will happen at the same schools for two grades. We account for this by drawing from a multivariate normal distribution with a correlation of .5 across grades. (Note that the estimate of the correlation is based on my intuition as estimating this would require detailed analysis of a dataset with multiple years of learning levels.)
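
One way to sketch correlated end-of-year draws is shown below. `MASS::mvrnorm()` takes a covariance matrix in its `Sigma` argument, so this sketch scales the correlation matrix by the squared standard error. That scaling assumes end-of-year estimates have the same standard error as the other data collection points, and `se_assumed` and the means are illustrative values.

```r
library(MASS)

set.seed(1)
se_assumed <- 0.067   # illustrative standard error (assumed)
corr       <- 0.5     # correlation across grades, as in the text

# Covariance matrix: se^2 on the diagonal, corr * se^2 off the diagonal
Sigma_scaled <- se_assumed^2 * matrix(c(1, corr, corr, 1), ncol = 2)

# Correlated end-of-year estimates for two grades measured at the same schools
draws <- mvrnorm(n = 10000, mu = c(0.2, 0.15), Sigma = Sigma_scaled)

# Empirical standard deviations and correlation of the simulated estimates
round(apply(draws, 2, sd), 3)
round(cor(draws[, 1], draws[, 2]), 2)
```

Passing the unscaled correlation matrix as `Sigma` would instead give each end-of-year draw a standard deviation of 1, so the scale of `Sigma` has a large effect on the simulated spread of yearly estimates.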

# Calculations

```{r}
library(MASS)

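# sampling inputs
J <- 65 # Number of schools per arm
K <- 25 # Number of students per school
rho <- 0.15 # Intra-class correlation (ICC)
rsj <- .2 # R squared from school-level covariates
rsk <- .2 # R squared from student-level covariates
rsl <- .63 # (not used in the calculations below)
corr <- .5 # Correlation of end-of-year estimates across grades
Sigma <- matrix(c(1,corr, corr, 1), ncol = 2) # Correlation matrix passed to mvrnorm()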

# variance calculations
A <- rho/J
B <- (1-rho)/(J*K)
var <- 2*A*(1-rsj)+2*B*(1-rsk)
se = var^.5

# Effect size and target inputs
effect_year1 = .15
effect_year2 = .05
effect_year3 = .025

target_year1 = .0875
target_year2 = .1
target_year3 = .0375


# Simulations
# note that g2y0e indicates that the draw is for grade 2, year 0, end of year
num_sims <- 100000

# year 0
g2y0e <- rnorm(n = num_sims, mean = .2, sd = se)

# year 1
g1y1b <- rnorm(n = num_sims, mean = 0, sd = se)
y1e <- mvrnorm(n = num_sims, mu = c(.225, .15), Sigma = Sigma)
g3y1e <- y1e[,1]
g1y1e <- y1e[,2]

# year 2
g1y2b <- rnorm(n = num_sims, mean = 0, sd = se)
y2e <- mvrnorm(n = num_sims, mu = c(.2, .15), Sigma = Sigma)
g2y2e <- y2e[,1]
g1y2e <- y2e[,2]

# year 3
g1y3b <- rnorm(n = num_sims, mean = 0, sd = se)
y3e <- mvrnorm(n = num_sims, mu = c(.2, .15), Sigma = Sigma)
g2y3e <- y3e[,1]
g1y3e <- y3e[,2]

# year 4
y4e <- mvrnorm(n = num_sims, mu = c(.225, .2), Sigma = Sigma)
g3y4e <- y4e[,1]
g2y4e <- y4e[,2]

# Estimated yearly effects (inputs to the payment calculation)
est_effect_y1 <- ((g3y1e-g2y0e)+(g1y1e-g1y1b))/2
est_effect_y2 <- ((g2y2e-g1y1e)+(g1y2e-g1y2b))/2
est_effect_y3 <- ((g2y3e-g1y2e)+(g1y3e-g1y3b))/2
est_effect_y4 <- ((g3y4e-g2y4e)+(g2y4e-g1y3e))/2

# Check that mean estimated effect is similar to targets
print("Mean estimated effect y1 vs target of .0875:")
round(mean(est_effect_y1),4)
print("Mean estimated effect y2 vs target of .1:")
round(mean(est_effect_y2),4)
print("Mean estimated effect y3 vs target of .1:")
round(mean(est_effect_y3),4)
print("Mean estimated effect y4 vs target of .0375:")
round(mean(est_effect_y4),4)

```

```{r}
library(tidyverse)

# Calculate payment assuming minimum payment of 0
payment_y1 <- pmax(0, est_effect_y1/(4*.0875))
payment_y2 <- pmax(0, est_effect_y2/(4*.1))
payment_y3 <- pmax(0, est_effect_y3/(4*.1))
payment_y4 <- pmax(0, est_effect_y4/(4*.0375))

total_payment = payment_y1+payment_y2+payment_y3+payment_y4
mean(total_payment)

# histogram of total payment
ggplot(tibble(p = total_payment), aes(x = p)) + 
  geom_histogram(binwidth = .2) +
  geom_vline(xintercept = 1, color = "red")+
  labs(x = "Total Payment", y = "Frequency", caption = "Red line indicates what the investors should be paid", title = "Expected Payment w/ Current Evaluation Design")    

# For completeness, calculate expected payment with no minimum
payment_y1_nomin <- est_effect_y1/(4*.0875)
payment_y2_nomin <- est_effect_y2/(4*.1)
payment_y3_nomin <- est_effect_y3/(4*.1)
payment_y4_nomin <- est_effect_y4/(4*.0375)
                                   
total_payment_nomin = payment_y1_nomin+payment_y2_nomin+payment_y3_nomin+payment_y4_nomin
mean(total_payment_nomin)

# histogram of total payment
ggplot(tibble(p = total_payment_nomin), aes(x = p)) + 
  geom_histogram(binwidth = .2) +
  geom_vline(xintercept = 1, color = "red")+
  labs(x = "Total Payment", y = "Frequency", title = "Expected Payment with Potential Negative Payments", caption = "Red line indicates what the investors should be paid")    

```

# Alternate Evaluation Design

The code below simulates measured learning gains and payments using the same assumptions but with a slightly different, cohort-based evaluation design.

![](images/paste-6F5390D5.png)
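
As a rough sketch of the arithmetic behind this design: each cohort estimate is an end-of-grade-2 measurement minus a beginning-of-grade-1 baseline. Reading the cohort target as the sum of the grade 1 and grade 2 true effects is an interpretation of the code in the next chunk, and `se_assumed` is an illustrative value.

```r
cohort_target   <- 0.15 + 0.05   # grade 1 + grade 2 true effects from the text
n_cohorts       <- 4
payment_divisor <- n_cohorts * cohort_target   # total payment is 1 when every cohort hits target

se_assumed <- 0.067   # illustrative standard error of one data collection point (assumed)

# Standard error of a difference of two independent draws with equal standard error
se_cohort <- sqrt(2) * se_assumed

round(c(divisor = payment_divisor, se_cohort = se_cohort,
        target_to_se = cohort_target / se_cohort), 3)
```

Under these assumptions the cohort target is about twice the standard error of the cohort estimate, so negative estimates are uncommon and the floor at 0 is seldom applied.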

```{r}
# first cohort
g1y0b <- rnorm(n = num_sims, mean = 0, sd = se)
g2y1e <- rnorm(n = num_sims, mean = .2, sd = se)
est_effect_c1 <- g2y1e-g1y0b

# second cohort
g1y1b <- rnorm(n = num_sims, mean = 0, sd = se)
g2y2e <- rnorm(n = num_sims, mean = .2, sd = se)
est_effect_c2 <- g2y2e-g1y1b

# third cohort
g1y2b <- rnorm(n = num_sims, mean = 0, sd = se)
g2y3e <- rnorm(n = num_sims, mean = .2, sd = se)
est_effect_c3 <- g2y3e-g1y2b

# fourth cohort
g1y3b <- rnorm(n = num_sims, mean = 0, sd = se)
g2y4e <- rnorm(n = num_sims, mean = .2, sd = se)
est_effect_c4 <- g2y4e-g1y3b

payment_y1 <- pmax(0, est_effect_c1/.8)
payment_y2 <- pmax(0, est_effect_c2/.8)
payment_y3 <- pmax(0, est_effect_c3/.8)
payment_y4 <- pmax(0, est_effect_c4/.8)

total_payment = payment_y1+payment_y2+payment_y3+payment_y4
mean(total_payment)

# histogram of total payment
ggplot(tibble(p = total_payment), aes(x = p)) + 
  geom_histogram(binwidth = .02) +
  geom_vline(xintercept = 1, color = "red")+
  labs(x = "Total Payment", y = "Frequency")

# For completeness, calculate expected payment with no minimum
payment_y1_nomin <- est_effect_c1/.8
payment_y2_nomin <- est_effect_c2/.8
payment_y3_nomin <- est_effect_c3/.8
payment_y4_nomin <- est_effect_c4/.8
                                   
total_payment_nomin = payment_y1_nomin+payment_y2_nomin+payment_y3_nomin+payment_y4_nomin
mean(total_payment_nomin)

# histogram of total payment
ggplot(tibble(p = total_payment_nomin), aes(x = p)) + 
  geom_histogram(binwidth = .02) +
  geom_vline(xintercept = 1, color = "red")+
  labs(x = "Total Payment", y = "Frequency", title = "Expected Payment with Potential Negative Payments")  


```
