RD design

Author

J.D.

Published

April 1, 2023

Introduction

Koppensteiner and Matheson (2021) found that an intervention on education infrastructure significantly affected childbearing. Their abstract:

“This article investigates the effect that increasing secondary education opportunities have on teenage fertility in Brazil. Using a novel dataset to exploit variation from a 57 percent increase in secondary schools across 4,884 Brazilian municipalities between 1997 and 2009, the analysis shows an important role of secondary school availability on underage fertility. An increase of one school per 100 females reduces a cohort’s teenage birthrate by between 0.250 and 0.563 births per 100, or a reduction of one birth for roughly every 50 to 100 students who enroll in secondary education. The results highlight the important role of access to education leading to spillovers in addition to improving educational attainment.”

Below are some descriptive statistics for our variables of interest (activer is the school density).

          vars     n    mean   sd  median trimmed  mad     min     max range
ano          1 74368 2002.50 4.61 2002.50 2002.50 5.93 1995.00 2010.00 15.00
fem_share    2 74368    0.20 0.06    0.20    0.20 0.06    0.03    0.48  0.45
ch_bear      3 74368    0.22 0.07    0.21    0.21 0.06    0.00    0.51  0.51
activer      4 74368    0.02 0.02    0.02    0.02 0.01    0.00    0.92  0.92
          skew kurtosis   se
ano       0.00    -1.21 0.02
fem_share 0.60     0.57 0.00
ch_bear   0.34     0.54 0.00
activer   7.30   215.98 0.00

Estimates

Before evaluating the effect of childbearing on the female income share, we present two preliminary analyses: first, the relationship between school infrastructure and childbearing; second, the relationship between school infrastructure and the female income share.

  • The school density was obtained from the authors' database (Koppensteiner and Matheson, 2021). It is available from 1996 to 2010; we added 1995 in order to have more infrastructure data before the reform (in 1997).
  • The female share of income and the childbearing dummy variable were constructed by us from the 2010 demographic census. This led to almost 9 million observations, which we aggregate at the municipality level.
  • Our study focuses on a panel of around 5,000 municipalities, with the aim of assessing the impact of childbearing rates on the share of income received by females as of 2010.

As expected, the distribution of the centered school density (the running variable) is discontinuous at the cutoff, which is the school density in 1997. The density test below rejects continuity at the cutoff:


Manipulation testing using local polynomial density estimation.

Number of obs =       74368
Model =               unrestricted
Kernel =              triangular
BW method =           estimated
VCE method =          jackknife

c = 0                 Left of c           Right of c          
Number of obs         27514               46854               
Eff. Number of obs    10638               20123               
Order est. (p)        2                   2                   
Order bias (q)        3                   3                   
BW est. (h)           0.001               0.002               

Method                T                   P > |T|             
Robust                -4.6296             0                   


P-values of binomial tests (H0: p=0.5).

Window Length / 2          <c     >=c    P>|T|
0.000                      20   10098    0.0000
0.000                      84   10152    0.0000
0.000                     187   10241    0.0000
0.000                     277   10372    0.0000
0.000                     424   10520    0.0000
0.000                     565   10664    0.0000
0.000                     710   10817    0.0000
0.000                     836   10954    0.0000
0.000                     987   11110    0.0000
0.000                    1150   11281    0.0000

First

The following illustrates the relationship between school infrastructure, i.e. the number of schools per 100 females of school age, and the childbearing rate. Both are calculated at the municipality level.

The running variable on the horizontal axis (school density) is centered at its level at the beginning of the school infrastructure intervention in 1997. The first plot provides a full overview of the sample, whereas the second zooms in near the cutoff (school density in 1997).
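The centering described above can be sketched in base R. This is only an illustration on toy data: the column names `codmunic6`, `ano` and `activer` follow the output shown in this report, while the toy values, the helper column `activer_1997`, and the assumption that each municipality is centered at its own 1997 level are ours.

```r
# Toy panel: two hypothetical municipalities observed in three years.
panel <- data.frame(
  codmunic6 = rep(c(110001, 110002), each = 3),
  ano       = rep(c(1996, 1997, 1998), times = 2),
  activer   = c(0.010, 0.012, 0.015, 0.020, 0.018, 0.025)
)

# Assumption: the cutoff is each municipality's own school density in 1997.
base_1997 <- panel[panel$ano == 1997, c("codmunic6", "activer")]
names(base_1997)[2] <- "activer_1997"

panel <- merge(panel, base_1997, by = "codmunic6")

# Centered running variable and treatment indicator (at or above the cutoff).
panel$x     <- panel$activer - panel$activer_1997
panel$treat <- panel$x >= 0

print(panel)
```

Under this assumption every municipality has a running variable of exactly zero in 1997, which would produce a mass of observations at the cutoff.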

[1] "Mass points detected in the running variable."

A preliminary sharp RD that measures the LATE of the intervention shows a reduction in the childbearing probability of about 0.3 percentage points, significant at the 10 percent level:

Sharp RD estimates using local polynomial regression.

Number of Obs.                74368
BW type                       mserd
Kernel                   Triangular
VCE method                       NN

Number of Obs.                27514        46854
Eff. Number of Obs.           20495        31737
Order est. (p)                    1            1
Order bias  (q)                   2            2
BW est. (h)                   0.006        0.006
BW bias (b)                   0.015        0.015
rho (h/b)                     0.408        0.408
Unique Obs.                   14162        21781

=============================================================================
        Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
=============================================================================
  Conventional    -0.003     0.002    -1.884     0.060    [-0.007 , 0.000]     
        Robust         -         -    -1.916     0.055    [-0.007 , 0.000]     
=============================================================================

In other words (paraphrasing the abstract), an increase of one school per 100 female students reduces teenage births by about 0.3 births per 100 females. This is very close to the magnitude in the reference paper (0.250 to 0.563 births per 100), a range that lies within our confidence interval.
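To make the mechanics behind a sharp RD estimate more concrete, here is a minimal base R sketch of a local linear fit (order 1) with a triangular kernel, matching the kernel and polynomial order reported in the table above. It is not the estimator used in this report: the data are simulated, the bandwidth `h` is fixed by hand rather than chosen by the `mserd` procedure, and no bias correction or robust inference is applied. All names and numbers in the block are hypothetical.

```r
set.seed(42)

# Simulated data with a known jump of -0.3 at the cutoff x = 0.
n     <- 5000
x     <- runif(n, -1, 1)
treat <- as.numeric(x >= 0)
y     <- 0.2 + 0.5 * x - 0.3 * treat + rnorm(n, sd = 0.1)

# Triangular kernel weights with a hand-picked bandwidth (assumption).
h <- 0.5
w <- pmax(0, 1 - abs(x) / h)

# Local linear regression with separate slopes on each side of the cutoff.
fit <- lm(y ~ treat * x, weights = w, subset = w > 0)

# The coefficient on `treat` is the estimated discontinuity at the cutoff.
rd_jump <- unname(coef(fit)["treat"])
print(rd_jump)
```

The interaction `treat * x` lets the regression line have a different slope on each side of the cutoff, so the jump is measured exactly at x = 0.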

Second

Next we examine the relationship between the infrastructure intervention and the female income share, which we expect to be positive.

[1] "Mass points detected in the running variable."

A sharp RD on this relationship indicates that the intervention increased the female income share by about 0.8 percentage points (the table below reports a rounded coefficient of 0.007):

Sharp RD estimates using local polynomial regression.

Number of Obs.                74368
BW type                       mserd
Kernel                   Triangular
VCE method                       NN

Number of Obs.                27514        46854
Eff. Number of Obs.           19105        29531
Order est. (p)                    1            1
Order bias  (q)                   2            2
BW est. (h)                   0.005        0.005
BW bias (b)                   0.013        0.013
rho (h/b)                     0.395        0.395
Unique Obs.                   14162        21781

=============================================================================
        Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
=============================================================================
  Conventional     0.007     0.002     4.404     0.000     [0.004 , 0.011]     
        Robust         -         -     4.445     0.000     [0.004 , 0.011]     
=============================================================================

Manual Fuzzy RD

From these results we can infer the effect of childbearing on the female income share. The first result (intervention on childbearing) plays the role of the first stage, and the second result (intervention on female income share) plays the role of the reduced form of a manually implemented fuzzy RD. The LATE of childbearing on the female income share is obtained as the ratio of the reduced-form effect to the first-stage effect. An increase of one percentage point in the childbearing probability leads to a decrease of about 2.22 percentage points in the female income share.
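The ratio can be reproduced approximately from the rounded coefficients printed in the two tables above. The sketch below uses only those rounded values (0.007 and -0.003), so it gives about -2.33 rather than the figure computed from the unrounded estimates; the variable names are ours.

```r
# Rounded point estimates taken from the two sharp RD tables above.
reduced_form <- 0.007   # intervention -> female income share
first_stage  <- -0.003  # intervention -> childbearing probability

# Wald-type ratio: effect of childbearing on the female income share.
wald_ratio <- reduced_form / first_stage
print(round(wald_ratio, 2))
```

Because the first-stage coefficient is small, rounding it to three decimals moves the ratio noticeably, which is one reason to prefer the one-step estimator for inference.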

Automatic Fuzzy RD

The one-step estimator delivers a similar result:

Fuzzy RD estimates using local polynomial regression.

Number of Obs.                74368
BW type                       mserd
Kernel                   Triangular
VCE method                       NN

Number of Obs.                27514        46854
Eff. Number of Obs.           18707        28908
Order est. (p)                    1            1
Order bias  (q)                   2            2
BW est. (h)                   0.005        0.005
BW bias (b)                   0.013        0.013
rho (h/b)                     0.362        0.362
Unique Obs.                   14162        21781

First-stage estimates.

=============================================================================
        Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
=============================================================================
  Conventional    -0.004     0.002    -2.180     0.029    [-0.007 , -0.000]    
        Robust         -         -    -2.244     0.025    [-0.008 , -0.001]    
=============================================================================

Treatment effect estimates.

=============================================================================
        Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
=============================================================================
  Conventional    -1.906     0.926    -2.058     0.040    [-3.721 , -0.091]    
        Robust         -         -    -1.964     0.050    [-3.726 , -0.004]    
=============================================================================

where the standard errors are clustered at the municipality level. An increase of one percentage point in the childbearing probability leads to a decrease of about 1.9 percentage points in the female income share.

Fuzzy RD from the 8 million individual observations

The following presents a first attempt to mimic the previous RD result with an IV approach on the individual data. This estimate does not include control variables or fixed effects of any kind. Standard errors are clustered at the municipality level.

Call:
ivreg(formula = fem_share ~ x + ch_bear | treat + x, data = ready2, 
    weights = weight, cluster = codmunic6)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.37884 -0.29733 -0.04471  0.14748  1.43100 

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  0.37884    0.01483  25.546  < 2e-16 ***
x            2.53954    0.32554   7.801 6.14e-15 ***
ch_bearTRUE -0.80984    0.07113 -11.385  < 2e-16 ***

Diagnostic tests:
                     df1     df2 statistic p-value    
Weak instruments       1 5035754     204.4  <2e-16 ***
Wu-Hausman             1 5035753     333.0  <2e-16 ***
Sargan                 0      NA        NA      NA    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3387 on Inf degrees of freedom
Multiple R-Squared: -1.637, Adjusted R-squared: -1.637 
Wald test:     2 on NA DF,  p-value: NA 

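The estimated LATE of childbearing on the female income share is about -0.81: a one percentage point increase in the childbearing likelihood is associated with a decrease of about 0.81 percentage points in the female income share.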
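For intuition on what the IV estimate does, the following base R sketch runs two-stage least squares by hand on simulated data. It mirrors the structure of the formula above (`treat` as the instrument, `x` as an exogenous regressor, `ch_bear` as the endogenous variable), but the data-generating process, the sample size, the true effect of -0.8 and the unobserved confounder `u` are all assumptions made for illustration. It ignores weights and clustering, and the standard errors of the second-stage `lm` are not valid for inference.

```r
set.seed(1)

# Simulated data (hypothetical): u is an unobserved confounder.
n       <- 100000
treat   <- rbinom(n, 1, 0.5)            # instrument
x       <- rnorm(n)                     # exogenous regressor
u       <- rnorm(n)
ch_bear <- as.numeric(0.8 * treat + 0.3 * x + u + rnorm(n) > 0)
fem_share <- -0.8 * ch_bear + 0.2 * x + u + rnorm(n)

# First stage: endogenous regressor on the instrument and exogenous regressor.
stage1      <- lm(ch_bear ~ treat + x)
ch_bear_hat <- fitted(stage1)

# Second stage: outcome on the fitted values (point estimate only).
stage2    <- lm(fem_share ~ x + ch_bear_hat)
late_2sls <- unname(coef(stage2)["ch_bear_hat"])

# Naive OLS for comparison, biased by the confounder u.
ols <- unname(coef(lm(fem_share ~ x + ch_bear))["ch_bear"])

print(c(two_sls = late_2sls, ols = ols))
```

In this simulation the confounder raises both childbearing and the outcome, so OLS is biased toward zero while the two-stage estimate recovers a value near the assumed effect.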