A re-analysis of regression discontinuity (RD) design on a discrete running variable for academic probation placements.

Reference: A Practical Introduction to Regression Discontinuity Designs Volume II Authors: Matias D. Cattaneo, Nicolas Idrobo and Rocio Titiunik

Package: https://sites.google.com/site/rdpackages/

Workstation Setup

FOREIGN: install.packages(‘foreign’); GGPLOT2: install.packages(‘ggplot2’); GRID: install.packages(‘grid’); LPDENSITY: install.packages(‘lpdensity’); RDDENSITY: install.packages(‘rddensity’); RDLOCRAND: install.packages(‘rdlocrand’); RDROBUST: install.packages(‘rdrobust’); TEACHINGDEMOS: install.packages(‘TeachingDemos’); GGTHEMES:install.package(‘ggthemes’) \(~\)


Workstation Setup

\(~\)

library(foreign)
library(ggplot2)
library(lpdensity)
library(rddensity)
library(rdrobust)
library(rdlocrand)
library(TeachingDemos)
library(ggthemes)

Get the dataset from desktop and re-define the main variables

\(~\)

library(readr)
data <- read_csv("CIT_2019_Cambridge_education.csv")
## Parsed with column specification:
## cols(
##   X = col_double(),
##   nextGPA = col_double(),
##   left_school = col_double(),
##   nextGPA_nonorm = col_double(),
##   T = col_double(),
##   hsgrade_pct = col_double(),
##   totcredits_year1 = col_double(),
##   age_at_entry = col_double(),
##   male = col_double(),
##   bpl_north_america = col_double(),
##   english = col_double(),
##   loc_campus1 = col_double(),
##   loc_campus2 = col_double(),
##   loc_campus3 = col_double(),
##   clustervar = col_double()
## )
nextGPA = data$nextGPA
left_school = data$left_school
nextGPA_nonorm = data$nextGPA_nonorm
X = data$X
T = data$T
T_X = T*X

Descriptive statistics for the main data (2010)

\(~\)

installed.packages("skimr")
##      Package LibPath Version Priority Depends Imports LinkingTo Suggests
##      Enhances License License_is_FOSS License_restricts_use OS_type Archs
##      MD5sum NeedsCompilation Built
library(skimr)
skim(data)
Data summary
Name data
Number of rows 44362
Number of columns 15
_______________________
Column type frequency:
numeric 15
________________________
Group variables None

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
X 0 1.00 -0.91 0.90 -2.8 -1.58 -0.98 -0.32 1.6 ▃▇▇▃▁
nextGPA 3780 0.91 1.05 0.92 -1.6 0.50 1.17 1.73 2.8 ▁▂▆▇▃
left_school 0 1.00 0.05 0.22 0.0 0.00 0.00 0.00 1.0 ▇▁▁▁▁
nextGPA_nonorm 3780 0.91 2.57 0.91 0.0 2.00 2.70 3.26 4.3 ▁▃▆▇▃
T 0 1.00 0.16 0.37 0.0 0.00 0.00 0.00 1.0 ▇▁▁▁▂
hsgrade_pct 0 1.00 50.17 28.86 1.0 25.00 50.00 75.00 100.0 ▇▇▇▇▇
totcredits_year1 0 1.00 4.57 0.51 3.0 4.00 5.00 5.00 6.5 ▁▃▇▁▁
age_at_entry 0 1.00 18.67 0.74 17.0 18.00 19.00 19.00 21.0 ▁▅▇▁▁
male 0 1.00 0.38 0.49 0.0 0.00 0.00 1.00 1.0 ▇▁▁▁▅
bpl_north_america 0 1.00 0.87 0.34 0.0 1.00 1.00 1.00 1.0 ▁▁▁▁▇
english 0 1.00 0.71 0.45 0.0 0.00 1.00 1.00 1.0 ▃▁▁▁▇
loc_campus1 0 1.00 0.58 0.49 0.0 0.00 1.00 1.00 1.0 ▆▁▁▁▇
loc_campus2 0 1.00 0.17 0.38 0.0 0.00 0.00 0.00 1.0 ▇▁▁▁▂
loc_campus3 0 1.00 0.24 0.43 0.0 0.00 0.00 0.00 1.0 ▇▁▁▁▂
clustervar 0 1.00 0.91 0.90 -1.6 0.32 0.98 1.58 2.8 ▁▃▇▇▃
***
### A Practical Intr oduction to Regression Disco ntinuity Designs , Volum e II, Se ction 3: RD Desi gns with Discrete Scores, Figure 3.1 - 1.1 in Report: Histogram
\(~\)
tempdata = as.data.frame(X); colnames(tempdata) = c("v1");
p = ggplot(data=tempdata, aes(tempdata$v1))+
  geom_histogram(breaks=seq(-2.8, 0, by = 0.1), col="black", fill="blue", alpha = 1)+
  geom_histogram(breaks=seq(0, 1.6, by = 0.1), col="black", fill="red", alpha = 1)+
  labs(x="Score", y="Number of Observations", title = "Histogram of Transformed Running Variable—LSO Data")+geom_vline(xintercept=0, color="black")+
  theme_economist()
p
## Warning: Use of `tempdata$v1` is discouraged. Use `v1` instead.

## Warning: Use of `tempdata$v1` is discouraged. Use `v1` instead.

# ggsave("./outputs/Vol-2-R_histogram.png", plot = p, width = 10, height = 5, units = "in")

Code snippet 1: Counting the number of observations with X different from missing

\(~\)

txtStart("./outputs/Vol-2-R_lso_countX.txt", commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
length(X)
## [1] 44362
txtStop()

Code snippet 2: Counting the unique values of X

\(~\)

txtStart("./outputs/Vol-2-R_lso_uniqueX.txt", commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
length(unique(X))
## [1] 430
txtStop()

Code snippet 3: Using rddensity

\(~\)

txtStart("./outputs/Vol-2-R_lso_falsification_rddensity.txt", commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
out = rddensity(X)
summary(out)
## 
## RD Manipulation Test using local polynomial density estimation.
## 
## Number of obs =       44362
## Model =               unrestricted
## Kernel =              triangular
## BW method =           comb
## VCE method =          jackknife
## 
## Cutoff c = 0          Left of c           Right of c          
## Number of obs         37211               7151                
## Eff. Number of obs    10083               4137                
## Order est. (p)        2                   2                   
## Order bias (q)        3                   3                   
## BW est. (h)           0.706               0.556               
## 
## Method                T                   P > |T|             
## Robust                -0.4544             0.6496
txtStop()

Code snippet 4: Using rdrobust on hsgrade_pct

\(~\)

txtStart("./outputs/Vol-2-R_lso_falsification_rdrobust_hsgrade_pct.txt", commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
out = rdrobust(data$hsgrade_pct, X)
## [1] "Mass points detected in the running variable."
summary(out)
## Call: rdrobust
## 
## Number of Obs.                44362
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.               37211        7151
## Eff. Number of Obs.           6934        3972
## Order est. (p)                   1           1
## Order bias  (q)                  2           2
## BW est. (h)                  0.518       0.518
## BW bias (b)                  0.801       0.801
## rho (h/b)                    0.646       0.646
## Unique Obs.                    274         156
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional     1.140     0.961     1.186     0.235    [-0.743 , 3.023]     
##         Robust         -         -     1.140     0.254    [-0.943 , 3.566]     
## =============================================================================
txtStop()

Code snippet 5: (Figure 3.2)-Figure 2 in report: Using rdplot on hsgrade_pct

\(~\)

txtStart("./outputs/Vol-2-R_lso_falsification_rdplot_hsgrade_pct-COMMAND-ONLY.txt",
         commands=TRUE, results=FALSE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
rdplot(data$hsgrade_pct, X, x.label = "Score", y.label = "Frequency Count", title="Regression Discontinuity Plot for High School Grade Percentile - LSO Data",col.dots ="blue", col.lines ="red")

txtStop()

# png("./outputs/Vol-2-R_lso_falsification_rdplot_hsgrade_pct.png")
rdplot(data$hsgrade_pct, X, x.label = "Score", y.label = "Frequency Count", title="Regression Discontinuity Plot for 'hsgrade_pct' - LSO Data")

dev.off()
## null device 
##           1

Table 3.3 in A Practical Introduction to Regression Discontinuity Designs, Volume II

\(~\)

summary(rdrobust(data$hsgrade_pct, X))
## [1] "Mass points detected in the running variable."
## Call: rdrobust
## 
## Number of Obs.                44362
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.               37211        7151
## Eff. Number of Obs.           6934        3972
## Order est. (p)                   1           1
## Order bias  (q)                  2           2
## BW est. (h)                  0.518       0.518
## BW bias (b)                  0.801       0.801
## rho (h/b)                    0.646       0.646
## Unique Obs.                    274         156
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional     1.140     0.961     1.186     0.235    [-0.743 , 3.023]     
##         Robust         -         -     1.140     0.254    [-0.943 , 3.566]     
## =============================================================================
summary(rdrobust(data$totcredits_year1, X))
## [1] "Mass points detected in the running variable."
## Call: rdrobust
## 
## Number of Obs.                44362
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.               37211        7151
## Eff. Number of Obs.           3995        2799
## Order est. (p)                   1           1
## Order bias  (q)                  2           2
## BW est. (h)                  0.322       0.322
## BW bias (b)                  0.609       0.609
## rho (h/b)                    0.529       0.529
## Unique Obs.                    274         156
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional     0.077     0.028     2.722     0.006     [0.022 , 0.133]     
##         Robust         -         -     2.768     0.006     [0.026 , 0.151]     
## =============================================================================
summary(rdrobust(data$age_at_entry, X))
## [1] "Mass points detected in the running variable."
## Call: rdrobust
## 
## Number of Obs.                44362
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.               37211        7151
## Eff. Number of Obs.           6444        3798
## Order est. (p)                   1           1
## Order bias  (q)                  2           2
## BW est. (h)                  0.492       0.492
## BW bias (b)                  0.755       0.755
## rho (h/b)                    0.652       0.652
## Unique Obs.                    274         156
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional     0.017     0.033     0.511     0.609    [-0.048 , 0.081]     
##         Robust         -         -     0.528     0.598    [-0.057 , 0.098]     
## =============================================================================
summary(rdrobust(data$male, X))
## [1] "Mass points detected in the running variable."
## Call: rdrobust
## 
## Number of Obs.                44362
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.               37211        7151
## Eff. Number of Obs.           6825        3932
## Order est. (p)                   1           1
## Order bias  (q)                  2           2
## BW est. (h)                  0.508       0.508
## BW bias (b)                  0.828       0.828
## rho (h/b)                    0.614       0.614
## Unique Obs.                    274         156
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional    -0.012     0.021    -0.561     0.575    [-0.053 , 0.029]     
##         Robust         -         -    -0.687     0.492    [-0.066 , 0.032]     
## =============================================================================
summary(rdrobust(data$bpl_north_america, X))
## [1] "Mass points detected in the running variable."
## Call: rdrobust
## 
## Number of Obs.                44362
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.               37211        7151
## Eff. Number of Obs.           7171        4068
## Order est. (p)                   1           1
## Order bias  (q)                  2           2
## BW est. (h)                  0.536       0.536
## BW bias (b)                  0.858       0.858
## rho (h/b)                    0.624       0.624
## Unique Obs.                    274         156
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional     0.015     0.014     1.077     0.282    [-0.012 , 0.043]     
##         Robust         -         -     1.091     0.275    [-0.015 , 0.051]     
## =============================================================================
summary(rdrobust(data$english, X))
## [1] "Mass points detected in the running variable."
## Call: rdrobust
## 
## Number of Obs.                44362
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.               37211        7151
## Eff. Number of Obs.           7588        4206
## Order est. (p)                   1           1
## Order bias  (q)                  2           2
## BW est. (h)                  0.568       0.568
## BW bias (b)                  0.920       0.920
## rho (h/b)                    0.618       0.618
## Unique Obs.                    274         156
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional    -0.035     0.019    -1.899     0.058    [-0.072 , 0.001]     
##         Robust         -         -    -1.809     0.070    [-0.083 , 0.003]     
## =============================================================================
summary(rdrobust(data$loc_campus1, X))
## [1] "Mass points detected in the running variable."
## Call: rdrobust
## 
## Number of Obs.                44362
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.               37211        7151
## Eff. Number of Obs.           3917        2655
## Order est. (p)                   1           1
## Order bias  (q)                  2           2
## BW est. (h)                  0.313       0.313
## BW bias (b)                  0.534       0.534
## rho (h/b)                    0.586       0.586
## Unique Obs.                    274         156
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional    -0.016     0.027    -0.596     0.551    [-0.069 , 0.037]     
##         Robust         -         -    -0.854     0.393    [-0.089 , 0.035]     
## =============================================================================
summary(rdrobust(data$loc_campus2, X))
## [1] "Mass points detected in the running variable."
## Call: rdrobust
## 
## Number of Obs.                44362
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.               37211        7151
## Eff. Number of Obs.           6934        3972
## Order est. (p)                   1           1
## Order bias  (q)                  2           2
## BW est. (h)                  0.517       0.517
## BW bias (b)                  0.795       0.795
## rho (h/b)                    0.650       0.650
## Unique Obs.                    274         156
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional    -0.016     0.017    -0.899     0.369    [-0.050 , 0.018]     
##         Robust         -         -    -0.863     0.388    [-0.059 , 0.023]     
## =============================================================================
summary(rdrobust(data$loc_campus3, X))
## [1] "Mass points detected in the running variable."
## Call: rdrobust
## 
## Number of Obs.                44362
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.               37211        7151
## Eff. Number of Obs.           3623        2519
## Order est. (p)                   1           1
## Order bias  (q)                  2           2
## BW est. (h)                  0.292       0.292
## BW bias (b)                  0.523       0.523
## rho (h/b)                    0.559       0.559
## Unique Obs.                    274         156
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional     0.032     0.027     1.203     0.229    [-0.020 , 0.085]     
##         Robust         -         -     1.427     0.154    [-0.016 , 0.104]     
## =============================================================================

Figure 3.3:Figure 3.3: RD Plots for Predetermined Covariates—LSO Application, 1.3 in the report: High School Grade Percentile

\(~\)

# png("./outputs/Vol-2-R_lso_falsification_rdplot_hsgrade_pct.png")
rdplot(data$hsgrade_pct, X, x.label = "Score", y.label = "Frequency Count", title="High School Grade Percentile")

dev.off()
## null device 
##           1

Figure 3.3:Figure 3.3: RD Plots for Predetermined Covariates—LSO Application, 1.4 in the report: Total Credits in First Year

\(~\)

# png("./outputs/Vol-2-R_lso_falsification_rdplot_totcredits_year1.png")
rdplot(data$totcredits_year1, X, x.label = "Score", y.label = "Frequency Count", title="Total Credits in First Year")

dev.off()
## null device 
##           1

Figure 3.3:Figure 3.3: RD Plots for Predetermined Covariates—LSO Application, 1.5 in the

\(~\)

# png("./outputs/Vol-2-R_lso_falsification_rdplot_age_at_entry.png")
rdplot(data$age_at_entry, X, x.label = "Score", y.label = "Frequency Count", title="Age at Entry")

dev.off()
## null device 
##           1

Figure 3.3:Figure 3.3: RD Plots for Predetermined Covariates—LSO Application, 1.6 in the report: English First Language Indicator

\(~\)

# png("./outputs/Vol-2-R_lso_falsification_rdplot_english.png")
rdplot(data$english, X, x.label = "Score", y.label = "Frequency Count", title="English First Language Indicator")

dev.off()
## null device 
##           1

Figure 3.3:Figure 3.3: RD Plots for Predetermined Covariates—LSO Application, 1.7 in the report: Male Indicator

\(~\)

# png("./outputs/Vol-2-R_lso_falsification_rdplot_male.png")
rdplot(data$male, X, x.label = "Score", y.label = "Frequency Count", title="Male Indicator")

dev.off()
## null device 
##           1

Figure 3.3:Figure 3.3: RD Plots for Predetermined Covariates—LSO Application, 1.8 in the report: Born in North America Indicator

\(~\)

# png("./outputs/Vol-2-R_lso_falsification_rdplot_bpl_north_america.png")
rdplot(data$bpl_north_america, X, x.label = "Score", y.label = "Frequency Count", title="Born in North America Indicator")

dev.off()
## null device 
##           1

Code snippet 6 (Figure 3.4) - 1.9 in the report: Using rdplot on the outcome - Figure 1.9: RD Plot for nextGPA—LSO Data

\(~\)

txtStart("./outputs/Vol-2-R_lso3_rdplot_esmv.txt", commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
out = rdplot(nextGPA_nonorm, X,  binselect = 'esmv')

summary(out)
## Call: rdplot
## 
## Number of Obs.                40582
## Kernel                      Uniform
## 
## Number of Obs.               34854           5728
## Eff. Number of Obs.          34854           5728
## Order poly. fit (p)              4              4
## BW poly. fit (h)             2.800          1.600
## Number of bins scale             1              1
## 
## Bins Selected                  690            362
## Average Bin Length           0.004          0.004
## Median Bin Length            0.004          0.004
## 
## IMSE-optimal bins               44             13
## Mimicking Variance bins        690            362
## 
## Relative to IMSE-optimal:
## Implied scale               15.682         27.846
## WIMSE variance weight        0.000          0.000
## WIMSE bias weight            1.000          1.000
txtStop()

# pdf("./outputs/Vol-2-R_lso3_rdplot_esmv.pdf")
rdplot(nextGPA_nonorm, X,  binselect = 'esmv', x.label = 'Score', y.label = 'Outcome', title = 'Regression Discontinuity Plot for nextGPA—LSO Data')

dev.off()
## null device 
##           1

Code snippet 7: Using rdrobust on the outcome

\(~\)

txtStart("./outputs/Vol-2-R_lso3_rdrobust_triangular_mserd_p1_rhofree_regterm1.txt",
         commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
out = rdrobust(nextGPA_nonorm, X, kernel = 'triangular',  p = 1, bwselect = 'mserd')
## [1] "Mass points detected in the running variable."
summary(out)
## Call: rdrobust
## 
## Number of Obs.                40582
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.               34854        5728
## Eff. Number of Obs.           5612        3164
## Order est. (p)                   1           1
## Order bias  (q)                  2           2
## BW est. (h)                  0.470       0.470
## BW bias (b)                  0.746       0.746
## rho (h/b)                    0.630       0.630
## Unique Obs.                    274         155
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional     0.224     0.038     5.852     0.000     [0.149 , 0.299]     
##         Robust         -         -     4.726     0.000     [0.126 , 0.304]     
## =============================================================================
txtStop()

Code snippet 8: Using rdrobust and showing the objects it returns

\(~\)

txtStart("./outputs/Vol-2-R_lso3_rdrobust_triangular_mserd_p1_rhofree_regterm1_namescoefsout_all.txt",
         commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
rdout = rdrobust(nextGPA_nonorm, X, kernel = 'triangular', p = 1, bwselect = 'mserd')
## [1] "Mass points detected in the running variable."
print(names(rdout))
##  [1] "Estimate"   "bws"        "coef"       "se"         "z"         
##  [6] "pv"         "ci"         "beta_p_l"   "beta_p_r"   "V_cl_l"    
## [11] "V_cl_r"     "V_rb_l"     "V_rb_r"     "N"          "N_h"       
## [16] "N_b"        "M"          "tau_cl"     "tau_bc"     "c"         
## [21] "p"          "q"          "bias"       "kernel"     "all"       
## [26] "vce"        "bwselect"   "level"      "masspoints" "call"
print(rdout$beta_p_r)
##            [,1]
## [1,]  2.0681763
## [2,] -0.6804732
print(rdout$beta_p_l)
##            [,1]
## [1,]  1.8444877
## [2,] -0.6853278
txtStop()

Code snippet 9: Using rdrobust with clustered standard errors

\(~\)

txtStart("./outputs/Vol-2-R_lso3_rdrobust_triangular_mserd_p1_rhofree_regterm1_cluster.txt",
         commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
clustervar = X
out = rdrobust(nextGPA_nonorm, X, kernel = 'triangular', p = 1, bwselect = 'mserd', vce = 'hc0', cluster = clustervar)
## [1] "Mass points detected in the running variable."
summary(out)
## Call: rdrobust
## 
## Number of Obs.                40582
## BW type                       mserd
## Kernel                   Triangular
## VCE method                      HC0
## 
## Number of Obs.               34854        5728
## Eff. Number of Obs.           5008        3016
## Order est. (p)                   1           1
## Order bias  (q)                  2           2
## BW est. (h)                  0.428       0.428
## BW bias (b)                  0.696       0.696
## rho (h/b)                    0.616       0.616
## Unique Obs.                    274         155
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional     0.221     0.032     6.991     0.000     [0.159 , 0.283]     
##         Robust         -         -     5.768     0.000     [0.140 , 0.284]     
## =============================================================================
txtStop()

Code snippet 10: Using rdrobust on the collapsed data

\(~\)

txtStart("./outputs/Vol-2-R_lso3_rdrobust_collapsed.txt", commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
data2 = data.frame(nextGPA_nonorm, X)
dim(data2)
## [1] 44362     2
collapsed = aggregate(nextGPA_nonorm ~ X, data = data2, mean)
dim(collapsed)
## [1] 429   2
out = rdrobust(collapsed$nextGPA_nonorm, collapsed$X)
summary(out)
## Call: rdrobust
## 
## Number of Obs.                  429
## BW type                       mserd
## Kernel                   Triangular
## VCE method                       NN
## 
## Number of Obs.                 274         155
## Eff. Number of Obs.             51          50
## Order est. (p)                   1           1
## Order bias  (q)                  2           2
## BW est. (h)                  0.508       0.508
## BW bias (b)                  0.805       0.805
## rho (h/b)                    0.630       0.630
## Unique Obs.                    274         155
## 
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]       
## =============================================================================
##   Conventional     0.246     0.033     7.422     0.000     [0.181 , 0.310]     
##         Robust         -         -     6.083     0.000     [0.163 , 0.319]     
## =============================================================================
txtStop()

Code snippet 11: Binomial test with rdwinselect

\(~\)

txtStart("./outputs/Vol-2-R_lso_falsification_binomial.txt", commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
out = rdwinselect(X, wmin = 0.01, nwindows = 1, cutoff = 5.00000000000e-06)
## Mass points detected in running variable
## 
## 
## Window selection for RD under local randomization 
## 
## Number of obs     =         44362
## Order of poly     =             0
## Kernel type       =       uniform
## Reps              =          1000
## Testing method    =     rdrandinf
## Balance test      =     diffmeans
## 
## Cutoff c =    0.000   Left of c  Right of c
##       Number of obs       37211        7151
##      1st percentile         298           0
##      5th percentile        1817         269
##     10th percentile        3829         663
##     20th percentile        7588        1344
## 
## ================================================================================
##   Window length / 2   p-value        Var. name    Bin.test      Obs<c     Obs>=c
## ================================================================================
##               0.010         NA              NA       0.000        228         77
## ================================================================================
## Note: no covariates specified.
## Need to specify covariates to find recommended length.
txtStop()

Binomial Test by Hand

\(~\)

txtStart("./outputs/Vol-2-R_lso_falsification_binomial_byhand.txt", commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
binom.test(77, 305, 1/2)
## 
##  Exact binomial test
## 
## data:  77 and 305
## number of successes = 77, number of trials = 305, p-value < 2.2e-16
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
##  0.2046807 0.3051175
## sample estimates:
## probability of success 
##               0.252459
txtStop()

Code snippet 13: Using rdrandinf on hsgrade_pct

\(~\)

txtStart("./outputs/Vol-2-R_lso_falsification_rdrandinf_hsgrade_pct.txt", commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
out = rdrandinf(data$hsgrade_pct, X, wl = -0.005, wr = 0.01, seed = 50)
## 
## Selected window = [-0.005;0.01] 
## 
## Running randomization-based test...
## 
## Randomization-based test complete. 
## 
## 
## Number of obs     =         44362
## Order of poly     =             0
## Kernel type       =       uniform
## Reps              =          1000
## Window            =   set by user
## H0:          tau  =             0
## Randomization     = fixed margins
## 
## Cutoff c =    0.000   Left of c  Right of c
##       Number of obs       37211        7151
##  Eff. number of obs         228          77
##     Mean of outcome      29.118      32.675
##     S.d. of outcome      21.737      21.625
##              Window      -0.005       0.010
## 
## ================================================================================
##                                   Finite sample            Large sample         
##                                ------------------  -----------------------------
##           Statistic          T        P>|T|        P>|T|    Power vs d =  10.868
## ================================================================================
##      Diff. in means      3.557        0.223        0.213                   0.968
## ================================================================================
txtStop()

Code snippet 14: Using rdwinselect with covariates and windows with equal number of mass points to each side of the cutoff

\(~\)

txtStart("./outputs/Vol-2-R_lso_rdwinselect_consecutive_windows.txt", commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
Z = cbind(data$hsgrade_pct, data$totcredits_year1, data$age_at_entry, data$male,
          data$bpl_north_america, data$english, data$loc_campus1, data$loc_campus2, 
          data$loc_campus3)
colnames(Z) = c("hsgrade_pct", "totcredits_year1", "age_at_entry", "male",
                "bpl_north_america", "english", "loc_campus1", "loc_campus2",
                "loc_campus3")
out = rdwinselect(X, Z, p = 1, seed = 50, wmin = 0.01, wstep = 0.01, cutoff = 5.00000000000e-06)
## Mass points detected in running variable
## 
## 
## Window selection for RD under local randomization 
## 
## Number of obs     =         44362
## Order of poly     =             1
## Kernel type       =       uniform
## Reps              =          1000
## Testing method    =     rdrandinf
## Balance test      =     diffmeans
## 
## Cutoff c =    0.000   Left of c  Right of c
##       Number of obs       37211        7151
##      1st percentile         298           0
##      5th percentile        1817         269
##     10th percentile        3829         663
##     20th percentile        7588        1344
## 
## ================================================================================
##   Window length / 2   p-value        Var. name    Bin.test      Obs<c     Obs>=c
## ================================================================================
##               0.010      0.005     loc_campus3       0.000        228         77
##               0.020      0.000 totcredits_year       0.000        298        214
##               0.030      0.000     hsgrade_pct       0.000        374        269
##               0.040      0.000     loc_campus1       0.000        494        375
##               0.050      0.000 totcredits_year       0.000        636        418
##               0.060      0.000 totcredits_year       0.000        714        497
##               0.070      0.000     hsgrade_pct       0.000        807        663
##               0.080      0.000 totcredits_year       0.000        877        727
##               0.090      0.000 totcredits_year       0.000       1049        815
##               0.100      0.000 totcredits_year       0.001       1131        973
## ================================================================================
## Smallest window does not pass covariate test.
## Decrease smallest window or reduce level.
txtStop()

Code snippet 15: Using rdrandinf on the Outcome

\(~\)

txtStart("./outputs/Vol-2-R_lso3_rdrandinf_adhocsmall_p0.txt", commands=TRUE, results=TRUE, append=FALSE, visible.only=TRUE)
## Output being copied to text file,
## use txtStop to end
out = rdrandinf(nextGPA_nonorm, X, wl = -0.005, wr = 0.01, seed = 50)
## 
## Selected window = [-0.005;0.01] 
## 
## Running randomization-based test...
## 
## Randomization-based test complete. 
## 
## 
## Number of obs     =         40582
## Order of poly     =             0
## Kernel type       =       uniform
## Reps              =          1000
## Window            =   set by user
## H0:          tau  =             0
## Randomization     = fixed margins
## 
## Cutoff c =    0.000   Left of c  Right of c
##       Number of obs       34854        5728
##  Eff. number of obs         208          67
##     Mean of outcome       1.830       2.063
##     S.d. of outcome       0.868       0.846
##              Window      -0.005       0.010
## 
## ================================================================================
##                                   Finite sample            Large sample         
##                                ------------------  -----------------------------
##           Statistic          T        P>|T|        P>|T|    Power vs d =   0.434
## ================================================================================
##      Diff. in means      0.234        0.058        0.051                   0.952
## ================================================================================
txtStop()

Reference

Greg Snow (2016). TeachingDemos: Demonstrations for Teaching and Learning. R package version 2.10. https://CRAN.R-project.org/package=TeachingDemos

H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.

Jeffrey B. Arnold (2019). ggthemes: Extra Themes, Scales and Geoms for ‘ggplot2’. R package version 4.2.0. https://CRAN.R-project.org/package=ggthemes

Matias D. Cattaneo, Michael Jansson and Xinwei Ma (2019). lpdensity: Local Polynomial Density Estimation and Inference. R package version 1.0. https://CRAN.R-project.org/package=lpdensity

Matias D. Cattaneo, Michael Jansson and Xinwei Ma (2019). rddensity: Manipulation Testing Based on Density Discontinuity. R package version 1.0. https://CRAN.R-project.org/package=rddensity

Matias D. Cattaneo, Rocio Titiunik and Gonzalo Vazquez-Bare (2019). rdlocrand: Local Randomization Methods for RD Designs. R package version 0.5. https://CRAN.R-project.org/package=rdlocrand

R Core Team (2020). foreign: Read Data Stored by ‘Minitab’, ‘S’, ‘SAS’, ‘SPSS’, ‘Stata’, ‘Systat’, ‘Weka’, ‘dBase’, …. R package version 0.8-75. https://CRAN.R-project.org/package=foreign

Sebastian Calonico, Matias D. Cattaneo, Max H. Farrell and Rocio Titiunik (2020). rdrobust: Robust Data-Driven Statistical Inference in Regression-Discontinuity Designs. R package version 0.99.6. https://CRAN.R-project.org/package=rdrobust