Admissions analysis

What questions do we want to ask of this data with regard to equity? At the highest level, how do admissions for URM students compare with admissions for non-URM students? My first stab:

Is there disparate attrition at any step of the admissions process (URM vs non-URM)?
How does the BGS applicant pool compare to the demographic composition of the US?

For simplicity, I’m starting with BGS-wide numbers for individual years. We can get into individual graduate groups and the panel data later.

Figures

Alluvial plots

Applicants 2018-2020: All BGS

URM applicants 2018-2020: All BGS

EA Admissions

EA applicants by year

EA applicants: table 1

Two-sample bar graphs

BMB applicants (2018-2020)

CAMB applicants (2018-2020)

GCB applicants (2018-2020)

GGEB-EPID applicants (2018-2020)

GGEB-BSTA applicants (2018-2020)

IGG applicants (2018-2020)

PGG applicants (2018-2020)

NGG applicants (2018-2020)

Disparate attrition

Year: 2020

We’ll use two-sample tests of proportions and only look at 2020 data.
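As a minimal sketch of how the three 2020 tests in this section could be run, the snippet below rebuilds the counts from the printed tables and calls `prop.test` once per step. The data frame name `funnel_2020` and the helper `step_test` are hypothetical names introduced for illustration. The original analysis passed a successes/failures matrix to `prop.test`, which should give the same result as the `x`/`n` form used here.

```r
# 2020 counts copied from the tables in this section.
funnel_2020 <- data.frame(
  urm            = c("non_urm", "urm"),
  n_applied      = c(1214, 278),
  n_interviewed  = c(224, 71),
  n_admitted     = c(184, 62),
  n_matriculated = c(81, 27)
)

# Hypothetical helper: two-sample test of proportions for one funnel step.
# `success` and `base` are column names; the continuity correction is
# prop.test's default, matching the printed output.
step_test <- function(df, success, base) {
  prop.test(x = df[[success]], n = df[[base]])
}

interviewed  <- step_test(funnel_2020, "n_interviewed",  "n_applied")
admitted     <- step_test(funnel_2020, "n_admitted",     "n_interviewed")
matriculated <- step_test(funnel_2020, "n_matriculated", "n_admitted")

# p-values for the three steps, in funnel order.
round(c(interviewed$p.value, admitted$p.value, matriculated$p.value), 4)
```

Note that the confidence interval `prop.test` reports is for the first row minus the second, so here it is non-URM minus URM.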

For 2020, BGS actually interviewed a significantly higher proportion of URM applicants than non-URM applicants (25.5% vs 18.5%; 95% CI for the difference in proportions, URM minus non-URM, (0.013, 0.129), p = 0.0095), and admitted and matriculated the two groups in similar proportions (p = 0.40 and p = 1, respectively).

Note that this difference in interview rates is not explained by GPA: the mean GPA for URM applicants is lower than for non-URM applicants.

But has it always been that way? Again, saving more sophisticated time series analyses for later, I’ll just look at the year 2009.

Interviewed/applied

##         urm n_applied n_interviewed failures
## 435 non_urm      1214           224      990
## 436     urm       278            71      207

## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  as.matrix(test[, c(3:4)])
## X-squared = 6.7246, df = 1, p-value = 0.009509
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.12880491 -0.01295845
## sample estimates:
##    prop 1    prop 2 
## 0.1845140 0.2553957

Admitted/interviewed

##         urm n_interviewed n_admitted failures
## 435 non_urm           224        184       40
## 436     urm            71         62        9

## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  as.matrix(test[, c(3:4)])
## X-squared = 0.70424, df = 1, p-value = 0.4014
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.15330536  0.04968363
## sample estimates:
##    prop 1    prop 2 
## 0.8214286 0.8732394

Matriculated/admitted

##         urm n_admitted n_matriculated failures
## 435 non_urm        184             81      103
## 436     urm         62             27       35

## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  as.matrix(test[, c(3:4)])
## X-squared = 1.7545e-30, df = 1, p-value = 1
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.1427467  0.1522137
## sample estimates:
##    prop 1    prop 2 
## 0.4402174 0.4354839

Year: 2009

In 2009, BGS interviewed and matriculated URM applicants in similar proportions to non-URM applicants. But of those they interviewed, they admitted a significantly lower proportion of URM folks (60.6% vs 95.5%; 95% CI for the difference in proportions, non-URM minus URM, (0.16, 0.54), p < 0.001).

We should definitely take a look at this step in the admissions process across all years.
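One possible way to set up that across-years look is sketched below. It assumes the admitted/interviewed counts are stacked in a long data frame with one row per year and group. Only the 2009 and 2020 rows reported in this section are filled in, and the names `admit_step` and `by_year` are hypothetical.

```r
# Admitted/interviewed counts for the two years shown in this section.
# Rows for other years would be appended in the same layout.
admit_step <- data.frame(
  year          = c(2009, 2009, 2020, 2020),
  urm           = c("non_urm", "urm", "non_urm", "urm"),
  n_interviewed = c(201, 33, 224, 71),
  n_admitted    = c(192, 20, 184, 62)
)

# One summary row per year: admit rate per group, difference, and p-value.
by_year <- do.call(rbind, lapply(split(admit_step, admit_step$year), function(d) {
  d <- d[order(d$urm), ]  # non_urm first, urm second
  # prop.test warns when an expected count is below 5 (as in 2009);
  # the warning is silenced here only to keep the sketch's output short.
  res <- suppressWarnings(prop.test(d$n_admitted, d$n_interviewed))
  data.frame(
    year         = d$year[1],
    rate_non_urm = unname(res$estimate[1]),
    rate_urm     = unname(res$estimate[2]),
    diff         = unname(res$estimate[1] - res$estimate[2]),
    p_value      = res$p.value
  )
}))

by_year
```

Running many yearly tests raises a multiple-comparisons question that this sketch does not address.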

Interviewed/applied

##         urm n_applied n_interviewed failures
## 391 non_urm       566           201      365
## 392     urm        77            33       44

## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  as.matrix(test[, c(3:4)])
## X-squared = 1.2782, df = 1, p-value = 0.2582
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.19817897  0.05128346
## sample estimates:
##    prop 1    prop 2 
## 0.3551237 0.4285714

Admitted/interviewed

##         urm n_interviewed n_admitted failures
## 391 non_urm           201        192        9
## 392     urm            33         20       13

## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  as.matrix(test[, c(3:4)])
## X-squared = 36.576, df = 1, p-value = 1.468e-09
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  0.1623795 0.5359471
## sample estimates:
##    prop 1    prop 2 
## 0.9552239 0.6060606
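Only 9 non-URM and 13 URM interviewees were not admitted in 2009, so the chi-squared approximation behind `prop.test` is on thin ground for this step. As a hedged robustness check, and not part of the original analysis, the sketch below computes the expected counts and runs Fisher's exact test on the same 2x2 table. The object names are hypothetical.

```r
# 2009 admitted/interviewed counts from this section:
# rows are groups, columns are admitted / not admitted.
admit_2009 <- matrix(
  c(192,  9,
     20, 13),
  nrow = 2, byrow = TRUE,
  dimnames = list(group   = c("non_urm", "urm"),
                  outcome = c("admitted", "not_admitted"))
)

# Expected counts under independence. The smallest (URM, not admitted)
# is 33 * 22 / 234, about 3.1, below the usual rule-of-thumb of 5.
expected <- outer(rowSums(admit_2009), colSums(admit_2009)) / sum(admit_2009)

# Fisher's exact test does not rely on the chi-squared approximation.
exact <- fisher.test(admit_2009)
exact$p.value
```

If the exact p-value is also very small, the conclusion for this step does not hinge on the approximation.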

Matriculated/admitted

##         urm n_admitted n_matriculated failures
## 391 non_urm        192             78      114
## 392     urm         20              7       13

## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  as.matrix(test[, c(3:4)])
## X-squared = 0.061882, df = 1, p-value = 0.8035
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.1916327  0.3041327
## sample estimates:
##  prop 1  prop 2 
## 0.40625 0.35000

US population comparison

It’s also a problem if BGS isn’t properly recruiting applicants, e.g., if the applicant pool looks nothing like the overall US population.

For this portion, I used US Census data for 2010-2019 (2020 data doesn’t exist yet, so I used 2019 in its place). The Census race and ethnicity categories are what the OMB and Penn use to determine URM status. I calculated the proportion of the US population that is considered ‘URM’ (people who identify as Black or African American, Hispanic/Latinx, American Indian or Alaska Native/Indigenous folks, and Native Hawaiians and other Pacific Islanders) and compared it to the proportion of applicants categorized as such. A one-sample test of proportions makes this comparison.

The proportion of the 2020 BGS applicant pool (excluding international applicants) that is URM is significantly lower than the US population proportion: 18.6% of applicants (95% CI (0.17, 0.21)) vs 33.4% nationally, p < 0.001.
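A minimal sketch of the one-sample comparison follows, using the 2020 counts and the national proportion printed in this section. The variable names are hypothetical. The denominator is taken to be URM plus non-URM applicants, which is what the `trials` column (1492) implies.

```r
# 2020 counts from the applicant-pool table in this section.
urm_applicants     <- 278
non_urm_applicants <- 1214
natl_prop_urm      <- 0.3340315  # national URM proportion used in this analysis

# International applicants are left out of the denominator,
# matching the `trials` column (278 + 1214 = 1492).
trials <- urm_applicants + non_urm_applicants

# One-sample test of the applicant URM proportion against the national value.
one_sample <- prop.test(x = urm_applicants, n = trials, p = natl_prop_urm)

one_sample$estimate                          # applicant-pool URM proportion
one_sample$conf.int                          # CI for that proportion
unname(one_sample$estimate) - natl_prop_urm  # gap relative to the national value
```

The interval returned here is for the applicant proportion itself, not for the gap. The gap is the estimate minus the fixed national value.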

2020 applicant pool

##   year      gg    metric  all urm non_urm international natl_prop_urm
## 1 2020 all_bgs n_applied 2879 278    1214          1387     0.3340315
##   app_prop_urm trials
## 1    0.1863271   1492

## 
##  1-sample proportions test with continuity correction
## 
## data:  test$urm out of test$trials, null probability test$natl_prop_urm
## X-squared = 145.66, df = 1, p-value < 2.2e-16
## alternative hypothesis: true p is not equal to 0.3340315
## 95 percent confidence interval:
##  0.1670637 0.2072288
## sample estimates:
##         p 
## 0.1863271

January 07, 2021
