library(lme4)

## Loading required package: Matrix

library(car)

## Loading required package: carData

library(knitr)
library(viridis)

## Loading required package: viridisLite

library(haven)
library(tidyverse)
library(GGally)

## Registered S3 method overwritten by 'GGally':
##   method from   
##   +.gg   ggplot2

theme_set(theme_bw())

d <- read_sav("The Case for Div in Higher Ed- University Codes and Gradrates Study 7-8.sav")

Starck et al. (2021) report that instrumental rationales for diversity tend to be preferred by White participants and dispreferred by Black participants. They also report that universities with instrumental diversity rationales had higher White-Black graduation disparities.

We coded university websites and surveyed admissions staff to determine that, never- theless, instrumental diversity rationales are more prevalent than moral ones are and that they are indeed associated with increas- ing White–Black graduation disparities, particularly among univer- sities with low levels of moral rationale use.

This RMarkdown examines the last study, the observational data analysis of university rationales.

Conclusion: we were able to reproduce the analyses reported in the study.

Initial visual exploration

Let’s begin by plotting the univariate relationship between Black graduation rates and instrumental diversity rationale presence.

ggplot(d, 
       aes(x = InstrumentalCode, y = Black.gradrates)) + 
  geom_jitter(width = .05, height = 0, alpha = .5) + 
  geom_smooth(method = "lm") + 
  xlab("Instrumental rationale rating") + 
  ylab("Black graduation rate")

## `geom_smooth()` using formula 'y ~ x'

## Warning: Removed 1 rows containing non-finite values (stat_smooth).

## Warning: Removed 1 rows containing missing values (geom_point).

We don’t see any relationship, which is a bit worrisome. On the other hand, the figure in the paper looks like this:

knitr::include_graphics("graph.png")

These are model-based, but let’s start by examining this pattern in the raw data. One decision is how to set the levels of morality. We can see that the distribution is skewed, so this is a bit tricky.

ggplot(d, aes(x = MoralCode)) + 
  geom_histogram()

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## Warning: Removed 1 rows containing non-finite values (stat_bin).

I can’t figure out exactly how this was done in the paper - perhaps it’s just +/- 1SD in the model predictions with standardized predictors? For starters here, let’s divide into < 2, 2-3, and 4; we could adjust this later. One observation is missing.

d$moral_cut <- fct_recode(cut(d$MoralCode, c(0,2,3,4)), 
                          "low moral (0-2]" = "(0,2]", 
                          "mid moral (2-3]" = "(2,3]", 
                          "high moral (3-4]" = "(3,4]")
                          

ggplot(filter(d, !is.na(MoralCode)), 
       aes(x = InstrumentalCode, y = Black.gradrates, col = moral_cut)) + 
  geom_jitter(width = .05, height = 0, alpha = .5) + 
  geom_smooth(method = "lm") + 
  xlab("Instrumental rationale rating") + 
  ylab("Black graduation rate")

## `geom_smooth()` using formula 'y ~ x'

Clearly these data are very noisy, and we’re not seeing the same pattern at all (with high error bars). So this pattern does not emerge in the basic data analysis.

What does that mean? It doesn’t mean that the pattern is necessarily absent, but if it’s not present, it’s being confounded by some other variable so strongly that the effect is eliminated. The trouble with these kinds of confounds is that you can’t be sure you’ve found the right confound.

Let’s now go to a more model-based approach.

Model-based approach

There is a preregistration, let’s start with that:

Each hypothesis will be tested using standard regression and logit regression analyses, including our instrumental and moral codes for schools’ diversity statements as the predictor variables. The outcome variables for each hypothesis are as follows:

 H1: The outcome for hypothesis 1 is a binary indicator of whether or not the university has a) theme housing and b) affinity spaces, information which we will gather by contacting university staff directly, for each degree of instrumentality and each degree of morality using our coding scale. We may also supplement the information obtained from direct communications with online resources that contain this information.

 H2: The outcome of hypothesis 2 is a binary indicator of whether or not the university has pre-orientation programs and/ or bridge programs for minority students,.

 H3 i. a: The outcome of hypothesis 3, part a is the representation of minority students, especially Black students, as a percentage of the student body. ii. b: The outcome of hypothesis 3, part b is the average GPA of students of color, especially Black students.  c: The outcome variables for hypothesis 3, part c are all the questions related to alcohol use, suicide, self-harm, emotional help, extracurricular involvement and the mental health data contained in the CCAPS (depression, generalized anxiety, social anxiety, academic distress, eating concerns, family distress, hostility, substance use) and CLICC in the Center for Collegiate Mental Health’s data set

 H4 i. a: The outcome of hypothesis 4, part a is the percentage of full time and part time professors that are minorities, as reported in the university’s most recent Common Data Set. ii. b: The outcome of hypothesis 4, part b is the acceptance rate of students who are underrepresented minorities. iii. c: The outcome of hypothesis 4, part c is the percentage of minority students who are considered underrepresented domestic racial minorities out of the total population of domestic racial minorities.

 H5: The outcome for hypothesis 5 is the numeric value (0-3) describing the weighting of race/ethnicity in a university’s admissions decisions, as noted by US World News and Reports,.

So we have to note first that although there is a preregistration, there are a lot of outcome variables. The one that is reported is graduation rate, but this is not part of the original prereg, there there is an amendment:

Though we stated our pre-registered hypothesis in terms of decreased graduation rates and retention rates (the latter ofwhich were not available for different racial groups) for minorities, we said we would evaluate it in terms of the representation of minority students in the student body. Using representation as an indicator of minority students’ experience at a university is less than ideal because such numbers can also be influenced by factors driving students to enroll in theuniversity. The available graduation rate data are a much better reflection and test of our hypothesis and are thus the focus of our analyses. However, for transparency, we acknowledgethat neither the representation of Black students nor the representationof minority students as a whole aresignificantly related to either rationale (all p‘s > .54).

So that means we have 10 dependent variables that are going to be explored in this study.

For Study 8, our first step was to assess the extent to which universities’ degree of instrumental and moral rationale use correlated with the set of university characteristics we identified as potential covariates. We found that only universities’ status as a religiously affiliated university, their log- transformed endowment, and their student body size significantly correlated with use of either rationale (see Table S7). Per our pre-registered analysis plan, only those variables that were significantly correlated with either of our predictor variables were included in our primary model (i.e., Model 1), though we also specify models including no covariates (Model 0) and all of these covariates (Model 2). In the Tables S6.1-S6.5 below, we report the full results for each racial group.

The strategy is that, because these might be related to the rationales, they should be included. (Increases colinearity risk, however).

As stated in the main text, in our primary analyses we conducted a multilevel regression model examining graduation rates as a function of universities’ degree of instrumental rationale use, degree of moral rationale use, student race nested within school, all interaction terms, and the Model 1 covariates mentioned above.

I’m confused how this is a multi-level model. It seems like each race is being treated as a separate observation for the school, I guess.

Some uncertainty:

is this really ALL interactions? that’s a lot of interactions - maybe the covariates are just linear
why is race nested within school? that means for each school there are exactly as many predictors as datapoints…
the abstract talks about disparities in grad rate but actually it looks like they do gradrates directly.

d_long <- d |>
  mutate(SchoolID = 1:n()) |>
  select(SchoolID, InstrumentalCode, MoralCode, Religious, 
         logEndowment, totalstudents, Black.gradrates, White.gradrates, 
         Hispanic.gradrates, Asian.gradrates) |> # don't get BWDisparity
  pivot_longer(names_to = "race", values_to = "gradrate", ends_with("gradrates")) 

mod <- lmer(gradrate ~ race * InstrumentalCode * MoralCode  + Religious + 
              logEndowment + totalstudents + (1 | SchoolID), 
            data = d_long)

knitr::kable(anova(mod))

	npar	Sum Sq	Mean Sq	F value
race	3	2.0936501	0.6978834	220.4957569
InstrumentalCode	1	0.0035978	0.0035978	1.1367242
MoralCode	1	0.0064799	0.0064799	2.0473243
Religious	1	0.0063656	0.0063656	2.0112030
logEndowment	1	0.5667535	0.5667535	179.0653612
totalstudents	1	0.0226258	0.0226258	7.1486106
race:InstrumentalCode	3	0.0056140	0.0018713	0.5912497
race:MoralCode	3	0.0024295	0.0008098	0.2558682
InstrumentalCode:MoralCode	1	0.0092238	0.0092238	2.9142500
race:InstrumentalCode:MoralCode	3	0.0237387	0.0079129	2.5000744

The key result we want to replicate is:

Crucial to our hypotheses, there was a significant three-way in- teraction between student race, instrumental rationale use, and moral rationale use (t(181) = −2.335, P = 0.021;

We see statistical support congruent with this three-way interaction in the ANOVA above.

Now let’s reproduce the plot (using the endpoints of the scale for the moral prediction, which is a bit optimistic).

preds <- expand_grid(race = "Black.gradrates", 
                              InstrumentalCode = 1:5, 
                              MoralCode = c(1,3,5), 
                              Religious = .5,
                              logEndowment = mean(d$logEndowment, na.rm=TRUE), 
                              totalstudents = mean(d$totalstudents))

preds$Black.gradrates <- predict(mod, newdata = preds, 
        re.form = ~0)

ggplot(d, 
       aes(x = InstrumentalCode, y = Black.gradrates, col = MoralCode, group = as.factor(MoralCode))) + 
  geom_jitter(width = .05, height = 0, alpha = .5) + 
  geom_line(data = preds) + 
  viridis::scale_color_viridis() + 
  xlab("Instrumental rationale rating") + 
  ylab("Black graduation rate")

## Warning: Removed 1 rows containing missing values (geom_point).

knitr::include_graphics("graph.png")

Further exploration

Let’s zoom in on just black gradrates, since that’s what we care most about. Here’s a very simple model relating to the plots above.

simple_mod <- lm(Black.gradrates ~ InstrumentalCode * MoralCode, data = d)

knitr::kable(summary(simple_mod)[4], digits = 3)

	Estimate	Std. Error	t value	Pr(>\|t\|)
(Intercept)	0.709	0.085	8.298	0.000
InstrumentalCode	-0.019	0.033	-0.583	0.561
MoralCode	-0.039	0.040	-0.990	0.324
InstrumentalCode:MoralCode	0.012	0.015	0.791	0.430

Could it be that the covariates are doing the work here?

covariates_mod <- lm(Black.gradrates ~ InstrumentalCode * MoralCode + 
                   Religious + logEndowment + totalstudents, data = d)

knitr::kable(summary(covariates_mod)[4], digits = 3)

	Estimate	Std. Error	t value	Pr(>\|t\|)
(Intercept)	-0.705	0.143	-4.940	0.000
InstrumentalCode	-0.054	0.027	-2.021	0.045
MoralCode	-0.069	0.032	-2.180	0.031
Religious	0.025	0.033	0.749	0.455
logEndowment	0.175	0.015	11.265	0.000
totalstudents	-0.002	0.001	-1.973	0.050
InstrumentalCode:MoralCode	0.024	0.012	1.952	0.053

The interaction is just above the .05 line, but the two main effects are < .05.

Checking on collinearity using VIFs:

car::vif(covariates_mod, type = "predictor")

## GVIFs computed for predictors

##                      GVIF Df GVIF^(1/(2*Df))   Interacts With
## InstrumentalCode 1.205904  3        1.031697        MoralCode
## MoralCode        1.205904  3        1.031697 InstrumentalCode
## Religious        1.188520  1        1.090192             --  
## logEndowment     1.142961  1        1.069094             --  
## totalstudents    1.274098  1        1.128759             --  
##                                                          Other Predictors
## InstrumentalCode                   Religious, logEndowment, totalstudents
## MoralCode                          Religious, logEndowment, totalstudents
## Religious        InstrumentalCode, MoralCode, logEndowment, totalstudents
## logEndowment        InstrumentalCode, MoralCode, Religious, totalstudents
## totalstudents        InstrumentalCode, MoralCode, Religious, logEndowment

VIFs are not terrible for the predictors, so that’s good.

Let’s plot this simple regression and see how it looks.

lm_preds <- expand_grid(InstrumentalCode = 1:5, 
                     MoralCode = c(1,3,5), 
                     Religious = .5,
                     logEndowment = mean(d$logEndowment, na.rm=TRUE), 
                     totalstudents = mean(d$totalstudents))

lm_preds$Black.gradrates <- predict(covariates_mod, newdata = lm_preds, 
        re.form = ~0)

ggplot(d, 
       aes(x = InstrumentalCode, y = Black.gradrates, col = MoralCode, group = as.factor(MoralCode))) + 
  geom_jitter(width = .05, height = 0, alpha = .5) + 
  geom_line(data = lm_preds) + 
  viridis::scale_color_viridis() + 
  xlab("Instrumental rationale rating") + 
  ylab("Black graduation rate")

## Warning: Removed 1 rows containing missing values (geom_point).

Looks very similar to the LMM results above overal.

Specification robustness

How robust is this result to alternative specifications? How about with just endowment, or just total students and endowment?

covariates_mod <- lm(Black.gradrates ~ InstrumentalCode * MoralCode + 
                   logEndowment , data = d)

knitr::kable(summary(covariates_mod)[4], digits = 3)

	Estimate	Std. Error	t value	Pr(>\|t\|)
(Intercept)	-0.628	0.139	-4.513	0.000
InstrumentalCode	-0.065	0.026	-2.477	0.014
MoralCode	-0.074	0.031	-2.373	0.019
logEndowment	0.164	0.015	11.033	0.000
InstrumentalCode:MoralCode	0.028	0.012	2.307	0.022

Looks even stronger if you just control for endowment.

No endowment.

covariates_mod <- lm(Black.gradrates ~ InstrumentalCode * MoralCode + 
                   Religious + totalstudents, data = d)

knitr::kable(summary(covariates_mod)[4], digits = 3)

	Estimate	Std. Error	t value	Pr(>\|t\|)
(Intercept)	0.710	0.086	8.279	0.000
InstrumentalCode	-0.028	0.034	-0.821	0.413
MoralCode	-0.051	0.041	-1.241	0.216
Religious	0.037	0.043	0.856	0.393
totalstudents	0.001	0.001	0.994	0.322
InstrumentalCode:MoralCode	0.016	0.016	1.019	0.309

Effect significance decreased with no endowment control.

Let’s examine the relation between these variables!

GGally::ggpairs(d |> select(Black.gradrates, InstrumentalCode, MoralCode, logEndowment))

## Warning in ggally_statistic(data = data, mapping = mapping, na.rm = na.rm, :
## Removing 1 row that contained a missing value

## Warning in ggally_statistic(data = data, mapping = mapping, na.rm = na.rm, :
## Removing 1 row that contained a missing value

## Warning in ggally_statistic(data = data, mapping = mapping, na.rm = na.rm, :
## Removed 3 rows containing missing values

## Warning: Removed 1 rows containing missing values (geom_point).

## Warning: Removed 1 rows containing non-finite values (stat_density).

## Warning in ggally_statistic(data = data, mapping = mapping, na.rm = na.rm, :
## Removing 1 row that contained a missing value

## Warning in ggally_statistic(data = data, mapping = mapping, na.rm = na.rm, :
## Removed 4 rows containing missing values

## Warning: Removed 1 rows containing missing values (geom_point).
## Removed 1 rows containing missing values (geom_point).

## Warning: Removed 1 rows containing non-finite values (stat_density).

## Warning in ggally_statistic(data = data, mapping = mapping, na.rm = na.rm, :
## Removed 4 rows containing missing values

## Warning: Removed 3 rows containing missing values (geom_point).

## Warning: Removed 4 rows containing missing values (geom_point).
## Removed 4 rows containing missing values (geom_point).

## Warning: Removed 3 rows containing non-finite values (stat_density).

Analysis of disparities

Let’s also look at the disparities predictor.

ggplot(d, aes(x = InstrumentalCode, y = BWDisparity.gradrates)) + 
  geom_point() + 
  geom_smooth()

## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

## Warning: Removed 1 rows containing non-finite values (stat_smooth).

## Warning: Removed 1 rows containing missing values (geom_point).

d$moral_cut <- fct_recode(cut(d$MoralCode, c(0,2,3,4)), 
                          "low moral (0-2]" = "(0,2]", 
                          "mid moral (2-3]" = "(2,3]", 
                          "high moral (3-4]" = "(3,4]")
                          
ggplot(filter(d, !is.na(MoralCode)), 
       aes(x = InstrumentalCode, y = BWDisparity.gradrates, 
           col = moral_cut)) + 
  geom_jitter(width = .05, height = 0, alpha = .5) + 
  geom_smooth(method = "lm") + 
  scale_color_discrete(name = "Moral rationale") + 
  xlab("Instrumental rationale rating") + 
  ylab("Black-white disparity in graduation rate")

## `geom_smooth()` using formula 'y ~ x'

disparity_mod <- lm(BWDisparity.gradrates ~ InstrumentalCode * MoralCode, 
                    data = d)
summary(disparity_mod)

## 
## Call:
## lm(formula = BWDisparity.gradrates ~ InstrumentalCode * MoralCode, 
##     data = d)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.205000 -0.056901 -0.006474  0.048889  0.285802 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                 0.052003   0.042545   1.222   0.2232  
## InstrumentalCode            0.028804   0.016245   1.773   0.0779 .
## MoralCode                   0.041038   0.019741   2.079   0.0390 *
## InstrumentalCode:MoralCode -0.017038   0.007604  -2.241   0.0262 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.08449 on 184 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.02933,    Adjusted R-squared:  0.01351 
## F-statistic: 1.853 on 3 and 184 DF,  p-value: 0.1391

We see support in the disparities model.

Conclusions

We were able to successfully reproduce the analytic conclusions of Starck Study 8. There is some variability in whether the effect is significant only when controlling for endowment size, which is very robustly predictive of minority graduation outcomes. There is also some flexibility in outcome selection according to the preregistration.

Starck et al. (2021)

Mike Frank (with help from Ewart Thomas)

2022-11-03