Setup

library(DT)

LiberalArtsColleges <- subset(data, Type == "LiberalArts"); nrow(LiberalArtsColleges)

## [1] 24

NationalUniversities <- subset(data, Type == "NationalUniversity"); nrow(NationalUniversities)

## [1] 134

CRITR <- function(n, alpha = .05){
  df <- n - 2; CRITT <- qt(alpha/2, df, lower.tail = F)
  CRITR <- sqrt((CRITT^2)/((CRITT^2) + df))
  return(CRITR)}

ShannonH <- function(Props, rnd = 3){
  LPI = log(Props)
  H = (-rowSums(Props * LPI))
  return(round(H, rnd))}

SimpsonD <- function(Props, rnd = 3){
  PI2 = Props^2
  D = 1/(rowSums(PI2))
  return(round(D, rnd))}

RenyiH <- function(Props, rnd = 3){
  q = rowSums(Props)
  QH = (1/(1-q)) * log(rowSums(Props^q))
  return(round(QH, rnd))}

EthnicPolarization <- function(Props, rnd = 3){
  PI2 = Props^2
  EP = 4 * rowSums(PI2 * (1 - Props))
  return(round(EP, rnd))}

round(CRITR(c(24, 134, 158)), 3)

## [1] 0.404 0.170 0.156

datatable(data, extensions = c("Buttons", "FixedColumns"), options = list(dom = 'Bfrtip', buttons = c('copy', 'csv', 'print'), scrollX = T, fixedColumns = list(leftColumns = 3)))

Rationale

Eric Kaufmann shared the Foundation for Individual Rights in Education’s (FIRE’s) 2020 and 2021 combined campus free speech ranking survey data on political alignment by college university on his website, here: http://www.sneps.net/viewpoint-diversity-ranking-of-leading-us-colleges. He used a variable dubbed “IVD”, or the Index of Viewpoint Diversity, where lower scores represented higher diversity. This index is computed as

\[\text{IVD} = ((P(\text{Democrat}) - P(\text{Republican})) + (P(\text{Liberal}) - P(\text{Conservative}))) \times 10,000\]

Unfortunately, this is hardly a diversity index. It is, instead, better understood, as computed, as the proportion of ideological and effective liberals, where these quantities overlap and may fail to overlap due to measurement error. It will only coincidentally correlate with actual diversity indices because of the overrepresentation of liberals and/or Democrats at universities, because the residual is often too small to mean much else. To better understand diversity, we can apply two real diversity indices from ecology to this data.

The first index is Shannon’s index, H, given by

\[H = - \sum_{i = 1}^s p_i \text{ln}p_i = - \sum_{i = 1}^s \text{ln}p_i^{p_i}\]

As the name suggests, this is an information theoretic index, and like another popular information theoretic index, BIC, it has assumptions about representation. With BIC, it is assumed that the true model is among those compared; with H, it is assumed that all categories - or more commonly in ecology, species - are represented in computation of the statistic. Sampling accuracy is also assumed. The next index is Simpson’s, D, given by

\[D = \frac{1}{\sum_{i = 1}^s p_i^2}\]

Simpson’s D is a dominance index, since it is weighted towards those categories - or more commonly in ecology, species - which are more common, and as such, dominant in the sample. This index can be useful in our case because of the aforementioned abundance of liberal individuals in the academy. This index will be less swayed by measurement error, because it will be less swayed by the small groups that are more likely in our case to be affected by them. It makes much more difference to measure 10% of the population with a margin of error of 5% than it does to measure 70% of the population with the same amount of error. The practical implications of small differences matter when it comes to understanding “diversity” in a population.

Shannon’s index is generalized to values of q - the sum of proportions - other than one as Renyi’s index, given by

\[^qH = \frac{1}{1 - q} \text{ln}(\sum_{i = 1}^s p_i^q)\]

Alternative names for Shannon’s index are Shannon’s diversity index, the Shannon-Wiener index, Shannon entropy, information gain (machine learning), and von Neumann entropy (quantum statistical mechanics). Alternative names for Simpson’s index are the Herfindahl-Hirschman index (economics), the inverse participation ratio (physics), the effective number of parties (political science; as its inverse), and heterozygosity (genetics; as the Gibbs-Martin, Blau, or Gini-Simpson indices, simple extensions of Simpson’s). There are numerous other appropriate indices, many, but of course not all, of which are used for analyzing economic inequality, like the Theil, Hoover, Robin Hood, or Atkinson indices, or any number of named ratios. The Hoover index is just the Index of Dissimilarity and Gorard’s index of segregation. Many of these are similar to the other Index of Segregation, and related to Lieberson’s isolation index, the Hutchen’s square root index, Bell’s index, the other index of isolation, the index of exposure, the Ochiai index, Kulczynski’s coefficient, Yule’s Q or Y, the Hamman coefficient, the Baroni-Urbani-Buser coefficient, the Rogers-Tanimoto coefficient, the Sokal-Sneath coefficient, Sokal’s binary distance, the Russsel-Rao coefficient, the Phi coefficient, Soergel’s coefficient, Simpson’s coefficient (not to be confused with Simpson’s index), Dennis’ coefficient, Forbes’ coefficient, the Simple match coefficient, Fossum’s coefficient, Stile’s coefficient, Michael’s coefficient, Peirce’s coefficient, the Hawkin-Dotson coefficient, the Benini coefficient, the Gilbert coefficient, the modified Gini index, Kuhn’s index, the Gini index (not to be confused with the modified Gini index or the Gini-Simpson index), the Eyraud index, Soergel’s distance, the Tanimoto index, the Piatetsky-Shapiro’s index, or various other similar and oftentimes identical indices like Jaccard’s, Dice’s, the Match coefficient, Morisita’s index, the Canberra metric, or the Bray-Curtis index (i.e., Czekanowski’s quantitative index). There are more indices.

In addition to Shannon’s, Simpson’s, and Renyi’s indices, I have provided code for the ethnic polarization index used by Montalvo & Reynal-Querol (2005). This is given by

\[\text{EP} = 4\sum_{i = 1}^s p_i^2(1-p_i)\]

Shannon’s index is interpreted such that higher values indicate more diversity, like Simpson’s and Renyi’s indices. The ethnic polarization index ranges between 0, where everyone is part of one category or as many categories as there are people, and 1, where there are two groups of the same size. It is odder and harder to interpret, I am just presenting it for fun.

Each of these diversity indices is useful and interesting in their own ways, and they can be handily employed to better quantify political diversity in FIRE’s dataset. Moreover, we can use FIRE’s data and diversity indices like these and Kaufmann’s preferred IVD to assess the relationship between the quality - or at least, the rankings - of universities and student political diversity. Proponents of college viewpoint diversity would predict that a student body with diverse views should achieve higher-quality schools and accordingly higher ranks. For this purpose, I will use the ranks, scores, and categories - National Universities versus Liberal Arts Colleges - for the universities in Kaufmann’s data, taken from the U.S. News and World Report. I have used the most-recent scores as of October 3rd, 2022 for this purpose. Values of Renyi’s index use ideology and party identification, since Renyi’s index is not problematized by that sort of analysis, and it will allow testing the correspondence to Kaufmann’s own IVD.

Critical values of the Spearman correlation are .404 for the sample of liberal arts colleges and .170 for the national universities. The aggregate sample critical value is .156 because

\[t = r\sqrt{\frac{n-2}{1-r^2}}\]

and thus,

\[r = -\frac{t\sqrt{n-2+t^2}}{n-2+t^2}\]

nLA = 24;  tLA = qt(.05/2, nLA - 2)
nNU = 134; tNU = qt(.05/2, nNU - 2)
nAG = 158; tAG = qt(.05/2, nAG - 2)
rLA = -(tLA * sqrt(nLA - 2 + tLA^2))/(nLA - 2 + tLA^2); round(rLA, 3) #-(tLA...) because the coefficient is positive. Remove the '-' for negative coefficients, or just imagine it's reversed

## [1] 0.404

rNU = -(tNU * sqrt(nNU - 2 + tNU^2))/(nNU - 2 + tNU^2); round(rNU, 3)

## [1] 0.17

rAG = -(tAG * sqrt(nAG - 2 + tAG^2))/(nAG - 2 + tAG^2); round(rAG, 3)

## [1] 0.156

r2t <- function(r, n){
  t = r * sqrt((n-2)/(1-r^2))
  return(t)}

Analysis

Aggregate Results

data$HDiversity <- ShannonH(data[4:6])
data$DDiversity <- SimpsonD(data[4:6])
data$RDiversity <- RenyiH(data[4:8])
data$EDiversity <- EthnicPolarization(data[4:6])

data

"USNWR Ranks"

## [1] "USNWR Ranks"

cor(data$IVD, data$USNWRRank, method = "pearson") #Wrong type of correlation; only presented for comparison

## [1] 0.05312508

cor(data$IVD, data$USNWRRank, method = "spearman")

## [1] -0.5058135

cor(data$HDiversity, data$USNWRRank, method = "spearman")

## [1] 0.5738253

cor(data$DDiversity, data$USNWRRank, method = "spearman")

## [1] 0.564462

cor(data$RDiversity, data$USNWRRank, method = "spearman")

## [1] 0.5622851

cor(data$EDiversity, data$USNWRRank, method = "spearman")

## [1] 0.4034535

"USNWR Scores"

## [1] "USNWR Scores"

cor(data$IVD, data$USNWRScore, method = "pearson")

## [1] -0.03021161

cor(data$IVD, data$USNWRScore, method = "spearman")

## [1] 0.4963161

cor(data$HDiversity, data$USNWRScore, method = "spearman")

## [1] -0.5645527

cor(data$DDiversity, data$USNWRScore, method = "spearman")

## [1] -0.5556395

cor(data$RDiversity, data$USNWRScore, method = "spearman")

## [1] -0.5528564

cor(data$EDiversity, data$USNWRScore, method = "spearman")

## [1] -0.3956895

pt(r2t(.4034535, nAG), nAG - 2, lower.tail = F) #Lowest correlation in the bunch for the ranks

## [1] 7.35207e-08

Liberal Arts Colleges

LiberalArtsColleges$HDiversity <- ShannonH(LiberalArtsColleges[4:6])
LiberalArtsColleges$DDiversity <- SimpsonD(LiberalArtsColleges[4:6])
LiberalArtsColleges$RDiversity <- SimpsonD(LiberalArtsColleges[4:8])
LiberalArtsColleges$EDiversity <- SimpsonD(LiberalArtsColleges[4:6])

LiberalArtsColleges

"USNWR Ranks"

## [1] "USNWR Ranks"

cor(LiberalArtsColleges$IVD, LiberalArtsColleges$USNWRRank, method = "pearson")

## [1] 0.1577661

cor(LiberalArtsColleges$IVD, LiberalArtsColleges$USNWRRank, method = "spearman")

## [1] -0.07883287

cor(LiberalArtsColleges$HDiversity, LiberalArtsColleges$USNWRRank, method = "spearman")

## [1] 0.09756112

cor(LiberalArtsColleges$DDiversity, LiberalArtsColleges$USNWRRank, method = "spearman")

## [1] 0.01655055

cor(LiberalArtsColleges$RDiversity, LiberalArtsColleges$USNWRRank, method = "spearman")

## [1] 0.05008718

cor(LiberalArtsColleges$EDiversity, LiberalArtsColleges$USNWRRank, method = "spearman")

## [1] 0.01655055

"USNWR Scores"

## [1] "USNWR Scores"

cor(LiberalArtsColleges$IVD, LiberalArtsColleges$USNWRScore, method = "pearson")

## [1] -0.1774138

cor(LiberalArtsColleges$IVD, LiberalArtsColleges$USNWRScore, method = "spearman")

## [1] 0.07883287

cor(LiberalArtsColleges$HDiversity, LiberalArtsColleges$USNWRScore, method = "spearman")

## [1] -0.09756112

cor(LiberalArtsColleges$DDiversity, LiberalArtsColleges$USNWRScore, method = "spearman")

## [1] -0.01655055

cor(LiberalArtsColleges$RDiversity, LiberalArtsColleges$USNWRScore, method = "spearman")

## [1] -0.05008718

cor(LiberalArtsColleges$EDiversity, LiberalArtsColleges$USNWRScore, method = "spearman")

## [1] -0.01655055

pt(r2t(.01655055, nLA), nLA - 2, lower.tail = F)

## [1] 0.4694083

National Universities

NationalUniversities$HDiversity <- ShannonH(NationalUniversities[4:6])
NationalUniversities$DDiversity <- SimpsonD(NationalUniversities[4:6])
NationalUniversities$RDiversity <- SimpsonD(NationalUniversities[4:8])
NationalUniversities$EDiversity <- SimpsonD(NationalUniversities[4:6])

NationalUniversities

"USNWR Ranks"

## [1] "USNWR Ranks"

cor(NationalUniversities$IVD, NationalUniversities$USNWRRank, method = "pearson")

## [1] 0.1456497

cor(NationalUniversities$IVD, NationalUniversities$USNWRRank, method = "spearman")

## [1] -0.4208464

cor(NationalUniversities$HDiversity, NationalUniversities$USNWRRank, method = "spearman")

## [1] 0.5115962

cor(NationalUniversities$DDiversity, NationalUniversities$USNWRRank, method = "spearman")

## [1] 0.5076282

cor(NationalUniversities$RDiversity, NationalUniversities$USNWRRank, method = "spearman")

## [1] 0.4801009

cor(NationalUniversities$EDiversity, NationalUniversities$USNWRRank, method = "spearman")

## [1] 0.5076282

"USNWR Scores"

## [1] "USNWR Scores"

cor(NationalUniversities$IVD, NationalUniversities$USNWRScore, method = "pearson")

## [1] -0.1020902

cor(NationalUniversities$IVD, NationalUniversities$USNWRScore, method = "spearman")

## [1] 0.4208464

cor(NationalUniversities$HDiversity, NationalUniversities$USNWRScore, method = "spearman")

## [1] -0.5115962

cor(NationalUniversities$DDiversity, NationalUniversities$USNWRScore, method = "spearman")

## [1] -0.5076282

cor(NationalUniversities$RDiversity, NationalUniversities$USNWRScore, method = "spearman")

## [1] -0.4801009

cor(NationalUniversities$EDiversity, NationalUniversities$USNWRScore, method = "spearman")

## [1] -0.5076282

pt(r2t(.4801009, nNU), nNU - 2, lower.tail = F)

## [1] 2.178352e-09

Conclusions

Student political diversity is negatively associated with college rankings. This effect is significant in aggregate and for national universities, but not for liberal arts colleges, but that seems likely to be attributable to sample sizes and resulting range restriction. The interaction by type of college is significant, but that’s hardly worth much with such low power for the liberal arts college sample in the first place. The implications of the observation that diversity of viewpoint, at least in terms of politics, is uncorrelated with the quality of the university are potentially extreme. It may be that elite universities are ideological echo chambers. However, the imperfectness of the measurement of diversity along the dimensions used here suggests the need for future investigation with more granularity. Whatever the eventual conclusion, this finding is interesting.

References

Montalvo, J. G., & Reynal-Querol, M. (2005). Ethnic diversity and economic development. Journal of Development Economics, 76(2), 293–323.

Post-Script

Someone asked for correlation matrices.

round(cor(data[c(2:9, 11, 13:16)], method = "spearman"), 3)

##            IVDRank    IVD   PLib   PMod   PCon   PDem   PRep USNWRScore
## IVDRank      1.000  0.853  0.923 -0.585 -0.903  0.910 -0.920      0.572
## IVD          0.853  1.000  0.801 -0.511 -0.791  0.787 -0.805      0.496
## PLib         0.923  0.801  1.000 -0.689 -0.863  0.946 -0.885      0.537
## PMod        -0.585 -0.511 -0.689  1.000  0.306 -0.600  0.384     -0.259
## PCon        -0.903 -0.791 -0.863  0.306  1.000 -0.864  0.963     -0.593
## PDem         0.910  0.787  0.946 -0.600 -0.864  1.000 -0.893      0.522
## PRep        -0.920 -0.805 -0.885  0.384  0.963 -0.893  1.000     -0.575
## USNWRScore   0.572  0.496  0.537 -0.259 -0.593  0.522 -0.575      1.000
## USNWRRank   -0.583 -0.506 -0.547  0.264  0.603 -0.531  0.587     -0.995
## HDiversity  -0.977 -0.843 -0.922  0.631  0.891 -0.898  0.895     -0.565
## DDiversity  -0.973 -0.844 -0.929  0.661  0.869 -0.899  0.883     -0.556
## RDiversity  -0.977 -0.839 -0.929  0.663  0.871 -0.933  0.888     -0.553
## EDiversity  -0.751 -0.664 -0.813  0.843  0.553 -0.735  0.603     -0.396
##            USNWRRank HDiversity DDiversity RDiversity EDiversity
## IVDRank       -0.583     -0.977     -0.973     -0.977     -0.751
## IVD           -0.506     -0.843     -0.844     -0.839     -0.664
## PLib          -0.547     -0.922     -0.929     -0.929     -0.813
## PMod           0.264      0.631      0.661      0.663      0.843
## PCon           0.603      0.891      0.869      0.871      0.553
## PDem          -0.531     -0.898     -0.899     -0.933     -0.735
## PRep           0.587      0.895      0.883      0.888      0.603
## USNWRScore    -0.995     -0.565     -0.556     -0.553     -0.396
## USNWRRank      1.000      0.574      0.564      0.562      0.403
## HDiversity     0.574      1.000      0.994      0.985      0.758
## DDiversity     0.564      0.994      1.000      0.984      0.766
## RDiversity     0.562      0.985      0.984      1.000      0.772
## EDiversity     0.403      0.758      0.766      0.772      1.000

round(cor(LiberalArtsColleges[c(2:9, 11, 13:16)], method = "spearman"), 3)

##            IVDRank    IVD   PLib   PMod   PCon   PDem   PRep USNWRScore
## IVDRank      1.000  0.760  0.871 -0.797 -0.768  0.824 -0.767      0.194
## IVD          0.760  1.000  0.652 -0.598 -0.548  0.605 -0.547      0.079
## PLib         0.871  0.652  1.000 -0.725 -0.764  0.908 -0.786      0.136
## PMod        -0.797 -0.598 -0.725  1.000  0.361 -0.635  0.374      0.131
## PCon        -0.768 -0.548 -0.764  0.361  1.000 -0.700  0.924     -0.311
## PDem         0.824  0.605  0.908 -0.635 -0.700  1.000 -0.766      0.187
## PRep        -0.767 -0.547 -0.786  0.374  0.924 -0.766  1.000     -0.309
## USNWRScore   0.194  0.079  0.136  0.131 -0.311  0.187 -0.309      1.000
## USNWRRank   -0.194 -0.079 -0.136 -0.131  0.311 -0.187  0.309     -1.000
## HDiversity  -0.969 -0.729 -0.852  0.835  0.790 -0.766  0.753     -0.098
## DDiversity  -0.948 -0.708 -0.862  0.892  0.700 -0.788  0.713     -0.017
## RDiversity  -0.928 -0.688 -0.826  0.897  0.627 -0.840  0.662     -0.050
## EDiversity  -0.948 -0.708 -0.862  0.892  0.700 -0.788  0.713     -0.017
##            USNWRRank HDiversity DDiversity RDiversity EDiversity
## IVDRank       -0.194     -0.969     -0.948     -0.928     -0.948
## IVD           -0.079     -0.729     -0.708     -0.688     -0.708
## PLib          -0.136     -0.852     -0.862     -0.826     -0.862
## PMod          -0.131      0.835      0.892      0.897      0.892
## PCon           0.311      0.790      0.700      0.627      0.700
## PDem          -0.187     -0.766     -0.788     -0.840     -0.788
## PRep           0.309      0.753      0.713      0.662      0.713
## USNWRScore    -1.000     -0.098     -0.017     -0.050     -0.017
## USNWRRank      1.000      0.098      0.017      0.050      0.017
## HDiversity     0.098      1.000      0.975      0.933      0.975
## DDiversity     0.017      0.975      1.000      0.963      1.000
## RDiversity     0.050      0.933      0.963      1.000      0.963
## EDiversity     0.017      0.975      1.000      0.963      1.000

round(cor(NationalUniversities[c(2:9, 11, 13:16)], method = "spearman"), 3)

##            IVDRank    IVD   PLib   PMod   PCon   PDem   PRep USNWRScore
## IVDRank      1.000  0.869  0.952 -0.520 -0.927  0.936 -0.947      0.522
## IVD          0.869  1.000  0.846 -0.475 -0.826  0.826 -0.842      0.421
## PLib         0.952  0.846  1.000 -0.670 -0.872  0.944 -0.893      0.528
## PMod        -0.520 -0.475 -0.670  1.000  0.271 -0.570  0.367     -0.258
## PCon        -0.927 -0.826 -0.872  0.271  1.000 -0.862  0.950     -0.531
## PDem         0.936  0.826  0.944 -0.570 -0.862  1.000 -0.894      0.497
## PRep        -0.947 -0.842 -0.893  0.367  0.950 -0.894  1.000     -0.520
## USNWRScore   0.522  0.421  0.528 -0.258 -0.531  0.497 -0.520      1.000
## USNWRRank   -0.522 -0.421 -0.528  0.258  0.531 -0.497  0.520     -1.000
## HDiversity  -0.970 -0.853 -0.960  0.597  0.903 -0.922  0.907     -0.512
## DDiversity  -0.965 -0.852 -0.961  0.617  0.886 -0.918  0.900     -0.508
## RDiversity  -0.925 -0.811 -0.939  0.685  0.815 -0.958  0.831     -0.480
## EDiversity  -0.965 -0.852 -0.961  0.617  0.886 -0.918  0.900     -0.508
##            USNWRRank HDiversity DDiversity RDiversity EDiversity
## IVDRank       -0.522     -0.970     -0.965     -0.925     -0.965
## IVD           -0.421     -0.853     -0.852     -0.811     -0.852
## PLib          -0.528     -0.960     -0.961     -0.939     -0.961
## PMod           0.258      0.597      0.617      0.685      0.617
## PCon           0.531      0.903      0.886      0.815      0.886
## PDem          -0.497     -0.922     -0.918     -0.958     -0.918
## PRep           0.520      0.907      0.900      0.831      0.900
## USNWRScore    -1.000     -0.512     -0.508     -0.480     -0.508
## USNWRRank      1.000      0.512      0.508      0.480      0.508
## HDiversity     0.512      1.000      0.994      0.945      0.994
## DDiversity     0.508      0.994      1.000      0.948      1.000
## RDiversity     0.480      0.945      0.948      1.000      0.948
## EDiversity     0.508      0.994      1.000      0.948      1.000

sessionInfo()

## R version 4.1.2 (2021-11-01)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19044)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] DT_0.20
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.28     R6_2.5.1          jsonlite_1.7.2    magrittr_2.0.1   
##  [5] evaluate_0.14     rlang_0.4.12      stringi_1.7.5     jquerylib_0.1.4  
##  [9] bslib_0.3.1       vctrs_0.3.8       rmarkdown_2.11    tools_4.1.2      
## [13] stringr_1.4.0     htmlwidgets_1.5.4 crosstalk_1.2.0   xfun_0.27        
## [17] yaml_2.2.1        fastmap_1.1.0     compiler_4.1.2    htmltools_0.5.2  
## [21] knitr_1.36        sass_0.4.0

Diversity Indices Applied to Campus Political Diversity