The following code accompanies a manuscript by Adrienne Wood, Adam Kleinbaum, and Thalia Wheatley. Please contact Adrienne (adrienne.wood@virginia.edu) with any questions.

OSF Page: https://osf.io/ycpwt/

OSF Preprint: https://psyarxiv.com/qvthk/

Overview
Descriptive Statistics
Models using population variables to predict students’ social network positions
Models using population variables to predict students’ personality traits
Follow-up models controlling for key covariates
Graphing model estimates and raw data for key findings

Overview

In the following code, we use different measures of cultural diversity, both present-day and historical, to predict international and American students’ social network positions in a series of Bayesian multilevel models. Estimates from models of other network variables, such as the cultural diversity of students’ friends, are also reported, although none of these variables was predicted by our distal cultural diversity measures.

We also looked at the effect of cultural diversity on several “socially adaptable” personality traits, including self-monitoring and extraversion. For those that were predicted by culture, we then asked whether they moderated the effect of cultural diversity on brokerage.

See the “Data Preparation” R Markdown and html files on the OSF project page for details about how these variables were computed.

Variable codebook

codebook <-read.csv("variablenames.csv")
DT::datatable(codebook,caption="Variables in dataset")

The preparatory steps are as follows.

Read in data and create lists of key variables

dAll <-read.csv("complete_culture_network_data.csv")

# reviewers requested this additional covariate:
# number of friends in network. We'll just add together our
# existing variables, number of American friends and number
# of international friends.
dAll$numFriends <-dAll$numAmericanFriends+dAll$numInternationalFriends

# create a list of key numeric student variables
studentVariables <-c("brokerageZ","IndegreeNonreciprocity","OutdegreeNonreciprocity","numAmericanFriends","numInternationalFriends","numFriends","percentofFriendsAmerican","diversityofNetworkFriends","ESCIconflict","ESCIconflictPeer","ESCIempathy","ESCIempathyPeer","ESCIInfluence","ESCIinfluencePeer","ESCIteamwork","ESCIteamworkPeer","Openness_z","extraversion_z","self_monitor_factor1","self_monitor_factor2","self_monitor_factor3")

# sublist of just personality measures
personalityVariables <-c("ESCIconflict","ESCIconflictPeer","ESCIempathy","ESCIempathyPeer","ESCIInfluence","ESCIinfluencePeer","ESCIteamwork","ESCIteamworkPeer","Openness_z","extraversion_z","self_monitor_factor1","self_monitor_factor2","self_monitor_factor3")
# create a list of additional international student variables
intVariables <-c(studentVariables,c("alesina_ethnic_fractionalization",
"ancestral_diversity_hhiScore",
"english_official_language_wiki",
"nation_tightness",
"uk_classified_majority_english_speaking",
"relational_mobility",
"percent_of_population_that_is_americanLog"))

# create a list of additional American student variables
amVariables <-c(studentVariables,c("population_density_USLog","ipumsHHIDiversity",
"US_county_SCI_international_connectednessLog",
"US_state_tightness"))

Standardize our personality measures
Separate American and International student datasets

dInt <-dAll[dAll$nationality!="united states",]
dInt <-dInt[dInt$nationality!="",]
dAm <-dAll[dAll$nationality=="united states",]
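One general R behavior is worth keeping in mind when splitting a data frame with a logical comparison: if the compared column contains NA, the comparison returns NA, and indexing with NA produces a row filled with NAs. The sketch below uses a made-up data frame (`toy` and its values are hypothetical, not the study data) to contrast plain logical subsetting with a `which()`-based variant that drops such rows.

```r
# Hypothetical toy data; the column name mirrors the real dataset,
# but the values are invented for illustration.
toy <- data.frame(
  id = 1:5,
  nationality = c("united states", "india", NA, "", "brazil"),
  stringsAsFactors = FALSE
)

# Plain logical subsetting: NA != "united states" evaluates to NA,
# and an NA index yields a row in which every column is NA.
naive <- toy[toy$nationality != "united states", ]
n_na_rows <- sum(is.na(naive$id))

# which() keeps only positions where the condition is TRUE,
# so rows with a missing nationality are dropped.
safe <- toy[which(toy$nationality != "united states" & toy$nationality != ""), ]

print(nrow(naive))
print(n_na_rows)
print(safe$id)
```

Whether to keep or drop students with a missing nationality is an analytic choice; the sketch only shows how each form of subsetting treats them.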

Standardize our 4 cultural diversity predictors

dAm <- dAm %>%
  mutate(across(
    .cols = c(ipumsHHIDiversity, US_county_SCI_international_connectednessLog),
    .fns = ~ (.x - mean(.x, na.rm = TRUE)) / sd(.x, na.rm = TRUE)
  ))

dInt <- dInt %>%
  mutate(across(
    .cols = c(alesina_ethnic_fractionalization, ancestral_diversity_hhiScore),
    .fns = ~ (.x - mean(.x, na.rm = TRUE)) / sd(.x, na.rm = TRUE)
  ))
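As a quick check of what the standardization does, here is a minimal base-R sketch on an invented vector (`x` is hypothetical). It applies the same z-score expression used above and compares it with base R's `scale()`, which also ignores missing values when computing the mean and standard deviation.

```r
# Hypothetical values, including one missing observation.
x <- c(2, 4, NA, 6, 8)

# Same expression as in the across() call: center on the mean,
# divide by the standard deviation, ignoring NAs.
z_manual <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

# scale() returns a one-column matrix; flatten it for comparison.
z_scale <- as.numeric(scale(x))

print(round(z_manual, 3))
print(isTRUE(all.equal(z_manual, z_scale)))
```

The missing value stays missing, and the non-missing values end up with mean 0 and standard deviation 1.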

Separate out first wave of social network survey

dInt1 <-dInt[dInt$sampleNumber==1,]
dAll1 <-dAll[dAll$sampleNumber==1,]
dAm1 <-dAm[dAm$sampleNumber==1,]

Descriptive statistics

Create the table of all students’ descriptives for paper

Descriptives

Demographics frequencies

How many students identify as international vs. American and as male vs. female, what their self-reported ethnicities are, and how many students were in each cohort.

# How many students are American vs. international
kable(table(dAll1$intStatus),col.names=c("Status","Number of students"))

Status	Number of students
American	1461
International	785

# How many didn't report international status
print(paste0(sum(is.na(dAll1$intStatus))," students did not report whether they were U.S. or international students."))

[1] “41 students did not report whether they were U.S. or international students.”

kable(table(dAll1$Gender),col.names=c("Gender","Number of students"))

Gender	Number of students
	1
F	836
M	1436

kable(table(dAll1$ethnicity),col.names=c("Ethnicity","Number of Students"))

Ethnicity	Number of Students
0	5
Asian, Asian Am, Pacific Islan	331
Black, non-Hispanic	105
Hispanic/Latino	99
Multi Racial	35
Native American	13
No Response	539
White, non-Hispanic	1146

kable(table(dAll1$nonHispNonWhite),col.names=c("Does the student identify as non-White and/or Hispanic? (1=yes)","Number of students"))

Does the student identify as non-White and/or Hispanic? (1=yes)	Number of students
0	1146
1	586

kable(table(dAll1$year),col.names=c("Cohort ID name","Number of students"))

Cohort ID name	Number of students
12	288
13	273
14	289
15	277
16	281
17	286
18	285
19	293

Descriptives for International students

dfSummary(dInt1[,c(intVariables)], plain.ascii = FALSE, style = "grid", 
          graph.magnif = 0.75, valid.col = FALSE, tmp.img.dir = "/tmp",round.digits=3)

Data Frame Summary

dInt1
Dimensions: 918 x 28
Duplicates: 144

No	Variable	Stats / Values	Freqs (% of Valid)	Missing
1	brokerageZ [numeric]	Mean (sd) : -0.329 (0.758) min < med < max: -2.543 < -0.311 < 1.868 IQR (CV) : 1.002 (-2.306)	699 distinct values	133 (14.5%)
2	IndegreeNonreciprocity [numeric]	Mean (sd) : 0.525 (0.24) min < med < max: 0 < 0.52 < 1 IQR (CV) : 0.381 (0.458)	285 distinct values	133 (14.5%)
3	OutdegreeNonreciprocity [numeric]	Mean (sd) : 0.442 (0.233) min < med < max: 0 < 0.462 < 1 IQR (CV) : 0.337 (0.528)	264 distinct values	135 (14.7%)
4	numAmericanFriends [integer]	Mean (sd) : 11.302 (13.339) min < med < max: 0 < 7 < 106 IQR (CV) : 11.25 (1.18)	61 distinct values	134 (14.6%)
5	numInternationalFriends [integer]	Mean (sd) : 12.19 (10.304) min < med < max: 0 < 10 < 64 IQR (CV) : 14 (0.845)	51 distinct values	134 (14.6%)
6	numFriends [integer]	Mean (sd) : 23.492 (21.453) min < med < max: 1 < 18 < 148 IQR (CV) : 25 (0.913)	94 distinct values	134 (14.6%)
7	percentofFriendsAmerican [numeric]	Mean (sd) : 0.43 (0.243) min < med < max: 0 < 0.439 < 1 IQR (CV) : 0.339 (0.566)	287 distinct values	134 (14.6%)
8	diversityofNetworkFriends [numeric]	Mean (sd) : 58.505 (20.971) min < med < max: 0 < 63.599 < 88.889 IQR (CV) : 23.75 (0.358)	507 distinct values	134 (14.6%)
9	ESCIconflict [numeric]	Mean (sd) : 0.024 (1.006) min < med < max: -2.624 < 0.077 < 2.169 IQR (CV) : 1.394 (42.417)	23 distinct values	467 (50.9%)
10	ESCIconflictPeer [numeric]	Mean (sd) : -0.344 (1.014) min < med < max: -4.062 < -0.299 < 2.341 IQR (CV) : 1.343 (-2.95)	252 distinct values	441 (48.0%)
11	ESCIempathy [numeric]	Mean (sd) : 0.15 (0.938) min < med < max: -2.835 < 0.004 < 1.779 IQR (CV) : 1.065 (6.262)	20 distinct values	467 (50.9%)
12	ESCIempathyPeer [numeric]	Mean (sd) : -0.037 (0.988) min < med < max: -3.461 < 0.098 < 2.042 IQR (CV) : 1.23 (-26.568)	243 distinct values	441 (48.0%)
13	ESCIInfluence [numeric]	Mean (sd) : -0.071 (1.041) min < med < max: -3.682 < 0.039 < 2.068 IQR (CV) : 1.353 (-14.734)	26 distinct values	467 (50.9%)
14	ESCIinfluencePeer [numeric]	Mean (sd) : -0.334 (1.047) min < med < max: -3.778 < -0.277 < 2.492 IQR (CV) : 1.351 (-3.137)	281 distinct values	441 (48.0%)
15	ESCIteamwork [numeric]	Mean (sd) : 0.05 (0.99) min < med < max: -3.913 < 0.056 < 1.499 IQR (CV) : 1.443 (19.846)	18 distinct values	467 (50.9%)
16	ESCIteamworkPeer [numeric]	Mean (sd) : -0.231 (0.995) min < med < max: -3.812 < -0.129 < 1.866 IQR (CV) : 1.234 (-4.317)	153 distinct values	441 (48.0%)
17	Openness_z [numeric]	Mean (sd) : 0.031 (0.951) min < med < max: -3.104 < 0.044 < 2.177 IQR (CV) : 1.356 (30.232)	118 distinct values	433 (47.2%)
18	extraversion_z [numeric]	Mean (sd) : -0.209 (0.976) min < med < max: -3.137 < -0.102 < 1.921 IQR (CV) : 1.349 (-4.671)	28 distinct values	722 (78.6%)
19	self_monitor_factor1 [numeric]	Mean (sd) : -0.301 (1.051) min < med < max: -2.298 < -0.188 < 1.219 IQR (CV) : 1.407 (-3.489)	6 distinct values	626 (68.2%)
20	self_monitor_factor2 [numeric]	Mean (sd) : -0.103 (0.975) min < med < max: -1.379 < -0.24 < 2.038 IQR (CV) : 1.281 (-9.434)	7 distinct values	626 (68.2%)
21	self_monitor_factor3 [numeric]	Mean (sd) : -0.002 (0.969) min < med < max: -1.291 < -0.316 < 1.633 IQR (CV) : 0.975 (-402.651)	4 distinct values	626 (68.2%)
22	alesina_ethnic_fractionalization [numeric]	Mean (sd) : -0.01 (1.001) min < med < max: -1.512 < 0.336 < 2.682 IQR (CV) : 1.768 (-100.371)	67 distinct values	133 (14.5%)
23	ancestral_diversity_hhiScore [numeric]	Mean (sd) : 0.002 (1.007) min < med < max: -1.106 < -0.491 < 1.905 IQR (CV) : 2.256 (403.048)	56 distinct values	145 (15.8%)
24	english_official_language_wiki [integer]	Min : 0 Mean : 0.366 Max : 1	0 : 495 (63.4%) 1 : 286 (36.6%)	137 (14.9%)
25	nation_tightness [numeric]	Mean (sd) : 8.066 (2.575) min < med < max: 1.6 < 7.9 < 12.3 IQR (CV) : 4.175 (0.319)	23 distinct values	316 (34.4%)
26	uk_classified_majority_english_speaking [integer]	Min : 0 Mean : 0.091 Max : 1	0 : 710 (90.9%) 1 : 71 ( 9.1%)	137 (14.9%)
27	relational_mobility [numeric]	Mean (sd) : 0.065 (0.219) min < med < max: -0.414 < 0.138 < 0.359 IQR (CV) : 0.21 (3.384)	24 distinct values	551 (60.0%)
28	percent_of_population_that_is_americanLog [numeric]	Mean (sd) : 0.166 (0.288) min < med < max: 0.002 < 0.033 < 1.309 IQR (CV) : 0.224 (1.738)	28 distinct values	247 (26.9%)

Descriptives for American students

dfSummary(dAm1[,c(amVariables)], plain.ascii = FALSE, style = "grid", 
          graph.magnif = 0.75, valid.col = FALSE, tmp.img.dir = "/tmp",round.digits=3)

Data Frame Summary

dAm1
Dimensions: 1594 x 25
Duplicates: 132

No	Variable	Stats / Values	Freqs (% of Valid)	Missing
1	brokerageZ [numeric]	Mean (sd) : -0.018 (0.802) min < med < max: -2.543 < -0.006 < 2.338 IQR (CV) : 1.081 (-44.525)	1358 distinct values	133 (8.3%)
2	IndegreeNonreciprocity [numeric]	Mean (sd) : 0.502 (0.225) min < med < max: 0 < 0.5 < 1 IQR (CV) : 0.333 (0.449)	509 distinct values	134 (8.4%)
3	OutdegreeNonreciprocity [numeric]	Mean (sd) : 0.406 (0.213) min < med < max: 0 < 0.406 < 1 IQR (CV) : 0.311 (0.525)	428 distinct values	138 (8.7%)
4	numAmericanFriends [integer]	Mean (sd) : 22.187 (19.332) min < med < max: 0 < 17 < 137 IQR (CV) : 22 (0.871)	99 distinct values	138 (8.7%)
5	numInternationalFriends [integer]	Mean (sd) : 5.396 (6.373) min < med < max: 0 < 3 < 56 IQR (CV) : 6 (1.181)	41 distinct values	138 (8.7%)
6	numFriends [integer]	Mean (sd) : 27.582 (24.113) min < med < max: 1 < 22 < 180 IQR (CV) : 26 (0.874)	116 distinct values	138 (8.7%)
7	percentofFriendsAmerican [numeric]	Mean (sd) : 0.803 (0.156) min < med < max: 0 < 0.828 < 1 IQR (CV) : 0.182 (0.194)	364 distinct values	138 (8.7%)
8	diversityofNetworkFriends [numeric]	Mean (sd) : 30.83 (19.502) min < med < max: 0 < 30.775 < 86.42 IQR (CV) : 28.087 (0.633)	711 distinct values	138 (8.7%)
9	ESCIconflict [numeric]	Mean (sd) : -0.014 (0.999) min < med < max: -4.106 < 0.077 < 2.169 IQR (CV) : 1.046 (-71.575)	29 distinct values	727 (45.6%)
10	ESCIconflictPeer [numeric]	Mean (sd) : 0.18 (0.943) min < med < max: -3.999 < 0.235 < 2.552 IQR (CV) : 1.264 (5.234)	322 distinct values	680 (42.7%)
11	ESCIempathy [numeric]	Mean (sd) : -0.075 (1.022) min < med < max: -3.545 < 0.004 < 1.779 IQR (CV) : 1.42 (-13.609)	23 distinct values	727 (45.6%)
12	ESCIempathyPeer [numeric]	Mean (sd) : 0.024 (1) min < med < max: -3.374 < 0.142 < 2.217 IQR (CV) : 1.245 (41.508)	306 distinct values	680 (42.7%)
13	ESCIInfluence [numeric]	Mean (sd) : 0.035 (0.978) min < med < max: -3.006 < 0.039 < 2.068 IQR (CV) : 1.353 (27.886)	28 distinct values	727 (45.6%)
14	ESCIinfluencePeer [numeric]	Mean (sd) : 0.175 (0.929) min < med < max: -3.491 < 0.235 < 2.85 IQR (CV) : 1.189 (5.303)	353 distinct values	680 (42.7%)
15	ESCIteamwork [numeric]	Mean (sd) : -0.024 (1.004) min < med < max: -4.274 < 0.056 < 1.499 IQR (CV) : 1.443 (-41.623)	21 distinct values	727 (45.6%)
16	ESCIteamworkPeer [numeric]	Mean (sd) : 0.12 (0.979) min < med < max: -4.388 < 0.221 < 1.866 IQR (CV) : 1.317 (8.175)	177 distinct values	680 (42.7%)
17	Openness_z [numeric]	Mean (sd) : -0.015 (1.025) min < med < max: -3.504 < 0.042 < 2.245 IQR (CV) : 1.356 (-66.403)	145 distinct values	658 (41.3%)
18	extraversion_z [numeric]	Mean (sd) : 0.108 (0.997) min < med < max: -2.8 < 0.235 < 2.258 IQR (CV) : 1.264 (9.217)	31 distinct values	1219 (76.5%)
19	self_monitor_factor1 [numeric]	Mean (sd) : 0.156 (0.938) min < med < max: -2.298 < 0.515 < 1.219 IQR (CV) : 0.703 (6.002)	6 distinct values	1030 (64.6%)
20	self_monitor_factor2 [numeric]	Mean (sd) : 0.054 (1.011) min < med < max: -1.379 < -0.24 < 2.038 IQR (CV) : 1.708 (18.734)	7 distinct values	1030 (64.6%)
21	self_monitor_factor3 [numeric]	Mean (sd) : 0 (1.017) min < med < max: -1.291 < -0.316 < 1.633 IQR (CV) : 1.949 (12938.52)	4 distinct values	1030 (64.6%)
22	population_density_USLog [numeric]	Mean (sd) : 7.421 (1.025) min < med < max: 1.609 < 7.346 < 9.589 IQR (CV) : 1.261 (0.138)	521 distinct values	374 (23.5%)
23	ipumsHHIDiversity [numeric]	Mean (sd) : -0.002 (1.003) min < med < max: -2.701 < 0.045 < 1.908 IQR (CV) : 1.32 (-418.76)	180 distinct values	725 (45.5%)
24	US_county_SCI_international_connectednessLog [numeric]	Mean (sd) : -0.002 (1.003) min < med < max: -3.383 < -0.041 < 2.087 IQR (CV) : 1.12 (-488.786)	276 distinct values	279 (17.5%)
25	US_state_tightness [numeric]	Mean (sd) : 42.266 (10.874) min < med < max: 27.37 < 39.42 < 78.86 IQR (CV) : 13.81 (0.257)	44 distinct values	279 (17.5%)

Correlations of major variables separately for American and International students

Code for data table heatmap: https://rstudio.github.io/DT/010-style.html

International students

dIntCor <-data.frame(round(cor(dInt1[,intVariables],use="pairwise.complete.obs"),3))

diag(dIntCor) <-NA
brks <-seq(-1,1,.1)
clrsneg <- rev(round(seq(255, 40, length.out = length(brks)/2), 0) %>%
  {paste0("rgb(255,", ., ",", ., ")")})
clrspos <- round(seq(255, 40, length.out = length(brks)/2), 0) %>%
  {paste0("rgb(",.,",255,",.,")")}
clrs <-c(clrsneg,clrspos)
DT::datatable(dIntCor) %>% formatStyle(names(dIntCor),backgroundColor = styleInterval(brks, clrs))
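The palette construction above depends on a length constraint: `styleInterval(cuts, values)` in DT expects one more value than there are cut points. The sketch below rebuilds the same colors in base R without the pipe so the lengths can be inspected. `n_half` and `shade` are names introduced here for illustration; the explicit `ceiling()` stands in for the rounding-up that `seq()` applies to a fractional `length.out`.

```r
brks <- seq(-1, 1, 0.1)                 # 21 cut points
n_half <- ceiling(length(brks) / 2)     # 11 colors per half

# Channel values running from 255 (white) down to 40 (saturated).
shade <- round(seq(255, 40, length.out = n_half))

# Negative correlations: red, fading to white as r approaches 0.
clrsneg <- rev(paste0("rgb(255,", shade, ",", shade, ")"))
# Positive correlations: white near 0, growing greener as r approaches 1.
clrspos <- paste0("rgb(", shade, ",255,", shade, ")")

clrs <- c(clrsneg, clrspos)

print(length(brks))
print(length(clrs))
```

With 21 breaks and 22 colors, the vector satisfies the one-more-than-cuts requirement.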

American students

dAmCor <-data.frame(round(cor(dAm1[,amVariables],use="pairwise.complete.obs"),3))

diag(dAmCor) <-NA
brks <-seq(-1,1,.1)
clrsneg <- rev(round(seq(255, 40, length.out = length(brks)/2), 0) %>%
  {paste0("rgb(255,", ., ",", ., ")")})
clrspos <- round(seq(255, 40, length.out = length(brks)/2), 0) %>%
  {paste0("rgb(",.,",255,",.,")")}
clrs <-c(clrsneg,clrspos)
DT::datatable(dAmCor) %>% formatStyle(names(dAmCor),backgroundColor = styleInterval(brks, clrs))

Population variables predicting student social network variables

# predictors for international students
intPreds <-c("ancestral_diversity_hhiScore",
"alesina_ethnic_fractionalization")

# covariates for international students
intCovs <-c( "nonHispNonWhite","english_official_language_wiki","uk_classified_majority_english_speaking", "nation_tightness","relational_mobility","percent_of_population_that_is_americanLog","numFriends")

# predictors for American students
amPreds <- c("US_county_SCI_international_connectednessLog","ipumsHHIDiversity")

# covariates for American students
amCovs <-c("nonHispNonWhite","US_state_tightness","population_density_USLog","numFriends")

# network analysis outcome variables
networkVars <- c("brokerageZ","IndegreeNonreciprocity","OutdegreeNonreciprocity","percentofFriendsAmerican","diversityofNetworkFriends")

# personality outcomes/moderators
personalityVars <- c("ESCIconflict","ESCIconflictPeer","ESCIempathy","ESCIempathyPeer","ESCIInfluence","ESCIinfluencePeer","ESCIteamwork","ESCIteamworkPeer","Openness_z","extraversion_z","self_monitor_factor1","self_monitor_factor2","self_monitor_factor3")

Models predicting international students’ network positions

We loop through 5 network outcome variables, predicting each of them from the historical and present-day cultural diversity of students’ home nations.

Network outcome variables:
* Brokerage: our key network variable.
* Indegree nonreciprocity: the proportion of incoming ties that the student does not reciprocate.
* Outdegree nonreciprocity: the proportion of outgoing ties that are not reciprocated by others.
* Percent of friends who are American: basically, how international are the student’s friends?
* Diversity of network friends: reversed HHI score for each student’s friends’ ethnicities.
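For readers unfamiliar with the reversed Herfindahl-Hirschman index (HHI), here is a minimal sketch. The function `reversed_hhi` is hypothetical, and the 0 to 100 scaling is an assumption chosen to be consistent with the observed range of diversityofNetworkFriends in the descriptives; the actual computation is documented in the “Data Preparation” files.

```r
# Hypothetical helper: reversed HHI of a vector of group labels.
# Assumes a 0-100 scale, where 0 means all friends share one group.
reversed_hhi <- function(groups) {
  groups <- groups[!is.na(groups)]
  if (length(groups) == 0) return(NA_real_)
  shares <- as.numeric(table(groups)) / length(groups)
  # HHI is the sum of squared shares; reversing it makes
  # higher values mean more diversity.
  100 * (1 - sum(shares^2))
}

# All friends from one group: no diversity.
print(reversed_hhi(c("a", "a", "a")))
# Nine friends spread evenly over nine groups.
print(round(reversed_hhi(letters[1:9]), 3))
```

Under this assumed scaling, nine equally represented groups give a score of about 88.889.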

Models are as follows: stan_glmer(network_variable ~ diversity_variable + (1|nationality))
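The loop over outcomes and predictors can be sketched as follows. This is an illustration of the model template above, not the authors' exact script: `model_grid` and `fit_one` are hypothetical names, the sampler settings are left at rstanarm defaults, and the fitting call is left commented out because it needs the real data and takes time to run.

```r
# Outcome and predictor names as defined earlier in this document.
networkVars <- c("brokerageZ", "IndegreeNonreciprocity", "OutdegreeNonreciprocity",
                 "percentofFriendsAmerican", "diversityofNetworkFriends")
intPreds <- c("ancestral_diversity_hhiScore", "alesina_ethnic_fractionalization")

# One row per outcome-by-predictor combination.
model_grid <- expand.grid(dv = networkVars, iv = intPreds, stringsAsFactors = FALSE)
model_grid$formula <- paste0(model_grid$dv, " ~ ", model_grid$iv, " + (1 | nationality)")

# Hypothetical helper: fit one multilevel model with a random
# intercept for nationality (requires the rstanarm package).
fit_one <- function(formula_string, data) {
  rstanarm::stan_glmer(as.formula(formula_string), data = data, refresh = 0)
}

# Not run here:
# fits <- lapply(model_grid$formula, fit_one, data = dInt1)

print(nrow(model_grid))
print(model_grid$formula[1])
```

Five outcomes crossed with two predictors give ten models for the international students.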

Slope estimates from all international student models with an added column at the end that indicates whether the 95% credibility interval excludes 0 (in other words, whether there “is an effect”). The cultural variable of interest in each row is identified under “iv” and the network variable of interest in each row is identified under “dv.” Search for variables of interest using the search bar.
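The “excludes 0” flag amounts to checking whether both interval bounds fall on the same side of zero. A minimal sketch, using the 95% interval bounds of the four brokerage slopes reported in the “Four key model outputs” table in this document (the data frame `slopes` is built by hand here for illustration):

```r
# Interval bounds copied from the brokerage slope rows of the
# "Four key model outputs" table.
slopes <- data.frame(
  iv = c("ancestral_diversity_hhiScore", "alesina_ethnic_fractionalization",
         "ipumsHHIDiversity", "US_county_SCI_international_connectednessLog"),
  lowerCI = c(0.040, 0.022, 0.012, -0.001),
  upperCI = c(0.240, 0.193, 0.125, 0.094)
)

# TRUE when the whole interval lies above 0 or below 0.
slopes$excludes0 <- slopes$lowerCI > 0 | slopes$upperCI < 0

print(slopes[, c("iv", "excludes0")])
```

Three of the four intervals exclude 0; the interval for US_county_SCI_international_connectednessLog narrowly includes it.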

intModelOutputs <-read.csv("intModelOutputs.csv")
DT::datatable(intModelOutputs,caption="Predicting international students' social network positions")

Are the International student predictors moderated by time?

Are the effects with credibility intervals that exclude 0 moderated by time of social network survey? Recall that students completed 2 surveys in their first year in the MBA program, typically once in the fall and once in the spring. We can ask whether the effects of culture on network position change over the course of the year.

Here are the estimates from the models testing for moderation by time. The first set of means, MCSEs, SDs, CIs, effective sample sizes, and Rhats is for the cultural predictor; the second set is for the main effect of time; the third set is for the interaction term. The final column indicates whether the CI for the interaction between the culture variable and time excludes 0, that is, whether the effect of culture appears to differ between the two surveys.

intTimeModelOutputs <-read.csv("intTimeModelOutputs.csv")

DT::datatable(intTimeModelOutputs,caption="Predicting International students' networks with time of survey as moderator")

Models predicting American students’ network positions

Repeat the above steps for American students using their county-level diversity measures as predictors.

See above instructions for interpreting this table.

amModelOutputs <-read.csv("amModelOutputs.csv")

DT::datatable(amModelOutputs,caption="Predicting American students' social network positions")

Are the American student predictors moderated by time?

Are the effects with credibility intervals that exclude 0 moderated by time of social network survey?

Instructions for interpreting this table are the same as for the international student table.

amTimeModelOutputs <-read.csv("amTimeModelOutputs.csv")

DT::datatable(amTimeModelOutputs,caption="Predicting American students' networks with time of survey as moderator")

Four key model outputs

Table with just the slopes and intercepts from our 4 central models reported in the paper: our four cultural diversity dimensions predicting brokerage.

intModelOutputs$X <-NULL
amModelOutputs$X <-NULL
modTable <-rbind(intModelOutputs[intModelOutputs$dv=="brokerageZ"&intModelOutputs$iv=="ancestral_diversity_hhiScore",],
      intModelOutputs[intModelOutputs$dv=="brokerageZ"&intModelOutputs$iv=="alesina_ethnic_fractionalization",],
     amModelOutputs[amModelOutputs$dv=="brokerageZ"&amModelOutputs$iv=="ipumsHHIDiversity",],
    amModelOutputs[amModelOutputs$dv=="brokerageZ"&amModelOutputs$iv=="US_county_SCI_international_connectednessLog",])

modTable <-modTable[!is.na(modTable$mean),]
modTable$dv <-NULL
modTable$excludes0 <-NULL

modTable$iv <- varRecode(modTable$iv,c("ancestral_diversity_hhiScore","alesina_ethnic_fractionalization","ipumsHHIDiversity","US_county_SCI_international_connectednessLog"),c("International students,\nhistorical diversity","International students,\npresent-day diversity","American students,\nhistorical diversity","American students,\npresent-day diversity"))
modTable$order <-varRecode(modTable$iv,c("International students,\nhistorical diversity","International students,\npresent-day diversity","American students,\nhistorical diversity","American students,\npresent-day diversity"),c(1,2,3,4))

modTableSlopes <-modTable[,c(1:8,18)]
modTableIntercepts <-modTable[,c(1,9:15,18)]
modTableSlopes <-varRename(modTableSlopes, c("iv","mean","mcse","sd","lowerCI","upperCI","Rhat"),c("Cultural Diversity Measure","B","MCSE","SD","2.5% C.I.","97.5% C.I.","R hat"))
modTableSlopes$Parameter <-"Slope"
modTableIntercepts <-varRename(modTableIntercepts, c("iv","meanIntercept","mcseIntercept","sdIntercept","lowerCIIntercept","upperCIIntercept","RhatIntercept","n_effIntercept"),c("Cultural Diversity Measure","B","MCSE","SD","2.5% C.I.","97.5% C.I.","R hat","n_eff"))
modTableIntercepts$Parameter <-"Intercept"
modTableLong <-rbind(modTableSlopes,modTableIntercepts)
modTableLong <-modTableLong[,c(1,10,2,3,4,5,6,7,8,9)]
modTableLong <-arrange(modTableLong,order,Parameter)

modTableLong[,1]<-ifelse(modTableLong$Parameter =="Slope","",modTableLong[,1])
modTableLong[,c(3:9)]<-round(modTableLong[,c(3:9)],3)
modTableLong$order <-NULL
write.csv(modTableLong,"diversity_network_model_summaries.csv",row.names=F)
kable(modTableLong,digits=3,row.names = F)

Cultural Diversity Measure	Parameter	B	MCSE	SD	2.5% C.I.	97.5% C.I.	n_eff	R hat
International students, historical diversity	Intercept	-0.302	0.001	0.050	-0.401	-0.202	1921	1.001
	Slope	0.139	0.001	0.051	0.040	0.240	1676	1.004
International students, present-day diversity	Intercept	-0.265	0.001	0.050	-0.360	-0.163	1418	1.003
	Slope	0.107	0.001	0.044	0.022	0.193	1758	1.000
American students, historical diversity	Intercept	-0.011	0.000	0.029	-0.071	0.046	6465	0.999
	Slope	0.068	0.000	0.028	0.012	0.125	6488	1.000
American students, present-day diversity	Intercept	-0.023	0.000	0.025	-0.071	0.025	6522	0.999
	Slope	0.047	0.000	0.024	-0.001	0.094	7325	0.999

Population variables predicting personality traits in students

Models predicting personality measures among international students

Now we’ll loop through our personality measures of interest and see whether cultural diversity predicts them within international students.

Instructions for interpreting this table are the same as above; the “dv” column indicates the personality variable for each row and the “iv” column indicates the cultural diversity variable.

intPersModelOutputs <-read.csv("intPersModelOutputs.csv")

DT::datatable(intPersModelOutputs,caption="Predicting international students' personality characteristics")

Do the personality variables moderate the effect of diversity on brokerage?

Take the personality variables that were predicted by diversity and see whether they moderate the effect of diversity on brokerage.

The “iv” column indicates the cultural diversity variable and the “moderator” column indicates the personality variable. The first set of model statistics in each row is for cultural diversity; the second set is for the personality moderator; and the third set is for their interaction term. The final column indicates whether the interaction term’s credibility interval excludes 0.

intPersModeratorModelOutputs <-read.csv("intPersModeratorModelOutputs.csv")

DT::datatable(intPersModeratorModelOutputs,caption="Predicting International students' brokerage from interaction between diversity and relevant personality variables")

Models predicting personality measures among American students

Repeat the same steps using the American student cultural diversity measures.

See the instructions above for interpreting the table.

amPersModelOutputs <-read.csv("amPersModelOutputs.csv")

DT::datatable(amPersModelOutputs,caption="Predicting American students' personality characteristics")

Controlling for other covariates

We now return to the estimated effects whose 95% credibility intervals exclude 0 and test whether they hold when controlling for key potential confounds. Most of these covariates are unavailable for the full set of counties/nations, so including them reduces our effective sample size. Interpret these findings with caution.

In International student models

Here are the covariates we will add to our earlier international student models:

nonHispNonWhite: Whether the participant identifies as non-White and/or Hispanic (1 = yes)
english_official_language_wiki: Whether the home nation has English as an official language according to Wikipedia
uk_classified_majority_english_speaking: Whether the UK classifies the home nation as having a majority English-speaking population
nation_tightness: The tightness-looseness score of the home nation as quantified by Gelfand et al.
relational_mobility: The relational mobility score of the home nation as quantified by Thomson et al.
percent_of_population_that_is_americanLog: The log-transformed percent of the home nation’s population that is American expatriates.
numFriends: The number of friends in the student’s network (American plus international friends).

The “iv” column indicates which cultural diversity variable is the focus of the model. The “covariate” column indicates which covariate was added. The first set of model coefficient columns contains estimates for the cultural diversity parameter; the second set is for the covariate that was added to the model.
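To illustrate why adding a covariate can shrink the usable sample, the sketch below builds one formula per covariate and counts the rows with complete data on every variable in that formula. The data frame `toy` and its values are invented, and adding one covariate at a time with a random intercept for nationality is an assumption based on the table layout described above.

```r
# Hypothetical toy data with missing values scattered across columns.
toy <- data.frame(
  brokerageZ = c(0.1, -0.3, 0.5, NA, 0.2, -0.1),
  ancestral_diversity_hhiScore = c(1.2, -0.4, 0.3, 0.8, NA, -1.0),
  nationality = c("a", "a", "b", "b", "c", "c"),
  nation_tightness = c(8.1, NA, 7.9, 6.0, 5.5, 6.5),
  relational_mobility = c(NA, NA, 0.1, 0.2, NA, -0.3)
)

covs <- c("nation_tightness", "relational_mobility")

# One formula per covariate, each adding a single covariate
# to the diversity predictor.
cov_formulas <- lapply(covs, function(cv) {
  as.formula(paste("brokerageZ ~ ancestral_diversity_hhiScore +", cv,
                   "+ (1 | nationality)"))
})
names(cov_formulas) <- covs

# Rows usable in each model: complete on all variables in the formula.
n_complete <- sapply(cov_formulas, function(f) {
  sum(complete.cases(toy[, all.vars(f)]))
})

print(n_complete)
```

In this toy example, four rows are complete on the outcome and predictor alone, three remain once nation_tightness is added, and two remain once relational_mobility is added.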

intCovModelOutputs <-read.csv("intCovModelOutputs.csv")

DT::datatable(intCovModelOutputs,caption="Predicting International students' networks while controlling for covariates")

In American models

We controlled for the following covariates:
* nonHispNonWhite: Whether the participant identifies as non-White and/or Hispanic (1 = yes)
* US_state_tightness: The tightness-looseness score of the participant’s home state, as quantified by Gelfand et al.
* population_density_USLog: The log-transformed population density of the students’ home counties.

amCovModelOutputs <-read.csv("amCovModelOutputs.csv")

DT::datatable(amCovModelOutputs,caption="Predicting American students' networks while controlling for covariates")

Graphing model estimates and raw data for key findings

International historical diversity predicting brokerage

# Graph Nation-level ancestral diversity 

plotIH <- ggplot(dIntPCA,aes_string(x="ancestral_diversity_hhiScore",y="brokerageZ"))+
    geom_point(position=position_jitter(width=.05),alpha=.3,color="#57575f")+
    #geom_errorbar(data=dIntAg,aes(ymin=CIlo,ymax=CIhi),alpha=.8,width=.1,color="#84a0ad")+
    geom_point(data=dIntAg,aes(color=ancestral_diversity_hhiScore))+#color="#28647c")+
    scale_colour_gradient(low = "dark red", high = "blue")+
    geom_abline(intercept=bm1c[1,1],slope=bm1c[2,1],color="#ca3542")+
    labs(x="Historical cultural diversity of international students' home nations\n(Reversed Herfindahl index of migration matrix, standardized)",
         y="Social brokerage (standardized)")+
    ggrepel::geom_text_repel(data=dIntAg,aes(label=nationality),size=2.5,segment.size  = 0.2,
    segment.color = "grey50")+theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),legend.position = "none")+
    scale_y_continuous(breaks=c(-5,-4,-3,-2,-1,0,1))

plotIH

fig <-ggdraw(plotIH)
ggsave("graph_brokerage_ancestral_diversity_international_students.png",fig,width=7,height=6)

International present-day diversity predicting brokerage

# Graph Nation-level present-day diversity 


plotIP <- ggplot(dIntPCA,aes_string(x="alesina_ethnic_fractionalization",y="brokerageZ"))+
    geom_point(position=position_jitter(width=.05),alpha=.3,color="#57575f")+
    #geom_errorbar(data=dIntAg,aes(ymin=CIlo,ymax=CIhi),alpha=.8,width=.1,color="#84a0ad")+
    geom_point(data=dIntAg,aes(color=alesina_ethnic_fractionalization))+#color="#28647c")+
    scale_colour_gradient(low = "dark red", high = "blue")+
    geom_abline(intercept=bm1c[1,1],slope=bm1c[2,1],color="#ca3542")+
    labs(x="Present-day cultural diversity of international students' home nations\n(Ethnic fractionalization, standardized)",
         y="Social brokerage (standardized)")+
    ggrepel::geom_text_repel(data=dIntAg,aes(label=nationality),size=2.5,segment.size  = 0.2,
    segment.color = "grey50")+guides(color=FALSE)+theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),legend.position = "none",axis.title.y = element_blank())+
    scale_y_continuous(breaks=c(-5,-4,-3,-2,-1,0,1))

plotIP

fig <-ggdraw(plotIP)
ggsave("graph_brokerage_present-day_diversity_international_students.png",fig,width=7,height=6)

US historical diversity predicting brokerage

# Graph county-level historical diversity

plotAH <- ggplot(dAmPCA,aes_string(x="ipumsHHIDiversity",y="brokerageZ"))+
    geom_point(position=position_jitter(width=.0005),alpha=.2,color="#57575f")+
    geom_point(data=dAmAg, aes(color=ipumsHHIDiversity))+
    scale_colour_gradient(low = "dark red", high = "blue")+
    geom_abline(intercept=bm2c[1,1],slope=bm2c[2,1],color="#ca3542")+
    labs(x="Historical cultural diversity of American students' home counties\n(Reversed Herfindahl index of IPUMS ancestry, standardized)",
         y="Social brokerage (standardized)")+
    ggrepel::geom_text_repel(data=dAmAg,aes(label=labels),size=2.5,segment.size  = 0.2,
    segment.color = "grey50")+
    guides(color=FALSE)+theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),legend.position = "none")+
  scale_y_continuous(breaks=c(-3,-2,-1,0,1))
    

plotAH

fig<-ggdraw(plotAH)
ggsave("graph_historical_diversity_brokerage_americans.png",fig,width=7,height=6)

US present-day diversity predicting brokerage

# Graph county-level present-day diversity

plotAP <- ggplot(dAmPCA,aes_string(x="US_county_SCI_international_connectednessLog",y="brokerageZ"))+
    geom_point(position=position_jitter(width=.05),alpha=.2,color="#57575f")+
    geom_point(data=dAmAg, aes(color=US_county_SCI_international_connectednessLog))+#,color="#28647c")+
    scale_colour_gradient(low = "dark red", high = "blue")+
    geom_abline(intercept=bm2c[1,1],slope=bm2c[2,1],color="#ca3542")+
    labs(x="Present-day cultural diversity of American students' home counties\n(Log-transformed international Facebook connectedness, standardized)",
         y="Social brokerage (standardized)")+
    ggrepel::geom_text_repel(data=dAmAg,aes(label=labels),size=2.5,segment.size  = 0.2,
    segment.color = "grey50")+
    guides(color=FALSE)+theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),legend.position = "none",axis.title.y = element_blank())+
    scale_y_continuous(breaks=c(-3,-2,-1,0,1))

plotAP

fig<-ggdraw(plotAP)
ggsave("graph_present-day_brokerage_americans.png",fig,width=7,height=6)

Analyses for ‘Cultural Diversity Broadens Social Networks’

Adrienne Wood

8/3/2021
