The following code accompanies a manuscript by Adrienne Wood, Adam Kleinbaum, and Thalia Wheatley. Please contact Adrienne (adrienne.wood@virginia.edu) with any questions.
OSF Page: https://osf.io/ycpwt/
OSF Preprint: https://psyarxiv.com/qvthk/
In the following code, we use different measures of cultural diversity (both present-day and historical) to predict international and American students’ social network positions in a series of Bayesian multilevel models. Estimates from models of other network variables, such as the cultural diversity of students’ friends, are also reported, although none of these variables was predicted by our distal cultural diversity measures.
We also looked at the effect of cultural diversity on several “socially adaptable” personality traits, including self-monitoring and extraversion. For those that were predicted by culture, we then asked whether they moderated the effect of cultural diversity on brokerage.
See the “Data Preparation” R Markdown and html files on the OSF project page for details about how these variables were computed.
codebook <-read.csv("variablenames.csv")
DT::datatable(codebook,caption="Variables in dataset")
Preparatory steps:
1. Read in data and create lists of key variables
dAll <-read.csv("complete_culture_network_data.csv")
# reviewers requested this additional covariate:
# number of friends in network. We'll just add together our
# existing variables, number of American friends and number
# of international friends.
dAll$numFriends <-dAll$numAmericanFriends+dAll$numInternationalFriends
# create a list of key numeric student variables
studentVariables <-c("brokerageZ","IndegreeNonreciprocity","OutdegreeNonreciprocity","numAmericanFriends","numInternationalFriends","numFriends","percentofFriendsAmerican","diversityofNetworkFriends","ESCIconflict","ESCIconflictPeer","ESCIempathy","ESCIempathyPeer","ESCIInfluence","ESCIinfluencePeer","ESCIteamwork","ESCIteamworkPeer","Openness_z","extraversion_z","self_monitor_factor1","self_monitor_factor2","self_monitor_factor3")
# sublist of just personality measures
personalityVariables <-c("ESCIconflict","ESCIconflictPeer","ESCIempathy","ESCIempathyPeer","ESCIInfluence","ESCIinfluencePeer","ESCIteamwork","ESCIteamworkPeer","Openness_z","extraversion_z","self_monitor_factor1","self_monitor_factor2","self_monitor_factor3")
# create a list of additional international student variables
intVariables <-c(studentVariables,c("alesina_ethnic_fractionalization",
"ancestral_diversity_hhiScore",
"english_official_language_wiki",
"nation_tightness",
"uk_classified_majority_english_speaking",
"relational_mobility",
"percent_of_population_that_is_americanLog"))
# create a list of additional American student variables
amVariables <-c(studentVariables,c("population_density_USLog","ipumsHHIDiversity",
"US_county_SCI_international_connectednessLog",
"US_state_tightness"))
Standardize our personality measures
Separate American and International student datasets
dInt <-dAll[dAll$nationality!="united states",]
dInt <-dInt[dInt$nationality!="",]
dAm <-dAll[dAll$nationality=="united states",]
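library(dplyr)  # provides %>%, mutate(), and across()
# Standardize (z-score) the American students' county-level diversity measures
dAm <- dAm %>%
  mutate(across(.cols = c(ipumsHHIDiversity,
                          US_county_SCI_international_connectednessLog),
                ~ (.x - mean(.x, na.rm = TRUE)) / sd(.x, na.rm = TRUE)))
# Standardize (z-score) the international students' nation-level diversity measures
dInt <- dInt %>%
  mutate(across(.cols = c(alesina_ethnic_fractionalization,
                          ancestral_diversity_hhiScore),
                ~ (.x - mean(.x, na.rm = TRUE)) / sd(.x, na.rm = TRUE)))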
dInt1 <-dInt[dInt$sampleNumber==1,]
dAll1 <-dAll[dAll$sampleNumber==1,]
dAm1 <-dAm[dAm$sampleNumber==1,]
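As a quick check on the standardization step, the sketch below z-scores a column the same way as the `across()` calls (subtract the mean, divide by the standard deviation, ignoring missing values), using base R and toy data. The data frame `toy` and the helper `z_score` are illustrative assumptions, not part of the original analysis. It also shows why a subsample standardized on the full data can have a mean slightly different from 0, which may explain why the sample-1 summaries of the diversity measures are not centered exactly at 0.

```r
# Hypothetical helper mirroring ~(.x - mean(.x, na.rm = TRUE)) / sd(.x, na.rm = TRUE)
z_score <- function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

# Toy data (invented values, for illustration only)
toy <- data.frame(
  sampleNumber = c(1, 1, 1, 2, 2, 2),
  diversity    = c(0.10, 0.40, NA, 0.55, 0.70, 0.90)
)

# Standardize on the full data set, before subsetting
toy$diversity_z <- z_score(toy$diversity)

# On the full data the mean is 0 and the sd is 1 (up to floating point error)
full_mean <- mean(toy$diversity_z, na.rm = TRUE)
full_sd   <- sd(toy$diversity_z, na.rm = TRUE)

# A subsample is not re-centered, so its mean need not be 0
sub_mean <- mean(toy$diversity_z[toy$sampleNumber == 1], na.rm = TRUE)
```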
Create the table of all students’ descriptives for paper
Descriptives
How many students identify as international vs. American and as male vs. female, what their self-reported ethnicities are, and how many students were in each cohort.
# How many didn't report international status
kable(table(dAll1$intStatus),col.names=c("Status","Number of students"))
| Status | Number of students |
|---|---|
| American | 1461 |
| International | 785 |
print(paste(as.character(nrow(dAll1[is.na(dAll1$intStatus),]))," students did not report whether they were U.S. or international students.",sep=""))
[1] "41 students did not report whether they were U.S. or international students."
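The count of non-reporters can also be read directly off the frequency table, because `table()` drops missing values unless asked to keep them. A minimal sketch with made-up values (the vector `status` is hypothetical and stands in for `dAll1$intStatus`):

```r
# Toy status vector (invented values); NA marks students who did not report
status <- c("American", "International", NA, "American", NA)

# useNA = "ifany" adds an <NA> category when missing values are present
counts <- table(status, useNA = "ifany")

# The same count, computed directly
n_missing <- sum(is.na(status))
msg <- sprintf("%d students did not report their status.", n_missing)
```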
kable(table(dAll1$Gender),col.names=c("Gender","Number of students"))
| Gender | Number of students |
|---|---|
| 1 | |
| F | 836 |
| M | 1436 |
kable(table(dAll1$ethnicity),col.names=c("Ethnicity","Number of Students"))
| Ethnicity | Number of Students |
|---|---|
| 0 | 5 |
| Asian, Asian Am, Pacific Islan | 331 |
| Black, non-Hispanic | 105 |
| Hispanic/Latino | 99 |
| Multi Racial | 35 |
| Native American | 13 |
| No Response | 539 |
| White, non-Hispanic | 1146 |
kable(table(dAll1$nonHispNonWhite),col.names=c("Does the student identify as non-White and/or Hispanic? (1=yes)","Number of students"))
| Does the student identify as non-White and/or Hispanic? (1=yes) | Number of students |
|---|---|
| 0 | 1146 |
| 1 | 586 |
kable(table(dAll1$year),col.names=c("Cohort ID name","Number of students"))
| Cohort ID name | Number of students |
|---|---|
| 12 | 288 |
| 13 | 273 |
| 14 | 289 |
| 15 | 277 |
| 16 | 281 |
| 17 | 286 |
| 18 | 285 |
| 19 | 293 |
dfSummary(dInt1[,c(intVariables)], plain.ascii = FALSE, style = "grid",
graph.magnif = 0.75, valid.col = FALSE, tmp.img.dir = "/tmp",round.digits=3)
dInt1
Dimensions: 918 x 28
Duplicates: 144
| No | Variable | Stats / Values | Freqs (% of Valid) | Missing |
|---|---|---|---|---|
| 1 | brokerageZ [numeric] | Mean (sd): -0.329 (0.758); min < med < max: -2.543 < -0.311 < 1.868; IQR (CV): 1.002 (-2.306) | 699 distinct values | 133 (14.5%) |
| 2 | IndegreeNonreciprocity [numeric] | Mean (sd): 0.525 (0.24); min < med < max: 0 < 0.52 < 1; IQR (CV): 0.381 (0.458) | 285 distinct values | 133 (14.5%) |
| 3 | OutdegreeNonreciprocity [numeric] | Mean (sd): 0.442 (0.233); min < med < max: 0 < 0.462 < 1; IQR (CV): 0.337 (0.528) | 264 distinct values | 135 (14.7%) |
| 4 | numAmericanFriends [integer] | Mean (sd): 11.302 (13.339); min < med < max: 0 < 7 < 106; IQR (CV): 11.25 (1.18) | 61 distinct values | 134 (14.6%) |
| 5 | numInternationalFriends [integer] | Mean (sd): 12.19 (10.304); min < med < max: 0 < 10 < 64; IQR (CV): 14 (0.845) | 51 distinct values | 134 (14.6%) |
| 6 | numFriends [integer] | Mean (sd): 23.492 (21.453); min < med < max: 1 < 18 < 148; IQR (CV): 25 (0.913) | 94 distinct values | 134 (14.6%) |
| 7 | percentofFriendsAmerican [numeric] | Mean (sd): 0.43 (0.243); min < med < max: 0 < 0.439 < 1; IQR (CV): 0.339 (0.566) | 287 distinct values | 134 (14.6%) |
| 8 | diversityofNetworkFriends [numeric] | Mean (sd): 58.505 (20.971); min < med < max: 0 < 63.599 < 88.889; IQR (CV): 23.75 (0.358) | 507 distinct values | 134 (14.6%) |
| 9 | ESCIconflict [numeric] | Mean (sd): 0.024 (1.006); min < med < max: -2.624 < 0.077 < 2.169; IQR (CV): 1.394 (42.417) | 23 distinct values | 467 (50.9%) |
| 10 | ESCIconflictPeer [numeric] | Mean (sd): -0.344 (1.014); min < med < max: -4.062 < -0.299 < 2.341; IQR (CV): 1.343 (-2.95) | 252 distinct values | 441 (48.0%) |
| 11 | ESCIempathy [numeric] | Mean (sd): 0.15 (0.938); min < med < max: -2.835 < 0.004 < 1.779; IQR (CV): 1.065 (6.262) | 20 distinct values | 467 (50.9%) |
| 12 | ESCIempathyPeer [numeric] | Mean (sd): -0.037 (0.988); min < med < max: -3.461 < 0.098 < 2.042; IQR (CV): 1.23 (-26.568) | 243 distinct values | 441 (48.0%) |
| 13 | ESCIInfluence [numeric] | Mean (sd): -0.071 (1.041); min < med < max: -3.682 < 0.039 < 2.068; IQR (CV): 1.353 (-14.734) | 26 distinct values | 467 (50.9%) |
| 14 | ESCIinfluencePeer [numeric] | Mean (sd): -0.334 (1.047); min < med < max: -3.778 < -0.277 < 2.492; IQR (CV): 1.351 (-3.137) | 281 distinct values | 441 (48.0%) |
| 15 | ESCIteamwork [numeric] | Mean (sd): 0.05 (0.99); min < med < max: -3.913 < 0.056 < 1.499; IQR (CV): 1.443 (19.846) | 18 distinct values | 467 (50.9%) |
| 16 | ESCIteamworkPeer [numeric] | Mean (sd): -0.231 (0.995); min < med < max: -3.812 < -0.129 < 1.866; IQR (CV): 1.234 (-4.317) | 153 distinct values | 441 (48.0%) |
| 17 | Openness_z [numeric] | Mean (sd): 0.031 (0.951); min < med < max: -3.104 < 0.044 < 2.177; IQR (CV): 1.356 (30.232) | 118 distinct values | 433 (47.2%) |
| 18 | extraversion_z [numeric] | Mean (sd): -0.209 (0.976); min < med < max: -3.137 < -0.102 < 1.921; IQR (CV): 1.349 (-4.671) | 28 distinct values | 722 (78.6%) |
| 19 | self_monitor_factor1 [numeric] | Mean (sd): -0.301 (1.051); min < med < max: -2.298 < -0.188 < 1.219; IQR (CV): 1.407 (-3.489) | 6 distinct values | 626 (68.2%) |
| 20 | self_monitor_factor2 [numeric] | Mean (sd): -0.103 (0.975); min < med < max: -1.379 < -0.24 < 2.038; IQR (CV): 1.281 (-9.434) | 7 distinct values | 626 (68.2%) |
| 21 | self_monitor_factor3 [numeric] | Mean (sd): -0.002 (0.969); min < med < max: -1.291 < -0.316 < 1.633; IQR (CV): 0.975 (-402.651) | 4 distinct values | 626 (68.2%) |
| 22 | alesina_ethnic_fractionalization [numeric] | Mean (sd): -0.01 (1.001); min < med < max: -1.512 < 0.336 < 2.682; IQR (CV): 1.768 (-100.371) | 67 distinct values | 133 (14.5%) |
| 23 | ancestral_diversity_hhiScore [numeric] | Mean (sd): 0.002 (1.007); min < med < max: -1.106 < -0.491 < 1.905; IQR (CV): 2.256 (403.048) | 56 distinct values | 145 (15.8%) |
| 24 | english_official_language_wiki [integer] | Min: 0; Mean: 0.366; Max: 1 | 0: 495 (63.4%); 1: 286 (36.6%) | 137 (14.9%) |
| 25 | nation_tightness [numeric] | Mean (sd): 8.066 (2.575); min < med < max: 1.6 < 7.9 < 12.3; IQR (CV): 4.175 (0.319) | 23 distinct values | 316 (34.4%) |
| 26 | uk_classified_majority_english_speaking [integer] | Min: 0; Mean: 0.091; Max: 1 | 0: 710 (90.9%); 1: 71 (9.1%) | 137 (14.9%) |
| 27 | relational_mobility [numeric] | Mean (sd): 0.065 (0.219); min < med < max: -0.414 < 0.138 < 0.359; IQR (CV): 0.21 (3.384) | 24 distinct values | 551 (60.0%) |
| 28 | percent_of_population_that_is_americanLog [numeric] | Mean (sd): 0.166 (0.288); min < med < max: 0.002 < 0.033 < 1.309; IQR (CV): 0.224 (1.738) | 28 distinct values | 247 (26.9%) |
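The very large CV values in this summary (for example, -402.651 for `self_monitor_factor3`) are expected for standardized variables. Assuming CV here is the standard deviation divided by the mean, which matches the reported rows (0.758 / -0.329 is about -2.3 for `brokerageZ`), the ratio explodes as the mean approaches 0. A short illustration using rounded summary values copied from the table; the helper `cv` is hypothetical:

```r
# Coefficient of variation as sd / mean (assumed definition, see lead-in)
cv <- function(mean_value, sd_value) sd_value / mean_value

# brokerageZ: rounded inputs give roughly the reported -2.306
cv_brokerage <- cv(-0.329, 0.758)

# self_monitor_factor3: the mean is nearly 0, so the ratio is in the hundreds.
# The rounded mean (-0.002) cannot reproduce the reported -402.651 exactly,
# because the table was computed from unrounded values.
cv_smf3 <- cv(-0.002, 0.969)
```

For near-zero means the CV is therefore not informative, and the standard deviation alone is the more useful spread statistic.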
dfSummary(dAm1[,c(amVariables)], plain.ascii = FALSE, style = "grid",
graph.magnif = 0.75, valid.col = FALSE, tmp.img.dir = "/tmp",round.digits=3)
dAm1
Dimensions: 1594 x 25
Duplicates: 132
| No | Variable | Stats / Values | Freqs (% of Valid) | Missing |
|---|---|---|---|---|
| 1 | brokerageZ [numeric] | Mean (sd): -0.018 (0.802); min < med < max: -2.543 < -0.006 < 2.338; IQR (CV): 1.081 (-44.525) | 1358 distinct values | 133 (8.3%) |
| 2 | IndegreeNonreciprocity [numeric] | Mean (sd): 0.502 (0.225); min < med < max: 0 < 0.5 < 1; IQR (CV): 0.333 (0.449) | 509 distinct values | 134 (8.4%) |
| 3 | OutdegreeNonreciprocity [numeric] | Mean (sd): 0.406 (0.213); min < med < max: 0 < 0.406 < 1; IQR (CV): 0.311 (0.525) | 428 distinct values | 138 (8.7%) |
| 4 | numAmericanFriends [integer] | Mean (sd): 22.187 (19.332); min < med < max: 0 < 17 < 137; IQR (CV): 22 (0.871) | 99 distinct values | 138 (8.7%) |
| 5 | numInternationalFriends [integer] | Mean (sd): 5.396 (6.373); min < med < max: 0 < 3 < 56; IQR (CV): 6 (1.181) | 41 distinct values | 138 (8.7%) |
| 6 | numFriends [integer] | Mean (sd): 27.582 (24.113); min < med < max: 1 < 22 < 180; IQR (CV): 26 (0.874) | 116 distinct values | 138 (8.7%) |
| 7 | percentofFriendsAmerican [numeric] | Mean (sd): 0.803 (0.156); min < med < max: 0 < 0.828 < 1; IQR (CV): 0.182 (0.194) | 364 distinct values | 138 (8.7%) |
| 8 | diversityofNetworkFriends [numeric] | Mean (sd): 30.83 (19.502); min < med < max: 0 < 30.775 < 86.42; IQR (CV): 28.087 (0.633) | 711 distinct values | 138 (8.7%) |
| 9 | ESCIconflict [numeric] | Mean (sd): -0.014 (0.999); min < med < max: -4.106 < 0.077 < 2.169; IQR (CV): 1.046 (-71.575) | 29 distinct values | 727 (45.6%) |
| 10 | ESCIconflictPeer [numeric] | Mean (sd): 0.18 (0.943); min < med < max: -3.999 < 0.235 < 2.552; IQR (CV): 1.264 (5.234) | 322 distinct values | 680 (42.7%) |
| 11 | ESCIempathy [numeric] | Mean (sd): -0.075 (1.022); min < med < max: -3.545 < 0.004 < 1.779; IQR (CV): 1.42 (-13.609) | 23 distinct values | 727 (45.6%) |
| 12 | ESCIempathyPeer [numeric] | Mean (sd): 0.024 (1); min < med < max: -3.374 < 0.142 < 2.217; IQR (CV): 1.245 (41.508) | 306 distinct values | 680 (42.7%) |
| 13 | ESCIInfluence [numeric] | Mean (sd): 0.035 (0.978); min < med < max: -3.006 < 0.039 < 2.068; IQR (CV): 1.353 (27.886) | 28 distinct values | 727 (45.6%) |
| 14 | ESCIinfluencePeer [numeric] | Mean (sd): 0.175 (0.929); min < med < max: -3.491 < 0.235 < 2.85; IQR (CV): 1.189 (5.303) | 353 distinct values | 680 (42.7%) |
| 15 | ESCIteamwork [numeric] | Mean (sd): -0.024 (1.004); min < med < max: -4.274 < 0.056 < 1.499; IQR (CV): 1.443 (-41.623) | 21 distinct values | 727 (45.6%) |
| 16 | ESCIteamworkPeer [numeric] | Mean (sd): 0.12 (0.979); min < med < max: -4.388 < 0.221 < 1.866; IQR (CV): 1.317 (8.175) | 177 distinct values | 680 (42.7%) |
| 17 | Openness_z [numeric] | Mean (sd): -0.015 (1.025); min < med < max: -3.504 < 0.042 < 2.245; IQR (CV): 1.356 (-66.403) | 145 distinct values | 658 (41.3%) |
| 18 | extraversion_z [numeric] | Mean (sd): 0.108 (0.997); min < med < max: -2.8 < 0.235 < 2.258; IQR (CV): 1.264 (9.217) | 31 distinct values | 1219 (76.5%) |
| 19 | self_monitor_factor1 [numeric] | Mean (sd): 0.156 (0.938); min < med < max: -2.298 < 0.515 < 1.219; IQR (CV): 0.703 (6.002) | 6 distinct values | 1030 (64.6%) |
| 20 | self_monitor_factor2 [numeric] | Mean (sd): 0.054 (1.011); min < med < max: -1.379 < -0.24 < 2.038; IQR (CV): 1.708 (18.734) | 7 distinct values | 1030 (64.6%) |
| 21 | self_monitor_factor3 [numeric] | Mean (sd): 0 (1.017); min < med < max: -1.291 < -0.316 < 1.633; IQR (CV): 1.949 (12938.52) | 4 distinct values | 1030 (64.6%) |
| 22 | population_density_USLog [numeric] | Mean (sd): 7.421 (1.025); min < med < max: 1.609 < 7.346 < 9.589; IQR (CV): 1.261 (0.138) | 521 distinct values | 374 (23.5%) |
| 23 | ipumsHHIDiversity [numeric] | Mean (sd): -0.002 (1.003); min < med < max: -2.701 < 0.045 < 1.908; IQR (CV): 1.32 (-418.76) | 180 distinct values | 725 (45.5%) |
| 24 | US_county_SCI_international_connectednessLog [numeric] | Mean (sd): -0.002 (1.003); min < med < max: -3.383 < -0.041 < 2.087; IQR (CV): 1.12 (-488.786) | 276 distinct values | 279 (17.5%) |
| 25 | US_state_tightness [numeric] | Mean (sd): 42.266 (10.874); min < med < max: 27.37 < 39.42 < 78.86; IQR (CV): 13.81 (0.257) | 44 distinct values | 279 (17.5%) |
Code for data table heatmap: https://rstudio.github.io/DT/010-style.html
dIntCor <-data.frame(round(cor(dInt1[,intVariables],use="pairwise.complete.obs"),3))
diag(dIntCor) <-NA
brks <-seq(-1,1,.1)
clrsneg <- rev(round(seq(255, 40, length.out = length(brks)/2), 0) %>%
{paste0("rgb(255,", ., ",", ., ")")})
clrspos <- round(seq(255, 40, length.out = length(brks)/2), 0) %>%
{paste0("rgb(",.,",255,",.,")")}
clrs <-c(clrsneg,clrspos)
DT::datatable(dIntCor) %>% formatStyle(names(dIntCor),backgroundColor = styleInterval(brks, clrs))
dAmCor <-data.frame(round(cor(dAm1[,amVariables],use="pairwise.complete.obs"),3))
diag(dAmCor) <-NA
brks <-seq(-1,1,.1)
clrsneg <- rev(round(seq(255, 40, length.out = length(brks)/2), 0) %>%
{paste0("rgb(255,", ., ",", ., ")")})
clrspos <- round(seq(255, 40, length.out = length(brks)/2), 0) %>%
{paste0("rgb(",.,",255,",.,")")}
clrs <-c(clrsneg,clrspos)
DT::datatable(dAmCor) %>% formatStyle(names(dAmCor),backgroundColor = styleInterval(brks, clrs))
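`styleInterval()` expects one more color than there are cut points. The sketch below rebuilds the red-to-green ramp from the two heatmap chunks in base R (no pipes) to make that length relationship explicit. It relies on the documented behavior of `seq()`, which rounds a fractional `length.out` up, so `length(brks)/2` (10.5) yields 11 shades per side; here that rounding is written out with `ceiling()`. The names `n_side` and `shade` are introduced for illustration only.

```r
brks   <- seq(-1, 1, 0.1)             # 21 cut points from -1 to 1
n_side <- ceiling(length(brks) / 2)   # 11 shades on each side of zero

# Channel values running from 255 (white) down to 40 (saturated)
shade <- round(seq(255, 40, length.out = n_side), 0)

# Negative correlations: strongest red first, fading to white
clrs_neg <- rev(paste0("rgb(255,", shade, ",", shade, ")"))
# Positive correlations: white first, deepening to green
clrs_pos <- paste0("rgb(", shade, ",255,", shade, ")")

clrs <- c(clrs_neg, clrs_pos)         # 22 colors for 21 cut points
```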
Now we’ll loop through our personality measures of interest and see whether cultural diversity predicts them within international students.
Instructions for interpreting this table are the same as above; the “dv” column indicates the personality variable for each row and the “iv” column indicates the cultural diversity variable.
intPersModelOutputs <-read.csv("intPersModelOutputs.csv")
DT::datatable(intPersModelOutputs,caption="Predicting international students' personality characteristics")
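The looping step can be sketched as building one model formula per combination of personality outcome and diversity predictor. The example below only constructs the formulas with base R's `reformulate()`. The short variable lists are a subset chosen for illustration, and the actual model specification (priors, grouping terms, fitting function) lives in code not shown here, so nothing about it is assumed.

```r
# Illustrative subsets of the outcome and predictor lists
dvs <- c("extraversion_z", "Openness_z", "self_monitor_factor1")
ivs <- c("alesina_ethnic_fractionalization", "ancestral_diversity_hhiScore")

# One row per dv/iv combination (dv varies fastest)
grid <- expand.grid(dv = dvs, iv = ivs, stringsAsFactors = FALSE)

# Build a formula of the form dv ~ iv for each row
formulas <- Map(function(dv, iv) reformulate(iv, response = dv),
                grid$dv, grid$iv)

# Text version of each formula, e.g. for labeling rows of an output table
formula_text <- vapply(formulas,
                       function(f) paste(deparse(f), collapse = " "),
                       character(1))
```

Each formula would then be passed to the model-fitting function inside the loop, with the results collected into one row per combination.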
Take the personality variables that were predicted by diversity and see whether they moderate the effect of diversity on brokerage.
The “iv” column indicates the cultural diversity variable and the “moderator” column indicates the personality variable. The first set of model statistics in each row is for cultural diversity, the second set is for the personality moderator, and the third set is for their interaction term. The final column indicates whether the interaction term’s credibility interval excludes 0.
intPersModeratorModelOutputs <-read.csv("intPersModeratorModelOutputs.csv")
DT::datatable(intPersModeratorModelOutputs,caption="Predicting International students' brokerage from interaction between diversity and relevant personality variables")
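The rule behind the final column, whether the interaction term's credibility interval excludes 0, amounts to checking that both interval bounds share a sign. A minimal sketch with made-up estimates; the column names `lower` and `upper` and the helper `excludes_zero` are hypothetical and may not match the names in `intPersModeratorModelOutputs.csv`.

```r
# Toy interval bounds (invented values)
toy_estimates <- data.frame(
  term  = c("diversity", "moderator", "diversity:moderator"),
  lower = c(0.02, -0.10, -0.15),
  upper = c(0.20,  0.05, -0.01)
)

# An interval excludes 0 when it lies entirely above or entirely below 0
excludes_zero <- function(lower, upper) lower > 0 | upper < 0

toy_estimates$excludes0 <- excludes_zero(toy_estimates$lower,
                                         toy_estimates$upper)
```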
Repeat the same steps using the American student cultural diversity measures.
See the instructions above for interpreting the table.
amPersModelOutputs <-read.csv("amPersModelOutputs.csv")
DT::datatable(amPersModelOutputs,caption="Predicting American students' personality characteristics")
We will return to the estimated effects whose 95% credibility intervals exclude 0 and see whether they hold when controlling for key potential confounds. Most of these covariates are unavailable for the full set of counties/nations, so they restrict our effective sample size. Take any findings with a grain of salt.
Here are the covariates we will add to our earlier international student models:
The “iv” column indicates which cultural diversity variable is the focus of the model. The “covariate” column indicates which covariate was added. The first set of model coefficient columns contains estimates for the cultural diversity parameter; the second set is for the covariate that was added to the model.
intCovModelOutputs <-read.csv("intCovModelOutputs.csv")
DT::datatable(intCovModelOutputs,caption="Predicting International students' networks while controlling for covariates")
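To see how a sparsely available covariate shrinks the effective sample, one can count the rows that are complete for each model's variables. A sketch on toy data: all values are invented, and the column names are borrowed from the variable lists defined earlier.

```r
# Toy data with missing values concentrated in the covariate
toy <- data.frame(
  brokerageZ                   = c(0.2, -0.5, 1.1, NA, 0.3, -0.9),
  ancestral_diversity_hhiScore = c(-1.0, 0.4, 1.2, 0.3, NA, -0.2),
  nation_tightness             = c(7.9, NA, NA, 6.1, 8.8, NA)
)

# Rows usable by the model without the covariate
n_base <- sum(complete.cases(toy[, c("brokerageZ",
                                     "ancestral_diversity_hhiScore")]))

# Rows usable once the covariate is added
n_cov <- sum(complete.cases(toy))
```

Comparing `n_base` and `n_cov` for each model makes clear how much of a changed estimate could reflect the smaller sample rather than the covariate itself.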
We controlled for the following covariates:
* nonHispNonWhite: Whether the participant identified as non-White and/or Hispanic (1 = yes)
* US_state_tightness: The tightness-looseness score of the participant’s home state, as quantified by Gelfand et al.
* population_density_USLog: The log-transformed population density of the students’ home counties.
The “iv” column indicates which cultural diversity variable is the focus of the model. The “covariate” column indicates which covariate was added. The first set of model coefficient columns contains estimates for the cultural diversity parameter; the second set is for the covariate that was added to the model.
amCovModelOutputs <-read.csv("amCovModelOutputs.csv")
DT::datatable(amCovModelOutputs,caption="Predicting American students' networks while controlling for covariates")
# Graph Nation-level ancestral diversity
plotIH <- ggplot(dIntPCA,aes(x=ancestral_diversity_hhiScore,y=brokerageZ))+  # aes_string() is deprecated
geom_point(position=position_jitter(width=.05),alpha=.3,color="#57575f")+
#geom_errorbar(data=dIntAg,aes(ymin=CIlo,ymax=CIhi),alpha=.8,width=.1,color="#84a0ad")+
geom_point(data=dIntAg,aes(color=ancestral_diversity_hhiScore))+#color="#28647c")+
scale_colour_gradient(low = "dark red", high = "blue")+
geom_abline(intercept=bm1c[1,1],slope=bm1c[2,1],color="#ca3542")+
labs(x="Historical cultural diversity of international students' home nations\n(Reversed Herfindahl index of migration matrix, standardized)",
y="Social brokerage (standardized)")+
ggrepel::geom_text_repel(data=dIntAg,aes(label=nationality),size=2.5,segment.size = 0.2,
segment.color = "grey50")+theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),legend.position = "none")+
scale_y_continuous(breaks=c(-5,-4,-3,-2,-1,0,1))
plotIH
fig <-ggdraw(plotIH)
ggsave("graph_brokerage_ancestral_diversity_international_students.png",fig,width=7,height=6)
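The x-axis label above refers to a reversed Herfindahl index. A common way to compute one, assumed here because the actual computation is documented in the “Data Preparation” files rather than in this report, is one minus the sum of squared group shares, so that higher values mean more diversity. The function name `reversed_hhi` and the example counts are hypothetical.

```r
# Reversed Herfindahl index: 1 - sum of squared group shares (assumed definition)
reversed_hhi <- function(counts) {
  shares <- counts / sum(counts)
  1 - sum(shares^2)
}

# A population drawn from a single group has no diversity
homogeneous <- reversed_hhi(c(100, 0, 0))

# Three equally sized groups: 1 - 3 * (1/3)^2 = 2/3
even_three <- reversed_hhi(c(10, 10, 10))
```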
# Graph Nation-level present-day diversity
plotIP <- ggplot(dIntPCA,aes(x=alesina_ethnic_fractionalization,y=brokerageZ))+  # aes_string() is deprecated
geom_point(position=position_jitter(width=.05),alpha=.3,color="#57575f")+
#geom_errorbar(data=dIntAg,aes(ymin=CIlo,ymax=CIhi),alpha=.8,width=.1,color="#84a0ad")+
geom_point(data=dIntAg,aes(color=alesina_ethnic_fractionalization))+#color="#28647c")+
scale_colour_gradient(low = "dark red", high = "blue")+
geom_abline(intercept=bm1c[1,1],slope=bm1c[2,1],color="#ca3542")+
labs(x="Present-day cultural diversity of international students' home nations\n(Ethnic fractionalization, standardized)",
y="Social brokerage (standardized)")+
ggrepel::geom_text_repel(data=dIntAg,aes(label=nationality),size=2.5,segment.size = 0.2,
segment.color = "grey50")+guides(color="none")+theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),legend.position = "none",axis.title.y = element_blank())+
scale_y_continuous(breaks=c(-5,-4,-3,-2,-1,0,1))
plotIP
fig <-ggdraw(plotIP)
ggsave("graph_brokerage_present-day_diversity_international_students.png",fig,width=7,height=6)
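The `geom_abline()` calls read the intercept and slope from rows 1 and 2 of the first column of a coefficient table (`bm1c` and `bm2c`, created in code not shown here). Assuming that table has the usual layout, with terms in rows and point estimates in the first column, the same indexing on an ordinary least-squares fit looks like the sketch below. `lm()` is swapped in purely as a stand-in for the Bayesian model, and the data are invented.

```r
# Toy data (invented values)
x <- c(-1.0, -0.5, 0.0, 0.5, 1.0)
y <- c(-0.60, -0.35, -0.30, -0.10, 0.05)

# Stand-in model; the report's models are Bayesian multilevel models
fit <- lm(y ~ x)

# Coefficient table: rows are (Intercept) and x, column 1 holds the estimates
coef_table <- coef(summary(fit))

intercept <- coef_table[1, 1]   # would be passed to geom_abline(intercept = ...)
slope     <- coef_table[2, 1]   # would be passed to geom_abline(slope = ...)
```

Because the line is drawn from whichever table is indexed, it is worth confirming that each plot reads from the model fitted to the predictor on its x-axis.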
# Graph county-level historical diversity
plotAH <- ggplot(dAmPCA,aes(x=ipumsHHIDiversity,y=brokerageZ))+  # aes_string() is deprecated
geom_point(position=position_jitter(width=.0005),alpha=.2,color="#57575f")+
geom_point(data=dAmAg, aes(color=ipumsHHIDiversity))+
scale_colour_gradient(low = "dark red", high = "blue")+
geom_abline(intercept=bm2c[1,1],slope=bm2c[2,1],color="#ca3542")+
labs(x="Historical cultural diversity of American students' home counties\n(Reversed Herfindahl index of IPUMS ancestry, standardized)",
y="Social brokerage (standardized)")+
ggrepel::geom_text_repel(data=dAmAg,aes(label=labels),size=2.5,segment.size = 0.2,
segment.color = "grey50")+
guides(color="none")+theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),legend.position = "none")+
scale_y_continuous(breaks=c(-3,-2,-1,0,1))
plotAH
fig<-ggdraw(plotAH)
ggsave("graph_historical_diversity_brokerage_americans.png",fig,width=7,height=6)
# Graph county-level present-day diversity
plotAP <- ggplot(dAmPCA,aes(x=US_county_SCI_international_connectednessLog,y=brokerageZ))+  # aes_string() is deprecated
geom_point(position=position_jitter(width=.05),alpha=.2,color="#57575f")+
geom_point(data=dAmAg, aes(color=US_county_SCI_international_connectednessLog))+#,color="#28647c")+
scale_colour_gradient(low = "dark red", high = "blue")+
geom_abline(intercept=bm2c[1,1],slope=bm2c[2,1],color="#ca3542")+
labs(x="Present-day cultural diversity of American students' home counties\n(Log-transformed international Facebook connectedness, standardized)",
y="Social brokerage (standardized)")+
ggrepel::geom_text_repel(data=dAmAg,aes(label=labels),size=2.5,segment.size = 0.2,
segment.color = "grey50")+
guides(color="none")+theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),legend.position = "none",axis.title.y = element_blank())+
scale_y_continuous(breaks=c(-3,-2,-1,0,1))
plotAP
fig<-ggdraw(plotAP)
ggsave("graph_present-day_brokerage_americans.png",fig,width=7,height=6)