Executive Summary

Problem Statement

In this demo, I explore some factors correlated with low trust toward Syrians among a population of rural Jordanians. I intend to demonstrate my skills in data visualization and survey analysis, and my ability to connect visualizations to contextual information. The code also draws on my data cleaning and data type handling skills.

Data Source

The source for this demo is a publicly available survey of 2,088 Jordanians and Syrians taken in January and February of 2019 by Impact Initiatives. The survey covered only residents of the areas surrounding five dams and one UNESCO World Heritage Site. Impact used it to analyze the potential stability risks of sending Syrian refugee cash-for-workers to carry out projects in those sites on behalf of the Ministry of Water and Irrigation (I personally presented our findings to MWI officials). For this reason, the survey covers an unusually large number of residents for a thematic survey, since it was intended to provide a “95-5” sample for each dam (95% confidence level and 5% margin of error). Because each such sample requires a bit fewer than 400 people, the total number of residents surveyed was 2,088, making this an unusually powerful survey of Jordanians’ attitudes toward Syrians.
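The per-sample size behind a “95-5” design can be checked with the standard formula for estimating a proportion, n = z^2 * p * (1 - p) / e^2. The following is a minimal sketch, assuming the most conservative proportion p = 0.5 and no finite-population correction (neither assumption comes from the survey documentation); the variable names are illustrative.

```r
# Sample size for estimating a proportion at a given confidence level and margin of error.
# Assumptions (not from the survey documentation): p = 0.5, no finite-population correction.
confidence <- 0.95
margin     <- 0.05
p          <- 0.5

# Two-sided critical value of the standard normal distribution (about 1.96)
z <- qnorm(1 - (1 - confidence) / 2)

# Round up, since a fraction of a respondent cannot be surveyed
n_per_sample <- ceiling(z^2 * p * (1 - p) / margin^2)

n_per_sample
```

With these inputs the formula gives 385 respondents per sample, in line with the figure of a bit under 400 quoted above.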

Key Findings

In general, Jordanians are more trusting of Syrians than distrusting, even rural Jordanians. This mirrors the consensus among donors that inter-Jordanian issues, particularly state-society relations, are currently a larger stability threat than tensions between Jordanians and Syrian refugees. Neither employment nor education level is strongly correlated with trust toward Syrians. However, attitudes toward Syrians vary significantly by tribe. This relationship is partly explained by greater interaction with Syrians among some tribes (possibly due to geographic differences between tribes or the Bedouin-fellahin cultural divide). However, significant variation remains between tribes even accounting for contact.

Set up

For this analysis, I use the tidyverse package set, as well as the ggthemes extension for ggplot2. For an introduction to the tidyverse, I recommend “R for Data Science” by Wickham and Grolemund. Here I also set the theme for subsequent plots to “Economist”, which approximates the chart style of the popular magazine, and import the data.

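# Core packages: the tidyverse already attaches dplyr, tidyr, readr, forcats and ggplot2
library(tidyverse)
library(ggthemes)
library(plotly)

# Use the Economist-style theme for all subsequent plots
theme_set(theme_economist())

setwd("C:\\Users\\liptr\\Documents\\R\\dams")

# Import the survey data
dams <- read_csv("from_reach.csv")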

Data cleaning

Fortunately, this data has been extensively cleaned by Impact Initiatives staff (thanks Becca!). Unfortunately, it was designed to be analyzed in Excel, so it has a few conventions that I must change.


# Note: across(), if_all() and rename_with() require dplyr 1.0.4 or later;
# they replace the deprecated funs() helper.
dams_demo <- dams %>%
  # Lower-case the column names and every value (this also converts all columns to character)
  rename_with(tolower) %>%
  mutate(across(everything(), tolower)) %>%
  # Fall back to the respondent's education where the head of household's is missing
  mutate(educ_hhh = ifelse(is.na(q2_7_educ_hhh), q2_9_educ_resp, q2_7_educ_hhh)) %>%
  # Drop the columns whose names contain "/"
  select(-contains("/")) %>%
  # Keep only respondents with answers to all four questions analyzed below
  filter(if_all(c(educ_hhh, q4_4_11_actor_trust_syrians, q2_10_men_working, q2_20_tribe), ~ !is.na(.x)))

Data Type Handling

readr (a package in the tidyverse) reads the original file as character strings or numerics, and the lower-casing step above leaves every column as character. Our analysis will be easier with factors, a more precise data type whose levels can be ordered. Ordered levels make it easier to arrange our plots, so we define the level orders now.


# Convert every character column to a factor
dams_fct <- dams_demo %>%
  mutate_if(is.character, as.factor)

# Level orders used to arrange the plots below
edu_lev <- c("illiterate", "primary", "secondary", "vocational", "postgraduate")
trust_lev <- c("very_much_trust", "trust", "dont_know", "distrust", "very_much_distrust")
freq_lev <- c("at_least_once_a_week", "at_least_once_a_month", "less_than_once_a_month", "at_least_once_every_six_months", "at_least_once_a_year", "never", "not_sure_dont_know")
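To see why factors help, and one caveat they bring, here is a small self-contained sketch in base R using made-up responses (the vector names are illustrative, not columns from the survey). A factor stores each value as an integer code pointing into its set of levels, so the level order controls the order of tables and plots. The caveat is that converting a factor of number-like strings straight to numeric returns those codes rather than the numbers.

```r
# Toy data: made-up answers, not taken from the survey
answers <- c("trust", "dont_know", "distrust", "trust")

# By default, factor levels are sorted alphabetically
default_levels <- levels(factor(answers))

# Supplying the levels explicitly fixes the order used in tables and plots
ordered_answers <- factor(answers, levels = c("trust", "dont_know", "distrust"))
levels(ordered_answers)

# Caveat: as.numeric() on a factor returns the level codes, not the original numbers
counts <- factor(c("0", "2", "10", "2"))
codes  <- as.numeric(counts)                 # positions within levels(counts)
values <- as.numeric(as.character(counts))   # the numbers themselves: 0, 2, 10, 2
```

Going through `as.character()` first is the usual way to recover the underlying numbers from a factor.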

Exploratory Data Analysis

Trust Toward Syrians

Analyzing individual questions is a simple operation (which is why it is often done in Excel). This line of code tabulates residents’ responses to the Syrian trust question. As you can see, a plurality of respondents (42%) do not know if they trust Syrians (which is not surprising, since they have relatively little contact with Syrians, who mostly reside in Jordan’s crowded cities, where access to work and assistance is greater than in rural communities like those surveyed). A significant minority (35%) responded that they do trust Syrians, and a smaller minority (22%) distrust Syrians.


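# Count responses in the order given by trust_lev
totals <- fct_count(fct_relevel(dams_fct$q4_4_11_actor_trust_syrians, trust_lev))

# Express each count as a percentage of all respondents kept after cleaning
totals %>% mutate(Percentage = 100 * n / sum(n))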

A simple way to visualize this information is a bar graph with one bar for each response. The iconic pie graph is surprisingly easy to misread.


# One bar per response, ordered by trust_lev (geom_bar() has no height parameter, so it is dropped)
ggplot(dams_fct) +
  geom_bar(aes(x = fct_relevel(q4_4_11_actor_trust_syrians, trust_lev),
               fill = fct_relevel(q4_4_11_actor_trust_syrians, trust_lev))) +
  coord_flip() +
  labs(title = "Trust in Syrians Among Respondents", y = "Number of Respondents", x = "Degree of Trust") +
  theme(legend.title = element_blank()) + scale_fill_brewer()

Employment Sector

To calculate the total number of employed household members, I sum the number of employed women and men. I then lump together the rarest totals (in practice, households with more than 5 employed members), as estimates for such small groups would have very wide confidence intervals.

The degree of employment does not have a strong relationship with trust toward Syrians. This survey was conducted when the Syrian crisis was in its 8th year, so it is not surprising that attitudes are stable to changes in employment.


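dams_fct <- dams_fct %>% 
  # Convert via as.character(): as.numeric() on a factor returns level codes, not the counts
  mutate(men_working = as.numeric(as.character(q2_10_men_working)),
         women_working = as.numeric(as.character(q2_13_women_working))) %>% 
  # Total employed household members
  mutate(working = men_working + women_working) %>%
  # Lump totals reported by fewer than 150 households into "more"
  mutate(working = fct_lump_min(factor(working), min = 150, other_level = "more"))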

emp <- ggplot(dams_fct, aes(x=working))
emp + geom_bar(aes(fill=fct_relevel(q4_4_11_actor_trust_syrians, trust_lev)), position="fill", width = .8) + coord_flip() +
  labs(title = "Trust in Syrians by Number of Employed Household Members", x = "Number of Household Members Working", y = "Portion of Households") +
  theme(legend.title=element_blank()) + scale_fill_brewer()
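The lumping step can be illustrated on a toy vector. This sketch assumes the forcats package (attached with the tidyverse) and uses invented counts with a much smaller threshold than the 150 used above. `fct_lump_min()` keeps every level that appears at least `min` times and merges the rest into `other_level`.

```r
library(forcats)

# Toy data: number of working household members for 12 invented households
working <- factor(c(0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 6))

# Levels seen fewer than 3 times ("2" and "6") are merged into "more"
lumped <- fct_lump_min(working, min = 3, other_level = "more")

table(lumped)
```

Note that lumping is driven by how often a level occurs, not by its magnitude, so it is worth checking which values ended up in the merged group before labeling it.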

Education Sector

By education level there is also no strong trend. Interestingly, more educated Jordanians were more likely to answer both trust and distrust. Given that few respondents have regular contact with Syrians, the educated may simply be less willing to admit they do not know.


# Lump education levels reported by fewer than 150 households into "other"
dams_fct <- dams_fct %>%
  mutate(educ_hhh = fct_lump_min(educ_hhh, min = 150, other_level = "other"))

s <- ggplot(dams_fct, aes(x = fct_relevel(educ_hhh, edu_lev)))
s + geom_bar(aes(fill = fct_relevel(q4_4_11_actor_trust_syrians, trust_lev)), position = "fill", width = .8) + coord_flip() +
  labs(title = "Trust in Syrians by Education of Head of Household", x = "Education of Head of Household", y = "Percent of Households") +
  theme(legend.title = element_blank()) + scale_fill_brewer()

Public Trust

Tribe

Interestingly, trust does vary heavily by the tribe of the respondent. This is not surprising, because tribal identity dominates the political life of most Jordanians, particularly in rural areas. When hiking in Jordan, I found that people often asked my tribe out of habit, despite my American appearance and accent. Patronage jobs in the military or municipal governance are usually distributed along tribal or sub-tribal lines. Tribal leaders provide most of the legal services and dispute resolution in rural areas and small cities, while formal courts are reserved for rare events. When anti-state demonstrations have occurred in rural Jordan, they have organized around tribal identity. In important ways, tribal identity is analogous to party identification in the US, so it is not surprising that attitudes toward refugees depend on tribal identity.

dams_fct_few <- dams_fct %>%
  mutate(tribe = fct_lump_min(q2_20_tribe, min=40, other_level = "All_Other_Tribes"))

tri <- ggplot(dams_fct_few, aes(x=fct_infreq(tribe)))
tri + geom_bar(aes(fill=fct_relevel(q4_4_11_actor_trust_syrians, trust_lev)), position="fill", width = .8) + coord_flip() +
  labs(title = "Trust in Syrians by Tribe", x = "Tribe (largest tribes on bottom)", y = "Percent of Households") +
  theme(legend.title=element_blank()) + scale_fill_brewer()

Contact with Syrians

Contact with Syrians shows a robust trend. Greater contact is significantly correlated with trust. This data does not prove that contact is causing improved trust, since more trusting Jordanians may choose to interact with Syrians. However, we know from the academic literature that there is often a causal relationship between cross-group contact and reduced anti-immigrant sentiment (Paas and Halapuu, “Attitudes towards immigrants and the integration of ethnically diverse societies”).


dams_fct_con <- dams_fct %>%
  mutate(q4_1_8_syrians = fct_lump_min(q4_1_8_syrians, min=80, other_level = "less_than_once_a_month"))

fre <- ggplot(dams_fct_con, aes(x = fct_relevel(q4_1_8_syrians, freq_lev)))
fre + geom_bar(aes(fill=fct_relevel(q4_4_11_actor_trust_syrians, trust_lev)), position = "fill", width = .8) + coord_flip() +
  labs(title = "Trust in Syrians by Degree of Contact", y = "Percent of Respondents", x = "Degree of Contact") +
  theme(legend.title=element_blank()) + scale_fill_brewer()
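For readers who want the numbers behind a filled bar chart, the same row-wise shares can be tabulated directly. This is a minimal sketch in base R with invented responses (the data frame and its values are hypothetical, not survey results). `prop.table()` with `margin = 1` divides each row of the cross-tabulation by its row total, which is the quantity `position = "fill"` displays.

```r
# Toy data: invented responses, not survey results
toy <- data.frame(
  contact = c("weekly", "weekly", "weekly", "weekly", "never", "never", "never", "never"),
  trust   = c("trust", "trust", "trust", "distrust", "trust", "distrust", "distrust", "dont_know")
)

# Cross-tabulate contact by trust, then convert each row to shares of that row's total
counts <- table(toy$contact, toy$trust)
shares <- prop.table(counts, margin = 1)

round(shares, 2)
```

Each row of `shares` sums to 1, just as each bar in the filled chart spans the full axis.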

Tribes and contact

Are these tribes more trusting of Syrians because of their greater contact? In other words, does the Bani Hasan tribe have more contact with Syrians, and therefore more trust? Alternatively, are there other aspects of tribal politics influencing attitudes toward refugees? Here, I construct a trust score (each response coded from -2 for very much distrust to +2 for very much trust, then averaged across the tribe's respondents), and plot it against the percentage of tribe members reporting monthly or greater contact with Syrians.

Unsurprisingly, the higher-contact tribes do tend to be more trusting. However, we see several large outliers. Jordanians of Palestinian origin have more trust for Syrians, likely related to their family history as refugees. The Bani Haidah tribe, one of the largest in the dataset, is the least trusting despite moderate contact. There may be other effects at work, possibly tribal politics.

dams_fct3 <- dams_fct %>% mutate(ats = fct_rev(fct_relevel(q4_4_11_actor_trust_syrians, trust_lev)),
                                 tribe = fct_lump_min(q2_20_tribe, min=20, other_level = "other"),
                                 contact = as.character(q4_1_8_syrians),
                                 contact_point = ifelse(contact=="at_least_once_a_week"|contact=="at_least_once_a_month", 1,0))

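# Per-tribe summary: mean trust score centered on dont_know, and percent with at least monthly contact
rtribe_contact <- dams_fct3 %>%
  group_by(tribe) %>%
  summarise(
    trust = mean(as.numeric(ats), na.rm = TRUE) - 3,
    contact = 100 * mean(contact_point),
    n = n()
  )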

# Scatter plot of tribes; the axis labels match the quantities computed in rtribe_contact
relat <- ggplot(rtribe_contact, aes(x = contact, y = trust, label = tribe))
relat + geom_point() + geom_text(hjust = 0, vjust = 0) +
  labs(title = "Tribes by Trust and Contact", x = "Percent Reporting Monthly or Weekly Contact with Syrians", y = "Mean Trust Score (-2 = very much distrust, +2 = very much trust)")
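To make the trust score concrete, here is a minimal base R sketch on invented responses for one hypothetical tribe. It mirrors the coding in the code above, where the reversed factor runs from 1 (very_much_distrust) to 5 (very_much_trust) and subtracting 3 centers the scale on dont_know. The toy vector itself is an assumption for illustration.

```r
# Response levels from most to least trusting, as in trust_lev above
trust_lev <- c("very_much_trust", "trust", "dont_know", "distrust", "very_much_distrust")

# Toy data: five invented responses from one hypothetical tribe
responses <- c("trust", "trust", "dont_know", "distrust", "very_much_trust")

# Reverse the level order so that higher codes mean more trust
# (1 = very_much_distrust, 3 = dont_know, 5 = very_much_trust)
ats <- factor(responses, levels = rev(trust_lev))

# Mean code minus 3 gives a score between -2 (all very_much_distrust) and +2 (all very_much_trust)
trust_score <- mean(as.numeric(ats)) - 3
trust_score
```

Because the score is a mean of coded responses, a "very much" answer counts twice as heavily as a plain "trust" or "distrust", and a tribe of undecided respondents scores the same as one evenly split between trust and distrust.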

Conclusion

This demonstration explored correlations with trust in Syrians among rural Jordanians. It found that education and employment rate are not strongly correlated with trust or distrust toward Syrians, but contact with Syrians is correlated with trust. Trust toward Syrians also depends on tribal identities. Degree of contact explains some but not all of the variation between tribes.