Introduction

According to the U.S. Department of Homeland Security, refugees are people who are outside their country of nationality and are unwilling or unable to return there because they fear persecution for their race, religious beliefs, nationality, or political opinions. Asylees are people who meet the definition of refugees and have already entered or are attempting to enter the United States. Reports by the Pew Research Center indicate that the number of refugees admitted into the U.S. during the Trump administration declined substantially. This report explores the historical flow of refugees to the U.S. from 2006 to 2015.

Load libraries and data

library(tidyverse)  
library(countrycode)
library(lubridate)
library(maps)
library(plotly)
library(dplyr)
library(scales) 
library(patchwork) 
library(sf) 

refugees_raw <- read_csv("data/refugee_status.csv", na = c("-", "X", "D")) 

Clean data

non_countries <- c("Africa", "Asia", "Europe", "North America", "Oceania", 
                   "South America", "Unknown", "Other", "Total")

refugees_clean <- refugees_raw %>%
  rename(origin_country = `Continent/Country of Nationality`) %>%
  filter(!(origin_country %in% non_countries)) %>%
  mutate(iso3 = countrycode(origin_country, "country.name", "iso3c",
                            custom_match = c("Korea, North" = "PRK"))) %>%
  mutate(origin_country = countrycode(iso3, "iso3c", "country.name"),
         origin_region = countrycode(iso3, "iso3c", "region"),
         origin_continent = countrycode(iso3, "iso3c", "continent")) %>%
  gather(year, number, -origin_country, -iso3, -origin_region, -origin_continent) %>%
  mutate(year = as.numeric(year),
         year_date = ymd(paste0(year, "-01-01"))) 
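
The reshaping step above relies on gather(), which the tidyr documentation marks as superseded in favor of pivot_longer(). As a minimal sketch, assuming a wide table with one column per year, the same wide-to-long reshape could be written as follows. The toy_wide table and its values are hypothetical and are not taken from the DHS file.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table with one column per year (illustrative numbers only)
toy_wide <- tibble(
  origin_country = c("A", "B"),
  `2006` = c(10, 5),
  `2007` = c(20, NA)
)

# Every column except origin_country becomes a (year, number) pair
toy_long <- toy_wide %>%
  pivot_longer(cols = -origin_country, names_to = "year", values_to = "number") %>%
  mutate(year = as.numeric(year))

toy_long
```

The result has one row per country and year, which is the shape the grouping and cumulative steps in the rest of the report expect.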

Cumulative country totals over time

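refugees_countries_cumulative <- refugees_clean %>%
  arrange(year_date) %>%
  group_by(origin_country) %>%
  # cumsum() propagates NA, so treat missing or suppressed counts as 0 here,
  # in the same spirit as the na.rm = TRUE used for the continent totals
  mutate(cumulative_total = cumsum(coalesce(number, 0))) %>%
  ungroup() %>%
  rename(region = origin_country)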

Continent totals over time

refugees_continents <- refugees_clean %>%
  group_by(origin_continent, year_date) %>%
  summarize(total = sum(number, na.rm = TRUE))

Cumulative continent totals over time

refugees_continents_cumulative <- refugees_clean %>%
  group_by(origin_continent, year_date) %>%
  summarize(total = sum(number, na.rm = TRUE)) %>%
  arrange(year_date) %>%
  group_by(origin_continent) %>%
  mutate(cumulative_total = cumsum(total))

Intermediary Graphic

I created two line graphs, each with a geom_point() layer, to compare the cumulative total number of refugees with the cumulative average number of refugees from each continent. I used the patchwork package to combine the charts and tell a better story. I believe that adding the average chart sheds more light on the trends.
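
To make the difference between the two measures concrete, here is a small sketch on made-up numbers. The toy tibble and its values are hypothetical and are not DHS data. It only illustrates how cumsum() and dplyr's cummean() behave on the same series.

```r
library(dplyr)

# Hypothetical yearly values for a single continent (illustrative only)
toy <- tibble(
  year  = 2006:2009,
  value = c(300, 100, 200, 400)
)

toy_cumulative <- toy %>%
  arrange(year) %>%
  mutate(
    cumulative_total   = cumsum(value),   # running sum of all years so far
    cumulative_average = cummean(value)   # running mean of all years so far
  )

toy_cumulative
```

In this toy series the running sum never decreases, while the running mean falls after the low second year. That is why the two charts can tell different stories about the same data.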

refugees_continents_cumaverage <- refugees_clean %>% 
  group_by(origin_continent, year_date) %>%
  summarize(average = mean(number, na.rm = TRUE)) %>%
  arrange(year_date) %>%
  group_by(origin_continent) %>%
  mutate(cumulative_average = cummean(average))
aveplot <- ggplot(refugees_continents_cumaverage, 
       aes(x = year_date, y = cumulative_average, color = origin_continent)) + 
  geom_line(size = 1) + 
  geom_point(size = 3) + 
  scale_color_manual(values = c("#252DFA", "#39F73F", "#AD9778", 
                                "#FFAF3D")) + 
  scale_y_continuous(labels = comma) +
  labs(subtitle = "Cumulative Averages Over Time", 
       x = NULL, 
       y = "Cumulative Average", 
       caption = "Source: The US Department of Homeland Security") +
  theme_minimal() + 
  theme(plot.title=element_text(size = 20, 
                                    face = "bold", 
                                    family = "serif",
                                    color = "black",
                                    lineheight=1.2), 
        plot.subtitle = element_text(size = 15, 
                                     family = "serif",
                                     color = "black"), 
        plot.caption=element_text(size = 10, 
                                      family = "serif"), 
        panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank()) +
   theme(legend.position = "bottom") + 
   guides(color = "none")
 
aveplot

cumplot <- ggplot(refugees_continents_cumulative, 
       aes(x = year_date, y = cumulative_total, color = origin_continent)) + 
  geom_line(size = 1) + 
  geom_point(size = 3) + 
  scale_color_manual(values = c("#252DFA", "#39F73F", "#AD9778", 
                                "#FFAF3D")) + 
  scale_y_continuous(labels = comma) +
  labs(title = "Refugees Admitted to the US from 2006 to 2015", 
       subtitle = "Cumulative Total Over Time", 
       x = NULL, 
       y = "Cumulative Total") +
  theme_minimal() + 
  theme(plot.title=element_text(size = 20, 
                                    face = "bold", 
                                    family = "serif",
                                    color = "black",
                                    lineheight=1.2), 
        plot.subtitle = element_text(size = 15, 
                                     family = "serif",
                                     color = "black"), 
        plot.caption=element_text(size = 10, 
                                      family = "serif"), 
        panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank()) + 
  guides(color=guide_legend("Continent"))
 
cumplot

Use patchwork to display both charts

The combined charts show that between 2007 and 2009 the cumulative average number of refugees admitted from Africa and Europe declined, while the average for Asia and the Americas rose. The cumulative total chart, however, rises for every continent over time. This is expected, because a cumulative total can only grow or stay flat as each year's admissions are added.

patchwork <- cumplot/aveplot
patchwork 
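
For readers unfamiliar with patchwork, the sketch below shows its two basic layout operators on placeholder plots. It uses the built-in mtcars data rather than the refugee data, and the object names are made up for illustration.

```r
library(ggplot2)
library(patchwork)

# Two placeholder plots built from the built-in mtcars data
p_top    <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p_bottom <- ggplot(mtcars, aes(hp, mpg)) + geom_point()

stacked      <- p_top / p_bottom   # "/" stacks plots vertically
side_by_side <- p_top | p_bottom   # "|" places plots next to each other

# plot_annotation() adds a title for the whole composition
stacked + plot_annotation(title = "Stacked layout")
```

Stacking with "/" suits the charts above because both share the same x-axis of years, so the reader can compare the two measures year by year.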

Map Visualization

The intermediary chart doesn’t tell the complete story, because it doesn’t show which countries drove the trends. I therefore created a map showing the number of refugees by country of origin.

I created a new sub-dataset by filtering the cumulative country dataset to include only 2015. After making the map, I exported it to Adobe Illustrator to redesign some elements using the CRAP (contrast, repetition, alignment, proximity) principles.

I added contrasting colors from the Adobe Color tool by manually creating a gradient to differentiate the densities on the map. To ensure contrast and repetition in the fonts, I used “Arial Black” regular in navy blue for the main title and “Arial Rounded MT Bold” regular for all other text, with opacity set to 55. I aligned the main title, subtitle, and caption to the left and the legend to the right. There was not much to be done for proximity.
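
The filter-then-join step described above can be sketched on toy tables. Everything here is hypothetical: toy_shapes stands in for the shapefile's attribute table (without geometry), and the counts are made up. The sketch shows why countries without data end up as NA, which the map then draws with its na.value color.

```r
library(dplyr)

# Hypothetical stand-in for the shapefile's attribute table (no geometry)
toy_shapes <- tibble(ISO_A3 = c("MMR", "IRQ", "CAN"))

# Hypothetical long-format counts (illustrative numbers, not DHS data)
toy_counts <- tibble(
  iso3   = c("MMR", "MMR", "IRQ", "IRQ"),
  year   = c(2014, 2015, 2014, 2015),
  number = c(10, 20, 5, 8)
)

# Keep only the 2015 rows and the columns needed for the join
toy_2015 <- toy_counts %>%
  filter(year == 2015) %>%
  select(iso3, number)

# left_join() keeps every row of toy_shapes; unmatched countries get NA
toy_map <- toy_shapes %>%
  left_join(toy_2015, by = c("ISO_A3" = "iso3"))

toy_map
```

Joining from the shapes side keeps every country on the map, including those with no refugee count.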

world_shapes <- read_sf("data/ne_110m_admin_0_countries/ne_110m_admin_0_countries.shp")

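# Despite the object name, this keeps the 2015 yearly count (number),
# which is what the map title describes
refugee_cummtotal_2015 <- refugees_countries_cumulative %>%  
  ungroup() %>%  # drop any grouping so select() returns only the two columns
  filter(year == 2015) %>% 
  select(iso3, number) 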

map_refugee_cummtotal_2015 <- world_shapes %>% 
  left_join(refugee_cummtotal_2015, by = c("ISO_A3" = "iso3"))
mapplot <- ggplot() + 
              geom_sf(data = map_refugee_cummtotal_2015, aes(fill = number)) + 
              coord_sf(crs = st_crs("ESRI:54030")) + 
              scale_fill_gradient(name ="Number",
                    low = "#440154", 
                    high = "#fde725", 
                    na.value = "white", 
                    labels = comma) + 
              theme_void() + 
              theme(legend.position = "right") +
              labs(title = "Refugees Admitted to the US in 2015",
                   subtitle = "The Department of Homeland Security's Annual Report",
                   caption = "Source: The US Department of Homeland Security") + 
  theme(plot.title=element_text(size = 20, 
                                    face = "bold", 
                                    family = "serif",
                                    color = "black",
                                    hjust = 0.5,
                                    lineheight=1.2), 
        plot.subtitle = element_text(size = 15, 
                                     family = "serif",
                                     hjust = 0.5,
                                     color = "black"), 
        plot.caption=element_text(size = 10, 
                                      family = "serif",
                                      hjust = 0.5))
  


mapplot

Save graphs

ggsave(patchwork, filename = "intermediary.pdf", width = 8, height = 4.5) 

ggsave(mapplot, filename = "mapplot.pdf", width = 8, height = 4.5) 
knitr::include_graphics("/Users/iattram1/Desktop/MA ECON/Data Visualization/Mini Project 2/mapplot_CRAP.png")

Alberto Cairo’s five qualities of great visualizations

In line with the five qualities, I took care not to manipulate the data to achieve a desired result, so the charts I created are an accurate representation of the data. To create the map graphic, I built a sub-dataset by filtering for 2015 without altering the variables. In terms of functionality, I filled the map by the number of refugees, which keeps the data-ink ratio high. For aesthetics, I manually colored the charts with complementary colors from the Adobe Color tool to create a gradient. Finally, I added a text box to highlight Myanmar as the country with the largest number of refugees admitted to the U.S. in 2015.

Conclusion

To conclude, I believe the findings are insightful and enlightening because the report combines all the essential components to tell the story of the refugees admitted to the U.S. between 2006 and 2015.