MEMO

The data contain 600 observations. I chose a style that showcases the global scale of the issue, so that every observation is factored into the plots. The story I am trying to tell is the drastic increase in refugee migration into the US within a period of nine years (2006 to 2015). The plots help show which continents send the most refugees. It is clear that the largest numbers keep coming mainly from Asian countries. Even though countries in the Americas are closer to the United States of America (same continent), the Americas rank 3rd in cumulative refugee arrivals, behind Asia and Africa. This is because most refugee situations are caused mainly by wars. Countries in the Americas are not really engaged in long-term rivalries or wars, even though there are some political and drug-related conflicts that can displace people. That might be the reason they are 3rd on the list. Africa has over 50 countries, and a number of them have experienced political instability and civil wars, which create refugees. The Middle East was overwhelmed by wars during the period covered by this data, and that shows clearly in the increase in refugees. However, updated data might change the picture because of the recent war in Ukraine. Europe might or might not change position, depending on which countries European refugees go to.

PRINCIPLES OF CRAP

• Contrast - I used different colours and sizes to distinguish groups of objects. I used bold text from the same font family to set titles apart.
• Repetition - The diagrams all follow the same style, colour and labelling format, which keeps the report on theme. I used the same elements throughout the report and Roboto Condensed text throughout the final plot. The colour theme was mainly viridis (continuous). I also added some colours from the United States flag.
• Alignment - The diagrams are organized and structured properly within the report. Plots on the same page have their top and bottom edges aligned.
• Proximity - Similar explanations or graphs are grouped together to complement the explanation. Graphs on the same page help tell the story much better. All plot labels are positioned close to their corresponding plot.

Kieran Healy’s principles of great visualizations

I cleaned and edited the data to make it easier to work with, especially for the plots and the kind of story I wanted to tell. The data did not have many variables, so fancy plots were not needed to bring out the story hidden in the data. I chose a style that simply shows global, large-scale changes in the observations. The colour theme was viridis (continuous) plus a few colours from the American flag. The idea was to label the plots in Illustrator, so I made simple plots in R. I chose to tell the story from a global perspective.

Loading libraries

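library(tidyverse)
library(lubridate) # ymd(); attached explicitly in case the installed tidyverse does not load it
library(countrycode)
library(sf)
library(stringr) # removing spaces in string variables
library(scales)
library(forcats)
# Register the font with the Windows graphics device (this call is Windows-only)
windowsFonts("Roboto Condensed" = windowsFont("Roboto Condensed"))
library(plotly)
library(ggtext)
library(ggrepel)
library(png)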

Loading data

# read.csv() names this argument na.strings; "na" only worked through partial matching
refugees_raw <- read.csv("data/refugee_status.csv", na.strings = c("-", "X", "D"))
non_countries <- c("Africa", "Asia", "Europe", "North America", "Oceania", 
                   "South America", "Unknown", "Other", "Total")

refugees_clean <- refugees_raw %>%
  # Make this column name easier to work with
  rename(origin_country = "Continent.Country.of.Nationality") %>%
  # Get rid of non-countries
  filter(!(origin_country %in% non_countries)) %>% 
  # Convert country names to ISO3 codes
  mutate(iso3 = countrycode(origin_country, "country.name", "iso3c",
                            custom_match = c("Korea, North" = "PRK")))

# Turn missing values into 0, but only in the year columns, so that a
# missing iso3 code is not overwritten with "0"
year_cols <- paste0("X", 2006:2015)
refugees_clean[year_cols][is.na(refugees_clean[year_cols])] <- 0

# Create a column for total migration per country from 2006 to 2015
refugees_clean$country_m_total <- rowSums(refugees_clean[year_cols])

# Create the plot dataset, sorted by total in descending order
migration <- refugees_clean %>%
  select(origin_country, iso3, country_m_total) %>%
  arrange(desc(country_m_total))
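
The zero-filling and row-total steps above can be checked on a small made-up table. This is only a minimal sketch: the data frame `toy` and its values are invented for illustration, and only the `X2006`-style column naming is taken from the code above.

```r
# Hypothetical three-country table; the counts are invented, not from the data.
toy <- data.frame(
  origin_country = c("A", "B", "C"),
  X2006 = c(10L, NA, 5L),
  X2007 = c(NA, 20L, 5L)
)

# Restrict the replacement to the year columns so text columns stay untouched.
toy_years <- c("X2006", "X2007")
toy[toy_years][is.na(toy[toy_years])] <- 0

# rowSums() adds the year columns for each country in one call.
toy$country_m_total <- rowSums(toy[toy_years])

print(toy)
```

Limiting the replacement to the year columns means a country whose ISO code could not be resolved keeps its `NA` instead of receiving a misleading "0".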

General overview plot

fig2 <- ggplot(data = migration,
               aes(y = fct_inorder(origin_country), x = country_m_total)) +
  geom_pointrange(aes(xmin = 0, xmax = country_m_total),
                  position = position_dodge(width = 0.2)) +
  scale_x_log10(labels = label_comma()) +
  theme_bw(base_family = "Roboto Condensed") +
  labs(x = "Number of Immigrants", y = "Country",
       title = "Number of Immigrants into the US",
       subtitle = "From Year 2006 to 2015")

print(fig2)

#loading world map
world_map<- read_sf("data/maps/ne_110m_admin_0_countries/")

usmap<- read_sf("data/maps/cb_2022_us_state_20m/")

Filtering the data

world<- world_map %>% 
  select(ISO_A3, geometry)

Joining the two datasets

world_map_immigration <- world %>% 
  left_join(migration, by = c("ISO_A3" = "iso3"))
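
A left join keeps every map row and fills rows without a partner with `NA`, so it can help to list the codes that found no match before plotting. The sketch below assumes two small invented tables (`toy_map` and `toy_counts`); the column names mirror `ISO_A3` and `iso3` from the code above, but the values are made up.

```r
library(dplyr)

# Invented stand-ins for the map table and the migration table.
toy_map <- tibble(ISO_A3 = c("IRQ", "SOM", "-99"))
toy_counts <- tibble(iso3 = c("IRQ", "SOM", "BTN"),
                     country_m_total = c(100, 50, 25))

# Rows of the counts table whose code has no match in the map.
unmatched <- anti_join(toy_counts, toy_map, by = c("iso3" = "ISO_A3"))

# The same kind of left join as above: map rows without a match get NA totals.
joined <- left_join(toy_map, toy_counts, by = c("ISO_A3" = "iso3"))

print(unmatched)
print(joined)
```

Any country that shows up in `unmatched` would be missing from the map even though it has data, which is worth knowing before reading the fill colours.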

Plotting migration on the map

ggplot() +
  geom_sf(data = world_map_immigration, aes(fill = country_m_total)) +
  # Set the outline colour directly; inside aes() the string "red" is treated
  # as a data value, mapped to a default palette colour and given a legend
  geom_sf(data = usmap, color = "red") +
  coord_sf(crs = "+proj=robin") +
  scale_fill_viridis_c(option = "viridis") +
  theme_classic(base_family = "Roboto Condensed") +
  labs(title = "The Migration of Refugees into the US by Country")

#creating a dataset for plotting on the map
immigration_countries<- world_map_immigration %>% 
  select(origin_country, ISO_A3) %>% 
  filter(!is.na(origin_country))


fig1 <- ggplot() +
  geom_sf(data = world_map_immigration, aes(fill = country_m_total)) +
  geom_sf(data = usmap, color = "red") + # constant colour, so there is no legend to hide
  theme_void() +
  coord_sf(crs = "+proj=robin") +
  scale_fill_viridis_c(option = "viridis")

Main change over time analysis

main_1<- refugees_raw %>% 
  rename(origin_country = Continent.Country.of.Nationality) %>%
  # Get rid of non-countries
  filter(!(origin_country %in% non_countries)) %>%
  # Convert country names to ISO3 codes
  mutate(iso3 = countrycode(origin_country, "country.name", "iso3c",
                            custom_match = c("Korea, North" = "PRK"))) %>%
  # Convert ISO3 codes to country names, regions, and continents
  mutate(origin_country = countrycode(iso3, "iso3c", "country.name"),
         origin_region = countrycode(iso3, "iso3c", "region"),
         origin_continent = countrycode(iso3, "iso3c", "continent")) %>%  
  # Make this data tidy
  gather(year, number, -origin_country, -iso3, -origin_region, -origin_continent)


# Strip the "X" prefix that read.csv() added to the year column names,
# then build a date for the first day of each year
Travellers <- main_1 %>%
  mutate(year = as.numeric(sub("^X", "", year)),
         year_date = ymd(paste0(year, "-01-01")))
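
The reshaping step can be tried on a toy table. This sketch uses `pivot_longer()`, which tidyr documents as the successor to `gather()`; it is an alternative to the code above rather than the code the report ran, and the table `toy_wide` is invented for illustration.

```r
library(dplyr)
library(tidyr)

# Invented wide table with one column per year, as read.csv() would name them.
toy_wide <- tibble(
  origin_country = c("A", "B"),
  X2006 = c(10, 20),
  X2007 = c(30, 40)
)

toy_long <- toy_wide %>%
  # One row per country and year
  pivot_longer(cols = starts_with("X"), names_to = "year", values_to = "number") %>%
  # Drop the "X" prefix and build a date for the first day of the year
  mutate(year = as.numeric(sub("^X", "", year)),
         year_date = as.Date(paste0(year, "-01-01")))

print(toy_long)
```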

Cumulative Total

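cumulative_immigration <- Travellers %>%
  group_by(origin_continent, year_date) %>%
  # Keep the continent grouping so that cumsum() below runs within each continent
  summarize(total = sum(number, na.rm = TRUE), .groups = "drop_last") %>%
  arrange(year_date) %>%
  mutate(cumulative_total = cumsum(total))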

print(cumulative_immigration)
## # A tibble: 40 × 4
## # Groups:   origin_continent [4]
##    origin_continent year_date  total cumulative_total
##    <chr>            <date>     <int>            <int>
##  1 Africa           2006-01-01 18116            18116
##  2 Americas         2006-01-01  3258             3258
##  3 Asia             2006-01-01 10076            10076
##  4 Europe           2006-01-01  9605             9605
##  5 Africa           2007-01-01 17473            35589
##  6 Americas         2007-01-01  2976             6234
##  7 Asia             2007-01-01 23557            33633
##  8 Europe           2007-01-01  4179            13784
##  9 Africa           2008-01-01  8931            44520
## 10 Americas         2008-01-01  4271            10505
## # ℹ 30 more rows
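
The running totals can be verified by hand from the first rows printed above. The sketch below re-enters the Africa and Asia totals for 2006 and 2007 from that output and applies the same grouped `cumsum()`; the names `check` and `check_cum` are introduced here for illustration only.

```r
library(dplyr)

# Yearly totals copied from the printed output above (Africa and Asia only).
check <- tibble(
  origin_continent = c("Africa", "Asia", "Africa", "Asia"),
  year = c(2006, 2006, 2007, 2007),
  total = c(18116, 10076, 17473, 23557)
)

check_cum <- check %>%
  group_by(origin_continent) %>%
  arrange(year, .by_group = TRUE) %>%        # order years within each continent
  mutate(cumulative_total = cumsum(total)) %>% # running sum per continent
  ungroup()

print(check_cum)
```

The 2007 values, 35589 for Africa and 33633 for Asia, agree with the printed table.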

Analyzing further

year1_end <- c("2006-01-01", "2015-01-01")

longterm_migration <- cumulative_immigration %>%
  filter(year_date %in% year1_end) %>%
  select(origin_continent, year_date, cumulative_total) %>%
  group_by(origin_continent) %>%
  mutate(pct_increase = (cumulative_total / lag(cumulative_total) - 1) * 100) %>%
  mutate(left_label = ifelse(year_date == "2006-01-01", origin_continent, NA))

print(longterm_migration)
## # A tibble: 8 × 5
## # Groups:   origin_continent [4]
##   origin_continent year_date  cumulative_total pct_increase left_label
##   <chr>            <date>                <int>        <dbl> <chr>     
## 1 Africa           2006-01-01            18116          NA  Africa    
## 2 Americas         2006-01-01             3258          NA  Americas  
## 3 Asia             2006-01-01            10076          NA  Asia      
## 4 Europe           2006-01-01             9605          NA  Europe    
## 5 Africa           2015-01-01           141682         682. <NA>      
## 6 Americas         2015-01-01            36187        1011. <NA>      
## 7 Asia             2015-01-01           417479        4043. <NA>      
## 8 Europe           2015-01-01            24119         151. <NA>
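
The percentage column follows the usual relative-change formula, (end / start - 1) * 100. As a check, the sketch below recomputes it from the cumulative totals printed above; the vector names are introduced here for illustration.

```r
# Cumulative totals copied from the printed output above.
start_2006 <- c(Africa = 18116, Americas = 3258, Asia = 10076, Europe = 9605)
end_2015 <- c(Africa = 141682, Americas = 36187, Asia = 417479, Europe = 24119)

# Relative change between the two endpoints, in percent.
pct_check <- (end_2015 / start_2006 - 1) * 100

print(round(pct_check))
```

Rounded, this gives 682, 1011, 4043 and 151, matching the table. Note that the 2006 value is a single year's total while the 2015 value is the ten-year running sum, so the percentage describes growth of the cumulative count rather than a change in yearly arrivals.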

plot9 <- ggplot(data = longterm_migration,
                mapping = aes(x = year_date, y = cumulative_total,
                              col = origin_continent, group = origin_continent,
                              text = cumulative_total)) + # text is only read by plotly tooltips
  geom_point(size = 2) +
  geom_line(linewidth = 1) + # linewidth replaces size for lines as of ggplot2 3.4.0
  scale_y_continuous(sec.axis = dup_axis(), labels = label_comma()) +
  theme_bw(base_family = "Roboto Condensed") +
  labs(title = "Cumulative Total of Immigrants", subtitle = "Year 2006 and 2015")
print(plot9)

# Drop the 2006 rows, which have no earlier value to compare against
percentage_increase <- longterm_migration %>% filter(!is.na(pct_increase))

fig5<-ggplot(data = percentage_increase, aes(x = origin_continent, y = pct_increase, size= pct_increase)) +
  geom_point()+ theme_bw(base_family = "Roboto Condensed") +
  labs(title = "Percentage Increase In Immigrants", subtitle = "Year 2006 and 2015")

print(fig5)

#ggsave(fig1, filename = "fig1.pdf", width = 10, height = 10)
#ggsave(fig2, filename = "fig2.pdf", width = 10, height = 10)
#ggsave(plot9, filename = "fig4.pdf", width = 10, height = 10)
#ggsave(fig5, filename = "fig5.pdf", width = 10, height = 10)
#ggsave(fig4, filename = "fig3.pdf", width = 10, height = 10)

Importing the enhanced image into R
