MEMO
The data contains 600 observations. I chose a style that showcases the global scale of the issue, so that every observation is factored into the plots. The story I am trying to tell is the drastic increase in refugee migration into the US within a period of nine years. The plots help to show which continents send the most refugees. It is clear that a massive number of refugees keep coming mainly from Asian countries. Even though countries in the Americas are closer to the United States of America (same continent), the Americas rank third in cumulative refugee arrivals, behind Asia and Africa. This is because most refugee situations are caused mainly by wars. Countries in the Americas are not really engaged in long-term rivalries or wars, even though there are some political and drug-related conflicts that can displace people. That might be the reason they are third on the list. Africa has over 50 countries, and a number of them have experienced political instability and civil wars, which create refugees. The Middle East was overwhelmed by wars during the period covered by this data, and that clearly shows in the increase in refugees. However, updated data might change the picture because of the recent war in Ukraine. Europe might or might not change position, depending on which countries European refugees go to.
PRINCIPLES OF CRAP
• Contrast - I used different colours and sizes to distinguish different groups of objects. I used bold text of the same font family to set titles apart.
• Repetition - The diagrams all follow the same style, colour and labelling format, which keeps the report on theme. I used the same elements throughout the report and Roboto Condensed text throughout the final plot. The colour theme was mainly viridis c, with some colours from the United States flag added.
• Alignment - The diagrams are organized and structured properly within the report. All plots on the same page have their top and bottom edges aligned.
• Proximity - Similar explanations or graphs are grouped together to complement the explanation. Graphs on the same page help to tell the story much better. All plot labels are positioned close to their corresponding plot.
Kieran Healy’s principles of great visualizations
I cleaned and edited the data to make it easier to work with, especially for the plots and the kind of story I wanted to tell. The data did not have many variables, so fancy plots were not needed to bring out the story hidden in the data. I chose a style that simply shows the global, large-scale changes in the observations. The colour theme was viridis c plus a few colours from the American flag. The idea was to add the labels in Illustrator, so I kept the plots in R simple. I chose to tell the story from a global perspective.
Loading libraries
library(tidyverse)
library(countrycode)
library(sf)
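library(stringr) # removing spaces in string variables
library(scales)
library(forcats)
library(lubridate) # ymd(), used below to build year_date
# windowsFonts() exists only on Windows, so register the font only there
if (.Platform$OS.type == "windows") {
  windowsFonts("Roboto Condensed" = windowsFont("Roboto Condensed"))
}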
library(plotly)
library(ggtext)
library(ggrepel)
library(png)
Loading data
# na.strings is the full argument name; "na" only worked through partial matching
refugees_raw <- read.csv("data/refugee_status.csv", na.strings = c("-", "X", "D"))
non_countries <- c("Africa", "Asia", "Europe", "North America", "Oceania",
"South America", "Unknown", "Other", "Total")
refugees_clean <- refugees_raw %>%
# Make this column name easier to work with
rename(origin_country = "Continent.Country.of.Nationality") %>%
# Get rid of non-countries
filter(!(origin_country %in% non_countries)) %>%
# Convert country names to ISO3 codes
mutate(iso3 = countrycode(origin_country, "country.name", "iso3c",
custom_match = c("Korea, North" = "PRK")))
#turning missing values into 0
refugees_clean[is.na(refugees_clean)] <- 0
#Creating a column for total migration per country from 2006 to 2015
refugees_clean$country_m_total <- rowSums(refugees_clean[paste0("X", 2006:2015)])
#creating the plot dataset and arrange the total by descending order
migration <- refugees_clean %>%
select(origin_country,iso3, country_m_total) %>%
arrange(desc(country_m_total))
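The totals step can also be sketched without first overwriting missing values with 0. The example below is a minimal illustration on a hypothetical toy data frame (`toy` and its values are made up, not taken from the refugee data); it assumes the yearly columns all start with `X`, as they do after `read.csv()`.

```r
library(dplyr)

# Hypothetical toy data: two countries, three yearly columns (not the real dataset)
toy <- data.frame(
  origin_country = c("A", "B"),
  X2006 = c(10, NA),
  X2007 = c(5, 7),
  X2008 = c(NA, 3)
)

toy_totals <- toy %>%
  # Sum every column whose name starts with "X", skipping missing values
  mutate(country_m_total = rowSums(across(starts_with("X")), na.rm = TRUE)) %>%
  select(origin_country, country_m_total) %>%
  # Largest total first, as in the migration dataset above
  arrange(desc(country_m_total))

print(toy_totals)
```

Using `na.rm = TRUE` leaves the original `NA` values in place, which may matter if missing and zero need to be told apart later.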
General overview plot
fig2 <- ggplot(data = migration,
               aes(y = fct_inorder(origin_country), x = country_m_total)) +
  geom_pointrange(aes(xmin = 0, xmax = country_m_total),
                  position = position_dodge(width = 0.2)) +
  scale_x_log10(labels = label_comma()) +
  theme_bw(base_family = "Roboto Condensed") +
  labs(x = "Number of Immigrants", y = "Country",
       title = "Number of Immigrants into the US",
       subtitle = "From Year 2006 to 2015")
print(fig2)
#loading world map
world_map<- read_sf("data/maps/ne_110m_admin_0_countries/")
usmap<- read_sf("data/maps/cb_2022_us_state_20m/")
Filtering the data
world<- world_map %>%
select(ISO_A3, geometry)
Joining the two datasets
world_map_immigration <- world %>%
left_join(migration, by = c("ISO_A3" = "iso3"))
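A left join silently leaves countries unfilled on the map when their codes do not match. The sketch below shows one way to list the unmatched rows with `anti_join()`. The two small data frames are hypothetical stand-ins for `world` and `migration`, and the placeholder code `"-99"` is an assumption about how a shapefile might store a missing ISO code, worth checking against the actual map data.

```r
library(dplyr)

# Hypothetical stand-ins for `world` and `migration` (not the real data)
toy_world <- data.frame(ISO_A3 = c("SOM", "IRQ", "-99"))
toy_migration <- data.frame(
  iso3 = c("SOM", "IRQ", "FRA"),
  country_m_total = c(100, 250, 5)
)

# Rows of the migration table that find no partner in the map table
unmatched <- toy_migration %>%
  anti_join(toy_world, by = c("iso3" = "ISO_A3"))

print(unmatched)
```

Any country listed here would appear as a grey (NA) polygon on the choropleth, or not at all.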
Plotting migration on the map
ggplot() +
  geom_sf(data = world_map_immigration, aes(fill = country_m_total)) +
  geom_sf(data = usmap, aes(col = "red")) +
  theme_void() +
  coord_sf(crs = "+proj=robin") +
  scale_fill_viridis_c(option = "viridis") +
  # "none" replaces the deprecated FALSE for hiding a legend (ggplot2 >= 3.3.4)
  guides(col = "none") +
  labs(title = "Migration of Refugees into the US by Country") +
  theme_classic(base_family = "Roboto Condensed")
#creating a dataset for plotting on the map
immigration_countries<- world_map_immigration %>%
select(origin_country, ISO_A3) %>%
filter(!is.na(origin_country))
fig1 <- ggplot() +
  geom_sf(data = world_map_immigration, aes(fill = country_m_total)) +
  geom_sf(data = usmap, aes(col = "red")) +
  theme_void() +
  coord_sf(crs = "+proj=robin") +
  scale_fill_viridis_c(option = "viridis") +
  # "none" replaces the deprecated FALSE for hiding a legend (ggplot2 >= 3.3.4)
  guides(col = "none")
Main change over time analysis
main_1<- refugees_raw %>%
rename(origin_country = Continent.Country.of.Nationality) %>%
# Get rid of non-countries
filter(!(origin_country %in% non_countries)) %>%
# Convert country names to ISO3 codes
mutate(iso3 = countrycode(origin_country, "country.name", "iso3c",
custom_match = c("Korea, North" = "PRK"))) %>%
# Convert ISO3 codes to country names, regions, and continents
mutate(origin_country = countrycode(iso3, "iso3c", "country.name"),
origin_region = countrycode(iso3, "iso3c", "region"),
origin_continent = countrycode(iso3, "iso3c", "continent")) %>%
# Make this data tidy
gather(year, number, -origin_country, -iso3, -origin_region, -origin_continent)
# Strip the "X" prefix that read.csv() added to the year column names
main_1$year <- sub("^X", "", main_1$year)
Travellers<- main_1 %>%
mutate(year = as.numeric(year),
year_date = ymd(paste0(year, "-01-01")))
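The reshaping above can be sketched in a single step with `tidyr::pivot_longer()`, which can drop the `X` prefix while it reshapes. This is an alternative illustration on a hypothetical toy data frame (`toy_wide` is made up), not the method used in this report; `as.Date()` stands in for `ymd()` so the example needs only dplyr and tidyr.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data: one row per country, one column per year
toy_wide <- data.frame(
  origin_country = c("A", "B"),
  X2006 = c(10L, 1L),
  X2007 = c(5L, 7L)
)

toy_long <- toy_wide %>%
  pivot_longer(
    cols = starts_with("X"),  # the yearly columns
    names_to = "year",
    names_prefix = "X",       # remove the leading "X" from the column names
    values_to = "number"
  ) %>%
  mutate(
    year = as.numeric(year),
    year_date = as.Date(paste0(year, "-01-01"))
  )

print(toy_long)
```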
Cumulative Total
cumulative_immigration<- Travellers %>%
group_by(origin_continent, year_date) %>%
summarize(total = sum(number, na.rm = TRUE)) %>%
arrange(year_date) %>%
mutate(cumulative_total = cumsum(total))
print(cumulative_immigration)
## # A tibble: 40 × 4
## # Groups: origin_continent [4]
## origin_continent year_date total cumulative_total
## <chr> <date> <int> <int>
## 1 Africa 2006-01-01 18116 18116
## 2 Americas 2006-01-01 3258 3258
## 3 Asia 2006-01-01 10076 10076
## 4 Europe 2006-01-01 9605 9605
## 5 Africa 2007-01-01 17473 35589
## 6 Americas 2007-01-01 2976 6234
## 7 Asia 2007-01-01 23557 33633
## 8 Europe 2007-01-01 4179 13784
## 9 Africa 2008-01-01 8931 44520
## 10 Americas 2008-01-01 4271 10505
## # ℹ 30 more rows
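As a quick check on the running totals, the sketch below recomputes them for the Africa and Americas rows printed above (2006 to 2008 only). The data frame `totals` is typed in by hand from that output for illustration; the grouping is made explicit so the cumulative sum restarts for each continent.

```r
library(dplyr)

# Yearly totals copied from the printed output above (Africa and Americas only)
totals <- data.frame(
  origin_continent = rep(c("Africa", "Americas"), times = 3),
  year = rep(2006:2008, each = 2),
  total = c(18116, 3258, 17473, 2976, 8931, 4271)
)

check <- totals %>%
  group_by(origin_continent) %>%
  # Sort by year within each continent before accumulating
  arrange(year, .by_group = TRUE) %>%
  mutate(cumulative_total = cumsum(total)) %>%
  ungroup()

print(check)
```

The Africa rows accumulate to 18116, 35589 and 44520, and the Americas rows to 3258, 6234 and 10505, matching the table.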
Analyzing further
# Store the endpoints as Dates so they compare cleanly with year_date
year1_end <- as.Date(c("2006-01-01", "2015-01-01"))
longterm_migration<- cumulative_immigration %>%
filter(year_date %in% year1_end) %>%
select(origin_continent,year_date,cumulative_total) %>%
group_by(origin_continent) %>%
mutate(pct_increase = ((cumulative_total/lag(cumulative_total)-1)*100)) %>%
mutate(left_label = ifelse(year_date == "2006-01-01", origin_continent,NA))
print(longterm_migration)
## # A tibble: 8 × 5
## # Groups: origin_continent [4]
## origin_continent year_date cumulative_total pct_increase left_label
## <chr> <date> <int> <dbl> <chr>
## 1 Africa 2006-01-01 18116 NA Africa
## 2 Americas 2006-01-01 3258 NA Americas
## 3 Asia 2006-01-01 10076 NA Asia
## 4 Europe 2006-01-01 9605 NA Europe
## 5 Africa 2015-01-01 141682 682. <NA>
## 6 Americas 2015-01-01 36187 1011. <NA>
## 7 Asia 2015-01-01 417479 4043. <NA>
## 8 Europe 2015-01-01 24119 151. <NA>
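The `pct_increase` column can be reproduced by hand from the two rows per continent shown above. The short sketch below does the arithmetic with the printed figures. Note that the 2006 value is a single year's total while the 2015 value is the running total through 2015, so the percentage describes growth of the cumulative count rather than of annual arrivals.

```r
# Figures copied from the printed output above
first_total <- c(Africa = 18116, Americas = 3258, Asia = 10076, Europe = 9605)
last_total  <- c(Africa = 141682, Americas = 36187, Asia = 417479, Europe = 24119)

# Same formula as in the mutate() call: (new / old - 1) * 100
pct_increase <- (last_total / first_total - 1) * 100

print(round(pct_increase))
```

Rounded, this gives 682, 1011, 4043 and 151 percent for Africa, the Americas, Asia and Europe.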
plot9 <- ggplot(data = longterm_migration,
                mapping = aes(x = year_date, y = cumulative_total,
                              col = origin_continent, group = origin_continent,
                              # text is only read by plotly tooltips
                              text = cumulative_total)) +
  geom_point(size = 2) +
  # linewidth replaces size for lines as of ggplot2 3.4.0
  geom_line(linewidth = 1) +
  scale_y_continuous(sec.axis = dup_axis(), labels = label_comma()) +
  theme_bw(base_family = "Roboto Condensed") +
  labs(title = "Cumulative Total of Immigrants", subtitle = "Year 2006 and 2015")
print(plot9)
# Keep only the rows that have a computed percentage (the 2015 rows)
percentage_increase <- longterm_migration %>% filter(!is.na(pct_increase))
fig5 <- ggplot(data = percentage_increase,
               aes(x = origin_continent, y = pct_increase, size = pct_increase)) +
  geom_point() +
  theme_bw(base_family = "Roboto Condensed") +
  labs(title = "Percentage Increase In Immigrants", subtitle = "Year 2006 and 2015")
print(fig5)
#ggsave(fig1, filename = "fig1.pdf", width = 10, height = 10)
#ggsave(fig2, filename = "fig2.pdf", width = 10, height = 10)
#ggsave(plot9, filename = "fig4.pdf", width = 10, height = 10)
#ggsave(fig5, filename = "fig5.pdf", width = 10, height = 10)
#ggsave(fig4, filename = "fig3.pdf", width = 10, height = 10)
Importing the enhanced image into R