International Migrant Stock 2019

Data Source: United Nations, Department of Economic and Social Affairs. Population Division (2019). International Migrant Stock 2019 (United Nations database, POP/DB/MIG/Stock/Rev.2019).

The dataset presents estimates of international migrants by age, sex and origin. Estimates are presented for 1990, 1995, 2000, 2005, 2010, 2015 and 2019 and are available for all countries and areas of the world. The estimates are based on official statistics on the foreign-born or the foreign population.

I will be using the "International migrant stock by destination and origin" data file for this project.

Import Data

Here I am importing a .csv file with UN migration data for 1990 to 2019. It has three columns: the year, the destination region and the total number of migrants. These data came from the website of the United Nations, Department of Economic and Social Affairs, Population Division (2019).

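# Packages used in the tidying steps below
library(dplyr)    # rename(), %>%
library(tidyr)    # spread()
library(stringr)  # str_remove()
UN_migration_data <- read.csv("C:/Users/Staff/Documents/Geeth/SPS-Fall 19/DATA 607/UN_migration_data.csv")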
head(UN_migration_data)
##   ï..Year Major.area..region..country.or.area.of.destination origin.Total
## 1    1990                                 Geographic regions           ..
## 2    1990                                             Africa   15,689,666
## 3    1990                                               Asia   48,209,949
## 4    1990                                             Europe   49,608,231
## 5    1990                    Latin America and the Caribbean    7,161,371
## 6    1990                                   Northern America   27,610,408
View(UN_migration_data)
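
The `ï..` prefix on the first column name in the output above usually means the file begins with a UTF-8 byte-order mark (BOM). As a hedged sketch, assuming that is the cause here, `read.csv()` can be asked to skip the mark with `fileEncoding = "UTF-8-BOM"`. The file below is a small stand-in written to a temporary path, not the UN file, and `demo` is an illustrative name.

```r
# Build a tiny stand-in CSV that begins with a UTF-8 byte-order mark (BOM).
# The contents are illustrative only, not the real UN file.
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("Year,Region,Total\n1990,Africa,\"15,689,666\"\n"), con)
close(con)

# fileEncoding = "UTF-8-BOM" tells R to drop the BOM while reading;
# stringsAsFactors = FALSE keeps text columns as character vectors.
demo <- read.csv(tmp, fileEncoding = "UTF-8-BOM", stringsAsFactors = FALSE)
names(demo)
```

With the mark skipped, the first column should come through as plain `Year`, and the later `as.character()` conversion of the region column would not be needed.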

Tidying UN data for better visualization

In the next several sections I am going to tidy my data using some data wrangling functions that come with dplyr and tidyr.

names(UN_migration_data)
## [1] "ï..Year"                                           
## [2] "Major.area..region..country.or.area.of.destination"
## [3] "origin.Total"
UN_migration_data$Major.area..region..country.or.area.of.destination <- as.character(UN_migration_data$Major.area..region..country.or.area.of.destination)

Rename Columns

First of all, I am going to rename my columns. By default my data table has column titles that contain symbols and some long names. In this section I will replace all of them with simple titles that make sense.

UN_migration <- UN_migration_data %>%
  rename(Year = ï..Year) %>%
  rename(Region = Major.area..region..country.or.area.of.destination) %>%
  rename(Total.migrations = origin.Total)
UN_migration
##    Year                          Region Total.migrations
## 1  1990              Geographic regions               ..
## 2  1990                          Africa       15,689,666
## 3  1990                            Asia       48,209,949
## 4  1990                          Europe       49,608,231
## 5  1990 Latin America and the Caribbean        7,161,371
## 6  1990                Northern America       27,610,408
## 7  1990                         Oceania        4,731,848
## 8  1995              Geographic regions               ..
## 9  1995                          Africa       16,357,077
## 10 1995                            Asia       46,418,044
## 11 1995                          Europe       53,489,829
## 12 1995 Latin America and the Caribbean        6,688,710
## 13 1995                Northern America       33,340,948
## 14 1995                         Oceania        5,022,287
## 15 2000              Geographic regions               ..
## 16 2000                          Africa       15,051,677
## 17 2000                            Asia       49,394,322
## 18 2000                          Europe       56,858,788
## 19 2000 Latin America and the Caribbean        6,570,729
## 20 2000                Northern America       40,351,694
## 21 2000                         Oceania        5,361,231
## 22 2005              Geographic regions               ..
## 23 2005                          Africa       15,969,835
## 24 2005                            Asia       53,439,306
## 25 2005                          Europe       63,594,822
## 26 2005 Latin America and the Caribbean        7,224,942
## 27 2005                Northern America       45,363,257
## 28 2005                         Oceania        6,023,412
## 29 2010              Geographic regions               ..
## 30 2010                          Africa       17,804,198
## 31 2010                            Asia       65,938,712
## 32 2010                          Europe       70,678,025
## 33 2010 Latin America and the Caribbean        8,262,433
## 34 2010                Northern America       50,970,861
## 35 2010                         Oceania        7,127,680
## 36 2015              Geographic regions               ..
## 37 2015                          Africa       23,476,251
## 38 2015                            Asia       77,231,760
## 39 2015                          Europe       75,008,219
## 40 2015 Latin America and the Caribbean        9,441,679
## 41 2015                Northern America       55,633,443
## 42 2015                         Oceania        8,069,944
## 43 2019              Geographic regions               ..
## 44 2019                          Africa       26,529,334
## 45 2019                            Asia       83,559,197
## 46 2019                          Europe       82,304,539
## 47 2019 Latin America and the Caribbean       11,673,288
## 48 2019                Northern America       58,647,822
## 49 2019                         Oceania        8,927,925
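
As an alternative sketch in base R, all three names can be assigned at once by position, which avoids typing the mangled `ï..Year` name. The data frame `demo` and its original column names are made up for illustration, and the approach assumes the column order is known and fixed.

```r
# Stand-in data frame with awkward column names, similar to the imported ones.
demo <- data.frame(
  a.long.name = c(1990, 1990),
  another..long.name = c("Africa", "Asia"),
  origin.Total = c("15,689,666", "48,209,949"),
  stringsAsFactors = FALSE
)

# Assign all three names at once, by position.
names(demo) <- c("Year", "Region", "Total.migrations")
names(demo)
```

Renaming by name, as done with rename() above, is safer when the column order might change.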

Delete rows

While going through my data, I found that each year's block begins with a subtitle row labelled "Geographic regions" whose total is "..". I am going to delete those rows so that the Region column contains only region names.

UN_migration <- UN_migration[UN_migration$Region != "Geographic regions",]
UN_migration
##    Year                          Region Total.migrations
## 2  1990                          Africa       15,689,666
## 3  1990                            Asia       48,209,949
## 4  1990                          Europe       49,608,231
## 5  1990 Latin America and the Caribbean        7,161,371
## 6  1990                Northern America       27,610,408
## 7  1990                         Oceania        4,731,848
## 9  1995                          Africa       16,357,077
## 10 1995                            Asia       46,418,044
## 11 1995                          Europe       53,489,829
## 12 1995 Latin America and the Caribbean        6,688,710
## 13 1995                Northern America       33,340,948
## 14 1995                         Oceania        5,022,287
## 16 2000                          Africa       15,051,677
## 17 2000                            Asia       49,394,322
## 18 2000                          Europe       56,858,788
## 19 2000 Latin America and the Caribbean        6,570,729
## 20 2000                Northern America       40,351,694
## 21 2000                         Oceania        5,361,231
## 23 2005                          Africa       15,969,835
## 24 2005                            Asia       53,439,306
## 25 2005                          Europe       63,594,822
## 26 2005 Latin America and the Caribbean        7,224,942
## 27 2005                Northern America       45,363,257
## 28 2005                         Oceania        6,023,412
## 30 2010                          Africa       17,804,198
## 31 2010                            Asia       65,938,712
## 32 2010                          Europe       70,678,025
## 33 2010 Latin America and the Caribbean        8,262,433
## 34 2010                Northern America       50,970,861
## 35 2010                         Oceania        7,127,680
## 37 2015                          Africa       23,476,251
## 38 2015                            Asia       77,231,760
## 39 2015                          Europe       75,008,219
## 40 2015 Latin America and the Caribbean        9,441,679
## 41 2015                Northern America       55,633,443
## 42 2015                         Oceania        8,069,944
## 44 2019                          Africa       26,529,334
## 45 2019                            Asia       83,559,197
## 46 2019                          Europe       82,304,539
## 47 2019 Latin America and the Caribbean       11,673,288
## 48 2019                Northern America       58,647,822
## 49 2019                         Oceania        8,927,925
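
The filtered output above keeps the original row numbers (2, 3, ..., 49), with gaps where the subtitle rows were. A minimal base R sketch of the same filter on a small stand-in data frame, followed by a row-name reset, is shown below. `demo` and `kept` are illustrative names, and the values are copied from the table above.

```r
# Stand-in for the first rows of the renamed table.
demo <- data.frame(
  Year = c(1990, 1990, 1990, 1995, 1995),
  Region = c("Geographic regions", "Africa", "Asia",
             "Geographic regions", "Africa"),
  Total.migrations = c("..", "15,689,666", "48,209,949",
                       "..", "16,357,077"),
  stringsAsFactors = FALSE
)

# Keep only the rows that name an actual region.
kept <- subset(demo, Region != "Geographic regions")

# Renumber the rows 1..n so the gaps left by the dropped rows disappear.
rownames(kept) <- NULL
kept
```

Dropping the subtitle rows also removes every ".." placeholder from the totals column, since those were the only rows that contained it.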

Reshaping Data

In this section I am going to change my data from long format to wide format using the spread() function. The Year column in my original data set will be spread into 7 columns, one for each year.

UN_Migration_byYear <- as.data.frame.matrix(UN_migration)
UN_Migration_byYear <- UN_Migration_byYear %>%
  spread(key = Year, value = Total.migrations)
head(UN_Migration_byYear)
##                            Region       1990       1995       2000
## 1                          Africa 15,689,666 16,357,077 15,051,677
## 2                            Asia 48,209,949 46,418,044 49,394,322
## 3                          Europe 49,608,231 53,489,829 56,858,788
## 4 Latin America and the Caribbean  7,161,371  6,688,710  6,570,729
## 5                Northern America 27,610,408 33,340,948 40,351,694
## 6                         Oceania  4,731,848  5,022,287  5,361,231
##         2005       2010       2015       2019
## 1 15,969,835 17,804,198 23,476,251 26,529,334
## 2 53,439,306 65,938,712 77,231,760 83,559,197
## 3 63,594,822 70,678,025 75,008,219 82,304,539
## 4  7,224,942  8,262,433  9,441,679 11,673,288
## 5 45,363,257 50,970,861 55,633,443 58,647,822
## 6  6,023,412  7,127,680  8,069,944  8,927,925
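
In tidyr 1.0.0 and later, pivot_wider() is the recommended replacement for spread(). The sketch below shows the equivalent call on a four-row excerpt of the data, assuming a tidyr version that provides pivot_wider(). `long` and `wide` are illustrative names.

```r
library(tidyr)

# Four rows of the long table, with values copied from the output above.
long <- data.frame(
  Year = c(1990, 1990, 1995, 1995),
  Region = c("Africa", "Asia", "Africa", "Asia"),
  Total.migrations = c("15,689,666", "48,209,949",
                       "16,357,077", "46,418,044"),
  stringsAsFactors = FALSE
)

# names_from plays the role of spread()'s key, values_from that of value.
wide <- pivot_wider(long, names_from = Year, values_from = Total.migrations)
wide
```

The result has one row per region and one column per year, the same shape that spread() produces.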
The totals are still text with thousands separators, so they have to be cleaned before they can be treated as numbers. str_remove() drops only the first comma in each string, which leaves values such as "15689,666" that as.numeric() turns into NA. str_remove_all() drops every comma, and the converted values are assigned back to the data frame so the change is kept.

UN_Migration_byYear$`1990` <- as.numeric(str_remove_all(UN_Migration_byYear$`1990`, ","))
UN_Migration_byYear$`1995` <- as.numeric(str_remove_all(UN_Migration_byYear$`1995`, ","))
UN_Migration_byYear$`2000` <- as.numeric(str_remove_all(UN_Migration_byYear$`2000`, ","))
UN_Migration_byYear$`2005` <- as.numeric(str_remove_all(UN_Migration_byYear$`2005`, ","))
UN_Migration_byYear$`2010` <- as.numeric(str_remove_all(UN_Migration_byYear$`2010`, ","))
UN_Migration_byYear$`2015` <- as.numeric(str_remove_all(UN_Migration_byYear$`2015`, ","))
UN_Migration_byYear$`2019` <- as.numeric(str_remove_all(UN_Migration_byYear$`2019`, ","))
str(UN_Migration_byYear)
## 'data.frame':    6 obs. of  8 variables:
##  $ Region: chr  "Africa" "Asia" "Europe" "Latin America and the Caribbean" ...
##  $ 1990  : num  15689666 48209949 49608231 7161371 27610408 ...
##  $ 1995  : num  16357077 46418044 53489829 6688710 33340948 ...
##  $ 2000  : num  15051677 49394322 56858788 6570729 40351694 ...
##  $ 2005  : num  15969835 53439306 63594822 7224942 45363257 ...
##  $ 2010  : num  17804198 65938712 70678025 8262433 50970861 ...
##  $ 2015  : num  23476251 77231760 75008219 9441679 55633443 ...
##  $ 2019  : num  26529334 83559197 82304539 11673288 58647822 ...
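
The effect of removing only the first comma versus all commas can be checked on two of the 1990 values. This is a minimal base R sketch in which sub() and gsub() stand in for stringr's str_remove() and str_remove_all(). The variable names are illustrative.

```r
x <- c("15,689,666", "7,161,371")

# sub() replaces only the first match, so one comma is left behind
# and as.numeric() would not be able to parse the result.
first_only <- sub(",", "", x, fixed = TRUE)
first_only

# gsub() replaces every match, leaving digits only.
all_removed <- gsub(",", "", x, fixed = TRUE)
as.numeric(all_removed)
```

The first result still contains a comma in each string ("15689,666" and "7161,371"), while the second converts cleanly to 15689666 and 7161371.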