author: “Sufian”
date: “9/30/2019”
output: html_document
Rpub link:
http://rpubs.com/ssufian/535540
Source: United Nations Population Division
Department of Economic and Social Affairs : UN International Migrant Stock
NOTE: Migrant data refer to immigration into the area of destination (an area, region, or country)
Jai posed these questions: Summaries could be done by gender at any of the aggregation levels below
• Gender ratios of migrant stock for each region of the world, for each income group, etc.
• Average gender ratios of world migrant stock.
• What is the variance across countries?
• Is there a trend across years for any of these sequences?
Developed vs. Less Developed Regions
High vs. Low Income Countries
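As a worked illustration of the gender-ratio questions above, the sketch below uses the WORLD 1990 figures from the table printed later in this document. The variable names are illustrative and are not part of the analysis code.

```r
# WORLD 1990 migrant stock, copied from the data shown later in this document
male   <- 77661689
female <- 75349784
total  <- 153011473

# Sanity check: the two gender columns should add up to the total column
stopifnot(male + female == total)

# Two common ways to express a gender ratio
female_share          <- female / total        # women as a share of all migrants
males_per_100_females <- 100 * male / female   # classic sex ratio

round(female_share, 3)           # 0.492
round(males_per_100_females, 1)
```

The same two quantities can be computed at any aggregation level (region, income group, country) once the data are in one row per Region and Year.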
url <- 'https://raw.githubusercontent.com/ssufian/Data_607/master/UN_MigrantStockTotal_2019%20(1).csv'
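# Load the packages used throughout this analysis
library(dplyr)
library(tidyr)
library(stringr)
library(ggplot2)
# Reading & Loading data
df <- read.csv(file = url, sep = ",", na.strings = c("NA", " ", ""), strip.white = TRUE, stringsAsFactors = FALSE, skip = 13, header = FALSE)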
head(df)
## V1 V2 V3 V4
## 1 1 WORLD <NA> 900
## 2 2 UN development groups <NA> NA
## 3 3 More developed regions b 901
## 4 4 Less developed regions c 902
## 5 5 Least developed countries d 941
## 6 6 Less developed regions, excluding least developed countries <NA> 934
## V5 V6 V7 V8 V9 V10
## 1 <NA> 153,011,473 161,316,895 173,588,441 191,615,574 220,781,909
## 2 <NA> .. .. .. .. ..
## 3 <NA> 82,767,216 92,935,095 103,961,989 116,687,616 130,613,460
## 4 <NA> 70,244,257 68,381,800 69,626,452 74,927,958 90,168,449
## 5 <NA> 11,060,221 11,681,777 10,063,948 9,833,150 10,432,671
## 6 <NA> 59,184,036 56,700,023 59,562,504 65,094,808 79,735,778
## V11 V12 V13 V14 V15 V16
## 1 248,861,296 271,642,105 77,661,689 81,686,116 88,029,221 97,860,838
## 2 .. .. .. .. .. ..
## 3 140,643,317 152,069,261 40,426,798 45,377,588 50,801,898 57,078,401
## 4 108,217,979 119,572,844 37,234,891 36,308,528 37,227,323 40,782,437
## 5 13,631,349 16,289,023 5,550,233 5,824,077 5,033,932 4,987,537
## 6 94,586,630 103,283,821 31,684,658 30,484,451 32,193,391 35,794,900
## V17 V18 V19 V20 V21 V22
## 1 114,061,680 128,863,389 141,488,004 75,349,784 79,630,779 85,559,220
## 2 .. .. .. .. .. ..
## 3 63,408,858 67,824,389 73,765,353 42,340,418 47,557,507 53,160,091
## 4 50,652,822 61,039,000 67,722,651 33,009,366 32,073,272 32,399,129
## 5 5,185,496 6,784,461 8,086,158 5,509,988 5,857,700 5,030,016
## 6 45,467,326 54,254,539 59,636,493 27,499,378 26,215,572 27,369,113
## V23 V24 V25 V26
## 1 93,754,736 106,720,229 119,997,907 130,154,101
## 2 .. .. .. ..
## 3 59,609,215 67,204,602 72,818,928 78,303,908
## 4 34,145,521 39,515,627 47,178,979 51,850,193
## 5 4,845,613 5,247,175 6,846,888 8,202,865
## 6 29,299,908 34,268,452 40,332,091 43,647,328
#labeling columns
new_name <- c("Sort","Region","Notes","Code","Data_type","1990.Total","1995.Total","2000.Total","2005.Total","2010.Total","2015.Total","2019.Total","1990.Male","1995.Male","2000.Male","2005.Male","2010.Male","2015.Male","2019.Male","1990.Female","1995.Female","2000.Female","2005.Female","2010.Female","2015.Female","2019.Female")
#Rename Columns by position (new_name has one entry per column, in order)
names(df) <- new_name
head(df)
## Sort Region Notes
## 1 1 WORLD <NA>
## 2 2 UN development groups <NA>
## 3 3 More developed regions b
## 4 4 Less developed regions c
## 5 5 Least developed countries d
## 6 6 Less developed regions, excluding least developed countries <NA>
## Code Data_type 1990.Total 1995.Total 2000.Total 2005.Total
## 1 900 <NA> 153,011,473 161,316,895 173,588,441 191,615,574
## 2 NA <NA> .. .. .. ..
## 3 901 <NA> 82,767,216 92,935,095 103,961,989 116,687,616
## 4 902 <NA> 70,244,257 68,381,800 69,626,452 74,927,958
## 5 941 <NA> 11,060,221 11,681,777 10,063,948 9,833,150
## 6 934 <NA> 59,184,036 56,700,023 59,562,504 65,094,808
## 2010.Total 2015.Total 2019.Total 1990.Male 1995.Male 2000.Male
## 1 220,781,909 248,861,296 271,642,105 77,661,689 81,686,116 88,029,221
## 2 .. .. .. .. .. ..
## 3 130,613,460 140,643,317 152,069,261 40,426,798 45,377,588 50,801,898
## 4 90,168,449 108,217,979 119,572,844 37,234,891 36,308,528 37,227,323
## 5 10,432,671 13,631,349 16,289,023 5,550,233 5,824,077 5,033,932
## 6 79,735,778 94,586,630 103,283,821 31,684,658 30,484,451 32,193,391
## 2005.Male 2010.Male 2015.Male 2019.Male 1990.Female 1995.Female
## 1 97,860,838 114,061,680 128,863,389 141,488,004 75,349,784 79,630,779
## 2 .. .. .. .. .. ..
## 3 57,078,401 63,408,858 67,824,389 73,765,353 42,340,418 47,557,507
## 4 40,782,437 50,652,822 61,039,000 67,722,651 33,009,366 32,073,272
## 5 4,987,537 5,185,496 6,784,461 8,086,158 5,509,988 5,857,700
## 6 35,794,900 45,467,326 54,254,539 59,636,493 27,499,378 26,215,572
## 2000.Female 2005.Female 2010.Female 2015.Female 2019.Female
## 1 85,559,220 93,754,736 106,720,229 119,997,907 130,154,101
## 2 .. .. .. .. ..
## 3 53,160,091 59,609,215 67,204,602 72,818,928 78,303,908
## 4 32,399,129 34,145,521 39,515,627 47,178,979 51,850,193
## 5 5,030,016 4,845,613 5,247,175 6,846,888 8,202,865
## 6 27,369,113 29,299,908 34,268,452 40,332,091 43,647,328
df1 <- df
df1[df1==".."] <- "0"
# making dataset long format
df1 <- gather(df1,"year_types","n_years",6:26)
head(df1)
## Sort Region Notes
## 1 1 WORLD <NA>
## 2 2 UN development groups <NA>
## 3 3 More developed regions b
## 4 4 Less developed regions c
## 5 5 Least developed countries d
## 6 6 Less developed regions, excluding least developed countries <NA>
## Code Data_type year_types n_years
## 1 900 <NA> 1990.Total 153,011,473
## 2 NA <NA> 1990.Total 0
## 3 901 <NA> 1990.Total 82,767,216
## 4 902 <NA> 1990.Total 70,244,257
## 5 941 <NA> 1990.Total 11,060,221
## 6 934 <NA> 1990.Total 59,184,036
# strip thousands separators, then convert the counts to numeric
df2 <- df1 %>%
mutate(n_years = as.numeric(str_replace_all(n_years, ",", "")))
head(df2)
## Sort Region Notes
## 1 1 WORLD <NA>
## 2 2 UN development groups <NA>
## 3 3 More developed regions b
## 4 4 Less developed regions c
## 5 5 Least developed countries d
## 6 6 Less developed regions, excluding least developed countries <NA>
## Code Data_type year_types n_years
## 1 900 <NA> 1990.Total 153011473
## 2 NA <NA> 1990.Total 0
## 3 901 <NA> 1990.Total 82767216
## 4 902 <NA> 1990.Total 70244257
## 5 941 <NA> 1990.Total 11060221
## 6 934 <NA> 1990.Total 59184036
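The cleaning steps above turn ".." into "0" and strip the thousands separators before converting to numeric. A hedged alternative, sketched here in base R on a few cells from the table, is to map ".." to NA so that missing rows cannot be mistaken for genuine zeros. The vector names are hypothetical.

```r
# Three raw cells as they appear in the CSV: two counts and one ".." placeholder
raw <- c("153,011,473", "..", "82,767,216")

# Treat ".." as missing rather than as zero
# (an assumption for this sketch, not the approach used above)
raw[raw == ".."] <- NA

# Remove thousands separators, then convert to numeric
clean <- as.numeric(gsub(",", "", raw, fixed = TRUE))

clean          # 153011473, NA, 82767216
is.na(clean)   # FALSE TRUE FALSE
```

With this variant, header rows such as "UN development groups" would be dropped by testing for NA rather than by comparing against 0.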
#segregate total years into male and female years
separate_DF <- df2 %>% separate(year_types, c("Year", "gender"))
head(separate_DF)
## Sort Region Notes
## 1 1 WORLD <NA>
## 2 2 UN development groups <NA>
## 3 3 More developed regions b
## 4 4 Less developed regions c
## 5 5 Least developed countries d
## 6 6 Less developed regions, excluding least developed countries <NA>
## Code Data_type Year gender n_years
## 1 900 <NA> 1990 Total 153011473
## 2 NA <NA> 1990 Total 0
## 3 901 <NA> 1990 Total 82767216
## 4 902 <NA> 1990 Total 70244257
## 5 941 <NA> 1990 Total 11060221
## 6 934 <NA> 1990 Total 59184036
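By default, separate() splits on any run of non-alphanumeric characters, which is why "1990.Total" breaks cleanly at the dot. The same split can be sketched in base R; the object names below are illustrative only.

```r
# A few of the combined labels produced by the long-format step
year_types <- c("1990.Total", "1995.Male", "2019.Female")

# Split each label at the literal dot; rbind the pieces into a 2-column matrix
parts <- do.call(rbind, strsplit(year_types, ".", fixed = TRUE))

# Assemble a small data frame with a typed Year column
split_df <- data.frame(
  Year   = as.integer(parts[, 1]),
  gender = parts[, 2],
  stringsAsFactors = FALSE
)
split_df
```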
wide_DF <- separate_DF%>% spread(gender, n_years)
head(wide_DF)
## Sort Region Notes
## 1 1 WORLD <NA>
## 2 2 UN development groups <NA>
## 3 3 More developed regions b
## 4 4 Less developed regions c
## 5 5 Least developed countries d
## 6 6 Less developed regions, excluding least developed countries <NA>
## Code Data_type Year Female Male Total
## 1 900 <NA> 1990 75349784 77661689 153011473
## 2 NA <NA> 1990 0 0 0
## 3 901 <NA> 1990 42340418 40426798 82767216
## 4 902 <NA> 1990 33009366 37234891 70244257
## 5 941 <NA> 1990 5509988 5550233 11060221
## 6 934 <NA> 1990 27499378 31684658 59184036
no_zero_DF_wide <- wide_DF %>% filter(Female != 0)
# Drop the unnecessary columns of the dataframe
no_zero_DF_wide <- select (no_zero_DF_wide,-c(Notes,Code,Data_type)) %>% mutate_at(3, as.integer)
head(no_zero_DF_wide)
## Sort Region Year
## 1 1 WORLD 1990
## 2 3 More developed regions 1990
## 3 4 Less developed regions 1990
## 4 5 Least developed countries 1990
## 5 6 Less developed regions, excluding least developed countries 1990
## 6 8 High-income countries 1990
## Female Male Total
## 1 75349784 77661689 153011473
## 2 42340418 40426798 82767216
## 3 33009366 37234891 70244257
## 4 5509988 5550233 11060221
## 5 27499378 31684658 59184036
## 6 37812794 39990074 77802868
no_zero_DF1 <- gather(no_zero_DF_wide, "gender","N_years",4:6)
head(no_zero_DF1)
## Sort Region Year
## 1 1 WORLD 1990
## 2 3 More developed regions 1990
## 3 4 Less developed regions 1990
## 4 5 Least developed countries 1990
## 5 6 Less developed regions, excluding least developed countries 1990
## 6 8 High-income countries 1990
## gender N_years
## 1 Female 75349784
## 2 Female 42340418
## 3 Female 33009366
## 4 Female 5509988
## 5 Female 27499378
## 6 Female 37812794
require(ggthemes)
## Loading required package: ggthemes
world_trend <- filter(no_zero_DF1, Region == 'WORLD') %>%
filter(gender == "Male" | gender == "Female") %>%
mutate(percent_migration_trends = N_years/sum(N_years))
head(world_trend)
## Sort Region Year gender N_years percent_migration_trends
## 1 1 WORLD 1990 Female 75349784 0.05303269
## 2 1 WORLD 1995 Female 79630779 0.05604574
## 3 1 WORLD 2000 Female 85559220 0.06021830
## 4 1 WORLD 2005 Female 93754736 0.06598646
## 5 1 WORLD 2010 Female 106720229 0.07511184
## 6 1 WORLD 2015 Female 119997907 0.08445693
#trendline plots of World Migrants
# Multiple line plot
ggplot(world_trend , aes(x = Year, y = percent_migration_trends)) +
geom_line(aes(color = gender), size = 1) +
scale_color_manual(values = c("#00AFBB", "#E7B800")) +
ggtitle("World - Migration Trendlines % terms: Men vs Women")+
theme_excel()
Observation 1a:
Total world migrant stock for both men and women rose from 1990 through 2019, with men consistently higher than women
Men seem to be more mobile than women across time
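When reading the percentages in this chart, note that dividing by sum(N_years) without grouping by Year gives each value's share of the grand total across all years and both genders, which is why the printed values are near 0.05 rather than 0.5. A minimal base R sketch of the difference, using only the WORLD 1990 and 1995 values from the tables above (the two share column names are hypothetical):

```r
# WORLD male and female stock for two years, copied from the tables above
world <- data.frame(
  Year    = c(1990, 1990, 1995, 1995),
  gender  = c("Female", "Male", "Female", "Male"),
  N_years = c(75349784, 77661689, 79630779, 81686116)
)

# Share of the grand total over every row (ungrouped sum, as in world_trend)
world$share_of_all_rows <- world$N_years / sum(world$N_years)

# Share within each year: ave() returns the per-Year sum aligned to each row
world$share_within_year <- world$N_years / ave(world$N_years, world$Year, FUN = sum)

world
```

Because this sketch has only four rows, its ungrouped shares differ from the 14-row values printed above. The within-year shares are directly comparable to the per-group percentages computed with group_by(Year) later in this document.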
require(ggthemes)
library(ggplot2)
library(formattable)
#world migration over the years male vs. female
# Basic barplot - migration patterns over the years; men vs women
world <- no_zero_DF1 %>% filter(Region == "WORLD")
head(world)
## Sort Region Year gender N_years
## 1 1 WORLD 1990 Female 75349784
## 2 1 WORLD 1995 Female 79630779
## 3 1 WORLD 2000 Female 85559220
## 4 1 WORLD 2005 Female 93754736
## 5 1 WORLD 2010 Female 106720229
## 6 1 WORLD 2015 Female 119997907
world %>% group_by(gender, Year) %>%
ggplot(aes(x=Year, y=N_years,fill = gender)) +
geom_bar(stat = "identity", position = "dodge") +
xlab("Year") + ylab("Int'l Migrant Stock") +
ggtitle("Drivers of World Migration Trends by Gender") + ylim(0, 271642105)+
theme_excel()
Observation 1b:
Total migrant growth was driven mainly by men in absolute terms across time, confirming the first chart
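The size of that gap can be checked with a little arithmetic on the WORLD 1990 and 2019 columns shown at the top of this document. This is a sketch with illustrative variable names, not part of the original analysis.

```r
# WORLD migrant stock by gender, first and last year in the data
male_1990   <- 77661689
male_2019   <- 141488004
female_1990 <- 75349784
female_2019 <- 130154101

# Absolute growth over the period for each gender
male_growth   <- male_2019 - male_1990       # 63826315
female_growth <- female_2019 - female_1990   # 54804317

# Men's share of the total increase in migrant stock
male_share_of_growth <- male_growth / (male_growth + female_growth)
round(male_share_of_growth, 3)               # 0.538
```

On these figures men account for roughly 54% of the 1990 to 2019 increase, so the growth is male-led but far from exclusively male.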
# Investigating variances of World Migration trends by Income & Development regions
#variance by male & female
world_trend_variance <- no_zero_DF1 %>%
group_by(gender) %>%
filter(gender == "Male" |gender == "Female" ) %>%
filter(Region== "High-income countries" | Region=="Low-income countries"| Region=="More developed regions"| Region=="Less developed regions") %>%
mutate(std_dev = sd(N_years))
#Boxplot to show variances between regions and gender
p <- ggplot(world_trend_variance, aes(y=N_years,x=Region,fill=Region))+geom_boxplot()+
ggtitle("Variances of World Migration by Gender from FY90 - FY20")+facet_grid( ~ gender)+
theme_excel()
p <- p + theme(axis.text = element_text(size = 10,angle =45, hjust = 1))
p
Observation 2a:
medians with men seeing greater variances. Not surprisingly, less developed regions and lower income countries saw
lower immigration, based on their much lower medians, and they also had a lower spread in their
distributions
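The medians and spreads read off the boxplots can also be tabulated directly. The sketch below does this in base R for the female series of the two development groups, using the values printed in this document. The coefficient of variation (cv) is an addition for illustration, and the object names are hypothetical.

```r
# Female migrant stock by development group, 1990-2019 (values from this document)
female_stock <- data.frame(
  Region  = rep(c("More developed regions", "Less developed regions"), each = 7),
  Year    = rep(c(1990, 1995, 2000, 2005, 2010, 2015, 2019), times = 2),
  N_years = c(42340418, 47557507, 53160091, 59609215, 67204602, 72818928, 78303908,
              33009366, 32073272, 32399129, 34145521, 39515627, 47178979, 51850193)
)

# Split the counts by region, then summarise each group
by_region <- split(female_stock$N_years, female_stock$Region)
spread_by_region <- data.frame(
  Region = names(by_region),
  median = sapply(by_region, median),
  sd     = sapply(by_region, sd),
  cv     = sapply(by_region, function(x) sd(x) / mean(x)),  # spread relative to level
  row.names = NULL
)
spread_by_region
```

Comparing cv alongside sd helps separate "this group is simply bigger" from "this group varies more over time".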
# Trendlines by Hi Income Group
Hi_Income_trend <- no_zero_DF1 %>% filter( Region == 'High-income countries') %>%
group_by(Year) %>%
filter(gender == "Male" | gender == "Female") %>%
mutate(percent_migration_trends = N_years/sum(N_years))
head(Hi_Income_trend)
## # A tibble: 6 x 6
## # Groups: Year [6]
## Sort Region Year gender N_years percent_migration_tren~
## <int> <chr> <int> <chr> <dbl> <dbl>
## 1 8 High-income countries 1990 Female 37812794 0.486
## 2 8 High-income countries 1995 Female 43679245 0.489
## 3 8 High-income countries 2000 Female 50568214 0.491
## 4 8 High-income countries 2005 Female 58753262 0.487
## 5 8 High-income countries 2010 Female 69216394 0.480
## 6 8 High-income countries 2015 Female 76927104 0.480
#trendline plots of Hi Income Migrants
ggplot(Hi_Income_trend , aes(x = Year, y = percent_migration_trends)) +
geom_line(aes(color = gender), size = 1) +
scale_color_manual(values = c("#00AFBB", "#E7B800")) +
ggtitle("Hi Income Group - Migration Trendlines % terms: Men vs Women")+
theme_excel()
# trendline plots of More Developed Regions Migrants
More_developed_trend <- no_zero_DF1 %>% filter( Region == 'More developed regions') %>%
group_by(Year) %>%
filter(gender == "Male" | gender == "Female") %>%
mutate(percent_migration_trends = N_years/sum(N_years))
head(More_developed_trend)
## # A tibble: 6 x 6
## # Groups: Year [6]
## Sort Region Year gender N_years percent_migration_tren~
## <int> <chr> <int> <chr> <dbl> <dbl>
## 1 3 More developed regio~ 1990 Female 42340418 0.512
## 2 3 More developed regio~ 1995 Female 47557507 0.512
## 3 3 More developed regio~ 2000 Female 53160091 0.511
## 4 3 More developed regio~ 2005 Female 59609215 0.511
## 5 3 More developed regio~ 2010 Female 67204602 0.515
## 6 3 More developed regio~ 2015 Female 72818928 0.518
#trendline plots of More Developed Region Migrants
ggplot(More_developed_trend , aes(x = Year, y = percent_migration_trends)) +
geom_line(aes(color = gender), size = 1) +
scale_color_manual(values = c("#00AFBB", "#E7B800")) +
ggtitle("More Developed Region - Migration Trendlines % terms: Men vs Women")+
theme_excel()
#--------------------------------------------------------------------------------------------
# trendline plots of Low Income Migrants
Lo_Income_trend <- no_zero_DF1%>% filter(Region == 'Low-income countries') %>%
group_by(Year) %>%
filter(gender == "Male" | gender == "Female") %>%
mutate(percent_migration_trends = N_years/sum(N_years))
head(Lo_Income_trend)
## # A tibble: 6 x 6
## # Groups: Year [6]
## Sort Region Year gender N_years percent_migration_trends
## <int> <chr> <int> <chr> <dbl> <dbl>
## 1 12 Low-income countries 1990 Female 4909022 0.501
## 2 12 Low-income countries 1995 Female 5347591 0.506
## 3 12 Low-income countries 2000 Female 4526567 0.504
## 4 12 Low-income countries 2005 Female 4463965 0.498
## 5 12 Low-income countries 2010 Female 5094169 0.507
## 6 12 Low-income countries 2015 Female 6027533 0.508
#trendline plots of low Income Migrants
ggplot(Lo_Income_trend , aes(x = Year, y = percent_migration_trends)) +
geom_line(aes(color = gender), size = 1) +
scale_color_manual(values = c("#00AFBB", "#E7B800")) +
ggtitle("Low Income Group - Migration Trendlines % terms: Men vs Women")+
theme_excel()
# trendline plots of Less Developed Regions Migrants
Less_developed_trend <- no_zero_DF1 %>% filter(Region == 'Less developed regions') %>%
group_by(Year) %>%
filter(gender == "Male" | gender == "Female") %>%
mutate(percent_migration_trends = N_years/sum(N_years))
head(Less_developed_trend)
## # A tibble: 6 x 6
## # Groups: Year [6]
## Sort Region Year gender N_years percent_migration_tren~
## <int> <chr> <int> <chr> <dbl> <dbl>
## 1 4 Less developed regio~ 1990 Female 33009366 0.470
## 2 4 Less developed regio~ 1995 Female 32073272 0.469
## 3 4 Less developed regio~ 2000 Female 32399129 0.465
## 4 4 Less developed regio~ 2005 Female 34145521 0.456
## 5 4 Less developed regio~ 2010 Female 39515627 0.438
## 6 4 Less developed regio~ 2015 Female 47178979 0.436
#trendline plots of Less Developed Migrants
ggplot(Less_developed_trend , aes(x = Year, y = percent_migration_trends)) +
geom_line(aes(color = gender), size = 1) +
scale_color_manual(values = c("#00AFBB", "#E7B800")) +
ggtitle("Less Developed Region - Migration Trendlines % terms: Men vs Women")+
theme_excel()
Observation 2b:
The charts above were produced for each income group and development region to
see how each gender’s (men vs. women) migratory patterns changed over time.
after FY20 with men
growing while women decreased
increasing post FY20 while
women decreased; similar divergent trendline as in Hi Income countries
elevated level vs. men. Interestingly, neither gender saw any growth (flat
line) for almost 2 decades before women started to diverge and grow while, surprisingly, men
decreased around FY2005. It was not until late 2018 that men rebounded slightly while women
decreased at the same time
were trending higher early FY20 and then criss-crossed with men in mid FY2005. Post FY2005, men
decreased while women’s numbers soared
nations that are also low income. For example, G7 countries vs. 7 Less developed and Low Income
Countries, which were randomly selected from each hemisphere
discerning (clear) migration behavior between genders.
Note: My hunch is that the way the United Nations grouped the countries into various categories
may have cross-listed countries, leading to this contradiction
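A quick arithmetic check on the 1990 totals printed earlier is consistent with this hunch: the two development groups add up exactly to the WORLD total, yet the high-income group is not the same size as the more developed group, so the income and development groupings are different cuts of the same set of countries. The sketch below uses illustrative variable names and does not by itself identify which countries sit in both a "less developed" and a "high-income" group.

```r
# 1990 total migrant stock for selected aggregates (values from this document)
world_1990       <- 153011473
more_developed   <- 82767216
less_developed   <- 70244257
high_income      <- 77802868

# The development groups partition the world total exactly
more_developed + less_developed == world_1990   # TRUE

# The income grouping is a different cut: high-income is not "more developed"
gap <- more_developed - high_income             # 4964348
gap
```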
#Segregating countries into poor (randomly picked from each hemisphere) vs rich (G7 nations)
rich_countires <- no_zero_DF1 %>% filter(Region == 'United States of America'|Region == 'Canada'|Region == 'France'|Region == 'Germany'|Region == 'Italy'|Region == 'United Kingdom'|Region == 'Japan') %>%
group_by(Year) %>%
filter(gender == "Male" | gender == "Female") %>%
mutate(percent_migration_trends = N_years/sum(N_years))
poor_countries <- no_zero_DF1 %>% filter(Region == 'Albania'|Region =='Venezuela (Bolivarian Republic of)'|Region == 'Mexico'|Region == 'Honduras'|Region =='Syrian Arab Republic'|Region =='Egypt'
|Region =='Senegal') %>%
group_by(Year) %>%
filter(gender == "Male" | gender == "Female") %>%
mutate(percent_migration_trends = N_years/sum(N_years))
poor_countries_men <- poor_countries %>%
filter(gender == "Male" | gender == "Female") %>%
group_by(gender) %>%
summarise(sum(N_years))
rich_countries_men <- rich_countires%>%
filter(gender == "Male" | gender == "Female") %>%
group_by(gender) %>%
summarise(sum(N_years))
head(poor_countries)
## # A tibble: 6 x 6
## # Groups: Year [1]
## Sort Region Year gender N_years percent_migration_tr~
## <int> <chr> <int> <chr> <dbl> <dbl>
## 1 75 Senegal 1990 Female 131570 0.0409
## 2 81 Egypt 1990 Female 81838 0.0255
## 3 102 Syrian Arab Republic 1990 Female 350063 0.109
## 4 177 Honduras 1990 Female 132850 0.0413
## 5 178 Mexico 1990 Female 347321 0.108
## 6 195 Venezuela (Bolivarian R~ 1990 Female 507430 0.158
head(rich_countires)
## # A tibble: 6 x 6
## # Groups: Year [1]
## Sort Region Year gender N_years percent_migration_trends
## <int> <chr> <int> <chr> <dbl> <dbl>
## 1 129 Japan 1990 Female 536552 0.0118
## 2 250 United Kingdom 1990 Female 1893838 0.0416
## 3 259 Italy 1990 Female 785805 0.0172
## 4 271 France 1990 Female 2897891 0.0636
## 5 272 Germany 1990 Female 2643053 0.0580
## 6 280 Canada 1990 Female 2223666 0.0488
# bar plots of poor countries
ggplot(poor_countries, aes(x=Year, y=percent_migration_trends,fill = gender)) +
geom_bar(stat = "identity", position = "dodge") +
xlab("Year") + ylab("Int'l Migrant Stock") +
ggtitle("Drivers of World Migration in poor countries by Gender") +
theme_excel()
# bar plots of G7 "rich" countries
ggplot(rich_countires, aes(x=Year, y=percent_migration_trends,fill = gender)) +
geom_bar(stat = "identity", position = "dodge") +
xlab("Year") + ylab("Int'l Migrant Stock") +
ggtitle("Drivers of World Migration in G7 countries by Gender") +
theme_excel()
Observation 3a:
By randomly picking known poor and less developed countries vs. known rich and developed
countries such as the G7 nations, it was more visible which gender was more mobile (dominant).
It should also be noted that a more rigorous and robust way to select poor vs. rich countries
is via each country’s GDP or other socio-economic metrics. This extra step was meant to quickly
resolve the confusing and contradictory patterns that emerged in the deep dive into
the constituents of the different groupings in Observation 2b
nations while the opposite is true for richer nations, here the G7 nations; in fact,
now that rich vs. poor countries were segregated distinctly to avoid cross-over
categorization, women were actually slightly more mobile, 51% to 49% relative to men, in the rich
nations, while the opposite is true in the poorer countries.
Among the 7 sampled poorer countries, the top 3 by migrant stock over the last 3 decades
(1990 to 2019) were Venezuela, Syria and Mexico. This was not surprising as these 3 were conflict
nations: Venezuela had the most punishing economic collapse, the Syrians had a massive civil
war while, sadly, Mexico had the worst civilian violence stemming from the narco trade
most_migrant_countries <- no_zero_DF1 %>% filter(Region == 'Albania'|Region =='Venezuela (Bolivarian Republic of)'|Region == 'Mexico'|Region == 'Honduras'|Region =='Syrian Arab Republic'|Region =='Egypt'
|Region =='Senegal') %>%
group_by(Year) %>%
filter(gender=="Total") %>%
arrange(desc(N_years))
most_migrant_countries
## # A tibble: 49 x 5
## # Groups: Year [7]
## Sort Region Year gender N_years
## <int> <chr> <int> <chr> <dbl>
## 1 102 Syrian Arab Republic 2010 Total 1787561
## 2 195 Venezuela (Bolivarian Republic of) 2015 Total 1404448
## 3 195 Venezuela (Bolivarian Republic of) 2019 Total 1375690
## 4 195 Venezuela (Bolivarian Republic of) 2010 Total 1347347
## 5 195 Venezuela (Bolivarian Republic of) 2005 Total 1076474
## 6 178 Mexico 2019 Total 1060707
## 7 178 Mexico 2015 Total 1028803
## 8 195 Venezuela (Bolivarian Republic of) 1990 Total 1025009
## 9 195 Venezuela (Bolivarian Republic of) 1995 Total 1019996
## 10 195 Venezuela (Bolivarian Republic of) 2000 Total 1013738
## # ... with 39 more rows
ggplot(most_migrant_countries, aes(x=Year, y=N_years/sum(N_years),fill = Region)) +
geom_bar(stat = "identity", position = "dodge") +
xlab("Year") + ylab(" Migrant Stock") +
ggtitle("Driver countries of Migration in poorer nations") +
theme_excel()
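Because the sorted table mixes years, one hedged way to rank the countries themselves is to take each country's peak stock. The sketch below does this in base R using only the ten rows printed above, so the remaining four sampled countries are not represented. Object names are illustrative. As noted at the top of this document, these figures are stocks of immigrants living in each country.

```r
# The ten largest Region-Year rows printed above
top_rows <- data.frame(
  Region = c("Syrian Arab Republic",
             rep("Venezuela (Bolivarian Republic of)", 4),
             "Mexico", "Mexico",
             rep("Venezuela (Bolivarian Republic of)", 3)),
  Year    = c(2010, 2015, 2019, 2010, 2005, 2019, 2015, 1990, 1995, 2000),
  N_years = c(1787561, 1404448, 1375690, 1347347, 1076474,
              1060707, 1028803, 1025009, 1019996, 1013738)
)

# Peak stock per country, then sort from largest to smallest
peak <- aggregate(N_years ~ Region, data = top_rows, FUN = max)
peak <- peak[order(-peak$N_years), ]
peak
```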
# comparing world migration variances over ALL Regions over time
world_migration_wide <- no_zero_DF_wide %>%
group_by(Region, Year) %>%
mutate(femalepct = Female/Total) %>%
mutate(malepct = Male/Total)
head(world_migration_wide)
## # A tibble: 6 x 8
## # Groups: Region, Year [6]
## Sort Region Year Female Male Total femalepct malepct
## <int> <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 WORLD 1990 7.53e7 7.77e7 1.53e8 0.492 0.508
## 2 3 More developed regio~ 1990 4.23e7 4.04e7 8.28e7 0.512 0.488
## 3 4 Less developed regio~ 1990 3.30e7 3.72e7 7.02e7 0.470 0.530
## 4 5 Least developed coun~ 1990 5.51e6 5.55e6 1.11e7 0.498 0.502
## 5 6 Less developed regio~ 1990 2.75e7 3.17e7 5.92e7 0.465 0.535
## 6 8 High-income countries 1990 3.78e7 4.00e7 7.78e7 0.486 0.514
dfhist <- world_migration_wide %>%
group_by(Year)
# Overlaid histograms
pf <- ggplot(dfhist, aes(x=femalepct, color=Year)) +
geom_histogram(fill="red", alpha=0.5, position="identity")+facet_grid(Year ~ .)
pf
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
pm <- ggplot(dfhist, aes(x=malepct, color=Year)) +
geom_histogram(fill="yellow", alpha=0.5, position="identity")+facet_grid(Year ~ .)
pm
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Observation 3b:
Variance analysis showed that both men’s and women’s total world migratory distributions over time
exhibited approximate normality.
The only difference was that women had a left skew while men had a right skew. Their central
tendencies were very similar as well. The right skewness in men also confirmed the earlier box
plots of the high income & more developed region groupings, which showed that men relative
to women had a higher spread due to longer positive tails.
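One caveat on the opposite skews: wherever Male + Female equals Total, malepct is simply 1 - femalepct, so the male histogram is the mirror image of the female one and the skews must have opposite signs. The sketch below illustrates this in base R with the six rounded 1990 femalepct values printed above. The skewness() helper is a hypothetical moment-based definition written for this example, and a Shapiro-Wilk test on six values is only indicative.

```r
# Rounded 1990 female shares for the six aggregates printed above
femalepct <- c(0.492, 0.512, 0.470, 0.498, 0.465, 0.486)
malepct   <- 1 - femalepct   # holds when Male + Female == Total

# Moment-based sample skewness (helper defined here for illustration)
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

skew_f <- skewness(femalepct)
skew_m <- skewness(malepct)
c(female = skew_f, male = skew_m)   # equal magnitude, opposite sign

# A formal normality check to accompany the visual reading of the histograms
p_value <- shapiro.test(femalepct)$p.value
p_value
```

Running the same helper on the full femalepct column, per Year, would give a numeric counterpart to the faceted histograms.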
This short study showed very interesting migratory behaviors between men and women over the
period 1990 to 2019. In general, men seem to be more mobile and were able to move into
higher income countries. It was also shown that men were able to move more into the "less
developed regions" as well. However, what is paradoxical is that the trends showed women
outpacing men in the “developed regions” and also doing better in the low income
countries. This was an ironic “finding” that deserved further investigation and analysis, to
say the least, because these two sets of findings seem to be in contradiction. The next
step of this study was to truly separate out the traditionally known rich nations from
the poorer nations. I randomly picked 7 poor nations from each hemisphere and compared them
to the G7 countries:
men at 51% vs. 49%. This statistic, curiously, was exactly the opposite in poorer countries,
with men having the slight advantage.
migration flow activities, most probably stemming from its internal socio-economic issues