United Nations’ Migration Data

Discussion thread created by: Subhalaxmi Rout

1. Introduction

This data set records people who have migrated between countries all over the world. It was prepared and published by the United Nations. Each origin country (where the migrants come from) is presented in a column, and each destination country (where the migrants go to) is presented in a row. The file contains several worksheets covering different years, with the data broken down by total / male / female. This post imports only 'Table 16', which contains the total migrant data for 2015.

Link: Dataset link

2. Load library

## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
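
The messages above are what R prints when dplyr is attached. A minimal sketch of the loading step, assuming tidyr is loaded alongside dplyr (the conclusion mentions both packages); wrapping the calls in suppressPackageStartupMessages() is optional and only hides the masking notes:

```r
# Attach the packages used for cleaning and reshaping.
# suppressPackageStartupMessages() hides the masking notes shown above.
suppressPackageStartupMessages({
  library(dplyr)  # filter(), select(), rename(), arrange()
  library(tidyr)  # gather()
})
```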

3. Data load and cleaning

The data is stored on GitHub and loaded from there into RStudio using the read.csv() function.
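
The loading step might look like the sketch below. The GitHub URL is not given here, so the example writes a tiny stand-in CSV to a temporary file; the column names and all values are made up for illustration, and `migration_raw` is a hypothetical object name.

```r
# Stand-in for the file on GitHub: a tiny CSV written to a temporary file.
# Column names and values are made up for illustration only.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "destination,code,Mexico,India",
  "WORLD,900,100,200",
  "United States of America,840,50,20",
  "India,356,0,0"
), csv_path)

# In practice csv_path would be the raw GitHub URL of the data file.
# check.names = FALSE keeps country names with spaces unchanged.
migration_raw <- read.csv(csv_path, stringsAsFactors = FALSE, check.names = FALSE)
str(migration_raw)
```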

3.1 Remove regions and keep only countries
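
One way to drop the regional aggregate rows is sketched below. It assumes the aggregates can be identified by a numeric code column with values of 900 or above; the column name `code`, the threshold and the toy values are all assumptions, not taken from the original analysis.

```r
library(dplyr)

# Toy data with made-up values; `code` is a hypothetical column name.
migration_raw <- data.frame(
  destination = c("WORLD", "Africa", "Kenya", "India"),
  code        = c(900, 903, 404, 356),
  Mexico      = c(100, 10, 0, 0),
  stringsAsFactors = FALSE
)

# Keep only country rows (assumed: aggregates carry codes of 900 or above).
countries <- migration_raw %>% filter(code < 900)
countries$destination
```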

3.2 Remove unnecessary columns
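
A minimal sketch of this step using dplyr's select() with negated column names. Which columns were actually dropped is not stated, so `sort_order`, `notes` and `code` are hypothetical examples of bookkeeping columns, and the values are made up.

```r
library(dplyr)

# Toy data with made-up values; sort_order, notes and code are hypothetical
# names for bookkeeping columns that are not needed for the analysis.
countries <- data.frame(
  sort_order  = c(1, 2),
  destination = c("Kenya", "India"),
  notes       = c(NA, NA),
  code        = c(404, 356),
  Mexico      = c(0, 0),
  Uganda      = c(5, 0),
  stringsAsFactors = FALSE
)

# Drop the bookkeeping columns, keeping the destination and origin columns.
countries_slim <- countries %>% select(-sort_order, -notes, -code)
names(countries_slim)
```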

3.3 Rename column names
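
Renaming can be sketched with dplyr's rename(), which takes new_name = old_name pairs. The old name `X` and the new name `destination` are assumptions chosen for illustration; the toy values are made up.

```r
library(dplyr)

# Toy data with made-up values; `X` stands in for an uninformative
# column name produced on import.
countries_slim <- data.frame(
  X      = c("Kenya", "India"),
  Mexico = c(0, 0),
  stringsAsFactors = FALSE
)

# rename() takes new_name = old_name.
countries_named <- countries_slim %>% rename(destination = X)
names(countries_named)
```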

3.4 Gather 232 columns to make it tidy

For many pairs of countries no migration was recorded, so those pairs are removed.

3.5 Pairs of countries with more than 1 million migrants
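
The reshaping and the removal of empty pairs can be sketched as follows, assuming the wide table has one `destination` column plus one column per origin country. The column names `origin` and `migrants` and all values are made up for illustration; tidyr's gather() is used because the heading names it, although newer tidyr versions recommend pivot_longer() for the same job.

```r
library(dplyr)
library(tidyr)

# Toy wide table with made-up values: one row per destination,
# one column per origin country.
wide <- data.frame(
  destination = c("United States of America", "India"),
  Mexico      = c(50, NA),
  India       = c(20, 0),
  stringsAsFactors = FALSE
)

# Gather every origin column into key/value pairs, then drop pairs
# with no recorded migration (NA or zero).
tidy <- wide %>%
  gather(key = "origin", value = "migrants", -destination) %>%
  filter(!is.na(migrants), migrants > 0)
tidy
```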
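
A sketch of the filter, assuming a tidy table with `destination`, `origin` and `migrants` columns (hypothetical names). The counts are made up, and the extra `migrants_m` column expressing counts in millions is an illustrative addition.

```r
library(dplyr)

# Toy tidy table; the counts are made up for illustration only.
tidy <- data.frame(
  destination = c("United States of America", "India", "Kenya"),
  origin      = c("Mexico", "Bangladesh", "Uganda"),
  migrants    = c(12000000, 3200000, 300000),
  stringsAsFactors = FALSE
)

# Keep pairs with more than 1 million migrants and express counts in millions.
big_pairs <- tidy %>%
  filter(migrants > 1e6) %>%
  mutate(migrants_m = migrants / 1e6) %>%
  arrange(desc(migrants))
big_pairs
```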

4. Analysis

This part shows the relationship between origin country and destination country. Dark red marks a country pair with few immigrants, and blue marks a pair with many immigrants.

Note: Because of the filter above, immigrant counts are shown in millions.

4.2 People who migrated to the United States

4.4 Countries Americans migrated to
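
The plot type is not stated, so the following is only one possible rendering: a ggplot2 heatmap with a dark red to blue fill scale, matching the colors described above. ggplot2, the column names and the toy values are all assumptions.

```r
library(ggplot2)

# Toy data in millions; the values are made up for illustration only.
big_pairs <- data.frame(
  origin      = c("Mexico", "India", "Bangladesh"),
  destination = c("United States of America", "United States of America", "India"),
  migrants_m  = c(12, 2, 3.2),
  stringsAsFactors = FALSE
)

# One tile per origin/destination pair; low counts in dark red, high in blue.
p <- ggplot(big_pairs, aes(x = origin, y = destination, fill = migrants_m)) +
  geom_tile() +
  scale_fill_gradient(low = "darkred", high = "blue",
                      name = "Migrants (millions)") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
p
```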
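
A sketch of how the top origin countries for the United States could be selected. It assumes the destination is labeled "United States of America" and that the tidy table has `destination`, `origin` and `migrants` columns; the counts are made up.

```r
library(dplyr)

# Toy tidy table; the counts are made up for illustration only.
tidy <- data.frame(
  destination = c("United States of America", "United States of America", "India"),
  origin      = c("India", "Mexico", "Bangladesh"),
  migrants    = c(2000000, 12000000, 3000000),
  stringsAsFactors = FALSE
)

# Origins of migrants living in the United States, largest first, top 20.
to_us <- tidy %>%
  filter(destination == "United States of America") %>%
  arrange(desc(migrants)) %>%
  head(20)
to_us
```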
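
The reverse view filters on the origin column instead. As before, the label "United States of America", the column names and the counts are assumptions made for illustration.

```r
library(dplyr)

# Toy tidy table; the counts are made up for illustration only.
tidy <- data.frame(
  destination = c("Canada", "Mexico", "India"),
  origin      = c("United States of America", "United States of America", "Bangladesh"),
  migrants    = c(300000, 800000, 3000000),
  stringsAsFactors = FALSE
)

# Destinations of migrants from the United States, largest first, top 20.
from_us <- tidy %>%
  filter(origin == "United States of America") %>%
  arrange(desc(migrants)) %>%
  head(20)
from_us
```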

5. Conclusion

The data set contains 232 columns. Using functions from tidyr and dplyr, those columns were converted to rows, and filter conditions were then applied to produce the following analyses:

  • Migration from origin country to destination country
  • Top 20 origin countries from which people migrated to the United States
  • Top 20 destination countries to which Americans migrated