Immigration in the US by Charlotte Levavasseur

This project relies on data published by the Department of Homeland Security. Its website offers the Yearbook of Immigration Statistics for download for the past 15 years (https://www.dhs.gov/immigration-statistics/yearbook/2016).

The data is available as PDF files as well as .xls tables.

I’ve chosen to focus this project on the characteristics of foreign nationals who were granted lawful permanent residence (i.e., immigrants who received a “green card”) over the past 10 years.

I downloaded the following tables and inserted the data into my immigration dataset:
- Table 1: Persons Obtaining Lawful Permanent Resident Status: Fiscal Years 1820 To 2016
- Table 3*: Persons Obtaining Lawful Permanent Resident Status By Region And Country Of Birth: Fiscal Years 2007 To 2016
- Table 8: Persons Obtaining Lawful Permanent Resident Status By Sex, Age, Marital Status, And Occupation: Fiscal Years 2007 To 2016

Note: Before importing the different tables into my dataset, I performed a bit of cleaning:
- Removed the header and footnotes to facilitate the import
- Transposed columns and rows so that the year is in a column
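As a rough sketch of the transposition step, assuming a raw table with one row per region and one column per fiscal year (the object names `raw` and `tidy` and all values are made up for illustration, not taken from the Yearbook), the reshaping could look like this in base R:

```r
# Hypothetical raw table as it might look once the header and footnotes
# are removed: one row per region, one column per fiscal year.
# The values are invented for illustration and are NOT Yearbook figures.
raw <- data.frame(
  Region = c("Africa", "Asia", "Europe"),
  X2015 = c(10, 40, 9),
  X2016 = c(11, 42, 8),
  stringsAsFactors = FALSE
)

# Transpose the numeric part so each row is a year and each column a region.
values <- t(as.matrix(raw[, -1]))
colnames(values) <- raw$Region

# Rebuild a data frame with an explicit Year column.
tidy <- data.frame(
  Year = as.integer(sub("^X", "", rownames(values))),
  values,
  row.names = NULL
)

print(tidy)
```

With the year in its own column, each region becomes a variable that can be summarised or plotted against `Year` directly.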

Before diving deeper into the analysis, let’s take a look at our newly created dataset. Our immigration dataset contains the following variables:

##  [1] "Year"                   "GCR"                   
##  [3] "Africa"                 "Asia"                  
##  [5] "Europe"                 "North.America"         
##  [7] "Oceania"                "South.America"         
##  [9] "Unknown"                "Female"                
## [11] "Male"                   "Under.16.years"        
## [13] "X16.to.20.years"        "X21.years.and.over"    
## [15] "Married"                "Single"                
## [17] "Widowed"                "Divorced.Separated"    
## [19] "Unknown_marital_status"

First, let’s take a look at the overall evolution of immigration in the US over the last century. We can see drops in the number of immigrants around the two world wars, a steady increase after WW2, and a spike in the 1990s.

Now that we have an overview of the evolution of immigration in the US over the last century, we are going to focus our study on the last 10 years.
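Restricting the long series to the last ten fiscal years could be sketched as follows. The data frame `long_series` and its values are placeholders, and the 2007 to 2016 window is assumed from the table titles listed earlier:

```r
# Placeholder yearly series; GCR values are invented, NOT Yearbook figures.
long_series <- data.frame(
  Year = 2000:2016,
  GCR = seq(1000, by = 10, length.out = 17)
)

# Keep only fiscal years 2007 to 2016.
last_ten <- subset(long_series, Year >= 2007 & Year <= 2016)

print(nrow(last_ten))
print(range(last_ten$Year))
```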

Univariate Plots Section

Univariate Analysis

What is the structure of your dataset?

The dataset I used doesn’t

What is/are the main feature(s) of interest in your dataset?

What other features in the dataset do you think will help support your
investigation into your feature(s) of interest?

Did you create any new variables from existing variables in the dataset?

Of the features you investigated, were there any unusual distributions?
Did you perform any operations on the data to tidy, adjust, or change the form
of the data? If so, why did you do this?

Over the past ten years, immigration has been fairly stable, and as the graph below shows, most immigrants come from Asia and North America. Together, the two regions account for 72.8% of total immigration on average.

##     Oceania       South.America         Asia       North.America  
##  Min.   :0.4597   Min.   : 6.726   Min.   :36.04   Min.   :31.44  
##  1st Qu.:0.4730   1st Qu.: 7.363   1st Qu.:38.12   1st Qu.:31.88  
##  Median :0.4981   Median : 8.139   Median :40.17   Median :32.26  
##  Mean   :0.5012   Mean   : 8.131   Mean   :39.68   Mean   :33.11  
##  3rd Qu.:0.5138   3rd Qu.: 8.766   3rd Qu.:41.35   3rd Qu.:34.42  
##  Max.   :0.5797   Max.   :10.121   Max.   :42.52   Max.   :36.10  
##      Europe           Africa      
##  Min.   : 7.895   Min.   : 8.999  
##  1st Qu.: 7.978   1st Qu.: 9.571  
##  Median : 8.354   Median : 9.665  
##  Mean   : 8.616   Mean   : 9.821  
##  3rd Qu.: 9.180   3rd Qu.: 9.873  
##  Max.   :10.126   Max.   :11.235
## [1] 72.792
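The 72.8% figure is consistent with the two means in the summary (39.68 + 33.11 ≈ 72.79). A minimal sketch of how such percentage shares could be computed is shown here, assuming that `GCR` holds the yearly total of green card recipients (the report does not define it, so this is an assumption) and using made-up counts:

```r
# Invented counts for three fiscal years; NOT Yearbook figures.
# GCR is ASSUMED to be the yearly total of green card recipients.
immigration <- data.frame(
  Year = c(2014, 2015, 2016),
  GCR = c(1000, 1000, 1000),
  Asia = c(400, 390, 410),
  North.America = c(320, 340, 330)
)

# Share of each region in the yearly total, in percent.
regions <- c("Asia", "North.America")
shares <- immigration[regions] / immigration$GCR * 100

print(summary(shares))

# Average combined share of the two regions over the period.
combined <- mean(shares$Asia + shares$North.America)
print(round(combined, 3))
```

Dividing by the yearly total before averaging gives each year equal weight, regardless of how many people were admitted that year.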

Gender doesn’t seem to be a determining factor when painting a picture of green card recipients over the past 10 years. The totals of men and women receiving one are pretty close, and the larger group alternates from one year to the next.
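One way to check this kind of claim is to compute the female share and the majority group per year. The sketch below uses the `Female` and `Male` column names from the dataset, but the counts and the derived columns `Female.share` and `Majority` are invented for illustration:

```r
# Invented yearly counts; NOT Yearbook figures.
gender <- data.frame(
  Year = c(2014, 2015, 2016),
  Female = c(540, 495, 520),
  Male = c(460, 505, 480)
)

# Female share of the two groups combined, in percent.
gender$Female.share <- gender$Female / (gender$Female + gender$Male) * 100

# Which group is larger in each year.
gender$Majority <- ifelse(gender$Female > gender$Male, "Female", "Male")

print(gender)
```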

Bivariate Plots Section

Bivariate Analysis

Talk about some of the relationships you observed in this part of the
investigation. How did the feature(s) of interest vary with other features in
the dataset?

Did you observe any interesting relationships between the other features
(not the main feature(s) of interest)?

What was the strongest relationship you found?

Multivariate Plots Section

Multivariate Analysis

Talk about some of the relationships you observed in this part of the
investigation. Were there features that strengthened each other in terms of
looking at your feature(s) of interest?

Were there any interesting or surprising interactions between features?

OPTIONAL: Did you create any models with your dataset? Discuss the
strengths and limitations of your model.


Final Plots and Summary

Plot One

Description One

Plot Two

Description Two

Plot Three

Description Three


Reflection
