The data set I chose for my final project covers total immigration in fiscal year 2022. The statistics cover asylees, lawful permanent residents, nonimmigrants, naturalizations, and refugees, as well as overall population. The variables I will be using are the ones mentioned above, including totals. This topic and data set stood out to me because it hits close to home. My father is an immigrant who crossed the Rio Grande in 1986. On May 1, 2024, he passed his citizenship exam and became an official U.S. citizen. To clean the data set, I removed commas and extraneous spaces so that the values are read as numeric data rather than text. I also removed headers and extra rows that could disrupt the data. The source for the data is the Office of Homeland Security Statistics analysis of Department of Homeland Security, Department of State, Department of Justice, and U.S. Census Bureau data.
Sources including the Office of Homeland Security Statistics and agencies like Customs and Border Protection (CBP), Immigration and Customs Enforcement (ICE), and U.S. Citizenship and Immigration Services (USCIS) collect data on immigration. The data captures a wide range of categories, such as lawful permanent residents, refugees and asylees, naturalizations, nonimmigrants, enforcement actions, and unauthorized immigrants. The New Monthly Immigration Report from the Office of Homeland Security Statistics provides detailed monthly data on encounters, arrests, detention, removals, and other immigration processes. These sources publish information on immigration trends and statistics through annual reports, FOIA libraries, and data portals. USCIS offers historical documents and operational data, while ICE focuses on enforcement and detention statistics.
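The cleaning step described in the introduction (removing commas and stray spaces so values are numeric rather than text) can be sketched in base R. This is only an illustration under stated assumptions, not the exact cleaning I ran: the rows are made-up toy values, and `clean_number` is a hypothetical helper name.

```r
# Toy rows that mimic raw spreadsheet cells.
# The values are made up for illustration, not taken from the DHS data sheet.
raw <- data.frame(
  States = c(" State A", "State B ", "State C"),
  Population = c("6,500,000", " 1,200,000 ", "10,050,000"),
  Naturalizations = c("12,345", "678", " 90,100"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop commas and whitespace, then convert to numeric.
clean_number <- function(x) {
  as.numeric(gsub("[,[:space:]]", "", x))
}

cleaned <- raw
cleaned$States <- trimws(cleaned$States)  # strip leading/trailing spaces from names

# Apply the helper to every column that should hold numbers.
num_cols <- c("Population", "Naturalizations")
cleaned[num_cols] <- lapply(cleaned[num_cols], clean_number)

str(cleaned)
```

After this step the numeric columns can be filtered and summarized directly, which is what the dplyr commands later in the report rely on.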
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.4.4 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.0
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(readr)
library(plotly)
library(broom)
library(magrittr)
# Load data set
data <- readr::read_csv("/Users/jasonlaucel/Data 110 Folder/2023_0824_plcy_state_immigration_data_sheets_fy2022/2023_0824_plcy_state_immigration_data_sheets_fy2022 2/All States Totals 2022-Table 1 7.csv")
New names:
• `Rank` -> `Rank...6`
• `Rank` -> `Rank...8`
Rows: 52 Columns: 16
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (1): States
dbl (15): Population, Lawful Permanent Residents, Lawful Permanent Residents...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
library(dplyr)

# Three dplyr commands
data <- data %>%
  filter(Population > 5000000) %>%  # Filter for states with population > 5 million
  group_by(States) %>%              # Group by state
  summarise(mean_Naturalizations = mean(Naturalizations))  # Calculate mean naturalizations by state
# Load ggplot2
library(ggplot2)

# State names with spaces replaced by line breaks
data$State_ <- gsub(" ", "\n", data$States)

# I randomly chose my own custom color palette
my_colors <- c(
  "#FF5733", "#7FFF00", "#33FFB5", "#FF3386", "#336BFF", "#FFD700", "#33FF33",
  "#FF00FF", "#FF5733", "#00FFFF", "#8A2BE2", "#FF4500", "#00FF7F", "#ADFF2F",
  "#008080", "#FFD700", "#FF6347", "#BA55D3", "#4169E1", "#00FA9A", "#20B2AA",
  "#FFA500", "#800080", "#FFC0CB", "#2E8B57", "#FF00FF", "#4682B4", "#9400D3",
  "#FFA07A", "#FF69B4", "#8B008B", "#4B0082", "#7FFFD4", "#000080", "#FF6347",
  "#ADFF2F", "#00BFFF", "#FF7F50", "#6495ED", "#FF00FF", "#40E0D0", "#FFD700",
  "#00FF7F", "#BDB76B", "#20B2AA", "#B22222", "#FF4500", "#B0C4DE", "#008080",
  "#DAA520", "#FF6347", "#0000FF", "#FF00FF"
)

# ggplot with custom palette
p <- ggplot(data, aes(x = States, y = mean_Naturalizations, fill = States)) +
  geom_bar(stat = "identity") +
  scale_fill_manual(values = my_colors) +  # Use custom color palette
  theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 3)) +
  labs(x = "State", y = "Naturalizations to Population") +  # Label axes
  ggtitle("Naturalizations to Population by State")

# Interactivity: convert the ggplot object to a plotly widget once, then display it
p <- ggplotly(p)
p
This visualization reflects the dplyr filter and summary commands I applied earlier. It shows naturalizations in the filtered states, those with a population above 5 million. I chose this specific category of immigration status because I know people who’ve recently been naturalized.
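The chart title refers to naturalizations relative to population, while the pipeline above plots the naturalization figure for each state. If a per-capita rate is wanted, one possible sketch in base R follows. The numbers are made up for illustration, and `nat_per_100k` is a hypothetical column name that does not appear in the original data.

```r
# Made-up illustrative values, not figures from the FY2022 data sheet.
states <- data.frame(
  States = c("State A", "State B", "State C"),
  Population = c(6000000, 10000000, 2000000),
  Naturalizations = c(12000, 50000, 1000)
)

# Keep large states, mirroring the Population > 5 million filter above.
large <- states[states$Population > 5000000, ]

# Hypothetical column: naturalizations per 100,000 residents.
large$nat_per_100k <- large$Naturalizations / large$Population * 1e5

# Show the states ordered from highest to lowest rate.
print(large[order(-large$nat_per_100k), ])
```

Dividing by population puts large and small states on the same scale, so a rate like this could be mapped to the y-axis in place of the raw count if that comparison is the goal.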
## Tableau Visualization
## U.S. State Map
https://public.tableau.com/shared/4YF6DDDYG?:display_count=n&:origin=viz_share_link
This map of the U.S. shows the influx of immigrants by state. The color scale is intentional: the states with the heaviest traffic stand out the most. The Tableau data points for each state are also interactive.
## U.S. Data for All States
https://public.tableau.com/views/U_SImmigrationFiscalYear2022Data/U_SImmigrationFiscalYear2022?:language=en-US&publish=yes&:sid=&:display_count=n&:origin=viz_share_link
This visualization takes a more analytical approach to the data. It is less focused on immediate visual impact than the U.S. map, but it does a good job of showing how the statistics vary from state to state.
Immigration has been an ongoing issue in the U.S. for centuries. Its prominence in the political climate has risen drastically over the past decades. The mass influx of immigrants from all over the world into the U.S. has raised political and social alarms. People immigrate to the U.S. for many reasons, and each individual’s case comes with its own status type. I covered some of these status types in the visualizations for each state.
Some current news I’ve researched relates to the Biden bipartisan immigration deal. This proposal by the president aims to create a just and firm handling of the U.S.-Mexico border dilemma. The immediate action would be emergency funding for detention facilities, along with provisions covering asylee processing and administration, border shutoffs or caps, ports of entry during “border shutdowns”, local and state funding for large immigrant communities, the border wall, visas, DEA and border security funding, Dreamers, work permits, Border Patrol, and much more. This list is extensive on purpose, because there is so much that needs to be addressed.
The visualizations I’ve created compare immigration statistics across states. The Tableau visualizations provide country-wide context, showing both the scale of immigration and where it is concentrated. The visualization I created in R analyzes naturalizations in states filtered by population, using the dplyr commands described above.
Scholtes, J., & Emma, C. (2024, February 5). Detention and that border ‘shutdown’: What’s really in Biden’s bipartisan immigration deal. Politico. Retrieved May 4, 2024, from https://www.politico.com/news/2024/02/05/biden-bipartisan-immigration-deal-00139558