This project will allow you to access data on deportations in the United States from 2003 to 2024. In your analysis, you will be able to assess several claims that have been made regarding deportation. Are common narratives about deportation sustainable given the observed data? This is what social scientists do: we make, or attempt to make, evidence-based claims. The tasks I am asking you to do here are portable to any (or most any) data set you might encounter whether it is in international relations, comparative politics, economics, sociology, and so forth. The “POL 51” aspect of this assignment is to give you hands-on experience in interpreting plots, univariate statistics, and rudimentary hypothesis testing. In class, we will cover this extensively. I have assigned an article by Patler and Jones and it is posted on Canvas. It is expected you read this article in advance of writing up your responses.
Project 1 is worth 800 points with an extra credit option at the end. This assignment is due on July 14 by 11:59 PM. You must submit an HTML file to Canvas. Rmd files will not be graded and you will score a 0. Use of AI to generate code or answers is prohibited.
A common trope in the immigration debate is that undocumented immigrants commit, at high rates, violent crimes. Therefore, the supposition is that migrants who are deported are migrants who have committed serious criminal infractions. This idea is prevalent in political rhetoric surrounding the issue of deportation. But is the claim consistent with the actual data?
Part 1 of this assignment is asking you to analyze real-world data on deportations in the United States between the years 2003 and 2024. The data you access records annual ICE removals (deportations) based on what ICE records as the “Most Serious Criminal Conviction” for someone who is deported. The following information is from TRAC (Transactional Records Access Clearinghouse) and describes what the classification levels mean:
“Seriousness Level of MSCC Conviction. ICE classifies National Crime Information Center (NCIC) offense codes into three seriousness levels. The most serious (Level 1) covers what ICE considers to be “aggravated felonies.” Level 2 offenses cover other felonies, while Level 3 offenses are misdemeanors, including petty and other minor violations of the law. TRAC uses ICE’s “business rules” to group recorded NCIC offense codes into these three seriousness levels.”
Loosely, this means that “Level 1” convictions are the most serious and “Level 3” convictions are generally minor legal infractions. In addition to Levels 1-3, there is a fourth category called “None” denoting that the deportee had no criminal convictions.
This chunk of code will access the data set.
library(tidyverse) #provides read_csv() and ggplot()
reasons="https://raw.githubusercontent.com/mightyjoemoon/POL51/main/ICE_reasonforremoval.csv"
reasons<-read_csv(url(reasons))
summary(reasons)
## Year President All None
## Min. :2003 Length:22 Min. : 56882 Min. : 19495
## 1st Qu.:2008 Class :character 1st Qu.:178148 1st Qu.: 85446
## Median :2014 Mode :character Median :238765 Median :106426
## Mean :2014 Mean :248987 Mean :122287
## 3rd Qu.:2019 3rd Qu.:356423 3rd Qu.:165287
## Max. :2024 Max. :407821 Max. :253342
## Level1 Level2 Level3 Undocumented
## Min. : 9819 Min. : 3846 Min. : 11045 Min. :10100000
## 1st Qu.:38484 1st Qu.: 9056 1st Qu.: 34978 1st Qu.:10500000
## Median :46743 Median :17480 Median : 63186 Median :11050000
## Mean :46534 Mean :15601 Mean : 64541 Mean :11015455
## 3rd Qu.:57148 3rd Qu.:20342 3rd Qu.: 90950 3rd Qu.:11375000
## Max. :75590 Max. :29436 Max. :130251 Max. :12200000
## ER_Non
## Min. : 4018
## 1st Qu.:28563
## Median :41647
## Mean :38980
## 3rd Qu.:50230
## Max. :71686
For this task, you are going to summarize the data by way of analyzing descriptive, univariate statistics (i.e. mean, s.d., median, iqr). To do this:
First, create four variables recording the percentage of all deportations that are: Level 1, Level 2, Level 3, and None. Put your code in the chunk below to do this.
#Insert code for Task 1.1 here:
reasons$percent_level1 <-100*(reasons$Level1/reasons$All)
reasons$percent_level2 <-100*(reasons$Level2/reasons$All)
reasons$percent_level3 <-100*(reasons$Level3/reasons$All)
reasons$percent_levelNone <-100*(reasons$None/reasons$All)
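#Note: this returns NULL because no column is named percent_None;
#the variable created above is percent_levelNone
reasons$percent_None
## NULL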
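The same step can be sketched as a loop. This is a minimal illustration, assuming a small made-up data frame (`toy`, with invented counts) rather than the TRAC file. It also shows why a misspelled name such as `percent_None` comes back as `NULL` instead of raising an error.

```r
# Hypothetical counts for three years; these numbers are invented for illustration.
toy <- data.frame(
  Level1 = c(40, 50, 30),
  Level2 = c(10, 15, 10),
  Level3 = c(50, 60, 40),
  None   = c(100, 75, 120)
)
# Assumption: in this toy example, All is exactly the sum of the four categories.
toy$All <- toy$Level1 + toy$Level2 + toy$Level3 + toy$None

# Build percent_level1, percent_level2, percent_level3, percent_levelNone in one loop.
for (lev in c("Level1", "Level2", "Level3", "None")) {
  new_name <- paste0("percent_level", sub("Level", "", lev))
  toy[[new_name]] <- 100 * toy[[lev]] / toy$All
}

names(toy)

# A column name that does not exist returns NULL rather than an error.
is.null(toy$percent_None)

# Because All was built as the sum of the categories, the four shares add to 100.
rowSums(toy[, grep("^percent_", names(toy))])
```

In the real data the four shares need not add to exactly 100, since `All` is taken from the file rather than computed this way.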
Second, compute the mean, standard deviation, median, and iqr for each of the newly created variables (this means you will have \(4 \times 4=16\) statistics). Put your code to do this in the chunk below.
#Insert code for Task 1.2 here
mean(reasons$percent_level1, na.rm=TRUE)
## [1] 19.77446
#roughly 20%, as expected
sd(reasons$percent_level1, na.rm=TRUE)
## [1] 5.625592
median(reasons$percent_level1, na.rm=TRUE)
## [1] 18.3565
IQR(reasons$percent_level1, na.rm=TRUE)
## [1] 4.999591
mean(reasons$percent_level2, na.rm=TRUE)
## [1] 6.53759
sd(reasons$percent_level2, na.rm=TRUE)
## [1] 2.10969
median(reasons$percent_level2, na.rm=TRUE)
## [1] 6.715175
IQR(reasons$percent_level2, na.rm=TRUE)
## [1] 3.27075
mean(reasons$percent_level3, na.rm=TRUE)
## [1] 24.50703
sd(reasons$percent_level3, na.rm=TRUE)
## [1] 7.295998
median(reasons$percent_level3, na.rm=TRUE)
## [1] 24.30652
IQR(reasons$percent_level3, na.rm=TRUE)
## [1] 12.03445
mean(reasons$percent_levelNone, na.rm=TRUE)
## [1] 49.17316
sd(reasons$percent_levelNone, na.rm=TRUE)
## [1] 9.966849
median(reasons$percent_levelNone, na.rm=TRUE)
## [1] 45.3763
IQR(reasons$percent_levelNone, na.rm=TRUE)
## [1] 10.11668
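The sixteen calls above could also be collapsed into a single table. The following is only a sketch, assuming a made-up data frame (`toy`) whose columns are named like the variables created in this task; `describe` and `stats_table` are hypothetical names introduced for illustration.

```r
# Hypothetical percentage columns (invented values), named like the variables above.
toy <- data.frame(
  percent_level1    = c(20, 25, 15, 20),
  percent_level2    = c(5, 10, 5, 8),
  percent_level3    = c(25, 30, 20, 22),
  percent_levelNone = c(50, 35, 60, 50)
)

# One helper that returns all four statistics for a numeric vector.
describe <- function(x) {
  c(Mean   = mean(x, na.rm = TRUE),
    s.d.   = sd(x, na.rm = TRUE),
    Median = median(x, na.rm = TRUE),
    IQR    = IQR(x, na.rm = TRUE))
}

# sapply() puts the statistics in rows; t() flips the result so each variable is a row.
stats_table <- round(t(sapply(toy, describe)), 1)
stats_table
```

Rounding once, inside the table, avoids copying sixteen numbers by hand and makes transcription slips less likely.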
Third, using the shell table code below, enter the statistics you computed, rounding to one decimal place.
| Type | Mean | s.d. | Median | IQR |
|---|---|---|---|---|
| Level 1 | 19.8 | 5.6 | 18.4 | 5.0 |
| Level 2 | 6.5 | 2.1 | 6.7 | 3.3 |
| Level 3 | 24.5 | 7.3 | 24.3 | 12.0 |
| None | 49.2 | 10.0 | 45.4 | 10.1 |
Fourth, provide a thorough and substantive interpretation of the tabularized data. What do we learn? How does your analysis square with the criminality narrative?
The tabularized data clearly undercut the criminality narrative. The most telling point is that deportations of immigrants with no criminal convictions have a median share of 45.4% and an IQR of 10.1. Even using a measure of center that is robust to outliers, roughly 40-50% of migrants deported between 2003 and 2024 had no criminal record. The number is higher still when we use the mean as our measure of center: an average of 49.2% with a standard deviation of 10.0. A mean above the median suggests that a few years had a share of deportees with no criminal record well above the 45% median.
The criminality narrative gets even weaker when you examine the percentage of migrants with minor criminal infractions, such as running a red light. Both the mean and median sit around 24%, with moderate year-to-year variation (s.d. 7.3, IQR 12.0). Adding the None and Level 3 shares together, deportations that don’t fit the traditional idea of “criminality” make up roughly 70-74% of the total (about 69.7% using the medians and 73.7% using the means), a striking majority.
Conversely, the percentage of immigrants deported with non-aggravated felonies falls near 6.5%, with small variation over the years (IQR of 3.3 and s.d. of 2.1).
Finally, the percentage of immigrants deported with an “aggravated felony” falls near 19%, although it varies by about 5 percentage points (s.d. 5.6, IQR 5.0). It is important to note that the definition of an aggravated felony has expanded substantially since its inception in 1988 to include crimes such as filing a false tax return and failing to appear in court (Source: American Immigration Council, https://www.americanimmigrationcouncil.org/fact-sheet/aggravated-felonies-overview/).
This expansion likely inflates the Level 1 statistics and deflates the Level 2, or non-aggravated felony, statistics.
## Task 2: Interpretation of a line plot
The following is a line plot of the four levels of criminality in percentages based on the variables you just created (Levels 1-3 and None).
First add proper labels to each axis and give an informative main title.
Second, provide a thorough substantive interpretation of the plot that is non-mechanical.
If you were conveying the information from this plot to an audience interested in understanding deportation, what would you say? And perhaps, more importantly, what would you not say?
Mechanical answers, short answers, or answers that do not display a substantive understanding of the issue will receive low scores even if the mechanical interpretation is correct.
We will need to turn the following code into a chunk. This will be done in class.
t2<-ggplot(reasons, aes(x = Year)) +
geom_line(aes(y = percent_level1, color="Level 1"), size=.9, linetype=3) +
geom_line(aes(y = percent_level2, color="Level 2"), size=.8, linetype=1) +
geom_line(aes(y = percent_level3, color="Level 3"), size=.8, linetype=2) +
geom_line(aes(y = percent_levelNone, color="None"), size=.8, linetype=1) +
labs(title="ICE Removals by Most Serious Criminal Conviction, 2003-2024",
y="Percent of deportations", x="Year",
color="Severity of criminal record") +
theme_classic() +
theme(#panel.grid.major.y = element_line(colour = "grey", linetype = "dashed"),
panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
axis.text.y = element_text(size=9),
axis.text.x = element_text(size=9, angle=45, hjust=1),
axis.ticks = element_blank(),
plot.caption=element_text(hjust=0, size=10),
legend.position=c(.85,.98),
legend.justification=c("right", "top"),
legend.title = element_text(size = 8),
legend.text = element_text(size = 8),
plot.title = element_text(size=12))
t2
The graph of deportation levels over time is highly illuminating about the nature of deportations across the 22 years in the data. It confirms our analysis of the tabularized data: the share of deportations of immigrants with no criminal convictions hovers between 45% and 60%, spiking in 2008 and again after 2020.
Level 3 convictions (minor offenses) start at around 20% and rise, overtaking Level 1 convictions (aggravated felonies) around 2010 and staying at around 35% of deportations until 2020, when they fall back down to 20%.
Together, these two lines confirm our earlier assessment of the criminality narrative from the tabularized data. Deportations with no criminal record are the plurality, and at times the majority, of the data, and for most years Level 3 deportations are a strong second.
Level 2 deportations (non-aggravated felonies) stabilize at around 6.5%.
Level 1 deportations start at around 30% and fall quickly to around 20% by about 2008. During 2020-2021 they briefly spike to just under 40% before falling back down to 20%. The angle of the decline after 2020 is almost exactly opposite the angle of incline of “None” deportations, suggesting that from around 2021 to 2024 the two are inversely related.
Speculation: The rise of “None” deportations in 2008 may reflect the expansion of the 287(g) program, in which local law enforcement agencies were enlisted by ICE for immigration enforcement, as well as increased use of expedited removal (Source: Patler and Jones).
The sudden spike in Level 1 convictions and crash in Level 3 convictions during 2020 was likely due to changes in immigration patterns and enforcement directives during COVID.
For this task you will create three new variables from existing ones in the data set.
First, create a new variable called minor that sums all deportations associated with no criminal conviction (“None”) and Level 3 convictions. These are the deportations associated with minor or no criminal activity. Enter the code to do this part of the task in this chunk.
#Insert code for Task 3.1 here
reasons$minor <- reasons$None + reasons$Level3
summary(reasons$minor)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 31292 121880 172478 186829 265590 319536
Second, compute the percentage of all deportations that are “minor” deportations (i.e. \(100 \times \frac{None + Level~3}{None + Level~1 + Level~2 + Level~3}\)). Call this new variable percent_minor.
#Insert the code for task 3.2 here
reasons$percent_minor <- 100*(reasons$minor/reasons$All)
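As a quick sanity check on this construction, the sketch below uses invented counts in a made-up data frame (`toy`) to confirm that `percent_minor` is simply the Level 3 share plus the None share. It assumes, as in the formula above, that `All` equals the sum of the four categories.

```r
# Hypothetical counts for two years; the numbers are invented for illustration.
toy <- data.frame(
  Level1 = c(40, 50),
  Level2 = c(10, 15),
  Level3 = c(50, 60),
  None   = c(100, 75)
)
# Assumption: All is the sum of the four categories, as in the task's formula.
toy$All <- toy$Level1 + toy$Level2 + toy$Level3 + toy$None

# Same two steps as in Tasks 3.1 and 3.2.
toy$minor <- toy$None + toy$Level3
toy$percent_minor <- 100 * toy$minor / toy$All

# The combined share equals the sum of the two separate shares.
separate_shares <- 100 * toy$Level3 / toy$All + 100 * toy$None / toy$All
all.equal(toy$percent_minor, separate_shares)

toy$percent_minor
```

This is why the `percent_minor` line can be read as the Level 3 and None lines from Task 2 stacked together.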
Third, create a presentation-grade plot of the variable percent_minor and provide a thorough and substantive interpretation of the plot. Quality of the plot will be scored on a 100-point scale and quality of the write-up will be scored on a 100-point scale.
#Insert code for task 3.3 here
#size is not a label, so it is dropped from labs(); an explicit x-axis label is added
t3 <- ggplot(reasons, aes(x=Year)) +
geom_line(aes(y=percent_minor), size=0.5, color="red3") +
labs(title="Deportations with Minor/No Convictions", x="Year", y="Percent") +
theme_bw()
t3
This graph brings the weakness of the criminality narrative into focus. Substantively, it combines the Level 3 and None lines from the previous task. It shows that the percentage of deportations involving no felony conviction usually hovers around 75%. In 2003 (the first year in the data) it was around 67%; it climbed to over 80% by around 2008 before settling back to typical levels. It then dropped sharply during COVID before surging to just under 90% in 2022-2024.
The larger point is this: the percentage of non-felony deportations has never been lower than 60%, and in the most recent data it was approaching 90%. This strongly contradicts the criminality narrative.
For this part of the project, we will consider deportations in the context of who the President of the United States is.
First, in order to do this, we need to create a factor-level variable. In the data set, there is a variable called “President” that records each president as: “Bush1”, “Bush2”, “Obama1”, “Obama2”, “Trump”, “Biden.” In the chunk below, insert code to create the factor-level variable. Name this variable PresFactor.
#Insert code for task 4.1 here
reasons$PresFactor<-factor(reasons$President,levels = c("Bush1", "Bush2", "Obama1", "Obama2", "Trump", "Biden"),
labels = c("Bush1", "Bush2", "Obama1", "Obama2", "Trump", "Biden")
)
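Why the `levels` argument matters can be seen in a small sketch. The vector `pres` below is a made-up example, not the data set: without explicit levels, `factor()` sorts the categories alphabetically, which would put Biden first on a boxplot axis.

```r
# Hypothetical vector of administration labels in arbitrary order.
pres <- c("Obama1", "Bush1", "Trump", "Biden", "Bush2", "Obama2")

# Default behaviour: levels are sorted alphabetically.
alphabetical <- factor(pres)

# Explicit levels: categories keep the chronological order we supply.
chronological <- factor(
  pres,
  levels = c("Bush1", "Bush2", "Obama1", "Obama2", "Trump", "Biden")
)

levels(alphabetical)
levels(chronological)
```

When the labels are identical to the levels, as in the chunk above, the `labels` argument is optional, because `labels` defaults to the level names.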
Second, create a boxplot of the variable percent_minor for each Presidential administration. In the boxplot, include a symbol indicating the mean of the variable percent_minor. You can modify the code in the chunk below to help get you started.
#Insert code for task 4.2 here
PP<-ggplot(reasons, aes(PresFactor,percent_minor)) +
geom_boxplot(fill=c("red2", "red2", "dodgerblue", "dodgerblue", "red2", "dodgerblue")) +
stat_summary(fun = mean, geom = "point", shape = 20, size = 2, color = "black") +
labs(title="Percent of Minor Deportations by Presidential Administration",
x="President",
y="Percent") +
theme_minimal()
PP
Third, provide a thorough and substantive interpretation of the plot. What do we learn about variation across Presidencies regarding deportation of individuals with no, or minor, criminal convictions? For this task, the plot is worth 100 points and the write-up is worth 100 points.
The graph above presents several interesting trends. The first and most striking is that, under every president, deportations overwhelmingly involve immigrants without felony convictions (referred to henceforth as minor deportations). The lowest value comes early in Biden’s term, and even it remains at a staggering 55%. Excluding Bush1, every president has a median minor deportation rate of 70-80%, regardless of party. Interestingly, Biden has the most variation, with his lowest minor deportation rate around 55% and his highest above 85%, the largest minor deportation rate of any presidency in the data. This shows how much his deportation policy shifted over the course of his administration. However, the most resounding message is once again that the criminality narrative does not hold. Every president in the data deported more immigrants with no or minor infractions than immigrants with felony convictions, and usually far more: around 70-80% of the total.
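Values read off a boxplot are approximate, so it can help to compute them directly. The sketch below is one way to do that, assuming a made-up data frame (`toy`) with two hypothetical administrations "A" and "B" and invented percentages; `by_pres` is a name introduced here for illustration.

```r
# Hypothetical data: two administrations with invented percent_minor values.
toy <- data.frame(
  PresFactor    = factor(c("A", "A", "B", "B", "B"), levels = c("A", "B")),
  percent_minor = c(70, 80, 60, 75, 90)
)

# Split the percentages by administration and summarise each group.
by_pres <- t(sapply(
  split(toy$percent_minor, toy$PresFactor),
  function(x) c(Min = min(x), Median = median(x), Mean = mean(x), Max = max(x))
))

by_pres
```

Applied to the real `reasons` data, the same pattern would give the exact minimum, median, mean, and maximum behind each box, so statements such as "around 55%" could be replaced with the computed figure.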
Using the factor-level variable you created in Task 4, create a boxplot of total deportations (this is the variable called All in the dataset) and, as in Task 4, provide a thorough and substantive interpretation. Are there major differences across presidencies? Across parties? For example, Barack Obama, fairly or unfairly, was dubbed the “deporter-in-chief.” Is there a basis for this claim? You can modify the code in the chunk below to help get you started. Like Task 4, this task is worth 200 points.
#Insert code for task 5 here
p<-ggplot(reasons, aes(PresFactor, All)) +
geom_boxplot(fill=c("red2", "red2", "dodgerblue", "dodgerblue", "red2", "dodgerblue")) +
stat_summary(fun = mean, geom = "point", shape = 20, size = 1, color = "black") +
labs(title="Deportations by Presidential Administration",
x="President",
y="Number of deportations") +
theme_minimal()
p
The chart above gives us critical insight into how deportations vary across presidencies. The most interesting insight (in my opinion) is that the number of total deportations doesn’t appear to follow a partisan trend. In fact, contradicting the modern political paradigm of immigration enforcement, the Democratic president Obama deported the most migrants of any president in our data set, averaging around 400,000 deportations a year in his first term. This drops in his second term to around 250,000-350,000 (depending on the year) but is still the second-highest average, narrowly beating out Bush’s second term. This means the title of “deporter-in-chief” bestowed upon him has a clear basis in the data.
Bush’s presidency is also fascinating. His two terms rank third and fifth in average deportations. During his first term Bush deported around 175,000 migrants a year, with little variation between years. During his second term, however, the number climbs to a maximum nearing 380,000, making him the president with the largest change in deportations over his time in office: starting at 175,000 in 2003 and peaking at 380,000, about 2.2 times the starting figure.
This massive range contrasts with his closest Republican peer, Trump. Immigration played a key role in the platform of Trump’s 2016 campaign, including his now-infamous plan to “build a wall” along the U.S.-Mexico border (https://www.federalregister.gov/documents/2017/01/30/2017-02095/border-security-and-immigration-enforcement-improvements). Despite this, he actually deported fewer migrants on average than his predecessor, averaging around 225,000 deportations a year. It is important to note, however, that his highest years are roughly the same as Obama’s lowest years, with a maximum of around 260,000 deportations.
Biden deported the fewest immigrants of any president in the data. He averages fewer than 100,000 deportations a year, with a maximum of just under 150,000. While COVID likely played a major role in the overall decline in deportations during his presidency, even in 2023-2024 he still deported fewer than any other president in the data.
Below is a chunk of code that creates some new variables, most importantly the variable called “Alladj”. This variable is an adjusted count of total deportations that subtracts out migrants who were removed under “expedited removal” or ER. Most ERs are of migrants with no criminal record but are migrants who were apprehended at the border and immediately removed. Reproduce the analysis in Task 5 but use “Alladj” as the \(y\)-axis variable. What, if any, differences do you observe? Provide a substantive and thorough interpretation. You may earn up to 25 points. Bare minimum effort will result in a bare minimum score, or even a 0. There is no penalty for not doing this, so only do it if you take it seriously and engage the task.
reasons2="https://raw.githubusercontent.com/mightyjoemoon/POL51/main/ICE_reasonforremoval.csv"
reasons2<-read_csv(url(reasons2))
reasons2$NoneER <- reasons2$None - reasons2$ER_Non
reasons2$Alladj <- reasons2$NoneER + reasons2$Level1 + reasons2$Level2 + reasons2$Level3
reasons2$PresFactor<-factor(reasons2$President,
levels=c("Bush1", "Bush2", "Obama1", "Obama2", "Trump", "Biden"),
labels=c("Bush1", "Bush2", "Obama1",
"Obama2", "Trump", "Biden"))
summary(reasons2)
## Year President All None
## Min. :2003 Length:22 Min. : 56882 Min. : 19495
## 1st Qu.:2008 Class :character 1st Qu.:178148 1st Qu.: 85446
## Median :2014 Mode :character Median :238765 Median :106426
## Mean :2014 Mean :248987 Mean :122287
## 3rd Qu.:2019 3rd Qu.:356423 3rd Qu.:165287
## Max. :2024 Max. :407821 Max. :253342
## Level1 Level2 Level3 Undocumented
## Min. : 9819 Min. : 3846 Min. : 11045 Min. :10100000
## 1st Qu.:38484 1st Qu.: 9056 1st Qu.: 34978 1st Qu.:10500000
## Median :46743 Median :17480 Median : 63186 Median :11050000
## Mean :46534 Mean :15601 Mean : 64541 Mean :11015455
## 3rd Qu.:57148 3rd Qu.:20342 3rd Qu.: 90950 3rd Qu.:11375000
## Max. :75590 Max. :29436 Max. :130251 Max. :12200000
## ER_Non NoneER Alladj PresFactor
## Min. : 4018 Min. : 11888 Min. : 47334 Bush1 :2
## 1st Qu.:28563 1st Qu.: 51729 1st Qu.:155016 Bush2 :4
## Median :41647 Median : 68923 Median :192514 Obama1:4
## Mean :38980 Mean : 83308 Mean :209984 Obama2:4
## 3rd Qu.:50230 3rd Qu.: 98312 3rd Qu.:287580 Trump :4
## Max. :71686 Max. :219648 Max. :376140 Biden :4
Enter your code in the chunk below to produce the boxplot and then provide your response below the plot.
#Insert code in the chunk below
padj<-ggplot(reasons2, aes(PresFactor, Alladj)) +
geom_boxplot(fill=c("red2", "red2", "dodgerblue", "dodgerblue", "red2", "dodgerblue")) +
stat_summary(fun = mean, geom = "point", shape = 20, size = 1, color = "black") +
labs(title="Deportations by Presidential Administration W/O Expedited Removal",
x="President",
y="Number of deportations") +
theme_minimal()
padj
For the purpose of this analysis, I will compare each administration’s deportations without expedited removals to its deportations with expedited removals.
The boxplot barely moves for Bush’s first term, suggesting he made little use of expedited removals in his first term.
During his second term the entire boxplot shifts downward, suggesting about 25,000 expedited removals a year. The shape of the boxplot is the same, which suggests expedited removals were used evenly across the four years of his second term.
During Obama’s first term the boxplot shifts heavily downward, suggesting that his administration relied much more on expedited removals, to the tune of 50,000 deportations a year. However, the tails are extended, suggesting he used them more frequently toward the end of his term.
During his second term the boxplot shifts down by about 50,000, showing that even as the number of deportations he performed went down, the number of expedited removals remained the same, making up a greater percentage of his total deportations during his second term.
During Trump’s presidency the boxplot shifts down by about 40,000, suggesting that about 40,000 of his deportations a year used expedited removal. This means that expedited removals once again increased as a percentage of total deportations.
Biden’s shift is by far the most drastic in relative terms. Without expedited removals his average deportations drop by about 25,000, indicating that expedited removals made up a little under a quarter of his total deportations. His maximum shifted by about 50,000, from a little under 150,000 to a little under 100,000, meaning that during peak years Biden was using expedited removals for about one third of total deportations. This is corroborated by our work in Task 3. Expedited removals usually involve people with no criminal convictions, as they are not living in the U.S. and have had very little time to commit any crimes. We can see the share of “minor deportations” spike after 2020, in line with the Biden administration taking control of the White House in 2021.
The overall trend is clear. Expedited removal predates this data (Congress created it in 1996), but since its use expanded during Bush’s second term, expedited removals have made up an increasing percentage of total deportations in the United States.
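The shares discussed above are inferred from how far each box shifts. A more direct check would be to compute the expedited-removal share for each administration. The sketch below assumes a made-up data frame (`toy`) with two hypothetical administrations and invented counts; `totals` and `er_share` are names introduced for illustration. Note that, judging from the chunk that builds `NoneER`, `ER_Non` appears to count only expedited removals of migrants with no convictions, so the share is of that kind of removal.

```r
# Hypothetical yearly counts for two administrations; numbers are invented.
toy <- data.frame(
  PresFactor = factor(c("A", "A", "B", "B"), levels = c("A", "B")),
  All        = c(200, 300, 100, 100),
  ER_Non     = c(20, 30, 25, 35)
)

# Sum total removals and expedited removals within each administration.
totals <- aggregate(cbind(All, ER_Non) ~ PresFactor, data = toy, FUN = sum)

# Expedited removals as a percentage of all removals, per administration.
totals$er_share <- 100 * totals$ER_Non / totals$All

totals
```

Run on `reasons2`, the same three lines would show whether the expedited-removal share actually rises from one administration to the next, which is the claim the boxplot comparison is meant to support.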