Immigration Enforcement Patterns and Case Outcomes in the United States

Author

Flioria Akesse

Introduction

This study investigates trends in immigration enforcement, threat levels, and case outcomes in the United States, using ICE Enforcement and Removal Operations data for the first part of fiscal year 2026.

The dataset covers administrative immigration arrests carried out between October 1, 2025 and March 10, 2026. Each observation is a single arrest and includes variables on demographics, arrest location, apprehension method, criminality classification, and case outcome.

The objective is to examine how demographic and geographic characteristics relate to threat levels and case outcomes.

Dataset Source

Data Source:

Immigration and Customs Enforcement (ICE), Enforcement and Removal Operations dataset.

Dataset provider: https://deportationdata.org/data/ice.html

The dataset contains 191,546 observations and 25 variables.

Variables and Research Questions

Both categorical and quantitative variables are included in this dataset.

Some of the categorical variables include:

- Place of arrest
- Country of citizenship
- Gender
- Threat category
- Criminality classification
- Method of apprehension
- Status of the case

Some of the quantitative or date variables include:

- Birth year
- Birth date
- Arrest date
- Date of final order

Other variables, such as age and case processing time, can be derived during data cleaning. This report derives an approximate age from the birth year.

Questions addressed by this research include:

- In which states do immigration arrests happen most often?
- Is there an association between threat categories and criminality classifications?
- Do demographic variables influence case outcomes?
- Which variables influence case processing time?

Data Collection Methodology

The dataset was compiled and released through the Deportation Data Project, based on Immigration and Customs Enforcement (ICE) Enforcement and Removal Operations data.

The documentation describes the variables in the dataset, but it gives little information about how the data were compiled and reported.

Why I Chose This Dataset

I selected this dataset because immigration enforcement is a significant public policy issue in the United States. The data make it possible to explore the demographic, geographic, and legal aspects of enforcement through statistics and visualization.

I wanted to analyze how various factors relate to the outcome and classification of immigration cases. The large sample size makes the dataset well suited to data cleaning, visualization, and regression analysis in R.

Load Libraries and Dataset

# Load libraries
library(tidyverse)

Warning: package 'tidyverse' was built under R version 4.5.3

Warning: package 'ggplot2' was built under R version 4.5.3

Warning: package 'tidyr' was built under R version 4.5.3

Warning: package 'readr' was built under R version 4.5.3

Warning: package 'purrr' was built under R version 4.5.3

Warning: package 'stringr' was built under R version 4.5.3

Warning: package 'forcats' was built under R version 4.5.3

Warning: package 'lubridate' was built under R version 4.5.3

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.0     ✔ readr     2.2.0
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(readr)
library(dplyr)
library(ggplot2)
library(lubridate)
library(plotly)

Warning: package 'plotly' was built under R version 4.5.3


Attaching package: 'plotly'

The following object is masked from 'package:ggplot2':

    last_plot

The following object is masked from 'package:stats':

    filter

The following object is masked from 'package:graphics':

    layout

library(readxl)

Warning: package 'readxl' was built under R version 4.5.3

# Load dataset, skipping the first 6 rows of the sheet
ice_data <- read_excel(
  "2026-ICLI-00005_Arrests_FY26_20260311_Redacted.xlsx",
  skip = 6
)

# View first rows
head(ice_data)

# A tibble: 6 × 25
  `Apprehension Date` `Apprehension Type` State    County `TOA Current Duty AOR`
  <dttm>              <chr>               <chr>    <lgl>  <chr>                 
1 2025-11-16 14:21:59 Targeted            TEXAS    NA     San Antonio Area of R…
2 2025-11-05 14:29:30 Collateral          TEXAS    NA     San Antonio Area of R…
3 2025-12-14 08:18:33 Targeted            UTAH     NA     Salt Lake City Area o…
4 2026-02-14 10:27:08 Targeted            FLORIDA  NA     Miami Area of Respons…
5 2025-11-19 07:44:46 Targeted            ILLINOIS NA     Chicago Area of Respo…
6 2025-10-09 14:24:47 Targeted            TEXAS    NA     El Paso Area of Respo…
# ℹ 20 more variables: `Apprehension Final Program` <chr>,
#   `Arresting Agency` <chr>, `Apprehension Method` <chr>,
#   `Apprehension Criminality` <chr>, `Case Status` <chr>,
#   `Case Category` <chr>, `Departure Country` <chr>,
#   `Final Order Yes No` <chr>, `Birth Date` <chr>, `Birth Year` <dbl>,
#   `Citizenship Country` <chr>, Gender <chr>, `Departed Date` <dttm>,
#   `Final Order Date` <dttm>, `Apprehension Site Landmark` <chr>, …

# View structure
glimpse(ice_data)

Rows: 191,546
Columns: 25
$ `Apprehension Date`          <dttm> 2025-11-16 14:21:59, 2025-11-05 14:29:30…
$ `Apprehension Type`          <chr> "Targeted", "Collateral", "Targeted", "Ta…
$ State                        <chr> "TEXAS", "TEXAS", "UTAH", "FLORIDA", "ILL…
$ County                       <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ `TOA Current Duty AOR`       <chr> "San Antonio Area of Responsibility", "Sa…
$ `Apprehension Final Program` <chr> "ERO Criminal Alien Program", "ERO Crimin…
$ `Arresting Agency`           <chr> "ICE", "ICE", "ICE", "ICE", "ICE", "ICE",…
$ `Apprehension Method`        <chr> "Non-Custodial Arrest", "Custodial Arrest…
$ `Apprehension Criminality`   <chr> "3 Other Immigration Violator", "1 Convic…
$ `Case Status`                <chr> "8-Excluded/Removed - Inadmissibility", "…
$ `Case Category`              <chr> "[8C] Excludable / Inadmissible - Adminis…
$ `Departure Country`          <chr> "VENEZUELA", "COLOMBIA", "MEXICO", NA, "M…
$ `Final Order Yes No`         <chr> "YES", "YES", "YES", "NO", "YES", "YES", …
$ `Birth Date`                 <chr> "b(6), b(7)c", "b(6), b(7)c", "b(6), b(7)…
$ `Birth Year`                 <dbl> 1990, 1998, 1984, 1988, 1998, 2001, 2003,…
$ `Citizenship Country`        <chr> "VENEZUELA", "COLOMBIA", "MEXICO", "GUATE…
$ Gender                       <chr> "Male", "Female", "Male", "Male", "Male",…
$ `Departed Date`              <dttm> 2025-11-28, 2025-12-02, 2025-12-20, NA, …
$ `Final Order Date`           <dttm> 2024-08-01, 2025-10-28, 2025-12-15, NA, …
$ `Apprehension Site Landmark` <chr> "FRU - ALL AREAS INISDE BEXAR COUNTY", "D…
$ Operation                    <chr> "-", NA, "-", NA, "-", NA, NA, NA, "-", "…
$ `TOA Current Duty Site`      <chr> "ERO - Austin, TX Sub Office", "ERO - Del…
$ `Case Criminality`           <chr> "3 Other Immigration Violator", "1 Convic…
$ `Case Threat Level`          <chr> "NA", "3", "2", "NA", "1", "3", "NA", "NA…
$ `Anonymized Identifier`      <chr> "0000004d7b875782a3b6f2b460f98c0750a5a899…
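
In the output above, Case Threat Level holds the text "NA" rather than a true missing value, so is.na() will not flag those rows. The following is a minimal sketch of one way to convert them, using a small toy table (the name toy and its five values are illustrative, copied from the first rows shown above, and are not the full dataset):

```r
library(dplyr)
library(tibble)

# Toy rows standing in for ice_data (illustrative only)
toy <- tibble(
  `Case Threat Level` = c("NA", "3", "2", "NA", "1")
)

toy_fixed <- toy %>%
  mutate(
    # na_if() turns the literal text "NA" into a real missing value
    threat_chr = na_if(`Case Threat Level`, "NA"),
    # converting afterwards raises no coercion warning for these values
    threat_num = as.numeric(threat_chr)
  )

# is.na() misses the text "NA" but catches the converted values
sum(is.na(toy$`Case Threat Level`))  # 0
sum(is.na(toy_fixed$threat_num))     # 2
```

An alternative that may also work is passing na = c("", "NA") to read_excel() so the conversion happens at load time; that has not been checked against this file.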

Data Cleaning and Wrangling

# Select important variables
ice_clean <- ice_data %>%
  select(
    State,
    Gender,
    `Case Threat Level`,
    `Case Criminality`,
    `Case Status`,
    `Citizenship Country`,
    `Birth Year`,
    `Apprehension Date`,
    `Final Order Date`
  )

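# Remove observations with missing important variables
# Note: Case Threat Level stores missing values as the text "NA",
# so is.na() does not drop those rows here
ice_clean <- ice_clean %>%
  filter(
    !is.na(State),
    !is.na(Gender),
    !is.na(`Case Threat Level`),
    !is.na(`Birth Year`)
  )

# Create age variable: approximate age in 2026, based on birth year only
# (this is not the exact age at the time of arrest)
ice_clean <- ice_clean %>%
  mutate(
    Age = 2026 - `Birth Year`
  )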

# Preview cleaned dataset
head(ice_clean)

# A tibble: 6 × 10
  State    Gender `Case Threat Level` `Case Criminality`           `Case Status`
  <chr>    <chr>  <chr>               <chr>                        <chr>        
1 TEXAS    Male   NA                  3 Other Immigration Violator 8-Excluded/R…
2 TEXAS    Female 3                   1 Convicted Criminal         8-Excluded/R…
3 UTAH     Male   2                   1 Convicted Criminal         8-Excluded/R…
4 FLORIDA  Male   NA                  2 Pending Criminal Charges   ACTIVE       
5 ILLINOIS Male   1                   1 Convicted Criminal         8-Excluded/R…
6 TEXAS    Male   3                   1 Convicted Criminal         ACTIVE       
# ℹ 5 more variables: `Citizenship Country` <chr>, `Birth Year` <dbl>,
#   `Apprehension Date` <dttm>, `Final Order Date` <dttm>, Age <dbl>

glimpse(ice_clean)

Rows: 182,747
Columns: 10
$ State                 <chr> "TEXAS", "TEXAS", "UTAH", "FLORIDA", "ILLINOIS",…
$ Gender                <chr> "Male", "Female", "Male", "Male", "Male", "Male"…
$ `Case Threat Level`   <chr> "NA", "3", "2", "NA", "1", "3", "NA", "NA", "NA"…
$ `Case Criminality`    <chr> "3 Other Immigration Violator", "1 Convicted Cri…
$ `Case Status`         <chr> "8-Excluded/Removed - Inadmissibility", "8-Exclu…
$ `Citizenship Country` <chr> "VENEZUELA", "COLOMBIA", "MEXICO", "GUATEMALA", …
$ `Birth Year`          <dbl> 1990, 1998, 1984, 1988, 1998, 2001, 2003, 1982, …
$ `Apprehension Date`   <dttm> 2025-11-16 14:21:59, 2025-11-05 14:29:30, 2025-…
$ `Final Order Date`    <dttm> 2024-08-01, 2025-10-28, 2025-12-15, NA, 2026-01…
$ Age                   <dbl> 36, 28, 42, 38, 28, 25, 23, 44, 40, 21, 46, 51, …
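
The variables section mentions age at the time of arrest and case processing time, while the cleaning step above computes age relative to 2026. As a rough sketch of how both derived variables could be built, the example below uses the first three rows printed earlier. The names toy, toy_derived, Age_At_Arrest, and Days_To_Final_Order are hypothetical, and because only the birth year is available, the age is approximate to within one year.

```r
library(dplyr)
library(tibble)
library(lubridate)

# First three rows of the printed data, re-entered by hand for illustration
toy <- tibble(
  `Birth Year` = c(1990, 1998, 1984),
  `Apprehension Date` = ymd_hms(c("2025-11-16 14:21:59",
                                  "2025-11-05 14:29:30",
                                  "2025-12-14 08:18:33")),
  `Final Order Date` = ymd(c("2024-08-01", "2025-10-28", "2025-12-15"))
)

toy_derived <- toy %>%
  mutate(
    # Age in the calendar year of the arrest (birth month is not available)
    Age_At_Arrest = year(`Apprehension Date`) - `Birth Year`,
    # Whole days from arrest to final order; negative if the order came first
    Days_To_Final_Order = as.numeric(difftime(
      `Final Order Date`,
      as_date(`Apprehension Date`),
      units = "days"
    ))
  )

toy_derived %>%
  select(Age_At_Arrest, Days_To_Final_Order)
```

In two of these three rows the final order date is earlier than the arrest date, so the difference is negative. Whether such cases should be kept, dropped, or measured differently is an analytic choice this sketch does not settle.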

Exploratory Data Analysis

# Number of arrests by state
state_summary <- ice_clean %>%
  group_by(State) %>%
  summarize(
    Total_Arrests = n()
  ) %>%
  arrange(desc(Total_Arrests))

head(state_summary)

# A tibble: 6 × 2
  State      Total_Arrests
  <chr>              <int>
1 TEXAS              48231
2 FLORIDA            17713
3 CALIFORNIA         14373
4 NEW YORK            7020
5 GEORGIA             6205
6 NEW JERSEY          5875

Visualization 1: Immigration Arrests by State

# Bar chart of arrests by state

top_states <- state_summary %>%
  slice_head(n = 10)

ggplot(top_states,
       aes(x = reorder(State, Total_Arrests),
           y = Total_Arrests,
           fill = State)) +

  geom_col() +

  coord_flip() +

  labs(
    title = "Top 10 States with the Highest Number of Immigration Arrests",
    x = "State",
    y = "Number of Arrests",
    caption = "Source: ICE Enforcement and Removal Operations Data"
  ) +

  theme_minimal() +

  scale_fill_brewer(palette = "Paired")

This graph shows the 10 states with the most administrative immigration arrests. Texas ranks first with 48,231 arrests, followed by Florida (17,713) and California (14,373). Texas alone accounts for roughly a quarter of the 182,747 records in the cleaned data, which suggests that enforcement activity is concentrated in a small number of states.
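
To put a number on this concentration, the sketch below turns the counts printed in the state summary into shares of the cleaned data. The counts and the total of 182,747 rows are copied from the outputs above; the column names Share and Cumulative_Share are illustrative.

```r
library(dplyr)
library(tibble)

# Counts copied from the head(state_summary) output
top6 <- tribble(
  ~State,       ~Total_Arrests,
  "TEXAS",      48231,
  "FLORIDA",    17713,
  "CALIFORNIA", 14373,
  "NEW YORK",   7020,
  "GEORGIA",    6205,
  "NEW JERSEY", 5875
)

# Number of rows in ice_clean, from the glimpse() output
total_rows <- 182747

top6 <- top6 %>%
  mutate(
    Share = Total_Arrests / total_rows,   # each state's share of all records
    Cumulative_Share = cumsum(Share)      # running total down the ranking
  )

top6
```

On these figures Texas holds about 26% of the records, the top three states together about 44%, and the six states listed about 54%.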

Visualization 2: Threat Level by Gender

# Create summary for threat level by gender

gender_threat <- ice_clean %>%
  group_by(Gender, `Case Threat Level`) %>%
  summarize(
    Count = n(),
    .groups = "drop"
  )

# Interactive visualization

p <- ggplot(gender_threat,
            aes(x = Gender,
                y = Count,
                fill = `Case Threat Level`)) +

  geom_col(position = "dodge") +

  labs(
    title = "Case Threat Levels by Gender",
    x = "Gender",
    y = "Number of Cases",
    fill = "Threat Level",
    caption = "Source: ICE Enforcement and Removal Operations Data"
  ) +

  theme_light() +

  scale_fill_brewer(palette = "Dark2")

ggplotly(p)

This interactive visualization compares case threat levels across gender categories, and viewers can hover over the bars to see the count for each group. Because Case Threat Level is stored as text, cases without an assigned level appear as their own "NA" category. Most cases appear within lower threat categories, while higher threat classifications occur less frequently.
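
Raw counts can be hard to compare when one gender group is much larger than another. The sketch below shows one possible way to compare the distribution of threat levels within each gender instead. The counts are hypothetical and were made up for illustration; they are not taken from the ICE data.

```r
library(dplyr)
library(tibble)

# Hypothetical counts (not from the ICE data)
toy_counts <- tribble(
  ~Gender,  ~Threat_Level, ~Count,
  "Female", "1",           10,
  "Female", "2",           15,
  "Female", "3",           25,
  "Male",   "1",           200,
  "Male",   "2",           300,
  "Male",   "3",           500
)

toy_shares <- toy_counts %>%
  group_by(Gender) %>%
  # Share of each threat level within its own gender group
  mutate(Share = Count / sum(Count)) %>%
  ungroup()

toy_shares
```

In this made-up example the two groups differ twentyfold in size but have identical within-group shares (0.2, 0.3, 0.5), which a count-based bar chart would not make obvious. In the plot itself, geom_col(position = "fill") is one way to show the same within-group proportions.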

Multiple Linear Regression Analysis

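# Create regression dataset
# as.numeric() turns the text "NA" (and any other non-numeric text) into a
# missing value with a coercion warning; lm() then drops those rows

regression_data <- ice_clean %>%
  filter(
    !is.na(Age),
    !is.na(`Case Threat Level`)
  ) %>%
  mutate(
    Threat_Level_Numeric = as.numeric(`Case Threat Level`)
  )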

Warning: There was 1 warning in `mutate()`.
ℹ In argument: `Threat_Level_Numeric = as.numeric(`Case Threat Level`)`.
Caused by warning:
! NAs introduced by coercion

# Multiple linear regression model

model <- lm(
  Threat_Level_Numeric ~ Age + Gender,
  data = regression_data
)

# Model summary

summary(model)


Call:
lm(formula = Threat_Level_Numeric ~ Age + Gender, data = regression_data)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.54414 -0.85333 -0.02363  0.84391  1.71433 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)    2.9036613  0.0205200 141.504  < 2e-16 ***
Age           -0.0189221  0.0003371 -56.128  < 2e-16 ***
GenderMale    -0.1420647  0.0164010  -8.662  < 2e-16 ***
GenderUnknown -0.3870093  0.0505684  -7.653 1.99e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.8444 on 51974 degrees of freedom
  (130769 observations deleted due to missingness)
Multiple R-squared:  0.0597,    Adjusted R-squared:  0.05964 
F-statistic:  1100 on 3 and 51974 DF,  p-value: < 2.2e-16

A multiple linear regression was fitted to examine how age and gender relate to threat level. The model treats the threat level codes as a numeric scale and uses the 51,978 cases with a numeric threat level; the other 130,769 observations were dropped because of missing values. Age and both gender terms are statistically significant (p < 0.05): each additional year of age is associated with a threat level about 0.019 lower, and cases recorded as Male average about 0.14 lower than the reference gender category, holding age constant. With an adjusted R-squared of about 0.06, however, the model explains only about 6% of the variation in threat level.
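
As a worked illustration of what the coefficients mean, the sketch below plugs the estimates printed in the model summary into the fitted equation by hand. The function predict_threat is a hypothetical helper written for this example, and it assumes Female is the reference category, as the coefficient names suggest.

```r
# Estimates copied from the summary(model) output
b_intercept <- 2.9036613
b_age       <- -0.0189221
b_male      <- -0.1420647

# Hypothetical helper: fitted threat level for a given age;
# male = 1 for Male, 0 for the reference category (assumed Female)
predict_threat <- function(age, male) {
  b_intercept + b_age * age + b_male * male
}

# Fitted value for a 30-year-old recorded as Male
predict_threat(30, 1)                          # about 2.19

# Difference between ages 50 and 30, same gender: 20 * b_age
predict_threat(50, 1) - predict_threat(30, 1)  # about -0.38

# Observations actually used by lm(): cleaned rows minus rows dropped
n_used <- 182747 - 130769
n_used                                         # 51978
```

A 20-year age gap shifts the fitted value by less than half a threat level, which fits with the low R-squared reported above.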

# Diagnostic plots for regression model

par(mfrow = c(2,2))
plot(model)

# Export cleaned dataset for Tableau

write.csv(ice_clean,
          "ice_clean.csv",
          row.names = FALSE)

Tableau Visualization

Interactive Tableau visualization:

https://us-east-1.online.tableau.com/#/site/akesseflioria-b8b21aef72/views/Data110FinalProjectFlioriaAkesse/Sheet1?:iid=1

Background Research

Immigration enforcement has become a major public policy issue in the United States. According to the American Immigration Council, immigration enforcement policies affect millions of individuals and families and have major economic and social impacts across communities. Researchers continue to study how immigration enforcement patterns vary geographically and how enforcement priorities influence arrest outcomes and detention practices.

Reference: American Immigration Council. “The Cost of Immigration Enforcement and Border Security.” https://www.americanimmigrationcouncil.org

Conclusion

This project examined patterns of immigration enforcement, threat classifications, and case outcomes using ICE Enforcement and Removal Operations data. The analysis shows that immigration arrests are concentrated in a small number of states, particularly Texas and Florida. The visualizations also compared case threat levels by gender.

The regression results show that age and gender are statistically significant predictors of threat level. However, the model explains only a small share of the variance in the data, so other variables not included here may affect how immigration cases are classified and resolved.

The main limitations of this project are missing values in several of the variables used and the limited documentation on how the original dataset was constructed.