ICE Administrative Arrests Analysis

Author

Flioria Akesse

# Introduction

This project analyzes administrative arrest data from U.S. Immigration and Customs Enforcement (ICE) for fiscal year 2026. The dataset contains information about apprehension type, state, arrest method, criminality classification, gender, citizenship country, and case status.

This topic is important because immigration enforcement is a major public policy issue in the United States. By analyzing arrest patterns, we can better understand how enforcement varies across states and demographic groups.

# Research Question

What patterns exist in immigration administrative arrests in the United States, and how do demographic factors such as birth year relate to case threat level?

# Load libraries

library(readr)
library(dplyr)
library(ggplot2)
library(plotly)

# Load data

library(readxl)
ice <- read_excel(
  "2026-ICLI-00005_Arrests_FY26_20260311_Redacted.xlsx",
  skip = 6  # skip the first 6 rows; row 7 supplies the column names
)

names(ice)
 [1] "Apprehension Date"          "Apprehension Type"         
 [3] "State"                      "County"                    
 [5] "TOA Current Duty AOR"       "Apprehension Final Program"
 [7] "Arresting Agency"           "Apprehension Method"       
 [9] "Apprehension Criminality"   "Case Status"               
[11] "Case Category"              "Departure Country"         
[13] "Final Order Yes No"         "Birth Date"                
[15] "Birth Year"                 "Citizenship Country"       
[17] "Gender"                     "Departed Date"             
[19] "Final Order Date"           "Apprehension Site Landmark"
[21] "Operation"                  "TOA Current Duty Site"     
[23] "Case Criminality"           "Case Threat Level"         
[25] "Anonymized Identifier"     
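
Most of the column names above contain spaces, which is why later code has to wrap them in backticks. As a minimal sketch, the names could be converted to snake_case with base R. The helper `to_snake()` is hypothetical (it is not part of any package), and only a few of the names are copied here for brevity.

```r
# A few column names copied from the names(ice) output above.
raw_names <- c("Apprehension Date", "Final Order Yes No", "Birth Year",
               "Case Threat Level", "TOA Current Duty AOR")

# to_snake() is a hypothetical helper written for this example.
to_snake <- function(x) {
  x <- tolower(x)                   # lower-case everything
  x <- gsub("[^a-z0-9]+", "_", x)   # runs of non-alphanumerics become "_"
  gsub("^_+|_+$", "", x)            # trim leading and trailing underscores
}

clean_names <- to_snake(raw_names)
print(clean_names)
```

Applied to the real table, this would be `names(ice) <- to_snake(names(ice))`, after which a column such as `Case Threat Level` could be written as `case_threat_level` without backticks.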

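# Data cleaning

ice_clean <- ice %>%
  filter(!is.na(State)) %>%  # drop rows with no recorded state
  slice_sample(n = 800)      # random sample of 800 rows; no seed is set, so results vary between runs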
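
Because `slice_sample()` draws rows at random, rerunning the cleaning step yields a different sample each time. One possible way to make the sample repeatable is to call `set.seed()` before sampling. The sketch below uses a made-up toy table, a hypothetical wrapper `draw_sample()`, and an arbitrary seed value; none of these come from the original analysis.

```r
library(dplyr)

# Toy stand-in for the ICE table; the values are made up for illustration.
toy <- data.frame(
  State = c("TX", "CA", NA, "FL", "NY", "TX", "AZ", NA, "CA", "TX"),
  id    = 1:10
)

# draw_sample() is a hypothetical wrapper around the cleaning step above.
draw_sample <- function(data, n, seed) {
  set.seed(seed)                 # fix the random number stream
  data %>%
    filter(!is.na(State)) %>%    # same missing-state filter as above
    slice_sample(n = n)          # random rows, now repeatable
}

first  <- draw_sample(toy, n = 5, seed = 2026)
second <- draw_sample(toy, n = 5, seed = 2026)

# Both draws use the same seed, so they should select the same rows.
same_rows <- identical(first$id, second$id)
print(same_rows)
```

Note that adding a seed to the report would change which 800 rows are drawn, so the regression output shown later would no longer match exactly.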

# Visualization

ggplot(ice_clean, aes(x = State)) +
  geom_bar(fill = "steelblue") +
  theme_minimal() +
  labs(
    title = "Arrests by State",
    x = "State",
    y = "Number of Arrests",
    caption = "Source: ICE dataset"
  )
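
With dozens of states on the x-axis, the labels in a vertical bar chart tend to overlap. A possible alternative, sketched here on made-up data, is to count arrests per state first, sort the bars by count, and flip the axes so the state names read horizontally. The `toy` data frame and the `arrests` column name are assumptions for illustration; in the report, `ice_clean` would take the place of `toy`.

```r
library(dplyr)
library(ggplot2)

# Made-up states for illustration only.
toy <- data.frame(
  State = c("TX", "TX", "TX", "CA", "CA", "FL", "NY", "NY", "NY", "NY")
)

# One row per state with its number of arrests, largest first.
state_counts <- toy %>%
  count(State, name = "arrests") %>%
  arrange(desc(arrests))

# reorder() sorts the bars by count; coord_flip() puts states on the vertical axis.
p_sorted <- ggplot(state_counts, aes(x = reorder(State, arrests), y = arrests)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  theme_minimal() +
  labs(
    title = "Arrests by State",
    x = "State",
    y = "Number of Arrests"
  )

print(state_counts)
```

Printing `p_sorted` draws the chart. The same object can also be passed to `ggplotly()` for an interactive version.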

# Interactive visualization

p <- ggplot(ice_clean, aes(x = State)) +
  geom_bar(fill = "darkred")

ggplotly(p)

# Regression analysis

model_data <- ice_clean %>%
  mutate(
    threat = as.numeric(`Case Threat Level`),
    birth_year = as.numeric(`Birth Year`)
  ) %>%
  filter(
    !is.na(threat),
    !is.na(birth_year)
  )
Warning: There was 1 warning in `mutate()`.
ℹ In argument: `threat = as.numeric(`Case Threat Level`)`.
Caused by warning:
! NAs introduced by coercion
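
The coercion warning above means that some `Case Threat Level` entries are not numbers, and those rows are silently dropped by the `filter()` step. Before modeling, it can help to see which labels failed to convert and how many rows they cover. The sketch below uses made-up values, since the report does not show the column's actual non-numeric labels.

```r
# Made-up threat-level values; "Unknown" is an assumed label, not taken from the data.
threat_raw <- c("1", "2", "3", "Unknown", NA, "2")

# Convert quietly; the NAs are inspected explicitly below.
threat_num <- suppressWarnings(as.numeric(threat_raw))

# TRUE where a value was present but could not be read as a number.
failed <- !is.na(threat_raw) & is.na(threat_num)

failed_labels <- table(threat_raw[failed])  # which labels were lost, and how often
n_failed <- sum(failed)

print(failed_labels)
print(n_failed)
```

On the real data, the same check would use `ice_clean$"Case Threat Level"` in place of `threat_raw`.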
model <- lm(threat ~ birth_year, data = model_data)

summary(model)

Call:
lm(formula = threat ~ birth_year, data = model_data)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.43734 -0.84605  0.03123  0.84157  1.34360 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -42.254660  11.054109  -3.823 0.000172 ***
birth_year    0.022313   0.005564   4.010 8.29e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.856 on 221 degrees of freedom
Multiple R-squared:  0.06784,   Adjusted R-squared:  0.06362 
F-statistic: 16.08 on 1 and 221 DF,  p-value: 8.293e-05
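
To report the slope with its uncertainty rather than reading it off the printed table, the estimate, a confidence interval, and R-squared can be pulled from the fitted object. This is a minimal sketch on simulated data (not the ICE data), with an arbitrary seed and hypothetical object names such as `sim` and `fit`.

```r
set.seed(1)  # arbitrary seed so the simulated data are repeatable

# Simulated data, not the ICE data: threat codes 1 to 3, birth years 1960 to 2005.
sim <- data.frame(birth_year = sample(1960:2005, 200, replace = TRUE))
sim$threat <- sample(1:3, 200, replace = TRUE)

fit <- lm(threat ~ birth_year, data = sim)

slope    <- coef(fit)["birth_year"]                    # change in threat code per birth year
slope_ci <- confint(fit, "birth_year", level = 0.95)   # 95% confidence interval for the slope
r2       <- summary(fit)$r.squared                     # share of variance explained

# Scale the slope to a 10-year difference, which is easier to read than 1 year.
per_decade <- 10 * slope

print(slope_ci)
```

Applied to the model in this report, the same scaling gives 10 × 0.022313 ≈ 0.22 threat-level points per decade of birth year.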

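# Conclusion

This project used R to import ICE administrative arrest records, clean them, chart arrests by state, and fit a simple linear regression. In the 223 sampled records with a numeric threat level and birth year, each later birth year was associated with a 0.022 higher threat-level code (p < 0.001). Birth year explained only about 7% of the variation (R-squared = 0.068), so the relationship is statistically detectable but weak. Because the 800-row sample was drawn at random without a fixed seed, and the threat level is a coded category treated here as a continuous number, these estimates should be read as exploratory.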

Source: New York Times