Overview

A common trope in the immigration debate is that undocumented immigrants commit violent crimes at high rates. The supposition that follows is that migrants who are deported are migrants who have committed serious criminal offenses. This idea is prevalent in political rhetoric surrounding deportation. But is the claim consistent with the actual data? That question is the basis of this short project (Project 3). The assignment asks you to analyze real-world data on deportations in the United States between 2003 and 2024. The data record annual ICE removals (deportations) by what ICE records as the “Most Serious Criminal Conviction” (MSCC) of each person deported. The following description comes from TRAC (Transactional Records Access Clearinghouse) and explains what the classification levels mean:

“Seriousness Level of MSCC Conviction. ICE classifies National Crime Information Center (NCIC) offense codes into three seriousness levels. The most serious (Level 1) covers what ICE considers to be “aggravated felonies.” Level 2 offenses cover other felonies, while Level 3 offenses are misdemeanors, including petty and other minor violations of the law. TRAC uses ICE’s “business rules” to group recorded NCIC offense codes into these three seriousness levels.”

Loosely, this means that “Level 1” convictions are the most serious and “Level 3” convictions are generally minor legal infractions. In addition to Levels 1-3, there is a fourth category, “None,” denoting that the deportee had no criminal convictions. Review the Patler and Jones article, especially the section on the criminality narrative.

##       Year       President              All              None       
##  Min.   :2003   Length:22          Min.   : 56882   Min.   : 19495  
##  1st Qu.:2008   Class :character   1st Qu.:178148   1st Qu.: 85446  
##  Median :2014   Mode  :character   Median :238765   Median :106426  
##  Mean   :2014                      Mean   :248987   Mean   :122287  
##  3rd Qu.:2019                      3rd Qu.:356423   3rd Qu.:165287  
##  Max.   :2024                      Max.   :407821   Max.   :253342  
##      Level1          Level2          Level3        Undocumented     
##  Min.   : 9819   Min.   : 3846   Min.   : 11045   Min.   :10100000  
##  1st Qu.:38484   1st Qu.: 9056   1st Qu.: 34978   1st Qu.:10500000  
##  Median :46743   Median :17480   Median : 63186   Median :11050000  
##  Mean   :46534   Mean   :15601   Mean   : 64541   Mean   :11015455  
##  3rd Qu.:57148   3rd Qu.:20342   3rd Qu.: 90950   3rd Qu.:11375000  
##  Max.   :75590   Max.   :29436   Max.   :130251   Max.   :12200000  
##      ER_Non     
##  Min.   : 4018  
##  1st Qu.:28563  
##  Median :41647  
##  Mean   :38980  
##  3rd Qu.:50230  
##  Max.   :71686

Task 1

The following is a line plot of the four levels of criminality (Levels 1-3 and None). First add proper labels to each axis and give a main title. Next, provide a thorough interpretation of the plot that is non-mechanical and substantive. If you were conveying the information from this plot to an audience interested in understanding deportation, what would you say? This task is worth 100 points.
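As a minimal sketch of how axis labels and a main title can be added to a multi-line plot in base R, consider the example below. The data frame `toy` and its numbers are invented purely for illustration and are not the TRAC/ICE figures; only the column names mirror the summary shown above. The label wording is a suggestion, not a required answer.

```r
# Toy numbers for illustration only; NOT the TRAC/ICE data.
toy <- data.frame(
  Year   = 2003:2008,
  None   = c(50, 60, 70, 90, 110, 130),
  Level1 = c(20, 22, 25, 27, 30, 33),
  Level2 = c(5, 6, 7, 8, 9, 10),
  Level3 = c(15, 20, 26, 33, 41, 50)
)

series <- c("None", "Level1", "Level2", "Level3")

# One line per category, with explicit axis labels and a main title.
matplot(toy$Year, as.matrix(toy[, series]),
        type = "l", lty = 1, lwd = 2, col = 1:4,
        xlab = "Year",
        ylab = "Number of removals",
        main = "ICE removals by most serious criminal conviction")

# A legend so each line can be matched to its category.
legend("topleft", legend = series, col = 1:4, lty = 1, lwd = 2, bty = "n")
```

The same `xlab`, `ylab`, and `main` arguments apply if the plot is drawn with `plot()` and `lines()` instead of `matplot()`.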

Task 1 answer goes here

Task 2: Regression

For this task you will create three new variables from existing ones in the data set.

First, create a new variable called “minor” that sums all deportations associated with no criminal conviction (“None”) and Level 3 convictions. These are the deportations associated with minor or no criminal activity.

Second, compute the percentage of all deportations that are “minor” deportations (i.e. \(100 \times \frac{None + Level~3}{None + Level~1 + Level~2 + Level~3}\)). Call this variable “percent_minor.”

Third, center the Year variable using 2014 as the basis year, so that 2014 takes the value 0 (how to do this will be discussed in class). Name this variable “time.”

Fourth, estimate a linear regression model using the variable percent_minor as the dependent variable and the variable time as the independent variable. Provide an interpretation of the regression results, and present the results visually using plot_model. What do we learn about the criminality narrative from these results? This task is worth 100 points.
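The four steps above can be sketched as follows. This is only an illustration under stated assumptions: the data frame `toy` and its numbers are invented (they are not the deportation data), and centering is assumed to mean subtracting 2014 from Year.

```r
# Toy numbers for illustration only; NOT the TRAC/ICE data.
toy <- data.frame(
  Year   = 2012:2016,
  None   = c(100, 120, 110, 90, 80),
  Level1 = c(40, 35, 30, 30, 25),
  Level2 = c(10, 15, 10, 10, 15),
  Level3 = c(50, 30, 50, 70, 80)
)

# Step 1: deportations with no conviction or only a Level 3 conviction.
toy$minor <- toy$None + toy$Level3

# Step 2: "minor" deportations as a percentage of all four categories.
total <- toy$None + toy$Level1 + toy$Level2 + toy$Level3
toy$percent_minor <- 100 * toy$minor / total

# Step 3: center Year on 2014 (assumed to mean Year - 2014), so 2014 becomes 0.
toy$time <- toy$Year - 2014

# Step 4: linear regression of percent_minor on the centered time variable.
fit <- lm(percent_minor ~ time, data = toy)
coef(fit)

# The assignment names plot_model; assuming it is sjPlot::plot_model,
# a call along these lines would plot the fitted line:
# sjPlot::plot_model(fit, type = "pred", terms = "time")
```

Because time is centered, the intercept is the fitted percent_minor for 2014 rather than for year 0, while the slope is the same as it would be with the uncentered Year.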

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   55.01   71.82   74.61   73.68   77.02   86.63

Center Year on 2014

## 
## -11 -10  -9  -8  -7  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6   7   8 
##   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1 
##   9  10 
##   1   1

Regression

## 
## Call:
## lm(formula = minortotal ~ Year, data = reasons)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -19.0062  -1.9510   0.9342   2.9694  12.4761 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -17.09483  446.25280  -0.038    0.970
## Year          0.04508    0.22163   0.203    0.841
## 
## Residual standard error: 6.595 on 20 degrees of freedom
## Multiple R-squared:  0.002065,   Adjusted R-squared:  -0.04783 
## F-statistic: 0.04138 on 1 and 20 DF,  p-value: 0.8409

Task 2 answer goes here

Task 3: Presidential differences

Are there differences in the criminality levels of deportees by President? This is the question you will answer here. To do this, create a factor-level variable denoting each President. The data set contains a variable called “President” that records each presidential term as “Bush1”, “Bush2”, “Obama1”, “Obama2”, “Trump”, or “Biden.” Estimate a regression model treating this factor-level variable as the independent variable and “percent_minor” as the dependent variable. What do the results show? Provide an interpretation of the regression results, including a plot of the regression model. This task is worth 100 points.
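A minimal sketch of building the factor and using it in a regression is given below. The data frame `toy` and its values are invented for illustration (not the deportation data), and the name `PresFactor` is borrowed from the model output shown further down. The key assumption is that the levels should be listed in chronological order; without the `levels` argument, `factor()` sorts them alphabetically and “Biden” would become the baseline.

```r
# Toy values for illustration only; NOT the TRAC/ICE data.
toy <- data.frame(
  President = rep(c("Bush1", "Bush2", "Obama1", "Obama2", "Trump", "Biden"),
                  each = 2),
  percent_minor = c(60, 64, 70, 74, 71, 75, 68, 72, 69, 73, 64, 68)
)

# Set the level order explicitly so the first term is the baseline category.
pres_levels <- c("Bush1", "Bush2", "Obama1", "Obama2", "Trump", "Biden")
toy$PresFactor <- factor(toy$President, levels = pres_levels)

# Regression with the factor as the only predictor.
fit <- lm(percent_minor ~ PresFactor, data = toy)
coef(fit)

# Group means, for comparison with the coefficients.
group_means <- tapply(toy$percent_minor, toy$PresFactor, mean)
group_means
```

With R's default treatment contrasts, the intercept equals the mean for the baseline level and every other coefficient is that term's mean minus the baseline mean.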

Making factor

## 
##  Bush1  Bush2 Obama1 Obama2  Trump  Biden 
##      2      4      4      4      4      4

Regression model

## 
## Call:
## lm(formula = minortotal ~ PresFactor, data = reasons)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -16.1117  -1.7610  -0.6386   2.6331  15.5058 
## 
## Coefficients:
##                  Estimate Std. Error t value       Pr(>|t|)    
## (Intercept)        67.026      4.686  14.304 0.000000000156 ***
## PresFactorBush2     8.834      5.739   1.539          0.143    
## PresFactorObama1    9.314      5.739   1.623          0.124    
## PresFactorObama2    6.397      5.739   1.115          0.281    
## PresFactorTrump     7.955      5.739   1.386          0.185    
## PresFactorBiden     4.098      5.739   0.714          0.485    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.627 on 16 degrees of freedom
## Multiple R-squared:  0.1939, Adjusted R-squared:  -0.05797 
## F-statistic: 0.7699 on 5 and 16 DF,  p-value: 0.585

Plotting the model

Task 3 answer goes here

Task 4: Total deportations by President

In this task, we ask: do total deportations vary across presidencies? There is a basis for this question. President Obama has often been called the “deporter-in-chief” because of the number of deportations that occurred during his presidency, especially in the first term. President Trump, in his first administration, promised to increase deportations. Are any of these claims valid? Estimate a regression model treating the total number of deportations as the dependent variable and the presidential factor variable as the independent variable. Provide a substantive interpretation of the regression results as well as the plot of the regression model. This task is worth 100 points.

Regression model and plot

## 
## Call:
## lm(formula = All ~ PresFactor, data = reasons)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -86997 -33504    250  26966 115451 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        168379      38462   4.378 0.000468 ***
## PresFactorBush2    101881      47106   2.163 0.046044 *  
## PresFactorObama1   227705      47106   4.834 0.000183 ***
## PresFactorObama2   123022      47106   2.612 0.018891 *  
## PresFactorTrump     64590      47106   1.371 0.189250    
## PresFactorBiden    -73855      47106  -1.568 0.136482    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 54390 on 16 degrees of freedom
## Multiple R-squared:  0.8124, Adjusted R-squared:  0.7538 
## F-statistic: 13.86 on 5 and 16 DF,  p-value: 0.00002467
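One way to read the coefficient table above: Bush1 has no coefficient of its own, so it is the baseline, and (assuming R's default treatment contrasts) the intercept is the Bush1 average while each remaining coefficient is a difference from it. The sketch below only adds the printed estimates back together; the object names are illustrative.

```r
# Estimates copied from the regression output above.
intercept <- 168379
offsets <- c(Bush2 = 101881, Obama1 = 227705, Obama2 = 123022,
             Trump = 64590, Biden = -73855)

# Implied average annual deportations per presidential term:
# baseline mean, then baseline mean plus each term's offset.
implied_means <- c(Bush1 = intercept, intercept + offsets)
implied_means

# Term with the largest implied average.
names(which.max(implied_means))
```

The p-values in the table test each term against Bush1 only; they do not by themselves compare, say, Obama1 with Trump.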

Task 4 answer goes here

Task 5: Broken stick regression

For this task, first create a diagnostic plot of all deportations by year. Based on your inspection of the plot, how many piecewise segments do you think would best fit these data? Then estimate a regression using a spline function with a polynomial of order 1 and the number of splines equal to what your diagnostic plot suggests. Comparing models with 2 and 3 degrees of freedom, which model best describes the data? This question is worth 50 points.
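As a hedged sketch of a broken stick fit, the example below uses `bs()` from the `splines` package with `degree = 1`. Whether the course uses `bs()` is an assumption, and the series in `toy` is invented (it is not the deportation data). With `degree = 1`, `df = 2` gives one interior knot (two line segments) and `df = 3` gives two interior knots (three segments); `bs()` places the knots at quantiles of Year.

```r
library(splines)

# Toy series for illustration only; NOT the TRAC/ICE data.
# It rises through 2012 and declines afterwards.
toy <- data.frame(Year = 2003:2024)
toy$All <- ifelse(toy$Year <= 2012,
                  100 + 20 * (toy$Year - 2003),
                  280 - 15 * (toy$Year - 2012))

# Diagnostic plot: look for the number of roughly linear stretches.
plot(toy$Year, toy$All, type = "b")

# Piecewise linear (order 1) splines with 2 and 3 degrees of freedom.
fit2 <- lm(All ~ bs(Year, degree = 1, df = 2), data = toy)
fit3 <- lm(All ~ bs(Year, degree = 1, df = 3), data = toy)

# Overlay the fitted broken sticks on the diagnostic plot.
lines(toy$Year, fitted(fit2), lty = 2)
lines(toy$Year, fitted(fit3), lty = 3)

# Compare the two fits; a lower AIC indicates the preferred model.
aic_table <- AIC(fit2, fit3)
aic_table
```

AIC is used here because the two models place their knots at different years and so are not nested; adjusted R-squared from `summary()` is another reasonable basis for comparison.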

Diagnostic plot (labels are not needed)

Spline function

Task 5 answer goes here