Overview

remove.1 <- read.csv("ICE_removals_1948 (1).csv", stringsAsFactors = FALSE)

I have compiled data of deportations by fiscal year from 1948 to 2022. Your job is to use some of the skills we are learning in this class to better understand these data. As such, I will be asking you to engage in a number of tasks requiring the use of \(t\)-tests and simple regression. Your grade will be based on analysis and presentation of the data. This assignment is worth 600 points. It will be due May 30 by 11:59 PM. You need to submit an HTML document or a document that includes code and viewable output.

Reading in the deportation data

This chunk reads in the data on deportations from 1948 to 2022.

library(tidyverse)  # read_csv(), %>%, mutate(), ggplot()
library(sjPlot)     # plot_model()
urlfile <- "https://raw.githubusercontent.com/mightyjoemoon/POL51/main/ICE_removals_1948.csv"

remove.1 <- read_csv(url(urlfile))

summary(remove.1)
##       Year      Apprehensions      President             Party       
##  Min.   :1948   Min.   :  45336   Length:75          Min.   :0.0000  
##  1st Qu.:1966   1st Qu.: 444232   Class :character   1st Qu.:0.0000  
##  Median :1985   Median : 889212   Mode  :character   Median :0.0000  
##  Mean   :1985   Mean   : 852071                      Mean   :0.4667  
##  3rd Qu.:2004   3rd Qu.:1194182                      3rd Qu.:1.0000  
##  Max.   :2022   Max.   :2584220                      Max.   :1.0000  
##                                                                      
##      PCGdp           Decade      Deportations          VR         
##  Min.   : 1833   Min.   :1940   Min.   :  5989   Min.   :  52383  
##  1st Qu.: 4231   1st Qu.:1960   1st Qu.: 17362   1st Qu.: 174562  
##  Median :18237   Median :1980   Median : 29277   Median : 673169  
##  Mean   :24128   Mean   :1978   Mean   :109287   Mean   : 648029  
##  3rd Qu.:40607   3rd Qu.:2000   3rd Qu.:188746   3rd Qu.:1017324  
##  Max.   :77247   Max.   :2010   Max.   :432334   Max.   :1675876  
##                  NA's   :2                                        
##  Administrative   EnforcementReturns    Criminal       Noncriminal    
##  Min.   : 15072   Min.   : 49664     Min.   : 61117   Min.   : 24666  
##  1st Qu.: 44947   1st Qu.: 81191     1st Qu.:114680   1st Qu.:161440  
##  Median : 60150   Median : 86800     Median :135509   Median :190058  
##  Mean   : 70965   Mean   :159377     Mean   :139193   Mean   :168409  
##  3rd Qu.: 85478   3rd Qu.:171374     3rd Qu.:176722   3rd Qu.:215554  
##  Max.   :180266   Max.   :523153     Max.   :200039   Max.   :233846  
##  NA's   :61       NA's   :61         NA's   :63       NA's   :63      
##     Title 42        Foreign Born       Naturalized         Noncitizen      
##  Min.   : 206770   Min.   : 9619300   Min.   :14967828   Min.   :20722014  
##  1st Qu.: 638922   1st Qu.: 9738100   1st Qu.:17003818   1st Qu.:21671389  
##  Median :1071074   Median :19767300   Median :19639724   Median :21965584  
##  Mean   : 793937   Mean   :23434849   Mean   :19752182   Mean   :21939190  
##  3rd Qu.:1087520   3rd Qu.:36154329   3rd Qu.:22459486   3rd Qu.:22364709  
##  Max.   :1103966   Max.   :46182177   Max.   :24509131   Max.   :22593269  
##  NA's   :72        NA's   :7          NA's   :57         NA's   :57        
##  Unauthorized population US Population         App_lagged     
##  Min.   : 3500000        Min.   :146631302   Min.   :  45336  
##  1st Qu.:10237500        1st Qu.:197636197   1st Qu.: 382740  
##  Median :10850000        Median :237923795   Median : 885587  
##  Mean   :10168182        Mean   :241806480   Mean   : 820197  
##  3rd Qu.:11375000        3rd Qu.:291456616   3rd Qu.:1183164  
##  Max.   :12200000        Max.   :333287557   Max.   :1865379  
##  NA's   :53

Task 1: Interpret barplot of deportations

Below is code to produce a barplot of deportations over the time frame. I want you to provide a professional-grade interpretation of the plot you are seeing. This task is worth 100 points.

df_melted <- aggregate(data = remove.1, Deportations ~ Year, mean)
names(df_melted) <- c("Year", "mean_Deportations")

ggplot(df_melted, aes(x = Year, y = mean_Deportations, width = 1)) +
  geom_bar(stat = "identity") +
  scale_x_continuous(n.breaks = 10) +
  labs(title = "Figure 1: Deportations by year (FY 1948-2022)",
       y = "Number of deportations", x = "Fiscal year") +
  theme_bw() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        axis.text.y = element_text(size = 9),
        axis.text.x = element_text(size = 9),
        plot.title = element_text(size = 12))

Task 1 answer here

Task 2: T-test by Party

Create a factor-level variable for Party of the President labeled “Republican” for Republicans and “Democrat” for Democrats. Following this, compute a two-group difference-in-means test assessing the following research question: Is the number of Deportations under a Democratic Presidency significantly different from the number under a Republican Presidency? In a paragraph, report results from the analysis using substantive language that a layperson could understand. This task is worth 100 points.

Task 1 answer: From the 1950s through the 1990s, deportations remained low, as the graph shows, most likely because immigration enforcement was looser. After the 1990s deportations increased, most likely in response to the 9/11 terrorist attacks, which made border security a focal point. A further sharp increase follows around the 2010s, coinciding with the Obama presidency, during which deportations reached record highs. After that peak the numbers begin to decline, which I believe could be explained by public pressure, since better treatment of immigrants has been a prominent issue since the mid-2010s.

remove.1 <- read.csv("ICE_removals_1948 (1).csv", stringsAsFactors = FALSE)

remove.1 <- remove.1 %>%
  mutate(PartyFactor = ifelse(Party == 0, "Republican", "Democrat")) %>%
  mutate(PartyFactor = factor(PartyFactor, levels = c("Democrat", "Republican")))

group_means <- remove.1 %>%
  group_by(PartyFactor) %>%
  summarise(mean_deportations = mean(Deportations, na.rm = TRUE))

diff_test <- t.test(Deportations ~ PartyFactor, data = remove.1)

group_means
## # A tibble: 2 × 2
##   PartyFactor mean_deportations
##   <fct>                   <dbl>
## 1 Democrat              125843.
## 2 Republican             94800.
diff_test
## 
##  Welch Two Sample t-test
## 
## data:  Deportations by PartyFactor
## t = 0.97685, df = 64.521, p-value = 0.3323
## alternative hypothesis: true difference in means between group Democrat and group Republican is not equal to 0
## 95 percent confidence interval:
##  -32432.22  94518.08
## sample estimates:
##   mean in group Democrat mean in group Republican 
##                125843.26                 94800.32

Task 2 answer goes here
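As a check on the output above, the reported difference in means can be reproduced by hand from the two sample estimates. The standard error is not printed by t.test; the value below is backed out from the reported t statistic, so treat it as approximate.

\[
\overline{D}_{Dem} - \overline{D}_{Rep} = 125843.26 - 94800.32 = 31042.94
\]

\[
SE \approx \frac{31042.94}{0.97685} \approx 31779
\]

The 95 percent confidence interval is centered on this difference, since \((-32432.22 + 94518.08)/2 \approx 31042.93\), and it includes zero.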

This t-test shows that there is no statistically significant difference between the two groups. Democratic presidents did deport more people on average, but the gap is not large enough to conclude that the president's party affects deportations. The difference between the two group means is 31,043 deportations per year (125,843 under Democrats versus 94,800 under Republicans). The difference is not significant because the p-value (0.3323) is above .05.

Task 3: Regression with a dummy variable

Estimate a bivariate regression model of the form: \(\hat{Deportations}=\beta_0 + \beta_1*Party~of~President\) and report the results from the regression model by summarizing the regression object. Based on the table of results, what would be the predicted number of deportations for Republicans and for Democrats? What do \(\beta_0\) and \(\beta_1\) tell us? Based on the model, is there evidence to reject the null hypothesis that \(\overline{D}_{Dem}=\overline{D}_{Rep}\)? This task is worth 100 points. Before doing this, you should read “The US Deportation System: History, Impacts, and New Empirical Research” by Caitlin Patler and Bradford Jones.

remove.1 <- remove.1 %>%
  mutate(PartyFactor = factor(Party, levels = c(0, 1), labels = c("Republican", "Democrat")))

model <- lm(Deportations ~ PartyFactor, data = remove.1)

summary(model)
## 
## Call:
## lm(formula = Deportations ~ PartyFactor, data = remove.1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -118080  -87243  -71318   82297  306491 
## 
## Coefficients:
##                     Estimate Std. Error t value  Pr(>|t|)    
## (Intercept)            94800      21372   4.436 0.0000319 ***
## PartyFactorDemocrat    31043      31286   0.992     0.324    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 135200 on 73 degrees of freedom
## Multiple R-squared:  0.01331,    Adjusted R-squared:  -0.0002093 
## F-statistic: 0.9845 on 1 and 73 DF,  p-value: 0.3244

Task 3 answer goes here
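Reading the coefficient table above with Republican as the reference category (Party = 0), the fitted equation and the two predicted values work out as follows. The indicator \(Dem\) is notation introduced here for illustration; it equals 1 for Democratic presidents and 0 otherwise.

\[
\widehat{Deportations} = 94800 + 31043 \times Dem
\]

\[
\widehat{D}_{Rep} = 94800 + 31043(0) = 94800, \qquad \widehat{D}_{Dem} = 94800 + 31043(1) = 125843
\]

These predictions match the group means from the t-test in Task 2, and testing \(H_0: \beta_1 = 0\) is the same as testing \(\overline{D}_{Dem}=\overline{D}_{Rep}\).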

This regression model reports an intercept of 94,800, which is the predicted number of deportations per year under Republican presidents, and a slope of 31,043, which is how much higher the predicted number is when a Democratic president is in office. However, once again the slope is not statistically significant (p = 0.324), which tells us that the president's party is not a significant factor and that we cannot reject the null hypothesis that the two means are equal.

Task 4: Plot regression object

Using \(\textrm{plot_model}\) (from the \(\textrm{sjPlot}\) package), provide a professional-grade plot of the regression model along with an interpretation of the plot. Which hypothesis is the plot most consistent with? This task is worth 100 points.

model <- lm(Deportations ~ PartyFactor, data = remove.1)


predicted_df <- data.frame(
  PartyFactor = factor(c("Republican", "Democrat"), levels = c("Republican", "Democrat"))
)


predicted_df$predicted <- predict(model, newdata = predicted_df)


ggplot(predicted_df, aes(x = PartyFactor, y = predicted, fill = PartyFactor)) +
  geom_bar(stat = "identity", width = 0.6) +
  labs(
    title = "Predicted Deportations by President's Party",
    x = "President's Party",
    y = "Predicted Deportations"
  ) +
  theme_minimal() +
  theme(legend.position = "none")

Task 4 answer goes here

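In this graph we see that the predicted number of deportations is higher under Democratic presidents (125,843) than under Republican presidents (94,800). Although the gap between the bars looks sizable, the regression results show that the difference is not statistically significant (p = 0.324). The plot is therefore most consistent with the null hypothesis that the two parties do not differ in deportations.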

Task 5: Regression by decade

In the Patler and Jones article I asked you to read, they point out that several policies were enacted that made deportations easier to carry out. Among the most important of these policies was the Illegal Immigration Reform and Immigrant Responsibility Act of 1996 (IIRIRA). One prediction might be that after the changes in the 1990s (like the IIRIRA), we should observe an increase in deportations starting in the 1990s. To assess this claim, do the following:

Create a well-labeled factor-level variable denoting each decade starting with the 1950s (1951-1960) going up to the 2010s (2011-2020) and then estimate a regression model treating the dependent variable (i.e., the number of deportations) as a function of the decade-factor level variable. Following this, plot the regression model using \(\textrm{plot_model}\). Provide a thorough interpretation of the regression model with a focus on the claims made in the paragraph above. Are the results consistent with the basic claim made? This task is worth 100 points.

# Check your dataset
print(head(remove.1))
##   Year Apprehensions President Party    PCGdp Decade Deportations     VR
## 1 2022       2584220     Biden     1 77246.67     NA       108733 261387
## 2 2021       1865379     Biden     1 71055.88     NA        85783 178003
## 3 2020        609265     Trump     0 64317.40   2010       237364 167452
## 4 2019       1175841     Trump     0 65548.07   2010       347090 171120
## 5 2018        739486     Trump     0 63201.05   2010       327608 159958
## 6 2017        607677     Trump     0 60322.26   2010       284365 100452
##   Administrative EnforcementReturns Criminal Noncriminal Title.42 Foreign.Born
## 1         180266              81121    63266       45467  1103966     46182177
## 2         128339              49664    61117       24666  1071074     45270103
## 3         113857              53595   119142      118222   206770     45101502
## 4          89719              81401   169898      177192       NA     44932901
## 5          72756              87202   148203      179405       NA     44728721
## 6          15072              85380   108519      175846       NA     44525855
##   Naturalized Noncitizen Unauthorized.population US.Population App_lagged
## 1    24509131   21673046                      NA     333287557    1865379
## 2    24044083   21226020                10500000     332031554     609265
## 3    23613500   21488002                10350000     331511512    1175841
## 4    23182917   21749984                10200000     328329953     739486
## 5    22629737   22098984                10500000     326838199     607677
## 6    21948732   22577123                10500000     325122128     683782
##   PartyFactor
## 1    Democrat
## 2    Democrat
## 3  Republican
## 4  Republican
## 5  Republican
## 6  Republican
# Create Decade variable
remove.1 <- remove.1 %>%
  mutate(Decade = case_when(
    Year >= 1951 & Year <= 1960 ~ "1950s",
    Year >= 1961 & Year <= 1970 ~ "1960s",
    Year >= 1971 & Year <= 1980 ~ "1970s",
    Year >= 1981 & Year <= 1990 ~ "1980s",
    Year >= 1991 & Year <= 2000 ~ "1990s",
    Year >= 2001 & Year <= 2010 ~ "2000s",
    Year >= 2011 & Year <= 2020 ~ "2010s"
  ))

# Print to confirm variable added
print(table(remove.1$Decade))
## 
## 1950s 1960s 1970s 1980s 1990s 2000s 2010s 
##    10    10    10    10    10    10    10
# Convert to factor
remove.1$Decade <- factor(remove.1$Decade, levels = c("1950s", "1960s", "1970s", "1980s", "1990s", "2000s", "2010s"))

# Run regression
model_decade <- lm(Deportations ~ Decade, data = remove.1)

# Output summary
summary(model_decade)
## 
## Call:
## lm(formula = Deportations ~ Decade, data = remove.1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -112307   -8002    -742    7322  104976 
## 
## Coefficients:
##             Estimate Std. Error t value             Pr(>|t|)    
## (Intercept)    15047      14387   1.046             0.299619    
## Decade1960s    -4927      20347  -0.242             0.809460    
## Decade1970s     8974      20347   0.441             0.660666    
## Decade1980s     8236      20347   0.405             0.687015    
## Decade1990s    79603      20347   3.912             0.000227 ***
## Decade2000s   262426      20347  12.898 < 0.0000000000000002 ***
## Decade2010s   334623      20347  16.446 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 45500 on 63 degrees of freedom
##   (5 observations deleted due to missingness)
## Multiple R-squared:  0.9016, Adjusted R-squared:  0.8923 
## F-statistic: 96.25 on 6 and 63 DF,  p-value: < 0.00000000000000022
plot_model(model_decade, type = "est", show.values = TRUE, value.offset = 0.3) +
  labs(
    title = "Deportations by Decade (Relative to 1950s)",
    x = "Decade",
    y = "Estimated Deportations"
  )

Task 5 answer goes here
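Because the 1950s are the reference category, each decade's predicted number of deportations per year is the intercept plus that decade's coefficient from the summary above. Worked out by hand from the rounded estimates:

\[
\begin{aligned}
\widehat{D}_{1950s} &= 15047 \\
\widehat{D}_{1960s} &= 15047 - 4927 = 10120 \\
\widehat{D}_{1970s} &= 15047 + 8974 = 24021 \\
\widehat{D}_{1980s} &= 15047 + 8236 = 23283 \\
\widehat{D}_{1990s} &= 15047 + 79603 = 94650 \\
\widehat{D}_{2000s} &= 15047 + 262426 = 277473 \\
\widehat{D}_{2010s} &= 15047 + 334623 = 349670
\end{aligned}
\]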

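This model shows whether deportations have grown across decades. Relative to the 1950s baseline (about 15,047 deportations per year), the coefficients for the 1960s, 1970s, and 1980s are small and not statistically significant, so deportations stayed low through the 1980s. The most notable change comes in the 1990s, the decade of the Illegal Immigration Reform and Immigrant Responsibility Act of 1996: the 1990s coefficient is 79,603 and statistically significant, and the increases grow to 262,426 in the 2000s and 334,623 in the 2010s. These results are consistent with the claim that the policies enacted in the 1990s made it easier to deport people.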

Task 6: Pre-post 1996

Create a dummy variable (or binary variable) coded 1 if the year is 1996 or later and 0 otherwise. Estimate a regression model treating deportations as a function of this dummy variable. Plot the regression model and provide a thorough substantive interpretation of the regression results. To start, what would be the null and alternative hypotheses for \(\beta_1\) given the research question? Suggested ways to interpret this would be to report the predicted number of deportations in the later period compared to the earlier period as well as discussing the coefficient showing the difference. You should tie your interpretation back to the regression estimates. This task is worth 100 points.

# Create a dummy variable for Post-1996 period
remove.1 <- remove.1 %>%
  mutate(Post1996 = ifelse(Year >= 1996, 1, 0))

# Run regression
model_post1996 <- lm(Deportations ~ Post1996, data = remove.1)

# View regression summary
summary(model_post1996)
## 
## Call:
## lm(formula = Deportations ~ Post1996, data = remove.1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -196855  -12510   -2139   14015  165799 
## 
## Coefficients:
##             Estimate Std. Error t value            Pr(>|t|)    
## (Intercept)    20835       9385    2.22              0.0295 *  
## Post1996      245700      15642   15.71 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 65020 on 73 degrees of freedom
## Multiple R-squared:  0.7717, Adjusted R-squared:  0.7686 
## F-statistic: 246.7 on 1 and 73 DF,  p-value: < 0.00000000000000022
plot_model(model_post1996, type = "est", show.values = TRUE, value.offset = 0.3) +
  labs(
    title = "Effect of Post-1996 Policy Change on Deportations",
    x = "Policy Change Dummy (1 = Year ≥ 1996)",
    y = "Estimated Deportations"
  )

Task 6 answer goes here
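One way to state the hypotheses for \(\beta_1\), assuming the research question is whether deportations rose from 1996 onward, is shown below. The directional alternative is an assumption about the question; the p-value in the summary above is for the two-sided test.

\[
H_0: \beta_1 = 0 \qquad H_A: \beta_1 > 0
\]

The predicted number of deportations per year in each period follows directly from the estimates:

\[
\widehat{D}_{pre} = 20835, \qquad \widehat{D}_{post} = 20835 + 245700 = 266535
\]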

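This plot shows the estimated change in deportations from 1996 onward, once the immigration act was in place. The intercept (20,835) is the predicted number of deportations per year before 1996, and the Post1996 coefficient (245,700) is how much higher the predicted number is in the later period. The coefficient is statistically significant (p < 0.001), so we reject the null hypothesis that \(\beta_1 = 0\). This supports the idea that the immigration act made deportations easier and pushed the numbers higher.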

Works cited

Pierce, Sarah. 2017. "The Obama Record on Deportations: Deporter in Chief or Not?" Migration Policy Institute. https://www.migrationpolicy.org/article/obama-record-deportations-deporter-chief-or-not

USCIS. 2025. "Post-9/11." USCIS. https://www.uscis.gov/about-us/our-history/explore-agency-history/overview-of-agency-history/post-911

Cornell Law School. 2018. "Illegal Immigration Reform and Immigration Responsibility Act." LII / Legal Information Institute. https://www.law.cornell.edu/wex/illegal_immigration_reform_and_immigration_responsibility_act