remove.1 <- read.csv("ICE_removals_1948 (1).csv", stringsAsFactors = FALSE)
I have compiled data of deportations by fiscal year from 1948 to 2022. Your job is to use some of the skills we are learning in this class to better understand these data. As such, I will be asking you to engage in a number of tasks requiring the use of \(t\)-tests and simple regression. Your grade will be based on analysis and presentation of the data. This assignment is worth 600 points. It will be due May 30 by 11:59 PM. You need to submit an HTML document or a document that includes code and viewable output.
This chunk reads in the data on deportations from 1948 to 2022.
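# Load the packages used throughout: tidyverse (readr, dplyr, ggplot2) and sjPlot (plot_model)
library(tidyverse)
library(sjPlot)
urlfile <- "https://raw.githubusercontent.com/mightyjoemoon/POL51/main/ICE_removals_1948.csv"
remove.1 <- read_csv(url(urlfile))
summary(remove.1)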
## Year Apprehensions President Party
## Min. :1948 Min. : 45336 Length:75 Min. :0.0000
## 1st Qu.:1966 1st Qu.: 444232 Class :character 1st Qu.:0.0000
## Median :1985 Median : 889212 Mode :character Median :0.0000
## Mean :1985 Mean : 852071 Mean :0.4667
## 3rd Qu.:2004 3rd Qu.:1194182 3rd Qu.:1.0000
## Max. :2022 Max. :2584220 Max. :1.0000
##
## PCGdp Decade Deportations VR
## Min. : 1833 Min. :1940 Min. : 5989 Min. : 52383
## 1st Qu.: 4231 1st Qu.:1960 1st Qu.: 17362 1st Qu.: 174562
## Median :18237 Median :1980 Median : 29277 Median : 673169
## Mean :24128 Mean :1978 Mean :109287 Mean : 648029
## 3rd Qu.:40607 3rd Qu.:2000 3rd Qu.:188746 3rd Qu.:1017324
## Max. :77247 Max. :2010 Max. :432334 Max. :1675876
## NA's :2
## Administrative EnforcementReturns Criminal Noncriminal
## Min. : 15072 Min. : 49664 Min. : 61117 Min. : 24666
## 1st Qu.: 44947 1st Qu.: 81191 1st Qu.:114680 1st Qu.:161440
## Median : 60150 Median : 86800 Median :135509 Median :190058
## Mean : 70965 Mean :159377 Mean :139193 Mean :168409
## 3rd Qu.: 85478 3rd Qu.:171374 3rd Qu.:176722 3rd Qu.:215554
## Max. :180266 Max. :523153 Max. :200039 Max. :233846
## NA's :61 NA's :61 NA's :63 NA's :63
## Title 42 Foreign Born Naturalized Noncitizen
## Min. : 206770 Min. : 9619300 Min. :14967828 Min. :20722014
## 1st Qu.: 638922 1st Qu.: 9738100 1st Qu.:17003818 1st Qu.:21671389
## Median :1071074 Median :19767300 Median :19639724 Median :21965584
## Mean : 793937 Mean :23434849 Mean :19752182 Mean :21939190
## 3rd Qu.:1087520 3rd Qu.:36154329 3rd Qu.:22459486 3rd Qu.:22364709
## Max. :1103966 Max. :46182177 Max. :24509131 Max. :22593269
## NA's :72 NA's :7 NA's :57 NA's :57
## Unauthorized population US Population App_lagged
## Min. : 3500000 Min. :146631302 Min. : 45336
## 1st Qu.:10237500 1st Qu.:197636197 1st Qu.: 382740
## Median :10850000 Median :237923795 Median : 885587
## Mean :10168182 Mean :241806480 Mean : 820197
## 3rd Qu.:11375000 3rd Qu.:291456616 3rd Qu.:1183164
## Max. :12200000 Max. :333287557 Max. :1865379
## NA's :53
Below is code to produce a barplot of deportations over the time frame. I want you to provide a professional-grade interpretation of the plot you are seeing. This task is worth 100 points.
df_melted <- aggregate(data = remove.1, Deportations ~ Year, mean)
names(df_melted) <- c("Year", "mean_Deportations")
ggplot(df_melted, aes(x = Year, y = mean_Deportations, width=1)) +
geom_bar(stat = "identity") +
scale_x_continuous(n.breaks = 10) +
labs(title="Figure 1: Deportations by year (FY 1948-2022)",
y="Number of deportations", x="Fiscal year",
color="") +
theme_bw() +
theme(#panel.grid.major.y = element_line(colour = "grey", linetype = "dashed"),
panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
axis.text.y = element_text(size=9),
axis.text.x = element_text(size=9),
#axis.title.y=element_blank(),
#axis.title.x=element_blank(),
#legend.title=element_blank(),
#legend.position=c(.01, .77),
#legend.justification=c("left", "bottom"),
#legend.title = element_text(size = 5),
#legend.text = element_text(size = 5),
#legend.margin=margin(0,0,0,0),
#legend.box.margin=margin(-1,-1,-1,-1),
plot.title = element_text(size=12))
Create a factor-level variable for Party of the President labeled “Republican” for Republicans and “Democrat” for Democrats. Following this, compute a two-group difference-in-means test assessing the following research question: Is the number of Deportations under a Democratic Presidency significantly different from the number of Deportations under a Republican Presidency? In a paragraph, report results from the analysis using substantive language that would be understandable to a layperson. This task is worth 100 points.
#Answer 1 From the 1950s through the 1980s there were not many deportations: as the graph shows, the yearly number stayed low, most likely because immigration enforcement was looser. Deportations began to rise in the 1990s and climbed more steeply in the 2000s, most likely because border security became a focal point after the 9/11 terrorist attacks. Deportations then reach their highest levels around the 2010s, which fits with Obama's presidency, when deportations hit record highs. After the peak, the number starts to decline. I believe this could be explained by public pressure, since better treatment of immigrants has been a prominent issue since the mid-2010s.
remove.1 <- read.csv("ICE_removals_1948 (1).csv", stringsAsFactors = FALSE)
remove.1 <- remove.1 %>%
mutate(PartyFactor = ifelse(Party == 0, "Republican", "Democrat")) %>%
mutate(PartyFactor = factor(PartyFactor, levels = c("Democrat", "Republican")))
group_means <- remove.1 %>%
group_by(PartyFactor) %>%
summarise(mean_deportations = mean(Deportations, na.rm = TRUE))
diff_test <- t.test(Deportations ~ PartyFactor, data = remove.1)
group_means
## # A tibble: 2 × 2
## PartyFactor mean_deportations
## <fct> <dbl>
## 1 Democrat 125843.
## 2 Republican 94800.
diff_test
##
## Welch Two Sample t-test
##
## data: Deportations by PartyFactor
## t = 0.97685, df = 64.521, p-value = 0.3323
## alternative hypothesis: true difference in means between group Democrat and group Republican is not equal to 0
## 95 percent confidence interval:
## -32432.22 94518.08
## sample estimates:
## mean in group Democrat mean in group Republican
## 125843.26 94800.32
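As a check on the output above, the two-sided p-value can be recomputed from the reported t statistic and Welch degrees of freedom. This is a minimal sketch in base R that copies the two numbers from the printed output; the variable names are illustrative and are not part of the original analysis.

```r
# Values copied from the Welch t-test output above
t_stat <- 0.97685
df_welch <- 64.521

# Two-sided p-value: probability of a t statistic at least this far from zero
p_value <- 2 * pt(-abs(t_stat), df = df_welch)
round(p_value, 4)

# Decision at the conventional 0.05 level
reject_null <- p_value < 0.05
reject_null
```

A p-value above 0.05, together with a 95 percent confidence interval (-32,432 to 94,518) that contains zero, means the test does not reject the null hypothesis of equal means.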
This t-test shows that there is no statistically significant difference between the two parties. Democratic presidents deported more people on average (about 125,843 per year versus about 94,800 under Republican presidents), a difference of about 31,043, but the gap is not large enough relative to the year-to-year variation to say that the president's party affects deportations. The difference is not significant because the p-value (0.33) is above .05, and the 95 percent confidence interval (-32,432 to 94,518) includes zero.
### Task 3: Regression with a dummy variable
Estimate a bivariate regression model of the form: \(\widehat{Deportations}=\beta_0 + \beta_1 \times Party~of~President\) and report the results from the regression model by summarizing the regression object. Based on the table of results, what would be the predicted number of deportations for Republicans and for Democrats? What do \(\beta_0\) and \(\beta_1\) tell us? Based on the model, is there evidence to reject the null hypothesis that \(\overline{D}_{Dem}=\overline{D}_{Rep}\)? This task is worth 100 points. Before doing this, you should read “The US Deportation System: History, Impacts, and New Empirical Research” by Caitlin Patler and Bradford Jones.
remove.1 <- remove.1 %>%
mutate(PartyFactor = factor(Party, levels = c(0, 1), labels = c("Republican", "Democrat")))
model <- lm(Deportations ~ PartyFactor, data = remove.1)
summary(model)
##
## Call:
## lm(formula = Deportations ~ PartyFactor, data = remove.1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -118080 -87243 -71318 82297 306491
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 94800 21372 4.436 0.0000319 ***
## PartyFactorDemocrat 31043 31286 0.992 0.324
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 135200 on 73 degrees of freedom
## Multiple R-squared: 0.01331, Adjusted R-squared: -0.0002093
## F-statistic: 0.9845 on 1 and 73 DF, p-value: 0.3244
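With a single dummy predictor, the intercept is the mean of the reference group and the slope is the difference in group means, which is why the estimates above match the group means from the t-test (94,800 for Republicans and 94,800 + 31,043 = 125,843 for Democrats). The sketch below illustrates this with a small made-up data frame; `toy`, its values, and the other names are hypothetical and do not come from the deportations data.

```r
# Hypothetical toy data: y is an outcome, d is a 0/1 dummy
toy <- data.frame(
  y = c(10, 12, 14, 20, 22, 27),
  d = c(0, 0, 0, 1, 1, 1)
)

fit <- lm(y ~ d, data = toy)
b0 <- unname(coef(fit)[1])  # intercept
b1 <- unname(coef(fit)[2])  # slope on the dummy

# Group means computed directly
mean_0 <- mean(toy$y[toy$d == 0])
mean_1 <- mean(toy$y[toy$d == 1])

# The intercept equals the d = 0 mean; intercept + slope equals the d = 1 mean
all.equal(b0, mean_0)
all.equal(b0 + b1, mean_1)
```

The same logic applies to the model above: testing whether the slope is zero is the same question as testing whether the two group means are equal.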
This regression model gives an intercept of 94,800, which is the predicted number of deportations per year under Republican presidents, and a slope of 31,043, which is how much higher the predicted number is under a Democratic president (94,800 + 31,043 = 125,843). However, once again the slope is not statistically significant (p = 0.324), so there is no evidence to reject the null hypothesis that mean deportations are equal under the two parties.
### Task 4: Plot regression object
Using \(\textrm{plot_model}\) (from the \(\textrm{sjPlot}\) package), provide a professional-grade plot of the regression model along with an interpretation of the plot. Which hypothesis is the plot most consistent with? This task is worth 100 points.
model <- lm(Deportations ~ PartyFactor, data = remove.1)
predicted_df <- data.frame(
PartyFactor = factor(c("Republican", "Democrat"), levels = c("Republican", "Democrat"))
)
predicted_df$predicted <- predict(model, newdata = predicted_df)
ggplot(predicted_df, aes(x = PartyFactor, y = predicted, fill = PartyFactor)) +
geom_bar(stat = "identity", width = 0.6) +
labs(
title = "Predicted Deportations by President's Party",
x = "President's Party",
y = "Predicted Deportations"
) +
theme_minimal() +
theme(legend.position = "none")
In this graph, the predicted number of deportations is higher under Democratic presidents (about 125,843 per year) than under Republican presidents (about 94,800 per year). Although the bars make the gap look sizable, the regression estimates show that the difference is not statistically significant, so the plot is most consistent with the null hypothesis.
In the Patler and Jones article I asked you to read, they point out that several policies were enacted that made deportations easier to carry out. Among the most important of these policies was the Illegal Immigration Reform and Immigrant Responsibility Act of 1996 (IIRIRA). One prediction might be that after changes in the 1990s (like the IIRIRA), we should observe an increase in deportations starting in the 1990s. To assess this claim, do the following:
Create a well-labeled factor-level variable denoting each decade starting with the 1950s (1951-1960) going up to the 2010s (2011-2020) and then estimate a regression model treating the dependent variable (i.e., the number of deportations) as a function of the decade-factor level variable. Following this, plot the regression model using \(\textrm{plot_model}\). Provide a thorough interpretation of the regression model with a focus on the claims made in the paragraph above. Are the results consistent with the basic claim made? This task is worth 100 points.
# Check your dataset
print(head(remove.1))
## Year Apprehensions President Party PCGdp Decade Deportations VR
## 1 2022 2584220 Biden 1 77246.67 NA 108733 261387
## 2 2021 1865379 Biden 1 71055.88 NA 85783 178003
## 3 2020 609265 Trump 0 64317.40 2010 237364 167452
## 4 2019 1175841 Trump 0 65548.07 2010 347090 171120
## 5 2018 739486 Trump 0 63201.05 2010 327608 159958
## 6 2017 607677 Trump 0 60322.26 2010 284365 100452
## Administrative EnforcementReturns Criminal Noncriminal Title.42 Foreign.Born
## 1 180266 81121 63266 45467 1103966 46182177
## 2 128339 49664 61117 24666 1071074 45270103
## 3 113857 53595 119142 118222 206770 45101502
## 4 89719 81401 169898 177192 NA 44932901
## 5 72756 87202 148203 179405 NA 44728721
## 6 15072 85380 108519 175846 NA 44525855
## Naturalized Noncitizen Unauthorized.population US.Population App_lagged
## 1 24509131 21673046 NA 333287557 1865379
## 2 24044083 21226020 10500000 332031554 609265
## 3 23613500 21488002 10350000 331511512 1175841
## 4 23182917 21749984 10200000 328329953 739486
## 5 22629737 22098984 10500000 326838199 607677
## 6 21948732 22577123 10500000 325122128 683782
## PartyFactor
## 1 Democrat
## 2 Democrat
## 3 Republican
## 4 Republican
## 5 Republican
## 6 Republican
# Create Decade variable
remove.1 <- remove.1 %>%
mutate(Decade = case_when(
Year >= 1951 & Year <= 1960 ~ "1950s",
Year >= 1961 & Year <= 1970 ~ "1960s",
Year >= 1971 & Year <= 1980 ~ "1970s",
Year >= 1981 & Year <= 1990 ~ "1980s",
Year >= 1991 & Year <= 2000 ~ "1990s",
Year >= 2001 & Year <= 2010 ~ "2000s",
Year >= 2011 & Year <= 2020 ~ "2010s"
))
# Print to confirm variable added
print(table(remove.1$Decade))
##
## 1950s 1960s 1970s 1980s 1990s 2000s 2010s
## 10 10 10 10 10 10 10
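The counts above sum to 70, so five of the 75 fiscal years (1948-1950 and 2021-2022) fall outside the seven decades and are coded `NA`, which matches the five observations the regression later drops. As an alternative to `case_when()`, the same coding can be sketched with base R's `cut()`, whose intervals are right-closed by default, so 1951-1960 maps to "1950s". The names `years` and `decade` are illustrative.

```r
# Fiscal years covered by the data
years <- 1948:2022

# Breaks at 1950, 1960, ..., 2020 give the intervals (1950,1960], ..., (2010,2020]
decade <- cut(
  years,
  breaks = seq(1950, 2020, by = 10),
  labels = c("1950s", "1960s", "1970s", "1980s", "1990s", "2000s", "2010s")
)

table(decade)        # number of years assigned to each decade
sum(is.na(decade))   # number of years outside 1951-2020
```

`cut()` returns a factor with the levels already in chronological order, so no separate conversion step is needed under this approach.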
# Convert to factor
remove.1$Decade <- factor(remove.1$Decade, levels = c("1950s", "1960s", "1970s", "1980s", "1990s", "2000s", "2010s"))
# Run regression
model_decade <- lm(Deportations ~ Decade, data = remove.1)
# Output summary
summary(model_decade)
##
## Call:
## lm(formula = Deportations ~ Decade, data = remove.1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -112307 -8002 -742 7322 104976
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 15047 14387 1.046 0.299619
## Decade1960s -4927 20347 -0.242 0.809460
## Decade1970s 8974 20347 0.441 0.660666
## Decade1980s 8236 20347 0.405 0.687015
## Decade1990s 79603 20347 3.912 0.000227 ***
## Decade2000s 262426 20347 12.898 < 0.0000000000000002 ***
## Decade2010s 334623 20347 16.446 < 0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 45500 on 63 degrees of freedom
## (5 observations deleted due to missingness)
## Multiple R-squared: 0.9016, Adjusted R-squared: 0.8923
## F-statistic: 96.25 on 6 and 63 DF, p-value: < 0.00000000000000022
plot_model(model_decade, type = "est", show.values = TRUE, value.offset = 0.3) +
labs(
title = "Deportations by Decade (Relative to 1950s)",
x = "Decade",
y = "Estimated Deportations"
)
This model shows whether deportations have grown across the decades relative to the 1950s, when they averaged about 15,047 per year. The 1960s, 1970s, and 1980s are not significantly different from the 1950s, so deportations stayed low until the 1990s. The most notable change comes in the 1990s, the decade of the Illegal Immigration Reform and Immigrant Responsibility Act of 1996: deportations were about 79,603 per year higher than in the 1950s, and the gap grew to about 262,426 in the 2000s and 334,623 in the 2010s (all p < 0.001). These results are consistent with the claim that the policies enacted in the 1990s made it easier for people to be deported.
Create a dummy variable (or binary variable) coded 1 if the year is 1996 or later and 0 otherwise. Estimate a regression model treating deportations as a function of this dummy variable. Plot the regression model and provide a thorough substantive interpretation of the regression results. To start, what would be the null and alternative hypotheses for \(\beta_1\) given the research question? Suggested ways to interpret this would be to report the predicted number of deportations in the later period compared to the earlier period as well as discussing the coefficient showing the difference. You should tie your interpretation back to the regression estimates. This task is worth 100 points.
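Because the 1950s are the reference category, each decade's predicted mean is the intercept plus that decade's coefficient. The sketch below works through that arithmetic using the estimates copied from the regression output above; the object names are illustrative.

```r
# Coefficients copied from the decade regression output above
intercept <- 15047  # 1950s mean (reference category)
offsets <- c(
  "1950s" = 0, "1960s" = -4927, "1970s" = 8974, "1980s" = 8236,
  "1990s" = 79603, "2000s" = 262426, "2010s" = 334623
)

# Predicted mean deportations per year in each decade
predicted <- intercept + offsets
predicted

# Ratio of the 1990s mean to the 1980s mean
ratio_90s_80s <- predicted[["1990s"]] / predicted[["1980s"]]
round(ratio_90s_80s, 1)
```

Reading the predicted means side by side makes the timing of the increase easier to see than the raw coefficients do: the 1990s mean is roughly four times the 1980s mean.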
# Create a dummy variable for Post-1996 period
remove.1 <- remove.1 %>%
mutate(Post1996 = ifelse(Year >= 1996, 1, 0))
# Run regression
model_post1996 <- lm(Deportations ~ Post1996, data = remove.1)
# View regression summary
summary(model_post1996)
##
## Call:
## lm(formula = Deportations ~ Post1996, data = remove.1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -196855 -12510 -2139 14015 165799
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 20835 9385 2.22 0.0295 *
## Post1996 245700 15642 15.71 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 65020 on 73 degrees of freedom
## Multiple R-squared: 0.7717, Adjusted R-squared: 0.7686
## F-statistic: 246.7 on 1 and 73 DF, p-value: < 0.00000000000000022
plot_model(model_post1996, type = "est", show.values = TRUE, value.offset = 0.3) +
labs(
title = "Effect of Post-1996 Policy Change on Deportations",
x = "Policy Change Dummy (1 = Year ≥ 1996)",
y = "Estimated Deportations"
)
This graph shows the estimated difference in deportations between the period from 1996 onward, when the immigration act was in place, and the earlier period. The model predicts about 20,835 deportations per year before 1996 and about 266,535 per year from 1996 on (20,835 + 245,700). The coefficient of 245,700 is statistically significant (p < 0.001), so we reject the null hypothesis that \(\beta_1 = 0\). This is consistent with the idea that the immigration act made deportations easier and pushed the number higher.
Works cited
Pierce, Sarah. “The Obama Record on Deportations: Deporter in Chief or Not?” Migration Policy Institute, 2017. https://www.migrationpolicy.org/article/obama-record-deportations-deporter-chief-or-not
USCIS. “Post-9/11.” USCIS, 2025. https://www.uscis.gov/about-us/our-history/explore-agency-history/overview-of-agency-history/post-911
Cornell Law School. “Illegal Immigration Reform and Immigration Responsibility Act.” LII / Legal Information Institute, Cornell Law School, 2018. https://www.law.cornell.edu/wex/illegal_immigration_reform_and_immigration_responsibility_act
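One reasonable way to state the hypotheses for this task is \(H_0: \beta_1 = 0\) (no difference in mean deportations before and after 1996) against \(H_A: \beta_1 \neq 0\). The sketch below reproduces the predicted values and the t statistic from the estimates printed in the regression output above; the variable names are illustrative, and the two-sided alternative is an assumption about how the question is framed.

```r
# Estimates copied from the Post1996 regression output above
b0 <- 20835      # intercept: mean deportations per year before 1996
b1 <- 245700     # coefficient on Post1996
se_b1 <- 15642   # standard error of the coefficient
df_resid <- 73   # residual degrees of freedom

# Predicted deportations per year in each period
pred_before <- b0
pred_after <- b0 + b1

# t statistic and two-sided p-value for H0: beta_1 = 0
t_stat <- b1 / se_b1
p_value <- 2 * pt(-abs(t_stat), df = df_resid)

c(before = pred_before, after = pred_after, t = round(t_stat, 2))
```

The t statistic is simply the coefficient divided by its standard error, so a coefficient this large relative to its standard error leaves very little room for the null hypothesis.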