Undocumented Crime

Author

Allan Maino Vieytes

Texas Border: (Qian Weizhong/VCG/Newscom)

Intro Essay

The data come from the Texas Computerized Criminal History (CCH) system, and the classifications come from the Texas Department of Public Safety (DPS). The data sets themselves come from research conducted at the University of Wisconsin-Madison and authored by Michael Light, Jingying He, and Jason Robey. The main variables I will be using are: year, category (type of crime), citizen_crime_rate (US citizen crime rate), immigrant_crime_rate (legal immigrant and naturalized citizen crime rate), illegal2_immigrants_crime_rate (undocumented immigrant crime rate), and total_crime_rate (total crime rate). The question I would like to answer with this data is whether undocumented immigrants commit more crime than US citizens (specifically sexual assault, homicide, and drug-related offenses). I chose this topic because of the incessant anti-immigrant posturing from the US conservative camp. Specifically, it is this quote that drove me to use this data: “When Mexico sends its people, they’re not sending their best. […] They’re sending people that have lots of problems, and they’re bringing those problems with us. They’re bringing drugs. They’re bringing crime. They’re rapists. And some, I assume, are good people”. The Texas CCH collects its data through police agency reports.

Libraries & Data-Sets

library(haven)
library(tidyverse)
Warning: package 'ggplot2' was built under R version 4.3.3
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(highcharter)
Warning: package 'highcharter' was built under R version 4.3.3
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 
library(gganimate)
Warning: package 'gganimate' was built under R version 4.3.3
library(viridis)
Loading required package: viridisLite
library(gifski)
Warning: package 'gifski' was built under R version 4.3.3
library(wesanderson)
Warning: package 'wesanderson' was built under R version 4.3.3
library(ggpattern)
Warning: package 'ggpattern' was built under R version 4.3.3
library(GGally)
Warning: package 'GGally' was built under R version 4.3.3
Registered S3 method overwritten by 'GGally':
  method from   
  +.gg   ggplot2
 Allmis = read_dta("E:/data-110/Allmis.dta")
write.csv(Allmis, file = "Allmis.csv")
 big_category = read_dta("E:/data-110/big_category.dta")
write.csv(big_category, file = "big_category.csv")
 detailed_category = read_dta("E:/data-110/detailed_category.dta")
write.csv(detailed_category, file = "detailed_category.csv")

Cleanup

det_cat <- detailed_category %>%
  filter( category %in% c( "4", "3", "9", "6", "11", "10" ), year == 2018) # keeps 3 "Sexual Assault", 4 "Robbery", 6 "Theft", 9 "Burglary", 10 "Assault", 11 "Homicide" (codes as recoded below), for 2018 only

df_new <- det_cat %>%
  pivot_longer( cols = 7:9,
                names_to = "immigration_type",
                values_to = "number_of")

df_new$category <- ifelse( df_new$category == 4, "Robbery",
                   ifelse( df_new$category == 3, "Sexual Assault",
                   ifelse( df_new$category == 9, "Burglary",
                   ifelse( df_new$category == 10, "Assault",
                   ifelse( df_new$category == 11, "Homicide",
                   ifelse( df_new$category == 6, "Theft", df_new$category)))))) # changes category numbers to their corresponding names
df_new$number_of<-round( df_new$number_of, 1)
big_cat <- big_category %>%
  filter( category %in% c( "1", "2", "3", "4" )) # keeps 1 "Violent Crime", 2 "Property Crime", 3 "Drug Violations", 4 "Traffic Violations" (codes as recoded below)

big_new <- big_cat

big_new$category <- ifelse( big_new$category == 1, "Violent Crime",
                    ifelse( big_new$category == 2, "Property Crime",
                    ifelse( big_new$category == 3, "Drug Violations",
                    ifelse( big_new$category == 4, "Traffic Violations", big_new$category)))) # changes category numbers to their corresponding names
big_new <- big_new %>% 
  filter(category == "Violent Crime")
big_new$illegal2_immigrants_crime_rate<-round( big_new$illegal2_immigrants_crime_rate, 1)
big_new$total_crime_rate<-round( big_new$total_crime_rate, 1)
big_new$citizen_crime_rate<-round( big_new$citizen_crime_rate, 1)
big_new$immigrant_crime_rate<-round( big_new$immigrant_crime_rate, 1)
Allmis_new <- Allmis %>%
  pivot_longer( cols = 7:10,
                names_to = "immigration_type",
                values_to = "number_of")
Allmis_new$number_of<-round(Allmis_new$number_of,1)
head(df_new)
# A tibble: 6 × 17
   year category    total_charge_incidents undocumented_immigra…¹ citizen_charge
  <dbl> <chr>                        <dbl>                  <dbl>          <dbl>
1  2018 Sexual Ass…                   6032                    278           4607
2  2018 Sexual Ass…                   6032                    278           4607
3  2018 Sexual Ass…                   6032                    278           4607
4  2018 Robbery                       7175                     80           6502
5  2018 Robbery                       7175                     80           6502
6  2018 Robbery                       7175                     80           6502
# ℹ abbreviated name: ¹​undocumented_immigrants_charge
# ℹ 12 more variables: immigrants_charge <dbl>, total_crime_rate <dbl>,
#   sum_cri_cit <dbl>, illegal2_immigrants_crime_total <dbl>,
#   citizen_crime_total <dbl>, immigrant_crime_total <dbl>, tot_citizen <dbl>,
#   pop_undoc <dbl>, tot_legal2_immi <dbl>, tot_pop <dbl>,
#   immigration_type <chr>, number_of <dbl>
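
The nested ifelse() chain in the cleanup step can also be written as a named lookup vector, which keeps each code next to its label. The following is only a minimal sketch in base R: the code-to-name mapping is copied from the ifelse() chain above, while the `codes` vector is a hypothetical stand-in for `df_new$category`.

```r
# Code-to-name mapping copied from the ifelse() chain above
crime_labels <- c("3"  = "Sexual Assault",
                  "4"  = "Robbery",
                  "6"  = "Theft",
                  "9"  = "Burglary",
                  "10" = "Assault",
                  "11" = "Homicide")

# Hypothetical category codes, for illustration only
codes <- c(3, 4, 11, 6)

# Look each code up by its character form; unname() drops the code names
recoded <- unname(crime_labels[as.character(codes)])
recoded
# [1] "Sexual Assault" "Robbery"        "Homicide"       "Theft"
```

A code that is missing from the lookup comes back as NA, which makes unmapped categories easy to spot.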
subset_data <- big_new[, 3:8]

# Use ggpairs with the subsetted data
ggpairs(subset_data)

subset_data1 <- big_new[, 5:8]

# Use ggpairs with the subsetted data
ggpairs(subset_data1)

subset_data2 <- big_new[, 8:10]

# Use ggpairs with the subsetted data
ggpairs(subset_data2)

cor( big_new$immigrant_crime_rate, big_new$citizen_crime_rate, use = "complete.obs" ) #Provides the correlation Coefficient 
[1] -0.2470176
fit1 <- lm( immigrant_crime_rate ~ citizen_crime_rate, data = big_new) #Fits the LR Model
summary(fit1) # Summary of the model

Call:
lm(formula = immigrant_crime_rate ~ citizen_crime_rate, data = big_new)

Residuals:
       1        2        3        4        5        6        7 
 17.1063   9.9632  -7.7148 -10.0210   0.7200  -9.3941  -0.6596 
attr(,"label")
[1] "(firstnm) immi_rate_cit_violent"
attr(,"format.stata")
[1] "%9.0g"

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)        269.2654   146.4648   1.838    0.125
citizen_crime_rate  -0.3921     0.6878  -0.570    0.593

Residual standard error: 11.32 on 5 degrees of freedom
Multiple R-squared:  0.06102,   Adjusted R-squared:  -0.1268 
F-statistic: 0.3249 on 1 and 5 DF,  p-value: 0.5933

Analysis

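R-squared = 0.06102 (Adjusted R-squared = -0.1268). The model explains ~6.1% of the total variation in the dependent variable, leaving ~93.9% of the variation unexplained. The negative adjusted value indicates that, with only 7 observations, the model fits no better than an intercept-only model.

Cor-Co: -0.2470176. A correlation coefficient of -0.2470176 suggests a weak negative correlation between the citizen and immigrant violent crime rates.

P-Val: 0.5933. There is not enough evidence to reject the null hypothesis that the slope is zero.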

\[Y_{i}=\beta_{0}+\beta_{1}X_{i}\] Where \(Y_{i}\) is the Immigrant Crime Rate of the \(i^{th}\) observation, \(\beta_{0}\) is the intercept, \(\beta_{1}\) is the slope, and \(X_{i}\) is the Citizen Crime Rate of the \(i^{th}\) observation. \[Y_{i}=269.265+(-0.3921)X_{i}\]
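
The R-squared figures printed by summary(fit1) can be cross-checked by hand from the correlation coefficient. This is a small sketch using only numbers reported in the output above: r from cor(), 7 observations (the seven residuals), and one predictor. The variable names are my own.

```r
# Values copied from the cor() and summary(fit1) output above
r <- -0.2470176  # correlation coefficient
n <- 7           # number of observations (residuals 1 to 7)
p <- 1           # number of predictors (citizen_crime_rate)

# In simple linear regression, R-squared is the squared correlation
r_squared <- r^2

# Adjusted R-squared penalizes for the number of predictors
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

round(r_squared, 5)      # 0.06102
round(adj_r_squared, 4)  # -0.1268
```

Both values agree with the "Multiple R-squared" and "Adjusted R-squared" lines of the model summary.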

Visualizations

P1

df_new$immigration_type <- factor(df_new$immigration_type, levels = c("citizen_crime_rate", "immigrant_crime_rate", "illegal2_immigrants_crime_rate"))

p.1 <- ggplot(data = df_new, aes(
  x = immigration_type,
  y = number_of,
  fill = immigration_type),
  color = "black") +
  geom_col(
           lwd = 2,
           width = 1) +
  geom_text(aes(label = number_of),
            position = position_stack(vjust = 0), hjust = -0.2,
            color = "black", size = 3, family = "mono") +
  coord_flip() +
  facet_wrap( ~ category, ncol = 2, dir = "v") +  # Facet by the "category" variable
  theme_linedraw() +
  labs( title = "Unveiling Crime Trends: Natives vs. Undocumented \n Year: 2018",  # Labels title
        caption = "Source: Texas Computerized Criminal History (CCH)" ) +
  theme(
        axis.text = element_blank(),   # Remove axis text
        axis.title = element_blank(),  # Remove axis labels
        axis.ticks = element_blank(),  # Remove axis ticks    
        aspect.ratio = 0.285, # Made the overall size of the visualization smaller
        panel.spacing.x = unit(8, "lines"),
        legend.position = c( .5, .53), # Changes legend position
        legend.direction = "vertical" ,
        legend.background = element_rect( color = "black", fill = "#24868eff" ), # legend background
        legend.box.background = element_rect( color = "#24868eff" ),
        legend.title = element_text( color = "white", face = "bold", family = "mono"),
        legend.text = element_text( family = "mono", face = "bold", size = 10),
        plot.background = element_rect( fill = "#E5E4E2"),
        plot.title = element_text( family = "mono", # Changes font family of Title
                                   size = 17, # Changes size of Title
                                   face = "bold", # Boldens Title
                                   hjust = 0.5), # Centers the Title to the Plots
        plot.caption = element_text( hjust = 0.55, # Centers the caption to the plots
                                    face = "italic" ), # Italicizes the caption
        panel.background = element_rect(fill = "white"),  # Change panel background color
        strip.background = element_rect( color = "black", 
                                         fill = "#24868eff",
                                         linetype="solid"), # Colored the name plates
        strip.text = element_text( family = "mono", # Changes font family of Panel Titles
                                   size = 12, # Changes size of Panel Titles
                                   face = "bold")) +  # Boldens Panel Titles) +  
  scale_fill_manual(values = wes_palette("Zissou1", n = 4),
                    name = "Crime Rates \nPer 100,000",
                    labels = c("US Citizens", "Legal \nImmigrants", "Undocumented \nImmigrants"))
Warning: A numeric `legend.position` argument in `theme()` was deprecated in ggplot2
3.5.0.
ℹ Please use the `legend.position.inside` argument of `theme()` instead.
p.1

This visualization depicts crime rates (per 100,000) for six major crimes in 2018. The crime rates are broken down by US citizens, legal immigrants, and undocumented immigrants. From the graph we can discern that US citizens commit these crimes at higher rates than legal or undocumented immigrants. This finding goes against popular right-wing narratives about immigrant crime.
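
For readers unfamiliar with "per 100,000" rates, the general calculation is the number of incidents divided by the population of the group, multiplied by 100,000. The sketch below assumes the rates in this data set follow that standard definition (I have not verified how the original authors computed them); the function name and the numbers are hypothetical and do not come from the CCH data.

```r
# Standard rate-per-100,000 calculation (assumed, not taken from the source data)
rate_per_100k <- function(incidents, population) {
  incidents / population * 1e5
}

# Hypothetical example: 250 incidents in a group of 1,000,000 people
example_rate <- rate_per_100k(250, 1e6)
example_rate
# [1] 25
```

Expressing counts this way is what allows groups of very different sizes to be compared on the same axis.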

P3

colors <- c( "steelblue", "black", "red", "orange", "blue" ) #color choices for points
my_own_theme <- hc_theme(
  colors = c("red", "green", "blue"),
  chart = list(
    backgroundColor = NULL,
    divBackgroundImage = "https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExcTFwMDl6eXlqbWhuNmMwaW5nMTkxZGxqc3pueWd4Z3R3N29zNjF1ZyZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/l1J3Ll1FWR0iqoeUo/giphy.gif"
  ),
  title = list(
    style = list(
      color = "#333333",
      fontFamily = "Lato"
    )
  ),
  subtitle = list(
    style = list(
      color = "#666666",
      fontFamily = "Lato"
    )
  ),
  legend = list(
    itemStyle = list(
      fontFamily = "Lato",
      color = "black"
    ),
    itemHoverStyle = list(
      color = "gray"
    )
  )
)
highchart() |>
 
  hc_add_series(data = big_new$citizen_crime_rate,     #adds US citizen crime rate to chart
                name = "US Citizen Crime Rate",
                type = "area",
                yAxis = 0) |>
  hc_add_series(data = big_new$total_crime_rate, #adds total crime rate to chart
                name = "Total Crime Rate",
                type = "area",
                yAxis = 0) |>
  hc_add_series(data = big_new$immigrant_crime_rate,      #adds legal immigrant crime rate to chart
                name = "Legal Immigrant Crime Rate",
                type = "area",
                yAxis = 0) |>
   hc_add_series(data = big_new$illegal2_immigrants_crime_rate,          #adds undocumented immigrant crime rate to chart
                name = "Undocumented Crime Rate",
                type = "area",
                yAxis = 0) |>
  hc_xAxis(categories = big_new$year, # adds x-axis
           tickInterval = 1) |>
  hc_xAxis(title = list(text="Year")) |> # adds x-axis title
  hc_yAxis(
        title = list(text = "Crime Rates per 100,000"),
        min = 50,
        max = 300
      ) |>
  hc_title(text = "Violent Crime Rate Disparities: A Disturbing Trend ") |> # adds title
   hc_subtitle(text = "Rates are per 100,000") |>
  hc_colors(colors) |> #adds colors
  hc_chart(style = list(fontFamily = "Georgia", # boldens and changes font to Georgia
                        fontWeight = "bold")) |>
  hc_legend(verticalAlign = "top", # places legend on top of chart and changes its orientation to horizontal
            layout = "horizontal") |> 
  hc_add_theme(my_own_theme) # applies the custom theme, including the background GIF

This graph compares violent crime rates among US citizens, legal immigrants, undocumented immigrants, and the total population. It seems that US citizens commit more violent crime than legal/naturalized immigrants and undocumented immigrants. The GIF in the background is supposed to represent the U.S. border.

P2

p.2 <- ggplot( data = Allmis_new, # Loaded Allmis_new into ggplot
              mapping = aes( x = year, #  Applied year to the X-axis
                            y = number_of , # Applied crime rate to the Y-axis
                            fill = category )) + # Applied crime category to fill in the area
    geom_area ( color = wes_palette( "GrandBudapest2", 1 ) ) + # Adds the area plot layer
    facet_wrap( ~ factor( immigration_type, levels = c( "citizen_crime_rate", "immigrant_crime_rate", "illegal2_immigrants_crime_rate", "total_crime_rate"))) + # levels allows the ordering
    labs( title = "Misdemeanors vs. Felonies", # Labels title
        x = "Year (2012-18)", # Labels x axis
        y = "Crime Rate (per 100,000)", # Labels y axis
        caption = "Source: Texas Computerized Criminal History (CCH)" ) + # Adds Source
    theme_linedraw() + # sets the base theme for the graphs
    theme(
        aspect.ratio =0.8, # Made the overall size of the visualization smaller
        axis.title.x = element_text( size=14 ), # Changes size of X-axis Label
        axis.title.y = element_text( size=14 ), # Changes size of Y-axis Label
        axis.text = element_text( size = 9 ), # changes axes text sizes
        legend.background = element_blank(), # Makes the background of the legend box blank
        legend.box.background = element_rect( color = "black" ),
        legend.position = c( .768, 0.305 ), # Changes legend position
        legend.title = element_text( size = 8.5, face = "bold" ), # Changes the legend title text size
        legend.text=element_text( size = 8.5 ),# Changes the legend text size
        plot.title = element_text( size = 17, # Changes size of Title
                                   face = "bold", # Boldens Title
                                   hjust = 0.5 ), # Centers the Title to the Plots
        plot.caption = element_text( hjust = 0.5, # Centers the caption to the plots
                                    face = "italic" ), # Italicizes the caption
        panel.spacing = unit( 1, "lines" ), # Spreads out the facets plots
        strip.background = element_rect( color = "black", fill = wes_palette( "Chevalier1" ), linetype="solid" )) # Colored the name plates above each graph with Wes Anderson color pallet
        
p.2

ggplot(data = df_new, aes(x = immigration_type, y = number_of, fill = immigration_type)) +
  geom_col() +
  coord_flip() +
  facet_grid(rows = vars(category)) +
  theme_void() # complete theme; it already blanks the axis titles, text, and ticks

Final Thoughts

Even before making these visualizations, it was clear to me that the popular right-wing claims that immigrants are more likely than the average red-blooded American to commit crime, use and sell drugs, and commit sexual crimes and homicides are false red herrings brought about to galvanize the masses. These visualizations show that, in this Texas data, US citizens commit more crime per capita than undocumented and naturalized immigrants. It also seems as though the more you assimilate into American culture, the more likely you may be to commit these acts. This is to say that crime may be a systemic issue in the US and not one of state and non-state actors. In an AP News article, Will Weissert and Jill Colvin examine how former President Donald Trump’s rhetoric on immigration, characterized by fear-mongering and portraying migrants as criminals, has resonated with certain segments of the population. The article goes on to discuss the use of alarming language and imagery to convey the perception of immigrants as a threat to American society. That work reflects sentiments similar to the ones I take from the visualizations I have created. Overall, I feel I could have experimented a lot more with the visualizations and with how I represented the data. In retrospect, I also found that I very much enjoy creating geospatial visualizations. Nevertheless, I very much enjoyed this project and learned some things along the way, most notably how to incorporate any GIF or image behind my visualizations in highcharter.