Final Project

Introduction

Research Question

Is there a statistically significant association between the type of agency reporting a hate crime (for example, Cities, Universities and Colleges, or Metropolitan Counties) and the bias motivation of that crime (for example, Race, Religion, or Sexual Orientation)?

Dataset Introduction & Reference

Study Name: FBI Hate Crime Statistics for 2013 (FBI 2013)

Dataset: The table presents data from agencies that reported one or more hate crime incidents in their jurisdictions during one or more quarters of 2013. The data are broken down by bias motivation and by quarter.

Sample Size: 1,827 observations

Variables: 15 variables

State, Agency type, Agency name, Race, Religion, Sexual orientation, Ethnicity, Disability, Gender, Gender Identity, 1st quarter, 2nd quarter, 3rd quarter, 4th quarter, Population

Key Variables: Agency type & Race, Religion, Sexual orientation, Ethnicity, Disability, Gender, Gender Identity

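Key Focus of this Study: This analysis asks whether agency type (such as Cities, Universities and Colleges, or Metropolitan Counties) is associated with the bias motivation of the hate crimes reported (Race, Religion, Sexual orientation, Ethnicity, Disability, Gender, Gender Identity). Because the data are observational counts, the test can show an association but not that agency type causes a particular kind of hate crime.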

Direct Link: https://github.com/emorisse/FBI-Hate-Crime-Statistics/blob/master/2013/table13.csv

Data Analysis

In this section, I prepare the dataset by summing the incident counts in each bias category (Race, Religion, Sexual Orientation, Ethnicity, Disability, Gender, and Gender Identity) within each reporting agency type. Missing values are ignored in the sums (na.rm = TRUE), which leaves one row of totals per agency type and allows a comparison between different law enforcement jurisdictions, such as Cities, Universities, and Counties. I then perform a Chi-Squared Test of Independence to determine whether there is a significant relationship between the type of agency and the categories of hate crimes reported. The visualization is a stacked bar chart showing the distribution of these motivations across all agency categories.

library(tidyverse)

# Load the dataset
hate_crimes <- read.csv("table13.csv")

# Clean and prepare the data
# Selecting all bias categories and the agency type
bias_columns <- c("Race", "Religion", "Sexual.orientation", "Ethnicity", 
                  "Disability", "Gender", "Gender.Identity")

# Aggregate counts for all bias categories by all available agency types
analysis_all <- hate_crimes %>%
  group_by(Agency.type) %>%
  summarise(across(all_of(bias_columns), ~sum(.x, na.rm = TRUE)))

# EDA Function 1
summary(analysis_all)

##  Agency.type             Race            Religion      Sexual.orientation
##  Length:6           Min.   :  23.00   Min.   : 16.00   Min.   :   6.0    
##  Class :character   1st Qu.:  43.25   1st Qu.: 17.75   1st Qu.:  16.5    
##  Mode  :character   Median :  88.00   Median : 29.00   Median :  35.5    
##                     Mean   : 478.67   Mean   :172.00   Mean   : 206.2    
##                     3rd Qu.: 295.50   3rd Qu.:128.00   3rd Qu.: 104.8    
##                     Max.   :2280.00   Max.   :783.00   Max.   :1022.0    
##    Ethnicity        Disability        Gender     Gender.Identity 
##  Min.   :  3.00   Min.   : 0.00   Min.   : 0.0   Min.   : 0.000  
##  1st Qu.:  8.75   1st Qu.: 0.25   1st Qu.: 0.0   1st Qu.: 0.000  
##  Median : 12.50   Median : 4.00   Median : 0.5   Median : 0.500  
##  Mean   :109.17   Mean   :13.83   Mean   : 3.0   Mean   : 5.167  
##  3rd Qu.: 92.00   3rd Qu.: 9.25   3rd Qu.: 2.5   3rd Qu.: 1.750  
##  Max.   :501.00   Max.   :65.00   Max.   :14.0   Max.   :28.000

# EDA Function 2
head(analysis_all)

## # A tibble: 6 × 8
##   Agency.type       Race Religion Sexual.orientation Ethnicity Disability Gender
##   <chr>            <int>    <int>              <int>     <int>      <int>  <int>
## 1 Cities            2280      783               1022       501         65     14
## 2 Metropolitan Co…   363      158                122       118         10      0
## 3 Nonmetropolitan…    83       16                 18        14          7      0
## 4 Other Agencies      23       20                  6         3          0      0
## 5 State Police Ag…    30       17                 16         8          0      1
## 6 Universities an…    93       38                 53        11          1      3
## # ℹ 1 more variable: Gender.Identity <int>

# Pivot data for visualization
long_data_all <- analysis_all %>%
  pivot_longer(cols = -Agency.type, names_to = "Bias_Motivation", values_to = "Count")

# Create Visualization
ggplot(long_data_all, aes(x = Agency.type, y = Count, fill = Bias_Motivation)) +
  geom_col() +  # same as geom_bar(stat = "identity"); bars are stacked by default
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Distribution of Bias Motivations by Agency Type",
       x = "Agency Type",
       y = "Total Incident Count",
       fill = "Bias Category")
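
Because Cities account for most of the incidents, raw stacked counts make the smaller agency types hard to compare. One possible complement, sketched here, rescales every bar to within-agency proportions. The matrix is retyped from the aggregated counts reported in this section; the object names (tab, props, share_df, p_share) are illustrative and not part of the original analysis.

```r
library(ggplot2)

# Aggregated counts by agency type, retyped from the summary table
tab <- matrix(
  c(2280, 783, 1022, 501, 65, 14, 28,
     363, 158,  122, 118, 10,  0,  2,
      83,  16,   18,  14,  7,  0,  0,
      23,  20,    6,   3,  0,  0,  0,
      30,  17,   16,   8,  0,  1,  0,
      93,  38,   53,  11,  1,  3,  1),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    Agency.type = c("Cities", "Metropolitan Counties", "Nonmetropolitan Counties",
                    "Other Agencies", "State Police Agencies",
                    "Universities and Colleges"),
    Bias_Motivation = c("Race", "Religion", "Sexual.orientation", "Ethnicity",
                        "Disability", "Gender", "Gender.Identity")))

# Row-wise proportions: each agency type sums to 1
props <- prop.table(tab, margin = 1)

# Long format with columns Agency.type, Bias_Motivation, Share
share_df <- as.data.frame(as.table(props), responseName = "Share")

p_share <- ggplot(share_df,
                  aes(x = Agency.type, y = Share, fill = Bias_Motivation)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Share of Bias Motivations within Each Agency Type",
       x = "Agency Type", y = "Share of Incidents", fill = "Bias Category")
p_share
```

With every bar the same height, differences in the mix of bias categories are visible even for agency types with few incidents.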

Statistical Analysis

Hypothesis:

\(H_0\): There is no association between the type of reporting agency and the specific bias motivation of the hate crime.

\(H_a\): There is an association between the type of reporting agency and the specific bias motivation of the hate crime.
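
For reference, the test compares each observed count with the count expected if agency type and bias motivation were independent. The symbols below (\(O_{ij}\), \(E_{ij}\), \(R_i\), \(C_j\), \(N\)) are notation introduced here for illustration; they do not appear in the original write-up.

\[
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
df = (r - 1)(c - 1)
\]

Here \(R_i\) is the total for agency type \(i\), \(C_j\) is the total for bias category \(j\), and \(N\) is the grand total. With \(r = 6\) agency types and \(c = 7\) bias categories, \(df = 5 \times 6 = 30\). As a worked case using the contingency table in this section: Cities recorded 4,693 incidents in total, and Race accounts for 2,872 of the 5,928 incidents overall, so

\[
E_{\text{Cities, Race}} = \frac{4693 \times 2872}{5928} \approx 2273.67 .
\]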

# Create the contingency table
# We exclude the first column (Agency.type names since it contains character strings) to create the matrix
contingency_table_all <- as.matrix(analysis_all[, -1])
rownames(contingency_table_all) <- analysis_all$Agency.type

# View the table
contingency_table_all

##                           Race Religion Sexual.orientation Ethnicity Disability
## Cities                    2280      783               1022       501         65
## Metropolitan Counties      363      158                122       118         10
## Nonmetropolitan Counties    83       16                 18        14          7
## Other Agencies              23       20                  6         3          0
## State Police Agencies       30       17                 16         8          0
## Universities and Colleges   93       38                 53        11          1
##                           Gender Gender.Identity
## Cities                        14              28
## Metropolitan Counties          0               2
## Nonmetropolitan Counties       0               0
## Other Agencies                 0               0
## State Police Agencies          1               0
## Universities and Colleges      3               1

# Perform Chi-Squared Test
chi_result <- chisq.test(contingency_table_all)
chi_result

## 
##  Pearson's Chi-squared test
## 
## data:  contingency_table_all
## X-squared = 103.67, df = 30, p-value = 4.854e-10

# Check expected counts and statistic
chi_result$expected

##                                 Race   Religion Sexual.orientation  Ethnicity
## Cities                    2273.66667 817.000000          979.29167 518.541667
## Metropolitan Counties      374.50337 134.570850          161.30246  85.410762
## Nonmetropolitan Counties    66.85830  24.024291           28.79656  15.247976
## Other Agencies              25.19298   9.052632           10.85088   5.745614
## State Police Agencies       34.88259  12.534413           15.02429   7.955466
## Universities and Colleges   96.89609  34.817814           41.73414  22.098516
##                           Disability     Gender Gender.Identity
## Cities                    65.7083333 14.2500000      24.5416667
## Metropolitan Counties     10.8230432  2.3471660       4.0423414
## Nonmetropolitan Counties   1.9321862  0.4190283       0.7216599
## Other Agencies             0.7280702  0.1578947       0.2719298
## State Police Agencies      1.0080972  0.2186235       0.3765182
## Universities and Colleges  2.8002699  0.6072874       1.0458839

chi_result$statistic

## X-squared 
##  103.6658
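
The expected counts above include many small values, and the chi-squared approximation is usually considered reliable only when expected counts are not too small (a common rule of thumb is at least 5 per cell). The sketch below shows two possible robustness checks, neither of which is part of the original analysis: a Monte Carlo p-value via the simulate.p.value argument of chisq.test(), and a rerun after merging the three sparsest categories. The names tab, n_low, mc, and collapsed, and the label Other.bias, are illustrative; the seed is arbitrary.

```r
# Aggregated counts by agency type, retyped from the contingency table
tab <- matrix(
  c(2280, 783, 1022, 501, 65, 14, 28,
     363, 158,  122, 118, 10,  0,  2,
      83,  16,   18,  14,  7,  0,  0,
      23,  20,    6,   3,  0,  0,  0,
      30,  17,   16,   8,  0,  1,  0,
      93,  38,   53,  11,  1,  3,  1),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    Agency.type = c("Cities", "Metropolitan Counties", "Nonmetropolitan Counties",
                    "Other Agencies", "State Police Agencies",
                    "Universities and Colleges"),
    Bias_Motivation = c("Race", "Religion", "Sexual.orientation", "Ethnicity",
                        "Disability", "Gender", "Gender.Identity")))

# Count cells whose expected value falls below 5
# (suppressWarnings only hides the approximation warning for this lookup)
expected <- suppressWarnings(chisq.test(tab))$expected
n_low <- sum(expected < 5)
n_low

# Check 1: Monte Carlo p-value, which does not rely on the
# large-sample chi-squared approximation
set.seed(2013)
mc <- chisq.test(tab, simulate.p.value = TRUE, B = 10000)
mc

# Check 2: merge Disability, Gender, and Gender.Identity into one column
# and rerun the test on the resulting 6 x 5 table
collapsed <- cbind(tab[, 1:4], Other.bias = rowSums(tab[, 5:7]))
chisq.test(collapsed)
```

If both checks still point to an association, the conclusion does not hinge on the sparse cells; if they disagree with the original test, the sparse categories deserve a closer look.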

Interpretation

With a p-value of 4.854e-10 (which is 0.0000000004854), the result is far below the standard alpha level of 0.05. Therefore, we reject the null hypothesis. This provides statistical evidence of an association between the type of reporting agency and the specific bias motivations of the hate crimes recorded. In other words, the mix of hate crime categories is not the same across different types of law enforcement jurisdictions. One caveat applies: 14 of the 42 expected counts are below 5, all in the Disability, Gender, and Gender Identity columns, so the chi-squared approximation behind this p-value may be inaccurate and the result should be read with some caution.

Universities and Colleges: The observed count for Sexual Orientation (53) was higher than the expected count (41.7), suggesting that these incidents make up a larger share of reports in campus environments than they would if bias motivation were independent of agency type.

Metropolitan Counties: This group showed higher than expected counts for Religion (158 observed vs. 134.6 expected) and Ethnicity (118 observed vs. 85.4 expected).

Cities: As the largest reporting group, Cities were very close to expected values but showed slightly higher than expected reporting for Gender Identity (28 observed vs. 24.5 expected).

Nonmetropolitan Counties: Interestingly, these agencies reported more Race-based bias (83 observed vs. 66.9 expected) and Disability-based bias (7 observed vs. 1.9 expected) than expected under independence. The Disability comparison rests on only 7 incidents, so it should be treated as tentative.
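
The cell-by-cell comparisons above are made by eye. One way to put them on a common scale, sketched here as a possible follow-up and not as part of the original analysis, is to inspect the standardized residuals that chisq.test() returns in its stdres component. Values beyond roughly plus or minus 2 are often treated as noteworthy, though that cutoff is a convention, not a formal test. The names tab, res, resid_df, and flagged are illustrative.

```r
# Aggregated counts by agency type, retyped from the contingency table
tab <- matrix(
  c(2280, 783, 1022, 501, 65, 14, 28,
     363, 158,  122, 118, 10,  0,  2,
      83,  16,   18,  14,  7,  0,  0,
      23,  20,    6,   3,  0,  0,  0,
      30,  17,   16,   8,  0,  1,  0,
      93,  38,   53,  11,  1,  3,  1),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    Agency.type = c("Cities", "Metropolitan Counties", "Nonmetropolitan Counties",
                    "Other Agencies", "State Police Agencies",
                    "Universities and Colleges"),
    Bias_Motivation = c("Race", "Religion", "Sexual.orientation", "Ethnicity",
                        "Disability", "Gender", "Gender.Identity")))

# The small-expected-count warning is suppressed here only to keep output short
res <- suppressWarnings(chisq.test(tab))

# Standardized residuals: (observed - expected) scaled by their standard error
round(res$stdres, 2)

# List the cells with |standardized residual| > 2, largest first
resid_df <- as.data.frame(as.table(res$stdres), responseName = "StdResid")
flagged <- subset(resid_df, abs(StdResid) > 2)
flagged[order(-abs(flagged$StdResid)), ]
```

Positive residuals mark cells with more incidents than independence would predict and negative residuals mark cells with fewer. Residuals in cells with very small expected counts are unstable and should be interpreted cautiously.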

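The Chi-squared statistic (103.67 on 30 degrees of freedom) is the quantity behind that p-value: under the null hypothesis it would be expected to be about 30, so the observed counts depart from the expected counts by more than chance alone would plausibly explain. With 5,928 incidents in the table, however, even modest departures can produce a very small p-value, so the statistic shows that an association is detectable, not that it is large. These results are consistent with different social or environmental factors in these jurisdictions, or with differences in how these agencies categorize and report crimes, leading to distinct patterns in hate crime data.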
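
Since the p-value alone says little about how strong the association is, an effect-size measure can help. The sketch below computes Cramér's V, defined as the square root of the chi-squared statistic divided by the sample size times the smaller of (rows - 1) and (columns - 1). This measure is an addition for illustration and was not part of the original analysis; the names tab, res, n, k, and cramers_v are hypothetical.

```r
# Aggregated counts by agency type, retyped from the contingency table
tab <- matrix(
  c(2280, 783, 1022, 501, 65, 14, 28,
     363, 158,  122, 118, 10,  0,  2,
      83,  16,   18,  14,  7,  0,  0,
      23,  20,    6,   3,  0,  0,  0,
      30,  17,   16,   8,  0,  1,  0,
      93,  38,   53,  11,  1,  3,  1),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    Agency.type = c("Cities", "Metropolitan Counties", "Nonmetropolitan Counties",
                    "Other Agencies", "State Police Agencies",
                    "Universities and Colleges"),
    Bias_Motivation = c("Race", "Religion", "Sexual.orientation", "Ethnicity",
                        "Disability", "Gender", "Gender.Identity")))

# The small-expected-count warning is suppressed here only to keep output short
res <- suppressWarnings(chisq.test(tab))

n <- sum(tab)              # total number of incidents in the table
k <- min(dim(tab)) - 1     # min(rows - 1, columns - 1)

# Cramér's V ranges from 0 (no association) to 1 (perfect association)
cramers_v <- sqrt(unname(res$statistic) / (n * k))
cramers_v
```

For this table V comes out at roughly 0.06, which by common rules of thumb indicates a weak association even though it is statistically detectable.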

Conclusion

The analysis of the 2013 FBI Hate Crime dataset indicates a statistically detectable relationship between the type of agency reporting a hate crime and the bias motivation recorded for it. Certain bias categories, such as Sexual Orientation in University settings or Race and Disability in Nonmetropolitan Counties, appear more frequently than would be expected if agency type and bias motivation were independent.

Implications:

These findings suggest that public policy and law enforcement training might need to be tailored to the specific needs of a jurisdiction. For instance, university police might benefit from more resources focused on LGBTQ+ safety, while rural or nonmetropolitan agencies might see a greater need for resources addressing racial and disability-based bias. Because the data reflect reported incidents only, and some of the counts involved are small, these suggestions are tentative.

Future Directions:

Future research could incorporate the Population variable to see if the reporting rates per capita vary by agency type. Additionally, comparing this 2013 data to more recent years could reveal whether these institutional patterns have shifted over the last decade.

References

Morisse, E. (2014, October 23). table13.csv [Data set]. GitHub. https://github.com/emorisse/FBI-Hate-Crime-Statistics/blob/master/2013/table13.csv

Federal Bureau of Investigation. (2013). Hate Crime Statistics, 2013: Table 13. U.S. Department of Justice. https://ucr.fbi.gov/hate-crime/2013/tables/13tabledatadecide_pdf