Is there a statistically significant association between the type of agency reporting a hate crime (such as Cities, Universities, or Metropolitan Counties) and the bias motivation of that crime (such as Race, Religion, or Sexual Orientation)?
Study Name: FBI Hate Crime Statistics for 2013 (FBI 2013)
Dataset: Data from agencies that reported one or more hate crime incidents in their jurisdictions during at least one quarter of 2013. Counts are broken down by bias motivation and by quarter.
Sample Size: 1827 samples
Variables: 15 variables
State, Agency type, Agency name, Race, Religion, Sexual orientation, Ethnicity, Disability, Gender, Gender Identity, 1st quarter, 2nd quarter, 3rd quarter, 4th quarter, Population
Key Variables: Agency type & Race, Religion, Sexual orientation, Ethnicity, Disability, Gender, Gender Identity
Key Focus of this Study: This analysis focuses on whether agency type (such as Cities, Universities, or Metropolitan Counties) is associated with the bias motivation of the hate crimes reported (Race, Religion, Sexual orientation, Ethnicity, Disability, Gender, Gender Identity).
Direct Link: https://github.com/emorisse/FBI-Hate-Crime-Statistics/blob/master/2013/table13.csv
In this section, I prepare the dataset by summing the counts for each bias category (Race, Religion, Sexual Orientation, Ethnicity, Disability, Gender, and Gender Identity) within each reporting agency type, ignoring missing values. The result is one row per agency type (Cities, Universities, Counties, and so on), which allows a comparison across law enforcement jurisdictions. I then run a Chi-Squared Test of Independence to determine whether there is a significant relationship between agency type and the categories of hate crimes reported. A stacked bar chart illustrates the distribution of bias motivations across agency types.
library(tidyverse)
# Load the dataset
hate_crimes <- read.csv("table13.csv")
# Clean and prepare the data
# Selecting all bias categories and the agency type
bias_columns <- c("Race", "Religion", "Sexual.orientation", "Ethnicity",
                  "Disability", "Gender", "Gender.Identity")
# Aggregate counts for all bias categories by all available agency types
analysis_all <- hate_crimes %>%
  group_by(Agency.type) %>%
  summarise(across(all_of(bias_columns), ~sum(.x, na.rm = TRUE)))
# EDA Function 1
summary(analysis_all)
##  Agency.type             Race            Religion      Sexual.orientation
##  Length:6           Min.   :  23.00   Min.   : 16.00   Min.   :   6.0
##  Class :character   1st Qu.:  43.25   1st Qu.: 17.75   1st Qu.:  16.5
##  Mode  :character   Median :  88.00   Median : 29.00   Median :  35.5
##                     Mean   : 478.67   Mean   :172.00   Mean   : 206.2
##                     3rd Qu.: 295.50   3rd Qu.:128.00   3rd Qu.: 104.8
##                     Max.   :2280.00   Max.   :783.00   Max.   :1022.0
##    Ethnicity        Disability        Gender     Gender.Identity
##  Min.   :  3.00   Min.   : 0.00   Min.   : 0.0   Min.   : 0.000
##  1st Qu.:  8.75   1st Qu.: 0.25   1st Qu.: 0.0   1st Qu.: 0.000
##  Median : 12.50   Median : 4.00   Median : 0.5   Median : 0.500
##  Mean   :109.17   Mean   :13.83   Mean   : 3.0   Mean   : 5.167
##  3rd Qu.: 92.00   3rd Qu.: 9.25   3rd Qu.: 2.5   3rd Qu.: 1.750
##  Max.   :501.00   Max.   :65.00   Max.   :14.0   Max.   :28.000
# EDA Function 2
head(analysis_all)
## # A tibble: 6 × 8
## Agency.type Race Religion Sexual.orientation Ethnicity Disability Gender
## <chr> <int> <int> <int> <int> <int> <int>
## 1 Cities 2280 783 1022 501 65 14
## 2 Metropolitan Co… 363 158 122 118 10 0
## 3 Nonmetropolitan… 83 16 18 14 7 0
## 4 Other Agencies 23 20 6 3 0 0
## 5 State Police Ag… 30 17 16 8 0 1
## 6 Universities an… 93 38 53 11 1 3
## # ℹ 1 more variable: Gender.Identity <int>
# Pivot data for visualization
long_data_all <- analysis_all %>%
  pivot_longer(cols = -Agency.type, names_to = "Bias_Motivation", values_to = "Count")
# Create Visualization
ggplot(long_data_all, aes(x = Agency.type, y = Count, fill = Bias_Motivation)) +
  geom_bar(stat = "identity", position = "stack") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Distribution of Bias Motivations by Agency Type",
       x = "Agency Type",
       y = "Total Incident Count",
       fill = "Bias Category")
Hypothesis:
\(H_0\): There is no association between the type of reporting agency and the specific bias motivation of the hate crime.
\(H_a\): There is an association between the type of reporting agency and the specific bias motivation of the hate crime.
# Create the contingency table
# We exclude the first column (Agency.type names since it contains character strings) to create the matrix
contingency_table_all <- as.matrix(analysis_all[, -1])
rownames(contingency_table_all) <- analysis_all$Agency.type
# View the table
contingency_table_all
## Race Religion Sexual.orientation Ethnicity Disability
## Cities 2280 783 1022 501 65
## Metropolitan Counties 363 158 122 118 10
## Nonmetropolitan Counties 83 16 18 14 7
## Other Agencies 23 20 6 3 0
## State Police Agencies 30 17 16 8 0
## Universities and Colleges 93 38 53 11 1
## Gender Gender.Identity
## Cities 14 28
## Metropolitan Counties 0 2
## Nonmetropolitan Counties 0 0
## Other Agencies 0 0
## State Police Agencies 1 0
## Universities and Colleges 3 1
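Because Cities account for most of the counts, the raw table (and a stacked bar chart of raw counts) makes it hard to compare the mix of bias categories in the smaller agency types. A minimal sketch of one way to see that mix is to convert each row to proportions. The matrix `tab` below is retyped from the contingency table printed above so that the snippet runs on its own; in the analysis itself `contingency_table_all` could be used directly. The names `tab` and `row_share` are illustrative.

```r
# Counts retyped from the printed contingency table (rows: agency types).
agency <- c("Cities", "Metropolitan Counties", "Nonmetropolitan Counties",
            "Other Agencies", "State Police Agencies", "Universities and Colleges")
bias <- c("Race", "Religion", "Sexual.orientation", "Ethnicity",
          "Disability", "Gender", "Gender.Identity")
tab <- matrix(c(2280, 783, 1022, 501, 65, 14, 28,
                 363, 158,  122, 118, 10,  0,  2,
                  83,  16,   18,  14,  7,  0,  0,
                  23,  20,    6,   3,  0,  0,  0,
                  30,  17,   16,   8,  0,  1,  0,
                  93,  38,   53,  11,  1,  3,  1),
              nrow = 6, byrow = TRUE, dimnames = list(agency, bias))

# Share of each bias category within each agency type (each row sums to 1).
row_share <- prop.table(tab, margin = 1)

# Print as percentages for easier reading.
round(100 * row_share, 1)
```

Under the null hypothesis of independence, every row of this table would show roughly the same percentages, so differences between rows are what the test is picking up.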
# Perform Chi-Squared Test
chi_result <- chisq.test(contingency_table_all)
chi_result
##
## Pearson's Chi-squared test
##
## data: contingency_table_all
## X-squared = 103.67, df = 30, p-value = 4.854e-10
# Check expected counts and statistic
chi_result$expected
## Race Religion Sexual.orientation Ethnicity
## Cities 2273.66667 817.000000 979.29167 518.541667
## Metropolitan Counties 374.50337 134.570850 161.30246 85.410762
## Nonmetropolitan Counties 66.85830 24.024291 28.79656 15.247976
## Other Agencies 25.19298 9.052632 10.85088 5.745614
## State Police Agencies 34.88259 12.534413 15.02429 7.955466
## Universities and Colleges 96.89609 34.817814 41.73414 22.098516
## Disability Gender Gender.Identity
## Cities 65.7083333 14.2500000 24.5416667
## Metropolitan Counties 10.8230432 2.3471660 4.0423414
## Nonmetropolitan Counties 1.9321862 0.4190283 0.7216599
## Other Agencies 0.7280702 0.1578947 0.2719298
## State Police Agencies 1.0080972 0.2186235 0.3765182
## Universities and Colleges 2.8002699 0.6072874 1.0458839
chi_result$statistic
## X-squared
## 103.6658
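The chi-squared approximation is usually considered reliable only when expected counts are not too small; a common rule of thumb asks for no more than about 20% of cells below 5 and none below 1. The sketch below counts how many expected cells fall under those thresholds. It rebuilds the table from the printed counts (`tab` is an illustrative name) and computes the expected counts directly from the row and column totals, which matches `chi_result$expected`.

```r
# Counts retyped from the printed contingency table (rows: agency types).
agency <- c("Cities", "Metropolitan Counties", "Nonmetropolitan Counties",
            "Other Agencies", "State Police Agencies", "Universities and Colleges")
bias <- c("Race", "Religion", "Sexual.orientation", "Ethnicity",
          "Disability", "Gender", "Gender.Identity")
tab <- matrix(c(2280, 783, 1022, 501, 65, 14, 28,
                 363, 158,  122, 118, 10,  0,  2,
                  83,  16,   18,  14,  7,  0,  0,
                  23,  20,    6,   3,  0,  0,  0,
                  30,  17,   16,   8,  0,  1,  0,
                  93,  38,   53,  11,  1,  3,  1),
              nrow = 6, byrow = TRUE, dimnames = list(agency, bias))

# Expected count under independence: row total * column total / grand total.
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# How many cells have small expected counts?
n_below_5 <- sum(expected < 5)
n_below_1 <- sum(expected < 1)
share_below_5 <- mean(expected < 5)

c(cells = length(expected), below_5 = n_below_5, below_1 = n_below_1)
round(share_below_5, 2)
```

The small cells are concentrated in the Disability, Gender, and Gender Identity columns for the smaller agency types, which is where the approximation is weakest.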
With a p-value of 4.854e-10 (which is 0.0000000004854), far below the conventional alpha level of 0.05, we reject the null hypothesis. This is strong statistical evidence of an association between the type of reporting agency and the bias motivations of the hate crimes recorded. In other words, the mix of hate crime categories is not the same across the different types of law enforcement jurisdictions. One caveat applies: 14 of the 42 expected counts are below 5 and 8 are below 1, mostly in the Disability, Gender, and Gender Identity columns for the smaller agency types, so the chi-squared approximation behind this p-value is less reliable than the raw figure suggests.
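Since several expected counts in this table are very small, one way to check that the conclusion does not depend on the chi-squared approximation is a Monte Carlo p-value, which `chisq.test()` supports through its `simulate.p.value` and `B` arguments. This is a sketch of a robustness check, not part of the original analysis; the seed and the number of replicates are arbitrary choices, and `tab` is rebuilt from the printed counts so the snippet runs on its own.

```r
# Counts retyped from the printed contingency table (rows: agency types).
agency <- c("Cities", "Metropolitan Counties", "Nonmetropolitan Counties",
            "Other Agencies", "State Police Agencies", "Universities and Colleges")
bias <- c("Race", "Religion", "Sexual.orientation", "Ethnicity",
          "Disability", "Gender", "Gender.Identity")
tab <- matrix(c(2280, 783, 1022, 501, 65, 14, 28,
                 363, 158,  122, 118, 10,  0,  2,
                  83,  16,   18,  14,  7,  0,  0,
                  23,  20,    6,   3,  0,  0,  0,
                  30,  17,   16,   8,  0,  1,  0,
                  93,  38,   53,  11,  1,  3,  1),
              nrow = 6, byrow = TRUE, dimnames = list(agency, bias))

set.seed(2013)  # arbitrary seed, only for reproducibility

# Simulate tables with the same row and column totals and compare their
# chi-squared statistics with the observed one.
sim_result <- chisq.test(tab, simulate.p.value = TRUE, B = 10000)
sim_result$statistic
sim_result$p.value
```

With `B = 10000` the smallest p-value the simulation can report is 1/10001, so a simulated p-value cannot reproduce a figure as small as 4.854e-10; it can only confirm whether the observed statistic lies beyond what the simulated tables produce.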
Universities and Colleges: The observed count for Sexual Orientation (53) was notably higher than the expected count (41.7), suggesting that these incidents make up a larger share of reports in campus environments than in the dataset as a whole.
Metropolitan Counties: This group showed higher than expected counts for Religion (158 observed vs. 134.6 expected) and Ethnicity (118 observed vs. 85.4 expected).
Cities: As the largest reporting group, Cities were close to their expected values but showed slightly more Gender Identity reports than expected (28 observed vs. 24.5 expected).
Nonmetropolitan Counties: These agencies reported noticeably more Race-based bias (83 observed vs. 66.9 expected) and Disability-based bias (7 observed vs. 1.9 expected) than expected under independence, although the Disability comparison rests on a very small expected count.
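The comparisons above are made by eye. A more systematic way to see which cells drive the association is to look at residuals, which scale each gap between observed and expected counts by its expected variability. The sketch below uses the `stdres` component that `chisq.test()` returns; values beyond roughly ±2 are conventionally treated as notable, though that cutoff is only a rough guide when expected counts are small. The names `tab`, `chi`, `res`, and `flagged` are illustrative.

```r
# Counts retyped from the printed contingency table (rows: agency types).
agency <- c("Cities", "Metropolitan Counties", "Nonmetropolitan Counties",
            "Other Agencies", "State Police Agencies", "Universities and Colleges")
bias <- c("Race", "Religion", "Sexual.orientation", "Ethnicity",
          "Disability", "Gender", "Gender.Identity")
tab <- matrix(c(2280, 783, 1022, 501, 65, 14, 28,
                 363, 158,  122, 118, 10,  0,  2,
                  83,  16,   18,  14,  7,  0,  0,
                  23,  20,    6,   3,  0,  0,  0,
                  30,  17,   16,   8,  0,  1,  0,
                  93,  38,   53,  11,  1,  3,  1),
              nrow = 6, byrow = TRUE, dimnames = list(agency, bias))

# The small-expected-count warning is suppressed here; it is discussed
# separately and does not affect how the residuals are computed.
chi <- suppressWarnings(chisq.test(tab))

# Standardized residuals in long form: one row per (agency, bias) cell.
res <- as.data.frame(as.table(chi$stdres))
names(res) <- c("Agency", "Bias", "StdResidual")

# Keep the cells that deviate most from independence, largest first.
flagged <- subset(res, abs(StdResidual) > 2)
flagged[order(-abs(flagged$StdResidual)), ]
```

Cells that do not appear in `flagged` are broadly in line with what independence predicts, even where the observed count sits above the expected one.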
The Chi-squared statistic (103.67 on 30 degrees of freedom) summarizes the gaps between the observed counts and the counts expected under the null hypothesis. It expresses the same evidence as the p-value rather than confirming it separately. Because the table contains 5,928 counts, even modest deviations can produce a large statistic, so the result shows that an association exists, not that it is large. The patterns may reflect different social or environmental factors in these jurisdictions, or differences in how these agencies categorize and report crimes.
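A chi-squared statistic grows with sample size, so it does not by itself say how strong an association is. One common effect-size measure for a contingency table is Cramér's V, defined as \(\sqrt{\chi^2 / (n \cdot \min(r - 1, c - 1))}\), where \(n\) is the total count and \(r\) and \(c\) are the numbers of rows and columns. The sketch below computes it in base R from the printed counts; `tab`, `chi`, and `cramers_v` are illustrative names, and this measure is an addition to the original analysis.

```r
# Counts retyped from the printed contingency table (rows: agency types).
agency <- c("Cities", "Metropolitan Counties", "Nonmetropolitan Counties",
            "Other Agencies", "State Police Agencies", "Universities and Colleges")
bias <- c("Race", "Religion", "Sexual.orientation", "Ethnicity",
          "Disability", "Gender", "Gender.Identity")
tab <- matrix(c(2280, 783, 1022, 501, 65, 14, 28,
                 363, 158,  122, 118, 10,  0,  2,
                  83,  16,   18,  14,  7,  0,  0,
                  23,  20,    6,   3,  0,  0,  0,
                  30,  17,   16,   8,  0,  1,  0,
                  93,  38,   53,  11,  1,  3,  1),
              nrow = 6, byrow = TRUE, dimnames = list(agency, bias))

chi <- suppressWarnings(chisq.test(tab))

n <- sum(tab)            # total number of counts in the table
k <- min(dim(tab)) - 1   # min(rows - 1, columns - 1)

# Cramér's V: 0 means no association, 1 means a perfect association.
cramers_v <- sqrt(unname(chi$statistic) / (n * k))
round(cramers_v, 3)
```

With the statistic reported above (103.67) and 5,928 counts, this works out to roughly 0.06, which is conventionally read as a weak association despite the very small p-value.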
The analysis of the 2013 FBI Hate Crime dataset indicates a statistically significant association between the type of agency reporting a crime and the bias motivation recorded for that crime. Certain bias categories, such as Sexual Orientation in University settings or Race and Disability in Nonmetropolitan Counties, appear more often than expected under independence.
Implications:
These findings suggest that public policy and law enforcement training might need to be tailored to the specific needs of a jurisdiction. For instance, university police might require more resources focused on LGBTQ+ safety, while rural or nonmetropolitan agencies might see a greater need for resources addressing racial and disability-based bias.
Future Directions:
Future research could incorporate the Population variable to see if the reporting rates per capita vary by agency type. Additionally, comparing this 2013 data to more recent years could reveal whether these institutional patterns have shifted over the last decade.
Morisse, E. (2014, October 23). table13.csv [Data set]. GitHub. https://github.com/emorisse/FBI-Hate-Crime-Statistics/blob/master/2013/table13.csv
Federal Bureau of Investigation. (2013). Hate Crime Statistics, 2013: Table 13. U.S. Department of Justice. https://ucr.fbi.gov/hate-crime/2013/tables/13tabledatadecide_pdf