In the past couple of years, we have heard about hate crimes in the media more often. Hate crimes are offenses motivated by bias against a particular race, religion, ethnicity, gender, age, disability, ancestry, national origin, or sexual orientation. The increase may be due to the “new normal” caused by the pandemic. The pandemic also caused many people to lose their jobs, which raises a further question: does unemployment drive hate crimes as well? Is there a relationship between the unemployment rate and hate crimes?
The data come from the FBI and the Southern Poverty Law Center.
The FBI Uniform Crime Reporting (UCR) Program collects hate crime data from law enforcement agencies. The UCR Program collects data only on prosecutable hate crimes, which make up a fraction of hate incidents (which also include non-prosecutable offenses, such as the circulation of white nationalist recruitment materials on college campuses).
The Southern Poverty Law Center uses media accounts and people’s self-reports to assess the situation.
To define what counts as a high unemployment rate, I use the median as the dividing line.
If share_unemployed_seasonal is higher than the median, high_unemployed is TRUE; otherwise it is FALSE.
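The median split described above can be sketched as follows. This is a minimal illustration on a made-up data frame (`toy` and its values are hypothetical); only the column names share_unemployed_seasonal and high_unemployed come from the report.

```r
# Hypothetical toy data; only the column names come from the report.
toy <- data.frame(
  state = c("A", "B", "C", "D", "E"),
  share_unemployed_seasonal = c(0.029, 0.044, 0.052, 0.058, 0.073)
)

# The median of the unemployment share is the dividing line.
cutoff <- median(toy$share_unemployed_seasonal)

# TRUE only when strictly above the median; a state exactly at the median is FALSE.
toy$high_unemployed <- toy$share_unemployed_seasonal > cutoff

# Count how many states fall in each group.
table(toy$high_unemployed)
```

Because the comparison is strict, states sitting exactly at the median land in the FALSE group, so the two groups need not be the same size.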
Here is the summary of the data:
##     state           median_household_income share_unemployed_seasonal
##  Length:47          Min.   :35521           Min.   :0.02900
##  Class :character   1st Qu.:47630           1st Qu.:0.04350
##  Mode  :character   Median :54310           Median :0.05200
##                     Mean   :54802           Mean   :0.05087
##                     3rd Qu.:60598           3rd Qu.:0.05800
##                     Max.   :76165           Max.   :0.07300
##  hate_crimes_per_100k_splc avg_hatecrimes_per_100k_fbi hate_crimes_combine
##  Min.   :0.06745           Min.   : 0.412              Min.   : 0.5324
##  1st Qu.:0.14271           1st Qu.: 1.304              1st Qu.: 1.4788
##  Median :0.22620           Median : 1.937              Median : 2.2272
##  Mean   :0.30409           Mean   : 2.342              Mean   : 2.6460
##  3rd Qu.:0.35693           3rd Qu.: 3.119              3rd Qu.: 3.4408
##  Max.   :1.52230           Max.   :10.953              Max.   :12.4758
##  high_unemployed share_unemployed_seasonal_100k
##  Mode :logical   Min.   :2900
##  FALSE:27        1st Qu.:4350
##  TRUE :20        Median :5200
##                  Mean   :5087
##                  3rd Qu.:5800
##                  Max.   :7300
First, let's check the distribution and summary of the unemployment rate. The data appear slightly left-skewed to me.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2900 4350 5200 5087 5800 7300
Now let's check the distribution and summary of hate crime cases. Unlike the previous result, the data appear right-skewed, and the plot clearly shows one outlier.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.5324 1.4788 2.2272 2.6460 3.4408 12.4758
Because there is an outlier, I remove it.
After the removal, the mean drops from 2.6460 to 2.4323:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.5324 1.4648 2.2042 2.4323 3.4073 5.4327
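The report does not say how the outlier was identified, so the following is only a sketch of one common approach, the 1.5 × IQR upper fence, on made-up numbers. The data frame `toy`, its values, and the choice of rule are all assumptions; only the column name hate_crimes_combine comes from the report.

```r
# Hypothetical toy values; the last one plays the role of the extreme state.
toy <- data.frame(
  state = c("A", "B", "C", "D", "E", "F"),
  hate_crimes_combine = c(0.5, 1.5, 2.2, 2.6, 3.4, 12.5)
)

# Upper fence of the 1.5 * IQR rule (an assumed criterion, not necessarily the report's).
q <- quantile(toy$hate_crimes_combine, c(0.25, 0.75))
upper_fence <- q[[2]] + 1.5 * (q[[2]] - q[[1]])

# Keep only the rows at or below the fence.
toy_clean <- toy[toy$hate_crimes_combine <= upper_fence, ]

# Summary after the extreme value has been dropped.
summary(toy_clean$hate_crimes_combine)
```

Applied to the quartiles reported above, this rule would put the fence at roughly 3.4408 + 1.5 × (3.4408 − 1.4788) ≈ 6.38, which lies between the original maximum (12.4758) and the maximum after removal (5.4327).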
After removing the outlier, let's compare the two groups: states with an unemployment rate above the median and states at or below it. More states fall into the lower-unemployment group.
However, those states tend to have a higher hate crime rate.
library(ggplot2)

# Number of states in each unemployment group
ggplot(new_hate_url, aes(x = high_unemployed)) +
  geom_bar(fill = "blue") +
  labs(
    x = "", y = "",
    title = "Does your state have a higher-than-median unemployment rate?"
  ) +
  coord_flip()

# Hate crime cases by unemployment group
boxplot(hate_crimes_combine ~ high_unemployed, data = new_hate_url,
  main = "Hate crime cases by unemployment group",
  ylab = "Hate crime cases", xlab = "High unemployment rate", col = "blue")
hate_model <- lm(new_hate_url$hate_crimes_combine ~ new_hate_url$share_unemployed_seasonal_100k )
summary(hate_model)
##
## Call:
## lm(formula = new_hate_url$hate_crimes_combine ~ new_hate_url$share_unemployed_seasonal_100k)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1.8308 -0.8845 -0.2797  0.9434  2.9586
##
## Coefficients:
##                                               Estimate Std. Error t value
## (Intercept)                                  2.8991798  0.9601032   3.020
## new_hate_url$share_unemployed_seasonal_100k -0.0000924  0.0001866  -0.495
##                                             Pr(>|t|)
## (Intercept)                                   0.0042 **
## new_hate_url$share_unemployed_seasonal_100k   0.6229
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.233 on 44 degrees of freedom
## Multiple R-squared: 0.005542, Adjusted R-squared: -0.01706
## F-statistic: 0.2452 on 1 and 44 DF, p-value: 0.6229
\[ \hat{y} = 2.8991798 - 0.0000924 \times share\_unemployed\_seasonal\_100k \]
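As a worked reading of the fitted equation (plain arithmetic on the coefficients reported above, not an additional result): judging from the data summary, share_unemployed_seasonal_100k appears to be the unemployment share multiplied by 100,000, so a one-percentage-point rise in unemployment corresponds to 1,000 units of the predictor. The predicted change in the combined hate crime measure is then

\[ \Delta\hat{y} = -0.0000924 \times 1000 = -0.0924 \]

and the prediction at the median predictor value of 5200 is

\[ \hat{y} = 2.8991798 - 0.0000924 \times 5200 \approx 2.42 \]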
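The p-value is 0.6229, which means the slope is not statistically significant at the 0.05 level. Because the sample is small (46 observations after removing the outlier) rather than huge, the p-value is not being pushed toward significance by a very large N, so I consider the high p-value meaningful, although a small sample also limits the power to detect a weak effect.
One more thing that caught my attention is the R-squared of 0.005542: the unemployment rate explains only about 0.6% of the variance in hate crime cases, so the effect size is small. The plot shows the same thing, with most points lying far from the regression line.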
## `geom_smooth()` using formula 'y ~ x'
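The code for the scatter plot is not shown in the report; the message above only indicates that `geom_smooth()` was used. A minimal sketch of such a plot, on made-up data, might look like this. The data frame `toy`, its values, the colours, and the axis labels are assumptions; only the two column names come from the report.

```r
library(ggplot2)

# Hypothetical toy data; only the column names come from the report.
toy <- data.frame(
  share_unemployed_seasonal_100k = c(2900, 4350, 5200, 5800, 7300),
  hate_crimes_combine = c(2.9, 1.4, 3.1, 2.0, 2.4)
)

# Scatter plot with a fitted least-squares line and its confidence band.
p <- ggplot(toy, aes(x = share_unemployed_seasonal_100k, y = hate_crimes_combine)) +
  geom_point(colour = "blue") +
  geom_smooth(method = "lm", formula = y ~ x, colour = "red") +
  labs(x = "share_unemployed_seasonal_100k", y = "hate_crimes_combine")

print(p)
```

Passing `formula = y ~ x` explicitly avoids the informational message that `geom_smooth()` prints when it has to pick the formula itself.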
The distribution of the residuals is also heavy-tailed, since there are a few extreme positive and negative residuals.
ggplot(data = hate_model, aes(sample = .resid)) +
stat_qq(colour = "blue", size = 1) + stat_qq_line(colour = "red", size = 1)
Unlike Maimuna Majumder, who found that higher rates of hate crimes are tied to income inequality, I do not think the data show a strong relationship between the unemployment rate and hate crime cases, based on all the plots and summaries above.
There is too much variability around the fitted line, and the high p-value shows that the model is not statistically significant. I believe that breaking the data down by zip code might lead to a more accurate conclusion, since people within the same zip code group share more similar cultural values. I also think political socialization drives the result. It may be fairer to study a single state rather than the whole US.
https://fivethirtyeight.com/features/higher-rates-of-hate-crimes-are-tied-to-income-inequality/