Racism is one of the most significant social issues worldwide. In the US, the growth of extremist groups and the number of hate crimes have reportedly increased since the 2016 presidential election.
Research Question The purpose of this project is to investigate the relationship between hate crimes reported during November 9-18, 2016 and two state-level factors: the share of minority residents and the presidential candidate each state voted for.
Data Collection The data were collected from FiveThirtyEight's GitHub repository, linked below:
Data Source https://github.com/fivethirtyeight/data/tree/master/hate-crimes
Cases There are 51 cases in this dataset (the 50 states plus the District of Columbia). Cases without a positive SPLC hate crime rate and the District of Columbia are excluded, which leaves 46 states.
Independent Variables The explanatory variables are whether the state went red or blue in the 2016 presidential election (categorical) and the share of each state's population that is non-white (numerical).
Dependent Variable The response variable is the number of hate crimes per 100,000 population reported by the Southern Poverty Law Center for November 9-18, 2016 (numerical).
Type of study This is an observational study.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
url <- "https://raw.githubusercontent.com/fivethirtyeight/data/master/hate-crimes/hate_crimes.csv"
dataset <- read.csv(url)
head(dataset)
## state median_household_income share_unemployed_seasonal
## 1 Alabama 42278 0.060
## 2 Alaska 67629 0.064
## 3 Arizona 49254 0.063
## 4 Arkansas 44922 0.052
## 5 California 60487 0.059
## 6 Colorado 60940 0.040
## share_population_in_metro_areas share_population_with_high_school_degree
## 1 0.64 0.821
## 2 0.63 0.914
## 3 0.90 0.842
## 4 0.69 0.824
## 5 0.97 0.806
## 6 0.80 0.893
## share_non_citizen share_white_poverty gini_index share_non_white
## 1 0.02 0.12 0.472 0.35
## 2 0.04 0.06 0.422 0.42
## 3 0.10 0.09 0.455 0.49
## 4 0.04 0.12 0.458 0.26
## 5 0.13 0.09 0.471 0.61
## 6 0.06 0.07 0.457 0.31
## share_voters_voted_trump hate_crimes_per_100k_splc
## 1 0.63 0.12583893
## 2 0.53 0.14374012
## 3 0.50 0.22531995
## 4 0.60 0.06906077
## 5 0.33 0.25580536
## 6 0.44 0.39052330
## avg_hatecrimes_per_100k_fbi
## 1 1.8064105
## 2 1.6567001
## 3 3.4139280
## 4 0.8692089
## 5 2.3979859
## 6 2.8046888
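The filtering step that follows relies on how dplyr's filter() treats missing values, so a small sketch may help. The first two rows below are copied from the head() output above; the third row, named "Placeholder", is a hypothetical stand-in for a state with no SPLC value and is not taken from the dataset.

```r
library(dplyr)

# Illustrative frame: two real rows from the head() output plus one
# hypothetical row ("Placeholder") with a missing SPLC rate.
demo <- data.frame(
  state = c("Alabama", "Alaska", "Placeholder"),
  hate_crimes_per_100k_splc = c(0.12583893, 0.14374012, NA)
)

# NA > 0 evaluates to NA, and filter() keeps only rows where the
# condition is TRUE, so rows with a missing rate are dropped silently.
kept <- demo %>% filter(hate_crimes_per_100k_splc > 0)

# Listing the excluded rows explicitly makes the exclusion auditable.
dropped <- demo %>%
  filter(is.na(hate_crimes_per_100k_splc) | hate_crimes_per_100k_splc <= 0)

kept$state
dropped$state
```

Running the same two filters on the full dataset would show exactly which cases are excluded before the analysis.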
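# Keep only the columns needed, drop cases without a positive SPLC rate and the
# District of Columbia, then flag states won by Trump (1) or not (0).
dataset <- dataset %>%
  select(state, share_non_white, share_voters_voted_trump, hate_crimes_per_100k_splc) %>%
  filter(hate_crimes_per_100k_splc > 0, state != "District of Columbia") %>%
  mutate(ElectTrump = case_when(
    share_voters_voted_trump > 0.5 ~ 1,
    share_voters_voted_trump < 0.5 ~ 0
  ))
# Arizona is the only state recorded at exactly 0.50, so case_when() leaves it NA.
# It went red, so it is set manually, by name rather than by row position.
# The remaining states were double-checked against their ElectTrump values.
dataset[dataset$state == "Arizona", "ElectTrump"] <- 1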
dataset
## state share_non_white share_voters_voted_trump
## 1 Alabama 0.35 0.63
## 2 Alaska 0.42 0.53
## 3 Arizona 0.49 0.50
## 4 Arkansas 0.26 0.60
## 5 California 0.61 0.33
## 6 Colorado 0.31 0.44
## 7 Connecticut 0.30 0.41
## 8 Delaware 0.37 0.42
## 9 Florida 0.46 0.49
## 10 Georgia 0.48 0.51
## 11 Idaho 0.16 0.59
## 12 Illinois 0.37 0.39
## 13 Indiana 0.20 0.57
## 14 Iowa 0.15 0.52
## 15 Kansas 0.25 0.57
## 16 Kentucky 0.15 0.63
## 17 Louisiana 0.42 0.58
## 18 Maine 0.09 0.45
## 19 Maryland 0.50 0.35
## 20 Massachusetts 0.27 0.34
## 21 Michigan 0.24 0.48
## 22 Minnesota 0.18 0.45
## 23 Mississippi 0.44 0.58
## 24 Missouri 0.20 0.57
## 25 Montana 0.10 0.57
## 26 Nebraska 0.21 0.60
## 27 Nevada 0.50 0.46
## 28 New Hampshire 0.09 0.47
## 29 New Jersey 0.44 0.42
## 30 New Mexico 0.62 0.40
## 31 New York 0.42 0.37
## 32 North Carolina 0.38 0.51
## 33 Ohio 0.21 0.52
## 34 Oklahoma 0.35 0.65
## 35 Oregon 0.26 0.41
## 36 Pennsylvania 0.24 0.49
## 37 Rhode Island 0.28 0.40
## 38 South Carolina 0.36 0.55
## 39 Tennessee 0.27 0.61
## 40 Texas 0.56 0.53
## 41 Utah 0.19 0.47
## 42 Vermont 0.06 0.33
## 43 Virginia 0.38 0.45
## 44 Washington 0.31 0.38
## 45 West Virginia 0.07 0.69
## 46 Wisconsin 0.22 0.48
## hate_crimes_per_100k_splc ElectTrump
## 1 0.12583893 1
## 2 0.14374012 1
## 3 0.22531995 1
## 4 0.06906077 1
## 5 0.25580536 0
## 6 0.39052330 0
## 7 0.33539227 0
## 8 0.32275417 0
## 9 0.18752122 0
## 10 0.12042027 1
## 11 0.12420817 1
## 12 0.19534455 0
## 13 0.24700888 1
## 14 0.45442742 1
## 15 0.10515247 1
## 16 0.32439697 1
## 17 0.10973335 1
## 18 0.61557402 0
## 19 0.37043897 0
## 20 0.63081059 0
## 21 0.40377937 0
## 22 0.62747993 0
## 23 0.06744680 1
## 24 0.18452351 1
## 25 0.49549103 1
## 26 0.15948963 1
## 27 0.14167316 0
## 28 0.15154960 0
## 29 0.07830591 0
## 30 0.29481132 0
## 31 0.35062045 0
## 32 0.24400659 1
## 33 0.19071396 1
## 34 0.13362910 1
## 35 0.83284961 0
## 36 0.28510109 0
## 37 0.09540164 0
## 38 0.20989442 1
## 39 0.19993848 1
## 40 0.21358394 1
## 41 0.13654673 0
## 42 0.32414911 0
## 43 0.36324890 0
## 44 0.67748765 0
## 45 0.32867707 1
## 46 0.22619711 0
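The table above also contains the numerical explanatory variable, share_non_white. As a minimal sketch of how its relationship with the response could be examined, the block below transcribes the 46 rows printed above (same row order) and computes a correlation and a simple linear model. The object names states46, r, and fit are illustrative choices, and the linear model is one possible approach rather than the method used in this report.

```r
# Values transcribed from the filtered table printed above (rows 1 to 46).
share_non_white <- c(
  0.35, 0.42, 0.49, 0.26, 0.61, 0.31, 0.30, 0.37, 0.46, 0.48,
  0.16, 0.37, 0.20, 0.15, 0.25, 0.15, 0.42, 0.09, 0.50, 0.27,
  0.24, 0.18, 0.44, 0.20, 0.10, 0.21, 0.50, 0.09, 0.44, 0.62,
  0.42, 0.38, 0.21, 0.35, 0.26, 0.24, 0.28, 0.36, 0.27, 0.56,
  0.19, 0.06, 0.38, 0.31, 0.07, 0.22
)
hate_crimes <- c(
  0.12583893, 0.14374012, 0.22531995, 0.06906077, 0.25580536,
  0.39052330, 0.33539227, 0.32275417, 0.18752122, 0.12042027,
  0.12420817, 0.19534455, 0.24700888, 0.45442742, 0.10515247,
  0.32439697, 0.10973335, 0.61557402, 0.37043897, 0.63081059,
  0.40377937, 0.62747993, 0.06744680, 0.18452351, 0.49549103,
  0.15948963, 0.14167316, 0.15154960, 0.07830591, 0.29481132,
  0.35062045, 0.24400659, 0.19071396, 0.13362910, 0.83284961,
  0.28510109, 0.09540164, 0.20989442, 0.19993848, 0.21358394,
  0.13654673, 0.32414911, 0.36324890, 0.67748765, 0.32867707,
  0.22619711
)
elect_trump <- c(
  1, 1, 1, 1, 0, 0, 0, 0, 0, 1,
  1, 0, 1, 1, 1, 1, 1, 0, 0, 0,
  0, 0, 1, 1, 1, 1, 0, 0, 0, 0,
  0, 1, 1, 1, 0, 0, 0, 1, 1, 1,
  0, 0, 0, 0, 1, 0
)
states46 <- data.frame(share_non_white, hate_crimes, elect_trump)

# Pearson correlation between the non-white share and the SPLC rate.
r <- cor(states46$share_non_white, states46$hate_crimes)
r

# Linear model with both explanatory variables; the red/blue indicator
# is treated as a factor so it enters as a group offset.
fit <- lm(hate_crimes ~ share_non_white + factor(elect_trump), data = states46)
summary(fit)
```

With 46 observations covering a ten-day window, any coefficients from such a model should be read as exploratory.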
library(ggplot2)
library(RColorBrewer)
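# Bar chart of the SPLC hate crime rate per state, colored by the 2016 result
# (blue = ElectTrump 0, red = ElectTrump 1); state labels are rotated to fit.
ggplot(dataset, aes(x = state, y = hate_crimes_per_100k_splc, fill = ElectTrump)) +
  geom_bar(stat = "identity") +
  scale_fill_gradient(high = "red", low = "blue") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))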
# Mean SPLC hate crime rate for blue (0) and red (1) states.
summaryset <- dataset %>%
  select(hate_crimes_per_100k_splc, ElectTrump) %>%
  group_by(ElectTrump) %>%
  summarize(MeanHateCrimesPer100k = mean(hate_crimes_per_100k_splc))
summaryset
## # A tibble: 2 x 2
## ElectTrump MeanHateCrimesPer100k
## <dbl> <dbl>
## 1 0 0.346
## 2 1 0.203
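Relevant Summary Statistics As the summary table shows, states whose electoral votes did not go to President Trump averaged 0.346 hate crimes per 100k during November 9-18, 2016, compared with 0.203 in states that he won. Because this is an observational study, the difference is descriptive and does not by itself show that the election result caused it.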
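One way to judge whether the gap between the two group means is larger than chance variation would suggest is a two-sample t-test. The sketch below is an assumption about a possible next step, not part of the original analysis. The vectors red and blue are transcribed from the hate_crimes_per_100k_splc column of the filtered table, split by ElectTrump, and t.test() is called with its default Welch (unequal variance) setting.

```r
# SPLC rates for the 22 states with ElectTrump == 1, transcribed from the table.
red <- c(
  0.12583893, 0.14374012, 0.22531995, 0.06906077, 0.12042027,
  0.12420817, 0.24700888, 0.45442742, 0.10515247, 0.32439697,
  0.10973335, 0.06744680, 0.18452351, 0.49549103, 0.15948963,
  0.24400659, 0.19071396, 0.13362910, 0.20989442, 0.19993848,
  0.21358394, 0.32867707
)

# SPLC rates for the 24 states with ElectTrump == 0, transcribed from the table.
blue <- c(
  0.25580536, 0.39052330, 0.33539227, 0.32275417, 0.18752122,
  0.19534455, 0.61557402, 0.37043897, 0.63081059, 0.40377937,
  0.62747993, 0.14167316, 0.15154960, 0.07830591, 0.29481132,
  0.35062045, 0.83284961, 0.28510109, 0.09540164, 0.13654673,
  0.32414911, 0.36324890, 0.67748765, 0.22619711
)

# Welch two-sample t-test (var.equal = FALSE is the default in t.test()).
welch <- t.test(blue, red)

welch$estimate  # group means, matching the summary table after rounding
welch$p.value   # two-sided p-value for the difference in means
```

The test treats each state as an independent observation and ignores the other explanatory variable, so it is best read alongside the summary table rather than as a final answer.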