Introduction

The dataset I chose comes from the article “Higher Rates Of Hate Crimes Are Tied To Income Inequality”. My subset of the data contains, for each state, the figures for income, high school education, the share of voters who voted for Trump, and the average hate crime rate.

Here is the article: https://fivethirtyeight.com/features/higher-rates-of-hate-crimes-are-tied-to-income-inequality/

Read data

# Load the hate crimes data from the fivethirtyeight GitHub repository
HateCrimes <- read.csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/hate-crimes/hate_crimes.csv", header = TRUE, sep = ",")

Get a subset of the data

# Keep five columns by position, then give them shorter names
HateCrimes <- subset(HateCrimes, select = c(1, 2, 5, 10, 12))
colnames(HateCrimes) <- c("State", "Income", "HSDegree", "VotedTrump", "AvgHateCrimes")
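
Selecting columns by position works, but it breaks silently if the source file ever changes its column order. A minimal sketch of selecting by name instead is below. The raw column names in `wanted` are my assumption about the fivethirtyeight file and should be checked with names() on the freshly read data; `select_by_name` is a hypothetical helper, not part of the original analysis.

```r
# Map the short names used in this report to the raw column names.
# ASSUMPTION: these raw names match the CSV header; confirm with names(raw).
wanted <- c(
  State         = "state",
  Income        = "median_household_income",
  HSDegree      = "share_population_with_high_school_degree",
  VotedTrump    = "share_voters_voted_trump",
  AvgHateCrimes = "avg_hatecrimes_per_100k_fbi"
)

# Hypothetical helper: keep the mapped columns and rename them,
# stopping with a clear error if any expected column is absent.
select_by_name <- function(df, mapping) {
  absent <- setdiff(mapping, names(df))
  if (length(absent) > 0) {
    stop("Columns not found: ", paste(absent, collapse = ", "))
  }
  out <- df[, unname(mapping), drop = FALSE]
  names(out) <- names(mapping)
  out
}
```

Applied to the data frame returned by read.csv, select_by_name(raw, wanted) should give the same five columns as the positional subset, provided the assumed names are right.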

Show the data

head(HateCrimes)
##        State Income HSDegree VotedTrump AvgHateCrimes
## 1    Alabama  42278    0.821       0.63     1.8064105
## 2     Alaska  67629    0.914       0.53     1.6567001
## 3    Arizona  49254    0.842       0.50     3.4139280
## 4   Arkansas  44922    0.824       0.60     0.8692089
## 5 California  60487    0.806       0.33     2.3979859
## 6   Colorado  60940    0.893       0.44     2.8046888

Conclusion

I would like to see more data, including news items over time, in order to see whether there is a pattern in the data that is directly affected by news stories.

summary(HateCrimes)
##         State        Income         HSDegree        VotedTrump   
##  Alabama   : 1   Min.   :35521   Min.   :0.7990   Min.   :0.040  
##  Alaska    : 1   1st Qu.:48657   1st Qu.:0.8405   1st Qu.:0.415  
##  Arizona   : 1   Median :54916   Median :0.8740   Median :0.490  
##  Arkansas  : 1   Mean   :55224   Mean   :0.8691   Mean   :0.490  
##  California: 1   3rd Qu.:60719   3rd Qu.:0.8980   3rd Qu.:0.575  
##  Colorado  : 1   Max.   :76165   Max.   :0.9180   Max.   :0.700  
##  (Other)   :45                                                   
##  AvgHateCrimes    
##  Min.   : 0.2669  
##  1st Qu.: 1.2931  
##  Median : 1.9871  
##  Mean   : 2.3676  
##  3rd Qu.: 3.1843  
##  Max.   :10.9535  
##  NA's   :1
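
The summary reports one missing value in AvgHateCrimes. summary() leaves NA values out of its statistics, but mean() on its own returns NA unless told to drop them. The sketch below shows one way to find the affected row and to average the column safely; `rows_missing` and `mean_without_na` are hypothetical helper names, not part of the original analysis.

```r
# Hypothetical helper: return the rows of `df` whose value in `column` is NA.
rows_missing <- function(df, column) {
  df[is.na(df[[column]]), , drop = FALSE]
}

# Hypothetical helper: mean of a column with NA values dropped.
mean_without_na <- function(df, column) {
  mean(df[[column]], na.rm = TRUE)
}
```

With the data frame from this report, rows_missing(HateCrimes, "AvgHateCrimes") would show which state has no reported value, and mean_without_na(HateCrimes, "AvgHateCrimes") would compute the mean over the remaining states.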