The Department of Homeland Security (DHS) wants to identify which states have the highest rates of murder, assault, and rape so that it can proactively reduce crime. Although crime occurs in every state, the DHS wants to know which state it should focus on first.
path = "https://raw.githubusercontent.com/AlphaCurse/R_Programming/main/USArrests.csv"
crimerate = read.table(file = path, header=TRUE, sep = ",")
head(crimerate)
## X Murder Assault UrbanPop Rape
## 1 Alabama 13.2 236 58 21.2
## 2 Alaska 10.0 263 48 44.5
## 3 Arizona 8.1 294 80 31.0
## 4 Arkansas 8.8 190 50 19.5
## 5 California 9.0 276 91 40.6
## 6 Colorado 7.9 204 78 38.7
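The first column comes back named X, which usually happens when the first header field of a CSV is empty. The sketch below reproduces that behaviour on a few of the rows printed above. The exact layout of the file (quoted fields, empty first header) is an assumption, and `csv_text` and `sample_rates` are illustrative names.

```r
# Assumption: the CSV header starts with an empty field, as write.csv() would produce.
csv_text <- '"","Murder","Assault","UrbanPop","Rape"
"Alabama",13.2,236,58,21.2
"Alaska",10,263,48,44.5
"Arizona",8.1,294,80,31'

# read.csv() fills in the empty header field with the syntactic name "X".
sample_rates <- read.csv(text = csv_text)

names(sample_rates)
str(sample_rates)
```

Running `str()` on the real `crimerate` object in the same way is a quick check that the numeric columns were parsed as numbers rather than text.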
Data Exploration
murder_mean = mean(crimerate$Murder)
murder_median = median(crimerate$Murder)
assault_mean = mean(crimerate$Assault)
assault_median = median(crimerate$Assault)
pop_mean = mean(crimerate$UrbanPop)
pop_median = median(crimerate$UrbanPop)
rape_mean = mean(crimerate$Rape)
rape_median = median(crimerate$Rape)
murder_mean
## [1] 7.788
murder_median
## [1] 7.25
assault_mean
## [1] 170.76
assault_median
## [1] 159
pop_mean
## [1] 65.54
pop_median
## [1] 66
rape_mean
## [1] 21.232
rape_median
## [1] 20.1
Across all US states, the Murder column has a mean of 7.788 and a median of 7.25, the Assault column has a mean of 170.76 and a median of 159, the UrbanPop column has a mean of 65.54 and a median of 66, and the Rape column has a mean of 21.232 and a median of 20.1. For the three crime columns the mean is above the median, which suggests that a few states with high rates pull the average upward.
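The same means and medians can be computed in one pass instead of eight separate assignments. This is a minimal sketch that assumes R's built-in `USArrests` data set matches the CSV loaded above. `numeric_cols` and `stats` are illustrative names.

```r
# Assumption: the built-in USArrests data set stands in for the CSV read earlier.
df <- data.frame(State = rownames(USArrests), USArrests, row.names = NULL)

numeric_cols <- c("Murder", "Assault", "UrbanPop", "Rape")

# For every numeric column, return a named vector holding its mean and median.
# sapply() binds the results into a matrix with one column per variable.
stats <- sapply(df[numeric_cols], function(x) c(mean = mean(x), median = median(x)))

print(stats)
```

`summary(df[numeric_cols])` gives a similar overview and also reports the quartiles and the range.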
df = data.frame(crimerate)
View(df)
Data Wrangling
names(df)[1] <- 'State'
head(df)
## State Murder Assault UrbanPop Rape
## 1 Alabama 13.2 236 58 21.2
## 2 Alaska 10.0 263 48 44.5
## 3 Arizona 8.1 294 80 31.0
## 4 Arkansas 8.8 190 50 19.5
## 5 California 9.0 276 91 40.6
## 6 Colorado 7.9 204 78 38.7
df['Murder'] <- floor(df$Murder)
df['Rape'] <- floor(df$Rape)
head(df)
## State Murder Assault UrbanPop Rape
## 1 Alabama 13 236 58 21
## 2 Alaska 10 263 48 44
## 3 Arizona 8 294 80 31
## 4 Arkansas 8 190 50 19
## 5 California 9 276 91 40
## 6 Colorado 7 204 78 38
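Note that `floor()` always rounds down, so a value such as 38.7 becomes 38 rather than 39. If rounding to the nearest whole number is what is wanted, `round()` may be a better fit. The short comparison below uses three values taken from the output above; `rates` is an illustrative name.

```r
# Three rates taken from the printed rows (Alabama Murder, Colorado Rape, Colorado Murder).
rates <- c(13.2, 38.7, 7.9)

floor(rates)  # always rounds down: 13 38 7
round(rates)  # rounds to the nearest whole number: 13 39 8
```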
Scatterplots
library(ggplot2)
ggplot(df, aes(x=Murder, y=UrbanPop)) + geom_point()
ggplot(df, aes(x=Assault, y=UrbanPop)) + geom_point()
ggplot(df, aes(x=Rape, y=UrbanPop)) + geom_point()
Boxplots
ggplot(df, aes(x=Murder, y=UrbanPop)) + geom_boxplot()
## Warning: Continuous x aesthetic -- did you forget aes(group=...)?
ggplot(df, aes(x=Assault, y=UrbanPop)) + geom_boxplot()
## Warning: Continuous x aesthetic -- did you forget aes(group=...)?
ggplot(df, aes(x=Rape, y=UrbanPop)) + geom_boxplot()
## Warning: Continuous x aesthetic -- did you forget aes(group=...)?
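The warnings above appear because `geom_boxplot()` expects a categorical x aesthetic (or none at all), so each call draws a single box of UrbanPop rather than a box for the crime variable. One possible alternative, sketched here under the assumption that the goal is to see the spread of each crime rate, is to reshape the three crime columns into long form and draw one box per crime. The built-in `USArrests` data set is assumed to match the CSV, the axis label follows that data set's documentation, and `long` and `p_box` are illustrative names.

```r
library(ggplot2)

# Assumption: the built-in USArrests data set stands in for the CSV read earlier.
df <- data.frame(State = rownames(USArrests), USArrests, row.names = NULL)

# stack() reshapes the selected columns into two columns:
# `values` (the rate) and `ind` (the name of the source column).
long <- stack(df[c("Murder", "Assault", "Rape")])

# One box per crime type. Free scales are used because Assault values
# are much larger than Murder and Rape values.
p_box <- ggplot(long, aes(x = ind, y = values)) +
  geom_boxplot() +
  facet_wrap(~ ind, scales = "free") +
  labs(x = NULL, y = "Arrests per 100,000")

print(p_box)
```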
Histograms
ggplot(df, aes(x=Murder)) + geom_histogram(color='black', fill='blue')
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
ggplot(df, aes(x=Assault)) + geom_histogram(color='black', fill='red')
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
ggplot(df, aes(x=Rape)) + geom_histogram(color='black', fill='green')
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
ggplot(df, aes(x=UrbanPop)) + geom_histogram(color='black', fill='white')
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
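The `stat_bin()` messages above are a reminder that 30 bins is only a default, and with one row per state it tends to leave many empty or near-empty bars. A sketch with an explicit bin width follows. The width of 2 is an arbitrary choice for illustration, not a recommendation, the built-in `USArrests` data set is assumed to match the CSV, and `p_hist` is an illustrative name.

```r
library(ggplot2)

# Assumption: the built-in USArrests data set stands in for the CSV read earlier.
df <- data.frame(State = rownames(USArrests), USArrests, row.names = NULL)

# binwidth = 2 groups murder rates into bins two units wide;
# boundary = 0 aligns the bin edges with even numbers.
p_hist <- ggplot(df, aes(x = Murder)) +
  geom_histogram(binwidth = 2, boundary = 0, color = "black", fill = "blue")

print(p_hist)
```

Setting `binwidth` (or `bins`) explicitly also silences the message.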
Bar Charts
theme_set(theme_bw())
ggplot(df, aes(x=State, y=Murder)) + geom_bar(stat='identity', width=0.4) +
theme(axis.text.x = element_text(angle=70, vjust=0.6))
ggplot(df, aes(x=State, y=Assault)) + geom_bar(stat='identity', width=0.4) +
theme(axis.text.x = element_text(angle=70, vjust=0.6))
ggplot(df, aes(x=State, y=Rape)) + geom_bar(stat='identity', width=0.4) +
theme(axis.text.x = element_text(angle=70, vjust=0.6))
ggplot(df, aes(x=State, y=UrbanPop)) + geom_bar(stat='identity', width=0.4) +
theme(axis.text.x = element_text(angle=70, vjust=0.6))
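With one bar per state, the bar charts above make it hard to read off which state ranks highest. The question of which state leads each category can also be answered directly from the data frame. This is a minimal sketch, assuming the built-in `USArrests` data set matches the CSV and using the original (unfloored) values. `crime_cols` and `top_state` are illustrative names.

```r
# Assumption: the built-in USArrests data set stands in for the CSV read earlier.
df <- data.frame(State = rownames(USArrests), USArrests, row.names = NULL)

crime_cols <- c("Murder", "Assault", "Rape")

# For each crime column, look up the state in the row holding the maximum value.
top_state <- sapply(df[crime_cols], function(x) df$State[which.max(x)])
print(top_state)

# The five states with the highest murder rates, in descending order.
head(df[order(-df$Murder), c("State", "Murder")], 5)
```

Sorting the bars with `aes(x = reorder(State, Murder), y = Murder)` would give the same ranking visually.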
From the data provided, urban population (the UrbanPop column) appears to be correlated with murder, assault, and rape cases throughout the country: the larger the urban population in a given state, the more frequently crime tends to occur. On that basis, the Department of Homeland Security should begin its efforts in highly populated states such as New York, Georgia, or Florida.
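The strength of that relationship can be checked numerically rather than read from the plots alone. The sketch below computes the Pearson correlation between UrbanPop and each crime column. It assumes the built-in `USArrests` data set matches the CSV, and `urban_cor` is an illustrative name.

```r
# Assumption: the built-in USArrests data set stands in for the CSV read earlier.
df <- data.frame(State = rownames(USArrests), USArrests, row.names = NULL)

crime_cols <- c("Murder", "Assault", "Rape")

# Pearson correlation of each crime rate with UrbanPop.
# Values near 0 indicate a weak linear relationship; values near 1 a strong positive one.
urban_cor <- sapply(df[crime_cols], function(x) cor(x, df$UrbanPop))

print(round(urban_cor, 2))
```

Coefficients close to zero would suggest that the visual impression from the scatterplots should be treated with caution for that crime type.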