The Department of Homeland Security (DHS) would like to identify the states with the highest rates of murder, assault, and rape in order to reduce crime proactively. Although crime occurs in every state, the DHS wants to know which state the department should focus its efforts on first.

path = "https://raw.githubusercontent.com/AlphaCurse/R_Programming/main/USArrests.csv"
crimerate = read.table(file = path, header=TRUE, sep = ",")
head(crimerate)
##            X Murder Assault UrbanPop Rape
## 1    Alabama   13.2     236       58 21.2
## 2     Alaska   10.0     263       48 44.5
## 3    Arizona    8.1     294       80 31.0
## 4   Arkansas    8.8     190       50 19.5
## 5 California    9.0     276       91 40.6
## 6   Colorado    7.9     204       78 38.7

Data Exploration

murder_mean = mean(crimerate$Murder)
murder_median = median(crimerate$Murder)
assault_mean = mean(crimerate$Assault)
assault_median = median(crimerate$Assault)
pop_mean = mean(crimerate$UrbanPop)
pop_median = median(crimerate$UrbanPop)
rape_mean = mean(crimerate$Rape)
rape_median = median(crimerate$Rape)

murder_mean
## [1] 7.788
murder_median
## [1] 7.25
assault_mean
## [1] 170.76
assault_median
## [1] 159
pop_mean
## [1] 65.54
pop_median
## [1] 66
rape_mean
## [1] 21.232
rape_median
## [1] 20.1

Across all US states, the Murder column has a mean of 7.788 and a median of 7.25. The Assault column has a mean of 170.76 and a median of 159. The UrbanPop column has a mean of 65.54 and a median of 66. The Rape column has a mean of 21.232 and a median of 20.1. In every column the mean and median are close, and for the three crime columns the mean is slightly higher than the median.
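The same statistics can be computed in one pass instead of eight separate assignments. The sketch below assumes that R's built-in USArrests dataset matches the CSV loaded above (the column names and the first six rows agree); the names numeric_cols and stats are introduced here only for illustration.

```r
# Assumption: the built-in USArrests dataset matches the CSV loaded above.
# Rebuild the same layout as the CSV, with state names in a column called X.
crimerate <- data.frame(X = rownames(USArrests), USArrests,
                        row.names = NULL, stringsAsFactors = FALSE)

numeric_cols <- c("Murder", "Assault", "UrbanPop", "Rape")

# Apply mean and median to every numeric column at once.
# The result is a matrix with one row per statistic and one column per variable.
stats <- sapply(crimerate[numeric_cols],
                function(col) c(mean = mean(col), median = median(col)))
print(stats)
```

Individual values can then be read with, for example, stats["mean", "Murder"]. Calling summary(crimerate) is another option when quartiles, minimum, and maximum are also of interest.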

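# Work on a copy so the original crimerate data stays untouched
df = data.frame(crimerate)
# Open the interactive data viewer (only useful in an interactive session such as RStudio)
View(df)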

Data Wrangling

names(df)[1] <- 'State'
head(df)
##        State Murder Assault UrbanPop Rape
## 1    Alabama   13.2     236       58 21.2
## 2     Alaska   10.0     263       48 44.5
## 3    Arizona    8.1     294       80 31.0
## 4   Arkansas    8.8     190       50 19.5
## 5 California    9.0     276       91 40.6
## 6   Colorado    7.9     204       78 38.7
df['Murder'] <- floor(df$Murder)
df['Rape'] <- floor(df$Rape)
head(df)
##        State Murder Assault UrbanPop Rape
## 1    Alabama     13     236       58   21
## 2     Alaska     10     263       48   44
## 3    Arizona      8     294       80   31
## 4   Arkansas      8     190       50   19
## 5 California      9     276       91   40
## 6   Colorado      7     204       78   38

Scatterplots

library(ggplot2)

ggplot(df, aes(x=Murder, y=UrbanPop)) + geom_point()

ggplot(df, aes(x=Assault, y=UrbanPop)) + geom_point()

ggplot(df, aes(x=Rape, y=UrbanPop)) + geom_point()

Boxplots

# A continuous x aesthetic needs a group, so map only y to get one boxplot per variable
ggplot(df, aes(y=Murder)) + geom_boxplot()

ggplot(df, aes(y=Assault)) + geom_boxplot()

ggplot(df, aes(y=Rape)) + geom_boxplot()

Histograms

# With only 50 states, the default of 30 bins is too fine; set the bin count explicitly
ggplot(df, aes(x=Murder)) + geom_histogram(bins=10, color='black', fill='blue')

ggplot(df, aes(x=Assault)) + geom_histogram(bins=10, color='black', fill='red')

ggplot(df, aes(x=Rape)) + geom_histogram(bins=10, color='black', fill='green')

ggplot(df, aes(x=UrbanPop)) + geom_histogram(bins=10, color='black', fill='white')

Bar Charts

theme_set(theme_bw())

# geom_col() is the idiomatic equivalent of geom_bar(stat='identity')
ggplot(df, aes(x=State, y=Murder)) + geom_col(width=0.4) +
  theme(axis.text.x = element_text(angle=70, vjust=0.6))

ggplot(df, aes(x=State, y=Assault)) + geom_col(width=0.4) +
  theme(axis.text.x = element_text(angle=70, vjust=0.6))

ggplot(df, aes(x=State, y=Rape)) + geom_col(width=0.4) +
  theme(axis.text.x = element_text(angle=70, vjust=0.6))

ggplot(df, aes(x=State, y=UrbanPop)) + geom_col(width=0.4) +
  theme(axis.text.x = element_text(angle=70, vjust=0.6))
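The bar charts list states alphabetically, which makes it hard to read off the highest rates directly. A minimal sketch of ranking the states by each crime is shown below. It assumes that R's built-in USArrests dataset matches the CSV loaded above, and it uses the unrounded values rather than the floored ones; the names crimes, top_state, and ranked are introduced here only for illustration.

```r
# Assumption: the built-in USArrests dataset matches the CSV loaded above.
df <- data.frame(State = rownames(USArrests), USArrests,
                 row.names = NULL, stringsAsFactors = FALSE)

crimes <- c("Murder", "Assault", "Rape")

# State with the single highest rate for each crime
top_state <- sapply(crimes, function(crime) df$State[which.max(df[[crime]])])
print(top_state)

# Top five states per crime, sorted from highest to lowest rate
for (crime in crimes) {
  ranked <- df[order(df[[crime]], decreasing = TRUE), c("State", crime)]
  print(head(ranked, 5))
}
```

The same ordering could be applied to the charts themselves by mapping x to reorder(State, -Murder) inside aes(), so that the tallest bars appear first.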

Conclusion

From the data provided, urban population appears to be correlated with murder, assault, and rape cases throughout the country: the higher the urban population in a given state, the more frequently crime occurs. On that basis, the Department of Homeland Security should begin its efforts in highly populated states, such as New York, Georgia, or Florida.
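The strength of this relationship can be checked numerically rather than read off the scatterplots. The sketch below assumes that R's built-in USArrests dataset matches the CSV loaded above. In that dataset's documentation, UrbanPop is the percentage of a state's population living in urban areas and the crime columns are arrests per 100,000 residents, so the coefficients measure association with urbanization rather than with total population. The names crimes and urban_cor are introduced here only for illustration.

```r
# Assumption: the built-in USArrests dataset matches the CSV loaded above.
crimes <- c("Murder", "Assault", "Rape")

# Pearson correlation of each crime rate with UrbanPop.
# Values near 0 indicate a weak linear relationship; values near 1 or -1 a strong one.
urban_cor <- sapply(crimes,
                    function(crime) cor(USArrests[[crime]], USArrests$UrbanPop))
print(round(urban_cor, 2))
```

A coefficient close to zero for a given crime would suggest that UrbanPop alone is a poor guide for prioritizing states for that crime, in which case the ranked crime rates themselves are the more direct evidence.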