title: "Exercise 3 - Quant Methods"
author: "Jayhan"
date: "03/10/2020"
output: html_document

We will be looking at crime and gun laws in the USA.

The original data can be found here: https://vincentarelbundock.github.io/Rdatasets/datasets.html

library(dplyr)
Guns <- read.csv("Guns.csv")

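#Create variables: each crime column divided by state population
#(if the crime columns are already rates per 100,000, as in the AER documentation for this dataset, these are not true per-capita counts)
Guns <- Guns %>%
  mutate(murderpc = murder/population, violentpc = violent/population, robberypc = robbery/population)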

head(Guns)
##   X year violent murder robbery prisoners     afam     cauc     male population
## 1 1 1977   414.4   14.2    96.8        83 8.384873 55.12291 18.17441   3.780403
## 2 2 1978   419.1   13.3    99.1        94 8.352101 55.14367 17.99408   3.831838
## 3 3 1979   413.3   13.2   109.5       144 8.329575 55.13586 17.83934   3.866248
## 4 4 1980   448.5   13.2   132.1       141 8.408386 54.91259 17.73420   3.900368
## 5 5 1981   470.5   11.9   126.5       149 8.483435 54.92513 17.67372   3.918531
## 6 6 1982   447.7   10.6   112.0       183 8.514000 54.89621 17.51052   3.925229
##     income   density   state law murderpc violentpc robberypc
## 1 9563.148 0.0745524 Alabama  no 3.756213  109.6179  25.60574
## 2 9932.000 0.0755667 Alabama  no 3.470919  109.3731  25.86226
## 3 9877.028 0.0762453 Alabama  no 3.414163  106.8995  28.32203
## 4 9541.428 0.0768288 Alabama  no 3.384296  114.9891  33.86860
## 5 9548.351 0.0771866 Alabama  no 3.036852  120.0705  32.28251
## 6 9478.919 0.0773185 Alabama  no 2.700479  114.0570  28.53337
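
A quick check on units may help when reading these columns. The sketch below assumes the units given in the AER package documentation for this dataset (crime columns as incidents per 100,000 residents, `population` in millions), which is worth verifying against the source. The variable names are illustrative, and the two input values are copied from the first row of the output above.

```r
# Values copied from the first row of head(Guns) (Alabama, 1977).
murder_rate_per_100k <- 14.2      # assumed unit: incidents per 100,000 residents
population_millions  <- 3.780403  # assumed unit: millions of people

# Approximate number of murders implied by the rate and the population.
murder_count <- murder_rate_per_100k / 1e5 * population_millions * 1e6

# The murderpc column is the rate divided by the population in millions.
murderpc <- murder_rate_per_100k / population_millions

print(round(murder_count))
print(round(murderpc, 6))
```

Under these assumptions, `murderpc` is a rate scaled by state size rather than a count per person, so comparisons between large and small states should be read with care.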

#First I am going to look at this as a time series.

library(ggplot2)
ggplot(Guns, aes(x=year, y=murderpc, color = law))+
geom_point()+
theme_minimal()+
xlab("Years")+
ylab("Murder per capita in the US")+
geom_smooth(method = "lm", se = FALSE)+
ggtitle("Gun law and crime in USA") 
## `geom_smooth()` using formula 'y ~ x'

The plot is fairly cluttered; however, the fitted lines suggest that murder per capita was higher for states without gun laws. No formal test was run, so this is a visual impression only.
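
One way to make a cluttered time-series plot easier to read is to average within each year and law group before plotting. The following is a minimal sketch on a hypothetical toy data frame (`toy`, with made-up values), not on the real `Guns` data; it uses base R's `aggregate()`.

```r
# Hypothetical toy data standing in for Guns (values are made up).
toy <- data.frame(
  year     = rep(c(1977, 1978), each = 4),
  law      = rep(c("no", "no", "yes", "yes"), times = 2),
  murderpc = c(3.0, 5.0, 1.0, 2.0, 4.0, 6.0, 2.0, 2.0)
)

# Mean murderpc for each combination of year and law.
yearly <- aggregate(murderpc ~ year + law, data = toy, FUN = mean)
print(yearly)
```

A table like `yearly` has one row per year and group, so it could be drawn with `geom_line()` instead of `geom_point()` to show two lines rather than a cloud of points.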

I also decided to look at the year 1999 in more detail.

Guns1999 = filter(Guns, year==1999)
#Looking at the data we can see some outliers, so we remove two specific rows (positions 2 and 9) rather than trimming by percentile
Guns99 = Guns1999[-c(2,9),]
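
Removing rows by position depends on the row order staying the same. A possible alternative, sketched here on a hypothetical toy data frame with placeholder state names and made-up values, is to drop rows by name so the choice of outliers is explicit.

```r
# Hypothetical toy data; state names and values are placeholders.
toy1999 <- data.frame(
  state    = c("A", "B", "C", "D"),
  murderpc = c(1.2, 9.8, 1.5, 0.9)
)

# Hypothetical list of states to exclude as outliers.
drop_states <- c("B")

# Keep every row whose state is not in the exclusion list.
toy99 <- toy1999[!(toy1999$state %in% drop_states), ]
print(toy99)
```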

Now we will plot violent crime per capita against murder per capita, with a separate linear fit for states with and without a gun law.

library(ggplot2)
ggplot(Guns99, aes(x=murderpc, y=violentpc, color = law))+
geom_point()+
theme_minimal()+
xlab("Murder per capita in the US")+
ylab("Violent crime per capita in the US")+
geom_smooth(method = "lm", se = FALSE)+
ggtitle("Gun law and crime in USA in 1999") 
## `geom_smooth()` using formula 'y ~ x'

When looking at the plot, we can observe fewer violent crimes per capita in states with gun laws; however, there were more observations of high murder per capita in states with gun laws in 1999. This is an association in a single year, not evidence that gun laws caused the difference. To take this analysis further, we could include other variables such as socioeconomic status, which could explain higher crime better than gun laws do.
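
As a rough sketch of that next step, a linear model could include an income measure alongside the law indicator. The example below runs on simulated data (all values are made up, and `violentpc` is generated with no law effect by construction), so it only illustrates the model syntax, not any result about the real data. It assumes `income` is an acceptable stand-in for socioeconomic status.

```r
set.seed(1)
n <- 50

# Simulated toy data: a law indicator and an income variable.
toy <- data.frame(
  law    = factor(sample(c("no", "yes"), n, replace = TRUE)),
  income = rnorm(n, mean = 10000, sd = 1500)
)

# Simulated outcome that depends on income only, plus noise.
toy$violentpc <- 150 - 0.005 * toy$income + rnorm(n, sd = 10)

# Linear model with the law indicator and income as explanatory variables.
fit <- lm(violentpc ~ law + income, data = toy)
print(summary(fit)$coefficients)
```

The `lawyes` coefficient is the estimated difference between the two groups after adjusting for income, which is the comparison the paragraph above asks for.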