The "Guns" dataset was created to see whether the availability of more guns was related to less crime from 1977 to 1999 in the United States (plus the District of Columbia). Recent events suggest this is not necessarily true: guns in the wrong hands are extremely dangerous and a major factor in the many mass shootings seen in recent years. Commonly cited contributing factors include a lack of mental health awareness, violent video games, family problems or neglect, and social media. I wanted to see whether there was an upward trend in violence over those years, and so whether the current violence in our nation could have been predicted. My question is: is there an upward trend in violence from 1977 to 1999, and does the violence trend in New York match that of the full country (is there a correlation between the two)?
guns_data<-read.csv('https://raw.githubusercontent.com/Sangeetha-007/R-Practice/master/Bridge-Program/GunsProject/Guns.csv')
head(guns_data)
## X year violent murder robbery prisoners afam cauc male population
## 1 1 1977 414.4 14.2 96.8 83 8.384873 55.12291 18.17441 3.780403
## 2 2 1978 419.1 13.3 99.1 94 8.352101 55.14367 17.99408 3.831838
## 3 3 1979 413.3 13.2 109.5 144 8.329575 55.13586 17.83934 3.866248
## 4 4 1980 448.5 13.2 132.1 141 8.408386 54.91259 17.73420 3.900368
## 5 5 1981 470.5 11.9 126.5 149 8.483435 54.92513 17.67372 3.918531
## 6 6 1982 447.7 10.6 112.0 183 8.514000 54.89621 17.51052 3.925229
## income density state law
## 1 9563.148 0.0745524 Alabama no
## 2 9932.000 0.0755667 Alabama no
## 3 9877.028 0.0762453 Alabama no
## 4 9541.428 0.0768288 Alabama no
## 5 9548.351 0.0771866 Alabama no
## 6 9478.919 0.0773185 Alabama no
I removed the fields "afam" (percent of state population that is African-American), "cauc" (percent of state population that is Caucasian, ages 10 to 64), "male" (percent of state population that is male, ages 10 to 29), "income", and "law" because I felt they were irrelevant to my current analysis.
guns_data_cleaned<-guns_data[c(1:6, 10, 12:13)]
head(guns_data_cleaned)
## X year violent murder robbery prisoners population density state
## 1 1 1977 414.4 14.2 96.8 83 3.780403 0.0745524 Alabama
## 2 2 1978 419.1 13.3 99.1 94 3.831838 0.0755667 Alabama
## 3 3 1979 413.3 13.2 109.5 144 3.866248 0.0762453 Alabama
## 4 4 1980 448.5 13.2 132.1 141 3.900368 0.0768288 Alabama
## 5 5 1981 470.5 11.9 126.5 149 3.918531 0.0771866 Alabama
## 6 6 1982 447.7 10.6 112.0 183 3.925229 0.0773185 Alabama
summary(guns_data_cleaned)
## X year violent murder
## Min. : 1 Min. :1977 Min. : 47.0 Min. : 0.200
## 1st Qu.: 294 1st Qu.:1982 1st Qu.: 283.1 1st Qu.: 3.700
## Median : 587 Median :1988 Median : 443.0 Median : 6.400
## Mean : 587 Mean :1988 Mean : 503.1 Mean : 7.665
## 3rd Qu.: 880 3rd Qu.:1994 3rd Qu.: 650.9 3rd Qu.: 9.800
## Max. :1173 Max. :1999 Max. :2921.8 Max. :80.600
## robbery prisoners population density
## Min. : 6.4 Min. : 19.0 Min. : 0.4027 Min. : 0.000707
## 1st Qu.: 71.1 1st Qu.: 114.0 1st Qu.: 1.1877 1st Qu.: 0.031911
## Median : 124.1 Median : 187.0 Median : 3.2713 Median : 0.081569
## Mean : 161.8 Mean : 226.6 Mean : 4.8163 Mean : 0.352038
## 3rd Qu.: 192.7 3rd Qu.: 291.0 3rd Qu.: 5.6856 3rd Qu.: 0.177718
## Max. :1635.1 Max. :1913.0 Max. :33.1451 Max. :11.102120
## state
## Length:1173
## Class :character
## Mode :character
##
##
##
The mean violent crime rate across all states from 1977 to 1999 (an unweighted mean over every state-year row, used here as the national figure):
mean(guns_data_cleaned$violent)
## [1] 503.0747
Creating the subset for New York:
library(dplyr)  # provides %>% and filter()
ny_data <- guns_data_cleaned %>% filter(state == 'New York')
head(ny_data)
## X year violent murder robbery prisoners population density state
## 1 737 1977 831.8 10.7 472.6 98 17.81261 0.3724071 New York
## 2 738 1978 841.0 10.3 472.1 108 17.68059 0.3696471 New York
## 3 739 1979 917.4 11.9 529.6 114 17.58384 0.3676244 New York
## 4 740 1980 1029.5 12.7 641.3 120 17.56675 0.3707865 New York
## 5 741 1981 1069.6 12.3 684.0 123 17.56773 0.3708071 New York
## 6 742 1982 990.1 11.4 610.7 145 17.58975 0.3712718 New York
summary(ny_data)
## X year violent murder
## Min. :737.0 Min. :1977 Min. : 588.8 Min. : 5.00
## 1st Qu.:742.5 1st Qu.:1982 1st Qu.: 841.5 1st Qu.: 9.80
## Median :748.0 Median :1988 Median : 965.6 Median :11.10
## Mean :748.0 Mean :1988 Mean : 941.3 Mean :10.67
## 3rd Qu.:753.5 3rd Qu.:1994 3rd Qu.:1071.5 3rd Qu.:12.50
## Max. :759.0 Max. :1999 Max. :1180.9 Max. :14.50
## robbery prisoners population density
## Min. :240.8 Min. : 98.0 Min. :17.57 Min. :0.3676
## 1st Qu.:472.4 1st Qu.:151.5 1st Qu.:17.72 1st Qu.:0.3729
## Median :514.1 Median :229.0 Median :17.94 Median :0.3787
## Mean :501.8 Mean :244.7 Mean :17.91 Mean :0.3780
## 3rd Qu.:588.1 3rd Qu.:347.0 3rd Qu.:18.14 3rd Qu.:0.3842
## Max. :684.0 Max. :397.0 Max. :18.20 Max. :0.3853
## state
## Length:23
## Class :character
## Mode :character
##
##
##
The mean violent crime rate from 1977 to 1999 in NY (far higher than the national mean of 503.1):
mean(ny_data$violent)
## [1] 941.3174
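To make the comparison concrete, the two means printed above can be divided. This is only a small arithmetic sketch; the variable names `us_mean`, `ny_mean`, `ratio`, and `pct_above` are my own and not part of the original analysis.

```r
# Means copied from the mean() outputs above
us_mean <- 503.0747
ny_mean <- 941.3174

# How many times the national mean is the NY mean?
ratio <- ny_mean / us_mean
round(ratio, 2)      # 1.87

# The same comparison expressed as a percentage above the national mean
pct_above <- (ratio - 1) * 100
round(pct_above)     # 87
```

In other words, NY's mean is about 1.87 times the national mean, or roughly 87% higher.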
A scatter plot of the violent crime rate in NY state over the 23 years I examined (1977 to 1999):
library(ggplot2)  # provides ggplot(), aes(), and geom_point()
ny_data_plot <- ggplot(data = ny_data, mapping = aes(x = year, y = violent)) + geom_point()
print(ny_data_plot)
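A scatter plot shows the shape of the series, but the question of an "upward trend" can also be checked numerically. Below is a minimal sketch that fits a least-squares line of `violent` on `year` and returns its slope. The helper name `trend_slope` and the toy data frame are assumptions made for illustration; the toy values are made up and are not taken from the Guns dataset.

```r
# Hypothetical helper: slope of a least-squares line of violent ~ year.
# A positive slope suggests an upward trend, a negative slope a downward one.
trend_slope <- function(df) {
  fit <- lm(violent ~ year, data = df)
  unname(coef(fit)["year"])
}

# Toy data (made up, NOT from the Guns dataset) just to show the call:
# the rate rises by exactly 10 each year.
toy <- data.frame(year = 1977:1981,
                  violent = c(100, 110, 120, 130, 140))
toy_slope <- trend_slope(toy)
toy_slope  # 10
```

With the objects created above, the call would be `trend_slope(ny_data)`. A single slope is a rough summary: if a series rises and then falls, the fitted line can hide that, so it is best read alongside the plot.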
# Box plots of the violent crime rate: all states, then New York only
boxplot(guns_data_cleaned$violent, ylab="Violent crime rate",
main="Boxplot of Violence Frequency in the United States", col="red")
boxplot(ny_data$violent, ylab="Violent crime rate",
main="Boxplot of Violence Frequency in NY", col="blue")
hist(guns_data_cleaned$violent)
hist(ny_data$violent)
Side-by-side scatter plot comparison of New York with the United States as a whole (the United States plot is on the bottom):
states_guns_plot <- ggplot(data = guns_data_cleaned, mapping = aes(x = year, y = violent))+ geom_point()
print(states_guns_plot)
According to my analysis, the trend in violence in New York State is not fully correlated with the national trend. New York State's mean violent crime rate was far higher than the national mean (about 1.87 times as high). In my last scatter plot (the one right above), you can see drops in NY's rate at times when the national rate increased, and vice versa. Before doing this analysis, I believed violence would rise steadily in both New York State and the country as a whole as population increased over time, but I was wrong. I would need outside data, or possibly a look at trends in population density instead, before I can see a clearer correlation.
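The correlation discussed here was judged by eye from the plots. One way to put a number on it, sketched below, is to average the violent crime rate across states for each year and correlate that yearly series with one state's series. The helper name `state_vs_national_cor` and the toy data are assumptions for illustration only; the toy values are made up and are not from the Guns dataset. Note that the yearly mean is unweighted and includes the state being compared.

```r
# Hypothetical helper: Pearson correlation between one state's yearly
# violent crime rate and the yearly mean across all states in `df`.
state_vs_national_cor <- function(df, state_name) {
  # Unweighted mean of the state rates for each year
  national <- aggregate(violent ~ year, data = df, FUN = mean)
  names(national)[2] <- "national_violent"

  # Rows for the chosen state, matched to the national series by year
  one_state <- df[df$state == state_name, c("year", "violent")]
  both <- merge(one_state, national, by = "year")

  cor(both$violent, both$national_violent)
}

# Toy data (made up, NOT from the Guns dataset) to show the call
toy <- data.frame(
  year    = rep(1977:1980, times = 2),
  state   = rep(c("A", "B"), each = 4),
  violent = c(10, 20, 30, 40,   # state A rises steadily
              40, 20, 40, 20)   # state B alternates
)
r_toy <- state_vs_national_cor(toy, "A")
r_toy  # 0.6
```

With the data frame from this analysis, the call would be `state_vs_national_cor(guns_data_cleaned, "New York")`. A value near 1 would mean NY moves with the national average, a value near 0 would mean little linear relationship, and a negative value would mean they tend to move in opposite directions.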