Issue Description

I want to investigate gun-involved deaths in the United States between 2012 and 2014.

Questions

Are men more likely to die from gun violence than women? Are minorities more likely to die from gun violence than others?

Data Source

https://www.kaggle.com

Documentation

https://www.kaggle.com/hakabuk/gun-deaths-in-the-us

Description of the Data

Use the tools in R such as str() and summary() to describe the original dataset you imported.

library(tidyverse)
## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Conflicts with tidy packages ----------------------------------------------
## filter(): dplyr, stats
## lag():    dplyr, stats
library(readr)
library(ggplot2)
library(dplyr)
GunDeaths12_14 <- read_csv("C:/Users/Samuel.Bradford/Desktop/GunDeaths12-14.zip")
## Warning: Missing column names filled in: 'X1' [1]
## Parsed with column specification:
## cols(
##   X1 = col_integer(),
##   year = col_integer(),
##   month = col_character(),
##   intent = col_character(),
##   police = col_integer(),
##   sex = col_character(),
##   age = col_integer(),
##   race = col_character(),
##   hispanic = col_integer(),
##   place = col_character(),
##   education = col_integer()
## )
summary(GunDeaths12_14)
##        X1              year         month              intent         
##  Min.   :     1   Min.   :2012   Length:100798      Length:100798     
##  1st Qu.: 25200   1st Qu.:2012   Class :character   Class :character  
##  Median : 50400   Median :2013   Mode  :character   Mode  :character  
##  Mean   : 50400   Mean   :2013                                        
##  3rd Qu.: 75599   3rd Qu.:2014                                        
##  Max.   :100798   Max.   :2014                                        
##                                                                       
##      police            sex                 age             race          
##  Min.   :0.00000   Length:100798      Min.   :  0.00   Length:100798     
##  1st Qu.:0.00000   Class :character   1st Qu.: 27.00   Class :character  
##  Median :0.00000   Mode  :character   Median : 42.00   Mode  :character  
##  Mean   :0.01391                      Mean   : 43.86                     
##  3rd Qu.:0.00000                      3rd Qu.: 58.00                     
##  Max.   :1.00000                      Max.   :107.00                     
##                                       NA's   :18                         
##     hispanic        place             education    
##  Min.   :100.0   Length:100798      Min.   :1.000  
##  1st Qu.:100.0   Class :character   1st Qu.:2.000  
##  Median :100.0   Mode  :character   Median :2.000  
##  Mean   :114.2                      Mean   :2.296  
##  3rd Qu.:100.0                      3rd Qu.:3.000  
##  Max.   :998.0                      Max.   :5.000  
##                                     NA's   :53
str(GunDeaths12_14)
## Classes 'tbl_df', 'tbl' and 'data.frame':    100798 obs. of  11 variables:
##  $ X1       : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ year     : int  2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 ...
##  $ month    : chr  "01" "01" "01" "02" ...
##  $ intent   : chr  "Suicide" "Suicide" "Suicide" "Suicide" ...
##  $ police   : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ sex      : chr  "M" "F" "M" "M" ...
##  $ age      : int  34 21 60 64 31 17 48 41 50 NA ...
##  $ race     : chr  "Asian/Pacific Islander" "White" "White" "White" ...
##  $ hispanic : int  100 100 100 100 100 100 100 100 100 998 ...
##  $ place    : chr  "Home" "Street" "Other specified" "Home" ...
##  $ education: int  4 3 4 4 2 1 2 2 3 5 ...
##  - attr(*, "spec")=List of 2
##   ..$ cols   :List of 11
##   .. ..$ X1       : list()
##   .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
##   .. ..$ year     : list()
##   .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
##   .. ..$ month    : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ intent   : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ police   : list()
##   .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
##   .. ..$ sex      : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ age      : list()
##   .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
##   .. ..$ race     : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ hispanic : list()
##   .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
##   .. ..$ place    : list()
##   .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
##   .. ..$ education: list()
##   .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
##   ..$ default: list()
##   .. ..- attr(*, "class")= chr  "collector_guess" "collector"
##   ..- attr(*, "class")= chr "col_spec"

Cleaning and Preparation

Describe the steps you took to get from your original dataset to the final dataset you used for your analysis. Include the R code in chunks.

# Keep only the columns used in the analysis and drop rows with a
# missing age, intent, or education value.
GunDeaths12_14 <- GunDeaths12_14 %>%
  select(year, intent, sex, age, race, education) %>%
  filter(!is.na(age), !is.na(intent), !is.na(education))

summary(GunDeaths12_14)
##       year         intent              sex                 age        
##  Min.   :2012   Length:100726      Length:100726      Min.   :  0.00  
##  1st Qu.:2012   Class :character   Class :character   1st Qu.: 27.00  
##  Median :2013   Mode  :character   Mode  :character   Median : 42.00  
##  Mean   :2013                                         Mean   : 43.87  
##  3rd Qu.:2014                                         3rd Qu.: 58.00  
##  Max.   :2014                                         Max.   :107.00  
##      race             education    
##  Length:100726      Min.   :1.000  
##  Class :character   1st Qu.:2.000  
##  Mode  :character   Median :2.000  
##                     Mean   :2.296  
##                     3rd Qu.:3.000  
##                     Max.   :5.000
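
The cleaning step reduces the dataset from 100,798 to 100,726 rows, so 72 rows are dropped. The first summary() output reports 18 missing ages and 53 missing education values, but summary() does not report NA counts for character columns such as intent, so counting the missing values per column directly makes the accounting explicit. The sketch below uses a small made-up data frame (`toy` and its values are hypothetical, not taken from the real file) and base R only:

```r
# Hypothetical toy data standing in for the real file; values are made up.
toy <- data.frame(
  age       = c(34, NA, 60, 17),
  intent    = c("Suicide", "Homicide", NA, "Suicide"),
  education = c(4, 3, 2, NA),
  stringsAsFactors = FALSE
)

# Number of missing values in each column, including character columns.
na_per_column <- colSums(is.na(toy))

# Keep rows that are complete in the three columns filtered during cleaning.
kept <- toy[complete.cases(toy[, c("age", "intent", "education")]), ]

# Rows removed by the filter.
rows_dropped <- nrow(toy) - nrow(kept)

print(na_per_column)
print(rows_dropped)
```

Running `colSums(is.na(...))` on the imported data before filtering would show how the 72 dropped rows split across age, intent, and education.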

Final Results

Show how you approached the questions you posed at the beginning. Describe how much you were able to accomplish. There should be both graphical and numerical results produced by R code included in chunks. Explain what you did and what it means.

g1 = ggplot(data = GunDeaths12_14, aes(x=sex)) +
  geom_bar(aes(fill = intent)) + 
  ggtitle("Gun deaths by gender") +
  theme(axis.text.x = element_text(size = 6, color="#993333", 
                            angle=45))
g1

table(GunDeaths12_14$sex, GunDeaths12_14$intent)
##    
##     Accidental Homicide Suicide Undetermined
##   F        215     5356    8687          169
##   M       1410    29777   54475          637

This shows that men are far more likely to die from gun violence than women: men account for about six times as many gun deaths as women (86,299 versus 14,427 of the 100,726 observations). Suicide accounts for 63.12% of gun deaths among men and 60.21% among women. Surprisingly, 37.12% of gun deaths among women are homicides, compared with only 34.5% among men.
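
The totals and percentages in this paragraph can be recomputed from the table above. This is a minimal sketch in base R, with the counts typed in by hand from the printed output; the object names `counts`, `totals`, `ratio_m_to_f`, and `intent_share` are illustrative and not part of the original analysis:

```r
# Counts copied from the table(sex, intent) output above.
counts <- matrix(
  c( 215,  5356,  8687, 169,
    1410, 29777, 54475, 637),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    sex    = c("F", "M"),
    intent = c("Accidental", "Homicide", "Suicide", "Undetermined")
  )
)

# Total gun deaths per sex and the male-to-female ratio of those totals.
totals <- rowSums(counts)
ratio_m_to_f <- totals[["M"]] / totals[["F"]]

# Share of each intent within each sex, in percent (each row sums to 100).
intent_share <- 100 * prop.table(counts, margin = 1)

print(totals)
print(round(ratio_m_to_f, 2))
print(round(intent_share, 2))
```

With the real data loaded, `prop.table(table(GunDeaths12_14$sex, GunDeaths12_14$intent), margin = 1)` should give the same shares without retyping the counts.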

# Relabel the race categories used in the plot and table.
GunDeaths12_14$race[GunDeaths12_14$race == "Native American/Native Alaskan"] <- "Native American"
GunDeaths12_14$race[GunDeaths12_14$race == "Black"] <- "African American"
GunDeaths12_14$race[GunDeaths12_14$race == "White"] <- "Caucasian"

g2 = ggplot(data = GunDeaths12_14, aes(x=race)) +
  geom_bar(aes(fill = intent)) + 
  ggtitle("Gun deaths by race") +
  theme(axis.text.x = element_text(size = 6, color="#993333", 
                            angle=0))


table(GunDeaths12_14$race, GunDeaths12_14$intent)
##                         
##                          Accidental Homicide Suicide Undetermined
##   African American              321    19498    3331          125
##   Asian/Pacific Islander         12      557     745           10
##   Caucasian                    1126     9125   55363          585
##   Hispanic                      145     5628    3169           72
##   Native American                21      325     554           14
g2

This shows that Caucasian individuals account for more gun deaths than all minority groups combined. Between 2012 and 2014, out of 100,726 gun deaths, 65.72% involved Caucasian males or females, while 34.28% involved minority males and females. These are shares of raw counts and are not adjusted for the size of each population group, so on their own they do not show which group faces the higher per-capita risk. Also worth noting is the large share of homicides among gun deaths of African Americans and Hispanics. Of the 23,275 gun deaths among African Americans, 83.77% were homicides, and 62.44% of Hispanic gun deaths were homicides.
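
The shares quoted here come from two different denominators: each group's share of all gun deaths uses the grand total, while the homicide share uses the group's own total. A minimal sketch in base R, with counts copied by hand from the table above (the names `race_counts`, `race_totals`, `pct_of_all`, and `pct_homicide` are illustrative):

```r
# Counts copied from the table(race, intent) output above.
race_counts <- matrix(
  c( 321, 19498,  3331, 125,
      12,   557,   745,  10,
    1126,  9125, 55363, 585,
     145,  5628,  3169,  72,
      21,   325,   554,  14),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    race   = c("African American", "Asian/Pacific Islander", "Caucasian",
               "Hispanic", "Native American"),
    intent = c("Accidental", "Homicide", "Suicide", "Undetermined")
  )
)

# Gun deaths per group, summed over all intents.
race_totals <- rowSums(race_counts)

# Each group's share of all gun deaths (denominator: grand total).
pct_of_all <- 100 * race_totals / sum(race_totals)

# Share of each group's gun deaths that were homicides
# (denominator: that group's own total).
pct_homicide <- 100 * race_counts[, "Homicide"] / race_totals

print(round(data.frame(deaths = race_totals, pct_of_all, pct_homicide), 2))
```

Turning these counts into per-capita rates would additionally require population figures for each group, which this dataset does not contain.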

# Replace the numeric education codes with descriptive labels
# (the first assignment coerces the column from integer to character).
GunDeaths12_14$education[GunDeaths12_14$education == 1] <- "Less than HS"
GunDeaths12_14$education[GunDeaths12_14$education == 2] <- "Graduated HS"
GunDeaths12_14$education[GunDeaths12_14$education == 3] <- "Some college"
GunDeaths12_14$education[GunDeaths12_14$education == 4] <- "College graduate"
GunDeaths12_14$education[GunDeaths12_14$education == 5] <- "Not Available"

g3 = ggplot(data = GunDeaths12_14, aes(x=education)) +
  geom_bar(aes(fill = intent)) + 
  ggtitle("Gun deaths by education level") +
  theme(axis.text.x = element_text(size = 6, color="#993333", 
                            angle=0))

table(GunDeaths12_14$education, GunDeaths12_14$intent)
##                   
##                    Accidental Homicide Suicide Undetermined
##   College graduate        146     1559   11147           93
##   Graduated HS            633    15649   26321          324
##   Less than HS            492    11838    9291          200
##   Not Available            27      447     871            9
##   Some college            327     5640   15532          180
g3

Based on the data, high school graduates account for almost twice as many gun deaths as any other education group. High school graduates represent 42.6% of all gun-involved deaths (43.2% of those with a recorded education level). Surprisingly, individuals with less than a high school education and individuals with some college experience have almost identical numbers of gun-involved deaths (21,821 and 21,679).
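
The high school graduate share depends on whether the "Not Available" rows are kept in the denominator. The following base R sketch, with counts copied by hand from the table above, computes it both ways (the names `edu_counts`, `edu_totals`, `share_all`, `share_known`, and `ratio_to_next` are illustrative):

```r
# Counts copied from the table(education, intent) output above.
edu_counts <- matrix(
  c(146,  1559, 11147,  93,
    633, 15649, 26321, 324,
    492, 11838,  9291, 200,
     27,   447,   871,   9,
    327,  5640, 15532, 180),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    education = c("College graduate", "Graduated HS", "Less than HS",
                  "Not Available", "Some college"),
    intent    = c("Accidental", "Homicide", "Suicide", "Undetermined")
  )
)

edu_totals <- rowSums(edu_counts)
hs <- edu_totals[["Graduated HS"]]

# Share of all gun deaths, including rows with no recorded education.
share_all <- 100 * hs / sum(edu_totals)

# Share of gun deaths with a recorded education level only.
known <- edu_totals[names(edu_totals) != "Not Available"]
share_known <- 100 * hs / sum(known)

# High school graduates relative to the next-largest group.
ratio_to_next <- hs / max(edu_totals[names(edu_totals) != "Graduated HS"])

print(edu_totals)
print(round(c(share_all = share_all, share_known = share_known,
              ratio_to_next = ratio_to_next), 2))
```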