library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.1 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.3 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(dplyr)
The article describes how a group of statisticians built interactive graphs of gun deaths in the USA and how they categorized the causes behind these deaths. They note that what we tend to focus on the most is, interestingly, the least common cause of gun deaths. They then point out that suicide is the leading cause of gun deaths in the USA, particularly among men.
gun_deaths <- read.csv("https://github.com/fivethirtyeight/guns-data/raw/master/full_data.csv")
ggplot(gun_deaths, aes(x = intent, y = age, color = intent)) +
  geom_point() +
  labs(title = "Gun Deaths by Intent and Age", x = "Intent", y = "Age")
# Restrict to suicides so the data match the plot titles
# (the bars show counts, not rates)
suicides <- gun_deaths %>%
  filter(intent == "Suicide")
ggplot(suicides, aes(x = sex, fill = sex)) +
  geom_bar() +
  labs(title = "Gun Suicides by Sex", x = "Sex", y = "Count")
ggplot(suicides, aes(x = race, fill = race)) +
  geom_bar() +
  labs(title = "Gun Suicides by Race", x = "Race", y = "Count")
ggplot(suicides, aes(x = education, fill = education)) +
  geom_bar() +
  labs(title = "Gun Suicides by Education", x = "Education", y = "Count")
data2 <- gun_deaths %>%
  mutate(
    month = as.numeric(month), # Convert month to numeric
    # Group months into meteorological seasons
    season = case_when(
      month %in% c(12, 1, 2) ~ "Winter",
      month %in% c(3, 4, 5) ~ "Spring",
      month %in% c(6, 7, 8) ~ "Summer",
      month %in% c(9, 10, 11) ~ "Fall"
    )
  )
ggplot(data2, aes(x = season, fill = season)) +
  geom_bar() +
  labs(title = "Gun Deaths by Season", x = "Season", y = "Count")
The bar chart above shows gun deaths by season. There is no obvious visual pattern of gun deaths being higher or lower in any particular season.
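Because a bar chart makes small differences hard to judge, the same comparison can also be read from a table of counts. The following is a minimal sketch; `season_share()` is a hypothetical helper name, and the toy rows are made-up values used only to show the shape of the output, not values from the gun deaths data.

```r
library(dplyr)

# Hypothetical helper: count rows per season and add each season's share.
season_share <- function(df) {
  df %>%
    count(season, name = "deaths") %>%
    mutate(share = deaths / sum(deaths))
}

# Toy input (made-up rows) standing in for a data frame with a `season` column.
toy <- data.frame(
  season = c("Winter", "Winter", "Spring", "Summer", "Summer", "Fall", "Fall", "Fall")
)
season_share(toy)
```

Applied to the report's data, the call would be `season_share(data2)`.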
# Keep suicides only
data2_suicides <- data2 %>%
  filter(intent == "Suicide")
# Create the bar chart (bars show counts, not rates)
ggplot(data2_suicides, aes(x = season, fill = season)) +
  geom_bar() +
  labs(title = "Gun Suicides by Season", x = "Season", y = "Count")
The graph above shows only suicides by season. Again, there is no clear pattern of suicides occurring more often in one particular season. In fact, contrary to what many might expect, there are fewer gun suicides in winter than in the other seasons.
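One caveat when comparing raw counts is that the seasons are not equally long: with this month grouping, winter has 90 days in a non-leap year, while spring and summer have 92 and fall has 91. The sketch below divides counts by days per season to give a per-day figure. `deaths_per_day()` and its `n_years` parameter are hypothetical names, leap days are ignored, and the toy rows are made up to show that equal daily rates can produce unequal counts.

```r
library(dplyr)

# Days per season in a non-leap year, using the same month grouping as the report.
season_days <- c(
  Winter = 31 + 31 + 28, # Dec, Jan, Feb
  Spring = 31 + 30 + 31, # Mar, Apr, May
  Summer = 30 + 31 + 31, # Jun, Jul, Aug
  Fall   = 30 + 31 + 30  # Sep, Oct, Nov
)

# Hypothetical helper: deaths per calendar day in each season.
# n_years is an assumption supplied by the caller: the number of years the data span.
deaths_per_day <- function(df, n_years = 1) {
  df %>%
    count(season, name = "deaths") %>%
    mutate(per_day = deaths / (unname(season_days[season]) * n_years))
}

# Toy input (made-up rows): one death per day in winter and in spring.
toy <- data.frame(season = rep(c("Winter", "Spring"), times = c(90, 92)))
deaths_per_day(toy)
```

In the toy data, winter has fewer deaths than spring (90 versus 92) even though the per-day figure is identical, so a small winter shortfall in the chart should be read with the shorter season in mind.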
# Keep male suicides only (a separate name avoids overwriting data2_suicides)
data2_suicides_male <- data2 %>%
  filter(intent == "Suicide", sex == "M")
# Create the bar chart
ggplot(data2_suicides_male, aes(x = season, fill = season)) +
  geom_bar() +
  labs(title = "Gun Suicides by Season, Males Only", x = "Season", y = "Count")
The graph above shows the same seasonal breakdown, restricted to males.
# Keep suicides among White victims only
data2_suicides_white <- data2 %>%
  filter(intent == "Suicide", race == "White")
# Create the bar chart
ggplot(data2_suicides_white, aes(x = season, fill = season)) +
  geom_bar() +
  labs(title = "Gun Suicides by Season, Whites Only", x = "Season", y = "Count")
The graph above shows the same seasonal breakdown, restricted to White victims.
Taken together, the graphs indicate no apparent relationship between season and the number of gun suicides.