“Do stricter firearm control laws help reduce firearm mortality?”
Our task is to compare firearm mortality to whether a State has strict or lax gun laws. This should tell us if more gun laws are related to lower gun deaths.
Here’s an example of the firearm mortality data from the CDC that we
are looking for:
https://www.cdc.gov/nchs/pressroom/sosmap/firearm_mortality/firearm.htm
We’ll access the CDC data using API from here (select the Export button to see API options): https://data.cdc.gov/NCHS/NCHS-VSRR-Quarterly-provisional-estimates-for-sele/489q-934x/about_data
And we’ll categorize each State’s firearm control laws from very strict to very lax on a five-point scale using Every Town Research’s data: https://everytownresearch.org/rankings/
Subsections:
- Loading Libraries
- Loading Data
- Processing Data - Description of Variables
We’re using the tidyverse
to have a cohesive and
consistent ecosystem of libraries to complete Data Science-motivated
projects.
The viridis
package is to make a color scale available
that is easier to read, easier to read by those with colorblindness, and
which also prints well in gray scale.
The httr
and jsonlite
packages are
necessary to process the CDC API data.
# Libraries used
library(tidyverse)
library(viridis)
library(httr)
library(jsonlite)
We’re pulling mortality data from the CDC.
# Loading CDC data
url <- "https://data.cdc.gov/resource/489q-934x.json"
cdc <- GET(url)
cdc <- fromJSON(content(cdc, "text"), flatten=TRUE)
We manually entered into Google Sheets each State’s firearm control laws into a five-point scale between ‘Very Strict’ and ‘Very Lax’ based on categorizations from Everytown Research’s website and read it here from a cvs version.
# Loading Likert data
likert <- read_csv("D608 Story3 - Likert.csv",show_col_types = FALSE)
We need to isolate and clean up the CDC data and merge it with our imported Likert data prepared in Google Sheets.
Here we take just the most recent annual data with firearm mortality
data. I picked Age-adjusted
because those entries matched
with the numbers that were selected for the map of the United States in
the link referenced in the Introduction.
# Isolate and clean up the CDC data
cdc <- cdc %>%
filter(cause_of_death == "Firearm-related injury" &
rate_type == "Age-adjusted" &
year_and_quarter == "2023 Q1" &
time_period == "12 months ending with quarter")
cdc <- pivot_longer(cdc,9:59)
cdc$value <- as.numeric(cdc$value)
cdc <- select(cdc,19:20)
cdc$name <- str_remove(cdc$name, "rate_")
cdc$name <- str_replace_all(cdc$name, "_", " ")
Here we merge the data into a tidy database.
# Merge CDC and Likert data
likert$State <- str_to_lower(likert$State)
cdc <- cdc %>%
rename(State = name) %>%
rename(RATE = value)
df <- inner_join(likert, cdc, by = "State")
Below is a glimpse of the data and a table of the variables with short descriptions:
Variable | xxxx | Definition |
---|---|---|
ST | Two Letter State Code | |
State | Name of State | |
Likert | ‘Very Strict’, ‘Strict’, ‘Moderate’, ‘Lax’ or ‘Very Lax’ | |
RATE | Firearm Mortality per 100k |
# Quick look at the data
glimpse(df)
## Rows: 50
## Columns: 4
## $ ST <chr> "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA", "HI…
## $ State <chr> "alabama", "alaska", "arizona", "arkansas", "california", "colo…
## $ Likert <chr> "Lax", "Very Lax", "Very Lax", "Very Lax", "Very Strict", "Stri…
## $ RATE <dbl> 26.4, 21.4, 20.0, 22.7, 8.6, 17.0, 6.6, 12.1, 14.2, 19.4, 4.1, …
Working on the heatmap I realized I needed the data in a special
format. It’s no longer tidy. Every State needs to be repeated five
times, one for each likert value, with a value of zero in the
RATE
cells that don’t correspond to the State’s actual
likert.
I wasn’t successful in transforming my df
into the
format required for a heatmap and so I was going to do five horizontal
distributions of the rates (maybe box plots) but to meet the assignment
criterion of using a heat map I’ve uploaded my trial data where I made
the necessary manipulations in Google Sheets.
# Loading trial data
dftrial <- read_csv("D608 Story3 - DF.csv",show_col_types = FALSE)
Do stricter gun control laws result in reduced gun violence deaths?
Here we’ve categorized the 50 States into five groups based on an assessment by Everytown Research. We mapped each of Everytown Research’s five categories to our own:
National Leaders -> Very Strict
Making Progress -> Strict
Missing Key Laws -> Moderate
Weak Systems -> Lax
National Failures -> Very Lax
Now we show each grouping of States by gun control laws as a cluster in a heat map where brighter colors indicate higher firearm mortality and darker colors indicate lower firearm mortality. You can see that the laxer the gun laws the brighter and higher the firearm mortality rates are.
It’s good practice to use colors that are distinguishable by people with the various types of colorblindness.
It’s also important to use colors that are perceptually-uniform. Uniform steps in the way a computer represents colors are not uniform steps to human perception, color blindness or not. So a good color scale may have uneven jumps on the computer’s RGB scale but seem proportionate to the change in value represented by the colors.
If we use a computer or rainbow scale we or our readers may be mislead in the proportional differences of the values represented by our color scale.
To think about color imagine the name of the color as the hue (example: Red, Blue, Purple), but every color also has a value which is how light or dark it is, and a saturation which is how dull or intense the color is.
A painter would pick a hue, say Purple, and change the value by tinting it lighter by adding white or shading it darker by adding black. If the painter wanted a duller color, they would add gray, or a little of the opposite if they had already added white or black, to decrease the saturation. This is why paints in a kit are always so bright - you can’t take out the gray so the colors start with no gray, or maximum saturation.
It seems clear that more firearm control laws result in fewer deaths.
As you go from the Very Lax
states in the top right to the
Very Strict
states in the bottom left you can see there are
dimmer and dimmer colors representing lower rates of mortality due to
firearms.
Presumably some gun deaths in a very strict state will be due to people purchasing guns in adjacent lax states. Maybe for future analysis we could look at the data on the county level and assign each county’s gun control category by the most lax laws in any bordering county.
It’s not clear to me that my report format was persuasive. In retrospect I wish I had written this as an article, maybe the article would be an abstract in a mini academic paper format so I could still demonstrate my work.
I went through four iterations of the assignment and am still not
happy with it. I couldn’t figure out how to turn my dataframe into one
with a RATE
for every State based on the Likert with a zero
for every missing RATE
but I bet that is two lines of
code!
If I could keep working on this I would try to pull the Likert data
through web scraping and get better at pipes %>%
to make
my code more elegant.
To help with the heat map I used the R Graph Gallery at https://r-graph-gallery.com/79-levelplot-with-ggplot2.html
For help with selecting an accessible color palette we found the
following resource:
https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html
# Libraries used
library(tidyverse)
library(viridis)
library(httr)
library(jsonlite)
# Loading CDC data
url <- "https://data.cdc.gov/resource/489q-934x.json"
cdc <- GET(url)
cdc <- fromJSON(content(cdc, "text"), flatten=TRUE)
# Loading Likert data
likert <- read_csv("D608 Story3 - Likert.csv",show_col_types = FALSE)
# Isolate and clean up the CDC data
cdc <- cdc %>%
filter(cause_of_death == "Firearm-related injury" &
rate_type == "Age-adjusted" &
year_and_quarter == "2023 Q1" &
time_period == "12 months ending with quarter")
cdc <- pivot_longer(cdc,9:59)
cdc$value <- as.numeric(cdc$value)
cdc <- select(cdc,19:20)
cdc$name <- str_remove(cdc$name, "rate_")
cdc$name <- str_replace_all(cdc$name, "_", " ")
# Merge CDC and Likert data
likert$State <- str_to_lower(likert$State)
cdc <- cdc %>%
rename(State = name) %>%
rename(RATE = value)
df <- inner_join(likert, cdc, by = "State")
# Quick look at the data
glimpse(df)
# Loading trial data
dftrial <- read_csv("D608 Story3 - DF.csv",show_col_types = FALSE)
# Ordering the Likert and States
dftrial$Likert <- factor(dftrial$Likert, levels = c("Very Strict", "Strict", "Moderate", "Lax", "Very Lax"))
dftrial$ST <- factor(dftrial$ST, levels = c("CA", "CT", "HI", "IL", "MD", "MA", "NJ", "NY", "CO", "DE", "MI", "MN", "NV", "NM", "OR", "PA", "RI", "VT", "VA", "WA", "FL", "LA", "ME", "NE", "NC", "WI", "AL", "IN", "IA", "ND", "OH", "SC", "TN", "TX", "UT", "WV", "AK", "AZ", "AR", "GA", "ID", "KS", "KY", "MS", "MO", "MT", "NH", "OK", "SD", "WY"))
# Heatmap
ggplot(dftrial, aes(Likert, ST, fill= RATE)) +
scale_fill_viridis("Mortality") +
geom_tile() +
ggtitle("States' Firearm Mortality Rates clustered by Gun Control Laws") +
xlab("Relative Gun Control Laws") +
ylab("States")