Introduction

“Do stricter firearm control laws help reduce firearm mortality?”

Our task is to compare firearm mortality to whether a State has strict or lax gun laws. This should tell us if more gun laws are related to lower gun deaths.

Here’s an example of the firearm mortality data from the CDC that we are looking for:
https://www.cdc.gov/nchs/pressroom/sosmap/firearm_mortality/firearm.htm

We’ll access the CDC data using API from here (select the Export button to see API options): https://data.cdc.gov/NCHS/NCHS-VSRR-Quarterly-provisional-estimates-for-sele/489q-934x/about_data

And we’ll categorize each State’s firearm control laws from very strict to very lax on a five-point scale using Every Town Research’s data: https://everytownresearch.org/rankings/

Subsections:
- Loading Libraries
- Loading Data
- Processing Data - Description of Variables


Loading Libraries

We’re using the tidyverse to have a cohesive and consistent ecosystem of libraries to complete Data Science-motivated projects.

The viridis package is to make a color scale available that is easier to read, easier to read by those with colorblindness, and which also prints well in gray scale.

The httr and jsonlite packages are necessary to process the CDC API data.

# Libraries used
library(tidyverse)
library(viridis)
library(httr)
library(jsonlite)


Loading Data

We’re pulling mortality data from the CDC.

# Loading CDC data
url <- "https://data.cdc.gov/resource/489q-934x.json"
cdc <- GET(url)
cdc <- fromJSON(content(cdc, "text"), flatten=TRUE)

We manually entered into Google Sheets each State’s firearm control laws into a five-point scale between ‘Very Strict’ and ‘Very Lax’ based on categorizations from Everytown Research’s website and read it here from a cvs version.

# Loading Likert data
likert <- read_csv("D608 Story3 - Likert.csv",show_col_types = FALSE)

Processing Data

We need to isolate and clean up the CDC data and merge it with our imported Likert data prepared in Google Sheets.

Clean up CDC Data

Here we take just the most recent annual data with firearm mortality data. I picked Age-adjusted because those entries matched with the numbers that were selected for the map of the United States in the link referenced in the Introduction.

# Isolate and clean up the CDC data
cdc <- cdc %>% 
  filter(cause_of_death == "Firearm-related injury" &
           rate_type == "Age-adjusted" &
           year_and_quarter == "2023 Q1" &
           time_period == "12 months ending with quarter")
cdc <- pivot_longer(cdc,9:59)
cdc$value <- as.numeric(cdc$value)
cdc <- select(cdc,19:20)
cdc$name <- str_remove(cdc$name, "rate_")
cdc$name <- str_replace_all(cdc$name, "_", " ")

Merge Data

Here we merge the data into a tidy database.

# Merge CDC and Likert data
likert$State <- str_to_lower(likert$State)
cdc <- cdc %>% 
  rename(State = name) %>% 
  rename(RATE = value)
df <- inner_join(likert, cdc, by = "State")


Description of Variables

Below is a glimpse of the data and a table of the variables with short descriptions:

Variable xxxx Definition
ST Two Letter State Code
State Name of State
Likert ‘Very Strict’, ‘Strict’, ‘Moderate’, ‘Lax’ or ‘Very Lax’
RATE Firearm Mortality per 100k


# Quick look at the data
glimpse(df)
## Rows: 50
## Columns: 4
## $ ST     <chr> "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA", "HI…
## $ State  <chr> "alabama", "alaska", "arizona", "arkansas", "california", "colo…
## $ Likert <chr> "Lax", "Very Lax", "Very Lax", "Very Lax", "Very Strict", "Stri…
## $ RATE   <dbl> 26.4, 21.4, 20.0, 22.7, 8.6, 17.0, 6.6, 12.1, 14.2, 19.4, 4.1, …


Pivot in the Approach

Working on the heatmap I realized I needed the data in a special format. It’s no longer tidy. Every State needs to be repeated five times, one for each likert value, with a value of zero in the RATE cells that don’t correspond to the State’s actual likert.

I wasn’t successful in transforming my df into the format required for a heatmap and so I was going to do five horizontal distributions of the rates (maybe box plots) but to meet the assignment criterion of using a heat map I’ve uploaded my trial data where I made the necessary manipulations in Google Sheets.

# Loading trial data
dftrial <- read_csv("D608 Story3 - DF.csv",show_col_types = FALSE)



§1 Report


Do stricter gun control laws result in reduced gun violence deaths?

Here we’ve categorized the 50 States into five groups based on an assessment by Everytown Research. We mapped each of Everytown Research’s five categories to our own:

National Leaders -> Very Strict
Making Progress -> Strict
Missing Key Laws -> Moderate
Weak Systems -> Lax
National Failures -> Very Lax


Heatmap Representation

Now we show each grouping of States by gun control laws as a cluster in a heat map where brighter colors indicate higher firearm mortality and darker colors indicate lower firearm mortality. You can see that the laxer the gun laws the brighter and higher the firearm mortality rates are.


A Note on Accessible Colors


It’s good practice to use colors that are distinguishable by people with the various types of colorblindness.

It’s also important to use colors that are perceptually-uniform. Uniform steps in the way a computer represents colors are not uniform steps to human perception, color blindness or not. So a good color scale may have uneven jumps on the computer’s RGB scale but seem proportionate to the change in value represented by the colors.

If we use a computer or rainbow scale we or our readers may be mislead in the proportional differences of the values represented by our color scale.

To think about color imagine the name of the color as the hue (example: Red, Blue, Purple), but every color also has a value which is how light or dark it is, and a saturation which is how dull or intense the color is.

A painter would pick a hue, say Purple, and change the value by tinting it lighter by adding white or shading it darker by adding black. If the painter wanted a duller color, they would add gray, or a little of the opposite if they had already added white or black, to decrease the saturation. This is why paints in a kit are always so bright - you can’t take out the gray so the colors start with no gray, or maximum saturation.




Postamble

It seems clear that more firearm control laws result in fewer deaths. As you go from the Very Lax states in the top right to the Very Strict states in the bottom left you can see there are dimmer and dimmer colors representing lower rates of mortality due to firearms.

Presumably some gun deaths in a very strict state will be due to people purchasing guns in adjacent lax states. Maybe for future analysis we could look at the data on the county level and assign each county’s gun control category by the most lax laws in any bordering county.


Self Critique

It’s not clear to me that my report format was persuasive. In retrospect I wish I had written this as an article, maybe the article would be an abstract in a mini academic paper format so I could still demonstrate my work.

I went through four iterations of the assignment and am still not happy with it. I couldn’t figure out how to turn my dataframe into one with a RATE for every State based on the Likert with a zero for every missing RATE but I bet that is two lines of code!

If I could keep working on this I would try to pull the Likert data through web scraping and get better at pipes %>% to make my code more elegant.


References

To help with the heat map I used the R Graph Gallery at https://r-graph-gallery.com/79-levelplot-with-ggplot2.html

For help with selecting an accessible color palette we found the following resource:
https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html




Code Appendix


# Libraries used
library(tidyverse)
library(viridis)
library(httr)
library(jsonlite)

# Loading CDC data
url <- "https://data.cdc.gov/resource/489q-934x.json"
cdc <- GET(url)
cdc <- fromJSON(content(cdc, "text"), flatten=TRUE)

# Loading Likert data
likert <- read_csv("D608 Story3 - Likert.csv",show_col_types = FALSE)

# Isolate and clean up the CDC data
cdc <- cdc %>% 
  filter(cause_of_death == "Firearm-related injury" &
           rate_type == "Age-adjusted" &
           year_and_quarter == "2023 Q1" &
           time_period == "12 months ending with quarter")
cdc <- pivot_longer(cdc,9:59)
cdc$value <- as.numeric(cdc$value)
cdc <- select(cdc,19:20)
cdc$name <- str_remove(cdc$name, "rate_")
cdc$name <- str_replace_all(cdc$name, "_", " ")

# Merge CDC and Likert data
likert$State <- str_to_lower(likert$State)
cdc <- cdc %>% 
  rename(State = name) %>% 
  rename(RATE = value)
df <- inner_join(likert, cdc, by = "State")

# Quick look at the data
glimpse(df)

# Loading trial data
dftrial <- read_csv("D608 Story3 - DF.csv",show_col_types = FALSE)

# Ordering the Likert and States
dftrial$Likert <- factor(dftrial$Likert, levels = c("Very Strict", "Strict", "Moderate", "Lax", "Very Lax"))
dftrial$ST <- factor(dftrial$ST, levels = c("CA", "CT", "HI", "IL", "MD", "MA", "NJ", "NY", "CO", "DE", "MI", "MN", "NV", "NM", "OR", "PA", "RI", "VT", "VA", "WA", "FL", "LA", "ME", "NE", "NC", "WI", "AL", "IN", "IA", "ND", "OH", "SC", "TN", "TX", "UT", "WV", "AK", "AZ", "AR", "GA", "ID", "KS", "KY", "MS", "MO", "MT", "NH", "OK", "SD", "WY"))

# Heatmap 
ggplot(dftrial, aes(Likert, ST, fill= RATE)) + 
  scale_fill_viridis("Mortality") +
  geom_tile() +
  ggtitle("States' Firearm Mortality Rates clustered by Gun Control Laws") +
  xlab("Relative Gun Control Laws") +
  ylab("States")