Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: Mitchell, J. N. (2018).


Objective: White men are responsible for the majority of mass shootings, yet law enforcement tends to identify them as mentally ill rather than criminalise them. The same incident would be treated very differently if it were carried out by someone of a different racial identity.

Targeted Audience: Law enforcement, American public

The visualisation chosen had the following three main issues:

  • The method chosen to present the data is not transparent: the reader has to look back and forth between the legend and the circles to work out how many victims were killed, which is very time consuming. In addition, the many overlapping circles of similar shade make it difficult to count the mass shooting incidents that occurred between 2012 and 2018.

  • The data visualisation’s title does not align with the objective, and the visualisation fails to illustrate the main argument that the majority of mass shootings were carried out by white men, because it shows insufficient data. ‘Majority’ implies that perpetrators of other races exist, yet the visualisation focuses purely on white men and shows none of the other races (Black, Asian, Latino, Other, Native American), so it lacks the comparison needed to support the argument.

  • There appear to be underlying ethical issues, such as a perceived bias that white men are the sole cause of the problem. Referring back to the data source shows that some information was omitted, such as shooters’ prior mental health issues and the fact that people of other races also committed the same act, which makes the data visualisation somewhat misleading.

Reference

Code

The following code was used to fix the issues identified in the original.

# Load necessary packages
library(readr)
library(ggplot2)
library(tidyr)
library(scales)

# Read in the data file (csv) using readr
mass_shooting <- read_csv("/Users/eivy/Desktop/A2_WebReport/mass_shooting_byRace.csv")
mass_shooting
## # A tibble: 6 × 6
##   Race            total_mass_shootings total_killed total_inju…¹ total…² total…³
##   <chr>                          <dbl>        <dbl>        <dbl>   <dbl>   <dbl>
## 1 White                             18          219          637     856       9
## 2 Native American                    1            5            1       6       0
## 3 Black                              3           22           19      41       1
## 4 Asian                              2           12            3      15       2
## 5 Latino                             2           12            6      18       2
## 6 Other                              6           94          107     201       2
## # … with abbreviated variable names ¹​total_injured, ²​total_victims,
## #   ³​total_with_mental_health_issues
# Reshape the data from "wide" to "long" format: the "measure" column records which measure each row holds, and the "value" column holds the corresponding count.
p1 <- gather(mass_shooting, key="measure", value="value", c("total_mass_shootings", "total_killed", "total_injured", "total_victims", "total_with_mental_health_issues"))

# Labeling variables in the graph
variable_names <- c(
  "total_mass_shootings" = "Total number of mass shooting incidents by perpetrating race",
  "total_killed" = "Total number of people killed by perpetrating race",
  "total_injured" = "Total number of people injured by perpetrating race",
  "total_victims" = "Total number of victims (total killed + total injured)",
  "total_with_mental_health_issues" = "Total number of mass shooters with prior mental health issues by perpetrating race"
)

# as_labeller() turns the named vector into a labeller for facet_wrap();
# the older function(variable, value) labeller form is deprecated in ggplot2.
variable_labeller <- as_labeller(variable_names)

# Reconstructing the original to a faceted bar graph
new_plot <- ggplot(p1, aes(x=Race, y=value, fill=Race))+
  geom_bar(stat='identity')+
  labs(x = "(Perpetrating) Race", y = "Total Count", title = "Number of mass shooters by race and the total number of their victims", subtitle = "Mass shootings in the U.S. by 32 people aged 15-64, between 2012-2018, whereby a mass shooting is classified as 4 or more killed (shooter excluded) in one incident at a single location by a lone shooter.")+
  facet_wrap(~measure, ncol=1, scales="free_x", labeller=variable_labeller)+
  coord_flip()+
  scale_y_continuous(breaks = pretty_breaks())+
  scale_fill_manual(values = c("White" = "#720404",
                               "Native American" = "#FFB3B3",
                               "Black" = "#F30404",
                               "Asian" = "#FD9884",
                               "Latino" = "#FF5F3F",
                               "Other" = "#C10000"))

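The wide-to-long reshape can be checked without the CSV file. The sketch below is illustrative and not part of the original analysis: it rebuilds the table by hand from the tibble printed above (assuming the CSV holds exactly those six rows), reshapes it with `pivot_longer()`, the current tidyr equivalent of `gather()`, and confirms two figures quoted in the report. The names `mass_shooting_demo`, `long_demo` and `white_share` are hypothetical.

```r
library(tidyr)

# Values copied by hand from the printed tibble (assumption: the CSV
# contains exactly these six rows and five measures).
mass_shooting_demo <- data.frame(
  Race = c("White", "Native American", "Black", "Asian", "Latino", "Other"),
  total_mass_shootings = c(18, 1, 3, 2, 2, 6),
  total_killed = c(219, 5, 22, 12, 12, 94),
  total_injured = c(637, 1, 19, 3, 6, 107),
  total_victims = c(856, 6, 41, 15, 18, 201),
  total_with_mental_health_issues = c(9, 0, 1, 2, 2, 2)
)

# Long format: one row per Race/measure pair (6 races x 5 measures).
long_demo <- pivot_longer(
  mass_shooting_demo,
  cols = -Race,
  names_to = "measure",
  values_to = "value"
)

# Consistency check: victims should equal killed plus injured for every race.
stopifnot(all(
  mass_shooting_demo$total_victims ==
    mass_shooting_demo$total_killed + mass_shooting_demo$total_injured
))

# Share of incidents attributed to White perpetrators.
is_white <- mass_shooting_demo$Race == "White"
white_share <- mass_shooting_demo$total_mass_shootings[is_white] /
  sum(mass_shooting_demo$total_mass_shootings)

print(nrow(long_demo))                               # 30
print(sum(mass_shooting_demo$total_mass_shootings))  # 32
print(white_share)                                   # 0.5625
```

The incident counts sum to 32, matching the 32 shooters named in the plot subtitle, and White perpetrators account for 18 of the 32 incidents (about 56%).
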
Data Reference

Reconstruction

The following plot fixes the main issues in the original.