The purpose of this report is to analyze gun violence trends in the United States. Gun violence has tragically affected many lives in the U.S. and poses a significant threat to the safety of American citizens. Analyzing these trends is important because the findings can inform legislation, actions taken by law enforcement, and the choices that individuals and communities make to stay safe.
The data used for this report comes from the Gun Violence Archive, a nonprofit research group that catalogs incidents of gun violence in the United States. The specific dataset, which can be found on Kaggle, aggregates information on more than three thousand incidents of gun violence in the U.S. between 2014 and 2021. Most notably for the purposes of this report, the dataset provides the state, city, date, and number of victims killed or injured for each reported incident.
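The code in this report works on a data frame named df_GunV with the columns incident_date, state, city_or_county, killed, and injured. As a minimal sketch of that layout, the following base R snippet reads two invented rows (not taken from the actual dataset) from inline text. The month/day/year date style is an assumption based on the later use of mdy(), and the name df_toy is hypothetical.

```r
# Two invented rows in the column layout the report's code expects.
# The real data would instead be read from the CSV downloaded from Kaggle.
toy_csv <- "incident_date,state,city_or_county,killed,injured
1/1/2021,Illinois,Chicago,1,4
7/4/2020,Nevada,Las Vegas,0,
"
df_toy <- read.csv(text = toy_csv, stringsAsFactors = FALSE)

# Columns the later code relies on
required <- c("incident_date", "state", "city_or_county", "killed", "injured")
missing_cols <- setdiff(required, names(df_toy))

# Blank numeric fields are read as NA, which is why the report
# later replaces missing injury counts with 0
n_missing_injured <- sum(is.na(df_toy$injured))

print(missing_cols)
print(n_missing_injured)
```

Checking for required columns and missing counts before any aggregation makes it easier to spot a changed file layout early.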
The five visualizations in the tabs below are useful for drawing conclusions about when and where gun violence is most prevalent, as well as how often it results in death.
The visualizations show that gun violence increased in 2020 and 2021 compared with 2014-2019. Gun violence is also greatest on the weekends and rises in the summer months.
The visualizations further show that Illinois, California, Texas, and Florida are the states with the highest total victim counts from 2014-2021. Chicago, Baltimore, Las Vegas, and Philadelphia are the cities with the highest victim counts over the same period, and each has more gun violence victims than all other cities in its state combined.
Finally, although more victims are injured than killed in every year, deaths and injuries are not distributed across years in the same way: 2018, 2019, and 2021 each account for a greater percentage of the 2018-2021 deaths than of the 2018-2021 injuries.
This tab shows a horizontal stacked bar chart that displays the total number of gun violence victims in each state by year. The victim count includes both victims who were injured and victims who were killed. Each state's bar is segmented by color to show how victim counts changed from 2014 to 2021, and the state's total is labeled at the end of the bar.
When analyzing this visualization, we can see that Illinois, California, Texas, and Florida have the greatest total victim counts. Additionally, it is clear that in the majority of states, victim counts in 2020 and 2021 are greater than victim counts in 2014-2019.
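# Packages used by the code in this report
library(dplyr)
library(lubridate)
library(ggplot2)
library(ggthemes)
library(scales)
library(plotly)

# Parse the incident dates and derive year, month, and weekday columns
x <- mdy(df_GunV$incident_date)
df_GunV$year <- year(x)
df_GunV$monthname <- months(x, abbreviate = TRUE)
df_GunV$day <- weekdays(x, abbreviate = TRUE)

# Treat missing injury counts as zero, then total the victims per incident
df_GunV[is.na(df_GunV$injured), "injured"] <- 0
df_GunV$TotalVictims <- df_GunV$injured + df_GunV$killed

# Map state names to abbreviations; any name not found in state.name
# (which lists only the 50 states) is assigned "MD"
df_GunV$stateabb <- state.abb[match(df_GunV$state, state.name)]
df_GunV[is.na(df_GunV$stateabb), "stateabb"] <- "MD"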
# Total victims per state and year
state_df <- df_GunV %>%
  select(year, stateabb, TotalVictims) %>%
  group_by(year, stateabb) %>%
  summarise(n = sum(TotalVictims), .groups = "keep") %>%
  data.frame()
state_df$year <- as.factor(state_df$year)

# Total victims per state across all years, used for the bar labels
agg_tot <- state_df %>%
  select(stateabb, n) %>%
  group_by(stateabb) %>%
  summarise(tot = sum(n), .groups = "keep") %>%
  data.frame()
# Upper axis limit: the largest state total rounded up to a multiple of 2,000
max_y <- plyr::round_any(max(agg_tot$tot), 2000, ceiling)

# Horizontal stacked bars, one per state, segmented by year and ordered by total victims
ggplot(state_df, aes(x = reorder(stateabb, n, sum), y = n, fill = year)) +
  geom_bar(stat = "identity", position = position_stack(reverse = TRUE)) +
  coord_flip() +
  labs(title = "Gun Violence Victims by State", x = "State", y = "Victim Count", fill = "Year") +
  theme_tufte() +
  theme(plot.title = element_text(hjust = 0.5)) +
  scale_fill_brewer(palette = "Spectral", guide = guide_legend(reverse = TRUE)) +
  # Label each bar with the state's total across all years
  geom_text(data = agg_tot,
            aes(x = stateabb, y = tot, label = scales::comma(tot, accuracy = 1), fill = NULL),
            hjust = -0.1, size = 3) +
  scale_y_continuous(labels = scales::comma,
                     breaks = seq(0, max_y, by = 250),
                     limits = c(0, max_y))
This tab shows multiple bar charts in a trellis structure that display the total number of gun violence victims by month in the 4 states with the highest victim counts. The victim count includes both victims who were injured and victims who were killed, and each month's bar sums the victims for that month across all years from 2014 to 2021.
When analyzing these bar charts, it is clear that Illinois experiences an increase in gun violence in the summer months of June, July, August, and into September. Similarly, Florida has a significant spike in gun violence in June, Texas has the most gun violence in July and August, and California has high victim counts in June and July. In relation to the other months, Florida and Illinois do not have high victim counts in October and November; however, California and Texas do have relatively high victim counts in those months.
# Total victims per state, sorted from highest to lowest
df_TotalVictims <- df_GunV %>%
  select(state, TotalVictims) %>%
  group_by(state) %>%
  summarise(total = sum(TotalVictims), .groups = "keep") %>%
  data.frame()
df_TotalVictims <- df_TotalVictims[order(df_TotalVictims$total, decreasing = TRUE), ]

# Keep the 4 states with the most victims
top_stateVic <- df_TotalVictims$state[1:4]

# Total victims per month within each of those states
top_stateVicMon <- df_GunV %>%
  filter(state %in% top_stateVic) %>%
  select(state, TotalVictims, monthname) %>%
  group_by(state, monthname) %>%
  summarise(total = sum(TotalVictims), .groups = "keep") %>%
  data.frame()
# Order the months chronologically rather than alphabetically
my_months <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
top_stateVicMon$monthname <- factor(top_stateVicMon$monthname, levels = my_months)
top_stateVicMon$state <- as.factor(top_stateVicMon$state)

# One bar chart per state, arranged in a 2 x 2 grid
ggplot(top_stateVicMon, aes(x = monthname, y = total, fill = state)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme_light() +
  theme(plot.title = element_text(hjust = 0.5)) +
  scale_y_continuous(labels = scales::comma) +
  labs(title = "Multiple Bar Charts - Total Gun Violence in Top 4 States by Month",
       x = "Months of the Year",
       y = "Victim Count",
       fill = "State") +
  scale_fill_brewer(palette = "Spectral") +
  facet_wrap(~state, ncol = 2, nrow = 2)
This tab shows a multiple line plot to display the total gun violence victims by year and by day of the week. The victim count captures both victims who were injured and killed. Similar to the visualizations in tab 1 and tab 2, this line plot is useful for analyzing trends about when gun violence occurs.
When analyzing this visualization, we can see that the number of gun violence victims is greatest on the weekend days of Saturday and Sunday. Consistent with the conclusion from tab 1, this multiple line plot clearly shows that victim counts in 2020 and 2021 are generally higher than victim counts in 2014-2019.
# Total victims per year and weekday
days_df <- df_GunV %>%
  select(year, day, TotalVictims) %>%
  group_by(year, day) %>%
  summarise(n = sum(TotalVictims), .groups = "keep") %>%
  data.frame()
days_df$year <- as.factor(days_df$year)

# Order the weekdays Monday through Sunday rather than alphabetically
days_df$day <- factor(days_df$day, levels = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"))

# One line per year across the days of the week
ggplot(days_df, aes(x = day, y = n, group = year)) +
  geom_line(aes(color = year), size = 2) +
  labs(title = "Gun Violence Victims by Day and by Year", x = "Days of the Week", y = "Victim Count") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5)) +
  geom_point(shape = 21, size = 4, color = "red", fill = "white") +
  scale_y_continuous(labels = scales::comma) +
  scale_color_brewer(palette = "Accent", name = "Year", guide = guide_legend(reverse = TRUE))
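The explicit weekday ordering matters because R sorts factor levels alphabetically by default, which would place Friday first on the axis. The following base R sketch uses invented labels (not drawn from the dataset) to illustrate the difference. The variable names are hypothetical.

```r
# Toy weekday labels in arbitrary order, for illustration only
days <- c("Sat", "Mon", "Sun", "Wed", "Fri", "Tue", "Thu")

# Default factor: levels are sorted alphabetically
default_levels <- levels(factor(days))

# Explicit levels: calendar order, Monday first
week_order <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
ordered_levels <- levels(factor(days, levels = week_order))

print(default_levels)
print(ordered_levels)
```

The same reasoning applies to the month abbreviations used in tab 2, where alphabetical order would begin with April.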
This tab shows pie charts in a trellis structure that display, for each of the 4 cities with the greatest victim counts, the city's percentage of gun violence victims relative to all other cities in its state. The top 4 cities were Chicago, Baltimore, Las Vegas, and Philadelphia. While tabs 1 and 2 showed that Illinois, Florida, California, and Texas were the states with the highest victim counts, this visualization shows that Illinois is the only one of those states that contains a top 4 city.
When analyzing this visualization, we can see that each of the top 4 cities has more gun violence victims than all other cities in its state combined. Notably, Las Vegas accounts for the overwhelming majority of gun violence victims in Nevada.
# Total victims per city within each state
df_city <- df_GunV %>%
  select(city_or_county, TotalVictims, state) %>%
  group_by(state, city_or_county) %>%
  summarise(total = sum(TotalVictims), .groups = "keep") %>%
  data.frame()

# Keep only the states that contain the top 4 cities
df_city <- df_city[df_city$state %in% c("Illinois", "Nevada", "Pennsylvania", "Maryland"), ]

# Label each row as a top city or as "Other" cities in its state, then total each group
df_VictimCity <- df_city %>%
  select(state, city_or_county, total) %>%
  mutate(myCity = case_when(
    city_or_county == "Chicago" ~ "Chicago",
    city_or_county == "Las Vegas" ~ "Las Vegas",
    city_or_county == "Philadelphia" ~ "Philadelphia",
    city_or_county == "Baltimore" ~ "Baltimore",
    state == "Illinois" ~ "Other IL Cities",
    state == "Nevada" ~ "Other NV Cities",
    state == "Pennsylvania" ~ "Other PA Cities",
    state == "Maryland" ~ "Other MD Cities",
    TRUE ~ "Other")) %>%
  group_by(state, myCity) %>%
  summarise(total = sum(total), .groups = "keep") %>%
  data.frame()
# Within each state, compute each group's share of the state's victims
new_citydf <- df_VictimCity %>%
  group_by(state) %>%
  mutate(percent_of_total = round(100 * total / sum(total), 1)) %>%
  data.frame()

# Fix the legend order so each top city is followed by its state's other cities
new_citydf$myCity <- factor(new_citydf$myCity,
                            levels = c("Chicago", "Other IL Cities",
                                       "Baltimore", "Other MD Cities",
                                       "Las Vegas", "Other NV Cities",
                                       "Philadelphia", "Other PA Cities"))
# One pie chart per state: a filled bar drawn in polar coordinates
ggplot(data = new_citydf, aes(x = "", y = total, fill = myCity)) +
  geom_bar(stat = "identity", position = "fill") +
  coord_polar(theta = "y", start = 0) +
  labs(fill = "City", x = NULL, y = NULL,
       title = "Gun Violence in Top 4 Cities Compared to Total Gun Violence in the City's State") +
  theme_light() +
  theme(plot.title = element_text(hjust = 0.5, size = 15),
        axis.text = element_blank(),
        axis.ticks = element_blank(),
        panel.grid = element_blank()) +
  facet_wrap(~state, nrow = 2, ncol = 2) +
  scale_fill_brewer(palette = "Spectral") +
  # Place the percentage labels just outside each slice
  geom_text(aes(x = 1.7, label = paste0(percent_of_total, "%")),
            size = 4,
            position = position_fill(vjust = 0.5))
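The statement that a city has more victims than all other cities in its state combined is equivalent to that city's share of the state total exceeding 50%. The following base R sketch shows the calculation with invented totals for one hypothetical state. The values are not taken from the report's data, and the names are illustrative.

```r
# Toy victim totals for one hypothetical state, invented for illustration only
toy_city <- data.frame(
  myCity = c("Top City", "Other Cities"),
  total  = c(120, 80)
)

# Share of the state's victims attributed to each group, as a percentage
toy_city$percent_of_total <- round(100 * toy_city$total / sum(toy_city$total), 1)

# TRUE when the top city alone exceeds all other cities combined
top_share <- toy_city$percent_of_total[toy_city$myCity == "Top City"]
top_exceeds_rest <- top_share > 50

print(toy_city)
print(top_exceeds_rest)
```

Because each state has exactly two groups in the chart, the two percentages always sum to 100, so only one of them needs to be checked against the 50% threshold.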
This tab shows a nested pie chart that compares the amount of gun violence that resulted in death versus injury in the years 2018-2021. This visualization is interactive: hovering over a slice shows the victim status, the year, that year's percentage of the victims with that status, and the victim count. The outer layer of the nested pie chart displays data for victims who were killed by gun violence, while the inner layer displays data for victims who were injured.
Within each layer, the number of victims injured and the number of victims killed both rise in each successive year. Comparing the layers shows that more victims are injured than killed every year. The percentages, however, measure each year's share of the 2018-2021 total for a given victim status: 2018, 2019, and 2021 each account for a greater percentage of the deaths than of the injuries, whereas 2020 accounts for a greater percentage of the injuries than of the deaths.
# Total victims killed and injured per year, 2018-2021
kill_or_inj <- df_GunV %>%
  filter(year %in% 2018:2021) %>%
  select(year, injured, killed) %>%
  group_by(year) %>%
  summarise(tot_killed = sum(killed), tot_injured = sum(injured), .groups = "keep") %>%
  data.frame()
# Nested donut chart: outer ring shows victims killed, inner ring shows victims injured
plot_ly(hole = 0.7) %>%
  layout(title = "Deaths vs. Injuries from Gun Violence by Year (2018-2021)") %>%
  add_trace(data = kill_or_inj,
            labels = ~year,
            values = ~tot_killed,
            type = "pie",
            textposition = "inside",
            hovertemplate = "Victim Status:Killed<br>Year:%{label}<br>Percent:%{percent}<br>Killed Count:%{value}<extra></extra>") %>%
  # The smaller domain shrinks the second trace so it sits inside the first
  add_trace(data = kill_or_inj,
            labels = ~year,
            values = ~tot_injured,
            type = "pie",
            textposition = "inside",
            hovertemplate = "Victim Status:Injured<br>Year:%{label}<br>Percent:%{percent}<br>Injured Count:%{value}<extra></extra>",
            domain = list(
              x = c(0.16, 0.84),
              y = c(0.16, 0.84)))
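Because each ring is a pie trace labeled by year, the percentage shown for a slice is that year's share of the ring's total, which differs from the share of a given year's victims who died. The following base R sketch uses invented yearly totals (not the report's data) to show both calculations side by side. The column names pct_of_killed, pct_of_injured, and pct_died_in_year are hypothetical.

```r
# Toy yearly totals, invented for illustration only (not the report's data)
toy_years <- data.frame(
  year        = c(2018, 2019),
  tot_killed  = c(30, 20),
  tot_injured = c(100, 150)
)

# Each year's share of all deaths and of all injuries across the toy years;
# this mirrors the percentage a pie slice labeled by year displays
toy_years$pct_of_killed  <- round(100 * toy_years$tot_killed  / sum(toy_years$tot_killed), 1)
toy_years$pct_of_injured <- round(100 * toy_years$tot_injured / sum(toy_years$tot_injured), 1)

# Share of each year's own victims who died, a different measure
toy_years$pct_died_in_year <- round(
  100 * toy_years$tot_killed / (toy_years$tot_killed + toy_years$tot_injured), 1
)

print(toy_years)
```

In this toy example the first year holds a larger share of the deaths than of the injuries, even though injuries outnumber deaths in both years, which is the same pattern of comparison the nested pie chart supports.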
The visualizations yield several takeaways that call for action to reduce gun violence in the United States. First, the increase in gun violence victims in 2020 and 2021 shows that gun violence is getting worse. Second, the states with the greatest gun violence urgently require further investigation and implementation of change by leaders in government and law enforcement. Third, major cities have high potential for gun violence and, therefore, require a stronger police presence and safety precautions for citizens in those communities. The weekends and summer months pose a greater risk of harm from gun violence and may require similar action. Finally, gun violence has resulted in tragedy for thousands of Americans. Trends in mortality underscore the need for change surrounding access to guns, enforcement of gun control, and protection for individuals and communities.