Introduction

This analysis examines data from the Centers for Disease Control and Prevention (CDC) on homicide deaths across all US states from 1999 to 2020. For each state and year, the dataset records gender-specific homicide death counts, population, and crude death rates (deaths per 100,000 people). The goal is to gain insight into the patterns and trends of homicide deaths across states and genders.

The analysis has four main sections, each built around a graph: Section One covers Homicides by Year Across all States, Section Two covers Deaths and Population Aggregated by States, Section Three covers Crude Rate by Specific States throughout Years, and Section Four covers Death Rate (per 100,000) by chosen States.

By analyzing homicide data, I believe we can identify high-risk groups and areas, as well as factors associated with these violent acts (here, the state). This information can then be used to develop targeted prevention strategies and interventions to promote community safety. Homicide data can also shed light on issues related to social inequality, such as disparities in victimization and access to resources, which can inform broader social and economic policies; in this analysis we approach that question through gender.


Homicides by Year Across all States

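# Packages used throughout this analysis; `data` is the CDC homicide dataset loaded beforehand
library(tidyverse)
library(plotly)

# Sum deaths across all states so there is one row per Year and Gender
total_deaths <- data %>% 
  group_by(Year, Gender) %>% 
  summarize(total_deaths = sum(Deaths), .groups = "drop")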

# Plot the total number of homicides per year
ggplot(total_deaths, aes(x = Year, y = total_deaths, group = Gender)) +
  geom_line(aes(color = Gender)) +
  geom_point(aes(color = Gender)) +
  scale_color_manual(values = c("#7fc97f", "#6BAED6")) +
  labs(title = "Total Number of Homicides per Year across all States",
       x = "Year",
       y = "Total Homicide Deaths") +
  theme_bw()


Each point in this line chart represents the total number of male or female homicide deaths in a particular year (1999-2020). The chart clearly shows that male homicide deaths far exceed female homicide deaths in every year. In 2001, the totals for both genders were markedly high, and both fell at a similar pace the following year (2002). Male homicides then fluctuated until 2018, while female homicides remained relatively stable over the same period. Total homicide deaths for men across the United States surged to nearly 20,000 in 2020, and the female total also returned to its peak level in 2020.
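
To put a number on how far male deaths exceed female deaths, the yearly totals can be turned into a male-to-female ratio. The following is a minimal sketch rather than part of the original analysis: `male_female_ratio` is a hypothetical helper, and it assumes the `Gender` column holds exactly the values "Male" and "Female".

```r
library(dplyr)

# Hypothetical helper (not in the original analysis): yearly male and female
# totals plus their ratio. Assumes Gender takes the values "Male" and "Female".
male_female_ratio <- function(df) {
  df %>%
    group_by(Year) %>%
    summarize(
      male   = sum(Deaths[Gender == "Male"]),
      female = sum(Deaths[Gender == "Female"]),
      .groups = "drop"
    ) %>%
    mutate(ratio = male / female)  # male deaths per female death
}
```

Calling `male_female_ratio(data)` would return one row per year, which makes it easy to check whether the gap between the two lines widens or narrows over time.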


Deaths and Population Aggregated by States

# Rank states by total homicide deaths (1999-2020) and keep the ten highest,
# then join back to the full data to recover the yearly, gender-level rows
df_top10 <- data %>%
  group_by(State) %>%
  summarize(total_deaths = sum(Deaths)) %>%
  arrange(desc(total_deaths)) %>%
  slice(1:10) %>%
  inner_join(data, by = "State")

# Stacked bars: each state's bar accumulates its yearly deaths, filled by gender
ggplot(data = df_top10, aes(x = fct_reorder(State, -total_deaths), y = Deaths, fill = Gender)) + 
  geom_col() +
  labs(title = "Homicide Deaths by State and Gender",
       x = "State",
       y = "Number of Deaths") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))


To take a deeper look at homicides, I created a bar chart of the 10 states with the highest total number of homicides from 1999 to 2020, broken down by gender. The y-axis shows the cumulative number of deaths for each state, and once again male deaths make up a far larger share than female deaths. We can clearly see that states with large populations, such as California, Texas, and Florida, have correspondingly higher death counts. However, raw counts are not standardized by population, so they cannot fully reflect the true characteristics of homicide in each state. Therefore, the next graph moves to a standardized metric.
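
Standardizing here means dividing deaths by population and scaling to 100,000 people. The sketch below shows one way to do that from the raw columns; `crude_rate_by_state` is a hypothetical helper, not something in the original analysis, and it assumes `Deaths` and `Population` are numeric with one row per state, year, and gender.

```r
library(dplyr)

# Hypothetical helper (not in the original analysis): pooled crude rate per
# 100,000 for each state. Summing Population over all rows gives person-years,
# so the result is deaths per 100,000 person-years over the whole period.
crude_rate_by_state <- function(df) {
  df %>%
    group_by(State) %>%
    summarize(
      total_deaths     = sum(Deaths),
      total_population = sum(Population),
      .groups = "drop"
    ) %>%
    mutate(crude_rate = total_deaths / total_population * 1e5) %>%
    arrange(desc(crude_rate))
}
```

Running `crude_rate_by_state(data)` would rank states by rate rather than by count, so a large state no longer ranks high simply because it has more residents.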


Crude Rate by Specific States throughout Years

states <- c("New York", "California", "Texas", "Massachusetts")

# Filter data to only include certain states
selected_states <- filter(data, State %in% states)

# Plot homicide rate per 100,000 by state and year
rate_plot <- ggplot(selected_states, aes(x = Year, y = Crude.Rate, color = Gender)) + 
  geom_line() +
  facet_wrap(~ State, scales = "free_y") +
  labs(title = "Homicide Rate per 100,000 People by State and Year", 
       x = "Year", y = "Homicide Rate", color = "Gender") +
  theme_bw()

# Convert the static ggplot into an interactive plotly chart
ggplotly(rate_plot)


The CDC database provides a “Crude Rate” column, which gives the number of homicides per 100,000 people for each state. I used this column for a set of high-population states and also included our own state, Massachusetts. The interactive faceted line chart gives a closer look at the trends. Because each panel has its own y-axis scale, levels should be compared across states by reading the axis values rather than the line heights. Of the four states, New York surprisingly has relatively low homicide rates for both males and females throughout the years until 2020. In Massachusetts and Texas, the gap between male and female death rates varies over time, and rates have been increasing in recent years. Looking at the bigger picture, however, the trends are generally downward compared with 10 years ago, which is a positive direction. This may be due to advances in technology as well as updated and stricter regulations, security measures, and law enforcement in the United States, although this analysis does not test those explanations.
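
The comparison with 10 years ago can be checked directly by differencing the crude rate between two years. This is only a sketch under stated assumptions: `rate_change` is a hypothetical helper, the two years are arguments chosen by the reader, and the data is assumed to contain exactly one row per state, gender, and year for both years.

```r
library(dplyr)

# Hypothetical helper (not in the original analysis): change in crude rate
# between two years for every State/Gender pair. Assumes exactly one row per
# State, Gender, and Year for both from_year and to_year.
rate_change <- function(df, from_year, to_year) {
  df %>%
    filter(Year %in% c(from_year, to_year)) %>%
    group_by(State, Gender) %>%
    summarize(
      rate_from = Crude.Rate[Year == from_year],
      rate_to   = Crude.Rate[Year == to_year],
      .groups = "drop"
    ) %>%
    mutate(change = rate_to - rate_from)  # negative means the rate fell
}
```

For example, `rate_change(selected_states, 2010, 2020)` would give a signed change for each state and gender, which avoids having to read the direction of the trend off panels with different y-axis scales.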


Death Rate (per 100,000) by chosen States

# Crude rate against population, with a linear regression line fitted per state
ggplot(selected_states, aes(x = Population, y = Crude.Rate, color = State)) +
  geom_point(alpha = 0.5) +
  labs(x = "Population", y = "Death Rate (per 100,000)") +
  ggtitle("Crude Rate vs. Population by State") +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~ State, scales = "free")


Next, we look more closely at the four selected states by plotting the homicide rate against population, with a linear regression line fitted for each state. The graph shows a general tendency for the crude homicide rate to decrease as population increases in most of these states. The regression lines suggest a fairly strong negative relationship between population and crude homicide rate for Massachusetts and New York in particular; both states have relatively steeper slopes than California and Texas. The slopes for Texas and California are gentler because each fit pools male and female observations, so the line is heavily influenced by the difference in death rate between males and females.
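
Slopes are hard to compare by eye when every panel has free axes, so it can help to compute them. The sketch below is an illustration, not the original method: `rate_population_fit` is a hypothetical helper that returns, for each state, the least-squares slope (computed as covariance over variance, which equals the slope of a simple linear regression of `Crude.Rate` on `Population`) and the Pearson correlation.

```r
library(dplyr)

# Hypothetical helper (not in the original analysis): per-state slope and
# correlation between Population and Crude.Rate. cov(x, y) / var(x) is the
# ordinary least-squares slope of a simple regression of y on x.
rate_population_fit <- function(df) {
  df %>%
    group_by(State) %>%
    summarize(
      slope       = cov(Population, Crude.Rate) / var(Population),
      correlation = cor(Population, Crude.Rate),
      .groups = "drop"
    )
}
```

Calling `rate_population_fit(selected_states)` would show whether the relationship for Massachusetts and New York is in fact stronger than for California and Texas. Adding `Gender` to the `group_by()` call would fit males and females separately, which removes the pooling effect described above.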


Final Thoughts

In conclusion, the analysis shows that male homicide deaths far exceed female homicide deaths, both nationally and in each of the states examined, and that homicide rates and trends vary between states. The analysis also highlights the importance of accounting for population size: states with larger populations tend to have higher homicide counts, so rates per 100,000 people are needed for a fair comparison.

In addition, examining the relationship between the death rate per 100,000 people and population revealed a negative linear trend in the selected states, which indicates that a smaller population does not imply a lower homicide rate. While this analysis provides a useful glimpse into gender-based homicide patterns over time, it is important to acknowledge that other dimensions, such as race, urbanization, and age group (which other team members have focused on), have not been explored in depth here and may offer additional insight into the factors contributing to homicide rates. By identifying high-risk groups and areas, this analysis can inform targeted prevention strategies and interventions to promote community safety. It can also shed light on issues related to social inequality, such as disparities in victimization and access to resources, which can inform broader social and economic policies. Overall, this analysis contributes to the discussion in criminology and public policy, and it has the potential to inform evidence-based interventions to prevent and reduce homicides in the United States.