This analysis examines data from the Centers for Disease Control and Prevention (CDC) on homicide deaths in every US state from 1999 to 2020. For each state, the dataset reports gender-specific homicide death counts, population, and crude death rates (deaths per 100,000 people). The goal is to describe patterns and trends in homicide deaths across states and genders. The analysis has four sections, each with its own graph: Section One covers homicides by year across all states, Section Two covers deaths aggregated by state, Section Three covers the crude rate over the years for selected states, and Section Four covers the death rate (per 100,000) against population for the same states. Analyzing homicide data can help identify high-risk groups and areas (here, states) and point to factors associated with these violent acts. That information can then be used to develop targeted prevention strategies and interventions that promote community safety. Homicide data can also shed light on social inequality, such as disparities in victimization and access to resources (examined here through gender), which can inform broader social and economic policies.
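# Assumes `data` (the CDC homicide table) is already loaded
library(tidyverse)  # dplyr, ggplot2, forcats
# Sum deaths over all states: one row per year and gender.
# summarize() is used instead of mutate() so each year/gender pair is plotted once.
total_deaths <- data %>%
  group_by(Year, Gender) %>%
  summarize(total_deaths = sum(Deaths), .groups = "drop")
# Plot the total number of homicides per year
ggplot(total_deaths, aes(x = Year, y = total_deaths, group = Gender)) +
  geom_line(aes(color = Gender)) +
  geom_point(aes(color = Gender)) +
  scale_color_manual(values = c("#7fc97f", "#6BAED6")) +
  labs(title = "Total Number of Homicides per Year across all States",
       x = "Year",
       y = "Total Homicide Deaths") +
  theme_bw()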
Each point in this line chart is the total number of homicide deaths
for males or females in a particular year (1999-2020). The chart shows
clearly that male homicide deaths far exceed female homicide deaths.
In 2001, the totals for both males and females were notably high, and
both fell at a similar pace the following year (2002). Male homicides
then fluctuated until 2018, while female homicides remained relatively
stable over the same period. In 2020, total homicide deaths for men
across the United States surged to nearly 20,000, and female deaths
also returned to their peak level.
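One way to put numbers on the year-to-year movements described above is a percent change from the previous year, computed separately for each gender. The following is a minimal sketch, not part of the original analysis: the data frame `toy_totals` and its values are made up for illustration and only mimic the shape of the `total_deaths` table (columns `Year`, `Gender`, `total_deaths`).

```r
library(dplyr)

# Made-up yearly totals, for illustration only (not CDC figures)
toy_totals <- data.frame(
  Year = rep(2018:2020, times = 2),
  Gender = rep(c("Female", "Male"), each = 3),
  total_deaths = c(100, 110, 132, 400, 420, 525)
)

# Percent change from the previous year, computed within each gender.
# The first year of each gender has no previous year, so it is NA.
yoy_change <- toy_totals %>%
  arrange(Gender, Year) %>%
  group_by(Gender) %>%
  mutate(pct_change = 100 * (total_deaths - lag(total_deaths)) / lag(total_deaths)) %>%
  ungroup()

print(yoy_change)
```

Applied to the real yearly totals, the same pipeline would show how large the 2020 increase is relative to earlier fluctuations.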
# Top 10 states by total homicide deaths (1999-2020), joined back to the
# full data so each state keeps its yearly rows by gender
df_top10 <- data %>%
  group_by(State) %>%
  summarize(total_deaths = sum(Deaths)) %>%
  arrange(desc(total_deaths)) %>%
  slice(1:10) %>%
  inner_join(data, by = "State")
# Stacked bars: bar height is the cumulative death count per state
ggplot(data = df_top10, aes(x = fct_reorder(State, -total_deaths), y = Deaths, fill = Gender)) +
  geom_col() +
  labs(title = "Homicide Deaths by State and Gender",
       x = "State",
       y = "Number of Deaths") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
To look more closely at homicides, I made a stacked bar chart of the 10
states with the highest total homicide deaths from 1999 to 2020, split
by male and female. The y-axis shows the cumulative number of deaths
for each state, and male deaths again make up most of each bar. States
with large populations, such as California, Texas, and Florida, have
correspondingly high death counts. However, raw counts are not
standardized by population, so they cannot fully reflect how homicide
differs from state to state. The next graph therefore uses a
population-standardized metric.
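The standardization mentioned above amounts to dividing deaths by population and scaling to 100,000 people. The sketch below is illustrative only: the data frame `toy`, the state names "A" and "B", and all numbers are made up, and it is not a description of how the CDC derives its own "Crude Rate" column.

```r
library(dplyr)

# Made-up counts, for illustration only (not CDC figures)
toy <- data.frame(
  State = c("A", "A", "B", "B"),
  Gender = c("Female", "Male", "Female", "Male"),
  Deaths = c(50, 150, 10, 40),
  Population = c(1000000, 1000000, 100000, 100000)
)

# Crude rate = deaths / population * 100,000, pooled over genders
state_rates <- toy %>%
  group_by(State) %>%
  summarize(
    total_deaths = sum(Deaths),
    total_population = sum(Population),
    .groups = "drop"
  ) %>%
  mutate(crude_rate = total_deaths / total_population * 1e5) %>%
  arrange(desc(crude_rate))

print(state_rates)
```

In this toy example state "A" has four times as many deaths as state "B", yet "B" has the higher rate, which is why a ranking by counts and a ranking by rates can disagree. If several years are pooled this way, the summed population is a person-years total, so the result is an average annual rate.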
library(plotly)  # provides ggplotly()
states <- c("New York", "California", "Texas", "Massachusetts")
# Filter data to only include certain states
selected_states <- filter(data, State %in% states)
# Plot homicide rate per 100,000 by state and year
plot <- ggplot(selected_states, aes(x = Year, y = Crude.Rate, color = Gender)) +
  geom_line() +
  facet_wrap(~ State, scales = "free_y") +
  labs(title = "Homicide Rate per 100,000 People by State and Year",
       x = "Year", y = "Homicide Rate", color = "Gender") +
  theme_bw()
# Convert the static ggplot into an interactive plotly chart
ggplotly(plot)
The CDC database provides a “Crude Rate” column, which gives the number
of homicides per 100,000 people for each state. I use the crude rate
here for three states with large populations, plus our own state,
Massachusetts. The interactive faceted line chart gives a closer look
at the trend. Each panel has its own y-axis, so compare the axis values
rather than the line heights across states. Of the four states, New
York has, surprisingly, relatively low homicide rates for both males
and females throughout the years until 2020. In Massachusetts and
Texas, rates differ between males and females and have been increasing
in recent years. Looking at the bigger picture, however, the overall
trends are downward compared with 10 years ago, which is a positive
direction. This may be due to advances in technology as well as updated
and stricter regulations, security, and law enforcement in the United
States.
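A comparison with "10 years ago" can be checked numerically by averaging the crude rate over an early window and a recent window for each state. This is only a sketch under stated assumptions: the data frame `toy_rates`, the states "A" and "B", the years, the 2010 cutoff, and every value are hypothetical, chosen to mimic the `State`, `Year`, and `Crude.Rate` columns used in the plot.

```r
library(dplyr)

# Made-up rates, for illustration only (not CDC figures)
toy_rates <- data.frame(
  State = rep(c("A", "B"), each = 4),
  Year = rep(c(2009, 2010, 2019, 2020), times = 2),
  Crude.Rate = c(6, 4, 3, 5, 2, 2, 3, 5)
)

# Label each year as belonging to an early or a recent window,
# then average the rate within each state and window
period_means <- toy_rates %>%
  mutate(period = if_else(Year <= 2010, "early", "recent")) %>%
  group_by(State, period) %>%
  summarize(mean_rate = mean(Crude.Rate), .groups = "drop")

# Recent mean minus early mean: negative values mean the rate fell
rate_change <- period_means %>%
  group_by(State) %>%
  summarize(
    change = mean_rate[period == "recent"] - mean_rate[period == "early"],
    .groups = "drop"
  )

print(rate_change)
```

A negative `change` supports a downward trend for that state, and a positive one contradicts it. The window boundaries are a design choice and should be stated alongside any result.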
# Scatter of crude rate against population, with a linear fit per state
ggplot(selected_states, aes(x = Population, y = Crude.Rate, color = State)) +
  geom_point(alpha = 0.5) +
  labs(x = "Population", y = "Death Rate (per 100,000)") +
  ggtitle("Crude Rate vs. Population by State") +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~ State, scales = "free")
We now look at the relationship between the homicide rate and
population in the four chosen states, with a linear regression line
fitted for each. The graph shows a general trend: as population
increases, the crude homicide rate tends to decrease in most of these
states. The regression lines suggest a fairly strong negative
relationship between population and crude homicide rate for
Massachusetts and New York in particular, and both states have steeper
slopes than California and Texas. The slopes for Texas and California
are gentler, likely because each fit pools male and female
observations, whose death rates differ widely. Because the panels use
free axis scales, slopes are better compared from fitted coefficients
than by eye. Since a state's population generally grows over time, the
pattern may also partly mirror the change in rates over the years
rather than an effect of population itself.
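The slope and strength of each regression line can be read off numerically instead of visually. The sketch below fits the same kind of per-state linear model that `geom_smooth(method = "lm")` draws, using made-up data: `toy_obs`, the states "A" and "B", and the helper column `pop_millions` are hypothetical and not part of the original analysis.

```r
library(dplyr)

# Made-up observations, for illustration only (not CDC figures)
toy_obs <- data.frame(
  State = rep(c("A", "B"), each = 4),
  Population = c(1, 2, 3, 4, 10, 20, 30, 40) * 1e6,
  Crude.Rate = c(8, 6, 4, 2, 5, 5, 4, 4)
)

fit_summary <- toy_obs %>%
  # Express population in millions so the slope is easier to read
  mutate(pop_millions = Population / 1e6) %>%
  group_by(State) %>%
  summarize(
    # Change in crude rate per one million additional residents
    slope_per_million = coef(lm(Crude.Rate ~ pop_millions))[[2]],
    # Pearson correlation is unit-free, so it is comparable across states
    correlation = cor(Population, Crude.Rate),
    .groups = "drop"
  )

print(fit_summary)
```

The slope depends on the units and range of population, so a small state can show a steep slope simply because its population varies over a narrow range. The correlation is the safer quantity for comparing how tightly rate and population move together across states.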
In conclusion, the analysis shows that male homicide deaths far
outnumber female homicide deaths across all states, and that homicide
rates and trends vary between states. It also highlights the importance
of accounting for population size when comparing states: states with
larger populations tend to have higher homicide counts, but not
necessarily higher crude homicide rates.
In addition, the relationship between the death rate per 100,000 people and population showed a negative linear trend in the selected states, indicating that states with smaller populations can still have substantial homicide rates. While this analysis offers a useful view of gender-based homicide patterns over time, other dimensions such as race, urbanization, and age group (which other team members have focused on) are not explored in depth here and may offer additional insight into the factors contributing to homicide rates. By identifying high-risk groups and areas, this analysis can inform targeted prevention strategies and interventions to promote community safety. It can also shed light on issues related to social inequality, such as disparities in victimization and access to resources, which can inform broader social and economic policies. Overall, the findings have the potential to inform evidence-based interventions to prevent and reduce homicides in the United States.