Analysis of Crime in Maryland, 1975-2020

Introduction

Maryland is often considered a popular spot for raising families – great schools, four seasons, Adley Rutschman on the Orioles… and relatively safe communities. Just how safe is it, though? How do different jurisdictions compare to each other, and how has crime changed over time? This study seeks to answer these questions.

Dataset

The data used in this analysis is “Violent Crime & Property Crime by County: 1975 to Present” from the Maryland Open Data Portal. Find the dataset at: https://opendata.maryland.gov/Public-Safety/Violent-Crime-Property-Crime-by-County-1975-to-Pre/jwfa-fdxs

Findings


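import os
# Machine-specific workaround: points Qt at the platform plugins of a local
# Anaconda install on Windows. Adjust or remove this line on other machines.
os.environ['QT_QPA_PLATFORM_PLUGIN_PATH'] = 'C:/Users/tyler/anaconda3/library/plugins/platforms'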

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from matplotlib.ticker import FuncFormatter
import plotly.graph_objects as go

plt.style.use("Solarize_Light2")

df = pd.read_csv("https://opendata.maryland.gov/api/views/jwfa-fdxs/rows.csv?accessType=DOWNLOAD")

Bump Chart of Crime Types, 1975-2020

Starting with a topline overview, we can see that property crimes are consistently the most common types of crime in Maryland, led by motor vehicle theft. Interestingly, assault has slowly crept up the rankings over the years, surpassing breaking & entering in the late 2000s and surpassing larceny theft in 2020. Is assault truly increasing, or is it just being enforced more frequently? Or are other crimes simply declining? Let’s keep looking.


grouped_data = df.groupby('YEAR').agg({
    'MURDER': 'sum',
    'RAPE': 'sum',
    'ROBBERY': 'sum',
    'AGG. ASSAULT': 'sum',
    'B & E': 'sum',
    'LARCENY THEFT': 'sum',
    'M/V THEFT': 'sum'
})


ranked_data = grouped_data.rank(axis=1, method='min', ascending=False)

ax = ranked_data.plot.line(figsize=(12, 6), marker = 'o', markeredgewidth=1, linewidth=3, 
                           markersize=6, 
                           markerfacecolor='white')
ax.set_ylabel('Rank of Crime Types')
ax.set_title('Ranking of Crime Types by Year, All Jurisdictions')
ax.set_ylim(.5, 7.5) 
ax.set_xlim(1974,2029)
ax.set_xlabel('Year')
ax.invert_yaxis()  
plt.legend(loc='best', fontsize=8)


plt.show()
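
To make the ranking step behind the bump chart concrete, here is a minimal sketch on a toy table. The column names and counts are invented for illustration and are not values from the Maryland dataset; only the rank() call mirrors the one used above.

```python
import pandas as pd

# Hypothetical yearly totals, invented purely for illustration
# (these are NOT values from the Maryland dataset).
toy_totals = pd.DataFrame(
    {
        "CRIME A": [500, 450, 300],
        "CRIME B": [200, 320, 310],
        "CRIME C": [50, 40, 45],
    },
    index=pd.Index([2018, 2019, 2020], name="YEAR"),
)

# axis=1 ranks across the columns within each row (year);
# ascending=False gives rank 1 to the largest count in that year.
toy_ranks = toy_totals.rank(axis=1, method="min", ascending=False)
print(toy_ranks)
```

In this toy table, Crime B moves to rank 1 in 2020 even though its own count fell slightly, because Crime A fell faster. A line crossing in a bump chart therefore shows a change in relative order, not necessarily an increase in the crime that moved up.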

Stacked Bar Chart of Crime by Jurisdiction, 2020

If we look at the most recent year in the dataset, we can see, as expected, that Baltimore City leads the state in overall crime, followed by the other highly populated jurisdictions. Most jurisdictions seem to follow a comparable proportion of crime types. This is a helpful topline perspective, but it’s challenging to derive major insights from raw counts when population so heavily skews the data. In the next few charts, we’ll look at the crime rate per 100,000 people.


df_2020 = df[df['YEAR'] == 2020]

grouped_data = df_2020.groupby('JURISDICTION').agg({
    'MURDER': 'sum',
    'RAPE': 'sum',
    'ROBBERY': 'sum',
    'AGG. ASSAULT': 'sum',
    'B & E': 'sum',
    'LARCENY THEFT': 'sum',
    'M/V THEFT': 'sum'
})

grouped_data['TOTAL'] = grouped_data.sum(axis=1)

sorted_data = grouped_data.sort_values(by='TOTAL', ascending=False)

sorted_data = sorted_data.drop('TOTAL', axis=1)

ax = sorted_data.plot.bar(stacked=True, figsize=(12, 6))
ax.set_ylabel('Sum of Crimes')
ax.set_xlabel('Jurisdiction')
ax.set_title('Sum of Violent and Property Crimes by County in 2020')

formatter = FuncFormatter(lambda x, pos: f'{x:,.0f}')
ax.yaxis.set_major_formatter(formatter)

plt.show()

Clustered Bar Chart, Crime per 100K People 2020 vs 2010

Viewing these jurisdictions per 100K people makes them much easier to compare. First and foremost, crime appears to have declined in every jurisdiction, which is great. 2020 may be skewed by the pandemic, but I validated that 2019 showed comparable results. Interestingly, some of the smaller jurisdictions, like Wicomico County and Worcester County, become more prominent for crime with this comparison. Wicomico, for instance, is a small county with a low population, but it is the home of Salisbury University, and the surrounding areas are fairly low income. On the flip side, Prince George’s County has seen a tremendous decrease in crime in the past ten years. Further analysis would be required to determine what has driven such a major shift.


df_filtered = df[df['YEAR'].isin([2010, 2020])]

crime_rate = df_filtered.pivot_table(index='JURISDICTION', columns='YEAR', values='OVERALL CRIME RATE PER 100,000 PEOPLE').reset_index()

fig, ax = plt.subplots(figsize=(15, 6))

x = np.arange(len(crime_rate['JURISDICTION']))
width = 0.35

# Draw 2010 on the left and 2020 on the right so each pair reads chronologically
ax.bar(x - width / 2, crime_rate[2010], width, label="2010 Crime Rate", color='orange')
ax.bar(x + width / 2, crime_rate[2020], width, label="2020 Crime Rate")
ax.set_xticks(x)
ax.set_xticklabels(crime_rate['JURISDICTION'], rotation=90)
ax.set_ylabel("Overall Crime Rate per 100,000 People")
formatter = FuncFormatter(lambda x, pos: f'{x:,.0f}')
ax.yaxis.set_major_formatter(formatter)

ax.legend()

plt.title("Overall Crime Rate in 2010 and 2020 by Jurisdiction")
plt.show()
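
As a minimal sketch of why the per-100K view reorders jurisdictions, the snippet below normalizes raw counts by population. The helper rate_per_100k and the two jurisdictions are hypothetical and the numbers are invented; the dataset already ships its own rate columns, which is what the chart above uses.

```python
def rate_per_100k(count: int, population: int) -> float:
    """Return incidents per 100,000 residents (hypothetical helper)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Hypothetical jurisdictions mapped to (incident count, population).
# Invented numbers, not taken from the Maryland dataset.
toy_jurisdictions = {
    "Large County": (20_000, 1_000_000),
    "Small County": (3_000, 100_000),
}

for name, (count, population) in toy_jurisdictions.items():
    print(f"{name}: {rate_per_100k(count, population):,.0f} per 100K")
```

Here the smaller jurisdiction has far fewer incidents in absolute terms but the higher rate, which is the same effect that brings low-population counties forward in the comparison.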

Multiple Line Plots, Top 5 Violent Crime Jurisdictions, 1975-2020

This chart drills down on the 5 jurisdictions with the highest violent crime rates per 100K people as of 2020, to see how these have changed over time. All of them appear to have seen an overall downward trend, but there is a pretty significant spike for Baltimore City in the late 2010s. More on that later.


top_5_jurisdictions = df_2020.sort_values(by='VIOLENT CRIME RATE PER 100,000 PEOPLE', ascending=False).head(5)['JURISDICTION'].tolist()

df_top_5 = df[df['JURISDICTION'].isin(top_5_jurisdictions)]


pivoted_data = df_top_5.pivot(index='YEAR', columns='JURISDICTION',
                              values='VIOLENT CRIME RATE PER 100,000 PEOPLE')

ax = pivoted_data.plot.line(figsize=(12, 6))
ax.set_ylabel('Violent Crime Rate per 100,000 People')
ax.set_title('Violent Crime Rate by Jurisdiction and Year')
ax.set_xlabel('Year')

plt.show()

Nested Donut Chart, Baltimore City % of Crime Types, 2020

Zooming in on Baltimore City, we can see that over half of the crime is non-violent theft, and two-thirds of it is property crime. Violent crime making up 33% of the total is pretty significant, though.


baltimore_data_2020 = df[(df['JURISDICTION'] == 'Baltimore City') & (df['YEAR'] == 2020)].iloc[0]

outer_ring_categories = ['Violent Crime Total', 'Property Crime Total']
outer_ring_totals = [baltimore_data_2020['VIOLENT CRIME TOTAL'], baltimore_data_2020['PROPERTY CRIME TOTALS']]
total_crime_sum = sum(outer_ring_totals)
outer_ring_percentages = [total / total_crime_sum * 100 for total in outer_ring_totals]

inner_ring_categories = ['Murder & Rape', 'Robbery', 'Assault', 'B & E', 'Theft', 'M/V Theft']
inner_ring_totals = [(baltimore_data_2020['MURDER'] + baltimore_data_2020['RAPE']), baltimore_data_2020['ROBBERY'],
                     baltimore_data_2020['AGG. ASSAULT'], baltimore_data_2020['B & E'], baltimore_data_2020['LARCENY THEFT'],
                     baltimore_data_2020['M/V THEFT']]
inner_ring_percentages = [total / total_crime_sum * 100 for total in inner_ring_totals]

fig, ax = plt.subplots(figsize=(10,10))

ax.pie(outer_ring_percentages, labels=outer_ring_categories, radius=1, autopct = '%1.0f%%', 
       pctdistance = 0.85, labeldistance = 1.01, textprops= {'fontsize':13},
       startangle=90,wedgeprops=dict(width=0.3, edgecolor='w'))
ax.pie(inner_ring_percentages, labels=inner_ring_categories, radius=1-0.3, autopct = '%1.0f%%', 
       pctdistance = 0.7, labeldistance = 0.8,textprops= {'fontsize':13},
       startangle=90, wedgeprops=dict(width=0.3, edgecolor='w'))
plt.title('Crime in Baltimore City, 2020', fontsize=18)


ax.axis('equal')
plt.tight_layout()

plt.show()

Waterfall Chart of the Murder Rate in Baltimore, 2000-2020

The data in this chart echoes what we saw in the earlier line chart: Baltimore’s murder rate was declining fairly consistently until 2015. Per the New York Times, the reversal was due to a chain reaction stemming from the death of Freddie Gray. In summary, police mismanagement led to the violent offshoots of a largely peaceful protest spiraling out of control, which then led to a massive retreat in policing, dubbed the “pullback.” The communities that needed support most were left without protection, and crime skyrocketed. It’s a heartbreaking story that many parts of the city are still struggling to recover from.

For more information, refer to the full article: https://www.nytimes.com/2019/03/12/magazine/baltimore-tragedy-crime.html


years = range(df['YEAR'].max() - 19, df['YEAR'].max() + 1)
baltimore_murder_data = df[(df['JURISDICTION'] == 'Baltimore City') & (df['YEAR'].isin(years))]

# This column holds the yearly percent change in the murder rate, not raw counts
murder_pct_change = baltimore_murder_data.groupby('YEAR')['MURDER  RATE PERCENT CHANGE PER 100,000 PEOPLE'].sum().reset_index()

fig = go.Figure(go.Waterfall(
    name="Murder Rate % Change",
    x=murder_pct_change['YEAR'],
    y=murder_pct_change['MURDER  RATE PERCENT CHANGE PER 100,000 PEOPLE'],
    textposition="outside",
    texttemplate="%{y:.0f}%",
    connector={"line": {"color": "rgb(63, 63, 63)"}},
))

fig.update_layout(
    title=dict(text="Baltimore City Murder Rate Over Last 20 Years",x=0.5),
    xaxis_title="Year",
    yaxis_title="Percent Change Per 100,000 People",
    showlegend=False
)

# Render the figure explicitly so it also displays when run outside a notebook
fig.show()
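
The waterfall plots a percent-change column that ships with the dataset. The source does not say how that column was derived, so the sketch below is only an assumption: it treats it as a year-over-year percent change in a rate and reproduces that calculation on invented numbers with pandas' pct_change().

```python
import pandas as pd

# Hypothetical rates per 100,000 people, invented for illustration
# (NOT values from the Maryland dataset).
toy_rates = pd.Series(
    [40.0, 30.0, 45.0],
    index=pd.Index([2013, 2014, 2015], name="YEAR"),
)

# pct_change() returns the fractional change from the previous row;
# multiply by 100 for percent. The first year has no predecessor, so it is NaN.
toy_pct = toy_rates.pct_change() * 100
print(toy_pct.round(1))

# A waterfall stacks its values additively, so its running total is the
# sum of the yearly percent changes (NaN is skipped by sum()).
additive_total = toy_pct.sum()

# The net change in the rate itself over the same period.
net_change = (toy_rates.iloc[-1] / toy_rates.iloc[0] - 1) * 100
print(additive_total, net_change)
```

In this toy series the yearly changes add up to a different number than the net change in the underlying rate, so under this assumption the bars are best read individually, year by year, with the running total treated as a rough guide.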

Conclusion

If you ask Marylanders where the most dangerous parts of Maryland are, they’re going to point to Baltimore City. I believe this study highlighted a few things:

First, it’s important to view any geographical data in proportion to the population: Eastern Shore communities that otherwise wouldn’t be noticed rose high on the list of areas with significant crime.

Second, Baltimore City has been rocked by injustice, political ambition, and general bureaucratic incompetence, and the result is a higher death toll. I wasn’t expecting this report to turn into a study of the political landscape of Maryland, but it sure does feel like an elephant in the room.

Finally, crime does not occur in a vacuum. I began this report thinking that crime data impacts our decisions on where to live, and more importantly, where not to live. However, there’s more to it than that: crime is a side effect of flaws in the system, of injustice that needs to be addressed. My hope is that our leaders can be held accountable, our communities can band together, and that in 10 years, another student will analyze an updated version of this dataset and tell a story about how Maryland had one hell of a comeback. (Bonus points if Adley leads the Orioles to a comeback, too.)