Context

Dataset: Baltimore City Gun Offenders

Observations: 4,521 records in the file analyzed here (the count varies by dataset version)

Variables include demographic and geographic fields such as:

  • Date of Birth
  • Age (derived)
  • Gender
  • Race
  • District
  • Latitude and Longitude

Each row represents an individual offender record.

This dataset is well suited for:

  • Demographic analysis (age, gender, race)
  • Geographic analysis (districts, coordinates)
  • Distribution analysis (age patterns, clustering)
  • Law enforcement and public safety insights

Introduction

This report analyzes gun offender data from Baltimore City. The purpose of this analysis is to explore demographic patterns, geographic distribution, and district-level trends using descriptive statistics and data visualizations.

Data Import and Cleaning

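The dataset was imported from a CSV file. Age was approximated as 2026 minus the year of birth, so it can overstate a person's age by one year when the birthday has not yet occurred. Records with missing latitude or longitude were excluded from the geographic scatter plot.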
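import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset
df = pd.read_csv('Gun_Offenders.csv')
df.head()

# Calculate age from year of birth (approximate: ignores month and day);
# unparseable dates become NaT and yield a missing Age
df['DateOfBirth'] = pd.to_datetime(df['DateOfBirth'], errors='coerce')
df['Age'] = 2026 - df['DateOfBirth'].dt.year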
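The year-only subtraction above can be off by one for anyone whose birthday falls later in the year. As a minimal sketch of a stricter calculation, the helper below compares month and day against a reference date. The function name `age_on`, the sample birth dates, and the reference date are all hypothetical and are not part of the original analysis.

```python
import pandas as pd


def age_on(dob: pd.Series, as_of: pd.Timestamp) -> pd.Series:
    """Return completed years of age on `as_of`; unparseable dates give <NA>."""
    dob = pd.to_datetime(dob, errors="coerce")
    # True where the birthday falls later in the year than the reference date.
    birthday_pending = (dob.dt.month > as_of.month) | (
        (dob.dt.month == as_of.month) & (dob.dt.day > as_of.day)
    )
    age = as_of.year - dob.dt.year - birthday_pending.astype(int)
    return age.astype("Int64")  # nullable integer keeps missing values as <NA>


# Hypothetical birth dates and reference date, for illustration only.
sample_dob = pd.Series(["1990-03-15", "1990-12-31", "not a date"])
ages = age_on(sample_dob, pd.Timestamp("2026-06-01"))
print(ages.tolist())
```

With these inputs, the person born in March has already turned 36 by the reference date, the one born on 31 December is still 35, and the unparseable entry stays missing rather than raising an error.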

Descriptive Statistics

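Summary statistics from df.describe() describe the distribution of ages and, through the count row, show how many values are present in each numeric column. In the output below, coordinates are available for 3,441 of 4,521 records (about 76%), Age is missing for 2 records, and the Shape column is entirely empty. The minimum age of 1 is implausible and most likely reflects an error in the recorded date of birth.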

df.describe()
##                  X            Y        RowID  ...    Longitude  Shape          Age
## count  3441.000000  3441.000000  4521.000000  ...  3441.000000    0.0  4519.000000
## mean    -76.624188    39.310883  2261.000000  ...   -76.624188    NaN    34.015048
## std       0.045019     0.028726  1305.244613  ...     0.045019    NaN     9.727438
## min     -76.711200    39.222500     1.000000  ...   -76.711200    NaN     1.000000
## 25%     -76.661900    39.294400  1131.000000  ...   -76.661900    NaN    27.000000
## 50%     -76.628900    39.308700  2261.000000  ...   -76.628900    NaN    33.000000
## 75%     -76.588800    39.331300  3391.000000  ...   -76.588800    NaN    39.000000
## max     -76.529900    39.372000  4521.000000  ...   -76.529900    NaN    78.000000
## 
## [8 rows x 9 columns]
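By default, describe() summarizes numeric columns only, so gaps in text fields such as District do not appear in its count row. One possible way to audit completeness directly, and to flag ages outside a plausible range, is sketched below. The miniature frame and the age bounds are assumptions made for illustration; they are not taken from the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature frame standing in for the real dataset.
sample = pd.DataFrame({
    "Age": [34, 1, np.nan, 27, 78],
    "Latitude": [39.31, np.nan, 39.29, np.nan, 39.33],
    "District": ["Eastern", "Western", None, "Central", "Eastern"],
})

# Missing values per column, as counts and as a percentage of all rows.
missing = sample.isna().sum().sort_values(ascending=False)
missing_pct = (sample.isna().mean() * 100).round(1)

# Assumed plausibility bounds; adjust to whatever the analysis considers valid.
MIN_AGE, MAX_AGE = 10, 100
implausible = sample[(sample["Age"] < MIN_AGE) | (sample["Age"] > MAX_AGE)]

print(missing)
print(missing_pct)
print(implausible)
```

Rows flagged as implausible are better inspected than silently dropped, since the underlying date of birth may be recoverable.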

Visualization 1 - Offenders by District (Bar Chart)

This visualization shows the number of offenders in each district. Certain districts clearly have higher counts, indicating geographic concentration.
plt.figure(figsize=(10, 5))
# Count records per district; value_counts() sorts descending and drops missing districts
district_counts = df['District'].value_counts()
bars = plt.bar(district_counts.index, district_counts.values, color='steelblue', edgecolor='white')
plt.title('Gun Offenders by District')
plt.xlabel('District')
plt.ylabel('Number of Offenders')
plt.xticks(rotation=45)

# Write each bar's count at the middle of the bar
for bar in bars:
    plt.text(
        bar.get_x() + bar.get_width() / 2,  
        bar.get_height() / 2,                  
        str(int(bar.get_height())),            
        ha='center', va='center',
        color='white', fontsize=10, fontweight='bold'
    )

plt.tight_layout()
plt.show()

How to Read This Plot:

  • Each bar represents a district in Baltimore.
  • The height of the bar indicates the total number of offenders recorded in that district.
  • Taller bars correspond to districts with higher concentrations of offenders.

What This Plot Shows:

Some districts have noticeably higher counts than others, indicating that gun offender records are unevenly distributed across the city's districts.
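Bar heights show raw counts, and the degree of unevenness is easier to state as shares. The sketch below turns district counts into percentages and a cumulative percentage, which shows how few districts account for most records. The district labels and values are hypothetical sample data, not figures from the report.

```python
import pandas as pd

# Hypothetical sample of district labels; the real column is df['District'].
districts = pd.Series(
    ["Eastern", "Western", "Eastern", "Central", "Eastern", "Western", None],
    name="District",
)

counts = districts.value_counts()  # sorted descending, missing values dropped
summary = pd.DataFrame({
    "count": counts,
    "share_pct": (counts / counts.sum() * 100).round(1),
    "cumulative_pct": (counts.cumsum() / counts.sum() * 100).round(1),
})
print(summary)
```

Reading down the cumulative column gives statements of the form "the top two districts account for X% of records", which is more precise than comparing bar heights by eye.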

Visualization 2 - Gender Breakdown (Pie Chart)

A pie chart was used to illustrate the proportion of offenders by gender.
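plt.figure(figsize=(6, 6))
# Count records per gender category (missing values are dropped)
gender_counts = df['Gender'].value_counts()
# Only two colors are listed; matplotlib cycles through them if more categories appear
plt.pie(gender_counts, labels=gender_counts.index, autopct='%1.1f%%',
        colors=['steelblue', 'salmon'], startangle=90)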
plt.title('Gender Breakdown of Gun Offenders')
plt.show()

How to Read This Plot:

  • Each slice represents a gender category.
  • The size of each slice corresponds to the percentage of offenders.
  • Larger slices indicate a greater proportion of offenders within that category.

What This Plot Shows:

The chart shows which gender category accounts for the majority of offender records and how large the imbalance between categories is.
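Pie slices are only as reliable as the category labels behind them. If the Gender column contained inconsistent spellings, each variant would get its own slice. The sketch below shows one way to normalize labels before computing shares. The raw values and the mapping are assumptions for illustration; the report does not say whether the real column needs this step.

```python
import pandas as pd

# Hypothetical raw labels with inconsistent spelling and a missing value.
raw = pd.Series(["Male", "male ", "M", "Female", "F", None], name="Gender")

# Assumed mapping from normalized text to a canonical label.
label_map = {"m": "Male", "male": "Male", "f": "Female", "female": "Female"}

# Trim whitespace, lowercase, map to canonical labels, and keep unknowns visible.
clean = raw.str.strip().str.lower().map(label_map).fillna("Unknown")
shares = clean.value_counts(normalize=True).mul(100).round(1)
print(shares)
```

Keeping an explicit "Unknown" category, rather than dropping missing values, makes the amount of incomplete data visible in the percentages.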

Visualization 3 - Age Distribution (Histogram)

A histogram was used to analyze the distribution of offender ages.
plt.figure(figsize=(10, 5))
# Drop records without a valid date of birth, then group ages into 30 bins
df['Age'].dropna().plot(kind='hist', bins=30, color='steelblue', edgecolor='white')
plt.title('Age Distribution of Gun Offenders')
plt.xlabel('Age')
plt.ylabel('Count')
plt.tight_layout()
plt.show()

How to Read This Plot:

  • The x-axis shows age ranges.
  • The y-axis represents the number of offenders within each age group.
  • Peaks indicate the most common age ranges.

What This Plot Shows:

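Ages cluster among younger adults. According to the summary statistics, the median age is 33 and the mean about 34, and half of all records fall between ages 27 and 39. The tail extends to age 78, while the minimum of 1 is most likely a date-of-birth error. Because Age is computed relative to 2026, it reflects current age rather than age at the time of the offense.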
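A histogram with 30 equal-width bins shows the shape of the distribution, but named age groups are often easier to report. As a sketch, pd.cut can assign each age to a labeled band. The band edges, labels, and sample ages below are illustrative assumptions, not categories used in the original report.

```python
import pandas as pd

# Hypothetical ages; None stands for a record without a valid date of birth.
ages = pd.Series([19, 22, 25, 31, 33, 38, 45, 52, 67, None], name="Age")

# Assumed band edges; intervals are right-inclusive, e.g. (17, 24] is "18-24".
bins = [0, 17, 24, 34, 44, 54, 64, 120]
labels = ["<18", "18-24", "25-34", "35-44", "45-54", "55-64", "65+"]

age_group = pd.cut(ages, bins=bins, labels=labels)
group_counts = age_group.value_counts(sort=False)  # keep bands in age order
print(group_counts)
print("Most common band:", group_counts.idxmax())
```

Missing ages fall outside every band and are excluded from the counts, matching the dropna() step used for the histogram.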

Visualization 4 - Offenders by District and Race (Heatmap)

This heatmap visualizes the relationship between district and race.
plt.figure(figsize=(10, 6))
# Count records per (district, race) pair; combinations that never occur become 0
heatmap_data = df.groupby(['District', 'Race']).size().unstack(fill_value=0)
plt.imshow(heatmap_data, cmap='YlOrRd', aspect='auto')
plt.colorbar(label='Number of Offenders')
plt.xticks(range(len(heatmap_data.columns)), heatmap_data.columns, rotation=45)
plt.yticks(range(len(heatmap_data.index)), heatmap_data.index)
plt.title('Heatmap of Gun Offenders by District and Race')
plt.xlabel('Race')
plt.ylabel('District')

# Annotate each cell with its count
for i in range(len(heatmap_data.index)):
    for j in range(len(heatmap_data.columns)):
        plt.text(j, i, heatmap_data.iloc[i, j], ha='center', va='center', fontsize=9)

plt.tight_layout()
plt.show()

How to Read This Plot:

  • Rows represent districts and columns represent race categories.
  • Each cell shows the number of offenders for a specific district-race combination.
  • Darker colors indicate higher values.

What This Plot Shows:

The heatmap shows how the racial composition of offender records varies by district. Cells hold raw counts, so districts with more records overall appear darker regardless of their composition.
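To compare composition between districts of different sizes, counts can be converted to within-district percentages. The sketch below uses pd.crosstab with normalize="index" so that each row sums to 100%. The records and the placeholder race labels "A" and "B" are hypothetical and stand in for the real categories.

```python
import pandas as pd

# Hypothetical records; "A" and "B" are placeholder category labels.
records = pd.DataFrame({
    "District": ["Eastern", "Eastern", "Eastern", "Western", "Western", "Central"],
    "Race": ["A", "A", "B", "A", "B", "B"],
})

# Raw counts, equivalent to groupby(...).size().unstack(fill_value=0).
counts = pd.crosstab(records["District"], records["Race"])

# Row-normalized percentages: each district's row sums to 100.
row_pct = (
    pd.crosstab(records["District"], records["Race"], normalize="index")
    .mul(100)
    .round(1)
)
print(counts)
print(row_pct)
```

Passing row_pct instead of the raw counts to plt.imshow would make cell color reflect composition rather than district size. Both views are worth showing, since percentages hide how few records some districts have.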

Visualization 5 - Geographic Distribution (Scatter Plot)

A scatter plot was used to visualize offender locations using latitude and longitude.
plt.figure(figsize=(10, 8))
# Keep only records that have both coordinates
geo = df.dropna(subset=['Latitude', 'Longitude'])
plt.scatter(geo['Longitude'], geo['Latitude'], alpha=0.4, s=10, color='steelblue')
plt.title('Geographic Distribution of Gun Offenders in Baltimore')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.tight_layout()
plt.show()

How to Read This Plot:

  • Each point represents the recorded location of a single offender record.
  • Clusters of points indicate areas with higher concentrations of records.
  • Sparse areas indicate fewer records, but records without coordinates are not plotted at all.

What This Plot Shows:

The visualization reveals geographic clustering of recorded offender locations, suggesting potential hotspots within Baltimore. Because only records with coordinates are plotted, the map reflects a subset of the data.
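Clusters judged by eye on a scatter plot can be misleading where points overlap. One simple way to quantify density is to count points per grid cell with np.histogram2d and report the densest cell. The sketch below is a rough illustration under stated assumptions: the coordinates are synthetic (a tight cluster plus uniform background noise), and the grid extent and cell size are arbitrary choices, not values from the report.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic (longitude, latitude) points: one tight cluster plus background noise.
cluster = rng.normal(loc=[-76.63, 39.31], scale=0.005, size=(200, 2))
background = rng.uniform(low=[-76.71, 39.22], high=[-76.53, 39.37], size=(100, 2))
points = np.vstack([cluster, background])

# Count points in a regular grid of 0.02-degree cells (assumed extent and size).
counts, lon_edges, lat_edges = np.histogram2d(
    points[:, 0], points[:, 1],
    bins=[10, 8],
    range=[[-76.72, -76.52], [39.22, 39.38]],
)

# Locate the cell with the most points and report its bounds.
i, j = np.unravel_index(np.argmax(counts), counts.shape)
print("Densest cell count:", int(counts[i, j]))
print("Longitude range:", lon_edges[i], "to", lon_edges[i + 1])
print("Latitude range:", lat_edges[j], "to", lat_edges[j + 1])
```

With real data, the two coordinate columns of the filtered frame would replace the synthetic points. The result depends on cell size, so it is worth trying a few grid resolutions before naming any area a hotspot.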

Conclusion

This analysis explored gun offender data in Baltimore using a variety of visualization techniques. The findings reveal clear geographic clustering, demographic patterns in age and gender, and variation across districts.

These insights can support a better understanding of where and among whom gun-related offenses are most concentrated. Improved data completeness and further analysis could provide deeper insights into underlying causes and trends.