For this assignment I downloaded the 2015-2018 Boston crime dataset and explored the crimes reported during these years. My main questions concerned when crimes occurred (time of day, day of the week, and month). I then classified the crimes so that I could focus on the top 5 violent crimes, with particular attention to the year with the most reported crimes (2017). The dataset can be downloaded from this website: https://www.kaggle.com/ankkur13/boston-crime-data

Narrative: I will take you through the logic behind how I filtered my data and focused on specific areas of interest:

I first wanted to compare the number of reported crimes across the years. I visualized the yearly counts and noticed that 2015 had the fewest and 2017 had the most, slightly ahead of 2016. I was curious to understand why 2017 was the peak, so I decided to focus on crimes that occurred in 2017 for the rest of my narrative.

YEAR      n  percentage
2015  51163    16.81246
2016  91413    30.03884
2017  93038    30.57283
2018  68702    22.57588
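The counts and percentages above come from a simple group-and-count step. A minimal sketch of that step in Python follows; it assumes each record is a dict with a `YEAR` field (the column name shown in the table), and the sample rows are made up for illustration, not taken from the dataset.

```python
from collections import Counter

def yearly_share(rows):
    """Return {year: (count, percentage of all records)} sorted by year."""
    counts = Counter(int(row["YEAR"]) for row in rows)
    total = sum(counts.values())
    return {year: (n, 100 * n / total) for year, n in sorted(counts.items())}

# Made-up sample rows, only to show the shape of the input.
sample = [{"YEAR": 2015}, {"YEAR": 2016}, {"YEAR": 2016}, {"YEAR": 2017}]

for year, (n, pct) in yearly_share(sample).items():
    print(year, n, round(pct, 2))
```

With the real file, the rows could be read with `csv.DictReader`; the `int(...)` conversion is there because that reader returns every field as a string.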

In my next plot, I wanted to highlight when these crimes occurred, in hopes of identifying whether more crimes occurred at specific hours of the day (another criterion that I could later filter on). I found that most documented crimes happened after 6am.

The next question was about days of the week: were crimes more likely to occur on specific days? I learned that no day stood out. From Monday through Sunday, crimes occurred mostly after 6am and with a more or less similar frequency, so I will not be filtering for specific days of the week.
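The hour-of-day and day-of-week comparison can be checked numerically as well as visually. The sketch below computes, for each weekday, the fraction of records at or after a cutoff hour. The field names `HOUR` and `DAY_OF_WEEK` are assumptions about the dataset's columns, and the sample rows are invented.

```python
from collections import defaultdict

def share_at_or_after(rows, cutoff_hour=6):
    """Per weekday, fraction of records whose HOUR is >= cutoff_hour."""
    totals = defaultdict(int)
    late = defaultdict(int)
    for row in rows:
        day = row["DAY_OF_WEEK"]
        totals[day] += 1
        if int(row["HOUR"]) >= cutoff_hour:
            late[day] += 1
    return {day: late[day] / totals[day] for day in totals}

# Invented sample rows (assumed column names, not real records).
sample = [
    {"DAY_OF_WEEK": "Monday", "HOUR": 3},
    {"DAY_OF_WEEK": "Monday", "HOUR": 9},
    {"DAY_OF_WEEK": "Monday", "HOUR": 14},
    {"DAY_OF_WEEK": "Monday", "HOUR": 20},
    {"DAY_OF_WEEK": "Sunday", "HOUR": 2},
    {"DAY_OF_WEEK": "Sunday", "HOUR": 23},
]

print(share_at_or_after(sample))
```

If the fractions are close to each other across all seven days, that supports the decision not to filter by day of the week.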

What about months? Were crimes reported in 2017 at a higher frequency during specific months? No. Interestingly, crimes occurred consistently throughout the year, so I will not be filtering by month going forward.

Next, I wanted to take a closer look at the crime types. To make the data easier to interpret, I classified the crimes and focused mainly on violent crimes:

I focused on the following five violent crimes: “Arson”, “Manslaughter”, “Homicide”, “Human Trafficking”, and “Robbery”. The table below ranks them by frequency, shown as each crime's share of the five combined. Robbery was by far the most prevalent crime type:

Top 5 Violent Crime Types  Frequency
Robbery                    93.27%
Homicide                   3.78%
Arson                      2.34%
Human Trafficking          0.38%
Manslaughter               0.23%
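A ranking like the one above can be produced by keeping only the five crime types and expressing each count as a share of their combined total. The sketch below is one way to do that. The field name `OFFENSE_CODE_GROUP` is an assumption about the dataset, the exact labels in the file may be spelled differently, and the optional `year` filter is included because the text does not state whether the table covers 2017 only.

```python
from collections import Counter

# Labels as written in the text; the dataset's own spelling may differ.
VIOLENT = {"Arson", "Manslaughter", "Homicide", "Human Trafficking", "Robbery"}

def violent_shares(rows, year=None):
    """Rank the selected crimes by their share (in %) of the selected total."""
    counts = Counter(
        row["OFFENSE_CODE_GROUP"]
        for row in rows
        if row["OFFENSE_CODE_GROUP"] in VIOLENT
        and (year is None or int(row["YEAR"]) == year)
    )
    total = sum(counts.values())
    return [(crime, round(100 * n / total, 2)) for crime, n in counts.most_common()]

# Invented sample rows for illustration only.
sample = [
    {"YEAR": 2017, "OFFENSE_CODE_GROUP": "Robbery"},
    {"YEAR": 2017, "OFFENSE_CODE_GROUP": "Robbery"},
    {"YEAR": 2017, "OFFENSE_CODE_GROUP": "Robbery"},
    {"YEAR": 2017, "OFFENSE_CODE_GROUP": "Homicide"},
    {"YEAR": 2017, "OFFENSE_CODE_GROUP": "Larceny"},
    {"YEAR": 2016, "OFFENSE_CODE_GROUP": "Robbery"},
]

print(violent_shares(sample, year=2017))
```

Because the shares are computed within the five selected types, they sum to 100% and say nothing about how common these crimes are relative to all reported crimes.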

Now I wanted to piece things together: filter the dataset by year, hour, month, and offense type, and then map the result using a geospatial plot. As before, I focused on the year that had the highest number of crimes, 2017:

What if I wanted to look at the changes in these five violent crimes over the years 2015-2018? I realized that this visual is not very informative: the colors show reported violent crime cases without distinguishing which crimes they are. To be informative, the plot needs to break the counts down by both year and crime type.

Breaking down violent crimes by year:
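The counts behind a year-by-type breakdown are a two-way tally. The sketch below counts records for each (year, crime type) pair; the field names `YEAR` and `OFFENSE_CODE_GROUP` are assumptions about the dataset, and the sample rows are invented.

```python
from collections import Counter

def year_by_crime(rows, crimes):
    """Count records for each (year, crime type) pair, restricted to `crimes`."""
    table = Counter()
    for row in rows:
        crime = row["OFFENSE_CODE_GROUP"]
        if crime in crimes:
            table[(int(row["YEAR"]), crime)] += 1
    return table

# Invented sample rows for illustration only.
sample = [
    {"YEAR": 2016, "OFFENSE_CODE_GROUP": "Robbery"},
    {"YEAR": 2017, "OFFENSE_CODE_GROUP": "Robbery"},
    {"YEAR": 2017, "OFFENSE_CODE_GROUP": "Robbery"},
    {"YEAR": 2017, "OFFENSE_CODE_GROUP": "Arson"},
    {"YEAR": 2017, "OFFENSE_CODE_GROUP": "Larceny"},
]

table = year_by_crime(sample, {"Robbery", "Arson"})
for (year, crime), n in sorted(table.items()):
    print(year, crime, n)
```

Each pair can then be drawn as its own bar or colored segment, so that the crime types stay distinguishable within every year.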

Since robbery occurred most frequently, I decided to focus on it in this leaflet heatmap and try to pinpoint which areas in Boston were the hottest: where did robberies occur the most? Interestingly, most robberies occurred around the Financial District area!

Conclusions: The number of reported crimes in Boston was highest in 2017, and I was interested in learning more about their distribution and when they occurred. Through my analysis, I learned that among the five violent crimes I examined, robberies occurred most frequently; they were spread across various regions but concentrated mostly around the Financial District area. Moreover, crimes were reported throughout the year with no surges during specific months. However, they mostly occurred after 6am.