my_data <- Crime_Incidents_in_2015 
small_data <- my_data[1:100, ]
head(small_data)
## # A tibble: 6 × 25
##        X      Y CCN    REPORT_DAT START_DATE END_DATE BLOCK OFFENSE METHOD SHIFT
##    <dbl>  <dbl> <chr>  <chr>      <chr>      <chr>    <chr> <chr>   <chr>  <chr>
## 1 401701 136862 11105… 2015/08/1… 2011/07/2… 2011/07… 1516… SEX AB… KNIFE  MIDN…
## 2 396523 137847 13134… 2015/07/2… 2013/09/1… 2013/09… 1700… SEX AB… OTHERS MIDN…
## 3 396535 140772 14174… 2015/01/2… 2014/11/0… 2014/11… 3400… THEFT … OTHERS DAY  
## 4 400121 137998 14191… 2015/03/2… 2014/12/1… 2014/12… 1300… SEX AB… OTHERS MIDN…
## 5 397628 140648 14192… 2015/01/1… 2014/12/1… 2014/12… 3500… SEX AB… OTHERS MIDN…
## 6 405483 136075 15000… 2015/01/0… 2015/01/0… 2015/01… 200 … ASSAUL… GUN    EVEN…
## # ℹ 15 more variables: WARD <dbl>, ANC <chr>, DISTRICT <dbl>, PSA <dbl>,
## #   NEIGHBORHOOD_CLUSTER <chr>, BLOCK_GROUP <chr>, CENSUS_TRACT <chr>,
## #   VOTING_PRECINCT <chr>, BID <chr>, XBLOCK <dbl>, YBLOCK <dbl>,
## #   LATITUDE <dbl>, LONGITUDE <dbl>, OBJECTID <dbl>, OCTO_RECORD_ID <lgl>
library(knitr)  # provides kable()
kable(head(my_data, 20), caption = "First 20 Rows of My Data")
library(tidyverse)
small_data %>%
  count(OFFENSE) %>%
  ggplot(aes(x = reorder(OFFENSE, n), y = n, fill = OFFENSE)) +
  geom_bar(stat = "identity") +
  coord_flip() +  # Makes the long crime names easier to read
  labs(title = "Distribution of Crime Incidents by Offense Type",
       x = "Offense Type",
       y = "Count") +
  theme_minimal() +
  guides(fill = "none")

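This chart shows which offense types, such as ‘Theft’ or ‘Sex Abuse’, appear in the sample. Because `small_data` is simply the first 100 rows of the file, it is neither a random sample nor a snapshot of early 2015: the first six rows alone carry report dates from January to August 2015. The counts therefore describe these 100 records only and should not be read as a citywide ranking of offense types.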
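Since `small_data` is taken from the top of the file, one alternative is to draw the 100 rows at random. The following is a minimal sketch under stated assumptions: `toy_data`, its made-up offense labels, and the seed are hypothetical stand-ins for `my_data`, not the real records.

```r
library(dplyr)

# Hypothetical stand-in for `my_data`: 1,000 rows with made-up offense labels.
set.seed(2015)  # fixed seed so the draw is reproducible
toy_data <- tibble(
  OBJECTID = 1:1000,
  OFFENSE  = sample(c("THEFT", "ROBBERY", "BURGLARY"), 1000, replace = TRUE)
)

# Draw 100 rows at random instead of taking rows 1 to 100.
random_sample <- toy_data %>% slice_sample(n = 100)

# Counts per offense type in the random sample, largest first.
offense_counts <- random_sample %>% count(OFFENSE, sort = TRUE)
offense_counts
```

With the real data, replacing `toy_data` with `my_data` would give a sample whose offense counts are more likely to reflect the whole file.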

small_data %>%
  count(METHOD) %>%  # reorder() below sorts the bars, so no arrange() is needed
  ggplot(aes(x = reorder(METHOD, n), y = n)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() +
  labs(title = "Distribution of Crime Incidents by Method",
       x = "Method Used",
       y = "Count") +
  theme_minimal()

This chart shows the method recorded for each offense in the sample, with values such as `KNIFE`, `GUN`, and `OTHERS`. Comparing the bars indicates how often a weapon was recorded versus other means, which gives a rough indication of the severity and nature of the incidents in this sample.

small_data %>%
  count(WARD) %>%
  ggplot(aes(x = factor(WARD), y = n)) +
  geom_bar(stat = "identity", fill = "orange") +
  labs(title = "Number of Incidents by City Ward",
       x = "Ward Number",
       y = "Number of Incidents") +
  theme_minimal()

By looking at the distribution across Wards, we can see how the sampled incidents are spread geographically. The chart shows raw counts, not rates: some Wards have more incidents than others in this 100-row sample, but confirming that a Ward needs more public safety resources would require the full dataset and a denominator such as population.
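To compare Wards on a common scale, the counts can be turned into shares of all sampled incidents. This is a minimal sketch; `toy_data` and its Ward values are hypothetical and only stand in for `small_data`.

```r
library(dplyr)

# Hypothetical stand-in for `small_data`: ten incidents across three wards.
toy_data <- tibble(WARD = c(1, 1, 1, 1, 2, 2, 2, 6, 6, 6))

ward_shares <- toy_data %>%
  count(WARD) %>%               # incidents per ward
  mutate(share = n / sum(n))    # fraction of all sampled incidents

ward_shares
```

A share is still not a rate per resident; it only makes clear how much of the sample each Ward accounts for.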

small_data %>%
  count(SHIFT) %>%
  ggplot(aes(x = SHIFT, y = n, fill = SHIFT)) +
  geom_bar(stat = "identity") +
  labs(title = "Crime Incidents by Time Shift",
       x = "Shift",
       y = "Count") +
  theme_minimal()

This bar chart breaks down the sampled incidents by the shift in which they were recorded (Day, Evening, or Midnight). Comparing the three bars shows which period of the day accounts for the most reports in the sample, a first indication of how the operational demand on law enforcement varies over the day.

# 5. Monthly Trend of Crime Incidents
small_data %>%
  mutate(Month = floor_date(as.Date(REPORT_DAT), "month")) %>%
  count(Month) %>%
  ggplot(aes(x = Month, y = n)) +
  geom_line(color = "darkred", linewidth = 1) +  # `linewidth` replaces `size` for lines since ggplot2 3.4.0
  geom_point(color = "darkred", size = 2) +
  labs(title = "Monthly Trend of Crime Incidents (Sample)",
       x = "Month of 2015",
       y = "Number of Incidents") +
  scale_x_date(date_labels = "%b", date_breaks = "1 month") +
  theme_minimal()

The line graph shows how many of the sampled incidents were reported in each month, based on `REPORT_DAT`. Month-by-month counts like these can reveal seasonal trends or periods of heightened activity, but with only 100 rows each month holds few incidents, so the shape of the line should be confirmed on the full dataset before concluding that crime reporting is stable or seasonal.
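The monthly grouping depends on `REPORT_DAT` being parsed correctly from text. The sketch below shows one way to make the parsing explicit and to drop unparseable rows before counting. The `REPORT_DAT` strings are hypothetical: the real values are truncated in the output above, so the time-of-day suffix used here is an assumption.

```r
library(dplyr)
library(lubridate)

# Hypothetical REPORT_DAT values; the time-of-day suffix is assumed.
toy_data <- tibble(
  REPORT_DAT = c("2015/01/05 14:30:00", "2015/01/20 09:00:00",
                 "2015/03/02 23:15:00", "not a date")
)

monthly_counts <- toy_data %>%
  # An explicit format documents the expected layout; as.Date() ignores
  # trailing characters after the date part and returns NA on a mismatch.
  mutate(report_date = as.Date(REPORT_DAT, format = "%Y/%m/%d")) %>%
  filter(!is.na(report_date)) %>%                      # drop unparseable rows
  mutate(Month = floor_date(report_date, "month")) %>% # first day of the month
  count(Month)

monthly_counts
```

Months with no incidents do not appear in `monthly_counts`, so a line drawn through the result connects the neighboring months directly.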

small_data %>%
  count(NEIGHBORHOOD_CLUSTER) %>%
  filter(!is.na(NEIGHBORHOOD_CLUSTER)) %>%
  slice_max(n, n = 10) %>%
  ggplot(aes(x = reorder(NEIGHBORHOOD_CLUSTER, n), y = n, fill = NEIGHBORHOOD_CLUSTER)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(title = "Top 10 Neighborhood Clusters by Incident Count",
       subtitle = "Based on the first 100 recorded incidents",
       x = "Neighborhood Cluster",
       y = "Number of Incidents") +
  theme_minimal() +
  guides(fill = "none")

While the Ward-level analysis provides a broad geographic overview, this chart offers a more granular look by identifying the Neighborhood Clusters with the most incidents in the sample. Note that `slice_max()` keeps ties by default, so the chart can show more than 10 clusters when several share the same count, which is likely with only 100 rows. Pinpointing these ‘hotspots’ allows for a more localized understanding of where public safety challenges are most concentrated within the city’s communities.
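The top-10 selection above relies on `slice_max()`, whose handling of ties affects how many bars are drawn. A minimal sketch of the difference, using hypothetical cluster names and counts:

```r
library(dplyr)

# Hypothetical cluster counts in which three clusters tie for second place.
cluster_counts <- tibble(
  NEIGHBORHOOD_CLUSTER = c("Cluster A", "Cluster B", "Cluster C", "Cluster D"),
  n = c(9, 5, 5, 5)
)

# Default (with_ties = TRUE): rows tied at the cutoff are all kept.
top_with_ties <- cluster_counts %>% slice_max(n, n = 2)

# with_ties = FALSE: the result is cut to exactly the requested number of rows.
top_exact <- cluster_counts %>% slice_max(n, n = 2, with_ties = FALSE)

nrow(top_with_ties)
nrow(top_exact)
```

Which variant is preferable depends on whether a fixed number of bars or a fair treatment of tied clusters matters more for the chart.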

small_data %>%
  ggplot(aes(x = OFFENSE, fill = SHIFT)) +
  geom_bar(position = "stack") +
  coord_flip() +
  labs(title = "Crime Type Distribution Across Shifts",
       x = "Offense Type",
       y = "Count of Incidents",
       fill = "Shift Time") +
  theme_minimal() +
  scale_fill_brewer(palette = "Set2")

This visualization examines the intersection between the nature of the offense and the time of day. By stacking the shifts within each offense category, we can see whether certain crimes occur disproportionately during specific hours, for instance whether ‘Theft’ is more common during the day than on the midnight shift. Because the bars have different lengths, such comparisons are easier to make on within-offense proportions than on raw counts. This insight is useful for developing shift-specific prevention and response strategies.
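Whether an offense is disproportionately tied to one shift is a question about proportions within each offense, which can be computed directly. This is a minimal sketch with hypothetical rows standing in for `small_data`:

```r
library(dplyr)

# Hypothetical stand-in for `small_data`: six incidents, two offense types.
toy_data <- tibble(
  OFFENSE = c("THEFT", "THEFT", "THEFT", "THEFT", "ROBBERY", "ROBBERY"),
  SHIFT   = c("DAY", "DAY", "DAY", "MIDNIGHT", "EVENING", "MIDNIGHT")
)

shift_shares <- toy_data %>%
  count(OFFENSE, SHIFT) %>%        # incidents per offense and shift
  group_by(OFFENSE) %>%
  mutate(share = n / sum(n)) %>%   # share of each shift within its offense
  ungroup()

shift_shares
```

In the chart itself, `geom_bar(position = "fill")` in place of `position = "stack"` would draw the same within-offense shares, with every bar scaled to a total of 1.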