Project 2

Author

Zachary Rodavich

For Project 2, I will be visualizing crimes committed in Rockville, MD, using an open dataset provided by Montgomery County. The analysis focuses on a select set of crimes and aims to determine where in Rockville crimes occur most and which crimes are committed most often.

Step 1. Setting Everything Up

#Loading the required libraries
library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.0     ✔ readr     2.1.6
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(tidyr)
library(leaflet)
library(ggthemes)
library(ggplot2)
library(maps)


Attaching package: 'maps'

The following object is masked from 'package:purrr':

    map

#Checking the working directory
getwd()

[1] "/Users/zacharyrodavich/Downloads"

Step 2. Reading the CSV File

#Reading the CSV file
mococrimes <- read_csv("Crime_20260421.csv")

Rows: 11182 Columns: 30
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (22): Dispatch Date / Time, Start_Date_Time, End_Date_Time, NIBRS Code, ...
dbl  (8): Incident ID, Offence Code, CR Number, Victims, Zip Code, Address N...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
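
The message above appears because `read_csv()` had to guess the column types. As a minimal sketch, the types can be declared up front with `col_types`, which also silences the message. The CSV text and the three columns used here are a hypothetical stand-in for the county export, not the real file.

```r
library(readr)

# Hypothetical two-row CSV standing in for the county export
csv_text <- "Incident ID,City,Victims\n201571503,ROCKVILLE,1\n201571500,BETHESDA,2\n"

# I() tells read_csv() to treat the string as literal data, not a file path
toy <- read_csv(
  I(csv_text),
  col_types = cols(
    `Incident ID` = col_double(),
    City = col_character(),
    Victims = col_double()
  )
)

toy
```

Declaring the types also guards against a column being read as the wrong type if the export format changes.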

#Showing the dataset
mococrimes

# A tibble: 11,182 × 30
   `Incident ID` `Offence Code` `CR Number` `Dispatch Date / Time`
           <dbl>          <dbl>       <dbl> <chr>                 
 1     201571503           9101   260016574 04/18/2026 03:26:05 AM
 2     201571500           9101   260016573 04/18/2026 01:49:43 AM
 3     201571493           9113   260016563 04/17/2026 10:43:35 PM
 4     201571492           9199   260016564 04/17/2026 10:35:13 PM
 5     201571496           1399   260016568 04/18/2026 12:28:58 AM
 6     201571504           9061   260016560 04/17/2026 09:31:22 PM
 7     201571476           5309   260016554 04/17/2026 09:14:00 PM
 8     201571488           5707   260016552 04/17/2026 08:48:56 PM
 9     201571490           1399   260016559 04/17/2026 09:43:56 PM
10     201571478           9199   260016526 04/17/2026 07:18:59 PM
# ℹ 11,172 more rows
# ℹ 26 more variables: Start_Date_Time <chr>, End_Date_Time <chr>,
#   `NIBRS Code` <chr>, Victims <dbl>, `Crime Name1` <chr>,
#   `Crime Name2` <chr>, `Crime Name3` <chr>, `Police District Name` <chr>,
#   `Block Address` <chr>, City <chr>, State <chr>, `Zip Code` <dbl>,
#   Agency <chr>, Place <chr>, Sector <chr>, Beat <chr>, PRA <chr>,
#   `Address Number` <dbl>, `Street Prefix` <chr>, `Street Name` <chr>, …

Step 3. Filtering the Dataset and Defining Variables

#Filtering to crimes committed within Rockville and to ten selected crime types
crimesrv <- mococrimes |>
  filter(City == "ROCKVILLE") |>
  filter(`Crime Name2` %in% c("Motor Vehicle Theft","Shoplifting","Simple Assault","Driving Under the Influence","Destrction/Damage/Vandalisim of Property","Drug/Narcotic Violations","Embezzlement","Tresspass of Real Property","All other Larceny","False Pretenses/Swindle/Confidence Game"))

This project focuses on ten crime types: Assault, Shoplifting, Auto Theft, Embezzlement, Fraud, Drug-Related Crimes, DUI (Drunk Driving), Property Damage, Trespassing, and Larceny.
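
Because `%in%` matches strings exactly, a category name that is misspelled or capitalized differently from the data matches nothing and is dropped without any warning. The sketch below shows one way to check for that. It uses a small hypothetical table rather than the county data, and `wanted` and `unmatched` are illustrative names.

```r
library(dplyr)

# Hypothetical toy data standing in for the county dataset
toy <- tibble(
  City = c("ROCKVILLE", "ROCKVILLE", "BETHESDA", "ROCKVILLE"),
  `Crime Name2` = c("Shoplifting", "Simple Assault", "Shoplifting", "Embezzlement")
)

# Categories we intend to keep; "Shoplifing" is deliberately misspelled
wanted <- c("Shoplifting", "Simple Assault", "Shoplifing")

# Requested categories that never appear in the data
unmatched <- setdiff(wanted, unique(toy$`Crime Name2`))
unmatched

# The filter itself runs without complaint even though one name matched nothing
kept <- toy |>
  filter(City == "ROCKVILLE", `Crime Name2` %in% wanted)
nrow(kept)
```

An empty `unmatched` vector means every requested category exists in the data at least once.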

#Showing the filtered dataset
crimesrv

# A tibble: 594 × 30
   `Incident ID` `Offence Code` `CR Number` `Dispatch Date / Time`
           <dbl>          <dbl>       <dbl> <chr>                 
 1     201571441           1399   260016446 04/17/2026 12:25:06 PM
 2     201571391           2404   260016409 04/17/2026 10:45:31 AM
 3     201571344           5404   260016358 04/16/2026 10:30:28 PM
 4     201571411           3532   260016352 04/16/2026 09:29:28 PM
 5     201571320           1399   260016312 04/16/2026 04:27:23 PM
 6     201571270           2303   260016258 04/16/2026 12:37:06 PM
 7     201571328           1399   260016197 04/16/2026 12:12:42 AM
 8     201571304           2399   260016129 04/15/2026 03:19:45 PM
 9     201571206           2799   260016182 04/15/2026 07:42:07 PM
10     201571195           2303   260016134 04/15/2026 02:03:40 PM
# ℹ 584 more rows
# ℹ 26 more variables: Start_Date_Time <chr>, End_Date_Time <chr>,
#   `NIBRS Code` <chr>, Victims <dbl>, `Crime Name1` <chr>,
#   `Crime Name2` <chr>, `Crime Name3` <chr>, `Police District Name` <chr>,
#   `Block Address` <chr>, City <chr>, State <chr>, `Zip Code` <dbl>,
#   Agency <chr>, Place <chr>, Sector <chr>, Beat <chr>, PRA <chr>,
#   `Address Number` <dbl>, `Street Prefix` <chr>, `Street Name` <chr>, …

#Renaming the crime categories to shorter labels
#("Embezzlement" and "Shoplifting" are already short, so they are left as-is)
crimesrv$`Crime Name2`[crimesrv$`Crime Name2` == "Motor Vehicle Theft"] <- "Auto Theft"

crimesrv$`Crime Name2`[crimesrv$`Crime Name2` == "Simple Assault"] <- "Assault"

crimesrv$`Crime Name2`[crimesrv$`Crime Name2` == "Driving Under the Influence"] <- "DUI"

crimesrv$`Crime Name2`[crimesrv$`Crime Name2` == "Destruction/Damage/Vandalism of Property"] <- "Property Crimes"

crimesrv$`Crime Name2`[crimesrv$`Crime Name2` == "Drug/Narcotic Violations"] <- "Drug Crimes"

crimesrv$`Crime Name2`[crimesrv$`Crime Name2` == "Tresspass of Real Property"] <- "Trespassing"

crimesrv$`Crime Name2`[crimesrv$`Crime Name2` == "All other Larceny"] <- "Larceny"

crimesrv$`Crime Name2`[crimesrv$`Crime Name2` == "False Pretenses/Swindle/Confidence Game"] <- "Fraud"
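
The one-at-a-time replacements above can also be written as a single lookup table, which keeps every original name and short label in one place. This is a minimal sketch on a hypothetical vector, not the project data, and `short_names` is an illustrative name.

```r
# Hypothetical raw category names, as they might appear in `Crime Name2`
raw <- c("Motor Vehicle Theft", "Simple Assault", "Shoplifting", "Simple Assault")

# Lookup table: names are the original values, elements are the short labels
short_names <- c(
  "Motor Vehicle Theft" = "Auto Theft",
  "Simple Assault"      = "Assault"
)

# Replace values found in the lookup; leave everything else untouched
renamed <- ifelse(raw %in% names(short_names), short_names[raw], raw)
renamed <- unname(renamed)
renamed
```

Values without an entry in the lookup, such as "Shoplifting" here, pass through unchanged.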

Step 4. Creating the First Plots

#Creating the first ggplot, a simple dot plot of incident counts by crime type
crimes_summary <- crimesrv |>
  group_by(`Crime Name2`) |>
  summarize(count = n())

#Fixed values (alpha, color) go outside aes(); only data-driven mappings go inside
ggplot(crimes_summary, aes(x = `Crime Name2`, y = count)) +
  geom_point(aes(size = count), alpha = 1, color = "#991100") +
  geom_smooth(method = "lm", se = FALSE) +
  scale_size_area() +
  theme_bw() +
  labs(x = "Type of Crime",
       y = "Instances",
       size = "Count",
       caption = "Source: Montgomery County, MD Open Data",
       title = "Crime Incidents Reported in Rockville")

`geom_smooth()` using formula = 'y ~ x'

#Creating a more colorful ggplot
p1 <- crimes_summary |>
  ggplot(aes(x = reorder(`Crime Name2`, count), y = count, fill = `Crime Name2`)) +
  geom_col(position = "identity", alpha = 0.5, color = "white") +
  scale_fill_discrete(
    name = "Crime",
    labels = c("Assault", "Auto Theft", "Drug Crimes", "DUI", "Embezzlement", "Fraud", "Larceny", "Shoplifting")) +
  labs(
    x = "Type of Crime",
    y = "Instances",
    title = "Crime Incidents Reported in Rockville",
    caption = "Source: Montgomery County, MD Open Data"
  ) +
  theme_bw()
p1

#Breaking each crime type down by its broader category (`Crime Name1`)
#fct_rev() reverses the axis order without reordering the rows themselves,
#so each bar keeps the fill that belongs to it
p2 <- crimesrv |>
  ggplot(aes(x = fct_rev(`Crime Name2`), fill = `Crime Name1`)) +
  geom_bar(position = "dodge", alpha = 0.5, color = "white") +
  scale_fill_discrete(name = "Crime Category") +
  labs(
    x = "Type of Crime",
    y = "Instances",
    title = "Crime Incidents Reported in Rockville",
    caption = "Source: Montgomery County, MD Open Data"
  ) +
  theme_bw() +
  coord_flip()
p2

Step 5. Mapping

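#Centering on Rockville and drawing each incident as a fixed-size circle
#(radius must be numeric, in meters; a text column cannot be used here)
leaflet() |>
  setView(lng = -77.15, lat = 39.1, zoom = 11.5) |>
  addProviderTiles("Esri.WorldStreetMap") |>
  addCircles(
    data = crimesrv,
    radius = 50
  )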

Assuming "Longitude" and "Latitude" are longitude and latitude, respectively

#Defining the Lat and Long Variables
crimesrv_lat <- mean(crimesrv$Latitude, na.rm = TRUE)
crimesrv_lon <- mean(crimesrv$Longitude, na.rm = TRUE)
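
The averages above skip missing values but still include every other row. As a minimal sketch, assuming (hypothetically) that some records carry placeholder coordinates of 0 instead of a real location, such rows can be dropped before averaging so that they do not pull the center away from Rockville. The toy values below are illustrative, not taken from the dataset.

```r
library(dplyr)

# Hypothetical coordinates: two real-looking points, one 0/0 placeholder, one missing
toy <- tibble(
  Latitude  = c(39.084, 39.090, 0, NA),
  Longitude = c(-77.153, -77.147, 0, NA)
)

# Keep only rows with usable coordinates
valid <- toy |>
  filter(!is.na(Latitude), !is.na(Longitude), Latitude != 0, Longitude != 0)

# Average position of the remaining incidents
center_lat <- mean(valid$Latitude)
center_lon <- mean(valid$Longitude)
c(center_lat, center_lon)
```

Values computed this way could be passed to `setView(lng = center_lon, lat = center_lat, zoom = 11.5)` in place of hard-coded coordinates.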

#Creating a Map with points and user interactivity
leaflet() |>
 setView(lng = -77.15, lat = 39.1, zoom = 11.5) |>
 addProviderTiles("Esri.WorldStreetMap") |>
  addCircles(
    data = crimesrv,
    radius = ~50,
    color = "#165",
    fillColor = "#198",
    fillOpacity = 0.25,
  label = ~`Crime Name2`,
  popup = ~paste("<strong>Crime Type:</strong>", `Crime Name2`),
  highlightOptions = highlightOptions (
  weight = 4,
  color = "#608",
  fillOpacity = 0.7,
  bringToFront = TRUE
  )
  )

Assuming "Longitude" and "Latitude" are longitude and latitude, respectively

The visualizations above show that the vast majority of crimes are committed in or near Rockville Town Center and along Rockville Pike between Veirs Mill Road and Pike & Rose. One concerning trend is that assaults, shoplifting, and car thefts are very common in Rockville.

If I had more time, I would add visualizations covering other locations in Montgomery County rather than concentrating only on Rockville. I could also have focused on more serious crimes, such as aggravated assault. I ran the project several times before submitting to confirm that all of the code works as expected and that the completed project renders to RPubs.
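
The location pattern described above comes from reading the map. A tally by street is one way to check it numerically. This is a minimal sketch on hypothetical rows. `Street Name` is a column in the dataset, but the street values and counts below are made up for illustration.

```r
library(dplyr)

# Hypothetical incidents; the street names are illustrative only
toy <- tibble(
  `Street Name` = c("ROCKVILLE", "ROCKVILLE", "ROCKVILLE", "VEIRS MILL", "VEIRS MILL", "GUDE"),
  `Crime Name2` = c("Shoplifting", "Assault", "Shoplifting", "Auto Theft", "Assault", "Fraud")
)

# Count incidents per street, most frequent first
by_street <- toy |>
  count(`Street Name`, name = "incidents", sort = TRUE)

# The two streets with the most incidents
top_streets <- head(by_street, 2)
top_streets
```

The same pattern, with `Crime Name2` in place of `Street Name`, ranks the crime types themselves.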

AI USE ATTRIBUTION STATEMENT

────────────────────────────────────────
Title: DATA 110 Project 2
Creator: Zachary Rodavich
Context: DATA 110
Document Type: Student assignment

AI Permission: AI-NO
AI Creation Categories: Debugging

AI Tools Used:
• Gemini 3 (used 2026-04-23): Debugging/Troubleshooting
• Gemini 3 (used 2026-04-26): Debugging/Troubleshooting

AI Prompt: Show me how to fix an error with my code

Human Role: I rewrote my existing code using the suggestions provided by the A.I. tool listed above to fix the highlighted errors.

Notes: A.I. was used only for troubleshooting faulty code in this project. All other work was completed by me and me only, in keeping with the class “10% A.I. Limit” policy. If there are any concerns or questions, please email me.

────────────────────────────────────────
Generated with AI Attribution Generator