library(tidyverse)
# tidyverse includes ggplot2
library(here)
library(skimr)
library(janitor)
library(ggbeeswarm)
library(RColorBrewer)
# Read the csv as crime23
crime23 <- read.csv(here("Lab3", "Crime2023.csv"))
glimpse(crime23)
## Rows: 34,218
## Columns: 18
## $ X <dbl> -77.04559, -76.93632, -76.98733, -77.00132, -77.05789, …
## $ Y <dbl> 38.91335, 38.89190, 38.86447, 38.89612, 38.92933, 38.91…
## $ CCN <int> 23121881, 23124238, 23173340, 23420013, 23424043, 23002…
## $ REPORT_DAT <chr> "2023/07/27 19:36:34+00", "2023/07/31 14:17:54+00", "20…
## $ SHIFT <chr> "EVENING", "DAY", "MIDNIGHT", "DAY", "DAY", "MIDNIGHT",…
## $ METHOD <chr> "OTHERS", "OTHERS", "GUN", "OTHERS", "OTHERS", "OTHERS"…
## $ OFFENSE <chr> "BURGLARY", "ROBBERY", "THEFT/OTHER", "THEFT/OTHER", "T…
## $ BLOCK <chr> "1700 - 1799 BLOCK OF CONNECTICUT AVENUE NW", "4600 - 4…
## $ XBLOCK <dbl> 396046.2, 405524.4, 401099.5, 399886.0, 394981.0, 39655…
## $ YBLOCK <dbl> 138387.6, 136006.7, 132959.9, 136474.0, 140161.8, 13912…
## $ DISTRICT <int> 2, 6, 7, 1, 2, 3, 5, 1, 3, 3, 4, 3, 3, 7, 1, 2, 3, 3, 2…
## $ LATITUDE <dbl> 38.91334, 38.89189, 38.86446, 38.89611, 38.92932, 38.91…
## $ LONGITUDE <dbl> -77.04559, -76.93632, -76.98733, -77.00131, -77.05788, …
## $ BID <chr> "DUPONT CIRCLE", "", "", "", "", "", "", "CAPITOL HILL"…
## $ START_DATE <chr> "2023/07/27 13:31:00+00", "2023/07/31 11:22:00+00", "20…
## $ END_DATE <chr> "2023/07/27 17:32:00+00", "2023/07/31 13:05:00+00", "20…
## $ OBJECTID <int> 484230789, 484230790, 484230795, 484230804, 484230824, …
## $ OCTO_RECORD_ID <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
# Tidy the data slightly by converting all column names to lowercase
newCrime <- rename_with(crime23, tolower)
The following bar graphs count the number of crimes by offense type for each district. Each panel represents one of the seven Metropolitan Police Department (MPD) districts in DC. Based on the chart, “Theft/Other” is clearly the most dominant crime in DC, and Districts 2 and 3 appear to have the most crimes.
newCrime %>%
  ggplot(aes(x = offense, color = offense, fill = offense)) + # maps offense to the x axis, outline color, and fill of the bars
  geom_bar() + # draws a bar chart that counts the rows for each offense
  facet_wrap(~ district) + # draws one panel per district
  labs(title = "Crime offense by District", # labs() sets the title, caption, and axis labels
       caption = "Data derived from DC Open Data",
       x = "Offense",
       y = "Crime Count") +
  theme(axis.text.x = element_blank()) # removes unnecessary labels on the x axis
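The faceted bars show the pattern, but the claim is easier to check against the underlying counts. The following is a minimal sketch, assuming a data frame with the same `district` and `offense` columns as `newCrime`. The `toy_crime` rows and the `crime_count` column name are invented for illustration and are not taken from the DC data.

```r
library(dplyr)

# Hypothetical stand-in for newCrime (invented rows, not real DC data)
toy_crime <- tibble(
  district = c(2, 2, 2, 3, 3, 6),
  offense  = c("THEFT/OTHER", "THEFT/OTHER", "ROBBERY",
               "THEFT/OTHER", "BURGLARY", "ROBBERY")
)

# Count reports for each district/offense pair, largest first
offense_counts <- toy_crime %>%
  count(district, offense, name = "crime_count", sort = TRUE)

# Count total reports for each district, largest first
district_totals <- toy_crime %>%
  count(district, name = "crime_count", sort = TRUE)

print(offense_counts)
print(district_totals)
```

Running the same two `count()` calls on `newCrime` should return the numbers behind each panel.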
This version of the graph keeps only the crimes in which the method was a gun. The previous bar graph showed that Districts Two and Three had the most crime reports. However, Districts Six and Seven have the most gun crimes in DC, primarily robbery and assault with a dangerous weapon.
newCrime %>%
  # Adds another layer of analysis by keeping only the crimes committed with a gun
  filter(method == "GUN") %>%
  ggplot(aes(x = offense, color = offense, fill = offense)) +
  geom_bar() +
  facet_wrap(~ district) +
  labs(title = "Crime offense by District",
       caption = "Data derived from DC Open Data",
       x = "Offense",
       y = "Crime Count") +
  theme(axis.text.x = element_blank())
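Raw gun-crime counts depend on how many crimes each district reports overall, so it may also help to look at the share of each district's crimes that involved a gun. The following is a rough sketch, assuming the same `district` and `method` columns as `newCrime`. The `toy_crime` rows and the summary column names are invented for illustration.

```r
library(dplyr)

# Hypothetical stand-in for newCrime (invented rows, not real DC data)
toy_crime <- tibble(
  district = c(2, 2, 2, 2, 6, 6, 7, 7),
  method   = c("OTHERS", "OTHERS", "OTHERS", "GUN",
               "GUN", "OTHERS", "GUN", "GUN")
)

# For each district: total reports, gun reports, and the gun share
gun_by_district <- toy_crime %>%
  group_by(district) %>%
  summarise(
    total_crimes = n(),
    gun_crimes   = sum(method == "GUN"),
    gun_share    = mean(method == "GUN")
  ) %>%
  arrange(desc(gun_share))

print(gun_by_district)
```

A district can rank high on `gun_crimes` simply because it reports more crime overall, which is why `gun_share` is worth checking alongside the counts.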
crimedate <- table(substr(newCrime$report_dat, 1, 7)) # takes the first seven characters of report_dat (the year and month, e.g. "2023/07") and counts the reports in each month
dfcrimedays <- as.data.frame(crimedate) # turns the table into a data frame with two columns: Var1 (month) and Freq (count)
dfcrimedays %>%
  ggplot(aes(x = Var1, y = Freq, fill = Freq)) +
  geom_col() +
  labs(title = "Crime over the years in DC",
       caption = "Data derived from DC Open Data",
       x = "Month",
       y = "Crime Count") +
  theme(axis.text.x = element_text(angle = 18)) # changes the angle of the x axis text so that the labels don't overlap each other
ggsave(here("2023totalcrime.png")) # saves the last ggplot!
## Saving 7 x 5 in image
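The monthly counts can also be built inside a dplyr pipeline, which gives the columns descriptive names instead of the default `Var1` and `Freq`. The following is one possible sketch, assuming `report_dat` strings in the same `YYYY/MM/DD hh:mm:ss+00` layout shown in the `glimpse()` output. The `toy_crime` timestamps and the `month` and `crime_count` names are chosen for illustration.

```r
library(dplyr)

# Hypothetical stand-in for newCrime$report_dat (timestamps chosen for illustration)
toy_crime <- tibble(
  report_dat = c("2023/01/05 10:00:00+00", "2023/01/20 23:15:00+00",
                 "2023/02/11 08:30:00+00", "2023/07/27 19:36:34+00")
)

# Characters 1-7 of each timestamp hold the "YYYY/MM" year and month
monthly_counts <- toy_crime %>%
  mutate(month = substr(report_dat, 1, 7)) %>%
  count(month, name = "crime_count")

print(monthly_counts)
```

The result has one row per month, so it could be passed to `ggplot(aes(x = month, y = crime_count))` in place of `dfcrimedays`.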
The previous ggplot is saved as a PNG, and the saved image looks identical to the plot above… almost like twins!
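Because `ggsave()` defaults to the last plot displayed and to the current graphics device size (hence the "Saving 7 x 5 in image" message), it can be safer to pass the plot and dimensions explicitly. The following is a small sketch under that assumption. The `toy_counts` data, the object name `p`, and the temporary output path are hypothetical.

```r
library(ggplot2)

# Hypothetical monthly counts (invented values, not real DC data)
toy_counts <- data.frame(
  month       = c("2023/01", "2023/02", "2023/03"),
  crime_count = c(10, 12, 9)
)

# Store the plot in an object so ggsave() does not rely on "the last plot"
p <- ggplot(toy_counts, aes(x = month, y = crime_count)) +
  geom_col()

# Hypothetical output path in a temporary directory
out_file <- file.path(tempdir(), "toy_monthly_crime.png")

# Explicit plot, width, and height make the saved size reproducible
ggsave(out_file, plot = p, width = 7, height = 5, units = "in", dpi = 300)
```

With the size fixed in the call, the saved image no longer depends on how large the plot window happened to be.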
Cody Longbotham - PSU - GEOG588 - Lab 3