library(sf)
library(ggplot2)
library(dplyr)
library(tidyr)
library(scales)
library(RColorBrewer)
library(units)
library(cowplot)
library(here)
library(classInt)
library(leaflet)
# library(lubridate)  # tried this for date wrangling; will try again another time
I finally received the data I intend to use for my final project. This data came from the Seattle Fire Department (SFD) via a Public Records Request through the City of Seattle’s Open Data Portal. To learn more about public data requests, please visit this site.
Because I didn't receive this dataset until late Wednesday, my goals for using it in this lesson are very basic: (1) to get familiar with it and discover what data wrangling will be needed to derive meaningful insights next time, and (2) to fulfill the requirements of this lesson.
This data includes all the 911 calls received by the Seattle Fire Department from January 2021 through March 2024. My goal in this section is to do some initial data exploration.
My first look at the data shows me a few things. The latitude and longitude are stored as large integers, so I will need to convert them into valid values that can be read for a map. Additionally, I found many zero values here.
sfdcalls <- read.csv(here("2021-2024_SFD_Response_Time_Data.csv"))
head(sfdcalls)
## Response_Date Master_Incident_Number Address Incident_Type
## 1 3/12/2024 23:52 F240035563 12740 33rd Ave Ne Aid Response
## 2 3/12/2024 23:51 F240035564 607 3rd Ave Advised Incident
## 3 3/12/2024 23:50 F240035562 1627 Belmont Ave Medic Response
## 4 3/12/2024 23:48 F240035561 1000 Union St Trans to SPD
## 5 3/12/2024 23:45 F240035560 2715 25th Ave S Aid Response
## 6 3/12/2024 23:44 F240035559 1413 E Olive Way Scenes Of Violence 7
## Time_PhonePickUp Time_First_Unit_Assigned Time_First_Unit_Enroute
## 1 3/12/2024 23:52 3/12/2024 23:54 3/12/2024 23:55
## 2 3/12/2024 23:51
## 3 3/12/2024 23:50 3/12/2024 23:51 3/12/2024 23:52
## 4 3/12/2024 23:48
## 5 3/12/2024 23:45 3/12/2024 23:46 3/12/2024 23:47
## 6 3/12/2024 23:44 3/12/2024 23:46 3/12/2024 23:48
## Time_First_Unit_Arrived Latitude Longitude Unit
## 1 3/12/2024 23:57 47720938 122292513 E39,E40,M31
## 2 47602813 122331449
## 3 3/12/2024 23:54 47615523 122324477 L10,M1,M10
## 4 47612007 122329116
## 5 3/12/2024 23:50 47579039 122300283 E30
## 6 3/12/2024 23:50 47617164 122326745 A25,B2,E25,M1,M10,M44
# Count the zero lat/long values. There are 3,686 records with a zero latitude.
num_zero_latitude <- sum(sfdcalls$Latitude == 0)
num_zero_longitude <- sum(sfdcalls$Longitude == 0)
num_zero_latitude # number of zero (missing) coordinate values
## [1] 3686
That is a lot of zero coords! A little disappointing.
For now, I am simply excluding the zero coordinate values. To turn the remaining coordinates into a usable format, I tried the st_as_sf() function, but it requires more value wrangling; I sank a bit of time into it without much success, so for the sake of time I moved on. Instead, I divide the values by a scaling factor to turn them from integers into decimal degrees. Additionally, because these coords are in the PNW, I need to make the longitude values negative for degrees west. There are a few date-time fields that should be converted to a timestamp format, but because of some issues I had doing that, I chose to manually extract the year from one of them for easy comparison in my map. I still intend to convert these date fields to a proper date-time type so I can perform some response time calculations.
Lastly, there are many incident types that are not standardized. I intend to aggregate and clean some of them, but for now I am leaving them. A noisy one appears to be “Advised Incident”, which means an SFD unit was not actually dispatched, so for now I am removing these records.
# Scaling factor: the raw coordinates are integers with six implied decimal places
scaling_factor <- 10^6
# Convert integer latitude and longitude to decimal degrees and drop the 0 lat/long values.
# Many records are "Advised Incident" calls that have nothing to do with response times, so remove them.
# Drop the original Latitude/Longitude fields so they aren't used in the map.
# Create a year field for easy comparison by pulling the four-digit year out of Response_Date.
# (A stopgap for the sake of time; proper timestamp parsing will come later for calcs.)
cleansfdcalls <- sfdcalls %>%
  # as.POSIXct(Response_Date, format = "%d/%m/%Y %H:%M") %>% this didn't work, didn't troubleshoot
  # substr(Response_Date, 6, 9) only works when the month has one digit and the day has two,
  # so match the year after the second slash instead
  mutate(callyear = sub("^[0-9]{1,2}/[0-9]{1,2}/([0-9]{4}).*$", "\\1", Response_Date)) %>%
  filter(Latitude != 0 & Longitude != 0 & Incident_Type != "Advised Incident") %>%
  mutate(latitude = Latitude / scaling_factor,
         longitude = -Longitude / scaling_factor) %>%
  select(-Latitude, -Longitude)
head(cleansfdcalls)
## Response_Date Master_Incident_Number Address Incident_Type
## 1 3/12/2024 23:52 F240035563 12740 33rd Ave Ne Aid Response
## 2 3/12/2024 23:50 F240035562 1627 Belmont Ave Medic Response
## 3 3/12/2024 23:48 F240035561 1000 Union St Trans to SPD
## 4 3/12/2024 23:45 F240035560 2715 25th Ave S Aid Response
## 5 3/12/2024 23:44 F240035559 1413 E Olive Way Scenes Of Violence 7
## 6 3/12/2024 23:23 F240035556 509 3rd Ave Aid Response
## Time_PhonePickUp Time_First_Unit_Assigned Time_First_Unit_Enroute
## 1 3/12/2024 23:52 3/12/2024 23:54 3/12/2024 23:55
## 2 3/12/2024 23:50 3/12/2024 23:51 3/12/2024 23:52
## 3 3/12/2024 23:48
## 4 3/12/2024 23:45 3/12/2024 23:46 3/12/2024 23:47
## 5 3/12/2024 23:44 3/12/2024 23:46 3/12/2024 23:48
## 6 3/12/2024 23:23 3/12/2024 23:25 3/12/2024 23:26
## Time_First_Unit_Arrived Unit callyear latitude longitude
## 1 3/12/2024 23:57 E39,E40,M31 2024 47.72094 -122.2925
## 2 3/12/2024 23:54 L10,M1,M10 2024 47.61552 -122.3245
## 3 2024 47.61201 -122.3291
## 4 3/12/2024 23:50 E30 2024 47.57904 -122.3003
## 5 3/12/2024 23:50 A25,B2,E25,M1,M10,M44 2024 47.61716 -122.3267
## 6 3/12/2024 23:28 A5 2024 47.60211 -122.3308
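I did not get st_as_sf() working for this lesson, but once the coordinates are in decimal degrees the call should be fairly short. The following is only a minimal sketch on two rows copied from the raw output above. It assumes the coordinates are WGS84 (EPSG:4326), which I have not confirmed with SFD, and the object names `raw_coords` and `calls_sf` are just for illustration.

```r
# Two coordinate pairs copied from the raw head(sfdcalls) output
raw_coords <- data.frame(
  Master_Incident_Number = c("F240035563", "F240035562"),
  Latitude  = c(47720938, 47615523),
  Longitude = c(122292513, 122324477)
)

# Same conversion as the cleaning step: integers to decimal degrees,
# with longitude made negative for degrees west
raw_coords$latitude  <- raw_coords$Latitude / 1e6
raw_coords$longitude <- -raw_coords$Longitude / 1e6

# st_as_sf() takes coords in x, y order, so longitude comes first.
# crs = 4326 (WGS84) is an assumption, not something stated in the data.
if (requireNamespace("sf", quietly = TRUE)) {
  calls_sf <- sf::st_as_sf(raw_coords,
                           coords = c("longitude", "latitude"),
                           crs = 4326)
  print(calls_sf)
}
```

Listing latitude first in `coords` would silently swap the axes and place the points far from Seattle, which may be one reason an early attempt looks wrong.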
Now I have a dataset that can be mapped! To start, I am going to extract the records from 2023 and 2024 to get an initial feel for the distribution.
Cluster Map, All SFD 911 Calls, 2023-2024
# big cluster map of all response types in 2023 and 2024------------------------------
last2years <- cleansfdcalls %>%
  # %in% keeps every row from either year; == would recycle c(2023, 2024) down the column
  filter(callyear %in% c("2023", "2024"))
leaflet(data = last2years) %>%
addTiles() %>%
addMarkers(clusterOptions = markerClusterOptions(),
#color = ~Incident_Type,
popup = paste0("Incident Type: ", last2years$Incident_Type, "<br>",
"Call Received: ", last2years$Time_PhonePickUp,"<br>",
"Response Arrived: ", last2years$Time_First_Unit_Arrived, "<br>",
"Master Incident Number: ", last2years$Master_Incident_Number)
)
Figure 1. All SFD 911 Calls, 2023-2024. All zero coordinates are removed. Some calls are very far outside city limits, going as far south as Federal Way and as far east as Issaquah.
There are MANY 911 calls received, so even with cluster markers this is very busy. For my main map, I chose to extract only calls from 2024 that had “Response” in the incident type value. This was an effort to look mainly at medical aid related calls. Again, there are many unstandardized incident types that I intend to aggregate at a later time, but for now I chose to symbolize only the primary types.
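As a rough idea of what that later aggregation could look like, here is a small sketch using incident types that appear in this lesson. The function `group_incident()` and the group names ("Medic", "Aid", "Other response", "Non-response") are hypothetical placeholders of mine, not SFD terminology, and the real rules will need a look at the full list of types.

```r
# Incident types seen in the head() output and in the palette domain of this lesson
incident_types <- c("Aid Response", "Advised Incident", "Medic Response",
                    "Trans to SPD", "Scenes Of Violence 7",
                    "Medic Response, Overdose", "Low Acuity Response",
                    "BC Aid Response")

# Hypothetical grouping rules, checked from most to least specific
group_incident <- function(x) {
  ifelse(grepl("Medic Response", x, fixed = TRUE), "Medic",
  ifelse(grepl("Aid Response", x, fixed = TRUE), "Aid",
  ifelse(grepl("Response", x, fixed = TRUE), "Other response",
         "Non-response")))
}

grouped <- data.frame(Incident_Type  = incident_types,
                      incident_group = group_incident(incident_types))
print(grouped)

# How many raw types fall into each group
print(table(grouped$incident_group))
```

The same function could be used inside mutate() to add an incident_group column, which would also give the map legend fewer, cleaner categories.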
# year and incident type filtered selection for detailed map
justaid2024 <- cleansfdcalls %>%
  filter(callyear == "2024",
         grepl("Response", Incident_Type))
# small detailed map-----------------------
num_colors <- 5
#Choose a color palette from RColorBrewer
palette <- brewer.pal(num_colors, "Set1")
pal <- colorFactor(palette, domain = c("Aid Response", "Medic Response", "Medic Response, Overdose", "Low Acuity Response", "BC Aid Response"))
leaflet(data = justaid2024) %>%
addTiles() %>%
#addMarkers(clusterOptions = markerClusterOptions()) %>%
addCircleMarkers(radius = 8,
color = ~pal(Incident_Type),
fillOpacity = 0.9,
stroke = FALSE,
label = ~Incident_Type,
popup = paste0("Call Received: ", justaid2024$Time_PhonePickUp, "<br>",
"Response Arrived: ", justaid2024$Time_First_Unit_Arrived, "<br>",
"Master Incident Number: ", justaid2024$Master_Incident_Number)
) %>%
addLegend(pal = pal,
values = ~Incident_Type,
opacity = 0.9,
title = "SFD 2024 Aid Response Incident Types")
Figure 2. All SFD Aid Response Calls in 2024. All zero coordinates are removed. Most “response” calls are for aid response, medic response, and low acuity response types. Each popup includes when the call was received, when the response arrived, and the (standardized) incident response number. Unsurprisingly, many calls are densely concentrated in the downtown area.
There are many more meaningful insights to derive from this dataset! This is just an initial look for this lesson.
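One of those next steps is the response time calculation. A minimal sketch of how it might start is below, using three rows copied from the raw output earlier. It assumes the timestamps are month-first (m/d/Y) and recorded in local Seattle time; neither is documented in the file, and the helper name `parse_sfd_time()` is hypothetical.

```r
# Three rows copied from the raw head(sfdcalls) output; the second call
# (an Advised Incident) has no arrival time
sample_calls <- data.frame(
  Time_PhonePickUp        = c("3/12/2024 23:52", "3/12/2024 23:51", "3/12/2024 23:50"),
  Time_First_Unit_Arrived = c("3/12/2024 23:57", "", "3/12/2024 23:54"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse the text timestamps.
# Assumes month/day/year order and Pacific time; empty strings become NA.
parse_sfd_time <- function(x) {
  as.POSIXct(x, format = "%m/%d/%Y %H:%M", tz = "America/Los_Angeles")
}

sample_calls$pickup  <- parse_sfd_time(sample_calls$Time_PhonePickUp)
sample_calls$arrived <- parse_sfd_time(sample_calls$Time_First_Unit_Arrived)

# With real date-times, the year no longer has to be cut out of the string
sample_calls$callyear <- format(sample_calls$pickup, "%Y")

# Minutes from phone pickup to first unit arrival (NA when no unit arrived)
sample_calls$response_mins <- as.numeric(
  difftime(sample_calls$arrived, sample_calls$pickup, units = "mins")
)

print(sample_calls[, c("callyear", "response_mins")])
```

The same helper could be applied to each of the time fields inside mutate(), since as.POSIXct() needs to be called on a column rather than piped in as its own step.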