Presented here are data for analyzing police response time in Dallas,
Texas. I use the data analytics site to create a data frame where the
Call (911) problem is identified as a theft and the years are subset to
cover 2017-2022. A new variable, Time-to-Dispatch, is constructed for
further analysis using `difftime(Dispatched, Received, units = "mins")`.
The data come from Dallas Open Data, by way of its main source site: Dallas Crime Analytics Overview.
This file is provided as a preliminary resource until this data is
added to the critstats package. You may also use this code to gather
data related to your class project, thesis, or other academic tasks
beyond what is provided below. Content in this file comes from a host of
different sources, which you should be familiar with before accessing
and analyzing any data.
An important first step is to read the codebook for the data. More information can be viewed at the bottom of the file in the references section. This file will be updated periodically.
Variables of primary interest:
| Variable Name | Variable Type | Description |
|---|---|---|
| Call Received Date Time | Date-Time | Date and time the related call was received |
| Call Dispatch Date Time | Date-Time | Date and time the related call was dispatched |
| Time-to-Dispatch | Numeric | Time (in minutes) between call received and call dispatched |
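As a quick illustration of how Time-to-Dispatch is computed, the sketch below applies `difftime()` to a single pair of timestamps. The two values are copied from the first row of the data printed later in this file; the `UTC` time zone is an assumption made only so the example is reproducible.

```r
# Received and dispatch times for one call (first row of the printed data)
received   <- as.POSIXct("2017-08-17 07:36:44", tz = "UTC")
dispatched <- as.POSIXct("2017-08-17 09:12:57", tz = "UTC")

# difftime() returns a "difftime" object; note that units is a quoted string
time_to_dispatch <- difftime(dispatched, received, units = "mins")
time_to_dispatch

# Convert to a plain number of minutes for arithmetic or modelling
minutes <- as.numeric(time_to_dispatch, units = "mins")
minutes
```

The gap is 1 hour, 36 minutes, and 13 seconds, which is about 96.2 minutes.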
The data are subset by year (2017-2022). I retain all other variables for now; those not needed are removed during data processing.
Open up a new .Rmd file and use `{r setup, include=FALSE}` as the header of your first code chunk.
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readr)
library(dplyr)
Download the data here and place it in the same folder as this .Rmd file.
data <- read_csv("police-response-theft-2017-2022.csv")
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
## dat <- vroom(...)
## problems(dat)
## Rows: 23003 Columns: 86
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (55): Incident Number w/year, Service Number ID, Call (911) Problem, Ty...
## dbl (17): Year of Incident, Watch, Reporting Area, Beat, Sector, Year1 of O...
## lgl (2): Family Offense, Hate Crime
## dttm (9): Date1 of Occurrence, Date2 of Occurrence, Date of Report, Date in...
## time (3): Time1 of Occurrence, Time2 of Occurrence, Offense Entered Time
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
data
## # A tibble: 23,003 × 86
## `Incident Number w/year` `Year of Incident` `Service Number ID` Watch
## <chr> <dbl> <chr> <dbl>
## 1 187681-2017 2017 187681-2017-03 3
## 2 292956-2017 2017 292956-2017-01 2
## 3 071711-2022 2022 071711-2022-01 3
## 4 204375-2017 2017 204375-2017-01 2
## 5 230053-2022 2022 230053-2022-01 2
## 6 280441-2017 2017 280441-2017-02 2
## 7 149746-2021 2021 149746-2021-01 1
## 8 225225-2017 2017 225225-2017-02 3
## 9 225225-2017 2017 225225-2017-03 3
## 10 031858-2019 2019 031858-2019-04 3
## # ℹ 22,993 more rows
## # ℹ 82 more variables: `Call (911) Problem` <chr>, `Type of Incident` <chr>,
## # `Type Location` <chr>, `Type of Property` <chr>, `Incident Address` <chr>,
## # `Apartment Number` <chr>, `Reporting Area` <dbl>, Beat <dbl>,
## # Division <chr>, Sector <dbl>, `Council District` <chr>,
## # `Target Area Action Grids` <chr>, Community <chr>,
## # `Date1 of Occurrence` <dttm>, `Year1 of Occurrence` <dbl>, …
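The parsing warning raised by `read_csv()` above suggests calling `problems()` on the data frame. The sketch below shows what that looks like on a small, made-up CSV (the `toy` data and its column names are hypothetical, not part of the Dallas file); on the real data you would call `problems(data)` instead.

```r
library(readr)

# Toy CSV (hypothetical) with a non-numeric value in a numeric column
toy <- read_csv(
  I("id,minutes\n1,12.5\n2,not_a_number\n3,40\n"),
  col_types = cols(id = col_double(), minutes = col_double())
)

# problems() lists where parsing failed, what type was expected,
# and what value was actually found
parse_issues <- problems(toy)
parse_issues

# The value that could not be parsed is stored as NA
toy$minutes
```

Reviewing these rows before analysis helps you decide whether the affected columns matter for your question.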
Next I view the variable names to identify the columns of interest.
# view variables
names(data)
## [1] "Incident Number w/year"
## [2] "Year of Incident"
## [3] "Service Number ID"
## [4] "Watch"
## [5] "Call (911) Problem"
## [6] "Type of Incident"
## [7] "Type Location"
## [8] "Type of Property"
## [9] "Incident Address"
## [10] "Apartment Number"
## [11] "Reporting Area"
## [12] "Beat"
## [13] "Division"
## [14] "Sector"
## [15] "Council District"
## [16] "Target Area Action Grids"
## [17] "Community"
## [18] "Date1 of Occurrence"
## [19] "Year1 of Occurrence"
## [20] "Month1 of Occurence"
## [21] "Day1 of the Week"
## [22] "Time1 of Occurrence"
## [23] "Day1 of the Year"
## [24] "Date2 of Occurrence"
## [25] "Year2 of Occurrence"
## [26] "Month2 of Occurence"
## [27] "Day2 of the Week"
## [28] "Time2 of Occurrence"
## [29] "Day2 of the Year"
## [30] "Date of Report"
## [31] "Date incident created"
## [32] "Offense Entered Year"
## [33] "Offense Entered Month"
## [34] "Offense Entered Day of the Week"
## [35] "Offense Entered Time"
## [36] "Offense Entered Date/Time"
## [37] "CFS Number"
## [38] "Call Received Date Time"
## [39] "Call Date Time"
## [40] "Call Cleared Date Time"
## [41] "Call Dispatch Date Time"
## [42] "Special Report (Pre-RMS)"
## [43] "Person Involvement Type"
## [44] "Victim Type"
## [45] "Victim Race"
## [46] "Victim Ethnicity"
## [47] "Victim Gender"
## [48] "Responding Officer #1 Badge No"
## [49] "Responding Officer #1 Name"
## [50] "Responding Officer #2 Badge No"
## [51] "Responding Officer #2 Name"
## [52] "Reporting Officer Badge No"
## [53] "Assisting Officer Badge No"
## [54] "Reviewing Officer Badge No"
## [55] "Element Number Assigned"
## [56] "Investigating Unit 1"
## [57] "Investigating Unit 2"
## [58] "Offense Status"
## [59] "UCR Disposition"
## [60] "Modus Operandi (MO)"
## [61] "Family Offense"
## [62] "Hate Crime"
## [63] "Hate Crime Description"
## [64] "Weapon Used"
## [65] "Gang Related Offense"
## [66] "Drug Related Istevencident"
## [67] "RMS Code"
## [68] "Criminal Justice Information Service Code"
## [69] "Penal Code"
## [70] "UCR Offense Name"
## [71] "UCR Offense Description"
## [72] "UCR Code"
## [73] "Offense Type"
## [74] "NIBRS Crime"
## [75] "NIBRS Crime Category"
## [76] "NIBRS Crime Against"
## [77] "NIBRS Code"
## [78] "NIBRS Group"
## [79] "NIBRS Type"
## [80] "Update Date"
## [81] "X Coordinate"
## [82] "Y Cordinate"
## [83] "Zip Code"
## [84] "City"
## [85] "State"
## [86] "Location1"
Subset the columns of interest.
# Subset important columns
important_columns <- c("Date1 of Occurrence",
                       "Reporting Area",
                       "Call Received Date Time",
                       "Call Dispatch Date Time",
                       "Victim Race",
                       "Victim Gender",
                       "Offense Status",
                       "Zip Code")
subset_data <- data %>%
  select(all_of(important_columns))
# Display the first few rows of the subset data
head(subset_data)
## # A tibble: 6 × 8
## `Date1 of Occurrence` `Reporting Area` `Call Received Date Time`
## <dttm> <dbl> <dttm>
## 1 2017-08-16 00:00:00 3024 2017-08-17 07:36:44
## 2 2017-04-14 00:00:00 4521 2017-12-27 09:31:19
## 3 2022-04-22 00:00:00 6037 2022-04-23 16:42:15
## 4 2017-09-04 00:00:00 1142 2017-09-06 13:40:11
## 5 2022-12-25 00:00:00 1223 2022-12-26 10:51:19
## 6 2017-12-09 00:00:00 3099 2017-12-10 12:37:06
## # ℹ 5 more variables: `Call Dispatch Date Time` <dttm>, `Victim Race` <chr>,
## # `Victim Gender` <chr>, `Offense Status` <chr>, `Zip Code` <dbl>
View the dimensions.
dim(subset_data)
## [1] 23003 8
View the structure of the data.
str(subset_data)
## tibble [23,003 × 8] (S3: tbl_df/tbl/data.frame)
## $ Date1 of Occurrence : POSIXct[1:23003], format: "2017-08-16" "2017-04-14" ...
## $ Reporting Area : num [1:23003] 3024 4521 6037 1142 1223 ...
## $ Call Received Date Time: POSIXct[1:23003], format: "2017-08-17 07:36:44" "2017-12-27 09:31:19" ...
## $ Call Dispatch Date Time: POSIXct[1:23003], format: "2017-08-17 09:12:57" "2017-12-27 10:42:47" ...
## $ Victim Race : chr [1:23003] NA NA NA NA ...
## $ Victim Gender : chr [1:23003] NA NA NA NA ...
## $ Offense Status : chr [1:23003] "Suspended" "Suspended" "Suspended" "Suspended" ...
## $ Zip Code : num [1:23003] 75229 75251 75243 75214 75223 ...
The “Call” variables need to be stored as date-time values so that the difference between them can be computed. The output above shows that `read_csv()` already parsed them as POSIXct, so the conversion below simply makes the data type explicit.
Transform the date-time variables and create a Time-to-Dispatch variable.
library(dplyr)
library(lubridate)
subset_data <- subset_data %>%
  mutate(
    Received = as_datetime(`Call Received Date Time`),
    Dispatched = as_datetime(`Call Dispatch Date Time`),
    Time_to_Dispatch = difftime(Dispatched, Received, units = "mins")
  )
# Display the first few rows of the updated dataset
head(subset_data) %>%
  select("Reporting Area",
         "Time_to_Dispatch",
         "Call Received Date Time",
         "Call Dispatch Date Time")
## # A tibble: 6 × 4
## `Reporting Area` Time_to_Dispatch `Call Received Date Time`
## <dbl> <drtn> <dttm>
## 1 3024 96.21667 mins 2017-08-17 07:36:44
## 2 4521 71.46667 mins 2017-12-27 09:31:19
## 3 6037 1162.66667 mins 2022-04-23 16:42:15
## 4 1142 161.36667 mins 2017-09-06 13:40:11
## 5 1223 440.16667 mins 2022-12-26 10:51:19
## 6 3099 280.05000 mins 2017-12-10 12:37:06
## # ℹ 1 more variable: `Call Dispatch Date Time` <dttm>
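`Time_to_Dispatch` prints as a `<drtn>` (difftime) column rather than a plain number. Many summaries and models are easier with a numeric column, so one option is to convert it. The sketch below does this on a few values copied from the output above; the names `Dispatch_Minutes` and `Negative_Gap` are hypothetical and not part of the original data.

```r
library(dplyr)

# Toy values (in minutes) copied from the first rows printed above
toy <- tibble(
  Time_to_Dispatch = as.difftime(c(96.21667, 71.46667, 1162.66667),
                                 units = "mins")
)

toy <- toy %>%
  mutate(
    # Plain numeric minutes (hypothetical column name)
    Dispatch_Minutes = as.numeric(Time_to_Dispatch, units = "mins"),
    # Flag negative gaps, which would suggest a data-entry problem
    Negative_Gap = Dispatch_Minutes < 0
  )

summary(toy$Dispatch_Minutes)
```

Passing `units = "mins"` to `as.numeric()` keeps the result in minutes even if the difftime column was stored in another unit.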
You may proceed from here with a basic analysis of differences in
Time-to-Dispatch by Reporting Area. You may also use `names(data)` to
identify which other variables you would like to analyze.
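One possible starting point for comparing Time-to-Dispatch across Reporting Areas is a grouped summary. This is a minimal sketch on made-up data, not the analysis itself: the reporting area codes match rows shown earlier, but the repeated calls and their times are hypothetical, as are the summary column names.

```r
library(dplyr)

# Toy data: area codes appear in the output above; the times are hypothetical
toy <- tibble(
  `Reporting Area` = c(3024, 3024, 4521, 4521, 6037),
  Time_to_Dispatch = as.difftime(c(96.2, 120.0, 71.5, 30.5, 1162.7),
                                 units = "mins")
)

by_area <- toy %>%
  mutate(Dispatch_Minutes = as.numeric(Time_to_Dispatch, units = "mins")) %>%
  group_by(`Reporting Area`) %>%
  summarise(
    n_calls = n(),                                        # calls per area
    mean_minutes = mean(Dispatch_Minutes, na.rm = TRUE),
    median_minutes = median(Dispatch_Minutes, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(median_minutes))                           # slowest areas first

by_area
```

The median is reported alongside the mean because dispatch times can be heavily skewed by a few very long waits. With the real data, replace `toy` with `subset_data`.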