1 Overview

Near-repeat victimisation (NRV) is the phenomenon whereby proximity to a recently victimised person or property increases the risk of victimisation for nearby or similar targets. For example, a house next door to, or a few doors from, a recently burgled property may be targeted soon afterwards. The heightened risk is greatest close to the original event in both space and time, and decays as distance and elapsed time increase.

NRV is explained by two theories, the boost account and optimal foraging theory:

  • The boost account suggests that future victimisation is boosted by the initial event - the offender was successful and got away, so why not try it again? Their previous success and the knowledge gained can be applied to similar targets nearby that they are familiar with. A typical example is uniform housing styles and layouts (high homogeneity in residential developments makes it easier to identify suitable targets).
  • Optimal foraging theory considers the offender as the optimal forager, likening them to foraging animals. “As a forager, an animal makes a trade-off between the energy value of the food that is immediately available and the effort that will be expended in reaching a better food source. The better food has to be good enough to offset the energy required to travel and attain it. The quality of food in over-grazed areas diminishes until it re-grows. This is akin to a repeatedly burgled property, where the value of the items taken from this property declines until these items have been replaced. Once an area has been grazed out (i.e. skimmed of the best theft opportunities), the forager moves on.” (see JDI Briefs: Predictive mapping)

Whilst much of the literature on NRV considers household burglary, the concept has been applied to a broad range of problems including sex crimes, armed robberies, shootings, street robbery and vehicle crime.

Why might you want to identify NRV patterns?

  • Inform crime prevention initiatives (e.g. warning near neighbours at heightened risk of burglary)
  • Follow up and investigate within defined areas (e.g. viewing routes possibly travelled by offenders between events, or locating physical evidence to increase the likelihood of solvability - CCTV, suspect descriptions, ANPR or vehicle details etc)
  • Identify possible future targets for disrupting offenders (e.g. commercial burglary or commercial robbery NRV, or forecasting/predicting patterns in shootings where visible policing can be targeted)

2 Libraries

Conducting NRV analyses in R requires the libraries loaded below. There are two parts to this short guide. The first provides an overview of the NearRepeat package, which can be used to identify the space-time patterns in your data and quantify the extent of NRV. The second part uses the ptools package, which produces outputs that can be visualised and form part of tactical or operational briefings to frontline officers, or contribute to strategic and problem analyses.

library(tidyverse) # data transformations
library(sf)        # spatial data transformations
library(tmap)      # spatial visuals
# near repeats
library(remotes)
remotes::install_github("wsteenbeek/NearRepeat")
library(NearRepeat)
# ptools - for near strings function
library(devtools)
install_github("apwheele/ptools")
library(ptools)

3 Near Repeat package

The NearRepeat package, developed by Wouter Steenbeek, can be found on GitHub, along with detailed guides on how to prepare data, select parameters and run a near repeat analysis.

3.1 Data

This guide uses residential burglary data from Analyze Boston Crime Incident Reports 2021.

Whichever data source you are using, it will require as a minimum:

  • Easting / Xcoordinate
  • Northing / Ycoordinate
  • Date of incident as MM/DD/YYYY
  • Unique ID or reference number of event

burglary <- read_csv("sample_burglary.csv")

# to use Near Repeat function a date variable MUST be read as mm/dd/yyyy format
burglary$datemdy <- as.Date(burglary$datedmy, format = "%m/%d/%Y")

Your minimum working data set should appear as shown below.

rmarkdown::paged_table(burglary[,])

3.2 Near repeat function

Before running the near repeat function, you need to set spatial bands (in meters) and temporal bands (in days). There isn’t a great deal of guidance on how to do this. Spencer Chainey (2021, Understanding Crime) recommends working with distances that are easy to visualise, for example 100m bands in urban areas (0-100, 101-200, 201-300 etc). He suggests working with no more than five spatial bands, stating that large and numerous bands would be of little practical use. Chainey also recommends experimenting with temporal bands: first observing several 3-day temporal bands (as many as 14), and then observing four 7-day bands. The assessment therefore requires an iterative and exploratory approach in order to thoroughly assess and identify your near repeat patterns.

For the purpose of this short guide, we’re using 0, 0.1, 200, 400 and 600m as our spatial bands, and 0, 7, 14, 21 and 28 days as our temporal bands. This is because we’ve selected a relatively small data set (766 residential burglaries), spread over a large geographical area during partial periods of lock-down/stay at home orders, and the extent of NRV within the data is somewhat limited using smaller spatial/temporal bands. Note that distance is measured in the units of the x and y coordinates, so check the projection of your data before interpreting the bands as meters.

# set spatial and temporal bands
s_bands <- c(0, 0.1, 200, 400, 600) # spatial bands, meters
t_bands <- c(0, 7, 14, 21, 28) # temporal bands, days
set.seed(9489)
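The alternative band sets suggested above can be built with seq() rather than typed out by hand. This is only a sketch: the object names (s_bands_100, t_bands_3d, t_bands_7d) and the specific break points are illustrative assumptions, not recommended values.

```r
# Alternative band sets (illustrative values only)
s_bands_100 <- c(0, 0.1, seq(100, 500, by = 100)) # same location, then five 100-unit bands
t_bands_3d  <- seq(0, 42, by = 3)                 # fourteen 3-day bands
t_bands_7d  <- seq(0, 28, by = 7)                 # four 7-day bands

# the number of bands is one fewer than the number of break points
length(t_bands_3d) - 1
length(t_bands_7d) - 1
```

Swapping one of these vectors in for s_bands or t_bands and re-running the analysis is one way to carry out the iterative exploration described above.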

Once you’ve set up the spatial and temporal bands you wish to explore, pass them to the NearRepeat function along with the x, y and time/date fields.

#nr function
result <- NearRepeat(x = burglary$xcoord, 
                     y = burglary$ycoord, 
                     time = burglary$datemdy, 
                     sds = s_bands, 
                     tds = t_bands)

The results are returned as four tables named observed, knox_ratio, knox_ratio_median and pvalues. For each combination of spatial and temporal band, these show:

  • the count of observed pairs (e.g. 39 pairs of crimes fall in the 0-0.1 spatial band within 0-7 days),
  • the Knox ratio based on the mean of the simulations (e.g. for the 0-0.1 spatial band within 0-7 days, the ratio of 2.02 suggests about twice as many pairs as one would on average expect by chance),
  • the Knox ratio based on the median of the simulations,
  • p-values.

# view all results
result
## $observed
##            
##             [0,7) [7,14) [14,21) [21,28)
##   [0,0.1)      39     25      16      19
##   [0.1,200)     0      2       0       0
##   [200,400)    18     15       8       8
##   [400,600)     9     11       9      11
## 
## $knox_ratio
##            
##                 [0,7)    [7,14)   [14,21)   [21,28)
##   [0,0.1)   2.0167193 1.2791293 0.8647011 1.0446915
##   [0.1,200) 0.0000000 3.8947368 0.0000000 0.0000000
##   [200,400) 1.9613874 1.5526888 0.8767965 0.9039701
##   [400,600) 0.8175123 0.9692186 0.8371508 1.0472696
## 
## $knox_ratio_median
##            
##                 [0,7)    [7,14)   [14,21)   [21,28)
##   [0,0.1)   2.0526316 1.3157895 0.8888889 1.0555556
##   [0.1,200)                 Inf                    
##   [200,400) 2.0000000 1.5000000 0.8888889 0.8888889
##   [400,600) 0.8181818 1.0000000 0.8181818 1.1000000
## 
## $pvalues
##            
##             [0,7) [7,14) [14,21) [21,28)
##   [0,0.1)   0.001  0.135   0.752   0.446
##   [0.1,200) 1.000  0.089   1.000   1.000
##   [200,400) 0.009  0.059   0.692   0.656
##   [400,600) 0.760  0.582   0.733   0.488
## 
## attr(,"class")
## [1] "knox"
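To make the observed and Knox ratio tables less of a black box, the sketch below reproduces the idea on six made-up events using base R only. It is not the NearRepeat implementation: the toy coordinates, the helper pair_table(), the use of Euclidean distance and the 999 shuffles are all assumptions for illustration, and the package's own defaults and options may differ.

```r
# Toy data: 6 events with coordinates (arbitrary units) and day numbers
x   <- c(0, 0, 50, 500, 520, 900)
y   <- c(0, 0, 10, 500, 510, 100)
day <- c(1, 3, 5, 2, 20, 15)

s_breaks <- c(0, 0.1, 200, 400, 600)
t_breaks <- c(0, 7, 14, 21, 28)

# Tabulate all event pairs by spatial and temporal band (hypothetical helper)
pair_table <- function(x, y, day) {
  d_space <- as.vector(dist(cbind(x, y)))   # Euclidean distance per pair
  d_time  <- as.vector(dist(day))           # absolute day difference per pair
  table(cut(d_space, s_breaks, right = FALSE),
        cut(d_time,  t_breaks, right = FALSE))
}

observed <- pair_table(x, y, day)

# Monte Carlo expectation: shuffle the dates, keep the locations fixed
set.seed(1)
sims     <- replicate(999, pair_table(x, y, sample(day)))  # 4 x 4 x 999 array
expected <- apply(sims, c(1, 2), mean)

# Ratio of observed to expected pairs; cells with no pairs at all give NaN
knox_ratio <- observed / expected
observed
```

Shuffling the dates while holding the locations fixed keeps the spatial and temporal distributions intact but breaks any link between them, which is why the ratio can be read as "observed pairs relative to chance".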

We can also plot these results as a heat map matrix.

# view matrix results with pvalue range
plot(result, pvalue_range = c(0, .01))

# view matrix labelled with the observed pair counts
plot(result, text = "observed")

Note the spatial and temporal bands that show the strongest patterns, or any others you wish to explore further, and apply them when using the near_strings2 function in the ptools package, which we will look at next.
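One way to pick out the bands worth carrying forward is to filter the p-value and Knox ratio tables together. The sketch below rebuilds those two tables from the rounded values printed earlier so that it runs on its own; with the real object you would presumably use result$pvalues and result$knox_ratio instead. The 0.05 cut-off and the object names (pvals, ratio, flagged) are assumptions for illustration.

```r
band_s <- c("[0,0.1)", "[0.1,200)", "[200,400)", "[400,600)")
band_t <- c("[0,7)", "[7,14)", "[14,21)", "[21,28)")

# p-values and Knox ratios copied from the printed output (rounded)
pvals <- matrix(c(0.001, 0.135, 0.752, 0.446,
                  1.000, 0.089, 1.000, 1.000,
                  0.009, 0.059, 0.692, 0.656,
                  0.760, 0.582, 0.733, 0.488),
                nrow = 4, byrow = TRUE, dimnames = list(band_s, band_t))
ratio <- matrix(c(2.02, 1.28, 0.86, 1.04,
                  0.00, 3.89, 0.00, 0.00,
                  1.96, 1.55, 0.88, 0.90,
                  0.82, 0.97, 0.84, 1.05),
                nrow = 4, byrow = TRUE, dimnames = list(band_s, band_t))

# cells with more pairs than expected and p < 0.05 (threshold is an assumption)
keep <- which(pvals < 0.05 & ratio > 1, arr.ind = TRUE)

flagged <- data.frame(spatial    = band_s[keep[, "row"]],
                      temporal   = band_t[keep[, "col"]],
                      knox_ratio = ratio[keep],
                      p_value    = pvals[keep])
flagged
```

For this output the two flagged cells are both in the 0-7 day band, at the same location and at 200-400 distance units.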

4 Ptools package

Ptools is a relatively recent library of helper functions created by Andrew Wheeler to assist in analysing crime counts among other things. A more detailed guide on the Near Strings function is also available on Andrew’s blog.

4.1 Data

As with NearRepeat, the minimum data required are:

  • Easting / Xcoordinate
  • Northing / Ycoordinate
  • Date of incident
  • Unique ID or reference number of event

Additionally, you may need to create an integer for the dates. The guidance says actual date objects can be used, but if that produces an error when running near_strings2, the code below adds an integer field, dateint, which is the number of days since 31 December 2020, so that 1 January 2021 is day 1.

# data for ptools
burg_ptools <- burglary %>%
  select(id, xcoord, ycoord, datedmy) %>%
  # whole days since 2020-12-31, so 1 January 2021 is day 1
  mutate(dateint = as.numeric(as.Date(datedmy) - as.Date("2020-12-31"))) %>%
  arrange(dateint)

# added below due to error/warning using nearstrings function
# error message 'length of 'dimnames' [1] not equal to array extent'
burg_ptools <- burg_ptools %>% as.matrix(.) %>% as.data.frame(.)
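The date-to-integer step can be checked on a few dates in isolation. The sketch below uses three dates taken from the data set and shows two options: counting from a fixed reference date, and counting from the earliest event. The object names (dates, days_ref, days_first) are illustrative.

```r
dates <- as.Date(c("2021-01-07", "2021-01-08", "2021-01-17"))

# days since a fixed reference date (31 December 2020)
days_ref <- as.integer(dates - as.Date("2020-12-31"))
days_ref

# days since the earliest event in the data
days_first <- as.integer(dates - min(dates))
days_first
```

Either version should work for a near repeat analysis, since only the differences between dates matter, not the starting point.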

Your minimum working dataset should now appear as shown below.

rmarkdown::paged_table(burg_ptools[,])

4.2 Near Strings function

Once you’ve set up the data, pass the column names to the near_strings2 function and specify the spatial (DistThresh) and temporal (TimeThresh) thresholds. The distance threshold is expressed in the units of the coordinates, which for this data set are feet rather than meters.

# identify chains/links
burg_links <- near_strings2(dat = burg_ptools, 
                            id="id", 
                            x = "xcoord", 
                            y = "ycoord",
                            tim = "dateint", 
                            # distance threshold is in feet
                            DistThresh = 1000, 
                            TimeThresh = 7)
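Because the threshold here is in feet while the earlier bands were described in meters, a small conversion helper can keep the two consistent. This sketch assumes the coordinates are in US survey feet, which is the unit of EPSG:2249 (the CRS assigned to the data in the mapping step); m_to_ft() and ft_to_m() are hypothetical helpers, not part of ptools.

```r
# US survey foot in meters (assumed unit of the coordinates, EPSG:2249)
m_per_ft <- 1200 / 3937

m_to_ft <- function(m)  m / m_per_ft   # meters -> feet
ft_to_m <- function(ft) ft * m_per_ft  # feet -> meters

ft_to_m(1000)  # the DistThresh used above, roughly 305 m
m_to_ft(200)   # a 200 m band, roughly 656 ft
```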

The result is a table with one row per crime, labelled with a CompId (the unique ID of the string, or chain, of events it belongs to) and a CompNum (the number of events in that chain). We can add these to our original data set using the code below.

# add results to original data
burg_ptools$CompID <- burg_links$CompId
burg_ptools$CompNum <- burg_links$CompNum
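Conceptually, a near string is a connected component: two events are linked if they fall within both thresholds, and a chain is every event reachable through such links. The base R sketch below illustrates that idea on six made-up events. It is not how ptools implements near_strings2, and the toy data, the inclusive (<=) thresholds and the label-spreading loop are all assumptions for illustration.

```r
# Toy events: coordinates in arbitrary units, time in days
ev <- data.frame(id  = 1:6,
                 x   = c(0, 80, 160, 5000, 5050, 9000),
                 y   = c(0,  0,   0, 5000, 5000, 9000),
                 day = c(1,  3,   8,    2,   30,   10))
dist_thresh <- 100
time_thresh <- 7

# Two events are linked if they are close in both space and time
near <- as.matrix(dist(ev[, c("x", "y")])) <= dist_thresh &
        as.matrix(dist(ev$day))            <= time_thresh

# Label connected components by repeatedly spreading the smallest label
comp <- seq_len(nrow(ev))
repeat {
  new_comp <- sapply(seq_len(nrow(ev)), function(i) min(comp[near[i, ]]))
  if (identical(new_comp, comp)) break
  comp <- new_comp
}

ev$CompId  <- comp                                         # chain label
ev$CompNum <- as.integer(table(comp)[as.character(comp)])  # events in chain
ev
```

Events 1 and 3 are too far apart to be linked directly, but both are linked to event 2, so all three end up in one chain. Events 4 and 5 are close in space but 28 days apart, so each remains an isolate.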

An aggregated table shows the number of chains of each size. In our Boston residential burglary data, 553 of 766 events (72%) were isolates: residential burglaries with no other incident nearby within the thresholds. There were 64 chains of two incidents, 12 of three, and so on. It is also possible to subset a particular chain; the code below, for example, extracts the chain of nine events. The ability to extract chains is likely to be of great use to intelligence or tactical analysts and detectives who are interested in linked series of crime events.

# number of chains
table(aggregate(CompNum ~ CompID, data = burg_ptools, FUN = max)$CompNum)
## 
##   1   2   3   4   5   6   7   9 
## 553  64  12   4   1   2   1   1

# extract string of 9
burg_ptools[burg_ptools$CompNum == 9,]
##           id   xcoord  ycoord    datedmy dateint CompID CompNum
## 16 212001486 756442.3 2954361 2021-01-07       7     13       9
## 20 212002177 757285.2 2954624 2021-01-08       8     13       9
## 24 212002288 757515.9 2954521 2021-01-09       9     13       9
## 27 212002075 756442.3 2954361 2021-01-10      10     13       9
## 31 212002866 756442.3 2954361 2021-01-13      13     13       9
## 42 212003742 757016.8 2954097 2021-01-17      17     13       9
## 43 212003743 757016.8 2954097 2021-01-17      17     13       9
## 44 212003738 755716.2 2954647 2021-01-17      17     13       9
## 63 212005051 757626.0 2953453 2021-01-23      23     13       9
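For a briefing, it can help to reduce an extracted chain to a one-line summary. The sketch below re-enters the nine rows printed above so that it runs on its own, then derives the number of events, the date range and the number of distinct locations. The object names (chain, chain_summary) and the choice of summary fields are illustrative assumptions.

```r
# The nine-event chain, copied from the printed output
chain <- data.frame(
  id      = c(212001486, 212002177, 212002288, 212002075, 212002866,
              212003742, 212003743, 212003738, 212005051),
  xcoord  = c(756442.3, 757285.2, 757515.9, 756442.3, 756442.3,
              757016.8, 757016.8, 755716.2, 757626.0),
  ycoord  = c(2954361, 2954624, 2954521, 2954361, 2954361,
              2954097, 2954097, 2954647, 2953453),
  datedmy = as.Date(c("2021-01-07", "2021-01-08", "2021-01-09",
                      "2021-01-10", "2021-01-13", "2021-01-17",
                      "2021-01-17", "2021-01-17", "2021-01-23")),
  CompID  = 13
)

chain_summary <- data.frame(
  CompID      = chain$CompID[1],
  n_events    = nrow(chain),
  first_date  = min(chain$datedmy),
  last_date   = max(chain$datedmy),
  span_days   = as.integer(max(chain$datedmy) - min(chain$datedmy)),
  # distinct coordinate pairs; fewer than n_events means exact repeats
  n_locations = nrow(unique(chain[, c("xcoord", "ycoord")]))
)
chain_summary
```

Here nine events over 16 days occurred at six distinct locations, so the chain contains exact repeats as well as near repeats.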

4.3 Visualising results

We might also wish to target geographical areas of high NRV, in which case it helps to view the spatial distribution of the chains. From our burg_ptools output we can create a simple feature (sf) object and transform the X/Y coordinates to a geographic coordinate system (WGS 84 latitude/longitude) for display on interactive base maps. We can also import other relevant geographical data directly from the Analyze Boston website; here we’re using the Boston PD Police Districts file.

# add latitude and longitude for mapping
burgLL <- st_as_sf(burg_ptools,
                   coords = c("xcoord", "ycoord")) %>% 
  st_set_crs(2249) %>% 
  st_transform(4326)

# open boston police districts map layer
bpd_districts <- st_read("https://bostonopendata-boston.opendata.arcgis.com/datasets/boston::police-districts.geojson?outSR=%7B%22latestWkid%22%3A2249%2C%22wkid%22%3A102686%7D")
## Reading layer `Police_Districts' from data source 
##   `https://bostonopendata-boston.opendata.arcgis.com/datasets/boston::police-districts.geojson?outSR=%7B%22latestWkid%22%3A2249%2C%22wkid%22%3A102686%7D' 
##   using driver `GeoJSON'
## Simple feature collection with 12 features and 8 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -71.19084 ymin: 42.2279 xmax: -70.95296 ymax: 42.39695
## Geodetic CRS:  WGS 84

Say we just want to view the burglary chains and exclude isolates. We can filter for CompNum greater than or equal to 2 and then plot the result using tmap. As an example, CompNum has been assigned a colour palette, with darker shades of red denoting longer chains (more near repeats) and lighter shades denoting chains with only a few. You may wish to add labels or pop-up boxes for the groups or individual points with details such as date, time, modus operandi, entry point or other details that end-users would want to have easily at their disposal.

# filter chains of 2 or more near repeats
chains <- burgLL %>% filter(CompNum >= 2)

# plot as interactive map
tmap_mode("view")
tm_shape(chains) +
tm_dots(col = "CompNum",
        palette = "Reds",
        alpha = 0.8,
        size = 0.05) +
tm_shape(bpd_districts) +
  tm_borders(lwd = 1)