Generating a Count of IDU-related Arrests per Chicago Community Area (2016)

Overview

In this analysis, I generate a new variable construct that will be included in my final model of the HIV risk environment for IDUs (injection drug users) in Chicago: the count of IDU-related arrests per Chicago Community Area in 2016. The specific arrests that I am interested in are those for crimes characterized as “possession of hypodermic needle,” “sale/delivery of hypodermic needle,” and “possession of drug equipment.” Research has indicated that laws and policing initiatives can heighten HIV risk among IDUs by affecting syringe exchange use and accessibility and by potentially causing mixing between IDUs’ networks and groups (Burris et al. 2004; Rhodes et al. 2005). In particular, on-the-ground enforcement of laws prohibiting the possession of drug paraphernalia, along with high arrest rates, has been shown to decrease the carrying of syringes and equipment by IDUs, which increases opportunities for sharing (Burris et al. 2004; Rhodes et al. 2005). By generating this IDU-related arrest count variable, I hope to gain a better understanding of the HIV risk landscape in the Chicago area.

Data Sources

The data that I will be using to construct the IDU-related arrest count variable are “Crimes - 2016” and “Boundaries - Community Areas (current),” both from the Chicago Data Portal. The “Crimes - 2016” dataset includes reported incidents of crime that took place in Chicago in 2016, drawn from the Chicago Police Department’s CLEAR (Citizen Law Enforcement Analysis and Reporting) system. Important pieces of this dataset include the year of the crime (2016), the IUCR (Illinois Uniform Crime Reporting) code, a description of this code and its “Primary Type,” arrest information, and the longitude and latitude coordinates of each crime. The “Boundaries - Community Areas (current)” dataset is spatial data consisting of the current community area boundaries in Chicago and can be downloaded as a shapefile.

ETL Workflow

To generate the new variable construct, the count of IDU-related arrests per Chicago Community Area in 2016, I EXTRACT the CSV data file of Chicago crime from 2016 and the Chicago community area boundaries shapefile, TRANSFORM these datasets through geoprocessing, and LOAD a cleaned IDU-related crime data CSV, an IDU-related crime points shapefile, and a shapefile that includes the arrest count per community area. This process ultimately enables me to present the count variable as a thematic map. The steps of my ETL workflow can be seen below.

Extract (Chicago Crime Data 2016)

Load libraries and set up R Session

In order to begin my analysis, I first loaded all the libraries that will be used throughout this spatial analysis process as seen below.

library(sf)
library(tmap)
library(leaflet)
library(data.table)
library(tidyverse)
library(tidyr)
library(dplyr)

Read in the Chicago Crimes 2016 CSV file

Here, I am reading in the crime csv so I can bring it into my R environment.

ChicagoCrime <- fread("Crimes_-_2016.csv", header = TRUE)
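As a minimal sketch of how the import could be made more explicit, the chunk below writes a tiny made-up CSV to a temporary file and reads it back with fread, forcing IUCR to be read as text so that codes such as "0610" keep their leading zero. The toy rows and the names demo_path and demo_crime are hypothetical and are not part of the real dataset.

```r
library(data.table)

# Hypothetical two-row stand-in for the real crime CSV
demo_path <- tempfile(fileext = ".csv")
writeLines(c("ID,IUCR,Latitude,Longitude",
             "1,0610,41.88,-87.73",
             "2,2170,41.78,-87.66"),
           demo_path)

# Read IUCR as character so that leading zeros are not stripped
demo_crime <- fread(demo_path, header = TRUE,
                    colClasses = list(character = "IUCR"))
str(demo_crime)
```

In the real file the IUCR column already contains codes with letters (for example "031A"), so it is read as character either way; the explicit colClasses simply documents that intent.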

Transform (Chicago Crime Data 2016)

Inspect the dataset

Before beginning any analysis, I need to inspect the dataset so I can get an idea of what I am working with and understand what steps are needed to clean/wrangle the data.

glimpse(ChicagoCrime)

## Rows: 269,534
## Columns: 22
## $ ID                     <int> 11645836, 11043021, 11243066, 11243020, 112279…
## $ `Case Number`          <chr> "JC212333", "JA367631", "JB168427", "HZ184094"…
## $ Date                   <chr> "05/01/2016 12:25:00 AM", "10/19/2016 07:00:00…
## $ Block                  <chr> "055XX S ROCKWELL ST", "075XX S YATES BLVD", "…
## $ IUCR                   <chr> "1153", "0610", "1153", "0281", "1154", "2820"…
## $ `Primary Type`         <chr> "DECEPTIVE PRACTICE", "BURGLARY", "DECEPTIVE P…
## $ Description            <chr> "FINANCIAL IDENTITY THEFT OVER $ 300", "FORCIB…
## $ `Location Description` <chr> "", "RESTAURANT", "OTHER", "RESIDENCE PORCH/HA…
## $ Arrest                 <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
## $ Domestic               <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
## $ Beat                   <int> 824, 421, 332, 1712, 513, 2221, 311, 1831, 151…
## $ District               <int> 8, 4, 3, 17, 5, 22, 3, 18, 15, 7, 14, 10, 17, …
## $ Ward                   <int> 15, 7, 5, 39, 9, 19, 20, 42, 29, 6, 32, 12, 30…
## $ `Community Area`       <int> 63, 43, 43, 13, 49, 72, 40, 8, 25, 69, 22, 30,…
## $ `FBI Code`             <chr> "11", "05", "11", "02", "11", "26", "11", "11"…
## $ `X Coordinate`         <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ `Y Coordinate`         <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ Year                   <int> 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016…
## $ `Updated On`           <chr> "04/06/2019 04:04:43 PM", "08/05/2017 03:50:08…
## $ Latitude               <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ Longitude              <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ Location               <chr> "", "", "", "", "", "", "", "", "", "", "", ""…

dim(ChicagoCrime)

## [1] 269534     22

nrow(ChicagoCrime)

## [1] 269534

ncol(ChicagoCrime)

## [1] 22

Here, I can see that this dataset contains important information on the types of crime and the IUCR codes used to identify them. Although this output shows NA values for some latitude and longitude coordinates, there are no NA values for the crimes I am interested in analyzing in this model. I can also see that this dataset has 269,534 rows and 22 columns.
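The point about missing coordinates can be checked directly rather than by eye. The sketch below is only an illustration on made-up rows: the demo data frame, the idu_codes vector, and the na_check name are hypothetical, and the codes are the three IUCR codes used later in this analysis.

```r
library(dplyr)

# Hypothetical rows standing in for the crime data
demo <- data.frame(
  IUCR = c("1153", "2170", "2110", "0610"),
  Latitude = c(NA, 41.87886, 41.77720, NA),
  Longitude = c(NA, -87.72659, -87.66416, NA)
)
idu_codes <- c("2110", "2111", "2170")

# Count the rows of interest and how many of them lack a coordinate
na_check <- demo %>%
  filter(IUCR %in% idu_codes) %>%
  summarise(n_rows = n(),
            n_missing = sum(is.na(Latitude) | is.na(Longitude)))
print(na_check)
```

Run against the real data, an n_missing of 0 would confirm that no IDU-related record has to be dropped before the points are mapped.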

Identify codes of IDU-related crimes

Using the function unique, I can view all the IUCR (Illinois Uniform Crime Reporting) codes that appear in the dataset. The Chicago Police Department publishes a dataset that lays out all 350 IUCR codes and the crimes they are associated with. After inspecting this CPD data, I found that “POS:HYPODERMIC NEEDLE”, “SALE/DEL HYPODERMIC NEEDLE”, and “POSSESSION OF DRUG EQUIPMENT”, represented by IUCR codes 2110, 2111, and 2170, are the crime indicators that I am interested in as they relate to IDUs’ HIV risk environment.

unique(ChicagoCrime$IUCR)

##   [1] "1153" "0610" "0281" "1154" "2820" "1562" "1152" "1310" "2017" "4650"
##  [11] "4651" "2014" "0910" "1754" "2825" "1753" "1110" "1140" "0265" "1130"
##  [21] "5007" "0486" "1120" "1122" "1780" "0870" "0890" "0820" "1261" "1751"
##  [31] "0266" "1195" "031A" "1320" "051A" "041A" "3800" "0488" "1360" "0320"
##  [41] "5093" "1156" "1150" "0810" "1752" "1563" "1750" "0865" "1582" "1200"
##  [51] "0560" "1365" "0460" "143A" "0312" "0479" "0520" "0454" "1330" "1121"
##  [61] "0430" "0496" "0850" "0530" "0420" "2890" "1210" "0880" "0497" "4387"
##  [71] "3710" "0545" "1020" "0860" "1811" "2028" "1350" "0930" "2024" "2093"
##  [81] "0620" "0920" "2018" "2170" "4255" "4625" "1220" "2826" "5131" "502R"
##  [91] "1790" "033A" "1305" "5112" "501A" "0495" "141C" "502P" "2027" "2230"
## [101] "0331" "5002" "0630" "1477" "0275" "0484" "2250" "0558" "2092" "5111"
## [111] "1590" "5000" "2090" "0498" "0915" "1710" "1812" "2850" "5001" "3731"
## [121] "1549" "0470" "0340" "5011" "2025" "0313" "0325" "1821" "1345" "0334"
## [131] "0580" "1822" "4388" "0553" "1570" "2023" "2022" "0326" "4210" "0935"
## [141] "0650" "0550" "2070" "0330" "3960" "2095" "3760" "1025" "031B" "1585"
## [151] "0337" "2851" "2210" "1507" "2900" "3750" "2016" "1544" "0291" "502T"
## [161] "1090" "1840" "2860" "5073" "1792" "141A" "5110" "1506" "141B" "2015"
## [171] "0263" "2026" "051B" "4386" "1242" "0554" "4230" "1375" "0440" "143C"
## [181] "1030" "0261" "1565" "1460" "1170" "0917" "5004" "0485" "2021" "2870"
## [191] "0483" "1240" "1370" "0453" "0462" "4510" "1513" "0552" "143B" "1340"
## [201] "1505" "1185" "0555" "3730" "2830" "1235" "2031" "0557" "5009" "1720"
## [211] "3300" "3970" "1155" "4220" "2840" "2010" "1480" "2012" "1540" "142A"
## [221] "2080" "2220" "0461" "3910" "1661" "5121" "1535" "2091" "1206" "0583"
## [231] "2050" "5114" "2029" "1651" "2034" "0895" "0925" "0927" "1900" "3100"
## [241] "4860" "1479" "1151" "0551" "0475" "4389" "2020" "1335" "4652" "1755"
## [251] "1035" "1541" "1512" "1478" "2011" "1260" "1564" "1850" "1536" "0264"
## [261] "4800" "2032" "2040" "0830" "1055" "5003" "033B" "1450" "0450" "1245"
## [271] "1205" "0482" "2110" "1791" "1537" "501H" "2895" "0452" "1241" "5130"
## [281] "2033" "0584" "2013" "3740" "4310" "1670" "041B" "1725" "500N" "1135"
## [291] "5132" "1680" "1435" "3000" "0142" "0273" "2160" "0585" "1050" "0274"
## [301] "0937" "0556" "500E" "1566" "4240" "2060" "1520" "0581" "1230" "0918"
## [311] "1265" "2240" "1515" "3720" "0271" "5094" "3920" "0272" "1481" "3610"
## [321] "1010" "3770" "1715" "3975" "1580" "0110"
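Scanning more than 300 codes by eye is error-prone, so a quick set comparison can help. In the listing above, 2110 and 2170 appear, but 2111 does not seem to. The sketch below shows one way to check this; codes_in_data is a short hypothetical stand-in for the real output of unique().

```r
# Hypothetical vector standing in for unique(ChicagoCrime$IUCR)
codes_in_data <- c("1153", "0610", "2170", "2110", "2160")
idu_codes <- c("2110", "2111", "2170")

# Which target codes appear at least once, and which never do
present <- intersect(idu_codes, codes_in_data)
absent <- setdiff(idu_codes, codes_in_data)
print(present)
print(absent)
```

A code that never occurs does no harm in a later %in% filter; it simply contributes no rows.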

Subset and clean Chicago crime data

Now that I have identified which IUCR codes to include in my variable construct, I want to make sure that the crime data is a data frame so I can begin to clean/subset it.

ChicagoCrime.df<-as.data.frame(ChicagoCrime)

Filter

To begin to subset and clean the data, I first want to filter for the IDU-related crimes - those with IUCR codes 2110, 2111, 2170.

IDUCrimes = filter(ChicagoCrime.df, IUCR %in% c("2111","2110","2170"))
glimpse(IDUCrimes)

## Rows: 183
## Columns: 22
## $ ID                     <int> 10365037, 10365454, 10365881, 10373041, 103826…
## $ `Case Number`          <chr> "HZ100560", "HZ101106", "HZ101788", "HZ109136"…
## $ Date                   <chr> "01/01/2016 12:35:00 PM", "01/01/2016 10:00:00…
## $ Block                  <chr> "040XX W WILCOX ST", "064XX S ASHLAND AVE", "0…
## $ IUCR                   <chr> "2170", "2170", "2170", "2170", "2170", "2170"…
## $ `Primary Type`         <chr> "NARCOTICS", "NARCOTICS", "NARCOTICS", "NARCOT…
## $ Description            <chr> "POSSESSION OF DRUG EQUIPMENT", "POSSESSION OF…
## $ `Location Description` <chr> "ALLEY", "STREET", "STREET", "ALLEY", "POLICE …
## $ Arrest                 <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE…
## $ Domestic               <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
## $ Beat                   <int> 1115, 725, 2531, 1922, 434, 815, 1421, 112, 25…
## $ District               <int> 11, 7, 25, 19, 4, 8, 14, 1, 25, 8, 8, 15, 4, 1…
## $ Ward                   <int> 28, 17, 29, 47, 10, 23, 35, 42, 37, 16, 17, 28…
## $ `Community Area`       <int> 26, 67, 25, 6, 51, 56, 22, 32, 25, 66, 66, 25,…
## $ `FBI Code`             <chr> "18", "18", "18", "18", "18", "18", "18", "18"…
## $ `X Coordinate`         <int> 1149482, 1166774, 1137152, 1163818, 1192874, 1…
## $ `Y Coordinate`         <int> 1899030, 1862112, 1911278, 1925275, 1837123, 1…
## $ Year                   <int> 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016…
## $ `Updated On`           <chr> "02/10/2018 03:50:01 PM", "02/10/2018 03:50:01…
## $ Latitude               <dbl> 41.87886, 41.77720, 41.91270, 41.95059, 41.708…
## $ Longitude              <dbl> -87.72659, -87.66416, -87.77157, -87.67321, -8…
## $ Location               <chr> "(41.878858346, -87.726594936)", "(41.77719854…

I also want to filter these IDU-related crimes to include only crimes where arrests were made, since I am specifically interested in how law enforcement and policing affect IDUs’ HIV risk environment, especially in relation to the danger of being caught carrying a syringe or drug-injection equipment.

#Arrest is a logical column, so it can be used directly as the filter condition
IDU_arrests = filter(IDUCrimes, Arrest)
glimpse(IDU_arrests)

## Rows: 183
## Columns: 22
## $ ID                     <int> 10365037, 10365454, 10365881, 10373041, 103826…
## $ `Case Number`          <chr> "HZ100560", "HZ101106", "HZ101788", "HZ109136"…
## $ Date                   <chr> "01/01/2016 12:35:00 PM", "01/01/2016 10:00:00…
## $ Block                  <chr> "040XX W WILCOX ST", "064XX S ASHLAND AVE", "0…
## $ IUCR                   <chr> "2170", "2170", "2170", "2170", "2170", "2170"…
## $ `Primary Type`         <chr> "NARCOTICS", "NARCOTICS", "NARCOTICS", "NARCOT…
## $ Description            <chr> "POSSESSION OF DRUG EQUIPMENT", "POSSESSION OF…
## $ `Location Description` <chr> "ALLEY", "STREET", "STREET", "ALLEY", "POLICE …
## $ Arrest                 <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE…
## $ Domestic               <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
## $ Beat                   <int> 1115, 725, 2531, 1922, 434, 815, 1421, 112, 25…
## $ District               <int> 11, 7, 25, 19, 4, 8, 14, 1, 25, 8, 8, 15, 4, 1…
## $ Ward                   <int> 28, 17, 29, 47, 10, 23, 35, 42, 37, 16, 17, 28…
## $ `Community Area`       <int> 26, 67, 25, 6, 51, 56, 22, 32, 25, 66, 66, 25,…
## $ `FBI Code`             <chr> "18", "18", "18", "18", "18", "18", "18", "18"…
## $ `X Coordinate`         <int> 1149482, 1166774, 1137152, 1163818, 1192874, 1…
## $ `Y Coordinate`         <int> 1899030, 1862112, 1911278, 1925275, 1837123, 1…
## $ Year                   <int> 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016…
## $ `Updated On`           <chr> "02/10/2018 03:50:01 PM", "02/10/2018 03:50:01…
## $ Latitude               <dbl> 41.87886, 41.77720, 41.91270, 41.95059, 41.708…
## $ Longitude              <dbl> -87.72659, -87.66416, -87.77157, -87.67321, -8…
## $ Location               <chr> "(41.878858346, -87.726594936)", "(41.77719854…

Select

Next, I want to clean up this data because there are a few columns with data that is not necessary for my analysis.

IDU_arrests = dplyr::select(IDU_arrests, IUCR, Latitude, Longitude)
glimpse(IDU_arrests)

## Rows: 183
## Columns: 3
## $ IUCR      <chr> "2170", "2170", "2170", "2170", "2170", "2170", "2170", "21…
## $ Latitude  <dbl> 41.87886, 41.77720, 41.91270, 41.95059, 41.70803, 41.80694,…
## $ Longitude <dbl> -87.72659, -87.66416, -87.77157, -87.67321, -87.56929, -87.…

Load (Chicago Crime Data 2016)

Write cleaned IDU-related arrest data 2016 as CSV

Now, I want to save my cleaned attribute dataset as a CSV file.

write.csv(IDU_arrests, "IDUArrests2016.csv")

Transform (IDU-related Arrest Data)

Enable spatial data of IDU-related Arrest data frame

Since I am interested in how many IDU-related arrests occurred in each Chicago community area in 2016, I convert the arrest dataset into spatial data so I can plot the arrests as points.

#first I want to inspect the latitude and the longitude of the arrest data
glimpse(IDU_arrests[,c("Longitude","Latitude")])

## Rows: 183
## Columns: 2
## $ Longitude <dbl> -87.72659, -87.66416, -87.77157, -87.67321, -87.56929, -87.…
## $ Latitude  <dbl> 41.87886, 41.77720, 41.91270, 41.95059, 41.70803, 41.80694,…

#this includes checking the lat/long structure as well
str(IDU_arrests[,c("Longitude","Latitude")])

## 'data.frame':    183 obs. of  2 variables:
##  $ Longitude: num  -87.7 -87.7 -87.8 -87.7 -87.6 ...
##  $ Latitude : num  41.9 41.8 41.9 42 41.7 ...

#now I can convert the arrest data frame into a spatial data frame
IDU_arrests.pts<- st_as_sf(IDU_arrests, coords = c("Longitude", "Latitude"), crs = 4326)

#I plot the points to ensure that they are plotting correctly before moving forward with my analysis
plot(IDU_arrests.pts)
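A common mistake at this step is swapping the coordinate order, since coords expects x (longitude) first and y (latitude) second. As a small sanity check, the sketch below converts two coordinate pairs taken from the glimpse output above and inspects the result; the demo and demo_pts names are hypothetical.

```r
library(sf)

# Two coordinate pairs taken from the glimpse() output above
demo <- data.frame(
  IUCR = c("2170", "2170"),
  Latitude = c(41.87886, 41.77720),
  Longitude = c(-87.72659, -87.66416)
)

# coords must be given as x (longitude) first, then y (latitude)
demo_pts <- st_as_sf(demo, coords = c("Longitude", "Latitude"), crs = 4326)

# The bounding box should fall within Chicago's rough extent
print(st_bbox(demo_pts))
xy <- st_coordinates(demo_pts)
print(xy)
```

If the X column held values near 41 and the Y column values near -87, the order would have been reversed.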

Load (IDU-related Arrest Data)

Now that I have spatial point data of arrest locations in Chicago in 2016, I can write this data as a shapefile.

st_write(IDU_arrests.pts, "IDUArrests2016.shp")

Extract (Chicago Community Area Boundary Data)

Next, I want to read in the Chicago community area boundaries shp.

ChiAreas <-st_read("Boundaries - Community Areas (current) (1)")

## Reading layer `geo_export_f808c167-2d30-4df4-9de6-288e921ef41f' from data source `/Users/brifadden/Desktop/IDU_project/Boundaries - Community Areas (current) (1)' using driver `ESRI Shapefile'
## Simple feature collection with 77 features and 9 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -87.94011 ymin: 41.64454 xmax: -87.52414 ymax: 42.02304
## geographic CRS: WGS84(DD)

glimpse(ChiAreas)

## Rows: 77
## Columns: 10
## $ area       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ area_num_1 <chr> "35", "36", "37", "38", "39", "4", "40", "41", "42", "1", …
## $ area_numbe <chr> "35", "36", "37", "38", "39", "4", "40", "41", "42", "1", …
## $ comarea    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ comarea_id <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ community  <chr> "DOUGLAS", "OAKLAND", "FULLER PARK", "GRAND BOULEVARD", "K…
## $ perimeter  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ shape_area <dbl> 46004621, 16913961, 19916705, 48492503, 29071742, 71352328…
## $ shape_len  <dbl> 31027.05, 19565.51, 25339.09, 28196.84, 23325.17, 36624.60…
## $ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((-87.60914 4..., MULTIPOLYGON …

Transform (IDU-related Arrest Data & Chicago Community Area Boundary Data)

Subset/Clean Data

Select only relevant columns that will be helpful in analysis

There are a lot of columns in this shp that have rows of 0s or are not necessary for my analysis so I only select the ones I am interested in.

ChiAreas = dplyr::select(ChiAreas, geometry, community)
glimpse(ChiAreas)

## Rows: 77
## Columns: 2
## $ community <chr> "DOUGLAS", "OAKLAND", "FULLER PARK", "GRAND BOULEVARD", "KE…
## $ geometry  <MULTIPOLYGON [°]> MULTIPOLYGON (((-87.60914 4..., MULTIPOLYGON (…

Check and transform CRS

Here, I need to check both the spatial arrest point data and the spatial community area data CRS to make sure they are the same. Since they are not the same, I need to transform one of the CRS to match the other.

## Coordinate Reference System:
##   User input: EPSG:4326 
##   wkt:
## GEOGCRS["WGS 84",
##     DATUM["World Geodetic System 1984",
##         ELLIPSOID["WGS 84",6378137,298.257223563,
##             LENGTHUNIT["metre",1]]],
##     PRIMEM["Greenwich",0,
##         ANGLEUNIT["degree",0.0174532925199433]],
##     CS[ellipsoidal,2],
##         AXIS["geodetic latitude (Lat)",north,
##             ORDER[1],
##             ANGLEUNIT["degree",0.0174532925199433]],
##         AXIS["geodetic longitude (Lon)",east,
##             ORDER[2],
##             ANGLEUNIT["degree",0.0174532925199433]],
##     USAGE[
##         SCOPE["unknown"],
##         AREA["World"],
##         BBOX[-90,-180,90,180]],
##     ID["EPSG",4326]]

## Coordinate Reference System:
##   User input: WGS84(DD) 
##   wkt:
## GEOGCRS["WGS84(DD)",
##     DATUM["WGS84",
##         ELLIPSOID["WGS84",6378137,298.257223563,
##             LENGTHUNIT["metre",1,
##                 ID["EPSG",9001]]]],
##     PRIMEM["Greenwich",0,
##         ANGLEUNIT["degree",0.0174532925199433]],
##     CS[ellipsoidal,2],
##         AXIS["geodetic longitude",east,
##             ORDER[1],
##             ANGLEUNIT["degree",0.0174532925199433]],
##         AXIS["geodetic latitude",north,
##             ORDER[2],
##             ANGLEUNIT["degree",0.0174532925199433]]]
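The code that produced the two printouts above is not shown, so the following is only a sketch of how such a check and transformation might be done with sf. Both printed definitions appear to describe WGS 84 under different names, which is presumably why they are treated as different. In the toy example I use EPSG:4269 (NAD83) as a stand-in for the mismatched layer; the demo_pts, demo_poly, and demo_pts_t objects are hypothetical.

```r
library(sf)

# Toy point in EPSG:4326 and toy polygon in EPSG:4269 (hypothetical stand-ins)
demo_pts <- st_sf(
  id = 1,
  geometry = st_sfc(st_point(c(-87.72659, 41.87886)), crs = 4326)
)
demo_poly <- st_sf(
  community = "DEMO",
  geometry = st_sfc(st_polygon(list(rbind(
    c(-87.8, 41.8), c(-87.6, 41.8), c(-87.6, 41.9),
    c(-87.8, 41.9), c(-87.8, 41.8)
  ))), crs = 4269)
)

# Compare the two CRS definitions
same_before <- st_crs(demo_pts) == st_crs(demo_poly)

# Reproject the points into the polygon layer's CRS and compare again
demo_pts_t <- st_transform(demo_pts, st_crs(demo_poly))
same_after <- st_crs(demo_pts_t) == st_crs(demo_poly)
print(c(before = same_before, after = same_after))
```

Transforming the points rather than the polygons is an arbitrary choice here; either direction works as long as both layers end up with the same CRS.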

Overlay points over polygons

Now that both of my spatial data are in the same CRS, I can overlay the Chicago IDU-related arrest points on top of the community area polygons.

## tmap mode set to plotting
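The plotting code for this overlay is not shown above. A minimal sketch with tmap, assuming toy layers in place of the real ones, might look like the following; demo_areas, demo_arrest_pts, and overlay_map are hypothetical names.

```r
library(sf)
library(tmap)

# Toy polygon and points standing in for the community areas and arrests
demo_areas <- st_sf(
  community = "DEMO",
  geometry = st_sfc(st_polygon(list(rbind(
    c(-87.8, 41.8), c(-87.6, 41.8), c(-87.6, 41.9),
    c(-87.8, 41.9), c(-87.8, 41.8)
  ))), crs = 4326)
)
demo_arrest_pts <- st_sf(
  IUCR = c("2170", "2110"),
  geometry = st_sfc(st_point(c(-87.75, 41.85)),
                    st_point(c(-87.65, 41.82)), crs = 4326)
)

# Static plotting mode: polygon outlines first, then the points on top
tmap_mode("plot")
overlay_map <- tm_shape(demo_areas) + tm_borders() +
  tm_shape(demo_arrest_pts) + tm_dots()
```

Printing overlay_map draws the map; points that fall outside every outline would be worth inspecting before the spatial join.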

Join spatial data

Here, I want to conduct a spatial join to bring together the point Chicago arrest data and the polygon community area boundaries data.

#spatial join (st_join needs the arrest data as an sf point layer in the same CRS as ChiAreas)
IDU_arrests_chi <- st_join(IDU_arrests, ChiAreas, join = st_within)

## although coordinates are longitude/latitude, st_within assumes that they are planar

#inspect
glimpse(IDU_arrests_chi)

## Rows: 183
## Columns: 3
## $ IUCR      <chr> "2170", "2170", "2170", "2170", "2170", "2170", "2170", "21…
## $ community <chr> "WEST GARFIELD PARK", "WEST ENGLEWOOD", "AUSTIN", "LAKE VIE…
## $ geometry  <POINT [°]> POINT (-87.72659 41.87886), POINT (-87.66416 41.7772)…
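To illustrate what this join does, here is a toy sketch with one polygon and two points, one inside and one outside. By default st_join keeps every point (a left join), so a point that falls in no polygon gets NA for community. The demo_poly, demo_pts, and joined objects are hypothetical.

```r
library(sf)

# One toy polygon and two toy points (hypothetical stand-ins)
demo_poly <- st_sf(
  community = "DEMO",
  geometry = st_sfc(st_polygon(list(rbind(
    c(-87.8, 41.8), c(-87.6, 41.8), c(-87.6, 41.9),
    c(-87.8, 41.9), c(-87.8, 41.8)
  ))), crs = 4326)
)
demo_pts <- st_sf(
  IUCR = c("2170", "2110"),
  geometry = st_sfc(st_point(c(-87.7, 41.85)),   # inside the polygon
                    st_point(c(-87.0, 41.85)),   # outside the polygon
                    crs = 4326)
)

# Attach the polygon attributes to each point that lies within a polygon
joined <- st_join(demo_pts, demo_poly, join = st_within)
print(joined)

# Points with no match keep their row and get NA for community
n_unmatched <- sum(is.na(joined$community))
```

Checking n_unmatched on the real data would show whether any arrest falls outside all 77 community areas.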

Point in polygon operation

As I have previously noted, I am interested in how many IDU-related arrests occurred within each Chicago community area in 2016. This requires a point-in-polygon operation so I can count the number of arrests within each community area, creating a “crime_ct” variable. I can then create a thematic map to easily visualize the distribution of these arrest counts across the city.

Count the number of IDU-related arrests per Chicago community area

#count IDU-related arrests per community area
IDU_crime_ct <- count(as_tibble(IDU_arrests_chi), community, .drop = FALSE) %>%
  print()

## # A tibble: 41 x 2
##    community              n
##    <chr>              <int>
##  1 ASHBURN                1
##  2 AUBURN GRESHAM         5
##  3 AUSTIN                34
##  4 BELMONT CRAGIN         3
##  5 BRIGHTON PARK          1
##  6 CHATHAM                1
##  7 CHICAGO LAWN          10
##  8 EAST GARFIELD PARK     6
##  9 EDGEWATER              1
## 10 EDISON PARK            1
## # … with 31 more rows

#rename columns accordingly
names(IDU_crime_ct) <- c("community","crime_ct")
glimpse(IDU_crime_ct)

## Rows: 41
## Columns: 2
## $ community <chr> "ASHBURN", "AUBURN GRESHAM", "AUSTIN", "BELMONT CRAGIN", "B…
## $ crime_ct  <int> 1, 5, 34, 3, 1, 1, 10, 6, 1, 1, 6, 3, 5, 11, 2, 1, 2, 5, 10…
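The count above only lists the community areas that have at least one arrest. An alternative that I am not using here, but which yields a count for every polygon including zeros, is to count intersecting points per polygon directly. The sketch below uses toy squares and points; make_square, demo_areas, and demo_pts are hypothetical.

```r
library(sf)

# Helper that builds a small square polygon (hypothetical toy geometry)
make_square <- function(xmin) {
  st_polygon(list(rbind(
    c(xmin, 41.8), c(xmin + 0.1, 41.8), c(xmin + 0.1, 41.9),
    c(xmin, 41.9), c(xmin, 41.8)
  )))
}

demo_areas <- st_sf(
  community = c("A", "B"),
  geometry = st_sfc(make_square(-87.8), make_square(-87.7), crs = 4326)
)
demo_pts <- st_sfc(st_point(c(-87.75, 41.85)),
                   st_point(c(-87.76, 41.86)), crs = 4326)

# st_intersects returns, for each polygon, the indices of the points it
# contains; lengths() turns that into a count per polygon, zeros included
demo_areas$crime_ct <- lengths(st_intersects(demo_areas, demo_pts))
print(demo_areas)
```

With this approach no join back to the polygons and no NA replacement would be needed, at the cost of skipping the intermediate point-level table.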

Join arrest count data back to the original Chicago community area shapefile

Now that I have generated my crime_ct variable, I want to join this variable back to the original Chicago community area boundary shapefile. In this process, I noticed that some community areas had no IDU-related arrests in 2016, so I had to replace their NA values with 0 to indicate 0 arrests.

#first inspect
glimpse(ChiAreas)

## Rows: 77
## Columns: 2
## $ community <chr> "DOUGLAS", "OAKLAND", "FULLER PARK", "GRAND BOULEVARD", "KE…
## $ geometry  <MULTIPOLYGON [°]> MULTIPOLYGON (((-87.60914 4..., MULTIPOLYGON (…

glimpse(IDU_crime_ct)

## Rows: 41
## Columns: 2
## $ community <chr> "ASHBURN", "AUBURN GRESHAM", "AUSTIN", "BELMONT CRAGIN", "B…
## $ crime_ct  <int> 1, 5, 34, 3, 1, 1, 10, 6, 1, 1, 6, 3, 5, 11, 2, 1, 2, 5, 10…

#join arrest count data to community areas shp
community_new <- left_join(ChiAreas, IDU_crime_ct, by="community") %>%
  print()

## Simple feature collection with 77 features and 2 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -87.94011 ymin: 41.64454 xmax: -87.52414 ymax: 42.02304
## geographic CRS: WGS84(DD)
## First 10 features:
##          community crime_ct                       geometry
## 1          DOUGLAS       NA MULTIPOLYGON (((-87.60914 4...
## 2          OAKLAND       NA MULTIPOLYGON (((-87.59215 4...
## 3      FULLER PARK       NA MULTIPOLYGON (((-87.6288 41...
## 4  GRAND BOULEVARD       NA MULTIPOLYGON (((-87.60671 4...
## 5          KENWOOD       NA MULTIPOLYGON (((-87.59215 4...
## 6   LINCOLN SQUARE        1 MULTIPOLYGON (((-87.67441 4...
## 7  WASHINGTON PARK        2 MULTIPOLYGON (((-87.60604 4...
## 8        HYDE PARK       NA MULTIPOLYGON (((-87.58038 4...
## 9         WOODLAWN        2 MULTIPOLYGON (((-87.57714 4...
## 10     ROGERS PARK        5 MULTIPOLYGON (((-87.65456 4...

#some community areas have no IDU-related arrests. We want to make sure to account for this by replacing NA values in crime_ct with 0.
community_new$crime_ct[is.na(community_new$crime_ct)] = 0
glimpse(community_new)

## Rows: 77
## Columns: 3
## $ community <chr> "DOUGLAS", "OAKLAND", "FULLER PARK", "GRAND BOULEVARD", "KE…
## $ crime_ct  <dbl> 0, 0, 0, 0, 0, 1, 2, 0, 2, 5, 0, 0, 2, 0, 1, 0, 0, 0, 3, 3,…
## $ geometry  <MULTIPOLYGON [°]> MULTIPOLYGON (((-87.60914 4..., MULTIPOLYGON (…
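Because this join matches on community names, it can be worth confirming that no count row fails to match, and the NA replacement can be limited to the count column. The sketch below uses plain tibbles with three names and two counts taken from the output above; areas, counts, unmatched, and joined are hypothetical names.

```r
library(dplyr)
library(tidyr)

# Toy tables using community names and counts shown in the output above
areas <- tibble(community = c("DOUGLAS", "LINCOLN SQUARE", "ROGERS PARK"))
counts <- tibble(community = c("LINCOLN SQUARE", "ROGERS PARK"),
                 crime_ct = c(1L, 5L))

# Count rows whose community name has no partner in the areas table
unmatched <- anti_join(counts, areas, by = "community")

# Join, then replace NA only in the count column, keeping it an integer
joined <- areas %>%
  left_join(counts, by = "community") %>%
  mutate(crime_ct = replace_na(crime_ct, 0L))
print(joined)
```

An empty unmatched table suggests that no arrests were lost to spelling differences between the two sources.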

Load (IDU-related Arrest Count per Chicago Community Area)

Write this new spatial data as a shapefile

Finally, I want to save this spatial data, with my IDU-related arrest count variable, as a shapefile.

st_write(community_new, "CommunityArrests2016.shp")
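One practical note: by default st_write stops with an error if the layer already exists, which matters when this script is re-run. The sketch below writes a toy layer to a temporary file twice, the second time with append = FALSE to replace it, and then reads it back; demo_areas, out_path, and back are hypothetical names.

```r
library(sf)

# Toy layer standing in for community_new
demo_areas <- st_sf(
  community = "DEMO",
  crime_ct = 2,
  geometry = st_sfc(st_polygon(list(rbind(
    c(-87.8, 41.8), c(-87.6, 41.8), c(-87.6, 41.9),
    c(-87.8, 41.9), c(-87.8, 41.8)
  ))), crs = 4326)
)

out_path <- tempfile(fileext = ".shp")

# First write creates the shapefile
st_write(demo_areas, out_path, quiet = TRUE)

# Re-running the same write needs append = FALSE to replace the layer
st_write(demo_areas, out_path, append = FALSE, quiet = TRUE)

# Read the file back to confirm the attributes survived
back <- st_read(out_path, quiet = TRUE)
print(back)
```

The shapefile format limits field names to 10 characters, which is why the boundary file read in earlier has truncated names such as area_numbe; community and crime_ct are short enough to be kept as they are.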

Visualize the number of IDU-related arrests per Chicago community area in 2016

To plot this data, I used tmap’s interactive viewing mode. From the “Plots” R environment, I can then export this map as an image, PDF, or link.

## tmap mode set to interactive viewing
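The mapping code itself is not shown above. As a rough sketch of how such a thematic map might be built with tmap, assuming toy data in place of community_new, consider the following; make_square, demo_areas, and arrest_map are hypothetical names.

```r
library(sf)
library(tmap)

# Helper that builds a small square polygon (hypothetical toy geometry)
make_square <- function(xmin) {
  st_polygon(list(rbind(
    c(xmin, 41.8), c(xmin + 0.1, 41.8), c(xmin + 0.1, 41.9),
    c(xmin, 41.9), c(xmin, 41.8)
  )))
}

# Toy layer standing in for community_new
demo_areas <- st_sf(
  community = c("A", "B"),
  crime_ct = c(2, 0),
  geometry = st_sfc(make_square(-87.8), make_square(-87.7), crs = 4326)
)

# Interactive viewing mode, with polygons shaded by the arrest count
tmap_mode("view")
arrest_map <- tm_shape(demo_areas) + tm_polygons("crime_ct")

# Optionally save the map to a file instead of exporting it by hand:
# tmap_save(arrest_map, "arrests_map.html")
```

Printing arrest_map opens the interactive map; tmap_save is one way to write it out as a standalone HTML file.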

Works Cited

Burris, Scott, Kim M. Blankenship, Martin Donoghoe, Susan Sherman, Jon S. Vernick, Patricia Case, Zita Lazzarini, and Stephen Koester. “Addressing the ‘Risk Environment’ for Injection Drug Users: The Mysterious Case of the Missing Cop.” The Milbank Quarterly 82, no. 1 (2004): 125–56. https://doi.org/10.1111/j.0887-378X.2004.00304.x.

Rhodes, Tim, Merrill Singer, Philippe Bourgois, Samuel R. Friedman, and Steffanie A. Strathdee. “The Social Structural Production of HIV Risk among Injecting Drug Users.” Social Science & Medicine 61, no. 5 (September 1, 2005): 1026–44. https://doi.org/10.1016/j.socscimed.2004.12.024.