Summary

Data

Import and clean data

This study includes data on:

Demolition: The intervention variable, available at the address level and aggregated to the block group.
Crime: The outcome variable, available at the address level and aggregated to the block group.
Property conditions: The analysis controls for the existing prevalence of vacant lots and vacant building notices in each block group between 2012 and 2014.
Demographics: Using American Community Survey data, this study controls for variables that are closely correlated with crime outcomes.

Demolition

This analysis uses two sources of data on demolitions:

Records on 3,573 (?) completed full building demolitions from the Baltimore Department of Housing and Community Development internal eDemo SQL database from ??? to the end of 2019. Jessica Clarke, former Demolition Program manager (?), provided this data as an Excel sheet on March 1, 2020.
Records on 4,186 building permits issued for 4,170 addresses. This data is included within the permit data provided on Open Baltimore (the city’s open data portal) and is available as a shapefile via the city’s ArcGIS Feature Server.

The records are first limited to the study period (by completion date for the HCD records and by issue date for the demolition permits).

HCD demolition records: 236 rows (7%) outside analysis period, 3,337 left
Demolition permits: 305 rows (7%) outside analysis period, 4,186 left
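
The period filter and the row counts above can be sketched as follows. This is a minimal illustration rather than the project's own code: the `analysis_period` interval is assumed here to run from January 1, 2012 through December 31, 2019 (its actual definition is not shown in this document), and the `records` table is made-up toy data.

```r
library(dplyr)
library(lubridate)

# Assumed study period; the real definition of analysis_period is not shown here
analysis_period <- interval(ymd("2012-01-01"), ymd("2019-12-31"))

# Toy records standing in for the HCD data (hypothetical values)
records <- tibble(
  id = 1:4,
  demolition_finished = ymd(c("2011-06-30", "2012-01-01", "2015-08-15", "2020-02-01"))
)

# Keep only rows whose completion date falls inside the interval (endpoints included)
in_period <- records %>%
  filter(demolition_finished %within% analysis_period)

# Report how many rows the filter removed
n_dropped <- nrow(records) - nrow(in_period)
message(n_dropped, " rows (", round(100 * n_dropped / nrow(records)),
        "%) outside analysis period, ", nrow(in_period), " left")
```

Logging the dropped and remaining counts at each filtering step would make figures like those above reproducible from the code.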

This data is joined to the city’s real property tax assessment data (by block and lot, and then by address for any unmatched records) to provide locations that are missing from the HCD completed demolition data.

The HCD completed demolition data is then matched to the permit data to avoid duplicate records. For any HCD completed demolition records that do not include a permit number, we remove any demolition permits that match the same address.

The remaining demolition permits are then de-duplicated by address (NOTE: this step may be unnecessary) and combined with the HCD completed demolition data to form a complete record of all permitted demolitions during the study period.
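
The matching and de-duplication steps described above might look like the following sketch. The column names (`permit_number`, `address`), the `source` label, and all values are hypothetical, and matching on permit number where one exists is an assumption about how the join is done; the actual implementation may differ.

```r
library(dplyr)

# Toy HCD completed demolitions; one record lacks a permit number (hypothetical data)
hcd <- tibble(
  address = c("100 N MAIN ST", "200 W OAK AVE"),
  permit_number = c("DEM2015-001", NA)
)

# Toy demolition permits, including a repeated address
permits <- tibble(
  address = c("100 N MAIN ST", "200 W OAK AVE", "300 E ELM ST", "300 E ELM ST"),
  permit_number = c("DEM2015-001", "DEM2016-014", "DEM2017-020", "DEM2018-003")
)

permits_only <- permits %>%
  # Drop permits already present in the HCD data (matched by permit number)
  anti_join(filter(hcd, !is.na(permit_number)), by = "permit_number") %>%
  # For HCD records without a permit number, drop permits at the same address
  anti_join(filter(hcd, is.na(permit_number)), by = "address") %>%
  # Keep one permit per remaining address
  distinct(address, .keep_all = TRUE)

# Combine both sources into a single record of demolitions
all_demos <- bind_rows(
  mutate(hcd, source = "hcd"),
  mutate(permits_only, source = "permit")
)
```

Using `anti_join()` keeps the HCD record as the authoritative one whenever both sources describe the same demolition.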

Finally, the data is matched to U.S. census block groups.

We are not looking at the following (except possibly in a sensitivity analysis):

Duration of demolition (difference between date started and finished): The date started is missing for many records.
Emergency v. non-emergency
Project CORE phase 1, 2, 3, and 4: primarily review and decision-making process

# Import city demolition data
city_demos <- read_csv('data/rptDemo_RFA_StatusDataTable_Property_3.1.20.csv') %>%
  janitor::clean_names('snake') %>%
  mutate(pinrelate = str_c(block, lot), # Concatenate block and lot
         demolition_started = ymd(start),
         demolition_finished = ymd(date_demo_finish),
         # Derive date parts from the parsed date rather than the raw column
         month_finished = month(demolition_finished),
         year_finished = year(demolition_finished),
         year_qrt_finished = quarter(demolition_finished, with_year = TRUE)) %>% # Add a year/quarter column
  rename(house_num = no,
         direction = dir,
         street_name = street) %>% # Rename variables to match VBN data (also clearer)
  filter(demolition_finished %within% analysis_period) # Filter to 2012-2019 data

# TODO: Document changes in demo count across analysis (e.g. filtering to the analysis period removed 236 rows (7%), 3,337 left)

# Import real property data
real_prop <- read_sf("data/real-property_shp/2019-10-13-property-information.shp") %>% 
  janitor::clean_names('snake') %>%
  select(-(4:20)) %>% # Drop 17 variables (blocklot, block, lot, ward, section, …)
  select(-(6:28)) # Drop 23 variables (currland, currimpr, exmpland, exmpimpr, fullcash, …)

# TODO: Explore address matching for missing data.
# 3,525 rows have block/lot values; 48 are missing
city_demos_sf <- city_demos %>% 
  left_join(real_prop, by = "pinrelate") %>%  # Join by "pinrelate" (concatenated block/lot)
  st_as_sf() %>% 
  st_transform(4269)

city_demos_pts_sf <- city_demos_sf %>% 
  st_centroid()

## Warning in st_centroid.sf(.): st_centroid assumes attributes are constant over
## geometries of x

## Warning in st_centroid.sfc(st_geometry(x), of_largest_polygon =
## of_largest_polygon): st_centroid does not give correct centroids for longitude/
## latitude data
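
The second warning appears because the centroids are computed after the data has been transformed to a geographic CRS (EPSG:4269). One possible way to avoid it, sketched here on a made-up polygon, is to compute centroids while the data is still in a projected CRS and transform afterwards. EPSG:2248 is used because the permit shapefile output later in this document reports it; that the real property shapefile shares this CRS is an assumption. The `parcels_sf` object and its values are hypothetical.

```r
library(sf)

# A made-up parcel polygon in EPSG:2248 (Maryland State Plane, US feet)
parcel <- st_polygon(list(rbind(
  c(1420000, 590000), c(1420050, 590000),
  c(1420050, 590100), c(1420000, 590100),
  c(1420000, 590000)
)))
parcels_sf <- st_sf(pinrelate = "0001 001", geometry = st_sfc(parcel, crs = 2248))

# Declare attributes constant over each geometry, which addresses the first warning
st_agr(parcels_sf) <- "constant"

# Compute centroids in the projected CRS first, then transform to NAD83
parcel_pts <- st_transform(st_centroid(parcels_sf), 4269)
```

Ordering the steps this way means the centroid is calculated in planar coordinates, where `st_centroid()` is well defined.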

city_demos_blockgroups <- st_join(city_demos_pts_sf, baltimore_blockgroups, join = st_within)

# TODO: 3,298 rows (99%) matched to a block group, only 39 rows NA
city_demos_blockgroups_na <- city_demos_blockgroups %>% 
  filter(is.na(GEOID))

# write_csv(city_demos_blockgroups_na, "city_demos_blockgroups_na.csv")

# Import data from Open Baltimore

demo_permits_shp <- st_read('data/demolition-permits/2020-02-03-demolition-permits-since-2010.shp')

## Reading layer `2020-02-03-demolition-permits-since-2010' from data source `/Users/elipousson/Projects/vacant-building-demolition-health/data/demolition-permits/2020-02-03-demolition-permits-since-2010.shp' using driver `ESRI Shapefile'
## Simple feature collection with 4491 features and 42 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: 1393980 ymin: 558803 xmax: 1445317 ymax: 621297
## epsg (SRID):    2248
## proj4string:    +proj=lcc +lat_1=39.45 +lat_2=38.3 +lat_0=37.66666666666666 +lon_0=-77 +x_0=399999.9998983998 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=us-ft +no_defs

demo_permits <- demo_permits_shp %>% 
  janitor::clean_names() %>%
  # filter(csm_status == 'FNL') %>% 
  mutate(csm_issued = ymd(csm_issued),
         month = month(csm_issued),
         year = year(csm_issued),
         year_qrt = quarter(csm_issued, with_year = TRUE)) %>%  # Add a year/quarter column
  filter(csm_issued %within% analysis_period) %>%
  st_as_sf(
   # coords = c("longitude", "latitude"),
    agr = "constant",
    crs = (2248),        # Native CRS of the permit shapefile (EPSG:2248)
    stringsAsFactors = FALSE,
    remove = TRUE
  ) %>% 
  st_transform(4269) # Match the CRS of the tigris block group data

demo_permits_blockgroups <- st_join(demo_permits, baltimore_blockgroups, join = st_within)

demo_permits_blockgroups_na <- demo_permits_blockgroups %>% 
  filter(is.na(GEOID))

Crime

The Baltimore City Police Department Uniform Crime Reporting data on crimes and location.

2012 to June 2019 data provided by , geocoded by JHSPH (courtesy Molly Francis and Dr. Cassandra Crifasi)

This data is also available via Open Baltimore. The Open Baltimore data will be used to fill in the missing information from June through December 2019.

360,191 rows (>99%) matched to a block group, 495 rows unmatched

crime_openbalt_2019 <- read_csv('data/open-balt-crime_csv/crime_openbalt_2019.csv') %>% 
  # dplyr::select() is called explicitly because MASS::select() masks it once MASS is attached
  dplyr::select(crime_date, description, month, year, year_qrt, street_address, vri_name1, longitude, latitude)

crime <- list.files("data/bpd-crime_csv", full.names = TRUE) %>%
  tibble(filename = .) %>%
  mutate(data = map(filename, read.csv)) %>%
  unnest(data) %>%
  rename(street_address = location) %>%
  mutate(missing_coords = (is.na(longitude) | longitude == 0), # Flag any crimes missing coordinates
         crime_date = ymd_hms(crimedate),
         month = month(crime_date), # Derive date parts from the parsed date-time, not the raw string
         year = year(crime_date),
         year_qrt = quarter(crime_date, with_year = TRUE)) %>%  # Add a year/quarter column
  filter(crime_date %within% analysis_period) %>%
  # dplyr::select() is called explicitly because MASS::select() masks it once MASS is attached
  dplyr::select(crime_date, description, month, year, year_qrt, street_address, vri_name1, longitude, latitude) %>%
  bind_rows(as.data.frame(crime_openbalt_2019)) %>% 
  filter(!is.na(longitude)) %>% 
  st_as_sf(coords = c("longitude", "latitude"), #  Convert to an SF object
           crs = 4269, # Matching the CRS of the tigris block group data
           stringsAsFactors = FALSE,
           remove = TRUE)

# find points within polygons
crime_blockgroups <- crime %>% st_join(baltimore_blockgroups, join = st_within)

# Missing data
crime_blockgroups_na <- crime_blockgroups %>% 
  filter(is.na(GEOID))

# write_csv(crime_blockgroups_na, "crime_blockgroups_na.csv")

Property data

The two key property variables for this study include:

Vacant building notices: an administrative data set used by the Department of Housing and Community Development to track vacant and abandoned houses. Notices are issued when a building is determined to meet the statutory standard of a “vacant building.” Notices are cancelled or abated when a use and occupancy permit is issued for a building or when the building is demolished.
Vacant lots: defined as any property with no improvements according to the city’s real property data.

Definition of vacant building changed in ?? year

The imported shapefiles contain 26,897 unique vacant building notices, including 16,486 currently open VBNs and 10,411 that are closed.

As vacant building notices are issued, abated, and cancelled continuously throughout the year, we determined the number of open vacant building notices per quarter in each block group.
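The quarterly count of open notices can be sketched as an interval-overlap problem: a notice counts as open in a quarter if it was issued on or before the quarter's last day and had not been closed before the quarter's first day. The example below is a minimal sketch on made-up data, not the code used for this analysis. The `notices` table, its `close_date` column (standing in for whichever of `abate_date` or `cancel_date` is present), and the `open_vbn` column name are all hypothetical.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical notices; an NA close_date means the notice is still open
notices <- tibble(
  GEOID = c("A", "A", "B"),
  notice_date = ymd(c("2012-02-15", "2012-08-01", "2011-11-30")),
  close_date = ymd(c("2012-09-10", NA, "2012-05-20"))
)

# One row per quarter in a (hypothetical) one-year window
quarters <- tibble(
  qrt_start = seq(ymd("2012-01-01"), ymd("2012-10-01"), by = "3 months")
) %>%
  mutate(qrt_end = qrt_start + months(3) - days(1),
         year_qrt = quarter(qrt_start, with_year = TRUE))

# Pair every notice with every quarter, keep the pairs where the notice
# was open at some point during the quarter, then count per block group
open_vbn_qrt <- crossing(notices, quarters) %>%
  filter(notice_date <= qrt_end,
         is.na(close_date) | close_date >= qrt_start) %>%
  count(GEOID, year_qrt, name = "open_vbn")

open_vbn_qrt
```

Pairing every notice with every quarter is simple but grows quickly; with roughly 27,000 notices and 32 quarters it is still under a million rows, so it should remain manageable here.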

A large share of the closed vacant building notices (5,144 or 49.4%) do not show an abatement or cancellation date. 214 of these notices are missing geometry. Only 21 of these can be matched to demolitions by block and lot values.

The remainder (5,123) may include buildings that were demolished prior to 2012 (so they are not included in the records provided and are not a concern for this analysis) or buildings that received a use and occupancy permit but whose notices were not updated.

We plan to compare these records to the permit database to determine whether the properties received demolition permits outside the analysis period, were demolished by a private owner, or received a use and occupancy permit. The permit date could then be used as an approximation of an abatement or cancellation date.

All vacant building notices with location data were matched to

# Import and combine open vacant building notices
vbn <- fs::dir_ls("data/vacant-building-notice_shp/", glob = "*.shp") %>% 
  tibble(fname = .) %>%
  mutate(data = map(fname, read_sf)) %>%
  unnest(data) %>%
  janitor::clean_names('snake') %>% 
  st_as_sf() %>%
  st_transform(4269) %>%
  rename(notice_date = date_notice,
         abate_date = date_abate,
         cancel_date = date_cancel) %>% 
  mutate(
    pinrelate = str_c(block, lot),
    vbn_status = if_else(
      str_detect(fname,"(still-vacant)") == TRUE, "open", "closed"),
    street_name = str_to_title(street_name),
    notice_date = ymd(notice_date),
    abate_date = ymd(abate_date),
    cancel_date = ymd(cancel_date)) %>% 
  distinct(pinrelate, notice_date, .keep_all = TRUE) %>% # Remove duplicates 3,690 rows (12%)
  st_centroid() %>% 
  filter(!is.na(abate_date) | !is.na(cancel_date) | (vbn_status == 'open')) # Remove closed vacant building notices with no known closure date

## Warning in st_centroid.sf(.): st_centroid assumes attributes are constant over
## geometries of x

## Warning in st_centroid.sfc(st_geometry(x), of_largest_polygon =
## of_largest_polygon): st_centroid does not give correct centroids for longitude/
## latitude data

# TODO: Resolve issue with incomplete VBN records. This filter only works if the VBN import above does *not* filter out these records.
# vbn_fix <- vbn %>% filter(is.na(abate_date) & is.na(cancel_date) & (vbn_status == 'closed'))

# Check for missing geometry
# check_vbn_geo <- str_sub(as.character(st_geometry(vbn)), 3, 6)
# sum(str_detect(check_vbn_geo, "(N)(a)(N)"))

vbn_blockgroups <- vbn %>% 
  st_join(baltimore_blockgroups, join = st_within)

Use current (2020) real property data as a placeholder for baseline real property data to count vacant lots and vacant buildings.

# Match real property records to block groups (237 are unmatched)
real_prop_blockgroups <- real_prop %>% 
  st_transform(4269) %>% 
  st_centroid() %>%
  st_join(baltimore_blockgroups, join = st_within)

## Warning in st_centroid.sf(.): st_centroid assumes attributes are constant over
## geometries of x

## Warning in st_centroid.sfc(st_geometry(x), of_largest_polygon =
## of_largest_polygon): st_centroid does not give correct centroids for longitude/
## latitude data

real_prop_count <- real_prop_blockgroups %>%
  st_drop_geometry() %>%
  mutate(vacant_building = if_else(vacind == 'Y', TRUE, FALSE)) %>% 
  mutate_at(vars(vacant_building), ~ replace(., is.na(.), FALSE)) %>%
  group_by(GEOID) %>% # Group by block group
  summarize(property_count = n(),
            vacant_lot_count = sum(!is.na(no_imprv)), # Counting unimproved properties as vacant lots
            building_count = sum(is.na(no_imprv)), # Counting all other properties as buildings
            vacant_building_count = sum(vacant_building)) %>% # Use vacant indicator field to count vacant buildings
  filter(!is.na(GEOID))

hmt <- read_sf('data/housing-market-typology-2014/2020-03-05_housing-market-typology-2014.shp') %>%
  st_drop_geometry() %>% 
  janitor::clean_names('snake') %>%
  rename(geoid = geo_id10) %>% 
  rename(cluster_letter = cluster_let) %>%
  select(2:3) %>% 
  arrange(cluster_letter) %>% 
  mutate_at(vars(cluster_letter), ~ replace(., is.na(.), 'Not assigned')) %>%
  distinct(geoid, .keep_all = TRUE) %>%
  filter(!is.na(geoid))

Demographic data (American Community Survey)

Baseline 2014-2018 5-year ACS

NOTE: Should the ACS data be time-varying? We could

# Load a list of available ACS variables
# v18 <- load_variables(2018, "acs5", cache = TRUE)
# View(v18)

acs_vars <- c(
  total_pop = "B01003_001",
  white_pop = "B02001_002",
  black_pop = "B02001_003",
  total_hispanic_pop = "B03002_012",
  white_nonhispanic_pop = "B03002_003",
  black_hispanic_pop = "B03002_014",
  male_15to17_pop = "B01001_006",
  male_18to19_pop = "B01001_007",
  male_20_pop = "B01001_008",
  male_21_pop = "B01001_009",
  male_22to24_pop = "B01001_010",
  male_25to29_pop = "B01001_011",
  male_30to34_pop = "B01001_012",
  median_age = "B01002_001",
  total_household = "B11001_001",
  household_married_families = "B11001_002", # TODO: B11001_002 counts all family households; married-couple families are B11001_003
  total_families = "B17010_001",
  families_below_poverty = "B17010_002",
  labor_force = "B23025_003", # Civilian labor force
  labor_force_unemployed = "B23025_005"
)

# Get ACS data without geometry for 2018
acs_bg <- get_acs(
  geography = "block group",
  variables = acs_vars,
  state = "MD", county = "Baltimore city", geometry = FALSE,
  year = 2018
)

acs_bg_wide <- pivot_wider(acs_bg, id_cols = "GEOID", names_from = "variable", values_from = "estimate")

acs_bg_wide <- acs_bg_wide %>%
  clean_names(case = "snake") %>% 
  mutate(pct_white_nonhispanic = white_nonhispanic_pop / total_pop,
         pct_hispanic_nonblack = (total_hispanic_pop - black_hispanic_pop) / total_pop,
         pct_black_pop = black_pop / total_pop,
         pct_unemployed = labor_force_unemployed / labor_force,
         pct_family_poverty = families_below_poverty / total_families,
         pct_married_household = household_married_families / total_household,
         pct_male_15to34 = (male_15to17_pop + male_18to19_pop + male_20_pop + male_21_pop + male_22to24_pop + male_25to29_pop + male_30to34_pop) / total_pop)

# TODO: Several selected controls are only available at the tract level.

#acs_vars_tract <- c(
#  income_total = "B17020_001",
#  income_below_poverty_level = "B17020_002",
#  school_less_hsgrad = "B16010_002",
#  school_pop = "B16010_001"
#)

#acs_tract <- get_acs(
#  geography = "tract",
#  variables = acs_vars_tract,
#  state = "MD", county = "Baltimore city", geometry = FALSE,
#  year = 2018
#)

Summarize data

Group
Count

Count crime by category, year, and quarter

Counts by outcome category at baseline and by year and quarter

# count crimes per census block group and category
crime_outcomes <- list(c("HOMICIDE"), c("HOMICIDE", "SHOOTING"), c("AGG. ASSAULT"), c("ROBBERY - STREET", "ROBBERY - RESIDENCE", "ROBBERY - CARJACKING"), c("AUTO THEFT", "LARCENY", "BURGLARY", "LARCENY FROM AUTO"))
names(crime_outcomes) <- c("homicide", "homicide_nonfatalshooting", "agg_assault", "robberies", "property_crime")

# Count crimes by GEOID and quarter for each outcome category 
crime_qrt_count_outcome <- map_dfr(names(crime_outcomes), 
                           ~add_column(count(
                             as_tibble(filter(crime_blockgroups,
                                              description %in% crime_outcomes[[.]])), 
                             GEOID, year_qrt), variable = .))

crime_qrt_count_all_crimes <- as_tibble(crime_blockgroups) %>%
  group_by(GEOID, year_qrt) %>% 
  summarize(n = n(),
            variable = "all_crimes")

crime_qrt_count <- bind_rows(crime_qrt_count_outcome, crime_qrt_count_all_crimes)

# Count crimes by GEOID and year for each outcome category 
crime_year_count_outcome <- map_dfr(names(crime_outcomes),
                            ~add_column(count(
                              as_tibble(filter(crime_blockgroups,
                                               description %in% crime_outcomes[[.]])),
                              GEOID, year), variable = .))

crime_year_count_all_crimes <- as_tibble(crime_blockgroups) %>%
  group_by(GEOID, year) %>% 
  summarize(n = n(),
            variable = "all_crimes")

crime_year_count <- bind_rows(crime_year_count_outcome, crime_year_count_all_crimes)

Count demolition by year and quarter

Permits count by year/quarter issued

demo_permits_year_count <- add_column(count(as_tibble(demo_permits_blockgroups),
                        GEOID, year), variable = c("demolition"))

demo_permits_qrt_count <- add_column(count(as_tibble(demo_permits_blockgroups),
                        GEOID, year_qrt), variable = c("demolition"))

total_demo_qrt_count <- demo_permits_qrt_count %>% 
  group_by(year_qrt) %>% 
  summarize(total_count = sum(n))

HCD count by year/quarter completed

city_demos_qrt_count <- add_column(count(as_tibble(city_demos_blockgroups),
                        GEOID, year_qrt_finished), variable = c("city_demolition"))

total_city_demos_qrt_count <- city_demos_qrt_count %>% 
  group_by(year_qrt_finished) %>% 
  summarize(total_count = sum(n))

Merge data

Join demolition and crime data

city_demos_qrt_count <- city_demos_qrt_count %>% # Rename year_qrt_finished so tables match
  rename(year_qrt = year_qrt_finished)

qrt_counts <- bind_rows(crime_qrt_count, city_demos_qrt_count) # TODO: Previously included demo_permits_qrt_count

qrt_counts_wide <- pivot_wider(qrt_counts, names_from = variable, values_from = n)

qrt_counts_wide <- qrt_counts_wide %>%
  right_join(geoid_year_qrt, by = c("GEOID", "year_qrt")) %>% 
  mutate_at(vars(homicide:city_demolition), ~replace(., is.na(.), 0)) %>% 
  group_by(GEOID) %>% 
  arrange(yq(year_qrt)) %>% 
  mutate(cumsum_city_demolition = cumsum(city_demolition)) %>% 
  ungroup()

qrt_counts_wide_sf <- qrt_counts_wide %>%  # Join data to block group geography
  right_join(baltimore_blockgroups, by = "GEOID") %>%
  arrange(year_qrt, GEOID) %>% 
  st_as_sf()

year_counts <- bind_rows(crime_year_count, demo_permits_year_count)

year_counts_wide <- pivot_wider(year_counts, names_from = variable, values_from = n)

year_counts_wide <- year_counts_wide %>% 
  right_join(geoid_year, by = c("GEOID", "year")) %>%
  mutate_at(vars(homicide:demolition), ~replace(., is.na(.), 0))

year_data <- baltimore_blockgroups %>% # Join data to block group geography
  left_join(year_counts_wide, by = "GEOID") %>% 
  arrange(year, GEOID)

Join demographic data with demolition and crime data

qrt_counts_acs <- acs_bg_wide %>%
  select(-(2:8),-(11:21)) %>% # Drop extra columns
  left_join(qrt_counts_wide, by = c("geoid" = "GEOID"))

# year_acs_data <- left_join(acs_bg_wide, year_data, by = c("geoid" = "GEOID"))

# counts_summary_acs_data <- left_join(acs_bg_wide, counts_wide_summary, by = c("geoid" = "GEOID"))
# TODO: I removed the summary data script. Figure out if I need to add it back in!

Join property data with demolition, crime, and demographic data

qrt_counts_acs <- qrt_counts_acs %>% 
  left_join(hmt, by = "geoid") %>% 
  left_join(real_prop_count, by = c("geoid" = "GEOID")) %>% 
  mutate_at(vars(cluster_letter), ~ replace(., is.na(.), 'Not assigned'))

Exploratory analysis

Crime

Plots over time

# Totaling crimes by quarter across block groups
crime_qrt_count %>% 
  group_by(year_qrt, variable) %>% 
  summarize(qrt_total = sum(n)) %>%
  filter(!variable %in% c("property_crime", "robberies", "all_crimes")) %>%
  ggplot(aes(lubridate::yq(year_qrt), qrt_total, color = variable, group = variable)) +
    geom_point() +
    geom_line() +
    scale_y_log10() +
    scale_color_manual(
      name = "Crime type",
      values = c("#66c2a5","#fc8d62","#8da0cb"),
      labels = c("Aggravated assault", "Homicide", "Homicide and non-fatal shootings")) +
    labs(x = 'Quarters',
         y = 'Number of incidents',
         title = "Assaults, homicides, and shootings, 2012-2019") +
    hrbrthemes::theme_ipsum_rc()

crime_qrt_count %>%
  group_by(year_qrt, variable) %>% 
  summarize(qrt_total = sum(n)) %>%
  filter(!variable %in% c("homicide", "homicide_nonfatalshooting", "agg_assault", "all_crimes")) %>%
  ggplot(aes(lubridate::yq(year_qrt), qrt_total, color = variable, group = variable)) +
    geom_point() +
    geom_line() +
    scale_y_log10() +
    scale_color_manual(
      name = "Crime type",
      values = c("#66c2a5","#fc8d62"),
      labels = c("Property crime", "Robberies")) +
    labs(x = 'Quarters',
         y = 'Number of incidents',
         title = "Robberies and property crime, 2012-2019") +
    hrbrthemes::theme_ipsum_rc()

crime_qrt_count %>%
  group_by(year_qrt, variable) %>% 
  summarize(qrt_total = sum(n)) %>%
  filter(variable == "all_crimes") %>%
  ggplot(aes(lubridate::yq(year_qrt), qrt_total, color = variable, group = variable)) +
    geom_point() +
    geom_line() +
    scale_y_log10() +
    scale_color_manual(
      name = "",
      values = c("#66c2a5"),
      labels = "") +
    labs(x = 'Quarters',
         y = 'Number of incidents',
         title = "Crime by quarter, 2012-2019") +
    hrbrthemes::theme_ipsum_rc()

Demolition

Quarterly plots

total_demo_qrt_count %>% 
  ggplot(aes(lubridate::yq(year_qrt), total_count, group = 1)) +
    geom_point(color = "#440154FF") +
    geom_line(color = "#440154FF") +
    labs(x = 'Quarters',
         y = 'Number of permits',
         title = "Demolition permits issued in Baltimore, Maryland, 2012-2019",
         caption = "Source: Department of Housing and Community Development (DHCD)") +
    hrbrthemes::theme_ipsum_rc()

total_city_demos_qrt_count %>% 
  ggplot(aes(yq(year_qrt_finished), total_count, group = 1)) +
    geom_point(color = "#440154FF") +
    geom_line(color = "#440154FF") +
    labs(x = 'Quarters',
         y = 'Number of demolitions',
         title = "Demolitions completed by Baltimore City and the State of Maryland, 2012-2019",
         caption = "Source: Department of Housing and Community Development (DHCD)") +
    hrbrthemes::theme_ipsum_rc()

Demographics

#acs_qrt_count %>%
#  keep(is.numeric) %>% 
#  gather() %>% 
#  ggplot(aes(value)) +
#    facet_wrap(~ key, scales = "free") +
#    geom_histogram()

Non-spatial regression analysis

Exploratory analysis with negative binomial regression

The initial exploratory analysis used a simplified model for a negative binomial regression:

\[\ln(Y) = \beta_1X_1 + \beta_2X_2 + \ln(t_1)\]

\(X_1\) is the number of completed demolitions in a block group in a given quarter and \(X_2\) is the baseline level of crime in that block group (the median quarterly count of all crimes in 2012 and 2013). \(t_1\) is the total population of each block group, used as an offset.

NOTE: I’m unsure if this is the correct notation.
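As a rough illustration of how the \(\ln(t_1)\) term maps onto R, the sketch below fits a negative binomial model with a log-population offset on simulated data. Everything in it is an assumption made for illustration: the `toy` data frame, its column names, and the "true" coefficients used to simulate counts. It is not the model fitted in this report.

```r
library(MASS)

set.seed(42)
n <- 500

# Simulated block groups (hypothetical values)
toy <- data.frame(
  demolitions    = rpois(n, 0.5),
  baseline_crime = rpois(n, 20),
  total_pop      = sample(500:3000, n, replace = TRUE)
)

# Expected count = population * per-capita rate, so ln(population)
# enters the linear predictor with its coefficient fixed at 1
mu <- toy$total_pop * exp(-4 - 0.10 * toy$demolitions + 0.02 * toy$baseline_crime)
toy$crimes <- rnbinom(n, mu = mu, size = 5)

# The offset goes inside the formula, on the log scale
fit <- glm.nb(crimes ~ demolitions + baseline_crime + offset(log(total_pop)),
              data = toy)

# Exponentiated coefficients are incidence rate ratios
irr <- exp(coef(fit))
irr
```

With the offset on the log scale, the coefficients describe changes in the per-capita rate rather than the raw count. Block groups with zero population would need to be dropped or handled separately, since `log(0)` is `-Inf`.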

Here is Jay et al.’s approach:

A negative binomial model of the outcome, defined as the number of crimes in unit i at time t (\(Y_{it}\)), as a function of the unit fixed effect \(a_i\), the time fixed effect \(b_t\), and a constant treatment effect \(\delta\). \(D_{it}\) represents the treatment status of unit i at time t.

\[\log E[Y_{it}] = a_i + b_t + \delta D_{it} + \epsilon_{it}\]

unit fixed effects: control for time-invariant attributes of the block groups
time fixed effects: control for seasonality and any other group-invariant time trends
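One way to sketch a specification of this shape in R is to enter the unit and the period as factors, so each gets its own intercept shift. The example below uses a simulated panel and is only an illustration under stated assumptions: the `panel` data frame, the treatment timing, and the simulated effect size `delta` are hypothetical, and this is not Jay et al.'s code.

```r
library(MASS)

set.seed(1)

# Simulated panel: 30 units observed over 12 periods (hypothetical)
panel <- expand.grid(unit = factor(1:30), period = factor(1:12))
unit_id   <- as.integer(panel$unit)
period_id <- as.integer(panel$period)

unit_effect <- rnorm(30, 0, 0.3)  # a_i: time-invariant unit differences
time_effect <- rnorm(12, 0, 0.2)  # b_t: shocks shared by all units
delta <- -0.2                     # simulated treatment effect

# D_it: units 1-10 are treated from period 7 onward
panel$treated <- as.integer(unit_id <= 10 & period_id >= 7)

mu <- exp(3 + unit_effect[unit_id] + time_effect[period_id] + delta * panel$treated)
panel$crimes <- rnbinom(nrow(panel), mu = mu, size = 10)

# Unit and period enter as factors, giving one fixed effect per level
fit_fe <- glm.nb(crimes ~ treated + unit + period, data = panel)

delta_hat <- unname(coef(fit_fe)["treated"])
exp(delta_hat)  # estimated rate ratio for treated unit-periods
```

With several hundred block groups, the dummy-variable approach adds one coefficient per block group, so fitting may be slow.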

# TODO: Figure out where to add this baseline crime variable (this is probably not the most logical spot)
# Calculate baseline crime as median from 2012 to 2013
all_crimes_baseline <- qrt_counts_acs %>%
  filter(year == 2012 | year == 2013) %>%
  group_by(geoid) %>% 
  summarise(all_crimes_baseline_median = median(all_crimes))

model1_data <- qrt_counts_acs %>% 
  left_join(all_crimes_baseline, by = "geoid")

#model1_data <- model1_data %>%
#    select(geoid, homicide:all_crimes, city_demolition, cumsum_city_demolition, total_pop, year_qrt, year, all_crimes_baseline_median)

# TODO: I couldn't get the full function working but this piece should save a little space.

tidy_and_exp <- function(x, outcome_name){
  x %>% 
    tidy(conf.int = TRUE) %>%
    unnest(cols = c()) %>%
    mutate(
      irr_estimate = exp(estimate),
      irr_conf.low = exp(conf.low),
      irr_conf.high = exp(conf.high)
    ) %>% 
    add_column(outcome = outcome_name)
}

library(MASS) # For negative binomial regression

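# TODO: Finish switching these models over to the expanded version
# TODO: offset(total_pop) is passed positionally below, so glm.nb() matches it to `weights` rather than using it as an offset; a log-population offset would be `+ offset(log(total_pop))` inside the formula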

# All crimes
model1_all_crimes <- glm.nb(all_crimes ~ city_demolition + cluster_letter + all_crimes_baseline_median + median_age + pct_white_nonhispanic + pct_black_pop + pct_unemployed + pct_family_poverty + pct_male_15to34 + pct_married_household + vacant_building_count + vacant_lot_count,
      offset(total_pop), maxit=999, data = model1_data) %>%
tidy_and_exp("all_crimes")

# Homicide
model1_homicide <- glm.nb(homicide ~ cumsum_city_demolition + cluster_letter + all_crimes_baseline_median + median_age + pct_white_nonhispanic + pct_black_pop + pct_unemployed + pct_family_poverty + pct_male_15to34 + pct_married_household + vacant_building_count + vacant_lot_count,
      offset(total_pop), maxit=999, data = model1_data) %>%
tidy_and_exp("homicide")

# Homicide and non-fatal shootings
model1_homicide_nonfatalshooting <- glm.nb(homicide_nonfatalshooting ~ cumsum_city_demolition + all_crimes_baseline_median + median_age + pct_white_nonhispanic + pct_black_pop + pct_unemployed + pct_family_poverty + pct_male_15to34 + pct_married_household + vacant_building_count + vacant_lot_count,
      offset(total_pop), maxit=999, data = model1_data) %>%
tidy_and_exp("homicide_nonfatalshooting")

# Aggravated assault
model1_agg_assault <- glm.nb(agg_assault ~ cumsum_city_demolition + all_crimes_baseline_median + median_age + pct_white_nonhispanic + pct_black_pop + pct_unemployed + pct_family_poverty + pct_male_15to34 + pct_married_household + vacant_building_count + vacant_lot_count,
      offset(total_pop), maxit=999, data = model1_data) %>%
tidy_and_exp("agg_assault")

# Robberies
model1_robberies <- glm.nb(robberies ~ cumsum_city_demolition + all_crimes_baseline_median + median_age + pct_white_nonhispanic + pct_black_pop + pct_unemployed + pct_family_poverty + pct_male_15to34 + pct_married_household + vacant_building_count + vacant_lot_count,
      offset(total_pop), maxit=999, data = model1_data) %>%
tidy_and_exp("robberies")

# Property crime
model1_property_crime <- glm.nb(property_crime ~ cumsum_city_demolition + all_crimes_baseline_median + median_age + pct_white_nonhispanic + pct_black_pop + pct_unemployed + pct_family_poverty + pct_male_15to34 + pct_married_household + vacant_building_count + vacant_lot_count,
      offset(total_pop), maxit=999, data = model1_data) %>%
tidy_and_exp("property_crime")

# Combine outcomes into one table
model1_table <- bind_rows(model1_all_crimes,
                          model1_homicide,
                          model1_homicide_nonfatalshooting,
                          model1_agg_assault,
                          model1_robberies,
                          model1_property_crime) %>%
  dplyr::select(outcome, term, irr_estimate, irr_conf.low, irr_conf.high, p.value) # MASS::select() masks dplyr::select() once MASS is attached

knitr::kable(model1_table, col.names = c("Outcome", "Model term", "Estimated IRR", "Low (95% CI)", "High (95% CI)", "p-value"), align = "crccc", digits = 3,
caption = "Model 1: Adjusted Incidence Rate Ratio (IRR) by Crime Category")

Model 1: Adjusted Incidence Rate Ratio (IRR) by Crime Category
Outcome	Model term	Estimated IRR	Low (95% CI)	High (95% CI)	p-value
all_crimes	(Intercept)	11.545	11.494	11.596	0.000
all_crimes	city_demolition	0.992	0.992	0.992	0.000
all_crimes	cluster_letterB	1.129	1.127	1.130	0.000
all_crimes	cluster_letterC	1.178	1.176	1.180	0.000
all_crimes	cluster_letterD	1.333	1.331	1.335	0.000
all_crimes	cluster_letterE	1.327	1.325	1.330	0.000
all_crimes	cluster_letterF	1.523	1.520	1.526	0.000
all_crimes	cluster_letterG	1.406	1.404	1.409	0.000
all_crimes	cluster_letterH	1.221	1.218	1.224	0.000
all_crimes	cluster_letterNot assigned	1.290	1.287	1.293	0.000
all_crimes	all_crimes_baseline_median	1.027	1.027	1.027	0.000
all_crimes	median_age	0.995	0.995	0.995	0.000
all_crimes	pct_white_nonhispanic	0.904	0.901	0.907	0.000
all_crimes	pct_black_pop	0.878	0.876	0.881	0.000
all_crimes	pct_unemployed	1.084	1.079	1.088	0.000
all_crimes	pct_family_poverty	0.992	0.990	0.994	0.000
all_crimes	pct_male_15to34	0.663	0.659	0.666	0.000
all_crimes	pct_married_household	0.798	0.797	0.800	0.000
all_crimes	vacant_building_count	1.001	1.001	1.001	0.000
all_crimes	vacant_lot_count	1.001	1.001	1.001	0.000
homicide	(Intercept)	0.034	0.033	0.035	0.000
homicide	cumsum_city_demolition	0.998	0.998	0.998	0.000
homicide	cluster_letterB	1.691	1.665	1.719	0.000
homicide	cluster_letterC	1.641	1.615	1.667	0.000
homicide	cluster_letterD	2.479	2.441	2.518	0.000
homicide	cluster_letterE	3.100	3.052	3.149	0.000
homicide	cluster_letterF	4.023	3.960	4.088	0.000
homicide	cluster_letterG	3.314	3.261	3.368	0.000
homicide	cluster_letterH	2.766	2.719	2.814	0.000
homicide	cluster_letterNot assigned	3.360	3.305	3.416	0.000
homicide	all_crimes_baseline_median	1.011	1.011	1.011	0.000
homicide	median_age	0.996	0.996	0.996	0.000
homicide	pct_white_nonhispanic	0.264	0.259	0.270	0.000
homicide	pct_black_pop	1.125	1.108	1.143	0.000
homicide	pct_unemployed	1.693	1.667	1.720	0.000
homicide	pct_family_poverty	1.237	1.224	1.250	0.000
homicide	pct_male_15to34	0.534	0.520	0.548	0.000
homicide	pct_married_household	0.992	0.981	1.004	0.181
homicide	vacant_building_count	1.006	1.006	1.006	0.000
homicide	vacant_lot_count	1.001	1.001	1.001	0.000
homicide_nonfatalshooting	(Intercept)	0.286	0.282	0.291	0.000
homicide_nonfatalshooting	cumsum_city_demolition	0.999	0.999	0.999	0.000
homicide_nonfatalshooting	all_crimes_baseline_median	1.014	1.014	1.014	0.000
homicide_nonfatalshooting	median_age	0.993	0.993	0.993	0.000
homicide_nonfatalshooting	pct_white_nonhispanic	0.121	0.119	0.122	0.000
homicide_nonfatalshooting	pct_black_pop	0.880	0.871	0.890	0.000
homicide_nonfatalshooting	pct_unemployed	2.118	2.093	2.143	0.000
homicide_nonfatalshooting	pct_family_poverty	2.055	2.041	2.070	0.000
homicide_nonfatalshooting	pct_male_15to34	0.396	0.389	0.403	0.000
homicide_nonfatalshooting	pct_married_household	1.165	1.156	1.174	0.000
homicide_nonfatalshooting	vacant_building_count	1.007	1.007	1.007	0.000
homicide_nonfatalshooting	vacant_lot_count	1.001	1.001	1.001	0.000
agg_assault	(Intercept)	2.952	2.933	2.971	0.000
agg_assault	cumsum_city_demolition	0.994	0.994	0.994	0.000
agg_assault	all_crimes_baseline_median	1.022	1.022	1.022	0.000
agg_assault	median_age	0.992	0.992	0.992	0.000
agg_assault	pct_white_nonhispanic	0.235	0.234	0.236	0.000
agg_assault	pct_black_pop	0.560	0.557	0.562	0.000
agg_assault	pct_unemployed	1.644	1.634	1.654	0.000
agg_assault	pct_family_poverty	1.753	1.747	1.759	0.000
agg_assault	pct_male_15to34	0.315	0.313	0.318	0.000
agg_assault	pct_married_household	0.806	0.804	0.809	0.000
agg_assault	vacant_building_count	1.004	1.004	1.004	0.000
agg_assault	vacant_lot_count	1.001	1.001	1.001	0.000
robberies	(Intercept)	3.118	3.097	3.140	0.000
robberies	cumsum_city_demolition	0.998	0.998	0.998	0.000
robberies	all_crimes_baseline_median	1.024	1.024	1.024	0.000
robberies	median_age	0.992	0.992	0.992	0.000
robberies	pct_white_nonhispanic	0.343	0.341	0.345	0.000
robberies	pct_black_pop	0.526	0.524	0.529	0.000
robberies	pct_unemployed	0.907	0.901	0.914	0.000
robberies	pct_family_poverty	1.178	1.173	1.183	0.000
robberies	pct_male_15to34	0.782	0.774	0.789	0.000
robberies	pct_married_household	0.546	0.544	0.548	0.000
robberies	vacant_building_count	1.000	1.000	1.000	0.000
robberies	vacant_lot_count	1.001	1.001	1.001	0.000
property_crime	(Intercept)	8.179	8.146	8.213	0.000
property_crime	cumsum_city_demolition	0.989	0.989	0.989	0.000
property_crime	all_crimes_baseline_median	1.025	1.025	1.025	0.000
property_crime	median_age	0.998	0.998	0.998	0.000
property_crime	pct_white_nonhispanic	0.927	0.924	0.930	0.000
property_crime	pct_black_pop	0.835	0.832	0.837	0.000
property_crime	pct_unemployed	1.049	1.044	1.053	0.000
property_crime	pct_family_poverty	0.902	0.900	0.904	0.000
property_crime	pct_male_15to34	0.769	0.765	0.773	0.000
property_crime	pct_married_household	0.876	0.874	0.878	0.000
property_crime	vacant_building_count	1.000	1.000	1.000	0.000
property_crime	vacant_lot_count	1.001	1.001	1.001	0.000
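
To read the table, an IRR converts to a percent change in the expected count for a one-unit increase in the term. A worked example using the all_crimes row for city_demolition, taking the reported estimate at face value:

\[\text{IRR} = e^{\hat\beta} \quad\Rightarrow\quad \hat\beta = \ln(0.992) \approx -0.0080\]

\[(\text{IRR} - 1) \times 100\% = (0.992 - 1) \times 100\% = -0.8\%\]

Under this reading, each additional completed demolition in a quarter is associated with roughly 0.8% fewer reported crimes in that block group and quarter, holding the other terms fixed. The same arithmetic gives about +0.1% per additional vacant lot (IRR 1.001).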

Matching on control variables

We then used propensity score matching to identify an appropriate comparison group for the 225 block groups that received one or more city/state demolitions between 2012 and 2019.

library(MatchIt) # load matching package

# Join baseline crime to ACS data
match_data <- qrt_counts_acs %>%  # 20896 rows
  left_join(all_crimes_baseline, by = "geoid") %>%
  arrange(desc(yq(year_qrt))) %>% 
#  filter(year != 2019) %>% #18284 rows
  distinct(geoid, .keep_all = TRUE) %>% 
  mutate(demo_treatment = (cumsum_city_demolition >= 1)) # 225 block groups have at least one demolition by the end of 2019; 218 block groups have at least one demolition by the end of 2018
# TODO: Shouldn't you need to have a demolition by earlier in the period?

# 10 block groups excluded due to missing covariates. 3 out of 10 have no population. Only one of five (one with no population) has a cumulative demolition count (5). All have a baseline crime ranging from 7 to 30.5
match_data_complete <- match_data %>% # Remove missing data
  filter(!is.na(median_age) & !is.na(pct_white_nonhispanic) &
           !is.na(pct_black_pop) & !is.na(pct_unemployed) &
           !is.na(pct_family_poverty) & !is.na(pct_male_15to34) &
           !is.na(pct_married_household) & !is.na(all_crimes_baseline_median) & !is.na(geoid)) %>%
  dplyr::select(-cluster_letter) # TODO: There continues to be an issue with missing values in cluster letter.

match_out <- matchit(demo_treatment ~ median_age + pct_white_nonhispanic + pct_black_pop + pct_unemployed + pct_family_poverty + pct_married_household + pct_male_15to34, data = match_data_complete, method = "nearest", distance = "logit")

summary(match_out) # check balance

## 
## Call:
## matchit(formula = demo_treatment ~ median_age + pct_white_nonhispanic + 
##     pct_black_pop + pct_unemployed + pct_family_poverty + pct_married_household + 
##     pct_male_15to34, data = match_data_complete, method = "nearest", 
##     distance = "logit")
## 
## Summary of balance for all data:
##                       Means Treated Means Control SD Control Mean Diff eQQ Med
## distance                     0.4790        0.2785     0.2025    0.2005  0.2019
## median_age                  38.2214       38.0401     9.4984    0.1813  0.6000
## pct_white_nonhispanic        0.1018        0.3434     0.3148   -0.2416  0.2091
## pct_black_pop                0.8266        0.5564     0.3604    0.2703  0.2403
## pct_unemployed               0.1505        0.0891     0.1002    0.0614  0.0718
## pct_family_poverty           0.2658        0.1372     0.1779    0.1286  0.1548
## pct_married_household        0.5498        0.5376     0.1777    0.0122  0.0139
## pct_male_15to34              0.1289        0.1514     0.0855   -0.0225  0.0151
##                       eQQ Mean eQQ Max
## distance                0.2010  0.3236
## median_age              1.0228 16.5000
## pct_white_nonhispanic   0.2408  0.5793
## pct_black_pop           0.2716  0.6289
## pct_unemployed          0.0617  0.1252
## pct_family_poverty      0.1299  0.1859
## pct_married_household   0.0182  0.0940
## pct_male_15to34         0.0223  0.1459
## 
## 
## Summary of balance for matched data:
##                       Means Treated Means Control SD Control Mean Diff eQQ Med
## distance                     0.4790        0.4286     0.1544    0.0504  0.0605
## median_age                  38.2214       37.9223    10.2495    0.2991  0.5000
## pct_white_nonhispanic        0.1018        0.1169     0.1764   -0.0151  0.0138
## pct_black_pop                0.8266        0.8091     0.2315    0.0175  0.0298
## pct_unemployed               0.1505        0.1237     0.1174    0.0269  0.0320
## pct_family_poverty           0.2658        0.2108     0.2015    0.0550  0.0660
## pct_married_household        0.5498        0.5594     0.1612   -0.0096  0.0134
## pct_male_15to34              0.1289        0.1362     0.0738   -0.0072  0.0077
##                       eQQ Mean eQQ Max
## distance                0.0506  0.0814
## median_age              0.5973  3.9000
## pct_white_nonhispanic   0.0168  0.0720
## pct_black_pop           0.0275  0.0866
## pct_unemployed          0.0307  0.1740
## pct_family_poverty      0.0607  0.1470
## pct_married_household   0.0151  0.0720
## pct_male_15to34         0.0088  0.0696
## 
## Percent Balance Improvement:
##                       Mean Diff. eQQ Med eQQ Mean  eQQ Max
## distance                 74.8570 70.0200  74.8458  74.8427
## median_age              -64.9490 16.6667  41.5976  76.3636
## pct_white_nonhispanic    93.7641 93.4215  93.0313  87.5730
## pct_black_pop            93.5256 87.6090  89.8804  86.2321
## pct_unemployed           56.2410 55.4750  50.2159 -38.9821
## pct_family_poverty       57.2263 57.3977  53.2566  20.9433
## pct_married_household    21.8261  3.5679  16.7646  23.4037
## pct_male_15to34          67.7991 49.1331  60.4084  52.2820
## 
## Sample sizes:
##           Control Treated
## All           419     224
## Matched       224     224
## Unmatched     195       0
## Discarded       0       0
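
The "Percent Balance Improvement" figures above appear consistent with the relative reduction in the absolute mean difference, 100 * (|before| - |after|) / |before|. As a rough check, the sketch below recomputes that quantity from the rounded mean differences printed in the two balance tables. The vectors `before` and `after` are typed in by hand for illustration, so the results can differ from the reported values in the first decimal place because the inputs are rounded to four digits.

```r
# Mean differences copied from the balance tables above (rounded to 4 digits)
before <- c(pct_black_pop = 0.2703, pct_unemployed = 0.0614,
            pct_family_poverty = 0.1286, pct_married_household = 0.0122,
            pct_male_15to34 = -0.0225)
after  <- c(pct_black_pop = 0.0175, pct_unemployed = 0.0269,
            pct_family_poverty = 0.0550, pct_married_household = -0.0096,
            pct_male_15to34 = -0.0072)

# Relative reduction in the absolute mean difference, in percent
improvement <- 100 * (abs(before) - abs(after)) / abs(before)
round(improvement, 1)
```

A negative value, such as the one reported for `median_age`, means the absolute difference grew after matching.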

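# Optional: plot the standardized balance summary
# s.out <- summary(match_out, standardize = TRUE)
# plot(s.out, interactive = FALSE)

# Jitter plot of propensity scores for matched and unmatched units
plot(match_out, type = "jitter", interactive = FALSE)

# Histograms of propensity scores before and after matching
plot(match_out, type = "hist")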

matched_data <- match_out %>% 
  match.data(distance = "pscore") %>% # create ps matched data set from previous output
  mutate(pscore_quantile = ntile(pscore, 5))
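
`ntile(pscore, 5)` assigns each row to one of five rank-based groups of roughly equal size, so `pscore_quantile` runs from 1 (lowest propensity scores) to 5 (highest). A minimal sketch with made-up scores follows; the `toy` tibble and its values are hypothetical and not taken from the analysis data.

```r
library(dplyr)

# Ten made-up propensity scores, for illustration only
toy <- tibble(
  geoid  = sprintf("g%02d", 1:10),
  pscore = c(0.12, 0.85, 0.40, 0.33, 0.67, 0.51, 0.09, 0.95, 0.28, 0.73)
)

# Rank the scores and split them into five equal-sized groups
toy <- toy %>%
  mutate(pscore_quantile = ntile(pscore, 5))

# Each quintile holds two of the ten rows
count(toy, pscore_quantile)
```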

Negative binomial regression with matched data

We then refit the negative binomial regression on the matched data, using the same model specification as before.

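# Keep only the identifier, baseline crime median, and propensity score columns
matched_data <- matched_data %>% 
  select(geoid, all_crimes_baseline_median, pscore, pscore_quantile)

# Restrict the quarterly counts to matched geographies (right_join keeps every row of matched_data)
model2_data <- qrt_counts_acs %>%
  right_join(matched_data, by = "geoid")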

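# All crimes
# Note: because `data` is passed by name, the unnamed `offset(total_pop)` argument
# is matched positionally to glm.nb()'s `weights` parameter, so total_pop acts as a
# prior weight rather than an exposure offset. An exposure offset would be written
# inside the formula as `+ offset(log(total_pop))`. The same applies to each model
# call in this section.
model2_all_crimes <- glm.nb(all_crimes ~ cumsum_city_demolition + cluster_letter + all_crimes_baseline_median + median_age + pct_white_nonhispanic + pct_black_pop + pct_unemployed + pct_family_poverty + pct_male_15to34 + pct_married_household + vacant_building_count + vacant_lot_count,
      offset(total_pop), maxit=999, data = model2_data) %>%
tidy_and_exp("all_crimes")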

# Homicide
model2_homicide <- glm.nb(homicide ~ cumsum_city_demolition + cluster_letter + all_crimes_baseline_median + median_age + pct_white_nonhispanic + pct_black_pop + pct_unemployed + pct_family_poverty + pct_male_15to34 + pct_married_household + vacant_building_count + vacant_lot_count,
      offset(total_pop), maxit=999, data = model2_data) %>%
tidy_and_exp("homicide")

# Homicide and non-fatal shootings
model2_homicide_nonfatalshooting <- glm.nb(homicide_nonfatalshooting ~ cumsum_city_demolition + cluster_letter + all_crimes_baseline_median + median_age + pct_white_nonhispanic + pct_black_pop + pct_unemployed + pct_family_poverty + pct_male_15to34 + pct_married_household + vacant_building_count + vacant_lot_count,
      offset(total_pop), maxit=999, data = model2_data) %>%
tidy_and_exp("homicide_nonfatalshooting")

# Aggravated assault
model2_agg_assault <- glm.nb(agg_assault ~ cumsum_city_demolition + cluster_letter + all_crimes_baseline_median + median_age + pct_white_nonhispanic + pct_black_pop + pct_unemployed + pct_family_poverty + pct_male_15to34 + pct_married_household + vacant_building_count + vacant_lot_count,
      offset(total_pop), maxit=999, data = model2_data) %>%
tidy_and_exp("agg_assault")

# Robberies
model2_robberies <- glm.nb(robberies ~ cumsum_city_demolition + cluster_letter + all_crimes_baseline_median + median_age + pct_white_nonhispanic + pct_black_pop + pct_unemployed + pct_family_poverty + pct_male_15to34 + pct_married_household + vacant_building_count + vacant_lot_count,
      offset(total_pop), maxit=999, data = model2_data) %>%
tidy_and_exp("robberies")

# Property crime
model2_property_crime <- glm.nb(property_crime ~ cumsum_city_demolition + cluster_letter + all_crimes_baseline_median + median_age + pct_white_nonhispanic + pct_black_pop + pct_unemployed + pct_family_poverty + pct_male_15to34 + pct_married_household + vacant_building_count + vacant_lot_count,
      offset(total_pop), maxit=999, data = model2_data) %>%
tidy_and_exp("property_crime")
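
For comparison, a population exposure term in a negative binomial model is commonly written as `offset(log(population))` inside the formula, which fixes the coefficient on log population at 1 so that the remaining coefficients describe rates per resident. The sketch below illustrates this on simulated data. The data frame `toy`, its columns, and the simulated effect size are all hypothetical and are not part of the analysis above.

```r
library(MASS)

set.seed(42)
n <- 500

# Simulated areas: population and a demolition count (illustrative only)
toy <- data.frame(
  total_pop   = round(runif(n, 500, 5000)),
  demolitions = rpois(n, 3)
)

# True rate per resident falls by about 2% per demolition
mu <- toy$total_pop * exp(-5 - 0.02 * toy$demolitions)
toy$crimes <- rnbinom(n, size = 5, mu = mu)

# Exposure enters the formula as an offset on the log scale
fit <- glm.nb(crimes ~ demolitions + offset(log(total_pop)), data = toy)

# Exponentiate coefficients and Wald intervals to get incidence rate ratios
irr <- exp(cbind(estimate = coef(fit), confint.default(fit)))
irr
```

With this specification, the `demolitions` row of `irr` estimates the multiplicative change in the per-resident crime rate for one additional demolition.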

# Combine outcomes into one table
model2_table <- bind_rows(model2_all_crimes,
                          model2_homicide,
                          model2_homicide_nonfatalshooting,
                          model2_agg_assault,
                          model2_robberies,
                          model2_property_crime) %>%
  select(outcome, term, irr_estimate, irr_conf.low, irr_conf.high, p.value)

knitr::kable(model2_table, col.names = c("Outcome", "Model term", "Estimated IRR", "Low (95% CI)", "High (95% CI)", "p-value"), align = "crcccc", digits = 3,
caption = "Model 2: Adjusted Incidence Rate Ratio (IRR) by Crime Category")

Model 2: Adjusted Incidence Rate Ratio (IRR) by Crime Category
Outcome	Model term	Estimated IRR	Low (95% CI)	High (95% CI)	p-value
all_crimes	(Intercept)	7.833	7.791	7.876	0.000
all_crimes	cumsum_city_demolition	0.994	0.994	0.994	0.000
all_crimes	cluster_letterB	1.242	1.238	1.246	0.000
all_crimes	cluster_letterC	1.397	1.392	1.402	0.000
all_crimes	cluster_letterD	1.472	1.467	1.477	0.000
all_crimes	cluster_letterE	1.500	1.495	1.505	0.000
all_crimes	cluster_letterF	1.633	1.628	1.639	0.000
all_crimes	cluster_letterG	1.566	1.560	1.571	0.000
all_crimes	cluster_letterH	1.459	1.453	1.464	0.000
all_crimes	cluster_letterNot assigned	1.464	1.459	1.469	0.000
all_crimes	all_crimes_baseline_median	1.031	1.031	1.031	0.000
all_crimes	median_age	0.994	0.994	0.994	0.000
all_crimes	pct_white_nonhispanic	1.157	1.152	1.163	0.000
all_crimes	pct_black_pop	1.051	1.048	1.055	0.000
all_crimes	pct_unemployed	1.029	1.025	1.032	0.000
all_crimes	pct_family_poverty	0.974	0.972	0.976	0.000
all_crimes	pct_male_15to34	0.683	0.679	0.687	0.000
all_crimes	pct_married_household	0.898	0.895	0.900	0.000
all_crimes	vacant_building_count	1.001	1.001	1.001	0.000
all_crimes	vacant_lot_count	1.001	1.001	1.001	0.000
homicide	(Intercept)	0.019	0.018	0.020	0.000
homicide	cumsum_city_demolition	0.999	0.999	0.999	0.000
homicide	cluster_letterB	1.827	1.753	1.906	0.000
homicide	cluster_letterC	2.118	2.032	2.208	0.000
homicide	cluster_letterD	2.760	2.649	2.877	0.000
homicide	cluster_letterE	3.590	3.445	3.742	0.000
homicide	cluster_letterF	3.892	3.735	4.058	0.000
homicide	cluster_letterG	3.750	3.598	3.911	0.000
homicide	cluster_letterH	3.438	3.297	3.587	0.000
homicide	cluster_letterNot assigned	3.710	3.559	3.868	0.000
homicide	all_crimes_baseline_median	1.018	1.018	1.018	0.000
homicide	median_age	0.998	0.998	0.999	0.000
homicide	pct_white_nonhispanic	0.399	0.388	0.410	0.000
homicide	pct_black_pop	1.307	1.283	1.331	0.000
homicide	pct_unemployed	1.700	1.673	1.728	0.000
homicide	pct_family_poverty	1.204	1.191	1.217	0.000
homicide	pct_male_15to34	0.634	0.616	0.652	0.000
homicide	pct_married_household	1.124	1.110	1.138	0.000
homicide	vacant_building_count	1.005	1.005	1.005	0.000
homicide	vacant_lot_count	1.001	1.001	1.001	0.000
homicide_nonfatalshooting	(Intercept)	0.035	0.034	0.037	0.000
homicide_nonfatalshooting	cumsum_city_demolition	1.003	1.002	1.003	0.000
homicide_nonfatalshooting	cluster_letterB	2.981	2.891	3.074	0.000
homicide_nonfatalshooting	cluster_letterC	3.243	3.146	3.345	0.000
homicide_nonfatalshooting	cluster_letterD	4.380	4.250	4.516	0.000
homicide_nonfatalshooting	cluster_letterE	5.715	5.544	5.892	0.000
homicide_nonfatalshooting	cluster_letterF	5.386	5.225	5.554	0.000
homicide_nonfatalshooting	cluster_letterG	6.201	6.015	6.395	0.000
homicide_nonfatalshooting	cluster_letterH	5.735	5.561	5.916	0.000
homicide_nonfatalshooting	cluster_letterNot assigned	4.630	4.490	4.775	0.000
homicide_nonfatalshooting	all_crimes_baseline_median	1.022	1.022	1.022	0.000
homicide_nonfatalshooting	median_age	0.992	0.992	0.992	0.000
homicide_nonfatalshooting	pct_white_nonhispanic	0.508	0.499	0.517	0.000
homicide_nonfatalshooting	pct_black_pop	1.527	1.508	1.547	0.000
homicide_nonfatalshooting	pct_unemployed	1.683	1.663	1.703	0.000
homicide_nonfatalshooting	pct_family_poverty	1.314	1.304	1.324	0.000
homicide_nonfatalshooting	pct_male_15to34	0.521	0.511	0.531	0.000
homicide_nonfatalshooting	pct_married_household	1.078	1.068	1.087	0.000
homicide_nonfatalshooting	vacant_building_count	1.004	1.004	1.004	0.000
homicide_nonfatalshooting	vacant_lot_count	1.001	1.001	1.001	0.000
agg_assault	(Intercept)	0.391	0.387	0.396	0.000
agg_assault	cumsum_city_demolition	0.998	0.998	0.998	0.000
agg_assault	cluster_letterB	2.758	2.732	2.783	0.000
agg_assault	cluster_letterC	2.743	2.718	2.769	0.000
agg_assault	cluster_letterD	3.803	3.769	3.838	0.000
agg_assault	cluster_letterE	4.237	4.198	4.275	0.000
agg_assault	cluster_letterF	4.161	4.123	4.200	0.000
agg_assault	cluster_letterG	4.632	4.590	4.676	0.000
agg_assault	cluster_letterH	4.247	4.206	4.287	0.000
agg_assault	cluster_letterNot assigned	4.647	4.604	4.690	0.000
agg_assault	all_crimes_baseline_median	1.027	1.027	1.027	0.000
agg_assault	median_age	0.991	0.991	0.991	0.000
agg_assault	pct_white_nonhispanic	0.811	0.804	0.817	0.000
agg_assault	pct_black_pop	1.030	1.024	1.036	0.000
agg_assault	pct_unemployed	1.209	1.202	1.216	0.000
agg_assault	pct_family_poverty	1.268	1.264	1.273	0.000
agg_assault	pct_male_15to34	0.529	0.524	0.534	0.000
agg_assault	pct_married_household	0.886	0.882	0.890	0.000
agg_assault	vacant_building_count	1.002	1.002	1.002	0.000
agg_assault	vacant_lot_count	1.001	1.001	1.001	0.000
robberies	(Intercept)	1.031	1.021	1.042	0.000
robberies	cumsum_city_demolition	1.001	1.001	1.001	0.000
robberies	cluster_letterB	1.289	1.280	1.298	0.000
robberies	cluster_letterC	1.825	1.813	1.838	0.000
robberies	cluster_letterD	1.923	1.910	1.937	0.000
robberies	cluster_letterE	1.804	1.792	1.817	0.000
robberies	cluster_letterF	1.939	1.925	1.952	0.000
robberies	cluster_letterG	1.805	1.792	1.819	0.000
robberies	cluster_letterH	1.746	1.733	1.760	0.000
robberies	cluster_letterNot assigned	1.850	1.837	1.864	0.000
robberies	all_crimes_baseline_median	1.030	1.030	1.030	0.000
robberies	median_age	0.991	0.991	0.991	0.000
robberies	pct_white_nonhispanic	0.806	0.799	0.813	0.000
robberies	pct_black_pop	0.737	0.732	0.741	0.000
robberies	pct_unemployed	0.805	0.799	0.811	0.000
robberies	pct_family_poverty	1.020	1.015	1.024	0.000
robberies	pct_male_15to34	0.939	0.929	0.949	0.000
robberies	pct_married_household	0.664	0.661	0.667	0.000
robberies	vacant_building_count	1.000	1.000	1.000	0.000
robberies	vacant_lot_count	1.001	1.001	1.001	0.000
property_crime	(Intercept)	4.787	4.760	4.814	0.000
property_crime	cumsum_city_demolition	0.991	0.991	0.991	0.000
property_crime	cluster_letterB	1.116	1.112	1.120	0.000
property_crime	cluster_letterC	1.334	1.329	1.339	0.000
property_crime	cluster_letterD	1.310	1.305	1.314	0.000
property_crime	cluster_letterE	1.291	1.287	1.296	0.000
property_crime	cluster_letterF	1.466	1.461	1.471	0.000
property_crime	cluster_letterG	1.278	1.274	1.283	0.000
property_crime	cluster_letterH	1.171	1.166	1.175	0.000
property_crime	cluster_letterNot assigned	1.171	1.167	1.176	0.000
property_crime	all_crimes_baseline_median	1.029	1.028	1.029	0.000
property_crime	median_age	0.997	0.997	0.997	0.000
property_crime	pct_white_nonhispanic	1.454	1.447	1.462	0.000
property_crime	pct_black_pop	1.039	1.035	1.042	0.000
property_crime	pct_unemployed	0.993	0.989	0.997	0.001
property_crime	pct_family_poverty	0.872	0.870	0.874	0.000
property_crime	pct_male_15to34	0.795	0.790	0.799	0.000
property_crime	pct_married_household	0.932	0.929	0.934	0.000
property_crime	vacant_building_count	1.000	1.000	1.000	0.000
property_crime	vacant_lot_count	1.001	1.001	1.001	0.000
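
An IRR is multiplicative, so a value of 0.994 for `cumsum_city_demolition` corresponds to roughly a 0.6% lower expected count per additional cumulative demolition, holding the other covariates fixed. The sketch below converts the `cumsum_city_demolition` estimates from the table into percent changes, for one demolition and, by compounding, for ten. The vector `irr_demolition` is typed in by hand from the table, and the ten-demolition figure assumes the log-linear form of the model holds over that range.

```r
# cumsum_city_demolition IRRs copied from the Model 2 table (rounded to 3 digits)
irr_demolition <- c(all_crimes = 0.994, homicide = 0.999,
                    homicide_nonfatalshooting = 1.003, agg_assault = 0.998,
                    robberies = 1.001, property_crime = 0.991)

# Percent change in the expected count per additional demolition
pct_change_1 <- (irr_demolition - 1) * 100

# Percent change after ten additional demolitions (IRRs compound multiplicatively)
pct_change_10 <- (irr_demolition^10 - 1) * 100

round(cbind(per_1 = pct_change_1, per_10 = pct_change_10), 2)
```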

Demolition Analysis

Eli Pousson

2/2/2020
