Philadelphia Revitalization

Addressing 311 Incidents

Background

This report examines 311 incidents in the city of Philadelphia from 2015 to 2016. Incidents are not spatially random: markers of blight cluster in poor and majority non-white neighborhoods. This project aims to measure the magnitude of that problem. Using data from the American Community Survey, I connect the demographics of the city’s census tracts with the 311 incident case log. I hope these findings will help the city target and reduce these problems with a methodical, quantitative approach.

Below I load in libraries and custom functions.

library(tidyverse)
library(sf)
library(RSocrata)
library(viridis)
library(spatstat)
library(raster)
library(spdep)
library(FNN)
library(grid)
library(gridExtra)
library(knitr)
library(tidycensus)
library(stargazer)

options(scipen = 999)

mapTheme <- function(base_size = 12, title_size = 16) {
  theme(
    text = element_text( color = "black"),
    plot.title = element_text(size = title_size,colour = "black"),
    plot.subtitle=element_text(face="italic"),
    plot.caption=element_text(hjust=0),
    axis.ticks = element_blank(),
    panel.background = element_blank(),axis.title = element_blank(),
    axis.text = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    panel.grid.minor = element_blank(),
    panel.border = element_rect(colour = "black", fill=NA, size=2),
    strip.text.x = element_text(size = 14))
}

plotTheme <- function(base_size = 12, title_size = 16) {
  theme(
    text = element_text( color = "black"),
    plot.title = element_text(size = title_size, colour = "black"), 
    plot.subtitle = element_text(face="italic"),
    plot.caption = element_text(hjust=0),
    axis.ticks = element_blank(),
    panel.background = element_blank(),
    panel.grid.major = element_line("grey80", size = 0.1),
    panel.grid.minor = element_blank(),
    panel.border = element_rect(colour = "black", fill=NA, size=2),
    strip.background = element_rect(fill = "grey80", color = "white"),
    strip.text = element_text(size=12),
    axis.title = element_text(size=12),
    axis.text = element_text(size=10),
    plot.background = element_blank(),
    legend.background = element_blank(),
    legend.title = element_text(colour = "black", face = "italic"),
    legend.text = element_text(colour = "black", face = "italic"),
    strip.text.x = element_text(size = 14)
  )
}

q5 <- function(variable) {as.factor(ntile(variable, 5))}

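# Quintile break labels for map legends.
# big.mark has to be an argument of format(), not as.character(),
# otherwise the thousands separator is silently dropped.
qBr <- function(df, variable, rnd) {
  probs <- c(.01,.2,.4,.6,.8)
  if (missing(rnd)) {
    # Default: round values to the nearest hundred before taking quantiles
    as.character(format(quantile(round(df[[variable]], digits = -2),
                                 probs, na.rm = TRUE), big.mark = ',', trim = TRUE))
  } else if (rnd == FALSE) {
    # rnd = FALSE: take quantiles of the unrounded values
    as.character(format(quantile(df[[variable]],
                                 probs, na.rm = TRUE), big.mark = ',', trim = TRUE))
  }
}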


palette5 <- c("#f0f9e8","#bae4bc","#7bccc4","#43a2ca","#0868ac")

brown <- c("#ffffd4", "#fed98e", "#fe9929", "#d95f0e", "#993404")
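
To make the legend helpers above more concrete, here is a minimal base R sketch of the same idea: quintile groups plus formatted break labels. The `income` vector is made up for illustration (it is not ACS data), and the rank-based grouping is only a stand-in for `dplyr::ntile()` that happens to match it on this tie-free toy vector.

```r
# Hypothetical tract incomes (made-up numbers, not ACS data)
income <- c(18000, 22000, 31000, 35000, 42000, 47000, 58000, 64000, 83000, 120000)

# Break points at the same probabilities that qBr() uses
breaks <- quantile(income, c(.01, .2, .4, .6, .8), na.rm = TRUE)

# Round to the nearest hundred and add thousands separators for the legend
labels <- paste0("$", format(round(breaks, digits = -2), big.mark = ",", trim = TRUE))

# Assign each value to one of five equal-sized groups by rank
quintile <- as.factor(ceiling(rank(income, ties.method = "first") * 5 / length(income)))

print(labels)
print(table(quintile))
```

Each group holds two of the ten values, and the labels mark the lower edge of each group, which is how the quintile legends in the maps below should be read.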

Visualizing the Data

My first step is to load the 311 incident file from 2015 and 2016. Then I load the projected Philadelphia boundary and divide the city into a fishnet of equally sized grid cells, each 1,000 feet on a side (the projection, ESRI:102728, is measured in feet). These cells will be used later to count total incidents.

I have chosen to focus on graffiti, street light outages, street trees, and abandoned vehicles. My rough hypothesis is that tree requests will be concentrated in wealthier neighborhoods, while graffiti, broken street lights, and abandoned vehicles will be concentrated in poorer neighborhoods.

The density plots show that abandoned vehicles and broken street lights follow a similar pattern. Trees appear spread out evenly throughout the city and graffiti seems mostly concentrated in Center City and near the Delaware River to the east.
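
The fishnet counting step described above can be sketched on toy data. This is only an illustration under stated assumptions: the square study area, the three points, and the names `study_area`, `toy_net`, and `toy_points` are all hypothetical, and the coordinates are arbitrary planar units rather than real Philadelphia geography.

```r
library(sf)

# Toy study area: a 2,000 x 2,000 square in arbitrary planar units (hypothetical)
study_area <- st_sfc(
  st_polygon(list(rbind(c(0, 0), c(2000, 0), c(2000, 2000), c(0, 2000), c(0, 0))))
)

# cellsize is the length of a cell's side, so this area is split into 2 x 2 cells
toy_net <- st_sf(geometry = st_make_grid(study_area, cellsize = 1000, square = TRUE))
toy_net$uniqueID <- as.character(seq_len(nrow(toy_net)))

# Three made-up incident locations, two of them in the same cell
toy_points <- st_sfc(
  st_point(c(100, 100)),
  st_point(c(400, 900)),
  st_point(c(1500, 1500))
)

# For each cell, count the incident points that fall inside it
toy_net$count <- lengths(st_intersects(toy_net, toy_points))

print(st_drop_geometry(toy_net))
```

Note that `cellsize` sets the side length of a cell, not the number of cells, so the number of cells in the real fishnet depends on the extent of the city boundary.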

load("311-requests-sample.rda")

summary(phlcalls)
str(phlcalls)




phillyBoundary <- 
  st_read("City_Limits.geojson") %>%
  st_transform('ESRI:102728') 



## Creating a fishnet grid

fishnet <- 
  st_make_grid(phillyBoundary,
               cellsize = 1000, 
               square = TRUE) %>%
  .[phillyBoundary] %>% 
  st_sf() %>%
  mutate(uniqueID = rownames(.))

ggplot()+
  geom_sf(data = fishnet)+
  labs(title = "Philadelphia Fishnet Gridcells")

## pulling out different 311 requests and projecting them
graffiti <- 
  phlcalls %>%
  filter(servicename == "Graffiti Removal") %>%
  dplyr::select(Y = latitude, X = longitude) %>%
  na.omit() %>%
  st_as_sf(coords = c("X", "Y"), crs = 4326, agr = "constant") %>%
  st_transform(st_crs(fishnet)) %>%
  mutate(Legend = "Graffiti")

street_light_out <- 
  phlcalls %>%
  filter(servicename == "Street Light Outage") %>%
  dplyr::select(Y = latitude, X = longitude) %>%
  na.omit() %>%
  st_as_sf(coords = c("X", "Y"), crs = 4326, agr = "constant") %>%
  st_transform(st_crs(fishnet)) %>%
  mutate(Legend = "Street_light_out")

trees <- 
  phlcalls %>%
  filter(servicename == "Street Trees") %>%
  dplyr::select(Y = latitude, X = longitude) %>%
  na.omit() %>%
  st_as_sf(coords = c("X", "Y"), crs = 4326, agr = "constant") %>%
  st_transform(st_crs(fishnet)) %>%
  mutate(Legend = "Trees")

abandoned_vehicle <- 
  phlcalls %>%
  filter(servicename == "Abandoned Vehicle") %>%
  dplyr::select(Y = latitude, X = longitude) %>%
  na.omit() %>%
  st_as_sf(coords = c("X", "Y"), crs = 4326, agr = "constant") %>%
  st_transform(st_crs(fishnet)) %>%
  mutate(Legend = "Abandoned_vehicle")


# Combining features into a net and mapping them
vars_net <- 
  rbind(graffiti, street_light_out,trees,abandoned_vehicle) %>%
  st_join(., fishnet, join=st_within) %>%
  st_drop_geometry() %>%
  group_by(uniqueID, Legend) %>%
  summarize(count = n()) %>%
  full_join(fishnet) %>%
  spread(Legend, count, fill=0) %>%
  st_sf() %>%
  dplyr::select(-`<NA>`) %>%
  na.omit() %>%
  ungroup()

#Mapping Features
vars_net.long <- 
  gather(vars_net, Variable, value, -geometry, -uniqueID)

vars <- unique(vars_net.long$Variable)
mapList <- list()

for(i in vars){
  mapList[[i]] <- 
    ggplot() +
    geom_sf(data = filter(vars_net.long, Variable == i), aes(fill=value), colour=NA) +
    scale_fill_viridis(name="") +
    labs(title=i) +
    mapTheme()}

do.call(grid.arrange,c(mapList, ncol =2, top = "311 Complaints by Fishnet"))

American Community Survey

While it’s important to see the spatial distribution of 311 incidents, it will be even more important to connect these features with the demographic information of various neighborhoods. To accomplish this, I pull in survey data from the five year pooled American Community Surveys from 2014-2018. Specifically I am looking at median income, median rent, poverty, and race.

First I had to spatially join the fishnet gridcells to the larger census tracts. Then I ran several correlation plots comparing 311 incidents and tract demographic information with Median Household Income.

Features like MedRent and pctWhite showed a strong positive correlation with Median Household Income at the census tract level. Abandoned_vehicle and Street_light_out were negatively correlated with median income. Graffiti and Trees show very little correlation with median household income.

#ACS Data

tracts18 <- 
  get_acs(geography = "tract", variables = c("B25026_001E","B02001_002E",
                                             "B19013_001E","B25058_001E",
                                             "B06012_002E"), 
          year=2018, state= 42, county= 101, geometry=T, output="wide") %>%
  st_transform('ESRI:102728') %>%
  rename(TotalPop = B25026_001E, 
         Whites = B02001_002E,
         MedHHInc = B19013_001E, 
         MedRent = B25058_001E,
         TotalPoverty = B06012_002E) %>%
  dplyr::select(-NAME, -starts_with("B")) %>%
  mutate(pctWhite = ifelse(TotalPop > 0, Whites / TotalPop,0),
         pctPoverty = ifelse(TotalPop > 0, TotalPoverty / TotalPop, 0),
         raceContext = ifelse(pctWhite > .5, "Majority_White", "Majority_Non_White")) %>%
  dplyr::select(-Whites, -TotalPoverty) 

#Philly Neighborhoods

neighborhoods <- 
  st_read("https://raw.githubusercontent.com/blackmad/neighborhoods/master/philadelphia.geojson") %>%
  st_transform(st_crs(fishnet)) 

# Joining 
final_net <-
  st_centroid(vars_net) %>%
  # st_join() matches on geometry, so no `by` key is needed
  st_join(dplyr::select(neighborhoods, name)) %>%
  st_join(dplyr::select(tracts18, pctPoverty, pctWhite, MedRent, MedHHInc, TotalPop, raceContext)) %>%
  st_drop_geometry() %>%
  left_join(dplyr::select(vars_net, geometry, uniqueID)) %>%
  st_sf() %>%
  na.omit() %>%
  filter(pctWhite <1.0)


mapview::mapview(final_net, zcol = "name")

#Correlation Charts

correlation.long <-
  st_drop_geometry(final_net) %>%
  dplyr::select(-uniqueID, -name, -raceContext) %>%
  gather(Variable, Value, -MedHHInc)

correlation.cor <-
  correlation.long %>%
  group_by(Variable) %>%
  summarize(correlation = cor(Value, MedHHInc, use = "complete.obs"))

ggplot(correlation.long, aes(Value, MedHHInc)) +
  geom_point(size = 0.1) +
  geom_text(data = correlation.cor, aes(label = paste("r =", round(correlation, 2))),
            x=-Inf, y=Inf, vjust = 1.5, hjust = -.1) +
  geom_smooth(method = "lm", se = FALSE, colour = "black") +
  facet_wrap(~Variable, ncol = 2, scales = "free") +
  labs(title = "Median Household Income vs. Tract Demographics and 311 Incidents") +
  plotTheme()

Mapping

The majority of the following maps compare 311 features with median household income by tract. The results mostly back up the previous findings: abandoned cars and broken street lights are more prominent in poorer neighborhoods. Graffiti tends to cluster in the eastern parts of Center City, which are affluent.

Finally I examined total population by tract. This will be important later on when I develop my linear models. The last map shows majority white vs. majority non-white tracts. Philadelphia remains racially segregated to this day.

# mapping wealthiest neighborhoods

ggplot() +
  geom_sf(data = tracts18, aes(fill = q5(MedHHInc))) +
  scale_fill_manual(values = palette5,
                    labels = paste('$',(qBr(tracts18, "MedHHInc"))),
                    name = "Median Household Income\n(Quintile Breaks)") +
  labs(title = "Wealthiest Neighborhoods", subtitle = "Philadelphia; 2014-2018",
       caption = "Source: 5-year pooled American Community Survey") +
  mapTheme() + theme(plot.title = element_text(size=22))

# wealthy vs. graffiti

ggplot() +
  geom_sf(data = tracts18, aes(fill = q5(MedHHInc))) +
  geom_sf(data = graffiti, size = 0.2, color ="red") +
  scale_fill_manual(values = palette5,
                    labels = paste('$',qBr(tracts18, "MedHHInc")),
                    name = "Median Household Income\n(Quintile Breaks)") +
  labs(title = "Graffiti",
       caption = "Source: 5-year pooled American Community Survey") +
  mapTheme() + theme(plot.title = element_text(size=22))

#wealthy vs. abandoned cars

ggplot() +
  geom_sf(data = tracts18, aes(fill = q5(MedHHInc))) +
  geom_sf(data = abandoned_vehicle, size = 0.2, color ="orange") +
  scale_fill_manual(values = palette5,
                    labels = paste('$',qBr(tracts18, "MedHHInc")),
                    name = "Median Household Income\n(Quintile Breaks)") +
  labs(title = "Abandoned Cars",
       caption = "Source: 5-year pooled American Community Survey") +
  mapTheme() + theme(plot.title = element_text(size=22))

#wealthy vs. trees

ggplot() +
  geom_sf(data = tracts18, aes(fill = q5(MedHHInc))) +
  geom_sf(data = trees, size = 0.8, color ="#dd1c77") +
  scale_fill_manual(values = palette5,
                    labels = paste('$',qBr(tracts18, "MedHHInc")),
                    name = "Median Household Income\n(Quintile Breaks)") +
  labs(title = "Trees",
       caption = "Source: 5-year pooled American Community Survey") +
  mapTheme() + theme(plot.title = element_text(size=22))      

#wealthy vs. street lights out

ggplot() +
  geom_sf(data = tracts18, aes(fill = q5(MedHHInc))) +
  geom_sf(data = street_light_out, size = 0.8) +
  scale_fill_manual(values = palette5,
                    labels = paste('$',qBr(tracts18, "MedHHInc")),
                    name = "Median Household Income\n(Quintile Breaks)") +
  labs(title = "Street Lights Out",
       caption = "Source: 5-year pooled American Community Survey") +
  mapTheme() + theme(plot.title = element_text(size=22))  

#mapping population
ggplot() +
  geom_sf(data = tracts18, aes(fill = q5(TotalPop))) +
  scale_fill_manual(values = brown,
                    labels = qBr(tracts18, "TotalPop"),
                    name = "Total Population\n(Quintile Breaks)") +
  labs(title = "Total Population", subtitle = "Philadelphia; 2014-2018",
       caption = "Source: 5-year pooled American Community Survey") +
  mapTheme() + theme(plot.title = element_text(size=22))

#race
ggplot() +
  geom_sf(data = tracts18, aes(fill = raceContext)) +
  labs(title = "Race", subtitle = "Philadelphia; 2014-2018",
       caption = "Source: 5-year pooled American Community Survey") +
  guides(fill=guide_legend("Racial Composition")) +
  mapTheme() + theme(plot.title = element_text(size=22))

Linear Modeling

I developed three models. Model 1 examines the relationship between the 311 features and MedHHInc. Model 2 looks at the same features’ relationship with TotalPop. Model 3 is a logistic regression that examines whether there is a relationship between the 311 features and the racial composition of the surrounding census tract. In all three, the observations are fishnet grid cells carrying the attributes of the tract they fall in.

All three models are significant, so we can reject the null hypothesis that there is no relationship between the dependent and independent variables. The two linear models explain little of the variance, however (R2 of 0.06 and 0.05).
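
As a rough check on the significance claim for the two linear models, the p-value of each overall F test can be recovered from the F statistics and degrees of freedom reported in the regression table (45.50 and 40.96, with df = 4 and 3041). The variable names below are mine, and the inputs are the rounded values from the table.

```r
# F statistics as reported by stargazer (rounded to two decimals)
f_model1 <- 45.50
f_model2 <- 40.96

# Upper-tail probability of the F distribution with 4 and 3041 degrees of freedom
p_model1 <- pf(f_model1, df1 = 4, df2 = 3041, lower.tail = FALSE)
p_model2 <- pf(f_model2, df1 = 4, df2 = 3041, lower.tail = FALSE)

print(c(model1 = p_model1, model2 = p_model2))
```

Both p-values fall far below 0.01, which is consistent with the three stars on the F statistics.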

Model 1 shows that when all four 311 counts are zero, the predicted median tract income is about $55k. Each additional abandoned vehicle report is associated with a decrease of about $2,700 in tract income. Broken street lights are also associated with lower tract income (about $950 per report). Increases in graffiti reports are associated with increases in tract income (about $600 per report). Trees are not statistically significant.
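
To see how the model 1 coefficients combine, the sketch below plugs the rounded values from the regression table into the fitted equation. The helper `predict_income()` and the example counts are hypothetical and only illustrate the arithmetic. They do not replace `predict()` on the fitted model.

```r
# Model 1 coefficients as reported in the regression table (rounded)
coefs <- c(Constant = 55271.71, Abandoned_vehicle = -2690.59,
           Graffiti = 607.97, Street_light_out = -948.92, Trees = 243.04)

# Hypothetical helper: predicted median household income for one grid cell
predict_income <- function(abandoned_vehicle = 0, graffiti = 0,
                           street_light_out = 0, trees = 0) {
  unname(coefs["Constant"] +
           coefs["Abandoned_vehicle"] * abandoned_vehicle +
           coefs["Graffiti"] * graffiti +
           coefs["Street_light_out"] * street_light_out +
           coefs["Trees"] * trees)
}

# A cell with no reports returns the constant
print(predict_income())

# A made-up cell with two abandoned vehicles and one street light outage
print(predict_income(abandoned_vehicle = 2, street_light_out = 1))
```

The second call works out to 55,271.71 - 2 × 2,690.59 - 948.92 = 48,941.61.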

Model 2 follows a similar pattern, with two notable differences: 1) graffiti reports are more common in less populated tracts; 2) tree requests are statistically significant (p<0.05) and are more common in tracts with larger populations.

Model 3 is the logistic regression. All features are statistically significant at the p<0.01 threshold. Abandoned vehicle, broken street light, and tree reports are each associated with higher odds that a grid cell lies in a majority non-white tract. Graffiti reports are associated with lower odds.
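
Logistic regression coefficients are on the log-odds scale, so they need to be exponentiated before they can be read as multiplicative changes in the odds. The sketch below does that for the rounded model 3 coefficients from the results table. The variable names are mine.

```r
# Model 3 coefficients as reported in the logistic regression table (log-odds, rounded)
log_odds <- c(Abandoned_vehicle = 0.27, Graffiti = -0.10,
              Street_light_out = 0.18, Trees = 0.20)

# Odds ratio for one additional report of each type
odds_ratios <- exp(log_odds)

# Percent change in the odds of a majority non-white tract per additional report
pct_change <- (odds_ratios - 1) * 100

print(round(odds_ratios, 2))
print(round(pct_change))
```

These are changes in odds, not in probability, and they apply per additional report in a grid cell with the other counts held fixed.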

#linear modeling

model1 <- lm(MedHHInc ~ Abandoned_vehicle + Graffiti + Street_light_out + Trees,data = final_net %>%
               st_drop_geometry())

summary(model1)

model2 <- lm(TotalPop ~ Abandoned_vehicle + Graffiti + Street_light_out + Trees,data = final_net %>%
               st_drop_geometry())

summary(model2)

#Majority white neighborhoods are 0, majority non-white are 1
final_net2 <-
  final_net %>%
  mutate(RaceNumeric = ifelse(raceContext=="Majority_White",0,1))

model3 <- glm(RaceNumeric ~ Abandoned_vehicle + Graffiti + Street_light_out + Trees, 
              family=binomial(link = "logit"), data = final_net2 %>%
                st_drop_geometry())

summary(model3)
#these will only show in an RMD, knitted to html
stargazer(model1,model2, type = 'html', align=TRUE, single.row=TRUE,report="vc*",se=NULL,digits=2, dep.var.caption = "",title = "Regression Results", 
          column.labels=c("Household Income","Population"))
Regression Results

                                   MedHHInc            TotalPop
                                   Household Income    Population
                                   (1)                 (2)
Abandoned_vehicle                  -2,690.59***        151.63***
Graffiti                           607.97***           -41.19***
Street_light_out                   -948.92**           136.36***
Trees                              243.04              101.14**
Constant                           55,271.71***        3,970.04***
Observations                       3,046               3,046
R2                                 0.06                0.05
Adjusted R2                        0.06                0.05
Residual Std. Error (df = 3041)    23,821.94           1,795.29
F Statistic (df = 4; 3041)         45.50***            40.96***
Note: *p<0.1; **p<0.05; ***p<0.01

stargazer(model3, type = 'html', align=TRUE, single.row=TRUE, report="vc*",se=NULL,digits=2, dep.var.caption = "",title = "Logistic Regression Results", 
          column.labels=c("Race"))
Logistic Regression Results

                                   RaceNumeric
                                   Race
Abandoned_vehicle                  0.27***
Graffiti                           -0.10***
Street_light_out                   0.18***
Trees                              0.20***
Constant                           -0.69***
Observations                       3,046
Log Likelihood                     -1,951.85
Akaike Inf. Crit.                  3,913.71
Note: *p<0.1; **p<0.05; ***p<0.01

Conclusion

Wealthier neighborhoods have higher rents and are majority white. They are clustered in central Philadelphia and the suburban tracts in the northwest and northeast. Graffiti and trees show little correlation with median tract income. Tree requests are more common in more populated tracts and in majority non-white neighborhoods. Broken street lights and abandoned cars are more common in poorer, majority non-white neighborhoods. The logistic regression coefficients are log-odds, so each additional broken street light report in a grid cell is associated with roughly 20% higher odds (exp(0.18) ≈ 1.20) that the cell lies in a majority non-white tract, and each additional abandoned car report with roughly 31% higher odds (exp(0.27) ≈ 1.31).

Recommendations

  • Remove abandoned cars and repair broken street lights, since they disproportionately impact vulnerable communities
  • Plant trees in less populated neighborhoods (support parks’ creation/ beautification)
  • De-prioritize graffiti abatement
  • Solve Philadelphia’s geographic racial divisions
  • Consider housing credits for minority homeowners in majority white neighborhoods