Introduction

Philadelphia, Pennsylvania, is a growing city in many ways. Philly’s population increased by 0.18 percent from 2010 to 2019, and it is the sixth-largest city in the United States. Philly has also experienced an increase in homicides, from 306 in 2010 to 351 in 2018. The aggravated gun assault rate increased by five percent. This rise in violence has left many Philadelphia natives, like me, looking for answers as to why violence continues to increase in the City of Brotherly Love.

In this project I explore the relationship between per capita income and the number of shootings in Philadelphia County, Pennsylvania. I used the Philly Open Data Portal, OpenDataPhilly, to gather census tract and shooting data. I got per capita income data by census tract from American FactFinder via the Census Bureau. This data was awkward to download, and the link to the exact table expires after every session. The page can be accessed via this report from the Missouri Census Data Center by clicking on section E3 reference table B19301, selecting 2017, clicking the Add/Remove Geographies button, and selecting all census tracts in Philadelphia County.

I hypothesized that areas with lower per capita incomes would have a higher number of shootings than areas with higher per capita incomes.

For reference, the estimated 2017 per capita income in the United States, according to American FactFinder, was 31,177 dollars. In Pennsylvania, the estimate was 31,476 dollars, compared to an estimate of 24,811 dollars in Philadelphia County.
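To put those three estimates on a common footing, here is a small sketch that expresses Philadelphia County's figure as a percentage of the national and state figures. The variable names are my own and are not part of the project code.

```r
# Per capita income estimates quoted above (2017, in dollars)
income_us     <- 31177
income_pa     <- 31476
income_philly <- 24811

# Philadelphia County's income as a percentage of each benchmark
pct_of_us <- round(100 * income_philly / income_us, 1)
pct_of_pa <- round(100 * income_philly / income_pa, 1)

pct_of_us  # about 79.6
pct_of_pa  # about 78.8
```

In other words, the county estimate sits roughly a fifth below both the national and the state estimates.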

Results

I first loaded the necessary packages and my data files (two GeoJSON files and one CSV file). I assigned a variable name to each dataset. I then joined the income data with the census tract data by the field GEOID10. I also created a minimalistic theme, called basic_theme, to add to each of my maps.

library(sf)
library(tidyverse)
library(tidytext)
library(tigris)
library(mapview)
library(raster)
library(leaflet)

philly <- read_sf("project/censustracts2010.geojson")

shootings <- read_sf("project/shootings.geojson")

#drop shootings with no recorded coordinates
shootings <- shootings %>% 
  filter(!is.na(point_x))

#read_csv() already returns a tibble
income <- read_csv("project/phillyincome.csv")

#join income data with tracts
combined <- merge(x = philly, y = income, by = "GEOID10", all.y = TRUE)
#rename NAMELSAD10
colnames(combined)[6] <- "Tract"
#theme
basic_theme <- theme(plot.title = element_text(size = 14),
                     panel.background = element_blank(),
                     axis.text = element_blank(),
                     axis.title = element_blank(),
                     axis.ticks = element_blank(),
                     axis.line = element_blank())

The map below shows almost all of the census tracts in Philadelphia. For the purposes of this project, census tracts that did not report per capita income, according to the US Census FactFinder, were excluded. Those tracts, which appear on the map as white/blank areas, are: 9803, 9805, 9808, 9809, 9804, 50, 9807, and 9806. Click on different census tracts to see the census tract number and per capita income.

mapview(combined, popup= popupTable(combined, zcol= c("Tract", "Income")))
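The excluded tracts fall out of the join because merge() with all.y = TRUE keeps only the rows that have income data. Here is a minimal sketch of how one might list the tracts that drop out, using toy data frames with made-up GEOID10 values in place of the real files.

```r
# Toy stand-ins for the tract and income tables (GEOID10 values are made up)
tracts <- data.frame(GEOID10 = c("001", "002", "003", "004"),
                     stringsAsFactors = FALSE)
income <- data.frame(GEOID10 = c("001", "002", "004"),
                     Income  = c(18000, 42000, 27000),
                     stringsAsFactors = FALSE)

# all.y = TRUE keeps every income row; tracts without income are dropped
combined_toy <- merge(x = tracts, y = income, by = "GEOID10", all.y = TRUE)

# Tracts present in the boundary table but absent from the income table
missing_income <- setdiff(tracts$GEOID10, income$GEOID10)
missing_income
```

Running the same setdiff() on the real GEOID10 columns would be one way to confirm the list of blank tracts, assuming both columns are stored as the same type.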

I then mapped the per capita income in each census tract using a gradient scale and plotted the shooting incidents on top. The darker a census tract is shaded, the lower its per capita income. Each yellow circle indicates a shooting incident.

#income gradient map by census tract, with shootings overlaid
combined %>% 
  ggplot() +
  geom_sf(aes(fill = Income)) +
  #fill scale (not color) so the gradient applies to the tract shading
  scale_fill_gradient(low = "#132B43", high = "#56B1F7") +
  geom_sf(data=shootings, shape=1, size=.5, color="yellow", alpha=0.3) +
  coord_sf(xlim = c(-75.287, -74.959), ylim=c(39.874, 40.146)) +
  labs(title = "Shooting Locations and Per Capita Income in Philadelphia County") +
  basic_theme +
  theme(plot.title = element_text(hjust = 0.5))
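An overlay like the one above shows clustering by eye, but a per-tract tally makes it concrete. The following is a minimal sketch of one way to count points per polygon with sf, using two toy square "tracts" and three toy "shootings" in plain planar coordinates. The square() helper, the toy data, and the column names are assumptions for illustration, not the project's actual data.

```r
library(sf)
library(dplyr)

# Hypothetical helper: a unit square with its lower-left corner at (x0, y0)
square <- function(x0, y0) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1),
                        c(x0, y0 + 1), c(x0, y0))))
}

# Two toy tracts and three toy shooting points
tracts <- st_sf(Tract = c("A", "B"),
                geometry = st_sfc(square(0, 0), square(1, 0)))
pts <- st_sf(id = 1:3,
             geometry = st_sfc(st_point(c(0.2, 0.5)),
                               st_point(c(0.7, 0.1)),
                               st_point(c(1.5, 0.5))))

# Attach the containing tract to each point, then count points per tract
per_tract <- st_join(pts, tracts, join = st_within) %>%
  st_drop_geometry() %>%
  count(Tract, sort = TRUE)

per_tract
```

With sort = TRUE the tract with the most points comes first, which is the kind of lookup needed to find the tract with the most shootings.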

Shootings appear to be clustered in the census tracts in the center of Philadelphia County and on the lower left side. It also looks like, generally, the darker the area is, and thus the lower the per capita income, the higher the number of shootings. Conversely, the lighter areas, with higher incomes, appear to have fewer shootings. The tract with the most shootings, tract 176.01, has 114 shootings. The per capita income of this tract, $9,870, is one of the lowest in Philly. Here’s a closer look at tract 176.01.

combined %>% 
  ggplot() +
  geom_sf(aes(fill = Income)) +
  #fill scale (not color) so the gradient applies to the tract shading
  scale_fill_gradient(low = "#132B43", high = "#56B1F7") +
  geom_sf(data=shootings, shape=1, size=1.5, color="yellow", alpha=0.8) +
  #zoom to tract
  coord_sf(xlim = c(-75.14, -75.127), ylim=c(39.99008, 39.99905)) +
  ggtitle("Shootings in Tract 176.01") +
  basic_theme +
  theme(plot.title = element_text(hjust = 0.5))
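The zoom limits above are typed in by hand. As an alternative, here is a sketch of deriving them from a tract's bounding box with st_bbox(), again on toy planar squares. The square() helper, the toy tract names, and the padding value are assumptions for illustration.

```r
library(sf)

# Hypothetical helper: a unit square with its lower-left corner at (x0, y0)
square <- function(x0, y0) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1),
                        c(x0, y0 + 1), c(x0, y0))))
}

tracts <- st_sf(Tract = c("A", "B"),
                geometry = st_sfc(square(0, 0), square(1, 0)))

# Bounding box of the tract of interest
target <- tracts[tracts$Tract == "B", ]
bb <- st_bbox(target)

# Pad the box slightly so the tract outline is not cut off at the plot edge
pad <- 0.1
xlim <- c(bb[["xmin"]] - pad, bb[["xmax"]] + pad)
ylim <- c(bb[["ymin"]] - pad, bb[["ymax"]] + pad)

# These could then be passed on as coord_sf(xlim = xlim, ylim = ylim)
xlim
ylim
```

For real data in longitude and latitude the padding would need to be much smaller than 0.1, since that is a tenth of a degree.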

Though it appears that tracts with lower incomes generally have more shootings, let’s test this by grouping the census tracts into income brackets and counting the number of shootings within each bracket.

To do this, I first combined my variable “combined”, which contains information for Philly census tracts and income, with the shooting information. From there, I took my newly created variable, phillyShootings, and filtered it by specific income values to create my income brackets. Using the counts returned by the count() function for each bracket, I built a new data frame of income brackets and shooting counts in Excel. I imported this data back into R and visualized it.

#combine income/philly data with shooting data and filter out places with no shootings
combined %>% 
  st_join(shootings) %>% 
  filter(!is.na(point_x)) -> phillyShootings

#split shooting incidents by income
#show shootings where income was less than 20,000
phillyShootings %>% 
  filter(Income < 20000) -> income20orLess
#show shootings where income was 20000 to 39999
phillyShootings %>% 
  filter(Income >= 20000 & Income < 40000) -> income20to40
#show shootings where income was 40000 to 59999
phillyShootings %>% 
  filter(Income >= 40000 & Income < 60000) -> income40to60
#show shootings where income was 60000 to 79999
phillyShootings %>% 
  filter(Income >= 60000 & Income < 80000) -> income60to80
#show shootings where income was 80000 to 99999
phillyShootings %>% 
  filter(Income >= 80000 & Income < 100000) -> income80to100
count(income20orLess)
count(income20to40)
count(income40to60)
count(income60to80)
count(income80to100)

brackets <- read_csv("project/incomeBracketsShootings.csv")
brackets%>% 
  ggplot(aes(reorder(Income, Shootings), Shootings)) + geom_col(fill= "#00BA38") + coord_flip() +
  labs(y = "Number of Shootings", x = "Income Bracket ($)", title = "Shootings by Income Bracket in Philly") +
  theme(plot.title = element_text(hjust = 0.5))
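The bracket counts came from five separate filters and a round trip through Excel. As an alternative, here is a sketch of doing the same bucketing in one step with base R's cut() and table(). The income values are toy numbers, one per shooting, and the label text is my own choice.

```r
# Toy data: per capita income of the tract for each of seven shootings
incomes <- c(9870, 19999, 20000, 39999, 40000, 75000, 99999)

# Bracket edges; right = FALSE makes each bracket [lower, upper)
breaks <- c(0, 20000, 40000, 60000, 80000, 100000)
labels <- c("< 20,000", "20,000-39,999", "40,000-59,999",
            "60,000-79,999", "80,000-99,999")

bracket <- cut(incomes, breaks = breaks, labels = labels, right = FALSE)

# One row per bracket, ready to plot with geom_col()
counts <- as.data.frame(table(Income = bracket), responseName = "Shootings")
counts
```

Because right = FALSE closes each interval on the left, a value of exactly 20,000 lands in the second bracket, matching the boundaries used in the filters.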

A clearer trend now emerges. The lowest income bracket, which includes the census tracts with a per capita income (in dollars) of less than 20,000, had 4,338 shootings. The income bracket of 20,000 to 39,999 had 969 shootings. The income bracket of 40,000 to 59,999 had 58 shootings. The income bracket of 60,000 to 79,999 had 23 shootings. The highest income bracket, 80,000 to 99,999, had just 3 shootings. There was one census tract with a per capita income higher than 100,000 dollars, but there were no shootings in that tract.
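To make the size of the gap easier to see, the reported counts can be turned into shares of all bracketed shootings. This is a small sketch built directly from the five counts above; the data frame and the Share column are my own additions.

```r
# Shooting counts per income bracket, as reported above
brackets <- data.frame(
  Income    = c("< 20,000", "20,000-39,999", "40,000-59,999",
                "60,000-79,999", "80,000-99,999"),
  Shootings = c(4338, 969, 58, 23, 3)
)

# Total across brackets and each bracket's percentage share
total <- sum(brackets$Shootings)
brackets$Share <- round(100 * brackets$Shootings / total, 1)

total     # 5391
brackets
```

By this calculation the lowest bracket accounts for roughly four out of every five shootings counted.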

Conclusion

These findings support my hypothesis that areas with lower per capita incomes would have a higher number of shootings than areas with higher per capita incomes. The lowest income bracket, less than 20,000 dollars, had far more shootings than every higher income bracket. This analysis could be improved with income data from every census tract. Missing data for only 8 tracts out of hundreds is a small gap, but including those tracts would have painted a more complete picture.

The shootings dataset provides ample opportunity for further exploration both on its own and in conjunction with other datasets. It would be interesting to analyze where fatal and nonfatal shootings are concentrated, as well as the race, sex, and age of the victims. It would also be interesting to explore how shootings in Philadelphia have changed over time, using the shooting data available from 2010 to 2019, in terms of indoor versus outdoor shootings, where the victim was wounded, and the time of day the shootings take place.