Introduction

Philadelphia is Pennsylvania’s largest city and the sixth most populous U.S. city, with about 1.584 million residents. The City of Brotherly Love is best known for its history and art. Home to many celebrities, athletes, and innovators, Philadelphia is rich in culture. However, more than 400,000 residents, about 26% of the population, live below the poverty line. Parts of the city are wealthy and flourishing, while neighboring areas struggle with crime and poverty. Recently, the COVID-19 pandemic (the coronavirus) has taken a toll on the city as a whole. With only a limited number of hospitals and resources, it is likely that not everyone will get the treatment they need.

This report looks at the relationship between coronavirus cases and median income in Philadelphia County. In this analysis, the county is divided by zip code. As of this writing (04/28/2020), there have been a total of 43,264* coronavirus cases in Philadelphia, with 1,716 deaths. I hypothesize that zip codes with a lower median income will have more coronavirus cases because residents have fewer resources and less funding to get assistance.

*Total case counts include confirmed and probable cases.

Coronavirus Data

The coronavirus data was pulled from OpenDataPhilly. It provides a geospatial map that links each Philadelphia County zip code with its number of cases. The case data is current through April 28, 2020.

Median Income Data

To get the median income by zip code in Philadelphia County, I went to philly.com’s “Income and Poverty in the Philadelphia Region” interactive map.

With this information, I created a CSV listing each zip code with its reported median income and total coronavirus cases.

library(readr)
income <- read_csv("/Users/raynamason/Desktop/finalzipcodeincome.csv")
## Parsed with column specification:
## cols(
##   `Zip Code` = col_double(),
##   `Median Income` = col_double(),
##   Cases = col_double()
## )
income
## # A tibble: 46 x 3
##    `Zip Code` `Median Income` Cases
##         <dbl>           <dbl> <dbl>
##  1      19116           48058   849
##  2      19154           58319   869
##  3      19115           45799  1072
##  4      19114           54074   747
##  5      19111           44123  1711
##  6      19152           50584   911
##  7      19136           44641  1012
##  8      19149           36906  1033
##  9      19135           35859   723
## 10      19137           43522   256
## # … with 36 more rows
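
A combined table like the one above could be assembled by joining an income table and a case table on the zip code. The following is only a minimal sketch: the table names `income_by_zip` and `cases_by_zip` are assumptions made for illustration, and the values are the first three rows printed above, not the full data set.

```r
library(dplyr)

# Two small illustrative tables. The names are hypothetical; the values are
# the first three rows of the combined table printed above.
income_by_zip <- tibble(
  `Zip Code` = c(19116, 19154, 19115),
  `Median Income` = c(48058, 58319, 45799)
)
cases_by_zip <- tibble(
  `Zip Code` = c(19116, 19154, 19115),
  Cases = c(849, 869, 1072)
)

# Keep only zip codes that appear in both sources, matched on the zip code
combined <- inner_join(income_by_zip, cases_by_zip, by = "Zip Code")
combined
```

An inner join drops any zip code that is missing from either source, so the row count of the result is a quick check that no zip code was lost.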

Comparative Map

This CSV file with the combined information allowed me to make another geospatial map matching each Philadelphia zip code with its median income and coronavirus case count.

Separating Income

A bar graph was also created comparing only categorized income levels and coronavirus cases. This helps to show the relationship between the two variables in another way.

library(tidyverse)
## ── Attaching packages ───────────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.2.1     ✓ dplyr   0.8.4
## ✓ tibble  2.1.3     ✓ stringr 1.4.0
## ✓ tidyr   1.0.2     ✓ forcats 0.4.0
## ✓ purrr   0.3.3
## ── Conflicts ──────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
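# Bin each zip code's median income into three levels, then plot cases per level
income %>%
  select(`Median Income`, Cases) %>%
  mutate(`Income Level` = case_when(
    `Median Income` < 25000 ~ "< $25,000",
    `Median Income` < 50000 ~ "$25,000 - $50,000",
    `Median Income` >= 50000 ~ "$50,000 and Up")) %>%
  # Order the levels from lowest to highest income instead of alphabetically
  mutate(`Income Level` = factor(`Income Level`,
    levels = c("< $25,000", "$25,000 - $50,000", "$50,000 and Up"))) %>%
  # geom_col() stacks the zip codes, so each bar shows the level's total cases
  ggplot(aes(`Income Level`, Cases)) + geom_col()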
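
Because the bar graph adds up cases across every zip code in an income level, a level that contains more zip codes will tend to have a taller bar. One way to check this is to tabulate the number of zip codes and the average cases per zip code alongside the totals. The sketch below uses only the ten rows printed earlier, not all 46, and the snake_case column names are assumptions made for illustration.

```r
library(dplyr)

# The ten rows shown in the printed tibble (a sample, not the full data set)
sample_income <- tibble(
  median_income = c(48058, 58319, 45799, 54074, 44123,
                    50584, 44641, 36906, 35859, 43522),
  cases = c(849, 869, 1072, 747, 1711, 911, 1012, 1033, 723, 256)
)

level_summary <- sample_income %>%
  mutate(income_level = case_when(
    median_income < 25000 ~ "< $25,000",
    median_income < 50000 ~ "$25,000 - $50,000",
    median_income >= 50000 ~ "$50,000 and Up")) %>%
  group_by(income_level) %>%
  summarise(
    zip_codes = n(),            # how many zip codes fall in this level
    total_cases = sum(cases),   # what the stacked bar height represents
    mean_cases = mean(cases)    # average cases per zip code in the level
  )
level_summary
```

Comparing `mean_cases` across levels gives a fairer picture than `total_cases` when the levels contain different numbers of zip codes.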

Conclusion

After looking at the maps and the graph, zip codes with higher median incomes generally appear to have fewer coronavirus cases. There are a few possible reasons for this, one being the size of the zip code boundary. On the map, most areas with higher median incomes are smaller than the others. This means that fewer people live in that space, so fewer total cases are expected. This analysis does not calculate the ratio of coronavirus cases to population in each zip code. While the total number of cases may be lower in higher-income zip codes, the infection rate per resident could still be higher because those areas are smaller. Another thing to note is that most of the higher-income zip codes are located in the middle of Philadelphia. Center City is filled with museums, restaurants, and other businesses. This means that there are fewer residential areas, which contributes to both a higher median income and a lower overall population. This can skew the data so that zip codes with higher median incomes appear to have fewer coronavirus cases.
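
The missing cases-to-population ratio could be computed as cases per 1,000 residents. The sketch below is illustrative only: the case counts come from the table earlier in this report, but the `population` values are placeholders invented to make the example run, not census figures, and the column names are assumptions.

```r
library(dplyr)

zip_rates <- tibble(
  zip_code = c(19116, 19154, 19115),
  cases = c(849, 869, 1072),            # from the table in this report
  population = c(20000, 30000, 40000)   # HYPOTHETICAL placeholder values
) %>%
  # Scale the raw count by population so zip codes of different sizes compare fairly
  mutate(cases_per_1000 = cases / population * 1000)
zip_rates
```

With these placeholder populations, the zip code with the most total cases ends up with the lowest rate per 1,000 residents, which is the kind of reversal that raw case counts can hide. Real population figures for each zip code would be needed to draw any conclusion.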

The second thing to look at is the distribution of hospitals in Philadelphia. OpenDataPhilly provided a geospatial map showing hospitals in Philadelphia. I excluded rehabilitation facilities, long-term care facilities, and behavioral health centers because they would not be used to treat COVID-19. The map below shows the locations of general medical hospitals in Philadelphia County.

This map shows that areas with a higher median income have more hospitals near them. Again, this could be because those areas are located toward the center of Philadelphia. Many of those hospitals are affiliated with colleges and universities such as the University of Pennsylvania and Thomas Jefferson University. These universities are located in or near Center City, which puts their hospitals within reach of residents in the higher-income zip codes. The coronavirus is not easily treatable and there are many things that we do not know about it, but being close to a hospital is valuable.

Besides hospitals, access to basic necessities is also important during this pandemic. This includes food, water, and the ability to earn an income. People who have more money can better afford businesses being closed down. For them, ordering food and groceries and staying inside is feasible. People living in lower-income neighborhoods still need to go out to get things and may have to continue working under riskier conditions to maintain their livelihoods. This makes them more susceptible to catching the coronavirus. In all, having a higher median household income does not mean that one is less likely to catch the virus. It does provide an advantage, because these residents are closer to, and have better access to, important resources that help during this pandemic.