My project demonstrates the increase in COVID-19 cases from January 2020 through April 2022 in each of the 67 counties in Pennsylvania. I will be creating maps of the state by county to illustrate the number of cases reported each year, as well as line charts to show the counties with the highest and lowest numbers of COVID-19 cases. The overall goal of my project is to show the growth in Pennsylvania counties’ case counts in 2020, 2021 and 2022. I retrieved my data for this project from the New York Times COVID-19 data set (https://github.com/nytimes/covid-19-data) and from Pennsylvania’s Open Data site (https://data.pa.gov/Covid-19/COVID-19-Aggregate-Cases-Current-Daily-County-Heal/j72v-r42c).
I predict that the counties in Pennsylvania with higher population densities, such as the counties containing major cities like Philadelphia and Pittsburgh, will report higher numbers of COVID-19 cases in each year of the pandemic, along with the most rapid increases. Population density is defined as the number of people per unit of area, calculated here for each county.
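To make the density definition concrete, here is a minimal sketch of the calculation. The county names, populations and land areas are hypothetical placeholders, not real census figures, and the column names are my own assumptions.

```r
library(dplyr)
library(tibble)

# Hypothetical placeholder values, NOT real census figures.
counties <- tibble(
  county           = c("County A", "County B"),
  population       = c(500000, 20000),
  land_area_sq_mi  = c(100, 400)
)

# Density = residents per square mile of land area, highest first.
county_density <- counties %>%
  mutate(density_per_sq_mi = population / land_area_sq_mi) %>%
  arrange(desc(density_per_sq_mi))

print(county_density)
```

With real county populations and land areas in place of the placeholders, the same two lines of `mutate()` and `arrange()` would rank all 67 counties by density.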
The project has limitations with regard to reported cases. The PA Open Data site states: “Note that case counts by date of report are influenced by a variety of factors, including but not limited to testing availability, test ordering patterns (such as day of week patterns), labs reporting backlogged test results, and mass screenings in nursing homes, workplaces, schools, etc. Case reports received without a patient address are assigned to the county of the ordering provider or facility based on provider zip code. Cases reported with a residential address that does not match to a known postal address per the commonwealth geocoding service are assigned to a county based on the zip code of residence. Many zip codes cross county boundaries so there is some degree of misclassification of county.”
First, I loaded the necessary packages and merged the NYT county data set with the 2010 Census state population totals using the following code, in order to be able to create the necessary plots.
library(sp)
library(rgdal)
library(tidycensus)
library(tidyverse)
census_api_key("2b4796102d9b2eee2cf3d1184a1c52f815562624", overwrite =TRUE)
Sys.getenv("CENSUS_API_KEY")
## [1] "2b4796102d9b2eee2cf3d1184a1c52f815562624"
total_population_10 <- get_decennial(
geography = "state",
variables = "P001001",
year = 2010
)
colnames(total_population_10)[2] <- "state"
covid <- read_csv("~/Desktop/MEA3290/R/geo spatial project/covid-19-data-master/us-counties.csv")
total_population_10 %>%
full_join(covid, by = "state") -> merged_covid
write_csv(merged_covid, "~/Desktop/covid.csv")
merged_covid %>%
filter(state %in% "Pennsylvania") %>%
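# Rates per 1,000,000 residents; `value` is the statewide 2010 Census population from the join above.
mutate(case_rate = (cases / value * 1000000)) %>%
mutate(death_rate = (deaths / value * 1000000)) -> merged_covid_rate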
library(lubridate)
ymd(merged_covid$date) -> merged_covid$date
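Because the population table above is at the state level, the rates it produces divide each county's counts by the statewide population. As a hedged sketch only, a county-level rate could be computed by joining a county population table on the FIPS code. The helper `add_county_rates()`, the `population` column and all values below are hypothetical and are not part of the original analysis.

```r
library(dplyr)
library(tibble)

# Hypothetical rows shaped like the NYT us-counties.csv columns.
covid_toy <- tibble(
  date   = as.Date(c("2022-04-30", "2022-04-30")),
  county = c("County A", "County B"),
  state  = c("Pennsylvania", "Pennsylvania"),
  fips   = c("42001", "42003"),
  cases  = c(1000, 300),
  deaths = c(20, 3)
)

# Hypothetical county populations keyed by the same FIPS codes.
county_pop_toy <- tibble(
  fips       = c("42001", "42003"),
  population = c(100000, 50000)
)

# Join county populations and compute rates per `per` residents.
add_county_rates <- function(covid_df, pop_df, per = 100000) {
  covid_df %>%
    left_join(pop_df, by = "fips") %>%
    mutate(
      case_rate  = cases / population * per,
      death_rate = deaths / population * per
    )
}

rates <- add_county_rates(covid_toy, county_pop_toy)
print(rates)
```

The join only works if `fips` has the same type in both tables, so it may need converting with `as.character()` after reading the real files.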
The first map, which I created in Tableau, represents the COVID-19 cases in Pennsylvania by county in 2020. Each later map builds upon this original 2020 map.
On this map, the darker the blue hue on the county, the more cases the county has. In 2020, Philadelphia (93,885), Allegheny (53,809), Montgomery (36,131) and Bucks (30,537) counties had the highest reported number of COVID-19 cases.
The second map, which I created in Tableau, represents the COVID-19 cases in Pennsylvania by county from the start of the pandemic in 2020 through the end of 2021.
On this map, the darker the blue hue on the county, the more cases the county has. From 2020 through 2021, Philadelphia (225,031), Allegheny (176,336), Montgomery (110,575) and Bucks (91,995) counties had the highest reported number of COVID-19 cases.
The third map, which I created in Tableau, represents the COVID-19 cases in Pennsylvania by county from the start of the pandemic in 2020 through April 2022.
On this map, the darker the blue hue on the county, the more cases the county has. From 2020 through 2022, Philadelphia (311,825), Allegheny (265,744), Montgomery (154,008) and Bucks (124,051) counties had the highest reported number of COVID-19 cases.
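The maps were built in Tableau, but cumulative totals like these could also be pulled in R. The sketch below assumes a table of cumulative counts per county and date, as in the NYT file, and keeps the last available row for each county in each year. The data values are hypothetical placeholders.

```r
library(dplyr)
library(lubridate)
library(tibble)

# Hypothetical cumulative counts, NOT real case numbers.
cumulative_toy <- tibble(
  date   = as.Date(c("2020-12-30", "2020-12-31", "2021-12-31", "2022-04-30",
                     "2020-12-31", "2021-12-31")),
  county = c(rep("County A", 4), rep("County B", 2)),
  cases  = c(90, 100, 250, 300, 10, 40)
)

# For each county and calendar year, keep the row with the latest date.
# Because the counts are cumulative, that row holds the running total.
year_end_totals <- cumulative_toy %>%
  mutate(year = year(date)) %>%
  group_by(county, year) %>%
  slice_max(date, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(county, year, date, cases) %>%
  arrange(county, year)

print(year_end_totals)
```

For a partial year such as 2022, the latest row is simply the last date in the data rather than December 31.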
This continuous line chart shows the daily spikes in newly reported COVID-19 cases from the beginning of 2020 through the end of April 2022. Each colored line represents a different county in the state, identified by its FIPS code.
The highest number of new cases reported in a single day was 6,239, in Philadelphia county on January 7, 2022. The single-day high for 2021 was 1,237 on December 9, and the single-day high for 2020 was 1,508 on December 1, both in Philadelphia county.
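Since the NYT file stores cumulative counts, daily new cases have to be derived by differencing. The following is a minimal sketch of that step, assuming one row per county per day. The values are hypothetical placeholders and the object names are my own.

```r
library(dplyr)
library(tibble)

# Hypothetical cumulative counts, NOT real case numbers.
cumulative_toy <- tibble(
  date   = as.Date(c("2022-01-05", "2022-01-06", "2022-01-07",
                     "2022-01-05", "2022-01-06", "2022-01-07")),
  county = rep(c("County A", "County B"), each = 3),
  cases  = c(100, 130, 190, 10, 10, 15)
)

# New cases = today's cumulative count minus the previous day's,
# computed separately within each county.
daily_new <- cumulative_toy %>%
  group_by(county) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(new_cases = cases - dplyr::lag(cases)) %>%
  ungroup()

# The single highest day for each county (first day has no difference).
peak_days <- daily_new %>%
  filter(!is.na(new_cases)) %>%
  group_by(county) %>%
  slice_max(new_cases, n = 1, with_ties = FALSE) %>%
  ungroup()

print(peak_days)
```

On real data, reporting corrections can make a cumulative count drop from one day to the next, which would show up here as a negative `new_cases` value.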
This first line graph I coded shows the top four counties with the highest number of reported COVID-19 cases from 2020 through 2022: Philadelphia, Allegheny, Montgomery and Bucks counties.
options(scipen = 999)
merged_covid_rate %>%
filter(state %in% "Pennsylvania") %>%
filter(county %in% c("Philadelphia", "Allegheny", "Montgomery", "Bucks")) %>%
ggplot(aes(date, cases, color = county)) + geom_line()
Philadelphia county was the only county to surpass 300,000 reported cases by April 2022. Allegheny county followed with cases in the 200,000s, and both Montgomery and Bucks counties surpassed the 100,000 mark.
This second line graph I coded shows the four counties with the lowest number of reported COVID-19 cases from 2020 through 2022: Potter, Forest, Sullivan and Cameron (with the smallest number) counties.
merged_covid_rate %>%
filter(state %in% "Pennsylvania") %>%
filter(county %in% c("Potter", "Forest", "Sullivan", "Cameron")) %>%
ggplot(aes(date, cases, color = county)) + geom_line()
Cameron county had the fewest reported cases, at fewer than 1,000, followed by Sullivan county with just over 1,000 cases as of April 2022. Forest county had just over 2,000 reported cases, and Potter county had just over 3,000.
On March 6, 2020, the first two COVID-19 cases were announced in the state, in Delaware and Wayne counties. On March 18, the first COVID-19 related death in the state was announced, and the following day the state ordered all non-essential businesses to close. PA Governor Tom Wolf announced that schools would not return in person for the rest of the school year, and the state remained in lockdown through late May. The major COVID-19 case spikes of 2020 came at the end of April, in September, and from late October through the end of December. After the virus entered the state in early March, it spread at a rapid pace up to the year’s first high on April 30 (784 cases in Philadelphia county alone). The next spike occurred in September, as the state’s schools returned to in-person or hybrid learning and many of PA’s colleges and universities returned to campus. With larger groups of people gathering in one place, cases began increasing faster than they had in the spring, when PA was in a state of lockdown. Finally, from late October through the very end of 2020, cases in counties throughout the state reached an all-time high for the year as winter, the holidays and the first flu season since the start of the pandemic approached. During this period, the CDC continued its research on the virus and announced a vaccine rollout for high-risk groups.
From January through April 2021, COVID-19 cases steadily increased, with the greatest spikes in Philadelphia and Allegheny counties at just over 1,000 cases reported per day. From May to mid-August, case counts were at a record low in all counties throughout the state. During late winter and throughout the spring and early summer, people began getting their first and second doses of vaccines in the phases that the state created: phase 1a (healthcare workers and nursing homes), 1b (population ages 75+ and essential workers) and 1c (pregnant women, people with existing health issues and other essential workers), then phase 2 (population over 16 years old). Then, throughout the fall and early winter, cases began to steadily increase again. In December, cases spiked rapidly, with over 1,000 new ones reported daily, as a new variant of the virus was discovered: the SARS-CoV-2 Omicron variant.
The Omicron variant, which was discovered and ran rampant in late 2021, continued to spread into January 2022, causing a record daily high of 6,239 reported cases on January 7 in Philadelphia county. Various schools and colleges throughout the state then went virtual during January in an attempt to curb the further spread of the new variant. After a dip in cases in February and a slight increase in March, the state has begun watching case counts more cautiously, and Philadelphia became one of the first major cities to reinstate an indoor mask mandate on April 11, only to revoke it a few days later. Statewide, it is up to the discretion of individual places whether or not people should wear a mask.
According to the World Atlas (https://www.worldatlas.com/articles/the-10-biggest-cities-in-pennsylvania.html), Pennsylvania covers an area of 119,283 square kilometers with a population of over 12 million people. In the United States, it is the 33rd most extensive state and the 5th most populated state. Philadelphia (Philadelphia county) and Pittsburgh (Allegheny county) are the two most populated cities in the state. Philadelphia county is the most populated county, while Cameron county is the least populated with just over 5,000 residents. Following the two major cities of the state, the next most densely populated area is part of the Delaware Valley region, the area where the Delaware River winds to the Atlantic Ocean according to “Philadelphia Information” (https://phillyseaperch.org/blog/what-is-the-greater-philadelphia-area-solved.html). Also known as the Greater Philadelphia area, this includes the counties surrounding Philadelphia: Montgomery, Bucks, Chester and Delaware. These highly populated counties include smaller population centers such as Bensalem, King of Prussia, Norristown and Upper Darby, all very densely populated.
With a low number of people per square unit of area, it makes sense that Cameron county has had the lowest number of reported COVID-19 cases throughout the pandemic. Conversely, since Philadelphia and Pittsburgh are the state’s two most densely populated cities, it also makes sense that Philadelphia and Allegheny counties would have the highest and second-highest numbers of reported COVID-19 cases. Finally, the counties with the third- and fourth-highest numbers of reported COVID-19 cases, Montgomery and Bucks, both fall into the highly populated region surrounding Philadelphia.
Overall, my hypothesis proved to be accurate: the counties with higher population density, like Philadelphia and Allegheny, along with Montgomery and Bucks counties in the Greater Philadelphia area, had the highest numbers of reported COVID-19 cases consistently in each year of the pandemic so far.