As of May 5th, 2020, there were 11,848 confirmed cases of Covid-19 in North Carolina. On May 5th there were 408 new confirmed cases and 22 Covid-19 related deaths. For the purpose of this analysis, ‘corona,’ ‘covid,’ and ‘covid-19’ will be used interchangeably, all meaning the COVID-19 coronavirus that is currently sweeping the world. Worldwide, there are 3.68 million confirmed cases, 1.21 million recoveries, and 258 thousand Covid-related deaths. Google recently released to the public movement-tracking data gathered from the portable personal devices (cell phones, tablets, etc.) of Google users. In this analysis, I will be analyzing the area around Elon, North Carolina, as well as surrounding counties, in order to inform the student body and other interested parties about the cases in relation to movement trends. I will be using R and Tableau to build the visualizations. The Google dataset aims to provide insight into how much people have reduced their movement in response to the restrictions put in place. Its categories are residential, groceries and pharmacies, retail and recreation, parks, transit stations, and workplaces.
As of May 5th, there is a stay-at-home order in North Carolina. This order was issued on March 27th by Governor Roy Cooper, and is in place until Friday, May 8th. This order calls for “people to stay at home except to visit essential businesses, to exercise outdoors or to help a family member. Specifically, the order bans gatherings of more than 10 people and directs everyone to physically stay at least six feet apart from others.” It is important to note that with this state of emergency and stay-at-home declaration, all state parks have been closed. The full order is available here. The first phase of lifted restrictions, beginning Friday, May 8th, will allow people to gather with non-family members and allow all non-essential businesses to open at 50 percent capacity. Will this movement increase the spread of Coronavirus, and is it safe for North Carolina citizens to go out in public again? How much does movement impact the curve of Coronavirus infections and deaths? Here’s what we’ve seen in the data since Corona started.
I first had to load my packages and data files into R. I had to manually specify the column types for the mobility file because R was automatically parsing the date column incorrectly.
library(tidyverse)
## ── Attaching packages ──────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.0 ✓ purrr 0.3.3
## ✓ tibble 2.1.3 ✓ dplyr 0.8.4
## ✓ tidyr 1.0.2 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.4.0
## ── Conflicts ─────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(readr)
states <- read_csv("~/Desktop/R stuff/tableau/covid-19-data-master/us-states.csv")
## Parsed with column specification:
## cols(
## date = col_character(),
## state = col_character(),
## fips = col_double(),
## cases = col_double(),
## deaths = col_double()
## )
counties <- read_csv("~/Desktop/R stuff/tableau/covid-19-data-master/us-counties.csv")
## Parsed with column specification:
## cols(
## date = col_character(),
## county = col_character(),
## state = col_character(),
## fips = col_double(),
## cases = col_double(),
## deaths = col_double()
## )
mobility <- read_csv("~/Desktop/R stuff/tableau/Global_Mobility_Report.csv",
  col_types = cols(
    country_code = col_character(),
    country = col_character(),
    state = col_character(),
    county = col_character(),
    date = col_character(),
    retail_recreation = col_double(),
    grocery_pharmacy = col_double(),
    parks = col_double(),
    transit = col_double(),
    workplace = col_double(),
    residential = col_double()
  ))
mobility %>%
filter(country_code %in% "US") -> move
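The date columns above are read in as character strings. As a minimal sketch, assuming the dates are stored in ISO year-month-day form (an assumption; check the raw files), they could be converted to R's Date class after loading. The toy table below is a hypothetical stand-in for the counties data, with invented values.

```r
# Hypothetical stand-in for the `counties` table; the numbers are invented.
toy <- data.frame(
  date   = c("2020-05-03", "2020-05-04", "2020-05-05"),
  county = "Guilford",
  deaths = c(10, 12, 13),
  stringsAsFactors = FALSE
)

# Convert the character column to Date, assuming ISO "YYYY-MM-DD" strings.
toy$date <- as.Date(toy$date, format = "%Y-%m-%d")

# A failed parse shows up as NA, so check before going further.
stopifnot(!anyNA(toy$date))

class(toy$date)
range(toy$date)
```

With a Date column, ggplot2 treats the x axis as a continuous date scale and chooses a small number of labelled breaks on its own, which may make it unnecessary to blank the axis text.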
Next, I analyzed the confirmed deaths per day in Tableau:
As you can see, the number of cases increases exponentially in each county over time. The hot spots in North Carolina are Mecklenburg, Wake, Durham, Wayne, and Guilford. This is because major cities and hospitals are in these counties, and cases are registered where they are diagnosed or treated, not where the individual lives. To clean up the graphs' appearance, I removed the x-axis labels. All of the death data is from 3.18.2020 to 5.04.2020, and the mobility data is from 2.15.2020 to 4.26.2020.
counties %>%
  filter(state == "North Carolina",
         county %in% c("Wake", "Durham", "Wayne", "Guilford", "Mecklenburg")) %>%
  ggplot(aes(date, cases, group = county, color = county)) +
  geom_line() +
  theme(axis.text.x = element_blank())
counties %>%
  filter(state == "North Carolina",
         county %in% c("Wake", "Durham", "Wayne", "Guilford", "Mecklenburg")) %>%
  ggplot(aes(date, deaths, group = county, color = county)) +
  geom_line() +
  theme(axis.text.x = element_blank())
The plot on the left shows cases and the plot on the right shows deaths. It is also clear that the curves for these counties have not leveled off yet, but this may not be representative of all of North Carolina, for the reasons given above. Because many cases may go uncounted due to a lack of reporting or asymptomatic patients, death counts are more reliable than case counts, so deaths are what we will analyze moving forward. In many counties in North Carolina, there is not enough data to create a complete mobility analysis, so we will look only at the five counties named above.
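The deaths column in the county data appears to be a running total (an assumption, based on the steadily rising curves), so whether a curve is leveling off may be easier to judge from day-to-day differences. Here is a minimal sketch in base R, using a hypothetical toy table with invented numbers rather than the real files:

```r
# Hypothetical cumulative death counts for two counties; values are invented.
toy <- data.frame(
  date   = as.Date(rep(c("2020-05-01", "2020-05-02", "2020-05-03"), times = 2)),
  county = rep(c("Wake", "Durham"), each = 3),
  deaths = c(10, 12, 15, 4, 4, 7),
  stringsAsFactors = FALSE
)

# Sort so that each county's rows are in date order before differencing.
toy <- toy[order(toy$county, toy$date), ]

# Within each county, subtract the previous day's total from the current one.
# The first day has no previous value, so it is left as NA.
toy$new_deaths <- ave(toy$deaths, toy$county,
                      FUN = function(x) c(NA, diff(x)))

toy
```

A flattening cumulative curve corresponds to new_deaths shrinking toward zero, which can be easier to see than a change in slope.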
The mobility in these counties is well under the baseline for workplace, transit, grocery and pharmacy, and retail and recreation, as can be seen here. Parks and residential areas show higher variability, as seen below:
move %>%
  filter(state == "North Carolina",
         county %in% c("Wake County", "Durham County", "Wayne County",
                       "Guilford County", "Mecklenburg County")) %>%
  ggplot(aes(date, residential, group = county, color = county)) +
  geom_line() +
  theme(axis.text.x = element_blank())
move %>%
  filter(state == "North Carolina",
         county %in% c("Wake County", "Durham County", "Wayne County",
                       "Guilford County", "Mecklenburg County")) %>%
  ggplot(aes(date, parks, group = county, color = county)) +
  geom_line() +
  theme(axis.text.x = element_blank())
This residential movement could be an indication of walks being taken around neighborhoods, which is acceptable within the boundaries of quarantine, and the park movement could also be individual exercise. To see this, we can take a look at the county with the highest park mobility rate, Guilford County. The thick red line is park mobility and the thick purple line is residential mobility. These are the only two categories that trend above the 0 baseline of average movement.
nc6 <- move %>%
  filter(state == "North Carolina", county == "Guilford County")
ggplot(nc6, aes(date, workplace, group = county)) +
  geom_line(color = "thistle4", size = 1) +                                # workplace
  geom_line(aes(y = parks), color = "red", size = 2) +                     # parks (thick)
  geom_line(aes(y = retail_recreation), color = "orangered4", size = 1) +  # retail and recreation
  geom_line(aes(y = grocery_pharmacy), color = "hotpink4", size = 1) +     # grocery and pharmacy
  geom_line(aes(y = transit), color = "gold4", size = 1) +                 # transit
  geom_line(aes(y = residential), color = "plum4", size = 2) +             # residential (thick)
  theme(axis.text.x = element_blank())
counties %>%
  filter(state == "North Carolina", county == "Guilford") %>%
  ggplot(aes(date, deaths, group = county)) +
  geom_line() +
  theme(axis.text.x = element_blank())
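The Guilford County mobility plot above assigns each line a color by hand, so the reader has to be told which color is which. One alternative, sketched here on a hypothetical toy table with invented values and only three of the six categories, is to reshape the data to long format so that ggplot2 builds the legend itself. It assumes the tidyr and ggplot2 packages (both part of the tidyverse loaded earlier) are available.

```r
library(tidyr)
library(ggplot2)

# Hypothetical mobility values for a single county; the numbers are invented.
toy <- data.frame(
  date        = as.Date(c("2020-04-24", "2020-04-25", "2020-04-26")),
  parks       = c(30, 45, 10),
  residential = c(12, 9, 11),
  workplace   = c(-40, -25, -22)
)

# One row per date and category instead of one column per category.
long <- pivot_longer(toy, cols = -date,
                     names_to = "category", values_to = "pct_change")

# Mapping color to the category column produces a legend automatically;
# the dashed line marks the 0 baseline.
p <- ggplot(long, aes(date, pct_change, color = category)) +
  geom_line() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(x = NULL, y = "Change from baseline (%)", color = "Category")
p
```

The same reshaping would apply to all six mobility columns; the trade-off is that line widths and colors are then set through scales rather than per layer.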
Overall, the data shows that residents in these North Carolina counties have mostly been following quarantine. Although they go to parks, it is possible that they find secluded areas in the park to exercise or hang out in. It is not possible to fully conclude how much this quarantining has impacted the curve, because the North Carolina population has been so vigilant that there is little unrestricted movement to compare against. This data will be important to monitor in the upcoming weeks as the state reopens, however, because it will be possible to measure which areas of mobility result in the highest infection and death rates. It will also be possible to compare, by county, how much the overall mobility is changing and how much this impacts the Corona curve. This is an ever-changing situation, as the data literally changes by the thousands daily, and with changes in quarantine rules those changes are expected to grow even larger. Keep up with these changes at your local news source or the CDC website, which is updated daily.
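Comparing mobility with deaths by county requires a shared key between the two tables. In the code above, the case data names counties as "Guilford" while the mobility data uses "Guilford County", so one way to line them up is to strip the suffix and merge on county and date. This is only a sketch in base R, using hypothetical toy tables with invented values and assuming both date columns have already been converted to the Date class.

```r
# Hypothetical extracts of the two data sets; all values are invented.
deaths_toy <- data.frame(
  date   = as.Date(c("2020-04-25", "2020-04-26")),
  county = "Guilford",
  deaths = c(20, 22),
  stringsAsFactors = FALSE
)
mobility_toy <- data.frame(
  date   = as.Date(c("2020-04-25", "2020-04-26")),
  county = "Guilford County",
  parks  = c(35, 12),
  stringsAsFactors = FALSE
)

# Drop the trailing " County" so the names match the case data.
mobility_toy$county <- sub(" County$", "", mobility_toy$county)

# Keep only county/date pairs present in both tables.
combined <- merge(deaths_toy, mobility_toy, by = c("county", "date"))
combined
```

Because the mobility data in this analysis ends about a week before the death data, an inner merge like this would silently drop the most recent days of deaths; passing all.x = TRUE to merge would keep them with missing mobility values instead.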