COVID-19 cases in Washington State are observed to have substantially different growth rates depending on the region. The core Seattle metro area and the Greater Puget Sound case growth rate is now lower than other regions of the State, despite total case loads still being higher.
Get data from the NYTimes Github Repo.
countydata_file <- "https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv"
county_data <- read_csv(countydata_file)
## Parsed with column specification:
## cols(
## date = col_date(format = ""),
## county = col_character(),
## state = col_character(),
## fips = col_character(),
## cases = col_double(),
## deaths = col_double()
## )
###Patch data
There’s a faulty data point in the data for Benton County (total cases drops on 2020-03-25. This fix just patches that. Issue has been escalated on the github repo.
county_data = county_data %>%
mutate(cases = ifelse(fips == 53005 & date == "2020-03-25", 14, cases)) %>%
yo
puget_region <- c("Thurston", "Island", "Kitsap", "Pierce", "King", "Snohomish", "Whatcom", "Skagit" )
state_metro_data <- county_data %>%
filter(state == "Washington") %>%
mutate(region = ifelse(county %in% puget_region, "Puget Sound", "Greater Washington")) %>%
yo
Plot data is created by summarizing data by metro area. Daily case rates are also computed.
plot_data <- state_metro_data %>%
group_by(region, date) %>%
summarize(cases = sum(cases), deaths = sum(deaths)) %>%
mutate(daily_cases = cases-lag(cases)) %>%
mutate(daily_deaths = deaths-lag(deaths)) %>%
arrange(date)%>%
ungroup() %>%
mutate(daily_cases = ifelse(is.na(daily_cases), 0, daily_cases)) %>%
mutate(daily_deaths = ifelse(is.na(daily_deaths), 0, daily_deaths)) %>%
filter(cases > 1) %>%
#filter(deaths >1) %>%
mutate(t_case = daily_cases/(cases-1))%>%
#mutate(t_death = daily_deaths/(deaths-1)) %>%
yo
This can be done a number of ways. Using the median function for data from the last few days seems to work well. (Mean computation resulted in errors presumably because of the sparse data for some areas).
growth_table <- plot_data %>%
filter(date > max(date) - days(8)) %>%
group_by(region) %>%
summarize(doubling_time = median(0.69/(t_case))) %>%
yo
ggplot(growth_table, aes(x = fct_reorder(region, -doubling_time), y = doubling_time)) + geom_bar(stat = "identity", width = 0.5, fill = "lightblue", color = "grey40") +
labs(title = "Regional Variation of COVID-19 growth in Washington State", x = "Region", y = "Doubling Time (Days)", subtitle = str_c("NYTimes Data (update ", today(), ")"))
Here is the (noisy) doubling time data computed on a daily basis.