Updated: 2020-10-29 07:18:49 PDT

Original version created 2020-05-03. See below for revision history.

Intro


The spread of COVID-19, the disease caused by the SARS-CoV-2 virus, defies description in terms of a single statistic. To be informed about personal risk, we need to know more than how many people have been sick at the national or even state level. We need to know how many people are currently sick in our community and how the number of sick people is changing at the state and even county level. It can be hard to find this information.

This analysis seeks to partially fill that gap. It includes:
1. Several national pictures of disease trends to enable a “large pattern” view of how the disease has evolved, and is evolving, on a country-wide scale.
2. A per capita analysis of disease spread.
3. A more granular analysis of regions, states, and counties to shed light on local disease pattern evolution.
4. Details of the time evolution of growth statistics.


This computed document is part of a constantly evolving analysis, so please “refresh” for the latest updates. If you have suggestions or comments, please reach out on Twitter (@WinstonOnData) or Facebook.


You are welcome to visit my code repository on GitHub.
You are also welcome to visit my analysis on the Politics of COVID.
Finally, you can always check my RPubs for new documents and updates.

National Statistics

Total & Active Cases, and Deaths

These trend charts show the national disease statistics. The raw data are shown, and they display daily swings that are systematically related to the M-F work week, possibly due to reporting delays.

Mortality and \(R_e\)


Distribution of \(R_e\) Values

There is a wide distribution of \(R_e\) across regions and counties. The distributions in the graph below look roughly symmetrical because the x-scale is logarithmic.

National Maps

The key indicator for disease forecasting is the Effective Reproduction Rate \(R_e\), which measures how many new cases each existing case of disease creates.

When a lot of people are sick in a population without mass immunity, you want \(R_e < 1\), or equivalently \(\log_2(R_e) < 0\), to achieve negative disease growth, and ideally \(R_e \ll 1\).

After achieving negative growth, the next phase of recovery is driving disease down to a consistently low level where cases can be micro-managed. There’s no clear agreement on how low that is, but I’ve seen estimates as low as 500 cases per day across the US (about 0.15 cases per day per 100k population).
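The per capita figure quoted above is a simple rescaling of a daily count by population. Here is a minimal Python sketch of that arithmetic (the report itself is built with R; this is only an illustration). The population constant is an assumed round number, not a value taken from this analysis, and the result shifts slightly with the estimate used.

```python
# Convert a daily case count into a rate per 100,000 residents.
# US_POPULATION is an assumed round figure, not a value from this analysis.
US_POPULATION = 330_000_000


def per_100k(daily_cases, population=US_POPULATION):
    """Return daily cases per 100,000 residents."""
    return daily_cases / population * 100_000


rate = per_100k(500)
print(f"{rate:.2f} cases per day per 100k")
```

The same function works for a state or county by passing its own population as the second argument.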

An estimate of the disease “toll” is the number of officially tallied deaths. It is widely agreed that this vastly underestimates the actual cost of the disease: deaths are almost certainly undercounted, and the tally does not reflect any long-lasting health effects on those who recover.

State Level Data

Pandemic Totals

Current Status of Active Disease

Computed Reproduction Rate \(R_e\).

How many cases are there per day, per capita, in each state? You can see the number of current cases varies widely. I also include a forecast of the number of cases about a week from now given current trends. Most states currently show improvement.

County Level Data

While the state-level data tell a remarkable story, it is also interesting to look at county-level data.


state R_e cases daily_cases
Rhode Island 1.52 27371 354
Maine 1.37 6367 57
Connecticut 1.30 68883 702
Kansas 1.27 82512 1123
Vermont 1.24 2112 22
Michigan 1.23 183916 2705
South Dakota 1.23 40988 1005
Wyoming 1.23 12159 365
Colorado 1.22 100723 1750
Massachusetts 1.22 150099 1120
Kentucky 1.21 103406 1611
Pennsylvania 1.19 205513 2137
West Virginia 1.18 23002 350
Iowa 1.17 120507 1494
New Jersey 1.17 233990 1432
Georgia 1.16 365048 2074
New Hampshire 1.15 10611 106
Minnesota 1.14 139082 1910
Illinois 1.13 390138 4977
Wisconsin 1.12 221893 4540
Maryland 1.11 142894 752
Montana 1.11 30178 789
New Mexico 1.11 43985 777
Ohio 1.11 205381 2552
Texas 1.11 930825 6729
Indiana 1.10 171638 2366
Tennessee 1.10 250682 2761
Arizona 1.09 241247 1074
Florida 1.09 788285 3816
New York 1.09 505234 1773
Oregon 1.09 43233 394
Washington 1.09 109774 768
Utah 1.08 109017 1556
California 1.07 922937 4462
Idaho 1.07 62234 924
Alabama 1.06 188335 1597
North Dakota 1.06 40020 843
South Carolina 1.06 173540 1023
Virginia 1.06 137740 879
Missouri 1.05 165164 1960
Nevada 1.05 97927 788
North Carolina 1.05 266596 2208
Delaware 1.04 24310 142
Nebraska 1.03 66339 881
Arkansas 1.02 106636 966
Mississippi 1.02 117485 749
Louisiana 0.98 181200 577
Oklahoma 0.93 119672 1136

Regional Snapshots

Regional snapshots reveal the highly nuanced behavior of disease spread. Each snapshot includes multiple states and selected counties.

How to read the charts

There are four components:
1. State Maps show the number of active cases, with the Reproduction Rate encoded as color.
2. State Graphs show state-wide trends.
3. Severity Ranking is a table of the counties where the highest number of new cases is expected. Severity is a compound function \(f(R, cases(t))\). This is useful for finding new (often unexpected) “hot spots.” Per capita rates are also included.
4. County Graphs show the number of active cases, with the R-value encoded in the same plot. R is the Reproduction Rate.

(NOTE: R < 1 implies a shrinking number of active cases, and R > 1 implies a growing number of active cases. For R = 1, active cases are stable.)
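The exact severity function \(f(R, cases(t))\) is not spelled out in this section, so the following Python sketch is only one plausible reading: project daily cases a few days ahead at constant R and rank by the projection. It assumes \(\ln(R)\) acts as a per-day growth rate, as the appendix relation suggests. The function names, the 6-day default horizon, and the county rows are all hypothetical, and the report's own R code may compute severity differently.

```python
import math


def projected_daily_cases(daily_cases, r_e, horizon_days=6):
    """Project daily cases forward assuming constant exponential growth.

    Assumes a = ln(R_e) is a per-day rate, so growth over the horizon is
    exp(a * horizon_days). An illustrative reading, not the report's code.
    """
    a = math.log(r_e)
    return daily_cases * math.exp(a * horizon_days)


def rank_by_severity(rows, horizon_days=6):
    """Sort (name, r_e, daily_cases) tuples by projected cases, highest first."""
    return sorted(
        rows,
        key=lambda row: projected_daily_cases(row[2], row[1], horizon_days),
        reverse=True,
    )


# Hypothetical counties with made-up numbers, for illustration only.
counties = [("County A", 1.30, 40), ("County B", 0.95, 200), ("County C", 1.10, 120)]
for name, r_e, cases in rank_by_severity(counties):
    print(name, round(projected_daily_cases(cases, r_e), 1))
```

Ranking by projected rather than current cases is what lets a small county with a high R surface ahead of a larger county where cases are shrinking.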


Washington and Oregon

California

Four Corners

Mid-Atlantic

Deep South

FL and GA

Texas & Oklahoma

Michigan & Wisconsin

Minnesota, North Dakota, and South Dakota

Connecticut, Massachusetts, and Rhode Island

New York

Vermont, New Hampshire, and Maine

Carolinas

North-Rockies

Midwest

Tennessee and Kentucky

Missouri and Arkansas

Conclusions

The disease is under control in some places, but not all. And many places are completely out of control.

Stay Safe!
Be Diligent!
…and PLEASE WEAR A MASK



Built with R Version 4.0.2
This document took 444.4 seconds to compute.
2020-10-29 07:26:13

version history

Today is 2020-10-29.
162 days ago: span plots to multiple states.
154 days ago: include \(R_e\) computation.
151 days ago: created color coding for \(R_e\) plots.
146 days ago: Reduced \(t_d\) from 14 to 12 days. 14 was the upper range of what most people are using. Wanted slightly higher bandwidth.
146 days ago: “persistence” time evolution.
139 days ago: “In control” mapping.
139 days ago: “Severity” tables to county analysis. Severity is computed from the number of new cases expected at current \(R_e\) for 6 days in the future. It does not trend \(R_e\), which could be a future enhancement.
131 days ago: Added census API functionality to compute per capita infection rates. Reduced spline spar = 0.65.
126 days ago: Added Per Capita US Map.
124 days ago: Deprecated national map. It can be found here.
120 days ago: added state “Hot 10” analysis.
115 days ago: cleaned up county analysis to show cases and actual data. Moved “Hot 10” analysis to separate web page. Moved “Hot 10” here.
113 days ago: added per capita disease and mortality to state-level analysis.
101 days ago: changed to county boundaries on national map for per capita disease.
96 days ago: corrected factor of two error in death trend data.
92 days ago: removed “contained and uncontained” analysis, replacing it with county level control map.
87 days ago: added county level “baseline control” and \(R_e\) maps.
83 days ago: fixed normalization error on total disease stats plot.
76 days ago: Corrected some text matching in generating county level plots of \(R_e\).
70 days ago: adapted knot spacing for spline.
56 days ago: using separate knot spacing for spline fits of deaths and cases.
54 days ago: MAJOR UPDATE. Moved things around. Added per capita severity map.
26 days ago: improved national trends with per capita analysis.
25 days ago: added county level per capita daily cases map. testing new color scheme.

Appendix: Methods

Disease data are sourced from the NYTimes GitHub repo. Population data are sourced from the US Census (census.gov).

Case growth is assumed to follow a linear partial differential equation. This type of model is useful in populations where there is still very low immunity and high susceptibility.

\[\frac{\partial}{\partial t} cases(t, t_d) = a \times cases(t, t_d) \]

Here \(cases(t)\) is the number of active cases at time \(t\), which depends on the recent history over a window \(t_d\). The constant \(a\) has units of \(time^{-1}\) and is typically computed on a daily basis.

Solution results are often expressed in terms of the Effective Reproduction Rate \(R_e\), where \[a = \ln(R_e).\]

\(R_e\) has a simple interpretation; when \(R_e \space > \space 1\) the number of \(cases(t)\) increases (exponentially) while when \(R_e \space < \space 1\) the number of \(cases(t)\) decreases.
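To make the exponential behavior concrete, here is a short worked step. It assumes \(a\) is held constant over the interval and \(t\) is measured in days. These are simplifications for illustration, since in the analysis \(R_e\) changes over time.

\[cases(t) = cases(0)\, e^{a t} = cases(0)\, R_e^{\,t}, \qquad t_{double} = \frac{\ln 2}{a} = \frac{1}{\log_2(R_e)}.\]

Under these assumptions \(\log_2(R_e)\) is the number of doublings per unit time. It is positive when cases grow and negative when they shrink, in which case \(|t_{double}|\) is the halving time.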

Practically, computing \(a\) can be extremely complicated, depending on its functional dependence on the history \(t_d\). And guessing functional forms can be as much art as science. To avoid that, let’s keep things simple…

Assuming a straightforward, flat latent-infection time of \(t_d = 12\) days, with \[f(t) = \int_{t - t_d}^{t}cases(t')\; dt' ,\] \(R_e\) reduces to a simple computation

\[R_e(t) = \frac{cases(t)}{\int_{t - t_d}^{t}cases(t')\; dt'} \times t_d .\]
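On daily data, the integral in this ratio can be approximated by a sum, so \(R_e(t)\) becomes the latest value divided by the mean of the last \(t_d\) days. The Python sketch below illustrates that discretization. The function name, the choice to include day \(t\) in the window, and the handling of incomplete windows are my assumptions, and the report's own R code (which smooths the data first) may differ.

```python
def effective_r(cases, t_d=12):
    """Estimate R_e(t) as cases(t) divided by the mean of the last t_d days.

    Discrete form of R_e(t) = cases(t) * t_d / integral of cases over
    [t - t_d, t], with the integral replaced by a sum of daily values.
    Returns None where the window is incomplete or sums to zero.
    """
    result = []
    for t in range(len(cases)):
        if t + 1 < t_d:
            result.append(None)  # not enough history yet
            continue
        window_sum = sum(cases[t - t_d + 1 : t + 1])
        result.append(cases[t] * t_d / window_sum if window_sum > 0 else None)
    return result


# Synthetic series for illustration only.
steady = [100] * 20
growing = [100 * 1.05 ** day for day in range(20)]
print(effective_r(steady)[-1])
print(round(effective_r(growing)[-1], 3))
```

A flat series gives a ratio of exactly 1, a rising series gives a value above 1, and a falling series gives a value below 1, matching the interpretation of \(R_e\) given earlier.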

Typical values of \(t_d\) fall in the range \(7 \leq t_d \leq 14\) days. The only other numerical treatment is smoothing: to reduce noise in the data, I smooth the case data with a reticulated spline before computing derivatives.
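For readers who want to experiment without a spline library, a trailing 7-day mean is a much simpler stand-in that averages over one full reporting week. To be clear, this is a swapped-in technique and not the spline smoothing used in this analysis. The function name and the sample series are made up for illustration.

```python
def trailing_mean(values, window=7):
    """Smooth a daily series with a trailing mean over `window` days.

    A 7-day window averages over one full reporting week. Early entries
    use however many days are available, so the output keeps its length.
    """
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start : i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up series with a weekly dip, for illustration only.
raw = [100, 110, 120, 115, 105, 60, 50] * 3
print(trailing_mean(raw)[-1])
```

Unlike a spline, a trailing mean lags the data by a few days and does not give a smooth derivative, so it is best treated as a quick check.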


DISCLAIMER: Results are for entertainment purposes only. Please consult local authorities for official data and forecasts.