Corona Virus Crisis: Where is it Heading?

For a few weeks now, my ears, eyes and thoughts have been consumed with corona virus issues - status, policy making, crisis management. Early in March, my co-chairs and I pulled the plug on the in-person version of our TEIS academic workshop (but had a very successful event over Zoom, instead of at the original Newport Beach location). Next, I conducted my final class of the quarter over Zoom rather than in person. And then, as the picture started looking bleaker, my friends and I began discussing the broader community- and state-level actions that were needed to confront the impending crisis. We put out a blog (Needed: Bold Decisions to Stop Covid-19) and a change.org petition on March 12, calling for suspension of in-person activities at schools, universities and other places of gathering. Around this time, I started laying my hands on daily data about the spread of corona virus in the US.

This blog is based on looking at the data over the last few days, trying to get a sense of how bad things are, how they are trending, and whether institutional measures are having a mitigating effect. My analysis is based on daily data from https://covidtracking.com/; see the sections “Data Source” and “Data Limitations and Data Quality” for a discussion of the data's limitations and the need to interpret any analyses cautiously.

Number Tested (Results) and Percentage Infected

For many weeks after this crisis hit the US, the main story was the growing number of infections and the lack of testing. During March, testing capacity has increased rapidly - from about 500 tests per day on March 2 to 50,000 tests per day on March 22 - although it still appears not to be enough, especially in certain states.

We see a huge growth in infections, and part of it could well be due to increased testing. But, before digging into it, you might want to check the section “Data Limitations and Data Quality”.

How is the Situation Changing?

Is the situation getting better or worse, and at what rate? Are the control measures working? Let’s look at what the data tell us. Consider the increase in known infections, i.e., positive test results.

  • Some of this increase is because coronavirus infections are spreading, but some of it is also because the volume of testing is growing rapidly.

  • At one extreme, it could be that a large-enough segment of the population was infected some days ago - with no new infections since then - and as we’re testing more people we’re simply catching more infected ones. At the other extreme, more people are getting infected each day - exponentially more - and that is why more of them show up in the results when we test more people.

To visualize this, suppose that on some day, the mix of infected and non-infected people looked like the picture below. The red dots are infected individuals, the orange ones are not infected but otherwise indistinguishable from them, and the green ones are healthy.

Extreme Best Case: No spread in infection

Suppose that the red-orange-green status and mix remained constant. That is, many people are already infected - but no more new infections! Now imagine a circular net, representing people we picked up to test one day. The number of positive results for the day (new infections) will depend on the size of the net, the method for picking whom to test (we’ll ignore this for a moment), and the mix of red and orange (which remains constant). Therefore, if we didn’t change the method for picking whom to test, then the number of positive test results should be somewhat proportional to the size of the net. The more you test, the more show up infected - but the proportion or percentage of people who show up positive should remain the same regardless of the size of the net. (Recall this is the case where the red-orange status remains static.)
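The net argument can be checked with a small simulation. This is only a sketch under assumptions that are not in the data: a hypothetical static population of 100,000 people, an assumed 10% infected share, and a "net" that picks people uniformly at random (the function name `positive_share` is mine).

```python
import random

def positive_share(population, net_size, rng):
    """Draw `net_size` people at random and return the share who are infected."""
    sample = rng.sample(population, net_size)
    return sum(sample) / net_size

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical static population: 1 = infected, 0 = not infected (assumed 10% infected).
population = [1] * 10_000 + [0] * 90_000

for net_size in (500, 5_000, 50_000):
    share = positive_share(population, net_size, rng)
    positives = round(share * net_size)
    print(f"net of {net_size:>6}: {positives:>5} positives, {share:.1%} positive")
```

In this static case the count of positives scales with the size of the net, while the percentage positive stays near the assumed 10% apart from sampling noise.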

What do the numbers tell us?

The left panel shows what percentage of daily test results are positive (again, with caveats - no change in the method for deciding whom to test, etc.). The percentage is up and down a bit, and increasing. This suggests we’re somewhere between the two extremes discussed. If there were no new infections the positive rate would have been nearly flat (again, assuming we aren’t picking more from the “more likely to be infected” group).
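For readers who want to reproduce the daily positive rate, here is a minimal sketch. It assumes the rate is computed from day-over-day differences in the cumulative n.pos and n.result columns of the US table in the “Data Source” section; the chart may have been built differently, and the function name is hypothetical.

```python
# Cumulative US counts copied from the covidtracking table in this post:
# (date, n.pos, n.result), oldest first.
us = [
    ("2020-03-17", 5723, 53327),
    ("2020-03-18", 7731, 73956),
    ("2020-03-19", 11723, 100842),
    ("2020-03-20", 17038, 135185),
    ("2020-03-21", 23203, 179112),
    ("2020-03-22", 31888, 225351),
    ("2020-03-23", 42164, 279485),
    ("2020-03-24", 51970, 344728),
]

def daily_positive_rate(rows):
    """Return (date, new_results, share_positive) for each day after the first."""
    out = []
    for (_, pos0, res0), (date, pos1, res1) in zip(rows, rows[1:]):
        new_pos, new_res = pos1 - pos0, res1 - res0
        out.append((date, new_res, new_pos / new_res))
    return out

for date, new_res, rate in daily_positive_rate(us):
    print(f"{date}: {new_res:>6} new results, {rate:.1%} positive")
```

A flat series here would point to the "no new spread" extreme; a rising one points away from it, subject to the caveat about who gets picked for testing.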

Do the numbers convey good news or dire news? Too early to tell. There’s evidence of increasing spread, but there’s also just too much “noise” in the data. We’re mixing in numbers across states and counties that are extremely different in terms of their infected populations, testing regimes, etc. But - bottom line - the data certainly do not convey “good news”.

Daily Changes and Growth Rate

One metric everyone is interested in is “how quickly is the cumulative number of known infections growing each day?” For instance, if we had 100 known infections until yesterday, and then learnt about another 20 new infections in the last day, then the growth rate (over cumulative) is 20%. Let’s look at these numbers.
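As a small illustration of this definition, the sketch below computes the growth rate over cumulative for the example in the text and for a few days of US positives taken from the table in the “Data Source” section. The function name is my own.

```python
def daily_growth_rate(cumulative):
    """New cases each day, as a fraction of the previous day's cumulative total."""
    return [(today - prev) / prev for prev, today in zip(cumulative, cumulative[1:])]

# The example from the text: 100 known infections, then 20 new ones.
print(daily_growth_rate([100, 120]))  # [0.2]

# US cumulative positives (n.pos) for 2020-03-20 through 2020-03-24.
us_positives = [17038, 23203, 31888, 42164, 51970]
for rate in daily_growth_rate(us_positives):
    print(f"{rate:.1%}")
```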

With just a few days of observations, the daily rate of growth is bouncing around. It is also very sensitive to small measurement errors or overlaps (allocating infections on a particular day to the previous day or the next day).

However, once we have enough data points, we can look for an approximately constant daily rate of growth (say r).
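One simple way to get such an r, offered only as a sketch and not necessarily the estimate used in this analysis, is the compound average rate between two dates: the constant r for which first * (1 + r)^days equals last. The function name and the choice of the March 17 to March 24 window are my assumptions.

```python
def constant_daily_rate(first, last, days):
    """Constant rate r such that first * (1 + r) ** days == last."""
    return (last / first) ** (1 / days) - 1

# US cumulative positives from the table in this post:
# 5,723 on 2020-03-17 and 51,970 on 2020-03-24, i.e. 7 days apart.
r = constant_daily_rate(5723, 51970, days=7)
print(f"approximate constant daily growth rate: {r:.1%}")
```

This uses only the two endpoints, so it smooths over the day-to-day bouncing but is still sensitive to errors in either endpoint; a fit over all the days in the window would be a reasonable alternative.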

Is the Lockdown Sensible?

The big question - apart from “what is happening” - is “what should we do about it?” We’ve seen several measures such as lockdowns at the state and federal level.

First, there is no question that significant restrictions - such as a lockdown - are needed. Despite some who argue that this is an overreaction, the reality is that the infection has spread undetected among the population, that many more people have the virus than we know, and that these unknown asymptomatic individuals will keep propagating the spread. Slowing the spread down - which the restrictions do - will enable us to “flatten the curve” and make it less unmanageable for hospitals and other responders. There are also others who argue that current lockdown measures are too untargeted (and based on data which are highly imprecise) - and that more precise targeting would still limit the spread but reduce the accompanying harm (see, e.g., https://www.dailywire.com/news/stanford-professor-data-indicates-were-overreacting-to-coronavirus). If you want this stated in simple words, it’s like getting out the big cannons to shoot a few ants. If we could compute a precise targeting solution, we should. However, even then, it may not be implementable. The population is simply not capable of reacting properly to a highly precise, multidimensional set of lockdown conditions. A general lockdown is more likely to be enforced and implemented.

Is it Working?

The second question is, do we know that the lockdowns are having a positive effect? If the lockdowns and other restrictions are working, that should slow the rate of spread of the infected population (from what it was prior to the lockdowns), and this should show up in the numbers of people who test positive, the rate of positive test results, and the rate of growth of the infected population. But there are a lot of important caveats to keep in mind.

  • Results won’t show up for days. Expect a lag of about 10-15 days: the time before people get tested, the time for test results, and some extra days to account for people really changing their behavior.

Lesson: Don’t give up. Don’t think that the measures are not working just because you don’t observe quick results.

  • Volume of past-infected people still in the population. If this number is high, then even if the spread slows down, the available mass of infected people will continue to produce more positive test results as they get tested. So, expect a longer time lag before the effects of the measures become visible.

With all these caveats, looking at the data, there’s some reason to believe that the measures are beginning to take effect - the growth rate curve is trending downwards. It will be useful to look at this by state.

What does it take to Bend the Curve?

Let’s look ahead to see what we can do with growth rate r once we have it.

An important question on everyone’s mind is “how many days does it take to double the number of infections”. We’re seeing very rapid doubling in some countries, e.g., in Spain it is about 2 days to double.

So, we can ask how many days d(r) it takes to double the number of infections when the daily rate of growth is r. The relationship is (1+r)^d = 2. In other words, log(2) = d*log(1+r), so that d = log(2)/log(1+r). (Note: log is the natural logarithm, base e; the ratio is the same in any base.)

At a 40% growth rate, the total number of infections doubles almost every 2 days. If we can reduce the rate of growth from 40% to 30% to 25%, the doubling-period goes up from about 2 days to 2.6 days to about 3.1 days. Then reducing again from 25% to 20% growth rate takes us from 3.1 days to 3.8 days, and then if we reduce again to 10% it goes up to over 7 days!
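The doubling-period figures can be reproduced directly from d = log(2)/log(1+r). A minimal sketch (the function name is mine):

```python
import math

def doubling_days(r):
    """Days needed to double at a constant daily growth rate r: log(2) / log(1 + r)."""
    return math.log(2) / math.log(1 + r)

for r in (0.40, 0.30, 0.25, 0.20, 0.10):
    print(f"growth {r:.0%}: doubles every {doubling_days(r):.1f} days")
```

Note how the gains accelerate: each further cut in the growth rate buys more doubling time than the previous one.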

Any Conclusions?

A few things.

Next, we’ll want to look at state level data. We do this below - but with a huge caveat: state-level numbers are still too sketchy and finicky, so any insights are probably even more faulty.

California

Let’s pull in state-level data and focus on a few states.

date state n.pos n.neg n.result n.pend n.death n.total
2020-03-24 CA 2102 13452 15554 12100 40 27654
2020-03-23 CA 1733 12567 14300 12100 27 26400
2020-03-22 CA 1536 11304 12840 NA 27 12840
2020-03-21 CA 1279 11249 12528 NA 24 12528
2020-03-20 CA 1063 10424 11487 NA 20 11487
2020-03-19 CA 924 8787 9711 NA 18 9711
2020-03-18 CA 611 7981 8592 NA 13 8592
2020-03-17 CA 483 7981 8464 NA 11 8407
2020-03-16 CA 335 7981 8316 NA 6 8316
2020-03-15 CA 293 916 1209 NA 5 1209
2020-03-14 CA 252 916 1168 NA 5 1168
2020-03-13 CA 202 916 1118 NA 4 1118
2020-03-12 CA 202 916 1118 NA 4 1118
2020-03-11 CA 157 916 1073 NA NA 1073
2020-03-10 CA 133 690 823 NA NA 823
2020-03-09 CA 114 690 804 NA NA 804
2020-03-08 CA 88 462 550 NA NA 550
2020-03-07 CA 69 462 531 NA NA 531
2020-03-06 CA 60 462 522 NA NA 522
2020-03-05 CA 53 462 515 NA NA 515
2020-03-04 CA 53 462 515 NA NA 515

CA: How Much Worse is it Getting?

Let’s look at the data for California.

Number Tested (Results) and Percentage Infected

The reported number of infections is growing in the state.

There’s evidently some flaw in the data here, in terms of tests conducted each day. It seems a few days of test results were reported in bulk and assigned to a single date. More importantly, it is bothersome that the number of tests per day is not increasing, as it is nationally and in states such as New York (see below).
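The bulk-reporting issue is easy to see by differencing the cumulative n.result column from the California table above. This is a sketch; treating the single largest day as the bulk-reported one is my own rough heuristic, not a formal test.

```python
# California cumulative test results (n.result) from the table above, oldest first.
ca_results = [
    ("2020-03-13", 1118), ("2020-03-14", 1168), ("2020-03-15", 1209),
    ("2020-03-16", 8316), ("2020-03-17", 8464), ("2020-03-18", 8592),
    ("2020-03-19", 9711), ("2020-03-20", 11487), ("2020-03-21", 12528),
    ("2020-03-22", 12840), ("2020-03-23", 14300), ("2020-03-24", 15554),
]

def daily_new(rows):
    """Day-over-day change in a cumulative series, as (date, change) pairs."""
    return [(date, cur - prev) for (_, prev), (date, cur) in zip(rows, rows[1:])]

ca_daily = daily_new(ca_results)
for date, change in ca_daily:
    print(f"{date}: {change:>5} new results")

# Rough heuristic (an assumption): the largest single-day jump is the suspect date.
spike_date, spike = max(ca_daily, key=lambda item: item[1])
print(f"largest single-day jump: {spike} on {spike_date}")
```

The jump on 2020-03-16 dwarfs every other day in the window, which is what one would expect if several days of results were posted at once.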

And the rate of positive test results each day?

CA: Daily Changes and Growth Rate

Now we can compute the daily change as an absolute difference and as a rate of growth.

NY: How Much Worse is it Getting?

Let’s look at the data for New York.

date state n.pos n.neg n.result n.pend n.death n.total
2020-03-24 NY 25665 65605 91270 NA 210 91270
2020-03-23 NY 20875 57414 78289 NA 114 78289
2020-03-22 NY 15168 46233 61401 NA 114 61401
2020-03-21 NY 10356 35081 45437 NA 44 45437
2020-03-20 NY 7102 25325 32427 NA 35 32427
2020-03-19 NY 4152 18132 22284 NA 12 22284
2020-03-18 NY 2382 12215 14597 NA 12 14597
2020-03-17 NY 1700 5506 7206 NA 7 7206
2020-03-16 NY 950 4543 5493 NA 7 5493
2020-03-15 NY 729 4543 5272 NA 3 5272
2020-03-14 NY 524 2779 3303 NA NA 3303
2020-03-13 NY 421 2779 3200 NA NA 3200
2020-03-12 NY 216 NA NA NA NA 216
2020-03-11 NY 216 NA NA NA NA 216
2020-03-10 NY 173 92 265 NA NA 265
2020-03-09 NY 142 92 234 NA NA 234
2020-03-08 NY 105 92 197 NA NA 197
2020-03-07 NY 76 92 168 236 NA 404
2020-03-06 NY 33 92 125 236 NA 361
2020-03-05 NY 22 76 98 24 NA 122
2020-03-04 NY 6 48 54 24 NA 78

Number Tested (Results) and Percentage Infected

The reported number of infections is growing in the state.

And the rate of positive test results each day?

NY: Daily Changes and Growth Rate

Now we can compute the daily change as an absolute difference and as a rate of growth.

WA: How Much Worse is it Getting?

Let’s look at the data for Washington.

date state n.pos n.neg n.result n.pend n.death n.total
2020-03-24 WA 2221 31712 33933 NA 110 33933
2020-03-23 WA 1996 28879 30875 NA 95 30875
2020-03-22 WA 1793 25328 27121 NA 94 27121
2020-03-21 WA 1524 21719 23243 NA 83 23243
2020-03-20 WA 1376 19336 20712 NA 74 20712
2020-03-19 WA 1187 15918 17105 NA 66 17105
2020-03-18 WA 1012 13117 14129 NA 52 14129
2020-03-17 WA 904 11582 12486 NA 48 12486
2020-03-16 WA 769 9451 10220 NA 42 10220
2020-03-15 WA 642 7122 7764 NA 40 7764
2020-03-14 WA 568 6001 6569 NA 37 6569
2020-03-13 WA 457 4350 4807 NA 31 4807
2020-03-12 WA 337 3037 3374 NA 29 3403
2020-03-11 WA 267 2175 2442 NA 24 2466
2020-03-10 WA 162 1110 1272 NA NA 1272
2020-03-09 WA 136 1110 1246 NA NA 1246
2020-03-08 WA 102 640 742 60 NA 802
2020-03-07 WA 102 370 472 66 NA 538
2020-03-06 WA 79 370 449 NA NA 449
2020-03-05 WA 70 NA NA NA NA 70
2020-03-04 WA 39 NA NA NA NA 39

Number Tested (Results) and Percentage Infected

The reported number of infections is growing in the state.

And the rate of positive test results each day?

WA: Daily Changes and Growth Rate

Now we can compute the daily change as an absolute difference and as a rate of growth.

MA: How Much Worse is it Getting?

Let’s look at the data for Massachusetts.

date state n.pos n.neg n.result n.pend n.death n.total
2020-03-24 MA 1159 12590 13749 NA 11 13749
2020-03-23 MA 777 8145 8922 NA 9 8922
2020-03-22 MA 646 5459 6105 NA 5 6128
2020-03-21 MA 525 4752 5277 NA 1 5277
2020-03-20 MA 413 3678 4091 NA 1 4091
2020-03-19 MA 328 2804 3132 NA NA 3132
2020-03-18 MA 256 2015 2271 NA NA 2271
2020-03-17 MA 218 1541 1759 NA NA 1759
2020-03-16 MA 164 352 516 NA NA 516
2020-03-15 MA 138 352 490 NA NA 490
2020-03-14 MA 138 352 490 NA NA 490
2020-03-13 MA 123 92 215 NA NA 215
2020-03-12 MA 95 NA NA NA NA 95
2020-03-11 MA 92 NA NA NA NA 92
2020-03-10 MA 92 NA NA NA NA 92
2020-03-09 MA 41 NA NA NA NA 41
2020-03-08 MA 13 NA NA NA NA 13
2020-03-07 MA 13 NA NA NA NA 13
2020-03-06 MA 8 NA NA NA NA 8
2020-03-05 MA 2 NA NA NA NA 2
2020-03-04 MA 2 NA NA NA NA 2

Number Tested (Results) and Percentage Infected

The reported number of infections is growing in the state.

And the rate of positive test results each day?

MA: Daily Changes and Growth Rate

Now we can compute the daily change as an absolute difference and as a rate of growth.

TX: How Much Worse is it Getting?

Let’s look at the data for Texas.

Number Tested (Results) and Percentage Infected

The reported number of infections is growing in the state.

And the rate of positive test results each day?

TX: Daily Changes and Growth Rate

Now we can compute the daily change as an absolute difference and as a rate of growth.

FL: How Much Worse is it Getting?

Let’s look at the data for Florida.

date state n.pos n.neg n.result n.pend n.death n.total
2020-03-24 FL 1412 13127 14539 1008 18 15547
2020-03-23 FL 1171 11063 12234 860 14 13094
2020-03-22 FL 830 7990 8820 963 13 9783
2020-03-21 FL 658 6579 7237 1002 12 8239
2020-03-20 FL 520 1870 2390 1026 10 3416
2020-03-19 FL 390 1533 1923 1019 8 2942
2020-03-18 FL 314 1225 1539 954 7 2493
2020-03-17 FL 186 940 1126 872 6 1998
2020-03-16 FL 141 684 825 514 4 1339
2020-03-15 FL 116 678 794 454 4 1248
2020-03-14 FL 77 478 555 221 3 776
2020-03-13 FL 50 478 528 221 2 749
2020-03-12 FL 32 301 333 147 2 480
2020-03-11 FL 28 301 329 147 2 476
2020-03-10 FL 19 222 241 155 NA 396
2020-03-09 FL 18 140 158 115 NA 273
2020-03-08 FL 17 118 135 108 NA 243
2020-03-07 FL 14 100 114 88 NA 202
2020-03-06 FL 9 55 64 51 NA 115
2020-03-05 FL 9 31 40 69 NA 109
2020-03-04 FL 2 24 26 16 NA 42

Number Tested (Results) and Percentage Infected

The reported number of infections is growing in the state.

And the rate of positive test results each day?

FL: Daily Changes and Growth Rate

Now we can compute the daily change as an absolute difference and as a rate of growth.

Data Source (US numbers)

The first question is where to get the data, and whose data to believe. For instance, on the morning of March 18, the New York Times’ updated number (March 18, 10 am, based on Johns Hopkins University) was 5,879, but the Johns Hopkins page itself showed 6,519 confirmed cases. The CDC puts out numbers daily (updated at 4pm ET), but its count appears to be a substantial undercount relative to the others: 4,226. This site shows a history of daily numbers (in US at 4pm EDT): https://covidtracking.com/, with a March 17 update of 5,723 cases.

I’ll use the covidtracking site in this analysis, mainly because it provides a running table of data vs. just current reports. Here’s a look at the data. n.states is the number of states with known infections (including Puerto Rico and other territories). The next four columns are about known test results (positive, negative, total) and pending results.

date n.states n.pos n.neg n.result n.pend n.hosp n.death n.total
2020-03-24 56 51970 292758 344728 14433 4468 675 359161
2020-03-23 56 42164 237321 279485 14571 3325 471 294056
2020-03-22 56 31888 193463 225351 2842 2554 398 228216
2020-03-21 56 23203 155909 179112 3477 1964 272 182589
2020-03-20 56 17038 118147 135185 3336 NA 219 138521
2020-03-19 56 11723 89119 100842 3025 NA 160 103867
2020-03-18 56 7731 66225 73956 2538 NA 112 76495
2020-03-17 56 5723 47604 53327 1687 NA 90 54957
2020-03-16 56 4019 36104 40123 1691 NA 71 41714
2020-03-15 51 3173 22548 25721 2242 NA 60 27963
2020-03-14 51 2450 17102 19552 1236 NA 49 20789
2020-03-13 51 1922 13613 15535 1130 NA 39 16665
2020-03-12 51 1315 7949 9264 673 NA 36 9966
2020-03-11 51 1053 5978 7031 563 NA 27 7617
2020-03-10 51 778 3807 4585 469 NA NA 5054
2020-03-09 51 584 3367 3951 313 NA NA 4264
2020-03-08 51 417 2335 2752 347 NA NA 3099
2020-03-07 51 341 1809 2150 602 NA NA 2752
2020-03-06 36 223 1571 1794 458 NA NA 2252
2020-03-05 24 176 953 1129 197 NA NA 1326
2020-03-04 14 118 748 866 103 NA NA 969

Data Limitations and Data Quality

One could look at state-level data (as I do in the state sections, just as an exercise to see how a few specific states are doing), but the same occurs there because state data is just an aggregate of county-level reporting, and so on. I’d love to get hold of individual-level (or transactional) data.

Current data are very sketchy - data-driven analysis is still needed - but tread cautiously.

With that said, let’s see what we have, see what clues are contained in the data … but, at the end, interpret all results cautiously.