Summary Report

Using fractal analysis, we show that the spread of SARS-CoV-2 is self-similar across time and space, and that it follows a long-memory process: once the virus is introduced into a community, it persists over a long period (here, the data collection window from March 2020 to April 2021). We also show that while quarantines do help reduce the spread of the virus, they do not stop it, given its long-memory nature.

Exploiting this long-memory, self-similar behavior, we are able to forecast the number of cases to within a single person at each time stamp, as measured against data from April 2021 to August 2021. We also show that our forecasting methodology is robust by giving explicit probability distributions for the forecasts.

Data Prep

Data Exploration

## Warning: One or more parsing issues, see `problems()` for details
Data summary

Name                    Piped data
Number of rows          8998
Number of columns       29
Column type frequency:
  character             27
  numeric               2
Group variables         None

Variable type: character

skim_variable       n_missing  complete_rate  min  max  empty  n_unique  whitespace
senior                    559           0.94    2    9      0      6609           0
name                      469           0.95    4  131      0      7113           0
date_notified             495           0.94    8   10      0       314           0
date_colected             563           0.94    8   10      0       318           0
result                    470           0.95    7   18      0         7           0
result_date_crappy       4195           0.53    1  255      0       345           0
birth_date                495           0.94    8   10      0      5503           0
age                       508           0.94    2    8      0       196           0
symp_start_date           548           0.94    8   10      0       416           0
CPF                      4684           0.48    7   32      0      3876           0
job                       496           0.94    1   84      0      1424           0
local                     600           0.93    2   23      0       111           0
local_detail             1434           0.84    2   94      0      2594           0
gender                    478           0.95    1   18      0        14           0
CEP                       509           0.94    8   15      0      4023           0
email                     886           0.90    1   61      0      6845           0
phone                     490           0.95    7   79      0      7886           0
symp_0                    494           0.95    5  176      0      5144           0
comorbidities            8205           0.09    1  231      0       266           0
lab                       535           0.94    1   38      0        42           0
city                     1286           0.86    1   93      0       220           0
province                 8620           0.04    3  171      0       360           0
isolation                7627           0.15    1   21      0         4           0
result_aplicat           4592           0.49    3   99      0        10           0
other                    8345           0.07    7  108      0        59           0
encamin                  8828           0.02    3    3      0         2           0
senior_copy               910           0.90    1    9      0      6071           0

Variable type: numeric

skim_variable  n_missing  complete_rate          mean            sd     p0          p25          p50          p75       p100  hist
cas_n                  2           1.00  4.498500e+03  2.597070e+03      1  2.24975e+03  4.49850e+03  6.74725e+03  8.996e+03  ▇▇▇▇▇
n_gal                632           0.93  3.689564e+11  4.566233e+11  35203  3.52008e+11  3.52038e+11  3.52161e+11  3.520e+13  ▇▁▁▁▁

Is the data self-similar?

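We answer this key question by estimating the Hurst exponent. The higher the Hurst exponent, the stronger the statistical self-similarity and the long-memory behavior of the series. A value of 0.5 corresponds to an uncorrelated process, while values between 0.5 and 1 indicate persistence.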

Hurst exponent for positive COVID-19 cases: 0.778

Hurst exponent for negative COVID-19 cases: 0.843
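The report does not say which estimator produced these values. As a rough illustration, the sketch below estimates a Hurst exponent by classical rescaled-range (R/S) analysis: the slope of log(R/S) against log(window size). It is written in Python with only the standard library, which is an assumption, since the report's own code is not shown. The function names `rescaled_range` and `hurst_rs`, the window schedule, and the synthetic noise series are all hypothetical and are not the report's data or method.

```python
import math
import random


def rescaled_range(chunk):
    """R/S of one chunk: range of cumulative deviations from the mean over the std dev."""
    mean = sum(chunk) / len(chunk)
    cumulative = lowest = highest = squares = 0.0
    for value in chunk:
        deviation = value - mean
        cumulative += deviation
        lowest = min(lowest, cumulative)
        highest = max(highest, cumulative)
        squares += deviation * deviation
    std = math.sqrt(squares / len(chunk))
    return (highest - lowest) / std if std > 0 else None


def hurst_rs(series, min_window=8):
    """Estimate the Hurst exponent as the slope of log(mean R/S) vs log(window size)."""
    log_sizes, log_rs = [], []
    window = min_window
    while window <= len(series) // 2:
        stats = []
        # Split the series into non-overlapping chunks of the current size.
        for start in range(0, len(series) - window + 1, window):
            rs = rescaled_range(series[start:start + window])
            if rs:
                stats.append(rs)
        if stats:
            log_sizes.append(math.log(window))
            log_rs.append(math.log(sum(stats) / len(stats)))
        window *= 2  # assumed schedule: doubling window sizes
    if len(log_sizes) < 2:
        raise ValueError("series too short for an R/S estimate")
    # Ordinary least-squares slope of log(R/S) on log(window size).
    mean_x = sum(log_sizes) / len(log_sizes)
    mean_y = sum(log_rs) / len(log_rs)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_sizes, log_rs))
    denominator = sum((x - mean_x) ** 2 for x in log_sizes)
    return numerator / denominator


# Synthetic uncorrelated noise, used only to exercise the function.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]
print(round(hurst_rs(noise), 3))
```

In practice the function would be applied to a count series such as weekly positive cases. R/S estimates tend to be biased upward on short series, so a value only slightly above 0.5 is weak evidence of long memory on its own.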

Fractal dimensions against time

Modeling time series for forecasting using fractal model

# how many weeks do we want to forecast
step_h = 12
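The report does not name the fractal model it fits. One standard long-memory choice is an ARFIMA-type model, where the fractional differencing order d is tied to the Hurst exponent by d = H - 0.5. The sketch below, in Python (an assumption, since the report's own code is not shown), computes the fractional differencing weights for the d implied by the reported H = 0.778. The function names `frac_diff_weights` and `frac_diff` are hypothetical, and nothing here should be read as the report's actual model.

```python
def frac_diff_weights(d, n_weights):
    """Coefficients of (1 - B)^d: w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k."""
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d):
    """Apply fractional differencing of order d, using all available history."""
    weights = frac_diff_weights(d, len(series))
    return [
        sum(weights[k] * series[t - k] for k in range(t + 1))
        for t in range(len(series))
    ]


hurst = 0.778        # value reported above for positive cases
d = hurst - 0.5      # assumed ARFIMA relation between d and the Hurst exponent
print(frac_diff_weights(d, 5))
```

For 0 < d < 0.5 the weights decay slowly rather than cutting off, which is how such a model lets observations far in the past still influence a 12-week forecast.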

Model Fit

Uncertainty of the model

Measuring uncertainty of forecast

Understanding the plot

The plot shows a low probability of the count falling below 30 and a greater chance of it exceeding 30. It also shows that the part of the probability distribution between 0 and 0.6 is centered around 25 cases, which points in the direction of higher case counts.
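Statements like these can be read directly off a set of simulated forecast draws. The sketch below is in Python (an assumption, since the report's own code is not shown). It computes the share of draws above a threshold and a nearest-rank quantile. The draws are made-up values for illustration, not the report's forecasts, and the function names are hypothetical.

```python
import math


def exceedance_probability(draws, threshold):
    """Share of simulated forecast draws strictly above the threshold."""
    if not draws:
        raise ValueError("need at least one draw")
    return sum(1 for value in draws if value > threshold) / len(draws)


def empirical_quantile(draws, q):
    """Nearest-rank quantile of the draws for 0 < q <= 1."""
    ordered = sorted(draws)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


# Hypothetical simulated case counts for a single forecast week.
draws = [22, 25, 27, 31, 33, 35, 36, 38, 41, 44]
print(exceedance_probability(draws, 30))
print(empirical_quantile(draws, 0.5))
```

With draws from an actual fitted model, the same two functions would give the probability of exceeding 30 cases and any quantile needed for the uncertainty bounds.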

Forecast plot

As the plot shows, the overall forecast is well behaved and stays within the uncertainty bounds. This suggests low uncertainty in the forecast model and therefore a robust outcome that can support the decisions of healthcare coordinators and policy makers.