configuration & functions

data ingestion

sources & variables

time dependent
covid, hospitalization, vaccination, claim

time invariant variables
county features, distance matrix location to location, mobility score matrix

derived variables
- population per location is assumed constant over time; example of a derived variable: Pfizer/tot

data exploration

hospitalization distributions

discrete Fourier transform (DFT)

The discrete Fourier transform characterizes the hospitalization rate by its frequency spectrum over the entire data collection period T, in our case 2 years. The spectrum shows on which time scales the hospital admission rate alternates and, therefore, which periodic patterns the algorithm has to learn.
Time period T: 2 years, which corresponds to a frequency of 1 (one full cycle over the whole period).
Frequency F: describes how often the admission rate alternates. For example, a frequency of 5 means that the admission rate goes through 5 cycles within the 2 years.
Butterworth filter: prior to the DFT, a Butterworth low-pass filter was applied to cut off frequencies > 20, i.e. patterns that alternate more than 20 times within the 2 years.

\[\begin{aligned} X_k &= \sum^{N-1}_{n = 0} x_n e^{-i 2\pi k n / N}\\ X_k &= a_k + i\,b_k \end{aligned}\]
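
To make the meaning of "frequency" concrete, here is a minimal Python sketch that evaluates the DFT sum directly on a made-up series. The series, its length `N = 64`, and the helper name `dft` are assumptions for illustration only; they are not the hospitalization data or the code used in this analysis, and the Butterworth step is not reproduced here.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) evaluation of X_k = sum_n x_n * exp(-i*2*pi*k*n/N)."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]


# Hypothetical admission series: a constant level plus a pattern that
# completes 5 full cycles over the whole observation period.
N = 64          # illustrative length, not the real number of days
cycles = 5
series = [10.0 + 3.0 * math.cos(2 * math.pi * cycles * n / N) for n in range(N)]

spectrum = dft(series)

# Amplitude per frequency bin. Only the first half is inspected because
# the second half mirrors it for real-valued input.
amplitudes = [abs(c) / N for c in spectrum[: N // 2]]

# Skip k = 0 (the mean level) when looking for the dominant pattern.
dominant = max(range(1, N // 2), key=lambda k: amplitudes[k])
print(dominant)  # 5
```

The dominant bin is k = 5, i.e. the pattern repeats 5 times within the period T, which is how a "frequency of 5" is read in the notes above. For real data a library FFT would be the practical choice; the direct sum is shown only because it mirrors the formula term by term.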

correlations

sorted correlation all counties
variable.hos                           values.hos  variable.rest         values.rest
hospitalization                        1.0000000   X1st                  0.4307005
Myocardial.Infarction                  0.8867571   tot                   0.4283012
Congestive.Heart.Failure               0.8848962   Pfizer_1              0.4153022
Cerebrovascular.Disease                0.8724137   Moderna_1             0.4123807
Renal                                  0.8721020   Johnson_1             0.4062952
Chronic.Pulmonary.Disease              0.8714368   X2nd                  0.4059126
Diabetes.without.chronic.complication  0.8635006   Moderna_2             0.3918767
n_visits                               0.8550657   Pfizer_2              0.3878811
Hypertension                           0.8508274   inbeds_covid          0.3269741
Peripheral.Vascular.Disease            0.8429283   inbeds                0.3223259
Obesity                                0.8420284   icu                   0.3197678
Liver.Disease                          0.8418334   icu_covid             0.3123147
age_cnt                                0.8331619   PfizerTS10_1          0.2725763
Hemiplegia.or.Paraplegia               0.8192369   PfizerTS10_2          0.2271253
Dementia                               0.8158466   Pfizer_b              0.2112769
HIV                                    0.8113039   bst                   0.2003936
Malignancy                             0.8095888   Moderna_b             0.1806095
Peptic.Ulcer.Disease                   0.7864110   Johnson_b             0.1403537
Immunodeficiency                       0.7701248   Pfizer_1.Population.  0.0788581
Metastatic.Solid.Tumor                 0.7635862   Pfizer_2.Population.  0.0748859
n_covid                                0.4523637   PfizerTS_2            0.0722046
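
A ranking of this kind can be sketched as follows, assuming it is a Pearson correlation of each variable with the hospitalization series, sorted in descending order. The source does not state which correlation measure was used, so that choice is an assumption. The function names and the toy numbers are hypothetical and not taken from the data.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sequences.

    Raises ZeroDivisionError if one of the sequences is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def sorted_correlations(columns, target="hospitalization"):
    """Correlate every column with the target and sort, highest first."""
    y = columns[target]
    pairs = [(name, pearson(values, y)) for name, values in columns.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Made-up toy values, for illustration only.
toy = {
    "hospitalization": [1.0, 2.0, 3.0, 4.0, 5.0],
    "n_visits": [2.0, 4.1, 5.9, 8.2, 9.8],
    "icu": [3.0, 1.0, 4.0, 2.0, 5.0],
}
ranking = sorted_correlations(toy)
for name, value in ranking:
    print(f"{name:20s} {value:.7f}")
```

The target correlates with itself at 1, which is why `hospitalization` heads the first column of the table.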

association with weekdays

location  corr       r.square
2291      0.5040086  0.2540247
2105      0.5075915  0.2576491
351       0.5348549  0.2860697
2208      0.5377712  0.2891979
1769      0.5424664  0.2942698
554       0.6075900  0.3691656
939       0.6099504  0.3720395

Remark: 329 counties have cor > 0.2, 105 counties cor > 0.3, 31 counties cor > 0.4, and 7 counties cor > 0.5.
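
The notes do not say how the weekday association was computed. One plausible reading, consistent with r.square being the square of corr in the table, is the R² of a model that predicts each day's value by the mean of its weekday. The sketch below rests on that assumption; `weekday_r_square` and the toy series are hypothetical.

```python
import math
from collections import defaultdict


def weekday_r_square(values, weekdays):
    """R^2 of predicting each value by the mean of its weekday group."""
    groups = defaultdict(list)
    for value, day in zip(values, weekdays):
        groups[day].append(value)
    group_means = {day: sum(vs) / len(vs) for day, vs in groups.items()}

    overall_mean = sum(values) / len(values)
    ss_total = sum((v - overall_mean) ** 2 for v in values)
    ss_residual = sum(
        (v - group_means[d]) ** 2 for v, d in zip(values, weekdays)
    )
    return 1.0 - ss_residual / ss_total


# Toy data for one location: two weeks, weekday index 0..6,
# with clearly lower values on days 5 and 6.
weekdays = [day % 7 for day in range(14)]
values = [12, 11, 10, 10, 9, 5, 4, 13, 10, 11, 9, 10, 4, 5]

r_square = weekday_r_square(values, weekdays)
corr = math.sqrt(r_square)  # multiple correlation, so corr**2 == r_square
print(round(corr, 4), round(r_square, 4))
```

Applied per location, thresholds such as cor > 0.2 can then be counted over the resulting list of values.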

Remarks

What are the predictor variables? “All variables” + historical hospital cases?
What do we want to predict? Hospitalization (28 days) and location.

.) time-invariant variables are not included (county, distance matrix, mobility score)
predicting location:
.) introduced weekdays as variable
.) Remark: the algorithm has to predict an alternating curve with the period time of
.) Fourier transform

bayesian additive regression trees applied to time series data

data set

sliding window 28 days
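
A minimal sketch of how a 28-day sliding window can turn one location's series into supervised samples. The pairing used here (previous 28 days as features, the value `horizon` days later as target) is an assumption; the notes only state the window length. `sliding_windows` and the toy series are hypothetical.

```python
def sliding_windows(series, window=28, horizon=1):
    """Build (features, target) pairs from a single time series.

    features: `window` consecutive values
    target:   the value `horizon` steps after the end of the window
    """
    samples = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        features = series[start:start + window]
        target = series[start + window + horizon - 1]
        samples.append((features, target))
    return samples


# Toy series of 40 daily values for one location.
series = list(range(40))
samples = sliding_windows(series)

print(len(samples))        # 12
print(samples[0][1])       # 28
print(samples[-1][0][:3])  # [11, 12, 13]
```

Each sample is one row for the regression model; a series shorter than `window + horizon` yields no samples.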

bayesian additive regression tree

hospitalization for 1 location