Tetragon: a brief introduction

A simple example: Covid in Europe

In this introduction to Tetragon, we use an older data frame with daily and cumulative Covid cases and deaths in Europe from March 2021 to August 2021: a small data set with four time features arranged as columns of a data frame.

Examples of time features for Covid stats in Europe
date	daily_cases	daily_deaths	cumulative_cases	cumulative_deaths
2021-03-02	102125	1973	22724397	549967
2021-03-03	117049	2755	22841446	552722
2021-03-04	133743	2461	22975189	555183
2021-03-05	133057	2724	23108246	557907
2021-03-06	126833	2111	23235079	560018
2021-03-07	103745	1527	23338824	561545
2021-03-08	88904	1535	23427728	563080
2021-03-09	107841	2157	23535569	565237
2021-03-10	129334	2580	23664903	567817
2021-03-11	148639	2416	23813542	570233
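As a quick sanity check on the layout, the ten rows shown above can be rebuilt by hand as a data frame. This is only a sketch: the object name covid_sample is ours and is not part of the package. It verifies that each cumulative value equals the previous cumulative value plus the current daily value.

```r
# First ten rows of the table above, rebuilt by hand
# (covid_sample is a hypothetical name, not an object shipped with the package)
covid_sample <- data.frame(
  date = seq(as.Date("2021-03-02"), by = "day", length.out = 10),
  daily_cases = c(102125, 117049, 133743, 133057, 126833,
                  103745, 88904, 107841, 129334, 148639),
  daily_deaths = c(1973, 2755, 2461, 2724, 2111,
                   1527, 1535, 2157, 2580, 2416),
  cumulative_cases = c(22724397, 22841446, 22975189, 23108246, 23235079,
                       23338824, 23427728, 23535569, 23664903, 23813542),
  cumulative_deaths = c(549967, 552722, 555183, 557907, 560018,
                        561545, 563080, 565237, 567817, 570233)
)

# Daily increments of the cumulative series should match the daily series
cases_ok <- all(diff(covid_sample$cumulative_cases) == covid_sample$daily_cases[-1])
deaths_ok <- all(diff(covid_sample$cumulative_deaths) == covid_sample$daily_deaths[-1])
c(cases = cases_ok, deaths = deaths_ok)
```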

In the first example, we predict the next 10 days for two time features and reduce the number of expanding windows used for cross-validation: n_windows is set to 3 instead of the standard 10, so the model is tested 3 times according to an expanding scheme. When there is not enough data for the validation windows, a message is displayed.
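To make the expanding scheme concrete, here is a minimal sketch of how train and test indices could be laid out for three windows. The helper expanding_windows, its arguments and the series length of 160 are assumptions made for illustration; the package may split the data differently, and it shows a message where this sketch simply stops.

```r
# Hypothetical helper: index sets for an expanding-window validation scheme.
# Each window tests on `horizon` points and trains on everything before them.
expanding_windows <- function(n_obs, horizon, n_windows) {
  if (n_obs <= horizon * n_windows) {
    stop("not enough data for the requested validation windows")
  }
  lapply(seq_len(n_windows), function(w) {
    test_end <- n_obs - (n_windows - w) * horizon
    test_start <- test_end - horizon + 1
    list(train = seq_len(test_start - 1), test = test_start:test_end)
  })
}

# Illustrative series length; the training set grows by one horizon at each window
windows <- expanding_windows(n_obs = 160, horizon = 10, n_windows = 3)
sapply(windows, function(w) {
  c(train_size = length(w$train), test_from = min(w$test), test_to = max(w$test))
})
```

With more windows the same series is split into more test blocks, so each training set starts shorter, which is why a reduced n_windows can be useful on small data sets.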

example1 <- tetragon(covid_in_europe[, c("daily_cases", "daily_deaths")], seq_len = 10, dates = covid_in_europe$date, method = "euclidean", distr = "exp", n_windows = 3, n_sample = 1)
  time: 0.89 sec elapsed

The result is a list of different components, as you can see below.

names(example1)
  [1] "exploration" "history"     "best"        "time_log"

The first component, exploration, includes all the models generated during the random search. The second, history, summarizes the hyper-parameters selected by the user or through random search, together with the related error metrics¹. Besides the predictions for each feature, best includes testing error statistics, prediction stats and plots for each one.

names(example1$best)
  [1] "predictions"    "testing_errors" "plots"

The predictions component is a list with the predicted results for each time feature (quantiles, min, max, mean, mode, sd, skewness, kurtosis, etc. for each time point in the seq_len sequence). Let’s see the prediction table for the first time feature.

knitr::kable(example1$best$predictions[[1]], align = "ccc", caption = "Predictions for daily Covid cases in Europe")

Predictions for daily Covid cases in Europe
	min	10%	25%	50%	75%	90%	max	mean	sd	mode	kurtosis	skewness	iqr_to_range	risk_ratio	upside_prob	divergence	entropy
2021-08-12	29494	36571.0	39707.0	42966	46422.0	50144.8	61868	43325.94	5868.115	44541.04	4.010	0.658	0.2074195	1.403058	NA	NA	6.898762
2021-08-13	15093	29890.0	36067.0	42581	50325.0	60147.0	72431	43443.61	11477.092	41271.57	2.972	0.395	0.2486658	1.085928	0.487	0.990	6.872923
2021-08-14	6414	28120.0	35329.0	43418	50384.5	56663.0	95753	43875.77	15169.204	44854.23	5.502	0.861	0.1685210	1.414307	0.533	0.952	6.847995
2021-08-15	8297	27150.4	40759.5	50045	57198.0	64589.7	105443	49712.23	17048.930	51671.99	4.938	0.621	0.1692144	1.326962	0.602	0.966	6.847656
2021-08-16	2452	22121.0	33011.0	39363	46123.0	52914.0	78231	39129.08	13017.453	39442.62	4.452	0.124	0.1730295	1.053019	0.331	0.953	6.847161
2021-08-17	19716	31420.0	38120.0	42400	48168.0	51748.0	69669	42744.43	9704.995	42425.11	3.820	0.319	0.2011491	1.202125	0.565	0.924	6.881801
2021-08-18	38129	47005.5	49754.0	54793	57998.0	60783.0	71848	54117.75	6349.018	55887.94	3.600	-0.207	0.2444912	1.023464	0.846	0.990	6.900765
2021-08-19	23747	39750.0	46869.0	50084	53779.0	58015.0	71423	49850.44	7992.403	49818.05	4.745	-0.329	0.1449367	0.810229	0.343	0.968	6.894386
2021-08-20	11707	35310.0	42920.0	49192	58626.0	64973.0	92135	50419.57	13782.792	47188.54	4.286	0.246	0.1952803	1.145605	0.510	0.952	6.869059
2021-08-21	27772	39543.0	44994.0	54057	61759.0	73170.0	127966	56581.26	18827.413	48847.84	7.773	1.992	0.1673254	2.811832	0.592	0.964	6.859645
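The basic columns of a table like this can be obtained by summarizing a set of sampled values for each future time point. The sketch below does this in base R on simulated draws; the gamma draws, the variable names and the moment-based formulas for skewness and kurtosis are our assumptions, not the package's internal code, and the mode is left out because it needs a density estimate.

```r
set.seed(1)
# Simulated draws standing in for the sampled values of one future time point
draws <- rgamma(1000, shape = 50, rate = 50 / 43000)

# Standardized draws, using the population standard deviation
z <- (draws - mean(draws)) / sqrt(mean((draws - mean(draws))^2))

summary_row <- c(
  min = min(draws),
  quantile(draws, probs = c(0.10, 0.25, 0.50, 0.75, 0.90)),
  max = max(draws),
  mean = mean(draws),
  sd = sd(draws),
  skewness = mean(z^3),  # moment-based skewness
  kurtosis = mean(z^4)   # moment-based, non-excess kurtosis (3 for a normal)
)
round(summary_row, 3)
```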

In version 1.1, IQR to range, risk ratio, upside probability, divergence and entropy have been added directly to the prediction table, and the terminal values (calculated at both ends of each sequence) have been dropped. A few brief explanations:

1. IQR to range: almost self-explanatory; normalizing the IQR by the min-max range allows comparison among different time features;

2. risk ratio: the ratio between the range above the median and the range below it (in financial series, it shows how deep the precipice is even when the trend is going up);

3. upside probability: the probability of getting a larger value than at the previous point (a note here: in most cases the value is around 50%); since it needs a previous point, the first time point shows NA;

4. divergence: we replaced the average Kullback-Leibler divergence with a simpler measure of divergence, quite similar to the Chebyshev distance: in our case, the maximum distance between subsequent ECDFs (again NA at the first time point);

5. entropy: a last new entry, computed with a dedicated package² (it could be of interest to understand how entropy evolves in long-term forecasting and how it relates to good or bad predictions).
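The first two measures can be checked by hand against the 2021-08-12 row of the prediction table. The other two depend on the sampled values, so the sketch below only illustrates our reading of the descriptions on simulated draws; the variable names and the pairwise definition of upside probability are assumptions, and the results will not match the package's figures. Entropy is left out because it relies on a separate package.

```r
# Values copied from the 2021-08-12 row of the prediction table
q_min <- 29494; q25 <- 39707; q50 <- 42966; q75 <- 46422; q_max <- 61868

iqr_to_range <- (q75 - q25) / (q_max - q_min)  # table reports 0.2074195
risk_ratio <- (q_max - q50) / (q50 - q_min)    # table reports 1.403058

# Sample-based measures, sketched on simulated draws for two consecutive points
set.seed(42)
previous_draws <- rnorm(500, mean = 43000, sd = 6000)
next_draws <- rnorm(500, mean = 43500, sd = 11000)

# Share of (next, previous) pairs in which the next value is the larger one
upside_prob <- mean(outer(next_draws, previous_draws, ">"))

# Largest vertical gap between the two empirical cumulative distribution functions
grid <- sort(c(previous_draws, next_draws))
divergence <- max(abs(ecdf(next_draws)(grid) - ecdf(previous_draws)(grid)))

c(iqr_to_range = iqr_to_range, risk_ratio = risk_ratio,
  upside_prob = upside_prob, divergence = divergence)
```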

For each time feature included in the model, you get a plot of the median with the chosen confidence interval (the default for ci is 0.8).
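As a rough illustration of what such a band contains, the sketch below assumes that the interval is symmetric around the median, so that ci = 0.8 corresponds to the 10% and 90% quantile columns of the prediction table. That mapping is our assumption; the quantile values are copied from the first three rows of the table for daily cases.

```r
ci <- 0.8
# Quantile levels of a symmetric band around the median (an assumption)
band_probs <- c(lower = (1 - ci) / 2, median = 0.5, upper = 1 - (1 - ci) / 2)

# Quantile columns copied from the first three rows of the prediction table
pred <- data.frame(
  date = as.Date(c("2021-08-12", "2021-08-13", "2021-08-14")),
  q10 = c(36571.0, 29890.0, 28120.0),
  q50 = c(42966, 42581, 43418),
  q90 = c(50144.8, 60147.0, 56663.0)
)

# Width of the band at each time point
pred$band_width <- pred$q90 - pred$q10
band_probs
pred
```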

example1$best$plots
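  $daily_cases
  [plot: median prediction with confidence interval for daily_cases]

  $daily_deaths
  [plot: median prediction with confidence interval for daily_deaths]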

Automating the search for a better model

Now, the question is simple: can we get a better prediction by searching the hyper-parameter space with random search? Let’s give it a try. The following example shows how to sample 100 different models from a compact hyper-parameter space: we are searching for the best method and distr (you can restrict the parameters to your choice among the available options, and search over seq_len too).
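The idea behind the random search can be sketched in a few lines: draw a method and a distribution at random for each time feature, once per candidate model. The option lists below are simply the values that appear in the search history table, so they may not be exhaustive, and the sampling code is an illustration rather than the package's own implementation.

```r
set.seed(123)
# Option values as they appear in the search history table (possibly not exhaustive)
methods <- c("euclidean", "manhattan", "chebyshev", "gower", "lorentzian", "jaccard",
             "dice", "squared_euclidean", "divergence", "clark", "avg")
distrs <- c("norm", "t", "cauchy", "exp", "logis", "empirical")
n_sample <- 100  # number of candidate models
n_feats <- 2     # daily_cases and daily_deaths

# One row per candidate model: a method and a distribution for each time feature
candidates <- data.frame(
  seq_len = rep(10, n_sample),
  method = replicate(n_sample, paste(sample(methods, n_feats, replace = TRUE), collapse = ", ")),
  distr = replicate(n_sample, paste(sample(distrs, n_feats, replace = TRUE), collapse = ", ")),
  stringsAsFactors = FALSE
)
head(candidates, 3)
```

Each candidate would then be scored on the validation windows and the rows ranked by their average score.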

example2 <- tetragon(covid_in_europe[, c("daily_cases", "daily_deaths")], seq_len = 10, n_sample = 100, n_windows = 3)
  time: 102.76 sec elapsed

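If we compare the error statistics of the best model in example2 with those of the naive model in example1, most of them improve: for both features pred_scores rise and mae, mape and mase fall, although the bias (me) grows in magnitude and mse gets worse for daily_cases.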

The error statistics from example1.

knitr::kable(round(example1$best$testing_errors, 2), align = "ccc", caption = "Testing errors for each time feature BEFORE random search")

Testing errors for each time feature BEFORE random search
	pred_scores	me	mae	mse	rmsse	mpe	mape	rmae	rrmse	rame	mase	smse	sce	gmrae
daily_cases	0.22	-2566.83	14079.20	457535707.3	140.37	-0.14	0.31	1.04	1.03	2.28	0.87	27268.05	-1.73	1.01
daily_deaths	0.22	-68.57	354.11	248933.8	23.03	0.05	0.31	0.90	0.93	1.05	1.00	687.76	-1.55	0.68

The error statistics from example2.

knitr::kable(round(example2$best$testing_errors, 2), align = "ccc", caption = "Testing errors for each time feature AFTER random search")

Testing errors for each time feature AFTER random search
	pred_scores	me	mae	mse	rmsse	mpe	mape	rmae	rrmse	rame	mase	smse	sce	gmrae
daily_cases	0.51	-5429.85	13112.13	522919285	139.83	-0.11	0.27	0.93	0.98	0.42	0.81	31122.21	-3.20	0.80
daily_deaths	0.25	-83.38	315.56	215040	20.89	0.00	0.27	0.79	0.82	0.86	0.89	591.87	-2.07	0.69
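To put a number on the comparison, the sketch below computes the relative change of a few metrics between the two tables. The values are the rounded figures copied from the tables above, and the object names are ours.

```r
# Rounded values copied from the two testing-error tables
before <- data.frame(mae = c(14079.20, 354.11), mape = c(0.31, 0.31), mase = c(0.87, 1.00),
                     row.names = c("daily_cases", "daily_deaths"))
after <- data.frame(mae = c(13112.13, 315.56), mape = c(0.27, 0.27), mase = c(0.81, 0.89),
                    row.names = c("daily_cases", "daily_deaths"))

# Relative change in percent; negative values mean a lower error after the search
pct_change <- round(100 * (after - before) / before, 1)
pct_change
```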

A closer look at the history table:

knitr::kable(example2$history, align = "ccc", caption = "Search history (100 samples)")

Search history (100 samples)
	seq_len	method	distr	avg_pred_scores	avg_me	avg_mae	avg_mse	avg_rmsse	avg_mpe	avg_mape	avg_rmae	avg_rrmse	avg_rame	avg_mase	avg_smse	avg_sce	avg_gmrae
10	10	divergence, divergence	empirical, t	0.37985	-2756.6148	6713.846	261567162	80.35933	-0.0580000	0.2700000	0.8593333	0.9008333	0.640500	0.8491667	15857.039	-2.6373333	0.7456667
91	10	euclidean , divergence	t , empirical	0.30375	-1598.3510	6555.080	201442525	77.14700	-0.0571667	0.2953333	0.9133333	0.9510000	1.538833	0.8795000	12340.490	-1.4605000	0.7755000
32	10	clark, clark	cauchy, exp	0.29745	-5615.8407	9312.216	599246599	109.04050	-0.0790000	0.3488333	1.0473333	1.0961667	2.027167	1.0588333	35819.043	-3.1976667	0.8678333
48	10	clark , manhattan	cauchy, exp	0.29340	-5486.0452	8916.275	569185018	104.32350	-0.0920000	0.2986667	0.9360000	0.9878333	1.507500	0.9910000	33990.298	-4.0850000	0.7561667
76	10	divergence, avg	empirical, norm	0.29205	-2054.6215	6688.655	251093428	81.60133	0.0055000	0.2860000	0.9175000	0.9783333	1.087333	0.8833333	15301.284	-1.7448333	0.7508333
85	10	jaccard, clark	exp , empirical	0.29050	-844.7142	6508.213	185421568	74.59367	-0.0393333	0.2791667	0.8871667	0.9115000	1.231833	0.8655000	11378.589	-1.4760000	0.7943333
58	10	dice , clark	logis, exp	0.28920	-481.1907	6074.388	153100908	70.42583	0.0013333	0.2995000	0.9110000	0.9445000	1.133500	0.8556667	9469.215	-0.3476667	0.8278333
19	10	gower, gower	exp, t	0.28785	-1407.7273	6465.874	204133246	75.65033	-0.0405000	0.2733333	0.8646667	0.9055000	1.229500	0.8446667	12458.500	-1.5706667	0.6878333
4	10	manhattan, clark	exp, exp	0.28715	-895.8275	6785.242	202380568	77.97783	0.0056667	0.3130000	0.9426667	0.9880000	1.316333	0.8823333	12358.717	-0.2581667	0.7996667
51	10	jaccard , euclidean	exp , cauchy	0.27750	-1074.2392	6151.684	168395538	72.07883	-0.0348333	0.2813333	0.8803333	0.9085000	1.308667	0.8486667	10360.166	-1.3445000	0.8106667
21	10	lorentzian, gower	t , exp	0.27595	-869.9140	6355.827	177296773	73.57150	-0.0276667	0.2801667	0.8820000	0.9151667	1.315667	0.8505000	10879.900	-1.2433333	0.7696667
50	10	jaccard , chebyshev	exp , logis	0.26985	-980.8005	6011.796	152465063	70.28667	-0.0573333	0.2816667	0.8748333	0.9068333	1.607500	0.8403333	9451.682	-1.6213333	0.7505000
52	10	divergence, divergence	t , norm	0.26790	-1611.2940	6943.843	235279575	82.94167	0.0178333	0.3678333	1.0525000	1.1001667	1.421667	0.9920000	14378.800	0.2153333	0.9493333
78	10	avg , clark	logis, logis	0.26740	-1101.4172	6611.396	195036859	76.89033	-0.0163333	0.3193333	0.9511667	0.9875000	1.440167	0.8953333	11945.786	-0.4395000	0.8666667
80	10	lorentzian, clark	cauchy, t	0.26470	-1188.6183	6541.933	198199937	75.86633	-0.0483333	0.2776667	0.8670000	0.9041667	1.164167	0.8411667	12106.506	-1.6791667	0.7401667
62	10	divergence, divergence	t , exp	0.26020	-1701.8977	7247.789	239660193	84.21867	-0.0130000	0.3735000	1.0575000	1.0985000	1.827167	0.9728333	14644.128	-0.2953333	0.8871667
64	10	jaccard , squared_euclidean	logis, exp	0.25940	-878.8358	6387.124	178724136	73.96700	-0.0425000	0.2761667	0.8785000	0.9116667	1.297833	0.8540000	10989.925	-1.5446667	0.7793333
7	10	divergence, lorentzian	empirical, norm	0.25895	-948.8515	7524.854	271185548	86.75583	0.0030000	0.2896667	0.9555000	1.0055000	1.160500	0.9150000	16502.917	-1.4636667	0.6811667
8	10	gower, clark	logis , empirical	0.25660	-1083.1680	6300.028	174554642	73.26717	-0.0498333	0.2880000	0.8865000	0.9266667	1.494833	0.8505000	10740.963	-1.3463333	0.7136667
69	10	jaccard, jaccard	exp, t	0.25300	-731.6817	7107.943	219951581	79.91767	-0.0211667	0.2881667	0.9166667	0.9466667	1.115667	0.8866667	13401.045	-1.3150000	0.7690000
22	10	manhattan , squared_euclidean	exp , logis	0.24685	-574.6447	6184.675	151779049	70.17333	-0.0380000	0.2788333	0.8695000	0.8948333	1.569500	0.8161667	9358.199	-0.9401667	0.7651667
20	10	jaccard , manhattan	t, t	0.24685	-1134.8733	6473.737	182647693	74.60767	-0.0531667	0.2878333	0.8956667	0.9250000	1.554333	0.8560000	11210.897	-1.5323333	0.7021667
3	10	clark, gower	t , logis	0.24670	-1668.1400	6500.985	210819726	76.72100	-0.0385000	0.2831667	0.8803333	0.9176667	1.413333	0.8533333	12838.904	-1.5671667	0.7536667
86	10	avg , gower	cauchy, exp	0.24495	-1313.7755	6582.964	191716205	75.78000	-0.0615000	0.2876667	0.8900000	0.9253333	1.663333	0.8483333	11743.182	-1.6861667	0.7060000
45	10	squared_euclidean, lorentzian	logis, t	0.24485	-678.3088	6134.132	156846427	71.16417	-0.0445000	0.2850000	0.8900000	0.9225000	1.378000	0.8590000	9732.013	-1.5423333	0.7261667
54	10	divergence, clark	exp , norm	0.24460	-6826.5583	10560.672	737105771	120.11600	-0.1206667	0.3896667	1.1878333	1.2086667	2.698500	1.1436667	44079.989	-3.8466667	1.0766667
73	10	divergence, euclidean	exp, t	0.24150	-6489.9573	10404.374	735204637	118.16450	-0.1328333	0.3376667	1.0726667	1.1073333	2.400833	1.0995000	43896.297	-4.8745000	0.8695000
13	10	gower , manhattan	t , logis	0.24105	-1121.7063	6371.332	182720230	74.34567	-0.0288333	0.2903333	0.9046667	0.9361667	1.306167	0.8715000	11222.789	-1.2818333	0.7831667
60	10	clark , squared_euclidean	logis, t	0.24080	-5160.4487	9327.230	590455365	108.13700	-0.0948333	0.3136667	0.9860000	1.0308333	1.936167	1.0210000	35274.667	-3.8776667	0.8060000
59	10	avg , clark	logis, t	0.23585	-1013.9622	6257.372	175179673	72.84633	-0.0406667	0.2773333	0.8733333	0.9083333	1.363333	0.8463333	10765.391	-1.3565000	0.7176667
43	10	manhattan , lorentzian	cauchy, t	0.23535	-1173.8897	6174.214	165110203	71.41467	-0.0533333	0.2850000	0.8778333	0.9061667	1.642000	0.8293333	10161.166	-1.5353333	0.7778333
55	10	squared_euclidean, dice	logis, exp	0.23455	-705.7505	6517.286	176386482	74.15900	-0.0528333	0.2853333	0.8963333	0.9223333	1.451833	0.8608333	10857.134	-1.4308333	0.7746667
79	10	divergence, jaccard	t , cauchy	0.23315	-1703.3458	6676.441	219777949	77.64833	-0.0425000	0.2868333	0.8836667	0.9203333	1.384167	0.8620000	13369.606	-1.8043333	0.7450000
26	10	clark , manhattan	t, t	0.23220	-1088.9512	7020.025	215157771	79.79267	-0.0576667	0.2918333	0.9200000	0.9495000	1.528833	0.8951667	13155.277	-1.7805000	0.7691667
88	10	avg, avg	exp , norm	0.23200	-1260.1432	6692.617	199888002	76.99783	-0.0481667	0.2916667	0.9155000	0.9420000	1.591500	0.8916667	12257.487	-1.5770000	0.7796667
81	10	euclidean, euclidean	t, t	0.23170	-642.9992	6679.146	186953417	76.51717	-0.0280000	0.2946667	0.9203333	0.9550000	1.425333	0.8856667	11480.644	-1.2540000	0.7725000
18	10	clark , squared_euclidean	norm , cauchy	0.23035	-5766.2057	9291.684	607199016	108.55433	-0.0988333	0.3175000	0.9960000	1.0396667	1.988833	1.0321667	36271.596	-4.0200000	0.8025000
82	10	lorentzian, chebyshev	exp , norm	0.23020	-760.8563	6490.586	184039229	74.91850	-0.0200000	0.2835000	0.8930000	0.9260000	1.276000	0.8591667	11273.216	-1.1155000	0.7698333
100	10	clark, dice	cauchy, cauchy	0.22740	-5502.2032	9245.353	584568211	107.22067	-0.1223333	0.3163333	0.9911667	1.0243333	2.073500	1.0213333	34949.647	-4.3208333	0.8360000
31	10	dice , jaccard	norm , cauchy	0.22710	-729.1557	6190.017	163760792	72.25433	-0.0343333	0.2835000	0.8911667	0.9285000	1.301667	0.8613333	10124.179	-1.3891667	0.7101667
42	10	squared_euclidean, euclidean	exp , logis	0.22630	-239.3847	6627.497	176047870	74.26167	-0.0100000	0.2826667	0.9061667	0.9246667	1.193500	0.8718333	10813.507	-0.7950000	0.8181667
83	10	clark , euclidean	t , logis	0.22505	-1427.1773	6862.577	211467060	79.34267	-0.0550000	0.3001667	0.9373333	0.9671667	1.715167	0.9010000	12946.126	-1.6805000	0.7718333
99	10	jaccard, dice	norm, exp	0.22320	-1038.6300	6717.920	204287115	76.92017	-0.0465000	0.2695000	0.8696667	0.9026667	1.244500	0.8493333	12464.682	-1.6465000	0.7506667
46	10	gower , divergence	cauchy, t	0.22235	-1022.5437	6601.788	182762757	75.21933	-0.0603333	0.2963333	0.9115000	0.9418333	1.643333	0.8646667	11232.244	-1.5605000	0.7638333
33	10	squared_euclidean, lorentzian	logis, exp	0.22225	-554.6870	6542.160	175352838	74.13350	-0.0290000	0.2990000	0.9270000	0.9475000	1.446833	0.8930000	10801.494	-1.1955000	0.7938333
30	10	lorentzian, manhattan	t , cauchy	0.22080	-1419.8172	6859.918	225019319	79.40233	-0.0446667	0.2800000	0.8905000	0.9283333	1.374167	0.8761667	13703.811	-1.6801667	0.6518333
74	10	manhattan , squared_euclidean	t , norm	0.22075	-1209.4188	6695.710	203547576	77.65383	-0.0241667	0.2951667	0.9196667	0.9535000	1.521167	0.8873333	12441.809	-1.1903333	0.7651667
68	10	manhattan, jaccard	t , norm	0.21980	-1284.3032	6419.215	182770253	74.64483	-0.0475000	0.2956667	0.9111667	0.9433333	1.635500	0.8723333	11235.239	-1.5423333	0.7423333
38	10	avg , chebyshev	logis, exp	0.21955	-1610.3110	6520.614	198390251	76.46667	-0.0636667	0.2956667	0.9093333	0.9486667	1.640833	0.8786667	12173.210	-2.0385000	0.7383333
2	10	euclidean , divergence	norm , cauchy	0.21380	-1398.2370	6389.834	181781374	76.59017	0.0011667	0.3733333	1.0440000	1.0931667	1.659667	0.9583333	11248.290	0.2648333	0.8990000
36	10	clark , divergence	cauchy, norm	0.21290	-5578.4527	9482.778	590306483	112.17367	-0.0708333	0.4125000	1.1766667	1.2335000	2.307167	1.1328333	35403.728	-2.4796667	0.9716667
75	10	squared_euclidean, manhattan	t , cauchy	0.21240	-466.9628	6508.598	170450039	73.63067	-0.0228333	0.2871667	0.8963333	0.9276667	1.533667	0.8448333	10456.566	-0.7486667	0.7360000
14	10	squared_euclidean, chebyshev	logis , cauchy	0.21130	-434.1252	6661.615	184405947	74.89550	-0.0133333	0.2848333	0.9026667	0.9255000	1.190667	0.8678333	11302.026	-0.8918333	0.7973333
6	10	dice , gower	exp , norm	0.20705	-602.2478	6726.564	186994100	75.82367	-0.0231667	0.2883333	0.9063333	0.9303333	1.358833	0.8648333	11447.051	-1.0038333	0.8041667
29	10	chebyshev, jaccard	t , norm	0.20505	-1238.3750	6639.366	194371012	76.70633	-0.0505000	0.2995000	0.9205000	0.9486667	1.638500	0.8800000	11914.258	-1.5926667	0.7581667
72	10	dice , clark	empirical, logis	0.20265	-866.0878	6354.424	177958679	74.42783	0.0046667	0.3135000	0.9360000	0.9818333	1.279833	0.8778333	10934.783	-0.2145000	0.8330000
9	10	manhattan, chebyshev	cauchy, norm	0.20215	-1430.9840	6398.092	188161176	74.56583	-0.0585000	0.2861667	0.8865000	0.9236667	1.587667	0.8613333	11559.914	-1.8636667	0.7006667
90	10	gower, avg	cauchy, norm	0.19975	-1121.0180	6418.189	179478430	74.24067	-0.0505000	0.2861667	0.8916667	0.9260000	1.587500	0.8533333	11031.397	-1.4951667	0.6866667
27	10	jaccard, jaccard	norm , cauchy	0.19760	-873.6373	6512.593	184454874	74.38200	-0.0518333	0.2738333	0.8641667	0.8970000	1.351000	0.8385000	11295.676	-1.5745000	0.7505000
67	10	euclidean, manhattan	cauchy, exp	0.19685	-948.5235	6814.524	194915552	77.31017	-0.0568333	0.2888333	0.9166667	0.9401667	1.605167	0.8795000	11950.976	-1.5485000	0.7848333
98	10	lorentzian , squared_euclidean	norm , cauchy	0.19325	-1042.4753	6503.183	187346653	75.86083	-0.0283333	0.2956667	0.9223333	0.9533333	1.477667	0.8913333	11519.460	-1.2820000	0.7851667
70	10	jaccard, gower	norm , cauchy	0.19040	-336.8587	6550.530	167404421	73.99267	-0.0368333	0.2868333	0.9123333	0.9356667	1.441000	0.8783333	10346.005	-1.2393333	0.7903333
15	10	clark , euclidean	logis, t	0.18990	-5317.1587	10091.729	631653619	114.14983	-0.1248333	0.3376667	1.0658333	1.0951667	2.376167	1.0636667	37757.621	-4.1290000	0.8325000
40	10	lorentzian, jaccard	norm, t	0.18925	-1116.1363	6428.041	178038459	74.48600	-0.0543333	0.2933333	0.9013333	0.9380000	1.610167	0.8615000	10960.690	-1.6995000	0.7551667
71	10	manhattan, clark	empirical, cauchy	0.18410	-1046.9252	6305.790	177242293	73.87483	-0.0170000	0.3051667	0.9151667	0.9571667	1.423000	0.8620000	10880.669	-0.5423333	0.7826667
65	10	avg , lorentzian	cauchy, norm	0.18165	-860.6475	6273.967	174503940	72.84700	-0.0346667	0.2776667	0.8735000	0.9088333	1.244333	0.8441667	10721.652	-1.3641667	0.6918333
5	10	euclidean , squared_euclidean	logis, norm	0.18110	-879.9973	6428.592	171236022	73.47517	-0.0370000	0.2951667	0.9133333	0.9341667	1.628333	0.8688333	10547.469	-1.1835000	0.8273333
34	10	euclidean, euclidean	t , empirical	0.18110	-1270.2000	6294.010	173398580	73.57017	-0.0586667	0.2863333	0.8891667	0.9271667	1.607000	0.8513333	10679.290	-1.7026667	0.6880000
97	10	euclidean, chebyshev	exp , empirical	0.18055	-1370.9735	6146.852	178151364	72.51450	-0.0505000	0.2770000	0.8600000	0.9036667	1.379333	0.8371667	10944.083	-1.7485000	0.6326667
95	10	manhattan , squared_euclidean	norm, norm	0.17980	-1418.2690	6725.072	198313386	76.73850	-0.0490000	0.3006667	0.9303333	0.9471667	1.782500	0.8773333	12142.191	-1.3891667	0.8326667
63	10	chebyshev, euclidean	cauchy, norm	0.17970	-1629.5315	6426.457	191037529	75.41800	-0.0608333	0.2950000	0.9081667	0.9410000	1.679167	0.8746667	11738.242	-1.8870000	0.7041667
96	10	chebyshev , lorentzian	t , norm	0.17625	-1261.2040	6807.739	206507881	77.68983	-0.0553333	0.2870000	0.8978333	0.9310000	1.576500	0.8671667	12612.268	-1.6558333	0.7565000
89	10	gower, gower	empirical, t	0.17400	-1178.8008	6384.291	182716077	74.14950	-0.0636667	0.2763333	0.8640000	0.9015000	1.487333	0.8383333	11214.671	-1.8883333	0.6555000
25	10	euclidean , lorentzian	empirical, exp	0.17200	-1057.1113	6463.954	184531090	74.55417	-0.0573333	0.2790000	0.8748333	0.9073333	1.458333	0.8448333	11306.469	-1.6208333	0.7008333
23	10	manhattan, chebyshev	norm , cauchy	0.17200	-1370.5175	6580.256	185350254	75.58283	-0.0733333	0.2943333	0.9210000	0.9433333	1.793333	0.8823333	11432.055	-2.1041667	0.8101667
44	10	squared_euclidean, dice	norm, norm	0.17175	-995.9355	6615.075	183190100	75.19283	-0.0500000	0.3001667	0.9230000	0.9495000	1.622833	0.8786667	11279.641	-1.5770000	0.7611667
39	10	lorentzian , squared_euclidean	logis , empirical	0.16890	-1295.5753	6772.305	211441082	78.57483	-0.0376667	0.2928333	0.9161667	0.9508333	1.486833	0.8876667	12914.098	-1.4615000	0.7728333
66	10	euclidean, jaccard	empirical, t	0.16885	-1098.8565	6515.307	187266082	75.28283	-0.0626667	0.2800000	0.8796667	0.9161667	1.496500	0.8530000	11491.005	-1.7535000	0.6946667
56	10	jaccard , euclidean	logis , empirical	0.16720	-1307.4000	6279.768	184012233	73.89150	-0.0480000	0.2763333	0.8666667	0.9081667	1.409000	0.8431667	11282.767	-1.6855000	0.6720000
16	10	clark , squared_euclidean	norm , empirical	0.16560	-5430.7632	9315.151	609923400	108.74117	-0.0883333	0.3150000	0.9883333	1.0363333	1.745000	1.0368333	36424.723	-3.9751667	0.7921667
47	10	gower, dice	cauchy , empirical	0.16385	-1393.3568	6182.032	181112842	72.87733	-0.0506667	0.2815000	0.8680000	0.9088333	1.450000	0.8438333	11118.197	-1.6936667	0.6610000
92	10	euclidean, euclidean	norm , cauchy	0.16245	-1259.2183	6476.160	186403439	75.18700	-0.0528333	0.2913333	0.9035000	0.9360000	1.647833	0.8635000	11445.440	-1.5420000	0.7458333
12	10	lorentzian, lorentzian	exp , empirical	0.16095	-880.2260	6686.079	199903744	77.05400	-0.0300000	0.2786667	0.8903333	0.9251667	1.118167	0.8663333	12219.831	-1.4766667	0.7308333
35	10	dice , gower	norm , empirical	0.15845	-1066.3492	5911.657	163935110	70.34217	-0.0326667	0.2686667	0.8475000	0.8895000	1.133667	0.8235000	10101.107	-1.4908333	0.6725000
77	10	manhattan, manhattan	empirical, t	0.15725	-1116.9092	6350.289	179840078	73.84683	-0.0506667	0.2820000	0.8776667	0.9096667	1.485500	0.8501667	11045.919	-1.6420000	0.7210000
41	10	manhattan, manhattan	t , empirical	0.15550	-1255.4373	6728.119	198467090	77.16867	-0.0536667	0.2946667	0.9168333	0.9470000	1.651667	0.8795000	12161.848	-1.6371667	0.7295000
24	10	squared_euclidean, dice	empirical, logis	0.15530	-1092.8768	6607.361	185194280	75.89983	-0.0621667	0.2953333	0.9175000	0.9493333	1.656167	0.8825000	11412.088	-1.8668333	0.7580000
37	10	gower , divergence	empirical, cauchy	0.15320	-1067.6582	6415.687	182262223	76.61483	0.0138333	0.3573333	1.0178333	1.0728333	1.517000	0.9430000	11250.134	0.3671667	0.8396667
28	10	manhattan, gower	exp , empirical	0.14885	-504.0503	7027.039	199733130	78.03017	-0.0285000	0.2928333	0.9258333	0.9461667	1.394833	0.8920000	12217.467	-1.1651667	0.7860000
94	10	euclidean, euclidean	empirical, t	0.14595	-1160.7033	6425.139	183416791	74.77150	-0.0513333	0.2855000	0.8901667	0.9300000	1.536167	0.8583333	11266.298	-1.6791667	0.7083333
53	10	lorentzian, avg	cauchy , empirical	0.14425	-837.6475	6278.195	169777714	72.68633	-0.0376667	0.2823333	0.8826667	0.9128333	1.351667	0.8471667	10443.588	-1.3626667	0.7215000
93	10	gower , euclidean	empirical, logis	0.14245	-1127.8158	6295.230	177634023	73.25050	-0.0475000	0.2761667	0.8626667	0.9035000	1.457000	0.8291667	10895.509	-1.4705000	0.6831667
1	10	euclidean , lorentzian	norm, norm	0.13720	-1101.1345	6483.005	176958641	74.11217	-0.0680000	0.2875000	0.8975000	0.9271667	1.686500	0.8510000	10895.183	-1.7213333	0.6593333
87	10	clark , euclidean	cauchy , empirical	0.13675	-5900.5617	9496.173	621630384	110.56267	-0.1245000	0.3250000	1.0155000	1.0630000	2.200833	1.0416667	37159.605	-4.4956667	0.7438333
84	10	lorentzian, chebyshev	norm , empirical	0.11860	-1198.7222	6368.122	184717611	74.47500	-0.0496667	0.2813333	0.8775000	0.9181667	1.458833	0.8511667	11330.715	-1.6315000	0.6720000
17	10	avg , jaccard	norm , empirical	0.11130	-1470.6315	6762.974	202334165	77.49817	-0.0671667	0.2995000	0.9235000	0.9516667	1.769333	0.8791667	12399.887	-1.8636667	0.7368333
49	10	avg , lorentzian	logis , empirical	0.10995	-985.7722	6351.975	172105924	73.12800	-0.0543333	0.2800000	0.8783333	0.9171667	1.552833	0.8455000	10598.459	-1.6270000	0.6521667
57	10	lorentzian, divergence	empirical, exp	0.09620	-1031.3753	6460.685	185656640	76.30867	0.0118333	0.3488333	0.9925000	1.0413333	1.399667	0.9198333	11420.129	0.1795000	0.8826667
61	10	euclidean, avg	empirical, empirical	0.07515	-1096.2438	6469.849	184241686	75.01450	-0.0473333	0.2870000	0.8953333	0.9318333	1.539667	0.8621667	11311.896	-1.5465000	0.7011667
11	10	avg , gower	empirical, empirical	0.06125	-1260.6433	6348.656	182447867	74.20667	-0.0550000	0.2818333	0.8788333	0.9175000	1.516500	0.8491667	11200.435	-1.6931667	0.6775000

Here are the best parameters discovered during the random search:

knitr::kable(example2$history[1, ], align = "ccc", caption = "Testing errors for each time feature after random search")

Testing errors for each time feature after random search
	seq_len	method	distr	avg_pred_scores	avg_me	avg_mae	avg_mse	avg_rmsse	avg_mpe	avg_mape	avg_rmae	avg_rrmse	avg_rame	avg_mase	avg_smse	avg_sce	avg_gmrae
10	10	divergence, divergence	empirical, t	0.37985	-2756.615	6713.846	261567162	80.35933	-0.058	0.27	0.8593333	0.9008333	0.6405	0.8491667	15857.04	-2.637333	0.7456667
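The avg_ columns of the history table appear to be plain averages of the per-feature testing errors. The sketch below checks this on the rounded figures reported above for the best model; treating the columns as simple means is our reading of the numbers, not something stated by the package.

```r
# Per-feature values from the AFTER table (daily_cases, daily_deaths)
after <- data.frame(pred_scores = c(0.51, 0.25), me = c(-5429.85, -83.38),
                    mae = c(13112.13, 315.56), mse = c(522919285, 215040))
feature_means <- colMeans(after)

# Corresponding avg_ values from the first row of the history table
history_best <- c(pred_scores = 0.37985, me = -2756.615, mae = 6713.846, mse = 261567162)

# Relative gap between the two; small gaps are consistent with a plain average
# (the remaining differences come from rounding in the printed tables)
rel_gap <- abs(feature_means - history_best) / abs(history_best)
round(rel_gap, 4)
```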

Let’s have a look at the plots for the best model.

example2$best$plots
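  $daily_cases
  [plot: median prediction with confidence interval for daily_cases]

  $daily_deaths
  [plot: median prediction with confidence interval for daily_deaths]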

¹ The error metrics are calculated using the greybox package. For more information, see https://cran.r-project.org/web/packages/greybox/index.html
² We used the entropy package with its base options. For more information, see https://cran.r-project.org/web/packages/entropy/index.html

Giancarlo Vercellino

25-April-2022