naive: a brief introduction

For example, let’s look at tech giants’ stocks …

The time_features dataset included with naive contains recent stock prices for several Big Tech companies (source: Yahoo Finance). The data is expected as a data frame in which each column is a separate time series. The date information is not mandatory and can be provided separately.

Example of time features: tech giants' closing share prices
date	IBM.Close	AAPL.Close	AMZN.Close	GOOGL.Close	MSFT.Close
2017-01-03	159.8375	29.0375	753.67	808.01	62.58
2017-01-04	161.8164	29.0050	757.18	807.77	62.30
2017-01-05	161.2811	29.1525	780.45	813.02	62.30
2017-01-06	162.0746	29.4775	795.99	825.21	62.84
2017-01-09	160.2773	29.7475	796.92	827.18	62.64
2017-01-10	158.2409	29.7775	795.90	826.01	62.62
2017-01-11	160.3728	29.9375	799.02	829.86	63.19
2017-01-12	160.5641	29.8125	813.64	829.53	62.61
2017-01-13	159.9809	29.7600	817.14	830.94	62.70
2017-01-17	160.5067	30.0000	809.72	827.46	62.53
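
To make the expected input layout concrete, here is a minimal sketch that rebuilds the first three rows of the table by hand and separates the series from the dates. The object name tf_head is a hypothetical stand-in used only for illustration; the packaged dataset itself is time_features.

```r
# First three rows of the table above, typed in by hand.
# tf_head is a hypothetical name; the packaged dataset is time_features.
tf_head <- data.frame(
  date        = as.Date(c("2017-01-03", "2017-01-04", "2017-01-05")),
  IBM.Close   = c(159.8375, 161.8164, 161.2811),
  AAPL.Close  = c(29.0375, 29.0050, 29.1525),
  AMZN.Close  = c(753.67, 757.18, 780.45),
  GOOGL.Close = c(808.01, 807.77, 813.02),
  MSFT.Close  = c(62.58, 62.30, 62.30)
)

# Each column is one time series; columns 4 and 5 are Amazon and Google.
series <- tf_head[, 4:5]

# The dates live in their own vector and can be passed separately.
dates <- tf_head$date

names(series)
#   [1] "AMZN.Close"  "GOOGL.Close"
```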

In the first example, we predict the close price for Amazon and Google. We set seq_len = 30 (sequence length) and use a cross-validation scheme with n_windows = 10 validation windows for error measurement.


example1 <- naive(time_features[, 4:5], seq_len = 30, n_windows = 10,  dates = time_features$date)
  time: 89.5 sec elapsed
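
As a rough illustration of what a cross-validation scheme with expanding windows can look like, the sketch below lays out train and test indices for 10 windows with a 30-step horizon. The helper expanding_windows, its arguments and the non-overlapping layout are assumptions made for illustration; the package may split the data differently.

```r
# Hypothetical helper, not part of the naive package.
# n_obs: series length; horizon: steps to predict; n_windows: number of splits.
expanding_windows <- function(n_obs, horizon, n_windows) {
  # Last training index of each window; the final test block ends at n_obs.
  train_end <- n_obs - horizon * (n_windows:1)
  stopifnot(all(train_end >= 1))
  lapply(train_end, function(end) {
    list(
      train = seq_len(end),               # training set grows window by window
      test  = (end + 1):(end + horizon)   # next `horizon` points are held out
    )
  })
}

windows <- expanding_windows(n_obs = 1000, horizon = 30, n_windows = 10)

# Size of the training set in each window
sapply(windows, function(w) length(w$train))
```

Errors computed on each test block would then be averaged across the windows.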

The result, example1, is a list with several components, as you can see below.

names(example1)
  [1] "exploration" "history"     "best_model"  "time_log"
names(example1$best_model)
  [1] "quant_preds" "plots"       "errors"

exploration collects all the models tested during the exploration. history includes the selected parameters and the error metrics for the space explored during the random search (besides the prediction score: me, mae, mse, rmsse, mpe, mape, rmae, rrmse, rame, mase, smse, sce, gmrae³, averaged across features and validation windows). best_model collects a list of information for the best model, selected according to the average error metric: the prediction intervals (quant_preds), the visualizations (plots) and the testing error metrics for each time feature (errors).
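
For readers less familiar with the abbreviations, the sketch below computes the most basic of these metrics (me, mae, mse, mpe, mape) on a few made-up values, using their standard textbook definitions. The data is invented for illustration, and the sign convention (actual minus predicted) is an assumption; the package may use the opposite one.

```r
# Made-up values, for illustration only
actual    <- c(100, 102, 105, 103)
predicted <- c(98, 103, 104, 106)

# Assumption: error = actual - predicted
err <- actual - predicted

metrics <- c(
  me   = mean(err),                # mean error (bias)
  mae  = mean(abs(err)),           # mean absolute error
  mse  = mean(err^2),              # mean squared error
  mpe  = mean(err / actual),       # mean percentage error
  mape = mean(abs(err) / actual)   # mean absolute percentage error
)

round(metrics, 4)
```

The relative and scaled metrics in the list divide quantities like these by the error of a benchmark or by a scale factor.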

quant_preds is a list with the predicted results for each time feature (quantiles, min, max, mean, mode, sd, skewness, kurtosis, IQR to range, median range ratio, upside probability and divergence for each time point in the seq_len sequence). The IQR to range is the ratio of the interquartile range to the min-max range. The median range ratio is the ratio of the range above the median to the range below it. The upside probability is the probability of growth compared to the former point in the time sequence. The divergence is the maximum distance between the cumulative normal curve of each point and that of the former point in the sequence.

Example of predictions for Google close prices (GOOGL.Close)
	min	10%	25%	50%	75%	90%	max	mean	sd	mode	kurtosis	skewness	iqr_to_range	median_range_ratio	upside_prob	divergence
2022-04-22	2312	2369.0	2387.00	2396.0	2403.00	2422.0	2599	2398.242	36.1526	2394.608	19.9504	3.3027	0.0557	2.4167	NA	NA
2022-04-23	2259	2348.0	2374.00	2390.0	2401.00	2422.0	2599	2389.505	40.6684	2392.842	13.5310	2.0235	0.0794	1.5954	0.439	0.1009
2022-04-24	2259	2334.0	2365.00	2386.0	2398.00	2412.0	2599	2382.771	42.9731	2389.698	11.9165	1.7518	0.0971	1.6772	0.433	0.0676
2022-04-26	2181	2319.0	2352.00	2378.0	2394.00	2408.1	2599	2373.262	47.6292	2384.674	9.5104	0.9789	0.1005	1.1218	0.436	0.0922
2022-04-27	2179	2304.0	2342.00	2369.0	2389.00	2402.1	2597	2364.415	50.1594	2379.652	8.3632	0.7157	0.1124	1.2000	0.478	0.0749
2022-04-29	2063	2293.9	2329.00	2362.0	2384.00	2398.0	2597	2355.865	54.1692	2372.984	8.1068	0.3451	0.1030	0.7860	0.464	0.0716
2022-04-30	2063	2285.8	2322.75	2356.0	2379.25	2396.0	2597	2349.760	56.1086	2368.126	7.7924	0.3211	0.1058	0.8225	0.472	0.0462
2022-05-02	2063	2269.9	2311.00	2347.0	2374.00	2393.0	2597	2340.390	60.0182	2358.532	6.8190	0.1505	0.1180	0.8803	0.444	0.0693
2022-05-03	2063	2252.0	2298.00	2338.0	2367.00	2388.0	2597	2330.789	63.6116	2352.590	6.2816	0.0833	0.1292	0.9418	0.465	0.0659
2022-05-05	2063	2241.0	2290.00	2329.0	2361.00	2382.0	2593	2322.422	65.5858	2346.656	5.9728	0.0929	0.1340	0.9925	0.463	0.0530
2022-05-06	2061	2230.9	2280.00	2321.5	2354.25	2378.1	2593	2314.626	67.1350	2331.384	5.8596	0.0464	0.1396	1.0422	0.450	0.0477
2022-05-08	2000	2218.9	2270.75	2312.5	2347.00	2373.0	2589	2305.860	68.5272	2321.340	5.7721	0.0307	0.1295	0.8848	0.467	0.0522
2022-05-09	2000	2206.0	2261.00	2306.0	2342.25	2368.0	2587	2299.410	69.5498	2323.350	5.5168	0.0038	0.1384	0.9183	0.489	0.0377
2022-05-11	1992	2200.0	2253.00	2301.0	2338.00	2363.0	2587	2292.794	70.7851	2318.126	5.5252	-0.0004	0.1429	0.9256	0.457	0.0382
2022-05-12	1992	2192.8	2243.00	2289.5	2329.25	2360.0	2587	2284.182	72.9742	2304.134	5.0821	-0.0566	0.1450	1.0000	0.484	0.0492
2022-05-14	1966	2183.9	2237.00	2284.0	2324.00	2357.0	2587	2277.186	74.7673	2300.406	5.1059	-0.1172	0.1401	0.9528	0.486	0.0390
2022-05-15	1966	2173.0	2224.75	2273.0	2319.00	2349.1	2587	2267.879	77.2393	2281.216	4.8703	-0.1002	0.1518	1.0228	0.460	0.0505
2022-05-17	1919	2154.0	2213.75	2263.0	2309.00	2343.1	2587	2257.539	79.2551	2273.442	4.5546	-0.1452	0.1426	0.9419	0.468	0.0537
2022-05-18	1916	2146.9	2200.00	2253.5	2300.25	2339.0	2587	2248.577	81.0631	2266.092	4.4727	-0.1285	0.1494	0.9881	0.460	0.0455
2022-05-20	1916	2139.0	2195.00	2249.0	2294.00	2332.0	2587	2242.412	81.6024	2258.164	4.4830	-0.1856	0.1475	1.0150	0.482	0.0303
2022-05-21	1916	2121.9	2184.00	2238.5	2287.00	2327.1	2587	2232.815	84.2189	2254.274	4.2418	-0.1886	0.1535	1.0806	0.465	0.0478
2022-05-23	1916	2103.0	2167.00	2227.0	2279.00	2321.0	2587	2220.778	88.0392	2242.648	3.9936	-0.1916	0.1669	1.1576	0.491	0.0583
2022-05-24	1905	2094.0	2158.00	2219.5	2272.00	2316.0	2587	2212.660	89.1097	2237.640	3.8126	-0.1750	0.1672	1.1685	0.476	0.0369
2022-05-26	1902	2085.0	2149.50	2214.0	2266.00	2310.0	2587	2205.269	89.6083	2235.430	3.8217	-0.1632	0.1701	1.1955	0.479	0.0331
2022-05-27	1880	2073.0	2139.00	2206.5	2257.25	2301.0	2587	2196.312	92.0427	2215.920	3.7777	-0.1699	0.1673	1.1654	0.483	0.0407
2022-05-29	1880	2070.0	2130.75	2198.0	2252.00	2296.2	2587	2189.828	92.8848	2218.566	3.6961	-0.1307	0.1715	1.2233	0.460	0.0282
2022-05-30	1871	2057.8	2124.00	2190.0	2247.00	2291.1	2587	2182.532	94.0993	2202.796	3.6965	-0.1193	0.1718	1.2445	0.497	0.0315
2022-06-01	1854	2043.8	2111.00	2179.0	2237.25	2283.1	2587	2172.364	95.9514	2199.506	3.6927	-0.1266	0.1722	1.2554	0.455	0.0434
2022-06-02	1816	2036.9	2104.00	2171.0	2231.00	2276.0	2587	2164.854	97.5723	2194.736	3.7860	-0.1345	0.1647	1.1718	0.488	0.0317
2022-06-04	1805	2023.0	2093.00	2162.0	2219.00	2265.0	2587	2153.757	99.5639	2181.820	3.8943	-0.1543	0.1611	1.1905	0.450	0.0456
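
The two range-based ratios can be checked by hand from the quantiles. The short sketch below plugs in the values from the first row of the table (2022-04-22) and reproduces the iqr_to_range and median_range_ratio columns; the variable names are only illustrative.

```r
# Values copied from the 2022-04-22 row of the GOOGL.Close table
q <- c(min = 2312, q25 = 2387, median = 2396, q75 = 2403, max = 2599)

# Interquartile range divided by the min-max range
iqr_to_range <- (q[["q75"]] - q[["q25"]]) / (q[["max"]] - q[["min"]])

# Range above the median divided by the range below it
median_range_ratio <- (q[["max"]] - q[["median"]]) / (q[["median"]] - q[["min"]])

round(iqr_to_range, 4)
#   [1] 0.0557
round(median_range_ratio, 4)
#   [1] 2.4167
```

A median range ratio above 1, as here, means the distribution stretches further above the median than below it, in line with the positive skewness on the same row.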

For each time feature included in the model, you get a plot of the median with the chosen confidence interval (the default for ci is 0.8). As in other packages⁴, we provide several statistics to give a better idea of the dynamics related to aleatoric and epistemic uncertainty.

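  $AMZN.Close
  [plot: median prediction with confidence interval for AMZN.Close]

  $GOOGL.Close
  [plot: median prediction with confidence interval for GOOGL.Close]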

An exploration of the hyper-parameter space

The hyper-parameter space defined by seq_len, method, location and cover is large, so let's try a random search for the best parameter settings. The following example samples 100 different models for a sequence of 30 time steps.

example2 <- naive(time_features[, 4:5], seq_len = 30, n_samp = 100, n_windows = 10, dates = time_features$date)
  time: 316.6 sec elapsed
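
To give an idea of what sampling the space involves, here is a minimal sketch of a random draw of candidate settings in base R. The value ranges and the list of methods are simply read off the history table in this section, not taken from the package documentation, and the names n_draws, method_pool, location_pool and candidates are hypothetical.

```r
set.seed(1)    # for reproducibility of this sketch only
n_draws <- 5   # the example above uses n_samp = 100

# Assumption: candidate values as they appear in the history table
method_pool <- c("minimum", "maximum", "euclidean", "manhattan", "minkowski",
                 "canberra1", "bhattacharyya", "jensen_shannon",
                 "kullback_leibler")
location_pool <- c("mean", "median", "mode")

# One row per candidate model
candidates <- data.frame(
  seq_len  = 30,
  cover    = round(runif(n_draws, min = 0.1, max = 0.9), 4),
  stride   = sample(1:22, n_draws, replace = TRUE),
  method   = sample(method_pool, n_draws, replace = TRUE),
  location = sample(location_pool, n_draws, replace = TRUE)
)

candidates
```

Each sampled row would then be evaluated on the validation windows and the rows ranked by their prediction score.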

History table with ranking of 100 different models
	seq_len	cover	stride	method	location	pred_score	me	mae	mse	rmsse	mpe	mape	rmae	rrmse	rame	mase	smse	sce	gmrae
82	30	0.3627	13	minimum	median	0.1176	135.9563	146.4151	36363.95	45.3967	0.0754	0.0814	2.4688	2.4440	9.8024	10.6473	2305.583	300.1828	2.3354
27	30	0.8976	13	minimum	median	0.1167	136.2575	146.6415	36447.93	45.4417	0.0756	0.0814	2.4746	2.4484	9.8884	10.6551	2310.521	300.7124	2.3502
93	30	0.2978	13	bhattacharyya	median	0.1156	135.4350	145.9678	36128.14	45.2944	0.0753	0.0812	2.4579	2.4359	9.8552	10.6192	2292.120	299.2818	2.3198
67	30	0.3971	2	bhattacharyya	mode	0.1139	133.7792	144.9582	35695.15	44.8745	0.0740	0.0803	2.4310	2.4064	9.4554	10.5261	2262.969	295.0029	2.3033
34	30	0.7975	10	minkowski	median	0.1136	133.1825	143.9254	35473.83	44.5923	0.0737	0.0799	2.4135	2.3915	9.4922	10.4305	2232.926	292.8412	2.2649
92	30	0.6854	17	manhattan	mean	0.1135	134.2135	144.7527	35745.19	44.8967	0.0745	0.0804	2.4300	2.4122	9.8994	10.5091	2261.106	295.7342	2.2790
48	30	0.8856	17	jensen_shannon	mean	0.1134	135.1137	145.7697	36105.12	45.2138	0.0748	0.0809	2.4525	2.4337	10.0112	10.5831	2284.964	297.9181	2.3070
26	30	0.8455	1	minkowski	mean	0.1134	134.4618	145.4196	35771.52	45.0310	0.0744	0.0807	2.4494	2.4240	9.6353	10.5680	2266.894	296.7357	2.3173
58	30	0.8872	10	jensen_shannon	mode	0.1132	134.3650	145.0983	35743.99	44.9361	0.0743	0.0803	2.4501	2.4237	9.5994	10.5206	2256.370	295.5453	2.3238
2	30	0.8976	5	maximum	mode	0.1131	134.2458	145.1689	35974.80	44.8898	0.0738	0.0802	2.4321	2.4085	9.5758	10.5040	2264.191	294.7563	2.2982
37	30	0.4844	17	manhattan	mean	0.1130	127.8763	139.0565	32610.98	43.4224	0.0718	0.0782	2.3396	2.3236	9.5364	10.2348	2112.513	286.4325	2.1902
50	30	0.5965	2	jensen_shannon	median	0.1128	134.2330	145.2163	35659.76	45.0061	0.0744	0.0806	2.4542	2.4298	9.6182	10.5474	2259.035	295.9662	2.2994
15	30	0.5805	19	maximum	median	0.1128	131.9635	143.3606	34957.72	44.3704	0.0727	0.0792	2.4149	2.3922	9.7234	10.3915	2213.092	290.3270	2.3244
16	30	0.2313	16	jensen_shannon	median	0.1127	134.8424	145.7357	36156.25	45.1162	0.0745	0.0807	2.4274	2.4063	9.5478	10.5931	2293.582	297.7759	2.2841
73	30	0.6582	17	maximum	mean	0.1126	134.5706	145.1971	35759.61	45.1072	0.0747	0.0806	2.4478	2.4308	9.9659	10.5572	2270.445	297.1437	2.2975
79	30	0.1817	5	maximum	mean	0.1125	127.3835	139.0384	32893.46	43.2785	0.0710	0.0777	2.3338	2.3122	9.2121	10.1917	2110.181	284.1158	2.2011
72	30	0.7959	17	jensen_shannon	mode	0.1123	134.4455	145.2099	35767.29	45.0619	0.0745	0.0806	2.4426	2.4247	9.7527	10.5524	2266.938	296.7464	2.2808
98	30	0.1649	1	minimum	mode	0.1123	131.3441	142.3096	34146.60	44.2251	0.0731	0.0795	2.3853	2.3602	9.4146	10.4156	2190.982	292.0144	2.2406
87	30	0.4924	4	minimum	median	0.1123	134.3477	145.3388	35891.70	45.0473	0.0744	0.0806	2.4413	2.4190	9.4336	10.5662	2274.899	296.5515	2.3052
62	30	0.8375	2	maximum	mean	0.1121	135.3867	146.2512	36190.23	45.2758	0.0751	0.0812	2.4669	2.4414	9.6188	10.6128	2288.168	298.1770	2.3109
96	30	0.6037	1	canberra1	mode	0.1121	135.4234	146.4813	36439.15	45.2818	0.0747	0.0809	2.4666	2.4400	9.5890	10.6090	2296.974	297.6325	2.3354
46	30	0.3330	17	minimum	mode	0.1120	134.6943	145.1731	35734.15	45.0663	0.0746	0.0805	2.4477	2.4297	10.0462	10.5426	2263.614	296.9290	2.3121
99	30	0.8359	2	jensen_shannon	median	0.1119	134.6419	145.4952	35936.26	45.0350	0.0745	0.0807	2.4432	2.4188	9.4262	10.5614	2274.328	296.5750	2.2962
71	30	0.7158	10	jensen_shannon	mean	0.1119	134.9617	145.6226	35973.62	45.0739	0.0745	0.0805	2.4603	2.4339	9.7124	10.5486	2268.644	296.5633	2.3495
78	30	0.8560	1	kullback_leibler	mean	0.1118	135.0255	146.0189	36157.13	45.1892	0.0747	0.0811	2.4609	2.4352	9.5308	10.5958	2286.952	297.3935	2.3316
18	30	0.5252	17	jensen_shannon	mean	0.1117	135.0746	145.6853	35875.60	45.1971	0.0749	0.0810	2.4552	2.4374	9.8950	10.5820	2274.482	297.8978	2.3015
100	30	0.8047	8	minkowski	median	0.1116	133.9162	144.9855	35849.09	44.8657	0.0741	0.0805	2.4188	2.3956	9.3968	10.5364	2268.735	295.5204	2.2878
8	30	0.6069	4	bhattacharyya	mode	0.1116	135.0376	145.9415	36026.74	45.2042	0.0748	0.0810	2.4539	2.4296	9.4592	10.6125	2285.709	298.0416	2.3258
61	30	0.2954	1	maximum	median	0.1114	130.8272	142.3956	34611.57	44.1336	0.0722	0.0789	2.3788	2.3584	9.4052	10.3694	2196.589	289.4805	2.2356
70	30	0.4235	5	canberra1	median	0.1114	134.4999	145.4162	35875.43	44.9972	0.0743	0.0804	2.4510	2.4255	9.5046	10.5390	2267.466	295.8257	2.3243
53	30	0.5124	6	jensen_shannon	median	0.1114	135.3996	146.4460	36193.30	45.3414	0.0751	0.0813	2.4707	2.4437	9.6626	10.6504	2296.205	299.1325	2.3296
85	30	0.8840	17	euclidean	mode	0.1112	126.8314	138.3389	32319.81	43.2181	0.0711	0.0778	2.3260	2.3088	9.2310	10.1879	2095.643	284.5070	2.1843
33	30	0.7879	10	jensen_shannon	mean	0.1112	134.9808	145.6351	36010.73	45.0610	0.0746	0.0807	2.4596	2.4323	9.7015	10.5472	2269.219	296.4922	2.3218
94	30	0.3595	17	minkowski	mean	0.1112	124.3290	135.9015	31242.13	42.5106	0.0698	0.0763	2.2786	2.2649	9.1348	10.0337	2032.975	279.7829	2.1430
10	30	0.2017	10	jensen_shannon	median	0.1110	135.3778	145.9996	36272.79	45.1488	0.0746	0.0808	2.4577	2.4312	9.5820	10.5617	2282.016	296.8976	2.3402
4	30	0.2217	8	canberra1	median	0.1109	135.2561	146.3118	36271.98	45.2336	0.0748	0.0810	2.4586	2.4316	9.4779	10.6135	2294.230	297.8154	2.3224
41	30	0.3386	6	minimum	mode	0.1109	134.9547	146.1150	36081.86	45.2015	0.0747	0.0811	2.4624	2.4353	9.5804	10.6152	2286.082	297.7167	2.3368
45	30	0.6181	16	minkowski	median	0.1107	131.3382	142.8209	34734.05	44.3518	0.0733	0.0797	2.3826	2.3614	9.4232	10.4557	2219.485	292.6693	2.2377
12	30	0.1184	13	jensen_shannon	mode	0.1106	132.1042	143.4229	34787.81	44.5472	0.0735	0.0801	2.4283	2.4034	9.6261	10.4834	2221.971	293.7161	2.2856
23	30	0.3370	16	jensen_shannon	median	0.1106	136.5036	147.3980	36685.07	45.6019	0.0756	0.0818	2.4700	2.4437	9.7102	10.7218	2327.403	301.7155	2.3184
1	30	0.5484	18	canberra1	median	0.1105	134.6698	146.4322	36397.29	45.2828	0.0744	0.0812	2.4560	2.4301	9.3095	10.6284	2307.402	297.3109	2.3235
97	30	0.6133	17	euclidean	mean	0.1104	130.0838	141.2379	33861.09	43.9032	0.0723	0.0788	2.3740	2.3576	9.5876	10.3040	2158.410	288.4696	2.2128
54	30	0.2802	6	canberra1	mode	0.1104	134.8900	145.9643	35978.72	45.1777	0.0748	0.0810	2.4578	2.4306	9.6082	10.6183	2283.052	297.9936	2.3231
80	30	0.2818	3	kullback_leibler	mean	0.1104	134.6300	145.7845	36059.20	45.0810	0.0744	0.0808	2.4552	2.4280	9.4526	10.5751	2277.861	296.3767	2.3396
51	30	0.3779	6	bhattacharyya	mean	0.1102	135.9112	147.1401	36563.43	45.4628	0.0752	0.0816	2.4825	2.4523	9.5603	10.6811	2312.669	299.5022	2.3657
49	30	0.2161	18	minkowski	median	0.1101	118.7966	131.7988	29273.44	41.3803	0.0673	0.0746	2.2035	2.1879	7.9332	9.8252	1942.136	270.9560	2.0669
63	30	0.6005	2	minkowski	median	0.1097	129.4861	141.1656	34119.47	43.7884	0.0718	0.0784	2.3632	2.3421	9.2284	10.2890	2162.279	287.0399	2.2228
66	30	0.2033	2	manhattan	mean	0.1096	124.1231	136.0273	31209.00	42.4174	0.0695	0.0762	2.2844	2.2638	8.6550	10.0238	2022.228	278.5039	2.1561
20	30	0.8055	9	kullback_leibler	mode	0.1096	134.8628	146.4201	36243.44	45.2559	0.0746	0.0812	2.4657	2.4368	9.4066	10.6322	2294.359	297.2637	2.3492
11	30	0.3418	18	jensen_shannon	mode	0.1095	132.5239	144.4656	35551.92	44.7246	0.0734	0.0802	2.4150	2.3940	9.2446	10.4964	2257.021	293.0615	2.2862
17	30	0.5973	22	minimum	mode	0.1095	141.5012	150.8588	38203.51	46.5567	0.0782	0.0836	2.5544	2.5186	10.1851	10.9370	2412.425	310.9208	2.4665
52	30	0.2570	22	canberra1	mode	0.1093	141.7500	151.2595	38686.59	46.6724	0.0781	0.0837	2.5495	2.5170	10.0695	10.9398	2438.078	310.7661	2.4473
75	30	0.1312	3	canberra1	median	0.1092	135.0946	146.1694	36026.36	45.2003	0.0747	0.0810	2.4674	2.4372	9.5655	10.6206	2283.976	297.9552	2.3528
29	30	0.5965	5	minkowski	mode	0.1091	124.1491	136.1535	31512.33	42.4124	0.0692	0.0761	2.2808	2.2598	8.7510	10.0004	2030.409	277.4643	2.1487
91	30	0.3851	9	maximum	mean	0.1091	134.4238	145.9933	36045.46	45.0978	0.0741	0.0809	2.4516	2.4198	9.3769	10.6034	2283.409	296.4262	2.3485
25	30	0.3258	6	minkowski	mean	0.1090	125.7110	137.7352	32040.75	42.8598	0.0703	0.0772	2.3146	2.2910	9.0305	10.1367	2069.154	281.7223	2.1838
74	30	0.6197	5	manhattan	median	0.1090	128.2259	140.0396	33525.03	43.4762	0.0710	0.0779	2.3466	2.3274	9.2313	10.2126	2129.947	284.3884	2.2073
36	30	0.3507	19	euclidean	mode	0.1090	114.1260	127.4087	27481.36	40.1182	0.0650	0.0723	2.1340	2.1176	8.5330	9.5442	1839.741	261.7064	2.0385
38	30	0.3378	14	bhattacharyya	median	0.1089	135.3790	147.0794	36621.21	45.4672	0.0751	0.0817	2.4708	2.4418	9.5036	10.6954	2321.066	299.0361	2.3694
30	30	0.5132	9	maximum	median	0.1089	133.6645	145.3503	35651.71	44.9634	0.0740	0.0807	2.4446	2.4148	9.4444	10.5792	2267.256	295.5030	2.3316
57	30	0.3835	2	minkowski	median	0.1088	125.3553	137.2441	31529.18	42.7644	0.0700	0.0769	2.3144	2.2934	9.1666	10.1084	2042.740	281.1055	2.1978
35	30	0.3066	1	euclidean	mean	0.1087	126.0470	137.9348	32296.02	42.9020	0.0703	0.0771	2.3102	2.2884	8.9193	10.1278	2076.881	281.5236	2.1883
44	30	0.3507	5	maximum	median	0.1087	129.8376	141.3021	34087.03	43.8246	0.0718	0.0784	2.3636	2.3404	9.3612	10.3064	2168.325	287.7921	2.2367
59	30	0.6814	12	minkowski	mean	0.1087	133.5842	145.1508	35984.04	44.9137	0.0738	0.0804	2.4231	2.3998	9.3578	10.5448	2273.419	294.7323	2.3035
60	30	0.8560	9	euclidean	mean	0.1086	134.9346	146.5317	36322.85	45.2566	0.0746	0.0813	2.4629	2.4326	9.3365	10.6341	2298.550	297.2564	2.3649
5	30	0.1585	8	maximum	mean	0.1086	127.5504	139.1678	32573.89	43.2514	0.0713	0.0779	2.3346	2.3091	9.0656	10.2270	2100.776	285.2037	2.1955
13	30	0.7711	21	canberra1	median	0.1085	134.2207	146.0625	36245.03	45.1099	0.0741	0.0810	2.4505	2.4235	9.5929	10.5870	2285.282	294.8591	2.3420
28	30	0.8928	14	bhattacharyya	mode	0.1084	134.8567	146.4420	36197.46	45.3230	0.0749	0.0814	2.4646	2.4367	9.4640	10.6615	2301.431	298.2230	2.3750
21	30	0.8031	21	euclidean	median	0.1084	132.9977	144.9425	35830.19	44.7726	0.0735	0.0806	2.4216	2.3973	9.3791	10.5130	2259.742	292.5273	2.3074
64	30	0.3090	11	jensen_shannon	mean	0.1084	139.7928	149.5234	37572.80	46.1613	0.0774	0.0830	2.5288	2.4946	10.0602	10.8447	2374.697	307.4434	2.4420
24	30	0.5805	8	minkowski	median	0.1083	128.8408	140.5684	33521.07	43.6479	0.0718	0.0784	2.3569	2.3349	9.1366	10.2882	2142.680	286.8858	2.2093
22	30	0.8191	11	minkowski	mean	0.1080	139.8112	149.6610	37725.00	46.2274	0.0774	0.0830	2.5300	2.4976	10.0247	10.8592	2385.225	307.6527	2.4303
3	30	0.3563	14	euclidean	mode	0.1079	117.1689	130.6726	29357.36	41.0752	0.0665	0.0741	2.1736	2.1490	7.4737	9.7941	1945.480	268.7716	2.0892
7	30	0.2161	4	minkowski	mean	0.1079	124.1419	136.1154	31040.43	42.4856	0.0697	0.0766	2.2908	2.2699	8.8438	10.0637	2023.761	279.6485	2.1631
31	30	0.2690	16	maximum	median	0.1077	127.6972	139.5232	32938.89	43.4049	0.0713	0.0782	2.3270	2.3050	9.2084	10.2710	2127.363	286.2577	2.1978
6	30	0.2818	21	kullback_leibler	median	0.1074	133.4495	145.2760	35887.29	44.8993	0.0736	0.0806	2.4290	2.4058	9.3908	10.5320	2264.910	293.2331	2.3198
32	30	0.8439	11	manhattan	median	0.1074	138.8773	148.7082	37401.46	45.9050	0.0768	0.0824	2.4978	2.4662	9.8152	10.7841	2359.268	305.3974	2.3991
77	30	0.1256	14	minimum	mode	0.1071	133.5458	145.2642	35732.33	44.8954	0.0739	0.0806	2.4456	2.4166	9.3441	10.5402	2264.905	294.2998	2.3540
83	30	0.6349	11	maximum	median	0.1071	139.3713	149.1376	37514.32	45.9733	0.0769	0.0824	2.5117	2.4775	10.0982	10.8059	2363.014	306.1941	2.4315
89	30	0.6862	11	minkowski	mean	0.1065	138.8414	148.5883	37331.62	45.8286	0.0769	0.0825	2.5002	2.4665	9.8653	10.7688	2352.560	305.0984	2.4133
14	30	0.3843	22	manhattan	mean	0.1065	132.6796	142.5832	34395.17	44.1229	0.0741	0.0796	2.3941	2.3620	9.4294	10.4138	2192.272	294.2344	2.3218
47	30	0.7687	22	manhattan	mode	0.1060	131.3013	141.5186	33502.82	44.0758	0.0739	0.0798	2.3848	2.3557	9.4406	10.4469	2179.410	294.7557	2.2826
68	30	0.1016	10	bhattacharyya	mode	0.1056	129.5518	141.1246	34057.91	43.7621	0.0719	0.0786	2.3698	2.3459	9.5304	10.2997	2169.723	287.2631	2.2689
76	30	0.5132	14	manhattan	median	0.1056	125.1600	137.7180	31864.72	42.8780	0.0702	0.0775	2.3198	2.2936	8.7753	10.1544	2065.204	281.1245	2.2101
65	30	0.4115	14	kullback_leibler	mode	0.1055	134.2611	146.0308	36045.79	45.2002	0.0748	0.0815	2.4542	2.4246	9.4565	10.6498	2296.058	297.4744	2.3585
69	30	0.3987	22	maximum	mean	0.1054	136.6511	146.3606	36205.41	45.1600	0.0758	0.0812	2.4602	2.4260	9.7829	10.6286	2288.416	300.9779	2.3917
55	30	0.5028	20	maximum	mean	0.1052	136.2165	147.0977	36597.00	45.3308	0.0751	0.0814	2.4745	2.4381	9.6078	10.6745	2305.985	299.6266	2.3822
95	30	0.8904	20	manhattan	median	0.1051	136.1019	147.0967	36612.20	45.3750	0.0753	0.0816	2.4732	2.4365	9.4894	10.6947	2318.346	299.9388	2.3726
9	30	0.1384	7	bhattacharyya	mode	0.1048	129.6559	141.8586	34499.66	43.8925	0.0718	0.0789	2.3712	2.3480	9.1860	10.3146	2184.573	286.5098	2.2463
19	30	0.4275	12	minkowski	median	0.1047	123.9351	136.3320	31102.90	42.4647	0.0693	0.0766	2.2943	2.2702	8.8913	10.0592	2021.238	278.6496	2.1915
40	30	0.2257	15	minimum	mode	0.1046	131.6340	144.1205	35138.36	44.6320	0.0728	0.0801	2.4280	2.3998	9.1925	10.5027	2232.206	291.4605	2.3286
42	30	0.8287	15	bhattacharyya	median	0.1046	132.4812	144.6578	35404.05	44.8009	0.0734	0.0804	2.4388	2.4113	9.2931	10.5331	2248.846	292.8452	2.3400
56	30	0.8568	15	minimum	median	0.1041	132.1232	144.4772	35480.58	44.6800	0.0730	0.0801	2.4281	2.4002	9.1794	10.4963	2242.028	291.3914	2.3356
90	30	0.1120	17	bhattacharyya	mode	0.1038	128.3248	140.4523	33804.34	43.6106	0.0711	0.0781	2.3521	2.3329	9.1537	10.2638	2151.785	285.0004	2.2044
86	30	0.7214	20	euclidean	mode	0.1032	123.7912	136.0037	30770.05	42.4490	0.0699	0.0770	2.3005	2.2672	8.3907	10.1166	2025.445	280.4462	2.1975
43	30	0.4243	11	euclidean	mean	0.1032	131.9922	142.4048	34182.61	44.1175	0.0736	0.0794	2.4000	2.3697	9.5699	10.4214	2185.586	293.5018	2.3003
84	30	0.1601	20	minimum	mode	0.1030	136.5197	147.3576	36393.12	45.5026	0.0754	0.0819	2.4994	2.4600	9.5820	10.7375	2303.994	301.5211	2.3955
81	30	0.1865	22	minkowski	mode	0.1022	121.9391	132.5114	29633.49	41.2482	0.0680	0.0742	2.2054	2.1779	8.2214	9.7630	1928.981	273.2001	2.1153
88	30	0.5372	20	minkowski	median	0.1016	127.8345	139.8420	32798.77	43.3876	0.0713	0.0784	2.3554	2.3243	9.0295	10.2826	2112.349	285.7437	2.2556
39	30	0.1184	14	manhattan	median	0.1011	115.3293	128.8623	27810.27	40.4749	0.0656	0.0733	2.1618	2.1397	8.0196	9.6478	1855.701	264.1898	2.0811

If we compare the error statistics of the best model in example2 with those of the model in example1, the prediction score improves slightly for both Amazon and Google, while the point-error metrics (me, mae, mse) improve only for Google and are marginally worse for Amazon. All the relative and scaled error metrics default to the naive benchmark, but you can choose more challenging thresholds (such as the deviation of the whole time feature or the average of the whole predicted sequence).

The error statistics from example1 (averaged across 10 expanding validation windows):

example1$best_model$errors
              pred_score       me      mae      mse   rmsse    mpe   mape   rmae
  AMZN.Close      0.1338 169.8016 185.0265 55874.76 53.3262 0.0794 0.0877 2.2428
  GOOGL.Close     0.0992 101.4664 107.1904 16207.96 37.4059 0.0716 0.0751 2.6984
               rrmse    rame    mase     smse      sce  gmrae
  AMZN.Close  2.2683  4.1215 11.3708 3154.579 316.5591 2.0452
  GOOGL.Close 2.6257 15.7802  9.9103 1432.759 283.6309 2.6248

The error statistics from example2 (as above, averaged across 10 expanding validation windows):

example2$best_model$errors
              pred_score       me      mae      mse   rmsse    mpe   mape   rmae
  AMZN.Close      0.1357 170.7348 185.8427 56560.19 53.4571 0.0796 0.0879 2.2483
  GOOGL.Close     0.0995 101.1778 106.9875 16167.70 37.3364 0.0712 0.0748 2.6893
               rrmse    rame    mase     smse      sce  gmrae
  AMZN.Close  2.2717  4.1114 11.3971 3181.336 317.4391 2.0492
  GOOGL.Close 2.6163 15.4933  9.8975 1429.830 282.9264 2.6217
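
A quick way to compare two runs is to subtract one error table from the other. The sketch below rebuilds two of the columns by hand from the printed output above, so that it runs without fitting anything; the object names ex1, ex2 and delta are hypothetical.

```r
# pred_score and mae copied from the two printed error tables
ex1 <- data.frame(
  pred_score = c(0.1338, 0.0992),
  mae        = c(185.0265, 107.1904),
  row.names  = c("AMZN.Close", "GOOGL.Close")
)
ex2 <- data.frame(
  pred_score = c(0.1357, 0.0995),
  mae        = c(185.8427, 106.9875),
  row.names  = c("AMZN.Close", "GOOGL.Close")
)

# Positive values mean the metric is higher in example2
delta <- ex2 - ex1
round(delta, 4)
```

With the fitted objects at hand, the same subtraction should work directly on example2$best_model$errors and example1$best_model$errors, assuming both tables share the same rows and columns.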

The prediction score improves for both time features, but we are still using a naive approach to measure scaled and relative errors. Let's switch to deviation as the scale and average as the benchmark, which are more challenging evaluation criteria.

example3 <- naive(time_features[, 4:5], seq_len = 30, n_windows = 10, dates = time_features$date, error_scale = "deviation", error_benchmark = "average")
  time: 88.72 sec elapsed

As you can see, the relative and scaled measures change noticeably as we raise the bar of our expectations:

example3$best_model$errors
              pred_score       me      mae      mse   rmsse    mpe   mape   rmae
  AMZN.Close      0.1338 169.8016 185.0265 55874.76 11.9350 0.0794 0.0877 1.0797
  GOOGL.Close     0.0992 101.4664 107.1904 16207.96 10.3578 0.0716 0.0751 1.0544
               rrmse rame   mase     smse     sce  gmrae
  AMZN.Close  1.1810    1 0.6396 165.4390 18.1701 0.8392
  GOOGL.Close 1.1393    1 0.7990 112.7934 23.0830 0.8762
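
The unscaled columns (me, mae, mse, mpe, mape) are identical to those of example1; only the scaled and relative ones move, for instance mase for AMZN.Close goes from 11.3708 to 0.6396. The sketch below shows, on made-up numbers, how the same mae yields different scaled values depending on the denominator. Treating the "deviation" scale as the standard deviation of the training series is an assumption made for illustration; the package may define it differently.

```r
# Made-up series, for illustration only
history   <- c(100, 102, 101, 104, 106, 105, 108, 110)  # training values
actual    <- c(111, 113, 112)                           # held-out values
predicted <- c(110, 111, 113)                           # forecasts

mae <- mean(abs(actual - predicted))

# Naive scale: mean absolute one-step change in the training series
scale_naive <- mean(abs(diff(history)))

# Assumption: "deviation" scale taken as the standard deviation of the series
scale_dev <- sd(history)

scaled <- c(naive = mae / scale_naive, deviation = mae / scale_dev)
round(scaled, 4)
```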


Giancarlo Vercellino

21-May-2022
