“In the natural world, there are many pattern-assembly systems for which there is no simple explanation. There are useful scientific explanations for these complex systems, but the final patterns that they produce are so heterogeneous that they cannot effectively be reduced to smaller or less intricate predecessor components. As I will argue … these patterns are, in a fundamental sense, irreducibly complex.” (Michael J. Katz, Templets and the explanation of complex patterns)
dymo uses Dynamic Mode Decomposition (DMD) [1] to predict multiple time features correlated by non-linear dynamics. dymo is a fairly simple implementation of SVD-based DMD, with a few add-ons, described below, to make it more interesting.
An introduction to DMD is out of scope for this vignette. Simplifying a bit, DMD linearizes the non-linear dynamics of multiple time features by projecting them into a Hilbert space in order to approximate their values. In other words, DMD learns the non-linear dynamics of a system over time and approximates them as a linear multiplicative system (you can think of it as a kind of regression, but without ordinary least squares). Key points: you need at least two different time features (a single one is not a system by definition), and you need to choose the rank of the SVD approximation (dymo tests all the different ranks and selects the best one). You can also explore the space of all possible combinations of time features by selecting the minimum and the maximum number of features (from 2 to the total number of features).
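As a rough illustration of the idea, the following base R sketch fits an SVD-based DMD to a toy two-feature linear system and projects it forward. It is not dymo's internal code: the function name dmd_forecast, its arguments and the toy system are assumptions made for this example, and the forecast is obtained by iterating the reduced operator instead of computing an explicit eigendecomposition.

```r
# Minimal SVD-based DMD sketch (illustrative only, not dymo's implementation).
# X holds one time feature per row and one time step per column.
dmd_forecast <- function(X, rank, horizon) {
  n <- ncol(X)
  X1 <- X[, -n, drop = FALSE]            # snapshots 1 .. n-1
  X2 <- X[, -1, drop = FALSE]            # snapshots 2 .. n
  s <- svd(X1)
  r <- min(rank, sum(s$d > 1e-10))       # truncate to the requested rank
  U <- s$u[, 1:r, drop = FALSE]
  V <- s$v[, 1:r, drop = FALSE]
  d_inv <- diag(1 / s$d[1:r], nrow = r)
  # Reduced linear operator: A_tilde = U' X2 V D^-1
  A_tilde <- t(U) %*% X2 %*% V %*% d_inv
  # Step the last observed snapshot forward in the reduced space
  z <- t(U) %*% X[, n, drop = FALSE]
  preds <- matrix(NA_real_, nrow(X), horizon)
  for (h in seq_len(horizon)) {
    z <- A_tilde %*% z
    preds[, h] <- U %*% z
  }
  preds
}

# Toy system: x[k + 1] = A x[k], a slowly decaying rotation
A <- matrix(c(0.9, -0.1, 0.1, 0.9), 2, 2)
X <- matrix(NA_real_, 2, 30)
X[, 1] <- c(1, 0.5)
for (k in 2:30) X[, k] <- A %*% X[, k - 1]

# Fit on the first 25 steps, forecast the next 5, compare with the truth
fc <- dmd_forecast(X[, 1:25], rank = 2, horizon = 5)
max_err <- max(abs(fc - X[, 26:30]))
```

Because the toy system is exactly linear and the full rank is kept, the forecast matches the true continuation up to floating-point error. On real, non-linear data the rank becomes a tuning choice, which is why dymo tests all the ranks.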
Some basic transformations are managed in the background. Differentiation and integration are handled automatically by dymo, using the maximal p-value in a recursive F-test to de-trend each time feature. This makes it easy to identify the dynamic characteristics of each time feature (random walk, trend, exponential) and is somewhat simpler and more practical than formal approaches such as the Augmented Dickey-Fuller or Ljung-Box tests. If your time features have a limited number of missing values, dymo automatically imputes them using the Kalman filter method [2]. If you prefer to project the smoothed version into the future, you can set smoother = TRUE to use the loess function [3].
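The differentiation and integration round trip can be sketched in base R as follows. This is only an illustration on synthetic data: the recursive F-test that dymo uses to decide how many times to difference is not reproduced here, and the "forecast" on the differenced scale is a placeholder.

```r
set.seed(1)
x <- cumsum(rnorm(50, mean = 0.2))   # synthetic random walk with drift

# Differencing removes the trend; a model would be fitted on dx instead of x
dx <- diff(x)

# Placeholder forecast on the differenced scale (a real model would go here)
dx_future <- rep(mean(dx), 5)

# Integration: cumulate the predicted changes from the last observed level
x_future <- tail(x, 1) + cumsum(dx_future)

# Round trip: integrating the observed differences recovers the original series
x_back <- diffinv(dx, xi = x[1])
```

The key point is that predictions made on the differenced scale must be integrated back, starting from the last observed value, before they can be compared with the original series.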
The test errors are cross-validated through expanding validation windows, set with n_windows: the default value is 10, meaning that the time features are divided into 10 + 1 segments, guaranteeing at least ten validation sets to measure the error on unseen data. For each point in the prediction sequence, a thousand samples are collected to calculate quantiles, mean, mode, standard deviation, skewness, kurtosis, and other less common measures that are provided for each time step (see below).
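One way to picture the expanding-window scheme is the sketch below. The helper expanding_splits is hypothetical and the exact way dymo cuts the segments may differ; the sketch only assumes that n observations are cut into n_windows + 1 consecutive segments, and that each validation step trains on all the segments seen so far and tests on the next one.

```r
# Hypothetical helper: index sets for an expanding-window validation
expanding_splits <- function(n, n_windows) {
  # Boundaries of n_windows + 1 roughly equal consecutive segments
  breaks <- round(seq(0, n, length.out = n_windows + 2))
  lapply(seq_len(n_windows), function(w) {
    list(
      train = 1:breaks[w + 1],                     # everything seen so far
      test  = (breaks[w + 1] + 1):breaks[w + 2]    # the next segment
    )
  })
}

splits <- expanding_splits(n = 1100, n_windows = 10)
train_sizes <- vapply(splits, function(s) length(s$train), numeric(1))
```

With 1100 observations and 10 windows, the training set grows by one segment of 100 points at each step, and every test segment lies strictly after its training data.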
The process flow of dymo
The dataset time_features included with dymo is a recent take on some Big Tech stock prices (source: Yahoo Finance). The data is expected as a data frame, where each column represents a different time series (the date information is not mandatory and can be provided separately).
date | IBM.Close | AAPL.Close | AMZN.Close | GOOGL.Close | MSFT.Close |
---|---|---|---|---|---|
2017-01-03 | 159.8375 | 29.0375 | 753.67 | 808.01 | 62.58 |
2017-01-04 | 161.8164 | 29.0050 | 757.18 | 807.77 | 62.30 |
2017-01-05 | 161.2811 | 29.1525 | 780.45 | 813.02 | 62.30 |
2017-01-06 | 162.0746 | 29.4775 | 795.99 | 825.21 | 62.84 |
2017-01-09 | 160.2773 | 29.7475 | 796.92 | 827.18 | 62.64 |
2017-01-10 | 158.2409 | 29.7775 | 795.90 | 826.01 | 62.62 |
2017-01-11 | 160.3728 | 29.9375 | 799.02 | 829.86 | 63.19 |
2017-01-12 | 160.5641 | 29.8125 | 813.64 | 829.53 | 62.61 |
2017-01-13 | 159.9809 | 29.7600 | 817.14 | 830.94 | 62.70 |
2017-01-17 | 160.5067 | 30.0000 | 809.72 | 827.46 | 62.53 |
In the first example, we predict the close prices of the five stocks, from IBM to Microsoft (columns 2 to 6 of time_features). We set seq_len = 100 (sequence length) and use a cross-validation scheme with n_windows = 10 for error measurement.
example1 <- dymo(time_features[, 2:6], seq_len = 100, n_windows = 10, dates = time_features$date)
time: 13.96 sec elapsed
The result is a list of different components, as you can see below.
names(example1)
[1] "comb_metrics" "best_model" "time_log"
names(example1$best_model)
[1] "best_combination" "quant_preds" "testing_errors" "plots"
comb_metrics includes the error metrics for all possible combinations of time features (besides the prediction score: me, mae, mse, rmsse, mpe, mape, rmae, rrmse, rame, mase, smse, sce, gmrae [4], averaged across features, ranks and validation windows). best_model collects the best combination of features and rank (best_combination), the error metrics for each time feature (testing_errors), the prediction intervals (quant_preds) and the visualizations (plots).
quant_preds is a list containing the predicted results for each time feature (quantiles, min, max, mean, mode, sd, skewness, kurtosis, IQR to range, risk ratio, upside probability and divergence for each time point in the seq_len sequence). The IQR to range is the ratio of the interquartile range to the min-max range; the risk ratio is the ratio of the range above the median to the range below it (the median_range_ratio column); the upside probability is the probability of growth compared to the previous point in the time sequence; the divergence is the maximum distance between the cumulative normal curve of each point and that of the previous point in the sequence.
date | min | 10% | 25% | 50% | 75% | 90% | max | mean | sd | mode | kurtosis | skewness | iqr_to_range | median_range_ratio | upside_prob | divergence |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2022-04-22 | 136.2462 | 136.2462 | 137.1773 | 137.7488 | 138.7031 | 139.0710 | 139.4631 | 137.8184 | 1.030336 | 137.2716 | 1.771882 | 0.0708948 | 0.4742911 | 1.1408802 | NA | NA |
2022-04-23 | 135.2319 | 135.2319 | 137.2029 | 138.4536 | 138.9849 | 140.7172 | 140.8256 | 138.2064 | 1.689900 | 138.4850 | 2.309916 | -0.1448395 | 0.3185650 | 0.7362330 | 0.572 | 0.196 |
2022-04-24 | 134.4115 | 135.7858 | 135.9352 | 138.7753 | 139.9002 | 141.9825 | 144.3283 | 138.4494 | 2.887694 | 136.8674 | 2.353346 | 0.5289843 | 0.3998268 | 1.2725227 | 0.520 | 0.202 |
2022-04-26 | 133.5547 | 133.5547 | 135.5948 | 139.2622 | 139.6835 | 143.3630 | 143.3630 | 137.9175 | 3.078726 | 137.9754 | 2.008074 | 0.1416566 | 0.4168617 | 0.7185051 | 0.436 | 0.212 |
2022-04-27 | 133.8319 | 134.3550 | 135.4705 | 138.3799 | 140.0557 | 143.7167 | 143.7167 | 138.1400 | 3.063455 | 138.2290 | 2.042838 | 0.2979875 | 0.4638630 | 1.1734455 | 0.503 | 0.190 |
2022-04-29 | 132.3023 | 134.1447 | 135.3940 | 138.7719 | 141.7543 | 143.7072 | 144.1092 | 138.6588 | 3.789718 | 139.3594 | 1.795545 | -0.0840420 | 0.5386922 | 0.8249865 | 0.528 | 0.102 |
2022-04-30 | 130.8847 | 134.7497 | 136.4454 | 138.5424 | 142.6843 | 147.1780 | 147.1780 | 139.1904 | 4.694869 | 136.6870 | 2.173326 | 0.1846910 | 0.3829110 | 1.1276902 | 0.537 | 0.104 |
2022-05-02 | 128.8961 | 128.8961 | 136.2183 | 138.9943 | 142.3936 | 147.6842 | 147.9333 | 139.2732 | 5.647323 | 138.4052 | 2.380761 | -0.0571137 | 0.3243786 | 0.8852128 | 0.511 | 0.111 |
2022-05-03 | 125.6743 | 134.5776 | 135.8526 | 138.3643 | 145.6935 | 148.8702 | 148.8702 | 139.7009 | 6.564031 | 136.6248 | 2.573428 | -0.3642295 | 0.4242506 | 0.8278921 | 0.512 | 0.109 |
2022-05-05 | 129.3263 | 129.3263 | 136.7991 | 137.7002 | 145.4832 | 149.2144 | 150.5619 | 140.0869 | 6.440886 | 136.8208 | 1.992426 | 0.1653656 | 0.4089409 | 1.5359386 | 0.512 | 0.058 |
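The less common statistics can be sketched from a matrix of samples as shown below. The helper summarise_samples is hypothetical, and the upside probability is computed under an assumption (the share of samples that exceed the corresponding sample at the previous time step); dymo's own calculation may differ, and the divergence is left out. The first column of the toy input reuses the min, quartiles and max from the first row of the table above.

```r
# Hypothetical helper: samples has one row per sample, one column per time step
summarise_samples <- function(samples) {
  q <- apply(samples, 2, quantile, probs = c(0, 0.25, 0.5, 0.75, 1))
  # Interquartile range relative to the min-max range
  iqr_to_range <- (q[4, ] - q[2, ]) / (q[5, ] - q[1, ])
  # Range above the median relative to the range below it
  risk_ratio <- (q[5, ] - q[3, ]) / (q[3, ] - q[1, ])
  # Assumed definition: share of samples higher than at the previous step
  upside_prob <- c(NA, vapply(
    seq_len(ncol(samples))[-1],
    function(t) mean(samples[, t] > samples[, t - 1]),
    numeric(1)
  ))
  data.frame(median = q[3, ], iqr_to_range, risk_ratio, upside_prob)
}

# Five-number summary of the first table row, then the same values shifted by 1
step1 <- c(136.2462, 137.1773, 137.7488, 138.7031, 139.4631)
out <- summarise_samples(cbind(step1, step1 + 1))
out$iqr_to_range[1]   # close to 0.474, as in the iqr_to_range column
out$risk_ratio[1]     # close to 1.141, as in the median_range_ratio column
```

The first time step has no previous point to compare with, which is why upside_prob starts with NA, as in the table.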
For each time feature included in the model, you get a plot of the median with the chosen confidence interval (the ci default is 0.8). As in other packages [5], we provide different statistics to give a better hint of the dynamics related to aleatoric and epistemic uncertainty.
[Plots of the median prediction with confidence interval, one per time feature: $IBM.Close, $AAPL.Close, $AMZN.Close, $GOOGL.Close, $MSFT.Close]
Now, let’s try a fast grid search for the best combination by setting the minimum and maximum number of time features to consider (the number cannot be lower than two or greater than the total number of features). The following example shows how to sample 75 different models at different rank values for a sequence of 100 time steps, setting min_feats and max_feats to 2 and 5.
example2 <- dymo(time_features[, 2:6], seq_len = 100, n_windows = 10, min_feats = 2, max_feats = 5, dates = time_features$date)
time: 146.4 sec elapsed
combined_features | rank | pred_scores | me | mae | mse | rmsse | mpe | mape | rmae | rrmse | rame | mase | smse | sce | gmrae |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1, 2 | 1 | 0.3337122 | 5.014450 | 8.117350 | 126.7835 | 10.79780 | 0.0490500 | 0.0785500 | 1.003000 | 1.002450 | 1.0011500 | 10.630750 | 159.2758 | 734.7405 | 1.0042500 |
1, 2 | 2 | 0.3322289 | 5.047200 | 8.149850 | 127.2314 | 10.82870 | 0.0494000 | 0.0789500 | 1.008000 | 1.006200 | 1.0026000 | 10.673600 | 159.7936 | 739.3155 | 1.0089000 |
1, 3 | 1 | 0.3781400 | 62.023550 | 96.204050 | 37370.8128 | 31.90910 | 0.0363500 | 0.0702000 | 1.004200 | 1.003300 | 1.0074000 | 9.095450 | 2230.9702 | 538.5538 | 1.0055000 |
1, 3 | 2 | 0.3766478 | 61.567000 | 96.058950 | 37295.0393 | 31.86910 | 0.0361500 | 0.0701500 | 1.003650 | 1.002350 | 0.9993000 | 9.092500 | 2228.0903 | 535.2638 | 1.0065000 |
1, 4 | 1 | 0.3311867 | 35.168750 | 60.415400 | 12813.1707 | 24.20645 | 0.0287500 | 0.0636500 | 1.001000 | 1.000550 | 0.9865500 | 7.951400 | 1002.2409 | 380.0926 | 1.0061000 |
1, 4 | 2 | 0.3304533 | 34.835450 | 60.160500 | 12681.4605 | 24.15095 | 0.0284500 | 0.0636500 | 1.001250 | 1.001150 | 0.9989500 | 7.943650 | 993.7906 | 376.8893 | 1.0009500 |
1, 5 | 1 | 0.3183456 | 5.812800 | 10.324250 | 197.2527 | 11.49770 | 0.0370000 | 0.0671000 | 1.002700 | 1.002500 | 1.0025500 | 9.460100 | 157.2731 | 583.3542 | 1.0030000 |
1, 5 | 2 | 0.3205644 | 5.825000 | 10.343600 | 197.3523 | 11.51525 | 0.0371500 | 0.0673000 | 1.005450 | 1.004850 | 1.0058000 | 9.475750 | 157.4423 | 584.1363 | 1.0061000 |
2, 3 | 1 | 0.3081389 | 64.835750 | 96.801300 | 37401.5293 | 34.27255 | 0.0707000 | 0.0903000 | 1.003300 | 1.002850 | 1.0099500 | 13.409950 | 2309.6056 | 1118.0692 | 1.0010500 |
2, 3 | 2 | 0.3096300 | 65.405200 | 97.028350 | 37563.2148 | 34.36280 | 0.0711500 | 0.0905500 | 1.007050 | 1.006450 | 1.0190500 | 13.465550 | 2319.6161 | 1126.6748 | 1.0087500 |
2, 4 | 1 | 0.2600200 | 37.989300 | 61.036100 | 12845.5759 | 26.58660 | 0.0631000 | 0.0839500 | 1.002100 | 1.001750 | 0.9891500 | 12.284600 | 1081.6693 | 959.8020 | 1.0015500 |
2, 4 | 2 | 0.2599756 | 38.090750 | 60.967800 | 12836.7808 | 26.58300 | 0.0631500 | 0.0841000 | 1.002650 | 1.002350 | 1.0319500 | 12.300300 | 1081.9792 | 962.7207 | 0.9975000 |
2, 5 | 1 | 0.2461589 | 8.680850 | 10.961500 | 229.4220 | 13.89615 | 0.0716000 | 0.0873500 | 1.006000 | 1.005650 | 1.0111000 | 13.794650 | 236.9394 | 1166.0287 | 1.0040500 |
2, 5 | 2 | 0.2481800 | 8.679300 | 10.975350 | 230.1019 | 13.91815 | 0.0716500 | 0.0875000 | 1.007350 | 1.007150 | 1.0149000 | 13.819300 | 237.6496 | 1168.2628 | 1.0054500 |
3, 4 | 1 | 0.3066778 | 95.063550 | 149.146800 | 50163.6505 | 47.70350 | 0.0503500 | 0.0755500 | 1.002300 | 1.002200 | 1.0052000 | 10.745650 | 3156.7243 | 764.0676 | 0.9995000 |
3, 4 | 2 | 0.3050411 | 95.220450 | 149.286200 | 50188.7413 | 47.72695 | 0.0504500 | 0.0756000 | 1.003650 | 1.002500 | 1.0085000 | 10.756450 | 3158.8977 | 765.8924 | 1.0102500 |
3, 5 | 1 | 0.2933978 | 65.593850 | 98.992500 | 37475.7553 | 34.95475 | 0.0583000 | 0.0786500 | 1.002800 | 1.002800 | 1.0106000 | 12.231200 | 2307.1267 | 964.6295 | 1.0029000 |
3, 5 | 2 | 0.2935933 | 66.031400 | 99.314050 | 37676.4897 | 35.07235 | 0.0588500 | 0.0791500 | 1.006350 | 1.006800 | 1.0211000 | 12.273350 | 2318.5853 | 970.1417 | 1.0046500 |
4, 5 | 1 | 0.2472922 | 38.744250 | 63.245950 | 12917.7447 | 27.28765 | 0.0508500 | 0.0725000 | 1.002300 | 1.002150 | 0.9912500 | 11.114300 | 1079.8356 | 805.7971 | 1.0020500 |
4, 5 | 2 | 0.2445700 | 39.504850 | 63.567750 | 13079.7614 | 27.40885 | 0.0516000 | 0.0729500 | 1.007450 | 1.007250 | 0.9711500 | 11.166050 | 1092.5968 | 817.6014 | 1.0012500 |
1, 2, 3 | 1 | 0.3394407 | 43.964433 | 67.046467 | 24966.5319 | 25.66307 | 0.0521000 | 0.0797333 | 1.004200 | 1.003233 | 1.0060667 | 11.048367 | 1566.6094 | 797.5320 | 1.0035000 |
1, 2, 3 | 2 | 0.3404044 | 43.681167 | 66.950867 | 24928.2663 | 25.63840 | 0.0520000 | 0.0798000 | 1.004833 | 1.003300 | 0.9994667 | 11.063100 | 1565.1554 | 796.6638 | 1.0050000 |
1, 2, 3 | 3 | 0.3376719 | 44.044900 | 67.137900 | 25044.5708 | 25.70643 | 0.0525667 | 0.0801000 | 1.008633 | 1.006300 | 0.9999000 | 11.099400 | 1572.2406 | 803.9074 | 1.0132667 |
1, 2, 4 | 1 | 0.3072578 | 26.057100 | 43.183500 | 8594.9008 | 20.52503 | 0.0469667 | 0.0753333 | 1.001167 | 1.000833 | 0.9915667 | 10.284267 | 747.7030 | 691.3607 | 1.0025667 |
1, 2, 4 | 2 | 0.3082607 | 25.817567 | 43.029067 | 8524.9614 | 20.48713 | 0.0467667 | 0.0753667 | 1.001100 | 1.000800 | 0.9997667 | 10.277067 | 742.8255 | 688.8413 | 1.0011000 |
1, 2, 4 | 3 | 0.3062704 | 25.911533 | 43.011367 | 8508.7568 | 20.50283 | 0.0470667 | 0.0756000 | 1.004433 | 1.003600 | 1.0256000 | 10.310867 | 742.6782 | 693.3524 | 1.0067333 |
1, 2, 5 | 1 | 0.2999622 | 6.510333 | 9.799733 | 184.1858 | 12.06233 | 0.0527333 | 0.0776667 | 1.004067 | 1.003433 | 1.0031000 | 11.299867 | 184.3455 | 829.2924 | 1.0070000 |
1, 2, 5 | 2 | 0.3015267 | 6.507300 | 9.806333 | 184.0581 | 12.06893 | 0.0526667 | 0.0779000 | 1.005433 | 1.004567 | 1.0038333 | 11.310367 | 184.3122 | 829.7167 | 1.0094333 |
1, 2, 5 | 3 | 0.3008844 | 6.512533 | 9.822767 | 184.5237 | 12.08723 | 0.0528667 | 0.0780000 | 1.007600 | 1.006233 | 1.0038333 | 11.333000 | 184.8117 | 831.5879 | 1.0152000 |
1, 3, 4 | 1 | 0.3386170 | 64.111700 | 101.936500 | 33473.8958 | 34.61273 | 0.0385333 | 0.0698000 | 1.002733 | 1.002267 | 1.0031333 | 9.267767 | 2131.2574 | 561.2362 | 1.0018667 |
1, 3, 4 | 2 | 0.3377837 | 64.192333 | 102.017400 | 33486.4965 | 34.62340 | 0.0386667 | 0.0698333 | 1.004333 | 1.002567 | 0.9984333 | 9.275667 | 2132.3660 | 563.0697 | 1.0075667 |
1, 3, 4 | 3 | 0.3382867 | 63.747467 | 101.724267 | 33292.6211 | 34.55490 | 0.0384667 | 0.0698667 | 1.003900 | 1.002233 | 0.9813667 | 9.266533 | 2122.4166 | 560.1311 | 1.0050333 |
1, 3, 5 | 1 | 0.3291719 | 44.468700 | 68.506467 | 25015.9151 | 26.11753 | 0.0438333 | 0.0719333 | 1.003767 | 1.003133 | 1.0066667 | 10.262033 | 1564.9436 | 695.1400 | 1.0051333 |
1, 3, 5 | 2 | 0.3303948 | 44.839967 | 68.742900 | 25118.7180 | 26.23377 | 0.0444333 | 0.0725333 | 1.014333 | 1.012700 | 1.0289333 | 10.331433 | 1571.1657 | 701.6544 | 1.0200667 |
1, 3, 5 | 3 | 0.3319207 | 44.567667 | 68.628733 | 25108.2918 | 26.16363 | 0.0441000 | 0.0721000 | 1.005167 | 1.004367 | 1.0053333 | 10.286300 | 1570.6829 | 697.5101 | 1.0051000 |
1, 4, 5 | 1 | 0.3001385 | 26.560333 | 44.656600 | 8643.0188 | 20.99243 | 0.0388000 | 0.0677000 | 1.001300 | 1.001100 | 0.9930000 | 9.504133 | 746.4789 | 588.7003 | 1.0030000 |
1, 4, 5 | 2 | 0.2991504 | 26.636600 | 44.687100 | 8675.5619 | 21.01050 | 0.0388000 | 0.0677333 | 1.002000 | 1.001767 | 0.9919333 | 9.514533 | 748.9582 | 589.8973 | 1.0029667 |
1, 4, 5 | 3 | 0.2981667 | 26.892667 | 44.753700 | 8681.5199 | 21.04887 | 0.0393333 | 0.0680667 | 1.006600 | 1.005700 | 0.9793667 | 9.545733 | 750.2065 | 595.6537 | 1.0062000 |
2, 3, 4 | 1 | 0.2911844 | 65.989933 | 102.340233 | 33494.9211 | 36.19167 | 0.0614333 | 0.0833000 | 1.002800 | 1.002400 | 1.0045667 | 12.148500 | 2183.8346 | 947.7828 | 1.0019667 |
2, 3, 4 | 2 | 0.2914415 | 66.073000 | 102.423400 | 33509.1342 | 36.20600 | 0.0614667 | 0.0833333 | 1.003200 | 1.002367 | 1.0046000 | 12.156667 | 2185.2708 | 948.8958 | 1.0059667 |
2, 3, 4 | 3 | 0.2925778 | 66.423000 | 102.438633 | 33469.8498 | 36.22533 | 0.0618000 | 0.0834667 | 1.005100 | 1.004100 | 0.9830333 | 12.180400 | 2184.8880 | 953.3424 | 1.0088000 |
2, 3, 5 | 1 | 0.2835578 | 46.344767 | 68.905267 | 25036.4686 | 27.69360 | 0.0667333 | 0.0853667 | 1.003267 | 1.002900 | 1.0081333 | 13.138867 | 1617.3833 | 1081.5792 | 1.0025333 |
2, 3, 5 | 2 | 0.2838652 | 46.739900 | 69.140067 | 25169.2869 | 27.78850 | 0.0671333 | 0.0857333 | 1.007000 | 1.006800 | 1.0179667 | 13.178833 | 1625.3686 | 1087.5854 | 1.0070333 |
2, 3, 5 | 3 | 0.2831296 | 46.869100 | 69.228767 | 25227.9206 | 27.82677 | 0.0673000 | 0.0858667 | 1.007867 | 1.008000 | 1.0202333 | 13.203967 | 1629.2192 | 1090.4674 | 1.0044000 |
2, 4, 5 | 1 | 0.2525793 | 28.441133 | 45.070700 | 8664.6160 | 22.57943 | 0.0617000 | 0.0812667 | 1.002100 | 1.001900 | 0.9947667 | 12.393300 | 799.4371 | 975.2223 | 0.9997333 |
2, 4, 5 | 2 | 0.2498363 | 29.037367 | 45.353000 | 8779.2162 | 22.69047 | 0.0624333 | 0.0817333 | 1.008000 | 1.007433 | 0.9919667 | 12.448533 | 808.9592 | 985.9126 | 1.0071667 |
2, 4, 5 | 3 | 0.2511452 | 28.912067 | 45.267567 | 8762.2846 | 22.66723 | 0.0623333 | 0.0817333 | 1.007000 | 1.006533 | 0.9853667 | 12.442600 | 807.4455 | 984.5331 | 1.0042667 |
3, 4, 5 | 1 | 0.2825170 | 66.494500 | 103.804500 | 33542.3754 | 36.65190 | 0.0532000 | 0.0756000 | 1.002500 | 1.002467 | 1.0058000 | 11.364400 | 2182.2954 | 845.3113 | 1.0031667 |
3, 4, 5 | 2 | 0.2819185 | 66.557200 | 103.887667 | 33555.3317 | 36.67153 | 0.0532333 | 0.0756667 | 1.003767 | 1.003000 | 1.0054667 | 11.380967 | 2183.8266 | 846.4252 | 1.0086667 |
3, 4, 5 | 3 | 0.2818667 | 67.349867 | 104.264133 | 33676.3325 | 36.80407 | 0.0538333 | 0.0760667 | 1.008400 | 1.008033 | 1.0155333 | 11.421833 | 2193.6488 | 854.4674 | 1.0076000 |
1, 2, 3, 4 | 1 | 0.3197344 | 50.044325 | 78.634125 | 25144.7873 | 29.25147 | 0.0498000 | 0.0770500 | 1.002975 | 1.002400 | 1.0031750 | 10.689325 | 1657.9553 | 749.7213 | 1.0030000 |
1, 2, 3, 4 | 2 | 0.3186872 | 50.094800 | 78.689900 | 25153.1894 | 29.25893 | 0.0498750 | 0.0770750 | 1.003875 | 1.002450 | 0.9982750 | 10.696125 | 1658.8261 | 751.0386 | 1.0050500 |
1, 2, 3, 4 | 3 | 0.3194389 | 49.885800 | 78.514850 | 25040.3792 | 29.21142 | 0.0497000 | 0.0770500 | 1.003275 | 1.001850 | 0.9850000 | 10.686200 | 1652.7728 | 749.3717 | 1.0014500 |
1, 2, 3, 4 | 4 | 0.3187022 | 50.020600 | 78.532675 | 25017.0199 | 29.23670 | 0.0501000 | 0.0773000 | 1.006625 | 1.004375 | 0.9727500 | 10.721950 | 1652.8669 | 753.9492 | 1.0112250 |
1, 2, 3, 5 | 1 | 0.3135567 | 35.312925 | 53.562400 | 18801.4044 | 22.88075 | 0.0537750 | 0.0786500 | 1.003850 | 1.003125 | 1.0057500 | 11.434875 | 1233.1771 | 850.1967 | 1.0045250 |
1, 2, 3, 5 | 2 | 0.3136322 | 35.664450 | 53.758000 | 18867.6186 | 22.99330 | 0.0544750 | 0.0794000 | 1.015800 | 1.012975 | 1.0254750 | 11.529975 | 1237.8186 | 860.2727 | 1.0242750 |
1, 2, 3, 5 | 3 | 0.3141517 | 35.370800 | 53.643375 | 18866.3568 | 22.91595 | 0.0540500 | 0.0788750 | 1.005825 | 1.004750 | 1.0039750 | 11.461600 | 1236.8785 | 853.2092 | 1.0086500 |
1, 2, 3, 5 | 4 | 0.3139083 | 35.559700 | 53.750600 | 18920.2698 | 22.96148 | 0.0542500 | 0.0790500 | 1.008025 | 1.006700 | 1.0043250 | 11.493200 | 1240.9852 | 857.3979 | 1.0097750 |
1, 2, 4, 5 | 1 | 0.2902789 | 21.878975 | 35.672500 | 6521.8527 | 19.03490 | 0.0499500 | 0.0754750 | 1.001400 | 1.001200 | 0.9951750 | 10.865725 | 619.5164 | 770.0327 | 1.0012500 |
1, 2, 4, 5 | 2 | 0.2898228 | 21.983650 | 35.733500 | 6568.5220 | 19.06583 | 0.0501000 | 0.0756000 | 1.002900 | 1.002675 | 0.9947500 | 10.883475 | 622.8895 | 772.8125 | 1.0038750 |
1, 2, 4, 5 | 3 | 0.2910217 | 22.095050 | 35.807450 | 6572.1682 | 19.10243 | 0.0505000 | 0.0758250 | 1.006925 | 1.005750 | 0.9841250 | 10.915700 | 623.9389 | 776.6895 | 1.0072250 |
1, 2, 4, 5 | 4 | 0.2898378 | 22.097875 | 35.738025 | 6543.7838 | 19.08540 | 0.0505250 | 0.0759250 | 1.007400 | 1.006275 | 0.9801000 | 10.918300 | 621.9325 | 777.4074 | 1.0123250 |
1, 3, 4, 5 | 1 | 0.3125539 | 50.422775 | 79.732225 | 25180.3742 | 29.59665 | 0.0436250 | 0.0712750 | 1.002775 | 1.002475 | 1.0040500 | 10.101250 | 1656.7991 | 672.8628 | 1.0030250 |
1, 3, 4, 5 | 2 | 0.3120489 | 50.460700 | 79.789300 | 25188.3863 | 29.60793 | 0.0437250 | 0.0713250 | 1.004200 | 1.002850 | 0.9989750 | 10.113625 | 1657.7597 | 674.0936 | 1.0070000 |
1, 3, 4, 5 | 3 | 0.3120072 | 50.556650 | 79.766675 | 25161.1994 | 29.60592 | 0.0438250 | 0.0715000 | 1.005550 | 1.003825 | 0.9857000 | 10.120800 | 1657.0171 | 675.2241 | 1.0081500 |
1, 3, 4, 5 | 4 | 0.3100828 | 50.795000 | 79.908550 | 25186.5848 | 29.66475 | 0.0441000 | 0.0716250 | 1.007400 | 1.006050 | 0.9945750 | 10.141075 | 1660.0292 | 678.5728 | 1.0080250 |
2, 3, 4, 5 | 1 | 0.2770911 | 51.831625 | 80.035225 | 25196.1469 | 30.78102 | 0.0608000 | 0.0814000 | 1.002825 | 1.002550 | 1.0051750 | 12.261850 | 1696.2358 | 962.7867 | 1.0039000 |
2, 3, 4, 5 | 2 | 0.2772711 | 51.868750 | 80.092675 | 25204.8889 | 30.79475 | 0.0608000 | 0.0814500 | 1.003475 | 1.002725 | 1.0034500 | 12.274600 | 1697.4194 | 963.5209 | 1.0058000 |
2, 3, 4, 5 | 3 | 0.2769500 | 52.013825 | 79.986325 | 25177.2589 | 30.78242 | 0.0608000 | 0.0813250 | 1.002025 | 1.002175 | 1.0095750 | 12.261750 | 1696.5272 | 964.8413 | 0.9945250 |
2, 3, 4, 5 | 4 | 0.2764600 | 52.601025 | 80.416400 | 25303.1833 | 30.91882 | 0.0614500 | 0.0818000 | 1.008600 | 1.007975 | 1.0171500 | 12.326825 | 1705.8303 | 972.5418 | 1.0073250 |
1, 2, 3, 4, 5 | 1 | 0.3024004 | 41.906760 | 65.531260 | 20175.7915 | 26.31092 | 0.0516200 | 0.0767800 | 1.002960 | 1.002540 | 1.0039200 | 11.071760 | 1373.0494 | 801.3258 | 1.0036600 |
1, 2, 3, 4, 5 | 2 | 0.3021471 | 41.931340 | 65.573900 | 20181.7094 | 26.31942 | 0.0516800 | 0.0768200 | 1.003780 | 1.002660 | 0.9987400 | 11.082160 | 1373.8651 | 802.2514 | 1.0050200 |
1, 2, 3, 4, 5 | 3 | 0.3021409 | 42.145400 | 65.600640 | 20176.1053 | 26.33324 | 0.0518800 | 0.0769400 | 1.005760 | 1.004220 | 0.9897400 | 11.097660 | 1374.4217 | 805.3026 | 1.0076200 |
1, 2, 3, 4, 5 | 4 | 0.3029138 | 42.216920 | 65.699320 | 20156.7790 | 26.37876 | 0.0521200 | 0.0772000 | 1.008340 | 1.006280 | 0.9913200 | 11.127960 | 1375.1162 | 808.8738 | 1.0094600 |
1, 2, 3, 4, 5 | 5 | 0.3008862 | 42.300460 | 65.713340 | 20193.0421 | 26.38884 | 0.0521200 | 0.0772200 | 1.008760 | 1.007080 | 0.9968000 | 11.130640 | 1376.8350 | 809.2007 | 1.0104800 |
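The count of 75 models can be checked with a few lines of base R, assuming, as the table above suggests, that every combination of k features is tested at ranks 1 to k. The variable names are only for this illustration.

```r
n_feats <- 5
sizes <- 2:5                      # min_feats to max_feats

# Number of feature combinations of each size, times the ranks tested for it
models_per_size <- choose(n_feats, sizes) * sizes
total_models <- sum(models_per_size)

# Enumerate the pairs explicitly, as in the first rows of the table
pairs <- combn(n_feats, 2)
```

Here models_per_size is 20, 30, 20 and 5 for combinations of two, three, four and five features, which adds up to the 75 rows of the table.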
The best model includes time features number 1 and number 3 (IBM and Amazon, respectively). If we compare the error statistics of the best model in example2 with those of the model in example1, we see a consistent improvement for both IBM and Amazon. All the relative and scaled error metrics default to naive, but you can choose more challenging thresholds (like the deviation of the whole time feature or the average of the whole predicted sequence).
The error statistics from example1 (averaged across 10 expanding validation windows):
example1$best_model$testing_errors
feature | pred_scores | me | mae | mse | rmsse | mpe | mape | rmae | rrmse | rame | mase | smse | sce | gmrae |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
IBM.Close | 0.3960911 | 2.3043 | 7.5842 | 96.5223 | 8.4726 | 0.0155 | 0.0590 | 1.0135 | 1.0083 | 0.9825 | 6.3671 | 80.9444 | 162.0369 | 1.0220 |
AAPL.Close | 0.2657489 | 7.8821 | 8.7555 | 158.3111 | 13.2152 | 0.0841 | 0.0992 | 1.0079 | 1.0064 | 0.9982 | 15.0325 | 239.2960 | 1325.7503 | 1.0117 |
AMZN.Close | 0.3512089 | 122.8044 | 185.2975 | 74715.1934 | 55.4967 | 0.0581 | 0.0820 | 1.0062 | 1.0062 | 1.0237 | 11.8917 | 4389.4121 | 926.2679 | 0.9998 |
GOOGL.Close | 0.2576022 | 69.0797 | 113.7313 | 25695.1495 | 40.1203 | 0.0433 | 0.0696 | 1.0049 | 1.0037 | 0.9642 | 9.6565 | 1938.6584 | 614.0788 | 1.0023 |
MSFT.Close | 0.2363667 | 9.4318 | 13.1982 | 300.0342 | 14.6394 | 0.0596 | 0.0763 | 1.0113 | 1.0108 | 1.0154 | 12.7054 | 235.8639 | 1017.8694 | 1.0166 |
The error statistics from example2 (as above, averaged across 10 expanding validation windows):
example2$best_model$testing_errors
feature | pred_scores | me | mae | mse | rmsse | mpe | mape | rmae | rrmse | rame | mase | smse | sce | gmrae |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
IBM.Close | 0.4018044 | 2.1896 | 7.5408 | 96.1175 | 8.4489 | 0.0147 | 0.0585 | 1.0076 | 1.0052 | 0.9964 | 6.3319 | 80.6405 | 153.6033 | 1.0118 |
AMZN.Close | 0.3514911 | 120.9444 | 184.5771 | 74493.9610 | 55.2893 | 0.0576 | 0.0818 | 0.9997 | 0.9995 | 1.0022 | 11.8531 | 4375.5401 | 916.9243 | 1.0012 |
The improvement is clear for both time features, but we are still using a naive approach to measure scaled and relative errors. Let’s switch to deviation as scale and average as benchmark, which are more challenging evaluation criteria.
example3 <- dymo(time_features[, 2:6], seq_len = 100, n_windows = 10, min_feats = 2, max_feats = 5, dates = time_features$date, error_scale = "deviation", error_benchmark = "average")
time: 133.97 sec elapsed
As you can see, the relative and scaled measures change noticeably as we raise the bar of our expectations:
example3$best_model$testing_errors
feature | pred_scores | me | mae | mse | rmsse | mpe | mape | rmae | rrmse | rame | mase | smse | sce | gmrae |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
IBM.Close | 0.4018044 | 2.1896 | 7.5408 | 96.1175 | 2.5574 | 0.0147 | 0.0585 | 1 | 1 | 1 | 0.5788 | 7.3931 | 13.3151 | 0.9995 |
AMZN.Close | 0.3514911 | 120.9444 | 184.5771 | 74493.9610 | 12.6031 | 0.0576 | 0.0818 | 1 | 1 | 1 | 0.6886 | 242.9613 | 59.2111 | 1.0000 |
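To make the effect of the scale more concrete, here is a small base R sketch of a scaled MAE under two different scales. It is an illustration under stated assumptions, not the greybox or dymo code: the helper scaled_mae is hypothetical, "naive" is taken to mean the mean absolute one-step change of the history (as in the classic MASE), and "deviation" is taken to mean the standard deviation of the history.

```r
# Hypothetical helper: MAE divided by a scale computed from the history
scaled_mae <- function(actual, predicted, history,
                       scale = c("naive", "deviation")) {
  scale <- match.arg(scale)
  mae <- mean(abs(actual - predicted))
  denom <- switch(scale,
    naive = mean(abs(diff(history))),   # in-sample one-step naive error
    deviation = sd(history)             # spread of the whole series (assumed)
  )
  mae / denom
}

history   <- c(10, 12, 11, 13, 12, 14)
actual    <- c(15, 14)
predicted <- c(14, 14)

mase_naive <- scaled_mae(actual, predicted, history, scale = "naive")
mase_dev   <- scaled_mae(actual, predicted, history, scale = "deviation")
```

The absolute error (mae) is identical in both calls; only the denominator changes. This mirrors the tables above, where me, mae and mse are the same in example2 and example3 while the scaled and relative columns differ.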
[1] An introductory review of Dynamic Mode Decomposition can be found on Wikipedia, covering the different variants of the DMD algorithm (SVD-based approach included) along with a number of academic references: https://en.wikipedia.org/wiki/Dynamic_mode_decomposition.
[2] Missing value imputation is managed through the imputeTS package. For more information: https://cran.r-project.org/web/packages/imputeTS/index.html.
[3] In some cases, you may want to operate on smoothed time features. In this case, dymo calls on the fANCOVA package. More details here: https://cran.r-project.org/web/packages/fANCOVA/index.html
[4] The metrics are calculated using the greybox package. For reference, see: https://cran.r-project.org/web/packages/greybox/index.html
[5] Other packages focused on time feature analysis that could be of interest here:
- AUDREX, https://cran.r-project.org/web/packages/audrex/index.html
- PROTEUS, https://cran.r-project.org/web/packages/proteus/index.html
- JENGA, https://cran.r-project.org/web/packages/jenga/index.html
- TETRAGON, https://cran.r-project.org/web/packages/tetragon/index.html
- SPOOKY, https://cran.r-project.org/web/packages/spooky/index.html