dymo: a brief introduction

Giancarlo Vercellino

05-May-2022

“In the natural world, there are many pattern-assembly systems for which there is no simple explanation. There are useful scientific explanations for these complex systems, but the final patterns that they produce are so heterogeneous that they cannot effectively be reduced to smaller or less intricate predecessor components. As I will argue … these patterns are, in a fundamental sense, irreducibly complex.” (Michael J. Katz, Templets and the Explanation of Complex Patterns)

Predicting non-linear systems using Dynamic Mode Decomposition

dymo uses Dynamic Mode Decomposition1 (DMD) to predict multiple time features coupled by non-linear dynamics. dymo is a fairly simple implementation of SVD-based DMD with a couple of add-ons that make it a little more interesting:

  1. An introduction to DMD is out of scope for this vignette. What you have to know, simplifying a bit, is that DMD linearizes the non-linear dynamics of multiple time features by projecting them into a Hilbert space to approximate their values. In other words, DMD learns the non-linear dynamics of a system over time and approximates them as a linear multiplicative system (you can think of it as a kind of regression, but there is no ordinary least squares here). Key points: you need at least two different time features (a single one is not a system by definition) and you need to choose the rank of the SVD approximation (dymo tests all the different ranks and selects the best one). You can also explore the space of all possible combinations of time features by selecting the minimum and maximum number of features (from 2 to the total number of features); a toy sketch of the decomposition follows this list.

  2. Some basic transformations are managed directly in the background. Differentiation and integration are handled automatically by dymo using the maximal p-value in a recursive F-test for de-trending each time feature: this makes it easy to determine the dynamic character of each time feature (random walk, trend, exponential), and is somewhat simpler and more practical than formal approaches like the Augmented Dickey-Fuller or Ljung-Box test. If your time features have a limited number of missing values, dymo automatically imputes them using the Kalman filter method2. If you prefer to project the smoothed version into the future, you can set smoother = TRUE to use the loess3 function.

  3. The test errors are cross-validated through an expanding validation scheme (n_windows): the default value is 10, meaning that the time features are divided into 10 + 1 segments, guaranteeing at least ten validation sets to measure the error on unforeseen data. For each point in the prediction sequence, a thousand samples are collected for the calculation of quantiles, mean, mode, standard deviation, skewness and kurtosis, plus other less common measures provided for each time step (see below).
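
To make the idea in point 1 more concrete, here is a minimal, self-contained sketch of SVD-based DMD on two synthetic time features (base R only; this is an illustration of the general technique, not dymo's actual internals):

# toy example: two coupled, noisy time features
set.seed(42)
tau <- seq(0, 10, length.out = 200)
X <- rbind(sin(tau) + 0.05 * rnorm(200),
           cos(tau) * exp(0.05 * tau) + 0.05 * rnorm(200))

X1 <- X[, -ncol(X)]        # snapshots at time k
X2 <- X[, -1]              # snapshots at time k + 1
r  <- 2                    # rank of the SVD truncation (dymo searches over this)

s <- svd(X1)
U <- s$u[, 1:r, drop = FALSE]
S <- diag(s$d[1:r], r, r)
V <- s$v[, 1:r, drop = FALSE]

A_tilde <- t(U) %*% X2 %*% V %*% solve(S)                         # reduced linear operator
x_next  <- U %*% A_tilde %*% t(U) %*% X[, ncol(X), drop = FALSE]  # one-step-ahead forecast

dymo performs this kind of decomposition for you, tests every admissible rank (and, optionally, every combination of features) and keeps the setup with the best cross-validated error.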

The process flow of dymo (figure).

A look at the price dynamics of tech giants’ stocks

The time_features dataset included with dymo is a recent take on some Big Techs' stock prices (source: Yahoo Finance). The data is expected in data frame format, where each column represents a different time series (the date information is not mandatory and can be provided separately); a sketch for shaping your own data follows the table below.

Examples of time features: Tech Giants' Shares
date         IBM.Close  AAPL.Close  AMZN.Close  GOOGL.Close  MSFT.Close
2017-01-03    159.8375     29.0375      753.67       808.01       62.58
2017-01-04    161.8164     29.0050      757.18       807.77       62.30
2017-01-05    161.2811     29.1525      780.45       813.02       62.30
2017-01-06    162.0746     29.4775      795.99       825.21       62.84
2017-01-09    160.2773     29.7475      796.92       827.18       62.64
2017-01-10    158.2409     29.7775      795.90       826.01       62.62
2017-01-11    160.3728     29.9375      799.02       829.86       63.19
2017-01-12    160.5641     29.8125      813.64       829.53       62.61
2017-01-13    159.9809     29.7600      817.14       830.94       62.70
2017-01-17    160.5067     30.0000      809.72       827.46       62.53
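
If you want to feed dymo your own data, a minimal sketch of the expected shape looks like this (feature names, lengths and the seq_len value below are purely illustrative):

my_df <- data.frame(feat_a = cumsum(rnorm(300)),   # one numeric column per time feature
                    feat_b = cumsum(rnorm(300)))
my_dates <- seq.Date(as.Date("2017-01-03"), by = "day", length.out = 300)
# my_model <- dymo(my_df, seq_len = 30, n_windows = 10, dates = my_dates)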

In the first example, we predict the close prices for all five stocks. We set seq_len = 100 (sequence length) and use a cross-validation scheme with n_windows = 10 for error measurement.


example1 <- dymo(time_features[, 2:6], seq_len = 100, n_windows = 10,  dates = time_features$date)
  time: 13.96 sec elapsed

The result is a list of different components, as you can see below.

names(example1)
  [1] "comb_metrics" "best_model"   "time_log"
names(example1$best_model)
  [1] "best_combination" "quant_preds"      "testing_errors"   "plots"

comb_metrics includes the error metrics for all possible combinations of time features (prediction score, me, mae, mse, rmsse, mpe, mape, rmae, rrmse, rame, mase, smse, sce and gmrae4, averaged across features, ranks and validation windows). best_model collects the best combination of features and rank (best_combination), the error metrics for each time feature (testing_errors), the prediction intervals (quant_preds) and the visualizations (plots).
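
For instance, with the objects created above you can inspect the winning combination and the per-feature errors directly:

head(example1$comb_metrics)                 # error metrics for every combination and rank
example1$best_model$best_combination        # features and rank of the winning model
example1$best_model$testing_errors          # cross-validated errors per time feature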

quant_preds is a list including the predicted results for each time feature (quantiles, min, max, mean, mode, sd, skewness, kurtosis, IQR to range, risk ratio, upside probability and divergence for each time point in the seq_len sequence). The IQR to range is the ratio of the interquartile range to the min-max range; the risk ratio (median_range_ratio in the table below) is the ratio of the range above the median to the range below it; the upside probability is the probability of growth compared to the previous point in the time sequence; the divergence is the maximum distance between the cumulative normal curve of each point and that of the previous point in the sequence.
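
A rough sketch of how these summary statistics can be derived from the thousand samples collected at each forecast point (our reading of the definitions above, not dymo's actual code; the sample values are made up):

samples_t  <- rnorm(1000, mean = 138, sd = 3)    # simulated samples at time t
samples_t1 <- rnorm(1000, mean = 137, sd = 3)    # simulated samples at time t - 1

iqr_to_range       <- IQR(samples_t) / diff(range(samples_t))
median_range_ratio <- (max(samples_t) - median(samples_t)) / (median(samples_t) - min(samples_t))
upside_prob        <- mean(samples_t > samples_t1)   # probability of growth vs the former point
grid       <- seq(min(c(samples_t, samples_t1)), max(c(samples_t, samples_t1)), length.out = 500)
divergence <- max(abs(pnorm(grid, mean(samples_t), sd(samples_t)) -
                      pnorm(grid, mean(samples_t1), sd(samples_t1))))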

Examples of prediction for IBM Close Prices (first 10 data points)
date min 10% 25% 50% 75% 90% max mean sd mode kurtosis skewness iqr_to_range median_range_ratio upside_prob divergence
2022-04-22 136.2462 136.2462 137.1773 137.7488 138.7031 139.0710 139.4631 137.8184 1.030336 137.2716 1.771882 0.0708948 0.4742911 1.1408802 NA NA
2022-04-23 135.2319 135.2319 137.2029 138.4536 138.9849 140.7172 140.8256 138.2064 1.689900 138.4850 2.309916 -0.1448395 0.3185650 0.7362330 0.572 0.196
2022-04-24 134.4115 135.7858 135.9352 138.7753 139.9002 141.9825 144.3283 138.4494 2.887694 136.8674 2.353346 0.5289843 0.3998268 1.2725227 0.520 0.202
2022-04-26 133.5547 133.5547 135.5948 139.2622 139.6835 143.3630 143.3630 137.9175 3.078726 137.9754 2.008074 0.1416566 0.4168617 0.7185051 0.436 0.212
2022-04-27 133.8319 134.3550 135.4705 138.3799 140.0557 143.7167 143.7167 138.1400 3.063455 138.2290 2.042838 0.2979875 0.4638630 1.1734455 0.503 0.190
2022-04-29 132.3023 134.1447 135.3940 138.7719 141.7543 143.7072 144.1092 138.6588 3.789718 139.3594 1.795545 -0.0840420 0.5386922 0.8249865 0.528 0.102
2022-04-30 130.8847 134.7497 136.4454 138.5424 142.6843 147.1780 147.1780 139.1904 4.694869 136.6870 2.173326 0.1846910 0.3829110 1.1276902 0.537 0.104
2022-05-02 128.8961 128.8961 136.2183 138.9943 142.3936 147.6842 147.9333 139.2732 5.647323 138.4052 2.380761 -0.0571137 0.3243786 0.8852128 0.511 0.111
2022-05-03 125.6743 134.5776 135.8526 138.3643 145.6935 148.8702 148.8702 139.7009 6.564031 136.6248 2.573428 -0.3642295 0.4242506 0.8278921 0.512 0.109
2022-05-05 129.3263 129.3263 136.7991 137.7002 145.4832 149.2144 150.5619 140.0869 6.440886 136.8208 1.992426 0.1653656 0.4089409 1.5359386 0.512 0.058

For each time feature included in the model, you get a plot of the median with the chosen confidence interval (the ci default is 0.8). As in other packages5, we provide different stats to give a better hint about the different dynamics related to aleatoric and epistemic uncertainty. A sketch for displaying or saving a single plot follows below.

  The plots list contains one forecast figure per feature: $IBM.Close, $AAPL.Close, $AMZN.Close, $GOOGL.Close and $MSFT.Close.
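
Assuming the plots are standard ggplot objects (a guess, not something this vignette states), you can print or save a single one in the usual way:

example1$best_model$plots$IBM.Close
# ggplot2::ggsave("ibm_close_forecast.png", example1$best_model$plots$IBM.Close)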

The best combination is not always what you may think

Now, let’s try a fast grid search for the best combination by setting the minimum and maximum number of time features to consider (obviously, this number cannot be lower than two or greater than the total number of features). The following example samples 75 different models at different rank values for a sequence of 100 time steps, setting min_feats = 2 and max_feats = 5 (a sketch for ranking the resulting history table programmatically follows the table).

example2 <- dymo(time_features[, 2:6], seq_len = 100, n_windows = 10,  min_feats = 2, max_feats = 5, dates = time_features$date)
  time: 146.4 sec elapsed
History table with ranking of 75 different models
combined_features rank pred_scores me mae mse rmsse mpe mape rmae rrmse rame mase smse sce gmrae
1, 2 1 0.3337122 5.014450 8.117350 126.7835 10.79780 0.0490500 0.0785500 1.003000 1.002450 1.0011500 10.630750 159.2758 734.7405 1.0042500
1, 2 2 0.3322289 5.047200 8.149850 127.2314 10.82870 0.0494000 0.0789500 1.008000 1.006200 1.0026000 10.673600 159.7936 739.3155 1.0089000
1, 3 1 0.3781400 62.023550 96.204050 37370.8128 31.90910 0.0363500 0.0702000 1.004200 1.003300 1.0074000 9.095450 2230.9702 538.5538 1.0055000
1, 3 2 0.3766478 61.567000 96.058950 37295.0393 31.86910 0.0361500 0.0701500 1.003650 1.002350 0.9993000 9.092500 2228.0903 535.2638 1.0065000
1, 4 1 0.3311867 35.168750 60.415400 12813.1707 24.20645 0.0287500 0.0636500 1.001000 1.000550 0.9865500 7.951400 1002.2409 380.0926 1.0061000
1, 4 2 0.3304533 34.835450 60.160500 12681.4605 24.15095 0.0284500 0.0636500 1.001250 1.001150 0.9989500 7.943650 993.7906 376.8893 1.0009500
1, 5 1 0.3183456 5.812800 10.324250 197.2527 11.49770 0.0370000 0.0671000 1.002700 1.002500 1.0025500 9.460100 157.2731 583.3542 1.0030000
1, 5 2 0.3205644 5.825000 10.343600 197.3523 11.51525 0.0371500 0.0673000 1.005450 1.004850 1.0058000 9.475750 157.4423 584.1363 1.0061000
2, 3 1 0.3081389 64.835750 96.801300 37401.5293 34.27255 0.0707000 0.0903000 1.003300 1.002850 1.0099500 13.409950 2309.6056 1118.0692 1.0010500
2, 3 2 0.3096300 65.405200 97.028350 37563.2148 34.36280 0.0711500 0.0905500 1.007050 1.006450 1.0190500 13.465550 2319.6161 1126.6748 1.0087500
2, 4 1 0.2600200 37.989300 61.036100 12845.5759 26.58660 0.0631000 0.0839500 1.002100 1.001750 0.9891500 12.284600 1081.6693 959.8020 1.0015500
2, 4 2 0.2599756 38.090750 60.967800 12836.7808 26.58300 0.0631500 0.0841000 1.002650 1.002350 1.0319500 12.300300 1081.9792 962.7207 0.9975000
2, 5 1 0.2461589 8.680850 10.961500 229.4220 13.89615 0.0716000 0.0873500 1.006000 1.005650 1.0111000 13.794650 236.9394 1166.0287 1.0040500
2, 5 2 0.2481800 8.679300 10.975350 230.1019 13.91815 0.0716500 0.0875000 1.007350 1.007150 1.0149000 13.819300 237.6496 1168.2628 1.0054500
3, 4 1 0.3066778 95.063550 149.146800 50163.6505 47.70350 0.0503500 0.0755500 1.002300 1.002200 1.0052000 10.745650 3156.7243 764.0676 0.9995000
3, 4 2 0.3050411 95.220450 149.286200 50188.7413 47.72695 0.0504500 0.0756000 1.003650 1.002500 1.0085000 10.756450 3158.8977 765.8924 1.0102500
3, 5 1 0.2933978 65.593850 98.992500 37475.7553 34.95475 0.0583000 0.0786500 1.002800 1.002800 1.0106000 12.231200 2307.1267 964.6295 1.0029000
3, 5 2 0.2935933 66.031400 99.314050 37676.4897 35.07235 0.0588500 0.0791500 1.006350 1.006800 1.0211000 12.273350 2318.5853 970.1417 1.0046500
4, 5 1 0.2472922 38.744250 63.245950 12917.7447 27.28765 0.0508500 0.0725000 1.002300 1.002150 0.9912500 11.114300 1079.8356 805.7971 1.0020500
4, 5 2 0.2445700 39.504850 63.567750 13079.7614 27.40885 0.0516000 0.0729500 1.007450 1.007250 0.9711500 11.166050 1092.5968 817.6014 1.0012500
1, 2, 3 1 0.3394407 43.964433 67.046467 24966.5319 25.66307 0.0521000 0.0797333 1.004200 1.003233 1.0060667 11.048367 1566.6094 797.5320 1.0035000
1, 2, 3 2 0.3404044 43.681167 66.950867 24928.2663 25.63840 0.0520000 0.0798000 1.004833 1.003300 0.9994667 11.063100 1565.1554 796.6638 1.0050000
1, 2, 3 3 0.3376719 44.044900 67.137900 25044.5708 25.70643 0.0525667 0.0801000 1.008633 1.006300 0.9999000 11.099400 1572.2406 803.9074 1.0132667
1, 2, 4 1 0.3072578 26.057100 43.183500 8594.9008 20.52503 0.0469667 0.0753333 1.001167 1.000833 0.9915667 10.284267 747.7030 691.3607 1.0025667
1, 2, 4 2 0.3082607 25.817567 43.029067 8524.9614 20.48713 0.0467667 0.0753667 1.001100 1.000800 0.9997667 10.277067 742.8255 688.8413 1.0011000
1, 2, 4 3 0.3062704 25.911533 43.011367 8508.7568 20.50283 0.0470667 0.0756000 1.004433 1.003600 1.0256000 10.310867 742.6782 693.3524 1.0067333
1, 2, 5 1 0.2999622 6.510333 9.799733 184.1858 12.06233 0.0527333 0.0776667 1.004067 1.003433 1.0031000 11.299867 184.3455 829.2924 1.0070000
1, 2, 5 2 0.3015267 6.507300 9.806333 184.0581 12.06893 0.0526667 0.0779000 1.005433 1.004567 1.0038333 11.310367 184.3122 829.7167 1.0094333
1, 2, 5 3 0.3008844 6.512533 9.822767 184.5237 12.08723 0.0528667 0.0780000 1.007600 1.006233 1.0038333 11.333000 184.8117 831.5879 1.0152000
1, 3, 4 1 0.3386170 64.111700 101.936500 33473.8958 34.61273 0.0385333 0.0698000 1.002733 1.002267 1.0031333 9.267767 2131.2574 561.2362 1.0018667
1, 3, 4 2 0.3377837 64.192333 102.017400 33486.4965 34.62340 0.0386667 0.0698333 1.004333 1.002567 0.9984333 9.275667 2132.3660 563.0697 1.0075667
1, 3, 4 3 0.3382867 63.747467 101.724267 33292.6211 34.55490 0.0384667 0.0698667 1.003900 1.002233 0.9813667 9.266533 2122.4166 560.1311 1.0050333
1, 3, 5 1 0.3291719 44.468700 68.506467 25015.9151 26.11753 0.0438333 0.0719333 1.003767 1.003133 1.0066667 10.262033 1564.9436 695.1400 1.0051333
1, 3, 5 2 0.3303948 44.839967 68.742900 25118.7180 26.23377 0.0444333 0.0725333 1.014333 1.012700 1.0289333 10.331433 1571.1657 701.6544 1.0200667
1, 3, 5 3 0.3319207 44.567667 68.628733 25108.2918 26.16363 0.0441000 0.0721000 1.005167 1.004367 1.0053333 10.286300 1570.6829 697.5101 1.0051000
1, 4, 5 1 0.3001385 26.560333 44.656600 8643.0188 20.99243 0.0388000 0.0677000 1.001300 1.001100 0.9930000 9.504133 746.4789 588.7003 1.0030000
1, 4, 5 2 0.2991504 26.636600 44.687100 8675.5619 21.01050 0.0388000 0.0677333 1.002000 1.001767 0.9919333 9.514533 748.9582 589.8973 1.0029667
1, 4, 5 3 0.2981667 26.892667 44.753700 8681.5199 21.04887 0.0393333 0.0680667 1.006600 1.005700 0.9793667 9.545733 750.2065 595.6537 1.0062000
2, 3, 4 1 0.2911844 65.989933 102.340233 33494.9211 36.19167 0.0614333 0.0833000 1.002800 1.002400 1.0045667 12.148500 2183.8346 947.7828 1.0019667
2, 3, 4 2 0.2914415 66.073000 102.423400 33509.1342 36.20600 0.0614667 0.0833333 1.003200 1.002367 1.0046000 12.156667 2185.2708 948.8958 1.0059667
2, 3, 4 3 0.2925778 66.423000 102.438633 33469.8498 36.22533 0.0618000 0.0834667 1.005100 1.004100 0.9830333 12.180400 2184.8880 953.3424 1.0088000
2, 3, 5 1 0.2835578 46.344767 68.905267 25036.4686 27.69360 0.0667333 0.0853667 1.003267 1.002900 1.0081333 13.138867 1617.3833 1081.5792 1.0025333
2, 3, 5 2 0.2838652 46.739900 69.140067 25169.2869 27.78850 0.0671333 0.0857333 1.007000 1.006800 1.0179667 13.178833 1625.3686 1087.5854 1.0070333
2, 3, 5 3 0.2831296 46.869100 69.228767 25227.9206 27.82677 0.0673000 0.0858667 1.007867 1.008000 1.0202333 13.203967 1629.2192 1090.4674 1.0044000
2, 4, 5 1 0.2525793 28.441133 45.070700 8664.6160 22.57943 0.0617000 0.0812667 1.002100 1.001900 0.9947667 12.393300 799.4371 975.2223 0.9997333
2, 4, 5 2 0.2498363 29.037367 45.353000 8779.2162 22.69047 0.0624333 0.0817333 1.008000 1.007433 0.9919667 12.448533 808.9592 985.9126 1.0071667
2, 4, 5 3 0.2511452 28.912067 45.267567 8762.2846 22.66723 0.0623333 0.0817333 1.007000 1.006533 0.9853667 12.442600 807.4455 984.5331 1.0042667
3, 4, 5 1 0.2825170 66.494500 103.804500 33542.3754 36.65190 0.0532000 0.0756000 1.002500 1.002467 1.0058000 11.364400 2182.2954 845.3113 1.0031667
3, 4, 5 2 0.2819185 66.557200 103.887667 33555.3317 36.67153 0.0532333 0.0756667 1.003767 1.003000 1.0054667 11.380967 2183.8266 846.4252 1.0086667
3, 4, 5 3 0.2818667 67.349867 104.264133 33676.3325 36.80407 0.0538333 0.0760667 1.008400 1.008033 1.0155333 11.421833 2193.6488 854.4674 1.0076000
1, 2, 3, 4 1 0.3197344 50.044325 78.634125 25144.7873 29.25147 0.0498000 0.0770500 1.002975 1.002400 1.0031750 10.689325 1657.9553 749.7213 1.0030000
1, 2, 3, 4 2 0.3186872 50.094800 78.689900 25153.1894 29.25893 0.0498750 0.0770750 1.003875 1.002450 0.9982750 10.696125 1658.8261 751.0386 1.0050500
1, 2, 3, 4 3 0.3194389 49.885800 78.514850 25040.3792 29.21142 0.0497000 0.0770500 1.003275 1.001850 0.9850000 10.686200 1652.7728 749.3717 1.0014500
1, 2, 3, 4 4 0.3187022 50.020600 78.532675 25017.0199 29.23670 0.0501000 0.0773000 1.006625 1.004375 0.9727500 10.721950 1652.8669 753.9492 1.0112250
1, 2, 3, 5 1 0.3135567 35.312925 53.562400 18801.4044 22.88075 0.0537750 0.0786500 1.003850 1.003125 1.0057500 11.434875 1233.1771 850.1967 1.0045250
1, 2, 3, 5 2 0.3136322 35.664450 53.758000 18867.6186 22.99330 0.0544750 0.0794000 1.015800 1.012975 1.0254750 11.529975 1237.8186 860.2727 1.0242750
1, 2, 3, 5 3 0.3141517 35.370800 53.643375 18866.3568 22.91595 0.0540500 0.0788750 1.005825 1.004750 1.0039750 11.461600 1236.8785 853.2092 1.0086500
1, 2, 3, 5 4 0.3139083 35.559700 53.750600 18920.2698 22.96148 0.0542500 0.0790500 1.008025 1.006700 1.0043250 11.493200 1240.9852 857.3979 1.0097750
1, 2, 4, 5 1 0.2902789 21.878975 35.672500 6521.8527 19.03490 0.0499500 0.0754750 1.001400 1.001200 0.9951750 10.865725 619.5164 770.0327 1.0012500
1, 2, 4, 5 2 0.2898228 21.983650 35.733500 6568.5220 19.06583 0.0501000 0.0756000 1.002900 1.002675 0.9947500 10.883475 622.8895 772.8125 1.0038750
1, 2, 4, 5 3 0.2910217 22.095050 35.807450 6572.1682 19.10243 0.0505000 0.0758250 1.006925 1.005750 0.9841250 10.915700 623.9389 776.6895 1.0072250
1, 2, 4, 5 4 0.2898378 22.097875 35.738025 6543.7838 19.08540 0.0505250 0.0759250 1.007400 1.006275 0.9801000 10.918300 621.9325 777.4074 1.0123250
1, 3, 4, 5 1 0.3125539 50.422775 79.732225 25180.3742 29.59665 0.0436250 0.0712750 1.002775 1.002475 1.0040500 10.101250 1656.7991 672.8628 1.0030250
1, 3, 4, 5 2 0.3120489 50.460700 79.789300 25188.3863 29.60793 0.0437250 0.0713250 1.004200 1.002850 0.9989750 10.113625 1657.7597 674.0936 1.0070000
1, 3, 4, 5 3 0.3120072 50.556650 79.766675 25161.1994 29.60592 0.0438250 0.0715000 1.005550 1.003825 0.9857000 10.120800 1657.0171 675.2241 1.0081500
1, 3, 4, 5 4 0.3100828 50.795000 79.908550 25186.5848 29.66475 0.0441000 0.0716250 1.007400 1.006050 0.9945750 10.141075 1660.0292 678.5728 1.0080250
2, 3, 4, 5 1 0.2770911 51.831625 80.035225 25196.1469 30.78102 0.0608000 0.0814000 1.002825 1.002550 1.0051750 12.261850 1696.2358 962.7867 1.0039000
2, 3, 4, 5 2 0.2772711 51.868750 80.092675 25204.8889 30.79475 0.0608000 0.0814500 1.003475 1.002725 1.0034500 12.274600 1697.4194 963.5209 1.0058000
2, 3, 4, 5 3 0.2769500 52.013825 79.986325 25177.2589 30.78242 0.0608000 0.0813250 1.002025 1.002175 1.0095750 12.261750 1696.5272 964.8413 0.9945250
2, 3, 4, 5 4 0.2764600 52.601025 80.416400 25303.1833 30.91882 0.0614500 0.0818000 1.008600 1.007975 1.0171500 12.326825 1705.8303 972.5418 1.0073250
1, 2, 3, 4, 5 1 0.3024004 41.906760 65.531260 20175.7915 26.31092 0.0516200 0.0767800 1.002960 1.002540 1.0039200 11.071760 1373.0494 801.3258 1.0036600
1, 2, 3, 4, 5 2 0.3021471 41.931340 65.573900 20181.7094 26.31942 0.0516800 0.0768200 1.003780 1.002660 0.9987400 11.082160 1373.8651 802.2514 1.0050200
1, 2, 3, 4, 5 3 0.3021409 42.145400 65.600640 20176.1053 26.33324 0.0518800 0.0769400 1.005760 1.004220 0.9897400 11.097660 1374.4217 805.3026 1.0076200
1, 2, 3, 4, 5 4 0.3029138 42.216920 65.699320 20156.7790 26.37876 0.0521200 0.0772000 1.008340 1.006280 0.9913200 11.127960 1375.1162 808.8738 1.0094600
1, 2, 3, 4, 5 5 0.3008862 42.300460 65.713340 20193.0421 26.38884 0.0521200 0.0772200 1.008760 1.007080 0.9968000 11.130640 1376.8350 809.2007 1.0104800
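
To pick the winner programmatically rather than by eye, you can sort the combination metrics by prediction score (assuming the history table above is what example2$comb_metrics holds, with the same column names):

ranked <- example2$comb_metrics[order(example2$comb_metrics$pred_scores, decreasing = TRUE), ]
head(ranked)                                # best-scoring combinations and ranks first
example2$best_model$best_combination        # the combination dymo actually selected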

The best model includes time features number 1 and number 3 (respectively, IBM and Amazon). If we compare the error statistics of the best model in example2 with those of the model in example1, we see a consistent improvement for both IBM and Amazon. All the relative and scaled error metrics default to a naive benchmark, but you can choose more challenging thresholds (like the deviation of the whole time feature, or the average of the whole predicted sequence).

The error statistics from example1 (averaged across 10 expanding validation windows):

example1$best_model$testing_errors
              pred_scores       me      mae        mse   rmsse    mpe   mape
  IBM.Close     0.3960911   2.3043   7.5842    96.5223  8.4726 0.0155 0.0590
  AAPL.Close    0.2657489   7.8821   8.7555   158.3111 13.2152 0.0841 0.0992
  AMZN.Close    0.3512089 122.8044 185.2975 74715.1934 55.4967 0.0581 0.0820
  GOOGL.Close   0.2576022  69.0797 113.7313 25695.1495 40.1203 0.0433 0.0696
  MSFT.Close    0.2363667   9.4318  13.1982   300.0342 14.6394 0.0596 0.0763
                rmae  rrmse   rame    mase      smse       sce  gmrae
  IBM.Close   1.0135 1.0083 0.9825  6.3671   80.9444  162.0369 1.0220
  AAPL.Close  1.0079 1.0064 0.9982 15.0325  239.2960 1325.7503 1.0117
  AMZN.Close  1.0062 1.0062 1.0237 11.8917 4389.4121  926.2679 0.9998
  GOOGL.Close 1.0049 1.0037 0.9642  9.6565 1938.6584  614.0788 1.0023
  MSFT.Close  1.0113 1.0108 1.0154 12.7054  235.8639 1017.8694 1.0166

The error statistics from example2 (as above, averaged across 10 expanding validation windows):

example2$best_model$testing_errors
             pred_scores       me      mae        mse   rmsse    mpe   mape
  IBM.Close    0.4018044   2.1896   7.5408    96.1175  8.4489 0.0147 0.0585
  AMZN.Close   0.3514911 120.9444 184.5771 74493.9610 55.2893 0.0576 0.0818
               rmae  rrmse   rame    mase      smse      sce  gmrae
  IBM.Close  1.0076 1.0052 0.9964  6.3319   80.6405 153.6033 1.0118
  AMZN.Close 0.9997 0.9995 1.0022 11.8531 4375.5401 916.9243 1.0012

The improvement is clear for both time features, but we are still using a naive approach to measure scaled and relative errors. Let’s switch to deviation as the scale and average as the benchmark, which are more challenging evaluation criteria.

example3 <- dymo(time_features[, 2:6], seq_len = 100, n_windows = 10,  min_feats = 2, max_feats = 5, dates = time_features$date, error_scale = "deviation", error_benchmark = "average")
  time: 133.97 sec elapsed

As you can see, the relative and scaled measures change noticeably as we raise the bar of our expectations:

example3$best_model$testing_errors
             pred_scores       me      mae        mse   rmsse    mpe   mape rmae
  IBM.Close    0.4018044   2.1896   7.5408    96.1175  2.5574 0.0147 0.0585    1
  AMZN.Close   0.3514911 120.9444 184.5771 74493.9610 12.6031 0.0576 0.0818    1
             rrmse rame   mase     smse     sce  gmrae
  IBM.Close      1    1 0.5788   7.3931 13.3151 0.9995
  AMZN.Close     1    1 0.6886 242.9613 59.2111 1.0000

Some useful references


  1. An introductory review of Dynamic Mode Decomposition can be found on Wikipedia, covering the different variants of the DMD algorithm (SVD-based approach included) and a number of academic references: https://en.wikipedia.org/wiki/Dynamic_mode_decomposition.

  2. Missing-value imputation is managed through the imputeTS package. For more information: https://cran.r-project.org/web/packages/imputeTS/index.html.

  3. In some cases you may want to operate on smoothed time features. In that case, dymo relies on the fANCOVA package. You can find the latest version here: https://cran.r-project.org/web/packages/fANCOVA/index.html.

  4. The metrics are calculated using the greybox package. For reference, please take a look here: https://cran.r-project.org/web/packages/greybox/index.html.

  5. Other packages focused on time feature analysis that could be of interest here:

    - AUDREX, https://cran.r-project.org/web/packages/audrex/index.html
    - PROTEUS, https://cran.r-project.org/web/packages/proteus/index.html
    - JENGA, https://cran.r-project.org/web/packages/jenga/index.html
    - TETRAGON, https://cran.r-project.org/web/packages/tetragon/index.html
    - SPOOKY, https://cran.r-project.org/web/packages/spooky/index.html