“I will live in the Past, the Present, and the Future.” (Ebenezer Scrooge, A Christmas Carol)
spooky uses Discrete Fast Fourier Transformation to extrapolate time features beyond their boundaries. As you can see below, moving inside out the picture, the main points of spooky are the following:
We can decompose a time feature signal in all its components using a Discrete Fast-Fourier Transformation: some components will be highly related to future trends and will be relevant for any purpose of future generalizations, while others will probably be no more than noises and errors without any representative power for extrapolation: we sift through all the possible frequencies using a jack-knife resampling to magnify the predicting power of best frequencies. Each time feature is expanded inside each validation window: after the spectral analysis, the signal of each time feature is extended into the future for the requested sequence of time steps (seq_len
), then the expansion is being carried out on the resampled set from a jack-knife leaving N points out (lno
).
Some basic transformation are directly managed in background. Differentiation and integration are automatically managed by spooky using the maximal p-value in a recursive F-test for de-trending each time-feature: this allows to easily determine the different dynamic characteristics of each time feature (random walk, trend, exponential). Maybe not cool like Augmented Dickey-Fuller or Ljung Box Test, but it works for our humble purpose. If you have limited missing values in your time features, spooky automatically proceeds with the imputation using the Kalman filter method1. If you prefer to project into the future the smoothed version, you can set smoother
= TRUE
to use loess2 function.
The test errors are cross-validated through an expanding validation n_windows
: the default value is set to 10, meaning that the time features are divided into 10 + 1 segments guaranteeing at least ten validation sets to measure the error on unforeseen data. For each point in the prediction sequence, a thousand samples are collected for the calculation of quantiles, mean, mode, standard deviation, skewness and kurtosis, and other less common measures that are provided for each time step (see below).
Concluding this brief excursus inside out the mode, beyond the windowing phase, spooky manages the random search sampling n_samp
models, randomly selected within the cartesian space delimited by seq_len
and lno
. The process is really fast.
Easy, that’s all.
The process flow of spooky
The dataset time features
included with spooky is a recent take on IBM and Microsoft stock prices (source: Yahoo Finance). The data is expected in a dataframe format, where each column represents a different time series (the date information is not mandatory and could be provided separately).
IBM.Close | MSFT.Close | |
---|---|---|
2017-01-03 | 159.8375 | 62.58 |
2017-01-04 | 161.8164 | 62.30 |
2017-01-05 | 161.2811 | 62.30 |
2017-01-06 | 162.0746 | 62.84 |
2017-01-09 | 160.2773 | 62.64 |
2017-01-10 | 158.2409 | 62.62 |
2017-01-11 | 160.3728 | 63.19 |
2017-01-12 | 160.5641 | 62.61 |
2017-01-13 | 159.9809 | 62.70 |
2017-01-17 | 160.5067 | 62.53 |
In the first example, we are predicting the close price for IBM and Microsoft. In this example we try to set seq_len= 100
(sequence length) and lno= 1
(leave-N-out), using a cross-validation scheme of 10 n_windows
for error measurement.
<- spooky(time_features, n_samp = 1, seq_len = 50, n_windows = 10, dates = as.Date(rownames(time_features)), lno = 1)
example1 : 10.11 sec elapsed time
The result is a list of different components, as you can see below.
names(example1)
1] "exploration" "history" "best_model" "time_log" [
The first variable, exploration
, includes all the models evaluated during the random search within the boundaries decided by users: besides the predictions for each feature, exploration
includes the test error statistics for each time sequence (length equals to seq_len
) and model information. The second variable, history
, summarizes the hyper-parameters selected by users, the relative max error metrics across different time features (me, mae, rmsse, mpe, mape, rmae, rrmse, rame, mase, smse, sce, gmrae 3) and the weighted average rank used to sort out the best model. The last variable, best_model
, collects testing_errors
, preds
, plots
for the selected model.
The prediction is a list including the predicted results for each time-feature (quantile, min, max, mean, mode, sd, skewness, kurtosis, iqr to range, risk ratio, upside probability, divergence for each time point in the seq_len
sequence). The IQR to range is the interquartile range to the min-max range, the risk ratio is the range above median to the range below it, the upside probability is the probability of growth compared to the former point in the time sequence, the divergence is the maximum distance of cumulative normal curve of each point to the former point in the sequence.
min | 10% | 25% | 50% | 75% | 90% | max | mean | sd | mode | kurtosis | skewness | iqr_to_range | risk_ratio | upside_prob | divergence | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2022-04-05 | 123.291 | 124.218 | 127.284 | 127.630 | 129.509 | 130.075 | 130.711 | 127.933 | 1.879 | 127.974 | 3.141 | -0.870 | 0.300 | 0.710 | NA | NA |
2022-04-06 | 125.432 | 126.038 | 128.061 | 129.656 | 132.766 | 134.182 | 134.520 | 129.884 | 2.757 | 128.825 | 1.855 | 0.307 | 0.518 | 1.152 | 0.736 | 0.002 |
2022-04-07 | 122.122 | 125.304 | 127.102 | 129.556 | 131.683 | 133.359 | 134.253 | 129.367 | 3.006 | 131.076 | 2.352 | -0.408 | 0.378 | 0.632 | 0.439 | 0.079 |
2022-04-09 | 124.420 | 126.501 | 129.360 | 131.037 | 132.109 | 133.458 | 133.810 | 130.665 | 2.262 | 131.462 | 3.468 | -1.034 | 0.293 | 0.419 | 0.658 | 0.005 |
2022-04-10 | 124.202 | 127.552 | 129.701 | 130.403 | 133.145 | 136.468 | 139.255 | 130.984 | 3.143 | 130.105 | 3.036 | 0.334 | 0.229 | 1.428 | 0.522 | 0.053 |
2022-04-12 | 116.776 | 117.239 | 125.797 | 131.133 | 134.117 | 134.889 | 138.390 | 129.220 | 6.045 | 132.026 | 2.806 | -1.026 | 0.385 | 0.505 | 0.416 | 0.255 |
2022-04-13 | 116.053 | 116.383 | 124.565 | 130.119 | 135.276 | 136.724 | 138.422 | 128.558 | 6.938 | 131.590 | 2.322 | -0.670 | 0.479 | 0.590 | 0.457 | 0.062 |
2022-04-15 | 116.745 | 117.101 | 124.420 | 130.333 | 132.792 | 133.730 | 136.335 | 128.007 | 5.905 | 132.052 | 2.453 | -0.892 | 0.427 | 0.442 | 0.488 | 0.062 |
2022-04-16 | 117.312 | 117.605 | 124.523 | 131.834 | 135.699 | 137.980 | 139.263 | 129.911 | 7.181 | 135.024 | 2.028 | -0.635 | 0.509 | 0.512 | 0.570 | 0.005 |
2022-04-18 | 115.223 | 115.576 | 126.419 | 130.807 | 134.335 | 139.075 | 140.759 | 128.658 | 7.594 | 128.949 | 2.338 | -0.504 | 0.310 | 0.639 | 0.475 | 0.071 |
2022-04-19 | 115.664 | 116.105 | 125.830 | 131.873 | 138.004 | 144.732 | 146.708 | 130.285 | 9.052 | 129.497 | 2.179 | -0.122 | 0.392 | 0.915 | 0.539 | 0.010 |
2022-04-21 | 114.624 | 115.039 | 122.820 | 128.364 | 138.167 | 146.664 | 148.452 | 129.973 | 10.096 | 127.120 | 2.115 | 0.191 | 0.454 | 1.462 | 0.479 | 0.035 |
2022-04-22 | 115.823 | 116.287 | 120.958 | 126.506 | 138.906 | 152.756 | 154.379 | 130.110 | 11.487 | 122.010 | 2.522 | 0.689 | 0.466 | 2.609 | 0.504 | 0.028 |
2022-04-24 | 113.660 | 115.374 | 121.403 | 126.972 | 136.247 | 153.948 | 155.031 | 129.599 | 11.550 | 130.326 | 2.933 | 0.726 | 0.359 | 2.108 | 0.492 | 0.018 |
2022-04-25 | 115.752 | 116.243 | 121.544 | 127.314 | 136.236 | 145.722 | 146.593 | 129.349 | 9.492 | 130.122 | 2.118 | 0.287 | 0.476 | 1.667 | 0.488 | 0.053 |
2022-04-27 | 114.999 | 116.134 | 126.372 | 128.135 | 136.108 | 148.476 | 149.468 | 130.624 | 9.988 | 130.807 | 2.415 | 0.281 | 0.282 | 1.624 | 0.508 | 0.000 |
2022-04-28 | 112.883 | 115.959 | 123.957 | 124.776 | 135.817 | 139.903 | 142.994 | 127.736 | 8.203 | 121.056 | 1.783 | 0.053 | 0.394 | 1.532 | 0.418 | 0.146 |
2022-04-29 | 114.891 | 116.189 | 121.962 | 124.254 | 139.427 | 145.720 | 147.639 | 129.188 | 10.492 | 120.690 | 1.713 | 0.364 | 0.533 | 2.498 | 0.551 | 0.028 |
2022-05-01 | 109.913 | 110.997 | 122.091 | 123.697 | 139.137 | 145.209 | 146.779 | 126.925 | 11.265 | 124.611 | 1.994 | 0.241 | 0.462 | 1.674 | 0.438 | 0.087 |
2022-05-02 | 108.429 | 108.918 | 120.886 | 124.215 | 137.317 | 141.225 | 143.251 | 125.629 | 10.958 | 124.663 | 1.894 | -0.112 | 0.472 | 1.206 | 0.500 | 0.048 |
2022-05-04 | 107.332 | 107.716 | 119.847 | 125.924 | 135.959 | 140.008 | 142.550 | 124.754 | 10.634 | 123.869 | 2.026 | -0.192 | 0.458 | 0.894 | 0.459 | 0.034 |
2022-05-05 | 107.833 | 108.467 | 121.217 | 127.392 | 131.165 | 137.919 | 143.016 | 124.996 | 9.829 | 125.016 | 2.271 | -0.330 | 0.283 | 0.799 | 0.521 | 0.014 |
2022-05-07 | 107.280 | 108.062 | 118.923 | 124.399 | 130.997 | 138.203 | 140.394 | 124.131 | 10.155 | 124.228 | 1.912 | -0.244 | 0.365 | 0.934 | 0.471 | 0.037 |
2022-05-08 | 105.531 | 106.093 | 117.417 | 123.886 | 130.399 | 137.078 | 139.115 | 122.990 | 10.639 | 124.373 | 1.899 | -0.382 | 0.387 | 0.830 | 0.486 | 0.047 |
2022-05-10 | 107.562 | 109.164 | 121.019 | 130.978 | 135.424 | 143.467 | 144.917 | 127.628 | 11.799 | 137.613 | 1.836 | -0.299 | 0.386 | 0.595 | 0.603 | 0.004 |
2022-05-11 | 108.599 | 110.260 | 121.968 | 129.937 | 136.626 | 141.369 | 143.628 | 127.944 | 11.343 | 137.399 | 1.761 | -0.403 | 0.418 | 0.642 | 0.522 | 0.004 |
2022-05-13 | 108.665 | 108.969 | 123.950 | 130.296 | 137.776 | 142.943 | 144.433 | 129.147 | 11.083 | 129.750 | 2.307 | -0.659 | 0.387 | 0.654 | 0.530 | 0.001 |
2022-05-14 | 110.037 | 110.395 | 123.012 | 129.699 | 138.215 | 142.718 | 143.598 | 128.394 | 10.729 | 128.560 | 2.126 | -0.389 | 0.453 | 0.707 | 0.471 | 0.030 |
2022-05-16 | 108.876 | 109.498 | 122.273 | 129.333 | 136.187 | 141.189 | 141.876 | 126.725 | 10.773 | 135.420 | 1.975 | -0.364 | 0.422 | 0.613 | 0.453 | 0.062 |
2022-05-17 | 108.552 | 110.412 | 120.509 | 126.952 | 135.333 | 141.401 | 142.254 | 126.290 | 10.225 | 125.635 | 1.946 | -0.173 | 0.440 | 0.832 | 0.493 | 0.024 |
2022-05-19 | 108.521 | 109.124 | 121.614 | 126.610 | 135.248 | 138.392 | 139.764 | 125.740 | 10.000 | 133.624 | 2.071 | -0.434 | 0.436 | 0.727 | 0.476 | 0.023 |
2022-05-20 | 110.892 | 112.294 | 124.817 | 128.389 | 135.515 | 141.067 | 141.872 | 128.265 | 9.216 | 127.673 | 2.318 | -0.445 | 0.345 | 0.771 | 0.581 | 0.000 |
2022-05-22 | 109.897 | 111.294 | 126.043 | 127.756 | 137.501 | 139.531 | 146.453 | 128.488 | 10.359 | 128.462 | 2.325 | -0.202 | 0.313 | 1.047 | 0.518 | 0.023 |
2022-05-23 | 109.773 | 110.401 | 127.153 | 128.851 | 135.375 | 138.804 | 147.791 | 128.199 | 10.041 | 127.903 | 2.768 | -0.365 | 0.216 | 0.993 | 0.502 | 0.016 |
2022-05-24 | 108.138 | 109.786 | 124.906 | 129.702 | 140.268 | 144.722 | 149.081 | 129.683 | 12.381 | 130.612 | 2.051 | -0.307 | 0.375 | 0.899 | 0.536 | 0.024 |
2022-05-26 | 110.257 | 111.699 | 125.345 | 128.316 | 138.698 | 146.659 | 149.671 | 129.579 | 11.595 | 129.372 | 2.168 | -0.108 | 0.339 | 1.183 | 0.519 | 0.018 |
2022-05-27 | 110.915 | 112.697 | 123.521 | 128.045 | 140.102 | 146.726 | 150.733 | 129.436 | 12.028 | 120.362 | 1.947 | 0.094 | 0.416 | 1.325 | 0.468 | 0.012 |
2022-05-29 | 112.964 | 114.502 | 121.570 | 127.502 | 140.362 | 150.480 | 150.932 | 130.296 | 12.237 | 120.112 | 1.925 | 0.321 | 0.495 | 1.612 | 0.537 | 0.002 |
2022-05-30 | 113.753 | 115.900 | 119.003 | 125.171 | 141.510 | 148.352 | 149.944 | 129.289 | 12.141 | 118.785 | 1.710 | 0.421 | 0.622 | 2.170 | 0.485 | 0.033 |
2022-06-01 | 114.559 | 116.623 | 122.662 | 129.952 | 141.969 | 147.978 | 151.934 | 130.947 | 11.399 | 122.484 | 1.816 | 0.361 | 0.517 | 1.428 | 0.538 | 0.000 |
2022-06-02 | 113.436 | 117.261 | 121.820 | 128.589 | 140.516 | 147.806 | 153.463 | 130.029 | 11.602 | 121.422 | 2.084 | 0.597 | 0.467 | 1.641 | 0.478 | 0.033 |
2022-06-04 | 112.127 | 114.354 | 121.302 | 127.910 | 140.282 | 148.327 | 152.344 | 130.213 | 12.259 | 120.393 | 1.923 | 0.370 | 0.472 | 1.548 | 0.506 | 0.010 |
2022-06-05 | 114.654 | 115.648 | 123.411 | 128.819 | 144.328 | 148.091 | 154.735 | 131.665 | 12.104 | 127.914 | 1.918 | 0.337 | 0.522 | 1.830 | 0.534 | 0.003 |
2022-06-07 | 115.035 | 115.334 | 125.738 | 130.378 | 144.334 | 147.042 | 150.160 | 132.188 | 11.679 | 122.441 | 1.710 | -0.067 | 0.529 | 1.289 | 0.526 | 0.002 |
2022-06-08 | 116.846 | 119.064 | 125.014 | 134.047 | 147.141 | 149.536 | 152.449 | 134.079 | 11.259 | 131.993 | 1.656 | 0.062 | 0.621 | 1.070 | 0.530 | 0.003 |
2022-06-10 | 116.519 | 121.699 | 127.446 | 135.198 | 146.573 | 150.488 | 152.567 | 135.130 | 10.783 | 127.208 | 1.763 | 0.222 | 0.531 | 0.930 | 0.509 | 0.000 |
2022-06-11 | 115.928 | 118.430 | 123.879 | 133.948 | 140.346 | 150.998 | 153.221 | 132.994 | 11.539 | 125.937 | 1.921 | 0.285 | 0.442 | 1.070 | 0.460 | 0.081 |
2022-06-13 | 114.817 | 119.164 | 120.716 | 135.881 | 140.683 | 152.237 | 155.009 | 134.392 | 12.775 | 125.108 | 1.703 | 0.187 | 0.497 | 0.908 | 0.536 | 0.005 |
2022-06-14 | 106.221 | 114.362 | 123.434 | 134.880 | 140.841 | 156.219 | 158.294 | 134.105 | 14.002 | 130.288 | 1.963 | 0.221 | 0.334 | 0.817 | 0.480 | 0.028 |
2022-06-16 | 108.645 | 110.757 | 122.794 | 132.532 | 141.286 | 153.733 | 156.718 | 132.076 | 13.910 | 129.217 | 1.999 | 0.049 | 0.385 | 1.013 | 0.469 | 0.058 |
For each time features included in the model, you get a plot of the median with the chosen confidence interval (ci
default is 0.8). As in other packages4, we provide different stats to give a better hint on the different dynamics related to aleatoric and epistemic uncertainty.
$IBM.Close
$MSFT.Close
Now, let’s try a fast random search to optimize our estimations. The following example show how to sample 30 different models for a sequence of 50 time steps. Setting lno = NULL
starts a search between 1 and the square root of time feature length.
<- spooky(time_features, n_samp = 30, seq_len = 50, n_windows = 10, dates = as.Date(rownames(time_features)), lno = NULL)
example2 : 489.36 sec elapsed time
seq_len | lno | max_me | max_mae | max_mse | max_rmsse | max_mpe | max_mape | max_rmae | max_rrmse | max_rame | max_mase | max_smse | max_sce | max_gmrae | wgt_avg_rank | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
20 | 50 | 21 | 4.23 | 7.76 | 114.80 | 6.80 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.27 | 58.03 | 110.94 | 0.05 | 7.51 |
5 | 50 | 23 | 4.31 | 7.77 | 115.35 | 6.80 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.26 | 57.73 | 113.45 | 0.05 | 8.18 |
9 | 50 | 24 | 4.37 | 7.77 | 116.10 | 6.80 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.27 | 57.62 | 115.56 | 0.05 | 8.87 |
6 | 50 | 19 | 4.22 | 7.80 | 115.36 | 6.82 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.29 | 58.31 | 110.42 | 0.05 | 8.92 |
26 | 50 | 19 | 4.22 | 7.80 | 115.36 | 6.82 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.29 | 58.31 | 110.42 | 0.05 | 8.92 |
7 | 50 | 27 | 4.56 | 7.80 | 119.68 | 6.79 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.28 | 56.86 | 118.91 | 0.05 | 9.47 |
10 | 50 | 26 | 4.51 | 7.81 | 118.33 | 6.81 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.28 | 57.26 | 119.12 | 0.05 | 10.13 |
12 | 50 | 26 | 4.51 | 7.81 | 118.33 | 6.81 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.28 | 57.26 | 119.12 | 0.05 | 10.13 |
19 | 50 | 18 | 4.24 | 7.83 | 116.07 | 6.83 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.31 | 58.49 | 111.43 | 0.05 | 10.62 |
11 | 50 | 17 | 4.27 | 7.84 | 116.96 | 6.82 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.32 | 58.45 | 112.18 | 0.05 | 10.99 |
15 | 50 | 17 | 4.27 | 7.84 | 116.96 | 6.82 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.32 | 58.45 | 112.18 | 0.05 | 10.99 |
29 | 50 | 17 | 4.27 | 7.84 | 116.96 | 6.82 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.32 | 58.45 | 112.18 | 0.05 | 10.99 |
4 | 50 | 30 | 4.74 | 7.91 | 125.26 | 6.81 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.37 | 57.16 | 122.23 | 0.05 | 12.13 |
30 | 50 | 30 | 4.74 | 7.91 | 125.26 | 6.81 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.37 | 57.16 | 122.23 | 0.05 | 12.13 |
27 | 50 | 15 | 4.36 | 7.86 | 119.08 | 6.85 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.32 | 58.90 | 113.67 | 0.05 | 13.23 |
1 | 50 | 33 | 4.87 | 8.09 | 130.40 | 6.83 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.44 | 58.03 | 124.46 | 0.05 | 15.83 |
21 | 50 | 33 | 4.87 | 8.09 | 130.40 | 6.83 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.44 | 58.03 | 124.46 | 0.05 | 15.83 |
28 | 50 | 33 | 4.87 | 8.09 | 130.40 | 6.83 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.44 | 58.03 | 124.46 | 0.05 | 15.83 |
2 | 50 | 34 | 4.89 | 8.11 | 131.29 | 6.87 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.42 | 58.59 | 124.13 | 0.05 | 17.81 |
13 | 50 | 34 | 4.89 | 8.11 | 131.29 | 6.87 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.42 | 58.59 | 124.13 | 0.05 | 17.81 |
16 | 50 | 34 | 4.89 | 8.11 | 131.29 | 6.87 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.42 | 58.59 | 124.13 | 0.05 | 17.81 |
24 | 50 | 34 | 4.89 | 8.11 | 131.29 | 6.87 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.42 | 58.59 | 124.13 | 0.05 | 17.81 |
17 | 50 | 35 | 4.91 | 8.11 | 131.72 | 6.88 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.41 | 58.94 | 124.66 | 0.05 | 19.31 |
23 | 50 | 36 | 4.91 | 8.11 | 131.65 | 6.88 | 0.03 | 0.06 | 0.06 | 0.07 | 0.05 | 4.42 | 59.04 | 125.77 | 0.05 | 19.86 |
14 | 50 | 10 | 4.72 | 8.50 | 124.99 | 7.49 | 0.03 | 0.07 | 0.07 | 0.08 | 0.06 | 4.77 | 67.88 | 123.80 | 0.05 | 23.76 |
3 | 50 | 11 | 4.65 | 8.51 | 123.89 | 7.49 | 0.03 | 0.07 | 0.07 | 0.08 | 0.06 | 4.78 | 68.25 | 123.36 | 0.05 | 23.85 |
8 | 50 | 6 | 4.86 | 8.57 | 128.90 | 7.57 | 0.03 | 0.07 | 0.07 | 0.08 | 0.06 | 4.80 | 68.27 | 125.30 | 0.05 | 26.25 |
22 | 50 | 6 | 4.86 | 8.57 | 128.90 | 7.57 | 0.03 | 0.07 | 0.07 | 0.08 | 0.06 | 4.80 | 68.27 | 125.30 | 0.05 | 26.25 |
18 | 50 | 5 | 4.88 | 8.57 | 129.80 | 7.57 | 0.03 | 0.07 | 0.07 | 0.08 | 0.06 | 4.80 | 68.06 | 125.41 | 0.05 | 26.65 |
25 | 50 | 4 | 4.90 | 8.56 | 130.65 | 7.57 | 0.03 | 0.07 | 0.07 | 0.08 | 0.06 | 4.79 | 67.76 | 125.82 | 0.05 | 27.17 |
If we compare the error statistics from the best model in example2
with the model in example1
, we see an interesting improvement in the testing errors for IBM and Microsoft (the default error measurement for relative and scaled error is naive
). The error statistics from example1
, evaluated as the mean on 10 validation windows, are the following:
$best_model$testing_errors
example1
me mae mse rmsse mpe mape rmae rrmse rame mase smse4.283 8.575 122.207 7.580 0.032 0.067 0.068 0.080 0.062 4.793 67.670
IBM.Close 4.911 7.782 132.215 6.302 0.021 0.042 0.042 0.048 0.039 4.330 57.677
MSFT.Close
sce gmrae117.787 0.048
IBM.Close 124.995 0.034 MSFT.Close
The error statistics from example2
.
$best_model$testing_errors
example2
me mae mse rmsse mpe mape rmae rrmse rame mase smse3.478 7.763 107.472 6.804 0.027 0.061 0.062 0.073 0.051 4.266 58.027
IBM.Close 4.229 7.439 114.796 6.092 0.018 0.041 0.041 0.046 0.037 4.189 51.030
MSFT.Close
sce gmrae92.959 0.046
IBM.Close 110.937 0.033 MSFT.Close
The improvement is clear for both the time features, but we are still using a naive
approach to measure scaled and relative errors. Let’s try to shift to deviation
as scale, and average
as benchmark, that are more challenging evaluation criteria.
<- spooky(time_features, n_samp = 30, seq_len = 50, n_windows = 10, dates = as.Date(rownames(time_features)), lno = NULL, error_scale = "deviation", error_benchmark = "average")
example3 : 736.65 sec elapsed time
As you can see, the relative and scaled measures change sensibly as we raise the bar of our expectations
$best_model$testing_errors
example3
me mae mse rmsse mpe mape rmae rrmse rame mase smse3.478 7.763 107.472 6.71 0.027 0.061 1.141 1.180 1 4.151 56.380
IBM.Close 4.229 7.439 114.796 6.02 0.018 0.041 0.837 0.843 1 4.143 49.313
MSFT.Close
sce gmrae88.589 1.099
IBM.Close 109.244 0.840 MSFT.Close
You can also explore for the best possible forecasting horizon, choosing a range for seq_len
besides lno
. Let’s see the best foreseeability using the ranges lno = c(1, 50)
and seq_len = c(10, 20)
.
<- spooky(time_features, n_samp = 10, n_windows = 10, dates = as.Date(rownames(time_features)), lno = c(1, 50), seq_len = c(10, 20))
example4 : 117.86 sec elapsed time
As we can see from history, 13 days and 47 left-out are the best parameters to reduce the max joint error on the time features.
seq_len | lno | max_me | max_mae | max_mse | max_rmsse | max_mpe | max_mape | max_rmae | max_rrmse | max_rame | max_mase | max_smse | max_sce | max_gmrae | wgt_avg_rank | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
3 | 13 | 47 | -0.19 | 8.27 | 152.32 | 6.61 | 0.00 | 0.04 | 0.04 | 0.05 | 0.04 | 4.32 | 64.82 | 0.05 | 0.04 | 2.92 |
8 | 11 | 7 | -0.59 | 6.52 | 88.80 | 5.45 | 0.00 | 0.04 | 0.04 | 0.05 | 0.04 | 3.46 | 39.51 | -3.71 | 0.03 | 3.83 |
1 | 19 | 23 | 0.03 | 8.52 | 186.20 | 6.92 | 0.00 | 0.05 | 0.05 | 0.05 | 0.04 | 4.53 | 78.47 | -0.54 | 0.04 | 3.96 |
4 | 18 | 14 | -0.04 | 8.72 | 188.35 | 7.06 | 0.00 | 0.05 | 0.05 | 0.05 | 0.04 | 4.68 | 79.87 | 0.43 | 0.04 | 4.17 |
7 | 17 | 49 | -0.50 | 8.61 | 182.56 | 6.91 | 0.00 | 0.05 | 0.05 | 0.05 | 0.04 | 4.49 | 76.97 | -4.70 | 0.04 | 5.32 |
2 | 19 | 36 | -0.11 | 8.52 | 183.66 | 6.95 | -0.01 | 0.05 | 0.05 | 0.05 | 0.04 | 4.52 | 77.76 | -2.37 | 0.04 | 5.99 |
10 | 17 | 28 | -0.71 | 8.37 | 178.40 | 6.83 | -0.01 | 0.04 | 0.04 | 0.05 | 0.04 | 4.45 | 75.44 | -4.97 | 0.03 | 6.63 |
9 | 17 | 24 | -0.79 | 8.35 | 178.53 | 6.82 | -0.01 | 0.04 | 0.04 | 0.05 | 0.04 | 4.45 | 75.53 | -5.73 | 0.03 | 7.13 |
6 | 15 | 47 | -1.12 | 7.98 | 165.54 | 6.47 | -0.01 | 0.04 | 0.04 | 0.05 | 0.04 | 4.19 | 69.83 | -8.34 | 0.03 | 7.33 |
5 | 16 | 24 | -1.14 | 8.21 | 172.40 | 6.68 | -0.01 | 0.04 | 0.04 | 0.05 | 0.04 | 4.40 | 73.33 | -6.74 | 0.04 | 7.73 |
Let’s see the plots.
$IBM.Close
$MSFT.Close
The missing imputation is managed through imputeTS package. For any information: https://cran.r-project.org/web/packages/imputeTS/index.html.↩︎
In some cases, maybe you want to operate on smoothed time-features. In this case, spooky calls on fANCOVA
package. Here you can find all the latest: https://cran.r-project.org/web/packages/fANCOVA/index.html↩︎
The metrics are calculated using the greybox package. For any reference, please take a look here: https://cran.r-project.org/web/packages/greybox/index.html↩︎
Other packages focused on time feature analysis that could be of interest here:
- AUDREX, https://cran.r-project.org/web/packages/audrex/index.html
- PROTEUS, https://cran.r-project.org/web/packages/proteus/index.html
- JENGA, https://cran.r-project.org/web/packages/jenga/index.html
- TETRAGON, https://cran.r-project.org/web/packages/tetragon/index.html
↩︎