Intro to temper
“He is happy whom circumstances suit his temper; but he is more excellent who suits his temper to any circumstance.” - David Hume
“You cannot predict the outcome of human affairs. Temper and unpredictability are their nature.” - Virginia Woolf
“Temper is the one thing you can’t get rid of by losing it.” - Jack Nicholson
What is temper
temper (Temporal Encoder–Masked Probabilistic Ensemble Regressor) is a machine learning algorithm for forecasting univariate time series with full uncertainty quantification. Instead of predicting only future values, temper estimates their entire probability distributions across multiple forecast horizons.
It works by combining a temporal autoencoder to compress historical patterns, a masked neural decision forest to generate diverse forecast samples, and a Gaussian mixture model to fit smooth predictive distributions. This design captures nonlinear dynamics, handles missing or noisy data robustly, and produces interpretable and calibrated uncertainty estimates.
By blending representation learning with probabilistic modeling, temper offers a versatile forecasting engine that is particularly valuable in domains where understanding forecast confidence is as important as the predictions themselves.
The analytical process implemented with temper
The temper algorithm follows a structured pipeline that transforms a univariate time series into full probabilistic forecasts. Each step plays a distinct role in enabling accurate, uncertainty-aware predictions:
Input preprocessing: The input is a numeric series of levels. Missing values are automatically filled using Kalman smoothing via imputeTS::na_kalman(), ensuring continuity. The series is then transformed into scaled differences (level changes divided by the previous value) to stabilize dynamics while preserving level interpretability.

Reframing into supervised format: Using a sliding window, the series is reframed into overlapping samples. Each sample consists of past consecutive scaled differences as inputs and future scaled differences as targets. This transforms the time series into a supervised dataset, enabling model training with input-output pairs; a sketch of these two steps follows below.
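To make these first two steps concrete, here is a minimal base-R sketch; the series x and the windowing code are illustrative stand-ins, not temper's internal implementation:

# Illustrative preprocessing and reframing (not temper's internal code)
set.seed(1)
x <- cumsum(rnorm(500, 0.1)) + 100          # a stand-in numeric series of levels
d <- diff(x) / head(x, -1)                  # scaled differences: (x_t - x_{t-1}) / x_{t-1}

past <- 100; future <- 100
n <- length(d) - past - future + 1          # number of sliding-window samples
inputs  <- t(sapply(seq_len(n), function(i) d[i:(i + past - 1)]))
targets <- t(sapply(seq_len(n), function(i) d[(i + past):(i + past + future - 1)]))
dim(inputs)   # n samples x past features
dim(targets)  # n samples x future targets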
Latent encoding via autoencoder: Each input window is passed through a neural autoencoder, implemented using torch, which learns a compressed representation (latent_dim) of temporal dynamics. The encoder captures the most informative structure in a low-dimensional latent space, while the decoder attempts to reconstruct the original input. This reconstruction loss (mean squared error) acts as a regularizer, ensuring the latent code preserves important information.
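The snippet below sketches this encoder-decoder idea in torch for R, with assumed hidden sizes and the past = 100, latent_dim = 10 values used later in the Quick Start example; temper's actual internal architecture may differ:

library(torch)

# Minimal autoencoder sketch; layer sizes are assumptions, not temper's internals
autoencoder <- nn_module(
  initialize = function(past = 100, latent_dim = 10) {
    self$encoder <- nn_sequential(
      nn_linear(past, 64), nn_relu(),
      nn_linear(64, latent_dim)
    )
    self$decoder <- nn_sequential(
      nn_linear(latent_dim, 64), nn_relu(),
      nn_linear(64, past)
    )
  },
  forward = function(x) {
    z <- self$encoder(x)       # compressed latent code
    x_hat <- self$decoder(z)   # reconstruction of the input window
    list(latent = z, recon = x_hat)
  }
)

model <- autoencoder()
x <- torch_randn(32, 100)                # a batch of 32 input windows
out <- model(x)
recon_loss <- nnf_mse_loss(out$recon, x) # the MSE regularizer described above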
Probabilistic masking and ensemble prediction: The latent representation is passed through a differentiable forest, an ensemble of soft decision trees trained via torch. Each tree includes a probabilistic feature mask, implemented using Gumbel–Softmax sampling; this learnable mask decides which latent features are retained or dropped during training. The trees jointly produce diverse forecast samples (scaled differences) across multiple future steps, exploiting ensemble variance for uncertainty.
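As a rough illustration of the masking mechanism, the snippet below samples a differentiable keep/drop mask with the binary Gumbel–Softmax (concrete) relaxation; gumbel_mask, tau, and the dimensions are hypothetical, not temper's internals:

library(torch)

# Hypothetical binary Gumbel-Softmax mask: one learnable keep/drop logit per latent feature
gumbel_mask <- function(logits, tau = 1) {
  u <- torch_rand_like(logits)
  noise <- torch_log(u) - torch_log(1 - u) # logistic noise (difference of two Gumbels)
  torch_sigmoid((logits + noise) / tau)    # relaxed, differentiable keep probabilities
}

logits <- torch_zeros(10, requires_grad = TRUE)  # 10 latent features, as in latent_dim = 10
z <- torch_randn(32, 10)                         # a batch of latent codes
z_masked <- z * gumbel_mask(logits)$unsqueeze(1) # soft feature selection for one tree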
Inverse transformation to level space: Forecasts in scaled-difference space are converted back to levels by reversing the differencing procedure, cumulatively applying the predicted scaled changes to the most recent observed value. This ensures the outputs remain on the same scale as the original series.
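For example, with scaled differences d_t = (x_t - x_{t-1}) / x_{t-1}, the reverse mapping is a cumulative product (the values below are made up):

last_level <- 430                        # most recent observed level (illustrative)
path <- c(0.010, -0.005, 0.020)          # one sampled path of scaled differences
levels <- last_level * cumprod(1 + path) # back to the original scale
levels
[1] 434.3000 432.1285 440.7711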
Gaussian mixture smoothing: For each forecast horizon, the multiple sampled outcomes are post-processed via a Gaussian mixture model (GMix) using stats::kmeans() and custom density estimation routines. A candidate number of components is set, K-means clustering identifies the group structure, and means, variances, and weights are estimated from the clusters. The result is a smooth, interpretable, and reusable predictive distribution for each forecasted step.
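A hand-rolled approximation of this smoothing step might look as follows; the component count k and the code around stats::kmeans() are assumptions, since temper's density estimation routines are internal:

set.seed(123)
samples <- rnorm(1000, mean = 500, sd = 30)  # sampled outcomes at one horizon

k <- 3                                       # candidate number of components
km <- stats::kmeans(samples, centers = k)
weights <- tabulate(km$cluster, nbins = k) / length(samples)
means <- as.vector(km$centers)
sds <- sqrt(tapply(samples, km$cluster, var))

# Smooth mixture density built from the estimated components
dmix <- function(x) sapply(x, function(xi) sum(weights * dnorm(xi, means, sds)))
curve(dmix(x), from = 400, to = 600)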
This modular pipeline makes temper resilient to noise, expressive in uncertainty modeling, and flexible enough for downstream applications.
Output Construction and Visualization
Once all steps are completed, the algorithm returns a structured output object that includes:
- pred_funs: A list of empirical distribution functions (pfun, qfun, rfun, and dfun) for each forecast step, allowing for quantile estimation, sampling, and density evaluation.
- plot: A ready-to-publish fan chart implemented with ggplot2, displaying the original time series, the median forecast, and a predictive interval (typically 90%). This visualization is ideal for presentations, reporting, or exploratory analysis.
- loss: A ggplot object showing the evolution of the training and validation loss (CRPS) over epochs. This aids in understanding convergence behavior and diagnosing overfitting or underfitting.
- time_log: A lubridate::period object tracking the elapsed training time, offering transparency and aiding reproducibility.
Quick Start Example
Here’s a minimal example to get started with temper() in just a few lines:

library(temper)

fit <- temper(
  ts = dummy_set$MSFT.Close,
  future = 100,
  past = 100,
  latent_dim = 10,
  n_trees = 100,
  depth = 8,
  epochs = 30,
  seed = 123,
  verbose = FALSE
)

fit$plot # Visualize fan chart
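Once fitted, the output components described in the previous section can be inspected directly (the t1 accessor below assumes horizons are named t1, t2, ..., consistent with the t50 and t100 accessors used later):

fit$loss                    # ggplot of training and validation CRPS across epochs
fit$time_log                # lubridate period with the elapsed training time
fit$pred_funs$t1$qfun(0.5)  # median of the one-step-ahead predictive distribution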
How to Forecast with temper: A Practical Example
Forecast Distributions, Quantiles, and Expectation
Once you’ve trained a temper model, the output includes a full predictive distribution at each forecast horizon. This allows you to:
- Sample future scenarios with rfun()
- Compute quantiles using qfun()
- Evaluate cumulative probabilities with pfun()
- Access pointwise densities using dfun()
Example 1: Plotting the Forecast Distribution at Horizon 100
samples <- fit$pred_funs$t100$rfun(1000)
plot(density(samples), main = "Forecast Distribution at t100", xlab = "Predicted Value", ylab = "Density")
Example 2: Growth Probability at Horizons 50 and 100
# Probability that the series will increase
1 - fit$pred_funs$t50$pfun(tail(dummy_set$MSFT.Close, 1)) # t+50
[1] 0.4792041
1 - fit$pred_funs$t100$pfun(tail(dummy_set$MSFT.Close, 1)) # t+100
[1] 0.6070789
Example 3: Forecast Quantiles at Horizon 10
fit$pred_funs$t10$qfun(c(0.1, 0.25, 0.5, 0.75, 0.9))
[1] 448.7369 465.3088 487.9334 512.1540 527.4680
Example 4: Expected Value via Sampling and Integration at Horizon 100
# Sampling-based expectation
mean(fit$pred_funs$t100$rfun(100000))
[1] 529.4935
# Numerical integration
q0 <- fit$pred_funs$t100$qfun(0.000001)
q1 <- fit$pred_funs$t100$qfun(0.999999)
norm_base <- integrate(function(x) fit$pred_funs$t100$dfun(x), q0, q1)$value
expectation <- integrate(function(x) fit$pred_funs$t100$dfun(x) * x, q0, q1)$value / norm_base
expectation
[1] 529.3906
Final Thoughts
temper is a flexible, fully probabilistic forecasting model that learns from volatility, embraces unpredictability, and, unlike most tempers, never snaps under pressure. So yes, use it for forecasting markets, demand, or the next time your CRAN submission will pass on the first try. Spoiler: it won’t.
(Yes, it is a joke).
Enzoi!