“Those who have knowledge don’t predict. Those who predict don’t have knowledge.” - Lao Tzu
“If you do not expect the unexpected, you will not find it.” - Heraclitus
“Prediction is very difficult, especially if it’s about the future.” - Niels Bohr
“I never think of the future—it comes soon enough.” - Albert Einstein
xpect the unexpected with Probabilistic Forecasting
The xpect function brings together the power of gradient-boosted regression (via XGBoost [1]) and conformal inference to deliver probabilistic forecasts. Instead of providing only point estimates, xpect constructs full predictive distributions that capture uncertainty in future time series values.
The xpect package works through the following steps:
Data Preparation & Feature Engineering
The process begins by transforming the raw time series into scaled returns, which normalizes the data to reduce non-stationarity and better expose underlying dynamics. Next, xpect reframes these scaled returns into structured matrices (based on past and future variables), creating input features suitable for predictive modeling. To enhance the quality and interpretability of these features, a Singular Value Decomposition (SVD) [2] is applied, retaining a specified fraction of total variance defined by the coverage parameter. This step effectively reduces dimensionality, filters noise, and highlights the most informative components in the data, ultimately improving forecasting accuracy and model stability.
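As a rough illustration of this preprocessing idea (a sketch only, not the package's internal code; the helper name make_features and its defaults are invented for the example), the following builds scaled returns, reframes them into a lagged matrix, and keeps just enough SVD components to reach a given variance coverage:

```r
# Illustrative sketch of the preprocessing step (not xpect's internal code):
# scaled returns -> lagged feature matrix -> SVD truncated by variance coverage.
make_features <- function(y, past = 10, coverage = 0.5) {
  # Scaled returns: change relative to the previous value
  ret <- diff(y) / head(y, -1)

  # Reframe the returns into a matrix with `past` lagged columns
  n <- length(ret) - past + 1
  X <- t(sapply(seq_len(n), function(i) ret[i:(i + past - 1)]))

  # Keep the smallest number of SVD components explaining `coverage` of the variance
  Xc <- scale(X, scale = FALSE)
  s  <- svd(Xc)
  k  <- which(cumsum(s$d^2) / sum(s$d^2) >= coverage)[1]

  # Project onto the retained components: lower-dimensional, denoised features
  Xc %*% s$v[, seq_len(k), drop = FALSE]
}

y <- cumsum(rnorm(200)) + 50                         # toy series
features <- make_features(y, past = 10, coverage = 0.5)
dim(features)                                        # rows = time windows, cols = k
```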
Modeling with Conformal Inference and Mixture of Distributions
An XGBoost model is trained on the prepared dataset, providing initial predictions for future time steps. These predictions are then calibrated through conformal inference [3], using a dedicated calibration set to estimate prediction residuals. After calibration, a mixture [4] model approach fits several candidate statistical distributions—including the Generalized Normal, Generalized Logistic, and Generalized Lambda, among others—to the calibrated predictions, effectively capturing complex uncertainties in the forecasts. The resulting ensemble of fitted distributions is combined into an empirical distribution function [5] using the edfun package, ultimately generating a suite of comprehensive predictive functions for each point in the forecasted horizon (future); a small sketch of this final step follows the list below:
rfun: Generates random samples from the forecast distribution.
dfun: Computes the probability density function (PDF).
pfun: Returns the cumulative distribution function (CDF).
qfun: Provides the inverse CDF (quantile function).
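As a loose sketch of this last step (with plain Normal and Logistic stand-ins for the generalized families xpect actually fits, and an equal-weight mixture assumed), samples from the fitted candidates can be pooled and wrapped into an empirical distribution via edfun:

```r
library(edfun)

# Stand-in for the calibrated predictions of one forecast point
set.seed(1)
calibrated <- rnorm(500, mean = 12, sd = 1.5)

# Pretend two candidate distributions were fitted to these values
# (xpect fits several generalized families; Normal/Logistic are placeholders)
samp_norm  <- rnorm(2000, mean(calibrated), sd(calibrated))
samp_logis <- rlogis(2000, location = median(calibrated),
                     scale = sd(calibrated) * sqrt(3) / pi)

# Equal-weight mixture by pooling the samples, wrapped into d/p/q/r functions
mix <- edfun(c(samp_norm, samp_logis))

mix$qfun(c(0.025, 0.975))   # quantile function (inverse CDF)
mix$pfun(12)                # cumulative distribution function
mix$rfun(5)                 # random samples
```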
Hyperparameter Optimization
xpect offers flexible hyperparameter optimization strategies, allowing users to tailor model tuning to the complexity of their data and available computational resources:
random_search: Randomly selects n_samples configurations from predefined or custom hyperparameter ranges, providing broad coverage of the parameter space with minimal assumptions.
coarse_to_fine: Starts with a broad random search of n_samples configurations and progressively narrows the hyperparameter space through a top_k selection across multiple iterative n_phases, systematically refining toward optimal configurations.
bayesian: Employs Bayesian optimization [6], intelligently balancing exploration and exploitation through probabilistic modeling. This method efficiently guides the search toward promising configurations, minimizing the number of evaluations required.
These methods empower users to select an optimization approach aligned with their analytical objectives, computational constraints, and data characteristics.
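For completeness, here is a hedged sketch of how the other two strategies might be invoked. The toy data frame and all argument values are illustrative, and the argument names top_k, n_phases, and n_exploration are assumed from the descriptions above and in the footnotes rather than taken from the package reference:

```r
library(xpect)

# Toy data, mirroring the practical example below
set.seed(42)
df <- data.frame(target_series = cumsum(rnorm(100)) + 30,
                 predictor1    = cumsum(rnorm(100)) + 30)

# Coarse-to-fine: broad random search, then iterative narrowing around the top_k models
res_ctf <- xpect(predictors = df, target = "target_series", future = 5,
                 search = "coarse_to_fine",
                 n_samples = 10, top_k = 3, n_phases = 3, seed = 123)

# Bayesian optimization: exploratory draws followed by guided sampling
res_bayes <- xpect(predictors = df, target = "target_series", future = 5,
                   search = "bayesian",
                   n_samples = 10, n_exploration = 5, seed = 123)
```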
What do you expect? A Practical Example
In the following example, we apply xpect to a synthetic dataset and forecast five future time steps using the random_search strategy.
# Load required libraries (assumes xpect and dependencies are installed)
library(xpect)
library(ggplot2)

# Create a dummy predictors dataset with sufficient observations
set.seed(42)
df <- data.frame(
  target_series = tail(RandomWalker::rw30()$y, 100) + 30,
  predictor1    = tail(RandomWalker::rw30()$y, 100) + 30
)

# Execute xpect with random search method
results <- xpect(
  predictors = df,
  target = "target_series",
  future = 5,
  past = NULL,                                  # STANDARD RANGE
  max_depth = c(3:8),                           # CUSTOM RANGE
  eta = seq(0.01, 0.3, length.out = 100),       # CUSTOM RANGE
  alpha = 0.5,                                  # FIXED VALUE
  lambda = seq(0.01, 0.1, length.out = 100),    # CUSTOM RANGE
  gamma = NULL,                                 # STANDARD RANGE
  search = "random_search",
  n_samples = 5,
  seed = 123
)

# Display the computational time and forecast plot
results$time_log
[1] "1M 24S"
results$plot
The result is a list containing:
history: A data frame logging each evaluated hyperparameter configuration along with its performance.
best_model: The optimal forecasting model with associated predictive functions (rfun, dfun, pfun, qfun).
best_params: A summary of the hyperparameters selected for the best-performing model.
plot: A ggplot visualization that overlays the forecast with its uncertainty intervals.
time_log: The duration of the complete model-building and optimization process.
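To get a quick overview of these components before digging in, a plain str call on the returned list is enough:

```r
# Top-level overview of the returned list
# (history, best_model, best_params, plot, time_log)
str(results, max.level = 1)
```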
Let’s inspect the structure of history:
knitr::kable(results$history, caption = "Hyperparameters from the random search exploration, including entropy evaluation")
Table: Hyperparameters from the random search exploration, including entropy evaluation

| past | coverage | max_depth |    eta |  gamma | alpha | lambda | subsample | colsample_bytree | entropy |
|-----:|---------:|----------:|-------:|-------:|------:|-------:|----------:|-----------------:|--------:|
|   15 |      0.5 |         4 | 0.0481 | 0.4505 |   0.5 | 0.0991 |       0.8 |              0.8 |  1.2366 |
|   19 |      0.5 |         8 | 0.0803 | 4.7648 |   0.5 | 0.0745 |       0.8 |              0.8 |  1.2038 |
|   14 |      0.5 |         5 | 0.2707 | 1.7367 |   0.5 | 0.0327 |       0.8 |              0.8 |  1.0961 |
|    3 |      0.5 |         7 | 0.2736 | 3.2432 |   0.5 | 0.0155 |       0.8 |              0.8 |  1.0844 |
|   10 |      0.5 |         6 | 0.2092 | 4.9449 |   0.5 | 0.0473 |       0.8 |              0.8 |  1.1146 |
Models in the history are evaluated using entropy because entropy provides a quantitative measure of the uncertainty in their predictive distributions. Lower entropy indicates that a model’s forecasts are sharper and more confident, suggesting it better captures the underlying dynamics of the data. By ranking models based on their entropy, xpect can objectively select the configuration that minimizes uncertainty and delivers more informative and reliable forecasts.
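The package's exact entropy computation is not shown here, but a common way to approximate the differential entropy of a predictive distribution from its density and sampler is a Monte Carlo estimate of -E[log f(X)]; the helper below is only meant to illustrate the criterion:

```r
# Monte Carlo estimate of the differential entropy H = -E[log f(X)]
# for one forecast point, using its sampler (rfun) and density (dfun).
# Illustrative only: the package's internal calculation may differ.
entropy_mc <- function(dist, n = 10000) {
  x <- dist$rfun(n)
  -mean(log(pmax(dist$dfun(x), .Machine$double.eps)))
}

entropy_mc(results$best_model$t5)
```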
Let’s inspect the structure of the best parameters:
knitr::kable(round(results$best_params, 3), caption = "Model with the lowest entropy")
Table: Model with the lowest entropy

|   | past | coverage | max_depth |   eta | gamma | alpha | lambda | subsample | colsample_bytree | entropy |
|---|-----:|---------:|----------:|------:|------:|------:|-------:|----------:|-----------------:|--------:|
| 4 |    3 |      0.5 |         7 | 0.274 | 3.243 |   0.5 |  0.016 |       0.8 |              0.8 |   1.084 |
And explore the predictive functions for the last forecast point (t5):
# List available predictive functions
names(results$best_model)
[1] "t1" "t2" "t3" "t4" "t5"
# Examine the prediction function for the 5th time step
results$best_model$t5
Using these functions for each time point, you can compute confidence intervals, derive quantiles, and even generate random samples from the forecasted distribution. For example:
# Calculate the 95% confidence interval for T5 predictions
results$best_model$t5$qfun(c(0.025, 0.975))
[1] 10.81514 16.01988
# Generate 1000 random forecast samples for T4 and compute their mean
mean(results$best_model$t4$rfun(1000))
[1] 13.79569
# Probability of growth over the horizon
1 - results$best_model$t5$pfun(tail(df$target_series, 1))
[1] 0.459
# Probability of a value between 15 and 20 at T2
results$best_model$t2$pfun(20) - results$best_model$t2$pfun(15)
[1] 0.062
# Odds for value 15 over value 13 at T5
results$best_model$t5$dfun(15) / results$best_model$t5$dfun(13)
[1] 0.8133226
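Since every horizon point exposes the same four functions, the whole forecast path can be summarized in one pass, for example medians with an 80% band via qfun:

```r
# Median and 80% interval for each step of the forecast horizon (t1..t5)
band <- t(sapply(results$best_model, function(dist) dist$qfun(c(0.10, 0.50, 0.90))))
colnames(band) <- c("q10", "median", "q90")
band
```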
Final Thoughts
Never forget—forecasting is a notoriously tricky endeavor. The knowledgeable refrain from making predictions, the wise embrace uncertainty, physicists insist it’s an impossible task, and as for Albert… well, he doesn’t even bother trying. So, in summary: the future’s a prankster, and we’re all just trying to guess the punchline.
Enzoi!
Footnotes
1. XGBoost is a widely used gradient boosting library designed for efficiency, flexibility, and portability. For a comprehensive overview, please refer to the XGBoost documentation.
2. Singular Value Decomposition (SVD) is a matrix factorization technique widely used for dimensionality reduction and noise filtering. For a primer on the math, take a look here.
3. Conformal prediction is a statistical framework that provides reliable uncertainty estimates by constructing predictive intervals with guaranteed coverage under minimal assumptions. In xpect, conformal inference is used to calibrate the raw forecasts by leveraging residuals from a held-out calibration set. For further reading, take a look here.
4. The mixture function in xpect captures forecast uncertainty by fitting multiple probability distributions—including Generalized Normal, Generalized Logistic, Generalized Extreme Value, Generalized Lambda, Asymmetric Laplace, and Generalized Hyperbolic—to the calibrated predictions. These fitted distributions are then combined into a composite model, which is transformed into an empirical distribution function (EDF) using edfun. This ensures that the predictive functions (rfun, dfun, pfun, qfun) accurately reflect uncertainty across each point in the time horizon.
5. edfun is an R package that facilitates the creation of empirical distribution functions (EDFs), enabling non-parametric estimation of cumulative distribution functions and quantiles. More details can be found on its CRAN page.
6. Bayesian optimization is employed to fine-tune hyperparameters by defining a custom objective that minimizes the mean entropy of the predictive functions estimated with xpect. Key driving variables are n_samples and n_exploration. For more details, refer to the rBayesianOptimization package documentation.