Intro to chopper

Author

Giancarlo Vercellino

Published

May 25, 2025

“The measure of intelligence is the ability to change.” — Albert Einstein

“Divide each difficulty into as many parts as is feasible and necessary to resolve it.” — René Descartes

“The secret of getting started is breaking your complex overwhelming tasks into small manageable tasks, and starting on the first one.” — Mark Twain

What is chopper?

Traditional univariate forecasting assumes a single, stationary data‑generating process. Yet real‑world series—energy demand, e‑commerce traffic, stock prices—often jump between regimes where level, variance or trend change abruptly. chopper embraces this non‑stationarity: it first chops the timeline at statistically significant breaks, then fits a small committee of lightweight probabilistic models on each slice. Within a slice, the models forecast from their local context and their densities are combined via a product‑of‑experts rule—areas of agreement tighten the distribution; disagreement widens it. The slice‑level forecasts are then fused, with a decaying weight (decay) emphasising the most recent regime, while the alt_mode switch lets you decide whether knowledge should accumulate across cuts (cumulative) or stay confined within independent segments. The result is a forecast that responds quickly to structural shifts while still delivering a full uncertainty profile.

Algorithm at a Glance

✅ Data Preprocessing and Initialization

The algorithm begins by preparing the input time series for analysis. Any missing values (NAs) in the input vector are imputed using a Kalman smoother (na_kalman) to ensure continuity in the data. This is a crucial step for ensuring that all subsequent operations—particularly changepoint detection and model fitting—operate on a clean, numeric vector without interruptions. The time at the beginning of the function is also recorded to compute the total runtime later on. This setup guarantees that all internal computations have a stable foundation and makes the model resilient to missing or imperfect data, which is common in real-world time series.
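To make this step concrete, here is a minimal sketch of how the preprocessing could look, assuming the Kalman imputation comes from the imputeTS package; the function and variable names below are illustrative, not chopper's internals.

# Minimal preprocessing sketch (assumes imputeTS; chopper's internals may differ)
library(imputeTS)

prepare_series <- function(x) {
  x <- as.numeric(x)              # work on a plain numeric vector
  if (anyNA(x)) {
    x <- na_kalman(x)             # impute missing values with a Kalman smoother
  }
  x
}

start_time <- Sys.time()          # recorded to report the total runtime at the end
clean <- prepare_series(c(10, NA, 12, 13, NA, 15))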

🔀 Changepoint Detection on Transformed Series

To identify structural breaks in the time series, the data is first transformed using a log-return–like transformation via the function dts. This transformation computes the relative changes between consecutive values, which stabilizes the variance and helps emphasize shifts in mean or volatility. The changepoint::cpt.meanvar function is then applied to this transformed series to perform offline changepoint detection. This algorithm detects positions in the series where statistical properties—specifically both the mean and variance—change significantly. The result is a series of index positions representing the endpoints of statistically homogeneous segments. These breakpoints are interpreted as regime changes, after which the statistical behavior of the series is likely different. The segments demarcated by these changepoints form the modeling units in the next stage.
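The sketch below illustrates this step on a simulated two-regime series. It assumes that dts computes the relative change between consecutive values; the exact transformation is internal to chopper.

# Changepoint detection sketch on a toy two-regime series
library(changepoint)

set.seed(42)
y <- c(rnorm(120, mean = 100, sd = 2), rnorm(120, mean = 130, sd = 6))

dts_sketch <- function(x) diff(x) / head(x, -1)       # log-return-like relative changes
transformed <- dts_sketch(y)

cpt_fit <- cpt.meanvar(transformed, method = "PELT")  # detect shifts in mean and variance
breaks <- cpts(cpt_fit)                               # endpoints of homogeneous segments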

🗒 Segment Definition: Cumulative vs. Independent

Once the changepoints are identified, the algorithm determines how to define the modeling windows. If the alt_mode flag is set to FALSE (default), then each segment includes all observations from the start of the segment up to the end of the full series. This approach treats each new segment as an extension of the previous one, accumulating more information over time. In contrast, if alt_mode = TRUE, each segment is treated independently and includes only the values strictly within that changepoint interval. This distinction allows users to choose between cumulative modeling (which may enhance stability and context) or independent modeling (which better isolates local structural characteristics). This flexibility is useful when dealing with time series that undergo non-recurring shifts in behavior.
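Continuing from the changepoint sketch above, the two windowing schemes could be expressed roughly as follows; this is illustrative code, not chopper's actual segment construction.

# Cumulative vs. independent windows (illustrative; uses y and breaks from above)
alt_mode_flag <- FALSE                    # FALSE: cumulative (default); TRUE: independent
bounds <- c(0, breaks, length(y))

segments <- lapply(seq_len(length(bounds) - 1), function(i) {
  if (alt_mode_flag) {
    y[(bounds[i] + 1):bounds[i + 1]]      # independent: only the values within the interval
  } else {
    y[(bounds[i] + 1):length(y)]          # cumulative: from segment start to end of series
  }
})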

🤖 Forecasting Each Segment via a Product of Experts

For each defined segment, the algorithm fits three distinct probabilistic models:

  • TBATS: A state-space model well-suited to data with multiple or complex seasonal patterns.

  • Theta: A decomposition method that enhances trend extrapolation through curvature adjustments.

  • ARFIMA: A long-memory generalization of ARIMA that captures slowly decaying, long-range dependence in the series.

Each model generates forecasts over the desired horizon, using a Normal distribution for each time step. These Normal densities are then combined using the Product of Experts (PoE) principle: the three densities are multiplied pointwise and normalized over a shared evaluation grid. This multiplication rewards consensus between models and down-weights outcomes where the models disagree, effectively capturing epistemic uncertainty. The result is a more confident and robust predictive density at each forecast step, conditional on the segment.
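As a rough sketch of this step, the snippet below fits the three models on a toy segment with the forecast package and multiplies their Normal densities on a shared grid. Recovering the standard deviations from the 95% prediction intervals is an assumption about how the experts are parameterized, not a description of chopper's code.

# Product-of-experts sketch for a single horizon step (illustrative)
library(forecast)

set.seed(123)
seg <- 100 + cumsum(rnorm(150, 0.2, 2))   # stand-in for one segment
h <- 5

fits <- list(
  tbats  = forecast(tbats(seg), h = h, level = 95),
  theta  = thetaf(seg, h = h, level = 95),
  arfima = forecast(arfima(seg), h = h, level = 95)
)

# Normal parameters of each expert at the first forecast step
params <- lapply(fits, function(fc) {
  mu    <- as.numeric(fc$mean[1])
  sigma <- as.numeric(fc$upper[1, 1] - fc$mean[1]) / qnorm(0.975)
  c(mu = mu, sigma = sigma)
})

# Multiply the densities pointwise on a shared grid, then renormalize
grid <- seq(min(seg) - 20, max(seg) + 20, length.out = 1000)
poe  <- Reduce(`*`, lapply(params, function(p) dnorm(grid, p["mu"], p["sigma"])))
poe  <- poe / sum(poe * diff(grid)[1])    # density integrating (approximately) to 1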

⚖️ Segment Forecast Fusion with Softmax Weighting

After each segment’s forecast distributions are built, the next step is to fuse them into a unified prediction. Since more recent segments may better represent the current regime, the algorithm assigns each segment an exponentially decaying score, controlled by decay, and normalizes the scores with a softmax so that the weights sum to 1. If the reverse flag is set to TRUE, the weights are reversed, emphasizing earlier (older) segments instead. This design choice allows the user to steer the balance between responsiveness and stability, depending on the application context.
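A minimal sketch of the weighting follows, assuming a simple linear score per segment scaled by decay before the softmax; the exact scores used internally may differ.

# Softmax weights over segments, skewed toward the most recent regime (assumed form)
segment_weights <- function(n_seg, decay = 0.5, reverse = FALSE) {
  scores <- decay * seq_len(n_seg)      # later segments receive larger scores
  w <- exp(scores - max(scores))        # numerically stable softmax
  w <- w / sum(w)
  if (reverse) w <- rev(w)              # emphasize the earliest segments instead
  w
}

segment_weights(4, decay = 0.7)         # four weights summing to 1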

🧰 Aggregation of Segment Distributions

The algorithm then aggregates the individual forecast distributions from all segments into one final probabilistic forecast per time step. For each horizon step, a common support grid is constructed by combining the sampled values from all segments. The densities from each segment are then evaluated on this grid, and the resulting values are combined using the segment-specific weights. A weighted sum of the densities is computed, and a final empirical forecast distribution is derived by sampling from this combined density. This distribution reflects both aleatoric uncertainty (inherent randomness within each segment) and epistemic uncertainty (variation across models and regimes).
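The mixture step could look roughly like this, reusing segment_weights() from the previous sketch and three toy sets of segment draws; names and values are illustrative only.

# Fusing per-segment densities into one forecast distribution for a single step
set.seed(7)
seg_samples <- list(rnorm(1000, 284, 5), rnorm(1000, 286, 8), rnorm(1000, 285, 4))
w <- segment_weights(length(seg_samples), decay = 0.5)

grid <- seq(min(unlist(seg_samples)), max(unlist(seg_samples)), length.out = 1000)
dens <- sapply(seg_samples, function(s) {
  d <- density(s)
  approx(d$x, d$y, xout = grid, yleft = 0, yright = 0)$y
})

mixture     <- as.vector(dens %*% w)                      # weighted sum of densities
final_draws <- sample(grid, 10000, replace = TRUE,
                      prob = mixture / sum(mixture))      # empirical forecast sample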

📊 Output Construction and Visualization

Once all steps are completed, the algorithm returns a structured output object that includes:

  • pred_funs: A named list of empirical distributions (for each forecast horizon t1, t2, …) that expose four probabilistic functions: rfun (random sampling), dfun (density), pfun (CDF), and qfun (quantile); one possible construction is sketched after this list.

  • plot: A ggplot2 object visualizing the historical data, the median forecast trajectory, and a confidence ribbon that reflects forecast uncertainty.

  • time_log: The total computation time formatted as a human-readable duration.
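The four functions per horizon step can be thought of as empirical wrappers around the fused draws. A possible construction (an assumption, not chopper's actual code) is shown below, using final_draws from the aggregation sketch.

# Building rfun/dfun/pfun/qfun from a vector of draws (assumed construction)
make_pred_funs <- function(draws) {
  d <- density(draws)
  list(
    rfun = function(n) sample(draws, n, replace = TRUE),       # random sampling
    dfun = approxfun(d$x, d$y, yleft = 0, yright = 0),         # density
    pfun = ecdf(draws),                                        # cumulative distribution
    qfun = function(p) quantile(draws, p, names = FALSE)       # quantiles
  )
}

t5_funs <- make_pred_funs(final_draws)
t5_funs$qfun(c(0.1, 0.5, 0.9))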

Quick Start Example

Now, let’s see how to use chopper with a practical example. The snippet below shows the essential function call with the main elements you need to control:

# Basic example with the TSLA time series
library(chopper)
ts <- ts_set$TSLA.Close
model <- chopper(ts, horizon = 100, decay = 0.5, alt_mode = FALSE, reverse = FALSE)

model$plot

How to Chop and Chunk: A Practical Example

Parameters That Matter Most

  • decay: Determines how quickly past segments lose relevance. A high decay means recent data dominates; low decay lets older regimes speak longer. Ideal for tuning based on your data’s memory depth.

  • alt_mode: Logical flag that switches between cumulative (default) and independent segment modeling. Use TRUE when past segments should have less influence on the current forecast.

  • reverse: If TRUE, flips the decay weights, emphasizing early segments. Useful in cases where older patterns are more persistent or cyclic.

Example 1: High Decay, Independent Segments

# High decay, independent segments

forecasted <- chopper(ts, horizon = 5, decay = 0.7, alt_mode = TRUE, reverse = FALSE)

plot(density(forecasted$pred_funs$t5$rfun(1000)), main = "Predicted Price at t5: high decay and independent composition")

Example 2: Low Decay, Cumulative Segments

alt_forecasted <- chopper(ts, horizon = 5, decay = 0.1, alt_mode = FALSE, reverse = FALSE)
plot(density(alt_forecasted$pred_funs$t5$rfun(1000)), main = "Predicted Price at t5: low decay and cumulative composition")

As you can see, the two settings mainly differ in the tail behaviour of the predictive distribution, while the central statistics usually change only to a limited extent. Let’s compare a few statistics from the two distributions.

1 - forecasted$pred_funs$t1$pfun(tail(ts, 1)) ### probability of an upward move at t1
[1] 0.4473279
1 - alt_forecasted$pred_funs$t5$pfun(tail(ts, 1)) ### probability of an upward move at t5
[1] 0.4044444

Let’s have a look at the quantiles for t3.

forecasted$pred_funs$t3$qfun(c(0.1, 0.25, 0.5, 0.75, 0.9))
[1] 274.0976 279.7998 285.3947 290.6705 296.7787
alt_forecasted$pred_funs$t3$qfun(c(0.1, 0.25, 0.5, 0.75, 0.9))
[1] 276.7910 281.3556 286.3064 291.2755 295.3956

Let’s have a look at the expected value for t5, both sampling and integrating.

mean(forecasted$pred_funs$t5$rfun(10000))
[1] 283.9367
norm_base <- integrate(function(x) forecasted$pred_funs$t5$dfun(x), forecasted$pred_funs$t5$qfun(0), forecasted$pred_funs$t5$qfun(1))$value
integrate(function(x) forecasted$pred_funs$t5$dfun(x) * x, forecasted$pred_funs$t5$qfun(0), forecasted$pred_funs$t5$qfun(1))$value/norm_base
[1] 283.8507
mean(alt_forecasted$pred_funs$t5$rfun(10000))
[1] 285.3761
norm_base <- integrate(function(x) alt_forecasted$pred_funs$t5$dfun(x), alt_forecasted$pred_funs$t5$qfun(0), alt_forecasted$pred_funs$t5$qfun(1))$value
integrate(function(x) alt_forecasted$pred_funs$t5$dfun(x) * x, alt_forecasted$pred_funs$t5$qfun(0), alt_forecasted$pred_funs$t5$qfun(1))$value/norm_base
[1] 285.3747

Final Thoughts

chopper offers a powerful yet intuitive way to forecast series that don't play by the rules of stationarity. Whether you’re modeling electricity usage, cryptocurrency prices, or pandemic case counts, structural shifts matter. And so should your forecast.

Just remember: if your time series has commitment issues, chopper is the therapist it needs—cutting through the drama one regime at a time. And if your forecast suddenly spikes for no reason, don’t worry—it’s just going through a mid-life changepoint (yes, it’s a joke).

Enzoi!