ETS’s ability to discern trending from non-trending random walks

LFS forecasts:

What prompted me to look into this: I used ETS (ExponenTial Smoothing) to predict last Friday’s LFS numbers and noticed that the forecasts were flat for a couple of industries that, to the untrained eye, looked like they were trending, e.g. Health Care. Not that this is necessarily a bad thing: if the forecasts had been trending, Jackie would have beaten me in the competition 😜. Still, it raises the question of how good the algorithm is at differentiating between a random walk and a trending random walk.

What is a (Gaussian) random walk?

A Gaussian random walk is a random walk whose steps are drawn from a normal (Gaussian) distribution, so both the magnitude and the direction (sign) of each step are random.

Here’s how a Gaussian random walk works:

Initial Position: The process starts from an initial value, often a fixed point such as zero.
Random Steps: At each time step, a random value is drawn from a Gaussian (normal) distribution with a specified mean (average) and standard deviation (a measure of the spread or variability). This random value determines both the magnitude and direction of the step taken at that time step.
Accumulation: The random step is then added to the current position, and this becomes the new position for the next time step.

The mean and standard deviation of the Gaussian distribution determine the characteristics of the random walk. The further the mean is from zero, the more strongly the process tends to trend (upward if the mean is positive, downward if it is negative), while a higher standard deviation leads to larger and more unpredictable steps.
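The three steps above can be sketched in a few lines of Python. This is only an illustration using the standard library; the function name `gaussian_random_walk` and its arguments are made up for this example and are not taken from the original analysis.

```python
import random


def gaussian_random_walk(n, mu=0.0, sigma=1.0, start=0.0, seed=None):
    """Return the n positions visited by a Gaussian random walk.

    Hypothetical helper for illustration: mu and sigma are the mean and
    standard deviation of each step, start is the initial position.
    """
    rng = random.Random(seed)
    position = start  # initial position
    path = []
    for _ in range(n):
        step = rng.gauss(mu, sigma)  # random step: sign and size both random
        position += step             # accumulation
        path.append(position)
    return path


# Same seed, so both walks are built from the same underlying noise.
flat = gaussian_random_walk(120, mu=0.0, sigma=1.0, seed=1)
drifting = gaussian_random_walk(120, mu=0.5, sigma=1.0, seed=1)
```

Because the two walks share a seed, they should differ only by the accumulated mean of the steps, which makes the effect of a non-zero mean easy to see when the two paths are plotted together.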

ETS Algorithm:

ETS is a fully automated forecasting algorithm that uses maximum likelihood estimation to fit a set of candidate models, and then uses the AIC score to choose the “best” model from those candidates. ETS stands for <Error, Trend, Seasonal>, and what we are interested in here is the algorithm’s ability to discern whether a series is trending or not. To find out, we run a simulation in which we create time series with known characteristics, and then see how well the algorithm does at uncovering the (hidden) characteristics of each series.

Random walk

Create vectors of innovations drawn from the normal distribution with mean zero: \(\mu=0\).
A random walk is the cumulative sum of the above innovations.

Random walk with drift/trend

Create vectors of innovations drawn from the normal distribution with mean equal to a non-zero constant: \(\mu=k\).
A random walk with drift/trend is the cumulative sum of the above innovations.
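
In symbols, the two definitions above can be written as follows. The notation \(y_t\) for the series, \(y_0\) for its starting value and \(\varepsilon_t\) for the innovations is introduced here for illustration, and the innovations are assumed to be independent:

\[
y_t = y_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim N(\mu, \sigma^2), \qquad \text{so} \qquad y_t = y_0 + \sum_{i=1}^{t} \varepsilon_i .
\]

\[
\mathbb{E}[y_t] = y_0 + \mu t, \qquad \operatorname{sd}(y_t) = \sigma\sqrt{t} .
\]

With \(\mu=0\) this is the plain random walk. With \(\mu=k\neq0\) the expected path is a straight line with slope \(k\), while the spread around that line grows like \(\sigma\sqrt{t}\).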

What challenges does the algorithm face?

Distinguishing a random walk and a random walk with drift is more difficult the closer \(k\) is to zero.
Distinguishing a random walk and a random walk with drift is more difficult the noisier the innovations are, i.e. the higher the standard deviation \(\sigma\) of the normal distribution from which the innovations are drawn.
Distinguishing a random walk and a random walk with drift is more difficult the fewer observations \(n\) are available.
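
These three challenges can be folded into one rough number. As a rule of thumb only (this is not how ETS decides internally), suppose the drift were estimated by the average of the \(n\) innovations: that average has standard error \(\sigma/\sqrt{n}\), so the drift stands out from the noise roughly in proportion to \(|\mu|\sqrt{n}/\sigma\). The helper name `drift_signal_to_noise` below is hypothetical.

```python
import math


def drift_signal_to_noise(mu, sigma, n):
    """Drift divided by the standard error of the mean innovation.

    Hypothetical rule-of-thumb helper: equals |mu| / (sigma / sqrt(n)).
    Larger values mean the drift should be easier to detect.
    """
    return abs(mu) * math.sqrt(n) / sigma


# Hardest and easiest non-zero-drift corners of a grid such as the one used here.
hardest = drift_signal_to_noise(mu=0.1, sigma=2.0, n=120)
easiest = drift_signal_to_noise(mu=0.5, sigma=1.0, n=960)
```

The ratio shrinks when \(\mu\) moves toward zero, when \(\sigma\) grows, or when \(n\) falls, matching the three points above.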

The simulation:

In order to see how well the algorithm does at distinguishing between a random walk and a random walk with drift we create 100 time series for each combination of \(\mu, \sigma, n\). We consider \(\mu\) with values of -0.5, -0.4, -0.3, -0.2, -0.1, 0, 0.1, 0.2, 0.3, 0.4, 0.5, \(\sigma\) with values of 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2 and \(n\) with values of 120, 240, 480, 960, resulting in 48,400 series to be forecast.
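
As a sanity check on the count, the parameter grid can be laid out as below. This is a sketch with made-up variable names (`mus`, `sigmas`, `ns`, `reps`), not the code used for the original simulation.

```python
from itertools import product

# Grid values as described in the text; rounding avoids floating-point
# artefacts such as 0.30000000000000004.
mus = [round(-0.5 + 0.1 * i, 1) for i in range(11)]    # -0.5, ..., 0.5
sigmas = [round(1.0 + 0.1 * i, 1) for i in range(11)]  # 1.0, ..., 2.0
ns = [120, 240, 480, 960]
reps = 100  # series simulated per parameter combination

grid = list(product(mus, sigmas, ns))  # every (mu, sigma, n) combination
total_series = len(grid) * reps
```

There are 11 × 11 × 4 = 484 parameter combinations, and 100 series for each gives the 48,400 series mentioned above.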

With this data, for each combination of the above parameters, we record how often (out of 100 series) the algorithm chooses, on the basis of AIC, a model that features a trend.
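
The tallying step can be sketched as follows. Note the stand-in: instead of fitting ETS, the code below compares just two Gaussian models of the innovations by AIC, one with zero mean and one with an estimated mean. It is a deliberately simplified substitute that shows the bookkeeping, so its error rates need not match those of ETS. All function names are hypothetical.

```python
import math
import random


def gaussian_aic(residuals, n_params):
    """AIC of a Gaussian model with the variance set to its ML estimate."""
    n = len(residuals)
    var_hat = sum(r * r for r in residuals) / n
    log_lik = -0.5 * n * (math.log(2 * math.pi * var_hat) + 1)
    return 2 * n_params - 2 * log_lik


def prefers_drift(innovations):
    """Stand-in for ETS model selection: True if AIC favours a non-zero mean."""
    mean = sum(innovations) / len(innovations)
    aic_no_drift = gaussian_aic(innovations, n_params=1)  # variance only
    aic_drift = gaussian_aic([x - mean for x in innovations], n_params=2)
    return aic_drift < aic_no_drift


def share_flagged_as_trending(mu, sigma, n, reps=100, seed=0):
    """Proportion of simulated series for which the trending model is chosen."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        innovations = [rng.gauss(mu, sigma) for _ in range(n)]
        hits += prefers_drift(innovations)
    return hits / reps


no_drift_share = share_flagged_as_trending(mu=0.0, sigma=1.0, n=120)
strong_drift_share = share_flagged_as_trending(mu=0.5, sigma=1.0, n=960)
```

To reproduce the actual experiment, `prefers_drift` would be replaced by a call to whichever automated ETS implementation is used, followed by a check of whether the selected model has a trend component.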

Results:

In a perfect world the algorithm would identify a series as non-trending if and only if \(\mu=0\): the proportion of series identified as trending would be 100% (yellow) whenever \(\mu\neq0\), and 0% (purple) when \(\mu=0\).

However, in reality the algorithm makes two kinds of mistakes:

When \(\mu=0\), the algorithm thinks that the series is trending about 10% of the time, regardless of the number of observations and the standard deviation of the innovations.
When \(\mu\neq0\), the algorithm sometimes thinks that the series is not trending, and the proportion of errors depends on the parameters \(\mu, \sigma, n\).
- More mistakes the closer \(\mu\) is to zero.
- More mistakes the larger the standard deviation \(\sigma\).
- More mistakes the smaller the number of observations.

What is the point of this?

With real data we never know the data generating process, so we can never say whether the chosen model is appropriate or not: e.g. with the motivating example of health care, it does look like a trending random walk, but we do not know this. In contrast, with simulation we control the data generating process, so we can ascertain when and why the algorithm makes mistakes.