The Box-Cox transformation is used to stabilize the variance of a time series using a parameter called lambda (\(\lambda\)). Many time series have seasonal swings that sometimes get bigger as the level of the series rises. For example, for some time series the seasonal swings in earlier time periods are small, but in later time periods the ups and downs become much larger. This is also known as non-constant variance (heteroskedasticity).
Interpreting the estimated lambda is important because it helps to decide how to transform the time series. Special cases include:
lambda = 1 -> no transformation needed
lambda = 0 -> log transformation
lambda = 0.5 -> square root transformation
lambda = -1 -> inverse transformation
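To see where these special cases come from, it helps to write the transformation out. The text above does not give the formula, so this is the standard textbook definition for strictly positive \(y_t\); the symbol \(w_t\) for the transformed value is my own notation.

\[
w_t =
\begin{cases}
\log(y_t), & \lambda = 0, \\
\dfrac{y_t^{\lambda} - 1}{\lambda}, & \lambda \neq 0.
\end{cases}
\]

Plugging in the special values: \(\lambda = 1\) gives \(w_t = y_t - 1\), which only shifts the series and leaves its shape unchanged; \(\lambda = 0.5\) gives \(w_t = 2(\sqrt{y_t} - 1)\), a rescaled square root; and \(\lambda = -1\) gives \(w_t = 1 - 1/y_t\), a shifted and sign-flipped inverse.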
The Box-Cox method has several advantages:
it is simple, because a single parameter (lambda) controls the whole transformation
it covers common transformations (log, square root, inverse) in one framework
it helps forecasting models that assume constant variance perform better
R handles the back-transformation automatically (for example, fable back-transforms forecasts when the transformation is written in the model formula)
The disadvantages are:
it only works with strictly positive data (it cannot handle zeros or negatives)
it makes interpretation harder because everything is on a transformed scale
it only stabilizes the variance (other tools, such as differencing, are still needed for non-stationarity)
back-transformed point forecasts are medians rather than means unless a bias adjustment is applied
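The last point about medians versus means can be checked with a small simulation. This is a minimal base R sketch, not taken from the analysis here: the values of `mu` and `sigma` are hypothetical, and it assumes a forecast distribution that is normal on the log scale.

```r
# Simulate a "forecast distribution" that is normal on the log scale.
set.seed(1)
mu    <- 10    # hypothetical mean on the log scale
sigma <- 0.5   # hypothetical standard deviation on the log scale
log_draws <- rnorm(1e5, mean = mu, sd = sigma)
draws     <- exp(log_draws)   # the same draws on the original scale

# Back-transforming the log-scale mean lands near the median, not the mean.
naive_point <- exp(mean(log_draws))

# Bias adjustment for the log case: the log-normal mean is exp(mu + sigma^2 / 2).
adjusted_point <- exp(mean(log_draws) + var(log_draws) / 2)

c(naive = naive_point, median = median(draws),
  adjusted = adjusted_point, mean = mean(draws))
```

The naive back-transform should sit close to the sample median and below the sample mean, while the adjusted value should sit close to the sample mean. The gap grows with the variance on the transformed scale.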
# A tibble: 1 × 10
  .model .type     ME   RMSE    MAE   MPE  MAPE  MASE RMSSE  ACF1
  <chr>  <chr>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 snaive Test  59372. 71504. 61358.  3.10  3.21 0.763 0.627 0.750
accuracy(fc_log, sales)
# A tibble: 1 × 10
  .model     .type     ME   RMSE    MAE   MPE  MAPE  MASE RMSSE  ACF1
  <chr>      <chr>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 snaive_log Test  50807. 63114. 53274.  2.65  2.78 0.662 0.554 0.728
accuracy(fc_sqrt, sales)
# A tibble: 1 × 10
  .model      .type     ME   RMSE    MAE   MPE  MAPE  MASE RMSSE  ACF1
  <chr>       <chr>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 snaive_sqrt Test  56031. 68206. 58203.  2.92  3.04 0.724 0.598 0.742
Overall, when comparing all the transformations, the log transformation improves the error metrics the most. Compared to the original SNAIVE model, MAPE dropped from 3.21% to 2.78%, MAE dropped from 61,358 to 53,274, and ME dropped from 59,372 to 50,807.
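To put these drops on a common scale, the relative reduction for each metric can be computed directly. This is a small base R sketch using only the numbers from the accuracy() output above; the data frame and column names are my own.

```r
# Test-set metrics copied from the accuracy() output above.
metrics <- data.frame(
  metric     = c("ME", "RMSE", "MAE", "MAPE"),
  snaive     = c(59372, 71504, 61358, 3.21),
  snaive_log = c(50807, 63114, 53274, 2.78)
)

# Percentage reduction relative to the untransformed SNAIVE model.
metrics$pct_reduction <- round(
  100 * (metrics$snaive - metrics$snaive_log) / metrics$snaive, 1
)
metrics
```

Each of the four metrics falls by roughly 12 to 14 percent, so the improvement is consistent across metrics rather than driven by a single one.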
Why does this happen?
The STL decomposition shows that the seasonal swings were getting bigger over time. The log transformation compresses larger values more than smaller ones, which makes the size of the seasonal swings more consistent across the series.
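A toy example makes the compression concrete. This is a minimal base R sketch with made-up numbers (not the sales data): the seasonal swing is plus or minus 10% of the level, once at a level of 100 and once at a level of 1000.

```r
# Hypothetical multiplicative series: the seasonal factor is +/-10% of the level.
level    <- c(100, 100, 1000, 1000)
seasonal <- c(0.9, 1.1, 0.9, 1.1)
y <- level * seasonal

# Size of the swing on the original scale at the low and the high level.
swing_raw <- c(low = y[2] - y[1], high = y[4] - y[3])

# Size of the same swing after taking logs.
swing_log <- c(low = log(y[2]) - log(y[1]), high = log(y[4]) - log(y[3]))

swing_raw
swing_log
```

On the original scale the swing is ten times larger at the higher level (20 versus 200), but on the log scale both swings equal log(1.1 / 0.9), so the seasonal pattern has the same size at every level.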
Note: The estimated lambda was -0.7, which points toward an inverse-type transformation (the exact inverse corresponds to lambda = -1), but this gave me NaN values, so I tried the log and square root transformations instead.
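I did not track down exactly where the NaN values came from, but one possible mechanism is the back-transformation. With a negative lambda, the transformed values of positive data are bounded above by -1 / lambda, and anything on the transformed scale beyond that bound cannot be mapped back. The sketch below is base R with a hypothetical helper name (`inv_box_cox_sketch`) and made-up transformed values; it is not the code used for the forecasts here.

```r
# Inverse Box-Cox: y = exp(w) for lambda = 0, else y = (lambda * w + 1)^(1 / lambda).
inv_box_cox_sketch <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

lambda <- -0.7
upper_bound <- -1 / lambda   # about 1.43; transformed positive data stay below this

# Two values below the bound and one above it (hypothetical transformed values).
w <- c(1.0, 1.4, 1.5)
back <- inv_box_cox_sketch(w, lambda)
back
```

The first two values map back to positive numbers, while the third gives NaN because `lambda * w + 1` is negative and is raised to a non-integer power. A forecast or interval limit that drifts past the bound on the transformed scale would fail in the same way.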