Box-Cox Transformations in Time Series Analysis
Setup
What is Box-Cox?
Box-Cox transformations stabilize variance in time series data. Many series show increasing variance as the level increases (the “fan-out” pattern).
The Problem: Non-Constant Variance
# Show the fan-out pattern
food_data %>%
autoplot(Turnover) +
labs(title = "Australian Food Retail: Notice Increasing Variance",
subtitle = "Early years: small fluctuations. Recent years: large fluctuations",
y = "Turnover ($AUD)") +
theme_minimal()
The Box-Cox Formula
\[w_t = \begin{cases} \frac{y_t^\lambda - 1}{\lambda} & \text{if } \lambda \neq 0 \\ \log(y_t) & \text{if } \lambda = 0 \end{cases}\]
Key Parameters:
- \(y_t\) = original data
- \(w_t\) = transformed data
- \(\lambda\) = transformation strength
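The piecewise formula can be checked numerically with a few lines of base R. This is a minimal sketch, not the implementation used by any package: the helper name box_cox_manual is hypothetical, and the tolerance used to detect λ = 0 is an arbitrary choice.

```r
# Hypothetical helper: direct translation of the piecewise Box-Cox formula.
box_cox_manual <- function(y, lambda) {
  stopifnot(all(y > 0))            # Box-Cox is only defined for positive values
  if (abs(lambda) < 1e-8) {
    log(y)                         # the lambda = 0 branch
  } else {
    (y^lambda - 1) / lambda        # the general power branch
  }
}

y <- c(1, 10, 100)

w_sqrt     <- box_cox_manual(y, 0.5)    # same as 2 * (sqrt(y) - 1)
w_log      <- box_cox_manual(y, 0)      # same as log(y)
w_near_log <- box_cox_manual(y, 1e-4)   # small lambda: already very close to log(y)

print(rbind(w_sqrt, w_log, w_near_log))
```

The last row illustrates why the log is used at λ = 0: as λ shrinks towards zero, the power branch approaches log(y), so the two branches join smoothly.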
When to Use Box-Cox
Use when:
- Variance increases with the level of the series
- You see a “fan-out” pattern in plots
- Residuals from models show heteroscedasticity
- Forecasting models perform poorly due to changing variance
Don’t use when:
- Variance is already stable
- Data contains zeros or negative values (without adjustment)
- You need to maintain original scale interpretation
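One rough way to check the first condition is to compare the spread of each year with its average level. The sketch below uses a synthetic monthly series (an assumption made purely for illustration, not the retail data) whose seasonal swings are proportional to its level, and all variable names are hypothetical.

```r
# Synthetic monthly series: 10 years, level grows 1% per month,
# seasonal swing is always 10% of the current level.
month_index <- 1:120
level <- 100 * 1.01^month_index
y <- level * (1 + 0.1 * sin(2 * pi * month_index / 12))
year <- rep(1:10, each = 12)

# Level and spread per year, on the original and the log scale
yearly_mean   <- as.numeric(tapply(y, year, mean))
yearly_sd     <- as.numeric(tapply(y, year, sd))
yearly_sd_log <- as.numeric(tapply(log(y), year, sd))

# A correlation near 1 means the spread grows with the level (fan-out)
level_sd_cor <- cor(yearly_mean, yearly_sd)

# Ratio of largest to smallest yearly spread: far above 1 on the
# original scale, close to 1 once the variance has been stabilized
sd_ratio_raw <- max(yearly_sd) / min(yearly_sd)
sd_ratio_log <- max(yearly_sd_log) / min(yearly_sd_log)

print(c(cor = level_sd_cor, raw = sd_ratio_raw, log = sd_ratio_log))
```

On real data the relationship will be noisier, so this is a screening check to use alongside the plot, not a formal test.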
Formula Intuition
Key Mathematical Insight
Important: When λ = 1, Box-Cox gives us: \(\frac{y^1 - 1}{1} = y - 1\)
This is not “no transformation” - it’s a linear shift by -1.
A constant shift changes neither the shape of the series nor its variance, so for modelling purposes λ = 1 is equivalent to leaving the data untransformed; only the level moves down by 1.
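A quick base R check of what λ = 1 does in practice, using made-up numbers: the transformed values differ from the originals by exactly 1, while the period-to-period changes and the variance are untouched.

```r
# Illustrative values only
y <- c(12, 15, 11, 18, 25, 21, 30)

# Box-Cox with lambda = 1
w <- (y^1 - 1) / 1

shift_is_one  <- all(y - w == 1)                       # every value moved down by 1
same_changes  <- isTRUE(all.equal(diff(w), diff(y)))   # differences unchanged
same_variance <- isTRUE(all.equal(var(w), var(y)))     # variance unchanged

print(c(shift_is_one, same_changes, same_variance))
```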
Common λ Values
library(knitr)
tibble(
Lambda = c(2, 1, 0.5, 0, -0.5, -1),
Formula = c("(y² - 1)/2", "(y - 1)/1", "(√y - 1)/0.5", "log(y)", "(1/√y - 1)/(-0.5)", "(1/y - 1)/(-1)"),
`Simplified Form` = c("(y² - 1)/2", "y - 1", "2(√y - 1)", "log(y)", "-2(1/√y - 1)", "1 - 1/y"),
Effect = c("Expansion", "Linear shift", "Mild compression", "Moderate compression",
"Strong compression", "Very strong compression"),
`Use When` = c("Variance decreases with level", "Variance already stable", "Slight fan-out", "Clear fan-out",
"Strong fan-out", "Extreme fan-out")
) %>%
kable(caption = "Box-Cox Transformations: The Complete Picture")
| Lambda | Formula | Simplified Form | Effect | Use When |
|---|---|---|---|---|
| 2.0 | (y² - 1)/2 | (y² - 1)/2 | Expansion | Variance decreases with level |
| 1.0 | (y - 1)/1 | y - 1 | Linear shift | Variance already stable |
| 0.5 | (√y - 1)/0.5 | 2(√y - 1) | Mild compression | Slight fan-out |
| 0.0 | log(y) | log(y) | Moderate compression | Clear fan-out |
| -0.5 | (1/√y - 1)/(-0.5) | -2(1/√y - 1) | Strong compression | Strong fan-out |
| -1.0 | (1/y - 1)/(-1) | 1 - 1/y | Very strong compression | Extreme fan-out |
Visual Intuition
# Show how different lambdas "bend" the data
demo_data <- tibble(x = seq(1, 20, by = 1)) %>%
mutate(
`λ = 1 (y - 1)` = x - 1,
`λ = 0.5 ((√y - 1)/0.5)` = 2 * (sqrt(x) - 1),
`λ = 0 (Log)` = log(x)
) %>%
pivot_longer(cols = -x, names_to = "Transformation", values_to = "y")
demo_data %>%
ggplot(aes(x = x, y = y)) +
geom_line(linewidth = 1, color = "steelblue") +
facet_wrap(~ Transformation, scales = "free_y") +
labs(title = "Box-Cox Formula Reality Check",
subtitle = "λ = 1 gives linear shift (y - 1), not original scale",
x = "Original Value", y = "Box-Cox Transformed Value") +
theme_minimal()
Key Insight: Transformations with λ < 1 compress large values more than small ones, which stabilizes variance when it increases with level.
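A small worked example makes the compression concrete. The numbers are invented for illustration: the same ±10% swing is taken once around a level of 10 and once around a level of 100, and the helper spread is a hypothetical name.

```r
# The same relative swing (plus/minus 10%) at two different levels
low  <- c(9, 11)      # around 10
high <- c(90, 110)    # around 100

spread <- function(v) diff(range(v))   # hypothetical helper: max minus min

# Original scale: the swing at the high level is 10 times larger
ratio_raw  <- spread(high) / spread(low)

# lambda = 0.5: the gap shrinks from a factor of 10 to sqrt(10)
ratio_sqrt <- spread(2 * (sqrt(high) - 1)) / spread(2 * (sqrt(low) - 1))

# lambda = 0 (log): both swings have exactly the same size
ratio_log  <- spread(log(high)) / spread(log(low))

print(c(raw = ratio_raw, sqrt = ratio_sqrt, log = ratio_log))
```

When fluctuations are proportional to the level, the log removes the fan-out entirely, while λ = 0.5 removes only part of it. Which λ is appropriate depends on how quickly the spread grows with the level.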
https://onlinestatbook.com/mobile/transformations/box-cox.html
Finding Optimal λ
Guerrero Method (Automatic)
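The code for this step is not shown on the page, so the following is only a sketch of how it might look. It assumes food_data is total Australian food retail turnover built from aus_retail (tsibbledata); that construction is an assumption, not taken from the text. The names lambda_optimal and food_transformed match those used in the later chunks. The guerrero feature returns its estimate in a column named lambda_guerrero.

```r
library(fpp3)

# Assumed construction of food_data (not shown in the original page)
food_data <- aus_retail %>%
  filter(Industry == "Food retailing") %>%
  summarise(Turnover = sum(Turnover))

# Guerrero's method: choose the lambda that makes the variation
# most uniform across the series
lambda_optimal <- food_data %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

# Keep original and transformed values side by side for plotting
food_transformed <- food_data %>%
  mutate(
    Original    = Turnover,
    Transformed = box_cox(Turnover, lambda_optimal)
  )

print(lambda_optimal)
```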
Compare Before and After
# Plot original vs transformed
food_transformed %>%
pivot_longer(cols = c(Original, Transformed),
names_to = "Type", values_to = "Value") %>%
ggplot(aes(x = Month, y = Value)) +
geom_line(color = "steelblue") +
facet_wrap(~ Type, scales = "free_y", ncol = 1) +
labs(title = "Original vs Box-Cox Transformed Data",
subtitle = paste("λ =", round(lambda_optimal, 3)),
y = "Value") +
theme_minimal()
Handling Zero Values
x <- c(0, 1, 10, 100)
log(x) # unsafe: log(0) is -Inf
[1] -Inf 0.000000 2.302585 4.605170
log1p(x) # safe: log(1+x)
[1] 0.0000000 0.6931472 2.3978953 4.6151205
Problem: Box-Cox requires positive values. Common solutions:
Method 1: Add a Small Constant
# If data has zeros, add small constant
data_with_zeros <- tibble(
Month = seq(as.Date("2020-01-01"), by = "month", length.out = 12),
Value = c(0, 5, 10, 0, 15, 20, 25, 0, 30, 35, 40, 45)
)
# Add small constant before transformation
adjusted_data <- data_with_zeros %>%
mutate(
Original = Value,
Adjusted = Value + 0.1, # Add small constant
Transformed = box_cox(Adjusted, 0.5)
)
print("Original data with zeros:")
[1] "Original data with zeros:"
print(data_with_zeros$Value[1:4])
[1] 0 5 10 0
print("After adding 0.1:")
[1] "After adding 0.1:"
print(adjusted_data$Adjusted[1:4])
[1] 0.1 5.1 10.1 0.1
Method 2: Use Modified Box-Cox
# Some implementations allow for shifted Box-Cox
# Formula: ((y + shift)^λ - 1) / λ
# Example with manual implementation
modified_boxcox <- function(y, lambda, shift = 1) {
if (lambda == 0) {
log(y + shift)
} else {
((y + shift)^lambda - 1) / lambda
}
}
# Apply with shift
adjusted_data <- adjusted_data %>%
mutate(Modified_BoxCox = modified_boxcox(Original, 0.5, shift = 1))
print("Modified Box-Cox with shift:")
[1] "Modified Box-Cox with shift:"
print(adjusted_data$Modified_BoxCox[1:4])
[1] 0.000000 2.898979 4.633250 0.000000
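Values transformed with a manual shift also have to be mapped back by hand, because the shift is not part of the standard Box-Cox inverse. The sketch below repeats modified_boxcox from above so that it runs on its own and adds a hypothetical inverse, inv_modified_boxcox. It assumes the same lambda and shift are used in both directions.

```r
# Shifted Box-Cox, as defined above
modified_boxcox <- function(y, lambda, shift = 1) {
  if (lambda == 0) {
    log(y + shift)
  } else {
    ((y + shift)^lambda - 1) / lambda
  }
}

# Hypothetical inverse: undo the power (or log), then remove the shift
inv_modified_boxcox <- function(w, lambda, shift = 1) {
  if (lambda == 0) {
    exp(w) - shift
  } else {
    (lambda * w + 1)^(1 / lambda) - shift
  }
}

value <- c(0, 5, 10, 0, 15, 20)

w         <- modified_boxcox(value, 0.5, shift = 1)
recovered <- inv_modified_boxcox(w, 0.5, shift = 1)

print(rbind(value, w, recovered))
```

The round trip returns the original values, zeros included. The choice of shift (0.1 in Method 1, 1 here) changes the transformed values, so it should be recorded along with λ.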
Practical Implementation
In Forecasting Models
# Compare models with and without Box-Cox
models <- food_data %>%
model(
Original = ETS(Turnover),
BoxCox = ETS(box_cox(Turnover, lambda_optimal)),
Auto = ETS(Turnover) # same specification as Original: ETS() selects its components automatically, not a transformation
)
# Compare performance
model_performance <- models %>%
glance() %>%
select(.model, AICc, sigma2) %>%
arrange(AICc)
model_performance %>%
kable(digits = 2, caption = "Model Comparison")
| .model | AICc | sigma2 |
|---|---|---|
| BoxCox | -94.40 | 0 |
| Original | 6609.73 | 0 |
| Auto | 6609.73 | 0 |
Interpreting the Results
Key insights from the comparison:
- The AICc values are not directly comparable: -94.40 for BoxCox vs ~6610 for the others
  - Lower AICc means a better fit only among models fitted to the same response
  - The BoxCox model's likelihood is computed on the transformed scale, so the large gap mostly reflects the change of scale, not necessarily a better model
- Original and Auto are identical: both have AICc = 6609.73
  - Both are the same specification, ETS(Turnover), so they produce the same fit
  - ETS() selects the error, trend, and seasonal components automatically; it does not select a transformation
- Why Box-Cox can still help:
  - Addresses the heteroscedasticity (changing variance) problem
  - Creates more stable residuals that ETS can model effectively
  - Transforms the “fan-out” pattern into more manageable errors
The lesson: the transformation has to be chosen by the analyst, because:
- It specifically targets the variance stabilization problem
- Automatic model selection in ETS() does not test for heteroscedasticity or apply a transformation
- The Guerrero method optimizes specifically for variance stability
To judge whether the transformation helps, compare forecast accuracy on the original scale, not AICc across transformed and untransformed models.
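Information criteria computed on different response scales are hard to compare, so one alternative is to hold out the last few years and compare forecast errors on the original scale. The following is a sketch under stated assumptions: food_data is rebuilt from aus_retail (an assumed construction), the 2015 cut-off is arbitrary, and λ is estimated on the training data only so that the test period stays unseen.

```r
library(fpp3)

# Assumed construction of food_data (not shown in the original page)
food_data <- aus_retail %>%
  filter(Industry == "Food retailing") %>%
  summarise(Turnover = sum(Turnover))

# Hold out everything after 2015 as a test set
train <- food_data %>% filter(year(Month) <= 2015)

# Estimate lambda from the training data only
lambda_train <- train %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

fits <- train %>%
  model(
    Original = ETS(Turnover),
    BoxCox   = ETS(box_cox(Turnover, lambda_train))
  )

# Forecasts are back-transformed, so both models are scored
# against the same original-scale observations
test_accuracy <- fits %>%
  forecast(h = "3 years") %>%
  accuracy(food_data) %>%
  select(.model, RMSE, MAE, MAPE)

print(test_accuracy)
```

A single split is only one piece of evidence. Time series cross-validation would give a more stable comparison at the cost of more computation.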
Back-Transformation
# Important: fpp3 automatically back-transforms forecasts
forecasts <- models %>%
forecast(h = 12)
# All forecasts are on original scale
forecasts %>%
autoplot(food_data %>% filter(year(Month) >= 2015)) +
labs(title = "Forecasts (Automatically Back-Transformed)",
y = "Turnover ($AUD)") +
theme_minimal()
Quick Decision Guide
When to use Box-Cox:
- Plot your data - look for fan-out pattern
- Check residuals - heteroscedasticity indicates need for transformation
- Try λ = 0.5 first - often works well for economic data
- Use Guerrero method - for automatic optimal λ selection
- Validate results - ensure variance is more stable after transformation
R Code Pattern:
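The pattern itself is not shown on the page. A minimal sketch of the three steps used throughout (estimate λ, fit with box_cox() in the model formula, forecast) might look like this, again assuming food_data is built from aus_retail, which is an assumption and not taken from the text.

```r
library(fpp3)

# Assumed construction of food_data (not shown in the original page)
food_data <- aus_retail %>%
  filter(Industry == "Food retailing") %>%
  summarise(Turnover = sum(Turnover))

# 1. Estimate lambda with the Guerrero feature
lambda <- food_data %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

# 2. Fit the model with the transformation inside the formula
fit <- food_data %>%
  model(BoxCox = ETS(box_cox(Turnover, lambda)))

# 3. Forecast; results are returned on the original scale
fc <- fit %>% forecast(h = 12)

print(fc)
```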
Summary
- Purpose: Stabilize variance in time series
- When: Data shows increasing variance with level
- How: Apply power transformation with parameter λ
- Zero handling: Add small constant or use shifted Box-Cox
- In practice: Use the guerrero feature for automatic λ selection
- Models: Apply with the box_cox() function in model formulas
Appendix
https://onlinestatbook.com/mobile/transformations/box-cox.html
http://www.econ.illinois.edu/~econ508/Papers/boxcox64.pdf
Kutner, M., Nachtsheim, C., Neter, J., and Li, W. (2004). Applied Linear Statistical Models, McGraw-Hill/Irwin, Homewood, IL. https://users.stat.ufl.edu/~winner/sta4211/ALSM_5Ed_Kutner.pdf