Forecasting using 19 {forecast} package algorithms

Forecasting plays a vital role in identifying patterns within time-series data and supporting informed decision-making across diverse fields such as business, healthcare, economics, and environmental studies.

In R, the {forecast} package offers a powerful and comprehensive collection of algorithms that help users model, compare, and predict future values based on historical observations. It includes 19 forecasting methods, ranging from well-established statistical techniques such as exponential smoothing (ETS) and ARIMA, to advanced approaches like TBATS for handling complex seasonality and dynamic regression models that incorporate external variables. Each method is suited for different data characteristics, such as trend, seasonality, or irregular patterns.

By providing a unified framework for applying and evaluating these methods, the {forecast} package makes it easier for practitioners to select suitable models, measure predictive accuracy, and generate forecasts that can guide practical decisions and strategies.

Below is an overview of the 19 forecasting methods included in the package, along with general steps for applying each approach.

Forecasting Methods and Steps:

Note. Before we begin load the necessary libraries:

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(forecast)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
library(ggfortify)
## Registered S3 methods overwritten by 'ggfortify':
##   method                 from    
##   autoplot.Arima         forecast
##   autoplot.acf           forecast
##   autoplot.ar            forecast
##   autoplot.bats          forecast
##   autoplot.decomposed.ts forecast
##   autoplot.ets           forecast
##   autoplot.forecast      forecast
##   autoplot.stl           forecast
##   autoplot.ts            forecast
##   fitted.ar              forecast
##   fortify.ts             forecast
##   residuals.ar           forecast

Load and plot the Time-Series dataset.

# Load the AirPassengers dataset
data(AirPassengers)
ts_data <- AirPassengers

# Plot dataset
autoplot(ts_data)

Split the data into training and testing sets.

train_data <- window(ts_data, end = c(1959, 12))
test_data <- window(ts_data, start = c(1960, 1))

Create function to calculate forecast accuracy.

calculate_accuracy <- function(forecast, test_data) {
  accuracy(forecast, test_data)
}

Now, we are ready to forecast!

  1. Auto ARIMA (Auto-Regressive Integrated Moving Average)

Purpose: Automatically selects the best ARIMA model by minimizing information criteria.

# Auto ARIMA model
fit_auto_arima <- auto.arima(train_data)
# Note. You can also use fit_auto_arima %>% accuracy() to find model accuracy.
forecast_auto_arima <- forecast(fit_auto_arima, h = length(test_data))
accuracy_auto_arima <- calculate_accuracy(forecast_auto_arima, test_data)
  1. ETS (Exponential Smoothing State Space Model)

Purpose: Models trend and seasonality using exponential smoothing methods.

fit_ets <- ets(train_data)
forecast_ets <- forecast(fit_ets, h = length(test_data))
accuracy_ets <- calculate_accuracy(forecast_ets, test_data)
  1. STLF (Seasonal and Trend Decomposition using Loess Forecasting)

Purpose: Handles complex seasonality by decomposing the series.

forecast_stlf <- stlf(train_data, h = length(test_data))
accuracy_stlf <- calculate_accuracy(forecast_stlf, test_data)
  1. NNETAR (Neural Network Autoregression)

Purpose: Uses a feed-forward neural network to model nonlinear relationships.

fit_nnetar <- nnetar(train_data, lambda=0.5)
forecast_nnetar <- forecast(fit_nnetar, PI=TRUE, h=12)
accuracy_nnetar <- calculate_accuracy(forecast_nnetar, test_data)
  1. TBATS (Trigonometric Box-Cox ARMA Trend Seasonal Model)

Purpose: Handles multiple or non-integer seasonalities.

fit_tbats <- tbats(train_data)
forecast_tbats <- forecast(fit_tbats, h = length(test_data))
accuracy_tbats <- calculate_accuracy(forecast_tbats, test_data)
  1. Theta Method

Purpose: A simple and robust method that works well for many series.

forecast_theta <- thetaf(train_data, h = length(test_data))
accuracy_theta <- calculate_accuracy(forecast_theta, test_data)
  1. Naive Forecast

Purpose: Uses the last observed value as the forecast.

forecast_naive <- naive(train_data, h = length(test_data))
accuracy_naive <- calculate_accuracy(forecast_naive, test_data)
  1. Seasonal Naive (Snaive)

Purpose: Forecasts by repeating the last observed seasonal pattern.

forecast_snaive <- snaive(train_data, h = length(test_data))
accuracy_snaive <- calculate_accuracy(forecast_snaive, test_data)
  1. Mean Forecast

Purpose: Uses the average of all past observations as the forecast.

forecast_mean <- meanf(train_data, h = length(test_data))
accuracy_mean <- calculate_accuracy(forecast_mean, test_data)
  1. Random Walk Forecast (RWF)

Purpose: Models series as a random walk, optionally with drift.

forecast_rwf <- rwf(train_data, h = length(test_data))
accuracy_rwf <- calculate_accuracy(forecast_rwf, test_data)
  1. Holt’s Linear Method (Holt)

Purpose: Extends simple exponential smoothing to capture trend.

forecast_holt <- holt(train_data, h = length(test_data))
accuracy_holt <- calculate_accuracy(forecast_holt, test_data)
  1. Holt-Winters (HW)

Purpose: Extends Holt’s method to capture both trend and seasonality.

forecast_hw <- hw(train_data, h = length(test_data))
accuracy_hw <- calculate_accuracy(forecast_hw, test_data)
  1. Bagged Forecasts

Purpose: Uses bootstrapped resampling to reduce forecast variance.

fit_bagged <- baggedModel(train_data)
forecast_bagged <- forecast(fit_bagged, h = length(test_data))
accuracy_bagged <- calculate_accuracy(forecast_bagged, test_data)
  1. Croston’s Method

Purpose: Designed for intermittent demand series (many zeros).

forecast_croston <- croston(train_data, h = length(test_data))
accuracy_croston <- calculate_accuracy(forecast_croston, test_data)
  1. StructTS (Structural Time Series Models)

Purpose: State-space models with trend and seasonal components.

forecast_structts <- croston(train_data, h = length(test_data))
accuracy_structts <- calculate_accuracy(forecast_croston, test_data)
  1. TSLM (Time-Series Linear Model)

Purpose: Fits linear regression models with time or season as predictors.

fit_tslm <- tslm(train_data ~ trend + season)
forecast_tslm <- forecast(fit_tslm, h = length(test_data))
accuracy_tslm <- calculate_accuracy(forecast_tslm, test_data)
  1. DHR (Dynamic Harmonic Regression)

Purpose: Handles complex seasonality using Fourier terms.

fit_dhr <- auto.arima(train_data, xreg = fourier(train_data, K = 4))
forecast_dhr <- forecast(fit_dhr, xreg = fourier(train_data, K = 4, h = length(test_data)))
accuracy_dhr <- calculate_accuracy(forecast_dhr, test_data)
  1. Bagged ETS (Bagged Exponential Smoothing Models)

Purpose: Improves ETS forecasts by using bootstrapped resamples of the original series and averaging results, reducing variance and increasing robustness.

fit_bagged_ets <- baggedETS(train_data)
forecast_bagged_ets <- forecast(fit_bagged_ets, h = length(test_data))
accuracy_bagged_ets <- calculate_accuracy(forecast_bagged_ets, test_data)
  1. Combined Forecasts

Purpose: Combines forecasts from multiple models to improve accuracy.

combined_forecast <- (forecast_auto_arima$mean + forecast_ets$mean + forecast_stlf$mean) / 3
accuracy_combined <- calculate_accuracy(combined_forecast, test_data)

RMSE (Root Mean Squared Error) and MAPE (Mean Absolute Percentage Error) Accuracy Results

Note. Test RMSE (preferred!) tells you how well the model predicts unseen data → much more realistic for model comparison.

RMSE_accuracy_results <- data.frame(
  Model = c("Auto ARIMA", "ETS", "STLF", "NNETAR", "TBATS", "Theta", "Naive", "Snaive", 
            "Mean", "RWF", "Holt", "HW", "Bagged", "Croston", "StructTS", "TSLM", "DHR", 
            "Bagged ETS", "Combined"),
  RMSE = c(accuracy_auto_arima[2,2], accuracy_ets[2,2], accuracy_stlf[2,2], 
           accuracy_nnetar[2,2], accuracy_tbats[2,2], 
           accuracy_theta[2,2], accuracy_naive[2,2], accuracy_snaive[2,2], 
           accuracy_mean[2,2], accuracy_rwf[2,2], accuracy_holt[2,2], accuracy_hw[2,2], 
           accuracy_bagged[2,2], accuracy_croston[2,2], accuracy_structts[2,2], 
           accuracy_tslm[2,2], accuracy_dhr[2,2], 
           accuracy_bagged_ets[2,2], accuracy_combined[1,2])
)

# View best forecasting algorithms
RMSE_accuracy_results %>% 
            arrange(desc(RMSE))
##         Model      RMSE
## 1        Mean 226.26567
## 2       Naive 102.97653
## 3         RWF 102.97653
## 4     Croston  96.05045
## 5    StructTS  96.05045
## 6        Holt  93.30006
## 7          HW  53.79669
## 8      Snaive  50.70832
## 9        TSLM  49.47908
## 10       STLF  34.19514
## 11      Theta  30.71807
## 12        ETS  27.39804
## 13 Auto ARIMA  23.93170
## 14     Bagged  22.53056
## 15 Bagged ETS  22.48688
## 16   Combined  22.25598
## 17      TBATS  22.21373
## 18        DHR  20.43343
## 19     NNETAR  17.68679

Note. If you have a train/test split, you should always use the test dataset RMSE, MAPE, etc. for model selection.

MAPE_accuracy_results <- data.frame(
  Model = c("Auto ARIMA", "ETS", "STLF", "NNETAR", "TBATS", "Theta", "Naive", "Snaive", 
            "Mean", "RWF", "Holt", "HW", "Bagged", "Croston", "StructTS", "TSLM", "DHR", 
            "Bagged ETS", "Combined"),
  MAPE = c(accuracy_auto_arima[2,5], accuracy_ets[2,5], accuracy_stlf[2,5], 
           accuracy_nnetar[2,5], accuracy_tbats[2,5], accuracy_theta[2,5], 
           accuracy_naive[2,5], accuracy_snaive[2,5], 
           accuracy_mean[2,5], accuracy_rwf[2,5], accuracy_holt[2,5], accuracy_hw[2,5], 
           accuracy_bagged[2,5], accuracy_croston[2,5], accuracy_structts[2,5], 
           accuracy_tslm[2,5], accuracy_dhr[2,5], 
           accuracy_bagged_ets[2,5], accuracy_combined[1,5])
)

# View best forecasting algorithms
MAPE_accuracy_results %>% 
             arrange(desc(MAPE))
##         Model      MAPE
## 1        Mean 43.621522
## 2       Naive 14.251338
## 3         RWF 14.251338
## 4     Croston 12.898577
## 5    StructTS 12.898577
## 6        Holt 12.540598
## 7      Snaive  9.987533
## 8          HW  7.689038
## 9        TSLM  7.073539
## 10      Theta  5.327789
## 11       STLF  5.116247
## 12        ETS  4.655647
## 13 Auto ARIMA  4.182395
## 14   Combined  3.605321
## 15        DHR  3.391703
## 16      TBATS  3.354141
## 17 Bagged ETS  3.332271
## 18     Bagged  3.295546
## 19     NNETAR  3.267182

Note. A lower RMSE indicates a better fit to the data and more accurate predictions, meaning it’s considered “best” when compared to other models in the analysis.

Plot worst and best Forecasts.

# Worst model accuracy using RMSE score
  autoplot(forecast_mean) + ggtitle("Mean Model Passangers Forecast") 

# Best model accuracy using RMSE score
  autoplot(forecast_nnetar) + ggtitle("NNETAR Model Passangers Forecast")

Conclusion

When evaluating forecasting methods, accuracy is a key factor in determining the most appropriate model for a given dataset. In the case of the AirPassengers series, the NNETAR model achieved the best accuracy, producing the lowest RMSE by effectively capturing nonlinear patterns and seasonality.

In contrast, the Mean model performed the worst, as it oversimplified the data by relying on the historical average, failing to adapt to the strong upward trend and seasonal fluctuations. This comparison highlights the importance of selecting models that align with the structure of the data, where advanced approaches like NNETAR outperform simpler benchmarks such as the Mean model in complex time-series forecasting tasks.

A.M.D.G