1 Introduction

In this session, we will learn about time series analysis and perform forecasting using several exponential smoothing methods:

  • Simple Exponential Smoothing
  • Holt's Linear Trend Method
  • Holt-Winters Method

1.1 Data Description

The dataset contains annual sales data from 2000 to 2016. Based on this historical data, the objective of the analysis is to forecast sales for the next two years, 2017 and 2018.

1.2 Library

# import libs
library(tidyverse)
library(lubridate)
library(forecast)
library(TTR)
library(fpp)
library(tseries)
library(TSstudio)
library(padr)

1.3 Data

year <- c(2000,2001,2002,2003,2004,2005,2006,2007,
          2008,2009,2010,2011,2012,2013,2014,2015,
          2016)

sales <- c(156, 161, 189, 182, 224, 258, 283, 325,
        332, 388, 475, 502, 537, 584, 631,
        704, 689)

df <- data.frame(year, sales)
df
# create ts object
df_ts <- ts(data = df$sales, start = 2000, frequency = 1)
df_ts
#> Time Series:
#> Start = 2000 
#> End = 2016 
#> Frequency = 1 
#>  [1] 156 161 189 182 224 258 283 325 332 388 475 502 537 584 631 704 689
# visualize the data
df_ts %>% 
  autoplot()

Between 2000 and 2015, sales showed a steady upward trend. In 2016, however, sales fell slightly, from 704 to 689.

2 Forecasting Model

2.1 Exponential Smoothing

2.1.1 SES

Simple Exponential Smoothing (SES) is used when the data has no trend or seasonality. Forecasts are an exponentially weighted average of past observations, so all future point forecasts lie on a flat line at the last smoothed level.

# Apply SES, forecasting 2 steps ahead (2017 and 2018)
ses_model <- ses(df_ts, h = 2)
ses_model
#>      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
#> 2017       689.0015 632.7974 745.2056 603.0448 774.9582
#> 2018       689.0015 609.5209 768.4821 567.4464 810.5566
plot(ses_model)

accuracy(ses_model)
#>                    ME     RMSE      MAE     MPE     MAPE      MASE       ACF1
#> Training set 31.35529 41.19581 33.94409 8.12971 8.839146 0.9412573 -0.0117723
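
To make the mechanics concrete, below is a minimal sketch of the SES recursion with a hand-picked alpha of 0.8 (an assumption for illustration only; the value actually estimated by ses() can be inspected via ses_model$model). It shows why the point forecast is flat: every future step is just the last smoothed level.

# Minimal sketch of the SES recursion, assuming alpha = 0.8 for illustration
alpha <- 0.8
level <- numeric(length(df_ts))
level[1] <- df_ts[1]                      # initialise with the first observation
for (t in 2:length(df_ts)) {
  level[t] <- alpha * df_ts[t] + (1 - alpha) * level[t - 1]
}
level[length(df_ts)]                      # flat point forecast for 2017 and 2018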

2.1.2 Holt’s

Holt's linear trend method is used when the data has a trend but no seasonality; it extends SES with a smoothed trend component.

holt_model <- holt(df_ts, h = 2)
holt_model
#>      Point Forecast   Lo 80    Hi 80    Lo 95    Hi 95
#> 2017        731.895 694.600 769.1900 674.8572 788.9328
#> 2018        766.887 718.803 814.9711 693.3488 840.4253
plot(holt_model)

accuracy(holt_model)
#>                      ME     RMSE      MAE       MPE     MAPE     MASE
#> Training set -0.3203326 25.44848 19.17602 -1.728756 6.103914 0.531744
#>                      ACF1
#> Training set -0.003815317
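
As a rough illustration of how the trend component enters, here is a minimal sketch of Holt's level/trend recursion with hand-picked smoothing parameters (alpha = 0.8 and beta = 0.2 are assumptions for illustration only; the parameters actually estimated by holt() can be read from holt_model$model).

# Minimal sketch of Holt's recursion, assuming alpha = 0.8 and beta = 0.2
alpha <- 0.8; beta <- 0.2
n <- length(df_ts)
level <- numeric(n); trend <- numeric(n)
level[1] <- df_ts[1]
trend[1] <- df_ts[2] - df_ts[1]           # crude initial trend
for (t in 2:n) {
  level[t] <- alpha * df_ts[t] + (1 - alpha) * (level[t - 1] + trend[t - 1])
  trend[t] <- beta * (level[t] - level[t - 1]) + (1 - beta) * trend[t - 1]
}
level[n] + (1:2) * trend[n]               # point forecasts for 2017 and 2018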

2.1.3 Holt-Winters’

The Holt-Winters method is used when the data has both trend and seasonality; it adds a smoothed seasonal component on top of Holt's level and trend.

# Holt-Winters needs a seasonal period. Our data is annual (no real seasonality),
# so purely for demonstration we re-index the same values as a monthly series
df_monthly <- ts(sales, start = 2000, frequency = 12)

hw_model <- hw(df_monthly, seasonal = "additive", h = 12)
hw_model
#>          Point Forecast    Lo 80     Hi 80     Lo 95     Hi 95
#> Jun 2001       756.9330 724.8649  789.0012 707.88909  805.9770
#> Jul 2001       791.0206 737.5729  844.4683 709.27940  872.7618
#> Aug 2001       825.1081 733.1534  917.0629 684.47544  965.7409
#> Sep 2001       859.1957 718.1915 1000.1999 643.54844 1074.8429
#> Oct 2001       893.2832 695.3114 1091.2551 590.51142 1196.0551
#> Nov 2001       927.3708 665.7519 1188.9897 527.25919 1327.4824
#> Dec 2001       961.4583 630.2579 1292.6587 454.93103 1467.9856
#> Jan 2002       995.5459 589.3472 1401.7445 374.31868 1616.7731
#> Feb 2002      1029.6334 543.4112 1515.8557 286.02069 1773.2462
#> Mar 2002      1063.7210 492.7616 1634.6804 190.51395 1936.9280
#> Apr 2002      1097.8085 437.6557 1757.9614  88.19191 2107.4251
#> May 2002      1131.8961 378.3112 1885.4810 -20.61253 2284.4047
plot(hw_model)
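
To see what the additive fit actually estimated, the final level, trend, and seasonal states can be pulled out of the underlying model object. This assumes the ets representation that hw() uses by default, so treat it as a sketch and verify the column names on your own output.

# Final smoothed states of the additive Holt-Winters fit: level, trend and
# the twelve seasonal indices as of the last observation
tail(hw_model$model$states, 1)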

2.2 Moving Average & Trend Extrapolation

2.2.1 Moving Average (Simple)

# Moving average (e.g., 2-period)
ma2 <- SMA(df_ts, n = 2)
ma2
#> Time Series:
#> Start = 2000 
#> End = 2016 
#> Frequency = 1 
#>  [1]    NA 158.5 175.0 185.5 203.0 241.0 270.5 304.0 328.5 360.0 431.5 488.5
#> [13] 519.5 560.5 607.5 667.5 696.5
plot(df_ts, type = "l")
lines(ma2, col = "blue", lwd = 2)
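
Each value of the 2-period moving average is just the unweighted mean of the current and previous observation; for instance, the final entry (696.5 for 2016) can be reproduced by hand:

# The 2016 entry of ma2 is the mean of the 2015 and 2016 sales figures
mean(tail(df$sales, 2))   # (704 + 689) / 2 = 696.5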

2.2.2 Trend Extrapolation

# Fit a linear trend: sales as a function of year
model_linear <- lm(sales ~ year)
summary(model_linear)
#> 
#> Call:
#> lm(formula = sales ~ year)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -57.412 -21.652   1.132  18.676  63.804 
#> 
#> Coefficients:
#>               Estimate Std. Error t value          Pr(>|t|)    
#> (Intercept) -74211.725   3328.981  -22.29 0.000000000000651 ***
#> year            37.152      1.658   22.41 0.000000000000603 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 33.49 on 15 degrees of freedom
#> Multiple R-squared:  0.971,  Adjusted R-squared:  0.9691 
#> F-statistic: 502.2 on 1 and 15 DF,  p-value: 0.0000000000006034
# Predict future years
future <- data.frame(year = c(2017, 2018))
predict(model_linear, newdata = future)
#>        1        2 
#> 723.7794 760.9314
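
The prediction is simply the fitted line evaluated at the new year; recomputing the 2017 value from the rounded coefficients printed above gives roughly the same number (the small difference comes only from rounding the coefficients).

# 2017 forecast recomputed from the rounded coefficients shown in summary()
-74211.725 + 37.152 * 2017   # approximately 723.86, versus 723.78 from predict()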

2.3 Autocorrelation & ACF

Use the ACF (autocorrelation function) to detect patterns in the series.

acf(df_ts, main = "ACF Plot of Time Series")

Interpretation:

  • If significant spikes (outside blue bounds) → data is autocorrelated
  • Lag 1 spike: short-term dependence
  • Slowly declining spikes: possible trend
  • Repeating pattern: seasonality
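
If the visual read is ambiguous, a Ljung-Box test from base R can complement the plot (this is an extra check, not part of the original workflow): a small p-value indicates significant autocorrelation.

# Ljung-Box test for autocorrelation up to lag 5 (the lag choice here is arbitrary)
Box.test(df_ts, lag = 5, type = "Ljung-Box")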

2.4 Forecast Accuracy Measures

2.4.1 Trend Extrapolation

accuracy(model_linear)
#>                                   ME     RMSE      MAE       MPE     MAPE
#> Training set 0.000000000000002507632 31.45557 25.26759 0.7846039 9.524381
#>                  MASE
#> Training set 0.153837

2.4.2 SES

accuracy(ses_model)
#>                    ME     RMSE      MAE     MPE     MAPE      MASE       ACF1
#> Training set 31.35529 41.19581 33.94409 8.12971 8.839146 0.9412573 -0.0117723

2.4.3 Holt’s

accuracy(holt_model)
#>                      ME     RMSE      MAE       MPE     MAPE     MASE
#> Training set -0.3203326 25.44848 19.17602 -1.728756 6.103914 0.531744
#>                      ACF1
#> Training set -0.003815317

2.4.4 Holt-Winters

accuracy(hw_model)
#>                    ME     RMSE      MAE      MPE     MAPE       MASE
#> Training set 5.132598 25.02288 19.90842 2.400402 5.609929 0.04457775
#>                     ACF1
#> Training set 0.009726304
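
For convenience, the training RMSEs reported above can be collected into a single table (a small helper sketch, not part of the original write-up):

# Gather the training-set RMSE of each fitted model for side-by-side comparison
data.frame(
  model = c("Linear trend", "SES", "Holt", "Holt-Winters"),
  RMSE  = c(accuracy(model_linear)["Training set", "RMSE"],
            accuracy(ses_model)["Training set", "RMSE"],
            accuracy(holt_model)["Training set", "RMSE"],
            accuracy(hw_model)["Training set", "RMSE"])
)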

Based on the accuracy measures above, it can be concluded that the Holt-Winters model is the most suitable for forecasting the next two years (2017 and 2018), as it has the lowest RMSE (25.02), narrowly ahead of Holt's method (25.45). Note, however, that the Holt-Winters model was fit to the series re-indexed as monthly, so its accuracy is not measured on exactly the same annual series as the other models.