Stat 142 (Time Series Analysis)

Lesson 2.2 (Exponential Smoothing)

Author

Norberto E. Milla, Jr.

Published

March 26, 2023

Code
library(fpp2)
library(fpp3)
library(forecast)
library(tidyverse)
library(timetk)
library(seasonal)
library(TSstudio)
library(ggfortify)

1 Introduction

  • Exponential smoothing can be traced back to the works of Brown (1959), Holt (1957), and Winters (1960)

  • Forms the basis of some of the most successful forecasting methods

  • Forecasts produced using exponential smoothing methods are weighted averages of past observations, with the weights decaying exponentially as the observations get older.

  • The more recent the observation, the higher the associated weight.

  • This framework generates reliable forecasts quickly and for a wide range of time series, which is a great advantage and of major importance to applications in industry.

2 Simple exponential smoothing

  • The simplest of the exponential smoothing methods is naturally called simple exponential smoothing (SES)

  • This method is suitable for forecasting data with no clear trend or seasonal pattern.

  • For example, the data in Figure 1 do not display any clear trending behavior or any seasonality.

Code
algeria_economy <- global_economy  %>% 
  filter(Country == "Algeria")

alger_econ.ts <- ts(algeria_economy$Exports, 
                    frequency = 1, 
                    start = 1960, 
                    end = 2017)

autoplot(alger_econ.ts) +
         labs(x = "Year",
              y = "Exports (% of GDP)",
              title = "Export: Algeria") +
  scale_x_continuous(breaks = seq(1960, 2017, 5))

Naive method

  • Using the naive method, all forecasts for the future are equal to the last observed value of the series,

\hat{y}_{T+h|T} = y_T, \; h = 1, 2, \cdots

  • The naive method assumes that the most recent observation is the only important one, and all previous observations provide no information for the future
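
As a minimal sketch of the naive method (reusing the alger_econ.ts series created above and the naive() function from the forecast package), every future point forecast simply repeats the last observation:

Code
# Naive forecasts: all future values equal the last observed value of the series
naive_fc <- forecast::naive(alger_econ.ts, h = 5)
print(naive_fc$mean)       # five identical point forecasts
tail(alger_econ.ts, 1)     # the last observed value (2017), for comparison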

Average method

  • Using the average method, all future forecasts are equal to a simple average of the observed data

\hat{y}_{T+h|T} = \frac{1}{T} \sum_{t=1}^T y_t, \; h = 1, 2, \cdots

  • The average method assumes that all observations are of equal importance, and gives them equal weights when generating forecasts

  • We often want something between these two extremes, say, assigning larger weights to more recent observations than to observations from the distant past

  • In SES, forecasts are calculated using weighted averages, where the weights decrease exponentially as observations come from further in the past — the smallest weights are associated with the oldest observations

\hat{y}_{T+1|T}=\alpha y_T+\alpha (1-\alpha) y_{T-1}+\alpha (1-\alpha)^2 y_{T-2}+\cdots, \tag{1}

where 0 \leq \alpha \leq 1 is the smoothing parameter

  • The one-step-ahead forecast for time T + 1 is a weighted average of all of the observations in the series y_1, y_2, \cdots, y_T. The rate at which the weights decrease is controlled by the parameter \alpha
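
The following sketch illustrates both ideas on the Algerian exports series: meanf() from the forecast package produces the equal-weight (average method) forecasts, and the weights \alpha(1-\alpha)^j show how quickly SES discounts older observations (the choices of \alpha below are arbitrary illustrations):

Code
# Average-method forecasts: every future value equals the mean of the observed data
mean_fc <- forecast::meanf(alger_econ.ts, h = 5)
print(mean_fc$mean)

# SES weights alpha*(1 - alpha)^j decay exponentially with the age j of the observation
ses_weights <- function(alpha, j_max = 9) alpha * (1 - alpha)^(0:j_max)
round(rbind("alpha = 0.2" = ses_weights(0.2),
            "alpha = 0.8" = ses_weights(0.8)), 4)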

Weighted average form

The forecast at time T+1 is equal to a weighted average between the most recent observation y_T and the previous forecast \hat{y}_{T|T-1}:

\hat{y}_{T+1|T} = \alpha y_T + (1-\alpha)\hat{y}_{T|T-1}

where 0 \leq \alpha \leq 1 is the smoothing parameter. Similarly, we can write the fitted values as

\hat{y}_{t+1|t} = \alpha y_t + (1-\alpha)\hat{y}_{t|t-1}, \; t = 1, 2, \cdots, T

  • Recall that fitted values are simply one-step forecasts of the training data

  • The process has to start somewhere, so we let the first fitted value at time 1 be denoted by \ell_0 (which we will have to estimate)

  • Then

\begin{align} \hat{y}_{2|1} &= \alpha y_1 + (1-\alpha) \ell_0 \notag \\ \hat{y}_{3|2} &= \alpha y_2 + (1-\alpha) \hat{y}_{2|1} \notag \\ \hat{y}_{4|3} &= \alpha y_3 + (1-\alpha) \hat{y}_{3|2} \notag \\ &\vdots \notag \\ \hat{y}_{T|T-1} &= \alpha y_{T-1} + (1-\alpha) \hat{y}_{T-1|T-2} \notag \\ \end{align}

Substituting each equation into the following equation, we obtain

\begin{align} \hat{y}_{3|2} &= \alpha y_2 + (1-\alpha) [\alpha y_1 + (1-\alpha) \ell_0] \notag \\ &= \alpha y_2 + \alpha(1-\alpha)y_1 + (1-\alpha)^2 \ell_0 \notag \\ \hat{y}_{4|3} &= \alpha y_3 + (1-\alpha)[\alpha y_2 + \alpha(1-\alpha)y_1 + (1-\alpha)^2 \ell_0] \notag \\ &= \alpha y_3 + \alpha(1-\alpha) y_2 + \alpha(1-\alpha)^2 y_1 + (1-\alpha)^3 \ell_0 \notag \\ &\vdots \notag \\ \hat{y}_{T+1|T} &= \sum_{j=0}^{T-1} \alpha(1-\alpha)^j y_{T-j} + (1-\alpha)^T \ell_0 \notag \end{align}

The last term gets smaller and smaller as T gets larger, so the weighted average form leads to the same forecast as Equation (1)
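
A quick numerical check of this equivalence is sketched below: the recursive weighted-average form and the explicit weighted sum of Equation (1) give the same one-step-ahead forecast (the values \alpha = 0.3 and \ell_0 = y_1 are arbitrary choices for illustration, not the estimates used later):

Code
y     <- as.numeric(alger_econ.ts)
T.len <- length(y)
alpha <- 0.3     # arbitrary smoothing parameter, for illustration only
l0    <- y[1]    # arbitrary initial level, for illustration only

# Recursive form: yhat_{t+1|t} = alpha*y_t + (1 - alpha)*yhat_{t|t-1}, starting from l0
yhat <- l0
for (t in 1:T.len) yhat <- alpha * y[t] + (1 - alpha) * yhat

# Explicit weighted sum: sum_j alpha*(1 - alpha)^j * y_{T-j} + (1 - alpha)^T * l0
j <- 0:(T.len - 1)
yhat_sum <- sum(alpha * (1 - alpha)^j * y[T.len - j]) + (1 - alpha)^T.len * l0

print(c(recursive = yhat, weighted_sum = yhat_sum))   # the two forms coincide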

Component form

An alternative representation is the component form. For simple exponential smoothing, the only component included is the level, \ell_t. (Other methods which are considered later in this course may also include a trend b_t and a seasonal component s_t.)

  • Component form representations of exponential smoothing methods comprise a forecast equation and a smoothing equation for each of the components included in the method.

  • The component form of simple exponential smoothing is given by

\begin{align} \text{Forecast equation}: \hat{y}_{t+h|t} &= \ell_t \notag \\ \text{Smoothing equation}: \ell_t &= \alpha y_t + (1-\alpha) \ell_{t-1} \notag \end{align}

where \ell_t is the level (or the smoothed value) of the series at time t.

  • Setting h = 1 gives the fitted values, while setting t = T gives the true forecasts beyond the training data

  • The forecast equation shows that the forecast value at time t + 1 is the estimated level at time t

  • The smoothing equation for the level (usually referred to as the level equation) gives the estimated level of the series at each period t.

  • If we replace \ell_t with \hat{y}_{t+1|t} and \ell_{t-1} with \hat{y}_{t|t-1} in the smoothing equation, we will recover the weighted average form of simple exponential smoothing

  • The component form of simple exponential smoothing is not particularly useful on its own, but it will be the easiest form to use when we start adding other components

Optimization

  • The application of every exponential smoothing method requires the smoothing parameters and the initial values to be chosen; for SES, these are \alpha and \ell_0

  • In some cases, the smoothing parameters may be chosen in a subjective manner — the forecaster specifies the value of the smoothing parameters based on previous experience.

  • Default values that have been shown to work well are in the range 0.1-0.2

  • A more reliable and objective way to obtain values for the unknown parameters is to estimate them from the observed data

  • The optimal values of \alpha and \ell_0 can be found by minimizing the sum of squared errors (equivalently, the RMSE) over the training set; a short grid-search sketch is given after this list

  • In R, forecasting using simple exponential smoothing can be done via the ets() function in the forecast package

  • ets stands for error, trend, and seasonality

  • Applying this function to the training series, and then calling forecast() on the fitted model, yields fitted values and residuals for the training period and forecasts for the validation period
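
The grid-search idea mentioned above can be sketched as follows (applied to the full Algerian exports series purely for illustration; the grid of \alpha values is an arbitrary choice, and ets() estimates \ell_0 for each fixed \alpha):

Code
# Illustrative grid search: fit ETS(A,N,N) for a range of alpha values
# and keep the value with the smallest training RMSE
alphas <- seq(0.05, 0.95, by = 0.05)
rmse <- sapply(alphas, function(a) {
  fit <- forecast::ets(alger_econ.ts, model = "ANN", alpha = a)
  sqrt(mean(residuals(fit)^2))
})
print(data.frame(alpha = alphas, RMSE = rmse)[which.min(rmse), ])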

2.1 Example of SES implementation with fixed smoothing parameter

Code
alg_econ_split <- alger_econ.ts %>% 
  TSstudio::ts_split(sample.out = 12)
train <- alg_econ_split$train
test <- alg_econ_split$test
alger_econ.ses <- forecast::ets(train, 
                      model = "ANN", 
                      alpha = 0.15)
alger_econ.ses.fc <- forecast::forecast(alger_econ.ses, 
                                h = 12, 
                                level = 0)
print(alger_econ.ses)
ETS(A,N,N) 

Call:
 forecast::ets(y = train, model = "ANN", alpha = 0.15) 

  Smoothing parameters:
    alpha = 0.15 

  Initial states:
    l = 29.1981 

  sigma:  7.6381

     AIC     AICc      BIC 
365.1221 365.4012 368.7794 
Code
print(alger_econ.ses.fc)
     Point Forecast     Lo 0     Hi 0
2006       34.93181 34.93181 34.93181
2007       34.93181 34.93181 34.93181
2008       34.93181 34.93181 34.93181
2009       34.93181 34.93181 34.93181
2010       34.93181 34.93181 34.93181
2011       34.93181 34.93181 34.93181
2012       34.93181 34.93181 34.93181
2013       34.93181 34.93181 34.93181
2014       34.93181 34.93181 34.93181
2015       34.93181 34.93181 34.93181
2016       34.93181 34.93181 34.93181
2017       34.93181 34.93181 34.93181
Code
a <- algeria_economy %>% 
  filter(Year > 2005) %>%    # actual exports for the test years 2006-2017
  select(Exports)

b <- as.data.frame(alger_econ.ses.fc$mean)   # SES point forecasts for the same years
ab <- as.data.frame(bind_cols(a, b))
print(ab)
    Exports Year        x
1  48.81069 2006 34.93181
2  47.06816 2007 34.93181
3  47.97335 2008 34.93181
4  35.37165 2009 34.93181
5  38.44455 2010 34.93181
6  38.78695 2011 34.93181
7  36.89055 2012 34.93181
8  33.20990 2013 34.93181
9  30.21912 2014 34.93181
10 23.17178 2015 34.93181
11 20.86001 2016 34.93181
12 22.63889 2017 34.93181
Code
autoplot(alger_econ.ses.fc) +
  labs(x = "Year",
       y = "Exports (% of GDP)",
       title = "SES with fixed smoothing parameter (alpha = 0.15)") +
  scale_x_continuous(breaks = seq(1960, 2017, 5)) +
  geom_line(aes(y = alger_econ.ses.fc$fitted), 
                col = "red",
                linetype = "dashed")
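
As a quick (optional) check of how well the fixed-alpha forecasts track the held-out years, forecast::accuracy() can compare them with the test split created above; this is a sketch, not part of the original output:

Code
# Training- and test-set accuracy of the fixed-alpha SES forecasts (2006-2017)
forecast::accuracy(alger_econ.ses.fc, test)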

2.2 Example of SES implementation with optimal smoothing parameter

Code
alger_econ.opt <- forecast::ets(train, 
                      model = "ANN")
alger_econ.ses.fc1 <- forecast::forecast(alger_econ.opt, 
                                h = 12, 
                                level = 0)
print(alger_econ.opt)
ETS(A,N,N) 

Call:
 forecast::ets(y = train, model = "ANN") 

  Smoothing parameters:
    alpha = 0.7783 

  Initial states:
    l = 39.3919 

  sigma:  6.2503

     AIC     AICc      BIC 
348.6747 349.2461 354.1606 
Code
print(alger_econ.ses.fc1)
     Point Forecast     Lo 0     Hi 0
2006       45.50503 45.50503 45.50503
2007       45.50503 45.50503 45.50503
2008       45.50503 45.50503 45.50503
2009       45.50503 45.50503 45.50503
2010       45.50503 45.50503 45.50503
2011       45.50503 45.50503 45.50503
2012       45.50503 45.50503 45.50503
2013       45.50503 45.50503 45.50503
2014       45.50503 45.50503 45.50503
2015       45.50503 45.50503 45.50503
2016       45.50503 45.50503 45.50503
2017       45.50503 45.50503 45.50503
Code
autoplot(alger_econ.ses.fc1) +
  labs(x = "Year",
       y = "Exports (% of GDP)",
       title = "SES with optimal smoothing parameter") +
  scale_x_continuous(breaks = seq(1960, 2017, 5)) +
  geom_line(aes(y = alger_econ.ses.fc1$fitted), 
                col = "red",
                linetype = "dashed")
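
To compare the two SES fits on the same held-out years, here is a short sketch using forecast::accuracy() (the column selection is just for readability):

Code
# Test-set accuracy: fixed alpha = 0.15 versus the optimized alpha
forecast::accuracy(alger_econ.ses.fc, test)["Test set", c("RMSE", "MAE", "MAPE")]
forecast::accuracy(alger_econ.ses.fc1, test)["Test set", c("RMSE", "MAE", "MAPE")]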

3 Holt’s linear trend method

  • Previously, we learned that both moving average and simple exponential smoothing should only be used for forecasting series with no trend or seasonality

  • One solution for forecasting series with trend and/or seasonality is first to remove those components (e.g., via differencing)

  • Another solution is to use a more sophisticated version of exponential smoothing, which can capture trend and/or seasonality

  • Holt (1957) extended simple exponential smoothing to allow the forecasting of data with a trend

  • This method involves a forecast equation and two smoothing equations: one for the level and one for the trend \begin{align} \text{Forecast equation}: \hat{y}_{t+h|t} &= l_t + hb_t \notag \\ \text{Level equation}: l_t &= \alpha y_t + (1-\alpha)(l_{t-1}+b_{t-1}) \notag \\ \text{Trend equation}: b_t &= \beta^*(l_t - l_{t-1}) + (1-\beta^*)b_{t-1} \notag \end{align}

    • Here l_t denotes an estimate of the level of the series at time t, b_t denotes an estimate of the trend (slope) of the series at time t, \alpha is the smoothing parameter for the level, 0 \leq \alpha \leq 1, and \beta^* is the smoothing parameter for the trend, 0 \leq \beta^* \leq 1
  • As with simple exponential smoothing, the level equation shows that l_t is a weighted average of the observation y_t and the one-step-ahead forecast for time t, which is given by l_{t-1}+b_{t-1}

  • This method is also known as double exponential smoothing

  • The trend equation shows that b_t is a weighted average of the estimated trend at time t based on l_t - l_{t-1} and b_{t-1}, the previous estimate of trend

  • The forecast function is no longer flat but trending

  • The h-step-ahead forecast is equal to the last estimated level plus h times the last estimated trend value

  • Hence, the forecasts are a linear function of h
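
The recursions above can also be written out by hand. The sketch below implements Holt's equations directly; the parameter values \alpha = 0.8 and \beta^* = 0.2 and the initializations \ell_0 = y_1, b_0 = y_2 - y_1 are illustrative assumptions, not estimates:

Code
# Minimal hand-rolled Holt recursion (illustrative parameter values and initial states)
holt_sketch <- function(y, alpha = 0.8, beta = 0.2, h = 5) {
  l <- y[1]           # initial level (assumption)
  b <- y[2] - y[1]    # initial trend (assumption)
  for (t in seq_along(y)) {
    l_prev <- l
    l <- alpha * y[t] + (1 - alpha) * (l_prev + b)   # level equation
    b <- beta * (l - l_prev) + (1 - beta) * b        # trend equation
  }
  l + (1:h) * b       # h-step-ahead forecasts: last level plus h times last trend
}
holt_sketch(as.numeric(window(ausair, start = 1990)))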

3.1 An example

  • To illustrate these ideas, consider the Australian air passenger time series
Code
air <- window(ausair,start=1990)#From 1990 to 2016
fc <- holt(air,h=5)#Forecast from 2017 to 2021
fc$model#Gives the parameter estimates and the initial state (t=0) values
Holt's method 

Call:
 holt(y = air, h = 5) 

  Smoothing parameters:
    alpha = 0.8302 
    beta  = 1e-04 

  Initial states:
    l = 15.5715 
    b = 2.1017 

  sigma:  2.3645

     AIC     AICc      BIC 
141.1291 143.9863 147.6083 
Code
print(fc)#displays the 5-year forecast
     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
2017       74.60130 71.57106 77.63154 69.96695 79.23566
2018       76.70304 72.76440 80.64169 70.67941 82.72668
2019       78.80478 74.13092 83.47864 71.65673 85.95284
2020       80.90652 75.59817 86.21487 72.78810 89.02494
2021       83.00826 77.13343 88.88310 74.02348 91.99305
Code
autoplot(air) +
  autolayer(fc, PI=T)

4 Damped trend methods

  • The forecasts generated by Holt’s linear method display a constant trend (increasing or decreasing) indefinitely into the future

  • Empirical evidence indicates that these methods tend to over-forecast, especially for longer forecast horizons

  • Gardner & McKenzie (1985) introduced a parameter that “dampens” the trend to a flat line some time in the future

  • In conjunction with the smoothing parameters \alpha and \beta^* this method also includes a damping parameter 0 < \phi < 1 \begin{align} \hat{y}_{t+h|t} &= l_t + (\phi + \phi^2 + \cdots + \phi^h)b_t \notag \\ l_t &= \alpha y_t + (1-\alpha)(l_{t-1} + \phi b_{t-1}) \notag \\ b_t &= \beta^*(l_t - l_{t-1}) + (1-\beta^*)\phi b_{t-1} \notag \end{align}

  • If \phi=1 the method is identical to Holt’s linear method

  • In practice, \phi is rarely less than 0.8 as the damping has a very strong effect for smaller values

  • Values of \phi close to 1 mean that a damped model cannot be distinguished from a non-damped model

  • For these reasons, we usually restrict 0.8 < \phi < 0.98
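
The effect of damping is easiest to see in the multiplier (\phi + \phi^2 + \cdots + \phi^h) that replaces h in the forecast equation; a small sketch:

Code
# Damped-trend multiplier phi + phi^2 + ... + phi^h for horizons h = 1, ..., 10
damped_multiplier <- function(phi, h) cumsum(phi^(1:h))
round(rbind("phi = 0.80" = damped_multiplier(0.80, 10),
            "phi = 0.95" = damped_multiplier(0.95, 10),
            "phi = 1.00" = damped_multiplier(1.00, 10)), 2)
# With phi = 1 the multiplier equals h (Holt's linear method);
# with phi < 1 it levels off, so the forecasts flatten out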

4.1 An example

Code
fc1 <- holt(air, h=15)
fc2 <- holt(air, damped=TRUE, phi = 0.9, h=15)
autoplot(air) +
  autolayer(fc1, series="Holt's method", PI=FALSE) +
  autolayer(fc2, series="Damped Holt's method", PI=FALSE) +
  ylab("Air passengers in Australia (millions)") +
  guides(colour=guide_legend(title="Forecast"))

4.2 One more example

  • In this example, we compare the forecasting performance of the three exponential smoothing methods that we have considered so far in forecasting the sheep livestock population in Asia

  • We will use the time series cross-validation function, tsCV(), from the forecast package to compare the one-step forecast accuracy of the three methods

Code
e1 <- tsCV(livestock, ses, h=1)
e2 <- tsCV(livestock, holt, h=1)
e3 <- tsCV(livestock, holt, damped=TRUE, h=1)
MSE1 <- mean(e1^2, na.rm=TRUE)
MSE2 <- mean(e2^2, na.rm=TRUE)
MSE3 <- mean(e3^2, na.rm=TRUE)
MAE1 <- mean(abs(e1), na.rm=TRUE)
MAE2 <- mean(abs(e2), na.rm=TRUE)
MAE3 <- mean(abs(e3), na.rm=TRUE)
print(cbind(MSE1,MSE2,MSE3,MAE1,MAE2,MAE3))
         MSE1    MSE2     MSE3    MAE1     MAE2     MAE3
[1,] 178.2531 173.365 162.6274 8.53246 8.803058 8.024192
  • Damped Holt’s method is best whether you compare MAE or MSE values
  • So we will proceed with using the damped Holt’s method and apply it to the whole data set to get forecasts for future years
Code
fc3 <- holt(livestock, damped=TRUE)
par.fc3 <- fc3$model$par
print(par.fc3)
       alpha         beta          phi            l            b 
9.998998e-01 2.806460e-04 9.797542e-01 2.233500e+02 6.904597e+00 
Code
print(fc3)
     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
2008       458.3355 441.8760 474.7951 433.1628 483.5083
2009       460.8784 437.5990 484.1578 425.2757 496.4811
2010       463.3697 434.8551 491.8844 419.7603 506.9792
2011       465.8107 432.8806 498.7407 415.4485 516.1728
2012       468.2022 431.3806 505.0237 411.8884 524.5159
2013       470.5452 430.2041 510.8864 408.8488 532.2417
2014       472.8409 429.2620 516.4198 406.1927 539.4891
2015       475.0900 428.4964 521.6837 403.8312 546.3489
2016       477.2937 427.8676 526.7198 401.7029 552.8844
2017       479.4527 427.3466 531.5588 399.7633 559.1421
Code
autoplot(fc3) +
  xlab("Year") + ylab("Livestock, sheep in Asia (millions)")

5 Holt-Winters’ seasonal method

  • Holt (1957) and Winters (1960) extended Holt’s method to capture seasonality

  • The Holt-Winters seasonal method comprises the forecast equation and three smoothing equations — one for the level l_t, one for the trend b_t, and one for the seasonal component s_t, and with corresponding smoothing parameters \alpha, \beta^*, and \gamma

  • We use m to denote the frequency of the seasonality, i.e., the number of seasons in a year

  • For example, for quarterly data m=4 and for monthly data m=12

  • There are two variations to this method that differ in the nature of the seasonal component

  • The additive method is preferred when the seasonal variations are roughly constant through the series, while the multiplicative method is preferred when the seasonal variations are changing proportional to the level of the series

  • With the additive method, the seasonal component is expressed in absolute terms in the scale of the observed series, and in the level equation the series is seasonally adjusted by subtracting the seasonal component

  • Within each year, the seasonal component will add up to approximately zero

  • With the multiplicative method, the seasonal component is expressed in relative terms (percentages), and the series is seasonally adjusted by dividing through by the seasonal component

  • Within each year, the seasonal component will sum up to approximately m

5.1 Holt-Winters’ additive method

  • The component form for the additive method is: \begin{align} \hat{y}_{t+h|t} &= \ell_t + hb_t + s_{t+h-m(k+1)} \notag \\ \ell_t &= \alpha (y_t - s_{t-m}) + (1-\alpha) (\ell_{t-1} + b_{t-1}) \notag \\ b_t &= \beta^{*}(\ell_t - \ell_{t-1}) + (1 - \beta^{*})b_{t-1} \notag \\ s_t &= \gamma (y_t - \ell_{t-1} - b_{t-1}) + (1 - \gamma)s_{t-m} \notag \end{align}

    • where k is the integer part of \frac{h-1}{m} which ensures that the estimates of the seasonal indices used for forecasting come from the final year of the sample
    • The level equation shows a weighted average between the seasonally adjusted observation (y_t - s_{t-m}) and the non-seasonal forecast (l_{t-1} + b_{t-1})
    • The trend equation is identical to Holt’s linear method
    • The seasonal equation shows a weighted average between the current seasonal index, (y_t - \ell_{t-1} - b_{t-1}), and the seasonal index of the same season last year (i.e., m time periods ago)
  • The equation for the seasonal component is often expressed as s_t = \gamma^* (y_t - \ell_t) + (1-\gamma^{*})s_{t-m}

  • If we substitute \ell_t from the smoothing equation for the level of the component form above, we get s_t = \gamma^*(1-\alpha)(y_t - \ell_{t-1} - b_{t-1}) + [1-\gamma^{*}(1-\alpha)]s_{t-m}

    • which is identical to the smoothing equation for the seasonal component we specify here, with \gamma=\gamma^{*}(1-\alpha)

    • the usual parameter restriction is 0 \leq \gamma^{*} \leq 1, which translates to 0 \leq \gamma \leq 1-\alpha

5.1.1 An example

  • As an example, let us apply the Holt-Winters method with additive seasonality to forecast quarterly visitor nights in Australia spent by international tourists
Code
aust <- window(austourists,start=2005)
fit1 <- hw(aust,seasonal="additive",h=8)#additive seasonality
summary(fit1)

Forecast method: Holt-Winters' additive method

Model Information:
Holt-Winters' additive method 

Call:
 hw(y = aust, h = 8, seasonal = "additive") 

  Smoothing parameters:
    alpha = 0.3063 
    beta  = 1e-04 
    gamma = 0.4263 

  Initial states:
    l = 32.2597 
    b = 0.7014 
    s = 1.3106 -1.6935 -9.3132 9.6962

  sigma:  1.9494

     AIC     AICc      BIC 
234.4171 239.7112 250.4748 

Error measures:
                      ME     RMSE      MAE        MPE     MAPE      MASE
Training set 0.008115785 1.763305 1.374062 -0.2860248 2.973922 0.4502579
                    ACF1
Training set -0.06272507

Forecasts:
        Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
2016 Q1       76.09837 73.60011 78.59664 72.27761 79.91914
2016 Q2       51.60333 48.99039 54.21626 47.60718 55.59947
2016 Q3       63.96867 61.24582 66.69153 59.80443 68.13292
2016 Q4       68.37170 65.54313 71.20027 64.04578 72.69762
2017 Q1       78.90404 75.53440 82.27369 73.75061 84.05747
2017 Q2       54.40899 50.95325 57.86473 49.12389 59.69409
2017 Q3       66.77434 63.23454 70.31414 61.36069 72.18799
2017 Q4       71.17737 67.55541 74.79933 65.63806 76.71667
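
As a quick (hedged) check of the earlier claim that the additive seasonal components add up to approximately zero within each year, we can inspect the final state vector stored in the fitted model (this assumes the usual states matrix returned by ets-based fits, with seasonal columns named s1, s2, ...):

Code
# Final state vector of the additive fit: level, trend, and the seasonal states
last_states <- tail(fit1$model$states, 1)
print(last_states)

# The seasonal states should sum to approximately zero
s_cols <- grep("^s", colnames(fit1$model$states))
sum(last_states[, s_cols])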

5.2 Holt-Winters’ multiplicative method

  • The component form for the multiplicative method is: \begin{align} \hat{y}_{t+h|t} &= (\ell_t + hb_t)s_{t+h-m(k+1)} \notag \\ \ell_t &= \alpha \frac{y_t}{s_{t-m}} + (1-\alpha)(\ell_{t-1} + b_{t-1}) \notag \\ b_t &= \beta^* (\ell_t - \ell_{t-1}) + (1-\beta^*)b_{t-1} \notag \\ s_t &= \gamma \frac{y_t}{\ell_{t-1}+b_{t-1}} + (1-\gamma) s_{t-m} \notag \end{align}

5.2.1 An example

  • Let us apply Holt-Winters’ method with multiplicative seasonality to the Australian international tourists data
Code
fit2 <- hw(aust,seasonal="multiplicative",h=8)#multiplicative seasonality
summary(fit2)

Forecast method: Holt-Winters' multiplicative method

Model Information:
Holt-Winters' multiplicative method 

Call:
 hw(y = aust, h = 8, seasonal = "multiplicative") 

  Smoothing parameters:
    alpha = 0.4406 
    beta  = 0.0134 
    gamma = 0.0023 

  Initial states:
    l = 32.4875 
    b = 0.6974 
    s = 1.0237 0.9618 0.7704 1.2442

  sigma:  0.0367

     AIC     AICc      BIC 
221.1313 226.4254 237.1890 

Error measures:
                     ME     RMSE     MAE           MPE    MAPE      MASE
Training set 0.09206228 1.575631 1.25496 -0.0006505533 2.70539 0.4112302
                    ACF1
Training set -0.07955726

Forecasts:
        Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
2016 Q1       80.08894 76.31865 83.85922 74.32278 85.85509
2016 Q2       50.15482 47.56655 52.74309 46.19640 54.11324
2016 Q3       63.34322 59.80143 66.88502 57.92652 68.75993
2016 Q4       68.17810 64.08399 72.27221 61.91670 74.43950
2017 Q1       83.80112 78.43079 89.17146 75.58790 92.01434
2017 Q2       52.45291 48.88795 56.01787 47.00077 57.90504
2017 Q3       66.21274 61.46194 70.96353 58.94702 73.47845
2017 Q4       71.23205 65.85721 76.60690 63.01194 79.45217
  • The code chunk below plots the series and the forecasts from the two methods
Code
autoplot(aust) +
  autolayer(fit1, series="HW additive forecasts", PI=FALSE) +
  autolayer(fit2, series="HW multiplicative forecasts", PI=FALSE) +
  xlab("Year") +
  ylab("Visitor nights (millions)") +
  ggtitle("International visitors nights in Australia") +
  guides(colour=guide_legend(title="Forecast"))
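
To go beyond the visual comparison, the two fits can also be compared numerically; the sketch below pulls out the training-set accuracy measures and the AICc values already reported in the summaries above:

Code
# Side-by-side comparison of the additive and multiplicative Holt-Winters fits
rbind(additive       = forecast::accuracy(fit1)[1, c("RMSE", "MAE", "MAPE")],
      multiplicative = forecast::accuracy(fit2)[1, c("RMSE", "MAE", "MAPE")])
c(AICc_additive = fit1$model$aicc, AICc_multiplicative = fit2$model$aicc)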

6 Holt-Winters’ damped method

  • Damping is possible with both additive and multiplicative Holt-Winters’ methods

  • Holt-Winters method with a damped trend and multiplicative seasonality often provides accurate and robust forecasts for seasonal data \begin{align} \hat{y}_{t+h|t} &= \left[\ell_t + (\phi + \phi^2 + \cdots + \phi^h)b_t \right] s_{t+h-m(k+1)} \notag \\ \ell_t &= \alpha \left(\frac{y_t}{s_{t-m}}\right) + (1-\alpha)({\ell_{t-1} + \phi b_{t-1}}) \notag \\ b_t &= \beta^* (\ell_t - \ell_{t-1}) + (1-\beta^*)\phi b_{t-1} \notag \\ s_t &= \gamma \frac{y_t}{(\ell_{t-1} + \phi b_{t-1})} + (1-\gamma) s_{t-m} \notag \end{align}

6.1 An example

  • The hyndsight data contains the daily pageviews on the Hyndsight blog of Prof. Rob Hyndman.

  • It is a daily time series from April 30, 2014 to April 30, 2015

Code
hs.ts <- subset(hyndsight,end=length(hyndsight)-35)
fit3 <- hw(hs.ts, damped = TRUE, seasonal="multiplicative", h=35)
summary(fit3)

Forecast method: Damped Holt-Winters' multiplicative method

Model Information:
Damped Holt-Winters' multiplicative method 

Call:
 hw(y = hs.ts, h = 35, seasonal = "multiplicative", damped = TRUE) 

  Smoothing parameters:
    alpha = 0.4189 
    beta  = 1e-04 
    gamma = 0.0964 
    phi   = 0.9533 

  Initial states:
    l = 1169.6237 
    b = 0.3841 
    s = 1.2235 1.0607 0.6176 0.7694 1.0574 1.1257
           1.1455

  sigma:  0.1984

     AIC     AICc      BIC 
5529.690 5530.842 5579.078 

Error measures:
                   ME     RMSE      MAE       MPE     MAPE      MASE      ACF1
Training set 2.262107 222.5991 156.6953 -2.139331 12.86201 0.7069489 0.1305779

Forecasts:
         Point Forecast     Lo 80    Hi 80      Lo 95    Hi 95
48.57143       2035.252 1517.6913 2552.813 1243.71112 2826.793
48.71429       1810.067 1309.5271 2310.608 1044.55716 2575.578
48.85714       1353.964  951.4964 1756.431  738.44305 1969.485
49.00000       1015.333  693.7201 1336.945  523.46870 1507.196
49.14286       1523.818 1012.9460 2034.689  742.50688 2305.129
49.28571       2009.983 1300.6290 2719.337  925.11963 3094.847
49.42857       2044.262 1288.1910 2800.333  887.95102 3200.573
49.57143       2038.604 1235.0056 2842.203  809.60616 3267.602
49.71429       1813.044 1070.2535 2555.834  677.04402 2949.044
49.85714       1356.187  780.1671 1932.207  475.24057 2237.133
50.00000       1016.997  570.1636 1463.831  333.62415 1700.370
50.14286       1526.313  833.9390 2218.687  467.41837 2585.208
50.28571       2013.270 1071.9700 2954.570  573.67575 3452.865
50.42857       2047.601 1062.3701 3032.833  540.82019 3554.382
50.57143       2041.933 1018.6513 3065.215  476.95855 3606.908
50.71429       1816.001  882.4209 2749.582  388.21331 3243.789
50.85714       1358.397  642.7819 2074.012  263.95814 2452.836
51.00000       1018.653  469.2741 1568.032  178.45058 1858.855
51.14286       1528.795  685.4631 2372.128  239.02995 2818.561
51.28571       2016.542  879.6943 3153.389  277.88373 3755.199
51.42857       2050.925  870.1707 3231.680  245.11690 3856.734
51.57143       2045.248  831.5772 3258.919  189.09882 3901.397
51.71429       1818.947  718.5171 2919.377  135.98477 3501.909
51.85714       1360.599  521.9036 2199.294   77.92525 2643.272
52.00000       1020.303  379.8370 1660.768   40.79500 1999.810
52.14286       1531.270  552.9350 2509.604   35.03593 3027.503
52.28571       2019.803  706.9877 3332.618   12.02521 4027.581
52.42857       2054.240  696.5298 3411.951  -22.19881 4130.680
52.57143       2048.553  661.2225 3435.884  -73.18613 4170.293
52.71429       1821.885  568.5476 3075.222  -94.92900 3738.699
52.85714       1362.795  410.8094 2314.781  -93.14132 2818.732
53.00000       1021.949  297.2978 1746.600  -86.30930 2130.207
53.14286       1533.739  430.1533 2637.325 -154.04978 3221.528
53.28571       2023.059  546.3999 3499.718 -235.29612 4281.414
53.42857       2057.550  534.5215 3580.579 -271.72132 4386.822
Code
autoplot(hyndsight) +
  autolayer(fit3, series="HW multiplicative damped", PI=FALSE)+
  guides(colour=guide_legend(title="Daily forecasts"))