DATA 624 Lecture #4

LOGAN THOMSON
September 26, 2017

Review:
Forecasting: Principles and Practice
Hyndman & Athanasopoulos

Ch. 1: What Can Be Forecast?

Forecasting is an important aid to effective and efficient planning

Predictability of a quantity/event depends on:

  • How well the factors that contribute to it are understood
  • How much data are available
  • Whether forecasts affect the target of what is being forecast

  • Good forecasts capture genuine patterns and relationships which exist in historical data

  • Do not replicate past events that will not occur again

Ch. 1: Getting Started

Depending on the application, organizations require forecasts of different lengths:

  • Short-term: Scheduling personnel, production, or transportation
  • Medium-term: Determining future resource requirements (e.g., raw materials)
  • Long-term: Strategic planning

Ch. 1: Getting Started

A forecasting task generally involves five basic steps:

  1. Problem Definition
  2. Information Gathering
  3. Exploratory Analysis
  4. Choosing & Fitting Models
  5. Using and Evaluating a Forecast Model

Ch. 2: The Forecaster's Toolbox - Graphics

Graphics enable features of the data to be visualized, including:

  • Patterns
  • Unusual observations
  • Changes over time
  • Relationships between variables

The type of data determines both the forecasting method and the appropriate graph

Ch. 2: Simple Forecasting Methods

Average Method:
\( \hat y_{T+h|T}=\bar{y}=(y_1 + \dots + y_T) / T \)

Naive Method:
\( \hat y_{T+h|T} = y_T \): all future values are set to the last observed value, \( y_T \)

Seasonal Naive Method:
\( \hat y_{T+h|T} = y_{T+h-km} \), where \( m \) is the seasonal period and \( k=\lfloor (h-1)/m \rfloor + 1 \)

Drift Method:
\( \hat y_{T+h|T} = y_T + \frac{h}{T-1} \displaystyle\sum_{t=2}^{T} (y_t - y_{t-1}) = y_T + h (\frac{y_T - y_1}{T-1}) \)
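
The four benchmark methods above can be sketched in a few lines of base R. This is an illustrative sketch, not the forecast package's implementation: the function names and the data vector are made up for the example, and \( k \) is taken as the integer part of \( (h-1)/m \) plus one.

```r
# Benchmark point forecasts for a numeric vector y and horizon h.
mean_forecast <- function(y, h) rep(mean(y), h)

naive_forecast <- function(y, h) rep(y[length(y)], h)

seasonal_naive_forecast <- function(y, h, m) {
  n <- length(y)
  k <- (seq_len(h) - 1) %/% m + 1   # integer part of (h - 1) / m, plus 1
  y[n + seq_len(h) - k * m]         # value from the same season, last full cycle
}

drift_forecast <- function(y, h) {
  n <- length(y)
  y[n] + seq_len(h) * (y[n] - y[1]) / (n - 1)   # line through first and last points
}

y <- c(10, 12, 14, 13, 15, 17, 16, 18)   # made-up data with period m = 4
mean_forecast(y, 2)
naive_forecast(y, 2)
seasonal_naive_forecast(y, 5, m = 4)
drift_forecast(y, 2)
```

The drift forecast only uses the first and last observations, which is why the sum of differences in the formula collapses to \( y_T - y_1 \).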

Ch. 2: Transformations and Adjustments

  • Mathematical Transformations
    • Logarithmic
    • Power (\( \sqrt{x}, x^3 \))
    • Inverse
  • Calendar Adjustments
    • Remove variation from uneven distribution of days per month
  • Population Adjustments
    • Use per-capita rather than totals
  • Inflation Adjustments
    • Adjusted to a base year: \( x_t = \frac{y_t}{z_t} \times z_{\text{base year}} \), where \( y_t \) is the original value and \( z_t \) is the price index
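
As a rough illustration of the adjustments listed above, the following base R sketch applies each one to made-up numbers. The function names and data are hypothetical and exist only for this example.

```r
# Inflation adjustment: y = nominal values, z = price index,
# base = position of the base year within z.
inflation_adjust <- function(y, z, base) y / z * z[base]

# Calendar adjustment: convert monthly totals to an average per day.
per_day <- function(monthly_total, days_in_month) monthly_total / days_in_month

# Population adjustment: convert totals to per-capita values.
per_capita <- function(total, population) total / population

nominal     <- c(100, 110, 121)   # made-up nominal values
price_index <- c(100, 110, 121)   # made-up price index
inflation_adjust(nominal, price_index, base = 1)
per_day(c(310, 280), c(31, 28))
per_capita(c(500, 600), c(50, 60))
```

In this toy case the nominal growth is entirely inflation, so the adjusted series is flat.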

Ch. 2: Evaluating Forecast Accuracy

  • Scale-dependent Errors
    • MAE: Mean absolute error = mean\( (|e_i|) \)
    • RMSE: Root mean squared error = \( \sqrt{mean(e_{i}^2)} \)
  • Percentage Errors
    • Given by \( p_i = 100e_{i}/y_{i} \)
    • MAPE: Mean absolute percentage error = mean\( (|p_i|) \)
  • Scaled Errors
    • \( q_j = \frac{e_j}{\frac{1}{T - 1} \sum_{t=2}^T |y_t - y_{t-1}|} \)
    • MASE: Mean absolute scaled error = mean\( (|q_j|) \)
    • MSSE: Mean squared scaled error = mean\( (q_{j}^2) \)
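
A minimal base R sketch of these measures follows. It assumes non-seasonal data, so the scaling term is the in-sample mean absolute error of the naive method, as in the formula for \( q_j \). The function name and the numbers are invented for illustration.

```r
accuracy_measures <- function(actual, forecast, train) {
  e <- actual - forecast             # forecast errors
  p <- 100 * e / actual              # percentage errors
  scale <- mean(abs(diff(train)))    # in-sample MAE of the naive method
  q <- e / scale                     # scaled errors
  c(MAE  = mean(abs(e)),
    RMSE = sqrt(mean(e^2)),
    MAPE = mean(abs(p)),
    MASE = mean(abs(q)))
}

train    <- c(10, 12, 11, 13)   # made-up training data
actual   <- c(14, 16)           # made-up test data
forecast <- c(13, 13)           # made-up forecasts
res <- accuracy_measures(actual, forecast, train)
res
```

A MASE above 1 means the forecasts are, on average, worse than one-step naive forecasts were on the training data.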

Ch. 6: Time Series Decomposition


Time Series Patterns

  • Trend
    • Long-term increase/decrease
    • Does not have to be linear
  • Seasonal
    • Always of a fixed and known period
    • Exists when series is influenced by seasonal factors
  • Cyclic
    • Rises and falls not of a fixed period


Ch. 6: Time Series Decomposition

Time series can be thought to comprise three components:

  • Seasonal
  • Trend-Cycle
  • Remainder


Additive Model: \( y_t = S_t + T_t + E_t \)


Multiplicative Model: \( y_t = S_t \times T_t \times E_t \)

Ch. 6: Moving Averages

Moving average of order \( m \) can be written as:

\( \hat{T_t}=\frac{1}{m} \sum_{j = -k}^{k} y_{t+j} \), where \( m=2k+1 \)

  • Found by averaging values of the time series within \( k \) periods of \( t \)
  • Eliminates some of the randomness, leaving a smooth trend-cycle estimate
  • \( m \)-MA = moving average of order \( m \)
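
An odd-order moving average can be sketched with base R's stats::filter, which applies a centred linear filter when sides = 2. The wrapper name and the data are made up for this example.

```r
# Centred m-MA for odd m: average of the k values either side of t, m = 2k + 1.
moving_average <- function(y, m) {
  stopifnot(m %% 2 == 1)
  as.numeric(stats::filter(y, rep(1 / m, m), sides = 2))
}

y <- c(2, 4, 6, 8, 10, 3, 5)   # made-up data
ma3 <- moving_average(y, 3)
ma3
```

The first and last \( k \) values are NA because the window runs past the ends of the series.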

Ch. 6: Moving Averages

[Figure: 43 years of copper prices with moving averages of order 3, 5, 7 & 9]

Ch. 6: Moving Averages



Moving Average of Moving Averages:

  • Notated as second order \( \times \) first order
    • \( 2\times4 \)-MA: a 4-MA followed by a 2-MA
    • A 2-MA applied to an even-order MA gives a “centered” moving average


  • To keep symmetry:
    • Even order should be followed by even order
    • Odd order followed by odd order



Weighted Moving Averages:

  • Combinations of moving averages result in weighted moving averages
  • Weights should be symmetric and sum to 1
    \( \hat{T_t} = \frac{1}{8}y_{t-2} + \frac{1}{4}y_{t-1} + \frac{1}{4}y_t + \frac{1}{4}y_{t+1} + \frac{1}{8}y_{t+2} \)
  • Yield a smoother curve
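
To see where the weights 1/8, 1/4, 1/4, 1/4, 1/8 come from, the sketch below composes a 4-MA with a 2-MA by multiplying out their weights. The helper compose_ma is a hypothetical name written for this example.

```r
# Combine two moving averages into one set of weights (a discrete convolution).
compose_ma <- function(w1, w2) {
  out <- numeric(length(w1) + length(w2) - 1)
  for (i in seq_along(w1)) {
    for (j in seq_along(w2)) {
      out[i + j - 1] <- out[i + j - 1] + w1[i] * w2[j]
    }
  }
  out
}

w <- compose_ma(rep(1 / 4, 4), rep(1 / 2, 2))   # weights of the 2x4-MA
w
sum(w)

# Applying the weights as a centred filter leaves a straight line unchanged.
y <- 1:8
trend <- as.numeric(stats::filter(y, w, sides = 2))
trend
```

The combined weights are symmetric and sum to 1, which is why the result is centred on \( t \) even though a 4-MA alone is not.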

Ch. 6: Classical Decomposition



Additive Decomposition:
1. Obtain \( \hat{T_t} \). Use a \( 2\times m \)-MA if \( m \) is even and an \( m \)-MA if \( m \) is odd.
2. Calculate detrended series: \( y_t - \hat{T_t} \)
3. Obtain \( \hat{S_t} \) by averaging the detrended values for each season and stringing together the resulting seasonal indices
4. Calculate the remainder by subtracting estimated seasonal and trend-cycle components:
\( \hat{E_t} = y_t - \hat{T_t} - \hat{S_t} \)



Multiplicative Decomposition:
1. Obtain \( \hat{T_t} \) using same rules as additive decomposition
2. Calculate detrended series: \( y_t/\hat{T_t} \)
3. Obtain \( \hat{S_t} \)
4. Calculate the remainder by dividing out the estimated seasonal and trend-cycle components:
\( \hat{E_t} = y_t/(\hat{T_t}\hat{S_t}) \)
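
The four additive steps can be sketched in base R as follows. This is a simplified illustration (base R's decompose function does the real work); the function name is made up, the series is assumed to start at season 1, and the seasonal indices are centred so that they sum to zero.

```r
classical_additive <- function(y, m) {
  n <- length(y)
  # Step 1: trend-cycle via a 2 x m-MA (even m) or an m-MA (odd m).
  w <- if (m %% 2 == 0) c(0.5, rep(1, m - 1), 0.5) / m else rep(1 / m, m)
  trend <- as.numeric(stats::filter(y, w, sides = 2))
  # Step 2: detrended series.
  detrended <- y - trend
  # Step 3: average the detrended values per season, centre, string together.
  season <- (seq_len(n) - 1) %% m + 1
  idx <- tapply(detrended, season, mean, na.rm = TRUE)
  idx <- idx - mean(idx)
  seasonal <- as.numeric(idx[season])
  # Step 4: remainder.
  list(trend = trend, seasonal = seasonal, remainder = y - trend - seasonal)
}

# Made-up series: linear trend plus a fixed quarterly pattern, no noise.
t <- 1:16
y <- 10 + 0.5 * t + rep(c(2, -1, -3, 2), 4)
fit <- classical_additive(y, m = 4)
fit$seasonal[1:4]
```

For the multiplicative version, the subtractions become divisions and the indices are usually scaled to average 1 rather than centred on 0.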

Ch. 6: X-12-ARIMA & STL Decomposition

X-12-ARIMA:

  • Popular method for decomposing quarterly & monthly data (the only frequencies it handles)
  • Based on classical decomposition with many extra steps and features
  • Uses an ARIMA model to provide forecasts forward and backward in time
  • Requires separate software from the US Census Bureau, with an R interface via the x12 package


STL (Seasonal and Trend decomposition using Loess):

  • Will handle any type of seasonality
    • Seasonal component can change over time
  • Rate of change and smoothness of trend-cycle can be controlled by user
  • Does not automatically handle calendar variations
  • Main parameters are the trend window and seasonal window

Ch. 6: Forecasting with Decomposition

Decomposition can be used for forecasting, as well as studying time series

Additive Decomposition:

  • \( y_t = \hat{S_t} + \hat{A_t} \)
    • Where \( \hat{A_t} = \hat{T_t} + \hat{E_t} \)

Multiplicative Decomposition:

  • \( y_t = \hat{S_t}\hat{A_t} \)
    • Where \( \hat{A_t} = \hat{T_t}\hat{E_t} \)

The seasonal component (\( \hat{S_t} \)) and the seasonally adjusted component (\( \hat{A_t} \)) are forecasted separately to forecast a decomposed time series
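
One simple way to combine the two forecasts in the additive case is sketched below. The choice of methods is an assumption made for illustration: the seasonal component is forecast with the seasonal naive method and the seasonally adjusted component with the naive method, though any non-seasonal method could be substituted. The function name and data are made up.

```r
decomp_forecast <- function(seasonal, adjusted, m, h) {
  n <- length(seasonal)
  k <- (seq_len(h) - 1) %/% m + 1
  seasonal_fc <- seasonal[n + seq_len(h) - k * m]   # repeat the last seasonal cycle
  adjusted_fc <- rep(adjusted[n], h)                # naive forecast of A_t
  seasonal_fc + adjusted_fc                         # additive recombination
}

seasonal <- rep(c(2, -1, -3, 2), 2)   # made-up seasonal component, m = 4
adjusted <- 10:17                     # made-up seasonally adjusted component
decomp_forecast(seasonal, adjusted, m = 4, h = 5)
```

For a multiplicative decomposition the last line of the function would multiply the two forecasts instead of adding them.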

Ch. 7: Simple Exponential Smoothing

  • Suitable for forecasting data with no trend or seasonal pattern
  • In between naive and mean forecast methods
  • Attaches larger weights to more recent observations
  • Weights decrease exponentially for observations further in past
  • Rate at which weights decrease is controlled by parameter \( \alpha \), where \( 0\leq\alpha\leq1 \)

\[ \hat{y}_{T+1|T} = \alpha y_T + \alpha(1-\alpha)y_{T-1} + \alpha(1-\alpha)^2y_{T-2} + \alpha(1-\alpha)^3y_{T-3} + \cdots \]

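Weight on each observation for different values of \( \alpha \):

Observation     | \( \alpha = 0.2 \) | \( \alpha = 0.4 \) | \( \alpha = 0.6 \) | \( \alpha = 0.8 \)
\( y_T \)       | 0.2    | 0.4    | 0.6    | 0.8
\( y_{T-1} \)   | 0.16   | 0.24   | 0.24   | 0.16
\( y_{T-2} \)   | 0.128  | 0.144  | 0.096  | 0.032
\( y_{T-3} \)   | 0.1024 | 0.0864 | 0.0384 | 0.0064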
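
The weights in the table follow directly from \( \alpha(1-\alpha)^j \) for lag \( j \). A short base R sketch (the function name is made up for this example) reproduces them:

```r
# Weight attached to y_{T-j} for j = 0, 1, ..., n_lags - 1.
ses_weights <- function(alpha, n_lags) alpha * (1 - alpha)^(0:(n_lags - 1))

# One column per alpha, one row per lag.
w <- sapply(c(0.2, 0.4, 0.6, 0.8), ses_weights, n_lags = 4)
w
```

Larger values of \( \alpha \) put most of the weight on the latest observation, while smaller values spread it further into the past.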

Ch. 7: Simple Exponential Smoothing (Cont.)

  • Initialization required, need to specify value for level, \( \ell_0 \)
  • Common approach is to set \( \ell_0 = y_1 \)
  • Alternative is to use optimization to estimate value of \( \ell_0 \)
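  • The ses function has an initial argument that selects between these two approaches.
library(forecast)  # provides ses()
# oilprice is the series used in the lecture (not defined on this slide)
ses(oilprice, alpha=0.2, initial='simple', h=3)  # initial: 'simple' or 'optimal'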
     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
1998       21.80368 11.02002 32.58733 5.311500 38.29586
1999       21.80368 10.80646 32.80089 4.984891 38.62247
2000       21.80368 10.59697 33.01038 4.664504 38.94285
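
The point forecasts are identical at every horizon because simple exponential smoothing forecasts are flat at the last estimated level. A minimal base R sketch of the recursion follows, using the equivalent level form \( \ell_t = \alpha y_t + (1-\alpha)\ell_{t-1} \) with \( \ell_0 = y_1 \). The function name and data are made up, and this is not how the forecast package computes its prediction intervals.

```r
# Returns the final level, which is the forecast for every horizon h.
ses_level <- function(y, alpha, l0 = y[1]) {
  level <- l0
  for (obs in y) {
    level <- alpha * obs + (1 - alpha) * level
  }
  level
}

y <- c(20, 22, 21, 23)      # made-up data
ses_level(y, alpha = 0.2)
ses_level(y, alpha = 1)     # alpha = 1 reduces to the naive method
```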

Ch. 7: Holt's Linear Trend Method


  • Allows forecasting of data with a trend
  • Involves a forecast equation plus two smoothing equations (one for level, one for trend)

Forecast: \( \hat{y}_{t+h|t} = \ell_t + hb_t \)
Level: \( \ell_t = \alpha y_t + (1 - \alpha)(\ell_{t-1} + b_{t-1}) \)
Trend: \( b_t = \beta^*(\ell_t - \ell_{t-1}) + (1 - \beta^*)b_{t-1} \)

  • \( b_t \) is the estimate of trend at time \( t \) (slope)
  • \( \beta^* \) is the smoothing parameter for the trend, where \( 0 \leq \beta^* \leq 1 \)
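
The level and trend equations can be iterated directly, as in this base R sketch. The function name, the default initial values (\( \ell_0 = y_1 \), \( b_0 = y_2 - y_1 \)), and the data are assumptions made for the example. The forecast package's holt function would normally be used instead.

```r
holt_forecast <- function(y, alpha, beta, h, l0 = y[1], b0 = y[2] - y[1]) {
  level <- l0
  trend <- b0
  for (obs in y) {
    prev_level <- level
    level <- alpha * obs + (1 - alpha) * (prev_level + trend)     # level equation
    trend <- beta * (level - prev_level) + (1 - beta) * trend     # trend equation
  }
  level + seq_len(h) * trend                                      # forecast equation
}

# A perfectly linear series with matching initial states is forecast exactly.
holt_forecast(c(3, 5, 7, 9), alpha = 0.8, beta = 0.2, h = 2, l0 = 1, b0 = 2)
```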

Ch. 7: Exponential Trend Method


  • Variation on Holt's linear trend method
  • Allows level and slope to be multiplied rather than added
  • The trend becomes exponential rather than linear
  • Growth rate is constant, rather than a constant slope

Forecast: \( \hat{y}_{t+h|t} = \ell_t b_t^h \)
Level: \( \ell_t = \alpha y_t + (1 - \alpha)(\ell_{t-1}b_{t-1}) \)
Trend: \( b_t = \beta^* \frac{\ell_t}{\ell_{t-1}} + (1 - \beta^*)b_{t-1} \)
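
A base R sketch of the multiplicative recursion follows, with the \( h \)-step forecast taken as \( \ell_t b_t^h \), so that \( b_t \) acts as a growth rate rather than a slope. The function name, initial states, and data are made up for illustration.

```r
exp_trend_forecast <- function(y, alpha, beta, h, l0, b0) {
  level <- l0
  growth <- b0
  for (obs in y) {
    prev_level <- level
    level <- alpha * obs + (1 - alpha) * (prev_level * growth)       # level equation
    growth <- beta * (level / prev_level) + (1 - beta) * growth      # trend equation
  }
  level * growth^seq_len(h)                                          # level times growth^h
}

# A series that doubles each period, started from matching initial states.
exp_trend_forecast(c(2, 4, 8, 16), alpha = 0.8, beta = 0.2, h = 2, l0 = 1, b0 = 2)
```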

Ch. 7: Holt-Winters Seasonal Method


  • Extension of Holt's method to capture seasonality
  • Involves a forecast equation and three smoothing equations (level, trend, & seasonal), with parameters \( \alpha \), \( \beta^* \), and \( \gamma \)
  • Additive method preferred when seasonal variations are roughly constant through the series
  • Multiplicative method preferred when seasonal variations change in proportion to the level
  • Holt-Winters w/damped trend & multiplicative seasonality often accurate for seasonal data
  • hw function available in forecast package



Additive Method:
\( \hat{y}_{t+h|t} = \ell_t + hb_t + s_{t-m+h_{m}^+} \)
\( \ell_t = \alpha(y_t - s_{t-m}) + (1 - \alpha)(\ell_{t-1} + b_{t-1}) \)
\( b_t = \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*)b_{t-1} \)
\( s_t = \gamma(y_t - \ell_{t-1} - b_{t-1}) + (1 - \gamma)s_{t-m} \)

Multiplicative Method:
\( \hat{y}_{t+h|t} = (\ell_t + hb_t)s_{t-m+h_{m}^+} \)
\( \ell_t = \alpha\frac{y_t}{s_{t-m}} + (1 - \alpha)(\ell_{t-1} + b_{t-1}) \)
\( b_t = \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*)b_{t-1} \)
\( s_t = \gamma\frac{y_t}{(\ell_{t-1} + b_{t-1})} + (1 - \gamma)s_{t-m} \)

Here \( m \) is the seasonal period and \( h_{m}^+ = \lfloor (h-1) \bmod m \rfloor + 1 \), so the seasonal indices used for forecasting come from the final year of the sample.
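
The additive equations can be iterated as in the base R sketch below. It is a simplified illustration rather than the hw function's implementation: the function name is made up, and the initial level, trend, and seasonal states (l0, b0, s0) are supplied by the caller instead of being estimated.

```r
hw_additive <- function(y, m, alpha, beta, gamma, h, l0, b0, s0) {
  level <- l0
  trend <- b0
  seas <- s0                     # the m most recent seasonal states, oldest first
  for (t in seq_along(y)) {
    s_old <- seas[1]             # s_{t-m}
    prev_level <- level
    level <- alpha * (y[t] - s_old) + (1 - alpha) * (prev_level + trend)
    new_s <- gamma * (y[t] - prev_level - trend) + (1 - gamma) * s_old
    trend <- beta * (level - prev_level) + (1 - beta) * trend
    seas <- c(seas[-1], new_s)   # drop s_{t-m}, append s_t
  }
  hm <- (seq_len(h) - 1) %% m + 1          # position within the final seasonal cycle
  level + seq_len(h) * trend + seas[hm]
}

# Made-up noise-free series: level 10 + t, seasonal pattern (1, -1), m = 2.
y <- c(12, 11, 14, 13, 16, 15)
hw_additive(y, m = 2, alpha = 0.5, beta = 0.3, gamma = 0.4, h = 3,
            l0 = 10, b0 = 1, s0 = c(1, -1))
```

Because the toy series has no noise and the initial states match it, the forecasts continue the pattern exactly whatever the smoothing parameters are.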

Ch. 7: "Taxonomy" of Exp. Smoothing Methods

Trend Component             | Seasonal: N (None) | Seasonal: A (Additive) | Seasonal: M (Multiplicative)
N (None)                    | (N, N)             | (N, A)                 | (N, M)
A (Additive)                | (A, N)             | (A, A)                 | (A, M)
\( A_d \) (Additive Damped) | (\( A_d \), N)     | (\( A_d \), A)         | (\( A_d \), M)
M (Multiplicative)          | (M, N)             | (M, A)                 | (M, M)
\( M_d \) (Mult. Damped)    | (\( M_d \), N)     | (\( M_d \), A)         | (\( M_d \), M)

Simple Exponential Smoothing = (N,N)
Holt's Linear Method = (A,N)
Exponential Trend Method = (M,N)
Additive Damped Trend Method = (\( A_d \),N)
Multiplicative Damped Trend Method = (\( M_d \),N)
Additive Holt-Winters Method = (A,A)
Multiplicative Holt-Winters Method = (A,M)
Holt-Winters Damped Method = (\( A_d \),M)

Ch. 7: State Space Models & ETS

State Space Models:
Models which consist of:

  • A measurement equation that describes the observed data
  • Some transition equations that describe how the unobserved components or states (level, trend, seasonal) change over time.

Two models for each method:

  • one with additive errors
  • one with multiplicative errors

Third letter added to the classification, labeled ETS (Error, Trend, & Seasonal)

Ch. 7: State Space Models & ETS

Possibilities for each component are:
Error = {A, M}, Trend = {N, A, \( A_d \), M, \( M_d \)}, and Seasonal = {N, A, M}.

Total of 30 such state space models:

  • 15 with additive errors
  • 15 with multiplicative errors
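
The count of 30 is simply 2 × 5 × 3. As a quick check, the combinations can be enumerated in base R. The labels below are plain strings chosen for this example, not the model codes accepted by the ets function.

```r
ets_models <- expand.grid(
  error    = c("A", "M"),
  trend    = c("N", "A", "Ad", "M", "Md"),
  seasonal = c("N", "A", "M"),
  stringsAsFactors = FALSE
)

nrow(ets_models)           # total number of combinations
table(ets_models$error)    # split by error type
```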

ETS models can be selected using information criteria (AIC, \( AIC_c \), & BIC)

The ets function from the forecast package can be used to estimate the models
  • Leave the model unspecified for automatic selection, or specify the model and parameters

THANK YOU!



Questions?