FPP3 Chapter 3: Time Series Decomposition

Author

Teddy Kelly

In this chapter, we consider the most common methods for extracting the trend-cycle, seasonal, and remainder components from a time series. Often this is done to improve understanding of the time series, but it can also be used to improve forecast accuracy.

1 Transformation and Adjustments

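We deal with four kinds of adjustments: calendar adjustments, population adjustments, inflation adjustments, and mathematical transformations. These adjustments can simplify patterns in data by removing known sources of variation or by making the pattern more consistent across the whole data set. Simpler patterns are usually easier to model and lead to more accurate forecasts.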

1.1 Calendar Adjustments

Some variation in seasonal data may be due to basic calendar effects.
Some of the variation in monthly data may be due to the different number of days in each month.
Computing the average per day in each month removes this calendar variation, so that months can be compared on an equal footing.
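As a rough illustration of this calendar adjustment, the sketch below divides monthly totals by the number of days in each month. The `sales` data frame and its values are made up for this example, and only base R is assumed.

```r
# Hypothetical monthly totals (made-up values, for illustration only)
sales <- data.frame(
  month = seq(as.Date("2023-01-01"), by = "month", length.out = 3),
  total = c(3100, 2800, 3100)
)

# Days in each month: difference between consecutive month start dates
month_starts <- seq(sales$month[1], by = "month", length.out = nrow(sales) + 1)
sales$days <- as.numeric(diff(month_starts))

# The average per day removes the effect of differing month lengths
sales$per_day <- sales$total / sales$days
sales
```

In this toy data the February total is lower only because February has fewer days; the per-day averages are the same in all three months.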

1.2 Population Adjustments

It is better to adjust data that are affected by population changes and model per-capita data instead.
It is much easier to interpret how a variable changes over time once the effect of population growth is removed by expressing the variable per person (or per thousand or per million people).
We can see an example of this using the global_economy dataset.
Say we want to look only at Australia and compare GDP with GDP per capita.

library(fpp3)

# Total GDP
global_economy |> filter(Country == "Australia") |>
  autoplot(GDP) +
  labs(title = "Australian GDP since 1960",
       x = "Time (Years)")

# GDP per capita: dividing by population removes the effect of population growth
global_economy |> filter(Country == "Australia") |>
  autoplot(GDP/Population) +
  labs(title = "Australian GDP per-capita since 1960",
       x = "Time (Years)")

1.3 Inflation Adjustments

Data which are affected by the value of money are best adjusted for inflation before modeling.
Financial time series are usually adjusted so that all values are stated in dollar values from a specific base year.
We often use the CPI as a price index to measure the increase in prices.
We will look at adjusting for inflation with the aus_retail dataset using the CPI from the global_economy dataset.
We first aggregate the monthly “newspaper and book” retail turnover into annual totals.

retail <- aus_retail |> filter(Industry == "Newspaper and book retailing") |>
  group_by(Industry) |>
  index_by(Year = year(Month)) |>
  summarize(Turnover = sum(Turnover))

retail |> autoplot(Turnover) +
  labs(title = "Australian Retail Turnover",
       subtitle = "Newspaper and Book",
       x = "Time (years)")

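Now let’s account for inflation by dividing turnover by the CPI and multiplying by 100, which expresses every value in the dollars of the CPI base year (the year in which CPI = 100):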

aus_economy <- global_economy |> filter(Code == 'AUS')

new_retail <- retail |> left_join(aus_economy, by = "Year") |>
  mutate(adjusted_turnover = (Turnover / CPI) * 100)

new_retail |> autoplot(adjusted_turnover) +
  labs(title = "Australian Retail Turnover Adjusted for Inflation",
       subtitle = "Newspaper and Book",
       x = "Time (years)")

When we adjust for inflation, we can see that the downward trend in newspaper and book retail turnover actually began in the mid-1980s, and that the decline around 2010 is even steeper than the original graph suggests.

1.4 Mathematical Transformations

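If the variation in the data increases or decreases with the level of the series, a log transformation can stabilize the variance so that the variation is roughly similar across the whole series.
This makes analysis and modeling much easier.
The Box-Cox method generalizes this idea to a family of power transformations.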

Let’s look at the aus_production tsibble and graph the gas variable:

aus_production |> autoplot(Gas)

We can see that gas production rises sharply from around 1970.
We can also see that the variation in gas production is much larger in more recent years.
We can apply a log transformation to even out the variation, which makes modeling easier and the results more interpretable.

aus_production |> mutate(log_gas = log(Gas)) |> autoplot(log_gas)

Now we can see that the variation is much more similar across the different points in time in the time series. Modeling will be much easier now.
We could also use a Box-Cox transformation, whose parameter \(\lambda\) can be tuned to find the transformation that best stabilizes the variance.

Box-Cox Transformation:

A family of power transformations designed to stabilize variance.
It depends on the parameter \(\lambda\) and the transformations are defined as follows:

\[ w_t=\begin{cases} \log(y_t) & \text{if } \lambda=0 \\ \left(\text{sign}(y_t)|y_t|^\lambda-1\right)/\lambda & \text{otherwise} \end{cases} \]

When \(\lambda=0\), natural logarithms are used, but if \(\lambda\neq0\) then a power transformation is used.
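The definition can be checked numerically with a few lines of base R. This is only a minimal sketch: `box_cox_manual` is a hypothetical helper written for these notes, not the implementation used by the fpp3 packages, and the input values are arbitrary.

```r
# Minimal Box-Cox sketch following the definition in these notes
# (hypothetical helper, not the fabletools implementation)
box_cox_manual <- function(y, lambda) {
  if (lambda == 0) {
    log(y)                                  # natural logarithm when lambda = 0
  } else {
    (sign(y) * abs(y)^lambda - 1) / lambda  # power transformation otherwise
  }
}

y <- c(1, 10, 100, 1000)   # arbitrary positive values
box_cox_manual(y, 0)       # same as log(y)
box_cox_manual(y, 1)       # y - 1: the series is only shifted, not reshaped
box_cox_manual(y, 0.5)     # a square-root-type transformation
```

In the fpp3 workflow itself, `box_cox()` from fabletools applies the transformation, and the `guerrero` feature from feasts can suggest a value of \(\lambda\).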

2 Time Series Components

We can think of a time series \(y_t\) as a combination of three components at period \(t\): a trend-cycle component, a seasonal component, and a remainder component (containing anything else in the series).

If we assume additive decomposition, then we can write:

\[ y_t=S_t+T_t+R_t \]

where \(y_t\) is the data, \(S_t\) is the seasonal component, \(T_t\) is the trend-cycle component, and \(R_t\) is the remainder component.

The additive decomposition is most appropriate if the magnitude of the seasonal fluctuations, or the variation around the trend-cycle, does not vary with the level of the time series.

If we assume multiplicative decomposition, then we can write:

\[ y_t=S_t \times T_t\times R_t \]

The multiplicative decomposition is most appropriate when the variation in the seasonal pattern, or the variation around the trend-cycle, appears to be proportional to the level of the time series.
Multiplicative decompositions are common with economic time series.
An alternative to using a multiplicative decomposition is to transform the data until the variation in the series appears to be constant and then use an additive decomposition.

When a log transformation is used, this is equivalent to a multiplicative decomposition, because \(y_t=S_t\times T_t\times R_t\) is equivalent to \(\log y_t=\log S_t+\log T_t+\log R_t\).
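The equivalence between the multiplicative form and the additive form on the log scale can be confirmed with a small numeric check. The component values below are made up for illustration and must all be positive for the logarithms to exist.

```r
# Made-up positive component values, purely for illustration
seasonal  <- c(0.9, 1.1, 1.0)
trend     <- c(100, 105, 110)
remainder <- c(1.02, 0.98, 1.00)

# Multiplicative composition of the series
y <- seasonal * trend * remainder

# On the log scale the components add up
all.equal(log(y), log(seasonal) + log(trend) + log(remainder))
```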

2.1 Example: Employment in the US Retail Sector

We can first graph the time series to observe how US employment changes over time and then decompose the time series into the trend-cycle, seasonal, and remainder components to more clearly understand what is happening. We will look at the data since 1990.

us_employment |> filter(year(Month) >= 1990 & Title == 'Retail Trade') |> 
  autoplot(Employed)

Now we can break the graph up into its components. We will use the model function and select the STL decomposition method. The model function comes from the fabletools package. The STL command comes from the feasts package.

We then use the components function from fabletools to view a dable (decomposition table) that holds the values of each time series component.
We can then plot all of the components in a single figure using autoplot.

dcmp <- us_employment |> filter(year(Month) >= 1990 & Title == 'Retail Trade') |>
  model(stl = STL(Employed))

components(dcmp) |> autoplot()

The grey bars on the left of each panel show the relative scales of the components. Each bar represents the same length on the y-axis, so the larger the bar appears, the smaller the scale of the variation in that component.
Alternatively, we can convert the dable into a tsibble and plot a single component on top of the original data.

components(dcmp) |> as_tsibble() |> autoplot(Employed, colour = "gray") +
  geom_line(aes(y = trend), colour = "red") +
  labs(title = "Trend-Cycle Component of US Employment Data",
       subtitle = "Retail Trade")

components(dcmp) |> as_tsibble() |> autoplot(Employed, colour = "gray") +
  geom_line(aes(y = season_adjust), colour = "red") +
  labs(title = "Seasonally Adjusted US Employment Data",
       subtitle = "Retail Trade")

2.2 Seasonally Adjusted Data

Seasonal adjustment removes the seasonal component from the data. For an additive decomposition, the seasonally adjusted series is \(y_t-S_t\), or equivalently \(T_t+R_t\), which leaves only the trend-cycle and remainder components. For a multiplicative decomposition, it is \(y_t/S_t\).

We often perform this adjustment when the variation due to seasonality is not of primary interest. Monthly unemployment data are a typical example: we are usually more interested in the non-seasonal variation.
Seasonally adjusted series still contain the remainder component, so they are not as smooth as the trend-cycle component.
If the purpose is to look for turning points in a series and interpret any changes in direction, it is better to use the trend-cycle component rather than the seasonally adjusted data.
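The two ways of computing the seasonally adjusted series can be compared directly on the STL decomposition of the retail employment data. This is a sketch that refits the model so it can run on its own; the component column names (`season_year`, `season_adjust`, `trend`, `remainder`) are assumed to be those returned by feasts for monthly data.

```r
library(fpp3)

# Refit the STL decomposition of US retail employment
dcmp <- us_employment |>
  filter(year(Month) >= 1990, Title == "Retail Trade") |>
  model(stl = STL(Employed))

cmp <- components(dcmp) |> as_tibble()

# Seasonally adjusted data as y_t - S_t (column names are assumed, see above)
all.equal(cmp$season_adjust, cmp$Employed - cmp$season_year)

# ... and equivalently as T_t + R_t
all.equal(cmp$season_adjust, cmp$trend + cmp$remainder)
```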

3 Moving Averages

The first step in the classical method of time series decomposition is to use a moving average to estimate the trend-cycle.

3.1 Moving Average Smoothing

m-MA: a moving average of order m.
Say m = 5; then we have a moving average of order 5. This means that the average at a specific point is computed from 5 observations: the previous 2 observations, the current observation, and the next 2 observations.
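Written out for an odd order \(m=2k+1\), the m-MA estimate of the trend-cycle at time \(t\) averages \(y_t\) together with the \(k\) observations on either side of it. The hat notation \(\hat{T}_t\) for an estimated trend-cycle follows the classical decomposition section of these notes, and \(k\) is introduced here only for the formula:

\[ \hat{T}_t=\frac{1}{m}\sum_{j=-k}^{k}y_{t+j} \]

With \(m=5\) and \(k=2\), this gives \(\hat{T}_t=(y_{t-2}+y_{t-1}+y_t+y_{t+1}+y_{t+2})/5\).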

To implement moving averages in R, we use the slider package and the slide_dbl function:

aus_exports <- global_economy |> filter(Country == "Australia") |>
  mutate(`5-MA` = slider::slide_dbl(Exports, mean,
                                     .before = 2, .after = 2, .complete = TRUE))

# colour is set outside aes() so that the line is actually drawn in red
aus_exports |> autoplot(Exports, colour = "grey") +
  geom_line(aes(y = `5-MA`), colour = "red") +
  labs(title = "Australian Exports",
       subtitle = "Moving Average of order 5")

Note that a moving average of higher order gives a much smoother trend-cycle line, because each average takes in more of the data around that observation.
However, because more data are required to compute each average, more observations at the beginning and end of the series are left without an estimate.
Hence, moving averages of lower order cover more of the endpoints but are less smooth and follow more of the short-term fluctuations.
Here we included two points before and after each observation, so the first two and last two observations do not have moving averages.

3.2 Moving Averages of Moving Averages

It is possible to apply a moving average to a moving average. One reason for doing this is to make an even-order moving average symmetric.

For the first moving average, say of order 4, the window cannot be centered, so you choose either 1 observation before and 2 after, or 2 before and 1 after.
The second moving average, of order 2, then restores the symmetry: specify 1 before and 0 after for the former option, or 0 before and 1 after for the latter.
We usually use even-order moving averages for quarterly data.
Let’s do this in R for the aus_production tsibble.

beer <- aus_production |> filter(year(Quarter) >= 1992) |> select(Quarter, Beer) |>
  mutate(`4-MA` = slider::slide_dbl(Beer, mean,
                                    .before = 1, .after = 2, .complete = TRUE),
         `2x4-MA` = slider::slide_dbl(`4-MA`, mean,
                                      .before = 1, .after = 0, .complete = TRUE))

# Plot the data (grey), the 4-MA (black), and the centered 2x4-MA (red)
beer |> autoplot(Beer, colour = "grey") +
  geom_line(aes(y = `4-MA`)) +
  geom_line(aes(y = `2x4-MA`), colour = "red")

The weighted average traced by the red line is now symmetric.
It was obtained by taking a moving average of order 2 of the 4-MA column.
Centered moving average: the result of applying a 2-MA to a moving average of even order (such as 4 in the example above).
The most common use of centered moving averages is for estimating the trend-cycle from seasonal data.

3.3 Estimating the Trend-cycle with Seasonal Data

If the seasonal period is odd and of order \(m\), we use a \(m\)-MA to estimate the trend-cycle.
If the seasonal period is even and of order \(m\), we use a \(2\times m\)-MA to estimate the trend-cycle.
For example, a \(2\times12\)-MA can be used to estimate the trend-cycle of monthly data with annual seasonality and a \(7\)-MA can be used to estimate the trend-cycle of daily data with a weekly seasonality.
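For monthly data, the \(2\times12\)-MA can be built with the same slider approach used for the quarterly beer data. The sketch below applies it to the retail employment series; the choice of 5 observations before and 6 after for the 12-MA is one of the two possible conventions, paired with a 2-MA that looks 1 observation back.

```r
library(fpp3)

us_retail_ma <- us_employment |>
  filter(year(Month) >= 1990, Title == "Retail Trade") |>
  mutate(
    # 12-MA: 5 observations before, the current one, and 6 after
    `12-MA` = slider::slide_dbl(Employed, mean,
                                .before = 5, .after = 6, .complete = TRUE),
    # a 2-MA of the 12-MA centers the window on each month
    `2x12-MA` = slider::slide_dbl(`12-MA`, mean,
                                  .before = 1, .after = 0, .complete = TRUE)
  )

us_retail_ma |>
  autoplot(Employed, colour = "grey") +
  geom_line(aes(y = `2x12-MA`), colour = "red") +
  labs(title = "US retail employment",
       subtitle = "2x12-MA estimate of the trend-cycle")
```

The centered average needs 6 observations on each side, so the first 6 and last 6 months have no trend-cycle estimate.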

3.4 Weighted Moving Averages

Combinations of moving averages result in weighted moving averages.

For example, the \(2 \times 4\)-MA is equivalent to a weighted 5-MA with weights given by \([\frac{1}{8},\frac{1}{4},\frac{1}{4},\frac{1}{4},\frac{1}{8}]\).
A major advantage of weighted moving averages is that they yield a smoother estimate of the trend-cycle: observations enter and leave the calculation gradually rather than at full weight.
Moving averages are calculated to estimate the trend-cycle of a time series.
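The equivalence between the \(2\times4\)-MA and the weighted 5-MA can be verified numerically. The sketch below uses base R loops on arbitrary made-up values, with the same window convention as the beer example (1 before and 2 after for the 4-MA, then 1 before and 0 after for the 2-MA).

```r
# Arbitrary made-up values, used only to check the arithmetic
y <- c(12, 15, 11, 18, 14, 17, 13, 20, 16, 19)
n <- length(y)

# 4-MA with 1 observation before and 2 after (defined for i = 2, ..., n - 2)
ma4 <- rep(NA_real_, n)
for (i in 2:(n - 2)) ma4[i] <- mean(y[(i - 1):(i + 2)])

# 2-MA of the 4-MA with 1 before and 0 after (defined for i = 3, ..., n - 2)
ma2x4 <- rep(NA_real_, n)
for (i in 3:(n - 2)) ma2x4[i] <- mean(ma4[(i - 1):i])

# Weighted 5-MA applied directly to the data
w <- c(1/8, 1/4, 1/4, 1/4, 1/8)
wma5 <- rep(NA_real_, n)
for (i in 3:(n - 2)) wma5[i] <- sum(w * y[(i - 2):(i + 2)])

# Both routes agree wherever they are defined
all.equal(ma2x4, wma5)
```

The weights sum to one, and the two end observations of each window receive half the weight of the middle three, which is what makes the combined average symmetric.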

4 Classical Decomposition

The Classical Decomposition method originated in the 1920s.

4.1 Additive Decomposition

Step 1

Let m be the seasonal period (for example, m = 12 for monthly data with annual seasonality). If m is an even number, compute the trend-cycle component \(\hat{T}_t\) using a 2×m-MA. If m is an odd number, compute the trend-cycle component \(\hat{T}_t\) using an m-MA.

Step 2

Calculate the detrended series: \(y_t-\hat{T}_t\).

Step 3

To estimate the seasonal component for each season, simply average the detrended values for that season. For example, with monthly data, the seasonal component for March is the average of all the detrended March values in the data. These seasonal component values are then adjusted to ensure that they add to zero. The seasonal component is obtained by stringing together these monthly values, and then replicating the sequence for each year of data. This gives \(\hat{S}_t\).

Step 4

The remainder component is calculated by subtracting the estimated seasonal and trend-cycle components: \(\hat{R}_t=y_t-\hat{T}_t-\hat{S}_t\).

Here is an example of using classical decomposition on the us_employment tsibble.
Notice that because classical decomposition uses moving averages, the trend estimate (and therefore the remainder) is missing for the first and last few observations.

us_employment |> filter(year(Month) >= 1992 & Title == 'Retail Trade') |>
  model(classical_decomposition(Employed, type = 'additive')) |>
  fabletools::components() |> autoplot()
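The four steps of the additive method can also be traced by hand. The sketch below is an illustration in base R rather than the fpp3 workflow: it uses the built-in monthly `co2` series (an assumption made so that the example is self-contained) and writes out each step explicitly.

```r
y <- as.numeric(co2)   # built-in monthly series, starts in January 1959
m <- 12                # seasonal period
n <- length(y)
k <- m / 2

# Step 1: 2x12-MA, i.e. weights 1/24 at both ends and 1/12 in between
w <- c(0.5, rep(1, m - 1), 0.5) / m
trend <- rep(NA_real_, n)
for (i in (k + 1):(n - k)) trend[i] <- sum(w * y[(i - k):(i + k)])

# Step 2: detrended series
detrended <- y - trend

# Step 3: average the detrended values for each month,
# then adjust the twelve averages so that they add to zero
month_id <- rep_len(1:m, n)
season_avg <- tapply(detrended, month_id, mean, na.rm = TRUE)
season_avg <- season_avg - mean(season_avg)
seasonal <- as.numeric(season_avg[month_id])

# Step 4: remainder
remainder <- y - trend - seasonal

head(data.frame(y, trend, seasonal, remainder), 8)
```

The first and last six values of `trend` and `remainder` are `NA`, which is the endpoint limitation of the moving average. Base R's `decompose()` performs the same kind of classical decomposition and can serve as a cross-check.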

4.2 Multiplicative Decomposition

Step 1: If m is an even number, compute the trend-cycle component \(\hat{T}_t\) using a 2×m-MA. If m is an odd number, compute the trend-cycle component \(\hat{T}_t\) using an m-MA.
Step 2: Calculate the detrended series: \(\frac{y_t}{\hat{T}_t}\).
Step 3: To estimate the seasonal component for each season, simply average the detrended values for that season. For example, with monthly data, the seasonal index for March is the average of all the detrended March values in the data. These seasonal indexes are then adjusted to ensure that they add to m. The seasonal component is obtained by stringing together these monthly indexes, and then replicating the sequence for each year of data. This gives \(\hat{S}_t\).
Step 4: The remainder component is calculated by dividing out the estimated seasonal and trend-cycle components: \(\hat{R}_t=y_t/(\hat{T}_t\hat{S}_t)\).
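The multiplicative version can be run on the same retail employment data by changing the `type` argument. In this sketch the component column names (`trend`, `seasonal`, `random`) are assumed to be those returned by feasts, and the final checks simply confirm the relationships described in the steps above.

```r
library(fpp3)

mult_dcmp <- us_employment |>
  filter(year(Month) >= 1992, Title == "Retail Trade") |>
  model(classical_decomposition(Employed, type = "multiplicative")) |>
  components()

mult_dcmp |> autoplot()

# Assumed column names: trend, seasonal, random
cmp <- as_tibble(mult_dcmp)
ok <- !is.na(cmp$trend)   # the trend is missing at both ends of the series

# The data are the product of the three components
all.equal(cmp$Employed[ok], (cmp$trend * cmp$seasonal * cmp$random)[ok])

# The 12 monthly seasonal indexes add to m = 12
sum(cmp$seasonal[1:12])
```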

Limitations of Classical Decomposition

Estimates of the trend-cycle are unavailable for the first few and last few observations.
There is also no estimate of the remainder component for the same time periods.
The trend-cycle estimate tends to over-smooth rapid rises and falls in the data.
The seasonal component is assumed to repeat from year to year, which may not be realistic. Classical decomposition is unable to capture seasonal changes over time.
The method is not robust to outliers (it is heavily influenced by unusual values).
There are newer methods that overcome these problems.

5 Methods Used by Official Statistics Agencies

Most statistical agencies use variants of the X-11 Method or the SEATS method or a combination of the two.

They are designed specifically for quarterly and monthly data, so they only handle seasonality of those two types.
We will use the latest implementation of this group of methods, known as “X-13ARIMA-SEATS”.
We need the seasonal package to use these methods.

5.1 X-11 Method

Based on Classical Decomposition, but includes many extra steps and features to overcome the drawbacks of classical decomposition.
Trend-cycle components are available for all observations and the seasonal component is allowed to vary slowly over time.
X-11 also handles trading day variation, holiday effects, and the effects of known predictors.
There are methods for both additive and multiplicative decomposition.
The method is highly robust to outliers and level shifts in the time series.

library(seasonal)

us_retail_employment <- us_employment |> 
  filter(Title == 'Retail Trade' & year(Month) >= 1992)

x11_dcmp <- us_retail_employment |>
  model(x11 = feasts::X_13ARIMA_SEATS(Employed ~ x11())) |> fabletools::components()

x11_dcmp |> feasts::autoplot() +
  labs(title = "Decomposition of US Retail Employment using X-11 Method")

x11_dcmp |> gg_subseries(seasonal)

Note: as the resulting warning reports, `gg_subseries()` was deprecated in feasts 0.4.2, and `ggtime::gg_subseries()` is the suggested replacement.

x11_dcmp |> ggplot(aes(x = Month)) +
  geom_line(aes(y = Employed, color = "Data")) +
  geom_line(aes(y = season_adjust, color = "Seasonally Adjusted")) +
  geom_line(aes(y = trend, color = "Trend")) +
  labs(title = "Total Employment in US Retail") +
  scale_color_manual(
    values = c("grey", "blue", "red"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

The SEATS Method

seats_dcmp <- us_retail_employment |> 
  model(seats = feasts::X_13ARIMA_SEATS(Employed ~ seats())) |> 
  components()

seats_dcmp |> autoplot() + 
  labs(title = "Decomposition of Total US Retail Employment Using the SEATS Method")

seats_dcmp |> ggplot(aes(x = Month)) +
  geom_line(aes(y = Employed, color = "Original Data")) +
  geom_line(aes(y = trend, color = "Trend")) +
  geom_line(aes(y = season_adjust, color = "Seasonally Adjusted")) +
  labs(title = "Total Employment In US Retail Since 1992",
       subtitle = "Actual Data, Trend-Cycle, and Seasonally Adjusted") +
  scale_color_manual(values = c('grey', 'red', 'blue'),
                     breaks = c("Original Data", "Trend", "Seasonally Adjusted"))

6 STL Decomposition

You can optionally add trend() and season() specials to the STL formula to specify the window, that is, how many consecutive observations are used when estimating the trend-cycle and seasonal components. You can also make the decomposition robust to outliers by setting robust = TRUE.

The default seasonal window is 11 (when there is a single seasonal period), and the default trend-cycle window for monthly data is 21.
A small seasonal window allows the seasonal pattern to change quickly over time, whereas a large window allows little change. You can also specify season(window = "periodic") so that the seasonal pattern is identical across years.
A small trend-cycle window makes the trend-cycle component bumpy, whereas a larger window makes it smoother; a window that is too large flattens it towards a straight line.
It’s important to strike a good balance so that the remainder doesn’t pick up too much of the seasonal or trend-cycle variation.

us_retail_employment |>
  model(feasts::STL(Employed ~ trend(window = 11) + season(window = 'periodic'), robust = TRUE)) |> 
  fabletools::components() |> 
  feasts::autoplot()
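To see the effect of the seasonal window, two STL fits can be placed side by side. This is a sketch under a few assumptions: the model names `flexible` and `periodic` and the window value of 5 are arbitrary choices for illustration, and `season_year` is assumed to be the name of the seasonal component column for monthly data.

```r
library(fpp3)

us_retail_employment <- us_employment |>
  filter(Title == "Retail Trade", year(Month) >= 1992)

stl_fits <- us_retail_employment |>
  model(
    # small window: the seasonal pattern is free to change from year to year
    flexible = STL(Employed ~ season(window = 5)),
    # periodic: the seasonal pattern is held fixed across years
    periodic = STL(Employed ~ season(window = "periodic"))
  )

# Overlay the seasonal components of the two fits
components(stl_fits) |>
  as_tsibble() |>
  autoplot(season_year) +
  labs(title = "STL seasonal component",
       subtitle = "Small seasonal window vs. periodic window")
```

The seasonal line for the small window should drift over the years, while the periodic one repeats the same pattern throughout.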

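The STL method handles any type of seasonality, not only monthly and quarterly data.
It only provides an additive decomposition; a multiplicative decomposition can be obtained by taking logs of the data first and then back-transforming the components.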