Load libraries
library(fpp3)
── Attaching packages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────── fpp3 0.5 ──
✔ tibble 3.2.1 ✔ tsibble 1.1.3
✔ dplyr 1.1.3 ✔ tsibbledata 0.4.1
✔ tidyr 1.3.0 ✔ feasts 0.3.1
✔ lubridate 1.9.2 ✔ fable 0.3.3
✔ ggplot2 3.4.3 ✔ fabletools 0.3.3
── Conflicts ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────── fpp3_conflicts ──
✖ lubridate::date() masks base::date()
✖ dplyr::filter() masks stats::filter()
✖ tsibble::intersect() masks base::intersect()
✖ tsibble::interval() masks lubridate::interval()
✖ dplyr::lag() masks stats::lag()
✖ tsibble::setdiff() masks base::setdiff()
✖ tsibble::union() masks base::union()
library(GGally)
Registered S3 method overwritten by 'GGally':
method from
+.gg ggplot2
library(seasonal)
Attaching package: ‘seasonal’
The following object is masked from ‘package:tibble’:
view
Inflation adjustments
Australian newspapers adjusted for inlfation
## Retail CPI
aus_dataset <- aus_retail |>
filter(Industry == "Newspaper and book retailing") |>
group_by(Industry) |>
index_by(Year = year(Month)) |>
summarise(Turnover = sum(Turnover))
aus_dataset |> autoplot()
Plot variable not specified, automatically selected `.vars = Turnover`

- We see the data going up untill around 2010 and then start dipping
down.
- Is this because a change in industry or just a change in the
CPI?
- Or is it because the CPI?
## CPI
aus_economy <- global_economy |>
filter(Code == "AUS")
aus_economy
Joining the datasets
aus_dataset |>
left_join(aus_economy, by="Year") |>
mutate(Adjusted_Turnover = Turnover / CPI *100) |>
pivot_longer(c(Turnover, Adjusted_Turnover), values_to = "Turnover") |>
mutate(name=factor(name, levels = c("Turnover","Adjusted_turnover"))) |>
ggplot(aes(x=Year,y=Turnover)) + geom_line() +
facet_grid(name ~ ., scales = "free_y") +
labs(title="Turnover: Australian print media industry", y="$AU")

- We can see that adjusted for CPI the Turnover has been declining
since 1980’s
- Decline since 2010 is been faster, but has been going on since
1980
When forecasting you want to forecast taking in to account CPI
The original plot:
food |> autoplot(Turnover)

Square root:
Works, but variations at the start of series are still bigger than
variations at the end
food |> autoplot(sqrt(Turnover)) + labs(title = "Square root Turnover")

Cube root
Works, but variations at the start of series are still bigger than
variations at the end
food |> autoplot(Turnover^(1/3)) + labs(title = "Cube root Turnover")

Log
food |> autoplot(log(Turnover)) + labs(title = "Log Turnover")

- Much better, variations are more HOMOGENEOUS
- Size of fluctuations at the start are about same size at end
This is going to make modeling much easier Makes timeseries
decompositions easier Good first step to remove that extra heterogeneity
Allows us to fit a much simpler model
Function to compute the optimal lambda
food |> features(Turnover, features = guerrero)
This attempts to balance the seasonal fluctuations and random
variation across the series Always check the results A low value of
lambda can give extremely large prediction intervals
Time series components
A time series decomposition involves estimating timeseries
components
- Trend: long term increase or decrease
- Seasonal pattern: influenced by the calendar
- Cycles: rises or falls that are not of fixed period
Decomposition: yt = f(St, Tt, Rt) where: * yt = data at period t * St
= seasonal component at period t * Tt = trend at period t * Rt =
remainder component at period t
Additive decomposition: yt = St + Tt + Rt Multiplicative
decomposition: yt = St * Tt * Rt
Additive model
Appropiate if magnitude of seasonal fluctuations does not vary
with level
If seasonal patterns are proportional to the level of the series
-> MULTIPLICATIVE MODEL
Multiplicative decomposition more prevalent with economic
series
ALTERNATIVE: Use a Box-Cox transformation, and then
use additive decomposition
Logs turn multiplicative relationships in to additive
relationships
yt = St * Tt * Rt --> ln(yt) = ln(St) + ln(Tt) + ln(Rt )
Example
us_employment |>
filter(Title == "Retail Trade", year(Month) >= 1990) |>
select(-Series_ID) -> us_retail_employment_1990
us_retail_employment_1990 |> autoplot(Employed) + geom_point() +
labs(y="Persons (thousands)", title = "Total employment in US retail")

- We have Trend + Cycle + Seasonality, lets fit a model
us_retail_employment_1990 |>
model(stl = STL(Employed))
Looking at the model components
dcmp <- us_retail_employment_1990 |>
model(stl = STL(Employed))
components(dcmp)
Autoplot the components
components(dcmp) |> autoplot()

Other things we can do with the components
us_retail_employment_1990 |>
autoplot(Employed, color="black") +
autolayer(components(dcmp), trend, color="#D55E00") +
labs(y= "Persons (thousands)", title = "Total employment in US retail")

Subseries plot
Useful if interested in SHAPE of seasonality
components(dcmp) |> gg_subseries(season_year)

Sesonal Adjustment
- Useful by-product of decomposition: an easy way to calculate
seasonally adjusted data
- Additive decomposition: seasonally adjusted data given by yt - St =
Tt + Rt
- Multiplicative: yt/St = Tt * Rt
Plotting the seasonally adjusted data: yt - St = Tt + Rt
us_retail_employment_1990 |>
autoplot(Employed, color="black") +
autolayer(components(dcmp), season_adjust, color="blue") +
labs(y="Persons (thousands)", title = "Total employment in US retail")

Observations:
- We can see the season_adjust (blue) line is more wobbly than the
trend line in the above chart
- This is due to the effect of Rt the remainder, which is almost
always noisy in nature
How to seasonal adjust:
- We use estimates of S based on past values to seasonally adjust a
current value
- Seasonally adjusted series reflect remainders as
well as trend
- Therefore they are
not smooth, and
downturns or upturns can be
misleading
It is better to use the trend-cycle component to look for turning points
Moving Averages (Classical Decomposition)
- Traditional way to perform time series decompositions,
- Originated in the 1920’s but still the basis for many
decompositions.
A moving average is an average of nearby points
- observations nearby in time are also likely to be
close in value
- average eliminates some
randomness in the data, leaving
a smooth trend-cycle component
- 3-MA: Tt = (y(t-1) + yt + y(t+1))/3
- 5-MA: Tt = (y(t-2) + y(t-1) + yt + y(t+1) + y(t+2))/5
- each average computed by dropping
oldest observation
and including next observation
- averaging
moves trough time series until trend-cycle
computed at each observation possible
First step: Get and estimate of the trend component
- We use Moving Averages for this:
global_economy |>
filter(Country == "Australia") |>
autoplot(Exports) +
labs(y="% of GDP", title = "Total Australian exports")

Why is there no estimates at the ends?
- for a 3-MA, there cannot be estimates at time 1 or time T because
the observations at time 0 and T+1 are not available
- Generally: there cannot be estimates at times near the
endpoints
The order of the MA
- larger order means smoother, flatter curve
- larger order means more points lost at ends
if order = lenght of season or cycle IT
removes pattern
- But so far odd orders?
- with even orders, we need to take 2 moving averages, to center the
data
If order==4: 1/4[yt-2 + yt-1 + yt + yt+1] or 1/4[yt-1 + yt + yt+1 +
yt+2] There are 2 options to compute the same 4-MA so we need to
recenter at half of both 4_MA : 1/2(1/4[yt-2 + yt-1 + yt + yt+1] +
1/4[yt-1 + yt + yt+1 + yt+2]) = 1/8yt-2 + 1/4yt-1 + 1/4yt + 1/4yt=1 +
1/8yt+2
A MA of the same lenght as the season removes the seasonal pattern
- For quarterly data: use 2x4 MA
- For monthly data: use a 2x12 MA
Moving average trend-cycle
us_retail_employment_1990 |>
mutate(`12-MA` = slider::slide_dbl(Employed, mean,
.before = 5, .after = 6, .complete = TRUE),
`2x12-MA` = slider::slide_dbl(`12-MA`, mean,
.before = 1, .after = 0, .complete = TRUE)) -> us_retail_employment_1990_ma
us_retail_employment_1990_ma
Plotting it
us_retail_employment_1990_ma |>
autoplot(Employed, color="black") +
autolayer(us_retail_employment_1990_ma, vars("2x12-MA"),
color="blue") +
labs(y="Persons (thousands)", title = "Total employment in US retail")

Classical Decomposition (estimating time series components)
Additive decomposition: components add
Multiplicative Decomposition: components multiply
EStimate Tt using MA (2xm)-MA if m is even. Otherwise, estimate
Tt using m-MA
Compute de-trended series
- Additive decomposition: yt - Tt
- Multiplicative decomposition: yt/Tt
- De-trending
Remove smoothed series Tt from yt to leave St and Rt
- Additive model yt - Tt + (Tt + St + Rt) - St = St + Rt
- Multiplicative model: yt/Tt = TtxStxRt/Tt = StxRt
This will leave behind Seasonal and Remainder
- Estimate Seasonal component
- Seasonal index for each season is estimated as an average of the
detrended series for that season of successive years
- E.g. take averages across all Januaries to get S1 if your data is
monthly
If necessary CONSTRAIN adjust the seasonal indices so
that:
- Additive: S1 + S2 + … + SN = 0
- Multiplicative: S1 + S2 + … + SN = m
The seasonal component St simply consists of replications of the
seasonal indices
- Additive decomposition: Rt = yt - Tt - St
- Multiplicative decomposition: Rt = yt/(Tt*St)
Classical decomposition
- Choose additive or multiplicative depending on which gives the most
stable components
- For multiplicative model, this method a.k.a.
ratio-to-moving-average method
Additive
dcmp <- us_retail_employment_1990 |>
model(stl = STL(Employed))
components(dcmp) |> autoplot() +
labs(title = "Classical additive decomposition of US retail employment")

Multiplicative
dcmp <- us_retail_employment_1990 |>
model(classical_decomposition(Employed, type = "multiplicative"))
components(dcmp) |> autoplot() +
labs(title = "Classical multiplicative decomposition of US retail employment")

Methods used in official statistics
Timeseries decomposition is officially used in official statistics
bureaus
- Classical method originated in 1920
Census II method introduced in 1957
- Basis for
X-11 method and variants (including X-12,
X-12-ARIMA, X-13-ARIMA)
STL method introduced in 1983 at BELL
LABS
TRAMO/SEATS introduced in 1990s in Europe
(bank of Spain)
National Statistics Offices
- ABS (Australia) uses X-12-ARIMA
- US Census Bureau uses X-13ARIMA-SEATS
- Statistics Canada uses X-12-ARIMA
- ONS (UK) uses X-12-ARIMA
- EuroStat uses X-13ARIMA-SEATS
How to do it using the FABLE packages
Install packages
# install.packages("seasonal")
Load packages
# library(seasonal)
X-13-ARIMA-SEATS Decomposition
x11_dcmp <- us_retail_employment_1990 |>
model(x11 = X_13ARIMA_SEATS(Employed ~ x11()))
components(x11_dcmp) # show the components
Plotting X-13-ARIMA MULTIPLICATIVE
components(x11_dcmp) |> autoplot()

Advantages
- Relatively robust to outliers
- Completly automated choices for trend and seasonal changes
- Very widely tested on economic data over a long period of time
Disadvantages
- No prediction/confidence intervals
- Ad hoc method with no underlying model
ONLY DEVELOPED FOR MONTHLY/QUARTER DATA
Extensions: X-12ARIMA- and X-13ARIMA
- The X11, X12ARIMA, X13ARIMA methods are based on Census II
decomposition
- These allow adjustments for trading days and other explanatory
variables
- Known outliers can be ommited
- Level shifts and ramp effects can be modelled
- Missing values estimated and replaced
- Holiday factors can be estimated
X-13ARIMA-SEATS
seats_dcmp <- us_retail_employment_1990 |>
model(seats = X_13ARIMA_SEATS(Employed ~ seats()))
components(seats_dcmp) |> autoplot()

Advantages of X-13ARIMA-SEATS
- Model based
CAN GET CONFIDENCE INTERVALS
- Smooth trend estimate
- Allows estimates at end points
- Allows changing seasonality
- Developed for economic data
Disadvantages of X-13ARIMA-SEATS
ONLY DEVELOPED FOR Q AND M DATA
STL Seasonal and Trend decomposition using Loess
- Loess is a form of smoothing based on locally linear
regressions
- Very versatile and robust
- Will handle any type of seasonality
- Seasonal component allowed to change over time
- Rate of change controlled by user
- Smoothness of trend-cycle also controlled by user
- Robust to outliers
Disadvantages:
- No trading day adjustment
- No calendar adjustments
- Only additive
For multiplicative: Take logs to get multiplicative decomposition Use
Box-Cox transformations to get other decompositions
Example
us_retail_employment_1990 |>
model(STL(Employed)) |>
components() |>
autoplot()

We can control the change in seasonal pattern and trend with
windows
us_retail_employment_1990 |>
model(STL(Employed ~ season(window=15) + trend(window=6665))) |>
components() |>
autoplot()

Observations
- Lot of structure in the remainder
- Because trend window = 6665
- Trend becomes a straight line
- What should have gone in trend goes to remainder
Default params work pretty good
Trend with small window
us_retail_employment_1990 |>
model(STL(Employed ~ season(window=15) + trend(window=15))) |>
components() |>
autoplot()

Robust argument (default is false)
us_retail_employment_1990 |>
model(STL(Employed ~ season(window=15) + trend(window=15), robust = TRUE)) |>
components() |>
autoplot()

Summary
- trend(window = ?) controls wiggliness of trend component
- Because of Loes
- weighted least squared fits
- smaller windows, fewer obs for each local linear regression
- season(window = ?) controls variations on seasonal component
- Same component in different years
- How many years to use when estimating each of seasonal
component
- season(window = ‘periodic’) is equivalent to infinite window
By default
- chooses season(window=13)
- Can include transformations
STL DECOMPOSITION
- Default trend: window =
nextodd(ceiling((1.5*period)/(1-(1.5/s.window))))
- Default season: window = 13
How does it work?
- Updated the trend and seasonal component iteratively
- Starts off by saying, lets suppose there is no trend, estimate
seasonality by classical decomp
- Removes that, tries to estimate trend using LOES curve
- Iterates
Explanation:
- Starts with Tt = 0
- Uses a mixture of
loess and MA to
sucessively refine the trend and seasonal
estimates
- The trend window controls
loess bandwidth applied to
deasonalised values
- The season window controls
loess bandwidth applied to
detrended subseries
- Robustbess weights based on remainder
THE END
---
title: "Time Series Decomposition"
output: html_notebook
---

## Load libraries

```{r}
library(fpp3)
```


```{r}
library(GGally)
```


```{r}
library(seasonal)
```

### Transformations and adjustments

* Per capita adjustments: Dividing data by population


```{r}
global_economy |>
  filter(Country == "Australia") |>
  autoplot(GDP / Population)
```

### Inflation adjustments

Australian newspapers adjusted for inlfation

```{r}
## Retail CPI
aus_dataset <- aus_retail |>
  filter(Industry == "Newspaper and book retailing") |>
  group_by(Industry) |>
  index_by(Year = year(Month)) |>
  summarise(Turnover = sum(Turnover))

aus_dataset |> autoplot()
```
* We see the data going up untill around 2010 and then start dipping down.
* Is this because a change in industry or just a change in the CPI?
* Or is it because the CPI?

```{r}
## CPI
aus_economy <- global_economy |>
  filter(Code == "AUS")

aus_economy
```

Joining the datasets

```{r}
aus_dataset |>
  left_join(aus_economy, by="Year") |>
  mutate(Adjusted_Turnover = Turnover / CPI *100) |>
  pivot_longer(c(Turnover, Adjusted_Turnover), values_to = "Turnover") |>
  mutate(name=factor(name, levels = c("Turnover","Adjusted_turnover"))) |>
  ggplot(aes(x=Year,y=Turnover)) + geom_line() +
  facet_grid(name ~ ., scales = "free_y") +
  labs(title="Turnover: Australian print media industry", y="$AU")
```

* We can see that adjusted for CPI the Turnover has been declining since 1980's
* Decline since 2010 is been faster, but has been going on since 1980

`When forecasting you want to forecast taking in to account CPI`

### Mathematical Transformation

* Take data and transform by applying a mathematical transformation
* We try to make the variations in the series more HOMOGENEOUS

```{r}
food <- aus_retail |>
  filter(Industry == "Food retailing") |>
  summarise(Turnover = sum(Turnover))
food
```

### The original plot:

```{r}
food |> autoplot(Turnover)
```

### Square root:

Works, but variations at the start of series are still bigger than variations at the end

```{r}
food |> autoplot(sqrt(Turnover)) + labs(title = "Square root Turnover")
```

### Cube root

Works, but variations at the start of series are still bigger than variations at the end

```{r}
food |> autoplot(Turnover^(1/3)) + labs(title = "Cube root Turnover")
```

### Log

```{r}
food |> autoplot(log(Turnover)) + labs(title = "Log Turnover")
```

* Much better, variations are more HOMOGENEOUS
* Size of fluctuations at the start are about same size at end

This is going to make modeling much easier
Makes timeseries decompositions easier
Good first step to remove that extra heterogeneity
Allows us to fit a much simpler model

### Inverse transformation

```{r}
food |> autoplot(-1/Turnover) + labs(title = "-1/Turnover")
```

* Gone too far, Log was about right

### BOX-COX Transformations

Bickel-Doksum transformations

### Function to compute the optimal lambda

```{r}
food |> features(Turnover, features = guerrero)
```
This attempts to balance the seasonal fluctuations and random variation across the series
Always check the results
A low value of lambda can give extremely large prediction intervals


### Using the guerrero transformation

```{r}
food |> autoplot(box_cox(Turnover, 0.08952696)) +
  labs(title = "Box-Cox Transformed turnover")
```

* Often no transformations needed
* Simple transformations are easier to explain and work well enough
* Can have a VERY LARGE effect on Prediction Interval
* If some data are 0 or negative, then use lambda > 0
* log1p() can also be useful for data with 0s
* Choosing logs is a simple way to force forecasts to be positive
* Transformations must be reversed to obtain forecasts on the original scale. (Handled automatically by fable)

# Time series components

A time series decomposition involves estimating timeseries components

* Trend: long term increase or decrease
* Seasonal pattern: influenced by the calendar
* Cycles: rises or falls that are not of fixed period

Decomposition: yt = f(St, Tt, Rt) where:
* yt = data at period t
* St = seasonal component at period t
* Tt = trend at period t
* Rt = remainder component at period t

Additive decomposition: yt = St + Tt + Rt
Multiplicative decomposition: yt = St * Tt * Rt

### Additive model

* Appropiate if magnitude of seasonal fluctuations does not vary with level
* If seasonal patterns are proportional to the level of the series -> MULTIPLICATIVE MODEL
* Multiplicative decomposition more prevalent with economic series

* `ALTERNATIVE`: Use a Box-Cox transformation, and then use additive decomposition
* Logs turn multiplicative relationships in to additive relationships

`yt = St * Tt * Rt --> ln(yt) = ln(St) + ln(Tt) + ln(Rt )`

### Example 

```{r}
us_employment |> 
  filter(Title == "Retail Trade", year(Month) >= 1990) |>
  select(-Series_ID) -> us_retail_employment_1990

us_retail_employment_1990 |> autoplot(Employed) + geom_point() +
  labs(y="Persons (thousands)", title = "Total employment in US retail")
```

* We have Trend + Cycle + Seasonality, lets fit a model

```{r}
us_retail_employment_1990 |>
  model(stl = STL(Employed))
```

### Looking at the model components

```{r}
dcmp <- us_retail_employment_1990 |>
  model(stl = STL(Employed))
components(dcmp)
```

### Autoplot the components

```{r}
components(dcmp) |> autoplot()
```

### Other things we can do with the components

```{r}
us_retail_employment_1990 |>
  autoplot(Employed, color="black") +
  autolayer(components(dcmp), trend, color="#D55E00") +
  labs(y= "Persons (thousands)", title = "Total employment in US retail")
```

### Subseries plot `Useful if interested in SHAPE of seasonality`

```{r}
components(dcmp) |> gg_subseries(season_year)
```

### Sesonal Adjustment

* Useful by-product of decomposition: an easy way to calculate seasonally adjusted data
* Additive decomposition: seasonally adjusted data given by yt - St = Tt + Rt
* Multiplicative: yt/St = Tt * Rt

Plotting the seasonally adjusted data: yt - St = Tt + Rt


```{r}
us_retail_employment_1990 |>
  autoplot(Employed, color="black") +
  autolayer(components(dcmp), season_adjust, color="blue") +
  labs(y="Persons (thousands)", title = "Total employment in US retail")
```

Observations:

* We can see the season_adjust (blue) line is more wobbly than the trend line in the above chart
* This is due to the effect of Rt the remainder, which is almost always noisy in nature

### How to seasonal adjust:

* We use estimates of S based on past values to seasonally adjust a current value
* Seasonally adjusted series reflect **remainders** as well as **trend**
* Therefore they are `not` `smooth`, and `downturns` or `upturns` can be `misleading`
* `It is better to use the trend-cycle component to look for turning points`

# Moving Averages (Classical Decomposition)

* Traditional way to perform time series decompositions, 
* Originated in the 1920's but still the basis for many decompositions.

A moving average is an average of nearby points

* observations nearby in time are also likely to be `close in value`
* average eliminates some `randomness` in the data, leaving a smooth trend-cycle component
* 3-MA: Tt = (y(t-1) + yt + y(t+1))/3
* 5-MA: Tt = (y(t-2) + y(t-1) + yt + y(t+1) + y(t+2))/5
* each average computed by dropping `oldest` observation and including `next` observation
* averaging `moves` trough time series until trend-cycle computed at each observation possible

### First step: Get and estimate of the trend component

* We use Moving Averages for this:


```{r}
global_economy |>
  filter(Country == "Australia") |>
  autoplot(Exports) +
  labs(y="% of GDP", title = "Total Australian exports")
```


### Why is there no estimates at the ends?

* for a 3-MA, there cannot be estimates at time 1 or time T because the observations at time 0 and T+1 are not available
* Generally: there cannot be estimates at times near the endpoints

### The order of the MA

* larger order means smoother, flatter curve
* larger order means more points lost at ends
* `if order = lenght of season or cycle` IT `removes pattern`
* But so far odd orders?
* with even orders, we need to take 2 moving averages, to center the data

If order==4: 1/4[yt-2 + yt-1 + yt + yt+1] or 1/4[yt-1 + yt + yt+1 + yt+2]
There are 2 options to compute the same 4-MA so we need to recenter at half of both
4_MA : 1/2(1/4[yt-2 + yt-1 + yt + yt+1] + 1/4[yt-1 + yt + yt+1 + yt+2]) = 1/8yt-2 + 1/4yt-1 + 1/4yt + 1/4yt=1 + 1/8yt+2

`A MA of the same lenght as the season removes the seasonal pattern`

* For quarterly data: use 2x4 MA
* For monthly data: use a 2x12 MA

### Moving average trend-cycle

```{r}
us_retail_employment_1990 |>
  mutate(`12-MA` = slider::slide_dbl(Employed, mean,
                                     .before = 5, .after = 6, .complete = TRUE),
         `2x12-MA` = slider::slide_dbl(`12-MA`, mean,
                                        .before = 1, .after = 0, .complete = TRUE)) -> us_retail_employment_1990_ma

us_retail_employment_1990_ma
```

### Plotting it

```{r}
us_retail_employment_1990_ma |>
  autoplot(Employed, color="black") +
  autolayer(us_retail_employment_1990_ma, vars("2x12-MA"),
            color="blue") +
  labs(y="Persons (thousands)", title = "Total employment in US retail")
```

# Classical Decomposition (estimating time series components)

Additive decomposition: components add

* yt = Tt + St + Rt

Multiplicative Decomposition: components multiply

* yt = Tt * St * Rt

1. EStimate Tt using MA (2xm)-MA if m is even. Otherwise, estimate Tt using m-MA

2. Compute de-trended series

  * Additive decomposition: yt - Tt
  * Multiplicative decomposition: yt/Tt

3. De-trending

Remove smoothed series Tt from yt to leave St and Rt

* Additive model yt - Tt + (Tt + St + Rt) - St = St + Rt
* Multiplicative model: yt/Tt = TtxStxRt/Tt = StxRt

This will leave behind Seasonal and Remainder

4. Estimate Seasonal component

* Seasonal index for each season is estimated as an average of the detrended series for that season of successive years
* E.g. take averages across all Januaries to get S1 if your data is monthly

If necessary `CONSTRAIN` adjust the seasonal indices so that:

  * Additive: S1 + S2 + ... + SN = 0
  * Multiplicative: S1 + S2 + ... + SN = m
  
The seasonal component St simply consists of replications of the seasonal indices

* Additive decomposition: Rt = yt - Tt - St
* Multiplicative decomposition: Rt = yt/(Tt*St)

`Classical decomposition`

* Choose additive or multiplicative depending on which gives the most stable components
* For multiplicative model, this method a.k.a. `ratio-to-moving-average` method

### Additive

```{r}
dcmp <- us_retail_employment_1990 |>
  model(stl = STL(Employed))
components(dcmp) |> autoplot() +
  labs(title = "Classical additive decomposition of US retail employment")
```

### Multiplicative

```{r}
dcmp <- us_retail_employment_1990 |>
  model(classical_decomposition(Employed, type = "multiplicative"))
components(dcmp) |> autoplot() +
  labs(title = "Classical multiplicative decomposition of US retail employment")
```

### General comments of Classical Decomposition

* Estimate of trend is unavailable for first few and last few observations
* **Seasonal component repeats** form year to year. May not be realistic
* **Not robust to outliers**
* Newer methods desinged to overcome these problems
  
# `Methods used in official statistics`

Timeseries decomposition is officially used in official statistics bureaus

* Classical method originated in 1920
* `Census II` method introduced in `1957`
* Basis for `X-11` method and variants (including X-12, X-12-ARIMA, X-13-ARIMA)
* `STL` method introduced in `1983` at BELL LABS
* `TRAMO/SEATS` introduced in `1990s` in Europe (bank of Spain)

National Statistics Offices

* ABS (Australia) uses X-12-ARIMA
* US Census Bureau uses X-13ARIMA-SEATS
* Statistics Canada uses X-12-ARIMA
* ONS (UK) uses X-12-ARIMA
* EuroStat uses X-13ARIMA-SEATS

## How to do it using the `FABLE` packages

### Install packages

```{r}
# install.packages("seasonal")
```
### Load packages

```{r}
# library(seasonal)
```

### X-13-ARIMA-SEATS Decomposition

```{r}
x11_dcmp <- us_retail_employment_1990 |>
  model(x11 = X_13ARIMA_SEATS(Employed ~ x11()))

components(x11_dcmp) # show the components
```

### Plotting X-13-ARIMA `MULTIPLICATIVE`

```{r}
components(x11_dcmp) |> autoplot()
```

### Advantages

* Relatively robust to outliers
* Completly automated choices for trend and seasonal changes
* Very widely tested on economic data over a long period of time

### Disadvantages

* No prediction/confidence intervals
* Ad hoc method with no underlying model
* `ONLY DEVELOPED FOR MONTHLY/QUARTER DATA`

### Extensions: X-12ARIMA- and X-13ARIMA

* The X11, X12ARIMA, X13ARIMA methods are based on Census II decomposition
* These allow adjustments for trading days and other explanatory variables
* Known outliers can be ommited
* Level shifts and ramp effects can be modelled
* Missing values estimated and replaced
* Holiday factors can be estimated

## X-13ARIMA-SEATS

```{r}
seats_dcmp <- us_retail_employment_1990 |>
  model(seats = X_13ARIMA_SEATS(Employed ~ seats()))

components(seats_dcmp) |> autoplot()
```

### Advantages of `X-13ARIMA-SEATS`

* Model based
* `CAN GET CONFIDENCE INTERVALS`
* Smooth trend estimate
* Allows estimates at end points
* Allows changing seasonality
* Developed for economic data

### Disadvantages of `X-13ARIMA-SEATS`

* `ONLY DEVELOPED FOR Q AND M DATA`

# STL Seasonal and Trend decomposition using Loess

* Loess is a form of smoothing based on locally linear regressions
* Very versatile and robust
* Will handle any type of seasonality
* Seasonal component allowed to change over time
  * Rate of change controlled by user
* Smoothness of trend-cycle also controlled by user
* Robust to outliers

Disadvantages:

* No trading day adjustment
* No calendar adjustments
* Only additive

For multiplicative: Take logs to get multiplicative decomposition
Use Box-Cox transformations to get other decompositions

### Example

```{r}
us_retail_employment_1990 |>
  model(STL(Employed)) |>
  components() |>
  autoplot()
```

### We can control the change in seasonal pattern and trend with windows

```{r}
us_retail_employment_1990 |>
  model(STL(Employed ~ season(window=15) + trend(window=6665))) |>
  components() |>
  autoplot()
```

Observations

* Lot of structure in the remainder
* Because trend window = 6665
* Trend becomes a straight line
* What should have gone in trend goes to remainder
* `Default params work pretty good`

### Trend with small window

```{r}
us_retail_employment_1990 |>
  model(STL(Employed ~ season(window=15) + trend(window=15))) |>
  components() |>
  autoplot()
```

* Observations

* With trend window = 15
* We capture the interesting behaviour our in Trend
* Noise goes to remainder
* No strong structure in remainder

### Robust argument (default is false)

```{r}
us_retail_employment_1990 |>
  model(STL(Employed ~ season(window=15) + trend(window=15), robust = TRUE)) |>
  components() |>
  autoplot()
```

### Summary

* trend(window = ?) controls wiggliness of trend component
  * Because of Loes
  * weighted least squared fits
  * smaller windows, fewer obs for each local linear regression
  
* season(window = ?) controls variations on seasonal component
  * Same component in different years
  * How many years to use when estimating each of seasonal component
* season(window = 'periodic') is equivalent to infinite window

By default

* chooses season(window=13)
* Can include transformations

`STL DECOMPOSITION`

* Default trend: window = nextodd(ceiling((1.5*period)/(1-(1.5/s.window))))
* Default season: window = 13

How does it work?

* Updated the trend and seasonal component iteratively
* Starts off by saying, lets suppose there is no trend, estimate seasonality by classical decomp
* Removes that, tries to estimate trend using LOES curve
* Iterates

Explanation:

* Starts with Tt = 0
* Uses a mixture of `loess` and `MA` to sucessively refine the `trend` and `seasonal` estimates
* The trend window controls `loess` bandwidth applied to deasonalised values
* The season window controls `loess` bandwidth applied to detrended subseries
* Robustbess weights based on remainder

### LINK TO PAPER:

http://bit.ly/stl1990


# THE END
