Exercise 6.2: plastics

a. Plot the time series

  • Seasonality: A 12-month annual cycle is evident from the seasonal plot
  • Trend-cycle: A trend of steady upward growth is evident from the 12-month moving average plot
autoplot(plastics) + 
  labs(title = "Monthly sales of plastic product", 
       x = "Year", y = "")

# seasonality
ggseasonplot(plastics) + 
  labs(title = "Monthly sales of plastic product",
       subtitle = "Seasonal plot")

# 12-month moving average trend
autoplot(plastics, series="Data") +
  autolayer(ma(plastics, 12), series="12-MA") +
  labs(title = "Monthly sales of plastic product", 
       subtitle = "Time series with 12-month moving average",
       x = "Year", y = "") +
  scale_colour_manual(values=c("Data"="grey","12-MA"="red"),
                      breaks=c("Data","12-MA"))
## Warning: Removed 12 rows containing missing values (geom_path).

b. Classical multiplicative decomposition

The components of the classical decomposition are plotted below. The values of the trend-cycle and seasonal components can be accessed using the trendcycle and seasonal functions.

plastics %>% decompose(type = "multiplicative") -> classical
autoplot(classical) + 
  labs(title = "Monthly sales of plastic product",
       subtitle = "Classical multiplicative decomposition",
       x = "Year", y = "")

# show trend & seasonal values
head(trendcycle(classical), 12)
##        Jan      Feb      Mar      Apr      May      Jun      Jul      Aug
## 1       NA       NA       NA       NA       NA       NA 976.9583 977.0417
##        Sep      Oct      Nov      Dec
## 1 977.0833 978.4167 982.7083 990.4167
head(seasonal(classical), 12)
##         Jan       Feb       Mar       Apr       May       Jun       Jul
## 1 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
##         Aug       Sep       Oct       Nov       Dec
## 1 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834

c. Graphical interpretation

It is apparent from the classical decomposition that there is a 12-month seasonality cycle as well as an upward trend-cycle in the time series, confirming the observations in part (a) above.

d. Plot the seasonally-adjusted data

The seasonally-adjusted data can be accessed using the seasadj function. The seasonally-adjusted data is plotted below along with the raw time series and trend line.

autoplot(plastics, series="Data") +
  autolayer(trendcycle(classical), series="Trend") + 
  autolayer(seasadj(classical), series="Seas. Adj.") + 
  labs(title = "Monthly sales of plastic product", 
       subtitle = "Data with trend-cycle and seasonal adjustment",
       x = "Year", y = "") +
  scale_colour_manual(values=c("gray","blue","red"),
             breaks=c("Data","Trend","Seas. Adj."))
## Warning: Removed 12 rows containing missing values (geom_path).

e. Add outliers

To determine the effect of outliers, we will adjust observations near the beginning, middle, and end of the time series. In particular, we add +/- 500 to observations in months 3, 21, 39, and 57, resulting in 4 new times series (Outlier1 through Outlier4).

The 4 times series are plotted below, followed by the seasonally-adjusted data. The outliers are apparent in the seasonally-adjusted data; further we can see that the position of the outlier matters, as outliers near the beginning or end of the time series are isolated whereas outliers in the middle of the time series produce “echos” throughout the time series. For instance, the outlier in month 39 (see the SeasAdj3 curve in brown below), which is an outlier on the downside, produces smaller spikes on the upside in earlier and later years because of the seasonal adjustment.

# adjust points at beginning, middle, and end
idx <- c(3, 21, 39, 57)
# list of adjusted time series
tslist <- list()
# list of seasonally adjusted time series
fitlist <- list()

for (j in 1:4) {
  tempts <- plastics
  tempidx <- idx[j]
  tempts[tempidx] <- plastics[tempidx] + (-1)^j * 500
  tslist[[j]] <- tempts
  fitlist[[j]] <- decompose(tempts, type = "multiplicative")
}

# plot time series
autoplot(tslist[[1]], series="Outlier1") + 
  autolayer(tslist[[2]], series="Outlier2") + 
  autolayer(tslist[[3]], series="Outlier3") + 
  autolayer(tslist[[4]], series="Outlier4") + 
  autolayer(plastics, series="Data") + 
  labs(title = "Monthly sales of plastic product",
       subtitle = "Data with outliers near beginning, middle, and end of time series",
       x = "Year", y = "") +
  scale_colour_manual(values=c("grey","red","blue","brown","purple"), 
                      breaks=c("Data","Outlier1","Outlier2","Outlier3","Outlier4"))

# plot seasonally-adjusted data
autoplot(seasadj(classical), series="SeasAdj") +
  autolayer(seasadj(fitlist[[1]]), series="SeasAdj1") +
  autolayer(seasadj(fitlist[[2]]), series="SeasAdj2") +
  autolayer(seasadj(fitlist[[3]]), series="SeasAdj3") +
  autolayer(seasadj(fitlist[[4]]), series="SeasAdj4") +
  labs(title = "Monthly sales of plastic product", 
       subtitle = "Seasonally adjusted time series for data and data with outliers",
       x = "Year", y = "") +
  scale_colour_manual(values=c("black","red","blue","brown","purple"),
             breaks=c("SeasAdj","SeasAdj1","SeasAdj2","SeasAdj3","SeasAdj4"))

f. Position of outliers

As seen above in part (e), the position of the outlier matters. If the outlier is near either end of the time series, it only affects the seasonally-adjusted time series in its neighborhood; however if the outlier is in the middle of the time series, it potentially affects the entire seasonally-adjusted time series. The reason for this lies in the seasonal adjustment and how the seasonality component is computed.

Recall that in the classical decomposition, the seasonal pattern is computed by removing the trend-cycle from the raw time series, and the trend-cycle is computed as a moving average (here over 12 months). As a result, the trend-cycle is not available for the first or last 6 months of the time series, and the seasonality pattern is computed over months 7 through 7 months prior to the end of the time series. So in Outlier1 and Outlier4, the outliers in months 3 and 57 occur in the beginning and ending “stub period”, which don’t contribute to the calculation of the seasonality component. Consequently, the impact of the outlier on the seasonally-adjusted time series occurs only in the neighborhood of the outlier. On the other hand, when the outlier occurs in the middle of the time series (or more specifically, after the first 6 months or prior to the last 6 months), then the seasonality pattern is computed including the effect of the outlier. As a result, the outlier will impact the entire seasonality curve, primarily in the month of the outlier but also in the overall average level of the curve.

This can be seen below in the seasonality components for Outlier1 (example of outlier at the beginning of the time series) and Outlier3 (example of outlier in the middle of the time series). It is evident that the outlier is included in the seasonal component for Outlier3 but not for Outlier1.

autoplot(fitlist[[1]]) + 
  labs(subtitle = "Outlier1")

autoplot(fitlist[[3]]) + 
  labs(subtitle = "Outlier3")

Exercise 6.3: Australian retail data

For the Australian retail data, I chose the 7th variable column in the retail dataset, which relates to “Turnover - New South Wales - Hardware, building and garden supplies retailing”.

The X11 decomposition of the time series is shown below; note that this is based on a multiplicative decomposition. Some features that are evident in the plots:

  • Trend: overall upward trend
  • Cycle: economic cycle every 7-10 years
  • Seasonality: 12-month seasonal pattern.

Of particular note, the seasonal component plot highlights the fact that the amplitude of the seasonality pattern varies over time: as a proportion of the trend line, the seasonal factor had a larger amplitude (over 2.0x) in the early years of the time series, and a smaller and consistent magnitude in the later years (around 1.2x). In addition, outliers (unexplained by the trend and seasonal components) are evident in the remainder component, including for instance, an observation of 1.2x in June 2000.

retaildata <- readxl::read_excel("../Week2/retail.xlsx", skip=1)
colID <- colnames(retaildata)[8]
myts <- ts(retaildata[ , colID], frequency=12, start=c(1982,4))

autoplot(myts) + 
  labs(title = "Turnover - NSW - Hardware, building & garden supplies retailing", 
       subtitle = "Monthly 1982/04 - 2013/12", 
       x = "Year", y = "")

myts %>% seas(x11="") %>% 
  autoplot() + 
  labs(title = "X11 decomposition of Australian retail data", 
       subtitle = "Monthly 1982/04 - 2013/12", 
       x = "Year", y = "")