Exercise 6.2: The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years.
autoplot(plastics) + ggtitle('Sales of product A') +
theme(plot.title = element_text(hjust = 0.5))
The plot suggests an overall upward trend in the data, with a monthly seasonality that peaks around July and dips towards February (assuming that the start of each year on the axis is January rather than the start of a fiscal calendar).
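The peak and trough months can also be read off numerically rather than by eye. The following is a minimal base-R sketch on a hypothetical, noise-free series (not the plastics data); the series `y` and the names `monthly_mean`, `peak_month` and `trough_month` are made up for illustration, and the month labels assume that period 1 of each year is January.

```r
# Hypothetical, noise-free monthly series used only for illustration
# (this is NOT the plastics data).
t <- 1:60
y <- ts((1000 + 8 * t) * (1 + 0.25 * sin(2 * pi * (t - 4) / 12)),
        frequency = 12, start = c(1, 1))

# cycle() gives each observation's position within the year (1 = first period),
# so averaging by cycle yields one mean per calendar month.
monthly_mean <- tapply(as.numeric(y), as.integer(cycle(y)), mean)
names(monthly_mean) <- month.abb  # assumption: period 1 is January

peak_month   <- names(which.max(monthly_mean))
trough_month <- names(which.min(monthly_mean))
print(round(monthly_mean))
```

The same `tapply(..., cycle(...), mean)` call could be applied to `plastics` once `fpp2` is loaded. Note that a strong trend can shift the raw monthly means slightly, so this is a rough check rather than a proper seasonal estimate.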
fit_6_2 <- plastics %>% decompose(type="multiplicative")
autoplot(fit_6_2) + xlab("Year") +
ggtitle("Classical multiplicative decomposition
of sales of product A")
As indicated above, there is a yearly seasonality coupled with an overall upward trend in the data. The remainder does seem to have something of a cyclical component to it, so the basic decomposition isn’t capturing the whole story.
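To make the structure of the multiplicative decomposition concrete, here is a small base-R sketch on a simulated series (an assumption for illustration, not the plastics data). It checks that the data equal trend × seasonal × remainder wherever the trend is defined, and computes the autocorrelation of the remainder as one rough way to judge whether structure is left over.

```r
# Simulated monthly series for illustration only (NOT the plastics data).
set.seed(1)
n <- 60
t <- seq_len(n)
y <- ts((1000 + 8 * t) *
          (1 + 0.25 * sin(2 * pi * (t - 4) / 12)) *
          exp(rnorm(n, sd = 0.02)),
        frequency = 12, start = c(1, 1))

fit <- decompose(y, type = "multiplicative")

# Multiplicative identity: y = trend * seasonal * remainder.
# The centred moving-average trend is NA for the first and last six months.
recon <- fit$trend * fit$seasonal * fit$random
ok <- !is.na(recon)
max_abs_err <- max(abs(recon[ok] - y[ok]))

# The 12 seasonal indices are normalised to average 1.
mean_index <- mean(fit$figure)

# Autocorrelation of the remainder; sizeable values at low lags would hint
# at leftover (e.g. cyclical) structure.
rem_acf <- acf(na.omit(fit$random), lag.max = 12, plot = FALSE)
print(max_abs_err)
```

Running `acf(na.omit(fit_6_2$random))` on the actual fit would be the analogous check for the plastics remainder.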
autoplot(plastics, series="Data") +
autolayer(trendcycle(fit_6_2), series="Trend") +
autolayer(seasadj(fit_6_2), series="Seasonally Adjusted") +
xlab("Year") + ylab("Sales") +
ggtitle("Monthly sales of product A for a plastics manufacturer") +
scale_colour_manual(values=c("gray","blue","red"),
breaks=c("Data","Seasonally Adjusted","Trend"))
plastics2 <- plastics
plastics2[14] <- 1500
fit_6_2e <- plastics2 %>% decompose(type="multiplicative")
autoplot(plastics2, series="Data") +
autolayer(trendcycle(fit_6_2e), series="Trend") +
autolayer(seasadj(fit_6_2e), series="Seasonally Adjusted") +
xlab("Year") + ylab("Sales") +
ggtitle("Monthly sales of product A for a plastics manufacturer (edited)") +
scale_colour_manual(values=c("gray","blue","red"),
breaks=c("Data","Seasonally Adjusted","Trend"))
Replacing the 14th observation with 1500 has only a slight effect on the trend, but the seasonally adjusted data changes dramatically. Not only does the seasonally adjusted series carry the errant positive spike, it is also distorted at the seasonal dip in every other year, because the outlier inflates the seasonal index estimated for that month.
If we were to add the point towards the end as opposed to the middle, the trend line would also be strongly influenced, skewing it in the direction of the outlier and affecting any forecast we were to make. The impact on the seasonally adjusted data would depend on where in the seasonal pattern the outlier fell: an outlier that lines up with a peak or dip in the appropriate direction would probably be better captured by the seasonal component (though it would reduce the accuracy of the magnitudes elsewhere).
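The middle-versus-end question can be probed directly for classical decomposition. The sketch below is a base-R experiment on a simulated series (an assumption for illustration, not the plastics data); the helper names `add_outlier`, `seas_shift` and `trend_shift` are hypothetical. One caveat worth keeping in mind: `decompose()` estimates the trend with a centred 2×12 moving average, which is undefined for the first and last six observations, so an outlier in the final months can only reach the few trend values that remain. Other methods, or a forecasting model fitted to the raw data, may react differently.

```r
# Simulated monthly series for illustration only (NOT the plastics data).
set.seed(42)
n <- 60
t <- seq_len(n)
y <- ts((1000 + 8 * t) *
          (1 + 0.25 * sin(2 * pi * (t - 4) / 12)) *
          exp(rnorm(n, sd = 0.02)),
        frequency = 12, start = c(1, 1))

# Hypothetical helper: add `bump` to the observation at position `idx`.
add_outlier <- function(x, idx, bump = 500) {
  x[idx] <- x[idx] + bump
  x
}

fit_base <- decompose(y, type = "multiplicative")
fit_mid  <- decompose(add_outlier(y, 30), type = "multiplicative")  # middle
fit_end  <- decompose(add_outlier(y, 58), type = "multiplicative")  # near end

# Largest change in the 12 seasonal indices relative to the clean fit.
seas_shift <- function(fit) max(abs(fit$figure - fit_base$figure))
# Largest change in the trend, and how many trend values moved at all.
trend_shift <- function(fit) max(abs(fit$trend - fit_base$trend), na.rm = TRUE)
n_trend_changed <- function(fit) {
  sum(abs(fit$trend - fit_base$trend) > 1e-9, na.rm = TRUE)
}

print(c(mid = seas_shift(fit_mid), end = seas_shift(fit_end)))
print(c(mid = trend_shift(fit_mid), end = trend_shift(fit_end)))
print(c(mid = n_trend_changed(fit_mid), end = n_trend_changed(fit_end)))
```

Swapping `decompose()` for another method such as `stl()`, or moving the outlier to a peak month instead of a dip, would be natural variations for testing the statements above on the real data.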
Exercise 6.3: Recall your retail time series data (from Exercise 3 in Section 2.10). Decompose the series using X11. Does it reveal any outliers, or unusual features that you had not noticed previously?
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349874C"],
frequency=12, start=c(1982,4))
library(seasonal)
fit_6_3 <- myts %>% seas(x11='')
autoplot(fit_6_3) +
ggtitle("X11 decomposition of retail data from Ex.3 of Section 2.10")
This X11 decomposition highlights a few things that I had not noticed previously: