The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years.
Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle?
Across the five years of data there are clear seasonal fluctuations, with peaks in August/September and troughs in January/February. The overall trend is increasing at what appears to be a steady rate.
library(fpp2)  # attaches forecast, ggplot2, and fma (which provides plastics)
autoplot(plastics) +
  ggtitle('Plastics sales TS')
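The visual reading can be cross-checked numerically by averaging the series month by month. The sketch below is only a rough check, since raw monthly means ignore the trend. It uses a synthetic stand-in series `y` (a made-up seasonal pattern times a linear trend), not the actual `plastics` data; with `fpp2` loaded, `y` could be replaced by `plastics`.

```r
# Hypothetical stand-in for `plastics`: 5 years of monthly data with a
# linear trend and a made-up multiplicative seasonal pattern (mean 1).
pattern <- c(0.78, 0.72, 0.78, 0.92, 1.00, 1.08,
             1.14, 1.22, 1.20, 1.14, 1.04, 0.98)
trend <- seq(900, 1400, length.out = 60)
y <- ts(trend * rep(pattern, times = 5), start = c(1, 1), frequency = 12)

# Average the observations that fall in each calendar month.
monthly_mean <- as.numeric(tapply(as.numeric(y), as.integer(cycle(y)), mean))
names(monthly_mean) <- month.abb

# Months with the highest and lowest average level.
peak_month <- month.abb[which.max(monthly_mean)]
trough_month <- month.abb[which.min(monthly_mean)]

print(round(monthly_mean))
cat("peak:", peak_month, " trough:", trough_month, "\n")
```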
Use a classical multiplicative decomposition to calculate the trend-cycle and seasonal indices.
A classical decomposition can be performed using the base R function decompose(). The added argument type="multiplicative" makes the decomposition multiplicative instead of additive (the default).
decomp <- decompose(plastics, type="multiplicative")
decomp %>% autoplot()+ ggtitle('Decomposition of plastics TS')
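To make the steps behind decompose() concrete, the sketch below reproduces the multiplicative seasonal indices by hand: a centered 2x12 moving average for the trend-cycle, division to detrend, and month-by-month averaging rescaled to mean 1. It runs on a synthetic stand-in series `y` (an assumption for illustration, not the `plastics` data), and the variable names are my own.

```r
# Hypothetical stand-in for `plastics` (made-up pattern and trend).
pattern <- c(0.78, 0.72, 0.78, 0.92, 1.00, 1.08,
             1.14, 1.22, 1.20, 1.14, 1.04, 0.98)
y <- ts(seq(900, 1400, length.out = 60) * rep(pattern, times = 5),
        start = c(1, 1), frequency = 12)

# Step 1: trend-cycle from a centered 2x12 moving average.
# The result is NA for the first and last six months.
trend_hat <- stats::filter(y, c(0.5, rep(1, 11), 0.5) / 12, sides = 2)

# Step 2: detrend by division (multiplicative model y = T * S * R).
detrended <- y / trend_hat

# Step 3: average the detrended values for each month, then rescale
# so that the twelve indices average to 1.
raw_index <- tapply(as.numeric(detrended), as.integer(cycle(y)),
                    mean, na.rm = TRUE)
seasonal_index <- as.numeric(raw_index) / mean(raw_index)

# Compare with the indices returned by decompose().
fit <- decompose(y, type = "multiplicative")
print(all.equal(seasonal_index, fit$figure))
```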
Do the results support the graphical interpretation from part a?
The results of the decomposition support the graphical interpretation. The seasonal component shows the peaks and troughs discussed previously, and the trend-cycle component is approximately linear and increasing.
Compute and plot the seasonally adjusted data.
Since a multiplicative decomposition was used, the seasonally adjusted data can be calculated as y/S, where y is the observed series and S is the seasonal component. The resulting series combines the trend-cycle and remainder components and can be referred to as a deseasonalized time series.
seas_adj <- plastics/decomp$seasonal
seas_adj %>% autoplot() + ggtitle('Seasonally adjusted plastics TS')
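As a quick check of the identity y/S = T x R, the sketch below compares the two sides wherever the trend-cycle is defined. It again uses a synthetic stand-in series `y`, which is an assumption for illustration and not the `plastics` data.

```r
# Hypothetical stand-in for `plastics` (made-up pattern and trend).
pattern <- c(0.78, 0.72, 0.78, 0.92, 1.00, 1.08,
             1.14, 1.22, 1.20, 1.14, 1.04, 0.98)
y <- ts(seq(900, 1400, length.out = 60) * rep(pattern, times = 5),
        start = c(1, 1), frequency = 12)

fit <- decompose(y, type = "multiplicative")

# Seasonally adjusted series: divide out the seasonal component.
seas_adj <- y / fit$seasonal

# Product of trend-cycle and remainder; NA where the trend is undefined.
trend_times_remainder <- fit$trend * fit$random
inside <- !is.na(fit$trend)

# The two agree wherever the trend-cycle exists.
print(all.equal(as.numeric(seas_adj)[inside],
                as.numeric(trend_times_remainder)[inside]))

# y/S is available for every month, unlike T and R at the ends.
print(sum(!inside))
```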
Change one observation to be an outlier (e.g., add 500 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?
The outlier has a large effect. The spike is partly absorbed into the seasonal component, so the seasonally adjusted data show a corresponding dip in the same month of the other years, while the outlier itself remains visible as a spike.
plastics_outlier <- plastics
plastics_outlier[30] <- plastics_outlier[30] + 500
decomp2 <- decompose(plastics_outlier, type="multiplicative")
decomp2 %>% autoplot
seas_adj2 <- plastics_outlier/decomp2$seasonal
seas_adj2 %>% autoplot()+ ggtitle('Seasonally adjusted plastics TS with outlier added')
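The mechanism can be seen by comparing the seasonal index of the affected month, and the adjusted values for that month, with and without the outlier. This is a sketch on a synthetic stand-in series `y` (an assumption, not the `plastics` data), where observation 30 falls in June of the third year.

```r
# Hypothetical stand-in for `plastics` (made-up pattern and trend).
pattern <- c(0.78, 0.72, 0.78, 0.92, 1.00, 1.08,
             1.14, 1.22, 1.20, 1.14, 1.04, 0.98)
y <- ts(seq(900, 1400, length.out = 60) * rep(pattern, times = 5),
        start = c(1, 1), frequency = 12)

# Add the outlier to observation 30 (June of year 3 in this series).
y_out <- y
y_out[30] <- y_out[30] + 500

fit <- decompose(y, type = "multiplicative")
fit_out <- decompose(y_out, type = "multiplicative")
adj <- y / fit$seasonal
adj_out <- y_out / fit_out$seasonal

# The June seasonal index is inflated by the single outlier.
print(round(c(before = fit$figure[6], after = fit_out$figure[6]), 3))

# Adjusted June values: lower in the other years, higher in year 3.
june <- which(as.integer(cycle(y)) == 6)
print(round(cbind(before = as.numeric(adj)[june],
                  after = as.numeric(adj_out)[june])))
```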
Does it make any difference if the outlier is near the end rather than in the middle of the time series?
The previous outlier was added in the middle of the data, at observation 30 (of 60). The same process is repeated here to see the effect of adding the same amount to an observation near the end of the time series (observation 58).
plastics_outlier2 <- plastics
plastics_outlier2[58] <- plastics_outlier2[58] + 500
decomp3 <- decompose(plastics_outlier2, type="multiplicative")
decomp3 %>% autoplot
seas_adj3 <- plastics_outlier2/decomp3$seasonal
seas_adj3 %>% autoplot()+ ggtitle('Seasonally adjusted plastics TS with outlier added at the end')
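One reason the position can matter: decompose() estimates the trend-cycle with a centered moving average, which is undefined for the first and last six months of a monthly series. An outlier at observation 58 therefore has no detrended value of its own to feed into the seasonal index, whereas one at observation 30 does. The sketch below illustrates this on a synthetic stand-in series `y` (an assumption, not the `plastics` data), so the exact magnitudes will differ for the real data.

```r
# Hypothetical stand-in for `plastics` (made-up pattern and trend).
pattern <- c(0.78, 0.72, 0.78, 0.92, 1.00, 1.08,
             1.14, 1.22, 1.20, 1.14, 1.04, 0.98)
y <- ts(seq(900, 1400, length.out = 60) * rep(pattern, times = 5),
        start = c(1, 1), frequency = 12)

y_mid <- y
y_mid[30] <- y_mid[30] + 500   # June of year 3 (middle)
y_end <- y
y_end[58] <- y_end[58] + 500   # October of year 5 (near the end)

fit <- decompose(y, type = "multiplicative")
fit_mid <- decompose(y_mid, type = "multiplicative")
fit_end <- decompose(y_end, type = "multiplicative")

# The trend-cycle is not estimated at observation 58.
print(is.na(fit_end$trend[58]))

# Relative change in the seasonal index of the month that was hit.
change_mid <- fit_mid$figure[6] / fit$figure[6] - 1
change_end <- fit_end$figure[10] / fit$figure[10] - 1
print(round(100 * c(mid = change_mid, end = change_end), 2))

# The late outlier passes almost unchanged into the adjusted series.
adj <- y / fit$seasonal
adj_end <- y_end / fit_end$seasonal
print(round(as.numeric(adj_end)[58] - as.numeric(adj)[58]))
```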
Recall your retail time series data (from Exercise 3 in Section 2.10). Decompose the series using X11. Does it reveal any outliers, or unusual features that you had not noticed previously?
From a previous assignment, it was written of this time series:
There is an obvious increasing trend over time. The rate of increase changes across periods: a steady increase during the 1980s, a leveling off during the early to mid 1990s, then sharp increases from the late 1990s through about 2005. There is still an overall increase after that, but the rate flattens. Seasonality is certainly a major factor, particularly in December when values are highest, with a second bump generally occurring in May; February is consistently the lowest. Sales show high levels of serial correlation due to the underlying trend.
The decomposition is consistent with the previous observations about the original time series. It is particularly useful for analyzing the trend-cycle, which is more clearly visible in the decomposition than in the raw series. The unusual values in early 1989, indicated by the remainder component, were something I had not noticed originally.
# read in retail data from excel spreadsheet
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
# create time series object
myts <- ts(retaildata[,"A3349627V"],
frequency=12, start=c(1982,4))
# decomposition (seas() is provided by the seasonal package)
library(seasonal)
fit <- myts %>% seas(x11 = "")
# plot decomposition
autoplot(fit) +
ggtitle("X11 decomposition liquor sales")
myts %>% window()
##        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
## 1982                    41.7  43.1  40.3  40.9  42.1  42.0  46.1  46.5  53.8
## 1983  43.8  39.3  43.4  43.7  42.3  40.4  41.6  42.2  42.5  45.0  45.8  56.8
## 1984  47.2  44.5  46.1  44.8  45.3  44.8  42.5  46.3  46.2  47.4  50.8  65.3
## 1985  52.2  45.8  49.4  47.3  48.2  45.2  47.3  50.3  48.5  52.2  55.0  70.3
## 1986  54.8  49.0  53.7  51.4  50.7  47.4  48.3  52.6  51.8  57.4  56.8  73.3
## 1987  59.5  53.9  56.2  57.3  55.1  52.7  57.2  56.8  57.9  64.3  63.6  82.5
## 1988  67.8  59.8  64.6  62.4  59.6  56.4  54.1  52.9  59.4  63.9  67.1  99.0
## 1989  51.6  52.3  66.8  70.1  73.5  67.7  71.1  73.4  84.0  85.2  88.7 146.7
## 1990  85.1  75.5  82.1  79.9  79.9  76.0  71.1  72.7  69.0  73.0  83.6 121.0
## 1991  76.4  67.3  74.3  73.1  75.9  66.0  66.9  72.2  69.8  75.0  79.7 113.9
## 1992  77.5  66.7  72.3  73.8  69.7  65.4  69.4  70.8  72.0  77.0  77.8 112.3
## 1993  81.1  69.3  69.1  75.6  70.4  65.6  72.3  73.4  71.5  78.2  82.9 124.0
## 1994  84.7  71.4  77.1  75.7  72.3  73.5  78.8  77.7  84.0  90.6  98.7 144.9
## 1995  94.3  83.3  90.3  93.0  84.4  85.9  89.6  91.1  95.5  93.4 104.2 150.2
## 1996  93.7  92.2 101.3  98.7  98.7  95.1  92.5  97.6  94.7 101.7 113.3 153.1
## 1997  93.7  92.1 103.3  95.7 103.7  96.7  91.1  95.4  90.4 102.0 108.7 157.4
## 1998 100.7  88.9  94.0  93.8  94.6  87.7  87.1  84.3  85.3 101.7 103.9 150.6
## 1999  91.5  84.4  96.6  92.1  92.1  86.6  97.5  93.9 100.1 107.4 108.3 159.8
## 2000  96.8  94.9  97.5 101.0  97.7 106.2  96.9 105.7 113.8 113.3 117.3 175.9
## 2001 122.6 106.4 119.6 111.7 112.1 107.0 102.6 106.5 109.3 121.5 131.2 191.9
## 2002 124.8 111.6 125.5 113.2 115.5 110.2 123.6 122.6 124.9 143.4 154.5 199.0
## 2003 142.6 124.6 134.6 135.6 133.6 133.8 130.0 131.7 138.8 150.0 160.2 220.5
## 2004 143.0 128.7 140.8 141.5 133.0 133.4 138.4 135.9 142.0 148.2 153.7 226.9
## 2005 145.7 139.3 155.9 149.3 143.1 142.8 144.4 150.3 157.6 177.8 194.7 279.9
## 2006 171.9 157.6 174.7 172.2 173.0 164.7 171.1 173.6 181.7 189.3 206.5 296.5
## 2007 191.2 179.7 194.9 191.4 183.3 173.5 182.1 189.9 204.4 204.5 217.3 309.9
## 2008 214.9 184.3 210.3 200.4 196.9 195.4 179.5 176.8 179.2 214.1 222.8 310.8
## 2009 242.9 194.0 215.0 213.6 211.3 206.0 201.4 208.7 206.1 218.5 236.1 325.9
## 2010 233.7 199.0 222.6 222.2 213.4 206.3 206.7 212.9 224.3 238.7 251.2 380.0
## 2011 250.0 216.8 240.5 244.4 232.1 223.6 239.0 245.7 251.6 264.3 282.9 409.6
## 2012 265.2 239.4 259.6 247.1 243.3 238.2 243.2 255.7 261.9 265.8 283.4 400.3
## 2013 274.1 231.7 273.4 256.1 250.0 241.9 248.7 264.6 262.8 264.4 271.5 394.5
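The unusual values in early 1989 can also be seen directly in the printed data. The sketch below is not an X11 decomposition; it is a crude year-over-year comparison that divides each 1989 month by the average of the same month in 1988 and 1990, using the values copied from the output above. The variable names are my own.

```r
# Monthly values for 1988-1990, copied from the printed series above.
sales <- c(
  67.8, 59.8, 64.6, 62.4, 59.6, 56.4, 54.1, 52.9, 59.4, 63.9, 67.1,  99.0,
  51.6, 52.3, 66.8, 70.1, 73.5, 67.7, 71.1, 73.4, 84.0, 85.2, 88.7, 146.7,
  85.1, 75.5, 82.1, 79.9, 79.9, 76.0, 71.1, 72.7, 69.0, 73.0, 83.6, 121.0
)

# One column per year (1988, 1989, 1990), one row per month.
by_year <- matrix(sales, nrow = 12,
                  dimnames = list(month.abb, c("1988", "1989", "1990")))

# Ratio of each 1989 month to the mean of the same month in the
# neighboring years; values well below 1 are unusually low.
ratio_1989 <- by_year[, "1989"] / rowMeans(by_year[, c("1988", "1990")])
print(round(ratio_1989, 2))

# Month with the largest shortfall relative to its neighbors.
lowest_month <- month.abb[which.min(ratio_1989)]
cat("lowest relative month in 1989:", lowest_month, "\n")
```

Because the series is trending, ratios somewhat above or below 1 are expected; only the size of the January and February shortfall relative to the other months is of interest here.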