Exercise 6.2

The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years.

The dataset is small enough that we can start by visually inspecting the data.

plastics
##    Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
## 1  742  697  776  898 1030 1107 1165 1216 1208 1131  971  783
## 2  741  700  774  932 1099 1223 1290 1349 1341 1296 1066  901
## 3  896  793  885 1055 1204 1326 1303 1436 1473 1453 1170 1023
## 4  951  861  938 1109 1274 1422 1486 1555 1604 1600 1403 1209
## 5 1030 1032 1126 1285 1468 1637 1611 1608 1528 1420 1119 1013
a) Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle?
autoplot(plastics)

Answer:

It looks like there is a bit of a positive trend and seasonal fluctuations are present.

b) Use a classical multiplicative decomposition to calculate the trend-cycle and seasonal indices.
decplas <- plastics %>% decompose(type="multiplicative") 

autoplot(decplas) + xlab("Year") +
  ggtitle("Classical multiplicative decomposition of plastics")

decplas$figure
##  [1] 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
##  [8] 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834
c) Do the results support the graphical interpretation from part a?
Answer:

The results support the graphical interpretation from part a. There is a clear indication of positive trend and seasonality.

d) Compute and plot the seasonally adjusted data.
#Compute
seasadj(decplas)
##         Jan       Feb       Mar       Apr       May       Jun       Jul
## 1  967.3468  981.2262  999.3182  986.4758  985.8925  956.7826 1001.1759
## 2  966.0431  985.4495  996.7427 1023.8257 1051.9377 1057.0417 1108.5982
## 3 1168.1168 1116.3736 1139.6864 1158.9443 1152.4413 1146.0648 1119.7701
## 4 1239.8204 1212.1029 1207.9388 1218.2647 1219.4437 1229.0378 1277.0364
## 5 1342.8129 1452.8342 1450.0416 1411.6051 1405.1361 1414.8628 1384.4587
##         Aug       Sep       Oct       Nov       Dec
## 1  992.4139  981.0263  951.4241  978.9119  939.8819
## 2 1100.9592 1089.0366 1090.2260 1074.6860 1081.5244
## 3 1171.9625 1196.2349 1222.2981 1179.5334 1227.9683
## 4 1269.0820 1302.6210 1345.9580 1414.4319 1451.2352
## 5 1312.3369 1240.9008 1194.5377 1128.1178 1215.9647
#Plot
autoplot(plastics, series='Plastics') +
  autolayer(seasadj(decplas), series='Seasonally Adjusted') + 
  ggtitle("Plastics - Seasonal Adjustments")

e) Change one observation to be an outlier (e.g., add 500 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?
plastics2 <- plastics
plastics2[10] <- plastics2[10]+500
decplas2 <- plastics2 %>% decompose(type="multiplicative") 
seasadj(decplas2)
##         Jan       Feb       Mar       Apr       May       Jun       Jul
## 1  979.0979  993.1357 1011.3499  993.6901  988.2701  959.0900 1014.1829
## 2  977.7783  997.4103 1008.7433 1031.3131 1054.4746 1059.5908 1123.0009
## 3 1182.3069 1129.9234 1153.4080 1167.4198 1155.2206 1148.8287 1134.3179
## 4 1254.8815 1226.8146 1222.4822 1227.1740 1222.3846 1232.0018 1293.6273
## 5 1359.1251 1470.4677 1467.4999 1421.9284 1408.5247 1418.2749 1402.4453
##         Aug       Sep       Oct       Nov       Dec
## 1 1005.2125  993.5579 1258.5054  991.2801  951.2286
## 2 1115.1577 1102.9480 1000.0141 1088.2642 1094.5811
## 3 1187.0766 1211.5156 1121.1578 1194.4364 1242.7930
## 4 1285.4486 1319.2607 1234.5853 1432.3027 1468.7553
## 5 1329.2613 1256.7521 1095.6945 1142.3712 1230.6444
#Plot
autoplot(plastics2, series='Plastics') +
  autolayer(seasadj(decplas2), series='Seasonally Adjusted') + 
  ggtitle("Plastics with Outlier - Seasonal Adjustments")

The outlier has a significant effect on the seasonally adjusted data - since our dataset is not that large even 1 outlier when it is this significant is creating a lot of variance. The variance is slightly lower when we are looking at the seasonally adjusted data.

f) Does it make any difference if the outlier is near the end rather than in the middle of the time series?
plastics2 <- plastics
plastics2[30] <- plastics2[30]+500
decplas2 <- plastics2 %>% decompose(type="multiplicative") 
#seasadj(decplas2)
plastics3 <- plastics
plastics3[53] <- plastics3[53]+500
decplas3 <- plastics3 %>% decompose(type="multiplicative") 

#Plot
autoplot(plastics2, series='Plastics Mid-Outlier') +
  autolayer(seasadj(decplas2), series='Seasonally Adjusted Mid-Outlier') + 
  autolayer(plastics3, series='Plastics End-Outlier') + 
  autolayer(seasadj(decplas3), series='Seasonally Adjusted End-Outlier') +
  ggtitle("Plastics with 2 Outlier - Seasonal Adjustments")

There doesn’t seem to be much of a difference in where the outlier is added, but it does cause a very significant variance in our seasonally adjusted data.

Exercise 6.3

Recall your retail time series data (from Exercise 3 in Section 2.10). Decompose the series using X11. Does it reveal any outliers, or unusual features that you had not noticed previously?

Reading in retail data

#setwd("/Users/elinaazrilyan/Documents/Data624/")
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349882C"],
  frequency=12, start=c(1982,4))

X11 Decomposition.

myts %>% seas(x11="") -> fit
autoplot(fit) +
  ggtitle("X11 decomposition of retail data")

The X11 decomposition reveals some outliers, particularly in the older data. There is a very clear upward trend which was visible from the original plot. X11 decomposition confirms that seasonal variance increases over time.

Let’s look at the seasonal plots and seasonal sub-series plots of the seasonal component.

fit %>% seasonal() %>% ggsubseriesplot() + ylab("Seasonal")

The plot confirms that there are some variations in the seasonal component over time but they are not that major.