library(knitr)
library(kableExtra)
library(ggplot2)
library(fpp2)
library(seasonal)
library(readxl)

Question 6.2

The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years.

  1. Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle?
summary(plastics)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   697.0   947.8  1148.0  1162.4  1362.5  1637.0
plastics
##    Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
## 1  742  697  776  898 1030 1107 1165 1216 1208 1131  971  783
## 2  741  700  774  932 1099 1223 1290 1349 1341 1296 1066  901
## 3  896  793  885 1055 1204 1326 1303 1436 1473 1453 1170 1023
## 4  951  861  938 1109 1274 1422 1486 1555 1604 1600 1403 1209
## 5 1030 1032 1126 1285 1468 1637 1611 1608 1528 1420 1119 1013
autoplot(plastics) + xlab("Month") + ylab("Sales (in thousands)") +
  ggtitle("Monthly sales (in thousands) of product A for a plastics manufacturer for five years")

ggseasonplot(plastics)

ggsubseriesplot(plastics)

gglagplot(plastics)

Acf(plastics)

The trends show that there is seasonality in the sales for plastic A. Looks like the sales increases from spring through summer. During fall and winter there is a decline in sales. Overall, there is a general increased positive trend year over year.

plastics %>%
  stl(t.window=12, s.window="periodic", robust=TRUE) %>%
  autoplot()

b.Use a classical multiplicative decomposition to calculate the trend-cycle and seasonal indices.

multiplicative_decomp <- decompose(plastics, type = "multiplicative")
multiplicative_decomp
## $x
##    Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
## 1  742  697  776  898 1030 1107 1165 1216 1208 1131  971  783
## 2  741  700  774  932 1099 1223 1290 1349 1341 1296 1066  901
## 3  896  793  885 1055 1204 1326 1303 1436 1473 1453 1170 1023
## 4  951  861  938 1109 1274 1422 1486 1555 1604 1600 1403 1209
## 5 1030 1032 1126 1285 1468 1637 1611 1608 1528 1420 1119 1013
## 
## $seasonal
##         Jan       Feb       Mar       Apr       May       Jun       Jul
## 1 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
## 2 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
## 3 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
## 4 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
## 5 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
##         Aug       Sep       Oct       Nov       Dec
## 1 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834
## 2 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834
## 3 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834
## 4 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834
## 5 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834
## 
## $trend
##         Jan       Feb       Mar       Apr       May       Jun       Jul
## 1        NA        NA        NA        NA        NA        NA  976.9583
## 2 1000.4583 1011.2083 1022.2917 1034.7083 1045.5417 1054.4167 1065.7917
## 3 1117.3750 1121.5417 1130.6667 1142.7083 1153.5833 1163.0000 1170.3750
## 4 1208.7083 1221.2917 1231.7083 1243.2917 1259.1250 1276.5833 1287.6250
## 5 1374.7917 1382.2083 1381.2500 1370.5833 1351.2500 1331.2500        NA
##         Aug       Sep       Oct       Nov       Dec
## 1  977.0417  977.0833  978.4167  982.7083  990.4167
## 2 1076.1250 1084.6250 1094.3750 1103.8750 1112.5417
## 3 1175.5000 1180.5417 1185.0000 1190.1667 1197.0833
## 4 1298.0417 1313.0000 1328.1667 1343.5833 1360.6250
## 5        NA        NA        NA        NA        NA
## 
## $random
##         Jan       Feb       Mar       Apr       May       Jun       Jul
## 1        NA        NA        NA        NA        NA        NA 1.0247887
## 2 0.9656005 0.9745267 0.9750081 0.9894824 1.0061175 1.0024895 1.0401641
## 3 1.0454117 0.9953920 1.0079773 1.0142083 0.9990100 0.9854384 0.9567618
## 4 1.0257400 0.9924762 0.9807020 0.9798704 0.9684851 0.9627557 0.9917766
## 5 0.9767392 1.0510964 1.0498039 1.0299302 1.0398787 1.0628077        NA
##         Aug       Sep       Oct       Nov       Dec
## 1 1.0157335 1.0040354 0.9724119 0.9961368 0.9489762
## 2 1.0230774 1.0040674 0.9962088 0.9735577 0.9721203
## 3 0.9969907 1.0132932 1.0314752 0.9910657 1.0258002
## 4 0.9776897 0.9920952 1.0133954 1.0527311 1.0665946
## 5        NA        NA        NA        NA        NA
## 
## $figure
##  [1] 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
##  [8] 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834
## 
## $type
## [1] "multiplicative"
## 
## attr(,"class")
## [1] "decomposed.ts"
autoplot(multiplicative_decomp) + xlab("Month") +
  ggtitle("Monthly sales (in thousands) of product A for a plastics manufacturer for five years")

c.Do the results support the graphical interpretation from part A

Yes, the graphical interpretation supports the findings in part A

d.Compute and plot the seasonally adjusted data.

autoplot(plastics, series="Data") +
  autolayer(trendcycle(multiplicative_decomp), series="Trend") +
  autolayer(seasadj(multiplicative_decomp), series="Seasonally Adjusted") +
  xlab("Year") + ylab("Monthly Sales") +
  ggtitle("Plastic Sales") +
  scale_colour_manual(values=c("green","blue","red"), breaks=c("Data","Seasonally Adjusted","Trend"))
## Warning: Removed 12 row(s) containing missing values (geom_path).

The trend line is upward trending

e.Change one observation to be an outlier (e.g., add 500 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?

change.plastics.2 <- plastics
change.plastics.2[2] <- plastics[2]+500
change.plastics.2
##    Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
## 1  742 1197  776  898 1030 1107 1165 1216 1208 1131  971  783
## 2  741  700  774  932 1099 1223 1290 1349 1341 1296 1066  901
## 3  896  793  885 1055 1204 1326 1303 1436 1473 1453 1170 1023
## 4  951  861  938 1109 1274 1422 1486 1555 1604 1600 1403 1209
## 5 1030 1032 1126 1285 1468 1637 1611 1608 1528 1420 1119 1013
change_multiplicative_decomp <- decompose(change.plastics.2, type = "multiplicative")
autoplot(change.plastics.2, series="Data") +
  autolayer(trendcycle(change_multiplicative_decomp), series="Trend") +
  autolayer(seasadj(change_multiplicative_decomp), series="Seasonally Adjusted") +
  xlab("Year") + ylab("Monthly Sales") +
  ggtitle("Plastic Sales") +
  scale_colour_manual(values=c("green","blue","red"), breaks=c("Data","Seasonally Adjusted","Trend"))
## Warning: Removed 12 row(s) containing missing values (geom_path).

This changes the Seasonally Adjusted.

f.Does it make any difference if the outlier is near the end rather than in the middle of the time series?

change.plastics.middle <- plastics
change.plastics.middle[25] <- plastics[25]+500

change.plastics.end <- plastics
change.plastics.end[59] <- plastics[59]+500

change_multiplicative_decomp_middle <- decompose(change.plastics.middle, type = "multiplicative")
autoplot(change.plastics.middle, series="Data") +
  autolayer(trendcycle(change_multiplicative_decomp_middle), series="Trend") +
  autolayer(seasadj(change_multiplicative_decomp_middle), series="Seasonally Adjusted") +
  xlab("Year") + ylab("Monthly Sales") +
  ggtitle("Plastic Sales - Outlier in the Middle") +
  scale_colour_manual(values=c("green","blue","red"), breaks=c("Data","Seasonally Adjusted","Trend"))
## Warning: Removed 12 row(s) containing missing values (geom_path).

change_multiplicative_decomp_end <- decompose(change.plastics.end, type = "multiplicative")
autoplot(change.plastics.end, series="Data") +
  autolayer(trendcycle(change_multiplicative_decomp_end), series="Trend") +
  autolayer(seasadj(change_multiplicative_decomp_end), series="Seasonally Adjusted") +
  xlab("Year") + ylab("Monthly Sales") +
  ggtitle("Plastic Sales - outlier at the end") +
  scale_colour_manual(values=c("green","blue","red"), breaks=c("Data","Seasonally Adjusted","Trend"))
## Warning: Removed 12 row(s) containing missing values (geom_path).

The seasonality is adjusted and the trend had a effect on the outliers at the middle and end.

Question 6.3

Recall your retail time series data (from Exercise 3 in Section 2.10). Decompose the series using X11. Does it reveal any outliers, or unusual features that you had not noticed previously?

retail_data <- read_excel("data/retail.xlsx", skip = 1)
retail <- ts(retail_data[, "A3349397X"], frequency = 12, start = c(1982, 4))
x11_retail <- seas(retail, x11 = "")
autoplot(x11_retail) + ggtitle("X11 Decomposition of Retail Sales")

The remainder shows outliers around 2000-2001 which may not have been previously noticed.