Exercise 6.2

The data below represent the monthly sales (in thousands) of product A for a plastics manufacturer for years 1 through 5 (data set plastics).

if (!require("fma")) install.packages("fma")
library(fma)
plastics
##    Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
## 1  742  697  776  898 1030 1107 1165 1216 1208 1131  971  783
## 2  741  700  774  932 1099 1223 1290 1349 1341 1296 1066  901
## 3  896  793  885 1055 1204 1326 1303 1436 1473 1453 1170 1023
## 4  951  861  938 1109 1274 1422 1486 1555 1604 1600 1403 1209
## 5 1030 1032 1126 1285 1468 1637 1611 1608 1528 1420 1119 1013

Exercise 6.2.a

Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend?

Approach

The Time Plot is a line graph that plots each observed value against the time of the observation, with a single line connecting each observation across the entire period. This plot is useful for graphically interpreting time series data.

Results

plot(plastics, ylab="Sales (Thousands)", xlab="Month", main="Plastics Time Plot")

Interpretation

This Time Plot shows very clearly that the data have an overall upward trend and seasonal fluctuations. The seasonality results in decreased sales at the beginning of each year and increased sales toward the end of each year.

Exercise 6.2.b

Use a classical multiplicative decomposition to calculate the trend-cycle and seasonal indices.

Approach

The Decomposition Plot graphs the observed values, the underlying trend, seasonality, and randomness of the data. The method the decompose function that produces the plot uses for extracting components is Seasonal and Trend decomposition using Loess. This method is can handle any tape of seasonality and is robust to outliers.

Results

plot(decompose(plastics, type="multiplicative"))

Interpretation

Plotting the trend-cycle and seasonal indices (\(m\)) computed by the multiplicative decomposition decompose function shows that the data have an upward trend, seasonal fluctuations, and fairly random residuals.

Exercise 6.2.c

Do the results support the graphical interpretation from part (a)?

Approach

Comparison of results of 6.2.a and 6.2.b can be furthered by looking at a Seasonal Plot. A Seasonal Plot is a line graph that plots each observed value against the time of the observation, but with seasons in the x-axis and separate lines connecting the observations for a given year across seasons.

Results

seasonplot(plastics, ylab="Sales (Thousands)", xlab="Month", 
           year.labels.left=TRUE, main="Plastics Seasonal Plot", col=1:20, pch=19)

Interpretation

The results of the classical multiplicative decomposition output are in accordance with the graphical interpretation. There was an upward trend and seasonal fluctuation. This is further evidenced in the Seasonal Plot. In this time series, those components were relatively easy to see.

Exercise 6.2.d

Compute and plot the seasonally adjusted data.

Approach

Seasonally Adjusted data are derived by removing the seasonal component from the original data. This is achieved by first doing a Seasonal and Trend decomposition using Loess (STL) then applying the seasadj function. STL is a very versatile and robust method for decomposing a time series. The function stl decomposes a time series into seasonal, trend, and irregular components using loess. The trend window parameter (t.window) controls wiggliness of trend component and the seasonal window parameter (s.window) controls variation on seasonal component.

Results

plot(plastics, col="grey", ylab="Sales (Thousands)", xlab="Month", main="Seasonally Adjusted Plastics")
lines(seasadj(stl(plastics, s.window="periodic")), col="red", ylab="Seasonally Adjusted")

Interpretation

Seasonal adjusting the data is relatively easy once the time series is decomposed using STL. The stl function s.window parameter is set to periodic so that the span (in lags) of the loess window for seasonal extraction is calculated using defaults. The resulting plot shows the seasonally adjusted data represented by a red line. This seasonally adjusted data still contains the trend-cycle and remainder components which make it go upward in a jagged pattern.

Exercise 6.2.e

Change one observation to be an outlier (e.g., add 500 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?

Approach

The observation to be changed is chosen at random. The R function sample() takes a sample of the specified size from the elements of \(x\). Using this function, a randomly selecti element from plastics to increase by 500 is selected. A Time Plot comparison is then made of the original and modified seasonally adjusted data.

Results

plastics2 <- plastics
set.seed(2018)
index <- sample(1:length(plastics), 1)
obs <- plastics2[index]
data.frame(index, obs, outlier=obs+500)
##   index  obs outlier
## 1    21 1341    1841
plastics2[index] <- obs + 500
par(mfrow=c(1, 2))
plot(plastics, col="grey", ylab="Sales (Thousands)", xlab="Month", main="Seasonally Adjusted")
lines(seasadj(stl(plastics, s.window="periodic")), col="red", ylab="Seasonally Adjusted")
plot(plastics2, col="grey", ylab="Sales (Thousands)", xlab="Month", main="Random Outlier")
lines(seasadj(stl(plastics2, s.window="periodic")), col="red", ylab="Seasonally Adjusted")

Interpretation

First the seed is set to facilitate reproducible research. After duplicating the dataset, a random index number is selected and its corresponding value is increased by 500. This value is the outlier. Once the original seasonally adjusted data and modified seasonally adjusted data are plotted side by side, it can be seen that the outlier creates a spike in the Seasonally Adjusted data but does little to increase the volatility of the seasonally adjusted data.

Exercise 6.2.f

Does it make any difference if the outlier is near the end rather than in the middle of the time series?

Approach

The observation to be changed is final element. It will be increased by 500 and a Time Plot comparison will then be made of the original and modified seasonally adjusted data.

Results

plastics3 <- plastics
plastics2[length(plastics)] <- obs + 500
par(mfrow=c(1, 2))
plot(plastics, col="grey", ylab="Sales (Thousands)", xlab="Month", main="Seasonally Adjusted")
lines(seasadj(stl(plastics, s.window="periodic")), col="red", ylab="Seasonally Adjusted")
plot(plastics3, col="grey", ylab="Sales (Thousands)", xlab="Month", main="Trailing Outlier")
lines(seasadj(stl(plastics3, s.window="periodic")), col="red", ylab="Seasonally Adjusted")

Interpretation

After duplicating the dataset, the final element in the dataset increased by 500. This value is the outlier. Once the original seasonally adjusted data and modified seasonally adjusted data are plotted side by side, it can be seen that the outlier at the end of the time series has no effected on the Seasonally Adjusted data.

Exercise 6.2.g

Use a random walk with drift to produce forecasts of the seasonally adjusted data.

Approach

Seasonal adjusting the data is relatively easy once the time series is decomposed using STL. The stl function t.window parameter is set to NULL and s.window parameter set to periodic so that the span (in lags) of the loess window for trend/seasonal extraction is calculated using defaults. A random walk model with drift forecast, equivalent to an ARIMA(0,1,0) model with a drift coefficient, is then implemented using the rwf function with the drift parameter set to TRUE. The data are then plotted with a Time Plot.

Results

fit <- stl(plastics, t.window=NULL, s.window="periodic", robust=TRUE)
plot(rwf(seasadj(fit), drift=T), ylab="Sales (Thousands)", xlab="Month", 
     main="Naive Random Walk Forecast of Seasonally Adjusted Data")

Interpretation

The forecast reflects the trend in its drift, but ignores seasonality. Seasonality is potentially captured in the prediction intervals, but the accuracy of predictions will be too low to be of technically use. Overall, the forecast and its prediction intervals are of minimal use.

Exercise 6.2.h

Reseasonalize the results to give forecasts on the original scale.

Approach

Forecasts are “reseasonalized” by adding a seasonal component to the forecast. This is achieved by first doing a Seasonal and Trend decomposition using Loess (STL), then forecasting without seasonal adjustment. For STL, the t.window parameter is set to NULL and s.window parameter set to periodic so that the span (in lags) of the loess window for trend/seasonal extraction is calculated using defaults. A random walk model with drift forecast, equivalent to an ARIMA(0,1,0) model with a drift coefficient, is then implemented using the forecast function with the method parameter set to rwdrift. The data are then plotted with a Time Plot.

Results

fit <- stl(plastics, t.window=NULL, s.window="periodic", robust=TRUE)
fcast <- forecast(fit, method="rwdrift")
plot(fcast, ylab="Sales (Thousands)", xlab="Month", 
     main="Naive Random Walk Forecasts of Seasonally Adjusted Data")

Interpretation

The reseasonalized random walk with drift forecast produces a very useful forecast. The forecast reflect both trend and seasonality. Both the prediction and the production intervals continue in the trend and seasonal pattern of the original data. Using this forecast would yeild better results than a non-reseasonalized forecast.

References

https://www.otexts.org/fpp/

http://appliedpredictivemodeling.com/

https://www.rdocumentation.org/packages

https://robjhyndman.com/uwafiles/fpp-notes.pdf

https://cran.r-project.org/web/packages/forecast/forecast.pdf