DATA 624: Homework 3

2. The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years.

  1. Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle?
plastics <- ts(plastics, frequency = 12, start = c(2000, 1)) #Unknown years are fabricated here
autoplot(plastics)

There is clear seasonality with peaks in the late summer/ early fall, and trough in the winter. There is no cycle, but rather an upward trend over the time period.

ggseasonplot(plastics)

  1. Use a classical multiplicative decomposition to calculate the trend-cycle and seasonal indices.
plastics %>% decompose(type="multiplicative") %>%
  autoplot() + xlab("Year") +
  ggtitle("Classical multiplicative decomposition of plastic sales data")

  1. Do the results support the graphical interpretation from part a?

The apparent late downturn in the trend is not obvious from the raw time series data alone. The string of consecutive positive and negative values in the remainder suggest that the trend is not captured with precision.

  1. Compute and plot the seasonally adjusted data.

First we converting the time series (ts) to a class seas, that includes automatic procedures of X-13ARIMA-SEATS to perform a seasonal adjustment. Then, we apply the seasadj() function of the forecast package, which returns seasonally-adjusted data constructed by removing the seasonal component.

autoplot(plastics, series="Data") +
  autolayer(seasadj(seas(plastics)), series="Seasonally Adjusted") +
  ggtitle("Australian Retail Data") +
  scale_colour_manual(values=c("gray","blue"))

  1. Change one observation to be an outlier (e.g., add 500 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?
plastics_copy <- plastics 
plastics_copy[20] <- plastics_copy[20] + 500
autoplot(plastics_copy, series="Data") +
  autolayer(seasadj(seas(plastics_copy)), series="Seasonally Adjusted") +
  ggtitle("Australian Retail Data") +
  scale_colour_manual(values=c("gray","blue"))

In estimating the seasonal component for each season, the detrended values for that season are averaged. These seasonal component values are then adjusted to ensure that they add to zero. The shock (+500) is not factored into the seasonal component. In the X11 decompensation below it figures into the remainder component instead.

plastics_copy %>% seas(x11="") -> fit
autoplot(fit) +
  ggtitle("X11 decomposition of Australian Retail Data")

  1. Does it make any difference if the outlier is near the end rather than in the middle of the time series?
plastics_copy <- plastics 
plastics_copy[58] <- plastics_copy[58] + 500
autoplot(plastics_copy, series="Data") +
  autolayer(seasadj(seas(plastics_copy)), series="Seasonally Adjusted") +
  ggtitle("Australian Retail Data") +
  scale_colour_manual(values=c("gray","blue"))

plastics_copy %>% seas(x11="") -> fit
autoplot(fit) +
  ggtitle("X11 decomposition of Australian Retail Data")

When the shock occurs towards the end of the time series, it is still reflected in the remainder portion of the time series, rather than the season component or the trend, as seen from the X11 decomposition.

If it were a classical decomposition, the trend would not carry out to the last dates (due to moving average calculation), and remainders would not be available.

3. Recall your retail time series data (from Exercise 3 in Section 2.10).

retaildata <- readxl::read_excel('retail.xlsx', skip=1)
myts <- ts(retaildata[,"A3349824F"], frequency=12, start=c(1982,4))
lambda <- BoxCox.lambda(myts)
dframe <- cbind(RawSeries = myts, TransformedData = BoxCox(myts,lambda))
autoplot(dframe, facet=TRUE) +
  xlab("Years") + ylab("Sales") +
  ggtitle("Australian Retail Data (ID: A3349824F), \n power parameter lambda = ", round(lambda,5))

Decompose the series using X11. Does it reveal any outliers, or unusual features that you had not noticed previously?

myts %>% seas(x11="") -> fit
autoplot(fit) +
  ggtitle("X11 decomposition of Australian Retail Data")

autoplot(myts, series="Data") +
  autolayer(trendcycle(fit), series="Trend") +
  autolayer(seasadj(fit), series="Seasonally Adjusted") +
  xlab("Year") + ylab("New orders index") +
  ggtitle("Australian Retail Data") +
  scale_colour_manual(values=c("gray","blue","red"),
                      breaks=c("Data","Seasonally Adjusted","Trend"))

Previously, we observed that time series demonstrated greater volatility as the values increased. In the X11 Decomposition the seasonal component varies over time. Spikes in the remainders suggest outliers, and from 2000-2003, there are interruptions in the regular seasonal pattern.