DATA 624 - Homework #3

2. The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years.

a. Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle?

autoplot(plastics) +
  ggtitle('Plastics Product A Monthly Sales') +
  ylab('Sales in Thousands') +
  xlab('Year')

ggseasonplot(plastics) +
  ggtitle('Seasonal plot: Plastics Product A Sales')

ggsubseriesplot(plastics) + 
  ggtitle('Seasonal subseries plot: Plastics Product A Sales')

There is definitely a seasonal pattern here. The sales typically are the lowest in February, and then the sales start to increase until it reach a maximum around the months of June through October. And the sales then decline after that. There is also a noticeable increasing trend through the years.

b. Use a classical multiplicative decomposition to calculate the trend-cycle and seasonal indices.

plastics.decomp <- decompose(plastics, type='multiplicative')
plastics.decomp$trend

##         Jan       Feb       Mar       Apr       May       Jun       Jul
## 1        NA        NA        NA        NA        NA        NA  976.9583
## 2 1000.4583 1011.2083 1022.2917 1034.7083 1045.5417 1054.4167 1065.7917
## 3 1117.3750 1121.5417 1130.6667 1142.7083 1153.5833 1163.0000 1170.3750
## 4 1208.7083 1221.2917 1231.7083 1243.2917 1259.1250 1276.5833 1287.6250
## 5 1374.7917 1382.2083 1381.2500 1370.5833 1351.2500 1331.2500        NA
##         Aug       Sep       Oct       Nov       Dec
## 1  977.0417  977.0833  978.4167  982.7083  990.4167
## 2 1076.1250 1084.6250 1094.3750 1103.8750 1112.5417
## 3 1175.5000 1180.5417 1185.0000 1190.1667 1197.0833
## 4 1298.0417 1313.0000 1328.1667 1343.5833 1360.6250
## 5        NA        NA        NA        NA        NA

plastics.decomp$seasonal

##         Jan       Feb       Mar       Apr       May       Jun       Jul
## 1 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
## 2 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
## 3 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
## 4 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
## 5 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
##         Aug       Sep       Oct       Nov       Dec
## 1 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834
## 2 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834
## 3 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834
## 4 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834
## 5 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834

autoplot(plastics.decomp)

c. Do the results support the graphical interpretation from part a?

Yes. There is an apparent seasonal pattern and a increasing trend.

There is a slight trend drop off in the 1st half of year 5, and the remainder plot suggests that the trend-cycle estimate underestimated after year 5.

d. Compute and plot the seasonally adjusted data.

plastics.sadj <- plastics / plastics.decomp$seasonal
autoplot(plastics.sadj, series='Seasonally Adjusted ') +
  autolayer(plastics, series='Data') + 
  ggtitle('Plastics Product A Monthly Sales') +
  ylab('Sales in Thousands') +
  xlab('Year')

e. Change one observation to be an outlier (e.g., add 500 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?

Here, I changed the sales data in the 3rd year month of June from 1326 to 500.

plastics.out <- plastics
plastics.out[30] <- 500
plastics.out.decomp <- decompose(plastics.out, type='multiplicative')
plastics.out.sadj <- plastics.out / plastics.out.decomp$seasonal

autoplot(plastics.out.sadj, series='Seasonally Adjusted ') +
  autolayer(plastics.out, series='Data') + 
  ggtitle('Plastics Product A Monthly Sales with an Outlier') +
  ylab('Sales in Thousands') +
  xlab('Year')

autoplot(plastics.out.decomp)

The outlier introduces a dip in the seasonal pattern in the month of June, and a region of depression in the trend pattern 6 months before and after the month of June in year 3. The remainder plot also registers this large outlier.

f. Does it make any difference if the outlier is near the end rather than in the middle of the time series?

Below, I introduced the outlier at the end of the 3rd year, instead of in the middle.

plastics.out <- plastics
plastics.out[36] <- 500
plastics.out.decomp <- decompose(plastics.out, type='multiplicative')
plastics.out.sadj <- plastics.out / plastics.out.decomp$seasonal

autoplot(plastics.out.sadj, series='Seasonally Adjusted ') +
  autolayer(plastics.out, series='Data') + 
  ggtitle('Plastics Product A Monthly Sales with an Outlier') +
  ylab('Sales in Thousands') +
  xlab('Year')

autoplot(plastics.out.decomp)

It changes the location of the dip in the seaonal pattern, and the depression region in the trend. Overall, the effects still exist, and just change in magnitude.

3. Recall your retail time series data (from Exercise 3 in Section 2.10). Decompose the series using X11. Does it reveal any outliers, or unusual features that you had not noticed previously?

retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349873A"],
  frequency=12, start=c(1982,4))
myts.decomp <- seas(myts, x11='')

autoplot(myts.decomp) +
  ggtitle('Decomposition of Retrail Time Series A3349873A')

autoplot(myts, series='Data') +
  autolayer(trendcycle(myts.decomp), series='Trend') +
  autolayer(seasadj(myts.decomp), series='Seasonally Adjusted')

Yes, it does reveal outliers. For example, in the first month of 2000, there is a drop in the trend pattern, which can be spot in the remainder plot. This was not noticed previously. Also, from the seasonal pattern plot, it seems that the seasonal variation slowly decrease over time, which was not noticed previously. The X11 was able to capture this decrease variation, where the classical decomposition will not since it assumes constant seasonal component.

DATA 624 - Homework #3

Jun Yan

February 19, 2019