CUNY DATA624 Homework 3
- Question 6.2 The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years.
- a. Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle?
- b. Use a classical multiplicative decomposition to calculate the trend-cycle and seasonal indices.
- c. Do the results support the graphical interpretation from part a?
- d. Compute and plot the seasonally adjusted data.
- e. Change one observation to be an outlier (e.g., add 500 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?
- f. Does it make any difference if the outlier is near the end rather than in the middle of the time series?
- Question 6.3 Recall your retail time series data (from Exercise 3 in Section 2.10). Decompose the series using X11. Does it reveal any outliers, or unusual features that you had not noticed previously?
Question 6.2 The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years.
a. Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle?
From the autoplot, we can immediately tell there’s both an upward trend and clear seasonality, with sales peaking from the middle to the end of the summer. No cyclic behavior appears to be present.
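The time plot itself did not survive in this write-up. Here is a minimal sketch of how it could be drawn. It assumes the `fpp2` package is installed and is what the hidden setup chunk loaded; the axis labels and title are my own choices, not the original ones.

```r
# Assumption: fpp2 is installed. It attaches forecast, ggplot2 and fma,
# and fma is where the plastics series lives.
library(fpp2)

# Time plot of the monthly sales of product A.
p <- autoplot(plastics) +
  xlab("Year") + ylab("Sales (thousands)") +
  ggtitle("Monthly sales of product A")
print(p)
```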
## Loading required package: gridExtra
The plots above confirm the clear seasonality, but there is also a noticeable drop-off during the fall months of year 5. This could be the start of a cycle, but we would need data over a longer time frame to know.
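The plots referred to here are not shown. Given the `gridExtra` message and the later mention of `ggsubseriesplot`, my guess is that they were a seasonal plot and a subseries plot placed side by side. A sketch under that assumption (the layout and labels are mine):

```r
# Assumptions: fpp2 and gridExtra are installed; the two-panel layout is a guess.
library(fpp2)       # ggseasonplot(), ggsubseriesplot(), plastics
library(gridExtra)  # grid.arrange()

# One line per year, so the within-year shape can be compared across years.
p_season <- ggseasonplot(plastics, year.labels = TRUE) +
  ylab("Sales (thousands)") + ggtitle("Seasonal plot")

# One mini time series per month, which makes a late drop in a given month easy to spot.
p_sub <- ggsubseriesplot(plastics) +
  ylab("Sales (thousands)") + ggtitle("Subseries plot")

grid.arrange(p_season, p_sub, ncol = 2)
```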
Lags 1 and 12 appear to confirm that the seasonality is clear and follows an obvious annual pattern, with sales low in the colder months and high in the summer months.
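The lag comparison can also be checked numerically. The sketch below assumes `fpp2` is installed and uses `gglagplot()` for the visual and base R's `acf()` for the numbers; I am not certain this is how the original figure was made.

```r
# Assumption: fpp2 is installed (provides plastics and gglagplot()).
library(fpp2)

# Scatterplots of the series against its own lags 1 to 12.
gglagplot(plastics, lags = 12)

# Sample autocorrelations; element 1 of the vector is lag 0.
r <- acf(plastics, lag.max = 12, plot = FALSE)$acf[, 1, 1]
r[c(2, 13)]  # autocorrelation at lags 1 and 12
```

A large positive value at lag 12 is what an annual seasonal pattern would produce, while a large value at lag 1 mostly reflects the trend and the smoothness of the series.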
b. Use a classical multiplicative decomposition to calculate the trend-cycle and seasonal indices.
library(fpp2)  # assumed setup, not shown in the original: provides plastics, autoplot(), seasadj()
plastics %>% decompose(type="multiplicative") %>%
  autoplot() + xlab("Year") +
  ggtitle("Classical multiplicative decomposition
of sales of plastic product")
c. Do the results support the graphical interpretation from part a?
The results of the classical multiplicative decomposition support the graphical interpretation in part a. Additionally, the “trend” panel begins to show the downward dip near the end of the series that was apparent in the ggsubseriesplot.
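To make the trend-cycle and seasonal indices from part b less of a black box, here is a sketch that rebuilds them by hand and compares them with `decompose()`. It assumes the `fma` package supplies `plastics`; the variable names are mine.

```r
# Assumption: fma is installed and provides the plastics series.
library(fma)

# 1. Trend-cycle: centred 2x12 moving average (half weight on the two end points).
w <- c(0.5, rep(1, 11), 0.5) / 12
trend <- stats::filter(plastics, filter = w, sides = 2)

# 2. Detrend by division, because the model is multiplicative.
detrended <- as.numeric(plastics / trend)

# 3. Seasonal index per month: average the detrended values for that month,
#    then rescale so the twelve indices average to 1.
idx <- tapply(detrended, as.integer(cycle(plastics)), mean, na.rm = TRUE)
idx <- idx / mean(idx)

# Cross-check against decompose().
fit <- decompose(plastics, type = "multiplicative")
all.equal(as.numeric(idx), as.numeric(fit$figure))
all.equal(as.numeric(trend), as.numeric(fit$trend))
```

Note that the moving average leaves the first and last six trend values as `NA`, so those months do not contribute to the seasonal indices.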
d. Compute and plot the seasonally adjusted data.
pl_sa <- plastics %>%
  decompose(type = "multiplicative") %>%
  seasadj()
autoplot(plastics, series="Data") +
  autolayer(pl_sa, series="Seasonally Adjusted") +
  xlab("Year") + ylab("Sales") +
  ggtitle("Sales of plastic product") +
  scale_colour_manual(values=c("gray","blue"),
                      breaks=c("Data","Seasonally Adjusted"))
The seasonally adjusted series shows the beginning of the downturn at the end of the series even more prominently.
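For a multiplicative decomposition, seasonal adjustment is just the data divided by the seasonal component. A quick check of that, assuming `fpp2` is installed (the name `manual_sa` is mine):

```r
# Assumption: fpp2 is installed (provides plastics and seasadj()).
library(fpp2)

fit <- decompose(plastics, type = "multiplicative")

# Seasonally adjusted data = observed / seasonal for a multiplicative model.
manual_sa <- plastics / fit$seasonal

# Should agree with what seasadj() returns.
all.equal(as.numeric(manual_sa), as.numeric(seasadj(fit)))
```

This is why the trend-cycle only matters indirectly: it affects the seasonally adjusted values through the seasonal indices it helps estimate.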
e. Change one observation to be an outlier (e.g., add 500 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?
pl_out <- plastics
pl_out[20] <- pl_out[20]+500
pl_out_sa <- pl_out %>%
  decompose(type = "multiplicative") %>%
  seasadj()
autoplot(plastics, series="Orig Data") +
  autolayer(pl_sa, series="Seasonally Adjusted") +
  autolayer(pl_out_sa, series="Seasonally Adjusted w/Outlier") +
  xlab("Year") + ylab("Sales") +
  ggtitle("Sales of plastic product") +
  scale_colour_manual(values=c("gray","blue","red"),
                      breaks=c("Orig Data",
                               "Seasonally Adjusted",
                               "Seasonally Adjusted w/Outlier"))
I added the outlier at month 20 (August of year 2). In the graph above, the outlier appears to pull the seasonally adjusted values upward very slightly in most months, but it also introduces a significant dip every August. The table below shows the difference between the seasonally adjusted series with the outlier and the original seasonally adjusted series. In most months the outlier produces a positive difference of about 8 to 14, while every August other than the outlier itself shows a negative difference of about 75 to 99. The outlier month itself differs by about 294.
##         Jan       Feb       Mar       Apr       May       Jun       Jul
## 1  9.081280  9.272209  9.534035  9.441534  9.500020  9.110139  9.793764
## 2  9.069042  9.312118  9.509463  9.799008 10.136429 10.064770 10.844597
## 3 10.966074 10.549300 10.873223 11.092225 11.104878 10.912416 10.953883
## 4 11.639215 11.453906 11.524388 11.659979 11.750510 11.702456 12.492303
## 5 12.606090 13.728723 13.834180 13.510435 13.539834 13.471814 13.543136
##          Aug        Sep        Oct        Nov        Dec
## 1 -74.808906   9.100944   8.679757   8.650694   8.230045
## 2 294.313571  10.102952   9.946034   9.497054   9.470333
## 3 -88.343412  11.097426  11.150916  10.423596  10.752664
## 4 -95.664349  12.084367  12.279054  12.499406  12.707694
## 5 -98.924935  11.511791  10.897661   9.969234  10.647555
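The code that produced this table is not shown. The signs suggest it is the seasonally adjusted series with the outlier minus the original seasonally adjusted series, so a sketch under that assumption would be:

```r
# Assumption: fpp2 is installed; the subtraction order is inferred from the signs in the table.
library(fpp2)

pl_sa <- seasadj(decompose(plastics, type = "multiplicative"))

# Add 500 to observation 20 and redo the seasonal adjustment.
pl_out <- plastics
pl_out[20] <- pl_out[20] + 500
pl_out_sa <- seasadj(decompose(pl_out, type = "multiplicative"))

# Difference caused by the outlier, month by month.
diff_mid <- pl_out_sa - pl_sa
diff_mid

cycle(plastics)[20]  # position of observation 20 within the year
```

The pattern makes sense: the outlier inflates the August seasonal index, so every other August is divided by a number that is too large, and because the indices are rescaled to average 1, the remaining months get slightly smaller indices and slightly larger adjusted values.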
f. Does it make any difference if the outlier is near the end rather than in the middle of the time series?
My apologies to the color blind, but I can’t think of a better contrasting color!
pl_out_end <- plastics
pl_out_end[56] <- pl_out_end[56]+500
pl_out_end_sa <- pl_out_end %>%
  decompose(type = "multiplicative") %>%
  seasadj()
autoplot(plastics, series="Orig Data") +
  autolayer(pl_sa, series="Seasonally Adjusted") +
  autolayer(pl_out_sa, series="Seasonally Adjusted w/Outlier") +
  autolayer(pl_out_end_sa, series="Seasonally Adjusted w/Outlier@End") +
  xlab("Year") + ylab("Sales") +
  ggtitle("Sales of plastic product") +
  scale_colour_manual(values=c("gray","blue","red","green"),
                      breaks=c("Orig Data",
                               "Seasonally Adjusted",
                               "Seasonally Adjusted w/Outlier",
                               "Seasonally Adjusted w/Outlier@End"))
For this case, I added the outlier to the last August of the series (month 56). Adding it at the end does appear to make a difference, but it is hard to see in the increasingly crowded plot, which perhaps implies that the differences are marginal. The table of differences below gives a closer look: apart from the outlier month itself, which differs by about 403, the differences are small, roughly -4 to +8, including in the other Augusts. In general, it looks like an outlier near the end of the series changes the seasonally adjusted values of the other months only minimally.
##         Jan       Feb       Mar       Apr       May       Jun       Jul
## 1 -2.660025  1.121901  4.944113  4.792428  4.968545  5.099367 -2.753049
## 2 -2.656440  1.126730  4.931371  4.973878  5.301389  5.633718 -3.048440
## 3 -3.212106  1.276424  5.638583  5.630302  5.807891  6.108185 -3.079161
## 4 -3.409277  1.385878  5.976261  5.918488  6.145559  6.550407 -3.511614
## 5 -3.692487  1.661122  7.174061  6.857761  7.081382  7.540799 -3.807006
##          Aug        Sep        Oct        Nov        Dec
## 1  -2.728955  -2.697641  -2.616240  -2.691827  -2.584502
## 2  -3.027435  -2.994650  -2.997920  -2.955188  -2.973992
## 3  -3.222681  -3.289425  -3.361094  -3.243499  -3.376686
## 4  -3.489741  -3.581967  -3.701136  -3.889427  -3.990629
## 5 403.334155  -3.412248  -3.284758  -3.102116  -3.343678
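One likely reason the end-of-series outlier behaves so differently is that the classical decomposition has no trend estimate for the last six months, so observation 56 never enters the August seasonal index. A sketch that checks this, assuming `fpp2` is installed and that the table is the outlier series minus the original (names such as `fit_end` are mine):

```r
# Assumption: fpp2 is installed; the subtraction order is inferred from the signs in the table.
library(fpp2)

pl_sa <- seasadj(decompose(plastics, type = "multiplicative"))

# Add 500 to observation 56 (the last August) and decompose again.
pl_out_end <- plastics
pl_out_end[56] <- pl_out_end[56] + 500
fit_end <- decompose(pl_out_end, type = "multiplicative")

# Difference caused by the outlier, month by month.
diff_end <- seasadj(fit_end) - pl_sa
diff_end

# The centred 2x12 moving average is undefined for the first and last six months.
na_months <- which(is.na(as.numeric(fit_end$trend)))
na_months
```

Because the outlier is excluded from the August index, the index barely moves, and almost the whole outlier passes straight through into the seasonally adjusted value for that month. The small changes elsewhere come from the trend estimates for the few earlier months whose moving-average window still reaches observation 56.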
Question 6.3 Recall your retail time series data (from Exercise 3 in Section 2.10). Decompose the series using X11. Does it reveal any outliers, or unusual features that you had not noticed previously?
library(seasonal)  # assumed setup, not shown in the original: provides seas()
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349398A"],
  frequency=12, start=c(1982,4))
myts %>% seas(x11="") -> myts_sa
autoplot(myts_sa) +
  ggtitle("X11 decomposition of Turnover (food retail)")
The X11 decomposition doesn’t appear to reveal anything new. In fact, it shows how regular and stable the trend and seasonality are, as the remainder component contains only very small values. The seasonally adjusted series from the X11 decomposition also shows virtually nothing irregular in the data. Perhaps that is my fault for selecting such boring data.
autoplot(myts, series="Data") +
  autolayer(seasadj(myts_sa), series="Seasonally Adjusted") +
  autolayer(trendcycle(myts_sa), series="Trend") +
  xlab("Year") + ylab("Sales") +
  ggtitle("Turnover (food retail)") +
  scale_colour_manual(values=c("gray","blue","red"),
                      breaks=c("Data","Seasonally Adjusted","Trend"))
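Eyeballing the remainder panel is easy to get wrong, so one option is to screen the irregular component numerically. The sketch below is only an illustration: it uses the built-in `AirPassengers` series as a stand-in because `retail.xlsx` is not available here, and the 3-MAD cutoff is a hypothetical rule of thumb of mine, not part of X11.

```r
# Assumptions: seasonal (with its X-13ARIMA-SEATS binaries) and forecast are installed.
library(seasonal)  # seas()
library(forecast)  # remainder()

# Stand-in series (assumption): AirPassengers replaces the retail series.
fit <- seas(AirPassengers, x11 = "")

# Irregular (remainder) component of the X11 decomposition.
irr <- remainder(fit)

# Hypothetical screening rule: flag months more than 3 robust SDs from the median.
z <- (as.numeric(irr) - median(irr)) / mad(irr)
flagged <- time(irr)[abs(z) > 3]
flagged
```

Swapping `AirPassengers` for `myts` would apply the same check to the retail series; an empty result would back up the impression that nothing unusual is going on.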