library(fpp2)
## Warning: package 'fpp2' was built under R version 3.5.3
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 3.5.3
## Loading required package: forecast
## Warning: package 'forecast' was built under R version 3.5.3
## Loading required package: fma
## Warning: package 'fma' was built under R version 3.5.3
## Loading required package: expsmooth
## Warning: package 'expsmooth' was built under R version 3.5.3
library(seasonal)
The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years.
plastics
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1 742 697 776 898 1030 1107 1165 1216 1208 1131 971 783
## 2 741 700 774 932 1099 1223 1290 1349 1341 1296 1066 901
## 3 896 793 885 1055 1204 1326 1303 1436 1473 1453 1170 1023
## 4 951 861 938 1109 1274 1422 1486 1555 1604 1600 1403 1209
## 5 1030 1032 1126 1285 1468 1637 1611 1608 1528 1420 1119 1013
A. Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle?
ggseasonplot(plastics, polar=TRUE) +
ylab("$ Thousands") +
ggtitle("Polar seasonal plot: Plastic consumption")
autoplot(plastics) + ggtitle("Annual plastic consumption ") + xlab("Year") + ylab("Plastic Sale of Product A")
#
# # With Moving average of 3 data point
# autoplot(plastics, series="Data") +
# autolayer(ma(plastics,3), series="3-MA") +
# xlab("Year") + ylab("GWh") +
# ggtitle("Annual electricity sales: South Australia") +
# scale_colour_manual(values=c("Data"="grey50","3-MA"="red"),
# breaks=c("Data","3-MA"))
#
# # Classical Decomposition : Deafult is additive
# dec_plastic <- decompose(plastics)
# autoplot(dec_plastic)
# plastics %>% stl(s.window=6) %>% autoplot() +xlab("Year") + ylab("GWh") +
# ggtitle("Annual electricity sales: South Australia")
#
# # X11 Decomposition
# plastics %>% seas() %>% autoplot() + xlab("Year") + ylab("GWh") +
# ggtitle("X11: Annual plastic consumptio")
#
#
# elecequip
#
# as.ts(plastics,start=c(1959),frequency = 12)
Above plot of data very clearly shows that there is some pattern each year. This is the seasonality present in the data each year, which shows downward trend in the beginning of the year and peak in the mid of the year with decline again at the end of the year. Polar seasonal plot also suggest decline at its peak in the month of Jan and Feb and then picking up from there.
B. Use a classical multiplicative decomposition to calculate the trend-cycle and seasonal indices.
# Classical Decomposition : Deafult is additive
dec_plastic_mul <- decompose(plastics,type = "multiplicative")
autoplot(dec_plastic_mul)
# autoplot(stl(plastics,s.window = "periodic"))
print("Trend Cycle:")
## [1] "Trend Cycle:"
trendcycle(dec_plastic_mul)
## Jan Feb Mar Apr May Jun Jul
## 1 NA NA NA NA NA NA 976.9583
## 2 1000.4583 1011.2083 1022.2917 1034.7083 1045.5417 1054.4167 1065.7917
## 3 1117.3750 1121.5417 1130.6667 1142.7083 1153.5833 1163.0000 1170.3750
## 4 1208.7083 1221.2917 1231.7083 1243.2917 1259.1250 1276.5833 1287.6250
## 5 1374.7917 1382.2083 1381.2500 1370.5833 1351.2500 1331.2500 NA
## Aug Sep Oct Nov Dec
## 1 977.0417 977.0833 978.4167 982.7083 990.4167
## 2 1076.1250 1084.6250 1094.3750 1103.8750 1112.5417
## 3 1175.5000 1180.5417 1185.0000 1190.1667 1197.0833
## 4 1298.0417 1313.0000 1328.1667 1343.5833 1360.6250
## 5 NA NA NA NA NA
print("seasonal indices")
## [1] "seasonal indices"
print(dec_plastic_mul$figure)
## [1] 0.7670466 0.7103357 0.7765294 0.9103112 1.0447386 1.1570026 1.1636317
## [8] 1.2252952 1.2313635 1.1887444 0.9919176 0.8330834
Classical decomposition methods assume that the seasonal component repeats from year to year. For many series, this is a reasonable assumption, but for some longer series it is not. Here third plot shows the trend with m= 12, as it shows yearly increasing trend.As data suggest around the end we see some high ressidual suddden drop in demand, which our trend-cycle estimate is not able to capture.
C.Do the results support the graphical interpretation from part a?
Yes, yearly trend suggested from A. is very clearly seen in above plots. We can also see that some decline in number at the end of 5th year in last 6 months.
D. Compute and plot the seasonally adjusted data.
# Decomposing time series data with multiplicative seasonal component
dec_plastic_mul <- decompose(plastics,type = "multiplicative")
autoplot(plastics, series="Data") +
autolayer(trendcycle(dec_plastic_mul), series="Trend") +
autolayer(seasadj(dec_plastic_mul), series="Seasonally Adjusted") +
xlab("Year") + ylab("Plastic Sale of Product A") +
ggtitle("Plastic Sale of Product A") +
scale_colour_manual(values=c("orange","blue","red"),
breaks=c("Data","Seasonally Adjusted","Trend"))
## Warning: Removed 12 rows containing missing values (geom_path).
Blue line in the above plot shows Seasonally adjusted data. As we can see that there are not much seasonality fluctuation in the data any more after we have decomposed the data , only linear trend can be seen.
E. Change one observation to be an outlier (e.g., add 500 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?
# Function to use for checking Outlier effect.
plot_outlier_trend <- function(c_index) {
c_plastics <- plastics
print(paste("Ref Value:" ,plastics[c_index], "at ", c_index))
c_plastics[c_index] <- c_plastics[c_index] + 500
# Decomposing time series data with multiplicative seasonal component
c_dec_plastic_mul <- decompose(c_plastics,type = "multiplicative")
autoplot(plastics, series="Data") +
autolayer(c_plastics, series="Data Outlier") +
autolayer(trendcycle(dec_plastic_mul), series="Trend") +
autolayer(trendcycle(c_dec_plastic_mul), series="Trend with Outlier") +
xlab("Year") + ylab("Plastic Sale of Product A") +
ggtitle("Plastic Sale of Product A") +
scale_colour_manual(values=c("orange","gray","red","blue"),
breaks=c("Data","Data Outlier","Trend","Trend with Outlier"))
}
plot_outlier_seasadj <- function(c_index) {
c_plastics <- plastics
print(paste("Ref Value:" ,plastics[c_index], "at ", c_index))
c_plastics[c_index] <- c_plastics[c_index] + 500
# Decomposing time series data with multiplicative seasonal component
c_dec_plastic_mul <- decompose(c_plastics,type = "multiplicative")
autoplot(plastics, series="Data") +
autolayer(seasadj(c_dec_plastic_mul), series="Seasonally Adjusted with Outlier") +
autolayer(seasadj(dec_plastic_mul), series="Seasonally Adjusted") +
xlab("Year") + ylab("Plastic Sale of Product A") +
ggtitle("Plastic Sale of Product A") +
scale_colour_manual(values=c("orange","blue","red"),
breaks=c("Data","Seasonally Adjusted","Seasonally Adjusted with Outlier"))
}
# Choose some index and update the value with +500
set.seed(421)
c_index <- sample(1:59, 1)
plot_outlier_seasadj(c_index)
## [1] "Ref Value: 1403 at 47"
plot_outlier_trend(c_index)
## [1] "Ref Value: 1403 at 47"
## Warning: Removed 12 rows containing missing values (geom_path).
## Warning: Removed 12 rows containing missing values (geom_path).
Addition of outliers shows some sharp variation in the data at the same season in other year, but trend line seem to have no effect except in the year outliers are noted. The spike in the seasonally adjusted data is due to th big ressidual at the ponit of outlier.
F. Does it make any difference if the outlier is near the end rather than in the middle of the time series?
# Lets set the outlier in the begning and end and see how it impacts
plot_outlier_seasadj(1)
## [1] "Ref Value: 742 at 1"
plot_outlier_trend(1)
## [1] "Ref Value: 742 at 1"
## Warning: Removed 12 rows containing missing values (geom_path).
## Warning: Removed 12 rows containing missing values (geom_path).
plot_outlier_seasadj(60)
## [1] "Ref Value: 1013 at 60"
plot_outlier_trend(60)
## [1] "Ref Value: 1013 at 60"
## Warning: Removed 12 rows containing missing values (geom_path).
## Warning: Removed 12 rows containing missing values (geom_path).
Outliers at the beginning or at the end have very little effect in the same season’s data but its impact is negligible in other season. Possible reason could be not knowing the Trend Cycle for the first 6 and last 6 months of the data in the multiplicative seasonal decomposition.