7.1, 7.5,7.6, 7.7, 7.8 and 7.9
7.1 Consider the pigs series — the number of pigs slaughtered in Victoria each month.
library(fpp2)## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
## -- Attaching packages ---------------------------------------------- fpp2 2.4 --
## v ggplot2 3.3.3 v fma 2.4
## v forecast 8.13 v expsmooth 2.3
##
library(seasonal)
library(kableExtra)## Warning: package 'kableExtra' was built under R version 4.0.4
- Use the ses() function in R to find the optimal values of \(\alpha\) \(\ell_0\) and generate forecasts for the next four months.
alpha = 0.2971, sigma: 10308.58
forecast <- ses(pigs,h=4)
forecast## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Sep 1995 98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995 98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995 98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995 98816.41 83958.37 113674.4 76092.99 121539.8
print(forecast$model)## Simple exponential smoothing
##
## Call:
## ses(y = pigs, h = 4)
##
## Smoothing parameters:
## alpha = 0.2971
##
## Initial states:
## l = 77260.0561
##
## sigma: 10308.58
##
## AIC AICc BIC
## 4462.955 4463.086 4472.665
- Compute a 95% prediction interval for the first forecast using \(\hat{y} \pm 1.96s\) where \(s\) is the standard deviation of the residuals. Compare your interval with the interval produced by
s <- sd(forecast$residuals)
mean_fc <- forecast$mean[1]
int_fc <- c(mean_fc - (1.96*s),mean_fc + (1.96*s))
int_fc## [1] 78679.97 118952.84
7.5 Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.
- Plot the series and discuss the main features of the data.
autoplot(books) +
ggtitle('Daily sales of paper and hardcover books sale') b. Use the ses() function to forecast each series, and plot the forecasts.
pb_ses <- ses(books[, "Paperback"], h = 4)
hc_ses <- ses(books[, "Hardcover"], h = 4)
autoplot(books[, "Paperback"], series = "Paperback") +
autolayer(pb_ses, series = "Paperback", PI = FALSE) +
autolayer(books[, "Hardcover"], series = "Hardcover") +
autolayer(hc_ses, series = "Hardcover", PI=FALSE) +
ylab("Books Sale") +
ggtitle("Daily Paperback and hardcover books sales") c. Compute the RMSE values for the training data in each case.
sqrt(mean(pb_ses$residuals^2)) ## [1] 33.63769
sqrt(mean(hc_ses $residuals^2))## [1] 31.93101
7.6. will continue with the daily sales of paperback and hardcover books in data set books.
- Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.
holtpaperback <- holt(books[, "Paperback"], h = 4)
holthardcover <- holt(books[, "Hardcover"], h = 4)
autoplot(books[, "Paperback"])autoplot(books[,"Hardcover"])- Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.
holtpaperback <- holt(books[,1], h=4)
forecast(holtpaperback)## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 209.4668 166.6035 252.3301 143.9130 275.0205
## 32 210.7177 167.8544 253.5811 145.1640 276.2715
## 33 211.9687 169.1054 254.8320 146.4149 277.5225
## 34 213.2197 170.3564 256.0830 147.6659 278.7735
holthardcover <- holt(books[,2], h=4)
forecast(holthardcover)## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 250.1739 212.7390 287.6087 192.9222 307.4256
## 32 253.4765 216.0416 290.9113 196.2248 310.7282
## 33 256.7791 219.3442 294.2140 199.5274 314.0308
## 34 260.0817 222.6468 297.5166 202.8300 317.3334
round(accuracy(holtpaperback), 2)## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -3.72 31.14 26.18 -5.51 15.58 0.66 -0.18
round(accuracy(holthardcover), 2)## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -0.14 27.19 23.16 -2.11 12.16 0.69 -0.03
- Compare the forecasts for the two series using both methods. Which do you think is best?
Based on the RMSE values and the fact that underlying time series has trend component, I will say forecast generated from Holt’s linear method is best
d.Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.
paste("95% PI of the paperback sales were calculated by holt function")## [1] "95% PI of the paperback sales were calculated by holt function"
holtpaperback$upper[1, "95%"]## 95%
## 275.0205
holtpaperback$lower[1, "95%"]## 95%
## 143.913
paste("95% PI of the paperback sales were calculated by formula")## [1] "95% PI of the paperback sales were calculated by formula"
#holtpaperback$mean[1] + 1.96*s_paperback#holtpaperback$mean[1] - 1.96*s_paperbackpaste("95% PI of the hardcover sales were calculated by holt function")## [1] "95% PI of the hardcover sales were calculated by holt function"
holthardcover$upper[1, "95%"]## 95%
## 307.4256
holthardcover$lower[1, "95%"]## 95%
## 192.9222
paste("95% PI of the hardcover sales were calculated by formula")## [1] "95% PI of the hardcover sales were calculated by formula"
#holthardcover$mean[1] + 1.96* s_hardcover#holthardcover$mean[1] - 1.96*s_hardcover7.7 For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.
[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]
Which model gives the best RMSE?
From Comparing the three models they have revealed the RMSE scores were similar, however, the holt w/ the box-cox has a slight edge from the different methods. I believe the holt w/ boxcox will be the best model to use for this example.
head(eggs)## Time Series:
## Start = 1900
## End = 1905
## Frequency = 1
## [1] 276.79 315.42 314.87 321.25 314.54 317.92
h <- 100
holteggs <- holt(eggs, h=h)
boxcox_eggs <- holt(eggs, h=h, lambda=TRUE)
dampedeggs <- holt(eggs, h=h, damped=TRUE)
dampedbox_cox_eggs <- holt(eggs, h=h, damped=TRUE, lambda=TRUE)
exponentialeggs <- holt(eggs, h=h, exponential=TRUE)
autoplot(holteggs) + ggtitle("Holt's Method") autoplot(dampedeggs) + ggtitle("Damped")autoplot(boxcox_eggs) + ggtitle("Box-Cox")autoplot(dampedbox_cox_eggs) + ggtitle("Damped & Box-Cox")autoplot(exponentialeggs) + ggtitle("Exponential")g_rmse <- function(model){
accuracy(model)[2]
}
Model <- c("Holt's Linear",
"Box-Cox Transformed",
"Damped",
"Damped and Box-Cox",
"Exponential")
RMSE <- c(g_rmse(holteggs),
g_rmse(boxcox_eggs),
g_rmse(dampedeggs),
g_rmse(dampedbox_cox_eggs),
g_rmse(exponentialeggs))
eggs_rmse_df <- data.frame(Model, RMSE)
eggs_rmse_df %>%
kable() %>%
kable_styling()| Model | RMSE |
|---|---|
| Holt’s Linear | 26.58219 |
| Box-Cox Transformed | 26.55504 |
| Damped | 26.54019 |
| Damped and Box-Cox | 26.73445 |
| Exponential | 26.49795 |
7.8 Recall your retail time series data (from Exercise 3 in Section 2.10)
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349335T"],
frequency=12, start=c(1982,4))
autoplot(myts)- Why is multiplicative seasonality necessary for this series?
The multiplicative is necessary in this series because the seasonality variations are charging w/ increase in time. Multiplicative seasonality variations are not constant and extra method can handle constant seasonal variations only.
- Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
hw_multi <- hw(myts, seasonal = "multiplicative")
autoplot(hw_multi) +
ggtitle("Multiplicative") +
ylab("Retail Sales")summary(hw_multi)##
## Forecast method: Holt-Winters' multiplicative method
##
## Model Information:
## Holt-Winters' multiplicative method
##
## Call:
## hw(y = myts, seasonal = "multiplicative")
##
## Smoothing parameters:
## alpha = 0.3253
## beta = 0.0129
## gamma = 0.0255
##
## Initial states:
## l = 304.256
## b = 2.1149
## s = 1.0213 0.9379 1.0298 1.1195 1.015 1.0226
## 0.9666 1.0035 0.9715 0.9468 0.9983 0.9673
##
## sigma: 0.0275
##
## AIC AICc BIC
## 4769.995 4771.681 4837.022
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.9212824 25.20381 18.77683 0.06856226 1.979315 0.3016982
## ACF1
## Training set -0.1217931
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2014 2320.673 2238.928 2402.418 2195.655 2445.691
## Feb 2014 2146.714 2066.909 2226.519 2024.663 2268.765
## Mar 2014 2333.130 2241.768 2424.492 2193.404 2472.856
## Apr 2014 2232.738 2140.809 2324.666 2092.145 2373.330
## May 2014 2286.119 2187.318 2384.920 2135.016 2437.222
## Jun 2014 2198.216 2098.655 2297.777 2045.951 2350.482
## Jul 2014 2267.353 2159.893 2374.812 2103.008 2431.697
## Aug 2014 2319.664 2204.785 2434.543 2143.972 2495.356
## Sep 2014 2260.000 2143.202 2376.799 2081.373 2438.628
## Oct 2014 2382.065 2253.759 2510.371 2185.838 2578.292
## Nov 2014 2379.754 2246.319 2513.190 2175.682 2583.827
## Dec 2014 2657.668 2502.716 2812.620 2420.689 2894.647
## Jan 2015 2402.219 2256.142 2548.296 2178.813 2625.624
## Feb 2015 2221.927 2081.767 2362.086 2007.572 2436.282
## Mar 2015 2414.636 2256.781 2572.491 2173.218 2656.055
## Apr 2015 2310.511 2154.110 2466.911 2071.316 2549.705
## May 2015 2365.521 2199.864 2531.179 2112.170 2618.872
## Jun 2015 2274.345 2109.704 2438.986 2022.548 2526.142
## Jul 2015 2345.650 2170.260 2521.039 2077.415 2613.885
## Aug 2015 2399.538 2214.355 2584.721 2116.325 2682.751
## Sep 2015 2337.597 2151.532 2523.663 2053.035 2622.159
## Oct 2015 2463.620 2261.508 2665.732 2154.516 2772.724
## Nov 2015 2460.998 2253.045 2668.952 2142.960 2779.036
## Dec 2015 2748.143 2509.110 2987.175 2382.574 3113.712
hw_multi_damped <- hw(myts, seasonal = "multiplicative", damped = TRUE)
autoplot(hw_multi_damped) +
ggtitle("Multiplicative & Damped") +
ylab("Retail Sales")summary(hw_multi_damped)##
## Forecast method: Damped Holt-Winters' multiplicative method
##
## Model Information:
## Damped Holt-Winters' multiplicative method
##
## Call:
## hw(y = myts, seasonal = "multiplicative", damped = TRUE)
##
## Smoothing parameters:
## alpha = 0.3632
## beta = 0.0321
## gamma = 0.0141
## phi = 0.98
##
## Initial states:
## l = 304.2559
## b = 2.7033
## s = 1.0191 0.9312 1.0265 1.1253 1.018 1.0257
## 0.9723 0.997 0.9821 0.9424 0.9851 0.9751
##
## sigma: 0.0279
##
## AIC AICc BIC
## 4781.286 4783.175 4852.256
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 2.9059 25.10059 18.74334 0.2366077 1.993423 0.3011601 -0.1394101
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2014 2321.366 2238.340 2404.392 2194.389 2448.343
## Feb 2014 2135.288 2053.195 2217.381 2009.737 2260.838
## Mar 2014 2327.785 2231.563 2424.006 2180.627 2474.943
## Apr 2014 2236.621 2137.288 2335.954 2084.704 2388.538
## May 2014 2268.949 2160.829 2377.070 2103.594 2434.305
## Jun 2014 2182.623 2071.229 2294.018 2012.261 2352.986
## Jul 2014 2268.891 2145.125 2392.658 2079.606 2458.176
## Aug 2014 2307.530 2173.289 2441.771 2102.226 2512.834
## Sep 2014 2257.437 2117.698 2397.175 2043.724 2471.149
## Oct 2014 2380.566 2224.122 2537.011 2141.305 2619.827
## Nov 2014 2375.984 2210.584 2541.383 2123.027 2628.940
## Dec 2014 2649.063 2454.144 2843.981 2350.961 2947.165
## Jan 2015 2393.562 2207.407 2579.717 2108.863 2678.262
## Feb 2015 2200.181 2020.088 2380.275 1924.752 2475.611
## Mar 2015 2396.919 2190.814 2603.023 2081.709 2712.128
## Apr 2015 2301.540 2094.016 2509.065 1984.159 2618.922
## May 2015 2333.316 2113.073 2553.560 1996.483 2670.150
## Jun 2015 2243.144 2021.853 2464.435 1904.708 2581.580
## Jul 2015 2330.388 2090.478 2570.297 1963.477 2697.298
## Aug 2015 2368.669 2114.568 2622.770 1980.055 2757.283
## Sep 2015 2315.909 2057.380 2574.437 1920.524 2711.294
## Oct 2015 2440.850 2157.681 2724.019 2007.780 2873.919
## Nov 2015 2434.810 2141.619 2728.001 1986.413 2883.207
## Dec 2015 2713.191 2374.474 3051.908 2195.169 3231.214
- Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
Well the RMSE values are very similar. For the reason of, it takes the same amount of effort to get either, I would rather choose the Holt-Winters’ multiplicative method.
cat("RMSE of Multiplicative = ", accuracy(hw_multi)[2])## RMSE of Multiplicative = 25.20381
cat("RMSE of Multiplicative & Damped = ", accuracy(hw_multi_damped)[2])## RMSE of Multiplicative & Damped = 25.10059
- Check that the residuals from the best method look like white noise.
checkresiduals(hw_multi)##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' multiplicative method
## Q* = 250.64, df = 8, p-value < 2.2e-16
##
## Model df: 16. Total lags used: 24
- Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?
my_timeseries_train <- window(myts, end = c(2010, 12))
my_timeseries_test <- window(myts, start = 2011)my_timesseries_train_hw <- hw(my_timeseries_train, damped = TRUE, seasonal = "multiplicative")
accuracy(my_timesseries_train_hw, my_timeseries_test)## ME RMSE MAE MPE MAPE MASE
## Training set 4.089724 25.68057 18.92510 0.371369 2.057716 0.3068053
## Test set -27.787782 41.08034 32.12435 -1.279976 1.487404 0.5207858
## ACF1 Theil's U
## Training set -0.05009924 NA
## Test set 0.13519074 0.3334222
my_timeseries_train_sn <- snaive(my_timeseries_train)
accuracy(my_timeseries_train_sn,my_timeseries_test)## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 61.56787 72.20702 61.68438 6.388722 6.404105 1.000000 0.6018274
## Test set 97.44583 109.62545 100.02917 4.629852 4.751209 1.621629 0.2686595
## Theil's U
## Training set NA
## Test set 0.9036205
7.9. For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
boxc <- BoxCox.lambda(my_timeseries_train)
fc_boxc<-BoxCox(my_timeseries_train,boxc)
stl_boxc <- stlm(fc_boxc)
my_timeserires.boxc <- forecast(stl_boxc)
autoplot(my_timeserires.boxc)