The data set used for this analysis is the ausbeer data set, which records the total quarterly beer production in Australia (in megalitres). The data set contains 211 observations and 2 variables. The structure and summary output of the data set follow; a sketch of the corresponding R calls appears after the output. The two variables are QD and nd0. QD is a date variable giving the year of each observation, running from 1956 to 2008. nd0 is an integer recording the beer production in Australia for each observation. Each observation represents one quarter of a year. The minimum, mean, and maximum of nd0 are 213.0, 415.0, and 599.0, respectively. There appear to be no missing values.
```
'data.frame': 211 obs. of 2 variables:
 $ QD : int 1956 1956 1956 1956 1957 1957 1957 1957 1958 1958 ...
 $ nd0: int 284 213 227 308 262 228 236 320 272 233 ...

       QD            nd0
 Min.   :1956   Min.   :213.0
 1st Qu.:1969   1st Qu.:378.5
 Median :1982   Median :423.0
 Mean   :1982   Mean   :415.0
 3rd Qu.:1995   3rd Qu.:465.5
 Max.   :2008   Max.   :599.0
```
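A minimal sketch of how this output could be reproduced follows. The data frame name (beer_df) and the CSV file name are placeholders assumed for illustration; the original analysis may have loaded the data differently.

```r
# Load the data set; the file name and data frame name are assumptions.
beer_df <- read.csv("ausbeer.csv")   # columns: QD (year) and nd0 (megalitres)

str(beer_df)          # structure: 211 obs. of 2 variables
summary(beer_df)      # five-number summaries plus means for QD and nd0
sum(is.na(beer_df))   # 0 indicates there are no missing values
```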
Next, the data was split into training and test data sets. There were four training data sets, each of a different size: n = 200, 150, 64, and 48. The most recent 13 periods were kept for the test data set. However, the number of test observations used differed between the two methods, STL and Classical, owing to differences inherent in each method; specifically, classical decomposition loses a few observations at the ends of the series. Tabular output of these amounts follows, and a sketch of the split appears after the table.
| Data set | Observations |
|---|---|
| Training set 1 | 200 |
| Training set 2 | 150 |
| Training set 3 | 64 |
| Training set 4 | 48 |
| Test set | 13 |
| Original set | 211 |
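The following is a minimal sketch of this split, assuming the quarterly counts sit in beer_df$nd0 (a placeholder name carried over from the sketch above); the exact index bookkeeping used in the original analysis is an assumption.

```r
# Hold out the most recent 13 quarters as the test set; each training set
# ends at the same quarter and takes the most recent n observations before
# that point (details of the original indexing are assumed, not confirmed).
h <- 13
test_set   <- tail(beer_df$nd0, h)
train_pool <- head(beer_df$nd0, nrow(beer_df) - h)

train_sizes <- c(200, 150, 64, 48)
train_sets  <- lapply(train_sizes, function(n) tail(train_pool, n))
names(train_sets) <- paste0("n", train_sizes)
```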
A time series object was constructed for each training data set using the ts() function; training sets 1 through 4 correspond to n = 200, 150, 64, and 48 observations, respectively. The frequency was set to 4. For brevity, the printed ts object for n = 64 follows, along with a plot of the ts object for n = 200; a sketch of the ts() construction appears after the output. Visual inspection of the plot suggests that the time series is additive: there appears to be no amplification of the seasonal swings over time.
```
     Qtr1 Qtr2 Qtr3 Qtr4
1994  449  381  423  531
1995  426  408  416  520
1996  409  398  398  507
1997  432  398  406  526
1998  428  397  403  517
1999  435  383  424  521
2000  421  402  414  500
2001  451  380  416  492
2002  428  408  406  506
2003  435  380  421  490
2004  435  390  412  454
2005  416  403  408  482
```
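A minimal sketch of the ts() construction follows; the end date of 2005 Q4 is taken from the printed object above and is otherwise an assumption, and the object names carry over from the split sketch.

```r
# Build one quarterly ts object per training set. Because every training set
# ends at the same quarter, specifying end= lets ts() infer each start date.
train_ts <- lapply(train_sets, ts, end = c(2005, 4), frequency = 4)

train_ts$n64                     # quarterly table like the one shown above
plot(train_ts$n200, ylab = "Megalitres",
     main = "Quarterly beer production (n = 200)")
```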
First, classical decomposition was conducted. Each training set was decomposed into its trend, seasonal, and error components and then plotted. The plots depict these decompositions for n = 200, 150, 64, and 48, respectively.
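A minimal sketch of the classical decomposition for one training set follows; the additive model is assumed based on the visual assessment above, and train_ts carries over from the earlier sketch.

```r
# Classical additive decomposition of the n = 200 training series.
dec_200 <- decompose(train_ts$n200, type = "additive")
plot(dec_200)   # panels for the observed, trend, seasonal, and random parts
```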
Next, forecasting was conducted. Classical decomposition drops the first and last two observations when it decomposes the time series, so each forecast covers 13 periods. The naive method was used for forecasting. Output of the forecast for n = 200 follows, and a sketch of one way such a forecast could be produced appears after the table.
| Quarter | Point Forecast | Lo 80 | Hi 80 | Lo 95 | Hi 95 |
|---|---|---|---|---|---|
| 2005 Q3 | 416.5306 | 385.5944 | 447.4668 | 369.2178 | 463.8434 |
| 2005 Q4 | 515.9133 | 472.1629 | 559.6637 | 449.0028 | 582.8237 |
| 2006 Q1 | 448.3342 | 394.7511 | 501.9172 | 366.3860 | 530.2824 |
| 2006 Q2 | 403.0000 | 341.1276 | 464.8724 | 308.3743 | 497.6257 |
| 2006 Q3 | 416.5306 | 347.3552 | 485.7061 | 310.7359 | 522.3253 |
| 2006 Q4 | 515.9133 | 440.1354 | 591.6912 | 400.0210 | 631.8056 |
| 2007 Q1 | 448.3342 | 366.4847 | 530.1837 | 323.1562 | 573.5122 |
| 2007 Q2 | 403.0000 | 315.4992 | 490.5008 | 269.1791 | 536.8209 |
| 2007 Q3 | 416.5306 | 323.7220 | 509.3392 | 274.5921 | 558.4691 |
| 2007 Q4 | 515.9133 | 418.0844 | 613.7421 | 366.2970 | 665.5296 |
| 2008 Q1 | 448.3342 | 345.7304 | 550.9379 | 291.4153 | 605.2531 |
| 2008 Q2 | 403.0000 | 295.8339 | 510.1661 | 239.1035 | 566.8965 |
| 2008 Q3 | 416.5306 | 304.9886 | 528.0727 | 245.9418 | 587.1194 |
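The exact code behind this forecast is not shown, so the following is only one plausible reconstruction: seasonally adjust the series using the classical decomposition, forecast the adjusted series with the naive method, and add a seasonal naive forecast of the seasonal component back on. The object dec_200 carries over from the decomposition sketch above.

```r
library(forecast)

# Naive forecast of the seasonally adjusted series (an assumed approach,
# not the confirmed original code).
adj_200 <- seasadj(dec_200)            # observed minus the seasonal component
fc_adj  <- naive(adj_200, h = 13)      # flat naive forecast of the level

# Repeat last year's seasonal pattern forward and re-seasonalize the forecast.
fc_seas   <- snaive(dec_200$seasonal, h = 13)
fc_points <- as.numeric(fc_adj$mean) + as.numeric(fc_seas$mean)
fc_points                              # 13 point forecasts
```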
Next, accuracy metrics for each of the training data sets were computed and output in a table. The MAPE and MSE are shown along the top, and the observation size of each training set is listed down the left side. The MAPE is reported as a percentage. A sketch of the computation appears after the table.
| n | MAPE (%) | MSE |
|---|---|---|
| 200 | 4.653308 | 507.0078 |
| 150 | 4.907893 | 619.3271 |
| 64 | 4.376375 | 525.5023 |
| 48 | 4.155283 | 410.6874 |
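A minimal sketch of the accuracy computation follows, assuming fc_points holds the point forecasts and test_set the held-out observations (both placeholder names from the earlier sketches).

```r
# MAPE (in percent) and MSE of the point forecasts against the test set.
err  <- test_set - fc_points
mape <- mean(abs(err / test_set)) * 100
mse  <- mean(err^2)
c(MAPE = mape, MSE = mse)
```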
Next, STL decomposition was conducted. Each training set was decomposed into its trend, seasonal, and error components and then plotted. The plots depict these decompositions for n = 200, 150, 64, and 48, respectively.
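A minimal sketch of the STL decomposition for one training set follows; s.window = "periodic" is an assumption, since the original window settings are not stated.

```r
# STL decomposition of the n = 200 training series (seasonal window assumed).
stl_200 <- stl(train_ts$n200, s.window = "periodic")
plot(stl_200)   # panels for the data, seasonal, trend, and remainder parts
```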
Next, forecasting was conducted. Each forecast covers 11 periods. The naive method was used for forecasting. Output of the forecast for n = 200 follows, and a sketch of this forecast appears after the table.
| Quarter | Point Forecast | Lo 80 | Hi 80 | Lo 95 | Hi 95 |
|---|---|---|---|---|---|
| 2006 Q1 | 414.8311 | 384.0250 | 445.6373 | 367.7172 | 461.9451 |
| 2006 Q2 | 369.1828 | 325.6163 | 412.7492 | 302.5536 | 435.8119 |
| 2006 Q3 | 383.1352 | 329.7774 | 436.4930 | 301.5314 | 464.7389 |
| 2006 Q4 | 482.0000 | 420.3877 | 543.6123 | 387.7721 | 576.2279 |
| 2007 Q1 | 414.8311 | 345.9465 | 483.7158 | 309.4812 | 520.1811 |
| 2007 Q2 | 369.1828 | 293.7234 | 444.6421 | 253.7776 | 484.5879 |
| 2007 Q3 | 383.1352 | 301.6298 | 464.6406 | 258.4834 | 507.7870 |
| 2007 Q4 | 482.0000 | 394.8670 | 569.1330 | 348.7416 | 615.2584 |
| 2008 Q1 | 414.8311 | 322.4127 | 507.2496 | 273.4893 | 556.1730 |
| 2008 Q2 | 369.1828 | 271.7652 | 466.6004 | 220.1954 | 518.1701 |
| 2008 Q3 | 383.1352 | 280.9627 | 485.3076 | 226.8759 | 539.3945 |
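A minimal sketch of the STL-based forecast follows; it assumes the forecast() method for stl objects from the forecast package was used with the naive method, which re-seasonalizes a naive forecast of the seasonally adjusted series. The object stl_200 carries over from the STL sketch above.

```r
library(forecast)

# Naive forecast built on the STL decomposition, 11 quarters ahead.
fc_stl <- forecast(stl_200, method = "naive", h = 11)
fc_stl          # point forecasts with 80% and 95% intervals
```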
Next, accuracy metrics for each of the training data sets were computed and output in a table. The MAPE and MSE are shown along the top, and the observation size of each training set is listed down the left side. The MAPE is reported as a percentage.
| n | MAPE (%) | MSE |
|---|---|---|
| 200 | 3.737894 | 281.6836 |
| 150 | 5.151534 | 515.3770 |
| 64 | 5.199174 | 535.5810 |
| 48 | 3.549672 | 266.7371 |
Two decomposition methods were explored in this analysis: STL and Classical. Each method broke the given time series into trend, seasonal, and error components, but each approached that task differently. Ultimately, that difference was realized not only in the values produced, but also in the mechanics of this analysis. The data set used was the count of beer production in Australia. Each observation represented a quarterly measure taken between 1956 and 2008. There were 211 observations and 2 variables, a date and an amount; the highest and lowest beer production in this period were 599.0 ML and 213.0 ML, respectively.

Before decomposition was conducted on the time series, training and test data sets were derived from the initial data set. Each training data set stopped at the same time period but began at a different one. The sizes of the training data sets were n = 200, 150, 64, and 48; the size of the test data set was n = 13. This was done to offset the effects of the classical decomposition method. Once these data sets were constructed, forecasting was carried out on them using the naive method. Each decomposition method had a different forecasting horizon: STL used 11 periods and Classical used 13. Afterwards, accuracy metrics were derived from the forecast and test values. The MAPE and MSE were reported, with the MAPE in percentages. For the STL method, the highest MAPE and MSE occurred at n = 64 and the lowest at n = 48; as the observation count moved away from n = 64 in either direction, the MAPE and MSE appeared to decrease. For the Classical method, the highest MAPE and MSE occurred at n = 150 and the lowest at n = 48; as the observation count moved away from n = 150 in either direction, the MAPE and MSE appeared to decrease.

The differences in the accuracy metrics between the two methods may stem from several factors. A main factor may be that classical decomposition uses a moving average to disassemble the time series, whereas STL does not. The differences within each method may also stem from several factors. A main factor may be that each observation size inherently alters the values the forecast is based on. Since those values may originate from dissimilar parts of the trend line, the seasonality they produce may not culminate in the expected results. Specifically, forecasting on values with dissimilar trends may not yield much information about an optimal training size, at least when forecasting with the naive method.
An optimal training size for forecasting with STL and Classical decomposition was not found, nor was an optimal decomposition method. However, using the MAPE and MSE reported, one might choose n = 48 for both the STL and Classical decomposition methods under naive forecasting, and choose STL over Classical decomposition, since that combination minimized both metrics. Caution is advised here, however: the result may be owed to peculiarities inherent in each method, the choice of forecasting method, and the choice of training size and training data location.