0.1 Case Study and Data Description

This dataset records the amount of beer produced in New York from 1955 to 1995. It contains 476 observations, and the values are counts. The dataset is taken from Kaggle, a website where data miners share ideas and publish their work.

0.2 Time Series Object

0.3 Forecasting with Decomposition

0.4 Training and Testing

We hold out the last 7 periods of data for testing. The rest of the data is used to train the forecast model. We define several training data sets of different sizes. The four training set sizes in this example are 144, 109, 73, and 48. The same test set of size 7 is used to calculate the prediction error in every case.
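The hold-out scheme described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the case study. It assumes that each training set consists of the observations immediately preceding the test window, which the text does not state explicitly. The `series` values are a hypothetical stand-in for the 476 observations, and the helper name `holdout_split` is invented for this example.

```python
# Hypothetical stand-in for the 476 observations (the real values are not reproduced here).
series = list(range(1, 477))

TEST_SIZE = 7                     # last 7 periods are held out for testing
TRAIN_SIZES = [144, 109, 73, 48]  # training set sizes compared in the study


def holdout_split(values, train_size, test_size=TEST_SIZE):
    """Return (train, test).

    test  : the last `test_size` observations.
    train : the `train_size` observations immediately before the test
            window (an assumption about how the windows are chosen).
    """
    if train_size + test_size > len(values):
        raise ValueError("not enough observations for this split")
    test = values[-test_size:]
    train = values[-(train_size + test_size):-test_size]
    return train, test


# One (train, test) pair per training size; the test set is identical in all of them.
splits = {n: holdout_split(series, n) for n in TRAIN_SIZES}

for n, (train, test) in splits.items():
    print(f"train size {n}: {len(train)} training points, {len(test)} test points")
```

Because every split shares the same 7 test points, differences in the error measures can be attributed to the training size rather than to the evaluation data.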

We next perform error analysis.

Error comparison between forecast results with different training sample sizes

Size     MSE         MAPE
n.144    217.6320    0.1010451
n.109    200.7711    0.0967022
n.73     184.9340    0.0925949
n.48     154.3774    0.0814401

We trained the same algorithm with different sample sizes and compared the accuracy measures. Among the four training sizes (144, 109, 73, and 48), the training size of 48 yields the best performance because it has both the lowest MSE and the lowest MAPE.
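For reference, the two accuracy measures can be computed as in the sketch below. It assumes the standard definitions: MSE is the mean of squared forecast errors, and MAPE is the mean of absolute errors relative to the actual values. MAPE is expressed as a fraction rather than a percentage, which matches the scale of the reported values. The `actual` and `forecast` numbers are made up for illustration. The `reported` dictionary copies the MSE and MAPE values from the error comparison above.

```python
def mse(actual, forecast):
    """Mean squared error over paired actual and forecast values."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (0.10 means 10%)."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Made-up numbers, only to show the calculation.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print(mse(actual, forecast), mape(actual, forecast))

# Reported errors from the case study: training size -> (MSE, MAPE).
reported = {
    144: (217.6320, 0.1010451),
    109: (200.7711, 0.0967022),
    73: (184.9340, 0.0925949),
    48: (154.3774, 0.0814401),
}

# Pick the training size with the smallest error under each measure.
best_by_mse = min(reported, key=lambda n: reported[n][0])
best_by_mape = min(reported, key=lambda n: reported[n][1])
print(best_by_mse, best_by_mape)
```

Both measures select the same training size here, so the conclusion does not depend on which one is used.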

Forecasting with STL smoothing does not yield particularly accurate results: even the best training size has a MAPE of about 8%. However, the case study still accomplishes its main learning goal, which is to decompose a time series and compare training sizes to find the one that gives the best accuracy.