0.1 Case Study

This dataset records the amount of beer produced in New York from 1955 to 1995. It contains 476 observations, and the values are counts. The dataset is taken from Kaggle, a website where many data miners share their ideas and publish their work.

0.2 Building 4 Forecast Models

We next apply the four baseline forecasting methods (moving average, naive, seasonal naive, and drift) to the first 461 data values to forecast the next 15 data values.

Forecasting Table
 pred.mv  pred.naive  pred.snaive  pred.rwf
136.1132         131          129  131.0822
136.1132         131          128  131.1643
136.1132         131          140  131.2465
136.1132         131          143  131.3287
136.1132         131          151  131.4109
136.1132         131          177  131.4930
136.1132         131          184  131.5752
136.1132         131          151  131.6574
136.1132         131          134  131.7396
136.1132         131          164  131.8217
136.1132         131          126  131.9039
136.1132         131          131  131.9861
136.1132         131          129  132.0683
136.1132         131          128  132.1504
136.1132         131          140  132.2326
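
The four columns of the forecasting table can be reproduced in spirit with a few lines of code. The following is a minimal Python sketch, not the code used to produce the table: the function names, the toy series, the `window` size, and the `period` are all assumptions made for illustration. It does mirror the patterns visible in the table: the moving average and naive forecasts are constant, the seasonal naive forecast repeats the last observed season (the table repeats after 12 values), and the drift forecast rises by a fixed step.

```python
from statistics import mean


def moving_average_forecast(train, h, window):
    """Repeat the mean of the last `window` training values h times."""
    level = mean(train[-window:])
    return [level] * h


def naive_forecast(train, h):
    """Repeat the last observed value h times."""
    return [train[-1]] * h


def seasonal_naive_forecast(train, h, period):
    """Repeat the last full season of training values, cycling as needed."""
    last_season = train[-period:]
    return [last_season[i % period] for i in range(h)]


def drift_forecast(train, h):
    """Extend the straight line through the first and last training values."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + step * slope for step in range(1, h + 1)]


# Hypothetical toy series with a season of length 3 (not the beer data).
toy = [10, 12, 14, 11, 13, 15, 12, 14, 16]
h = 4

forecasts = {
    "pred.mv": moving_average_forecast(toy, h, window=3),
    "pred.naive": naive_forecast(toy, h),
    "pred.snaive": seasonal_naive_forecast(toy, h, period=3),
    "pred.rwf": drift_forecast(toy, h),
}

for name, values in forecasts.items():
    print(name, values)
```

For the beer series, one would pass the 461 training values with `h=15` and, judging by the repetition in the table, `period=12`. The window used for the moving average is not stated in the text, so it is left as a parameter here.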

0.3 Visualization

We now plot the time series together with the predicted values. Note that the forecasts are based on models that use 461 historical data values in the time series. The following shows only observations #462-#476 and the 15 forecast values.

We can see that the moving average, naive, and drift methods behave alike on this seasonal time series: each produces a flat or nearly flat forecast, so their performance is close. The seasonal naive method is the only one that reproduces the seasonal pattern, and the accuracy measures below show that it has the smallest errors.

Overall performance of the four forecasting methods
Method            MAPE       MAD        MSE
Moving Average    10.21725   239.6603   470.49288
Naive             11.07494   265.0000   588.20000
Seasonal Naive     4.61989   103.0000    83.26667
Drift             10.97103   261.8774   571.42499
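
The three accuracy measures in the table compare each forecast with the held-out observations. The sketch below is a hedged Python illustration that assumes the common textbook definitions (mean absolute percentage error, mean absolute deviation, and mean squared error). The text does not spell out the formulas it used, so the function names and the toy numbers are assumptions and the output is not meant to reproduce the table.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


def mad(actual, predicted):
    """Mean absolute deviation between actual and predicted values."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    squares = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squares) / len(squares)


# Hypothetical held-out values and forecasts (not the beer data).
actual = [100, 200, 400]
predicted = [110, 180, 400]

print("MAPE:", mape(actual, predicted))
print("MAD: ", mad(actual, predicted))
print("MSE: ", mse(actual, predicted))
```

A smaller value is better for all three measures, so the method with the lowest row in the table is the most accurate on the held-out observations.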

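In the final analysis, the seasonal naive method has the best performance: it has the lowest MAPE, MAD, and MSE in the table above, while the moving average, naive, and drift methods give results that are close to one another. As a reminder, the methods introduced in this module are baseline forecasting methods. They are all descriptive, since they do not rely on any statistical assumptions.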