The data has two variables, date and temperature. It records the daily minimum temperature in degrees Celsius over 10 years (1981-1990) in Melbourne, Australia. We split the data into a training set and a testing set: the training set is observations #1-#3640, and the testing set is the last 40 observations.
We apply four baseline forecasting methods (moving average, naive, seasonal naive, and drift) to the training data to produce 30 forecasted values.
kable(pred.table, caption = "Forecasting Table")
| pred.mv | pred.naive | pred.snaive | pred.rwf |
|---|---|---|---|
| 11.17154 | 13.1 | 11.8 | 13.09791 |
| 11.17154 | 13.1 | 12.0 | 13.09582 |
| 11.17154 | 13.1 | 12.7 | 13.09373 |
| 11.17154 | 13.1 | 16.4 | 13.09165 |
| 11.17154 | 13.1 | 16.0 | 13.08956 |
| 11.17154 | 13.1 | 13.3 | 13.08747 |
| 11.17154 | 13.1 | 11.7 | 13.08538 |
| 11.17154 | 13.1 | 10.4 | 13.08329 |
| 11.17154 | 13.1 | 14.4 | 13.08120 |
| 11.17154 | 13.1 | 12.7 | 13.07912 |
| 11.17154 | 13.1 | 14.8 | 13.07703 |
| 11.17154 | 13.1 | 13.3 | 13.07494 |
| 11.17154 | 13.1 | 15.6 | 13.07285 |
| 11.17154 | 13.1 | 14.5 | 13.07076 |
| 11.17154 | 13.1 | 14.3 | 13.06867 |
| 11.17154 | 13.1 | 15.3 | 13.06658 |
| 11.17154 | 13.1 | 16.4 | 13.06450 |
| 11.17154 | 13.1 | 14.8 | 13.06241 |
| 11.17154 | 13.1 | 17.4 | 13.06032 |
| 11.17154 | 13.1 | 18.8 | 13.05823 |
| 11.17154 | 13.1 | 22.1 | 13.05614 |
| 11.17154 | 13.1 | 19.0 | 13.05405 |
| 11.17154 | 13.1 | 15.5 | 13.05196 |
| 11.17154 | 13.1 | 15.8 | 13.04988 |
| 11.17154 | 13.1 | 14.7 | 13.04779 |
| 11.17154 | 13.1 | 10.7 | 13.04570 |
| 11.17154 | 13.1 | 11.5 | 13.04361 |
| 11.17154 | 13.1 | 15.0 | 13.04152 |
| 11.17154 | 13.1 | 14.5 | 13.03943 |
| 11.17154 | 13.1 | 14.5 | 13.03735 |
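The code that produced `pred.mv`, `pred.naive`, `pred.snaive`, and `pred.rwf` is not shown, so the following is only a minimal base-R sketch of how the four baseline methods are commonly defined. The toy series `y`, the window width `k`, and the seasonal period `m` are all assumptions made for illustration, not values taken from the analysis above.

```r
# Toy series standing in for the training data (hypothetical values).
y <- c(10, 12, 14, 11, 13, 15, 12, 14, 16, 13, 15, 17)
h <- 3   # forecast horizon
m <- 3   # assumed seasonal period for the seasonal naive method
k <- 4   # assumed window width for the moving average
n <- length(y)

# Moving average: mean of the last k observations, repeated h times.
f.mv <- rep(mean(tail(y, k)), h)

# Naive: the last observed value, repeated h times.
f.naive <- rep(y[n], h)

# Seasonal naive: reuse the value observed one seasonal period earlier.
f.snaive <- y[n - m + ((seq_len(h) - 1) %% m) + 1]

# Drift: extend the straight line through the first and last observations.
f.drift <- y[n] + seq_len(h) * (y[n] - y[1]) / (n - 1)

# One column per method, one row per forecast step.
baseline <- cbind(f.mv, f.naive, f.snaive, f.drift)
print(baseline)
```

Under these definitions the moving average and naive forecasts are flat, the seasonal naive forecast repeats an earlier stretch of the series, and the drift forecast changes by a constant amount per step. This matches the pattern in the table above, where `pred.rwf` falls by roughly 0.0021 at each step.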
We now draw a time series plot restricted to observations #3600-#3650 and overlay the 30 forecasted values from each method.
## observed series, restricted to observations #3600-#3650
plot(3600:3650, day.temp[3600:3650], type="l", xlim=c(3600,3650), ylim=c(-10,30),
xlab = "observation sequence",
ylab = "Daily minimum temperature (°C)",
main = "Daily minimum temperatures and forecasts")
points(3600:3650, day.temp[3600:3650],pch=20)
## forecast points, one symbol and colour per method
points(3621:3650, pred.mv, pch=15, col = "red")
points(3621:3650, pred.naive, pch=16, col = "blue")
points(3621:3650, pred.rwf, pch=18, col = "navy")
points(3621:3650, pred.snaive, pch=17, col = "purple")
## dashed lines connecting the forecasts of each method
lines(3621:3650, pred.mv, lty=2, col = "red")
lines(3621:3650, pred.snaive, lty=2, col = "purple")
lines(3621:3650, pred.naive, lty=2, col = "blue")
lines(3621:3650, pred.rwf, lty=2, col = "navy")
## legend: col and pch are listed in the same order as the labels
legend("topright", c("moving average", "naive", "drift", "seasonal naive"),
col=c("red", "blue", "navy", "purple"), pch=c(15, 16, 18, 17), lty=rep(2,4),
bty="n", cex = 0.8)
Based on this graph, the moving average method may appear to work best, but a visual impression like this needs to be checked against numerical error measures.
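We can test this using the mean absolute percentage error (MAPE), along with MAD and MSE, to compare the performance of the four forecasting methods. For all three measures, a smaller value indicates a more accurate forecast.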
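As a rough illustration of what these measures compute, here is a small sketch on made-up numbers. The vectors `actual` and `predicted` are hypothetical, and the definitions use means throughout. The original computation of `MAPE`, `MAD`, and `MSE` is not shown and may differ (for instance, by summing rather than averaging the absolute errors).

```r
# Hypothetical observed and forecasted values, for illustration only.
actual    <- c(12, 15, 10, 20)
predicted <- c(11, 15, 12, 16)

err <- actual - predicted   # forecast errors: 1, 0, -2, 4

# Mean absolute percentage error: absolute error relative to the observed
# value, averaged and expressed in percent.
mape <- 100 * mean(abs(err) / abs(actual))

# Mean absolute deviation: average size of the error, in the data's units.
mad <- mean(abs(err))

# Mean squared error: average squared error, which penalises large misses.
mse <- mean(err^2)

print(c(MAPE = mape, MAD = mad, MSE = mse))
```

MAPE is scale-free, which makes it convenient for comparing methods, but it is undefined when an observed value is zero.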
accuracy.table = cbind(MAPE = MAPE, MAD = MAD, MSE = MSE)
row.names(accuracy.table) = c("Moving Average", "Naive", "Seasonal Naive", "Drift")
kable(accuracy.table, caption ="Overall performance of the four forecasting methods")
| | MAPE | MAD | MSE |
|---|---|---|---|
| Moving Average | 21.17789 | 99.78308 | 16.663487 |
| Naive | 12.49265 | 58.70000 | 7.917667 |
| Seasonal Naive | 20.51542 | 86.80000 | 12.482000 |
| Drift | 12.58712 | 59.16573 | 7.968845 |
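Because the naive method has the lowest MAPE, MAD, and MSE of the four methods, with the drift method a close second, the error measures indicate that the naive method works best. The moving average method has the highest value on all three measures, so it performs worst here, contrary to the visual impression from the graph.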
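Since smaller error values indicate better forecasts, the ranking can also be read off programmatically. The sketch below re-enters the values from the accuracy table by hand (the object name `accuracy` is introduced here for illustration) and reports the method with the smallest value in each column.

```r
# Values copied from the accuracy table above.
accuracy <- rbind(
  "Moving Average" = c(MAPE = 21.17789, MAD = 99.78308, MSE = 16.663487),
  "Naive"          = c(MAPE = 12.49265, MAD = 58.70000, MSE = 7.917667),
  "Seasonal Naive" = c(MAPE = 20.51542, MAD = 86.80000, MSE = 12.482000),
  "Drift"          = c(MAPE = 12.58712, MAD = 59.16573, MSE = 7.968845)
)

# For each error measure, find the method with the smallest value.
best <- apply(accuracy, 2, function(col) rownames(accuracy)[which.min(col)])
print(best)

# Rank the methods by MAPE, from most to least accurate.
print(rownames(accuracy)[order(accuracy[, "MAPE"])])
```

With these values, the naive method has the smallest error on every measure, followed by drift, seasonal naive, and moving average.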