1 Data Description

The data has two variables, date and temperature. It records the daily minimum temperature in degrees Celsius over 10 years (1981-1990) in Melbourne, Australia. We split the data into a training set and a testing set: the training set is observations #1-#3640, and the testing set is the last 40 observations.
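
A minimal sketch of a time-ordered split is shown here for illustration. It uses a short simulated series rather than the Melbourne data, and the hold-out size `n.test` is a placeholder; only the name `day.temp` is taken from the plotting code later in this report.

```r
# Simulated stand-in series for illustration only (NOT the Melbourne data).
set.seed(123)
day.temp <- round(rnorm(100, mean = 11, sd = 4), 1)

n.test <- 10                 # hold-out size (placeholder value)
n      <- length(day.temp)

# Time series are split by position, never at random:
# the earliest observations train the methods, the latest ones test them.
train <- day.temp[1:(n - n.test)]
test  <- day.temp[(n - n.test + 1):n]

length(train)  # 90
length(test)   # 10
```

Splitting by position keeps every test observation later in time than every training observation, which is what a forecast evaluation needs.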

2 Forecasting Models

We apply four baseline forecasting methods (moving average, naive, seasonal naive, and drift) to the training data and forecast the next 30 values.
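
The four baseline forecasts can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used for this report: the series `y`, the horizon `h`, and the seasonal `period` are toy values, and the first column is computed as the plain mean of the training data, an assumption suggested by the constant `pred.mv` values in the table.

```r
# Toy stand-in series (NOT the Melbourne data); replace with the training vector.
y <- c(12.1, 13.4, 11.8, 10.9, 12.7, 14.2, 13.1, 12.5)
h <- 3        # forecast horizon (hypothetical)
period <- 4   # seasonal period (hypothetical; the report does not state it)
n <- length(y)

# Mean method (assumed to correspond to pred.mv): average of the training data.
pred.mv <- rep(mean(y), h)

# Naive method: every forecast equals the last observed value.
pred.naive <- rep(y[n], h)

# Seasonal naive: repeat the last full season of observations.
pred.snaive <- y[n - period + ((seq_len(h) - 1) %% period) + 1]

# Drift: extend the straight line through the first and last observations.
slope <- (y[n] - y[1]) / (n - 1)
pred.rwf <- y[n] + slope * seq_len(h)

pred.table <- cbind(pred.mv, pred.naive, pred.snaive, pred.rwf)
pred.table
```

The mean and naive forecasts are flat, the drift forecast changes by a constant step, and only the seasonal naive forecast varies from one horizon to the next, which matches the pattern in the forecasting table.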

kable(pred.table, caption = "Forecasting Table")
Forecasting Table
 pred.mv  pred.naive  pred.snaive  pred.rwf
11.17154        13.1         11.8  13.09791
11.17154        13.1         12.0  13.09582
11.17154        13.1         12.7  13.09373
11.17154        13.1         16.4  13.09165
11.17154        13.1         16.0  13.08956
11.17154        13.1         13.3  13.08747
11.17154        13.1         11.7  13.08538
11.17154        13.1         10.4  13.08329
11.17154        13.1         14.4  13.08120
11.17154        13.1         12.7  13.07912
11.17154        13.1         14.8  13.07703
11.17154        13.1         13.3  13.07494
11.17154        13.1         15.6  13.07285
11.17154        13.1         14.5  13.07076
11.17154        13.1         14.3  13.06867
11.17154        13.1         15.3  13.06658
11.17154        13.1         16.4  13.06450
11.17154        13.1         14.8  13.06241
11.17154        13.1         17.4  13.06032
11.17154        13.1         18.8  13.05823
11.17154        13.1         22.1  13.05614
11.17154        13.1         19.0  13.05405
11.17154        13.1         15.5  13.05196
11.17154        13.1         15.8  13.04988
11.17154        13.1         14.7  13.04779
11.17154        13.1         10.7  13.04570
11.17154        13.1         11.5  13.04361
11.17154        13.1         15.0  13.04152
11.17154        13.1         14.5  13.03943
11.17154        13.1         14.5  13.03735

3 Visualization

We now plot the time series for observations #3600-#3650 only and overlay the 30 forecasted values from each method.

plot(3600:3650, day.temp[3600:3650], type="l", xlim=c(3600,3650), ylim=c(-10,30),
     xlab = "observation sequence",
     ylab = "Daily minimum temperature (degrees Celsius)",
     main = "Daily minimum temperatures and forecasts")
points(3600:3650, day.temp[3600:3650], pch=20)
## forecast points: one symbol and color per method
points(3621:3650, pred.mv, pch=15, col = "red")
points(3621:3650, pred.naive, pch=16, col = "blue")
points(3621:3650, pred.rwf, pch=18, col = "navy")
points(3621:3650, pred.snaive, pch=17, col = "purple")
## dashed lines connecting each method's forecasts
lines(3621:3650, pred.mv, lty=2, col = "red")
lines(3621:3650, pred.snaive, lty=2, col = "purple")
lines(3621:3650, pred.naive, lty=2, col = "blue")
lines(3621:3650, pred.rwf, lty=2, col = "navy")
## legend: symbols listed in the same order as the labels (drift = 18, seasonal naive = 17)
legend("topright", c("moving average", "naive", "drift", "seasonal naive"),
       col=c("red", "blue", "navy", "purple"), pch=c(15, 16, 18, 17), lty=rep(2,4),
       bty="n", cex = 0.8)

The graph gives a first visual impression of how closely each method tracks the observed values. The accuracy metrics in the next section quantify that impression.

4 Accuracy Metrics

We can test this using the mean absolute percentage error (MAPE), together with the mean absolute deviation (MAD) and the mean squared error (MSE), to compare the performance of the four forecasting methods.
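
For reference, the three metrics can be sketched with their standard textbook definitions. The helper `accuracy.metrics` is a hypothetical name, and the inputs are toy values; the report does not show how its own `MAPE`, `MAD`, and `MSE` vectors were computed, so its scaling may differ from this sketch.

```r
# Hypothetical helper using the standard definitions of the three metrics.
accuracy.metrics <- function(actual, pred) {
  err <- actual - pred
  c(MAPE = 100 * mean(abs(err / actual)),  # mean absolute percentage error
    MAD  = mean(abs(err)),                 # mean absolute deviation
    MSE  = mean(err^2))                    # mean squared error
}

# Toy values for illustration only (not the Melbourne data).
actual <- c(10, 20, 40)
pred   <- c(12, 18, 40)
m <- accuracy.metrics(actual, pred)
m
```

For all three metrics a smaller value means the forecasts are closer to the observed values.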

accuracy.table = cbind(MAPE = MAPE, MAD = MAD, MSE = MSE)
row.names(accuracy.table) = c("Moving Average", "Naive", "Seasonal Naive", "Drift")
kable(accuracy.table, caption ="Overall performance of the four forecasting methods")
Overall performance of the four forecasting methods
                    MAPE       MAD        MSE
Moving Average  21.17789  99.78308  16.663487
Naive           12.49265  58.70000   7.917667
Seasonal Naive  20.51542  86.80000  12.482000
Drift           12.58712  59.16573   7.968845

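Lower values indicate better forecasts for all three metrics. The naive method has the lowest MAPE (12.49), MAD, and MSE, with the drift method a close second (MAPE 12.59). The moving average method has the highest error on every metric, so the metrics do not support it as the best method; on this test window the naive method works best.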
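
The ranking can also be read off the table programmatically. The sketch below re-enters the values from the accuracy table by hand and picks, for each metric, the method with the smallest error; the variable name `best` is introduced here for illustration.

```r
# Values copied from the accuracy table in this report.
accuracy.table <- rbind(
  "Moving Average" = c(MAPE = 21.17789, MAD = 99.78308, MSE = 16.663487),
  "Naive"          = c(MAPE = 12.49265, MAD = 58.70000, MSE = 7.917667),
  "Seasonal Naive" = c(MAPE = 20.51542, MAD = 86.80000, MSE = 12.482000),
  "Drift"          = c(MAPE = 12.58712, MAD = 59.16573, MSE = 7.968845)
)

# For each metric (column), find the row with the smallest error.
best <- rownames(accuracy.table)[apply(accuracy.table, 2, which.min)]
names(best) <- colnames(accuracy.table)
best  # "Naive" for MAPE, MAD, and MSE
```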