1 Data Description

The data set has two variables, date and temperature. It records the daily minimum temperature in degrees Celsius over 10 years (1981-1990) in Melbourne, Australia. We split the data into a training set and a testing set: the training set consists of observations #1-#3640, and the testing set consists of the last 40 observations.
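A hold-out split of this kind can be sketched in base R as follows. This is a minimal illustration, not the code used for this report: `split_series` and its arguments are hypothetical names, and the series is assumed to be a plain numeric vector in time order.

```r
# Hypothetical helper: keep the last n.test observations as the testing set
# and everything before them as the training set.
split_series <- function(y, n.test) {
  n <- length(y)
  stopifnot(n.test > 0, n.test < n)
  list(
    train = y[seq_len(n - n.test)],   # observations 1 .. (n - n.test)
    test  = y[(n - n.test + 1):n]     # the final n.test observations
  )
}

# Usage on a small made-up series (not the Melbourne data)
toy <- c(20.7, 17.9, 18.8, 14.6, 15.8, 15.8, 15.8, 17.4, 21.8, 20.0)
parts <- split_series(toy, n.test = 3)
```

Because the split is by position rather than at random, the testing set always lies after the training set in time, which is what a forecasting comparison needs.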

2 Forecasting Models

We apply four baseline forecasting methods (moving average, naive, seasonal naive, and drift) to the training data to forecast 30 future values.

kable(pred.table, caption = "Forecasting Table")

Forecasting Table
pred.mv	pred.naive	pred.snaive	pred.rwf
11.17154	13.1	11.8	13.09791
11.17154	13.1	12.0	13.09582
11.17154	13.1	12.7	13.09373
11.17154	13.1	16.4	13.09165
11.17154	13.1	16.0	13.08956
11.17154	13.1	13.3	13.08747
11.17154	13.1	11.7	13.08538
11.17154	13.1	10.4	13.08329
11.17154	13.1	14.4	13.08120
11.17154	13.1	12.7	13.07912
11.17154	13.1	14.8	13.07703
11.17154	13.1	13.3	13.07494
11.17154	13.1	15.6	13.07285
11.17154	13.1	14.5	13.07076
11.17154	13.1	14.3	13.06867
11.17154	13.1	15.3	13.06658
11.17154	13.1	16.4	13.06450
11.17154	13.1	14.8	13.06241
11.17154	13.1	17.4	13.06032
11.17154	13.1	18.8	13.05823
11.17154	13.1	22.1	13.05614
11.17154	13.1	19.0	13.05405
11.17154	13.1	15.5	13.05196
11.17154	13.1	15.8	13.04988
11.17154	13.1	14.7	13.04779
11.17154	13.1	10.7	13.04570
11.17154	13.1	11.5	13.04361
11.17154	13.1	15.0	13.04152
11.17154	13.1	14.5	13.03943
11.17154	13.1	14.5	13.03735
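For reference, the four baseline forecasts can be sketched in base R using their textbook definitions. This is an illustration under stated assumptions rather than the code that produced the table: `baseline_forecasts`, `period`, and `window` are hypothetical names, and the moving-average column is assumed to be the mean of the last `window` training values (the whole training series by default), which yields a constant forecast like the one shown above.

```r
# Hypothetical helper returning h-step forecasts from four baseline methods.
# y: training series (numeric vector), h: forecast horizon,
# period: assumed season length, window: assumed averaging window.
baseline_forecasts <- function(y, h, period, window = length(y)) {
  n <- length(y)
  stopifnot(period <= n, window <= n)

  # Average method: every forecast equals the mean of the last `window` values
  pred.mv <- rep(mean(y[(n - window + 1):n]), h)

  # Naive method: every forecast equals the last observed value
  pred.naive <- rep(y[n], h)

  # Seasonal naive: repeat the last observed season as often as needed
  last.season <- y[(n - period + 1):n]
  pred.snaive <- last.season[((seq_len(h) - 1) %% period) + 1]

  # Drift: extend the straight line through the first and last observations
  slope <- (y[n] - y[1]) / (n - 1)
  pred.rwf <- y[n] + seq_len(h) * slope

  data.frame(pred.mv, pred.naive, pred.snaive, pred.rwf)
}

# Usage on a small made-up series
fc <- baseline_forecasts(c(2, 4, 6, 8, 10, 12), h = 3, period = 2)
```

The naive and drift columns differ only by the slope term, which explains why those two columns in the table stay so close to each other over a short horizon.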

3 Visualization

We now plot the time series, restricted to observations #3600-#3650, together with the 30 forecasted values from each method.

# Observed series, restricted to observations 3600-3650
plot(3600:3650, day.temp[3600:3650], type="l", xlim=c(3600,3650), ylim=c(-10,30),
     xlab = "Observation sequence",
     ylab = "Daily minimum temperature (degrees Celsius)",
     main = "Daily minimum temperatures and forecasts")
points(3600:3650, day.temp[3600:3650], pch=20)
# Forecast points: one symbol and color per method
points(3621:3650, pred.mv, pch=15, col = "red")
points(3621:3650, pred.naive, pch=16, col = "blue")
points(3621:3650, pred.rwf, pch=18, col = "navy")
points(3621:3650, pred.snaive, pch=17, col = "purple")
# Dashed lines connecting the forecasts of each method
lines(3621:3650, pred.mv, lty=2, col = "red")
lines(3621:3650, pred.snaive, lty=2, col = "purple")
lines(3621:3650, pred.naive, lty=2, col = "blue")
lines(3621:3650, pred.rwf, lty=2, col = "navy")
# Legend: colors and symbols are listed in the same order as the labels
legend("topright", c("moving average", "naive", "drift", "seasonal naive"),
       col=c("red", "blue", "navy", "purple"), pch=c(15, 16, 18, 17), lty=rep(2,4),
       bty="n", cex = 0.8)

Based on this graph, the moving average method appears to work best.

4 Accuracy Metrics

We can test this by comparing the four forecasting methods on three accuracy metrics: the mean absolute percentage error (MAPE), the mean absolute deviation (MAD), and the mean squared error (MSE).

accuracy.table = cbind(MAPE = MAPE, MAD = MAD, MSE = MSE)
row.names(accuracy.table) = c("Moving Average", "Naive", "Seasonal Naive", "Drift")
kable(accuracy.table, caption ="Overall performance of the four forecasting methods")

Overall performance of the four forecasting methods
	MAPE	MAD	MSE
Moving Average	21.17789	99.78308	16.663487
Naive	12.49265	58.70000	7.917667
Seasonal Naive	20.51542	86.80000	12.482000
Drift	12.58712	59.16573	7.968845

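Lower error values indicate better forecasts. The moving average has the highest MAPE, MAD, and MSE of the four methods, so it performs worst, not best. The naive method has the lowest value on all three metrics and therefore works best, with the drift method a close second.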
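The three metrics can be sketched in base R using their standard definitions. This is a minimal illustration, not the code behind the table: `forecast_errors` is a hypothetical name, and the scaling used for the table (for example, a sum instead of a mean) may differ from the definitions assumed here.

```r
# Hypothetical helper computing three accuracy metrics for one forecast.
# actual: observed test values, pred: forecasts for the same time points.
forecast_errors <- function(actual, pred) {
  stopifnot(length(actual) == length(pred))
  err <- actual - pred
  c(
    MAPE = 100 * mean(abs(err / actual)),  # undefined if any actual value is 0
    MAD  = mean(abs(err)),                 # mean absolute deviation
    MSE  = mean(err^2)                     # mean squared error
  )
}

# Usage on made-up numbers: absolute errors of 2, 2, and 0
m <- forecast_errors(actual = c(10, 20, 40), pred = c(12, 18, 40))
```

All three metrics measure forecast error, so for each of them the method with the smallest value is preferred.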

Time Series Concepts

Emma Laughlin
