Autoregressive & Moving Average Forecasting Example: Global Surface Temperature Anomaly

The data contain the annual global surface temperature anomaly from 1880 to 2016, in two parts: the annual average (Avg) and the 5-year average (Avg5yr) anomaly.
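
The rest of the post assumes the anomalies live in an xts object named Data with columns Avg and Avg5yr. A minimal sketch of how such an object might be built; the file name and raw column names here are assumptions, not the original source:

#Sketch only: read a yearly CSV (assumed columns Year, Avg, Avg5yr) and
#index it by the first day of each year so dygraphs and summary() show dates.
library(xts)
library(dygraphs)
library(forecast)
library(magrittr)

raw  <- read.csv("global_temperature_anomaly.csv")   # hypothetical file name
Data <- xts(raw[, c("Avg", "Avg5yr")],
            order.by = as.Date(paste0(raw$Year, "-01-01")))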

Data %>% summary
##      Index                 Avg               Avg5yr        
##  Min.   :1880-01-01   Min.   :-0.47000   Min.   :-0.41000  
##  1st Qu.:1914-01-01   1st Qu.:-0.21000   1st Qu.:-0.23000  
##  Median :1948-01-01   Median :-0.07000   Median :-0.04000  
##  Mean   :1948-01-01   Mean   : 0.02438   Mean   : 0.02445  
##  3rd Qu.:1982-01-01   3rd Qu.: 0.19000   3rd Qu.: 0.21000  
##  Max.   :2016-01-01   Max.   : 0.99000   Max.   : 0.91000
Data %>% dygraph(main = "Global Surface Temperature Anomaly 1880-2016") %>%
  dyRangeSelector()

Forecasting approaches: Autoregressive (AR) and Moving Average (MA)
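
An AR(p) model regresses the series on its own p most recent values, while an MA(q) model writes it as a weighted sum of the q most recent white-noise shocks. As a quick illustration (not part of the original analysis), arima.sim() can simulate one of each so the two structures are easy to compare:

#Illustrative only: simulate an AR(2) and an MA(2) process with arbitrary
#coefficients and plot them side by side.
set.seed(1)
sim_ar2 <- arima.sim(model = list(ar = c(0.7, 0.2)), n = 200)
sim_ma2 <- arima.sim(model = list(ma = c(1.0, 0.6)), n = 200)
par(mfrow = c(2, 1))
plot(sim_ar2, main = "Simulated AR(2)")
plot(sim_ma2, main = "Simulated MA(2)")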

AR structure: we'd like to use the AIC (Akaike information criterion) and the BIC (Bayesian information criterion) to see which order works best.

#Using AIC as information criterion
Data$Avg %>% auto.arima(max.q = 0, max.p = 2, max.d = 0, trace = TRUE, ic = "aic")
## 
##  ARIMA(2,0,0) with non-zero mean : -216.1359
##  ARIMA(0,0,0) with non-zero mean : 85.50236
##  ARIMA(1,0,0) with non-zero mean : -212.8962
##  ARIMA(0,0,0) with zero mean     : 84.26737
##  ARIMA(2,0,0) with zero mean     : -217.7611
##  ARIMA(1,0,0) with zero mean     : -214.5527
## 
##  Best model: ARIMA(2,0,0) with zero mean
## Series: . 
## ARIMA(2,0,0) with zero mean 
## 
## Coefficients:
##          ar1     ar2
##       0.7806  0.1971
## s.e.  0.0839  0.0852
## 
## sigma^2 estimated as 0.01135:  log likelihood=111.88
## AIC=-217.76   AICc=-217.58   BIC=-209
#Using BIC as information criterion
Data$Avg %>% auto.arima(max.q = 0, max.p = 2, max.d = 0, trace = TRUE, ic = "bic") 
## 
##  ARIMA(2,0,0) with non-zero mean : -204.456
##  ARIMA(0,0,0) with non-zero mean : 91.34232
##  ARIMA(1,0,0) with non-zero mean : -204.1363
##  ARIMA(0,0,0) with zero mean     : 87.18735
##  ARIMA(2,0,0) with zero mean     : -209.0011
##  ARIMA(1,0,0) with zero mean     : -208.7127
## 
##  Best model: ARIMA(2,0,0) with zero mean
## Series: . 
## ARIMA(2,0,0) with zero mean 
## 
## Coefficients:
##          ar1     ar2
##       0.7806  0.1971
## s.e.  0.0839  0.0852
## 
## sigma^2 estimated as 0.01135:  log likelihood=111.88
## AIC=-217.76   AICc=-217.58   BIC=-209
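
Both criteria point to a zero-mean AR(2). As a sanity check, the same comparison can be done by hand by fitting each candidate order with Arima() and extracting the criteria directly (a sketch, assuming the forecast package is loaded):

#Fit zero-mean AR(0), AR(1) and AR(2) and tabulate AIC/BIC; the AR(2) row
#should come out lowest on both, matching auto.arima() above.
fits <- lapply(0:2, function(p)
  Arima(Data$Avg, order = c(p, 0, 0), include.mean = FALSE))
data.frame(order = 0:2,
           AIC   = sapply(fits, AIC),
           BIC   = sapply(fits, BIC))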

Since AR(2) works best under both criteria, we fit an AR structure with ar() (AIC-based order selection capped at order 2) and forecast 10 periods ahead.

ar(Data$Avg, TRUE, 2) %>% forecast(h=10) 
##     Point Forecast     Lo 80     Hi 80       Lo 95    Hi 95
## 138      0.8993491 0.7272480 1.0714502  0.63614319 1.162555
## 139      0.8458384 0.6275354 1.0641413  0.51197280 1.179704
## 140      0.7910633 0.5372299 1.0448967  0.40285857 1.179268
## 141      0.7406345 0.4597119 1.0215571  0.31100043 1.170269
## 142      0.6934159 0.3907960 0.9960358  0.23059872 1.156233
## 143      0.6493266 0.3289781 0.9696751  0.15939590 1.139257
## 144      0.6081403 0.2730877 0.9431928  0.09572167 1.120559
## 145      0.5696686 0.2222945 0.9170427  0.03840573 1.100932
## 146      0.5337323 0.1759536 0.8915110 -0.01344302 1.080908
## 147      0.5001643 0.1335484 0.8667803 -0.06052638 1.060855
ar(Data$Avg, TRUE, 2) %>% forecast(h=10) %>% autoplot()
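
The point forecasts decay toward the series mean because each step is a weighted sum of the two previous observed or forecast values. A sketch of the first step of that recursion, using the fitted ar() object (assuming the AIC-selected order is 2, as above):

#First-step AR(2) forecast by hand: ar() works on the demeaned series, so we
#apply the estimated coefficients to the last two observations and add the
#mean back. The result should match the first point forecast above.
fit <- ar(Data$Avg, TRUE, 2)
phi <- fit$ar            # estimated AR coefficients
mu  <- fit$x.mean        # sample mean removed by ar()
y   <- as.numeric(Data$Avg)
mu + phi[1] * (tail(y, 1) - mu) + phi[2] * (tail(y, 2)[1] - mu)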

We repeat the same order selection for the MA structure.

#Using AIC as information criterion
Data$Avg %>% auto.arima(max.q = 2, max.p = 0, max.d = 0, trace = TRUE, ic = "aic")
## 
##  ARIMA(0,0,2) with non-zero mean : -104.6498
##  ARIMA(0,0,0) with non-zero mean : 85.50236
##  ARIMA(0,0,1) with non-zero mean : -29.1264
##  ARIMA(0,0,0) with zero mean     : 84.26737
##  ARIMA(0,0,2) with zero mean     : -106.0445
##  ARIMA(0,0,1) with zero mean     : -30.39957
## 
##  Best model: ARIMA(0,0,2) with zero mean
## Series: . 
## ARIMA(0,0,2) with zero mean 
## 
## Coefficients:
##          ma1     ma2
##       1.0832  0.6547
## s.e.  0.0638  0.0578
## 
## sigma^2 estimated as 0.02591:  log likelihood=56.02
## AIC=-106.04   AICc=-105.86   BIC=-97.28
#Using BIC as information criterion
Data$Avg %>% auto.arima(max.q = 2, max.p = 0, max.d = 0, trace = TRUE, ic = "bic")
## 
##  ARIMA(0,0,2) with non-zero mean : -92.96985
##  ARIMA(0,0,0) with non-zero mean : 91.34232
##  ARIMA(0,0,1) with non-zero mean : -20.36645
##  ARIMA(0,0,0) with zero mean     : 87.18735
##  ARIMA(0,0,2) with zero mean     : -97.28453
##  ARIMA(0,0,1) with zero mean     : -24.55961
## 
##  Best model: ARIMA(0,0,2) with zero mean
## Series: . 
## ARIMA(0,0,2) with zero mean 
## 
## Coefficients:
##          ma1     ma2
##       1.0832  0.6547
## s.e.  0.0638  0.0578
## 
## sigma^2 estimated as 0.02591:  log likelihood=56.02
## AIC=-106.04   AICc=-105.86   BIC=-97.28
ma(Data$Avg, 2, TRUE) %>% forecast(h=10)
## Warning in ets(object, lambda = lambda, biasadj = biasadj,
## allow.multiplicative.trend = allow.multiplicative.trend, : Missing values
## encountered. Using longest contiguous portion of time series
##     Point Forecast     Lo 80     Hi 80     Lo 95     Hi 95
## 137      0.8674882 0.7986313 0.9363452 0.7621807 0.9727958
## 138      0.8674882 0.7701147 0.9648618 0.7185683 1.0164082
## 139      0.8674882 0.7482325 0.9867440 0.6851023 1.0498742
## 140      0.8674882 0.7297847 1.0051918 0.6568889 1.0780876
## 141      0.8674882 0.7135318 1.0214447 0.6320321 1.1029444
## 142      0.8674882 0.6988379 1.0361386 0.6095599 1.1254166
## 143      0.8674882 0.6853255 1.0496510 0.5888944 1.1460821
## 144      0.8674882 0.6727485 1.0622280 0.5696594 1.1653170
## 145      0.8674882 0.6609358 1.0740407 0.5515935 1.1833830
## 146      0.8674882 0.6497631 1.0852134 0.5345064 1.2004701
ma(Data$Avg, 2, TRUE) %>% forecast(h=10) %>% autoplot() +
  ggtitle("Forecasts from MA(2)")
## Warning in ets(object, lambda = lambda, biasadj = biasadj,
## allow.multiplicative.trend = allow.multiplicative.trend, : Missing values
## encountered. Using longest contiguous portion of time series
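
Note that forecast::ma() computes a centred moving-average smoother of the series rather than fitting an MA(2) model: the smoother leaves NAs at the endpoints (hence the warning from ets()), and the constant point forecasts come from an ETS model fitted to the smoothed series, not from an MA(2) process. To forecast from the MA(2) model that auto.arima() actually selected, a sketch with Arima():

#Fit the zero-mean ARIMA(0,0,2), i.e. MA(2), model chosen above and forecast
#10 periods ahead.
Data$Avg %>%
  Arima(order = c(0, 0, 2), include.mean = FALSE) %>%
  forecast(h = 10) %>%
  autoplot() +
  ggtitle("Forecasts from ARIMA(0,0,2) with zero mean")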