Forecasts were generated with the auto.arima model from the forecast package (Hyndman).

1. Data

  • 3003 time series from the M3 competition (yearly, quarterly, monthly and other data)

  • The data are stored in the Mcomp package

library(Mcomp)
data(M3)
print(M3)
## M-Competition data: 3003 time series 
## 
##            Type of data
## Period      DEMOGRAPHIC FINANCE INDUSTRY MACRO MICRO OTHER Total
##   MONTHLY           111     145      334   312   474    52  1428
##   OTHER               0      29        0     0     4   141   174
##   QUARTERLY          57      76       83   336   204     0   756
##   YEARLY            245      58      102    83   146    11   645
##   Total             413     308      519   731   828   204  3003
  • 645 yearly time series: from 1 to 645
  • 756 quarterly time series: from 646 to 1401
  • 1428 monthly time series: from 1402 to 2829
  • 174 other time series: from 2830 to 3003
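
The index ranges follow directly from the per-period totals in the table above. A minimal sketch in base R, assuming M3 stores the series in the order yearly, quarterly, monthly, other (as the examples in this section suggest); the variable names are illustrative only:

```r
# Series counts per period, taken from the "Total" column of the table above,
# in the order the series appear in M3.
counts <- c(YEARLY = 645, QUARTERLY = 756, MONTHLY = 1428, OTHER = 174)

# The last index of each block is the running total of the counts;
# the first index is one past the last index of the previous block.
last  <- cumsum(counts)
first <- last - counts + 1

ranges <- data.frame(period = names(counts), first = first, last = last,
                     row.names = NULL)
print(ranges)
```

The first monthly series therefore sits at index 1402, since the quarterly block ends at 645 + 756 = 1401.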

Format of M3: M3 is a list of 3003 series of class Mcomp. Each series within M3 is of class Mdata with the following structure:

  • sn: Name of the series

  • st: Series number and period. For example “Y1” denotes first yearly series, “Q20” denotes 20th quarterly series and so on.

  • n: The number of observations in the time series

  • h: The number of required forecasts

  • period: Interval of the time series. Possible values are “YEARLY”, “QUARTERLY”, “MONTHLY” & “OTHER”.

  • type: The type of series. Possible values for M3 are “DEMOGRAPHIC”, “FINANCE”, “INDUSTRY”, “MACRO”, “MICRO”, “OTHER”.

  • description: A short description of the time series

  • x: A time series of length n (the historical data)

  • xx: A time series of length h (the future data)

For example:

  • First of the yearly time series:
M3[[1]]
## Series: Y1
## Type of series: MICRO
## Period of series: YEARLY
## Series description: SALES ( CODE= ABT)
## 
## HISTORICAL data
## Time Series:
## Start = 1975 
## End = 1988 
## Frequency = 1 
##  [1]  940.66 1084.86 1244.98 1445.02 1683.17 2038.15 2342.52 2602.45
##  [9] 2927.87 3103.96 3360.27 3807.63 4387.88 4936.99
## 
## FUTURE data
## Time Series:
## Start = 1989 
## End = 1994 
## Frequency = 1 
## [1] 5379.75 6158.68 6876.58 7851.91 8407.84 9156.01
  • First of the quarterly time series:
M3[[646]]
## Series: Q1
## Type of series: MICRO
## Period of series: QUARTERLY
## Series description: ASSETS-TOTAL QUANTITY (CODE= AA )
## 
## HISTORICAL data
##         Qtr1    Qtr2    Qtr3    Qtr4
## 1984 3142.63 3190.75 3178.69 3170.94
## 1985 3124.38 3170.00 3200.94 3176.75
## 1986 3170.44 3268.67 3198.25 3383.35
## 1987 3389.78 3368.60 3383.70 4950.95
## 1988 5086.10 5203.95 5302.75 5268.75
## 1989 5406.85 5472.50 5656.40 5770.30
## 1990 5677.20 5725.85 5742.00 5706.60
## 1991 5591.95 5605.15 5630.00 5589.20
## 1992 5551.25 5592.15 5481.60 5511.55
## 
## FUTURE data
##         Qtr1    Qtr2    Qtr3    Qtr4
## 1993 5531.50 5670.60 5730.00 5798.45
## 1994 5809.05 5707.05 5661.75 6176.60
  • First of the monthly time series:
M3[[1402]]
## Series: M1
## Type of series: MICRO
## Period of series: MONTHLY
## Series description: SHIPMENTS (Code TD-30EXP)
## 
## HISTORICAL data
##       Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
## 1990 2640 2640 2160 4200 3360 2400 3600 1920 4200 4560  480 3720
## 1991 5640 2880 1800 3120 2400 2520 9000 2640 3120 2880 8760 5160
## 1992 2160 8280 4920 3120 6600 4080 5880 1680 6720 2040 6480 1920
## 1993 3600 2040 2760 3840  960 2280 1320 2160 4800 3000 3120 5880
## 1994 2640 2400                                                  
## 
## FUTURE data
##       Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
## 1994           2280  480 5040 1920  840 2520 1560 1440  240 1800
## 1995 4680 1800 1680 3720 2160  480 2040 1440
  • First of the other time series:
M3[[2830]]
## Series: O1
## Type of series: MICRO
## Period of series: OTHER
## Series description: Weekly delivery
## 
## HISTORICAL data
## Time Series:
## Start = 1 
## End = 96 
## Frequency = 1 
##  [1] 3060.42 3021.19 3301.13 3287.03 3080.71 3160.68 3077.17 3023.04
##  [9] 2976.01 3199.55 3219.28 3266.87 3228.60 3108.45 3052.15 3081.10
## [17] 3072.48 2989.99 2983.53 2978.99 3075.09 3259.02 3252.64 3197.50
## [25] 3095.91 3213.78 3174.21 3196.42 3752.91 4238.37 4140.09 3959.70
## [33] 3892.21 3892.16 3757.58 3738.49 3626.78 3658.79 3723.11 3710.43
## [41] 3640.67 3676.83 3642.97 3676.80 3635.68 3687.71 3649.13 3857.15
## [49] 3930.03 3737.96 3774.24 3767.55 3699.60 3620.30 3641.00 3656.02
## [57] 3710.94 3913.79 3829.84 3856.37 3712.15 3829.26 4048.86 4420.14
## [65] 4603.30 4738.44 4762.83 4812.26 4816.10 4680.41 4521.21 4586.80
## [73] 4585.28 4556.63 4292.69 4261.42 4235.59 4258.32 4321.97 4245.44
## [81] 4274.12 4149.42 4155.70 4117.26 4204.07 4041.04 4157.26 4141.84
## [89] 4183.90 4215.39 4138.89 4031.36 4339.38 4509.34 4547.25 4542.51
## 
## FUTURE data
## Time Series:
## Start = 97 
## End = 104 
## Frequency = 1 
## [1] 4381.08 4405.63 4377.02 4371.18 4255.07 4285.44 4260.68 4249.63

2. Setup for fitting the auto.arima model

  • Training set: the historical data (i.e. x)

  • Forecast origin: the last point of the historical data (x)

  • Forecast horizon: the number of required forecasts (i.e. h)

  • Test set: the future data (i.e. xx)

  • PIs: 80% and 95%
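
The setup above can be read straight off the fields of one series. The sketch below uses a hand-built stand-in for series Y1 (values copied from section 1) so that it runs in base R without Mcomp; the object `s` and the variable names are illustrative, and with Mcomp loaded `s <- M3[[1]]` should expose the same fields.

```r
# Hand-built stand-in for M3[[1]] (series Y1), using the values printed
# in section 1. Only the fields needed for the setup are included.
s <- list(
  st = "Y1",
  h  = 6,
  x  = ts(c(940.66, 1084.86, 1244.98, 1445.02, 1683.17, 2038.15, 2342.52,
            2602.45, 2927.87, 3103.96, 3360.27, 3807.63, 4387.88, 4936.99),
          start = 1975),
  xx = ts(c(5379.75, 6158.68, 6876.58, 7851.91, 8407.84, 9156.01),
          start = 1989)
)

train   <- s$x                                        # training set
origin  <- as.numeric(time(train))[length(train)]     # forecast origin
horizon <- s$h                                        # forecast horizon
test    <- s$xx                                       # test set

# Sanity checks: the test set has exactly h points and starts one
# sampling interval after the forecast origin.
stopifnot(length(test) == horizon)
stopifnot(isTRUE(all.equal(as.numeric(time(test))[1],
                           origin + deltat(train))))

cat("origin:", origin, " horizon:", horizon, "\n")
```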

2.1 For yearly data:

library(forecast)
library(Mcomp)
# create an empty list to store the forecasting results
list.out <- list()
# Forecasting with auto.arima
# the forecasts are stored as a list of data frames;
# each data frame holds the forecasts for one series
for (i in 1:645) {
  fit <- auto.arima(M3[[i]]$x)
  fc <- forecast(fit, h=M3[[i]]$h, level=c(80, 95))
  df <- as.data.frame(fc)
  names(df) <- c("forecast", "lo80", "hi80", "lo95", "hi95")
  df$series_id <- M3[[i]]$st
  df$method <- "ARIMA"
  df$timestamp <- time(M3[[i]]$xx)
  df$origin_timestamp <- time(M3[[i]]$x)[length(time(M3[[i]]$x))]
  df$horizon <- 1:M3[[i]]$h
  list.out[[i]] <- df
}

# combine the list into a single data frame
out <- do.call("rbind", list.out)

Result (rows for series Y1 shown):

##   forecast     lo80     hi80     lo95      hi95 series_id method timestamp
## 1  5486.10 5363.602 5608.598 5298.756  5673.444        Y1  ARIMA      1989
## 2  6035.21 5761.297 6309.123 5616.295  6454.125        Y1  ARIMA      1990
## 3  6584.32 6125.975 7042.665 5883.342  7285.298        Y1  ARIMA      1991
## 4  7133.43 6462.482 7804.378 6107.303  8159.557        Y1  ARIMA      1992
## 5  7682.54 6774.072 8591.008 6293.158  9071.922        Y1  ARIMA      1993
## 6  8231.65 7063.095 9400.205 6444.500 10018.800        Y1  ARIMA      1994
##   origin_timestamp horizon
## 1             1988       1
## 2             1988       2
## 3             1988       3
## 4             1988       4
## 5             1988       5
## 6             1988       6
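
One possible use of this table is to compare the forecasts with the held-out future data (xx). The sketch below is only an illustration in base R: the numbers are copied from the Y1 rows above and from the FUTURE data printed in section 1, and the column names `abs_err`, `inside80` and `inside95` are made up for this example.

```r
# Actual future values of Y1 (the xx field printed in section 1).
actual <- c(5379.75, 6158.68, 6876.58, 7851.91, 8407.84, 9156.01)

# Point forecasts and interval bounds for Y1, copied from the table above.
fc <- data.frame(
  forecast = c(5486.10, 6035.21, 6584.32, 7133.43, 7682.54, 8231.65),
  lo80 = c(5363.602, 5761.297, 6125.975, 6462.482, 6774.072, 7063.095),
  hi80 = c(5608.598, 6309.123, 7042.665, 7804.378, 8591.008, 9400.205),
  lo95 = c(5298.756, 5616.295, 5883.342, 6107.303, 6293.158, 6444.500),
  hi95 = c(5673.444, 6454.125, 7285.298, 8159.557, 9071.922, 10018.800)
)

# Absolute error of the point forecast and interval hits per horizon.
check <- data.frame(
  horizon  = seq_along(actual),
  abs_err  = abs(actual - fc$forecast),
  inside80 = actual >= fc$lo80 & actual <= fc$hi80,
  inside95 = actual >= fc$lo95 & actual <= fc$hi95
)
print(check)
```

For Y1, all six actual values fall inside the 95% interval and five of six fall inside the 80% interval (the value at horizon 4 lies above hi80). A single series says little about calibration; the same check would have to be averaged over many series to be meaningful.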

The procedure for quarterly, monthly and other data is analogous.
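
To avoid repeating the loop for each period, the per-series step can be wrapped in a function. This is a sketch rather than the code actually used here: `forecast_one` is a hypothetical helper name, and the demonstration runs on a hand-built stand-in for Y1 (values from section 1) so that it does not need Mcomp.

```r
library(forecast)

# Forecast one series-like object (a list with fields st, h, x, xx, as in
# Mcomp) and return a data frame in the same layout as in section 2.1.
# Assumes the two interval levels are 80% and 95%, in that order.
forecast_one <- function(s) {
  fit <- auto.arima(s$x)
  fc  <- forecast(fit, h = s$h, level = c(80, 95))
  data.frame(
    forecast         = as.numeric(fc$mean),
    lo80             = as.numeric(fc$lower[, 1]),
    hi80             = as.numeric(fc$upper[, 1]),
    lo95             = as.numeric(fc$lower[, 2]),
    hi95             = as.numeric(fc$upper[, 2]),
    series_id        = s$st,
    method           = "ARIMA",
    timestamp        = as.numeric(time(s$xx)),
    origin_timestamp = as.numeric(time(s$x))[length(s$x)],
    horizon          = seq_len(s$h)
  )
}

# Stand-in for M3[[1]] built from the values printed in section 1.
y1 <- list(
  st = "Y1", h = 6,
  x  = ts(c(940.66, 1084.86, 1244.98, 1445.02, 1683.17, 2038.15, 2342.52,
            2602.45, 2927.87, 3103.96, 3360.27, 3807.63, 4387.88, 4936.99),
          start = 1975),
  xx = ts(c(5379.75, 6158.68, 6876.58, 7851.91, 8407.84, 9156.01),
          start = 1989)
)
res <- forecast_one(y1)
print(res)

# With Mcomp loaded, the other periods would use the index ranges from
# section 1, for example:
# out_q <- do.call("rbind", lapply(M3[646:1401],  forecast_one))  # quarterly
# out_m <- do.call("rbind", lapply(M3[1402:2829], forecast_one))  # monthly
# out_o <- do.call("rbind", lapply(M3[2830:3003], forecast_one))  # other
```

For quarterly and monthly series the timestamp columns come out as fractional years (for example 1993.25 for the second quarter of 1993), because that is how time() represents ts objects with frequency above 1.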