1. For the TSTS schema:

1.1 Load data:

ts <- read.csv("https://raw.githubusercontent.com/forvis/forvision_data/master/M3_yearly_TSTS.csv")
head(ts)
##   series_id category   value timestamp
## 1        Y1    MICRO  940.66      1975
## 2        Y1    MICRO 1084.86      1976
## 3        Y1    MICRO 1244.98      1977
## 4        Y1    MICRO 1445.02      1978
## 5        Y1    MICRO 1683.17      1979
## 6        Y1    MICRO 2038.15      1980

1.2 validateTSTS():

library(forvision)
validateTSTS(ts)
## [1] TRUE
  • If a required column is missing:
ts1 <- ts
ts1$timestamp <- NULL
validateTSTS(ts1)

Error in validateTSTS(ts1) : Check the column names of input data frame. The input data needed in the form of a data frame containing columns named ‘series_id’, ‘timestamp’, and ‘value’.
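
The error above names three required columns. As a rough sketch in base R (not forvision's actual implementation), the helper below reports which of them are absent. The name missing_tsts_columns is hypothetical and is used only for illustration.

```r
# Hypothetical helper, not part of forvision: report which required
# TSTS columns are absent from a data frame.
missing_tsts_columns <- function(df, required = c("series_id", "timestamp", "value")) {
  setdiff(required, names(df))
}

# Toy data in the TSTS layout (first two rows of the head(ts) output).
ts_toy <- data.frame(
  series_id = c("Y1", "Y1"),
  category  = c("MICRO", "MICRO"),
  value     = c(940.66, 1084.86),
  timestamp = c(1975, 1976)
)

missing_tsts_columns(ts_toy)   # character(0): nothing is missing

# Drop a required column, as in the example above.
ts_toy$timestamp <- NULL
missing_tsts_columns(ts_toy)   # "timestamp"
```

Running a check like this before validateTSTS() shows exactly which column to restore.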

1.3 getSeriesSummary():

getSeriesSummary(ts)
##   Time series data summary
##   ========================
##   Number of time series:  645
##   Total number of observations: 18319
##   Timestamp range: from 1811 to 2001
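
The three figures in this summary look like plain aggregates over the TSTS columns. The sketch below computes the same kind of numbers in base R on a toy data frame. It assumes, without having checked the source of getSeriesSummary(), that this is roughly what the function does. The Y2 row is made up for illustration.

```r
# Toy TSTS data: two series, three observations (the Y2 row is illustrative).
ts_toy <- data.frame(
  series_id = c("Y1", "Y1", "Y2"),
  value     = c(940.66, 1084.86, 1833.00),
  timestamp = c(1975, 1976, 1975)
)

n_series <- length(unique(ts_toy$series_id))  # number of distinct series
n_obs    <- nrow(ts_toy)                      # total number of observations
ts_range <- range(ts_toy$timestamp)           # earliest and latest timestamp

cat("Number of time series: ", n_series, "\n")
cat("Total number of observations:", n_obs, "\n")
cat("Timestamp range: from", ts_range[1], "to", ts_range[2], "\n")
```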

2. For the FTS schema:

2.1 Load data:

fs <- read.csv("https://raw.githubusercontent.com/forvis/forvision_data/master/M3_yearly_FTS.csv")
head(fs)
##   series_id category method forecast horizon timestamp origin_timestamp
## 1        Y1    MICRO NAIVE2  4936.99       1      1989             1988
## 2        Y1    MICRO NAIVE2  4936.99       2      1990             1988
## 3        Y1    MICRO NAIVE2  4936.99       3      1991             1988
## 4        Y1    MICRO NAIVE2  4936.99       4      1992             1988
## 5        Y1    MICRO NAIVE2  4936.99       5      1993             1988
## 6        Y1    MICRO NAIVE2  4936.99       6      1994             1988

2.2 validateFTS():

library(forvision)
validateFTS(fs)

column horizon must be numeric
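
One possible reason for this message: read.csv() stores a whole-number column as integer, and a strict class check would reject that. This is an assumption, since the source of validateFTS() is not shown here. The sketch below uses a toy CSV (not the M3 file) to show the column class before and after conversion.

```r
# Toy CSV with a whole-number horizon column, parsed the same way as the FTS file.
fs_toy <- read.csv(text = "series_id,horizon\nY1,1\nY1,2")

class(fs_toy$horizon)        # "integer"
is.numeric(fs_toy$horizon)   # TRUE: is.numeric() accepts integer vectors

# as.numeric() converts the column to double storage.
fs_toy$horizon <- as.numeric(fs_toy$horizon)
class(fs_toy$horizon)        # "numeric"
```

If that assumption holds, a check based on is.numeric() would accept the column as read, without the extra conversion.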

  • Converting this column to numeric:
fs$horizon <- as.numeric(fs$horizon)
validateFTS(fs)

Error in validateFTS(fs) : Check the input data frame. The input data frame has duplicate records for the columns series_id, origin_timestamp, timestamp and horizon. Indices of duplicated rows: 7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241

This is what I was talking about yesterday: within a single time series, the same combination of series_id, origin_timestamp, timestamp and horizon occurs in more than one row.

So method needs to be added to the duplicate check as well, that is, the checked column combinations become:

  • method, series_id, origin_timestamp, timestamp and horizon

  • method, series_id, timestamp and horizon

  • method, series_id, origin_timestamp, and horizon

After I added method to these combinations:

fs$horizon <- as.numeric(fs$horizon)
validateFTS(fs)
## [1] TRUE
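
The duplicate problem described above can be reproduced on a toy data frame. This is a minimal sketch in base R, not the code inside validateFTS(). The method name OTHER and its forecast values are made up. With several methods in one file, a key without method flags the later methods' rows as duplicates, while a key that includes method does not.

```r
# Toy FTS rows: two methods forecasting the same series from the same origin.
# "OTHER" and its forecast values are illustrative, not real M3 rows.
fs_toy <- data.frame(
  series_id        = "Y1",
  method           = c("NAIVE2", "NAIVE2", "OTHER", "OTHER"),
  forecast         = c(4936.99, 4936.99, 5000, 5100),
  horizon          = c(1, 2, 1, 2),
  timestamp        = c(1989, 1990, 1989, 1990),
  origin_timestamp = 1988
)

key_without_method <- c("series_id", "origin_timestamp", "timestamp", "horizon")
key_with_method    <- c("method", key_without_method)

# duplicated() on a data frame flags rows that repeat an earlier row.
dups_without <- which(duplicated(fs_toy[, key_without_method]))
dups_with    <- which(duplicated(fs_toy[, key_with_method]))

dups_without   # 3 4: the second method's rows collide with the first method's
dups_with      # integer(0): no duplicates once method is part of the key
```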

2.3 getForecastSummary():

fs$horizon <- as.numeric(fs$horizon)
getForecastSummary(fs)
##   Forecast data summary
##   ========================
##   Number of time series:  645
##   Total number of observations: 85140
##   Timestamp range: from 1841 to 2001
##   Number of methods:  22
##   Total number of forecasts: 85140
##   Horizon range: from 1 to 6