opera
Abstract
This is an undergrad student level instruction for class use. We try to ilustrate time series methods with hybrid models usingopera
package.
This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
License: CC BY-SA 4.0
Sugestão de citação: FIGUEIREDO, Adriano Marcos Rodrigues. Séries Temporais com R: Modelo híbrido em opera
. Campo Grande-MS,Brasil: RStudio/Rpubs, 2019. Disponível em http://rpubs.com/amrofi/Time_Series_Hybrid_opera.
Este artigo procura ilustrar os procedimentos para trabalhar com métodos de séries temporais para Modelo híbrido em opera
. Baseado em postagem de Rob J Hyndman (HYNDMAN, 2016).
A combinação de forecasts, ou também chamado de modelo híbrido de previsão, tem por ideia básica a combinação de forecasts de diferentes métodos. Por exemplo, na postagem de Peter Ellis (ELLIS, 2015), mostra-se que os modelos com combinação de forecasts podem ter melhores resultados que os melhores modelos da competição M3 de forecasts (competição de Spyros Makridakis e Michele Hibon, em que testaram vários métodos de previsão para 3.003 séries temporais) (Figura 1).
Comparação de métodos de previsão da competição M3. Fonte: ELLIS, 2015
O nome do pacote opera
significa “Online Prediction by ExpeRt Aggregation” (Previsão Online por Agregação por ExpeRt). Foi escrito por Pierre Gaillard e Yannig Goude e será aqui considerado para uma previsão simples de série temporal univariada mensal de dados de consumo no varejo de São Paulo de Morettin e Toloi (2005).
library(forecast)
library(ggplot2)
library(readxl)
library(seasonal)
dados <- structure(list(obs = structure(c(441763200, 444441600, 446947200, 449625600,
452217600, 454896000, 457488000, 460166400, 462844800, 465436800, 468115200,
470707200, 473385600, 476064000, 478483200, 481161600, 483753600, 486432000,
489024000, 491702400, 494380800, 496972800, 499651200, 502243200, 504921600,
507600000, 510019200, 512697600, 515289600, 517968000, 520560000, 523238400,
525916800, 528508800, 531187200, 533779200, 536457600, 539136000, 541555200,
544233600, 546825600, 549504000, 552096000, 554774400, 557452800, 560044800,
562723200, 565315200, 567993600, 570672000, 573177600, 575856000, 578448000,
581126400, 583718400, 586396800, 589075200, 591667200, 594345600, 596937600,
599616000, 602294400, 604713600, 607392000, 609984000, 612662400, 615254400,
617932800, 620611200, 623203200, 625881600, 628473600, 631152000, 633830400,
636249600, 638928000, 641520000, 644198400, 646790400, 649468800, 652147200,
654739200, 657417600, 660009600, 662688000, 665366400, 667785600, 670464000,
673056000, 675734400, 678326400, 681004800, 683683200, 686275200, 688953600,
691545600, 694224000, 696902400, 699408000, 702086400, 704678400, 707356800,
709948800, 712627200, 715305600, 717897600, 720576000, 723168000, 725846400,
728524800, 730944000, 733622400, 736214400, 738892800, 741484800, 744163200,
746841600, 749433600, 752112000, 754704000, 757382400, 760060800, 762480000,
765158400, 767750400, 770428800, 773020800, 775699200, 778377600, 780969600,
783648000, 786240000, 788918400, 791596800, 794016000, 796694400, 799286400,
801964800, 804556800, 807235200, 809913600, 812505600, 815184000, 817776000,
820454400, 823132800, 825638400, 828316800, 830908800, 833587200, 836179200,
838857600, 841536000, 844128000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
t = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53,
54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87,
88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103,
104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117,
118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131,
132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145,
146, 147, 148, 149, 150, 151, 152, 153, 154), consumo = c(114.13, 110.79,
116.46, 111.57, 120.66, 121.15, 121.27, 127.02, 129.04, 133.3, 130.6,
179.39, 120.64, 114.05, 130.6, 118.26, 145.54, 135.13, 153.35, 159.95,
150.01, 164.93, 170.37, 220.96, 134.26, 133.11, 147.84, 164.46, 181.86,
170.44, 186.64, 174.21, 181.62, 194.16, 181.9, 232.01, 140.16, 130.78,
119.04, 120.73, 129.81, 111.04, 122.75, 133.95, 125.41, 132.05, 129.54,
176.37, 110.09, 113.25, 124.03, 110.63, 116.72, 124.63, 124.38, 130.27,
119.87, 115.75, 122.44, 162.43, 105.89, 115.59, 147, 131.7, 131.32,
136.66, 126.43, 134.88, 128.26, 125.32, 124.61, 166.11, 116.25, 96.93,
89.27, 101.87, 125.57, 113.31, 109.39, 127.33, 120.56, 117.73, 113.81,
147.25, 100.15, 95.11, 112.26, 109.39, 114.2, 113.8, 126.47, 128.36,
115.71, 116.09, 99.53, 127.27, 87.08, 85.67, 82.02, 98.2, 96.44, 90.23,
97.15, 95.08, 94, 93, 96.09, 129.21, 75.39, 77.7, 97.34, 84.97, 87.55,
86.64, 90.52, 95.4, 95.2, 95.8, 101.23, 128.49, 85.63, 82.77, 96.55,
81.33, 96.91, 83.76, 90.19, 114.84, 108.4, 106.05, 109.71, 143.86, 99.12,
99.28, 114.75, 106.13, 110.02, 108.07, 112.52, 113.87, 107.84, 112.12,
112.03, 139.37, 92.24, 93.56, 107.37, 102.89, 114.78, 102.88, 118.41,
119.23, 117.36, 122.06)), row.names = c(NA, -154L), class = c("tbl_df",
"tbl", "data.frame"))
attach(dados)
dados2 <- matrix(consumo)
dados.ts <- ts(dados2, start = c(1984, 1), frequency = 12)
train <- window(dados.ts, end = c(1994, 3))
test <- window(dados.ts, start = c(1994, 4))
h <- length(test)
# construir forecasts dos experts
ETS <- forecast(ets(train), h = h)
ARIMA <- forecast(auto.arima(train, stepwise = FALSE, approximation = FALSE),
h = h)
STL <- stlf(train, lambda = 0, h = h)
# ajuste calendário brasileiro
library(RCurl)
usingR_url_wd <- getURL("https://raw.githubusercontent.com/pedrocostaferreira/Articles/master/using-R-to-teach-seasonal-adjustment/work_days.csv")
wd <- read.csv2(text = usingR_url_wd)
head(wd)
Workdays_ok <- ts(wd[, 5], start = c(1970, 1), freq = 12)
# construção feriados
usingR_url_mh <- getURL("https://raw.githubusercontent.com/pedrocostaferreira/Articles/master/using-R-to-teach-seasonal-adjustment/moving_holidays.csv")
mh <- read.csv2(text = usingR_url_mh)
head(mh) # datas de Pascoa e Carnaval
# Pascoa (DOMINGO)
dates.easter <- as.Date(as.character(mh$Easter), "%d/%m/%Y")
easter <- genhol(dates.easter, start = -8, end = 1, frequency = 12)
# Carnaval (TERCA-FEIRA DE CARNAVAL)
dates.carnival <- as.Date(as.character(mh$Carnival), "%d/%m/%Y")
carnival <- genhol(dates.carnival, start = -3, end = 1, frequency = 12, center = "calendar")
# carnival <- genhol(dates.carnival, start = -3, end = 1, frequency = 12)
regs <- na.omit(cbind(Workdays_ok, easter, carnival))
x13 <- series(seas(train, xreg = regs, regression.aictest = NULL, regression.usertype = "holiday",
forecast.save = "forecasts"), c("forecast.forecasts"))
# agregar experts
X <- cbind(ETS = ETS$mean, ARIMA = ARIMA$mean, STL = STL$mean, x13 = x13[1:31,
1])
df <- cbind(dados.ts, X)
colnames(df) <- c("Dados", "ETS", "ARIMA", "STL", "x13")
autoplot(df) + xlab("Ano") + ylab(expression("Consumo no varejo de São Paulo"))
Vou olhar o oráculo das previsões.
require(opera)
oracle.convex <- oracle(Y = test, experts = X, loss.type = "square", model = "convex")
plot(oracle.convex)
oracle.convex
Call:
oracle.default(Y = test, experts = X, model = "convex", loss.type = "square")
Coefficients:
ETS ARIMA STL x13
0 1 0 -1.68e-16
rmse mape
Best expert oracle: 8.71 0.0716
Uniform combination: 11.20 0.0962
Best convex oracle: 8.71 0.0716
Existe algum expert individualmente melhor ao longo do tempo? Existem quebras?
require(opera)
oracle.shift <- oracle(Y = test, experts = X, loss.type = "percentage", model = "shifting")
plot(oracle.shift)
oracle.shift
Call:
oracle.default(Y = test, experts = X, model = "shifting", loss.type = "percentage")
0 shifts 7 shifts 15 shifts 23 shifts 30 shifts
mape: 0.0716 0.0656 0.0671 0.069 0.0734
opera
A função de mistura do pacote opera
calcula pesos ao combinar as previsões com base no desempenho até esse período de tempo.
library(opera)
# iniciar regra de agregação ou mistura
MLpol0 <- mixture(model = "MLpol", loss.type = "square")
m1.MLpol <- MLpol0
for (i in 1:length(test)) {
m1.MLpol <- predict(m1.MLpol, newexperts = X[i, ], newY = test[i])
}
# fazer agregação diretamente
m3.MLpol <- mixture(Y = test, experts = X, model = "MLpol", loss.type = "square")
weights <- predict(m1.MLpol, X, test, type = "weights")
(weights)
ETS ARIMA STL x13
[1,] 0 1 0 0
[2,] 0 1 0 0
[3,] 0 1 0 0
[4,] 0 1 0 0
[5,] 0 1 0 0
[6,] 0 1 0 0
[7,] 0 1 0 0
[8,] 0 1 0 0
[9,] 0 1 0 0
[10,] 0 1 0 0
[11,] 0 1 0 0
[12,] 0 1 0 0
[13,] 0 1 0 0
[14,] 0 1 0 0
[15,] 0 1 0 0
[16,] 0 1 0 0
[17,] 0 1 0 0
[18,] 0 1 0 0
[19,] 0 1 0 0
[20,] 0 1 0 0
[21,] 0 1 0 0
[22,] 0 1 0 0
[23,] 0 1 0 0
[24,] 0 1 0 0
[25,] 0 1 0 0
[ reached getOption("max.print") -- omitted 6 rows ]
Ele começa com a ponderação já igual a 1 para o ARIMA e segue até o fim, sem misturar.
Aqui estão as previsões resultantes:
z <- ts(predict(m3.MLpol, X, test, type = "response"), start = c(1994, 4), freq = 12)
df <- cbind(dados.ts, z)
colnames(df) <- c("Dados", "Mistura")
autoplot(df) + xlab("Ano") + ylab(expression("Consumo no varejo de São Paulo"))
ELLIS, Peter. X13-SEATS-ARIMA as an automated forecasting tool. 2015. Disponível em: https://www.r-bloggers.com/x13-seats-arima-as-an-automated-forecasting-tool/ e http://freerangestats.info/blog/2015/12/21/m3-and-x13.
GAILLARD, Pierre; GOUDE, Yannig. opera: Online Prediction by Expert Aggregation. R package version 1.0. 2016. Disponível em: https://CRAN.R-project.org/package=opera.
HYNDMAN, Rob J. R packages for forecast combinations. 2016. Disponível em:https://robjhyndman.com/hyndsight/forecast-combinations/
SHAUB, David; ELLIS, Peter. forecastHybrid: Convenient Functions for Ensemble Time Series Forecasts. R package version 3.0.14. 2018. Disponível em: https://CRAN.R-project.org/package=forecastHybrid.
SHAUB, David; ELLIS, Peter.forecastHybrid: vignette. 2016. Disponível em: https://cran.r-project.org/web/packages/forecastHybrid/vignettes/forecastHybrid.html.