Abstract
This is an exercise for class use. We analyse data on Retail consumption, from January/2000 to nowadays.This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
License: CC BY-SA 4.0
Sugestão de citação: FIGUEIREDO, Adriano Marcos Rodrigues. Séries Temporais com R: Análise ARIMA do Consumo do Varejo em MS com X13ARIMA-SEATS. Campo Grande-MS,Brasil: RStudio/Rpubs, 2020. Disponível em http://rpubs.com/amrofi/x13arima_seats_varejoms.
Neste arquivo utilizo a série do Índice de volume de vendas no varejo Total de Mato Grosso do Sul, série mensal a partir de jan/2000 até jul/2019 obtida com o pacote BETS
e importada do Banco Central do Brasil. Portanto, são 235 observações mensais.
Farei de duas formas para o leitor. Uma carrega direto do site do Banco Central do Brasil com o pacote BETS
(FERREIRA, SPERANZA e COSTA, 2018) e a outra eu gerei a estrutura idêntica pela função dput()
para os leitores que não conseguirem por qualquer motivo o acesso ao site do Banco Central (as vezes vejo isso ocorrer dependendo dos bloqueios da sua rede de internet). A forma pelo dput assume o nome varejoms2 enquanto a extraída pelo BETS tem nome varejoms. Esclareço ao leitor que após baixar a série pelo BETS, fiz o dput e a partir de então, desabilitei o bloco (Chunk
) que acessa o BETS apenas para agilizar os cálculos.
library(BETS)
# Pegando as séries a partir do site do Banco Central do Brasil
# Índice de volume de vendas no varejo Total de Mato Grosso do Sul
# mensal a partir de jan/2000 até jul/2019
# 235 observações mensais
varejoms <- BETSget(1479)
print(varejoms)
class(varejoms)
dput(varejoms) # opção para ter os dados como na structure abaixo
suppressMessages(library(fpp2))
suppressMessages(library(tseries))
suppressMessages(library(zoo))
suppressMessages(library(forecast))
suppressMessages(library(ggplot2))
# Pegando as séries a partir do site do Banco Central do Brasil Índice de volume
# de vendas no varejo Total de Mato Grosso do Sul mensal a partir de jan/2000 até
# jul/2019 Loading the dataset library(readxl) dados <-
# read_excel('dados.xlsx',sheet = 'dados') attach(dados) library(BETS)
# Pegando as séries a partir do site do Banco Central do Brasil Índice de volume
# de vendas no varejo Total de Mato Grosso do Sul mensal a partir de jan/2000 até
# jul/2019 (em 12.09.2019)
# varejoms <- BETSget(1479)
varejoms <- structure(c(35.2, 35.6, 39.2, 40.5, 41.6, 40.4, 40.8, 38.7, 37.3, 37.6,
35.6, 47.4, 34.2, 32.2, 38, 37.5, 38.8, 35, 38.4, 40.2, 38.2, 39.3, 36, 46.3,
36.4, 34, 39, 37.8, 38.9, 35.3, 37.2, 38.1, 35.6, 38.3, 35.6, 45.7, 32.2, 31.6,
35.2, 36.8, 37.5, 34.8, 38.4, 38.1, 37, 39, 37.3, 49, 35.8, 35.3, 40.2, 41.3,
43.9, 41.6, 45.9, 42.3, 42.2, 44, 41.3, 56.9, 38.5, 38.5, 45.3, 43.6, 46.2, 44.3,
47.5, 46.5, 46.4, 46, 44.1, 61, 42, 40.2, 44.6, 44.5, 47.8, 45.3, 46.5, 48.5,
47.7, 50.2, 49.3, 64.5, 47, 46.8, 51, 50.5, 55, 51.3, 52.8, 55.3, 54.8, 55.6,
55.3, 72.2, 54.5, 52.1, 56.2, 57.2, 60.8, 56.1, 61.8, 61.6, 59.8, 63.3, 57.7,
77.4, 61.4, 51.9, 57.3, 57.9, 61.9, 57.3, 61.1, 61.1, 60.6, 65.6, 63.5, 83.1,
64.1, 60.2, 67.8, 67.1, 72.8, 68.5, 71.1, 69.3, 69.9, 71.1, 67.9, 92.7, 67.5,
64.8, 69.1, 69.4, 79.6, 70.2, 73.8, 72.5, 71.3, 75.6, 74.7, 100.8, 79.5, 75.7,
82.4, 78, 84.8, 83.2, 84.8, 88.5, 86.3, 91.7, 92.8, 111.4, 92.8, 83.7, 92.5,
88.3, 93.9, 88.8, 96, 95.9, 93.2, 98.3, 100.5, 128.8, 97.2, 90.2, 94.3, 94.4,
101.1, 92, 96.4, 98.2, 97.6, 105.8, 103.1, 129.6, 99.6, 87.8, 97, 94.8, 98.6,
93.4, 98.4, 96.4, 92.5, 100.6, 97.2, 124.5, 91.5, 85.1, 91.6, 88.5, 92.2, 87.4,
90.5, 88.1, 85.2, 89.4, 93.4, 116.9, 90.8, 84, 89.7, 86.3, 90, 87.3, 90.8, 93.5,
93.7, 91.4, 93.5, 114.1, 87.8, 81.1, 94.5, 83.2, 89.9, 88.8, 89.3, 93.7, 93.5,
96.3, 101.3, 118.3, 93.8, 85.2, 90, 86.6, 90, 85.2, 90.9), .Tsp = c(2000, 2019.5,
12), class = "ts")
suppressMessages(library(seasonal)) #chamando o pacote
checkX13() #checagem se o x13 está operacional no RStudio
consumo.ts <- varejoms
plot(consumo.ts, xlab = "Tempo")
abline(h = seq(40, 230, 5), v = seq(2000, 2017, 1), lty = 3, col = "darkgrey")
monthplot(consumo.ts, labels = month.abb, lty.base = 2)
legend("topleft", legend = c("consumo/mes", "media/mes"), cex = 0.8, lty = c(1, 2),
bty = "n")
options(digits = 6)
library(seasonal)
ajuste <- seas(x = consumo.ts)
Neste caso, o programa faz as seguintes avaliações no automático: 1. Verificar teste de sazonalidade QS; A hipótese nula do teste QS é de não haver sazonalidade; - se P-value=0 => tem sazonalidade; 2. Diagnosticar pre-ajuste e modelo ARIMA; 3. Verificar indicios de sazonalidade ou efeitos de dias uteis graficamente; 4. Estabilidade do ajuste sazonal.
# specserie<-spec.ar(consumo.ts) specserie
qs(ajuste) # faz o teste QS para sazonalidade
qs p-val
qsori 401.53016 0.00000
qsorievadj 413.70084 0.00000
qsrsd 0.01129 0.99437
qssadj 0.00000 1.00000
qssadjevadj 0.00000 1.00000
qsirr 0.00000 1.00000
qsirrevadj 0.00000 1.00000
qssori 152.62432 0.00000
qssorievadj 157.51593 0.00000
qssrsd 0.02656 0.98681
qsssadj 0.00000 1.00000
qsssadjevadj 0.00000 1.00000
qssirr 0.00000 1.00000
qssirrevadj 0.00000 1.00000
summary(ajuste) # mostra os resultados da estimacao ARIMA automatica e outliers detectados
Call:
seas(x = consumo.ts)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
Mon -0.00750 0.00299 -2.51 0.0121 *
Tue 0.00415 0.00302 1.38 0.1689
Wed -0.00304 0.00302 -1.01 0.3136
Thu 0.00656 0.00301 2.18 0.0293 *
Fri 0.00564 0.00297 1.90 0.0576 .
Sat 0.00273 0.00301 0.91 0.3645
Easter[15] 0.02240 0.00603 3.71 0.0002 ***
AO2011.May 0.09444 0.02037 4.63 3.6e-06 ***
MA-Nonseasonal-01 0.25901 0.06416 4.04 5.4e-05 ***
MA-Seasonal-12 0.62511 0.05423 11.53 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
SEATS adj. ARIMA: (0 1 1)(0 1 1) Obs.: 235 Transform: log
AICc: 920, BIC: 956 QS (no seasonality in final): 0
Box-Ljung (no autocorr.): 30.4 Shapiro (normality): 0.995
final(ajuste) # mostra a serie ajustada
Jan Feb Mar Apr May Jun Jul Aug
2000 39.3300 39.5505 39.4395 40.1428 40.3741 41.2593 40.0219 38.0183
2001 37.9094 37.1737 37.9657 37.7401 37.3143 36.1452 37.7736 39.3231
2002 39.7687 39.1410 38.7496 38.4960 37.4410 37.0553 36.5095 37.0814
2003 35.2496 36.4043 36.3683 36.7113 35.8835 36.8316 37.1479 37.7011
2004 38.9672 39.6054 41.1016 41.3233 43.1107 43.1698 44.0771 42.4076
2005 42.8621 44.2698 45.2394 44.7964 45.3881 45.9583 46.5637 46.4023
2006 46.1595 46.0124 45.7515 45.6118 46.8711 46.7904 46.6203 47.8558
2007 50.6874 53.3260 51.5483 53.0344 53.3142 53.3089 53.1680 54.5771
Sep Oct Nov Dec
2000 37.8608 37.6293 37.7240 37.9952
2001 39.1803 38.9487 37.6451 37.4916
2002 36.8762 37.3688 37.1320 36.7815
2003 37.8514 38.0284 39.2811 39.0132
2004 42.8546 43.3296 43.2495 44.3708
2005 46.6471 45.8362 45.7397 47.0622
2006 48.1642 49.9190 50.7424 50.4282
2007 56.1715 55.0249 56.2822 57.2221
[ reached getOption("max.print") -- omitted 12 rows ]
plot(ajuste) # grafico da serie ajustada
Pode-se utilizar a especificação do static para ajustar manualmente a especificação do spec do X13 ARIMA-SEATS.O static mostra o que foi feito e facilita a alteração de algum item específico.
static(ajuste) # permite o ajuste manual do seas
seas(x = consumo.ts, regression.variables = c("td", "easter[15]",
"ao2011.May"), arima.model = "(0 1 1)(0 1 1)", regression.aictest = NULL,
outlier = NULL, transform.function = "log")
O static permite o ajuste manual do seas. Farei um cenário incluindo uma level shift em maio de 2016.
# cenario ajustemanual
ajustemanual <- seas(x = consumo.ts, regression.variables = c("td", "easter[15]",
"ao2011.May", "ls2016.May"), arima.model = "(0 1 1)(0 1 1)", regression.aictest = NULL,
outlier = NULL, transform.function = "log")
summary(ajustemanual)
Call:
seas(x = consumo.ts, transform.function = "log", regression.aictest = NULL,
outlier = NULL, regression.variables = c("td", "easter[15]",
"ao2011.May", "ls2016.May"), arima.model = "(0 1 1)(0 1 1)")
Coefficients:
Estimate Std. Error z value Pr(>|z|)
Mon -0.0074946 0.0029889 -2.508 0.012158 *
Tue 0.0041474 0.0030153 1.375 0.169000
Wed -0.0030434 0.0030201 -1.008 0.313592
Thu 0.0065613 0.0030159 2.176 0.029585 *
Fri 0.0056318 0.0029703 1.896 0.057956 .
Sat 0.0027212 0.0030235 0.900 0.368119
Easter[15] 0.0224081 0.0060488 3.705 0.000212 ***
AO2011.May 0.0944147 0.0203910 4.630 3.65e-06 ***
LS2016.May -0.0006483 0.0248004 -0.026 0.979145
MA-Nonseasonal-01 0.2591039 0.0641634 4.038 5.39e-05 ***
MA-Seasonal-12 0.6251594 0.0542302 11.528 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
SEATS adj. ARIMA: (0 1 1)(0 1 1) Obs.: 235 Transform: log
AICc: 922.2, BIC: 961.6 QS (no seasonality in final): 0
Box-Ljung (no autocorr.): 30.46 Shapiro (normality): 0.9947
# comparando com o ajuste automático, vejo que não ajudou com ls2016
summary(ajuste)
Call:
seas(x = consumo.ts)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
Mon -0.007496 0.002988 -2.509 0.012123 *
Tue 0.004148 0.003015 1.376 0.168933
Wed -0.003040 0.003017 -1.008 0.313621
Thu 0.006556 0.003009 2.179 0.029332 *
Fri 0.005635 0.002967 1.899 0.057568 .
Sat 0.002729 0.003009 0.907 0.364459
Easter[15] 0.022397 0.006031 3.713 0.000205 ***
AO2011.May 0.094435 0.020374 4.635 3.57e-06 ***
MA-Nonseasonal-01 0.259008 0.064165 4.037 5.42e-05 ***
MA-Seasonal-12 0.625106 0.054231 11.527 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
SEATS adj. ARIMA: (0 1 1)(0 1 1) Obs.: 235 Transform: log
AICc: 920, BIC: 956.2 QS (no seasonality in final): 0
Box-Ljung (no autocorr.): 30.43 Shapiro (normality): 0.9946
Comentário: embora tenha saído um ajuste, o programa não consegue visualizar adequadamente a time series, colocando ajustes de dias da semana de forma inadequada, uma vez que a série é mensal.
ajuste.fcst <- series(ajuste, c("forecast.forecasts"))
autoplot(ajuste.fcst)
Vou carregar a partir do sítio eletrônico do Prof. Pedro Costa Ferreira.
library(RCurl)
usingR_url_wd <- getURL("https://raw.githubusercontent.com/pedrocostaferreira/Articles/master/using-R-to-teach-seasonal-adjustment/work_days.csv")
wd <- read.csv2(text = usingR_url_wd)
head(wd)
# outra opcao é pegar direto do arquivo csv, dentro desta pasta uteis <-
# ts(du[,5], start = c(1970,1), freq = 12)
Workdays_ok <- ts(wd[, 5], start = c(1970, 1), freq = 12)
# building moving holidays (mh) feriados moveis construção dos feriados móveis -
# Pascoa, Carnaval, Natal e Ano Novo feriados <- read.csv2('feriados_moveis.csv')
# head(feriados)
usingR_url_mh <- getURL("https://raw.githubusercontent.com/pedrocostaferreira/Articles/master/using-R-to-teach-seasonal-adjustment/moving_holidays.csv")
mh <- read.csv2("feriados.csv")
# mh <- read.csv2(text = usingR_url_mh) outra opcao é pegar direto do arquivo
# csv, dentro desta pasta
head(mh) # Pascoa e Carnaval, Natal e Ano Novo
# Easter Pascoa
dates.easter <- as.Date(as.character(mh$Easter), "%d/%m/%Y")
easter <- genhol(dates.easter, start = -8, end = 1, frequency = 12)
# Carnival
dates.carnival <- as.Date(as.character(mh$Carnival), "%d/%m/%Y")
carnival <- genhol(dates.carnival, start = -3, end = 1, frequency = 12, center = "calendar")
# Ampliação para Natal e Ano Novo
# Natal
dates.natal <- as.Date(as.character(mh$Natal), "%d/%m/%Y")
natal <- genhol(dates.natal, start = -10, end = 5, frequency = 12, center = "calendar")
# NewYear
dates.anonovo <- as.Date(as.character(mh$NewYear), "%d/%m/%Y")
anonovo <- genhol(dates.anonovo, start = -10, end = 5, frequency = 12, center = "calendar")
regs <- na.omit(cbind(Workdays_ok, easter, carnival))
regs2 <- na.omit(cbind(Workdays_ok, easter))
head(regs)
FALSE Workdays_ok easter carnival
FALSE Jan 1970 -4.0 0 0.000000
FALSE Feb 1970 -3.5 0 0.203125
FALSE Mar 1970 -4.0 1 -0.203125
FALSE Apr 1970 -1.5 0 0.000000
FALSE May 1970 -11.0 0 0.000000
FALSE Jun 1970 2.0 0 0.000000
regsamp <- na.omit(cbind(Workdays_ok, easter, carnival, anonovo))
# obs: embora tenha criado esta rotina, o programa identificou problemas de
# multicolinearidade e conflito do anonovo com adittive outlier
head(regsamp)
FALSE Workdays_ok easter carnival anonovo
FALSE Jan 1970 -4.0 0 0.000000 0.00290698
FALSE Feb 1970 -3.5 0 0.203125 0.00000000
FALSE Mar 1970 -4.0 1 -0.203125 0.00000000
FALSE Apr 1970 -1.5 0 0.000000 0.00000000
FALSE May 1970 -11.0 0 0.000000 0.00000000
FALSE Jun 1970 2.0 0 0.000000 0.00000000
Agora faz-se novamente a estimação com os dados de dias úteis e feriados brasileiros. Primeiro, farei sem especificar os outliers.
ajuste2 <- seas(x = consumo.ts, xreg = regs, regression.aictest = NULL, regression.usertype = "holiday",
forecast.save = "forecasts")
qs(ajuste2)
qs p-val
qsori 401.53016 0.00000
qsorievadj 402.57808 0.00000
qsrsd 0.88286 0.64312
qssadj 0.00000 1.00000
qssadjevadj 0.00000 1.00000
qsirr 0.00000 1.00000
qsirrevadj 0.00000 1.00000
qssori 152.62432 0.00000
qssorievadj 152.25008 0.00000
qssrsd 1.30197 0.52153
qsssadj 0.00000 1.00000
qsssadjevadj 0.00000 1.00000
qssirr 0.00000 1.00000
qssirrevadj 0.00000 1.00000
summary(ajuste2)
Call:
seas(x = consumo.ts, xreg = regs, regression.aictest = NULL,
regression.usertype = "holiday", forecast.save = "forecasts")
Coefficients:
Estimate Std. Error z value Pr(>|z|)
xreg1 0.0013903 0.0004922 2.825 0.00473 **
xreg2 0.0186926 0.0081896 2.282 0.02246 *
xreg3 -0.0137603 0.0081059 -1.698 0.08959 .
MA-Nonseasonal-01 0.3957676 0.0606551 6.525 6.81e-11 ***
MA-Seasonal-12 0.6277735 0.0545185 11.515 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
SEATS adj. ARIMA: (0 1 1)(0 1 1) Obs.: 235 Transform: log
AICc: 968.1, BIC: 988.1 QS (no seasonality in final): 0
Box-Ljung (no autocorr.): 33.88 . Shapiro (normality): 0.996
Agora com os outliers predefinidos. Verá que melhorou um pouco, reduzindo o AICc.
ajuste3 <- seas(x = consumo.ts, xreg = regs, regression.aictest = NULL, regression.variables = c("ls2016.May",
"AO2011.May", "AO2009.Jan"), regression.usertype = "holiday", forecast.save = "forecasts")
qs(ajuste3)
qs p-val
qsori 401.53016 0.00000
qsorievadj 407.23799 0.00000
qsrsd 0.72962 0.69433
qssadj 0.00000 1.00000
qssadjevadj 0.00000 1.00000
qsirr 0.00000 1.00000
qsirrevadj 0.00000 1.00000
qssori 152.62432 0.00000
qssorievadj 152.53446 0.00000
qssrsd 0.95047 0.62174
qsssadj 0.00000 1.00000
qsssadjevadj 0.00000 1.00000
qssirr 0.00000 1.00000
qssirrevadj 0.00000 1.00000
summary(ajuste3)
Call:
seas(x = consumo.ts, xreg = regs, regression.aictest = NULL,
regression.variables = c("ls2016.May", "AO2011.May", "AO2009.Jan"),
regression.usertype = "holiday", forecast.save = "forecasts")
Coefficients:
Estimate Std. Error z value Pr(>|z|)
AO2009.Jan 0.090403 0.022427 4.031 5.55e-05 ***
AO2011.May 0.066876 0.022482 2.975 0.00293 **
LS2016.May -0.024245 0.025523 -0.950 0.34214
xreg1 0.001375 0.000460 2.990 0.00279 **
xreg2 0.019513 0.007647 2.552 0.01072 *
xreg3 -0.012936 0.007546 -1.714 0.08647 .
MA-Nonseasonal-01 0.363815 0.061521 5.914 3.35e-09 ***
MA-Seasonal-12 0.623582 0.055022 11.333 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
SEATS adj. ARIMA: (0 1 1)(0 1 1) Obs.: 235 Transform: log
AICc: 950.1, BIC: 979.9 QS (no seasonality in final): 0
Box-Ljung (no autocorr.): 31.2 Shapiro (normality): 0.9972
library(Metrics)
# mae(actual, predicted) rmse(actual, predicted)
predicted <- final(ajuste3)
actual <- consumo.ts
forecast::accuracy(predicted, actual)
ME RMSE MAE MPE MAPE ACF1 Theil's U
Test set -0.2066034 6.366519 3.649847 -1.015523 5.211835 -0.009962979 0.7242298
# idem ao obtido no Metrics abaixo
(mae3 <- mae(actual, predicted))
[1] 3.649847
(rmse3 <- rmse(actual, predicted))
[1] 6.366519
Verifico a especificação do ajuste3.
static(ajuste3) # permite o ajuste manual do seas
seas(x = consumo.ts, xreg = regs, regression.variables = c("ao2009.Jan",
"ao2011.May", "ls2016.May"), regression.usertype = "holiday",
forecast.save = "forecasts", arima.model = "(0 1 1)(0 1 1)",
regression.aictest = NULL, outlier = NULL, transform.function = "log")
Vou avaliar a retirada de ls2016 não significativo em ajuste3.
require(seasonal)
#
ajuste4 <- seas(x = consumo.ts, xreg = regs, regression.variables = c("AO2011.May",
"AO2009.Jan"), regression.usertype = "holiday", forecast.save = "forecasts",
regression.aictest = NULL, outlier = NULL)
qs(ajuste4)
qs p-val
qsori 401.53016 0.00000
qsorievadj 407.88433 0.00000
qsrsd 0.80386 0.66903
qssadj 0.00000 1.00000
qssadjevadj 0.00000 1.00000
qsirr 0.00000 1.00000
qsirrevadj 0.00000 1.00000
qssori 152.62432 0.00000
qssorievadj 152.32568 0.00000
qssrsd 1.08601 0.58100
qsssadj 0.00000 1.00000
qsssadjevadj 0.00000 1.00000
qssirr 0.00000 1.00000
qssirrevadj 0.00000 1.00000
summary(ajuste4)
Call:
seas(x = consumo.ts, xreg = regs, regression.aictest = NULL,
outlier = NULL, regression.variables = c("AO2011.May", "AO2009.Jan"),
regression.usertype = "holiday", forecast.save = "forecasts")
Coefficients:
Estimate Std. Error z value Pr(>|z|)
AO2009.Jan 0.0904186 0.0224273 4.032 5.54e-05 ***
AO2011.May 0.0671744 0.0224795 2.988 0.00281 **
xreg1 0.0013582 0.0004595 2.956 0.00312 **
xreg2 0.0188255 0.0076081 2.474 0.01335 *
xreg3 -0.0133124 0.0075326 -1.767 0.07718 .
MA-Nonseasonal-01 0.3597317 0.0616364 5.836 5.34e-09 ***
MA-Seasonal-12 0.6215622 0.0550650 11.288 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
SEATS adj. ARIMA: (0 1 1)(0 1 1) Obs.: 235 Transform: log
AICc: 948.8, BIC: 975.4 QS (no seasonality in final): 0
Box-Ljung (no autocorr.): 31.67 Shapiro (normality): 0.9969
predicted4 <- final(ajuste4)
actual <- consumo.ts
forecast::accuracy(predicted4, actual)
ME RMSE MAE MPE MAPE ACF1 Theil's U
Test set -0.2072867 6.366703 3.651408 -1.016476 5.212728 -0.008495727 0.7242178
# idem ao obtido no Metrics abaixo
require(Metrics)
(mae4 <- mae(actual, predicted4))
[1] 3.651408
(rmse4 <- rmse(actual, predicted4))
[1] 6.366703
static(ajuste4)
seas(x = consumo.ts, xreg = regs, regression.variables = c("ao2009.Jan",
"ao2011.May"), regression.usertype = "holiday", forecast.save = "forecasts",
arima.model = "(0 1 1)(0 1 1)", regression.aictest = NULL,
outlier = NULL, transform.function = "log")
Melhorou pelo AICc e significancia e acurácia. Vou checar mais uma alternativa alterando o ARIMA conforme o obtido em https://rpubs.com/amrofi/arima_varejoms, ARIMA(4,1,0)(0,1,1)[12].
# com especificação do ARIMA
ajuste5 <- seas(x = consumo.ts, xreg = regs, regression.aictest = NULL, regression.variables = c("AO2011.May",
"AO2009.Jan"), arima.model = "(3 1 0)(0 1 1)", regression.usertype = "holiday",
forecast.save = "forecasts")
qs(ajuste5)
qs p-val
qsori 401.53016 0.00000
qsorievadj 407.96642 0.00000
qsrsd 0.72012 0.69763
qssadj 0.00000 1.00000
qssadjevadj 0.00000 1.00000
qsirr 0.02259 0.98877
qsirrevadj 0.00000 1.00000
qssori 152.62432 0.00000
qssorievadj 152.39745 0.00000
qssrsd 0.95080 0.62164
qsssadj 0.00000 1.00000
qsssadjevadj 0.00000 1.00000
qssirr 0.00000 1.00000
qssirrevadj 0.00000 1.00000
summary(ajuste5)
Call:
seas(x = consumo.ts, xreg = regs, regression.aictest = NULL,
regression.variables = c("AO2011.May", "AO2009.Jan"), arima.model = "(3 1 0)(0 1 1)",
regression.usertype = "holiday", forecast.save = "forecasts")
Coefficients:
Estimate Std. Error z value Pr(>|z|)
AO2009.Jan 0.0886610 0.0219665 4.036 5.43e-05 ***
AO2011.May 0.0684329 0.0219820 3.113 0.00185 **
xreg1 0.0013164 0.0004944 2.662 0.00776 **
xreg2 0.0211998 0.0076110 2.785 0.00535 **
xreg3 -0.0118157 0.0075295 -1.569 0.11659
AR-Nonseasonal-01 -0.3810618 0.0660034 -5.773 7.77e-09 ***
AR-Nonseasonal-02 -0.0898955 0.0705512 -1.274 0.20260
AR-Nonseasonal-03 0.0938840 0.0659875 1.423 0.15481
MA-Seasonal-12 0.6213611 0.0546936 11.361 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
SEATS adj. ARIMA: (3 1 0)(0 1 1) Obs.: 235 Transform: log
AICc: 949.4, BIC: 982.4 QS (no seasonality in final): 0
Box-Ljung (no autocorr.): 28.71 Shapiro (normality): 0.9968
Comentário: o melhor modelo foi o ajuste4.
Testei com level shift para maio de 2017 e não ajudou.
require(seasonal)
#
ajuste6 <- seas(x = consumo.ts, xreg = regs, regression.variables = c("AO2011.May",
"AO2009.Jan", "ls2017.May"), regression.usertype = "holiday", forecast.save = "forecasts",
regression.aictest = NULL, outlier = NULL)
qs(ajuste6)
qs p-val
qsori 401.53016 0.00000
qsorievadj 407.42904 0.00000
qsrsd 0.81383 0.66570
qssadj 0.00000 1.00000
qssadjevadj 0.00000 1.00000
qsirr 0.00000 1.00000
qsirrevadj 0.00000 1.00000
qssori 152.62432 0.00000
qssorievadj 152.17610 0.00000
qssrsd 1.08365 0.58168
qsssadj 0.00000 1.00000
qsssadjevadj 0.00000 1.00000
qssirr 0.00000 1.00000
qssirrevadj 0.00000 1.00000
summary(ajuste6)
Call:
seas(x = consumo.ts, xreg = regs, regression.aictest = NULL,
outlier = NULL, regression.variables = c("AO2011.May", "AO2009.Jan",
"ls2017.May"), regression.usertype = "holiday", forecast.save = "forecasts")
Coefficients:
Estimate Std. Error z value Pr(>|z|)
AO2009.Jan 0.090486 0.022418 4.036 5.43e-05 ***
AO2011.May 0.067037 0.022474 2.983 0.00286 **
LS2017.May -0.008531 0.025883 -0.330 0.74171
xreg1 0.001371 0.000461 2.974 0.00294 **
xreg2 0.018752 0.007607 2.465 0.01370 *
xreg3 -0.013282 0.007529 -1.764 0.07772 .
MA-Nonseasonal-01 0.358471 0.061663 5.813 6.12e-09 ***
MA-Seasonal-12 0.622614 0.055040 11.312 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
SEATS adj. ARIMA: (0 1 1)(0 1 1) Obs.: 235 Transform: log
AICc: 950.9, BIC: 980.7 QS (no seasonality in final): 0
Box-Ljung (no autocorr.): 31.28 Shapiro (normality): 0.9969
predicted6 <- final(ajuste6)
actual <- consumo.ts
forecast::accuracy(predicted6, actual)
ME RMSE MAE MPE MAPE ACF1 Theil's U
Test set -0.2099459 6.366541 3.651575 -1.020477 5.21332 -0.00905445 0.7242057
# idem ao obtido no Metrics abaixo
require(Metrics)
(mae6 <- mae(actual, predicted6))
[1] 3.651575
(rmse6 <- rmse(actual, predicted6))
[1] 6.366541
static(ajuste6)
seas(x = consumo.ts, xreg = regs, regression.variables = c("ao2009.Jan",
"ao2011.May", "ls2017.May"), regression.usertype = "holiday",
forecast.save = "forecasts", arima.model = "(0 1 1)(0 1 1)",
regression.aictest = NULL, outlier = NULL, transform.function = "log")
Fiz uso do seasonalview::view
para obter um refinamento do modelo, a partir do ajuste5. É possível verificar que o modelo melhora com a especificação do ajuste7.
library(seasonalview)
# view(ajuste5) script do shiny
ajuste7 <- seas(x = consumo.ts, xreg = regs, regression.aictest = NULL, regression.variables = c("AO2011.May",
"AO2009.Jan"), arima.model = "(2 1 1)(0 1 1)", regression.usertype = "holiday",
forecast.save = "forecasts")
summary(ajuste7)
Call:
seas(x = consumo.ts, xreg = regs, regression.aictest = NULL,
regression.variables = c("AO2011.May", "AO2009.Jan"), arima.model = "(2 1 1)(0 1 1)",
regression.usertype = "holiday", forecast.save = "forecasts")
Coefficients:
Estimate Std. Error z value Pr(>|z|)
AO2009.Jan 0.088026 0.021870 4.025 5.70e-05 ***
AO2011.May 0.065013 0.021902 2.968 0.00299 **
xreg1 0.001299 0.000493 2.635 0.00841 **
xreg2 0.020741 0.007548 2.748 0.00600 **
xreg3 -0.012047 0.007450 -1.617 0.10587
AR-Nonseasonal-01 -0.987530 0.206865 -4.774 1.81e-06 ***
AR-Nonseasonal-02 -0.346695 0.077358 -4.482 7.41e-06 ***
MA-Nonseasonal-01 -0.609473 0.212446 -2.869 0.00412 **
MA-Seasonal-12 0.617472 0.054676 11.293 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
SEATS adj. ARIMA: (2 1 1)(0 1 1) Obs.: 235 Transform: log
AICc: 948, BIC: 981 QS (no seasonality in final): 0
Box-Ljung (no autocorr.): 23.62 Shapiro (normality): 0.9962
monthplot(ajuste7, col.base = 1)
legend("topleft", legend = c("Irregular", "Seasonal", "Seasonal Average"), col = c(4,
2, 1), lwd = c(1, 2, 2), lty = 1, bty = "n", cex = 0.6)
# figura da serie ajustada
plot(ajuste7, main = "Ajuste 7 de consumo.ts")
grid()
legend("topleft", legend = c("Original", "Adjusted"), col = c(1, 2), lwd = c(1, 2),
lty = 1, bty = "n", cex = 0.6)
forecasts7 <- series(ajuste7, c("forecast.forecasts"))
require(graphics)
require(zoo)
data.fcst <- cbind.zoo(consumo.ts, series(ajuste7, "forecast.forecasts"))
ts.plot(data.fcst, col = c(1, 2, 3, 4), main = "serie original de consumo e forecasts ajuste7",
xlab = "Mes/Ano")
legend("topleft", lty = 1, pch = 1, col = 1:4, c("data", "X13as", "Lim Inferior",
"Lim Superior"))
grid()
list(forecasts7)
[[1]]
forecast lowerci upperci
Aug 2019 91.50238 86.30016 97.01820
Sep 2019 91.03687 84.97473 97.53149
Oct 2019 94.31486 87.17815 102.03580
Nov 2019 95.86953 87.61021 104.90749
Dec 2019 117.06348 106.18165 129.06051
Jan 2020 90.81741 81.67070 100.98852
Feb 2020 82.27673 73.44455 92.17103
Mar 2020 90.88800 80.55815 102.54242
Apr 2020 86.05807 75.79097 97.71602
May 2020 89.52632 78.35627 102.28872
Jun 2020 86.69946 75.43711 99.64323
Jul 2020 90.46494 78.26559 104.56582
Aug 2020 90.99027 77.69484 106.56087
Sep 2020 91.01938 77.02856 107.55137
Oct 2020 93.24180 78.21993 111.14857
Nov 2020 95.81952 79.65863 115.25909
Dec 2020 117.41156 96.83776 142.35641
Jan 2021 89.89017 73.53641 109.88085
Feb 2021 82.44484 66.94186 101.53815
Mar 2021 92.13314 74.25678 114.31300
Apr 2021 85.06853 68.08588 106.28716
May 2021 89.81990 71.39896 112.99344
Jun 2021 86.59238 68.37608 109.66173
Jul 2021 89.94242 70.56231 114.64534
Aug 2021 91.29086 70.74652 117.80116
Sep 2021 90.90638 69.79494 118.40357
Oct 2021 92.70304 70.51940 121.86510
Nov 2021 95.70034 72.11909 126.99209
Dec 2021 117.79982 88.01230 157.66883
Jan 2022 90.18742 66.80133 121.76062
Feb 2022 83.11708 61.05891 113.14399
Mar 2022 90.22011 65.74051 123.81512
Apr 2022 85.45885 61.77763 118.21779
[ reached getOption("max.print") -- omitted 3 rows ]
require(xlsx)
write.xlsx(as.data.frame.ts(forecasts7), "forecasts.xlsx", col.names = TRUE, row.names = TRUE)
library(dygraphs)
dygraph(data.fcst, main = "Índice de consumo do varejo de Mato Grosso do Sul") %>%
dyAxis("y", label = "Índice", valueRange = c(0, 150), axisLabelFontSize = 20) %>%
dyAxis("x", label = "Mês/Ano", axisLabelFontSize = 20) %>% dyGroup(c("consumo.ts",
"forecast"), drawPoints = TRUE, color = c("blue", "red")) %>% dyLegend(width = 400) %>%
dyAnnotation("2009-1-1", text = "AOjan09", attachAtBottom = TRUE, width = 80) %>%
dyAnnotation("2011-5-1", text = "AOmai11", attachAtBottom = TRUE, width = 80) %>%
dyAnnotation("2016-5-1", text = "LSmai16", attachAtBottom = TRUE, width = 80) %>%
dyOptions(drawPoints = TRUE, pointSize = 5, pointShape = "triangle", axisLineWidth = 1.5)
Agora que já temos as informações até fev/2020, podemos comparar as estimativas com os dados reais.
# dadosnovos<-BETS::BETSget(1479)
dadosnovos <- structure(c(35.2, 35.6, 39.2, 40.5, 41.6, 40.4, 40.8, 38.7, 37.3, 37.6,
35.6, 47.4, 34.2, 32.2, 38, 37.5, 38.8, 35, 38.4, 40.2, 38.2, 39.3, 36, 46.3,
36.4, 34, 39, 37.8, 38.9, 35.3, 37.2, 38.1, 35.6, 38.3, 35.6, 45.7, 32.2, 31.6,
35.2, 36.8, 37.5, 34.8, 38.4, 38.1, 37, 39, 37.3, 49, 35.8, 35.3, 40.2, 41.3,
43.9, 41.6, 45.9, 42.3, 42.2, 44, 41.3, 56.9, 38.5, 38.5, 45.3, 43.6, 46.2, 44.3,
47.5, 46.5, 46.4, 46, 44.1, 61, 42, 40.2, 44.6, 44.5, 47.8, 45.3, 46.5, 48.5,
47.7, 50.2, 49.3, 64.5, 47, 46.8, 51, 50.5, 55, 51.3, 52.8, 55.3, 54.8, 55.6,
55.3, 72.2, 54.5, 52.1, 56.2, 57.2, 60.8, 56.1, 61.8, 61.6, 59.8, 63.3, 57.7,
77.4, 61.4, 51.9, 57.3, 57.9, 61.9, 57.3, 61.1, 61.1, 60.6, 65.6, 63.5, 83.1,
64.1, 60.2, 67.8, 67.1, 72.8, 68.5, 71.1, 69.3, 69.9, 71.1, 67.9, 92.7, 67.5,
64.8, 69.1, 69.4, 79.6, 70.2, 73.8, 72.5, 71.3, 75.6, 74.7, 100.8, 79.5, 75.7,
82.4, 78, 84.8, 83.2, 84.8, 88.5, 86.3, 91.7, 92.8, 111.4, 92.8, 83.7, 92.5,
88.3, 93.9, 88.8, 96, 95.9, 93.2, 98.3, 100.5, 128.8, 97.2, 90.2, 94.3, 94.4,
101.1, 92, 96.4, 98.2, 97.6, 105.8, 103.1, 129.6, 99.6, 87.8, 97, 94.8, 98.6,
93.4, 98.4, 96.4, 92.5, 100.6, 97.2, 124.5, 91.5, 85.1, 91.6, 88.5, 92.2, 87.4,
90.5, 88.1, 85.2, 89.4, 93.4, 116.9, 90.8, 84, 89.7, 86.3, 90, 87.3, 90.8, 93.5,
93.7, 91.4, 93.5, 114.1, 87.8, 81.1, 94.5, 83.2, 89.9, 88.8, 89.3, 93.7, 93.5,
96.3, 101.3, 118.3, 93.8, 85.2, 90, 86.6, 90, 85.2, 91, 94.4, 93.4, 95.9, 102.6,
116, 94.9, 89.1), .Tsp = c(2000, 2020.08333333333, 12), class = "ts")
# View(cbind.zoo(data.fcst,dadosnovos))
print(cbind.zoo(data.fcst, dadosnovos)[230:242])
consumo.ts forecast lowerci upperci dadosnovos
fev 2019 85.2 NA NA NA 85.2
mar 2019 90.0 NA NA NA 90.0
abr 2019 86.6 NA NA NA 86.6
mai 2019 90.0 NA NA NA 90.0
jun 2019 85.2 NA NA NA 85.2
jul 2019 90.9 NA NA NA 91.0
ago 2019 NA 91.50238 86.30016 97.01820 94.4
set 2019 NA 91.03687 84.97473 97.53149 93.4
out 2019 NA 94.31486 87.17815 102.03580 95.9
nov 2019 NA 95.86953 87.61021 104.90749 102.6
dez 2019 NA 117.06348 106.18165 129.06051 116.0
jan 2020 NA 90.81741 81.67070 100.98852 94.9
fev 2020 NA 82.27673 73.44455 92.17103 89.1
Acurácia do período Agosto/2019 a fev/2020:
previsto <- forecasts7[1:7, 1]
observado <- dadosnovos[236:242]
forecast::accuracy(previsto, observado)
ME RMSE MAE MPE MAPE
Test set 3.345533 4.245131 3.649385 3.550805 3.812746
Ou seja, um erro percentual médio de 3.55% e um erro absoluto médio de 3.64%.
FERREIRA, Pedro Costa; SPERANZA, Talitha; COSTA, Jonatha (2018). BETS: Brazilian Economic Time Series. R package version 0.4.9. Disponível em: https://CRAN.R-project.org/package=BETS.
FERREIRA, Pedro C.; MATOS, Daiane M. Usando o R para ensinar Ajuste Sazonal. São Paulo: FGV, 2017. 18p. Disponível em: http://portalibre.fgv.br/lumis/portal/file/fileDownload.jsp?fileId=8A7C82C5519A547801533DF7BE5E2D0D
HYNDMAN, Rob. (2018). fpp2: Data for “Forecasting: Principles and Practice” (2nd Edition). R package version 2.3. Disponível em: https://CRAN.R-project.org/package=fpp2.
HYNDMAN, R.J., & ATHANASOPOULOS, G. (2018) Forecasting: principles and practice, 2nd edition, OTexts: Melbourne, Australia. Disponível em: https://otexts.com/fpp2/. Accessed on 12 Set 2019.
SAX, C; EDDELBUETTEL, D. (2018). “Seasonal Adjustment by X-13ARIMA-SEATS in R.” Journal of Statistical Software, 87(11), 1-17. doi: 10.18637/jss.v087.i11 (URL: https://doi.org/10.18637/jss.v087.i11).