library(forecast)
frequency(gas)
## [1] 12
autoplot(gas)
ggseasonplot(gas, year.labels=TRUE, year.labels.left=TRUE)
ggsubseriesplot(gas)
gglagplot(gas,lags=6)
decomposed <- decompose(gas, type = "multiplicative")
plot(decomposed)
# for values
ggAcf(gas)
# residuals how coorelated
ggPacf(gas)
frequency(decomposed$seasonal)
## [1] 12
frequency(decomposed$x)
## [1] 12
frequency(decomposed$trend)
## [1] 12
# Every frequency is the same as frequency of (gas) object
which.max(gas)
## [1] 475
which.max(decomposed$seasonal)
## [1] 7
which.max(decomposed$x)
## [1] 475
which.max(decomposed$trend)
## [1] 470
# We can spot that overall observe has the outlier in the series
gas %>% log() %>% diff(lag=9) %>% autoplot()
* Explore your chosen retail time series using the following functions: autoplot, ggseasonplot, ggsubseriesplot, gglagplot, ggAcf Can you spot any seasonality, cyclicity and trend? What do you learn about the series? * What is the frequency of each commodity series? * Use which.max() to spot the outlier in the series. Which observation was it? * For the series, find an appropriate transformation in order to stabilize the variance (or maybe it is not necessary?).
From the plots above, we can see that appropriate transformation is ggAcf of gas production.
Time series - 02.06.2021 Workflow:
library(forecast)
## Warning: package 'forecast' was built under R version 4.0.5
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
# time series
Box.test(gas, lag=12, type="Lj")
##
## Box-Ljung test
##
## data: gas
## X-squared = 4890.7, df = 12, p-value < 2.2e-16
#train <- gas %>% log() %>% diff(lag=12) %>% window(gas, start=1955, end=1995)
train <- gas %>% log() %>% diff(lag=12)%>%diff(lag=1) %>% window(gas, start=1955, end=1995)
## Warning in window.default(x, ...): 'frequency' not changed
## Warning in window.default(x, ...): 'start' value not changed
fit <- auto.arima(train)
checkresiduals(fit)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,0,1)(0,0,1)[12] with zero mean
## Q* = 50.923, df = 22, p-value = 0.0004394
##
## Model df: 2. Total lags used: 24
pred <- forecast(fit, h=4)
accuracy(pred, gas)
## ME RMSE MAE MPE MAPE
## Training set 2.189476e-04 5.059782e-02 3.668321e-02 -13.0329 253.9918
## Test set 4.834876e+04 4.865924e+04 4.834876e+04 100.0000 100.0000
## MASE ACF1 Theil's U
## Training set 4.537833e-01 -0.00318431 NA
## Test set 5.980899e+05 0.15195339 9.493974
Summary etc
library(forecast)
train <- window(gas, start=1955, end=1995)
## Warning in window.default(x, ...): 'start' value not changed
fit <- naive(train)
pred <- forecast(fit, h=4)
summary(fit)
##
## Forecast method: Naive method
##
## Model Information:
## Call: naive(y = train)
##
## Residual sd: 2769.21
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 85.23718 2769.21 1684.618 0.1846975 8.223701 0.9083114 0.2929434
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Feb 1995 41600 38051.11 45148.89 36172.45 47027.55
## Mar 1995 41600 36581.12 46618.88 33924.28 49275.72
## Apr 1995 41600 35453.15 47746.85 32199.20 51000.80
## May 1995 41600 34502.23 48697.77 30744.90 52455.10
## Jun 1995 41600 33664.45 49535.55 29463.63 53736.37
## Jul 1995 41600 32907.04 50292.96 28305.27 54894.73
## Aug 1995 41600 32210.53 50989.47 27240.05 55959.95
## Sep 1995 41600 31562.24 51637.76 26248.57 56951.43
## Oct 1995 41600 30953.34 52246.66 25317.34 57882.66
## Nov 1995 41600 30377.44 52822.56 24436.57 58763.43