library(fpp3)
library(ggplot2)
library(ggfortify)
All of the figures have spikes that seem to be touching the interval lines. Collectively however, they do not make up 5% of the values, thus all three figures contain white noise data.
The critical values are at different distances due to the random manner of choosing the numbers.
amzn <- gafa_stock %>% filter(Symbol == 'AMZN')
amzn %>%
gg_tsdisplay(Close, plot_type='partial')
The time series shows an increasing trend that makes the series non-stationary. The ACF plot has a large positive r1 and is slowly decreasing. The PACF plot shows a daily cyclic pattern with the highest spike being at 1.
# Transformation lambda
turk <- global_economy %>%
filter(Country == "Turkey") %>%
features(GDP, features = guerrero) %>%
pull(lambda_guerrero)
# Display
global_economy %>%
filter(Country == "Turkey") %>%
gg_tsdisplay(GDP, plot_type='partial')
# Num Differentiation
global_economy %>%
filter(Country == "Turkey") %>%
features(box_cox(GDP,turk), unitroot_ndiffs)
## # A tibble: 1 x 2
## Country ndiffs
## <fct> <int>
## 1 Turkey 1
# Differentiate and display
global_economy %>%
filter(Country == "Turkey") %>%
gg_tsdisplay(difference(box_cox(GDP,turk)), plot_type='partial')
The Turkish GDP series is non-stationary and non-seasonal. A Box-Cox transformation with the guerrero feature was applied to the Turkish GDP series. The KPSS test indicated that one difference was required to obtain stationary data.
# Transformation lambda
tas <-aus_accommodation %>%
filter(State == "Tasmania") %>%
features(Takings, features = guerrero) %>%
pull(lambda_guerrero)
# Display
aus_accommodation %>%
filter(State == "Tasmania") %>%
gg_tsdisplay(Takings, plot_type='partial')
# Num Differentiation
aus_accommodation %>%
filter(State == "Tasmania") %>%
features(box_cox(Takings,tas), unitroot_nsdiffs)
## # A tibble: 1 x 2
## State nsdiffs
## <chr> <int>
## 1 Tasmania 1
# Differentiate and display
aus_accommodation %>%
filter(State == "Tasmania") %>%
gg_tsdisplay(difference(box_cox(Takings,tas), 4), plot_type='partial')
The Tasmania Accommodation series is non-stationary and seasonal. A Box-Cox transformation with the guerrero feature was applied to the Tasmania Accommodation series. The KPSS test indicated that one seasonal difference was required to obtain stationary data.
# Transformation lambda
souv <- souvenirs %>%
features(Sales, features = guerrero) %>%
pull(lambda_guerrero)
# Display
souvenirs %>%
gg_tsdisplay(Sales, plot_type='partial') +
labs(title = "Non-transformed Monthly Souvenir Sales")
# Num Differentiation
souvenirs %>%
features(box_cox(Sales,souv), unitroot_nsdiffs)
## # A tibble: 1 x 1
## nsdiffs
## <int>
## 1 1
# Differentiate and display
souvenirs %>%
gg_tsdisplay(difference(box_cox(Sales,souv), 12), plot_type='partial')
The Monthly Sales series is non-stationary and seasonal. A Box-Cox transformation with the guerrero feature was applied to the Monthly Sales series. The KPSS test indicated that one seasonal difference was required to obtain stationary data.
For your retail data (from Exercise 8 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.
set.seed(1234)
myseries <- aus_retail %>%
filter(`Series ID` == 'A3349791W')
s_lam <- myseries %>%
features(Turnover, features = guerrero) %>%
pull(lambda_guerrero)
myseries %>%
gg_tsdisplay(Turnover, plot_type='partial')
myseries %>%
features(box_cox(Turnover, s_lam), unitroot_nsdiffs)
## # A tibble: 1 x 3
## State Industry nsdiffs
## <chr> <chr> <int>
## 1 New South Wales Other recreational goods retailing 1
myseries %>%
gg_tsdisplay(difference(box_cox(Turnover,s_lam),12), plot_type='partial')
The ‘A3349791W’ series from aus_retail is non-stationary and seasonal. The KPSS test performed on the Box-Cox transformed (guerrero feature) series indicated one seasonal difference was required to obtain stationary data.
Simulate and plot some data from simple ARIMA models.
Use the following R code to generate data from an AR(1) model
with
ϕ1=0.6 and σ2=1. The process starts with y1=0.
y <- numeric(100)
e <- rnorm(100)
for(i in 2:100)
y[i] <- 0.6*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
ar1 <- function(phi)
{
set.seed(123)
y <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
y[i] <- phi*y[i-1] + e[i]
return (y)
}
Produce a time plot for the series. How does the plot change as you change ϕ1?
ar1(1) %>% autoplot()
ar1(0.6) %>% autoplot()
ar1(0) %>% autoplot()
ar1(-1) %>% autoplot()
As ϕ1 decreases we see more oscillation and more spikes in the data.
Write your own code to generate data from an MA(1) model with θ1=0.6 and σ2=1.
ma1 <- function(theta)
{
set.seed(456)
y <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
y[i] <- theta*y[i-1] + e[i]
return (y)
}
Produce a time plot for the series. How does the plot change as you change θ1?
ma1(1) %>% autoplot()
ma1(0.6) %>% autoplot()
ma1(0) %>% autoplot()
ma1(-1) %>% autoplot()
As θ1 decreases we see more oscillation and the spkies increasing.
Generate data from an ARMA(1,1) model with ϕ1=0.6, θ1=0.6 and σ2=1.
arma <- function(phi, theta)
{
set.seed(111)
y <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
y[i] <- phi*y[i-1] + theta*e[i-1] + e[i]
return (y)
}
Generate data from an AR(2) model with ϕ1=−0.8, ϕ2=0.3 and σ2=1. (Note that these parameters will give a non-stationary series.)
ar2 <- function(phi1, phi2)
{
set.seed(222)
y <- ts(numeric(100))
e <- rnorm(100)
for(i in 3:100)
y[i] <- phi1*y[i-1] + phi2*y[i-2] + e[i]
return (y)
}
Graph the latter two series and compare them.
arma(0.6,0.6) %>% autoplot()
ar2(-0.8,0.2) %>% autoplot()
The ARMA(1,1) model looks to be stationary. The AR(2) model however is non-stationary with osillation and increasing variance.
aus_air <- aus_airpassengers %>%
filter(Year < 2012) %>%
model(ARIMA(Passengers))
report(aus_air)
## Series: Passengers
## Model: ARIMA(0,2,1)
##
## Coefficients:
## ma1
## -0.8756
## s.e. 0.0722
##
## sigma^2 estimated as 4.671: log likelihood=-87.8
## AIC=179.61 AICc=179.93 BIC=182.99
aus_air %>%
forecast(h=10) %>%
autoplot(aus_airpassengers)
aus_air %>%
gg_tsresiduals()
The ARMA(0,2,1) model was used. The acf plot shows the residuals look like white noise.
\(y_{t} = -0.88\varepsilon_{t-1}+\varepsilon{t}\)
\((1-B)^2y_{t} = (1 - 0.88B)\varepsilon_{t}\)
aus_air2 <-aus_airpassengers %>%
filter(Year < 2012) %>%
model(ARIMA(Passengers ~ pdq(0,1,0)))
aus_air2 %>%
forecast(h=10) %>%
autoplot(aus_airpassengers)
aus_air2 %>%
gg_tsresiduals()
The model in part a predicted higher values, while this model predicted lower values.
aus_air3 <-aus_airpassengers %>%
filter(Year < 2012) %>%
model(ARIMA(Passengers ~ pdq(2,1,2)))
aus_air3 %>%
forecast(h=10) %>%
autoplot(aus_airpassengers)
aus_air3 %>%
gg_tsresiduals()
The model in part a predicted higher values, and the model in part b
predicted lower values. This model however predicted values closer to
the actual values.
aus_air4 <-aus_airpassengers %>%
filter(Year < 2012) %>%
model(ARIMA(Passengers ~ 0 + pdq(2,1,2)))
report(aus_air4)
## Series: Passengers
## Model: NULL model
## NULL model
aus_air5 <-aus_airpassengers %>%
filter(Year < 2012) %>%
model(ARIMA(Passengers ~ pdq(0,2,1)))
aus_air5 %>%
forecast(h=10) %>%
autoplot(aus_airpassengers)
aus_air5 %>%
gg_tsresiduals()
The slope of becomes steeper.
global_economy %>%
filter(Code == "USA") %>%
gg_tsdisplay(GDP, plot_type='partial')
us_lam <-global_economy %>%
filter(Code == "USA") %>%
features(GDP, features = guerrero) %>%
pull(lambda_guerrero)
alam <- myseries %>% features(Turnover,features=guerrero) %>% pull(lambda_guerrero)
The US GDP series is non-stationary, thus a Box-Cox transformation with the guerrero feature was applied to make the data more stationary.
us_gdp <- global_economy %>%
filter(Code == "USA") %>%
model(ARIMA(box_cox(GDP, us_lam)))
report(us_gdp)
## Series: GDP
## Model: ARIMA(1,1,0) w/ drift
## Transformation: box_cox(GDP, us_lam)
##
## Coefficients:
## ar1 constant
## 0.4586 118.1822
## s.e. 0.1198 9.5047
##
## sigma^2 estimated as 5479: log likelihood=-325.32
## AIC=656.65 AICc=657.1 BIC=662.78
The ARIMA(1,1,0) model with drift was used to fit the data.
global_economy %>%
filter(Code == "USA") %>%
gg_tsdisplay(box_cox(GDP,us_lam), plot_type='partial')
global_economy %>%
filter(Code == "USA") %>%
features(box_cox(GDP,us_lam), unitroot_ndiffs)
## # A tibble: 1 x 2
## Country ndiffs
## <fct> <int>
## 1 United States 1
The KPSS test indicates that only one difference is needed to make the data stationary.
other <- global_economy %>%
filter(Code == "USA") %>%
model(arima_110 = ARIMA(box_cox(GDP,us_lam) ~ pdq(1,1,0)),
arima_111 = ARIMA(box_cox(GDP,us_lam) ~ pdq(1,1,1)),
arima_120 = ARIMA(box_cox(GDP,us_lam) ~ pdq(1,2,0)),
arima_210 = ARIMA(box_cox(GDP,us_lam) ~ pdq(2,1,0)),
arima_212 = ARIMA(box_cox(GDP,us_lam) ~ pdq(2,1,2)))
glance(other) %>% arrange(AICc) %>% select(.model:BIC)
## # A tibble: 5 x 6
## .model sigma2 log_lik AIC AICc BIC
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 arima_120 6780. -326. 656. 656. 660.
## 2 arima_110 5479. -325. 657. 657. 663.
## 3 arima_111 5580. -325. 659. 659. 667.
## 4 arima_210 5580. -325. 659. 659. 667.
## 5 arima_212 5734. -325. 662. 664. 674.
The ARIMA(1,2,0) model is chosen because it has the best AICc value.
best <- other %>% select(arima_120)
best %>% gg_tsresiduals()
other %>% forecast(h=10) %>%
filter(.model=='arima_120') %>%
autoplot(global_economy)
The forecasts produced by the ARIMA(1,2,0) model seem to be reasonable and follow the series trend.
ets <- global_economy %>%
filter(Code == "USA") %>%
model(ETS(GDP))
report(ets)
## Series: GDP
## Model: ETS(M,A,N)
## Smoothing parameters:
## alpha = 0.9990876
## beta = 0.5011949
##
## Initial states:
## l[0] b[0]
## 448093333334 64917355687
##
## sigma^2: 7e-04
##
## AIC AICc BIC
## 3190.787 3191.941 3201.089
ETS has a significantly larger AICc value than the ARIMA(1,2,0) model, thus suggest it performs worse.