library(fpp3)
## ── Attaching packages ────────────────────────────────────────────── fpp3 0.5 ──
## ✔ tibble 3.1.6 ✔ tsibble 1.1.3
## ✔ dplyr 1.1.0 ✔ tsibbledata 0.4.1
## ✔ tidyr 1.3.0 ✔ feasts 0.3.0
## ✔ lubridate 1.8.0 ✔ fable 0.3.2
## ✔ ggplot2 3.4.1 ✔ fabletools 0.3.2
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date() masks base::date()
## ✖ dplyr::filter() masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag() masks stats::lag()
## ✖ tsibble::setdiff() masks base::setdiff()
## ✖ tsibble::union() masks base::union()
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ readr 2.1.2 ✔ stringr 1.5.0
## ✔ purrr 1.0.1 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ lubridate::as.difftime() masks base::as.difftime()
## ✖ lubridate::date() masks base::date()
## ✖ dplyr::filter() masks stats::filter()
## ✖ tsibble::intersect() masks lubridate::intersect(), base::intersect()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag() masks stats::lag()
## ✖ tsibble::setdiff() masks lubridate::setdiff(), base::setdiff()
## ✖ tsibble::union() masks lubridate::union(), base::union()
library(gridExtra)
##
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
##
## combine
library(ggplot2)
library(forecast)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
##
## Attaching package: 'forecast'
## The following objects are masked from 'package:fabletools':
##
## accuracy, forecast
library(quantmod)
## Loading required package: xts
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following object is masked from 'package:tsibble':
##
## index
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
##
## Attaching package: 'xts'
## The following objects are masked from 'package:dplyr':
##
## first, last
## Loading required package: TTR
library(ggfortify)
## Registered S3 methods overwritten by 'ggfortify':
## method from
## autoplot.Arima forecast
## autoplot.acf forecast
## autoplot.ar forecast
## autoplot.bats forecast
## autoplot.decomposed.ts forecast
## autoplot.ets forecast
## autoplot.forecast forecast
## autoplot.stl forecast
## autoplot.ts forecast
## fitted.ar forecast
## fortify.ts forecast
## residuals.ar forecast
library(tseries)
library(knitr)
library(latex2exp)
library(ggpubr)
##
## Attaching package: 'ggpubr'
## The following object is masked from 'package:forecast':
##
## gghistogram
Q 9.1 Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.
Explain the differences among these figures. Do they all indicate that the data are white noise?
ans: a white noise series is stationary, the ACF bars of all 3 of them are inside of the dashline which indicate the point of statistical significance. This mean they are not statistically significant. However, the strong correlation is at different lag.and all plot look like Figure 2.23: Autocorrelation function for the white noise series.
Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?
ans: it is because all each autocorrelation out of the 3 plot is close to zero, also there is no spikes are outside of the dashline which is the bounds.
Q 9.2 A classic example of a non-stationary series are stock prices. Plot the daily closing prices for Amazon stock (contained in gafa_stock), along with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.
there is no seasonal pattern, and it is neg trend in 2018 which is the end of the plot, and r value is high.
gafa_stock %>%
filter(Symbol == "AMZN") %>%
gg_tsdisplay(Close) +
labs(title = "AMZ Closing Stock Price")
## Warning: Provided data has an irregular interval, results should be treated with
## caution. Computing ACF by observation.
gafa_stock %>%
filter(Symbol == "AMZN") %>%
features(Close, unitroot_ndiffs)
## # A tibble: 1 × 2
## Symbol ndiffs
## <chr> <int>
## 1 AMZN 1
9.3
For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.
Turkish GDP from global_economy.
the lambda is .157 and ndiff is 1.
turkey <- global_economy %>%
filter(Country=='Turkey') %>%
select(Country, GDP)
turkey
## # A tsibble: 58 x 3 [1Y]
## # Key: Country [1]
## Country GDP Year
## <fct> <dbl> <dbl>
## 1 Turkey 13995067818. 1960
## 2 Turkey 8022222222. 1961
## 3 Turkey 8922222222. 1962
## 4 Turkey 10355555556. 1963
## 5 Turkey 11177777778. 1964
## 6 Turkey 11944444444. 1965
## 7 Turkey 14122222222. 1966
## 8 Turkey 15666666667. 1967
## 9 Turkey 17500000000 1968
## 10 Turkey 19466666667. 1969
## # … with 48 more rows
lambda <- turkey %>%
features(GDP, features = guerrero) %>%
pull(lambda_guerrero)
lambda
## [1] 0.1572187
turkey %>%
mutate(GDP = box_cox(GDP, lambda)) %>%
features(GDP, unitroot_ndiffs)
## # A tibble: 1 × 2
## Country ndiffs
## <fct> <int>
## 1 Turkey 1
global_economy %>%
filter(Country == "Turkey") %>%
gg_tsdisplay(GDP, plot_type='partial') +
labs(title = "Not transformed Turkish GDP")
global_economy %>%
filter(Country == "Turkey") %>%
gg_tsdisplay(difference(box_cox(GDP,lambda)), plot_type='partial') +
labs(title = "Turkish GDP with lambda ")
## Warning: Removed 1 row containing missing values (`geom_line()`).
## Warning: Removed 1 rows containing missing values (`geom_point()`).
Accommodation takings in the state of Tasmania from aus_accommodation.
Tasmania <- aus_accommodation %>%
filter(State=='Tasmania') %>%
select(State, Takings)
lambda <- Tasmania %>%
features(Takings, features = guerrero) %>%
pull(lambda_guerrero)
lambda
## [1] 0.001819643
Tasmania %>%
mutate(Takings = box_cox(Takings, lambda)) %>%
features(Takings, unitroot_ndiffs)
## # A tibble: 1 × 2
## State ndiffs
## <chr> <int>
## 1 Tasmania 1
aus_accommodation %>%
filter(State == "Tasmania") %>%
gg_tsdisplay(Takings, plot_type='partial') +
labs(title = "Not transformed Tasmania Accomodation Takings")
aus_accommodation %>%
filter(State == "Tasmania") %>%
gg_tsdisplay(difference(box_cox(Takings,lambda), 4), plot_type='partial') +
labs(title = "Differenced Tasmania Accomodation Takings with lambda ")
## Warning: Removed 4 rows containing missing values (`geom_line()`).
## Warning: Removed 4 rows containing missing values (`geom_point()`).
Monthly sales from souvenirs.
lambda_souvenirs <- souvenirs %>%
features(Sales, features = guerrero) %>%
pull(lambda_guerrero)
souvenirs %>%
mutate(Sales = box_cox(Sales, lambda_souvenirs)) %>%
features(Sales, unitroot_ndiffs)
## # A tibble: 1 × 1
## ndiffs
## <int>
## 1 1
souvenirs %>%
gg_tsdisplay(Sales, plot_type='partial', lag = 36) +
labs(title = "Non-transformed Monthly Souvenir Sales")
souvenirs %>%
gg_tsdisplay(difference(box_cox(Sales,lambda_souvenirs), 12), plot_type='partial', lag = 36) +
labs(title = "Differenced souvenirs sales with lambda ")
## Warning: Removed 12 rows containing missing values (`geom_line()`).
## Warning: Removed 12 rows containing missing values (`geom_point()`).
9.5
For your retail data (from Exercise 8 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.
set.seed(1234)
myseries <- aus_retail %>%
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
myseries %>%
gg_tsdisplay(Turnover, plot_type='partial', lag = 36) +
labs(title = "Not transformed Retail Turnover")
lambda <- myseries %>%
features(Turnover, features = guerrero) %>%
pull(lambda_guerrero)
myseries %>%
features(box_cox(Turnover, lambda), unitroot_nsdiffs)
## # A tibble: 1 × 3
## State Industry nsdiffs
## <chr> <chr> <int>
## 1 Tasmania Cafes, restaurants and takeaway food services 1
myseries %>%
gg_tsdisplay(difference(box_cox(Turnover,lambda), 12), plot_type='partial', lag = 36) +
labs(title = "Differenced Tasmania Accomodation Takings with lambda ")
## Warning: Removed 12 rows containing missing values (`geom_line()`).
## Warning: Removed 12 rows containing missing values (`geom_point()`).
9.6
Simulate and plot some data from simple ARIMA models.
y <- numeric(100)
e <- rnorm(100)
for(i in 2:100)
y[i] <- 0.6*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
b.Produce a time plot for the series. How does the plot change as you change
for(i in 2:100)
y[i] <- 1*y[i-1] + e[i]
sim2 <- tsibble(idx = seq_len(100), y = y, index = idx)
sim2 %>%
autoplot(y)+
labs(title = "AR(1) model with phi i = 1 ")
for(i in 2:100)
y[i] <- 2*y[i-1] + e[i]
sim3 <- tsibble(idx = seq_len(100), y = y, index = idx)
sim3 %>%
autoplot(y)+
labs(title = "AR(1) model with phi i = 2 ")
for(i in 2:100)
y[i] <- -0.6*y[i-1] + e[i]
sim4 <- tsibble(idx = seq_len(100), y = y, index = idx)
sim4 %>%
autoplot(y)+
labs(title = "AR(1) model with phi i = -0.6 ")
for(i in 2:100)
y[i] <- e[i] + 0.6 * e[i-1]
sim9c1 <- tsibble(idx = seq_len(100), y = y, index = idx)
sim9c1 %>%
autoplot(y)+
labs(title = "AR(1) model with theta i = .06 ")
for(i in 2:100)
y[i] <- e[i] + 1 * e[i-1]
sim9c2 <- tsibble(idx = seq_len(100), y = y, index = idx)
sim9c2 %>%
autoplot(y)+
labs(title = "AR(1) model with theta i = 1 ")
for(i in 2:100)
y[i] <- e[i] + 2 * e[i-1]
sim9c3 <- tsibble(idx = seq_len(100), y = y, index = idx)
sim9c3 %>%
autoplot(y)+
labs(title = "AR(1) model with theta i = 2 ")
for(i in 2:100)
y[i] <- e[i] + -.06 * e[i-1]
sim9c4 <- tsibble(idx = seq_len(100), y = y, index = idx)
sim9c4 %>%
autoplot(y)+
labs(title = "AR(1) model with theta i = -.06 ")
e. Generate data from an ARMA(1,1) model with
for(i in 2:100)
y[i] <- 0.6*y[i-1] + 0.6*e[i-1] + e[i]
q9e <- tsibble(idx = seq_len(100), y = y, index = idx)
set.seed(1234)
y <- numeric(100)
for(i in 3:100)
y[i] <- -0.8 * y[i-1] + 0.3 * y[i-2] + e[i]
q9f <- tsibble(idx = seq_len(100), y = y, index = idx)
q9e %>%
gg_tsdisplay(y, plot_type='partial')
q9f %>%
gg_tsdisplay(y, plot_type='partial')
9.7
Consider aus_airpassengers, the total number of passengers (in millions) from Australian air carriers for the period 1970-2011.
aus_airpassengers2 <- aus_airpassengers
autoplot(aus_airpassengers2)
## Plot variable not specified, automatically selected `.vars = Passengers`
fit <- aus_airpassengers2 %>%
model(ARIMA(Passengers))
report(fit)
## Series: Passengers
## Model: ARIMA(0,2,1)
##
## Coefficients:
## ma1
## -0.8963
## s.e. 0.0594
##
## sigma^2 estimated as 4.308: log likelihood=-97.02
## AIC=198.04 AICc=198.32 BIC=201.65
fit %>% gg_tsresiduals()
b. Write the model in terms of the backshift operator.
\((1-\phi_1B) (1-B)^2y_t = c + (1 + \theta_1B)e_t\)
\(y_t = -0.8963*e_t-1 + e_t\)
c.Plot forecasts from an ARIMA(0,1,0) model with drift and compare these to part a.
fit2 <- aus_airpassengers %>%
model(ARIMA(Passengers ~ pdq(0,1,0)))
fit2 %>% gg_tsresiduals()
d.Plot forecasts from an ARIMA(2,1,2) model with drift and compare these to parts a and c. Remove the constant and see what happens.
no model is found.
fit3 <- aus_airpassengers %>%
model(ARIMA(Passengers ~ pdq(2,1,2)))
## Warning: It looks like you're trying to fully specify your ARIMA model but have not said if a constant should be included.
## You can include a constant using `ARIMA(y~1)` to the formula or exclude it by adding `ARIMA(y~0)`.
## Warning: 1 error encountered for ARIMA(Passengers ~ pdq(2, 1, 2))
## [1] Could not find an appropriate ARIMA model.
## This is likely because automatic selection does not select models with characteristic roots that may be numerically unstable.
## For more details, refer to https://otexts.com/fpp3/arima-r.html#plotting-the-characteristic-roots
e.Plot forecasts from an ARIMA(0,2,1) model with a constant. What happens?
no model is found.
fit3 <- aus_airpassengers %>%
model(ARIMA(Passengers ~ pdq(2,1,2)))
## Warning: It looks like you're trying to fully specify your ARIMA model but have not said if a constant should be included.
## You can include a constant using `ARIMA(y~1)` to the formula or exclude it by adding `ARIMA(y~0)`.
## Warning: 1 error encountered for ARIMA(Passengers ~ pdq(2, 1, 2))
## [1] Could not find an appropriate ARIMA model.
## This is likely because automatic selection does not select models with characteristic roots that may be numerically unstable.
## For more details, refer to https://otexts.com/fpp3/arima-r.html#plotting-the-characteristic-roots
9.8
a the line look smooth, so it does not look like Box-Cox transformation is needed.
usgdp <- global_economy %>%
filter(Country=="United States")%>%
select(Country, GDP)
usgdp %>% autoplot(GDP) +
labs(title = "US GDP")
b.
it looks like 0,2,2 fit the best.
fit <- usgdp %>%
model(
arima = ARIMA(GDP, stepwise = FALSE, approx = FALSE))
report(fit)
## Series: GDP
## Model: ARIMA(0,2,2)
##
## Coefficients:
## ma1 ma2
## -0.4206 -0.3048
## s.e. 0.1197 0.1078
##
## sigma^2 estimated as 2.615e+22: log likelihood=-1524.08
## AIC=3054.15 AICc=3054.61 BIC=3060.23
fit 222 has the lower value of AIC, AICc and BIC which is more ideal to use.
fit222 <- usgdp %>%
model(ARIMA(GDP ~ pdq(2,2,2)))
report(fit222)
## Series: GDP
## Model: ARIMA(2,2,2)
##
## Coefficients:
## ar1 ar2 ma1 ma2
## 1.3764 -0.4780 -1.9659 1.0000
## s.e. 0.1216 0.1354 0.0723 0.0719
##
## sigma^2 estimated as 2.283e+22: log likelihood=-1521.14
## AIC=3052.27 AICc=3053.47 BIC=3062.4
#p=1
fit122 <- usgdp %>%
model(ARIMA(GDP ~ pdq(1,2,2)))
report(fit122)
## Series: GDP
## Model: ARIMA(1,2,2)
##
## Coefficients:
## ar1 ma1 ma2
## 0.2053 -0.5912 -0.1928
## s.e. 0.3008 0.2886 0.2102
##
## sigma^2 estimated as 2.646e+22: log likelihood=-1523.86
## AIC=3055.72 AICc=3056.51 BIC=3063.82
Like we saw, fit222 has lower value of all 3 items, so it is a better model overall.
fit222 %>% gg_tsresiduals() +
labs(title = "US GDP 10 Year Forecast",
subtitle = "ARIMA(2,2,2)")
forecast gave me error messages
same thing, fit222 and 0,2,2 has lower value than fitets.
fitets <- usgdp %>%
model(ETS(GDP))
report(fitets)
## Series: GDP
## Model: ETS(M,A,N)
## Smoothing parameters:
## alpha = 0.9990876
## beta = 0.5011949
##
## Initial states:
## l[0] b[0]
## 448093333334 64917355687
##
## sigma^2: 7e-04
##
## AIC AICc BIC
## 3190.787 3191.941 3201.089