exercises 9.1, 9.2, 9.3, 9.5, 9.6, 9.7, 9.8

9.1

Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.

  1. Explain the differences among these figures. Do they all indicate that the data are white noise?

    • The differences among the figures are all have different size of the height in each lag, and they indicate to have white noises since they are inside the two blue lines.
  2. Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

    • The critical values at different distances from the mean of zero because they ard calculated with standard deviation (1.96/sqrt(T)), where when T (time) increases, the critical value decreases. The autocorrelatons are different in each fgure when they each refer to white noise because of larger series sizes of random numbers, leading to chances of autocorrelation to decrease.

9.2

A classic example of a non-stationary series are stock prices. Plot the daily closing prices for Amazon stock (contained in gafa_stock), along with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

The time series increases until around 2019, the decreases, but overall increasing trend, so non-stationary. ACF plot is decreasing and has a large height that goes beyond the blue line, even though stationary time series goes to zero quickly. PACF plot starts off with 1.00, shows its non-stationary. It should be differenced because it will help the mean be stabilized.

library(fpp3)
## Registered S3 method overwritten by 'tsibble':
##   method               from 
##   as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.1 ──
## ✔ tibble      3.2.1     ✔ tsibble     1.1.6
## ✔ dplyr       1.1.4     ✔ tsibbledata 0.4.1
## ✔ tidyr       1.3.1     ✔ feasts      0.4.1
## ✔ lubridate   1.9.4     ✔ fable       0.4.1
## ✔ ggplot2     3.5.1
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date()    masks base::date()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval()  masks lubridate::interval()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ tsibble::setdiff()   masks base::setdiff()
## ✖ tsibble::union()     masks base::union()
gafa_stock |>
  filter(Symbol == "AMZN") |>
  gg_tsdisplay(Close, plot_type='partial')
## Warning: Provided data has an irregular interval, results should be treated with caution. Computing ACF by observation.
## Provided data has an irregular interval, results should be treated with caution. Computing ACF by observation.

9.3

global_economy |>
  filter(Country == "Turkey") |>
  gg_tsdisplay(GDP, plot_type='partial') 

lambda <- global_economy |>
  filter(Country == "Turkey") |>
  features(GDP, features = guerrero) |>
  pull(lambda_guerrero)

global_economy |>
  filter(Country == "Turkey") |>
  features(box_cox(GDP,lambda), unitroot_ndiffs) 
global_economy |>
  filter(Country == "Turkey") |>
  gg_tsdisplay(difference(box_cox(GDP,lambda)), plot_type='partial')
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).

aus_accommodation |>
  filter(State == "Tasmania") |>
  gg_tsdisplay(Takings, plot_type='partial')

lambda <-aus_accommodation |>
  filter(State == "Tasmania") |>
  features(Takings, features = guerrero) |>
  pull(lambda_guerrero)

aus_accommodation |>
  filter(State == "Tasmania") |>
  features(box_cox(Takings,lambda), unitroot_nsdiffs) 

9.5

For your retail data (from Exercise 7 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

The data is both seasonal and non stationary, and the differencing is only suggested once, by the KPSS unit root testing. ACF and PACF is around zero, so it should be stationary.

set.seed(12345678)
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`,1)) 

myseries |>
  gg_tsdisplay(Turnover, plot_type='partial', lag = 50) 

lambda <- myseries |>
  features(Turnover, features = guerrero) |>
  pull(lambda_guerrero)

myseries |>
  features(box_cox(Turnover, lambda), unitroot_nsdiffs) 
myseries |>
  gg_tsdisplay(difference(box_cox(Turnover,lambda), 15), plot_type='partial', lag = 50)
## Warning: Removed 15 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 15 rows containing missing values or values outside the scale range
## (`geom_point()`).

9.6

Simulate and plot some data from simple ARIMA models.

  1. Use the following R code to generate data from an AR(1) model witho1=0.6 and p2=1. The process starts with y1=0

  2. Produce a time plot for the series. How does the plot change as you change o1

    Different o1 has different plot, like as o1 decrease, the plot becomes bigger in height, negative value makes the plotting even larger.

  3. Write your own code to generate data from an MA(1) model with x1=0.6 and p2=1

  4. Produce a time plot for the series. How does the plot change as you change 01

    It seems to have the opposite effect with 01, the more the value is, the larger the height, and vice versa, helps the centering around the mean.

  5. Generate data from an ARMA(1,1) model with o1=0.6, x1=0.6 and p2=1

    y <- numeric(100)
    e <- rnorm(100)
    for(i in 2:100)
      y[i] <- 0.6*y[i-1] + e[i]
    sim <- tsibble(idx = seq_len(100), y = y, index = idx)
    
    set.seed(1111)
    y <- numeric(100)
    
    for(i in 2:100)
      y[i] <- e[i] + 0.6 * e[i-1] + 0.6 * y[i-1]
    
    tsibble(idx = seq_len(100), y = y, index = idx) |>
      autoplot(y)

  6. Generate data from an AR(2) model with o1=-0.8 o2=0.3 and p2=1

    set.seed(1111)
    y <- numeric(100)
    
    for(i in 3:100)
      y[i] <- -0.8 * y[i-1] + 0.3 * y[i-2] + e[i]
    
    tsibble(idx = seq_len(100), y = y, index = idx) |>
      autoplot(y)

  7. Graph the latter two series and compare them.

    ARMA(1,1) looks to be stationary and random. ACF plot seems to have alot of up and down and only the last value goes beyond the blue line, and PACF is the same. AR model doesn’t seem stationary, ACF is up and down as well but the size consistently decreases overtime. PACF plot has only 1 negative value.

y <- numeric(100)
e <- rnorm(100)
for(i in 2:100)
  y[i] <- 0.6*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)

sim |>
  autoplot(y)

for(i in 2:100)
  y[i] <- 0.2*y[i-1] + e[i]

tsibble(idx = seq_len(100), y = y, index = idx) |>
  autoplot(y) 

for(i in 2:100)
  y[i] <- -0.8*y[i-1] + e[i]

tsibble(idx = seq_len(100), y = y, index = idx) |>
  autoplot(y) 

for(i in 2:100)
  y[i] <- 0*y[i-1] + e[i]

tsibble(idx = seq_len(100), y = y, index = idx) |>
  autoplot(y)

for(i in 2:100)
  y[i] <- e[i] + 0.6 * e[i-1]

sim <- tsibble(idx = seq_len(100), y = y, index = idx)
sim |>
  autoplot(y)

for(i in 2:100)
  y[i] <- e[i] + 0.2 * e[i-1]

tsibble(idx = seq_len(100), y = y, index = idx) |>
  autoplot(y)

for(i in 2:100)
  y[i] <- e[i] + -0.8 * e[i-1]

tsibble(idx = seq_len(100), y = y, index = idx) |>
  autoplot(y) 

for(i in 2:100)
  y[i] <- e[i] + 0 * e[i-1]

tsibble(idx = seq_len(100), y = y, index = idx) |>
  autoplot(y) 

set.seed(1111)
y <- numeric(100)

tsibble(idx = seq_len(100), y = y, index = idx) |>
  gg_tsdisplay(y, plot_type='partial')
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_segment()`).
## Removed 20 rows containing missing values or values outside the scale range
## (`geom_segment()`).

for(i in 3:100)
  y[i] <- -0.8 * y[i-1] + 0.3 * y[i-2] + e[i]

tsibble(idx = seq_len(100), y = y, index = idx) |>
  gg_tsdisplay(y, plot_type='partial') 

9.7

Consider aus_airpassengers, the total number of passengers (in millions) from Australian air carriers for the period 1970-2011.

nextperiods <- aus_airpassengers |>
  filter(Year <= 2011) |>
  model(ARIMA(Passengers)) 
report(nextperiods)
## Series: Passengers 
## Model: ARIMA(0,2,1) 
## 
## Coefficients:
##           ma1
##       -0.8756
## s.e.   0.0722
## 
## sigma^2 estimated as 4.671:  log likelihood=-87.8
## AIC=179.61   AICc=179.93   BIC=182.99
nextperiods |>
  gg_tsresiduals() 

nextperiodss <-aus_airpassengers |>
  filter(Year <= 2011) |>
  model(ARIMA(Passengers ~ pdq(0,1,0)))

nextperiodss |>
  forecast(h=15) |>
  autoplot(aus_airpassengers) 

nextperiodss |>
  gg_tsresiduals() 

nextperiodsss <-aus_airpassengers |>
  filter(Year <= 2011) |>
  model(ARIMA(Passengers ~ pdq(2,1,2)))

nextperiodsss |>
  forecast(h=15) |>
  autoplot(aus_airpassengers) 

nextperiodsss |>
  gg_tsresiduals() 

nextperiodssss <-aus_airpassengers |>
  filter(Year <= 2011) |>
  model(ARIMA(Passengers ~ 0 + pdq(2,1,2)))
## Warning: 1 error encountered for ARIMA(Passengers ~ 0 + pdq(2, 1, 2))
## [1] non-stationary AR part from CSS
nextperiodsssss <-aus_airpassengers |>
  filter(Year <= 2011) |>
  model(ARIMA(Passengers ~ 1 + pdq(0,2,1)))
## Warning: Model specification induces a quadratic or higher order polynomial trend. 
## This is generally discouraged, consider removing the constant or reducing the number of differences.
nextperiodsssss |>
  forecast(h=10) |>
  autoplot(aus_airpassengers)

nextperiodsssss |> 
  gg_tsresiduals() 

9.8

global_economy |>
  filter(Code == "USA") |>
  gg_tsdisplay(GDP, plot_type='partial')

lambda <-global_economy |>
  filter(Code == "USA") |>
  features(GDP, features = guerrero) |>
  pull(lambda_guerrero)

lambda
## [1] 0.2819443
fit <- global_economy |>
  filter(Code == "USA") |>
  model(ARIMA(box_cox(GDP, lambda))) 

report(fit)
## Series: GDP 
## Model: ARIMA(1,1,0) w/ drift 
## Transformation: box_cox(GDP, lambda) 
## 
## Coefficients:
##          ar1  constant
##       0.4586  118.1822
## s.e.  0.1198    9.5047
## 
## sigma^2 estimated as 5479:  log likelihood=-325.32
## AIC=656.65   AICc=657.1   BIC=662.78
global_economy |>
  filter(Code == "USA") |>
  gg_tsdisplay(box_cox(GDP,lambda), plot_type='partial')

global_economy |>
  filter(Code == "USA") |>
  features(box_cox(GDP,lambda), unitroot_ndiffs) 
fit <- global_economy |>
  filter(Code == "USA") |>
  model(arima110 = ARIMA(box_cox(GDP,lambda) ~ pdq(1,1,0)),
        arima120 = ARIMA(box_cox(GDP,lambda) ~ pdq(1,2,0)),
        arima111 = ARIMA(box_cox(GDP,lambda) ~ pdq(1,1,1)))

glance(fit) 
fit |>
  select(arima120) |>
  gg_tsresiduals()

fit |>
  forecast(h=10) |>
  filter(.model=='arima120') |>
  autoplot(global_economy)

ets <- global_economy |>
  filter(Code == "USA") |>
  model(ETS(GDP))

report(ets)
## Series: GDP 
## Model: ETS(M,A,N) 
##   Smoothing parameters:
##     alpha = 0.9990876 
##     beta  = 0.5011949 
## 
##   Initial states:
##          l[0]        b[0]
##  448093333334 64917355687
## 
##   sigma^2:  7e-04
## 
##      AIC     AICc      BIC 
## 3190.787 3191.941 3201.089
ets |>
  forecast(h=10) |>
  autoplot(global_economy)