The Chapter 9 exercises focus on autocorrelation, stationarity, transformations, and ARIMA modeling.

#Load Packages

library(forecast)
## Warning: package 'forecast' was built under R version 4.4.3
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
library(ggplot2)

9.1 White Noise and ACF

Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers. Explain the differences among these figures. Do they all indicate that the data are white noise? Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

The autocorrelation function (ACF) measures the correlation between a series and its own lagged values. For white noise, the sample autocorrelations should fluctuate randomly around zero.

With only 36 observations, the ACF estimates vary more because sampling variability is large in short samples.

For the larger samples (360 and 1,000 points), the estimates are tighter and stay within the confidence bounds, which is consistent with white noise. As $N$ grows, the 95% bounds narrow because they are approximately $\pm 1.96/\sqrt{N}$. The autocorrelations differ between figures because each is computed from a different random sample, so the estimates scatter randomly around the true value of zero.
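To make the dependence of the bounds on sample size concrete, the approximate critical values can be computed directly. This is a minimal sketch using the normal approximation $\pm 1.96/\sqrt{N}$; the names `n_obs` and `acf_bounds` are chosen here for illustration.

```r
# Approximate 95% white-noise bounds for the sample ACF: +/- 1.96 / sqrt(N)
n_obs <- c(36, 360, 1000)
acf_bounds <- 1.96 / sqrt(n_obs)

# One row per sample size; bounds are about 0.327, 0.103 and 0.062
data.frame(N = n_obs, bound = round(acf_bounds, 3))
```

The bound for 36 observations is roughly five times wider than for 1,000, which is why the dashed lines sit at different distances from zero in the three panels.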

Generating White Noise and ACF Plots

Simulate white noise series of different lengths, then plot their ACFs.

set.seed(123)
wn_36 <- ts(rnorm(36))
wn_360 <- ts(rnorm(360))
wn_1000 <- ts(rnorm(1000))

par(mfrow=c(1,3))
acf(wn_36, main="ACF for White Noise (N=36)")
acf(wn_360, main="ACF for White Noise (N=360)")
acf(wn_1000, main="ACF for White Noise (N=1000)")

par(mfrow=c(1,1))
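Reading the plots by eye can be supplemented with a simple count of how many sample autocorrelations fall outside the approximate bounds. The sketch below assumes the $\pm 1.96/\sqrt{N}$ approximation and 20 lags; `share_outside` is a hypothetical helper, not part of any package.

```r
# Share of sample autocorrelations outside the approximate 95% bounds.
# `share_outside` is a hypothetical helper written for this illustration.
share_outside <- function(x, lag_max = 20) {
  r <- acf(x, lag.max = lag_max, plot = FALSE)$acf[-1]  # drop lag 0
  mean(abs(r) > 1.96 / sqrt(length(x)))
}

set.seed(123)
shares <- sapply(
  list(n36 = rnorm(36), n360 = rnorm(360), n1000 = rnorm(1000)),
  share_outside
)
shares
```

For white noise, about 5% of the spikes are expected to fall outside the bounds by chance, so a share near that level does not contradict the white noise hypothesis.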

9.2 Non-Stationarity in Stock Prices

Plot the daily closing prices for Amazon stock (from gafa_stock).

Plot the ACF and PACF.

Explain how these plots show that the series is non-stationary.

Apply differencing to achieve stationarity.

#Analyzing Amazon Stock Data

Load the data, then plot the time series, its ACF and PACF, and the first-differenced series.

library(fpp3)
## Registered S3 method overwritten by 'tsibble':
##   method               from 
##   as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.1 ──
## ✔ tibble      3.2.1     ✔ tsibble     1.1.6
## ✔ dplyr       1.1.4     ✔ tsibbledata 0.4.1
## ✔ tidyr       1.3.1     ✔ feasts      0.4.1
## ✔ lubridate   1.9.4     ✔ fable       0.4.1
## Warning: package 'dplyr' was built under R version 4.4.3
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date()    masks base::date()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval()  masks lubridate::interval()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ tsibble::setdiff()   masks base::setdiff()
## ✖ tsibble::union()     masks base::union()
data("gafa_stock")
head(gafa_stock)
## # A tsibble: 6 x 8 [!]
## # Key:       Symbol [1]
##   Symbol Date        Open  High   Low Close Adj_Close    Volume
##   <chr>  <date>     <dbl> <dbl> <dbl> <dbl>     <dbl>     <dbl>
## 1 AAPL   2014-01-02  79.4  79.6  78.9  79.0      67.0  58671200
## 2 AAPL   2014-01-03  79.0  79.1  77.2  77.3      65.5  98116900
## 3 AAPL   2014-01-06  76.8  78.1  76.2  77.7      65.9 103152700
## 4 AAPL   2014-01-07  77.8  78.0  76.8  77.1      65.4  79302300
## 5 AAPL   2014-01-08  77.0  77.9  77.0  77.6      65.8  64632400
## 6 AAPL   2014-01-09  78.1  78.1  76.5  76.6      65.0  69787200
amazon_stock <- gafa_stock %>% filter(Symbol == "AMZN")
autoplot(amazon_stock, Close) + ggtitle("Amazon Stock Prices")

amazon_stock %>%
  ACF(Close) %>%
  autoplot() + ggtitle("ACF of Amazon Stock Prices")
## Warning: Provided data has an irregular interval, results should be treated
## with caution. Computing ACF by observation.

amazon_stock %>%
  PACF(Close) %>%
  autoplot() + ggtitle("PACF of Amazon Stock Prices")
## Warning: Provided data has an irregular interval, results should be treated
## with caution. Computing ACF by observation.

amazon_diff <- amazon_stock %>% mutate(Diff_Close = difference(Close))

autoplot(amazon_diff, Diff_Close) + ggtitle("First Differenced Amazon Stock Prices")
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
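The irregular-interval warnings above arise because trading days skip weekends and holidays. One common workaround is to re-index the series by trading day so that the interval is regular. The sketch below assumes this is acceptable for the analysis; `trading_day` and `amazon_regular` are names chosen here.

```r
library(fpp3)

# Re-index by trading day so the tsibble has a regular interval.
# `trading_day` and `amazon_regular` are illustrative names.
amazon_regular <- gafa_stock %>%
  filter(Symbol == "AMZN") %>%
  mutate(trading_day = row_number()) %>%
  update_tsibble(index = trading_day, regular = TRUE)

# ACF of the first-differenced closing price on the regular index
amazon_regular %>%
  ACF(difference(Close)) %>%
  autoplot() + ggtitle("ACF of Differenced Amazon Stock Prices")
```

If the differenced series is close to white noise, most spikes in this plot should lie within the bounds.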

Stock prices are usually non-stationary: the level wanders with long-run trends and the variability changes over time.

Understanding ACF and PACF:

A slow, steady decay in the ACF suggests a unit root (non-stationarity).

A large PACF spike at lag 1 (close to 1), with little beyond it, suggests that first-order differencing is needed.

Differencing and Transformation:

A Box-Cox transformation can stabilize the variance.

Differencing removes the trend in the level, which helps make the series stationary.
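The link between a unit root, a slowly decaying ACF, and differencing can be illustrated with a simulated random walk. This is a sketch on synthetic data, not the Amazon series; `random_walk`, `increments`, and `lag1` are names introduced here.

```r
set.seed(42)
random_walk <- cumsum(rnorm(1000))  # simulated unit-root series (not real stock data)
increments  <- diff(random_walk)    # first difference recovers the white noise steps

# Lag-1 sample autocorrelation (element 1 of $acf is lag 0)
lag1 <- function(x) acf(x, lag.max = 1, plot = FALSE)$acf[2]

c(level = lag1(random_walk), differenced = lag1(increments))
```

The lag-1 autocorrelation of the level is close to 1, while that of the differenced series is close to 0, mirroring what the ACF plots show before and after differencing.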

#9.3

For the following series, determine an appropriate Box-Cox transformation and differencing order to achieve stationarity:

Turkish GDP (global_economy dataset)

Accommodation takings in Tasmania (aus_accommodation dataset)

Monthly souvenir sales (souvenirs dataset)

Box-Cox and differencing

data("global_economy")
data("aus_accommodation")


head(global_economy)
## # A tsibble: 6 x 9 [1Y]
## # Key:       Country [1]
##   Country     Code   Year         GDP Growth   CPI Imports Exports Population
##   <fct>       <fct> <dbl>       <dbl>  <dbl> <dbl>   <dbl>   <dbl>      <dbl>
## 1 Afghanistan AFG    1960  537777811.     NA    NA    7.02    4.13    8996351
## 2 Afghanistan AFG    1961  548888896.     NA    NA    8.10    4.45    9166764
## 3 Afghanistan AFG    1962  546666678.     NA    NA    9.35    4.88    9345868
## 4 Afghanistan AFG    1963  751111191.     NA    NA   16.9     9.17    9533954
## 5 Afghanistan AFG    1964  800000044.     NA    NA   18.1     8.89    9731361
## 6 Afghanistan AFG    1965 1006666638.     NA    NA   21.4    11.3     9938414
head(aus_accommodation)
## # A tsibble: 6 x 5 [1Q]
## # Key:       State [1]
##      Date State                        Takings Occupancy   CPI
##     <qtr> <chr>                          <dbl>     <dbl> <dbl>
## 1 1998 Q1 Australian Capital Territory    24.3        65  67  
## 2 1998 Q2 Australian Capital Territory    22.3        59  67.4
## 3 1998 Q3 Australian Capital Territory    22.5        58  67.5
## 4 1998 Q4 Australian Capital Territory    24.4        59  67.8
## 5 1999 Q1 Australian Capital Territory    23.7        58  67.8
## 6 1999 Q2 Australian Capital Territory    25.4        61  68.1
turkish_gdp <- global_economy %>% filter(Country == "Turkey") %>% select(GDP)
lambda_gdp <- BoxCox.lambda(turkish_gdp$GDP)
turkish_gdp_trans <- BoxCox(turkish_gdp$GDP, lambda_gdp)
glimpse(turkish_gdp)
## Rows: 58
## Columns: 2
## $ GDP  <dbl> 13995067818, 8022222222, 8922222222, 10355555556, 11177777778, 11…
## $ Year <dbl> 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970,…
glimpse(lambda_gdp)
##  num 0.157
tas_accom <- aus_accommodation %>% filter(State == "Tasmania") %>% select(Takings)
lambda_accom <- BoxCox.lambda(tas_accom$Takings)
tas_accom_trans <- BoxCox(tas_accom$Takings, lambda_accom)

head(tas_accom)
## # A tsibble: 6 x 2 [1Q]
##   Takings    Date
##     <dbl>   <qtr>
## 1    28.7 1998 Q1
## 2    19.0 1998 Q2
## 3    16.1 1998 Q3
## 4    25.9 1998 Q4
## 5    28.4 1999 Q1
## 6    20.1 1999 Q2
# aus_retail has no "Souvenirs" industry, so filtering it returns zero rows.
# The monthly souvenir sales are in the `souvenirs` tsibble from fpp3 (column `Sales`).
souvenir <- souvenirs
lambda_souvenir <- BoxCox.lambda(souvenir$Sales)
souvenir_trans <- BoxCox(souvenir$Sales, lambda_souvenir)
glimpse(souvenir_trans)
turkish_gdp_diff <- difference(turkish_gdp_trans)
tas_accom_diff <- difference(tas_accom_trans)
souvenir_diff <- difference(souvenir_trans)

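The Box-Cox transformation stabilizes the variance.

Differencing is applied when the ACF decays slowly, which indicates persistent long-run correlation. For the seasonal series (quarterly Tasmanian takings and monthly souvenir sales), a seasonal difference may also be needed.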
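The differencing order can also be suggested by unit root tests instead of being read off the ACF. The sketch below uses the `guerrero`, `unitroot_nsdiffs`, and `unitroot_ndiffs` features from feasts (loaded by fpp3) on the Tasmanian takings; `tas`, `lambda_tas`, and `bc_takings` are names chosen for this example, and the second test assumes one seasonal difference is appropriate.

```r
library(fpp3)

tas <- aus_accommodation %>% filter(State == "Tasmania")

# Box-Cox parameter chosen by Guerrero's method
lambda_tas <- tas %>%
  features(Takings, features = guerrero) %>%
  pull(lambda_guerrero)

tas <- tas %>% mutate(bc_takings = box_cox(Takings, lambda_tas))

# Suggested number of seasonal differences (quarterly data)
tas %>% features(bc_takings, unitroot_nsdiffs)

# Suggested number of further first differences,
# assuming one seasonal difference (lag 4) has been taken
tas %>% features(difference(bc_takings, 4), unitroot_ndiffs)
```

The same pattern applies to the other series, with lag 12 for the monthly souvenir sales and no seasonal step for the annual Turkish GDP.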

#9.5-9.8 ARIMA Model Simulations

Simulate an AR(1) process with $\phi_1 = 0.6$, $\sigma^2 = 1$.

Simulate an MA(1) process with $\theta_1 = 0.6$, $\sigma^2 = 1$.

Simulate an ARMA(1,1) process with $\phi_1 = 0.6$, $\theta_1 = 0.6$, $\sigma^2 = 1$.

Simulate an AR(2) process with $\phi_1 = -0.8$, $\phi_2 = 0.3$, $\sigma^2 = 1$.

set.seed(123)
y <- numeric(100)
e <- rnorm(100)
for(i in 2:100) y[i] <- 0.6*y[i-1] + e[i]
ar1_series <- ts(y)
autoplot(ar1_series) + ggtitle("AR(1) Simulation")

set.seed(123)
e <- rnorm(100)
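# MA(1): y[t] = e[t] + 0.6 * e[t-1]; filter = 0.6 alone would only rescale e.
# The first value is NA because e[0] is not available.
y <- stats::filter(e, filter = c(1, 0.6), method = "convolution", sides = 1)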

ma1_series <- ts(y)

autoplot(ma1_series) + ggtitle("MA(1) Simulation")

set.seed(123)
arma_series <- arima.sim(n=100, model=list(ar=0.6, ma=0.6))
autoplot(arma_series) + ggtitle("ARMA(1,1) Simulation")

set.seed(123)
ar2_series <- arima.sim(n = 100, model = list(ar = c(0.5, -0.3)), sd = 1)

autoplot(ar2_series) + ggtitle("AR(2) Simulation")
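The simulated series can be compared against the theoretical autocorrelations of the AR(1) and MA(1) models, which `ARMAacf()` from the stats package returns. This is a small check under the parameter values stated in the exercise; `ar1_acf` and `ma1_acf` are names introduced here.

```r
# Theoretical ACF at lags 0 to 3
ar1_acf <- ARMAacf(ar = 0.6, lag.max = 3)  # AR(1): rho_k = 0.6^k
ma1_acf <- ARMAacf(ma = 0.6, lag.max = 3)  # MA(1): rho_1 = 0.6 / (1 + 0.6^2), then 0

round(ar1_acf, 3)  # 1, 0.6, 0.36, 0.216
round(ma1_acf, 3)  # 1, 0.441, 0, 0
```

The AR(1) autocorrelations decay geometrically, while the MA(1) autocorrelations cut off after lag 1; the sample ACFs of the simulated series should show roughly these shapes.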

Conclusion

This exercise underlines the importance of stationarity in time series modeling, especially for autoregressive processes. When simulating the AR(2) model in R, the stated parameters ($\phi_1 = -0.8$, $\phi_2 = 0.3$) did not satisfy the stationarity conditions, and arima.sim() stopped with the error "'ar' part of model is not stationary".

For an AR(2) process to be stationary, the roots of the characteristic equation

$1 - \phi_1 z - \phi_2 z^2 = 0$

must lie outside the unit circle. When this condition fails, the process is non-stationary and the simulated series can be unstable or explosive.
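The root condition can be checked numerically with `polyroot()` from base R. The sketch below is one way to do this; `ar_root_moduli` and `is_stationary_ar` are hypothetical helper names, not functions from any package.

```r
# Moduli of the roots of 1 - phi_1 z - ... - phi_p z^p = 0.
# `ar_root_moduli` and `is_stationary_ar` are hypothetical helpers.
ar_root_moduli   <- function(phi) Mod(polyroot(c(1, -phi)))
is_stationary_ar <- function(phi) all(ar_root_moduli(phi) > 1)

ar_root_moduli(c(-0.8, 0.3))    # one root has modulus below 1
is_stationary_ar(c(-0.8, 0.3))  # FALSE: the exercise's parameters
is_stationary_ar(c(0.5, -0.3))  # TRUE: the parameters used in the simulation
```

Equivalently, an AR(2) process is stationary when $-1 < \phi_2 < 1$, $\phi_1 + \phi_2 < 1$, and $\phi_2 - \phi_1 < 1$; the exercise's parameters give $\phi_2 - \phi_1 = 1.1$, which violates the last condition.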

We therefore changed the parameters to $\phi_1 = 0.5$, $\phi_2 = -0.3$ and used arima.sim() to generate a stationary AR(2) process. The resulting series fluctuates steadily around a constant level, which is consistent with stationarity.

This highlights the importance of checking stationarity before using ARIMA models for forecasting. Without stationarity, the model's assumptions are violated and the forecasts are unreliable. A meaningful, interpretable time series analysis requires parameters that respect these theoretical constraints.