#Load Packages
library(forecast)
library(ggplot2)
Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers. Explain the differences among these figures. Do they all indicate that the data are white noise? Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?
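The autocorrelation function (ACF) measures the correlation between a series and its own lagged values. For white noise the true autocorrelation is zero at every nonzero lag, so the sample ACF values should scatter randomly around zero.
The ACF estimates are most variable in the short sample (36 points) because sampling variability is larger when fewer observations are available. The autocorrelations differ between the figures because each is computed from a different random sample, so each sample ACF is only a noisy estimate of the true value of zero.
For the larger samples (360 and 1,000 points) the estimates settle closer to zero and stay within the confidence bounds, which is consistent with white noise. The 95% critical values are approximately $\pm 1.96/\sqrt{N}$, so they move closer to zero as $N$ grows.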
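To make the dependence on sample size concrete, the approximate critical values can be computed directly. This is a minimal sketch in base R; the variable names are illustrative and not part of the original analysis.

```r
# Approximate 95% critical values for a white-noise ACF: +/- 1.96 / sqrt(N)
n <- c(36, 360, 1000)
bounds <- qnorm(0.975) / sqrt(n)
names(bounds) <- paste0("N=", n)
print(round(bounds, 3))
```

The bounds shrink from about ±0.33 at N = 36 to about ±0.06 at N = 1000, which is why the same-sized spike can look unremarkable in the first plot and significant in the last.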
Simulate white noise series of different lengths, then plot their ACFs.
set.seed(123)
wn_36 <- ts(rnorm(36))
wn_360 <- ts(rnorm(360))
wn_1000 <- ts(rnorm(1000))
par(mfrow=c(1,3))
acf(wn_36, main="ACF for White Noise (N=36)")
acf(wn_360, main="ACF for White Noise (N=360)")
acf(wn_1000, main="ACF for White Noise (N=1000)")
par(mfrow=c(1,1))
Plot the daily closing prices for Amazon stock (from gafa_stock).
Plot the ACF and PACF.
Explain how these plots show that the series is non-stationary.
Apply differencing to achieve stationarity
#Analyzing Amazon Stock Data
Load the data, then plot the time series, its ACF and PACF, and the first-differenced series.
library(fpp3)
data("gafa_stock")
head(gafa_stock)
## # A tsibble: 6 x 8 [!]
## # Key: Symbol [1]
## Symbol Date Open High Low Close Adj_Close Volume
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AAPL 2014-01-02 79.4 79.6 78.9 79.0 67.0 58671200
## 2 AAPL 2014-01-03 79.0 79.1 77.2 77.3 65.5 98116900
## 3 AAPL 2014-01-06 76.8 78.1 76.2 77.7 65.9 103152700
## 4 AAPL 2014-01-07 77.8 78.0 76.8 77.1 65.4 79302300
## 5 AAPL 2014-01-08 77.0 77.9 77.0 77.6 65.8 64632400
## 6 AAPL 2014-01-09 78.1 78.1 76.5 76.6 65.0 69787200
amazon_stock <- gafa_stock %>% filter(Symbol == "AMZN")
autoplot(amazon_stock, Close) + ggtitle("Amazon Stock Prices")
amazon_stock %>%
ACF(Close) %>%
autoplot() + ggtitle("ACF of Amazon Stock Prices")
## Warning: Provided data has an irregular interval, results should be treated
## with caution. Computing ACF by observation.
amazon_stock %>%
PACF(Close) %>%
autoplot() + ggtitle("PACF of Amazon Stock Prices")
## Warning: Provided data has an irregular interval, results should be treated
## with caution. Computing ACF by observation.
amazon_diff <- amazon_stock %>% mutate(Diff_Close = difference(Close))
autoplot(amazon_diff, Diff_Close) + ggtitle("First Differenced Amazon Stock Prices")
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).
Stock prices are usually non-stationary: the level of the series wanders over time
(long-term trends) and its volatility changes.
Understanding ACF and PACF:
A slow, steady decline in the ACF suggests a unit root (non-stationarity).
A large PACF spike at lag 1 (close to 1) with little beyond it is consistent with a random walk, which first-order differencing removes.
Differencing and Transformation:
A Box-Cox transformation can stabilize the variance.
Differencing removes the trend and helps make the series stationary.
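The visual argument can be backed up with a unit-root test and an ACF of the differenced series. The sketch below uses the `unitroot_kpss` and `unitroot_ndiffs` features from feasts (loaded by fpp3). Re-indexing by trading day is an assumption made here to avoid the irregular-interval warning, and `trading_day` is an illustrative column name, not one from the original analysis.

```r
library(fpp3)

# Re-index by trading day so the tsibble has a regular interval
amazon_stock <- gafa_stock %>%
  filter(Symbol == "AMZN") %>%
  mutate(trading_day = row_number()) %>%
  update_tsibble(index = trading_day, regular = TRUE)

# KPSS test: a small p-value is evidence against stationarity
kpss_close <- amazon_stock %>% features(Close, unitroot_kpss)
print(kpss_close)

# Number of first differences suggested by repeated KPSS tests
nd_close <- amazon_stock %>% features(Close, unitroot_ndiffs)
print(nd_close)

# ACF of the differenced series; most spikes should now be small
amazon_stock %>%
  ACF(difference(Close)) %>%
  autoplot() + ggtitle("ACF of Differenced Amazon Stock Prices")
```

If the differenced series is close to stationary, its ACF should drop to near zero quickly instead of decaying slowly as the ACF of the raw prices does.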
#9.3
For the following series, determine an appropriate Box-Cox transformation and differencing order to achieve stationarity:
Turkish GDP (global_economy dataset)
Accommodation takings in Tasmania (aus_accommodation dataset)
Monthly souvenir sales
data("global_economy")
data("aus_accommodation")
head(global_economy)
## # A tsibble: 6 x 9 [1Y]
## # Key: Country [1]
## Country Code Year GDP Growth CPI Imports Exports Population
## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Afghanistan AFG 1960 537777811. NA NA 7.02 4.13 8996351
## 2 Afghanistan AFG 1961 548888896. NA NA 8.10 4.45 9166764
## 3 Afghanistan AFG 1962 546666678. NA NA 9.35 4.88 9345868
## 4 Afghanistan AFG 1963 751111191. NA NA 16.9 9.17 9533954
## 5 Afghanistan AFG 1964 800000044. NA NA 18.1 8.89 9731361
## 6 Afghanistan AFG 1965 1006666638. NA NA 21.4 11.3 9938414
head(aus_accommodation)
## # A tsibble: 6 x 5 [1Q]
## # Key: State [1]
## Date State Takings Occupancy CPI
## <qtr> <chr> <dbl> <dbl> <dbl>
## 1 1998 Q1 Australian Capital Territory 24.3 65 67
## 2 1998 Q2 Australian Capital Territory 22.3 59 67.4
## 3 1998 Q3 Australian Capital Territory 22.5 58 67.5
## 4 1998 Q4 Australian Capital Territory 24.4 59 67.8
## 5 1999 Q1 Australian Capital Territory 23.7 58 67.8
## 6 1999 Q2 Australian Capital Territory 25.4 61 68.1
turkish_gdp <- global_economy %>% filter(Country == "Turkey") %>% select(GDP)
lambda_gdp <- BoxCox.lambda(turkish_gdp$GDP)
turkish_gdp_trans <- BoxCox(turkish_gdp$GDP, lambda_gdp)
glimpse(turkish_gdp)
## Rows: 58
## Columns: 2
## $ GDP <dbl> 13995067818, 8022222222, 8922222222, 10355555556, 11177777778, 11…
## $ Year <dbl> 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970,…
glimpse(lambda_gdp)
## num 0.157
tas_accom <- aus_accommodation %>% filter(State == "Tasmania") %>% select(Takings)
lambda_accom <- BoxCox.lambda(tas_accom$Takings)
tas_accom_trans <- BoxCox(tas_accom$Takings, lambda_accom)
head(tas_accom)
## # A tsibble: 6 x 2 [1Q]
## Takings Date
## <dbl> <qtr>
## 1 28.7 1998 Q1
## 2 19.0 1998 Q2
## 3 16.1 1998 Q3
## 4 25.9 1998 Q4
## 5 28.4 1999 Q1
## 6 20.1 1999 Q2
# aus_retail has no "Souvenirs" industry, so filtering it returns zero rows.
# The monthly souvenir sales are in the fpp3 `souvenirs` tsibble (column Sales).
souvenir <- souvenirs
lambda_souvenir <- BoxCox.lambda(souvenir$Sales)
souvenir_trans <- BoxCox(souvenir$Sales, lambda_souvenir)
glimpse(souvenir_trans)
turkish_gdp_diff <- difference(turkish_gdp_trans)
tas_accom_diff <- difference(tas_accom_trans)
souvenir_diff <- difference(souvenir_trans)
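A Box-Cox transformation stabilizes the variance when it changes with the level of the series.
Differencing is applied when the ACF decays slowly, which indicates a trend or unit root. Seasonal series such as the quarterly takings and the monthly sales may also need a seasonal difference.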
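The differencing order can also be estimated rather than assumed. The sketch below is an alternative to the `BoxCox.lambda()` workflow above, not the method used in this report: it relies on the `guerrero`, `unitroot_nsdiffs` and `unitroot_ndiffs` features from feasts, and the object names are illustrative.

```r
library(fpp3)

# Turkish GDP (annual): Guerrero lambda, then number of first differences
turkey <- global_economy %>% filter(Country == "Turkey")
lambda_tur <- turkey %>% features(GDP, guerrero) %>% pull(lambda_guerrero)
nd_tur <- turkey %>% features(box_cox(GDP, lambda_tur), unitroot_ndiffs)
print(nd_tur)

# Tasmanian takings (quarterly): check for a seasonal difference first
tas <- aus_accommodation %>% filter(State == "Tasmania")
lambda_tas <- tas %>% features(Takings, guerrero) %>% pull(lambda_guerrero)
nsd_tas <- tas %>% features(box_cox(Takings, lambda_tas), unitroot_nsdiffs)
print(nsd_tas)

# Assuming one seasonal difference (lag 4) is applied, test whether a
# further first difference is still needed
nd_tas <- tas %>%
  features(difference(box_cox(Takings, lambda_tas), lag = 4), unitroot_ndiffs)
print(nd_tas)
```

The same pattern applies to the monthly souvenir sales, with a seasonal lag of 12 instead of 4.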
#9.5-9.8 ARIMA Model Simulations
Simulate an AR(1) process with $\phi_1 = 0.6$, $\sigma^2 = 1$.
Simulate an MA(1) process with $\theta_1 = 0.6$, $\sigma^2 = 1$.
Simulate an ARMA(1,1) process with $\phi_1 = 0.6$, $\theta_1 = 0.6$, $\sigma^2 = 1$.
Simulate an AR(2) process with $\phi_1 = -0.8$, $\phi_2 = 0.3$, $\sigma^2 = 1$.
set.seed(123)
y <- numeric(100)
e <- rnorm(100)
for(i in 2:100) y[i] <- 0.6*y[i-1] + e[i]
ar1_series <- ts(y)
autoplot(ar1_series) + ggtitle("AR(1) Simulation")
set.seed(123)
e <- rnorm(100)
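# MA(1): y[t] = e[t] + 0.6 * e[t-1], so the filter needs both coefficients (the first value is NA)
y <- stats::filter(e, filter=c(1, 0.6), method="convolution", sides=1)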
ma1_series <- ts(y)
autoplot(ma1_series) + ggtitle("MA(1) Simulation")
set.seed(123)
arma_series <- arima.sim(n=100, model=list(ar=0.6, ma=0.6))
autoplot(arma_series) + ggtitle("ARMA(1,1) Simulation")
set.seed(123)
ar2_series <- arima.sim(n = 100, model = list(ar = c(0.5, -0.3)), sd = 1)
autoplot(ar2_series) + ggtitle("AR(2) Simulation")
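As a sanity check on the AR(1) and MA(1) simulations, their sample ACFs can be compared with the theoretical autocorrelations. This is a minimal sketch using `ARMAacf()` from base R's stats package; the object names are illustrative.

```r
# Theoretical autocorrelations at lags 0 to 3
acf_ar1 <- ARMAacf(ar = 0.6, lag.max = 3)  # AR(1): 0.6^k, decays geometrically
acf_ma1 <- ARMAacf(ma = 0.6, lag.max = 3)  # MA(1): 0.6 / (1 + 0.6^2) at lag 1, then 0

print(round(acf_ar1, 3))
print(round(acf_ma1, 3))
```

The AR(1) values are 1, 0.6, 0.36 and 0.216, while the MA(1) process has a single nonzero autocorrelation of about 0.441 at lag 1. Running `acf()` on the simulated series should show roughly these shapes, up to sampling noise.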
This exercise underlines the importance of stationarity in time series modeling, especially for autoregressive processes. When simulating the AR(2) model in R, the stated parameters ($\phi_1 = -0.8$, $\phi_2 = 0.3$) did not satisfy the stationarity conditions, and arima.sim() stopped with the error “'ar' part of model is not stationary”.
For the process to be stationary, the roots of the characteristic equation
$$1 - \phi_1 z - \phi_2 z^2 = 0$$
must lie outside the unit circle. When this condition fails, the process is non-stationary and the simulated series can be unstable or explosive.
We therefore used arima.sim() with the parameters changed to $\phi_1 = 0.5$, $\phi_2 = -0.3$, which define a stationary AR(2) process. The simulated series fluctuates steadily around a constant level, which is consistent with stationarity.
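The root condition can be checked numerically before calling arima.sim(). The following is a minimal sketch with base R's `polyroot()`; the helper names `ar_root_moduli` and `is_stationary` are hypothetical and introduced only for illustration.

```r
# Moduli of the roots of 1 - phi1*z - phi2*z^2 = 0
# (polyroot() takes coefficients in increasing order of power)
ar_root_moduli <- function(phi) Mod(polyroot(c(1, -phi)))

# Stationary when every root lies outside the unit circle
is_stationary <- function(phi) all(ar_root_moduli(phi) > 1)

print(ar_root_moduli(c(-0.8, 0.3)))  # one root has modulus below 1
print(is_stationary(c(-0.8, 0.3)))   # FALSE
print(ar_root_moduli(c(0.5, -0.3)))  # both moduli are about 1.83
print(is_stationary(c(0.5, -0.3)))   # TRUE
```

For the original parameters the roots are approximately 3.59 and -0.93; the second has modulus below 1, which is what triggers the error. Equivalently, the AR(2) condition $\phi_2 - \phi_1 < 1$ fails because $0.3 + 0.8 = 1.1$.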
This highlights how important it is to check stationarity before using ARIMA models for forecasting. Without stationarity, the model's assumptions are violated and the forecasts are unreliable. Meaningful, interpretable time series analysis requires parameters that respect these theoretical constraints.