Joel Park
8.1 Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1000 random numbers.
A. Explain the differences among these figures. Do they all indicate that the data are white noise?
In this exercise, the three figures are all ACF plots, or plots of autocorrelations. While every time series shows some fluctuations, these could be due either to chance (irreducible error) or to an underlying trend, cycle, or seasonal component that we have not yet identified. If the latter were true, the ACF plots would show multiple autocorrelations that are statistically significant, indicating that some sort of relationship likely exists. In this exercise, however, none of the three ACFs shows any statistically significant spikes; therefore, all three series are consistent with white noise (as defined above). (The dashed blue lines represent the 95% critical values.)
While all of these plots depict white noise, the key difference among them is the width of the critical-value bounds. According to Wikipedia, “the standard deviation is a measure that is used to quantify the amount of variation or dispersion of a set of data values.” The formula for the corrected sample standard deviation is:
\[s_N = \sqrt{\frac{1}{N-1}\Sigma_{i=1}^{N}(x_i-\bar{x})^2}\]
As the sample size increases, the standard error of the estimated autocorrelations becomes smaller, and hence so do the confidence intervals (or critical values), which we will discuss further in part B of this question. Though the mathematics of the standard deviation is somewhat different from that of the ACF critical values, the overall concept remains the same.
Given that the right ACF plot is for a white noise series of 1000 numbers, its bounds are, not surprisingly, narrower than those of the plot on the left, which uses only 36 numbers.
B. Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?
According to StackExchange, under the null hypothesis we assume that no autocorrelation exists, which is why the mean is taken to be zero. From there, we define the critical values for what would be expected from random noise distributed with variance \(\sigma^2\) around that mean. Anything beyond \(1.96\sigma\), i.e. beyond the 5% critical value, would be deemed significant.
The formula for calculating the 5% critical values is:
\[\pm \frac{1.96}{\sqrt{T-d}}\]
where T is the sample size and d is the lag. For example, with a sample size of 1000, the critical values at lag 1 are \(\pm \frac{1.96}{\sqrt{1000-1}}\), approximately ±0.06201; for a sample size of 36 (and lag 1), they are approximately ±0.3313. (This is demonstrated in the graphs.) This explains why the critical values are different for each plot.
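As a quick sanity check of the formula above, here is a small sketch in R (the helper name crit_value is just for illustration) that computes the 5% critical values for T = 1000 and T = 36 at lag 1:
crit_value <- function(n, d = 1) 1.96 / sqrt(n - d)   # n is the sample size T in the formula
c(T_1000 = crit_value(1000), T_36 = crit_value(36))   # approximately 0.062 and 0.331, as noted above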
The autocorrelations will differ depending on the length of the time series. However, as long as the autocorrelations are below the critical values threshold, they would continue to be considered as white noise.
Formula for autocorrelation from section 2.8:
\[r_k = \frac{\Sigma_{t=k+1}^{T}(y_t-\bar{y})(y_{t-k}-\bar{y})}{\Sigma_{t=1}^{T}(y_t-\bar{y})^2}\]
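To connect this formula with R's built-in acf() function, here is a minimal sketch using an arbitrary simulated series; the hand-computed lag-1 autocorrelation should agree with the value acf() reports:
set.seed(1)
x <- rnorm(36)
xbar <- mean(x)
# numerator: sum over t = 2..T of (x_t - xbar)(x_{t-1} - xbar); denominator: sum of (x_t - xbar)^2
r1 <- sum((x[-1] - xbar) * (x[-length(x)] - xbar)) / sum((x - xbar)^2)
c(by_hand = r1, from_acf = acf(x, plot = FALSE)$acf[2])  # element 1 of $acf is lag 0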
8.2 A classic example of a non-stationary series is the daily closing IBM stock price series (data set ibmclose). Use R to plot the daily closing prices for IBM stock and the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.
library(fpp2)
## Loading required package: ggplot2
## Loading required package: forecast
## Loading required package: fma
## Loading required package: expsmooth
data(ibmclose)
head(ibmclose)
## Time Series:
## Start = 1
## End = 6
## Frequency = 1
## [1] 460 457 452 459 462 459
str(ibmclose)
## Time-Series [1:369] from 1 to 369: 460 457 452 459 462 459 463 479 493 490 ...
Plotting the time series, ACF, and PACF.
ibmclose %>% ggtsdisplay(main="")
According to section 8.1, “Stationarity and Differencing”, a stationary time series is one whose properties do not depend on the time at which the series is observed. Thus, time series with trends, or with seasonality, are not stationary - the trend and seasonality will affect the value of the time series at different times.
From the top plot of the time series, there appears to be a downward trend in the dataset, therefore making this non-stationary.
According to r-statistics.co, “a stationary time series will have the autocorrelation fall to zero fairly quickly but for a non-stationary series it drops gradually”. As evidenced from the bottom left ACF plot, given the gradual fall, this suggests a non-stationary series.
Partial correlation is the correlation of the time series with a lag of itself, with the linear dependence of all the lags between them removed. Duke University does a great job explaining partial autocorrelation: “Partial autocorrelation is the amount of correlation between a variable and a lag of itself that is not explained by correlations at all lower-order lags. The autocorrelation of a time series \(Y\) at lag 1 is the coefficient of correlation between \(Y_t\) and \(Y_{t-1}\), which is presumably also the correlation between \(Y_{t-1}\) and \(Y_{t-2}\). But if \(Y_t\) is correlated with \(Y_{t-1}\), and \(Y_{t-1}\) is equally correlated with \(Y_{t-2}\), then we should also expect to find correlation between \(Y_t\) and \(Y_{t-2}\). In fact, the amount of correlation we should expect at lag 2 is precisely the square of the lag-1 correlation. Thus, the correlation at lag 1 ‘propagates’ to lag 2 and presumably to higher-order lags. The partial autocorrelation at lag 2 is therefore the difference between the actual correlation at lag 2 and the expected correlation due to the propagation of correlation at lag 1.”
Therefore, in the bottom-right PACF plot: the ACF shows significant autocorrelations for a large number of lags, but the PACF suggests that the autocorrelations at lags 2 and above are due to the propagation of the autocorrelation at lag 1. Given that the PACF has a single significant spike at lag 1, all of the higher-order autocorrelations are effectively explained by the lag-1 autocorrelation.
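As a small numerical illustration of this idea (a sketch using only base R and the ibmclose series loaded above), the lag-2 partial autocorrelation can be recovered from the ordinary autocorrelations as \((r_2 - r_1^2)/(1 - r_1^2)\), i.e. the lag-2 correlation with the part propagated from lag 1 removed; it should agree with what pacf() reports:
r <- acf(ibmclose, plot = FALSE)$acf
r1 <- r[2]; r2 <- r[3]  # r[1] is lag 0
c(by_hand = (r2 - r1^2) / (1 - r1^2),
  pacf_lag2 = pacf(ibmclose, plot = FALSE)$acf[2])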
Knowing all of this information, let us use differencing to render this time series stationary.
Using the library urca, we can apply a unit root test to determine if differencing is required. Specifically, we will be using the KPSS test.
library(urca)
ibmclose %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 3.6236
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
ndiffs(ibmclose)
## [1] 1
The value of the test statistic is 3.6236, which exceeds the 1% critical value of 0.739, so the series is not stationary and differencing is required. According to ndiffs(), we will likely only need to perform one level of differencing.
ibmclose %>% diff() %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.4702
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
ndiffs(ibmclose %>% diff())
## [1] 0
autoplot(ibmclose %>% diff(), main="1-Diff IBM Close",ylab="Closing Price Difference 1")
As you can see, the time series is now stationary (and requires no further differencing).
8.3 For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.
A. usnetelec
head(usnetelec)
## Time Series:
## Start = 1949
## End = 1954
## Frequency = 1
## [1] 296.1 334.1 375.3 403.8 447.0 476.3
lambda <- BoxCox.lambda(usnetelec)
paste0("Usnetelec BoxCox Lambda: ", round(lambda,3))
## [1] "Usnetelec BoxCox Lambda: 0.517"
transformed_ts <- usnetelec %>% BoxCox(lambda=lambda)
head(transformed_ts)
## Time Series:
## Start = 1949
## End = 1954
## Frequency = 1
## [1] 34.69772 37.05630 39.47125 41.06742 43.38646 44.89810
## attr(,"lambda")
## [1] 0.5167714
The best transformation here is roughly a square root transformation (lambda ≈ 0.5). Let us visualize the transformation and see how it stabilizes the variance of the time series.
# par(mfrow) does not affect ggplot2-based autoplot() output; stack the two plots with gridExtra (assumed installed)
gridExtra::grid.arrange(autoplot(usnetelec), autoplot(transformed_ts), nrow = 2)
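Since the estimated lambda of about 0.52 is close to 0.5, the Box-Cox transform used here is essentially a shifted, scaled square root; a quick sketch (relying on the forecast package's definition of BoxCox for non-zero lambda) illustrates this:
head(BoxCox(usnetelec, lambda = 0.5))
head((sqrt(usnetelec) - 1) / 0.5)  # identical to BoxCox with lambda = 0.5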
While I do not notice seasonality, I certainly do see a trend. This most certainly requires differencing. Let us be more objective about this by using KPSS.
transformed_ts %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 3 lags.
##
## Value of test-statistic is: 1.4583
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
ndiffs(transformed_ts)
## [1] 2
This will require 2 levels of differencing. Let us go ahead and perform this.
two_dff <- transformed_ts %>% diff() %>% diff()
two_dff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 3 lags.
##
## Value of test-statistic is: 0.072
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
ndiffs(two_dff)
## [1] 0
autoplot(two_dff, main="Two Difference")
After performing a BoxCox transformation (square root) and differencing twice, we have successfully achieved a stationary time series.
B. usgdp
This time series is the quarterly US GDP from 1947 to 2006.
head(usgdp)
## Qtr1 Qtr2 Qtr3 Qtr4
## 1947 1570.5 1568.7 1568.0 1590.9
## 1948 1616.1 1644.6
lambda <- BoxCox.lambda(usgdp)
paste0("Usgdp BoxCox Lambda: ", round(lambda,3))
## [1] "Usgdp BoxCox Lambda: 0.366"
transformed_ts <- usgdp %>% BoxCox(lambda=lambda)
head(transformed_ts)
## Qtr1 Qtr2 Qtr3 Qtr4
## 1947 37.72578 37.70879 37.70217 37.91751
## 1948 38.15221 38.41487
The optimal Box-Cox transformation is roughly a cube root (lambda ≈ 1/3). Let us visualize the pre-transformed time series vs. the transformed time series and see if the variance has been stabilized.
# par(mfrow) does not affect ggplot2-based autoplot() output; stack the two plots with gridExtra (assumed installed)
gridExtra::grid.arrange(autoplot(usgdp), autoplot(transformed_ts), nrow = 2)
The original time series appears to have a slight convexity with an upward trend. There is no apparent seasonal component. The transformation has successfully improved the variance, and the transformed series even appears close to linear.
Given that there is a trend, the series is by definition not stationary. Let us perform the KPSS test (ur.kpss()) to determine how many levels of differencing need to take place.
transformed_ts %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 4 lags.
##
## Value of test-statistic is: 4.8114
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
ndiffs(transformed_ts)
## [1] 1
The test-statistic value of 4.8114 exceeds the 1 percent critical value of 0.739. The ndiffs() function suggests that we need to perform one level of differencing.
one_dff <- transformed_ts %>% diff()
one_dff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 4 lags.
##
## Value of test-statistic is: 0.2013
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
ndiffs(one_dff)
## [1] 0
autoplot(one_dff, main="One Difference")
The time series is now stationary.
C. mcopper
The mcopper time series contains monthly copper prices.
head(mcopper)
## Jan Feb Mar Apr May Jun
## 1960 255.2 259.7 249.3 258.0 244.3 246.8
lambda <- BoxCox.lambda(mcopper)
paste0("Mcopper BoxCox Lambda: ", round(lambda,3))
## [1] "Mcopper BoxCox Lambda: 0.192"
transformed_ts <- mcopper %>% BoxCox(lambda=lambda)
head(transformed_ts)
## Jan Feb Mar Apr May Jun
## 1960 9.883142 9.933858 9.815540 9.914783 9.757230 9.786504
The Box-Cox lambda of about 0.19 corresponds roughly to a fifth-root (power 1/5) transformation. Like the prior exercises, let us visualize the pre- and post-transformed time series.
# par(mfrow) does not affect ggplot2-based autoplot() output; stack the two plots with gridExtra (assumed installed)
gridExtra::grid.arrange(autoplot(mcopper), autoplot(transformed_ts), nrow = 2)
The transformation does appear to stabilize the variance (the pre-transformed time series had a wider variance). There is one obvious component, an upward trend. Now, given that this is monthly data, is there a seasonal component? It is difficult to say at this moment, but we can take a look at ggsubseriesplot() to see if there is any obvious seasonal pattern.
ggsubseriesplot(mcopper)
ggsubseriesplot(transformed_ts)
There does not appear to be a seasonal component in this time series.
Regarding differencing, let us determine this using the ur.kpss(), ndiffs(), and nsdiffs() functions.
transformed_ts %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 6 lags.
##
## Value of test-statistic is: 6.2659
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
paste0("Number of Differences needed: ", ndiffs(transformed_ts))
## [1] "Number of Differences needed: 1"
paste0("Number of Seasonal Differences needed: ", nsdiffs(transformed_ts))
## [1] "Number of Seasonal Differences needed: 0"
The test-statistic value of 6.2659 exceeds the 1 percent critical value. According to the ndiffs() and nsdiffs() functions, we only need one level of (non-seasonal) differencing to create a stationary time series.
one_dff <- transformed_ts %>% diff()
one_dff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 6 lags.
##
## Value of test-statistic is: 0.0573
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
ndiffs(one_dff)
## [1] 0
autoplot(one_dff, main="One Difference")
We have successfully created a stationary time series.
D. enplanements
This time series is the monthly US domestic revenue enplanements (in millions); the data begin in 1979.
head(enplanements)
## Jan Feb Mar Apr May Jun
## 1979 21.12 22.92 25.90 24.38 23.41 26.82
lambda <- BoxCox.lambda(enplanements)
paste0("Enplanements BoxCox Lambda: ", round(lambda,3))
## [1] "Enplanements BoxCox Lambda: -0.227"
transformed_ts <- enplanements %>% BoxCox(lambda=lambda)
head(transformed_ts)
## Jan Feb Mar Apr May Jun
## 1979 2.201157 2.241712 2.300934 2.271836 2.252078 2.317546
The Box-Cox lambda of about -0.23 corresponds roughly to an inverse fourth-root (power -1/4) transformation. Let us visualize the pre- and post-transformation series.
# par(mfrow) does not affect ggplot2-based autoplot() output; stack the two plots with gridExtra (assumed installed)
gridExtra::grid.arrange(autoplot(enplanements), autoplot(transformed_ts), nrow = 2)
The variance appears more stabilized after the transformation.
There appears to be an upward trend as well as a seasonal component. Let us verify if there’s a seasonal component.
ggsubseriesplot(enplanements)
ggseasonplot(enplanements)
It appears that enplanements are higher during the summer months. I suspect that we will need to perform a seasonal difference first (at lag 12, i.e. 12 months). Let us confirm with the nsdiffs() and ndiffs() functions.
transformed_ts %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 4.3785
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
paste0("Number of Differences needed: ", ndiffs(transformed_ts))
## [1] "Number of Differences needed: 1"
paste0("Number of Seasonal Differences needed: ", nsdiffs(transformed_ts))
## [1] "Number of Seasonal Differences needed: 1"
Just as we suspected. We will perform the seasonal difference first. (According to the textbook, many times we can get away with just a seasonal difference first, and then determine whether or not we need to perform additional differencing.)
seasonal_dff <- transformed_ts %>% diff(lag=12)
seasonal_dff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.3617
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
paste0("ndiffs: ", ndiffs(seasonal_dff))
## [1] "ndiffs: 1"
paste0("nsdiffs: ", nsdiffs(seasonal_dff))
## [1] "nsdiffs: 0"
autoplot(seasonal_dff, main="Seasonal Difference")
Despite accounting for the seasonal difference, we still need to take one more level of differencing.
one_dff <- seasonal_dff %>% diff()
one_dff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.0424
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
ndiffs(one_dff)
## [1] 0
autoplot(one_dff, main="One Difference After Seasonal Differencing")
Though there is an outlier around the year 2003-2004, the time series now appears to be stationary.
E. visitors
This time series is the monthly Australian short-term overseas visitors.
head(visitors)
## May Jun Jul Aug Sep Oct
## 1985 75.7 75.4 83.1 82.9 77.3 105.7
lambda <- BoxCox.lambda(visitors)
paste0("Visitors BoxCox Lambda: ", round(lambda,3))
## [1] "Visitors BoxCox Lambda: 0.278"
transformed_ts <- visitors %>% BoxCox(lambda=lambda)
head(transformed_ts)
## May Jun Jul Aug Sep Oct
## 1985 8.369471 8.356284 8.683416 8.675203 8.439170 9.531695
The Box-Cox lambda of about 0.28 corresponds roughly to a fourth-root (power 1/4) transformation.
Visualization:
# par(mfrow) does not affect ggplot2-based autoplot() output; stack the two plots with gridExtra (assumed installed)
gridExtra::grid.arrange(autoplot(visitors), autoplot(transformed_ts), nrow = 2)
The variance has improved dramatically with the transformation. There certainly is an upward trend and likely a seasonal component. Let’s verify the seasonal component.
ggsubseriesplot(visitors)
ggseasonplot(visitors)
Again, like the previous example, I suspect that there is a yearly seasonal component, with the most overseas visitors arriving in December, which is summer in Australia.
transformed_ts %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 4 lags.
##
## Value of test-statistic is: 4.5233
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
paste0("Number of Differences needed: ", ndiffs(transformed_ts))
## [1] "Number of Differences needed: 1"
paste0("Number of Seasonal Differences needed: ", nsdiffs(transformed_ts))
## [1] "Number of Seasonal Differences needed: 1"
As we suspected, this will require a seasonal differencing and one level differencing.
visitor_dff <- visitors %>% diff(lag=12) %>% diff()
visitor_dff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 4 lags.
##
## Value of test-statistic is: 0.0154
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
paste0("ndiffs: ", ndiffs(visitor_dff))
## [1] "ndiffs: 0"
paste0("nsdiffs: ", nsdiffs(visitor_dff))
## [1] "nsdiffs: 0"
autoplot(one_dff, main="One Difference After Seasonal Differencing")
The time series is now stationary.
8.5. For your retail data (from Exercise 3 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
# Category: Turnover ; Western Australia ; Furniture, floor coverings, houseware and textile goods retailing - this was from my previous three homework assignments.
myts <- ts(retaildata[,"A3349661X"],
frequency=12, start=c(1982,4))
Like question 8.3, we will be going through the same pipeline to find the appropriate order of differencing.
autoplot(myts)
As we can see, the variance increases with time, therefore, we will need to perform a Box Cox transformation.
head(myts)
## Apr May Jun Jul Aug Sep
## 1982 19.2 21.9 19.9 19.3 19.6 19.9
lambda <- BoxCox.lambda(myts)
paste0("Myts BoxCox Lambda: ", round(lambda,3))
## [1] "Myts BoxCox Lambda: -0.049"
transformed_ts <- myts %>% BoxCox(lambda=lambda)
head(transformed_ts)
## Apr May Jun Jul Aug Sep
## 1982 2.749405 2.862745 2.780324 2.753894 2.767215 2.780324
With a lambda near zero, this is effectively a logarithmic transformation. There appears to be an upward trend and possibly a seasonal component. Let us confirm this.
ggsubseriesplot(myts)
ggseasonplot(myts)
There does not appear to be a strong seasonal component. Let us evaluate how many differences need to be performed.
transformed_ts %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 6.2162
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
paste0("Number of Differences needed: ", ndiffs(transformed_ts))
## [1] "Number of Differences needed: 1"
paste0("Number of Seasonal Differences needed: ", nsdiffs(transformed_ts))
## [1] "Number of Seasonal Differences needed: 0"
It appears that one level of differencing needs to take place.
one_dff <- myts %>% diff()
one_dff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.1557
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
paste0("ndiffs: ", ndiffs(one_dff))
## [1] "ndiffs: 0"
paste0("nsdiffs: ", nsdiffs(one_dff))
## [1] "nsdiffs: 0"
autoplot(one_dff, main="One Difference")
The myts retail time series is now stationary.
8.6 Use R to simulate and plot some data from simple ARIMA models.
A. Use the following R code to generate data from an AR(1) model with \(\phi_1 = 0.6\) and \(\sigma^2 = 1\). The process starts with \(y_1 = 0\).
y <- ts(numeric(100))
e <- rnorm(100)
for (i in 2:100)
y[i] <- 0.6*y[i-1] + e[i]
autoplot(y)
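As a cross-check of the loop above, a similar AR(1) series can be generated with the built-in arima.sim() (a sketch only: arima.sim uses its own random start-up rather than fixing \(y_1 = 0\), so the exact values will differ):
set.seed(42)
ar1_sim <- arima.sim(model = list(ar = 0.6), n = 100, sd = 1)  # sd = 1 matches sigma^2 = 1
autoplot(ar1_sim) + ylab("AR(1) via arima.sim")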
B. Produce a time plot for the series. How does the plot change as you change \(\phi_1\)?
Let us create a range of phi going from -1 to 1 with steps of 0.4. As noted in section 8.3, “Autoregressive Models”, “we normally restrict autoregressive models to stationary data, in which case some constraints on the values of the parameters are required.” For an AR(1) model: \(-1 < \phi_1 < 1\).
set.seed(42)
phi <- seq(-1,1,.4)
plots_AR <- list()
e <- rnorm(100)
counter <- 1
for (phi_i in phi){
y <- ts(numeric(100))
for (i in 2:100)
y[i] <- phi_i*y[i-1] + e[i]
plots_AR[[counter]] <- y
counter <- counter + 1
}
str(plots_AR)
## List of 6
## $ : Time-Series [1:100] from 1 to 100: 0 -0.565 0.928 -0.295 0.699 ...
## $ : Time-Series [1:100] from 1 to 100: 0 -0.565 0.702 0.212 0.277 ...
## $ : Time-Series [1:100] from 1 to 100: 0 -0.565 0.476 0.538 0.297 ...
## $ : Time-Series [1:100] from 1 to 100: 0 -0.565 0.25 0.683 0.541 ...
## $ : Time-Series [1:100] from 1 to 100: 0 -0.5647 0.0243 0.6474 0.7927 ...
## $ : Time-Series [1:100] from 1 to 100: 0 -0.565 -0.202 0.431 0.836 ...
autoplot(ts(numeric(100))) +
autolayer(plots_AR[[1]], series="Phi = -1.0") +
autolayer(plots_AR[[2]], series="Phi = -0.6") +
autolayer(plots_AR[[3]], series="Phi = -0.2") +
autolayer(plots_AR[[4]], series="Phi = 0.2") +
autolayer(plots_AR[[5]], series="Phi = 0.6") +
autolayer(plots_AR[[6]], series="Phi = 1.0") +
ylab("")
To make it easier to visualize, we will do two plots at a time with phi = 0.6 as the baseline.
autoplot(ts(numeric(100))) +
autolayer(plots_AR[[5]], series="Phi = 0.6") +
autolayer(plots_AR[[6]], series="Phi = 1.0")
By increasing phi from 0.6 to 1.0, the amplitude of the AR(1) series from part A increases; with phi = 1 the process is a random walk, so it wanders rather than fluctuating around a stable mean.
autoplot(ts(numeric(100))) +
autolayer(plots_AR[[5]], series="Phi = 0.6") +
autolayer(plots_AR[[2]], series="Phi = -0.2")
As we notice, when we set phi = -0.2, the series oscillates rapidly around 0: the negative coefficient flips the sign of the previous value at each step, creating the up-and-down pattern. When we decrease phi to -0.6 or -1.0, a similar situation occurs, except the amplitude gets larger. We will demonstrate with phi = -0.2 and phi = -1.0.
autoplot(ts(numeric(100))) +
autolayer(plots_AR[[1]], series="Phi = -1.0") +
autolayer(plots_AR[[2]], series="Phi = -0.2")
We have demonstrated that by decreasing phi, we have successfully created the oscillation pattern with different amplitudes.
C. Write your own code to generate data from an MA(1) model with \(\theta_1 = 0.6\) and \(\sigma^2 = 1\).
According to section 8.4, “Moving average models”, a moving average model uses past forecast errors in a regression - like model.
\[y_t = c + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + ... + \theta_q \epsilon_{t-q}\] where \(\epsilon_t\) is white noise. Notice that each value of \(y_t\) can be thought of as a weighted moving average of the past few forecast errors.
Using \(\theta_1 = 0.6\) and \(\sigma^2 = 1\) for MA(1) model, the equation can be written as:
\[y_t = c + \epsilon_t + 0.6 \epsilon_{t-1}\]
Given that we do not know the intercept term \(c\), we will eliminate that from our equation.
\[y_t = \epsilon_t + 0.6 \epsilon_{t-1}\]
Let’s generate an MA(1) model and a plot.
y <- ts(numeric(100))
e <- rnorm(100)
for (i in 2:100)
y[i] <- e[i] + 0.6*e[i-1]
head(y,10)
## Time Series:
## Start = 1
## End = 10
## Frequency = 1
## [1] 0.0000000 1.7653303 -0.3763580 1.2465567 0.4423157 -0.2945502
## [7] -0.3589476 -0.3757037 0.1147829 0.2320768
autoplot(y) + ylab("MA(1) Model")
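As with the AR(1) case, the same kind of MA(1) series can be generated with arima.sim() as a cross-check (again only a sketch: the handling of initial errors differs from the hand-rolled loop, so the exact values will not match):
set.seed(42)
ma1_sim <- arima.sim(model = list(ma = 0.6), n = 100, sd = 1)
autoplot(ma1_sim) + ylab("MA(1) via arima.sim")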
D. Produce a time plot for the series. How does the plot change as you change \(\theta_1\)?
We will perform the same action as in part B and C.
Recall from Section 8.4, “Moving Average Models”, the constraints on \(\theta\) for an MA(1) model are:
\[-1 < \theta_1 < 1\]
set.seed(42)
theta <- seq(-1,1,.4)
plots_MR <- list()
e <- rnorm(100)
counter <- 1
for (theta_i in theta){
y <- ts(numeric(100))
for (i in 2:100)
y[i] <- e[i] + theta_i*e[i-1]
plots_MR[[counter]] <- y
counter <- counter + 1
}
str(plots_MR)
## List of 6
## $ : Time-Series [1:100] from 1 to 100: 0 -1.936 0.928 0.27 -0.229 ...
## $ : Time-Series [1:100] from 1 to 100: 0 -1.3873 0.7019 0.415 0.0246 ...
## $ : Time-Series [1:100] from 1 to 100: 0 -0.839 0.476 0.56 0.278 ...
## $ : Time-Series [1:100] from 1 to 100: 0 -0.291 0.25 0.705 0.531 ...
## $ : Time-Series [1:100] from 1 to 100: 0 0.2579 0.0243 0.8507 0.784 ...
## $ : Time-Series [1:100] from 1 to 100: 0 0.806 -0.202 0.996 1.037 ...
autoplot(ts(numeric(100))) +
autolayer(plots_MR[[1]], series="Theta = -1.0") +
autolayer(plots_MR[[2]], series="Theta = -0.6") +
autolayer(plots_MR[[3]], series="Theta = -0.2") +
autolayer(plots_MR[[4]], series="Theta = 0.2") +
autolayer(plots_MR[[5]], series="Theta = 0.6") +
autolayer(plots_MR[[6]], series="Theta = 1.0") +
ylab("")
For the most part, it appears to follow a certain pattern (but with different amplitudes depending on the theta). Let’s take a closer look.
autoplot(ts(numeric(100))) +
autolayer(plots_MR[[5]], series="Theta = 0.6") +
autolayer(plots_MR[[6]], series="Theta = 1.0")
The pattern looks quite similar except that the theta of 1.0 has a slightly higher amplitude than the theta of 0.6.
autoplot(ts(numeric(100))) +
autolayer(plots_MR[[5]], series="Theta = 0.6") +
autolayer(plots_MR[[2]], series="Theta = -0.2")
The theta of -0.2 appears to have the reverse direction and smaller amplitude compared to theta of 0.6.
autoplot(ts(numeric(100))) +
autolayer(plots_MR[[1]], series="Theta = -1.0") +
autolayer(plots_MR[[2]], series="Theta = -0.2")
Not surprisingly, the theta = -0.2 and theta = -1.0 series look similar to each other but with different amplitudes.
E. Generate data from an ARMA(1,1) model with \(\phi_1=0.6, \theta_1=0.6, \text{and } \sigma^2=1\).
This is an ARMA model with both autoregressive and moving average components.
If we assume that this is a non-seasonal ARIMA model, then the model can be written as:
\[y\prime_t = c + \phi_1 y\prime_{t-1} + ... + \phi_p y\prime_{t-p} + \theta_1 \epsilon_{t-1} + ... + \theta_q \epsilon_{t-q} + \epsilon_t\]
where \(y\prime_t\) is the differenced series.
For this particular example and the provided parameters (eliminating c by assuming the intercept is 0), the model is:
\[y_t = 0.6 y_{t-1} + 0.6 \epsilon_{t-1} + \epsilon_t\]
Let us simulate an ARMA(1,1) model.
y1 <- ts(numeric(100))
e <- rnorm(100)
for (i in 2:100)
y1[i] <- 0.6*y1[i-1] + 0.6*e[i-1] + e[i]
print(y1)
## Time Series:
## Start = 1
## End = 100
## Frequency = 1
## [1] 0.00000000 1.76533031 0.68284019 1.65626083 1.43607223
## [6] 0.56709311 -0.01869173 -0.38691874 -0.11736831 0.16165579
## [11] 0.14339750 0.17905570 -0.31315818 -0.98337318 -2.55365327
## [16] -2.91118513 -2.48876157 0.90104390 0.79964471 -0.20022669
## [21] -1.53140735 -3.28545519 -2.72883217 -2.55911701 -2.13527630
## [26] -1.71051823 -1.89693787 -3.53104353 -4.55818078 -3.29024080
## [31] -1.29881402 -0.93159341 -0.85461957 0.61015563 2.47968291
## [36] 1.25460942 -0.02282217 1.11741336 0.92161748 0.21866325
## [41] 0.01360896 -0.93117802 -1.53599823 -1.21785422 -1.16224831
## [46] 0.16771573 0.28766821 -0.54916381 0.10746287 -0.57377315
## [51] -1.01878341 -2.18723395 -1.07609772 -0.21900260 -0.76343431
## [56] -1.97702011 -1.93692549 -1.96709469 -2.19391845 -0.34877122
## [61] 0.38781654 -0.94440798 -1.04650734 -0.89271869 -0.16326071
## [66] 1.68847363 0.87984482 0.38694168 0.58985325 1.30041637
## [71] 1.08781103 1.35143880 -0.43222114 0.38309272 2.10830896
## [76] 1.63307618 -0.55962702 -0.56217179 0.53169601 0.60257830
## [81] 0.50918950 -0.18772174 -0.09429169 0.45936336 0.17315125
## [86] -1.39990153 -0.94093409 0.41008546 -0.25773734 -2.25101452
## [91] -2.10240303 -1.48355465 -0.84457387 -1.64917977 -2.72507978
## [96] -1.12477528 0.38037465 1.05697727 2.80130733 2.89874289
autoplot(y1) + ggtitle("ARMA(1,1)")
F. Generate data from an AR(2) model with \(\phi_1 = -0.8\), \(\phi_2 = 0.3\), and \(\sigma^2 = 1\). (Note that these parameters will give a non-stationary series.)
According to section 8.3, “Autoregressive models”, for an AR(2) model, the constraints on the parameters are:
\[-1 < \phi_2 < 1, \phi_1 + \phi_2 < 1, \phi_2 - \phi_1 < 1\]
The above criteria are satisfied with the exception of the last one (\(\phi_2 - \phi_1 = 1.1 > 1\)), which leads to a non-stationary series.
The parameters above give the following formula (again assuming no constant):
\[y_t = -0.8y_{t-1} + 0.3y_{t-2} + \epsilon_t\]
Let us simulate the data and visualize the time series plots.
y2 <- ts(numeric(100))
e <- rnorm(100)
for (i in 3:100)
y2[i] <- -0.8*y2[i-1] + 0.3*y2[i-2] + e[i]
print(y2)
## Time Series:
## Start = 1
## End = 100
## Frequency = 1
## [1] 0.0000000 0.0000000 1.1713251 1.1224791 -1.9234474
## [6] 0.7246461 -1.8625725 0.6533960 -1.7272323 1.3924267
## [11] -2.8333331 4.7213666 -4.5193185 4.9477566 -4.8183812
## [16] 5.3764472 -5.8787601 7.7927297 -8.2148420 7.6260903
## [21] -8.1796569 8.4800398 -9.7597250 9.2836607 -9.9264802
## [26] 10.5522641 -10.9040876 11.6545840 -13.2533969 15.3493294
## [31] -16.5272463 18.7745478 -21.1793946 22.1097639 -24.3109809
## [36] 25.6907485 -26.4971861 28.8822087 -30.8106969 32.3708484
## [41] -35.8691051 39.4046075 -41.0259359 45.8909946 -50.4012135
## [46] 56.1382299 -59.0140752 64.0260116 -68.2214241 72.8135575
## [51] -79.8134295 85.7438613 -93.7376137 100.9032684 -107.5461929
## [56] 115.2740611 -125.2215475 134.8060203 -146.4288766 157.2016234
## [61] -168.8172063 183.1837971 -196.8083529 210.5502658 -227.5367152
## [66] 246.2592251 -264.4551996 285.2511108 -310.2373783 333.8262025
## [71] -359.5584238 387.8404034 -417.9824373 451.1696362 -486.7269899
## [76] 526.0424610 -566.3816724 609.6754059 -656.2732510 709.1256815
## [81] -763.3584466 821.7618323 -886.9863062 956.7531084 -1031.4546566
## [86] 1112.5376701 -1197.0069395 1290.5484723 -1393.6540598 1502.3614848
## [91] -1620.6730026 1747.6928886 -1885.1685964 2034.6547992 -2193.3981242
## [96] 2364.6376036 -2549.8957817 2750.1704698 -2965.0077699 3195.4317401
autoplot(y2) + ggtitle("AR(2)")
As we see, because one of the constraints was violated, the time series is not stationary; it oscillates with ever-growing amplitude.
G. Graph the latter two series and compare them.
autoplot(ts(numeric(100))) +
autolayer(y1, series="ARIMA(1,1)") +
autolayer(y2, series="ARIMA(2)") + ylab("")
As we can see, the AR(2) series becomes non-stationary, whereas the ARMA(1,1) series remains stationary.
8.7 Consider wmurders, the number of women murdered each year (per 100,000 standard population) in the United States.
This time series is the annual female murder rate, per 100,000 standard population.
str(wmurders)
## Time-Series [1:55] from 1950 to 2004: 2.43 2.36 2.37 2.3 2.33 ...
head(wmurders)
## Time Series:
## Start = 1950
## End = 1955
## Frequency = 1
## [1] 2.429415 2.363364 2.374305 2.295520 2.329716 2.233017
autoplot(wmurders)
A. By studying appropriate graphs of the series in R, find an appropriate ARIMA(p,d,q) model for these data.
To identify an ARIMA model, we first need to make the time series stationary. As we can see, the data show an upward and then downward movement. There does not seem to be an obvious seasonal component. Let us check whether differencing would create a stationary time series.
wmurders %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 3 lags.
##
## Value of test-statistic is: 0.6331
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
paste0("ndiffs: ", ndiffs(wmurders))
## [1] "ndiffs: 2"
The KPSS test and the ndiffs() function suggest that we need to difference our time series twice.
st_wmurders <- wmurders %>% diff() %>% diff()
head(st_wmurders)
## Time Series:
## Start = 1952
## End = 1957
## Frequency = 1
## [1] 0.076992 -0.089726 0.112981 -0.130895 0.241861 -0.200670
paste0("Needs more differencing? ", ndiffs(st_wmurders))
## [1] "Needs more differencing? 0"
autoplot(st_wmurders)
There is no more differencing required and the data now appears stationary.
Therefore, d = 2.
To help us determine the appropriate p and q values, we will utilize the ACF and PACF plots. According to section 8.5, “Non-seasonal ARIMA models”, it is sometimes possible to use the ACF plot and PACF plot to determine appropriate values for p and q.
Before we continue to identify p and q, we notice that the differenced time series may have a varying variance. We should transform the data using a Box-Cox transformation to stabilize the variance (if one is even required).
lambda <- BoxCox.lambda(st_wmurders)
paste0("Stationary wmurders BoxCox Lambda: ", round(lambda,3))
## [1] "Stationary wmurders BoxCox Lambda: 1.202"
transformed_ts <- st_wmurders %>% BoxCox(lambda=lambda)
head(transformed_ts)
## Time Series:
## Start = 1952
## End = 1957
## Frequency = 1
## [1] -0.7936398 -0.8775937 -0.7713058 -0.9039278 -0.6807972 -0.9523806
## attr(,"lambda")
## [1] 1.202262
The lambda is ~1.2. We will use this information and transform the time series.
On a side note, “if the data are from an ARIMA(p,d,0) or ARIMA(0,d,q) model, then the ACF and PACF plots can be helpful in determining the value of p or q. If p and q are both positive, then the plots do not help in finding suitable values of p and q.”
“The data may follow an ARIMA(p,d,0) model if the ACF and PACF plots of the differenced data show the following patterns:”
the ACF is exponentially decaying or sinusoidal;
there is a significant spike at lag p in the PACF, but none beyond p.
“The data may follow an ARIMA(0,d,q) model if the ACF and PACF plots of the differenced data show the following patterns:”
the PACF is exponentially decaying or sinusoidal;
there is a significant spike at lag q in the ACF, but none beyond lag q.
ggAcf(transformed_ts, main="ACF")
ggPacf(transformed_ts, main="PACF")
Looking at the ACF and PACF plots, the ACF appears sinusoidal, and there is a significant spike at lag 1 in the PACF but none beyond lag 1.
This suggests that p = 1 and q = 0, giving an ARIMA(1,2,0) model.
Now that we have our base ARIMA model of (1,2,0), we will need to create some variations and calculate the AICc and determine which model is best via the smallest AICc value. We will consider:
fit_120 <- Arima(wmurders, order=c(1,2,0))
fit_020 <- Arima(wmurders, order=c(0,2,0))
fit_220 <- Arima(wmurders, order=c(2,2,0))
fit_121 <- Arima(wmurders, order=c(1,2,1))
fit_120
## Series: wmurders
## ARIMA(1,2,0)
##
## Coefficients:
## ar1
## -0.6719
## s.e. 0.0981
##
## sigma^2 estimated as 0.05471: log likelihood=2
## AIC=0 AICc=0.24 BIC=3.94
fit_020
## Series: wmurders
## ARIMA(0,2,0)
##
## sigma^2 estimated as 0.1007: log likelihood=-14.38
## AIC=30.76 AICc=30.84 BIC=32.73
fit_220
## Series: wmurders
## ARIMA(2,2,0)
##
## Coefficients:
## ar1 ar2
## -0.8289 -0.2246
## s.e. 0.1346 0.1353
##
## sigma^2 estimated as 0.05292: log likelihood=3.34
## AIC=-0.68 AICc=-0.19 BIC=5.23
fit_121
## Series: wmurders
## ARIMA(1,2,1)
##
## Coefficients:
## ar1 ma1
## -0.2434 -0.8261
## s.e. 0.1553 0.1143
##
## sigma^2 estimated as 0.04632: log likelihood=6.44
## AIC=-6.88 AICc=-6.39 BIC=-0.97
It appears that out of all the models presented here, the one with the smallest (or lowest) AICc is ARIMA(1,2,1) with a value of -6.39.
Let us see what the auto.arima() function selects as the optimal model.
# We are setting seasonal FALSE since the time series is not seasonal.
auto.arima(wmurders, seasonal=FALSE)
## Series: wmurders
## ARIMA(1,2,1)
##
## Coefficients:
## ar1 ma1
## -0.2434 -0.8261
## s.e. 0.1553 0.1143
##
## sigma^2 estimated as 0.04632: log likelihood=6.44
## AIC=-6.88 AICc=-6.39 BIC=-0.97
The auto.arima() function has produced ARIMA(1,2,1). Fortunately, this agrees with the model selected by hand.
B. Should you include a constant in the model? Explain.
We should not include a constant: with two orders of differencing (d = 2), a constant would induce a quadratic trend in the long-term forecasts, which the original time series does not suggest.
C. Write this model in terms of the backshift operator.
Recall section 8.2, “Backshift notation”: “the backward shift operator B is a useful notational device when working with time series lags:”
\[By_t = y_{t-1}\] \[B(By_t)= B^2y_t = y_{t-2}\]
“The backward shift operator is convenient for describing the process of differencing. A first difference can be written as:”
\[y\prime_t=y_t-y_{t-1}=y_t-By_t=(1-B)y_t\]
“Similarly, if second-order differences have to be computed, then:”
\[y\prime\prime_t=y_t-2y_{t-1} + y_{t-2} = (1-B)^2y_t\]
As noted in section 8.7 and taking all of the information noted above, a non-seasonal ARIMA model of (1,2,1) can be written as:
\[(1-\phi_1B)(1-B)^2y_t=(1+\theta_1B)\epsilon_t\]
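As a quick numerical check of the backshift algebra (a sketch using the wmurders series from this exercise), the second difference \((1-B)^2y_t\) should equal \(y_t - 2y_{t-1} + y_{t-2}\):
y <- as.numeric(wmurders)
n <- length(y)
second_diff_by_hand <- y[3:n] - 2 * y[2:(n - 1)] + y[1:(n - 2)]
all.equal(as.numeric(diff(wmurders, differences = 2)), second_diff_by_hand)  # should be TRUE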
D. Fit the model using R and examine the residuals. Is the model satisfactory?
fit_121
## Series: wmurders
## ARIMA(1,2,1)
##
## Coefficients:
## ar1 ma1
## -0.2434 -0.8261
## s.e. 0.1553 0.1143
##
## sigma^2 estimated as 0.04632: log likelihood=6.44
## AIC=-6.88 AICc=-6.39 BIC=-0.97
checkresiduals(fit_121)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,2,1)
## Q* = 12.419, df = 8, p-value = 0.1335
##
## Model df: 2. Total lags used: 10
The residuals show a Ljung-Box p-value of 0.1335 which suggests that the residuals are white noise and satisfactory.
The ACF of the residuals falls within the blue dashed critical-value lines, and the histogram appears approximately normal. So, again, this model is satisfactory.
E. Forecast three times ahead. Check your forecasts by hand to make sure that you know how they have been calculated.
forecast(fit_121,h=3) # h=3 for forecast 3 times ahead
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2005 2.470660 2.194836 2.746484 2.048824 2.892496
## 2006 2.363106 1.986351 2.739862 1.786908 2.939304
## 2007 2.252833 1.765391 2.740276 1.507354 2.998313
The general formula for forecasting using the ARIMA(1,2,1) model is:
\[(1-\phi_1B)(1-B)^2y_t=(1+\theta_1B)\epsilon_t\] where \(\hat{\phi_1}=-0.2434\) and \(\hat{\theta_1}=-0.8261\).
Recall from section 8.8, “Forecasting”, “point forecasts can be calculated using the following three steps.”
Expand the ARIMA equation so that \(y_t\) is on the left hand side and all other terms are on the right.
Rewrite the equation by replacing t with T + h.
On the right hand side of the equation, replace future observations with their forecasts, future errors with zero, and past errors with the corresponding residuals.
“Beginning with h = 1, these steps are then repeated for h = 2, 3, … until all forecasts have been calculated.”
\[(1-\phi_1B)(1-B)(1-B)y_t = ....\] \[(1-\phi_1B)(1-2B+B^2)y_t=....\] \[(1-2B+B^2-\phi_1B+2\phi_1B^2-\phi_1B^3)y_t=...\] \[(1 - (2+\phi_1)B + (1+2\phi_1)B^2 - \phi_1B^3)y_t = (1+\theta_1B)\epsilon_t \] Replace all B’s.
\[y_t - (2+\phi_1)y_{t-1} + (1+2\phi_1)y_{t-2} - \phi_1y_{t-3} = \epsilon_t + \hat{\theta_1}\epsilon_{t-1}\]
Move all terms other than \(y_t\) to the right-hand side and solve for \(y_t\):
\[y_t = (2+\phi_1)y_{t-1} - (1+2\phi_1)y_{t-2} + \phi_1y_{t-3} + \epsilon_t + \hat{\theta_1}\epsilon_{t-1}\]
Then we replace t with T + 1 (setting the future error to zero and using the last residual for \(\epsilon_T\)) to get the first forecast.
As the algebra can get pretty extensive, I will leave the general formula here and leave the calculations for R.
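That said, here is a rough sketch of the h = 1 point forecast computed by hand in R, using the fitted coefficients and residuals from fit_121 above (the variable names are just for illustration):
phi1 <- coef(fit_121)["ar1"]
theta1 <- coef(fit_121)["ma1"]
y <- wmurders
eps <- residuals(fit_121)
n <- length(y)
# y_{T+1|T} = (2 + phi1) y_T - (1 + 2 phi1) y_{T-1} + phi1 y_{T-2} + theta1 * e_T
yhat1 <- (2 + phi1) * y[n] - (1 + 2 * phi1) * y[n - 1] + phi1 * y[n - 2] + theta1 * eps[n]
yhat1  # should be close to the 2005 point forecast shown above (about 2.47)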
Regarding the prediction intervals, according to the textbook, the calculation of ARIMA prediction intervals is more difficult, and the details are largely beyond the scope of the book.
However, the first prediction interval is easy to calculate: if \(\hat{\sigma}\) is the standard deviation of the residuals, then a 95% prediction interval is given by \(\hat{y}_{T+1|T} \pm 1.96\hat{\sigma}\). This result is true for all ARIMA models regardless of their parameters and orders.
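To tie this back to the forecast() output above, here is a rough sketch of that first 95% interval using the residual standard deviation from the fitted model (the square root of fit_121$sigma2):
sigma_hat <- sqrt(fit_121$sigma2)
f1 <- forecast(fit_121, h = 1)
c(lower = f1$mean[1] - 1.96 * sigma_hat,
  upper = f1$mean[1] + 1.96 * sigma_hat)  # should be close to the 95% bounds for 2005 shown above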
F. Create a plot of the series with forecasts and prediction intervals for the next three periods shown.
autoplot(forecast(fit_121, h=3))
Above is the created forecasts with its corresponding prediction intervals.
G. Does auto.arima() give the same model you have chosen? If not, which model do you think is better?
As I have demonstrated in part A, auto.arima() does give the same model that I have chosen.