Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers, and 1,000 random numbers.
Explain the differences among these figures. Do they all indicate that the data are white noise?
Each of these figures shows a white noise series of a different length. The longer the time series, the narrower the critical bounds for judging whether the autocorrelations are significant. The second series actually shows two lags that slightly exceed the significance threshold, but since about 5% of spikes are expected to do so by chance, all three figures are still consistent with white noise.
Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?
The critical values differ because of the length of each time series: the bounds are approximately +/-1.96/sqrt(T), so the longer the series, the more data points there are to increase confidence and the tighter the threshold for significance. The autocorrelations themselves differ because each figure is computed from a different random sample, and these sample estimates shrink towards zero as the series gets longer.
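As a quick check of that claim (a minimal sketch using the standard +/-1.96/sqrt(T) white noise bounds):
n_obs <- c(36, 360, 1000)
# Approximate 95% significance bounds for the ACF of a white noise series
round(1.96 / sqrt(n_obs), 3)
# The bounds shrink as the series lengthens, so smaller sample
# autocorrelations become significant for longer series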
A classic example of a non-stationary series is the daily closing IBM stock price (ibmclose). Plot the daily closing prices together with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.
ibmclose %>% autoplot() +
  defaulttheme +
  labs(title = "IBM Closing Stock Price")
Acf(ibmclose)
Pacf(ibmclose)
The time plot shows a wandering level with no constant mean. The autocorrelation plot shows that the lags of the series are closely related to each other across the entire series: the ACF decays very slowly rather than dropping quickly to zero. However, when evaluating these lags independently using the partial autocorrelation plot, only the first lag is significant, with a spike close to 1. Together, these patterns indicate a random-walk-like, non-stationary series that should be differenced.
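As a hedged follow-up, the forecast package's unit-root helpers should confirm that a single difference is enough; a sketch:
# Number of differences suggested by a unit-root test
ndiffs(ibmclose)
# The differenced series should now resemble white noise
ggtsdisplay(diff(ibmclose))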
***For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.
exercise8.3 <- function(data, name){
  # Find the Box-Cox lambda for the data
  lmbda <- round(BoxCox.lambda(data), 2)
  # Apply the Box-Cox transformation
  bcox_data <- BoxCox(data, lmbda)
  # Number of differences required to make the transformed data stationary
  diffs <- ndiffs(bcox_data)
  # Difference the transformed data accordingly
  data_bcox_diff <- diff(bcox_data, differences = diffs)
  return1 <- data.frame(dataset = name, lambda = lmbda, differences = diffs)
  p1 <- autoplot(data) +
    theme_bw() +
    labs(title = name, y = name)
  p2 <- autoplot(bcox_data) +
    theme_bw() +
    labs(title = paste0("BoxCox transformed ", name), y = name)
  p3 <- autoplot(data_bcox_diff) +
    theme_bw() +
    labs(title = paste0("BoxCox and Differenced ", name), y = name)
  return2 <- grid.arrange(p1, p2, p3, nrow = 3)
  return(list(return1, return2))
}
exercise8.3(usnetelec,"usnetelec")
[[1]]
dataset lambda differences
1 usnetelec 0.52 2
[[2]] (panel of three plots: usnetelec, BoxCox transformed usnetelec, BoxCox and Differenced usnetelec)
exercise8.3(usgdp,"usgdp")
[[1]]
dataset lambda differences
1 usgdp 0.37 1
[[2]] (panel of three plots: usgdp, BoxCox transformed usgdp, BoxCox and Differenced usgdp)
exercise8.3(mcopper,"mcopper")
[[1]]
dataset lambda differences
1 mcopper 0.19 1
[[2]] (panel of three plots: mcopper, BoxCox transformed mcopper, BoxCox and Differenced mcopper)
exercise8.3(enplanements,"enplanements")
[[1]]
dataset lambda differences
1 enplanements -0.23 1
[[2]] (panel of three plots: enplanements, BoxCox transformed enplanements, BoxCox and Differenced enplanements)
exercise8.3(visitors,"visitors")
[[1]]
dataset lambda differences
1 visitors 0.28 1
[[2]] (panel of three plots: visitors, BoxCox transformed visitors, BoxCox and Differenced visitors)
For your retail data (from exercise 3 in section 2.10), find the appropriate order of differencing (after transforming if necessary) to obtain stationary data.
library(httr)
url1<-"https://otexts.com/fpp2/extrafiles/retail.xlsx"
GET(url1, write_disk(tf <- tempfile(fileext = ".xlsx")))
Response [https://otexts.com/fpp2/extrafiles/retail.xlsx]
Date: 2021-03-29 03:04
Status: 200
Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Size: 639 kB
<ON DISK> C:\Users\REGIST~1\AppData\Local\Temp\RtmpcBZeYu\fileae586ec2d4a.xlsx
retaildata <- readxl::read_excel(tf, skip = 1)
myts <- ts(retaildata[,"A3349873A"],
           frequency = 12, start = c(1982, 4))
exercise8.3(myts, "retail data")
[[1]]
dataset lambda differences
1 retail data 0.13 1
[[2]] (panel of three plots: retail data, BoxCox transformed retail data, BoxCox and Differenced retail data)
As shown in the table above, the retail series needs a Box-Cox transformation with lambda = 0.13 followed by one order of differencing to become stationary.
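As an optional sanity check (assuming the urca package, which the fpp2 text uses for unit-root testing), a KPSS test on the transformed and differenced retail series should produce a test statistic below the 5% critical value, consistent with stationarity:
library(urca)
# KPSS test on the Box-Cox transformed, first-differenced retail series
myts %>% BoxCox(lambda = 0.13) %>% diff() %>% ur.kpss() %>% summary()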
***Use the following R code to generate data from an AR(1) model with phi1 = 0.6 and sigma^2 = 1. The process starts with y1 = 0.
ex8.6a <- function(phi){
  # Simulate 100 observations from an AR(1) process:
  # y_t = phi*y[t-1] + e_t, starting at y1 = 0
  y <- ts(numeric(100))
  e <- rnorm(100)
  for(i in 2:100){
    y[i] <- phi*y[i-1] + e[i]
  }
  return(y)
}
Produce a time plot for the series. How does the plot change as you change phi?
Changing phi changes the persistence of the series. With a small phi the series looks close to white noise and its autocorrelations die out quickly; as phi approaches 1 the series drifts in long, smooth swings and the autocorrelations decay much more slowly, so more lags are significant.
p1a <- ex8.6a(0.6) %>% autoplot() + defaulttheme +
  labs(title = "phi = 0.6")
p1b <- ggAcf(ex8.6a(0.6)) + defaulttheme
p2a <- ex8.6a(0.3) %>% autoplot() + defaulttheme +
  labs(title = "phi = 0.3")
p2b <- ggAcf(ex8.6a(0.3)) + defaulttheme
p3a <- ex8.6a(0.9) %>% autoplot() + defaulttheme +
  labs(title = "phi = 0.9")
p3b <- ggAcf(ex8.6a(0.9)) + defaulttheme
grid.arrange(p1a, p2a, p3a, p1b, p2b, p3b, nrow = 2)
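These plots line up with theory: the ACF of a stationary AR(1) process decays geometrically, rho_k = phi^k, which can be verified with base R's ARMAacf():
# Theoretical autocorrelations (lags 0 to 5) for each phi used above
sapply(c(0.3, 0.6, 0.9), function(phi) round(ARMAacf(ar = phi, lag.max = 5), 3))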
Write your own code to generate data from an MA(1) model with theta1 = 0.6 and sigma^2 = 1.
The function is shown below.
ex8.6c <- function(theta, sd = 1, n = 100){
  # Simulate n observations from an MA(1) process: y_t = theta*e[t-1] + e_t
  y <- ts(numeric(n))
  e <- rnorm(n, sd = sd)
  e[1] <- 0  # so the process starts with y1 = 0
  for(i in 2:n){
    y[i] <- theta*e[i-1] + e[i]
  }
  return(y)
}
Produce a time plot for the series. How does the plot change as you change theta1?
Changing theta changes the scale and the short-run correlation of the series: larger values of theta produce larger swings (the plotted scales expand), while smaller values compress the series towards plain white noise, as quantified after the plots below.
p1a <- ex8.6c(0.6) %>% autoplot() + defaulttheme +
  labs(title = "theta = 0.6")
p1b <- ggAcf(ex8.6c(0.6)) + defaulttheme
p2a <- ex8.6c(0.3) %>% autoplot() + defaulttheme +
  labs(title = "theta = 0.3")
p2b <- ggAcf(ex8.6c(0.3)) + defaulttheme
p3a <- ex8.6c(0.9) %>% autoplot() + defaulttheme +
  labs(title = "theta = 0.9")
p3b <- ggAcf(ex8.6c(0.9)) + defaulttheme
grid.arrange(p1a, p2a, p3a, p1b, p2b, p3b, nrow = 2)
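This matches the theory for an MA(1) process, which has variance (1 + theta^2)*sigma^2 and a single non-zero autocorrelation theta/(1 + theta^2) at lag 1; a quick arithmetic check:
theta <- c(0.3, 0.6, 0.9)
# Process variance relative to sigma^2 = 1: grows with theta
round(1 + theta^2, 3)
# Lag-1 autocorrelation; all later lags are zero for an MA(1)
round(theta / (1 + theta^2), 3)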
Generate data from an ARMA(1,1) model with phi1 = 0.6, theta1 = 0.6, and sigma^2 = 1.
The ARMA(1,1) model is presented below
# ARMA(1,1): y_t = 0.6*y[t-1] + 0.6*e[t-1] + e_t
y1 <- ts(numeric(100))
e <- rnorm(100, sd = 1)
for(i in 2:100){
  y1[i] <- 0.6*y1[i-1] + 0.6*e[i-1] + e[i]
}
autoplot(y1) +
  ggtitle('ARMA(1,1)') +
  defaulttheme
Generate data from an AR(2) model with phi1 = -0.8, phi2 = 0.3, and sigma^2 = 1. (Note that these parameters will give a non-stationary series.)
The two series are plotted below and are quite different. The AR(2) series oscillates with steadily growing amplitude and so is non-stationary, while the ARMA(1,1) series looks stationary, with a couple of significant autocorrelations on the ACF and a few on the PACF; a root check after the plots (see the sketch below) confirms this.
# AR(2): y_t = -0.8*y[t-1] + 0.3*y[t-2] + e_t
y2 <- ts(numeric(100))
e <- rnorm(100, sd = 1)
for(i in 3:100){
  y2[i] <- -0.8*y2[i-1] + 0.3*y2[i-2] + e[i]
}
autoplot(y2) +
  ggtitle('AR(2)') +
  defaulttheme
Graph the latter two series and compare them.
ggtsdisplay(y1, main = 'ARMA(1,1)')
ggtsdisplay(y2, main = 'AR(2)')
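A root check on the AR polynomials explains the contrast (a sketch using base R's polyroot(); stationarity requires every root of 1 - phi1*z - phi2*z^2 to lie outside the unit circle):
# ARMA(1,1): 1 - 0.6z has its root at 1/0.6 > 1, so the series is stationary
Mod(polyroot(c(1, -0.6)))
# AR(2): 1 + 0.8z - 0.3z^2 has one root inside the unit circle, so non-stationary
Mod(polyroot(c(1, 0.8, -0.3)))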
***Consider wmurders, the number of women murdered each year (per 100,000 standard population) in the United States. By studying appropriate graphs of the series in R, find an appropriate ARIMA(p,d,q) model for these data.
Looking at the initial series, we need to difference this non-stationary data. The data are non-seasonal, so we do not need to be concerned with any seasonal ARIMA components.
ggtsdisplay(wmurders)
After differencing this data once, we see that it is still not stationary, as shown below, with a slight downward trend and a bit of cyclicality.
diff(wmurders, differences = 1) %>% ggtsdisplay()
After a second order of differencing, we may now move to the next step with this stationary time series; the unit-root check after the plot corroborates d = 2.
diff(wmurders, differences = 2) %>% ggtsdisplay()
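A unit-root-based check should corroborate this choice of d (ndiffs() applies a KPSS test repeatedly until the series tests as stationary):
# Number of differences suggested for wmurders
ndiffs(wmurders)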
The correlograms of the twice-differenced series suggest starting with an AR term of 2: the ACF gradually decays towards 0, which points to an autoregressive signature, with the early PACF spikes suggesting the order.
We'll fit a few orders around that guess and compare their BICs. The model with the lowest BIC will be taken as the best and used to produce the forecasts.
Arima(wmurders, order = c(2,2,0))$bic
[1] 5.232945
Arima(wmurders, order = c(1,2,0))$bic
[1] 3.93715
Arima(wmurders, order = c(3,2,0))$bic
[1] 7.862588
Arima(wmurders, order = c(2,2,1))$bic
[1] 0.7628476
Arima(wmurders, order = c(1,2,1))$bic
[1] -0.9688923
Based on the above, ARIMA(1,2,1) has the lowest BIC and is the best candidate.
Should you include a constant in the model? Explain.
I do not believe a constant is necessary in this model: no drift term is needed, and with d = 2 a constant would induce a quadratic trend in the long-run forecasts, which is not plausible for this series.
Fit the model using R and examine the residuals. Is the model satisfactory?
wmurdersArima <- Arima(wmurders, order = c(1, 2, 1))
checkresiduals(wmurdersArima)
Ljung-Box test
data: Residuals from ARIMA(1,2,1)
Q* = 12.419, df = 8, p-value = 0.1335
Model df: 2. Total lags used: 10
The residuals of this model resemble white noise (the Ljung-Box p-value of 0.1335 is well above 0.05), so the model is satisfactory.
Forecast three periods ahead. Check your forecasts by hand to make sure that you know how they have been calculated.
wmurders_forecast <- forecast(wmurdersArima, h = 3)
wmurders_forecast$model
Series: wmurders
ARIMA(1,2,1)
Coefficients:
ar1 ma1
-0.2434 -0.8261
s.e. 0.1553 0.1143
sigma^2 estimated as 0.04632: log likelihood=6.44
AIC=-6.88 AICc=-6.39 BIC=-0.97
Expanding the fitted ARIMA(1,2,1), (1 - phi*B)(1 - B)^2 y_t = (1 + theta*B) e_t, gives the recursion y_t = (2 + phi)*y[t-1] - (1 + 2*phi)*y[t-2] + phi*y[t-3] + e_t + theta*e[t-1], which we apply with future errors set to zero:
phi <- unname(coef(wmurdersArima)["ar1"])    # -0.2434
theta <- unname(coef(wmurdersArima)["ma1"])  # -0.8261
years <- length(wmurders)
e <- residuals(wmurdersArima)
# One step ahead: the last observed residual feeds the MA term
fc1 <- (2 + phi)*wmurders[years] - (1 + 2*phi)*wmurders[years - 1] +
  phi*wmurders[years - 2] + theta*e[years]
fc2 <- (2 + phi)*fc1 - (1 + 2*phi)*wmurders[years] + phi*wmurders[years - 1]
fc3 <- (2 + phi)*fc2 - (1 + 2*phi)*fc1 + phi*wmurders[years]
c(fc1, fc2, fc3)
[1] 2.470660 2.363106 2.252833
wmurders_forecast$mean
Time Series:
Start = 2005
End = 2007
Frequency = 1
[1] 2.470660 2.363106 2.252833
The hand calculations reproduce the model's forecasts, so we understand how they are computed.
Create a plot of the series with forecasts and prediction intervals for the next three periods shown.
autoplot(wmurders_forecast)+defaulttheme
Does auto.arima() give the same model you have chosen? If not, which model do you think is better?
auto.arima(wmurders, approximation = F)
Series: wmurders
ARIMA(1,2,1)
Coefficients:
ar1 ma1
-0.2434 -0.8261
s.e. 0.1553 0.1143
sigma^2 estimated as 0.04632: log likelihood=6.44
AIC=-6.88 AICc=-6.39 BIC=-0.97
auto.arima() selects the same ARIMA(1,2,1) model that I have chosen.