Data 624 HW 6

8.1, 8.2, 8.3, 8.5., 8.6, 8.7

library(fpp2)

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

## -- Attaching packages ---------------------------------------------- fpp2 2.4 --

## v ggplot2   3.3.3     v fma       2.4  
## v forecast  8.13      v expsmooth 2.3

##

library(seasonal)
library(kableExtra)

## Warning: package 'kableExtra' was built under R version 4.0.4

8.1 Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.

Explain the differences among these figures. Do they all indicate that the data are white noise?

The ACF in the left diagram shows a series of 35 numbers for white noise. The mid ACF diagram shows 360 numbers of white noise. Lastly, the right ACF white noises is 1,000.

Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

by using this formula for critical values +/-1.96/sqrt(t-d)) when T is the sample size and there is a difference used in d. So then the sample size in T has increased the same size for the critical value will decrease.

8.2 A classic example of a non-stationary series is the daily closing IBM stock price series (data set ibmclose). Use R to plot the daily closing prices for IBM stock and the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

ggtsdisplay(ibmclose)

8.3 For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.

usnetelec usgdp mcopper enplanements visitors

usnetelec

Box.test(diff(usnetelec), lag=10, type="Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  diff(usnetelec)
## X-squared = 7.4406, df = 10, p-value = 0.6833

ndiffs(usnetelec)

(lambda <- BoxCox.lambda(usnetelec))

## [1] 0.5167714

autoplot(BoxCox(usnetelec,lambda))

usgdp

Box.test(diff(usgdp), lag=10, type="Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  diff(usgdp)
## X-squared = 116.87, df = 10, p-value < 2.2e-16

ndiffs(usgdp)

## [1] 2

autoplot(diff(usgdp))

autoplot(diff(diff(usgdp)))

Box.test(diff(diff(usgdp)), type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  diff(diff(usgdp))
## X-squared = 53.294, df = 1, p-value = 2.872e-13

ggAcf(diff(diff(usgdp)))

mcopper

autoplot(mcopper)

lambda_mcopper <- BoxCox.lambda(mcopper)

autoplot(diff(BoxCox(mcopper, lambda_mcopper)))

Box.test(diff(BoxCox(mcopper, lambda_mcopper)),type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  diff(BoxCox(mcopper, lambda_mcopper))
## X-squared = 57.517, df = 1, p-value = 3.353e-14

ggAcf(diff(BoxCox(mcopper, lambda_mcopper)))

ggtsdisplay(mcopper)

enplanements

autoplot(enplanements)

lambda_mcopper <- BoxCox.lambda(enplanements)

lambda_enplanements <- BoxCox.lambda(enplanements)
ndiffs(enplanements)

## [1] 1

autoplot(diff(diff(BoxCox(enplanements, lambda_enplanements), lag = 12)))

Box.test(diff(diff(BoxCox(enplanements, lambda_enplanements), lag = 12)), type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  diff(diff(BoxCox(enplanements, lambda_enplanements), lag = 12))
## X-squared = 29.562, df = 1, p-value = 5.417e-08

ggAcf(diff(diff(BoxCox(enplanements, lambda_enplanements),lag = 12)))

visitors

autoplot(visitors)

lambda_visitors <- BoxCox.lambda(visitors)
ndiffs(visitors)

## [1] 1

nsdiffs(visitors)

## [1] 1

autoplot(diff(diff(BoxCox(visitors, lambda_visitors),lag = 12)))

Box.test(diff(diff(BoxCox(visitors, lambda_visitors),lag = 12)),type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  diff(diff(BoxCox(visitors, lambda_visitors), lag = 12))
## X-squared = 21.804, df = 1, p-value = 3.02e-06

ggAcf(diff(diff(BoxCox(visitors, lambda_visitors),lag = 12)))

8.5 For your retail data (from Exercise 3 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349335T"],
  frequency=12, start=c(1982,4))

autoplot(myts)

ndiffs(myts)

## [1] 1

nsdiffs(myts)

## [1] 1

8.6 Use R to simulate and plot some data from simple ARIMA models.

Use the following R code to generate data from an AR(1) model with \(\phi_{1} = 0.6\) and \(\sigma^2=1\)The process starts with \(y_1=0\)

y <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
  y[i] <- 0.6*y[i-1] + e[i]

set.seed(25)
y <- ts(numeric(100))    
e <- rnorm(100)   
for(i in 2:100)    
  y[i] <- 0.6*y[i-1] + e[i]

plot(y)

b. Produce a time plot for the series. How does the plot change as you change \(\phi_1\)?

for(i in 2:100)    
  y[i] <- 0.3*y[i-1] + e[i]

plot(y)

for(i in 2:100)    
  y[i] <- 0.8*y[i-1] + e[i]

plot(y)

Write your own code to generate data from an MA(1) model with \(\theta_{1} = 0.6\) and \(\sigma^2=1\)

theta=0.6
for(i in 2:100)    
  y[i] <- theta*e[i-1] + e[i]

plot(y)

Produce a time plot for the series. How does the plot change as you change \(\theta_1\)

theta=0.8
for(i in 2:100)    
  y[i] <- theta*e[i-1] + e[i]

plot(y)

theta=0.2
for(i in 2:100)    
  y[i] <- theta*e[i-1] + e[i]

plot(y)

Generate data from an AR(2) model with \(\phi_{1} = 0.6\), \(\theta_{1} = 0.6\) and \(\sigma^2=1\) (Note that these parameters will give a non-stationary series.)

phi=0.6 ; theta=0.6

for(i in 2:100)    
  y[i] <- phi*y[i-1] + theta*e[i-1] + e[i]

plot(y)

f.Generate data from an AR(2) model with ϕ1=−0.8, and ϕ2=0.3 and σ2=1. (Note that these parameters will give a non-stationary series.)

phi1=-0.8 ; phi2=0.3 

for(i in 3:100)    
  y[i] <- phi1*y[i-1] + phi2*e[i-2] + e[i]

plot(y)

bGraph the latter two series and compare them.

For this one The plots above showed us that ARMA(1,1) has resulted in a more stable series in comparsion to AR(2) high variation time series.

8.7 Consider wmurders, the number of women murdered each year (per 100,000 standard population) in the United States.

A. By studying appropriate graphs of the series in R, find an appropriate ARIMA( ) model for these data.

autoplot(wmurders)

ggtsdisplay(wmurders)

wmurders %>%
  diff() %>% 
  ggtsdisplay()

print(paste("AR(2) AICc =",Arima(wmurders, order=c(2,1,0), method="ML")$aicc))

## [1] "AR(2) AICc = -12.4787165334315"

print(paste("MA(2) AICc =",Arima(wmurders, order=c(0,1,2), method="ML")$aicc))

## [1] "MA(2) AICc = -12.9452660674903"

B.Should you include a constant in the model? Explain.

No, we shouldn’t because as mentioned above, there does not seem to be any drift in the series.

C.Write this model in terms of the backshift operator.

\(y^t^=phi x y^(t-2)^ + theta x (e-2)\)

D.Fit the model using R and examine the residuals. Is the model satisfactory?

I believe it is the same as the question above “a”

a_012 <- Arima(wmurders, order=c(0,1,2))
checkresiduals(a_012)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,2)
## Q* = 9.7748, df = 8, p-value = 0.2812
## 
## Model df: 2.   Total lags used: 10

e.Forecast three times ahead. Check your forecasts by hand to make sure that you know how they have been calculated.

a_fc <- forecast(a_012, h=3)
a_fc

##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005       2.458450 2.195194 2.721707 2.055834 2.861066
## 2006       2.477101 2.116875 2.837327 1.926183 3.028018
## 2007       2.477101 1.979272 2.974929 1.715738 3.238464

F.Create a plot of the series with forecasts and prediction intervals for the next three periods shown.

autoplot(a_fc)

G. Does auto.arima() give the same model you have chosen? If not, which model do you think is better?

autofit_arima <- auto.arima(wmurders, d=1, stepwise = FALSE)
forecast(autofit_arima, h=3) %>%
  autoplot()

Data 624 HW 6

Amanda Arce

3/20/2021