Data 625 HW6

if (!require("fpp2")) install.packages("fpp2")
if (!require("ggplot2")) install.packages("ggplot2")
if (!require("gridExtra")) install.packages("gridExtra")
library(fma)
library(forecast)
library(tseries)

8.1 Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.

a. Explain the differences among these figures. Do they all indicate that the data are white noise?

Figure 8.31

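All three plots indicate that the data are white noise, because none of the spikes falls outside the critical-value bounds in any of the plots. The figures differ in two ways: the bounds narrow as the sample size grows, and the spikes themselves get smaller.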

b. Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

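The formula for the critical values is \(\pm 1.96/\sqrt{T - d}\), where T is the sample size and d is the order of differencing used. As the sample size increases, the critical values move closer to zero. This explains why the critical-value region narrows (from left to right in the plot) as the sample size increases. The autocorrelations differ between the figures because each one is a sample estimate computed from a different set of random numbers; the estimates fluctuate around zero, and the fluctuations are larger for smaller samples.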
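As a quick check of the formula, the bounds can be computed for the three sample sizes named in the exercise. This is a minimal sketch that assumes no differencing (d = 0); the variable names are only illustrative.

```r
# Approximate 95% bounds for the sample ACF of white noise: +/- 1.96 / sqrt(T).
# Sample sizes come from the exercise text; d = 0 is assumed (no differencing).
sample_sizes <- c(36, 360, 1000)
acf_bounds <- 1.96 / sqrt(sample_sizes)
names(acf_bounds) <- sample_sizes

# The bound shrinks as the sample size grows.
print(round(acf_bounds, 3))
```

The bound for 36 observations is roughly five times wider than the bound for 1,000 observations, which matches the narrowing region in the figure.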

8.2 A classic example of a non-stationary series is the daily closing IBM stock price series (data set `ibmclose`). Use R to plot the daily closing prices for IBM stock and the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

ggtsdisplay(ibmclose)

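There is a clear trend throughout the time plot, so the mean of the series is not constant. The ACF plot shows significant autocorrelations at every lag, and they decay only slowly, which is typical of a non-stationary series. The PACF has a single large spike at lag 1 that is close to 1, which is consistent with random-walk-like behaviour. Therefore the data should be differenced to remove the trend and this persistent autocorrelation.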
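The effect of differencing can be illustrated on simulated data. The sketch below uses a simulated random walk as a stand-in for a stock price (an assumption, not the `ibmclose` data) and compares the sample autocorrelations before and after differencing.

```r
# A random walk is the cumulative sum of white noise (illustrative stand-in).
set.seed(1)
random_walk <- cumsum(rnorm(500))

# Sample autocorrelations at lags 1..20, dropping the lag-0 value.
acf_level <- acf(random_walk, lag.max = 20, plot = FALSE)$acf[-1]
acf_diff  <- acf(diff(random_walk), lag.max = 20, plot = FALSE)$acf[-1]

# The level series has large, slowly decaying autocorrelations;
# the differenced series is white noise, so its autocorrelations are near zero.
print(round(acf_level[1:3], 2))
print(round(acf_diff[1:3], 2))
```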

8.3 For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.

a. `usnetelec`

plot(usnetelec)

The data increase almost linearly, so first differencing looks sufficient.

Box.test(diff(usnetelec), type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  diff(usnetelec)
## X-squared = 0.8508, df = 1, p-value = 0.3563

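With a p-value of 0.3563, the Ljung-Box test does not reject the null hypothesis of no autocorrelation, so the first-differenced usnetelec data can be treated as a white noise series.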

kpss.test(diff(usnetelec))

## Warning in kpss.test(diff(usnetelec)): p-value greater than printed p-value

## 
##  KPSS Test for Level Stationarity
## 
## data:  diff(usnetelec)
## KPSS Level = 0.15848, Truncation lag parameter = 3, p-value = 0.1

The KPSS test does not reject the null hypothesis of stationarity (p-value above 0.1), so it also indicates that first differencing made the data stationary.

b. `usgdp`

plot(usgdp)

The data increase almost linearly, so first differencing looks sufficient.

Box.test(diff(usgdp), type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  diff(usgdp)
## X-squared = 39.187, df = 1, p-value = 3.85e-10

With a p-value of 3.85e-10, the Ljung-Box test rejects the null hypothesis of no autocorrelation, so the first-differenced usgdp data cannot be treated as a white noise series.

plot(diff(usgdp))

There is still a trend left in the differenced data. It looks as if one more difference would be enough, but I use the ndiffs function to check the number of differences needed.

ndiffs(usgdp)

## [1] 2

plot(diff(diff(usgdp)))

The plot suggests that the twice-differenced data look like a white noise series.

Box.test(diff(diff(usgdp)), type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  diff(diff(usgdp))
## X-squared = 53.294, df = 1, p-value = 2.872e-13

kpss.test(diff(diff(usnetelec)))

## Warning in kpss.test(diff(diff(usnetelec))): p-value greater than printed p-
## value

## 
##  KPSS Test for Level Stationarity
## 
## data:  diff(diff(usnetelec))
## KPSS Level = 0.098532, Truncation lag parameter = 3, p-value = 0.1

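The Ljung-Box test shows that the twice-differenced usgdp data are still not a white noise series. Note that the KPSS call above was run on `diff(diff(usnetelec))` by mistake, so its output does not test usgdp; the intended call is `kpss.test(diff(diff(usgdp)))`. Since ndiffs returned 2, I expect the twice-differenced usgdp data to be stationary even though they are not white noise: a stationary series can still be autocorrelated.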
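The difference between "stationary" and "white noise" can be seen on simulated data. The sketch below assumes a stationary AR(1) process with \(\phi_1 = 0.6\) (chosen only for illustration) and applies the same Ljung-Box test used above.

```r
# Simulate a stationary AR(1) series; it is stationary but autocorrelated.
set.seed(42)
ar1_series <- arima.sim(model = list(ar = 0.6), n = 500)

# Ljung-Box test at lag 1: the null hypothesis is "no autocorrelation".
lb <- Box.test(ar1_series, lag = 1, type = "Ljung-Box")

# A small p-value rejects white noise even though the process is stationary.
print(lb$p.value < 0.05)
```

So a rejected Ljung-Box test after differencing does not by itself mean that more differencing is needed.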

c. `mcopper`

plot(mcopper)

The mcopper data have an increasing trend, and the variation is larger at higher prices. Therefore I’ll use a Box-Cox transformation before differencing.

lambda_mcopper <- BoxCox.lambda(mcopper)
plot(diff(BoxCox(mcopper, lambda_mcopper)))

Box.test(diff(BoxCox(mcopper, lambda_mcopper)),
         type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  diff(BoxCox(mcopper, lambda_mcopper))
## X-squared = 57.517, df = 1, p-value = 3.353e-14

The plot looks as if the Box-Cox transformation and first differencing made the data resemble a white noise series, but the Ljung-Box test shows that they did not.

kpss.test(diff(BoxCox(mcopper, lambda_mcopper)))

## Warning in kpss.test(diff(BoxCox(mcopper, lambda_mcopper))): p-value greater
## than printed p-value

## 
##  KPSS Test for Level Stationarity
## 
## data:  diff(BoxCox(mcopper, lambda_mcopper))
## KPSS Level = 0.057275, Truncation lag parameter = 6, p-value = 0.1

But the KPSS test result shows that first differencing after the Box-Cox transformation was enough to make the data stationary. Even though it didn’t make the data resemble a white noise series, it made the data stationary.
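For reference, the textbook Box-Cox transformation for positive data is \((y^\lambda - 1)/\lambda\) for \(\lambda \neq 0\) and \(\log(y)\) for \(\lambda = 0\). The helper below, `box_cox`, is a hypothetical base-R sketch of that formula and not the `forecast::BoxCox` implementation; the input values are made up.

```r
# Hypothetical helper: textbook Box-Cox transform for positive data.
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Illustrative values only (not taken from mcopper).
prices <- c(1, 4, 9, 16)

# lambda = 0.5 behaves like a shifted, scaled square root.
print(box_cox(prices, 0.5))

# lambda = 0 is the natural logarithm.
print(box_cox(prices, 0))
```

Smaller values of \(\lambda\) compress large observations more strongly, which is why the transformation helps when the variation grows with the level.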

d. `enplanements`

plot(enplanements)

The enplanements data have seasonality and an increasing trend, despite the fall in the number of enplanements in 2001. Therefore I think the data need seasonal differencing as well as first differencing. The variation is larger at higher levels, so I’ll use a Box-Cox transformation before differencing.

lambda_enplanements <- BoxCox.lambda(enplanements)
ndiffs(enplanements)

## [1] 1

diff_enplane <- diff(BoxCox(enplanements, lambda_enplanements))
plot(diff_enplane)

Box.test(diff_enplane, type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  diff_enplane
## X-squared = 16.55, df = 1, p-value = 4.739e-05

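The plot looks as if the Box-Cox transformation and first differencing made the data resemble a white noise series, but the Ljung-Box test shows that they did not. Only a first difference was applied here, with no seasonal difference, so seasonal autocorrelation probably remains in the series.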

kpss.test(diff_enplane)

## Warning in kpss.test(diff_enplane): p-value greater than printed p-value

## 
##  KPSS Test for Level Stationarity
## 
## data:  diff_enplane
## KPSS Level = 0.015112, Truncation lag parameter = 5, p-value = 0.1
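For a monthly series with seasonality, a seasonal difference can be combined with a first difference. The sketch below uses the built-in `AirPassengers` series as a stand-in (an assumption made so the example runs without extra packages) and a log transform in place of an estimated Box-Cox \(\lambda\).

```r
# Built-in monthly series with trend and seasonality (stand-in data).
y <- log(AirPassengers)

# Seasonal difference at lag 12 removes the yearly pattern.
seasonal_diff <- diff(y, lag = 12)

# A further first difference removes any remaining trend.
both_diff <- diff(seasonal_diff)

# 144 observations lose 12 to the seasonal and 1 to the first difference.
print(length(both_diff))
```

The same two `diff()` calls could be applied to the Box-Cox transformed enplanements series to test whether seasonal differencing changes the test results.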

e. `visitors`

plot(visitors)

The visitors data are similar to the enplanements data: they have seasonality and an increasing trend. It looks as if they also need a Box-Cox transformation plus first and seasonal differencing.

lambda_visitors <- BoxCox.lambda(visitors)
ndiffs(visitors)

## [1] 1

diff_visit <- diff(BoxCox(visitors, lambda_visitors))

plot(diff_visit)

Box.test(diff_visit, type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  diff_visit
## X-squared = 21.922, df = 1, p-value = 2.839e-06

As in the earlier examples, the Ljung-Box test rejects white noise for the differenced series.

kpss.test(diff_visit)

## Warning in kpss.test(diff_visit): p-value greater than printed p-value

## 
##  KPSS Test for Level Stationarity
## 
## data:  diff_visit
## KPSS Level = 0.051907, Truncation lag parameter = 4, p-value = 0.1

But the KPSS test result shows that first differencing after the Box-Cox transformation was enough to make the data stationary. In the visitors case, even though it didn’t make the data resemble a white noise series, it made the data stationary.

8.5 For your retail data (from Exercise 3 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

library(readxl)
library(seasonal)

retaildata <- readxl::read_excel("C:/Users/patel/Documents/Data_624/retail.xlsx", skip=1)

myts <- ts(retaildata[,"A3349335T"],
  frequency=12, start=c(1982,4))
plot(myts)

kpss.test(diff(BoxCox(myts, BoxCox.lambda(myts))))

## Warning in kpss.test(diff(BoxCox(myts, BoxCox.lambda(myts)))): p-value greater
## than printed p-value

## 
##  KPSS Test for Level Stationarity
## 
## data:  diff(BoxCox(myts, BoxCox.lambda(myts)))
## KPSS Level = 0.010432, Truncation lag parameter = 5, p-value = 0.1

To make the retail data (`myts`) stationary, I applied a Box-Cox transformation followed by one first difference.

8.6 Use R to simulate and plot some data from simple ARIMA models.

a. Use the following R code to generate data from an AR(1) model with \(\phi_{1} = 0.6\) and \(\sigma^2=1\). The process starts with \(y_1=0\)

ar.1 <- function(phi, sd=1, n=100){
  y <- ts(numeric(n))
  e <- rnorm(n, sd=sd)
  for(i in 2:n)
    y[i] <- phi*y[i-1] + e[i]
  return(y)
}

b. Produce a time plot for the series. How does the plot change as you change \(\phi_{1}\) ?

autoplot(ar.1(0.006), series = "0.006") + 
  geom_line(colour = "red") +   
  autolayer(ar.1(0.06), series = "0.06") +   
  autolayer(ar.1(0.6), series = "0.6") +
  ylab("Data") +  
  guides(colour = guide_legend(title = "Phi"))

par(mfrow=c(1,3))
acf(ar.1(0.006),main='Phi=0.006')
acf(ar.1(0.06),main='Phi=0.06')
acf(ar.1(0.6),main='Phi=0.6')
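The effect of \(\phi_{1}\) can also be read from the theoretical ACF of an AR(1) process, which equals \(\phi_{1}^{k}\) at lag k. The sketch below uses `ARMAacf` from base R's stats package for the three values plotted above; the object names are only illustrative.

```r
# Theoretical ACF at lags 0..3 for each phi; for AR(1) it equals phi^k.
phis <- c(0.006, 0.06, 0.6)
theoretical_acf <- sapply(phis, function(phi) ARMAacf(ar = phi, lag.max = 3))
colnames(theoretical_acf) <- paste0("phi=", phis)

# Rows are lags 0, 1, 2, 3; columns are the three phi values.
print(round(theoretical_acf, 4))
```

With \(\phi_{1}\) near zero the series is close to white noise, while with \(\phi_{1} = 0.6\) neighbouring values are clearly correlated, so the time plot looks smoother and wanders further from zero.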

c. Write your own code to generate data from an MA(1) model with \(\theta_{1} = 0.6\) and \(\sigma^2=1\)

set.seed(17)
ma.1 <- function(theta, sd=1, n=100){
  y <- ts(numeric(n))
  e <- rnorm(n, sd=sd)
  e[1] <- 0
  for(i in 2:n)
    y[i] <- theta*e[i-1] + e[i]
  return(y)
}

d. Produce a time plot for the series. How does the plot change as you change \(\theta_{1}\) ?

autoplot(ma.1(0.006), series = "0.006") + 
  geom_line(colour = "red") +   
  autolayer(ma.1(0.06), series = "0.06") +   
  autolayer(ma.1(0.6), series = "0.6") +
  ylab("Data") +  
  guides(colour = guide_legend(title = "Theta"))

par(mfrow=c(1,3))
acf(ma.1(0.006),main='Theta=0.006')
acf(ma.1(0.06),main='Theta=0.06')
acf(ma.1(0.6),main='Theta=0.6')
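For an MA(1) process the theoretical lag-1 autocorrelation is \(\theta_{1}/(1+\theta_{1}^{2})\), and all higher lags are zero. The sketch below checks that formula against `ARMAacf` from base R for the three values plotted above; the object names are only illustrative.

```r
# Closed-form lag-1 autocorrelation of MA(1): theta / (1 + theta^2).
thetas <- c(0.006, 0.06, 0.6)
rho1_formula <- thetas / (1 + thetas^2)

# The same quantity from ARMAacf, taking the element named "1" (lag 1).
rho1_check <- sapply(thetas, function(theta) ARMAacf(ma = theta, lag.max = 2)[["1"]])

print(round(rbind(formula = rho1_formula, ARMAacf = rho1_check), 4))
```

Changing \(\theta_{1}\) therefore affects only the dependence between adjacent observations, so the time plots differ less visibly than in the AR(1) case.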

e. Generate data from an ARMA(1,1) model with \(\phi_{1} = 0.6\) , \(\theta_{1} = 0.6\) and \(\sigma^2=1\)

y1 <- ts(numeric(100))
e <- rnorm(100, sd=1)
for(i in 2:100)
  y1[i] <- 0.6*y1[i-1] + 0.6*e[i-1] + e[i] 
plot(y1)

f. Generate data from an AR(2) model with \(\phi_{1} =-0.8, \phi_{2} = 0.3\) and \(\sigma^2=1\)

y2 <- ts(numeric(100))
e <- rnorm(100, sd=1)
for(i in 3:100)
  y2[i] <- -0.8*y2[i-1] + 0.3*y2[i-2] + e[i]
plot(y2)

g. Graph the latter two series and compare them.

par(mfrow=c(1,2))
acf(y1, main='ARMA(1,1)')
acf(y2, main='AR(2)')

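The data from the AR(2) model oscillate with increasing amplitude, so they are non-stationary. This is expected, because the parameters violate the AR(2) stationarity condition \(\phi_{2} - \phi_{1} < 1\): here \(0.3 - (-0.8) = 1.1\). The data from the ARMA(1,1) model, in contrast, are stationary.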
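Stationarity of the autoregressive part can be checked numerically: the process is stationary when every root of \(1 - \phi_{1}z - \dots - \phi_{p}z^{p}\) lies outside the unit circle. The helper `is_stationary_ar` below is a hypothetical sketch of that check using `polyroot` from base R.

```r
# Hypothetical helper: TRUE when all roots of the AR polynomial
# 1 - phi_1 z - ... - phi_p z^p have modulus greater than 1.
is_stationary_ar <- function(phi) {
  roots <- polyroot(c(1, -phi))
  all(Mod(roots) > 1)
}

# AR(2) from part f: one root has modulus below 1, so it is non-stationary.
print(is_stationary_ar(c(-0.8, 0.3)))  # FALSE

# AR part of the ARMA(1,1) from part e: root 1/0.6 is outside the unit circle.
print(is_stationary_ar(0.6))           # TRUE
```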

V Patel

2020-10-19
