library(readxl, quietly = TRUE, warn.conflicts = FALSE, verbose = FALSE)
library(fpp2, quietly = TRUE, warn.conflicts = FALSE, verbose = FALSE)
library(ggplot2)
library(gridExtra)
library(mlbench)
library(caret)
library(corrplot)
library(dplyr)
library(kableExtra)
library(e1071)
library(urca)

Exercises 8.1, 8.2, 8.3, 8.5, 8.6, 8.7

Exercise 8.1

a

All three ACFs look like those of white noise series. For series X1 (which has one borderline spike) and X3, no autocorrelations lie meaningfully outside the 95% limits; for series X2, four autocorrelations fall outside the 95% limits. We could also compute the Ljung-Box Q* statistic and check its p-value for statistical significance.
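As a minimal sketch (using a simulated series, since the exercise only provides the figure), a Ljung-Box test can be run with Box.test():

set.seed(42)
wn <- rnorm(360) # simulated white noise
Box.test(wn, lag = 10, type = "Ljung-Box") # a large p-value is consistent with white noise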

b

The critical values are at \(\pm 1.96/\sqrt{T}\), where \(T\) is the length of the series, so they move closer to zero as the series gets longer. By the law of large numbers, the sample autocorrelations themselves also get closer to zero as the sample size increases.
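A quick check of the critical values for the three series lengths used in the figure (36, 360 and 1,000 random numbers):

1.96 / sqrt(c(36, 360, 1000)) # 0.327 0.103 0.062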

Exercise 8.2

A time series with cyclic behaviour (but with no trend or seasonality) is stationary. From the plot below it is clear that there is a trend in IBM's stock price, so the series is not stationary. For a stationary time series the ACF drops to zero relatively quickly, while the ACF of non-stationary data decreases slowly, which is what we see here. Based on the plots below the series needs to be differenced to become stationary; the first difference already looks much more stable (both the first and second differences are plotted below).

autoplot(ibmclose)

ggAcf(ibmclose) 

ggPacf(ibmclose) 

autoplot(diff(ibmclose))       # first difference

autoplot(diff(diff(ibmclose))) # second difference, for comparison
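As an extra check (a quick sketch; the urca package is loaded above), a KPSS unit-root test can be applied before and after differencing. A test statistic above the critical values indicates the series still needs differencing:

summary(ur.kpss(ibmclose))       # original series
summary(ur.kpss(diff(ibmclose))) # after first differencing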

Exercise 8.3

usnetelec

Lambda: 0.5167714. Differencing order: 2.

autoplot(usnetelec)

lambda <- BoxCox.lambda(usnetelec)  # optimal Box-Cox parameter
t_data <- BoxCox(usnetelec, lambda) # transformed series
num_diffs <- ndiffs(t_data)         # differences required for stationarity

autoplot(diff(t_data, differences = num_diffs))
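The same transform-and-difference pattern repeats for each series below, so it could be bundled into a small helper function (a sketch; plot_stationary is a hypothetical name, not from the original):

# Hypothetical helper: Box-Cox transform a series, report lambda and the
# required differencing order, then plot the differenced series (assumes d >= 1)
plot_stationary <- function(x) {
  lambda <- BoxCox.lambda(x)
  t_data <- BoxCox(x, lambda)
  d <- ndiffs(t_data)
  cat("Lambda:", lambda, "Differencing order:", d, "\n")
  autoplot(diff(t_data, differences = d))
}

# e.g. plot_stationary(usgdp)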

usgdp

Lambda: 0.366352. Differencing order: 1.

autoplot(usgdp)

lambda <- BoxCox.lambda(usgdp)
t_data <- BoxCox(usgdp, lambda)
num_diffs <- ndiffs(t_data)

autoplot(diff(t_data, differences = num_diffs))

mcopper

Lambda: 0.1919047. Differencing order: 1.

autoplot(mcopper)

lambda <- BoxCox.lambda(mcopper)
t_data <- BoxCox(mcopper, lambda)
num_diffs <- ndiffs(t_data)

autoplot(diff(t_data, differences = num_diffs))

enplanements

Lambda: -0.2269461. Differencing order: 1.

autoplot(enplanements)

lambda <- BoxCox.lambda(enplanements)
t_data <- BoxCox(enplanements, lambda)
num_diffs <- ndiffs(t_data)

autoplot(diff(t_data, differences = num_diffs))

visitors

Lambda: 0.2775249. Differencing order: 1.

autoplot(visitors)

lambda <- BoxCox.lambda(visitors)
t_data <- BoxCox(visitors, lambda)
num_diffs <- ndiffs(t_data)

autoplot(diff(t_data, differences = num_diffs))

Exercise 8.5

We need to transform the data first. Lambda: 0.1276369. Differencing order: 1.

retailts <- readxl::read_excel("retail.xlsx", skip=1)
retailts <- ts(retailts[,"A3349873A"],frequency=12, start=c(1982,4))
autoplot(retailts)

lambda <- BoxCox.lambda(retailts)
t_data <- BoxCox(retailts, lambda)
autoplot(t_data)

num_diffs <- ndiffs(t_data)

autoplot(diff(t_data, differences = num_diffs))
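Because the retail data is monthly and strongly seasonal, it is worth also checking the number of seasonal differences required (a quick sketch; nsdiffs() comes from the forecast package loaded via fpp2):

nsdiffs(t_data)                        # seasonal differences required
autoplot(diff(diff(t_data, lag = 12))) # seasonal difference, then first difference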

Exercise 8.6

a

ar1 <- function(phi) {
  # simulate an AR(1) series: y_t = phi * y_{t-1} + e_t
  set.seed(1) # same errors for every phi, so the series are comparable
  y <- ts(numeric(100))
  e <- rnorm(100)
  for (i in 2:100) {
    y[i] <- phi * y[i - 1] + e[i]
  }
  return(y)
}

b

The time series with smaller \(\phi\) is more “choppy” and stays closer to 0, while the series with larger \(\phi\) wanders around more. As \(\phi\) approaches 1, the model approaches a random walk.

plt <- autoplot(ar1(0.1))
for (phi in seq(0, 0.9, 0.2)) {
  plt <- plt + autolayer(ar1(phi), series = paste(phi))
}
plt + labs(color = "Phi") +
  theme(axis.title = element_blank(), legend.position = "bottom")

c

ma1 <- function(theta) {
  # simulate an MA(1) series: y_t = theta * e_{t-1} + e_t
  set.seed(1) # same errors for every theta, as in ar1()
  y <- ts(numeric(100))
  e <- rnorm(100)
  for (i in 2:100) {
    y[i] <- theta * e[i - 1] + e[i]
  }
  return(y)
}

d

Based on the plots below the series behave very similarly for different values of \(\theta\); the main visible difference is that the variance of the series grows as \(\theta\) increases.

plt <- autoplot(ma1(0.1))
for (theta in seq(0, 0.9, 0.3)) {
  plt <- plt + autolayer(ma1(theta), series = paste(theta))
}
plt + labs(color = "theta") +
  theme(axis.title = element_blank(), legend.position = "bottom")
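The variance effect follows from \(\text{Var}(y_t) = (1+\theta^2)\sigma^2\) for an MA(1). A quick check (this relies on ma1() reusing the same seed, so only \(\theta\) changes between calls):

sapply(c(0, 0.3, 0.6, 0.9), function(theta) var(ma1(theta))) # increases with theta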

e

arma_test <- function(theta, phi) {
  # simulate an ARMA(1,1) series: y_t = phi * y_{t-1} + theta * e_{t-1} + e_t
  set.seed(5)
  y <- ts(numeric(100))
  e <- rnorm(100)
  for (i in 2:100) {
    y[i] <- phi * y[i - 1] + theta * e[i - 1] + e[i]
  }
  return(y)
}

data1 <- arma_test(0.6, 0.6) # theta1 = 0.6, phi1 = 0.6

f

ar2 <- function(phi1, phi2) {
  # simulate an AR(2) series: y_t = phi1 * y_{t-1} + phi2 * y_{t-2} + e_t
  set.seed(5)
  y <- ts(numeric(100))
  e <- rnorm(100)
  for (i in 3:100) {
    y[i] <- phi1 * y[i - 1] + phi2 * y[i - 2] + e[i]
  }
  return(y)
}

data2 <- ar2(-0.8, 0.3) # phi1 = -0.8, phi2 = 0.3

g

The ARMA(1,1) series with \(\phi_1 = 0.6\) and \(\theta_1 = 0.6\) is stationary and fluctuates around zero. The AR(2) series with \(\phi_1 = -0.8\) and \(\phi_2 = 0.3\) violates the stationarity conditions (\(\phi_2 - \phi_1 = 1.1 > 1\)), so it oscillates with ever-increasing amplitude.

autoplot(data1, series = "ARMA(1,1)") +
  autolayer(data2, series = "AR(2)")

Exercise 8.7

a

Based on the plot below, the data is not stationary: ndiffs() indicates the series needs to be differenced twice (d = 2). In the PACF of the differenced series, the spike at lag 1 is the only value outside the blue limits, which suggests p = 1. Likewise, the significant spike at lag 1 in the ACF suggests q = 1, pointing to an ARIMA(1,2,1) model.

autoplot(wmurders)

ggAcf(diff(wmurders, differences = ndiffs(wmurders)))

ggPacf(diff(wmurders, differences = ndiffs(wmurders)))

b

No. The constant \(c\) has an important effect on the long-term forecasts obtained from the model. If \(c = 0\) and \(d = 2\), the long-term forecasts follow a straight line; if \(c \neq 0\) and \(d = 2\), they follow a quadratic trend, which would be unwise here. So the model should not include a constant.

c

\((1-\phi_1 B)(1-B)^2 y_t = (1 + \theta_1 B)e_t\)
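Expanding the backshift polynomial with \((1-B)^2 = 1 - 2B + B^2\) makes the recursion explicit:

\[y_t = (2+\phi_1)y_{t-1} - (1+2\phi_1)y_{t-2} + \phi_1 y_{t-3} + e_t + \theta_1 e_{t-1}\]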

d

The ACF plot of the residuals from the model shows that all autocorrelations are within the threshold limits, indicating that the residuals behave like white noise. The portmanteau test returns a large p-value, also suggesting that the residuals are white noise.

model <- arima(wmurders, order = c(1, 2, 1))
checkresiduals(model)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,2,1)
## Q* = 12.419, df = 8, p-value = 0.1335
## 
## Model df: 2.   Total lags used: 10

e

forecast(model, h=3) %>%
  kable() %>%
  kable_styling()
|      | Point Forecast |    Lo 80 |    Hi 80 |    Lo 95 |    Hi 95 |
|------|---------------:|---------:|---------:|---------:|---------:|
| 2005 |       2.470660 | 2.200091 | 2.741229 | 2.056861 | 2.884459 |
| 2006 |       2.363106 | 1.993529 | 2.732684 | 1.797886 | 2.928327 |
| 2007 |       2.252833 | 1.774677 | 2.730989 | 1.521557 | 2.984110 |
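As a cross-check (a sketch, not part of the original answer), the 2005 point forecast can be reproduced by hand from the expanded equation in part c, using the fitted coefficients and the last in-sample residual:

phi <- coef(model)["ar1"]
theta <- coef(model)["ma1"]
y <- as.numeric(wmurders)
n <- length(y)
e_last <- tail(residuals(model), 1)
# y_{T+1} = (2 + phi) y_T - (1 + 2*phi) y_{T-1} + phi y_{T-2} + theta * e_T
(2 + phi) * y[n] - (1 + 2 * phi) * y[n - 1] + phi * y[n - 2] + theta * e_last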

f

autoplot(forecast(model, h=3))

g

Yes. The manually selected model is the same as the one chosen by auto.arima().

auto.arima(wmurders)
## Series: wmurders 
## ARIMA(1,2,1) 
## 
## Coefficients:
##           ar1      ma1
##       -0.2434  -0.8261
## s.e.   0.1553   0.1143
## 
## sigma^2 estimated as 0.04632:  log likelihood=6.44
## AIC=-6.88   AICc=-6.39   BIC=-0.97