Discussion6

The data are monthly adjusted closing prices for Home Depot and Lowe's, sourced from Yahoo Finance.

Setup

library(forecast)

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

library(ggplot2)
library(tseries)
library(vars)

## Loading required package: MASS

## Loading required package: strucchange

## Loading required package: zoo

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

## Loading required package: sandwich

## Loading required package: urca

## Loading required package: lmtest

Data

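# HD: data frame assumed to be loaded already (date column, then HD and Lowe prices)
ts = ts(HD, frequency = 12, start = c(2015,8))
ts = ts[,-1]  # drop the date column

# First 48 months for training, last 12 for testing. Depending on the R version,
# head()/tail() may drop the time index, so as.ts() can restart it at 1.
train = as.ts(head(ts, n = 48))
test = as.ts(tail(ts, n = 12))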
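
One way to make the split keep its calendar dates is `window()`. The sketch below uses a synthetic 60-month series (`prices`, with made-up values) in place of the real data, and assumes the sample runs from August 2015 through July 2020; `train_w` and `test_w` are illustrative names, not objects from the analysis.

```r
# Synthetic stand-in for the two price series (values are made up).
set.seed(1)
prices <- ts(cbind(HD   = cumsum(rnorm(60)) + 100,
                   Lowe = cumsum(rnorm(60)) + 70),
             frequency = 12, start = c(2015, 8))

# window() subsets by date, so both pieces keep their monthly index.
train_w <- window(prices, end = c(2019, 7))    # Aug 2015 to Jul 2019
test_w  <- window(prices, start = c(2019, 8))  # Aug 2019 to Jul 2020

nrow(train_w)      # number of training months
nrow(test_w)       # number of test months
start(test_w)      # first test period as (year, month)
frequency(test_w)  # observations per year
```

With the dates preserved, forecast plots and accuracy comparisons line up on the calendar rather than on an index starting at 1.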

EDA

autoplot(ts, main = "Adj. Closing Stock Prices", ylab = "Stock Price ($)")

cor(HD$HD, HD$Lowe)

## [1] 0.955375

adf.test(ts[,1])

## 
##  Augmented Dickey-Fuller Test
## 
## data:  ts[, 1]
## Dickey-Fuller = -3.2708, Lag order = 3, p-value = 0.0848
## alternative hypothesis: stationary

adf.test(ts[,2])

## 
##  Augmented Dickey-Fuller Test
## 
## data:  ts[, 2]
## Dickey-Fuller = -3.6078, Lag order = 3, p-value = 0.04015
## alternative hypothesis: stationary

hist(ts[,1])

hist(ts[,2])

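The time series plot shows the two stocks moving together, and the sample correlation of 0.955 between the price levels confirms it. The ADF tests are less clear-cut: at the 5% level the unit-root null is not rejected for Home Depot (p = 0.085) and is only narrowly rejected for Lowe's (p = 0.040).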

VAR select

select  = VARselect(train[,1:2], lag.max=8,
  type="const")[["selection"]]
select

## AIC(n)  HQ(n)  SC(n) FPE(n) 
##      8      8      1      8

SC selects one lag, while AIC, HQ, and FPE all select the maximum of eight, so I will fit both a VAR(1) and a VAR(8).
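
When the criteria disagree this sharply, it can help to look at the full criteria table rather than only the selected lags. The sketch below runs `VARselect()` on a synthetic bivariate series (`y`, with made-up values) of the same length as the training set; it is an illustration, not a rerun of the analysis above.

```r
library(vars)

# Synthetic bivariate series (made-up values), 48 observations like the training set.
set.seed(42)
y <- ts(cbind(a = cumsum(rnorm(48)), b = cumsum(rnorm(48))), frequency = 12)

sel <- VARselect(y, lag.max = 8, type = "const")
sel$selection           # lag chosen by each criterion
round(sel$criteria, 3)  # full table: one row per criterion, one column per lag
```

Note that a VAR(8) on two series estimates 2 * 8 + 1 = 17 coefficients per equation, which is a lot to ask of 48 monthly observations.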

VAR

var1 <- VAR(train[,1:2], p=1, type="const")
serial.test(var1, lags.pt=10, type="PT.asymptotic")

## 
##  Portmanteau Test (asymptotic)
## 
## data:  Residuals of VAR object var1
## Chi-squared = 49.16, df = 36, p-value = 0.07068

var2 <- VAR(train[,1:2], p=8, type="const")
serial.test(var2, lags.pt=10, type="PT.asymptotic")

## 
##  Portmanteau Test (asymptotic)
## 
## data:  Residuals of VAR object var2
## Chi-squared = 20.614, df = 8, p-value = 0.008246

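The portmanteau test favors the VAR(1): at the 5% level it does not reject the null of no residual serial correlation for the VAR(1) (p = 0.071), but it does reject it for the VAR(8) (p = 0.008). The forecasts below nevertheless use the VAR(8), the order chosen by AIC, HQ, and FPE, so its residual autocorrelation should be kept in mind.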
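
The degrees of freedom in the two tests differ because the asymptotic portmanteau statistic for a VAR(p) with K series and h residual lags has K^2 (h - p) degrees of freedom. The sketch below checks that arithmetic and recomputes the p-values from the chi-squared statistics printed above; `pt_df` is a hypothetical helper name, not a function from `vars`.

```r
# Degrees of freedom of the asymptotic portmanteau test: K^2 * (h - p).
pt_df <- function(K, h, p) K^2 * (h - p)

df_var1 <- pt_df(K = 2, h = 10, p = 1)  # matches df = 36 reported for var1
df_var2 <- pt_df(K = 2, h = 10, p = 8)  # matches df = 8 reported for var2

# Upper-tail chi-squared probabilities for the reported statistics;
# these should be close to the p-values in the test output.
p_var1 <- pchisq(49.16,  df = df_var1, lower.tail = FALSE)
p_var2 <- pchisq(20.614, df = df_var2, lower.tail = FALSE)
round(c(var1 = p_var1, var2 = p_var2), 4)
```

With only 8 degrees of freedom left, the VAR(8) test is based on just two residual lags beyond the model order, so a larger `lags.pt` may be worth trying.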

Forecast

fc = forecast(var2,h=12)
autoplot(fc)

checkresiduals(fc$forecast$HD)

checkresiduals(fc$forecast$Lowe)

accuracy(fc$forecast$HD,test[1])

##                         ME      RMSE       MAE        MPE     MAPE      MASE
## Training set -3.553147e-16  4.134252  3.339148 -0.0760513 2.251943 0.4957663
## Test set     -1.384163e+01 13.841631 13.841631 -6.2275640 6.227564 2.0550794
##                     ACF1
## Training set -0.04051015
## Test set              NA

accuracy(fc$forecast$Lowe, test[2])

##                        ME       RMSE        MAE        MPE      MAPE       MASE
## Training set 1.775923e-16   3.710486   3.120389 -0.1949789  3.817529  0.6471301
## Test set     1.182114e+02 118.211410 118.211410 52.2430139 52.243014 24.5155843
##                     ACF1
## Training set -0.02426916
## Test set              NA

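The test-set rows above need care. `test[1]` and `test[2]` select single elements of the test matrix rather than columns, so each test row is computed from one observation; that is why RMSE equals MAE and ACF1 is NA. Because R stores matrices column by column, `test[2]` is the second Home Depot value, not a Lowe's price, so the Lowe's test error of 118 (MAPE 52%) compares a Lowe's forecast with a Home Depot price and says nothing about forecast quality. Passing `test[,1]` and `test[,2]` would score all 12 forecasts against the correct series. For the single month that was scored correctly, the Home Depot forecast overshoots by about 13.8 (MAPE 6.2%), which looks reasonable but is too little to judge the model on. On the training set, both sets of residuals show little first-order autocorrelation (ACF1 of -0.04 and -0.02), although the portmanteau test above does reject the null of no serial correlation for the VAR(8). Given how strongly the two series are correlated, a properly scored comparison of the two forecasts would be the interesting next step.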
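
The difference between element indexing and column indexing on a matrix can be seen on a toy example. The matrix `toy` below uses made-up values and only stands in for the shape of the test set.

```r
# Toy 3 x 2 matrix standing in for the test set (values are made up).
toy <- cbind(HD = c(220, 225, 230), Lowe = c(108, 110, 112))

toy[1]    # single element: row 1 of the HD column
toy[2]    # single element: row 2 of the HD column (column-major order)
toy[, 1]  # the whole HD column
toy[, 2]  # the whole Lowe column
```

A single index walks down the first column before reaching the second, so `toy[2]` never touches the `Lowe` column.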

Justin Lynch

8/3/2020