This data is the monthly adjusted closing price for Home Depot and Lowe's. The data source is Yahoo Finance.
Setup
library(forecast)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
library(ggplot2)
library(tseries)
library(vars)
## Loading required package: MASS
## Loading required package: strucchange
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: sandwich
## Loading required package: urca
## Loading required package: lmtest
Data
ts = ts(HD, frequency = 12, start = c(2015,8))
ts = ts[,-1]
train = as.ts(head(ts, n = 48))
test = as.ts(tail(ts, n = 12))
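The same 48/12 split can also be written by calendar date with window(), which keeps the monthly time index on both pieces. Depending on the R version, head() and tail() may drop the time-series attributes, in which case as.ts() restarts the index at 1. A minimal sketch, assuming a 60-month, two-column series starting in August 2015; the simulated prices and the names prices_ts, train_w and test_w are stand-ins, since the HD data frame is not reproduced here:

```r
# Simulated stand-in for the two price columns (not the real data)
set.seed(42)
prices <- cbind(HD = 100 + cumsum(rnorm(60)), Lowe = 60 + cumsum(rnorm(60)))
prices_ts <- ts(prices, frequency = 12, start = c(2015, 8))

# First 48 months (Aug 2015 to Jul 2019) for training, last 12 for testing
train_w <- window(prices_ts, end = c(2019, 7))
test_w  <- window(prices_ts, start = c(2019, 8))

dim(train_w)    # rows x columns of the training window
dim(test_w)     # rows x columns of the test window
start(test_w)   # the test window keeps its calendar start
```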
EDA
autoplot(ts, main = "Adj. Closing Stock Prices", ylab = "Stock Price ($)")
cor(HD$HD, HD$Lowe)
## [1] 0.955375
adf.test(ts[,1])
##
## Augmented Dickey-Fuller Test
##
## data: ts[, 1]
## Dickey-Fuller = -3.2708, Lag order = 3, p-value = 0.0848
## alternative hypothesis: stationary
adf.test(ts[,2])
##
## Augmented Dickey-Fuller Test
##
## data: ts[, 2]
## Dickey-Fuller = -3.6078, Lag order = 3, p-value = 0.04015
## alternative hypothesis: stationary
hist(ts[,1])
hist(ts[,2])
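The plot shows the two stocks moving together, and the correlation between their price levels is 0.955. The ADF results are mixed at the 5% level: the test does not reject a unit root for the first series (p = 0.085) but does for the second (p = 0.040).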
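Because trending price series can be highly correlated in levels for reasons unrelated to their month-to-month movements, it can help to cross-check the correlation on first differences. A minimal sketch on simulated data (two random walks with drift that are independent by construction, not the real prices); the names a, b, cor_levels and cor_changes are illustrative:

```r
# Two independent random walks with drift (simulated, not the real prices)
set.seed(7)
a <- cumsum(rnorm(60, mean = 1.5, sd = 4))
b <- cumsum(rnorm(60, mean = 1.0, sd = 3))

cor_levels  <- cor(a, b)              # correlation of the levels
cor_changes <- cor(diff(a), diff(b))  # correlation of month-to-month changes

c(levels = cor_levels, changes = cor_changes)
```

For the series in this document, the analogous check would be cor(diff(ts[,1]), diff(ts[,2])).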
VAR select
select = VARselect(train[,1:2], lag.max=8,
type="const")[["selection"]]
select
## AIC(n) HQ(n) SC(n) FPE(n)
## 8 8 1 8
SC selects 1 lag, while AIC, HQ, and FPE all select 8, so I will fit both a VAR(1) and a VAR(8).
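The selection vector only reports the winning lag for each criterion. VARselect() also returns the full table of criterion values, which shows how close the competing lags are. A minimal sketch, assuming the vars package is installed and using a simulated bivariate series in place of the training data (the coefficient matrix A and the series y are made up for illustration):

```r
library(vars)

# Simulated bivariate VAR(1) series standing in for train[,1:2]
set.seed(1)
n <- 48
y <- matrix(0, n, 2, dimnames = list(NULL, c("HD", "Lowe")))
A <- matrix(c(0.5, 0.1, 0.2, 0.4), 2, 2)
for (t in 2:n) y[t, ] <- A %*% y[t - 1, ] + rnorm(2)
y <- ts(y, frequency = 12, start = c(2015, 8))

sel <- VARselect(y, lag.max = 8, type = "const")
sel$selection           # lag chosen by each criterion
round(sel$criteria, 3)  # criterion values for lags 1 to 8
```

With only 48 observations, the criteria that penalize parameters lightly can favor long lags, so comparing the values across lags gives more context than the selected lag alone.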
VAR
var1 <- VAR(train[,1:2], p=1, type="const")
serial.test(var1, lags.pt=10, type="PT.asymptotic")
##
## Portmanteau Test (asymptotic)
##
## data: Residuals of VAR object var1
## Chi-squared = 49.16, df = 36, p-value = 0.07068
var2 <- VAR(train[,1:2], p=8, type="const")
serial.test(var2, lags.pt=10, type="PT.asymptotic")
##
## Portmanteau Test (asymptotic)
##
## data: Residuals of VAR object var2
## Chi-squared = 20.614, df = 8, p-value = 0.008246
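The Portmanteau test does not reject the null of no residual serial correlation for the VAR(1) at the 5% level (p = 0.071), but it does reject it for the VAR(8) (p = 0.008). I proceed with the VAR(8), which three of the four selection criteria favor, keeping in mind that its residuals show some remaining serial correlation.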
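The degrees of freedom reported by the two tests follow from the asymptotic Portmanteau formula K^2 (h - p), where K is the number of series, h is lags.pt, and p is the VAR order. The arithmetic below reproduces them and also counts the coefficients per equation; pt_df and coef_per_eq are hypothetical helper names used only for this illustration:

```r
# Degrees of freedom of the asymptotic Portmanteau test: K^2 * (h - p)
pt_df <- function(K, h, p) K^2 * (h - p)

# Coefficients per equation in a VAR(p) with a constant: K * p + 1
coef_per_eq <- function(K, p) K * p + 1

pt_df(K = 2, h = 10, p = 1)   # 36, matching df for var1
pt_df(K = 2, h = 10, p = 8)   # 8, matching df for var2
coef_per_eq(K = 2, p = 8)     # 17 coefficients per equation
48 - 8                        # 40 usable observations after the 8 lags
```

With lags.pt = 10, the VAR(8) test rests on only two lags beyond the model order. A larger lags.pt, or type = "PT.adjusted" for small samples, may be worth trying as a robustness check.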
Forecast
fc = forecast(var2,h=12)
autoplot(fc)
checkresiduals(fc$forecast$HD)
checkresiduals(fc$forecast$Lowe)
# Note: test[1] is a single element of the test matrix, not the HD column (test[,1])
accuracy(fc$forecast$HD,test[1])
## ME RMSE MAE MPE MAPE MASE
## Training set -3.553147e-16 4.134252 3.339148 -0.0760513 2.251943 0.4957663
## Test set -1.384163e+01 13.841631 13.841631 -6.2275640 6.227564 2.0550794
## ACF1
## Training set -0.04051015
## Test set NA
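# Note: test[2] is the second element of the test matrix (column-major order), not the Lowe column (test[,2])
accuracy(fc$forecast$Lowe, test[2])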
## ME RMSE MAE MPE MAPE MASE
## Training set 1.775923e-16 3.710486 3.120389 -0.1949789 3.817529 0.6471301
## Test set 1.182114e+02 118.211410 118.211410 52.2430139 52.243014 24.5155843
## ACF1
## Training set -0.02426916
## Test set NA
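The residual checks for the individual Home Depot and Lowe's equations do not show troublesome serial correlation, although the Portmanteau test on the full VAR(8) did reject the null of no serial correlation (p = 0.008). The test-set rows need care: test[1] and test[2] each select a single element of the test matrix rather than a column, so each row scores one forecast against one observation, which is why RMSE equals MAE and ACF1 is NA. On that single point the Home Depot forecast is decent, off by about 13.84 (6.2%). The Lowe's figure of 118 (52%) is not a fair measure of the Lowe's forecast, because test[2] is the second value of the first column, which appears to be a Home Depot price rather than a Lowe's price. Given how correlated the two stocks are, a gap this large more likely reflects the mismatched comparison than the forecasts themselves. Scoring against test[,1] and test[,2] would give a proper 12-month comparison.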
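The difference between single-element and column indexing on a two-column matrix can be seen in a small example. This is a sketch on simulated values, not the real test set; test_m, fc_m and the rmse helper are hypothetical names introduced for illustration:

```r
# Simulated 12 x 2 test matrix and forecasts (illustrative values only)
set.seed(3)
test_m <- cbind(HD = 220 + cumsum(rnorm(12)), Lowe = 105 + cumsum(rnorm(12)))
fc_m   <- test_m + matrix(rnorm(24, sd = 5), 12, 2)  # hypothetical forecasts

test_m[2]     # one value: row 2 of the HD column (column-major order)
test_m[, 2]   # all 12 values of the Lowe column

# Hypothetical helper: root mean squared error over a whole column
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

c(HD   = rmse(test_m[, "HD"],   fc_m[, "HD"]),
  Lowe = rmse(test_m[, "Lowe"], fc_m[, "Lowe"]))
```

With the objects in this document, the corresponding calls would be accuracy(fc$forecast$HD, test[,1]) and accuracy(fc$forecast$Lowe, test[,2]).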