VAR stands for vector autoregression. The benefit of a VAR model over what we have learned so far is that one can look at how variables affect each other, instead of just a unidirectional relationship. A VAR model fits one linear regression per variable, each on the lagged values of every variable in the system, so each series is used to help forecast the other, and vice versa.
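As a rough sketch of what that means for two series, a VAR with a single lag can be written as the pair of equations below. The symbols are my own notation rather than anything from the course material: $y_{1,t}$ and $y_{2,t}$ are the two series, $c_i$ are constants, $a_{ij}$ are the lag coefficients, and $e_{i,t}$ are the error terms. A trend term can be added to each equation in the same way.

```latex
\begin{aligned}
y_{1,t} &= c_1 + a_{11}\, y_{1,t-1} + a_{12}\, y_{2,t-1} + e_{1,t} \\
y_{2,t} &= c_2 + a_{21}\, y_{1,t-1} + a_{22}\, y_{2,t-1} + e_{2,t}
\end{aligned}
```

The cross terms $a_{12}$ and $a_{21}$ are what let each variable feed into the forecast of the other; if both were zero, the system would collapse into two separate AR(1) models.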
I will be comparing the stock prices of Nike (NKE) and United Natural Foods Inc. (UNFI) to see if there might be some sort of relationship between athletic wear sales and health food sales.
library(forecast)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
library(fpp2)
## Loading required package: ggplot2
## Loading required package: fma
## Loading required package: expsmooth
library(vars)
## Loading required package: MASS
##
## Attaching package: 'MASS'
## The following objects are masked from 'package:fma':
##
## cement, housing, petrol
## Loading required package: strucchange
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: sandwich
## Loading required package: urca
## Loading required package: lmtest
data <- read.csv("C:/Users/andre/Documents/Boston College/2. Summer 2020/Forecasting/week 6/week6.csv")
data.ts <- ts(data[2:3], start = c(2015,9), frequency = 12)
autoplot(data.ts)
VARselect(data.ts)
## $selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 1 1 1 1
##
## $criteria
## 1 2 3 4 5 6
## AIC(n) 5.709257 5.844912 5.968709 6.095825 5.961497 5.900285
## HQ(n) 5.796105 5.989659 6.171354 6.356369 6.279940 6.276626
## SC(n) 5.936530 6.223701 6.499014 6.777646 6.794834 6.885137
## FPE(n) 301.728828 345.909017 392.366258 447.328204 393.571265 373.631711
## 7 8 9 10
## AIC(n) 5.874339 5.975119 6.027019 6.010317
## HQ(n) 6.308578 6.467258 6.577056 6.618252
## SC(n) 7.010707 7.263003 7.466419 7.601232
## FPE(n) 368.757631 414.917738 446.903180 452.231997
Looking at the plot of these two stocks, there seems to be some sort of inverse relationship between them starting right around 2018. Since all four criteria agree on a lag of one, that is the VAR model that I will run (didn’t even mean to have that rhyme).
var2 <- VAR(data.ts, p = 1, type = "both")
plot(var2)
fc_var2 <- forecast(var2, h = 12)
autoplot(fc_var2)
Now I want to see how well these two companies predict each other’s stock prices by looking at the error metrics for each series.
acc_UNFI <- accuracy(fc_var2$forecast$UNFI)
acc_NKE <- accuracy(fc_var2$forecast$NKE)
acc_UNFI
## ME RMSE MAE MPE MAPE MASE
## Training set 1.473069e-17 3.841569 2.936223 -3.112916 12.85338 0.2481372
## ACF1
## Training set 0.01411832
acc_NKE
## ME RMSE MAE MPE MAPE MASE
## Training set -8.289236e-16 4.07869 3.161761 -0.3503399 4.485871 0.2646998
## ACF1
## Training set -0.01472569
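From my understanding, each equation in a VAR with one lag uses the previous value of both stocks, so the accuracy of UNFI comes from an equation with lagged NKE and lagged UNFI as predictors, and the accuracy of NKE comes from the matching equation for NKE. These are training-set metrics, and they do not all point the same way: UNFI has slightly lower RMSE, MAE, and MASE, while NKE has a much lower MAPE (4.49 versus 12.85). Going by the RMSE, MAE, and MASE, the model fits UNFI a little better than NKE, which would be consistent with NKE doing a better job of forecasting UNFI than UNFI does of NKE, although these numbers alone cannot separate how much each predictor contributes. One reason that I could think of to explain this is that if people are making decisions about living a healthier lifestyle, they first turn to exercise before diet, but diet would follow.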
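One way to check the direction of predictability more directly is a Granger causality test, which asks whether the lags of one series improve the forecast of the other. The sketch below uses `causality()` from the vars package on simulated data, not the NKE and UNFI prices: the series `x` and `y`, their coefficients, and the seed are all made up for illustration, with `x` built to lead `y` by one month.

```r
library(vars)

# Simulated stand-in data (NOT the NKE/UNFI prices).
# By construction, last month's x feeds into this month's y, but not the reverse.
set.seed(42)
n <- 120
x <- numeric(n)
y <- numeric(n)
for (t in 2:n) {
  x[t] <- 0.6 * x[t - 1] + rnorm(1)
  y[t] <- 0.3 * y[t - 1] + 0.5 * x[t - 1] + rnorm(1)
}
sim.ts <- ts(cbind(x = x, y = y), start = c(2015, 9), frequency = 12)

# Fit a VAR with one lag and a constant
fit <- VAR(sim.ts, p = 1, type = "const")

# Granger causality tests in each direction.
# Null hypothesis: the "cause" variable does not help forecast the other one.
gc_x <- causality(fit, cause = "x")$Granger
gc_y <- causality(fit, cause = "y")$Granger

gc_x$p.value  # small p-value: evidence that x helps forecast y
gc_y$p.value  # small p-value: evidence that y helps forecast x
```

With the model from this assignment, the equivalent calls would presumably be `causality(var2, cause = "NKE")` and `causality(var2, cause = "UNFI")`, assuming the columns of `data.ts` are named `NKE` and `UNFI` as the forecast output suggests.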