Data Information

Data Manipulation

# Packages used throughout: xts for the time series object and forecast
# for auto.arima(), ets(), and forecast()
library(xts)
library(forecast)

# Variable selection: keep the Date, Close, and Volume columns of the raw data frame df

new_df <- df[, c(1, 5, 7)]

# Convert the "Date" column to Date class

new_df$Date <- as.Date(new_df$Date)

# Turn the data frame into a time series object

amzn_full <- xts(new_df[, -1], order.by = new_df$Date)

# Training set: the first 481 observations

amzn <- amzn_full[1:481, ]

# Hold-out "Volume" values (observations 482-503), used later as the future external regressor

v_level <- amzn_full$Volume[482:503]

Data Visualization
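
As a minimal sketch (an assumption, not the original figures), the two series in amzn_full can be plotted directly:

# Plot the daily closing price and trading volume
plot(amzn_full$Close, main = "AMZN Daily Closing Price")
plot(amzn_full$Volume, main = "AMZN Daily Trading Volume")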

Models

1. ARIMA With External Predictor

# ARIMA model with external predictors 

arima_model <- auto.arima(amzn$Close, xreg = amzn$Volume)

print(summary(arima_model))
## Series: amzn$Close 
## Regression with ARIMA(0,1,0) errors 
## 
## Coefficients:
##       Volume
##            0
## s.e.       0
## 
## sigma^2 = 9.649:  log likelihood = -1224.63
## AIC=2453.27   AICc=2453.29   BIC=2461.61
## 
## Training set error measures:
##                      ME     RMSE      MAE         MPE     MAPE      MASE
## Training set 0.01222973 3.099813 2.263282 -0.02607771 1.884501 0.9989591
##                     ACF1
## Training set -0.01191041

The auto-selected result is a regression with ARIMA(0,1,0) errors, i.e. the regression errors follow a random walk.
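
The same specification can also be written out explicitly with Arima() from the forecast package; the refit below is a sketch added for illustration (an assumption, not part of the original analysis) and should reproduce the auto-selected fit.

# Refit the auto-selected order explicitly as a cross-check
rw_fit <- Arima(amzn$Close, order = c(0, 1, 0), xreg = amzn$Volume)
summary(rw_fit)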

# Forecast the hold-out period, supplying the future "Volume" values as the
# external regressor (the horizon follows the number of rows in xreg)

arima_result <- forecast(arima_model, xreg = v_level)
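
As a quick check (a sketch added here, not part of the original code), the forecast can be plotted and scored against the actual closing prices, assuming observations 482-503 of amzn_full are the hold-out window:

# Plot the forecast and compare it with the held-out closing prices
plot(arima_result)
actual_close <- as.numeric(amzn_full$Close[482:503])
accuracy(arima_result, actual_close)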

2. ARIMA

# Regular ARIMA model

norm_arima_model <- auto.arima(amzn$Close)

summary(norm_arima_model)
## Series: amzn$Close 
## ARIMA(1,2,2) 
## 
## Coefficients:
##           ar1      ma1      ma2
##       -0.7790  -0.2186  -0.7614
## s.e.   0.5178   0.5347   0.5292
## 
## sigma^2 = 9.708:  log likelihood = -1224.43
## AIC=2456.87   AICc=2456.95   BIC=2473.55
## 
## Training set error measures:
##                     ME     RMSE      MAE       MPE     MAPE     MASE
## Training set 0.2453345 3.099501 2.282659 0.1930221 1.900045 1.007511
##                     ACF1
## Training set -0.01962389

# Perform prediction into the next 30 days

norm_arima_result <- forecast(norm_arima_model, h = 30)

3. ETS

# ETS model

ets_model <- ets(amzn$Close)

summary(ets_model)
## ETS(A,Ad,N) 
## 
## Call:
##  ets(y = amzn$Close) 
## 
##   Smoothing parameters:
##     alpha = 0.9727 
##     beta  = 1e-04 
##     phi   = 0.9713 
## 
##   Initial states:
##     l = 173.0921 
##     b = -1.7545 
## 
##   sigma:  3.0953
## 
##      AIC     AICc      BIC 
## 4064.502 4064.679 4089.557 
## 
## Training set error measures:
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 0.1298499 3.079128 2.262811 0.06372123 1.884763 0.9987511
##                      ACF1
## Training set -0.002629638

# Perform prediction into the next 30 days

ets_result <- forecast(ets_model, h = 30)

Summary

A total of three models were fit above. As their summaries show, the ARIMA model with the external predictor and the ETS model produce quite similar forecasts and nearly identical training error measures. Comparing the evaluation measures, the plain ARIMA(1,2,2) performs the worst of the three: it has the highest training MAE and MAPE, and its AICc is also higher than that of the ARIMA model with the external predictor.
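
To make the quoted measures easier to scan side by side, the training-set accuracy of the three fitted models can be collected into one table (a sketch, not part of the original write-up):

# Collect the training-set error measures of the three models
rbind(
  arima_xreg = accuracy(arima_model)[1, c("RMSE", "MAE", "MAPE")],
  arima      = accuracy(norm_arima_model)[1, c("RMSE", "MAE", "MAPE")],
  ets        = accuracy(ets_model)[1, c("RMSE", "MAE", "MAPE")]
)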

It is important to note that, despite the similarity of their forecasts and of their MAE and MAPE values, I would choose the ARIMA model with the external predictor over ETS because it reports the lowest AICc value, although this particular comparison should be read with some caution since AICc values are not directly comparable between ARIMA and ETS models.

Additionally, the choice of the ARIMA model with the external predictor is further supported by its ability to draw on information from an external predictor, in our case the daily trading volume of Amazon's stock, which the traditional ETS model cannot use. Including an external regressor allows the model to capture an additional source of variability and adjust its forecasts accordingly, which can lead to better predictions of future outcomes.
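
Because the later closing prices are available in amzn_full, the three sets of point forecasts can also be scored on the hold-out window itself. The sketch below is an assumption added for illustration (it uses observations 482-503 as the evaluation window and plain error formulas), not part of the original analysis.

# Score each model's point forecasts against the held-out closing prices
actual <- as.numeric(amzn_full$Close[482:503])
h_eval <- length(actual)

# RMSE, MAE, and MAPE of a vector of point forecasts against the actuals
err_stats <- function(pred, actual) {
  e <- actual - pred
  c(RMSE = sqrt(mean(e^2)), MAE = mean(abs(e)), MAPE = 100 * mean(abs(e / actual)))
}

rbind(
  arima_xreg = err_stats(as.numeric(arima_result$mean)[1:h_eval], actual),
  arima      = err_stats(as.numeric(norm_arima_result$mean)[1:h_eval], actual),
  ets        = err_stats(as.numeric(ets_result$mean)[1:h_eval], actual)
)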