rm(list = ls())
library(quantmod)
library(forecast)
library(ggplot2)
library(scales)
theme_set(theme_minimal())
library(gridExtra)
library(dygraphs)
library(tseries)
library(ggfortify)
library(TTR)
library(kableExtra)
start <- as.Date("2017-05-01")
end <- as.Date("2018-11-30")
suppressWarnings(getSymbols("ATVI", src = "yahoo", from = start, to = end))
## [1] "ATVI"
ATVI_df <- as.data.frame(ATVI)
ATVI_df<- cbind(Date=as.Date(rownames(ATVI_df)),ATVI_df)
nrow(ATVI)
## [1] 401
acc_table <- data.frame()
forecast_len <- floor(0.1*(nrow(ATVI)))
There are 401 data entries.
A summary of the Activision Blizzard stock.
summary(ATVI)
## Index ATVI.Open ATVI.High ATVI.Low
## Min. :2017-05-01 Min. :47.77 Min. :49.91 Min. :46.83
## 1st Qu.:2017-09-21 1st Qu.:62.13 1st Qu.:62.64 1st Qu.:61.20
## Median :2018-02-14 Median :66.18 Median :67.00 Median :65.14
## Mean :2018-02-13 Mean :67.09 Mean :67.90 Mean :66.15
## 3rd Qu.:2018-07-10 3rd Qu.:71.96 3rd Qu.:72.96 3rd Qu.:71.19
## Max. :2018-11-29 Max. :84.18 Max. :84.68 Max. :82.74
## ATVI.Close ATVI.Volume ATVI.Adjusted
## Min. :49.14 Min. : 1644700 Min. :48.75
## 1st Qu.:61.89 1st Qu.: 4709400 1st Qu.:61.08
## Median :66.09 Median : 5721500 Median :65.33
## Mean :67.06 Mean : 6601388 Mean :66.34
## 3rd Qu.:71.99 3rd Qu.: 7621700 3rd Qu.:71.24
## Max. :83.39 Max. :34543400 Max. :82.73
Check for anyNA values.
anyNA(ATVI)
## [1] FALSE
Transforming the xts to a timeseries (ts).
ATVI_ts <- ts(ATVI, frequency = 5)
plot(ATVI_ts)
Looking at univariate distributions of the stocks.
par(mfrow = c(2,2))
uni_open <- ggplot(ATVI, aes(ATVI.Open)) + geom_histogram(bins = 50, aes(y = ..density..), col = "red", fill = "red", alpha = 0.3) + geom_density()
uni_high <- ggplot(ATVI, aes(ATVI.High)) + geom_histogram(bins = 50, aes(y = ..density..), col = "red", fill = "red", alpha = 0.3) + geom_density()
uni_low <- ggplot(ATVI, aes(ATVI.Low)) + geom_histogram(bins = 50, aes(y = ..density..), col = "red", fill = "red", alpha = 0.3) + geom_density()
uni_close <- ggplot(ATVI, aes(ATVI.Close)) + geom_histogram(bins = 50, aes(y = ..density..), col = "red", fill = "red", alpha = 0.3) + geom_density()
grid.arrange(uni_open, uni_high, uni_low, uni_close, nrow = 2, ncol = 2)
average_price = sum(ATVI_df$ATVI.Open + ATVI_df$ATVI.Close)/(2*nrow(ATVI))
The density plots are left skewed and have a bimodal distribution.
Visualizing the closing stock prices.
dygraph(ATVI_df[,c("ATVI.Open","ATVI.Close")], xlab = "Time", ylab = "Price in USD", main = "Activision Blizzard, Inc. Opening and Closing Stock Price") %>% dyOptions(colors = c(rgb(57,106,177, maxColorValue = 255), rgb(218,124,48, maxColorValue = 255)))
average_price = sum(ATVI_df$ATVI.Open + ATVI_df$ATVI.Close)/(2*nrow(ATVI))
The peak closing price of 83.39 USD and the lowest recorded closing price of 49.14 USD. It seems historically the majority of prices for the ATVI stock has been around 67.08 USD.
Visualizing the daily stock returns if one were to buy at the opening price and sell at the closing price.
ATVI_return <- dailyReturn(ATVI)
ATVI_return <- as.data.frame(ATVI_return)
ATVI_return <- cbind(Date=as.Date(rownames(ATVI_return)),ATVI_return)
ggplot(ATVI_return, aes(x = Date, y = daily.returns)) + geom_path() + geom_hline(yintercept = 0) + ggtitle("Daily Returns of ATVI") + ylab("Daily Returns in USD")
mean_return <- mean(ATVI_return[,2])
The average daily return is 0.0002158914.
Volatility of the stock.
autoplot(volatility(ATVI, calc = "close", n = 14), main = "14 Day Volatility of ATVI Closing Prices")
## Warning: Removed 13 rows containing missing values (geom_path).
Volatility is the measure of the stock’s uncertainty or risk in relation to the size of changes in the stock. The volatility of the ATVI closing prices has recently risen rapidlybut the stock has had a volatility below 0.5 for the majority of the stock in the time frame we have chosen.
Plotting the stock using a candlestick chart to help visualize the differences between opening and closing prices. The chart represents days where the opening price was greater than the closing price with a black candlestick and days where the opening price is lower than the closing price with red candlesticks.
candleChart(ATVI, up.col = "black", dn.col = "red", theme = "white")
addSMA(n = c(20, 50, 100))
Looking at the 20-day moving average (dark blue) the stock needs to be above or below for the slope of the line to change directions. This can possibly be used to identify a change in trend and can be seen as trading signals. The 50-day moving average (light blue) indicates that the stock switches between a bearish and bullish trend but overall was rising. The 100-day moving average (pink) indicated that the stock in the long run was increasing but near November the stock starts to trend downwards.
Decomposing the ATVI time series using both a additive and multiplicative model.
ATVI_add <- decompose(ATVI_ts[,"ATVI.Close"], type = "additive")
ATVI_mul <- decompose(ATVI_ts[,"ATVI.Close"], type = "multiplicative")
plot(ATVI_add, col = "red")
plot(ATVI_mul, col = "blue")
The additive and multiplicative models are very similary with slight differences in the seasonal and random elements. Looking at the trend there appears to be an overall upwards trend until recently where it has started to drop much lower.
Checking for stationarity is important as it is an assumption for many underlying tools for time series analysis.
adf.test(ATVI$ATVI.Close, k = 11)
##
## Augmented Dickey-Fuller Test
##
## data: ATVI$ATVI.Close
## Dickey-Fuller = -1.239, Lag order = 11, p-value = 0.8987
## alternative hypothesis: stationary
kpss.test(ATVI$ATVI.Close, lshort = FALSE, null = "Trend")
##
## KPSS Test for Trend Stationarity
##
## data: ATVI$ATVI.Close
## KPSS Trend = 0.18356, Truncation lag parameter = 16, p-value =
## 0.02217
A stationary process has mean and variance that do not change with time and the process does not have trend. The Augmented Dickey-Fuller Test tests the null hypothesis that a unit root is present in the time series and the alternative hypothesis is that the time series is stationary or trend-stationary.
The Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test is used for testing the null hypothesis that the time series is stationary and has a deterministic trend; the alternative hypothesis is that there is a unit root.
Both tests point to the time series not being stationary; we fail to reject the null hypothesis for the ADF test and we reject the null hypothesis of the KPSS test, both at 5% level at significance.
diff_ATVI_close <- diff(ATVI_ts[,"ATVI.Close"], differences = 1)
diff_adf <- adf.test(diff_ATVI_close, k = 11)
## Warning in adf.test(diff_ATVI_close, k = 11): p-value smaller than printed
## p-value
diff_adf
##
## Augmented Dickey-Fuller Test
##
## data: diff_ATVI_close
## Dickey-Fuller = -5.2599, Lag order = 11, p-value = 0.01
## alternative hypothesis: stationary
kpss.test(diff_ATVI_close, lshort = FALSE, null = "Trend")
## Warning in kpss.test(diff_ATVI_close, lshort = FALSE, null = "Trend"): p-
## value greater than printed p-value
##
## KPSS Test for Trend Stationarity
##
## data: diff_ATVI_close
## KPSS Trend = 0.081084, Truncation lag parameter = 16, p-value =
## 0.1
We have chosen the significance level \(alpha\) = 5% or 0.05. Here we reject the the null hypothesis for the ADF test since the p-value < 0.01 which is less than 0.05. In addition we can look at the Dicker-Fuller statistic of -5.259904 and we compare this to the critical values for Dickey–Fuller t-distribution table 1, we observe that the ADF test statistic is less than -3.43 and the null hypothesis of a unit root is rejected. We accept the null hypothesis of the KPSS test since the p-value is larger than 0.1. Therefore we can say the timeseries is stationary when differenced once.
autoplot(diff_ATVI_close, ts.colour = "blue", main = "Closing Prices Differenced Once")
Simple exponential smoothing utilizes weights to forecast future values by assigning weights to observations. The more recent the observation the higher the weight.
Creating a training and data subset of the timeseries to see how well the SES model is at forecasting.
start <- floor(0.8*(nrow(ATVI)))
test_ahead <- (nrow(ATVI) - start)
ATVI_window <- window(ts(ATVI_ts[,"ATVI.Close"]), end = floor(0.8*(nrow(ATVI))))
ATVI_SES <- HoltWinters(ATVI_window, beta = FALSE, gamma = FALSE)
ATVI_SES
## Holt-Winters exponential smoothing without trend and without seasonal component.
##
## Call:
## HoltWinters(x = ATVI_window, beta = FALSE, gamma = FALSE)
##
## Smoothing parameters:
## alpha: 0.8888619
## beta : FALSE
## gamma: FALSE
##
## Coefficients:
## [,1]
## a 70.6859
ATVI_forecast <- forecast(ATVI_SES, h = test_ahead)
#ATVI_forecast
plot(ATVI_forecast, main = "Forecasts from Simple Exponential Smoothing")
lines(ts(ATVI_ts[,"ATVI.Close"]))
acc_table <- rbind(accuracy(ATVI_forecast$fitted, ATVI_ts[,"ATVI.Close"][191:255]), acc_table)
Forecasting values in the future using Simple Exponential Smoothing.
ATVI_SES1 <- HoltWinters(ATVI_ts[,"ATVI.Close"], beta = FALSE, gamma = FALSE)
ATVI_SES1
## Holt-Winters exponential smoothing without trend and without seasonal component.
##
## Call:
## HoltWinters(x = ATVI_ts[, "ATVI.Close"], beta = FALSE, gamma = FALSE)
##
## Smoothing parameters:
## alpha: 0.9687366
## beta : FALSE
## gamma: FALSE
##
## Coefficients:
## [,1]
## a 52.5116
plot(ATVI_SES1)
ATVI_forecast1 <- forecast(ATVI_SES1, h = forecast_len)
#ATVI_forecast
plot(ATVI_forecast1, main = "Forecasts from Simple Exponential Smoothing")
Looking at the residuals on the forecast.
ATVI_fc1_acf <- Acf(ATVI_forecast$residuals, lag.max=20)
ATVI_boxtest1 <- Box.test(ATVI_forecast$residuals, lag=20, type="Ljung-Box")
ATVI_boxtest1
##
## Box-Ljung test
##
## data: ATVI_forecast$residuals
## X-squared = 17.722, df = 20, p-value = 0.6057
plot.ts(ATVI_forecast$residuals, main = "SES Forecast Residuals")
From the autocorrelcation function plot only 1 out of the 20 lags are out of significance bounds. The p-value from the BoxLjung test is 0.6057, this indicates that there is little evidence of non-zero autocorrelations in the in-sample forecast errors at lags 1-20.
Holt-Winters Exponential smoothing is an exponentially weighted moving average filter of the level, trend, and seasonal components of a time series. The parameters used for smoothing are chosen by minimizing the sum of the squared one-step-ahead-prediction errors 2.
Creating a training and data subset of the timeseries to see how well the Holt-Winters Exponential Smoothing model is at forecasting.
suppressWarnings(ATVI_hw <- HoltWinters(ATVI_window, gamma = FALSE))
ATVI_hw
## Holt-Winters exponential smoothing with trend and without seasonal component.
##
## Call:
## HoltWinters(x = ATVI_window, gamma = FALSE)
##
## Smoothing parameters:
## alpha: 0.8910129
## beta : 0.01199306
## gamma: FALSE
##
## Coefficients:
## [,1]
## a 7.068490e+01
## b 7.289125e-04
ATVI_forecast2_test <- forecast(ATVI_hw, h = test_ahead)
plot(ATVI_forecast2_test)
lines(ts(ATVI_ts[,"ATVI.Close"]))
acc_table <- rbind(accuracy(ATVI_forecast2_test$fitted, ATVI_ts[,"ATVI.Close"][191:255]), acc_table)
ATVI_hw2 <- HoltWinters(ATVI_ts[,"ATVI.Close"], gamma = FALSE)
ATVI_hw2
## Holt-Winters exponential smoothing with trend and without seasonal component.
##
## Call:
## HoltWinters(x = ATVI_ts[, "ATVI.Close"], gamma = FALSE)
##
## Smoothing parameters:
## alpha: 0.959677
## beta : 0.01410099
## gamma: FALSE
##
## Coefficients:
## [,1]
## a 52.4864679
## b -0.2369341
plot(ATVI_hw2)
ATVI_forecast2 <- forecast(ATVI_hw2, h = forecast_len)
#ATVI_forecast2
plot(ATVI_forecast2)
The estimated values of alpha and beta are 0.9597 and 0.0141. The value of alpha is high indicating that the estimate of the level at the time T is based upon more recent observations rather than observations further in the past. The value of beta is rather small and points towards the older values being weighted more heavily than the recent ones when determining the slope. There is no gamma component due to the lack of seasonality within the data.
ATVI_fc2_acf <- Acf(ATVI_forecast2$residuals, lag.max=20)
ATVI_boxtest2 <- Box.test(ATVI_forecast2$residuals, lag=20, type="Ljung-Box")
ATVI_boxtest2
##
## Box-Ljung test
##
## data: ATVI_forecast2$residuals
## X-squared = 25.688, df = 20, p-value = 0.1764
plot.ts(ATVI_forecast2$residuals, main = "HoltWinters Forecast Residuals")
The autocorrelation plot has two and possibly three lags that are out of the significance bounds. The p-value from the Box-Ljung test is 0.1764, this indicates that there is little evidence of non-zero autocorrelations in the in-sample forecast errors at lags 1-20.
We must now see if the model is an AR or MA process. This can be done using by looking at the plots of acf() and pacf() .
close_acf <- Acf(diff_ATVI_close, lag = 20)
close_acf
##
## Autocorrelations of series 'diff_ATVI_close', by lag
##
## 0 1 2 3 4 5 6 7 8 9
## 1.000 -0.028 -0.055 0.125 0.009 -0.071 0.077 0.001 -0.111 0.085
## 10 11 12 13 14 15 16 17 18 19
## 0.044 -0.027 0.061 0.013 -0.036 0.025 0.046 -0.014 -0.019 -0.009
## 20
## -0.069
close_pacf <- Pacf(diff_ATVI_close, lag = 20)
close_pacf
##
## Partial autocorrelations of series 'diff_ATVI_close', by lag
##
## 1 2 3 4 5 6 7 8 9 10
## -0.028 -0.056 0.123 0.013 -0.058 0.061 -0.005 -0.092 0.069 0.034
## 11 12 13 14 15 16 17 18 19 20
## 0.012 0.047 -0.008 -0.011 0.008 0.029 0.015 -0.024 -0.028 -0.062
Looking at the ACF and PACF gives us p = 6 and q = 3.
Using the auto.arima() function to automatically predict the values of p and q for the arima(p,d,q) model. Here the variable “d” represents the amount of times the data set would need to be differenced to obtain stationarity.
auto.arima(ATVI_ts[,4], d=1, seasonal = FALSE)
## Series: ATVI_ts[, 4]
## ARIMA(2,1,2)
##
## Coefficients:
## ar1 ar2 ma1 ma2
## -0.8458 -0.8801 0.8241 0.7924
## s.e. 0.0925 0.0806 0.1184 0.1022
##
## sigma^2 estimated as 1.93: log likelihood=-697.2
## AIC=1404.39 AICc=1404.54 BIC=1424.35
Testing the accuracy of the arima(2,1,2)
fit_no_holdout = arima(ATVI_window, order=c(2,1,2))
fcast_no_holdout <- forecast(fit_no_holdout, h = test_ahead)
plot(fcast_no_holdout, main="ARIMA(2,1,2) ", ylim=c(45,100))
lines(ts(ATVI_ts[,"ATVI.Close"]))
acc_table <- rbind(accuracy(fcast_no_holdout$fitted, ATVI_ts[,"ATVI.Close"][191:255]), acc_table)
Fitting an arima(2,1,2) to forecast.
fit <- arima(ATVI_ts[,4], order = c(2,1,2))
fit
##
## Call:
## arima(x = ATVI_ts[, 4], order = c(2, 1, 2))
##
## Coefficients:
## ar1 ar2 ma1 ma2
## -0.8458 -0.8801 0.8241 0.7924
## s.e. 0.0925 0.0806 0.1184 0.1022
##
## sigma^2 estimated as 1.911: log likelihood = -697.2, aic = 1404.39
tsdisplay(residuals(fit), lag.max=20, main = "Arima(2,1,2) Model Residuals")
ATVI_fcast <- forecast(fit, h= forecast_len)
plot(ATVI_fcast, main = "Forecasts from ARIMA(2,1,2)")
ggplot(data.frame(residuals = residuals(fit)), aes(x =residuals)) + geom_histogram(bins = 50, aes(y = ..density..), col = "red", fill = "red", alpha = 0.3) + geom_density()
## Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
qqnorm(residuals(fit))
qqline(residuals(fit))
The residuals of the arima(2,1,2) model appear to be slightly left skewed but normally distributed.The Q-Q plot supports that most of the residuals are normally distributed however the points at the ends of the range.
Testing the ARIMA model using p and q from our visual analysis of the Acf and Pacf.
suppressWarnings(fit_test2 <- arima(ATVI_ts[,4], order = c(3,1,2)))
fit_test2
##
## Call:
## arima(x = ATVI_ts[, 4], order = c(3, 1, 2))
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2
## -0.8021 -0.8541 0.0200 0.7894 0.7787
## s.e. 0.1892 0.1285 0.0711 0.1827 0.1205
##
## sigma^2 estimated as 1.911: log likelihood = -697.16, aic = 1406.31
tsdisplay(residuals(fit_test2), lag.max=20, main = "Arima(3,1,2) Model Residuals")
ATVI_fcast3 <- forecast(fit_test2, h= forecast_len)
plot(ATVI_fcast3, main = "Forecasts from ARIMA(3,1,2)")
suppressWarnings(fit_test3 <- arima(ATVI_ts[,4], order = c(1,1,2)))
fit_test3
##
## Call:
## arima(x = ATVI_ts[, 4], order = c(1, 1, 2))
##
## Coefficients:
## ar1 ma1 ma2
## -0.4810 0.4620 -0.0770
## s.e. 0.3212 0.3174 0.0458
##
## sigma^2 estimated as 1.964: log likelihood = -702.57, aic = 1413.14
tsdisplay(residuals(fit_test2), lag.max=20, main = "Arima(1,1,2) Model Residuals")
ATVI_fcast4 <- forecast(fit_test2, h= forecast_len)
plot(ATVI_fcast4, main = "Forecasts from ARIMA(1,1,2)")
| Arima_Model | AIC |
|---|---|
| (2,1,2) | 1404.392 |
| (2,1,3) | 1406.311 |
| (1,1,2) | 1413.135 |
The Akaike information criterion (AIC) is an estimator of the relative quality of the model for the given data. From the table, the arima(2,1,2) model performed the best in reduction of information loss; however there is no substantial difference between the models as the AIC’s are not relatively different from each other.
Our dataset contains 384 entries of Activision Blizzard Inc. (ATVI) daily stock data. We have set the frequency = 5 to match a Monday to Friday week. The original data was not stationary and thus required us to difference once in order to obtain stationarity. During the analysis of the plots for ATVI certain points of interest were observered in the Open/Close and Volume plots.
In both the following plots the colored points represent points of interest:
The following is a plot showing the volume traded within the last three months:
The volume traded on 2018-11-09 (November 9th, 2018) was more than six times the volume traded one week prior.
This plot shows the closing prices within the same time frame as the volume plot above:
On October 2nd 2018 (orange point in the above plot), the highest ATVI stock was at an all time high closing price of 83.39 USD; however in the following days the stock would drop to 73.58 USD on October 10th before starting to rise again. On October 12th, Activision launched Call of Duty: Black Ops 4; the highly anticpated sequel to Call of Duty: Black Ops 3. Less than a week later on October 18th, Activsion announced that within the first three days of release that they have sold more than $500 million worldwide and that the release set records for digital sales across all platforms3. However, investors were disappointed as this only matched the sales of Black Ops 3 which implies flat sales growth over the year. Investors expected the release of Black Ops 4 to capitalize on the popularity of “battle royale” type games and result in bigger sales.
Exactly one week prior to November 9th (red point), Blizzard at their annual gaming convention, BlizzCon, was expected to announce a new release; many were hopeful for the next installment in their Diablo series, Diablo 4, but were instead disappointed with a release of a mobile game titled Diablo: Immortal. On the day of their announcement, Novemember 2nd, the stock price opened at 69.95 USD and closed at 68.99 USD, a very small decrease. However on the following Monday, November 5th, the stock price ended at 64.34 USD, this is a 5.38% decrease from the opening price of 68 USD. Since the BlizzCon on November 2nd, as of November 29th the stock has dropped 24.9%.
| ME | RMSE | MAE | MPE | MAPE | |
|---|---|---|---|---|---|
| Simple Exponential Smoothing | 11.31201 | 12.20809 | 11.31201 | 16.03481 | 16.03481 |
| Holt-Winters Exponential Smoothing | 10.69137 | 11.51633 | 10.69137 | 15.18956 | 15.18956 |
| ARIMA(0,2,2) Model | 11.15876 | 12.01255 | 11.15876 | 15.84009 | 15.84009 |
The following measures were used to test the accuracy of the models: mean error (ME), root-mean-square error (RMSE), mean absolute error (MAE), mean percentage error (MPE), and mean absolute percentage error (MAPE). The lower the measure, the less error the model had in forecasting the correct values in the test set and thus should be a better model at forecasting future values. From our table, the Holt-Winters exponential smoothing model performed the the best in forecasting the test set in every measure.
Looking at the Holt-Winters forecast:
## Time Series:
## Start = c(81, 2)
## End = c(89, 1)
## Frequency = 5
## [1] 52.24953 52.01260 51.77567 51.53873 51.30180 51.06486 50.82793
## [8] 50.59099 50.35406 50.11713 49.88019 49.64326 49.40632 49.16939
## [15] 48.93246 48.69552 48.45859 48.22165 47.98472 47.74778 47.51085
## [22] 47.27392 47.03698 46.80005 46.56311 46.32618 46.08925 45.85231
## [29] 45.61538 45.37844 45.14151 44.90458 44.66764 44.43071 44.19377
## [36] 43.95684 43.71990 43.48297 43.24604 43.00910
The above represents the predicted mean price for the next 40 days which corresponds to the blue line in the forecast plot. The model forecasts that the price will continue to drop (blue line) while the lighter grey area (wider area) represents the 80% prediction inteval and the darker grey area represents the 95% prediction interval. This downwards trend is not unexpected as the trend prior to the forecast was downwards and was a significant drop in stock price compared to years prior.
Fuller, W. A. (1976). Introduction to Statistical Time Series. New York: John Wiley and Sons. ISBN 0-471-28715-6.↩
https://docs.tibco.com/pub/enterprise-runtime-for-R/4.0.0/doc/html/Language_Reference/stats/HoltWinters.html↩
https://investor.activision.com/news-releases/news-release-details/call-duty-black-ops-4-delivers-over-half-billion-dollar-opening↩