Discussion Topic
For the data set that you used in the last several discussions, build an auto.arima model. How well did this model perform in comparison to your ETS model? Retain 6 months of data for a test set and try to forecast the ETS and the auto.arima. Which performs better on the hold-out set?
Obtain Data
library(quantmod)
library(forecast)
library(lubridate)
getSymbols('YOU.L', src = 'yahoo',
from = Sys.Date() - years(2), to = Sys.Date())
[1] "YOU.L"
yougov <- ts(YOU.L, frequency = 252)
# Adjusted closing prices, used as `y` in the rest of the analysis
y <- yougov[, 'YOU.L.Adjusted']
Exploring Stationarity
The first step in exploratory analysis with this data set is to determine whether or not stationarity needs to be considered. A series is stationary if its statistical properties do not depend on the time of observation, so a series with trend or seasonality is not stationary. A quick glance at adjusted closing prices over time makes the trend in the data fairly obvious, but formal tests like the Augmented Dickey-Fuller (ADF) and Kwiatkowski-Phillips-Schmidt-Shin (KPSS) tests can be used to check this observation empirically (Hyndman).

The ADF test considers the null hypothesis that the data is non-stationary. In the case of YouGov PLC, the p-value of 0.1302 exceeds the typical threshold of 0.05, so the test fails to reject the null hypothesis of non-stationarity.
library(tseries)
adf.test(y, alternative = "stationary")
Augmented Dickey-Fuller Test
data: y
Dickey-Fuller = -3.0586, Lag order = 7, p-value = 0.1302
alternative hypothesis: stationary
The KPSS test reverses the null hypothesis, so small p-values indicate non-stationarity. The result of this test echoes the result of the ADF test.
kpss.test(y)
KPSS Test for Level Stationarity
data: y
KPSS Level = 8.2515, Truncation lag parameter = 5, p-value = 0.01
Both the ADF and KPSS tests point to the need for differencing to account for non-stationarity in the YouGov data. Differencing doesn’t produce perfect white noise, but it does lead to clear improvements. Second-order differencing doesn’t help, though: that ACF plot looks less stationary than before!
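library(gridExtra)
library(ggplot2)
library(yaztheme)  # provides theme_yaz()
diff.y <- diff(y)
# ACF of the raw series, the first difference, and the second difference
grid.arrange(
  ggAcf(y) + theme_yaz() + labs(title = 'ACF of YouGov Adj. Close'),
  ggAcf(diff.y) + theme_yaz() + labs(title = 'Differenced'),
  ggAcf(diff(diff.y)) + theme_yaz() + labs(title = 'Second Order Differenced'),
  nrow = 1
)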
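To see why a second difference can make the ACF look worse rather than better, here is a minimal sketch on simulated data (not the YouGov series). It assumes the prices behave roughly like a random walk with drift; `y_sim`, the drift of 0.4 and the noise level are made-up illustration values, and base R's `acf()` stands in for `ggAcf()`.

```r
# Simulated random walk with drift (hypothetical stand-in for the price series)
set.seed(42)
y_sim <- cumsum(0.4 + rnorm(500, sd = 4))

# Lag-1 autocorrelation of a series, without drawing the plot
lag1 <- function(x) acf(x, lag.max = 1, plot = FALSE)$acf[2]

acf_level <- lag1(y_sim)              # close to 1: strong persistence
acf_diff1 <- lag1(diff(y_sim))        # near 0: differences look like white noise
acf_diff2 <- lag1(diff(diff(y_sim)))  # clearly negative: over-differenced

round(c(level = acf_level, diff1 = acf_diff1, diff2 = acf_diff2), 3)
```

Differencing a series that is already white noise produces a non-invertible MA(1) process whose theoretical lag-1 autocorrelation is -0.5, which is one way to read the spike in the second-order plot.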

Building Models
ARIMA models have three components: the order of autoregression (p), the degree of differencing (d), and the order of the moving average (q). In this case, the auto.arima() function picks d with unit root tests and then searches over p and q to minimize AICc. The selected model is ARIMA(0,1,0) with drift: a random walk with a constant. Fitting a model to the differenced values instead resulted in a white noise model, which is equivalent to a naive forecast. AICc for that model was virtually identical and residuals performed similarly.
aa.fit <- auto.arima(y)
summary(aa.fit)
Series: y
ARIMA(0,1,0) with drift
Coefficients:
drift
0.3795
s.e. 0.1755
sigma^2 estimated as 15.59: log likelihood=-1409.51
AIC=2823.02 AICc=2823.04 BIC=2831.46
Training set error measures:
ME RMSE MAE MPE MAPE MASE ACF1
Training set 0.0002607789 3.93999 2.261203 -0.02967813 1.028288 0.3810233 -0.0591626
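For intuition about what an ARIMA(0,1,0) model with drift amounts to, the sketch below reproduces its pieces by hand in base R. It runs on simulated data, and `y_sim` and its parameters are illustrative assumptions; it is not how `auto.arima()` is implemented, only a hand calculation of the same kind of quantities.

```r
# Simulated series; the drift (0.4) and noise sd (4) are illustrative assumptions
set.seed(1)
y_sim <- 130 + cumsum(0.4 + rnorm(500, sd = 4))
n <- length(y_sim)

d        <- diff(y_sim)               # first differences (d = 1)
drift    <- mean(d)                   # estimate of the constant
drift_se <- sd(d) / sqrt(length(d))   # its standard error
sigma2   <- var(d)                    # variance of the differences around the drift

# Point forecasts continue from the last value along the drift line
h        <- 1:5
point_fc <- y_sim[n] + h * drift

round(c(drift = drift, s.e. = drift_se, sigma2 = sigma2), 4)
point_fc
```

Because the differences telescope, the drift estimate is simply the total change divided by the number of steps, `(y_sim[n] - y_sim[1]) / (n - 1)`, so the forecast is a straight line through the first and last observations, extended forward.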
Residuals get erratic starting about a year ago because YouGov has a high-profile polling operation which was involved in several major elections in the UK and US around the time of the spikes. While some autocorrelations of the residuals fall outside the significance bounds, the Ljung-Box portmanteau test returns a p-value of 0.1224, which is above 0.05, so there is no significant evidence that the residuals differ from white noise.
# checkresiduals() draws its own plots and prints the test, so no theme is added
checkresiduals(aa.fit)
Ljung-Box test
data: Residuals from ARIMA(0,1,0) with drift
Q* = 13.999, df = 9, p-value = 0.1224
Model df: 1. Total lags used: 10
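The p-value in that output can be traced back to a chi-squared tail probability. The sketch below first recomputes it from the reported statistic, then runs the same kind of test with base R's `Box.test()` on simulated white noise. `res_sim` is made-up data, and `fitdf = 1` is chosen to mirror the "Model df: 1" line above.

```r
# 1. Recompute the p-value from the reported statistic (Q* = 13.999, df = 9)
p_reported <- pchisq(13.999, df = 9, lower.tail = FALSE)
round(p_reported, 4)   # should agree with the printed p-value up to rounding

# 2. The same test on simulated white-noise "residuals" (illustrative data only)
set.seed(7)
res_sim <- rnorm(500)
lb <- Box.test(res_sim, lag = 10, type = "Ljung-Box", fitdf = 1)
lb$parameter   # degrees of freedom: lag - fitdf
lb$p.value     # large values mean no evidence against white noise
```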

Plotting forecasted values indicates an 80% chance that an investment in YouGov will, at a bare minimum, retain its original value after 9 months and likely grow more than that. Some damping may be in order, though, because it’s unrealistic to expect the stock to just keep growing indefinitely.
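As a rough cross-check of that nine-month reading, the sketch below works out where the lower edge of an 80% interval climbs back to the starting price. It uses only the drift and sigma^2 printed in the model summary and assumes Gaussian errors with no uncertainty in the drift estimate. The plotted intervals may be computed differently, so treat this as an approximation.

```r
drift  <- 0.3795   # drift estimate from the summary (price units per trading day)
sigma2 <- 15.59    # innovation variance from the summary

# Lower edge of a two-sided 80% interval, relative to the last observed price.
# For a random walk the h-step forecast variance grows like sigma2 * h.
# Note: the lower edge of an 80% interval is the 10% quantile.
lower80 <- function(h) h * drift - qnorm(0.9) * sqrt(sigma2 * h)

# Smallest horizon at which the lower edge is back at or above the starting price
h_star <- ceiling((qnorm(0.9) * sqrt(sigma2) / drift)^2)
h_star
lower80(c(h_star - 1, h_star))
```

Under these assumptions the break-even horizon comes out at 178 trading days, roughly eight and a half months at about 21 trading days per month, which is in the same range as the reading from the plot.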

Comparing to ETS Modeling
Last week, we used exponential smoothing methods to predict stock prices.
ets.fit <- ets(y)
summary(ets.fit)
ETS(M,A,N)
Call:
ets(y = y)
Smoothing parameters:
alpha = 0.961
beta = 1e-04
Initial states:
l = 131.38
b = 0.3612
sigma: 0.0172
AIC AICc BIC
4430.341 4430.461 4451.474
Training set error measures:
ME RMSE MAE MPE MAPE MASE ACF1
Training set 0.02143105 3.934518 2.252726 -0.01979528 1.025184 0.3795948 -0.01836211
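Comparing the models against one another on the training set shows a really close race! MPE and ACF1 are slightly more negative for the ARIMA model (so ETS sits marginally closer to zero on both), while the ARIMA model's mean error is closer to zero. RMSE, MAE, MAPE, and MASE are virtually identical.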
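The numbers above are training-set errors, while the discussion topic asks for a six-month hold-out. The sketch below shows the mechanics of such a split in base R on simulated data. Everything in it is a stand-in: `y_sim` is made up, the random walk with drift is computed by hand, and `HoltWinters()` with fixed parameters replaces `ets()` (alpha and beta are copied from the ETS summary, and the initial slope `b.start` is an assumption).

```r
# Simulated daily series (illustrative assumption, not the YouGov data)
set.seed(123)
y_sim <- ts(130 + cumsum(0.4 + rnorm(504, sd = 4)))

h     <- 126                                  # about 6 months of trading days
n     <- length(y_sim)
train <- window(y_sim, end = n - h)
test  <- window(y_sim, start = n - h + 1)

# Random walk with drift, i.e. ARIMA(0,1,0) with a constant
drift  <- mean(diff(train))
fc_rwd <- as.numeric(train[length(train)]) + (1:h) * drift

# Holt's linear trend with fixed smoothing parameters (stand-in for ETS)
hw    <- HoltWinters(train, alpha = 0.961, beta = 1e-4, gamma = FALSE,
                     b.start = drift)
fc_hw <- as.numeric(predict(hw, n.ahead = h))

# Hold-out RMSE for each method
rmse <- function(actual, fc) sqrt(mean((as.numeric(actual) - fc)^2))
c(rwd = rmse(test, fc_rwd), holt = rmse(test, fc_hw))
```

With the forecast package, the same comparison would typically fit `auto.arima(train)` and `ets(train)`, forecast each with `forecast(fit, h = h)`, and score them with `accuracy(fc, test)`.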

Hyndman points out that this similarity is expected because the linear exponential smoothing models are special cases of ARIMA models. The process isn’t so different, so neither is the output!
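The simplest instance of that relationship can be checked numerically. The sketch below simulates simple exponential smoothing with additive errors and confirms that its first differences have the MA(1) form of an ARIMA(0,1,1) model. The data are simulated and the additive-error form is an assumption; the fitted ETS(M,A,N) above uses multiplicative errors and a trend, so the correspondence there is only approximate.

```r
set.seed(1)
n     <- 200
alpha <- 0.961                 # smoothing parameter reported in the ETS summary
e     <- rnorm(n)              # illustrative additive errors

y_sim <- numeric(n)
level <- 100                   # arbitrary starting level
for (t in seq_len(n)) {
  y_sim[t] <- level + e[t]           # observation = previous level + error
  level    <- level + alpha * e[t]   # level update
}

# First differences versus the MA(1) form e[t] - (1 - alpha) * e[t-1]
ma1_form <- e[-1] - (1 - alpha) * e[-n]
max(abs(diff(y_sim) - ma1_form))     # zero up to floating-point error
```

When alpha equals 1 the moving-average term vanishes and the model collapses to a pure random walk, ARIMA(0,1,0). An alpha of 0.961 with a nearly constant slope is close to that, which fits with the two sets of training errors being so similar.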