Discussion Topic

For the data set that you used in the last several discussions, build an auto.arima model. How well did this model perform in comparison to your ETS model? Retain 6 months of data for a test set and try to forecast the ETS and the auto.arima. Which performs better on the hold-out set?

Obtain Data

library(quantmod)
library(forecast)
library(lubridate)
getSymbols('YOU.L', src = 'yahoo', 
           from = Sys.Date() - years(2), to = Sys.Date())
[1] "YOU.L"
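yougov <- ts(YOU.L, frequency = 252)
# Adjusted closing price is the series analysed in every step below
y <- yougov[, 'YOU.L.Adjusted']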

Exploring Stationarity

The first step in exploratory analysis with this data set is to determine whether stationarity needs to be considered. A series is stationary if its statistical properties do not depend on the time at which it is observed, so a series with trend or seasonality is not stationary. A quick glance at adjusted closing prices over time makes the trend in the data fairly obvious, but formal tests like the Augmented Dickey-Fuller (ADF) and Kwiatkowski-Phillips-Schmidt-Shin (KPSS) tests can be used to check this observation empirically (Hyndman).

The ADF test takes as its null hypothesis that the data is non-stationary. In the case of YouGov PLC, the p-value of 0.1302 exceeds the typical threshold of 0.05, so the test fails to reject non-stationarity.

library(tseries)
adf.test(y, alternative = "stationary")

    Augmented Dickey-Fuller Test

data:  y
Dickey-Fuller = -3.0586, Lag order = 7, p-value = 0.1302
alternative hypothesis: stationary

The KPSS test reverses the null hypothesis, so small p-values indicate non-stationarity. The p-value of 0.01 echoes the result of the ADF test.

kpss.test(y)

    KPSS Test for Level Stationarity

data:  y
KPSS Level = 8.2515, Truncation lag parameter = 5, p-value = 0.01

Both the ADF and KPSS tests point to the need for differencing to account for non-stationarity in the YouGov data. Differencing doesn’t produce perfect white noise, but it does lead to a clear improvement. Second-order differencing doesn’t help, though: that ACF plot looks less like white noise than the first-differenced one!

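library(ggplot2)
library(yaztheme)   # supplies theme_yaz()
library(gridExtra)
# ACF of the raw series, its first difference, and its second difference
diff.y <- diff(y)
grid.arrange(
  ggAcf(y) + theme_yaz() + labs(title = 'ACF of YouGov Adj. Close'),
  ggAcf(diff.y) + theme_yaz() + labs(title = 'Differenced'),
  ggAcf(diff(diff.y)) + theme_yaz() + labs(title = 'Second Order Differenced'),
  nrow = 1
)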
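
The visual check can be paired with a numeric one. The following is a minimal sketch on a simulated random walk standing in for the YouGov series (`y_sim` and the seed are assumptions, not the original data). `ndiffs()` from the forecast package reports how many differences a unit root test suggests, and the ACF of the differenced series is compared with the approximate 95% white noise bounds of ±1.96/sqrt(n).

```r
library(forecast)

set.seed(1)
# Simulated stand-in for the price series: a random walk with drift
y_sim <- ts(130 + cumsum(rnorm(500, mean = 0.38, sd = 4)))

# Number of first differences suggested by the KPSS test
d_hat <- ndiffs(y_sim, test = "kpss")

# Share of ACF spikes of the differenced series outside the 95% bounds
dy <- diff(y_sim)
acf_vals <- acf(dy, lag.max = 20, plot = FALSE)$acf[-1]  # drop lag 0
bound <- 1.96 / sqrt(length(dy))
share_outside <- mean(abs(acf_vals) > bound)

print(d_hat)
print(share_outside)
```

With the real data, `y` would replace `y_sim`. A share of spikes outside the bounds near 5% is what white noise would produce by chance.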

Building Models

ARIMA models have three components: the order of autoregression (p), the degree of differencing (d), and the order of the moving average (q). The auto.arima() function uses unit root tests to choose the degree of differencing and then searches over the remaining orders to minimize AICc. The selected model is ARIMA(0,1,0) with drift, i.e. a random walk with a constant average step. Modeling the differenced values instead resulted in a white noise model, which amounts to a naive forecast; AICc for that model was virtually identical and its residuals behaved similarly.

aa.fit <- auto.arima(y)
summary(aa.fit)
Series: y 
ARIMA(0,1,0)           with drift         

Coefficients:
       drift
      0.3795
s.e.  0.1755

sigma^2 estimated as 15.59:  log likelihood=-1409.51
AIC=2823.02   AICc=2823.04   BIC=2831.46

Training set error measures:
                       ME    RMSE      MAE         MPE     MAPE      MASE       ACF1
Training set 0.0002607789 3.93999 2.261203 -0.02967813 1.028288 0.3810233 -0.0591626
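
For a random walk with drift, the drift is essentially the average period-to-period change, and the point forecast h steps ahead is the last observation plus h times the drift. A small base R sketch on made-up numbers (the vector `p` is hypothetical and only illustrates the arithmetic):

```r
# Hypothetical prices, for illustration only
p <- c(100, 103, 101, 106, 108)

# Drift = mean of the first differences,
# which telescopes to (last - first) / (n - 1)
drift <- mean(diff(p))
stopifnot(all.equal(drift, (p[length(p)] - p[1]) / (length(p) - 1)))

# Point forecasts 1..3 steps ahead: last value plus h * drift
h <- 1:3
fc <- p[length(p)] + h * drift

print(drift)  # 2
print(fc)     # 110 112 114
```

Read this way, the drift of 0.3795 in the summary above says the adjusted close rose by about 0.38 per trading day on average over the sample.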

Residuals become erratic about a year ago, likely because YouGov has a high-profile polling operation that was involved in several major elections in the UK and US around the time of the spikes. While some residual autocorrelations fall outside the significance bounds, the Ljung-Box portmanteau test returns a p-value of 0.1224, above the 0.05 threshold, so there is no strong evidence that the residuals differ from white noise.

checkresiduals(aa.fit)

    Ljung-Box test

data:  Residuals from ARIMA(0,1,0) with drift
Q* = 13.999, df = 9, p-value = 0.1224

Model df: 1.   Total lags used: 10
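
The same kind of Ljung-Box test can be run directly with base R's `Box.test()`. The sketch below uses simulated white noise in place of the model residuals (`res_sim` is an assumption for illustration). It shows where the 9 degrees of freedom come from: 10 lags minus 1 estimated parameter.

```r
set.seed(7)
# Simulated stand-in for model residuals (white noise by construction)
res_sim <- rnorm(500)

# 10 lags, with 1 degree of freedom removed for the estimated drift term,
# mirroring "Model df: 1. Total lags used: 10" in the output
lb <- Box.test(res_sim, lag = 10, type = "Ljung-Box", fitdf = 1)

print(lb$parameter)  # degrees of freedom: 10 - 1 = 9
print(lb$p.value)
```

With the fitted model, `residuals(aa.fit)` would take the place of `res_sim`. A p-value above 0.05 means the test finds no evidence against white noise.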

Plotting forecasted values suggests roughly an 80% chance that an investment in YouGov will, at a bare minimum, retain its original value after 9 months, and will likely grow beyond that. Some damping may be in order, though, because it’s unrealistic to expect the stock to keep growing indefinitely.
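
One rough way to sanity-check that reading is to ask when the lower edge of an 80% prediction interval climbs back to the last observed price. The sketch below assumes normally distributed errors and ignores parameter uncertainty, so it may not match the plotted intervals exactly. The conversion of about 21 trading days per month is also an assumption.

```r
# Values copied from the auto.arima summary above
drift  <- 0.3795
sigma2 <- 15.59

# Normal-approximation lower bound of the 80% interval, measured relative
# to the last observed price, for horizons of 1..252 trading days
h <- 1:252
lower80 <- h * drift - qnorm(0.9) * sqrt(sigma2 * h)

# First horizon at which the lower bound is back at or above today's price
h_break <- min(h[lower80 >= 0])

print(h_break)
print(h_break / 21)  # rough conversion to months
```

The drift term grows linearly in h while the interval width grows with sqrt(h), which is why the lower bound eventually turns upward.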

Comparing to ETS Modeling

Last week, we used exponential smoothing methods to predict stock prices.

ets.fit <- ets(y)
summary(ets.fit)
ETS(M,A,N) 

Call:
 ets(y = y) 

  Smoothing parameters:
    alpha = 0.961 
    beta  = 1e-04 

  Initial states:
    l = 131.38 
    b = 0.3612 

  sigma:  0.0172

     AIC     AICc      BIC 
4430.341 4430.461 4451.474 

Training set error measures:
                     ME     RMSE      MAE         MPE     MAPE      MASE        ACF1
Training set 0.02143105 3.934518 2.252726 -0.01979528 1.025184 0.3795948 -0.01836211

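Comparing the training-set measures of the two models shows a really close race! ME is closer to zero for the ARIMA model, while MPE and ACF1 are closer to zero for the ETS model; RMSE, MAE, MAPE, and MASE are virtually identical. The AIC values are not comparable, because the two model classes compute their likelihoods differently.

Hyndman points out that many ETS models are special cases of ARIMA models, which helps explain the similarity. Here the fitted ETS model has alpha close to 1 and beta close to 0, so its level tracks the latest observation and its slope stays near the initial 0.3612 per period, much like a random walk with a drift of 0.3795. The process isn’t so different, so neither is the output!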
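
The discussion topic also asks for a hold-out comparison, which training-set measures cannot answer. Below is a hedged sketch of how that comparison could be set up, run on a simulated series rather than the YouGov data. `y_sim`, the seed, and the choice of 126 trading days as roughly six months are all assumptions.

```r
library(forecast)

set.seed(42)
# Simulated stand-in for the adjusted close: random walk with drift
n <- 504
y_sim <- ts(130 + cumsum(rnorm(n, mean = 0.38, sd = 4)))

h <- 126  # roughly six months of trading days
train <- subset(y_sim, end = n - h)
test  <- subset(y_sim, start = n - h + 1)

# Fit both models on the training window only
aa_train  <- auto.arima(train)
ets_train <- ets(train)

# Forecast across the hold-out window
aa_fc  <- forecast(aa_train, h = h)
ets_fc <- forecast(ets_train, h = h)

# Hold-out accuracy, one row per model
measures <- c("RMSE", "MAE", "MAPE")
holdout <- rbind(
  ARIMA = accuracy(aa_fc, test)["Test set", measures],
  ETS   = accuracy(ets_fc, test)["Test set", measures]
)
print(holdout)
```

With the real series, `y` would replace `y_sim`, and the model with the lower test-set RMSE or MAE would be the better performer on the hold-out set.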
