I am forecasting the price of Brent oil [1] over the next 11 months, specifically on 1st May 2015, 1st July 2015 and 1st January 2016.
Oil is a vastly important commodity, with almost the entire population relying on it. It is vital in producing and consuming many goods, and fuels the globalised world that we live in. This is why oil has such important economic influences, and is represented in many different forms in the ‘basket of goods’ used to calculate both RPI and CPI. As has been demonstrated over the past months, any fluctuation in the price of oil has dramatic knock-on effects throughout the economy: following the most recent drop in price, inflation (CPI) fell to 0% in March of this year [2]. Oil price fluctuations directly affect the purchasing power of consumers, and widespread news coverage tracking oil price changes inadvertently affects their spending patterns. Firms are not exempt from this effect. Companies, especially those in oil exploration and extraction, make strategic investment decisions given the current price of oil and what they believe will happen to it in the future. Therefore, predicting the price of such a crucial commodity offers the potential to predict other economic variables, such as economic growth and inflation, as well as the potential to earn a lot of money.
There are academic writers [3] who argue that forecasting commodities and stocks, which display unpredictable behaviour in everything but the long run, is a hopeless task, as it is impossible to predict anything more than the general trend or drift [4].
I have chosen Brent oil specifically because it is a very important measure for the British economy, given how valuable North Sea resources are. Brent is also very closely linked to other crude oil benchmarks, such as West Texas Intermediate and Dubai, as they compete on the world market.
In this project I plan on building models that include the following factors: US shale oil production, OPEC production, a non-fuel based commodity price index, measures of geopolitical stability amongst oil exporting countries, and lags of the oil price and of the explanatory variables, in order to produce the best possible forecast. In my opinion these are the factors most likely to influence prices over the next year. For example, removal of political sanctions on Iran could increase its oil exports by up to 1 million barrels per day, as reported by Reuters [5]. Other things being equal, this would reduce the price of oil on the world market. Another factor could be Saudi Arabia, an OPEC member, increasing supply in order to drive down prices and clear the market of higher cost producers, such as some US shale producers that only became profitable after the 2005 price boom, as is expected by MPC member Ian McCafferty [6].
In order to test the accuracy of my models I plan on comparing them to a popular forecasting method that is widely used in governments and financial institutions alike: a Delphi approach [7]. This will involve collecting and collating forecasts from prominent figures who publicise their forecasts and building an unbiased forecast from them. I can then see whether my models beat this forecast or whether a more qualitative approach is needed. This approach will be useful for contrast because many factors that have huge implications for the oil price, such as wars and natural disasters, are very difficult to predict, and opinions on them vary greatly.
In this paper I examine the characteristics of oil prices over time, then present several different regressions and forecasting methods, and evaluate them against popular market predictions.
In order to collate this information I have selected several well-known institutions that publish Brent oil price forecasts: the IMF, the World Bank, the Economist Intelligence Unit, the US Energy Information Administration and Goldman Sachs Global Investment Research. All of these high-profile forecasters should be producing forecasts that are unbiased and as accurate as possible, as these are heavily relied upon in their sectors. In this ‘Delphi-style’ forecast it is impossible for the forecasters to remain anonymous to me, the facilitator, as I have to choose them. I have therefore attempted to be as unbiased as possible and selected forecasts covering a range of values. Another point that the institutions stress heavily in their justifications is that oil prices are very volatile and highly accurate forecasts are difficult to produce, which is why most of the institutions provide only an approximate date for their forecasts, simply a yearly or quarterly average.
The IMF [8] forecasts that the price will end the first quarter of 2015 at 53/bl (US dollars per barrel), then rise to 57.2/bl and subsequently 60.1/bl, finishing the year at 62.3/bl. The World Bank [9] produces a yearly average forecast, which predicts 53/bl for 2015, rising marginally to 57/bl in 2016. The Economist Intelligence Unit [10] forecasts a 2015 average of 58/bl and a 2016 average of 71/bl. The EIA [11] predicts a 2015 average of 59.5/bl and a 2016 average of 75/bl. Finally, Goldman Sachs [12] forecasts 50.4/bl and 70/bl for 2015 and 2016 respectively. The amount of variation between these ‘professional’ forecasts reflects the unpredictability of the oil industry, and the forecasts should be treated accordingly.
As the facilitator of these forecasts, my job is to compile them and put forward an unbiased view. Using the mean and median values, I predict a 2015 average of 55/bl and a 2016 average of 67/bl. This suggests that the general consensus in the industry is that prices will fall further over the next year or so before rebounding towards their mean over approximately the last decade. This is because the market will take time to rebalance after exploration and extraction firms reduce investment in response to the low prices [13].
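The pooling step can be illustrated with a minimal Python sketch. It uses only the figures quoted in the text; collapsing the four IMF quarter-end values into one 2015 figure by averaging is an assumption made here, and the names `forecasts_2015`, `forecasts_2016` and `consensus` are hypothetical. Because the text does not say exactly how the mean and median were blended, the printed values need not match the rounded consensus figures quoted above.

```python
from statistics import mean, median

# Forecasts quoted in the text, in US dollars per barrel.
# The IMF gives four quarter-end values for 2015; averaging them into a
# single annual figure is an assumption made for this sketch.
imf_2015_quarters = [53.0, 57.2, 60.1, 62.3]

forecasts_2015 = {
    "IMF": mean(imf_2015_quarters),
    "World Bank": 53.0,
    "EIU": 58.0,
    "EIA": 59.5,
    "Goldman Sachs": 50.4,
}
# No 2016 IMF figure is quoted in the text, so the IMF is left out here.
forecasts_2016 = {
    "World Bank": 57.0,
    "EIU": 71.0,
    "EIA": 75.0,
    "Goldman Sachs": 70.0,
}


def consensus(forecasts):
    """Return (mean, median) of a dict of point forecasts."""
    values = list(forecasts.values())
    return mean(values), median(values)


mean_2015, median_2015 = consensus(forecasts_2015)
mean_2016, median_2016 = consensus(forecasts_2016)
print(f"2015: mean {mean_2015:.2f}, median {median_2015:.2f}")
print(f"2016: mean {mean_2016:.2f}, median {median_2016:.2f}")
```

Reporting both statistics is useful with so few forecasters, since a single outlying institution moves the mean far more than the median.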
Because oil is very volatile, its unpredictable nature has been widely discussed. The demand for and supply of oil are affected by many different variables, both qualitative and quantitative, many of which are unpredictable. Geman (2007) [14] argues that before 2000 oil prices were stationary rather than a random walk, because they reverted to a set mean, and that after 2000 they became a random walk. This is feasible, as the variance and unpredictability of prices did seem to change around this time, alongside an upward trend.
The implications of the oil price becoming a random walk are very significant for my forecasts, as many traditional forecasting techniques rely on the variable reverting to an average value. If the series is conclusively a random walk, a naive approach is best suited to forecasting, meaning that the last observed value is carried forward over the out-of-sample period.
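As a minimal Python sketch of the naive approach described above, the function below repeats the last observed value over the forecast horizon. The function name and the sample prices are hypothetical and are not the data used in this paper.

```python
def naive_forecast(history, horizon):
    """Carry the last observed value forward for `horizon` periods."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


# Hypothetical monthly prices in US dollars per barrel (not the paper's data).
prices = [62.3, 57.9, 55.1, 52.8]
print(naive_forecast(prices, 3))
```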
##
## Phillips-Perron Unit Root Test
##
## data: oil.t
## Dickey-Fuller = -2.6936, Truncation lag parameter = 5, p-value =
## 0.2844
##
## Augmented Dickey-Fuller Test
##
## data: oil.t
## Dickey-Fuller = -2.3259, Lag order = 7, p-value = 0.4398
## alternative hypothesis: stationary
##
## KPSS Test for Level Stationarity
##
## data: oil.t
## KPSS Level = 5.1693, Truncation lag parameter = 4, p-value = 0.01
To test the theory put forward by Geman I use Phillips-Perron, ADF and KPSS tests, and find that Brent oil prices behave as a random walk over the entire series, from 1980 until 2015. The Phillips-Perron and ADF tests fail to reject the null hypothesis of a unit root (p-values of 0.2844 and 0.4398), while the KPSS test rejects the null of level stationarity (p-value of 0.01), so I conclude that there is a unit root at both the 5% and 10% significance levels. As all three tests come to the same conclusion I treat this as conclusive, regardless of the ‘low power’ of some unit root tests.
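For reference, the augmented Dickey-Fuller test is built on a regression of the following general form. This is the standard textbook statement with a constant and a linear trend; the symbols are chosen here for illustration and are not taken from the output above:

\[\Delta y_{t} = \alpha + \beta t + \gamma y_{t-1} + \sum_{i=1}^{k} \delta_{i} \Delta y_{t-i} + e_{t}\]

The null hypothesis is that gamma = 0 (a unit root), against the alternative that gamma < 0 (stationarity), and k is the lag order reported in the test output. The KPSS test reverses the roles, taking stationarity as the null hypothesis, which is why a large ADF p-value and a small KPSS p-value point in the same direction.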
The finding that oil prices follow a random walk is important from a forecasting point of view. According to Hyndman & Athanasopoulos [15] the best approach given this information is a (seasonal) naive approach, or possibly an ARIMA(0,1,0) model. I will attempt these, in addition to building models that could potentially ‘beat the random walk’.
Looking at the plot of the original series, it is apparent that the series is non-stationary. This is confirmed by the ACF: if the oil price were stationary, the ACF would drop to zero much more quickly than it does. The ACF shows that shocks to the oil price have a certain amount of persistence, which is believable given how important oil is to the economy, as detailed previously.
Both the differenced time series and its ACF look typical of ‘white noise’. Only two autocorrelations lie outside the 95% confidence limits. This implies that an appropriate model will use two lags of the oil price, as they are significant in determining the current price level.
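The check described above can be sketched in plain Python: difference the series, compute the sample autocorrelations, and flag the lags that fall outside the usual approximate 95% limits of plus or minus 1.96 divided by the square root of the sample size. The function names and the sample prices are hypothetical, and the band is the standard large-sample approximation rather than anything specific to this paper.

```python
import math


def difference(series):
    """First difference: y[t] - y[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    avg = sum(series) / n
    dev = [x - avg for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def significant_lags(series, max_lag):
    """Lags whose autocorrelation lies outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(len(series))
    acf = sample_acf(series, max_lag)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > bound]


# Hypothetical prices (not the paper's data), used only to show the calls.
prices = [50.0, 52.0, 51.0, 55.0, 58.0, 57.0, 60.0, 64.0, 61.0, 59.0, 63.0, 66.0]
changes = difference(prices)
print(sample_acf(changes, 3))
print(significant_lags(changes, 3))
```

With 20 or more lags plotted, roughly one in twenty autocorrelations would be expected to fall outside the band by chance alone, so a small number of exceedances is still compatible with white noise.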
If the first-differenced series is white noise then it is by definition stationary, provided the variance is constant throughout the whole series, which may not hold here as the variance can increase with the price.
\[y_{t} - y_{t-1} = c + e_{t}\]
where c is an optional constant drift term (zero for a pure random walk).
The fact that differencing yields white noise further strengthens the random walk argument.
It could also be argued that after 2000 the differenced series looks somewhat like ‘coloured noise’ [16], as it displays more volatility and larger deviations from the zero line. This would imply that oil prices may be mean reverting, with a coefficient on the lagged price that is less than one but close to it, around 0.95 for example.
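To make the near-unit-root case concrete, a mean-reverting price can be written as a first-order autoregression. This is a standard textbook form, with symbols chosen here for illustration rather than taken from an estimated model:

\[y_{t} = c + \phi y_{t-1} + e_{t}, \qquad |\phi| < 1\]

With a coefficient of 0.95 the series still reverts towards its long-run mean c/(1 - phi), but slowly. The share of a shock remaining after h months is 0.95 to the power h, so the half-life of a shock is

\[h_{1/2} = \frac{\ln 0.5}{\ln 0.95} \approx 13.5\]

months. A random walk corresponds to a coefficient of exactly one, in which case shocks never die out.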
The European Central Bank [17] has published in-depth studies into the best indicators of the Brent oil price. They conclude that none of their models consistently ‘beats’ the random walk model across various training periods. However, they find that a combination of several forecasts gives a good indication of the general direction in which prices are heading. Following their findings I will combine forecasts built by time series decomposition, exponential smoothing and ARIMA to attempt the most accurate prediction.
Evaluation of my forecasts is based on two factors: the Mean Absolute Percentage Error (MAPE), and whether the residuals are independently and identically distributed.
\[\mathrm{MAPE}=\mathrm{mean}(|p_{i}|)\]
where \[p_{i}=100e_{i}/y_{i}\]
Here e_i is the forecast error and y_i is the actual value in period i.
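The MAPE definition above translates directly into a few lines of Python. This is a minimal sketch: the function name and the sample numbers are hypothetical, and the error is taken as actual minus forecast, which does not affect the result because of the absolute value.

```python
def mape(actual, forecast):
    """Mean absolute percentage error: mean(|100 * e_i / y_i|)."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    pct_errors = [100.0 * (y - f) / y for y, f in zip(actual, forecast)]
    return sum(abs(p) for p in pct_errors) / len(pct_errors)


# Hypothetical values (not the paper's data).
actual = [50.0, 40.0]
forecast = [55.0, 38.0]
print(mape(actual, forecast))
```

Because each error is divided by the actual value, the measure is undefined when an actual value is zero, which is not a concern for oil prices.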
This second factor is built on three assumptions: a constant expected value of the errors (ideally zero for highest precision), a constant variance (s^2) of the errors across time, and no serial correlation of the errors over time. This implies that the errors are stationary and exhibit ‘white noise’ style characteristics.
\[e_{t} \sim \mathrm{iid}(0,s^{2})\]
Given that the price of Brent oil has been a random walk over the period since 1980, the best forecast according to FPP [18] is a naive or a seasonal naive forecast, as the oil price seems to drift upwards.
Naive and seasonal naive forecasts give MAPEs of 6.352 and 23.926 respectively. The seasonal naive is a poor forecaster because the data are too strongly trended for oil prices to equal their value in the same period a year earlier, and the series is not overtly seasonal. It can be clearly seen that this forecast is not accurate, as there were dramatic events over the past year.
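The weakness of the seasonal naive method on trended data can be seen in a small Python sketch. The function name and the falling price series are hypothetical and are not the data used in this paper; a period of 12 assumes monthly observations.

```python
def seasonal_naive_forecast(history, horizon, period=12):
    """Forecast each future period with the value one full season earlier."""
    if len(history) < period:
        raise ValueError("need at least one full seasonal period of history")
    last_cycle = history[-period:]
    return [last_cycle[h % period] for h in range(horizon)]


# Hypothetical, steadily falling monthly prices (not the paper's data).
history = [100.0 - 3.0 * month for month in range(12)]
forecast = seasonal_naive_forecast(history, 3)
print("last observed:", history[-1])
print("seasonal naive:", forecast)
```

In this example the latest price is far below the values from a year earlier, so the seasonal naive forecast jumps back up to the old level, whereas a plain naive forecast would stay at the latest price.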
Using automatic ARIMA selection, an ARIMA(2,1,1)(0,0,2) model is produced. This indicates that the selection procedure treats Brent oil prices as seasonal and integrated of order 1, with 2 significant lags (as demonstrated by the ACF analysis). After attempting various alternatives, including (0,1,0), (1,1,1) and (1,1,0) models, I find that this ARIMA(2,1,1)(0,0,2) is the best performing, with significantly lower information criteria.
The MAPE of the ‘best’ ARIMA forecast is 6.275, which outperforms the naive forecast. The residuals of this model also fit the normality assumption well.
The exponential smoothing model produces a forecast with MAPE of 6.411, which over the training set does not outperform the naive forecast. Furthermore, the residuals are significantly skewed which means the model may not be the best suited to forecasting.
Using a time series decomposition I prepare two models, one using a periodic window and one experimenting with various window values. I find that the residuals from the experimental model are preferable for forecasting as they are less skewed, although still non-normal. Both forecasts produce a MAPE of 6.5469, and I will incorporate the experimental model into my combination forecast because its residuals are far less sinusoidal, whereas the periodic model gives highly autocorrelated residuals.
Overall, the time series decomposition forecasts are lower than the actual values, so combining them with the ARIMA model, which tends to overestimate, may produce a good forecaster.
These two training periods demonstrate the performance of several different forecasts. The first forecasts a 2 year period beginning in 2009, and shows that the training forecasts are in general reasonably accurate here. The ARIMA(2,1,2) is the best forecaster over this in-sample period but on average overestimates. However, it fails to incorporate any kind of fluctuation, so a combination of the ARIMA and STL forecasts is useful in this scenario, as can be seen. This combination forecast gives a MAPE of 6.25, which marginally beats the ARIMA model to become the best performing forecast.
In the second in-sample forecast, covering 2 years from May 2009, it becomes even clearer that taking the mean of the ARIMA and time series decomposition forecasts results in the most accurate forecast. This combined forecast has a MAPE of 5.87, by far the lowest of the forecasts considered.
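The reason an equal-weight combination can beat both of its components is that errors of opposite sign partly cancel. The Python sketch below illustrates this with made-up numbers: the two input forecasts merely stand in for a model that runs high and one that runs low, and none of the values or function names come from the models in this paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return sum(abs(100.0 * (y - f) / y) for y, f in zip(actual, forecast)) / len(actual)


def combine_mean(*forecasts):
    """Equal-weight average of several forecasts, period by period."""
    return [sum(values) / len(values) for values in zip(*forecasts)]


# Hypothetical numbers (not the paper's data): one forecast runs high, one low.
actual = [60.0, 62.0, 64.0]
high_forecast = [63.0, 65.0, 66.0]  # stands in for a model that overestimates
low_forecast = [58.0, 60.0, 61.0]   # stands in for a model that underestimates

combined = combine_mean(high_forecast, low_forecast)
for name, f in [("high", high_forecast), ("low", low_forecast), ("combined", combined)]:
    print(f"{name}: MAPE {mape(actual, f):.2f}")
```

The gain depends on the component errors having opposite signs; if both models erred in the same direction, averaging would land between them rather than improve on the better one.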
As can be seen from this compilation of forecasts, some of them coincide with those produced by the “experts” selected for the Delphi-style forecast. In particular the STL forecast seems to predict well. However, I am of the opinion that all of my forecasts are either too high or too low. Therefore, I take a geometric mean to build a more realistic forecast. The output is shown above, labelled “Combination average”.
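A geometric-mean combination can be sketched in Python as follows. The function name and the three input forecasts are hypothetical and are not the forecasts produced in this paper; the sketch only shows the period-by-period calculation.

```python
import math


def combine_geometric(*forecasts):
    """Geometric mean across forecasts, period by period (values must be positive)."""
    combined = []
    for values in zip(*forecasts):
        if any(v <= 0 for v in values):
            raise ValueError("geometric mean requires positive forecasts")
        log_mean = sum(math.log(v) for v in values) / len(values)
        combined.append(math.exp(log_mean))
    return combined


# Hypothetical forecasts in US dollars per barrel (not the paper's data).
forecast_a = [70.0, 72.0, 75.0]
forecast_b = [55.0, 54.0, 52.0]
forecast_c = [61.0, 62.0, 63.0]
print(combine_geometric(forecast_a, forecast_b, forecast_c))
```

The geometric mean is never larger than the arithmetic mean of the same values, so it gives slightly less weight to unusually high forecasts.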
Using two linear regressions I have built models that include measures of geopolitical stability [19] amongst top oil exporting countries [20] that have faced uncertainty over recent years, OPEC production [21], US shale oil production [22] and a non-fuel based commodity price index [23].
## Start: AIC=-375.5
## oil.price ~ iran.stab + iraq.stab + russia.stab + saudi.stab +
## comm.price
##
##
## Step: AIC=-375.5
## oil.price ~ iran.stab + iraq.stab + russia.stab + comm.price
##
##
## Step: AIC=-375.5
## oil.price ~ iran.stab + iraq.stab + comm.price
##
##
## Step: AIC=-375.5
## oil.price ~ iran.stab + comm.price
##
##               Df Sum of Sq     RSS     AIC
## <none>                        11.0 -375.50
## - iran.stab    2     745.2   756.2  847.93
## - comm.price 273   26515.5 26526.5 1337.65
##            CV           AIC          AICc           BIC         AdjR2
##           Inf -3.735027e+02  1.660905e+05  6.834231e+02  9.971394e-01
## Start: AIC=634.5
## oil.price ~ opec.prod + us.shale.prod
##
##                  Df Sum of Sq     RSS    AIC
## <none>                          359.7  634.5
## - us.shale.prod 262     14457 14816.6 1188.8
## - opec.prod      11     10823 11182.8 1609.2
##            CV           AIC          AICc           BIC         AdjR2
##           Inf  6.364995e+02  8.329250e+04  1.689755e+03  9.296822e-01
Using a backwards stepwise approach to model selection, I find that the important explanatory variables are the non-fuel based commodity price index and political stability in Iran, which as mentioned may have a significant effect in the near future.
However, I feel that these models are incredibly susceptible to subjectivity. They are dependent on the regressors that are chosen and, just as importantly, on those that are omitted. Despite the use of stepwise model selection, it must be remembered that a poor model can still be selected if a bad path is chosen, and the best possible model is by no means guaranteed.
For models of oil prices there is a large number of possible regressors that have been argued to be significant at some point in history, many of which are not available quantitatively. Therefore, I have decided to omit these forecasts from my combination, as there are too many potential sources of error, including omitted variable bias, serial correlation in the error terms and heteroskedasticity.
Furthermore, I feel that a scenario based forecast on a variable as volatile as oil prices will add nothing to my predictions.
This impulse indicator saturation test shows that there have been various structural breaks over the last 30 years. To put this into historical context, prices dropped in the mid-1980s due to the OPEC collapse, and then trended upwards as OPEC regrouped and attempted to fix prices [24]. More recently there were several upward structural breaks in the approach to the 2008 ‘boom’, followed by the bust caused by the recession and falling demand. Given the first structural break I have decided to use only the data from 1991 onward and build a model that extends the seasonal and trend components found in the time series decomposition.
## CV AIC AICc BIC AdjR2
## 323.1855454 1676.8275018 1678.3547746 1728.2058348 0.7588551
## CV AIC AICc BIC AdjR2
## 532.1384138 2649.7964913 2650.8284324 2706.4265657 0.4916035
The cross-validation, AICc, BIC and adjusted R-squared statistics show that the model using only data from 1991 is highly preferable to the model using all of the data. However, these regressions are not good forecasters: the residuals are clearly highly correlated and the fitted values are nowhere near the current value. It is naive to think that oil prices will expand at a constant rate, as this regression implies.
The trend and seasonal regression using data from 1991 onward has a MAPE of 45.0275.
Running an arx regression on oil prices with two significant lags and a non-fuel based commodity price index gives this output. At first glance it looks like a suitable regression for forecasting the oil price - with seemingly random residuals.
Performing general-to-specific (GETS) analysis on the above equation shows that the model is not well specified. The general unrestricted model fails the normality test, with residuals that are clearly skewed to the right.
Changing the GETS parameters to ignore the normality clause gives the following output; it concludes that all of the explanatory variables are significant regressors.
Therefore, below is a model using the optimal ARIMA(2,1,1)(0,0,2) with the commodity price index as an explanatory variable. However, forecasting using this equation is not a feasible approach as it requires forecasting the commodity price index into the future before it can forecast the price of oil. Either way there is going to be a similar error.
## Series: oil.t.1
## ARIMA(2,1,1)(0,0,2)[12]
##
## Coefficients:
##           ar1     ar2     ma1    sma1     sma2  comm.pric.non.t
##       -0.6009  0.2381  1.0000  0.0866  -0.0676           0.7253
## s.e.   0.0600  0.0600  0.0108  0.0641   0.0628           0.0461
##
## sigma^2 estimated as 12.17: log likelihood=-769.74
## AIC=1553.47 AICc=1553.87 BIC=1579.14
As stated earlier, my method of evaluation is to check whether the errors appear independently and identically distributed:
\[e_{t} \sim \mathrm{iid}(0,s^{2})\]
However, none of the forecasts produced in this paper give residuals with these characteristics. The non-normality of the errors could stem from heteroskedasticity, potentially caused by omitted variable bias. This leads me to believe that the autocorrelation in the error terms is also caused by omitted variable bias. This is a problem that is very difficult to avoid with oil prices, as there are many factors driving them. It would be feasible to add further lags as a proxy for these omitted variables; however, after analysing the autocorrelation function it is clear that no more than 2 lags are significant, which is verified by the ARIMA model selection.
Furthermore, some of these influencing factors are difficult to quantify as they are closer to qualitative information, such as ‘animal spirits’ in the world economy, which may affect speculation, investment and consumption surrounding oil. Other factors could include the underlying technological evolution that has changed the structure of oil exploration and extraction over recent decades. This technology is always evolving and can only be modelled as a stochastic disturbance term that is determined exogenously.
I have found that oil is very unpredictable; it follows a random walk like many other commodities, making it very difficult to forecast accurately. After attempting many different methods and models, I conclude that the most favourable method of prediction is a combination of an ARIMA(2,1,1)(0,0,2) and a time series decomposition, averaged with equal weighting. This is because this combination gave the lowest MAPE.
The forecasts that I have produced are significantly higher than the Delphi forecast, and have significantly less variance than many other forecasts. My forecasts for Brent oil prices over the next 11 months are:
Date Forecast
2015-03 60.845
2015-04 62.6025
2015-05 62.9275
2015-06 63.3175
2015-07 63.68
2015-08 63.695
2015-09 63.4025
2015-10 62.875
2015-11 62.265
2015-12 61.4575
2016-01 60.6165
2016-02 62.39575
[1] Brent crude oil price data, original source IMF [https://www.quandl.com/data/ODA/POILBRE_USD-Brent-Crude-Oil-Price]. All prices measured in US dollars per barrel.
[2] Office for National Statistics, “Inflation at 0%” [http://www.ons.gov.uk/ons/dcp171780_399052.pdf]
[3] Stevenson, R.; Bear, R. (2012) “Commodity Futures: Trend or Random Walks”, Journal of Finance, Vol. 25, Issue 1.
[4] Malkiel, B. (2007) “A Random Walk Down Wall Street”, first published in 1973, updated in 2007 by Norton & Co.
[5] Torbati, Y. (2015) “Iran nuclear deal may open oil taps in months, not weeks”, Thomson Reuters [http://www.reuters.com/article/2015/03/17/us-iran-oil-sanctions-analysis-idUSKBN0MD0DJ20150317]
[6] McCafferty, I. (2015) “Oil price falls - what consequences for monetary policy?”, speech given at Durham Business School, 10/03/2015.
[7] Linstone, H.; Turoff, M. (1975) “The Delphi Method”, Addison-Wesley, Reading, MA.
[8] IMF Primary Commodity Price forecasts, updated on 23/03/2015 [http://www.imf.org/external/np/res/commod/index.aspx]
[9] The World Bank Commodity Markets Outlook, January 2015 [http://www.worldbank.org/content/dam/Worldbank/GEP/GEPcommodities/GEP2015a_commodity_Jan2015.pdf]
[10] The Economist Intelligence Unit, Individual commodity price forecasts, 2015 [http://gfs.eiu.com/Article.aspx?articleType=cf&articleId=1642990748&secId=0] (18/3/15)
[11] The US Energy Information Administration, Short term energy outlook report, March 2015.
[12] Goldman Sachs Global Investment Research, 2015 commodity forecast.
[13] McCafferty, I. (2015) “Oil price falls - what consequences for monetary policy?”, speech given at Durham Business School, 10/03/2015.
[14] Geman, H. (2007) “Mean Reversion versus Random Walk in Oil and Natural Gas Prices”, Advances in Mathematical Finance, pages 219-288.
[15] Hyndman, R.; Athanasopoulos, G. (2013) “Forecasting: Principles and Practice” [https://www.otexts.org/book/fpp]
[16] Patterson, K. (2000) “An Introduction to Applied Econometrics: A Time Series Approach”, Palgrave Macmillan.
[17] Manescu, C.; Robays, I. (2014) “Forecasting The Brent Oil Price, Addressing Time-Variation in Forecast Performance”, European Central Bank working paper series, No 1735.
[18] Hyndman, R.; Athanasopoulos, G. (2013) “Forecasting: Principles and Practice” [https://www.otexts.org/book/fpp]
[19] Geopolitical stability estimates, World Bank data sets [https://www.quandl.com/c/society/estimated-political-stability-by-country]
[20] Top oil exporting countries, US Energy Information Administration [http://www.eia.gov/countries/index.cfm]
[21] OPEC production, US Energy Information Administration [https://www.quandl.com/data/EIA/STEO_COPC_OPEC_A-OPEC-Total-Crude-Oil-Production-Capacity-Annual]
[22] US shale oil production [https://www.quandl.com/c/energy/crude-petroleum-reserves-from-oil-shale-by-country]
[23] Non-fuel price index, IMF [https://www.quandl.com/data/ODA/PNFUEL_INDEX-Non-Fuel-Price-Index]
[24] McCafferty, I. (2015) “Oil price falls - what consequences for monetary policy?”, speech given at Durham Business School, 10/03/2015.