TakeAway_CAT : Energy Analytics

library(tinytex)

Introduction

Energy remains an essential pillar for human livelihood for its necessity in sustainability and development of any current civilization. One of the main feature distinguishing it from most of the other commodities is the mean-reverting characteristic together with evident spikes and high volatility as seen for electricity and crude oil prices. Here, we’ll look at crude oil daily spot prices. A number of works have employed the Ornstein–Uhlenbeck process to model directly the dynamics over time of different commodity spot prices under reduced-form one-factor models (Ribeiro and Hodges, 2004). Nonetheless, in recent studies, crude oil spot price has also been modelled as a jump-diffusion process, as attributed to Merton (1976), as in Jorion (1988) for asset returns and Ball & Torous (1983) for foreign exchange and other securities. Moreover, due to the presence of skewness and fat tails in the empirical distribution of oil price returns, the series is more effectively modeled using Levy models which assign higher probabilities to sudden unexpected events unlike the gaussian models like Black Scholes model. These jump-diffusion models incorporate jumps into a diffusion process through a Poisson process with parameter, \(\lambda\).

oil_data <- read.csv(file.choose())
head(oil_data)

##         Day Brent_Spot_Price_FOB
## 1 5/20/1987                18.63
## 2 5/21/1987                18.45
## 3 5/22/1987                18.55
## 4 5/25/1987                18.60
## 5 5/26/1987                18.63
## 6 5/27/1987                18.60

my_data = oil_data$Brent_Spot_Price_FOB
head(my_data)

## [1] 18.63 18.45 18.55 18.60 18.63 18.60

plot(my_data, type = "l", lwd = 2.5, xlab = "Time", ylab = "Crude Spot Price", col = 1, main = "Historical Spot Price for Brent Fuel")

hist(my_data, main='Crude Oil Spot Prices', xlab='Crude Oil Price', col='darkblue', breaks = 1000, border="red", freq=F)
lines(density(my_data))

returns = diff(log(my_data))
head(returns)

## [1] -0.009708814  0.005405419  0.002691792  0.001611604 -0.001611604
## [6]  0.000000000

As evidenced from the above plot, the oil spot price is quite volatile as characterized by a number of jumps. We’ll transform the prices to log returns to bring an aspect of stationarity in the series for better analysis and parameter estimation.

plot(returns, type = "l", lwd = 2.5, xlab = "Time", ylab = "CrudeLog Returns", col = 1, main = "Brent Crude Oil Log Returns")

hist(returns, main='Crude Oil Log Returns', xlab='Log Returns', col='darkblue', breaks = 1000, border="red", freq=F)
lines(density(returns))

Time-Series Properties of the Crude Oil Price

From the time series plots above, there has been heightened variance in the crude oil price, with the earlier periods being characterized by very high prices and very low levels for the recent periods. Moreover, the price series is non-gaussian as ooposed to the log returns’ series. To determine a seasonal or trend factor in the time series, we plot the ACF and PACF charts and run a dickey fuller test to check for stationarity.

library(tseries)
library(forecast)
library(x12)

## Warning: package 'x12' was built under R version 3.6.3

par(mfrow=c(1,2))
 acf(my_data,  main="ACF")
 pacf(my_data,  main="PACF")

 adf.test(my_data)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  my_data
## Dickey-Fuller = -2.2202, Lag order = 20, p-value = 0.4851
## alternative hypothesis: stationary

With a p value of .4851, clearly the time series is not stationary. The ACF plot doesn’t have curves in the lags, so seasonality is not evident in the series. The high 1st lag in PACF also shows evidence of non stationarity in the series. In order to make the series stationary, we first try a transformation using the BoxCox method to check for the lambda.

lambda = BoxCox.lambda(my_data, method = 'loglik')
 lambda

## [1] -0.2

With a lambda value of -0.2, we use inverse power transformation on the series and run the augmented dickey fuller test to check for stationarity again.

my_data.tr = (my_data^lambda)
adf.test(my_data.tr)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  my_data.tr
## Dickey-Fuller = -2.8328, Lag order = 20, p-value = 0.2256
## alternative hypothesis: stationary

The p value has reduced slightly but is still at .2256 which is well above the significant value of 0.05. Since transformation did not work in making the series stationary, we difference the series using ordinary differencing and plot and test for stationarity using the dickey fuller test.

my_data.diff = diff(my_data)
 plot(my_data.diff,type = "o", ylab='Change in price index',xlab='Time', main = "Differenced time series plot for Crude oil price")

adf.test(my_data.diff)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  my_data.diff
## Dickey-Fuller = -18.585, Lag order = 20, p-value = 0.01
## alternative hypothesis: stationary

The differenced series has a better variance around the mean level, and the peaks are evidence of the interventions in the orignal series. The p-value of the dickey fuller test is significant and is now 0.01 and hence the series is now stationary.

Brownian Motion and Yuima

library(yuima)

As stated earlier, basic theoretical GBM oil pricing model was proposed by Brennan and Schwartz (1985) and used for pricing of financial derivatives. Later Gibson and Schwartz (1990) derived a stochastic spot price model with convenience yield following a mean-reversion process. In yuima, one can describe three main families of stochastic processes at present. These models can be one or multidimensional and eventually described as parametric models. Let \(X_0 = x_0\) be the initial value of the process, then, the main classes are as follows:

Diffusion models described as; \[\begin{equation} dX_t = a(t, X_t, \theta_2)dt + b(t, X_t, \theta_1)dW_t ; X_0 = x_0 \end{equation}\] where \(W_t\) is a standard Brownian motion.

SDE’s driven by fractional Gaussian noise, with H the Hurst parameter, described as; \[\begin{equation} dX_t = a(t, X_t, \theta_2)dt + b(t, X_t, \theta_1)dW_t^H ; X_0 = x_0 \end{equation}\]

Diffusion processes with jumps and Levy processes solution to; \[\begin{equation} dX_t = a(t, X_t, \theta_2)dt + b(t, X_t, \theta_1)dW_t + \int_{|Z|>1} c(X_{t-}, Z) \mu(dt, dZ) + \int_{0<|Z|>1} c(X_{t-}, Z) \{\mu(dt, dZ) - v(dZ)dt \}; X_0 = x_0 \end{equation}\]

Delta <- 1/252
x <- my_data
gBm <- setModel(drift="mu*x", diffusion="sigma*x")
mod <- setYuima(model=gBm, data=setData(my_data, delta=Delta))
set.seed(123)
fit <- qmle(mod, start=list(mu=1, sigma=1),
lower=list(mu=0.1, sigma=0.1),
upper=list(mu=100, sigma=10))
summary(fit)

## Quasi-Maximum likelihood estimation
## 
## Call:
## qmle(yuima = mod, start = list(mu = 1, sigma = 1), lower = list(mu = 0.1, 
##     sigma = 0.1), upper = list(mu = 100, sigma = 10))
## 
## Coefficients:
##        Estimate  Std. Error
## sigma 0.4047972 0.003171736
## mu    0.1016507 0.070179800
## 
## -2 log L: 22331.1

It is easy to show that the \(X_i\) form a sequence of independent and identically distributed Gaussian random variables \(X_i ∼ N(\alpha \Delta, \sigma^2 \Delta)\), where \(\Delta = t_i − t_{i−1}\) and \(\alpha = \mu − \frac{1}{2}\sigma^2\). Then, the plug-in estimators for \(\mu\) and \(\sigma\) can be calculated according below code;

X <- diff(log(my_data))
X <- as.numeric(na.omit(diff(log(my_data))))
alpha <- mean(X)/Delta
sigma <- sqrt(var(X)/Delta)
mu <- alpha +0.5*sigma^2
coef(fit)

##     sigma        mu 
## 0.4047972 0.1016507

which looks close to the QMLE estimates as it should be in this situation.

Compound Poisson Driving Processes

Lévy models can be specified in two different ways in yuima. The first one is in terms of the compound Poisson structure, and the second allows for direct specification of the density of the Lévy process.

S <- oil_data$Brent_Spot_Price_FOB
Z <- na.omit(diff(log(S)))
Dt <- 1/252
# geometric Brownian motion estimation
model1 <- setModel(drift="mu*x", diff="sigma*x")
gBm <- setYuima(model=model1, data=setData(S,delta=Dt))
gBm.fit <- qmle(gBm, start=list(mu=0,sigma=1),method="BFGS")
gBm.cf <- coef(gBm.fit)
zMin <- min(Z)
zMax <- max(Z)
gBm.cf

##     sigma        mu 
## 0.4008832 0.1016508

# Gaussian-Levy estimation
model3 <- setPoisson(df="dnorm(z,mu,sigma)")
Norm <- setYuima(model=model3, data=setData(cumsum(Z),delta=Dt))
Norm.fit <- qmle(Norm,start=list(mu=1, sigma=1), lower=list(mu=1e-7,sigma=0.01),method="L-BFGS-B")
Norm.cf <- coef(Norm.fit)
Norm.cf

##         mu      sigma 
## 0.00000010 0.02586104

# NIG-Levy estimation
model2 <- setPoisson( df="dNIG(z,alpha,beta,delta1,mu)")
NIG <- setYuima(model=model2, data=setData(cumsum(Z),delta=Dt))
NIG.fit <- qmle(NIG, start=list(alpha=10, beta=1, delta1=0.01, mu=0.0001), method="L-BFGS-B")
NIG.cf <- coef(NIG.fit)
NIG.cf

##        alpha         beta       delta1           mu 
## 1.000000e+01 9.999988e-01 1.324321e-02 2.370332e-04

myfgBm <- function(u)
dnorm(u, mean=gBm.cf["mu"], sd=gBm.cf["sigma"])
myfNorm <- function(u)
dnorm(u, mean=Norm.cf["mu"],sd=Norm.cf["sigma"])
myfNIG <- function(u)
dNIG(u, alpha=NIG.cf["alpha"],beta=NIG.cf["beta"],delta=NIG.cf["delta1"], mu=NIG.cf["mu"])
plot(density(Z,na.rm=TRUE),main="Gaussian versus NIG")
curve(myfgBm, zMin, zMax, add=TRUE, lty=2)
curve(myfNorm, zMin, zMax, col="red", add=TRUE, lty=4)
curve(myfNIG, zMin, zMax, col="blue", add=TRUE,lty=3)

AIC(gBm.fit)

## [1] 22333.48

AIC(Norm.fit)

## [1] -36598.66

AIC(NIG.fit)

## [1] -38995.63

The codes and graph above, compares the empirical density of the data \(Z_t\) with the densities of an estimated geometric Brownian motion (dashed line), a Gaussian distribution where the mean and the variance coincide with the sample means and variance of \(Z_t\) and the NIG density where the parameters have been estimated using the degenerate compound Poisson process \(X_t\) above. It can be clearly seen that the data is not Gaussian (and especially not coming from a geometric Brownian motion), but more likely to be of NIG type in this particular dataset. The Akaike information criterion evaluated with the AIC function also confirms this empirical evidence. The plot above is a representation of the empirical density (continuous line) against a geometric Brownian motion fit (dashed line), a Gaussian fit (horizontal dashed and dotted line) and the estimated NIG Lévy model density (dotted line).

Dynamics of the Jump Diffusion model

This model superimposes a jump component on a diffusion component, which is the familiar geometric Brownian motion. The jump component is composed of lognormal jumps driven by a Poisson process and models the sudden changes in the spot price due to the arrival of new important information. Here, a Levy process is defined in order to discuss a jump-diffusion model.

Definition

A continuous - time stochastic process \({X_t : t > 0}\), defined on a probability space \((Ω, F, Ρ)\) is said to be a Levy process if it possesses the following properties;

1.) \(P\{X0 = 0\} = 1\).

2.) Stationary increments: the distribution \(X_{t+s}-X_t\) over the interval \([t, t + s]\) does not depend on \(t\) but on length of interval \(s\).

3.) Independent increments: for every increasing sequence of times \(t_0,...,t_n\), the random \(X_{t_0}, X_{t_1}-X_{t_0}, \cdots, X_{t_n}-X_{t_{n-1}}\) variables are independent.

4.) Stochastic continuity; \(Lim_{h \rightarrow 0} P(|X_{t+h}-X_t| \geq \varepsilon)\) i.e. discontinuity occurs at random times.

5.) The paths of \(X_t\) are \(P\) almost surely right continuous with left limits (Cadlag paths).

Jump Diffusion Model

Here, the Levy model employed is a variant of Merton’s model which has a diffusion component and a jump component. Let \(X_t\) be the asset price at time \(t\). The risk-neutral jump-diffusion process for the spot price follows \[\begin{equation} \displaystyle{\frac{dX_t}{X_t} = (r-\lambda \bar{k})dt + \sigma dW_t + k dq_t} \end{equation}\] where \(\sigma\) denotes the volatility of the diffusion component. The jump event is governed by a compound Poisson process \(q_t\) with intensity \(\lambda\), where \(k\) denotes the magnitude of the random jump. The distribution of \(k\) obeys; \[\begin{equation} \displaystyle{ln(1+k) \sim N(\gamma, \delta^2)} \end{equation}\] with mean; \[\begin{equation} \displaystyle{\bar{k} \equiv E(k) = e^{\gamma + \frac{\delta^2}{2}} - 1} \end{equation}\] The model with \(\lambda = 0\) reduces to the Black-Scholes model. The solution to the sde above is given by; \[\begin{equation} \displaystyle{X_t = X_0 e^{(r-\lambda \bar{k} - \frac{\sigma^2}{2})t + \sigma W_t} U(n(t))} \end{equation}\] where \[\begin{equation} \displaystyle{U(n(t)) = \prod_{i=0} ^{n(t)} (1 + k_i)} \end{equation}\] where; \(k_i\) is the magnitude of the \(i\)th jump with \(ln(1+k_i) \sim N(\gamma, \delta^2)\); \(k_0 = 0\) and \(n(t)\) is a Poisson process with intensity \(\lambda\). Recall that \(n(t)\) denotes the number of jumps that occur up to time \(t\). As \(k > −1\), stock prices will stay positive. Nonetheless, the geometric Brownian motion, the lognormal jumps, and the Poisson process are assumed to be independent.

### Load the YUIMA package 
library(yuima)

Delta <- 1/252 
x <- my_data
Date <- oil_data$Day
# plotting original data
plot(x, ylab="Price", xlab="Period",main=paste("Crude Oil Daily Spot Price"))

sum(is.na(x))  #for missing values in x

## [1] 0

sum(is.na(Date))   #check for missing values in date column

## [1] 0

Yum<-function(k,prices,Date)
{
 Delta <- 1/252
 x <- list(my_data[])
 oil_data$Day<-Date
 # plotting original data
 oil_data$time <- as.Date(oil_data$Day, "%d/%m/%Y")
 tmp <- setYuima(data=setData(zoo(x[[k]],order.by=oil_data$time), delta=Delta))
 # pre-estimates of gBm parameters
 a.hat <- var(diff(log(x[[k]])))/Delta
 b.hat <- mean(diff(log(x[[k]])))/Delta + 0.5 * a.hat
 model <- setModel(drift="mu*x", diffusion="sigma*x", jump.coeff="1", measure=list(intensity="lambda", df=list("dnorm(z, beta, dels)")), measure.type="CP", solve.variable="x")
 yuima <- setYuima(model=model, data=tmp@data)
 lower <- list(mu=0.01, sigma=0.01,lambda=0.001,beta=0.1,dels=0.1)
 upper <- list(mu=100, sigma=100,lambda=25,beta=100,dels=20)
 start <- list(mu=b.hat, sigma=a.hat, lambda=5, beta=10, dels=2)
 out <- qmle(yuima,start=start,threshold=sqrt(1/Delta),
 upper=upper,lower=lower,method="L-BFGS-B")
 plot(yuima,ylab="Price", xlab="Year", main=paste("Crude Oil Price"))
#plots data
 plot(out,col="red") 
 legend("topleft",col=c(1,2),lty=c(1,1),
 legend=c("True Value","Forecast"), cex = 0.7)
 return(summary(out))
}

Results/Discussion

Let \(Z\) be a compound Poisson process (i.e. jump \(t\) sizes follow some distribution, like the Gaussian law, and jumps occur at Poisson times). Then, Yoshida et al (2014) in the Yuima -package considered the following SDE which involves jumps; \[\begin{equation} dX_t = a(t, X_t, \theta)dt + b(t, X_t, \theta)dW_t + dZ_t \end{equation}\]

Note that a compound Poisson process with intensity \(\lambda\) with Gaussian jumps can be specified in the setModel using the argument measure \(type = ''CP"\) (for compound Poisson). Quasi Maximum Likelihood Method of Estimation procedure for diffusion model is also employed by Yoshida et al (2014) in the Yuima package for the jump type model.

Estimation of parameter in the YUIMA package: QMLE

QMLE method is applied in the estimation with the Yuima library with a multidimensional diffusion process given by; \[\begin{equation} dX_t = a(t, X_t, \theta_2)dt + b(t, X_t, \theta_1)dW_t ; X_0 = x_0 \end{equation}\] where \(W\) is an r-dimensional \(t\) standard Wiener process independent of the initial variable \(X\). From the code above, lists of upper and lower bounds are specified to identify the search region for the optimizer, in our case, the L-BFGSB given we’ve specified the bounds; otherwise, the BFGS optimizer is applied.

Yum(k=1,prices=my_data,Date=oil_data$Day)

## Quasi-Maximum likelihood estimation
## 
## Call:
## qmle(yuima = yuima, start = start, method = "L-BFGS-B", lower = lower, 
##     upper = upper, threshold = sqrt(1/Delta))
## 
## Coefficients:
##         Estimate Std. Error
## sigma   1.385404 0.01084705
## mu      1.460662 0.24282331
## lambda  5.440363 0.40437895
## beta    0.100000 1.54084269
## dels   20.000000 0.64759003
## 
## -2 log L: 43482.11 
## 
## 
## Number of estimated jumps: 181
## 
## Average inter-arrival times: 0.178197
## 
## Average jump size: -6.158287
## 
## Standard Dev. of jump size: 29.218560
## 
## Jump Threshold: 15.874508
## 
## Summary statistics for jump times:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.044   8.306   8.849   9.882  10.901  33.119 
## 
## Summary statistics for jump size:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -51.700 -27.750 -18.800  -6.158  18.250  56.740

From the above output, we have a \(\mu = 1.460662\), \(\sigma = 1.385404\), \(\lambda = 5.440363\), \(\beta = 0.1\) and dels of 20. We hae 181 estimated jumps with an average jump size of -6.158287 and 15.874508 jump threshold. Given the struccture of above SDE with a diffusion part and a jump component, there are periods of normal price distribution (Gaussian) and random intermittent unusual price movements (jumps). For our dataset, the variance of the diffusion and jump component were positive and significant at \(1.385404\) and \(5.440363\) respectively, implying that jumps were quite dominant in the oil price dynamics. Moreover, the intensity of the compound Poisson, i.e. the probability of the jump is high, because. The drift term was significantly at \(1.460662\) indicating a presence o an upward trend in the oil price.

Conclusion

One does not need to fix the value of any of the parameters to obtain the value of others with the Yuima library. From literature, with this method of parameter estimation, more accurate values have been obtained. Nonetheless, as commented by Ball & Torous (1983) that difficulties arise with the empirical implementation and verification of the financial process, this has been curbed by employing the Yuima.

References

Black, F., and M. Scholes, 1973, “The Pricing of Options and Corporate Liabilities,” Journal of Political Economy, Vol. 81, pp. 637–54.

Bakshi, G., Cao, C., Chen, Z., 1997, “Empirical Performance of Alternative Pricing Models,” Journal of Finance, 52, 2003–2049.

Ball, C., A. and W. N. Torous, 1983, “A Simplified Jump Process for Common Stock Returns,” Journal of Financial and Quantitative Analysis, 18, pp. 53–65.

Ball, C., A. and W. N. Torous, 1985, “On Jumps in Common Stock Prices and their Impact on Call Option Pricing,” Journal of Finance, Vol. 40, No. 1, pp. 155–173.

Bates, D., S., 1966, “Jumps and Stochastic Volatility: Exchange Rate Processes Implicit in Deutsche Mark Options,” Review of Financial Studies, Vol. 9, No. 1, pp. 69–107.

Beckers, S., 1981, “A Note on Estimating the Parameters of the Diffusion-Jump Model of Stock Returns,” Journal of Financial and Quantitative Analysis, Vol. 16, No. 1, pp. 127–140.

Merton R. C., 1976, “Option Pricing when underlying stock returns are discontinuous,” Journal of Financial Economics, Vol 3, pp125 - 144.

Yoshida, N., 1992, “Estimation for diffusion processes from discrete observatio,” Journal of Multivariate Analysis 41, 2, 220-242.