In line with my dissertation, I have published the R code needed to fit flexible two-piece distributions to the daily log-returns to any financial asset. In this R markdown, we will investigate the daily logarithmic returns of GameStop’s stock over a 1 year period. As a brief overview, we will use both point and interval estimation, namely the method of maximum likelihood estimation and confidence intervals. We will begin by defining the likelihood function and log-likelihood function and optimise the latter to obtain the MLEs for each respective distribution. From this, we can generate a 95% (asymptotic) confidence interval for the parameters in question. Finally, we will implement the code I have written for value at risk (VaR) estimation.
We begin by loading the data set into question an begin defining several equations that we will later use.
Data
We also plot a histogram just to observe the shape of the log-returns density.
GME <- read.table("R_Diss\\GME.csv",header=T,sep=",")
attach(GME)
str(GME)
## 'data.frame': 254 obs. of 7 variables:
## $ Date : Factor w/ 254 levels "2020-01-28","2020-01-29",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ Open : num 4.3 4.22 4.1 3.91 3.88 4.03 4.15 4.2 4.11 3.85 ...
## $ High : num 4.3 4.32 4.12 3.94 3.98 4.25 4.41 4.3 4.13 4.1 ...
## $ Low : num 4.18 4.13 3.88 3.83 3.85 3.97 4.14 4.14 3.77 3.74 ...
## $ Close : num 4.21 4.13 3.93 3.84 3.95 4.07 4.18 4.14 3.81 3.94 ...
## $ Adj.Close: num 4.21 4.13 3.93 3.84 3.95 4.07 4.18 4.14 3.81 3.94 ...
## $ Volume : int 2144900 1552600 3006500 2891700 2026200 3563100 2641700 1510300 2742300 2777000 ...
n=length(Adj.Close)
ret = log(Adj.Close[-1]/(Adj.Close[-n]-1))
hist(ret,breaks = 50, probability=T)
data <- log(Adj.Close[-1]/(Adj.Close[-n]-1))
Normal Distribution and VaR
We begin with the normal disribution. We first define the log-likelihood function (“log.lik”), and use the (“optim”) function to optimise the log-likelihood function. With these obtained maximum likelihood estimators, we can plot the corresponding distribution over the log-returns density.
#Normal
log.lik <- function(par){
mu <- par[1]; sigma <- exp(par[2])
out <- -sum( dnorm(data, mu, sigma, log=TRUE) )
return(out)
}
OPT <- optim(par = c(0,0), fn = log.lik)
MLE <- c(OPT$par[1], exp(OPT$par[2]))
MLE
## [1] 0.2069991 0.1403370
We use these maximum likelihood estimators obtained and plot it over the log-return densities:
fitn <- Vectorize(function(x) dnorm(x, MLE[1], MLE[2]))
hist(data, probability = T, breaks = 70)
curve(fitn,-1,1,add=T, n = 1000)
Using these MLEs we also can obtain the 5% value at risk (VaR) by taking the lower quantile of the respective distribution:
#VaR for Normal
alpha<- 0.05
conf <- 1 - alpha
VaRn95 <- qnorm(alpha, mean = MLE[1],sd = MLE[2], lower.tail = TRUE)
VaRn95
## [1] -0.02383483
We will also define a function for the confidence intervals, and obtain the CIs for the parameters of the normal distribution.
library(numDeriv)
Conf.Int <- function(FUN,MLE,level=0.95){
sd.int <- abs(qnorm(0.5*(1-level)))
HESS <- hessian(FUN,x=MLE)
Fisher.Info <- solve(HESS, tol = 1e-18)
Sigma <- (sqrt(diag(Fisher.Info)))
U<- MLE + sd.int*Sigma
L<- MLE - sd.int*Sigma
C.I <- cbind(L,U,MLE, Sigma)
rownames(C.I)<- paste0("par", seq_along(1:length(MLE)))
colnames(C.I)<- c("Lower","Upper","Transf MLE", "Std. Error")
return(C.I)
}
Conf.Int(FUN=log.lik,OPT$par, level = 0.95)
## Lower Upper Transf MLE Std. Error
## par1 0.1897065 0.2242916 0.2069991 0.008822914
## par2 -2.0508352 -1.8765815 -1.9637084 0.044453272
Two piece normal and VaR
In a similar way, we will obtain the MLEs and CIs of the two-piece normal distribution. We use the “twopiece” package by [Rubio & Steel, 2020] (https://rpubs.com/FJRubio/TPD) for the two-piece normal distribution:
library(twopiece)
dtpn <- function(x, mu, sigma1, sigma2){
val <- dtp3(x, mu = mu, par1 = sigma1, par2 = sigma2, FUN = dnorm, param = "tp", log = FALSE)
return(val)}
ldtpn <- function(x, mu, sigma1, sigma2){
val <- dtp3(x, mu = mu, par1 = sigma1, par2 = sigma2, FUN = dnorm, param = "tp", log = TRUE)
return(val)}
log.liktpn <- function(par){
mu <- par[1]; sigma1 <- exp(par[2]); sigma2 <- exp(par[3])
out <- -sum(ldtpn(data, mu,sigma1,sigma2) )
return(out)
}
OPTTPN <- optim(par = c(0,0,0), fn = log.liktpn)
MLETPN <- c(OPTTPN$par[1], exp(OPTTPN$par[2]), exp(OPTTPN$par[3]))
MLETPN
## [1] 0.2108834 0.1429300 0.1377128
fittpn <- Vectorize(function(x) dtpn(x, MLETPN[1], MLETPN[2],MLETPN[3]))
hist(data, probability = T, breaks = 80)
curve(fittpn,-1,1,add=T, n = 1000, col = 'red')
Conf.Int(FUN=log.liktpn,OPTTPN$par, level = 0.95)
## Lower Upper Transf MLE Std. Error
## par1 0.1765326 0.2452343 0.2108834 0.01752626
## par2 -2.1093765 -1.7814242 -1.9454003 0.08366282
## par3 -2.1510537 -1.8141156 -1.9825847 0.08595518
#VaR for dtpn
Vartpn95 <- qtp3(alpha, mu = MLETPN[1], par1 = MLETPN[2], par2 = MLETPN[3],FUN = dnorm, param = "tp")
Vartpn95
## [1] 0.2678356
t-distribution and VaR In this section, we will obtain the MLEs and CIs for the t-distribution:
#t distribution
log.likt <- function(par){
mu <- par[1]; sigma <- exp(par[2]); nu <- exp(par[3])
out <- -sum( dt( (data - mu)/sigma, df = nu , log=TRUE) - log(sigma) )
return(out)
}
OPTt <- optim(par = c(0,0,0), fn = log.likt)
MLEt <- c(OPTt$par[1], exp(OPTt$par[2]), exp(OPTt$par[3]))
From this we obtain the MLE and Confidence Interval:
MLEt
## [1] 0.2066455 0.1127515 6.2042186
Conf.Int(FUN=log.likt,OPTt$par[], level = 0.95)
## Lower Upper Transf MLE Std. Error
## par1 0.190886 0.222405 0.2066455 0.008040705
## par2 -2.311717 -2.053422 -2.1825692 0.065892820
## par3 1.218013 2.432446 1.8252295 0.309809920
fitdt <- Vectorize(function(x) dt((x-MLEt[1])/MLEt[2], df = MLEt[3], log=FALSE)/MLEt[2])
hist(data, probability = T, breaks = 80)
curve(fitdt,-1,1,add=T, n = 1000, col = "red")
Here we obtain the 5% VaR:
Vart95 <- qt(alpha, MLEt[3])
Vart95
## [1] -1.931785
Two Piece t Distribution and VaR
library(twopiece)
dtpt <- function(x, mu, sigma1, sigma2, df){
val <- dtp4(x, mu = mu, par1 = sigma1, par2 = sigma2,delta = df, FUN = dt, param = "tp", log = FALSE)
return(val)}
ldtpt <- function(x, mu, sigma1, sigma2, df){
val <- dtp4(x, mu = mu, par1 = sigma1, par2 = sigma2,delta = df, FUN = dt, param = "tp", log = TRUE)
return(val)}
log.liktpt <- function(par){
mu <- par[1]; sigma1 <- exp(par[2]); sigma2 <- exp(par[3]); df <- exp(par[4])
out <- -sum(ldtpt(data,mu,sigma1,sigma2,df))
return(out)}
OPTTPt <- optim(par = c(0,0,0,0), fn = log.liktpt)
MLETPt <- c(OPTTPt$par[1], exp(OPTTPt$par[2]), exp(OPTTPt$par[3]), exp(OPTTPt$par[4]))
MLETPt
## [1] 0.2150003 0.1172765 0.1069059 6.0138730
Conf.Int(FUN=log.liktpt, OPTTPt$par[], level=0.95)
## Lower Upper Transf MLE Std. Error
## par1 0.1746819 0.2553186 0.2150003 0.02057096
## par2 -2.3586839 -1.9277580 -2.1432209 0.10993210
## par3 -2.5137064 -1.9579069 -2.2358067 0.14178820
## par4 1.1732823 2.4148557 1.7940690 0.31673372
fittpt <- Vectorize(function(x) dtpt(x, MLETPt[1], MLETPt[2],MLETPt[3], MLETPt[4]))
hist(data, probability = T, breaks = 80)
curve(fittpt,-1,1,add=T, n = 1000, col = 'blue')
#VaR tpt
Vartpt <- qtp4(alpha,MLETPt[1],MLETPt[2],MLETPt[3],MLETPt[4], FUN = dt, param = "tp")
Vartpt
## [1] 0.2598304
TPSAS Distribution and VaR Here we will fit the two-piece sinh-arcsinh distribution. We will use the “TPSAS” package from [Rubio & Ogundimu, 2017] (https://rpubs.com/FJRubio/TPSAS) :
#TPSAS
library(TPSAS)
Again we begin by defining the log-likelihood function and optimise it to obtain the MLEs:
log.liktpsas <- function(par){
if(par[2]>0 & abs(par[3])<1 & par[4]>0){
return( - sum(dtpsas(data,par[1],par[2],par[3],par[4],param="eps",log=TRUE)))
}
else return(Inf)
}
OPTtpsas = optim(c(MLETPN,1),log.liktpsas, control = list(maxit=50000),method = c("Nelder-Mead"))
MLE.sas<- OPTtpsas$par
MLE.sas
## [1] 0.22862652 0.07742275 0.10087809 0.70388876
We use these MLEs and plot the distribution over the daily log-return density of GME.
fit.tpsas <- Vectorize(function(x) dtpsas(x, MLE.sas[1], MLE.sas[2], MLE.sas[3], MLE.sas[4], param="eps") )
hist(data, breaks = 80, probability = T)
curve(fit.tpsas, -1,1, add= T, lwd = 2, col = "blue", n = 1000)
summary(data)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.5821 0.1148 0.2145 0.2070 0.2809 0.8605
#VaR tpsas
Vartpsas95 <- qtpsas(alpha,MLE.sas[1], MLE.sas[2], MLE.sas[3], MLE.sas[4], param="eps")
Vartpsas95
## [1] -0.03336937
AIC Values
Finally we define the AIC function in order to calculate the AIC value and determine which function has the best relative fit:
AIC.n <- 2*OPT$value + 2*length(OPT$par)
AIC.TPN <- 2*OPTTPN$value + 2*length(OPTTPN$par)
AIC.t <- 2*OPTt$value + 2*length(OPTt$par)
AIC.tpt <- 2*OPTTPt$value + 2*length(OPTTPt$par)
AIC.sas <- 2*OPTtpsas$value + 2*length(OPTtpsas$par)
AIC.n
## [1] -271.6291
AIC.TPN
## [1] -269.6953
AIC.t
## [1] -295.9732
AIC.tpt
## [1] -294.1578
AIC.sas
## [1] -285.844
For further reading and the mathematical theory behind this, please refer to my undergraduate dissertation found here: