Fitting Smooth Skewed Probability Density Based on Quantiles

Installing the Packages

Install the required packages that are important to run the code. The following code reads the list of packages from the ``packages” list provided. If the package is not installed, then it installs, and if already installed then it uses it directly.

if (!require(pacman)) install.packages("pacman")

## Loading required package: pacman

pacman::p_load(survival,skewt,survival, ggfortify, survminer,plotly, gridExtra, Epi, KMsurv, gnm, cmprsk, mstate, flexsurv, splines, epitools, eha, shiny, ctqr, scales,tigerstats, data.table,astsa,sn, ggfortify,cvar,PerformanceAnalytics, survminer, plotly, gridExtra, Epi, KMsurv, gnm, cmprsk, mstate, flexsurv, splines, epitools, shiny, ctqr, scales,tigerstats, data.table,astsa, foreign, haven, tidyverse, ROI.plugin.quadprog,ROI.plugin.glpk,rriskDistributions,fGarch,ctqr,ITNr,RColorBrewer,scales,graphsim, igraph,readxl,plyr,quantreg,bayesQR, spatstat,ggplot2,ggpubr, gtools, car, devtools,statar,reshape2,MASS,dynpanel,plm,dummies,ggridges,ggthemes,PortfolioAnalytics, ROI)

In this basic code I show how to estimate and derive a skewed normal and t distribution when we know only the quantiles are known.

Deriving Skewed Normal Distribution From Quantiles

Normally, we derive quantiles from a distribution. In this case, we will do the reverse engineering. Given the quantiles, we will derive an approximate distribution based on these quantiles. Suppose we have \(n\) quantiles with given quantile probability space \(P:=\{p_1,p_2,..., p_n\}\) and corresponding quantile values \(X_P=x_{p_1}, x_{p_2}, ..., x_{p_n}\). We need to fit a skewed normal distribution given by \[f(x ; \mu, \delta,\alpha)={\frac {2}{\delta }}\phi \left({\frac {x-\mu }{\delta }}\right)\Phi \left(\alpha \left({\frac {x-\mu }{\delta }}\right)\right)\] with parameters \(\mu, \delta,\alpha\). Here, \(\mu\) is location parameter, \(\delta > 0\) is the scale parameter, and \(\alpha\) is the shape or skewness parameter. The location parameter is value of random variable \(x\) where the density is peaked. But we do not know the value where it takes the peak value. To determine this, first the kernel density of \(X_P\) is estimated. The value \(x_m\) is selected where the kernel density of quantiles has a peak. The scale parameter \(\delta\) and shape parameter \(\alpha\) is estimated by minimizing the distance between given quantiles that we know and quantiles derived from a random skewed-normal distribution based on randomly selected \(\delta, \alpha\). Formally, if \(Q^f_{p}(\delta,\alpha)\) is \(p^{th}\) quantile from the skewed normal distribution with randomly choosen \(\delta, \alpha\), then optimal parameters are \(\hat{\delta}, \hat{\alpha}\) that minimizes the distance

\[\hat{\delta},\hat{\alpha} = argmin\ _{\delta,\alpha} \ \ \ \sum_{p \in P}\Big[Q^f_{p}(\delta,\alpha)-x_{p}\Big]^2\] Now we have \(\mu, \hat{\delta}, \hat{\alpha}\), using this the distribution is estimated.

#### Skewed Normal Distribution ####

skewed_normal=function(x,p){
p=p
q=x
dd=density(x)
mod=dd$x[which.max(dd$y)]
N=10000
omgg=rep(1, N)
kz=rep(1, N)
nz=rep(1, N)
for (i in 1:N){
  k=runif(1, 1, 100)
  n=runif(1,-10,10)
  kz[i]=k
  nz[i]=n
  ff=sort(as.numeric(as.vector(q)))
  g=rsn(n=10000, xi=mod, omega=k, alpha=n,  dp=NULL)
  gg=as.vector(quantile(g,probs=p, na.rm = T))
  ss=sum((ff-gg)^2)
  omgg[i]=ss
}
m=which.min(omgg)
rn=rsn(n=50000, xi=mod, omega=kz[m], alpha=nz[m],  dp=NULL)
plot(density(rn))
}

#AN EXAMPLE
p=c(0.05, 0.25, 0.5, 0.75, 0.95) #Vector probabilities
x=c(-2, -1, 0.1,2.11,38)  # Vector of quantiles
skewed_normal(x, p)

Fitting Skewed t Distribution

The procedure is same as of for the skewed normal distribution. The only difference here is to add degrees of freedom \(\nu\). As default, in the algorithm the degrees of freedom is assumed to be large, hence it set \(\nu=\infty\) in the algorithm.

#### Skewed t Distribution ####

skewed_t=function(x,p){
  p=p
  q=x
  dd=density(x)
  mod=dd$x[which.max(dd$y)]
  N=10000
  omgg=rep(1, N)
  kz=rep(1, N)
  nz=rep(1, N)
  for (i in 1:N){
    k=runif(1, 1, 100)
    n=runif(1,-10,10)
    kz[i]=k
    nz[i]=n
    ff=sort(as.numeric(as.vector(q)))
    g=rst(n=10000, xi=mod, omega=k, alpha=n, nu=Inf,  dp=NULL)
    gg=as.vector(quantile(g,probs=p, na.rm = T))
    ss=sum((ff-gg)^2)
    omgg[i]=ss
  }
  m=which.min(omgg)
  rn=rst(n=500000, xi=mod, omega=kz[m], alpha=nz[m], nu=Inf,  dp=NULL)
  plot(density(rn))
}


# AN EXAMPLE
p=c(0.05, 0.25, 0.5, 0.75, 0.95) #Vector probabilities
x=c(-75, -1, 0.1,0.11,0.8)  # Vector of quantiles
skewed_t(x, p)