, Hao Hu2
(i). Probability density function:
\(f(t)=\frac{\Gamma(\frac{\nu+1}{2})}{\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})}\left(1+\frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}\),
where \(\nu\) is the number of degrees of freedom and \(\Gamma\) is the gamma function.
(ii). For \(\nu>1\) even, \(\frac{\Gamma(\frac{\nu+1}{2})}{\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})}=\frac{(\nu-1)(\nu-3)\cdots5\times3}{2\sqrt{\nu}\,(\nu-2)(\nu-4)\cdots4\times2}\).
For \(\nu>1\) odd, \(\frac{\Gamma(\frac{\nu+1}{2})}{\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})}=\frac{(\nu-1)(\nu-3)\cdots4\times2}{\pi\sqrt{\nu}\,(\nu-2)(\nu-4)\cdots5\times3}\).
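The two product formulas can be checked numerically against the gamma-function form. The following is a minimal sketch using only the Python standard library; the function names `t_norm_const`, `t_norm_const_closed_form`, and `t_pdf` are made up for this illustration and do not come from any package.

```python
import math

def t_norm_const(nu):
    """Gamma((nu+1)/2) / (sqrt(nu*pi) * Gamma(nu/2)), computed on the log scale."""
    # lgamma avoids the overflow that gamma() would hit for large nu
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c)

def t_norm_const_closed_form(nu):
    """Product form of the same constant, for integer nu > 1."""
    # numerator: (nu-1)(nu-3)... stopping at 3 (nu even) or 2 (nu odd)
    num = math.prod(range(nu - 1, 1, -2))
    # denominator: (nu-2)(nu-4)... stopping at 2 (nu even) or 3 (nu odd)
    den = math.prod(range(nu - 2, 1, -2))
    lead = 2.0 if nu % 2 == 0 else math.pi
    return num / (lead * math.sqrt(nu) * den)

def t_pdf(t, nu):
    """Student-t density with nu degrees of freedom."""
    return t_norm_const(nu) * (1.0 + t * t / nu) ** (-(nu + 1) / 2)

# Print both versions of the constant side by side for a few values of nu
for nu in (2, 3, 4, 5, 10, 11):
    print(nu, t_norm_const(nu), t_norm_const_closed_form(nu))
```

An empty product counts as 1, which is why \(\nu=2\) and \(\nu=3\) need no special case.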
The half-Cauchy distribution (HC) is obtained from the standard Cauchy distribution by folding the density about the origin, so that only positive values can be observed. Its pdf is:
\(f(x)=\frac{2}{\pi}\frac{1}{1+x^2}\), \(x>0\)
Notes: The probability density function of the Cauchy distribution with location \(x_0\) and scale \(\gamma\) is:
\(f(x;x_0,\gamma)=\frac{1}{\pi\gamma}\left[\frac{\gamma^2}{(x-x_0)^2+\gamma^2}\right]\),
where \(x_0\) is the location parameter (the location of the peak of the distribution) and \(\gamma\) is the scale parameter (the half-width at half-maximum, HWHM). The standard Cauchy distribution is the special case \(x_0=0\), \(\gamma=1\).
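As an illustration of the folding construction, the sketch below evaluates both densities and draws half-Cauchy samples by taking the absolute value of a standard Cauchy draw. It relies only on the standard library; the function names are hypothetical, and drawing the Cauchy variate with the inverse-CDF formula \(\tan(\pi(u-\frac{1}{2}))\) is one possible choice among several.

```python
import math
import random

def cauchy_pdf(x, x0=0.0, gamma=1.0):
    """Cauchy density with location x0 and scale gamma."""
    return gamma / (math.pi * ((x - x0) ** 2 + gamma ** 2))

def half_cauchy_pdf(x):
    """Standard half-Cauchy density: twice the standard Cauchy density on x > 0."""
    return 2.0 * cauchy_pdf(x) if x > 0 else 0.0

def sample_half_cauchy(rng):
    """Fold a standard Cauchy draw (inverse-CDF method) onto the positive reals."""
    u = rng.random()
    return abs(math.tan(math.pi * (u - 0.5)))

rng = random.Random(0)  # fixed seed so the run is reproducible
draws = sorted(sample_half_cauchy(rng) for _ in range(20000))
# The half-Cauchy CDF is (2/pi) * atan(x), so the theoretical median is tan(pi/4) = 1
print("sample median:", draws[len(draws) // 2])
```

The sample mean is deliberately not reported: the half-Cauchy distribution has no finite mean, so the median is the more meaningful summary here.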
A random variable has a \(Laplace(\mu,b)\) distribution if its probability density function is:
\(f(x|\mu,b)=\frac{1}{2b}\begin{cases}\exp\left(-\frac{\mu-x}{b}\right), & x<\mu\\ \exp\left(-\frac{x-\mu}{b}\right), & x\ge\mu\end{cases}\)
where \(\mu\) is a location parameter and \(b\) is a scale parameter (sometimes referred to as the diversity).
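The two branches of the Laplace density collapse into the single expression \(\frac{1}{2b}\exp(-\frac{|x-\mu|}{b})\). The sketch below implements both forms so that they can be compared; the function names are illustrative only.

```python
import math

def laplace_pdf_piecewise(x, mu=0.0, b=1.0):
    """Laplace density written with the two branches from the definition."""
    if x < mu:
        return math.exp(-(mu - x) / b) / (2.0 * b)
    return math.exp(-(x - mu) / b) / (2.0 * b)

def laplace_pdf(x, mu=0.0, b=1.0):
    """Equivalent single-expression form using the absolute value."""
    return math.exp(-abs(x - mu) / b) / (2.0 * b)

# Both forms agree on either side of the location parameter mu
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(x, laplace_pdf_piecewise(x, mu=0.0, b=1.5), laplace_pdf(x, mu=0.0, b=1.5))
```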
(i). Consider the setting \((y|\beta)\sim N(\beta,\sigma^2I)\), where \(\beta\) is believed to be sparse.
(ii). Assume that each \(\beta_i\) is conditionally independent with density \(\pi_{HS}(\beta_i|\tau)\), where \(\pi_{HS}\) can be represented as a scale mixture of normals:
\((\beta_i|\lambda_i,\tau)\sim N(0,\lambda_i^2\tau^2)\)
\(\lambda_i\sim C^+(0,1)\), where \(C^+(0,1)\) is a standard half-Cauchy distribution on the positive reals.
NOTES: The \(\lambda_i\) are the local shrinkage parameters and \(\tau\) is the global shrinkage parameter.
(iii). The density function \(\pi_{HS}(\beta_i|\tau)\) has no closed-form expression in terms of elementary functions.
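Because the marginal density is not available in elementary form, it can be approximated numerically from the scale-mixture representation. The sketch below is one possible approach, not necessarily the one used to produce the figures. It substitutes \(\lambda=\tan\theta\), which turns the \(C^+(0,1)\) weight into a uniform weight on \((0,\pi/2)\), and then applies a midpoint rule. The function names and the grid size `n` are assumptions made for this example.

```python
import math
import random

def normal_pdf(x, sd):
    """N(0, sd^2) density evaluated at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def horseshoe_pdf(beta, tau=1.0, n=20000):
    """Approximate pi_HS(beta | tau) for beta != 0 by midpoint quadrature.

    With lambda = tan(theta), the half-Cauchy weight
    2 / (pi * (1 + lambda^2)) d(lambda) becomes (2 / pi) d(theta)
    on the interval (0, pi/2).
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        lam = math.tan((k + 0.5) * h)        # midpoint of the k-th cell
        total += normal_pdf(beta, lam * tau)
    # At beta = 0 the integrand behaves like 1/lambda near lambda = 0,
    # so the sum keeps growing with n instead of converging.
    return (2.0 / math.pi) * total * h

def sample_horseshoe(tau, rng):
    """Draw beta_i by sampling lambda_i ~ C+(0,1), then beta_i ~ N(0, lambda_i^2 tau^2)."""
    lam = abs(math.tan(math.pi * (rng.random() - 0.5)))
    return rng.gauss(0.0, lam * tau)

# Density at a few points for two values of the global parameter tau
for tau in (0.5, 2.0):
    print(tau, [round(horseshoe_pdf(b, tau), 4) for b in (0.1, 1.0, 5.0)])

rng = random.Random(0)  # fixed seed so the run is reproducible
print([round(sample_horseshoe(1.0, rng), 3) for _ in range(5)])
```

The growth of the quadrature sum at \(\beta_i=0\) reflects the fact that the integral itself diverges there, so any finite peak drawn at the origin depends on the grid used to evaluate the curve.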
Figure 1: HS with different global parameters
Figure 2: HS with different local parameters
Figure 3: Comparison between the HS, Student-t, and Laplace distributions
From Figure 1, as \(\tau\) increases, the “fake” peak of the density curve decreases and the tails of the density curve become heavier.
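A short derivation, offered as a sketch based on the scale-mixture representation, may help explain this pattern. Writing \(\beta_i=\tau\lambda_i z_i\) with \(z_i\sim N(0,1)\) and \(\lambda_i\sim C^+(0,1)\) (the symbol \(z_i\) is introduced here only for illustration), \(\tau\) acts as a pure scale parameter:
\(\pi_{HS}(\beta_i|\tau)=\frac{1}{\tau}\,\pi_{HS}\left(\frac{\beta_i}{\tau}\,\Big|\,1\right)\).
Increasing \(\tau\) therefore stretches the curve horizontally by a factor \(\tau\) and compresses it vertically by a factor \(1/\tau\), which moves probability mass from the neighborhood of zero into the tails.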
From Figure 2, as \(\lambda\) increases, the “fake” peak of the density curve decreases and the curve becomes flatter. However, the tail of the density curve does not change as \(\lambda\) changes.
From Figure 3, the density curve of the horseshoe prior has the highest peak (even though it is “fake”), the Student-t distribution has the lowest peak, and the Laplace distribution lies in between. However, the horseshoe prior has the lightest tail, the Student-t distribution has the heaviest tail, and the Laplace distribution again lies in between.