The Generalised Gamma (GG) distribution (Stacy 1962) is a three-parameter distribution with support on \({\mathbb R}_+\). The corresponding hazard function can accommodate bathtub, unimodal and monotone (increasing and decreasing) hazard shapes. The GG distribution has become popular in survival analysis due to its flexibility. Other flexible distributions that can account for these hazard shapes are discussed in Rubio et al. (2019) and Jones and Noufaily (2015). See also: The Power Generalised Weibull Distribution, The Exponentiated Weibull distribution, and Simulating survival times from a General Hazard structure with a flexible baseline hazard.
The pdf of the GG distribution is \[f(t;\theta,\kappa,\delta) = \dfrac{\delta}{\Gamma\left(\frac{\kappa}{\delta}\right)\theta^\kappa} t^{{\kappa-1}}e^{{-\left(\frac{t}{\theta}\right)^{\delta}}},\] where \(\theta>0\) is a scale parameter, and \(\kappa,\delta >0\) are shape parameters.
The CDF of the GG distribution is \[F(t;\theta,\kappa,\delta) = {\frac {\gamma \left( \frac{\kappa}{\delta},\left(\frac{t}{\theta}\right)^{\delta}\right)}{\Gamma\left(\frac{\kappa}{\delta}\right)}},\] where \(\gamma (\cdot,\cdot)\) denotes the lower incomplete gamma function. The survival function can be obtained using the relationship \(S(t;\theta,\kappa,\delta)=1-F(t;\theta,\kappa,\delta)\). An interesting relationship between the Gamma CDF \(G(t;\theta,\kappa)\) (with scale \(\theta\) and shape \(\kappa\)) and the GG CDF is \[F(t;\theta,\kappa,\delta) = G\left(t^\delta; \theta^\delta, \frac{\kappa}{\delta}\right).\] This allows the GG CDF to be implemented using the R function pgamma.
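The relationship with the Gamma CDF can be checked numerically. The following is a minimal sketch, not part of the original code: the parameter values, the evaluation points and the helper name gg_pdf are illustrative assumptions. It integrates the pdf given above with integrate and compares the result with the pgamma expression.

```r
# Hypothetical parameter values, chosen only for illustration
theta <- 0.5; kappa <- 1.5; delta <- 0.75

# GG pdf written directly from the formula above (gg_pdf is an illustrative name)
gg_pdf <- function(t) {
  delta / (gamma(kappa / delta) * theta^kappa) *
    t^(kappa - 1) * exp(-(t / theta)^delta)
}

# Arbitrary evaluation points
t0 <- c(0.1, 0.5, 1, 2.5)

# CDF obtained by numerical integration of the pdf over (0, t)
cdf_integrated <- sapply(t0, function(u) integrate(gg_pdf, lower = 0, upper = u)$value)

# CDF obtained through the Gamma identity
cdf_pgamma <- pgamma(t0^delta, shape = kappa / delta, scale = theta^delta)

# The two columns should agree up to numerical integration error
print(cbind(t = t0, integrated = cdf_integrated, pgamma = cdf_pgamma))
```

The pgamma route avoids numerical integration altogether, which is why it is the natural choice for implementing the CDF and the functions derived from it.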
The hazard function of the GG distribution is \[h(t;\theta,\kappa,\delta) = \dfrac{f(t;\theta,\kappa,\delta)}{1-F(t;\theta,\kappa,\delta)}.\] The cumulative hazard function is \(H(t;\theta,\kappa,\delta) = -\log S(t;\theta,\kappa,\delta)\), as usual, where \(S = 1-F\) is the survival function. The connection of the GG CDF with the Gamma distribution allows these functions to be written in terms of the R function pgamma, as shown in the following code.
# theta : scale parameter
# kappa : shape parameter
# delta : shape parameter
# t : positive argument
# p : probability (0,1)
# n : number of simulations
# GG Probability Density Function
dggamma <- function(t, theta, kappa, delta, log = FALSE){
  val <- log(delta) - kappa*log(theta) - lgamma(kappa/delta) + (kappa - 1)*log(t) -
    (t/theta)^delta
  if(log) return(val) else return(exp(val))
}
# GG CDF
pggamma <- function(t, theta, kappa, delta, log.p = FALSE){
  val <- pgamma(t^delta, shape = kappa/delta, scale = theta^delta, log.p = TRUE)
  if(log.p) return(val) else return(exp(val))
}
# GG Survival Function
sggamma <- function(t, theta, kappa, delta, log.p = FALSE){
  val <- pgamma(t^delta, shape = kappa/delta, scale = theta^delta, log.p = TRUE, lower.tail = FALSE)
  if(log.p) return(val) else return(exp(val))
}
# GG Hazard Function
hggamma <- function(t, theta, kappa, delta, log = FALSE){
  val <- dggamma(t, theta, kappa, delta, log = TRUE) - sggamma(t, theta, kappa, delta, log.p = TRUE)
  if(log) return(val) else return(exp(val))
}
# GG Cumulative Hazard Function
chggamma <- function(t, theta, kappa, delta){
  val <- -pgamma(t^delta, shape = kappa/delta, scale = theta^delta, log.p = TRUE, lower.tail = FALSE)
  return(val)
}
# GG Quantile Function
qggamma <- function(p, theta, kappa, delta){
  out <- qgamma(p, shape = kappa/delta, scale = theta^delta)^(1/delta)
  return(out)
}
# GG Random Number Generation Function (inverse transform sampling)
rggamma <- function(n, theta, kappa, delta){
  p <- runif(n)
  out <- qgamma(p, shape = kappa/delta, scale = theta^delta)^(1/delta)
  return(as.vector(out))
}
# Simulated data
set.seed(123)
# True values of the parameters (theta, kappa, delta)
par0 <- c(0.5, 1.5, 0.75)
data0 <- rggamma(n = 10000, theta = par0[1], kappa = par0[2], delta = par0[3])
# True density vs histogram of the simulated data
true.f <- Vectorize(function(t) dggamma(t, theta = par0[1], kappa = par0[2], delta = par0[3]))
hist(data0, probability = TRUE, breaks = 100)
curve(true.f, 0, max(data0), add = TRUE, n = 1000, lwd = 2, col = "blue")
box()
# True CDF vs empirical CDF of the simulated data
true.f <- Vectorize(function(t) pggamma(t, theta = par0[1], kappa = par0[2], delta = par0[3]))
plot(ecdf(data0))
curve(true.f, 0, max(data0), add = TRUE, n = 1000, col = "red")
# True hazard function
true.f <- Vectorize(function(t) hggamma(t, theta = par0[1], kappa = par0[2], delta = par0[3]))
curve(true.f, 0, max(data0), xlab = "t", ylab = "hazard", n = 1000, lwd = 2, col = "darkgreen")
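The four hazard shapes mentioned in the introduction can be illustrated with a short sketch. The parameter sets below are hypothetical choices, not taken from the original code, and the helper name gg_hazard is an assumption introduced so that the block is self-contained. The second set coincides with par0 above. Plotting the hazards on a common grid suggests that these choices produce a bathtub, a unimodal, an increasing and a decreasing hazard, respectively.

```r
# Self-contained GG hazard built on pgamma (gg_hazard is an illustrative name)
gg_hazard <- function(t, theta, kappa, delta) {
  log_pdf <- log(delta) - kappa * log(theta) - lgamma(kappa / delta) +
    (kappa - 1) * log(t) - (t / theta)^delta
  log_surv <- pgamma(t^delta, shape = kappa / delta, scale = theta^delta,
                     lower.tail = FALSE, log.p = TRUE)
  exp(log_pdf - log_surv)
}

# Hypothetical parameter sets (theta, kappa, delta), one per hazard shape
shapes <- list(
  bathtub    = c(1,   0.5, 2),
  unimodal   = c(0.5, 1.5, 0.75),
  increasing = c(1,   2,   2),
  decreasing = c(1,   0.5, 0.75)
)
cols <- c("red", "darkgreen", "blue", "orange")

# Common grid of time points; each column of haz holds one hazard curve
tt <- seq(0.01, 3, length.out = 500)
haz <- sapply(shapes, function(p) gg_hazard(tt, p[1], p[2], p[3]))

matplot(tt, haz, type = "l", lty = 1, lwd = 2, col = cols,
        ylim = c(0, 7), xlab = "t", ylab = "hazard")
legend("top", legend = names(shapes), col = cols, lty = 1, lwd = 2)
```

In these examples the shape appears to be driven by whether \(\kappa\) and \(\delta\) lie below or above 1, while \(\theta\) only rescales time.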
Jones, M. C., and A. Noufaily. 2015. “Log-Location-Scale-Log-Concave Distributions for Survival and Reliability Analysis.” Electronic Journal of Statistics 9 (2): 2732–50.
Rubio, F. J., L. Remontet, N. P. Jewell, and A. Belot. 2019. “On a General Structure for Hazard-Based Regression Models: An Application to Population-Based Cancer Research.” Statistical Methods in Medical Research 28: 2404–17.
Stacy, E. W. 1962. “A Generalization of the Gamma Distribution.” The Annals of Mathematical Statistics 33 (3): 1187–92.