Abstract

In medical research and clinical practice, Bayesian survival modeling is a powerful technique for assessing time-to-event data. It allows for the incorporation of prior knowledge about the model’s parameters and provides a more comprehensive understanding of the underlying hazard function. In this paper, we propose a Bayesian survival modeling strategy for Proportional Hazard regression models that employs the Sine-G family of distributions as baseline hazards. The Sine-G distribution family is a flexible family of distributions which can capture a wide range of hazards forms, including increasing, decreasing, and bathtub-shaped hazards. In order to capture the underlying hazard function, we examine the flexibility and effectiveness of several distributions within the Sine-G family, such as the Gompertz, Lomax, Weibull, and exponentiated exponential distributions. The proposed approach is implemented using the R programming language and the STAN probabilistic programming framework. To evaluate the proposed approach, we use a right-censored survival dataset of gastric cancer patients, which allows for precise determination of the hazard function while accounting for censoring. The Watanabe Akaike Information Criterion (WAIC) and the Leave-One-Out Information Criterion (LOOIC) are used to evaluate the performance of various baseline hazards.

Sine Gompertz PH Model

library(rstan)
library(bayesplot)
library(loo)
####################Bayesian Sine-Gompertz PH Model ############
stancode_sGO<-"
functions{
// log of survival function
vector log_Sval(vector t,  real lambda, real alpha, vector mu){
vector[num_elements(t)] log_Sval;
for (j in 1:num_elements(t)){
log_Sval[j]=-log((1- sin((pi()/2)*(1-exp(-alpha*(exp(lambda*t[j])-1)))))^mu[j]);
}
return log_Sval;
}
// log of hazard function
vector log_hd (vector t, real lambda, real alpha, vector mu){
vector[num_elements(t)] log_hd;
for (j in 1:num_elements(t)){
log_hd[j]<-log(((pi()/2)*((alpha*lambda*exp(alpha+(lambda*t[j])-alpha*exp(lambda*t[j])))*cos((pi()/2)*(1-exp(-alpha*(exp(lambda*t[j])-1)))))/
           (1- sin((pi()/2)*(1-exp(-alpha*(exp(lambda*t[j])-1))))))*mu[j]);
}
return log_hd;
}
// log-likelihood for right censoring
real survival_llogist_lpdf(vector t, vector d,
real lambda, real alpha, vector mu){
vector[num_elements(t)] log_likhood;
real prob;
log_likhood=d .*log_hd(t,lambda,alpha,mu)-log_Sval(t,lambda,alpha,mu);
prob=sum(log_likhood);
return prob;
}
}
// block for data
data{
int M; // covariate number
int N; // observations number
vector<lower=0> [N]y; // observation vector (times)
vector<lower=0,upper=1> [N] event; //censoring (1=obs.,0=cens.)
matrix[N,M] x;//model matrix (N for rows and M for columns)
}
// transformed data block, not used
// block for parameters
parameters{
vector [M] beta; //coeff. in the linear predictor
real<lower=0> lambda; //scale parameter
real<lower=0> alpha; //shape1 parameter
}
// block for transformed parameters
transformed parameters{
vector[N] linpred;
vector[N] mu;
linpred=x*beta;//linear predictor
for (j in 1:N){
mu[j]=exp(linpred[j]);
}
}
// block for model specification
model{
lambda~gamma(10,10);//prior for lambda
alpha~gamma(10,10);//prior for alpha
beta~normal(0,100); //prior for reg. coefficients
y~survival_llogist(event,lambda,alpha,mu); //likelihood
}
// block for generated quantities
generated quantities {
vector[N] y_rep; //posterior predictive value
vector[N] log_likhood; //log-likelihood
{for (n in 1:N)
log_likhood[n]=(log(((pi()/2)*((alpha*lambda*exp(alpha+(lambda*y[n])-alpha*exp(lambda*y[n])))*cos((pi()/2)*(1-exp(-alpha*(exp(lambda*y[n])-1)))))/
           (1- sin((pi()/2)*(1-exp(-alpha*(exp(lambda*y[n])-1))))))*(exp(x[n,]*beta)))*event[n])-
(-log((1- sin((pi()/2)*(1-exp(-alpha*(exp(lambda*y[n])-1)))))^(exp(x[n,]*beta))));
}
  {real u;
u=uniform_rng(0,1);
for (n in 1:N){
y_rep[n]=log((-log(1-(2/pi())*asin(1-exp((-log(1-u)/(exp(x[n,]*beta))))))/alpha)+1)/lambda;}
}
}
"
#Survival times for  patients
library(AmoudSurv)
data("gastric")
str(gastric)

## 'data.frame':    90 obs. of  3 variables:
##  $ time  : int  1 17 42 44 48 60 63 72 74 95 ...
##  $ status: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ trt   : int  0 1 1 1 1 1 0 1 1 1 ...

y<-as.vector(gastric$time)/365.24
#event status, 0 indicates censored and 1 indicates death
event<-gastric$status
#x1=0=F1 patients treated with CT, x1=1=F1, treated with CTR
X<-as.matrix(gastric$trt)
F1<-X
#dm=x data matrix
dm<-cbind(F1)
N<-nrow(dm)
M<-ncol(dm)
data3=list(event=event,y=y,x=dm,N=N,M=M)


#Model Fitting
#Starting values, linear regression of log(y) on X's
#model1<-lm(log(y)~F1)
#b1<-model1$coefficients
#b1
Model<-stan(model_code=stancode_sGO,data=data3,iter=3000,
            chains=4)

## 
## SAMPLING FOR MODEL '6b66ffafd6a94229ee4ba409c0535dcb' NOW (CHAIN 1).
## Chain 1: Rejecting initial value:
## Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 1:   Stan can't start sampling from this initial value.
## Chain 1: Rejecting initial value:
## Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 1:   Stan can't start sampling from this initial value.
## Chain 1: 
## Chain 1: Gradient evaluation took 0 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 3000 [  0%]  (Warmup)
## Chain 1: Iteration:  300 / 3000 [ 10%]  (Warmup)
## Chain 1: Iteration:  600 / 3000 [ 20%]  (Warmup)
## Chain 1: Iteration:  900 / 3000 [ 30%]  (Warmup)
## Chain 1: Iteration: 1200 / 3000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1500 / 3000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1501 / 3000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1800 / 3000 [ 60%]  (Sampling)
## Chain 1: Iteration: 2100 / 3000 [ 70%]  (Sampling)
## Chain 1: Iteration: 2400 / 3000 [ 80%]  (Sampling)
## Chain 1: Iteration: 2700 / 3000 [ 90%]  (Sampling)
## Chain 1: Iteration: 3000 / 3000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 5.179 seconds (Warm-up)
## Chain 1:                5.398 seconds (Sampling)
## Chain 1:                10.577 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL '6b66ffafd6a94229ee4ba409c0535dcb' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 0 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 3000 [  0%]  (Warmup)
## Chain 2: Iteration:  300 / 3000 [ 10%]  (Warmup)
## Chain 2: Iteration:  600 / 3000 [ 20%]  (Warmup)
## Chain 2: Iteration:  900 / 3000 [ 30%]  (Warmup)
## Chain 2: Iteration: 1200 / 3000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1500 / 3000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1501 / 3000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1800 / 3000 [ 60%]  (Sampling)
## Chain 2: Iteration: 2100 / 3000 [ 70%]  (Sampling)
## Chain 2: Iteration: 2400 / 3000 [ 80%]  (Sampling)
## Chain 2: Iteration: 2700 / 3000 [ 90%]  (Sampling)
## Chain 2: Iteration: 3000 / 3000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 5.321 seconds (Warm-up)
## Chain 2:                5.002 seconds (Sampling)
## Chain 2:                10.323 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL '6b66ffafd6a94229ee4ba409c0535dcb' NOW (CHAIN 3).
## Chain 3: Rejecting initial value:
## Chain 3:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 3:   Stan can't start sampling from this initial value.
## Chain 3: 
## Chain 3: Gradient evaluation took 0 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 3000 [  0%]  (Warmup)
## Chain 3: Iteration:  300 / 3000 [ 10%]  (Warmup)
## Chain 3: Iteration:  600 / 3000 [ 20%]  (Warmup)
## Chain 3: Iteration:  900 / 3000 [ 30%]  (Warmup)
## Chain 3: Iteration: 1200 / 3000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1500 / 3000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1501 / 3000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1800 / 3000 [ 60%]  (Sampling)
## Chain 3: Iteration: 2100 / 3000 [ 70%]  (Sampling)
## Chain 3: Iteration: 2400 / 3000 [ 80%]  (Sampling)
## Chain 3: Iteration: 2700 / 3000 [ 90%]  (Sampling)
## Chain 3: Iteration: 3000 / 3000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 5.186 seconds (Warm-up)
## Chain 3:                5.554 seconds (Sampling)
## Chain 3:                10.74 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL '6b66ffafd6a94229ee4ba409c0535dcb' NOW (CHAIN 4).
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: 
## Chain 4: Gradient evaluation took 0 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 3000 [  0%]  (Warmup)
## Chain 4: Iteration:  300 / 3000 [ 10%]  (Warmup)
## Chain 4: Iteration:  600 / 3000 [ 20%]  (Warmup)
## Chain 4: Iteration:  900 / 3000 [ 30%]  (Warmup)
## Chain 4: Iteration: 1200 / 3000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1500 / 3000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1501 / 3000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1800 / 3000 [ 60%]  (Sampling)
## Chain 4: Iteration: 2100 / 3000 [ 70%]  (Sampling)
## Chain 4: Iteration: 2400 / 3000 [ 80%]  (Sampling)
## Chain 4: Iteration: 2700 / 3000 [ 90%]  (Sampling)
## Chain 4: Iteration: 3000 / 3000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 5.069 seconds (Warm-up)
## Chain 4:                5.916 seconds (Sampling)
## Chain 4:                10.985 seconds (Total)
## Chain 4:

print(Model,c("beta","lambda","alpha"),digits=3,probs=c(0.025,0.50,0.975))

## Inference for Stan model: 6b66ffafd6a94229ee4ba409c0535dcb.
## 4 chains, each with iter=3000; warmup=1500; thin=1; 
## post-warmup draws per chain=1500, total post-warmup draws=6000.
## 
##          mean se_mean    sd   2.5%   50% 97.5% n_eff  Rhat
## beta[1] 0.091   0.004 0.226 -0.354 0.098 0.531  2678 1.000
## lambda  0.207   0.001 0.036  0.142 0.204 0.283  2643 1.003
## alpha   1.080   0.005 0.252  0.658 1.057 1.630  2721 1.002
## 
## Samples were drawn using NUTS(diag_e) at Sat Nov 18 10:38:17 2023.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

#Traceplots
windows()
stan_trace(Model,pars = c("beta","lambda","alpha"))+ggtitle("Trace plots for the (Sine-Gompertz PH Model parameters)")

#Autocorrelation plots
windows()
stan_ac(Model,pars = c("beta","lambda","alpha"))+grid_lines()+ggtitle("Autocorrelation plots for the (Sine-Gompertz PH Model parameters)")

#Density plots
windows()
stan_dens(Model,pars = c("beta","lambda","alpha"))+ggtitle("Trace plots for the (Sine-Gompertz PH Model parameters)")

#Histogram plots
windows()
stan_hist(Model,pars = c("beta","lambda","alpha"))+ggtitle("Trace plots for the (Sine-Gompertz PH Model parameters)")

#loo package is used for calculating LOOIC and WAIC
log_lik_11<-extract_log_lik(Model,parameter_name ="log_likhood",merge_chains = TRUE)
loo_ll<-loo(log_lik_11,r_ef=NULL,save_psis=FALSE)
print(loo_ll)

## 
## Computed from 6000 by 90 log-likelihood matrix
## 
##          Estimate   SE
## elpd_loo   -134.6  6.4
## p_loo         3.6  0.7
## looic       269.2 12.8
## ------
## Monte Carlo SE of elpd_loo is 0.0.
## 
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.

waic_ll<-waic(log_lik_11)
print(waic_ll)

## 
## Computed from 6000 by 90 log-likelihood matrix
## 
##           Estimate   SE
## elpd_waic   -134.6  6.4
## p_waic         3.6  0.7
## waic         269.1 12.8

Sine Exponentiated Exponential PH Model

library(rstan)
library(bayesplot)
library(loo)
####################Bayesian Sine-Exponentiated Exponential PH Model ############
stancode_sEE<-"
functions{
// log of survival function
vector log_Sval(vector t,  real lambda, real alpha, vector mu){
vector[num_elements(t)] log_Sval;
for (j in 1:num_elements(t)){
log_Sval[j]=-log((1- sin((pi()/2)*(1-exp(-lambda*t[j]))^alpha))^mu[j]);
}
return log_Sval;
}
// log of hazard function
vector log_hd (vector t, real lambda, real alpha, vector mu){
vector[num_elements(t)] log_hd;
for (j in 1:num_elements(t)){
log_hd[j]<-log(((pi()/2)*(alpha*lambda*(1-exp(-lambda*t[j]))^(alpha-1)*exp(-lambda*t[j]))*cos((pi()/2)*(1-exp(-lambda*t[j]))^alpha)/
           (1- sin((pi()/2)*(1-exp(-lambda*t[j]))^alpha)))*mu[j]);
}
return log_hd;
}
// log-likelihood for right censoring
real survival_llogist_lpdf(vector t, vector d,
real lambda, real alpha, vector mu){
vector[num_elements(t)] log_likhood;
real prob;
log_likhood=d .*log_hd(t,lambda,alpha,mu)-log_Sval(t,lambda,alpha,mu);
prob=sum(log_likhood);
return prob;
}
}
// block for data
data{
int M; // covariate number
int N; // observations number
vector<lower=0> [N]y; // observation vector (times)
vector<lower=0,upper=1> [N] event; //censoring (1=obs.,0=cens.)
matrix[N,M] x;//model matrix (N for rows and M for columns)
}
// transformed data block, not used
// block for parameters
parameters{
vector [M] beta; //coeff. in the linear predictor
real<lower=0> lambda; //scale parameter
real<lower=0> alpha; //shape1 parameter
}
// block for transformed parameters
transformed parameters{
vector[N] linpred;
vector[N] mu;
linpred=x*beta;//linear predictor
for (j in 1:N){
mu[j]=exp(linpred[j]);
}
}
// block for model specification
model{
lambda~gamma(10,10);//prior for lambda
alpha~gamma(10,10);//prior for alpha
beta~normal(0,100); //prior for reg. coefficients
y~survival_llogist(event,lambda,alpha,mu); //likelihood
}
// block for generated quantities
generated quantities {
vector[N] y_rep; //posterior predictive value
vector[N] log_likhood; //log-likelihood
{for (n in 1:N)
log_likhood[n]=(log(((pi()/2)*(alpha*lambda*(1-exp(-lambda*y[n]))^(alpha-1)*exp(-lambda*y[n]))*cos((pi()/2)*(1-exp(-lambda*y[n]))^alpha)/
           (1- sin((pi()/2)*(1-exp(-lambda*y[n]))^alpha)))*(exp(x[n,]*beta)))*event[n])-
(-log((1- sin((pi()/2)*(1-exp(-lambda*y[n]))^alpha))^(exp(x[n,]*beta))));
}
  {real u;
u=uniform_rng(0,1);
for (n in 1:N){
y_rep[n]=-log(1-((2/pi())*asin(1-exp((-log(1-u)/(exp(x[n,]*beta))))))^(1/alpha))/lambda;}
}
}
"
#Survival times for  patients
library(AmoudSurv)
data("gastric")
str(gastric)

## 'data.frame':    90 obs. of  3 variables:
##  $ time  : int  1 17 42 44 48 60 63 72 74 95 ...
##  $ status: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ trt   : int  0 1 1 1 1 1 0 1 1 1 ...

y<-as.vector(gastric$time)/365.24
#event status, 0 indicates censored and 1 indicates death
event<-gastric$status
#x1=0=F1 patients treated with CT, x1=1=F1, treated with CTR
X<-as.matrix(gastric$trt)
F1<-X
#dm=x data matrix
dm<-cbind(F1)
N<-nrow(dm)
M<-ncol(dm)
data3=list(event=event,y=y,x=dm,N=N,M=M)


#Model Fitting
#Starting values, linear regression of log(y) on X's
#model1<-lm(log(y)~F1)
#b1<-model1$coefficients
#b1
Model<-stan(model_code=stancode_sEE,data=data3,iter=3000,
            chains=4)

## 
## SAMPLING FOR MODEL 'e827b3c31fce1e783a5f0fcffd730534' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 3000 [  0%]  (Warmup)
## Chain 1: Iteration:  300 / 3000 [ 10%]  (Warmup)
## Chain 1: Iteration:  600 / 3000 [ 20%]  (Warmup)
## Chain 1: Iteration:  900 / 3000 [ 30%]  (Warmup)
## Chain 1: Iteration: 1200 / 3000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1500 / 3000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1501 / 3000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1800 / 3000 [ 60%]  (Sampling)
## Chain 1: Iteration: 2100 / 3000 [ 70%]  (Sampling)
## Chain 1: Iteration: 2400 / 3000 [ 80%]  (Sampling)
## Chain 1: Iteration: 2700 / 3000 [ 90%]  (Sampling)
## Chain 1: Iteration: 3000 / 3000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 4.124 seconds (Warm-up)
## Chain 1:                4.062 seconds (Sampling)
## Chain 1:                8.186 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'e827b3c31fce1e783a5f0fcffd730534' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 0 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 3000 [  0%]  (Warmup)
## Chain 2: Iteration:  300 / 3000 [ 10%]  (Warmup)
## Chain 2: Iteration:  600 / 3000 [ 20%]  (Warmup)
## Chain 2: Iteration:  900 / 3000 [ 30%]  (Warmup)
## Chain 2: Iteration: 1200 / 3000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1500 / 3000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1501 / 3000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1800 / 3000 [ 60%]  (Sampling)
## Chain 2: Iteration: 2100 / 3000 [ 70%]  (Sampling)
## Chain 2: Iteration: 2400 / 3000 [ 80%]  (Sampling)
## Chain 2: Iteration: 2700 / 3000 [ 90%]  (Sampling)
## Chain 2: Iteration: 3000 / 3000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 4.13 seconds (Warm-up)
## Chain 2:                5.019 seconds (Sampling)
## Chain 2:                9.149 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'e827b3c31fce1e783a5f0fcffd730534' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 0 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 3000 [  0%]  (Warmup)
## Chain 3: Iteration:  300 / 3000 [ 10%]  (Warmup)
## Chain 3: Iteration:  600 / 3000 [ 20%]  (Warmup)
## Chain 3: Iteration:  900 / 3000 [ 30%]  (Warmup)
## Chain 3: Iteration: 1200 / 3000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1500 / 3000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1501 / 3000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1800 / 3000 [ 60%]  (Sampling)
## Chain 3: Iteration: 2100 / 3000 [ 70%]  (Sampling)
## Chain 3: Iteration: 2400 / 3000 [ 80%]  (Sampling)
## Chain 3: Iteration: 2700 / 3000 [ 90%]  (Sampling)
## Chain 3: Iteration: 3000 / 3000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 4.374 seconds (Warm-up)
## Chain 3:                4.789 seconds (Sampling)
## Chain 3:                9.163 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'e827b3c31fce1e783a5f0fcffd730534' NOW (CHAIN 4).
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: 
## Chain 4: Gradient evaluation took 0 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 3000 [  0%]  (Warmup)
## Chain 4: Iteration:  300 / 3000 [ 10%]  (Warmup)
## Chain 4: Iteration:  600 / 3000 [ 20%]  (Warmup)
## Chain 4: Iteration:  900 / 3000 [ 30%]  (Warmup)
## Chain 4: Iteration: 1200 / 3000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1500 / 3000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1501 / 3000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1800 / 3000 [ 60%]  (Sampling)
## Chain 4: Iteration: 2100 / 3000 [ 70%]  (Sampling)
## Chain 4: Iteration: 2400 / 3000 [ 80%]  (Sampling)
## Chain 4: Iteration: 2700 / 3000 [ 90%]  (Sampling)
## Chain 4: Iteration: 3000 / 3000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 4.238 seconds (Warm-up)
## Chain 4:                4.679 seconds (Sampling)
## Chain 4:                8.917 seconds (Total)
## Chain 4:

print(Model,c("beta","lambda","alpha"),digits=3,probs=c(0.025,0.50,0.975))

## Inference for Stan model: e827b3c31fce1e783a5f0fcffd730534.
## 4 chains, each with iter=3000; warmup=1500; thin=1; 
## post-warmup draws per chain=1500, total post-warmup draws=6000.
## 
##           mean se_mean    sd   2.5%    50% 97.5% n_eff  Rhat
## beta[1] -0.050   0.004 0.219 -0.486 -0.044 0.365  3483 1.000
## lambda   0.355   0.001 0.057  0.252  0.351 0.475  2995 1.001
## alpha    1.042   0.002 0.122  0.820  1.036 1.300  3112 1.001
## 
## Samples were drawn using NUTS(diag_e) at Sat Nov 18 10:39:44 2023.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

#Traceplots
windows()
stan_trace(Model,pars = c("beta","lambda","alpha"))+ggtitle("Trace plots for the (Sine-Exponentiated Exponential PH Model parameters)")

#Autocorrelation plots
windows()
stan_ac(Model,pars = c("beta","lambda","alpha"))+grid_lines()+ggtitle("Autocorrelation plots for the (Sine-Exponentiated Exponential PH Model parameters)")

#Density plots
windows()
stan_dens(Model,pars = c("beta","lambda","alpha"))+ggtitle("Trace plots for the (Sine-Exponentiated Exponential PH Model parameters)")

#Histogram plots
windows()
stan_hist(Model,pars = c("beta","lambda","alpha"))+ggtitle("Trace plots for the (Sine-Exponentiated Exponential PH Model parameters)")

#loo package is used for calculating LOOIC and WAIC
log_lik_11<-extract_log_lik(Model,parameter_name ="log_likhood",merge_chains = TRUE)
loo_ll<-loo(log_lik_11,r_ef=NULL,save_psis=FALSE)
print(loo_ll)

## 
## Computed from 6000 by 90 log-likelihood matrix
## 
##          Estimate   SE
## elpd_loo   -129.6  6.8
## p_loo         3.7  0.7
## looic       259.2 13.6
## ------
## Monte Carlo SE of elpd_loo is 0.0.
## 
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.

waic_ll<-waic(log_lik_11)
print(waic_ll)

## 
## Computed from 6000 by 90 log-likelihood matrix
## 
##           Estimate   SE
## elpd_waic   -129.6  6.8
## p_waic         3.7  0.7
## waic         259.2 13.6
## 
## 1 (1.1%) p_waic estimates greater than 0.4. We recommend trying loo instead.

Conclusion

Our primary objective in this study was to develop and evaluate a Bayesian survival modeling approach for PH regression models. Specifically, we focused on utilizing the Sine-G family of distributions as baseline hazards.

By incorporating the Sine-G family, which includes distributions such as Weibull, exponentiated exponential, Lomax, and Gompertz, we aimed to enhance the modeling capabilities of our approach. These distributions offer a flexible range of hazard forms, allowing us to capture various patterns observed in survival data.

Through rigorous evaluation and analysis, we compared the performance of different baseline hazards within the Sine-G family. Our goal was to identify the most suitable choice for our proposed Bayesian survival modeling approach.

Based on our findings, we determined that the Sine-Lomax-PH model demonstrated superior performance compared to the other baseline hazards considered. This model exhibited greater flexibility and accuracy in representing the hazard function within proportional hazard regression models.

By utilizing the Sine-Lomax-PH model, we were able to effectively capture a wide range of hazard shapes, including increasing, decreasing, and bathtub-shaped hazards. This flexibility enabled a more comprehensive understanding of the underlying survival patterns and improved the accuracy of our predictions.

To implement our proposed approach, we utilized the R programming language and the STAN probabilistic programming framework. This combination ensured efficient computation and reliable inference of model parameters, thereby enhancing the reliability and applicability of our findings.

In summary, our study successfully developed and evaluated a Bayesian survival modeling approach for PH regression models using the Sine-G family of distributions as baseline hazards. The Sine-Lomax-PH model emerged as the preferred choice due to its superior flexibility and accuracy in representing hazard forms. Our findings contribute to advancing the field of survival analysis and provide a valuable tool for researchers and practitioners working with PH regression models in medical research and clinical practice.

Bayesian Survival Modelling of Sine-G Family of Distributions A Baseline Hazard Approach for Proportional Hazard Regression Models using R and STAN with Application to Right-Censored Survival Datasets

Abdisalam Hassan Muse, Amani Almohaimeed, Hana Alqifari and Christophe Chesneau

2023-11-18

Abstract

Sine Gompertz PH Model

Sine Exponentiated Exponential PH Model

Conclusion