where \(y_{ij}\) represents the cholesterol level divided by 100 at the \(j\)th time for subject \(i\) and \(t_{ij}\) is \((\text{time}-5)/10\), with time measured in years from baseline. [6] assume normal residual errors \(\varepsilon_{ij}\) and the density of the random effects \({\bf u}_i=(u_{1i}, u_{2i})^\top\) is represented by a semi-nonparametric truncated series expansion.
Model (1) is simply the LMM in (1) from [1] with \(p=4\) and \(q=2\). Here we adopt the following hierarchical structure: \[\begin{eqnarray*} \varepsilon_{ij}\mid \sigma_{\varepsilon},\delta_{\varepsilon} &\stackrel{i.i.d.}{\sim}& \text{t}({0},\sigma_{\varepsilon} ,\delta_{\varepsilon}), \notag\\ %y_j \vert {\bf u}, {\bf \beta}, \sigma_{\varepsilon},\delta_{\varepsilon} &\stackrel{ind.}{\sim}& \text{St}_{1} ({\bf x}_j^{\top}\bm{\beta} + {\bf z}_j^{\top}{\bf u},\sigma_{\varepsilon} , \delta_{\varepsilon}),\notag\\ {\bf u}_i\mid {\bf \theta_1},{\bf \theta_2},\rho &\sim& \text{GC}(F_1,F_2\mid \rho), \end{eqnarray*}\] where GC denotes a bivariate Gaussian copula with marginals \(F_1\) and \(F_2\), and correlation parameter \(\rho\). For these marginal distributions we use a two-piece sinh-arcsinh distribution with the epsilon–skew parameterisation which contains a scale parameter \(\sigma_i>0\), a skewness parameter \(\gamma_i\in (-1,1)\), and a kurtosis parameter \(\delta_i>0\), \(i=1,2\). We denote this model as Model 1. We adopt the prior structure: \[\begin{eqnarray}\label{PriorFramingham} \pi({\bf \beta},\sigma_{\varepsilon},\delta_{\varepsilon},\sigma_1,\sigma_2,\gamma_1,\gamma_2,\delta_1,\delta_2,\rho) &\sim& \dfrac{\pi(\delta_{\varepsilon})\pi(\rho)}{\sigma_{\varepsilon}} \prod_{i=1}^2 \pi(\sigma_i)\pi(\gamma_i)\pi(\delta_i). \end{eqnarray}\] As a weakly informative prior for the degrees of freedom of the Student-\(t\) distribution, \(\delta_{\varepsilon}\) we employ the prior proposed in [2]. For each of the scale parameters \((\sigma_1,\sigma_2\)) we adopt a half-Cauchy prior. The shape parameters \((\delta_1,\delta_2)\) are assigned the prior proposed in for a general class of kurtosis parameters. For each of the skewness parameters \((\gamma_1,\gamma_2)\) we adopt a uniform distribution on \((-1,1)\). In a bivariate Gaussian copula the Spearman measure of association, \(r_{\rho}\in(-1,1)\), can be calculated in closed form as \[\begin{eqnarray*} r_{\rho} = \dfrac{6}{\pi}\arcsin\left(\dfrac{\rho}{2}\right), \end{eqnarray*}\] for which we assume a uniform prior \(r_{\rho}\sim U(-1,1)\), inducing the following prior density on \(\rho\) \[\begin{eqnarray*} \pi(\rho) \propto \dfrac{1}{\sqrt{1-\left({\rho}/{2}\right)^2}}. \end{eqnarray*}\]The propriety of the corresponding posterior distribution is guaranteed by Theorem 1 from [1]. See [1] for a description of other sub-models of interest, which can be implemented with some little tweaks of the R code presented below. The following R code illustrates the implementation of Model 1. A posterior sample of size \(5,000\) was obtained after a burn-in period of \(15,000\) iterations and thinning every \(25\) iterations (\(140,000\) simulations in total).
The format of the tables is better in Google Chrome and Mozilla Firefox in our experience
References
Flexible linear mixed models with improper priors for longitudinal and survival data
Bayesian modelling of skewness and kurtosis with Two-Piece Scale and shape distributions
A weakly informative prior for the degrees of freedom of the t distribution
Linear mixed models with flexible distributions of random effects for longitudinal data
The data can be found here
rm(list=ls())
# Required packagges
library(twopiece)
## Loading required package: flexclust
## Loading required package: grid
## Loading required package: lattice
## Loading required package: modeltools
## Loading required package: stats4
## Loading required package: mvtnorm
## Loading required package: label.switching
library(TPSAS)
library(mvtnorm)
library(compiler)
library(spBayes)
## Loading required package: coda
## Loading required package: magic
## Loading required package: abind
## Loading required package: Formula
library(knitr)
enableJIT(3)
## [1] 3
# Preparing the data
data <- read.table("cholst.txt",header=F)
colnames(data) <- c("newid","oriid","cholst","sex","age","year")
# Some plots
plot(data$year,data$cholst,xlab="Year",ylab="Cholesterol Level",cex.axis=1.5, cex.lab=1.5)
plot(data$year,data$cholst,xlab="Year",ylab="Cholesterol Level",type="l",cex.axis=1.5, cex.lab=1.5)
hist(data$cholst,xlab="Cholesterol Level",cex.axis=1.5, cex.lab=1.5)
box()
hist(data$age,xlab="Age",cex.axis=1.5, cex.lab=1.5)
box()
# variables
Y <- data$cholst/100 # response
time <- (data$year-5)/10 # time
sex <- data$sex
age <- data$age
n <- length(Y) # sample size
q <- max(data$newid) # Number of clusters
# cluster indicators
ind <- data$newid
ind1 <- unique(ind)
#-----------------------------------------------------------------------------------
#Prior on delta Student-T
#-----------------------------------------------------------------------------------
# Parameters obtained by fitting the sample from \pi(\nu)
MLE <- c(0.84886337, 1.44326981, 0.05384718, 0.89544807)
# Prior on nu
logkdeT <- Vectorize(function(x) dtpsas(log(x),MLE[1],MLE[2],MLE[3],MLE[4],param="eps")/x)
curve(logkdeT,0,15,n=10000,lwd=2,xlab=~nu,ylab="Density",cex.axis=1.5,cex.lab=1.5)
#-----------------------------------------------------------------------------------
#Prior on delta TPSAS
#-----------------------------------------------------------------------------------
MLE2 <- c(-0.1553498, 0.1620447, -0.2605134, 0.5742682)
# Prior on delta
logkde <- Vectorize(function(t) dtpsas(log(t),MLE2[1],MLE2[2],MLE2[3],MLE2[4],param="eps")/t)
curve(logkde,0,10,n=10000,lwd=2,xlab=~delta,ylab="Density",cex.axis=1.5,cex.lab=1.5)
# Log Copula
lpriord <- cmpfun(function(par){
p1 <- par[1]
p2 <- par[2]
r <- par[3]
R.I <- rbind(c (1/(1 - r^2), -r/(1 - r^2)),c (-r/(1 - r^2), 1/(1 - r^2)))
V <- c(qnorm(p1),qnorm(p2))
out <- - 0.5*log(1.-r^2.) -0.5*V%*%(R.I-diag(2))%*%V
return(out)
})
# log posterior
lpc <- cmpfun(function(par){
if(par[5]>0 & par[6]>0.1 & par[7]>0 & par[8]>-1 & par[8]<1 & par[9]>0 & par[10]>-1 & par[10]<1 & par[11]>0 & par[12]>0 & par[13]>-1 & par[13]<1){
var = dt((Y-(par[1] + par[2]*age + par[3]*sex + par[4]*time + par[13+ind] + par[213+ind]*time))/par[5],df=par[6],log=T)-log(par[5])
p1 = ptpsas(par[13+ind1],0,par[7],par[8],par[11],param="eps")
p2 = ptpsas(par[213+ind1],0,par[9],par[10],par[12],param="eps")
d1 = dtpsas(par[13+ind1],0,par[7],par[8],par[11],param="eps",log=T)
d2 = dtpsas(par[213+ind1],0,par[9],par[10],par[12],param="eps",log=T)
varc = apply(cbind(p1,p2,par[13]),1,lpriord)
ifelse(is.na(sum(var) + sum(varc)),
return(-Inf),
return( sum(var) + sum(varc) + sum(d1) + sum(d2) - log(par[5]) + log(logkdeT(par[6])) + dt(par[7],df=1,log=T) + dt(par[9],df=1,log=T) + log(logkde(par[11])) + log(logkde(par[12])) - 0.5*log(1-par[13]^2/4)))
}
else return(-Inf)
})
# Initial points
inits <- c(1.8,0.01,0,0,0.5,5,0.4,0,0.2,0,1,1,0.4,rep(0,400))
lpc(inits)
## [1] -560.1483
########################################################################################################
# # Sampling from the posterior: approximately 3 hours for each 25x100 iterations
# # Uncomment this section is you want to run the sampler
# n.batch <- 5600
# batch.length <- 25
#
# # Here only fixed effects and shape parameters are saved
# # To save the posterior samples of the random effects, just increase the index in p.theta.samples[,1:13]
# set.seed(1234)
# chain <- adaptMetropGibbs(ltd=lpc, starting=inits, accept.rate=0.44, batch=n.batch,
# batch.length=batch.length, report=100, verbose=FALSE)$p.theta.samples[,1:13]
#
# # Burning and thinning the chain
# burn <- 15000
# thin <- 25
# NS <- n.batch*batch.length
# index <- seq(burn,NS,thin)
#
# beta1p <- chain[index,1]
# beta2p <- chain[index,2]
# beta3p <- chain[index,3]
# beta4p <- chain[index,4]
# sigmap <- chain[index,5]
# deltap <- chain[index,6]
# sigma1p <- chain[index,7]
# gamma1p <- chain[index,8]
# sigma2p <- chain[index,9]
# gamma2p <- chain[index,10]
# delta1p <- chain[index,11]
# delta2p <- chain[index,12]
# rhop <- chain[index,13]
########################################################################################################
# Already saved outcome
load('postsamp.RData')
# Summaries of the posterior samples
hist(beta1p)
hist(beta2p)
hist(beta3p)
hist(beta4p)
hist(sigmap)
hist(deltap)
hist(sigma1p)
hist(gamma1p)
hist(sigma2p)
hist(gamma2p)
hist(delta1p)
hist(delta2p)
hist(rhop)
post.samp <- as.matrix(cbind(beta1p,beta2p,beta3p,beta4p,sigmap,deltap,sigma1p,gamma1p,sigma2p,gamma2p,delta1p,delta2p,rhop))
SUM <- apply(post.samp,2,summary)
colnames(SUM) <- c("beta1", "beta2", "beta3", "beta4", "sigma_e", "delta_e", "sigma_1", "gamma_1", "sigma_2", "gamma_2", "delta_1", "delta_2", "rho")
kable(SUM,digits = 3)
| beta1 | beta2 | beta3 | beta4 | sigma_e | delta_e | sigma_1 | gamma_1 | sigma_2 | gamma_2 | delta_1 | delta_2 | rho | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Min. | 1.093 | 0.003 | -0.211 | -0.108 | 0.130 | 2.687 | 0.118 | -0.641 | 0.038 | -1.000 | 0.529 | 0.456 | -0.023 |
| 1st Qu. | 1.402 | 0.014 | -0.092 | 0.077 | 0.157 | 4.445 | 0.239 | -0.397 | 0.142 | -0.754 | 0.757 | 0.859 | 0.342 |
| Median | 1.494 | 0.016 | -0.057 | 0.130 | 0.163 | 5.094 | 0.275 | -0.331 | 0.202 | -0.546 | 0.820 | 1.048 | 0.425 |
| Mean | 1.496 | 0.016 | -0.057 | 0.146 | 0.164 | 5.275 | 0.281 | -0.327 | 0.264 | -0.498 | 0.830 | 1.280 | 0.420 |
| 3rd Qu. | 1.584 | 0.018 | -0.022 | 0.200 | 0.170 | 5.916 | 0.317 | -0.259 | 0.307 | -0.296 | 0.893 | 1.406 | 0.503 |
| Max. | 2.006 | 0.028 | 0.122 | 0.546 | 0.202 | 16.794 | 0.677 | 0.080 | 1.505 | 0.686 | 1.427 | 5.392 | 0.810 |