# This initial housekeeping is not needed when knitting in R Markdown but is
# useful when running R code without knitting.
# !! Because this clears all memory, it must be the first R code executed !!
graphics.off() # This closes all of R's graphics windows.
rm(list=ls()) # Careful! This clears all of R's memory!
# Set some global options for knitr:
knitr::opts_chunk$set( error=TRUE , fig.asp=1 , fig.align="center")
# Define some URL's that will be used in the text:
OSFurlMain <- 'https://osf.io/w7cph/'
# OSFurlArticle <- 'https://osf.io/xdgzt/'
# OSFurlSupp <- 'https://osf.io/apf3y/'
# Define variable names for the BARG steps. Memo to me: Check that these BARG
# Step numbers match those in the main document!
stepCount = 0
stepCount = stepCount+1 ; StepModel = stepCount
stepCount = stepCount+1 ; StepComputation = stepCount
stepCount = stepCount+1 ; StepPosterior = stepCount
stepCount = stepCount+1 ; StepDecision = stepCount
stepCount = stepCount+1 ; StepSensitivity = stepCount
stepCount = stepCount+1 ; StepReproducible = stepCount
# N.B.! Next, specify whether to run new MCMC analyses or to re-load previously
# run and saved MCMC results. When runNewMCMC=TRUE then the MCMC chains are run
# anew. When runNewMCMC=FALSE (same as !TRUE) then the MCMC chains from the
# previous run are used. Because all MCMC processes are seeded, they should give
# the same results every time they are run. By using previously saved chains,
# the R Markdown document is typeset (i.e., knit) much faster.
runNewMCMC = TRUE
# Specify filename prefix for saved files:
saveFileDir = "BARG-Supplement-Output"
dir.create( saveFileDir )
filenameRoot = "BARG-Supplement"
This version has some minor revisions since the originally posted version of 28 June 2021:
...”)
is added to the bfplot() function.plotPost() and
plotPostDiff() functions.This is Supplementary Information (SI) to the article, “Bayesian analysis reporting guidelines” (BARG; Kruschke, J. K., Nature Human Behaviour, 2021). This SI has two main sections:
First, there is an example of a detailed Bayesian analysis with full reporting according to the BARG. The example of applying the BARG begins in the next section. A subsection of that that example, headed “Outline…”, provides additional details regarding the organization of its components.
Second, there is a review of previous Bayesian reporting guidelines. The review of previous Bayesian reporting guidelines appears after the example at the end of this document, in the final section.
This SI is best viewed onscreen from the HTML format available at https://osf.io/w7cph/. The HTML version of this document has a floating and dynamic table of contents visible in the left margin on wide screens. The table of contents is intended to aid navigation through the various subsections. The HTML version also has clickable buttons on the right side which dynamically unfold R code chunks if the reader wishes to examine them. The buttons can be clicked again to hide the code chunks. The HTML version also has tables that can be dynamically re-sorted by different columns.
This example applies the Bayesian analysis reporting guidelines (BARG) to ordinal data, specifically the ratings of movies from Liddell & Kruschke (2018) and its supplemental material at https://osf.io/53ce9/.
The primary goal of this document is to illustrate thorough reporting of the process and results of a Bayesian analysis. The goal is not to explain details of the specific computer programming I happened to use. Therefore most computer code is folded out of sight but can be revealed in the HTML document by clicking the “Code” buttons throughout the document. Moreover, this document is not intended to be a tutorial on Bayesian methods, and it is assumed that the reader is familiar with the basic concepts of Bayesian analysis (see, e.g., Farrell & Lewandowsky, 2018; Kruschke, 2015; Kruschke & Liddell, 2018a, 2018b; McElreath, 2020; van de Schoot et al., 2021).
The analysis begins with setting up the R software environment; various packages are loaded.
# Load packages:
library("rjags") # includes coda package too
library("runjags")
library("bridgesampling") # For Bayes factors
library("emdbook") # for dmvnorm() function
library("DT") # For table presentation
library("psych") # For pairwise scatter plots
library("tictoc") # For timing of processes
Next, various utility functions are defined. The functions defined here are idiosyncratic and do not need to be studied to understand the results presented later. (Some functions defined in this section might be vestigial and ultimately not used later.)
# Begin timing of processing:
tic("beginDocument") # matching toc() is near the end of the document
# Utility functions. Some of these functions are from the script,
# DBDA2E-utilities.R, that accompanies
# Kruschke (2015) Doing Bayesian Data Analysis 2nd Edition; see
# https://sites.google.com/site/doingbayesiandataanalysis/software-installation
HDIofMCMC = function( sampleVec , credMass=0.95 ) {
# Computes highest density interval from a sample of representative values,
# estimated as shortest credible interval.
# Arguments:
# sampleVec
# is a vector of representative values from a probability distribution.
# credMass
# is a scalar between 0 and 1, indicating the mass within the credible
# interval that is to be estimated.
# Value:
# HDIlim is a vector containing the limits of the HDI
sortedPts = sort( sampleVec )
ciIdxInc = ceiling( credMass * length( sortedPts ) )
nCIs = length( sortedPts ) - ciIdxInc
ciWidth = rep( 0 , nCIs )
for ( i in 1:nCIs ) {
ciWidth[ i ] = sortedPts[ i + ciIdxInc ] - sortedPts[ i ]
}
HDImin = sortedPts[ which.min( ciWidth ) ]
HDImax = sortedPts[ which.min( ciWidth ) + ciIdxInc ]
HDIlim = c( HDImin , HDImax )
return( HDIlim )
}
gammaShRaFromMeanSD = function( mean , sd ) {
if ( mean <=0 ) stop("mean must be > 0")
if ( sd <=0 ) stop("sd must be > 0")
shape = mean^2/sd^2
rate = mean/sd^2
return( list( shape=shape , rate=rate ) )
}
gammaShRaFromModeSD = function( mode , sd ) {
if ( mode <=0 ) stop("mode must be > 0")
if ( sd <=0 ) stop("sd must be > 0")
rate = ( mode + sqrt( mode^2 + 4 * sd^2 ) ) / ( 2 * sd^2 )
shape = 1 + mode * rate
return( list( shape=shape , rate=rate ) )
}
# Find mode value of a vector of numeric values, using kernel-smoothed density
# approximation:
modeMaxDens = function( v , ... ) {
dres = density( v , ... )
mode = dres$x[which.max(dres$y)]
return( mode )
}
# Tabular summary of MCMC diagnostics:
diagSummary = function( mcmcCoda ) {
require("coda")
if ( class(mcmcCoda) != "mcmc.list" ) {
stop( "In diagSummary() input must be mcmc.list (i.e., coda object)." )
}
mcmcMat = as.matrix(mcmcCoda,chains=TRUE)
parameterNames = varnames(mcmcCoda)
chainSummary = NULL
parIdx = 0
for ( parName in parameterNames ) {
parIdx = parIdx+1
thisPSRF = as.vector(gelman.diag(mcmcCoda[,parName])[[1]])
names( thisPSRF) = c("psrfPt","psrfUpCI")
thisESS = as.vector(effectiveSize(mcmcCoda[,parName]))
names( thisESS ) = "ESS"
thisMedian = quantile(mcmcMat[,parName],probs=c(0.5))
thisETI = quantile(mcmcMat[,parName],probs=c(0.025,0.975))
thisDensity = density(mcmcMat[,parName])
thisMode = thisDensity$x[which.max(thisDensity$y)]
names( thisMode ) = "Mode"
thisHDI = HDIofMCMC( mcmcMat[,parName] , credMass=0.95 )
names( thisHDI ) = c("HDIlow","HDIhigh")
chainSummary = rbind( chainSummary ,
c( thisPSRF , thisESS ,
thisMedian , thisETI , thisMode , thisHDI ) )
rownames(chainSummary)[parIdx] = parName
}
return( chainSummary )
}
# Compute number of rows and columns for panel display of n panels:
nrnc = function(n){
nr=round(sqrt(n))
nc=ceiling(n/nr)
return( data.frame( nr=nr , nc=nc ) )
}
# Create histogram with superimposed normal density. First argument is matrix or
# data frame with rows being steps in MCMC and columns being named parameters.
# Second argument is a column name.
histNorm = function( mcmcDF , varName , ... ){
samp = mcmcDF[,varName]
if ( sd(samp) > 0 ) {
hist( samp , main=varName , probability = TRUE,
breaks=31 , col="gray" , border="gray" , ... )
x = seq(min(samp),max(samp),length=201)
lines( x , dnorm(x,mean(samp),sd(samp)) , col="blue" )
} else {
hist( samp , main=varName , probability = TRUE,
breaks=seq( mean(samp)-1 , mean(samp)+1 , length=301 ) ,
col="gray" , border="gray" , ... )
}
}
# Histogram with superimposed gamma density:
histGamma = function( mcmcDF , varName ,
cnt=c("mode","mean")[2] , offset=c(0,NA)[1] , ... ){
samp = mcmcDF[,varName]
hist( samp , main=varName , probability = TRUE,
breaks=31 , col="gray" , border="gray" , ... )
x = seq(min(samp),max(samp),length=201)
if (is.na(offset)) { offset=min(samp) }
if (cnt=="mean") {
lines( x , dgamma(x-offset,
gammaShRaFromMeanSD(mean(samp)-offset,sd(samp))$shape ,
gammaShRaFromMeanSD(mean(samp)-offset,sd(samp))$rate ) ,
col="blue" )
}
if (cnt=="mode") {
lines( x , dgamma(x-offset,
gammaShRaFromModeSD(modeMaxDens(samp)-offset,sd(samp))$shape ,
gammaShRaFromModeSD(modeMaxDens(samp)-offset,sd(samp))$rate ),
col="blue" )
}
}
# Plot (posterior) histogram with annotation
plotPost = function( mcmc , name="Param" , rope=NULL , ... ){
require("coda") # for effectiveSize()
oldPar = par()[c("mar","mgp","pty")]
on.exit( par(oldPar) )
par( mar=c(3.5,3.5,3,1) , mgp=c(2.0,0.7,0) )
histInfo = hist( mcmc , probability=TRUE ,
xlab=name , main=name ,
col="skyblue" , border="skyblue" , breaks=51 , ... )
# ETI:
etiLims = quantile( mcmc , probs=c(0.025,0.975) )
etiMid = median(mcmc)
lines( x=etiLims , y=c(0,0) , lwd=10 , lend=1 , col="darkblue" )
text( x=etiMid , y=0 , labels="95% ETI" , adj=c(0.5,-1) , col="darkblue" )
# ESS:
ess = coda::effectiveSize(mcmc)
text( x=max(histInfo$breaks) , y=max(histInfo$density) , adj=c(1,1) ,
labels=paste0("ESS=",round(ess,1)) )
# ROPE:
if ( !is.null(rope) ) {
if ( is.numeric(rope) & length(rope)==2 & rope[1]<rope[2] ) {
pctIn = sum( mcmc>rope[1] & mcmc<rope[2] ) / length(mcmc)
yHt = max(histInfo$density)/2
ropeCol = "red"
segments( x0=rope[1] , y0=0 ,
x1=rope[1] , y1=yHt , lty="dotted" , col=ropeCol )
segments( x0=rope[2] , y0=0 ,
x1=rope[2] , y1=yHt , lty="dotted" , col=ropeCol )
text( x=mean(rope) , y=yHt , col=ropeCol , adj=c(0.5,-0.2) ,
labels=paste0(round(100*pctIn,1),"% in ROPE") )
}
}
infoTab = matrix( c(etiMid,etiLims,ess) ,
nrow=1 , ncol=4 ,
dimnames=list( c(name) ,
c("Q50","Q2.5","Q97.5","ESS") ) )
return( invisible( infoTab ) )
}
# Plot (posterior) histograms of two variables and their difference:
plotPostDiff = function( mcmc1 , mcmc2 , name1="Param.1" , name2="Param.2" ,
rope1=NULL , rope2=NULL , ropeD=NULL ) {
layout(matrix(1:3,ncol=1))
# Param 1:
p1info = plotPost( mcmc=mcmc1 , name=name1 , rope=rope1 ,
xlim=range(c(rope1,mcmc1,mcmc2)) )
# Param 2:
p2info = plotPost( mcmc=mcmc2 , name=name2 , rope=rope2 ,
xlim=range(c(rope2,mcmc1,mcmc2)) )
# Difference:
pDiffInfo = plotPost( mcmc=mcmc1-mcmc2 , name=paste0(name1," - ",name2) , rope=ropeD ,
xlim=range(c(0,ropeD,mcmc1-mcmc2)) )
abline( v=0 , lty="dotted" )
# Reset layout:
layout(matrix(1))
# Assemble info into table:
infoTab = matrix( 0 , nrow=3 , ncol=4 ,
dimnames=list( c(name1,name2,paste0(name1," - ",name2)) ,
c("Q50","Q2.5","Q97.5","ESS") ) )
infoTab[1,]=unlist(p1info)
infoTab[2,]=unlist(p2info)
infoTab[3,]=unlist(pDiffInfo)
return( invisible( infoTab ) )
}
# Plot smoothed density of G samples on same plot.
plotGpdfs = function( sampMat , # samples in columns, *named*
rope=NULL ,
cols=c("deepskyblue","deeppink","lightgreen") ,
lwds=c(2.0,2.75,3.5) ,
... ) { # user specifies axis labels and main title
# Compute the smoothed densities of columns, and the plot limits:
dlist = NULL
maxd = 0
xrange = NULL
for ( gIdx in 1:ncol(sampMat) ) {
thisDens = density(sampMat[,gIdx])
dlist = c( dlist , list( thisDens ) )
maxd = max( maxd , max(thisDens$y) )
xrange = range( c( thisDens$x , xrange ) )
}
xrange = range( c( xrange , rope ) )
# Make empty plot:
plot( 0,0,pch="" , xlim=xrange , ylim=c(0,maxd) ,
cex.lab=1.33 , ... )
# Insert density curves and HDI's:
for ( gIdx in 1:ncol(sampMat) ) {
lines( dlist[[gIdx]]$x , dlist[[gIdx]]$y ,
lwd=lwds[1+gIdx%%length(lwds)] ,
col=cols[1+gIdx%%length(cols)] )
hdi = HDIofMCMC(sampMat[,gIdx])
yhdi = ( dlist[[gIdx]]$y[ which.min( abs( dlist[[gIdx]]$x - hdi[1] ) ) ]
+ dlist[[gIdx]]$y[ which.min( abs( dlist[[gIdx]]$x - hdi[2] ) ) ] ) /2
lines( x=hdi , y=rep(yhdi,2) ,
lwd=lwds[1+gIdx%%length(lwds)] , col=cols[1+gIdx%%length(cols)] )
points( x=dlist[[gIdx]]$x[which.max(dlist[[gIdx]]$y)] , y=yhdi , pch="|" ,
col=cols[1+gIdx%%length(cols)] )
}
# baseline:
abline( h=0 , col="gray" )
# rope:
if ( length(rope)==2 ) {
abline( v=rope , col="red" , lty="dashed" )
}
# legend:
legend( "topleft" ,
legend=colnames(sampMat) ,
text.col=cols[1+(1:ncol(sampMat))%%length(cols)] ,
col=cols[1+(1:ncol(sampMat))%%length(cols)] ,
lwd=lwds[1+(1:ncol(sampMat))%%length(lwds)] )
}
# Plot empirical cumulative density functions (ECDF's) of G samples on same plot.
plotGecdfs = function( sampMat , # samples in columns, *named*
rope=NULL ,
cols=c("deepskyblue","deeppink","lightgreen") ,
lwds=c(2.0,2.75,3.5) ,
yeti = 0.05 , # y val at which to plot ETI
... ) { # user specifies axis labels and main title
xrange = range( c( range(sampMat) , rope ) )
# Make empty plot:
plot( 0,0,pch="" , xlim=xrange , ylim=c(0,1) , cex.lab=1.33 , ... )
# Insert ecdf's and ETI's:
for ( gIdx in 1:ncol(sampMat) ) {
lines( ecdf( sampMat[,gIdx] ) , verticals=FALSE , do.points=FALSE ,
lwd=lwds[1+gIdx%%length(lwds)] , col=cols[1+gIdx%%length(cols)] )
lines( x=quantile( sampMat[,gIdx] , probs=c(0.025,0.975) ) , y=rep(yeti,2) ,
lwd=lwds[1+gIdx%%length(lwds)] , col=cols[1+gIdx%%length(cols)] )
points( x=quantile( sampMat[,gIdx] , probs=c(0.5) ) , y=yeti ,
pch="|" , col=cols[1+gIdx%%length(cols)] )
}
abline( h=c(0.025,0.5,0.975) , col="gray" , lty="dotted" )
#grid()
if ( length(rope)==2 ) {
abline( v=rope , col="red" , lty="dashed" )
}
# legend:
legend( "topleft" ,
legend=colnames(sampMat) ,
text.col=cols[1+(1:ncol(sampMat))%%length(cols)] ,
col=cols[1+(1:ncol(sampMat))%%length(cols)] ,
lwd=lwds[1+(1:ncol(sampMat))%%length(lwds)] )
}
# Function for displaying data with model predictions.
postPredPlot = function( dataMat , FUN , mcmcMat ){
# Arguments:
# * dataMat should be 2 x 5 matrix of counts
# * FUN is name of function defined elsewhere that gives probabilities
# predicted by model from parameters input to FUN().
# * mcmcMat is MCMC matrix of parameter values for FUN(), with columns in the
# order required by FUN().
oldPar = par()[c("mar","mgp","pty")]
on.exit( { par(oldPar) ; layout(c(1)) } )
par( mar=c(3.5,3.5,3,1) , mgp=c(2.0,0.7,0) )
# Compute preducted probabilities at every step:
postPreds = matrix( NA ,
ncol = nrow(dataMat) * ncol(dataMat) , # should be 2 x 5 = 10
nrow = nrow(mcmcMat) )
for ( i in 1:nrow(mcmcMat) ) {
postPreds[i,] = FUN( mcmcMat[i,] )
}
# Compute random predicted data proportions at every step:
postPredProp = NA*postPreds
N1 = sum(dataMat[1,])
N2 = sum(dataMat[2,])
for ( i in 1:nrow(postPreds) ) {
if ( !any(is.na(postPreds[i,])) ) {
postPredProp[i,] = c( rmultinom(1,size=N1,prob=postPreds[i,1:5])/N1 ,
rmultinom(1,size=N2,prob=postPreds[i,6:10])/N2 )
}
}
# Plot actual data:
#maxCellP = max( apply( dataMat , 1 , FUN=function(x){x/sum(x)} ) )
maxCellP = max( apply( postPredProp , 2 ,
FUN=function(x){quantile( x , probs=c(0.975) , na.rm=TRUE )} ) )
nPanel = nrow(dataMat)
nr = nrnc(nPanel)$nr
nc = nrnc(nPanel)$nc
layout( matrix(1:(nr*nc),nrow=nr,byrow=TRUE) )
for ( idIdx in 1:nPanel ) {
# Plot data bars:
datVec = dataMat[idIdx,]/sum(dataMat[idIdx,])
plot( datVec ,
xlim=c(0.5,length(datVec)+0.5) ,
xlab="Rating" , ylab="p(rating)" , ylim=c(0,maxCellP) ,
main=paste0("Case ",idIdx," (N=",sum(dataMat[idIdx,]),")") ,
type="h" , lend=1 , lwd=15 , col="gold" )
abline(h=seq(0,1,0.1),lty="dotted",col="gray")
# Superimpose posterior predicted data:
points( 1:5 , apply( postPredProp[,(idIdx-1)*5+1:5] , 2 ,
FUN=function(x){median(x,na.rm=TRUE)} ) ,
col="deepskyblue" , cex=1.5 , lwd=2 )
for ( pt in 1:5 ) {
eti=quantile( postPredProp[,(idIdx-1)*5+pt] , probs=c(0.025,0.975) , na.rm=TRUE )
segments( x0=pt , x1=pt , y0=eti[1] , y1=eti[2] ,
col="deepskyblue" , lwd=4 , lend=1 )
}
}
}
# Function for computing probabilities of ordered-probit model:
predDiffmHetv = function( param ){
# Assumes param contains, in order, "mu[1]","mu[2]", "sigma[1]","sigma[2]",
# "thresh[2]","thresh[3]"
if ( length(param) != 6 ) {
stop( "In predDiffmHetv() argument param has incorrect length." )
}
# Rename param's:
mu = param[1:2]
sigma = param[3:4]
thresh = c(-Inf,1.5,param[5:6],4.5,Inf)
pred1 = pnorm((thresh[2:6]-mu[1])/sigma[1]) - pnorm((thresh[1:5]-mu[1])/sigma[1])
pred2 = pnorm((thresh[2:6]-mu[2])/sigma[2]) - pnorm((thresh[1:5]-mu[2])/sigma[2])
if ( any(pred1<0) | any(pred2<0) ) { pred1=NA*pred1 ; pred2=NA*pred2 }
return( c( pred1 , pred2 ) )
}
# Function for computing probabilities of EqmHetv ordered-probit model:
predEqmHetv = function( param ){
# Assumes param contains, in order,
# "mu", "sigma[1]","sigma[2]", "thresh[2]","thresh[3]"
if ( length(param) != 5 ) {
stop( "In predEqmHetv() argument param has incorrect length." )
}
# Rename param's:
mu = param["mu"]
sigma = param[c("sigma[1]","sigma[2]")]
thresh = c(-Inf,1.5,param[c("thresh[2]","thresh[3]")],4.5,Inf)
pred1 = pnorm((thresh[2:6]-mu)/sigma[1]) - pnorm((thresh[1:5]-mu)/sigma[1])
pred2 = pnorm((thresh[2:6]-mu)/sigma[2]) - pnorm((thresh[1:5]-mu)/sigma[2])
if ( any(pred1<0) | any(pred2<0) ) { pred1=NA*pred1 ; pred2=NA*pred2 }
return( c( pred1 , pred2 ) )
}
# Function for computing probabilities of DiffmHomv ordered-probit model:
predDiffmHomv = function( param ){
# Assumes param contains, in order,
# "mu", "sigma[1]","sigma[2]", "thresh[2]","thresh[3]"
if ( length(param) != 5 ) {
stop( "In predDiffmHomv() argument param has incorrect length." )
}
# Rename param's:
mu = param[c("mu[1]","mu[2]")]
sigma = param[c("sigma")]
thresh = c(-Inf,1.5,param[c("thresh[2]","thresh[3]")],4.5,Inf)
pred1 = pnorm((thresh[2:6]-mu[1])/sigma) - pnorm((thresh[1:5]-mu[1])/sigma)
pred2 = pnorm((thresh[2:6]-mu[2])/sigma) - pnorm((thresh[1:5]-mu[2])/sigma)
if ( any(pred1<0) | any(pred2<0) ) { pred1=NA*pred1 ; pred2=NA*pred2 }
return( c( pred1 , pred2 ) )
}
Readers might be interested in a specific function I created for
plotting Bayes factors (BF’s), results from which appeared in the
article that describes the BARG. The function takes the BF as input
(along with other inputs) and creates a plot of posterior model
probability as a function of prior model probability. The plot also
shows the range of prior probabilities that yield posterior
probabilities beyond a decision threshold. Examples of the plots appear
near the end of this document, in the section
on hypothesis testing. The function, called bfplot(),
is defined in the following code chunk.
#' @title bfplot
#' @description For an input Bayes factor, `bfplot()` creates a plot of the
#' posterior model probability as a function of the prior model probability.
#' @details For an input Bayes factor, `bfplot()` plots the posterior
#' probability of the numerator model as a function of the prior probability
#' of the numerator model. The plot is annotated with the critical posterior
#' probability for deciding to accept or reject the models, and with the
#' corresponding prior probability required to reach that threshold. The
#' function returns a list including a text message summarizing the results.
#' @param bf The Bayes factor to be plotted; a scalar greater than zero. User
#' must specify.
#' @param numerName Name of numerator model; character string in quotes. Has default.
#' @param denomName Name of denominator model; character string in quotes. Has default.
#' @param critPost Critical value for posterior probability of numerator model
#' to 'accept' the numerator model. Defaults to 0.95. Must be numeric > 0.50,
#' or NA in which case no accept/reject annotation is displayed or returned.
#' @param main Title annotation for plot. Has default.
#' @param bfCol Color of Bayes factor annotation. Has default.
#' @param acceptCol Color of 'accept' annotation. Has default.
#' @param rejectCol Color of 'reject' annotation. Has default.
#' @param bfLabelNudge Displacement of BF label relative to curve. Has default.
#' @param bfLabelCex Magnification of BF label text. Has default.
#' @param noPlot If TRUE then only returns info, no plot. Defaults to FALSE.
#' @return A list containing the input values and various computed values along
#' with a text message (character string) explaining critical prior
#' probabilities of models.
#' @examples
#' # Example with bf>1, lean toward 'accept' numerator model:
#' bfplot( bf = 19 )
#' # Example with bf<1, lean toward 'reject' numerator model:
#' bfplot( bf = 1/19 )
#' # Example with different critical posterior probability for decision:
#' bfplot( bf = 3 , critPost = 0.75 )
#' # Example with returned output:
#' bfplotInfo = bfplot( bf = 19 )
#' print( bfplotInfo )
#' # Example for disease diagnosis:
#' falsePositiveRate = 0.07 ; specificity = 1.0-falsePositiveRate
#' falseNegativeRate = 0.03 ; sensitivity = 1.0-falseNegativeRate
#' bfPositiveTest = sensitivity / falsePositiveRate # for positive test result
#' bfNegativeTest = falseNegativeRate / specificity # for negative test result
#' par(mfrow=c(1,2))
#' bfplot( bfPositiveTest ,
#' numerName="Have Disease" , denomName="Don't Have Dis." ,
#' main="Test Positive" , critPost=NA )
#' bfplot( bfNegativeTest ,
#' numerName="Have Disease" , denomName="Don't Have Dis." ,
#' main="Test Negative" , critPost=NA )
#' par(mfrow=c(1,1)) # reset single plot panel
#' @author John K. Kruschke, johnkruschke@gmail.com,
#' https://jkkweb.sitehost.iu.edu/, December 2020.
#' @export
bfplot = function( bf ,
numerName="Numerator Model" ,
denomName="Denominator Model" ,
critPost=0.95 , # or NA
# ornamentation:
main=bquote(atop(.(numerName),"/ "*.(denomName))) ,
bfCol = "deepskyblue" ,
acceptCol = "darkgreen" ,
rejectCol = "darkred" ,
bfLabelNudge = 0.2 ,
bfLabelCex = 1.5 ,
noPlot = FALSE , # if TRUE then only returns info, no plot
... # additional arguments passed to plot()
) {
if ( !noPlot ){
oldPar = par()[c("mar","mgp","pty")]
on.exit( par(oldPar) )
par( mar=c(3.5,3.5,4.0,1.0) , mgp=c(2.0,0.7,0) , pty="s" )
}
# Check values of inputs:
if ( bf <=0 ) { stop(" bf must be greater than zero") }
if ( !is.na(critPost) ) {
if ( critPost <= 0.50 ) { stop(" critPost must be greater than 0.50") }
if ( critPost > 1 ) { stop(" critPost must not exceed 1") }
}
if ( !noPlot ){
# Plot bf curve:
prior = seq( 1e-100 , 1 - 1e-100 , length=501 ) # vector, obviously
priorOdds = prior/(1-prior) # vector
postOdds = bf*priorOdds # vector
post = postOdds/(1+postOdds) # vector
idxToInclude = which(is.finite(post))
post = c(0,post[idxToInclude],1)
prior = c(0,prior[idxToInclude],1)
limMarg = 0.05
plot( prior , post ,
xlim=c(0-limMarg,1+limMarg) , ylim=c(0-limMarg,1+limMarg) ,
xlab=paste0("Prior Prob. of ",numerName) ,
ylab=paste0("Posterior Prob. of ",numerName) ,
type="l" , lwd=4 , col=bfCol ,
main=main ,
...
)
# Draw limited grid:
segments( x0=seq(0,1,0.2),y0=0,y1=1 , lty="dotted" , col="lightgray")
segments( y0=seq(0,1,0.2),x0=0,x1=1 , lty="dotted" , col="lightgray")
# Annotate curve with bf value:
bfLabelNudgeCrit = 1.7
if ( bf > 1 ) {
bfPlotIdx = which.max( post - prior )
text( prior[bfPlotIdx] , post[bfPlotIdx] ,
adj=c(0-bfLabelNudge,1+bfLabelNudge+1.3*(abs(log10(bf))>bfLabelNudgeCrit)) ,
labels=paste0("BF = ",signif(bf,3)) , col=bfCol , cex=bfLabelCex )
} else {
bfPlotIdx = which.min( post - prior )
text( prior[bfPlotIdx] , post[bfPlotIdx] ,
#adj=c(1+bfLabelNudge,0-bfLabelNudge-1.3*(abs(log10(bf))>bfLabelNudgeCrit)) ,
adj=c(1 ,0-bfLabelNudge-1.3*(abs(log10(bf))>bfLabelNudgeCrit)) ,
labels=paste0("BF = ",signif(bf,3)) , col=bfCol , cex=bfLabelCex )
}
}
if ( !is.na(critPost) ) {
# Compute critical values of prior prob:
hiCritPost = critPost
hiCritPostOdds = hiCritPost/(1-hiCritPost)
hiCritPrior = hiCritPostOdds/(bf+hiCritPostOdds)
loCritPost = 1-critPost
loCritPostOdds = loCritPost/(1-loCritPost)
loCritPrior = loCritPostOdds/(bf+loCritPostOdds)
if ( !noPlot ){
# Plot critical intervals of prior prob:
# Accept interval:
segments( y0=hiCritPost ,x0=0,x1=1, lty="dashed" , col=acceptCol )
lines( x=c(hiCritPrior,1) , y=c(hiCritPost,hiCritPost) ,
lwd=3 , lend=1 , col=acceptCol )
text( x=hiCritPrior , y=hiCritPost , adj=c(0.5,-0.5) ,
labels=round(hiCritPrior,3) , col=acceptCol )
text( x=1 , y=hiCritPost , adj=c(1.0,1.5) ,
labels=paste0("Prior s.t. Post>",round(hiCritPost,3)) , col=acceptCol )
# Reject interval:
segments( y0=loCritPost ,x0=0,x1=1, lty="dashed" , col=rejectCol )
lines( x=c(0,loCritPrior) , y=c(loCritPost,loCritPost) ,
lwd=3 , lend=1 , col=rejectCol )
text( x=loCritPrior , y=loCritPost , adj=c(0.5,1.5) ,
labels=round(loCritPrior,3) , col=rejectCol )
text( x=0 , y=loCritPost , adj=c(0.0,-0.5) ,
labels=paste0("Prior s.t. Post<",round(loCritPost,3)) , col=rejectCol )
}
# Construct message:
message = paste0( "The Bayes factor is ", signif(bf,3),
" for ", numerName, " relative to ", denomName, ".",
" To 'accept' ",numerName," (relative to ",denomName,")",
" with a posterior probability of at least ",
round(hiCritPost,3),", ",
numerName,"'s prior probability must be at least ",
round(hiCritPrior,3),".",
" To 'reject' ",numerName," (relative to ",denomName,")" ,
" with a posterior probability less than ",
round(loCritPost,3),", ",
numerName,"'s prior probability must be less than ",
round(loCritPrior,3),"." )
} else {
hiCritPost = NULL
hiCritPrior = NULL
loCritPost = NULL
loCritPrior = NULL
message = paste0("The Bayes factor is ", signif(bf,3),
" for ", numerName, " relative to ", denomName, ".")
}
# Invisibly return relevant information:
invisible( list(
bf=bf ,
numerName=numerName ,
denomName=denomName ,
critPost=critPost ,
hiCritPost=hiCritPost ,
hiCritPrior=hiCritPrior ,
loCritPost=loCritPost ,
loCritPrior=loCritPrior ,
message=message ) )
}
(Cf. BARG Step 1.A.)
The data are ratings of movies from Amazon.com, on an ordinal scale from 1 to 5 stars. For details of data acquisition see Liddell & Kruschke (2018). The original full data set contains ratings of 36 movies. For purposes of the present illustration, two of the movies are selected for comparison. The cases are selected to demonstrate that the conventional assumption of homogeneous variances can be very misleading, which was one of the main points of Liddell & Kruschke (2018).
# Read the full data set:
datFrm = read.csv( "MoviesData.csv" )
# Select subset of cases for this application:
datFrm = datFrm[c(9,15),] # similar means, different variances
## Other options for subsets to look at:
# datFrm = datFrm[c(5,12),] # similar means, different variances
## Or, check that results make sense for different means with similar variances:
# datFrm = datFrm[c(18,29),] # different means, similar variances
# Turn ID into a factor:
datFrm$ID = factor( datFrm$ID )
# Create matrix of counts for use in subsequent functions:
yMatAct = as.matrix( datFrm[,c("n1","n2","n3","n4","n5")] )
# # Vestigial check using fictitious data...
# # To test Eqm model, change yMatAct to equal means:
# if ( !TRUE ) {
# m1 = 3.0 ; s1 = 1.0
# m2 = 3.0 ; s2 = 1.5
# N = 2000
# t = c(-Inf,1.5,2.5,3.5,4.5,Inf)
# f1 = round( N*( pnorm((t[2:6]-m1)/s1) - pnorm((t[1:5]-m1)/s1) ) )
# f2 = round( N*( pnorm((t[2:6]-m2)/s2) - pnorm((t[1:5]-m2)/s2) ) )
# yMatAct = rbind( f1 , f2 )
# yMatAct
# }
The data, displayed numerically: In the frequency table below, there is one row per movie, with the rows labeled by arbitrary ID numbers (9 , 15) subsequently generically called ‘Case 1’ and ‘Case 2’. There is one column per star-rating level with the columns labeled arbitrarily as n1, n2, n3, n4, n5. Each cell contains the count (frequency) of ratings.
yMatAct
## n1 n2 n3 n4 n5
## 9 110 63 52 89 209
## 15 71 43 92 147 190
The data, displayed graphically:
datMat = yMatAct
maxCellP = max( apply( datMat , 1 , FUN=function(x){x/sum(x)} ) )
nPanel = nrow(datMat)
nr = nrnc(nPanel)$nr
nc = nrnc(nPanel)$nc
layout( matrix(1:(nr*nc),nrow=nr,byrow=TRUE) )
par( mar=c(3.5,3.5,3,1) , mgp=c(2.0,0.7,0) )
for ( idIdx in 1:nPanel ) {
datVec = datMat[idIdx,]/sum(datMat[idIdx,])
plot( datVec ,
xlim=c(0.5,length(datVec)+0.5) ,
xlab="Rating" , ylab="p(rating)" , ylim=c(0,maxCellP) ,
main=paste0("Case ",idIdx," (N=",sum(datMat[idIdx,]),")") ,
type="h" , lend=1 , lwd=15 , col="gold" )
abline(h=seq(0,1,0.1),lty="dotted",col="gray")
}
Visual inspection of the graphs and table shows that the distribution of ratings is “J” shaped, which is not unusual for rating distributions (e.g., Hu et al., 2009). Moreover, while Case 1 has a higher proportion of 5-star ratings than Case 2, Case 1 also has a higher proportion of 1-star ratings than Case 2.
(Cf. BARG Preamble B.)
I am interested in (i) the differences of the central tendencies of the cases, and (ii) the differences of the standard deviations of the cases. First, the magnitude of difference between central tendencies indicates how much better is one case than the other, on average. The difference of central tendencies will be converted to effect size (Cohen’s d), which is the difference relative to the average standard deviation. The effect size indicates the magnitude of difference between movies relative to the variability of opinion within movies. Second, the standard deviation indicates how much opinion varies across people, and the magnitude of difference between standard deviations indicates how much one case has more variance of opinion than the other. In other words, some movies elicit consistent opinions while other movies elicit diverse opinions. The model, explained later, describes means and standard deviations on a latent metric scale, not on the observed scale of ordinal labels.
I will also report decisions regarding whether the difference of means and difference of variances (on the latent scale) are effectively zero or non-zero. Decisions will be made two ways. First, decisions will be made by using the posterior estimated magnitudes of means and variances. The analysis for estimating parameter values and assessing null values takes approximately the first third of the example text.
A second way for making decisions is model comparison for null hypothesis testing. The full model has a distinct mean and variance for each case, whereas one restricted model requires the same variance for all cases (but allows different means), and a second restricted model requires the same mean for all cases (but allows different variances). The analysis of model comparison for null hypothesis testing takes approximately the latter two-thirds of the example text.
A note on terminology: A “null value” is the value of a parameter that indicates a null effect, such as an effect size of zero. By contrast, a “null hypothesis” is an entire model that has the relevant parameter fixed at its null value. An analyst can “assess a null value” by estimating the parameter value and considering the relation of its posterior distribution to its null value. An analyst can “test a null hypothesis” by comparing a full model with a restricted model that fixes the parameter at the null value.
(Cf. BARG Step 1.B.)
The data are ordinal values, which I choose to describe with an ordered-probit model. It is not appropriate to treat the data as if they were metric, a.k.a. interval, values. (In fact, treating the data as normally-distributed metric values and applying a \(t\)-test yields the inappropriate and misleading conclusion that the two movies have significantly different mean ratings.) There is no claim that an ordered-probit model is the correct model of the data or even the best model of the data. Rather, the ordered-probit model is better than treating the data as if they were normally distributed metric values. Moreover, the ordered-probit model fits the rating distributions reasonably well (as shown later by posterior predictive checks).
The ordered-probit model is explained in more detail below. For extensive background information about ordered-probit models and their analysis in Bayesian software, see Liddell & Kruschke (2018) and its supplemental material at https://osf.io/53ce9/.
Another resource for Bayesian analysis of ordinal data was provided
by Bürkner & Vuorre (2019), but unfortunately
the brms software featured in that article does not (at the
time of this writing) have a multivariate joint prior on all the
parameters as explained below. Also, in the present application the JAGS
software used here is much faster than the Stan software (Carpenter et al.,
2017) used by brms.
(Cf. BARG Preamble A.)
The Bayesian approach is especially useful for this application because of its flexibility for specifying exactly the desired model structure. Moreover, the Bayesian approach directly yields credible intervals for every parameter and derived variable, and yields posterior model probabilities for null and alternative models.
(Cf. BARG Step 1.D.)
In an ordered-probit model, there is a latent continuous variable underlying the ordinal response. The population is assumed to be a normally distributed on the latent variable, with mean \(\mu_i\) (for case \(i\)) and standard deviation \(\sigma_i\). The latent variable is cut at thresholds, \(\theta_1\) to \(\theta_{K-1}\) (for \(K\) response levels), such that latent values between \(\theta_{k-1}\) and \(\theta_{k}\) produce ordinal response \(k\). The figure below illustrates the ordered-probit model:
Figure: Ordinal data in upper panel are produced from thresholded cumulative-normal in lower panel. (Diagram is from Figure 1 of Liddell & Kruschke (2018), p. 329, used with permission.)
Mathematically, the probability of response level \(k\) is \[ p\big( k \,|\, \mu_i , \sigma_i , \{\theta_j\} \big) = \Phi\big( (\theta_{k} - \mu_i )/\sigma_i \big) - \Phi\big( (\theta_{k-1} - \mu_i )/\sigma_i \big) \] where \(\Phi()\) is the standardized cumulative-normal function. For the highest response level \(K\) the threshold \(\theta_{K}\) is effectively \(+\infty\), and for the lowest response level \(k=1\) the threshold \(\theta_{k-1}\) is effectively \(-\infty\). The thresholds are assumed to be determined by the response process and are therefore the same across all cases \(i\) (because all cases are measured by the same response process).
Parameters: The parameters consist of
However, the “stretch” and position of the latent scale are arbitrary —by analogy, the latent scale could be Fahrenheit or Celsius— and therefore two parameter values are fixed at arbitrary constants. In the traditional parameterization for an ordered-probit model, \(\mu_1 \equiv 0.0\) and \(\sigma_1 \equiv 1.0\) and all other parameters are specified relative to those constants. I find that parameterization to be unintuitive (Kruschke, 2015; Liddell & Kruschke, 2018), and prefer instead to fix \(\theta_1 \equiv 1.5\) and \(\theta_{K-1} \equiv K+0.5\) (where \(K\) is the highest ordinal level), which makes the values of the parameters correspond roughly to the response scale of \(1\) through \(K\). Thus, in the present application with two cases (i.e., two movies) and five response levels, there are a total of six estimated parameters: \(\mu_1\), \(\sigma_1\), \(\mu_2\), \(\sigma_2\), \(\theta_2\), and \(\theta_3\), with \(\theta_1 \equiv 1.5\) and \(\theta_4 \equiv 4.5\).
The means (\(\mu_1\) and \(\mu_2\)) describe the central tendency of ratings on the latent scale, and the standard deviations (\(\sigma_1\) and \(\sigma_2\)) describe the variability of ratings across people on the latent scale. Primary interest is in the magnitudes of the means and standard deviations, and in the magnitudes of the differences of means and standard deviations across cases.
In the threshold-pinned parameterization that has \(\theta_1 \equiv 1.5\) and \(\theta_4 \equiv 4.5\), a mean of 1.5 indicates that 50% of the responses will be level “1” and 50% of the responses will be levels \(>\)“1”. A mean of 4.5 indicates that 50% of the responses will be level “5”. A mean of 3.0 suggests the underlying (latent) rating is near a “3” on the response scale, subject to exact placement of the thresholds. A difference between means of 1.0 suggests the underlying (latent) ratings have central tendencies roughly 1 response level apart.
(Cf. BARG Step 1.C. and D.)
For basic parameter estimation from a broad prior, the prior can simply use diffuse univariate distributions on each parameter, as was done by Liddell & Kruschke (2018) and for which software was provided in the supplementary material at https://osf.io/53ce9/.
The present application will also illustrate model comparison with Bayes factors. Bayes factors can be very sensitive to the priors, and therefore it is crucial to specify meaningful priors that reasonably mimic parameter distributions that could have arisen from realistic data (or from idealized data generated from theory). It turns out that posterior distributions from the ordered-probit model can have strongly correlated parameters, and therefore these correlations should be represented in the prior distributions that are used for Bayes factors.
The threshold parameters are typically strongly correlated with each other, in any application of an ordered-probit model. The correlation of thresholds is natural because if one threshold estimate is scooted higher then the probabilities of intervals can be roughly maintained by also scooting the other threshold estimates a bit higher. Moreover, the \(\mu\) and \(\sigma\) parameters can be strongly correlated. The correlation is especially strong for \(\mu\) values near or beyond end thresholds, as is typical for movie ratings. For example, when there are a lot of 5-star ratings, then \(\mu\) is estimated to be near or above the highest threshold, and the data can be reasonably well fit by scooting \(\mu\) to a higher value while compensating by making \(\sigma\) also a higher value. The \(\mu\) and \(\sigma\) parameters can also be correlated with the threshold parameters. Examples will be shown later.
I have chosen to represent these correlations by using a multivariate normal distribution as the joint prior for all parameters simultaneously. A multivariate normal is specified by the mean of each variable (i.e., a vector of means) and the covariance matrix of the variables. The mean of the multivariate normal indicates the central tendency of the parameter value, and the variances represent the uncertainty of the prior such that larger variances represent greater uncertainty. The covariances represents how strongly the parameters are thought to covary.
Because each variable of a multivariate normal spans the real values from \(-\infty\) to \(+\infty\) but a standard-deviation parameter (\(\sigma_j\)) can only be non-negative, the multivariate normal is a prior on the logarithm of the standard deviation, \(\log(\sigma_j)\). Formally, the prior is given by \[ \left[ \matrix{ \mu_1 \\ \mu_2 \\ \log(\sigma_1) \\ \log(\sigma_2) \\ \theta_2 \\ \theta_3 } \right] \sim \mbox{mvnorm}\left( \left[ \matrix{ M_{\mu_1} \\ M_{\mu_2} \\ M_{\log(\sigma_1)} \\ M_{\log(\sigma_2)} \\ M_{\theta_2} \\ M_{\theta_3} } \right] , \left[ \matrix{ \mbox{V}_{\mu_1,\mu_1} &\mbox{C}_{\mu_1,\mu_2} &\mbox{C}_{\mu_1,\log(\sigma_1)} &\mbox{C}_{\mu_1,\log(\sigma_2)} &\mbox{C}_{\mu_1,\theta_2} &\mbox{C}_{\mu_1,\theta_3} \\ \mbox{C}_{\mu_2,\mu_1} &\mbox{V}_{\mu_2,\mu_2} &\mbox{C}_{\mu_2,\log(\sigma_1)} &\mbox{C}_{\mu_2,\log(\sigma_2)} &\mbox{C}_{\mu_2,\theta_2} &\mbox{C}_{\mu_2,\theta_3} \\ \mbox{C}_{\log(\sigma_1),\mu_1} &\mbox{C}_{\log(\sigma_1),\mu_2} &\mbox{V}_{\log(\sigma_1),\log(\sigma_1)} &\mbox{C}_{\log(\sigma_1),\log(\sigma_2)} &\mbox{C}_{\log(\sigma_1),\theta_2} &\mbox{C}_{\log(\sigma_1),\theta_3} \\ \mbox{C}_{\log(\sigma_2),\mu_1} &\mbox{C}_{\log(\sigma_2),\mu_2} &\mbox{C}_{\log(\sigma_2),\log(\sigma_1)} &\mbox{V}_{\log(\sigma_2),\log(\sigma_2)} &\mbox{C}_{\log(\sigma_2),\theta_2} &\mbox{C}_{\log(\sigma_2),\theta_3} \\ \mbox{C}_{\theta_2,\mu_1} &\mbox{C}_{\theta_2,\mu_2} &\mbox{C}_{\theta_2,\log(\sigma_1)} &\mbox{C}_{\theta_2,\log(\sigma_2)} &\mbox{V}_{\theta_2,\theta_2} &\mbox{C}_{\theta_2,\theta_3} \\ \mbox{C}_{\theta_3,\mu_1} &\mbox{C}_{\theta_3,\mu_2} &\mbox{C}_{\theta_3,\log(\sigma_1)} &\mbox{C}_{\theta_3,\log(\sigma_2)} &\mbox{C}_{\theta_3,\theta_2} &\mbox{V}_{\theta_3,\theta_3} } \right] \right) \] where \(\mbox{mvnorm}\) denotes the multivariate normal distribution, \(M\) is the prior mean, \(V\) is the prior variance (uncertainty), and \(C\) is the prior covariance. The covariance matrix is symmetric around the main diagonal (i.e., \(C_{i,j} = C_{j,i}\)). The formula above specifies only thresholds 2 and 3 because those are the only estimated thresholds in this application; thresholds 1 and 4 are pinned at fixed values as explained earlier. In summary, the prior multivariate normal distribution is specified by 27 constants: 6 means, 6 variances, and 15 distinct covariances.
Three choices for prior constants. The choice of \(M\), \(V\), and \(C\) constants will be explained in detail below, the first time each specific prior is introduced in an analysis. There will be three versions of the prior, to explore the sensitivity of the posterior to the choice of prior.
Two restricted models will be fit for comparison with the full model defined above.
(Cf. BARG Step 2.A.)
The Bayesian analysis was computed in R using JAGS; version details are provided at the end of the document in the section on How to Reproduce this Document. The present section shows the R code for the models. This section is not intended to be studied on a first reading, but is included at this point for reference. In particular, interested readers can check that the R code correctly implements the models that were specified mathematically above. (Cf. BARG Step 1.D.)
The functions set initial values for the MCMC chains, so they will generate reproducible outputs (cf. BARG Step 6.H.).
R code for full model. The function takes three types of inputs: (i) data, (ii) prior constants, (iii) MCMC run constants. Click the “Code” button in the HTML document to see the R code.
runDiffmHetv = function( yMat , # matrix of counts: cases x ordinal levels
paramM , # mu[1:2],logsigma[1:2],thresh[2:3]
paramVCOV ,
adaptSteps = 500 ,
burnInSteps = 1000 ,
numSavedSteps = 50000 ,
thinSteps = 10 , # because strong autocorrelation
nChains = 4 ,
initsList = "setDefSeed" ,
priorOnly = FALSE
) {
require("runjags")
require("tictoc")
tic("runDiffmHetv")
# Set up some constants to be assembled and passed to JAGS:
z = rowSums(yMat)
nCases = nrow(yMat)
nYlevels = ncol(yMat)
# Threshold vector with indices 1 and nYlevels-1 fixed; other thresholds are
# estimated. This allows all parameters to be interpretable on the response
# scale, relative to 1.5 and K-0.5.
thresh = rep(NA,nYlevels-1)
thresh[1] = 1 + 0.5
thresh[nYlevels-1] = nYlevels-1 + 0.5
# Specify the model in JAGS/BUGS language:
modelStringDiffmHetv = "
model {
# likelihood:
for ( i in 1:nCases ) {
y[i, ] ~ dmulti( pr[i,1:nYlevels] , z[i] )
pr[i,1] <- pnorm( thresh[1] , mu[i] , 1/sigma[i]^2 )
for ( k in 2:(nYlevels-1) ) {
pr[i,k] <- max( 0 ,
pnorm( thresh[ k ] ,mu[i],1/sigma[i]^2)
- pnorm( thresh[k-1] ,mu[i],1/sigma[i]^2) )
}
pr[i,nYlevels] <- 1-pnorm( thresh[nYlevels-1] ,mu[i],1/sigma[i]^2)
}
# prior:
param[1:6] ~ dmnorm.vcov( paramM[1:6] , paramVCOV[1:6,1:6] )
mu[1] <- param[1]
mu[2] <- param[2]
sigma[1] <- exp(param[3])
sigma[2] <- exp(param[4])
thresh[2] <- param[5]
thresh[3] <- param[6]
}
" # close modelString
# Set RNG seeds for reproducibility:
if ( initsList=="setDefSeed" ) {
initsList = list()
for( i in 1:nChains ) {
initsList = c( initsList , list(list(.RNG.seed=i)) )
}
}
# Assemble the info into a list that subsequently gets passed to runjags:
JagsDataListDiffmHetv = list(
# data:
# y = yMat , # combine later depending on priorOnly
nYlevels = nYlevels ,
nCases = nCases ,
z = z ,
# prior constants:
thresh = thresh ,
paramM = paramM ,
paramVCOV = paramVCOV
) # close list
if ( priorOnly==FALSE ) { # include y
JagsDataListDiffmHetv = c( JagsDataListDiffmHetv , list(y=yMat) )
}
if ( priorOnly==TRUE ) { # include y as NA's
JagsDataListDiffmHetv = c( JagsDataListDiffmHetv , list(y=NA*yMat) )
}
# Finally, run JAGS:
runJagsOut <- run.jags( method="parallel" ,
model=modelStringDiffmHetv ,
data=JagsDataListDiffmHetv ,
# For bridge_sampler, only monitor estimated
# parameters and no fixed values
monitor= c("mu","sigma","thresh[2]","thresh[3]") ,
inits=initsList ,
n.chains=nChains ,
adapt=adaptSteps ,
burnin=burnInSteps ,
sample=ceiling(numSavedSteps/nChains) ,
thin=thinSteps ,
summarise=FALSE ,
plots=FALSE )
toc() # print duration
return( runJagsOut )
} # end function definition
R code for restricted model with equal means, heterogeneous variances. As with the full model, the function takes three types of inputs: (i) data, (ii) prior constants, (iii) MCMC run constants. Click the “Code” button in the HTML document to see the R code.
runEqmHetv = function( yMat , # matrix of counts: cases x ordinal levels
paramM , # mu,logsigma[1:2],thresh[2:3]
paramVCOV ,
adaptSteps = 500 ,
burnInSteps = 1000 ,
numSavedSteps = 50000 ,
thinSteps = 10 , # because strong autocorrelation
nChains = 4 ,
initsList = "setDefSeed" ,
priorOnly = FALSE
) {
require("runjags")
require("tictoc")
tic("runEqmHetv")
# Set up some constants to be assembled and passed to JAGS:
z = rowSums(yMat)
nCases = nrow(yMat)
nYlevels = ncol(yMat)
# Threshold vector with indices 1 and nYlevels-1 fixed; other thresholds are
# estimated. This allows all parameters to be interpretable on the response
# scale, relative to 1.5 and K-0.5.
thresh = rep(NA,nYlevels-1)
thresh[1] = 1 + 0.5
thresh[nYlevels-1] = nYlevels-1 + 0.5
# Specify the model in JAGS/BUGS language:
modelStringEqmHetv = "
model {
# likelihood:
for ( i in 1:nCases ) {
y[i, ] ~ dmulti( pr[i,1:nYlevels] , z[i] )
pr[i,1] <- pnorm( thresh[1] , mu , 1/sigma[i]^2 )
for ( k in 2:(nYlevels-1) ) {
pr[i,k] <- max( 0 ,
pnorm( thresh[ k ] ,mu,1/sigma[i]^2)
- pnorm( thresh[k-1] ,mu,1/sigma[i]^2) )
}
pr[i,nYlevels] <- 1-pnorm( thresh[nYlevels-1] ,mu,1/sigma[i]^2)
}
# prior:
param[1:5] ~ dmnorm.vcov( paramM[1:5] , paramVCOV[1:5,1:5] )
mu <- param[1]
sigma[1] <- exp(param[2])
sigma[2] <- exp(param[3])
thresh[2] <- param[4]
thresh[3] <- param[5]
}
" # close modelString
# Set RNG seeds for reproducibility:
if ( initsList=="setDefSeed" ) {
initsList = list()
for( i in 1:nChains ) {
initsList = c( initsList , list(list(.RNG.seed=i)) )
}
}
# Assemble the info into a list that subsequently gets passed to runjags:
JagsDataListEqmHetv = list(
# data:
# y = yMat , # combine later depending on priorOnly
nYlevels = nYlevels ,
nCases = nCases ,
z = z ,
# prior constants:
thresh = thresh ,
paramM = paramM ,
paramVCOV = paramVCOV
) # close list
if ( priorOnly==FALSE ) { # include y
JagsDataListEqmHetv = c( JagsDataListEqmHetv , list(y=yMat) )
}
if ( priorOnly==TRUE ) { # include y as NA's
JagsDataListEqmHetv = c( JagsDataListEqmHetv , list(y=NA*yMat) )
}
# Finally, run JAGS:
runJagsOut <- run.jags( method="parallel" ,
model=modelStringEqmHetv ,
data=JagsDataListEqmHetv ,
# For bridge_sampler, only monitor estimated
# parameters and no fixed values
monitor= c("mu","sigma","thresh[2]","thresh[3]") ,
inits=initsList ,
n.chains=nChains ,
adapt=adaptSteps ,
burnin=burnInSteps ,
sample=ceiling(numSavedSteps/nChains) ,
thin=thinSteps ,
summarise=FALSE ,
plots=FALSE )
toc() # print duration
return( runJagsOut )
} # end function definition
R code for restricted model with different means, homogeneous variances. As with the full model, the function takes three types of inputs: (i) data, (ii) prior constants, (iii) MCMC run constants. Click the “Code” button in the HTML document to see the R code.
runDiffmHomv = function( yMat , # matrix of counts: cases x ordinal levels
paramM , # mu[1:2],logsigma,thresh[2:3]
paramVCOV ,
adaptSteps = 500 ,
burnInSteps = 1000 ,
numSavedSteps = 50000 ,
thinSteps = 10 , # because strong autocorrelation
nChains = 4 ,
initsList = "setDefSeed" ,
priorOnly = FALSE
) {
require("runjags")
require("tictoc")
tic("runDiffmHomv")
# Set up some constants to be assembled and passed to JAGS:
z = rowSums(yMat)
nCases = nrow(yMat)
nYlevels = ncol(yMat)
# Threshold vector with indices 1 and nYlevels-1 fixed; other thresholds are
# estimated. This allows all parameters to be interpretable on the response
# scale, relative to 1.5 and K-0.5.
thresh = rep(NA,nYlevels-1)
thresh[1] = 1 + 0.5
thresh[nYlevels-1] = nYlevels-1 + 0.5
# Specify the model in JAGS/BUGS language:
modelStringDiffmHomv = "
model {
# likelihood:
for ( i in 1:nCases ) {
y[i, ] ~ dmulti( pr[i,1:nYlevels] , z[i] )
pr[i,1] <- pnorm( thresh[1] , mu[i] , 1/sigma^2 )
for ( k in 2:(nYlevels-1) ) {
pr[i,k] <- max( 0 ,
pnorm( thresh[ k ] ,mu[i],1/sigma^2)
- pnorm( thresh[k-1] ,mu[i],1/sigma^2) )
}
pr[i,nYlevels] <- 1-pnorm( thresh[nYlevels-1] ,mu[i],1/sigma^2)
}
# prior:
param[1:5] ~ dmnorm.vcov( paramM[1:5] , paramVCOV[1:5,1:5] )
mu[1] <- param[1]
mu[2] <- param[2]
sigma <- exp(param[3])
thresh[2] <- param[4]
thresh[3] <- param[5]
}
" # close modelString
# Set RNG seeds for reproducibility:
if ( initsList=="setDefSeed" ) {
initsList = list()
for( i in 1:nChains ) {
initsList = c( initsList , list(list(.RNG.seed=i)) )
}
}
# Assemble the info into a list that subsequently gets passed to runjags:
JagsDataListDiffmHomv = list(
# data:
# y = yMat , # combine later depending on priorOnly
nYlevels = nYlevels ,
nCases = nCases ,
z = z ,
# prior constants:
thresh = thresh ,
paramM = paramM ,
paramVCOV = paramVCOV
) # close list
if ( priorOnly==FALSE ) { # include y
JagsDataListDiffmHomv = c( JagsDataListDiffmHomv , list(y=yMat) )
}
if ( priorOnly==TRUE ) { # include y as NA's
JagsDataListDiffmHomv = c( JagsDataListDiffmHomv , list(y=NA*yMat) )
}
# Finally, run JAGS:
runJagsOut <- run.jags( method="parallel" ,
model=modelStringDiffmHomv ,
data=JagsDataListDiffmHomv ,
# For bridge_sampler, only monitor estimated
# parameters and no fixed values
monitor= c("mu","sigma","thresh[2]","thresh[3]") ,
inits=initsList ,
n.chains=nChains ,
adapt=adaptSteps ,
burnin=burnInSteps ,
sample=ceiling(numSavedSteps/nChains) ,
thin=thinSteps ,
summarise=FALSE ,
plots=FALSE )
toc() # print duration
return( runJagsOut )
} # end function definition
The remaining sections proceed as follows (see also the dynamic table of contents floating in the left margin when viewing the HTML version):
(Cf. BARG Step 1.C.)
The constants for the prior are chosen to be generically broad relative to the response scale.
# Specify constants for prior:
paramM_DiffmHetvBroad = c( "mu[1]"=3.0 , "mu[2]"=3.0 ,
"logsigma[1]"=log(2.0) , "logsigma[2]"=log(2.0) ,
"thresh[2]"=2.5 , "thresh[3]"=3.5 )
paramVCOV_DiffmHetvBroad = diag( c( 2^2 , 2^2 ,
log(2)^2 , log(2)^2 ,
2^2 , 2^2 ) )
dimnames(paramVCOV_DiffmHetvBroad) = list( names( paramM_DiffmHetvBroad ) ,
names( paramM_DiffmHetvBroad ) )
The value for \(M_{\mu}\) is set at the middle of the latent scale, at 3. The value for its uncertainty, \(V_{\mu}\), is set to be very large relative to the latent scale, at SD=2 which is V=4. The value for \(M_{\sigma}\) is set to be middling on the latent scale, at 2, which is \(M_{\log(\sigma)}\)=0.693. The uncertainty of \(M_{\log(\sigma)}\) is set to be large, at \(SD_{\sigma}\)=2 which is \(V_{\log(\sigma)}\)=0.48. Similarly, the prior for the thresholds sets \(M_{\theta_k} = k+1\), with high uncertainty on the latent scale such that \(SD_{\theta}\)=2 or \(V_{\theta}\)=4.
The broad-prior covariances are all set at zero. Note that setting a prior covariance to zero does not prevent the posterior distribution from having a strong correlation of parameters; the prior merely does not impose a correlation on the parameters.
In summary, the numerical settings of the broad prior constants are displayed here:
cat("Broad prior mean vector:")
## Broad prior mean vector:
print(round(paramM_DiffmHetvBroad,3))
## mu[1] mu[2] logsigma[1] logsigma[2] thresh[2] thresh[3]
## 3.000 3.000 0.693 0.693 2.500 3.500
cat("Broad prior covariance matrix:")
## Broad prior covariance matrix:
print(round(paramVCOV_DiffmHetvBroad,3))
## mu[1] mu[2] logsigma[1] logsigma[2] thresh[2] thresh[3]
## mu[1] 4 0 0.00 0.00 0 0
## mu[2] 0 4 0.00 0.00 0 0
## logsigma[1] 0 0 0.48 0.00 0 0
## logsigma[2] 0 0 0.00 0.48 0 0
## thresh[2] 0 0 0.00 0.00 4 0
## thresh[3] 0 0 0.00 0.00 0 4
(Cf. BARG Step 1.E.)
The broad prior was specified abstractly to be symmetric on the response scale and to provide opportunity for extreme responses on either end of the scale (which can happen in ratings data). The broad prior is not intended to mimic any particular realistic response distribution. Nevertheless, a prior predictive check can at least verify the intended symmetry and extremity of the prior.
# Generate MCMC from prior:
if ( runNewMCMC ) {
runjagsDiffmHetvBroadPrior = runDiffmHetv(
yMatAct ,
priorOnly=TRUE , # N.B. !!!!!!!!!!!!!!!!
thinSteps=1 , # because no autocorrelation
paramM_DiffmHetvBroad ,
paramVCOV_DiffmHetvBroad )
save( runjagsDiffmHetvBroadPrior ,
file=paste0(saveFileDir,"/",filenameRoot,
"-runjagsDiffmHetvBroadPrior.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot,
"-runjagsDiffmHetvBroadPrior.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:31:27 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 0
## Unobserved stochastic nodes: 3
## Total graph size: 92
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 12500
## -------------------------------------------------| 12500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHetv: 7.66 sec elapsed
codaDiffmHetvBroadPrior = as.mcmc.list( runjagsDiffmHetvBroadPrior )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHetvBroadPrior = as.matrix(codaDiffmHetvBroadPrior,chains=TRUE)
Show prior predictions:
postPredPlot( dataMat=yMatAct ,
FUN=predDiffmHetv ,
mcmcMat=mcmcMatDiffmHetvBroadPrior[,c("mu[1]","mu[2]",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Above, it can be seen that the broad prior (blue dots are medians, bars are 95% ETI’s) is indeed symmetric on the rating scale, and provides opportunity for either end of the scale to have a high probability.
(Cf. BARG Step 2.B and 2.C.)
Run the MCMC:
if ( runNewMCMC ) {
runjagsDiffmHetvBroadPost = runDiffmHetv(
yMatAct ,
# large thinSteps because strong autocorrelation,
# but weirdly must have thinSteps <=18 for gelman.diag()
thinSteps=18 ,
paramM = paramM_DiffmHetvBroad ,
paramVCOV = paramVCOV_DiffmHetvBroad )
save( runjagsDiffmHetvBroadPost ,
file=paste0(saveFileDir,"/",filenameRoot,
"-runjagsDiffmHetvBroadPost.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot,
"-runjagsDiffmHetvBroadPost.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:31:37 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 92
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 225000
## -------------------------------------------------| 225000
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHetv: 10.61 sec elapsed
We must check that the MCMC chains for every parameter have converged and are long enough to provide stable estimates.
Convergence here is indicated by the potential scale reduction factor [PSRF; Brooks & Gelman (1998)], which must be near 1.0 to indicate convergence. (Some practitioners deem psrf\(<1.1\) to be acceptable but I prefer psrf\(<1.05\).)
The effective length of an MCMC chain is indicated by the effective sample size (ESS), which refers to the sample size of the MCMC chain not to the sample size of the data. For reasonably stable estimates of the limits of 95% HDI’s, I recommend ESS\(>10,000\) (Kruschke, 2015, sec. 7.5.2).
There are different specific formulations of PSRF and ESS; in this
application the PSRF and ESS are computed using functions supplied in
the coda package (Plummer et al., 2006),
specifically gelman.diag() for PSRF and
effectiveSize() for ESS.
The table below has a row for every estimated parameter in the model. The columns indicate MCMC diagnostics and estimated values. Column headers are as follows:
codaDiffmHetvBroadPost = as.mcmc.list( runjagsDiffmHetvBroadPost )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHetvBroadPost = as.matrix(codaDiffmHetvBroadPost,chains=TRUE)
diagSumDiffmHetvBroadPost = diagSummary( codaDiffmHetvBroadPost )
DT::datatable( round(diagSumDiffmHetvBroadPost,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumDiffmHetvBroadPost) != "thresh[1]"
& rownames(diagSumDiffmHetvBroadPost) != "thresh[4]" )
maxPSRF = max(diagSumDiffmHetvBroadPost[estimatedRows,c("psrfPt")])
minESS = min(diagSumDiffmHetvBroadPost[estimatedRows,c("ESS")])
From the table above, we observe
(Cf. BARG Step 3.A.)
Here we view the model’s posterior predictions visually superimposed on the data.
The graph below shows the data (as bars) with posterior-predicted median probabilities (as circles) and their 95% ETI’s (as vertical segments). The ETI’s are for the posterior counts of each rating, including both the uncertainty of the estimated probability and the randomness of the sample. It can be seen that the model provides a reasonably accurate description of the data.
postPredPlot( dataMat=yMatAct ,
FUN=predDiffmHetv ,
mcmcMat=mcmcMatDiffmHetvBroadPost[,c("mu[1]","mu[2]",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
(Cf. BARG Step 3.B.)
The table in the previous section reported numerical details of mode and HDI, and of median and ETI, for all parameters.
As mentioned previously, primary interest is in the magnitudes and uncertainties of the means and their difference, and in the magnitudes and uncertainties of the standard deviations and their difference. The difference of means can also be converted to effect size, measured as Cohen’s \(d = \big(\mu_1-\mu_2\big)\Big/\sqrt{(\sigma_1^2+\sigma_2^2)/2}\), which is on the scale of pooled SD units.
Difference of means:
ppdInfo = plotPostDiff( mcmc1=mcmcMatDiffmHetvBroadPost[,"mu[1]"] ,
mcmc2=mcmcMatDiffmHetvBroadPost[,"mu[2]"] ,
name1="mu[1]" , name2="mu[2]" ,
rope1=NULL , rope2=NULL , ropeD=c(-0.25,0.25) )
DT::datatable( round(ppdInfo,3) )
Effect size (Cohen’s d), \((\mu_1 - \mu_2)\Big/\sqrt{(\sigma_1^2+\sigma_2^2)/2}\):
diff = ( mcmcMatDiffmHetvBroadPost[,"mu[1]"] - mcmcMatDiffmHetvBroadPost[,"mu[2]"] )
poolsd = sqrt( ( mcmcMatDiffmHetvBroadPost[,"sigma[1]"]^2
+ mcmcMatDiffmHetvBroadPost[,"sigma[2]"]^2 ) / 2 )
esInfo = plotPost( diff/poolsd ,
name="Effect Size" ,
rope=c(-0.1,0.1) )
DT::datatable( round(esInfo,3) )
Difference of standard deviations:
ppdInfo = plotPostDiff( mcmc1=mcmcMatDiffmHetvBroadPost[,"sigma[1]"] ,
mcmc2=mcmcMatDiffmHetvBroadPost[,"sigma[2]"] ,
name1="sigma[1]" , name2="sigma[2]" ,
rope1=NULL , rope2=NULL , ropeD=c(-0.25,0.25) )
DT::datatable( round(ppdInfo,3) )
Thus, the two movies have nearly the same (latent) mean rating but they have very different (latent) SD’s, with movie 1 having a much larger SD than movie 2. The analysis suggests that the primary difference between the movies is the range (SD) of reactions they evoke, not the central tendency of those reactions. For movie 2 there is a more consistent reaction than for movie 1.
Decisions:
(Cf. BARG Step 4.C.)
Some of the posterior distributions graphed above include annotations that mark a region of practical equivalence (ROPE) to zero difference. These ROPE’s are used only for making decisions and are not an inherent part of the posterior distribution. According to different decision rules, the null value of zero difference can be rejected if the 95% credible interval falls completely outside the ROPE or if the percentage of the posterior distribution inside the ROPE is sufficiently small (e.g., less than 2.5%). The null value of zero difference can be accepted for practical purposes if the 95% credible interval falls completely inside the ROPE or if the percentage of the posterior distribution inside the ROPE is sufficiently large (e.g., greater than 95%) (Kruschke, 2018). In this application, the ROPE on the scale of effect size (Cohen’s d) was set at \(-0.1\) to \(+0.1\) because \(d=0.1\) is half of a typical “small” effect size in the social sciences as suggested by Cohen (1988). For ROPE’s on the threshold-pinned latent scale, a difference of 0.25 was deemed negligible for practical purposes (especially in light of the huge variability of ratings across people).
If a decision about the difference of means were to be made based on the indicated ROPE’s, we would withhold a decision because either (i) the percentage of the posterior distribution of effect size or difference of means, while large, does not exceed a criterion of 95%, or (ii) the 95% CI does not fall entirely inside the ROPE.
If a decision about the difference of SD’s were to be made based on the indicated ROPE, we would decide that the standard deviations are different because either (i) the percentage of the posterior distribution of the difference inside the ROPE is tiny (less than 2.5%), or (ii) the 95% CI of the difference falls well outside the ROPE.
Despite the accurate fit of the model to the data, there is no claim that this model is the correct description of the ratings. This model implies that the larger-SD movie has more extremely-high latent-scale evaluations than the smaller-SD movie, but those extremely-high latent-scale evaluations are censored (in the statistical sense) by the threshold for the highest response level. This suggests that if the response scale provided more levels for extremely high ratings, then the larger-SD movie would show more extremely high ratings than the smaller-SD movie.
If the goal of the analysis were to estimate parameters, and not to do model comparisons for hypothesis tests, then the analysis is nearly finished, as the next sub-steps just check the sensitivity of the posterior to the choice of prior. The estimates of means and SD’s shown above closely match the estimates reported by Liddell & Kruschke (2018), which used a different broad prior. Additional sensitivity analysis is provided below, after re-doing the analysis with other priors.
One approach to specifying priors is to directly intuit plausible values for the parameters and their uncertainty. I believe this approach is fraught with perils. For example, the degree of correlation between parameters is far from intuitive (to me), yet the correlation of parameters is quite strong in this application (as will be shown below). Moreover, domain-specific values for the means of threshold parameters and their uncertainties are not at all intuitive (to me). As another example, consider analogous parameters that are used across different models, such as the means in the heterogeneous-variance and homogeneous-variance models. Those means should not necessarily have the same uncertainties in the different models, and it is not intuitively obvious how different the uncertainties should be. Recall that the full model has 27 distinct constants in its prior, and setting all of those constants by intuition or even by theory seems highly implausible.
By contrast, a different approach that yields automatically consistent and meaningful priors is to start with a small set of representative data and inform all the models with that small set of representative data. In this scheme, every model begins with an extremely broad, diffuse, non-committal prior that is updated with the small set of representative data. Because the initial prior is so diffuse, its influence can be effectively erased (in typical models with modest numbers of parameters relative to the size of the small representative data set). The resulting posterior acts as the mildly informed prior for analyzing the actual data. By starting all the models this way, the models are equally informed and they are equivalently “tuned” to the same initial assumptions, not relying on user intuitions to attempt to set priors comparably across models. (For more discussion and an application to power analysis, see Section 13.2.5 of Kruschke, 2015.)
# Create small representative (fictional) data set for informing priors.
# First consider column sums of star-rating frequencies across all movies other
# than those being analyzed:
origData = read.csv( "MoviesData.csv" )
ratingTotal = colSums( origData[ !(rownames(origData) %in% rownames(yMatAct))
, c("n1","n2","n3","n4","n5") ] )
ratingProp = ratingTotal / sum(ratingTotal)
To create a representative data set, I first summed the frequencies of ratings across all 34 the movies in the original set other than the two movies being analyzed to create a representative profile of ratings for a movie. Notice this summation automatically gives movies with many ratings more influence than movies with fewer ratings. Across the 34 movies, there were a total of 283605 ratings with an average of N=8341.3 ratings per movie. Then I used a small \(N\) that would mimic the average ratings profile while giving the smallest-probability rating a non-zero count. (Non-zero counts are required for the thresholds between levels to be estimated.) The smallest \(N\) for all non-zero counts was \(N=26\), but an \(N\) that better represented the proportions was \(N=32\).
# Now find smallest N that mimics the frequency distribution and gives lowest
# frequencies a non-zero count. Must have each level represented by at least 1
# instance, otherwise thresholds are not definable. It turns out to be n=17,
# but that makes only 1 instance in each of first three levels, whereas level
# three (n3) is actually twice as probable as levels one and two (n1 and n2).
# The smallest n that yields n3=2 is n=25 (yielding an actual total n of 26).
# Or, use n that better matches empirical proportions: n=32.
fPrior32 = round(ratingProp*32)
The following graph shows the actual proportion of each rating from the full data set and the small-sample proportion achieved by N=32:
# Plot:
layout(matrix(1))
plot( ratingProp , ylim=c(0,1) , pch="+" , cex=2 , col="red" ,
xlab="Rating" , ylab="p(Rating)" , main="Representative Data")
grid()
points( fPrior32/sum(fPrior32) , pch="o" , col="blue" )
legend( x="topleft" ,
legend=c(paste0("Real Data (N=",sum(ratingTotal),")") ,
paste0( "Small Data (N=",sum(fPrior32),")") ) ,
pch=c("+","o") , col=c("red","blue") , text.col=c("red","blue") )
Here (below) is the small representative data set that will be used to inform all the models. Each row is a case (i.e., movie), each column is an ordinal rating (1-5), and each cell is the count for the corresponding case and rating.
yMatOther = rbind( fPrior32 , fPrior32 )
yMatOther
## n1 n2 n3 n4 n5
## fPrior32 1 1 2 6 22
## fPrior32 1 1 2 6 22
Notice that the data are identical for the two movies, and therefore this set of data (weakly) favors a model with equal means and equal variances. These data will be used to inform every model, starting each with a very broad prior. The resulting posterior distributions will serve as mildly informed priors for subsequent model comparison. Thus, all models will be equivalently informed in the model comparison.
Now I run the broad-prior model with the small representative data. I use same broad prior as before; here is a review of the broad-prior constants:
print(round(paramM_DiffmHetvBroad,3))
## mu[1] mu[2] logsigma[1] logsigma[2] thresh[2] thresh[3]
## 3.000 3.000 0.693 0.693 2.500 3.500
print(round(paramVCOV_DiffmHetvBroad,3))
## mu[1] mu[2] logsigma[1] logsigma[2] thresh[2] thresh[3]
## mu[1] 4 0 0.00 0.00 0 0
## mu[2] 0 4 0.00 0.00 0 0
## logsigma[1] 0 0 0.48 0.00 0 0
## logsigma[2] 0 0 0.00 0.48 0 0
## thresh[2] 0 0 0.00 0.00 4 0
## thresh[3] 0 0 0.00 0.00 0 4
I run JAGS with the small representative data:
if ( runNewMCMC ) {
runjagsDiffmHetvOtherPrior = runDiffmHetv( yMatOther ,
paramM = paramM_DiffmHetvBroad ,
paramVCOV = paramVCOV_DiffmHetvBroad )
save( runjagsDiffmHetvOtherPrior ,
file=paste0(saveFileDir,"/",filenameRoot,
"-runjagsDiffmHetvOtherPrior.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot,
"-runjagsDiffmHetvOtherPrior.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:31:50 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 92
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 125000
## -------------------------------------------------| 125000
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHetv: 8.5 sec elapsed
Check MCMC diagnostics:
codaDiffmHetvOtherPrior = as.mcmc.list( runjagsDiffmHetvOtherPrior )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHetvOtherPrior = as.matrix(codaDiffmHetvOtherPrior,chains=TRUE)
diagSumDiffmHetvOtherPrior = diagSummary( codaDiffmHetvOtherPrior )
DT::datatable( round(diagSumDiffmHetvOtherPrior,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumDiffmHetvOtherPrior) != "thresh[1]"
& rownames(diagSumDiffmHetvOtherPrior) != "thresh[4]" )
maxPSRF = max(diagSumDiffmHetvOtherPrior[estimatedRows,c("psrfPt")])
minESS = min(diagSumDiffmHetvOtherPrior[estimatedRows,c("ESS")])
Posterior predictive check:
I check here that the posterior distribution accurately reflects the small representative data set. The graph below shows the data (as bars) with posterior-predicted probabilities (as circles) and their 95% ETI’s (as vertical segments). The ETI’s are for the posterior counts of each rating, including both the uncertainty of the estimated probability and the randomness of the sample. It can be seen that the model provides an accurate description of the data.
postPredPlot( dataMat=yMatOther ,
FUN=predDiffmHetv ,
mcmcMat=mcmcMatDiffmHetvOtherPrior[,c("mu[1]","mu[2]",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Examine posterior (from small other-movie data, broad prior):
This posterior distribution from the small representative data
will serve as the prior for subsequent analysis of the actual data.
Pair-wise scatter plots of parameters in MCMC are shown below, where it
can be seen that there can be strong covariation of mu[j]
with logsigma[j] and among the thresholds. The prior
distribution for subsequent analysis with the full data will approximate
the means and covariances of this mildly informed posterior
distribution.
# Compute log(sigma) and append to MCMC matrix:
mcmcMatDiffmHetvOtherPrior = cbind( mcmcMatDiffmHetvOtherPrior ,
"logsigma[1]" = log(mcmcMatDiffmHetvOtherPrior[,"sigma[1]"]) ,
"logsigma[2]" = log(mcmcMatDiffmHetvOtherPrior[,"sigma[2]"]) )
plotRows = seq( 1 , nrow(mcmcMatDiffmHetvOtherPrior) ,
length=min(c(2000,nrow(mcmcMatDiffmHetvOtherPrior))) )
psych::pairs.panels(
mcmcMatDiffmHetvOtherPrior[
plotRows ,
c("mu[1]","mu[2]","logsigma[1]","logsigma[2]","thresh[2]","thresh[3]") ] ,
smooth=FALSE , lm=TRUE , smoother=TRUE , gap=0.2 , # pch=20 ,
ellipses=TRUE , hist.col="skyblue" )
Next I compute and display the prior constants from the posterior of the small representative data. Note that these constants are covariances, not correlations, so they will not numerically equal the correlations in the plot above (although they are algebraically equivalent).
paramM_DiffmHetvOther = colMeans( mcmcMatDiffmHetvOtherPrior[ ,
c("mu[1]","mu[2]","logsigma[1]","logsigma[2]","thresh[2]","thresh[3]") ] )
paramVCOV_DiffmHetvOther = cov( mcmcMatDiffmHetvOtherPrior[ ,
c("mu[1]","mu[2]","logsigma[1]","logsigma[2]","thresh[2]","thresh[3]") ] )
Here are the prior constants:
print(round(paramM_DiffmHetvOther,3) )
## mu[1] mu[2] logsigma[1] logsigma[2] thresh[2] thresh[3]
## 5.495 5.494 0.793 0.794 2.251 3.099
print(round(paramVCOV_DiffmHetvOther,3) )
## mu[1] mu[2] logsigma[1] logsigma[2] thresh[2] thresh[3]
## mu[1] 0.354 -0.002 0.083 0.000 -0.006 -0.004
## mu[2] -0.002 0.348 0.000 0.082 -0.006 -0.003
## logsigma[1] 0.083 0.000 0.067 0.004 -0.017 -0.018
## logsigma[2] 0.000 0.082 0.004 0.066 -0.017 -0.019
## thresh[2] -0.006 -0.006 -0.017 -0.017 0.107 0.057
## thresh[3] -0.004 -0.003 -0.018 -0.019 0.057 0.100
If the mathematical prior were an exact copy of the posterior from the mildly informed fit, then a prior predictive check would be superfluous. But because the mathematical prior is merely a close approximation of the posterior from the mildly informed fit, a prior predictive check verifies that the prior really does mimic the small data as intended.
# Generate MCMC from prior:
if ( runNewMCMC ) {
runjagsDiffmHetvOtherPriorPred = runDiffmHetv(
yMatAct ,
priorOnly=TRUE , # N.B. !!!!!!!!!!!!!!!!
thinSteps=1 , # because no autocorrelation
paramM_DiffmHetvOther ,
paramVCOV_DiffmHetvOther )
save( runjagsDiffmHetvOtherPriorPred ,
file=paste0(saveFileDir,"/",filenameRoot,
"-runjagsDiffmHetvOtherPriorPred.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot,
"-runjagsDiffmHetvOtherPriorPred.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:32:03 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 0
## Unobserved stochastic nodes: 3
## Total graph size: 92
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 12500
## -------------------------------------------------| 12500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHetv: 7.23 sec elapsed
codaDiffmHetvOtherPriorPred = as.mcmc.list( runjagsDiffmHetvOtherPriorPred )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHetvOtherPriorPred = as.matrix(codaDiffmHetvOtherPriorPred,chains=TRUE)
Show prior predictions:
postPredPlot( dataMat=yMatOther ,
FUN=predDiffmHetv ,
mcmcMat=mcmcMatDiffmHetvOtherPriorPred[,c("mu[1]","mu[2]",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Above, it can be seen that the mildly-informed prior does mimic the small data.
Now I run actual data using the mildly informed prior. This full model is the same as the go-to model for parameter estimation, but now it will use a mildly informed prior. The posterior that results from the mildly-informed prior should be essentially the same as what would be obtained by starting with the broad prior but using data that merges the actual data with the small representative data set. Because the actual data set has N=533 per movie while the prior data set has N=32 per movie, the posterior from the mildly informed prior should be only a little different from the posterior from the broad prior. But for purposes of model comparison it can be important that all the models have priors that are equivalently and usefully informed.
Run the MCMC on the actual data with the mildly informed prior:
if ( runNewMCMC ) {
runjagsDiffmHetvOtherPost = runDiffmHetv(
yMatAct ,
thinSteps=15 , # because strong autocorrelation
paramM_DiffmHetvOther ,
paramVCOV_DiffmHetvOther )
save( runjagsDiffmHetvOtherPost ,
file=paste0(saveFileDir,"/",filenameRoot,
"-runjagsDiffmHetvOtherPost.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHetvOtherPost.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:32:12 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 92
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 187500
## -------------------------------------------------| 187500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHetv: 10.31 sec elapsed
codaDiffmHetvOtherPost = as.mcmc.list( runjagsDiffmHetvOtherPost )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHetvOtherPost = as.matrix(codaDiffmHetvOtherPost,chains=TRUE)
diagSumDiffmHetvOtherPost = diagSummary( codaDiffmHetvOtherPost )
DT::datatable( round(diagSumDiffmHetvOtherPost,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumDiffmHetvOtherPost) != "thresh[1]"
& rownames(diagSumDiffmHetvOtherPost) != "thresh[4]" )
maxPSRF = max(diagSumDiffmHetvOtherPost[estimatedRows,c("psrfPt")])
minESS = min(diagSumDiffmHetvOtherPost[estimatedRows,c("ESS")])
The graph below shows the data (as bars) with posterior-predicted probabilities (as circles) and their 95% ETI’s (as vertical segments). The ETI’s are for the posterior counts of each rating, including both the uncertainty of the estimated probability and the randomness of the sample. It can be seen that the model provides an accurate description of the data.
postPredPlot( dataMat=yMatAct ,
FUN=predDiffmHetv ,
mcmcMat=mcmcMatDiffmHetvOtherPost[,c("mu[1]","mu[2]",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Difference of means:
ppdInfo = plotPostDiff( mcmc1=mcmcMatDiffmHetvOtherPost[,"mu[1]"] ,
mcmc2=mcmcMatDiffmHetvOtherPost[,"mu[2]"] ,
name1="mu[1]" , name2="mu[2]" ,
rope1=NULL , rope2=NULL , ropeD=c(-0.25,0.25) )
DT::datatable( round(ppdInfo,3) )
Effect size (Cohen’s d), \((\mu_1 - \mu_2)\Big/\sqrt{(\sigma_1^2+\sigma_2^2)/2}\):
diff = ( mcmcMatDiffmHetvOtherPost[,"mu[1]"] - mcmcMatDiffmHetvOtherPost[,"mu[2]"] )
poolsd = sqrt( ( mcmcMatDiffmHetvOtherPost[,"sigma[1]"]^2
+ mcmcMatDiffmHetvOtherPost[,"sigma[2]"]^2 ) / 2 )
esInfo = plotPost( diff/poolsd ,
name="Effect Size" ,
rope=c(-0.1,0.1) )
DT::datatable( round(esInfo,3) )
Difference of standard deviations:
ppdInfo = plotPostDiff( mcmc1=mcmcMatDiffmHetvOtherPost[,"sigma[1]"] ,
mcmc2=mcmcMatDiffmHetvOtherPost[,"sigma[2]"] ,
name1="sigma[1]" , name2="sigma[2]" ,
rope1=NULL , rope2=NULL , ropeD=c(-0.25,0.25) )
DT::datatable( round(ppdInfo,3) )
Above, the posterior estimates of the parameters are very similar to the estimates obtained from using a broad prior. A direct comparison will be shown in a subsequent section on sensitivity analysis.
Here I apply the method of using a small subset from the actual data to inform the priors. This is related to the method of an ‘intrinsic’ Bayes factor (Berger & Pericchi, 1996) and an example application was reported by Kary et al. (2016). I modify the procedure somewhat by, instead of taking a random subset of the data to inform the priors, constructing a small representative data set that accurately mimics the full data. (Also, after informing the priors with the small representative data set, I will use the full data set instead of the complement of the data set after removing the representative data. I do this for consistency with the procedure that used data from other movies to inform the prior.) For each case (i.e., movie), the proportion of ratings at each level was multiplied by a selected small N and the result was rounded to the nearest integer. For consistency with the previous representative data set, I used N=32 in each case. Here is the result:
yMatActProp = yMatAct
yMatActProp[1,] = yMatAct[1,]/sum(yMatAct[1,])
yMatActProp[2,] = yMatAct[2,]/sum(yMatAct[2,])
N=32
yMatSubset = yMatAct
yMatSubset[1,] = round(yMatActProp[1,]*N)
yMatSubset[2,] = round(yMatActProp[2,]*N)
par(mfrow=c(1,2))
plot( 1:5 , yMatActProp[1,] , main="Case 1" ,
pch="+" , cex=2 , col="red" , ylim=c(0,.5) ,
xlab="Rating" , ylab="p(Rating)" )
points( 1:5 , yMatSubset[1,]/sum(yMatSubset[1,]) , pch="o" , col="blue" )
grid()
legend( x="topleft" ,
legend=c(paste0("Real Data (N=",sum(yMatAct[1,]),")") ,
paste0( "Small Data (N=",sum(yMatSubset[1,]),")") ) ,
pch=c("+","o") , col=c("red","blue") , text.col=c("red","blue") )
plot( 1:5 , yMatActProp[2,] , main="Case 2" ,
pch="+" , cex=2 , col="red" , ylim=c(0,.5) ,
xlab="Rating" , ylab="p(Rating)" )
points( 1:5 , yMatSubset[2,]/sum(yMatSubset[2,]) , pch="o" , col="blue" )
grid()
legend( x="topleft" ,
legend=c(paste0("Real Data (N=",sum(yMatAct[2,]),")") ,
paste0( "Small Data (N=",sum(yMatSubset[2,]),")") ) ,
pch=c("+","o") , col=c("red","blue") , text.col=c("red","blue") )
par(mfrow=c(1,1))
A frequency table of the small data set is shown below. The row labels (9, 15) are the case numbers of the movies.
yMatSubset
## n1 n2 n3 n4 n5
## 9 7 4 3 5 13
## 15 4 3 5 9 11
Notice that this small data set does not have equal means and equal variances (unlike the data set constructed from other movies).
Now I run the broad-prior model with small representative data. I use same broad prior as before. For review, here are the broad-prior constants:
print(round(paramM_DiffmHetvBroad,3))
## mu[1] mu[2] logsigma[1] logsigma[2] thresh[2] thresh[3]
## 3.000 3.000 0.693 0.693 2.500 3.500
print(round(paramVCOV_DiffmHetvBroad,3))
## mu[1] mu[2] logsigma[1] logsigma[2] thresh[2] thresh[3]
## mu[1] 4 0 0.00 0.00 0 0
## mu[2] 0 4 0.00 0.00 0 0
## logsigma[1] 0 0 0.48 0.00 0 0
## logsigma[2] 0 0 0.00 0.48 0 0
## thresh[2] 0 0 0.00 0.00 4 0
## thresh[3] 0 0 0.00 0.00 0 4
Run JAGS with the small representative data:
if ( runNewMCMC ) {
runjagsDiffmHetvSubsetPrior = runDiffmHetv(
yMatSubset ,
paramM = paramM_DiffmHetvBroad ,
paramVCOV = paramVCOV_DiffmHetvBroad )
save( runjagsDiffmHetvSubsetPrior ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHetvSubsetPrior.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHetvSubsetPrior.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:32:24 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 92
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 125000
## -------------------------------------------------| 125000
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHetv: 8.43 sec elapsed
MCMC diagnostics:
codaDiffmHetvSubsetPrior = as.mcmc.list( runjagsDiffmHetvSubsetPrior )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHetvSubsetPrior = as.matrix(codaDiffmHetvSubsetPrior,chains=TRUE)
diagSumDiffmHetvSubsetPrior = diagSummary( codaDiffmHetvSubsetPrior )
DT::datatable( round(diagSumDiffmHetvSubsetPrior,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumDiffmHetvSubsetPrior) != "thresh[1]"
& rownames(diagSumDiffmHetvSubsetPrior) != "thresh[4]" )
maxPSRF = max(diagSumDiffmHetvSubsetPrior[estimatedRows,c("psrfPt")])
minESS = min(diagSumDiffmHetvSubsetPrior[estimatedRows,c("ESS")])
Posterior predictive check:
We check here that the posterior distribution accurately reflects the small representative data set. The graph below shows the data (as bars) with posterior-predicted probabilities (as circles) and their 95% ETI’s (as vertical segments). The ETI’s are for the posterior counts of each rating, including both the uncertainty of the estimated probability and the randomness of the sample. It can be seen that the model provides an accurate description of the data.
postPredPlot( dataMat=yMatSubset ,
FUN=predDiffmHetv ,
mcmcMat=mcmcMatDiffmHetvSubsetPrior[,c("mu[1]","mu[2]",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Examine posterior:
This posterior distribution from the small representative data will serve as the prior for subsequent analysis of the actual data. Pair-wise scatter plots of parameters in MCMC are shown below, where it can be seen that there can be strong covariation among some parameters. The prior distribution for subsequent models will approximate the means and covariances of this mildly informed posterior distribution.
# Compute log(sigma) and append to MCMC matrix:
mcmcMatDiffmHetvSubsetPrior = cbind( mcmcMatDiffmHetvSubsetPrior ,
"logsigma[1]" = log(mcmcMatDiffmHetvSubsetPrior[,"sigma[1]"]) ,
"logsigma[2]" = log(mcmcMatDiffmHetvSubsetPrior[,"sigma[2]"]) )
plotRows = seq( 1 , nrow(mcmcMatDiffmHetvSubsetPrior) ,
length=min(c(2000,nrow(mcmcMatDiffmHetvSubsetPrior))) )
psych::pairs.panels(
mcmcMatDiffmHetvSubsetPrior[
plotRows ,
c("mu[1]","mu[2]","logsigma[1]","logsigma[2]","thresh[2]","thresh[3]") ] ,
smooth=FALSE , lm=TRUE , smoother=TRUE , gap=0.2 , # pch=20 ,
ellipses=TRUE , hist.col="skyblue" )
Now I compute the prior constants from the posterior of the small representative data:
paramM_DiffmHetvSubset = colMeans( mcmcMatDiffmHetvSubsetPrior[ ,
c("mu[1]","mu[2]","logsigma[1]","logsigma[2]","thresh[2]","thresh[3]") ] )
paramVCOV_DiffmHetvSubset = cov( mcmcMatDiffmHetvSubsetPrior[ ,
c("mu[1]","mu[2]","logsigma[1]","logsigma[2]","thresh[2]","thresh[3]") ] )
Here are the prior constants (again, these are covariances, not correlations, so they will not numerically equal the correlations in the pairs plot above):
print(round(paramM_DiffmHetvSubset,3) )
## mu[1] mu[2] logsigma[1] logsigma[2] thresh[2] thresh[3]
## 3.717 3.771 1.133 0.706 2.387 3.217
print(round(paramVCOV_DiffmHetvSubset,3) )
## mu[1] mu[2] logsigma[1] logsigma[2] thresh[2] thresh[3]
## mu[1] 0.383 0.004 0.021 -0.002 0.011 0.012
## mu[2] 0.004 0.163 0.000 0.012 0.013 0.017
## logsigma[1] 0.021 0.000 0.054 0.000 -0.003 -0.002
## logsigma[2] -0.002 0.012 0.000 0.039 -0.006 -0.004
## thresh[2] 0.011 0.013 -0.003 -0.006 0.058 0.033
## thresh[3] 0.012 0.017 -0.002 -0.004 0.033 0.060
If the mathematical prior were an exact copy of the posterior from the mildly informed fit, then a prior predictive check would be superfluous. But because the mathematical prior is merely a close approximation of the posterior from the mildly informed fit, a prior predictive check verifies that the prior really does mimic the small data as intended.
# Generate MCMC from prior:
if ( runNewMCMC ) {
runjagsDiffmHetvSubsetPriorPred = runDiffmHetv(
yMatAct ,
priorOnly=TRUE , # N.B. !!!!!!!!!!!!!!!!
thinSteps=1 , # because no autocorrelation
paramM_DiffmHetvSubset ,
paramVCOV_DiffmHetvSubset )
save( runjagsDiffmHetvSubsetPriorPred ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHetvSubsetPriorPred.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHetvSubsetPriorPred.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:32:37 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 0
## Unobserved stochastic nodes: 3
## Total graph size: 92
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 12500
## -------------------------------------------------| 12500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHetv: 7.5 sec elapsed
codaDiffmHetvSubsetPriorPred = as.mcmc.list( runjagsDiffmHetvSubsetPriorPred )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHetvSubsetPriorPred = as.matrix(codaDiffmHetvSubsetPriorPred,chains=TRUE)
Show prior predictions:
postPredPlot( dataMat=yMatSubset ,
FUN=predDiffmHetv ,
mcmcMat=mcmcMatDiffmHetvSubsetPriorPred[,c("mu[1]","mu[2]",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Above, it can be seen that the mildly-informed prior does mimic the small representative data.
Run the MCMC on the actual data with the mildly informed prior:
if ( runNewMCMC ) {
runjagsDiffmHetvSubsetPost = runDiffmHetv(
yMatAct ,
thinSteps=15 , # because strong autocorrelation
paramM_DiffmHetvSubset ,
paramVCOV_DiffmHetvSubset )
save( runjagsDiffmHetvSubsetPost ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHetvSubsetPost.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHetvSubsetPost.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:32:47 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 92
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 187500
## -------------------------------------------------| 187500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHetv: 10.62 sec elapsed
codaDiffmHetvSubsetPost = as.mcmc.list( runjagsDiffmHetvSubsetPost )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHetvSubsetPost = as.matrix(codaDiffmHetvSubsetPost,chains=TRUE)
diagSumDiffmHetvSubsetPost = diagSummary( codaDiffmHetvSubsetPost )
DT::datatable( round(diagSumDiffmHetvSubsetPost,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumDiffmHetvSubsetPost) != "thresh[1]"
& rownames(diagSumDiffmHetvSubsetPost) != "thresh[4]" )
maxPSRF = max(diagSumDiffmHetvSubsetPost[estimatedRows,c("psrfPt")])
minESS = min(diagSumDiffmHetvSubsetPost[estimatedRows,c("ESS")])
The graph below shows the data (as bars) with posterior-predicted probabilities (as circles) and their 95% ETI’s (as vertical segments). The ETI’s are for the posterior counts of each rating, including both the uncertainty of the estimated probability and the randomness of the sample. It can be seen that the model provides an accurate description of the data.
postPredPlot( dataMat=yMatAct ,
FUN=predDiffmHetv ,
mcmcMat=mcmcMatDiffmHetvSubsetPost[,c("mu[1]","mu[2]",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Difference of means:
ppdInfo = plotPostDiff( mcmc1=mcmcMatDiffmHetvSubsetPost[,"mu[1]"] ,
mcmc2=mcmcMatDiffmHetvSubsetPost[,"mu[2]"] ,
name1="mu[1]" , name2="mu[2]" ,
rope1=NULL , rope2=NULL , ropeD=c(-0.25,0.25) )
DT::datatable( round(ppdInfo,3) )
Effect size (Cohen’s d), \((\mu_1 - \mu_2)\Big/\sqrt{(\sigma_1^2+\sigma_2^2)/2}\):
diff = ( mcmcMatDiffmHetvSubsetPost[,"mu[1]"] - mcmcMatDiffmHetvSubsetPost[,"mu[2]"] )
poolsd = sqrt( ( mcmcMatDiffmHetvSubsetPost[,"sigma[1]"]^2
+ mcmcMatDiffmHetvSubsetPost[,"sigma[2]"]^2 ) / 2 )
esInfo = plotPost( diff/poolsd ,
name="Effect Size" ,
rope=c(-0.1,0.1) )
DT::datatable( round(esInfo,3) )
Difference of standard deviations:
ppdInfo = plotPostDiff( mcmc1=mcmcMatDiffmHetvSubsetPost[,"sigma[1]"] ,
mcmc2=mcmcMatDiffmHetvSubsetPost[,"sigma[2]"] ,
name1="sigma[1]" , name2="sigma[2]" ,
rope1=NULL , rope2=NULL , ropeD=c(-0.25,0.25) )
DT::datatable( round(ppdInfo,3) )
The parameter estimates, above, are very similar to those obtained from other priors. A direct comparison is presented in the following section on sensitivity analysis
(Cf. BARG Step 5.A, 5.B, 5.E.)
By inspecting the summary tables and graphs of the posterior distributions, above, it can be seen that there is very little change in the posterior estimation when starting with the mildy informed prior, relative to the broad prior.
MCMC distributions can be visually compared by superimposing their smoothed histograms. This is useful for visualizing density-based values such as mode and HDI. Unfortunately, smoothed histograms are inherently prone to approximation error due to the arbitrary width of the smoothing kernel, resulting in random “ripples” in the smoothed curve.
Another useful way to compare MCMC distributions is with the empirical cumulative distribution function (ECDF). (For example, ECDF’s are at the core of the Kolmogorov–Smirnov test of differences between distributions, but we will not be applying that frequentist method here.) One advantage of ECDF’s over smoothed histograms is that no kernel smoother or binning is involved, so there is no loss of resolution. Cumulative distribution functions are useful for visualizing cumulative values such as median and ETI.
Below are graphical comparisons of the marginal posteriors from the two priors. The plot of posterior density relies on smoothing by a kernel with width determined heuristically. The posterior density is a natural way to view the mode and HDI. The plot of cumulative probability does not rely on smoothing (but is effectively smoothed by the pixel resolution). The cumulative probability is a natural way to view the median and ETI.
The impact of the different priors is made clear by the plot, but the magnitude of difference relative to the credible intervals and relative to the ROPE is very small.
effSzBroad = ( ( mcmcMatDiffmHetvBroadPost[,"mu[1]"]
- mcmcMatDiffmHetvBroadPost[,"mu[2]"] )
/ sqrt( ( mcmcMatDiffmHetvBroadPost[,"sigma[1]"]^2
+ mcmcMatDiffmHetvBroadPost[,"sigma[2]"]^2 ) / 2 ) )
effSzOther = ( ( mcmcMatDiffmHetvOtherPost[,"mu[1]"]
- mcmcMatDiffmHetvOtherPost[,"mu[2]"] )
/ sqrt( ( mcmcMatDiffmHetvOtherPost[,"sigma[1]"]^2
+ mcmcMatDiffmHetvOtherPost[,"sigma[2]"]^2 ) / 2 ) )
effSzSubset = ( ( mcmcMatDiffmHetvSubsetPost[,"mu[1]"]
- mcmcMatDiffmHetvSubsetPost[,"mu[2]"] )
/ sqrt( ( mcmcMatDiffmHetvSubsetPost[,"sigma[1]"]^2
+ mcmcMatDiffmHetvSubsetPost[,"sigma[2]"]^2 ) / 2 ) )
par(mfrow=c(2,1))
plotGpdfs( cbind( Broad=effSzBroad , Other=effSzOther , Subset=effSzSubset ) ,
main="For Different Priors" ,
ylab="Posterior Density" ,
xlab="Effect Size (Cohen's d)" , rope=c(-0.1,0.1) )
plotGecdfs( cbind( Broad=effSzBroad , Other=effSzOther , Subset=effSzSubset ) ,
main="For Different Priors" ,
ylab="Posterior Cumulative Proportion" ,
xlab="Effect Size (Cohen's d)" , rope=c(-0.1,0.1) )
Above: Effect Size estimates from different priors. Broad = broad prior, Other = informed by other movies, Subset = informed by subset.
par(mfrow=c(1,1))
sigDiffBroad = ( mcmcMatDiffmHetvBroadPost[,"sigma[1]"]
- mcmcMatDiffmHetvBroadPost[,"sigma[2]"] )
sigDiffOther = ( mcmcMatDiffmHetvOtherPost[,"sigma[1]"]
- mcmcMatDiffmHetvOtherPost[,"sigma[2]"] )
sigDiffSubset = ( mcmcMatDiffmHetvSubsetPost[,"sigma[1]"]
- mcmcMatDiffmHetvSubsetPost[,"sigma[2]"] )
par(mfrow=c(2,1))
plotGpdfs( cbind( Broad=sigDiffBroad , Other=sigDiffOther , Subset=sigDiffSubset ) ,
main="For different priors" ,
ylab="Posterior Density" ,
xlab="sigma[1] - sigma[2]" , rope=c(-0.1,0.1) )
plotGecdfs( cbind( Broad=sigDiffBroad , Other=sigDiffOther , Subset=sigDiffSubset ) ,
main="For different priors" ,
ylab="Posterior Cumulative Proportion" ,
xlab="sigma[1] - sigma[2]" , rope=c(-0.1,0.1) )
Above: Standard deviation (difference) estimates from different priors. Broad = broad prior, Other = informed by other movies, Subset = informed by subset.
par(mfrow=c(1,1))
Notice above that the posterior estimates are virtually identical for the broad prior and the subset-informed prior.
The posterior estimates from the other-informed prior are slightly different, but only slightly relative to the uncertainty (width) of the distribution and relative to the ROPE. That is, the conclusion remains the same that there is a notable non-zero difference of standard deviations, but only a tiny difference of means if any. I suspect that the reason the other-informed prior has a visible effect is that this equal-SD prior pulls together the estimates of the two SD’s, thereby forcing the means to compensate. We will see later that the restricted equal-SD model does not fit the data very well.
If the goals of the analysis were to estimate parameters and assess null values from the posterior distribution of the estimates (not to do model comparisons for hypothesis tests), then the analysis is complete at this point. The remaining sections consider model comparisons for hypothesis tests.
This model allows heterogeneous variances but restricts there to be a single mean used simultaneously for all cases. This model is not necessarily directly informative because we are interested in estimating the magnitude of difference between means, even if that difference is small. But the model can be useful as a foil in model comparison. Moreover, it is (remotely) conceivable that ratings of the movies have an identical underlying mean, if, for example, the ratings were supplied by robots that did not actually watch the movies and instead gave random ratings from the same generating mean.
Because there is a single mean, the prior puts a multivariate normal on \(\left\langle \mu , \log(\sigma_1) , \log(\sigma_2) , \theta_1 , \theta_2 \right\rangle\).
The same numerical values are used as for the full model:
paramM_EqmHetvBroad = c( "mu"=3.0 ,
"logsigma[1]"=log(2.0) , "logsigma[2]"=log(2.0) ,
"thresh[2]"=2.5 , "thresh[3]"=3.5 )
paramVCOV_EqmHetvBroad = diag( c( 2^2 ,
log(2)^2 , log(2)^2 ,
2^2 , 2^2 ) )
dimnames(paramVCOV_EqmHetvBroad) = list( names( paramM_EqmHetvBroad ) ,
names( paramM_EqmHetvBroad ) )
print(round(paramM_EqmHetvBroad,3))
## mu logsigma[1] logsigma[2] thresh[2] thresh[3]
## 3.000 0.693 0.693 2.500 3.500
print(round(paramVCOV_EqmHetvBroad,3))
## mu logsigma[1] logsigma[2] thresh[2] thresh[3]
## mu 4 0.00 0.00 0 0
## logsigma[1] 0 0.48 0.00 0 0
## logsigma[2] 0 0.00 0.48 0 0
## thresh[2] 0 0.00 0.00 4 0
## thresh[3] 0 0.00 0.00 0 4
The broad prior was specified abstractly to be symmetric on the response scale and to provide opportunity for extreme responses on either end of the scale. The broad prior is not intended to mimic any particular realistic response distribution. Nevertheless, a prior predictive check can at least verify the intended symmetry and extremity of the prior.
# Generate MCMC from prior:
if ( runNewMCMC ) {
runjagsEqmHetvBroadPrior = runEqmHetv(
yMatAct ,
priorOnly=TRUE , # N.B. !!!!!!!!!!!!!!!!
thinSteps=1 , # because no autocorrelation
paramM_EqmHetvBroad ,
paramVCOV_EqmHetvBroad )
save( runjagsEqmHetvBroadPrior ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvBroadPrior.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvBroadPrior.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:33:07 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 0
## Unobserved stochastic nodes: 3
## Total graph size: 79
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 12500
## -------------------------------------------------| 12500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runEqmHetv: 7.24 sec elapsed
codaEqmHetvBroadPrior = as.mcmc.list( runjagsEqmHetvBroadPrior )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatEqmHetvBroadPrior = as.matrix(codaEqmHetvBroadPrior,chains=TRUE)
Show prior predictions:
postPredPlot( dataMat=yMatAct ,
FUN=predEqmHetv ,
mcmcMat=mcmcMatEqmHetvBroadPrior[,c("mu",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Above, it can be seen that the broad prior (blue dots are medians, bars are 95% ETI’s) is indeed symmetric on the rating scale, and provides opportunity for either end of the scale to have a high probability.
if ( runNewMCMC ) {
runjagsEqmHetvBroadPost = runEqmHetv(
yMat=yMatAct ,
thinSteps=15 , # because of strong autocorrelation
paramM = paramM_EqmHetvBroad ,
paramVCOV = paramVCOV_EqmHetvBroad )
save( runjagsEqmHetvBroadPost ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvBroadPost.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvBroadPost.Rdata") )
}
## Warning: You attempted to start parallel chains without setting different PRNG
## for each chain, which is not recommended. Different .RNG.name values have been
## added to each set of initial values.
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:33:15 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 79
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 187500
## -------------------------------------------------| 187500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runEqmHetv: 8.92 sec elapsed
codaEqmHetvBroadPost = as.mcmc.list( runjagsEqmHetvBroadPost )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatEqmHetvBroadPost = as.matrix(codaEqmHetvBroadPost,chains=TRUE)
diagSumEqmHetvBroadPost = diagSummary( codaEqmHetvBroadPost )
DT::datatable( round(diagSumEqmHetvBroadPost,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumEqmHetvBroadPost) != "thresh[1]"
& rownames(diagSumEqmHetvBroadPost) != "thresh[4]" )
maxPSRF = max(diagSumEqmHetvBroadPost[estimatedRows,c("psrfPt")])
minESS = min(diagSumEqmHetvBroadPost[estimatedRows,c("ESS")])
From the table above, we observe
postPredPlot( dataMat=yMatAct ,
FUN=predEqmHetv ,
mcmcMat=mcmcMatEqmHetvBroadPost[,c("mu",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
mInfo = plotPost( mcmcMatEqmHetvBroadPost[,"mu"] ,
name="mu (Equal mean model)" ,
rope=c(-0.25,0.25) )
DT::datatable( round(mInfo,3) )
ppdInfo = plotPostDiff( mcmc1=mcmcMatEqmHetvBroadPost[,"sigma[1]"] ,
mcmc2=mcmcMatEqmHetvBroadPost[,"sigma[2]"] ,
name1="sigma[1]" , name2="sigma[2]" ,
rope1=NULL , rope2=NULL , ropeD=c(-0.25,0.25) )
DT::datatable( round(ppdInfo,3) )
From the output above, it can be seen that the parameter estimates from the equal-mean heterogeneous-variance model are very similar to the estimates from the different-means heterogeneous variance model, which is not surprising because the two means estimated in the different-means model were very close to each other.
The small representative data set was described earlier, the first time it was introduced.
Run the Eqm,Hetv model with the small representative data set, starting with the broad prior:
if ( runNewMCMC ) {
runjagsEqmHetvOtherPrior = runEqmHetv(
yMat=yMatOther ,
paramM = paramM_EqmHetvBroad ,
paramVCOV = paramVCOV_EqmHetvBroad )
save( runjagsEqmHetvOtherPrior ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvOtherPrior.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvOtherPrior.Rdata") )
}
## Warning: You attempted to start parallel chains without setting different PRNG
## for each chain, which is not recommended. Different .RNG.name values have been
## added to each set of initial values.
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:33:26 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 79
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 125000
## -------------------------------------------------| 125000
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runEqmHetv: 8.54 sec elapsed
codaEqmHetvOtherPrior = as.mcmc.list( runjagsEqmHetvOtherPrior )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatEqmHetvOtherPrior = as.matrix(codaEqmHetvOtherPrior,chains=TRUE)
MCMC Diagnostics:
diagSumEqmHetvOtherPrior = diagSummary( codaEqmHetvOtherPrior )
DT::datatable( round(diagSumEqmHetvOtherPrior,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumEqmHetvOtherPrior) != "thresh[1]"
& rownames(diagSumEqmHetvOtherPrior) != "thresh[4]" )
maxPSRF = max(diagSumEqmHetvOtherPrior[estimatedRows,c("psrfPt")])
minESS = min(diagSumEqmHetvOtherPrior[estimatedRows,c("ESS")])
From the table above, we observe
Posterior predictive check for prior data:
postPredPlot( dataMat=yMatOther ,
FUN=predEqmHetv ,
mcmcMat=mcmcMatEqmHetvOtherPrior[,c("mu",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Above plot shows that the prior data are well represented by the model.
Now examine the posterior distribution. First, compute log(sigma):
# Compute log(sigma):
mcmcMatEqmHetvOtherPrior = cbind( mcmcMatEqmHetvOtherPrior ,
"logsigma[1]"=log(mcmcMatEqmHetvOtherPrior[,"sigma[1]"]) ,
"logsigma[2]"=log(mcmcMatEqmHetvOtherPrior[,"sigma[2]"]) )
Pairs plots:
plotRows = seq( 1 , nrow(mcmcMatEqmHetvOtherPrior) ,
length=min(c(2000,nrow(mcmcMatEqmHetvOtherPrior))) )
psych::pairs.panels(
mcmcMatEqmHetvOtherPrior[
plotRows ,
c("mu","logsigma[1]","logsigma[2]","thresh[2]","thresh[3]") ] ,
smooth=FALSE , lm=TRUE , smoother=TRUE , gap=0.2 , # pch=20 ,
ellipses=TRUE , hist.col="skyblue" )
Here are the M and V,C constants used for the informed prior:
# Extract posterior summary for use as subsequent prior:
paramM_EqmHetvOther = colMeans( mcmcMatEqmHetvOtherPrior[ ,
c("mu","logsigma[1]","logsigma[2]","thresh[2]","thresh[3]") ] )
paramVCOV_EqmHetvOther = cov( mcmcMatEqmHetvOtherPrior[ ,
c("mu","logsigma[1]","logsigma[2]","thresh[2]","thresh[3]") ] )
print(round(paramM_EqmHetvOther,3))
## mu logsigma[1] logsigma[2] thresh[2] thresh[3]
## 5.503 0.786 0.785 2.266 3.115
print(round(paramVCOV_EqmHetvOther,3))
## mu logsigma[1] logsigma[2] thresh[2] thresh[3]
## mu 0.189 0.049 0.049 -0.010 -0.008
## logsigma[1] 0.049 0.059 0.017 -0.019 -0.020
## logsigma[2] 0.049 0.017 0.059 -0.019 -0.020
## thresh[2] -0.010 -0.019 -0.019 0.109 0.057
## thresh[3] -0.008 -0.020 -0.020 0.057 0.099
If the mathematical prior were an exact copy of the posterior from the mildly informed fit, then a prior predictive check would be superfluous. But because the mathematical prior is merely a close approximation of the posterior from the mildly informed fit, a prior predictive check verifies that the prior really does mimic the small data as intended.
# Generate MCMC from prior:
if ( runNewMCMC ) {
runjagsEqmHetvOtherPriorPred = runEqmHetv(
yMatAct ,
priorOnly=TRUE , # N.B. !!!!!!!!!!!!!!!!
thinSteps=1 , # because no autocorrelation
paramM_EqmHetvOther ,
paramVCOV_EqmHetvOther )
save( runjagsEqmHetvOtherPriorPred ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvOtherPriorPred.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvOtherPriorPred.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:33:38 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 0
## Unobserved stochastic nodes: 3
## Total graph size: 79
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 12500
## -------------------------------------------------| 12500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runEqmHetv: 6.79 sec elapsed
codaEqmHetvOtherPriorPred = as.mcmc.list( runjagsEqmHetvOtherPriorPred )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatEqmHetvOtherPriorPred = as.matrix(codaEqmHetvOtherPriorPred,chains=TRUE)
Show prior predictions:
postPredPlot( dataMat=yMatOther ,
FUN=predEqmHetv ,
mcmcMat=mcmcMatEqmHetvOtherPriorPred[,c("mu",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Above, it can be seen that the mildly-informed prior does mimic the small data.
Run Eqm, Hetv model on actual data with informed prior:
if ( runNewMCMC ) {
runjagsEqmHetvOtherPost = runEqmHetv(
yMat=yMatAct ,
thinSteps=15 , # because of strong autocorrelation
paramM = paramM_EqmHetvOther ,
paramVCOV = paramVCOV_EqmHetvOther )
save( runjagsEqmHetvOtherPost ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvOtherPost.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvOtherPost.Rdata") )
}
## Warning: You attempted to start parallel chains without setting different PRNG
## for each chain, which is not recommended. Different .RNG.name values have been
## added to each set of initial values.
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:33:47 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 79
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 187500
## -------------------------------------------------| 187500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runEqmHetv: 9.88 sec elapsed
codaEqmHetvOtherPost = as.mcmc.list( runjagsEqmHetvOtherPost )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatEqmHetvOtherPost = as.matrix(codaEqmHetvOtherPost,chains=TRUE)
Diagnostics:
diagSumEqmHetvOtherPost = diagSummary( codaEqmHetvOtherPost )
DT::datatable( round(diagSumEqmHetvOtherPost,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumEqmHetvOtherPost) != "thresh[1]"
& rownames(diagSumEqmHetvOtherPost) != "thresh[4]" )
maxPSRF = max(diagSumEqmHetvOtherPost[estimatedRows,c("psrfPt")])
minESS = min(diagSumEqmHetvOtherPost[estimatedRows,c("ESS")])
From the table above, we observe
postPredPlot( dataMat=yMatAct ,
FUN=predEqmHetv ,
mcmcMat=mcmcMatEqmHetvOtherPost[,c("mu",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Posterior mu and sigma’s:
mInfo = plotPost( mcmcMatEqmHetvOtherPost[,"mu"] ,
name="mu (Equal mean model)" ,
rope=c(-0.25,0.25) )
DT::datatable( round(mInfo,3) )
ppdInfo = plotPostDiff( mcmc1=mcmcMatEqmHetvOtherPost[,"sigma[1]"] ,
mcmc2=mcmcMatEqmHetvOtherPost[,"sigma[2]"] ,
name1="sigma[1]" , name2="sigma[2]" ,
rope1=NULL , rope2=NULL , ropeD=c(-0.25,0.25) )
DT::datatable( round(ppdInfo,3) )
From the output above, it can be seen that the parameter estimates from the equal-mean heterogeneous-variance model are very similar to the estimates from the different-means heterogeneous-variance model, which is not surprising because the two means estimated in the different-means model were very close to each other (i.e., nearly equal).
The representative subset was described above, the first time it was used.
Run the model on the small representative data set using broad prior:
if ( runNewMCMC ) {
runjagsEqmHetvSubsetPrior = runEqmHetv(
yMat=yMatSubset ,
paramM = paramM_EqmHetvBroad ,
paramVCOV = paramVCOV_EqmHetvBroad )
save( runjagsEqmHetvSubsetPrior ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvSubsetPrior.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvSubsetPrior.Rdata") )
}
## Warning: You attempted to start parallel chains without setting different PRNG
## for each chain, which is not recommended. Different .RNG.name values have been
## added to each set of initial values.
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:34:00 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 79
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 125000
## -------------------------------------------------| 125000
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runEqmHetv: 7.86 sec elapsed
codaEqmHetvSubsetPrior = as.mcmc.list( runjagsEqmHetvSubsetPrior )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatEqmHetvSubsetPrior = as.matrix(codaEqmHetvSubsetPrior,chains=TRUE)
MCMC diagnostics:
diagSumEqmHetvSubsetPrior = diagSummary( codaEqmHetvSubsetPrior )
DT::datatable( round(diagSumEqmHetvSubsetPrior,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumEqmHetvSubsetPrior) != "thresh[1]"
& rownames(diagSumEqmHetvSubsetPrior) != "thresh[4]" )
maxPSRF = max(diagSumEqmHetvSubsetPrior[estimatedRows,c("psrfPt")])
minESS = min(diagSumEqmHetvSubsetPrior[estimatedRows,c("ESS")])
From the table above, we observe
Posterior predictive check of small data:
postPredPlot( dataMat=yMatSubset ,
FUN=predEqmHetv ,
mcmcMat=mcmcMatEqmHetvSubsetPrior[,c("mu",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Above plot shows that the prior data are well represented by the model.
Now examine the posterior distribution (which will be used as the prior for the actual data). First, compute log(sigma):
# Compute log(sigma):
mcmcMatEqmHetvSubsetPrior = cbind( mcmcMatEqmHetvSubsetPrior ,
"logsigma[1]"=log(mcmcMatEqmHetvSubsetPrior[,"sigma[1]"]) ,
"logsigma[2]"=log(mcmcMatEqmHetvSubsetPrior[,"sigma[2]"]) )
Pairs plots:
plotRows = seq( 1 , nrow(mcmcMatEqmHetvSubsetPrior) ,
length=min(c(2000,nrow(mcmcMatEqmHetvSubsetPrior))) )
psych::pairs.panels(
mcmcMatEqmHetvSubsetPrior[
plotRows ,
c("mu","logsigma[1]","logsigma[2]","thresh[2]","thresh[3]") ] ,
smooth=FALSE , lm=TRUE , smoother=TRUE , gap=0.2 , # pch=20 ,
ellipses=TRUE , hist.col="skyblue" )
Here are the \(M\), \(V\), and \(C\) constants used for the informed prior:
# Extract posterior summary for use as subsequent prior:
paramM_EqmHetvSubset = colMeans( mcmcMatEqmHetvSubsetPrior[ ,
c("mu","logsigma[1]","logsigma[2]","thresh[2]","thresh[3]") ] )
paramVCOV_EqmHetvSubset = cov( mcmcMatEqmHetvSubsetPrior[ ,
c("mu","logsigma[1]","logsigma[2]","thresh[2]","thresh[3]") ] )
print(round(paramM_EqmHetvSubset,3))
## mu logsigma[1] logsigma[2] thresh[2] thresh[3]
## 3.756 1.104 0.692 2.396 3.225
print(round(paramVCOV_EqmHetvSubset,3))
## mu logsigma[1] logsigma[2] thresh[2] thresh[3]
## mu 0.108 0.008 0.007 0.013 0.016
## logsigma[1] 0.008 0.051 0.001 -0.003 -0.002
## logsigma[2] 0.007 0.001 0.038 -0.006 -0.005
## thresh[2] 0.013 -0.003 -0.006 0.059 0.033
## thresh[3] 0.016 -0.002 -0.005 0.033 0.059
If the mathematical prior were an exact copy of the posterior from the mildly informed fit, then a prior predictive check would be superfluous. But because the mathematical prior is merely a close approximation of the posterior from the mildly informed fit, a prior predictive check verifies that the prior really does mimic the small data as intended.
# Generate MCMC from prior:
if ( runNewMCMC ) {
runjagsEqmHetvSubsetPriorPred = runEqmHetv(
yMatAct ,
priorOnly=TRUE , # N.B. !!!!!!!!!!!!!!!!
thinSteps=1 , # because no autocorrelation
paramM_EqmHetvSubset ,
paramVCOV_EqmHetvSubset )
save( runjagsEqmHetvSubsetPriorPred ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvSubsetPriorPred.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvSubsetPriorPred.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:34:12 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 0
## Unobserved stochastic nodes: 3
## Total graph size: 79
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 12500
## -------------------------------------------------| 12500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runEqmHetv: 7.25 sec elapsed
codaEqmHetvSubsetPriorPred = as.mcmc.list( runjagsEqmHetvSubsetPriorPred )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatEqmHetvSubsetPriorPred = as.matrix(codaEqmHetvSubsetPriorPred,chains=TRUE)
Show prior predictions:
postPredPlot( dataMat=yMatSubset ,
FUN=predEqmHetv ,
mcmcMat=mcmcMatEqmHetvSubsetPriorPred[,c("mu",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Above, it can be seen that the mildly-informed prior does mimic the small representative data.
Run Eqm, Hetv model on actual data with informed prior:
if ( runNewMCMC ) {
runjagsEqmHetvSubsetPost = runEqmHetv(
yMat=yMatAct ,
thinSteps=15 , # because of strong autocorrelation
paramM = paramM_EqmHetvSubset ,
paramVCOV = paramVCOV_EqmHetvSubset )
save( runjagsEqmHetvSubsetPost ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvSubsetPost.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsEqmHetvSubsetPost.Rdata") )
}
## Warning: You attempted to start parallel chains without setting different PRNG
## for each chain, which is not recommended. Different .RNG.name values have been
## added to each set of initial values.
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:34:21 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 79
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 187500
## -------------------------------------------------| 187500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runEqmHetv: 9.91 sec elapsed
codaEqmHetvSubsetPost = as.mcmc.list( runjagsEqmHetvSubsetPost )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatEqmHetvSubsetPost = as.matrix(codaEqmHetvSubsetPost,chains=TRUE)
Diagnostics:
diagSumEqmHetvSubsetPost = diagSummary( codaEqmHetvSubsetPost )
DT::datatable( round(diagSumEqmHetvSubsetPost,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumEqmHetvSubsetPost) != "thresh[1]"
& rownames(diagSumEqmHetvSubsetPost) != "thresh[4]" )
maxPSRF = max(diagSumEqmHetvSubsetPost[estimatedRows,c("psrfPt")])
minESS = min(diagSumEqmHetvSubsetPost[estimatedRows,c("ESS")])
From the table above, we observe
postPredPlot( dataMat=yMatAct ,
FUN=predEqmHetv ,
mcmcMat=mcmcMatEqmHetvSubsetPost[,c("mu",
"sigma[1]","sigma[2]",
"thresh[2]","thresh[3]")] )
Posterior mu and sigma’s:
mInfo = plotPost( mcmcMatEqmHetvSubsetPost[,"mu"] ,
name="mu (Equal mean model)" ,
rope=c(-0.25,0.25) )
DT::datatable( round(mInfo,3) )
ppdInfo = plotPostDiff( mcmc1=mcmcMatEqmHetvSubsetPost[,"sigma[1]"] ,
mcmc2=mcmcMatEqmHetvSubsetPost[,"sigma[2]"] ,
name1="sigma[1]" , name2="sigma[2]" ,
rope1=NULL , rope2=NULL , ropeD=c(-0.25,0.25) )
DT::datatable( round(ppdInfo,3) )
From the output above, it can be seen that the parameter estimates from the equal-mean heterogeneous-variance model are very similar to the estimates from the different-means heterogeneous variance model, which is not surprising because the two means estimated in the different-means model were very close to each other.
Below are graphical comparisons of the marginal posteriors from the three priors. The plot of posterior density relies on smoothing by a kernel with width determined heuristically. The posterior density is a natural way to view the mode and HDI. The plot of cumulative probability does not rely on smoothing (but is effectively smoothed by the pixel resolution). The cumulative probability is a natural way to view the median and ETI.
par(mfrow=c(2,1))
plotGpdfs( cbind( Broad=mcmcMatEqmHetvBroadPost[,"mu"] ,
Other=mcmcMatEqmHetvOtherPost[,"mu"] ,
Subset=mcmcMatEqmHetvSubsetPost[,"mu"] ) ,
main="For Different Priors" ,
ylab="Posterior Density" ,
xlab="mu" , rope=NULL )
plotGecdfs( cbind( Broad=mcmcMatEqmHetvBroadPost[,"mu"] ,
Other=mcmcMatEqmHetvOtherPost[,"mu"] ,
Subset=mcmcMatEqmHetvSubsetPost[,"mu"] ) ,
main="For Different Priors" ,
ylab="Posterior Cumulative Proportion" ,
xlab="mu" , rope=NULL )
par(mfrow=c(1,1))
sigDiffBroad = ( mcmcMatEqmHetvBroadPost[,"sigma[1]"]
- mcmcMatEqmHetvBroadPost[,"sigma[2]"] )
sigDiffOther = ( mcmcMatEqmHetvOtherPost[,"sigma[1]"]
- mcmcMatEqmHetvOtherPost[,"sigma[2]"] )
sigDiffSubset = ( mcmcMatEqmHetvSubsetPost[,"sigma[1]"]
- mcmcMatEqmHetvSubsetPost[,"sigma[2]"] )
par(mfrow=c(2,1))
plotGpdfs( cbind( Broad=sigDiffBroad , Other=sigDiffOther , Subset=sigDiffSubset ) ,
main="For different priors" ,
ylab="Posterior Density" ,
xlab="sigma[1] - sigma[2]" , rope=c(-0.1,0.1) )
plotGecdfs( cbind( Broad=sigDiffBroad , Other=sigDiffOther , Subset=sigDiffSubset ) ,
main="For different priors" ,
ylab="Posterior Cumulative Proportion" ,
xlab="sigma[1] - sigma[2]" , rope=c(-0.1,0.1) )
par(mfrow=c(1,1))
The impact of the different priors is made clear by the plot, but the magnitude of difference relative to the credible intervals and relative to the ROPE is very small. That is, the conclusion remains the same as the full model that there is a notable non-zero difference of standard deviations.
This model allows different means but restricts there to be a single variance used simultaneously for all cases. This model is not necessarily directly informative because we are interested in estimating the magnitude of difference between variances, even if that difference is small. But the model can be useful as a foil in model comparison, and an assumption of homogeneous variances is typical in conventional ordered-probit models. Moreover, it is (remotely) conceivable that ratings of the movies have an identical underlying variance, if, for example, the ratings were supplied by robots that did not actually watch the movies and instead gave random ratings from the same generating distributions.
Because there is a single variance, the prior puts a multivariate normal on \(\left\langle \mu_1 , \mu_2 , \log(\sigma) , \theta_1 , \theta_2 \right\rangle\).
The same numerical values are used as for the full model:
paramM_DiffmHomvBroad = c( "mu[1]"=3.0 , "mu[2]"=3.0 ,
"logsigma"=log(2.0) ,
"thresh[2]"=2.5 , "thresh[3]"=3.5 )
paramVCOV_DiffmHomvBroad = diag( c( 2^2 , 2^2 ,
log(2)^2 ,
2^2 , 2^2 ) )
dimnames(paramVCOV_DiffmHomvBroad) = list( names( paramM_DiffmHomvBroad ) ,
names( paramM_DiffmHomvBroad ) )
print(round(paramM_DiffmHomvBroad,3))
## mu[1] mu[2] logsigma thresh[2] thresh[3]
## 3.000 3.000 0.693 2.500 3.500
print(round(paramVCOV_DiffmHomvBroad,3))
## mu[1] mu[2] logsigma thresh[2] thresh[3]
## mu[1] 4 0 0.00 0 0
## mu[2] 0 4 0.00 0 0
## logsigma 0 0 0.48 0 0
## thresh[2] 0 0 0.00 4 0
## thresh[3] 0 0 0.00 0 4
The broad prior was specified abstractly to be symmetric on the response scale and to provide opportunity for extreme responses on either end of the scale. The broad prior is not intended to mimic any particular realistic response distribution. Nevertheless, a prior predictive check can at least verify the intended symmetry and extremity of the prior.
# Generate MCMC from prior:
if ( runNewMCMC ) {
runjagsDiffmHomvBroadPrior = runDiffmHomv(
yMatAct ,
priorOnly=TRUE , # N.B. !!!!!!!!!!!!!!!!
thinSteps=1 , # because no autocorrelation
paramM_DiffmHomvBroad ,
paramVCOV_DiffmHomvBroad )
save( runjagsDiffmHomvBroadPrior ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvBroadPrior.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvBroadPrior.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:34:41 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 0
## Unobserved stochastic nodes: 3
## Total graph size: 76
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 12500
## -------------------------------------------------| 12500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHomv: 7.28 sec elapsed
codaDiffmHomvBroadPrior = as.mcmc.list( runjagsDiffmHomvBroadPrior )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHomvBroadPrior = as.matrix(codaDiffmHomvBroadPrior,chains=TRUE)
Show prior predictions:
postPredPlot( dataMat=yMatAct ,
FUN=predDiffmHomv ,
mcmcMat=mcmcMatDiffmHomvBroadPrior[,c("mu[1]","mu[2]",
"sigma",
"thresh[2]","thresh[3]")] )
Above, it can be seen that the broad prior (blue dots are medians, bars are 95% ETI’s) is indeed symmetric on the rating scale, and provides opportunity for either end of the scale to have a high probability.
if ( runNewMCMC ) {
runjagsDiffmHomvBroadPost = runDiffmHomv(
yMat=yMatAct ,
thinSteps=15 , # because of strong autocorrelation
paramM = paramM_DiffmHomvBroad ,
paramVCOV = paramVCOV_DiffmHomvBroad )
save( runjagsDiffmHomvBroadPost ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvBroadPost.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvBroadPost.Rdata") )
}
## Warning: You attempted to start parallel chains without setting different PRNG
## for each chain, which is not recommended. Different .RNG.name values have been
## added to each set of initial values.
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:34:50 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 76
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 187500
## -------------------------------------------------| 187500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHomv: 9 sec elapsed
codaDiffmHomvBroadPost = as.mcmc.list( runjagsDiffmHomvBroadPost )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHomvBroadPost = as.matrix(codaDiffmHomvBroadPost,chains=TRUE)
diagSumDiffmHomvBroadPost = diagSummary( codaDiffmHomvBroadPost )
DT::datatable( round(diagSumDiffmHomvBroadPost,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumDiffmHomvBroadPost) != "thresh[1]"
& rownames(diagSumDiffmHomvBroadPost) != "thresh[4]" )
maxPSRF = max(diagSumDiffmHomvBroadPost[estimatedRows,c("psrfPt")])
minESS = min(diagSumDiffmHomvBroadPost[estimatedRows,c("ESS")])
From the table above, we observe
postPredPlot( dataMat=yMatAct ,
FUN=predDiffmHomv ,
mcmcMat=mcmcMatDiffmHomvBroadPost[,c("mu[1]","mu[2]",
"sigma",
"thresh[2]","thresh[3]")] )
It can be seen above that the Diffm,Homv model does not mimic the data very well, or at least not nearly as well as the heterogenous-variance models. This is not surprising, given that the heterogeneous-variance models estimated a large difference between variances, so this homogeneous-variance model should strain to fit the data.
The poor fit should make us suspicious of interpreting the parameter values. But merely for completeness, here they are.
Difference of means:
ppdInfo = plotPostDiff( mcmc1=mcmcMatDiffmHomvBroadPost[,"mu[1]"] ,
mcmc2=mcmcMatDiffmHomvBroadPost[,"mu[2]"] ,
name1="mu[1]" , name2="mu[2]" ,
rope1=NULL , rope2=NULL , ropeD=c(-0.25,0.25) )
DT::datatable( round(ppdInfo,3) )
Effect size (Cohen’s d), \((\mu_1 - \mu_2)\Big/\sqrt{(\sigma_1^2+\sigma_2^2)/2}\):
diff = ( mcmcMatDiffmHomvBroadPost[,"mu[1]"] - mcmcMatDiffmHomvBroadPost[,"mu[2]"] )
sd = mcmcMatDiffmHomvBroadPost[,"sigma"]
esInfo = plotPost( diff/sd ,
name="Effect Size" ,
rope=c(-0.1,0.1) )
DT::datatable( round(esInfo,3) )
sInfo = plotPost( mcmcMatDiffmHomvBroadPost[,"sigma"] ,
name="sigma (Equal variance model)" )
DT::datatable( round(sInfo,3) )
Again, the estimated parameter values above should not be taken too seriously, given the relatively poor fit of the model.
The small representative data set was described earlier, the first time it was introduced.
Run the Diffm,Homv model with the small representative data set, starting with the broad prior:
if ( runNewMCMC ) {
runjagsDiffmHomvOtherPrior = runDiffmHomv(
yMat=yMatOther ,
paramM = paramM_DiffmHomvBroad ,
paramVCOV = paramVCOV_DiffmHomvBroad )
save( runjagsDiffmHomvOtherPrior ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvOtherPrior.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvOtherPrior.Rdata") )
}
## Warning: You attempted to start parallel chains without setting different PRNG
## for each chain, which is not recommended. Different .RNG.name values have been
## added to each set of initial values.
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:35:02 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 76
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 125000
## -------------------------------------------------| 125000
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHomv: 8.31 sec elapsed
codaDiffmHomvOtherPrior = as.mcmc.list( runjagsDiffmHomvOtherPrior )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHomvOtherPrior = as.matrix(codaDiffmHomvOtherPrior,chains=TRUE)
MCMC Diagnostics:
diagSumDiffmHomvOtherPrior = diagSummary( codaDiffmHomvOtherPrior )
DT::datatable( round(diagSumDiffmHomvOtherPrior,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumDiffmHomvOtherPrior) != "thresh[1]"
& rownames(diagSumDiffmHomvOtherPrior) != "thresh[4]" )
maxPSRF = max(diagSumDiffmHomvOtherPrior[estimatedRows,c("psrfPt")])
minESS = min(diagSumDiffmHomvOtherPrior[estimatedRows,c("ESS")])
From the table above, we observe
Posterior predictive check for prior data:
postPredPlot( dataMat=yMatOther ,
FUN=predDiffmHomv ,
mcmcMat=mcmcMatDiffmHomvOtherPrior[,c("mu[1]","mu[2]",
"sigma",
"thresh[2]","thresh[3]")] )
Above plot shows that the prior data are well represented by the model.
Now examine the posterior distribution. First, compute log(sigma):
# Compute log(sigma):
mcmcMatDiffmHomvOtherPrior = cbind( mcmcMatDiffmHomvOtherPrior ,
"logsigma"=log(mcmcMatDiffmHomvOtherPrior[,"sigma"]) )
Pairs plots:
plotRows = seq( 1 , nrow(mcmcMatDiffmHomvOtherPrior) ,
length=min(c(2000,nrow(mcmcMatDiffmHomvOtherPrior))) )
psych::pairs.panels(
mcmcMatDiffmHomvOtherPrior[
plotRows ,
c("mu[1]","mu[2]","logsigma","thresh[2]","thresh[3]") ] ,
smooth=FALSE , lm=TRUE , smoother=TRUE , gap=0.2 , # pch=20 ,
ellipses=TRUE , hist.col="skyblue" )
Here are the M and V,C constants used for the informed prior:
# Extract posterior summary for use as subsequent prior:
paramM_DiffmHomvOther = colMeans( mcmcMatDiffmHomvOtherPrior[ ,
c("mu[1]","mu[2]","logsigma","thresh[2]","thresh[3]") ] )
paramVCOV_DiffmHomvOther = cov( mcmcMatDiffmHomvOtherPrior[ ,
c("mu[1]","mu[2]","logsigma","thresh[2]","thresh[3]") ] )
print(round(paramM_DiffmHomvOther,3))
## mu[1] mu[2] logsigma thresh[2] thresh[3]
## 5.475 5.472 0.785 2.246 3.093
print(round(paramVCOV_DiffmHomvOther,3))
## mu[1] mu[2] logsigma thresh[2] thresh[3]
## mu[1] 0.290 0.062 0.047 -0.010 -0.006
## mu[2] 0.062 0.289 0.046 -0.009 -0.007
## logsigma 0.047 0.046 0.039 -0.020 -0.021
## thresh[2] -0.010 -0.009 -0.020 0.108 0.058
## thresh[3] -0.006 -0.007 -0.021 0.058 0.102
If the mathematical prior were an exact copy of the posterior from the mildly informed fit, then a prior predictive check would be superfluous. But because the mathematical prior is merely a close approximation of the posterior from the mildly informed fit, a prior predictive check verifies that the prior really does mimic the small data as intended.
# Generate MCMC from prior:
if ( runNewMCMC ) {
runjagsDiffmHomvOtherPriorPred = runDiffmHomv(
yMatAct ,
priorOnly=TRUE , # N.B. !!!!!!!!!!!!!!!!
thinSteps=1 , # because no autocorrelation
paramM_DiffmHomvOther ,
paramVCOV_DiffmHomvOther )
save( runjagsDiffmHomvOtherPriorPred ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvOtherPriorPred.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvOtherPriorPred.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:35:14 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 0
## Unobserved stochastic nodes: 3
## Total graph size: 76
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 12500
## -------------------------------------------------| 12500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHomv: 7.14 sec elapsed
codaDiffmHomvOtherPriorPred = as.mcmc.list( runjagsDiffmHomvOtherPriorPred )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHomvOtherPriorPred = as.matrix(codaDiffmHomvOtherPriorPred,chains=TRUE)
Show prior predictions:
postPredPlot( dataMat=yMatOther ,
FUN=predDiffmHomv ,
mcmcMat=mcmcMatDiffmHomvOtherPriorPred[,c("mu[1]","mu[2]",
"sigma",
"thresh[2]","thresh[3]")] )
Above, it can be seen that the mildly-informed prior does mimic the small data.
Run Diffm, Homv model on actual data with informed prior:
if ( runNewMCMC ) {
runjagsDiffmHomvOtherPost = runDiffmHomv(
yMat=yMatAct ,
thinSteps=15 , # because of strong autocorrelation
paramM = paramM_DiffmHomvOther ,
paramVCOV = paramVCOV_DiffmHomvOther )
save( runjagsDiffmHomvOtherPost ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvOtherPost.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvOtherPost.Rdata") )
}
## Warning: You attempted to start parallel chains without setting different PRNG
## for each chain, which is not recommended. Different .RNG.name values have been
## added to each set of initial values.
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:35:23 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 76
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 187500
## -------------------------------------------------| 187500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHomv: 9.95 sec elapsed
codaDiffmHomvOtherPost = as.mcmc.list( runjagsDiffmHomvOtherPost )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHomvOtherPost = as.matrix(codaDiffmHomvOtherPost,chains=TRUE)
Diagnostics:
diagSumDiffmHomvOtherPost = diagSummary( codaDiffmHomvOtherPost )
DT::datatable( round(diagSumDiffmHomvOtherPost,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumDiffmHomvOtherPost) != "thresh[1]"
& rownames(diagSumDiffmHomvOtherPost) != "thresh[4]" )
maxPSRF = max(diagSumDiffmHomvOtherPost[estimatedRows,c("psrfPt")])
minESS = min(diagSumDiffmHomvOtherPost[estimatedRows,c("ESS")])
From the table above, we observe
postPredPlot( dataMat=yMatAct ,
FUN=predDiffmHomv ,
mcmcMat=mcmcMatDiffmHomvOtherPost[,c("mu[1]","mu[2]",
"sigma",
"thresh[2]","thresh[3]")] )
The above plot shows a poor fit by the homogeneous-variance model, at least much less accurate than the heterogeneous-variance models.
Despite the poor fit of the model, the parameter estimates are displayed here for completeness:
Difference of means:
ppdInfo = plotPostDiff( mcmc1=mcmcMatDiffmHomvOtherPost[,"mu[1]"] ,
mcmc2=mcmcMatDiffmHomvOtherPost[,"mu[2]"] ,
name1="mu[1]" , name2="mu[2]" ,
rope1=NULL , rope2=NULL , ropeD=c(-0.25,0.25) )
DT::datatable( round(ppdInfo,3) )
Effect size (Cohen’s d), \((\mu_1 - \mu_2)\Big/\sqrt{(\sigma_1^2+\sigma_2^2)/2}\):
diff = ( mcmcMatDiffmHomvOtherPost[,"mu[1]"] - mcmcMatDiffmHomvOtherPost[,"mu[2]"] )
sd = mcmcMatDiffmHomvOtherPost[,"sigma"]
esInfo = plotPost( diff/sd ,
name="Effect Size" ,
rope=c(-0.1,0.1) )
DT::datatable( round(esInfo,3) )
sInfo = plotPost( mcmcMatDiffmHomvOtherPost[,"sigma"] ,
name="sigma (Equal variance model)" )
DT::datatable( round(sInfo,3) )
Again, the estimated parameter values above should not be taken too seriously, given the relatively poor fit of the model.
The representative subset was described above, the first time it was used.
Run the model on the small representative data set using broad prior:
if ( runNewMCMC ) {
runjagsDiffmHomvSubsetPrior = runDiffmHomv(
yMat=yMatSubset ,
paramM = paramM_DiffmHomvBroad ,
paramVCOV = paramVCOV_DiffmHomvBroad )
save( runjagsDiffmHomvSubsetPrior ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvSubsetPrior.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvSubsetPrior.Rdata") )
}
## Warning: You attempted to start parallel chains without setting different PRNG
## for each chain, which is not recommended. Different .RNG.name values have been
## added to each set of initial values.
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:35:34 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 76
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 125000
## -------------------------------------------------| 125000
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHomv: 8.14 sec elapsed
codaDiffmHomvSubsetPrior = as.mcmc.list( runjagsDiffmHomvSubsetPrior )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHomvSubsetPrior = as.matrix(codaDiffmHomvSubsetPrior,chains=TRUE)
MCMC diagnostics:
diagSumDiffmHomvSubsetPrior = diagSummary( codaDiffmHomvSubsetPrior )
DT::datatable( round(diagSumDiffmHomvSubsetPrior,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumDiffmHomvSubsetPrior) != "thresh[1]"
& rownames(diagSumDiffmHomvSubsetPrior) != "thresh[4]" )
maxPSRF = max(diagSumDiffmHomvSubsetPrior[estimatedRows,c("psrfPt")])
minESS = min(diagSumDiffmHomvSubsetPrior[estimatedRows,c("ESS")])
From the table above, we observe
Posterior predictive check of small data:
postPredPlot( dataMat=yMatSubset ,
FUN=predDiffmHomv ,
mcmcMat=mcmcMatDiffmHomvSubsetPrior[,c("mu[1]","mu[2]",
"sigma",
"thresh[2]","thresh[3]")] )
Above plot shows that the prior data are not very well represented by the model, but at least the data are within the broad uncertainty bands of the prior. This lack of fit by the homogeneous model should not be too surprising given that the prior data are designed to represent the full data with their heterogeneous variances. Nevertheless we proceed with this prior because it is the best the model can do to mimic the representative subset data.
Now examine the posterior distribution (which will be used as the prior for the actual data). First, compute log(sigma):
# Compute log(sigma):
mcmcMatDiffmHomvSubsetPrior = cbind( mcmcMatDiffmHomvSubsetPrior ,
"logsigma"=log(mcmcMatDiffmHomvSubsetPrior[,"sigma"]) )
Pairs plots:
plotRows = seq( 1 , nrow(mcmcMatDiffmHomvSubsetPrior) ,
length=min(c(2000,nrow(mcmcMatDiffmHomvSubsetPrior))) )
psych::pairs.panels(
mcmcMatDiffmHomvSubsetPrior[
plotRows ,
c("mu[1]","mu[2]","logsigma","thresh[2]","thresh[3]") ] ,
smooth=FALSE , lm=TRUE , smoother=TRUE , gap=0.2 , # pch=20 ,
ellipses=TRUE , hist.col="skyblue" )
Here are the \(M\), \(V\), and \(C\) constants used for the informed prior:
# Extract posterior summary for use as subsequent prior:
paramM_DiffmHomvSubset = colMeans( mcmcMatDiffmHomvSubsetPrior[ ,
c("mu[1]","mu[2]","logsigma","thresh[2]","thresh[3]") ] )
paramVCOV_DiffmHomvSubset = cov( mcmcMatDiffmHomvSubsetPrior[ ,
c("mu[1]","mu[2]","logsigma","thresh[2]","thresh[3]") ] )
print(round(paramM_DiffmHomvSubset,3))
## mu[1] mu[2] logsigma thresh[2] thresh[3]
## 3.627 3.845 0.913 2.374 3.206
print(round(paramVCOV_DiffmHomvSubset,3))
## mu[1] mu[2] logsigma thresh[2] thresh[3]
## mu[1] 0.239 0.007 0.007 0.011 0.011
## mu[2] 0.007 0.231 0.009 0.013 0.016
## logsigma 0.007 0.009 0.024 -0.005 -0.003
## thresh[2] 0.011 0.013 -0.005 0.057 0.032
## thresh[3] 0.011 0.016 -0.003 0.032 0.061
If the mathematical prior were an exact copy of the posterior from the mildly informed fit, then a prior predictive check would be superfluous. But because the mathematical prior is merely a close approximation of the posterior from the mildly informed fit, a prior predictive check is useful.
# Generate MCMC from prior:
if ( runNewMCMC ) {
runjagsDiffmHomvSubsetPriorPred = runDiffmHomv(
yMatAct ,
priorOnly=TRUE , # N.B. !!!!!!!!!!!!!!!!
thinSteps=1 , # because no autocorrelation
paramM_DiffmHomvSubset ,
paramVCOV_DiffmHomvSubset )
save( runjagsDiffmHomvSubsetPriorPred ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvSubsetPriorPred.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvSubsetPriorPred.Rdata") )
}
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:35:46 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 0
## Unobserved stochastic nodes: 3
## Total graph size: 76
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 12500
## -------------------------------------------------| 12500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHomv: 7.13 sec elapsed
codaDiffmHomvSubsetPriorPred = as.mcmc.list( runjagsDiffmHomvSubsetPriorPred )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHomvSubsetPriorPred = as.matrix(codaDiffmHomvSubsetPriorPred,chains=TRUE)
Show prior predictions:
postPredPlot( dataMat=yMatSubset ,
FUN=predDiffmHomv ,
mcmcMat=mcmcMatDiffmHomvSubsetPriorPred[,c("mu[1]","mu[2]",
"sigma",
"thresh[2]","thresh[3]")] )
The mathematically specified prior does mimic the MCMC prior, which was the goal of this demonstration. But, as noted before, the prior for the homogeneous-variance model cannot capture the subset-data very well because it seems to have different variances. We proceed because this is, nevertheless, the best the prior distribution can do to capture the prior data.
Run Diffm, Homv model on actual data with informed prior:
if ( runNewMCMC ) {
runjagsDiffmHomvSubsetPost = runDiffmHomv(
yMat=yMatAct ,
thinSteps=15 , # because of strong autocorrelation
paramM = paramM_DiffmHomvSubset ,
paramVCOV = paramVCOV_DiffmHomvSubset )
save( runjagsDiffmHomvSubsetPost ,
file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvSubsetPost.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-runjagsDiffmHomvSubsetPost.Rdata") )
}
## Warning: You attempted to start parallel chains without setting different PRNG
## for each chain, which is not recommended. Different .RNG.name values have been
## added to each set of initial values.
## Calling 4 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.1 on Sat Dec 2 08:35:54 2023
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 1
## Total graph size: 76
## . Reading parameter file inits1.txt
## . Initializing model
## . Adapting 500
## -------------------------------------------------| 500
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . . Updating 187500
## -------------------------------------------------| 187500
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## .
## All chains have finished
## Simulation complete. Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
## runDiffmHomv: 9.48 sec elapsed
codaDiffmHomvSubsetPost = as.mcmc.list( runjagsDiffmHomvSubsetPost )
# resulting mcmcCoda object has these indices:
# mcmcCoda[[ chainIdx ]][ stepIdx , paramIdx ]
mcmcMatDiffmHomvSubsetPost = as.matrix(codaDiffmHomvSubsetPost,chains=TRUE)
Diagnostics:
diagSumDiffmHomvSubsetPost = diagSummary( codaDiffmHomvSubsetPost )
DT::datatable( round(diagSumDiffmHomvSubsetPost,3) , options=list(pageLength=10))
# Extract max PSRF and min ESS:
estimatedRows = ( rownames(diagSumDiffmHomvSubsetPost) != "thresh[1]"
& rownames(diagSumDiffmHomvSubsetPost) != "thresh[4]" )
maxPSRF = max(diagSumDiffmHomvSubsetPost[estimatedRows,c("psrfPt")])
minESS = min(diagSumDiffmHomvSubsetPost[estimatedRows,c("ESS")])
From the table above, we observe
postPredPlot( dataMat=yMatAct ,
FUN=predDiffmHomv ,
mcmcMat=mcmcMatDiffmHomvSubsetPost[,c("mu[1]","mu[2]",
"sigma",
"thresh[2]","thresh[3]")] )
We see above that the fit of homogeneous-variance model is not very good, at least not as good as the heterogeneous-variance models.
Despite the poor fit of the model, the parameter estimates are displayed here for completeness:
Difference of means:
ppdInfo = plotPostDiff( mcmc1=mcmcMatDiffmHomvSubsetPost[,"mu[1]"] ,
mcmc2=mcmcMatDiffmHomvSubsetPost[,"mu[2]"] ,
name1="mu[1]" , name2="mu[2]" ,
rope1=NULL , rope2=NULL , ropeD=c(-0.25,0.25) )
DT::datatable( round(ppdInfo,3) )
Effect size (Cohen’s d), \((\mu_1 - \mu_2)\Big/\sqrt{(\sigma_1^2+\sigma_2^2)/2}\):
diff = ( mcmcMatDiffmHomvSubsetPost[,"mu[1]"] - mcmcMatDiffmHomvSubsetPost[,"mu[2]"] )
sd = mcmcMatDiffmHomvSubsetPost[,"sigma"]
esInfo = plotPost( diff/sd ,
name="Effect Size" ,
rope=c(-0.1,0.1) )
DT::datatable( round(esInfo,3) )
sInfo = plotPost( mcmcMatDiffmHomvSubsetPost[,"sigma"] ,
name="sigma (Equal variance model)" )
DT::datatable( round(sInfo,3) )
Again, the estimated parameter values above should not be taken too seriously, given the relatively poor fit of the model.
Below are graphical comparisons of the marginal posteriors from the three priors. The plot of posterior density relies on smoothing by a kernel with width determined heuristically. The posterior density is a natural way to view the mode and HDI. The plot of cumulative probability does not rely on smoothing (but is effectively smoothed by the pixel resolution). The cumulative probability is a natural way to view the median and ETI.
effSzBroad = ( ( mcmcMatDiffmHomvBroadPost[,"mu[1]"]
- mcmcMatDiffmHomvBroadPost[,"mu[2]"] )
/ mcmcMatDiffmHomvBroadPost[,"sigma"] )
effSzOther = ( ( mcmcMatDiffmHomvOtherPost[,"mu[1]"]
- mcmcMatDiffmHomvOtherPost[,"mu[2]"] )
/ mcmcMatDiffmHomvOtherPost[,"sigma"] )
effSzSubset = ( ( mcmcMatDiffmHomvSubsetPost[,"mu[1]"]
- mcmcMatDiffmHomvSubsetPost[,"mu[2]"] )
/ mcmcMatDiffmHomvSubsetPost[,"sigma"] )
par(mfrow=c(2,1))
plotGpdfs( cbind( Broad=effSzBroad , Other=effSzOther , Subset=effSzSubset ) ,
main="For Different Priors" ,
ylab="Posterior Density" ,
xlab="Effect Size (Cohen's d)" , rope=c(-0.1,0.1) )
plotGecdfs( cbind( Broad=effSzBroad , Other=effSzOther , Subset=effSzSubset ) ,
main="For Different Priors" ,
ylab="Posterior Cumulative Proportion" ,
xlab="Effect Size (Cohen's d)" , rope=c(-0.1,0.1) )
Above: Effect Size estimates from different priors. Broad = broad prior, Other = informed by other movies, Subset = informed by subset.
par(mfrow=c(1,1))
par(mfrow=c(2,1))
plotGpdfs( cbind( Broad=mcmcMatDiffmHomvBroadPost[,"sigma"] ,
Other=mcmcMatDiffmHomvOtherPost[,"sigma"] ,
Subset=mcmcMatDiffmHomvSubsetPost[,"sigma"] ) ,
main="For Different Priors" ,
ylab="Posterior Density" ,
xlab="sigma" , rope=NULL )
plotGecdfs( cbind( Broad=mcmcMatDiffmHomvBroadPost[,"sigma"] ,
Other=mcmcMatDiffmHomvOtherPost[,"sigma"] ,
Subset=mcmcMatDiffmHomvSubsetPost[,"sigma"] ) ,
main="For Different Priors" ,
ylab="Posterior Cumulative Proportion" ,
xlab="sigma" , rope=NULL )
Above: Sigma estimates from different priors. Broad = broad prior, Other = informed by other movies, Subset = informed by subset.
par(mfrow=c(1,1))
The impact of the different priors is made clear by the plot, but the magnitude of difference relative to the credible intervals and relative to the ROPE is very small.
(Cf. BARG Step 4.D.)
The Bayes factor (BF) is the coefficient that converts the prior odds of the models to the posterior odds of the models, and the magnitude of the BF indicates the degree to which model probabilities are reallocated from their prior probabilities. Formally, for models \(M1\) and \(M2\), and for data \(D\), the Bayes factor for \(M1\) relative to \(M2\) is \[ BF_{M1/M2} = \frac{p(M1|D)}{p(M2|D)} \bigg/ \frac{p(M1)}{p(M2)} \] where \(p(M)\) is the prior probability of model \(M\) and \(p(M|D)\) is its posterior probability. That relationship can be algebraically converted to a simple formula for computing the posterior probability of model \(M1\) from its prior probability (assuming \(p(M2)=1-p(M1)\)): \[ p(M1|D) = p(M1) \, BF_{M1/M2} \Big/ \left( 1 + p(M1) ( BF_{M1/M2} - 1 ) \right) \] This relationship between \(p(M1|D)\) and \(p(M1)\) is displayed in some graphs below, for the particular values of the BF’s that result from these data and models. Finally, the relationship can be re-arranged to yield the prior model probability that achieves the critical posterior model probability (again assuming \(p(M2)=1-p(M1)\)): \[ p(M1)_{crit} = \frac{p(M1|D)_{crit}}{1-p(M1|D)_{crit}} \bigg/ \left( BF_{M1/M2} + \frac{p(M1|D)_{crit}}{1-p(M1|D)_{crit}} \right) \] This prior model probability is also annotated in the specific graphs below.
Bayes factors are computed here by using bridge sampling (Gronau et al., 2017, 2020). Bridge sampling is a general-purpose method for computing marginal likelihoods, hence Bayes factors, for arbitrary models and priors. Bridge sampling is quite flexible but must be used carefully (e.g., it can have issues with extremely broad prior distributions); see the Discussion section of Gronau et al. (2020). The application here is well behaved.
To use the bridge sampling package in R, I must define functions that compute the unnormalized log posterior probability for each model. The functions for the three models are presented below.
Function to compute the unnormalized log posterior probability of Diffm Hetv model:
# Function for unnormalized log posterior probability for specified parameter
# values and data. N.B.! For bridge_sampling(), this function must have two
# arguments:
# * the first argument is a vector with *named* parameters that includes ONLY
# and the estimated parameters ("mu[1]","mu[2]", "sigma[1]","sigma[2]",
# "thresh[2]","thresh[3]")
# * the second argument can be any object, but for *this* function it must be a
# list with these *named* elements:
# list( yMat , paramM , paramVCOV )
logPostProbDiffmHetv = function( param , data ) {
# Internally rename components of param for meaning:
mu = param[c("mu[1]","mu[2]")]
sigma = param[c("sigma[1]","sigma[2]")]
threshI = c(-Inf,1.5,param[c("thresh[2]","thresh[3]")],4.5,Inf)
# Note above, thresh[i] is threshI[i+1]
# Internally rename data arguments:
yMat = data$yMat
nCases = nrow(yMat)
paramM = data$paramM
paramVCOV = data$paramVCOV
# Compute log probability
logProb = 0
for ( cIdx in 1:nCases ) {
# Likelihood:
pdgp = unname(
pnorm( ( threshI[2: length(threshI) ] - mu[cIdx] ) / sigma[cIdx] )
- pnorm( ( threshI[1:(length(threshI)-1)] - mu[cIdx] ) / sigma[cIdx] )
)
logProb = logProb + sum( yMat[cIdx,] * log(pdgp) )
}
# Prior on parameters:
logProb = ( logProb
+ dmvnorm( c( mu[1] , mu[2] ,
log(sigma[1]) , log(sigma[2]) ,
threshI[2+1] , threshI[3+1] ) , # thresh[i] is threshI[i+1]
paramM , paramVCOV , log=TRUE )
+ -log(sigma[1]) # Jacobian
+ -log(sigma[2]) # Jacobian
)
# Re Jacobian, previous line: See Section 25.3 of DBDA2E. Prior density
# pls() on logsigma, which here is dmvnorm(), is mapped to density ps() on
# sigma via f=exp(), with inverse being finv=log() and derivative being
# fprime()=exp(). Therefore density on sigma is ps(sigma) =
# pls(finv(sigma))/abs(fprime(finv(sigma))) =
# pls(log(sigma))/exp(log(sigma)) = pls(log(sigma))/sigma. And, the
# logarithm of that density is log(pls(log(sigma)))-log(sigma). Notice the
# "-log(sigma)" term at the end. In this particular case, each exp(logsigma)
# function applies only to its single corresponding logsigma, and therefore
# the Jacobian matrix is diagonal and det(J) is simply the product of the
# diagonal terms.
return( unname( logProb ) )
}
Function to compute the unnormalized log posterior probability of Eqm Hetv model:
# Function for unnormalized log posterior probability for specified parameter
# values and data.
logPostProbEqmHetv = function( param , data ) {
# Internally rename components of param for meaning:
mu = param[c("mu")]
sigma = param[c("sigma[1]","sigma[2]")]
threshI = c(-Inf,1.5,param[c("thresh[2]","thresh[3]")],4.5,Inf)
# Note above, thresh[i] is threshI[i+1]
# Internally rename data arguments:
yMat = data$yMat
nCases = nrow(yMat)
paramM = data$paramM
paramVCOV = data$paramVCOV
# Compute log probability
logProb = 0
for ( cIdx in 1:nCases ) {
# Likelihood:
pdgp = unname(
pnorm( ( threshI[2: length(threshI) ] - mu ) / sigma[cIdx] )
- pnorm( ( threshI[1:(length(threshI)-1)] - mu ) / sigma[cIdx] )
)
logProb = logProb + sum( yMat[cIdx,] * log(pdgp) )
}
# Prior on parameters:
logProb = ( logProb
+ dmvnorm( c( mu ,
log(sigma[1]) , log(sigma[2]) ,
threshI[2+1] , threshI[3+1] ) , # thresh[i] is threshI[i+1]
paramM , paramVCOV , log=TRUE )
+ -log(sigma[1]) # Jacobian
+ -log(sigma[2]) # Jacobian
)
# Re Jacobian, previous line: See Section 25.3 of DBDA2E. Prior density
# pls() on logsigma, which here is dmvnorm(), is mapped to density ps() on
# sigma via f=exp(), with inverse being finv=log() and derivative being
# fprime()=exp(). Therefore density on sigma is ps(sigma) =
# pls(finv(sigma))/abs(fprime(finv(sigma))) =
# pls(log(sigma))/exp(log(sigma)) = pls(log(sigma))/sigma. And, the
# logarithm of that density is log(pls(log(sigma)))-log(sigma). Notice the
# "-log(sigma)" term at the end. In this particular case, each exp(logsigma)
# function applies only to its single corresponding logsigma, and therefore
# the Jacobian matrix is diagonal and det(J) is simply the product of the
# diagonal terms.
return( unname( logProb ) )
}
Function to compute the unnormalized log posterior probability of Diffm Homv model:
# Function for unnormalized log posterior probability for specified parameter
# values and data. N.B.! For bridge_sampling(), this function must have two
# arguments:
# * the first argument is a vector with *named* parameters that includes ONLY
# and the estimated parameters ("mu[1]","mu[2]", "sigma",
# "thresh[2]","thresh[3]")
# * the second argument can be any object, but for *this* function it must be a
# list with these *named* elements:
# list( yMat , paramM , paramVCOV )
logPostProbDiffmHomv = function( param , data ) {
# Internally rename components of param for meaning:
mu = param[c("mu[1]","mu[2]")]
sigma = param[c("sigma")]
threshI = c(-Inf,1.5,param[c("thresh[2]","thresh[3]")],4.5,Inf)
# Note above, thresh[i] is threshI[i+1]
# Internally rename data arguments:
yMat = data$yMat
nCases = nrow(yMat)
paramM = data$paramM
paramVCOV = data$paramVCOV
# Compute log probability
logProb = 0
for ( cIdx in 1:nCases ) {
# Likelihood:
pdgp = unname(
pnorm( ( threshI[2: length(threshI) ] - mu[cIdx] ) / sigma )
- pnorm( ( threshI[1:(length(threshI)-1)] - mu[cIdx] ) / sigma )
)
logProb = logProb + sum( yMat[cIdx,] * log(pdgp) )
}
# Prior on parameters:
logProb = ( logProb
+ dmvnorm( c( mu[1] , mu[2] ,
log(sigma) ,
threshI[2+1] , threshI[3+1] ) , # thresh[i] is threshI[i+1]
paramM , paramVCOV , log=TRUE )
+ -log(sigma) # Jacobian
)
# Re Jacobian, previous line: See Section 25.3 of DBDA2E. Prior density
# pls() on logsigma, which here is dmvnorm(), is mapped to density ps() on
# sigma via f=exp(), with inverse being finv=log() and derivative being
# fprime()=exp(). Therefore density on sigma is ps(sigma) =
# pls(finv(sigma))/abs(fprime(finv(sigma))) =
# pls(log(sigma))/exp(log(sigma)) = pls(log(sigma))/sigma. And, the
# logarithm of that density is log(pls(log(sigma)))-log(sigma). Notice the
# "-log(sigma)" term at the end. In this particular case, each exp(logsigma)
# function applies only to its single corresponding logsigma, and therefore
# the Jacobian matrix is diagonal and det(J) is simply the product of the
# diagonal terms.
return( unname( logProb ) )
}
The bridge sampling package in R (Gronau et al., 2020) also requires lower and upper bounds to be specified for all of the parameters in the model.
Diffm Hetv:
# Define lower and upper bound vectors for Diffm Hetv model, for subsequent use
# with bridge_sampler():
lbVecDiffmHetv = c( rep(-Inf,nrow(yMatAct)), # mu
rep(0,nrow(yMatAct)), # sigma
rep(-Inf,ncol(yMatAct)-3) ) # thresh
ubVecDiffmHetv = c( rep(Inf,nrow(yMatAct)), # mu
rep(Inf,nrow(yMatAct)), # sigma
rep(Inf,ncol(yMatAct)-3) ) # thresh
vNamesDiffmHetv = c( paste0( "mu[",1:nrow(yMatAct),"]" ) ,
paste0( "sigma[",1:nrow(yMatAct),"]" ) ,
paste0( "thresh[",1+1:(ncol(yMatAct)-3),"]" ) )
names(lbVecDiffmHetv) = vNamesDiffmHetv
names(ubVecDiffmHetv) = vNamesDiffmHetv
Eqm Hetv:
# Define lower and upper bound vectors for Eqm Hetv model, for subsequent use
# with bridge_sampler():
lbVecEqmHetv = c( rep(-Inf,1), # mu
rep(0,nrow(yMatAct)), # sigma
rep(-Inf,ncol(yMatAct)-3) ) # thresh
ubVecEqmHetv = c( rep(Inf,1), # mu
rep(Inf,nrow(yMatAct)), # sigma
rep(Inf,ncol(yMatAct)-3) ) # thresh
vNamesEqmHetv = c( paste0( "mu" ) ,
paste0( "sigma[",1:nrow(yMatAct),"]" ) ,
paste0( "thresh[",1+1:(ncol(yMatAct)-3),"]" ) )
names(lbVecEqmHetv) = vNamesEqmHetv
names(ubVecEqmHetv) = vNamesEqmHetv
Diffm Homv:
# Define lower and upper bound vectors for Diffm Homv model, for subsequent use
# with bridge_sampler():
lbVecDiffmHomv = c( rep(-Inf,nrow(yMatAct)), # mu
rep(0,1), # sigma
rep(-Inf,ncol(yMatAct)-3) ) # thresh
ubVecDiffmHomv = c( rep(Inf,nrow(yMatAct)), # mu
rep(Inf,1), # sigma
rep(Inf,ncol(yMatAct)-3) ) # thresh
vNamesDiffmHomv = c( paste0( "mu[",1:nrow(yMatAct),"]" ) ,
paste0( "sigma" ) ,
paste0( "thresh[",1+1:(ncol(yMatAct)-3),"]" ) )
names(lbVecDiffmHomv) = vNamesDiffmHomv
names(ubVecDiffmHomv) = vNamesDiffmHomv
Before the bridge_sampler() function is called, the
pseudo-random number generator is explicitly seeded for reproducible
output (cf. BARG Step 6.H.). This also applies to all subsequent
calls.
Diffm Hetv model: Compute marginal likelihood with bridge sampler.
if ( runNewMCMC ) {
tic("bridge_sampler")
set.seed(47405) # for reproducible bridge_sampler()
bridgeOutDiffmHetvBroad = bridge_sampler(
samples = runjagsDiffmHetvBroadPost # N.B. broad prior
, data = list( yMat=yMatAct ,
paramM = paramM_DiffmHetvBroad ,
paramVCOV = paramVCOV_DiffmHetvBroad )
, log_posterior = logPostProbDiffmHetv
, lb = lbVecDiffmHetv
, ub = ubVecDiffmHetv
, method = c("normal","warp3")[1]
)
toc()
save( bridgeOutDiffmHetvBroad ,
file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutDiffmHetvBroad.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutDiffmHetvBroad.Rdata") )
}
## Iteration: 1
## Iteration: 2
## Iteration: 3
## Iteration: 4
## bridge_sampler: 7.28 sec elapsed
summary(bridgeOutDiffmHetvBroad)
##
## Bridge sampling log marginal likelihood estimate
## (method = "normal", repetitions = 1):
##
## -1606.216
##
## Error Measures:
##
## Relative Mean-Squared Error: 1.953806e-07
## Coefficient of Variation: 0.0004420188
## Percentage Error: 0%
##
## Note:
## All error measures are approximate.
As indicated above, the percentage error from the bridge-sampling estimate is very small (only 0%), which suggests that the estimate of the marginal likelihood is very stable.
Eqm Hetv model: Compute marginal likelihood with bridge sampler.
if ( runNewMCMC ) {
tic("bridge_sampler")
set.seed(47405) # for reproducible bridge_sampler()
bridgeOutEqmHetvBroad = bridge_sampler(
samples = runjagsEqmHetvBroadPost # N.B. broad prior
, data = list( yMat=yMatAct ,
paramM = paramM_EqmHetvBroad ,
paramVCOV = paramVCOV_EqmHetvBroad )
, log_posterior = logPostProbEqmHetv
, lb = lbVecEqmHetv
, ub = ubVecEqmHetv
, method = c("normal","warp3")[1]
)
toc()
save( bridgeOutEqmHetvBroad ,
file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutEqmHetvBroad.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutEqmHetvBroad.Rdata") )
}
## Iteration: 1
## Iteration: 2
## Iteration: 3
## Iteration: 4
## bridge_sampler: 6.74 sec elapsed
summary(bridgeOutEqmHetvBroad)
##
## Bridge sampling log marginal likelihood estimate
## (method = "normal", repetitions = 1):
##
## -1603.692
##
## Error Measures:
##
## Relative Mean-Squared Error: 9.535108e-08
## Coefficient of Variation: 0.0003087897
## Percentage Error: 0%
##
## Note:
## All error measures are approximate.
As indicated above, the percentage error from the bridge-sampling estimate is very small (only 0%), which suggests that the estimate of the marginal likelihood is very stable.
Diffm Homv model: Compute marginal likelihood with bridge sampler.
if ( runNewMCMC ) {
tic("bridge_sampler")
set.seed(47405) # for reproducible bridge_sampler()
bridgeOutDiffmHomvBroad = bridge_sampler(
samples = runjagsDiffmHomvBroadPost # N.B. broad prior
, data = list( yMat=yMatAct ,
paramM = paramM_DiffmHomvBroad ,
paramVCOV = paramVCOV_DiffmHomvBroad )
, log_posterior = logPostProbDiffmHomv
, lb = lbVecDiffmHomv
, ub = ubVecDiffmHomv
, method = c("normal","warp3")[1]
)
toc()
save( bridgeOutDiffmHomvBroad ,
file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutDiffmHomvBroad.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutDiffmHomvBroad.Rdata") )
}
## Iteration: 1
## Iteration: 2
## Iteration: 3
## Iteration: 4
## bridge_sampler: 7.26 sec elapsed
summary(bridgeOutDiffmHomvBroad)
##
## Bridge sampling log marginal likelihood estimate
## (method = "normal", repetitions = 1):
##
## -1616.914
##
## Error Measures:
##
## Relative Mean-Squared Error: 1.12639e-07
## Coefficient of Variation: 0.0003356174
## Percentage Error: 0%
##
## Note:
## All error measures are approximate.
As indicated above, the percentage error from the bridge-sampling estimate is very small (only 0%), which suggests that the estimate of the marginal likelihood is very stable.
Bayes factors and posterior model probabilities:
Compute and display the Bayes factor for a test of equal means:
bfEqmVsDiffmBroad = bf( bridgeOutEqmHetvBroad , bridgeOutDiffmHetvBroad )
bfPlotEqmVsDiffmBroad = bfplot( bfEqmVsDiffmBroad$bf ,
"Eqm Hetv Broad" ,
"Diffm Hetv Broad" )
The plot above shows posterior model probabilities as a function of prior model probabilities. (Cf. BARG Step 3.C and Step 4.D.) The Bayes factor is 12.5 for Eqm Hetv Broad relative to Diffm Hetv Broad. To ‘accept’ Eqm Hetv Broad (relative to Diffm Hetv Broad) with a posterior probability of at least 0.95, Eqm Hetv Broad’s prior probability must be at least 0.604. To ‘reject’ Eqm Hetv Broad (relative to Diffm Hetv Broad) with a posterior probability less than 0.05, Eqm Hetv Broad’s prior probability must be less than 0.004.
Compute and display the Bayes factor for a test of equal variances:
bfHomvVsHetvBroad = bf( bridgeOutDiffmHomvBroad , bridgeOutDiffmHetvBroad )
bfPlotHomvVsHetvBroad = bfplot( bfHomvVsHetvBroad$bf ,
"Diffm Homv Broad" ,
"Diffm Hetv Broad" )
The plot above shows posterior model probabilities as a function of prior model probabilities. (Cf. BARG Step 3.C and Step 4.D.) The Bayes factor is 2.26e-05 for Diffm Homv Broad relative to Diffm Hetv Broad. To ‘accept’ Diffm Homv Broad (relative to Diffm Hetv Broad) with a posterior probability of at least 0.95, Diffm Homv Broad’s prior probability must be at least 1. To ‘reject’ Diffm Homv Broad (relative to Diffm Hetv Broad) with a posterior probability less than 0.05, Diffm Homv Broad’s prior probability must be less than 1.
Diffm Hetv model: Compute marginal likelihood with bridge sampler.
if ( runNewMCMC ) {
tic("bridge_sampler")
set.seed(47405) # for reproducible bridge_sampler()
bridgeOutDiffmHetvOther = bridge_sampler(
samples = runjagsDiffmHetvOtherPost # N.B. broad prior
, data = list( yMat=yMatAct ,
paramM = paramM_DiffmHetvOther ,
paramVCOV = paramVCOV_DiffmHetvOther )
, log_posterior = logPostProbDiffmHetv
, lb = lbVecDiffmHetv
, ub = ubVecDiffmHetv
, method = c("normal","warp3")[1]
)
toc()
save( bridgeOutDiffmHetvOther ,
file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutDiffmHetvOther.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutDiffmHetvOther.Rdata") )
}
## Iteration: 1
## Iteration: 2
## Iteration: 3
## Iteration: 4
## bridge_sampler: 8.22 sec elapsed
summary(bridgeOutDiffmHetvOther)
##
## Bridge sampling log marginal likelihood estimate
## (method = "normal", repetitions = 1):
##
## -1611.471
##
## Error Measures:
##
## Relative Mean-Squared Error: 1.829462e-07
## Coefficient of Variation: 0.0004277221
## Percentage Error: 0%
##
## Note:
## All error measures are approximate.
As indicated above, the percentage error from the bridge-sampling estimate is very small (only 0%), which suggests that the estimate of the marginal likelihood is very stable.
Eqm Hetv model: Compute marginal likelihood with bridge sampler.
if ( runNewMCMC ) {
tic("bridge_sampler")
set.seed(47405) # for reproducible bridge_sampler()
bridgeOutEqmHetvOther = bridge_sampler(
samples = runjagsEqmHetvOtherPost # N.B. broad prior
, data = list( yMat=yMatAct ,
paramM = paramM_EqmHetvOther ,
paramVCOV = paramVCOV_EqmHetvOther )
, log_posterior = logPostProbEqmHetv
, lb = lbVecEqmHetv
, ub = ubVecEqmHetv
, method = c("normal","warp3")[1]
)
toc()
save( bridgeOutEqmHetvOther ,
file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutEqmHetvOther.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutEqmHetvOther.Rdata") )
}
## Iteration: 1
## Iteration: 2
## Iteration: 3
## Iteration: 4
## bridge_sampler: 10.41 sec elapsed
summary(bridgeOutEqmHetvOther)
##
## Bridge sampling log marginal likelihood estimate
## (method = "normal", repetitions = 1):
##
## -1610.353
##
## Error Measures:
##
## Relative Mean-Squared Error: 8.893592e-08
## Coefficient of Variation: 0.0002982213
## Percentage Error: 0%
##
## Note:
## All error measures are approximate.
As indicated above, the percentage error from the bridge-sampling estimate is very small (only 0%), which suggests that the estimate of the marginal likelihood is very stable.
Diffm Homv model: Compute marginal likelihood with bridge sampler.
if ( runNewMCMC ) {
tic("bridge_sampler")
set.seed(47405) # for reproducible bridge_sampler()
bridgeOutDiffmHomvOther = bridge_sampler(
samples = runjagsDiffmHomvOtherPost # N.B. broad prior
, data = list( yMat=yMatAct ,
paramM = paramM_DiffmHomvOther ,
paramVCOV = paramVCOV_DiffmHomvOther )
, log_posterior = logPostProbDiffmHomv
, lb = lbVecDiffmHomv
, ub = ubVecDiffmHomv
, method = c("normal","warp3")[1]
)
toc()
save( bridgeOutDiffmHomvOther ,
file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutDiffmHomvOther.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutDiffmHomvOther.Rdata") )
}
## Iteration: 1
## Iteration: 2
## Iteration: 3
## Iteration: 4
## bridge_sampler: 10.87 sec elapsed
summary(bridgeOutDiffmHomvOther)
##
## Bridge sampling log marginal likelihood estimate
## (method = "normal", repetitions = 1):
##
## -1622.718
##
## Error Measures:
##
## Relative Mean-Squared Error: 9.712535e-08
## Coefficient of Variation: 0.0003116494
## Percentage Error: 0%
##
## Note:
## All error measures are approximate.
As indicated above, the percentage error from the bridge-sampling estimate is very small (only 0%), which suggests that the estimate of the marginal likelihood is very stable.
Bayes factors and posterior model probabilities:
(Cf. BARG Steps 1 and 4, regarding BF and model probabilities.)
Compute and display the Bayes factor for a test of equal means:
bfEqmVsDiffmOther = bf( bridgeOutEqmHetvOther , bridgeOutDiffmHetvOther )
bfPlotEqmVsDiffmOther = bfplot( bfEqmVsDiffmOther$bf ,
"Eqm Hetv Other" ,
"Diffm Hetv Other" )
The plot above shows posterior model probabilities as a function of prior model probabilities. (Cf. BARG Step 3.C and Step 4.D.) The Bayes factor is 3.06 for Eqm Hetv Other relative to Diffm Hetv Other. To ‘accept’ Eqm Hetv Other (relative to Diffm Hetv Other) with a posterior probability of at least 0.95, Eqm Hetv Other’s prior probability must be at least 0.861. To ‘reject’ Eqm Hetv Other (relative to Diffm Hetv Other) with a posterior probability less than 0.05, Eqm Hetv Other’s prior probability must be less than 0.017.
Compute and display the Bayes factor for a test of equal variances:
bfHomvVsHetvOther = bf( bridgeOutDiffmHomvOther , bridgeOutDiffmHetvOther )
bfPlotHomvVsHetvOther = bfplot( bfHomvVsHetvOther$bf ,
"Diffm Homv Other" ,
"Diffm Hetv Other" )
The plot above shows posterior model probabilities as a function of prior model probabilities. (Cf. BARG Step 3.C and Step 4.D.) The Bayes factor is 1.3e-05 for Diffm Homv Other relative to Diffm Hetv Other. To ‘accept’ Diffm Homv Other (relative to Diffm Hetv Other) with a posterior probability of at least 0.95, Diffm Homv Other’s prior probability must be at least 1. To ‘reject’ Diffm Homv Other (relative to Diffm Hetv Other) with a posterior probability less than 0.05, Diffm Homv Other’s prior probability must be less than 1.
Diffm Hetv model: Compute marginal likelihood with bridge sampler.
if ( runNewMCMC ) {
tic("bridge_sampler")
set.seed(47405) # for reproducible bridge_sampler()
bridgeOutDiffmHetvSubset = bridge_sampler(
samples = runjagsDiffmHetvSubsetPost # N.B. broad prior
, data = list( yMat=yMatAct ,
paramM = paramM_DiffmHetvSubset ,
paramVCOV = paramVCOV_DiffmHetvSubset )
, log_posterior = logPostProbDiffmHetv
, lb = lbVecDiffmHetv
, ub = ubVecDiffmHetv
, method = c("normal","warp3")[1]
)
toc()
save( bridgeOutDiffmHetvSubset ,
file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutDiffmHetvSubset.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutDiffmHetvSubset.Rdata") )
}
## Iteration: 1
## Iteration: 2
## Iteration: 3
## Iteration: 4
## bridge_sampler: 7.88 sec elapsed
summary(bridgeOutDiffmHetvSubset)
##
## Bridge sampling log marginal likelihood estimate
## (method = "normal", repetitions = 1):
##
## -1596.646
##
## Error Measures:
##
## Relative Mean-Squared Error: 1.854802e-07
## Coefficient of Variation: 0.0004306742
## Percentage Error: 0%
##
## Note:
## All error measures are approximate.
As indicated above, the percentage error from the bridge-sampling estimate is very small (only 0%), which suggests that the estimate of the marginal likelihood is very stable.
Eqm Hetv model: Compute marginal likelihood with bridge sampler.
if ( runNewMCMC ) {
tic("bridge_sampler")
set.seed(47405) # for reproducible bridge_sampler()
bridgeOutEqmHetvSubset = bridge_sampler(
samples = runjagsEqmHetvSubsetPost # N.B. broad prior
, data = list( yMat=yMatAct ,
paramM = paramM_EqmHetvSubset ,
paramVCOV = paramVCOV_EqmHetvSubset )
, log_posterior = logPostProbEqmHetv
, lb = lbVecEqmHetv
, ub = ubVecEqmHetv
, method = c("normal","warp3")[1]
)
toc()
save( bridgeOutEqmHetvSubset ,
file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutEqmHetvSubset.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutEqmHetvSubset.Rdata") )
}
## Iteration: 1
## Iteration: 2
## Iteration: 3
## Iteration: 4
## bridge_sampler: 7 sec elapsed
summary(bridgeOutEqmHetvSubset)
##
## Bridge sampling log marginal likelihood estimate
## (method = "normal", repetitions = 1):
##
## -1595.058
##
## Error Measures:
##
## Relative Mean-Squared Error: 8.067185e-08
## Coefficient of Variation: 0.0002840279
## Percentage Error: 0%
##
## Note:
## All error measures are approximate.
As indicated above, the percentage error from the bridge-sampling estimate is very small (only 0%), which suggests that the estimate of the marginal likelihood is very stable.
Diffm Homv model: Compute marginal likelihood with bridge sampler.
if ( runNewMCMC ) {
tic("bridge_sampler")
set.seed(47405) # for reproducible bridge_sampler()
bridgeOutDiffmHomvSubset = bridge_sampler(
samples = runjagsDiffmHomvSubsetPost # N.B. broad prior
, data = list( yMat=yMatAct ,
paramM = paramM_DiffmHomvSubset ,
paramVCOV = paramVCOV_DiffmHomvSubset )
, log_posterior = logPostProbDiffmHomv
, lb = lbVecDiffmHomv
, ub = ubVecDiffmHomv
, method = c("normal","warp3")[1]
)
toc()
save( bridgeOutDiffmHomvSubset ,
file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutDiffmHomvSubset.Rdata") )
} else {
load( file=paste0(saveFileDir,"/",filenameRoot, "-bridgeOutDiffmHomvSubset.Rdata") )
}
## Iteration: 1
## Iteration: 2
## Iteration: 3
## Iteration: 4
## bridge_sampler: 7.14 sec elapsed
summary(bridgeOutDiffmHomvSubset)
##
## Bridge sampling log marginal likelihood estimate
## (method = "normal", repetitions = 1):
##
## -1608.237
##
## Error Measures:
##
## Relative Mean-Squared Error: 9.385726e-08
## Coefficient of Variation: 0.0003063613
## Percentage Error: 0%
##
## Note:
## All error measures are approximate.
As indicated above, the percentage error from the bridge-sampling estimate is very small (only 0%), which suggests that the estimate of the marginal likelihood is very stable.
Bayes factors and posterior model probabilities:
(Cf. BARG Steps 1 and 4, regarding BF and model probabilities.)
Compute and display the Bayes factor for a test of equal means:
bfEqmVsDiffmSubset = bf( bridgeOutEqmHetvSubset , bridgeOutDiffmHetvSubset )
bfPlotEqmVsDiffmSubset = bfplot( bfEqmVsDiffmSubset$bf ,
"Eqm Hetv Subset" ,
"Diffm Hetv Subset" )
The plot above shows posterior model probabilities as a function of prior model probabilities. (Cf. BARG Step 3.C and Step 4.D.) The Bayes factor is 4.89 for Eqm Hetv Subset relative to Diffm Hetv Subset. To ‘accept’ Eqm Hetv Subset (relative to Diffm Hetv Subset) with a posterior probability of at least 0.95, Eqm Hetv Subset’s prior probability must be at least 0.795. To ‘reject’ Eqm Hetv Subset (relative to Diffm Hetv Subset) with a posterior probability less than 0.05, Eqm Hetv Subset’s prior probability must be less than 0.011.
Compute and display the Bayes factor for a test of equal variances:
bfHomvVsHetvSubset = bf( bridgeOutDiffmHomvSubset , bridgeOutDiffmHetvSubset )
bfPlotHomvVsHetvSubset = bfplot( bfHomvVsHetvSubset$bf ,
"Diffm Homv Subset" ,
"Diffm Hetv Subset" )
The plot above shows posterior model probabilities as a function of prior model probabilities. (Cf. BARG Step 3.C and Step 4.D.) The Bayes factor is 9.25e-06 for Diffm Homv Subset relative to Diffm Hetv Subset. To ‘accept’ Diffm Homv Subset (relative to Diffm Hetv Subset) with a posterior probability of at least 0.95, Diffm Homv Subset’s prior probability must be at least 1. To ‘reject’ Diffm Homv Subset (relative to Diffm Hetv Subset) with a posterior probability less than 0.05, Diffm Homv Subset’s prior probability must be less than 1.
(Cf. BARG Step 5.D. and 5.E.)
par(mfrow=c(1,3))
bfPlotEqmVsDiffmBroad = bfplot( bfEqmVsDiffmBroad$bf ,
"Eqm Hetv Broad" ,
"Diffm Hetv Broad" )
bfPlotEqmVsDiffmOther = bfplot( bfEqmVsDiffmOther$bf ,
"Eqm Hetv Other" ,
"Diffm Hetv Other" )
bfPlotEqmVsDiffmSubset = bfplot( bfEqmVsDiffmSubset$bf ,
"Eqm Hetv Subset" ,
"Diffm Hetv Subset" )
par(mfrow=c(1,1))
EqmVsDiffmMat = matrix( c( bfEqmVsDiffmBroad$bf , bfPlotEqmVsDiffmBroad$hiCritPrior ,
bfEqmVsDiffmOther$bf , bfPlotEqmVsDiffmOther$hiCritPrior ,
bfEqmVsDiffmSubset$bf , bfPlotEqmVsDiffmSubset$hiCritPrior ) ,
nrow=3 , ncol=2 , byrow=TRUE )
colnames( EqmVsDiffmMat ) = c("BF","CritPriorProb")
rownames( EqmVsDiffmMat ) = c("Broad","Other","Subset")
signif(EqmVsDiffmMat,3)
## BF CritPriorProb
## Broad 12.50 0.604
## Other 3.06 0.861
## Subset 4.89 0.795
The table above has a different row for each prior, corresponding to the preceding graphs. The column labelled “CritPriorProb” is the minimum prior probability of the equal-mean model required for its posterior probability to exceed 0.95, that is, to “accept” it. To “accept” the equal-mean (Eqm) model, its prior probability would have to be at least 0.604 even for the most extreme BF, which is not very plausible because the underlying means of two different movies are very unlikely to be exactly equal, that is, the prior probability of the Eqm model is very small.
The results indicate that there is some change in the BF across priors, though all lean toward the equal-mean model. Interestingly, although the Broad and Subset priors yielded posterior estimates that were virtually identical to each other in the full model, they yield somewhat different BF’s. Presumably this arises, at least in part, because a broad prior distribution on the target parameter in the non-null model “dilutes” the prior probability of non-null values across a wide range, such that non-null parameter values that happen fit the data have small prior probability, resulting in weakened support for the non-null model.
As an aside, uncertainty in prior model probability can be addressed by a technique explained at https://osf.io/36527/ (Kruschke, 13 April 2021, submitted). Uncertainty in the prior model probability is explicitly represented by a beta distribution, which is then updated by the data and passed through the BF curve to produce a posterior distribution on model probability.
# Load functions provided at https://osf.io/36527/ bfplotOrig = bfplot source("BayesModelProb_FuncDef.R") # Caution: Overwrites bfplot() bfplot = bfplotOrigAs mentioned above, the prior probability of the Eqm model is very small. I will represent that small but uncertain prior model probability with a beta distribution that has a mode at 0.02 and a concentration of 20 (which I think is generous to the Eqm model).
# Specify mode and concentration: theMode = 0.02 theConc = 20 # Compute corresponding a,b shape parameters for beta distribution: ab = betaABfromModeKappa( theMode , theConc )Below are shown the prior distribution of model probability and the posterior distribution of model probability, using the BF from the subset prior:
plotInfo = plotBFpp( BF = bfEqmVsDiffmSubset$bf , A = ab$a , B = ab$b , HDImass=0.95 , postCrit=0.95 , which=c(1,4) )
The total posterior probability of the Eqm model exceeding the decision criterion (0.95) is only 1.4290075^{-12}.
Decisions made by the posterior model probability (via the BF) agree with decisions made by considering parameter magnitudes and ROPEs. That is, the parameter estimates suggested little difference between means, but the distributions had a non-trivial amount of the posterior fall outside the ROPE, hence a decision was withheld. This agrees with the BF’s, which lean toward the equal-mean model, but are not extreme enough to decide to accept the equal-mean model.
par(mfrow=c(1,3))
bfPlotHomvVsHetvBroad = bfplot( bfHomvVsHetvBroad$bf ,
"Diffm Homv Broad" ,
"Diffm Hetv Broad" )
bfPlotHomvVsHetvOther = bfplot( bfHomvVsHetvOther$bf ,
"Diffm Homv Other" ,
"Diffm Hetv Other" )
bfPlotHomvVsHetvSubset = bfplot( bfHomvVsHetvSubset$bf ,
"Diffm Homv Subset" ,
"Diffm Hetv Subset" )
par(mfrow=c(1,1))
HomvVsHetvMat = matrix( c( bfHomvVsHetvBroad$bf ,
bfPlotHomvVsHetvBroad$loCritPrior ,
bfHomvVsHetvOther$bf ,
bfPlotHomvVsHetvOther$loCritPrior ,
bfHomvVsHetvSubset$bf ,
bfPlotHomvVsHetvSubset$loCritPrior ) ,
nrow=3 , ncol=2 , byrow=TRUE )
colnames( HomvVsHetvMat ) = c("BF","CritPriorProb")
rownames( HomvVsHetvMat ) = c("Broad","Other","Subset")
HomvVsHetvMat
## BF CritPriorProb
## Broad 2.258193e-05 0.9995711
## Other 1.304307e-05 0.9997522
## Subset 9.251810e-06 0.9998242
The table above has a different row for each prior, corresponding to the preceding graphs. The column labelled “CritPriorProb” is the maximum prior probability of the homogeneous-variance model required for its posterior probability to be less than 0.05, that is, to “reject” it. In all cases, virtually any prior probability on the homogeneous-variance model (other than 1.0) results in the homogeneous-variance model (Homv) being rejected.
The results indicate that there is some change in the BF across priors, though all lean strongly away from homogeneous-variance model. Interestingly, although the Broad and Subset priors yielded posterior estimates that were virtually identical to each other in the full model, they yield somewhat different BF’s. Presumably this arises, at least in part, because a broad prior distribution on the target parameter in the non-null model “dilutes” the prior probability of non-null values across a wide range, such that non-null parameter values that happen fit the data have small prior probability, resulting in weakened support for the non-null model.
As an aside, uncertainty in prior model probability can be addressed by a technique explained at https://osf.io/36527/ (Kruschke, 13 April 2021, submitted). Uncertainty in the prior model probability is explicitly represented by a beta distribution, which is then updated by the data and passed through the BF curve to produce a posterior distribution on model probability.
As mentioned above, the prior probability of the Hovm model is very small. I will represent that small but uncertain prior model probability with a beta distribution that has a mode at 0.02 and a concentration of 20 (which I think is generous to the Homv model).
# Specify mode and concentration: theMode = 0.02 theConc = 20 # Compute corresponding a,b shape parameters for beta distribution: ab = betaABfromModeKappa( theMode , theConc )Below are shown the prior distribution of model probability and the posterior distribution of model probability, using the BF from the subset prior:
plotInfo = plotBFpp( BF = bfHomvVsHetvSubset$bf , A = ab$a , B = ab$b , HDImass=0.95 , postCrit=0.95 , which=c(1,4) )
The total posterior probability of the Homv model exceeding (falling below) the decision criterion to reject it (i.e., 0.05) is essentially 1.0, at 1.
Decisions made by the posterior model probability (via the BF) agree with decisions made by considering parameter magnitudes and ROPEs. That is, the parameter estimates indicated large differences between variances with posterior distributions that fell well outside the ROPE, and the BF’s indicated a huge range of prior probabilities that would lead to posterior probabilities that reject the homogeneous-variance model.
(Cf. BARG Step 6.)
Install the following software. It is all free. Even if you already have the software, it is a good idea to install the most recent versions.
After downloading, be sure to install.
All the files needed to reproduce this document are at https://osf.io/w7cph/. They include the following:
BARG-Supplement.Rmd The R Markdown source file for this
document.MoviesData.csv The data file that specifies the ratings
of the movies.BARG-references.bib An auxiliary file with reference
information.apa.csl An auxiliary file for formatting the
references.LiddellKruschke2018Fig1.png An auxiliary image file
imported into the summary document.BayesModelProb_FuncDef.R An auxiliary file that defines
additional functions.Save all of those files together in the same folder on your computer.
Optionally, you can also download the
folder,BARG-Supplement-Output, which contains
numerical output from previously running
BARG-Supplement.Rmd. The files in this folder are only
useful for saving time when typesetting BARG-Supplement.Rmd
without recomputing the MCMC chains. If you do download the folder,
place it (with its contents) as a folder within the folder you used for
the files listed above.
BARG-Supplement.Rmd in RStudio.BARG-Supplement.Rmd for
runNewMCMC = TRUE. If you want to use previously saved MCMC
chains, then edit that line to runNewMCMC = FALSE and be
sure that the folder BARG-Supplement-Output and all of its
contents are in your current working directory. (Cf. BARG Step
6.G.)The exact versions of the software that were used to produce this analysis and document are reported below:
date()
## [1] "Sat Dec 2 08:37:26 2023"
sessionInfo()
## R version 4.3.1 (2023-06-16 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 11 x64 (build 22631)
##
## Matrix products: default
##
##
## locale:
## [1] LC_COLLATE=English_United States.utf8
## [2] LC_CTYPE=English_United States.utf8
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.utf8
##
## time zone: Asia/Saigon
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] tictoc_1.2 psych_2.3.9 DT_0.30
## [4] emdbook_1.3.13 bridgesampling_1.1-2 runjags_2.2.2-1.1
## [7] rjags_4-14 coda_0.19-4
##
## loaded via a namespace (and not attached):
## [1] sass_0.4.7 KernSmooth_2.23-22 stringi_1.8.2
## [4] lattice_0.22-5 digest_0.6.33 magrittr_2.0.3
## [7] evaluate_0.23 grid_4.3.1 mvtnorm_1.2-4
## [10] fastmap_1.1.1 plyr_1.8.9 jsonlite_1.8.7
## [13] Matrix_1.6-3 Brobdingnag_1.2-9 scales_1.2.1
## [16] crosstalk_1.2.1 numDeriv_2016.8-1.1 jquerylib_0.1.4
## [19] mnormt_2.1.1 cli_3.6.1 rlang_1.1.2
## [22] bbmle_1.0.25 munsell_0.5.0 ellipsis_0.3.2
## [25] cachem_1.0.8 yaml_2.3.7 tools_4.3.1
## [28] parallel_4.3.1 bdsmatrix_1.3-6 colorspace_2.1-0
## [31] vctrs_0.6.4 R6_2.5.1 png_0.1-8
## [34] stats4_4.3.1 lifecycle_1.0.4 stringr_1.5.1
## [37] htmlwidgets_1.6.3 MASS_7.3-60 bslib_0.6.0
## [40] glue_1.6.2 Rcpp_1.0.11 xfun_0.41
## [43] highr_0.10 rstudioapi_0.15.0 knitr_1.45
## [46] htmltools_0.5.7 nlme_3.1-164 rmarkdown_2.25
## [49] compiler_4.3.1
This concludes the example of analysis and reporting.
This section reviews many previous guidelines for reporting Bayesian analyses, in approximately chronological order. (This section is independent of the previous sections regarding the example of a Bayesian analysis and its reportage.)
The BayesWatch list by Spiegelhalter et al. (2000) (Ch. 8) occurred in the context of a 136 page monograph about Bayesian methods. The BARG covers everything in the BayesWatch list (except for one item idiosyncratic to technology assessment). Unlike most subsequent lists, the BayesWatch list explicitly included reporting the loss function used for decision making. But most studies, including the examples Spiegelhalter et al. (2000) used to illustrate BayesWatch, have no specific loss function. Spiegelhalter et al. (2000) did, however, provide examples of ROPEs, which they refer to as a “clinically important difference.” Unfortunately their examples did not report some of the essential details in the BARG, such as the ESS of MCMC chains. A clear example of a sensitivity analysis, using three different priors, was shown in Ch. 9 (pp. 59-62).
The BaSiS (Bayesian Standards in Science) list by Gatsonis & Goodman (2001) is a brief and terse document in draft form without much detail, but which nevertheless has been a landmark for subsequent lists. In its few details about reporting the posterior distribution, the BaSiS list included some optional details not in the BARG, namely, the “shape of the posterior densities of individual parameters” and “joint posterior probability intervals.” The BARG does not explicitly mention these because typically the shape of the posterior distribution is unimodal and skewness will be apparent by the asymmetry of the CI around the central tendency, and it is not clear from BaSiS exactly when joint intervals are needed. The BaSiS is brief and leaves out many items in the BARG.
The ROBUST (Reporting Of Bayes Used in clinical STudies) list was developed by Sung et al. (2005) by polling two dozen statisticians and reviewing the BayesWatch and BaSiS lists. Sung et al. (2005) selected the items mentioned by a majority of experts. Nevertheless, the resulting list is brief and subsumed by the BARG. ROBUST was intended as a minimal list for brief reports and leaves out many essential items such as details of the MCMC computations, separate considerations for hypothesis testing, etc. By contrast, the BARG recommends extensive reporting, but relegating some details to appendices or supplementary material if demanded by journal format or audience.
Guidelines for reporting Bayesian analyses were provided in Ch. 23 (pp. 619–622) of Kruschke (2011) and revised in Ch. 25 (pp. 721–725) of Kruschke (2015). The BARG is strongly influenced by those guidelines, and incorporates all of their points in an expanded structure.
The SAMPL (Statistical Analyses and Methods in the Published Literature) guidelines by Lang & Altman (2015) gave checklists for reporting some basic types of analyses. It included a separate short checklist for reporting Bayesian analyses, with no explanation for any of the points. The BARG includes them all.
Another brief checklist of “minimum reporting guidelines” appeared with some explanation in the context of a tutorial article regarding Bayesian linear regression by Baldwin & Larson (2017) (Section 4.1, p. 74). Again, the BARG includes all the mentioned points.
A checklist called WAMBS (When to worry and how to Avoid the Misuse of Bayesian Statistics) was presented by Depaoli & van de Schoot (2017). The WAMBS points out that prior distributions on higher-level parameters might produce unanticipated biases on the posterior distribution, so care should be taken to check that priors intended to be uninfluential really are. When considering MCMC diagnostics, the WAMBS did not mention effective sample size (ESS), although ESS is described in a subsequent tutorial (van de Schoot et al., 2020). The WAMBS instead recommended a procedure of repeating the analysis with double the chain length, which might not be practical with time-consuming computations and might still not achieve sufficient ESS. The WAMBS did not discriminate between continuous parameter estimation and model/hypothesis comparison, and did not discuss Bayes factors. The WAMBS was updated in van de Schoot et al. (2021) to a 10-point checklist which does highlight ESS and other important technical aspects. The BARG covers all the goals of the WAMBS, and includes various additional points such as treatment of posterior model probabilities in model comparison and hypothesis testing.
The 2018 APA reporting standards included a small subsection regarding Bayesian analyses (Table 8, p. 20, Appelbaum et al., 2018). The list is fairly extensive but has no explanatory text accompanying it nor any references to other sources, which is unfortunate because it includes some cryptic items such as: “Describe the unnormalized or normalized likelihood if the prior distribution is informative.” It does not mention effective sample size (ESS) of the MCMC chains. It has no sequential structure, and therefore would be difficult to use as a guide to conducting an analysis.
There is a reporting checklist in guidelines for using the JASP software (van Doorn et al., 2020), at the end of its Table 1. The overall guidelines include many important aspects of planning, executing, interpreting, and reporting a Bayesian analysis. However, the guidelines do not mention the importance of checking and reporting MCMC details such as ESS. The Table 1 checklist indicates that if there is hypothesis testing then Bayes factors should be reported, but the checklist does not say that posterior probabilities should be reported. (The graphical JASP output presents numerical BF’s but presents posterior probabilities only as pie charts assuming fixed 50/50 prior probabilities.) The guidelines highlight categorical labels of Bayes factors in Fig. 4 (e.g., “Bayes factors greater than 10 are considered strong evidence” p. 821) despite also cautioning against using Bayes factors to make decisions (“These classifications are heuristic and should not be misused as an absolute rule for all-or-nothing conclusions”, p. 821). Except for a few aspects of research design that fall outside the scope of reporting results, the BARG encompasses the JASP reporting guidelines.
Aczel et al. (2020) presented a “thinking guideline” for Bayesian analyses in Box 2 of their short article. It provides some important points for thinking clearly about goals and purposes of analysis, all of which are included in the BARG.
In conclusion, the BARG acknowledges and incorporates the points of previous guidelines, but with (relative to various predecessors) greater generality, thoroughness, explanation, and sequentiality. As mentioned in the main article, it is hoped that the BARG will be useful to researchers, authors, reviewers, editors, educators, and students.