Note: The ESMS project aims to help Vietnamese medical students develop their data analysis skills by providing problem-based tutorials in R. This tutorial was inspired by a Facebook post of Pr. Nguyen Van Tuan (22/03/2017), in which he encouraged the use of the Bootstrapping method rather than non-parametric methods for dealing with messy data. We extend this topic with a robust function for bootstrapping the difference of Mean and Median. We also introduce some statistical inference tools for the Highest Density Interval (HDI, a Bayesian analogue of the confidence interval), such as the Bayes factor, the Region of Practical Equivalence (ROPE) and the comparison value.
Introduction
Last time, we explored the t test for comparing the means of 2 independent samples (http://rpubs.com/ledongnhatnam/253124). We know that the t test is only reliable when the data meet its assumptions: normal distribution, homogeneity of variance and absence of outliers. What should we do when these assumptions are not met? In the present tutorial, we introduce 3 alternative solutions to this problem: 1) the Wilcoxon-Mann-Whitney test; 2) Bootstrapping statistical inference; and 3) Bayesian analysis.
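For reference, here is a minimal sketch of the classical workflow that these alternatives replace, using two hypothetical numeric vectors x and y (placeholders, not part of our dataset):
set.seed(1)
x=rnorm(30,mean=10,sd=3)   # hypothetical sample 1
y=rnorm(30,mean=12,sd=3)   # hypothetical sample 2
shapiro.test(x); shapiro.test(y)   # normality check in each group
var.test(x,y)                      # homogeneity of variances
t.test(x,y,var.equal=TRUE)         # the classical two-sample t test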
Context
This case study is based on an original study of J. Ludbrook (1994).
Ref: J. Ludbrook (1994). Repeated measurements and multiple comparisons in cardiovascular research. Cardiovascular Research 28, 303–311.
The study’s objective was to evaluate the vasodilating effect of MDL 72222, a serotonin (5-HT3) antagonist, in rabbits. Although the original experiment involved the mixed effects of the MDL treatment against the stimulating effect of phenylbiguanide over time, we simplify the design by considering only the MDL treatment as a fixed effect. Our study question is whether MDL treatment significantly reduces the hypertensive effect of phenylbiguanide compared to Saline, i.e. whether there is a significant difference in blood pressure change between the MDL and control groups.
Materials and method
The original dataset can be downloaded directly from the famous http://vincentarelbundock.github.io/Rdatasets/datasets.html website.
The following packages must be installed and loaded in RStudio (some other packages might also be required during the analysis):
library(tidyverse) # for data management
library(fBasics) # for testing the normal distribution
library(brms) # for Bayesian regression analysis
Preparing the data
library(tidyverse)
bpdata=read.csv("http://vincentarelbundock.github.io/Rdatasets/csv/MASS/Rabbit.csv")%>%as_tibble()
bpdata
## # A tibble: 60 × 6
## X BPchange Dose Run Treatment Animal
## <int> <dbl> <dbl> <fctr> <fctr> <fctr>
## 1 1 0.50 6.25 C1 Control R1
## 2 2 4.50 12.50 C1 Control R1
## 3 3 10.00 25.00 C1 Control R1
## 4 4 26.00 50.00 C1 Control R1
## 5 5 37.00 100.00 C1 Control R1
## 6 6 32.00 200.00 C1 Control R1
## 7 7 1.00 6.25 C2 Control R2
## 8 8 1.25 12.50 C2 Control R2
## 9 9 4.00 25.00 C2 Control R2
## 10 10 12.00 50.00 C2 Control R2
## # ... with 50 more rows
As mentioned above, we will focus on 2 variables: the outcome is BPchange and the factor is Treatment.
Data exploration
library(fBasics)
bpdata$BPchange%>%split(bpdata$Treatment)%>%map(~dagoTest(.))
## $Control
##
## Title:
## D'Agostino Normality Test
##
## Test Results:
## STATISTIC:
## Chi2 | Omnibus: 7.413
## Z3 | Skewness: 1.1895
## Z4 | Kurtosis: -2.4491
## P VALUE:
## Omnibus Test: 0.02456
## Skewness Test: 0.2343
## Kurtosis Test: 0.01432
##
## Description:
## Thu Mar 23 09:33:05 2017 by user: Admin
##
##
## $MDL
##
## Title:
## D'Agostino Normality Test
##
## Test Results:
## STATISTIC:
## Chi2 | Omnibus: 8.2118
## Z3 | Skewness: 2.7684
## Z4 | Kurtosis: 0.74
## P VALUE:
## Omnibus Test: 0.01648
## Skewness Test: 0.005633
## Kurtosis Test: 0.4593
##
## Description:
## Thu Mar 23 09:33:05 2017 by user: Admin
bpdata%>%ggplot(aes(x=BPchange,fill=Treatment))+geom_density(alpha=0.5)+theme_bw()+scale_fill_manual(values=RColorBrewer::brewer.pal(3,"Set1"))
bpdata%>%ggplot(aes(x=Treatment,y=BPchange,fill=Treatment))+geom_boxplot(alpha=0.5)+theme_bw()+scale_fill_manual(values=RColorBrewer::brewer.pal(3,"Set1"))
psych::describeBy(bpdata$BPchange,bpdata$Treatment)
## $Control
## vars n mean sd median trimmed mad min max range skew kurtosis
## X1 1 30 13.56 12.13 11 12.56 14.08 0.5 37 36.5 0.44 -1.35
## se
## X1 2.22
##
## $MDL
## vars n mean sd median trimmed mad min max range skew kurtosis
## X1 1 30 8.88 10.46 2.8 7.2 2.52 0.75 37 36.25 1.16 -0.03
## se
## X1 1.91
##
## attr(,"call")
## by.default(data = x, INDICES = group, FUN = describe, type = type)
The data exploration shows that the BPchange outcome is not normally distributed in either treatment group. The two distributions also differ markedly in shape, which makes the comparison more difficult, as we can no longer state our null hypothesis simply in terms of the mean or the median.
Looking at the descriptive statistics, we can expect BPchange to be lower in the MDL group than in the Control group, i.e. the MDL treatment inhibited the hypertensive effect of the stimulant, but we do not yet know whether this difference is significant.
1) Non-parametric method - Wilcoxon-Mann-Whitney test
The Wilcoxon rank sum test (or Mann-Whitney U test) is often described as a test of the difference in medians between 2 independent samples. However, as mentioned above, we cannot frame the comparison in terms of medians here, because the two samples’ distributions have very different shapes.
The null hypothesis H0 for the Wilcoxon test must therefore be expressed as “The distribution of BPchange is the same in the two treatment groups”.
wilcox.test(bpdata$BPchange~bpdata$Treatment,alternative ="two.sided")
##
## Wilcoxon rank sum test with continuity correction
##
## data: bpdata$BPchange by bpdata$Treatment
## W = 546, p-value = 0.1577
## alternative hypothesis: true location shift is not equal to 0
The result of the Wilcoxon rank sum test is disappointing, as we fail to reject the null hypothesis. Even if the test had been positive, it would not tell us much about the effect of the MDL treatment, since the null hypothesis concerns the overall distributions rather than a quantitative interval, mean or median.
But this is not the end of the road. We still have two powerful methods. The first one is bootstrapping statistical inference.
2) Bootstrapping method
Bootstrapping consists of random sampling with replacement. From the bootstrap distribution we can derive a confidence interval (I prefer the term highest density interval) that can be used for statistical inference.
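The core idea can be illustrated in a few lines of base R before wrapping it into the boot package; this minimal sketch resamples the BPchange values of the Control group only:
set.seed(123)
ctrl=bpdata$BPchange[bpdata$Treatment=="Control"]            # observed values of one group
boot_means=replicate(1000,mean(sample(ctrl,replace=TRUE)))   # 1000 means of resampled data
quantile(boot_means,c(0.025,0.975))                          # a simple 95% percentile interval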
In R, a bootstrapping process consists of 2 components: a statistical function that generates the metrics, and the boot function that performs the resampling.
# This is the statistical function: it generates the difference of Mean and of Median of an outcome between two factor levels
bootDif=function(outcome,factor,data,i){
d=data[i,]   # the resampled data for this bootstrap iteration
sum=d%>%select(.,get(factor),get(outcome))%>%split(.[,factor])%>%map(~psych::describe(.))
med1=sum[[1]]%>%.[2,5]    # median of the outcome in the first factor level
med2=sum[[2]]%>%.[2,5]    # median in the second level
mean1=sum[[1]]%>%.[2,3]   # mean in the first level
mean2=sum[[2]]%>%.[2,3]   # mean in the second level
meddif=med1-med2
meandif=mean1-mean2
cbind(meandif,meddif)     # return both statistics to boot()
}
library(boot)
set.seed(123)
# The boot function will perform the resampling process 1000 times, and apply the statistical function on each iteration. Results are compiled in an output object named "res"
res=boot(statistic=bootDif,outcome="BPchange",factor="Treatment",data=bpdata,R=1000)%>%.$t%>%as_tibble()
names(res)=c("MeanDiff","MedianDiff")
res$Iteration=row.names(res)
We can then explore the output object in many ways. The output is a sample of bootstrapped statistics; in this case it contains two metrics: the difference of means and the difference of medians.
The conventional method for interpreting a bootstrap result is to examine the interval containing 95% of the density mass (the highest density interval):
Hmisc::describe(res$MeanDiff)
## res$MeanDiff
## n missing distinct Info Mean Gmd .05 .10
## 1000 0 998 1 4.667 3.226 -0.2791 0.8740
## .25 .50 .75 .90 .95
## 2.8553 4.7139 6.5538 8.2045 9.0469
##
## lowest : -4.807589 -3.726667 -3.351562 -3.304054 -3.085429
## highest: 12.167714 12.220707 12.329429 12.329798 12.368973
Hmisc::describe(res$MedianDiff)
## res$MedianDiff
## n missing distinct Info Mean Gmd .05 .10
## 1000 0 212 1 7.106 6.02 -0.5 1.0
## .25 .50 .75 .90 .95
## 2.7 7.5 11.0 14.0 15.5
##
## lowest : -13.00 -10.00 -8.00 -7.50 -6.75, highest: 18.30 19.40 19.50 19.70 20.00
The .05 and .95 quantiles reported above give intervals of (-0.2791, 9.0469) for the Mean difference and (-0.5, 15.5) for the Median difference; note that these quantiles bracket 90%, not 95%, of the bootstrap distribution. Neither interval allows a firm conclusion about the direction of the difference, since both include zero.
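A 95% percentile interval can be obtained directly from the bootstrap samples, for example:
quantile(res$MeanDiff,c(0.025,0.975))    # 95% percentile interval of the mean difference
quantile(res$MedianDiff,c(0.025,0.975))  # 95% percentile interval of the median difference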
We can also interpret this highest density interval in a more flexible way, using John Kruschke’s method. This consists of setting a ROPE (Region of Practical Equivalence) and/or a comparison value (Comp.Val), then computing the proportion of density mass (%) that falls inside or outside the ROPE, and above or below the Comp.Val.
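Before wrapping this logic into reusable functions, each quantity can be computed in a single line; a minimal sketch on the bootstrapped median difference, using the same Comp.Val (0) and ROPE (-1 to +1) as in the analysis below:
mean(res$MedianDiff>0)*100                         # % of density above the Comp.Val
mean(res$MedianDiff>=-1 & res$MedianDiff<=1)*100   # % of density inside the ROPE
mean(res$MedianDiff>1)*100                         # % of density above the ROPE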
To do this, we will set up some functions for calculating the density mass:
# HDIF: computes the limits of the highest density interval (HDI) of a sample vector, following Kruschke
HDIF= function( sampleVec,credMass=0.975 ) {
sortedPts = sort( sampleVec )
ciIdxInc = ceiling( credMass * length( sortedPts ) )  # number of points inside the interval
nCIs = length( sortedPts ) - ciIdxInc                 # number of candidate intervals to compare
ciWidth = rep( 0 , nCIs )
for ( i in 1:nCIs ) {
ciWidth[ i ] = sortedPts[ i + ciIdxInc ] - sortedPts[ i ]
}
HDImin = sortedPts[ which.min( ciWidth ) ]   # the narrowest candidate interval is the HDI
HDImax = sortedPts[ which.min( ciWidth ) + ciIdxInc ]
HDIlim = c( HDImin , HDImax )
return( HDIlim )
}
# SBOO: summarises a sample vector (mean, median, mode, HDI limits) and its position relative to a comparison value and a ROPE
SBOO=function(paramSampleVec,compVal=NULL , ROPE=NULL , credMass=0.975) {
meanParam = mean( paramSampleVec )
medianParam = median( paramSampleVec )
dres = density( paramSampleVec )
modeParam = dres$x[which.max(dres$y)]
hdiLim = HDIF( paramSampleVec , credMass=credMass )
if ( !is.null(compVal) ) {
pcgtCompVal = ( 100 * sum( paramSampleVec > compVal )
/ length( paramSampleVec ) )
} else {
compVal=NA
pcgtCompVal=NA
}
if ( !is.null(ROPE) ) {
pcltRope = ( 100 * sum( paramSampleVec < ROPE[1] )
/ length( paramSampleVec ) )
pcgtRope = ( 100 * sum( paramSampleVec > ROPE[2] )
/ length( paramSampleVec ) )
pcinRope = 100-(pcltRope+pcgtRope)
} else {
ROPE = c(NA,NA)
pcltRope=NA
pcgtRope=NA
pcinRope=NA
}
return( c( Mean=meanParam , Median=medianParam , Mode=modeParam ,
HDIlevel=credMass , LL=hdiLim[1] , UL=hdiLim[2] ,
CompVal=compVal , PcntGtCompVal=pcgtCompVal ,
ROPElow=ROPE[1] , ROPEhigh=ROPE[2] ,
PcntLtROPE=pcltRope , PcntInROPE=pcinRope , PcntGtROPE=pcgtRope ) )
}
# summaryBOOT: wrapper that returns the SBOO summary as a one-column table
summaryBOOT=function(Bootvector,compVal=NULL, rope=NULL,credMass=NULL){
summaryInfo = NULL
summaryInfo = cbind(summaryInfo, "Estimated"= SBOO(Bootvector,
compVal=compVal,
ROPE=rope,credMass=credMass))
return(summaryInfo)
}
Assume that our null hypothesis is based on a Comp.Val of zero BP change, and that our ROPE extends from -1 to +1 unit of BP change (the settings used in the call below). Values of the Mean or Median difference that fall inside the ROPE are considered practically equivalent to zero, whilst the portion of density in the inferiority or superiority zones is considered a meaningful difference. The HDI mass is set to 97.5%:
summaryBOOT(Bootvector = res$MedianDiff,compVal=0,rope=c(-1,1),credMass=0.975)
## Estimated
## Mean 7.106350
## Median 7.500000
## Mode 2.438042
## HDIlevel 0.975000
## LL -5.000000
## UL 17.000000
## CompVal 0.000000
## PcntGtCompVal 93.400000
## ROPElow -1.000000
## ROPEhigh 1.000000
## PcntLtROPE 3.800000
## PcntInROPE 6.900000
## PcntGtROPE 89.300000
The median difference is the difference in the median of BPchange between the Control and MDL groups (Control - MDL). As the MDL group has a lower BP change, this value is positive.
Although the 97.5% HDI includes zero (-5 to 17), 93.4% of the density mass is greater than the Comp.Val of zero and 89.3% lies above the upper bound of the ROPE, whilst only 6.9% of the density mass falls within the ROPE and 3.8% below its lower bound. These results indicate a high probability of a positive Control - MDL difference, i.e. the evidence weighs toward a positive difference in blood pressure change.
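As a sanity check, the observed (non-bootstrapped) Control - MDL differences can be computed directly from the data; from the descriptive summary above they should be roughly 13.56 - 8.88 = 4.68 for the mean and 11 - 2.8 = 8.2 for the median:
with(bpdata,mean(BPchange[Treatment=="Control"])-mean(BPchange[Treatment=="MDL"]))
with(bpdata,median(BPchange[Treatment=="Control"])-median(BPchange[Treatment=="MDL"]))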
summaryBOOT(Bootvector = res$MeanDiff,compVal=0,rope=c(-1,1),credMass=0.975)
## Estimated
## Mean 4.666958
## Median 4.713939
## Mode 5.807220
## HDIlevel 0.975000
## LL -2.190179
## UL 10.708608
## CompVal 0.000000
## PcntGtCompVal 94.100000
## ROPElow -1.000000
## ROPEhigh 1.000000
## PcntLtROPE 2.900000
## PcntInROPE 7.700000
## PcntGtROPE 89.400000
In the same way, for the Mean difference: only 7.7% of the density mass falls within the ROPE, while 94.1% lies above the Comp.Val and 89.4% above the ROPE. Again, the evidence favours a positive difference.
Those results could be visually displayed as follows:
# Blue dashed line = Comp.Val (0); red long-dashed lines = ROPE limits (-1, +1)
res%>%ggplot(aes(x=MedianDiff))+geom_histogram(alpha=0.7,fill="gold",color="black",bins = 100)+theme_bw()+geom_vline(xintercept=0,size=0.8,linetype="dashed",color="blue")+geom_vline(xintercept=c(-1,1),size=0.8,linetype="longdash",color="red4")
res%>%ggplot(aes(x=MeanDiff))+geom_histogram(alpha=0.7,fill="skyblue",color="black",bins = 100)+theme_bw()+geom_vline(xintercept=0,size=0.8,linetype="dashed",color="blue")+geom_vline(xintercept=c(-1,1),size=0.8,linetype="longdash",color="red4")
The evolution of Mean and Median differences through 1000 bootstrap iterations could be plotted as below:
library(dygraphs)
res%>%dygraph(main = "Bootstraped Difference", ylab = "Difference",xlab="Iteration")%>%
dySeries("MeanDiff",label="Mean") %>%
dySeries("MedianDiff",label="Median") %>%
dyOptions(stackedGraph = TRUE,fillGraph = TRUE,colors = RColorBrewer::brewer.pal(3, "Set1"),fillAlpha = 0.5)%>%
dyRangeSelector(height = 40)
3) Bayesian analysis
The Bayesian analysis is done by fitting a model with the MCMC sampler Stan. The brms package translates our model into Stan code and then lets the sampler do its job. The model evaluates the fixed effect of the Treatment factor on the BPchange outcome; our target parameter is the regression coefficient (beta) for Treatment. A Bayesian model must be built upon prior distributions for the parameters in the model. Here is a summary of our priors:
[Figure: summary of the prior distributions]
Having specified the priors, we fit the model with the Stan MCMC sampler using 4 chains of 10000 iterations each, of which the first 1000 are warm-up iterations (used to stabilise the sampling process), leaving 9000 post-warmup samples per chain.
library(brms)
priort=get_prior(data=bpdata,formula=BPchange~Treatment,family=student)
priort
## prior class coef group nlpar bound
## 1 b
## 2 b Intercept
## 3 b TreatmentMDL
## 4 Intercept
## 5 gamma(2, 0.1) nu
## 6 student_t(3, 0, 11) sigma
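In this tutorial we keep these default priors, but brms also allows explicit priors to be specified with set_prior; a hedged sketch of one possible (hypothetical, weakly informative) choice would be:
# Hypothetical alternative: a weakly informative normal prior on the regression coefficients
mypriors=c(set_prior("normal(0,20)",class="b"),
set_prior("student_t(3,0,11)",class="sigma"))
# mypriors could then be passed to brm() via the prior argument instead of priort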
set.seed(123)
bayesmod=brm(data=bpdata,formula=BPchange~Treatment,family=student,prior=priort,chains=4,iter=10000,warmup = 1000)
##
## SAMPLING FOR MODEL 'student(identity) brms-model' NOW (CHAIN 1).
##
## Chain 1, Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1, Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1, Iteration: 1001 / 10000 [ 10%] (Sampling)
## Chain 1, Iteration: 2000 / 10000 [ 20%] (Sampling)
## Chain 1, Iteration: 3000 / 10000 [ 30%] (Sampling)
## Chain 1, Iteration: 4000 / 10000 [ 40%] (Sampling)
## Chain 1, Iteration: 5000 / 10000 [ 50%] (Sampling)
## Chain 1, Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1, Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1, Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1, Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1, Iteration: 10000 / 10000 [100%] (Sampling)
## Elapsed Time: 0.099 seconds (Warm-up)
## 0.789 seconds (Sampling)
## 0.888 seconds (Total)
##
##
## SAMPLING FOR MODEL 'student(identity) brms-model' NOW (CHAIN 2).
##
## Chain 2, Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 2, Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 2, Iteration: 1001 / 10000 [ 10%] (Sampling)
## Chain 2, Iteration: 2000 / 10000 [ 20%] (Sampling)
## Chain 2, Iteration: 3000 / 10000 [ 30%] (Sampling)
## Chain 2, Iteration: 4000 / 10000 [ 40%] (Sampling)
## Chain 2, Iteration: 5000 / 10000 [ 50%] (Sampling)
## Chain 2, Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 2, Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 2, Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 2, Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 2, Iteration: 10000 / 10000 [100%] (Sampling)
## Elapsed Time: 0.167 seconds (Warm-up)
## 0.922 seconds (Sampling)
## 1.089 seconds (Total)
##
##
## SAMPLING FOR MODEL 'student(identity) brms-model' NOW (CHAIN 3).
##
## Chain 3, Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 3, Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 3, Iteration: 1001 / 10000 [ 10%] (Sampling)
## Chain 3, Iteration: 2000 / 10000 [ 20%] (Sampling)
## Chain 3, Iteration: 3000 / 10000 [ 30%] (Sampling)
## Chain 3, Iteration: 4000 / 10000 [ 40%] (Sampling)
## Chain 3, Iteration: 5000 / 10000 [ 50%] (Sampling)
## Chain 3, Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 3, Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 3, Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 3, Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 3, Iteration: 10000 / 10000 [100%] (Sampling)
## Elapsed Time: 0.109 seconds (Warm-up)
## 0.799 seconds (Sampling)
## 0.908 seconds (Total)
##
##
## SAMPLING FOR MODEL 'student(identity) brms-model' NOW (CHAIN 4).
##
## Chain 4, Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 4, Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 4, Iteration: 1001 / 10000 [ 10%] (Sampling)
## Chain 4, Iteration: 2000 / 10000 [ 20%] (Sampling)
## Chain 4, Iteration: 3000 / 10000 [ 30%] (Sampling)
## Chain 4, Iteration: 4000 / 10000 [ 40%] (Sampling)
## Chain 4, Iteration: 5000 / 10000 [ 50%] (Sampling)
## Chain 4, Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 4, Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 4, Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 4, Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 4, Iteration: 10000 / 10000 [100%] (Sampling)
## Elapsed Time: 0.117 seconds (Warm-up)
## 0.756 seconds (Sampling)
## 0.873 seconds (Total)
summary(bayesmod)
## Family: student (identity)
## Formula: BPchange ~ Treatment
## Data: bpdata (Number of observations: 60)
## Samples: 4 chains, each with iter = 10000; warmup = 1000; thin = 1;
## total post-warmup samples = 36000
## WAIC: Not computed
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept 13.20 2.17 8.96 17.48 36000 1
## TreatmentMDL -4.89 2.97 -10.73 0.96 36000 1
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma 11.05 1.14 9.00 13.48 36000 1
## nu 25.07 14.51 6.17 61.17 36000 1
##
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
## is a crude measure of effective sample size, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(bayesmod)
The Bayesian inference is based on the posterior distribution of the regression coefficient for the Treatment factor. The posterior samples are stored in 4 MCMC chains.
The posterior distribution can be interpreted using either the Bayes factor (the conventional method) or John Kruschke’s HDI-based inference.
The Bayes factor can be explained as follows:
[Figure: Bayes factor]
hypothesis(bayesmod,"TreatmentMDL<0",alpha=0.05)
## Hypothesis Tests for class b:
## Estimate Est.Error l-95% CI u-95% CI Evid.Ratio
## (TreatmentMDL) < 0 -4.89 2.97 -Inf -0.02 19.24 *
## ---
## '*': The expected value under the hypothesis lies outside the 95% CI.
hypothesis(bayesmod,"TreatmentMDL<1",alpha=0.05)
## Hypothesis Tests for class b:
## Estimate Est.Error l-95% CI u-95% CI Evid.Ratio
## (TreatmentMDL)-(1) < 0 -5.89 2.97 -Inf -1.02 40.43 *
## ---
## '*': The expected value under the hypothesis lies outside the 95% CI.
hypothesis(bayesmod,"TreatmentMDL<5",alpha=0.05)
## Hypothesis Tests for class b:
## Estimate Est.Error l-95% CI u-95% CI Evid.Ratio
## (TreatmentMDL)-(5) < 0 -9.89 2.97 -Inf -5.02 1284.71 *
## ---
## '*': The expected value under the hypothesis lies outside the 95% CI.
Bayesian inference is much more flexible than the p-value of null hypothesis testing in the frequentist approach, as we can set our null hypothesis at any value, not only zero.
In this example, we set H0 at 3 critical values: 0, 1 and 5 units of BPchange. The evidence ratios (Bayes factors) indicate that treatment with the serotonin antagonist did have a meaningful effect on BPchange: blood pressure increased less under phenylbiguanide injection in the MDL group than in the control group. For critical levels of 0, 1 and 5 mmHg, the Bayes factors were respectively 19.24, 40.43 and 1284.71, and all of them were significant at the alpha level of 0.05.
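The evidence ratio reported by hypothesis() for a one-sided hypothesis is the posterior odds of that hypothesis, which can be checked directly from the posterior draws; a minimal sketch, assuming posterior_samples() as the accessor for the draws in the installed brms version:
post=posterior_samples(bayesmod,pars="b_TreatmentMDL")   # posterior draws of the Treatment coefficient
p=mean(post$b_TreatmentMDL<0)                            # posterior probability that the effect is negative
p/(1-p)                                                  # should be close to the Evid.Ratio of 19.24 reported above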
Now, we can also explore the MCMC chains (or Posterior distributions of Treatment effect) using ROPE and Comp.Val as follows:
mcmcdf=as.mcmc(bayesmod)%>%as.matrix(,chains=TRUE)%>%as_tibble()%>%.[,c(1,3)]
mcmcdf$CHAIN=as.factor(mcmcdf$CHAIN)
names(mcmcdf)=c("MCMCChain","MDLTreatment")
# Blue dashed line = Comp.Val (0); red long-dashed lines = ROPE limits (-1, +1)
mcmcdf%>%ggplot(aes(x=MDLTreatment,fill=MCMCChain))+geom_histogram(alpha=0.5,color="black",bins=100)+theme_bw()+geom_vline(xintercept=0,size=0.8,linetype="dashed",color="blue")+geom_vline(xintercept=c(-1,1),size=0.8,linetype="longdash",color="red4")+facet_wrap(~MCMCChain,ncol=1)
C1=subset(mcmcdf,MCMCChain=="1")%>%.$MDLTreatment
C2=subset(mcmcdf,MCMCChain=="2")%>%.$MDLTreatment
C3=subset(mcmcdf,MCMCChain=="3")%>%.$MDLTreatment
C4=subset(mcmcdf,MCMCChain=="4")%>%.$MDLTreatment
mcmc=cbind(C1,C2,C3,C4)%>%as_tibble()
mcmc$Iteration=row.names(mcmc)
mcmc%>%dygraph(main = "MDL Effect", ylab = "Difference",xlab="Iteration")%>%
dySeries("C1",label="C1") %>%
dySeries("C2",label="C2") %>%
dySeries("C3",label="C3") %>%
dySeries("C4",label="C4") %>%
dyOptions(stackedGraph = TRUE,colors = RColorBrewer::brewer.pal(4, "Set1"),fillAlpha = 0.1)%>%
dyRangeSelector(height = 40)
summaryBOOT(Bootvector=mcmcdf$MDLTreatment,compVal=0,rope=c(-1,1),credMass=0.975)
## Estimated
## Mean -4.889594
## Median -4.903067
## Mode -5.175787
## HDIlevel 0.975000
## LL -11.550405
## UL 1.882189
## CompVal 0.000000
## PcntGtCompVal 4.941667
## ROPElow -1.000000
## ROPEhigh 1.000000
## PcntLtROPE 90.636111
## PcntInROPE 6.950000
## PcntGtROPE 2.413889
The result shows that 90.64% of the posterior density falls below the lower bound of the ROPE, whilst only 6.95% of the density mass falls inside the ROPE. The evidence weighs toward a relevant (negative) effect of the MDL treatment on BPchange.
Conclusion
Through this data experiment, we have explored 3 different methods for comparing 2 independent samples. Although a non-parametric test like the Wilcoxon-Mann-Whitney test can tolerate most assumption violations in the data, including discrete variables, it provides rather poor information. Bootstrapping and Bayesian regression are more robust and flexible methods that should be considered as replacements for the Student t test when its assumptions are violated. The Bayesian method truly supersedes the t test and non-parametric tests, since it can be extended with multiple factors, covariates, interactions and random effects. Bayesian models also offer much more flexible inference through the Bayes factor and other statistical inference tools. We strongly recommend adopting Bayesian analysis or bootstrapping statistical inference in biomedical research.
P.S: This study is dedicated to Pr. Nguyen Van Tuan, who inspired me and many generations of medical students to become true data scientists.