The purpose of this talk is to familiarize the audience with Bayesian thinking and to demonstrate that with current software, Bayesian analysis is no longer restricted to a small number of expert statisticians.
Summary: Two groups of people took an IQ test.
Group 1 (\(N_1=47\)) consumed a “smart drug”, and Group 2 (\(N_2=42\)) was a control group that consumed a placebo (Kruschke, 2013).
The American Statistical Association (ASA) released a statement about p-values (Wasserstein and Lazar, 2016). Among its principles are:
Cumming (2014) claims that “we need to shift from reliance on NHST to estimation and other preferred techniques”.
Kruschke and Liddell (2018) argue that Bayesian methods are better suited to achieve this, for both hypothesis testing and parameter estimation.
According to Gigerenzer (2018), we need to stop relying on NHST and instead learn to use a statistical toolkit.
Many reviewers now demand Bayes factors (because a BF can provide evidence for/against hypotheses).
Bayesian data analysis is not limited to calculating Bayes factors.
more intuitive (quantification of uncertainty)
able to provide evidence for/against hypotheses
cognitive process models (Lee and Wagenmakers, 2013)
robust models
can include prior knowledge
better for multilevel models (Gelman and Hill, 2006)
based on probability theory (Bayes' theorem)
We will have a brief look at the theoretical background, then dive straight into a practical example.
Parameters are random variables. These are drawn from probability distributions, which reflect our uncertainty about the parameters.
The prior distribution is updated with the likelihood (data) to obtain a posterior distribution.
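Formally, this is Bayes' theorem applied to a parameter vector \(\theta\) and data \(y\); the denominator (the marginal likelihood) is a normalizing constant: \[ p(\theta \mid y) = \frac{p(y \mid \theta) \, p(\theta)}{p(y)} \propto p(y \mid \theta) \, p(\theta) \]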
To make this more concrete, we return to the IQ example.
Our research hypothesis: the smart-drug group has higher IQ scores than the placebo group. (The null hypothesis is that there is no difference between groups.)
We obtained a p-value of 0.055 (Welch's t-test).
Even after removing outliers, p is > 0.05
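In R, Welch's test is the default behaviour of t.test(). A minimal sketch, assuming the TwoGroupIQ data frame used with brms below:
# Welch's t-test (unequal variances are the default in R)
t.test(IQ ~ Group, data = TwoGroupIQ)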
Not in this case. The data provide little evidence either for or against our hypothesis.
The maximally attainable Bayes factor in favour of our hypothesis is \(\sim 1.8\).
A further benefit of going Bayesian is the ability to flexibly specify our generative model. We can make the model robust against outliers.
A t-test is really just a general linear model (assuming equal variances) \[ Y = \alpha + \beta X + \epsilon\] \[ \epsilon \sim N(0, \sigma^2) \]
where \(X\) is an indicator variable.
which can be read as:
\[ IQ = Placebo + \beta \cdot SmartDrug + \epsilon\] \[ \epsilon \sim N(0, \sigma^2) \]
The \(\beta\) parameter therefore represents the difference between groups.
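This equivalence is easy to verify in R. A minimal sketch (again assuming the TwoGroupIQ data frame): the coefficient for the group indicator in the linear model equals the mean difference tested by the equal-variance t-test, with the same t statistic.
# Equal-variance t-test and its linear-model equivalent
t.test(IQ ~ Group, data = TwoGroupIQ, var.equal = TRUE)
summary(lm(IQ ~ Group, data = TwoGroupIQ))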
Linear regression model as a probabilistic model
The graph on the right shows the data-generating process (dependencies among the random variables).
\(\alpha\) is our expectation for the placebo group, \(\beta\) is our expectation for the difference in means, and \(\sigma\) is the standard deviation of the outcome.
Probabilistic programming languages, such as Stan, let us write this model down directly:
data {
  int<lower=0> N;       // number of observations
  vector[N] x;          // group indicator (0 = placebo, 1 = smart drug)
  vector[N] y;          // IQ scores
}
parameters {
  real alpha;           // mean of the placebo group
  real beta;            // difference between group means
  real<lower=0> sigma;  // residual standard deviation
}
model {
  // priors
  sigma ~ cauchy(0, 2.5);
  alpha ~ normal(100, 15);
  beta ~ normal(0, 10);
  // likelihood
  y ~ normal(alpha + beta * x, sigma);
}
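In R, the same model can be fit with brms, which generates the Stan code for us. A minimal sketch (the object name fit_eqvar is illustrative; the priors mirror the Stan model above):
library(brms)

fit_eqvar <- brm(IQ ~ Group,
                 family = gaussian,
                 data = TwoGroupIQ,
                 prior = c(set_prior("normal(100, 15)", class = "Intercept"),
                           set_prior("normal(0, 10)", class = "b"),
                           set_prior("cauchy(0, 2.5)", class = "sigma")),
                 cores = parallel::detectCores())
summary(fit_eqvar)
This produces output like the following: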
Family: gaussian
Links: mu = identity; sigma = identity
Formula: IQ ~ Group
Data: TwoGroupIQ (Number of observations: 89)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept 100.36 0.72 98.91 101.75 4083 1.00
GroupSmartDrug 1.56 1.00 -0.37 3.56 4039 1.00
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma 4.76 0.36 4.12 5.53 3504 1.00
We obtain the marginal posterior distribution of the regression parameter \(\beta\).
We can draw samples from the posterior predictive distribution. The model predicts equal variances and no outliers.
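In brms, such a check is one line; a sketch, assuming the fitted object fit_eqvar from above:
# Overlay densities of replicated data sets on the observed data
pp_check(fit_eqvar, type = "dens_overlay")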
The Gaussian distribution is sensitive to outliers. We can model our data as being generated by a t distribution instead.
# Robust model: Student-t likelihood, separate group means (no intercept),
# and a group-specific residual scale
fit_robust <- brm(bf(IQ ~ 0 + Group, sigma ~ Group),
                  family = student,
                  data = TwoGroupIQ,
                  prior = c(set_prior("normal(100, 10)", class = "b"),
                            set_prior("cauchy(0, 1)", class = "b", dpar = "sigma"),
                            set_prior("exponential(1.0/29)", class = "nu")),
                  cores = parallel::detectCores(),
                  file = here::here("models/fitiq-robust"))
Family: student
Links: mu = identity; sigma = log; nu = identity
Formula: IQ ~ 0 + Group
sigma ~ Group
Data: TwoGroupIQ (Number of observations: 89)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma_Intercept 0.01 0.19 -0.37 0.38 4343 1.00
GroupPlacebo 100.52 0.21 100.12 100.93 4312 1.00
GroupSmartDrug 101.55 0.36 100.85 102.24 4262 1.00
sigma_GroupSmartDrug 0.61 0.25 0.10 1.11 3800 1.00
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
nu 1.74 0.45 1.10 2.81 3554 1.00
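A one-sided test of the group difference can be run directly on the fitted model with brms' hypothesis() function. A minimal sketch (the exact hypothesis string is an assumption, reconstructed from the truncated output below):
# Is the smart-drug group mean larger than the placebo mean?
hypothesis(fit_robust, "GroupSmartDrug - GroupPlacebo > 0")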
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (GroupSmartDrug)-... > 0 1.03 0.41 0.34 Inf 101.56
## Post.Prob Star
## 1 0.99 *
## ---
## '*': The expected value under the hypothesis lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
[Posterior predictive plots: equal-variance model vs. robust model]
We can compute Bayes factors using two different methods:
Savage-Dickey density ratio
Bridge sampling (used to obtain the marginal likelihood)
Comparison: Model with grouping variable vs model without (restricted).
For a more detailed description of Bayes factors, see here.
Bayes factors are difficult to compute, often only possible for certain restricted models.
Savage-Dickey density ratio:
## [1] 2.935822
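Both methods are available through brms. A minimal sketch (fit_full and fit_restricted are illustrative names for the models with and without the grouping variable; the point hypothesis requires fitting with sample_prior = TRUE, and bridge sampling requires save_all_pars = TRUE):
# Savage-Dickey: Evid.Ratio is the density ratio for the point hypothesis;
# its inverse is the Bayes factor in favour of an effect
hypothesis(fit_robust, "GroupSmartDrug - GroupPlacebo = 0")

# Bridge sampling: Bayes factor from the two marginal likelihoods
bayes_factor(fit_full, fit_restricted)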
We noted that the normal model does not describe the data well, so we modelled the outcome \(y\) as being drawn from a t distribution, which accounts for outliers.
Posterior predictive check: the data are well described by this model.
Parameter estimation: the parameter representing the group difference is positive, with a 95% credible interval of [0.23, 1.85].
The Bayes factor in favour of our hypothesis lies between 2.94 and 3.49, depending on the method used.
This analysis is exploratory.
We did not remove outliers.
We did not test more subjects.
Computing a BF requires carefully specifying prior distributions, and these need to be reported.
Read Statistical Rethinking and install.packages("brms")
Statistical Rethinking: an introduction to applied Bayesian data analysis, with a book website and lectures on YouTube.
brms: R package for Bayesian generalized multivariate non-linear multilevel models using Stan
Blog post by Matti Vuorre on how to perform a Bayesian t-test.
Blog post by A. Solomon Kurz on robust regression.
Blog post by Andrew Heiss: Various ways to measure the difference in means in two groups.
Cumming, G. (2014). “The New Statistics: Why and How”. In: Psychological Science 25.1, pp. 7-29. ISSN: 0956-7976. DOI: 10.1177/0956797613504966.
Gelman, A. and J. Hill (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press. ISBN: 978-0-511-79094-2. DOI: 10.1017/CBO9780511790942.
Gigerenzer, G. (2018). “Statistical Rituals: The Replication Delusion and How We Got There”. In: Advances in Methods and Practices in Psychological Science 1.2, pp. 198-218. ISSN: 2515-2459. DOI: 10.1177/2515245918771329.
Greenland, S., S. J. Senn, K. J. Rothman, et al. (2016). “Statistical Tests, P Values, Confidence Intervals, and Power: A Guide to Misinterpretations”. In: European Journal of Epidemiology 31.4, pp. 337-350. ISSN: 1573-7284. DOI: 10.1007/s10654-016-0149-3.
Kruschke, J. K. (2013). “Bayesian Estimation Supersedes the t Test”. In: Journal of Experimental Psychology: General 142.2, pp. 573-603. ISSN: 1939-2222. DOI: 10.1037/a0029146.
Kruschke, J. K. and T. M. Liddell (2018). “The Bayesian New Statistics: Hypothesis Testing, Estimation, Meta-Analysis, and Power Analysis from a Bayesian Perspective”. In: Psychonomic Bulletin & Review 25.1, pp. 178-206. ISSN: 1531-5320. DOI: 10.3758/s13423-016-1221-4.
Lee, M. D. and E. Wagenmakers (2013). Bayesian Cognitive Modeling: A Practical Course. Cambridge: Cambridge University Press. ISBN: 978-1-107-01845-7.
Wasserstein, R. L. and N. A. Lazar (2016). “The ASA’s Statement on p-Values: Context, Process, and Purpose”. In: The American Statistician 70.2, pp. 129-133. ISSN: 0003-1305. DOI: 10.1080/00031305.2016.1154108.