1 Preface
2 Learning Outcomes
3 Chapter 1: Why Bayesian Statistics?
- 3.1 1.1 Motivation
- 3.2 1.2 Frequentist and Bayesian Views
- 3.3 1.3 A Simple Everyday Example
- 3.4 1.4 Bayesian Learning
4 Chapter 2: Probability Refresher
- 4.1 2.1 What is Probability?
- 4.2 2.2 Sample Space
- 4.3 2.3 Complement Rule
- 4.4 2.4 Addition Rule
- 4.5 2.5 Conditional Probability
- 4.6 2.6 R Example: Conditional Probability
5 Chapter 3: Bayes’ Theorem
- 5.1 3.1 Bayes’ Theorem
- 5.2 3.2 Bayesian Formula for Parameters
- 5.3 3.3 Medical Testing Example
- 5.4 3.4 R Calculation
- 5.5 3.5 Simulation of Medical Testing
- 5.6 3.6 Graph: Test Results by Disease Status
6 Chapter 4: Prior Distributions
- 6.1 4.1 What is a Prior?
- 6.2 4.2 Uniform Prior
- 6.3 4.3 Beta Distribution
- 6.4 4.4 Shapes of Beta Priors
- 6.5 4.5 Interpretation of Beta Parameters
7 Chapter 5: Likelihood
- 7.1 5.1 What is Likelihood?
- 7.2 5.2 Coin Toss Example
8 Chapter 6: Posterior Distribution
- 8.1 6.1 Prior × Likelihood = Posterior
- 8.2 6.2 Example: Beta-Binomial Posterior
- 8.3 6.3 Prior, Likelihood, and Posterior Together
9 Chapter 7: Bayesian Estimation
- 9.1 7.1 Posterior Mean
- 9.2 7.2 Posterior Variance
- 9.3 7.3 Posterior Mode
- 9.4 7.4 R Posterior Summary
10 Chapter 8: Credible Intervals
- 10.1 8.1 What is a Credible Interval?
- 10.2 8.2 R Calculation
- 10.3 8.3 Credible Interval vs Confidence Interval
11 Chapter 9: Posterior Simulation
- 11.1 9.1 Why Simulate?
- 11.2 9.2 R Posterior Simulation
- 11.3 9.3 Histogram of Posterior Samples
12 Chapter 10: Posterior Predictive Distribution
- 12.1 10.1 Prediction in Bayesian Statistics
- 12.2 10.2 Predicting the Next Coin Toss
- 12.3 10.3 Predicting Future Number of Heads
13 Chapter 11: Prior Sensitivity Analysis
- 13.1 11.1 Why Prior Sensitivity Matters
- 13.2 11.2 Comparing Different Priors
- 13.3 11.3 Graph of Posterior Under Different Priors
14 Chapter 12: Bayesian Estimation of a Normal Mean
- 14.1 12.1 Problem Setting
- 14.2 12.2 Posterior Distribution
- 14.3 12.3 Simulated Dataset
- 14.4 12.4 Data Visualization
- 14.5 12.5 Bayesian Calculation in R
- 14.6 12.6 Prior and Posterior Graph
15 Chapter 13: Bayesian Linear Regression
- 15.1 13.1 Regression Model
- 15.2 13.2 Simulated Regression Dataset
- 15.3 13.3 Scatterplot
- 15.4 13.4 Approximate Bayesian Regression by Simulation
- 15.5 13.5 Posterior Distribution of Slope
- 15.6 13.6 Posterior Prediction Lines
16 Chapter 14: Introduction to MCMC
- 16.1 14.1 Why Do We Need MCMC?
- 16.2 14.2 Basic Idea of MCMC
- 16.3 14.3 Simple Metropolis Algorithm Example
- 16.4 14.4 Trace Plot
- 16.5 14.5 MCMC Posterior Histogram
17 Chapter 15: Bayesian Workflow
- 17.1 15.1 Full Bayesian Workflow
- 17.2 15.2 Reporting Bayesian Results
18 Chapter 16: Complete Applied Case Study
- 18.1 16.1 Case Study: Student Success Probability
- 18.2 16.2 Model
- 18.3 16.3 R Analysis
- 18.4 16.4 Probability That Pass Rate Exceeds 70%
- 18.5 16.5 Graph
19 Chapter 17: Common Bayesian Terms
20 Chapter 18: Exercises
- 20.1 Exercise 1
- 20.2 Exercise 2
- 20.3 Exercise 3
- 20.4 Exercise 4
- 20.5 Exercise 5
21 Chapter 19: Solutions to Selected Exercises
- 21.1 Solution to Exercise 1
22 Appendix A: Formula Sheet
- 22.1 Bayes’ Theorem
- 22.2 Bayesian Parameter Form
- 22.3 Proportional Posterior
- 22.4 Binomial Likelihood
- 22.5 Beta Prior
- 22.6 Beta-Binomial Posterior
- 22.7 Beta Mean
- 22.8 Beta Variance
- 22.9 Posterior Predictive Distribution
23 Appendix B: Recommended R Functions
24 Appendix C: Final Teaching Message

1 Preface

This textbook-style R Markdown note introduces Bayesian statistics from the beginning. It is written for students who have basic knowledge of descriptive statistics and probability but are new to Bayesian thinking.

The aim is to teach Bayesian statistics using three connected layers:

Conceptual understanding: What does Bayesian thinking mean?
Mathematical notation: How do we write Bayesian models formally?
R implementation: How do we simulate, visualize, estimate, and interpret Bayesian models?

This note is suitable for first-year master’s students in statistics, data science, epidemiology, business analytics, and applied research methods.

2 Learning Outcomes

After completing this note, students should be able to:

Explain the difference between frequentist and Bayesian statistics.
Understand prior, likelihood, evidence, and posterior.
Apply Bayes’ theorem to real examples.
Use the Beta-Binomial model for proportions.
Simulate Bayesian posterior distributions in R.
Interpret credible intervals and posterior predictive distributions.
Understand the basic idea of Bayesian regression.
Understand why MCMC is required in modern Bayesian analysis.
Produce Bayesian graphs and tables using R.

3 Chapter 1: Why Bayesian Statistics?

3.1 1.1 Motivation

Statistics is about learning from data. In many real-world problems, we do not know the true value of a parameter. For example:

What proportion of voters support a political party?
What is the probability that a patient has a disease?
What is the average effect of a new teaching method?
What is the future number of disease cases?
What is the probability that a machine will fail?

In classical or frequentist statistics, unknown parameters are usually treated as fixed but unknown quantities. In Bayesian statistics, unknown parameters are treated as uncertain quantities, and this uncertainty is described using probability distributions.

3.2 1.2 Frequentist and Bayesian Views

Let \(\theta\) be an unknown parameter.

In the frequentist approach:

\[ \theta = \text{fixed but unknown} \]

In the Bayesian approach:

\[ \theta \sim p(\theta) \]

This means \(\theta\) has a probability distribution that represents our uncertainty.

3.3 1.3 A Simple Everyday Example

Suppose you want to know whether it will rain today.

Before looking outside, you may think:

\[ P(\text{Rain}) = 0.20 \]

Then you see dark clouds. Your belief changes:

\[ P(\text{Rain} \mid \text{Dark clouds}) > 0.20 \]

This is Bayesian thinking: update your belief when new evidence arrives.

3.4 1.4 Bayesian Learning

Bayesian learning can be summarized as:

\[ \text{Posterior belief} = \text{Prior belief} + \text{Information from data} \]

More formally:

\[ \text{Posterior} \propto \text{Likelihood} \times \text{Prior} \]

4 Chapter 2: Probability Refresher

4.1 2.1 What is Probability?

Probability measures uncertainty. If \(A\) is an event, then:

\[ 0 \leq P(A) \leq 1 \]

where:

\(P(A)=0\) means impossible.
\(P(A)=1\) means certain.

4.2 2.2 Sample Space

The sample space is the set of all possible outcomes.

Example: rolling a fair die.

\[ S = \{1,2,3,4,5,6\} \]

The probability of rolling a six is:

\[ P(6)=\frac{1}{6} \]

4.3 2.3 Complement Rule

If \(A^c\) is the complement of event \(A\), then:

\[ P(A^c)=1-P(A) \]

Example:

If \(P(\text{Rain})=0.30\), then:

\[ P(\text{No rain})=1-0.30=0.70 \]

4.4 2.4 Addition Rule

For two events \(A\) and \(B\):

\[ P(A \cup B)=P(A)+P(B)-P(A\cap B) \]

If \(A\) and \(B\) are mutually exclusive:

\[ P(A \cup B)=P(A)+P(B) \]

4.5 2.5 Conditional Probability

Conditional probability is the probability of one event given another event has occurred.

\[ P(A \mid B)=\frac{P(A\cap B)}{P(B)}, \quad P(B)>0 \]

Example:

\[ P(\text{Pass} \mid \text{Attend class}) \]

means probability of passing given the student attended class.

4.6 2.6 R Example: Conditional Probability

We create a small dataset of students.

students <- data.frame(
  Student_ID = 1:20,
  Attendance = c("High", "High", "Low", "High", "Low", "High", "High", "Low", "High", "Low",
                 "High", "Low", "High", "High", "Low", "High", "Low", "High", "High", "Low"),
  Result = c("Pass", "Pass", "Fail", "Pass", "Fail", "Pass", "Pass", "Fail", "Pass", "Fail",
             "Pass", "Fail", "Pass", "Pass", "Fail", "Pass", "Fail", "Pass", "Pass", "Fail")
)

kable(students)

Student_ID	Attendance	Result
1	High	Pass
2	High	Pass
3	Low	Fail
4	High	Pass
5	Low	Fail
6	High	Pass
7	High	Pass
8	Low	Fail
9	High	Pass
10	Low	Fail
11	High	Pass
12	Low	Fail
13	High	Pass
14	High	Pass
15	Low	Fail
16	High	Pass
17	Low	Fail
18	High	Pass
19	High	Pass
20	Low	Fail

Now calculate:

\[ P(\text{Pass} \mid \text{High Attendance}) \]

high_attendance <- students %>% filter(Attendance == "High")
prob_pass_given_high <- mean(high_attendance$Result == "Pass")
prob_pass_given_high

## [1] 1

Interpretation: Among students with high attendance, the estimated probability of passing is 1.

5 Chapter 3: Bayes’ Theorem

5.1 3.1 Bayes’ Theorem

Bayes’ theorem is the foundation of Bayesian statistics.

\[ P(A \mid B)=\frac{P(B \mid A)P(A)}{P(B)} \]

where:

\(P(A)\) is the prior probability.
\(P(B \mid A)\) is the likelihood.
\(P(B)\) is the evidence or marginal probability.
\(P(A \mid B)\) is the posterior probability.

5.2 3.2 Bayesian Formula for Parameters

In statistics, we usually write Bayes’ theorem as:

\[ p(\theta \mid y)=\frac{p(y \mid \theta)p(\theta)}{p(y)} \]

where:

\(\theta\) is the unknown parameter.
\(y\) is the observed data.
\(p(\theta)\) is the prior.
\(p(y \mid \theta)\) is the likelihood.
\(p(\theta \mid y)\) is the posterior.

Since \(p(y)\) is often a normalizing constant, we commonly write:

\[ p(\theta \mid y) \propto p(y \mid \theta)p(\theta) \]

5.3 3.3 Medical Testing Example

Suppose a disease is rare.

Disease prevalence: \(P(D)=0.01\)
Test sensitivity: \(P(+ \mid D)=0.95\)
False positive rate: \(P(+ \mid D^c)=0.05\)

We want:

\[ P(D \mid +) \]

Using Bayes’ theorem:

\[ P(D \mid +)=\frac{P(+ \mid D)P(D)}{P(+ \mid D)P(D)+P(+ \mid D^c)P(D^c)} \]

5.4 3.4 R Calculation

p_disease <- 0.01
sensitivity <- 0.95
false_positive <- 0.05
p_no_disease <- 1 - p_disease

posterior_disease <- (sensitivity * p_disease) /
  (sensitivity * p_disease + false_positive * p_no_disease)

posterior_disease

## [1] 0.1610169

The probability of actually having the disease after a positive test is approximately:

round(posterior_disease * 100, 2)

## [1] 16.1

percent.

This is much lower than 95% because the disease is rare.

5.5 3.5 Simulation of Medical Testing

set.seed(123)
n <- 100000

disease <- rbinom(n, size = 1, prob = p_disease)

test_positive <- ifelse(
  disease == 1,
  rbinom(n, size = 1, prob = sensitivity),
  rbinom(n, size = 1, prob = false_positive)
)

medical_df <- data.frame(
  disease = factor(disease, labels = c("No disease", "Disease")),
  test = factor(test_positive, labels = c("Negative", "Positive"))
)

posterior_simulated <- mean(medical_df$disease == "Disease" & medical_df$test == "Positive") /
  mean(medical_df$test == "Positive")

posterior_simulated

## [1] 0.1619473

5.6 3.6 Graph: Test Results by Disease Status

ggplot(medical_df, aes(x = test, fill = disease)) +
  geom_bar(position = "fill") +
  labs(
    title = "Medical Test Result by Disease Status",
    x = "Test Result",
    y = "Proportion",
    fill = "True Status"
  ) +
  scale_y_continuous(labels = percent_format()) +
  theme_minimal()

6 Chapter 4: Prior Distributions

6.1 4.1 What is a Prior?

A prior distribution represents what we believe about a parameter before observing the data.

If \(\theta\) is a parameter, the prior is:

\[ p(\theta) \]

Examples:

\[ \theta \sim Uniform(0,1) \]

\[ \theta \sim Beta(\alpha,\beta) \]

\[ \theta \sim Normal(\mu,\sigma^2) \]

6.2 4.2 Uniform Prior

If we know very little about a probability parameter \(p\), we may use:

\[ p \sim Uniform(0,1) \]

This gives equal prior weight to all values between 0 and 1.

p_grid <- seq(0, 1, length.out = 1000)
uniform_df <- data.frame(p = p_grid, density = dunif(p_grid, 0, 1))

ggplot(uniform_df, aes(x = p, y = density)) +
  geom_line(linewidth = 1) +
  labs(title = "Uniform Prior Distribution", x = "p", y = "Density") +
  theme_minimal()

6.3 4.3 Beta Distribution

The Beta distribution is very important in Bayesian statistics because it is used as a prior for probabilities.

\[ p \sim Beta(\alpha,\beta) \]

The density is:

\[ f(p)=\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}p^{\alpha-1}(1-p)^{\beta-1}, \quad 0<p<1 \]

where \(\alpha>0\) and \(\beta>0\).

6.4 4.4 Shapes of Beta Priors

beta_df <- data.frame(p = rep(p_grid, 4),
                      density = c(dbeta(p_grid, 1, 1),
                                  dbeta(p_grid, 2, 2),
                                  dbeta(p_grid, 8, 2),
                                  dbeta(p_grid, 2, 8)),
                      Prior = rep(c("Beta(1,1): Uniform",
                                    "Beta(2,2): Balanced",
                                    "Beta(8,2): High values likely",
                                    "Beta(2,8): Low values likely"), each = length(p_grid)))

ggplot(beta_df, aes(x = p, y = density, linetype = Prior)) +
  geom_line(linewidth = 1) +
  labs(title = "Different Beta Prior Distributions", x = "Probability p", y = "Density") +
  theme_minimal()

6.5 4.5 Interpretation of Beta Parameters

For \(p \sim Beta(\alpha,\beta)\):

\[ E(p)=\frac{\alpha}{\alpha+\beta} \]

\[ Var(p)=\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} \]

beta_summary <- data.frame(
  Prior = c("Beta(1,1)", "Beta(2,2)", "Beta(8,2)", "Beta(2,8)", "Beta(20,20)"),
  alpha = c(1, 2, 8, 2, 20),
  beta = c(1, 2, 2, 8, 20)
)

beta_summary <- beta_summary %>%
  mutate(
    Mean = alpha / (alpha + beta),
    Variance = alpha * beta / ((alpha + beta)^2 * (alpha + beta + 1))
  )

kable(beta_summary, digits = 4)

Prior	alpha	beta	Mean	Variance
Beta(1,1)	1	1	0.5	0.0833
Beta(2,2)	2	2	0.5	0.0500
Beta(8,2)	8	2	0.8	0.0145
Beta(2,8)	2	8	0.2	0.0145
Beta(20,20)	20	20	0.5	0.0061

7 Chapter 5: Likelihood

7.1 5.1 What is Likelihood?

The likelihood measures how well different parameter values explain the observed data.

Suppose:

\[ X \sim Binomial(n,p) \]

The probability mass function is:

\[ P(X=x \mid p)=\binom{n}{x}p^x(1-p)^{n-x} \]

As a likelihood function of \(p\), we write:

\[ L(p) \propto p^x(1-p)^{n-x} \]

7.2 5.2 Coin Toss Example

Suppose we toss a coin 10 times and observe 7 heads.

\[ n=10, \quad x=7 \]

The likelihood is:

\[ L(p) \propto p^7(1-p)^3 \]

x <- 7
n <- 10
p_grid <- seq(0.001, 0.999, length.out = 1000)
likelihood <- dbinom(x, size = n, prob = p_grid)
likelihood_df <- data.frame(p = p_grid, likelihood = likelihood)

ggplot(likelihood_df, aes(x = p, y = likelihood)) +
  geom_line(linewidth = 1) +
  labs(title = "Likelihood Curve for 7 Heads in 10 Tosses", x = "p", y = "Likelihood") +
  theme_minimal()

The likelihood is highest near:

\[ \hat{p}=\frac{x}{n}=\frac{7}{10}=0.70 \]

8 Chapter 6: Posterior Distribution

8.1 6.1 Prior × Likelihood = Posterior

The most important Bayesian relationship is:

\[ p(p \mid x) \propto p(x \mid p)p(p) \]

For the Beta-Binomial model:

Prior:

\[ p \sim Beta(\alpha,\beta) \]

Likelihood:

\[ X \sim Binomial(n,p) \]

Posterior:

\[ p \mid x \sim Beta(\alpha+x,\beta+n-x) \]

8.2 6.2 Example: Beta-Binomial Posterior

Suppose:

\[ p \sim Beta(2,2) \]

and we observe:

\[ x=7, \quad n=10 \]

Then:

\[ p \mid x \sim Beta(2+7,2+10-7)=Beta(9,5) \]

alpha_prior <- 2
beta_prior <- 2
x <- 7
n <- 10

alpha_post <- alpha_prior + x
beta_post <- beta_prior + n - x

posterior_df <- data.frame(
  p = rep(p_grid, 2),
  density = c(dbeta(p_grid, alpha_prior, beta_prior),
              dbeta(p_grid, alpha_post, beta_post)),
  Distribution = rep(c("Prior: Beta(2,2)", "Posterior: Beta(9,5)"), each = length(p_grid))
)

ggplot(posterior_df, aes(x = p, y = density, linetype = Distribution)) +
  geom_line(linewidth = 1) +
  labs(title = "Prior and Posterior Distribution", x = "p", y = "Density") +
  theme_minimal()

8.3 6.3 Prior, Likelihood, and Posterior Together

For visualization, we rescale likelihood to compare shapes.

prior_density <- dbeta(p_grid, alpha_prior, beta_prior)
likelihood_scaled <- likelihood / max(likelihood) * max(prior_density)
posterior_density <- dbeta(p_grid, alpha_post, beta_post)

plp_df <- data.frame(
  p = rep(p_grid, 3),
  value = c(prior_density, likelihood_scaled, posterior_density),
  Component = rep(c("Prior", "Likelihood (scaled)", "Posterior"), each = length(p_grid))
)

ggplot(plp_df, aes(x = p, y = value, linetype = Component)) +
  geom_line(linewidth = 1) +
  labs(title = "Prior, Likelihood and Posterior", x = "p", y = "Value") +
  theme_minimal()

9 Chapter 7: Bayesian Estimation

9.1 7.1 Posterior Mean

For:

\[ p \mid x \sim Beta(\alpha+x,\beta+n-x) \]

The posterior mean is:

\[ E(p \mid x)=\frac{\alpha+x}{\alpha+\beta+n} \]

9.2 7.2 Posterior Variance

\[ Var(p \mid x)=\frac{(\alpha+x)(\beta+n-x)}{(\alpha+\beta+n)^2(\alpha+\beta+n+1)} \]

9.3 7.3 Posterior Mode

If \(\alpha+x>1\) and \(\beta+n-x>1\), the posterior mode is:

\[ \frac{\alpha+x-1}{\alpha+\beta+n-2} \]

9.4 7.4 R Posterior Summary

posterior_mean <- alpha_post / (alpha_post + beta_post)
posterior_variance <- alpha_post * beta_post / ((alpha_post + beta_post)^2 * (alpha_post + beta_post + 1))
posterior_sd <- sqrt(posterior_variance)
posterior_median <- qbeta(0.5, alpha_post, beta_post)
posterior_mode <- (alpha_post - 1) / (alpha_post + beta_post - 2)
credible_interval <- qbeta(c(0.025, 0.975), alpha_post, beta_post)

summary_table <- data.frame(
  Statistic = c("Posterior Mean", "Posterior Median", "Posterior Mode", "Posterior SD", "2.5%", "97.5%"),
  Value = c(posterior_mean, posterior_median, posterior_mode, posterior_sd, credible_interval[1], credible_interval[2])
)

kable(summary_table, digits = 4)

Statistic	Value
Posterior Mean	0.6429
Posterior Median	0.6498
Posterior Mode	0.6667
Posterior SD	0.1237
2.5%	0.3857
97.5%	0.8614

10 Chapter 8: Credible Intervals

10.1 8.1 What is a Credible Interval?

A 95% credible interval is an interval \([a,b]\) such that:

\[ P(a < p < b \mid x)=0.95 \]

This means that, given the data and prior, there is 95% posterior probability that \(p\) lies between \(a\) and \(b\).

10.2 8.2 R Calculation

ci_lower <- credible_interval[1]
ci_upper <- credible_interval[2]

credible_df <- data.frame(p = p_grid, density = dbeta(p_grid, alpha_post, beta_post))
credible_df$inside <- credible_df$p >= ci_lower & credible_df$p <= ci_upper

ggplot(credible_df, aes(x = p, y = density)) +
  geom_line(linewidth = 1) +
  geom_area(data = subset(credible_df, inside), aes(x = p, y = density), alpha = 0.4) +
  geom_vline(xintercept = c(ci_lower, ci_upper), linetype = "dashed") +
  labs(title = "95% Bayesian Credible Interval", x = "p", y = "Posterior Density") +
  theme_minimal()

10.3 8.3 Credible Interval vs Confidence Interval

A Bayesian credible interval allows a direct probability statement about the parameter:

\[ P(a<p<b \mid x)=0.95 \]

A frequentist confidence interval does not say there is 95% probability that the fixed parameter lies in a specific observed interval. Instead, it describes the long-run performance of the procedure.

11 Chapter 9: Posterior Simulation

11.1 9.1 Why Simulate?

Many Bayesian summaries can be calculated by drawing random samples from the posterior distribution.

If:

\[ p \mid x \sim Beta(9,5) \]

we can simulate:

\[ p^{(1)},p^{(2)},...,p^{(S)} \sim Beta(9,5) \]

11.2 9.2 R Posterior Simulation

set.seed(123)
S <- 10000
posterior_samples <- rbeta(S, alpha_post, beta_post)

simulation_summary <- data.frame(
  Statistic = c("Mean", "Median", "SD", "2.5%", "97.5%"),
  Value = c(mean(posterior_samples), median(posterior_samples), sd(posterior_samples),
            quantile(posterior_samples, 0.025), quantile(posterior_samples, 0.975))
)

kable(simulation_summary, digits = 4)

Statistic	Value
Mean	0.6429
Median	0.6505
SD	0.1244
2.5%	0.3869
97.5%	0.8649

11.3 9.3 Histogram of Posterior Samples

posterior_samples_df <- data.frame(p = posterior_samples)

ggplot(posterior_samples_df, aes(x = p)) +
  geom_histogram(bins = 40, alpha = 0.7) +
  geom_vline(xintercept = mean(posterior_samples), linetype = "dashed") +
  labs(title = "Posterior Samples from Beta(9,5)", x = "p", y = "Frequency") +
  theme_minimal()

12 Chapter 10: Posterior Predictive Distribution

12.1 10.1 Prediction in Bayesian Statistics

Bayesian statistics naturally supports prediction. The posterior predictive distribution is:

\[ p(y_{new}\mid y)=\int p(y_{new}\mid \theta)p(\theta\mid y)d\theta \]

This combines parameter uncertainty and future observation uncertainty.

12.2 10.2 Predicting the Next Coin Toss

If \(p\) is the probability of heads, then:

\[ Y_{new}\mid p \sim Bernoulli(p) \]

We simulate:

Draw \(p\) from posterior.
Draw future outcome from Bernoulli\((p)\).

set.seed(123)
future_heads <- rbinom(S, size = 1, prob = posterior_samples)
mean_future_heads <- mean(future_heads)
mean_future_heads

## [1] 0.647

The posterior predictive probability of heads in the next toss is approximately 0.647.

12.3 10.3 Predicting Future Number of Heads

Suppose we want to predict the number of heads in 20 future tosses.

set.seed(123)
future_20_heads <- rbinom(S, size = 20, prob = posterior_samples)
future_df <- data.frame(heads = future_20_heads)

ggplot(future_df, aes(x = heads)) +
  geom_bar(alpha = 0.7) +
  labs(title = "Posterior Predictive Distribution: Heads in 20 Future Tosses", x = "Number of Heads", y = "Frequency") +
  theme_minimal()

future_summary <- data.frame(
  Statistic = c("Mean predicted heads", "Median predicted heads", "2.5%", "97.5%"),
  Value = c(mean(future_20_heads), median(future_20_heads),
            quantile(future_20_heads, 0.025), quantile(future_20_heads, 0.975))
)

kable(future_summary, digits = 2)

Statistic	Value
Mean predicted heads	12.87
Median predicted heads	13.00
2.5%	6.00
97.5%	19.00

13 Chapter 11: Prior Sensitivity Analysis

13.1 11.1 Why Prior Sensitivity Matters

Bayesian results depend on both the prior and the data. It is good practice to check whether conclusions change under different reasonable priors.

13.2 11.2 Comparing Different Priors

Suppose the observed data remain:

\[ x=7, \quad n=10 \]

We compare four priors:

\[ Beta(1,1), Beta(2,2), Beta(10,10), Beta(20,5) \]

priors <- data.frame(
  Prior = c("Beta(1,1)", "Beta(2,2)", "Beta(10,10)", "Beta(20,5)"),
  alpha = c(1, 2, 10, 20),
  beta = c(1, 2, 10, 5)
)

sensitivity_results <- priors %>%
  mutate(
    alpha_post = alpha + x,
    beta_post = beta + n - x,
    posterior_mean = alpha_post / (alpha_post + beta_post),
    lower_95 = qbeta(0.025, alpha_post, beta_post),
    upper_95 = qbeta(0.975, alpha_post, beta_post)
  )

kable(sensitivity_results, digits = 4)

Prior	alpha	beta	alpha_post	beta_post	posterior_mean	lower_95	upper_95
Beta(1,1)	1	1	8	4	0.6667	0.3903	0.8907
Beta(2,2)	2	2	9	5	0.6429	0.3857	0.8614
Beta(10,10)	10	10	17	13	0.5667	0.3894	0.7355
Beta(20,5)	20	5	27	8	0.7714	0.6210	0.8925

13.3 11.3 Graph of Posterior Under Different Priors

sens_plot_df <- data.frame()

for (i in 1:nrow(priors)) {
  temp <- data.frame(
    p = p_grid,
    density = dbeta(p_grid, priors$alpha[i] + x, priors$beta[i] + n - x),
    Prior = priors$Prior[i]
  )
  sens_plot_df <- rbind(sens_plot_df, temp)
}

ggplot(sens_plot_df, aes(x = p, y = density, linetype = Prior)) +
  geom_line(linewidth = 1) +
  labs(title = "Posterior Distributions Under Different Priors", x = "p", y = "Density") +
  theme_minimal()

14 Chapter 12: Bayesian Estimation of a Normal Mean

14.1 12.1 Problem Setting

Suppose we observe continuous data:

\[ y_1,y_2,...,y_n \]

Assume:

\[ y_i \sim Normal(\mu,\sigma^2) \]

For simplicity, assume \(\sigma^2\) is known.

We place a Normal prior on \(\mu\):

\[ \mu \sim Normal(\mu_0,\tau_0^2) \]

14.2 12.2 Posterior Distribution

The posterior is also Normal:

\[ \mu \mid y \sim Normal(\mu_n,\tau_n^2) \]

where:

\[ \tau_n^2=\left(\frac{n}{\sigma^2}+\frac{1}{\tau_0^2}\right)^{-1} \]

\[ \mu_n=\tau_n^2\left(\frac{n\bar{y}}{\sigma^2}+\frac{\mu_0}{\tau_0^2}\right) \]

14.3 12.3 Simulated Dataset

set.seed(123)
n_normal <- 50
true_mu <- 70
sigma_known <- 10

y <- rnorm(n_normal, mean = true_mu, sd = sigma_known)
normal_df <- data.frame(score = y)

kable(head(normal_df, 10), digits = 2)

score
64.40
67.70
85.59
70.71
71.29
87.15
74.61
57.35
63.13
65.54

14.4 12.4 Data Visualization

ggplot(normal_df, aes(x = score)) +
  geom_histogram(bins = 15, alpha = 0.7) +
  geom_vline(xintercept = mean(y), linetype = "dashed") +
  labs(title = "Simulated Student Scores", x = "Score", y = "Frequency") +
  theme_minimal()

14.5 12.5 Bayesian Calculation in R

mu0 <- 65
tau0 <- 15

sample_mean <- mean(y)
posterior_variance_mu <- 1 / (n_normal / sigma_known^2 + 1 / tau0^2)
posterior_mean_mu <- posterior_variance_mu * (n_normal * sample_mean / sigma_known^2 + mu0 / tau0^2)
posterior_sd_mu <- sqrt(posterior_variance_mu)

normal_posterior_summary <- data.frame(
  Statistic = c("Sample mean", "Prior mean", "Posterior mean", "Posterior SD", "95% lower", "95% upper"),
  Value = c(sample_mean, mu0, posterior_mean_mu, posterior_sd_mu,
            posterior_mean_mu - 1.96 * posterior_sd_mu,
            posterior_mean_mu + 1.96 * posterior_sd_mu)
)

kable(normal_posterior_summary, digits = 3)

Statistic	Value
Sample mean	70.344
Prior mean	65.000
Posterior mean	70.297
Posterior SD	1.408
95% lower	67.537
95% upper	73.057

14.6 12.6 Prior and Posterior Graph

mu_grid <- seq(55, 85, length.out = 1000)
normal_pp_df <- data.frame(
  mu = rep(mu_grid, 2),
  density = c(dnorm(mu_grid, mean = mu0, sd = tau0),
              dnorm(mu_grid, mean = posterior_mean_mu, sd = posterior_sd_mu)),
  Distribution = rep(c("Prior", "Posterior"), each = length(mu_grid))
)

ggplot(normal_pp_df, aes(x = mu, y = density, linetype = Distribution)) +
  geom_line(linewidth = 1) +
  labs(title = "Prior and Posterior for Normal Mean", x = expression(mu), y = "Density") +
  theme_minimal()

15 Chapter 13: Bayesian Linear Regression

15.1 13.1 Regression Model

In linear regression, we model a continuous outcome \(Y\) using predictor \(X\):

\[ Y_i=\beta_0+\beta_1X_i+\epsilon_i \]

where:

\[ \epsilon_i \sim Normal(0,\sigma^2) \]

In Bayesian regression, we assign priors:

\[ \beta_0 \sim Normal(0,100^2) \]

\[ \beta_1 \sim Normal(0,100^2) \]

\[ \sigma \sim HalfNormal(0,10) \]

15.2 13.2 Simulated Regression Dataset

set.seed(123)
n_reg <- 150
study_hours <- runif(n_reg, min = 0, max = 10)
exam_score <- 50 + 4 * study_hours + rnorm(n_reg, mean = 0, sd = 8)

reg_df <- data.frame(
  study_hours = study_hours,
  exam_score = exam_score
)

kable(head(reg_df, 10), digits = 2)

study_hours	exam_score
2.88	69.71
7.88	79.25
4.09	56.59
8.83	86.77
9.40	86.51
0.46	51.87
5.28	74.21
8.92	82.73
5.51	77.21
4.57	66.50

15.3 13.3 Scatterplot

ggplot(reg_df, aes(x = study_hours, y = exam_score)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", se = TRUE) +
  labs(title = "Relationship Between Study Hours and Exam Score", x = "Study Hours", y = "Exam Score") +
  theme_minimal()

15.4 13.4 Approximate Bayesian Regression by Simulation

For beginner-level teaching, we use a normal approximation around the classical regression estimates. This is not full MCMC, but it gives intuition about posterior uncertainty.

lm_fit <- lm(exam_score ~ study_hours, data = reg_df)
summary(lm_fit)

## 
## Call:
## lm(formula = exam_score ~ study_hours, data = reg_df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -16.2890  -4.8644  -0.9938   4.7232  25.7358 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  49.6221     1.2653   39.22   <2e-16 ***
## study_hours   4.0644     0.2183   18.62   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.65 on 148 degrees of freedom
## Multiple R-squared:  0.7009, Adjusted R-squared:  0.6989 
## F-statistic: 346.8 on 1 and 148 DF,  p-value: < 2.2e-16

set.seed(123)
coef_est <- coef(lm_fit)
vcov_est <- vcov(lm_fit)

# Simulate approximate posterior samples using multivariate normal logic
# Cholesky decomposition without extra packages
S <- 5000
z <- matrix(rnorm(S * 2), ncol = 2)
L <- chol(vcov_est)
beta_samples <- sweep(z %*% L, 2, coef_est, "+")

beta_df <- data.frame(
  beta0 = beta_samples[, 1],
  beta1 = beta_samples[, 2]
)

beta_summary <- data.frame(
  Parameter = c("Intercept", "Slope"),
  Mean = c(mean(beta_df$beta0), mean(beta_df$beta1)),
  Lower_95 = c(quantile(beta_df$beta0, 0.025), quantile(beta_df$beta1, 0.025)),
  Upper_95 = c(quantile(beta_df$beta0, 0.975), quantile(beta_df$beta1, 0.975))
)

kable(beta_summary, digits = 3)

Parameter	Mean	Lower_95	Upper_95
Intercept	49.621	47.113	52.096
Slope	4.064	3.637	4.492

15.5 13.5 Posterior Distribution of Slope

ggplot(beta_df, aes(x = beta1)) +
  geom_histogram(bins = 40, alpha = 0.7) +
  geom_vline(xintercept = mean(beta_df$beta1), linetype = "dashed") +
  labs(title = "Approximate Posterior Distribution of Regression Slope", x = expression(beta[1]), y = "Frequency") +
  theme_minimal()

15.6 13.6 Posterior Prediction Lines

set.seed(123)
selected <- sample(1:S, 100)
x_seq <- seq(0, 10, length.out = 100)
line_df <- data.frame()

for (i in selected) {
  temp <- data.frame(
    study_hours = x_seq,
    exam_score = beta_df$beta0[i] + beta_df$beta1[i] * x_seq,
    draw = i
  )
  line_df <- rbind(line_df, temp)
}

ggplot(reg_df, aes(x = study_hours, y = exam_score)) +
  geom_point(alpha = 0.5) +
  geom_line(data = line_df, aes(x = study_hours, y = exam_score, group = draw), alpha = 0.15) +
  labs(title = "Posterior Regression Lines", x = "Study Hours", y = "Exam Score") +
  theme_minimal()

16 Chapter 14: Introduction to MCMC

16.1 14.1 Why Do We Need MCMC?

In simple Bayesian models, we can calculate the posterior directly. But in realistic models, the posterior may be too complicated.

Bayesian posterior:

\[ p(\theta\mid y)=\frac{p(y\mid \theta)p(\theta)}{p(y)} \]

The denominator is:

\[ p(y)=\int p(y\mid \theta)p(\theta)d\theta \]

This integral can be difficult or impossible to solve analytically.

MCMC methods allow us to generate samples from the posterior distribution without solving the integral directly.

16.2 14.2 Basic Idea of MCMC

MCMC creates a chain:

\[ \theta^{(1)},\theta^{(2)},... ,\theta^{(S)} \]

After convergence, these values behave like samples from the posterior distribution.

16.3 14.3 Simple Metropolis Algorithm Example

We estimate the posterior of \(p\) for a coin toss model using a random walk proposal.

log_posterior <- function(p, x, n, alpha, beta) {
  if (p <= 0 || p >= 1) return(-Inf)
  dbinom(x, size = n, prob = p, log = TRUE) + dbeta(p, alpha, beta, log = TRUE)
}

metropolis_coin <- function(iter = 10000, start = 0.5, proposal_sd = 0.05,
                            x = 7, n = 10, alpha = 2, beta = 2) {
  chain <- numeric(iter)
  chain[1] <- start
  accepted <- 0
  
  for (i in 2:iter) {
    current <- chain[i - 1]
    proposal <- rnorm(1, mean = current, sd = proposal_sd)
    
    log_ratio <- log_posterior(proposal, x, n, alpha, beta) -
      log_posterior(current, x, n, alpha, beta)
    
    if (log(runif(1)) < log_ratio) {
      chain[i] <- proposal
      accepted <- accepted + 1
    } else {
      chain[i] <- current
    }
  }
  
  list(chain = chain, acceptance_rate = accepted / (iter - 1))
}

set.seed(123)
mcmc_result <- metropolis_coin()
mcmc_result$acceptance_rate

## [1] 0.8736874

16.4 14.4 Trace Plot

mcmc_df <- data.frame(iteration = 1:length(mcmc_result$chain), p = mcmc_result$chain)

ggplot(mcmc_df, aes(x = iteration, y = p)) +
  geom_line(alpha = 0.8) +
  labs(title = "MCMC Trace Plot", x = "Iteration", y = "p") +
  theme_minimal()

16.5 14.5 MCMC Posterior Histogram

burnin <- 1000
mcmc_samples <- mcmc_result$chain[-(1:burnin)]

mcmc_samples_df <- data.frame(p = mcmc_samples)

ggplot(mcmc_samples_df, aes(x = p)) +
  geom_histogram(bins = 40, alpha = 0.7) +
  stat_function(fun = dbeta, args = list(shape1 = alpha_post, shape2 = beta_post), linewidth = 1) +
  labs(title = "MCMC Samples Compared with Exact Posterior", x = "p", y = "Frequency / Density") +
  theme_minimal()

17 Chapter 15: Bayesian Workflow

17.1 15.1 Full Bayesian Workflow

A standard Bayesian analysis usually follows these steps:

Define the research question.
Choose a probability model for the data.
Choose prior distributions.
Compute or simulate the posterior.
Summarize posterior results.
Check model fit.
Conduct prior sensitivity analysis.
Make predictions.
Communicate uncertainty clearly.

17.2 15.2 Reporting Bayesian Results

A Bayesian result should report:

Prior distribution.
Likelihood/model.
Posterior mean or median.
95% credible interval.
Posterior probability of meaningful hypotheses.
Sensitivity analysis.
Predictive checks where possible.

Example statement:

“Using a Beta(2,2) prior and observing 7 successes out of 10 trials, the posterior distribution was Beta(9,5). The posterior mean was 0.643, and the 95% credible interval was approximately [0.39, 0.86].”

18 Chapter 16: Complete Applied Case Study

18.1 16.1 Case Study: Student Success Probability

Suppose a lecturer wants to estimate the probability that students pass a statistics course after using a new interactive teaching method.

Observed data:

Number of students: \(n=80\)
Number passed: \(x=62\)

We estimate the pass probability \(p\).

18.2 16.2 Model

Likelihood:

\[ X \sim Binomial(n,p) \]

Prior:

\[ p \sim Beta(5,5) \]

Posterior:

\[ p \mid x \sim Beta(5+62,5+80-62)=Beta(67,23) \]

18.3 16.3 R Analysis

n_course <- 80
x_pass <- 62
alpha_course <- 5
beta_course <- 5

alpha_course_post <- alpha_course + x_pass
beta_course_post <- beta_course + n_course - x_pass

course_summary <- data.frame(
  Quantity = c("Observed pass rate", "Posterior mean", "Posterior median", "95% lower", "95% upper"),
  Value = c(x_pass / n_course,
            alpha_course_post / (alpha_course_post + beta_course_post),
            qbeta(0.5, alpha_course_post, beta_course_post),
            qbeta(0.025, alpha_course_post, beta_course_post),
            qbeta(0.975, alpha_course_post, beta_course_post))
)

kable(course_summary, digits = 4)

Quantity	Value
Observed pass rate	0.7750
Posterior mean	0.7444
Posterior median	0.7463
95% lower	0.6500
95% upper	0.8286

18.4 16.4 Probability That Pass Rate Exceeds 70%

prob_above_70 <- 1 - pbeta(0.70, alpha_course_post, beta_course_post)
prob_above_70

## [1] 0.8341849

Interpretation: The posterior probability that the true pass probability exceeds 70% is 0.834.

18.5 16.5 Graph

course_df <- data.frame(
  p = p_grid,
  density = dbeta(p_grid, alpha_course_post, beta_course_post)
)

ggplot(course_df, aes(x = p, y = density)) +
  geom_line(linewidth = 1) +
  geom_vline(xintercept = 0.70, linetype = "dashed") +
  geom_area(data = subset(course_df, p > 0.70), aes(x = p, y = density), alpha = 0.4) +
  labs(title = "Posterior Probability That Pass Rate Exceeds 70%", x = "Pass Probability", y = "Posterior Density") +
  theme_minimal()

19 Chapter 17: Common Bayesian Terms

Term	Meaning
Prior	Belief before observing data
Likelihood	Information from data
Posterior	Updated belief after observing data
Evidence	Marginal probability of data
Credible interval	Bayesian uncertainty interval
Posterior mean	Average of posterior distribution
Posterior predictive	Distribution for future data
MCMC	Simulation method for posterior sampling
Conjugate prior	Prior producing posterior in same family

20 Chapter 18: Exercises

20.1 Exercise 1

A coin is tossed 20 times and gives 14 heads. Use a \(Beta(1,1)\) prior.

Write the likelihood.
Derive the posterior distribution.
Compute posterior mean.
Compute 95% credible interval using R.
Plot the posterior distribution.

20.2 Exercise 2

Repeat Exercise 1 using \(Beta(10,10)\). Compare the posterior with Exercise 1.

20.3 Exercise 3

A disease affects 2% of a population. A test has sensitivity 90% and false positive rate 8%.

Calculate:

\[ P(Disease \mid Positive) \]

using R.

20.4 Exercise 4

Simulate a dataset of 100 students where the true pass probability is 0.75. Use a \(Beta(2,2)\) prior and estimate the posterior distribution.

20.5 Exercise 5

Generate a regression dataset where:

\[ Y=10+3X+\epsilon \]

where \(\epsilon \sim Normal(0,4^2)\). Fit a linear regression and simulate approximate posterior samples for the slope.

21 Chapter 19: Solutions to Selected Exercises

21.1 Solution to Exercise 1

x_ex1 <- 14
n_ex1 <- 20
alpha_ex1 <- 1
beta_ex1 <- 1

alpha_ex1_post <- alpha_ex1 + x_ex1
beta_ex1_post <- beta_ex1 + n_ex1 - x_ex1

ex1_summary <- data.frame(
  Statistic = c("Posterior distribution", "Posterior mean", "2.5%", "97.5%"),
  Value = c(paste0("Beta(", alpha_ex1_post, ",", beta_ex1_post, ")"),
            round(alpha_ex1_post / (alpha_ex1_post + beta_ex1_post), 4),
            round(qbeta(0.025, alpha_ex1_post, beta_ex1_post), 4),
            round(qbeta(0.975, alpha_ex1_post, beta_ex1_post), 4))
)

kable(ex1_summary)

Statistic	Value
Posterior distribution	Beta(15,7)
Posterior mean	0.6818
2.5%	0.4782
97.5%	0.8541

ex1_df <- data.frame(
  p = p_grid,
  density = dbeta(p_grid, alpha_ex1_post, beta_ex1_post)
)

ggplot(ex1_df, aes(x = p, y = density)) +
  geom_line(linewidth = 1) +
  labs(title = "Exercise 1 Posterior Distribution", x = "p", y = "Density") +
  theme_minimal()

22 Appendix A: Formula Sheet

22.1 Bayes’ Theorem

\[ P(A \mid B)=\frac{P(B \mid A)P(A)}{P(B)} \]

22.2 Bayesian Parameter Form

\[ p(\theta \mid y)=\frac{p(y \mid \theta)p(\theta)}{p(y)} \]

22.3 Proportional Posterior

\[ p(\theta \mid y) \propto p(y \mid \theta)p(\theta) \]

22.4 Binomial Likelihood

\[ P(X=x \mid p)=\binom{n}{x}p^x(1-p)^{n-x} \]

22.5 Beta Prior

\[ p \sim Beta(\alpha,\beta) \]

22.6 Beta-Binomial Posterior

\[ p \mid x \sim Beta(\alpha+x,\beta+n-x) \]

22.7 Beta Mean

\[ E(p)=\frac{\alpha}{\alpha+\beta} \]

22.8 Beta Variance

\[ Var(p)=\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} \]

22.9 Posterior Predictive Distribution

\[ p(y_{new}\mid y)=\int p(y_{new}\mid \theta)p(\theta\mid y)d\theta \]

23 Appendix B: Recommended R Functions

Function	Purpose
`dbeta()`	Beta density
`pbeta()`	Beta cumulative probability
`qbeta()`	Beta quantile
`rbeta()`	Random Beta samples
`dbinom()`	Binomial probability
`rbinom()`	Simulate Binomial data
`dnorm()`	Normal density
`rnorm()`	Simulate Normal data
`lm()`	Linear regression
`ggplot()`	Graphing

24 Appendix C: Final Teaching Message

Bayesian statistics is not only a collection of formulas. It is a way of learning from evidence. The central idea is simple:

\[ \text{New belief} = \text{Old belief} + \text{Data evidence} \]

In mathematical form:

\[ \text{Posterior} \propto \text{Likelihood} \times \text{Prior} \]

Once this idea is clear, students can gradually move toward advanced topics such as hierarchical models, MCMC, Bayesian regression, Bayesian disease modelling, Bayesian machine learning, and decision theory.

Bayesian Statistics Using R: A Beginner to Intermediate notes

Dr Hom Nath Dhungana, Dr Eric Howard, Dr Radhika Kuttala

2026-06-04

1 Preface

2 Learning Outcomes

3 Chapter 1: Why Bayesian Statistics?

3.1 1.1 Motivation

3.2 1.2 Frequentist and Bayesian Views

3.3 1.3 A Simple Everyday Example

3.4 1.4 Bayesian Learning

4 Chapter 2: Probability Refresher

4.1 2.1 What is Probability?

4.2 2.2 Sample Space

4.3 2.3 Complement Rule

4.4 2.4 Addition Rule

4.5 2.5 Conditional Probability

4.6 2.6 R Example: Conditional Probability

5 Chapter 3: Bayes’ Theorem

5.1 3.1 Bayes’ Theorem

5.2 3.2 Bayesian Formula for Parameters

5.3 3.3 Medical Testing Example

5.4 3.4 R Calculation

5.5 3.5 Simulation of Medical Testing

5.6 3.6 Graph: Test Results by Disease Status

6 Chapter 4: Prior Distributions

6.1 4.1 What is a Prior?

6.2 4.2 Uniform Prior

6.3 4.3 Beta Distribution

6.4 4.4 Shapes of Beta Priors

6.5 4.5 Interpretation of Beta Parameters

7 Chapter 5: Likelihood

7.1 5.1 What is Likelihood?

7.2 5.2 Coin Toss Example

8 Chapter 6: Posterior Distribution

8.1 6.1 Prior × Likelihood = Posterior

8.2 6.2 Example: Beta-Binomial Posterior

8.3 6.3 Prior, Likelihood, and Posterior Together

9 Chapter 7: Bayesian Estimation

9.1 7.1 Posterior Mean

9.2 7.2 Posterior Variance

9.3 7.3 Posterior Mode

9.4 7.4 R Posterior Summary

10 Chapter 8: Credible Intervals

10.1 8.1 What is a Credible Interval?

10.2 8.2 R Calculation

10.3 8.3 Credible Interval vs Confidence Interval

11 Chapter 9: Posterior Simulation

11.1 9.1 Why Simulate?

11.2 9.2 R Posterior Simulation

11.3 9.3 Histogram of Posterior Samples

12 Chapter 10: Posterior Predictive Distribution

12.1 10.1 Prediction in Bayesian Statistics

12.2 10.2 Predicting the Next Coin Toss

12.3 10.3 Predicting Future Number of Heads

13 Chapter 11: Prior Sensitivity Analysis

13.1 11.1 Why Prior Sensitivity Matters

13.2 11.2 Comparing Different Priors

13.3 11.3 Graph of Posterior Under Different Priors

14 Chapter 12: Bayesian Estimation of a Normal Mean

14.1 12.1 Problem Setting

14.2 12.2 Posterior Distribution

14.3 12.3 Simulated Dataset

14.4 12.4 Data Visualization

14.5 12.5 Bayesian Calculation in R

14.6 12.6 Prior and Posterior Graph

15 Chapter 13: Bayesian Linear Regression

15.1 13.1 Regression Model

15.2 13.2 Simulated Regression Dataset

15.3 13.3 Scatterplot

15.4 13.4 Approximate Bayesian Regression by Simulation

15.5 13.5 Posterior Distribution of Slope

15.6 13.6 Posterior Prediction Lines

16 Chapter 14: Introduction to MCMC

16.1 14.1 Why Do We Need MCMC?

16.2 14.2 Basic Idea of MCMC

16.3 14.3 Simple Metropolis Algorithm Example

16.4 14.4 Trace Plot

16.5 14.5 MCMC Posterior Histogram

17 Chapter 15: Bayesian Workflow

17.1 15.1 Full Bayesian Workflow