Bayesian Inference for Midge Wing Length

STA 145 – Bayesian Statistical Inference
Brandon Black and Nicholas Hamler

Goal: Estimate the population mean \(\theta\) and variance \(\sigma^2\) using Bayesian methods.

Method:

  • Normal likelihood
  • Conjugate priors
  • Gibbs sampling

Data and Likelihood

Wing length data (mm):

## [1] 1.64 1.70 1.72 1.74 1.82 1.82 1.82 1.90 2.08

Sample size: \(n = 9\)

Sample mean: \(\bar{y} = 1.804\) mm

Likelihood model:

\[ Y_i \mid \theta, \sigma^2 \sim N(\theta,\sigma^2) \]

Goal: infer \((\theta,\sigma^2)\mid Y\).
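A minimal sketch of the summary statistics and the normal log-likelihood, using only the Python standard library. The function name `log_likelihood` is illustrative and not from the original analysis.

```python
import math

# Wing lengths in mm, copied from the data listing above.
y = [1.64, 1.70, 1.72, 1.74, 1.82, 1.82, 1.82, 1.90, 2.08]

n = len(y)
ybar = sum(y) / n

def log_likelihood(theta, sigma2):
    """Log of prod_i N(y_i | theta, sigma2) for the wing-length data."""
    sse = sum((yi - theta) ** 2 for yi in y)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - sse / (2 * sigma2)

print(n, round(ybar, 3))
```

For any fixed \(\sigma^2\), the log-likelihood is maximized at \(\theta = \bar{y}\), which is why the sample mean enters the full conditional for \(\theta\).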

Prior Specification

Prior mean from biological studies:

\[ \mu_0 = 1.9 \]

Prior on the mean:

\[ \theta \sim N(\mu_0,\tau_0^2) \]

with

\[ \tau_0 = 0.95 \]

Precision parameter:

\[ \tilde\sigma^2 = 1/\sigma^2 \]

\[ \tilde\sigma^2 \sim Gamma\!\left(\nu_0/2,(\nu_0/2)\sigma_0^2\right) \]

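where \(\sigma_0^2 = 0.01\) is the prior guess for the variance, \(\nu_0\) acts as a prior sample size, and the Gamma distribution is in its shape-rate form.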
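As a rough check on what these prior choices imply, the prior mean of the precision and a central 95% prior interval for \(\theta\) can be worked out directly. This sketch assumes the shape-rate form of the Gamma distribution and leaves \(\nu_0\) symbolic, since its value is not given here.

\[ E(\tilde\sigma^2) = \frac{\nu_0/2}{(\nu_0/2)\,\sigma_0^2} = \frac{1}{\sigma_0^2} = 100 \]

\[ \mu_0 \pm 1.96\,\tau_0 = 1.9 \pm 1.862 = (0.038,\; 3.762) \]

So the prior on \(\theta\) is wide relative to the observed range (1.64 to 2.08 mm), while \(\sigma_0^2 = 0.01\) centers the prior precision at 100 regardless of \(\nu_0\).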

Full Conditional Distributions

The conditional posterior for the mean is

\[ \theta \mid \sigma^2, y \sim N(\mu_n,\tau_n^2) \]

where

\[ \tau_n^2 = \left( \frac{n}{\sigma^2} + \frac{1}{\tau_0^2} \right)^{-1} \]

\[ \mu_n = \tau_n^2 \left( \frac{n\bar y}{\sigma^2} + \frac{\mu_0}{\tau_0^2} \right) \]
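The two formulas above can be sketched as a small function. The name `theta_conditional` and the value \(\sigma^2 = 0.02\) used in the call are illustrative assumptions, not values from the analysis.

```python
def theta_conditional(sigma2, n=9, ybar=1.804, mu0=1.9, tau0=0.95):
    """Return (mu_n, tau_n^2) of the normal full conditional for theta."""
    data_precision = n / sigma2          # n / sigma^2
    prior_precision = 1.0 / tau0 ** 2    # 1 / tau_0^2
    tau_n2 = 1.0 / (data_precision + prior_precision)
    mu_n = tau_n2 * (data_precision * ybar + prior_precision * mu0)
    return mu_n, tau_n2

# sigma2 = 0.02 is a HYPOTHETICAL value chosen only to exercise the formula.
mu_n, tau_n2 = theta_conditional(sigma2=0.02)
print(mu_n, tau_n2)
```

Because the data precision \(n/\sigma^2\) is far larger than the prior precision \(1/\tau_0^2\) here, \(\mu_n\) lands very close to \(\bar{y}\) and only slightly toward \(\mu_0\).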

The conditional posterior for precision is

\[ \tilde\sigma^2 \mid \theta,y \sim Gamma\left( \frac{n+\nu_0}{2}, \frac{\sum (y_i-\theta)^2 + \nu_0\sigma_0^2}{2} \right) \]
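A hedged sketch of one draw from this full conditional. Python's `random.gammavariate` takes a shape and a scale, so the rate above is inverted. The value \(\nu_0 = 1\) is an assumption made for illustration, since \(\nu_0\) is not stated here, and `draw_precision` is a hypothetical helper name.

```python
import random

y = [1.64, 1.70, 1.72, 1.74, 1.82, 1.82, 1.82, 1.90, 2.08]

def draw_precision(theta, rng, nu0=1.0, sigma0_sq=0.01):
    """Draw the precision 1/sigma^2 given theta (nu0=1 is an ASSUMED value)."""
    shape = (len(y) + nu0) / 2
    rate = (sum((yi - theta) ** 2 for yi in y) + nu0 * sigma0_sq) / 2
    # gammavariate(alpha, beta) uses beta as a SCALE, so pass 1 / rate.
    return rng.gammavariate(shape, 1.0 / rate)

rng = random.Random(0)
precision = draw_precision(theta=1.8, rng=rng)
sigma2 = 1.0 / precision  # convert the draw back to a variance
```

Passing the rate where a scale is expected is a common source of silently wrong variance draws, so the conversion is worth checking against the mean `shape / rate`.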

Bayesian Models

Case 1

Variance prior uses fixed hyperparameter:

\[ \tilde\sigma^2 \sim Gamma\!\left(\nu_0/2,(\nu_0/2)\sigma_0^2\right), \qquad \sigma_0^2 = 0.01 \]

Case 2

Variance prior becomes hierarchical:

\[ \sigma_0^2 \sim Gamma(a_i,\beta) \]

\[ \beta \sim Gamma(c,d) \]

This lets the data inform \(\sigma_0^2\), the prior scale for the variance, instead of fixing it in advance.
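The two extra full conditionals that Case 2 needs are not written out here. A sketch of what they would be, assuming every Gamma is in shape-rate form, \(\nu_0\) is held fixed, and \(a\) stands for whichever \(a_i\) is in use:

\[ p(\sigma_0^2 \mid \tilde\sigma^2,\beta) \propto (\sigma_0^2)^{\nu_0/2}\, e^{-(\nu_0\tilde\sigma^2/2)\,\sigma_0^2} \times (\sigma_0^2)^{a-1}\, e^{-\beta\sigma_0^2} \]

\[ \sigma_0^2 \mid \tilde\sigma^2,\beta \sim Gamma\left( a + \frac{\nu_0}{2},\; \beta + \frac{\nu_0\tilde\sigma^2}{2} \right) \]

\[ p(\beta \mid \sigma_0^2) \propto \beta^{a}\, e^{-\beta\sigma_0^2} \times \beta^{c-1}\, e^{-d\beta} \]

\[ \beta \mid \sigma_0^2 \sim Gamma\left( a + c,\; d + \sigma_0^2 \right) \]

Under these assumptions both conditionals are again Gamma, so every step of the Case 2 sampler remains a direct draw.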

Hyperparameter Selection (Case 2)

For the hierarchical model we use

\[ \sigma_0^2 \sim Gamma(a_i,\beta) \]

We examine three choices:

\[ a_1 = 1, \quad a_2 = 5, \quad a_3 = 10 \]

Interpretation:

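  • Smaller \(a\) \(\rightarrow\) more diffuse prior on \(\sigma_0^2\) (relative spread \(1/\sqrt{a}\) is larger)
  • Larger \(a\) \(\rightarrow\) more concentrated prior on \(\sigma_0^2\), with a larger prior mean \(a/\beta\) for a given \(\beta\)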

This lets us evaluate prior sensitivity by comparing posterior summaries under different \(a\) values.
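To make the role of \(a\) concrete, here is a small sketch of the mean and coefficient of variation of a Gamma(\(a\), rate \(\beta\)) distribution. The value `BETA = 1.0` is hypothetical and used only for illustration; in the model \(\beta\) has its own prior.

```python
import math

def gamma_summary(a, beta):
    """Mean, standard deviation and coefficient of variation of Gamma(a, rate=beta)."""
    mean = a / beta
    sd = math.sqrt(a) / beta
    return mean, sd, sd / mean

BETA = 1.0  # HYPOTHETICAL fixed rate, for illustration only
for a in (1, 5, 10):
    mean, sd, cv = gamma_summary(a, BETA)
    print(a, mean, round(cv, 3))
# prints:
# 1 1.0 1.0
# 5 5.0 0.447
# 10 10.0 0.316
```

The coefficient of variation \(1/\sqrt{a}\) does not depend on \(\beta\), while the mean \(a/\beta\) grows with \(a\). Increasing \(a\) therefore both tightens the prior and shifts it toward larger values of \(\sigma_0^2\).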

Gibbs Sampler

We use Gibbs sampling to generate posterior samples.

Algorithm:

Initialize parameters.

For \(t=1,\ldots,T\):

  1. \[ \theta^{(t)} \sim p(\theta\mid \sigma^{2(t-1)},y) \]

  2. \[ \sigma^{2(t)} \sim p(\sigma^2\mid \theta^{(t)},y) \]

Case 2 additionally samples:

  3. \(\sigma_0^2\)

  4. \(\beta\)

We run:

  • 10,000 iterations
  • 2,000 burn-in
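A minimal sketch of the Case 1 sampler with these run settings, written in standard-library Python rather than the original analysis code. It assumes \(\nu_0 = 1\) (not stated here), starts \(\sigma^2\) at the sample variance, and uses the hypothetical function name `gibbs_case1`. Case 2 would add draws for \(\sigma_0^2\) and \(\beta\) inside the loop.

```python
import random
import statistics

y = [1.64, 1.70, 1.72, 1.74, 1.82, 1.82, 1.82, 1.90, 2.08]
n, ybar = len(y), sum(y) / len(y)

MU0, TAU0_SQ = 1.9, 0.95 ** 2  # prior on theta, from the text
SIGMA0_SQ = 0.01               # prior guess for the variance, from the text
NU0 = 1.0                      # ASSUMPTION: nu_0 is not given in the text

def gibbs_case1(n_iter=10_000, burn_in=2_000, seed=1):
    """Gibbs sampler for (theta, sigma^2); returns the post-burn-in draws."""
    rng = random.Random(seed)
    sigma2 = statistics.variance(y)  # starting value (an arbitrary choice)
    theta_draws, sigma2_draws = [], []
    for _ in range(n_iter):
        # Step 1: theta | sigma^2, y ~ N(mu_n, tau_n^2)
        tau_n2 = 1.0 / (n / sigma2 + 1.0 / TAU0_SQ)
        mu_n = tau_n2 * (n * ybar / sigma2 + MU0 / TAU0_SQ)
        theta = rng.gauss(mu_n, tau_n2 ** 0.5)
        # Step 2: precision | theta, y ~ Gamma(shape, rate), then invert.
        shape = (n + NU0) / 2
        rate = (sum((yi - theta) ** 2 for yi in y) + NU0 * SIGMA0_SQ) / 2
        sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)  # scale = 1 / rate
        theta_draws.append(theta)
        sigma2_draws.append(sigma2)
    return theta_draws[burn_in:], sigma2_draws[burn_in:]

theta_post, sigma2_post = gibbs_case1()
theta_sorted = sorted(theta_post)
lower = theta_sorted[int(0.025 * len(theta_sorted))]
upper = theta_sorted[int(0.975 * len(theta_sorted))]
print(statistics.mean(theta_post), lower, upper, statistics.mean(sigma2_post))
```

With 10,000 iterations and 2,000 discarded as burn-in, 8,000 draws remain for the posterior summaries. Exact numbers depend on the seed and on the assumed \(\nu_0\).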

MCMC Diagnostics

Takeaway: stable trace + approximately normal posterior near 1.8 mm.

MCMC Diagnostics (Interpretation)

  • Trace plot: fluctuates around a stable mean with no visible trend or drift \(\rightarrow\) no sign of poor mixing or non-convergence.
  • Posterior distribution: \(\theta\) is approximately normal, centered around ~1.8 mm (close to the sample mean).

These diagnostics show no sign of non-convergence, so we treat the post-burn-in draws as samples from the posterior distribution.
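Visual checks can be supplemented with a numerical one. The sketch below computes lag autocorrelations and a simple truncated-sum estimate of effective sample size. It is not part of the original analysis, the function names are illustrative, and the two chains are synthetic stand-ins (not the midge posterior) chosen to show how a well-mixing and a sticky chain differ.

```python
import random

def autocorr(chain, lag):
    """Sample autocorrelation of a chain at the given lag."""
    m = len(chain)
    mean = sum(chain) / m
    var = sum((x - mean) ** 2 for x in chain)
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean) for i in range(m - lag))
    return cov / var

def effective_sample_size(chain, max_lag=200):
    """m / (1 + 2 * sum of autocorrelations), stopping at the first non-positive lag."""
    total = 0.0
    for lag in range(1, max_lag + 1):
        rho = autocorr(chain, lag)
        if rho <= 0:
            break
        total += rho
    return len(chain) / (1 + 2 * total)

# SYNTHETIC chains for illustration only.
rng = random.Random(0)
iid_chain = [rng.gauss(0, 1) for _ in range(4000)]   # independent draws
sticky_chain = [0.0]
for _ in range(3999):                                # AR(1) with coefficient 0.9
    sticky_chain.append(0.9 * sticky_chain[-1] + rng.gauss(0, 1))

print(effective_sample_size(iid_chain), effective_sample_size(sticky_chain))
```

Applied to the \(\theta\) draws, an effective sample size close to the number of retained iterations would support the reading of the trace plot.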

Posterior Results

Case 2 prior sensitivity (varying \(a\)):

  a    E(theta|y)   theta 2.5%   theta 97.5%   E(sigma^2|y)
  1    1.804        1.687        1.920         0.0309
  5    1.806        1.591        2.019         0.1080
 10    1.807        1.472        2.152         0.2700

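  • The posterior mean of \(\theta\) is stable across \(a\) (\(\approx 1.805\)), but the 95% interval width grows from 0.233 to 0.680 \(\rightarrow\) the point estimate is robust, its uncertainty is not.
  • \(\sigma^2\) is sensitive to \(a\): \(E(\sigma^2\mid y)\) rises from 0.0309 to 0.2700 as \(a\) goes from 1 to 10, which widens the intervals for \(\theta\).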
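The interval widths can be read directly off the table. A short sketch, with the interval endpoints copied from the rows above:

```python
# 95% interval endpoints for theta, keyed by the hyperparameter a.
intervals = {1: (1.687, 1.920), 5: (1.591, 2.019), 10: (1.472, 2.152)}

widths = {a: round(hi - lo, 3) for a, (lo, hi) in intervals.items()}
print(widths)  # {1: 0.233, 5: 0.428, 10: 0.68}

# How much wider the a = 10 interval is than the a = 1 interval.
ratio = widths[10] / widths[1]
```

The interval at \(a = 10\) is nearly three times as wide as at \(a = 1\), even though the posterior mean barely moves.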

Conclusion

Answer to the problem: the population mean midge wing length is about 1.80 mm.

  • Posterior mean of \(\theta\): \(\approx 1.805\) mm (robust across \(a=1,5,10\))
  • Uncertainty: 95% credible intervals overlap across hyperparameter choices but widen with \(a\), from (1.687, 1.920) at \(a=1\) to (1.472, 2.152) at \(a=10\)
  • Variance inference: \(E(\sigma^2\mid y)\) changes noticeably with \(a\) \(\rightarrow\) variance is more prior-sensitive in Case 2

Overall: Bayesian modeling + Gibbs sampling gave a principled estimate of typical wing length while quantifying uncertainty and checking sensitivity to prior assumptions.