We derive the Bayes estimator for an exponential rate parameter \(\theta\) under squared error loss, using a conjugate Gamma prior.
Let \(Y_1, \dots, Y_n \mid \theta \sim \text{Exponential}(\theta)\), i.i.d. The probability density function for one observation is:
\[ f(y_i \mid \theta) = \theta e^{-\theta y_i}, \quad y_i > 0, \ \theta > 0 \]
For \(n\) independent observations, the joint likelihood is:
\[ f(\mathbf{y} \mid \theta) = \prod_{i=1}^n \theta e^{-\theta y_i} = \theta^n e^{-\theta \sum_{i=1}^n y_i} \]
As a kernel in \(\theta\) (here no constant factors need to be dropped, so the proportionality is in fact an equality):
\[ f(\mathbf{y} \mid \theta) \propto \theta^n e^{-\theta \sum y_i} \]
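The closed form above says that the data enter the likelihood only through \(n\) and \(\sum y_i\). A minimal R sketch of that claim, assuming a small made-up data vector `y` and an arbitrary rate `theta` (both hypothetical, chosen only for illustration), compares the product of individual densities with the closed form; the final line should report that the two agree.

```r
# Hypothetical data, chosen only for illustration (not from the text)
y <- c(0.8, 2.1, 1.4, 3.0, 2.7)
n <- length(y)
theta <- 0.7  # arbitrary rate at which to evaluate the likelihood

# Joint likelihood as a product of individual exponential densities
lik_product <- prod(dexp(y, rate = theta))

# Closed form: theta^n * exp(-theta * sum(y))
lik_closed <- theta^n * exp(-theta * sum(y))

# Check that both routes give the same value up to floating-point error
all.equal(lik_product, lik_closed)
```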
The conjugate prior for the rate parameter \(\theta\) of an exponential distribution is the Gamma distribution.
Let \(\theta \sim \text{Gamma}(\alpha, \beta)\) with shape \(\alpha > 0\) and rate \(\beta > 0\). The probability density function is:
\[ \pi(\theta) = \frac{\beta^\alpha}{\Gamma(\alpha)} \theta^{\alpha-1} e^{-\beta \theta}, \quad \theta > 0 \]
As a kernel:
\[ \pi(\theta) \propto \theta^{\alpha-1} e^{-\beta \theta} \]
We use squared error loss:
\[ L(\theta, \hat{\theta}) = (\theta - \hat{\theta})^2 \]
For squared error loss, the Bayes estimator is the posterior mean:
\[ \hat{\theta}_{\text{Bayes}} = \mathbb{E}[\theta \mid \mathbf{y}] \]
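One way to see why the posterior mean is optimal is to decompose the posterior expected loss for an arbitrary candidate estimate \(a\) (a symbol introduced here only for illustration), assuming the posterior variance is finite:
\[ \mathbb{E}\left[ (\theta - a)^2 \mid \mathbf{y} \right] = \operatorname{Var}(\theta \mid \mathbf{y}) + \left( \mathbb{E}[\theta \mid \mathbf{y}] - a \right)^2 \]
The first term does not depend on \(a\), and the second is non-negative and equals zero only at \(a = \mathbb{E}[\theta \mid \mathbf{y}]\), so that choice minimizes the posterior expected loss.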
Bayes’ theorem states:
\[ \pi(\theta \mid \mathbf{y}) = \frac{f(\mathbf{y} \mid \theta) \pi(\theta)}{m(\mathbf{y})} \propto f(\mathbf{y} \mid \theta) \times \pi(\theta) \]
where \(m(\mathbf{y})\) is the marginal likelihood (normalizing constant).
Likelihood kernel:
\[ f(\mathbf{y} \mid \theta) \propto \theta^n e^{-\theta \sum y_i} \]
Prior kernel:
\[ \pi(\theta) \propto \theta^{\alpha-1} e^{-\beta \theta} \]
Multiplying:
\[ \pi(\theta \mid \mathbf{y}) \propto \theta^n e^{-\theta \sum y_i} \times \theta^{\alpha-1} e^{-\beta \theta} \]
Combine powers of \(\theta\):
\[ \pi(\theta \mid \mathbf{y}) \propto \theta^{(n + \alpha - 1)} e^{-\theta (\beta + \sum y_i)} \]
The kernel
\[ \theta^{n + \alpha - 1} e^{-\theta (\beta + \sum y_i)} \]
is exactly the kernel of a Gamma distribution with shape \(n + \alpha\) and rate \(\beta + \sum_{i=1}^n y_i\). Therefore:
\[ \theta \mid \mathbf{y} \sim \text{Gamma}\left( n + \alpha,\ \beta + \sum_{i=1}^n y_i \right) \]
The full posterior density is:
\[ \pi(\theta \mid \mathbf{y}) = \frac{(\beta + \sum y_i)^{n+\alpha}}{\Gamma(n+\alpha)} \theta^{n+\alpha-1} e^{-\theta (\beta + \sum y_i)} \]
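As a sanity check on the conjugacy result, the following R sketch compares the product of the likelihood and prior kernels with the closed-form Gamma density at a few points. The numerical values and the names `kernel`, `theta_grid`, and `expected_const` are assumptions made for illustration. If the derivation holds, the ratio of the two should be the same constant at every point, namely \(\Gamma(n+\alpha) / (\beta + \sum y_i)^{n+\alpha}\).

```r
# Values assumed for illustration only
n <- 5; sum_y <- 10
alpha <- 1; beta <- 2

# Unnormalized posterior: likelihood kernel times prior kernel
kernel <- function(theta) {
  theta^n * exp(-theta * sum_y) * theta^(alpha - 1) * exp(-beta * theta)
}

# Ratio of the kernel to the normalized Gamma(n + alpha, beta + sum_y) density
theta_grid <- c(0.1, 0.5, 1, 2)
ratio <- kernel(theta_grid) /
  dgamma(theta_grid, shape = n + alpha, rate = beta + sum_y)

# The ratio should equal the reciprocal of the normalizing constant everywhere
expected_const <- gamma(n + alpha) / (beta + sum_y)^(n + alpha)
all.equal(ratio, rep(expected_const, length(theta_grid)))
```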
For a \(\text{Gamma}(a, b)\) distribution (shape \(a\), rate \(b\)), the mean is:
\[ \mathbb{E}[\theta] = \frac{a}{b} \]
Applying this to our posterior:
\[ \mathbb{E}[\theta \mid \mathbf{y}] = \frac{n + \alpha}{\beta + \sum_{i=1}^n y_i} \]
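The closed-form mean can be cross-checked numerically. The sketch below, with values assumed purely for illustration, computes the posterior mean three ways: from the formula, by numerical integration of \(\theta\) times the posterior density, and by averaging Monte Carlo draws. The three results should agree up to integration and simulation error.

```r
# Values assumed for illustration only
n <- 5; sum_y <- 10
alpha <- 1; beta <- 2
shape_post <- n + alpha
rate_post <- beta + sum_y

# Closed-form posterior mean
mean_closed <- shape_post / rate_post

# Numerical integration of theta * posterior density over (0, Inf)
mean_numeric <- integrate(
  function(theta) theta * dgamma(theta, shape = shape_post, rate = rate_post),
  lower = 0, upper = Inf
)$value

# Monte Carlo average of posterior draws (seed fixed for reproducibility)
set.seed(1)
mean_mc <- mean(rgamma(1e5, shape = shape_post, rate = rate_post))

c(closed = mean_closed, numeric = mean_numeric, monte_carlo = mean_mc)
```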
Under squared error loss, the Bayes estimator is:
\[ \boxed{\hat{\theta}_{\text{Bayes}} = \frac{n + \alpha}{\beta + \sum_{i=1}^n y_i}} \]
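where \(n\) is the sample size, \(\sum_{i=1}^n y_i\) is the sum of the observations, and \(\alpha\) and \(\beta\) are the shape and rate of the Gamma prior.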
If we choose the prior \(\text{Gamma}(\alpha = 1, \beta = \mu)\), which is an exponential prior with rate \(\mu\), the Bayes estimator becomes:
\[ \hat{\theta}_{\text{Bayes}} = \frac{n + 1}{\mu + \sum_{i=1}^n y_i} \]
This matches the formula from Example 4.6.
The estimator can be written as:
\[ \hat{\theta} = \frac{n + \alpha}{\beta + n \bar{y}} \]
where \(\bar{y} = \frac{1}{n} \sum y_i\).
The MLE for the exponential rate is \(\hat{\theta}_{\text{MLE}} = \frac{n}{\sum y_i} = \frac{1}{\bar{y}}\).
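The Bayes estimator shrinks the MLE toward the prior mean \(\alpha / \beta\): it adds \(\alpha\) to the numerator and \(\beta\) to the denominator of the MLE. As the sample size grows, the data terms \(n\) and \(\sum y_i\) dominate \(\alpha\) and \(\beta\), and the Bayes estimator approaches the MLE.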
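The shrinkage can be made explicit by rewriting the estimator as a weighted average of the MLE and the prior mean, where the weight \(w\) is a symbol introduced here for illustration:
\[ \hat{\theta}_{\text{Bayes}} = \frac{n + \alpha}{\beta + n \bar{y}} = w \cdot \frac{1}{\bar{y}} + (1 - w) \cdot \frac{\alpha}{\beta}, \quad w = \frac{n \bar{y}}{\beta + n \bar{y}} \]
Since \(0 < w < 1\), the estimator always lies between the MLE and the prior mean, and \(w\) moves toward 1 as \(n \bar{y} = \sum y_i\) grows relative to \(\beta\).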
Suppose we have \(n = 5\) observations with \(\sum y_i = 10\), and we use a \(\text{Gamma}(\alpha = 1, \beta = 2)\) prior:
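```r
# Data summary and prior hyperparameters
n <- 5
sum_y <- 10
alpha <- 1
beta <- 2

# Bayes estimator (posterior mean)
theta_hat <- (n + alpha) / (beta + sum_y)
theta_hat
## [1] 0.5

# Maximum likelihood estimator
mle <- n / sum_y
mle
## [1] 0.5
```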
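In this example the two estimates coincide because the prior mean \(\alpha / \beta = 1/2\) happens to equal the MLE, so there is nothing to shrink. To see the shrinkage, here is a sketch with the same data and a different prior, \(\text{Gamma}(\alpha = 2, \beta = 1)\) with prior mean 2; this prior is a hypothetical choice made only for illustration.

```r
# Same data summary as in the example
n <- 5
sum_y <- 10

# Hypothetical prior, chosen so that the prior mean differs from the MLE
alpha <- 2
beta <- 1

mle <- n / sum_y                            # 5 / 10
prior_mean <- alpha / beta                  # 2 / 1
theta_hat <- (n + alpha) / (beta + sum_y)   # 7 / 11

# The Bayes estimate should fall between the MLE and the prior mean
c(mle = mle, bayes = theta_hat, prior_mean = prior_mean)
```

With these values the Bayes estimate is \(7/11 \approx 0.64\), which sits between the MLE of 0.5 and the prior mean of 2, and much closer to the MLE because the data total \(\sum y_i = 10\) outweighs the prior rate \(\beta = 1\).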