We have:
Likelihood:
\[ f(y|\beta,\sigma^2) = \left( \frac{\lambda^2}{2\pi} \right)^{n/2} |H|^{-1/2} \exp\left[ -\frac{\lambda^2}{2} (y - X\beta)' H^{-1} (y - X\beta) \right] \]
where \(\lambda^2 = 1/\sigma^2\).
Prior on \(\lambda^2\):
\[ \lambda^2 \sim \text{Gamma}(a,b) \quad \text{so} \quad \pi(\lambda^2) \propto (\lambda^2)^{a-1} e^{-b \lambda^2}, \quad \lambda^2 > 0 \]
(Here \(b\) is the rate parameter, since it multiplies \(\lambda^2\) in the exponent; the normalizing constant \(b^a / \Gamma(a)\) does not depend on \(\lambda^2\) and is absorbed into the proportionality.)
Prior on \(\beta\) given \(\lambda^2\):
\[ \beta \mid \lambda^2 \sim N_p\left(\beta_0, \; \frac{1}{\lambda^2} M^{-1} \right) \]
so covariance \(= M^{-1} / \lambda^2\).
That means the prior density is:
\[ \pi(\beta|\lambda^2) \propto (\lambda^2)^{p/2} |M|^{1/2} \exp\left[ -\frac{\lambda^2}{2} (\beta - \beta_0)' M (\beta - \beta_0) \right] \]
Joint prior:
\[ \pi(\beta, \lambda^2) = \pi(\beta | \lambda^2) \cdot \pi(\lambda^2). \]
Substitute:
\[ \pi(\beta|\lambda^2) \propto (\lambda^2)^{p/2} \exp\left[ -\frac{\lambda^2}{2} (\beta - \beta_0)' M (\beta - \beta_0) \right] \] \[ \pi(\lambda^2) \propto (\lambda^2)^{a-1} \exp\left[ -b \lambda^2 \right] \]
Multiply:
\[ \pi(\beta, \lambda^2) \propto (\lambda^2)^{p/2 + a - 1} \exp\left[ -\frac{\lambda^2}{2} (\beta - \beta_0)' M (\beta - \beta_0) - b \lambda^2 \right]. \]
Exponent:
\[ -\frac{\lambda^2}{2} (\beta - \beta_0)' M (\beta - \beta_0) - b \lambda^2 \] \[ = -\frac{\lambda^2}{2} \left[ (\beta - \beta_0)' M (\beta - \beta_0) + 2b \right]. \]
Thus:
\[ \pi(\beta, \lambda^2) \propto (\lambda^2)^{p/2 + a - 1} \exp\left[ -\frac{\lambda^2}{2} \Big( (\beta - \beta_0)' M (\beta - \beta_0) + 2b \Big) \right]. \]
The joint prior as given (the form to be verified) is:
\[ \pi(\beta, \lambda^2) \propto \lambda^{p + 2a - 2} \exp\left[ -\frac{\lambda^2}{2} \big\{ 2b + (\beta - \beta_0)' M (\beta - \beta_0) \big\} \right] \]
Check the exponents: the result above has the factor \((\lambda^2)^{p/2 + a - 1}\), and
\[ (\lambda^2)^{p/2 + a - 1} = \lambda^{2(p/2 + a - 1)} = \lambda^{p + 2a - 2}, \]
so the given form matches exactly.
The joint prior density is proportional to:
\[ \boxed{ \pi(\beta, \lambda^2) \propto (\lambda^2)^{\frac{p}{2} + a - 1} \exp\left[ -\frac{\lambda^2}{2} \Big( (\beta - \beta_0)' M (\beta - \beta_0) + 2b \Big) \right] } \]
or equivalently:
\[ \boxed{ \pi(\beta, \lambda^2) \propto \lambda^{p + 2a - 2} \exp\left[ -\frac{\lambda^2}{2} \big( 2b + (\beta - \beta_0)' M (\beta - \beta_0) \big) \right]. } \]
The joint posterior is proportional to (likelihood × prior).
Likelihood:
\[ f(y|\beta,\lambda^2) \propto (\lambda^2)^{n/2} \exp\left[ -\frac{\lambda^2}{2} (y - X\beta)' H^{-1} (y - X\beta) \right] \]
Prior:
\[ \pi(\beta,\lambda^2) \propto (\lambda^2)^{p/2 + a - 1} \exp\left[ -\frac{\lambda^2}{2} \left( 2b + (\beta - \beta_0)' M (\beta - \beta_0) \right) \right] \]
Multiplying:
\[ \pi(\beta,\lambda^2|y) \propto (\lambda^2)^{n/2 + p/2 + a - 1} \exp\left[ -\frac{\lambda^2}{2} \left( (y - X\beta)' H^{-1} (y - X\beta) + (\beta - \beta_0)' M (\beta - \beta_0) + 2b \right) \right] \]
So Eq. (6.3) is simply:
\[ \pi(\beta,\lambda^2|y) \propto (\lambda^2)^{\frac{n+p}{2} + a - 1} \exp\left[ -\frac{\lambda^2}{2} Q(\beta) \right] \]
with
\[ Q(\beta) = 2b + (y - X\beta)' H^{-1} (y - X\beta) + (\beta - \beta_0)' M (\beta - \beta_0). \]
We need to complete the square in \(\beta\), that is, rewrite \(Q(\beta)\) as a single quadratic form in \(\beta\) plus a constant.
First, expand the two quadratic terms separately:
Term A = \((y - X\beta)' H^{-1} (y - X\beta)\):
\[ = y' H^{-1} y - 2 y' H^{-1} X\beta + \beta' X' H^{-1} X \beta. \]
Term B = \((\beta - \beta_0)' M (\beta - \beta_0)\):
\[ = \beta' M \beta - 2 \beta_0' M \beta + \beta_0' M \beta_0. \]
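Collecting terms, and using the symmetry of \(H^{-1}\) and \(M\) to write \(y' H^{-1} X\beta = \beta' X' H^{-1} y\) and \(\beta_0' M \beta = \beta' M \beta_0\):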
\[ Q(\beta) = 2b + y' H^{-1} y + \beta_0' M \beta_0 - 2\beta'(X' H^{-1} y + M \beta_0) + \beta' (X' H^{-1} X + M) \beta. \]
Define:
\[ M_* = M + X' H^{-1} X \] and
\[ c = X' H^{-1} y + M \beta_0. \]
Then:
\[ Q(\beta) = \beta' M_* \beta - 2 \beta' c + \left( 2b + y' H^{-1} y + \beta_0' M \beta_0 \right). \]
Let \(\beta_* = M_*^{-1} c\). Then we have:
\[ Q(\beta) = (\beta - \beta_*)' M_* (\beta - \beta_*) + \text{(terms not involving \(\beta\))}. \]
Let’s find that constant term:
We know
\[ \beta' M_* \beta - 2\beta' c = (\beta - \beta_*)' M_* (\beta - \beta_*) - \beta_*' M_* \beta_* \]
since:
\[ (\beta - \beta_*)' M_* (\beta - \beta_*) = \beta' M_* \beta - 2\beta' M_* \beta_* + \beta_*' M_* \beta_* \]
and \(M_* \beta_* = c\) so \(\beta' c = \beta' M_* \beta_*\).
Thus:
\[ Q(\beta) = (\beta - \beta_*)' M_* (\beta - \beta_*) - \beta_*' M_* \beta_* + \left( 2b + y' H^{-1} y + \beta_0' M \beta_0 \right). \]
Define \(2b_*\) as the constant term:
\[ 2b_* = 2b + y' H^{-1} y + \beta_0' M \beta_0 - \beta_*' M_* \beta_*. \]
Then:
\[ Q(\beta) = 2b_* + (\beta - \beta_*)' M_* (\beta - \beta_*). \]
This matches Eq. (6.4).
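As a quick sanity check on these definitions, consider the special case \(H = I_n\), \(M = g I_p\) with \(g > 0\), and \(\beta_0 = 0\). The symbol \(g\) is introduced here purely for illustration and is not part of the original setup. Then
\[ M_* = X'X + g I_p, \qquad \beta_* = (X'X + g I_p)^{-1} X' y, \]
so in this case \(\beta_*\) coincides with the ridge regression estimate with penalty \(g\), and letting \(g \to 0\) recovers the ordinary least squares estimate when \(X'X\) is invertible.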
Substitute into Eq. (6.3):
\[ \pi(\beta,\lambda^2|y) \propto (\lambda^2)^{\frac{n+p}{2} + a - 1} \exp\left[ -\frac{\lambda^2}{2} \left( 2b_* + (\beta - \beta_*)' M_* (\beta - \beta_*) \right) \right]. \]
That is exactly Eq. (6.5).
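The completed square can also be checked numerically. The sketch below is a minimal illustration, assuming a single regressor (\(p = 1\)) and \(H = I\), so that every matrix reduces to a scalar. The data, the hyperparameter values, and the function names `posterior_params`, `q_direct`, and `q_completed` are all made up for this check and do not come from the original.

```python
# Scalar illustration (p = 1, H = I): every matrix in the text becomes a number.
# All data and hyperparameter values below are made up for the check.


def posterior_params(x, y, beta0, m, b):
    """Return (m_star, beta_star, b_star) for scalar beta with H = I."""
    xtx = sum(xi * xi for xi in x)                 # X'X
    xty = sum(xi * yi for xi, yi in zip(x, y))     # X'y
    yty = sum(yi * yi for yi in y)                 # y'y
    m_star = m + xtx                               # M_* = M + X'X
    c = xty + m * beta0                            # c = X'y + M beta_0
    beta_star = c / m_star                         # beta_* = M_*^{-1} c
    # 2 b_* = 2b + y'y + beta_0' M beta_0 - beta_*' M_* beta_*
    two_b_star = 2 * b + yty + m * beta0 ** 2 - m_star * beta_star ** 2
    return m_star, beta_star, two_b_star / 2


def q_direct(beta, x, y, beta0, m, b):
    """Q(beta) as first defined: 2b + residual quadratic + prior quadratic."""
    rss = sum((yi - xi * beta) ** 2 for xi, yi in zip(x, y))
    return 2 * b + rss + m * (beta - beta0) ** 2


def q_completed(beta, m_star, beta_star, b_star):
    """Q(beta) after completing the square."""
    return 2 * b_star + m_star * (beta - beta_star) ** 2


x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 3.2, 3.9]
beta0, m, b = 0.5, 2.0, 1.5
m_star, beta_star, b_star = posterior_params(x, y, beta0, m, b)

# Largest disagreement between the two forms over a few trial values of beta.
max_gap = max(
    abs(q_direct(beta, x, y, beta0, m, b)
        - q_completed(beta, m_star, beta_star, b_star))
    for beta in (-1.0, 0.0, 0.7, 2.5)
)
print(max_gap)  # should be zero up to floating-point rounding
```

The two forms of \(Q(\beta)\) agree at every trial value, which is what the identity requires since both are quadratics in \(\beta\).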
From Eq. (6.5), fixing \(\lambda^2\), we see the dependence on \(\beta\) is a normal kernel:
\[ \exp\left[ -\frac{\lambda^2}{2} (\beta - \beta_*)' M_* (\beta - \beta_*) \right] \]
so
\[ \beta \mid \lambda^2, y \sim N\left(\beta_*, \frac{1}{\lambda^2} M_*^{-1} \right). \]
From Eq. (6.5), fixing \(\beta\) and viewing the density as a function of \(\lambda^2\):
The only \(\lambda^2\) factors are \((\lambda^2)^{\frac{n+p}{2}+a-1}\) and the exponential factor \(\exp\left[ -\frac{\lambda^2}{2} Q(\beta) \right]\), where \(Q(\beta) = 2b_* + (\beta - \beta_*)' M_* (\beta - \beta_*)\) from the completed square.
A Gamma density with shape \(r\) and rate \(s\) has kernel \(f(x) \propto x^{r-1} e^{-s x}\). With \(x = \lambda^2\), matching terms gives \(r = \frac{n+p}{2} + a\) and \(s = \frac{1}{2} Q(\beta) = b_* + \frac{1}{2}(\beta-\beta_*)' M_* (\beta-\beta_*)\).
This is the same shape-rate convention as the prior \(G(a,b)\), in which \(b\) is the rate (scale \(= 1/b\)).
So:
\[ \lambda^2 \mid \beta, y \sim G\left( a + \frac{n+p}{2}, \; b_* + \frac{1}{2}(\beta - \beta_*)' M_* (\beta-\beta_*) \right). \]
Thus the given final forms in Eq. (6.6) are:
\[ \boxed{ \beta \mid \lambda^2, y \sim N\left( \beta_*, (\lambda^2 M_*)^{-1} \right) } \] \[ \boxed{ \lambda^2 \mid \beta, y \sim \text{Gamma}\left( \frac{n+p}{2} + a, \; b_* + \frac{1}{2}(\beta-\beta_*)' M_* (\beta-\beta_*) \right) } \]
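Because both full conditionals are standard distributions, they can be alternated in a Gibbs sampler. The following is a minimal sketch for scalar \(\beta\) (\(p = 1\)), assuming the posterior quantities \(b_*\), \(\beta_*\), and \(M_*\) have already been computed. The numerical values and the function name `gibbs` are hypothetical, chosen only to make the example run. Note that `random.gammavariate` takes a scale as its second argument, so the rate is inverted.

```python
import random

# Hypothetical scalar posterior quantities (p = 1); not taken from the text.
n, p, a = 4, 1, 2.0
b_star, beta_star, m_star = 2.5, 0.94, 32.0


def gibbs(n_iter=20000, burn_in=1000, seed=0):
    """Alternate draws from lambda^2 | beta, y and beta | lambda^2, y."""
    rng = random.Random(seed)
    shape = (n + p) / 2 + a            # shape of lambda^2 | beta, y
    beta = beta_star                   # start at the posterior location
    beta_draws, lam2_draws = [], []
    for it in range(n_iter + burn_in):
        # lambda^2 | beta, y ~ Gamma(shape, rate), with a beta-dependent rate
        rate = b_star + 0.5 * m_star * (beta - beta_star) ** 2
        lam2 = rng.gammavariate(shape, 1.0 / rate)   # second argument is scale
        # beta | lambda^2, y ~ N(beta_star, 1 / (lambda^2 * m_star))
        beta = rng.gauss(beta_star, (lam2 * m_star) ** -0.5)
        if it >= burn_in:
            beta_draws.append(beta)
            lam2_draws.append(lam2)
    return beta_draws, lam2_draws


beta_draws, lam2_draws = gibbs()
beta_mean = sum(beta_draws) / len(beta_draws)
lam2_mean = sum(lam2_draws) / len(lam2_draws)
print(beta_mean, lam2_mean)
```

With these values, the long-run average of the \(\lambda^2\) draws should settle near \((n/2 + a)/b_* = 1.6\) and the average of the \(\beta\) draws near \(\beta_* = 0.94\), up to Monte Carlo error.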
To obtain \(\pi(\beta|y)\), we integrate out \(\lambda^2\):
\[ \pi(\beta|y) \propto \int_0^\infty (\lambda^2)^{\frac{n+p}{2} + a - 1} \exp\left[ -\frac{\lambda^2}{2} Q(\beta) \right] d(\lambda^2) \]
where \(Q(\beta) = 2b_* + (\beta - \beta_*)' M_* (\beta - \beta_*)\).
Let \(t = \lambda^2\). Then the integrand becomes: \[ t^{\frac{n+p}{2} + a - 1} e^{-t \cdot Q(\beta)/2} \]
This is a Gamma kernel. Recall the Gamma integral: \[ \int_0^\infty t^{r-1} e^{-st} dt = \frac{\Gamma(r)}{s^r}, \quad s > 0 \]
Here: \[ r = \frac{n+p}{2} + a, \quad s = \frac{Q(\beta)}{2} \]
Thus: \[ \pi(\beta|y) \propto \left[ \frac{Q(\beta)}{2} \right]^{-\left( \frac{n+p}{2} + a \right)} \]
Therefore: \[ \pi(\beta|y) \propto \left[ 2b_* + (\beta - \beta_*)' M_* (\beta - \beta_*) \right]^{-\left( \frac{n+p}{2} + a \right)} \]
A multivariate t-distribution with \(\nu\) degrees of freedom, location \(\mu\), and scale matrix \(\Sigma\) has density:
\[ f(x) \propto \left[ \nu + (x - \mu)' \Sigma^{-1} (x - \mu) \right]^{-(\nu + p)/2} \]
Compare with our expression. The exponent in our density is: \[ -\left( \frac{n+p}{2} + a \right) = -\frac{n + p + 2a}{2} \]
This matches the t-distribution exponent \(-\frac{\nu + p}{2}\) if: \[ \nu + p = n + p + 2a \quad \Rightarrow \quad \nu = n + 2a \]
Now we need to match the quadratic form. Factor out \(2b_*\) from our expression:
\[ \pi(\beta|y) \propto (2b_*)^{-\left( \frac{n+2a+p}{2} \right)} \left[ 1 + \frac{1}{2b_*} (\beta - \beta_*)' M_* (\beta - \beta_*) \right]^{-\left( \frac{n+2a+p}{2} \right)} \]
For the standard t-distribution with \(\nu = n+2a\), we have: \[ f(x) \propto \left[ 1 + \frac{1}{\nu} (x - \mu)' \Sigma^{-1} (x - \mu) \right]^{-(\nu + p)/2} \]
Matching terms: \[ \frac{1}{\nu} \Sigma^{-1} = \frac{1}{2b_*} M_* \quad \Rightarrow \quad \Sigma^{-1} = \frac{\nu}{2b_*} M_* = \frac{n+2a}{2b_*} M_* \]
Therefore: \[ \Sigma = \frac{2b_*}{n+2a} M_*^{-1} \]
Thus: \[ \boxed{\beta \mid y \sim t_{n+2a}\left( \beta_*, \; \frac{2b_*}{n+2a} M_*^{-1} \right)} \]
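As a worked consequence, the standard moments of a multivariate t-distribution with \(\nu = n + 2a\) degrees of freedom give the posterior mean and covariance of \(\beta\), provided \(\nu\) is large enough for them to exist:
\[ E(\beta \mid y) = \beta_* \quad (n + 2a > 1), \qquad \operatorname{Cov}(\beta \mid y) = \frac{\nu}{\nu - 2} \cdot \frac{2b_*}{n+2a} M_*^{-1} = \frac{2b_*}{n + 2a - 2} M_*^{-1} \quad (n + 2a > 2). \]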
To obtain \(\pi(\lambda^2|y)\), we integrate out \(\beta\) from the joint posterior:
\[ \pi(\lambda^2|y) \propto \int \pi(\beta,\lambda^2|y) d\beta \]
From Equation (6.5): \[ \pi(\beta,\lambda^2|y) \propto (\lambda^2)^{\frac{n+p}{2} + a - 1} e^{-\lambda^2 b_*} \exp\left[ -\frac{\lambda^2}{2} (\beta - \beta_*)' M_* (\beta - \beta_*) \right] \]
The integral over \(\beta\) is a Gaussian integral: \[ \int \exp\left[ -\frac{\lambda^2}{2} (\beta - \beta_*)' M_* (\beta - \beta_*) \right] d\beta = \left( \frac{2\pi}{\lambda^2} \right)^{p/2} |M_*|^{-1/2} \]
This contributes a factor of \((\lambda^2)^{-p/2}\) to the kernel.
Therefore: \[ \pi(\lambda^2|y) \propto (\lambda^2)^{\frac{n+p}{2} + a - 1} e^{-\lambda^2 b_*} \times (\lambda^2)^{-p/2} \]
Simplify the exponent on \(\lambda^2\): \[ \frac{n+p}{2} + a - 1 - \frac{p}{2} = \frac{n}{2} + a - 1 \]
Thus: \[ \pi(\lambda^2|y) \propto (\lambda^2)^{\frac{n}{2} + a - 1} e^{-b_* \lambda^2} \]
This is the kernel of a Gamma distribution with shape \(\frac{n}{2} + a\) and rate \(b_*\):
\[ \boxed{\lambda^2 \mid y \sim G\left( \frac{n}{2} + a, \; b_* \right)} \]
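For reference, the usual Gamma moments in the shape-rate convention then give the following summaries, stated here as a worked consequence rather than as part of the original derivation:
\[ E(\lambda^2 \mid y) = \frac{n/2 + a}{b_*} = \frac{n + 2a}{2 b_*}, \qquad \operatorname{Var}(\lambda^2 \mid y) = \frac{n/2 + a}{b_*^2}, \]
and, since \(\sigma^2 = 1/\lambda^2\) has an inverse-gamma posterior,
\[ E(\sigma^2 \mid y) = \frac{b_*}{n/2 + a - 1} = \frac{2 b_*}{n + 2a - 2} \quad (n + 2a > 2). \]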
The marginal posterior distributions are:
\[ \boxed{ \beta|y \sim t_{n+2a}\left( \beta_*, \frac{2b_*}{n+2a} M_*^{-1} \right), \quad \lambda^2|y \sim G\left( \frac{n}{2} + a, b_* \right) } \]
Equation (6.6) gives the full conditional posterior distributions: \[ \beta|\lambda^2,y \sim N\left(\beta_*, \frac{1}{\lambda^2}M_*^{-1}\right), \quad \lambda^2|\beta,y \sim G\left(\frac{n+p}{2}+a, \; b_* + \frac{1}{2}(\beta-\beta_*)'M_*(\beta-\beta_*)\right) \] These are conditional on the other parameter being known/fixed. They are used in Gibbs sampling.
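Equation (6.7) gives the marginal posterior distributions, each obtained by integrating the other parameter out of the joint posterior. The two sets differ: the marginal of \(\beta\) is a multivariate t rather than a normal, and the marginal of \(\lambda^2\) has shape \(\frac{n}{2} + a\) and rate \(b_*\) rather than shape \(\frac{n+p}{2} + a\) and a \(\beta\)-dependent rate.
The marginal distributions cannot be read off from (6.6); the integration step is essential.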
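One way to see the link between the conditional and marginal results numerically is composition sampling: draw \(\lambda^2\) from its marginal \(G(n/2 + a, b_*)\), then draw \(\beta\) from the normal conditional given that \(\lambda^2\). The resulting \(\beta\) draws follow the marginal posterior of \(\beta\), so their variance should match the variance of the \(t_{n+2a}\) distribution rather than that of any single normal. The sketch below assumes scalar \(\beta\) (\(p = 1\)) and uses hypothetical values for \(n\), \(a\), \(b_*\), \(\beta_*\), and \(M_*\).

```python
import random
import statistics

# Hypothetical scalar posterior quantities (p = 1); not taken from the text.
n, a = 4, 2.0
b_star, beta_star, m_star = 2.5, 0.94, 32.0

rng = random.Random(1)
shape = n / 2 + a                      # shape of lambda^2 | y
beta_draws = []
for _ in range(50000):
    # Step 1: lambda^2 | y ~ Gamma(shape, rate b_star); gammavariate takes a scale
    lam2 = rng.gammavariate(shape, 1.0 / b_star)
    # Step 2: beta | lambda^2, y ~ N(beta_star, 1 / (lambda^2 * m_star))
    beta_draws.append(rng.gauss(beta_star, (lam2 * m_star) ** -0.5))

nu = n + 2 * a                                 # degrees of freedom of the marginal t
t_scale = 2 * b_star / (nu * m_star)           # scalar version of the t scale matrix
t_variance = nu / (nu - 2) * t_scale           # variance of a t with nu > 2
sample_variance = statistics.pvariance(beta_draws)
print(sample_variance, t_variance)
```

Up to Monte Carlo error, the sample variance should agree with the t variance, which is larger than \(1/(\lambda^2 M_*)\) evaluated at the posterior mean of \(\lambda^2\) because the uncertainty in \(\lambda^2\) is averaged over.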