Question 1: Poisson Distribution with Chi-Square Prior
a) What is the posterior distribution of λ, after observing a single value of x?
Step 1: Likelihood Function
If \(X \sim \text{Poisson}(\lambda)\) , then the likelihood of observing \(x\) is:
\[
P(X = x \mid \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}
\]
Step 2: Prior Distribution
Given \(\lambda \sim \chi^2_4\) , we can express this as a Gamma distribution:
\[
\lambda \sim \text{Gamma}(\alpha = 2,\ \theta = 2)
\]
with shape \(\alpha = 2\) and scale \(\theta = 2\) (equivalently, rate \(\beta = 1/2\)).
Recall that a Chi-square distribution with \(k\) degrees of freedom is equivalent to:
\[
\chi^2_k \equiv \text{Gamma}\left(\frac{k}{2}, 2\right)
\]
So the prior density is:
\[
\pi(\lambda) = \frac{1}{\Gamma(2) \cdot 2^2} \lambda^{2-1} e^{-\lambda/2}
= \frac{1}{4} \lambda e^{-\lambda/2},
\]
since \(\Gamma(2) = 1! = 1\).
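As a quick numerical sanity check of this identity (an optional sketch; the grid of \(\lambda\) values is arbitrary):

# Check that the chi-square(4) density equals the Gamma(shape = 2, rate = 1/2) density
lambda_grid <- seq(0.1, 20, by = 0.1)
all.equal(dchisq(lambda_grid, df = 4),
          dgamma(lambda_grid, shape = 2, rate = 0.5))   # should return TRUE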
Step 3: Posterior Distribution
Using Bayes’ theorem, the unnormalized posterior is:
\[
\pi(\lambda \mid x) \propto P(x \mid \lambda) \cdot \pi(\lambda)
= \left( \frac{\lambda^x e^{-\lambda}}{x!} \right) \cdot \left( \frac{1}{4} \lambda e^{-\lambda/2} \right)
\]
Simplifying:
\[\begin{aligned}
\pi(\lambda \mid x) &\propto\lambda^{x + 1} e^{- \frac{3}{2} \lambda}\\
&\propto\lambda^{x + 2-1} e^{- \frac{3}{2} \lambda}\\
\end{aligned}
\]
This is the kernel of a Gamma distribution:
\[
\lambda \mid x \sim \text{Gamma}(x + 2, \text{rate} = \frac{3}{2})
\]
Final Answer
The posterior distribution of \(\lambda\) , after observing \(x\) , is:
\[
\boxed{\lambda \mid x \sim \text{Gamma}(x + 2, \text{rate} = \frac{3}{2})}
\]
where \(\alpha^* = x+2\) and \(\beta^* = 3/2\); for the observed value \(x = 4\) used in part (b), \(\alpha^* = 6\).
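As an optional numerical check (not required by the question), a grid approximation of prior × likelihood can be compared against the claimed Gamma(\(x+2\), rate \(3/2\)) posterior; the grid and the value x = 4 below are illustrative choices:

# Grid check: prior x likelihood, renormalized, should match dgamma(shape = x + 2, rate = 1.5)
x <- 4
lambda <- seq(0.001, 30, length.out = 5000)
step <- lambda[2] - lambda[1]
unnorm <- dpois(x, lambda) * dchisq(lambda, df = 4)    # likelihood * prior
post_grid <- unnorm / (sum(unnorm) * step)             # renormalize on the grid
post_exact <- dgamma(lambda, shape = x + 2, rate = 1.5)
max(abs(post_grid - post_exact))                       # should be near 0 (numerical error only)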
b) Given an observed value x = 4, compute the posterior mean, mode, and 95% credible interval, both manually and using any software of your choice.
\[\text{mean} = E(\lambda \mid x)=\frac{\alpha^*}{\beta^*}=\frac{4+2}{1.5}=4\] \[\text{mode} = \frac{\alpha^* -1}{\beta^*}=\frac{6-1}{1.5}\approx 3.333\]
95% credible interval
We want to find the values \((a,b)\) such that \(P^{*}(a\leq \lambda \leq b) = 0.95\). For the equal-tailed interval these satisfy:
\[P^{*}(\lambda \leq a)=0.025 , \quad P^{*}(\lambda \leq b)=0.975\]
The values are found in the R code below, where a = ci_lower and b = ci_upper.
Solution in R
Posterior Distribution of λ
Prior: \(\lambda \sim \chi^2(4)\) which is \(Gamma(shape=2, rate=0.5)\)
Likelihood: \(X|\lambda \sim Poisson(\lambda)\)
Posterior: \(Gamma(shape = 2 + x, rate = 0.5 + 1)\)
# (b) Calculations for observed x = 4
x <- 4
post_shape <- 2 + x
post_rate <- 0.5 + 1   # = 1.5
# Posterior mean
post_mean <- post_shape / post_rate
# Posterior mode
post_mode <- (post_shape - 1) / post_rate
# 95% credible interval (equal-tailed)
ci_lower <- qgamma(0.025, shape = post_shape, rate = post_rate)
ci_upper <- qgamma(0.975, shape = post_shape, rate = post_rate)
# Print results
cat("(a) Posterior distribution: Gamma(shape =", post_shape, ", rate =", post_rate, ")\n")
(a) Posterior distribution: Gamma(shape = 6 , rate = 1.5 )
cat("    Posterior mean:", post_mean, "\n")
cat("    Posterior mode:", post_mode, "\n")
cat("    95% Credible Interval: (", ci_lower, ",", ci_upper, ")\n")
95% Credible Interval: ( 1.46793 , 7.778888 )
# (c) Visualization of prior vs posterior
library(ggplot2)
# Create sequence of λ values
lambda_vals <- seq(0, 15, length.out = 500)
# Prior density (chi-square with 4 df = Gamma(2, rate 0.5))
prior_dens <- dgamma(lambda_vals, shape = 2, rate = 0.5)
# Posterior density
post_dens <- dgamma(lambda_vals, shape = post_shape, rate = post_rate)
# Create data frame for plotting
df <- data.frame(
  lambda = rep(lambda_vals, 2),
  density = c(prior_dens, post_dens),
  distribution = rep(c("Prior (χ²4)", "Posterior"), each = length(lambda_vals))
)
# Plot
ggplot(df, aes(x = lambda, y = density, color = distribution)) +
  geom_line(linewidth = 1.2) +
  geom_vline(xintercept = post_mean, linetype = "dashed", color = "red") +
  geom_vline(xintercept = post_mode, linetype = "dotted", color = "blue") +
  annotate("rect", xmin = ci_lower, xmax = ci_upper, ymin = 0, ymax = 0.2,
           alpha = 0.2, fill = "green") +
  labs(title = "Prior and Posterior Distributions of λ",
       subtitle = "For observed x = 4 with Poisson likelihood and χ²₄ prior",
       x = "λ", y = "Density") +
  theme_minimal() +
  scale_color_manual(values = c("Prior (χ²4)" = "black", "Posterior" = "red")) +
  theme(legend.position = "bottom")
95% Credible Interval (Bayesian):
Given the observed data and prior, there is a 95% probability that the true value of \(\lambda\) lies within this interval.
95% Confidence Interval (Frequentist):
If we were to repeat the experiment many times, 95% of the intervals constructed this way would contain the true \(\lambda\) . It does not assign a probability to \(\lambda\) itself — \(\lambda\) is fixed, and the interval is random.
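To illustrate the repeated-sampling statement, here is a small simulation sketch (a hypothetical normal-mean example with known σ, chosen because its 95% interval is exact; it is not part of the question):

# Coverage of the 95% interval ybar +/- 1.96*sigma/sqrt(n) over repeated samples
set.seed(1)
mu_true <- 4; sigma <- 2; n <- 25
covered <- replicate(10000, {
  y <- rnorm(n, mu_true, sigma)
  ci <- mean(y) + c(-1, 1) * 1.96 * sigma / sqrt(n)
  ci[1] <= mu_true && mu_true <= ci[2]
})
mean(covered)   # approximately 0.95: the interval, not mu, varies from sample to sample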
Question 2: Mean Heights of Males in a Population
Derive the Normal-Normal Conjugate posterior distribution
If the data come from a normal distribution with known variance \(\sigma^2\) but unknown mean \(\mu\), and the prior on \(\mu\) is normal with self-elicited mean \(\mu_0\) and self-elicited variance \(\sigma_0^2\), then the posterior density for the mean, after seeing a sample of size \(n\) with sample mean \(\bar{y}\), is also normal. In mathematical notation, we have:
\[\begin{aligned}
y|\mu &\sim N(\mu,\sigma^2) \\
\mu &\sim N(\mu_0, \sigma_0^2)
\end{aligned}\]
the likelihood expression is given by:
\[
\begin{aligned}
P(y|\mu) = L(\mu;y) &= \prod^n_{i=1}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(y_i-\mu)^2}{2\sigma^2}}\\
&=\left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n e^{-\frac{\sum_{i=1}^n(y_i-\mu)^2}{2\sigma^2}}
\end{aligned}
\]
Deriving the normal-normal posterior
\[
\begin{aligned}
\pi^*(\mu|y) &\propto P(y|\mu)\,\pi(\mu) \\
&= \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n e^{-\frac{\sum(y_i-\mu)^2}{2\sigma^2}}\cdot\frac{1}{\sigma_0\sqrt{2\pi}}\,e^{-\frac{(\mu-\mu_0)^2}{2\sigma_0^2}}\\
&= \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n\frac{1}{\sigma_0\sqrt{2\pi}}\,e^{-\frac{\sum(y_i^2-2y_i\mu+\mu^2)}{2\sigma^2}-\frac{\mu^2-2\mu\mu_0+\mu_0^2}{2\sigma_0^2}}\\
&\propto e^{-\frac{\sum(y_i^2-2y_i\mu+\mu^2)}{2\sigma^2}-\frac{\mu^2-2\mu\mu_0+\mu_0^2}{2\sigma_0^2}}
\end{aligned}
\]
Looking at the term in the exponent, it can be simplified; terms that do not involve \(\mu\) contribute only a multiplicative constant to the posterior and are collected into the constant term below:
\[\begin{aligned}
-\frac{\sum(y_i^2-2y_i\mu+\mu^2)}{2\sigma^2}-\frac{\mu^2-2\mu\mu_0+\mu_0^2}{2\sigma_0^2}
&=\frac{-\sigma_0^2\sum y_i^2+2\mu\sigma_0^2\sum y_i-n\mu^2\sigma^2_0-\sigma^2 \mu^2+2\mu_0\sigma^2 \mu-\mu_0^2\sigma^2}{2\sigma^2\sigma_0^2}\\
&= \frac{-n\mu^2\sigma_0^2-\sigma^2\mu^2+2\mu\sigma_0^2 n\bar{y}+2\mu_0\sigma^2 \mu}{2\sigma^2\sigma_0^2}+\text{const}\\
&=\frac{-\mu^2(n\sigma_0^2+\sigma^2)+2\mu(\sigma_0^2 n\bar{y}+\mu_0\sigma^2)}{2\sigma^2\sigma_0^2}+\text{const}\\
&=-\frac{\mu^2}{2}\left(\frac{n}{\sigma^2}+\frac{1}{\sigma^2_0}\right)+\mu\left(\frac{n\bar{y}}{\sigma^2}+\frac{\mu_0}{\sigma^2_0}\right)+\text{const}\\
&=-\frac{a\mu^2}{2}+b\mu+\text{const}, \qquad a=\frac{n}{\sigma^2}+\frac{1}{\sigma^2_0},\quad b=\frac{n\bar{y}}{\sigma^2}+\frac{\mu_0}{\sigma^2_0}\\
&=-\frac{a}{2}\left(\mu^2-\frac{2b}{a}\mu\right)+\text{const}\\
&=-\frac{a}{2}\left[\mu^2-\frac{2b}{a}\mu+\frac{b^2}{a^2}-\frac{b^2}{a^2}\right]+\text{const}\\
&=-\frac{a}{2}\left(\mu-\frac{b}{a}\right)^2+\frac{b^2}{2a}+\text{const}\\
&=-\frac{1}{2}\,\frac{\left(\mu-\frac{b}{a}\right)^2}{1/a}+\text{const}
\end{aligned}\]
The last expression is the kernel of a normal distribution with
\[\mu^* =\frac{b}{a}=\frac{\frac{n\bar{y}}{\sigma^2}+\frac{\mu_0}{\sigma^2_0}}{\frac{n}{\sigma^2}+\frac{1}{\sigma^2_0}}\]
\[\sigma^{2*} = \frac{1}{a}=\frac{1}{\frac{n}{\sigma^2}+\frac{1}{\sigma^2_0}}\]
Hence for the normal-normal conjugate families, assume the prior on the unknown mean follows a normal distribution, i.e. \(\mu \sim N(\mu_0, \sigma_0^2)\) . We also assume that the data \(y_1,y_2,\cdots,y_n\) are independent and come from a normal with variance \(\sigma^2\) .
Then the posterior distribution of \(\mu\) is also normal, with mean as a weighted average of the prior mean and the sample mean. We have
\[\mu|y_1,y_2,\cdots,y_n \sim N(\mu^*, \sigma^{2*}),\]
where
\[\mu^* = \frac{\mu_0\sigma^2 + n\bar{y}\sigma^2_0}{\sigma^2 + n\sigma_0^2} \quad\text{and}\quad \sigma^{2*} = \frac{\sigma^2\sigma_0^2}{\sigma^2 + n\sigma_0^2}.\]
(b) Given parameters:
\[\begin{align*}
n &= 80,\quad \bar{y} = 182\,\text{cm} \\
\sigma &= 15\,\text{cm},\quad \mu_0 = 187\,\text{cm},\quad \sigma_0 = 10\,\text{cm}
\end{align*}\]
Posterior calculations: \[\begin{align*}
\mu_n &= \frac{10^2 \times 80 \times 182 + 15^2 \times 187}{80 \times 10^2 + 15^2} \\
&= \frac{1,456,000 + 42,075}{8,000 + 225} \\
&\approx 182.14\,\text{cm} \\[10pt]
\sigma_n^2 &= \left(\frac{80}{15^2} + \frac{1}{10^2}\right)^{-1} \\
&= \left(\frac{80}{225} + \frac{1}{100}\right)^{-1} \\
&\approx (0.3556 + 0.01)^{-1} \approx 2.74\,\text{cm}^2 \\
\sigma_n &\approx 1.65\,\text{cm}
\end{align*}\]
Posterior probability that the mean height exceeds 180 cm (this matches the R calculation below, which uses the posterior standard deviation \(\sigma_n\)):
\[
P(\mu > 180 \mid y) = 1 - \Phi\left(\frac{180 - \mu_n}{\sigma_n}\right) \approx 1 - \Phi\left(\frac{180 - 182.14}{1.65}\right) \approx 1 - \Phi(-1.29) \approx 0.9018
\]
c) Use any software of your choice to solve part (b) and include appropriate graphs
# Given values
mu0 <- 187
sigma0 <- 10
ybar <- 182
sigma <- 15
n <- 80
# Posterior parameters
(sigma_n2 <- 1 / (1 / sigma0^2 + n / sigma^2))
(mu_n <- (sigma0^2 * ybar + (sigma^2 / n) * mu0) / (sigma0^2 + sigma^2 / n))
sigma_n <- sqrt(sigma_n2)
# Posterior probability that mu exceeds 180 cm
p_gt_180 <- 1 - pnorm(180, mean = mu_n, sd = sigma_n)
p_gt_180
# Plot posterior
library(ggplot2)
x <- seq(mu_n - 4 * sigma_n, mu_n + 4 * sigma_n, length.out = 1000)
df <- data.frame(x = x, density = dnorm(x, mean = mu_n, sd = sigma_n))
ggplot(df, aes(x = x, y = density)) +
  geom_line(color = "blue") +
  geom_vline(xintercept = 180, color = "red", linetype = "dashed") +
  geom_area(data = subset(df, x > 180), aes(y = density), fill = "blue", alpha = 0.3) +
  labs(title = "Posterior Distribution of μ", x = "Height (cm)", y = "Density") +
  theme_minimal()
Question 3: Hypothesis Testing with Normal Distribution
Let \(X_1, X_2, \ldots, X_n\) be a random sample from the normal distribution \(\mathcal{N}(\mu, 64)\) . We want to test the simple null hypothesis:
\[
H_0: \mu = 80 \quad \text{vs} \quad H_1: \mu = 76
\]
using the likelihood ratio:
\[
\Lambda(x) = \frac{L(80)}{L(76)}
\]
(a) Derivation of the Likelihood Ratio
Since the variance is known (\(\sigma^2 = 64\) ), the likelihood function under a specific value of \(\mu\) is:
\[\begin{align}
L(X;\mu) &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi \cdot 64}} \exp\left( -\frac{(x_i - \mu)^2}{2 \cdot 64} \right)\\
&= \left( \frac{1}{\sqrt{2\pi \cdot 64}} \right)^n \exp\left( -\frac{1}{128} \sum_{i=1}^n (x_i - \mu)^2 \right)
\end{align}
\]
The likelihood ratio is:
\[
\begin{aligned}
\Lambda(x) &= \frac{L(X; \mu_0)}{L(X; \mu_1)} = \frac{L(80)}{L(76)} \\
&= \exp\left(-\frac{1}{2\sigma^2} \left(\sum_{i=1}^n (X_i - \mu_0)^2 - \sum_{i=1}^n (X_i - \mu_1)^2\right)\right)\\
&= \exp\left( \frac{1}{128} \left[ \sum_{i=1}^{n} (X_i - 76)^2 - \sum_{i=1}^{n} (X_i - 80)^2 \right] \right)
\end{aligned}
\]
Also note that:
\[\Lambda(x) = \exp\left(\frac{1}{2\sigma^2} \left(-2(\mu_1 - \mu_0)\sum_{i=1}^n X_i + n(\mu_1^2 - \mu_0^2)\right)\right)\]
Now simplify the difference inside the exponent:
\[
(x_i - 76)^2 - (x_i - 80)^2 = 8x_i - 624
\]
So the sum becomes:
\[\begin{aligned}
\sum_{i=1}^n \left[ (x_i - 76)^2 - (x_i - 80)^2\right]&=8\sum_{i=1}^n X_i - 624n\\
&=8n\,\frac{\sum_{i=1}^n X_i}{n} - 624n\\
&= 8n \bar{X} - 624n
\end{aligned}
\]
Thus, the likelihood ratio simplifies to:
\[\begin{aligned}
\Lambda(x) &= \exp(\frac{8n \bar{X} - 624n}{128})\\
&=\exp\left( \frac{n}{16} (\bar{X} - 78) \right)
\end{aligned}
\]
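A quick numerical check of this simplification (an optional sketch with arbitrary simulated data):

# Verify that L(80)/L(76) equals exp(n*(xbar - 78)/16)
set.seed(42)
n <- 25
x <- rnorm(n, mean = 79, sd = 8)                       # sigma^2 = 64
direct <- prod(dnorm(x, 80, 8)) / prod(dnorm(x, 76, 8))
simplified <- exp(n * (mean(x) - 78) / 16)
all.equal(direct, simplified)                           # should return TRUE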
(b) Likelihood Ratio Condition
We want to find the values in the sample space for which \(\Lambda(x) \leq k\) , for some constant \(k > 0\) . That is,
\[
\exp\left( \frac{n}{16} (\bar{X} - 78) \right) \leq k
\]
Taking natural logarithms on both sides:
\[
\frac{n}{16} (\bar{X} - 78) \leq \ln k
\quad \Rightarrow \quad
\bar{X} \leq \frac{16}{n} \ln k + 78
\]
Conclusion
The critical region for the test is:
\[
\bar{X} \leq \frac{16}{n} \ln k + 78
\]
where \(k\) is chosen to satisfy the desired significance level \(\alpha\) .
Letting \(c = 16\ln k + 78n\), the critical region becomes: \[\boxed{\sum_{i=1}^n X_i \leq c}\]
Determine n and c for \(\alpha \approx 0.05\) , \(\beta \approx 0.05\)
Under \(H_0\): \(\sum X_i \sim N(80n,\ 64n)\). Under \(H_1\): \(\sum X_i \sim N(76n,\ 64n)\).
For \(\alpha = 0.05\) : \[P\left(\sum X_i \leq c\mid\mu=80\right) = 0.05\] \[\frac{c - 80n}{8\sqrt{n}} = -1.645\]
For \(\beta = 0.05\) : \[P\left(\sum X_i > c\mid\mu=76\right) = 0.05\] \[\frac{c - 76n}{8\sqrt{n}} = 1.645\]
Solving the system: \[c = 80n - 1.645\times8\sqrt{n}\] \[c = 76n + 1.645\times8\sqrt{n}\]
Equating both expressions: \[80n - 13.16\sqrt{n} = 76n + 13.16\sqrt{n}\] \[4n = 26.32\sqrt{n}\] \[\sqrt{n} = \frac{26.32}{4} = 6.58\] \[n = (6.58)^2 \approx 43.3\]
Rounding up to the next whole observation gives \(n = 44\), and then \[c = 76(44) + 1.645\times 8\sqrt{44} \approx 3431.29\]
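As an optional check, these values of \(n\) and \(c\) can be plugged back into the two error-rate conditions:

# Verify that n = 44 and c = 3431.29 give alpha and beta close to 0.05
n <- 44; c <- 3431.29
alpha <- pnorm((c - 80 * n) / (8 * sqrt(n)))        # P(sum Xi <= c | mu = 80)
beta  <- 1 - pnorm((c - 76 * n) / (8 * sqrt(n)))    # P(sum Xi >  c | mu = 76)
c(alpha = alpha, beta = beta)                        # both roughly 0.05 (alpha slightly below, from rounding n up)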
Question 4: Moment Generating Function
Let \[M(t_1, t_2) = \mathbb{E}[e^{t_1 X + t_2 Y}] \tag{1}\] be the joint moment generating function (MGF) of two random variables \(X\) and \(Y\). When \(X\) and \(Y\) are independent, the MGF factorizes:
\[\begin{aligned}
M(t_1, t_2) &= \mathbb{E}[e^{t_1 X + t_2 Y}]\\
&= \mathbb{E}[e^{t_1 X}]\,\mathbb{E}[e^{t_2 Y}] \\
&= M(t_1,0)\,M(0,t_2)
\end{aligned}\]
we also know that \[M(t_1,0) =\int^{\infty}_{-\infty}e^{t_1x}f_X(x)\,dx, \qquad M(0,t_2) =\int^{\infty}_{-\infty}e^{t_2y}f_Y(y)\,dy\]
Now
\[\begin{aligned}
M(t_1, t_2) &= M(t_1,0)\,M(0,t_2)\\
&=\int^{\infty}_{-\infty}e^{t_1x}f_X(x)\,dx \int^{\infty}_{-\infty}e^{t_2y}f_Y(y)\,dy\\
&= \int^{\infty}_{-\infty}\int^{\infty}_{-\infty}e^{t_1x+t_2y}f_X(x)f_Y(y)\,dx\,dy\\
&= \int^{\infty}_{-\infty}\int^{\infty}_{-\infty}e^{t_1x+t_2y}f(x,y)\,dx\,dy,
\end{aligned}\]
since under independence the joint density factorizes as \(f(x,y) = f_X(x)f_Y(y)\). In general (whether or not \(X\) and \(Y\) are independent), this double-integral representation of \(M(t_1, t_2)\) follows directly from the definition of expectation, and it is the form used below.
Define the cumulant generating function:
\[
\psi(t_1, t_2) = \log M(t_1, t_2)
\]
We will show that the partial derivatives of \(\psi\) at \((t_1, t_2) = (0, 0)\) yield the mean, variance, and covariance of \(X\) and \(Y\) .
First Derivatives (Means)
\[\begin{aligned}
M(t_1, t_2) &= \int^{\infty}_{-\infty}\int^{\infty}_{-\infty}e^{t_1x+t_2y}f(x,y)\,dx\,dy,\\
\text{so}\quad
\frac{\partial}{\partial t_1}M(t_1, t_2) &= \int^{\infty}_{-\infty}\int^{\infty}_{-\infty}\frac{\partial}{\partial t_1} e^{t_1x+t_2y}f(x,y)\,dx\,dy\\
&=\int^{\infty}_{-\infty}\int^{\infty}_{-\infty}x\, e^{t_1x+t_2y}f(x,y)\,dx\,dy\\
&= \mathbb{E}[X e^{t_1 X + t_2 Y}]
\end{aligned}\]
By inspection:
\[\frac{\partial^r}{\partial t_1^r}M(t_1, t_2) = \mathbb{E}[X^r e^{t_1 X + t_2 Y}] \tag{2}\]
\[\frac{\partial^r}{\partial t_2^r}M(t_1, t_2) = \mathbb{E}[Y^r e^{t_1 X + t_2 Y}] \tag{3}\]
\[\frac{\partial^2}{\partial t_1\partial t_2}M(t_1, t_2) = \mathbb{E}[XY e^{t_1 X + t_2 Y}] \tag{4}\] hence
\[
\frac{\partial}{\partial t_1} \psi(t_1, t_2) = \frac{1}{M(t_1, t_2)} \cdot \frac{\partial }{\partial t_1}M(t_1, t_2)
= \frac{\mathbb{E}[X e^{t_1 X + t_2 Y}]}{\mathbb{E}[e^{t_1 X + t_2 Y}]}
\] \[
\frac{\partial}{\partial t_1} \psi(0, 0)
= \frac{\mathbb{E}[X e^{0 (X) + 0(Y)}]}{\mathbb{E}[e^{0(X) + 0(Y)}]}=\frac{\mathbb{E} [X]}{1} =\mathbb{E} [X]
\] In the same manner:
\[
\frac{\partial}{\partial t_2} \psi(t_1, t_2) = \frac{1}{M(t_1, t_2)} \cdot \frac{\partial }{\partial t_2}M(t_1, t_2)
= \frac{\mathbb{E}[Y e^{t_1 X + t_2 Y}]}{\mathbb{E}[e^{t_1 X + t_2 Y}]}
\]
\[
\frac{\partial}{\partial t_2} \psi(0, 0)
= \frac{\mathbb{E}[Y e^{0 (X) + 0(Y)}]}{\mathbb{E}[e^{0(X) + 0(Y)}]}=\frac{\mathbb{E} [Y]}{1} =\mathbb{E} [Y]
\]
So, the first derivatives at \((0, 0)\) are:
\[
\left. \frac{\partial \psi}{\partial t_1} \right|_{(0, 0)} = \mathbb{E}[X], \quad
\left. \frac{\partial \psi}{\partial t_2} \right|_{(0, 0)} = \mathbb{E}[Y]
\]
Second Derivatives (Variances)
Using the quotient rule, the second derivatives are as follows:
\[
\frac{\partial^2}{\partial t_1^2}\psi(t_1, t_2)
= \frac{M(t_1,t_2) \cdot \frac{\partial^2 M(t_1,t_2)}{\partial t_1^2} - \left( \frac{\partial M(t_1,t_2)}{\partial t_1} \right)^2}{M(t_1,t_2)^2}
\]
At \((0, 0)\) , this simplifies to:
\[
\frac{\partial^2}{\partial t_1^2}\psi(0, 0)
= \frac{M(0,0) \cdot \frac{\partial^2 M(0,0)}{\partial t_1^2} - \left( \frac{\partial M(0,0)}{\partial t_1} \right)^2}{M(0,0)^2}
\] \[
\frac{\partial^2}{\partial t_1^2}\psi(0, 0)
= \frac{1 \cdot \mathbb{E}[X^2] - [\mathbb{E}[X]]^2}{1^2}=\mathbb{E}[X^2] - [\mathbb{E}[X]]^2=Var(X)
\]
\[
\left. \frac{\partial^2 \psi}{\partial t_1^2} \right|_{(0, 0)} = \mathbb{E}[X^2] - (\mathbb{E}[X])^2 = \mathrm{Var}(X)
\]
Similarly:
\[
\left. \frac{\partial^2 \psi}{\partial t_2^2} \right|_{(0, 0)} = \mathrm{Var}(Y)
\]
Covariances
since
\[\begin{aligned}
\frac{\partial^2}{\partial t_1 \partial t_2}\psi(t_1, t_2)
&= \frac{M(t_1,t_2) \cdot \frac{\partial^2 M(t_1,t_2)}{\partial t_1 \partial t_2} - \frac{\partial M(t_1,t_2)}{\partial t_1} \frac{\partial M(t_1,t_2)}{\partial t_2}}{M(t_1,t_2)^2}\\
\frac{\partial^2}{\partial t_1 \partial t_2}\psi(0, 0)
&= \frac{M(0,0) \cdot \frac{\partial^2 M(0,0)}{\partial t_1 \partial t_2} - \frac{\partial M(0,0)}{\partial t_1} \frac{\partial M(0,0)}{\partial t_2}}{M(0,0)^2}\\
\left. \frac{\partial^2 \psi}{\partial t_1 \partial t_2} \right|_{(0, 0)} &= \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y] = \mathrm{Cov}(X, Y)
\end{aligned}\]
Since for the mixed partial derivative:
\[\frac{\partial^2}{\partial t_1 \partial t_2}M(t_1, t_2) =\mathbb{E}[XY e^{t_1 X + t_2 Y}]\] \[\frac{\partial^2}{\partial t_1 \partial t_2}M(0, 0) =\mathbb{E}[XY]\]
In short
\[\begin{aligned}
\left. \frac{\partial \psi}{\partial t_1} \right|_{(0, 0)} &= \mathbb{E}[X] \\
\left. \frac{\partial \psi}{\partial t_2} \right|_{(0, 0)} &= \mathbb{E}[Y] \\
\left. \frac{\partial^2 \psi}{\partial t_1^2} \right|_{(0, 0)} &= \mathrm{Var}(X) \\
\left. \frac{\partial^2 \psi}{\partial t_2^2} \right|_{(0, 0)} &= \mathrm{Var}(Y) \\
\left. \frac{\partial^2 \psi}{\partial t_1 \partial t_2} \right|_{(0, 0)} &= \mathrm{Cov}(X, Y)
\end{aligned}\]
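As a concrete illustration (not required by the question), consider a bivariate normal pair \((X, Y)\) with means \(\mu_X, \mu_Y\), variances \(\sigma_X^2, \sigma_Y^2\) and covariance \(\sigma_{XY}\). Its joint MGF is \(M(t_1,t_2)=\exp\left\{t_1\mu_X+t_2\mu_Y+\tfrac{1}{2}\left(t_1^2\sigma_X^2+2t_1t_2\sigma_{XY}+t_2^2\sigma_Y^2\right)\right\}\), so the cumulant generating function is the quadratic
\[
\psi(t_1, t_2) = t_1\mu_X + t_2\mu_Y + \tfrac{1}{2}\left(t_1^2\sigma_X^2 + 2t_1 t_2\sigma_{XY} + t_2^2\sigma_Y^2\right),
\]
and differentiating gives exactly the quantities above:
\[
\left.\frac{\partial \psi}{\partial t_1}\right|_{(0,0)} = \mu_X,\qquad
\left.\frac{\partial^2 \psi}{\partial t_1^2}\right|_{(0,0)} = \sigma_X^2,\qquad
\left.\frac{\partial^2 \psi}{\partial t_1 \partial t_2}\right|_{(0,0)} = \sigma_{XY} = \mathrm{Cov}(X, Y).
\]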
Question (5)
\[P(X=x) = \frac{e^{-\lambda} \lambda^x}{x!}=\frac{e^{-50} 50^x}{x!}\]
Finding the maximum likelihood estimate of \(\lambda\)
\[\mathcal L(\lambda;x)=\frac{e^{-\lambda} \lambda^x}{x!}\]
Finding the log-likelihood
\[
\begin{aligned}
\log\mathcal L(\lambda;x) &= \log\left(\frac{e^{-\lambda} \lambda^x}{x!}\right)\\
&= \log(e^{-\lambda} \lambda^x)-\log(x!)\\
&= -\lambda+ x\log\lambda-\log(x!)
\end{aligned}
\]
Differentiating with respect to \(\lambda\):
\[
\frac{d}{d\lambda}\log\mathcal L(\lambda;x) = -1 + \frac{x}{\lambda}
\]
Maximize by setting the derivative equal to zero and solving:
\[
\begin{aligned}
\frac{d}{d\lambda}\log\mathcal L(\lambda;x) &= 0\\
-1 + \frac{x}{\lambda} &= 0\\
\frac{x}{\lambda} &= 1\\
\hat{\lambda} &= x = 50
\end{aligned}
\]
so the maximum likelihood estimate of \(\lambda\) equals the observed count \(x\), here 50.
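As an optional numerical check (a sketch; it takes the observed count to be x = 50 as above), the same maximum can be found by minimizing the negative log-likelihood numerically:

# Numerical check of the Poisson MLE for a single observation
x <- 50
negloglik <- function(lambda) -dpois(x, lambda, log = TRUE)
fit <- optimize(negloglik, interval = c(1, 200))
fit$minimum   # approximately 50, i.e. lambda_hat = x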
Question 6: Unbiasedness and Bias
# Set seed for reproducibility
set.seed(123)
# Parameters
mu <- 21
sigma <- 3   # since variance = 9
n1 <- 30
n2 <- 100
simulations <- 10000
# Function to simulate and analyze
simulate_estimates <- function(n) {
  samples <- matrix(rnorm(simulations * n, mean = mu, sd = sigma), nrow = simulations)
  xbar <- rowMeans(samples)
  xbar_biased <- 0.9 * xbar
  data.frame(xbar = xbar, xbar_biased = xbar_biased)
}
# Simulate for both sample sizes
results_n30 <- simulate_estimates(n1)
results_n100 <- simulate_estimates(n2)
# Summary statistics function
summarize_results <- function(data) {
  list(
    mean_xbar = mean(data$xbar),
    bias_xbar = mean(data$xbar) - mu,
    var_xbar = var(data$xbar),
    mean_biased = mean(data$xbar_biased),
    bias_biased = mean(data$xbar_biased) - mu,
    var_biased = var(data$xbar_biased)
  )
}
# Get summaries
summary_n30 <- summarize_results(results_n30)
summary_n100 <- summarize_results(results_n100)
print(summary_n30)
$mean_xbar
[1] 21.0056
$bias_xbar
[1] 0.005595708
$var_xbar
[1] 0.3047891
$mean_biased
[1] 18.90504
$bias_biased
[1] -2.094964
$var_biased
[1] 0.2468792
print(summary_n100)
$mean_xbar
[1] 20.99441
$bias_xbar
[1] -0.005587189
$var_xbar
[1] 0.0889089
$mean_biased
[1] 18.89497
$bias_biased
[1] -2.105028
$var_biased
[1] 0.07201621
# Plot histograms
library(ggplot2)
library(tidyr)
results_n30_long <- pivot_longer(results_n30, cols = everything(), names_to = "Estimator", values_to = "Value")
results_n100_long <- pivot_longer(results_n100, cols = everything(), names_to = "Estimator", values_to = "Value")
# Plot for n = 30
ggplot(results_n30_long, aes(x = Value, fill = Estimator)) +
  geom_density(alpha = 0.5) +
  geom_vline(xintercept = mu, color = "black", linetype = "dashed") +
  labs(title = "Sampling Distributions for n = 30", x = "Estimator Value", y = "Density") +
  theme_minimal()
# Plot for n = 100
ggplot(results_n100_long, aes(x = Value, fill = Estimator)) +
  geom_density(alpha = 0.5) +
  geom_vline(xintercept = mu, color = "black", linetype = "dashed") +
  labs(title = "Sampling Distributions for n = 100", x = "Estimator Value", y = "Density") +
  theme_minimal()
Which estimator appears unbiased for n=30?
The sample mean (X̄) is unbiased (bias ≈ 0)
The 0.9X̄ estimator is biased (bias ≈ -2.1)
Which has lower variance for n=30 and for n=100?
The biased estimator has lower variance (0.247 vs 0.305 for n = 30; 0.072 vs 0.089 for n = 100)
Which estimator would you prefer?
For lowest variance alone: the biased estimator 0.9X̄
For unbiasedness: the sample mean X̄
Overall, X̄ is preferred here: using MSE = Variance + Bias², the bias² of 0.9X̄ (≈ 2.1² ≈ 4.4) dwarfs its variance advantage, giving MSE ≈ 4.6 versus ≈ 0.30 for X̄ at n = 30
Effects of larger sample size (n=100):
Both distributions become narrower (lower variance)
Shapes become more concentrated around their means
Variance decreases proportionally to 1/n
The bias of 0.9X̄ (≈ −2.1) does not shrink with n, so relative to the decreasing variance it becomes more consequential, not less
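To make the MSE comparison explicit, the empirical MSEs can be computed from the results objects created above (a short sketch; the mse helper is my own):

# Empirical MSE = mean squared deviation from the true mu, for each estimator and sample size
mse <- function(est, truth = mu) mean((est - truth)^2)
data.frame(
  n = c(30, 100),
  mse_xbar   = c(mse(results_n30$xbar),        mse(results_n100$xbar)),
  mse_biased = c(mse(results_n30$xbar_biased), mse(results_n100$xbar_biased))
)
# The bias^2 term of 0.9*xbar (about (0.1*21)^2 = 4.41) dominates, so xbar has the smaller MSE at both n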
Question 7: Estimation and Expectation
Part (A)
Let \((G_1, G_2, \dots, G_m)\) and \((H_1, H_2, \dots, H_n)\) be independent random variables from two continuous distribution functions. Find the Uniformly Minimum Variance Unbiased Estimators (UMVUE) of:
(i) \(E(GH)\)
Since \(G\) and \(H\) are independent: \[E(GH) = E(G)E(H)\]
The sample means are unbiased estimators: \[\bar{G} = \frac{1}{m}\sum_{i=1}^m G_i, \quad \bar{H} = \frac{1}{n}\sum_{j=1}^n H_j\]
thus
\[E(\bar{G}) = \mu_G, \quad E(\bar{H}) = \mu_H\] Proof
\[\begin{aligned}
E(\bar{X}) &= E\left(\frac{\sum^n_{i=1} X_i}{n}\right)\\
&= \sum^n_{i=1}\frac{E(X_i)}{n}\\
&= \frac{n\mu}{n}\\
&= \mu
\end{aligned}\]
Because \(\bar{G}\) and \(\bar{H}\) are independent: \[E(\bar{G}\bar{H}) = E(\bar{G})E(\bar{H}) = E(G)E(H)\]
Thus, the UMVUE for \(E(GH)\) is: \[\color{dodgerblue}{\boxed{\bar{G}\bar{H} = \left(\frac{1}{m}\sum_{i=1}^m G_i\right)\left(\frac{1}{n}\sum_{j=1}^n H_j\right)}}\]
(ii) \(\text{Var}(G + H)\)
Since \(G\) and \(H\) are independent: \[\text{Var}(G + H) = \text{Var}(G) + \text{Var}(H)\]
The sample variances are unbiased estimators: \[S_G^2 = \frac{1}{m-1}\sum_{i=1}^m (G_i - \bar{G})^2, \quad S_H^2 = \frac{1}{n-1}\sum_{j=1}^n (H_j - \bar{H})^2\] thus
\[E(S_G^2) = \sigma^2_G, \quad E(S_H^2) = \sigma^2_H\]
Proof
\[E(S^2) = \sigma^2\]
\[\begin{aligned}
E\bigg[\frac{1}{n-1} \sum_{i=1}^n (X_i - \overline{X})^2 \bigg] &= \frac{E\bigg[\sum_{i=1}^n X_i^2 - 2\overline{X}\,{\color{tomato}{\sum_{i=1}^n X_i}} + n\overline{X}^2\bigg]}{n-1} \\
&= \frac{\sum_{i=1}^n E(X_i^2) + E \bigg[ - 2 \overline{X}\left( {\color{tomato}{n\overline{X}}} \right) + n \overline{X}^2 \bigg]}{n-1} \\
&= \frac{\sum_{i=1}^n E(X_i^2) + E \bigg[ -n \overline{X}^2\bigg]}{n-1}\\
&= \frac{\sum_{i=1}^n E(X_i^2) - n E \big[ \overline{X}^2\big]}{n-1} .\\
&= \frac{\sum_{i=1}^n {\color{dodgerblue}{ E \big[ X_i^2 \big]}} - n {\color{tomato}{E \big[ \overline{X}^2 \big]}}}{n-1} \\
&= \frac{\sum_{i=1}^n \bigg( {\color{dodgerblue}{ \mbox{Var} \big[ X_i \big] + \left( E \big[ X_i \big] \right)^2 }} \bigg) - n \left( {\color{tomato}{\mbox{Var} \big[ \overline{X} \big] + \left( E \big[ \overline{X} \big]\right)^2}} \right)}{n-1} \\
&= \frac{\sum_{i=1}^n {\color{dodgerblue}{ \left( \sigma^2 + \mu^2 \right)}} - n \left( \mbox{Var} \big[ \overline{X} \big] + \left( E \big[ \overline{X} \big]\right)^2 \right)}{n-1} \\
&= \frac{\sum_{i=1}^n\left( \sigma^2 + \mu^2 \right) - n \left( {\color{tomato}{\frac{\sigma^2}{n}}} + \left( {\color{tomato}{ \mu }}\right)^2 \right)}{n-1} \\
&= \frac{{\color{dodgerblue}{n(\sigma^2 + \mu^2)}} - \sigma^2 - n\mu^2}{n-1} \\
&= \frac{(n-1) \sigma^2}{n-1}.\\
&= \sigma^2
\end{aligned}\]
Thus, the UMVUE for \(\text{Var}(G + H)\) is: \[\color{dodgerblue}{\boxed{S_G^2 + S_H^2 = \frac{1}{m-1}\sum_{i=1}^m (G_i - \bar{G})^2 + \frac{1}{n-1}\sum_{j=1}^n (H_j - \bar{H})^2}}\]
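A small Monte Carlo sketch (hypothetical exponential and gamma populations, chosen only for illustration) checking that both proposed estimators are unbiased:

# Check E(Gbar*Hbar) = E(G)E(H) and E(S_G^2 + S_H^2) = Var(G) + Var(H) by simulation
set.seed(7)
m <- 15; n <- 20
sims <- replicate(20000, {
  g <- rexp(m, rate = 0.5)               # E(G) = 2, Var(G) = 4
  h <- rgamma(n, shape = 3, rate = 1)    # E(H) = 3, Var(H) = 3
  c(mean(g) * mean(h), var(g) + var(h))
})
rowMeans(sims)   # approximately c(6, 7) = c(E(G)E(H), Var(G) + Var(H))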
Part (B)
Suppose \(X \sim N(\mu, \sigma^2)\) . Prove that: \[E[(X - \mu)^2] = \sigma^2\]
By definition of variance: \[\text{Var}(X) = E[(X - \mu)^2]\]
For \(X \sim N(\mu, \sigma^2)\) : \[\text{Var}(X) = \sigma^2\]
Therefore: \[\color{dodgerblue}{\boxed{E[(X - \mu)^2] = \sigma^2}}\]
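For completeness, a direct computation from the normal density (a standard substitution argument, included here as a sketch):
\[
\begin{aligned}
E[(X - \mu)^2] &= \int_{-\infty}^{\infty} (x-\mu)^2\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx \\
&= \sigma^2 \int_{-\infty}^{\infty} z^2\, \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz \qquad \left(z = \tfrac{x-\mu}{\sigma},\ dx = \sigma\,dz\right)\\
&= \sigma^2 \cdot 1 = \sigma^2,
\end{aligned}
\]
since \(\int_{-\infty}^{\infty} z^2 \phi(z)\,dz = 1\) is the variance of a standard normal random variable (which follows from integration by parts).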