Introduction to Statistical Theory for Health Researchers

Author
Affiliation

Bongani Ncube(3002164)

University Of the Witwatersrand (School of Public Health)

Published

May 5, 2025

Keywords

Conjugate Priors, Bayesian Statistics, Maximum likelihood, UMVUE, Powerful test

Question 1: Poisson Distribution with Chi-Square Prior

a) What is the posterior distribution of λ, after observing a single value of x?

Step 1: Likelihood Function

If \(X \sim \text{Poisson}(\lambda)\), then the likelihood of observing \(x\) is:

\[ P(X = x \mid \lambda) = \frac{\lambda^x e^{-\lambda}}{x!} \]

Step 2: Prior Distribution

Given \(\lambda \sim \chi^2_4\), we can express this as a Gamma distribution with shape \(\alpha = 2\) and scale \(\theta = 2\) (equivalently, rate \(\beta = 1/2\)):

\[ \lambda \sim \text{Gamma}(\alpha = 2, \theta = 2) \]

Recall that a Chi-square distribution with \(k\) degrees of freedom is equivalent to:

\[ \chi^2_k \equiv \text{Gamma}\left(\frac{k}{2}, 2\right) \]

So the prior density is:

\[ \pi(\lambda) = \frac{1}{\Gamma(2) \cdot 2^2} \lambda^{2-1} e^{-\lambda/2} = \frac{1}{4} \lambda e^{-\lambda/2} \] since \(\Gamma(2)=1\)

Step 3: Posterior Distribution

Using Bayes’ theorem, the unnormalized posterior is:

\[ \pi(\lambda \mid x) \propto P(x \mid \lambda) \cdot \pi(\lambda) = \left( \frac{\lambda^x e^{-\lambda}}{x!} \right) \cdot \left( \frac{1}{4} \lambda e^{-\lambda/2} \right) \]

Simplifying:

\[\begin{aligned} \pi(\lambda \mid x) &\propto\lambda^{x + 1} e^{- \frac{3}{2} \lambda}\\ &\propto\lambda^{x + 2-1} e^{- \frac{3}{2} \lambda}\\ \end{aligned} \]

This is the kernel of a Gamma distribution:

\[ \lambda \mid x \sim \text{Gamma}(x + 2, \text{rate} = \frac{3}{2}) \]

Final Answer

The posterior distribution of \(\lambda\), after observing \(x\), is:

\[ \boxed{\lambda \mid x \sim \text{Gamma}(x + 2, \text{rate} = \frac{3}{2})} \]

where \[\alpha^* = x+2 , \quad \beta^*=\frac{3}{2}\] (for the observed value \(x = 4\) used in part (b), \(\alpha^* = 6\)).
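As a quick numerical sketch (not required by the question), the analytic Gamma\((x+2, \text{rate}=3/2)\) posterior can be compared against a grid approximation obtained by normalizing likelihood times prior; the value x = 4 below is the observation used in part (b).

# Sketch: grid check of the conjugate result for x = 4
x <- 4
lambda_grid <- seq(0.001, 30, by = 0.001)
unnorm <- dpois(x, lambda_grid) * dchisq(lambda_grid, df = 4)  # likelihood x prior
grid_post <- unnorm / (sum(unnorm) * 0.001)                    # normalize on the grid
analytic <- dgamma(lambda_grid, shape = x + 2, rate = 3/2)
max(abs(grid_post - analytic))  # small: the grid and analytic posteriors agree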

b) Given an observed value x = 4, compute the posterior mean, mode, and 95% credible interval, both manually and using any software of your choice.

\[\text{mean} = E(\lambda \mid x)=\frac{\alpha^*}{\beta^*}=\frac{4+2}{1.5}=4\] \[\text{mode} = \frac{\alpha^* -1}{\beta^*}=\frac{6-1}{1.5}=3.333\]

95% credible interval

We want to find the values \((a,b)\) such that \(P^{*}(a\leq \lambda \leq b) =0.95\). For the equal-tailed interval, the endpoints satisfy:

\[P^{*}(\lambda \leq a)=0.025 , \quad P^{*}(\lambda \leq b)=0.975\] These quantiles are computed in the R code below, where a = ci_lower and b = ci_upper.

Solution in R

Posterior Distribution of λ

  1. Prior: \(\lambda \sim \chi^2(4)\) which is \(Gamma(shape=2, rate=0.5)\)
  2. Likelihood: \(X|\lambda \sim Poisson(\lambda)\)
  3. Posterior: \(Gamma(shape = 2 + x, rate = 0.5 + 1)\)
# (b) Calculations for observed x=4
x <- 4
post_shape <- 2 + x
post_rate <- 0.5 + 1  # = 1.5

# Posterior mean
post_mean <- post_shape / post_rate

# Posterior mode
post_mode <- (post_shape - 1) / post_rate

# 95% Credible Interval
ci_lower <- qgamma(0.025, shape=post_shape, rate=post_rate)
ci_upper <- qgamma(0.975, shape=post_shape, rate=post_rate)

# Print results
cat("(a) Posterior distribution: Gamma(shape =", post_shape, ", rate =", post_rate, ")\n")
(a) Posterior distribution: Gamma(shape = 6 , rate = 1.5 )
cat("(b) For x = 4:\n")
(b) For x = 4:
cat("  Posterior mean:", post_mean, "\n")
  Posterior mean: 4 
cat("  Posterior mode:", post_mode, "\n")
  Posterior mode: 3.333333 
cat("  95% Credible Interval: (", ci_lower, ",", ci_upper, ")\n")
  95% Credible Interval: ( 1.46793 , 7.778888 )
# (c) Visualization of prior vs posterior
library(ggplot2)

# Create sequence of λ values
lambda_vals <- seq(0, 15, length.out=500)

# Prior density (Chi-square with 4 df = Gamma(2, 0.5))
prior_dens <- dgamma(lambda_vals, shape=2, rate=0.5)

# Posterior density
post_dens <- dgamma(lambda_vals, shape=post_shape, rate=post_rate)

# Create data frame for plotting
df <- data.frame(
  lambda = rep(lambda_vals, 2),
  density = c(prior_dens, post_dens),
  distribution = rep(c("Prior (χ²4)", "Posterior"), each=length(lambda_vals))
)

# Plot
ggplot(df, aes(x=lambda, y=density, color=distribution)) +
  geom_line(linewidth=1.2) +
  geom_vline(xintercept=post_mean, linetype="dashed", color="red") +
  geom_vline(xintercept=post_mode, linetype="dotted", color="blue") +
  annotate("rect", xmin=ci_lower, xmax=ci_upper, ymin=0, ymax=0.2, 
           alpha=0.2, fill="green") +
  labs(title="Prior and Posterior Distributions of λ",
       subtitle="For observed x=4 with Poisson likelihood and χ²₄ prior",
       x="λ", y="Density") +
  theme_minimal() +
  scale_color_manual(values=c("Prior (χ²4)"="black", "Posterior"="red")) +
  theme(legend.position="bottom")

Interpretation

95% Credible Interval (Bayesian):

  • Given the observed data and the prior, there is a 95% probability that the true value of \(\lambda\) lies within this interval.

95% Confidence Interval (Frequentist):

  • If we were to repeat the experiment many times, 95% of the intervals constructed this way would contain the true \(\lambda\). It does not assign a probability to \(\lambda\) itself — \(\lambda\) is fixed, and the interval is random.

Question 2: Mean Heights of Males in a Population

Derive the Normal-Normal Conjugate posterior distribution

If the data come from a normal distribution with known variance \(\sigma^2\) but unknown mean \(\mu\), and the prior on \(\mu\) is a normal distribution with self-elicited mean \(\mu_0\) and self-elicited variance \(\sigma_0^2\), then the posterior density for the mean, after seeing a sample of size \(n\) with sample mean \(\bar{y}\), is also normal. In mathematical notation, we have:

\[\begin{aligned} y|\mu &\sim N(\mu,\sigma^2) \\ \mu &\sim N(\mu_0, \sigma_0^2) \end{aligned}\]

the likelihood expression is given by:

\[ \begin{aligned} P(y|\mu) &=L(y|\mu)\\ &= \prod^n_{i=1}\frac{1}{\sigma\sqrt{2\pi}}e^{\frac{-(y_i-\mu)^2}{2\sigma^2}}\\ &=\left(\frac{1}{\sigma\sqrt{2\pi}}\right)^ne^{\frac{-\sum(y_i-\mu)^2}{2\sigma^2}} \end{aligned} \]

Deriving the normal-normal posterior

\[ \begin{aligned} \pi^*(\mu|y) &\propto P(y|\mu)\pi(\mu) \\ &= \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^ne^{\frac{-\sum(y_i-\mu)^2}{2\sigma^2}}\cdot\left(\frac{1}{\sigma_0\sqrt{2\pi}}\right)e^{\frac{-(\mu-\mu_0)^2}{2\sigma_0^2}}\\ &= \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n\left(\frac{1}{\sigma_0\sqrt{2\pi}}\right)e^{\frac{-\sum(y_i-\mu)^2}{2\sigma^2}+\frac{-(\mu-\mu_0)^2}{2\sigma_0^2}}\\ &= \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n\left(\frac{1}{\sigma_0\sqrt{2\pi}}\right)e^{\frac{-\sum(y_i^2-2y_i\mu+\mu^2)}{2\sigma^2}+\frac{-(\mu^2-2\mu\mu_0+\mu_0^2)}{2\sigma_0^2}}\\ &\propto e^{\frac{-\sum(y_i^2-2y_i\mu+\mu^2)}{2\sigma^2}+\frac{-(\mu^2-2\mu\mu_0+\mu_0^2)}{2\sigma_0^2}} \end{aligned} \]

  • Looking at the term in the exponent, we see that it can be simplified by putting both pieces over the common denominator \(2\sigma^2\sigma_0^2\):
\[\begin{aligned} \frac{-\sum(y_i^2-2y_i\mu+\mu^2)}{2\sigma^2}+\frac{-(\mu^2-2 \mu\mu_0+\mu_0^2)}{2\sigma_0^2}&=\frac{-\sigma_0^2\sum y_i^2+2\mu\sigma_0^2\sum y_i-n\mu^2\sigma^2_0-\sigma^2 \mu^2+2\mu_0\sigma^2 \mu-\mu_0^2\sigma^2}{2\sigma^2\sigma_0^2} \end{aligned}\]

Dropping \(-\sigma_0^2\sum y_i^2\) and \(-\mu_0^2\sigma^2\), which do not involve \(\mu\) and only contribute a multiplicative constant to the posterior, and writing \(\sum y_i = n\bar{y}\), the \(\mu\)-dependent part is:

\[\begin{aligned} &\phantom{=}\ \frac{-n\mu^2\sigma_0^2-\sigma^2\mu^2+2\mu\sigma_0^2 n\bar{y}+2\mu_0\sigma^2 \mu}{2\sigma^2\sigma_0^2}\\ &=\frac{-\mu^2(n\sigma_0^2+\sigma^2)+2\mu(\sigma_0^2 n\bar{y}+\mu_0\sigma^2)}{2\sigma^2\sigma_0^2}\\ &=\frac{-\mu^2(n\sigma_0^2+\sigma^2)}{2\sigma^2\sigma_0^2}+\frac{2\mu(\sigma_0^2 n\bar{y}+\mu_0\sigma^2)}{2\sigma^2\sigma_0^2}\\ &=-\frac{\mu^2}{2}\left(\frac{n}{\sigma^2}+\frac{1}{\sigma^2_0}\right)+\mu\left(\frac{n\bar{y}}{\sigma^2}+\frac{\mu_0}{\sigma^2_0}\right)\\ &=-\frac{a\mu^2}{2}+b\mu, \qquad\text{where } a=\frac{n}{\sigma^2}+\frac{1}{\sigma^2_0},\quad b=\frac{n\bar{y}}{\sigma^2}+\frac{\mu_0}{\sigma^2_0}\\ &=\frac{-a}{2}\left[\mu^2-\frac{2b}{a}\mu\right]\\ &=\frac{-a}{2}\left[\mu^2-\frac{2b}{a}\mu+\frac{b^2}{a^2}-\frac{b^2}{a^2}\right]\\ &=\frac{-a}{2}\left(\mu-\frac{b}{a}\right)^2+\frac{b^2}{2a}\\ &\propto \frac{-a}{2}\left(\mu-\frac{b}{a}\right)^2\\ &= \frac{-1}{2}\,\frac{\left(\mu-\frac{b}{a}\right)^2}{\frac{1}{a}} \end{aligned}\]

The last expression is the kernel of a normal distribution with

\[\mu^* =\frac{b}{a}=\frac{\frac{n\bar{y}}{\sigma^2}+\frac{\mu_0}{\sigma^2_0}}{\frac{n}{\sigma^2}+\frac{1}{\sigma^2_0}}\]

\[\sigma^{2*} = \frac{1}{a}=\frac{1}{\frac{n}{\sigma^2}+\frac{1}{\sigma^2_0}}\]
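As a quick numerical sketch (with illustrative values of \(\mu_0\), \(\sigma_0\), \(\sigma\), and \(n\) that are not from the question), the quantities \(a\) and \(b\) above give the posterior mean \(b/a\) and variance \(1/a\), which can be checked against a grid approximation of the posterior:

# Sketch: numerical check of the normal-normal update (illustrative values assumed)
set.seed(1)
mu0 <- 0; sigma0 <- 2          # assumed prior mean and sd
sigma <- 3; n <- 25            # assumed known sd and sample size
y <- rnorm(n, mean = 1, sd = sigma)
ybar <- mean(y)

a <- n / sigma^2 + 1 / sigma0^2
b <- n * ybar / sigma^2 + mu0 / sigma0^2
c(post_mean = b / a, post_var = 1 / a)

# Grid approximation of the posterior for comparison
mu_grid <- seq(-5, 5, by = 0.001)
unnorm  <- sapply(mu_grid, function(m) prod(dnorm(y, m, sigma))) * dnorm(mu_grid, mu0, sigma0)
c(grid_mean = sum(mu_grid * unnorm) / sum(unnorm))  # agrees with b / a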

Hence for the normal-normal conjugate families, assume the prior on the unknown mean follows a normal distribution, i.e. \(\mu \sim N(\mu_0, \sigma_0^2)\). We also assume that the data \(y_1,y_2,\cdots,y_n\) are independent and come from a normal with variance \(\sigma^2\).

Then the posterior distribution of \(\mu\) is also normal, with mean as a weighted average of the prior mean and the sample mean. We have

\[\mu|y_1,y_2,\cdots,y_n \sim N(\mu^*, \sigma^{2*}),\]

where

\[\mu^* = \frac{\mu_0\sigma^2 + n\bar{y}\sigma^2_0}{\sigma^2 + n\sigma_0^2} \text{ and } \sigma^{2*} = \frac{\sigma^2\sigma_0^2}{\sigma^2 + n\sigma_0^2}.\]

b) Given parameters:

\[\begin{align*} n &= 80,\quad \bar{y} = 182\,\text{cm} \\ \sigma &= 15\,\text{cm},\quad \mu_0 = 187\,\text{cm},\quad \sigma_0 = 10\,\text{cm} \end{align*}\]

Posterior calculations: \[\begin{align*} \mu_n &= \frac{10^2 \times 80 \times 182 + 15^2 \times 187}{80 \times 10^2 + 15^2} \\ &= \frac{1,456,000 + 42,075}{8,000 + 225} \\ &\approx 182.14\,\text{cm} \\[10pt] \sigma_n^2 &= \left(\frac{80}{15^2} + \frac{1}{10^2}\right)^{-1} \\ &= \left(\frac{80}{225} + \frac{1}{100}\right)^{-1} \\ &\approx (0.3556 + 0.01)^{-1} \approx 2.74\,\text{cm}^2 \\ \sigma_n &\approx 1.65\,\text{cm} \end{align*}\]

Posterior probability that the mean height exceeds 180 cm:

\[ P(\mu > 180 \mid y) = 1 - \Phi\left(\frac{180 - \mu_n}{\sigma_n}\right) \approx 1 - \Phi\left(\frac{-2.14}{1.65}\right) \approx 0.9018 \]

c) Use any software of your choice to solve part (b) and include appropriate graphs

# Given values
mu0 <- 187
sigma0 <- 10
ybar <- 182
sigma <- 15
n <- 80

# Posterior parameters
(sigma_n2 <- 1 / (1 / sigma0^2 + n / sigma^2))
[1] 2.735562
(mu_n <- (sigma0^2 * ybar + (sigma^2 / n) * mu0) / (sigma0^2 + sigma^2 / n))
[1] 182.1368
sigma_n <- sqrt(sigma_n2)

# Posterior probability that the mean height exceeds 180 cm
p_gt_180 <- 1 - pnorm(180, mean = mu_n, sd = sigma_n)
p_gt_180
[1] 0.9018078
# Plot posterior
library(ggplot2)
x <- seq(mu_n - 4*sigma_n, mu_n + 4*sigma_n, length.out = 1000)
df <- data.frame(x = x, density = dnorm(x, mean = mu_n, sd = sigma_n))

ggplot(df, aes(x = x, y = density)) +
  geom_line(color = "blue") +
  geom_vline(xintercept = 180, color = "red", linetype = "dashed") +
  geom_area(data = subset(df, x > 180), aes(y = density), fill = "blue", alpha = 0.3) +
  labs(title = "Posterior Distribution of μ", x = "Height (cm)", y = "Density") +
  theme_minimal()

Question 3: Hypothesis Testing with Normal Distribution

Let \(X_1, X_2, \ldots, X_n\) be a random sample from the normal distribution \(\mathcal{N}(\mu, 64)\). We want to test the simple null hypothesis:

\[ H_0: \mu = 80 \quad \text{vs} \quad H_1: \mu = 76 \]

using the likelihood ratio:

\[ \Lambda(x) = \frac{L(80)}{L(76)} \]

(a) Derivation of the Likelihood Ratio

Since the variance is known (\(\sigma^2 = 64\)), the likelihood function under a specific value of \(\mu\) is:

\[\begin{align} L(X;\mu) &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi \cdot 64}} \exp\left( -\frac{(x_i - \mu)^2}{2 \cdot 64} \right)\\ &= \left( \frac{1}{\sqrt{2\pi \cdot 64}} \right)^n \exp\left( -\frac{1}{128} \sum_{i=1}^n (x_i - \mu)^2 \right) \end{align} \]

The likelihood ratio is:

\[ \begin{aligned} \Lambda(x) &= \frac{L(X; \mu_0)}{L(X; \mu_1)} = \frac{L(80)}{L(76)} \\ &= \exp\left(-\frac{1}{2\sigma^2} \left(\sum_{i=1}^n (X_i - \mu_0)^2 - \sum_{i=1}^n (X_i - \mu_1)^2\right)\right)\\ &= \exp\left( \frac{1}{128} \left[ \sum_{i=1}^{n} (X_i - 76)^2 - \sum_{i=1}^{n} (X_i - 80)^2 \right] \right) \end{aligned} \]

Also note that:

\[\Lambda(x) = \exp\left(\frac{1}{2\sigma^2} \left(-2(\mu_1 - \mu_0)\sum_{i=1}^n X_i + n(\mu_1^2 - \mu_0^2)\right)\right)\]

Now simplify the difference inside the exponent:

\[ (x_i - 76)^2 - (x_i - 80)^2 = 8x_i - 624 \]

So the sum becomes:

\[\begin{aligned} \sum_{i=1}^n \left[ (x_i - 76)^2 - (x_i - 80)^2\right]&=8\sum_{i=1}^n X_i - 624n\\ &=8n\,\frac{\sum_{i=1}^n X_i}{n} - 624n\\ &= 8n \bar{X} - 624n \end{aligned} \]

Thus, the likelihood ratio simplifies to:

\[\begin{aligned} \Lambda(x) &= \exp(\frac{8n \bar{X} - 624n}{128})\\ &=\exp\left( \frac{n}{16} (\bar{X} - 78) \right) \end{aligned} \]
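As an illustrative check (a sketch using simulated data, not data from the question), the simplified form can be compared against the likelihood ratio computed directly from normal densities:

# Sketch: verify the simplified likelihood ratio on simulated data
set.seed(42)
n <- 25
x <- rnorm(n, mean = 78, sd = 8)          # sigma^2 = 64
lr_direct     <- prod(dnorm(x, 80, 8)) / prod(dnorm(x, 76, 8))
lr_simplified <- exp(n / 16 * (mean(x) - 78))
c(direct = lr_direct, simplified = lr_simplified)   # the two agree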

(b) Likelihood Ratio Condition

We want to find the values in the sample space for which \(\Lambda(x) \leq k\), for some constant \(k > 0\). That is,

\[ \exp\left( \frac{n}{16} (\bar{X} - 78) \right) \leq k \]

Taking natural logarithms on both sides:

\[ \frac{n}{16} (\bar{X} - 78) \leq \ln k \quad \Rightarrow \quad \bar{X} \leq \frac{16}{n} \ln k + 78 \]

Conclusion

The critical region for the test is:

\[ \bar{X} \leq \frac{16}{n} \ln k + 78 \]

where \(k\) is chosen to satisfy the desired significance level \(\alpha\).

Let \(c = 16\ln k + 78n\); multiplying the inequality above through by \(n\) gives the critical region in terms of the sample total: \[\boxed{\sum_{i=1}^n X_i \leq c}\]

  1. Determine n and c for \(\alpha \approx 0.05\), \(\beta \approx 0.05\)

Under \(H_0\): \(\sum X_i \sim N(80n,\ 64n)\). Under \(H_1\): \(\sum X_i \sim N(76n,\ 64n)\).

For \(\alpha = 0.05\): \[P\left(\sum X_i \leq c\mid\mu=80\right) = 0.05\] \[\frac{c - 80n}{8\sqrt{n}} = -1.645\]

For \(\beta = 0.05\): \[P\left(\sum X_i > c\mid\mu=76\right) = 0.05\] \[\frac{c - 76n}{8\sqrt{n}} = 1.645\]

Solving the system: \[c = 80n - 1.645\times8\sqrt{n}\] \[c = 76n + 1.645\times8\sqrt{n}\]

Equating both expressions: \[80n - 13.16\sqrt{n} = 76n + 13.16\sqrt{n}\] \[4n = 26.32\sqrt{n}\] \[\sqrt{n} = \frac{26.32}{4} = 6.58\] \[n = (6.58)^2 \approx 43.3\]

Rounding up to \(n = 44\):

\[c = 76(44) + 1.645\times 8\times\sqrt{44}\approx 3431.29\]
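As a quick sketch, the attained error rates for the rounded solution can be checked in R; because \(n\) is rounded up from 43.3, the attained \(\alpha\) comes out slightly below 0.05:

# Sketch: check the attained error rates for the rounded solution n = 44
n <- 44
c_val <- 76 * n + 1.645 * 8 * sqrt(n)                           # ~ 3431.29
alpha <- pnorm(c_val, mean = 80 * n, sd = 8 * sqrt(n))          # P(reject | mu = 80)
beta  <- 1 - pnorm(c_val, mean = 76 * n, sd = 8 * sqrt(n))      # P(accept H0 | mu = 76)
c(n = n, c = c_val, alpha = alpha, beta = beta)                 # both close to 0.05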

Question 4: Moment Generating Function

Let \[M(t_1, t_2) = \mathbb{E}[e^{t_1 X + t_2 Y}] \tag{1}\] be the joint moment generating function (MGF) of two random variables \(X\) and \(Y\). If \(X\) and \(Y\) are independent, we can show that:

\[\begin{aligned} M(t_1, t_2) &= \mathbb{E}[e^{t_1 X + t_2 Y}]\\ &= \mathbb{E}[e^{t_1 X}e^{t_2 Y}]\\ &= \mathbb{E}[e^{t_1 X}]\,\mathbb{E}[e^{t_2 Y}] \quad \text{(by independence)} \\ &= M(t_1,0)\,M(0,t_2) \end{aligned}\]

we also know that \[M(t_1,0) =\int^{\infty}_{-\infty}e^{t_1x}f(x)dx\]

\[M(0,t_2) =\int^{\infty}_{-\infty}e^{t_2y}f(y)dy\]

Now

\[\begin{aligned} M(t_1, t_2) &= M(t_1,0)\,M(0,t_2)\\ &=\int^{\infty}_{-\infty}e^{t_1x}f(x)dx \int^{\infty}_{-\infty}e^{t_2y}f(y)dy\\ &= \int^{\infty}_{-\infty}\int^{\infty}_{-\infty}e^{t_1x+t_2y}f(x)f(y)dxdy\\ &= \int^{\infty}_{-\infty}\int^{\infty}_{-\infty}e^{t_1x+t_2y}f(x,y)dxdy \end{aligned}\]

where the last equality uses \(f(x)f(y) = f(x,y)\), which holds because \(X\) and \(Y\) are independent.

Define the cumulant generating function:

\[ \psi(t_1, t_2) = \log M(t_1, t_2) \]

We will show that the partial derivatives of \(\psi\) at \((t_1, t_2) = (0, 0)\) yield the mean, variance, and covariance of \(X\) and \(Y\).

First Derivatives (Means)

\[\begin{aligned} M(t_1, t_2) &= \int^{\infty}_{-\infty}\int^{\infty}_{-\infty}e^{t_1x+t_2y}f(x,y)\,dx\,dy\\ \text{then}\quad \frac{\partial}{\partial t_1}M(t_1, t_2) &= \int^{\infty}_{-\infty}\int^{\infty}_{-\infty}\frac{\partial}{\partial t_1} e^{t_1x+t_2y}f(x,y)\,dx\,dy\\ &=\int^{\infty}_{-\infty}\int^{\infty}_{-\infty}x\, e^{t_1x+t_2y}f(x,y)\,dx\,dy\\ &= \mathbb{E}[X e^{t_1 X + t_2 Y}] \end{aligned}\]

By inspection, repeating the differentiation gives:

\[\frac{\partial^r}{\partial t_1^r}M(t_1, t_2) = \mathbb{E}[X^r e^{t_1 X + t_2 Y}] \tag{2}\]

\[\frac{\partial^r}{\partial t_2^r}M(t_1, t_2) = \mathbb{E}[Y^r e^{t_1 X + t_2 Y}] \tag{3}\]

\[\frac{\partial^2}{\partial t_1\partial t_2}M(t_1, t_2) = \mathbb{E}[XY e^{t_1 X + t_2 Y}] \tag{4}\] hence

\[ \frac{\partial}{\partial t_1} \psi(t_1, t_2) = \frac{1}{M(t_1, t_2)} \cdot \frac{\partial }{\partial t_1}M(t_1, t_2) = \frac{\mathbb{E}[X e^{t_1 X + t_2 Y}]}{\mathbb{E}[e^{t_1 X + t_2 Y}]} \] \[ \frac{\partial}{\partial t_1} \psi(0, 0) = \frac{\mathbb{E}[X e^{0 (X) + 0(Y)}]}{\mathbb{E}[e^{0(X) + 0(Y)}]}=\frac{\mathbb{E} [X]}{1} =\mathbb{E} [X] \] In the same manner:

\[ \frac{\partial}{\partial t_2} \psi(t_1, t_2) = \frac{1}{M(t_1, t_2)} \cdot \frac{\partial }{\partial t_2}M(t_1, t_2) = \frac{\mathbb{E}[Y e^{t_1 X + t_2 Y}]}{\mathbb{E}[e^{t_1 X + t_2 Y}]} \]

\[ \frac{\partial}{\partial t_2} \psi(0, 0) = \frac{\mathbb{E}[Y e^{0 (X) + 0(Y)}]}{\mathbb{E}[e^{0(X) + 0(Y)}]}=\frac{\mathbb{E} [Y]}{1} =\mathbb{E} [Y] \]

So, the first derivatives at \((0, 0)\) are:

\[ \left. \frac{\partial \psi}{\partial t_1} \right|_{(0, 0)} = \mathbb{E}[X], \quad \left. \frac{\partial \psi}{\partial t_2} \right|_{(0, 0)} = \mathbb{E}[Y] \]

Second Derivatives (Variances)

Using Quotient Rule we can find the second derivatives as below

\[ \frac{\partial^2}{\partial t_1^2}\psi(t_1, t_2) = \frac{M(t_1,t_2) \cdot \frac{\partial^2 M(t_1,t_2)}{\partial t_1^2} - \left( \frac{\partial M(t_1,t_2)}{\partial t_1} \right)^2}{M(t_1,t_2)^2} \]

At \((0, 0)\), this simplifies to:

\[ \frac{\partial^2}{\partial t_1^2}\psi(0, 0) = \frac{M(0,0) \cdot \frac{\partial^2 M(0,0)}{\partial t_1^2} - \left( \frac{\partial M(0,0)}{\partial t_1} \right)^2}{M(0,0)^2} \] \[ \frac{\partial^2}{\partial t_1^2}\psi(0, 0) = \frac{1 \cdot \mathbb{E}[X^2] - [\mathbb{E}[X]]^2}{1^2}=\mathbb{E}[X^2] - [\mathbb{E}[X]]^2=Var(X) \]

\[ \left. \frac{\partial^2 \psi}{\partial t_1^2} \right|_{(0, 0)} = \mathbb{E}[X^2] - (\mathbb{E}[X])^2 = \mathrm{Var}(X) \]

Similarly:

\[ \left. \frac{\partial^2 \psi}{\partial t_2^2} \right|_{(0, 0)} = \mathrm{Var}(Y) \]

Covariances

since

\[\begin{aligned} \frac{\partial^2}{\partial t_1 \partial t_2}\psi(t_1, t_2) &= \frac{M(t_1,t_2) \cdot \frac{\partial^2 M(t_1,t_2)}{\partial t_1 \partial t_2} - \frac{\partial M(t_1,t_2)}{\partial t_1} \frac{\partial M(t_1,t_2)}{\partial t_2}}{M(t_1,t_2)^2}\\ \frac{\partial^2}{\partial t_1 \partial t_2}\psi(0, 0) &= \frac{M(0,0) \cdot \frac{\partial^2 M(0,0)}{\partial t_1 \partial t_2} - \frac{\partial M(0,0)}{\partial t_1} \frac{\partial M(0,0)}{\partial t_2}}{M(0,0)^2}\\ \left. \frac{\partial^2 \psi}{\partial t_1 \partial t_2} \right|_{(0, 0)} &= \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y] = \mathrm{Cov}(X, Y) \end{aligned}\]

Since for the mixed partial derivative:

\[\frac{\partial^2}{\partial t_1 \partial t_2}M(t_1, t_2) =\mathbb{E}[XY e^{t_1 X + t_2 Y}]\] \[\frac{\partial^2}{\partial t_1 \partial t_2}M(0, 0) =\mathbb{E}[XY]\]

In short

\[\begin{aligned} \left. \frac{\partial \psi}{\partial t_1} \right|_{(0, 0)} &= \mathbb{E}[X] \\ \left. \frac{\partial \psi}{\partial t_2} \right|_{(0, 0)} &= \mathbb{E}[Y] \\ \left. \frac{\partial^2 \psi}{\partial t_1^2} \right|_{(0, 0)} &= \mathrm{Var}(X) \\ \left. \frac{\partial^2 \psi}{\partial t_2^2} \right|_{(0, 0)} &= \mathrm{Var}(Y) \\ \left. \frac{\partial^2 \psi}{\partial t_1 \partial t_2} \right|_{(0, 0)} &= \mathrm{Cov}(X, Y) \end{aligned}\]
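As a numerical illustration (a sketch using an assumed bivariate normal example, for which the cumulant generating function has the closed form \(\psi(t_1,t_2)=t_1\mu_X + t_2\mu_Y + \tfrac{1}{2}(t_1^2\sigma_X^2 + 2t_1 t_2\sigma_{XY} + t_2^2\sigma_Y^2)\)), finite differences of \(\psi\) at the origin recover the means, variances, and covariance:

# Sketch: finite-difference check of the cumulant generating function identities
# for an assumed bivariate normal with means (1, 2), variances (4, 9), covariance 1.5
muX <- 1; muY <- 2; s2X <- 4; s2Y <- 9; sXY <- 1.5
psi <- function(t1, t2) t1 * muX + t2 * muY + 0.5 * (t1^2 * s2X + 2 * t1 * t2 * sXY + t2^2 * s2Y)

h <- 1e-4
c(
  mean_X = (psi(h, 0) - psi(-h, 0)) / (2 * h),                        # ~ 1
  mean_Y = (psi(0, h) - psi(0, -h)) / (2 * h),                        # ~ 2
  var_X  = (psi(h, 0) - 2 * psi(0, 0) + psi(-h, 0)) / h^2,            # ~ 4
  var_Y  = (psi(0, h) - 2 * psi(0, 0) + psi(0, -h)) / h^2,            # ~ 9
  cov_XY = (psi(h, h) - psi(h, -h) - psi(-h, h) + psi(-h, -h)) / (4 * h^2)  # ~ 1.5
)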

Question 5: Maximum Likelihood Estimation

\[P(X=x) = \frac{e^{-\lambda} \lambda^x}{x!}=\frac{e^{-50} 50^x}{x!}\]

Finding the maximum likelihood estimate

\[\mathcal L(\lambda;x)=\frac{e^{-\lambda} \lambda^x}{x!}\]

Finding the log likelihood

\[ \begin{aligned} \log\mathcal L(\lambda;x) &= \log\left(\frac{e^{-\lambda} \lambda^x}{x!}\right)\\ &= \log(e^{-\lambda} \lambda^x)-\log(x!)\\ &= -\lambda+ x\log\lambda-\log(x!) \end{aligned} \]

Differentiating with respect to \(\lambda\):

\[ \frac{d}{d\lambda}\log\mathcal L(\lambda;x) = -1 + \frac{x}{\lambda} \]

Maximize by setting the derivative equal to zero and solving for \(\lambda\):

\[ \begin{aligned} \frac{d}{d\lambda}\log\mathcal L(\lambda;x) &= 0\\ -1 + \frac{x}{\lambda} &= 0\\ \frac{x}{\lambda} &= 1\\ \hat{\lambda} &= x \end{aligned} \]

so the maximum likelihood estimate of \(\lambda\) equals the observed count \(x\); for the stated value of 50, \(\hat{\lambda} = 50\).
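As a quick numerical sketch (assuming a single observed count of x = 50, consistent with the value above), the log-likelihood can be maximized directly to confirm that the MLE equals the observation:

# Sketch: numerical maximization of the Poisson log-likelihood for x = 50
x <- 50
loglik <- function(lambda) dpois(x, lambda, log = TRUE)
optimize(loglik, interval = c(1, 200), maximum = TRUE)$maximum   # ~ 50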

Question 6: Unbiasedness and Bias

# Set seed for reproducibility
set.seed(123)

# Parameters
mu <- 21
sigma <- 3  # since variance = 9
n1 <- 30
n2 <- 100
simulations <- 10000

# Function to simulate and analyze
simulate_estimates <- function(n) {
  samples <- matrix(rnorm(simulations * n, mean = mu, sd = sigma), nrow = simulations)
  xbar <- rowMeans(samples)
  xbar_biased <- 0.9 * xbar
  
  data.frame(xbar = xbar, xbar_biased = xbar_biased)
}

# Simulate for both sample sizes
results_n30 <- simulate_estimates(n1)
results_n100 <- simulate_estimates(n2)

# Summary statistics function
summarize_results <- function(data) {
  list(
    mean_xbar = mean(data$xbar),
    bias_xbar = mean(data$xbar) - mu,
    var_xbar = var(data$xbar),
    
    mean_biased = mean(data$xbar_biased),
    bias_biased = mean(data$xbar_biased) - mu,
    var_biased = var(data$xbar_biased)
  )
}

# Get summaries
summary_n30 <- summarize_results(results_n30)
summary_n100 <- summarize_results(results_n100)

print(summary_n30)
$mean_xbar
[1] 21.0056

$bias_xbar
[1] 0.005595708

$var_xbar
[1] 0.3047891

$mean_biased
[1] 18.90504

$bias_biased
[1] -2.094964

$var_biased
[1] 0.2468792
print(summary_n100)
$mean_xbar
[1] 20.99441

$bias_xbar
[1] -0.005587189

$var_xbar
[1] 0.0889089

$mean_biased
[1] 18.89497

$bias_biased
[1] -2.105028

$var_biased
[1] 0.07201621
# Plot histograms
library(ggplot2)
library(tidyr)

results_n30_long <- pivot_longer(results_n30, cols = everything(), names_to = "Estimator", values_to = "Value")
results_n100_long <- pivot_longer(results_n100, cols = everything(), names_to = "Estimator", values_to = "Value")

# Plot for n = 30
ggplot(results_n30_long, aes(x = Value, fill = Estimator)) +
  geom_density(alpha = 0.5) +
  geom_vline(xintercept = mu, color = "black", linetype = "dashed") +
  labs(title = "Sampling Distributions for n = 30", x = "Estimator Value", y = "Density") +
  theme_minimal()

# Plot for n = 100
ggplot(results_n100_long, aes(x = Value, fill = Estimator)) +
  geom_density(alpha = 0.5) +
  geom_vline(xintercept = mu, color = "black", linetype = "dashed") +
  labs(title = "Sampling Distributions for n = 100", x = "Estimator Value", y = "Density") +
  theme_minimal()

Comments

Which estimator appears unbiased for n=30?

  • The sample mean (X̄) is unbiased (bias ≈ 0)
  • The 0.9X̄ estimator is biased (bias ≈ -2.1)

Which has lower variance for n=30 and for n=100?

The biased estimator \(0.9\bar{X}\) has the lower variance (0.247 vs 0.305 for n = 30; 0.072 vs 0.089 for n = 100)

Which estimator would you prefer?

For precise estimation: Biased estimator (lower variance)

For accurate estimation: Sample mean (unbiased)

Trade-off depends on application (MSE = Variance + Bias^2)
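As a small follow-up sketch (reusing the simulation objects defined above), the empirical MSE of each estimator can be compared directly; for these parameter values the large squared bias of \(0.9\bar{X}\) outweighs its variance advantage:

# Sketch: compare MSE = variance + bias^2 for both estimators (n = 30 and n = 100)
mse <- function(est, truth = mu) mean((est - truth)^2)
c(
  mse_xbar_n30    = mse(results_n30$xbar),
  mse_biased_n30  = mse(results_n30$xbar_biased),
  mse_xbar_n100   = mse(results_n100$xbar),
  mse_biased_n100 = mse(results_n100$xbar_biased)
)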

Effects of larger sample size (n=100):

  • Both sampling distributions become narrower (lower variance)
  • Each distribution becomes more concentrated around its own mean
  • The variance of each estimator decreases proportionally to 1/n
  • The bias of 0.9X̄ (≈ -0.1μ = -2.1) does not shrink as n grows, so only the variance improves; the estimator remains biased

Question 7: Estimation and Expectation

Part (A)

Let \((G_1, G_2, \dots, G_m)\) and \((H_1, H_2, \dots, H_n)\) be independent random variables from two continuous distribution functions. Find the Uniformly Minimum Variance Unbiased Estimators (UMVUE) of:

(i) \(E(GH)\)

Since \(G\) and \(H\) are independent: \[E(GH) = E(G)E(H)\]

The sample means are unbiased estimators: \[\bar{G} = \frac{1}{m}\sum_{i=1}^m G_i, \quad \bar{H} = \frac{1}{n}\sum_{j=1}^n H_j\]

thus

\[E(\bar{G}) = \mu_G, \quad E(\bar{H}) = \mu_H\]

Proof

\[\begin{aligned} E(\bar{X}) &= E\left(\frac{\sum^n_{i=1} X_i}{n}\right)\\ &= \frac{1}{n}\sum^n_{i=1}E( X_i)\\ &= \frac{n\mu}{n}\\ &= \mu \end{aligned}\]

Because \(\bar{G}\) and \(\bar{H}\) are independent: \[E(\bar{G}\bar{H}) = E(\bar{G})E(\bar{H}) = E(G)E(H)\]

Thus, the UMVUE for \(E(GH)\) is: \[\color{dodgerblue}{\boxed{\bar{G}\bar{H} = \left(\frac{1}{m}\sum_{i=1}^m G_i\right)\left(\frac{1}{n}\sum_{j=1}^n H_j\right)}}\]

(ii) \(\text{Var}(G + H)\)

Since \(G\) and \(H\) are independent: \[\text{Var}(G + H) = \text{Var}(G) + \text{Var}(H)\]

The sample variances are unbiased estimators: \[S_G^2 = \frac{1}{m-1}\sum_{i=1}^m (G_i - \bar{G})^2, \quad S_H^2 = \frac{1}{n-1}\sum_{j=1}^n (H_j - \bar{H})^2\] thus

\[E(S_G^2) = \sigma^2_G, \quad E(S_H^2) = \sigma^2_H\]

Proof

\[E(S^2) = \sigma^2\]

\[\begin{aligned} E\bigg[\frac{1}{n-1} \sum_{i=1}^n (X_i - \overline{X})^2 \bigg] &= \frac{\sum_{i=1}^n E(X_i^2) + E \bigg[ - 2 \overline{X}\left( {\color{tomato}{n\overline{X}}} \right) + n \overline{X}^2 \bigg]}{n-1} \\ &= \frac{\sum_{i=1}^n E(X_i^2) + E \bigg[ -n \overline{X}^2\bigg]}{n-1}\\ &= \frac{\sum_{i=1}^n E(X_i^2) - n E \big[ \overline{X}^2\big]}{n-1} .\\ &= \frac{\sum_{i=1}^n {\color{dodgerblue}{ E \big[ X_i^2 \big]}} - n {\color{tomato}{E \big[ \overline{X}^2 \big]}}}{n-1} \\ &= \frac{\sum_{i=1}^n \bigg( {\color{dodgerblue}{ \mbox{Var} \big[ X_i \big] + \left( E \big[ X_i \big] \right)^2 }} \bigg) - n \left( {\color{tomato}{\mbox{Var} \big[ \overline{X} \big] + \left( E \big[ \overline{X} \big]\right)^2}} \right)}{n-1} \\ &= \frac{\sum_{i=1}^n {\color{dodgerblue}{ \left( \sigma^2 + \mu^2 \right)}} - n \left( \mbox{Var} \big[ \overline{X} \big] + \left( E \big[ \overline{X} \big]\right)^2 \right)}{n-1} \\ &= \frac{\sum_{i=1}^n\left( \sigma^2 + \mu^2 \right) - n \left( {\color{tomato}{\frac{\sigma^2}{n}}} + \left( {\color{tomato}{ \mu }}\right)^2 \right)}{n-1} \\ &= \frac{{\color{dodgerblue}{n(\sigma^2 + \mu^2)}} - \sigma^2 - n\mu^2}{n-1} \\ &= \frac{(n-1) \sigma^2}{n-1}.\\ &= \sigma^2 \end{aligned}\]

Thus, the UMVUE for \(\text{Var}(G + H)\) is: \[\color{dodgerblue}{\boxed{S_G^2 + S_H^2 = \frac{1}{m-1}\sum_{i=1}^m (G_i - \bar{G})^2 + \frac{1}{n-1}\sum_{j=1}^n (H_j - \bar{H})^2}}\]
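As an illustrative simulation sketch (with assumed distributions: G exponential with mean 2 and H uniform on (0, 4), which are not from the question), the two estimators can be checked for unbiasedness by averaging over many replications:

# Sketch: simulation check that G_bar * H_bar and S_G^2 + S_H^2 are unbiased
set.seed(2025)
m <- 12; n <- 20; reps <- 20000
est_prod <- est_var <- numeric(reps)
for (r in seq_len(reps)) {
  G <- rexp(m, rate = 0.5)      # E(G) = 2,  Var(G) = 4
  H <- runif(n, 0, 4)           # E(H) = 2,  Var(H) = 16/12
  est_prod[r] <- mean(G) * mean(H)
  est_var[r]  <- var(G) + var(H)
}
c(mean_prod = mean(est_prod), target_prod = 2 * 2,
  mean_var  = mean(est_var),  target_var  = 4 + 16/12)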

Part (B)

Suppose \(X \sim N(\mu, \sigma^2)\). Prove that: \[E[(X - \mu)^2] = \sigma^2\]

By definition of variance: \[\text{Var}(X) = E[(X - \mu)^2]\]

For \(X \sim N(\mu, \sigma^2)\): \[\text{Var}(X) = \sigma^2\]

Therefore: \[\color{dodgerblue}{\boxed{E[(X - \mu)^2] = \sigma^2}}\]
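A one-line simulation sketch (with assumed values \(\mu = 5\), \(\sigma = 2\), chosen only for illustration) confirms the identity numerically:

# Sketch: E[(X - mu)^2] = sigma^2, checked by simulation for mu = 5, sigma = 2
set.seed(7)
x <- rnorm(1e6, mean = 5, sd = 2)
mean((x - 5)^2)   # ~ 4 = sigma^2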