Maximum Likelihood Estimator, Hypothesis Testing, Moment Generating Functions, Bias
The likelihood function:
\[ L(\lambda \mid X) = \frac{\lambda^X e^{-\lambda}}{X!} \]
The prior distribution for the parameter \(\lambda\) is assumed to follow a chi-square distribution with 4 degrees of freedom. The probability density function (PDF) is given by:
\[ p( \lambda) = \frac{\lambda^{\frac{\nu}{2} - 1} e^{-\frac{\lambda}{2}}}{2^{\nu/2} \Gamma(\nu / 2)} \]
with \(\nu = 4\)
\[ p(\lambda) = \frac{\lambda^{1} e^{-\lambda / 2}}{4} \]
Using Bayes’ Theorem:
\[ P( \lambda | X) \propto L( \lambda) \cdot P( \lambda). \]
Substituting the likelihood and prior:
\[ p(\lambda | x) \propto \left(\frac{\lambda e^{-\lambda / 2}}{4}\right) \cdot \left(\frac{\lambda^x e^{-\lambda}}{x!}\right). \]
Simplify: \[ p(\lambda | x) \propto \lambda^{x + 1} e^{-\frac{3\lambda}{2}}. \]
This corresponds to a Gamma distribution:
where \(\alpha - 1 = x + 1\), so \(\alpha = x + 2\), and \(\beta = \frac{3}{2}\):
\[ \lambda | x \sim \text{Gamma}(\alpha = x + 2, \beta = \frac{3}{2}). \]
Given \(x = 4\), the posterior parameters are: - Shape: \(\alpha = 4 + 2 = 6\), - Rate: \(\beta = \frac{3}{2}\).
\[ \mathbb{E}[\lambda] = \frac{\alpha}{\beta} = \frac{6}{3/2} = 4 \]
\[ \text{Mode} = \frac{\alpha - 1}{\beta} = \frac{6 - 1}{3/2} = \frac{10}{3} \approx 3.33 \]
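A quick numerical check of these two summaries in R, using the posterior shape and rate derived above:
# Posterior Gamma(shape = 6, rate = 3/2) summaries
shape <- 6
rate <- 3/2
shape / rate          # posterior mean: 4
(shape - 1) / rate    # posterior mode: about 3.33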
The 95% credible interval, computed in R:
qgamma(c(0.025, 0.975), shape = 6, rate = 3/2)
[1] 1.467930 7.778888
Given the prior belief and the observed data, there is a 95% posterior probability that \(\lambda\) lies within the interval (1.47, 7.78).
Given \(Y \sim N(\mu, \sigma^2)\), with a prior \(\mu \sim N(\mu_0, \sigma_0^2)\), the posterior distribution of \(\mu \mid Y\) is derived as follows:
\[ P(\mu) = \frac{1}{\sqrt{2\pi \sigma_0^2}} \exp\left(-\frac{(\mu - \mu_0)^2}{2\sigma_0^2}\right) \]
\[ L(\mu) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right) \]
Simplifying the joint likelihood (keeping only terms that involve \(\mu\)): \[ L(\mu) \propto \exp\left(-\frac{n}{2\sigma^2} (\bar{x} - \mu)^2\right), \]
where, \(\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i\)
Using Bayes’ theorem: \[ P(\mu \mid x) \propto L(\mu) \cdot P(\mu) \]
Substitute the prior and simplified likelihood: \[ P(\mu \mid x) \propto \exp\left(-\frac{1}{2\sigma_0^2} (\mu - \mu_0)^2\right) \cdot \exp\left(-\frac{n}{2\sigma^2} (\bar{x} - \mu)^2\right) \]
Combine the exponents: \[ P(\mu \mid x) \propto \exp\left(-\frac{1}{2} \left[ \frac{1}{\sigma_0^2} (\mu - \mu_0)^2 + \frac{n}{\sigma^2} (\bar{x} - \mu)^2 \right]\right) \]
Expand and combine terms: \[ \frac{1}{\sigma_0^2} (\mu^2 - 2\mu\mu_0 + \mu_0^2) + \frac{n}{\sigma^2} (\bar{x}^2 - 2\mu\bar{x} + \mu^2) \]
Group terms: \[ \left( \frac{1}{\sigma_0^2} + \frac{n}{\sigma^2} \right) \mu^2 - 2\mu \left( \frac{\mu_0}{\sigma_0^2} + \frac{n\bar{x}}{\sigma^2} \right) + C, \] where C collects constants independent of \(\mu\).
The posterior mean and variance are: \[ \mu_{\text{post}} = \frac{\frac{\mu_0}{\sigma_0^2} + \frac{n\bar{x}}{\sigma^2}}{\frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}}, \]
\[ \sigma_{\text{post}}^2 = \frac{1}{\frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}}. \]
Compute \(P(Y > 180)\)
Given:
Sample size: \(n = 80\)
Sample mean: \(\bar{Y} = 182\)
Known variance: \(\sigma^2 = 225\)
Prior mean: \(\mu_0 = 187\)
Prior variance: \(\sigma_0^2 = 100\)
The posterior mean is:
\[ \mu_{\text{post}} = \frac{\frac{187}{100} + \frac{80 \cdot 182}{225}}{\frac{1}{100} + \frac{80}{225}} = 182.137 \]
The posterior variance is:
\[ \sigma_{\text{post}}^2 = \frac{1}{\frac{1}{100} + \frac{80}{225}} = 2.736 \]
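These two values can be reproduced directly from the conjugate-update formulas above (a quick check; the variable names below are illustrative):
# Reproduce the posterior mean and variance from the conjugate-update formulas
mu_prior <- 187; var_prior <- 100        # prior mean and variance
ybar <- 182; var_y <- 225; n_obs <- 80   # sample mean, known variance, sample size
post_var <- 1 / (1 / var_prior + n_obs / var_y)
post_mean <- post_var * (mu_prior / var_prior + n_obs * ybar / var_y)
post_mean   # approximately 182.137
post_var    # approximately 2.736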
Predictive variance: the variance of a new observation combines the sampling variance of \(Y\) given \(\mu\) with the posterior variance of \(\mu\):
\[ \text{Var}(Y_{\text{new}}) = \sigma^2 + \sigma_{\text{post}}^2 = 225 + 2.736 = 227.736. \]
The posterior predictive distribution of \(Y_{\text{new}}\) is:
\[ Y_{\text{new}} \sim \mathcal{N}(182.137, 227.736). \]
Probability Calculation:
\[ P(Y_{\text{new}} > 180) = 1 - P(Y_{\text{new}} \leq 180). \]
Standardization: standardizing the value gives
\[ Z = \frac{180 - 182.137}{\sqrt{227.736}} \approx -0.1416. \]
From the standard normal table, \(\Phi(-0.14) \approx 0.4443\), so the probability for the new observation is:
\[ P(Y_{\text{new}} > 180) = 1 - 0.4443 = 0.5557. \]
# Parameters
mu_post <- 182.137
sigma_post <- sqrt(227.736)
y_value <- 180

# Probability Calculation
p_greater_180 <- 1 - pnorm(y_value, mean = mu_post, sd = sigma_post)
print(p_greater_180)
[1] 0.5563053
The normal-normal conjugate distribution
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# Define parameters
prior_mean <- 187
prior_var <- 100
prior_sd <- sqrt(prior_var)

sample_mean <- 182
known_var <- 225
n <- 80
likelihood_var <- known_var / n
likelihood_sd <- sqrt(likelihood_var)

posterior_mean <- 182.137
posterior_var <- 227.736
posterior_sd <- sqrt(posterior_var)

x <- seq(170, 200, length.out = 500)

prior_density <- dnorm(x, mean = prior_mean, sd = prior_sd)
likelihood_density <- dnorm(x, mean = sample_mean, sd = likelihood_sd)
posterior_density <- dnorm(x, mean = posterior_mean, sd = posterior_sd)

df <- data.frame(
  x = rep(x, 3),
  density = c(prior_density, likelihood_density, posterior_density),
  Distribution = rep(c("Prior (N(187, 100))", "Likelihood (N(182, 225))",
                       "Posterior (N(182.137, 227.736))"), each = length(x))
)

ggplot(df, aes(x = x, y = density, color = Distribution, linetype = Distribution)) +
  geom_line(linewidth = 1) +
  scale_color_manual(values = c("Prior (N(187, 100))" = "blue",
                                "Likelihood (N(182, 225))" = "red",
                                "Posterior (N(182.137, 227.736))" = "green")) +
  labs(x = "Parameter (μ)",
       y = "Density",
       title = "Normal Normal Conjugate") +
  theme_gray() +
  theme(legend.position = "bottom",
        legend.title = element_blank(),
        legend.background = element_rect(fill = "white", color = "black"))
The Likelihood function
\[ L(\mu) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi \cdot 64}} \exp\left(-\frac{(X_i - \mu)^2}{2 \cdot 64}\right) = \left(\frac{1}{\sqrt{2\pi \cdot 64}}\right)^n \exp\left(-\frac{1}{128}\sum_{i=1}^n (X_i - \mu)^2\right). \]
The likelihood function under \(H_0: \mu = 80\) is: \[ L(80) = \left(\frac{1}{\sqrt{2\pi \cdot 64}}\right)^n \exp\left(-\frac{1}{128} \sum_{i=1}^n (X_i - 80)^2\right). \]
The likelihood function under \(H_1: \mu = 76\) is:
\[ L(76) = \left(\frac{1}{\sqrt{2\pi \cdot 64}}\right)^n \exp\left(-\frac{1}{128} \sum_{i=1}^n (X_i - 76)^2\right). \]
Likelihood Ratio
The likelihood ratio is:
\[ \frac{L(80)}{L(76)} = \frac{(128\pi)^{-n/2} \exp\left(-\frac{1}{128} \sum_{i=1}^n (X_i - 80)^2 \right)}{(128\pi)^{-n/2} \exp\left(-\frac{1}{128} \sum_{i=1}^n (X_i - 76)^2 \right)}, \]
\[ \frac{L(80)}{L(76)} = \frac{\exp\left(-\frac{1}{128}\sum_{i=1}^n (X_i - 80)^2\right)}{\exp\left(-\frac{1}{128}\sum_{i=1}^n (X_i - 76)^2\right)} = \exp\left(-\frac{1}{128} \left[\sum_{i=1}^n (X_i - 80)^2 - \sum_{i=1}^n (X_i - 76)^2 \right]\right). \]
Simplify the exponent:
\[ (X_i - 80)^2 - (X_i - 76)^2 = (X_i^2 - 160X_i + 6400) - (X_i^2 - 152X_i + 5776) = -160X_i + 152X_i + 6400 - 5776 = -8X_i + 624. \]
\[ \sum_{i=1}^n \left[ (X_i - 80)^2 - (X_i - 76)^2 \right] = \sum_{i=1}^n (-8 X_i + 624) = -8 \sum_{i=1}^n X_i + 624n = -8n \bar{X} + 624n, \]
where
\[ \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i. \]
Thus:
\[ \Lambda = \exp\left(-\frac{1}{128}(-8n \bar{X} + 624n)\right) = \exp\left(\frac{8n \bar{X} - 624n}{128}\right) = \exp\left(n \frac{\bar{X} - 78}{16}\right). \]
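The exponent simplification can also be sanity-checked numerically for an arbitrary sample (an illustrative check, not part of the derivation):
# Check that sum((x - 80)^2) - sum((x - 76)^2) equals -8 * sum(x) + 624 * n
set.seed(1)
x_check <- rnorm(10, mean = 78, sd = 8)   # arbitrary sample
lhs <- sum((x_check - 80)^2) - sum((x_check - 76)^2)
rhs <- -8 * sum(x_check) + 624 * length(x_check)
all.equal(lhs, rhs)   # TRUE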
The critical region is defined by:
\[ \Lambda \leq k : \exp\left(\frac{n}{16} (\bar{X} - 78)\right) \leq k. \]
Taking the natural logarithm:
\[ \frac{n}{16} (\bar{X} - 78) \leq \log(k). \]
Simplifying:
\[ \bar{X} - 78 \leq \frac{16}{n} \log(k) \implies \bar{X} \leq 78 + \frac{16}{n} \log(k). \]
The Type I error rate is: \[ \alpha = P\left(\bar{X} \leq c \mid \mu = 80\right). \]
Since
\[ \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) = N\left(80, \frac{64}{n}\right), \]
\[ Z = \frac{\bar{X} - 80}{\sqrt{64/n}} = \frac{\bar{X} - 80}{8/\sqrt{n}} \sim N(0, 1). \]
The probability:
\[ P\left(\bar{X} \leq c \mid \mu = 80\right) = P\left(\frac{\bar{X} - 80}{8/\sqrt{n}} \leq \frac{c - 80}{8/\sqrt{n}}\right) = \Phi\left(\frac{c - 80}{8/\sqrt{n}}\right) \leq 0.05. \]
Solving for \(c:\)
\[ \frac{c - 80}{8/\sqrt{n}} \leq \Phi^{-1}(0.05) = -1.645 \implies c - 80 \leq -1.645 \cdot \frac{8}{\sqrt{n}} \implies c \leq 80 - \frac{13.16}{\sqrt{n}}. \]
The Type II error rate is: \[ \beta = P\left(\bar{X} > c \mid \mu = 76\right) = 1 - P\left(\bar{X} \leq c \mid \mu = 76\right). \]
Under \(H_1\),
\[ \bar{X} \sim N\left(76, \frac{64}{n}\right). \]
Thus:
\[ P\left(\bar{X} \leq c \mid \mu = 76\right) = P\left(\frac{\bar{X} - 76}{8/\sqrt{n}} \leq \frac{c - 76}{8/\sqrt{n}}\right) = \Phi\left(\frac{c - 76}{8/\sqrt{n}}\right). \]
Substituting into \(\beta\):
\[ \beta = 1 - \Phi\left(\frac{c - 76}{8/\sqrt{n}}\right) \leq 0.05 \implies \Phi\left(\frac{c - 76}{8/\sqrt{n}}\right) \geq 0.95. \]
Solving for \(c\):
\[ \frac{c - 76}{8/\sqrt{n}} \geq \Phi^{-1}(0.95) = 1.645 \implies c - 76 \geq 1.645 \cdot \frac{8}{\sqrt{n}} \implies c \geq 76 + \frac{13.16}{\sqrt{n}}. \]
Combining the inequalities:
\[ 80 - \frac{13.16}{\sqrt{n}} \geq c \geq 76 + \frac{13.16}{\sqrt{n}}. \]
To minimize \(n\), set the two bounds equal:
\[ 80 - \frac{13.16}{\sqrt{n}} = 76 + \frac{13.16}{\sqrt{n}}. \]
Simplify:
\[ 80 - 76 = \frac{13.16}{\sqrt{n}} + \frac{13.16}{\sqrt{n}} \implies 4 = \frac{26.32}{\sqrt{n}} \implies \sqrt{n} = \frac{26.32}{4} = 6.58 \implies n = (6.58)^2 \approx 43.3. \]
Round up:
\[ n = 44. \]
Then:
\[ c = 80 - \frac{13.16}{\sqrt{44}} = 80 - \frac{13.16}{6.633} \approx 80 - 1.984 \approx 78.016. \]
Verify:
\[ c = 76 + \frac{13.16}{\sqrt{44}} \approx 76 + 1.984 \approx 78.016. \]
Check \(\alpha\):
\[ \frac{c - 80}{8/\sqrt{44}} = \frac{78.016 - 80}{8/\sqrt{44}} = \frac{-1.984}{1.206} \approx -1.645 \implies \alpha = \Phi(-1.645) = 0.05. \]
Check \(\beta\):
\[ \frac{c - 76}{8/\sqrt{44}} = \frac{78.016 - 76}{1.206} \approx 1.645 \implies \beta = 1 - \Phi(1.645) = 0.05. \]
Thus:
\[ n = 44, \quad c \approx 78.016. \]
# Parameters
mu0 <- 80
mu1 <- 76
sigma <- 8
alpha <- 0.05
beta <- 0.05

# Critical z-values
z_alpha <- qnorm(alpha, lower.tail = TRUE) # One-sided test
z_beta <- qnorm(1 - beta) # For power = 1 - beta

# Difference in means
delta <- abs(mu0 - mu1)

# Calculate required n
numerator <- sigma * (z_beta + abs(z_alpha))
n_required <- (numerator / delta)^2

# Round up to nearest integer
n_required_rounded <- ceiling(n_required)

# Compute critical value c
se <- sigma / sqrt(n_required_rounded)
critical_value <- mu0 + z_alpha * se
# Output results
cat("Required sample size n:", n_required_rounded, "\n")
Required sample size n: 44
cat("Critical value c:", round(critical_value, 2), "\n")
Critical value c: 78.02
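As a quick check, both error rates can be recomputed from the objects defined above:
# Verify that the chosen n and c give alpha = beta = 0.05
pnorm((critical_value - mu0) / se)       # Type I error, approximately 0.05
1 - pnorm((critical_value - mu1) / se)   # Type II error, approximately 0.05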
# Create x-axis values for the plot
x_vals <- seq(70, 85, length.out = 1000)

# Densities under H0 and H1
df <- data.frame(
  x = x_vals,
  H0 = dnorm(x_vals, mean = mu0, sd = se),
  H1 = dnorm(x_vals, mean = mu1, sd = se)
)
# Plot
ggplot(df, aes(x = x)) +
geom_line(aes(y = H0), color = "blue", linewidth = 1.2) +
geom_line(aes(y = H1), color = "red", linewidth = 1.2) +
# Type I error region
geom_area(data = subset(df, x <= critical_value), aes(y = H0), fill = "blue", alpha = 0.3) +
# Type II error region
geom_area(data = subset(df, x > critical_value), aes(y = H1), fill = "red", alpha = 0.3) +
geom_vline(xintercept = critical_value, linetype = "dashed", color = "black") +
annotate("text", x = critical_value, y = 0.05, label = paste0("C-value ≈ ", round(critical_value, 3)), angle = 90, vjust = -0.5) +
labs(title = "Sampling Distributions Under H0 and H1",
x = expression(bar(X)),
y = "Density") +
theme_minimal() +
theme(legend.position = "bottom",
legend.title = element_blank(),
legend.background = element_rect(fill = "white", color = "black"),
legend.margin = margin(6, 6, 6, 6))
\[ \psi(t_1, t_2) = \ln M(t_1, t_2), \quad M(t_1, t_2) = E\left[\exp(t_1 X + t_2 Y)\right]. \]
\[ \frac{\partial \psi}{\partial t_1} = \frac{1}{M(t_1, t_2)} \cdot \frac{\partial M(t_1, t_2)}{\partial t_1}, \]
where
\[ \frac{\partial M(t_1, t_2)}{\partial t_1} = E\left[X \cdot \exp(t_1 X + t_2 Y)\right]. \]
At \((t_1, t_2) = (0, 0)\):
\[ \frac{\partial \psi(0, 0)}{\partial t_1} = E[X]. \]
\[ \frac{\partial \psi}{\partial t_2} = \frac{1}{M(t_1, t_2)} \cdot \frac{\partial M(t_1, t_2)}{\partial t_2}, \]
where
\[ \frac{\partial M(t_1, t_2)}{\partial t_2} = E\left[Y \cdot \exp(t_1 X + t_2 Y)\right]. \]
At \((t_1, t_2) = (0, 0)\):
\[ \frac{\partial \psi(0, 0)}{\partial t_2} = E[Y]. \]
\[ \frac{\partial^2 \psi}{\partial t_1^2} = \frac{\frac{\partial^2 M(t_1, t_2)}{\partial t_1^2} \cdot M(t_1, t_2) - \left(\frac{\partial M(t_1, t_2)}{\partial t_1}\right)^2}{M(t_1, t_2)^2}. \]
At \((t_1, t_2) = (0, 0)\):
\[ \frac{\partial^2 \psi(0, 0)}{\partial t_1^2} = \text{Var}(X). \]
\[ \frac{\partial^2 \psi}{\partial t_2^2} = \frac{\frac{\partial^2 M(t_1, t_2)}{\partial t_2^2} \cdot M(t_1, t_2) - \left(\frac{\partial M(t_1, t_2)}{\partial t_2}\right)^2}{M(t_1, t_2)^2}. \]
At \((t_1, t_2) = (0, 0)\):
\[ \frac{\partial^2 \psi(0, 0)}{\partial t_2^2} = \text{Var}(Y). \]
\[ \frac{\partial^2 \psi}{\partial t_1 \partial t_2} = \frac{\frac{\partial^2 M(t_1, t_2)}{\partial t_1 \partial t_2} \cdot M(t_1, t_2) - \frac{\partial M(t_1, t_2)}{\partial t_1} \cdot \frac{\partial M(t_1, t_2)}{\partial t_2}}{M(t_1, t_2)^2}. \]
At \((t_1, t_2) = (0, 0)\):
\[ \frac{\partial^2 \psi(0, 0)}{\partial t_1 \partial t_2} = \text{Cov}(X, Y). \]
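As a sanity check (not part of the derivation above), these identities can be verified numerically for a bivariate normal, whose cumulant generating function is known in closed form; the parameter names below are illustrative:
# psi(t1, t2) = t1*mx + t2*my + 0.5*(t1^2*vx + 2*t1*t2*cxy + t2^2*vy) for a bivariate normal
mx <- 1; my <- -2; vx <- 4; vy <- 9; cxy <- 1.5
psi <- function(t1, t2) t1 * mx + t2 * my + 0.5 * (t1^2 * vx + 2 * t1 * t2 * cxy + t2^2 * vy)
h <- 1e-4
(psi(h, 0) - psi(-h, 0)) / (2 * h)                               # first partial in t1: E[X] = 1
(psi(h, 0) - 2 * psi(0, 0) + psi(-h, 0)) / h^2                   # second partial in t1: Var(X) = 4
(psi(h, h) - psi(h, -h) - psi(-h, h) + psi(-h, -h)) / (4 * h^2)  # mixed partial: Cov(X, Y) = 1.5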
Step 1: Write the Likelihood Function
\[ L(\lambda) = \frac{e^{-\lambda} \cdot \lambda^x}{x!}. \]
Step 2: Compute the Log-Likelihood
\[ \ln\left(L(\lambda)\right) = \ln\left(\frac{e^{-\lambda} \cdot \lambda^x}{x!}\right) = -\lambda + x \ln(\lambda) - \ln(x!). \]
Step 3: Find the First Derivative
Differentiate \(\ln(L(\lambda))\) with respect to \(\lambda\):
\[ \frac{d\ln(L)}{d\lambda} = \frac{d}{d\lambda} [ -\lambda + x \ln(\lambda) ] = -1 + x \cdot \frac{1}{\lambda} = -1 + \frac{x}{\lambda}. \]
Step 4: Set the Derivative to Zero
To find the maximum:
\[ -1 + \frac{x}{\lambda} = 0 \implies \frac{x}{\lambda} = 1 \implies \lambda = x. \]
So, the MLE of \(\lambda\), based on one observation \(X\), is:
\[ \hat{\lambda} = X. \]
Step 5: Verify It’s a Maximum
Take the second derivative to check if this is a maximum:
\[ \frac{d^2\ln(L)}{d\lambda^2} = \frac{d}{d\lambda} \left[ -1 + \frac{x}{\lambda} \right] = \frac{d}{d\lambda} \left[ \frac{x}{\lambda} \right] = x \cdot \left(-\frac{1}{\lambda^2}\right) = -\frac{x}{\lambda^2}. \]
Since \(x\) and \(\lambda\) are positive,
\[ -\frac{x}{\lambda^2} < 0, \]
which means the second derivative is negative, confirming that this is indeed a maximum.
The expected value of \(\hat{\lambda}\) is:
\[ E[\hat{\lambda}] = E[X]. \]
For a Poisson random variable:
\[ E[X] = \lambda = 50, \]
so:
\[ E[\hat{\lambda}] = 50. \]
This matches the true \(\lambda\), demonstrating that the estimator is unbiased.
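A short simulation illustrates this unbiasedness (the true \(\lambda = 50\) comes from the statement above; the simulation itself is illustrative):
# Each single observation X from Poisson(50) is itself the MLE of lambda
set.seed(123)
lambda_hat <- rpois(10000, lambda = 50)
mean(lambda_hat)   # close to 50, consistent with E[lambda-hat] = lambda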
Part A
set.seed(100) # For reproducibility
# Parameters
mu <- 21
sigma <- sqrt(9) # Standard deviation = sqrt(variance)
n <- 30
num_simulations <- 10000

# Simulate samples: each row is one sample of size n
samples <- matrix(rnorm(n * num_simulations, mean = mu, sd = sigma),
                  nrow = num_simulations, ncol = n)

# Sample means (X̄)
X_bar <- rowMeans(samples)

# Biased estimator: μ̂_biased = 0.9 * X̄
biased_estimates <- 0.9 * X_bar

# Create a dataframe for plotting
df <- data.frame(
  Estimate = c(X_bar, biased_estimates),
  Type = rep(c("Unbiased (X̄)", "Biased (0.9 X̄)"), each = num_simulations)
)
# Plot density curves
ggplot(df, aes(x = Estimate, fill = Type, color = Type)) +
geom_density(alpha = 0.4, linewidth = 1) +
geom_vline(xintercept = mu, linetype = "dashed", color = "black") +
annotate("text", x = mu, y = 0.15, label = "True Mean = 21", vjust = -1.2, size = 4) +
labs(title = "Sampling Distributions of Estimators",
x = "Estimate Value", y = "Density") +
theme_minimal() +
theme(legend.position = "top")
# True mean
mu <- 21

# Unbiased estimator
mean_unbiased <- mean(X_bar)
bias_unbiased <- mean_unbiased - mu
var_unbiased <- var(X_bar)

# Biased estimator
mean_biased <- mean(biased_estimates)
bias_biased <- mean_biased - mu
var_biased <- var(biased_estimates)
# Print results
cat("Unbiased Estimator (X̄):\n")
Unbiased Estimator (X̄):
cat(" Mean:", round(mean_unbiased, 3), "\n")
Mean: 21.008
cat(" Bias:", round(bias_unbiased, 3), "\n")
Bias: 0.008
cat(" Variance:", round(var_unbiased, 3), "\n\n")
Variance: 0.302
cat("Biased Estimator (0.9 X̄):\n")
Biased Estimator (0.9 X̄):
cat(" Mean:", round(mean_biased, 3), "\n")
Mean: 18.907
cat(" Bias:", round(bias_biased, 3), "\n")
Bias: -2.093
cat(" Variance:", round(var_biased, 3), "\n")
Variance: 0.245
The sample mean \(\bar{X}\) appears unbiased: its average estimate across the 10,000 simulations is approximately equal to the true population mean (\(\mu = 21\)), and its computed bias is close to zero.
The biased estimator \(0.9\bar{X}\) has lower variance. This makes intuitive sense: multiplying by a constant less than 1 shrinks the variability of the estimator (by the square of that constant).
I’d prefer \(\bar{X}\). While the biased estimator has lower variance, its bias (about \(-2.1\)) means it systematically underestimates the true mean, which can lead to incorrect conclusions. \(\bar{X}\) is unbiased, so it is more reliable for estimating the true mean, even if its variance is slightly higher.
An estimator is unbiased if, on average, it equals the true value it’s estimating. For example, if we’re estimating a mean of 21, an unbiased estimator averages to 21 over many samples—it doesn’t systematically over- or under-estimate.
An estimator is efficient if it has the smallest possible variance among unbiased estimators. It means the estimates are tightly clustered around the true value, with less variability than other unbiased estimators. Efficiency is about precision: an efficient estimator gives more consistent estimates with less spread.
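One way to weigh the trade-off between bias and variance numerically is mean squared error, \(\text{MSE} = \text{bias}^2 + \text{variance}\); a quick computation from the Part A simulation objects above:
# MSE = bias^2 + variance, using the Part A results computed above
mse_unbiased <- bias_unbiased^2 + var_unbiased
mse_biased <- bias_biased^2 + var_biased
cat("MSE of X-bar:", round(mse_unbiased, 3), "\n")
cat("MSE of 0.9 X-bar:", round(mse_biased, 3), "\n")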
Part B
set.seed(100) # For reproducibility
# Parameters
mu1 <- 21
sigma1 <- sqrt(9) # Standard deviation = sqrt(variance)
n1 <- 100
num_simulations1 <- 10000

# Simulate samples: each row is one sample of size n
samples1 <- matrix(rnorm(n1 * num_simulations1, mean = mu1, sd = sigma1),
                   nrow = num_simulations1, ncol = n1)

X_bar1 <- rowMeans(samples1)

# Biased estimator: μ̂_biased = 0.9 * X̄
biased_estimates1 <- 0.9 * X_bar1

# Create a dataframe for plotting
df1 <- data.frame(
  Estimate1 = c(X_bar1, biased_estimates1),
  Type = rep(c("Unbiased (X̄)", "Biased (0.9 X̄)"), each = num_simulations1)
)
# Plot density curves
ggplot(df1, aes(x = Estimate1, fill = Type, color = Type)) +
geom_density(alpha = 0.4, linewidth = 1) +
geom_vline(xintercept = mu1, linetype = "dashed", color = "black") +
annotate("text", x = mu1, y = 0.15, label = "True Mean = 21", vjust = -1.2, size = 4) +
labs(title = "Sampling Distributions of Estimators with n= 100",
x = "Estimate Value", y = "Density") +
theme_minimal() +
theme(legend.position = "top")
mu1 <- 21

# Unbiased estimator
mean_unbiased <- mean(X_bar1)
bias_unbiased <- mean_unbiased - mu1
var_unbiased <- var(X_bar1)

# Biased estimator
mean_biased <- mean(biased_estimates1)
bias_biased <- mean_biased - mu1
var_biased <- var(biased_estimates1)
# Print results
cat("Unbiased Estimator (X̄):\n")
Unbiased Estimator (X̄):
cat(" Mean:", round(mean_unbiased, 3), "\n")
Mean: 21.004
cat(" Bias:", round(bias_unbiased, 3), "\n")
Bias: 0.004
cat(" Variance:", round(var_unbiased, 3), "\n\n")
Variance: 0.09
cat("Biased Estimator (0.9 X̄):\n")
Biased Estimator (0.9 X̄):
cat(" Mean:", round(mean_biased, 3), "\n")
Mean: 18.904
cat(" Bias:", round(bias_biased, 3), "\n")
Bias: -2.096
cat(" Variance:", round(var_biased, 3), "\n")
Variance: 0.073
The sampling distributions of the two estimators, \(\bar{X}\) (unbiased) and \(0.9\bar{X}\) (biased), were analyzed for sample sizes n = 30 and n = 100. Both distributions are normal-shaped, with \(\bar{X}\) centered at the true mean of 21 and the biased estimator at about 18.9, reflecting its bias of roughly \(-2.1\). The spread of the biased estimator is consistently narrower than that of \(\bar{X}\) due to its lower variance (0.243 vs. 0.3 for n = 30, and 0.0729 vs. 0.09 for n = 100), as seen in the graphs where the pink curve (the biased estimator) is tighter than the blue curve (\(\bar{X}\)).
As sample size increases from 30 to 100, the variability of both estimators decreases, with variances dropping proportionally (e.g., Var(\(\bar{X}\)) from 0.3 to 0.09), making the distributions visibly narrower in the n = 100 graph.
A good estimator benefits from low variance because it gives more consistent and reliable estimates, and a small bias can sometimes be an acceptable price for that precision; in this case, however, the biased estimator’s large bias makes \(\bar{X}\) preferable despite its higher variance.
Part A
Independent Random Variables
Since \(G\) and \(H\) are independent, \(E(GH) = E(G)\,E(H)\).
The unbiased estimators for E(G) and E(H) are their respective sample means:
\[ \hat{E}(G) = \frac{1}{m} \sum_{i=1}^m G_i, \quad \hat{E}(H) = \frac{1}{n} \sum_{j=1}^n H_j. \]
Thus, the uniformly minimum variance unbiased estimator (UMVUE) for \(E(GH)\) is given by:
\[ \hat{E}(GH) = \hat{E}(G) \cdot \hat{E}(H) = \left(\frac{1}{m} \sum_{i=1}^m G_i \right) \cdot \left( \frac{1}{n} \sum_{j=1}^n H_j \right). \]
Finding the UMVUE of \(\text{Var}(G + H)\): using the property of variances for independent variables, we know
\[ \text{Var}(G + H) = \text{Var}(G) + \text{Var}(H). \]
The unbiased estimators for \(\text{Var}(G)\) and \(\text{Var}(H)\) are derived from their sample variances:
\[ \hat{\text{Var}}(G) = \frac{1}{m-1} \sum_{i=1}^m \left( G_i - \bar{G} \right)^2, \quad \text{where} \; \bar{G} = \frac{1}{m} \sum_{i=1}^m G_i, \]
\[ \hat{\text{Var}}(H) = \frac{1}{n-1} \sum_{j=1}^n \left( H_j - \bar{H} \right)^2, \quad \text{where} \; \bar{H} = \frac{1}{n} \sum_{j=1}^n H_j. \]
Thus, the UMVUE for \(\text{Var}(G + H)\) is:
\[ \hat{\text{Var}}(G + H) = \hat{\text{Var}}(G) + \hat{\text{Var}}(H). \]
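A minimal sketch of these two estimators in R, assuming two independent samples (the samples and their sizes below are hypothetical):
# Hypothetical independent samples of G (size m = 25) and H (size n = 40)
set.seed(1)
g <- rnorm(25, mean = 5, sd = 2)
h <- rnorm(40, mean = 3, sd = 1)

est_EGH <- mean(g) * mean(h)     # estimate of E(GH) = E(G) E(H)
est_Var_sum <- var(g) + var(h)   # estimate of Var(G + H); var() uses the (m - 1) denominator
est_EGH
est_Var_sum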
Part B
Proof that \(E[(X - \mu)^2] = \sigma^2\)
Definition of Variance:
The variance of a random variable X is defined as:
\[ \text{Var}(X) = E[(X - E[X])^2]. \]
Substitute the expectation:
For a normal random variable \(X \sim N(\mu, \sigma^2)\), the mean is \(E[X] = \mu\). Substituting this:
\[ \text{Var}(X) = E[(X - \mu)^2]. \]
Variance Property:
For a normal distribution, the variance is given as:
\[ \text{Var}(X) = \sigma^2. \]
Conclusion:
From the definition of variance and its property for normal distributions, we can conclude:
\[ E[(X - \mu)^2] = \sigma^2. \]
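A quick Monte Carlo check of this identity (the values of \(\mu\) and \(\sigma^2\) below are illustrative):
# Estimate E[(X - mu)^2] by simulation and compare with sigma^2
set.seed(1)
mu_true <- 10
sigma_sq_true <- 4
x_sim <- rnorm(1e6, mean = mu_true, sd = sqrt(sigma_sq_true))
mean((x_sim - mu_true)^2)   # close to sigma^2 = 4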