Log-normal Distribution

The log-normal distributions have been widely used in many different fields such as finance and economics, engineering and reliability, environmental science, medical and biology etc.

If \(X\) follows a lognormal distribution:

\[ f(x; \mu, \sigma^2) = \frac{1}{x\sqrt{2\pi \sigma^2}} \exp\left[-\frac{(\ln x - \mu)^2}{2\sigma^2}\right], \quad x > 0 \]

where \(\mu \in \mathbb{R}\) is the mean of \(\ln X\) and \(\sigma^2 > 0\) is the variance of \(\ln X\).

Caution: \(\mu\) and \(\sigma^2\) are not population mean and variance of the lognormal distribution!. Instead, we will use \(\mu_{LN}\) and \(\sigma_{LN}^2\) to denote the lognormal mean and variance respectively.

For an i.i.d. random sample \(x_1, x_2, \dots, x_n\):

\[\begin{align*} \ell(\mu, \sigma^2) &= \sum_{i=1}^n \ln f(x_i; \mu, \sigma^2) \\ &= -\frac{n}{2} \ln(2\pi) - \frac{n}{2} \ln \sigma^2 - \sum_{i=1}^n \ln x_i - \frac{1}{2\sigma^2} \sum_{i=1}^n (\ln x_i - \mu)^2 \end{align*}\]

The \(k\)-th moment of log normal distribution given by

\[ \boxed{ \mathbb{E}[X^k] = \exp\left( k\mu + \frac{1}{2} k^2 \sigma^2 \right). } \]

Particularly, the mean of the above lognormal distribution, denoted by \(\mu_{LN}\), is given by

\[ \mu_{LN} = \mathbb{E}[X] = \exp\left( \mu + \frac{1}{2} \sigma^2 \right). \]

Once the parameters were estimated, say \((\overline{\mu}, \overline{\sigma^2})\), we can use the plug-in Principle to estimate the lognormal mean by

\[ \boxed{\overline{\mu_{LN}} = \exp\left( \overline{\mu} + \frac{1}{2} \overline{\sigma^2} \right).} \]

Working Dataset: A reservoir engineer at an oil exploration company has recently drilled 50 new wells in a shale formation. According to geological theory, the size of oil reserves in this type of formation is often modeled using a log-normal distribution. The engineer is responsible for estimating the typical reserve size and quantifying the uncertainty of these estimates for a presentation to senior management. The estimated ultimate recovery (EUR) for the 50 wells, measured in thousands of barrels (Mbbl), is as follows:

17.3, 24.8, 8.2, 31.5, 14.1, 42.7, 11.9, 55.3, 21.4, 9.7, 36.2, 18.6, 63.1, 13.2, 28.9, 
47.5, 19.8, 7.5, 33.4, 52.1, 16.0, 25.9, 38.7, 10.3, 44.2, 22.5, 58.6, 12.8, 30.3, 48.9, 
20.1, 35.4, 15.7, 60.2, 26.7, 41.3, 18.1, 53.8, 23.9, 46.2, 29.4, 37.6, 14.9, 50.5, 32.8, 
19.3, 56.7, 11.2, 39.5, 27.1

This assignment focuses on finding the asymptotic sampling distributions of the MLE \(\hat{\mu}\) and \(\hat{\sigma^2}\) corresponding to \(\mu\) and \(\sigma^2\) based on an environmental study data.


# Oil reserve data
x <- c(
  17.3, 24.8, 8.2, 31.5, 14.1, 42.7, 11.9, 55.3, 21.4, 9.7,
  36.2, 18.6, 63.1, 13.2, 28.9, 47.5, 19.8, 7.5, 33.4, 52.1,
  16.0, 25.9, 38.7, 10.3, 44.2, 22.5, 58.6, 12.8, 30.3, 48.9,
  20.1, 35.4, 15.7, 60.2, 26.7, 41.3, 18.1, 53.8, 23.9, 46.2,
  29.4, 37.6, 14.9, 50.5, 32.8, 19.3, 56.7, 11.2, 39.5, 27.1
)

# Sample size
n <- length(x)

Question 1: Method of Moments Estimation (MME) of \(\mu\) and \(\sigma^2\)

Estimate the parameters \(\mu\) and \(\sigma^2\) (caution: not \(\sigma\)!) from the raw oil reserve data using the given \(k\)-th moment function. Organize your work into three parts:

  1. Derive the system of two equations for the method of moments estimators (MMEs) of \(\mu\) and \(\sigma^2\).

Let the first two sample moments be

\[ m_1=\frac{1}{n}\sum_{i=1}^n x_i,\qquad m_2=\frac{1}{n}\sum_{i=1}^n x_i^2. \]

For the lognormal distribution,

\[ \mu_1=\mathbb{E}[X]=\exp\left(\mu+\frac{1}{2}\sigma^2\right) \] and \[ \mu_2=\mathbb{E}[X^2]=\exp\left(2\mu+2\sigma^2\right). \]

The method of moments sets

\[ \exp\left(\mu+\frac{1}{2}\sigma^2\right)=m_1 \]

\[ \exp\left(2\mu+2\sigma^2\right)=m_2. \]

These are two MME equations used to solve for \(\mu\) and \(\sigma^2\).

  1. Derive the closed-form expressions for the MMEs of \(\mu\), \(\sigma^2\), and \(\mu_{LN}\), denoted by \(\widetilde{\mu}\), \(\widetilde{\sigma^2}\), and \(\widetilde{\mu_{LN}}\), respectively.

Taking logarithms gives

\[ \mu+\frac{1}{2}\sigma^2=\ln(m_1) \]

\[ 2\mu+2\sigma^2=\ln(m_2). \]

From the first equation,

\[ 2\mu+\sigma^2=2\ln(m_1). \]

Subtracting this from the second equation gives

\[ \sigma^2=\ln(m_2)-2\ln(m_1)=\ln\left(\frac{m_2}{m_1^2}\right). \]

Thus,

\[ \widetilde{\sigma^2}=\ln\left(\frac{m_2}{m_1^2}\right). \]

Substituting into the first equation,

\[ \widetilde{\mu}=\ln(m_1)-\frac{1}{2}\widetilde{\sigma^2}. \]

The plug-in estimator of the lognormal mean is

\[ \widetilde{\mu_{LN}}=\exp\left(\widetilde{\mu}+\frac{1}{2}\widetilde{\sigma^2}\right) \]

Since \(m_1=\bar X\), we have

\[ \boxed{\widetilde{\mu_{LN}}=\bar X} \]

  1. Write an R function that outputs the MMEs \(\widetilde{\mu}\), \(\widetilde{\sigma^2}\), and \(\widetilde{\mu_{LN}}\) based on the given data set.
# Method of Moments estimation for lognormal parameters
lognormal.mme <- function(x){
  m1 <- mean(x) #First sample moment
  m2 <- mean(x^2) #Second sample moment

  #MME of sigma^2 derived from moment equations   
  sigma2.tilde <- log(m2 / (m1^2))
  
  # MME of mu obtained by substituting sigma^2 into first equation
  mu.tilde <- log(m1) - 0.5 * sigma2.tilde
  
  # Plug-in estimator for the lognormal mean
  muLN.tilde <- exp(mu.tilde + 0.5 * sigma2.tilde)
  
  # Return the estimates
  data.frame(
    mu_tilde = mu.tilde,
    sigma2_tilde = sigma2.tilde,
    muLN_tilde = muLN.tilde
  )
}

# Return the estimates
lognormal.mme(x)
  mu_tilde sigma2_tilde muLN_tilde
1 3.301269    0.2339652     30.516


Question 2: Finding the MLE of \(\mu\) and \(\sigma^2\)

This question involves two parts:

(1). Derive the score functions (i.e., the gradient of the log-likelihood) for the parameters of the log-normal distribution, and set them to zero to obtain the system of score equations. Begin with the full log-likelihood and compute its partial derivatives with respect to \(\mu\) and \(\sigma^2\). [Hint: letting \(\beta = \sigma^2\) may simplify differentiation.]

Assume that \(\{x_1, x_2, \cdots, x_n \} \to \text{ LN }(\mu, \sigma^2)\) with density function given by

\[ \ell(\mu, \sigma^2) = -\frac{n}{2} \ln(2\pi) - n \ln \sigma - \sum_{i=1}^n \ln x_i - \frac{1}{2\sigma^2} \sum_{i=1}^n (\ln x_i - \mu)^2 \] The log-likelihood function is

\[ \ell(\mu,\sigma^2)=-\frac{n}{2}\ln(2\pi)-\frac{n}{2}\ln(\sigma^2)-\sum_{i=1}^n \ln x_i-\frac{1}{2\sigma^2}\sum_{i=1}^n(\ln x_i-\mu)^2. \]

Let \(\beta=\sigma^2\).

\[ \frac{\partial \ell}{\partial \mu}=\frac{1}{\beta}\sum_{i=1}^n(\ln x_i-\mu). \]

Setting this equal to zero gives \[ \hat{\mu}=\frac{1}{n}\sum_{i=1}^n \ln x_i. \]

\[ \frac{\partial \ell}{\partial \beta}=-\frac{n}{2\beta}+\frac{1}{2\beta^2}\sum_{i=1}^n(\ln x_i-\mu)^2. \]

Setting this equal to zero yields

\[ \widehat{\sigma^2}=\frac{1}{n}\sum_{i=1}^n(\ln x_i-\hat{\mu})^2. \]

(2). Perform some algebra to find the closed form of the MLE of \(\mu\) and \(\sigma^2\), denoted by \(\widehat{\mu}\) and \(\widehat{\sigma^2}\).

\[ \boxed{\hat{\mu}=\frac{1}{n}\sum_{i=1}^n \ln x_i} \] and

\[ \boxed{\widehat{\sigma^2}=\frac{1}{n}\sum_{i=1}^n(\ln x_i-\hat{\mu})^2}. \]

(3). Recall that lognormal population mean (i.e., the first moment) is

\[ \mu_{LN} = \mathbb{E}[X] = \exp\left( \mu + \frac{1}{2} \sigma^2 \right). \]

Using the relationship between variance and the first and second moments, \(\text{Var}(X) = \mathbb{E}[X^2] - \left(\mathbb{E}[X]\right)^2\), we derive the population variance of the lognormal distribution, denoted as \(\sigma_{\text{LN}}^2\). This result will be used in applying the Central Limit Theorem in subsequent questions

The lognormal mean is

\[ \mu_{LN}=\exp\left(\mu+\frac{1}{2}\sigma^2\right). \] Also, \[ \mathbb{E}[X^2]=\exp(2\mu+2\sigma^2). \] Therefore \[ \sigma_{LN}^2=\mathbb{E}[X^2]-(\mathbb{E}[X])^2=\exp(2\mu+2\sigma^2)-\exp(2\mu+\sigma^2). \]

Equivalently, \[ \sigma_{LN}^2=\exp(2\mu+\sigma^2)\left(\exp(\sigma^2)-1\right). \]

Question 3: Sampling Distribution of Lognormal Sample Mean \(\widehat{\mu_{LN}}\)

We have derived two different estimators of lognormal population mean \(\mu_{LN}\). This Question focuses on the sampling distribution of lognormal sample sample means based on MLE. The following are detailed instructions.

(1). Write an R function to estimate the MLE of \(\mu\) and \(\sigma^2\), denoted by \(\widehat{\mu}\) and \(\widehat{\sigma^2}\) and return the MLE of \(\mu_{LN}\), denoted by \(\widehat{\mu_{LN}}\).

# Function to compute MLE of mu, sigma^2, and lognormal mean
lognormal_mle <- function(x){

  y <- log(x)# Take natural log of data
  mu_hat <- mean(y) # MLE of mu 

  # MLE of sigma^2 
  sigma2_hat <- mean((y - mu_hat)^2)

  # Plug-in estimate of lognormal population mean
  muLN_hat <- exp(mu_hat + 0.5 * sigma2_hat)

  # Return estimates
  c(mu_hat = mu_hat,
    sigma2_hat = sigma2_hat,
    muLN_hat = muLN_hat)
}

# Compute MLE estimates
lognormal_mle(x)
    mu_hat sigma2_hat   muLN_hat 
 3.2685359  0.3271989 30.9426444 

(2). Take 1000 Bootstrap samples from the raw oil data set. For each bootstrap sample, call the above R function to find the bootstrap MLE of \(\mu_{LN}\), denoted by \(\widehat{\mu_{LN}^*}^{(1)}, \widehat{\mu_{LN}^*}^{(2)}, \cdots, \widehat{\mu_{LN}^*}^{(1000)}\).

set.seed(123)

B <- 1000 # Number of bootstrap samples
n <- length(x)

boot_muLN <- NULL # Vector to store bootstrap estimates

# Bootstrap loop
for(i in 1:B){

  # Draw bootstrap sample with replacement
  sample_i <- sample(x, size = n, replace = TRUE)

  # Compute bootstrap MLE of lognormal mean
  boot_muLN[i] <- lognormal_mle(sample_i)[3]
}

(3). Using kernel density estimation approach to estimate the bootstrap density curves based on the bootstrap MLEs: \(\widehat{\mu_{LN}^*}^{(1)}, \widehat{\mu_{LN}^*}^{(2)}, \cdots, \widehat{\mu_{LN}^*}^{(1000)}\).

# Plot histogram of bootstrap estimates
hist(boot_muLN,
     probability = TRUE,
     breaks = 20,
     main = "Bootstrap Sampling Distribution of mu_LN",
     xlab = "Bootstrap Estimates of mu_LN")

# Add kernel density curve
lines(density(boot_muLN),
      col = "blue",
      lwd = 2)

(4). In order to find the asymptotic sampling distribution using the central limit theorem (CLT). That is, when sample size is large,

\[ \bar{X} \to N \left(\mu_{LN}, \frac{\sigma_{LN}}{\sqrt{n}}\right) \]

where \(\mu_{LN}\) and \(\sigma_{LN}\) can be estimated by using the plug-in principle of MLE. Based on the asymptotic sampling distribution of lognormal sample mean \(\bar{X}\), add the asymptotic density curve to the above Bootstrap kernel density curve.

# Obtain MLE estimates from original data
mle_out <- lognormal_mle(x)

mu_hat <- mle_out[1]
sigma2_hat <- mle_out[2]
muLN_hat <- mle_out[3]

# Plug-in estimate of lognormal variance
sigma2LN_hat <- exp(2 * mu_hat + sigma2_hat) * (exp(sigma2_hat) - 1)

# Standard deviation of lognormal distribution
sigmaLN_hat <- sqrt(sigma2LN_hat)

# Standard error of the sample mean using CLT
asym_se <- sigmaLN_hat / sqrt(n)

# Create grid of values for asymptotic density
grid_x <- seq(min(boot_muLN), max(boot_muLN), length.out = 400)

# Compute asymptotic normal density
asym_density <- dnorm(grid_x,
                      mean = muLN_hat,
                      sd = asym_se)

# Plot bootstrap distribution again
hist(boot_muLN,
     probability = TRUE,
     breaks = 20,
     main = "Bootstrap vs Asymptotic Sampling Distribution",
     xlab = "mu_LN")

# Add bootstrap kernel density
lines(density(boot_muLN),
      col = "blue",
      lwd = 2)

# Add asymptotic normal curve
lines(grid_x,
      asym_density,
      col = "red",
      lwd = 2,
      lty = 2)

# Add legend
legend("topright",
       legend = c("Bootstrap KDE", "Asymptotic Normal"),
       col = c("blue", "red"),
       lwd = 2,
       lty = c(1,2))

(5). Write a short summary to compare the two sampling distributions of the sample mean of the lognormal distribution.

Answer: The bootstrap distribution is obtained by resampling the data, so it reflects the shape of the data more directly. The asymptotic distribution is based on the Central Limit Theorem and assumes a Normal approximation. If the two curves are similar, the Normal approximation works well. If they differ, the bootstrap distribution gives a better description of the estimator’s sampling behavior.

Question 4: Extra Bonus Credit

Derive the Hessian matrix by computing the second-order partial derivatives of the log-likelihood function (with respect to \(\mu\) and \(\sigma^2\)). Then, translate the derived Hessian into an R function that takes the MLEs of the parameters and the data as inputs and returns the Hessian matrix. Finally, compare your analytically derived Hessian with the numerical Hessian returned by optim().

Let \(\beta=\sigma^2\).

The Hessian matrix of the log-likelihood is obtained from the second derivatives.

\[ \frac{\partial^2 \ell}{\partial \mu^2}=-\frac{n}{\beta} \]

\[ \frac{\partial^2 \ell}{\partial \mu \partial\beta}=-\frac{1}{\beta^2}\sum_{i=1}^n(\ln x_i-\mu) \]

\[ \frac{\partial^2 \ell}{\partial \beta^2}=\frac{n}{2\beta^2}-\frac{1}{\beta^3}\sum_{i=1}^n(\ln x_i-\mu)^2 \]

Thus the Hessian matrix is

\[ H(\mu,\beta) =\begin{bmatrix}-\frac{n}{\beta}&-\frac{1}{\beta^2}\sum_{i=1}^n(\ln x_i-\mu)\\-\frac{1}{\beta^2}\sum_{i=1}^n(\ln x_i-\mu)&\frac{n}{2\beta^2}-\frac{1}{\beta^3}\sum_{i=1}^n(\ln x_i-\mu)^2\end{bmatrix}. \]

# Function to compute analytic Hessian matrix of the log-likelihood
lognormal_hessian <- function(mu, sigma2, x){

  # Log transformed data
  y <- log(x)

  # Sample size
  n <- length(x)

  # Intermediate sums
  s1 <- sum(y - mu)
  s2 <- sum((y - mu)^2)

  # Second derivative with respect to mu
  h11 <- -n / sigma2

  # Cross derivative
  h12 <- -s1 / (sigma2^2)

  # Symmetry of Hessian
  h21 <- h12

  # Second derivative with respect to sigma^2
  h22 <- n / (2 * sigma2^2) - s2 / (sigma2^3)

  # Construct Hessian matrix
  matrix(c(h11, h12,
           h21, h22),
         nrow = 2,
         byrow = TRUE)
}

Compare with numerical Hessian from optim ()

# Negative log-likelihood for optim()
negloglik_lognormal <- function(par, x){
  mu <- par[1]
  sigma2 <- par[2]
  if(sigma2 <= 0) return(Inf)
  
  n <- length(x)
  y <- log(x)
  
  ll <- -n/2 * log(2*pi) - n/2 * log(sigma2) - sum(log(x)) -
    (1/(2*sigma2)) * sum((y - mu)^2)
  
  return(-ll)
}

# Gradient of negative log-likelihood
neggrad_lognormal <- function(par, x){
  mu <- par[1]
  sigma2 <- par[2]
  y <- log(x)
  n <- length(x)
  
  dmu <- -(1/sigma2) * sum(y - mu)
  dsigma2 <- n/(2*sigma2) - sum((y - mu)^2)/(2 * sigma2^2)
  
  return(c(dmu, dsigma2))
}

# optim
init <- c(mean(log(x)), var(log(x)))
fit <- optim(par = init,
             fn = negloglik_lognormal,
             gr = neggrad_lognormal,
             x = x,
             method = "L-BFGS-B",
             lower = c(-Inf, 1e-8),
             hessian = TRUE)

mu_hat_opt <- fit$par[1]
sigma2_hat_opt <- fit$par[2]

# Analytic Hessian of log-likelihood
H_analytic <- lognormal_hessian(mu_hat_opt, sigma2_hat_opt, x)

# Numerical Hessian from optim() is for negative log-likelihood
H_numeric <- fit$hessian

H_analytic
              [,1]          [,2]
[1,] -1.528111e+02  1.037003e-13
[2,]  1.037003e-13 -2.335090e+02
-H_analytic
              [,1]          [,2]
[1,]  1.528111e+02 -1.037003e-13
[2,] -1.037003e-13  2.335090e+02
H_numeric
              [,1]          [,2]
[1,]  1.528111e+02 -5.185066e-14
[2,] -5.185066e-14  2.335155e+02

The analytic Hessian and the numerical Hessian from optim() should be very close, except for small numerical differences due to finite-difference approximation. Since optim() is minimizing the negative log-likelihood, its Hessian should match the negative of the analytic Hessian of the log-likelihood.

---
title: "STA 506 - Midterm Examination Spring 2026 "
author: " Xiaoying Ma "
date: " Due: Sunday, 03/08 at 11:59 PM"
output:
  html_document: 
    toc: yes
    toc_depth: 4
    toc_float: yes
    number_sections: no
    toc_collapsed: yes
    code_folding: hide
    code_download: yes
    smooth_scroll: yes
    highlight: monochrome
    theme: spacelab
  pdf_document: 
    toc: yes
    toc_depth: 4
    fig_caption: yes
    number_sections: yes
    fig_width: 3
    fig_height: 3
  word_document: 
    toc: yes
    toc_depth: 4
    fig_caption: yes
    keep_md: yes
editor_options: 
  chunk_output_type: inline
---

```{css, echo = FALSE}
#TOC::before {
  content: "Table of Contents";
  font-weight: bold;
  font-size: 1.2em;
  display: block;
  color: navy;
  margin-bottom: 10px;
}


div#TOC li {     /* table of content  */
    list-style:upper-roman;
    background-image:none;
    background-repeat:none;
    background-position:0;
}

h1.title {    /* level 1 header of title  */
  font-size: 22px;
  font-weight: bold;
  color: DarkRed;
  text-align: center;
  font-family: "Gill Sans", sans-serif;
}

h4.author { /* Header 4 - and the author and data headers use this too  */
  font-size: 15px;
  font-weight: bold;
  font-family: system-ui;
  color: navy;
  text-align: center;
}

h4.date { /* Header 4 - and the author and data headers use this too  */
  font-size: 18px;
  font-weight: bold;
  font-family: "Gill Sans", sans-serif;
  color: DarkBlue;
  text-align: center;
}

h1 { /* Header 1 - and the author and data headers use this too  */
    font-size: 20px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: center;
}

h2 { /* Header 2 - and the author and data headers use this too  */
    font-size: 18px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h3 { /* Header 3 - and the author and data headers use this too  */
    font-size: 16px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h4 { /* Header 4 - and the author and data headers use this too  */
    font-size: 14px;
  font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

/* Add dots after numbered headers */
.header-section-number::after {
  content: ".";

body { background-color:white; }

.highlightme { background-color:yellow; }

p { background-color:white; }

}
```

```{r setup, include=FALSE}
# code chunk specifies whether the R code, warnings, and output 
# will be included in the output files.
if (!require("knitr")) {
   install.packages("knitr")
   library(knitr)
}
if (!require("pander")) {
   install.packages("pander")
   library(pander)
}
if (!require("ggplot2")) {
  install.packages("ggplot2")
  library(ggplot2)
}
if (!require("tidyverse")) {
  install.packages("tidyverse")
  library(tidyverse)
}

if (!require("plotly")) {
  install.packages("plotly")
  library(plotly)
}
####
knitr::opts_chunk$set(echo = TRUE,       # include code chunk in the output file
                      warning = FALSE,   # sometimes, you code may produce warning messages,
                                         # you can choose to include the warning messages in
                                         # the output file. 
                      results = TRUE,    # you can also decide whether to include the output
                                         # in the output file.
                      message = FALSE,
                      comment = NA
                      )  
```
 
 \
 

## **Log-normal Distribution**

The log-normal distributions have been widely used in many different fields such as finance and economics, engineering and reliability, environmental science, medical and biology etc. 

If $X$ follows a lognormal distribution:

$$
f(x; \mu, \sigma^2) = \frac{1}{x\sqrt{2\pi \sigma^2}} 
\exp\left[-\frac{(\ln x - \mu)^2}{2\sigma^2}\right], \quad x > 0
$$

where  $\mu \in \mathbb{R}$ is the mean of $\ln X$ and $\sigma^2 > 0$ is the variance of $\ln X$.

<font color ="red">**\color{red}Caution:** *\color{red}$\mu$ and $\sigma^2$ are not population mean and variance of the lognormal distribution!*. **\color{blue} Instead, we will use $\mu_{LN}$ and $\sigma_{LN}^2$ to denote the lognormal mean and variance respectively.**</font>


For an i.i.d. random sample $x_1, x_2, \dots, x_n$:


\begin{align*}
\ell(\mu, \sigma^2) &= \sum_{i=1}^n \ln f(x_i; \mu, \sigma^2) \\
&= -\frac{n}{2} \ln(2\pi) - \frac{n}{2} \ln \sigma^2 - \sum_{i=1}^n \ln x_i 
   - \frac{1}{2\sigma^2} \sum_{i=1}^n (\ln x_i - \mu)^2
\end{align*}


The $k$-th moment of log normal distribution given by

$$
\boxed{ \mathbb{E}[X^k] = \exp\left( k\mu + \frac{1}{2} k^2 \sigma^2 \right). }
$$

Particularly, the mean of the above lognormal distribution, denoted by $\mu_{LN}$, is given by

$$
\mu_{LN} = \mathbb{E}[X] = \exp\left( \mu + \frac{1}{2}  \sigma^2 \right).
$$

Once the parameters were estimated, say $(\overline{\mu}, \overline{\sigma^2})$, we can use **the plug-in Principle** to estimate the lognormal mean by

$$
\boxed{\overline{\mu_{LN}} =  \exp\left( \overline{\mu} + \frac{1}{2}  \overline{\sigma^2} \right).}
$$





**Working Dataset**: A reservoir engineer at an oil exploration company has recently drilled 50 new wells in a shale formation. According to geological theory, the size of oil reserves in this type of formation is often modeled using a log-normal distribution. The engineer is responsible for estimating the typical reserve size and quantifying the uncertainty of these estimates for a presentation to senior management. The estimated ultimate recovery (EUR) for the 50 wells, measured in thousands of barrels (Mbbl), is as follows:

```
17.3, 24.8, 8.2, 31.5, 14.1, 42.7, 11.9, 55.3, 21.4, 9.7, 36.2, 18.6, 63.1, 13.2, 28.9, 
47.5, 19.8, 7.5, 33.4, 52.1, 16.0, 25.9, 38.7, 10.3, 44.2, 22.5, 58.6, 12.8, 30.3, 48.9, 
20.1, 35.4, 15.7, 60.2, 26.7, 41.3, 18.1, 53.8, 23.9, 46.2, 29.4, 37.6, 14.9, 50.5, 32.8, 
19.3, 56.7, 11.2, 39.5, 27.1
```


<font color = "blue">This assignment focuses on finding the asymptotic sampling distributions of the MLE $\hat{\mu}$ and $\hat{\sigma^2}$ corresponding to $\mu$ and $\sigma^2$ based on an environmental study data.</font>


\


```{r}
# Oil reserve data
x <- c(
  17.3, 24.8, 8.2, 31.5, 14.1, 42.7, 11.9, 55.3, 21.4, 9.7,
  36.2, 18.6, 63.1, 13.2, 28.9, 47.5, 19.8, 7.5, 33.4, 52.1,
  16.0, 25.9, 38.7, 10.3, 44.2, 22.5, 58.6, 12.8, 30.3, 48.9,
  20.1, 35.4, 15.7, 60.2, 26.7, 41.3, 18.1, 53.8, 23.9, 46.2,
  29.4, 37.6, 14.9, 50.5, 32.8, 19.3, 56.7, 11.2, 39.5, 27.1
)

# Sample size
n <- length(x)

```

## **Question 1**: Method of Moments Estimation (MME) of $\mu$ and $\sigma^2$

Estimate the parameters $\mu$ and $\sigma^2$ (<font color="red">*\color{red}caution: not $\sigma$!*</font>) from the raw oil reserve data using the given $k$-th moment function. Organize your work into three parts:

(a) Derive the system of two equations for the method of moments estimators (MMEs) of $\mu$ and $\sigma^2$.

Let the first two sample moments be

$$
m_1=\frac{1}{n}\sum_{i=1}^n x_i,\qquad m_2=\frac{1}{n}\sum_{i=1}^n x_i^2.
$$

For the lognormal distribution,

$$
\mu_1=\mathbb{E}[X]=\exp\left(\mu+\frac{1}{2}\sigma^2\right)
$$
and 
$$
\mu_2=\mathbb{E}[X^2]=\exp\left(2\mu+2\sigma^2\right).
$$

The method of moments sets

$$
\exp\left(\mu+\frac{1}{2}\sigma^2\right)=m_1
$$


$$
\exp\left(2\mu+2\sigma^2\right)=m_2.
$$

These are two MME equations used to solve for $\mu$ and $\sigma^2$.


(b) Derive the closed-form expressions for the MMEs of $\mu$, $\sigma^2$, and $\mu_{LN}$, denoted by $\widetilde{\mu}$, $\widetilde{\sigma^2}$, and $\widetilde{\mu_{LN}}$, respectively.

Taking logarithms gives

$$
\mu+\frac{1}{2}\sigma^2=\ln(m_1)
$$

$$
2\mu+2\sigma^2=\ln(m_2).
$$

From the first equation,

$$
2\mu+\sigma^2=2\ln(m_1).
$$

Subtracting this from the second equation gives

$$
\sigma^2=\ln(m_2)-2\ln(m_1)=\ln\left(\frac{m_2}{m_1^2}\right).
$$

Thus,

$$
\widetilde{\sigma^2}=\ln\left(\frac{m_2}{m_1^2}\right).
$$

Substituting into the first equation,

$$
\widetilde{\mu}=\ln(m_1)-\frac{1}{2}\widetilde{\sigma^2}.
$$

The plug-in estimator of the lognormal mean is

$$
\widetilde{\mu_{LN}}=\exp\left(\widetilde{\mu}+\frac{1}{2}\widetilde{\sigma^2}\right)
$$

Since $m_1=\bar X$, we have

$$
\boxed{\widetilde{\mu_{LN}}=\bar X}
$$


(c) Write an R function that outputs the MMEs $\widetilde{\mu}$, $\widetilde{\sigma^2}$, and $\widetilde{\mu_{LN}}$ based on the given data set.

```{r}
# Method of Moments estimation for lognormal parameters
lognormal.mme <- function(x){
  m1 <- mean(x) #First sample moment
  m2 <- mean(x^2) #Second sample moment

  #MME of sigma^2 derived from moment equations   
  sigma2.tilde <- log(m2 / (m1^2))
  
  # MME of mu obtained by substituting sigma^2 into first equation
  mu.tilde <- log(m1) - 0.5 * sigma2.tilde
  
  # Plug-in estimator for the lognormal mean
  muLN.tilde <- exp(mu.tilde + 0.5 * sigma2.tilde)
  
  # Return the estimates
  data.frame(
    mu_tilde = mu.tilde,
    sigma2_tilde = sigma2.tilde,
    muLN_tilde = muLN.tilde
  )
}

# Return the estimates
lognormal.mme(x)
```
\


## **Question 2**: Finding the MLE of $\mu$ and $\sigma^2$

This question involves two parts:

(1). Derive the score functions (i.e., the gradient of the log-likelihood) for the parameters of the log-normal distribution, and set them to zero to obtain the system of score equations. Begin with the full log-likelihood and compute its partial derivatives with respect to $\mu$ and $\sigma^2$. [Hint: letting $\beta = \sigma^2$ may simplify differentiation.]

Assume that $\{x_1, x_2, \cdots, x_n \} \to \text{ LN }(\mu, \sigma^2)$ with density function given by

$$
\ell(\mu, \sigma^2) = -\frac{n}{2} \ln(2\pi) - n \ln \sigma - \sum_{i=1}^n \ln x_i 
   - \frac{1}{2\sigma^2} \sum_{i=1}^n (\ln x_i - \mu)^2
$$
The log-likelihood function is

$$
\ell(\mu,\sigma^2)=-\frac{n}{2}\ln(2\pi)-\frac{n}{2}\ln(\sigma^2)-\sum_{i=1}^n \ln x_i-\frac{1}{2\sigma^2}\sum_{i=1}^n(\ln x_i-\mu)^2.
$$

Let $\beta=\sigma^2$.

\subsection*{Score function for $\mu$}

$$
\frac{\partial \ell}{\partial \mu}=\frac{1}{\beta}\sum_{i=1}^n(\ln x_i-\mu).
$$

Setting this equal to zero gives
$$
\hat{\mu}=\frac{1}{n}\sum_{i=1}^n \ln x_i.
$$

\subsection*{Score function for $\beta=\sigma^2$}

$$
\frac{\partial \ell}{\partial \beta}=-\frac{n}{2\beta}+\frac{1}{2\beta^2}\sum_{i=1}^n(\ln x_i-\mu)^2.
$$

Setting this equal to zero yields

$$
\widehat{\sigma^2}=\frac{1}{n}\sum_{i=1}^n(\ln x_i-\hat{\mu})^2.
$$


(2). Perform some algebra to find the closed form of the MLE of $\mu$ and $\sigma^2$, denoted by $\widehat{\mu}$ and $\widehat{\sigma^2}$.

$$
\boxed{\hat{\mu}=\frac{1}{n}\sum_{i=1}^n \ln x_i}
$$
and

$$
\boxed{\widehat{\sigma^2}=\frac{1}{n}\sum_{i=1}^n(\ln x_i-\hat{\mu})^2}.
$$


(3). Recall that lognormal population mean (i.e., the first moment) is 

$$
\mu_{LN} = \mathbb{E}[X] = \exp\left( \mu + \frac{1}{2}  \sigma^2 \right).
$$

Using the relationship between variance and the first and second moments, $\text{Var}(X) = \mathbb{E}[X^2] - \left(\mathbb{E}[X]\right)^2$, we derive the population variance of the lognormal distribution, denoted as $\sigma_{\text{LN}}^2$. This result will be used in applying the Central Limit Theorem in subsequent questions
\section*{Lognormal Mean and Variance}

The lognormal mean is

$$
\mu_{LN}=\exp\left(\mu+\frac{1}{2}\sigma^2\right).
$$
Also,
$$
\mathbb{E}[X^2]=\exp(2\mu+2\sigma^2).
$$
Therefore
$$
\sigma_{LN}^2=\mathbb{E}[X^2]-(\mathbb{E}[X])^2=\exp(2\mu+2\sigma^2)-\exp(2\mu+\sigma^2).
$$

Equivalently,
$$
\sigma_{LN}^2=\exp(2\mu+\sigma^2)\left(\exp(\sigma^2)-1\right).
$$
\

## **Question 3**: Sampling Distribution of Lognormal **Sample Mean** $\widehat{\mu_{LN}}$

We have derived two different estimators of lognormal population mean $\mu_{LN}$. This Question focuses on the sampling distribution of lognormal sample sample means based on MLE. The following are detailed instructions.

(1). Write an R function to estimate the MLE of $\mu$ and $\sigma^2$, denoted by $\widehat{\mu}$ and $\widehat{\sigma^2}$ and return the MLE of $\mu_{LN}$, denoted by $\widehat{\mu_{LN}}$.

```{r}

# Function to compute MLE of mu, sigma^2, and lognormal mean
lognormal_mle <- function(x){

  y <- log(x)# Take natural log of data
  mu_hat <- mean(y) # MLE of mu 

  # MLE of sigma^2 
  sigma2_hat <- mean((y - mu_hat)^2)

  # Plug-in estimate of lognormal population mean
  muLN_hat <- exp(mu_hat + 0.5 * sigma2_hat)

  # Return estimates
  c(mu_hat = mu_hat,
    sigma2_hat = sigma2_hat,
    muLN_hat = muLN_hat)
}

# Compute MLE estimates
lognormal_mle(x)

```

(2). Take 1000 Bootstrap samples from the raw oil data set. For each bootstrap sample, call the above R function to find the **bootstrap MLE** of $\mu_{LN}$, denoted by $\widehat{\mu_{LN}^*}^{(1)}, \widehat{\mu_{LN}^*}^{(2)}, \cdots, \widehat{\mu_{LN}^*}^{(1000)}$.

```{r}

set.seed(123)

B <- 1000 # Number of bootstrap samples
n <- length(x)

boot_muLN <- NULL # Vector to store bootstrap estimates

# Bootstrap loop
for(i in 1:B){

  # Draw bootstrap sample with replacement
  sample_i <- sample(x, size = n, replace = TRUE)

  # Compute bootstrap MLE of lognormal mean
  boot_muLN[i] <- lognormal_mle(sample_i)[3]
}

```

(3). Using kernel density estimation approach to estimate the bootstrap density curves based on the bootstrap MLEs: $\widehat{\mu_{LN}^*}^{(1)}, \widehat{\mu_{LN}^*}^{(2)}, \cdots, \widehat{\mu_{LN}^*}^{(1000)}$.

```{r}
# Plot histogram of bootstrap estimates
hist(boot_muLN,
     probability = TRUE,
     breaks = 20,
     main = "Bootstrap Sampling Distribution of mu_LN",
     xlab = "Bootstrap Estimates of mu_LN")

# Add kernel density curve
lines(density(boot_muLN),
      col = "blue",
      lwd = 2)
```

(4). In order to find the asymptotic sampling distribution using the **central limit theorem (CLT)**. That is, when sample size is large, 

$$
\bar{X} \to N \left(\mu_{LN}, \frac{\sigma_{LN}}{\sqrt{n}}\right)
$$

where $\mu_{LN}$ and $\sigma_{LN}$ can be estimated by using the **plug-in principle of MLE**. Based on the asymptotic sampling distribution of **lognormal sample mean** $\bar{X}$, add the asymptotic density curve to the above Bootstrap kernel density curve.

```{r}
# Obtain MLE estimates from original data
mle_out <- lognormal_mle(x)

mu_hat <- mle_out[1]
sigma2_hat <- mle_out[2]
muLN_hat <- mle_out[3]

# Plug-in estimate of lognormal variance
sigma2LN_hat <- exp(2 * mu_hat + sigma2_hat) * (exp(sigma2_hat) - 1)

# Standard deviation of lognormal distribution
sigmaLN_hat <- sqrt(sigma2LN_hat)

# Standard error of the sample mean using CLT
asym_se <- sigmaLN_hat / sqrt(n)

# Create grid of values for asymptotic density
grid_x <- seq(min(boot_muLN), max(boot_muLN), length.out = 400)

# Compute asymptotic normal density
asym_density <- dnorm(grid_x,
                      mean = muLN_hat,
                      sd = asym_se)

# Plot bootstrap distribution again
hist(boot_muLN,
     probability = TRUE,
     breaks = 20,
     main = "Bootstrap vs Asymptotic Sampling Distribution",
     xlab = "mu_LN")

# Add bootstrap kernel density
lines(density(boot_muLN),
      col = "blue",
      lwd = 2)

# Add asymptotic normal curve
lines(grid_x,
      asym_density,
      col = "red",
      lwd = 2,
      lty = 2)

# Add legend
legend("topright",
       legend = c("Bootstrap KDE", "Asymptotic Normal"),
       col = c("blue", "red"),
       lwd = 2,
       lty = c(1,2))
```

(5). Write a short summary to compare the two sampling distributions of the sample mean of the lognormal distribution.

**Answer:** The bootstrap distribution is obtained by resampling the data, so it reflects the shape of the data more directly. The asymptotic distribution is based on the Central Limit Theorem and assumes a Normal approximation. If the two curves are similar, the Normal approximation works well. If they differ, the bootstrap distribution gives a better description of the estimator’s sampling behavior.

## **Question 4**: Extra Bonus Credit

Derive the Hessian matrix by computing the second-order partial derivatives of the log-likelihood function (with respect to $\mu$ and $\sigma^2$). Then, translate the derived Hessian into an R function that takes the MLEs of the parameters and the data as inputs and returns the Hessian matrix. Finally, compare your analytically derived Hessian with the numerical Hessian returned by `optim()`.

Let $\beta=\sigma^2$.

The Hessian matrix of the log-likelihood is obtained from the second derivatives.

$$
\frac{\partial^2 \ell}{\partial \mu^2}=-\frac{n}{\beta}
$$

$$
\frac{\partial^2 \ell}{\partial \mu \partial\beta}=-\frac{1}{\beta^2}\sum_{i=1}^n(\ln x_i-\mu)
$$

$$
\frac{\partial^2 \ell}{\partial \beta^2}=\frac{n}{2\beta^2}-\frac{1}{\beta^3}\sum_{i=1}^n(\ln x_i-\mu)^2
$$

Thus the Hessian matrix is

$$
H(\mu,\beta)  =\begin{bmatrix}-\frac{n}{\beta}&-\frac{1}{\beta^2}\sum_{i=1}^n(\ln x_i-\mu)\\-\frac{1}{\beta^2}\sum_{i=1}^n(\ln x_i-\mu)&\frac{n}{2\beta^2}-\frac{1}{\beta^3}\sum_{i=1}^n(\ln x_i-\mu)^2\end{bmatrix}.
$$

```{r}
# Function to compute analytic Hessian matrix of the log-likelihood
lognormal_hessian <- function(mu, sigma2, x){

  # Log transformed data
  y <- log(x)

  # Sample size
  n <- length(x)

  # Intermediate sums
  s1 <- sum(y - mu)
  s2 <- sum((y - mu)^2)

  # Second derivative with respect to mu
  h11 <- -n / sigma2

  # Cross derivative
  h12 <- -s1 / (sigma2^2)

  # Symmetry of Hessian
  h21 <- h12

  # Second derivative with respect to sigma^2
  h22 <- n / (2 * sigma2^2) - s2 / (sigma2^3)

  # Construct Hessian matrix
  matrix(c(h11, h12,
           h21, h22),
         nrow = 2,
         byrow = TRUE)
}
```

Compare with numerical Hessian from optim ()

```{r}
# Negative log-likelihood for optim()
negloglik_lognormal <- function(par, x){
  mu <- par[1]
  sigma2 <- par[2]
  if(sigma2 <= 0) return(Inf)
  
  n <- length(x)
  y <- log(x)
  
  ll <- -n/2 * log(2*pi) - n/2 * log(sigma2) - sum(log(x)) -
    (1/(2*sigma2)) * sum((y - mu)^2)
  
  return(-ll)
}

# Gradient of negative log-likelihood
neggrad_lognormal <- function(par, x){
  mu <- par[1]
  sigma2 <- par[2]
  y <- log(x)
  n <- length(x)
  
  dmu <- -(1/sigma2) * sum(y - mu)
  dsigma2 <- n/(2*sigma2) - sum((y - mu)^2)/(2 * sigma2^2)
  
  return(c(dmu, dsigma2))
}

# optim
init <- c(mean(log(x)), var(log(x)))
fit <- optim(par = init,
             fn = negloglik_lognormal,
             gr = neggrad_lognormal,
             x = x,
             method = "L-BFGS-B",
             lower = c(-Inf, 1e-8),
             hessian = TRUE)

mu_hat_opt <- fit$par[1]
sigma2_hat_opt <- fit$par[2]

# Analytic Hessian of log-likelihood
H_analytic <- lognormal_hessian(mu_hat_opt, sigma2_hat_opt, x)

# Numerical Hessian from optim() is for negative log-likelihood
H_numeric <- fit$hessian

H_analytic
-H_analytic
H_numeric
```
The analytic Hessian and the numerical Hessian from optim() should be very close, except for small numerical differences due to finite-difference approximation. Since optim() is minimizing the negative log-likelihood, its Hessian should match the negative of the analytic Hessian of the log-likelihood.