Final Exam Guidelines

  • Coverage: The major concepts and inference procedures—such as sampling distributions, confidence intervals, and hypothesis testing—are covered and implemented using both classical parametric likelihood-based methods and modern non-parametric approaches, including the bootstrap and kernel density estimation.

  • Part A requires derivation of selected likelihood-based functions for performing various types of inference, with sufficient detail to enable translation of these derivations into code for numerical analysis.

  • Your code for the problems in Part B must align with your derivations in Part A and be well commented where necessary.

  • In Part B, all numerical results must be interpreted from a practical perspective.


Policies of Using AI Tools

  • Policy on AI Tool Use: Students must adhere to the AI tool policy specified in the course syllabus. The direct copying of AI-generated content is strictly prohibited. All submitted work must reflect your own understanding; where external tools are consulted, content must be thoroughly rephrased and synthesized in your own words.

  • Code Inclusion Requirement: Any code included in your essay must be properly commented to explain the purpose and/or expected output of key code lines. Submitting AI-generated code without meaningful, student-added comments will not be accepted.


Working Model for the Final Exam

Caution: Please follow the suggested expressions and guided steps to complete the exam. Other approaches such as transformation for trivialize the problems that will not meet the exam objectives.

The Kumaraswamy distribution is a two-parameter continuous probability distribution defined on the interval (0, 1). It is often used as an alternative to the Beta distribution due to its simple closed-form expressions for the cumulative distribution function (CDF) and quantile function. It is commonly used in

  • Hydrology: Modeling rainfall, streamflow, or other bounded natural phenomena

  • Economics: Income shares, proportions, or bounded indices

  • Monte Carlo simulation: Efficient random variate generation (via inverse transform)

  • Machine learning: Output layer for bounded targets, prior distributions in Bayesian models

  • Reliability engineering: Modeling failure rates of systems with bounded lifetimes


Let \(X\) be the Kumaraswamy random variable with Cumulative Distribution Function (CDF)

\[ F(x; a, b) = 1 - (1 - x^a)^b \]

where \(a > 0\) and \(b > 0\) unknown parameters and \(0 < x < 1\).

The following are two special case of the Kumaraswamy distribution:

  1. Uniform Distribution: When \(a = 1\) and \(b = 1\), the Kumaraswamy distribution becomes a uniform distribution over \([0, 1]\) with CDF \(F(x) = x\).

  2. Power Distribution: when \(b = 1\) and \(a > 0\), the Kumaraswamy distribution becomes a power distribution over \([0, 1]\) with CDF \(F(x) = x^a\).

This final exam focuses on inferences of Kumaraswamy distribution and related data analysis.

Part A: Methodological Derivations


Problem A1:

Show that the density function of the Kumaraswamy distribution is

\[ f(x; a, b) = ab \, x^{a-1} (1 - x^a)^{b-1}. \]

We take the derivative of the CDF with respect to \(x\) to obtain the density function.

\[ f(x;a,b)=\frac{d}{dx}\left[1-(1-x^a)^b\right] \]

Apply the chain rule:

\[ = -b(1-x^a)^{b-1}(-ax^{a-1}) \]

Simplifying the expression:

\[ = abx^{a-1}(1-x^a)^{b-1}, \quad 0<x<1,\ a>0,\ b>0. \]

This matches the given density function.

Problem A2:

Let \(\{x_1, x_2, \cdots, x_n \}\) be an i.i.d. random sample taken from a population that follows the above 2-parameter Kumaraswamy distribution. Write out the loglikelihood function of \(a\) and \(b\), denoted by \(\ell(a,b)\), based on the above random sample and derive the gradient vector \([\ell_a^\prime(a,b), \ell_b^\prime(a,b)]\), the first order partial derivative of the log-likelihood with respect to parameters \(a\) and \(b\).


Using the density from Problem A1,

\[ f(x_i;a,b)=abx_i^{a-1}(1-x_i^a)^{b-1} \]

Since the sample is i.i.d., the likelihood function is

\[ L(a,b)=\prod_{i=1}^{n}abx_i^{a-1}(1-x_i^a)^{b-1} \]

Taking the natural log gives the log-likelihood function

\[ \ell(a,b)=\sum_{i=1}^{n} \left[ \log(a)+\log(b)+(a-1)\log(x_i)+(b-1)\log(1-x_i^a) \right] \]

Simplifying,

\[ \ell(a,b)= n\log(a)+n\log(b) +(a-1)\sum_{i=1}^{n}\log(x_i) +(b-1)\sum_{i=1}^{n}\log(1-x_i^a) \]

Differentiate with respect to \(a\),

\[ \ell_a'(a,b) = \frac{n}{a} + \sum_{i=1}^{n}\log(x_i) - (b-1)\sum_{i=1}^{n} \frac{x_i^a\log(x_i)}{1-x_i^a} \]

Differentiate with respect to \(b\),

\[ \ell_b'(a,b) = \frac{n}{b} + \sum_{i=1}^{n}\log(1-x_i^a) \]

The gradient vector is

\[ \left[ \ell_a'(a,b),\ell_b'(a,b) \right] = \left[ \frac{n}{a} + \sum_{i=1}^{n}\log(x_i) - (b-1)\sum_{i=1}^{n} \frac{x_i^a\log(x_i)}{1-x_i^a}, \frac{n}{b} + \sum_{i=1}^{n}\log(1-x_i^a) \right]. \]

This gives the required gradient vector for the log-likelihood with respect to \(a\) and \(b\).

Problem A3:

Based on the gradients functions obtained in the above problem A2, derive the observed Fisher Information matrix (i.e, the negative Hessian Matrix).


The observed Fisher Information matrix comes from the negative Hessian matrix, so I use the second partial derivatives of the log-likelihood.

\[ I(a,b)=-H(a,b) \]

Using the gradient functions from Problem A2,

\[ \ell_a'(a,b) = \frac{n}{a} + \sum_{i=1}^{n}\log(x_i) - (b-1)\sum_{i=1}^{n} \frac{x_i^a\log(x_i)}{1-x_i^a} \]

\[ \ell_b'(a,b) = \frac{n}{b} + \sum_{i=1}^{n}\log(1-x_i^a) \]

Differentiate \(\ell_a'(a,b)\) with respect to \(a\),

\[ \ell_{aa}''(a,b) = -\frac{n}{a^2} - (b-1)\sum_{i=1}^{n} \frac{x_i^a[\log(x_i)]^2}{(1-x_i^a)^2} \]

Differentiate \(\ell_a'(a,b)\) with respect to \(b\),

\[ \ell_{ab}''(a,b) = -\sum_{i=1}^{n} \frac{x_i^a\log(x_i)}{1-x_i^a} \]

Differentiate \(\ell_b'(a,b)\) with respect to \(b\),

\[ \ell_{bb}''(a,b) = -\frac{n}{b^2} \]

The Hessian matrix is

\[ H(a,b)= \begin{bmatrix} \ell_{aa}''(a,b) & \ell_{ab}''(a,b) \\ \ell_{ba}''(a,b) & \ell_{bb}''(a,b) \end{bmatrix} \]

Since the observed Fisher Information matrix is the negative Hessian,

\[ I(a,b)= \begin{bmatrix} \frac{n}{a^2} + (b-1)\sum_{i=1}^{n} \frac{x_i^a[\log(x_i)]^2}{(1-x_i^a)^2} & \sum_{i=1}^{n} \frac{x_i^a\log(x_i)}{1-x_i^a} \\ \sum_{i=1}^{n} \frac{x_i^a\log(x_i)}{1-x_i^a} & \frac{n}{b^2} \end{bmatrix}. \]

This gives the observed Fisher Information matrix based on the second partial derivatives from Problem A2.

Problem A4:

Consider power distribution \(F(x) = x^a, (a >0 \quad \text{ and }\quad x \in (0,1))\), a special case of the Kumaraswamy distribution with \(b = 1\), and a random sample from this distribution \(\{ x_1, x_2, \cdots, x_n\}\). Derive the MLE and MME of \(a\) respectively. [Hint: To find the MME, you need to compute the moment of the power distribution; that is, \(E[X^k] = \int_0^1 x^k F'(x) dx\). Note that both the MLE and the MME have closed-form expressions.]


Since this is the special case with \(b=1\), the power distribution has CDF

\[ F(x)=x^a \]

Taking the derivative gives the density

\[ f(x)=F'(x)=ax^{a-1}, \quad 0<x<1,\ a>0. \]

For the MLE, I construct the likelihood using the i.i.d. sample

\[ L(a)=\prod_{i=1}^{n} ax_i^{a-1} \]

Taking the natural log to simplify the product,

\[ \ell(a)=n\log(a)+(a-1)\sum_{i=1}^{n}\log(x_i) \]

Differentiate with respect to \(a\) to obtain the score function,

\[ \ell'(a)=\frac{n}{a}+\sum_{i=1}^{n}\log(x_i) \]

Set the derivative equal to zero and solve for \(a\),

\[ \frac{n}{a}+\sum_{i=1}^{n}\log(x_i)=0 \]

\[ \hat{a}_{MLE}= -\frac{n}{\sum_{i=1}^{n}\log(x_i)} \]

For the MME, I compute the first moment using the definition from the hint.

\[ E[X]=\int_0^1 x f(x)\,dx \]

Substituting the density,

\[ E[X]=\int_0^1 x(ax^{a-1})\,dx = a\int_0^1 x^a\,dx = \frac{a}{a+1} \]

Set the population moment equal to the sample mean,

\[ \bar{x}=\frac{a}{a+1} \]

Solve for \(a\),

\[ \bar{x}(a+1)=a \]

\[ \bar{x}a+\bar{x}=a \]

\[ \bar{x}=a(1-\bar{x}) \]

\[ \tilde{a}_{MME}=\frac{\bar{x}}{1-\bar{x}} \]

Therefore, the MLE is \(\hat{a}_{MLE}=-\frac{n}{\sum_{i=1}^{n}\log(x_i)}\), and the MME is \(\tilde{a}_{MME}=\frac{\bar{x}}{1-\bar{x}}\).

Problem A5:

Using the same setting as in Problem A4, find the asymptotic (Wald) confidence interval for \(a\). [Hint: Compute the Fisher information for \(a\), then take its reciprocal to obtain the variance.]


Using the MLE from Problem A4,

\[ \hat{a}_{MLE}=-\frac{n}{\sum_{i=1}^{n}\log(x_i)} \]

The log-likelihood from Problem A4 is

\[ \ell(a)=n\log(a)+(a-1)\sum_{i=1}^{n}\log(x_i) \]

Differentiate a second time to obtain the Hessian,

\[ \ell''(a)=-\frac{n}{a^2} \]

The Fisher information is the negative second derivative,

\[ I(a)=-\ell''(a)=\frac{n}{a^2} \]

Using the hint, the approximate variance is the reciprocal of the Fisher information evaluated at \(\hat{a}\),

\[ Var(\hat{a})\approx \frac{1}{I(\hat{a})} = \frac{\hat{a}^2}{n} \]

Therefore, the standard error is

\[ SE(\hat{a})=\frac{\hat{a}}{\sqrt{n}} \]

The Wald confidence interval for \(a\) is

\[ \hat{a}\pm z_{\alpha/2}\frac{\hat{a}}{\sqrt{n}} \]

For a 95% confidence interval,

\[ \hat{a}\pm 1.96\frac{\hat{a}}{\sqrt{n}}. \]

Problem A6:

Using the same setting as in Problem A4, perform a likelihood ratio test for the hypothesis \(H_0 :a=1\) (i.e., the power distribution reduces to a uniform distribution). [Hint: Evaluate the log-likelihood function at the maximum likelihood estimate \(\hat{a}\) and at \(a=1\), then use these values to construct the LRT test statistic.]


Using the same log-likelihood from Problem A4,

\[ \ell(a)=n\log(a)+(a-1)\sum_{i=1}^{n}\log(x_i) \]

The hypotheses are

\[ H_0:a=1 \]

and

\[ H_A:a\neq 1. \]

Evaluate the log-likelihood at the MLE,

\[ \ell(\hat{a})=n\log(\hat{a})+(\hat{a}-1)\sum_{i=1}^{n}\log(x_i) \]

Under the null hypothesis, \(a=1\), so

\[ \ell(1)=n\log(1)+(1-1)\sum_{i=1}^{n}\log(x_i)=0 \]

The likelihood ratio test statistic is

\[ G^2=2\left[\ell(\hat{a})-\ell(1)\right] \]

Substituting the log-likelihood values,

\[ G^2=2\left[n\log(\hat{a})+(\hat{a}-1)\sum_{i=1}^{n}\log(x_i)\right] \]

Since one parameter is fixed under \(H_0\),

\[ G^2\sim \chi^2_1. \]

Reject \(H_0\) if

\[ G^2>\chi^2_{1,1-\alpha}. \]

This tests whether the power distribution can be reduced to the uniform distribution when \(a=1\).

Part B: Numerical Analysis

All code must be well commented and adhere to best coding practices

Working Dataset: A small reservoir supplies water to a town. During the dry season (50 days), engineers record the fraction of usable storage filled each morning. Values near 0 mean the reservoir is nearly empty; values near 1 mean it’s full. The distribution tends to be right‑skewed (mostly low levels due to drought) but with occasional replenishment.

The following 50 data points (ordered for clarity) represent the daily proportion of usable storage:

0.12, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.22,
0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.30, 0.31, 0.32,
0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.40, 0.41, 0.42,
0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.50, 0.51, 0.52,
0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.60, 0.61, 0.78


Problem B1:

Fit the Kumaraswamy distribution to the above data. Use the derivations in Problem A2 to find the MLE of \(a\) and \(b\). Please copy the key formulas before coding.


Using the log-likelihood from Problem A2,

\[ \ell(a,b)= n\log(a)+n\log(b) +(a-1)\sum_{i=1}^{n}\log(x_i) +(b-1)\sum_{i=1}^{n}\log(1-x_i^a) \]

The MLEs of \(a\) and \(b\) are found by maximizing this log-likelihood.

# Input the reservoir storage data from the problem
reservoir <- c(
  0.12, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.22,
  0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.30, 0.31, 0.32,
  0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.40, 0.41, 0.42,
  0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.50, 0.51, 0.52,
  0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.60, 0.61, 0.78
)

# Store sample size for use in likelihood expressions
n <- length(reservoir)
# Define the negative log-likelihood from Problem A2
neg_loglik_kumar <- function(par, x) {
  a <- par[1]
  b <- par[2]
  
  # Enforce parameter constraints (a > 0, b > 0)
  if (a <= 0 || b <= 0) return(Inf)
  
  # Negative log-likelihood (used for minimization)
  -(
    length(x) * log(a) +
    length(x) * log(b) +
    (a - 1) * sum(log(x)) +
    (b - 1) * sum(log(1 - x^a))
  )
}

# Perform numerical maximization of the log-likelihood
fit_kumar <- optim(
  par = c(1, 1),                  # initial guesses
  fn = neg_loglik_kumar,
  x = reservoir,
  method = "L-BFGS-B",
  lower = c(0.0001, 0.0001)       # enforce positivity
)

# Extract MLE estimates
a_hat <- fit_kumar$par[1]
b_hat <- fit_kumar$par[2]

# Display results
a_hat
[1] 2.529602
b_hat
[1] 7.883394

The maximum likelihood estimates are \(\hat{a} = 2.5296020115006\) and \(\hat{b} = 7.88339354060574\).

Problem B2:

Fit the power distribution to the above data using the derived of \(a\) obtained in Problem A4 to test the following hypothesis using likelihood ratio procedure at significance level \(\alpha = 0.05\):

\[ H_0: b = 1 \quad \text{ versus } \quad H_a: b \ne 1. \]

State the statistical decision clearly. What is the practical implication of the testing result?


Using the power distribution restriction \(b=1\), the MLE of \(a\) from Problem A4 is

\[ \hat{a}_{0}=-\frac{n}{\sum_{i=1}^{n}\log(x_i)} \]

# Estimate a under the restricted model where b = 1 (power distribution case)
a_hat_power <- -n / sum(log(reservoir))

# Evaluate the log-likelihood using the full Kumaraswamy model (from B1 estimates)
loglik_full <- -neg_loglik_kumar(c(a_hat, b_hat), reservoir)

# Evaluate the log-likelihood under the restricted model where b = 1
loglik_restricted <- -neg_loglik_kumar(c(a_hat_power, 1), reservoir)

# Construct the likelihood ratio test statistic comparing the two models
G2 <- 2 * (loglik_full - loglik_restricted)

# Since one parameter is restricted, the test follows a chi-square distribution with 1 degree of freedom
p_value <- 1 - pchisq(G2, df = 1)
critical_value <- qchisq(0.95, df = 1)

# Display results so I can interpret the test
a_hat_power
[1] 0.9393197
loglik_full
[1] 24.56271
loglik_restricted
[1] 0.1000435
G2
[1] 48.92533
critical_value
[1] 3.841459
p_value
[1] 2.658984e-12

The likelihood ratio test gives \(G^2 = 48.92533\) with a p-value of \(2.658984 \times 10^{-12}\). Since the p-value is less than \(0.05\), I reject \(H_0\). This indicates that \(b \neq 1\), so the power distribution does not fit the data well, and the Kumaraswamy model is more appropriate.

Problem B3:

Use the procedure and code from Problem B1 to estimate the MLEs of \(a\) and \(b\), and then complete the following analyses:

(1). Obtain the bootstrap sampling distributions of \(\hat{a}\) and \(\hat{b}\) and plot each distribution using Gaussian kernel density curves.

Using the procedure from Problem B1, the Kumaraswamy MLEs are \(\hat{a} = 2.5296020115006\) and \(\hat{b} = 7.88339354060574\). These estimates are used as the starting values when refitting the model to each bootstrap sample.

# Set seed so the bootstrap results can be repeated
set.seed(506)

# Choose the number of bootstrap samples
B <- 1000

# Store the bootstrap estimates of a and b
boot_a <- numeric(B)
boot_b <- numeric(B)

# Refit the Kumaraswamy model to each bootstrap sample
for (i in 1:B) {
  boot_sample <- sample(reservoir, size = n, replace = TRUE)
  
  boot_fit <- optim(
    par = c(a_hat, b_hat),
    fn = neg_loglik_kumar,
    x = boot_sample,
    method = "L-BFGS-B",
    lower = c(0.0001, 0.0001)
  )
  
  boot_a[i] <- boot_fit$par[1]
  boot_b[i] <- boot_fit$par[2]
}
# Plot the bootstrap sampling distribution of a-hat
plot(
  density(boot_a, kernel = "gaussian"),
  main = "Bootstrap Sampling Distribution of a-hat",
  xlab = "Bootstrap estimates of a",
  ylab = "Density"
)

# Plot the bootstrap sampling distribution of b-hat
plot(
  density(boot_b, kernel = "gaussian"),
  main = "Bootstrap Sampling Distribution of b-hat",
  xlab = "Bootstrap estimates of b",
  ylab = "Density"
)

(2). Construct both the \(95\%\) bootstrap confidence interval and the Wald confidence interval for \(b\). Do these intervals agree with the results obtained in Problem B2? [Compute the standard error of \(\hat{b}\) using the observed Fisher information matrix, i.e., the inverse of the negative Hessian obtained from optim()]

For \(b\), the bootstrap confidence interval is obtained from the bootstrap estimates of \(\hat{b}\), while the Wald confidence interval is constructed using the standard error from the inverse of the observed Fisher information matrix.

# Use the bootstrap estimates from B3(1) to get the 95% percentile interval for b
boot_ci_b <- quantile(boot_b, probs = c(0.025, 0.975))

# Refit the model one more time so I can extract the Hessian matrix from optim()
fit_kumar_hessian <- optim(
  par = c(a_hat, b_hat),
  fn = neg_loglik_kumar,
  x = reservoir,
  method = "L-BFGS-B",
  lower = c(0.0001, 0.0001),
  hessian = TRUE
)

# Take the inverse of the Hessian to get the estimated variance-covariance matrix
cov_matrix <- solve(fit_kumar_hessian$hessian)

# Pull out the standard error for b-hat (second parameter)
se_b <- sqrt(cov_matrix[2, 2])

# Build the Wald confidence interval using b-hat ± 1.96 * SE
wald_ci_b <- b_hat + c(-1, 1) * 1.96 * se_b

# Print everything so I can compare the two intervals and relate back to B2
boot_ci_b
     2.5%     97.5% 
 4.845854 15.541923 
se_b
[1] 2.244614
wald_ci_b
[1]  3.483951 12.282836

The 95% bootstrap confidence interval for \(b\) is \((4.845854, 15.541923)\), and the 95% Wald confidence interval is \((3.483951, 12.282836)\). The standard error of \(\hat{b}\) obtained from the inverse Hessian is \(2.244614\). Both intervals are entirely above \(1\), so they agree with the result from Problem B2 where \(H_0:b=1\) was rejected. This supports the conclusion that \(b \neq 1\), meaning the power distribution does not fit the data as well as the full Kumaraswamy model.

(3). Based on the bootstrap sampling distributions from part (1) of this problem, assess whether the validity of the Wald confidence interval is supported.


The bootstrap sampling distribution for \(\hat{b}\) is right-skewed, so the normal approximation used for the Wald confidence interval is not strongly supported. However, both the bootstrap and Wald confidence intervals are entirely above \(1\), so they still agree with the decision from Problem B2 that \(H_0:b=1\) should be rejected. Therefore, the Wald interval gives the same conclusion, but the bootstrap interval is more appropriate because it reflects the skewness in the sampling distribution.

Problem B4:

In the introduction to the working model for this exam, the Kumaraswamy distribution reduces to the uniform distribution on (0,1). In this problem, we perform a likelihood ratio test for the following hypothesis to assess whether the data come from the uniform distribution on (0,1):

\[ H_0: a = 1\quad \& \quad b = 1\quad \text{ versus } \quad H_a: a \ne 1 \quad \text{or} \quad b \ne 1 \quad \text{or}\quad (a \ne 1 \quad \& \quad b \ne 1). \]

Provide a practical interpretation of the above test result. [Hint: \(H_a\) basically says that there is no constraints for \(a\) and \(b\). Please review the lecture note for module 11 on the likelihood ratio test before coding.]


# Use the unrestricted Kumaraswamy model from Problem B1
loglik_full <- -neg_loglik_kumar(c(a_hat, b_hat), reservoir)

# Use the restricted uniform model where a = 1 and b = 1
loglik_restricted <- -neg_loglik_kumar(c(1, 1), reservoir)

# Build the likelihood ratio test statistic
G2_uniform <- 2 * (loglik_full - loglik_restricted)

# Two parameters are fixed under H0, so the test uses 2 degrees of freedom
critical_value_uniform <- qchisq(0.95, df = 2)
p_value_uniform <- 1 - pchisq(G2_uniform, df = 2)

# Print the values needed for the statistical decision
loglik_full
[1] 24.56271
loglik_restricted
[1] 0
G2_uniform
[1] 49.12542
critical_value_uniform
[1] 5.991465
p_value_uniform
[1] 2.150558e-11

Practically, this result suggests that the reservoir storage data are not spread evenly across the interval \((0,1)\), so the Uniform\((0,1)\) model does not describe the data well. The likelihood ratio statistic was \(G^2 = 49.12542\) with p-value \(= 2.150558 \times 10^{-11}\), which gives very strong evidence against \(H_0:a=1,\ b=1\). When both parameters are allowed to vary freely, the Kumaraswamy model fits the data much better than the restricted uniform model.

Note: Please download the template and insert your work into it to complete the exam.

---
title: "STA 506 Final Examination"
author: "Kayla Dyer"
date: " Due: May 5, 2026 "
output:
  html_document: 
    toc: yes
    toc_depth: 4
    toc_float: yes
    number_sections: no
    toc_collapsed: yes
    code_folding: hide
    code_download: yes
    smooth_scroll: yes
    highlight: monochrome
    theme: spacelab
  pdf_document: 
    toc: yes
    toc_depth: 4
    fig_caption: yes
    number_sections: no
    fig_width: 3
    fig_height: 3
  word_document: 
    toc: yes
    toc_depth: 4
    fig_caption: yes
    keep_md: yes
editor_options: 
  chunk_output_type: inline
---

```{css, echo = FALSE}
#TOC::before {
  content: "Table of Contents";
  font-weight: bold;
  font-size: 1.2em;
  display: block;
  color: navy;
  margin-bottom: 10px;
}


div#TOC li {     /* table of content  */
    list-style:upper-roman;
    background-image:none;
    background-repeat:none;
    background-position:0;
}

h1.title {    /* level 1 header of title  */
  font-size: 22px;
  font-weight: bold;
  color: DarkRed;
  text-align: center;
  font-family: "Gill Sans", sans-serif;
}

h4.author { /* Header 4 - and the author and data headers use this too  */
  font-size: 15px;
  font-weight: bold;
  font-family: system-ui;
  color: navy;
  text-align: center;
}

h4.date { /* Header 4 - and the author and data headers use this too  */
  font-size: 18px;
  font-weight: bold;
  font-family: "Gill Sans", sans-serif;
  color: DarkBlue;
  text-align: center;
}

h1 { /* Header 1 - and the author and data headers use this too  */
    font-size: 20px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: center;
}

h2 { /* Header 2 - and the author and data headers use this too  */
    font-size: 18px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h3 { /* Header 3 - and the author and data headers use this too  */
    font-size: 16px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h4 { /* Header 4 - and the author and data headers use this too  */
    font-size: 14px;
  font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

/* Add dots after numbered headers */
.header-section-number::after {
  content: ".";

body { background-color:white; }

.highlightme { background-color:yellow; }

p { background-color:white; }

}
```

```{r setup, include=FALSE}
# code chunk specifies whether the R code, warnings, and output 
# will be included in the output files.
if (!require("knitr")) {
   install.packages("knitr")
   library(knitr)
}
if (!require("pander")) {
   install.packages("pander")
   library(pander)
}
if (!require("ggplot2")) {
  install.packages("ggplot2")
  library(ggplot2)
}
if (!require("tidyverse")) {
  install.packages("tidyverse")
  library(tidyverse)
}

if (!require("plotly")) {
  install.packages("plotly")
  library(plotly)
}
####
knitr::opts_chunk$set(echo = TRUE,       # include code chunk in the output file
                      warning = FALSE,   # sometimes, you code may produce warning messages,
                                         # you can choose to include the warning messages in
                                         # the output file. 
                      results = TRUE,    # you can also decide whether to include the output
                                         # in the output file.
                      message = FALSE,
                      comment = NA
                      )  
```
 
 \
 
## **Final Exam Guidelines** 

* **Coverage**: The major concepts and inference procedures—such as sampling distributions, confidence intervals, and hypothesis testing—are covered and implemented using both classical parametric likelihood-based methods and modern non-parametric approaches, including the bootstrap and kernel density estimation.

* **Part A** requires derivation of selected likelihood-based functions for performing various types of inference, with sufficient detail to enable translation of these derivations into code for numerical analysis.

* Your code for the problems in **Part B** must align with your derivations in **Part A** and be well commented where necessary.

* In **Part B**, all numerical results must be interpreted from a practical perspective.


\

## **Policies of Using AI Tools**

* **Policy on AI Tool Use**: Students must adhere to the AI tool policy specified in the course syllabus. The direct copying of AI-generated content is strictly prohibited. All submitted work must reflect your own understanding; where external tools are consulted, content must be thoroughly rephrased and synthesized in your own words.

* **Code Inclusion Requirement**: Any code included in your essay must be properly commented to explain the purpose and/or expected output of key code lines. Submitting AI-generated code without meaningful, student-added comments will not be accepted.

\

## **Working Model for the Final Exam**

<font color = "orange">**Caution**: *Please follow the suggested expressions and guided steps to complete the exam. Other approaches such as transformation for trivialize the problems that will not meet the exam objectives.*</font>


The **Kumaraswamy distribution** is a two-parameter continuous probability distribution defined on the interval (0, 1). It is often used as an alternative to the Beta distribution due to its simple closed-form expressions for the cumulative distribution function (CDF) and quantile function. It is commonly used in 

* **Hydrology**: Modeling rainfall, streamflow, or other bounded natural phenomena

* **Economics**: Income shares, proportions, or bounded indices

* **Monte Carlo simulation**: Efficient random variate generation (via inverse transform)

* **Machine learning**: Output layer for bounded targets, prior distributions in Bayesian models

* **Reliability engineering**: Modeling failure rates of systems with bounded lifetimes

\

Let $X$ be the Kumaraswamy random variable with Cumulative Distribution Function (CDF)  

$$
F(x; a, b) = 1 - (1 - x^a)^b
$$

where $a > 0$ and $b > 0$ unknown parameters and $0 < x < 1$. 

The following are two special case of the Kumaraswamy distribution:

1. **Uniform Distribution**: When $a = 1$ and $b = 1$, the Kumaraswamy distribution becomes a uniform distribution over $[0, 1]$ with CDF $F(x) = x$.


2. **Power Distribution**: when $b = 1$ and $a > 0$, the Kumaraswamy distribution becomes a power distribution over $[0, 1]$ with CDF $F(x) = x^a$. 

This final exam focuses on inferences of Kumaraswamy distribution and related data analysis.


## Part A: Methodological Derivations

\

### **Problem A1**: 
Show that the density function of the Kumaraswamy distribution is

$$
f(x; a, b) = ab \, x^{a-1} (1 - x^a)^{b-1}.
$$
\

We take the derivative of the CDF with respect to \(x\) to obtain the density function.

$$
f(x;a,b)=\frac{d}{dx}\left[1-(1-x^a)^b\right]
$$

Apply the chain rule:

$$
= -b(1-x^a)^{b-1}(-ax^{a-1})
$$

Simplifying the expression:

$$
= abx^{a-1}(1-x^a)^{b-1}, \quad 0<x<1,\ a>0,\ b>0.
$$

This matches the given density function.

### **Problem A2**: 
Let $\{x_1, x_2, \cdots, x_n \}$ be an i.i.d. random sample taken from a population that follows the above 2-parameter Kumaraswamy distribution. Write out the loglikelihood function of $a$ and $b$, denoted by $\ell(a,b)$, based on the above random sample and **derive** the gradient vector $[\ell_a^\prime(a,b), \ell_b^\prime(a,b)]$, the first order partial derivative of the log-likelihood with respect to parameters $a$ and $b$.

\

Using the density from Problem A1,

$$
f(x_i;a,b)=abx_i^{a-1}(1-x_i^a)^{b-1}
$$

Since the sample is i.i.d., the likelihood function is

$$
L(a,b)=\prod_{i=1}^{n}abx_i^{a-1}(1-x_i^a)^{b-1}
$$

Taking the natural log gives the log-likelihood function

$$
\ell(a,b)=\sum_{i=1}^{n}
\left[
\log(a)+\log(b)+(a-1)\log(x_i)+(b-1)\log(1-x_i^a)
\right]
$$

Simplifying,

$$
\ell(a,b)=
n\log(a)+n\log(b)
+(a-1)\sum_{i=1}^{n}\log(x_i)
+(b-1)\sum_{i=1}^{n}\log(1-x_i^a)
$$

Differentiate with respect to \(a\),

$$
\ell_a'(a,b)
=
\frac{n}{a}
+
\sum_{i=1}^{n}\log(x_i)
-
(b-1)\sum_{i=1}^{n}
\frac{x_i^a\log(x_i)}{1-x_i^a}
$$

Differentiate with respect to \(b\),

$$
\ell_b'(a,b)
=
\frac{n}{b}
+
\sum_{i=1}^{n}\log(1-x_i^a)
$$

The gradient vector is

$$
\left[
\ell_a'(a,b),\ell_b'(a,b)
\right]
=
\left[
\frac{n}{a}
+
\sum_{i=1}^{n}\log(x_i)
-
(b-1)\sum_{i=1}^{n}
\frac{x_i^a\log(x_i)}{1-x_i^a},
\frac{n}{b}
+
\sum_{i=1}^{n}\log(1-x_i^a)
\right].
$$

This gives the required gradient vector for the log-likelihood with respect to \(a\) and \(b\).


### **Problem A3**: 
Based on the gradients functions obtained in the above problem A2, **derive** the observed Fisher Information matrix (i.e, the negative Hessian Matrix).

\

The observed Fisher Information matrix comes from the negative Hessian matrix, so I use the second partial derivatives of the log-likelihood.

$$
I(a,b)=-H(a,b)
$$

Using the gradient functions from Problem A2,

$$
\ell_a'(a,b)
=
\frac{n}{a}
+
\sum_{i=1}^{n}\log(x_i)
-
(b-1)\sum_{i=1}^{n}
\frac{x_i^a\log(x_i)}{1-x_i^a}
$$

$$
\ell_b'(a,b)
=
\frac{n}{b}
+
\sum_{i=1}^{n}\log(1-x_i^a)
$$

Differentiate \(\ell_a'(a,b)\) with respect to \(a\),

$$
\ell_{aa}''(a,b)
=
-\frac{n}{a^2}
-
(b-1)\sum_{i=1}^{n}
\frac{x_i^a[\log(x_i)]^2}{(1-x_i^a)^2}
$$

Differentiate \(\ell_a'(a,b)\) with respect to \(b\),

$$
\ell_{ab}''(a,b)
=
-\sum_{i=1}^{n}
\frac{x_i^a\log(x_i)}{1-x_i^a}
$$

Differentiate \(\ell_b'(a,b)\) with respect to \(b\),

$$
\ell_{bb}''(a,b)
=
-\frac{n}{b^2}
$$

The Hessian matrix is

$$
H(a,b)=
\begin{bmatrix}
\ell_{aa}''(a,b) & \ell_{ab}''(a,b) \\
\ell_{ba}''(a,b) & \ell_{bb}''(a,b)
\end{bmatrix}
$$

Since the observed Fisher Information matrix is the negative Hessian,

$$
I(a,b)=
\begin{bmatrix}
\frac{n}{a^2}
+
(b-1)\sum_{i=1}^{n}
\frac{x_i^a[\log(x_i)]^2}{(1-x_i^a)^2}
&
\sum_{i=1}^{n}
\frac{x_i^a\log(x_i)}{1-x_i^a}
\\
\sum_{i=1}^{n}
\frac{x_i^a\log(x_i)}{1-x_i^a}
&
\frac{n}{b^2}
\end{bmatrix}.
$$

This gives the observed Fisher Information matrix based on the second partial derivatives from Problem A2.

### **Problem A4**: 

Consider power distribution $F(x) = x^a, (a >0 \quad \text{ and }\quad x \in (0,1))$, a special case of the Kumaraswamy distribution with $b = 1$, and a random sample from this distribution $\{ x_1, x_2, \cdots, x_n\}$. **Derive** the MLE and MME of $a$ respectively. [*Hint: To find the MME, you need to compute the moment of the power distribution; that is, $E[X^k] = \int_0^1 x^k F'(x) dx$. Note that both the MLE and the MME have closed-form expressions.*]

\

Since this is the special case with \(b=1\), the power distribution has CDF

$$
F(x)=x^a
$$

Taking the derivative gives the density

$$
f(x)=F'(x)=ax^{a-1}, \quad 0<x<1,\ a>0.
$$

For the MLE, I construct the likelihood using the i.i.d. sample

$$
L(a)=\prod_{i=1}^{n} ax_i^{a-1}
$$

Taking the natural log to simplify the product,

$$
\ell(a)=n\log(a)+(a-1)\sum_{i=1}^{n}\log(x_i)
$$

Differentiate with respect to \(a\) to obtain the score function,

$$
\ell'(a)=\frac{n}{a}+\sum_{i=1}^{n}\log(x_i)
$$

Set the derivative equal to zero and solve for \(a\),

$$
\frac{n}{a}+\sum_{i=1}^{n}\log(x_i)=0
$$

$$
\hat{a}_{MLE}=
-\frac{n}{\sum_{i=1}^{n}\log(x_i)}
$$

For the MME, I compute the first moment using the definition from the hint.

$$
E[X]=\int_0^1 x f(x)\,dx
$$

Substituting the density,

$$
E[X]=\int_0^1 x(ax^{a-1})\,dx
=
a\int_0^1 x^a\,dx
=
\frac{a}{a+1}
$$

Set the population moment equal to the sample mean,

$$
\bar{x}=\frac{a}{a+1}
$$

Solve for \(a\),

$$
\bar{x}(a+1)=a
$$

$$
\bar{x}a+\bar{x}=a
$$

$$
\bar{x}=a(1-\bar{x})
$$

$$
\tilde{a}_{MME}=\frac{\bar{x}}{1-\bar{x}}
$$

Therefore, the MLE is \(\hat{a}_{MLE}=-\frac{n}{\sum_{i=1}^{n}\log(x_i)}\), and the MME is \(\tilde{a}_{MME}=\frac{\bar{x}}{1-\bar{x}}\).


### **Problem A5**:

Using the same setting as in **Problem A4**, find the asymptotic (Wald) confidence interval for $a$. [*Hint: Compute the Fisher information for $a$, then take its reciprocal to obtain the variance*.]

\

Using the MLE from Problem A4,

$$
\hat{a}_{MLE}=-\frac{n}{\sum_{i=1}^{n}\log(x_i)}
$$

The log-likelihood from Problem A4 is

$$
\ell(a)=n\log(a)+(a-1)\sum_{i=1}^{n}\log(x_i)
$$

Differentiate a second time to obtain the Hessian,

$$
\ell''(a)=-\frac{n}{a^2}
$$

The Fisher information is the negative second derivative,

$$
I(a)=-\ell''(a)=\frac{n}{a^2}
$$

Using the hint, the approximate variance is the reciprocal of the Fisher information evaluated at \(\hat{a}\),

$$
Var(\hat{a})\approx \frac{1}{I(\hat{a})}
=
\frac{\hat{a}^2}{n}
$$

Therefore, the standard error is

$$
SE(\hat{a})=\frac{\hat{a}}{\sqrt{n}}
$$

The Wald confidence interval for \(a\) is

$$
\hat{a}\pm z_{\alpha/2}\frac{\hat{a}}{\sqrt{n}}
$$

For a 95% confidence interval,

$$
\hat{a}\pm 1.96\frac{\hat{a}}{\sqrt{n}}.
$$

### **Problem A6**:

Using the same setting as in **Problem A4**, perform a likelihood ratio test for the hypothesis $H_0 :a=1$ (i.e., the power distribution reduces to a uniform distribution). [*Hint: Evaluate the log-likelihood function at the maximum likelihood estimate $\hat{a}$ and at $a=1$, then use these values to construct the LRT test statistic.*]

\

Using the same log-likelihood from Problem A4,

$$
\ell(a)=n\log(a)+(a-1)\sum_{i=1}^{n}\log(x_i)
$$

The hypotheses are

$$
H_0:a=1
$$

and

$$
H_A:a\neq 1.
$$

Evaluate the log-likelihood at the MLE,

$$
\ell(\hat{a})=n\log(\hat{a})+(\hat{a}-1)\sum_{i=1}^{n}\log(x_i)
$$

Under the null hypothesis, \(a=1\), so

$$
\ell(1)=n\log(1)+(1-1)\sum_{i=1}^{n}\log(x_i)=0
$$

The likelihood ratio test statistic is

$$
G^2=2\left[\ell(\hat{a})-\ell(1)\right]
$$

Substituting the log-likelihood values,

$$
G^2=2\left[n\log(\hat{a})+(\hat{a}-1)\sum_{i=1}^{n}\log(x_i)\right]
$$

Since one parameter is fixed under \(H_0\),

$$
G^2\sim \chi^2_1.
$$

Reject \(H_0\) if

$$
G^2>\chi^2_{1,1-\alpha}.
$$

This tests whether the power distribution can be reduced to the uniform distribution when \(a=1\).

## Part B: Numerical Analysis

**All code must be well commented and adhere to best coding practices**

**Working Dataset**: A small reservoir supplies water to a town. During the dry season (50 days), engineers record the fraction of usable storage filled each morning. Values near 0 mean the reservoir is nearly empty; values near 1 mean it's full. The distribution tends to be right‑skewed (mostly low levels due to drought) but with occasional replenishment.

The following 50 data points (ordered for clarity) represent the daily proportion of usable storage:

```
0.12, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.22,
0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.30, 0.31, 0.32,
0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.40, 0.41, 0.42,
0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.50, 0.51, 0.52,
0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.60, 0.61, 0.78
```

\

### **Problem B1**:

Fit the Kumaraswamy distribution to the above data. Use the derivations in **Problem A2** to find the MLE of $a$ and $b$. Please copy the key formulas before coding.

\

Using the log-likelihood from Problem A2,

$$
\ell(a,b)=
n\log(a)+n\log(b)
+(a-1)\sum_{i=1}^{n}\log(x_i)
+(b-1)\sum_{i=1}^{n}\log(1-x_i^a)
$$

The MLEs of \(a\) and \(b\) are found by maximizing this log-likelihood.

```{r}
# Input the reservoir storage data from the problem
reservoir <- c(
  0.12, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.22,
  0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.30, 0.31, 0.32,
  0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.40, 0.41, 0.42,
  0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.50, 0.51, 0.52,
  0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.60, 0.61, 0.78
)

# Store sample size for use in likelihood expressions
n <- length(reservoir)
```

```{r}
# Define the negative log-likelihood from Problem A2
neg_loglik_kumar <- function(par, x) {
  a <- par[1]
  b <- par[2]
  
  # Enforce parameter constraints (a > 0, b > 0)
  if (a <= 0 || b <= 0) return(Inf)
  
  # Negative log-likelihood (used for minimization)
  -(
    length(x) * log(a) +
    length(x) * log(b) +
    (a - 1) * sum(log(x)) +
    (b - 1) * sum(log(1 - x^a))
  )
}

# Perform numerical maximization of the log-likelihood
fit_kumar <- optim(
  par = c(1, 1),                  # initial guesses
  fn = neg_loglik_kumar,
  x = reservoir,
  method = "L-BFGS-B",
  lower = c(0.0001, 0.0001)       # enforce positivity
)

# Extract MLE estimates
a_hat <- fit_kumar$par[1]
b_hat <- fit_kumar$par[2]

# Display results
a_hat
b_hat
```

The maximum likelihood estimates are $\hat{a} = 2.5296020115006$ and $\hat{b} = 7.88339354060574$.

### **Problem B2**:

Fit the **power distribution** to the above data using the derived  of $a$ obtained in **Problem A4** to test the following hypothesis using likelihood ratio procedure at significance level $\alpha = 0.05$:

$$
H_0: b = 1 \quad \text{ versus } \quad H_a: b \ne 1.
$$

State the statistical decision clearly. What is the practical implication of the testing result?

\

Using the power distribution restriction \(b=1\), the MLE of \(a\) from Problem A4 is

$$
\hat{a}_{0}=-\frac{n}{\sum_{i=1}^{n}\log(x_i)}
$$

```{r}
# Estimate a under the restricted model where b = 1 (power distribution case)
a_hat_power <- -n / sum(log(reservoir))

# Evaluate the log-likelihood using the full Kumaraswamy model (from B1 estimates)
loglik_full <- -neg_loglik_kumar(c(a_hat, b_hat), reservoir)

# Evaluate the log-likelihood under the restricted model where b = 1
loglik_restricted <- -neg_loglik_kumar(c(a_hat_power, 1), reservoir)

# Construct the likelihood ratio test statistic comparing the two models
G2 <- 2 * (loglik_full - loglik_restricted)

# Since one parameter is restricted, the test follows a chi-square distribution with 1 degree of freedom
p_value <- 1 - pchisq(G2, df = 1)
critical_value <- qchisq(0.95, df = 1)

# Display results so I can interpret the test
a_hat_power
loglik_full
loglik_restricted
G2
critical_value
p_value
```

The likelihood ratio test gives $G^2 = 48.92533$ with a p-value of $2.658984 \times 10^{-12}$. Since the p-value is less than $0.05$, I reject $H_0$. This indicates that $b \neq 1$, so the power distribution does not fit the data well, and the Kumaraswamy model is more appropriate.

### **Problem B3**:

Use the procedure and code from **Problem B1** to estimate the MLEs of $a$ and $b$, and then complete the following analyses:

(1). Obtain the bootstrap sampling distributions of $\hat{a}$ and $\hat{b}$ and plot each distribution using **Gaussian kernel density curves**.

Using the procedure from Problem B1, the Kumaraswamy MLEs are $\hat{a} = 2.5296020115006$ and $\hat{b} = 7.88339354060574$. These estimates are used as the starting values when refitting the model to each bootstrap sample.

```{r}
# Set seed so the bootstrap results can be repeated
set.seed(506)

# Choose the number of bootstrap samples
B <- 1000

# Store the bootstrap estimates of a and b
boot_a <- numeric(B)
boot_b <- numeric(B)

# Refit the Kumaraswamy model to each bootstrap sample
for (i in 1:B) {
  boot_sample <- sample(reservoir, size = n, replace = TRUE)
  
  boot_fit <- optim(
    par = c(a_hat, b_hat),
    fn = neg_loglik_kumar,
    x = boot_sample,
    method = "L-BFGS-B",
    lower = c(0.0001, 0.0001)
  )
  
  boot_a[i] <- boot_fit$par[1]
  boot_b[i] <- boot_fit$par[2]
}
```

```{r}
# Plot the bootstrap sampling distribution of a-hat
plot(
  density(boot_a, kernel = "gaussian"),
  main = "Bootstrap Sampling Distribution of a-hat",
  xlab = "Bootstrap estimates of a",
  ylab = "Density"
)
```

```{r}
# Plot the bootstrap sampling distribution of b-hat
plot(
  density(boot_b, kernel = "gaussian"),
  main = "Bootstrap Sampling Distribution of b-hat",
  xlab = "Bootstrap estimates of b",
  ylab = "Density"
)
```

(2).  Construct both the $95\%$ **bootstrap confidence interval** and the **Wald confidence interval** for $b$. Do these intervals agree with the results obtained in **Problem B2**? [*Compute the standard error of $\hat{b}$ using the observed Fisher information matrix, i.e., the inverse of the negative Hessian obtained from optim()*]

For \(b\), the bootstrap confidence interval is obtained from the bootstrap estimates of \(\hat{b}\), while the Wald confidence interval is constructed using the standard error from the inverse of the observed Fisher information matrix.

```{r}
# Use the bootstrap estimates from B3(1) to get the 95% percentile interval for b
boot_ci_b <- quantile(boot_b, probs = c(0.025, 0.975))

# Refit the model one more time so I can extract the Hessian matrix from optim()
fit_kumar_hessian <- optim(
  par = c(a_hat, b_hat),
  fn = neg_loglik_kumar,
  x = reservoir,
  method = "L-BFGS-B",
  lower = c(0.0001, 0.0001),
  hessian = TRUE
)

# Take the inverse of the Hessian to get the estimated variance-covariance matrix
cov_matrix <- solve(fit_kumar_hessian$hessian)

# Pull out the standard error for b-hat (second parameter)
se_b <- sqrt(cov_matrix[2, 2])

# Build the Wald confidence interval using b-hat ± 1.96 * SE
wald_ci_b <- b_hat + c(-1, 1) * 1.96 * se_b

# Print everything so I can compare the two intervals and relate back to B2
boot_ci_b
se_b
wald_ci_b
```

The 95% bootstrap confidence interval for $b$ is $(4.845854, 15.541923)$, and the 95% Wald confidence interval is $(3.483951, 12.282836)$. The standard error of $\hat{b}$ obtained from the inverse Hessian is $2.244614$. Both intervals are entirely above $1$, so they agree with the result from Problem B2 where $H_0:b=1$ was rejected. This supports the conclusion that $b \neq 1$, meaning the power distribution does not fit the data as well as the full Kumaraswamy model.

(3). Based on the bootstrap sampling distributions from part (1) of this problem, assess whether the validity of the Wald confidence interval is supported.

\

The bootstrap sampling distribution for $\hat{b}$ is right-skewed, so the normal approximation used for the Wald confidence interval is not strongly supported. However, both the bootstrap and Wald confidence intervals are entirely above $1$, so they still agree with the decision from Problem B2 that $H_0:b=1$ should be rejected. Therefore, the Wald interval gives the same conclusion, but the bootstrap interval is more appropriate because it reflects the skewness in the sampling distribution.

### **Problem B4**:

In the introduction to the working model for this exam, the Kumaraswamy distribution reduces to the uniform distribution on (0,1). In this problem, we perform a **likelihood ratio test** for the following hypothesis to assess whether the data come from the uniform distribution on (0,1):

$$
H_0: a = 1\quad \& \quad b = 1\quad \text{ versus } \quad H_a: a \ne 1 \quad \text{or} \quad b \ne 1 \quad \text{or}\quad (a \ne 1 \quad \& \quad b \ne 1).
$$

Provide a practical interpretation of the above test result. [*Hint: $H_a$ basically says that there is no constraints for $a$ and $b$. Please review the lecture note for module 11  on the likelihood ratio test before coding.*]

\

```{r}
# Use the unrestricted Kumaraswamy model from Problem B1
loglik_full <- -neg_loglik_kumar(c(a_hat, b_hat), reservoir)

# Use the restricted uniform model where a = 1 and b = 1
loglik_restricted <- -neg_loglik_kumar(c(1, 1), reservoir)

# Build the likelihood ratio test statistic
G2_uniform <- 2 * (loglik_full - loglik_restricted)

# Two parameters are fixed under H0, so the test uses 2 degrees of freedom
critical_value_uniform <- qchisq(0.95, df = 2)
p_value_uniform <- 1 - pchisq(G2_uniform, df = 2)

# Print the values needed for the statistical decision
loglik_full
loglik_restricted
G2_uniform
critical_value_uniform
p_value_uniform
```

Practically, this result suggests that the reservoir storage data are not spread evenly across the interval $(0,1)$, so the Uniform$(0,1)$ model does not describe the data well. The likelihood ratio statistic was $G^2 = 49.12542$ with p-value $= 2.150558 \times 10^{-11}$, which gives very strong evidence against $H_0:a=1,\ b=1$. When both parameters are allowed to vary freely, the Kumaraswamy model fits the data much better than the restricted uniform model.

<font color = "red">**Note**: Please download the template and insert your work into it to complete the exam. </font>




