To estimate the Weibull parameters \(\hat{\lambda}\) (scale) and \(\hat{\beta}\) (shape), we maximize the log-likelihood function of the Weibull distribution. Since optim() minimizes by default, we minimize the negative log-likelihood.
The Weibull density with scale \(\lambda\) and shape \(\beta\) is \(f(t) = \frac{\beta}{\lambda}\left(\frac{t}{\lambda}\right)^{\beta - 1} e^{-(t/\lambda)^{\beta}}\); summing \(\log f(t_i)\) over the \(n\) observations \(t_1, t_2, \ldots, t_n\) gives the log-likelihood:
\[\ell(\lambda, \beta) = n\log(\beta) - n\beta\log(\lambda) + (\beta - 1)\sum_{i=1}^{n}\log(t_i) - \sum_{i=1}^{n}\left(\frac{t_i}{\lambda}\right)^{\beta}\]
The gradient (partial derivatives with respect to each parameter) is:
\[\frac{\partial \ell}{\partial \beta} = \frac{n}{\beta} - n\log(\lambda) + \sum_{i=1}^{n}\log(t_i) - \sum_{i=1}^{n}\left(\frac{t_i}{\lambda}\right)^{\beta}\log\left(\frac{t_i}{\lambda}\right)\]
\[\frac{\partial \ell}{\partial \lambda} = -\frac{n\beta}{\lambda} + \frac{\beta}{\lambda}\sum_{i=1}^{n}\left(\frac{t_i}{\lambda}\right)^{\beta}\]
We provide this gradient explicitly to optim() via the gr argument, which improves numerical accuracy and convergence speed.
# Failure times
t_data <- c(5.2, 7.8, 9.1, 11.3, 12.5, 13.0, 14.2, 15.1, 15.9, 16.7,
17.2, 17.8, 18.4, 18.9, 19.3, 19.7, 20.2, 20.6, 21.0, 21.5,
21.9, 22.3, 22.7, 23.1, 23.5, 23.9, 24.3, 24.7, 25.1, 25.5,
25.9, 26.3, 26.7, 27.1, 27.5, 27.9, 28.3, 28.7, 29.1, 29.5,
29.9, 30.3, 30.7, 31.1, 31.5, 31.9, 32.3, 32.7, 33.1, 33.5,
33.9, 34.3, 34.7, 35.1, 35.5, 35.9, 36.3, 36.7, 37.1, 37.5,
37.9, 38.3, 38.7, 39.1, 39.5, 39.9, 40.3, 40.7, 41.1, 41.5,
41.9, 42.3, 42.7, 43.1, 43.5)
n <- length(t_data)
# Negative log-likelihood for the Weibull distribution
weibull_negloglik <- function(params, t) {
lambda <- params[1]
beta <- params[2]
# Return a large penalty if parameters go outside valid range
if (lambda <= 0 || beta <= 0) return(1e10)
n <- length(t)
ll <- n * log(beta) - n * beta * log(lambda) +
(beta - 1) * sum(log(t)) -
sum((t / lambda)^beta)
return(-ll) # negative because optim() minimizes
}
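As a quick sanity check, the hand-coded negative log-likelihood can be compared against R's built-in dweibull() density, which uses the same shape/scale parameterization; the test point below is an arbitrary choice made only for this check, not an estimate.
# Sanity check: the hand-coded NLL should match dweibull() exactly
test_params <- c(20, 2) # arbitrary valid (lambda, beta) test point
builtin_nll <- -sum(dweibull(t_data, shape = test_params[2],
                             scale = test_params[1], log = TRUE))
all.equal(weibull_negloglik(test_params, t_data), builtin_nll) # should be TRUE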
# --- Gradient of the negative log-likelihood ---
weibull_neggrad <- function(params, t) {
lambda <- params[1]
beta <- params[2]
n <- length(t)
ratio <- t / lambda # t_i / lambda for all i
# Partial derivative w.r.t. beta
d_beta <- -(n / beta - n * log(lambda) + sum(log(t)) -
sum(ratio^beta * log(ratio)))
# Partial derivative w.r.t. lambda
d_lambda <- -(-n * beta / lambda + (beta / lambda) * sum(ratio^beta))
return(c(d_lambda, d_beta))
}
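Before handing this gradient to optim(), it is worth verifying it numerically. The sketch below defines a small helper, num_grad(), written only for this check (it is not part of the model code); its central finite differences should agree with weibull_neggrad() to several decimal places.
# Numerical gradient check via central finite differences
num_grad <- function(f, params, t, h = 1e-6) {
  sapply(seq_along(params), function(j) {
    up <- dn <- params
    up[j] <- up[j] + h
    dn[j] <- dn[j] - h
    (f(up, t) - f(dn, t)) / (2 * h)
  })
}
test_params <- c(20, 2) # arbitrary valid (lambda, beta) test point
weibull_neggrad(test_params, t_data) -
  num_grad(weibull_negloglik, test_params, t_data) # both entries should be ~0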
# Run optim() using the BFGS method
init_params <- c(lambda = mean(t_data), beta = 1)
weibull_fit <- optim(
par = init_params,
fn = weibull_negloglik,
gr = weibull_neggrad,
t = t_data,
method = "BFGS"
)
# Extract MLEs
lambda_hat <- weibull_fit$par[1]
beta_hat <- weibull_fit$par[2]
cat(" lambda-hat (scale):", round(lambda_hat, 4), "\n") lambda-hat (scale): 31.4182
beta-hat (shape): 3.3708
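As an optional cross-check, MASS::fitdistr() fits the same Weibull model by maximum likelihood and should reproduce these estimates; this assumes the MASS package is installed, since it is not used anywhere else in the solution.
# Optional cross-check (assumes the MASS package is installed)
library(MASS)
fitdistr(t_data, densfun = "weibull") # shape ~ beta_hat, scale ~ lambda_hat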
The exponential distribution is a special case of the Weibull distribution where \(\beta = 1\), leaving only the scale parameter \(\lambda\) to estimate. We again maximize the log-likelihood, this time with respect to \(\lambda\) only.
Under the scale parameterization, the exponential density is \(f(t) = \frac{1}{\lambda}e^{-t/\lambda}\), so the log-likelihood for \(n\) observations \(t_1, t_2, \ldots, t_n\) is:
\[\ell(\lambda) = -n\log(\lambda) - \frac{1}{\lambda}\sum_{i=1}^{n}t_i\]
The gradient with respect to \(\lambda\) is:
\[\frac{\partial \ell}{\partial \lambda} = -\frac{n}{\lambda} + \frac{1}{\lambda^2}\sum_{i=1}^{n}t_i\]
As with part (a), we provide the gradient explicitly to
optim() and use the BFGS method so that the gradient
information is utilized during optimization.
# --- Negative log-likelihood for the Exponential distribution ---
# Only one parameter: lambda (scale)
exp_negloglik <- function(params, t) {
lambda <- params[1]
# Penalty for invalid parameter values
if (lambda <= 0) return(1e10)
n <- length(t)
ll <- -n * log(lambda) - (1 / lambda) * sum(t)
return(-ll) # negative because optim() minimizes
}
# --- Gradient of the negative log-likelihood ---
exp_neggrad <- function(params, t) {
lambda <- params[1]
n <- length(t)
d_lambda <- -(-n / lambda + sum(t) / lambda^2)
return(c(d_lambda))
}
# --- Run optim() with BFGS, starting at the sample mean ---
# The sample mean is the analytical MLE, so it is a natural starting point
exp_fit <- optim(
par = c(lambda = mean(t_data)),
fn = exp_negloglik,
gr = exp_neggrad,
t = t_data,
method = "BFGS"
)
# Extract MLE
lambda_hat_exp <- exp_fit$par[1]
cat(" lambda-hat (scale):", round(lambda_hat_exp, 4), "\n") lambda-hat (scale): 28.1853
Using the MLEs obtained in parts (a) and (b), we perform a classical likelihood ratio test to assess whether the exponential model (\(H_0: \beta = 1\)) is adequate compared to the more general Weibull model (\(H_1: \beta \neq 1\)).
The likelihood ratio statistic is defined as:
\[\Lambda_{obs} = -2[\ell_0(\hat{\lambda}_{exp}) - \ell_1(\hat{\lambda}_{weibull}, \hat{\beta}_{weibull})]\]
where \(\ell_0\) is the log-likelihood of the exponential model evaluated at its MLE, and \(\ell_1\) is the log-likelihood of the Weibull model evaluated at its MLEs. Under \(H_0\), \(\Lambda_{obs}\) follows a \(\chi^2\) distribution with 1 degree of freedom, since the Weibull model has one additional parameter (\(\beta\)) compared to the exponential model.
# --- Compute log-likelihood under H0: Exponential model ---
# This is the restricted model (beta = 1, only lambda estimated)
loglik_exp <- -exp_negloglik(c(lambda_hat_exp), t = t_data)
# --- Compute log-likelihood under H1: Weibull model ---
# This is the unrestricted model (both lambda and beta estimated)
loglik_weibull <- -weibull_negloglik(c(lambda_hat, beta_hat), t = t_data)
# --- Compute the likelihood ratio test statistic ---
Lambda_obs <- -2 * (loglik_exp - loglik_weibull)
# --- Compute the p-value using chi-square distribution with 1 df ---
p_value_chisq <- pchisq(Lambda_obs, df = 1, lower.tail = FALSE)
cat(" P-value (chi-square test) :", round(p_value_chisq, 6), "\n") P-value (chi-square test) : 0
Rather than relying on the asymptotic \(\chi^2\) distribution, we use the Bootstrap Likelihood Ratio Test (BLRT) to empirically simulate the null distribution of \(\Lambda_{obs}\). Under \(H_0: \beta = 1\), we generate \(B\) datasets from the exponential model using \(\hat{\lambda}_{exp}\) estimated in part (b), fit both models to each bootstrap sample, and compute the likelihood ratio statistic for each.
The bootstrap p-value is computed as:
\[p = \frac{1 + \#\{\Lambda_b \geq \Lambda_{obs}\}}{B + 1}\]
This avoids relying on the asymptotic \(\chi^2\) approximation and lets the simulated samples determine the null distribution directly.
set.seed(123) # for reproducibility
B <- 9999 # number of bootstrap samples
# --- Storage for bootstrap LR statistics ---
Lambda_boot <- numeric(B)
for (b in 1:B) {
# Step 1: Generate bootstrap sample under H0 (exponential model)
# We sample from exponential with scale = lambda_hat_exp
boot_sample <- rexp(n, rate = 1 / lambda_hat_exp)
# Step 2: Fit exponential model (H0) to bootstrap sample
boot_exp_fit <- optim(
par = c(lambda = mean(boot_sample)),
fn = exp_negloglik,
gr = exp_neggrad,
t = boot_sample,
method = "BFGS"
)
# Step 3: Fit Weibull model (H1) to bootstrap sample
boot_weibull_fit <- optim(
par = c(lambda = mean(boot_sample), beta = 1),
fn = weibull_negloglik,
gr = weibull_neggrad,
t = boot_sample,
method = "BFGS"
)
# Step 4: Compute log-likelihoods for this bootstrap sample
boot_loglik_exp <- -exp_negloglik(boot_exp_fit$par, t = boot_sample)
boot_loglik_weibull <- -weibull_negloglik(boot_weibull_fit$par, t = boot_sample)
# Step 5: Compute bootstrap LR statistic
Lambda_boot[b] <- -2 * (boot_loglik_exp - boot_loglik_weibull)
}
# --- Compute bootstrap p-value ---
p_value_boot <- (1 + sum(Lambda_boot >= Lambda_obs)) / (B + 1)
cat(" Bootstrap p-value:", round(p_value_chisq, 6), "\n") Bootstrap p-value: 0
Both the classical likelihood ratio test (LRT) and the Bootstrap Likelihood Ratio Test (BLRT) were performed to test \(H_0: \beta = 1\) (exponential model) against \(H_1: \beta \neq 1\) (Weibull model).
The classical LRT relies on the asymptotic result that the likelihood ratio statistic \(\Lambda_{obs}\) follows a \(\chi^2\) distribution with 1 degree of freedom under \(H_0\). The BLRT makes no such distributional assumption — instead it simulates the null distribution of \(\Lambda_{obs}\) directly from the data by generating \(B = 9999\) bootstrap samples under \(H_0\).
Both tests produced extremely small p-values, well below any conventional significance level (\(\alpha = 0.05\)). This means both tests lead to the same conclusion: we reject \(H_0: \beta = 1\) in favor of \(H_1: \beta \neq 1\). The consistency between the two tests is reassuring — it suggests the asymptotic \(\chi^2\) approximation used in the classical LRT is appropriate for this dataset, and that the conclusion is not sensitive to the choice of testing approach.
Based on the results of both tests, the Weibull model is recommended for this data. The evidence against the exponential model is overwhelming, and the estimated shape parameter \(\hat{\beta} = 3.3708 > 1\) has a meaningful engineering interpretation — it indicates an increasing hazard rate over time. This is consistent with the physical reality described in the problem: gearboxes in wind turbines are subject to mechanical wear, fatigue, pitting, and bearing degradation, all of which cause the probability of failure to increase as the turbines age.
The exponential model, which assumes a constant hazard rate (random failures with no aging effect), is too simplistic for this application and would lead to unreliable reliability predictions. The Weibull model with \(\hat{\beta} = 3.3708\) and \(\hat{\lambda} = 31.4182\) months provides a statistically and physically justified description of the gearbox failure process.
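To make the practical difference concrete, the two fitted survival functions can be compared at a hypothetical gearbox age; the 24-month horizon below is an arbitrary illustration, not a value taken from the problem.
# Survival probability at a hypothetical age t0 = 24 months (arbitrary choice)
t0 <- 24
surv_weibull <- exp(-(t0 / lambda_hat)^beta_hat) # Weibull: S(t) = exp(-(t/lambda)^beta)
surv_exp <- exp(-t0 / lambda_hat_exp) # Exponential: S(t) = exp(-t/lambda)
cat("S(t0) Weibull:", round(surv_weibull, 4),
    "| S(t0) Exponential:", round(surv_exp, 4), "\n")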