# read in data
data = rbind.data.frame(5.2,  7.8,  9.1, 11.3, 12.5, 13.0, 14.2, 15.1, 15.9, 16.7, 17.2, 17.8, 18.4, 18.9, 
  19.3, 19.7, 20.2, 20.6, 21.0, 21.5, 21.9, 22.3, 22.7, 23.1, 23.5, 23.9, 24.3, 24.7, 
  25.1, 25.5, 25.9, 26.3, 26.7, 27.1, 27.5, 27.9, 28.3, 28.7, 29.1, 29.5, 29.9, 30.3, 
  30.7, 31.1, 31.5, 31.9, 32.3, 32.7, 33.1, 33.5, 33.9, 34.3, 34.7, 35.1, 35.5, 35.9, 
  36.3, 36.7, 37.1, 37.5, 37.9, 38.3, 38.7, 39.1, 39.5, 39.9, 40.3, 40.7, 41.1, 41.5,
  41.9, 42.3, 42.7, 43.1, 43.5)
colnames(data) = "time"
  # time is in months

Intro and Setup

The dataset at our disposal for this analyis is made up of 75 recorded failure times (measured in months) for wind turbine gearbox components. Summary statistics as well as a histogram depiction of our sample data can be seen below.

sum_stats_table = function(variable){
  table = cbind(
    round(min(variable),2),
    round(quantile(variable, 0.25),2),
    round(median(variable),2),
    round(mean(variable),2),
    round(quantile(variable, 0.75),2),
    round(max(variable),2)
  )
  colnames(table) = c("Min", "Q1", "Med", "Mean", "Q3", "Max")
  rownames(table) = NULL
  return(table)
}

Summary_Table = sum_stats_table(variable = data$time)
kable(Summary_Table,align = "c",
      caption = "<span style='color:##000000;'>
      Summary Stats of Survival Times (Months)</span>") %>%
  kable_styling(
    bootstrap_options = c("striped", "bordered"),
    full_width = FALSE,
    position = "center")
Summary Stats of Survival Times (Months)
Min Q1 Med Mean Q3 Max
5.2 21.25 28.7 28.19 36.1 43.5
ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  labs(title = "Distribution of Survival Times",
       x = "Time (Months)",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3)
  ) # this is the equivalent of box(lwd = 3) in base R

The engineering team at a wind energy company is unsure whether to label and treat the broader population distribution of fail times as exponentially or Weibull distributed. The probability density of an exponential distribution is simply a more simplified version of the Weibull’s. It is the value of our shape parameter, \(\beta\), that determines which of these distribution’s we will label as the framework for predictive analysis on gearbox fail times moving forward.

For the purposes of today’s analysis, we will treat the null hypothesis that said population is exponentially distributed (AKA \(\beta = 1\)), and the alternative hypothesis that a Weibull distribution more accurately represents the broader population of gearbox fail times (AKA \(\beta \neq 1\)).

\[\begin{align} H_0&: \beta = 1 \quad \text{Takeaway: Exponential} \\ H_a&: \beta \neq 1 \quad \text{Takeaway: Weibull} \end{align}\]

In order to assess the validity of these opposing hypotheses, we must first quantify an estimate for our parameter \(\beta\) under the assumption of each scenario (Weibull distribution vs exponential distribution). In the following sections, I will use maximum likelihood estimation (MLE) to do so.

a.) MLE of Weibull

Operating under the assumption that the population of gearbox fail times is Weibull distributed, we can use MLE to simultaneously generate estimates for the shape parameter, \(\beta\), and the the scale parameter, \(\lambda\). We’ll denote these estimates as \(\hat{\beta}\) and \(\hat{\lambda}\) respectively.

Parameter calculations were performed with R’s built-in optim() function. For mathematical reference, the log-likelihood function for the Weibull distribution as well as the partial derivatives of that very function with respect to both \(\beta\) and \(\lambda\) are below.

\[\Large \ell(\beta,\lambda|t) = [n \times \ln(\beta)] - [n \times \beta \times \ln(\lambda)] + [(\beta - 1) \times \sum\ln(t_i)] - \sum\left(\frac{(t_i)}{\lambda}\right)^\beta \]


\[\Large \frac{\partial f}{\partial \beta} = \frac{n}{\beta} - n \times \ln(\lambda) + \sum\ln(t_i) - \sum\left(\ln\frac{t_i}{\lambda}\right) \times \left(\frac{t_i}{\lambda}\right)^\beta \]


\[\Large \frac{\partial f}{\partial \lambda} = \frac{\beta}{\lambda} \times \sum\left(\frac{t_i}{\lambda}\right)^\beta - n \]

## MLE Weibull Estimation
  # code is largely copied from HW 9 assignment, which took formulas and procedure from following:
    # Week 11, Section 5.2 
    # Week 7, Section 5

#### define likelihood function
## *negative* log likelihood function for Weibull distribution
  # by negative log likelihood, this function....
log_likelihood = function(params, data) {
  beta = params[1]  # shape
  lambda = params[2]  # scale 
  
  # Ensure parameters are positive
  if (beta <= 0 || lambda <= 0) {
    return(1e10)  # Return large value for invalid parameters
  }
  
  n = length(data)
  log_lik = n * log(beta) - n * beta * log(lambda) + 
    (beta - 1) * sum(log(data)) - sum((data / lambda)^beta)
  
  return(log_lik)  # Return log-likelihood! NOT the negative log-likelihood
}

#### define score equations (partial derivatives)
log_likelihood_gradient = function(params, data) {
  beta = params[1]       # shape parameter
  lambda = params[2]  # scale parameter
  
  n = length(data)
  
  partial_deriv_beta = n/beta - n * log(lambda) + sum(log(data)) - 
        sum(log(data/lambda) * (data/lambda)^beta)
  
  partial_deriv_lambda =  (beta/lambda)*(sum((data/lambda)^beta) -n)
  
  return(c(partial_deriv_beta,
           partial_deriv_lambda))  # Gradient of log-likelihood, NOT the negative scores
}

#### use log likelihood and score equations in optim function

# need to have good initial guesses for beta and lambda for the optim function
beta_guess = (sd(data$time)/mean(data$time))^(-1.086)

lambda_guess = mean(data$time)/gamma(1 + 1/beta_guess)


weib_mle_estimates = optim(
  par = c(beta_guess, lambda_guess), # par argument is initial "guesses"
  data = data$time,
  fn = log_likelihood,
  gr = log_likelihood_gradient,
  method = "L-BFGS-B",
  hessian = TRUE,
    # will use the Hessian matrix later on
  control = list(maxit = 1000, 
                 fnscale  = -1,
                 trace = FALSE,
                 abstol = 1e-8)
)
#weib_mle_estimates
  
weib_mle_beta = weib_mle_estimates$par[1] # shape
weib_mle_lambda = weib_mle_estimates$par[2] # scale

Post-MLE, we have estimates of about 3.371 and 31.418 for \(\beta\) and \(\lambda\) respectively. To further ensure these estimates seem reasonable, below I overlaid a Weibull distribution with said parameter values atop our available sample data.

ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  # use stat_function in ggplot to show density curves of other distributions
    stat_function(fun = dweibull, 
                args = list(shape = weib_mle_beta, scale = weib_mle_lambda),
                color = "black",
                linewidth = 1.5,
                xlim = range(0:50))+
            # xlim argument so curve doesn't awkardly cut out at edge points of data
  labs(title = "Distribution of Gearbox Survival Times (Weibull)",
       x = "Time (Months)",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3),
    legend.position = c(0.85, 0.85)
  )

As we can see in our histogram above, there is certainly a respectable amount of overlap between the distributions of our sample data and a theoretical Weibull density curve with parameters of \(\beta \approx 3.371\) and \(\lambda \approx 31.418\). This leads us to believe that if we are to label the broader population of gearbox survival times as Weibull distributed, then the \(\beta\) and \(\lambda\) estimates prescribed via MLE are quality starting population estimators.

Now for comparison’s sake, I will repeat the MLE and visualization procedure, but this time operating under the assumption that the gearbox survival times are generally exponentially distributed.

b.) MLE of Exponential

As alluded to before, the exponential distribution can be simplistically described as a less complicated “version” of the Weibull distribution. Instead of employing two active parameters, the exponential distribution’s only parameter is \(\lambda\), which we will now estimate via MLE. Log-likelihood functions and partial derivatives are written below.

\[\Large \ell(\lambda|t) = -[n \times \ln(\lambda)] - \frac{\sum(t_i)}{\lambda} \]


\[\Large \frac{\partial f}{\partial \lambda} = \frac{n}{\lambda} + \frac{\sum(t_i)}{\lambda^2} \]


## Exponential MLE Distribution
  # once again code largely brought over from HW 9 assignment

# exponential log likelihood follows same pattern as Weibull log likelihood but a MUCH simpler distribution
exp_log_likelihood = function(lambda, data) {
 if (lambda <= 0) return (1e10)
 n = length(data)
 log_lik = -n * log(lambda) - sum(data)/lambda
 return(log_lik)
}

### now define partial derivative w/ respect to lambda (the score function or gradient)
  # performed via quick hand calculation

exp_gradient = function(lambda, data) {
  if (lambda <= 0) return (NA)
  n = length(data)
  partial_deriv_lambda = (n/lambda) + (sum(data)/(lambda^2))
  return(partial_deriv_lambda)
}

exp_mle_estimates = optim(
  par = 1/mean(data$time),
    # the initial guess can simply be 1/(mean of the data) since we are performing this analysis under the theoretical that it is an EXPONENTIAL distribution. The mean of an exponential distribution is 1 divided by the the rate parameter lambda
  data = data$time,
  fn = exp_log_likelihood,
  gr = exp_gradient,
  method = "L-BFGS-B",
  hessian = TRUE,
    # will use the Hessian matrix later on
  control = list(maxit = 1000, 
                 fnscale  = -1,
                 trace = FALSE,
                 abstol = 1e-8)
)
  
#exp_mle_estimates$par

exp_mle_lambda = exp_mle_estimates$par

Per MLE calculations, we can assert that the \(\lambda\) parameter of the distribution of gearbox survival times, if said distribution was truly exponential, is estimated to be roughly 28.377. An exponential distribution with \(\lambda = 28.377\) is overlaid atop the sample data and available below.

ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  # use stat_function in ggplot to show density curves of other distributions
    stat_function(fun = dexp, 
                args = list(rate = 1/exp_mle_lambda),
                color = "red",
                linewidth = 1.5,
                xlim = range(0:50)
                )+
            # xlim argument so curve doesn't awkardly cut out at edge points of data
  labs(title = "Distribution of Survival Times (Exp)",
       x = "Time (Months)",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3),
    legend.position = c(0.85, 0.85)
  )

Comparing this histogram to the one from part a, we see a drastic decline in distribution resemblance between the sample data and the theoretical density curve, when working under the assumption that the population of gearbox survival times is exponentially distributed. We will keep this in mind moving forward, as we run hypothesis tests to quantify the differences between the two distribution assumptions.

c.) Likelihood Ratio Test

To further investigate the difference in each distribution’s fit with regards to our population of interest (gearbox survival times), we will perform a likelihood ratio test (LRT). The LRT takes into account the log-likelihood function evaluated with parameter values estimated from a simpler model (the exponential in this case) and the log-likelihood function evaluated with parameter values estimated from a more complicated model (the Weibull in this scenario). The null hypothesis of such a test is that, per our sample data, there is no significant difference between the two model fits.

# weib_mle_beta and weib_mle_lambda are UNRESTRICTED MLE VALUES, but now we must find

# null hypothesis is that beta = 1
# if null hypothesis is true, then distribution is really just exponential

## the likelihood ratio test is just 2 * 
  # [log likelihood evaluated in unrestricted way] -
  # [log likelihood evaluated in restricted or simple way]

# need to evaluate each distribution's likelihood function at the previously estimated parameters of beta and/or lambda
log_evaluation_weib = log_likelihood(params = c(weib_mle_beta, weib_mle_lambda),
                                     data = data)
log_evaluation_exp = exp_log_likelihood(lambda = exp_mle_lambda,
                                        data = data)

# test stat for likelihood ratio test is uppercase Lambda

Lambda = 2 * abs(log_evaluation_weib - log_evaluation_exp)

p_value_likelihoodratio = 1 - pchisq(Lambda, df = 1)
  # test stat distribution for likelihood ratio test is chi-square distributed
  # df = number of parameters we are estimating in function

Our LRT returned an extremely high test value (\(\Lambda\)) of about 1145.444. Considering \(\Lambda\) follows a chi-square distribution, this test stat coincides with a p value of approximately zero. With these findings, we must reject the null hypothesis as there is very convincing evidence that the Weibull distribution more accurately represents the fit of our population of interest than the exponential distribution does. After examining the histograms in parts a and b, this finding is not surprising.

d.) Bootstrap Likelihood Ratio Test

Lastly, I repeated this same type of procedure, likelihood ratio testing, and applied it to a bootstrapped scenario. Under the guise of the bootstrap LRT, we take repeated samples of a theoretical distribution, in this case an exponential distribution with \(\lambda \approx\) 28.377. Then, for each sample, a test statistic of \(\Lambda\) is found using that particular bootstrap sample’s log-likelihood evaluation along with the log-likelihood evaluation of our original and more complex/unrestricted model (the Weibull in this scenario).

### Bootstrap Likelihood Ratio Test (BLRT) for short
# reference code section 4.3 from Week 12 notes, need to change formulas since we are applying to exponential distribution, not normal

bootstrap_lrt_test = function(data, b, rate_null, obs_lr_stat){
    # obs_lr_stat will = Lambda = likelihood ratio test stat taken from observed data calculations
  n = length(data)
  boot_lr_stat = numeric(b)
    
  # Log-likelihood functions are found in earlier code chunks and will be called to within the optim functions of the for loop process
    
  # START FOR LOOP
    for (i in 1:b){
      
      # Generating artificial data meant to replicate the characteristics of our NULL hypothesis
      # In this case, our null hypothesis is that the data fits an exponential distribution with rate parameter lambda
      boot_exp_sample = rexp(n = n, rate = rate_null) 
  
      ####### Weibull likelihood per boot sample    
      boot_beta_guess = (sd(boot_exp_sample)/mean(boot_exp_sample))^(-1.086)

      boot_lambda_guess = mean(boot_exp_sample)/gamma(1 + 1/boot_beta_guess)
      
      boot_log_weib = optim(
          par = c(boot_beta_guess, boot_lambda_guess), # par argument is initial "guesses"
          data = boot_exp_sample,
          fn = log_likelihood,
          gr = log_likelihood_gradient,
          method = "L-BFGS-B",
          lower = c(1e-8, 1e-8), ## this limiting lower argument is NECESSARY, before I added it I was getting an error line for "non-finite value supplied by optim"
          hessian = TRUE,
          control = list(maxit = 1000, 
                         fnscale  = -1,
                         trace = FALSE,
                         abstol = 1e-8)
          )
      boot_log_weib = boot_log_weib$value # store log likelihood evaluation
      ####### End Weibull likelihood per boot sample 
  

      ####### Exponential likelihood per boot sample    
      boot_log_exp = optim(
          par = 1/mean(boot_exp_sample),
          data = boot_exp_sample,
          fn = exp_log_likelihood,
          gr = exp_gradient,
          method = "L-BFGS-B",
          hessian = TRUE,
          control = list(maxit = 1000, 
                         fnscale  = -1,
                         trace = FALSE,
                         abstol = 1e-8)
          )
      boot_log_exp = boot_log_exp$value  # store log likelihood evaluation
      ####### End Exponential likelihood per boot sample  
      
      
      # the likelihood ratio for EACH instance of the bootstrap is 2 times the difference between the likelihood ratio evaluated under unrestricted (Weibull) circumstances and evaluated under simpler (exponential) circumstances
      boot_lr_stat[i] = 2 * abs(boot_log_weib - boot_log_exp)
      

    } # END FOR LOOP
    
  # calculate and output final p value
  p = (sum(boot_lr_stat >= obs_lr_stat) + 1)/ (b + 1)
  
  ### FINAL p value of bootstrap exercise is equal to.. 
  # number of bootstrap sample LR test stats greater than or equal to the original LR test stat found in section c
  
  return(list(p_value_bootstrap = p,
              bootstrap_lr_stats = boot_lr_stat))
  
} #### END BOOTSTRAP FUNCTION


## now running the function
set.seed(123)
BLRT_test = bootstrap_lrt_test(data = data$time,
                               b = 1000, 
                               rate_null = exp_mle_lambda,
                               obs_lr_stat = Lambda)

# BLRT_test$p_value_bootstrap = 0.000999001

Using a repeated sample size of 1,000, there once again appears to be significant evidence that an assumption of exponential distribution would result in a much less accurate resemblance of our population of gearbox survival times than an assumption of Weibull distribution would. Our bootstrap LRT yielded a p-value of only 0.001.

e.) Summary and Conclusions

To summarize, all of the analysis up to this point supports the notion of moving forward under the assumption that the broader population of gearbox survival times is Weibull distributed with a shape parameter (\(\beta\)) of about 3.371 and scale parameter (\(\lambda\)) of roughly 31.418. Both the standard and bootstrap iterations of the LRT show a significant decline in distribution resemblance when assuming an exponential distribution. Furthermore, we could see in preliminary visual analysis via the histograms in sections a and b, that a Weibull distribution density curve with said paremeters was much more aligned with our sample distribution than that of the exponential curve with \(\beta\) value of approximately 28.377.

---
title: "Bootstrap Likelihood Ratio Test"
author: "Chris Bahm"
date: "April 18, 2026"
output:
  html_document:
    toc: true
    toc_float:
      collapsed: true
      smooth_scroll: true
    toc_depth: 4
    fig_width: 6
    fig_height: 4
    fig_caption: true
    number_sections: false
    code_folding: hide
    code_download: true
    theme: lumen
    highlight: tango
  pdf_document:
    toc: true
    toc_depth: 4
    fig_caption: true
    number_sections: true
  word_document:
    toc: true
    toc_depth: 4
---

```{css, echo = FALSE}
div#TOC li {     /* table of content  */
    list-style:upper-roman;
    background-image:none;
    background-repeat:none;
    background-position:0;
}

h1.title {    /* level 1 header of title  */
  font-size: 24px;
  font-weight: bold;
  color: DarkRed;
  text-align: center;
}

h4.author { /* Header 4 - and the author and data headers use this too  */
  font-size: 18px;
  font-weight: bold;
  font-family: "Times New Roman", Times, serif;
  color: DarkRed;
  text-align: center;
}

h4.date { /* Header 4 - and the author and data headers use this too  */
  font-size: 18px;
  font-weight: bold;
  font-family: "Times New Roman", Times, serif;
  color: DarkBlue;
  text-align: center;
}

h1 { /* Header 1 - and the author and data headers use this too  */
    font-size: 20px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: center;
}

h2 { /* Header 2 - and the author and data headers use this too  */
    font-size: 18px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h3 { /* Header 3 - and the author and data headers use this too  */
    font-size: 16px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h4 { /* Header 4 - and the author and data headers use this too  */
    font-size: 14px;
  font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

/* Add dots after numbered headers */
.header-section-number::after {
  content: ".";
}
```

```{r setup, include=FALSE}
# code chunk specifies whether the R code, warnings, and output 
# will be included in the output files.

if (!require("knitr")) {                      # use conditional statement to detect
   install.packages("knitr")                  # whether a package was installed in
   library(knitr)                             # your machine. If not, install it and
}                                             # load it to the working directory.

if (!require(tidyverse)) {library(tidyvserse)} 

if (!require(GGally)) {library(GGally)} 

if (!require(kableExtra)) {library(kableExtra)} 

if (!require(ggplot2)) {library(ggplot2)} 

if (!require(car)) {library(car)} 

if (!require(dplyr)) {library(dplyr)} 

if (!require(pander)) {library(pander)} 

if (!require(forecast)) {library(forecast)} 

if (!require(lubridate)) {library(lubridate)} 

if (!require("scales")) {
install.packages("scales")                                        
library("scales") 
}

knitr::opts_chunk$set(
	echo = TRUE,
	message = FALSE,
	warning = FALSE,
	comment = NA,
	results = TRUE,
	fig.align = "center"
)
```

```{r}
# read in data
data = rbind.data.frame(5.2,  7.8,  9.1, 11.3, 12.5, 13.0, 14.2, 15.1, 15.9, 16.7, 17.2, 17.8, 18.4, 18.9, 
  19.3, 19.7, 20.2, 20.6, 21.0, 21.5, 21.9, 22.3, 22.7, 23.1, 23.5, 23.9, 24.3, 24.7, 
  25.1, 25.5, 25.9, 26.3, 26.7, 27.1, 27.5, 27.9, 28.3, 28.7, 29.1, 29.5, 29.9, 30.3, 
  30.7, 31.1, 31.5, 31.9, 32.3, 32.7, 33.1, 33.5, 33.9, 34.3, 34.7, 35.1, 35.5, 35.9, 
  36.3, 36.7, 37.1, 37.5, 37.9, 38.3, 38.7, 39.1, 39.5, 39.9, 40.3, 40.7, 41.1, 41.5,
  41.9, 42.3, 42.7, 43.1, 43.5)
colnames(data) = "time"
  # time is in months
```

# Intro and Setup

The dataset at our disposal for this analyis is made up of `r nrow(data)` recorded failure times (measured in months) for wind turbine gearbox components. Summary statistics as well as a histogram depiction of our sample data can be seen below.

```{r}
sum_stats_table = function(variable){
  table = cbind(
    round(min(variable),2),
    round(quantile(variable, 0.25),2),
    round(median(variable),2),
    round(mean(variable),2),
    round(quantile(variable, 0.75),2),
    round(max(variable),2)
  )
  colnames(table) = c("Min", "Q1", "Med", "Mean", "Q3", "Max")
  rownames(table) = NULL
  return(table)
}

Summary_Table = sum_stats_table(variable = data$time)
kable(Summary_Table,align = "c",
      caption = "<span style='color:##000000;'>
      Summary Stats of Survival Times (Months)</span>") %>%
  kable_styling(
    bootstrap_options = c("striped", "bordered"),
    full_width = FALSE,
    position = "center")

ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  labs(title = "Distribution of Survival Times",
       x = "Time (Months)",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3)
  ) # this is the equivalent of box(lwd = 3) in base R
```


The engineering team at a wind energy company is unsure whether to label and treat the broader population distribution of fail times as exponentially or Weibull distributed. The probability density of an exponential distribution is simply a more simplified version of the Weibull's. It is the value of our shape parameter, $\beta$, that determines which of these distribution's we will label as the framework for predictive analysis on gearbox fail times moving forward.

For the purposes of today's analysis, we will treat the null hypothesis that said population is exponentially distributed (AKA $\beta = 1$), and the alternative hypothesis that a Weibull distribution more accurately represents the broader population of gearbox fail times (AKA $\beta \neq 1$).


\begin{align}
H_0&: \beta = 1 \quad \text{Takeaway: Exponential} \\
H_a&: \beta \neq 1 \quad \text{Takeaway: Weibull}
\end{align}

In order to assess the validity of these opposing hypotheses, we must first quantify an estimate for our parameter $\beta$ under the assumption of each scenario (Weibull distribution vs exponential distribution). In the following sections, I will use maximum likelihood estimation (MLE) to do so.


# a.) MLE of Weibull
Operating under the assumption that the population of gearbox fail times is **Weibull distributed**, we can use MLE to simultaneously generate estimates for the shape parameter, $\beta$, and the the scale parameter, $\lambda$. We'll denote these estimates as $\hat{\beta}$ and $\hat{\lambda}$ respectively.

Parameter calculations were performed with R's built-in <span style="color:blue;">optim()</span> function. For mathematical reference, the log-likelihood function for the Weibull distribution as well as the partial derivatives of that very function with respect to both $\beta$ and $\lambda$ are below.

$$\Large \ell(\beta,\lambda|t) =
[n \times \ln(\beta)] - 
[n \times \beta \times \ln(\lambda)]
+ [(\beta - 1) \times \sum\ln(t_i)]
- \sum\left(\frac{(t_i)}{\lambda}\right)^\beta $$

<br>

$$\Large \frac{\partial f}{\partial \beta} =
\frac{n}{\beta} -
n \times \ln(\lambda)
+ \sum\ln(t_i) -
\sum\left(\ln\frac{t_i}{\lambda}\right) \times \left(\frac{t_i}{\lambda}\right)^\beta    $$

<br>

$$\Large \frac{\partial f}{\partial \lambda} =  \frac{\beta}{\lambda} \times \sum\left(\frac{t_i}{\lambda}\right)^\beta - n    $$
<br>

```{r}
## MLE Weibull Estimation
  # code is largely copied from HW 9 assignment, which took formulas and procedure from following:
    # Week 11, Section 5.2 
    # Week 7, Section 5

#### define likelihood function
## *negative* log likelihood function for Weibull distribution
  # by negative log likelihood, this function....
log_likelihood = function(params, data) {
  beta = params[1]  # shape
  lambda = params[2]  # scale 
  
  # Ensure parameters are positive
  if (beta <= 0 || lambda <= 0) {
    return(1e10)  # Return large value for invalid parameters
  }
  
  n = length(data)
  log_lik = n * log(beta) - n * beta * log(lambda) + 
    (beta - 1) * sum(log(data)) - sum((data / lambda)^beta)
  
  return(log_lik)  # Return log-likelihood! NOT the negative log-likelihood
}

#### define score equations (partial derivatives)
log_likelihood_gradient = function(params, data) {
  beta = params[1]       # shape parameter
  lambda = params[2]  # scale parameter
  
  n = length(data)
  
  partial_deriv_beta = n/beta - n * log(lambda) + sum(log(data)) - 
        sum(log(data/lambda) * (data/lambda)^beta)
  
  partial_deriv_lambda =  (beta/lambda)*(sum((data/lambda)^beta) -n)
  
  return(c(partial_deriv_beta,
           partial_deriv_lambda))  # Gradient of log-likelihood, NOT the negative scores
}

#### use log likelihood and score equations in optim function

# need to have good initial guesses for beta and lambda for the optim function
beta_guess = (sd(data$time)/mean(data$time))^(-1.086)

lambda_guess = mean(data$time)/gamma(1 + 1/beta_guess)


weib_mle_estimates = optim(
  par = c(beta_guess, lambda_guess), # par argument is initial "guesses"
  data = data$time,
  fn = log_likelihood,
  gr = log_likelihood_gradient,
  method = "L-BFGS-B",
  hessian = TRUE,
    # will use the Hessian matrix later on
  control = list(maxit = 1000, 
                 fnscale  = -1,
                 trace = FALSE,
                 abstol = 1e-8)
)
#weib_mle_estimates
  
weib_mle_beta = weib_mle_estimates$par[1] # shape
weib_mle_lambda = weib_mle_estimates$par[2] # scale
```

Post-MLE, we have estimates of about `r round(weib_mle_beta,3)` and `r round(weib_mle_lambda,3)` for $\beta$ and $\lambda$ respectively. To further ensure these estimates seem reasonable, below I overlaid a Weibull distribution with said parameter values atop our available sample data.

```{r}
ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  # use stat_function in ggplot to show density curves of other distributions
    stat_function(fun = dweibull, 
                args = list(shape = weib_mle_beta, scale = weib_mle_lambda),
                color = "black",
                linewidth = 1.5,
                xlim = range(0:50))+
            # xlim argument so curve doesn't awkardly cut out at edge points of data
  labs(title = "Distribution of Gearbox Survival Times (Weibull)",
       x = "Time (Months)",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3),
    legend.position = c(0.85, 0.85)
  )
```

As we can see in our histogram above, there is certainly a respectable amount of overlap between the distributions of our sample data and a theoretical Weibull density curve with parameters of $\beta \approx `r round(weib_mle_beta,3)`$ and $\lambda \approx `r round(weib_mle_lambda,3)`$. This leads us to believe that *if* we are to label the broader population of gearbox survival times as Weibull distributed, then the $\beta$ and $\lambda$ estimates prescribed via MLE are quality starting population estimators.

Now for comparison's sake, I will repeat the MLE and visualization procedure, but this time operating under the assumption that the gearbox survival times are generally **exponentially** distributed.

# b.) MLE of Exponential
As alluded to before, the exponential distribution can be simplistically described as a less complicated "version" of the Weibull distribution. Instead of employing two active parameters, the exponential distribution's only parameter is $\lambda$, which we will now estimate via MLE. Log-likelihood functions and partial derivatives are written below.

$$\Large \ell(\lambda|t) = -[n \times \ln(\lambda)] - \frac{\sum(t_i)}{\lambda}  $$

<br>

$$\Large \frac{\partial f}{\partial \lambda} = \frac{n}{\lambda} + \frac{\sum(t_i)}{\lambda^2}   $$

<br>
```{r}
## Exponential MLE Distribution
  # once again code largely brought over from HW 9 assignment

# exponential log likelihood follows same pattern as Weibull log likelihood but a MUCH simpler distribution
exp_log_likelihood = function(lambda, data) {
 if (lambda <= 0) return (1e10)
 n = length(data)
 log_lik = -n * log(lambda) - sum(data)/lambda
 return(log_lik)
}

### now define partial derivative w/ respect to lambda (the score function or gradient)
  # performed via quick hand calculation

exp_gradient = function(lambda, data) {
  if (lambda <= 0) return (NA)
  n = length(data)
  partial_deriv_lambda = (n/lambda) + (sum(data)/(lambda^2))
  return(partial_deriv_lambda)
}

exp_mle_estimates = optim(
  par = 1/mean(data$time),
    # the initial guess can simply be 1/(mean of the data) since we are performing this analysis under the theoretical that it is an EXPONENTIAL distribution. The mean of an exponential distribution is 1 divided by the the rate parameter lambda
  data = data$time,
  fn = exp_log_likelihood,
  gr = exp_gradient,
  method = "L-BFGS-B",
  hessian = TRUE,
    # will use the Hessian matrix later on
  control = list(maxit = 1000, 
                 fnscale  = -1,
                 trace = FALSE,
                 abstol = 1e-8)
)
  
#exp_mle_estimates$par

exp_mle_lambda = exp_mle_estimates$par
```

Per MLE calculations, we can assert that the $\lambda$ parameter of the distribution of gearbox survival times, if said distribution was truly exponential, is estimated to be roughly `r round(exp_mle_lambda,3)`. An <span style="color:red;">exponential distribution</span> with $\lambda = `r round(exp_mle_lambda,3)`$ is overlaid atop the sample data and available below.

```{r}
ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  # use stat_function in ggplot to show density curves of other distributions
    stat_function(fun = dexp, 
                args = list(rate = 1/exp_mle_lambda),
                color = "red",
                linewidth = 1.5,
                xlim = range(0:50)
                )+
            # xlim argument so curve doesn't awkardly cut out at edge points of data
  labs(title = "Distribution of Survival Times (Exp)",
       x = "Time (Months)",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3),
    legend.position = c(0.85, 0.85)
  )

```

Comparing this histogram to the one from part a, we see a drastic decline in distribution resemblance between the sample data and the theoretical density curve, when working under the assumption that the population of gearbox survival times is exponentially distributed. We will keep this in mind moving forward, as we run hypothesis tests to quantify the differences between the two distribution assumptions.

# c.) Likelihood Ratio Test

To further investigate the difference in each distribution's fit with regards to our population of interest (gearbox survival times), we will perform a likelihood ratio test (LRT). The LRT takes into account the log-likelihood function evaluated with parameter values estimated from a simpler model (the exponential in this case) and the log-likelihood function evaluated with parameter values estimated from a more complicated model (the Weibull in this scenario). The null hypothesis of such a test is that, per our sample data, there is **no** significant difference between the two model fits.

```{r}
# weib_mle_beta and weib_mle_lambda are UNRESTRICTED MLE VALUES, but now we must find

# null hypothesis is that beta = 1
# if null hypothesis is true, then distribution is really just exponential

## the likelihood ratio test is just 2 * 
  # [log likelihood evaluated in unrestricted way] -
  # [log likelihood evaluated in restricted or simple way]

# need to evaluate each distribution's likelihood function at the previously estimated parameters of beta and/or lambda
log_evaluation_weib = log_likelihood(params = c(weib_mle_beta, weib_mle_lambda),
                                     data = data)
log_evaluation_exp = exp_log_likelihood(lambda = exp_mle_lambda,
                                        data = data)

# test stat for likelihood ratio test is uppercase Lambda

Lambda = 2 * abs(log_evaluation_weib - log_evaluation_exp)

p_value_likelihoodratio = 1 - pchisq(Lambda, df = 1)
  # test stat distribution for likelihood ratio test is chi-square distributed
  # df = number of parameters we are estimating in function
```

Our LRT returned an extremely high test value ($\Lambda$) of about `r round(Lambda,3)`. Considering $\Lambda$ follows a chi-square distribution, this test stat coincides with a p value of approximately zero. With these findings, we must **reject** the null hypothesis as there is very convincing evidence that the Weibull distribution more accurately represents the fit of our population of interest than the exponential distribution does. After examining the histograms in parts a and b, this finding is not surprising.

# d.) Bootstrap Likelihood Ratio Test 
Lastly, I repeated this same type of procedure, likelihood ratio testing, and applied it to a bootstrapped scenario. Under the guise of the bootstrap LRT, we take repeated samples of a theoretical distribution, in this case an exponential distribution with $\lambda \approx$ `r round(exp_mle_lambda,3)`. Then, for each sample, a test statistic of $\Lambda$ is found using that particular bootstrap sample's log-likelihood evaluation along with the log-likelihood evaluation of our original and more complex/unrestricted model (the Weibull in this scenario).

```{r, warning=FALSE}
### Bootstrap Likelihood Ratio Test (BLRT) for short
# reference code section 4.3 from Week 12 notes, need to change formulas since we are applying to exponential distribution, not normal

bootstrap_lrt_test = function(data, b, rate_null, obs_lr_stat){
    # obs_lr_stat will = Lambda = likelihood ratio test stat taken from observed data calculations
  n = length(data)
  boot_lr_stat = numeric(b)
    
  # Log-likelihood functions are found in earlier code chunks and will be called to within the optim functions of the for loop process
    
  # START FOR LOOP
    for (i in 1:b){
      
      # Generating artificial data meant to replicate the characteristics of our NULL hypothesis
      # In this case, our null hypothesis is that the data fits an exponential distribution with rate parameter lambda
      boot_exp_sample = rexp(n = n, rate = rate_null) 
  
      ####### Weibull likelihood per boot sample    
      boot_beta_guess = (sd(boot_exp_sample)/mean(boot_exp_sample))^(-1.086)

      boot_lambda_guess = mean(boot_exp_sample)/gamma(1 + 1/boot_beta_guess)
      
      boot_log_weib = optim(
          par = c(boot_beta_guess, boot_lambda_guess), # par argument is initial "guesses"
          data = boot_exp_sample,
          fn = log_likelihood,
          gr = log_likelihood_gradient,
          method = "L-BFGS-B",
          lower = c(1e-8, 1e-8), ## this limiting lower argument is NECESSARY, before I added it I was getting an error line for "non-finite value supplied by optim"
          hessian = TRUE,
          control = list(maxit = 1000, 
                         fnscale  = -1,
                         trace = FALSE,
                         abstol = 1e-8)
          )
      boot_log_weib = boot_log_weib$value # store log likelihood evaluation
      ####### End Weibull likelihood per boot sample 
  

      ####### Exponential likelihood per boot sample    
      boot_log_exp = optim(
          par = 1/mean(boot_exp_sample),
          data = boot_exp_sample,
          fn = exp_log_likelihood,
          gr = exp_gradient,
          method = "L-BFGS-B",
          hessian = TRUE,
          control = list(maxit = 1000, 
                         fnscale  = -1,
                         trace = FALSE,
                         abstol = 1e-8)
          )
      boot_log_exp = boot_log_exp$value  # store log likelihood evaluation
      ####### End Exponential likelihood per boot sample  
      
      
      # the likelihood ratio for EACH instance of the bootstrap is 2 times the difference between the likelihood ratio evaluated under unrestricted (Weibull) circumstances and evaluated under simpler (exponential) circumstances
      boot_lr_stat[i] = 2 * abs(boot_log_weib - boot_log_exp)
      

    } # END FOR LOOP
    
  # calculate and output final p value
  p = (sum(boot_lr_stat >= obs_lr_stat) + 1)/ (b + 1)
  
  ### FINAL p value of bootstrap exercise is equal to.. 
  # number of bootstrap sample LR test stats greater than or equal to the original LR test stat found in section c
  
  return(list(p_value_bootstrap = p,
              bootstrap_lr_stats = boot_lr_stat))
  
} #### END BOOTSTRAP FUNCTION


## now running the function
set.seed(123)
BLRT_test = bootstrap_lrt_test(data = data$time,
                               b = 1000, 
                               rate_null = exp_mle_lambda,
                               obs_lr_stat = Lambda)

# BLRT_test$p_value_bootstrap = 0.000999001
```

Using a repeated sample size of 1,000, there once again appears to be significant evidence that an assumption of exponential distribution would result in a much less accurate resemblance of our population of gearbox survival times than an assumption of Weibull distribution would. Our bootstrap LRT yielded a p-value of only `r round(BLRT_test$p_value_bootstrap, 4)`.

# e.) Summary and Conclusions
To summarize, all of the analysis up to this point supports the notion of moving forward under the assumption that the broader population of gearbox survival times is **Weibull distributed** with a shape parameter ($\beta$) of about `r round(weib_mle_beta,3)` and scale parameter ($\lambda$) of roughly `r round(weib_mle_lambda,3)`. Both the standard and bootstrap iterations of the LRT show a significant decline in distribution resemblance when assuming an exponential distribution. Furthermore, we could see in preliminary visual analysis via the histograms in sections a and b, that a Weibull distribution density curve with said paremeters was *much* more aligned with our sample distribution than that of the exponential curve with $\beta$ value of approximately `r round(exp_mle_lambda,3)`.


