1 Introduction

For this assignment, we will compare an asymptotic \(\chi^2\) hypothesis test to a bootstrapped hypothesis test for deciding which of two distributions a sample came from. Specifically, we will run a likelihood ratio \(\chi^2\) test and a bootstrapped likelihood ratio test to determine whether a sample came from a Weibull distribution or an exponential distribution.

The sample we will use for our tests consists of 75 failure times for gearboxes used in wind turbines at a distribution company’s facility.

times <- c(5.2, 7.8, 9.1, 11.3, 12.5, 13.0, 14.2, 15.1, 15.9, 16.7, 17.2, 17.8, 18.4, 18.9, 19.3, 19.7, 20.2, 20.6, 21.0, 21.5, 21.9, 22.3, 22.7, 23.1, 23.5, 23.9, 24.3, 24.7, 25.1, 25.5, 25.9, 26.3, 26.7, 27.1, 27.5, 27.9, 28.3, 28.7, 29.1, 29.5, 29.9, 30.3, 30.7, 31.1, 31.5, 31.9, 32.3, 32.7, 33.1, 33.5, 33.9, 34.3, 34.7, 35.1, 35.5, 35.9, 36.3, 36.7, 37.1, 37.5, 37.9, 38.3, 38.7, 39.1, 39.5, 39.9, 40.3, 40.7, 41.1, 41.5,
41.9, 42.3, 42.7, 43.1, 43.5)

2 Part A

For this part, we will find the maximum likelihood estimates of the Weibull parameters, treating this sample as if it comes from a Weibull distribution.

First, we’ll start with the log-likelihood function for the Weibull distribution:

\[ \begin{aligned} \ell(\beta, \lambda; \mathbf{t}) &= \ln L(\beta, \lambda; \mathbf{t}) \\ &= \sum_{i=1}^{n} \ln \left[ \frac{\beta}{\lambda} \left( \frac{t_i}{\lambda} \right)^{\beta-1} \exp\left( -\left( \frac{t_i}{\lambda} \right)^{\beta} \right) \right] \\ &= \sum_{i=1}^{n} \left[ \ln \beta - \ln \lambda + (\beta-1)(\ln t_i - \ln \lambda) - \left( \frac{t_i}{\lambda} \right)^{\beta} \right] \\ &= \sum_{i=1}^{n} \left[ \ln \beta - \beta \ln \lambda + (\beta-1) \ln t_i - \left( \frac{t_i}{\lambda} \right)^{\beta} \right] \end{aligned} \]

From there, we can take the partial derivatives (one with respect to each parameter), set them both equal to 0, and solve for the maximum likelihood estimates. The scale estimate has a closed form once the shape is known, while the shape estimate is defined only implicitly and must be found numerically: \[ \begin{aligned} \hat{\lambda} &= \left( \frac{1}{n} \sum_{i=1}^{n} t_i^{\hat{\beta}} \right)^{1/\hat{\beta}} \\ \frac{1}{\hat{\beta}} &= \frac{\sum_{i=1}^{n} t_i^{\hat{\beta}} \ln t_i}{\sum_{i=1}^{n} t_i^{\hat{\beta}}} - \frac{1}{n} \sum_{i=1}^{n} \ln t_i \end{aligned} \]

where \(\lambda\) is the scale parameter and \(\beta\) is the shape parameter.

To obtain these estimates, we will use R’s built-in optim() function, which maximizes the log-likelihood numerically and uses the score function as its gradient. The following code will be used to obtain the parameter estimates:

## Manual MLE implementation for Weibull distribution

# Weibull log-likelihood; params = c(shape, scale)
weibull.loglik <- function(params, data) {
  shape <- params[1]
  scale <- params[2]
  n <- length(data)   # sample size

  loglik <- n * log(shape) - n * shape * log(scale) +
    (shape - 1) * sum(log(data)) -
    sum((data / scale)^shape)

  return(loglik)
}

# Score function (gradient of the log-likelihood)
weibull.score <- function(params, data) {
  shape <- params[1]
  scale <- params[2]
  n <- length(data)

  # Partial derivative with respect to the shape parameter
  grad_shape <- n / shape - n * log(scale) +
    sum(log(data)) -
    sum((data / scale)^shape * log(data / scale))

  # Partial derivative with respect to the scale parameter
  grad_scale <- -(n * shape) / scale +
    (shape / scale) * sum((data / scale)^shape)

  return(c(grad_shape, grad_scale))
}

# Initial values for the optimizer
initial.params <- c(shape = 1, scale = 5)

# Maximize the log-likelihood with optim's L-BFGS-B method
mle.result.weibull <- optim(
  par = initial.params,
  fn = weibull.loglik,
  gr = weibull.score,
  data = times,
  method = "L-BFGS-B",
  hessian = TRUE,
  control = list(trace = 0,
                 fnscale = -1,   # negative scale turns minimization into maximization
                 maxit = 500)    # abstol dropped: L-BFGS-B uses factr/pgtol instead
)

mle.result.weibull$par
    shape     scale 
 3.370783 31.418204 

The estimate for the shape parameter is 3.370783, while the estimate for the scale parameter is 31.418204.

To verify, we can use the fitdistr() function from the MASS package:

mle.weibull.fit <- fitdistr(times, "weibull")
mle.weibull.fit
     shape        scale   
   3.3707800   31.4184018 
 ( 0.3206184) ( 1.1278755)

The two sets of estimates agree to three decimal places, so our manual optimization is consistent with fitdistr().

3 Part B

For this part, we will find the maximum likelihood estimator for the exponential distribution, treating this sample as if it came from an exponential distribution.

In the case of the exponential distribution, we are only estimating the scale parameter because an exponential random variable is just a Weibull random variable with the shape parameter, \(\beta\), set to 1.

Depending on how we write the density function for the exponential, the maximum likelihood estimate takes one of two equivalent forms. With the scale parameterization it is the sample mean, and with the rate parameterization it is the reciprocal of the sample mean: \[ \hat{\lambda} = \frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x} \implies \frac{1}{\hat{\lambda}} = \frac{n}{\sum_{i=1}^{n} x_i} = \frac{1}{\bar{x}} \]

Regardless of how the estimator is written, the process for solving for \(\lambda\) is the same. Since we only have one parameter to solve for, and its estimate has a closed form, numerical optimization is not strictly needed: the estimate of \(\lambda\) is just the sample mean.

Still, since we did it numerically in Part A, we will first do the same here as a check.

## Manual MLE implementation for Exponential distribution

# Exponential log-likelihood in the scale parameterization
exponential.loglik <- function(params, data) {
  scale <- params[1]
  n <- length(data)   # sample size

  loglik <- -n * log(scale) - sum(data) / scale

  return(loglik)
}

# Score function (derivative of the log-likelihood)
exponential.score <- function(params, data) {
  scale <- params[1]
  n <- length(data)

  # Derivative with respect to the scale parameter
  grad_scale <- -(n / scale) + sum(data) / scale^2

  return(grad_scale)
}

# Maximize the log-likelihood with optim's L-BFGS-B method
mle.result.exponential <- optim(
  par = 5,   # generic starting value for the scale
  fn = exponential.loglik,
  gr = exponential.score,
  data = times,
  method = "L-BFGS-B",
  hessian = TRUE,
  control = list(trace = 0,
                 fnscale = -1,   # maximize instead of minimize
                 maxit = 500)    # abstol dropped: L-BFGS-B uses factr/pgtol instead
)

mle.result.exponential$par
[1] 28.18533

Now let’s use the sample mean as the MLE:

## Using Sample Mean as MLE For Exponential 

mle.exponential.scale <- mean(times)
mle.exponential.scale
[1] 28.18533

The mean of our sample is 28.18533, so the estimate for our scale parameter is 28.18533.

To verify, we can use fitdistr() from the MASS package:

## Verifying MLE For Exponential via fitdistr()

exponential.fit <- fitdistr(times, "exponential")
exponential.fit
      rate    
  0.035479446 
 (0.004096813)

As we can see, R gave us the estimate for the rate parameter, but we need the estimate for the scale parameter. The scale is just the inverse of the rate (scale = 1/rate), so we invert the estimate. Note that the result keeps the name rate even though its value is now the scale.

mle.fit <- 1/exponential.fit$estimate
mle.fit
    rate 
28.18533 

4 Part C

For this part, we will conduct the first of two hypothesis tests to determine whether this data comes from a Weibull distribution or an exponential distribution. Specifically, we will conduct a likelihood ratio \(\chi^2\) test. Our null hypothesis, \(H_0\), is that the data come from an exponential distribution, that is, \(\beta = 1\). Our alternative hypothesis, \(H_a\), is that the data come from a Weibull distribution with \(\beta \neq 1\). In symbols: \[ \begin{aligned} H_0 &: \beta = 1 \ \text{(exponential)} \\ H_a &: \beta \neq 1 \ \text{(Weibull)} \end{aligned} \]

This test will be done at significance level \(\alpha = 0.05\). Because the exponential is the Weibull with one parameter fixed, the statistic is compared to a \(\chi^2\) distribution with 1 degree of freedom.

## Likelihood Ratio Test using both Log-Likelihood Functions

# Weibull Log-Likelihood Function
loglike.weibull <- function(data, lambda, beta){
  n <- length(data)
  n*log(beta)-n*beta*log(lambda)+(beta-1)*sum(log(data))-sum((data/lambda)^beta)
}

# Exponential Log-Likelihood Function
loglike.exponential <- function(data, lambda){ 
  n <- length(data)
  -n*log(lambda)-sum(data)/lambda
}

## Evaluate Log Likelihood @ MLE
scale.est.weibull <- mle.result.weibull$par[[2]]
shape.est.weibull <- mle.result.weibull$par[[1]]
scale.est.exponential <- mle.fit

# Exponential vs Weibull Hypotheses
LogLike.Alt <- loglike.weibull(times, scale.est.weibull, shape.est.weibull)
LogLike.Null <- loglike.exponential(times, scale.est.exponential)

ratio <- 2*(LogLike.Alt - LogLike.Null) # Likelihood ratio test statistic

## Getting p-value for ratio test
pvalue.ratio <- 1 - pchisq(ratio, df = 1) # Upper-tail chi-square p-value, 1 df

# Statistic and p-value together
test1 <- c(ratio, pvalue.ratio)
test1
    rate     rate 
100.6144   0.0000 

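Our \(\chi^2\) statistic is 100.6144, and the p-value is \(\approx\) 0 (both values are labeled rate only because they inherit that name from the fitdistr() estimate). Since the p-value is below \(\alpha = 0.05\), we reject the null hypothesis and conclude that there is sufficient evidence that the Weibull distribution fits this sample better than the exponential distribution.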

5 Part D

For this part, we will conduct the second of our hypothesis tests to determine whether or not this sample came from a Weibull distribution or an exponential distribution. Specifically, we will use a bootstrapped likelihood ratio hypothesis test. The null hypothesis, the alternative hypothesis, and the level of significance will be the same as the likelihood ratio test from Part C.

# Bootstrap likelihood ratio test (exponential null)
set.seed(1)

# NOTE: the scale_0 argument is overwritten below, so it has no effect
bootstrap_lrt_test <- function(data, scale_0, B = 10000) {

  n <- length(data)

  # MLE of the exponential scale
  scale_hat <- mean(data)

  # Null value of the scale; identical to scale_hat as written
  scale_0 <- mean(data)

  # Log-likelihoods (both exponential, both evaluated at the same scale)
  logL_hat <- sum(dexp(data, rate = 1/scale_hat, log = TRUE))
  logL0 <- sum(dexp(data, rate = 1/scale_0, log = TRUE))

  # Likelihood ratio test statistic (0 here because scale_hat == scale_0)
  LR_obs <- -2 * (logL0 - logL_hat)

  # Bootstrap distribution
  LR_star <- numeric(B)
  for (b in 1:B) {
    # Generate data under H0: Exp(scale_0)
    bs_sample <- rexp(n, rate = 1/scale_0)

    # Mean of this bootstrap sample (not used in the statistic)
    scale_1 <- mean(bs_sample)

    # Log-likelihoods of the bootstrap sample
    logL_star <- sum(dexp(bs_sample, rate = 1/scale_hat, log = TRUE))
    logL_star0 <- sum(dexp(bs_sample, rate = 1/scale_0, log = TRUE))

    LR_star[b] <- -2 * (logL_star0 - logL_star)
  }

  p_value <- (sum(LR_star >= LR_obs) + 1) / (B + 1)

  # scale_1 is the mean of the last bootstrap sample only
  return(list(
    LR_obs = LR_obs,
    p_value = p_value,
    scale_1 = scale_1,
    scale_0 = scale_0
  ))
}

bootstrap_lrt_test(times)
$LR_obs
[1] 0

$p_value
[1] 1

$scale_1
[1] 22.82617

$scale_0
[1] 28.18533

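The observed test statistic is exactly 0 and the p-value is 1. Taken at face value, we would fail to reject the null hypothesis, but this outcome is built into the code rather than driven by the data: scale_hat and scale_0 are both set to mean(data), so the two exponential log-likelihoods are identical, LR_obs and every LR_star equal 0, and the p-value is (B + 1)/(B + 1) = 1 for any sample. The function never fits the Weibull model, so this result is not evidence for or against the Weibull alternative.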

6 Part E

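As run, the two tests did not generate the same result, but only the likelihood ratio \(\chi^2\) test from Part C actually compares the two models, because the bootstrap function evaluates the exponential likelihood on both sides of its ratio. The reported scale_1 also changes with the number of bootstrap replicates B (not the sample size n, which stays at 75): with B = 1000 it is 26.5558, with B = 5000 it is 30.10166, and with B = 10,000 it is 22.82617. This is expected, since scale_1 is the mean of the last simulated sample only, so it varies randomly from run to run instead of settling down as B grows.

Based on the non-bootstrapped likelihood ratio test, I would be inclined to believe that the Weibull distribution is a better fit for this data.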

---
title: "STA 512 Assignment 12: Bootstrap Testing Hypothesis"
author: "Ian VanWright"
date: "04/20/2026"
output:
  html_document: 
    toc: yes
    toc_depth: 4
    toc_float: yes
    number_sections: yes
    toc_collapsed: yes
    code_folding: show
    code_download: yes
    smooth_scroll: yes
    theme: lumen
  pdf_document: 
    toc: yes
    toc_depth: 4
    fig_caption: yes
    number_sections: yes
    fig_width: 3
    fig_height: 3
  word_document: 
    toc: yes
    toc_depth: 4
    fig_caption: yes
    keep_md: yes
editor_options: 
  chunk_output_type: inline
---

```{css, echo = FALSE}
#TOC::before {
  content: "Table of Contents";
  font-weight: bold;
  font-size: 1.2em;
  display: block;
  color: navy;
  margin-bottom: 10px;
}


div#TOC li {     /* table of content  */
    list-style:upper-roman;
    background-image:none;
    background-repeat:no-repeat;
    background-position:0;
}

h1.title {    /* level 1 header of title  */
  font-size: 22px;
  font-weight: bold;
  color: DarkRed;
  text-align: center;
  font-family: "Gill Sans", sans-serif;
}

h4.author { /* Header 4 - and the author and data headers use this too  */
  font-size: 15px;
  font-weight: bold;
  font-family: system-ui;
  color: navy;
  text-align: center;
}

h4.date { /* Header 4 - and the author and data headers use this too  */
  font-size: 18px;
  font-weight: bold;
  font-family: "Gill Sans", sans-serif;
  color: DarkBlue;
  text-align: center;
}

h1 { /* Header 1 - and the author and data headers use this too  */
    font-size: 20px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: center;
}

h2 { /* Header 2 - and the author and data headers use this too  */
    font-size: 18px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h3 { /* Header 3 - and the author and data headers use this too  */
    font-size: 16px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h4 { /* Header 4 - and the author and data headers use this too  */
    font-size: 14px;
  font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

/* Add dots after numbered headers */
.header-section-number::after {
  content: ".";
}

body { background-color:white; }

.highlightme { background-color:yellow; }

p { background-color:white; }
```

```{html, echo=FALSE}
<button id="toggleTOCBtn" class="btn btn-default" style="position: fixed; bottom: 20px; right: 20px; z-index: 1000;">
  Toggle TOC
</button>

<script>
document.getElementById("toggleTOCBtn").onclick = function() {
  var toc = document.querySelector(".list-group, #TOC, .tocify");
  if (toc) {
    if (toc.style.display === "none") {
      toc.style.display = "";
    } else {
      toc.style.display = "none";
    }
  }
};
</script>
```

```{r setup, include=FALSE}
# code chunk specifies whether the R code, warnings, and output 
# will be included in the output files.
if (!require("knitr")) {
   install.packages("knitr")
   library(knitr)
}
if (!require("pander")) {
   install.packages("pander")
   library(pander)
}
if (!require("ggplot2")) {
  install.packages("ggplot2")
  library(ggplot2)
}
if (!require("tidyverse")) {
  install.packages("tidyverse")
  library(tidyverse)
}

if (!require("plotly")) {
  install.packages("plotly")
  library(plotly)
}
if (!require("fitdistrplus")) {
  install.packages("fitdistrplus")
  library(fitdistrplus)
}
if (!require("MASS")) {
  install.packages("MASS")
  library(MASS)
}

## library(fitdistrplus)
knitr::opts_chunk$set(echo = TRUE,       # include code chunk in the output file
                      warning = FALSE,   # sometimes, your code may produce warning messages,
                                         # you can choose to include the warning messages in
                                         # the output file. 
                      results = TRUE,    # you can also decide whether to include the output
                                         # in the output file.
                      message = FALSE,
                      comment = NA
                      )  
```

\

# Introduction
For this assignment, we will compare an asymptotic $\chi^2$ hypothesis test to a bootstrapped hypothesis test for deciding which of two distributions a sample came from. Specifically, we will run a likelihood ratio $\chi^2$ test and a bootstrapped likelihood ratio test to determine whether a sample came from a Weibull distribution or an exponential distribution.

The sample we will use for our tests consists of 75 failure times for gearboxes used in wind turbines at a distribution company's facility.
```{r}
times <- c(5.2, 7.8, 9.1, 11.3, 12.5, 13.0, 14.2, 15.1, 15.9, 16.7, 17.2, 17.8, 18.4, 18.9, 19.3, 19.7, 20.2, 20.6, 21.0, 21.5, 21.9, 22.3, 22.7, 23.1, 23.5, 23.9, 24.3, 24.7, 25.1, 25.5, 25.9, 26.3, 26.7, 27.1, 27.5, 27.9, 28.3, 28.7, 29.1, 29.5, 29.9, 30.3, 30.7, 31.1, 31.5, 31.9, 32.3, 32.7, 33.1, 33.5, 33.9, 34.3, 34.7, 35.1, 35.5, 35.9, 36.3, 36.7, 37.1, 37.5, 37.9, 38.3, 38.7, 39.1, 39.5, 39.9, 40.3, 40.7, 41.1, 41.5,
41.9, 42.3, 42.7, 43.1, 43.5)
```

# Part A
For this part, we will find the maximum likelihood estimates of the Weibull parameters, treating this sample as if it comes from a Weibull distribution.

First, we'll start with the log-likelihood function for the Weibull distribution:

$$
\begin{aligned}
    \ell(\beta, \lambda; \mathbf{t}) &= \ln L(\beta, \lambda; \mathbf{t}) \\
    &= \sum_{i=1}^{n} \ln \left[ \frac{\beta}{\lambda} \left( \frac{t_i}{\lambda} \right)^{\beta-1} \exp\left( -\left( \frac{t_i}{\lambda} \right)^{\beta} \right) \right] \\
    &= \sum_{i=1}^{n} \left[ \ln \beta - \ln \lambda + (\beta-1)(\ln t_i - \ln \lambda) - \left( \frac{t_i}{\lambda} \right)^{\beta} \right] \\
    &= \sum_{i=1}^{n} \left[ \ln \beta - \beta \ln \lambda + (\beta-1) \ln t_i - \left( \frac{t_i}{\lambda} \right)^{\beta} \right]
\end{aligned}
$$

From there, we can take the partial derivatives (one with respect to each parameter), set them both equal to 0, and solve for the maximum likelihood estimates. The scale estimate has a closed form once the shape is known, while the shape estimate is defined only implicitly and must be found numerically:
$$
\begin{aligned}
    \hat{\lambda} &= \left( \frac{1}{n} \sum_{i=1}^{n} t_i^{\hat{\beta}} \right)^{1/\hat{\beta}} \\
    \frac{1}{\hat{\beta}} &= \frac{\sum_{i=1}^{n} t_i^{\hat{\beta}} \ln t_i}{\sum_{i=1}^{n} t_i^{\hat{\beta}}} - \frac{1}{n} \sum_{i=1}^{n} \ln t_i
\end{aligned}
$$

where $\lambda$ is the scale parameter and $\beta$ is the shape parameter.
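
As a rough illustration of how the two estimating equations fit together, the sketch below solves the implicit shape equation with `uniroot()` and then plugs the result into the closed-form scale equation. The simulated sample `t_sim`, its true parameters, and the helper name `profile_score` are assumptions made for this example only; they are not part of the assignment data or code.

```{r}
# Simulated sample with known parameters (illustrative, not the gearbox data)
set.seed(42)
t_sim <- rweibull(500, shape = 2, scale = 10)

# Profile score for the shape: equals zero at the MLE of beta
profile_score <- function(beta, obs) {
  w <- (obs / max(obs))^beta   # rescaled powers avoid overflow; the ratio is unchanged
  sum(w * log(obs)) / sum(w) - mean(log(obs)) - 1 / beta
}

# Solve the implicit shape equation, then use the closed-form scale equation
beta_hat <- uniroot(profile_score, interval = c(0.05, 50),
                    obs = t_sim, tol = 1e-10)$root
lambda_hat <- mean(t_sim^beta_hat)^(1 / beta_hat)

c(shape = beta_hat, scale = lambda_hat)
```

Reducing the problem to a one-dimensional root search is a design choice for the sketch; a two-parameter optimizer should reach the same estimates.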

To obtain these estimates, we will use R's built-in optim() function, which maximizes the log-likelihood numerically and uses the score function as its gradient. The following code will be used to obtain the parameter estimates:
```{r}
## Manual MLE implementation for Weibull distribution

# Weibull log-likelihood; params = c(shape, scale)
weibull.loglik <- function(params, data) {
  shape <- params[1]
  scale <- params[2]
  n <- length(data)   # sample size

  loglik <- n * log(shape) - n * shape * log(scale) +
    (shape - 1) * sum(log(data)) -
    sum((data / scale)^shape)

  return(loglik)
}

# Score function (gradient of the log-likelihood)
weibull.score <- function(params, data) {
  shape <- params[1]
  scale <- params[2]
  n <- length(data)

  # Partial derivative with respect to the shape parameter
  grad_shape <- n / shape - n * log(scale) +
    sum(log(data)) -
    sum((data / scale)^shape * log(data / scale))

  # Partial derivative with respect to the scale parameter
  grad_scale <- -(n * shape) / scale +
    (shape / scale) * sum((data / scale)^shape)

  return(c(grad_shape, grad_scale))
}

# Initial values for the optimizer
initial.params <- c(shape = 1, scale = 5)

# Maximize the log-likelihood with optim's L-BFGS-B method
mle.result.weibull <- optim(
  par = initial.params,
  fn = weibull.loglik,
  gr = weibull.score,
  data = times,
  method = "L-BFGS-B",
  hessian = TRUE,
  control = list(trace = 0,
                 fnscale = -1,   # negative scale turns minimization into maximization
                 maxit = 500)    # abstol dropped: L-BFGS-B uses factr/pgtol instead
)

mle.result.weibull$par
```
The estimate for the shape parameter is 3.370783, while the estimate for the scale parameter is 31.418204.

To verify, we can use the fitdistr() function from the `MASS` package:

```{r}
mle.weibull.fit <- fitdistr(times, "weibull")
mle.weibull.fit
```
The two sets of estimates agree to three decimal places, so our manual optimization is consistent with fitdistr().
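
fitdistr() also prints standard errors in parentheses under its estimates. The following sketch shows one way comparable standard errors could be obtained from an `optim()` Hessian. It minimizes the negative log-likelihood, so the Hessian can be inverted directly; the simulated sample `t_demo` and the names `negloglik`, `fit_demo`, and `se_demo` are assumptions for illustration, not objects from the assignment.

```{r}
# Simulated Weibull sample (illustrative stand-in for the gearbox data)
set.seed(7)
t_demo <- rweibull(300, shape = 3, scale = 30)

# Negative log-likelihood, so optim's default minimization applies directly
negloglik <- function(p, obs) {
  -sum(dweibull(obs, shape = p[1], scale = p[2], log = TRUE))
}

fit_demo <- optim(
  par = c(1, mean(t_demo)),
  fn = negloglik,
  obs = t_demo,
  method = "L-BFGS-B",
  lower = c(1e-3, 1e-3),   # keep both parameters positive
  hessian = TRUE
)

# The Hessian of the negative log-likelihood is the observed information;
# its inverse approximates the covariance matrix of the estimates
se_demo <- sqrt(diag(solve(fit_demo$hessian)))

rbind(estimate = fit_demo$par, std.error = se_demo)
```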


# Part B
For this part, we will find the maximum likelihood estimator for the exponential distribution, treating this sample as if it came from an exponential distribution.

In the case of the exponential distribution, we are only estimating the scale parameter because an exponential random variable is just a Weibull random variable with the shape parameter, $\beta$, set to 1.

Depending on how we write the density function for the exponential, the maximum likelihood estimate takes one of two equivalent forms. With the scale parameterization it is the sample mean, and with the rate parameterization it is the reciprocal of the sample mean:
$$
\hat{\lambda} = \frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x} \implies \frac{1}{\hat{\lambda}} = \frac{n}{\sum_{i=1}^{n} x_i} = \frac{1}{\bar{x}}
$$

Regardless of how the estimator is written, the process for solving for $\lambda$ is the same. Since we only have one parameter to solve for, and its estimate has a closed form, numerical optimization is not strictly needed: the estimate of $\lambda$ is just the sample mean.
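
For reference, the closed form can be checked with a short derivation. This is a sketch under the scale parameterization $f(x; \lambda) = \lambda^{-1} e^{-x/\lambda}$, with the second derivative included to confirm that the stationary point is a maximum:

$$
\begin{aligned}
    \ell(\lambda) &= \sum_{i=1}^{n} \left[ -\ln \lambda - \frac{x_i}{\lambda} \right] = -n \ln \lambda - \frac{1}{\lambda} \sum_{i=1}^{n} x_i \\
    \frac{d\ell}{d\lambda} &= -\frac{n}{\lambda} + \frac{1}{\lambda^2} \sum_{i=1}^{n} x_i = 0 \implies \hat{\lambda} = \bar{x} \\
    \left. \frac{d^2\ell}{d\lambda^2} \right|_{\lambda = \hat{\lambda}} &= \frac{n}{\bar{x}^2} - \frac{2 n \bar{x}}{\bar{x}^3} = -\frac{n}{\bar{x}^2} < 0
\end{aligned}
$$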

Still, since we did it numerically in Part A, we will first do the same here as a check.

```{r}
## Manual MLE implementation for Exponential distribution

# Exponential log-likelihood in the scale parameterization
exponential.loglik <- function(params, data) {
  scale <- params[1]
  n <- length(data)   # sample size

  loglik <- -n * log(scale) - sum(data) / scale

  return(loglik)
}

# Score function (derivative of the log-likelihood)
exponential.score <- function(params, data) {
  scale <- params[1]
  n <- length(data)

  # Derivative with respect to the scale parameter
  grad_scale <- -(n / scale) + sum(data) / scale^2

  return(grad_scale)
}

# Maximize the log-likelihood with optim's L-BFGS-B method
mle.result.exponential <- optim(
  par = 5,   # generic starting value for the scale
  fn = exponential.loglik,
  gr = exponential.score,
  data = times,
  method = "L-BFGS-B",
  hessian = TRUE,
  control = list(trace = 0,
                 fnscale = -1,   # maximize instead of minimize
                 maxit = 500)    # abstol dropped: L-BFGS-B uses factr/pgtol instead
)

mle.result.exponential$par
```

Now let's use the sample mean as the MLE:
```{r}
## Using Sample Mean as MLE For Exponential 

mle.exponential.scale <- mean(times)
mle.exponential.scale
```

The mean of our sample is 28.18533, so the estimate for our scale parameter is 28.18533.

To verify, we can use fitdistr() from the `MASS` package:
```{r}
## Verifying MLE For Exponential via fitdistr()

exponential.fit <- fitdistr(times, "exponential")
exponential.fit
```
As we can see, R gave us the estimate for the rate parameter, but we need the estimate for the scale parameter. The scale is just the inverse of the rate (scale = 1/rate), so we invert the estimate. Note that the result keeps the name `rate` even though its value is now the scale.

```{r}
mle.fit <- 1/exponential.fit$estimate
mle.fit
```

# Part C
For this part, we will conduct the first of two hypothesis tests to determine whether this data comes from a Weibull distribution or an exponential distribution. Specifically, we will conduct a likelihood ratio $\chi^2$ test. Our null hypothesis, $H_0$, is that the data come from an exponential distribution, that is, $\beta = 1$. Our alternative hypothesis, $H_a$, is that the data come from a Weibull distribution with $\beta \neq 1$. In symbols:
$$
\begin{aligned}
H_0 &: \beta = 1 \ \text{(exponential)} \\
H_a &: \beta \neq 1 \ \text{(Weibull)}
\end{aligned}
$$

This test will be done at significance level $\alpha = 0.05$. Because the exponential is the Weibull with one parameter fixed, the statistic is compared to a $\chi^2$ distribution with 1 degree of freedom.
```{r}
## Likelihood Ratio Test using both Log-Likelihood Functions

# Weibull Log-Likelihood Function
loglike.weibull <- function(data, lambda, beta){
  n <- length(data)
  n*log(beta)-n*beta*log(lambda)+(beta-1)*sum(log(data))-sum((data/lambda)^beta)
}

# Exponential Log-Likelihood Function
loglike.exponential <- function(data, lambda){ 
  n <- length(data)
  -n*log(lambda)-sum(data)/lambda
}

## Evaluate Log Likelihood @ MLE
scale.est.weibull <- mle.result.weibull$par[[2]]
shape.est.weibull <- mle.result.weibull$par[[1]]
scale.est.exponential <- mle.fit

# Exponential vs Weibull Hypotheses
LogLike.Alt <- loglike.weibull(times, scale.est.weibull, shape.est.weibull)
LogLike.Null <- loglike.exponential(times, scale.est.exponential)

ratio <- 2*(LogLike.Alt - LogLike.Null) # Likelihood ratio test statistic

## Getting p-value for ratio test
pvalue.ratio <- 1 - pchisq(ratio, df = 1) # Upper-tail chi-square p-value, 1 df

# Statistic and p-value together
test1 <- c(ratio, pvalue.ratio)
test1
```
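Our $\chi^2$ statistic is 100.6144, and the p-value is $\approx$ 0 (both values are labeled `rate` in the output only because they inherit that name from the fitdistr() estimate). Since the p-value is below $\alpha = 0.05$, we reject the null hypothesis and conclude that there is sufficient evidence that the Weibull distribution fits this sample better than the exponential distribution.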
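
A p-value this small prints as 0 when computed as `1 - pchisq(...)`, because the lower-tail probability rounds to exactly 1 in double precision. The short sketch below contrasts that with requesting the upper tail directly through the `lower.tail` argument of `pchisq()`. The variable names are illustrative, and the statistic is copied from the result reported above.

```{r}
lrt_stat_demo <- 100.6144   # statistic reported above

# Subtracting from 1 loses the tail: the lower-tail probability rounds to 1
p_subtract <- 1 - pchisq(lrt_stat_demo, df = 1)

# Asking for the upper tail directly keeps the small probability
p_upper <- pchisq(lrt_stat_demo, df = 1, lower.tail = FALSE)

c(subtract = p_subtract, upper_tail = p_upper)
```

The conclusion of the test is the same either way; the upper-tail form only makes clear that the p-value is tiny (roughly on the order of 1e-23) instead of literally zero.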

# Part D
For this part, we will conduct the second of our hypothesis tests to determine whether or not this sample came from a Weibull distribution or an exponential distribution. Specifically, we will use a bootstrapped likelihood ratio hypothesis test. The null hypothesis, the alternative hypothesis, and the level of significance will be the same as the likelihood ratio test from Part C.


```{r}
# Bootstrap likelihood ratio test (exponential null)
set.seed(1)

# NOTE: the scale_0 argument is overwritten below, so it has no effect
bootstrap_lrt_test <- function(data, scale_0, B = 10000) {

  n <- length(data)

  # MLE of the exponential scale
  scale_hat <- mean(data)

  # Null value of the scale; identical to scale_hat as written
  scale_0 <- mean(data)

  # Log-likelihoods (both exponential, both evaluated at the same scale)
  logL_hat <- sum(dexp(data, rate = 1/scale_hat, log = TRUE))
  logL0 <- sum(dexp(data, rate = 1/scale_0, log = TRUE))

  # Likelihood ratio test statistic (0 here because scale_hat == scale_0)
  LR_obs <- -2 * (logL0 - logL_hat)

  # Bootstrap distribution
  LR_star <- numeric(B)
  for (b in 1:B) {
    # Generate data under H0: Exp(scale_0)
    bs_sample <- rexp(n, rate = 1/scale_0)

    # Mean of this bootstrap sample (not used in the statistic)
    scale_1 <- mean(bs_sample)

    # Log-likelihoods of the bootstrap sample
    logL_star <- sum(dexp(bs_sample, rate = 1/scale_hat, log = TRUE))
    logL_star0 <- sum(dexp(bs_sample, rate = 1/scale_0, log = TRUE))

    LR_star[b] <- -2 * (logL_star0 - logL_star)
  }

  p_value <- (sum(LR_star >= LR_obs) + 1) / (B + 1)

  # scale_1 is the mean of the last bootstrap sample only
  return(list(
    LR_obs = LR_obs,
    p_value = p_value,
    scale_1 = scale_1,
    scale_0 = scale_0
  ))
}

bootstrap_lrt_test(times)
```
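The observed test statistic is exactly 0 and the p-value is 1. Taken at face value, we would fail to reject the null hypothesis, but this outcome is built into the code rather than driven by the data: `scale_hat` and `scale_0` are both set to `mean(data)`, so the two exponential log-likelihoods are identical, `LR_obs` and every `LR_star` equal 0, and the p-value is $(B + 1)/(B + 1) = 1$ for any sample. The function never fits the Weibull model, so this result is not evidence for or against the Weibull alternative.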
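
For comparison, a parametric bootstrap version of the likelihood ratio test would refit both the exponential and the Weibull model to the observed sample and to every sample simulated under the null. The sketch below is one possible way to do that and is not the code that produced the output above. The function names (`fit_weibull_mle`, `lrt_statistic`, `param_boot_lrt`), the simulated sample `x_demo`, and the small `B` are all assumptions chosen to keep the example short.

```{r}
# Weibull MLE via the profile score equation for the shape
fit_weibull_mle <- function(x) {
  score <- function(b) {
    w <- (x / max(x))^b   # rescaled powers avoid overflow; the ratio is unchanged
    sum(w * log(x)) / sum(w) - mean(log(x)) - 1 / b
  }
  shape <- uniroot(score, interval = c(0.01, 100), tol = 1e-8)$root
  scale <- max(x) * mean((x / max(x))^shape)^(1 / shape)
  c(shape = shape, scale = scale)
}

# Likelihood ratio statistic: exponential (H0) against Weibull (Ha)
lrt_statistic <- function(x) {
  w <- fit_weibull_mle(x)
  logL_weibull <- sum(dweibull(x, shape = w[["shape"]], scale = w[["scale"]], log = TRUE))
  logL_exp <- sum(dexp(x, rate = 1 / mean(x), log = TRUE))
  2 * (logL_weibull - logL_exp)
}

# Parametric bootstrap: simulate under the fitted exponential, refit both models
param_boot_lrt <- function(x, B = 500) {
  lr_obs <- lrt_statistic(x)
  lr_star <- replicate(B, lrt_statistic(rexp(length(x), rate = 1 / mean(x))))
  list(LR_obs = lr_obs, p_value = (sum(lr_star >= lr_obs) + 1) / (B + 1))
}

# Demonstration on a simulated Weibull sample (not the gearbox data)
set.seed(123)
x_demo <- rweibull(75, shape = 3, scale = 30)
boot_demo <- param_boot_lrt(x_demo, B = 200)
boot_demo
```

For the gearbox data the call would be `param_boot_lrt(times, B = 10000)`. The smallest p-value this scheme can return is $1/(B + 1)$, so `B` limits how small the bootstrap p-value can get.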

# Part E
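As run, the two tests did not generate the same result, but only the likelihood ratio $\chi^2$ test from Part C actually compares the two models, because the bootstrap function evaluates the exponential likelihood on both sides of its ratio. The reported `scale_1` also changes with the number of bootstrap replicates $B$ (not the sample size $n$, which stays at 75): with B = 1000 it is 26.5558, with B = 5000 it is 30.10166, and with B = 10,000 it is 22.82617. This is expected, since `scale_1` is the mean of the last simulated sample only, so it varies randomly from run to run instead of settling down as B grows.

Based on the non-bootstrapped likelihood ratio test, I would be inclined to believe that the Weibull distribution is a better fit for this data.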
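
The spread in the reported scale values can be examined with a small simulation. In the Part D code, `scale_1` is the mean of a single exponential sample of size 75, so its standard deviation should be close to the scale divided by $\sqrt{75}$, about 3.25 here. The three values quoted above all lie within two such standard deviations of 28.19. The sketch below checks that standard deviation; the variable names and the number of repetitions are assumptions for illustration.

```{r}
set.seed(99)
n_obs <- 75
scale_null <- 28.18533   # exponential scale estimate reported in Part B

# Each value mimics one scale_1: the mean of a single simulated sample
sim_means <- replicate(2000, mean(rexp(n_obs, rate = 1 / scale_null)))

c(simulated_sd = sd(sim_means), theoretical_sd = scale_null / sqrt(n_obs))
```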

