data = rbind.data.frame(12.4, 18.7, 25.3, 30.1, 33.5, 35.2, 38.9, 40.3, 42.7, 45.1, 47.6, 49.8, 52.4, 55.0, 
57.3, 60.2, 62.8, 65.1, 67.9, 70.5, 72.3, 75.6, 78.2, 80.9, 83.4, 85.7, 88.1, 90.6, 
93.2, 95.8, 98.4, 101.0, 104.5, 107.3, 110.6, 113.2, 116.8, 120.1, 123.7, 127.4,
130.9, 134.5, 138.2, 142.0, 146.3, 150.7, 155.2, 160.8, 168.4, 175.9)
colnames(data) = "time"

Intro and Setup

For this analysis we are examining a dataset composed of 50 survival times (the recorded times from start until failure) of identically manufactured ball bearings. Over previous months, associates have recognized what they believe to be an increase in the unreliability of these bearings. A summary and visualization of the 50 recorded survival times are below.

sum_stats_table = function(variable){
  table = cbind(
    round(min(variable),2),
    round(quantile(variable, 0.25),2),
    round(median(variable),2),
    round(mean(variable),2),
    round(quantile(variable, 0.75),2),
    round(max(variable),2)
  )
  colnames(table) = c("Min", "Q1", "Med", "Mean", "Q3", "Max")
  rownames(table) = NULL
  return(table)
}

Summary_Table = sum_stats_table(variable = data$time)
kable(Summary_Table,align = "c",
      caption = "<span style='color:##000000;'>
      Summary Stats of Survival Times </span>") %>%
  kable_styling(
    bootstrap_options = c("striped", "bordered"),
    full_width = FALSE,
    position = "center")
Summary Stats of Survival Times
Min Q1 Med Mean Q3 Max
12.4 53.05 84.55 87.61 119.27 175.9
ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  labs(title = "Distribution of Survival Times",
       x = "Time",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3)
  ) # this is the equivalent of box(lwd = 3) in base R

For the purposes of this analysis, our default assumption (null hypothesis) is that the \(\beta\) term of the broader population of ball bearings at hand is equal to 1. If this is truly the case, then our population is exponentially distributed. However, given the makeup of the density formulas for both exponential and Weibull distributions, if \(\beta \ne\) 1, that means our population is Weibull distributed. Knowing a population’s true distribution is valuable for future predictive analysis, making this a worthwhile exercise.

\[\begin{align} H_0&: \beta = 1 \quad \text{Takeaway: Exponential} \\ H_a&: \beta \neq 1 \quad \text{Takeaway: Weibull} \end{align}\]

a.) MLE of Weibull Distribution

The first way that we will attempt to estimate \(\beta\) (the shape parameter) is via maximum likelihood estimation (MLE). We will also simultaneously use MLE to estimate the scale parameter, \(\lambda\). We will denote these estimates as \(\hat{\beta}\) and \(\hat{\lambda}\) respectively.

In the process of performing likelihood estimation, the log-likelihood function for the Weibull distribution as well as the partial derivatives of that very function with respect to both \(\beta\) and \(\lambda\) are below.

\[\Large \ell(\beta,\lambda|t) = [n * ln(\beta)] - [n*\beta *ln(\lambda)] + [(\beta - 1) * \sum(ln(t_i))] - \sum(\frac{(t_i)}{\lambda})^\beta \]

\[\Large \frac{\partial f}{\partial \beta} = \frac{n}{\beta} - n * ln(\lambda) + \sum(ln(t_i) - \sum(ln\frac{t_i}{\lambda}) * (\frac{t_i}{\lambda})^\beta \]
\[\Large \frac{\partial f}{\partial \lambda} = \frac{\beta}{\lambda} * \sum(\frac{t_i}{\lambda})^\beta - n \]

## the log likelihood function simply runs a logarithm through all the terms and accounts for the summation of all data points, as opposed to one value of our variable t


##### Overall process is from section 5.2 of this week's notes

## reference week 7 section 5 notes for distribution-specific functions and formulas below

#### define likelihood function
## *negative* log likelihood function for Weibull distribution
  # by negative log likelihood, this function....
log_likelihood = function(params, data) {
  beta = params[1]  # shape
  lambda = params[2]  # scale 
  
  # Ensure parameters are positive
  if (beta <= 0 || lambda <= 0) {
    return(1e10)  # Return large value for invalid parameters
  }
  
  n = length(data)
  log_lik = n * log(beta) - n * beta * log(lambda) + 
    (beta - 1) * sum(log(data)) - sum((data / lambda)^beta)
  
  return(log_lik)  # Return log-likelihood! NOT the negative log-likelihood
}


#### define score equations (partial derivatives)
## gradient (partial derivatives) of *negative* log likelihood function  for Weibull distribution
log_likelihood_gradient = function(params, data) {
  beta = params[1]       # shape parameter
  lambda = params[2]  # scale parameter
  
  n = length(data)
  
  partial_deriv_beta = n/beta - n * log(lambda) + sum(log(data)) - 
        sum(log(data/lambda) * (data/lambda)^beta)
  
  partial_deriv_lambda =  (beta/lambda)*(sum((data/lambda)^beta) -n)
  
  return(c(partial_deriv_beta,
           partial_deriv_lambda))  # Gradient of log-likelihood, NOT the negative scores
}


#### use log likelihood and score equations in optim function

# need to have good initial guesses for beta and lambda for the optim function
  # reference section 5.2 of week 7 "Asymptotic Sampling" notes
beta_guess = (sd(data$time)/mean(data$time))^(-1.086)

lambda_guess = mean(data$time)/gamma(1 + 1/beta_guess)


weib_mle_estimates = optim(
  par = c(beta_guess, lambda_guess), # par argument is initial "guesses"
  data = data$time,
  fn = log_likelihood,
  gr = log_likelihood_gradient,
  method = "L-BFGS-B",
  hessian = TRUE,
    # will use the Hessian matrix later on
  control = list(maxit = 1000, 
                 fnscale  = -1,
                 trace = FALSE,
                 abstol = 1e-8)
)
#weib_mle_estimates
  
weib_mle_beta = weib_mle_estimates$par[1] # shape
weib_mle_lambda = weib_mle_estimates$par[2] # scale

Utilizing R’s optim() function, along with the log likelihood and partial derivatives of the Weibull distribution, this round of MLE-based estimates yielded a value for \(\beta\) of roughly 2.206 and an approximation of 99.021 for \(\lambda\).

As a means of further investigation, I overlaid the density curve of a Weibull distribution with those exact shape (\(\beta\)) and scale (\(\lambda\)) characteristics atop the histogram of our available sample data. The results below show that our sample data certainly does not seem to strongly reject the potential accuracy of our maximum likelihood estimation.

ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  # use stat_function in ggplot to show density curves of other distributions
    stat_function(fun = dweibull, 
                args = list(shape = weib_mle_beta, scale = weib_mle_lambda),
                color = "black",
                linewidth = 1.5,
                xlim = range(0:200))+
            # xlim argument so curve doesn't awkardly cut out at edge points of data
  labs(title = "Distribution of Survival Times",
       x = "Time",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3),
    legend.position = c(0.85, 0.85)
  )

If the population of ball bearing survival times truly is distributed in a Weibull manner, then it seems very likely that \(\beta\) and \(\lambda\) are about equal to 2.206 and 99.021 respectively. I will now perform the MLE process again, this time operating under the assertion that the population is exponentially distributed.

b.) MLE of Exponential Distribution

\[\Large \ell(\lambda|t) = -[n * ln(\lambda)] - \frac{\sum(t_i)}{\lambda} \]

\[\Large \frac{\partial f}{\partial \lambda} = \frac{n}{\lambda} + \frac{\sum(t_i)}{\lambda^2} \]


# exponential log likelihood follows same pattern as Weibull log likelihood but a MUCH simpler distribution
exp_log_likelihood = function(lambda, data) {
 if (lambda <= 0) return (1e10)
 n = length(data)
 log_lik = -n * log(lambda) - sum(data)/lambda
 return(log_lik)
}

### now define partial derivative w/ respect to lambda (the score function or gradient)
  # performed via quick hand calculation

exp_gradient = function(lambda, data) {
  if (lambda <= 0) return (NA)
  n = length(data)
  partial_deriv_lambda = (n/lambda) + (sum(data)/(lambda^2))
  return(partial_deriv_lambda)
}


exp_mle_estimates = optim(
  par = 1/mean(data$time),
    # the initial guess can simply be 1/(mean of the data) since we are performing this analysis under the theoretical that it is an EXPONENTIAL distribution. The mean of an exponential distribution is 1 divided by the the rate parameter lambda
  data = data$time,
  fn = exp_log_likelihood,
  gr = exp_gradient,
  method = "L-BFGS-B",
  hessian = TRUE,
    # will use the Hessian matrix later on
  control = list(maxit = 1000, 
                 fnscale  = -1,
                 trace = FALSE,
                 abstol = 1e-8)
)
  
#exp_mle_estimates$par

exp_mle_lambda = exp_mle_estimates$par

Once again using Utilizing R’s optim() along with the principles of maximum likelihood estimation, I calculated an estimation of \(\lambda\). Under the assumption that the ball bearing survival time population is exponentially distributed, we would estimate \(\lambda\) to be roughly 88.132.

Below is the sample data with an exponential curve with said lambda (scale) parameter overlaid. While this does curve not appear to match the data to as close an extent as the previous Weibull curve did, it is still reasonably similar in pattern and scale.

ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  # use stat_function in ggplot to show density curves of other distributions
    stat_function(fun = dexp, 
                args = list(rate = 1/exp_mle_lambda),
                color = "red",
                linewidth = 1.5,
                xlim = range(0:200)
                )+
            # xlim argument so curve doesn't awkardly cut out at edge points of data
  labs(title = "Distribution of Survival Times",
       x = "Time",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3),
    legend.position = c(0.85, 0.85)
  )

c.) Likelihood Ratio Test

Now having two different estimates of \(\lambda\), formulated via two differing distribution assumptions, I went ahead and performed a likelihood ratio test to help determine which presumed value of \(\lambda\) is more effective as per our sample data serving as evidence.

The likelihood ratio test, which involves the subtraction of the log likelihood estimate of a simpler model (the exponential performed in part b) from the log likelihood estimate of a more complex and more parameterized model (the Weibull performed in part a), is meant to measure the improvement in fit that the more complex and unrestricted model provides. The null hypothesis of this test is that, outside of sampling variation, there is no sufficient difference in fit between the simpler and more complex models. The test statistic for this hypothesis test, \(\Lambda\), follows a chi-square distribution with degrees of freedom equaling the number of parameters actively being tested for.

###### handle Weibull distribution first
# weib_mle_beta and weib_mle_lambda are UNRESTRICTED MLE VALUES, but now we must find

# null hypothesis is that beta = 1
# if null hypothesis is true, then distribution is really just exponential

## the likelihood ratio test is just 2 * 
  # [log likelihood evaluated in unrestricted way(which is what we did in part a)] -
  # [log likelihood evaluated in restricted or simple way(which we did in part b)]

# test stat for likelihood ratio test is uppercase Lambda

Lambda = 2 * (weib_mle_lambda - exp_mle_lambda)

p_value_likelihoodratio = 1 - pchisq(Lambda, df = 1)
  # test stat distribution for likelihood ratio test is chi-square distributed
  # df = number of parameters we are estimating in function

Performing the likelihood ratio test, the resulting value of \(\Lambda\) was roughly 21.779, which yielded a p value of roughly 3^{-6}. This of course leads us to reject the null hypothesis, as there seems to be a large enough amount of evidence that significant amount of fit is lost by operating under the assumption of an exponential distribution. The findings of this test are rather unsurprising, given the histogram visuals we looked at earlier.

d.) Wald Test

For further confirmation that the greater complexity which the Weibull carries in comparison to the exponential distribution is a worthwhile tradeoff for increased accuracy, I also performed the Wald test. Like the likelihood ratio test, the Wald test also follows a chi-square distribution, with the degrees of freedom acting as the number of parameters being estimated at that time.

The null hypothesis of the Wald test is that the parameter of interest (\(\beta\) in our case) is close enough or equal to whatever previously hypothesized value. In this instance, we will compare the unconstrained MLE-based estimate of \(\beta\) found in part a to the value of \(\beta\) (1) that would make this distribution of ball bearing survival times truly exponential.

invisible(
  weib_mle_estimates$hessian
)
## the inverse of the Hessian matrix is the covariance
 # in R, the solve function returns the inverse of a matrix object
covar_matrix = solve(-weib_mle_estimates$hessian)

# already have weib_mle_beta as our estiamte based on the unrestricted likelihood estimation method

var_beta = covar_matrix[1,1]

wald_stat = (weib_mle_beta - 1)^2 / var_beta

## once again, wald_stat is chi square just like likelihood ratio
p_value_wald = 1 - pchisq(wald_stat, df = 1)

Per the results of the Wald test, there is sufficient statistical evidence that we cannot simply operate under the assumption that the true shape parameter \(\beta\) of our distribution is 1. The Wald test yielded a test statistic and p value of approximately 23.111 and 2^{-6} respectively.

e.) Summary and Conclusions

Given the results of the two performed hypothesis tests, we cannot justify treating the distribution of ball bearing survival times as exponential. Doing so would result in a significant amount of predictive accuracy lost per the results of the likelihood ratio test. And per the Wald test, there is not sufficient evidence that the \(\beta\) parameter is “flexible” enough to be assumed at a value of one, given our available data.

Knowing this, we should move forward with the assumptions that the broader population of bearing survival times is Weibull distributed, with a shape parameter of approximately 2.206 and a scale parameter of about 99.021. The visual for this distribution is below once again.

ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  # use stat_function in ggplot to show density curves of other distributions
    stat_function(fun = dweibull, 
                args = list(shape = weib_mle_beta, scale = weib_mle_lambda),
                color = "black",
                linewidth = 1.5,
                xlim = range(0:200))+
            # xlim argument so curve doesn't awkardly cut out at edge points of data
  labs(title = "Distribution of Survival Times",
       x = "Time",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3),
    legend.position = c(0.85, 0.85)
  )

Our visual shows a rather immediate uptick in values, and then a noticeable rightward skew. This is a standard pattern for a Weibull distribution, making it a powerful modeling framework for areas like engineering and life expectancies.

---
title: "Hypothesis Testing to Determine Potential Overfitting of Distribution"
author: "Chris Bahm"
date: "April 13, 2026"
output:
  html_document:
    toc: true
    toc_float:
      collapsed: true
      smooth_scroll: true
    toc_depth: 4
    fig_width: 6
    fig_height: 4
    fig_caption: true
    number_sections: false
    code_folding: hide
    code_download: true
    theme: lumen
    highlight: tango
  pdf_document:
    toc: true
    toc_depth: 4
    fig_caption: true
    number_sections: true
  word_document:
    toc: true
    toc_depth: 4
---

```{css, echo = FALSE}
div#TOC li {     /* table of content  */
    list-style:upper-roman;
    background-image:none;
    background-repeat:none;
    background-position:0;
}

h1.title {    /* level 1 header of title  */
  font-size: 24px;
  font-weight: bold;
  color: DarkRed;
  text-align: center;
}

h4.author { /* Header 4 - and the author and data headers use this too  */
  font-size: 18px;
  font-weight: bold;
  font-family: "Times New Roman", Times, serif;
  color: DarkRed;
  text-align: center;
}

h4.date { /* Header 4 - and the author and data headers use this too  */
  font-size: 18px;
  font-weight: bold;
  font-family: "Times New Roman", Times, serif;
  color: DarkBlue;
  text-align: center;
}

h1 { /* Header 1 - and the author and data headers use this too  */
    font-size: 20px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: center;
}

h2 { /* Header 2 - and the author and data headers use this too  */
    font-size: 18px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h3 { /* Header 3 - and the author and data headers use this too  */
    font-size: 16px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h4 { /* Header 4 - and the author and data headers use this too  */
    font-size: 14px;
  font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

/* Add dots after numbered headers */
.header-section-number::after {
  content: ".";
}
```

```{r setup, include=FALSE}
# code chunk specifies whether the R code, warnings, and output 
# will be included in the output files.

if (!require("knitr")) {                      # use conditional statement to detect
   install.packages("knitr")                  # whether a package was installed in
   library(knitr)                             # your machine. If not, install it and
}                                             # load it to the working directory.

if (!require(tidyverse)) {library(tidyvserse)} 

if (!require(GGally)) {library(GGally)} 

if (!require(kableExtra)) {library(kableExtra)} 

if (!require(ggplot2)) {library(ggplot2)} 

if (!require(car)) {library(car)} 

if (!require(dplyr)) {library(dplyr)} 

if (!require(pander)) {library(pander)} 

if (!require(forecast)) {library(forecast)} 

if (!require(lubridate)) {library(lubridate)} 

if (!require("scales")) {
install.packages("scales")                                        
library("scales") 
}

knitr::opts_chunk$set(
	echo = TRUE,
	message = FALSE,
	warning = FALSE,
	comment = NA,
	results = TRUE,
	fig.align = "center"
)
```

```{r}
data = rbind.data.frame(12.4, 18.7, 25.3, 30.1, 33.5, 35.2, 38.9, 40.3, 42.7, 45.1, 47.6, 49.8, 52.4, 55.0, 
57.3, 60.2, 62.8, 65.1, 67.9, 70.5, 72.3, 75.6, 78.2, 80.9, 83.4, 85.7, 88.1, 90.6, 
93.2, 95.8, 98.4, 101.0, 104.5, 107.3, 110.6, 113.2, 116.8, 120.1, 123.7, 127.4,
130.9, 134.5, 138.2, 142.0, 146.3, 150.7, 155.2, 160.8, 168.4, 175.9)
colnames(data) = "time"
```

# Intro and Setup
For this analysis we are examining a dataset composed of `r nrow(data)` survival times (the recorded times from start until failure) of identically manufactured ball bearings. Over previous months, associates have recognized what they believe to be an increase in the unreliability of these bearings. A summary and visualization of the `r nrow(data)` recorded survival times are below.
```{r}
sum_stats_table = function(variable){
  table = cbind(
    round(min(variable),2),
    round(quantile(variable, 0.25),2),
    round(median(variable),2),
    round(mean(variable),2),
    round(quantile(variable, 0.75),2),
    round(max(variable),2)
  )
  colnames(table) = c("Min", "Q1", "Med", "Mean", "Q3", "Max")
  rownames(table) = NULL
  return(table)
}

Summary_Table = sum_stats_table(variable = data$time)
kable(Summary_Table,align = "c",
      caption = "<span style='color:##000000;'>
      Summary Stats of Survival Times </span>") %>%
  kable_styling(
    bootstrap_options = c("striped", "bordered"),
    full_width = FALSE,
    position = "center")


ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  labs(title = "Distribution of Survival Times",
       x = "Time",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3)
  ) # this is the equivalent of box(lwd = 3) in base R


```

For the purposes of this analysis, our default assumption (null hypothesis) is that the $\beta$ term of the broader population of ball bearings at hand is equal to 1. If this is truly the case, then our population is *exponentially* distributed. However, given the makeup of the density formulas for both exponential and Weibull distributions, if $\beta \ne$ 1, that means our population is *Weibull* distributed. Knowing a population's true distribution is valuable for future predictive analysis, making this a worthwhile exercise.

\begin{align}
H_0&: \beta = 1 \quad \text{Takeaway: Exponential} \\
H_a&: \beta \neq 1 \quad \text{Takeaway: Weibull}
\end{align}


# a.) MLE of Weibull Distribution
The first way that we will attempt to estimate $\beta$ (the shape parameter) is via maximum likelihood estimation (MLE). We will also simultaneously use MLE to estimate the scale parameter, $\lambda$. We will denote these estimates as $\hat{\beta}$ and $\hat{\lambda}$ respectively.

In the process of performing likelihood estimation, the log-likelihood function for the Weibull distribution as well as the partial derivatives of that very function with respect to both $\beta$ and $\lambda$ are below.

$$\Large \ell(\beta,\lambda|t) = [n * ln(\beta)] - [n*\beta *ln(\lambda)] + [(\beta - 1) * \sum(ln(t_i))] - \sum(\frac{(t_i)}{\lambda})^\beta $$
<br>

$$\Large \frac{\partial f}{\partial \beta} = \frac{n}{\beta} - n * ln(\lambda) + \sum(ln(t_i) - \sum(ln\frac{t_i}{\lambda}) * (\frac{t_i}{\lambda})^\beta    $$
<br>
$$\Large \frac{\partial f}{\partial \lambda} =  \frac{\beta}{\lambda} * \sum(\frac{t_i}{\lambda})^\beta - n    $$
<br>

```{r}
## the log likelihood function simply runs a logarithm through all the terms and accounts for the summation of all data points, as opposed to one value of our variable t


##### Overall process is from section 5.2 of this week's notes

## reference week 7 section 5 notes for distribution-specific functions and formulas below

#### define likelihood function
## *negative* log likelihood function for Weibull distribution
  # by negative log likelihood, this function....
log_likelihood = function(params, data) {
  beta = params[1]  # shape
  lambda = params[2]  # scale 
  
  # Ensure parameters are positive
  if (beta <= 0 || lambda <= 0) {
    return(1e10)  # Return large value for invalid parameters
  }
  
  n = length(data)
  log_lik = n * log(beta) - n * beta * log(lambda) + 
    (beta - 1) * sum(log(data)) - sum((data / lambda)^beta)
  
  return(log_lik)  # Return log-likelihood! NOT the negative log-likelihood
}


#### define score equations (partial derivatives)
## gradient (partial derivatives) of *negative* log likelihood function  for Weibull distribution
log_likelihood_gradient = function(params, data) {
  beta = params[1]       # shape parameter
  lambda = params[2]  # scale parameter
  
  n = length(data)
  
  partial_deriv_beta = n/beta - n * log(lambda) + sum(log(data)) - 
        sum(log(data/lambda) * (data/lambda)^beta)
  
  partial_deriv_lambda =  (beta/lambda)*(sum((data/lambda)^beta) -n)
  
  return(c(partial_deriv_beta,
           partial_deriv_lambda))  # Gradient of log-likelihood, NOT the negative scores
}


#### use log likelihood and score equations in optim function

# need to have good initial guesses for beta and lambda for the optim function
  # reference section 5.2 of week 7 "Asymptotic Sampling" notes
beta_guess = (sd(data$time)/mean(data$time))^(-1.086)

lambda_guess = mean(data$time)/gamma(1 + 1/beta_guess)


weib_mle_estimates = optim(
  par = c(beta_guess, lambda_guess), # par argument is initial "guesses"
  data = data$time,
  fn = log_likelihood,
  gr = log_likelihood_gradient,
  method = "L-BFGS-B",
  hessian = TRUE,
    # will use the Hessian matrix later on
  control = list(maxit = 1000, 
                 fnscale  = -1,
                 trace = FALSE,
                 abstol = 1e-8)
)
#weib_mle_estimates
  
weib_mle_beta = weib_mle_estimates$par[1] # shape
weib_mle_lambda = weib_mle_estimates$par[2] # scale

```
Utilizing R's <span style="color:blue;">optim()</span> function, along with the log likelihood and partial derivatives of the Weibull distribution, this round of MLE-based estimates yielded a value for $\beta$ of roughly `r round(weib_mle_beta,3)` and an approximation of `r round(weib_mle_lambda,3)` for $\lambda$. 

As a means of further investigation, I overlaid the density curve of a **Weibull distribution** with those exact shape ($\beta$) and scale ($\lambda$) characteristics atop the histogram of our available sample data. The results below show that our sample data certainly does not seem to strongly reject the potential accuracy of our maximum likelihood estimation.
```{r}
ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  # use stat_function in ggplot to show density curves of other distributions
    stat_function(fun = dweibull, 
                args = list(shape = weib_mle_beta, scale = weib_mle_lambda),
                color = "black",
                linewidth = 1.5,
                xlim = range(0:200))+
            # xlim argument so curve doesn't awkardly cut out at edge points of data
  labs(title = "Distribution of Survival Times",
       x = "Time",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3),
    legend.position = c(0.85, 0.85)
  )

```

If the population of ball bearing survival times truly is distributed in a Weibull manner, then it seems very likely that $\beta$ and $\lambda$ are about equal to `r round(weib_mle_beta,3)` and  `r round(weib_mle_lambda,3)` respectively. I will now perform the MLE process again, this time operating under the assertion that the population is *exponentially* distributed.


# b.) MLE of Exponential Distribution

$$\Large \ell(\lambda|t) = -[n * ln(\lambda)] - \frac{\sum(t_i)}{\lambda}  $$
<br>

$$\Large \frac{\partial f}{\partial \lambda} = \frac{n}{\lambda} + \frac{\sum(t_i)}{\lambda^2}   $$

<br>

```{r}
# exponential log likelihood follows same pattern as Weibull log likelihood but a MUCH simpler distribution
exp_log_likelihood = function(lambda, data) {
 if (lambda <= 0) return (1e10)
 n = length(data)
 log_lik = -n * log(lambda) - sum(data)/lambda
 return(log_lik)
}

### now define partial derivative w/ respect to lambda (the score function or gradient)
  # performed via quick hand calculation

exp_gradient = function(lambda, data) {
  if (lambda <= 0) return (NA)
  n = length(data)
  partial_deriv_lambda = (n/lambda) + (sum(data)/(lambda^2))
  return(partial_deriv_lambda)
}


exp_mle_estimates = optim(
  par = 1/mean(data$time),
    # the initial guess can simply be 1/(mean of the data) since we are performing this analysis under the theoretical that it is an EXPONENTIAL distribution. The mean of an exponential distribution is 1 divided by the the rate parameter lambda
  data = data$time,
  fn = exp_log_likelihood,
  gr = exp_gradient,
  method = "L-BFGS-B",
  hessian = TRUE,
    # will use the Hessian matrix later on
  control = list(maxit = 1000, 
                 fnscale  = -1,
                 trace = FALSE,
                 abstol = 1e-8)
)
  
#exp_mle_estimates$par

exp_mle_lambda = exp_mle_estimates$par
```
Once again using Utilizing R's <span style="color:blue;">optim()</span> along with the principles of maximum likelihood estimation, I calculated an estimation of $\lambda$. Under the assumption that the ball bearing survival time population is *exponentially* distributed, we would estimate $\lambda$ to be roughly `r round(exp_mle_lambda,3)`. 

Below is the sample data with an <span style="color:red;">exponential curve</span> with said lambda (scale) parameter overlaid. While this does curve not appear to match the data to as close an extent as the previous Weibull curve did, it is still reasonably similar in pattern and scale.
```{r}
ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  # use stat_function in ggplot to show density curves of other distributions
    stat_function(fun = dexp, 
                args = list(rate = 1/exp_mle_lambda),
                color = "red",
                linewidth = 1.5,
                xlim = range(0:200)
                )+
            # xlim argument so curve doesn't awkardly cut out at edge points of data
  labs(title = "Distribution of Survival Times",
       x = "Time",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3),
    legend.position = c(0.85, 0.85)
  )

```


# c.) Likelihood Ratio Test
Now having two different estimates of $\lambda$, formulated via two differing distribution assumptions, I went ahead and performed a likelihood ratio test to help determine which presumed value of $\lambda$ is more effective as per our sample data serving as evidence. 

The likelihood ratio test, which involves the subtraction of the log likelihood estimate of a simpler model (the exponential performed in part b) from the log likelihood estimate of a more complex and more parameterized model (the Weibull performed in part a), is meant to measure the improvement in fit that the more complex and unrestricted model provides. The null hypothesis of this test is that, outside of sampling variation, there is no sufficient difference in fit between the simpler and more complex models. The test statistic for this hypothesis test, $\Lambda$, follows a chi-square distribution with degrees of freedom equaling the number of parameters actively being tested for.

```{r}
###### handle Weibull distribution first
# weib_mle_beta and weib_mle_lambda are UNRESTRICTED MLE VALUES, but now we must find

# null hypothesis is that beta = 1
# if null hypothesis is true, then distribution is really just exponential

## the likelihood ratio test is just 2 * 
  # [log likelihood evaluated in unrestricted way(which is what we did in part a)] -
  # [log likelihood evaluated in restricted or simple way(which we did in part b)]

# test stat for likelihood ratio test is uppercase Lambda

Lambda = 2 * (weib_mle_lambda - exp_mle_lambda)

p_value_likelihoodratio = 1 - pchisq(Lambda, df = 1)
  # test stat distribution for likelihood ratio test is chi-square distributed
  # df = number of parameters we are estimating in function

```

Performing the likelihood ratio test, the resulting value of $\Lambda$ was roughly `r round(Lambda,3)`, which yielded a p value of roughly `r round(p_value_likelihoodratio, 6)`. This of course leads us to reject the null hypothesis, as there seems to be a large enough amount of evidence that significant amount of fit is lost by operating under the assumption of an exponential distribution. The findings of this test are rather unsurprising, given the histogram visuals we looked at earlier.

# d.) Wald Test

For further confirmation that the greater complexity which the Weibull carries in comparison to the exponential distribution is a worthwhile tradeoff for increased accuracy, I also performed the Wald test. Like the likelihood ratio test, the Wald test also follows a chi-square distribution, with the degrees of freedom acting as the number of parameters being estimated at that time. 

The null hypothesis of the Wald test is that the parameter of interest ($\beta$ in our case) is close enough or equal to whatever previously hypothesized value. In this instance, we will compare the unconstrained MLE-based estimate of $\beta$ found in part a to the value of $\beta$ (1) that would make this distribution of ball bearing survival times truly exponential.
```{r}
invisible(
  weib_mle_estimates$hessian
)
## the inverse of the Hessian matrix is the covariance
 # in R, the solve function returns the inverse of a matrix object
covar_matrix = solve(-weib_mle_estimates$hessian)

# already have weib_mle_beta as our estiamte based on the unrestricted likelihood estimation method

var_beta = covar_matrix[1,1]

wald_stat = (weib_mle_beta - 1)^2 / var_beta

## once again, wald_stat is chi square just like likelihood ratio
p_value_wald = 1 - pchisq(wald_stat, df = 1)

```

Per the results of the Wald test, there is sufficient statistical evidence that we **cannot** simply operate under the assumption that the true shape parameter $\beta$ of our distribution is 1. The Wald test yielded a test statistic and p value of approximately `r round(wald_stat, 3)` and `r round(p_value_wald,6)` respectively.

# e.) Summary and Conclusions
Given the results of the two performed hypothesis tests, we **cannot** justify treating the distribution of ball bearing survival times as exponential. Doing so would result in a significant amount of predictive accuracy lost per the results of the likelihood ratio test. And per the Wald test, there is not sufficient evidence that the $\beta$ parameter is "flexible" enough to be assumed at a value of one, given our available data.

Knowing this, we should move forward with the assumptions that the broader population of bearing survival times is **Weibull distributed**, with a shape parameter of approximately `r round(weib_mle_beta,3)` and a scale parameter of about `r round(weib_mle_lambda,3)`. The visual for this distribution is below once again.

```{r}
ggplot(data = data) +
  geom_histogram(mapping = aes(x = time, y = after_stat(density)),
                 color = "black",
                  # argument color in geom_histogram is for color of bin lines
                 fill = "lightblue",
                 bins = 10,) +
  # use stat_function in ggplot to show density curves of other distributions
    stat_function(fun = dweibull, 
                args = list(shape = weib_mle_beta, scale = weib_mle_lambda),
                color = "black",
                linewidth = 1.5,
                xlim = range(0:200))+
            # xlim argument so curve doesn't awkardly cut out at edge points of data
  labs(title = "Distribution of Survival Times",
       x = "Time",
       y = "Density") + 
  theme(
    panel.border = element_rect(color = "black", fill = NA, linewidth = 3),
    legend.position = c(0.85, 0.85)
  )

```

Our visual shows a rather immediate uptick in values, and then a noticeable rightward skew. This is a standard pattern for a Weibull distribution, making it a powerful modeling framework for areas like engineering and life expectancies.
