Assignment Objectives

  • Enhance understanding the procedure of likelihood-based chi-square hypothesis testing .

  • Implement the procedures for detecting overfitting/underfitting issues in practical applications.

Policies of Using AI Tools

Policy on AI Tool Use: Please adhere to the AI tool policy specified in the course syllabus. The direct copying of AI-generated content is strictly prohibited. All submitted work must reflect your own understanding; where external tools are consulted, content must be thoroughly rephrased and synthesized in your own words.

Code Inclusion Requirement: Any code included in your essay must be properly commented to explain the purpose and/or expected output of key code lines. Submitting AI-generated code without meaningful, student-added comments will not be accepted.

Testing Overfitting/Underfitting

In Machine Learning and Statistics, overfitting occurs when a model is too complex and learns noise, leading to poor performance on new data, while underfitting happens when a model is too simple to capture important patterns, resulting in high errors overall; both issues are explained by the Bias–Variance Tradeoff and can cause unreliable predictions in real-world applications.

The probability density function (PDF) of the Weibull distribution is:

\[ f(t; \lambda, \beta) = \frac{\beta}{\lambda} \left( \frac{t}{\lambda} \right)^{\beta-1} \exp\left[ -\left( \frac{t}{\lambda} \right)^\beta \right], \quad t \ge 0 \] where \(\lambda > 0\) is the scale parameter (characteristic life) and \(\beta > 0\) is the shape parameter.

When \(\beta = 1\), the Weibull PDF simplifies to the exponential PDF:

\[ f(t; \lambda) = \frac{1}{\lambda} \exp\left( -\frac{t}{\lambda} \right) \] with constant hazard rate \(h(t) = 1/\lambda\).

This assignment focuses on performing a hypothesis test for the shape parameter (\(\beta\)) of the Weibull distribution within a reliability mode

\[\begin{align} H_0&: \beta = 1 \quad \text{(Exponential model, simpler)} \\ H_1&: \beta \neq 1 \quad \text{(Weibull model, more complex)} \end{align}\]


Question: Reliability Application

A mid-sized manufacturing company producing industrial conveyor systems began experiencing unexpected downtime in one of its distribution facilities, prompting concern about the reliability of a newly sourced batch of ball bearings used in the motor assemblies. These bearings, supplied by a vendor adopting cost-saving production methods, were installed across multiple units operating under continuous load conditions. After several months, maintenance logs revealed a pattern of increasing failures, with components lasting anywhere from a few dozen to over 150 hours before breakdown. To investigate, the engineering team collected time-to-failure data from 50 identical bearings and conducted a Weibull analysis within the framework of Reliability Engineering. The 50 time-to-failure (survival time) are:

12.4, 18.7, 25.3, 30.1, 33.5, 35.2, 38.9, 40.3, 42.7, 45.1, 47.6, 49.8, 52.4, 55.0, 
57.3, 60.2, 62.8, 65.1, 67.9, 70.5, 72.3, 75.6, 78.2, 80.9, 83.4, 85.7, 88.1, 90.6, 
93.2, 95.8, 98.4, 101.0, 104.5, 107.3, 110.6, 113.2, 116.8, 120.1, 123.7, 127.4,
130.9, 134.5, 138.2, 142.0, 146.3, 150.7, 155.2, 160.8, 168.4, 175.9

This assignment focuses on hypothesis \(H_0: \beta = 1\) (exponential) against \(H_1: \beta \neq 1\) (Weibull). This framework detects overfitting (fitting a Weibull when exponential is true) and underfitting (fitting exponential when Weibull with \(\beta \neq 1\) is true).

a). Find the MLE of the Weibull parameters \(\lambda\) (scale) and \(\beta\) (shape), denoted by \(\hat{\lambda}\) and \(\hat{\beta}\), respectively, using the optim() procedure. [Hint: You should provide explicit expressions for the log-likelihood and gradient functions of the Weibull distribution parameters.]

wbearing = sort(c(12.4, 18.7, 25.3, 30.1, 33.5, 35.2, 38.9, 40.3, 42.7, 45.1, 47.6, 49.8, 52.4, 55.0, 
57.3, 60.2, 62.8, 65.1, 67.9, 70.5, 72.3, 75.6, 78.2, 80.9, 83.4, 85.7, 88.1, 90.6, 
93.2, 95.8, 98.4, 101.0, 104.5, 107.3, 110.6, 113.2, 116.8, 120.1, 123.7, 127.4,
130.9, 134.5, 138.2, 142.0, 146.3, 150.7, 155.2, 160.8, 168.4, 175.9))


#prepare weilbull distrubbtion log likelihood to use optim()
weil_loglik= function(params){
  
  lambda = params[1]
  beta = params[2]
  n = length(wbearing)
  
  if (lambda <= 0 || beta <= 0 ) return(-Inf)

  #log likelihood formula
  beta_shape = n * log(beta)
  scale_norm = n * beta * log(lambda) #normalizes the distrubution on scale parameter
  log_data_sum = (beta -1) * sum(log(wbearing)) #sums the logs
  exponential_kernel = sum((wbearing/lambda)^beta) #exponential kernel
  log_lik = beta_shape + log_data_sum - scale_norm - exponential_kernel
  
  return(log_lik)
}


#maximizing the likelihood to find the MLE
mle_weibull = optim(par = c(mean(wbearing), 1),
                     fn = weil_loglik,
                     hessian= TRUE,#output the hessian matrix
                     control = list(fnscale= -1) #fnsalce = -1 maximizes the likelihood
                     
                     )

# 3. Extract the results
hat_lambda <- mle_weibull$par[1]
hat_beta   <- mle_weibull$par[2]

cat("MLE for Lambda (Scale):", hat_lambda, 
    "\nMLE for Beta (Shape):", hat_beta)
MLE for Lambda (Scale): 99.02759 
MLE for Beta (Shape): 2.205876

b). Find the MLE of the exponential parameter \(\lambda\) (scale), denoted by \(\hat{\lambda}\), using any procedure. [Hint: You should provide explicit expressions for the log-likelihood and gradient functions of the exponential distribution parameters.]

#defining the exponential log likelihood
exp_loglik = function(lambda){
  
  n = length(wbearing)
  e_log =-n * log(lambda) -sum(wbearing)/lambda
  return(e_log)
}

# exponential gradient (score function)

exp_gradient = function(lambda){
  n = length(wbearing)
  deriv_lambda = -n / lambda + sum(wbearing)/ (lambda^2) #derivative in respect to lambda
  return(deriv_lambda)
  
}

#finding the MLE for exponential lambda

hat_lambda_exp = mean(wbearing)

#store max log like

loglik_exp_max = exp_loglik(hat_lambda_exp)

cat("Exponential MLE for Lambda:", hat_lambda_exp, 
    "\nMax Log-Likelihood (Null Model):", loglik_exp_max)
Exponential MLE for Lambda: 87.61 
Max Log-Likelihood (Null Model): -273.6448

c). Perform the likelihood ratio \(\chi^2\) test on \(\beta = 1\). What is the p-value? [Hint: Use the results in a) and b).]

#we can now measure the distance between the two models
# The likelihood ratio test asks, "is the improvement in the 'height' of the log likelihood from the exponential model to the weibull model large enough to be statistically significant


#use the (weibull model) maximum log likelihood (from part a) 

loglik_weibull_max = mle_weibull$value

# calculate the likelihood ratio test
  #find the difference between the log likelihood maximum and log_weillbull maximum
  #is the more complex model better than the simpler model
#we use the logs and -2 to force a chi square distribution 

like_ratio_test_statistic = -2 * (loglik_exp_max - loglik_weibull_max)

#calculate p value for test statistic from chiq square distrubtion

pval_lr_test = 1 - pchisq(like_ratio_test_statistic, df=1)


cat("Likelihood ratio test statistic:", like_ratio_test_statistic, 
    "\np-value (LRT):", pval_lr_test)
Likelihood ratio test statistic: 34.44995 
p-value (LRT): 4.373535e-09

d). Perform the Wald \(\chi^2\) on the same hypothesis \(\beta = 1\). What is the p-value? [Hint: You need to find the observed Fisher information matrix (i.e., the negative Hessian matrix from optim()) based on the Weibull distribution. The inverse of the negative observed Hessian matrix is the covariance matrix.]

#the observed fisher information(negative hessian)
#(the hessian is the matrix of the second derivative 


fisher_info = -mle_weibull$hessian

#run a variance covariance matrix (inverse of the fisher information)

var_cov_matrix = solve(fisher_info)

#extract the variance for beta
#since params[2] was beta, the variance of beta is the 2nd row, 2nd column
var_beta = var_cov_matrix[2,2]


## Wald test statistic
    #we use wald to determine if the variable is significant and should be used in the final model 
# Formula: (Estimate - Null)^2 / Variance

wald_statistic = (hat_beta- 1)^2 /var_beta

#p value from wald chi square distrubution

p_val_wald = 1 - pchisq(wald_statistic, df = 1)

cat("Wald Statistic:", wald_statistic, 
    "\np-value:", p_val_wald)
Wald Statistic: 23.10901 
p-value: 1.530721e-06

e). Write a summary of the above analyses to address the following:

  • Whether the two tests generated the same results.

  • Which model is recommended for the data.

  • Draw the density curve based on the MLE(s) of the parameter(s) and describe the distribution of the time-to-failure.

Both the Likelihood Ratio Test (LR=34.4499523, p= 4.3735351^{-9} ) and Wald Test (W= 23.1090057, p = 1.5307208^{-6}) are statistically significant. We can concluded that the slope is statistically different than 1 and that the converyer belt bearings do not fail randomly. The estimated shape parameter at 2.21 suggests the conveyers increasingly fail overtime. Considering the moderate sample of 50, the Likelihood ratio test (LRT) is the preferred test to use. LRT is most reliable for accuracy and closest to nominal Type I error rates as well as robust in being invariant to parameter transformations. The sample size does not leave computational concern. The confirmation that both LRT and Wald are significant suggests a strong relationship, although the Wald test performs poorly in skewness and nonlinear parameters which the weilbull distribution tends to hold. A drop from the LRT from being 34.4499523 to W= 23.1090057, may suggest the weilbull test could not strongly handle the distribution’s skewness. If there is discrepancy between knowing to use the simple exponential or more complex weibull model, use the weibull because it better explains the distribution relationship as the exponential model underfits the data.

# set range of time values for the x-axis
time_seq =seq(0, 200, length.out = 1000)

#calculate density values for both models using the MLEs from  Part A and Part B
weibull_dens= dweibull(time_seq, shape = hat_beta, scale = hat_lambda)
exp_dens =dexp(time_seq, rate = 1/hat_lambda_exp)


plot(time_seq, weibull_dens, type = "l", col = "darkred", lwd = 3,
     main = "Bearing Time-to-Failure: Weibull vs. Exponential",
     xlab = "Time (hours)", ylab = "Density",
     ylim = c(0, max(weibull_dens) * 1.1))


lines(time_seq, exp_dens, col = "blue", lwd = 2, lty = 2) # Add the Exponential curve for comparison

# histogram --> actual data to see the 'fit'
hist(wbearing, prob = TRUE, add = TRUE, col = rgb(0.5, 0.5, 0.5, 0.3), border = "white")

# legend
legend("topright", legend = c("Fitted Weibull (MLE)", "Fitted Exponential (Null)"),
       col = c("darkred", "blue"), lty = c(1, 2), lwd = 2)

---
title: "Assignment 11: Detecting Overfitting and Overfitting Issues"
author: "Ezana Rivers "
date: " 4-13: "
output:
  html_document: 
    toc: yes
    toc_depth: 4
    toc_float: yes
    number_sections: no
    toc_collapsed: yes
    code_folding: hide
    code_download: yes
    smooth_scroll: yes
    highlight: monochrome
    theme: spacelab
  word_document: 
    toc: yes
    toc_depth: 4
    fig_caption: yes
    keep_md: yes
  pdf_document: 
    toc: yes
    toc_depth: 4
    fig_caption: yes
    number_sections: yes
    fig_width: 3
    fig_height: 3
editor_options: 
  chunk_output_type: inline
---

```{css, echo = FALSE}
#TOC::before {
  content: "Table of Contents";
  font-weight: bold;
  font-size: 1.2em;
  display: block;
  color: navy;
  margin-bottom: 10px;
}


div#TOC li {     /* table of content  */
    list-style:upper-roman;
    background-image:none;
    background-repeat:none;
    background-position:0;
}

h1.title {    /* level 1 header of title  */
  font-size: 22px;
  font-weight: bold;
  color: DarkRed;
  text-align: center;
  font-family: "Gill Sans", sans-serif;
}

h4.author { /* Header 4 - and the author and data headers use this too  */
  font-size: 15px;
  font-weight: bold;
  font-family: system-ui;
  color: navy;
  text-align: center;
}

h4.date { /* Header 4 - and the author and data headers use this too  */
  font-size: 18px;
  font-weight: bold;
  font-family: "Gill Sans", sans-serif;
  color: DarkBlue;
  text-align: center;
}

h1 { /* Header 1 - and the author and data headers use this too  */
    font-size: 20px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: center;
}

h2 { /* Header 2 - and the author and data headers use this too  */
    font-size: 18px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h3 { /* Header 3 - and the author and data headers use this too  */
    font-size: 16px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h4 { /* Header 4 - and the author and data headers use this too  */
    font-size: 14px;
  font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

/* Add dots after numbered headers */
.header-section-number::after {
  content: ".";

body {background-color: #ffffff;
      color: #000000;
      font-family: Arial, sans-serif;
      font-size: 1rem;
      line-height: 1.6;
      }

.highlightme { background-color:yellow; }

p { background-color:white; }

}
```

```{r setup, include=FALSE}
# code chunk specifies whether the R code, warnings, and output 
# will be included in the output files.
if (!require("knitr")) {
   install.packages("knitr")
   library(knitr)
}
if (!require("pander")) {
   install.packages("pander")
   library(pander)
}
if (!require("ggplot2")) {
  install.packages("ggplot2")
  library(ggplot2)
}
if (!require("tidyverse")) {
  install.packages("tidyverse")
  library(tidyverse)
}

if (!require("plotly")) {
  install.packages("plotly")
  library(plotly)
}

if (!require("VGAM")) {
  install.packages("VGAM")
  library(VGAM)
}
if (!require("boot")) {
  install.packages("boot")
  library(boot)
}
if (!require("effsize")) {
  install.packages("effsize")
  library(effsize)
}
## library(effsize)
#### VGAM
knitr::opts_chunk$set(echo = TRUE,       # include code chunk in the output file
                      warning = FALSE,   # sometimes, you code may produce warning messages,
                                         # you can choose to include the warning messages in
                                         # the output file. 
                      results = TRUE,    # you can also decide whether to include the output
                                         # in the output file.
                      message = FALSE,
                      comment = NA
                      )  
```
 
 \
 
## **Assignment Objectives** 

<p>
* Enhance understanding the procedure of likelihood-based chi-square hypothesis testing .

* Implement the procedures for detecting overfitting/underfitting issues in practical applications.
</p>


## **Policies of Using AI Tools**

<p>
**Policy on AI Tool Use**: Please adhere to the AI tool policy specified in the course syllabus. The direct copying of AI-generated content is strictly prohibited. All submitted work must reflect your own understanding; where external tools are consulted, content must be thoroughly rephrased and synthesized in your own words.
</p>

<p>
**Code Inclusion Requirement**: Any code included in your essay must be properly commented to explain the purpose and/or expected output of key code lines. Submitting AI-generated code without meaningful, student-added comments will not be accepted.
</p>


## Testing Overfitting/Underfitting

In Machine Learning and Statistics, overfitting occurs when a model is too complex and learns noise, leading to poor performance on new data, while underfitting happens when a model is too simple to capture important patterns, resulting in high errors overall; both issues are explained by the Bias–Variance Tradeoff and can cause unreliable predictions in real-world applications.


The probability density function (PDF) of the Weibull distribution is:

$$
f(t; \lambda, \beta) = \frac{\beta}{\lambda} \left( \frac{t}{\lambda} \right)^{\beta-1} \exp\left[ -\left( \frac{t}{\lambda} \right)^\beta \right], \quad t \ge 0
$$
where $\lambda > 0$ is the scale parameter (characteristic life) and $\beta > 0$ is the shape parameter.

When $\beta = 1$, the Weibull PDF simplifies to the exponential PDF:

$$
f(t; \lambda) = \frac{1}{\lambda} \exp\left( -\frac{t}{\lambda} \right)
$$
with constant hazard rate $h(t) = 1/\lambda$.


<p><font color = "darkred">**This assignment focuses on performing a hypothesis test for the shape parameter ($\beta$) of the Weibull distribution within a reliability mode**</font></p>


\begin{align}
H_0&: \beta = 1 \quad \text{(Exponential model, simpler)} \\
H_1&: \beta \neq 1 \quad \text{(Weibull model, more complex)}
\end{align}


\

## **Question: Reliability Application**

<p>
A mid-sized manufacturing company producing industrial conveyor systems began experiencing unexpected downtime in one of its distribution facilities, prompting concern about the reliability of a newly sourced batch of ball bearings used in the motor assemblies. These bearings, supplied by a vendor adopting cost-saving production methods, were installed across multiple units operating under continuous load conditions. After several months, maintenance logs revealed a pattern of increasing failures, with components lasting anywhere from a few dozen to over 150 hours before breakdown. To investigate, the engineering team collected time-to-failure data from 50 identical bearings and conducted a Weibull analysis within the framework of Reliability Engineering. The 50 time-to-failure (survival time) are:

```
12.4, 18.7, 25.3, 30.1, 33.5, 35.2, 38.9, 40.3, 42.7, 45.1, 47.6, 49.8, 52.4, 55.0, 
57.3, 60.2, 62.8, 65.1, 67.9, 70.5, 72.3, 75.6, 78.2, 80.9, 83.4, 85.7, 88.1, 90.6, 
93.2, 95.8, 98.4, 101.0, 104.5, 107.3, 110.6, 113.2, 116.8, 120.1, 123.7, 127.4,
130.9, 134.5, 138.2, 142.0, 146.3, 150.7, 155.2, 160.8, 168.4, 175.9
```
</p>

This assignment focuses on hypothesis $H_0: \beta = 1$ (exponential) against $H_1: \beta \neq 1$ (Weibull). This framework detects overfitting (fitting a Weibull when exponential is true) and underfitting (fitting exponential when Weibull with $\beta \neq 1$ is true). 


<p>
a). Find the MLE of the Weibull parameters $\lambda$ (scale) and $\beta$ (shape), denoted by $\hat{\lambda}$ and $\hat{\beta}$, respectively, using the `optim()` procedure. [*Hint: You should provide explicit expressions for the log-likelihood and gradient functions of the Weibull distribution parameters.*]

```{r}

wbearing = sort(c(12.4, 18.7, 25.3, 30.1, 33.5, 35.2, 38.9, 40.3, 42.7, 45.1, 47.6, 49.8, 52.4, 55.0, 
57.3, 60.2, 62.8, 65.1, 67.9, 70.5, 72.3, 75.6, 78.2, 80.9, 83.4, 85.7, 88.1, 90.6, 
93.2, 95.8, 98.4, 101.0, 104.5, 107.3, 110.6, 113.2, 116.8, 120.1, 123.7, 127.4,
130.9, 134.5, 138.2, 142.0, 146.3, 150.7, 155.2, 160.8, 168.4, 175.9))


#prepare weilbull distrubbtion log likelihood to use optim()
weil_loglik= function(params){
  
  lambda = params[1]
  beta = params[2]
  n = length(wbearing)
  
  if (lambda <= 0 || beta <= 0 ) return(-Inf)

  #log likelihood formula
  beta_shape = n * log(beta)
  scale_norm = n * beta * log(lambda) #normalizes the distrubution on scale parameter
  log_data_sum = (beta -1) * sum(log(wbearing)) #sums the logs
  exponential_kernel = sum((wbearing/lambda)^beta) #exponential kernel
  log_lik = beta_shape + log_data_sum - scale_norm - exponential_kernel
  
  return(log_lik)
}


#maximizing the likelihood to find the MLE
mle_weibull = optim(par = c(mean(wbearing), 1),
                     fn = weil_loglik,
                     hessian= TRUE,#output the hessian matrix
                     control = list(fnscale= -1) #fnsalce = -1 maximizes the likelihood
                     
                     )

# 3. Extract the results
hat_lambda <- mle_weibull$par[1]
hat_beta   <- mle_weibull$par[2]

cat("MLE for Lambda (Scale):", hat_lambda, 
    "\nMLE for Beta (Shape):", hat_beta)


```

b). Find the MLE of the exponential parameter $\lambda$ (scale), denoted by $\hat{\lambda}$, using any procedure. [*Hint: You should provide explicit expressions for the log-likelihood and gradient functions of the exponential distribution parameters.*]



```{r}

#defining the exponential log likelihood
exp_loglik = function(lambda){
  
  n = length(wbearing)
  e_log =-n * log(lambda) -sum(wbearing)/lambda
  return(e_log)
}

# exponential gradient (score function)

exp_gradient = function(lambda){
  n = length(wbearing)
  deriv_lambda = -n / lambda + sum(wbearing)/ (lambda^2) #derivative in respect to lambda
  return(deriv_lambda)
  
}

#finding the MLE for exponential lambda

hat_lambda_exp = mean(wbearing)

#store max log like

loglik_exp_max = exp_loglik(hat_lambda_exp)

cat("Exponential MLE for Lambda:", hat_lambda_exp, 
    "\nMax Log-Likelihood (Null Model):", loglik_exp_max)

```

c). Perform the likelihood ratio $\chi^2$ test on $\beta = 1$. What is the p-value? [*Hint: Use the results in a) and b)*.] 


```{r}

#we can now measure the distance between the two models
# The likelihood ratio test asks, "is the improvement in the 'height' of the log likelihood from the exponential model to the weibull model large enough to be statistically significant


#use the (weibull model) maximum log likelihood (from part a) 

loglik_weibull_max = mle_weibull$value

# calculate the likelihood ratio test
  #find the difference between the log likelihood maximum and log_weillbull maximum
  #is the more complex model better than the simpler model
#we use the logs and -2 to force a chi square distribution 

like_ratio_test_statistic = -2 * (loglik_exp_max - loglik_weibull_max)

#calculate p value for test statistic from chiq square distrubtion

pval_lr_test = 1 - pchisq(like_ratio_test_statistic, df=1)


cat("Likelihood ratio test statistic:", like_ratio_test_statistic, 
    "\np-value (LRT):", pval_lr_test)


```

d). Perform the Wald $\chi^2$ on the same hypothesis $\beta = 1$. What is the p-value? [*Hint: You need to find the observed Fisher information matrix (i.e., the negative Hessian matrix from `optim()`) based on the Weibull distribution. The inverse of the <font color = "blue">negative</font> observed Hessian matrix is the covariance matrix*.] 


```{r}

#the observed fisher information(negative hessian)
#(the hessian is the matrix of the second derivative 


fisher_info = -mle_weibull$hessian

#run a variance covariance matrix (inverse of the fisher information)

var_cov_matrix = solve(fisher_info)

#extract the variance for beta
#since params[2] was beta, the variance of beta is the 2nd row, 2nd column
var_beta = var_cov_matrix[2,2]


## Wald test statistic
    #we use wald to determine if the variable is significant and should be used in the final model 
# Formula: (Estimate - Null)^2 / Variance

wald_statistic = (hat_beta- 1)^2 /var_beta

#p value from wald chi square distrubution

p_val_wald = 1 - pchisq(wald_statistic, df = 1)

cat("Wald Statistic:", wald_statistic, 
    "\np-value:", p_val_wald)

```





e). Write a summary of the above analyses to address the following:

* Whether the two tests generated the same results.

* Which model is recommended for the data.

* Draw the density curve based on the MLE(s) of the parameter(s) and describe the distribution of the time-to-failure.
</p>


Both the Likelihood Ratio Test (LR=`r like_ratio_test_statistic`, p= `r pval_lr_test` ) and Wald Test (W= `r wald_statistic`, p = `r p_val_wald`) are statistically significant. 
We can concluded that the slope is statistically different than 1 and that the converyer belt bearings do not fail randomly. The estimated shape parameter at 2.21 suggests the conveyers increasingly fail overtime.
Considering the moderate sample of 50, the Likelihood ratio test (LRT) is the preferred test to use. LRT is most reliable for accuracy and closest to nominal Type I error rates as well as robust in being invariant to parameter transformations. The sample size does not leave computational concern. The confirmation that both LRT and Wald are significant 
suggests a strong relationship, although the Wald test performs poorly in skewness and nonlinear parameters which the weilbull distribution tends to hold. A drop from the LRT from being `r like_ratio_test_statistic` to W= `r wald_statistic`, may suggest the weilbull test could not strongly handle the distribution's skewness. If there is discrepancy between knowing to use the simple exponential or more complex weibull model, use the weibull because it better explains the distribution relationship as the exponential model underfits the data.


```{r}

# set range of time values for the x-axis
time_seq =seq(0, 200, length.out = 1000)

#calculate density values for both models using the MLEs from  Part A and Part B
weibull_dens= dweibull(time_seq, shape = hat_beta, scale = hat_lambda)
exp_dens =dexp(time_seq, rate = 1/hat_lambda_exp)


plot(time_seq, weibull_dens, type = "l", col = "darkred", lwd = 3,
     main = "Bearing Time-to-Failure: Weibull vs. Exponential",
     xlab = "Time (hours)", ylab = "Density",
     ylim = c(0, max(weibull_dens) * 1.1))


lines(time_seq, exp_dens, col = "blue", lwd = 2, lty = 2) # Add the Exponential curve for comparison

# histogram --> actual data to see the 'fit'
hist(wbearing, prob = TRUE, add = TRUE, col = rgb(0.5, 0.5, 0.5, 0.3), border = "white")

# legend
legend("topright", legend = c("Fitted Weibull (MLE)", "Fitted Exponential (Null)"),
       col = c("darkred", "blue"), lty = c(1, 2), lwd = 2)



```










