Data and Setup

# stored as data vector
data = rbind.data.frame(3.2, 5.8, 7.1, 4.5, 10.3, 6.2, 8.7, 5.1, 12.5, 6.9,
9.4, 5.7, 11.8, 4.9, 9.1, 6.5, 13.2, 7.8, 10.6, 6.1,
8.9, 5.4, 12.1, 7.3, 9.8, 5.9, 11.4, 6.8, 10.9, 7.5,
4.2, 8.3, 6.4, 14.1, 5.6, 9.7, 7.9, 11.1, 6.7, 10.2,
5.3, 8.6, 7.2, 12.9, 6.3, 9.3, 8.1, 13.7, 7.6, 10.8)
colnames(data) = "time"

# stored as raw values
times = c(3.2, 5.8, 7.1, 4.5, 10.3, 6.2, 8.7, 5.1, 12.5, 6.9,
9.4, 5.7, 11.8, 4.9, 9.1, 6.5, 13.2, 7.8, 10.6, 6.1,
8.9, 5.4, 12.1, 7.3, 9.8, 5.9, 11.4, 6.8, 10.9, 7.5,
4.2, 8.3, 6.4, 14.1, 5.6, 9.7, 7.9, 11.1, 6.7, 10.2,
5.3, 8.6, 7.2, 12.9, 6.3, 9.3, 8.1, 13.7, 7.6, 10.8)

For this analysis, we will be examining a dataset composed of 50 observed customer service call duration times, provided by a major North American phone provider. A summary and histogram of the available data is below.

sum_stats_table = function(variable){
  table = cbind(
    round(min(variable),2),
    round(quantile(variable, 0.25),2),
    round(median(variable),2),
    round(mean(variable),2),
    round(quantile(variable, 0.75),2),
    round(max(variable),2)
  )
  colnames(table) = c("Min", "Q1", "Med", "Mean", "Q3", "Max")
  rownames(table) = NULL
  return(table)
}

Summary_Table = sum_stats_table(variable = times)
kable(Summary_Table,align = "c",
      caption = "<span style='color:##000000;'>
      Summary Stats of Call Times </span>") %>%
  kable_styling(
    bootstrap_options = c("striped", "bordered"),
    full_width = FALSE,
    position = "center")
Summary Stats of Call Times
Min Q1 Med Mean Q3 Max
3.2 6.23 7.85 8.31 10.28 14.1
hist(data$time, 
     main = "Distribution of Call Times",
     xlab = "Call Time",
     prob = TRUE,
)
box(lwd=3)

Moving forward, we will use this available sample data to create confidence intervals to estimate the population parameter \(\theta\) from the broader population of service call duration times for this phone provider. Each of these confidence intervals will have a coverage (or confidence level) of 95%. This means that, in a theoretical world with endless random sampling from the population, the true value of \(\theta\) would be included in our interval 95% of the time.

We will operate under the assumption that our data follows a Lindley distribution. That distribution’s density function, MLE for estimating \(\theta\) (the distribution’s shape parameter), and Fisher information function are below.

knitr::include_graphics("/Users/chrisbahm/Library/Mobile Documents/com~apple~CloudDocs/West Chester/Spring 2026/STA 506 - Grad Math Stat 2 /STA 506 R Folder/Week 8 - Confidence Intervals/PDF.png")

knitr::include_graphics("/Users/chrisbahm/Library/Mobile Documents/com~apple~CloudDocs/West Chester/Spring 2026/STA 506 - Grad Math Stat 2 /STA 506 R Folder/Week 8 - Confidence Intervals/MLE.png")

knitr::include_graphics("/Users/chrisbahm/Library/Mobile Documents/com~apple~CloudDocs/West Chester/Spring 2026/STA 506 - Grad Math Stat 2 /STA 506 R Folder/Week 8 - Confidence Intervals/info.png")


a.) Asymptotic Confidence Interval

The first confidence interval I constructed to estimate \(\theta\) was an asymptotic interval, predicated on the properties of the central limit theorem (CLT). Per the CLT, a sample composed of independent and identically distributed data approaches a normal distribution as sample size increases. Conventional wisdom states that generally the CLT can be applied if the sample size is at least 30, which is the case in this scenario.

sample_var = var(data$time) # = 7.305241

n = nrow(data)

### per provided MLE formula
mle_numerator = 1 - mean(data$time) + 
  sqrt(mean(data$time)^2 + 6*mean(data$time)+1)

mle_denominator = 2 * mean(data$time)

theta_hat = mle_numerator/mle_denominator

# referencing code from Section 4 of Week 8 notes

# Find SE next
  # we are finding standard error of theta hat, so use theta hat in expression
  
  # calculations are extracted from Fisher's Information of theta expression that is provided in instructions

se_numerator = 1

se_denominator = sqrt(n * 
                        ((2/theta_hat^2) - 
                           1/(1+theta_hat)^2))

se = se_numerator/se_denominator

crit_value_low = qnorm(0.025)
crit_value_high = qnorm(0.025, lower.tail = FALSE)

  # 0.025 + 0.025 = 0.05 = 95% confidence

# confidence interval formula is estimate +- (critical value)(standard error)

ci_low = theta_hat - abs(crit_value_low * se)
  # = 0.1758057

ci_high = theta_hat + abs(crit_value_high * se)
  # = 0.2623931

The first step I took in constructing this asymptotic interval was calculating a point estimate of our parameter of interest, \(\theta\). I did so by using the closed form MLE equation pictured earlier in this report. Doing so provided me a \(\hat\theta\) value of approximately 0.219.

Since the formula for creating a confidence interval is estimate +- [(critical value) X (standard error)], I then focused on quantifying the error of the interval. As for the critical value, I simply used R’s qnorm() function (which is mathematically sound due to the CLT), and got critical values of roughly +- 1.96.

Regarding standard error, I performed algebraic manipulation on the Fisher information function referenced in opening section. Since the Fisher function tells us how much one instance of a random variable indicates about \(\theta\), I multiplied the given function by our sample size (n = 50). According to principles of MLE, the variance of \(\theta\) is simply 1 divided by \(\theta\)’s Fisher function. Because standard error (or standard deviation) is the square root of variance, I was ultimately able to calculate a standard error of roughly 0.022.

Putting these components together - the point estimate, critical value, and standard error - our final asymptotic confidence interval for \(\theta\) is the following (rounded):

\[\Large 0.176< \theta < 0.262\]

For greater certainty that no errors were made, I went forward and plotted, over top the sample data, a Lindley distribution with \(\theta\) values respectively representing the low and high points of the asymptotic interval as well as our original point estimate for \(\hat\theta\).

# install.packages("VGAM") needed to do so for dlind function
library(VGAM)

hist(data$time, 
     main = "Distribution of Call Times",
     xlab = "Call Time",
     prob = TRUE,
)

# lower limit
curve(dlind(x , theta = ci_low), 
      col = "blue",
      lwd = 3,
      add = TRUE)

# point estimation
curve(dlind(x , theta = theta_hat), 
      col = "red",
      lwd = 3,
      add = TRUE)

# upper limit
curve(dlind(x , theta = ci_high), 
      col = "green",
      lwd = 3,
      add = TRUE)

legend("topright",
       legend = c(paste0("Lower Limit (", round(ci_low,3), ")"),
                  paste0("Point Estimate (", round(theta_hat,3), ")" ),
                  paste0("Upper Limit (", round(ci_high,3), ")")),
       col = c("blue", "red", "green"),
       lwd = 3,
       box.lwd=3)

box(lwd=3)

As we can see, the overall trend of each plotted Lindley distribution does roughly mirror the trend of our sample data, with the upper limit of our confidence interval having the greatest resemblance. The plot does not provide overwhelming evidence for or against the accuracy of our interval estimations. We will keep this in mind, and compare the asymptotic interval results to the likelihood ratio results found in the next section.


b.) Likelihood Ratio Confidence Interval

The second confidence interval I created was a likelihood ratio one. Likelihood ratio intervals are not reliant on the theory of asymptotic normality, and are generally seen as more effective when working with small or moderate sample sizes.

While the likelihood ratio function works by default with standard likelihood functions, I chose to utilize the log likelihood for greater calculative ease. The provided log likelihood is below.

knitr::include_graphics("/Users/chrisbahm/Library/Mobile Documents/com~apple~CloudDocs/West Chester/Spring 2026/STA 506 - Grad Math Stat 2 /STA 506 R Folder/Week 8 - Confidence Intervals/Log.png")

# reference sections 5.1 through 5.3 from class notes

### The likelihood ratio works THE SAME using the log likelihood so will proceed using that function

# define log likelihood function
log_likelihood = function(theta, data){
  n = nrow(data)
  
  C = sum(log(1 + data))
  S = sum(data)
  
  return(n * log(theta^2/(1+theta)) + C - (theta * S))
}

## now evaluate log likelihood at MLE value of theta hat found in part a

log_likelihood_at_mle = log_likelihood(theta = theta_hat, data = data)
# = -143.3101
  # denominator of ratio function
  # remember, natural log of any number between 0 and 1 is negative, so do not be alarmed at value

Evaluating the log likelihood function with the value of \(\hat\theta\) = 0.219 originally found through the MLE function, we get a log likelihood of about -143.31.

I then utilize Wilks’ Theorem, which states that the likelihood ratio stat of \(\theta\) is equal to 2 times the difference between the log likelihood evaluated at \(\hat\theta\) and the log likelihood evaluated at the different data points of our sample.

## Wilks' Theorem
LR_stat = function(theta, data){
  2 * (log_likelihood_at_mle - log_likelihood(theta, data))
}

  # Wilks' theorem states that a likelihood ratio confidence interval with X parameters will have a critical value of a chi-square distribution with 1 - alpha, and a df = X
ratio_crit_value = qchisq(0.95, df = 1) # = 3.841459

root_function = function(theta, data, theta_hat){
  LR_stat(theta, data) - ratio_crit_value
}

lambda_lower = uniroot(root_function, 
                  interval = c(0.0001, theta_hat),
                  data = data,
                  theta_hat = theta_hat)$root

lambda_higher = uniroot(root_function, 
                  interval = c(theta_hat, 2),
                  data = data,
                  theta_hat = theta_hat)$root

#lambda_lower = 0.1786415
#lambda_higher = 0.2653312

Finally, R’s uniroot() function works to find the root of the equations set up by subtracting the critical value (3.841) from the likelihood ratio stat equation.

The critical value was quantified as the 95th percentile of a chi-square distribution with 1 degree of freedom. Wilks’ theorem states that a likelihood ratio confidence interval with X parameters will have a critical value of a chi-square distribution with 1 - \(\alpha\), and a df = X.

The ending results for the likelihood ratio confidence interval were the following:

\[\Large 0.179< \theta < 0.265\]


c.) Interval Comparison

As we can see, both methods of parameter interval estimation yielded extraordinarily similar results. If forced to make a choice, I would likely recommend moving forward with the likelihood confidence interval. I say this for two reasons.

First, as mentioned above, likelihood ratios are generally seen as a more effective tool when working with small to moderate sample sizes. In this scenario with a sample size of only 50, it is probably safer to use a method such as the likelihood ratio technique which is not reliant on the principle of asymptotic normality. Beyond that, if we recall back to our plot above, we saw that the larger values of \(\theta\) yielded a greater resemblance to our sample data. As we see in the comparison table below, both the lower and higher interval estimates from the likelihood ratio method are of greater value than their counterparts from the asymptotic method.

asymp_results = cbind(ci_low, ci_high)
ratio_results = cbind(lambda_lower, lambda_higher)
results = rbind(asymp_results, ratio_results)
colnames(results) = c("Lower Estimate", "Higher Estimate")
rownames(results) = c("Asymptotic", "Likelihood Ratio")

kable(results,align = "c",
      caption = "<span style='color:##000000;'>
      Estimates of Shape Parameter </span>") %>%
  kable_styling(
    bootstrap_options = c("striped", "bordered"),
    full_width = FALSE,
    position = "center")
Estimates of Shape Parameter
Lower Estimate Higher Estimate
Asymptotic 0.1758057 0.2623931
Likelihood Ratio 0.1786415 0.2653312
---
title: "Likelihood Confidence Intervals"
author: "Chris Bahm"
date: "March 21, 2026"
output:
  html_document:
    toc: true
    toc_float:
      collapsed: true
      smooth_scroll: true
    toc_depth: 4
    fig_width: 6
    fig_height: 4
    fig_caption: true
    number_sections: false
    code_folding: hide
    code_download: true
    theme: lumen
    highlight: tango
  pdf_document:
    toc: true
    toc_depth: 4
    fig_caption: true
    number_sections: true
  word_document:
    toc: true
    toc_depth: 4
---

```{css, echo = FALSE}
div#TOC li {     /* table of content  */
    list-style:upper-roman;
    background-image:none;
    background-repeat:none;
    background-position:0;
}

h1.title {    /* level 1 header of title  */
  font-size: 24px;
  font-weight: bold;
  color: DarkRed;
  text-align: center;
}

h4.author { /* Header 4 - and the author and data headers use this too  */
  font-size: 18px;
  font-weight: bold;
  font-family: "Times New Roman", Times, serif;
  color: DarkRed;
  text-align: center;
}

h4.date { /* Header 4 - and the author and data headers use this too  */
  font-size: 18px;
  font-weight: bold;
  font-family: "Times New Roman", Times, serif;
  color: DarkBlue;
  text-align: center;
}

h1 { /* Header 1 - and the author and data headers use this too  */
    font-size: 20px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: center;
}

h2 { /* Header 2 - and the author and data headers use this too  */
    font-size: 18px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h3 { /* Header 3 - and the author and data headers use this too  */
    font-size: 16px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h4 { /* Header 4 - and the author and data headers use this too  */
    font-size: 14px;
  font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

/* Add dots after numbered headers */
.header-section-number::after {
  content: ".";
}
```

```{r setup, include=FALSE}
# code chunk specifies whether the R code, warnings, and output 
# will be included in the output files.

if (!require("knitr")) {                      # use conditional statement to detect
   install.packages("knitr")                  # whether a package was installed in
   library(knitr)                             # your machine. If not, install it and
}                                             # load it to the working directory.

if (!require(tidyverse)) {library(tidyvserse)} 

if (!require(GGally)) {library(GGally)} 

if (!require(kableExtra)) {library(kableExtra)} 

if (!require(ggplot2)) {library(ggplot2)} 

if (!require(car)) {library(car)} 

if (!require(dplyr)) {library(dplyr)} 

if (!require(pander)) {library(pander)} 

if (!require(forecast)) {library(forecast)} 

if (!require(lubridate)) {library(lubridate)} 

if (!require("scales")) {
install.packages("scales")                                        
library("scales") 
}

knitr::opts_chunk$set(
	echo = TRUE,
	message = FALSE,
	warning = FALSE,
	comment = NA,
	results = TRUE,
	fig.align = "center"
)
```

# Data and Setup

```{r}
# stored as data vector
data = rbind.data.frame(3.2, 5.8, 7.1, 4.5, 10.3, 6.2, 8.7, 5.1, 12.5, 6.9,
9.4, 5.7, 11.8, 4.9, 9.1, 6.5, 13.2, 7.8, 10.6, 6.1,
8.9, 5.4, 12.1, 7.3, 9.8, 5.9, 11.4, 6.8, 10.9, 7.5,
4.2, 8.3, 6.4, 14.1, 5.6, 9.7, 7.9, 11.1, 6.7, 10.2,
5.3, 8.6, 7.2, 12.9, 6.3, 9.3, 8.1, 13.7, 7.6, 10.8)
colnames(data) = "time"

# stored as raw values
times = c(3.2, 5.8, 7.1, 4.5, 10.3, 6.2, 8.7, 5.1, 12.5, 6.9,
9.4, 5.7, 11.8, 4.9, 9.1, 6.5, 13.2, 7.8, 10.6, 6.1,
8.9, 5.4, 12.1, 7.3, 9.8, 5.9, 11.4, 6.8, 10.9, 7.5,
4.2, 8.3, 6.4, 14.1, 5.6, 9.7, 7.9, 11.1, 6.7, 10.2,
5.3, 8.6, 7.2, 12.9, 6.3, 9.3, 8.1, 13.7, 7.6, 10.8)

```
For this analysis, we will be examining a dataset composed of `r nrow(data)` observed customer service call duration times, provided by a major North American phone provider. A summary and histogram of the available data is below.

```{r, fig.height=5.5, fig.width=6.5}
sum_stats_table = function(variable){
  table = cbind(
    round(min(variable),2),
    round(quantile(variable, 0.25),2),
    round(median(variable),2),
    round(mean(variable),2),
    round(quantile(variable, 0.75),2),
    round(max(variable),2)
  )
  colnames(table) = c("Min", "Q1", "Med", "Mean", "Q3", "Max")
  rownames(table) = NULL
  return(table)
}

Summary_Table = sum_stats_table(variable = times)
kable(Summary_Table,align = "c",
      caption = "<span style='color:##000000;'>
      Summary Stats of Call Times </span>") %>%
  kable_styling(
    bootstrap_options = c("striped", "bordered"),
    full_width = FALSE,
    position = "center")


hist(data$time, 
     main = "Distribution of Call Times",
     xlab = "Call Time",
     prob = TRUE,
)
box(lwd=3)

```

Moving forward, we will use this available sample data to create confidence intervals to estimate the population parameter $\theta$ from the broader population of service call duration times for this phone provider. Each of these confidence intervals will have a coverage (or confidence level) of 95%. This means that, in a theoretical world with endless random sampling from the population, the true value of $\theta$ would be included in our interval 95% of the time.

We will operate under the assumption that our data follows a Lindley distribution. That distribution's density function, MLE for estimating $\theta$ (the distribution's shape parameter), and Fisher information function are below.

```{r}
knitr::include_graphics("/Users/chrisbahm/Library/Mobile Documents/com~apple~CloudDocs/West Chester/Spring 2026/STA 506 - Grad Math Stat 2 /STA 506 R Folder/Week 8 - Confidence Intervals/PDF.png")

knitr::include_graphics("/Users/chrisbahm/Library/Mobile Documents/com~apple~CloudDocs/West Chester/Spring 2026/STA 506 - Grad Math Stat 2 /STA 506 R Folder/Week 8 - Confidence Intervals/MLE.png")

knitr::include_graphics("/Users/chrisbahm/Library/Mobile Documents/com~apple~CloudDocs/West Chester/Spring 2026/STA 506 - Grad Math Stat 2 /STA 506 R Folder/Week 8 - Confidence Intervals/info.png")
```

<br>

# a.) Asymptotic Confidence Interval
The first confidence interval I constructed to estimate $\theta$ was an asymptotic interval, predicated on the properties of the central limit theorem (CLT). Per the CLT, a sample composed of independent and identically distributed data approaches a normal distribution as sample size increases. Conventional wisdom states that generally the CLT can be applied if the sample size is at least 30, which is the case in this scenario.

```{r}
sample_var = var(data$time) # = 7.305241

n = nrow(data)

### per provided MLE formula
mle_numerator = 1 - mean(data$time) + 
  sqrt(mean(data$time)^2 + 6*mean(data$time)+1)

mle_denominator = 2 * mean(data$time)

theta_hat = mle_numerator/mle_denominator

# referencing code from Section 4 of Week 8 notes

# Find SE next
  # we are finding standard error of theta hat, so use theta hat in expression
  
  # calculations are extracted from Fisher's Information of theta expression that is provided in instructions

se_numerator = 1

se_denominator = sqrt(n * 
                        ((2/theta_hat^2) - 
                           1/(1+theta_hat)^2))

se = se_numerator/se_denominator

crit_value_low = qnorm(0.025)
crit_value_high = qnorm(0.025, lower.tail = FALSE)

  # 0.025 + 0.025 = 0.05 = 95% confidence

# confidence interval formula is estimate +- (critical value)(standard error)

ci_low = theta_hat - abs(crit_value_low * se)
  # = 0.1758057

ci_high = theta_hat + abs(crit_value_high * se)
  # = 0.2623931

```
The first step I took in constructing this asymptotic interval was calculating a point estimate of our parameter of interest, $\theta$. I did so by using the closed form MLE equation pictured earlier in this report. Doing so provided me a $\hat\theta$ value of approximately **`r round(theta_hat,3)`**.

Since the formula for creating a confidence interval is **estimate +- [(critical value) X (standard error)]**, I then focused on quantifying the error of the interval. As for the critical value, I simply used R's <span style="color:blue;">qnorm()</span> function (which is mathematically sound due to the CLT), and got critical values of roughly +- `r round(crit_value_high,3)`.

Regarding standard error, I performed algebraic manipulation on the Fisher information function referenced in opening section. Since the Fisher function tells us how much *one instance* of a random variable indicates about $\theta$, I multiplied the given function by our sample size (n = 50). According to principles of MLE, the variance of $\theta$ is simply 1 divided by $\theta$'s Fisher function. Because standard error (or standard deviation) is the square root of variance, I was ultimately able to calculate a standard error of roughly `r round(se, 3)`.

Putting these components together - the point estimate, critical value, and standard error - our final asymptotic confidence interval for $\theta$ is the following (rounded):

$$\Large `r round(ci_low,3)`< \theta < `r round(ci_high,3)`$$

For greater certainty that no errors were made, I went forward and plotted, over top the sample data, a Lindley distribution with $\theta$ values respectively representing the low and high points of the asymptotic interval as well as our original point estimate for $\hat\theta$. 
```{r, fig.height=5.5, fig.width=6.5}
# install.packages("VGAM") needed to do so for dlind function
library(VGAM)

hist(data$time, 
     main = "Distribution of Call Times",
     xlab = "Call Time",
     prob = TRUE,
)

# lower limit
curve(dlind(x , theta = ci_low), 
      col = "blue",
      lwd = 3,
      add = TRUE)

# point estimation
curve(dlind(x , theta = theta_hat), 
      col = "red",
      lwd = 3,
      add = TRUE)

# upper limit
curve(dlind(x , theta = ci_high), 
      col = "green",
      lwd = 3,
      add = TRUE)

legend("topright",
       legend = c(paste0("Lower Limit (", round(ci_low,3), ")"),
                  paste0("Point Estimate (", round(theta_hat,3), ")" ),
                  paste0("Upper Limit (", round(ci_high,3), ")")),
       col = c("blue", "red", "green"),
       lwd = 3,
       box.lwd=3)

box(lwd=3)
```

As we can see, the overall trend of each plotted Lindley distribution does roughly mirror the trend of our sample data, with the upper limit of our confidence interval having the greatest resemblance. The plot does not provide overwhelming evidence for or against the accuracy of our interval estimations. We will keep this in mind, and compare the asymptotic interval results to the likelihood ratio results found in the next section.

<br>

# b.) Likelihood Ratio Confidence Interval
The second confidence interval I created was a likelihood ratio one. Likelihood ratio intervals are not reliant on the theory of asymptotic normality, and are generally seen as more effective when working with small or moderate sample sizes. 

While the likelihood ratio function works by default with standard likelihood functions, I chose to utilize the log likelihood for greater calculative ease. The provided log likelihood is below.

```{r}
knitr::include_graphics("/Users/chrisbahm/Library/Mobile Documents/com~apple~CloudDocs/West Chester/Spring 2026/STA 506 - Grad Math Stat 2 /STA 506 R Folder/Week 8 - Confidence Intervals/Log.png")
```

```{r}
# reference sections 5.1 through 5.3 from class notes

### The likelihood ratio works THE SAME using the log likelihood so will proceed using that function

# define log likelihood function
log_likelihood = function(theta, data){
  n = nrow(data)
  
  C = sum(log(1 + data))
  S = sum(data)
  
  return(n * log(theta^2/(1+theta)) + C - (theta * S))
}

## now evaluate log likelihood at MLE value of theta hat found in part a

log_likelihood_at_mle = log_likelihood(theta = theta_hat, data = data)
# = -143.3101
  # denominator of ratio function
  # remember, natural log of any number between 0 and 1 is negative, so do not be alarmed at value
```

Evaluating the log likelihood function with the value of $\hat\theta$ = `r round(theta_hat,3)` originally found through the MLE function, we get a log likelihood of about `r round(log_likelihood_at_mle,3)`. 

I then utilize *Wilks' Theorem*, which states that the likelihood ratio stat of $\theta$ is equal to 2 times the difference between the log likelihood evaluated at $\hat\theta$ and the log likelihood evaluated at the different data points of our sample.

```{r}
## Wilks' Theorem
LR_stat = function(theta, data){
  2 * (log_likelihood_at_mle - log_likelihood(theta, data))
}

  # Wilks' theorem states that a likelihood ratio confidence interval with X parameters will have a critical value of a chi-square distribution with 1 - alpha, and a df = X
ratio_crit_value = qchisq(0.95, df = 1) # = 3.841459

root_function = function(theta, data, theta_hat){
  LR_stat(theta, data) - ratio_crit_value
}

lambda_lower = uniroot(root_function, 
                  interval = c(0.0001, theta_hat),
                  data = data,
                  theta_hat = theta_hat)$root

lambda_higher = uniroot(root_function, 
                  interval = c(theta_hat, 2),
                  data = data,
                  theta_hat = theta_hat)$root

#lambda_lower = 0.1786415
#lambda_higher = 0.2653312
```

Finally, R's <span style="color:blue;">uniroot()</span> function works to find the root of the equations set up by subtracting the critical value (`r round(ratio_crit_value,3)`) from the likelihood ratio stat equation. 

The critical value was quantified as the 95th percentile of a chi-square distribution with 1 degree of freedom. Wilks' theorem states that a likelihood ratio confidence interval with X parameters will have a critical value of a chi-square distribution with 1 - $\alpha$, and a df = X.

The ending results for the likelihood ratio confidence interval were the following:

$$\Large `r round(lambda_lower,3)`< \theta < `r round(lambda_higher,3)`$$

<br>

# c.) Interval Comparison
As we can see, both methods of parameter interval estimation yielded extraordinarily similar results. If forced to make a choice, I would likely recommend moving forward with the likelihood confidence interval. I say this for two reasons. 

First, as mentioned above, likelihood ratios are generally seen as a more effective tool when working with small to moderate sample sizes. In this scenario with a sample size of only `r nrow(data)`, it is probably safer to use a method such as the likelihood ratio technique which is not reliant on the principle of asymptotic normality. Beyond that, if we recall back to our plot above, we saw that the larger values of $\theta$ yielded a greater resemblance to our sample data. As we see in the comparison table below, both the lower and higher interval estimates from the likelihood ratio method are of greater value than their counterparts from the asymptotic method.

```{r}
asymp_results = cbind(ci_low, ci_high)
ratio_results = cbind(lambda_lower, lambda_higher)
results = rbind(asymp_results, ratio_results)
colnames(results) = c("Lower Estimate", "Higher Estimate")
rownames(results) = c("Asymptotic", "Likelihood Ratio")

kable(results,align = "c",
      caption = "<span style='color:##000000;'>
      Estimates of Shape Parameter </span>") %>%
  kable_styling(
    bootstrap_options = c("striped", "bordered"),
    full_width = FALSE,
    position = "center")

```

