W. Q. Meeker and L. A. Escobar
16 June 2020
Likelihood for a parametric model using discrete data
Likelihood for samples containing right and left censored observations
Use of parametric likelihood as a tool for data analysis and inference about a single population or process
The use of likelihood and normal-approximation confidence intervals for model parameters and other quantities of interest
The density approximation to the likelihood for observations reported as exact failures
If we specify a probability distribution and values for parameter vector \(\underline{\theta}\), the pdf \(f(\underline{x}|\underline{\theta})\)…
Represents the probability (probability density, actually) of observing data \(\underline{x}=x_1,...,x_n, \;\;n\in[1,\infty)\) from a specified probability distribution
Is a function of \(\underline{x}\) assuming \(\underline{\theta}=\theta_{1},\theta_{2},...\) are known (fixed)
Describes what observations \(\underline{x}\) are most likely to be produced by a distribution with parameters \(\underline{\theta}\)
If we specify a probability distribution and observe data \(\underline{x}\), the likelihood function \(\mathscr{L}(\underline{\theta}|\underline{x})\)…
Represents a relative measure to determine which values of \(\underline{\theta}\) are more likely to have produced the data \(\underline{x}=x_1,...,x_n, \;\;n\in[1,\infty)\)
Is a function of \(\underline{\theta}\) assuming \(\underline{x}=x_{1},...,x_{n}\) has already been observed
\[ \sqrt{n}\left(\hat{\theta}_{_{MLE}}-\theta\right)\xrightarrow{L}N\left(0,\mathscr{I}^{-1}_{\theta}\right) \]
\[ \mathscr{I}_{\theta}=-E\left[\frac{\partial^2\log\mathscr{L}(\underline{\theta}|\underline{x})}{\partial\underline{\theta}\,(\partial\underline{\theta})^T}\right]=-E\left[\frac{\partial^2\log\mathscr{L}(\underline{\theta}|\underline{x})}{\partial\theta_i\partial\theta_j}\right] \]
Functions of MLEs are also MLEs and are asymptotically normally distributed
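As an illustration of this asymptotic normality, here is a minimal simulation sketch of ours (not from the text): for \(x\sim EXP(\theta)\) the MLE is the sample mean, with approximate sampling distribution \(NOR(\theta,\theta^2/n)\).

```r
# Simulation sketch: sampling distribution of the MLE for EXP(theta = 600)
set.seed(1)
theta <- 600
theta.hat <- replicate(5000, mean(rexp(200, rate = 1 / theta)))
hist(theta.hat, breaks = 50, freq = FALSE,
     main = "Sampling distribution of the MLE (n = 200)",
     xlab = expression(hat(theta)))
curve(dnorm(x, mean = theta, sd = theta / sqrt(200)), add = TRUE, lwd = 2)
```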
Background
Berkson investigated the time between \(\alpha\)-particle emissions from Am-241
Physics suggests that the interarrival times of the \(\alpha\) particles are \(iid\; EXP(\theta)\)
Data set
The data are \(10,220\) observed interarrival times (measured in units of \(1/5000\) seconds)
The times have been “binned” into intervals as shown in Table 7.1
To illustrate the effect of sample size, random samples of size \(n=20, 200,\;\&\;2000\) were drawn from the original data set; these samples are shown in Table 7.1
Figure 7.1 & Table 7.1
\[ \mathscr{L}(\underline{\theta}|\underline{x})=\prod_{i=1}^{n}\mathscr{L}_{i}(\underline{\theta}|x_i)=f(\underline{x}|\underline{\theta})=\prod_{i=1}^{n}f(x_{i}|\underline{\theta}),\;\;\text{if the}\;x_{i}\;\text{are}\;iid \]
The value of the likelihood function \(\mathscr{L}(\underline{\theta}|\underline{x})\) depends on
The assumed parametric model
The observed data
The total likelihood is the product of the contributions from every observation
For a given observation \(x_i\), the model assigning \(x_i\) the highest probability (density) makes the greatest contribution to the total likelihood
For a single observation, the model making the greatest contribution to the total likelihood may not be the correct model
As the number of observations increases, more information is obtained and it becomes easier to identify which model best fits the data
The shiny app below shows how increasing the number of observations makes it easier to tell which model best fits a data set
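As a static complement to the app, here is a minimal simulation sketch of ours (an illustration of the idea, not from the text): the total log-likelihood gap between the true model and a competitor grows with \(n\), so the models become easier to distinguish.

```r
# Data simulated from EXP(600); compare total log-likelihoods under the
# true model EXP(600) and a competitor EXP(300). The gap grows roughly
# linearly in n, so discrimination improves with sample size.
set.seed(1)
x <- rexp(2000, rate = 1 / 600)
for (n in c(20, 200, 2000)) {
  gap <- sum(dexp(x[1:n], rate = 1 / 600, log = TRUE)) -
         sum(dexp(x[1:n], rate = 1 / 300, log = TRUE))
  cat("n =", n, "  log-likelihood gap =", round(gap, 1), "\n")
}
```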
\[ \mathscr{L}_{i}(\underline{\theta}|t_{i})=\begin{cases} S(t_{i}) &\mbox{for a right censored observation}\\\\F(t_{i}) &\mbox{for a left censored observation}\\\\F(t_{i})-F(t_{i-1}) &\mbox{for an interval censored observation}\\\\\lim\limits_{\Delta_i\rightarrow 0^{+}} \frac{F(t_{i})-F(t_{i}-\Delta_{i})}{\Delta_{i}} &\mbox{for an "exact" observation}\end{cases} \]
\[ \begin{aligned} \mathscr{L}(\underline{\theta}|\underline{x})&=C\prod_{i=1}^{n} \mathscr{L}_{i}(\underline{\theta}|x_i)\\ &=C\prod_{i=1}^{m+1}[F(t_{i})]^{l_{i}}[F(t_{i})-F(t_{i-1})]^{d_{i}}[1-F(t_{i})]^{r_{i}} \end{aligned} \]
where
\(l_i=1\) if \(t_i\) is a left censored observation (0 otherwise)
\(d_i=1\) if \(t_i\) is an interval censored observation (0 otherwise)
\(r_i=1\) if \(t_i\) is a right censored observation (0 otherwise)
\(n = \sum_{j=1}^{m+1}(l_{j}+d_{j}+r_{j})\)
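These contributions are easy to code directly; a minimal sketch for the exponential distribution (the function loglik_i() is ours, not part of SMRD):

```r
# Log-likelihood contribution of one observation under EXP(theta).
# type: "exact", "right", "left", or "interval"; an interval-censored
# observation lies in (t1, t2].
loglik_i <- function(theta, t1, t2 = NA, type = "exact") {
  F <- function(t) pexp(t, rate = 1 / theta)
  switch(type,
         exact    = dexp(t1, rate = 1 / theta, log = TRUE),  # density approx.
         right    = log(1 - F(t1)),                          # S(t1)
         left     = log(F(t1)),                              # F(t1)
         interval = log(F(t2) - F(t1)))                      # F(t2) - F(t1)
}
loglik_i(600, 100, 300, type = "interval")  # e.g., one binned observation
```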
Berkson modeled the times between \(\alpha\)-particle arrivals using an \(EXP(\theta)\) distribution.
This example demonstrates the effect of sample size on the value of \(\hat{\theta}_{_{MLE}}\)
The original data set contained \(10,220\) observations binned into \(m+1=8\) time intervals
The random samples of size \(n = 20, 200, 2000\) that were generated from the original data are shown below
Note that all four of these data sets are available in the SMRD package and may be called using
berkson20
berkson200
berkson2000
berkson10220
data <- matrix(c(3,7,4,1,3,2,0,0,
41,44,24,32,29,21,9,0,
292,494,332,236,261,308,73,4,
1609,2424,1770,1306,1213,1528,354,16),
nrow = 4,
byrow = TRUE)
rownames(data) <- c("n = 20","n = 200","n = 2000","n = 10220")
           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
n = 20 3 7 4 1 3 2 0 0
n = 200 41 44 24 32 29 21 9 0
n = 2000 292 494 332 236 261 308 73 4
n = 10220 1609 2424 1770 1306 1213 1528 354 16
Regardless of which sample size we use, the form of the likelihood function will be the same. The form would differ only if
We chose to model the data with a distribution other than the exponential
We chose to inspect the samples more (less) often, resulting in a higher (lower) number of intervals
The form of the likelihood function for these data is
\[ \begin{aligned} \mathscr{L}(\underline{\theta}|\underline{x})&=C\prod_{j=1}^{8}[F(t_{j}|\theta)-F(t_{j-1}|\theta)]^{d_{j}}\\ &=C\prod_{j=1}^{8}\left[\left(1-\exp\left(-\frac{t_{j}}{\theta}\right)\right)-\left(1-\exp\left(-\frac{t_{j-1}}{\theta}\right)\right)\right]^{d_{j}}\\ &=C\prod_{j=1}^{8}\left[\exp\left(-\frac{t_{j-1}}{\theta}\right)-\exp\left(-\frac{t_{j}}{\theta}\right)\right]^{d_{j}} \end{aligned} \]
In R, this likelihood function may be defined as shown below
Where
theta represents the exponential parameter \(\theta\) whose value we wish to estimate
d is a vector of length \(8\) holding the number of observations in each of the eight time intervals
Like7.3 <- function(theta,d) {
F <- log(exp( -0/theta)-exp( -100/theta))*d[1]+
log(exp( -100/theta)-exp( -300/theta))*d[2]+
log(exp( -300/theta)-exp( -500/theta))*d[3]+
log(exp( -500/theta)-exp( -700/theta))*d[4]+
log(exp( -700/theta)-exp(-1000/theta))*d[5]+
log(exp(-1000/theta)-exp(-2000/theta))*d[6]+
log(exp(-2000/theta)-exp(-4000/theta))*d[7]+
log(exp(-4000/theta)-exp( -Inf/theta))*d[8]
return(-F) ### Remember: we're minimizing the negative of our function
}
We'll use nlminb to find the value of theta that minimizes the negative log-likelihood Like7.3 (and thereby maximizes the likelihood) for each of the four data sets
For each sample, a start value of 400 was used; this value was chosen by trial and error
Choosing start values that are too far from the optimum can sometimes cause convergence failures
Note that the upper and lower arguments were used to limit the range of values in which nlminb will search
Verify that the values returned by nlminb below for \(\hat{\theta}_{_{MLE}}\) match those shown in the top of Table 7.2
P.1 <- nlminb(start = 400, obj = Like7.3, d = data[1,], lower = 100, upper = 800)$par
P.2 <- nlminb(start = 400, obj = Like7.3, d = data[2,], lower = 100, upper = 800)$par
P.3 <- nlminb(start = 400, obj = Like7.3, d = data[3,], lower = 100, upper = 800)$par
P.4 <- nlminb(start = 400, obj = Like7.3, d = data[4,], lower = 100, upper = 800)$par
MLE <- data.frame(P.1,P.2,P.3,P.4)
colnames(MLE) <- c("n = 20","n = 200","n = 2000","n = 10220")
rownames(MLE) <- c("MLE")
       n = 20  n = 200 n = 2000 n = 10220
MLE 440.1711 572.2742 612.7727 596.3443
N.samples <- 200
berkson <- switch(as.character(N.samples),
"20" = {SMRD::berkson20},
"200" = {SMRD::berkson200},
"2000" = {SMRD::berkson2000},
"10220" = {SMRD::berkson10220},
stop('This number is no good'))
berkson.ld <- frame.to.ld(berkson,
response.column = c(1,2),
censor.column = 3,
case.weight.column = 4)
mleprobplot(berkson.ld, distribution = "Exponential")
The likelihood function \(\mathscr{L}(\underline{\theta}|\underline{x})\) provides a useful method for computing approximate CIs for
Parameters
Functions of parameters
We know that \(\mathscr{L}(\underline{\theta}|\underline{x})\) attains its maximum value at \(\hat{\theta}_{_{MLE}}\)
We also know that for \(\theta \ne \hat{\theta}_{_{MLE}}\), \(\mathscr{L}(\theta|\underline{x})<\mathscr{L}(\hat{\theta}_{_{MLE}}|\underline{x})\)
By rearranging this relationship, we can define the relative likelihood function
\[ R(\theta)=\frac{\mathscr{L}(\theta|\underline{x})}{\mathscr{L}(\hat{\theta}_{_{MLE}}|\underline{x})} \in (0,1] \]
\[-2\log[R(\theta)]\le \chi^2_{(1-\alpha;df)}\]
\[R(\theta)\ge\exp\left[-\chi^2_{(1-\alpha;df)}/2\right]\]
Where \(df\) is equal to the number of unknown parameters
\[ \frac{\mathscr{L}(\theta|\underline{x})}{\mathscr{L}(\hat{\theta}_{_{MLE}}|\underline{x})}=\exp\left[\log\mathscr{L}(\theta|\underline{x})-\log\mathscr{L}(\hat{\theta}_{_{MLE}}|\underline{x})\right]\ne\frac{\log\mathscr{L}(\theta|\underline{x})}{\log\mathscr{L}(\hat{\theta}_{_{MLE}}|\underline{x})} \]
For the \(\alpha\)-particle data set \((n=20)\), a \(95\%\) CI for \(\theta\) may be found using the relative likelihood in two ways
Analytically, by solving for the values of \(\theta\) at which \(R(\theta)\) equals the critical value
Numerically, by profiling \(R(\theta)\) over a range of \(\theta\) values
This example walks through both processes
Step 1: Write down the likelihood and log-likelihood for the data
\[ \begin{aligned} \mathscr{L}(\theta|\underline{x})&=C\prod_{j=1}^{8}\left[\exp\left(-\frac{t_{j-1}}{\theta}\right)-\exp\left(-\frac{t_{j}}{\theta}\right)\right]^{d_{j}}\\ \log\mathscr{L}(\theta|\underline{x})&=\log C+\sum_{j=1}^{8}d_{j}\, \log\left[\exp\left(-\frac{t_{j-1}}{\theta}\right)-\exp\left(-\frac{t_{j}}{\theta}\right)\right] \end{aligned} \]
Step 2: Identify the critical values for \(R(\theta)\)
For this example the only unknown parameter is \(\theta,\;(df=1)\)
Therefore, a two-sided \(95\%\) CI for \(\theta\) is the set of all values of \(\theta\) for which
\[ R(\theta)=\frac{\mathscr{L}(\theta|\underline{x})}{\mathscr{L}(\hat{\theta}_{_{MLE}}|\underline{x})}\ge \exp\left[-\chi^{2}_{(1-.05;1)}/2\right]= 0.147 \]
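This critical value can be reproduced directly in R:

```r
exp(-qchisq(0.95, df = 1) / 2)  # critical value of R(theta) for a 95% CI
```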
[1] 0.1465001
Step 3: Using the critical values of \(R(\theta)\), find the associated critical values for \(\theta\)
If \(\hat{\theta}_{_{MLE}}\) is known, the value of \(\mathscr{L}(\hat{\theta}_{_{MLE}}|\underline{x})\) can easily be found.
The critical value for \(R(\theta)\) has already been determined
Thus, the critical values for \(\theta\) may be found by iteratively substituting trial values of \(\theta\) into \[\mathscr{L}(\theta|\underline{x})\ge R(\theta_c) \times\mathscr{L}(\hat{\theta}_{_{MLE}}|\underline{x})\]
For all but the most basic likelihood functions, Step 3 is hard and time consuming
Thankfully, computers can do this for us
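For example, the two CI endpoints can be found with uniroot(); a minimal sketch assuming the Like7.3() function and data matrix defined earlier are in scope:

```r
# Likelihood-based 95% CI for theta (n = 200 sample). The endpoints are
# the two roots of log R(theta) = -qchisq(0.95, 1) / 2.
d2  <- data[2, ]
mle <- nlminb(start = 400, obj = Like7.3, d = d2,
              lower = 100, upper = 800)$par
g <- function(theta) {
  (-Like7.3(theta, d2) + Like7.3(mle, d2)) + qchisq(0.95, df = 1) / 2
}
lower <- uniroot(g, interval = c(100, mle))$root
upper <- uniroot(g, interval = c(mle, 3000))$root
round(c(lower, upper))  # compare with [498, 662] in Table 7.2
```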
Plotting \(R(\theta) \sim \theta\) gives a normalized profile of \(\mathscr{L}(\theta)\) that can be used to compute CI’s for \(\theta\)
The SMRD package makes this easy
par(family = 'serif')
library(SMRD)
berkson200.ld <-
frame.to.ld(berkson200,
response.column = c(1,2),
censor.column = 3,
case.weight.column = 4,
time.units = '1/5000 Seconds')
simple.contour(berkson200.ld,
distribution = 'exponential',
xlim = c(450,800))
Figure 7.2 shows the relative likelihood functions for each of the four samples from the Berkson data
Note how the shape of the relative likelihood function changes with differing amounts of information
What does this mean mathematically?
What does this mean conceptually?
The text talks about the ‘curvature of the likelihood function’ - which is shown graphically in Figure 7.2 and in the code below
We’ll discuss this figure more in a later example
library(ggplot2)
Like<-function(theta,d) {
F<-log(exp( -0/theta)-exp( -100/theta))*d[1]+
log(exp( -100/theta)-exp( -300/theta))*d[2]+
log(exp( -300/theta)-exp( -500/theta))*d[3]+
log(exp( -500/theta)-exp( -700/theta))*d[4]+
log(exp( -700/theta)-exp(-1000/theta))*d[5]+
log(exp(-1000/theta)-exp(-2000/theta))*d[6]+
log(exp(-2000/theta)-exp(-4000/theta))*d[7]+
log(exp(-4000/theta)-exp( -Inf/theta))*d[8]
return(-F)
}
data<-matrix(c(3,7,4,1,3,2,0,0,
41,44,24,32,29,21,9,0,
292,494,332,236,261,308,73,4,
1609,2424,1770,1306,1213,1528,354,16),
nrow = 4,
byrow = TRUE)
mles <- matrix(NA, ncol = 1, nrow = nrow(data))
mles[,1] <- sapply(X = 1:4,
FUN = function(x) {
nlminb(start = 400,
Like,
d = data[x,],
lower = 100,
upper = 800,
control = list(trace=FALSE))$par
})
x<-seq(200,800,.1)
n.1 <- exp(-Like(x,data[1,]) + Like(mles[1,],data[1,]))
n.2 <- exp(-Like(x,data[2,]) + Like(mles[2,],data[2,]))
n.3 <- exp(-Like(x,data[3,]) + Like(mles[3,],data[3,]))
n.4 <- exp(-Like(x,data[4,]) + Like(mles[4,],data[4,]))
Data <- data.frame(X = x,
Y = c(n.1, n.2, n.3, n.4),
N = rep(c(20,200,2000,10220), each = length(x)),
row.names = NULL)
ggplot(Data) +
geom_line(aes(x = X, y = Y, colour = as.factor(N)), size = 1.5) +
theme_bw(base_size = 16) +
ylab(expression('R('*theta*')')) +
xlab(expression(theta)) +
guides(col = guide_legend(title = "# Samples"))
The significance test associated with likelihood-based CIs is known as the likelihood ratio test (LRT)
In the \(\alpha\)-particle example, our uncertainty about the value of \(\theta\) was expressed as all values of \(\theta\) for which
\[ R(\theta|\underline{x})\ge\exp\left[-\chi^2_{(1-\alpha;1)}/2\right] \]
In the likelihood ratio test (LRT)
A single value of \(\theta\) is hypothesized, denoted by \(\theta_{0}\)
The LRT tests whether the observed data provides sufficient evidence to reject the hypothesized value \(\theta_0\) or not
\[ \begin{aligned} H_{0}: \theta &= 650\\ H_{1}: \theta &\ne 650 \end{aligned} \]
\[ \begin{aligned} -2\log\left[R(\theta_{0}|\underline{x})\right]&>\chi^2_{(.95,1)}\\ -2\log\left[\frac{\mathscr{L}(\theta_0|\underline{x})}{\mathscr{L}(\hat{\theta}_{_{MLE}}|\underline{x})}\right]&>3.84 \end{aligned} \]
\[-2\log\left[\frac{\mathscr{L}(650|\underline{x})}{\mathscr{L}(572.3|\underline{x})}\right]=2.94\]
The value of the test statistic \(2.94\) is less than the critical value of \(3.84\)
As this value is not in the rejection region, insufficient evidence exists to reject the null hypothesis that \(\theta=650\) at the \(.05\) significance level.
Meeker notes that the sample with \(n=2000\) observations does provide sufficient evidence to reject the null hypothesis that \(\theta=650\) at the \(.05\) significance level
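Both tests can be sketched in R using the Like7.3() function and data matrix defined earlier (the helper lrt() is ours):

```r
# -2 log R(650) for the n = 200 and n = 2000 samples; reject H0 when the
# statistic exceeds qchisq(0.95, 1) = 3.84
lrt <- function(theta0, d) {
  mle <- nlminb(start = 400, obj = Like7.3, d = d,
                lower = 100, upper = 800)$par
  2 * (Like7.3(theta0, d) - Like7.3(mle, d))
}
lrt(650, data[2, ])  # ~2.94 < 3.84: fail to reject H0
lrt(650, data[3, ])  # > 3.84: reject H0
```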
\[ \left[\underline{\theta},\overline{\theta}\right]=\hat{\theta}\pm z_{(1-\alpha/2)} \hat{se}_{\hat{\theta}} \]
Where
\(\hat{\theta}\) is our point estimate for \(\theta\) using the available data
\(z_{(1-\alpha/2)}\) is the \(p=1-\alpha/2\) quantile of the standard normal distribution \(NOR(0,1)\)
\(\hat{se}_{\hat{\theta}}=\sqrt{\widehat{Var}\left[\hat{\theta}\right]}\)
In general, the variance & covariance values for the parameters are computed from the Fisher Information Matrix
\[ \mathscr{I}_{\theta}=-E\left[\frac{\partial^2\log\mathscr{L}(\underline{\theta}|\underline{x})}{\partial\theta_i\partial\theta_j}\right]=E\left[\mathcal{I}_{\theta}\right] \]
The Fisher Information Matrix
Gives the expected amount of information the data will provide about \(\underline{\theta}\)
Is primarily a theoretical measure
Once data have been observed, the estimated variances & covariances may be computed from the Observed Information Matrix
\[ \mathcal{I}_{\theta}=-\frac{\partial^2\log\mathscr{L}(\underline{\theta}|\underline{x})}{\partial\theta_i\partial\theta_j} \]
\[ \log\mathscr{L}(\theta|\underline{t})=\sum_{j=1}^8 d_j \log\left[\exp\left(-\frac{t_{j-1}}{\theta}\right)-\exp\left(-\frac{t_j}{\theta}\right)\right] \]
The information observed from a data set may be found by substituting any value of \(\theta\) into \(-\frac{\partial^2\log\mathscr{L}(\theta)}{\partial\theta^2}\)
Substituting \(\hat{\theta}_{_{MLE}}\) evaluates the observed information at the ML estimate and is the value of \(\theta\) typically used
Substituting \(\hat{\theta}_{_{MLE}}\) for \(\theta\) in \(-\frac{\partial^2\log\mathscr{L}(\theta)}{\partial\theta^2}\) gives
\[ \mathcal{I}_{\hat{\theta}_{_{MLE}}} = 0.000574 \]
\[ \Sigma_{\hat{\theta}_{_{MLE}}}=\left[\mathcal{I}_{\hat{\theta}_{_{MLE}}}\right]^{-1} \]
For the \(\alpha\)-particle data set, where \(T\sim EXP(\theta)\), \(\mathcal{I}_{\hat{\theta}}\) is a single-element matrix
The inverse of a single-element matrix is found by simply taking the reciprocal, thus
\[ \Sigma_{\hat{\theta}_{_{MLE}}}=\frac{1}{0.000574}=1742.2 \]
Like7.3<-function(theta,d) {
F<-log(exp( -0/theta)-exp( -100/theta))*d[1]+
log(exp( -100/theta)-exp( -300/theta))*d[2]+
log(exp( -300/theta)-exp( -500/theta))*d[3]+
log(exp( -500/theta)-exp( -700/theta))*d[4]+
log(exp( -700/theta)-exp(-1000/theta))*d[5]+
log(exp(-1000/theta)-exp(-2000/theta))*d[6]+
log(exp(-2000/theta)-exp(-4000/theta))*d[7]+
log(exp(-4000/theta)-exp( -Inf/theta))*d[8]
return(-F) }
We use the nlminb function to compute a value for \(\hat{\theta}_{_{MLE}}\) assuming \(n=200\)
[1] 41 44 24 32 29 21  9  0
[1] 572.27
We could also use the optim function and arrive at the same solution
[1] 572.27
The optim output includes a numerical estimate of the Observed Information Matrix (aka the Hessian matrix in some contexts)
optim(par = 400,
fn = Like7.3,
d = data[2,],
lower = 100,
upper = 800,
method = "L-BFGS-B",
hessian = TRUE)$hessian
        [,1]
[1,] 0.00057452
To compute the variance-covariance matrix \(\Sigma_{\hat{\theta}_{MLE}}\), either
Use solve() to invert \(\mathcal{I}_{\hat{\theta}_{MLE}}\), or
Take the reciprocal (only for \(1\times 1\) matrices)
Note: the accuracy of this numerical estimate can decrease as the number of unknown parameters increases
If the gradients are known, it's generally better to supply them in the function call
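A small sketch of this inversion, using the Hessian value computed above (the small difference from the text's 1742.2 comes from rounding the Hessian to 0.000574):

```r
# Invert the (1 x 1) observed information matrix to estimate Var(theta.hat);
# solve() and the reciprocal agree for single-element matrices
I.hat <- matrix(0.00057452, nrow = 1, ncol = 1)
solve(I.hat)        # ~1740.6, the estimated variance
sqrt(solve(I.hat))  # ~41.72, the estimated standard error
```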
berkson200.ld <- frame.to.ld(berkson200,
response.column = c(1,2),
censor.column = 3,
case.weight.column = 4)
berkson200.mlest <- mlest(berkson200.ld, distribution = "Exponential")
berkson200.mlest$mle.table
                         MLE   Std.Err. 95% Lower 95% Upper
mu 6.3496 0.072901 6.2067 6.4925
sigma 1.0000 0.000000 1.0000 1.0000
Exponential (mean) 572.2742 41.719536 496.0786 660.1732
mu sigma
mu 0.0053146 0
sigma 0.0000000 0
In general, the diagonal elements of \(\Sigma_{\hat{\theta}_{_{MLE}}}\) represent the variances of the parameters \(\widehat{Var}\left[\hat{\theta}_1\right],\widehat{Var}\left[\hat{\theta}_2\right],...\)
While the off-diagonal elements represent the covariances \(\widehat{Cov}\left[\hat{\theta}_i,\hat{\theta}_j\right],\; i\ne j\)
For the \(\alpha\)-particle example \(\widehat{Var}\left[\hat{\theta}_{_{MLE}}\right]=1742.2\)
and
\[ \widehat{se}\left[\hat{\theta}_{_{MLE}}\right]=\sqrt{\widehat{Var}\left[\hat{\theta}_{_{MLE}}\right]}=\sqrt{1742.2}=41.72 \]
\[ \frac{\hat{\theta}-\theta}{\hat{se}_{\hat{\theta}}}\sim NOR(0,1) \]
\[ \left[\underline{\theta}, \overline{\theta}\right]=572.3\pm 1.960\times 41.72=\left[ 490,653 \right] \]
where
\(572.3=\hat{\theta}_{_{MLE}}, n=200\)
\(1.960=\Phi^{-1}_{_{NOR}}(p=1-0.05/2)\)
Should work well for large samples \(n>30\)
May give poor results for small samples \(n<30\)
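A minimal sketch reproducing the interval above from the reported estimates:

```r
# Normal-approximation (Wald) 95% CI for theta, n = 200 sample
theta.hat <- 572.2742
se.hat    <- 41.7195
theta.hat + c(-1, 1) * qnorm(0.975) * se.hat  # ~[490.5, 654.0]
```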
An alternative method to compute \(100(1-\alpha)\%\) CIs is based on the normal approximation for \(\log(\hat{\theta})\)
\[ \frac{\log(\hat{\theta})-\log(\theta)}{\hat{se}_{\log(\hat{\theta})}}\sim NOR(0,1) \]
This method has been shown to produce better confidence intervals
Using this alternative method the upper and lower confidence limits are expressed as
\[ \left[\underline{\log(\theta)},\overline{\log(\theta)}\right]=\log(\hat{\theta}) \pm z_{(1-\alpha/2)}\hat{se}_{\log(\hat{\theta})} \]
\[ \left[\underline{\theta},\overline{\theta}\right]=\left[\hat{\theta}/w,\hat{\theta}\times w\right] \]
where
\(w = \exp\left(z_{(1-\alpha/2)}\hat{se}_{\hat{\theta}}/\hat{\theta}\right)\)
\(\hat{se}_{\log(\hat{\theta})}=\hat{se}_{\hat{\theta}}/\hat{\theta}\)
\(\hat{se}_{\log(\hat{\theta})}\) is found from the delta method where
\(\hat{\theta}\equiv\) the estimator for which we already know the variance
\(\log(\hat{\theta})\equiv\) the estimator for which we want to know the variance
Using the delta method gives
\[ \begin{aligned} \widehat{Var}\left[\log(\hat{\theta})\right]&=\left[\frac{\partial\log(\hat{\theta})}{\partial\hat{\theta}}\right]^2 \widehat{Var}\left[\hat{\theta}\right] =\frac{\widehat{Var}\left[\hat{\theta}\right]}{\hat{\theta}^2}\\\\ \text{therefore}\;\;\hat{se}_{\log(\hat{\theta})}&=\sqrt{\frac{\widehat{Var}\left[\hat{\theta}\right]}{\hat{\theta}^2}}=\frac{\hat{se}_{\hat{\theta}}}{\hat{\theta}} \end{aligned} \]
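A minimal sketch applying this result to the \(n = 200\) estimates:

```r
# 95% CI for theta based on the normal approximation for log(theta.hat)
theta.hat <- 572.2742
se.hat    <- 41.7195
w <- exp(qnorm(0.975) * se.hat / theta.hat)
c(theta.hat / w, theta.hat * w)  # ~[496.1, 660.2], matching the table below
```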
MLE Std.Err. 95% Lower 95% Upper
mu 6.3496 0.072901 6.2067 6.4925
sigma 1.0000 0.000000 1.0000 1.0000
Exponential (mean) 572.2742 41.719536 496.0786 660.1732
mu sigma
mu 0.0053146 0
sigma 0.0000000 0
a<-c("596.3","6.084","",rep("[584,608]",3),"168","1.7","",rep("[164,171]",3))
b<-c("612.8","14.13","",rep("[586,641]",2),"[585,640]","163","3.8","",rep("[156,171]",3))
c<-c("572.3","41.72","","[498,662]","[496,660]","[490,653]","175","13","","[151,201]","[152,202]","[149,200]")
d<-c("440.2","101.0","","[289,713]","[281,690]","[242,638]","227","52","","[140,346]","[145,356]","[125,329]")
Tab.7.2<-data.frame(a,b,c,d)
rownames(Tab.7.2)<-c("ML estimate $\\hat{\\theta}$",
"Standard error $\\hat{se}_{\\hat{\\theta}}$",
"Approximate $95\\%$ CI for $\\theta$",
"Based on the likelihood",
"Based on $Z_{\\log(\\hat{\\theta})}\\sim NOR(0,1)$",
"Based on $Z_{\\hat{\\theta}}\\sim NOR(0,1)$",
"ML estimate $\\hat{\\lambda}\\times10^{5}$",
"Standard error $\\hat{se}_{\\hat{\\lambda}\\times10^{5}}$",
"Approximate $95\\%$ CI for $\\lambda \\times 10^{5}$",
"Based on the likelihood ",
"Based on $Z_{\\log(\\hat{\\lambda})}\\sim NOR(0,1)$",
"Based on $Z_{\\hat{\\lambda}} \\sim NOR(0,1)$")
knitr::kable(Tab.7.2,
format = 'markdown',
col.names = c("n = 10220","n = 2000","n = 200","n = 20"),
row.names = T)

|  | n = 10220 | n = 2000 | n = 200 | n = 20 |
|---|---|---|---|---|
| ML estimate \(\hat{\theta}\) | 596.3 | 612.8 | 572.3 | 440.2 |
| Standard error \(\hat{se}_{\hat{\theta}}\) | 6.084 | 14.13 | 41.72 | 101.0 |
| Approximate \(95\%\) CI for \(\theta\) | ||||
| Based on the likelihood | [584,608] | [586,641] | [498,662] | [289,713] |
| Based on \(Z_{\log(\hat{\theta})}\sim NOR(0,1)\) | [584,608] | [586,641] | [496,660] | [281,690] |
| Based on \(Z_{\hat{\theta}}\sim NOR(0,1)\) | [584,608] | [585,640] | [490,653] | [242,638] |
| ML estimate \(\hat{\lambda}\times10^{5}\) | 168 | 163 | 175 | 227 |
| Standard error \(\hat{se}_{\hat{\lambda}\times10^{5}}\) | 1.7 | 3.8 | 13 | 52 |
| Approximate \(95\%\) CI for \(\lambda \times 10^{5}\) | ||||
| Based on the likelihood | [164,171] | [156,171] | [151,201] | [140,346] |
| Based on \(Z_{\log(\hat{\lambda})}\sim NOR(0,1)\) | [164,171] | [156,171] | [152,202] | [145,356] |
| Based on \(Z_{\hat{\lambda}} \sim NOR(0,1)\) | [164,171] | [156,171] | [149,200] | [125,329] |
The rate of occurrence for events distributed \(EXP(\theta)\) is \(\lambda =\frac{1}{\theta}\)
Since \(\lambda = 1/\theta\) is a decreasing function of \(\theta\), we can say that
\[ \underline{\lambda}=1/\overline{\theta} \quad \& \quad \overline{\lambda}=1/\underline{\theta} \]
\[ \lambda \in \left(\underline{\lambda},\overline{\lambda}\right)\;\;\iff \theta \in \left(\underline{\theta},\overline{\theta}\right) \]
\[ \left[\underline{F(t|\theta)},\overline{F(t|\theta)}\right]=\left[F(t|\overline{\theta}),F(t|\underline{\theta})\right] \]
Unlike what was shown in \(\S\) 3.8, these confidence bands are simultaneous since \(\theta\) is the only unknown parameter.
And again
\[ F(t) \in \left(\underline{F(t)},\overline{F(t)}\right)\;\;\iff \theta \in \left(\underline{\theta},\overline{\theta}\right) \]
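A sketch of these transformations using the likelihood-based \(n=200\) interval from Table 7.2:

```r
theta.ci <- c(498, 662)        # likelihood-based 95% CI for theta (n = 200)
1e5 / rev(theta.ci)            # CI for lambda x 10^5: ~[151, 201]
t <- 1000
1 - exp(-t / rev(theta.ci))    # [F(t|theta.upper), F(t|theta.lower)] at t = 1000
```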
\[ \begin{array}{lcccc} \hline & n=10220 & n=2000 & n=200 & n=20 \\ \hline \text{ML estimate } \hat{\lambda}\times10^{5} & 168 & 163 & 175 & 227 \\ \text{Standard error } \hat{se}_{\hat{\lambda}\times10^{5}} & 1.7 & 3.8 & 13 & 52 \\ \hline \text{Approximate } 95\% \text{ CI for } \lambda\times10^{5}\ & & & & \\ \hline \text{Based on the likelihood } & [164,171] & [156,171] & [151,201] & [140,346] \\ \text{Based on } Z_{\log(\hat{\lambda})}\sim NOR(0,1) & [164,171] & [156,171] & [152,202] & [145,356] \\ \text{Based on } Z_{\hat{\lambda}}\sim NOR(0,1) & [164,171] & [156,171] & [149,200] & [125,329] \\ \hline \end{array} \]
Simulation studies have shown that better confidence intervals are produced by
Likelihood methods
Bootstrap methods (Chapter 9)
Normal approximation method based on \(\log(\hat{\theta})\)
Normal approximation method based on \(\hat{\theta}\)
These CI techniques are listed in order of how close their coverage probabilities are to the nominal confidence level
“Better” CI \(\equiv\) coverage probability closer to the nominal confidence level
\[ F(t_i|\theta)-F(t_i-\Delta_i|\theta)\approx f(t_i|\theta)\Delta_i\quad\text{for small}\;\Delta_i>0 \]
This approximation works well for data that
Are right-censored only (no left-censoring or interval-censoring)
Can be adequately modeled by an exponential distribution
Under this situation, \(\theta\) is estimated by
\[ \hat{\theta}=\frac{TTT}{r} \]
where
\(r \equiv\) the number of failures
\(\displaystyle TTT =\sum_{i=1}^n t_{i}\) is the total time on test
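As a sketch (the function name is ours), this estimator is one line of R:

```r
# MLE of theta for exponential data with exact failures and right censoring;
# cens = 1 marks a failure, cens = 0 a censored unit
exp_mle <- function(time, cens) sum(time) / sum(cens)
exp_mle(time = c(10, 25, 40, 40), cens = c(1, 1, 0, 0))  # TTT = 115, r = 2
```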
Consider the lzbearing data set
The first failure occurs at lzbearing$megacycles[1] = 17.88 megacycles.
However, every one of the 23 units in the test has also operated for this long, so the first failure contributes \(23\times 17.88\) to the total time on test; therefore
[1] 411.24
[1] 654.12
TTT <- double(nrow(lzbearing))
for(i in 1:nrow(lzbearing)) {
`if`(i == 1,
TTT[i] <- lzbearing[i,1] * length(lzbearing[23:i,1]),
TTT[i] <- diff(lzbearing[(i-1):i,1]) * length(lzbearing[23:i,1]))
}
plot(x = 1:23,
y = cumsum(TTT),
pch = 16,
col = 2,
cex = 1.2,
las = 1,
ylab = 'TTT',
xlab = '# Failures')
A more generalized implementation is shown below
Computes the TTT plot and Barlow-Proschan's test for censored and uncensored data sets
The code was originally found at http://www.math.ntnu.no/~akerkar/tma4275/TTT-plot.R; however, this link no longer appears to work
TTTplot <- function(time,cens=NULL,show_results = F) {
n <- length(time)
oo <- order(time)
if(is.null(cens)) cens = rep(1,n)
sorttime <- time[oo]
sortcens <- cens[oo]
ttt <- numeric(n)
ttt[1] <- n*sorttime[1]
for(i in 2:n)
{
ttt[i] <- ttt[i-1]+(n-i+1)*(sorttime[i]-sorttime[i-1])
}
ttt <- data.frame(time = sorttime,
cens = sortcens,
rank = 1:n,
"cum total time" = ttt)
colnames(ttt) <- c("time","cens","rank","Cum.Total.Time")#,"i/n","TTT")
TTT <- ttt[ttt$cens==1,]
nfail <- dim(TTT)[1]
TTT <- cbind(TTT[,-3],
TTT$Cum.Total.Time / TTT$Cum.Total.Time[nfail],
(1:nfail) / nfail)
colnames(TTT) <- c("time","cens","Cum.Total.Time","TTT","i/n")
W <- sum(TTT$"Cum.Total.Time"[1:(nfail-1)]) / TTT$"Cum.Total.Time"[nfail]
Z <- (W - (nfail - 1) / 2) / sqrt((nfail-1) / 12)
if(show_results) {
print(TTT)
cat("Barlow-Proschan's test\n")
cat(paste("W=",round(W,4)," k=",nfail,"\n",sep=""))
cat(paste("Z=",
round(Z,4)," p.value=",round(pnorm(-abs(Z)),4),"\n",sep=""))
}
plot(TTT$"i/n",
TTT$TTT,
xlab = "i/n",
ylab = "TTT",
xlim = c(0,1),
ylim = c(0,1),
pch = 19,
main = "TTT plot",
sub = paste("N. failure = ",
nfail," N. censored = ",
n-nfail,sep=""))
lines(c(0,1),
c(0,1),
lty = 2)
invisible(list(TTT = TTT, W = W, Z = Z))
}
TTTplot(time = lzbearing$megacycles)
If the following assumptions are met
The exponential distribution cannot be rejected as an adequate model for a dataset using either graphical or numerical procedures
We have either complete data (exact observation times) or Type-II (failure) censoring
An exact \(100(1-\alpha)\%\) CI can be constructed for \(\theta\) as
\[ \big[\underset{\sim}{\theta},\quad\overset{\sim}{\theta}\big] = \Bigg[\frac{2(TTT)}{\chi^2_{(1-\alpha/2),2r}},\quad\frac{2(TTT)}{\chi^2_{(\alpha/2),2r}}\Bigg] \]
This follows because \[ \frac{2(TTT)}{\theta}\sim\chi^2_{2r} \]
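A minimal helper (the function name is ours) implementing this interval:

```r
# Exact 100(1 - alpha)% CI for theta under Type-II censoring, based on
# the pivot 2 * TTT / theta ~ chi-squared with 2r degrees of freedom
exp_exact_ci <- function(TTT, r, alpha = 0.05) {
  c(lower = 2 * TTT / qchisq(1 - alpha / 2, df = 2 * r),
    upper = 2 * TTT / qchisq(alpha / 2,     df = 2 * r))
}
exp_exact_ci(TTT = 950.88, r = 15)  # ~[40.5, 113.3]; see the example below
```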
Background
A life-test was performed on 25 specimens of a new insulating material
Testing concluded when 15 failures were observed (Type-II test)
The data are shown below (not included in the SMRD package)
t7.13 <- c(1.08,12.2,17.8,19.1,26,27.9,28.2,32.2,35.9,43.5,44,45.2,45.7,46.3,47.8,rep(47.8,10))
c7.13 <- c(rep(1,15),rep(0,10))
df7.13 <- data.frame(times = t7.13,
cens = c7.13)
knitr::kable(df7.13)

| times | cens |
|---|---|
| 1.08 | 1 |
| 12.20 | 1 |
| 17.80 | 1 |
| 19.10 | 1 |
| 26.00 | 1 |
| 27.90 | 1 |
| 28.20 | 1 |
| 32.20 | 1 |
| 35.90 | 1 |
| 43.50 | 1 |
| 44.00 | 1 |
| 45.20 | 1 |
| 45.70 | 1 |
| 46.30 | 1 |
| 47.80 | 1 |
| 47.80 | 0 |
| 47.80 | 0 |
| 47.80 | 0 |
| 47.80 | 0 |
| 47.80 | 0 |
| 47.80 | 0 |
| 47.80 | 0 |
| 47.80 | 0 |
| 47.80 | 0 |
| 47.80 | 0 |
Analysis
Implementing the TTTplot function defined previously (code shown below) results in \(TTT = 950.88\)
Using this value gives an ML estimate for \(\widehat{\theta}_{_{MLE}}= 950.88/15 = 63.392\) and a \(95\%\) CI as
\[ \begin{aligned} \big[\underset{\sim}{\theta},\;\overset{\sim}{\theta}\big] &= \Bigg[\frac{2(TTT)}{\chi^2_{(1-\alpha/2),2r}},\quad\frac{2(TTT)}{\chi^2_{(\alpha/2),2r}}\Bigg]\\\\ &=\Bigg[\frac{2(950.88)}{\chi^2_{(0.975),30}},\quad\frac{2(950.88)}{\chi^2_{(0.025),30}}\Bigg]\\\\ &=\Bigg[\frac{1901.76}{46.98},\quad\frac{1901.76}{16.79}\Bigg]\\\\ &=\big[40.48,\quad 113.26\big] \end{aligned} \]
| time | cens | Cum.Total.Time | TTT | i/n |
|---|---|---|---|---|
| 1.08 | 1 | 27.00 | 0.02839 | 0.06667 |
| 12.20 | 1 | 293.88 | 0.30906 | 0.13333 |
| 17.80 | 1 | 422.68 | 0.44451 | 0.20000 |
| 19.10 | 1 | 451.28 | 0.47459 | 0.26667 |
| 26.00 | 1 | 596.18 | 0.62698 | 0.33333 |
| 27.90 | 1 | 634.18 | 0.66694 | 0.40000 |
| 28.20 | 1 | 639.88 | 0.67293 | 0.46667 |
| 32.20 | 1 | 711.88 | 0.74865 | 0.53333 |
| 35.90 | 1 | 774.78 | 0.81480 | 0.60000 |
| 43.50 | 1 | 896.38 | 0.94268 | 0.66667 |
| 44.00 | 1 | 903.88 | 0.95057 | 0.73333 |
| 45.20 | 1 | 920.68 | 0.96824 | 0.80000 |
| 45.70 | 1 | 927.18 | 0.97508 | 0.86667 |
| 46.30 | 1 | 934.38 | 0.98265 | 0.93333 |
| 47.80 | 1 | 950.88 | 1.00000 | 1.00000 |
[1] 950.88
Data from berkson200
lower upper Status Case.weights
1 100 100 L-Censored 41
2 100 300 Interval 44
3 300 500 Interval 24
4 500 700 Interval 32
5 700 1000 Interval 29
6 1000 2000 Interval 21
7 2000 4000 Interval 9
8 4000 4000 R-Censored 0
Snubber.ld <- frame.to.ld(snubber,
response.column = "cycles",
censor.column = "event",
time.units = "Toaster Cycles",
case.weight.column = "count",
x.columns = "design")
groupm.mleprobplot(Snubber.ld,
distribution ="Lognormal",
group.var = 1,
relationships = "linear")