W. Q. Meeker and L. A. Escobar
20 May 2020
Models for continuous failure-time processes
Models used for the discrete data from these continuous failure-time processes
Common censoring mechanisms that restrict our ability to observe all of the failure times that might occur in a reliability study
Principles of likelihood and how likelihood relates to the probability of the observed data
How likelihood ideas can be used to make inferences from reliability data
The probability_functions app from the teachingApps package includes the six probability functions used most in reliability studies and highlights:
Expressions for each function
Mathematical relationships between the functions
How to compute values for each function in R - for a variety of distributions
The most current published version of the teachingApps package can be installed from CRAN by pasting the code below into the R console
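```r
install.packages('teachingApps')
```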
The development version of the teachingApps package (which may contain some newer features) can be installed from GitHub by pasting the code below into the R console

```r
if (!'devtools' %in% installed.packages()[, 1]) {
  install.packages('devtools')
}
devtools::install_github('Auburngrads/teachingApps')
```

The text introduces the probability functions using the \(WEIB(1.7,1)\) distribution as an example and illustrates four functions in Figure 2.1
The SMRD package includes the distribution.plot( ) function, which can be used to reproduce this figure or similar figures for many other distributions
If single values are given for both parameters, distribution.plot( ) returns a plot showing \(F(t), f(t), S(t)\) and \(h(t)\)
If multiple values are given for either parameter, distribution.plot( ) returns a plot showing \(F(t), f(t), h(t)\) and a legend of the parameter values displayed
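For reference, the four functions for the \(WEIB(1.7,1)\) example can also be computed directly in base R; a minimal sketch (not a substitute for distribution.plot( )):

```r
# The four probability functions for the WEIB(1.7, 1) example
t   <- seq(0.01, 3, by = 0.01)
F_t <- pweibull(t, shape = 1.7, scale = 1)  # CDF
f_t <- dweibull(t, shape = 1.7, scale = 1)  # pdf
S_t <- 1 - F_t                              # survival function
h_t <- f_t / S_t                            # hazard function

par(mfrow = c(2, 2), family = 'serif', font = 2)
plot(t, F_t, type = 'l', lwd = 2, las = 1, ylab = 'F(t)')
plot(t, f_t, type = 'l', lwd = 2, las = 1, ylab = 'f(t)')
plot(t, S_t, type = 'l', lwd = 2, las = 1, ylab = 'S(t)')
plot(t, h_t, type = 'l', lwd = 2, las = 1, ylab = 'h(t)')
```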
Failure implies that a product’s ability to perform one or more functions has been either…
Lost completely, or
Degraded to the point that the product cannot meet a performance specification
When failure occurs in fielded systems, there may not be an immediate indication to the user or the maintainers
Some component and sub-component failures may only be discovered after a subsystem or the entire system fails
Component failures in complex systems can remain undiscovered until the next inspection cycle
Because of this, inspection intervals should be chosen deliberately, balancing the cost of inspecting against the cost of leaving failures undiscovered
In real-world testing, we typically cannot observe all of the failure times that might occur
Testing may occur in different geographic regions
Sample size may be too large
Test items may be removed from a test to fill a “real-world” need
The test period may end before every item fails
Items may fail before the first inspection
We can’t constantly monitor the performance of every component when a system is in the field.
For this reason real-world failure data are often reported as the inspection time when the failures were observed
Thus, real-world failure data are often reported as DISCRETE values corresponding to the inspection cycle
To analyze this real-world failure data appropriately, we partition the test into non-overlapping intervals corresponding to the time between inspections, i.e. \[ (t_0,t_1], (t_1,t_2],...,(t_{m-1},t_m],(t_m,t_{m+1}),\;\;\;\;\; t_0=0, t_{m+1}=\infty \]
The intervals do not need to be of equal length
Inspections may be made more frequently at the beginning of a test to check for “infant mortality” failures or ensure that the test was set up correctly
Inspections may then become less frequent once the “bad actors” have been culled from the population
Estimating reliability in this scenario is similar to estimating the likelihood of a single \((m+1)\)-sided die landing on a specific number
However, in the context of failure data the ‘sides’ refer to the \(m+1\) inspection intervals
Think about how to estimate the probability of a 6-sided die landing on a given side
If the die is ‘fair’ and we rolled it many times (i.e. \(n \gtrsim 10,000\) rolls) we would expect that in the long-run \[\frac{\text{# times we roll 1,2,3,4,5,6}}{\text{# of total rolls}}\rightarrow P(X=1,2,3,4,5,6) = \frac{1}{6}\]
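A quick simulation sketch of this long-run behavior:

```r
set.seed(1)
rolls <- sample(1:6, size = 10000, replace = TRUE)  # 10,000 fair-die rolls
table(rolls) / 10000                                # each relative frequency is near 1/6
```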
However, if \(n\) is much smaller, say 100 rolls, the observed counts for each side are modeled using the multinomial distribution with probability mass function \[ f(d_{1},...,d_{m+1}|n,\pi_{1},...,\pi_{m+1})=\frac{n!}{d_{1}!\cdots d_{m+1}!}\pi_{1}^{d_{1}}\cdots\pi_{m+1}^{d_{m+1}} \] (a short R sketch of this pmf follows the variable definitions below)
In the context of rolling a single die with \(m+1\) sides, the variables in this equation represent
\(n\) - number of rolls
\(d_1,...,d_{m+1}\) - number of times each of the faces \(1,...,m+1\) are observed
\(\pi_{_{1}},...,\pi_{_{m+1}}\) - the long-run probabilities of observing faces \(1,...,m+1\)
In the context of failures occurring in \(m+1\) discrete time intervals, these variables represent
\(n\) - the number of systems under test
\(d_{1}\) - number of systems that failed in the \(1^{st}\) interval \((t_{0},t_{1}]\)
\(d_{2}\) - number of systems that failed in the \(2^{nd}\) interval \((t_{1},t_{2}]\)
\(\vdots\)
\(d_{m+1}\) - number of systems still operating at the final inspection \((t_{m},\infty)\)
\(\pi_{_{1}},...,\pi_{_{m+1}}\) - the long-run probabilities that a unit will fail in intervals \(1,...,m+1\)
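This pmf is available in base R as dmultinom( ); a quick sketch for the die-rolling case (the face counts are illustrative):

```r
# Probability of observing these exact face counts in 100 tosses of a fair die
counts <- c(15, 18, 17, 16, 14, 20)   # d_1, ..., d_6 (illustrative)
dmultinom(x = counts, size = 100, prob = rep(1/6, 6))

# The same value computed directly from the pmf
factorial(100) / prod(factorial(counts)) * prod((1/6)^counts)
```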
For a fair die and a fair toss we say that \(\pi_1=\pi_2=\pi_3=\pi_4=\pi_5=\pi_6 \approx 1/6\)
The value \(1/6\) is the propensity of the die
Think of it as the underlying physical dynamics that govern the likelihood of observing a given result
A “weighted” die would have a different propensity, represented by \(\pi_{i}\) values that are no longer all equal to \(1/6\)
Likewise, with some practice we might be able to toss the die in such a way as to affect the propensity
For systems under test, the propensity is a complex relationship between the system’s probability of failure and the environment in which it operates
For the simplest case, \(\pi_{i}=P(t_{i-1}<T\le t_{i})=F(t_{i})-F(t_{i-1})\), where
\(\pi_i\ge 0, \forall i\)
\(\displaystyle \sum_{j=1}^{m+1}\pi_j=1\)
\(F(t)\) is the underlying failure-time distribution giving the probability of failure by time \(t\)
Figure 2.5 illustrates the relationship between \(\pi_{i}\) and \(F(t)\), where the underlying failure time distribution is assumed to be \(WEIB(1.7,1)\)

```r
par(family = 'serif', font = 2)

# CDF of the WEIB(1.7, 1) distribution
y <- function(t) { pweibull(t, shape = 1.7, scale = 1) }

curve(pweibull(x, shape = 1.7, scale = 1), lwd = 2,
      xlab = 't', ylab = 'F(t)',
      from = 0, to = 3, las = 1)

# Dashed guides from each inspection time up to F(t_i) and across to the right edge
segments(c(0, .5, 1, 1.5, 2, rep(0, 5)),
         c(rep(0, 5), y(c(0.01, .5, 1, 1.5, 2))),
         c(0, .5, 1, 1.5, 2, rep(2.2, 5)),
         rep(y(c(0.01, .5, 1, 1.5, 2)), 2),
         lty = 2, col = 1)

# Label the gaps pi_i = F(t_i) - F(t_{i-1}) along the right edge
text(2.3, (y(0.5) + y(0.01)) / 2, expression(pi[1]), cex = 1.25)
text(2.3, (y(1.0) + y(0.5)) / 2, expression(pi[2]), cex = 1.25)
text(2.3, (y(1.5) + y(1.0)) / 2, expression(pi[3]), cex = 1.25)
text(2.3, (y(2.0) + y(1.5)) / 2, expression(pi[4]), cex = 1.25)
mtext(side = 3,
      expression('Figure 2.5 - Graphical interpretation of the relationship between the ' * pi[i] * ' values and F(' * t[i] * ')'),
      font = 2, line = 2)
```

Figure 2.5 - Graphical interpretation of the relationship between the \(\pi_{i}\) values and \(F(t_{i})\)
\[ S(t_i)=P(T>t_i)=1-F(t_i)=\sum_{j=i+1}^{m+1}\pi_j \]
Where
\(i\) denotes the index of the inspection interval of interest
\(j\) indexes the remaining intervals, \(i+1,...,m+1\)
In words, this equation states that the probability a unit survives beyond \(t_{i}\) equals the sum of the probabilities that it fails in one of the remaining intervals
Motivation for this example
Many aircraft (A/C) components cannot be inspected by base-level maintenance (Mx) due to lack of training, facilities, tools, etc.
Components will only be inspected every \(t\) flight-hours at the depot
For each component, \(\exists\) an underlying \(F(t), t\in \mathbb{R}^{+}\) describing time to failure
The form of \(F(t)\) is often unknown, but may sometimes be inferred from similar components/systems
Goal of this example
Estimate risk of delaying replacement until next inspection (depot cycle)
Balance safety risks against replacement costs
Estimating the risk of delaying replacement until the next inspection implies that
The system was still operating at inspection \(i-1\)
We want to know how likely it is that the system will fail before inspection \(i\)
The text defines this quantity in (2.7) as \(p_{i}\)
\(p_{i}\) is the conditional probability that a unit fails in interval \(i\), given that it was still operating at the start of the interval \[ p_{i}=\Pr(t_{i-1}<T\le t_{i}\,|\,T>t_{i-1})=\frac{\pi_{i}}{S(t_{i-1})} \]
From Equation 2.7 we can define the survival function in terms of \(\mathbf{p}=(p_{1},...,p_{m})\) as \[ \displaystyle S(t_{i})=\prod_{j=1}^{i}[1-p_{j}], i=1,...,m+1. \]
It follows that the CDF is expressed as \[ \begin{aligned} F(t_{i})&=1-\prod_{j=1}^{i}[1-p_{j}], \quad i=1,...,m+1 \\ &=1-\prod_{j=1}^{i}\left[1-\frac{F(t_{j})-F(t_{j-1})}{1-F(t_{j-1})}\right]\\ &=1-\prod_{j=1}^{i}\left[\frac{1-F(t_{j-1})}{1-F(t_{j-1})}-\frac{F(t_{j})-F(t_{j-1})}{1-F(t_{j-1})}\right]\\ &=1-\prod_{j=1}^{i}\frac{1-F(t_{j})}{1-F(t_{j-1})}\\ &=1-\prod_{j=1}^{i}\frac{S(t_{j})}{S(t_{j-1})}\\ &=\sum_{j=1}^{i}\pi_{j} \end{aligned} \] where the product telescopes to \(S(t_{i})/S(t_{0})=S(t_{i})\), since \(S(t_{0})=1\)
How would you state the final expression in this equation in words?
Table 2.1 shows the resulting values for our example in which the underlying failure time distribution is \(WEIB(1.7,1)\)
The following steps show how the values in the table were computed for the first and second inspections
| \(i\) | \(t_{i}\) | \(F(t_{i})\) | \(S(t_{i})\) | \(\pi_{i}\) | \(p_{i}\) | \(1-p_{i}\) |
|---|---|---|---|---|---|---|
| 0 | 0.0 | 0.000 | 1.000 | | | |
| 1 | 0.5 | 0.265 | 0.735 | 0.265 | 0.265 | 0.735 |
| 2 | 1.0 | 0.632 | 0.368 | 0.367 | 0.500 | 0.500 |
| 3 | 1.5 | 0.864 | 0.136 | 0.231 | 0.629 | 0.371 |
| 4 | 2.0 | 0.961 | 0.0388 | 0.0976 | 0.715 | 0.285 |
| 5 | \(\infty\) | 1.000 | 0.0000 | 0.0388 | 1.000 | 0.000 |
| | | | Total | 1.000 | | |
At the first inspection \(t_{1}=0.5\) \[ \begin{aligned} F(0.5)&=1-\exp\left[-\left(\frac{0.5}{1}\right)^{1.7}\right]=\underline{0.2649275}\\ S(0.5)&=1-F(0.5)=\underline{0.7350725}\\ \pi_{1}&=F(t_{1})-F(t_{0})=F(0.5)-F(0)=\underline{0.2649275}\\ p_{1}&=\frac{\pi_{1}}{S(t_{0})}=\frac{F(0.5)-F(0)}{S(0)}=\frac{0.2649275}{1}=\underline{0.2649275} \end{aligned} \]
At the second inspection \(t_{2}=1.0\) \[ \begin{aligned} F(1)&=1-\exp\left[-\left(\frac{1}{1}\right)^{1.7}\right]=\underline{0.6321206}\\ S(1)&=1-F(1)=\underline{0.3678794}\\ \pi_{2}&=F(t_{2})-F(t_{1})=F(1)-F(0.5)=\underline{0.3671931}\\ p_{2}&=\frac{\pi_{2}}{S(t_{1})}=\frac{F(1)-F(0.5)}{S(0.5)}=\frac{0.3671931}{0.7350725}=\underline{0.4995331} \end{aligned} \]
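These hand computations, and the remaining rows of Table 2.1, can be reproduced in a few lines of base R; a minimal sketch (variable names are ours), including a numerical check of the derivation of \(F(t_{i})\) above:

```r
# Reproduce Table 2.1 for the WEIB(1.7, 1) example
t_i  <- c(0, 0.5, 1.0, 1.5, 2.0, Inf)          # inspection times t_0, ..., t_5
F_i  <- pweibull(t_i, shape = 1.7, scale = 1)  # F(t_i)
S_i  <- 1 - F_i                                # S(t_i)
pi_i <- diff(F_i)                              # pi_i = F(t_i) - F(t_{i-1})
p_i  <- pi_i / S_i[-length(S_i)]               # p_i  = pi_i / S(t_{i-1})

round(cbind(t = t_i[-1], F = F_i[-1], S = S_i[-1],
            pi = pi_i, p = p_i, "1-p" = 1 - p_i), 4)

# Check the derivation above: F(t_i) = 1 - prod(1 - p_j) = sum(pi_j)
all.equal(1 - cumprod(1 - p_i), cumsum(pi_i))
```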
Observe the event plots shown in the figure below
The event plot on the left shows the lfp1370 data set in the SMRD package
The event plot on the right shows the turbine data set in the SMRD package
What type of testing do you think would produce each of these event plots?
```r
library(SMRD)

lfp1370.ld <-
  frame.to.ld(SMRD::lfp1370,
              response.column = 1,
              censor.column = 2,
              case.weight.column = 3,
              time.units = 'Hours')

turbine.ld <-
  frame.to.ld(SMRD::turbine,
              response.column = 1,
              censor.column = 2,
              case.weight.column = 3,
              time.units = 'Hundreds of Hours')

par(mfrow = c(1, 2), family = 'serif', font = 2)
event.plot(lfp1370.ld)
event.plot(turbine.ld)
```
Event plots for the lfp1370 (left) and turbine (right) datasets
Type I (Time) Censoring - Occurs when time considerations limit our ability to observe the exact failure times of one or more samples
Samples still operating after the predetermined end of testing are Type-I right-censored
In this type of test, the number of observed failures is a random quantity
There’s no guarantee that any failures will be observed before testing ends
Type II (Failure) Censoring - Occurs when a test is stopped after a predetermined number of failures has been observed, limiting our ability to observe the exact failure times of the remaining samples
Samples still operating after the predetermined number of failures have been observed are Type-II right-censored
In this type of test the total test time is a random quantity
There’s no guarantee a sufficient number of failures will be observed in a reasonable timeframe
Modeling Type II censored data is simpler than for Type I censored data
Failure censored tests (Type II) are far less common than time-censored tests
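A minimal sketch contrasting the two censoring schemes (all values illustrative):

```r
set.seed(7)
life <- rweibull(20, shape = 1.7, scale = 1)  # latent (true) failure times

# Type I: testing stops at a fixed time t_c; the failure count is random
t_c <- 1
type1_time   <- pmin(life, t_c)               # observed times
type1_status <- as.numeric(life <= t_c)       # 1 = failure, 0 = right censored
sum(type1_status)                             # random number of failures

# Type II: testing stops at the k-th failure; the total test time is random
k <- 10
t_stop <- sort(life)[k]                       # random end-of-test time
type2_time   <- pmin(life, t_stop)
type2_status <- as.numeric(life <= t_stop)
```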
The number of censoring events determines whether the data are
Singly censored
Multiply censored
The contribution each censored observation makes to the likelihood function (\(\S\) 2.4) depends on whether it is
Right censored
Left censored
Interval censored
This course considers censoring events that are independent of the time to failure
The time at which testing ends is driven by real-world constraints - not because we want to ensure the test ends before an item fails
Inspections should not be delayed so that a longer lifetime can be reported
Items should not be removed from a test because it appears that they are ‘about to fail’
Competing risks considers dependent censoring events that affect time to failure
Dependent censoring is common in medical survivability research
Dependent censoring is beyond the scope of this course
Suppose we’re investigating the time to recurrence of symptoms in cancer patients who receive a new experimental treatment
Further suppose that the two patients described below are involved in our study
Patient A moves away during the study - we can’t collect any future observations
Patient B is killed in a car crash during the study - we can’t collect any future observations
How should we handle the data we’ve already collected for these patients?
The last observation for Patient A would be independently right censored because there’s a positive probability that the cancer symptoms could return - even though we may not be able to observe when they returned
The data observed for Patient B is an example of dependent censoring because there is zero probability that the cancer symptoms could return
Suppose we want to simulate 50 observations from a specified distribution to use for an example
How would we do it?
We’d first have to choose a distribution family to simulate from (log-normal, Weibull, exponential)
We’d then choose a set of parameter values to indicate the specific member of our chosen distribution family
Finally, we’d use a software application to generate 50 random observations using the distribution/parameter set we chose
So, let’s simulate 50 failure observations, falling mostly in the interval \([50,150]\), using the following distributions
\(LOGNOR(\mu = 4.5, \sigma = 0.2)\): `rlnorm(n = 50, meanlog = 4.5, sdlog = 0.2)`
\(WEIB(\beta = 5.5, \eta = 120)\): `rweibull(n = 50, shape = 5.5, scale = 120)`
\(EXP(\theta = 40)\): `rexp(n = 50, rate = 1/40)`
The parameter values in each of these distributions were chosen to generate observations between \(50\) and \(150\)
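A minimal sketch of the simulation itself, together with an ordered dot plot of the results (the seed is our addition, for reproducibility):

```r
set.seed(123)   # assumed seed, added for reproducibility
n <- 50
lognor_obs <- rlnorm(n, meanlog = 4.5, sdlog = 0.2)
weib_obs   <- rweibull(n, shape = 5.5, scale = 120)
exp_obs    <- rexp(n, rate = 1/40)

# Dot plot of the raw observations, ordered smallest to largest
plot(sort(lognor_obs), rep(3, n), ylim = c(0.5, 3.5),
     xlim = range(lognor_obs, weib_obs, exp_obs),
     yaxt = 'n', pch = 16, xlab = 'Simulated failure time', ylab = '')
points(sort(weib_obs), rep(2, n), pch = 16)
points(sort(exp_obs), rep(1, n), pch = 16)
axis(2, at = 1:3, labels = c('EXP', 'WEIB', 'LOGNOR'), las = 1)
```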
Paste the code in the chunk below in R to bring up the shiny app
The app shows the result of our simulation in two separate plots
The first plot shows the pdfs of each distribution along with the 50 failure times simulated from each distribution
When the observations are plotted this way, it’s easy to see how the distributions differ
The second plot shows the raw observations ordered from smallest to largest
In this plot we can see that the observations simulated from the exponential distribution are skewed toward lower values compared to the Weibull and lognormal distributions
However, it’s much harder to distinguish between the observations from the Weibull and lognormal distributions.
Specifying the distribution family and parameter values gives information about the pattern of the data that will result from the simulation
Sometimes, it’s easy to distinguish different patterns, just by looking at the data visually
More often, graphical and numerical methods are required to distinguish between the patterns of data produced by different distributions
Simulating more observations can sometimes make it easier to distinguish between the patterns
But the opposite should be true as well
The data should also provide information about the distribution family and parameter values that were used to generate them
Again, sometimes it’s easy to pick out what distribution family produced a data set just by looking at the pattern of the data
If more observations are available, it becomes easier to distinguish which distribution family produced the data
In most cases, numerical methods are needed to distinguish which distribution produced a data set
A popular numerical method of extracting information from data to identify which probability distribution is most likely to have produced it is maximum likelihood estimation
Suppose that we observe \(DATA = y_{1},y_{2},...,y_{n}\) from a test event
If these observations were produced by units that can reasonably be considered as identical, then we can say that \[ y_{1},y_{2},...,y_{n} \sim F(y|\boldsymbol{\theta}=\theta_{1},...,\theta_{p}) \]
However, the true form of \(F(y|\boldsymbol{\theta})\) and values of \(\theta_{1},...,\theta_{p}\) are unknown
Likelihood theory states that for each distribution family, we can find \(\mathbf{\theta}\) such that the negative log of the joint probability of the data is minimized, that is \[ \boldsymbol{\hat{\theta^*}} = \underset{\boldsymbol{\theta} \in \boldsymbol \Theta}{\text{arg}\;\min} \bigg(-\log\bigg[\Pr(y_1,\ldots,y_n|\boldsymbol{\theta})\bigg]\bigg) \]
If these observations are mutually independent, this joint probability can be stated as
\[ \begin{aligned} \boldsymbol{\hat{\theta^*}} &= \underset{\boldsymbol{\theta} \in \boldsymbol \Theta}{\text{arg}\;\min} \bigg(-\log\bigg[\prod_{i=1}^{n}\Pr(y_{i}|\boldsymbol{\theta})\bigg]\bigg)\\\\ &= \underset{\boldsymbol{\theta} \in \boldsymbol \Theta}{\text{arg}\;\min} \bigg(-\sum_{i=1}^{n}\log\bigg[\Pr(y_{i}|\boldsymbol{\theta})\bigg]\bigg)\\\\ \end{aligned} \]
The negative log of this joint probability is a loss function in \(\boldsymbol{\theta}\), taking values in the interval \([0,+\infty)\)
The joint probability defined above, \(\Pr(y_1,\ldots,y_n|\boldsymbol{\theta})\), is also known as the Likelihood Function and is denoted by \(\mathscr{L}(\boldsymbol{\theta}|y_1,\ldots,y_n)\), thus for observation \(i\) \[ \mathscr{L}(\boldsymbol{\theta}|y_i)=\Pr(y_i|\boldsymbol{\theta}) \]
Note that
\(\mathscr{L}(\theta|y_i)\) is a function of the parameter vector \(\theta\) for a given observation \(y_i\)
\(\Pr(y_{i}|\theta)\) is a function of the observed values \(y_i\) for a given parameter vector \(\theta\)
For a vector of observations, \(y_1,\ldots,y_n\), the total likelihood is expressed as \[ \mathscr{L}(\boldsymbol{\theta}|y_1,\ldots,y_n)=C\prod_{i=1}^{n} \mathscr{L}(\boldsymbol{\theta}|y_i)=C\prod_{i=1}^{n}\Pr(y_{i}|\boldsymbol{\theta}) \]
Unsurprisingly, the log of the joint probability \(\log\big[\Pr(y_1,\ldots,y_n|\boldsymbol{\theta})\big]\) is known as the log-likelihood function \(\log[\mathscr{L}(\boldsymbol{\theta}|y_1,\ldots,y_n)]\)
Thus, the value of \(\boldsymbol{\theta}\) that minimizes \(-\log\big[\Pr(y_1,\ldots,y_n|\boldsymbol{\theta})\big]\) also maximizes \(\log[\mathscr{L}(\boldsymbol{\theta}|y_1,\ldots,y_n)]\)
Because \(\log[x]\) is a monotone function, the value of \(\boldsymbol{\theta}\) that maximizes \(\log[\mathscr{L}(\boldsymbol{\theta}|y_1,\ldots,y_n)]\) also maximizes \(\mathscr{L}(\boldsymbol{\theta}|y_1,\ldots,y_n)\)
This value of \(\boldsymbol{\theta}\) that maximizes the total likelihood is called the maximum likelihood estimate and is denoted \(\boldsymbol{\hat{\theta}}_{_{MLE}}\)
The constant \(C\) in the total likelihood expression is unimportant and may be ignored in nearly all circumstances
In maximum likelihood estimation, the objective is to maximize the total likelihood based on two inputs:
An assumed functional form of the joint probability
A set of parameter values - \(\boldsymbol{\theta}=\theta_{1},...,\theta_{p}\)
The functional form of the joint probability is typically expressed in terms of the CDF and the PDF of a candidate probability distribution
For each candidate probability distribution we can obtain a \(\boldsymbol{\hat{\theta}}_{_{MLE}}\) that gives a distribution-specific maximum likelihood estimate \(\mathscr{L}(\boldsymbol{\hat{\theta}}_{_{MLE}}|y_1,\ldots,y_n)\)
The candidate probability distribution with the highest maximum likelihood is the best fit distribution for the data.
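As a minimal sketch of this comparison (using MASS::fitdistr( ), one of several R functions that compute MLEs; the simulated data and seed are illustrative):

```r
library(MASS)   # for fitdistr()

set.seed(42)                                   # illustrative data: actually WEIB
obs <- rweibull(50, shape = 5.5, scale = 120)

# Distribution-specific MLEs for two candidate families
weib_fit   <- fitdistr(obs, 'weibull')
lognor_fit <- fitdistr(obs, 'lognormal')

weib_fit$estimate    # theta_hat_MLE for the Weibull family
logLik(weib_fit)     # maximum log-likelihood under the Weibull family
logLik(lognor_fit)   # the larger of the two indicates the better-fitting family
```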
Each observation in the sample set under evaluation contributes to the total likelihood
The degree to which an observation contributes to the total likelihood depends on the underlying failure distribution \(F(t)\) and on the width of the interval in which the observation occurred
Figure 2.6 illustrates the contributions made to the total likelihood for ‘exact’ observations, along with right, left, and interval censored observations
```r
par(family = 'serif', font = 2)
curve(dweibull(x, shape = 1.7, scale = 1),
      lwd = 2, las = 1,
      xlim = c(0, 2.5),
      xlab = 'Time (t)', ylab = 'f(t)',
      main = 'Figure 2.6 - Likelihood contributions for different kinds of censoring')

# Shade the areas under f(t) corresponding to each type of likelihood contribution
polygon(x = c(seq(0, .5, .01), .5),
        y = c(dweibull(seq(0, .5, .01), shape = 1.7, scale = 1), 0),
        col = 1)
polygon(x = c(1, seq(1, 1.5, .01), 1.5),
        y = c(0, dweibull(seq(1, 1.5, .01), shape = 1.7, scale = 1), 0),
        col = 1)
polygon(x = c(2, seq(2, 2.4, .01), 2.4),
        y = c(0, dweibull(seq(2, 2.4, .01), shape = 1.7, scale = 1), 0),
        col = 1)
text(x = c(.16, 1.3, 2.2),
     y = c(.75, .65, .15),
     labels = c('Left Censoring', 'Interval Censoring', 'Right Censoring'))
```

| Censoring type | Range | Likelihood |
|---|---|---|
| \(d_{i}\) observations interval censored between \(t_{i-1}\) and \(t_{i}\) | \(t_{i-1}<T\le t_{i}\) | \([F(t_{i})-F(t_{i-1})]^{d_{i}}\) |
| \(l_{i}\) observations left censored at \(t_{i}\) | \(T\le t_{i}\) | \([F(t_{i})]^{l_{i}}\) |
| \(r_{i}\) observations right censored at \(t_{i}\) | \(T>t_{i}\) | \([1-F(t_{i})]^{r_{i}}\) |
Using the expressions in Table 2.2, the total likelihood for a data set containing exact observations, and arbitrarily censored observations is expressed as \[ \begin{aligned} \mathscr{L}(\theta|t_1,\ldots,t_n) &=C\prod_{i=1}^{n} \mathscr{L}(\theta|t_i)\\ &=C\prod_{i=1}^{m+1}[F(t_{i})]^{l_{i}}[F(t_{i})-F(t_{i-1})]^{d_{i}}[1-F(t_{i})]^{r_{i}} \end{aligned} \]
where \(n = \sum_{j=1}^{m+1}(d_{j}+r_{j}+l_{j})\)
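A minimal sketch of maximizing this total likelihood in R, assuming a Weibull \(F(t)\); the inspection times and counts are illustrative (chosen to roughly match the Table 2.1 proportions with \(n=50\)):

```r
# Log-likelihood (dropping C) for left, interval, and right censored counts
xlogy <- function(x, y) ifelse(x > 0, x * log(y), 0)  # avoids 0 * log(0) = NaN

loglik_censored <- function(logpar, t, d, r, l) {
  shape <- exp(logpar[1])                             # log-parametrized so both
  scale <- exp(logpar[2])                             # parameters stay positive
  Ft   <- pweibull(t, shape, scale)                   # F(t_i)
  Ftm1 <- pweibull(c(0, head(t, -1)), shape, scale)   # F(t_{i-1})
  sum(xlogy(l, Ft) + xlogy(d, Ft - Ftm1) + xlogy(r, 1 - Ft))
}

t <- c(0.5, 1.0, 1.5, 2.0)   # inspection times
d <- c(13, 18, 12, 5)        # interval censored failures (illustrative)
r <- c(0, 0, 0, 2)           # survivors right censored at the last inspection
l <- c(0, 0, 0, 0)           # no left censoring

fit <- optim(c(0, 0), function(p) -loglik_censored(p, t, d, r, l))
exp(fit$par)                 # MLEs of the Weibull shape and scale
```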
In general, the multinomial constant term \[ C=\frac{n!}{d_1!\cdots d_{m+1}!} \] does not affect the value of \(\boldsymbol{\hat{\theta}}_{_{MLE}}\) that maximizes the total likelihood
For most models used in this course \(C\) may therefore be treated as a proportionality constant (i.e. \(C=1\))
This section presents a more general expression for the total likelihood function when
The intervals in which a failure occurs overlap
The union of the intervals does not cover the entire time line \((0,\infty)\)
In this course, the form of \(\mathscr{L}\) given in equation 2.14 is sufficient
This section presents an even more general expression for the total likelihood function when
Censoring occurs at random times within the intervals according to \(f_{c}(t)\)
The distributions represented by \(f_{c}(t)\) and \(f(t)\) are independent
This type of censoring can occur if multiple failure modes are present in the unit under test
Multiple failure modes will be considered in Chapter 15
Truncated data will be discussed in Chapter 11