Statistical Methods for Reliability Data

Chapter 6 - Probability Plotting

W. Q. Meeker, L. A. Escobar, and J. K. Freels

23 July 2018

OVERVIEW

This chapter explains...

6.1 - INTRODUCTION

Applications Of Probability Plots


6.1.2 - Probability Plotting Scales: Linearizing A CDF

CDF's from different distributions can look similar on plots

teachingApps::teachingApp('probability_plotting')

6.2 - LINEARIZING LOCATION-SCALE BASED DISTRIBUTIONS

Basic Idea Behind The Process

  1. Identify the event times and sort them

    • If we're trying to model data - then we should already have data

    • Data should be sorted smallest to largest

    • In R, we can sort data using sort( )

  2. Invert and rearrange \(F(t_{p})\) to obtain a linear function of the form

\[ \begin{aligned} y&=m(x)\;+b\\\\ F^{-1}(p)&=m(t_{p})+b \end{aligned} \]

  1. Figure out some a way to come up with values for \(p\) to use in the plot

    • The values are called plotting positions

    • We'll discuss plotting positions at length in a minute

    • For now, we'll use what is known as the Hazen plotting position which is expressed as

\[\hat{p}_i=\frac{i-0.5}{n}\]

Fake Data Vs. Real Data

Fake Data Real Data

6.2.1 - Linearizing The Exponential CDF

Steps to Create An exponential Probability Plot

  1. Sort the observations from smallest to largest \(t_1, t_2,...,t_n\)

  2. Linearize the exponetial CDF

    • For \(T\sim EXP(\theta,\gamma)\) the CDF is \[F(t_{p})=p=1-\exp\left(-\frac{t_{p}-\gamma}{\theta}\right),\quad t_{p}\in [0,\infty)\]

    • Inverting and rearranging the exponential CDF gives the following linear equation \[ \begin{aligned} p&=1-\exp\left(-\frac{t_{p}-\gamma}{\theta}\right)\\ 1-p&=\exp\left(-\frac{t_{p}-\gamma}{\theta}\right)\\ \ln\left[\frac{1}{1-p}\right]&=\frac{t_{p}}{\theta}- \frac{\gamma}{\theta} \end{aligned} \]

  3. Plot the points \(\left(x = t_i,y = \log\left[\frac{1}{1-p_i}\right]\right),\;i=1,...,n\)

  4. If the points roughly fall on a straight line the data may follow an exponential distribution

    • \(\theta \equiv\;\) is the slope of this line, i.e. \(\displaystyle m = \frac{1}{\theta}\)

    • \(\gamma \equiv\;\) the value of the \(y\) intercept, i.e. \(\displaystyle b=\frac{\gamma}{\theta}\)

Figure 6.1 - Exponential Probability Plot

par(family='serif',font = 2)

p <- seq(.01,.99,.01)
gamma1 <- 0
theta1 <- 50

gamma2 <- 0
theta2 <-200

plot(qexp(p,rate=1/theta1) - gamma1,-log(1-p),
     type='l', las = 1,
     lwd=2, col=1,
     xlim=range(0,800),
     xlab='Time',
     ylab='Standard Exponential Quantile')

lines(qexp(p,rate=1/theta2) - gamma2, -log(1-p),
      lwd=2, col=2)

text(c(100,650),c(4,2.5), 
     c(expression(theta*', '*gamma*' = 50, 0'),
       expression(theta*', '*gamma*' = 200, 0')))
box(lwd=1.25)

6.2.2 - Linearizing The Normal CDF

Steps To Create a Normal Probability Plot

  1. Sort the observations from smallest to largest \(t_1, t_2,...,t_n\)

  2. Linearize the normal CDF

    • For \(T\sim NOR(\mu,\sigma)\) the CDF is \[F(t_{p})=p=\Phi_{NOR}\left[\frac{t_{p}-\mu}{\sigma}\right] \;\;t_{p}\in [-\infty,\infty)\]

    • Inverting this CDF gives the normal quantile function \[t_p=F^{-1}(p)=\Phi^{-1}_{NOR}(p)\sigma +\mu\]

    • Rearranging this expression as a linear equation gives \[\Phi_{NOR}^{-1}(p)=\frac{t_{p}}{\sigma}-\frac{\mu}{\sigma}\]

  3. Plot the points \(\left(x = t_i,y = \Phi_{NOR}^{-1}(p_i)\right),\;i=1,...,n\)

  4. If the points roughly fall on a straight line the data may follow a normal distribution where

    • \(\sigma \equiv\;\) the slope of this line, i.e. \(\displaystyle m=\left(\frac{1}{\sigma}\right)\)

    • \(\mu \equiv\;\) the \(x\)-value where the line crosses \(\displaystyle \Phi_{NOR}^{-1}(p) = 0\)

Figure 6.2 - Normal Probability Plot

par(family='serif', font=2)

p <- seq(.01,.99,.01)
mean1 <- 40
sd1 <- 10
mean2 <- 80
sd2 <- 5

plot(qnorm(p, mean = mean1, sd = sd1), qnorm(p),
     type='l',
     lwd=2, col=1,
     xlim=range(0,100),
     las = 1,
     xlab='Time',
     ylab='Standard Normal Quantile')

lines(qnorm(p,mean = mean2, sd = sd2), qnorm(p), lwd=2, col=2)
abline(h=0, lty=2)

text(c(15, 85), c(-1,-2), 
     c(expression(mu*', '*sigma*' = 40, 10'),
       expression(mu*', '*sigma*' = 80, 5')))
box(lwd=1.25)

6.2.3 - Linearizing The Lognormal CDF

Steps To Create A Lognormal Probability Plot

  1. Sort the observations from smallest to largest \(t_1, t_2,...,t_n\)

  2. Linearize the lognormal CDF

    • For \(T\sim LOGNOR(\mu,\sigma)\) the CDF is \[F(t_{p})=p=\Phi_{NOR}\left[\frac{\log(t_{p})-\mu}{\sigma}\right] \;\;t_{p}\in [0,\infty)\]

    • Inverting this CDF gives the lognormal quantile function \[\log[t_p]=F^{-1}(p)=\Phi^{-1}_{NOR}\sigma+\mu\]

    • Rearranging this expression as a linear equation gives \[\Phi_{NOR}^{-1}(p)=\frac{\log[t_{p}]}{\sigma}-\frac{\mu}{\sigma}\]

  3. Plot the points \(\left(x = \log[t_i], y =\Phi_{NOR}^{-1}(p_i)\right),\;i=1,...,n\)

  4. If the points roughly fall on a straight line, the data may follow an lognormal distribution

    • \(\log[\sigma] \equiv\;\) the slope of this line i.e. \(\displaystyle m = \left(\frac{1}{\sigma}\right)\)

    • \(\log[\mu] \equiv\;\) the \(x\)-value where the line crosses \(\displaystyle \Phi_{NOR}^{-1}(p) = 0\)

Figure 6.3 - Lognormal Probability Plot

par(family='serif',font=2)
p <- seq(.01,.99,.01)
meanlog1 <- log(50)
sdlog1 <- 2
meanlog2 <- log(500)
sdlog2 <- 2
meanlog3 <- log(500)
sdlog3 <- 1

plot(qlnorm(p,meanlog = meanlog1, 
              sdlog = sdlog1),
     qnorm(p),
     type='l',
     lwd=2,
     xlim=c(1,1000),
     las = 1,
     xlab='Time',
     ylab='Standard Normal Quantile',
     log='x')

lines(qlnorm(p,meanlog = meanlog2, sdlog = sdlog2), 
      qnorm(p), lwd=2, col=2)

lines(qlnorm(p,meanlog = meanlog3, sdlog = sdlog3), 
      qnorm(p), lwd=2, col=3)
abline(h=0, lty=2)

box(lwd=1.25)

legend('topleft',
       c(expression('exp('*mu*'),'*sigma*' = 50,2'),
         expression('exp('*mu*'),'*sigma*' = 500,2'),
         expression('exp('*mu*'),'*sigma*' = 500,1')),
       lty = 1,col = c(1:3),lwd = 2, bty = 'n')

6.2.2 - Linearizing The Smallest Extreme Value CDF

Steps To Create An SEV Probability Plot

  1. Sort the observations from smallest to largest \(t_1, t_2,...,t_n\)

  2. Linearize the smallest extreme value CDF

    • For \(T\sim SEV(\mu,\sigma)\) the CDF is \[F(t_{p})=p=\Phi_{SEV}\left[\frac{t_{p}-\mu}{\sigma}\right] \;\;t_{p}\in [-\infty,\infty)\]

    • Inverting this CDF gives the SEV quantile function \[t_p=F^{-1}(p)=\Phi^{-1}_{SEV}(p)\sigma +\mu\]

    • Rearranging this expression as a linear equation gives \[\Phi_{SEV}^{-1}(p)=\frac{t_{p}}{\sigma}-\frac{\mu}{\sigma}\]

  3. Plot the points \(\left(x = t_i,y = \Phi_{SEV}^{-1}(p_i)\right),\;i=1,...,n\)

  4. If the points roughly fall on a straight line the data may follow an SEV distribution

    • \(\sigma \equiv\;\) the slope of this line i.e. \(\displaystyle m =\left(\frac{1}{\sigma}\right)\)

    • \(\mu \equiv\;\) the \(x\)-value where the line crosses \(\displaystyle \Phi_{SEV}^{-1}(p) = 0\)

Figure 6.4 - Smallest Extreme Value Probability Plot

par(family='serif', font=2)

library(SMRD)

p<- seq(.01,.99,.01)
mean1 <- 40
sdev1 <- 10
mean2 <- 80
sdev2 <- 5

plot(qsev(p, location=mean1, scale=sdev1), qsev(p),
     type='l',
     lwd=2, col=1,
     xlim=range(0,100),
     las = 1,
     xlab='Time',
     ylab='Standard SEV Quantile')

lines(qsev(p,location=mean2, scale=sdev2), qsev(p), lwd=2, col=2)
abline(h=0, lty=2)

text(c(15, 85), c(-1,-2), 
     c(expression(mu*', '*sigma*' = 40, 10'),
       expression(mu*', '*sigma*' = 80, 5')))
box(lwd=1.25)

6.2.4 - Linearizing The Weibull CDF

Steps To Create A Weibull\((\beta,\eta)\) Probability Plot

  1. Sort the observations from smallest to largest \(t_1, t_2,...,t_n\)

  2. Linearize the Weibull\((\beta,\eta)\) CDF

    • For \(T\sim WEIB(\beta,\eta)\) the CDF is \[F(t_{p})=p=1-\exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right]\]

    • Inverting this CDF gives the Weibull quantile function \[\log[t_p]=F^{-1}(p)=\Phi^{-1}_{SEV}(p)\sigma +\mu\]

    • Rearranging this expression as a linear equation gives \[\log\left[\log\left[\frac{1}{1-p}\right]\right]=\frac{\log\left[t_{p}\right]}{\beta}-\frac{\log[\eta]}{\beta}\]

  3. Plot the points \(\left(x = \log[t_i],y = \log\left[\log\left[\frac{1}{1-p_i}\right]\right]\right),\;i=1,...,n\)

  4. If the points roughly fall on a straight line the data may follow an Weibull distribution

    • \(\beta \equiv\;\) the slope of the line \(\displaystyle m = \left(\frac{1}{\beta}\right)\)

    • \(\theta \equiv\;\) the \(x\)-value where the line crosses \(\displaystyle \Phi_{SEV}^{-1}(p) = 0\)

Steps To Create A Weibull\((\mu,\sigma)\) Probability Plot

  1. Sort the observations from smallest to largest \(t_1, t_2,...,t_n\)

  2. Linearize the Weibull\((\mu,\sigma)\) CDF

    • For \(T\sim WEIB(\mu,\sigma)\) the CDF is \[F(t_{p})=p=\Phi_{SEV}\left(\frac{\log[t_p]-\mu}{\sigma}\right)\]

    • Inverting this CDF gives the Weibull quantile function \[\log[t_p]=F^{-1}(p)=\Phi^{-1}_{SEV}(p)\sigma +\mu\]

    • Rearranging this expression as a linear equation gives \[\Phi_{SEV}^{-1}(p)=\frac{\log[t_{p}]}{\sigma}-\frac{\mu}{\sigma}\]

  3. Plot the points \(\left(x = \log[t_i],y = \Phi_{SEV}^{-1}(p_i)\right),\; i = 1,...,n\)

  4. If the points roughly fall on a straight line the data may follow an Weibull distribution

    • \(\sigma \equiv\;\) the slope of this line, i.e. \(\displaystyle m = \left(\frac{1}{\sigma}\right)\)

    • \(\mu \equiv\;\) the \(x\)-value where the line crosses \(\displaystyle \Phi_{SEV}^{-1}(p) = 0\)

Figure 6.5 - Weibull Probability Plot

par(family='serif',font=2)

library(SMRD)

p<- seq(.01,.99,.01)

plot(qweibull(p,.5,50), qsev(p),
     type='l',
     lwd = 2,
     col = 1,
     xlim = c(1,1000),
     las = 1,
     xlab='Time',
     ylab='Standard SEV Quantile',
     log='x')

lines(qweibull(p,0.5,500), 
      qsev(p), lwd = 2, col = 2)
lines(qweibull(p,1.0,500), 
      qsev(p), lwd = 2, col = 3)
abline(h=0, lty=2)

box(lwd=1.25)
legend(x = 400,
       y = -4.6,
       c(expression(eta*','*beta == ~~50*','*0.5),
         expression(eta*','*beta == 500*','*0.5),
         expression(eta*','*beta == 500*','*1.0)),
       lty = 1,
       col = c(1:3), 
       lwd = 2, 
       bty = 'n',
       y.intersp = 0.75,
       yjust = 0.2,
       xjust = 0.5,
       cex = 0.9,
       seg.len = 1)

6.3 - GRAPHICAL GOODNESS OF FIT

Let's Review

6.4.1 - Criteria For Choosing Plotting Positions

Probability plots require a way to compute plotting positions \(F(t_{i}), i=1,...,n\)

Probability Plotting Formula

\[ p_{i}=\frac{i-a}{n+1-2a} \]

Probability Plot Formula Table

Proponent Year \(a\) Formula
Hazen 1914 0.50 \(\displaystyle\frac{i-0.5}{n}\)
Weibull 1939 0 \(\displaystyle\frac{i}{n+1}\)
Blom 1958 0.375 \(\displaystyle\frac{i-0.375}{n+0.25}\)
Gringorten 1963 0.44 \(\displaystyle\frac{i-0.44}{n+0.12}\)
Chegodayev 2000 0.30 \(\displaystyle\frac{i-0.30}{n+0.4}\)
Cunnane 1977 0.40 \(\displaystyle\frac{i-0.40}{n+0.2}\)
Median 1943 0.3175 \(\displaystyle\frac{i-0.3175}{n+0.365}\)

\[ \begin{aligned} \text{Hazen}\; &p_{i}=\frac{i-0.5}{n}\\\\ \text{Chegodayev}\; &p_{i}=\frac{i-0.3}{n+0.4}\;\text{also known as "Median-Ranks"}\\\\ \text{Weibull}\; &p_{i}=\frac{i}{n+1}\\\\ \end{aligned} \]

\[ p_{i}=\begin{cases}\frac{i}{n}, &\mbox{Distribution free - (Eq 3.1)}\\\\ \frac{i-0.5}{n}, &\mbox{Assumed distribution}\end{cases} \]

6.4.2 - Choice of Plotting Positions

Censoring considerations for selecting a plotting position formula

6.4.2 - Choice of Plotting Positions

Case 1 - Data with continuous inspection and a single censoring event

\[ \hat{p}_{i}=\frac{i}{n} \]

\[ \hat{p}_{i}=\frac{i-0.5}{n}=\frac{1}{2}\left[\hat{F}(t_{(i)})+\hat{F}(t_{(i-1)})\right] \]

SMRD::smrd_app('at7987_data')

6.4.2 - Choice of Plotting Positions

Case 2 - Data with continuous inspection and multiple right censoring events

\[ \hat{p_{i}}=\frac{1}{2}\left[\hat{F}(t_{(i)})+\hat{F}(t_{(i-1)})\right] \]

Example 6.3 - ShockAbsorber

\[ \begin{aligned} \hat{F}(t_{i})&=1-\prod_{j=1}^i \left[1-\hat{p}_{j}\right]\\\\ &=1-\prod_{j=1}^i \left[1-\frac{d_j}{n_j}\right] \end{aligned} \]

\[ \begin{aligned} p_i=\frac{1}{2}\left[1-\prod_{j=1}^i \left[1-\frac{d_j}{n_j}\right]+1-\prod_{j=1}^{i-1} \left[1-\frac{d_j}{n_j}\right]\right] \end{aligned} \]

SMRD::smrd_app('shockabsorber_data6')

6.4.2 - Choice of Plotting Positions

Case 3 - Data with Interval censoring

\[ p_{i}=\hat{F}(t_{(i)}) \]

Example 6.4 - Heat Exchanger

SMRD::smrd_app('heatexchanger_data6')

6.4.2 - Choice of Plotting Positions

Case 4 - Arbitrary censoring

Example 6.5 - Turbine Wheel

SMRD::smrd_app('turbine_data6')

6.6.2 - Reasons For Bends In Probability Plots

What if your probability plot looks like this?

par(family = 'serif', font = 2)

Bleed.ld <- 
frame.to.ld(bleed,
            response.column = 1, 
            censor.column = 2, 
            case.weight.column = 3,
            x.columns = c(4),
            time.units = 'Hours')

plot(Bleed.ld,
     distribution = NULL)

Example 6.9 - Bleed System Failure

SMRD::smrd_app('bleed_data6')