Rmd5

Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.

library(stats)
library(qcc)

## Warning: package 'qcc' was built under R version 4.3.3

## Package 'qcc' version 2.7

## Type 'citation("qcc")' for citing this R package in publications.

Hazard rate for Weibull distribution. lambda(t) = h(t) = f(t)/(1-F(t)) f(t) is the probability Pr(X=t), F(t) is the cumulative function Pr(X <= x). f(t) is the pdf for lifetime of product. F(t) is Pr(T<=a), cdf. So 1 - F(t) is Pr(T>=a), i.e., Pr(product lasts at least some length of time) Then h(t) is pr(product fails at t given that it’s lasted until now).

Some discussion of bathtub curve and bald eagles.

Then a couple of examples:

lightbulb - probably increasing failure rate Pr(fails|new) < Pr(fails|10 years old). Lightbulbs are subject to aging.

Tim - Pr(Tim dies today|he’s 50) < Pr(Tim dies today|he’s 80) - increasing failure rate

electronic circuit board - we think constant, but did not discuss end of life mechanisms.

For cfr, generally use exponential distribution pdf = lambdaexp(-lambdax) cdf = 1 - exp(-lambda*x) then h(t) = lambda, a constant

Try to remember that h(t) is not a distribution; it’s a rate.

Fitting Weibull pdf to data. Maximum Likelihood Estimation. That means what parameters maximize the probability that I see the data that I have. Let’s look at n different products put in service at time t = 0 and want to know what their liftetimes look like. We will assume non censored data, that is we will let all of the devices die. If some of the products haven’t died, that’s censored data. We can deal with that but it’s more complicated.

We get some lifetime data and assume that it comes from a Weibull distribution. We want to find alpha and beta give data similar to the data we collected. The MLE is equal to the pdf, f(t), just looked at a different in a different light. Often written as

f(t; alpha, beta) = L(alpha, beta; t), probability of seeing liftetime t given we have data from a weibull distribution with parameters alpha and beta and probability that we get a weibull distribution with alpha and beta, given we see lifetime t.

Have one function with two variables, alpha and beta. Take partial derivatives wrt alpha and beta set both to zero and solve for alpha and beta simultaneously.

We’ll use R to do this, using the package fitdistrplus. Lots of distributions and treatments, Bayes if you know something about the distribution.

rweibull generates a random weibull distribution.

dat<-rweibull(1000,shape=2,scale=100)
hist(dat)

fit

library(fitdistrplus)

## Warning: package 'fitdistrplus' was built under R version 4.3.3

## Loading required package: MASS

## Loading required package: survival

library(MASS)
#install some tools, check out fitdistrplus

Ok, now have a fit, how good is it? Could look at a different distribution.

fit.weibull<-fitdist(dat,"weibull")
summary(fit.weibull)

## Fitting of the distribution ' weibull ' by maximum likelihood 
## Parameters : 
##        estimate Std. Error
## shape  2.060998 0.05186261
## scale 99.744755 1.60721794
## Loglikelihood:  -5186.819   AIC:  10377.64   BIC:  10387.45 
## Correlation matrix:
##          shape    scale
## shape 1.000000 0.305511
## scale 0.305511 1.000000

fit.weibull$estimate

##     shape     scale 
##  2.060998 99.744755

fit.normal<-fitdist(dat,"norm")
summary(fit.normal)

## Fitting of the distribution ' norm ' by maximum likelihood 
## Parameters : 
##      estimate Std. Error
## mean 88.55326  1.4114842
## sd   44.63504  0.9980698
## Loglikelihood:  -5217.458   AIC:  10438.92   BIC:  10448.73 
## Correlation matrix:
##      mean sd
## mean    1  0
## sd      0  1

fit.normal$estimate

##     mean       sd 
## 88.55326 44.63504

So, H0 is distribution fits, H1 is distribution does not fit. We never prove or disprove H0. All of the goodness of fit methods won’t tell you which distribution is the best.

denscomp(fit.weibull)

cdfcomp(fit.weibull)

qqcomp(fit.weibull)

We’d like to see all the data fall on a line. The Q-Q plots are very useful.

quantile(fit.weibull)

## Estimated quantiles for each specified probability (non-censored data)
##             p=0.1    p=0.2    p=0.3    p=0.4    p=0.5    p=0.6    p=0.7
## estimate 33.47277 48.17503 60.48559 72.00182 83.49458 95.60235 109.1454
##             p=0.8    p=0.9
## estimate 125.6518 149.4988

percentiles<-c(0,.05,.5,.9,.95,.99)
quantile(fit.weibull,percentiles)

## Estimated quantiles for each specified probability (non-censored data)
##          p=0   p=0.05    p=0.5    p=0.9   p=0.95   p=0.99
## estimate   0 23.60528 83.49458 149.4988 169.8596 209.2657

quantile(dat,percentiles)

##         0%         5%        50%        90%        95%        99% 
##   1.327591  22.070667  85.662217 146.438870 162.328497 201.110281

95% of my product fails at 175 hours, from data. Fit agrees. Q-Q plot supports this conclusion.

Now, bootstrapping….let’s see if we can generate confidence intervals. Sample repeatedly from data set and build CI from that.

boot.weibull<-bootdist(fit.weibull,niter=1001)
summary(boot.weibull)

## Parametric bootstrap medians and 95% percentile CI 
##          Median     2.5%      97.5%
## shape  2.063374  1.96521   2.175027
## scale 99.746030 96.45851 103.103535

quantile(boot.weibull)

## (original) estimated quantiles for each specified probability (non-censored data)
##             p=0.1    p=0.2    p=0.3    p=0.4    p=0.5    p=0.6    p=0.7
## estimate 33.47277 48.17503 60.48559 72.00182 83.49458 95.60235 109.1454
##             p=0.8    p=0.9
## estimate 125.6518 149.4988
## Median of bootstrap estimates
##             p=0.1   p=0.2    p=0.3    p=0.4    p=0.5    p=0.6    p=0.7    p=0.8
## estimate 33.43158 48.1198 60.41954 71.92465 83.43615 95.58687 109.1397 125.5878
##             p=0.9
## estimate 149.3349
## 
## two-sided 95 % CI of each quantile
##           p=0.1    p=0.2    p=0.3    p=0.4    p=0.5    p=0.6    p=0.7    p=0.8
## 2.5 %  31.18808 45.54591 57.60203 69.05235 80.45990 92.35366 105.7888 121.7737
## 97.5 % 36.02529 51.05747 63.58483 75.16390 86.73484 98.96421 112.5693 129.4050
##           p=0.9
## 2.5 %  144.6266
## 97.5 % 154.0449

plot(boot.weibull)

Rmd5

Richard Gale

2024-03-19

R Markdown

Including Plots