This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
summary(cars)
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
You can also embed plots, for example:
Note that the echo = FALSE parameter was added to the
code chunk to prevent printing of the R code that generated the
plot.
library(stats)
library(qcc)
## Warning: package 'qcc' was built under R version 4.3.3
## Package 'qcc' version 2.7
## Type 'citation("qcc")' for citing this R package in publications.
Hazard rate for Weibull distribution. lambda(t) = h(t) = f(t)/(1-F(t)) f(t) is the probability Pr(X=t), F(t) is the cumulative function Pr(X <= x). f(t) is the pdf for lifetime of product. F(t) is Pr(T<=a), cdf. So 1 - F(t) is Pr(T>=a), i.e., Pr(product lasts at least some length of time) Then h(t) is pr(product fails at t given that it’s lasted until now).
Some discussion of bathtub curve and bald eagles.
Then a couple of examples:
lightbulb - probably increasing failure rate Pr(fails|new) < Pr(fails|10 years old). Lightbulbs are subject to aging.
Tim - Pr(Tim dies today|he’s 50) < Pr(Tim dies today|he’s 80) - increasing failure rate
electronic circuit board - we think constant, but did not discuss end of life mechanisms.
For cfr, generally use exponential distribution pdf = lambdaexp(-lambdax) cdf = 1 - exp(-lambda*x) then h(t) = lambda, a constant
Try to remember that h(t) is not a distribution; it’s a rate.
Fitting Weibull pdf to data. Maximum Likelihood Estimation. That means what parameters maximize the probability that I see the data that I have. Let’s look at n different products put in service at time t = 0 and want to know what their liftetimes look like. We will assume non censored data, that is we will let all of the devices die. If some of the products haven’t died, that’s censored data. We can deal with that but it’s more complicated.
We get some lifetime data and assume that it comes from a Weibull distribution. We want to find alpha and beta give data similar to the data we collected. The MLE is equal to the pdf, f(t), just looked at a different in a different light. Often written as
f(t; alpha, beta) = L(alpha, beta; t), probability of seeing liftetime t given we have data from a weibull distribution with parameters alpha and beta and probability that we get a weibull distribution with alpha and beta, given we see lifetime t.
Have one function with two variables, alpha and beta. Take partial derivatives wrt alpha and beta set both to zero and solve for alpha and beta simultaneously.
We’ll use R to do this, using the package fitdistrplus. Lots of distributions and treatments, Bayes if you know something about the distribution.
rweibull generates a random weibull distribution.
dat<-rweibull(1000,shape=2,scale=100)
hist(dat)
fit
library(fitdistrplus)
## Warning: package 'fitdistrplus' was built under R version 4.3.3
## Loading required package: MASS
## Loading required package: survival
library(MASS)
#install some tools, check out fitdistrplus
Ok, now have a fit, how good is it? Could look at a different distribution.
fit.weibull<-fitdist(dat,"weibull")
summary(fit.weibull)
## Fitting of the distribution ' weibull ' by maximum likelihood
## Parameters :
## estimate Std. Error
## shape 2.060998 0.05186261
## scale 99.744755 1.60721794
## Loglikelihood: -5186.819 AIC: 10377.64 BIC: 10387.45
## Correlation matrix:
## shape scale
## shape 1.000000 0.305511
## scale 0.305511 1.000000
fit.weibull$estimate
## shape scale
## 2.060998 99.744755
fit.normal<-fitdist(dat,"norm")
summary(fit.normal)
## Fitting of the distribution ' norm ' by maximum likelihood
## Parameters :
## estimate Std. Error
## mean 88.55326 1.4114842
## sd 44.63504 0.9980698
## Loglikelihood: -5217.458 AIC: 10438.92 BIC: 10448.73
## Correlation matrix:
## mean sd
## mean 1 0
## sd 0 1
fit.normal$estimate
## mean sd
## 88.55326 44.63504
So, H0 is distribution fits, H1 is distribution does not fit. We never prove or disprove H0. All of the goodness of fit methods won’t tell you which distribution is the best.
denscomp(fit.weibull)
cdfcomp(fit.weibull)
qqcomp(fit.weibull)
We’d like to see all the data fall on a line. The Q-Q plots are very
useful.
quantile(fit.weibull)
## Estimated quantiles for each specified probability (non-censored data)
## p=0.1 p=0.2 p=0.3 p=0.4 p=0.5 p=0.6 p=0.7
## estimate 33.47277 48.17503 60.48559 72.00182 83.49458 95.60235 109.1454
## p=0.8 p=0.9
## estimate 125.6518 149.4988
percentiles<-c(0,.05,.5,.9,.95,.99)
quantile(fit.weibull,percentiles)
## Estimated quantiles for each specified probability (non-censored data)
## p=0 p=0.05 p=0.5 p=0.9 p=0.95 p=0.99
## estimate 0 23.60528 83.49458 149.4988 169.8596 209.2657
quantile(dat,percentiles)
## 0% 5% 50% 90% 95% 99%
## 1.327591 22.070667 85.662217 146.438870 162.328497 201.110281
95% of my product fails at 175 hours, from data. Fit agrees. Q-Q plot supports this conclusion.
Now, bootstrapping….let’s see if we can generate confidence intervals. Sample repeatedly from data set and build CI from that.
boot.weibull<-bootdist(fit.weibull,niter=1001)
summary(boot.weibull)
## Parametric bootstrap medians and 95% percentile CI
## Median 2.5% 97.5%
## shape 2.063374 1.96521 2.175027
## scale 99.746030 96.45851 103.103535
quantile(boot.weibull)
## (original) estimated quantiles for each specified probability (non-censored data)
## p=0.1 p=0.2 p=0.3 p=0.4 p=0.5 p=0.6 p=0.7
## estimate 33.47277 48.17503 60.48559 72.00182 83.49458 95.60235 109.1454
## p=0.8 p=0.9
## estimate 125.6518 149.4988
## Median of bootstrap estimates
## p=0.1 p=0.2 p=0.3 p=0.4 p=0.5 p=0.6 p=0.7 p=0.8
## estimate 33.43158 48.1198 60.41954 71.92465 83.43615 95.58687 109.1397 125.5878
## p=0.9
## estimate 149.3349
##
## two-sided 95 % CI of each quantile
## p=0.1 p=0.2 p=0.3 p=0.4 p=0.5 p=0.6 p=0.7 p=0.8
## 2.5 % 31.18808 45.54591 57.60203 69.05235 80.45990 92.35366 105.7888 121.7737
## 97.5 % 36.02529 51.05747 63.58483 75.16390 86.73484 98.96421 112.5693 129.4050
## p=0.9
## 2.5 % 144.6266
## 97.5 % 154.0449
plot(boot.weibull)