Jason Freels
28 September 2016
Review some important ideas from elementary statistics
Detail the standard functions to work with several common distributions
Introduce several common functions for performing statistical analyses
mean( ), sd( ), var( ), median( )
hist( )
ecdf( )
density( )
lm( )
Introduce two common functions for performing numerical optimizations
optim( )
nlminb( )
A random variable assigns a numerical value to the outcome of a random process, for example:
Number of foreign particles in a 20 oz. glass of tap water
Sum of two dice rolls
Number of licks it takes to get to the center of a tootsie roll pop
Some outcomes of a random variable are more likely to be observed than others, for example:
A glass with billions of foreign particles may be odd in Ohio but common in Africa
A two-roll sum of \(2\) would be more rare than a two-roll sum of \(7\)
Three licks may be possible, but it seems really low
In statistical terminology, any possible value of a random variable is called either an outcome or a realization
The set of all possible outcomes for a random variable is called the sample space and is usually denoted by \(\Omega\)
If the number of outcomes in \(\Omega\) is finite or countably infinite, the random variable is said to be discrete
The number of cars that pass an intersection in some time interval
The result of rolling a die
The number of foreign particles in a volume of water
If the number of outcomes in \(\Omega\) is uncountably infinite, the random variable is said to be continuous
The time at which a device fails
The volume of milk dispensed into a 'gallon' jug by a dairy
The amount of solar energy incident on a photovoltaic cell in an hour
Random variables are generally denoted by capital letters, e.g. \(X\)
Observed outcomes are denoted by lower case letters
Thus, \(P(X = x) = p\) means: The probability that the observed value of a random variable \(X\) is \(x\) is equal to \(p\)
Assume we roll a die two-times - what outcomes might we observe?
\(\Omega = \{(1,1), (1,2),\cdots, (6,5), (6,6)\}\)
36 possible outcomes
The probability of observing any outcome is the same - 1/36
Now, assume that we sum the observed values - what outcomes might we observe?
\(\Omega=\{2,3,\ldots,12\}\), that is, the set of integers from 2 to 12
11 possible outcomes
The probability of observing any outcome is NOT the same
Only one combination of two rolls gives a sum of \(12\), but several combinations give a sum of \(6\)
Therefore, the outcome \(12\) is less likely to be observed than the outcome \(6\)
The Shiny app below illustrates the outcomes for various random variables related to rolling a die one or more times
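A minimal base-R sketch gives a similar picture by approximating the distribution of the two-roll sum (the number of simulated rolls is an arbitrary choice):

set.seed(42)                                       # for reproducibility
rolls  <- matrix(sample(1:6, 2 * 10000, replace = TRUE), ncol = 2)
totals <- rowSums(rolls)                           # sum of the two rolls
table(totals) / length(totals)                     # estimated Pr(X = x) for x = 2,...,12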
For the dice-roll example, \(\Omega\) contains a countable or discrete set of possible outcomes
Therefore, any random variable we define for this example will be a discrete random variable
What random variables can we define?
What's the probability that the sum of two rolls will be \(\le10\)?
If we re-run the example \(10\) times what's the probability that the total will be \(>7\) four or more times?
What's the probability that we must repeat the example \(10\) times to get a total that's \(>10\)?
Each of the above questions corresponds to a distinct random variable
Each random variable has its own distinct \(\Omega\)
Making statements about each random variable requires us to collect distinct data
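Each of these questions can also be answered exactly with base R. The sketch below is an added illustration (not from the original slides): it enumerates the 36 equally likely two-roll outcomes with outer( ) and then uses base R's pbinom( ) and dgeom( ).

sums   <- outer(1:6, 1:6, `+`)             # all 36 equally likely two-roll sums
p_le10 <- mean(sums <= 10)                 # Pr(sum of two rolls <= 10)
p_gt7  <- mean(sums > 7)                   # Pr(total > 7) on a single experiment
p_gt10 <- mean(sums > 10)                  # Pr(total > 10) on a single experiment

p_le10                                     # question 1
1 - pbinom(3, size = 10, prob = p_gt7)     # question 2: 4 or more of 10 totals exceed 7
dgeom(9, prob = p_gt10)                    # question 3: first total > 10 occurs on repetition 10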
For each random variable \(X\) there exists a probability function \(f(x)\)
For discrete random variables we call \(f(x)\) the probability mass function
For continuous random variables we call \(f(x)\) the probability density function
Probability theory places three restrictions on \(f(x)\)
\[ \begin{aligned} &\text{For}\;x_{_{1}} \in \Omega,& \;f(x_{_{1}})\ge 0\\\\ &\text{For}\;x_{_{1}}, x_{_{2}} \in \Omega,& \;f(x_{_{1}})+f(x_{_{2}})\ge f(x_{_{1}})\\\\ &\text{For}\;\Omega=\{x_{_{1}},...,x_{_{n}}\},& \;\sum_{i=1}^{n} f(x_{_{i}})=1\\ \end{aligned} \]
\[f(x)=\begin{cases} Pr(X=x)&x\in \Omega\\\\0 &x \not \in \Omega\end{cases}\]
\[ \begin{aligned} Pr(X \in \mathcal{A})&=\sum_{x\in \mathcal{A}} f(x), \;\;\mathcal{A} \subseteq \Omega\\\\ \sum_{x\in \Omega} f(x)&=1 \end{aligned} \]
\[F(x)=\begin{cases} Pr(X\le x)&x\in \Omega\\\\0 &x \not \in \Omega\end{cases}\]
\[ \begin{aligned} F(x)&\ge 0, \;\;\forall x \in \Omega\\\\ F(x)&=1, \;\;x=\sup(\Omega) \end{aligned} \]
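As a concrete check of these restrictions, using the two-dice-sum pmf from the earlier example, the probabilities are non-negative, they sum to one, and the CDF reaches one at the largest outcome:

x   <- 2:12
f_x <- c(1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1) / 36   # pmf of the sum of two dice
all(f_x >= 0)                                    # f(x) >= 0 for every x in Omega
sum(f_x)                                         # the probabilities sum to 1
F_x <- cumsum(f_x)                               # F(x) = Pr(X <= x)
F_x[length(F_x)]                                 # equals 1 at the largest outcome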
\[f(x|p,n) = \left( \begin{array}{c} n \\ x \end{array} \right) (p)^{x}(1 - p)^{(n-x)}\]
\[f(x|p) = p (1 - p)^{x-1}\]
\[f(x|r,m,n) = \frac{\binom{r}{x} \binom{m - r}{n - x}}{\binom{m}{n}}\]
\[f(x|\lambda) = \frac{e^{-\lambda}\lambda^{x}}{x!}\]
\[f(x|\mu,\sigma) = \frac{\exp\left[-(x-\mu)^{2}/(2\sigma^{2})\right]}{\sigma\sqrt{2\pi}}\]
\[f(x|\theta) = \frac{1} {\theta} \exp{\left[-\frac{x}{\theta}\right]}\]
\[f(x|\theta,\beta)=\frac{\beta}{\theta}\left(\frac{x}{\theta}\right)^{\beta-1}\exp{\left[-\left(\frac{x}{\theta}\right)^{\beta}\right]}\]
\[f(x|\mu,\sigma)=\frac{\exp\left[-(\ln(x)-\mu)^{2}/(2\sigma^{2})\right]}{x\sigma\sqrt{2\pi}}\]
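These parametric densities correspond to R's d functions; as a spot check (the shape, scale, and evaluation point below are arbitrary), the hand-coded Weibull formula agrees with dweibull( ):

beta <- 2; theta <- 1000; x <- 750
f_manual <- (beta / theta) * (x / theta)^(beta - 1) * exp(-(x / theta)^beta)
all.equal(f_manual, dweibull(x, shape = beta, scale = theta))   # TRUE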
For \((T\in \mathcal{R}^{+}),\) the following probability functions will be used in this course
\(\hspace{12pt}F(t)\) - Cumulative distribution function (CDF)
\(\hspace{14pt}f(t)\) - Probability density function (pdf)
\(\hspace{13pt}S(t)\) - Reliability (survivor) function
\(\hspace{14pt}h(t)\) - Hazard rate (failure rate) function
\(\hspace{11pt}H(t)\) - Cumulative hazard function
\(F^{-1}(p)\) - Inverse CDF (or quantile function)
\[S(t)=Pr(T > t), t\in (0,\infty)\]
\[ \begin{aligned} S(t)&\ge 0, \forall t \in \mathcal{R}^{+}\\\\ S(0)&=1\\\\ \displaystyle \lim_{t \to \infty} S(t)&=0 \end{aligned} \]
For the random variable \(T\in \mathcal{R}^{+}\)
\[ \begin{aligned} h(t)&=\displaystyle \lim_{\Delta t \to 0} \frac{Pr(t < T \le t+\Delta t|T \ge t)}{\Delta t}\\[5mm] &=\displaystyle \lim_{\Delta t \to 0} \frac{S(t)-S(t+\Delta t)}{S(t)\Delta t}\\[5mm] &=-\frac{dS(t)}{dt}\frac{1}{S(t)}=\frac{f(t)}{S(t)}\\ \end{aligned} \]
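These relationships are easy to evaluate numerically; a sketch for a Weibull distribution (the shape, scale, and time values are illustrative):

beta <- 2; theta <- 1000; t <- 500
S_t <- pweibull(t, shape = beta, scale = theta, lower.tail = FALSE)   # S(t) = Pr(T > t)
h_t <- dweibull(t, shape = beta, scale = theta) / S_t                 # h(t) = f(t)/S(t)
H_t <- -log(S_t)                                                      # cumulative hazard H(t) = -log S(t)
c(S = S_t, h = h_t, H = H_t)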
Using a Weibull distribution with shape \((\beta)\) and scale \((\theta)\) as an example
dweibull(x,shape,scale)
Density function \(f(x)\) for a \(Weib(\beta,\theta)\) distribution
pweibull(q,shape,scale)
Cumulative probability \(F(q)\) at quantile \(q\) for a \(Weib(\beta,\theta)\) distribution
qweibull(p,shape,scale)
Quantile function \(F^{-1}(p)\) at probability \(p\) for a \(Weib(\beta,\theta)\) distribution
rweibull(n,shape,scale)
Generates \(n\) random observations from a \(Weib(\beta,\theta)\) distribution
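A quick consistency check of the four functions (the parameter values and probability below are assumptions for illustration): the CDF and quantile function invert one another, and random draws fall below the p-th quantile about 100p% of the time.

set.seed(123)
beta <- 1.5; theta <- 2000; p <- 0.6
q <- qweibull(p, shape = beta, scale = theta)            # quantile at probability p
pweibull(q, shape = beta, scale = theta)                 # recovers p = 0.6
mean(rweibull(1e5, shape = beta, scale = theta) <= q)    # approximately 0.6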
Other distributions in base R
Normal - \(Nor(\mu,\sigma)\)
rnorm(n,mean,sd)
Lognormal - \(Lognor(\mu,\sigma)\)
rlnorm(n,meanlog,sdlog)
Exponential - \(Exp(\theta)\)
rexp(n, rate), where rate \(=1/\theta\)
Gamma - \(Gam(\kappa,\beta)\)
rgamma(n, shape, rate, scale=1/rate)
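Note that rgamma( ) accepts either a rate or a scale argument, with scale = 1/rate; with the same seed the two calls below (parameter values chosen arbitrarily) return identical draws:

set.seed(1)
rgamma(3, shape = 2, rate = 0.01)
set.seed(1)
rgamma(3, shape = 2, scale = 100)   # same draws, since scale = 1/rate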
R uses a four-function naming convention (the prefixes d, p, q, and r) to compute values for distribution functions
Using The Normal Distribution As An Example
pnorm(q = -2:2, mean = 0, sd = 2)
[1] 0.1586553 0.3085375 0.5000000 0.6914625 0.8413447
dnorm(x = -2:2, mean = 0, sd = 2)
[1] 0.1209854 0.1760327 0.1994711 0.1760327 0.1209854
qnorm(p = c(0.95, 0.80, 0.50, 0.20, 0.05), mean = 0, sd = 2)
[1] 3.289707 1.683242 0.000000 -1.683242 -3.289707
rnorm(n, mean, sd) returns n random draws, for example
[1] 2.7494016 1.7362779 3.1268790 -0.7383119 1.6373741
hist( )
density( )
ecdf( )
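The three functions listed above summarize observed data rather than a named parametric model. A minimal sketch using a simulated normal sample (the data here are an assumption for illustration, not an example from the slides):

set.seed(10)
x <- rnorm(200, mean = 0, sd = 2)                         # simulated sample
hist(x, freq = FALSE)                                     # histogram on the density scale
lines(density(x), lwd = 2)                                # kernel density estimate overlaid
plot(ecdf(x))                                             # empirical CDF of the sample
curve(pnorm(x, mean = 0, sd = 2), add = TRUE, lty = 2)    # true CDF for comparison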
optim( ) or nlminb( ) to Maximize Functions
Suppose we have the following function, where \(d_1,...,d_8\) are the data observed in 8 intervals
Given a set of observations we want to find the \(\theta\) that maximizes the function
Like7.3 <- function(theta, d) {
  # d[1],...,d[8] are the counts observed in the intervals with endpoints
  # 0, 100, 300, 500, 700, 1000, 2000, 4000, Inf
  d1 <- d[1]; d2 <- d[2]; d3 <- d[3]; d4 <- d[4]
  d5 <- d[5]; d6 <- d[6]; d7 <- d[7]; d8 <- d[8]

  F <- log(exp(    -0/theta) - exp( -100/theta)) * d1 +
       log(exp( -100/theta) - exp( -300/theta)) * d2 +
       log(exp( -300/theta) - exp( -500/theta)) * d3 +
       log(exp( -500/theta) - exp( -700/theta)) * d4 +
       log(exp( -700/theta) - exp(-1000/theta)) * d5 +
       log(exp(-1000/theta) - exp(-2000/theta)) * d6 +
       log(exp(-2000/theta) - exp(-4000/theta)) * d7 +
       log(exp(-4000/theta) - exp( -Inf/theta)) * d8   # exp(-Inf/theta) evaluates to 0

  return(-F)   # return the negative, since optim( ) and nlminb( ) minimize
}
We can use nlminb( ) to do this assuming the following data were observed
data[2,]
[1] 41 44 24 32 29 21  9  0
nlminb(start = 400, obj = Like7.3, d = data[2,], lower = 100, upper = 800)$par
[1] 572.2742
For this function, we could have also used optim( ) and arrived at the same solution
optim(par = 400, fn = Like7.3, d = data[2,], lower = 100, upper = 800, method = "L-BFGS-B")$par
[1] 572.274
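As an added sanity check (not part of the original slides), we can evaluate the negative log-likelihood over a grid of \(\theta\) values using the counts shown above and confirm that the minimum falls near the value both optimizers return:

d_obs  <- c(41, 44, 24, 32, 29, 21, 9, 0)       # the observed interval counts, data[2,]
thetas <- seq(200, 1200, by = 5)
negLL  <- sapply(thetas, Like7.3, d = d_obs)
plot(thetas, negLL, type = "l", xlab = expression(theta), ylab = "Negative log-likelihood")
abline(v = thetas[which.min(negLL)], lty = 2)   # minimum near 572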
In inferential statistics, confidence intervals (CI) are useful for quantifying the uncertainty associated with a point estimate due to sampling error
Sources of uncertainty
Limited understanding of how a system interacts with its environment
Manufacturing flaws & deviations between units in a sample
Inability of the chosen parametric model to fit the data
Small sample sizes amplify the uncertainty
What does it mean to say "...the \(100\times(1-\alpha)\%\) CI for \(\theta\) is \((\underline{\hat{\theta}},\overline{\hat{\theta}})\)..."?
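One way to see what the confidence level means is by simulation. The sketch below (the sample size, true mean, and number of replications are arbitrary choices) builds a large-sample 95% CI for an exponential mean over many repeated samples and records how often the interval captures the true value:

set.seed(1)
theta_true <- 500                             # true exponential mean
covered <- replicate(5000, {
  x  <- rexp(30, rate = 1 / theta_true)       # one sample of n = 30 lifetimes
  ci <- mean(x) + c(-1, 1) * qnorm(0.975) * sd(x) / sqrt(length(x))
  ci[1] <= theta_true && theta_true <= ci[2]  # did the interval capture theta?
})
mean(covered)                                 # roughly 0.95 (slightly lower at this small n)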
The exponential distribution:
Is the continuous counterpart to the geometric distribution
Describes the inter-arrival time durations between events in a homogeneous Poisson process
Is used to model the "useful-life" region of the bathtub curve
Implies that failures are due to random events or chance
Is a "memoryless" distribution (verified in the sketch after this list)
Is one of the most commonly used lifetime distributions in reliability analyses
Is the simplest distribution used in the analysis of reliability data
In real world scenarios, assuming exponentially distributed failure times is rarely valid
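The memoryless property can be verified directly from pexp( ); the mean and times below are illustrative values:

theta <- 500                                                # mean time to failure
s <- 200; t <- 300
cond   <- pexp(s + t, rate = 1/theta, lower.tail = FALSE) /
          pexp(s,     rate = 1/theta, lower.tail = FALSE)   # Pr(T > s + t | T > s)
uncond <- pexp(t, rate = 1/theta, lower.tail = FALSE)       # Pr(T > t)
all.equal(cond, uncond)                                     # TRUE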
For the normal distribution \(Y\sim Nor(\mu,\sigma)\)
\[ \begin{aligned} f(y|\mu,\sigma)&=\frac{1}{\sigma}\phi_{nor}\left(\frac{y-\mu}{\sigma}\right)=\frac{1}{\sigma}\frac{e^{-(y - \mu)^{2}/(2\sigma^{2})}}{\sqrt{2\pi}}\\\\ F(y|\mu,\sigma)&=\Phi_{nor}\left(\frac{y-\mu}{\sigma}\right)=\int_{-\infty}^{y} \frac{e^{-(u-\mu)^{2}/(2\sigma^{2})}} {\sqrt{2\pi}\sigma}\,du\\\\ h(y|\mu,\sigma)&=\frac{f(y|\mu,\sigma)}{1-F(y|\mu,\sigma)}\\\\ y_{p}&=\mu+\Phi^{-1}_{nor}(p)\sigma, \;\;\;\;\;\;\;\;\text{where}\;\Phi^{-1}_{nor}(p)=z_p\\\\ E[Y]&=\mu\\\\ Var[Y]&=\sigma^2 \end{aligned} \]
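The quantile expression above is exactly what qnorm( ) computes; a one-line check with assumed values of \(\mu\), \(\sigma\), and \(p\):

mu <- 10; sigma <- 2; p <- 0.9
all.equal(mu + qnorm(p) * sigma, qnorm(p, mean = mu, sd = sigma))   # TRUE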
For the lognormal distribution \(T\sim Lognor(\mu,\sigma)\)
\[ \begin{aligned} f(t|\mu,\sigma)&=\frac{1}{\sigma t}\phi_{nor}\left(\frac{\log(t)-\mu}{\sigma}\right)\\\\ F(t|\mu,\sigma)&=\Phi_{nor}\left(\frac{\log(t)-\mu}{\sigma}\right)\\\\ h(t|\mu,\sigma)&=\frac{f(t|\mu,\sigma)}{1-F(t|\mu,\sigma)}\\\\ t_{p}&=\exp\left[\mu+\Phi^{-1}_{nor}(p)\sigma\right], \;\;\;\;\;\;\;\;\text{where}\;\Phi^{-1}_{nor}(p)=z_p\\\\ E[T]&=\exp(\mu+0.5\sigma^2)\\\\ Var[T]&=\exp(2\mu+\sigma^2)(\exp(\sigma^2)-1) \end{aligned} \]
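The lognormal moment formulas can be checked by simulation (the parameter values and sample size below are arbitrary):

set.seed(7)
mu <- 1; sigma <- 0.5
t_samp <- rlnorm(1e6, meanlog = mu, sdlog = sigma)
c(simulated = mean(t_samp), exact = exp(mu + 0.5 * sigma^2))                     # E[T]
c(simulated = var(t_samp),  exact = exp(2 * mu + sigma^2) * (exp(sigma^2) - 1))  # Var[T]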