Jason Freels
28 September 2016
Review some important ideas from elementary statistics
Detail the standard functions to work with several common distributions
Introduce several common functions for performing statistical analyses
mean( ), sd( ), var( ), median( )
hist( )
ecdf( )
density( )
lm( )
Introduce two common functions for performing numerical optimizations
optim( )
nlminb( )
A random variable assigns a numerical value to the outcome of a random process, for example:
Number of foreign particles in a 20 oz. glass of tap water
Sum of two dice rolls
Number of licks it takes to get to the center of a tootsie roll pop
Some outcomes of a random variable are more likely to be observed than others, for example:
A glass with billions of foreign particles may be odd in Ohio but common in Africa
A two-roll sum of \(2\) would be more rare than a two-roll sum of \(7\)
Three licks may be possible, but it seems really low
In statistical terminology, any possible value of a random variable is called either an outcome or a realization
The set of all possible outcomes for a random variable is called the sample space and is usually denoted by \(\Omega\)
If the number of outcomes in \(\Omega\) is finite or countably infinite, the random variable is said to be discrete
The number of cars that pass an intersection in some time interval
The result of rolling a die
The number of foreign particles in a volume of water
If the number of outcomes in \(\Omega\) is uncountably infinite, the random variable is said to be continuous
The time at which a device fails
The volume of milk dispensed into a 'gallon' jug by a dairy
The amount of solar energy incident on a photovoltaic cell in an hour
Random variables are generally denoted by capital letters, e.g. \(X\)
Observed outcomes are denoted by lower case letters
Thus, \(P(X = x) = p\) means: The probability that the observed value of a random variable \(X\) is \(x\) is equal to \(p\)
Assume we roll a die two-times - what outcomes might we observe?
\(\Omega = \{(1,1), (1,2),\cdots, (6,5), (6,6)\}\)
36 possible outcomes
The probability of observing any outcome is the same - 1/36
Now, assume that we sum the observed values - what outcomes might we observe?
\(\Omega=\{2,3,\ldots,12\}\), that is, the set of integers from 2 to 12
11 possible outcomes
The probability of observing any outcome is NOT the same
Only one combination of two rolls gives a sum of \(12\), but several combinations give a sum of \(6\)
Therefore, the outcome \(12\) is less likely to be observed than the outcome \(6\)
The Shiny app below illustrates the outcomes for various random variables related to rolling a die one or more times
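A minimal base-R sketch gives a similar picture by approximating the distribution of the two-roll sum (the number of simulated rolls is an arbitrary choice):

set.seed(42)                                       # for reproducibility
rolls  <- matrix(sample(1:6, 2 * 10000, replace = TRUE), ncol = 2)
totals <- rowSums(rolls)                           # sum of the two rolls
table(totals) / length(totals)                     # estimated Pr(X = x) for x = 2,...,12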
For the dice-roll example, \(\Omega\) contains a countable or discrete set of possible outcomes
Therefore, any random variable we define for this example will be a discrete random variable
What random variables can we define?
What's the probability that the sum of two rolls will be \(\le10\)?
If we re-run the example \(10\) times what's the probability that the total will be \(>7\) four or more times?
What's the probability that we must repeat the example \(10\) times to get a total that's \(>10\)?
Each of the above questions corresponds to a distinct random variable
Each random variable has its own distinct \(\Omega\)
Making statements about each random variable requires us to collect distinct data
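Each of these questions can also be answered exactly with base R. The sketch below is an added illustration (not from the original slides): it enumerates the 36 equally likely two-roll outcomes with outer( ) and then uses base R's pbinom( ) and dgeom( ).

sums   <- outer(1:6, 1:6, `+`)             # all 36 equally likely two-roll sums
p_le10 <- mean(sums <= 10)                 # Pr(sum of two rolls <= 10)
p_gt7  <- mean(sums > 7)                   # Pr(total > 7) on a single experiment
p_gt10 <- mean(sums > 10)                  # Pr(total > 10) on a single experiment

p_le10                                     # question 1
1 - pbinom(3, size = 10, prob = p_gt7)     # question 2: 4 or more of 10 totals exceed 7
dgeom(9, prob = p_gt10)                    # question 3: first total > 10 occurs on repetition 10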
For each random variable \(X\) there exists a probability function \(f(x)\)
For discrete random variables we call \(f(x)\) the probability mass function
For continuous random variables we call \(f(x)\) the probability density function
Probability theory places three restrictions on \(f(x)\)
\[ \begin{aligned} &\text{For}\;x_{_{1}} \in \Omega,& \;f(x_{_{1}})\ge 0\\\\ &\text{For}\;x_{_{1}}, x_{_{2}} \in \Omega,& \;f(x_{_{1}})+f(x_{_{2}})\ge f(x_{_{1}})\\\\ &\text{For}\;\Omega=\{x_{_{1}},...,x_{_{n}}\},& \;\sum_{i=1}^{n} f(x_{_{i}})=1\\ \end{aligned} \]
\[f(x)=\begin{cases} Pr(X=x)&x\in \Omega\\\\0 &x \not \in \Omega\end{cases}\]
\[ \begin{aligned} Pr(X \in \mathcal{A})&=\sum_{x\in \mathcal{A}} f(x), \;\;\mathcal{A} \subseteq \Omega\\\\ \sum_{x\in \Omega} f(x)&=1 \end{aligned} \]
\[F(x)=\begin{cases} Pr(X\le x)&x\in \Omega\\\\0 &x \not \in \Omega\end{cases}\]
\[ \begin{aligned} F(x)&\ge 0, \;\;\forall x \in \Omega\\\\ F(x)&=1, \;\;x=\sup(\Omega) \end{aligned} \]
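As a concrete check of these restrictions, using the two-dice-sum pmf from the earlier example, the probabilities are non-negative, they sum to one, and the CDF reaches one at the largest outcome:

x   <- 2:12
f_x <- c(1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1) / 36   # pmf of the sum of two dice
all(f_x >= 0)                                    # f(x) >= 0 for every x in Omega
sum(f_x)                                         # the probabilities sum to 1
F_x <- cumsum(f_x)                               # F(x) = Pr(X <= x)
F_x[length(F_x)]                                 # equals 1 at the largest outcome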
\[f(x|p,n) = \left( \begin{array}{c} n \\ x \end{array} \right) (p)^{x}(1 - p)^{(n-x)}\]
\[f(x|p) = p (1 - p)^{x-1}\]
\[f(x|r,m,n) = \frac{\binom{r}{x} \binom{m - r}{n - x}}{\binom{m}{n}}\]
\[f(x|\lambda) = \frac{e^{-\lambda}\lambda^{x}}{x!}\]
\[f(x|\mu,\sigma) = \frac{\exp\left[-(x-\mu)^{2}/(2\sigma^{2})\right]}{\sigma\sqrt{2\pi}}\]
\[f(x|\theta) = \frac{1} {\theta} \exp{\left[-\frac{x}{\theta}\right]}\]
\[f(x|\theta,\beta)=\frac{\beta}{\theta}\left(\frac{x}{\theta}\right)^{\beta-1}\exp{\left[-\left(\frac{x}{\theta}\right)^{\beta}\right]}\]
\[f(x|\mu,\sigma)=\frac{\exp\left[-(\ln(x)-\mu)^{2}/(2\sigma^{2})\right]}{x\sigma\sqrt{2\pi}}\]
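These parametric densities correspond to R's d functions; as a spot check (the shape, scale, and evaluation point below are arbitrary), the hand-coded Weibull formula agrees with dweibull( ):

beta <- 2; theta <- 1000; x <- 750
f_manual <- (beta / theta) * (x / theta)^(beta - 1) * exp(-(x / theta)^beta)
all.equal(f_manual, dweibull(x, shape = beta, scale = theta))   # TRUE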
For \((T\in \mathcal{R}^{+}),\) the following probability functions will be used in this course
\(\hspace{12pt}F(t)\) - Cumulative distribution function (CDF)
\(\hspace{14pt}f(t)\) - Probability density function (pdf)
\(\hspace{13pt}S(t)\) - Reliability (survivor) function
\(\hspace{14pt}h(t)\) - Hazard rate (failure rate) function
\(\hspace{11pt}H(t)\) - Cumulative hazard function
\(F^{-1}(p)\) - Inverse CDF (or quantile function)
\[S(t)=Pr(T > t), t\in (0,\infty)\]
\[ \begin{aligned} S(t)&\ge 0, \forall t \in \mathcal{R}^{+}\\\\ S(0)&=1\\\\ \displaystyle \lim_{t \to \infty} S(t)&=0 \end{aligned} \]
For the random variable \(T\in \mathcal{R}^{+}\)
\[ \begin{aligned} h(t)&=\displaystyle \lim_{\Delta t \to 0} \frac{Pr(t < T \le t+\Delta t|T \ge t)}{\Delta t}\\[5mm] &=\displaystyle \lim_{\Delta t \to 0} \frac{S(t)-S(t+\Delta t)}{S(t)\Delta t}\\[5mm] &=-\frac{dS(t)}{dt}\frac{1}{S(t)}=\frac{f(t)}{S(t)}\\ \end{aligned} \]
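These relationships are easy to evaluate numerically; a sketch for a Weibull distribution (the shape, scale, and time values are illustrative):

beta <- 2; theta <- 1000; t <- 500
S_t <- pweibull(t, shape = beta, scale = theta, lower.tail = FALSE)   # S(t) = Pr(T > t)
h_t <- dweibull(t, shape = beta, scale = theta) / S_t                 # h(t) = f(t)/S(t)
H_t <- -log(S_t)                                                      # cumulative hazard H(t) = -log S(t)
c(S = S_t, h = h_t, H = H_t)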
Using a Weibull distribution with shape \((\beta)\) and scale \((\theta)\) as an example
dweibull(x,shape,scale)
Density function \(f(x)\) for a \(Weib(\beta,\theta)\) distribution
pweibull(q,shape,scale)
Cumulative probability \(F(q)\) at quantile \(q\) for a \(Weib(\beta,\theta)\) distribution
qweibull(p,shape,scale)
Quantile function \(F^{-1}(p)\) at probability \(p\) for a \(Weib(\beta,\theta)\) distribution
rweibull(n,shape,scale)
Generates \(n\) random observations from a \(Weib(\beta,\theta)\) distribution
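A quick consistency check of the four functions (the parameter values and probability below are assumptions for illustration): the CDF and quantile function invert one another, and random draws fall below the p-th quantile about 100p% of the time.

set.seed(123)
beta <- 1.5; theta <- 2000; p <- 0.6
q <- qweibull(p, shape = beta, scale = theta)            # quantile at probability p
pweibull(q, shape = beta, scale = theta)                 # recovers p = 0.6
mean(rweibull(1e5, shape = beta, scale = theta) <= q)    # approximately 0.6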
Other distributions in base R
Normal - \(Nor(\mu,\sigma)\)
rnorm(n,mean,sd)
Lognormal - \(Lognor(\mu,\sigma)\)
rlnorm(n,meanlog,sdlog)
Exponential - \(Exp(\theta)\)
rexp(n, rate), where rate \(=1/\theta\)
Gamma - \(Gam(\kappa,\beta)\)
rgamma(n, shape, rate, scale=1/rate)
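Note that rgamma( ) accepts either a rate or a scale argument, with scale = 1/rate; with the same seed the two calls below (parameter values chosen arbitrarily) return identical draws:

set.seed(1)
rgamma(3, shape = 2, rate = 0.01)
set.seed(1)
rgamma(3, shape = 2, scale = 100)   # same draws, since scale = 1/rate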
R uses a four-function naming convention (the prefixes d, p, q, and r) to compute values for distribution functions
Using The Normal Distribution As An Example
pnorm(q = -2:2, mean = 0, sd = 2)
[1] 0.1586553 0.3085375 0.5000000 0.6914625 0.8413447
dnorm(x = -2:2, mean = 0, sd = 2)
[1] 0.1209854 0.1760327 0.1994711 0.1760327 0.1209854
qnorm(p = c(0.95, 0.80, 0.50, 0.20, 0.05), mean = 0, sd = 2)
[1] 3.289707 1.683242 0.000000 -1.683242 -3.289707
rnorm(n, mean, sd) returns n random draws, for example
[1] 2.7494016 1.7362779 3.1268790 -0.7383119 1.6373741
hist( )
density( )
ecdf( )
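The three functions listed above summarize observed data rather than a named parametric model. A minimal sketch using a simulated normal sample (the data here are an assumption for illustration, not an example from the slides):

set.seed(10)
x <- rnorm(200, mean = 0, sd = 2)                         # simulated sample
hist(x, freq = FALSE)                                     # histogram on the density scale
lines(density(x), lwd = 2)                                # kernel density estimate overlaid
plot(ecdf(x))                                             # empirical CDF of the sample
curve(pnorm(x, mean = 0, sd = 2), add = TRUE, lty = 2)    # true CDF for comparison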
optim( ) or nlminb( ) to Maximize Functions
Suppose we have the following function, where \(d_1,...,d_8\) are the data observed in 8 intervals
Given a set of observations we want to find the \(\theta\) that maximizes the function
Like7.3 <- function(theta, d) {
  # d[1],...,d[8] are the counts observed in the intervals with endpoints
  # 0, 100, 300, 500, 700, 1000, 2000, 4000, Inf
  d1 <- d[1]; d2 <- d[2]; d3 <- d[3]; d4 <- d[4]
  d5 <- d[5]; d6 <- d[6]; d7 <- d[7]; d8 <- d[8]

  F <- log(exp(    -0/theta) - exp( -100/theta)) * d1 +
       log(exp( -100/theta) - exp( -300/theta)) * d2 +
       log(exp( -300/theta) - exp( -500/theta)) * d3 +
       log(exp( -500/theta) - exp( -700/theta)) * d4 +
       log(exp( -700/theta) - exp(-1000/theta)) * d5 +
       log(exp(-1000/theta) - exp(-2000/theta)) * d6 +
       log(exp(-2000/theta) - exp(-4000/theta)) * d7 +
       log(exp(-4000/theta) - exp( -Inf/theta)) * d8   # exp(-Inf/theta) evaluates to 0

  return(-F)   # return the negative, since optim( ) and nlminb( ) minimize
}
We can use nlminb( ) to do this assuming the following data were observed
data[2,]
[1] 41 44 24 32 29 21  9  0
nlminb(start = 400, obj = Like7.3, d = data[2,], lower = 100, upper = 800)$par
[1] 572.2742
For this function, we could have also used optim( ) and arrived at the same solution
optim(par = 400, fn = Like7.3, d = data[2,], lower = 100, upper = 800, method = "L-BFGS-B")$par
[1] 572.274
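As an added sanity check (not part of the original slides), we can evaluate the negative log-likelihood over a grid of \(\theta\) values using the counts shown above and confirm that the minimum falls near the value both optimizers return:

d_obs  <- c(41, 44, 24, 32, 29, 21, 9, 0)       # the observed interval counts, data[2,]
thetas <- seq(200, 1200, by = 5)
negLL  <- sapply(thetas, Like7.3, d = d_obs)
plot(thetas, negLL, type = "l", xlab = expression(theta), ylab = "Negative log-likelihood")
abline(v = thetas[which.min(negLL)], lty = 2)   # minimum near 572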
In inferential statistics, confidence intervals (CI) are useful for quantifying the uncertainty associated with a point estimate due to sampling error
Sources of uncertainty
Limited understanding of how a system interacts with its environment
Manufacturing flaws & deviations between units in a sample
Inability of the chosen parametric model to fit the data
Small sample sizes amplify the uncertainty
What does it mean to say "...the \(100\times(1-\alpha)\%\) CI for \(\theta\) is \((\underline{\hat{\theta}},\overline{\hat{\theta}})\)..."?
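One way to see what the confidence level means is by simulation. The sketch below (the sample size, true mean, and number of replications are arbitrary choices) builds a large-sample 95% CI for an exponential mean over many repeated samples and records how often the interval captures the true value:

set.seed(1)
theta_true <- 500                             # true exponential mean
covered <- replicate(5000, {
  x  <- rexp(30, rate = 1 / theta_true)       # one sample of n = 30 lifetimes
  ci <- mean(x) + c(-1, 1) * qnorm(0.975) * sd(x) / sqrt(length(x))
  ci[1] <= theta_true && theta_true <= ci[2]  # did the interval capture theta?
})
mean(covered)                                 # roughly 0.95 (slightly lower at this small n)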
The exponential distribution:
Is the continuous counterpart to the geometric distribution
Describes the inter-arrival time durations between events in a homogeneous Poisson process
Is used to model the "useful-life" region of the bathtub curve
Implies that failures are due to random events or chance
Is a "memoryless" distribution (verified in the sketch after this list)
Is one of the most commonly used lifetime distributions in reliability analyses
Is the simplest distribution used in the analysis of reliability data
In real world scenarios, assuming exponentially distributed failure times is rarely valid
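The memoryless property can be verified directly from pexp( ); the mean and times below are illustrative values:

theta <- 500                                                # mean time to failure
s <- 200; t <- 300
cond   <- pexp(s + t, rate = 1/theta, lower.tail = FALSE) /
          pexp(s,     rate = 1/theta, lower.tail = FALSE)   # Pr(T > s + t | T > s)
uncond <- pexp(t, rate = 1/theta, lower.tail = FALSE)       # Pr(T > t)
all.equal(cond, uncond)                                     # TRUE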
For the normal distribution \(Y\sim Nor(\mu,\sigma)\)
\[ \begin{aligned} f(y|\mu,\sigma)&=\frac{1}{\sigma}\phi_{nor}\left(\frac{y-\mu}{\sigma}\right)=\frac{1}{\sigma}\frac{e^{-(y - \mu)^{2}/(2\sigma^{2})}}{\sqrt{2\pi}}\\\\ F(y|\mu,\sigma)&=\Phi_{nor}\left(\frac{y-\mu}{\sigma}\right)=\int_{-\infty}^{y} \frac{e^{-(u-\mu)^{2}/(2\sigma^{2})}} {\sqrt{2\pi}\sigma}\,du\\\\ h(y|\mu,\sigma)&=\frac{f(y|\mu,\sigma)}{1-F(y|\mu,\sigma)}\\\\ y_{p}&=\mu+\Phi^{-1}_{nor}(p)\sigma, \;\;\;\;\;\;\;\;\text{where}\;\Phi^{-1}_{nor}(p)=z_p\\\\ E[Y]&=\mu\\\\ Var[Y]&=\sigma^2 \end{aligned} \]
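The quantile expression above is exactly what qnorm( ) computes; a one-line check with assumed values of \(\mu\), \(\sigma\), and \(p\):

mu <- 10; sigma <- 2; p <- 0.9
all.equal(mu + qnorm(p) * sigma, qnorm(p, mean = mu, sd = sigma))   # TRUE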
For the lognormal distribution \(T\sim Lognor(\mu,\sigma)\)
\[ \begin{aligned} f(t|\mu,\sigma)&=\frac{1}{\sigma t}\phi_{nor}\left(\frac{\log(t)-\mu}{\sigma}\right)\\\\ F(t|\mu,\sigma)&=\Phi_{nor}\left(\frac{\log(t)-\mu}{\sigma}\right)\\\\ h(t|\mu,\sigma)&=\frac{f(t|\mu,\sigma)}{1-F(t|\mu,\sigma)}\\\\ t_{p}&=\exp\left[\mu+\Phi^{-1}_{nor}(p)\sigma\right], \;\;\;\;\;\;\;\;\text{where}\;\Phi^{-1}_{nor}(p)=z_p\\\\ E[T]&=\exp(\mu+0.5\sigma^2)\\\\ Var[T]&=\exp(2\mu+\sigma^2)(\exp(\sigma^2)-1) \end{aligned} \]
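The lognormal moment formulas can be checked by simulation (the parameter values and sample size below are arbitrary):

set.seed(7)
mu <- 1; sigma <- 0.5
t_samp <- rlnorm(1e6, meanlog = mu, sdlog = sigma)
c(simulated = mean(t_samp), exact = exp(mu + 0.5 * sigma^2))                     # E[T]
c(simulated = var(t_samp),  exact = exp(2 * mu + sigma^2) * (exp(sigma^2) - 1))  # Var[T]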