---
title       : Probability
subtitle    : Statistical Inference
author      : Brian Caffo, Jeff Leek, Roger Peng
job         : Johns Hopkins Bloomberg School of Public Health
logo        : bloomberg_shield.png
framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
highlighter : highlight.js  # {highlight.js, prettify, highlight}
hitheme     : tomorrow      #
url:
  lib: ../../librariesNew
  assets: ../../assets
widgets     : [mathjax]     # {mathjax, quiz, bootstrap}
mode        : selfcontained # {standalone, draft}
---

Notation

  - The sample space, \(\Omega\), is the collection of possible outcomes of an experiment; for example, for a die roll \(\Omega = \{1, 2, 3, 4, 5, 6\}\)
  - An event, say \(E\), is a subset of \(\Omega\); for example, the event that the die roll is even is \(E = \{2, 4, 6\}\)
  - An elementary or simple event is a particular result of an experiment, denoted \(\omega\); for example, rolling a four is \(\omega = 4\)
  - \(\emptyset\) is called the null event, or the empty set


Interpretation of set operations

Normal set operations have particular interpretations in this setting

  1. \(\omega \in E\) implies that \(E\) occurs when \(\omega\) occurs
  2. \(\omega \not\in E\) implies that \(E\) does not occur when \(\omega\) occurs
  3. \(E \subset F\) implies that the occurrence of \(E\) implies the occurrence of \(F\)
  4. \(E \cap F\) implies the event that both \(E\) and \(F\) occur
  5. \(E \cup F\) implies the event that at least one of \(E\) or \(F\) occur
  6. \(E \cap F=\emptyset\) means that \(E\) and \(F\) are mutually exclusive, or cannot both occur
  7. \(E^c\) or \(\bar E\) is the event that \(E\) does not occur
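As a small added illustration (not in the original slides), base R's set functions mirror these operations; the die-roll sample space and events below are assumed examples.

omega <- 1:6                          # sample space of a die roll
A <- c(2, 4, 6)                       # event: the roll is even
B <- c(4, 5, 6)                       # event: the roll is at least four

union(A, B)                           # A or B occurs: 2 4 6 5
intersect(A, B)                       # both A and B occur: 4 6
setdiff(omega, A)                     # complement of A: 1 3 5
length(intersect(A, c(1, 3, 5)))      # 0: A and the odd rolls are mutually exclusive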

Probability

Given a random experiment (say rolling a die), a probability measure is a population quantity that summarizes the randomness. Specifically, probability takes a possible outcome from the experiment and assigns it a number between 0 and 1. The Russian mathematician Kolmogorov formalized the rules that these assignments must follow.

A probability measure, \(P\), is a function on the collection of possible events so that the following hold:

  1. For an event \(E\subset \Omega\), \(0 \leq P(E) \leq 1\)
  2. \(P(\Omega) = 1\)
  3. If \(E_1\) and \(E_2\) are mutually exclusive events \(P(E_1 \cup E_2) = P(E_1) + P(E_2)\).

Part 3 of the definition implies finite additivity

\[ P(\cup_{i=1}^n A_i) = \sum_{i=1}^n P(A_i) \] where the \(\{A_i\}\) are mutually exclusive. (Note a more general version of additivity is used in advanced classes.)
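A tiny numeric illustration of finite additivity (added here; the fair die is an assumed example):

p <- rep(1/6, 6)                      # fair die: P({i}) = 1/6 for each face i

A1 <- c(1, 2); A2 <- c(3, 4)          # mutually exclusive events
sum(p[c(A1, A2)])                     # P(A1 union A2): 0.6666667
sum(p[A1]) + sum(p[A2])               # P(A1) + P(A2): the same, by additivity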


Example consequences

  - \(P(\emptyset) = 0\)
  - \(P(E^c) = 1 - P(E)\)
  - If \(E \subset F\) then \(P(E) \leq P(F)\)
  - \(P(E \cup F) = P(E) + P(F) - P(E \cap F)\)

Each of these follows from the three rules above; the last one, in particular, is used in the example that follows.


Example

The National Sleep Foundation (www.sleepfoundation.org) reports that around 3% of the American population has sleep apnea. They also report that around 10% of the North American and European population has restless leg syndrome. Does this imply that 13% of people will have at least one of these sleep problems?


Example continued

Answer: No, the events can simultaneously occur and so are not mutually exclusive. To elaborate, let:

\[
\begin{eqnarray*}
A_1 & = & \{\mbox{Person has sleep apnea}\} \\
A_2 & = & \{\mbox{Person has RLS}\}
\end{eqnarray*}
\]

Then

\[
\begin{eqnarray*}
P(A_1 \cup A_2) & = & P(A_1) + P(A_2) - P(A_1 \cap A_2) \\
& = & 0.13 - \mbox{Probability of having both}
\end{eqnarray*}
\]

Likely, some fraction of the population has both, so the answer is less than 13%.
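For a concrete calculation, suppose (a purely hypothetical overlap, for illustration only) that 0.4% of people have both conditions; then the probability of at least one problem falls just under 13%:

p_apnea <- 0.03; p_rls <- 0.10
p_both  <- 0.004                      # hypothetical overlap, not from the source
p_apnea + p_rls - p_both              # P(at least one) = 0.126, not 0.13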


Going further

Probability calculus is useful for understanding the rules that probabilities must follow. However, we need ways to model and think about probabilities for numeric outcomes of experiments (broadly defined). Densities and mass functions for random variables are the best starting point for this. Remember, everything we're talking about up to this point is a population quantity, not a statement about what occurs in the data. Where we're going with this is: use the data to estimate properties of the population.


Random variables

A random variable is a numerical outcome of an experiment. The random variables that we study come in two varieties, discrete or continuous. Discrete random variables take on only a countable number of possibilities, and we talk about the probability that they take specific values. Continuous random variables can take any value on the real line or some subset of it, and we talk about the probability that they lie within some range.


Examples of variables that can be thought of as random variables

Experiments that we use for intuition and building context:

  - The \((0-1)\) outcome of the flip of a coin
  - The outcome from the roll of a die

Specific instances of treating variables as if random:

  - The web site traffic on a given day
  - The BMI of a subject four years after a baseline measurement
  - The hypertension status of a subject randomly drawn from a population
  - The number of people who click on an ad
  - Intelligence quotients for a sample of children


PMF

A probability mass function evaluated at a value corresponds to the probability that a random variable takes that value. To be a valid pmf, a function \(p\) must satisfy

  1. \(p(x) \geq 0\) for all \(x\)
  2. \(\sum_{x} p(x) = 1\)

The sum is taken over all of the possible values for \(x\).


Example

Let \(X\) be the result of a coin flip where \(X=0\) represents tails and \(X = 1\) represents heads. \[ p(x) = (1/2)^{x} (1/2)^{1-x} ~~\mbox{ for }~~x = 0,1 \] Suppose that we do not know whether or not the coin is fair; Let \(\theta\) be the probability of a head expressed as a proportion (between 0 and 1). \[ p(x) = \theta^{x} (1 - \theta)^{1-x} ~~\mbox{ for }~~x = 0,1 \]
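A quick R sketch (added for illustration) checking that this is a valid pmf; the value of \(\theta\) is an assumption, since in practice it is unknown:

theta <- 0.3                          # assumed probability of a head
p <- function(x) theta^x * (1 - theta)^(1 - x)
p(0) >= 0 && p(1) >= 0                # condition 1: TRUE
p(0) + p(1)                          # condition 2: sums to 1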


PDF

A probability density function (pdf) is a function associated with a continuous random variable

Areas under pdfs correspond to probabilities for that random variable

To be a valid pdf, a function \(f\) must satisfy

  1. \(f(x) \geq 0\) for all \(x\)

  2. The area under \(f(x)\) is one.

Example

Suppose that the proportion of help calls that get addressed in a random day by a help line is given by the density \(f(x) = 2x\) for \(0 < x < 1\) and \(f(x) = 0\) otherwise. Is this a valid density? Plotting it:

x <- c(-0.5, 0, 1, 1, 1.5); y <- c(0, 0, 2, 0, 0)
plot(x, y, lwd = 3, frame = FALSE, type = "l")
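As an added cross-check (not in the original slides), base R's integrate confirms the two density conditions numerically:

f <- function(x) 2 * x                # the candidate density on (0, 1)
all(f(seq(0, 1, by = 0.01)) >= 0)     # nonnegative on its support: TRUE
integrate(f, 0, 1)$value              # total area: 1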


Example continued

What is the probability that 75% or fewer of calls get addressed?

The desired probability is the area under the density to the left of 0.75, a triangle with base 0.75 and height \(f(0.75) = 1.5\):

1.5 * .75 / 2    # one half base times height
## [1] 0.5625

This density is that of a Beta(2, 1) random variable, so R's pbeta gives the same answer:

pbeta(.75, 2, 1)
## [1] 0.5625
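The same area can also be obtained by numeric integration (an added cross-check):

integrate(function(x) 2 * x, 0, 0.75)$value
## [1] 0.5625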

CDF and survival function

Certain areas are so useful, we give them names.

  • The cumulative distribution function (CDF) of a random variable, \(X\), returns the probability that the random variable is less than or equal to the value \(x\) \[ F(x) = P(X \leq x) \] (This definition applies regardless of whether \(X\) is discrete or continuous.)
  • The survival function of a random variable \(X\) is defined as the probability that the random variable is greater than the value \(x\) \[ S(x) = P(X > x) \]
  • Notice that \(S(x) = 1 - F(x)\)


Example

What are the survival function and CDF from the density considered before?

For \(0 \leq x \leq 1\) \[ F(x) = P(X \leq x) = \frac{1}{2} \mbox{Base} \times \mbox{Height} = \frac{1}{2} (x) \times (2x) = x^2 \]

\[ S(x) = 1 - x^2 \]

pbeta(c(0.4, 0.5, 0.6), 2, 1)    # F(x) evaluated at x = 0.4, 0.5, 0.6
## [1] 0.16 0.25 0.36
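As an added cross-check, squaring the same points reproduces these values directly from \(F(x) = x^2\):

c(0.4, 0.5, 0.6)^2
## [1] 0.16 0.25 0.36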

Quantiles

You've heard of sample quantiles. If you were the 95th percentile on an exam, you know that 95% of people scored worse than you and 5% scored better. These are sample quantities. Here we define their population analogs.

The \(\alpha^{th}\) quantile of a distribution with distribution function \(F\) is the point \(x_\alpha\) so that \(F(x_\alpha) = \alpha\). A percentile is simply a quantile with \(\alpha\) expressed as a percent; the median is the \(50^{th}\) percentile.

For example, the \(95^{th}\) percentile of a distribution is the point so that:

  - the probability that a random variable drawn from the population is less is 95%
  - the probability that a random variable drawn from the population is more is 5%

R can approximate quantiles for you for common distributions:

qbeta(0.5, 2, 1)
## [1] 0.7071068
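Since \(F(x) = x^2\) for the density considered above, the median solves \(x^2 = 0.5\); the analytic answer matches qbeta (an added cross-check):

sqrt(0.5)
## [1] 0.7071068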

Summary

  - A probability measure is a population quantity that assigns numbers between 0 and 1 to events, following Kolmogorov's rules
  - pmfs give the probabilities that discrete random variables take specific values; areas under pdfs give probabilities for continuous random variables
  - The CDF is \(F(x) = P(X \leq x)\) and the survival function is \(S(x) = 1 - F(x)\)
  - Quantiles, such as percentiles, are the population analogs of sample quantiles
