Notation

The sample space, \( \Omega \), is the collection of possible outcomes of an experiment
- Example: die roll \( \Omega = \{1,2,3,4,5,6\} \)
An event, say \( E \), is a subset of \( \Omega \)
- Example: die roll is even \( E = \{2,4,6\} \)
An elementary or simple event is a particular result of an experiment
- Example: die roll is a four, \( \omega = 4 \)
\( \emptyset \) is called the null event or the empty set

Interpretation of set operations

Normal set operations have particular interpretations in this setting

\( \omega \in E \) implies that \( E \) occurs when \( \omega \) occurs
\( \omega \not\in E \) implies that \( E \) does not occur when \( \omega \) occurs
\( E \subset F \) implies that the occurrence of \( E \) implies the occurrence of \( F \)
\( E \cap F \) implies the event that both \( E \) and \( F \) occur
\( E \cup F \) implies the event that at least one of \( E \) or \( F \) occur
\( E \cap F=\emptyset \) means that \( E \) and \( F \) are mutually exclusive, or cannot both occur
\( E^c \) or \( \bar E \) is the event that \( E \) does not occur

Probability

A probability measure, \( P \), is a function from the collection of possible events so that the following hold

For an event \( E\subset \Omega \), \( 0 \leq P(E) \leq 1 \)
\( P(\Omega) = 1 \)
If \( E_1 \) and \( E_2 \) are mutually exclusive events \( P(E_1 \cup E_2) = P(E_1) + P(E_2) \).

Part 3 of the definition implies finite additivity

\[ P(\cup_{i=1}^n A_i) = \sum_{i=1}^n P(A_i) \] where the \( \{A_i\} \) are mutually exclusive. (Note a more general version of additivity is used in advanced classes.)

Example consequences

\( P(\emptyset) = 0 \)
\( P(E) = 1 - P(E^c) \)
\( P(A \cup B) = P(A) + P(B) - P(A \cap B) \)
if \( A \subset B \) then \( P(A) \leq P(B) \)
\( P\left(A \cup B\right) = 1 - P(A^c \cap B^c) \)
\( P(A \cap B^c) = P(A) - P(A \cap B) \)
\( P(\cup_{i=1}^n E_i) \leq \sum_{i=1}^n P(E_i) \)
\( P(\cup_{i=1}^n E_i) \geq \max_i P(E_i) \)

Example

The National Sleep Foundation (www.sleepfoundation.org) reports that around 3% of the American population has sleep apnea. They also report that around 10% of the North American and European population has restless leg syndrome. Does this imply that 13% of people will have at least one sleep problems of these sorts?

Example continued

Answer: No, the events are not mutually exclusive. To elaborate let:

\[ \begin{eqnarray*} A_1 & = & \{\mbox{Person has sleep apnea}\} \\ A_2 & = & \{\mbox{Person has RLS}\} \end{eqnarray*} \]

Then

\[ \begin{eqnarray*} P(A_1 \cup A_2 ) & = & P(A_1) + P(A_2) - P(A_1 \cap A_2) \\ & = & 0.13 - \mbox{Probability of having both} \end{eqnarray*} \] Likely, some fraction of the population has both.

Random variables

A random variable is a numerical outcome of an experiment.
The random variables that we study will come in two varieties, discrete or continuous.
Discrete random variable are random variables that take on only a countable number of possibilities.
- \( P(X = k) \)
Continuous random variable can take any value on the real line or some subset of the real line.
- \( P(X \in A) \)

Examples of variables that can be thought of as random variables

The \( (0-1) \) outcome of the flip of a coin
The outcome from the roll of a die
The BMI of a subject four years after a baseline measurement
The hypertension status of a subject randomly drawn from a population

PMF

A probability mass function evaluated at a value corresponds to the probability that a random variable takes that value. To be a valid pmf a function, \( p \), must satisfy

\( p(x) \geq 0 \) for all \( x \)
\( \sum_{x} p(x) = 1 \)

The sum is taken over all of the possible values for \( x \).

Example

Let \( X \) be the result of a coin flip where \( X=0 \) represents tails and \( X = 1 \) represents heads. \[ p(x) = (1/2)^{x} (1/2)^{1-x} ~~\mbox{ for }~~x = 0,1 \] Suppose that we do not know whether or not the coin is fair; Let \( \theta \) be the probability of a head expressed as a proportion (between 0 and 1). \[ p(x) = \theta^{x} (1 - \theta)^{1-x} ~~\mbox{ for }~~x = 0,1 \]

PDF

A probability density function (pdf), is a function associated with a continuous random variable

Areas under pdfs correspond to probabilities for that random variable

To be a valid pdf, a function \( f \) must satisfy

\( f(x) \geq 0 \) for all \( x \)
The area under \( f(x) \) is one.

Example

Suppose that the proportion of help calls that get addressed in a random day by a help line is given by \[ f(x) = \left\{\begin{array}{ll} 2 x & \mbox{ for } 1 > x > 0 \\ 0 & \mbox{ otherwise} \end{array} \right. \]

Is this a mathematically valid density?

x <- c(-0.5, 0, 1, 1, 1.5)
y <- c(0, 0, 2, 0, 0)
plot(x, y, lwd = 3, frame = FALSE, type = "l")

plot of chunk unnamed-chunk-1

Example continued

What is the probability that 75% or fewer of calls get addressed?

plot of chunk unnamed-chunk-2

1.5 * 0.75/2

## [1] 0.5625

pbeta(0.75, 2, 1)

## [1] 0.5625

CDF and survival function

The cumulative distribution function (CDF) of a random variable \( X \) is defined as the function \[ F(x) = P(X \leq x) \]
This definition applies regardless of whether \( X \) is discrete or continuous.
The survival function of a random variable \( X \) is defined as \[ S(x) = P(X > x) \]
Notice that \( S(x) = 1 - F(x) \)
For continuous random variables, the PDF is the derivative of the CDF

Example

What are the survival function and CDF from the density considered before?

For \( 1 \geq x \geq 0 \) \[ F(x) = P(X \leq x) = \frac{1}{2} Base \times Height = \frac{1}{2} (x) \times (2 x) = x^2 \]

\[ S(x) = 1 - x^2 \]

pbeta(c(0.4, 0.5, 0.6), 2, 1)

## [1] 0.16 0.25 0.36

Quantiles

The \( \alpha^{th} \) quantile of a distribution with distribution function \( F \) is the point \( x_\alpha \) so that \[ F(x_\alpha) = \alpha \]
A percentile is simply a quantile with \( \alpha \) expressed as a percent
The median is the \( 50^{th} \) percentile

Example

We want to solve \( 0.5 = F(x) = x^2 \)
Resulting in the solution

sqrt(0.5)

## [1] 0.7071

Therefore, about 0.7071 of calls being answered on a random day is the median.
R can approximate quantiles for you for common distributions

qbeta(0.5, 2, 1)

## [1] 0.7071

Summary

You might be wondering at this point “I've heard of a median before, it didn't require integration. Where's the data?”
We're referring to are population quantities. Therefore, the median being discussed is the population median.
A probability model connects the data to the population using assumptions.
Therefore the median we're discussing is the estimand, the sample median will be the estimator