Probability & Probability Distributions

Joe Ripberger

Probability

  • Probability: likelihood that an event (or set of events) will occur
    • Ranges from 0 to 1; the higher the probability, the more certain we are that an event (or set of events) will occur
    • When all outcomes are equally likely, the probability of an event (A) is the number of ways event (A) can occur divided by the total number of possible outcomes
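
As a small illustration of the counting definition, the sketch below assumes a fair six-sided die (an example that is not in the slides) and computes the probability of rolling an even number.

```r
# Hypothetical example: one roll of a fair six-sided die
outcomes <- 1:6                           # all equally likely outcomes
event_a  <- outcomes[outcomes %% 2 == 0]  # event A: the roll is even

# P(A) = (ways A can occur) / (total number of possible outcomes)
p_a <- length(event_a) / length(outcomes)
p_a  # 0.5
```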

Finding Probabilities

  • There are 2 basic ways to find a simple probability:
    • Logic (prior—before an event occurs)
    • Observation (posterior—after an event occurs)
      • Large number of trials
  • Examples:
    • What is the probability of getting a head when flipping a coin?
    • What is the probability that it will be > 90 degrees on Sept. 30?
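
The coin example can be approached both ways. The sketch below is only an illustration: the seed and the number of simulated flips are arbitrary choices, and the simulation stands in for real observation.

```r
set.seed(42)  # arbitrary seed so the run is reproducible

# Logic: a fair coin has two equally likely sides
p_logic <- 1 / 2

# Observation: flip many times and record the share of heads
flips <- sample(c("H", "T"), size = 10000, replace = TRUE)
p_observed <- mean(flips == "H")

c(logic = p_logic, observed = p_observed)
```

With a large number of trials the observed share should land close to 0.5. The temperature question has no comparable logical answer, so it would have to be estimated from many years of recorded observations.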

Finding Probabilities

  • Mutually exclusive events — if an event (A) occurs, event (B) cannot occur
    • Only one event can happen at a given time
  • Independent events — the probability of event (A) is not affected by the occurrence or nonoccurrence of event (B), and vice versa

Finding Probabilities

  • Marginal (unconditional) probability
    • Probability of event (A)
    • \(P(A) \in [0,1]\)
  • Union of probabilities
    • Probability of event (A) or (B)
    • \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\)
    • \(P(A \cup B) = P(A) + P(B)\) if A and B are mutually exclusive
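
The union rule can be checked by enumeration. This sketch assumes a fair six-sided die and three illustrative events (none of which appear in the slides); `prob()` is a helper defined here, not a built-in function.

```r
outcomes <- 1:6
prob <- function(event) length(event) / length(outcomes)  # helper for this example

a       <- c(2, 4, 6)  # even roll
b       <- c(5, 6)     # roll greater than 4 (overlaps with a at 6)
c_event <- 1           # roll of 1 (mutually exclusive with a)

# General rule: subtract the overlap so it is not counted twice
p_union_direct  <- prob(union(a, b))
p_union_formula <- prob(a) + prob(b) - prob(intersect(a, b))

# Mutually exclusive events: the overlap is empty, so probabilities simply add
p_exclusive_direct  <- prob(union(a, c_event))
p_exclusive_formula <- prob(a) + prob(c_event)
```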

Finding Probabilities

  • Conditional probability
    • Probability of event (A) given (B)
    • \(P(A \mid B) = \frac{P(A \cap B)}{P(B)}\)
  • Joint probability
    • Probability of event (A) and (B)
    • \(P(A \cap B) = P(A \mid B)P(B)\)
    • \(P(A \cap B) = P(A)P(B)\) if A and B are independent
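
A minimal sketch of conditional and joint probability, again assuming a fair six-sided die and illustrative events that are not in the slides. `prob()` is a helper defined for this example.

```r
outcomes <- 1:6
prob <- function(event) length(event) / length(outcomes)  # helper for this example

a <- c(2, 4, 6)     # even roll
b <- c(4, 5, 6)     # roll greater than 3
d <- c(1, 2, 3, 4)  # roll of 4 or less

# Joint and conditional probability
p_a_and_b   <- prob(intersect(a, b))  # {4, 6}, so 2/6
p_a_given_b <- p_a_and_b / prob(b)    # (2/6) / (3/6) = 2/3

# Independence holds only when P(A and B) equals P(A) * P(B)
independent_ab <- isTRUE(all.equal(p_a_and_b, prob(a) * prob(b)))
independent_ad <- isTRUE(all.equal(prob(intersect(a, d)), prob(a) * prob(d)))
c(a_and_b = independent_ab, a_and_d = independent_ad)
```

Here A and B are not independent (knowing the roll exceeds 3 raises the chance it is even from 1/2 to 2/3), while A and D are.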

Probability Distributions

  • Probability Distribution: a list that specifies the probabilities associated with all possible outcomes of a random process
    • Probability Mass Function (pmf): a function that gives the probability that a discrete random variable is exactly equal to some value
    • Probability Density Function (pdf): a function that gives the relative likelihood (density) of a continuous random variable at some value; the probability that the variable falls within an interval is the area under the pdf over that interval
    • Cumulative Distribution Function (cdf): a function that gives the probability that a discrete or continuous random variable will take a value less than or equal to some value
  • pmfs, pdfs, and cdfs are defined mathematically but often shown graphically
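
The three functions map onto R's `d*` and `p*` families. The sketch below uses 10 fair coin flips and the standard normal distribution as illustrative cases; both are assumptions chosen for the example.

```r
# Discrete case: number of heads in 10 fair coin flips
k   <- 0:10
pmf <- dbinom(k, size = 10, prob = 0.5)  # P(X = k)
cdf <- pbinom(k, size = 10, prob = 0.5)  # P(X <= k)
all.equal(cumsum(pmf), cdf)              # the cdf accumulates the pmf

# Continuous case: standard normal
density_at_zero <- dnorm(0)              # a density, not a probability
area <- integrate(dnorm, lower = -1, upper = 1)$value  # area under the pdf
cdf_difference <- pnorm(1) - pnorm(-1)   # same probability from the cdf
c(area = area, cdf_difference = cdf_difference)
```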

Probability Mass Function

Cumulative Distribution Function

Probability Density Function

Probability Density Function

Probability Density Function

Probability Density Function

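library(sfsmisc)  # provides integrate.xy()
# Area under the estimated density of male heights between 73 and 75
male_density <- density(height_data_wide$Male)
integrate.xy(male_density$x, male_density$y, 73, 75)
[1] 0.1200779
# Same interval for female heights
female_density <- density(height_data_wide$Female)
integrate.xy(female_density$x, female_density$y, 73, 75)
[1] 0.008992442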

Cumulative Distribution Function

Cumulative Distribution Function

Cumulative Distribution Function

# P(X <= 74) when X ~ N(70, 4^2)
pnorm(74, mean = 70, sd = 4, lower.tail = TRUE)
[1] 0.8413447
# P(X <= 74) when X ~ N(65, 3.5^2)
pnorm(74, mean = 65, sd = 3.5, lower.tail = TRUE)
[1] 0.994936

Normal (Gaussian) Distribution

  • Common probability distribution for continuous random variables
  • Defined by \(\mu\) and \(\sigma^2\), often written as \(\mathcal{N}(\mu,\sigma^2)\)
  • Skewness = 0 (symmetric)
  • Kurtosis = 3 (excess kurtosis = 0)
  • Non-zero over the entire real line, but practically zero when the value of x lies more than a few standard deviations away from the mean
  • Standard normal distribution is a special case of the normal distribution, where:
    • \(\mu=0\)
    • \(\sigma=1\)
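
To make "practically zero more than a few standard deviations away" concrete, the sketch below computes the share of a normal distribution within 1, 2, and 3 standard deviations of the mean, and shows that any normal distribution reduces to the standard normal after standardizing. The values 74, 70, and 4 are illustrative.

```r
# Share of the distribution within k standard deviations of the mean
k <- 1:3
within_k_sd <- pnorm(k) - pnorm(-k)
round(within_k_sd, 4)  # 0.6827 0.9545 0.9973

# Standardizing: N(70, 4^2) evaluated at 74 matches N(0, 1) evaluated at 1
p_raw      <- pnorm(74, mean = 70, sd = 4)
p_standard <- pnorm((74 - 70) / 4)
all.equal(p_raw, p_standard)
```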

Normal Distribution (pdf)

Normal Distribution (cdf)

Skewness

  • Skewness is a measure of the asymmetry of a probability distribution about its mean
    • Positive skewness: the right tail is longer; the mass of the distribution is concentrated on the left side of the distribution
    • Negative skewness: the left tail is longer; the mass of the distribution is concentrated on the right side of the distribution

Skewness

\[\text{Sample skewness} = \frac{m_3}{s^3} = \frac{\tfrac{1}{n} \sum_{i=1}^n (x_i-\overline{x})^3}{ \sqrt{\tfrac{1}{n-1} \sum_{i=1}^n (x_i-\overline{x})^2}^{\,3}}\]
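
The formula translates directly into R. `sample_skewness()` is a name introduced here for illustration, and the two test vectors are made up.

```r
# Sample skewness: third central moment (divided by n) over the cube of the
# sample standard deviation (n - 1 denominator), as in the formula above
sample_skewness <- function(x) {
  n  <- length(x)
  m3 <- sum((x - mean(x))^3) / n
  s  <- sd(x)  # sd() uses the n - 1 denominator
  m3 / s^3
}

symmetric_data    <- c(1, 2, 3, 4, 5)
right_skewed_data <- c(1, 1, 2, 2, 3, 10)

sample_skewness(symmetric_data)     # 0: deviations cancel exactly
sample_skewness(right_skewed_data)  # positive: long right tail
```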

Kurtosis

  • Kurtosis is a measure of the “tailedness” of a probability distribution
    • Positive excess kurtosis (leptokurtic): distribution has “fatter” tails than a normal distribution; produces more outliers than normal (kurtosis > 3)
    • Negative excess kurtosis (platykurtic): distribution has “thinner” tails than a normal distribution; produces fewer outliers than normal (kurtosis < 3)

Kurtosis

Source: https://en.wikipedia.org/wiki/Kurtosis

Kurtosis

\[\text{Sample excess kurtosis} = \frac{m_4}{m_2^2} - 3 = \frac{\tfrac{1}{n} \sum_{i=1}^n (x_i - \overline{x})^4}{\left(\tfrac{1}{n} \sum_{i=1}^n (x_i - \overline{x})^2\right)^2} - 3\]
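
A sketch of the same formula in R. Because the formula subtracts 3, the result is excess kurtosis: roughly 0 for normal data, negative for thin tails, positive for fat tails. The function name and the example vectors are assumptions made for illustration.

```r
# Excess kurtosis: fourth central moment over squared second central moment, minus 3
sample_excess_kurtosis <- function(x) {
  n  <- length(x)
  m2 <- sum((x - mean(x))^2) / n
  m4 <- sum((x - mean(x))^4) / n
  m4 / m2^2 - 3
}

set.seed(1)                               # arbitrary seed for reproducibility
normal_draws <- rnorm(100000)             # should give a value near 0
flat_data    <- 1:100                     # evenly spread values, thin tails
heavy_data   <- c(rep(0, 98), -10, 10)    # mostly 0 with two extreme values

sample_excess_kurtosis(normal_draws)
sample_excess_kurtosis(flat_data)   # negative (platykurtic)
sample_excess_kurtosis(heavy_data)  # 47 (leptokurtic)
```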

Z-Score

  • Standard or standardized score: how many standard deviations (\(\sigma\)) above or below the mean (\(\mu\)) is a given value?
    • A z-score of 0 means the value is the same as the mean
      • What percentile is that?
    • A z-score of 1 means the value is 1 standard deviation above the mean
      • What percentile is that?
    • A z-score of -1 means the value is 1 standard deviation below the mean
      • What percentile is that?

Normal Distribution (cdf)

Z-Scores

  • Standard or standardized score: how many standard deviations (\(\sigma\)) above or below the mean (\(\mu\)) is a given value?
    • A z-score of 0 means the value is the same as the mean
      • What percentile is that? [50th]
    • A z-score of 1 means the value is 1 standard deviation above the mean
      • What percentile is that? [84th]
    • A z-score of -1 means the value is 1 standard deviation below the mean
      • What percentile is that? [16th]

Z-Scores

pnorm(0, mean = 0, sd = 1, lower.tail = TRUE)
[1] 0.5
pnorm(1, mean = 0, sd = 1, lower.tail = TRUE)
[1] 0.8413447
pnorm(-1, mean = 0, sd = 1, lower.tail = TRUE)
[1] 0.1586553
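
The lookup also works in reverse: `qnorm()` returns the z-score for a given percentile. A brief sketch of the round trip, using the same three z-scores:

```r
z_scores    <- c(-1, 0, 1)
percentiles <- pnorm(z_scores)     # z-score to percentile (as a proportion)
recovered   <- qnorm(percentiles)  # percentile back to z-score

round(100 * percentiles)           # 16 50 84
all.equal(recovered, z_scores)
```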

Z-Scores

  • To convert a raw score into a z-score, use the following formula: \[z = \frac{x-\mu}{\sigma}\]
  • If \(x\) = 78, \(\mu = 70\), and \(\sigma = 4\), what is the z-score of \(x\)?
  • What is the probability that a man in the U.S. is 6’6” (78 inches) or taller?

Normal Distribution (cdf)

Z-Scores

1 - pnorm(2, mean = 0, sd = 1, lower.tail = TRUE)
[1] 0.02275013
pnorm(2, mean = 0, sd = 1, lower.tail = FALSE)
[1] 0.02275013
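
The z-score of 2 comes from the values given earlier (\(x = 78\), \(\mu = 70\), \(\sigma = 4\)). As a sketch, the same tail probability can be computed either by standardizing first or by passing the raw value with its mean and standard deviation to `pnorm()`:

```r
x     <- 78
mu    <- 70
sigma <- 4

# Standardize by hand, then use the standard normal
z        <- (x - mu) / sigma               # 2
p_from_z <- pnorm(z, lower.tail = FALSE)

# Or let pnorm() standardize internally
p_from_raw <- pnorm(x, mean = mu, sd = sigma, lower.tail = FALSE)

all.equal(p_from_z, p_from_raw)
```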

Binomial Distribution

  • Common probability distribution for discrete random variables
  • Defined by the number of trials (n) and the probability of success on a single trial (p)
  • Approaches the normal distribution as n increases (for a fixed p)
  • Bernoulli distribution is a special case of the binomial distribution, where n = 1
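
A minimal sketch of these properties with R's binomial functions. The values n = 100 and p = 0.5 are illustrative, and the half-unit continuity correction in the normal approximation is a standard adjustment that the slides do not mention.

```r
n <- 100
p <- 0.5

# Bernoulli: a binomial with a single trial
p_single_success <- dbinom(1, size = 1, prob = p)  # equals p

# Exact probability of 55 or fewer successes in n trials
p_exact <- pbinom(55, size = n, prob = p)

# Normal approximation with mean n * p and sd sqrt(n * p * (1 - p))
p_approx <- pnorm(55.5, mean = n * p, sd = sqrt(n * p * (1 - p)))

c(exact = p_exact, approx = p_approx)
```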

Binomial Distribution

Cumulative Distribution Function

Other Common Distributions

  • Uniform
    • Constant probability
    • Defined by the minimum and maximum values of x
  • Student’s t
    • Similar to the normal distribution, but has heavier tails (higher kurtosis)
    • Centered at 0 and defined by its degrees of freedom (\(\nu = n-1\), where n is the sample size)
    • Approaches the standard normal distribution as \(\nu\) increases
  • Logistic
    • Similar to the normal distribution, but has heavier tails (higher kurtosis)
    • Defined by the location \((\mu)\) and scale \((s)\) of x (similar to \(\mu\) and \(\sigma\))
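
The tail behavior described above can be compared numerically. This sketch looks at the probability of a value above 2. The degrees of freedom are arbitrary, and the logistic scale is set to \(\sqrt{3}/\pi\) so that its variance equals 1, which is a choice made here for a fair comparison.

```r
normal_tail <- pnorm(2, lower.tail = FALSE)

# Student's t: heavier tails that shrink toward the normal as df grows
df_values <- c(2, 5, 30, 1000)
t_tail <- pt(2, df = df_values, lower.tail = FALSE)

# Logistic with location 0 and unit variance (variance = scale^2 * pi^2 / 3)
logistic_tail <- plogis(2, location = 0, scale = sqrt(3) / pi, lower.tail = FALSE)

# Uniform on [0, 10]: the density is constant between the minimum and maximum
uniform_density <- dunif(c(1, 5, 9), min = 0, max = 10)

round(c(normal = normal_tail, t = t_tail, logistic = logistic_tail), 4)
```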

Other Common Distributions (pdf)