Basic Definitions

Probability is a numerical measure of the likelihood that an outcome will occur; it takes values from 0 (the outcome will not occur) to 1 (the outcome is certain to occur), with values near 0 indicating an unlikely outcome and values near 1 a likely one.

Note: 1/2, 0.5, and 50% all mean the same thing, so they are all correct. When working with computers, decimals are the preferred format.

Experiment is any process that can result in one of several well-defined outcomes that cannot be predicted with certainty beforehand (e.g. roll a die, make an investment).

Sample space is the set of all possible outcomes in an experiment (written as S).

Sample point is a member of the sample space; any one particular experimental outcome.

Event is a well-defined collection of sample points; a subset of the sample space.
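As a rough illustration of these definitions, the sketch below models one roll of a six-sided die with Python sets. The die being fair, and the names `sample_space`, `sample_point`, and `event_even`, are assumptions made only for this example.

```python
from fractions import Fraction

# Experiment: roll one six-sided die (assumed fair for this sketch).
sample_space = {1, 2, 3, 4, 5, 6}   # S: the set of all possible outcomes
sample_point = 4                     # one particular experimental outcome
event_even = {2, 4, 6}               # an event: a subset of S

print(sample_point in sample_space)  # True
print(event_even <= sample_space)    # True: every event is a subset of S

# With equally likely outcomes, Pr(event) = |event| / |S|.
pr_even = Fraction(len(event_even), len(sample_space))
print(pr_even)         # 1/2
print(float(pr_even))  # 0.5, the decimal form of the same probability
```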

Approaches to Probability

Classical method assumes equally likely outcomes.

Relative frequency method is based more on real-world empirical observation than on theoretical assumptions about the likelihood of any experimental outcome.

Subjective method is used when experimental outcomes cannot be assumed equally likely and relative frequency data are either unavailable or uncollectable; probabilities are then assigned using judgment or experience.
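The first two approaches can be compared with a small simulation. This is only a sketch: the simulated rolls stand in for real-world observations, and the fair die, the seed, and the number of rolls are assumptions chosen for illustration.

```python
import random
from fractions import Fraction

random.seed(0)  # fixed seed so the run is repeatable

# Classical method: assume all six faces are equally likely.
classical = Fraction(1, 6)

# Relative frequency method: estimate the same probability from observations.
# Simulated rolls are used here in place of real empirical data.
n_rolls = 100_000
rolls = [random.randint(1, 6) for _ in range(n_rolls)]
relative_freq = rolls.count(6) / n_rolls

print(float(classical))  # theoretical value, about 0.1667
print(relative_freq)     # empirical estimate, close to the value above
```

With more rolls the empirical estimate tends to settle closer to the classical value.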

Special Rules

Axioms of Probability

When assigning probabilities, two requirements must be met:

  • the probability of any sample point must be between 0 and 1, inclusive.
  • the probabilities of all sample points must sum to 1.
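The two requirements can be checked mechanically. The helper below is a hypothetical function written for this example (not part of any library); it uses `math.isclose` because decimal probabilities rarely sum to exactly 1 in floating point.

```python
import math

def is_valid_assignment(probs):
    """Check both requirements for a dict {sample point: probability}."""
    in_range = all(0 <= p <= 1 for p in probs.values())
    sums_to_one = math.isclose(sum(probs.values()), 1.0)
    return in_range and sums_to_one

print(is_valid_assignment({"H": 0.5, "T": 0.5}))    # True
print(is_valid_assignment({"H": 0.7, "T": 0.5}))    # False: sums to 1.2
print(is_valid_assignment({"H": 1.2, "T": -0.2}))   # False: outside [0, 1]
```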

Elementary Operations

The intersection of two events is written as A \(\cap\) B; Pr(A \(\cap\) B) is read as the probability that both A and B occur.

If A and B cannot both occur, the two events are mutually exclusive, and Pr(A \(\cap\) B) = 0.

The union of two events is written as A \(\cup\) B; Pr(A \(\cup\) B) is read as the probability that A or B (or both) occurs.

The complement of an event is written as \(A^c\); Pr(\(A^c\)) is read as the probability that A does not occur.
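These operations map directly onto Python's set operators. The sketch below assumes a fair six-sided die; the events and the helper `pr` are made up for illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # one roll of a die (assumed fair)
A = {2, 4, 6}            # "roll is even"
B = {4, 5, 6}            # "roll is greater than 3"
C = {1, 3}               # "roll is 1 or 3"

def pr(event):
    """Probability of an event when all sample points are equally likely."""
    return Fraction(len(event), len(S))

print(pr(A & B))   # intersection {4, 6}: 1/3
print(pr(A | B))   # union {2, 4, 5, 6}: 2/3
print(pr(S - A))   # complement of A, {1, 3, 5}: 1/2
print(pr(A & C))   # 0: A and C share no sample points, so they are mutually exclusive
```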

Special Rules

The union of any two events can be calculated with the following rule:

\[ Pr(A \cup B) = Pr(A) + Pr(B) - Pr(A \cap B)\]
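One way to convince yourself of this rule is to count both sides directly. The sketch below uses a standard 52-card deck as a hypothetical example, with one card drawn at random; subtracting the intersection avoids counting the three heart face cards twice.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))            # 52 equally likely sample points

hearts = {card for card in deck if card[1] == "hearts"}         # event A
faces = {card for card in deck if card[0] in {"J", "Q", "K"}}   # event B

def pr(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(deck))

lhs = pr(hearts | faces)                             # counted directly
rhs = pr(hearts) + pr(faces) - pr(hearts & faces)    # 13/52 + 12/52 - 3/52
print(lhs, rhs)   # 11/26 11/26
```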

Conditional Probability is the probability of one event occurring given that another event has occurred.

Notation: Pr(A|B), where the vertical bar means “given”.

Independent Events: If Pr(A|B) = Pr(A), then the two events are independent.

For independent events the following rule is always true:

\[ Pr(A \cap B) = Pr(A) \cdot Pr(B) \]
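The sketch below checks independence both ways on a fair die. It relies on the standard definition Pr(A|B) = Pr(A \(\cap\) B) / Pr(B), which is not stated above; the events and helper names are assumptions made for this example.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # one roll of a die (assumed fair)

def pr(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(S))

def pr_given(a, b):
    """Pr(a | b) = Pr(a and b) / Pr(b); only defined when Pr(b) > 0."""
    return pr(a & b) / pr(b)

A = {2, 4, 6}       # "roll is even"
B = {1, 2, 3, 4}    # "roll is at most 4"
D = {4, 5, 6}       # "roll is greater than 3"

print(pr_given(A, B))                  # 1/2, equal to Pr(A): A and B are independent
print(pr(A & B) == pr(A) * pr(B))      # True
print(pr_given(A, D))                  # 2/3, not equal to Pr(A): A and D are dependent
print(pr(A & D) == pr(A) * pr(D))      # False: 1/3 versus 1/4
```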

Random Variables

A random variable is a variable whose specific outcomes are assumed to arise by chance or according to some random or stochastic mechanism.

When considering a random variable, we assume that no observation has been made yet.

The probability distribution is a mathematical function that provides the probabilities of occurrence of different possible outcomes in an experiment.

There are two types of random variables, depending on the outcomes they can take: discrete and continuous.

A cumulative probability for a random variable X is the probability of observing a value less than or equal to x, written as \(Pr(X \leq x)\).

For discrete random variables we can calculate \(Pr(X = x)\).

For continuous random variables we calculate probabilities for an interval (using cumulative probabilities).
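A minimal sketch of both cases follows. The discrete example is a fair die; the continuous example assumes X is uniform on [0, 1], for which \(Pr(X \leq x) = x\) inside that interval. Both choices are illustrative assumptions.

```python
from fractions import Fraction

# Discrete: probability mass for each value of a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf_discrete(x):
    """Pr(X <= x): add up the probabilities of all values not exceeding x."""
    return sum(p for value, p in pmf.items() if value <= x)

print(pmf[3])            # Pr(X = 3)  = 1/6
print(cdf_discrete(3))   # Pr(X <= 3) = 1/2

# Continuous: X uniform on [0, 1] (an assumption for this example).
def cdf_uniform(x):
    """Pr(X <= x) for X uniform on [0, 1]."""
    return min(max(x, 0.0), 1.0)

# Interval probability as a difference of cumulative probabilities.
print(cdf_uniform(0.75) - cdf_uniform(0.25))   # Pr(0.25 < X <= 0.75) = 0.5
```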

Mean and Variance of a Random Variable

For a discrete random variable X with possible values \(x_1, \dots, x_n\), the mean \(\mu_x\) (also called the expectation or expected value, E[X]) is interpreted as the average outcome that you can expect over many realizations.

\[ \mu_x = E(X) = \sum_{i=1}^n x_i \cdot Pr(X=x_i)\]

For X, the variance \(\sigma^2_x\), also written as Var[X], quantifies the variability inherent in realizations of X.

\[ \sigma^2_x = Var(X) = \sum_{i=1}^n (x_i - \mu_x)^2 \cdot Pr(X=x_i)\]
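Both sums translate almost line for line into code. The sketch below applies them to a fair six-sided die, which is an assumed example distribution; exact fractions are used so the results are not blurred by rounding.

```python
from fractions import Fraction

# Pr(X = x_i) for each face of a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Mean: sum of x_i * Pr(X = x_i).
mean = sum(x * p for x, p in pmf.items())

# Variance: sum of (x_i - mean)^2 * Pr(X = x_i).
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(mean)             # 7/2
print(variance)         # 35/12
print(float(variance))  # about 2.9167
```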

Shape, Skew, and Modality

A distribution is symmetric if a vertical line drawn down its center splits it into two mirror-image halves, with 0.5 probability falling on either side of that line.

If a distribution is asymmetric, we say that it is skewed.

Modality describes the number of easily identifiable peaks in the distribution of interest.
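For a discrete distribution, these ideas can be checked with a couple of small helpers. Both functions are hypothetical and deliberately crude: `count_peaks` only counts values whose probability is strictly higher than that of both neighbors, which is one simple reading of "easily identifiable peaks".

```python
import math

def is_symmetric(pmf):
    """True if the probabilities mirror each other around the center of the values."""
    lo, hi = min(pmf), max(pmf)
    return all(math.isclose(p, pmf.get(lo + hi - v, 0.0)) for v, p in pmf.items())

def count_peaks(pmf):
    """Count values whose probability exceeds both neighbors (missing neighbors count as 0)."""
    values = sorted(pmf)
    probs = [0.0] + [pmf[v] for v in values] + [0.0]
    return sum(1 for i in range(1, len(probs) - 1)
               if probs[i] > probs[i - 1] and probs[i] > probs[i + 1])

bell = {1: 0.1, 2: 0.2, 3: 0.4, 4: 0.2, 5: 0.1}
skewed = {1: 0.5, 2: 0.3, 3: 0.15, 4: 0.05}
two_peaks = {1: 0.1, 2: 0.3, 3: 0.1, 4: 0.1, 5: 0.3, 6: 0.1}

print(is_symmetric(bell), count_peaks(bell))            # True 1
print(is_symmetric(skewed), count_peaks(skewed))        # False 1
print(is_symmetric(two_peaks), count_peaks(two_peaks))  # True 2
```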

Counting Rules

The Multiplication Principle

If an experiment can be characterized as a sequence of N steps with \(n_1\) possible results on the first step, \(n_2\) possible results on the second step, etc., then the total number of outcomes for the overall experiment is equal to the product of the number of results on each step, that is, \(n_1 \times n_2 \times \cdots \times n_N\).
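The principle can be verified by listing every outcome. The three-step experiment below (flip a coin, roll a die, flip a coin again) is a hypothetical example chosen for this sketch.

```python
import math
from itertools import product

# Number of possible results on each of the three steps.
results_per_step = [2, 6, 2]
total = math.prod(results_per_step)
print(total)   # 24

# Cross-check by enumerating every sequence of step results.
outcomes = list(product("HT", range(1, 7), "HT"))
print(len(outcomes) == total)   # True
```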

Combinations

A combination counts the number of experimental outcomes when n objects are to be selected from a larger set of N objects and the order of selection does not matter.
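As a quick sketch, the count can be obtained in three equivalent ways: with `math.comb`, with the standard formula N! / (n! (N - n)!) (not stated above, but the usual closed form), and by listing the selections. The values N = 5 and n = 2 are arbitrary choices for illustration.

```python
import math
from itertools import combinations

N, n = 5, 2   # select 2 objects from a set of 5

by_comb = math.comb(N, n)
by_formula = math.factorial(N) // (math.factorial(n) * math.factorial(N - n))
by_listing = len(list(combinations(range(N), n)))   # order within a selection is ignored

print(by_comb, by_formula, by_listing)   # 10 10 10
```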