Chapter 16: Common Probability Distributions

Common Probability Mass Functions (Discrete)
Common Probability Density Functions (Continuous)

Common Probability Mass Functions (Discrete)

Bernoulli Distribution

The Bernoulli distribution is the probability distribution of a discrete random variable that has only two possible outcomes, such as success or failure. This type of variable can be referred to as binary or dichotomous.

X = 0 is failure.
X = 1 is success.
p is the known probability of success.
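As a minimal sketch of the definition above, the Bernoulli mass function can be written out directly. The function name `bernoulli_pmf` and the value p = 0.3 are illustrative assumptions, not part of the text. The mean and variance are computed from the mass function by summing over the two outcomes; they agree with the binomial formulas np and np(1-p) when n = 1.

```python
def bernoulli_pmf(x: int, p: float) -> float:
    """P(X = x) for a Bernoulli variable with success probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if x == 1:
        return p          # success
    if x == 0:
        return 1.0 - p    # failure
    return 0.0            # any other value is impossible


p = 0.3  # hypothetical probability of success

# Mean and variance computed directly from the mass function.
mean = sum(x * bernoulli_pmf(x, p) for x in (0, 1))
variance = sum((x - mean) ** 2 * bernoulli_pmf(x, p) for x in (0, 1))

print(f"mean = {mean:.4f}, variance = {variance:.4f}")
```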

Binomial Distribution

The binomial distribution is the distribution of the number of successes in n trials, where each trial is a binary (Bernoulli) random variable.

The parameters of the binomial distribution are n and p, and the notation:

\(X \sim BIN(n, p)\)

is often used to indicate that X follows a binomial distribution for n trials with parameter p.

The binomial mass function f is given by:

\[f(x) = P(X=x) = \binom{n}{x} p^x(1-p)^{n-x}, \qquad x \in \{0,1,2,\ldots,n\}\]

where

\[\binom{n}{x} = \frac{n!}{x!(n-x)!}\]

is known as the binomial coefficient, or combination, and accounts for all the different orders in which you might observe x successes throughout n trials.

Note that \(n! = n (n-1) (n-2) \cdots (2)(1)\), so for example \(4! = 4 \cdot 3 \cdot 2 \cdot 1 = 24\). By convention \(0! = 1\), which is what makes the formula work for x = 0 and x = n.

X can take only the values 0, 1, …, n and represents the total number of successes.
p should be interpreted as “the probability of success at each trial.” Therefore, \(0 \le p \le 1\).
n > 0 is an integer interpreted as “the number of trials.”
Each of the n trials is a Bernoulli event with outcome success or failure, the trials are independent (in other words, the outcome of one doesn’t affect the outcome of any other), and p is constant across trials.

The mean of a binomial random variable is given by \(\mu = np\) and the variance by \(\sigma^2 = np(1-p)\).
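The mass function and the two moment formulas can be checked numerically. The sketch below is illustrative only: the function name `binomial_pmf` and the values n = 10, p = 0.25 are assumptions chosen for the example. It evaluates f(x) for every x from 0 to n, confirms the probabilities sum to 1, and compares the mean and variance computed from the mass function with np and np(1-p).

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ BIN(n, p), using the binomial coefficient comb(n, x)."""
    if not 0 <= x <= n:
        return 0.0  # outside the support {0, 1, ..., n}
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.25  # hypothetical number of trials and success probability

pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                        # should be 1
mean = sum(x * f for x, f in enumerate(pmf))            # compare with n * p
variance = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))  # n * p * (1 - p)

print(f"sum of probabilities = {total:.6f}")
print(f"mean = {mean:.6f}  (np = {n * p})")
print(f"variance = {variance:.6f}  (np(1-p) = {n * p * (1 - p)})")
```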

Poisson Distribution

The Poisson distribution is used to model a slightly more general, but just as important, discrete random variable — a count.

The Poisson mass function f is given by:

\[f(x) = P(X=x) = \frac{\lambda^{x}e^{-\lambda}}{x!},\qquad x \in \{0,1,2,\ldots\}\]

The notation \(X \sim POIS(\lambda)\) is often used to indicate that “X follows a Poisson distribution with parameter \(\lambda\).”

The events or items being counted are assumed to manifest independently of one another.
The entities, features, or events being counted occur independently in a well-defined interval at a constant rate.
X can take only non-negative integers: 0,1,…
\(\lambda\) should be interpreted as the “mean number of occurrences” and must therefore be finite and strictly positive.

The mean of a Poisson random variable is given by \(\mu = \lambda\) and the variance by \(\sigma^2 = \lambda\).
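A short numerical check of the Poisson mass function follows. The function name `poisson_pmf` and the value λ = 3 are assumptions made for illustration. Because the support is infinite, the sums are truncated at 60 terms, which is an approximation; for λ = 3 the omitted tail is negligible.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ POIS(lam)."""
    if lam <= 0:
        raise ValueError("lambda must be strictly positive")
    if x < 0:
        return 0.0  # counts cannot be negative
    return lam**x * exp(-lam) / factorial(x)


lam = 3.0            # hypothetical mean number of occurrences
support = range(60)  # truncation of the infinite support (an approximation)

total = sum(poisson_pmf(x, lam) for x in support)
mean = sum(x * poisson_pmf(x, lam) for x in support)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in support)

print(f"sum of probabilities = {total:.6f}")
print(f"mean = {mean:.6f}, variance = {variance:.6f}  (both should be close to {lam})")
```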

Other Mass Functions

The hypergeometric distribution is used to model sampling without replacement.
The multinomial distribution is a generalization of the binomial, where a success can occur in one of multiple categories at each trial.

Common Probability Density Functions (Continuous)

Uniform

The uniform distribution is a simple density function that describes a continuous random variable whose density is constant over its interval of possible values, so that subintervals of equal length are equally probable.

For a continuous random variable \(a \le X \le b\), the uniform density function f is:

\[f(x) = \left\{ \begin{array}{c l} \frac{1}{b-a} & a \le x \le b\\ 0 & \text{otherwise} \end{array}\right.\]

where a and b are parameters of the distribution defining the limits of the possible values X can take.

The mean of a uniform random variable is given by \(\mu = \frac{a+b}{2}\) and the variance by \(\sigma^2 = \frac{(b-a)^2}{12}\).
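Because the density is a constant 1/(b-a), the probability of any subinterval is just the area of a rectangle. The sketch below illustrates this; the helper names `uniform_pdf` and `uniform_prob` and the limits a = 2, b = 10 are assumptions for the example, and the interval probability formula is obtained by integrating the density rather than taken from the text.

```python
def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of a uniform random variable on [a, b]."""
    if a >= b:
        raise ValueError("require a < b")
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_prob(c: float, d: float, a: float, b: float) -> float:
    """P(c <= X <= d): rectangle area over the overlap of [c, d] and [a, b]."""
    lo, hi = max(a, c), min(b, d)
    return max(hi - lo, 0.0) / (b - a)


a, b = 2.0, 10.0  # hypothetical limits

mean = (a + b) / 2
variance = (b - a) ** 2 / 12
p_3_to_5 = uniform_prob(3.0, 5.0, a, b)  # width 2 out of total width 8

print(f"density inside [a, b] = {uniform_pdf(4.0, a, b)}")
print(f"mean = {mean}, variance = {variance:.4f}")
print(f"P(3 <= X <= 5) = {p_3_to_5}")
```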

Normal

The Normal distribution is one of the most important and commonly applied probability distributions in modeling continuous random variables. It is characterized by a distinctive “bell-shaped” curve, and it is also referred to as the Gaussian distribution.

Theoretically, X can take any value from \(-\infty\) to \(\infty\).
The parameters \(\mu\) and \(\sigma\) directly describe the mean and the standard deviation of the distribution and completely determine the shape of the distribution.
If you have a random variable \(X\sim N(\mu, \sigma)\), then you can create a new random variable \(Z = \frac{X - \mu}{\sigma}\), which satisfies \(Z \sim N(0, 1)\). This is known as standardization of X, and Z is known as a standard normal random variable: it has a mean of 0 and a standard deviation of 1.

For any Normal distribution:

Approximately 68% of data values are within 1 standard deviation of the mean
Approximately 95% of data values fall within 2 standard deviations of the mean
Approximately 99.7% of data values are within 3 standard deviations of the mean
The mean, median, and mode are equal.
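The 68%, 95%, and 99.7% figures above can be reproduced with Python's standard-library `statistics.NormalDist`. This is a sketch under assumed values: μ = 50 and σ = 8 are hypothetical, and the helper `prob_within` is introduced only for this example. The same percentages result for any choice of μ and σ > 0.

```python
from statistics import NormalDist

mu, sigma = 50.0, 8.0       # hypothetical mean and standard deviation
X = NormalDist(mu, sigma)


def prob_within(k: float) -> float:
    """P(mu - k*sigma <= X <= mu + k*sigma) as a difference of cumulative probabilities."""
    return X.cdf(mu + k * sigma) - X.cdf(mu - k * sigma)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")

# The median is the value with cumulative probability 0.5; it coincides with the mean.
median = X.inv_cdf(0.5)
print(f"mean = {mu}, median = {median}")
```

The printed proportions are roughly 0.683, 0.954, and 0.997.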

To calculate normal probabilities we will use a Normal Probability Table. Every table is different, but typically we are given cumulative probabilities, \(P(X \le x)\), that is, the area under the curve to the left of x.

Using a Normal Table

The margins represent z-scores, so you always need to convert to the standard Normal first. In a typical table, the row labels give the z-score to the nearest tenth and the column labels give the hundredths digit. First, locate the row for the z-score to the nearest tenth; then move along that row to the column for the hundredths digit. The value inside the table is the cumulative probability.
The process can also be reversed: given a probability, find it (or the closest value) in the body of the table and read off the associated z-score. The value of X that gives that probability can then be obtained by algebra, solving:

\[X = \sigma Z + \mu \]
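Both directions of the table lookup can be mimicked in software. In the sketch below, `NormalDist().cdf` plays the role of reading a cumulative probability from the table, and `NormalDist().inv_cdf` plays the role of the reverse lookup. The values μ = 100, σ = 15, x = 130, and the target probability 0.95 are hypothetical choices for illustration.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0   # hypothetical parameters of X
Z = NormalDist()          # standard normal: mean 0, standard deviation 1

# Forward lookup: convert x to a z-score, then read the cumulative probability.
x = 130.0
z = (x - mu) / sigma
p_left = Z.cdf(z)         # P(X <= x) = P(Z <= z)
print(f"z = {z:.2f}, P(X <= {x}) = {p_left:.4f}")

# Reverse lookup: start from a probability, find z, then solve X = sigma * Z + mu.
target = 0.95
z_star = Z.inv_cdf(target)
x_star = sigma * z_star + mu
print(f"z* = {z_star:.3f}, x* = {x_star:.2f}")
```

A printed table only resolves z to two decimal places, so values read from a table may differ slightly from these in the later digits.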

Student’s t-distribution

The Student’s t-distribution is a continuous probability distribution generally used when dealing with statistics estimated from a sample of data.

Any particular t-distribution looks a lot like the standard normal distribution—it’s bell-shaped, symmetric, and unimodal, and it’s centered on zero.

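The difference is that while a normal distribution is typically used to deal with a population, the t-distribution is used when working with a sample from a population, in particular when the population standard deviation is unknown and must be estimated from the sample.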

The t-distribution depends solely on the degrees of freedom, so called because this value represents the number of individual components in the calculation of a given statistic that are “free to change.”
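To see how the degrees of freedom shape the curve, the sketch below evaluates the t density for several values and compares it with the standard normal density. The density formula is the standard textbook expression, not something stated in this chapter, and the function name `t_pdf` is an assumption for the example. The log-gamma function is used so that large degrees of freedom do not overflow.

```python
import math
from statistics import NormalDist


def t_pdf(t: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    # log of the normalizing constant Gamma((df+1)/2) / (sqrt(df*pi) * Gamma(df/2))
    log_const = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_const - (df + 1) / 2 * math.log1p(t * t / df))


normal = NormalDist()  # standard normal for comparison

for df in (1, 5, 30, 1000):
    print(f"df = {df:>4}: density at 0 = {t_pdf(0.0, df):.5f}, at 3 = {t_pdf(3.0, df):.5f}")
print(f"normal : density at 0 = {normal.pdf(0.0):.5f}, at 3 = {normal.pdf(3.0):.5f}")
```

With few degrees of freedom the peak at zero is lower and the tails are heavier than the normal's; as the degrees of freedom grow, the t density approaches the standard normal density.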

Exponential

The exponential distribution can be used to model many continuous phenomena. In fact, this distribution describes the time between events in a Poisson process.

For a continuous random variable \(0 \le X < \infty\), the exponential density function f is:

\[f(x) = \lambda e^{-\lambda x}, \qquad x \ge 0\]

where \(\lambda\) is the rate parameter.

The mean of an exponential random variable is given by \(\mu = \frac{1}{\lambda}\) and the variance by \(\sigma^2 = \frac{1}{\lambda^2}\).

Probabilities for the exponential can be computed easily using the cumulative probability function:

\[F(x) = P(X \le x) = 1-e^{-\lambda x}\]
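The cumulative probability function makes interval probabilities a matter of subtraction. The sketch below assumes a hypothetical rate λ = 0.5 (for instance, one event every two time units on average); the function name `exponential_cdf` is likewise illustrative.

```python
from math import exp


def exponential_cdf(x: float, lam: float) -> float:
    """F(x) = P(X <= x) for an exponential random variable with rate lam."""
    if lam <= 0:
        raise ValueError("rate must be strictly positive")
    return 1.0 - exp(-lam * x) if x >= 0 else 0.0


lam = 0.5  # hypothetical rate parameter

mean = 1 / lam
variance = 1 / lam**2

p_at_most_2 = exponential_cdf(2.0, lam)                              # P(X <= 2)
p_between_1_and_3 = exponential_cdf(3.0, lam) - exponential_cdf(1.0, lam)  # P(1 < X <= 3)
p_more_than_4 = 1.0 - exponential_cdf(4.0, lam)                      # P(X > 4)

print(f"mean = {mean}, variance = {variance}")
print(f"P(X <= 2)     = {p_at_most_2:.4f}")
print(f"P(1 < X <= 3) = {p_between_1_and_3:.4f}")
print(f"P(X > 4)      = {p_more_than_4:.4f}")
```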

Other Density Functions

The chi-squared distribution is often related to operations concerning sample variances of normally distributed data.
The F-distribution is used to model ratios of two chi-squared random variables and is useful in, for example, regression problems.
The gamma distribution is a generalization of both the exponential and chi-squared distributions.
The beta distribution is a generalization of the uniform and it is used to model a variable defined on a finite interval.