Probability Distribution

Bijay Lal Pradhan

A probability distribution describes the likelihood of different outcomes or events in a given situation or experiment. It assigns probabilities to each possible outcome, with the total probabilities summing up to 1. Probability distributions can be classified into two main types: discrete probability distributions and continuous probability distributions.

Discrete Probability Distribution:

In a discrete probability distribution, the random variable can take on a finite or countably infinite number of distinct values. Each value has an associated probability. Common examples of discrete probability distributions include:

Uniform Distribution: Each outcome is equally likely. For example, rolling a fair six-sided die or drawing a card from a standard deck.
Binomial Distribution: Describes the number of successes in a fixed number of independent Bernoulli trials, where each trial has only two possible outcomes (e.g., success or failure). Parameters of the distribution include the number of trials (n) and the probability of success (p).
Poisson Distribution: Models the number of events occurring in a fixed interval of time or space, given a known average rate of occurrence (λ).
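
The three discrete distributions above can be evaluated directly in R. The following is a minimal sketch; the parameter values (n = 4, p = 0.5, λ = 2) are illustrative assumptions, not values taken from the text.

```r
# Discrete uniform: a fair six-sided die, each face has probability 1/6
die_pmf <- rep(1 / 6, 6)
names(die_pmf) <- 1:6

# Binomial: probability of exactly 2 successes in n = 4 trials with p = 0.5
# (n and p are assumed values for illustration)
p_binom <- dbinom(2, size = 4, prob = 0.5)

# Poisson: probability of exactly 3 events when the average rate is lambda = 2
# (lambda is an assumed value for illustration)
p_pois <- dpois(3, lambda = 2)

print(die_pmf)
print(p_binom)  # 0.375
print(p_pois)
```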

Continuous Probability Distribution:

In a continuous probability distribution, the random variable can take on any value within a specified range. The probability of any single exact value is zero, so probabilities are assigned to intervals rather than to specific values. Common examples of continuous probability distributions include:

Normal Distribution: Also known as the Gaussian distribution, it is characterized by a bell-shaped curve symmetric around the mean. The distribution is fully determined by two parameters: the mean (μ) and the standard deviation (σ). Many natural phenomena follow a normal distribution.
Exponential Distribution: Describes the time between events in a Poisson process, where events occur continuously and independently at a constant average rate (λ). It is often used to model waiting times or lifetimes of objects.
Uniform Distribution: In its continuous form, it represents a constant probability density over a specified interval. For example, a continuous random variable uniformly distributed between 0 and 1.
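
For continuous distributions, probabilities are attached to intervals, which in R means taking differences of the cumulative functions `pnorm`, `pexp` and `punif`. The sketch below uses assumed parameters (a standard normal, rate 2, and the interval 0 to 1) purely for illustration.

```r
# Normal (mean 0, sd 1): probability of falling within one sd of the mean
p_norm <- pnorm(1, mean = 0, sd = 1) - pnorm(-1, mean = 0, sd = 1)

# Exponential (assumed rate = 2): probability the waiting time is at most 1
p_exp <- pexp(1, rate = 2)

# Uniform on [0, 1]: probability of falling between 0.25 and 0.75
p_unif <- punif(0.75, min = 0, max = 1) - punif(0.25, min = 0, max = 1)

print(p_norm)  # about 0.6827
print(p_exp)   # 1 - exp(-2)
print(p_unif)  # 0.5
```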

Properties of Probability Distributions:

Probability Mass Function (PMF): For discrete distributions, the PMF gives the probability that a discrete random variable is exactly equal to a certain value.
Probability Density Function (PDF): For continuous distributions, the PDF gives the relative likelihood (density) of each value. The probability that a continuous random variable falls within a particular range is the area under the PDF over that range.
Cumulative Distribution Function (CDF): Both discrete and continuous distributions have a CDF, which gives the probability that a random variable takes on a value less than or equal to a given value.
Expectation (Mean): The expected value of a random variable, often denoted as E(X), represents the long-term average outcome of a random experiment.
Variance and Standard Deviation: Measures of the spread or dispersion of the distribution around its mean. Variance (Var(X)) is the average of the squared deviations from the mean, while standard deviation (σ) is the square root of the variance.
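
These properties can be computed by hand for a small discrete distribution. The sketch below uses a fair six-sided die (the uniform example mentioned earlier); the variable names are chosen here for illustration.

```r
x   <- 1:6             # possible values of a fair die
pmf <- rep(1 / 6, 6)   # PMF: P(X = x) for each value

cdf <- cumsum(pmf)     # CDF: P(X <= x), running total of the PMF

mu       <- sum(x * pmf)            # expectation E(X)
variance <- sum((x - mu)^2 * pmf)   # Var(X): mean squared deviation from mu
sigma    <- sqrt(variance)          # standard deviation

print(cdf)
print(mu)        # 3.5
print(variance)  # 35/12, about 2.917
print(sigma)
```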

Probability distributions are fundamental in statistics and data analysis, providing a framework for modeling uncertainty and making predictions about random phenomena.

Example: Four births

Consider the different possible orderings of boy (B) and girl (G) in four sequential births. There are 2*2*2*2 = 2^4 = 16 possibilities, so the sample space is:

BBBB    BGBB    GBBB    GGBB

BBBG    BGBG    GBBG    GGBG

BBGB    BGGB    GBGB    GGGB

BBGG    BGGG    GBGG    GGGG

If girl and boy are each equally likely [P(G) = P(B) = 1/2], and the gender of each child is independent of that of the previous child, then the probability of each of these 16 possibilities is:

(1/2)(1/2)(1/2)(1/2) = 1/16.
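
The sample space above can also be generated rather than written out. This is a small sketch using `expand.grid`; the column names are chosen here for illustration.

```r
# Enumerate every ordering of boy (B) and girl (G) over four births
births <- expand.grid(
  first  = c("B", "G"),
  second = c("B", "G"),
  third  = c("B", "G"),
  fourth = c("B", "G"),
  stringsAsFactors = FALSE
)

# Collapse each row into a label such as "BGGB"
outcomes <- paste0(births$first, births$second, births$third, births$fourth)

n_outcomes <- length(outcomes)  # 16 equally likely outcomes
p_each     <- 0.5^4             # (1/2)(1/2)(1/2)(1/2) = 1/16

print(sort(outcomes))
print(n_outcomes)
print(p_each)  # 0.0625
```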

Random Variable

Now count the number of girls in each sequence of births:

BBBB (0)    BGBB (1)    GBBB (1)    GGBB (2)

BBBG (1)    BGBG (2)    GBBG (2)    GGBG (3)

BBGB (1)    BGGB (2)    GBGB (2)    GGGB (3)

BBGG (2)    BGGG (3)    GBGG (3)    GGGG (4)

Notice that:

each possible outcome is assigned a single numeric value,
all outcomes are assigned a numeric value, and
the value assigned varies over the outcomes.

The count of the number of girls is a random variable:

A random variable, say X, is an uncertain quantity whose value depends on chance.

Since the random variable X = 3 when any of the four outcomes BGGG, GBGG, GGBG, or GGGB occurs,

P(X = 3) = P(BGGG) + P(GBGG) + P(GGBG) + P(GGGB) = 4/16

The probability distribution of a random variable is a list of the possible values of the random variable and their associated probabilities.

Number of girls (x)	P(X = x)
0	1/16
1	4/16
2	6/16
3	4/16
4	1/16
Total	16/16 = 1
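
The table can be reproduced by counting girls in each of the 16 outcomes. The following is a sketch only; it also compares the result with R's `dbinom`, on the assumption that a girl counts as a "success" with p = 1/2.

```r
# All 16 orderings of B and G over four births
births <- expand.grid(rep(list(c("B", "G")), 4), stringsAsFactors = FALSE)

# X = number of girls in each outcome
girls <- rowSums(births == "G")

# Tally the outcomes for each value of X and convert to probabilities
counts <- table(girls)
probs  <- counts / nrow(births)

print(counts)  # 1 4 6 4 1
print(probs)

# Same probabilities from the built-in binomial PMF
print(dbinom(0:4, size = 4, prob = 0.5))
```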

Rolling two dice

Sum of the faces of the two dice

First Die/Second Die	1	2	3	4	5	6
1	2	3	4	5	6	7
2	3	4	5	6	7	8
3	4	5	6	7	8	9
4	5	6	7	8	9	10
5	6	7	8	9	10	11
6	7	8	9	10	11	12

Frequency diagram
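
A frequency diagram of the sums can be drawn from the 6 x 6 table of sums. This sketch assumes two fair dice; the axis labels are chosen here for illustration.

```r
# 6 x 6 table of sums: rows are the first die, columns the second die
sums <- outer(1:6, 1:6, "+")

# Frequency of each possible sum (2 to 12) among the 36 outcomes
freq  <- table(as.vector(sums))
probs <- freq / 36

print(freq)   # 1 2 3 4 5 6 5 4 3 2 1
print(probs)

# Bar chart of the frequencies
barplot(freq, xlab = "Sum of the two dice", ylab = "Frequency")
```

The most frequent sum is 7, which occurs in 6 of the 36 outcomes.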

Binomial Distribution

The probability of a given sequence of x successes out of n trials with probability of success p and probability of failure q is equal to:

\[ p^{x} q^{(n-x)} \]

The number of different sequences of n trials that result in exactly x successes is equal to the number of choices of x elements out of a total of n elements. This number is denoted by:

\({}^{n}C_{x}\), also written \({n \choose x}\)
So the probability of exactly x successes in n trials is given by \[ \boxed{{n \choose x} p^{x} q^{(n-x)}} \]
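
As a check against the four-births example, take a girl as a "success" (an assumption made here for illustration), so that n = 4, x = 3 and p = q = 1/2. The formula then gives the same value found earlier by counting outcomes:

\[ P(X = 3) = {4 \choose 3} \left(\tfrac{1}{2}\right)^{3} \left(\tfrac{1}{2}\right)^{1} = 4 \times \tfrac{1}{16} = \tfrac{4}{16} \]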

Consider a Bernoulli Process in which we have a sequence of n identical trials satisfying the following conditions:

Each trial has two possible outcomes, called success and failure. The two outcomes are mutually exclusive and exhaustive.
The probability of success, denoted by p, remains constant from trial to trial. The probability of failure is denoted by q, where q = 1-p.
The n trials are independent. That is, the outcome of any trial does not affect the outcomes of the other trials.

A random variable, X, that counts the number of successes in n Bernoulli trials, where p is the probability of success* in any given trial, is said to follow the binomial probability distribution with parameters n (number of trials) and p (probability of success). We call X the binomial random variable, and its probability is given by: \[ \boxed{P(X=x)={n \choose x} p^{x} q^{(n-x)}} \]

* The terms success and failure are simply statistical terms, and do not have positive or negative implications.  In a production setting, finding a defective product may be termed a “success,” although it is not a positive result.

Binomial probability distribution:

\[\boxed{ P(x) = {n \choose x} p^x q^{(n-x)}} \]

where:

\(p\) is the probability of success in a single trial,

\(q = 1-p\) is the probability of failure in a single trial,

\(n\) is the number of trials, and

\(x\) is the number of successes.
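
The formula translates almost symbol for symbol into R. In the sketch below, `binom_prob` is a hypothetical helper written for illustration (it is not a built-in function), and its output is compared with R's own `dbinom` for n = 4 and p = 0.5.

```r
# Hypothetical helper: binomial probability computed from the formula
binom_prob <- function(x, n, p) {
  q <- 1 - p                       # probability of failure
  choose(n, x) * p^x * q^(n - x)   # (n choose x) * p^x * q^(n - x)
}

by_formula <- binom_prob(0:4, n = 4, p = 0.5)
built_in   <- dbinom(0:4, size = 4, prob = 0.5)

print(by_formula)                       # 0.0625 0.2500 0.3750 0.2500 0.0625
print(all.equal(by_formula, built_in))  # TRUE
```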

Mean and Variance of Binomial Distribution

Mean of a binomial distribution: \(\mu = E(X) = np\)

Variance of a binomial distribution: \(\sigma^2 = V(X) = npq\)

Standard deviation of a binomial distribution: \(\sigma = SD(X) = \sqrt{npq}\)

If H represents the number of heads in five tosses of a fair coin, then n = 5 and p = q = 0.5, so:

\(\mu = E(H) = 5 \times 0.5 = 2.5\)

\(\sigma^2 = Var(H) = 5 \times 0.5 \times 0.5 = 1.25\)

\(\sigma = SD(H) = \sqrt{Var(H)} = \sqrt{1.25} \approx 1.118\)
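
The same values follow from the general definitions of mean and variance applied to the binomial probabilities. This is a minimal check in R for the five-toss example:

```r
h <- 0:5                               # possible numbers of heads
p <- dbinom(h, size = 5, prob = 0.5)   # P(H = h) for each value

mu       <- sum(h * p)            # E(H), should equal n * p
variance <- sum((h - mu)^2 * p)   # Var(H), should equal n * p * q
sigma    <- sqrt(variance)

print(mu)               # 2.5
print(variance)         # 1.25
print(round(sigma, 3))  # 1.118
```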

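To find an expected frequency, multiply the probability of the outcome by the number of times the experiment is repeated (for example, the number of sets of tosses observed):

Expected frequency = N * P(x), where N is the number of repetitions.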
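
As an illustration of this multiplication, suppose 160 families with four children each are observed (the figure 160 is an assumption made for this sketch, not a value from the text). The expected number of families with exactly two girls is:

```r
n_families  <- 160                               # assumed number of families
p_two_girls <- dbinom(2, size = 4, prob = 0.5)   # P(X = 2) = 6/16

expected_families <- n_families * p_two_girls

print(expected_families)  # 60
```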

Some Problems

Among families with 4 children each, find the probability that a family has: (a) no boy; (b) exactly 2 boys; (c) at least 2 boys; (d) at most 2 boys; (e) not less than 3 boys; (f) less than 4 boys; (g) between 1 and 4 boys. [Ans: (a) 0.0625 (b) 0.375 (c) 0.6875 (d) 0.6875 (e) 0.3125 (f) 0.9375 (g) 0.625]
In a tossing of four unbiased coins, determine the probability of obtaining: (a) at least one head (b) two or more heads; (c) one head and three tails. [Ans: (a) 0.9375 (b) 0.6875 (c) 0.25]
A consignment contains 20 items, and each item has a 20% chance of being defective. What is the probability of getting: (a) no defective; (b) 12 non-defectives; (c) not less than 2 defectives; (d) not more than 2 defectives? [Ans: (a) 0.0115 (b) 0.0222 (c) 0.9308 (d) 0.2061]
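
Phrases such as "at least" and "at most" map onto `dbinom` and `pbinom` as sketched below for the four-children problem. The sketch assumes boys and girls are equally likely (p = 0.5), which is what the stated answers imply.

```r
n <- 4     # children per family
p <- 0.5   # assumed probability that a child is a boy

no_boy         <- dbinom(0, size = n, prob = p)       # P(X = 0)
exactly_two    <- dbinom(2, size = n, prob = p)       # P(X = 2)
at_least_two   <- 1 - pbinom(1, size = n, prob = p)   # P(X >= 2) = 1 - P(X <= 1)
at_most_two    <- pbinom(2, size = n, prob = p)       # P(X <= 2)
less_than_four <- pbinom(3, size = n, prob = p)       # P(X < 4) = P(X <= 3)

print(c(no_boy, exactly_two, at_least_two, at_most_two, less_than_four))
# 0.0625 0.3750 0.6875 0.6875 0.9375
```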

A box contains 100 items of which 10 items are defective. Six items are randomly selected in succession. What is the probability of getting fewer than three defective items? [Ans: 0.9842]
The average failure rate in a certain test is 60%. If seven students take the test, what is the chance that at least 5 of them pass? [Ans: 0.0963]
In 400 sets of eight tosses of a fair coin, in how many sets do we expect to have: (a) 5 heads and 3 tails; (b) at least 6 heads? [Ans: (a) 87.5 sets (b) 57.81 sets]

From past experience, a stock broker finds that 70% of the telephone calls he receives during business hours are orders and the remainder are for other business. What is the probability that, out of the first 8 telephone calls during a day:

(i) exactly 5 calls are order calls? (ii) at least 6 are order calls?

In RStudio, use dbinom(k, size = n, prob = p) for the probability of exactly k successes; for cumulative probabilities, use pbinom.
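
For the stock broker problem, the two functions could be applied as follows. This is a sketch that treats an order call as a "success" with n = 8 and p = 0.7.

```r
n <- 8     # first 8 telephone calls
p <- 0.7   # probability that a call is an order

# (i) exactly 5 order calls: P(X = 5)
exactly_5 <- dbinom(5, size = n, prob = p)

# (ii) at least 6 order calls: P(X >= 6) = 1 - P(X <= 5)
at_least_6 <- 1 - pbinom(5, size = n, prob = p)

print(round(exactly_5, 4))   # 0.2541
print(round(at_least_6, 4))  # 0.5518
```

`pbinom(5, size = n, prob = p, lower.tail = FALSE)` gives the same upper-tail probability without the subtraction.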