MT5762 Lecture 6

C. Donovan

Probability and probability functions

  • Further uses of probability functions to deal rationally with uncertainty.

Expectations of the discrete Random Variable (RV)

Suppose Dragos and Christopher bet on the outcome of a coin flip: if H, then Dragos pays Christopher $1; if T, then Christopher pays Dragos $1. Suppose Dragos insists on using his coin, which, unknown to Christopher, is biased in favor of tails: the probabilities of H and T are 0.4 and 0.6.

Let \( X \) be the net gain for Christopher after a single flip. Our PMF for \( X \) is:

\( X \)        1      -1
\( \Pr(X) \)   0.4    0.6

What do you expect Christopher's net gain to be after a single flip? (Hint: think of 100 flips, how many should he win, how many should he lose?)

Expectation for a discrete RV

Expectation of X

In general, for a discrete RV \( X \), the Expected Value of \( X \), denoted E[\( X \)] or \( \mu_X \), is

\[ E(X) = \sum_x x \Pr(X=x) \]
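As a quick check of this formula, the coin-flip bet above can be evaluated in R. This is a minimal sketch; the variable names `x`, `p` and `expected_gain` are ours, not part of the lecture.

```r
# PMF of Christopher's net gain from one flip of the biased coin
x <- c(1, -1)      # possible net gains: win $1 on H, lose $1 on T
p <- c(0.4, 0.6)   # Pr(H), Pr(T)

# E(X) = sum over x of x * Pr(X = x)
expected_gain <- sum(x * p)
expected_gain      # -0.2
```

So Christopher should expect to lose 20 cents per flip on average, or about $20 over 100 flips.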

Example

Expected sum of a roll of 2 dice. Possible values of \( X \) and their probabilities (rounded to 3 decimals):

\( x \)          2      3      4      5      6      7      8      9      10     11     12
\( \Pr(X=x) \)   0.028  0.056  0.083  0.111  0.139  0.167  0.139  0.111  0.083  0.056  0.028

\[ E(X) = 2\times 0.028 + 3\times 0.056 + 4\times 0.083 + \ldots + 12\times 0.028 = 7 \]
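The probabilities above are rounded, so the hand calculation is only approximately 7. A sketch of the exact calculation, assuming two fair six-sided dice (the names `sums`, `pmf`, `x` and `dice_mean` are illustrative):

```r
# all 36 equally likely outcomes of two fair dice, as a 6 x 6 matrix of sums
sums <- outer(1:6, 1:6, "+")

# exact PMF of the sum: count of each total divided by 36
pmf <- table(sums) / 36
x <- as.numeric(names(pmf))   # possible values 2, 3, ..., 12

round(pmf, 3)                 # matches the rounded table above
dice_mean <- sum(x * pmf)     # E(X)
dice_mean                     # 7
```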

Example

Special case: a Bernoulli RV, whose only possible outcomes are 0 and 1.

\( X \)        0        1
\( \Pr(X) \)   (1-p)    p

Then \( E[X]= \)?

\[ E(X) = 0\times(1-p) + 1\times p = p \]

Simple transforms on RV X

Special case: linear transformations of \( X \). Let \( a \) and \( b \) be real numbers.

\[ \begin{eqnarray*} Y = aX & : & E(Y) = aE(X) \\ Y = X+b & : & E(Y) = E(X)+b \\ Y = aX+b & : & E(Y) = aE(X)+b \end{eqnarray*} \]

Expectation is also additive for any two random variables, whether or not they are independent:

\[ E[X+Y] = E[X]+E[Y]. \]
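The linear-transformation rule can be checked numerically. A small sketch, using the sum of two fair dice as \( X \) and illustrative constants a = 2 and b = 3 (our choice, not from the lecture):

```r
x <- 2:12
p <- c(1:6, 5:1) / 36     # exact PMF of the sum of two fair dice
a <- 2
b <- 3

lhs <- sum((a * x + b) * p)   # E(aX + b), computed directly from the PMF
rhs <- a * sum(x * p) + b     # a E(X) + b
c(lhs, rhs)                   # both 17
```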

Variances of a discrete RV

  • Two RVs \( X_1 \) and \( X_2 \) may have the same expectations but differ considerably in terms of the range of variability.
  • To quantify the degree of variation in a PMF, use Variance of \( X \), V(\( X \)) or \( \sigma^2_X \).

This is the Expected Squared Deviation about \( E(X) \) and is defined by:

Variance of X

\[ \begin{eqnarray*} V(X) = \sum_x (x-E(X))^2 \Pr(X=x) \end{eqnarray*} \]

Variances of the discrete RV

Standard Deviation of X

Relatedly, the standard deviation of \( X \), SD(\( X \)) or \( \sigma_X \):

\[ \begin{eqnarray*} SD(X) = \sqrt{V(X)} \end{eqnarray*} \]

Example

\( X \) = sum of 2 dice (E\( (X) \)=7):

\[ \begin{align*} V(X) &= \sum_{x=2}^{12} (x-7)^2 \Pr(X=x) \\ &= (2-7)^2\times 0.028 + (3-7)^2\times 0.056 + \ldots + (12-7)^2\times 0.028 \\ &= 5.83 \end{align*} \]
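The same calculation in R, as a sketch that assumes two fair dice and uses the exact PMF rather than the rounded one (variable names are ours):

```r
x <- 2:12
p <- c(1:6, 5:1) / 36          # exact PMF of the sum of two fair dice

mu <- sum(x * p)               # E(X) = 7
v  <- sum((x - mu)^2 * p)      # V(X) = 35/6, about 5.83
s  <- sqrt(v)                  # SD(X), about 2.42
c(mu, v, s)
```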

V[X+Y] or V[X-Y]

  • If \( X \) and \( Y \) are independent, \( V[X+Y] = V[X-Y] = V[X]+V[Y] \)
  • If not, \( V[X+Y] = V[X]+V[Y]+2 Cov[X,Y] \) and \( V[X-Y] = V[X] + V[Y] - 2 Cov[X,Y] \), where

Covariance

\[ Cov[X,Y] = \sum_x \sum_y (x-E[X]) (y-E[Y]) p(x,y) \]

Covariance (Cov) measures the degree of linear association between two random variables: it is positive when they tend to move together and negative when they tend to move in opposite directions.

Correlation and covariance

An alternate way to write the formula involves using a correlation coefficient, \( \rho \), which may be more familiar.

Correlation

\[ \rho = \frac{Cov[X,Y]}{SD(X) \times SD(Y)} \]

Note: \( -1 \le \rho \le 1 \). When \( \rho \)=1, \( X \) and \( Y \) have a perfect positive linear relationship; when \( \rho \)=-1, a perfect negative one.

Correlation and covariance

Thus

\[ Cov[X,Y] = \rho ~ SD(X) ~ SD(Y) \]

and

\[ V[X+Y] = V[X]+V[Y]+2 \rho SD(X) SD(Y) \]

\[ V[X-Y] = V[X]+V[Y]-2 \rho SD(X) SD(Y) \]
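To make the covariance and correlation formulas concrete, here is a sketch using a small joint PMF. The joint probabilities in `pxy` are hypothetical, chosen purely for illustration, and all variable names are ours.

```r
# hypothetical joint PMF p(x, y) for two 0/1 variables (rows: x, columns: y)
x <- c(0, 1)
y <- c(0, 1)
pxy <- matrix(c(0.4, 0.1,
                0.1, 0.4), nrow = 2, byrow = TRUE)

px <- rowSums(pxy)                 # marginal PMF of X
py <- colSums(pxy)                 # marginal PMF of Y
ex <- sum(x * px)                  # E[X]
ey <- sum(y * py)                  # E[Y]
vx <- sum((x - ex)^2 * px)         # V[X]
vy <- sum((y - ey)^2 * py)         # V[Y]

# Cov[X,Y] = sum_x sum_y (x - E[X]) (y - E[Y]) p(x, y)
cov_xy <- sum(outer(x - ex, y - ey) * pxy)
rho <- cov_xy / (sqrt(vx) * sqrt(vy))

# V[X+Y] computed directly from the joint PMF, and via the formula
s <- outer(x, y, "+")
v_sum_direct  <- sum((s - (ex + ey))^2 * pxy)
v_sum_formula <- vx + vy + 2 * cov_xy

c(cov_xy, rho, v_sum_direct, v_sum_formula)   # 0.15, 0.6, 0.8, 0.8
```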

Example

A time and motion study measures the time required for an assembly line worker to perform a repetitive task.

The data show that:

  • The time required to bring a part from a bin to its position on an automobile chassis varies from car to car with mean 11 seconds and standard deviation 2 seconds.
  • The time required to attach the part to the chassis varies with mean 20 seconds and standard deviation 4 seconds.

Example

  • What is the mean time required for the entire operation of positioning and attaching the part?

  • Let \( X \)=time to position part and \( Y \)=time to attach part.

\[ E[X+Y] = E[X]+E[Y] = 11 + 20 = 31 \]

Example

  • If the variation in the worker's performance is reduced by better training, the standard deviations will decrease. Will this decrease change the mean found previously, if the mean times for the two steps remain as before?

  • No. Changing the standard deviations does not change the means.

Example

  • The study finds that the times required for the two steps are independent. A part that takes a long time to position, for example, does not take more or less time to attach than other parts.

  • What is the standard deviation for the time to position and attach the part?

\[ \begin{align*} V[X+Y] &= V[X]+V[Y] = 2^2+4^2 = 20 \\ SD[X+Y] &= \sqrt{20} = 4.47 \mbox{ seconds} \end{align*}. \]

Example

  • How would your previous answer change if the variables were dependent with a correlation coefficient of 0.8?

\[ \begin{align*} V[X+Y] &= V[X]+V[Y]+2\rho SD[X] SD[Y] \\ &= 2^2+4^2 + 2*0.8*2*4 = 32.8 \\ SD[X+Y] &= \sqrt{32.8} = 5.73 \mbox{ seconds} \end{align*}. \]
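Both versions of this example can be reproduced with one small helper. This is a sketch; `sd_sum` is a hypothetical function name introduced here, not something from the lecture.

```r
mean_x <- 11; sd_x <- 2    # positioning the part
mean_y <- 20; sd_y <- 4    # attaching the part

# SD of X + Y for a given correlation rho (rho = 0 covers the independent case)
sd_sum <- function(rho) sqrt(sd_x^2 + sd_y^2 + 2 * rho * sd_x * sd_y)

mean_x + mean_y   # 31, whatever the correlation
sd_sum(0)         # about 4.47 seconds
sd_sum(0.8)       # about 5.73 seconds
```

Positive correlation inflates the spread of the total: slow positioning tends to coincide with slow attaching, so the two steps no longer partly cancel out.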

Specific RV distributions

Binomial RV

Zero or one RV

Let \( X_1 \), \( X_2 \), \( \ldots \), \( X_n \) be \( n \) independent Bernoulli RVs with the same probability of success \( p \) (note these have a 0 or 1 outcome, i.e. fail or success). Define a new random variable

\[ \begin{eqnarray*} Y = \sum_{i=1}^n X_i \end{eqnarray*} \]

\( Y \) is called a Binomial RV. This is indicated by the notation: \( Y \sim \mbox{Binomial}(n,p) \).

Example

Consider a game of darts. Assume the probability of hitting the bull's-eye is ¼.

  • Assuming that one throw has no effect on any other throw, what is the probability of hitting the bull's-eye exactly once in 4 attempts?
  • Here \( n= 4 \) and \( p = 0.25 \).
  • How many ways can I get 1 bull's-eye in 4 throws?

Example

\[ {4 \choose 1} = \frac{4!}{1! 3!} = 4 \]

Write down each of the ways and the probability for each way.

outcome   under independence   resulting probability
SFFF      ¼ * ¾ * ¾ * ¾        \( (1/4)^1 (3/4)^3 \) = 0.1055
FSFF      ¾ * ¼ * ¾ * ¾        \( (1/4)^1 (3/4)^3 \) = 0.1055
FFSF      ¾ * ¾ * ¼ * ¾        \( (1/4)^1 (3/4)^3 \) = 0.1055
FFFS      ¾ * ¾ * ¾ * ¼        \( (1/4)^1 (3/4)^3 \) = 0.1055

Example

What is the probability of exactly 1 bull's-eye in 4 throws?

\[ 4 \times (1/4)^1 \times (3/4)^3 = 0.4219 \]

This suggests a shortcut:

\[ \begin{eqnarray*} \Pr(Y=1) = {4 \choose 1} 0.25^1 (1-0.25)^{4-1} \end{eqnarray*} \]
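The counting argument can be checked by brute force: list all 16 success/failure sequences for 4 throws and add up the probabilities of those with exactly one success. This is a sketch with our own variable names, coding a hit as 1 and a miss as 0.

```r
p <- 0.25

# every sequence of 4 throws: one row per sequence, 1 = hit, 0 = miss
outcomes <- expand.grid(rep(list(c(0, 1)), 4))
n_hits <- rowSums(outcomes)

# under independence, each sequence has probability p^hits * (1 - p)^misses
prob <- p^n_hits * (1 - p)^(4 - n_hits)

n_ways <- sum(n_hits == 1)           # 4 sequences with exactly one hit
p_one  <- sum(prob[n_hits == 1])     # 0.421875
c(n_ways, p_one)
```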

Binomial PMF

  • In general if \( Y \) \( \sim \) Binomial(\( n \),\( p \)),

\[ \begin{eqnarray*} \Pr(Y=k) = {n \choose k} p^k (1-p)^{n-k} \end{eqnarray*} \]

  • Of course we don't do this by hand; R provides the binomial PMF as dbinom().
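The general formula translates almost symbol for symbol into R. In this sketch `binom_pmf` is a hypothetical helper written for illustration; `choose()` and `dbinom()` are the built-in functions.

```r
# Pr(Y = k) for Y ~ Binomial(n, p), written out from the formula
binom_pmf <- function(k, n, p) choose(n, k) * p^k * (1 - p)^(n - k)

k <- 0:4
by_hand  <- binom_pmf(k, n = 4, p = 0.25)
built_in <- dbinom(k, size = 4, prob = 0.25)

all.equal(by_hand, built_in)   # TRUE
by_hand[k == 1]                # 0.421875, the darts answer
sum(by_hand)                   # 1, as any PMF must
```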

Looking at some binomial distributions

The binomial distribution has two parameters:

  • The number of trials
  • The probability of success

Together these define a PMF, whose shape changes with both parameters.

Looking at some binomial distributions

  # plot a couple of binomial PMFs

  # stack them
  par(mfrow = c(2,1))

  x <- 0:5

  # first 5 trials, 0.5 prob of success
  barplot(dbinom(x, 5, 0.5), names.arg = x, col = 'slateblue4')

  # now with a higher probability of success
  barplot(dbinom(x, 5, 0.9), names.arg = x, col = 'slateblue4')

[Figure: stacked bar plots of the binomial PMFs for 5 trials, with success probability 0.5 (top) and 0.9 (bottom)]

Looking at some binomial distributions

  # plot a couple of binomial PMFs

  # stack them
  par(mfrow = c(2,1))

  x <- 0:20

  # first 20 trials, 0.2 prob of success
  barplot(dbinom(x, 20, 0.2), names.arg = x, col = 'slateblue4')

  # now with a higher probability of success
  barplot(dbinom(x, 20, 0.6), names.arg = x, col = 'slateblue4')

[Figure: stacked bar plots of the binomial PMFs for 20 trials, with success probability 0.2 (top) and 0.6 (bottom)]

Examples

# probability of observing 2 successes from 5 trials
# where the probability of success is 0.3
  dbinom(2, 5, 0.3)   
[1] 0.3087
# probability of observing up to 2 successes from 5 trials
# where the probability of success is 0.3
  dbinom(0, 5, 0.3) + dbinom(1, 5, 0.3) + dbinom(2, 5, 0.3) 
[1] 0.83692
# or more directly - using the CDF
  pbinom(2, 5, 0.3)  
[1] 0.83692

Binomial expected value & variance

  • \( E(Y) \) and \( V(Y) \).

\[ \begin{eqnarray*} E(Y) = np \\ V(Y) = np(1-p) \\ SD(Y) = \sqrt{np(1-p)} \\ \end{eqnarray*} \]

Example \( Y \) \( \sim \) Binomial(50,0.8).

\[ \begin{eqnarray*} E(Y) = 50\times 0.8 = 40 \\ V(Y) = 50\times 0.8\times 0.2 = 8 \\ SD(Y) = \sqrt{8}=2.83 \\ \end{eqnarray*} \]
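These shortcut formulas agree with the general definitions of expectation and variance applied to the binomial PMF. A sketch for the Binomial(50, 0.8) example (variable names are ours):

```r
n <- 50
p <- 0.8
y <- 0:n
pmf <- dbinom(y, size = n, prob = p)

e_y <- sum(y * pmf)              # E(Y) from the definition: 40
v_y <- sum((y - e_y)^2 * pmf)    # V(Y) from the definition: 8

c(e_y, n * p)                    # both 40
c(v_y, n * p * (1 - p))          # both 8
sqrt(v_y)                        # about 2.83
```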

Example

Assume that 13% of people are left-handed and therefore slightly evil, as we all know.

  • The Latin (and Italian, at least) word for left is sinistra, which gives us today's sinister.
  • The English word left derives from the Old English lyft, meaning weak.
  • Black magic is the left-hand path.
  • Etc…

See Wikipedia: https://en.wikipedia.org/wiki/Bias_against_left-handed_people

Example

Assume that 13% of people are left-handed. If we select 5 people at random, find the probability of each outcome described below.

  • The first sinister person is the fifth chosen.

\( 0.87^4 \times 0.13 = 0.074 \)

  • There are some sinister folk amongst the 5 people.

At least one: 1-none = 1-\( 0.87^5 \)=0.502.

  • The first dubious left-hander is the second or third person.

Mutually exclusive events:

(second) \( 0.87\times 0.13 \) + (third) \( 0.87^2\times 0.13 = 0.1131+0.0984 = 0.2115 \).
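These three answers can also be obtained in R. The sketch below uses the built-in geometric PMF `dgeom(x, prob)`, which gives the probability of `x` failures before the first success; the geometric distribution is not otherwise covered in this lecture, and the variable names are ours.

```r
p <- 0.13

# first left-hander is the fifth person: 4 right-handers, then a left-hander
first_is_fifth <- dgeom(4, prob = p)

# at least one left-hander among 5: complement of none
at_least_one <- 1 - dbinom(0, size = 5, prob = p)

# first left-hander is the second or third person (mutually exclusive events)
second_or_third <- dgeom(1, prob = p) + dgeom(2, prob = p)

round(c(first_is_fifth, at_least_one, second_or_third), 4)
# 0.0745 0.5016 0.2115
```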

Example

  • There are exactly 3 semi-evil in the group.

    \( \Pr(X=3|n=5,p=0.13) = {5 \choose 3}0.13^3*0.87^2 = 0.0166 \).

    Or, naturally, by computer:

    dbinom(3, 5, 0.13)
    
    [1] 0.01662909
    

Example:

  • There are at least 3 semi-evil in the group. That means 3 or 4 or 5 lefties: \( \Pr(X=3)+\Pr(X=4)+\Pr(X=5) \).
# a boring sum of probabilities

  dbinom(3, 5, 0.13) + dbinom(4, 5, 0.13) + dbinom(5, 5, 0.13)   
[1] 0.01790863
# use our knowledge of CDFs to get the complement
# i.e. they sum to one

  1-pbinom(2, 5, 0.13)
[1] 0.01790863
# or alter the argument to be upper tail

  pbinom(2, 5, 0.13, lower.tail = FALSE)
[1] 0.01790863

Example:

Now EV and SD:

  • How many sinister are expected?

\( E[X]= np = 5\times 0.13 = 0.65 \)

  • With what standard deviation?

\( SD[X] = \sqrt{np(1-p)} = \sqrt{5\times 0.13 \times 0.87} = 0.752 \)

Let's do some gambling

  • Consider some coin tosses and varying odds

    • (explains game)
    • So we arrive at odds = \( 1/P(win) \) for a fair game
  • Now let's look at horses

    • (gets bookies odds)
    • Observe the implied probabilities are not a PMF (they sum to more than 1)
    • Odds are too low
    • Assume our probability is 10% lower than implied
    • Over 100 markets - where are we likely to be?

Let's do some gambling

Use some basic calcs for expectation and variance. Our bookie is offering bad odds, of course:

  # assume gambling at decimal odds of 20


  offeredOdds <- 20
  bookProb <- 1/offeredOdds
  trueProb <- bookProb*0.9
  trueProb
[1] 0.045
  # fair odds
  1/trueProb
[1] 22.22222

Let's do some gambling

Implied expectation and variance:

  # expectation
  expReturn <- trueProb*19 + (1-trueProb)*-1
  expReturn*100  
[1] -10
  # variance
  varReturn <- trueProb*(19-expReturn)^2 + (1-trueProb)*(-1-expReturn)^2
  varReturn
[1] 17.19

Let's do some gambling

Re-pose the problem at a lower odds bracket, with proportionately the same poor odds:

  # assume gambling at decimal odds of 10


  offeredOdds <- 10
  bookProb <- 1/offeredOdds
  trueProb <- bookProb*0.9
  trueProb
[1] 0.09
  # fair odds
  1/trueProb
[1] 11.11111

Let's do some gambling

The expectation is the same, but our variance is lower:

  # expectation
  expReturn <- trueProb*9 + (1-trueProb)*-1
  expReturn*100  
[1] -10
  # variance
  varReturn <- trueProb*(9-expReturn)^2 + (1-trueProb)*(-1-expReturn)^2
  varReturn
[1] 8.19

Let's do some gambling

This is pretty important if you have this vice.

  • Your swings of wins and losses will be larger for the larger odds scenario
  • This means the probability of going bust sooner is higher.
  • Even if this was a positive expectation, this would still apply - low variance is good!
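To put numbers on "where are we likely to be" after 100 markets, the per-bet expectation and variance can be scaled up. This sketch assumes 100 independent 1-unit bets, so that both expectations and variances add; `bet_summary` and `nBets` are hypothetical names introduced here.

```r
# mean and SD of the total return over nBets independent 1-unit bets,
# when our true win probability is 10% lower than the odds imply
bet_summary <- function(offeredOdds, nBets = 100) {
  trueProb  <- 0.9 / offeredOdds
  win       <- offeredOdds - 1        # net gain on a winning bet
  expReturn <- trueProb * win + (1 - trueProb) * -1
  varReturn <- trueProb * (win - expReturn)^2 +
    (1 - trueProb) * (-1 - expReturn)^2
  c(mean = nBets * expReturn, sd = sqrt(nBets * varReturn))
}

bet_summary(20)   # mean -10, sd about 41.5
bet_summary(10)   # mean -10, sd about 28.6
```

Both strategies lose 10 units on average, but the total at odds of 20 swings far more widely around that figure.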

Poisson distribution

Probability mass function description: A RV \( X \) with a Poisson distribution is a discrete RV with an infinite but countable set of possible values, in particular \( X \) can equal \( 0, 1, 2, 3, \ldots \).

Poisson distributions are a parametric family of distributions, indexed by a single rate parameter \( \lambda > 0 \).

The Poisson PMF

\[ \begin{eqnarray*} \Pr(X=x) = e^{-\lambda} \lambda^x/x! \end{eqnarray*} \]

Example

\( X \) \( \sim \) Poisson(4),

\[ \begin{eqnarray*} \Pr(X=3|\lambda=4) = e^{-4} 4^3/3! = 0.1954 \end{eqnarray*} \]

dpois(3, 4)
[1] 0.1953668

Why Poisson?

The underlying process that gives rise to a Poisson distribution is one where (theoretically):

  • \( \Pr \)(a single Success over a "short" interval) \( \propto \) interval length
  • \( \Pr \)(two or more Successes over a "short" interval) is effectively zero
  • Given two disjoint time intervals, #Successes in one interval is independent of #Successes in the second interval
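One way to see how these conditions lead to the Poisson PMF: cut the period into \( n \) short intervals, each containing at most one Success with probability \( \lambda/n \), independently of the others. The count is then Binomial(\( n \), \( \lambda/n \)), and as \( n \) grows its PMF approaches the Poisson PMF. A sketch with an illustrative rate of 4 (variable names are ours):

```r
lambda <- 4
n <- c(10, 100, 10000)     # number of short intervals

# Pr(3 successes) when the period is cut into n intervals
binomial_version <- dbinom(3, size = n, prob = lambda / n)
poisson_version  <- dpois(3, lambda)

rbind(n = n, binomial = binomial_version, poisson = poisson_version)
abs(binomial_version - poisson_version)   # shrinks as n grows
```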

Looking at some Poisson distributions

  # plot a couple of poisson PMFs

  # stack them
  par(mfrow = c(2,1))

  x <- 0:30

  # look at outcomes up to 30, mean/rate of 4
  barplot(dpois(x, lambda = 4), names.arg = x, col = 'slateblue4')

  # now with a rate of 10
  barplot(dpois(x, lambda = 10), names.arg = x, col = 'slateblue4')

[Figure: stacked bar plots of the Poisson PMFs with rate 4 (top) and rate 10 (bottom), for outcomes 0 to 30]

Looking at some Poisson distributions

  # plot a couple of poisson PMFs

  # stack them
  par(mfrow = c(2,1))

  x <- 0:100

  # look at outcomes up to 100, mean/rate of 10
  barplot(dpois(x, lambda = 10), names.arg = x, col = 'slateblue4')

  # now with a rate of 70
  barplot(dpois(x, lambda = 70), names.arg = x, col = 'slateblue4')

[Figure: stacked bar plots of the Poisson PMFs with rate 10 (top) and rate 70 (bottom), for outcomes 0 to 100]

Expected value and variance

Poisson RV

\[ \begin{eqnarray*} E(X) = \lambda \\ V(X) = \lambda \\ SD(X) = \sqrt{\lambda} \end{eqnarray*} \]
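These results can be checked numerically from the PMF. A Poisson RV has infinitely many possible values, so this sketch truncates the sums at 100, which is an assumption on our part; for a rate of 4 the probability beyond that point is negligible.

```r
lambda <- 4
x <- 0:100                       # truncated support
pmf <- dpois(x, lambda)

e_x <- sum(x * pmf)              # E(X), equals lambda
v_x <- sum((x - e_x)^2 * pmf)    # V(X), also equals lambda
c(e_x, v_x, sqrt(v_x))           # 4, 4, 2
```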

Example calcs

# the Poisson PMF
  dpois(x = 2, lambda = 4)
[1] 0.1465251
# the Poisson CDF
  ppois(2, lambda = 4)
[1] 0.2381033
# equivalent to
  dpois(0, 4) + dpois(1, 4) + dpois(2, 4) 
[1] 0.2381033

Example:

Xeroderma Pigmentosum (XP) is a genetic disorder, where individuals are extremely sensitive to UV rays (and will likely get skin cancer if exposed to UV; the life expectancy is quite low). The frequency in the U.S. and Europe is approximately 1:250,000, i.e., \( p\approx 0.000004 \).

If 1,000,000 people are randomly sampled, what is the probability of getting exactly 5 people with XP?

Example:

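The number of cases \( X \) is Binomial(1,000,000, 0.000004). With \( n \) this large and \( p \) this small, it is well approximated by a Poisson with rate \( \lambda = np = 4 \), i.e. 4 per million. We're looking at the probability of exactly 5 from a million.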

\[ \begin{align*} \Pr(X=5) \approx & e^{-4} 4^5/5! = 0.156 \\ \end{align*} \]

# in R
  dpois(5, 4)
[1] 0.1562935
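Since R can evaluate the binomial PMF directly even for a million trials, the quality of the Poisson approximation can be checked. A sketch with our own variable names:

```r
n <- 1e6
p <- 4e-6

exact  <- dbinom(5, size = n, prob = p)   # exact binomial probability
approx <- dpois(5, lambda = n * p)        # Poisson approximation, rate np = 4

c(exact, approx)
abs(exact - approx)                       # tiny: the two agree to several decimals
```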

Recap and look-forwards

We've covered:

  • Expected values and variances in the context of discrete RVs
  • Two basic discrete probability distributions: binomial and Poisson

Next:

  • Continuous random variables