In this lab we will:

  • simulate random processes, calculate their probabilities, and compare results.
  • calculate probabilities using the rules of independent and complementary events.
  • develop the concepts of conditional probability and expected value.
  • compute conditional probabilities using Bayes’ Theorem.

Lab Objective 1: Write and organize statistical reports in a clear, readable format.
Lab Objective 2: Use and apply statistical structures for handling data.
Lab Objective 3: Analyze and display numeric data.
Lab Objective 4: Learn to compute and interpret probabilities of discrete variables.

1 Law of Large Numbers

Let \(\hat{p}_n\) be the proportion of \(n\) trials in which a certain outcome occurs, and let \(p\) be the theoretical probability of that outcome. The Law of Large Numbers states that as \(n\) approaches infinity, \(\hat{p}_n\) converges to (approaches) \(p\).

Let’s test this out!

1.1 Heads or Tails?

The following code runs a simulation of flipping a coin 10 times using the sample() function. Let the outcome 1 represent heads and the outcome 0 represent tails.

set.seed(1) # This sets the seed for the random number generator (RNG) state
outcomes = c(0,1) # 0 = Tails, 1 = Heads
sam = sample(x = outcomes, size = 10, replace = TRUE) # Draws `size` random samples from `outcomes`, with replacement
sam
##  [1] 0 1 0 0 1 0 0 0 1 1
  1. What is \(\hat{p}_{10}\), the proportion of flips that came up heads out of the 10 samples?

Let’s increase the size to 100.

set.seed(1) # This sets the seed for the random number generator (RNG) state
outcomes = c(0,1) # 0 = Tails, 1 = Heads
sam = sample(x = outcomes, size = 100, replace = TRUE) # Draws `size` random samples from `outcomes`, with replacement
table(sam)
## sam
##  0  1 
## 49 51
  1. What is \(\hat{p}_{100}\), the proportion of flips that came up heads out of the 100 samples? (Hint: use the sum() function.)

Let’s increase the size to 1000.

set.seed(1) # This sets the seed for the random number generator (RNG) state
outcomes = c(0,1) # 0 = Tails, 1 = Heads
sam = sample(x = outcomes, size = 1000, replace = TRUE) # Draws `size` random samples from `outcomes`, with replacement
table(sam)
## sam
##   0   1 
## 502 498
  1. What is \(\hat{p}_{1000}\), the proportion of flips that came up heads out of the 1000 samples?

  2. What do you notice about how \(\hat{p}_n\) changes as n increases?
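
One way to turn a vector of 0/1 outcomes into a proportion is sketched below. The vector `flips` is a made-up example (an assumption for illustration only), not the output of any of the simulations above.

```r
# Hypothetical outcomes of 8 coin flips (1 = Heads, 0 = Tails); illustrative only
flips = c(1, 0, 0, 1, 1, 0, 1, 1)

# Count the heads, then divide by the number of flips
phat = sum(flips == 1) / length(flips)
phat

# mean() of a logical vector gives the same proportion
mean(flips == 1)
```

Here 5 of the 8 hypothetical flips are heads, so both expressions evaluate to 0.625.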


Below is a cool function for plotting the outcomes of a simulation.

#' Simulates randomly selecting from a set of outcomes and plots how the
#' proportion of times a specific outcome occurs changes with the number of trials
#' outcomes = vector of possible outcomes
#' outcome = outcome in question
#' n = number of trials
#' p = theoretical probability of `outcome`
#' seed = seed for random number generator (optional)
plot_sim = function( outcomes, outcome, n, p, seed = 1 ) {
  set.seed(seed)
  results = sample(x = outcomes, size = n, replace = TRUE)
  phat = numeric(n) # phat[i] = proportion of `outcome` among the first i trials
  for ( i in 1:n) {
    phat[i] = sum(results[1:i] == outcome) / i
  }
  plot(1:n, phat, type='l', col='blue', xlab = "n", log='x', ylim = c(0,1))
  abline(h = p, lty = "dotted") # horizontal reference line at the theoretical probability
}

Let’s test it out for \(n=100\)

plot_sim( outcomes=c(0,1), outcome = 1, n=100, p=0.5)

  1. Run plot_sim for \(n=10000\). What do you observe?

  2. Do the results of this simulation appear to follow the Law of Large Numbers? Why or why not?
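
The running proportion that plot_sim draws can also be inspected numerically. The sketch below is one possible way to do that; `running_prop` is a hypothetical helper (not part of the lab code), and the seed 42 is an arbitrary choice made only for reproducibility.

```r
# running_prop is a hypothetical helper, not part of the lab code.
# It returns the proportion of `outcome` among the first i results, for i = 1..n.
running_prop = function(results, outcome) {
  cumsum(results == outcome) / seq_along(results)
}

set.seed(42) # arbitrary seed
results = sample(x = c(0, 1), size = 10000, replace = TRUE)
phat = running_prop(results, outcome = 1)

# Distance between phat_n and p = 0.5 at a few values of n
n = c(10, 100, 1000, 10000)
data.frame(n = n, phat = phat[n], abs_error = abs(phat[n] - 0.5))
```

Using cumsum() avoids recounting the first i results at every step, which matters once n gets large.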

1.2 Rolling a Die

Now let’s simulate rolling a six-sided die. Below, the plot_sim function is used to run a simulation of rolling a die 10,000 times and plot the proportion of times a 1 is rolled.

plot_sim( outcomes=c(1,2,3,4,5,6), outcome = 1, n=10000, p=1/6)

  1. Do the results of this simulation appear to follow the Law of Large Numbers? Why or why not?
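
The same idea applies to every face of the die, not just a 1. As a rough sketch (the seed and the number of rolls are arbitrary assumptions), the observed proportion of each face can be tabulated and compared with the theoretical value of 1/6:

```r
set.seed(2) # arbitrary seed, chosen only for reproducibility
rolls = sample(x = 1:6, size = 60000, replace = TRUE)

# Observed proportion of each face
props = prop.table(table(rolls))
round(props, 3)

# Largest gap between an observed proportion and the theoretical 1/6
max(abs(props - 1/6))
```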

2 Independent Events

Two events are independent if knowing the outcome of one provides no useful information about the outcome of the other (i.e. the outcome of one does not affect the probability of the other).

  1. Suppose you just flipped heads on a coin. What is the probability of flipping heads again?

If events A and B are independent, then the probability of both \(A\) and \(B\) occurring simultaneously is

\[P(A \text{ and } B) = P(A) \cdot P(B)\]

where \(P(A \text{ and } B)\) is the probability of events \(A\) and \(B\) both occurring, \(P(A)\) is the probability of event A occurring, and \(P(B)\) is the probability of event \(B\) occurring.
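
The multiplication rule can be checked with a simulation. The sketch below uses two dice rather than coins; the events, the seed, and the number of trials are assumptions chosen for illustration.

```r
set.seed(3) # arbitrary seed
n = 100000
die1 = sample(x = 1:6, size = n, replace = TRUE)
die2 = sample(x = 1:6, size = n, replace = TRUE)

A = die1 == 6          # event A: first die shows a 6
B = die2 %% 2 == 0     # event B: second die shows an even number

p_product = (1/6) * (1/2)  # P(A) * P(B) from the multiplication rule
p_joint   = mean(A & B)    # simulated proportion of trials where both occur

c(product_rule = p_product, simulated = p_joint)
```

Because the two dice are sampled separately, the simulated proportion should land close to 1/12, though it will not match exactly.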

  1. What is the probability of:

    1. flipping heads 2 times in a row?

    2. flipping heads 3 times in a row?

    3. flipping heads 10 times in a row?

3 Complementary Events

Sometimes probabilities are easier to calculate if we look at their complement.

The complement of an event \(A\) is the event “\(A\) does not happen.” The notation \(\bar{A}\) or \(A^c\) is used for the complement of event \(A\). We can compute the probability of the complement using \(P(A^c) = 1 - P(A)\). (Notice also that the complement of \(A^c\) is the original event \(A\), so that \(P(A) = 1 - P(A^c)\).)
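
As an illustration of the complement rule, here is a sketch using a die instead of a coin: the probability of rolling at least one 6 in four rolls. The scenario, seed, and number of simulated sets are assumptions made for this example.

```r
# Theoretical probability via the complement:
# A = at least one 6 in four rolls, A^c = no 6 in any of the four rolls
p_none = (5/6)^4
p_at_least_one = 1 - p_none
p_at_least_one

# Simulation check: each column of the matrix is one set of four rolls
set.seed(4) # arbitrary seed
rolls = matrix(sample(x = 1:6, size = 4 * 50000, replace = TRUE), nrow = 4)
p_sim = mean(colSums(rolls == 6) >= 1)
p_sim
```

Counting "no 6 at all" takes a single product, whereas counting "at least one 6" directly would mean adding up the cases with exactly one, two, three, and four 6s.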

  1. Suppose you flip a coin 2 times. What is the probability of flipping at least one tail?

Use: A = flipping at least one tail
A^C = ???

  1. Suppose you flip a coin 3 times. What is the probability of flipping at least one tail?

  2. Suppose you flip a coin 10 times. What is the probability of flipping at least one tail?


4 Expected Value (3.4.1)

Expected value provides a way of evaluating a decision that has multiple possible outcomes.

Expected value is defined as the average gain or loss of an event if the procedure is repeated many times. We can compute the expected value by multiplying the value of each outcome by the probability of that outcome, then adding up the products.

For example, if a decision has two possible outcomes, A and B, whose values (usually represented as monetary values) are V(A) and V(B) respectively, then the expected value of the decision is:

Expected value = V(A)P(A) + V(B)P(B)
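
A minimal sketch of this calculation in R, using a hypothetical game whose payoffs and probabilities are made up for illustration:

```r
# Hypothetical game (illustrative numbers only):
# win $10 with probability 0.2, lose $2 with probability 0.8
values = c(win = 10, lose = -2)
probs  = c(win = 0.2, lose = 0.8)

stopifnot(isTRUE(all.equal(sum(probs), 1))) # probabilities must sum to 1

# Multiply each outcome's value by its probability, then add up the products
expected_value = sum(values * probs)
expected_value
```

For these numbers the result is 10(0.2) + (-2)(0.8) = 0.4, so a player would gain about $0.40 per game on average over many plays. Storing the values and probabilities as vectors lets the same line handle more than two outcomes.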

4.1 Buying Raffle Tickets

You purchase a raffle ticket to help out a charity. The raffle ticket costs $5. The charity is selling 2000 tickets. One of them will be drawn and the person holding the ticket will be given a prize worth $4000. Compute the expected value for this raffle.

  1. Fill in the following values correctly to compute the expected value of buying a raffle ticket.
# value of winning
v_win = 0 
# probability of winning
p_win = 0 
# value of losing
v_lose = 0
# probability of losing
p_lose = 0

#Expected Value
v_win*p_win + v_lose*p_lose
## [1] 0

Check: You should get an expected value of -$3. On average, each person is giving about $3.00 to charity.

4.2 Earthquake Insurance

  1. An insurance company estimates the probability of an earthquake in the next year to be 0.0013. It estimates the average damage done by an earthquake to be $60,000. If the company offers earthquake insurance for $100, what is the company’s expected value for the policy?

5 Conditional Probability (3.2)

The probability that event B occurs, given that event A has happened, is represented by \(P(B|A)\), read “the probability of B given A.”

Conditional probabilities can be used to find the probability of joint events, even when they are not independent:

\[P(A \text{ and } B) = P(A|B) \cdot P(B)\]

This can be solved for \(P(A|B)\):

\[ P(A|B) = \frac{ P(A \text{ and } B) }{P(B)}\]

  15. From the Machine Learning (ML) example (Figure 3.12 in your textbook, also printed in your Lab 4 Canvas assignment):

  1. Find the probability that the ML prediction was correct, given that the photo was about fashion.
# Need to find P(ML is pred_fashion | truth is fashion)
# A = ML is pred_fashion
# B = truth is fashion


#P(A|B)
#Check
197/309
## [1] 0.6375405
  1. Find the probability that the ML prediction was correct, given that the photo was not about fashion.

  2. Find the probability that the ML prediction was wrong, given that the photo was about fashion.

  3. Find the probability that the ML prediction was wrong, given that the photo was not about fashion.

  4. In which case is the ML prediction the most accurate?
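
The formula \(P(A|B) = P(A \text{ and } B)/P(B)\) can be applied to any two-way table of counts. The sketch below uses made-up weather forecast counts (hypothetical data, not the textbook's ML table) to show one way of setting up the calculation.

```r
# Hypothetical counts (made up for illustration; NOT the textbook's ML data)
# Rows: forecast, columns: actual weather
counts = matrix(c(30, 10,
                  20, 140),
                nrow = 2, byrow = TRUE,
                dimnames = list(forecast = c("rain", "dry"),
                                actual   = c("rain", "dry")))
total = sum(counts)

# A = forecast is rain, B = actual weather is rain
p_A_and_B = counts["rain", "rain"] / total
p_B       = sum(counts[, "rain"]) / total

# P(A|B) = P(A and B) / P(B)
p_A_given_B = p_A_and_B / p_B
p_A_given_B

# Same result directly from the counts in the "actual rain" column
counts["rain", "rain"] / sum(counts[, "rain"])
```

With these counts, \(P(A \text{ and } B) = 30/200\) and \(P(B) = 50/200\), so \(P(A|B) = 0.6\). The total cancels in the ratio, which is why dividing the cell count by its column total gives the same answer.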