Section I - Set theory and Probability
Definition. A set is a collection of objects thought of as a whole. The objects of which the set is a collection are called elements or members of the set.
Sample spaces and events
Probabilities refer to the possible outcomes of certain experiments or observations. The probability model will be seen to involve two things:
- choosing a set to represent the possible outcomes;
- allocating probabilities to these possible outcomes.
Setting up a sample space S of all possible outcomes
If we want to construct an abstract model, we must decide what constitutes a possible outcome of the experiment. The possible outcomes define the idealized experiment, and they are usually called sample points.
The collection of all sample points is called the sample space S of the model.
The notion of an event can now be introduced. An event is a subset of S, so it can contain any number of sample points. Hence an event is a subset of S and not an element of S: even an event made up of a single sample point e is the set {e}, not e itself.
The allocation of probabilities to the elements of a sample space
Let the sample space S be the set
\(S=\{e_1, e_2, ...\}=E_1 \cup E_2 \cup ...,\)
where \(E_i = \{e_i\}\) are the simple events in S. Then we assume that to each event E contained in S we can assign a non-negative real number P[E], called the probability of E.
Axioms for probability
- Axiom I. \(P[E] \geq 0\) for every event E.
- Axiom II. \(P[S] = 1\) for the certain event S.
- Axiom III. The probability P[A] of any event A is the sum of the probabilities of the simple events whose union is A.
If the sample space S is the union of the distinct simple events \(E_1, E_2,...,\) it follows from axioms II and III that \(P[S]=P[E_1]+P[E_2]+...=1\).
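As a minimal sketch of these ideas, assume a hypothetical experiment that is not part of the notes above: one roll of a fair six-sided die. The names `S`, `p`, `prob` and `even` are illustrative only.

```r
# Hypothetical example: one roll of a fair six-sided die
S <- 1:6                          # sample space: the six sample points
p <- setNames(rep(1/6, 6), S)     # probability assigned to each simple event

# Axiom III: the probability of an event is the sum over its sample points
prob <- function(event) sum(p[as.character(event)])

even <- c(2, 4, 6)                # the event "the roll is even"
all(even %in% S)                  # an event is a subset of S
all(p >= 0)                       # Axiom I holds for every simple event
prob(S)                           # Axiom II: the certain event has probability 1
prob(even)                        # 1/6 + 1/6 + 1/6 = 0.5
```

Storing the probabilities per sample point, rather than per event, mirrors the construction in the text: every event's probability is derived from the simple events it contains.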
An important consequence of Axiom III is that if E and F are mutually exclusive events, so that \(E \cap F = \varnothing\), then \(P[E \cup F] = P[E]+P[F]\).
Theorem 1. \(P[\varnothing]=0\), where \(\varnothing\) is the empty set.
Theorem 2. \(P[E \cap F'] = P[E] - P[E \cap F]\), where \(F'\) is the complement of F.
Theorem 3. \(P[F']=1-P[F]\)
Theorem 4. \(P[E \cup F] = P[E] + P[F] - P[E \cap F]\).
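Theorems 1, 3 and 4 can be checked numerically on a small example. The sketch below again assumes a hypothetical fair six-sided die with equally likely outcomes; `evt_E` and `evt_F` are illustrative event names.

```r
# Hypothetical example: a fair six-sided die with equally likely outcomes
S <- 1:6
prob <- function(event) length(event) / length(S)

evt_E <- c(2, 4, 6)   # "the roll is even"
evt_F <- c(4, 5, 6)   # "the roll is greater than 3"

# Theorem 1: the empty set has probability 0
prob(integer(0))

# Theorem 3: P[F'] = 1 - P[F], with F' computed as the set difference S \ F
all.equal(prob(setdiff(S, evt_F)), 1 - prob(evt_F))

# Theorem 4: P[E u F] = P[E] + P[F] - P[E n F]
lhs <- prob(union(evt_E, evt_F))                               # {2,4,5,6}: 4/6
rhs <- prob(evt_E) + prob(evt_F) - prob(intersect(evt_E, evt_F))  # 3/6 + 3/6 - 2/6
all.equal(lhs, rhs)
```

The subtraction in Theorem 4 removes the outcomes 4 and 6, which would otherwise be counted twice because they belong to both events.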
Conditional probability
Definition. The conditional probability of B, given A, is defined to be \(P[B \mid A] = \frac{P[B \cap A]}{P[A]}\), provided that \(P[A] \neq 0\).
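A short worked example may help here. It assumes a hypothetical fair six-sided die, with A the event "the roll is even" and B the event "the roll is greater than 3"; neither event comes from the notes above.

```r
# Hypothetical example: a fair six-sided die with equally likely outcomes
S <- 1:6
prob <- function(event) length(event) / length(S)

A <- c(2, 4, 6)   # "the roll is even"
B <- c(4, 5, 6)   # "the roll is greater than 3"

# P[B | A] = P[B n A] / P[A] = (2/6) / (3/6) = 2/3
p_B_given_A <- prob(intersect(B, A)) / prob(A)
p_B_given_A

# Unconditional probability of B, for comparison: 3/6 = 0.5
prob(B)
```

Knowing that the roll is even raises the probability of B from 1/2 to 2/3, because two of the three even outcomes are greater than 3.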
Independent events
Definition. The events A and B are said to be independent if and only if \(P[A \cap B] = P[A]P[B]\).
Using the definition of conditional probability, independence implies that \(P[B \mid A] = P[B]\) when \(P[A] \neq 0\), and \(P[A \mid B] = P[A]\) when \(P[B] \neq 0\).
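The relation between independence and conditional probability can be checked by enumerating a small sample space. The sketch assumes a hypothetical experiment of two fair coin flips, with A "the first flip is heads" and B "the second flip is heads".

```r
# Hypothetical example: two fair coin flips, four equally likely outcomes
S <- expand.grid(first = c("H", "T"), second = c("H", "T"))

A <- S$first == "H"     # first flip is heads
B <- S$second == "H"    # second flip is heads

# With equally likely outcomes, a probability is the fraction of rows in the event
p_A  <- mean(A)         # 2/4
p_B  <- mean(B)         # 2/4
p_AB <- mean(A & B)     # 1/4: only the outcome (H, H)

p_AB == p_A * p_B       # product rule: A and B are independent
p_AB / p_A              # P[B | A], equal to P[B]
p_AB / p_B              # P[A | B], equal to P[A]
```

All probabilities here are exact binary fractions, so the comparison with `==` is safe; with other values `all.equal()` would be the more cautious choice.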
Section II - Binomial distribution
Definition.
Finding density with simulation (rbinom)
Flip 10 coins 100,000 times. The rbinom function generates 100,000 numbers, one per experiment, each counting the heads among the 10 coins. The largest possible value is 10, which occurs when all 10 coins in a single experiment come up heads (each head counts as 1, so 10 times 1 is 10).
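flips <- rbinom(100000, 10, 0.5)  # 100,000 experiments of 10 fair coin flips each
n.fives <- sum(flips == 5)        # number of experiments with exactly 5 heads
length.flips <- length(flips)     # total number of experiments
n.fives
[1] 24576
length.flips
[1] 100000
n.fives / length.flips
[1] 0.24576
mean(flips == 5) # the mean of a logical vector is equal to sum/length
[1] 0.24576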
Calculating exact probability density (dbinom)
The probability of getting exactly 5 heads when flipping 10 coins, each with a 0.5 probability of heads
dbinom(5,10,0.5)
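For reference, the value returned by dbinom can be reproduced by hand from the binomial probability formula, \(P[X = x] = \binom{n}{x} p^x (1-p)^{n-x}\). The sketch below is only a cross-check; the variable names `n`, `x`, `p` and `by_hand` are illustrative.

```r
n <- 10    # number of coins
x <- 5     # number of heads of interest
p <- 0.5   # probability of heads for each coin

# choose(10, 5) = 252 ways to pick which 5 coins are heads,
# each with probability 0.5^5 * 0.5^5 = 1/1024
by_hand <- choose(n, x) * p^x * (1 - p)^(n - x)
by_hand                               # 252/1024 = 0.24609375

all.equal(by_hand, dbinom(x, n, p))   # TRUE
```

This exact value is close to the simulated fraction shown earlier, which is what the simulation is estimating.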
Calculating cumulative density (pbinom)
The probability of getting a number no larger than 4 when flipping 10 coins, each with a 0.5 probability of heads.
Method 1 - calculating it through simulation
flips<-rbinom(100000,10,0.5)
mean(flips<=4)
Method 2 - calculating cumulative density using pbinom
pbinom(4,10,0.5)
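The cumulative value is the sum of the exact probabilities of 0, 1, 2, 3 and 4 heads, so it can also be cross-checked against dbinom. A minimal sketch, with `cumulative` as an illustrative variable name:

```r
# P[X <= 4] = P[X = 0] + P[X = 1] + ... + P[X = 4]
cumulative <- sum(dbinom(0:4, 10, 0.5))
cumulative                                   # (1 + 10 + 45 + 120 + 210)/1024 = 386/1024

all.equal(cumulative, pbinom(4, 10, 0.5))    # TRUE
```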
Exercise
If you flip 10 coins each with a 30% probability of coming up heads, what is the probability exactly 2 of them are heads?
Answer the above question using the dbinom() function. This function takes almost the same arguments as rbinom(). The second and third arguments are size and prob, but now the first argument is x instead of n. Use x to specify where you want to evaluate the binomial density.
Confirm your answer using the rbinom() function by creating a simulation of 10,000 trials. Put this all on one line by wrapping the mean() function around the rbinom() function.
# Calculate the probability that 2 are heads using dbinom
dbinom(2, 10, 0.3)
# Confirm your answer with a simulation using rbinom
mean(rbinom(10000, 10, 0.3) == 2)
Exercise
If you flip ten coins that each have a 30% probability of heads, what is the probability at least five are heads?
Answer the above question using the pbinom() function. (Note that you can compute the probability that the number of heads is less than or equal to 4, then take 1 - that probability).
Confirm your answer with a simulation of 10,000 trials by finding the number of trials that result in 5 or more heads.
1 - pbinom(4, 10, .3)
mean(rbinom(10000, 10, .3) >= 5)
Varying the number of trials
In the last exercise you tried flipping ten coins with a 30% probability of heads to find the probability at least five are heads. You found that the exact answer was 1 - pbinom(4, 10, .3) = 0.1502683, then confirmed with 10,000 simulated trials.
Did you need all 10,000 trials to get an accurate answer? Would your answer have been more accurate with more trials? Try answering this question with simulations of 100, 1,000, 10,000, and 100,000 trials. Which is the closest to the exact answer?
mean(rbinom(10000, 10, .3) >= 5)
mean(rbinom(100, 10, .3) >= 5)
mean(rbinom(1000, 10, .3) >= 5)
mean(rbinom(100000, 10, .3) >= 5)
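One way to compare the four simulations side by side is to collect the estimates in a table together with their distance from the exact answer. This is a sketch only: the seed value is arbitrary, and the `std_error` column uses the standard error of a sample proportion, sqrt(p * (1 - p) / n), which is not covered in the notes above.

```r
set.seed(42)   # arbitrary seed, used only to make the run reproducible

exact    <- 1 - pbinom(4, 10, 0.3)
n_trials <- c(100, 1000, 10000, 100000)

# One simulated estimate per number of trials
estimates <- sapply(n_trials, function(n) mean(rbinom(n, 10, 0.3) >= 5))

data.frame(
  trials    = n_trials,
  estimate  = estimates,
  abs_error = abs(estimates - exact),
  std_error = sqrt(exact * (1 - exact) / n_trials)  # typical size of the error
)
```

The typical error shrinks with the square root of the number of trials, so 100 times more trials gives roughly 10 times less error. Any single run can still land closer or further than expected, which is why a smaller simulation occasionally beats a larger one.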
Expected Value
The expected value of a binomial distribution is found by multiplying the size (the number of coins) by the probability that each coin is heads.
Exercise
Calculating the expected value
What is the expected value of a binomial distribution where 25 coins are flipped, each having a 30% chance of heads?
Step 1 - Calculate this using the exact formula you learned in the lecture: the expected value of the binomial is size * p. Print this result to the screen.
Step 2 - Confirm with a simulation of 10,000 draws from the binomial.
25*0.3
mean(rbinom(10000,25,0.3))
Calculating the variance
What is the variance of a binomial distribution where 25 coins are flipped, each having a 30% chance of heads?
Step 1 - Calculate this using the exact formula you learned in the lecture: the variance of the binomial is size * p * (1 - p). Print this result to the screen.
Step 2 - Confirm with a simulation of 10,000 trials.
# Calculate the variance using the exact formula
25*0.3*(1-0.3)
# Confirm with a simulation using rbinom
var(rbinom(10000,25,0.3))
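Both formulas can also be checked without simulation, by applying the general definitions of mean and variance to the exact probabilities from dbinom. A minimal sketch; the names `x`, `pmf`, `mu` and `sigma2` are illustrative.

```r
x   <- 0:25                    # every possible number of heads
pmf <- dbinom(x, 25, 0.3)      # exact probability of each value

# Mean: probability-weighted average of the possible values
mu <- sum(x * pmf)

# Variance: probability-weighted average squared distance from the mean
sigma2 <- sum((x - mu)^2 * pmf)

all.equal(mu, 25 * 0.3)                  # matches size * p = 7.5
all.equal(sigma2, 25 * 0.3 * (1 - 0.3))  # matches size * p * (1 - p) = 5.25
```

Unlike the simulated values, these agree with the formulas up to floating-point rounding, which is why `all.equal()` is used instead of `==`.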
Joint Probability
The probability of A and B is the probability of A times the probability of B. Note that this is true only if events A and B are independent, that is, if the result of A does not affect the probability of B.
Simulating the probability of A and B
You can also use simulation to estimate the probability of two events both happening. Randomly simulate 100,000 flips of coin A, each of which has a 40% chance of being heads. Save this as a variable A. Randomly simulate 100,000 flips of coin B, each of which has a 20% chance of being heads. Save this as a variable B. Use the “and” operator (&) to combine the variables A and B to estimate the probability that both A and B are heads.
# Simulate 100,000 flips of a coin with a 40% chance of heads
A <- rbinom(100000,1,0.4)
# Simulate 100,000 flips of a coin with a 20% chance of heads
B <- rbinom(100000,1,0.2)
# Estimate the probability both A and B are heads
mean(A & B)
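Because the two coins are independent, the simulated fraction can be compared with the exact product of the two probabilities. The sketch below repeats the simulation with an arbitrary seed so that the comparison is reproducible; `exact` and `simulated` are illustrative names.

```r
set.seed(1)   # arbitrary seed, used only to make the run reproducible

A <- rbinom(100000, 1, 0.4)   # coin A: 40% chance of heads
B <- rbinom(100000, 1, 0.2)   # coin B: 20% chance of heads

exact     <- 0.4 * 0.2        # product rule for independent events: 0.08
simulated <- mean(A & B)      # fraction of trials where both are heads

c(exact = exact, simulated = simulated)
```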
Simulating the probability of A, B, and C
Randomly simulate 100,000 flips of A (40% chance), B (20% chance), and C (70% chance). What fraction of the time do all three coins come up heads?
- You’ve already simulated A and B. Now simulate 100,000 flips of coin C, where each has a 70% chance of coming up heads.
- Use A, B, and C to estimate the probability that all three coins would come up heads.
# You've already simulated 100,000 flips of coins A and B
A <- rbinom(100000, 1, .4)
B <- rbinom(100000, 1, .2)
# Simulate 100,000 flips of coin C (70% chance of heads)
C<-rbinom(100000,1,0.7)
# Estimate the probability A, B, and C are all heads
mean(A&B&C)
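For three independent coins the exact answer is again the product of the individual probabilities. The sketch below is an alternative layout, not the exercise's intended solution: it stores the flips as columns of a matrix so that the same code works for any number of coins. The seed and the names `p_heads`, `flips`, `exact` and `simulated` are assumptions made for illustration.

```r
set.seed(1)   # arbitrary seed, used only to make the run reproducible

p_heads <- c(A = 0.4, B = 0.2, C = 0.7)

# Exact probability that all coins are heads: 0.4 * 0.2 * 0.7 = 0.056
exact <- prod(p_heads)

# One column of 100,000 simulated flips per coin
flips <- sapply(p_heads, function(p) rbinom(100000, 1, p))

# A row is "all heads" when its sum equals the number of coins
simulated <- mean(rowSums(flips) == length(p_heads))

c(exact = exact, simulated = simulated)
```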
Solving for probability of A or B
If coins A and B are independent, and A has a 60% chance of coming up heads, and B has a 10% chance of coming up heads, what is the probability that A or B (or both) will come up heads?
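One way to approach this question is to combine Theorem 4 with the product rule for independent events: P[A or B] = P[A] + P[B] - P[A]P[B]. The sketch below computes that value and compares it with a simulation; the seed and the variable names are illustrative.

```r
p_A <- 0.6
p_B <- 0.1

# Theorem 4, with P[A and B] = P[A] * P[B] by independence:
# 0.6 + 0.1 - 0.06 = 0.64
exact <- p_A + p_B - p_A * p_B

set.seed(1)   # arbitrary seed, used only to make the run reproducible
A <- rbinom(100000, 1, p_A)
B <- rbinom(100000, 1, p_B)

# The "or" operator (|) is TRUE when at least one coin is heads
simulated <- mean(A | B)

c(exact = exact, simulated = simulated)
```

The same value follows from the complement: both coins are tails with probability 0.4 * 0.9 = 0.36, so at least one is heads with probability 1 - 0.36 = 0.64.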