Harold Nelson
March 11, 2024
An event is a set of possible outcomes of an experiment. The probability of an event is what we expect its relative frequency to approach as we run the experiment a large number of times. These values can arise from several sources.
Probabilities are always numbers in the range \(0\) to \(1\).
If the elementary outcomes of an experiment are discrete and all equally likely, the theoretical probability of an event is defined as
\[\frac{\text{Number of elementary outcomes in the event}}{\text{Total number of elementary outcomes}}\]
As an example, consider the probability of getting a head when you flip a fair coin. There are two elementary outcomes of this experiment: a head or a tail. Since the coin is fair, the outcomes are equally likely. There are two possible outcomes and only one is in our event, so the probability is \(1/2\). In everyday language, you may hear “fifty-fifty.” To stick with the language of probability, we say that the probability is \(.5\).
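The relative-frequency idea can be illustrated with a small simulation. This is only a sketch: the seed, the number of flips, and the variable names are arbitrary choices made for illustration.

```r
# Simulate many flips of a fair coin (the seed is an arbitrary choice).
set.seed(1)
n_flips <- 100000
flips <- sample(c("H", "T"), size = n_flips, replace = TRUE)

# Relative frequency of heads over all flips.
rel_freq <- mean(flips == "H")
rel_freq

# Running relative frequency after 10, 100, ..., 100000 flips.
running <- cumsum(flips == "H") / seq_len(n_flips)
running[c(10, 100, 1000, 10000, 100000)]
```

With a fair coin, the running relative frequency should settle near .5 as the number of flips grows, although any single run will wander somewhat.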
Using equally likely elementary outcomes with discrete experiments is clearly advantageous. Consider the probability of getting a sum of seven dots when rolling a pair of dice. If we think of the elementary outcomes of this experiment as the sums from two to twelve, the outcomes are not equally likely. There is only one way to get a sum of two and only one way to get a sum of twelve, but there are several ways to get a sum of seven. There are 36 equally likely elementary outcomes when we consider the ordered pairs describing what happened to die 1 and what happened to die 2. The pairs that give a sum of seven are \((2,5),(5,2),(3,4),(4,3),(6,1) \text{ and } (1,6)\). Since there are six of them, the probability of getting a seven is \(6/36\), or \(1/6\).
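The counting argument for the pair of dice can be checked by listing every ordered pair. The sketch below uses expand.grid() from base R; the names rolls, ways and p_seven are chosen here for illustration.

```r
# All 36 ordered pairs (die 1, die 2); each pair is equally likely.
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)
rolls$total <- rolls$die1 + rolls$die2

# Number of ordered pairs that produce each sum from 2 to 12.
ways <- table(rolls$total)
ways

# Theoretical probability of a sum of seven:
# pairs in the event divided by all pairs.
p_seven <- sum(rolls$total == 7) / nrow(rolls)
p_seven
```

The table of counts makes it plain why the sums themselves are not equally likely.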
An empirical probability is really just a way of describing an observed fact using the language of probability. An example or two should help.
Based on the Current Population Survey, about 60% of the US population over the age of 16 meets the definition of participating in the labor force. We can say that if a member of the population over the age of 16 were picked at random, the probability of that person being a labor force participant is about .6.
Based on data from the CDC, there were 2,515,458 deaths in the US in 2010. Of those, the primary cause of death was heart disease in 596,577 cases. We can say that if we were to pick a 2010 death certificate at random, the probability of the death having a primary cause of heart disease is about .237.
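The arithmetic behind this figure is a single division of the two counts quoted above. The variable names in this sketch are chosen only for illustration.

```r
# Counts quoted in the text.
total_deaths <- 2515458
heart_disease_deaths <- 596577

# Empirical probability: the observed relative frequency.
p_heart <- heart_disease_deaths / total_deaths
round(p_heart, 3)
```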
A subjective probability is really just the language of probability used to express a personal degree of belief. Here are some examples based on my own personal experiences.
I believe that the probability of having at least one class cancelled during a winter term because of snow is about .5.
I believe that the probability of my laptop having a hardware failure during the next 6 months is less than .05.
Some experiments have numerical outcomes. The numerical outcomes are called random variables.
For example, if I flip a fair coin three times and record the number of heads I get as the outcome of the experiment, I have a random variable with four possible values: \(0, 1, 2 \text{ and } 3\). Such a random variable, with a finite number of possible values, is called discrete.
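To see how often each value of this random variable occurs, the eight equally likely sequences of three flips can be listed and the heads counted in each. This is a sketch under the assumption that a head is coded as 1 and a tail as 0; the names used are illustrative.

```r
# All 8 equally likely sequences of three flips (1 = head, 0 = tail).
three_flips <- expand.grid(flip1 = 0:1, flip2 = 0:1, flip3 = 0:1)

# The random variable: number of heads in each sequence.
heads <- rowSums(three_flips)

# Share of the 8 sequences that gives each value 0, 1, 2, 3.
heads_prob <- table(heads) / nrow(three_flips)
heads_prob
```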
Random variables can also be continuous, as opposed to discrete. For example, suppose the experiment is to pick a rock at random from a large pile of rocks and record the weight. The number of possible values is conceptually infinite, although it would be recorded only to the degree of accuracy provided by our scales.
Random variables have functions which describe the relative frequency with which they take on their range of possible values.
In the case of a discrete random variable, this function is called a distribution function.
For example, consider the simple experiment of flipping a fair coin once and counting the number of heads as the outcome. There are two possible values of the random variable: \(0\) and \(1\). If the distribution function is denoted \(d()\), we have \(d(0)=.5\) and \(d(1)=.5\).
The other example above, flipping a fair coin three times and counting the number of heads, is a little more complicated. However, it fits the theoretical requirements for a binomial random variable.
The following requirements define a binomial random variable.
There is a simple trial, which is repeated n times and has only two possible outcomes. One of the outcomes is defined arbitrarily as a “success.”
The probability of success is a constant, denoted p.
The trials are independent, in that success or failure on one trial does not affect the probability of success on other trials.
The binomial probability distribution function gives us the probability of exactly \(x\) successes given the values of \(n\) and \(p\).
\[\left(\frac{n!}{(n-x)!x!}\right)p^{x}(1-p)^{n-x}\]
Fortunately, you will never have to evaluate this formula by hand. R provides the function dbinom() to do this calculation. For example, to calculate the probability of exactly 5 successes out of 10 independent trials with a constant probability .2 of success on each trial, the following code will work.
dbinom(5, 10, .2)
## [1] 0.02642412
It is easy to get a table of all of the values of the probability distribution for this example as follows.
# Create a vector x with the possible values of the random variable.
x <- 0:10
# Use dbinom() to create the value of the probability of each value in x.
d <- dbinom(x,10,.2)
# Put x and d together in a dataframe for display purposes.
df <- data.frame(x,d)
# Display the probability distribution.
df
## x d
## 1 0 0.1073741824
## 2 1 0.2684354560
## 3 2 0.3019898880
## 4 3 0.2013265920
## 5 4 0.0880803840
## 6 5 0.0264241152
## 7 6 0.0055050240
## 8 7 0.0007864320
## 9 8 0.0000737280
## 10 9 0.0000040960
## 11 10 0.0000001024
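As a cross-check, the binomial formula can be evaluated directly with choose() and compared with dbinom(). This is a sketch for the same example (10 trials, success probability .2); the names by_formula and by_dbinom are illustrative.

```r
# Same example as above: 10 trials, success probability .2.
n <- 10
p <- 0.2
successes <- 0:n

# The formula: (n! / ((n - x)! x!)) * p^x * (1 - p)^(n - x).
# choose(n, x) computes the factorial ratio.
by_formula <- choose(n, successes) * p^successes * (1 - p)^(n - successes)

# The same probabilities from dbinom().
by_dbinom <- dbinom(successes, n, p)

# TRUE when the two agree up to floating-point tolerance.
all.equal(by_formula, by_dbinom)

# The probabilities over all possible values add up to 1.
sum(by_dbinom)
```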
The drudgery that students of 1960 experienced in doing the computations with the formula above is no longer necessary. Doing these computations by hand does not increase your insight into statistics. The relative ease of invoking dbinom() leads some people to fear that “You can’t understand it if you let a computer do all the work.” There are two human parts of this kind of problem.
Does this situation satisfy the requirements to make the binomial distribution applicable?
How do you ask the computer to do its part?
The first of these takes experience and is where you need to really use your mind. The second is what you have just learned.
To help you gain some skill in the first human task, here are some examples. For each of these, you should decide if the binomial distribution fits and identify the values of n and p.