Basic Concepts of Probability

An event is a set of possible outcomes of an experiment. The probability of an event is what we expect its relative frequency to approach as we run the experiment a large number of times. These values can arise from several sources.

Probabilities are always numbers in the range \(0\) to \(1\).

Theory

If the elementary outcomes of an experiment are discrete and all equally likely, the theoretical probability of an event is defined as

\[\frac{\text{Number of elementary outcomes in the event}}{\text{Total number of elementary outcomes}}\]

As an example, consider the probability of getting a head when you flip a fair coin. There are two elementary outcomes of this experiment: a head or a tail. Since the coin is fair, the outcomes are equally likely. There are two possible outcomes and only one is in our event, so the probability is \(1/2\). In everyday language, you may hear “fifty-fifty.” To stick with the language of probability, we say that the probability is \(0.5\). This presentation shows how to compute simple probabilities.
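The same counting rule can be sketched in R. The example below is an illustration that is not part of the original text: it assumes a fair six-sided die and the event "the roll is even", and the names outcomes, event, and p_even are made up for this sketch.

```r
# Elementary outcomes of one roll of a fair six-sided die (assumed example).
outcomes <- 1:6

# The event "the roll is even", written out as a set of elementary outcomes.
event <- c(2, 4, 6)

# Theoretical probability: outcomes in the event / total number of outcomes.
p_even <- length(event) / length(outcomes)
p_even
## [1] 0.5
```

Three of the six equally likely outcomes are in the event, so the result matches the coin example.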

Urn Problems

Urn problems are found in many textbooks on introductory statistics. They are good mental exercise for strengthening the concept of probability.

Suppose an urn contains two red balls and three green balls. If you select a ball from the urn without looking, what is the probability that you get a red ball?

The answer follows directly from the principle above, assuming that every ball in the urn is equally likely to be selected. There are five balls in total and two of them are red, so the probability is \(2/5\).

The R Version

I’ll build a model of the urn as a vector containing character strings.

urn = c("Red","Red","Green","Green","Green")
urn

## [1] "Red"   "Red"   "Green" "Green" "Green"

With such small numbers and simple values it’s easy to count. But let me show you a method that can be applied in large, complex situations.

To count the number of red balls or to determine the fraction of balls that are red, we can create a logical vector that marks the red balls.

urn_logical_red = urn == "Red"
urn_logical_red

## [1]  TRUE  TRUE FALSE FALSE FALSE

For the following exercises, try to answer the questions on your own before looking at my solutions.

Exercise

What happens if we take the sum of this logical vector?

Solution

sum(urn_logical_red)

## [1] 2

Question

What happened when we did numerical arithmetic on the logical values?

Answer

The TRUE values became 1 and the FALSE values became 0.
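To see this conversion directly, here is a small sketch. It rebuilds the urn so that it runs on its own, and the name red_as_int is introduced only for this illustration.

```r
# Rebuild the urn and the logical vector so this sketch is self-contained.
urn <- c("Red", "Red", "Green", "Green", "Green")
urn_logical_red <- urn == "Red"

# as.integer() makes explicit the conversion that sum() performs implicitly.
red_as_int <- as.integer(urn_logical_red)
red_as_int
## [1] 1 1 0 0 0

# Adding up the converted values gives the same count as sum() on the logical vector.
sum(red_as_int)
## [1] 2
```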

Question

What would we get if we took the mean value of the logical vector?

Answer

mean(urn_logical_red)

## [1] 0.4

This is the fraction of cases in which the logical value is TRUE. In other words, it is the probability that a randomly selected ball is red.
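Recall that a probability is what we expect the relative frequency to approach over many repetitions. A rough simulation sketch can illustrate this for the urn. The seed, the number of draws, and the names n_draws, draws, and rel_freq_red are assumptions chosen for this example.

```r
# Arbitrary seed, used only so that the run can be repeated.
set.seed(1)

urn <- c("Red", "Red", "Green", "Green", "Green")

# Assumed number of repetitions of the experiment.
n_draws <- 10000

# Each repetition draws one ball; replace = TRUE puts the ball back
# so that every draw is made from the full urn.
draws <- sample(urn, size = n_draws, replace = TRUE)

# Relative frequency of red among the simulated draws.
rel_freq_red <- mean(draws == "Red")
rel_freq_red
```

The result should land close to 0.4, and it will generally get closer as n_draws grows.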

Question

Could you do these computations without creating a separate logical vector?

Answer

Yes, here’s how.

sum(urn == "Red")

## [1] 2

mean(urn == "Red")

## [1] 0.4

The implication is that we can compute the probability of any subset of a sample space that we can define using a logical expression.

Questions

For example, consider the mtcars data frame built into base R. Suppose my experiment is to randomly select a vehicle from this data frame. What is the probability of each of the following events?

The car has a displacement of less than 200 cubic inches.

Answer

mean(mtcars$disp < 200)

## [1] 0.5

Question

The car has three forward gears.

Answer

mean(mtcars$gear == 3)

## [1] 0.46875

Question

The car satisfies both of these criteria.

Answer

mean(mtcars$disp < 200 & mtcars$gear == 3)

## [1] 0.03125
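As a further illustration that is not one of the original questions, the same approach works with the "or" operator |. The sketch below computes the probability that a car satisfies at least one of the two criteria and checks it against the three answers above. The names small_disp, three_gears, p_either, and p_check are made up for this example.

```r
# Logical vectors marking each event in the built-in mtcars data frame.
small_disp <- mtcars$disp < 200
three_gears <- mtcars$gear == 3

# Probability that a randomly selected car satisfies at least one criterion.
p_either <- mean(small_disp | three_gears)
p_either
## [1] 0.9375

# Consistency check: P(A or B) = P(A) + P(B) - P(A and B).
p_check <- mean(small_disp) + mean(three_gears) - mean(small_disp & three_gears)
all.equal(p_either, p_check)
## [1] TRUE
```

Subtracting the "both" probability avoids counting twice the one car that satisfies both criteria.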

To learn how to create logical expressions using comparison operators and logical operators, you could work through the first chapter of the DataCamp course Intermediate R. You could earn many XP towards your extra credit assignment.
