These notes are designed to follow the DataCamp course Intermediate R. They include details that I want to emphasize and extra points that I consider worth mentioning.

This particular set focuses on logical values, the first chapter of the course, but it does include other topics as appropriate. You will also hear them called Boolean values, after George Boole, the originator of this area of mathematics. See https://en.wikipedia.org/wiki/Boolean_algebra for more information.

Working through these notes should give you enough experience with logical expressions to distinguish them easily from normal algebraic expressions.

The notes are in question and answer format. To get the most out of them, you should have RStudio open and respond actively to the questions as they come up.

Dealing with NA Values

How do you use NA values in logical expressions? It is clear that the numerical value 7 is not NA, so it would seem that the logical expression 7 == NA should evaluate to FALSE. It is also clear that NA itself is NA, so NA == NA should be TRUE. Try it.

Answer

7 == NA

## [1] NA

NA == NA

## [1] NA

That didn’t work. The key principle is that a comparison involving NA evaluates to NA, because the missing value could be anything.

We need the function is.na() to answer such questions with a TRUE/FALSE answer. It returns TRUE if its argument is NA and FALSE otherwise. Try it with 7 and with NA.

Answer

is.na(7)

## [1] FALSE

is.na(NA)

## [1] TRUE
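
Here is a short sketch of how NA behaves beyond ==, using only base R (the vector x is just an illustration, not part of the course). Comparisons with NA give NA, but & and | can still return a definite value when the other operand settles the result on its own.

```r
# A comparison with NA is NA: the missing value could be anything.
cmp = NA > 5                 # NA

# & and | give a definite answer when the known operand decides the result.
and_false = NA & FALSE       # FALSE
or_true   = NA | TRUE        # TRUE
and_true  = NA & TRUE        # NA (the result depends on the unknown value)

# is.na() is vectorized, so it tests every element of a vector at once.
x = c(7, NA, 3)              # illustrative vector
na_flags = is.na(x)          # FALSE TRUE FALSE
na_count = sum(is.na(x))     # 1

print(na_flags)
```

Counting with sum(is.na(x)) works because TRUE is treated as 1 and FALSE as 0.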

Truth Table for &

Most presentations of logical operators use truth tables to give precise definitions of the operators. Given two logical variables, P and Q, there are four possible combinations of truth values. The standard arrangement of these possibilities is based on a tree diagram or a nested for loop in which P is in the outer loop or the first level of the tree. The table below is implemented as a data frame and shows the value of P & Q for each of the four possibilities.

P = c(T,T,F,F)
Q = c(T,F,T,F)
P_and_Q = P & Q

df_tt_and = data.frame(P,Q,P_and_Q)
df_tt_and

##       P     Q P_and_Q
## 1  TRUE  TRUE    TRUE
## 2  TRUE FALSE   FALSE
## 3 FALSE  TRUE   FALSE
## 4 FALSE FALSE   FALSE

Exercise

Following the example above, construct a truth table for the or operator, which is written “|” in R.

Answer

P = c(T,T,F,F)
Q = c(T,F,T,F)
P_or_Q = P | Q

df_tt_or = data.frame(P,Q,P_or_Q)
df_tt_or

##       P     Q P_or_Q
## 1  TRUE  TRUE   TRUE
## 2  TRUE FALSE   TRUE
## 3 FALSE  TRUE   TRUE
## 4 FALSE FALSE  FALSE

Not

Not, written ! in R, is a unary operator, so its truth table only needs two rows. Create this table yourself.

Answer

P1 = c(T,F)
not_P1 = !P1

df_tt_not = data.frame(P1,not_P1)
df_tt_not

##      P1 not_P1
## 1  TRUE  FALSE
## 2 FALSE   TRUE

Logical Equivalence

Two logical expressions are equivalent if they have the same truth value for every possible combination of truth values of their component variables. The way we check this in math courses is to build a truth table with columns containing the truth values of the two expressions. Then we visually examine the two columns. If the two columns are identical, the expressions are logically equivalent.

As an example, consider the following two logical expressions:

Exp1: not (P and Q)
Exp2: (not P) or (not Q)

Exercise: Let’s build a truth table containing these.

Answer

P = c(T,T,F,F)
Q = c(T,F,T,F)

Exp1 = !(P & Q)
Exp2 = !P | !Q

tt_df = data.frame(P,Q,Exp1,Exp2)
tt_df

##       P     Q  Exp1  Exp2
## 1  TRUE  TRUE FALSE FALSE
## 2  TRUE FALSE  TRUE  TRUE
## 3 FALSE  TRUE  TRUE  TRUE
## 4 FALSE FALSE  TRUE  TRUE

Yes! They are identical. We can see that visually, but did we need to look? Could we just compare the two logical vectors?

Exercise: Do that.

Answer

Exp1 == Exp2

## [1] TRUE TRUE TRUE TRUE

We still had to look and see that every position in the logical vector contained the value TRUE. How could we simplify the conclusion?

The idea is to check whether the count of TRUE values in the logical vector is the same as its length.

Exercise: Do that.

Answer

Let’s just look at the two numbers. sum() counts the TRUE values because TRUE is treated as 1 and FALSE as 0.

sum(Exp1 == Exp2)

## [1] 4

length(Exp1 == Exp2)

## [1] 4

Yes, they are the same. Of course, I could ask this question even more directly.

Exercise: Do that.

Answer

sum(Exp1 == Exp2) == length(Exp1 == Exp2)

## [1] TRUE

This particular logical equivalence is one of De Morgan’s two laws. The second one is almost the same, but the roles of the two logical operators & and | are swapped.
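
As a sketch, the second law can be checked the same way. The names Exp3 and Exp4 are introduced here only for illustration, and all() is used as a compact way to ask whether every comparison is TRUE. The code spells out TRUE and FALSE because T and F are ordinary variables that can be reassigned.

```r
P = c(TRUE, TRUE, FALSE, FALSE)
Q = c(TRUE, FALSE, TRUE, FALSE)

# Second De Morgan law: not (P or Q) versus (not P) and (not Q)
Exp3 = !(P | Q)
Exp4 = !P & !Q

# all() returns TRUE only if every element of its argument is TRUE
law_holds = all(Exp3 == Exp4)
print(law_holds)
```

Here all(Exp3 == Exp4) plays the same role as comparing the sum with the length.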

A Distributive Law

In the algebra of real numbers, we know that the following is true for any three numbers x, y and z:

\[x(y+z) = xy + xz\]

There are two similar laws in Boolean algebra. We’ll just look at one of them. For any three logical values P, Q and R, the following expressions are equivalent.

Exp1 = P and (Q or R)
Exp2 = (P and Q) or (P and R)

A truth table would contain eight rows and be a tedious chore. But computing the values of the two logical expressions for every possible combination of P, Q and R would be easy using nested for loops.

Exercise: Do that.

Answer

TV = c(T,F)
for(P in TV){
  for(Q in TV){
    for(R in TV){
      Exp1 = P & (Q | R)
      Exp2 = (P & Q) | (P & R)
      print(Exp1 == Exp2)
    }
  }
}

## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE

That’s not a very R-ish way to do things. Experienced R users work with vectors instead of for loops. We can generate the base columns of the 8-row truth table by brute force as follows.

P = c(T,T,T,T,F,F,F,F)
Q = c(T,T,F,F,T,T,F,F)
R = c(T,F,T,F,T,F,T,F)

base_cols = data.frame(P,Q,R)
base_cols

##       P     Q     R
## 1  TRUE  TRUE  TRUE
## 2  TRUE  TRUE FALSE
## 3  TRUE FALSE  TRUE
## 4  TRUE FALSE FALSE
## 5 FALSE  TRUE  TRUE
## 6 FALSE  TRUE FALSE
## 7 FALSE FALSE  TRUE
## 8 FALSE FALSE FALSE

Then we can create the expressions we want using vector operations and test the equality of the two.

Exp1 = P & (Q | R)
Exp2 = (P & Q) | (P & R)
Exp1 == Exp2

## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
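
The other distributive law, with the roles of & and | exchanged, can be checked with the same vector approach. This is only a sketch: the names Exp3 and Exp4 are made up for the example, and the base columns are rebuilt so the block runs on its own.

```r
P = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)
Q = c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE)
R = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)

# Second distributive law: P or (Q and R) versus (P or Q) and (P or R)
Exp3 = P | (Q & R)
Exp4 = (P | Q) & (P | R)

# One TRUE/FALSE answer for all eight rows
second_law_holds = all(Exp3 == Exp4)
print(second_law_holds)
```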

You could also build the base columns using the rep() function. Visit http://www.endmemo.com/r/rep.php to see how rep() works. Then use it to build P, Q and R.

Answer

P = rep(c(T,F),each = 4)
Q = rep(c(T,T,F,F),times = 2)
R = rep(c(T,F),times = 4)

tt_df = data.frame(P,Q,R)
tt_df

##       P     Q     R
## 1  TRUE  TRUE  TRUE
## 2  TRUE  TRUE FALSE
## 3  TRUE FALSE  TRUE
## 4  TRUE FALSE FALSE
## 5 FALSE  TRUE  TRUE
## 6 FALSE  TRUE FALSE
## 7 FALSE FALSE  TRUE
## 8 FALSE FALSE FALSE
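
Another option, offered as a sketch rather than as the course's method, is the base R function expand.grid(), which builds every combination of its arguments. Its first argument varies fastest, so the variables are listed in reverse and the columns are then reordered. The name tt is chosen only for this example.

```r
# expand.grid() varies its first argument fastest, so list R first and P last
tt = expand.grid(R = c(TRUE, FALSE), Q = c(TRUE, FALSE), P = c(TRUE, FALSE))

# Put the columns back in the usual P, Q, R order
tt = tt[, c("P", "Q", "R")]
print(tt)
```

The advantage is that adding a fourth variable needs only one more argument, with no counting of repetitions.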

Three Values

In R for Data Science (R4DS), Hadley Wickham makes the point that in R, there are three potential values of a logical variable:

TRUE
FALSE
NA

How can we investigate the equivalence of two expressions while incorporating the third value?

Solution

It’s fairly easy to do with the nested for loops. In fact, we only need to insert three characters in the code: ,NA in the definition of TV.

Here’s an example.

TV = c(T,F,NA)
for(P in TV){
  for(Q in TV){
    for(R in TV){
      Exp1 = P & (Q | R)
      Exp2 = (P & Q) | (P & R)
      print(Exp1 == Exp2)
    }
  }
}

## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] NA
## [1] TRUE
## [1] NA
## [1] NA
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] NA
## [1] NA
## [1] NA
## [1] NA
## [1] TRUE
## [1] NA
## [1] NA
## [1] NA
## [1] NA

Eliminate the NA Values

Whenever either expression is NA, Exp1 == Exp2 is NA as well. Expand on Exp1 == Exp2 so that the cases where both expressions are NA count as a match.

Solution

TV = c(T,F,NA)
for(P in TV){
  for(Q in TV){
    for(R in TV){
      Exp1 = P & (Q | R)
      Exp2 = (P & Q) | (P & R)
      # TRUE if both are NA, or if neither is NA and the values agree
      print((is.na(Exp1) & is.na(Exp2)) | (!is.na(Exp1) & !is.na(Exp2) & Exp1 == Exp2))
    }
  }
}

## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
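
The same three-valued check can also be done with vectors instead of loops. The sketch below makes two assumptions that are not part of the course: it uses expand.grid() to build the 27 combinations, and it defines a helper called same_value(), a name invented for this example, that treats two NA values as a match.

```r
TV = c(TRUE, FALSE, NA)

# All 27 combinations of P, Q and R (first argument varies fastest)
grid = expand.grid(R = TV, Q = TV, P = TV)
P = grid$P
Q = grid$Q
R = grid$R

Exp1 = P & (Q | R)
Exp2 = (P & Q) | (P & R)

# Hypothetical helper: TRUE where both values are NA, or where neither is NA
# and they are equal; FALSE otherwise. It never returns NA.
same_value = function(a, b) {
  (is.na(a) & is.na(b)) | (!is.na(a) & !is.na(b) & a == b)
}

equivalent = all(same_value(Exp1, Exp2))
print(equivalent)

# The helper does report mismatches, including NA against a known value
check = same_value(c(TRUE, NA, NA), c(FALSE, NA, TRUE))
print(check)
```

The last call shows that the helper returns FALSE both when two known values differ and when only one side is NA.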
