R Markdown: “Why 10 * 0.1 is rarely 1.0”

This is an R Markdown document. Markdown is a simple formatting syntax for authoring web pages (click the MD toolbar button for help on Markdown).

When you click the Knit HTML button a web page will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00

You can also embed plots, for example:

plot(cars)

Why “10 * 0.1 is rarely 1.0”

The above is a citation of one of the early books in computer science, The Elements of Programming Style by Kernighan and Plauger, 1974, where it was the title of section 36 (of 56).

Actually, it is in today’s R, typically:

10 * 0.1 == 1.0
## [1] TRUE

Still, as you will hopefully come to see, it is rather lucky coincidence, because we will see that 0.1 is not exactly the same as the mathematical \(1/10\). Let’s explore some of the numbers in R:

x1 <- seq(0, 1, by = 0.1)
(x2 <- 0:10 / 10)
##  [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

So far, nothing special. Both x1 and x2 are the the same, 0, 0.1, 0.2, …, 1.0, right ?

x1 == x2
##  [1]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE

Oops, what happened?

n <- 0:10
rbind(x1 * 10 == n,
      x2 * 10 == n)
##      [,1] [,2] [,3]  [,4] [,5] [,6]  [,7]  [,8] [,9] [,10] [,11]
## [1,] TRUE TRUE TRUE FALSE TRUE TRUE FALSE FALSE TRUE  TRUE  TRUE
## [2,] TRUE TRUE TRUE  TRUE TRUE TRUE  TRUE  TRUE TRUE  TRUE  TRUE

So it seems, that x2 is exact, but x1 is not? Not true:

(d2  <- diff(x2))
##  [1] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
unique(d2)
## [1] 0.1 0.1 0.1 0.1

Aha… So, one 0.1 differs from some other 0.1 ?? Is R computing nonsense?

No. The computer can only compute with finite precision, and it uses binary representations of all the “numeric” (double) numbers.

print(x2, digits=17)
##  [1] 0.00000000000000000 0.10000000000000001 0.20000000000000001
##  [4] 0.29999999999999999 0.40000000000000002 0.50000000000000000
##  [7] 0.59999999999999998 0.69999999999999996 0.80000000000000004
## [10] 0.90000000000000002 1.00000000000000000

On one hand, we learned that \((a + b) - a = b\) or similarly \((a/b) \times b = a\), or \(\frac a n + \frac b n = \frac{a+b}{n}\) in high school – or earlier. As you can see from above, this is not always true in computer arithmetic:

1/10  + 2/10 ==  3/10
## [1] FALSE

Why? … … Well if they are not equal, let’s look at the difference

1/10  + 2/10  - 3/10
## [1] 5.551115e-17

Aha… So what happened above?

(u2 <- unique(d2))
## [1] 0.1 0.1 0.1 0.1
u2 * 10
## [1] 1 1 1 1
u2 - 1/10
## [1]  0.000000e+00 -2.775558e-17  2.775558e-17  8.326673e-17

Note: Among all the (shortened) fractions \(m / n\), only very few are exact in (usual) computer arithmetic: The denominator \(n\) must be of the form \(n = 2^k, k \in \{0,1,\dots\}\), e.g. 1/2, 3/4, 13/16, but not 1/10, 2/10, 3/10, 4/10 !

Take away:

  1. Do not use ‘==’ for numbers unless they are integer** (or otherwise known to be exact)
  2. Compare (vectors of) numbers with all.equal()
all.equal(x1, x2)
## [1] TRUE
all.equal(x1, x2, tol = 1e-10)
## [1] TRUE
all.equal(x1, x2, tol = 1e-15)
## [1] TRUE
all.equal(x1, x2, tol = 0) ## -> shows the *relative* difference
## [1] "Mean relative difference: 1.734723e-16"
  1. Sometimes, instead of all.equal, you use abs(x - y) < 1e-10 (absolute error) or often more reasonably
abs(x1 - x2) <= 1e-10 * abs((x1 + x2)/2)
##  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

and you may note that <= is important and not the same as <.