From 538 Riddler

Riddler Express

From Dee Harley comes a devilish matter of dominos:

In a set of dominos, each tile has two sides with a number of dots on each side: zero, one, two, three, four, five or six. There are 28 total tiles, with each number of dots appearing alongside each other number (including itself) on a single tile.

Question 1: What is the probability of drawing a “double” from a set of dominoes — that is, a tile with the same number on both sides?

Question 2: Now you pick a random tile from the set and uncover only one side, revealing that it has six dots. What’s the probability that this tile is a double, with six on both sides?

First a bit on domino tiles. A tile is a pair of numbers, each number in [0, 6]. The first number is unconstrained (but not uniform!). The second number is also in [0, 6], but must be greater than or equal to the first number.

Here’s a convenient way to visualize the tiles:

(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6)
        (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)
                (2, 2), (2, 3), (2, 4), (2, 5), (2, 6)
                        (3, 3), (3, 4), (3, 5), (3, 6)
                                (4, 4), (4, 5), (4, 6)
                                        (5, 5), (5, 6)
                                                (6, 6)

Down to business. I’ll make a data.frame for the 28 tiles. Here’s a vector for the first number, walking the tile list column-wise left-to-right:

x1 <- c(0, 0:1, 0:2, 0:3, 0:4, 0:5, 0:6)

Here’s the second number, again working column-wise:

r <-function(n, expr) { replicate(n, expr)}
x2 <- c(r(1, 0), r(2, 1), r(3, 2), r(4, 3), r(5, 4), r(6, 5), r(7, 6))

And here’s my data frame:

dominoes <- data.frame(x1=x1, x2=x2)

Now for some sanity checks. I had better have 28 tiles:

nrow(dominoes)
## [1] 28

x1 should be between 0 and 6 inclusive.

summary(dominoes$x1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    0.75    2.00    2.00    3.00    6.00

Min and max are ok, but what about the mean? Looking at the tile list again, I see that, row-wise, x1 should have 7 0’s, 6 1’s, … 2 5’s and 1 6.

sum(r(7, 0), r(6, 1), r(5, 2), r(4,3), r(3,4), r(2,5), r(1,6)) / 28
## [1] 2

Yep, that matches the mean value.

Looking at x2, it’s also supposed to be between 0 and 6 inclusive.

summary(dominoes$x2)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    3.00    4.00    4.00    5.25    6.00

To check the mean, looking at x2 by row, it’s made of numbers 0-6, 1-6, 2-6, …, 5-6 and 6.

sum(0:6, 1:6, 2:6, 3:6, 4:6, 5:6, 6) / 28
## [1] 4

… and that’s a match.

x2 is supposed to be greater than or equal to x1 for every tile.

dominoes[dominoes$x2 < dominoes$x1,]

Now I can plot my dominoes.

library(ggplot2)
ggplot(dominoes, aes(x=x2, y=x1, label=sprintf("(%d, %d)", x1, x2))) + scale_y_reverse() + geom_text() + theme_void()

Ok! I think I can finally consider the questions.

Q1: what’s the probability of drawing a “double”? These are the dominoes along the diagonal; there are 7 of them, out of 28, so the answer should be 1/4.

nrow(dominoes[dominoes$x1 == dominoes$x2,]) / nrow(dominoes)
## [1] 0.25

Q2: given that a drawn tile has a 6 on one end, what’s the probability that it’s a double? This tile is among the 7 in the right-most column; only one of those is a double, so the answer should be 1/7.

nrow(dominoes[dominoes$x1 == 6 & dominoes$x2 == 6,]) / nrow(dominoes[dominoes$x1 == 6 | dominoes$x2 == 6,])
## [1] 0.1428571

Theory is fine, but how about a supporting simulation? I’ll draw one hundred thousand dominoes and see if my sample is close to the theoretical answers.

draws <- dominoes[sample(nrow(dominoes), 100000, replace=T),]

Sanity check: is my sampling fair (does it choose each domino with equal likelihood)? I’ll plot each domino, with jitter and low alpha; if one domino or another is favored it should show up as a darker area.

ggplot(draws, aes(x=x2, y=x1, label=sprintf("(%d, %d)", x1, x2))) + geom_jitter(width=0.25, height=0.25, alpha=0.002)  + scale_y_reverse() + geom_text(vjust=3) + theme_void()

The distribution seems fair. Now let’s focus on doubles: the diagonal in the plot.

draws$double <- draws$x1 == draws$x2
ggplot(draws, aes(x=x2, y=x1, color=x1==x2, label=sprintf("(%d, %d)", x1, x2))) + geom_jitter(width=0.25, height=0.25, alpha=0.01)  + scale_y_reverse() + geom_text(vjust=3) + theme_void() + theme(legend.position="none")

nrow(draws[draws$x1 == draws$x2,]) / nrow(draws)
## [1] 0.24747

Pretty close to the theoretical value!

Now looking at sixes, and the six double:

ggplot(draws, aes(x=x2, y=x1, color=as.factor((x1==6) + (x2==6)), label=sprintf("(%d, %d)", x1, x2))) + geom_jitter(width=0.25, height=0.25, alpha=0.01) + scale_y_reverse() + geom_text(vjust=3) + theme_void() + theme(legend.position="none")

nrow(draws[draws$x1 == 6 & draws$x2 == 6,]) / nrow(draws[draws$x1 == 6 | draws$x2 == 6,])
## [1] 0.1394514

Close enough to the predicted 1/7.

Source code is here