October 2, 2015

####################################################################################################

What can you do with R? A lot of things, and I did not resist the tempetations of trying some tricks.

Rolling a dice is the quintessence of hazard and probability.

In the Urban dictionary (http://www.urbandictionary.com), ‘to roll a dice’ is explained as “To take a chance” and also “To risk exposing yourself to criticism by not following through on a task”.

Bayesians do prefer flipping a coin, as a test of fairness in probability. I can understand them. It is easy to reduces the whole thing to a binomial distribution and to a sequence of independent Bernoulli trials.

Definition, definition: “A binomial experiment involves n independent and identical trials such that each trial can result in to one of the two possible outcomes, namely, success or failure”.

Krishnamoorthy K. (2006). Handbook of Statistical Distributions with Applications. Boca Raton, FL: Chapman & Hall/CRC; p. 31.

The mean in a binomial distribution is = n * p, where n is the number of trials, and p is the expected probability of the success. Variance is calculated by n x p x (1 - p), and standard deviation, if this does make any sense, is the square root of variance.

If the coin is fair, p = 0.50, mean = 100 x 0.50 = 50; with 100 flip, variance = 100 x 0.50 x (1-0.50) = 25; sd = sqrt(25) = 5.

Can you judge if a coin is fair by – let ’ s say – 200 flips?

Let ’ s see.

We will use the ‘animation’ package to do this.

Just copy and past the code in your consolle, to have the animation working

### Call the library

library(animation)

### Let's flip the coin

### Remember: Tail, I win, Head, you lose... :-)

set.seed(15)    ### set seed for repeatability

ani.options(nmax = 200, interval = 0.3)
par(mar = c(2, 4, 2, 2))
flip.coin(faces = c("Tail", "Head"), type = "n",
    prob = c(0.5, 0.5), 
    col = c(2, 4))
### done
### done

I wan unable to have it embedded in the html markdown document (it plots 200 iterations, which is not something you want to see in your screen!)

Let ’ s take a look at the results…

Alt text

Interesting finding…

The tail went out 115 times (p = 0.575), but it was expected 100 times (we expected 100 tail and 100 head).

You can obtain different findings just by changing the seed.

Just try set.seed(10) or set.seed(33)

However, let ’ s check it out our first trial.

### Expected mean and standard deviation for 'Tail'

n = 200

p.exp <- 0.50
p.obs <- 0.575

mean.exp <- n * p.exp
mean.obs <- n * p.obs

sd.exp <- round(sqrt(n*p.exp*(1 - p.exp)), 3) 

sd.obs <- round(sqrt(n*p.obs*(1 - p.obs)), 3) 

res <- cbind(mean.exp, sd.exp, mean.obs, sd.obs)

names(res) <- c("Expected mean", "Expected SD", "Observed mean", "Observed SD")

res[1]; res[2]; res[3]; res[4];
## Expected mean 
##           100
## Expected SD 
##       7.071
## Observed mean 
##           115
## Observed SD 
##       6.991
### barplot of the expected and observed probability

barplot(c(res[1],res[3]), col = "gold", ylim = c(0, mean.obs+(3*sd.obs)))

### 2 SD interval above and below the expected case
        
segments(0.70,res[1]-2*sd.exp, 0.70,res[1]+2*sd.exp, lwd=3) 

### 2 SD interval above and below the observed case
                
segments(1.90,res[3]-2*sd.obs, 1.90,res[3]+2*sd.obs, lwd=3)                 

### lower limit = 1 SD in observed data

abline(h=res[3]-sd.obs, col ="red") 

plot of chunk unnamed-chunk-2

OK, let ’ s test whether the coin is fair

We will use the binomial test

### We have 115 success (tail), and 85 unsuccess (head)
### Expected probability is 1/2 = 50%

binom.test(c(115,85), p = 1/2)
## 
##  Exact binomial test
## 
## data:  c(115, 85)
## number of successes = 115, number of trials = 200, p-value =
## 0.04004
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
##  0.5033 0.6444
## sample estimates:
## probability of success 
##                  0.575
### or we can use this way
### number of success and number of trials
### no other specification (this is the exact binomial test)

binom.test(115, 200)
## 
##  Exact binomial test
## 
## data:  115 and 200
## number of successes = 115, number of trials = 200, p-value =
## 0.04004
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
##  0.5033 0.6444
## sample estimates:
## probability of success 
##                  0.575

Let ’ s try a Bayesian way

require(BayesFactor) 
## Loading required package: BayesFactor
## Warning: package 'BayesFactor' was built under R version 3.1.3
## Loading required package: coda
## Loading required package: lattice
## Loading required package: Matrix
## Warning: package 'Matrix' was built under R version 3.1.3
## 
## Attaching package: 'Matrix'
## 
## The following objects are masked from 'package:base':
## 
##     crossprod, tcrossprod
## 
## ************
## Welcome to BayesFactor 0.9.11-1. If you have questions, please contact Richard Morey (richarddmorey@gmail.com).
## 
## Type BFManual() to open the manual.
## ************
n = 200     ### number of trials
y = 115     ### observed number of the event of interest
pr = 0.50   ### expected probability of the occurrence of the success/unsuccess 

### the observed probability is compared with the expected probability (0.5)

Bayes.bin <- proportionBF(y, n, pr)
    
Bayes.bin 
## Bayes factor analysis
## --------------
## [1] Alt., p0=0.5, r=0.5 : 1.545 ±0%
## 
## Against denominator:
##   Null, p = 0.5 
## ---
## Bayes factor type: BFproportion, logistic
### The Bayes factor in favor of the alternative hypothesis (frequency is not 0.50 = not fair)

BF <- as.vector(Bayes.bin)
round(BF[[1]], 3)           
## [1] 1.545
## The Bayes factor in favor of the null hypothesis (frequency is 0.50 = fair)

BF.null <- as.vector(1/Bayes.bin)

round(BF.null[[1]], 3)          
## [1] 0.647

The Bayes factor with some relevance is 3 to 10 or higher. In this case, nor the null neither the alternative are favoured.

Well, strange things happen… The coin is fair, we know because we simulated it to be fair…

After all, ”probability does not exist, except in the mind” (Bruno de Finetti).

So, rolling a dice…

The ancient Romans played a game, ‘tali’, something like the Knucklebones, which were called Astragaloi or Astragals by the Greeks.

Alt text Alt text

These images are from http://www.nationaltrust.org.uk/document-1355766856560/ and from http://www.slideshare.net/sabella17/socials-ancient-roman-entertainment

“The game was played with four sheep’s or goat’s knucklebones, 4-sided rectangular dice numbered I, III, IV and VI” (Wikipedia: https://en.wikipedia.org/wiki/Venus_Throw)

“The Venus Throw happened when each talus landed on a different side, i.e. with a total of 14” (ibidem).

The Venus Throw won all, and it was considered a fauste throw, it was thought to bring luck on the player.

We may consider the equivalent of the Venus Throw a sequence of 1, 2, 3 4, 5, 6 with six dices.

A rather rare event: current dices have six sides, what is the probability that each of the six numbers will appear exactly once?

The best response is from math.stackexchange (http://math.stackexchange.com/questions/105300/probability-question-six-dice)

“Imagine you throw one after the other. You consider a throw as a success if the number is different from all previous numbers”.

“You start with one. This is always a succes so P(first) = 1 = 6/6”.

“Your second throw is a success if one of the remaining 5 numbers shows, so P(second) = 5/6. And so on”.

“Since all the throws are independent, the total probability is the product of all separate probabilities:”

“P(all numbers are different)= 6/6 x 5/6 x 4/6 x 3/6 x 2/6 x 1/6”.

Probability_Venus_Throw <- 6/6 * 5/6 * 4/6 * 3/6 * 2/6 * 1/6

options(scipen=999)     ### scipen=999 disables scientific notation in R

Probability_Venus_Throw
## [1] 0.01543
options(scipen=0)       ### scipen=0 reinstates default scientific notation in R

Not so pretty rare, isn ’ t it?

It is about 1 out of 60 rolls.

Let ‘s see with the ’TeachingDemos’ package

library(TeachingDemos)

# 9 rolls of 6 fair dice

set.seed(345)

dice(9,6, plot.it=TRUE)

plot of chunk unnamed-chunk-6

Well, we were close, but for two 4 rather than a 4 and a 5… (at the bottom, on the left)

You can change the seed to see whether you have more luck than me…

Let ’ s try again!

set.seed(123)

tmp <- dice(10,6)

### The dices have different colors

tmp
##    Red Green Blue Black Yellow Purple
## 1    3     1    1     1      2      2
## 2    6     4    6     1      4      4
## 3    4     6    5     6      4      6
## 4    1     5    1     6      4      2
## 5    1     2    5     2      2      5
## 6    2     1    6     4      2      3
## 7    5     3    5     6      3      2
## 8    1     2    5     3      4      6
## 9    5     3    3     3      3      1
## 10   4     1    2     3      1      4

Nope… It is not my day!…

I hope you’ve enjoyed this!

Have a nice day!

Alt text

Twitter: @AntoViral

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents.

For more details on using R Markdown see http://rmarkdown.rstudio.com.