knitr::opts_chunk$set(cache=TRUE, warning=FALSE, message=FALSE, fig.width=12)
library(lattice)
library(knitr)
Balls colors: “brown” 13%, “yellow” 14%, “red” 13%, “blue” 24%, “orange” 20%, “green” 16%.
# There is no sample dimension. I make a n = 100 dataframe.
e3 <- data.frame(color = rep(c("brown", "yellow", "red", "blue", "orange", "green"), c(13, 14, 13, 24, 20, 16)))
barchart(~table(color), data = e3,
xlab = "Percentage",
horizontal = FALSE,
main = "Balls (no N)")
A. What is the chance that you extract a brown or yellow ball?
It is the sum of the quantities of yellow and brown balls on the total, that is:
paste((13 + 14), "%", sep = "")
## [1] "27%"
B. What is the chance that you will not pull out a blue?
Blue balls represent the 24% of the pool. Hence, the probability of not extracting it is:
paste((100 - 24), "%", sep = "")
## [1] "76%"
Flip a coin four times and have four head. Does this outcome give you the idea that the coin is not fair?
Solution:
We do not have enough data to answer this question. Four coin tosses produce 2^4 = 16 possible results. Four heads represents a single possible result, hence the probability of getting it is 1/16 = .06. It is not likely, but it is possible.
Toss a fair coin 10 times, and record how many heads occur.
A. How many are the possible outcomes?
2 ^ 10
## [1] 1024
B. What would be the probability of each of the outcomes?
paste(round((1 / 1024), 3), "%", sep = "")
## [1] "0.001%"
c. How many of the outcomes would have one head? What is the probability of one head on ten flips.
A possible solution id HTTTTTTTTTT, but I could have a head at any toss, such as the second THTTTTTTTT and so forth. Hence, we can count 10 possible outcomes:
paste(round((10 / 1024), 2), "%", sep = "")
## [1] "0.01%"
D. How many of the outomes would have zero heads? What is the probability of zero heads in ten filps?
Only one possible result has no heads.
paste(round((1/1024), 3), "%", sep = "")
## [1] "0.001%"
E. What is the probability of getting one head or less on 10 flips?
There are eleven possible results, hence:
paste(round((11/1024), 2), "%", "")
## [1] "0.01 % "
What is the probability of getting more than one head on 10 flips of a fair coin?
paste(round(1 - (11/1024), 3), "%", sep = "")
## [1] "0.989%"
Suppose a basket player’2 free-throw shooting percentage is 70%.
A. Explain what it means as a probability.
It means that on the long term the player should perform the 70% of correct throws, as average.
B. What is wrong with thinking that his chances of making his next free throw are 50-50.
Because the 70% is produced by a long time observation of his performances. Probability does not tell us what is gung to happening, but just how, in the long term, it should result.
Suppose you buy a lottery ticket, and you have to pick six numbers from 1 to 50 (repetions allowed). Which repetition is more likely to win: 13, 48, 17, 22, 6, 39 or 1, 2, 3, 4, 5, 6?
Solution:
They are equally likely, since every number picked does not depend from the other numbers already extracted.
You feel lucky again and buy a handful of instant lottery tickets. The last three tickes give you one dollar! Should you play again because you are “on a roll”?
Solution:
No, of course. Those are “winning-traps”, giving you the illusion that a real victory is really possible. Such a small trend tells us nothing about the possible resul of the next ticket. Simply, there are more ticket where you do not win: in the end, you will always lose.
Suppose that a small town has five people with a rare form of cancer. Does this automatically mean that a huge problem exists that need to be addresses?
Solution:
Since the illness is rare, we could suspect a problem in that particular area, even if not necessarily. More information about the time of the appearence of the illness could be useful to produce a more useful deduction.
A couple has conceived three girls so far a fourth baby on the way. Do you predict that the newborn could be a girl or a boy? Why?
Solution:
The chance is 50-50, since there is no relation between the sex f the newborn and the sex of his/her sisters.
Meteorologists use computer models to predict when and where hurrican will hit shore. Suppose they predict that hurrican Stat has 20% chance of hitting the east coast.
A. On what info are the meteorologist basing this prediction?
Simulators are based on the historical data collected from previous events.
B. Why is this prediction harder to make than your chance of getting a head on your next coin toss?
Coin flips are equally likely, 50-50. Past hurricanes hit the east cost in average the 20% of cases. It is not harder, it is only based on different data, especially in simulators, where many data input are blank due to lack of information.
Bob stayed at the slot four hours. He think the longer he will stay, the better chance he has to win eventually. Is Bob right?
Solution:
Unfortunately, he is wrong. The longer he stays, the bigger becomes the amount of money he wastes. He can win, even if with a really tiny chance, but the prize will never cover the losses. The winner will always be the owner of the machine. We can try to simulate a long game session even with a reasonable winning probability (0.01%). Win give the player 100, lose will cost 1. Even so, the game ends with the player leaving with empty pockets.
set.seed(123)
n <- 1e6
a <- rep(NA_integer_, n)
a[1] <- 100L # set initial value (integer)
i <- 1 # counter
while(a[i] > 0) {
# first check whether our results will fit. If not, embiggenate `a`.
if(i==length(a)) a <- c(a, rep(NA_integer_, n))
if (rbinom(1, 1, 0.01) == 1) {
a[i+1] <- a[i] + 100L
} else {
a[i+1] <- a[i] - 1L
}
i <- i + 1
}
plot(a[seq_len(i)], type = "l")
Which situation is more likely to produce exactly 50 percent heads: flipping a coin 10 times or 10000?
Solution:
Probability works on the long run, hence the correct answer is 10000. Let’s try a simulation of the problem.
set.seed(1234)
ten_fl <- sample(0:1, 10, replace = TRUE)
thous_fl <- (sample(0:1, 10000, replace = TRUE))
par(mfrow = c(1, 2))
plot_ten_fl <- hist(ten_fl, main = "Ten coin flips", xlab = paste("Mean", mean(ten_fl), sep = " "))
abline(v = mean(ten_fl), col = "red")
plot_thous_fl <- hist(thous_fl, main = "10000 coin flips", xlab = paste("Mean", mean(thous_fl), sep = " "))
abline(v = mean(thous_fl), col = "red")
As we can see, the mean tends more toward the center of the distribution (0.5) on 10000 flips.