R Notebook

Getting to know the binomial distribution functions in R: rbinom() simulates random draws from a binomial distribution. This function has three arguments: the number of random draws, the number of coins being flipped on each draw, and the probability of heads as the outcome. As before, heads will be assigned a value of 1 and tails a value of 0. The call below simulates a single flip of one fair coin.

rbinom(1,1,.5)

## [1] 0

Simulates 10 flips of a single coin.

rbinom(10,1,.5)

##  [1] 0 1 1 1 1 0 0 1 0 0

Simulates one draw of ten coins. This counts the number of heads out of 10 flips.

rbinom(1,10,.5)

## [1] 4

Simulates 10 draws with each having 10 coins and reports the number of heads from each draw. You can think of the number of draws as the number of replicates and the number of coins as the sample size in each replicate.

rbinom(10,10,.5)

##  [1] 5 7 7 6 3 5 8 7 7 4
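
One way to see how the number of draws and the number of coins relate is to build the same kind of result by hand. The sketch below flips 10 single coins, sums them, and repeats that 10 times, then makes the equivalent direct call. The names by_hand and direct are made up for this illustration, and the seed is arbitrary. The two vectors will not match because each is random, but both hold 10 counts between 0 and 10.

```r
set.seed(1)  # arbitrary seed, only so the example is repeatable

# Each replicate: flip 10 single coins (0 or 1) and count the heads
by_hand <- replicate(10, sum(rbinom(10, 1, .5)))

# The same kind of draw in one call: 10 replicates of 10 coins each
direct <- rbinom(10, 10, .5)

by_hand
direct
```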

Simulates 10 draws of 10 coins, each with an 80% chance of producing a head instead of a 50% chance. These biased coins are still considered random variables because the outcome is random.

rbinom(10,10,.8)

##  [1] 8 7 9 8 9 7 7 8 8 9

Simulates 100000 draws, each with 10 fair coins, and saves the outcome as ‘flips’.

flips<-rbinom(100000, 10, .5)

Finds the fraction of draws where 5 heads occurred, which estimates the probability of this outcome.

mean(flips==5)

## [1] 0.24376

Finds the cumulative probability that heads will occur 4 or fewer times in a draw. It is cumulative because it sums the probabilities that the number of heads is 0, 1, 2, 3, or 4 out of 10 flips.

mean(flips<=4)

## [1] 0.37723

dbinom() calculates the exact probability of a given outcome (R's documentation calls this the density). The arguments here are the outcome of interest (5 heads), the number of coins (10), and the probability of producing a head (0.5).

dbinom(5,10, .5)

## [1] 0.2460938
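
As a rough check on where this number comes from, the binomial probability can be written out by hand: the number of ways to pick which 5 of the 10 coins are heads, multiplied by the probability of any one such sequence. The name by_formula is made up for this sketch.

```r
# Number of ways to choose which 5 of the 10 coins are heads,
# times the probability of any one sequence with 5 heads and 5 tails
by_formula <- choose(10, 5) * .5^5 * (1 - .5)^(10 - 5)

by_formula

# TRUE if the hand calculation agrees with dbinom()
all.equal(by_formula, dbinom(5, 10, .5))
```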

pbinom() calculates the exact cumulative probability up to and including a given outcome. The arguments here are the upper limit of the range (4 or fewer heads), the number of coins (10), and the probability of producing a head (0.5).

pbinom(4,10,.5)

## [1] 0.3769531
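
The cumulative probability is the sum of the exact probabilities of each outcome in the range. A minimal sketch of that relationship, using the made-up name point_probs:

```r
# Exact probabilities of 0, 1, 2, 3, and 4 heads out of 10 fair coins
point_probs <- dbinom(0:4, 10, .5)

# Their sum is the cumulative probability of 4 or fewer heads
sum(point_probs)

# TRUE if the sum agrees with pbinom()
all.equal(sum(point_probs), pbinom(4, 10, .5))
```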

Simulating coin flips: Generate 10 separate random flips with probability 0.7 of producing heads. What kind of values do you see and what do they represent?

rbinom(10, 1, .7)

##  [1] 1 1 0 1 1 1 1 1 1 1

Simulating draws from a binomial distribution: Generate 100 occurrences of flipping 10 coins, each with 70% probability of producing heads. What kind of values do you produce and what do they represent? Produce a plot of the probability distribution that you generate and describe its shape.

flips<-rbinom(100, 10, 0.7)
table(flips)/100

## flips
##    3    4    5    6    7    8    9   10 
## 0.01 0.03 0.10 0.15 0.24 0.30 0.14 0.03

hist(flips, freq = FALSE) #how does the y-axis change between these two plots?

hist(flips)
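
To judge the shape of the simulated distribution, it may help to set the simulated frequencies next to the exact probabilities from dbinom(). The sketch below is one way to do that. The names sim_flips, simulated, and exact are made up for this illustration, and the seed is arbitrary, so the simulated column will differ from the table above.

```r
set.seed(1)  # arbitrary seed for repeatability
sim_flips <- rbinom(100, 10, 0.7)

# Count every possible outcome from 0 to 10, including those never observed
simulated <- as.numeric(table(factor(sim_flips, levels = 0:10))) / 100

# Exact probability of each outcome
exact <- dbinom(0:10, 10, 0.7)

data.frame(heads = 0:10, simulated, exact = round(exact, 3))
```

With only 100 draws the simulated column is a noisy version of the exact one, but both should peak near 7 heads and have a longer tail toward lower counts.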

Calculate the exact probability that 2 heads will arise from 10 coin flips with a 70% probability of coming up tails. Compare your answer with a simulation of 10,000 trials. Do the two approaches yield similar results?

dbinom(2, 10, 0.3)  #Note that the word problem lists probability of getting tails

## [1] 0.2334744

mean(rbinom(10000, 10, 0.3)==2)

## [1] 0.2357

Calculate the cumulative probability that at least five coins out of 10 are heads with a 30% probability of coming up heads. Compare your answer with a simulation of 10,000 trials. Do the two approaches yield similar results?

1-pbinom(4, 10, .3) #Note that we have to subtract from 1 to get the right range

## [1] 0.1502683

pbinom(4, 10, .3, lower.tail = FALSE) #same as the line above

## [1] 0.1502683

mean(rbinom(10000, 10, .3)>= 5)

## [1] 0.1523
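
Another way to check the range is to add up the exact probabilities of 5 through 10 heads directly. The name upper_tail is made up for this sketch.

```r
# P(at least 5 heads) = P(5) + P(6) + ... + P(10)
upper_tail <- sum(dbinom(5:10, 10, .3))

upper_tail

# TRUE if the sum agrees with the upper tail from pbinom()
all.equal(upper_tail, pbinom(4, 10, .3, lower.tail = FALSE))
```

Note that the upper tail starts at 5, so pbinom() is called with 4, not 5.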

Repeat the simulation you ran in exercise (3), the probability of 2 heads in 10 flips, with 10, 100, 1,000, 10,000, and 100,000 trials. Which simulation yields a result most similar to the exact probability? Produce a plot depicting the number of trials on the x-axis and the associated probabilities you calculated on the y-axis. Make sure to adjust your axis labels appropriately. What pattern do you see? Produce the plot again but this time log-transform the number of trials. What does this graph reveal that your first graph did not?

r10<-mean(rbinom(10, 10, 0.3)==2)
r100<-mean(rbinom(100, 10, 0.3)==2)
r1000<-mean(rbinom(1000, 10, 0.3)==2)
r10000<-mean(rbinom(10000, 10, 0.3)==2)
r100000<-mean(rbinom(100000, 10, 0.3)==2)

r<-c(r10, r100, r1000, r10000, r100000)
n<-c(10, 100, 1000, 10000, 100000)

plot(n, r, xlab="Number of replicates", ylab="Probability that 2 heads will arise from 10 coin flips")

plot(log(n), r, xlab="Log of number of replicates", ylab="Probability that 2 heads will arise from 10 coin flips")
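
The pattern in these plots can also be put into numbers. The sketch below repeats the five simulations in one call and compares each estimate with the exact probability. It assumes the usual standard error of a sample proportion, sqrt(p * (1 - p) / n), as a rough guide to how large the sampling error tends to be. The names exact and std_error are made up for this illustration and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed for repeatability

n <- c(10, 100, 1000, 10000, 100000)
exact <- dbinom(2, 10, 0.3)

# One simulated estimate per number of trials
r <- sapply(n, function(trials) mean(rbinom(trials, 10, 0.3) == 2))

# Typical size of the sampling error for a proportion estimated from n trials
std_error <- sqrt(exact * (1 - exact) / n)

data.frame(n, estimate = r, abs_error = abs(r - exact), std_error)
```

Each tenfold increase in trials shrinks the standard error by a factor of about 3.2 (the square root of 10), which is why the estimates settle toward the exact value quickly at first and more slowly afterward.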

If events A and B are independent, and A has a 50% chance of happening and B has a 30% chance of happening, what is the probability that they will both happen? Use simulations to answer this question.

A<-rbinom(1000, 1, 0.5)
B<-rbinom(1000, 1, 0.3)
mean(A & B)  #Exact answer is 0.5 * 0.3 = 15%

## [1] 0.147

Expanding on the previous exercise, event C is independent of A and B and has a 70% chance of happening. What is the probability that events A, B, and C all happen?

C<-rbinom(1000, 1, 0.7)
mean(A & B & C)  #Answer is around 10.5%

## [1] 0.103
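
For independent events, the probability that all of them happen is the product of their individual probabilities. The sketch below compares that product with a larger simulation than the one above. It assumes C is independent of A and B, as in the simulation, and it overwrites A, B, and C with 100,000 draws each. The seed is arbitrary.

```r
set.seed(1)  # arbitrary seed for repeatability

A <- rbinom(100000, 1, 0.5)
B <- rbinom(100000, 1, 0.3)
C <- rbinom(100000, 1, 0.7)

# Two independent events: P(A and B) = P(A) * P(B)
c(simulated = mean(A & B), exact = 0.5 * 0.3)

# Three independent events: P(A and B and C) = P(A) * P(B) * P(C)
c(simulated = mean(A & B & C), exact = 0.5 * 0.3 * 0.7)
```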

If events A and B are independent, and A has a 40% chance of coming up heads and B has a 75% chance of coming up heads, what is the probability that either A or B will come up heads? Use simulations to answer this question.

A<-rbinom(1000, 1, 0.4)
B<-rbinom(1000, 1, 0.75)
mean(A|B)  #Exact answer is 0.4 + 0.75 - 0.4*0.75 = 85%

## [1] 0.843

Suppose X is a binomially distributed random variable with parameters (10, 0.3) and Y is another binomially distributed random variable with parameters (10, 0.65), and that they are independent. What is the probability that either of these variables is less than or equal to 5? Estimate this probability both using simulations of 100,000 trials and by calculating exact cumulative probabilities. How do these two approaches compare?

X<-rbinom(100000, 10, 0.3)
Y<-rbinom(100000, 10, 0.65)
mean(X<=5|Y<=5)

## [1] 0.96512

prob_X_less<-pbinom(5, 10, 0.3)
prob_Y_less<-pbinom(5, 10, 0.65)
prob_X_less + prob_Y_less - prob_X_less*prob_Y_less

## [1] 0.9644174
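
The same exact answer can be reached from the other direction: "either is 5 or less" fails only when both variables exceed 5. A minimal sketch of that check, using the made-up names prob_X_more, prob_Y_more, and by_complement:

```r
# P(X > 5) and P(Y > 5) from the upper tails
prob_X_more <- pbinom(5, 10, 0.3, lower.tail = FALSE)
prob_Y_more <- pbinom(5, 10, 0.65, lower.tail = FALSE)

# Because X and Y are independent, P(both > 5) is the product of the two tails
by_complement <- 1 - prob_X_more * prob_Y_more

by_complement
```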