Discussion 5

N. Nedd

2017-09-27


Section 2.2 Question 22

require(dplyr)
## Loading required package: dplyr
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
TotalHeads <- numeric()

# sample() draws from the values 0 and 1, where 1 represents heads and 0 represents tails.
# Each value has probability 0.5, so the simulated coin is fair.

for (i in 1:1000){
  tosses <- sample(c(0,1), size = 100, replace = TRUE, prob = c(0.5,0.5))
  sumHeads <- sum(tosses)
  TotalHeads <- append(TotalHeads, sumHeads)
}

TotalHeadsdf <- data.frame(TotalHeads)
HeadsSummary <- count(TotalHeadsdf, TotalHeads)
HeadsSummary <- mutate(HeadsSummary, prop = n/1000) # proportion of the 1000 simulations with this number of heads
HeadsSummary
## # A tibble: 34 x 3
##    TotalHeads     n  prop
##         <dbl> <int> <dbl>
##  1         33     1 0.001
##  2         34     1 0.001
##  3         35     2 0.002
##  4         36     3 0.003
##  5         37     4 0.004
##  6         38     3 0.003
##  7         39     5 0.005
##  8         40     7 0.007
##  9         41    20 0.020
## 10         42    29 0.029
## # ... with 24 more rows
# Keep only the head counts in the interval [35, 65].
# Both conditions go in a single filter() call so that neither bound is lost.
HeadsSummary2 <- filter(HeadsSummary, TotalHeads >= 35, TotalHeads <= 65)

barplot(HeadsSummary2$prop, names.arg = HeadsSummary2$TotalHeads, xlab = 'Number of heads in 100 coin tosses', ylab = 'Proportion of simulations')

The distribution of head counts looks approximately normal: the proportions are highest around the middle of the range, near the expected 50 heads, and fall off toward both ends of the interval.
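
One way to check the normal-curve impression is to overlay the theoretical density on the simulated counts. The sketch below is not part of the original solution. It assumes a fresh simulation with a fixed seed, so its numbers will differ from the table above, and it uses rbinom() as a shortcut for summing 100 fair tosses. The names heads, mu, sigma, sim_share and normal_share are introduced here for illustration only.

```r
# Assumption: a new simulation with an arbitrary seed, for reproducibility.
set.seed(1)
n_tosses <- 100
n_sims <- 1000

# rbinom() draws the number of heads in n_tosses fair tosses directly.
heads <- rbinom(n_sims, size = n_tosses, prob = 0.5)

# Theoretical mean and standard deviation of a Binomial(100, 0.5) count.
mu <- n_tosses * 0.5                 # 50
sigma <- sqrt(n_tosses * 0.5 * 0.5)  # 5

# Histogram on the density scale, one bin per head count,
# with the normal density for (mu, sigma) drawn on top.
hist(heads, breaks = seq(-0.5, n_tosses + 0.5, by = 1), freq = FALSE,
     xlim = c(30, 70), xlab = "Number of heads in 100 coin tosses",
     main = "Simulated counts vs. normal approximation")
curve(dnorm(x, mean = mu, sd = sigma), add = TRUE, lwd = 2)

# Share of simulations in [35, 65], next to the normal approximation
# with a continuity correction of 0.5 on each side.
sim_share <- mean(heads >= 35 & heads <= 65)
normal_share <- pnorm(65.5, mu, sigma) - pnorm(34.5, mu, sigma)
c(simulated = sim_share, normal = normal_share)
```

If the curve tracks the tops of the bars and the two shares are close, that supports treating the count of heads as roughly normal with mean 50 and standard deviation 5.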