Data 606 Homework 3

library(DATA606)
## 
## Welcome to CUNY DATA606 Statistics and Probability for Data Analytics 
## This package is designed to support this course. The text book used 
## is OpenIntro Statistics, 3rd Edition. You can read this by typing 
## vignette('os3') or visit www.OpenIntro.org. 
##  
## The getLabs() function will return a list of the labs available. 
##  
## The demo(package='DATA606') will list the demos that are available.
## 
## Attaching package: 'DATA606'
## The following object is masked from 'package:utils':
## 
##     demo

3.2

(a) Z > -1.13

1 - pnorm(-1.13, mean = 0, sd = 1)
## [1] 0.8707619
library(ggplot2)
normalPlot(mean = 0, sd = 1, bounds = c(-1.13, 4))

(b) Z < .18

pnorm(.18, mean = 0, sd = 1)
## [1] 0.5714237
normalPlot(mean = 0, sd = 1, bounds = c(-4, .18), tails = FALSE)

(c) Z > 8

pnorm(8, mean = 0, sd = 1, lower.tail = FALSE)
## [1] 6.220961e-16
normalPlot(mean = 0, sd = 1, bounds = c(8, Inf), tails = FALSE)
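
Tail probabilities this small are sensitive to how they are computed. The sketch below uses only base R and compares subtracting from 1 with asking pnorm() for the upper tail directly. The variable names are illustrative, not part of the assignment.

```r
# Upper tail of the standard normal at z = 8, computed three ways.
z <- 8

# Subtracting from 1 is limited by double-precision rounding near 1.
tail_by_subtraction <- 1 - pnorm(z)

# Asking for the upper tail directly avoids that cancellation.
tail_direct <- pnorm(z, lower.tail = FALSE)

# By symmetry, P(Z > z) = P(Z < -z).
tail_by_symmetry <- pnorm(-z)

print(c(tail_by_subtraction, tail_direct, tail_by_symmetry))
```

The last two values agree with each other, while the subtraction differs from them in the second significant digit.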

(d) |Z| < .5

x <- pnorm(-.5, mean = 0, sd = 1)
y <- pnorm(.5, mean = 0, sd = 1)
print(x)
## [1] 0.3085375
print(y)
## [1] 0.6914625
y - x
## [1] 0.3829249
normalPlot(mean = 0, sd = 1, bounds = c(-.5, .5), tails = FALSE)

3.4

(a) Men: N(μ = 4313, σ = 583). Women: N(μ = 5261, σ = 807).

(b)

Z_Leo <- (4948 - 4313) / 583
Z_Leo
## [1] 1.089194
Z_Mary <- (5513 - 5261) / 807
Z_Mary
## [1] 0.3122677

Leo's Z score is 1.09 and Mary's is 0.31: Leo finished 1.09 standard deviations above the mean time of his group, and Mary finished 0.31 standard deviations above the mean time of hers.

(c) Mary ranked better within her group. Lower times are better, and her Z score is lower than Leo's, so a smaller share of her group finished ahead of her.

(d)

pnorm(Z_Leo)
## [1] 0.8619658
pnorm(Z_Mary)
## [1] 0.6225814
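
Because a lower time is better, pnorm() here gives the share of each group that finished faster than the runner. Leo therefore finished faster than about 1 - 0.862 = 13.8% of his group.

(e) By the same reasoning, Mary finished faster than about 1 - 0.623 = 37.7% of her group.

(f) Each runner's Z score would remain the same, because it depends only on the group mean and standard deviation. The percentages in (d) and (e) rely on the normal model, so they would change.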
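
The share of each group that a runner beat can also be computed straight from the raw times, without standardizing first. This is a minimal sketch assuming the normal models from part (a); the list structure and the helper share_slower() are illustrative names.

```r
# Finishing times in seconds; lower is better. Parameters come from part (a).
leo  <- list(time = 4948, mean = 4313, sd = 583)
mary <- list(time = 5513, mean = 5261, sd = 807)

# Share of the group with a slower (larger) time than the runner,
# i.e. the upper tail of the group's normal model at the runner's time.
share_slower <- function(r) {
  pnorm(r$time, mean = r$mean, sd = r$sd, lower.tail = FALSE)
}

leo_share  <- share_slower(leo)   # about 0.138
mary_share <- share_slower(mary)  # about 0.377
print(round(c(Leo = leo_share, Mary = mary_share), 3))
```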

3.18

fheights <- c(54,55,56,56,57,58,58,59,60,60,61,61,62,62,63,63,63,64,65,65,67,67,69,73)
summary(fheights)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   54.00   58.00   61.50   61.58   64.25   73.00
fhghtmean <- mean(fheights)
fhghtsd <- sd(fheights)
hist(fheights)

qqnormsim(fheights)

(a)

pnorm(fhghtmean+fhghtsd, mean =  fhghtmean, sd = fhghtsd)
## [1] 0.8413447
pnorm(fhghtmean+2*fhghtsd, mean =  fhghtmean, sd = fhghtsd)
## [1] 0.9772499
pnorm(fhghtmean+3*fhghtsd, mean =  fhghtmean, sd = fhghtsd)
## [1] 0.9986501

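These values are cumulative probabilities under a normal curve with the sample mean and standard deviation, so they imply central areas of 68%, 95%, and 99.7% by construction. In the data themselves, 16 of the 24 heights (67%) fall within one standard deviation of the mean, 23 (96%) within two, and all 24 within three, so the heights approximately follow the 68-95-99.7% rule.

(b) The distribution is fairly normal. There appear to be a few outliers at both ends, but most data points lie close to the line in the normal probability plot.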
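
One way to check the rule against the data themselves is to count the observations within one, two, and three standard deviations of the sample mean. A minimal base R sketch, where within_k is an illustrative name:

```r
fheights <- c(54,55,56,56,57,58,58,59,60,60,61,61,62,62,63,63,63,64,65,65,67,67,69,73)
m <- mean(fheights)
s <- sd(fheights)

# Proportion of observations within k standard deviations of the mean, k = 1, 2, 3.
within_k <- sapply(1:3, function(k) mean(abs(fheights - m) <= k * s))
names(within_k) <- c("1 SD", "2 SD", "3 SD")

# Roughly 0.667, 0.958, and 1.000, to compare with 0.68, 0.95, and 0.997.
print(round(within_k, 3))
```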

3.22

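The defective rate is 2%.

(a) The 10th transistor is the first defective one, so it follows exactly 9 non-defective ones.

dgeom(10-1,0.02)
## [1] 0.01667496

(b) No defective transistors in a batch of 100, so the first 100 are all non-defective.

1-pgeom(100-1,0.02)
## [1] 0.1326196

(c)

p <- .02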
e <- 1/p
e
## [1] 50
sd <- sqrt((1 - p)/p^2)
sd
## [1] 49.49747

(d) 5% defective rate

p <- .05
e <- 1/p
e
## [1] 20
sd <- sqrt((1 - p)/p^2)
sd
## [1] 19.49359

(e) When the probability of a defect is higher, the first defect tends to arrive sooner: the mean wait drops from 50 to 20 transistors, and the standard deviation drops from about 49.5 to about 19.5.
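
R's geometric functions count the failures before the first success rather than the trial on which it occurs, which makes off-by-one mistakes easy. The sketch below wraps that convention in small helpers and compares the two defect rates. The helper names first_defect_on(), none_defective_in(), and wait_summary() are illustrative.

```r
# dgeom()/pgeom() count failures BEFORE the first success,
# so "first defect on trial n" corresponds to n - 1 failures.
first_defect_on <- function(n, p) dgeom(n - 1, p)

# Probability that the first n transistors are all non-defective: (1 - p)^n.
none_defective_in <- function(n, p) pgeom(n - 1, p, lower.tail = FALSE)

# Mean and SD of the trial number on which the first defect occurs.
wait_summary <- function(p) c(mean = 1 / p, sd = sqrt((1 - p) / p^2))

rates <- c(0.02, 0.05)
comparison <- sapply(rates, wait_summary)
colnames(comparison) <- paste0("p=", rates)
print(comparison)
```

Both rows shrink as p grows, which is the pattern described in part (e).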

3.38

boy <- .51
n <- 3
x <- 2
dbinom(x, n, boy)
## [1] 0.382347

(b) GBB, BGB, BBG

probboys <- ((.49*.51*.51)+(.51*.49*.51)+(.51*.51*.49))
probboys
## [1] 0.382347

(c) The approach from part (b) would be more tedious for 3 boys among 8 children because there are choose(8, 3) = 56 different orderings, and the probability of each one would have to be written out and summed.
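
The agreement between the two approaches can be checked by enumerating every birth order in base R. This is a sketch under the same independence assumption; the object names are illustrative.

```r
p_boy <- 0.51

# All 2^3 birth orders for three children; each column is one child.
orders <- expand.grid(rep(list(c("B", "G")), 3), stringsAsFactors = FALSE)
n_boys <- rowSums(orders == "B")

# Probability of each specific order under independence.
order_prob <- p_boy^n_boys * (1 - p_boy)^(3 - n_boys)

# Part (b) by enumeration versus part (a) by dbinom().
by_enumeration <- sum(order_prob[n_boys == 2])
by_formula <- dbinom(2, 3, p_boy)
print(c(by_enumeration, by_formula))

# Number of orders that would have to be listed for 3 boys among 8 children.
orders_for_8 <- choose(8, 3)   # 56
```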

3.42

p <- .15
n <- 10
k <- 3
choose(n - 1, k - 1) * (1 - p)^(n - k) * p^k
## [1] 0.03895012

(b) Since each serve is independent of the others, the probability that her 10th serve is successful is 15%.

(c) Part (a) asks for the probability of a whole sequence: exactly 2 successes in the first 9 serves, followed by a success on the 10th. Part (b) takes the outcomes of the first 9 serves as given and concerns only the 10th serve.
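
The hand formula in part (a) is the negative binomial probability, which base R also provides through dnbinom(). The sketch below checks the two against each other, and against the "binomial for the first 9 serves, then one more success" decomposition. The names by_hand, by_dnbinom, and by_product are illustrative.

```r
p <- 0.15   # probability of a successful serve
n <- 10     # serve on which the k-th success occurs
k <- 3      # number of successes

# Hand formula: (k - 1) successes in the first (n - 1) serves, then a success.
by_hand <- choose(n - 1, k - 1) * (1 - p)^(n - k) * p^k

# dnbinom() counts failures before the k-th success, here n - k = 7.
by_dnbinom <- dnbinom(n - k, size = k, prob = p)

# Equivalent decomposition: binomial for the first 9 serves, times p for the 10th.
by_product <- dbinom(k - 1, n - 1, p) * p

print(c(by_hand, by_dnbinom, by_product))
```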