##3.6 Triathlon Times. Men (mean=4313, sd=583), women (mean=5261, sd=807), times in seconds
#a) Cutoff time for the fastest 5% of men. (Here the fastest finishers have the lowest times, so the cutoff is the 5th percentile.)
qnorm(0.05, 4313, 583)
## [1] 3354.05
#The cutoff time for the fastest 5% of men is about 3,354 seconds.
#b Cutoff time for the slowest 10% of women. (The slowest finishers have the highest times, so the cutoff is the 90th percentile.)
qnorm(0.9, 5261, 807)
## [1] 6295.212
#The cutoff time for the slowest 10% of women is about 6,295 seconds.
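#A quick cross-check, offered as a sketch: the same cutoffs can be written in z-score form,
#cutoff = mean + z*sd, where z is the standard normal quantile.
#The variable names below are illustrative and not part of the original solution.
z_fast_men <- qnorm(0.05) #z-score with 5% of the distribution below it
z_slow_women <- qnorm(0.90) #z-score with 10% of the distribution above it
men_cutoff <- 4313 + z_fast_men*583
women_cutoff <- 5261 + z_slow_women*807
c(men_cutoff, women_cutoff) #should match the two qnorm() results above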

##3.12 Speeding on the I-5. mean=72.6 mph, sd=4.78 mph
#a What % of vehicles go slower than 80 mph?
pnorm(80, 72.6, 4.78)
## [1] 0.939203
#About 93.92%

#b What % of vehicles go between 60 and 80 mph? (area below the high value minus area below the low value)
pnorm(80, 72.6, 4.78)-pnorm(60, 72.6, 4.78)
## [1] 0.9350083
#About 93.50%

#c How fast do the fastest 5% of vehicles travel? (Unlike the triathlon, the fastest here have the highest values, so the cutoff is the 95th percentile.)
qnorm(0.95, 72.6, 4.78)
## [1] 80.4624
#About 80.46 mph.

#d Speed limit=70 mph. What percentage of vehicles exceed it?
1-pnorm(70, 72.6, 4.78)
## [1] 0.7067562
#About 70.68%

##3.18 Heights of Female College Students.
#a Mean=61.52, sd=4.58. Do the heights follow the 68-95-99.7 rule? (Check the share of heights within 1, 2, and 3 sd of the mean.)
H <- c(54, 55, 56, 56, 57, 58, 58, 59, 60, 60, 60, 61, 61, 62, 62, 63, 63, 63, 64, 65, 65, 67, 67, 69, 73)
m <- mean(H)
m
## [1] 61.52
s <- sd(H)
s
## [1] 4.583667
#Count the share of observed heights within 1, 2, and 3 sd of the mean.
mean(abs(H-m) <= s)
## [1] 0.68
mean(abs(H-m) <= 2*s)
## [1] 0.96
mean(abs(H-m) <= 3*s)
## [1] 1
#Yes. 68%, 96%, and 100% of the heights fall within 1, 2, and 3 sd of the mean, so the data closely follow the 68-95-99.7 rule.

##b Does data seem to follow a normal distribution?
hist(H)

qqnorm(H) #normal probability plot
qqline(H)

#Yes. The histogram is roughly symmetric and bell-shaped, and the points on the normal probability plot fall close to a straight line apart from a few outliers.

##3.24 Speeding on the I-5 Part II
##a Probability that 5 cars pass and none are speeding? (Probability that one car is not speeding, raised to the 5th power, since the cars are independent.)
p <- pnorm(70, 72.6, 4.78)
p^5
## [1] 0.002168423
#About 0.2%.
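#As a sanity check, the same probability can be read off the binomial distribution:
#zero speeders among 5 independent cars, each speeding with probability 1 minus the chance of not speeding.
#This is only a sketch; p_not_speeding is an illustrative name, not part of the original solution.
p_not_speeding <- pnorm(70, 72.6, 4.78)
dbinom(0, 5, 1 - p_not_speeding) #should agree with p^5 above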

##b On average, how many cars pass until one is speeding? Sd?
#Probability of speeding, as solved in 3.12 part d
1-p
## [1] 0.7067562
Espeed <- 1/(1-p)
Espeed
## [1] 1.414915
#On average, about 1.41 cars pass until one is speeding.
##SD
#For a geometric distribution with success probability 1-p, the sd is sqrt(p/(1-p)^2).
d <- sqrt(p/(1-p)^2)
round(d, 2) #standard deviation
## [1] 0.77

##3.30 Survey Response Rate. (response rate p=0.09)
sum(dbinom(1500:15000, 15000, 0.09))
## [1] 1.326331e-05
#The probability that at least 1,500 will agree to respond is about 1.33e-05, or roughly 0.0013%.
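#An equivalent way to get the upper tail, sketched here as a cross-check:
#P(X >= 1500) = P(X > 1499), which pbinom() returns directly with lower.tail = FALSE.
#The name upper_tail is illustrative.
upper_tail <- pbinom(1499, 15000, 0.09, lower.tail = FALSE)
upper_tail #should agree with the sum of dbinom() terms above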

##3.36 Multiple Choice Quiz. 5 questions, 4 choices.
#a Probability that the first correct answer is on the 3rd question. (Each question has a 25% chance of being right and a 75% chance of being wrong, so this is the probability of getting the first two wrong (0.75^2) multiplied by the probability of getting the third right (0.25).)
(0.75^2)*0.25
## [1] 0.140625
#b Probability that she gets exactly 3 or exactly 4 questions right (binomial with n=5 and p=0.25)
sum(dbinom(3:4, 5, 0.25))
## [1] 0.1025391
#About 10.25%.

#c She gets the majority of the questions right (at least 3 out of 5)
sum(dbinom(3:5, 5, 0.25))
## [1] 0.1035156
#About 10.35%.
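#The answers to parts a and c can be cross-checked with R's built-in distribution functions. This is a sketch with illustrative names.
#dgeom() counts failures before the first success, so "first correct answer on question 3" means 2 failures.
first_right_on_third <- dgeom(2, 0.25)
#"At least 3 of 5" is the complement of "at most 2 of 5".
majority_right <- pbinom(2, 5, 0.25, lower.tail = FALSE)
c(first_right_on_third, majority_right) #should match parts a and c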

##3.42 Serving in Volleyball. 15% chance of making serve, serves independent of each other. 
#a Probability that she makes her 3rd successful serve on the 10th try (negative binomial: 2 successes in the first 9 serves, then a success on the 10th)
goodserve <- 0.15
n <- 10
t <- 3
factorial(n-1)/(factorial(t-1)*(factorial(n - t)))*goodserve^t*(1-goodserve)^(n-t)
## [1] 0.03895012
#About 3.90%.
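#A cross-check using R's negative binomial function, offered as a sketch. dnbinom() is parameterized by the number of failures
#before the size-th success, so the 3rd success on the 10th serve corresponds to 7 failures. The name third_on_tenth is illustrative.
third_on_tenth <- dnbinom(7, size = 3, prob = 0.15)
third_on_tenth #should match the hand-computed formula above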
#b Probability that the 10th serve will be successful, given the results of the first nine serves.
#The probability is just 15%, because each serve is independent of the others.
#c Why are a and b different?
#Part a asks for a joint probability: exactly 2 successful serves in the first 9 attempts and a success on the 10th. Part b takes the first nine serves as already given and asks only about the 10th serve. Because the serves are independent, that single serve has a 15% chance of success no matter what happened before.

##4.6 Art After School.
#a What is the distribution called?
# Sampling distribution of the sample proportion (a proportion is the mean of 0/1 responses).

#b What distribution shape do you expect?
#Based on the sample in which 14 of 15 students responded yes, I would expect the distribution to be left skewed, because the proportion of yes answers is close to 1 and the sample size is small.

#c Calculate the variability (standard error of the sample proportion, sqrt(p*(1-p)/n), using the observed proportion 14/15 and n=15)
phat <- 14/15
v <- sqrt(phat*(1-phat)/15)
round(v, 4) # This is the approximation of the variability.
## [1] 0.0644
##How will the variability compare if the sample size increases to 25?
#The variability would decrease, because the standard error shrinks as the sample size grows (it is proportional to 1/sqrt(n)).
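#To make the comparison concrete, here is a small sketch. The standard error of a sample proportion is sqrt(p*(1-p)/n),
#so for a fixed p it scales as 1/sqrt(n), and the ratio below does not depend on p. The name se_ratio is illustrative.
se_ratio <- sqrt(15/25) #SE with n=25 relative to SE with n=15
se_ratio #below 1, so the larger sample gives the smaller standard error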

##4.12 Mental Health.
#a We are 95% confident that the average number of days on which US residents described their mental health as not good is between 3.40 and 4.24 days.
#b 95% confident means that if we took many random samples of the same size and built an interval from each, about 95% of those intervals would contain the true population mean.
#c A 99% confidence interval would be wider than the 95% confidence interval, because a higher confidence level requires a larger margin of error.
#d The standard error would be larger, because the sample size decreases and the se equals the sd divided by the sqrt of the sample size.
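#A sketch of part c, assuming the reported interval is a normal-based interval of the form mean +/- z*se.
#The names ci_center, ci_margin, and se_implied are illustrative; the se is backed out of the reported interval, not taken from the survey itself.
ci_center <- (3.40 + 4.24)/2
ci_margin <- (4.24 - 3.40)/2
se_implied <- ci_margin/qnorm(0.975) #95% critical value
ci_99 <- ci_center + c(-1, 1)*qnorm(0.995)*se_implied #99% critical value
ci_99 #wider than (3.40, 4.24)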

##4.18 Identify hypotheses, part II. 
#(a) Since 2008, chain restaurants in California have been required to display calorie counts of each menu item. Prior to menus displaying calorie counts, the average calorie intake of diners at a restaurant was 1100 calories. After calorie counts started to be displayed on menus, a nutritionist collected data on the number of calories consumed at this restaurant from a random sample of diners. Do these data provide convincing evidence of a difference in the average calorie intake of diners at this restaurant?
#Ho: The average calorie intake of diners is still 1100 calories. Ha: The average calorie intake differs from 1100 calories. In symbols, Ho: mu = 1100, Ha: mu does not equal 1100.

#(b) Based on the performance of those who took the GRE exam between July 1, 2004 and June 30, 2007, the average Verbal Reasoning score was calculated to be 462. In 2011 the average verbal score was slightly higher. Do these data provide convincing evidence that the average GRE Verbal Reasoning score has changed since 2004?
#Ho: The average GRE Verbal Reasoning score is still 462. Ha: The average score differs from 462. In symbols, Ho: mu = 462, Ha: mu does not equal 462.

##4.24 Gifted Children, Part I. (n=36, min=21, mean=30.69, sd=4.31, max=39)
#a Yes, the conditions for inference are satisfied, as it is a random sample and the sample size is large enough (over 30).

#b Hypothesis test. Ho: mu = 32 months (the average age at which gifted children first count to 10), Ha: mu < 32 months.
sdev <- 4.31
samplemean <- 30.69
se <- sdev/sqrt(36)
z <- (32-samplemean)/se
1-pnorm(z)
## [1] 0.0341013
#c The p-value (about 0.034) is lower than the significance level of 0.10, so we reject the null hypothesis and conclude that there is evidence that gifted children learn to count to 10 earlier.
#d 90% confidence interval
lower <- samplemean - 1.645 * se
upper <- samplemean + 1.645 * se
lower
## [1] 29.50834
upper
## [1] 31.87166
#e Yes, 32 is outside of the confidence interval so we reject the null just as we did in the hypothesis test. 
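#The 1.645 multiplier above is a rounded critical value. As a sketch, the same interval can be built from qnorm() directly.
#The names gifted_se, z_star, and ci_90 are illustrative.
gifted_se <- 4.31/sqrt(36)
z_star <- qnorm(0.95) #leaves 5% in each tail, giving a 90% interval
ci_90 <- 30.69 + c(-1, 1)*z_star*gifted_se
ci_90 #agrees with lower and upper to about three decimal places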

##4.30 Testing for Food Safety

#a Ho: Regulations are being met, license kept. Ha: Regulations are not being met, license revoked

#b A type 1 error would be finding that the restaurant is not meeting regulations and is committing gross violations when that is not true, and thus revoking the license when it should not be revoked.

#c A type 2 error would be concluding that the regulations are being met and the license is kept when the restaurant actually is committing gross violations and the license should be revoked.

#d Type 1 error is much more problematic for the restaurant owner because their restaurant shuts down when it should not have.

#e Type 2 error is more problematic for diners because the restaurant remains licensed when it should not have and continues to serve diners under gross health violations.

#f I would prefer strong evidence over very strong evidence, because as a diner I would rather be safe than sorry. If there is just strong evidence of gross violations, that is enough for me as a diner to want a restaurant's license to be revoked, even if it means more type 1 errors.

##4.32 True or False.
#a True. The 99% confidence interval is wider than the 95% confidence interval and contains it, so if a value is in the 95% interval it is also in the 99% interval.

#b False. The statement becomes true if it is changed to say that increasing the significance level increases the probability of making a type 1 error.

#c False. Failing to reject the null hypothesis does not prove that the true population mean is 5; it only means the data do not provide convincing evidence against that value.

#d True. When the alternative hypothesis is true, the probability of a type 2 error and the power of the test add up to 1.

#e True.

##4.38 Identify Distributions, part II
#Plot A is distribution (3), the 500 sample means from samples of size 81, because it has the tightest spread.
#Plot B is distribution (1), the random sample of 500 people, as the values go all the way up to 100 and it seems to follow the distribution of the plot we are given.
#Plot C is distribution (2), the sample means from samples of size 18, because it has a wider spread than Plot A but does not cover as wide a range as the full random sample.

##4.44 Nearsighted 
#a Ho: p=0.08, Ha: p does not equal 0.08, where p is the proportion of children who are nearsighted.

#b What proportion of children are nearsighted?
21/194
## [1] 0.1082474
#c Calculate the test statistic (observed proportion minus the null value, divided by the standard error)
zstat <- ((21/194)-0.08)/0.0195

#d What is the p-value? (two-sided, to match Ha)
2*pnorm(-abs(zstat))
## [1] 0.1474533
#e The p-value is larger than 0.05, so we cannot reject the null hypothesis that 8% of children are nearsighted.
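#The 0.0195 used in the test statistic is the standard error under the null hypothesis. A sketch of where it comes from,
#with illustrative names (p_null, n_kids, se_null):
p_null <- 0.08
n_kids <- 194
se_null <- sqrt(p_null*(1-p_null)/n_kids)
se_null #about 0.0195
#Success-failure check: the expected counts under the null should both be at least 10.
c(n_kids*p_null, n_kids*(1-p_null))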