#5.6 Working Inwards

##Given 90% Confidence Interval (65,77)
##Sample mean is
(65+77)/2
## [1] 71
##Margin of error
(77-65)/2
## [1] 6
##Since n = 25 is less than 30, I am using the t distribution
##Two-tailed t value for 90% confidence with 24 degrees of freedom is 1.71
##Standard error is margin of error / t value
(77-71)/1.71
## [1] 3.508772
##Sample SD is SE*sqrt(n)
3.508772*sqrt(25)
## [1] 17.54386
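##A minimal sketch of the same steps, using qt() in place of the rounded table value.
##The names lower, upper, n and conf are introduced only for this example; n = 25 follows from the 24 degrees of freedom used above.
lower <- 65; upper <- 77; n <- 25; conf <- 0.90
xbar <- (lower + upper)/2                    # sample mean: midpoint of the interval
me <- (upper - lower)/2                      # margin of error: half the interval width
t_star <- qt(1 - (1 - conf)/2, df = n - 1)   # two-tailed critical value
se <- me/t_star                              # standard error
s <- se*sqrt(n)                              # sample standard deviation
c(mean = xbar, ME = me, SE = se, SD = s)
##The SD comes out close to the value from the table-based calculation; the small gap comes from rounding t to 1.71.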
#5.14 SAT Scores
##a) SD=250, ME=25, confidence level is 90% so z=1.65
##n=(z*SD/ME)^2
(1.65*250/25)^2
## [1] 272.25
##Sample size must be a whole number, so round up: n should be at least 273.
##b) The minimum required sample size for a 99% CI is larger than for a 90% CI, because z is larger while SD and ME stay the same.
##c) Confidence level is 99%
##Using the same formula as in (a) and a z value of 2.58
(2.58*250/25)^2
## [1] 665.64
##Rounding up, n should be at least 666.
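##A small sketch of the formula from (a) as a function, with qnorm() supplying z instead of a rounded table value.
##sample_size() and its argument names are made up for this example.
sample_size <- function(sd, me, conf) {
  z <- qnorm(1 - (1 - conf)/2)   # two-sided critical value
  ceiling((z*sd/me)^2)           # round up to a whole observation
}
n90 <- sample_size(sd = 250, me = 25, conf = 0.90)
n99 <- sample_size(sd = 250, me = 25, conf = 0.99)
c(n90, n99)
## [1] 271 664
##These are slightly below the results from z = 1.65 and z = 2.58, which are rounded up from 1.645 and 2.576.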
#5.20 High School and Beyond
##a) There does not seem to be a clear difference, as the median lines in the boxplots are at approximately the same level and the histogram of differences has a center close to 0.
##b) The scores are not expected to be independent: both scores come from the same student, and students with more ability are expected to score higher on both.
##c) Null Hypothesis Ho : Mean of the differences = 0
##   Alternate Hypothesis Ha: Mean of the differences is not equal to 0
##d) Independence: satisfied as the students are randomly sampled, so one student's difference does not depend on another's
##Randomization: satisfied as random sampling is used
##Normal: satisfied as the sample size of 200 is large enough for application of the Central Limit Theorem
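##e) 
SE <- 8.887/sqrt(200)
t_stat <- (-0.545 - 0)/SE   # named t_stat so base R's t() is not masked
pt(q=t_stat, df=199, lower.tail = TRUE)
## [1] 0.1934182
##Ha is two-sided, so the p-value is twice this lower-tail area: about 0.387
##p is greater than 5% so we fail to reject the null hypothesis
##The data do not provide convincing evidence of a difference between the means of the two sets of scores.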
##f) We might have made a Type 2 error: failing to reject the null when there actually is a difference in the means.
##g) We failed to reject the null hypothesis at the 5% level, so we expect a 95% confidence interval for the mean difference to include 0.
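##A quick check of (g): a sketch of the 95% confidence interval for the mean difference, built from the same summary numbers as in (e).
##The names d_bar, s_d, n_d and ci are only for this example.
d_bar <- -0.545; s_d <- 8.887; n_d <- 200
se_d <- s_d/sqrt(n_d)                 # standard error of the mean difference
t_crit <- qt(0.975, df = n_d - 1)     # critical value for 95% confidence
ci <- d_bar + c(-1, 1)*t_crit*se_d    # lower and upper limits
ci
##The lower limit is negative and the upper limit is positive, so the interval contains 0.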

#5.32 Fuel Efficiency
mean1 <- 16.12
sd1<- 3.58
var1 <- sd1^2

mean2 <- 19.85
sd2<- 4.51
var2 <- sd2^2

diff <- mean1 - mean2
diff
## [1] -3.73
SE <- sqrt((var1/26) + (var2/26))   # n = 26 in each group
t_score <- (diff - 0)/SE
pt(q=t_score, df=25, lower.tail = TRUE) * 2   # two-sided p-value, df = 26 - 1
## [1] 0.002883615
##The data provide strong evidence of a difference between the means of the two groups, as the p-value of about 0.0029 is less than alpha.

#5.48 Work hours and Education
##a) Ho : mean1=mean2=mean3=mean4=mean5
##   Ha : not all of mean1, mean2, mean3, mean4 and mean5 are equal
##b) Independent: satisfied assuming the subjects are independent
##Normal: satisfied as the sample size is very large at 1172
##Equal variance: satisfied as the variances look more or less the same in the boxplots
##c) Df degree is 5-1 = 4
##Df residuals is 1172-5 = 1167
##Df total is 1172-1 = 1171
##Sum of squares degree is 501.54*4 = 2006.16
##Total sum of squares is 2006.16+267382 = 269388.16
##Mean square residual is 267382/1167 = 229.12
##F value is 501.54/229.12 = 2.19
##d) p > alpha so we fail to reject the null hypothesis
##The conclusion is that there is not enough evidence that at least one of the groups has a different mean.
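##The table entries in (c) can be recomputed in R. This is only a sketch from the numbers quoted above (mean square degree 501.54, residual sum of squares 267382, 5 groups, 1172 subjects).
##The variable names are made up for this example, and pf() is used to get the upper-tail p-value for the F statistic.
k <- 5; n_total <- 1172
df_g <- k - 1               # Df degree
df_e <- n_total - k         # Df residuals
ms_g <- 501.54              # mean square degree
ss_e <- 267382              # sum of squares residuals
ss_g <- ms_g*df_g           # sum of squares degree
ms_e <- ss_e/df_e           # mean square residuals
f_val <- ms_g/ms_e          # F value
p_val <- pf(f_val, df1 = df_g, df2 = df_e, lower.tail = FALSE)
c(SSG = ss_g, SST = ss_g + ss_e, MSE = ms_e, F = f_val, p = p_val)
##The p-value is above 0.05, which agrees with the decision in (d) at the 5% level.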