Assignment - Chapter 04

4.4
(a) Mean: 171.1 Medain: 170.3
(b) SD: 9.4 IQR: 177.8 - 163.8 = 14
(c) Z180 =

(180 - 171.1) / 9.4

## [1] 0.9468085

which is below 2 SD. So not unusually tall.

Z155 =

(155 - 171.1) / 9.4

## [1] -1.712766

which is also below 2 SD. So not unusually short.

(e) We use the SE, which is

9.4/sqrt(507)

## [1] 0.4174687

4.14
(a) False. Inference is made on the population parameter, not the point estimate. 
The point estimate is always in the confidence interval.

(b) False. Provided the sample is randomly selected and sample size way greater 
30, we can be lenient with the skew. So the normal distribution can be applied 
to generate confidence interval.

(c) False. The confidence interval is not about sample mean.

(d) True. Definition of confidence interval.

(e) True. 

(f) False. In the calculation of the standard error, we divide the standard 
deviation by the square root of the sample size. To cut the SE (or margin of 
error) in one third, we would need to sample 3^2 = 9 times the number of people 
in the initial sample.

(g) True. The margin of error is half the width of the interval

(89.11 - 80.31) / 2

## [1] 4.4

4.24

(a) Independence: The sample is random and 36 children would almost certainly 
make up less than 10% of the total school children in the city. The sample size 
is alteast 30. The data is slightly skewed to left but the sample is greater 
than 30, we can apply normal model can be applied.

(b) H0: u = 32  HA: u < 32 
alpha = 0.10

(z <- (30.69 - 32) * sqrt(36) / 4.31 )

## [1] -1.823666

p = 0.0344 
p < alpha, so we reject null hypothesis.

(c) If the null hypothesis is true, the probability of observing a sample mean 
at least as large as 32 months for a sample of 36 students is 0.0344. 
That is, if the null hypothesis is true, we would not
see such a large mean.

(d) z* value for 90% confidence interval is 1.64

(lo <- 30.69 - 1.64 * 4.31 / 6)

## [1] 29.51193

(hi <- 30.69 + 1.64 * 4.31 / 6)

## [1] 31.86807

c(lo,hi)

## [1] 29.51193 31.86807

(e) Yes, they agree. Both support the alternative hypothesis in providing 
convincing evidence that the average age at which gifted children first count 
to 10 successfully can be less than the general average of 32 months.

4.26
(a) H0: u = 100 HA: u not equal 100
alpha = 0.10

(z <- (118.2 - 100) * sqrt(36) / 6.5)

## [1] 16.8

(p <- 2 * (1 - .9998))

## [1] 4e-04

p < alpha, so we reject null hypothesis.

(b) z* value for 90% confidence interval is 1.64

(lo <- 118.2 - 1.64 * 6.5 / 6)

## [1] 116.4233

(hi <- 118.2 + 1.64 * 6.5 / 6)

## [1] 119.9767

c(lo,hi)

## [1] 116.4233 119.9767

(c)  They agree, and both support rejecting the null hypothesis. In the 2-sided 
hypothesis testing, the p-value returned is so small that it is too far less 
than the significance level of 0.10, while the confidence interval does not 
include the average IQ of 100 for the population

4.34 A sampling distribution of the mean is a probability distribution of mean 
obtained through a large number of samples drawn from a specific population. 
When the number of samples increase the shape of the distribution becomes more 
normal. Also the SD becomes less as sample size increases this 
makes curve variable. The center of the distribution comes close to the 
population mean when sample size increase.

4.40
(a)

pnorm(10500, 9000, 1000, lower.tail = FALSE)

## [1] 0.0668072

(b) The
population SD is known and the data are nearly normal, so the sample mean will 
be nearly normal with distribution N (μ, σ/ sqrt(n)),i.e. N (9000, 258.2).

(c)

pnorm(10500, mean = 9000, sd = 1000/sqrt(15), lower.tail = FALSE)

## [1] 3.133452e-09

(d)

par(mfrow = c(2, 1))

xpopulation <- 7000:12000
ypopulation <- dnorm(xpopulation,mean=9000,sd=1000)

xsample <- 7000:12000
ysample <- dnorm(xsample,mean=9000,sd=1000/sqrt(15))

plot(xpopulation,ypopulation)
plot(xsample, ysample)

(e) We could not estimate (a) without a nearly normal population distribution. 
We also could not estimate (c) since the sample size is not sufficient to yield 
a nearly normal sampling distribution if the population distribution is not 
nearly normal.

4.48 z score is directly proportional to sqrt(n). So when n increases z scores 
increases so does p value.

Assignment - Chapter 04

Binish Kurian Chandy

March 17, 2018