Data 606 Homework 5

5.6 Working backwards, Part II.

A 90% confidence interval for a population mean is (65, 77). The population distribution is approximately normal and the population standard deviation is unknown. This confidence interval is based on a simple random sample of 25 observations. Calculate the sample mean, the margin of error, and the sample standard deviation.

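mE = (77-65) / 2       # margin of error: half the interval width
sM = 65 + mE           # sample mean: midpoint of the interval
x = 25                 # sample size
dF = x-1               # degrees of freedom
CI = .9                # confidence level
a = 1-CI               # total tail area
tIndex =  (1-(a/2))    # quantile for the upper critical value
t = qt(tIndex,dF)      # critical t value
sE = mE/t              # standard error
sSD = sE*sqrt(x)       # sample standard deviation
sM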

## [1] 71

mE

## [1] 6

sSD

## [1] 17.53481
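
As a quick sanity check, the interval can be rebuilt from the recovered values. This is only a minimal sketch; the names `n`, `xbar`, `me`, `t_star`, `s`, and `ci` are illustrative and are not part of the original solution.

```r
# Rebuild the 90% interval from the recovered sample mean and standard deviation.
n <- 25                          # sample size given in the problem
xbar <- (65 + 77) / 2            # midpoint of the interval = sample mean
me <- (77 - 65) / 2              # half-width of the interval = margin of error
t_star <- qt(0.95, df = n - 1)   # critical value for 90% confidence, 24 df
s <- me / t_star * sqrt(n)       # sample standard deviation implied by the interval
ci <- xbar + c(-1, 1) * t_star * s / sqrt(n)
print(ci)                        # should reproduce the bounds given in the problem
```

If the recovered values are right, `ci` matches the interval (65, 77) stated in the problem.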

5.14 SAT scores.

(a) Raina wants to use a 90% confidence interval. How large a sample should she collect?

mE = 25
sSD = 250
z = qnorm(0.95)
x = ((sSD*z)/mE)^2
x

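## [1] 270.5543

A sample size must be a whole number, so we round up: Raina should collect at least 271 observations.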

(b) Luke wants to use a 99% confidence interval. Without calculating the actual sample size, determine whether his sample should be larger or smaller than Raina’s, and explain your reasoning.

Luke’s sample should be larger than Raina’s. A higher confidence level requires a larger critical value z*, and because the required sample size is n = (z* × SD / ME)^2, a larger z* with the same standard deviation and margin of error means a larger n.

(c) Calculate the minimum required sample size for Luke.

mE = 25
sSD = 250
z = qnorm(0.995)
x = ((sSD*z)/mE)^2
x

## [1] 663.4897

Rounding up, Luke needs at least 664 observations.
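
The two calculations in (a) and (c) differ only in the confidence level, so they can be wrapped in one small helper. This is a sketch under the same assumptions as above (standard deviation of 250 and margin of error of 25); the function name `min_sample_size` and its arguments are hypothetical, introduced here for illustration.

```r
# Minimum sample size for estimating a mean within a given margin of error,
# assuming the population standard deviation `sigma` is known or well guessed.
min_sample_size <- function(conf_level, sigma, margin) {
  z_star <- qnorm(1 - (1 - conf_level) / 2)  # two-sided critical value
  ceiling((z_star * sigma / margin)^2)       # round up to a whole observation
}

n_raina <- min_sample_size(0.90, sigma = 250, margin = 25)
n_luke  <- min_sample_size(0.99, sigma = 250, margin = 25)
print(c(raina = n_raina, luke = n_luke))
```

Because `z_star` grows with the confidence level, the helper returns a larger sample size for Luke than for Raina, which is the reasoning asked for in (b).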

5.20 High School and Beyond, Part I.

(a) Is there a clear difference in the average reading and writing scores?

No clear difference is immediately apparent. The minor differences seen could simply be due to sampling variability.

(b) Are the reading and writing scores of each student independent of each other?

Scores from different students are independent of each other if the students were randomly sampled. However, the reading and writing scores of an individual student are not independent of each other: they are paired observations from the same person.

(c) Create hypotheses appropriate for the following research question: is there an evident difference in the average scores of students in the reading and writing exam?

nH : The average difference between students' reading and writing scores is zero (there is no difference in the average scores).
aH : The average difference between students' reading and writing scores is not zero (there is a difference in the average scores).

(d) Check the conditions required to complete this test.

Students must be independent of each other. Yes
Nearly normal distribution. Yes
Sample size less than 10% of population. Yes

(e) The average observed difference in scores is $\bar{x}_{read-write} = -0.545$, and the standard deviation of the differences is 8.887 points. Do these data provide convincing evidence of a difference between the average scores on the two exams?

x = 200
dF = 199
sDD = 8.887
diffOfMean = -0.545
sE = sDD /sqrt(x)
t  = (diffOfMean) /(sE)
pV = pt(t,df=dF)   # lower-tail area only; double it for a two-sided test
pV

## [1] 0.1934182

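The value above is the area in one tail only. The hypotheses are two-sided, so the p-value is twice that, about 0.387. Either way it is larger than 0.05, so we fail to reject the null hypothesis: the data do not provide convincing evidence of a difference between the average reading and writing scores.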
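
For reference, the two-sided version of this paired test can be sketched directly from the summary statistics. The variable names below are illustrative, and the code assumes only the values given in the problem (200 students, mean difference of -0.545, standard deviation of 8.887).

```r
# Two-sided paired t-test from summary statistics.
n <- 200              # number of students (pairs of scores)
mean_diff <- -0.545   # average of (read - write)
sd_diff <- 8.887      # standard deviation of the differences
se <- sd_diff / sqrt(n)
t_stat <- mean_diff / se
# Double the tail area beyond |t| so that differences in either direction count.
p_two_sided <- 2 * pt(-abs(t_stat), df = n - 1)
print(round(c(t = t_stat, p = p_two_sided), 4))
```

Using `-abs(t_stat)` keeps the calculation correct whether the observed difference is negative or positive.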

(f) What type of error might we have made? Explain what the error means in the context of the application.

A Type II error could have been made: a false negative, where there really is a difference between the average reading and writing scores but we failed to detect it.

(g) Based on the results of this hypothesis test, would you expect a confidence interval for the average difference between the reading and writing scores to include 0? Explain your reasoning.

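Yes. We failed to reject the null hypothesis that the average difference is 0, so 0 remains a plausible value for the average difference. I would expect a confidence interval at the matching confidence level (95% for a 5% significance level) to include 0.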
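
This expectation can be checked with a short sketch that builds a 95% confidence interval from the same summary statistics. The 95% level is an assumption chosen to match the 5% significance level used in (e), and the variable names are illustrative.

```r
# 95% confidence interval for the mean of the paired differences.
n <- 200
mean_diff <- -0.545
sd_diff <- 8.887
se <- sd_diff / sqrt(n)
t_star <- qt(0.975, df = n - 1)            # critical value for 95% confidence
ci <- mean_diff + c(-1, 1) * t_star * se   # lower and upper bounds
includes_zero <- ci[1] < 0 && ci[2] > 0
print(round(ci, 3))
print(includes_zero)
```

The lower bound is negative and the upper bound is positive, so the interval contains 0, in agreement with the test result.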

5.32 Fuel efficiency of manual and automatic cars, Part I.

nH : There is no difference between the average city mileage of cars with manual and automatic transmissions.
aH : There is a difference between the average city mileage of cars with manual and automatic transmissions.

x = 26
autoMean = 16.12
manualMean = 19.85
autoSD = 3.58
manualSD = 4.51
diffOfMeans = autoMean - manualMean
sE = sqrt((autoSD^2/x)+(manualSD^2/x))
t = (diffOfMeans)/(sE)
p = pt(t,df=25)*2
p

## [1] 0.002883615

The p-value is less than 0.05, so we reject nH in favor of aH. The data provide convincing evidence of a difference in average city fuel efficiency between automatic and manual cars.
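
The calculation above uses 25 degrees of freedom, the smaller of the two sample sizes minus one, which is a conservative choice. As a comparison, the sketch below also computes the Welch-Satterthwaite degrees of freedom. That approximation is a swapped-in alternative, not the method used in this solution, and the variable names are illustrative.

```r
# Two-sample t-test from summary statistics, with two choices of degrees of freedom.
n <- 26                                   # cars sampled in each group
auto_mean <- 16.12;  auto_sd <- 3.58
manual_mean <- 19.85; manual_sd <- 4.51
v_auto <- auto_sd^2 / n                   # variance of the automatic sample mean
v_manual <- manual_sd^2 / n               # variance of the manual sample mean
t_stat <- (auto_mean - manual_mean) / sqrt(v_auto + v_manual)

df_conservative <- n - 1                  # approach used in the solution above
# Welch-Satterthwaite approximation (assumption: shown for comparison only)
df_welch <- (v_auto + v_manual)^2 /
  (v_auto^2 / (n - 1) + v_manual^2 / (n - 1))

p_conservative <- 2 * pt(-abs(t_stat), df = df_conservative)
p_welch <- 2 * pt(-abs(t_stat), df = df_welch)
print(c(df_welch = df_welch, p_conservative = p_conservative, p_welch = p_welch))
```

Both p-values fall well below 0.05, so the conclusion does not depend on which degrees of freedom are used.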

5.48 Work hours and education.

(a) Write hypotheses for evaluating whether the average number of hours worked varies across the five groups.

nH : The mean hours worked are equal across all five groups.
aH : At least one group mean differs from the others.

(b) Check conditions and describe any assumptions you must make to proceed with the test.

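Observations must be independent within and across groups, and each sample must be 10% or less of its population. The data within each group should be nearly normal; with group sizes greater than 30, moderate skew is less of a concern. The variability should also be roughly constant across groups. Assuming these conditions hold, we proceed with the test.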

(c) Below is part of the output associated with this test. Fill in the empty cells.

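msg = 501.54                     # mean square between groups, from the partial output
prf = 0.0682                     # p-value Pr(>F), from the partial output
x = 1172                         # total number of observations
groups = 5
spacesBetween = 4                # degrees of freedom between groups (groups - 1)
dF = x - groups                  # residual degrees of freedom
f = qf(1-prf,spacesBetween,dF)   # F statistic recovered from the p-value
mse= msg/f                       # mean square error
ssg = spacesBetween*msg          # sum of squares between groups
sse = dF*mse                     # sum of squares error
dF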

## [1] 1167

spacesBetween

## [1] 4

f

## [1] 2.188931

mse

## [1] 229.1255

ssg

## [1] 2006.16

sse

## [1] 267389.5
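
One way to confirm the filled-in cells is to check that they are internally consistent: the F statistic should reproduce the stated p-value, and the degrees of freedom and sums of squares should add up to their totals. The sketch below does this with illustrative variable names; the total row (`df_total`, `sst`) is derived here and is not part of the original output.

```r
# Consistency check for the reconstructed ANOVA table.
msg <- 501.54        # mean square between groups (given)
p_value <- 0.0682    # Pr(>F) (given)
n_total <- 1172      # total number of observations
k <- 5               # number of groups

df_group <- k - 1
df_resid <- n_total - k
df_total <- n_total - 1

f_stat <- qf(p_value, df_group, df_resid, lower.tail = FALSE)  # F from upper-tail p-value
mse <- msg / f_stat
ssg <- df_group * msg
sse <- df_resid * mse
sst <- ssg + sse     # total sum of squares (derived)

# Going back from F to the p-value should return the given 0.0682.
p_check <- pf(f_stat, df_group, df_resid, lower.tail = FALSE)
print(c(F = f_stat, MSE = mse, SSG = ssg, SSE = sse, SST = sst, p = p_check))
```

Using `lower.tail = FALSE` is equivalent to the `1 - prf` form used above and avoids subtracting the probability from 1 by hand.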

(d) What is the conclusion of the test?

Since the p-value is higher than 0.05, we fail to reject the null hypothesis. The data do not provide convincing evidence of a difference in the mean hours worked between the groups.