HW 4

4.4

a)

Average: 171.1
Median: 170.3

b)

Standard Deviation: 9.4
IQR: Q3 - Q1 = 177.8 - 163.8 = 14

c)

Z180 <- (180 - 171.1)/9.4
Z180
## [1] 0.9468085
Z155 <- (155 - 171.1)/9.4
Z155
## [1] -1.712766

Both observations are within 2 standard deviations of the mean, so they are not unusual.
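The 2-standard-deviation rule of thumb used here can be wrapped in a small helper. This is only a sketch: the function name `is_unusual` and its `cutoff` argument are illustrative choices, not part of the exercise.

```r
# Sample statistics reported in parts a) and b)
avg <- 171.1
s <- 9.4

# Hypothetical helper: flag observations more than `cutoff` SDs from the mean
is_unusual <- function(x, avg, s, cutoff = 2) {
  z <- (x - avg) / s
  abs(z) > cutoff
}

is_unusual(c(180, 155), avg, s)  # FALSE FALSE
```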

d)

I would expect the mean and standard deviation of the new sample to be close to the values above, but I wouldn’t expect them to be exactly the same.

e)

SE <- 9.4/sqrt(507)
SE
## [1] 0.4174687

4.14

a)

False. The confidence interval estimates the population mean, not the mean of this particular sample.

b)

False. The sample size is large enough to make up for the skew.

c)

False. The interval is for the population mean, not for sample means. Another sample may produce a different interval.

d)

True. The confidence interval is about the population mean.

e)

True.

f)

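False. The sample size would need to be 9 times larger: the margin of error is proportional to 1/sqrt(n), so it takes a 9-fold increase in n to cut it to a third.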
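Since the margin of error of a z-based interval is z* times s/sqrt(n), the factor of 9 can be checked numerically. The values of `s` and `n` below are made up for illustration; only the ratios matter.

```r
# Margin of error for a z-based interval (95% by default)
margin_of_error <- function(s, n, conf = 0.95) {
  z_star <- qnorm(1 - (1 - conf) / 2)
  z_star * s / sqrt(n)
}

s <- 50   # hypothetical sample standard deviation
n <- 400  # hypothetical sample size

me_n  <- margin_of_error(s, n)
me_3n <- margin_of_error(s, 3 * n)
me_9n <- margin_of_error(s, 9 * n)

me_3n / me_n  # 1/sqrt(3), about 0.577: tripling n is not enough
me_9n / me_n  # 1/3: nine times the sample size is needed
```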

g)

True. Margin of error = (89.11 - 80.31)/2 = 4.4
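Both the margin of error and the point estimate can be recovered from the interval endpoints. A short check, with variable names chosen here for illustration:

```r
lower <- 80.31
upper <- 89.11

margin <- (upper - lower) / 2    # half the interval width
estimate <- (upper + lower) / 2  # midpoint of the interval

margin    # 4.4
estimate  # 84.71
```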

4.24

a)

The conditions for inference are satisfied: the observations come from an independent sample in a large city, the sample size of 36 is greater than 30, and the distribution is not strongly skewed.

b)

H0: The average age to count to 10 is 32 months. \[\mu = 32\]

HA: The average age to count to 10 is less than 32 months. \[\mu < 32\]

SE <- 4.31/sqrt(36)
Z <- (30.69 - 32)/SE
p <- pnorm(Z)
round(p, 3)
## [1] 0.034

c)

The p-value of 0.034 is less than the significance level of 0.10, so we reject H0. The data provide evidence that the average age is less than 32 months.

d)

SE <- 4.31/sqrt(36)
upper <- 30.69 + (1.645 * SE)
lower <- 30.69 - (1.645 * SE)
upper
## [1] 31.87166
lower
## [1] 29.50834

e)

The results from the test and the interval agree: the test rejects H0, and 32 months is outside of the interval.
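The test and the interval in this exercise share the same standard error, so they can be computed together. The helper below is a sketch under the normal approximation used in this homework; `z_test_lower` is a name made up here, not a function from any package.

```r
# Hypothetical helper: one-sided (lower-tail) z-test for a mean
z_test_lower <- function(xbar, mu0, s, n) {
  se <- s / sqrt(n)        # standard error of the sample mean
  z <- (xbar - mu0) / se   # distance from mu0 in standard errors
  list(se = se, z = z, p = pnorm(z))
}

res <- z_test_lower(xbar = 30.69, mu0 = 32, s = 4.31, n = 36)
res$z            # about -1.82
round(res$p, 3)

# 90% confidence interval built from the same standard error
ci <- 30.69 + c(-1, 1) * qnorm(0.95) * res$se
ci
```

A one-sided test at the 0.05 level corresponds to a 90% interval, so a p-value below 0.05 goes together with an interval that excludes 32.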

4.26

a)

\[H_0: \mu = 100\] \[H_A: \mu \neq 100\]

SE <- 4.31/sqrt(36)
Z <- (118.2 - 100)/(SE)
p <- (1 - pnorm(Z)) * 2
p
## [1] 0

The p-value is effectively 0, which is less than the significance level of 0.10, so we reject H0 in favor of HA.

b)

SE <- 4.31/sqrt(36)
Upper <- 118.2 + 1.645 * SE
Upper
## [1] 119.3817
Lower <- 118.2 - 1.645 * SE
Lower
## [1] 117.0183

c)

The results from the test and the interval agree since 100 is outside of the interval.

4.34

The “sampling distribution” of the mean is the distribution of sample means obtained from repeated independent random samples of a fixed size. As the sample size increases, the shape gets closer to a normal distribution, the center stays at the population mean, and the spread gets narrower, so more sample means fall close to the population mean.
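A small simulation can illustrate how shape, center, and spread change with sample size. The population below (exponential with mean 10), the seed, and the number of repetitions are all assumptions made for this sketch.

```r
set.seed(606)  # arbitrary seed so the run is reproducible

# Hypothetical right-skewed population: exponential with mean 10 (and SD 10)
pop_mean <- 10
draw_sample_means <- function(n, reps = 5000) {
  replicate(reps, mean(rexp(n, rate = 1 / pop_mean)))
}

means_10  <- draw_sample_means(10)
means_100 <- draw_sample_means(100)

# Center: both stay near the population mean
c(mean(means_10), mean(means_100))

# Spread: shrinks roughly like pop_sd / sqrt(n)
c(sd(means_10), sd(means_100))
c(10 / sqrt(10), 10 / sqrt(100))

# Shape: the histogram for n = 100 looks much closer to a normal curve
par(mfrow = c(1, 2))
hist(means_10, main = "n = 10", xlab = "sample mean")
hist(means_100, main = "n = 100", xlab = "sample mean")
```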

4.40

a)

p <- 1 - pnorm(10500, 9000, 1000)
p
## [1] 0.0668072

The probability that a bulb lasts longer than 10,500 hours is 6.7%.

b)

The distribution of the mean lifespan of 15 light bulbs should be nearly normal, with mean 9,000 hours and standard error 1000/sqrt(15), about 258 hours.

c)

Z <- (10500 - 9000)/258
p <- 1 - pnorm(Z) 
p
## [1] 3.050719e-09

The probability is approximately 0%.
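The calculation above rounds the standard error to 258. A variant that passes the unrounded standard error straight to `pnorm` is sketched below; `lower.tail = FALSE` is a standard `pnorm` argument that returns the upper tail directly instead of subtracting from 1.

```r
n <- 15
se <- 1000 / sqrt(n)  # standard error of the mean lifespan, about 258.2

# P(sample mean > 10500) when the sample mean is N(9000, se)
p_mean <- pnorm(10500, mean = 9000, sd = se, lower.tail = FALSE)
p_mean
```

The result is still on the order of 1e-09, so the conclusion does not change.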

d)

library(DATA606)
normalPlot(9000, 1000)

normalPlot(9000, 1000/sqrt(15)) #15 bulbs
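For readers without the DATA606 package, both curves can also be drawn on one set of axes with base R. This is an alternative sketch, not the course's `normalPlot` function; the x-axis range, colors, and labels are arbitrary choices.

```r
mu <- 9000
sigma <- 1000
se <- sigma / sqrt(15)

# Sampling distribution of the mean of 15 bulbs (narrow, tall curve)
curve(dnorm(x, mean = mu, sd = se), from = 5000, to = 13000,
      xlab = "Lifespan (hours)", ylab = "Density", col = "blue")

# Population distribution of single bulbs (wide, flat curve)
curve(dnorm(x, mean = mu, sd = sigma), add = TRUE, col = "red")

legend("topright", legend = c("Mean of 15 bulbs", "Single bulb"),
       col = c("blue", "red"), lty = 1)
```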

e)

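No. With a skewed distribution, the normal model cannot be used for a single bulb in part a), and a sample of 15 is too small for the sample mean in part c) to be nearly normal.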

4.48

The p-value will decrease as the sample size increases. The sample size affects the standard error: a larger sample means a smaller standard error and vice versa (SE = sigma/sqrt(n)). With a smaller standard error, the Z-score moves farther from 0, Z = (xbar - mu)/SE, making the p-value decrease. Example:

p10 <- 1 - pnorm(.1) 
p10
## [1] 0.4601722
p20 <- 1 - pnorm(.5) 
p20
## [1] 0.3085375
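The example above uses two arbitrary Z-scores. The same point can be made by holding the observed difference fixed and changing only n. All numbers below (the difference of 1, the standard deviation of 5, and the two sample sizes) are made up for illustration.

```r
# One-sided p-value for a fixed observed difference at sample size n
p_value_for_n <- function(diff, s, n) {
  z <- diff / (s / sqrt(n))  # standard error shrinks as n grows
  pnorm(z, lower.tail = FALSE)
}

p_small <- p_value_for_n(diff = 1, s = 5, n = 50)
p_large <- p_value_for_n(diff = 1, s = 5, n = 500)

c(n50 = p_small, n500 = p_large)
p_large < p_small  # TRUE
```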