Given sample data, we use the sample statistics to make inferences about the unknown population parameters, such as the population mean and the population proportion. Often it is more informative to provide a range of values rather than a single point estimate for the unknown population parameter. This range of values is called a confidence interval, also referred to as an interval estimate, for the population parameter.
This lab explores how to construct confidence intervals for a population mean and a population proportion.
Remember to always set your working directory to the source file location. Go to ‘Session’, scroll down to ‘Set Working Directory’, and click ‘To Source File Location’. Read the instructions below carefully and follow them to complete the tasks and answer any questions.
Amazon Prime is a service that gives the company’s customers free delivery, free video streaming, and free e-book services. The file prime.csv includes annual expenditures (in $) for 100 Prime customers. Construct the 95% confidence interval for the average annual expenditures of all Prime customers and summarize the results.
#Read data correctly
myData = read.csv(file="prime.csv")
View(myData)
head(myData)
## Customer Expenditures
## 1 1 1272
## 2 2 1089
## 3 3 1169
## 4 4 1161
## 5 5 1286
## 6 6 1178
nrow(myData)
## [1] 100
Next, for a 95% confidence interval with n = 100, use the qt function to find the t-value.
#t value for a significance level of 0.05 and 99 degrees of freedom (100 - 1)
tvalue = qt(0.975, 99, lower.tail = TRUE)
tvalue
## [1] 1.984217
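The 0.975 passed to qt follows from the confidence level: a 95% interval leaves alpha = 0.05 in the two tails combined, so the critical value is the 1 - alpha/2 quantile. A minimal sketch of that arithmetic is below; the names conf_level, n_obs, alpha, deg_free, t_crit and z_crit are illustrative and are not used elsewhere in this lab.

```r
# Illustrative names only; none of these variables appear in the lab itself
conf_level <- 0.95
n_obs <- 100

alpha <- 1 - conf_level      # total tail probability
deg_free <- n_obs - 1        # degrees of freedom for the t distribution

# Two-sided critical value: the (1 - alpha/2) quantile
t_crit <- qt(1 - alpha / 2, df = deg_free)
t_crit

# Standard normal critical value at the same level, for comparison
z_crit <- qnorm(1 - alpha / 2)
z_crit
```

The t critical value is slightly larger than the normal one, which reflects the extra uncertainty from estimating the standard deviation from the sample.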
Next, calculate the lower and upper limits
# Lower limit
lower = mean(myData$Expenditures) - tvalue * sd(myData$Expenditures)/sqrt(nrow(myData))
lower
## [1] 1240.243
# Upper limit
upper = mean(myData$Expenditures) + tvalue * sd(myData$Expenditures)/sqrt(nrow(myData))
upper
## [1] 1373.637
With 95% confidence, we conclude that the average annual expenditure of all Prime customers falls between $1,240.24 and $1,373.64. CI = [$1,240.24, $1,373.64]
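As a cross-check, the same kind of interval can be obtained from the built-in t.test function, which reports a confidence interval for the mean. The sketch below uses simulated values as a stand-in for the lab data (the vector x and its parameters are assumptions, not the contents of prime.csv) and compares the manual formula with the built-in result.

```r
# Simulated stand-in for myData$Expenditures; these are NOT the lab data
set.seed(1)
x <- rnorm(100, mean = 1300, sd = 300)

# Manual interval: mean +/- t * s / sqrt(n), the same formula used in the lab
tcrit <- qt(0.975, df = length(x) - 1)
manual <- mean(x) + c(-1, 1) * tcrit * sd(x) / sqrt(length(x))

# Built-in interval from a one-sample t test
builtin <- t.test(x, conf.level = 0.95)$conf.int

manual
builtin
```

With the lab data loaded, t.test(myData$Expenditures, conf.level = 0.95)$conf.int should reproduce the lower and upper limits calculated by hand.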
In a sample of 25 ultra-green cars, seven of the cars obtained over 100 miles per gallon (mpg). Construct a 90% confidence interval for the population proportion of all ultra-green cars that obtain over 100 mpg.
First, calculate the point estimate for the population proportion and check the normality conditions. (0.5 points)
#point estimate for the population proportion
n=25
pestimate = 7/n
pestimate
## [1] 0.28
#check for the normality condition
np = 25*pestimate
np
## [1] 7
noneminusp = 25*(1-pestimate)
noneminusp
## [1] 18
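Is the normality condition satisfied? Yes. Under the common rule of thumb that np and n(1 - p) should both be at least 5, the values 7 and 18 satisfy the condition.
Next, calculate the z* value in the formula for the margin of error.
#Calculate z* for a 90% confidence level (alpha = 0.10, so alpha/2 = 0.05)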
zstar = qnorm(p=0.95, mean = 0, sd = 1, lower.tail = TRUE)
zstar
## [1] 1.644854
Next, calculate the lower and upper limits.
#lower limit
lowerp <- pestimate - zstar * sqrt(pestimate*(1-pestimate)/n)
lowerp
## [1] 0.1322925
#upper limit
upperp <- pestimate + zstar * sqrt(pestimate*(1-pestimate)/n)
upperp
## [1] 0.4277075
Conclusion: With 90% confidence, the proportion of all ultra-green cars that obtain over 100 mpg is between 13.23% and 42.77%.
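The point estimate, z* value, and limits can be bundled into one small function so the calculation is easy to repeat for other samples or confidence levels. This is a minimal sketch: prop_ci is a hypothetical helper written for illustration, not a base R function, and it assumes the normality condition has already been checked.

```r
# prop_ci is a hypothetical helper, not part of base R
prop_ci <- function(successes, n, conf_level) {
  p_hat <- successes / n                         # point estimate
  z <- qnorm(1 - (1 - conf_level) / 2)           # z* for the given level
  margin <- z * sqrt(p_hat * (1 - p_hat) / n)    # margin of error
  c(lower = p_hat - margin, upper = p_hat + margin)
}

# Ultra-green cars: 7 of 25 obtained over 100 mpg, 90% confidence
ci <- prop_ci(7, 25, 0.90)
ci
```

For these inputs the function returns the same limits as the step-by-step calculation, 0.1322925 and 0.4277075.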
Assuming a larger confidence level of 99%, would the confidence interval be wider or narrower? (0.25 points)
As the confidence level increases, so does the width of the confidence interval, because a larger z* value gives a larger margin of error. As a result, a 99% confidence level would produce a wider confidence interval.
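The effect of the confidence level on the interval width can be checked numerically. The sketch below reuses the ultra-green car sample (7 of 25) and compares the 90%, 95%, and 99% levels; the variable names conf_levels, z_star, and width are illustrative.

```r
# Same sample as above: 7 of 25 cars obtained over 100 mpg
p_hat <- 7 / 25
n_cars <- 25

# Confidence levels to compare (illustrative choice)
conf_levels <- c(0.90, 0.95, 0.99)

# z* for each level, and the full interval width (twice the margin of error)
z_star <- qnorm(1 - (1 - conf_levels) / 2)
width <- 2 * z_star * sqrt(p_hat * (1 - p_hat) / n_cars)

data.frame(conf_level = conf_levels, z_star = z_star, width = width)
```

The standard error is the same in every row, so the width grows only because z* grows with the confidence level.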