Load the data set workers:
workers<-read.file("/home/emesekennedy/Data/Ch5/workers.txt")
## Reading data with read.table()
The mean salary for the entire population of 1000 workers is
mu<-mean(~salary, data=workers)
mu
## [1] 23600.19
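The standard deviation of salary for the entire population of 1000 workers is computed below. Note that sd() divides by \(n-1\); with 1000 values the difference from the population formula is negligible.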
s<-sd(~salary,data=workers)
s
## [1] 19269.82
Draw 1000 random samples of size 100, each drawn with replacement, and record the mean \(\bar{x}\) of each sample:
samples<-do(1000)*mean(~salary,data=sample(workers, 100, replace=TRUE))
Rename the first column of samples to xbar:
names(samples)[1]<-"xbar"
Find \(z^*\) for the 95% confidence interval, using the 2.5th and 97.5th percentiles of the standard normal distribution:
qdist("norm", mean=0, sd=1, c(.025, .975))
## [1] -1.959964 1.959964
So \(z^*\) is approximately 1.96. Compute the margin of error for samples of size 100:
m<-1.96*s/sqrt(100)
m
## [1] 3776.885
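As a check on the arithmetic, the margin of error can be written out by hand. Here \(\sigma\) denotes the population standard deviation stored in s and \(n\) the sample size; these symbols are introduced only for this illustration:

\[
m = z^* \cdot \frac{\sigma}{\sqrt{n}} \approx 1.96 \times \frac{19269.82}{\sqrt{100}} = 1.96 \times 1926.982 \approx 3776.885
\]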
Add a column to samples that has the lower bound for the 95% confidence intervals for each of the 1000 samples:
samples<-transform(samples, lower=xbar-m)
Add a column to samples that has the upper bound for the 95% confidence intervals for each of the 1000 samples:
samples<-transform(samples, upper=xbar+m)
Create a subset with the samples for which the population mean \(\mu\) lies within the computed 95% confidence intervals:
samples2<-subset(samples, lower<mu & upper>mu)
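The subset condition can also be read as a statement about distance: \(\mu\) lies strictly inside the interval exactly when the sample mean is within one margin of error of \(\mu\):

\[
\bar{x} - m < \mu < \bar{x} + m \iff |\bar{x} - \mu| < m
\]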
Count the samples in this subset:
nrow(samples2)
## [1] 955
So for 955 of the 1000 samples (95.5%), the 95% confidence interval contained the population mean, which is close to the nominal 95% level.
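The same experiment can be sketched in base R without the mosaic helpers. This is only an illustration under stated assumptions: the population below is a hypothetical, randomly generated stand-in (workers.txt is not reproduced here), and the names population, sigma, n, and coverage are not part of the original analysis. Because the data are simulated, the coverage will be near 95% but will not match the 955 reported above.

```r
# Hypothetical stand-in population: 1000 right-skewed "salaries".
# (Assumption: this replaces the real workers.txt data for illustration.)
set.seed(1)
population <- rexp(1000, rate = 1 / 23600)

mu    <- mean(population)                    # population mean
sigma <- sqrt(mean((population - mu)^2))     # population SD (divisor N)
n     <- 100                                 # sample size
m     <- qnorm(0.975) * sigma / sqrt(n)      # margin of error, z* from qnorm

# Draw 1000 samples of size n with replacement and record each sample mean.
xbar <- replicate(1000, mean(sample(population, n, replace = TRUE)))

# Proportion of intervals (xbar - m, xbar + m) that contain mu.
coverage <- mean(xbar - m < mu & mu < xbar + m)
coverage
```

Taking the mean of a logical vector gives the proportion of TRUE values directly, so no separate subset and nrow() step is needed.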