The main idea is that the sampling distribution of the sample mean is approximately normal with mean \( \mu \) and standard deviation \( \frac{\sigma}{\sqrt{n}} \) when \( n \) is large, where \( \mu \) is the mean of the population and \( \sigma \) is the standard deviation of the population.
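A short simulation can make this claim concrete. The sketch below assumes an exponential population with rate 1 (so \( \mu = 1 \) and \( \sigma = 1 \)); the population, the sample size and the number of replicates are illustrative choices, not values from this text.

```r
# Simulate the sampling distribution of the mean.
# Assumption: an exponential population with rate 1, so mu = 1 and sigma = 1.
set.seed(1)
mu <- 1
sigma <- 1
n <- 50          # sample size
reps <- 10000    # number of simulated samples

# Each replicate draws n values and records their mean.
means <- replicate(reps, mean(rexp(n, rate = 1)))

mean(means)        # should be close to mu
sd(means)          # should be close to sigma / sqrt(n)
sigma / sqrt(n)    # theoretical standard deviation of the sample mean
```

Calling hist(means) or qqnorm(means) afterwards should show a roughly normal shape even though the population itself is strongly skewed.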
We can use qnorm to find a 95% confidence interval for the mean weight of 13-year-old girls in this community:
xbar <- 95
sigma <- 24.6
n <- 150
qnorm(c(.025, .975), mean=xbar, sd=sigma/sqrt(n))
[1] 91.06325 98.93675
If we repeated this sampling process many times, about 95% of the intervals obtained this way would contain the true mean weight.
What does the community conclude? The interval does not contain the mean weight of the national population, so they have evidence that the mean weight of girls in their community differs from that of the nation.
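The repeated-sampling interpretation of the interval can be checked by simulation. This is a minimal sketch that assumes a normal population with a made-up true mean of 100; only sigma and n are taken from the example, and the seed and number of replicates are arbitrary.

```r
# Check the repeated-sampling interpretation of a 95% interval.
# Assumption: a normal population with a known, made-up mean of 100.
set.seed(2)
mu <- 100
sigma <- 24.6
n <- 150
reps <- 10000

covered <- replicate(reps, {
  xbar <- mean(rnorm(n, mean = mu, sd = sigma))
  ci <- qnorm(c(.025, .975), mean = xbar, sd = sigma / sqrt(n))
  ci[1] <= mu && mu <= ci[2]   # TRUE when the interval contains mu
})

mean(covered)   # proportion of intervals containing mu; should be near 0.95
```

Any single interval either contains the true mean or it does not; the 95% describes how often the procedure succeeds over many samples.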
qnorm(0.975, mean=0, sd=1)*sigma/sqrt(n)
[1] 3.936748
qnorm(0.025, mean=0, sd=1)*sigma/sqrt(n)
[1] -3.936748
xbar-qnorm(0.975, mean=0, sd=1)*sigma/sqrt(n)
[1] 91.06325
xbar+qnorm(0.975, mean=0, sd=1)*sigma/sqrt(n)
[1] 98.93675
Notice: We used a mean of 0 and a standard deviation of 1. This is for a standard normal distribution. The reason for this type of reporting is historical: before R there were tables of values, but only for standard normal distributions.
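The two ways of calling qnorm give the same endpoints, because a quantile of a normal distribution with mean xbar and standard deviation se equals xbar plus se times the matching standard normal quantile. The sketch below reuses the values from the example; the names se, z, direct and rescaled are introduced here only for illustration.

```r
xbar <- 95
sigma <- 24.6
n <- 150
se <- sigma / sqrt(n)   # standard deviation of the sample mean

# Quantiles taken directly from a normal with mean xbar and sd se
direct <- qnorm(c(.025, .975), mean = xbar, sd = se)

# The same quantiles built from the standard normal: xbar + z * se
z <- qnorm(c(.025, .975))   # mean = 0 and sd = 1 are the defaults
rescaled <- xbar + z * se

all.equal(direct, rescaled)   # TRUE
```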
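However, qt and pt don't quite work like qnorm and pnorm: they take the degrees of freedom (df) rather than a mean and standard deviation, so the quantile must be scaled by S/sqrt(n) by hand.
xbar <- 110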
S <- 7.5
n <- 28
df <- n-1
q <- qt(.95, df)
q
[1] 1.703288
MOE <- q*S/sqrt(n)
xbar-MOE
[1] 107.5858
xbar+MOE
[1] 112.4142
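Note that qt(.95, df) leaves 5% in each tail, so the interval just computed is a two-sided 90% interval. The steps can be wrapped in a small function so the confidence level is explicit. This is only a sketch: t_ci is a hypothetical helper name, not a base R function.

```r
# t_ci is a hypothetical helper, not part of base R.
t_ci <- function(xbar, s, n, conf = 0.95) {
  alpha <- 1 - conf
  q <- qt(1 - alpha / 2, df = n - 1)   # upper critical value
  moe <- q * s / sqrt(n)               # margin of error
  c(lower = xbar - moe, upper = xbar + moe)
}

t_ci(110, 7.5, 28, conf = 0.90)   # reproduces the interval computed by hand
t_ci(110, 7.5, 28, conf = 0.95)   # wider, since qt(.975, 27) > qt(.95, 27)
```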
There is a shorter procedure: t.test computes the interval directly from the data.
library(resampledata)
girls <- subset(NCBirths2004, select=Weight, subset=Gender=="Female", drop=TRUE)
qqnorm(girls)
t.test(girls, conf.level=.99)$conf
[1] 3343.305 3453.328
attr(,"conf.level")
[1] 0.99
(If you take off $conf you'll see some other information.)
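To see that t.test is doing the same calculation as the manual qt approach, the two can be compared on the same data. The sketch below uses simulated values as a stand-in for real measurements (the mean of 3400 and standard deviation of 480 are made up), so it runs without any extra package.

```r
# Simulated weights stand in for real data (an assumption for illustration).
set.seed(3)
x <- rnorm(40, mean = 3400, sd = 480)

# Interval from t.test
auto <- t.test(x, conf.level = .99)$conf.int

# The same interval by hand
n <- length(x)
moe <- qt(.995, df = n - 1) * sd(x) / sqrt(n)
manual <- c(mean(x) - moe, mean(x) + moe)

all.equal(as.numeric(auto), manual)   # TRUE
```

as.numeric() is used only to drop the conf.level attribute before comparing.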
For a different confidence level, change the probability passed to qt. E.g., for a 95% CI: q <- qt(.975, df)