Confidence Intervals

In statistics, a Confidence Interval is an interval estimate of an unknown population parameter, calculated from a sample. The confidence level describes how reliable the method is: if we drew many samples and calculated an interval from each one, about that percentage of the intervals would contain the population parameter.

For example, if we use a 95% confidence level, we are finding the lower and upper limits of an interval using a method that captures the unknown parameter (often the population mean) in about 95% of repeated samples.
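
The repeated-sampling idea can be checked with a small simulation. The following is only a sketch: the population mean, standard deviation, sample size, number of repetitions, and seed are arbitrary values chosen for illustration and are not taken from this page. It uses the Z-based error term defined in the Equations section.

```r
# Simulation sketch: how often does a 95% Z-interval capture the true mean?
# mu, sigma, n, reps and the seed are illustrative assumptions.
set.seed(1)
mu    <- 50      # true population mean (known only because we simulate)
sigma <- 10      # population standard deviation, treated as known
n     <- 30      # observations per sample
reps  <- 10000   # number of repeated samples

z     <- qnorm(0.975)          # Z value for a 95% confidence level
error <- z * sigma / sqrt(n)   # same error for every sample, since sigma and n are fixed

# Draw `reps` samples and keep the mean of each one
xbar <- replicate(reps, mean(rnorm(n, mean = mu, sd = sigma)))

# An interval contains mu when mu lies within `error` of the sample mean
coverage <- mean(abs(xbar - mu) <= error)
cat("Proportion of intervals containing mu:", coverage, "\n")
```

The printed proportion should land close to 0.95, and it moves slightly from run to run if the seed is changed.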

The upper limit is found by adding the error to the sample mean, and the lower limit is found by subtracting the error from the sample mean. The difference between the lower limit and upper limit is twice the error.

Equations

Equation to find the Lower Limit and Upper Limit: \(\hspace{1cm} \bar{X} \pm Z_{\alpha \over 2} {\sigma \over \sqrt n}\)

Therefore the Confidence Interval is: \(\hspace{1cm} (\bar{X} - Z_{\alpha \over 2} {\sigma \over \sqrt n}, \bar{X} + Z_{\alpha \over 2} {\sigma \over \sqrt n})\)

\(\bar{X} \hspace{1cm}\) represents the sample mean

\(Z_{\alpha \over 2}\hspace{1cm}\) represents the Z value corresponding to the Confidence Level (CL), such that CL = 1 - \(\alpha\)

\(\sigma \hspace{1cm}\) represents the population Standard Deviation, which this formula assumes is known

\(n \hspace{1cm}\) represents the number of observations

(\(Z_{\alpha \over 2} {\sigma \over \sqrt n}\) is also called the “Error”, or the margin of error)
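
The equations above can be wrapped in a small R function. This is a sketch, not part of the original material: the function name z_interval and the inputs in the example call (sample mean 100, standard deviation 15, 25 observations) are hypothetical.

```r
# Hypothetical helper: Z-based confidence interval for a mean with known sigma
z_interval <- function(xbar, sigma, n, cl = 0.95) {
  alpha <- 1 - cl                   # CL = 1 - alpha
  z     <- qnorm(1 - alpha / 2)     # Z value with area 1 - alpha/2 to its left
  error <- z * sigma / sqrt(n)      # the "Error" term
  c(lower = xbar - error, upper = xbar + error, error = error)
}

# Example call with made-up inputs
z_interval(xbar = 100, sigma = 15, n = 25, cl = 0.95)
```

The function returns the lower limit, the upper limit, and the error, so the difference between the limits can be compared with twice the error.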

Z-Scores

The value of the Z-Score is the number of standard deviations a value is away from the mean. The Z-Score that is used to find the Confidence Interval changes based on the given Confidence Level.

For example, to find the Confidence Interval with a 95% confidence level, set 0.95 = 1 - \(\alpha\). Then \(\alpha\) = 0.05, so we need the value of \(Z_{0.05 \over 2} = Z_{0.025}\). This is the Z value with an area of 0.025 to its right, which means the area to its left is 1 - 0.025 = 0.9750. We can look up 0.9750 in a Z-Score table, or use the R function qnorm(0.9750), which returns the 97.5th percentile of the standard normal distribution. Both methods give a Z-Score of 1.96. Therefore, the equation for calculating the Confidence Interval with a 95% Confidence Level is \(\hspace{1cm} \bar{X} \pm 1.96 {\sigma \over \sqrt n}\)

cat(round(qnorm(0.9750),2),",",round(qnorm(0.0250),2))

## 1.96 , -1.96

Common Z-Scores in calculating Confidence Intervals

Here is a table of commonly used \(Z_{\alpha \over 2}\hspace{0.25cm}\) values.

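# Each row: confidence level, alpha/2, the R call, its value, and the rounded Z-Score.
# Mixing text and numbers in matrix() stores every cell as a character string.
zscore = matrix(c("80%", 0.10, "qnorm(0.80+0.10)", qnorm(0.80+0.10), 
    1.28,"90%", 0.05,"qnorm(0.90+0.05)", qnorm(0.90+0.05), 1.645, 
    "95%", 0.025, "qnorm(0.95+0.025)", qnorm(0.95+0.025), 1.96, 
    "99%", 0.005, "qnorm(0.99+0.005)", qnorm(0.99+0.005), 2.58),
    ncol=5, byrow=TRUE)
# Trailing spaces in the column names pad the printed columns
colnames(zscore) = c("CL  ", "Alpha/2  ", "R Function ", "Z-Score ", 
    "Rounded Z-Score")
as.table(zscore)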

##   CL   Alpha/2   R Function        Z-Score          Rounded Z-Score
## A 80%  0.1       qnorm(0.80+0.10)  1.2815515655446  1.28           
## B 90%  0.05      qnorm(0.90+0.05)  1.64485362695147 1.645          
## C 95%  0.025     qnorm(0.95+0.025) 1.95996398454005 1.96           
## D 99%  0.005     qnorm(0.99+0.005) 2.5758293035489  2.58

An Example of Calculating the Confidence Interval

Assume the results of an experiment have a normal distribution. The sample mean \(\bar{X}\) = 131, the population standard deviation \(\sigma\) = 40, and the number of observations \(n\) = 64. Use a 99% Confidence Level to find the Confidence Interval of the experiment.

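x = 131      # sample mean
sd = 40      # population standard deviation
n = 64       # number of observations
# Error: rounded Z-Score for a 99% confidence level (2.58) times sd/sqrt(n)
error = round(qnorm(0.99+0.005),2) * sd/sqrt(n)
lower = x - error
upper = x + error
cat("The Confidence Interval is (",lower, ",", upper, ")")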

## The Confidence Interval is ( 118.1 , 143.9 )
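
The arithmetic behind this result can be written out step by step, using the rounded Z-Score of 2.58 for a 99% Confidence Level:

\(\hspace{1cm} Z_{\alpha \over 2} {\sigma \over \sqrt n} = 2.58 \times {40 \over \sqrt{64}} = 2.58 \times 5 = 12.9\)

\(\hspace{1cm} (\bar{X} - 12.9, \bar{X} + 12.9) = (131 - 12.9, 131 + 12.9) = (118.1, 143.9)\)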

Confidence Level and Corresponding Z-Score

Graph of Confidence Level vs Confidence Interval

The graph below shows the relationship between the Confidence Level and the width of the Confidence Interval. In this case, the mean is 0, the standard deviation is 1, and the number of observations is 1, but the same pattern holds for any mean, standard deviation, and number of observations. The width of the Confidence Interval gets larger the closer the Confidence Level gets to 100%.
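
The same relationship can be checked numerically. The sketch below uses the settings described above (standard deviation 1, one observation) and calculates the width, which is twice the error, for a handful of Confidence Levels. The particular levels in cl are an arbitrary choice for illustration.

```r
# Width of the Z-interval at several confidence levels (sigma = 1, n = 1)
cl    <- c(0.80, 0.90, 0.95, 0.99, 0.999)   # illustrative confidence levels
sigma <- 1
n     <- 1

# Width = 2 * error = 2 * Z_{alpha/2} * sigma / sqrt(n), where alpha = 1 - CL
width <- 2 * qnorm(1 - (1 - cl) / 2) * sigma / sqrt(n)

data.frame(CL = cl, Width = round(width, 3))
```

Each width is larger than the one before it. At a Confidence Level of exactly 100%, qnorm(1) returns Inf, so the interval has no finite width.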