Coverage Probability in Confidence Intervals

Introduction

A confidence interval for an unknown parameter is a range of values constructed so that it contains the parameter with a stated level of confidence. In the frequentist view, confidence intervals are grounded in repeatability, a concept that can be explained through the following steps:

1. define a population for the parameter of interest;
2. obtain a sample from that population;
3. set a confidence level;
4. build an interval (with this confidence level) for the parameter of interest;
5. repeat steps 2 to 4 a large number of times.
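The steps above can be sketched in R for the simplest case, a t-interval for the mean of a Normal population. This is only an illustrative sketch: the values of `mu`, `sigma`, `size` and the number of repetitions are assumptions chosen here, not taken from the original.

```r
set.seed(1)

# step 1: a Normal population with known mean (illustrative values)
mu <- 5
sigma <- 2

# step 3: the confidence level
conf_level <- 0.95

# steps 2, 4 and 5: draw a sample, build a t-interval, repeat many times
size <- 30
numb_samples <- 2000
covered <- replicate(numb_samples, {
  x <- rnorm(size, mean = mu, sd = sigma)
  ci <- t.test(x, conf.level = conf_level)$conf.int
  ci[1] <= mu && mu <= ci[2]   # TRUE if this interval contains the true mean
})

# proportion of intervals that contain the true mean
coverage <- mean(covered)
coverage
```

With enough repetitions, `coverage` should settle close to `conf_level`.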

Over these repetitions, the confidence level acquires the meaning of coverage probability, i.e. the proportion of intervals that contain the parameter. In practice only one sample is usually drawn, and no probability statement can be made about the parameter lying in the single interval derived from it. Even so, the following section shows empirically that repeated sampling is a very useful tool for grasping classical inference.

The case of ratio of variances

The steps

This section applies the above steps to the ratio of two variances. Since two Normally distributed populations are considered, the steps need to be adapted as follows:

1. define two Gaussian populations (named \(M\) and \(N\), respectively) with (possibly) different variances. For instance, choose \(\sigma_M^2 \ge \sigma_N^2\);
2. obtain a sample of size \(m\) from population \(M\), and one of size \(n\) from population \(N\);
3. set a confidence level, say 95%;
4. build an interval for \(\frac{\sigma_M^2}{\sigma_N^2}\);
5. repeat steps 2 to 4 a large number of times, say ten thousand.
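The interval built in step 4 can be written out explicitly. The following is a sketch, assuming the two samples are independent; the symbols \(S_M^2\) and \(S_N^2\) (sample variances), \(\alpha\) (one minus the confidence level) and \(F_{p;\,m-1,\,n-1}\) (the \(p\)-quantile of the F distribution with \(m-1\) and \(n-1\) degrees of freedom) are notation introduced here. Under Normality,

\[
\frac{S_M^2/\sigma_M^2}{S_N^2/\sigma_N^2} \sim F_{m-1,\,n-1},
\]

so that

\[
P\left( F_{\alpha/2;\,m-1,\,n-1} \le \frac{S_M^2}{S_N^2} \cdot \frac{\sigma_N^2}{\sigma_M^2} \le F_{1-\alpha/2;\,m-1,\,n-1} \right) = 1 - \alpha .
\]

Solving the two inequalities for the ratio of variances gives the interval

\[
\frac{S_M^2/S_N^2}{F_{1-\alpha/2;\,m-1,\,n-1}} \;\le\; \frac{\sigma_M^2}{\sigma_N^2} \;\le\; \frac{S_M^2/S_N^2}{F_{\alpha/2;\,m-1,\,n-1}} .
\]

Note that the larger quantile produces the lower bound and the smaller quantile the upper bound, because the sample ratio is divided by them.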

The code

set.seed(14)

# define m, n, sigma_M, sigma_N, parameter, numb_samples
m <- 80; n <- 73
sigma_M <- 1.7; sigma_N <- 1.1
parameter <- (sigma_M/sigma_N)^2
numb_samples <- 10000

# draw samples from M
samples_M <- replicate(numb_samples, rnorm(m, sd = sigma_M))

# draw samples from N
samples_N <- replicate(numb_samples, rnorm(n, sd = sigma_N))

# compute sample variances for M
s2_M <- apply(samples_M, 2, var)

# compute sample variances for N
s2_N <- apply(samples_N, 2, var)

# compute ratios
ratios <- s2_M/s2_N

# define the confidence level
conf_level <- 0.95

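# compute the two needed F quantiles
# (F_lowerbound is the upper-tail quantile: dividing by it yields the lower bound;
#  F_upperbound is the lower-tail quantile: dividing by it yields the upper bound)
F_lowerbound <- qf((conf_level+1)/2, m-1, n-1)
F_upperbound <- qf((1-conf_level)/2, m-1, n-1)

# compute intervals (one row per pair of samples: lower bound, upper bound)
intervals <- cbind(ratios/F_lowerbound, ratios/F_upperbound)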

# check the proportion of intervals containing the parameter
mean(apply(intervals, 1, findInterval, x = parameter) == 1)

## [1] 0.9495

The obtained proportion, 0.9495, is approximately equal to the 95% confidence level that was set.
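Two quick checks may help in reading this result. The sketch below first computes the Monte Carlo standard error of a coverage estimate based on 10000 repetitions, which gives a sense of how far from 0.95 the simulated proportion can reasonably fall. It then compares the manual F interval with the one returned by R's built-in `var.test` on a single pair of samples. The names `mc_se`, `mc_range`, `x`, `y`, `manual` and `builtin` are introduced here for illustration only.

```r
# (a) Monte Carlo standard error of the estimated coverage
conf_level <- 0.95
numb_samples <- 10000
mc_se <- sqrt(conf_level * (1 - conf_level) / numb_samples)
mc_se  # about 0.0022

# approximate range in which the simulated proportion falls 95% of the time
mc_range <- conf_level + c(-1, 1) * qnorm(0.975) * mc_se
mc_range

# (b) one pair of samples: manual F interval versus stats::var.test
set.seed(14)
m <- 80; n <- 73
x <- rnorm(m, sd = 1.7)
y <- rnorm(n, sd = 1.1)
ratio <- var(x) / var(y)
manual <- c(ratio / qf((1 + conf_level) / 2, m - 1, n - 1),
            ratio / qf((1 - conf_level) / 2, m - 1, n - 1))
builtin <- as.numeric(var.test(x, y, conf.level = conf_level)$conf.int)

# compare the two intervals up to floating-point tolerance
all.equal(manual, builtin)
```

The simulated proportion of 0.9495 lies well inside `mc_range`, so the small gap from 0.95 is consistent with simulation noise alone. The two intervals in part (b) should agree, since `var.test` relies on the same F quantiles.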

Davide Passaretti
