Sample Size for Variance

The problem: Find the minimum sample size needed to be 99% confident that the sample’s variance is within 1% of the population’s variance.

This question is exceedingly tricky, since 1) \(\chi^2\) is not symmetrical, but we can at least find one interval that meets the mission. But remember, the sample variance is the best estimate of the population variance, so we have some cancellation! The confidence interval follows

\[ \frac{(n-1)s^2}{\chi^2_{df. \alpha/2}}<\sigma^2<\frac{(n-1)s^2}{\chi^2_{df. 1-\alpha/2}} \] In our case, we can divide through by \(\sigma^2\) to have the following.

\[ \frac{(n-1)}{\chi^2_{df. \alpha/2}}<1<\frac{(n-1)}{\chi^2_{df. 1-\alpha/2}} \] So we need this interval to within 1% of the population’s variance. So the upper bound minus the lower bound must be within \(0.01\times \sigma^2=0.01 \times 1=0.01\).

\[ \frac{(n-1)}{\chi^2_{df. \alpha/2}}-\frac{(n-1)}{\chi^2_{df. 1-\alpha/2}}=.01 \] Since both the numerator and the \(\chi^2\) random variable are dependent on the degrees of freedom (df=n-1), we have two moving parts at once. We could just use a table or some quick optimization, right? Or we could use brute force estimation.

Brute Force

Here, we successively iterate our bounds until we find numbers appearing. (I changed both the sequence interval and the step to do so.) Solutions for within 0.01 (what was asked in the problem), 0.10, and 1.00 are provided.

# You can verify the bottom two with a table or 
#https://stats.stackexchange.com/questions/7004/calculating-required-sample-size-precision-of-variance-estimate

#Within 1% is 530805.  Correct Answer

for (i in seq(530800, 530806,by=1)){

  t1=(i-1)/qchisq(.005,df=i-1, lower.tail=F)
  t2=(i-1)/qchisq(.995, df=i-1, lower.tail =F)
  ifelse(t2-t1<.01, print(i), print(''))
  
}

## [1] ""
## [1] ""
## [1] ""
## [1] ""
## [1] ""
## [1] 530805
## [1] 530806

#Within 10% is 5321

for (i in seq(5315, 5325,by=1)){
  
  t1=(i-1)/qchisq(.005,df=i-1, lower.tail=F)
  t2=(i-1)/qchisq(.995, df=i-1, lower.tail =F)
  ifelse(t2-t1<.1, print(i), print(''))
  
}

## [1] ""
## [1] ""
## [1] ""
## [1] ""
## [1] ""
## [1] ""
## [1] 5321
## [1] 5322
## [1] 5323
## [1] 5324
## [1] 5325

#Within 100% is 65

for (i in seq(60,66,by=1)){
  
  t1=(i-1)/qchisq(.005,df=i-1, lower.tail=F)
  t2=(i-1)/qchisq(.995, df=i-1, lower.tail =F)
  ifelse(t2-t1<1, print(i), print(''))
  
}

## [1] ""
## [1] ""
## [1] ""
## [1] ""
## [1] ""
## [1] 65
## [1] 66

More Elegantly

We can find the root function as follows.

myf=function(i)(i-1)/qchisq(.995, df=i-1, lower.tail =F)-(i-1)/qchisq(.005,df=i-1, lower.tail=F)-.01
ceiling(uniroot(myf, c(2, 531000), extendInt='yes', maxiter=100000)$root)

## [1] 530805

myf=function(i)(i-1)/qchisq(.995, df=i-1, lower.tail =F)-(i-1)/qchisq(.005,df=i-1, lower.tail=F)-.1
ceiling(uniroot(myf, c(2,5000), extendInt='yes', maxiter=100000)$root)

## [1] 5321

myf=function(i)(i-1)/qchisq(.995, df=i-1, lower.tail =F)-(i-1)/qchisq(.005,df=i-1, lower.tail=F)-1
ceiling(uniroot(myf, c(2,100), extendInt='yes', maxiter=100000)$root)

## [1] 65

Population Variance Sample Size

Sith

October 19, 2022

Sample Size for Variance

Brute Force

More Elegantly