Introduction

Consider two sample variances that are calculated from random samples from normal populations.

Suppose we need a significance test to determine whether the underlying variances are in fact equal; that is, we want to test the hypothesis \(H_0: \sigma_1^2 = \sigma_2^2\) versus \(H_1: \sigma_1^2 \neq \sigma_2^2\). We base the significance test on the relative magnitudes of the sample variances (\(s_1^2\), \(s_2^2\)). It is preferable to base the test on the ratio of the sample variances (\(s_1^2 / s_2^2\)) rather than on the difference between them (\(s_1^2 - s_2^2\)).
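One reason often given for preferring the ratio is that, when the two population variances are equal, the distribution of \(s_1^2 / s_2^2\) does not depend on the common variance, whereas the distribution of \(s_1^2 - s_2^2\) does. The simulation below is a minimal sketch of that point, not part of the original analysis: the helper `sim_var_stats`, the sample sizes and the number of replicates are all illustrative assumptions. The same seed is reused for both runs, so the second set of draws is an exact rescaling of the first.

```r
# Hypothetical helper: simulate the variance ratio and variance difference
# for two normal samples that share a common standard deviation `sigma`.
sim_var_stats <- function(sigma, n1 = 10, n2 = 15, reps = 2000, seed = 1) {
  set.seed(seed)
  out <- replicate(reps, {
    v1 <- var(rnorm(n1, sd = sigma))
    v2 <- var(rnorm(n2, sd = sigma))
    c(ratio = v1 / v2, diff = v1 - v2)
  })
  # rows of `out` are "ratio" and "diff"; summarise each by three quantiles
  apply(out, 1, quantile, probs = c(0.025, 0.5, 0.975))
}

q_small <- sim_var_stats(sigma = 1)    # common standard deviation 1
q_large <- sim_var_stats(sigma = 10)   # common standard deviation 10

q_small
q_large
```

The `ratio` column should be the same in both tables, while the `diff` column should grow with the square of `sigma`, so a test based on the difference would need to know the common variance.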

The ratio of two such variances is called an F ratio, and under the null hypothesis that \(\sigma_1^2 = \sigma_2^2\) it follows a standard distribution called an F distribution. The shape of this distribution depends on the sample sizes of the two groups or, more generally, on the degrees of freedom of the two variance estimates; these two parameters are termed the numerator and denominator degrees of freedom, respectively. If the sizes of the first and second samples are \(n_1\) and \(n_2\), then the variance ratio follows an F distribution with \(n_1 - 1\) (numerator df) and \(n_2 - 1\) (denominator df) degrees of freedom, which is called an \(F_{n_1-1,\,n_2-1}\) distribution. If the two normal populations have different variances, the distribution of the variance ratio is an F distribution scaled by the ratio of the population variances. However, if the two groups really have the same population standard deviations, the distribution does not involve any unknown parameters.
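The scaling can be written out explicitly. The following is the standard result for independent random samples from normal populations, in the notation used here; the symbol \(F_p\) for the lower-tail \(p\) quantile of the \(F_{n_1-1,\,n_2-1}\) distribution is introduced only for this illustration.

\[
\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \sim F_{n_1-1,\,n_2-1}
\quad\Longrightarrow\quad
\frac{s_1^2}{s_2^2} \sim \frac{\sigma_1^2}{\sigma_2^2}\,F_{n_1-1,\,n_2-1}
\]

Under \(H_0\) the scale factor \(\sigma_1^2/\sigma_2^2\) equals 1 and no unknown parameter remains. Inverting the same statement gives a \(100(1-\alpha)\%\) confidence interval for the ratio of population variances:

\[
\frac{s_1^2/s_2^2}{F_{1-\alpha/2}} \;\le\; \frac{\sigma_1^2}{\sigma_2^2} \;\le\; \frac{s_1^2/s_2^2}{F_{\alpha/2}}
\]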

The function below calculates the F test p-value and a confidence interval for the variance ratio; it requires the variance and degrees of freedom of each sample.

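      # Two-sided F test of equal variances from summary statistics.
      # v1, v2: sample variances; df1, df2: their degrees of freedom (n - 1).
      # Returns c(variance ratio, lower 95% limit, upper 95% limit, p-value).
      var.rat <- function (v1, df1, v2, df2) {
        V.x <- v1
        DF.x <- df1
        V.y <- v2
        DF.y <- df2
        ratio <- 1                          # hypothesised ratio under the null
        conf.level <- 0.95
        ESTIMATE <- V.x/V.y                 # observed variance ratio
        STATISTIC <- ESTIMATE/ratio         # F statistic
        PARAMETER <- c( DF.x,  DF.y)
        PVAL <- pf(STATISTIC, DF.x, DF.y)   # lower-tail probability
        PVAL <- 2 * min(PVAL, 1 - PVAL)     # two-sided p-value
        BETA <- (1 - conf.level)/2
        CINT <- c(ESTIMATE/qf(1 - BETA, DF.x, DF.y),
                  ESTIMATE/qf(BETA, DF.x, DF.y))
        c(ESTIMATE, CINT, PVAL)
      }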

Let’s perform the F test manually. First create some data; note the very small sample sizes.

      s1 <- 10:12 ; s2 <- 13:16
      n1 <- length(s1) ; n2 <- length(s2)

Manual F test, with two equivalent ways to calculate the confidence limits for the ratio. The symmetry properties of the F distribution make it possible to derive the lower percentage points of any F distribution from the corresponding upper percentage points of an F distribution with the degrees of freedom reversed.

      (vr <- var(s1)/var(s2))     # ratio of variances

[1] 0.6

      vr*qf(0.025, n2-1, n1-1)    # lower

[1] 0.03739691

      vr*qf(0.975, n2-1, n1-1)    # upper

[1] 23.4993

      vr/qf(0.975, n1-1, n2-1)    # lower

[1] 0.03739691

      vr/qf(0.025, n1-1, n2-1)    # upper

[1] 23.4993
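The two pairs of limits agree because of the reciprocal relation between percentage points: the lower \(p\) point of an F distribution with (a, b) degrees of freedom is the reciprocal of the upper \(p\) point of an F distribution with (b, a) degrees of freedom. A short check, assuming the degrees of freedom of the example data (2 and 3); the variable names are illustrative only.

```r
# Degrees of freedom from the example data (n1 = 3, n2 = 4)
df1 <- 2
df2 <- 3
p <- c(0.025, 0.05, 0.5, 0.95, 0.975)

lower_direct   <- qf(p, df1, df2)           # quantiles of F(df1, df2)
lower_via_swap <- 1 / qf(1 - p, df2, df1)   # reciprocals of F(df2, df1) quantiles

all.equal(lower_direct, lower_via_swap)
```

This is why multiplying the variance ratio by a quantile with the degrees of freedom reversed gives the same limit as dividing by the complementary quantile with the degrees of freedom in their original order.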

F test using functions

      # base R function, requires the actual samples s1 and s2 in our example
      var.test(s1, s2)


    F test to compare two variances

data:  s1 and s2
F = 0.6, num df = 2, denom df = 3, p-value = 0.7926
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
  0.03739691 23.49929674
sample estimates:
ratio of variances 
               0.6

      # function defined earlier
      var.rat(var(s1), n1-1, var(s2), n2-1)

[1]  0.60000000  0.03739691 23.49929674  0.79263678
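The p-value reported by both functions can also be reproduced by hand, which completes the manual calculation. This is a sketch that reuses the example data; the names `lower_tail`, `p_manual` and `p_builtin` are introduced here for illustration.

```r
s1 <- 10:12
s2 <- 13:16
df1 <- length(s1) - 1   # numerator degrees of freedom
df2 <- length(s2) - 1   # denominator degrees of freedom

vr <- var(s1) / var(s2)                            # observed variance ratio
lower_tail <- pf(vr, df1, df2)                     # P(F <= vr) under the null
p_manual <- 2 * min(lower_tail, 1 - lower_tail)    # two-sided p-value
p_manual

# compare with the p-value component returned by the base R test
p_builtin <- unname(var.test(s1, s2)$p.value)
all.equal(p_manual, p_builtin)
```

Doubling the smaller tail probability is the same two-sided rule used inside `var.rat`, so the result should match the 0.7926 shown in the output above.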

Computing Environment

R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bootstrap_2015.2 boot_1.3-17      knitr_1.12.3    

loaded via a namespace (and not attached):
 [1] magrittr_1.5    formatR_1.3     tools_3.2.2     htmltools_0.3.5 yaml_2.1.13     Rcpp_0.12.4     stringi_1.0-1  
 [8] rmarkdown_0.9.6 stringr_1.0.0   digest_0.6.9    evaluate_0.9

This took 0.89 seconds to execute.

F Distribution: An investigation on the ratio of variances

Eamonn O’Brien