The F distribution is the distribution of the ratio of two independent chi-square random variables \(W_X\) and \(W_Y\), each divided by its degrees of freedom \(df_X\) and \(df_Y\).

\(F = \frac{{W_X / df_X}}{{W_Y / df_Y}}\)
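
The definition can be checked by simulation. The following is a minimal sketch, not a definitive demonstration: the degrees of freedom (10 and 20), the number of draws, and the seed are illustrative choices that do not come from the text. It builds the ratio from chi-square draws and compares its quantiles with those returned by `qf`.

```r
set.seed(42)      # arbitrary seed, only for reproducibility

df_x  <- 10       # illustrative numerator degrees of freedom
df_y  <- 20       # illustrative denominator degrees of freedom
n_sim <- 100000   # number of simulated ratios (assumed, not from the text)

# Independent chi-square draws, each divided by its degrees of freedom
w_x   <- rchisq(n_sim, df = df_x)
w_y   <- rchisq(n_sim, df = df_y)
f_sim <- (w_x / df_x) / (w_y / df_y)

# Compare simulated quantiles with the theoretical F quantiles
probs <- c(0.25, 0.50, 0.75, 0.95)
round(rbind(simulated   = quantile(f_sim, probs),
            theoretical = qf(probs, df1 = df_x, df2 = df_y)), 3)
```

With this many draws the two rows should agree closely, though not exactly, because the first row is subject to simulation error.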

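For a sample of size \(n\) from a normal population, \((n - 1)s^2 / \sigma^2\) follows a chi-square distribution with \(n - 1\) degrees of freedom, so the F statistic may also be written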

\(F = \frac{{s_X^2 / \sigma_X^2}}{{s_Y^2 / \sigma_Y^2}} = \frac{{s_X^2 / s_Y^2}}{{\sigma_X^2 / \sigma_Y^2}}\).

Like the chi-square distribution, the F distribution takes only positive values and is nonsymmetrical. There is a different F distribution for each pair of degrees of freedom associated with \(s_X^2\) and \(s_Y^2\).

R Functions df, pf, qf, and rf

R function df(x, df1, df2) is the probability density of F at x when the degrees of freedom are df1 and df2. Because F is continuous, this is the height of the density curve, not the probability that F equals x. R function pf(q, df1, df2, lower.tail) is the cumulative probability of a value less than or equal to q (lower.tail = TRUE, the left tail) or greater than q (lower.tail = FALSE, the right tail). R function qf(p, df1, df2, lower.tail) is the inverse of pf: with lower.tail = TRUE, it returns the value x for which the cumulative probability is p. R function rf(n, df1, df2) returns n random numbers from the F distribution.
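
A short sketch of the four functions in use. The degrees of freedom (10 and 20), the evaluation points, and the seed are illustrative values chosen here, not taken from the text.

```r
# Density (height of the curve, not a probability) at F = 1
df(1, df1 = 10, df2 = 20)

# Left-tail and right-tail probabilities at F = 2
p_lower <- pf(2, df1 = 10, df2 = 20, lower.tail = TRUE)   # P(F <= 2)
p_upper <- pf(2, df1 = 10, df2 = 20, lower.tail = FALSE)  # P(F > 2)
p_lower + p_upper   # the two tails sum to 1

# 95th percentile; qf is the inverse of pf
q95 <- qf(0.95, df1 = 10, df2 = 20)
pf(q95, df1 = 10, df2 = 20)   # recovers 0.95, up to floating-point error

# Five random draws (seed is arbitrary, only for reproducibility)
set.seed(1)
rf(5, df1 = 10, df2 = 20)
```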

The F distribution has numerous applications. The F test is used to test whether two populations have equal variances, \(H_0: \sigma_X^2 = \sigma_Y^2\).
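
Under the null hypothesis the ratio of population variances is 1, so the F statistic reduces to the ratio of sample variances. The sketch below computes it by hand and compares it with `var.test` from the stats package. The data are simulated, and the sample sizes (11 and 21), the seed, and the two-sided alternative are assumptions made for illustration only.

```r
set.seed(123)                      # arbitrary seed
x <- rnorm(11, mean = 0, sd = 1)   # hypothetical sample, df = 10
y <- rnorm(21, mean = 0, sd = 1)   # hypothetical sample, df = 20

# Under H0 the variance ratio is 1, so F is the ratio of sample variances
f_stat <- var(x) / var(y)
df_x   <- length(x) - 1
df_y   <- length(y) - 1

# Two-sided p-value: twice the smaller tail probability
p_lower  <- pf(f_stat, df1 = df_x, df2 = df_y)
p_manual <- 2 * min(p_lower, 1 - p_lower)

# The same test using var.test
result <- var.test(x, y)
c(f_manual = f_stat,   f_var_test = unname(result$statistic))
c(p_manual = p_manual, p_var_test = result$p.value)
```

The manual and `var.test` values should match. A small p-value would be evidence against equal variances, and the test assumes both samples come from normal populations.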

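# Plot the F density for two pairs of degrees of freedom
library(dplyr)
library(ggplot2)
library(tidyr)

# Evaluate both densities on a grid of F values from 0 to 10
data.frame(f = 0:1000 / 100) %>%
  mutate(df_10_20 = df(x = f, df1 = 10, df2 = 20),
         df_05_10 = df(x = f, df1 = 5, df2 = 10)) %>%
  # Reshape to long format: one row per grid point and curve
  pivot_longer(cols = -f, names_to = "df", values_to = "density") %>%
  ggplot() +
  geom_line(aes(x = f, y = density, color = df)) +
  labs(title = "F at Various Degrees of Freedom",
       x = "F",
       y = "Density")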