The F distribution is the distribution of the ratio of two independent chi-square random variables \(W_X\) and \(W_Y\), each divided by its degrees of freedom \(df_X\) and \(df_Y\).
\(F = \frac{{W_X / df_X}}{{W_Y / df_Y}}\)
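The ratio definition can be checked by simulation. The sketch below is illustrative only: the seed, the number of draws, and the degrees of freedom (10 and 20) are assumptions, not values from the text. It builds the ratio from chi-square draws and compares its quantiles with those R reports for the F distribution through qf().

```r
set.seed(42)       # arbitrary seed, for reproducibility only
n_sim <- 100000    # number of simulated draws (assumed)
df_x <- 10         # numerator degrees of freedom (assumed)
df_y <- 20         # denominator degrees of freedom (assumed)

# Independent chi-square draws, each divided by its degrees of freedom
w_x <- rchisq(n_sim, df = df_x)
w_y <- rchisq(n_sim, df = df_y)
f_sim <- (w_x / df_x) / (w_y / df_y)

# Compare simulated quantiles with the theoretical F quantiles
probs <- c(0.25, 0.50, 0.75, 0.95)
sim_q <- quantile(f_sim, probs = probs)
theo_q <- qf(probs, df1 = df_x, df2 = df_y)
round(rbind(simulated = sim_q, theoretical = theo_q), 3)
```

With this many draws the two rows should agree closely, and the agreement tightens as n_sim grows.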
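For a sample of size \(n_X\) from a normal population, \((n_X - 1) s_X^2 / \sigma_X^2\) follows a chi-square distribution with \(df_X = n_X - 1\) degrees of freedom, and likewise for \(Y\), so the F statistic may also be written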
\(F = \frac{{s_X^2 / \sigma_X^2}}{{s_Y^2 / \sigma_Y^2}} = \frac{{s_X^2 / s_Y^2}}{{\sigma_X^2 / \sigma_Y^2}}\).
Like the chi-square distribution, the F distribution takes only positive values and is asymmetric. There is a different F distribution for each pair of degrees of freedom associated with \(s_X^2\) and \(s_Y^2\).
df, pf, qf, and rf
R function df(x, df1, df2) is the probability density of F at x when the degrees of freedom are df1 and df2. Because F is continuous, this is the height of the density curve, not the probability of F equalling x. R function pf(q, df1, df2, lower.tail) is the cumulative probability of a value less than or equal to q (lower.tail = TRUE, the left tail) or greater than q (lower.tail = FALSE, the right tail). R function qf(p, df1, df2, lower.tail) is the value of x at the pth quantile (lower.tail = TRUE). R function rf(n, df1, df2) returns n random numbers from the F distribution.
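A short sketch of the four functions used together may help. The degrees of freedom (5 and 10), the evaluation points, and the variable names are assumptions chosen for illustration.

```r
num_df <- 5     # numerator degrees of freedom (assumed)
den_df <- 10    # denominator degrees of freedom (assumed)

# Density at x = 1: the height of the curve, not a probability
dens <- df(x = 1, df1 = num_df, df2 = den_df)

# Left- and right-tail probabilities at q = 2; together they sum to 1
p_left  <- pf(q = 2, df1 = num_df, df2 = den_df, lower.tail = TRUE)
p_right <- pf(q = 2, df1 = num_df, df2 = den_df, lower.tail = FALSE)

# 95th percentile; feeding it back into pf() recovers 0.95
q95 <- qf(p = 0.95, df1 = num_df, df2 = den_df, lower.tail = TRUE)
p_back <- pf(q95, df1 = num_df, df2 = den_df)

# Five random draws from the same F distribution
set.seed(1)     # arbitrary seed, for reproducibility only
draws <- rf(n = 5, df1 = num_df, df2 = den_df)
```

Note that qf() and pf() are inverses of each other for a fixed pair of degrees of freedom, which is what the p_back check illustrates.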
The F distribution has numerous applications. The F test is used to test whether two populations have equal variances, \(H_0: \sigma_A^2 = \sigma_B^2\).
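As a minimal sketch of this test, the code below computes the F statistic and a two-sided p-value by hand and cross-checks them against var.test() from the stats package. The two samples are simulated, and the sample sizes, standard deviations, and seed are assumptions for illustration; the test itself assumes both populations are normal.

```r
set.seed(123)                         # arbitrary seed
a <- rnorm(25, mean = 0, sd = 1.0)    # simulated sample A (assumed data)
b <- rnorm(30, mean = 0, sd = 1.5)    # simulated sample B (assumed data)

# Under H0 the ratio sigma_A^2 / sigma_B^2 is 1, so F = s_A^2 / s_B^2
f_stat <- var(a) / var(b)
df_a <- length(a) - 1
df_b <- length(b) - 1

# Two-sided p-value: twice the smaller of the two tail probabilities
p_lower <- pf(f_stat, df1 = df_a, df2 = df_b)
p_value <- 2 * min(p_lower, 1 - p_lower)

# Cross-check against the built-in F test for equal variances
vt <- var.test(a, b)
c(manual_F = f_stat, builtin_F = unname(vt$statistic))
c(manual_p = p_value, builtin_p = vt$p.value)
```

The manual and built-in values should match, since var.test() uses the same statistic and degrees of freedom when the hypothesized ratio is 1.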
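library(dplyr)
library(ggplot2)
library(tidyr)

# Evaluate the F density on a grid from 0 to 10 for two pairs of degrees of freedom
data.frame(f = 0:1000 / 100) %>%
  mutate(df_10_20 = df(x = f, df1 = 10, df2 = 20),
         df_05_10 = df(x = f, df1 = 5, df2 = 10)) %>%
  # Reshape to long format: one row per grid point and curve
  gather(key = "df", value = "density", -f) %>%
  # Draw one density curve per pair of degrees of freedom
  ggplot() +
  geom_line(aes(x = f, y = density, color = df)) +
  labs(title = "F at Various Degrees of Freedom",
       x = "F",
       y = "Density")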