Quarterback (QB) is the most important position in the NFL. Are quarterbacks equally likely to be picked in each of the last four rounds (the fourth, fifth, sixth, and seventh) of the NFL draft?
Our Null Hypothesis is:
\[H_0: \pi_4 = \pi_5 = \pi_6 = \pi_7\]
Unlike our student survey example, we aren’t testing all of the proportions this time: we are only interested in 4 of the 7.
We’ll start by counting the number of quarterbacks picked in each round of the NFL Draft:
qb_round <-
  draft |>
  # Only looking at the quarterback position
  filter(Position == "QB") |>
  # Counting the number of quarterbacks picked per round
  count(Round) |>
  # and renaming the column from n to picks
  rename(picks = n) |>
  # and finally calculating the sample proportions per round
  mutate(prop = picks / sum(picks))
qb_round
## Round picks prop
## 1 1 66 0.23571429
## 2 2 23 0.08214286
## 3 3 31 0.11071429
## 4 4 31 0.11071429
## 5 5 36 0.12857143
## 6 6 48 0.17142857
## 7 7 45 0.16071429
# Let's record the total number of quarterbacks drafted
N <- sum(qb_round$picks)
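If you don't have the `draft` data loaded, the summary table can be rebuilt by hand. The following is a minimal base R sketch that assumes the counts are exactly those printed above; it is a stand-in for the pipeline, not a replacement for it.

```r
# Stand-in for the dplyr pipeline: QB picks per round,
# copied by hand from the printed qb_round table (an assumption of this sketch)
qb_round <- data.frame(
  Round = 1:7,
  picks = c(66, 23, 31, 31, 36, 48, 45)
)

# Sample proportion of QBs picked in each round
qb_round$prop <- qb_round$picks / sum(qb_round$picks)

# Total number of quarterbacks drafted
N <- sum(qb_round$picks)
N
```

The proportions sum to 1 across all seven rounds, which is worth keeping in mind once we restrict attention to rounds 4 through 7.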
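Our test statistics will look similar to those of the goodness-of-fit tests from the previous section (note the factor of 2 in \(G\), which is what lets us compare it to a \(\chi^2\) distribution):
\[\begin{aligned} \chi^2 &= N \sum_{i = 1}^c \frac{(p_i - \pi_{i,0})^2}{\pi_{i,0}} \\ G &= 2 \sum_{i = 1}^c n_i \ln \frac{p_i}{\pi_{i,0}} \end{aligned}\]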
However, we have a little bit of an issue: what are \(\pi_{i,0}\) this time?
The null hypothesis doesn’t explicitly state what the population proportions are expected to be, just that \(\pi_4\) through \(\pi_7\) are all the same. So what do we do?
Good news: Any expected proportions not specified in the null hypothesis are equal to their sample proportions, \(\pi_{i,0} = p_i\), so those rounds contribute nothing to either test statistic!
Bad news: We need to estimate the expected proportions for the groups included in the null hypothesis.
If we are assuming that \(\pi_4\) through \(\pi_7\) are all equal, the expected proportions should be equal.
That means our expected proportions for \(\pi_4\) through \(\pi_7\) will just be the average of the four sample proportions!
qb_round2 <-
  qb_round |>
  # using filter() and between() to pick rounds 4 - 7
  filter(between(Round, 4, 7)) |>
  # calculating the average proportion for the 4 remaining rounds:
  mutate(expected_prop = mean(prop))
qb_round2
## Round picks prop expected_prop
## 1 4 31 0.1107143 0.1428571
## 2 5 36 0.1285714 0.1428571
## 3 6 48 0.1714286 0.1428571
## 4 7 45 0.1607143 0.1428571
If the null hypothesis is true, we should expect the sample proportions for rounds 4 through 7 to all be close to 14.3%.
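As a worked check of where the 14.3% comes from, the average of the four sample proportions can be written out using the counts from the table. The symbol \(\hat{\pi}_0\) is notation introduced here for the common expected proportion; \(N = 280\) is the total number of QBs drafted.

\[\hat{\pi}_0 = \frac{p_4 + p_5 + p_6 + p_7}{4} = \frac{31 + 36 + 48 + 45}{4 \times 280} = \frac{160}{1120} = \frac{1}{7} \approx 0.1429\]

In terms of counts, each of rounds 4 through 7 would be expected to produce \(N \hat{\pi}_0 = 40\) quarterbacks if the null hypothesis were true.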
qb_round2 <-
  qb_round2 |>
  mutate(
    # individual chi2 contributions
    zi2 = N * (prop - expected_prop)^2 / expected_prop,
    # individual LRT G contributions (G = 2 * sum of n_i * log(p_i / pi_i0))
    gi = 2 * picks * log(prop / expected_prop)
  )
qb_round2
## Round picks prop expected_prop zi2 gi
## 1 4 31 0.1107143 0.1428571 2.025 -15.803319
## 2 5 36 0.1285714 0.1428571 0.400 -7.585957
## 3 6 48 0.1714286 0.1428571 1.600 17.502869
## 4 7 45 0.1607143 0.1428571 0.625 10.600473
# Calculating the test stats
nfl_test_stats <-
  c(
    chisq = sum(qb_round2$zi2), # Calculating Pearson's chi-squared test stat
    nfl_G = sum(qb_round2$gi) # Calculating the LRT G-test stat
  )
nfl_test_stats
## chisq nfl_G
## 4.650000 4.714066
Like our previous \(\chi^2\) and \(G\)-tests, we’ll use a \(\chi^2\) distribution to find the p-value. But what are the degrees of freedom this time?
Again, it is the number of sample proportions \(p_i\) we needed to estimate (\(r_1\)) minus the number of expected proportions \(\pi_{i,0}\) we needed to estimate under the null hypothesis (\(r_0\)):
\(df = r_1 - r_0\)
Since we are only looking at some of the proportions (instead of all of them), the four won’t add up to 1 this time, so none of them is determined by the other three. We needed to estimate all 4 proportions: \(r_1 = 4\)
While we filled in four expected proportions, how many unique \(\pi_{i,0}\) did we estimate? Since we’re assuming the four proportions are all equal, we only needed to estimate one: \(r_0 = 1\)
Which means our degrees of freedom are \(4 - 1 = 3\)
pchisq(q = nfl_test_stats, df = 3, lower.tail = FALSE)
## chisq nfl_G
## 0.1992946 0.1939726
The two tests give very similar p-values, and both reach the same conclusion: there is no evidence that the probability a QB is picked differs among rounds 4, 5, 6, and 7.
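As a sanity check on the Pearson statistic and its degrees of freedom, the calculation can be repeated in base R in two ways. This is a sketch under stated assumptions: the counts are copied by hand from the printed table, and the names `picks`, `expected_all`, `zi2_all`, and `check` are introduced here for illustration only. The first part keeps all seven rounds to show that rounds 1 through 3 contribute exactly zero; the second part runs `chisq.test()` on rounds 4 through 7 alone, which tests equal proportions among those four rounds.

```r
# QB picks per round, copied from the table printed earlier
picks <- c(66, 23, 31, 31, 36, 48, 45)
N <- sum(picks)
prop <- picks / N

# Expected proportions under H0: rounds 1-3 keep their sample proportions,
# rounds 4-7 share the average of their four sample proportions
expected_all <- prop
expected_all[4:7] <- mean(prop[4:7])

# Pearson contributions for all seven rounds; the first three are exactly 0
zi2_all <- N * (prop - expected_all)^2 / expected_all
zi2_all
sum(zi2_all)

# Same statistic from a goodness-of-fit test on rounds 4-7 only
# (chisq.test() uses equal probabilities when `p` is not supplied)
check <- chisq.test(picks[4:7])
check$statistic  # Pearson chi-squared statistic
check$parameter  # degrees of freedom
check$p.value
```

Both routes should reproduce the Pearson statistic, the 3 degrees of freedom, and the corresponding p-value reported above.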
Let’s work through a test that QBs are equally likely to be picked in rounds that occur on the same day:
\[H_0: \begin{cases} \pi_2 = \pi_3 \\ \pi_4 = \pi_5 = \pi_6 = \pi_7 \end{cases}\]
qb_round |>
round(digits = 3)
## Round picks prop
## 1 1 66 0.236
## 2 2 23 0.082
## 3 3 31 0.111
## 4 4 31 0.111
## 5 5 36 0.129
## 6 6 48 0.171
## 7 7 45 0.161