Testing to see if how a drive starts (drive_start2) and if it ends in a touchdown (drive_end2) are associated:

chisq_test(x = drives$drive_start2, y = drives$drive_end2)
## # A tibble: 1 × 6
##       n statistic        p    df method          p.signif
## * <int>     <dbl>    <dbl> <int> <chr>           <chr>   
## 1  4152      18.4 0.000018     1 Chi-square test ****

Let’s include another variable: yds_to_td -> how far away the drive starts from the endzone (closer to the endzone the easier it should be to score a TD)

Since we already tested if \(X\) (drive_start2) and \(Y\) (drive_end2) are associated, let’s check to see if \(X\) and \(Y\) are both independent of \(Z\) (yds_to_td)

Testing for Conditional Independence

Let’s test to see if how a drive begins (drive_start2) and if the drive results in a touchdown (drive_end2) are independent, if we know where on the field the drive began (yds_to_td)

Now we are testing to see if \(X\) and \(Y\) are independent, conditioning on \(Z\)

The expected proportions for the outcomes of \(X\), \(Y\), and \(Z\) assuming conditional independence on \(Z\) is:

\[\hat{\pi}_{ijk,0} = P(X = i|Z = k) P(Y = j|Z = k) P(Z = k) = \frac{n_{i\bullet k}}{n_{\bullet\bullet k}} \frac{n_{\bullet jk}}{n_{\bullet\bullet k}} \frac{n_{\bullet\bullet k}}{N}\]

The number of parameters for the conditional independence model is:

\[(I + J - 1)K - 1\]

and the degrees of freedom of the test will be:

\[df = r_1 - r_0 = \\IJK - 1 -(I + J -1)K - 1 = \\2(2)(4) - 1 -(2 + 2 -1)4 - 1 = \\15 - 11 =\\ 4\]

Calculating the test statistic using chisq_test()

We can shortcut calculating our test statistic by performing a \(\chi^2\) test between \(X\) and \(Y\) for each group of \(Z\), then adding their individual test statistics together!

We can see the four different \(2\times 2\) tables below:

xtabs(formula = ~ drive_start2 + drive_end2 + yds_to_td, data = drives)
## , , yds_to_td = < 25
## 
##             drive_end2
## drive_start2 No TD   TD
##     Kick         8   11
##     Turnover    32   39
## 
## , , yds_to_td = 25 - 50
## 
##             drive_end2
## drive_start2 No TD   TD
##     Kick        75   44
##     Turnover    86   56
## 
## , , yds_to_td = 50 - 75
## 
##             drive_end2
## drive_start2 No TD   TD
##     Kick       658  258
##     Turnover    81   29
## 
## , , yds_to_td = >= 75
## 
##             drive_end2
## drive_start2 No TD   TD
##     Kick      2096  584
##     Turnover    78   17
xtabs(formula = ~ yds_to_td + drive_start2 + drive_end2, data = drives) |> 
  ftable()
##                        drive_end2 No TD   TD
## yds_to_td drive_start2                      
## < 25      Kick                        8   11
##           Turnover                   32   39
## 25 - 50   Kick                       75   44
##           Turnover                   86   56
## 50 - 75   Kick                      658  258
##           Turnover                   81   29
## >= 75     Kick                     2096  584
##           Turnover                   78   17

Using summarize() and chisq_test(), we can perform a separate \(\chi^2\) for \(X\) and \(Y\) at each level of \(Z\):

partial_chisq_tests <- 
  drives |>
  # Calculating the test statistic, df, and p-value 
  # for each individual partial table of yds_to_td
  summarize(
    .by = yds_to_td,
    test_stat = chisq_test(drive_start2, drive_end2)$statistic,
    df = chisq_test(drive_start2, drive_end2)$df,
    p_val = chisq_test(drive_start2, drive_end2)$p
  )

partial_chisq_tests 
##   yds_to_td    test_stat df p_val
## 1      < 25 6.726077e-31  1 1.000
## 2     >= 75 6.073278e-01  1 0.436
## 3   25 - 50 7.819726e-02  1 0.780
## 4   50 - 75 8.151203e-02  1 0.775

Examining each partial table individually, no evidence that how the drive ends depends on how the drive begins at any level of yards to touchdown.

But we don’t want to conduct 4 separate tests, but one big “omnibus” test. How?

Just combine the test statistics of each partial table!

partial_chisq_tests |>
  # Adding the partial test stats and df together
  summarize(
    test_stat = sum(test_stat),
    df = sum(df)
  ) |> 
  # calculating the p-value
  mutate(
    p_val = pchisq(test_stat, df = df, lower.tail = F)
  )
##   test_stat df     p_val
## 1 0.7670371  4 0.9428118

Out test statistic is 0.767, which is small, and our p-value is close to 1, indicating that we don’t have evidence that drive start and drive end are associated, once we account for the starting location of the drive.

What that tells us is, while drive_start2 and drive_end2 are associated, they are related because of their association with yds_to_td.

Because a drive that begins from a turnover is more likely to be closer to the endzone, it is also more likely to end with a touchdown.

But if a drive starts at the same location, the probability it ends with a touchdown is the same if it began from a kickoff/punt or a turnover!

Likelihood ratio G-test

If we wanted to conduct a likelihood ratio G-test instead of Pearson’s \(\chi^2\) test, we can repeat the code chunk named “partial_chi_squareds”, but replace chisq_test() with GTest() in DescTools and $p with $p.value and $df with $parameter

library(DescTools)
partial_G_tests <- 
  drives |>
  # Calculating the test statistic, df, and p-value 
  # for each individual partial table
  summarize(
    .by = yds_to_td,
    test_stat = GTest(drive_start2, drive_end2)$statistic,
    df = GTest(drive_start2, drive_end2)$parameter,
    p_val = GTest(drive_start2, drive_end2)$p.value
  )

partial_G_tests 
##   yds_to_td  test_stat df     p_val
## 1      < 25 0.05352689  1 0.8170361
## 2     >= 75 0.85764542  1 0.3543984
## 3   25 - 50 0.16616331  1 0.6835443
## 4   50 - 75 0.16002587  1 0.6891327
# test stat and p-value
partial_G_tests |> 
  summarize(
    test_stat = sum(test_stat),
    df = sum(df)
  ) |> 
  mutate(
    p_val = pchisq(test_stat, df, lower.tail = F)
  )
##   test_stat df     p_val
## 1  1.237361  4 0.8719098

We end up with similar results!

Checking sample size:

Unfortunately, there isn’t a simple way to determine if any expected counts are below 5 using the methods above, which means we have to do it in a complicated way:

drives |> 
  summarize(
    .by = yds_to_td,
    below_5 = sum(chisq.test(drive_start2, drive_end2)$expected <5)
  )
##   yds_to_td below_5
## 1      < 25       0
## 2     >= 75       0
## 3   25 - 50       0
## 4   50 - 75       0

All of our expected counts for all 16 groups are at least 5, so we meet our sample size requirement!