Motivating Question: Is a team more likely to score a touchdown after a turnover compared to a punt or kickoff?

Relative risk is often used for two binary variables, so let’s convert our data into two binary variables:

drives2 <- 
  drives |> 
  mutate(
    drive_start = if_else(drive_start %in% c("Interception", "Fumble"),
                          true = "turnover",
                          false = "kick") |> 
                  factor(),
         
    drive_end = if_else(drive_end == "Touchdown",
                        true = "TD",
                        false = "noTD") |> 
                factor()
  )

Next, let’s display the results in a two-way table using xtabs()

# Displaying a two-way table using xtabs()
drive_freq <- 
  xtabs(formula = ~ drive_start + drive_end, data = drives2)

# The addmargins() function will add the margins to the table
drive_freq |> 
  addmargins() 
##            drive_end
## drive_start noTD   TD  Sum
##    kick     4055 1200 5255
##    turnover  571  239  810
##    Sum      4626 1439 6065

And converting the table from counts to conditional proportions using prop.table(margin = "drive_start")

drive_freq |> 
  prop.table(margin = "drive_start") |> 
  
  round(digits = 3)
##            drive_end
## drive_start  noTD    TD
##    kick     0.772 0.228
##    turnover 0.705 0.295

Let’s create a bar chart to compare the probability of a touchdown between the two groups:

drive_freq |> 
  # Calculate the conditional proportions
  prop.table(margin = "drive_start") |> 
  # Convert the table to a data frame to be used by ggplot()
  data.frame() |> 
  # Rename Freq to prop
  rename(prop = Freq) |> 
  
  ggplot(
    mapping = aes(
      x = drive_start,
      fill = drive_end,
      y = prop
    )
  ) + 
  
  geom_col(width = 0.5) + 
  
  labs(
    y = "Proportion",
    x = "How Drive Began",
    fill = "Drive \nResult"
  ) + 
  
  theme_test() + 
  
  scale_y_continuous(expand = c(0, 0, 0, 0.05)) + 
  
  scale_fill_manual(
    labels = c("No Touchdown", "Touchdown"),
    values = c("darkorange", "steelblue")
  )

There are some differences, but it isn’t a very noticeable difference.

Odds: pi/(1-pi)

The odds are another way of measuring how frequently events occur, but instead they compare the probability of the event happening vs the event not happening using a ratio:

\[\textrm{odds} = \frac{\pi}{1-\pi}\]

If we ignore how the drive began, the odds a drive ends in a touchdown is:

\[\textrm{odds of a touchdown} = \frac{p}{1-p} = \frac{1439}{4626} = 0.31\]

which means that for every 1 drive that ends in a touchdown, about 3 drives do not

One benefit of calculating the odds is if we want to estimate it for a binary variable, we place the sample count of success on the top and the sample count of failures on the bottom of the fraction.

Like we did with the probabilities (or “risks”), we can use the odds to test if two binary variables are associated using the odds ratio

Odds Ratio

The odds ratio is similar to the risk ratio, in that it is a ratio of the odds of success for two groups of a binary variable (group 1 vs group 2):

\[\theta_{XY} = \frac{\textrm{odds}_1}{\textrm{odds}_2}\]

The odds a drive ends on a touchdown if the drive starts from a turnover is:

\[\textrm{odds}_1 = \frac{239}{571} = 0.419\]

The odds a drive ends on a touchdown if the drive starts from a kickoff or punt is:

\[\textrm{odds}_2 = \frac{1200}{4055} = 0.296\]

Then our estimated odds ratio is:

\[\hat{\theta}_{Y|X} = \frac{\textrm{odds}_1}{\textrm{odds}_2} == \frac{0.419}{0.296} = 1.414\]

We can interpret the results as

“The odds a drive ends with a touchdown is 1.414 times higher if the drive started from a turnover than from a kickoff or punt”

While both odds are below 1, the odds a drive ends in a touchdown if it starts from a turnover is about 40% higher than if it started after a kick.

Odds ratio: Direct Estimation

We can estimate the odds ratio a little more directly using our sample counts. If the counts are in a 2x2 table:

Y = 1 Y = 2
X = 1 n11 n12
X = 2 n21 n22

Then our estimated odds ratio is

\[\hat{\theta}_{Y|X} = \frac{p_{1|1} (1-p_{1|2})}{p_{1|2} (1-p_{1|1})} = \frac{n_{11} \times n_{22}}{n_{12} \times n_{21}}\]

The odds ratio for our example is:

\[\hat{\theta}_{Y|X} = \frac{239 \times 4055}{571 * 1200} = 1.414\]

Inference with the Odds Ratio:

Similar to the risk ratio (or relative risk), we work with the log odds ratio \(\ln \hat{\theta}_{Y|X}\) instead of the odds ratio directly.

If all of the observed counts are at least 10:

\[\ln \hat{\theta}_{Y|X} \sim N(\ln \theta_{Y|X}, SE)\]

and the standard error is:

\[SE(\ln \hat{\theta}_{Y|X}) = \sqrt{\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}}\]

Hypothesis Test

Our null hypothesis is:

\[H_0: \theta_{Y|X} = 1 \rightarrow H_0: \ln \theta_{Y|X} = 0\]

and our test statistic is:

\[z = \frac{\ln \hat{\theta}_{Y|X}}{\sqrt{\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}}}\]

We can conduct a hypothesis test for the odds ratio using oddsratio() in the epitools package.

It wants a 2x2 table with:

  • column 1 = failure and column 2 = success
    • fail = noTD
    • success = TD)
  • row 1 = reference group and row 2 = comparison group
    • reference = kick
    • comparison = turnover

which is the exact order of our table!

The arguments for oddsratio() are:

  • x = the 2x2 table described above
  • method = "wald" to use the Normal model to find the p-value and critical value
  • conf.level = 0.95 the confidence level of a desired hypothesis test
  • rev = "neither" will reverse the order of the “rows”, “columns”, or “both” of the table from x =
td_OR <- 
  oddsratio(x = drive_freq, method = "wald")

# The table with margins:
td_OR$data
##            drive_end
## drive_start noTD   TD Total
##    kick     4055 1200  5255
##    turnover  571  239   810
##    Total    4626 1439  6065
# The estimated odds ratio and confidence interval
td_OR$measure |> 
  data.frame()
##          estimate    lower    upper
## kick     1.000000       NA       NA
## turnover 1.414397 1.200264 1.666733
# And the p-value (chi.squared)
td_OR$p.value |> 
  data.frame() 
##            midp.exact fisher.exact   chi.square
## kick               NA           NA           NA
## turnover 4.724261e-05 5.233613e-05 3.264256e-05

Order of Conditioning

A big advantage of using the odds ratio over the relative risk is the order of the condition (\(Y|X\) vs \(X|Y\)) won’t change the result

\[\theta_{Y|X} = \theta_{X|Y} = \theta_{XY}\]

For instance, if we swap the order of the rows and columns with t() (for transpose), the results for the odds ratio won’t change, but the relative risk will!

# Calculate the OR swapping the rows and columns (swapping X and Y roles)
oddsratio(x = drive_freq |> t(), method = "wald")$measure |> 
  data.frame() |> 
  round(3)
##      estimate lower upper
## noTD    1.000    NA    NA
## TD      1.414   1.2 1.667

Note that our estimate and confidence interval remain exactly the same!

The same won’t be true for the relative risk:

# Calculating the relative risk for Y|X
riskratio(x = drive_freq, method = "wald")$measure |> 
  data.frame() |> 
  round(3)
##          estimate lower upper
## kick        1.000    NA    NA
## turnover    1.292 1.149 1.453
# Calculating the relative risk for X|Y
riskratio(x = drive_freq |> t(), method = "wald")$measure |> 
  data.frame() |> 
  round(3)
##      estimate lower upper
## noTD    1.000    NA    NA
## TD      1.346 1.171 1.546

Who cares if the odds ratio is symmetric (can reverse X and Y and get the same result) while the relative risk is not?

Based on how the data are collected, we can’t always estimate the conditional proportions needed for the relative risk.

For retrospective studies, we can’t estimate \(P(Y=j|X = i) = \pi_{i|j}\) since the number of cases in each group of \(Y\) is not random.

However, the counts for \(X\) within each group of \(Y\) is random in retrospective studies, which means we can estimate \(P(X = i|Y = j) = \pi_{i|j}\). And as long as we can estimate one conditional proportion, we can use the odds ratio to check if \(X\) and \(Y\) are independent and calculate confidence intervals for \(\theta_{Y|X}\) even if we can estimate \(\pi_{j|i}\) directly!