Motivating Question: Is a team more likely to score a touchdown after a turnover compared to a punt or kickoff?
Relative risk is often used for two binary variables, so let’s convert our data into two binary variables:
drive_start: turnover - Interception/Fumble vs kick - Kickoff/Punt
drive_end: TD - Touchdown vs noTD - Field Goal/Turnover/Punt

drives2 <-
  drives |>
  mutate(
    drive_start = if_else(drive_start %in% c("Interception", "Fumble"),
                          true = "turnover",
                          false = "kick") |>
      factor(),
    drive_end = if_else(drive_end == "Touchdown",
                        true = "TD",
                        false = "noTD") |>
      factor()
  )
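As a quick sanity check (a minimal sketch, assuming the drives2 data frame created above and the dplyr package already loaded), we can count how many drives fall in each combination of the two new variables:

# Optional check: counts for each combination of the recoded factors
drives2 |>
  count(drive_start, drive_end)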
Next, let’s display the results in a two-way table using xtabs():
# Displaying a two-way table using xtabs()
drive_freq <-
  xtabs(formula = ~ drive_start + drive_end, data = drives2)

# The addmargins() function will add the margins to the table
drive_freq |>
  addmargins()
## drive_end
## drive_start noTD TD Sum
## kick 4055 1200 5255
## turnover 571 239 810
## Sum 4626 1439 6065
Then we convert the table from counts to conditional proportions using prop.table(margin = "drive_start"):

drive_freq |>
  prop.table(margin = "drive_start") |>
  round(digits = 3)
## drive_end
## drive_start noTD TD
## kick 0.772 0.228
## turnover 0.705 0.295
Let’s create a bar chart to compare the probability of a touchdown between the two groups:
drive_freq |>
  # Calculate the conditional proportions
  prop.table(margin = "drive_start") |>
  # Convert the table to a data frame to be used by ggplot()
  data.frame() |>
  # Rename Freq to prop
  rename(prop = Freq) |>
  ggplot(
    mapping = aes(
      x = drive_start,
      fill = drive_end,
      y = prop
    )
  ) +
  geom_col(width = 0.5) +
  labs(
    y = "Proportion",
    x = "How Drive Began",
    fill = "Drive \nResult"
  ) +
  theme_test() +
  scale_y_continuous(expand = c(0, 0, 0, 0.05)) +
  scale_fill_manual(
    labels = c("No Touchdown", "Touchdown"),
    values = c("darkorange", "steelblue")
  )
The touchdown proportion is a bit higher after a turnover (0.295 vs 0.228), but the difference isn’t very noticeable.
The odds are another way of measuring how frequently an event occurs, but instead of a single probability, they compare the probability of the event happening vs the probability of it not happening using a ratio:
\[\textrm{odds} = \frac{\pi}{1-\pi}\]
If we ignore how the drive began, the odds a drive ends in a touchdown is:
\[\textrm{odds of a touchdown} = \frac{p}{1-p} = \frac{1439}{4626} = 0.31\]
which means that for every 1 drive that ends in a touchdown, about 3 drives do not.
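These overall odds can also be computed directly from the column totals of the drive_freq table built with xtabs() above (a minimal sketch):

# Overall odds of a touchdown: touchdown drives divided by non-touchdown drives
sum(drive_freq[, "TD"]) / sum(drive_freq[, "noTD"])   # 1439 / 4626, about 0.31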
One benefit of the odds is that they are easy to estimate for a binary variable: we place the sample count of successes on the top of the fraction and the sample count of failures on the bottom.
Like we did with the probabilities (or “risks”), we can use the odds to test if two binary variables are associated using the odds ratio.
The odds ratio is similar to the risk ratio, in that it is a ratio of the odds of success for two groups of a binary variable (group 1 vs group 2):
\[\theta_{XY} = \frac{\textrm{odds}_1}{\textrm{odds}_2}\]
The odds a drive ends in a touchdown if the drive starts from a turnover is:
\[\textrm{odds}_1 = \frac{239}{571} = 0.419\]
The odds a drive ends in a touchdown if the drive starts from a kickoff or punt is:
\[\textrm{odds}_2 = \frac{1200}{4055} = 0.296\]
Then our estimated odds ratio is:
\[\hat{\theta}_{Y|X} = \frac{\textrm{odds}_1}{\textrm{odds}_2} = \frac{0.419}{0.296} = 1.414\]
We can interpret the results as
“The odds a drive ends with a touchdown are 1.414 times as high if the drive starts from a turnover as if it starts from a kickoff or punt”
While both odds are below 1, the odds a drive ends in a touchdown if it starts from a turnover is about 40% higher than if it starts after a kick.
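The same conditional odds and their ratio can be pulled straight from drive_freq (a minimal sketch; the row and column names match the two-way table above):

# Odds of a touchdown for each type of drive start
odds_turnover <- drive_freq["turnover", "TD"] / drive_freq["turnover", "noTD"]  # 239 / 571
odds_kick     <- drive_freq["kick", "TD"]     / drive_freq["kick", "noTD"]      # 1200 / 4055

# Estimated odds ratio, about 1.414
odds_turnover / odds_kick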
We can estimate the odds ratio a little more directly using our sample counts. If the counts are in a 2x2 table:
|       | Y = 1      | Y = 2      |
|-------|------------|------------|
| X = 1 | \(n_{11}\) | \(n_{12}\) |
| X = 2 | \(n_{21}\) | \(n_{22}\) |
Then our estimated odds ratio is
\[\hat{\theta}_{Y|X} = \frac{p_{1|1} (1-p_{1|2})}{p_{1|2} (1-p_{1|1})} = \frac{n_{11} \times n_{22}}{n_{12} \times n_{21}}\]
The odds ratio for our example is:
\[\hat{\theta}_{Y|X} = \frac{239 \times 4055}{571 \times 1200} = 1.414\]
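In R, this cross-product shortcut is a one-liner on the cell counts of drive_freq (a minimal sketch):

# Cross-product estimate of the odds ratio: (n11 * n22) / (n12 * n21)
(drive_freq["turnover", "TD"] * drive_freq["kick", "noTD"]) /
  (drive_freq["turnover", "noTD"] * drive_freq["kick", "TD"])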
Similar to the risk ratio (or relative risk), we work with the log odds ratio \(\ln \hat{\theta}_{Y|X}\) instead of the odds ratio directly.
If all of the observed counts are at least 10:
\[\ln \hat{\theta}_{Y|X} \sim N(\ln \theta_{Y|X}, SE)\]
and the standard error is:
\[SE(\ln \hat{\theta}_{Y|X}) = \sqrt{\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}}\]
Our null hypothesis is:
\[H_0: \theta_{Y|X} = 1 \rightarrow H_0: \ln \theta_{Y|X} = 0\]
and our test statistic is:
\[z = \frac{\ln \hat{\theta}_{Y|X}}{\sqrt{\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}}}\]
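Before handing the table to epitools, here is a minimal sketch of the Wald calculation done by hand from the cell counts; the back-transformed interval reproduces the Wald confidence interval that oddsratio() reports below:

# Log odds ratio, its standard error, and the Wald test statistic by hand
n11 <- drive_freq["turnover", "TD"]     # 239
n12 <- drive_freq["turnover", "noTD"]   # 571
n21 <- drive_freq["kick", "TD"]         # 1200
n22 <- drive_freq["kick", "noTD"]       # 4055

log_or <- log((n11 * n22) / (n12 * n21))        # about 0.347
se_log <- sqrt(1/n11 + 1/n12 + 1/n21 + 1/n22)   # about 0.084

z_stat <- log_or / se_log                       # about 4.14
2 * pnorm(abs(z_stat), lower.tail = FALSE)      # two-sided p-value, far below 0.05

# 95% Wald confidence interval for the odds ratio (back-transformed from the log scale)
exp(log_or + c(-1, 1) * qnorm(0.975) * se_log)  # about (1.200, 1.667)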
We can conduct a hypothesis test for the odds ratio using oddsratio() in the epitools package.
It wants a 2x2 table with the groups of the explanatory variable as the rows (reference group first) and the outcomes of the response as the columns (the “failure” outcome first), which is the exact order of our table!
The arguments for oddsratio() are:

- x = the 2x2 table described above
- method = "wald" to use the Normal model to find the p-value and critical value
- conf.level = 0.95, the confidence level of the desired hypothesis test
- rev = "neither", which controls whether to reverse the order of the “rows”, the “columns”, or “both” of the table supplied to x =
td_OR <-
  oddsratio(x = drive_freq, method = "wald")
# The table with margins:
td_OR$data
## drive_end
## drive_start noTD TD Total
## kick 4055 1200 5255
## turnover 571 239 810
## Total 4626 1439 6065
# The estimated odds ratio and confidence interval
td_OR$measure |>
  data.frame()
## estimate lower upper
## kick 1.000000 NA NA
## turnover 1.414397 1.200264 1.666733
# And the p-value (chi.squared)
td_OR$p.value |>
  data.frame()
## midp.exact fisher.exact chi.square
## kick NA NA NA
## turnover 4.724261e-05 5.233613e-05 3.264256e-05
A big advantage of using the odds ratio over the relative risk is that the order of conditioning (\(Y|X\) vs \(X|Y\)) won’t change the result:
\[\theta_{Y|X} = \theta_{X|Y} = \theta_{XY}\]
For instance, if we swap the order of the rows and columns with
t()
(for transpose), the results for the odds ratio won’t
change, but the relative risk will!
# Calculate the OR swapping the rows and columns (swapping X and Y roles)
oddsratio(x = drive_freq |> t(), method = "wald")$measure |>
  data.frame() |>
  round(3)
## estimate lower upper
## noTD 1.000 NA NA
## TD 1.414 1.2 1.667
Note that our estimate and confidence interval remain exactly the same!
The same won’t be true for the relative risk:
# Calculating the relative risk for Y|X
riskratio(x = drive_freq, method = "wald")$measure |>
  data.frame() |>
  round(3)
## estimate lower upper
## kick 1.000 NA NA
## turnover 1.292 1.149 1.453
# Calculating the relative risk for X|Y
riskratio(x = drive_freq |> t(), method = "wald")$measure |>
  data.frame() |>
  round(3)
## estimate lower upper
## noTD 1.000 NA NA
## TD 1.346 1.171 1.546
Who cares if the odds ratio is symmetric (can reverse X and Y and get the same result) while the relative risk is not?
Depending on how the data are collected, we can’t always estimate the conditional proportions needed for the relative risk.
For retrospective studies, we can’t estimate \(P(Y=j|X = i) = \pi_{j|i}\) since the number of cases in each group of \(Y\) is not random.
However, the counts for \(X\) within each group of \(Y\) are random in retrospective studies, which means we can estimate \(P(X = i|Y = j) = \pi_{i|j}\). And as long as we can estimate one of the conditional proportions, we can use the odds ratio to check if \(X\) and \(Y\) are independent and calculate confidence intervals for \(\theta_{Y|X}\), even if we can’t estimate \(\pi_{j|i}\) directly!
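To see why, here is a minimal sketch of a hypothetical simulation (the group size of 800 and the seed are arbitrary choices for illustration; it assumes drives2 and dplyr from earlier): we mimic a retrospective design by sampling a fixed number of drives from each outcome group, which distorts the conditional probability of a touchdown but leaves the odds ratio estimate near the full-data value of 1.414.

# Mimic a retrospective (case-control style) design: fix how many drives we
# sample from each outcome group instead of sampling drives at random
set.seed(1234)

retro <-
  drives2 |>
  group_by(drive_end) |>
  slice_sample(n = 800) |>   # 800 TD drives and 800 noTD drives
  ungroup()

retro_freq <- xtabs(formula = ~ drive_start + drive_end, data = retro)

# P(TD | drive_start) is inflated because touchdown drives were oversampled...
retro_freq |>
  prop.table(margin = "drive_start") |>
  round(digits = 3)

# ...but the cross-product odds ratio should still land near 1.414
(retro_freq["turnover", "TD"] * retro_freq["kick", "noTD"]) /
  (retro_freq["turnover", "noTD"] * retro_freq["kick", "TD"])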