LJ Data Dive - Hypothesis Testing

#Package Loading
library(Hmisc)

## 
## Attaching package: 'Hmisc'

## The following objects are masked from 'package:base':
## 
##     format.pval, units

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.4

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter()    masks stats::filter()
## ✖ dplyr::lag()       masks stats::lag()
## ✖ dplyr::src()       masks Hmisc::src()
## ✖ dplyr::summarize() masks Hmisc::summarize()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(ggplot2)

#Loading the MoneyPuck Shot Dataset
mpd = read.csv('./shots_2024.csv')

#adding descriptors to dataframe

# Load the data dictionary (update with your file path)
#data_dict <- read.csv('./MoneyPuck_Shot_Data_Dictionary (1) (1).csv')

# Iterate through the data dictionary and assign labels (from ChatGPT -- QOL Step)

#for (i in 1:nrow(data_dict)) {
  #column_name <- data_dict$Variable[i]
  #description <- data_dict$Definition[i]
  
  #if (column_name %in% colnames(mpd)) {
    #label(mpd[[column_name]]) <- description
  #}
#}

Hypothesis Testing 1

Null Hypothesis (H0): There is no significant difference in the time on ice between defending team and shooting team players.

Alternative Hypothesis (H1): There is a significant difference in the time on ice between defending team and shooting team players.

Test Chosen: Paired t‑test Since both values are measured in the same event, a paired test controls for event‐to‐event variability.

Alpha Level: 0.05 This standard significance level implies a 5% risk of a Type I error (i.e., falsely rejecting the null hypothesis). Nothing is really at risk here so a standard significance value is taken.

Use standard .80 and .5 for Power and Minimum Effect Size.

# Calculate required sample size for a paired t-test:
sample_size <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.8, type = "paired", alternative = "two.sided")
print(sample_size$n)

## [1] 33.3672

The set has more than enough rows!

paired_t_test_result <- t.test(mpd$shootingTeamAverageTimeOnIce,                          mpd$defendingTeamAverageTimeOnIce, 
                               paired = TRUE)

# Print the test results
print(paired_t_test_result)

## 
##  Paired t-test
## 
## data:  mpd$shootingTeamAverageTimeOnIce and mpd$defendingTeamAverageTimeOnIce
## t = -38.715, df = 58558, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -3.191080 -2.883546
## sample estimates:
## mean difference 
##       -3.037313

if (paired_t_test_result$p.value < 0.05) {
  print("Reject the null hypothesis: There is a significant difference in time on ice between the shooting and defending teams.")
} else {
  print("Fail to reject the null hypothesis: There is no significant difference in time on ice between the shooting and defending teams.")
}

## [1] "Reject the null hypothesis: There is a significant difference in time on ice between the shooting and defending teams."

Interpretation

The paired t-test showed a statistically significant difference between the time on ice for the shooting and defending teams. On average, the shooting team spent about 3 seconds less on ice than the defending team. However, this 3-second difference is very small when considered in the context of game play. Due to the large sample size, even such a minor discrepancy can achieve statistical significance. In practical terms, it is unlikely that a 3-second difference would have a meaningful impact on game strategy or player performance. Overall, while the result is statistically significant, it is probably practically insignificant.

mpd_long <- mpd %>%
  select(shootingTeamAverageTimeOnIce, defendingTeamAverageTimeOnIce) %>%
  pivot_longer(cols = everything(), 
               names_to = "TeamType", 
               values_to = "TimeOnIce")

mpd_long$TeamType <- recode(mpd_long$TeamType, 
                            shootingTeamAverageTimeOnIce = "Shooting Team", 
                            defendingTeamAverageTimeOnIce = "Defending Team")

ggplot(mpd_long, aes(x = TimeOnIce, fill = TeamType)) +
  geom_density(alpha = 0.5) +
  labs(title = "Density Plot of Time on Ice: Shooting vs. Defending Teams",
       x = "Time on Ice (seconds)",
       y = "Density") +
  theme_minimal()

The density plot shows the distribution of time on ice for both the shooting and defending teams, highlighting the overall spread and concentration of values. It reveals that the shooting team’s time on ice is slightly lower than the defending team’s, but both distributions overlap significantly, indicating that the difference may not be practically significant.

Hypothesis Testing 2

(This will use a subset of 10 random games for computational efficiency)

Chosen Test: We will use Fishers Exact Test for two categorical variables, with an alpha of 0.05.

H0: There is no association between shot type (Wrist Shot, Slap Shot, Snap Shot, and Other) and goal outcome (Goal vs. No Goal). In other words, the probability of scoring a goal is the same for all shot types.

H1: There is a significant association between shot type and goal outcome. This implies that the likelihood of scoring a goal differs by shot type.

set.seed(123)

random_games <- sample(unique(mpd$game_id), 10)

mpd_filtered <- mpd %>%
  filter(game_id %in% random_games) %>%
  filter(shotType %in% c("WRIST", "SNAP", "SLAP"))

contingency_table_all_shots <- table(mpd_filtered$shotType, mpd_filtered$goal)

print(contingency_table_all_shots)

##        
##           0   1
##   SLAP   92   3
##   SNAP  123  17
##   WRIST 399  29

fisher_test_result <- fisher.test(contingency_table_all_shots)

print(fisher_test_result)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  contingency_table_all_shots
## p-value = 0.02932
## alternative hypothesis: two.sided

# Convert the contingency table into a data frame for ggplot
contingency_df <- as.data.frame(contingency_table_all_shots)

# Create a bar plot to show the proportions of goals for each shot type
ggplot(contingency_df, aes(x = Var1, y = Freq, fill = factor(Var2))) +
  geom_bar(stat = "identity", position = "fill") +
  labs(title = "Proportion of Goals by Shot Type",
       x = "Shot Type",
       y = "Proportion of Goals",
       fill = "Goal Outcome") +
  theme_minimal()

This graph provides a clear visual comparison of goal conversion rates for each shot type, showing that SNAP shots have the highest conversion rate, followed by WRIST shots and then SLAP shots. Further statistical testing could help confirm whether these differences are statistically significant.

Interpretation

The result of the Fisher’s Exact Test indicates that there is a statistically significant difference in the proportion of goals scored across different shot types (SLAP, SNAP, and WRIST). The p-value of 0.02932 is below the commonly used significance threshold of 0.05, leading to the rejection of the null hypothesis. This suggests that the distribution of goals is not the same across the three shot types, with some shot types being more or less likely to result in goals. However, while statistically significant, the practical significance of this finding should be considered. For example, even though the difference is statistically significant, the actual impact of this difference on overall game performance may be small, so further analysis would be needed to understand its true relevance in a broader context. A logical follow-up test could be a pairwise comparison using Fisher’s Exact Test for each pair of shot types (e.g., SLAP vs. SNAP, SNAP vs. WRIST, and SLAP vs. WRIST) to explore which specific shot types differ in goal outcomes.

LJ Data Dive - Hypothesis Testing

Logan Johnson

2025-03-03

Hypothesis Testing 1

Interpretation

Hypothesis Testing 2

Interpretation