https://github.com/ST541-Fall2020/itsthesnake-project-HotHand
Is each made shot an independent event?
What is the frequency of occurrence of the “hot hand” in NBA basketball games?
Does the chance of a “hot hand” vary significantly by player position as opposed to simulated players?
The canonical study is Gilovich, Vallone, and Tversky (1985), which initiated the debate and attributed the belief in the hot hand to general misconceptions of chance.
They computed the probability of making a shot conditioned on the outcomes of the previous n shots (all makes or all misses) and found largely negative serial correlations.
The selection procedure here has several biases, the primary one being demonstrated by the following table
\[P(\text{shot } n+1 \text{ is made} \mid \text{player has made the previous } n \text{ shots})\]
\[P(B|A)=\frac{P(A \text{ and } B)}{P(A)}\]
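To make the selection bias concrete, here is a minimal sketch that enumerates every sequence of three shots for a hypothetical 50% shooter and computes, within each sequence, the proportion of makes among shots that immediately follow a make. The three-shot setup and the helper name prop_after_make are illustrative assumptions, not taken from the project code.

```r
# Assumption: a 50% shooter taking 3 shots, so all 8 sequences are equally likely.
seqs <- as.matrix(expand.grid(shot1 = c(FALSE, TRUE),
                              shot2 = c(FALSE, TRUE),
                              shot3 = c(FALSE, TRUE)))

# Hypothetical helper: within one sequence, the share of makes among shots
# that come right after a make (NaN when no shot follows a make).
prop_after_make <- function(s) {
  after_make <- s[-1][s[-length(s)]]
  mean(after_make)
}

props <- apply(seqs, 1, prop_after_make)

# One row per sequence, with its within-sequence proportion.
bias_table <- data.frame(seqs, prop = props)
print(bias_table)

# Average over the 6 sequences where the proportion is defined.
expected_prop <- mean(props, na.rm = TRUE)
print(expected_prop)  # 5/12, about 0.417, although every shot is a 50% shot
```

Averaging the per-sequence proportions weights each sequence equally rather than each shot equally, which is where the downward bias comes from.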
Paired t-test, after shifting each shooter’s difference by the corresponding bias
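A rough sketch of how such a bias-adjusted paired t-test could be run in R. The function name, the argument names, and the simulated inputs are all illustrative assumptions; the project's actual test code is not shown here. Only t.test() (from base R's stats package) is a real API.

```r
# Hypothetical helper: paired t-test on per-shooter shooting percentages
# after streaks of makes vs. after streaks of misses, with each shooter's
# difference shifted by the bias expected under i.i.d. shooting.
bias_adjusted_t_test <- function(pct_after_makes, pct_after_misses, bias) {
  diff_raw <- pct_after_makes - pct_after_misses
  # Under i.i.d. shooting the expected difference (the bias) is negative,
  # so subtracting it shifts each difference upward.
  diff_adj <- diff_raw - bias
  # A paired t-test is a one-sample t-test on the differences.
  t.test(diff_adj, mu = 0)
}

# Made-up inputs purely to show the call; these are NOT real NBA data.
set.seed(541)
n_shooters <- 25
pct_after_misses <- runif(n_shooters, 0.40, 0.55)
pct_after_makes  <- pct_after_misses + rnorm(n_shooters, mean = -0.02, sd = 0.08)
bias <- rep(-0.08, n_shooters)  # assumed per-shooter bias, for illustration only

result <- bias_adjusted_t_test(pct_after_makes, pct_after_misses, bias)
print(result)
```

In practice the bias vector would differ by shooter, since it depends on each shooter's number of shots, shooting percentage, and the streak length used.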
Simulated players: i.i.d. Bernoulli trials with probability of success equal to the player’s observed shooting percentage
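As a small sketch of this setup, a simulated shooter can be drawn with rbinom() using a player's observed make rate. The observed_shots vector below is a made-up placeholder, not data from the project.

```r
set.seed(541)

# Placeholder shot log for one player (TRUE = make); not real data.
observed_shots <- c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE)

# Observed shooting percentage, used as the Bernoulli success probability.
p_hat <- mean(observed_shots)

# Simulated shooter: the same number of shots, each an independent
# Bernoulli(p_hat) draw, so any streaks arise by chance alone.
simulated_shots <- rbinom(length(observed_shots), size = 1, prob = p_hat) == 1

# The matching make/miss probability vector for a sampler would be:
prob <- c(p_hat, 1 - p_hat)
```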
Streak length is determined by grouping shots by game and player and then counting consecutive makes or misses in sequence
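One possible way to compute streak lengths like this is run-length encoding. The sketch below is an assumption about how it could be done, not the definition of HotHand::streak_length; the helper run_length, the column names gameId and playerName, and the toy data are hypothetical.

```r
# Hypothetical helper: running length of the current run of identical
# outcomes, e.g. make, make, miss -> 1, 2, 1.
run_length <- function(x) sequence(rle(x)$lengths)

# Toy shot log in shot order; values are made up for illustration.
shots <- data.frame(
  gameId     = c(1, 1, 1, 1, 2, 2, 2),
  playerName = rep("A", 7),
  isShotMade = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE)
)

# Apply run_length within each game/player group so that a streak never
# carries over from one game to the next.
shots$streakLength <- ave(as.integer(shots$isShotMade),
                          shots$gameId, shots$playerName,
                          FUN = run_length)

print(shots)  # streakLength: 1 2 1 1 1 2 3
```

With dplyr, the same grouping would usually be written with group_by() and mutate(); base R's ave() is used here only to keep the sketch dependency-free.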
The simulation function takes size (the number of simulated shots per player), the number of simulated players, and the probabilities of a make and a miss as arguments. It uses a for loop because that was slightly more efficient at much larger sizes.
library(dplyr)  # tibble(), mutate(), bind_rows(), %>%
library(tidyr)  # drop_na()

# Simulate n_sim_players shooters, each taking `size` independent shots.
# prob = c(P(make), P(miss)).
simulate_players <- function(size, n_sim_players, prob = c(0.5, 0.5)) {
  shot_outcomes <- c(TRUE, FALSE)
  # All-NA placeholder rows; drop_na() removes them after the first bind.
  simulated_players <- tibble(isShotMade = rep(NA, size), streakLength = rep(NA, size), Label = rep(NA, size))
  for (i in seq_len(n_sim_players)) {
    df <- tibble(isShotMade = sample(shot_outcomes, size = size, replace = TRUE, prob = prob)) %>%
      mutate(streakLength = HotHand::streak_length(isShotMade), Label = paste0("Simulated Player #", i))
    simulated_players <- bind_rows(simulated_players, df) %>% drop_na()
  }
  # Side effect: the result is written to the global environment.
  assign("simulated_players", simulated_players, envir = .GlobalEnv)
}
A subtle but substantial bias exists in the common measure of conditional dependence of present outcomes on streaks of past outcomes in sequential data. The magnitude of this “streak selection bias” mostly decreases as the sequence gets longer but increases with streak length.
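The dependence on sequence length and streak length can be checked with a quick Monte Carlo sketch. Everything below (the helper names, the 50% shooter, the grid of shot counts, and the 5,000 replications) is an illustrative assumption rather than the project's simulation code.

```r
set.seed(541)

# Hypothetical helper: share of makes among shots that immediately follow
# at least k consecutive makes; NA when no such shot exists.
prop_after_streak <- function(shots, k) {
  n <- length(shots)
  if (n <= k) return(NA_real_)
  # Length of the current run of makes at each shot (0 on a miss).
  make_run <- sequence(rle(shots)$lengths) * shots
  eligible <- which(make_run[-n] >= k) + 1
  if (length(eligible) == 0) return(NA_real_)
  mean(shots[eligible])
}

# Average the within-sequence proportion over many i.i.d. sequences.
expected_prop <- function(n_shots, k, p = 0.5, reps = 5000) {
  props <- replicate(reps, prop_after_streak(runif(n_shots) < p, k))
  mean(props, na.rm = TRUE)
}

# Bias = expected proportion minus the true make probability (0.5 here).
bias_grid <- expand.grid(n_shots = c(50, 100, 200), k = 1:3)
bias_grid$bias <- mapply(function(n, k) expected_prop(n, k) - 0.5,
                         bias_grid$n_shots, bias_grid$k)
print(bias_grid)
```

For very short sequences the pattern is not strictly monotone, which is consistent with the "mostly" above; the grid here uses longer sequences, where the trend is easier to see.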
So far, the Bernoulli-trial simulations show a difference of 3-13% in magnitude, depending on streak length. For scale, the difference between the median three-point shooter and the top three-point shooter in the NBA over the last 5 years is about 12%.
As a player begins to heat up, their behavior often changes, as does their defender’s. These changes could also be responsible for drops in field goal percentage. Maybe some players are streaky and others are not; some could even just not be good!