Hot Hands Lab

Author

Alexandra Veremeychik

Based on: http://openintrostat.github.io/oilabs-tidy/03_probability/probability.html

Load libraries

library(tidyverse)
library(openintro)

Data

data(kobe_basket)
head(kobe_basket)
# A tibble: 6 × 6
  vs     game quarter time  description                                    shot 
  <fct> <int> <fct>   <fct> <fct>                                          <chr>
1 ORL       1 1       9:47  Kobe Bryant makes 4-foot two point shot        H    
2 ORL       1 1       9:07  Kobe Bryant misses jumper                      M    
3 ORL       1 1       8:11  Kobe Bryant misses 7-foot jumper               M    
4 ORL       1 1       7:41  Kobe Bryant makes 16-foot jumper (Derek Fishe… H    
5 ORL       1 1       7:03  Kobe Bryant makes driving layup                H    
6 ORL       1 1       6:01  Kobe Bryant misses jumper                      M    

Exercise 1

What does a streak length of 1 mean, i.e. how many hits and misses are in a streak of 1? What about a streak length of 0?

A streak length of 1 means one hit followed by one miss. A streak length of 0 means a miss with no hit before it: zero hits and one miss.
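To make the definition concrete, here is a minimal base-R sketch that counts the hits before each miss. `streak_lengths` is a hypothetical helper written for illustration, not the `calc_streak` implementation from openintro, and the toy `shots` vector is made up. It assumes that trailing hits with no closing miss are ignored; `calc_streak` may treat the ends of the sequence differently.

```r
# Hypothetical helper (not part of openintro): for every miss, count the
# consecutive hits that came immediately before it.
streak_lengths <- function(shots) {
  miss_pos <- which(shots == "M")   # positions of the misses that end streaks
  diff(c(0, miss_pos)) - 1          # gap between consecutive misses, minus the miss itself
}

# Toy sequence (made up for illustration): H M | M | H H M | M | M
shots <- c("H", "M", "M", "H", "H", "M", "M", "M")
streak_lengths(shots)
# [1] 1 0 2 0 0
```

The first streak (one hit, one miss) has length 1, and each miss that directly follows another miss contributes a streak of length 0.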

kobe_streak <- calc_streak(kobe_basket$shot)
summary(kobe_streak)
     length      
 Min.   :0.0000  
 1st Qu.:0.0000  
 Median :0.0000  
 Mean   :0.7632  
 3rd Qu.:1.0000  
 Max.   :4.0000  

Create a bar graph

ggplot(data = kobe_streak, aes(x = length)) +
  geom_bar()

Exercise 2

Describe the distribution of Kobe’s streak lengths from the 2009 NBA finals. What was his typical streak length? How long was his longest streak of baskets?

The distribution is right-skewed and unimodal. His typical streak length was 0 (the median), and more than half of his streaks were of length 0. His longest streak was 4.

Simulations

Simulate flipping a fair coin:

coin_outcomes <- c("heads", "tails")
sample(coin_outcomes, size = 1, replace = TRUE)
[1] "tails"

Simulate flipping a fair coin 100 times:

sim_fair_coin <- sample(coin_outcomes, size = 100, replace = TRUE)

To view the results of this simulation, type the name of the object and then use table to count up the number of heads and tails.

sim_fair_coin
  [1] "tails" "heads" "heads" "heads" "tails" "tails" "tails" "tails" "heads"
 [10] "tails" "heads" "tails" "tails" "tails" "tails" "tails" "heads" "heads"
 [19] "tails" "tails" "tails" "tails" "heads" "tails" "heads" "heads" "heads"
 [28] "heads" "tails" "tails" "tails" "tails" "tails" "heads" "tails" "tails"
 [37] "tails" "tails" "heads" "heads" "tails" "tails" "tails" "heads" "tails"
 [46] "tails" "heads" "heads" "heads" "tails" "tails" "heads" "tails" "heads"
 [55] "heads" "tails" "heads" "tails" "heads" "heads" "tails" "heads" "heads"
 [64] "tails" "tails" "heads" "heads" "heads" "tails" "tails" "tails" "heads"
 [73] "heads" "tails" "heads" "heads" "tails" "tails" "tails" "heads" "heads"
 [82] "heads" "heads" "heads" "tails" "tails" "tails" "heads" "heads" "heads"
 [91] "heads" "tails" "heads" "heads" "tails" "tails" "tails" "heads" "heads"
[100] "heads"
table(sim_fair_coin)
sim_fair_coin
heads tails 
   48    52 

Simulate an unfair coin that we know only lands heads 20% of the time:

set.seed(11235)

sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE, 
                          prob = c(0.2, 0.8))

Exercise 3

In your simulation of flipping the unfair coin 100 times, how many flips came up heads?

table(sim_unfair_coin)
sim_unfair_coin
heads tails 
   22    78 

In my simulation, 22 flips came up heads.
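As a rough plausibility check on that count, the number of heads in 100 independent flips with a 20% chance of heads follows a binomial distribution. The sketch below uses base R only; the variable names are illustrative and not part of the lab.

```r
n_flips <- 100   # number of flips in the simulation above
p_heads <- 0.2   # probability of heads for the unfair coin

expected_heads <- n_flips * p_heads                   # mean of the binomial: 20
sd_heads <- sqrt(n_flips * p_heads * (1 - p_heads))   # standard deviation: 4

# How many standard deviations is the observed count (22) from the expectation?
(22 - expected_heads) / sd_heads
# [1] 0.5
```

An observed count half a standard deviation above the expected 20 is well within ordinary sampling variation.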

Simulating the Independent Shooter

Simulate a single shot from an independent shooter with a shooting percentage of 50%:

shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 1, replace = TRUE)

Exercise 4

What change needs to be made to the sample function so that it reflects a shooting percentage of 45%? Make this adjustment, then run a simulation to sample 133 shots. Assign the output of this simulation to a new object called sim_basket.

sim_basket <- sample(shot_outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
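The `prob` vector is matched by position to `shot_outcomes`, so 0.45 applies to "H" and 0.55 to "M". A quick way to sanity-check the setup is to look at the share of hits in a simulated run. The sketch below uses a separate object, `sim_check`, and an arbitrary seed; both are assumptions added for illustration and are not part of the lab.

```r
shot_outcomes <- c("H", "M")
hit_prob <- 0.45   # shooting percentage from the exercise

set.seed(2009)     # arbitrary seed, used here only for reproducibility
sim_check <- sample(shot_outcomes, size = 133, replace = TRUE,
                    prob = c(hit_prob, 1 - hit_prob))  # same order as shot_outcomes

# Share of hits and misses; the hit share should land near 0.45 but varies by seed
prop.table(table(sim_check))
```

If the two probabilities were swapped, the hit share would instead hover around 0.55, which is an easy mistake to catch with this check.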

Exercise 5

Using calc_streak, compute the streak lengths of sim_basket, and save the results in a data frame called sim_streak.

sim_streak <- calc_streak(sim_basket)

Exercise 6

Describe the distribution of streak lengths. What is the typical streak length for this simulated independent shooter with a 45% shooting percentage? How long is the player’s longest streak of baskets in 133 shots? Make sure to include a plot in your answer.

ggplot(data = sim_streak, aes(x = length)) +
  geom_bar()

The distribution is right-skewed and unimodal. In this run, streak lengths range from 0 to 5, so the player's longest streak of baskets in 133 shots is 5. The typical streak length is 0, which is the most common value, although streaks with at least one basket together are more frequent than streaks with no baskets.

Exercise 7

If you were to run the simulation of the independent shooter a second time, how would you expect its streak distribution to compare to the distribution from the question above? Exactly the same? Somewhat similar? Totally different? Explain your reasoning.

I would expect its streak distribution to be somewhat similar. Though technically possible, it would be unlikely for the simulation to produce the exact same distribution (without setting a seed).
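One way to see this run-to-run variation is to repeat the simulation a few times and compare simple summaries. The sketch below is an illustration under assumptions: `streak_lengths` and `simulate_streaks` are hypothetical helpers (the first stands in for `calc_streak` and ignores trailing hits with no closing miss), and the seed is arbitrary.

```r
# Hypothetical helper (not part of openintro): hits counted before each miss
streak_lengths <- function(shots) {
  miss_pos <- which(shots == "M")
  diff(c(0, miss_pos)) - 1
}

# One simulated game log: 133 independent shots at a 45% hit rate
simulate_streaks <- function() {
  shots <- sample(c("H", "M"), size = 133, replace = TRUE, prob = c(0.45, 0.55))
  streak_lengths(shots)
}

set.seed(42)  # arbitrary seed so the five runs can be reproduced
runs <- replicate(5, simulate_streaks(), simplify = FALSE)

# Longest streak and share of zero-length streaks in each run
sapply(runs, max)
sapply(runs, function(s) mean(s == 0))
```

The summaries should differ somewhat from run to run while staying in the same general range, which is what "somewhat similar" means here.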

Exercise 8

How does Kobe Bryant’s distribution of streak lengths compare to the distribution of streak lengths for the simulated shooter? Using this comparison, do you have evidence that the hot hand model fits Kobe’s shooting patterns? Explain.

Kobe Bryant's longest streak (4) is shorter than the simulated shooter's (5). More than half of Kobe's streaks had length 0, whereas for the simulated shooter streaks with at least one basket were more frequent than streaks with none. Both distributions are right-skewed, with 0 as the most common streak length. This comparison gives no evidence that the hot hand model fits Kobe's shooting patterns: his streaks were no longer than those of an independent shooter, that is, a shooter without a hot hand.
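The independent-shooter benchmark can also be written down directly. If each shot is a hit with probability p regardless of earlier shots, a streak of length k needs k hits followed by one miss, so P(length = k) = p^k (1 - p). The sketch below tabulates this for p = 0.45 and compares the implied mean streak length with Kobe's observed mean from the `summary(kobe_streak)` output above. It is an idealization that ignores the cutoff at 133 shots, and the variable names are illustrative.

```r
p_hit <- 0.45   # independent shooter's hit probability (from Exercise 4)
k <- 0:5        # streak lengths to tabulate

# P(streak = k) = p_hit^k * (1 - p_hit): k hits followed by one miss
streak_prob <- p_hit^k * (1 - p_hit)
round(setNames(streak_prob, k), 3)

# Expected streak length under independence, next to Kobe's observed mean
c(independent = p_hit / (1 - p_hit), kobe_observed = 0.7632)
```

Under this model a streak of length 0 has probability 0.55 and the expected streak length is about 0.82, slightly above Kobe's observed 0.7632, which is consistent with the conclusion that his streaks were not unusually long.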