Hot Hands Lab
Based on: http://openintrostat.github.io/oilabs-tidy/03_probability/probability.html
Load libraries
library(tidyverse)
library(openintro)
Data
data(kobe_basket)
head(kobe_basket)
# A tibble: 6 × 6
  vs     game quarter time  description                                    shot
  <fct> <int> <fct>   <fct> <fct>                                          <chr>
1 ORL       1 1       9:47  Kobe Bryant makes 4-foot two point shot        H
2 ORL       1 1       9:07  Kobe Bryant misses jumper                      M
3 ORL       1 1       8:11  Kobe Bryant misses 7-foot jumper               M
4 ORL       1 1       7:41  Kobe Bryant makes 16-foot jumper (Derek Fishe… H
5 ORL       1 1       7:03  Kobe Bryant makes driving layup                H
6 ORL       1 1       6:01  Kobe Bryant misses jumper                      M
Exercise 1
What does a streak length of 1 mean, i.e. how many hits and misses are in a streak of 1? What about a streak length of 0?
A streak length of 1 means one hit followed by one miss. A streak length of 0 means no hits and one miss, i.e. a miss with no hit before it.
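The definition above can be sketched in base R. The helper `streak_lengths` below is hypothetical and written only for illustration; it is not the source of `calc_streak`, and it assumes that every miss closes a streak and that the sequence is padded with an implicit miss at each end.

```r
# Hypothetical helper (not openintro's calc_streak): counts hits between misses.
streak_lengths <- function(shots) {
  # Code hits as 1 and misses as 0, padding with a miss at both ends (assumption).
  hit <- c(0, as.integer(shots == "H"), 0)
  miss_pos <- which(hit == 0)
  # The number of hits between two consecutive misses is the gap minus one.
  diff(miss_pos) - 1
}

shots <- c("H", "M", "M", "H", "H", "M", "M", "M")
streak_lengths(shots)
```

In this example the first streak is one hit followed by a miss (length 1) and the second is a lone miss (length 0). Because of the padding, a sequence that ends in a miss contributes one extra streak of length 0 in this sketch.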
kobe_streak <- calc_streak(kobe_basket$shot)
summary(kobe_streak)
     length
 Min.   :0.0000
 1st Qu.:0.0000
 Median :0.0000
 Mean   :0.7632
 3rd Qu.:1.0000
 Max.   :4.0000
Create a bar graph
ggplot(data = kobe_streak, aes(x = length)) +
  geom_bar()
Exercise 2
Describe the distribution of Kobe’s streak lengths from the 2009 NBA finals. What was his typical streak length? How long was his longest streak of baskets?
The distribution is right skewed and unimodal. His typical streak length was 0. His longest streak length was 4. More than half of his streaks were of length 0.
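One way to back up a description like this is to tabulate the streak lengths directly. The sketch below uses a small made-up vector, `example_streaks`, purely for illustration; it is not Kobe's data, and with the real data one would presumably use `kobe_streak$length` in its place.

```r
# Illustrative streak lengths only (made up, not taken from kobe_basket).
example_streaks <- c(0, 0, 1, 0, 2, 0, 1, 0, 4, 0)

table(example_streaks)        # count of each streak length
median(example_streaks)       # a measure of the typical streak length
max(example_streaks)          # longest streak
mean(example_streaks == 0)    # share of streaks with no hits
```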
Simulations
Simulate flipping a fair coin:
coin_outcomes <- c("heads", "tails")
sample(coin_outcomes, size = 1, replace = TRUE)
[1] "tails"
Simulate flipping a fair coin 100 times:
sim_fair_coin <- sample(coin_outcomes, size = 100, replace = TRUE)
To view the results of this simulation, type the name of the object and then use table to count up the number of heads and tails.
sim_fair_coin
[1] "tails" "heads" "heads" "heads" "tails" "tails" "tails" "tails" "heads"
[10] "tails" "heads" "tails" "tails" "tails" "tails" "tails" "heads" "heads"
[19] "tails" "tails" "tails" "tails" "heads" "tails" "heads" "heads" "heads"
[28] "heads" "tails" "tails" "tails" "tails" "tails" "heads" "tails" "tails"
[37] "tails" "tails" "heads" "heads" "tails" "tails" "tails" "heads" "tails"
[46] "tails" "heads" "heads" "heads" "tails" "tails" "heads" "tails" "heads"
[55] "heads" "tails" "heads" "tails" "heads" "heads" "tails" "heads" "heads"
[64] "tails" "tails" "heads" "heads" "heads" "tails" "tails" "tails" "heads"
[73] "heads" "tails" "heads" "heads" "tails" "tails" "tails" "heads" "heads"
[82] "heads" "heads" "heads" "tails" "tails" "tails" "heads" "heads" "heads"
[91] "heads" "tails" "heads" "heads" "tails" "tails" "tails" "heads" "heads"
[100] "heads"
table(sim_fair_coin)
sim_fair_coin
heads tails
   48    52
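The counts can also be turned into proportions, which makes runs of different sizes easier to compare. The following is a minimal sketch; the seed value is arbitrary and is set only so the run can be repeated, and `flips` and `counts` are names introduced here for illustration.

```r
coin_outcomes <- c("heads", "tails")

set.seed(1)  # arbitrary seed, chosen only so the run can be repeated
flips <- sample(coin_outcomes, size = 100, replace = TRUE)

counts <- table(flips)
prop.table(counts)  # proportions of heads and tails; they sum to 1
```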
Simulate an unfair coin that we know only lands heads 20% of the time:
set.seed(11235)
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE,
                          prob = c(0.2, 0.8))
Exercise 3
In your simulation of flipping the unfair coin 100 times, how many flips came up heads?
table(sim_unfair_coin)
sim_unfair_coin
heads tails
   22    78
In my simulation, 22 of the 100 flips came up heads.
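As a rough check on whether 22 heads is a plausible result, the count can be compared with what a binomial model would predict. This sketch assumes the 100 flips are independent with a 20% chance of heads, which is what `sample` with `replace = TRUE` simulates; the variable names are introduced here for illustration.

```r
n <- 100   # number of flips
p <- 0.2   # probability of heads

expected_heads <- n * p              # expected number of heads: 20
sd_heads <- sqrt(n * p * (1 - p))    # binomial standard deviation: 4

# How many standard deviations is the observed count of 22 from the expectation?
(22 - expected_heads) / sd_heads
```

Under these assumptions the observed count sits about half a standard deviation above the expected 20, so it is an unremarkable outcome for this coin.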
Simulating the Independent Shooter
Simulate a single shot from an independent shooter with a shooting percentage of 50%:
shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 1, replace = TRUE)
Exercise 4
What change needs to be made to the sample function so that it reflects a shooting percentage of 45%? Make this adjustment, then run a simulation to sample 133 shots. Assign the output of this simulation to a new object called sim_basket.
sim_basket <- sample(shot_outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
Exercise 5
Using calc_streak, compute the streak lengths of sim_basket, and save the results in a data frame called sim_streak.
sim_streak <- calc_streak(sim_basket)
Exercise 6
Describe the distribution of streak lengths. What is the typical streak length for this simulated independent shooter with a 45% shooting percentage? How long is the player’s longest streak of baskets in 133 shots? Make sure to include a plot in your answer.
ggplot(data = sim_streak, aes(x = length)) +
  geom_bar()
The distribution is right skewed and unimodal, with streak lengths ranging from 0 to 5. In this simulation, streaks with at least one basket were more common than streaks with no baskets, although 0 is still the single most common (typical) streak length. The player’s longest streak of baskets in 133 shots is 5.
Exercise 7
If you were to run the simulation of the independent shooter a second time, how would you expect its streak distribution to compare to the distribution from the question above? Exactly the same? Somewhat similar? Totally different? Explain your reasoning.
I would expect its streak distribution to be somewhat similar. Though technically possible, it would be unlikely for the simulation to produce the exact same distribution (without setting a seed).
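The role of the seed can be illustrated with a short sketch. The helper `simulate_shots` is hypothetical and introduced only for this example, and the seed values 1 and 2 are arbitrary.

```r
# Hypothetical helper: one run of 133 independent shots with a 45% hit rate.
simulate_shots <- function(seed) {
  set.seed(seed)
  sample(c("H", "M"), size = 133, replace = TRUE, prob = c(0.45, 0.55))
}

run_a <- simulate_shots(1)
run_b <- simulate_shots(1)   # same seed as run_a
run_c <- simulate_shots(2)   # different seed

identical(run_a, run_b)      # TRUE: the same seed reproduces the same shots
mean(run_a == "H")           # hit rate of the first run
mean(run_c == "H")           # hit rate of a run with a different seed
```

Runs with different seeds will generally differ shot by shot, but both hit rates should land somewhere near 0.45, which is why the streak distributions are expected to look similar rather than identical.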
Exercise 8
How does Kobe Bryant’s distribution of streak lengths compare to the distribution of streak lengths for the simulated shooter? Using this comparison, do you have evidence that the hot hand model fits Kobe’s shooting patterns? Explain.
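Kobe Bryant’s longest streak (4) is shorter than the simulated shooter’s (5). Most of Kobe’s streaks had length 0, so a streak of his was more likely to contain no baskets than at least one; for the simulated shooter the opposite was true. If Kobe had a hot hand, we would expect his streaks to be longer than those of an independent shooter. This comparison gives no evidence that the hot hand model fits Kobe’s shooting patterns; his streaks were, if anything, shorter than those of an independent shooter, someone without a hot hand.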
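For a further point of comparison, the streak lengths of an independent shooter can be described theoretically rather than from a single simulation. The sketch below assumes each shot is an independent hit with probability 0.45 and ignores the fact that the shot sequence is finite; under those assumptions the streak length follows a geometric distribution. The names `p_hit`, `k`, and `streak_prob` are introduced here for illustration.

```r
p_hit <- 0.45
k <- 0:5

# dgeom counts "failures" before the first "success"; here a success is a miss,
# so prob = 1 - p_hit and each failure is a hit.
streak_prob <- dgeom(k, prob = 1 - p_hit)
round(streak_prob, 3)

# Theoretical mean streak length for the independent shooter.
p_hit / (1 - p_hit)
```

Under these assumptions about 55% of streaks would have length 0 and the mean streak length would be roughly 0.82, which is close to the mean of 0.7632 reported in the summary of Kobe’s streaks.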