Hot Hands Lab

Author

E Jeong

Load Libraries

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(openintro)
Loading required package: airports
Loading required package: cherryblossom
Loading required package: usdata

Data

data(kobe_basket)
head(kobe_basket)
# A tibble: 6 × 6
  vs     game quarter time  description                                    shot 
  <fct> <int> <fct>   <fct> <fct>                                          <chr>
1 ORL       1 1       9:47  Kobe Bryant makes 4-foot two point shot        H    
2 ORL       1 1       9:07  Kobe Bryant misses jumper                      M    
3 ORL       1 1       8:11  Kobe Bryant misses 7-foot jumper               M    
4 ORL       1 1       7:41  Kobe Bryant makes 16-foot jumper (Derek Fishe… H    
5 ORL       1 1       7:03  Kobe Bryant makes driving layup                H    
6 ORL       1 1       6:01  Kobe Bryant misses jumper                      M    

Exercise 1

Question: What does a streak length of 1 mean, i.e. how many hits and misses are in a streak of 1? What about a streak length of 0?

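# A streak length of 1 means one hit followed by one miss: the streak ends at the first miss. A streak length of 0 means zero hits and one miss, i.e. a miss that comes right after another miss (or opens the sequence).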
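
The streak definition above can be sketched in base R. This is a minimal illustration, not the `calc_streak()` implementation from openintro: `streak_lengths` is a hypothetical helper, the shot vector is made up, and trailing hits with no closing miss are simply dropped here.

```r
# Hypothetical helper: count the hits that precede each miss.
# Every miss closes a streak, so a miss right after another miss has length 0.
streak_lengths <- function(shots) {
  miss_idx <- which(shots == "M")   # positions of the misses
  diff(c(0, miss_idx)) - 1          # hits between consecutive misses
}

# Made-up sequence: H M | M | H H M | M | M | M
toy_shots <- c("H", "M", "M", "H", "H", "M", "M", "M", "M")
streak_lengths(toy_shots)
# → 1 0 2 0 0 0
```

The first streak has length 1 (one hit, then a miss), and each miss that immediately follows another miss contributes a streak of length 0.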

Calculate Streak Lengths

kobe_streak <- calc_streak(kobe_basket$shot)

Look at distribution

ggplot(data = kobe_streak, aes(x = length)) +
  geom_bar()

Exercise 2

Question: Describe the distribution of Kobe’s streak lengths from the 2009 NBA finals. What was his typical streak length? How long was his longest streak of baskets? Make sure to include the accompanying plot in your answer.

# The distribution is unimodal and right-skewed. His typical streak length was 0, the most common value in the bar chart, and his longest streak was 4 baskets.

ggplot(data = kobe_streak, aes(x = length)) +
  geom_bar()
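
A numeric summary can back up what is read off the bar chart. The sketch below uses a made-up vector, `toy_streaks` (hypothetical, not Kobe's data); in the lab, `kobe_streak$length` would presumably take its place.

```r
# Hypothetical streak lengths for illustration only (not Kobe's data).
toy_streaks <- c(0, 0, 1, 0, 2, 0, 1, 3, 0, 1)

# Counts of each streak length: the numbers behind a bar chart.
streak_counts <- table(toy_streaks)
print(streak_counts)

# The most common (typical) streak length and the longest streak.
typical_streak <- as.numeric(names(streak_counts)[which.max(streak_counts)])
longest_streak <- max(toy_streaks)
```

For this toy vector the typical streak is 0 and the longest is 3.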

Simulations in R

Coin Flip Simulation

coin_outcomes <- c("heads", "tails")
sample(coin_outcomes, size = 1, replace = TRUE)
[1] "tails"

Save Simulation

sim_fair_coin <- sample(coin_outcomes, size = 100, replace = TRUE)

View Results

sim_fair_coin
  [1] "heads" "heads" "heads" "tails" "tails" "heads" "heads" "tails" "heads"
 [10] "tails" "tails" "heads" "heads" "tails" "heads" "tails" "tails" "tails"
 [19] "tails" "tails" "heads" "heads" "tails" "tails" "heads" "tails" "heads"
 [28] "heads" "tails" "heads" "heads" "heads" "tails" "tails" "tails" "heads"
 [37] "heads" "tails" "heads" "heads" "tails" "tails" "heads" "heads" "tails"
 [46] "tails" "heads" "heads" "heads" "tails" "tails" "heads" "tails" "tails"
 [55] "tails" "heads" "heads" "tails" "tails" "tails" "heads" "heads" "tails"
 [64] "tails" "heads" "tails" "tails" "heads" "heads" "heads" "heads" "tails"
 [73] "heads" "tails" "heads" "heads" "heads" "heads" "tails" "heads" "tails"
 [82] "tails" "tails" "heads" "tails" "heads" "tails" "heads" "tails" "tails"
 [91] "heads" "tails" "tails" "heads" "tails" "heads" "heads" "tails" "heads"
[100] "tails"
table(sim_fair_coin)
sim_fair_coin
heads tails 
   50    50 

Add probability weights

sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE, 
                          prob = c(0.2, 0.8))

Exercise 3

Question: In your simulation of flipping the unfair coin 100 times, how many flips came up heads? Include the code for sampling the unfair coin in your response. Since the markdown file will run the code, and generate a new sample each time you Knit it, you should also “set a seed” before you sample. Read more about setting a seed below.

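# Set the seed before sampling so the same sample is drawn on every Knit, then count the outcomes.
set.seed(35797)
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE, 
                          prob = c(0.2, 0.8))
table(sim_unfair_coin)
# Heads came up 44 times and tails came up 56 times in this run. For reference, prob = c(0.2, 0.8) gives about 20 heads in 100 flips on average.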
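
Why the seed has to come before the call to `sample()` can be checked directly. In this minimal sketch, `flip_unfair` is a hypothetical wrapper introduced only for illustration; it sets the seed and then draws 100 weighted flips, so two calls with the same seed should return identical samples.

```r
coin_outcomes <- c("heads", "tails")

# Hypothetical helper: seed first, then draw 100 weighted flips.
flip_unfair <- function(seed) {
  set.seed(seed)
  sample(coin_outcomes, size = 100, replace = TRUE, prob = c(0.2, 0.8))
}

first_run  <- flip_unfair(35797)
second_run <- flip_unfair(35797)

# Same seed, same call: the two samples match element by element.
identical(first_run, second_run)

# Count heads and tails; factor() keeps both levels even if one never occurs.
table(factor(first_run, levels = coin_outcomes))
```

Calling `set.seed()` after sampling has no effect on a sample that was already drawn, which is why the order matters.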

Simulating the Independent Shooter

shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 1, replace = TRUE)

Exercise 4

Question: What change needs to be made to the sample function so that it reflects a shooting percentage of 45%? Make this adjustment, then run a simulation to sample 133 shots. Assign the output of this simulation to a new object called sim_basket.

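# Add prob = c(0.45, 0.55) so that "H" is drawn with probability 0.45, and set size = 133 to sample 133 shots.
sim_basket <- sample(shot_outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))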
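
A quick sanity check on a simulation like this is to confirm the number of shots and the observed share of hits. The sketch below is self-contained and uses its own object name, `sim_shots`, and an arbitrary seed; both are assumptions made for illustration rather than part of the lab.

```r
shot_outcomes <- c("H", "M")

set.seed(1)  # arbitrary seed, chosen only so this sketch is reproducible
sim_shots <- sample(shot_outcomes, size = 133, replace = TRUE,
                    prob = c(0.45, 0.55))

length(sim_shots)        # number of simulated shots
mean(sim_shots == "H")   # observed share of hits, expected to be near 0.45
```

The observed share will rarely be exactly 0.45 in a single run of 133 shots; it only needs to be in the neighborhood.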