Lab 3 Goals

(1) Think about the effects of independent and dependent events.

(2) Learn how to simulate shooting streaks in R.

(3) Compare a simulation to actual data to determine whether the hot hand phenomenon appears to be real.

library(tidyverse)
## ── Attaching packages ───────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.2     ✓ purrr   0.3.4
## ✓ tibble  3.0.3     ✓ dplyr   1.0.2
## ✓ tidyr   1.1.2     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0
## ── Conflicts ──────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(openintro)
## Loading required package: airports
## Loading required package: cherryblossom
## Loading required package: usdata

Kobe Bryant 2009 NBA finals performance

glimpse(kobe_basket)
## Rows: 133
## Columns: 6
## $ vs          <fct> ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, OR…
## $ game        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ quarter     <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, …
## $ time        <fct> 9:47, 9:07, 8:11, 7:41, 7:03, 6:01, 4:07, 0:52, 0:00, 6:3…
## $ description <fct> Kobe Bryant makes 4-foot two point shot, Kobe Bryant miss…
## $ shot        <chr> "H", "M", "M", "H", "H", "M", "M", "M", "M", "H", "H", "H…

Exercise 1

What does a streak length of 1 mean, i.e. how many hits and misses are in a streak of 1? What about a streak length of 0?

(The length of a shooting streak is defined to be the number of consecutive baskets made until a miss occurs.)
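The definition can be made concrete with a short base R sketch. The helper `streak_lengths` is a hypothetical name introduced here for illustration; it is not the `calc_streak` function from openintro, and it assumes shots are coded "H" for a hit and "M" for a miss. It is applied to the first nine shots visible in the `glimpse()` output above.

```r
# Hypothetical helper (an illustration, not the openintro implementation):
# streak lengths for a character vector of "H" (hit) and "M" (miss).
streak_lengths <- function(shots) {
  hit <- as.integer(shots == "H")       # 1 for a hit, 0 for a miss
  padded <- c(0L, hit, 0L)              # treat both ends as misses
  miss_pos <- which(padded == 0L)       # positions of all misses
  diff(miss_pos) - 1L                   # hits between consecutive misses
}

# First nine shots shown in the glimpse() output
first_nine <- c("H", "M", "M", "H", "H", "M", "M", "M", "M")
streak_lengths(first_nine)
# 1 0 2 0 0 0 0
```

Under this convention a streak of length 1 is one hit followed by a miss, and a streak of length 0 is a miss that comes immediately after another miss (or at the start of the sequence). The final 0 comes from the padding at the end of the sequence, which is a choice made in this sketch.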

Calculate Kobe Streaks

kobe_streak <- calc_streak(kobe_basket$shot)

We can then take a look at the distribution of these streak lengths:

ggplot(data = kobe_streak, aes(x = length)) + geom_bar()

Exercise 2

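The distribution of Kobe’s streak lengths from the 2009 NBA finals is right-skewed, with a shape that resembles exponential decay: at least half of his streaks have length 0.

Kobe’s mean streak length was 0.7632 baskets, and his median streak length was 0.

Kobe’s longest streak of baskets was 4.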

summary(kobe_streak)
##      length      
##  Min.   :0.0000  
##  1st Qu.:0.0000  
##  Median :0.0000  
##  Mean   :0.7632  
##  3rd Qu.:1.0000  
##  Max.   :4.0000

Simulations in R

How do we tell if Kobe’s shooting streaks are long enough to indicate that he has a hot hand? We can compare his streak lengths to those of someone without a hot hand: an independent shooter.

Example of a simulation with a coin flip

The vector coin_outcomes can be thought of as a hat with two slips of paper in it: one slip says heads and the other says tails. The function sample draws one slip from the hat and tells us whether it says heads or tails.

coin_outcomes <- c("heads", "tails")
sample(coin_outcomes, size =1, replace = TRUE)
## [1] "tails"

Simulation run 100 times

Because no probability weights are supplied, all elements in the outcomes vector have an equal probability of being drawn.

sim_fair_coin <- sample(coin_outcomes, size = 100, replace = TRUE)

Simulation results

sim_fair_coin
##   [1] "tails" "heads" "heads" "heads" "tails" "heads" "tails" "heads" "tails"
##  [10] "tails" "tails" "heads" "tails" "tails" "tails" "heads" "heads" "heads"
##  [19] "heads" "tails" "tails" "tails" "tails" "heads" "heads" "heads" "tails"
##  [28] "tails" "tails" "tails" "heads" "tails" "heads" "heads" "heads" "heads"
##  [37] "heads" "tails" "heads" "tails" "heads" "tails" "tails" "heads" "heads"
##  [46] "tails" "tails" "tails" "heads" "tails" "heads" "heads" "heads" "tails"
##  [55] "tails" "heads" "tails" "heads" "heads" "tails" "heads" "tails" "tails"
##  [64] "heads" "heads" "heads" "heads" "tails" "tails" "heads" "heads" "tails"
##  [73] "tails" "heads" "tails" "heads" "heads" "heads" "heads" "tails" "heads"
##  [82] "heads" "heads" "heads" "heads" "heads" "heads" "heads" "tails" "heads"
##  [91] "tails" "tails" "tails" "heads" "heads" "tails" "tails" "heads" "heads"
## [100] "heads"
table(sim_fair_coin)
## sim_fair_coin
## heads tails 
##    56    44
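The equal-probability behaviour can be checked by drawing many more flips and looking at proportions rather than counts. This is a minimal sketch: the seed and the sample size of 10,000 are arbitrary choices made for illustration, not values from the lab.

```r
set.seed(2024)  # arbitrary seed, used only so the sketch is reproducible

coin_outcomes <- c("heads", "tails")

# With no prob argument, sample() gives each outcome the same weight
many_flips <- sample(coin_outcomes, size = 10000, replace = TRUE)

# Proportion of each outcome; both should be close to 0.5
flip_props <- prop.table(table(many_flips))
flip_props
```

With only 100 flips a split such as 56 to 44 is unremarkable; with more flips the proportions should settle closer to one half.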

Adjustment of simulation argument to provide a vector with 2 probability weights:

set.seed(06151982)  # make sure to change the seed
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE, 
                                  prob = c(0.2, 0.8))

prob = c(0.2, 0.8) indicates that, for the two elements in the outcomes vector, we want to select the first one, heads, with probability 0.2 and the second one, tails, with probability 0.8. Another way of thinking about this is to picture the outcome space as a bag of 10 chips, where 2 chips are labeled “heads” and 8 chips are labeled “tails”. At each draw, the probability of drawing a chip that says “heads” is therefore 20%, and the probability of drawing “tails” is 80%.
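The bag-of-chips picture and the prob argument can be compared directly. The sketch below draws from a literal 10-chip bag and from a weighted two-outcome vector; the seed, the number of draws, and the variable names are assumptions made for illustration.

```r
set.seed(2024)  # arbitrary seed, used only so the sketch is reproducible
n_draws <- 10000

# Literal bag: 2 chips say "heads", 8 chips say "tails"
chip_bag <- c(rep("heads", 2), rep("tails", 8))
bag_draws <- sample(chip_bag, size = n_draws, replace = TRUE)

# Weighted draw from two outcomes, as in the lab code
weighted_draws <- sample(c("heads", "tails"), size = n_draws,
                         replace = TRUE, prob = c(0.2, 0.8))

# Both proportions of heads should be close to 0.2
bag_prop_heads <- mean(bag_draws == "heads")
weighted_prop_heads <- mean(weighted_draws == "heads")
c(bag = bag_prop_heads, weighted = weighted_prop_heads)
```

The two approaches describe the same random process, so the proportions should agree up to sampling noise.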

Exercise 3

Number of heads and tails: in this simulation of 100 flips, the unfair coin came up heads 24 times and tails 76 times.

sim_unfair_coin
##   [1] "heads" "tails" "heads" "tails" "tails" "heads" "heads" "tails" "tails"
##  [10] "tails" "heads" "tails" "heads" "tails" "tails" "tails" "heads" "heads"
##  [19] "tails" "tails" "heads" "tails" "heads" "tails" "heads" "tails" "heads"
##  [28] "tails" "tails" "tails" "tails" "heads" "tails" "tails" "tails" "tails"
##  [37] "tails" "tails" "tails" "tails" "tails" "heads" "tails" "tails" "tails"
##  [46] "tails" "tails" "tails" "tails" "tails" "tails" "tails" "tails" "tails"
##  [55] "tails" "tails" "tails" "heads" "tails" "tails" "tails" "tails" "tails"
##  [64] "heads" "heads" "tails" "heads" "tails" "heads" "tails" "heads" "tails"
##  [73] "heads" "tails" "tails" "tails" "tails" "tails" "tails" "tails" "tails"
##  [82] "tails" "tails" "tails" "tails" "tails" "tails" "heads" "tails" "tails"
##  [91] "tails" "tails" "tails" "tails" "tails" "tails" "heads" "tails" "heads"
## [100] "tails"
table(sim_unfair_coin)
## sim_unfair_coin
## heads tails 
##    24    76
?sample

Simulating the independent shooter

Simulating a basketball player who has independent shots uses the same mechanism that you used to simulate a coin flip. To simulate a single shot from an independent shooter with a shooting percentage of 50% you can run the following code:

shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 1, replace = TRUE)

Results of a single simulated shot from the independent shooter

sim_basket
## [1] "M"
table(sim_basket)
## sim_basket
## M 
## 1

Exercise 4

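To make a valid comparison between Kobe and your simulated independent shooter, you need to align both their shooting percentage and the number of attempted shots. The simulation therefore uses 133 shots, the number of rows in kobe_basket, and a hit probability of 0.45.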

set.seed(06151983)  # make sure to change the seed
sim_basket <- sample(shot_outcomes, size = 133, replace = TRUE, 
                                  prob = c(0.45, 0.55))

Results

sim_basket
##   [1] "M" "M" "H" "H" "M" "M" "H" "M" "H" "H" "M" "H" "M" "H" "M" "H" "H" "M"
##  [19] "H" "M" "H" "H" "H" "H" "M" "M" "H" "M" "M" "H" "M" "H" "H" "M" "M" "M"
##  [37] "M" "H" "M" "H" "M" "M" "M" "M" "H" "H" "M" "H" "M" "H" "H" "M" "H" "M"
##  [55] "M" "M" "H" "H" "H" "H" "M" "M" "H" "M" "H" "H" "H" "H" "H" "M" "H" "M"
##  [73] "H" "H" "H" "H" "M" "M" "M" "H" "H" "H" "M" "M" "H" "M" "M" "M" "M" "H"
##  [91] "M" "M" "M" "M" "M" "M" "M" "H" "M" "M" "H" "M" "M" "M" "H" "M" "H" "M"
## [109] "M" "H" "H" "M" "H" "H" "M" "H" "M" "H" "M" "H" "M" "M" "H" "M" "H" "H"
## [127] "M" "H" "M" "H" "M" "M" "M"
table(sim_basket)
## sim_basket
##  H  M 
## 62 71
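A quick way to confirm that a simulated shooter is aligned with the intended settings is to check the number of shots and the observed hit rate. The sketch below regenerates a shooter under its own arbitrary seed, so its counts will not match the run shown above (which had 62 hits in 133 shots, about 47%); the names `sim_check`, `n_shots`, and `hit_rate` are introduced here for illustration.

```r
set.seed(2024)  # arbitrary seed, used only so the sketch is reproducible

shot_outcomes <- c("H", "M")
sim_check <- sample(shot_outcomes, size = 133, replace = TRUE,
                    prob = c(0.45, 0.55))

n_shots <- length(sim_check)         # should equal 133
hit_rate <- mean(sim_check == "H")   # should be near 0.45, not exactly 0.45

c(n_shots = n_shots, hit_rate = hit_rate)
```

The same `mean(x == "H")` expression applied to `kobe_basket$shot` would give Kobe's observed shooting percentage, assuming the openintro package is loaded.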

More Practice

Comparing Kobe Bryant to the Independent Shooter

Exercise 5 Calculate the streak lengths of the independent shooter and store them in a data frame

sim_streak <- calc_streak(sim_basket)

Exercise 6 Distribution of streak lengths of the independent shooter

The distribution of the streak lengths of the independent shooter is very similar to Kobe Bryant’s: it is right-skewed and appears roughly exponential.

The mean streak length for this simulated shooter is 0.8611 baskets.

The longest streak of baskets in 133 shots is 5 baskets.

ggplot(data = sim_streak, aes(x = length)) + geom_bar()
summary(sim_streak)

Exercise 7

Running the simulation of the independent shooter a second time will yield a similar, but not identical, distribution of streak lengths. The individual shots differ from run to run because they are drawn at random (unless the same seed is set), but every run uses the same hit probability of 0.45 and the same 133 shots, so the overall shape of the distribution stays much the same.
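The run-to-run variation in Exercise 7 can be explored by repeating the simulation many times and recording the longest streak of each run. This is a sketch in base R under stated assumptions: `longest_streak` and `simulate_shooter` are hypothetical helpers introduced here, and the seed and the 1,000 repetitions are arbitrary choices.

```r
set.seed(2024)  # arbitrary seed, used only so the sketch is reproducible

# Longest run of consecutive hits in a vector of "H"/"M" shots
longest_streak <- function(shots) {
  runs <- rle(shots)                              # run-length encoding
  hit_runs <- runs$lengths[runs$values == "H"]    # lengths of hit runs only
  if (length(hit_runs) == 0) 0L else max(hit_runs)
}

# One independent shooter: every shot has the same hit probability
simulate_shooter <- function(n_shots = 133, p_hit = 0.45) {
  sample(c("H", "M"), size = n_shots, replace = TRUE,
         prob = c(p_hit, 1 - p_hit))
}

# Longest streak from each of 1,000 simulated shooters
longest <- replicate(1000, longest_streak(simulate_shooter()))
table(longest)
```

The resulting table shows how often an independent shooter produces a longest streak of each length, which gives a reference point for judging whether an observed longest streak is unusual.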

Exercise 8

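Kobe Bryant’s distribution of streak lengths appears similar to that of the simulated independent shooter, who by construction has no hot hand: Kobe’s mean streak length was 0.7632 with a longest streak of 4, compared with 0.8611 and 5 for the simulated shooter. This comparison therefore gives no evidence that Kobe’s shooting followed a hot hand model.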