library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.9
## ✔ tidyr 1.2.0 ✔ stringr 1.4.1
## ✔ readr 2.1.2 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(openintro)
## Loading required package: airports
## Loading required package: cherryblossom
## Loading required package: usdata
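A streak of length 1 consists of one hit followed by a miss. A streak of length 0 is a miss that is not preceded by a hit, that is, a miss that immediately follows another miss (or falls at the very start of the sequence).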
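To make the definition concrete, here is a minimal base-R sketch that counts the hits between consecutive misses. The helper name `streak_lengths` is hypothetical and is not the `calc_streak` function from openintro; it only counts streaks that are ended by a miss, so trailing hits at the end of a sequence are ignored.

```r
# Hypothetical helper: length of each streak of hits that ends in a miss.
streak_lengths <- function(shots) {
  # Positions of the misses, with a virtual miss placed before the first shot
  miss_idx <- c(0, which(shots == "M"))
  # Hits between two consecutive misses = gap between their positions minus 1
  diff(miss_idx) - 1
}

example_shots <- c("H", "M", "M", "H", "H", "M")
example_streaks <- streak_lengths(example_shots)
example_streaks  # yields 1, 0, 2
```

The first streak (H then M) has length 1, the second miss directly follows a miss and so has length 0, and the final streak (H, H, then M) has length 2.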
kobe_streak <- calc_streak(kobe_basket$shot)
ggplot(data = kobe_streak, aes(x = length)) +
  geom_bar()
Kobe’s streak lengths are right-skewed and fall off roughly like a discrete graph of 1/x: streaks of length 0 and 1 are fairly common, whereas longer streaks become progressively rarer.
In terms of Kobe’s streaks, the observed lengths range from 0 to 4.
set.seed(0144)
coin_outcomes <- c("heads", "tails")
sim_fair_coin <- sample(coin_outcomes, size = 100, replace = TRUE)
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE,
                          prob = c(0.2, 0.8))
table(sim_unfair_coin)
## sim_unfair_coin
## heads tails
## 20 80
In this run of 100 simulated flips, 20 came up heads and 80 came up tails.
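A single run gives only one count of heads. As a rough sketch of how much that count moves from run to run, the block below repeats the same unfair-coin simulation many times. The seed and the number of repetitions (1000) are arbitrary choices made for illustration, not values from the lab.

```r
set.seed(1)  # arbitrary seed, chosen only for reproducibility of this sketch

coin_outcomes <- c("heads", "tails")

# Repeat the 100-flip unfair-coin simulation 1000 times and record
# the number of heads in each repetition.
heads_counts <- replicate(
  1000,
  sum(sample(coin_outcomes, size = 100, replace = TRUE,
             prob = c(0.2, 0.8)) == "heads")
)

mean(heads_counts)   # average number of heads per 100 flips
range(heads_counts)  # smallest and largest count observed
```

The average should sit near 20, the expected count for `prob = c(0.2, 0.8)`, while individual runs scatter around it.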
set.seed(0144)
shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
table(sim_basket)
## sim_basket
## H M
## 58 75
In effect, three things need to be updated relative to the coin example: the outcome vector (shot_outcomes), the size argument, and the prob argument.
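One way to keep those three pieces together is to wrap the call to `sample()` in a small function. This is only a sketch: the name `simulate_shooter` and its argument names are hypothetical, and the seed simply mirrors the one used above.

```r
# Hypothetical wrapper around sample() for an independent shooter.
#   n_shots  : number of shots to simulate (the `size` argument)
#   p_hit    : probability that any single shot is a hit
#   outcomes : labels for hit and miss, in that order
simulate_shooter <- function(n_shots, p_hit, outcomes = c("H", "M")) {
  sample(outcomes, size = n_shots, replace = TRUE,
         prob = c(p_hit, 1 - p_hit))
}

set.seed(144)
sim <- simulate_shooter(n_shots = 133, p_hit = 0.45)
table(sim)
```

Changing the shooting percentage or the number of shots then only requires changing one argument each.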
sim_streak <- calc_streak(sim_basket)
sim_streak
ggplot(data = sim_streak, aes(x = length)) +
  geom_bar()
The distribution of the simulated streak lengths also looks like a discrete graph of 1/x: streaks of length 0 and 1 are fairly common, whereas longer streaks become progressively rarer.
In terms of the streak, the simulated lengths range from 0 to 6.
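Under the independence assumption used in the simulation, the decay in streak lengths can be made more precise. A streak of length k that ends in a miss requires k hits followed by one miss, which has probability 0.45^k * 0.55. This is a geometric distribution rather than a true 1/x curve. The sketch below computes these probabilities with `dgeom()` from base R. It assumes the 45% hit rate used above and ignores any unfinished streak at the end of the sequence.

```r
p_hit <- 0.45   # hit probability used in the simulation above
k <- 0:6        # streak lengths to evaluate

# dgeom(k, prob) gives P(k "failures" before the first "success").
# Here a "success" is a miss (probability 1 - p_hit), and the
# "failures" are the hits that precede it.
theoretical <- dgeom(k, prob = 1 - p_hit)
names(theoretical) <- k

round(theoretical, 3)
```

Each probability is 0.45 times the previous one, which is why long streaks such as 6 are rare for an independent shooter.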
If the simulation were run a second time, I would expect the streak distribution to be approximately the same in shape, though not identical. As a more playful aside, the shape is loosely reminiscent of Benford’s Law, in that small values occur far more often than large ones. In addition, I can say with a high degree of confidence that the streak of 6 will most likely not be replicated in most other runs, as the probability of its occurrence is very small.
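The expectation about repeated runs can be checked directly by simulating the independent shooter twice with different seeds and tabulating the streak lengths side by side. This is a sketch only: the helper names `streak_lengths` and `simulate_streaks` are hypothetical, the seeds 1 and 2 are arbitrary, and the streak count is a simplified base-R version that only counts streaks ended by a miss, not openintro's `calc_streak`.

```r
# Hypothetical helper: hits between consecutive misses.
streak_lengths <- function(shots) {
  miss_idx <- c(0, which(shots == "M"))
  diff(miss_idx) - 1
}

# Hypothetical helper: simulate one independent shooter and
# return the streak lengths for that run.
simulate_streaks <- function(seed, n_shots = 133, p_hit = 0.45) {
  set.seed(seed)
  shots <- sample(c("H", "M"), size = n_shots, replace = TRUE,
                  prob = c(p_hit, 1 - p_hit))
  streak_lengths(shots)
}

run_a <- simulate_streaks(seed = 1)
run_b <- simulate_streaks(seed = 2)

# Count how often each streak length occurs in each run
max_len <- max(run_a, run_b)
counts <- rbind(
  run_a = tabulate(run_a + 1, nbins = max_len + 1),
  run_b = tabulate(run_b + 1, nbins = max_len + 1)
)
colnames(counts) <- 0:max_len
counts
```

The two rows should show the same overall shape (many short streaks, few long ones) while differing in the exact counts and in the longest streak observed.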
Kobe’s distribution of streak lengths shows a few key similarities to the “random” distribution. Firstly, he has a remarkably high rate of sinking baskets, which increases the odds of a streak of length 1 or more. It is interesting to note that the simulated shooter had a wider range of streak lengths (0 to 6), whereas Kobe’s ranged only from 0 to 4. Suffice it to say, judging by this distribution, the hot hand model does not appear to fit Kobe’s shooting, as nothing indicates that the outcomes of previous shots predict the outcome of the next one.
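One way to make the independence argument concrete is to compare the hit rate on shots that follow a hit with the hit rate on shots that follow a miss. For an independent shooter the two should be about equal. The sketch below shows the comparison on a simulated independent shooter, with an arbitrary seed and a deliberately large number of shots (10000) so that the proportions are stable. Applying the same comparison to `kobe_basket$shot` is an assumption about how one might extend it, not something done in this lab.

```r
set.seed(1)  # arbitrary seed for this sketch

# Simulated independent shooter with a 45% hit rate
shots <- sample(c("H", "M"), size = 10000, replace = TRUE,
                prob = c(0.45, 0.55))

# Pair each shot with the shot immediately before it
prev <- head(shots, -1)  # shots 1 .. n-1
curr <- tail(shots, -1)  # shots 2 .. n

# Proportion of hits among shots that follow a hit / a miss
hit_after_hit  <- mean(curr[prev == "H"] == "H")
hit_after_miss <- mean(curr[prev == "M"] == "H")

c(after_hit = hit_after_hit, after_miss = hit_after_miss)
```

A hot hand would show up as a noticeably higher hit rate after a hit than after a miss. For the independent shooter both values should land close to 0.45.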