Setup

download.file("http://www.openintro.org/stat/data/kobe.RData", destfile = "kobe.RData")
load("kobe.RData")
head(kobe)
##    vs game quarter time
## 1 ORL    1       1 9:47
## 2 ORL    1       1 9:07
## 3 ORL    1       1 8:11
## 4 ORL    1       1 7:41
## 5 ORL    1       1 7:03
## 6 ORL    1       1 6:01
##                                               description basket
## 1                 Kobe Bryant makes 4-foot two point shot      H
## 2                               Kobe Bryant misses jumper      M
## 3                        Kobe Bryant misses 7-foot jumper      M
## 4 Kobe Bryant makes 16-foot jumper (Derek Fisher assists)      H
## 5                         Kobe Bryant makes driving layup      H
## 6                               Kobe Bryant misses jumper      M
kobe$basket[1:9]
## [1] "H" "M" "M" "H" "H" "M" "M" "M" "M"

Exercise 1

A streak of length 1 means that there was only one hit, surrounded by misses on both sides. A streak of length 0 is just a single miss with no hit before it.
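
One way to see where these lengths come from is to count the hits between consecutive misses. The sketch below is a hypothetical re-implementation, not the lab's own code: streak_lengths is an assumed name, and the real calc_streak is loaded from kobe.RData and may be written differently. It is applied to the first nine shots shown in Setup.

```r
# Hypothetical helper, not the lab's calc_streak: counts hits between misses.
streak_lengths <- function(shots) {
  hit <- as.integer(shots == "H")      # 1 = hit, 0 = miss
  padded <- c(0, hit, 0)               # treat both ends as misses
  miss_pos <- which(padded == 0)       # positions of all misses
  diff(miss_pos) - 1                   # number of hits between consecutive misses
}

first_nine <- c("H", "M", "M", "H", "H", "M", "M", "M", "M")
streak_lengths(first_nine)
## [1] 1 0 2 0 0 0 0
```

Under this counting, a 0 appears whenever a miss directly follows another miss, or, because of the padding, when the sequence starts or ends on a miss.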

kobe_streak <- calc_streak(kobe$basket)
barplot(table(kobe_streak))

Exercise 2

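A large majority of streaks are of length 0 or 1, so the distribution is right-skewed. The typical streak is short (the mean is about 0.76), and the longest streak is 4 baskets.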

mean(kobe_streak)
## [1] 0.7631579
max(kobe_streak)
## [1] 4

Exercise 3

outcomes <- c("heads", "tails")
sim_unfair_coin <- sample(outcomes, size = 100, replace = TRUE, prob = c(0.2, 0.8))
coins <- table(sim_unfair_coin)
coins
## sim_unfair_coin
## heads tails 
##    18    82
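
Because sample() draws at random, the counts above will change every time the code is run. A minimal sketch, assuming reproducible output is wanted, fixes the random seed first and then reports proportions, which are easier to compare with the target probabilities of 0.2 and 0.8. The seed value and the name flips are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed, chosen only so the draw can be repeated

outcomes <- c("heads", "tails")
flips <- sample(outcomes, size = 100, replace = TRUE, prob = c(0.2, 0.8))

# Proportion of each outcome; these should land near 0.2 and 0.8
prop.table(table(flips))
```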

Exercise 4

outcomes <- c("H", "M")
sim_basket <- sample(outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
kobe$basket
##   [1] "H" "M" "M" "H" "H" "M" "M" "M" "M" "H" "H" "H" "M" "H" "H" "M" "M"
##  [18] "H" "H" "H" "M" "M" "H" "M" "H" "H" "H" "M" "M" "M" "M" "M" "M" "H"
##  [35] "M" "H" "M" "M" "H" "H" "H" "H" "M" "H" "M" "M" "H" "M" "M" "H" "M"
##  [52] "M" "H" "M" "H" "H" "M" "M" "H" "M" "H" "H" "M" "H" "M" "M" "M" "H"
##  [69] "M" "M" "M" "M" "H" "M" "H" "M" "M" "H" "M" "M" "H" "H" "M" "M" "M"
##  [86] "M" "H" "H" "H" "M" "M" "H" "M" "M" "H" "M" "H" "H" "M" "H" "M" "M"
## [103] "H" "M" "M" "M" "H" "M" "H" "H" "H" "M" "H" "H" "H" "M" "H" "M" "H"
## [120] "M" "M" "M" "M" "M" "M" "H" "M" "H" "M" "M" "M" "M" "H"
sim_basket
##   [1] "M" "M" "H" "M" "M" "H" "H" "H" "M" "M" "H" "H" "H" "H" "M" "H" "H"
##  [18] "H" "H" "M" "M" "M" "H" "M" "M" "H" "H" "M" "M" "H" "M" "M" "H" "M"
##  [35] "M" "M" "M" "H" "M" "M" "M" "M" "M" "M" "H" "H" "M" "M" "M" "M" "H"
##  [52] "H" "H" "H" "H" "H" "M" "M" "M" "H" "H" "M" "M" "M" "M" "M" "M" "H"
##  [69] "M" "M" "M" "M" "M" "H" "H" "H" "M" "H" "H" "M" "H" "H" "M" "M" "H"
##  [86] "H" "M" "M" "M" "M" "M" "H" "H" "M" "H" "H" "H" "H" "H" "M" "H" "H"
## [103] "H" "H" "H" "M" "M" "M" "H" "M" "H" "H" "M" "M" "H" "M" "M" "M" "M"
## [120] "H" "M" "H" "M" "M" "H" "M" "M" "H" "M" "H" "M" "M" "M"

On Your Own

1

sim_streak <- calc_streak(sim_basket)
mean(sim_streak)
## [1] 0.7866667
max(sim_streak)
## [1] 6

2

A repeated simulation should be similar to, but probably not the same as, the first simulation. Each new simulation is subject to random noise, so its streak distribution will vary somewhat around the average from one run to the next.
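
To get a feel for how much the results move from run to run, the simulation can be repeated many times. This is only a sketch: streak_lengths is a hypothetical stand-in for the lab's calc_streak, and the seed and the 1000 repetitions are arbitrary choices.

```r
# Hypothetical stand-in for calc_streak: hits between consecutive misses
streak_lengths <- function(shots) {
  diff(which(c(0, shots == "H", 0) == 0)) - 1
}

# Simulate one independent shooter with 133 shots and a 45% hit rate
sim_once <- function() {
  shots <- sample(c("H", "M"), size = 133, replace = TRUE, prob = c(0.45, 0.55))
  s <- streak_lengths(shots)
  c(mean = mean(s), max = max(s))
}

set.seed(42)                          # arbitrary seed for repeatability
sims <- replicate(1000, sim_once())   # 2 x 1000 matrix with rows "mean" and "max"

summary(sims["mean", ])               # spread of the mean streak length
table(sims["max", ])                  # how often each longest streak occurs
```

If the longest streaks in this table regularly reach or exceed the values seen above, a maximum of 4 or 6 is nothing unusual for an independent shooter.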

3

barplot(table(kobe_streak))

barplot(table(sim_streak))

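Kobe’s streak lengths are usually slightly shorter than the simulation’s (mean 0.76 versus 0.79, longest streak 4 versus 6). The overall distributions are otherwise pretty similar, which suggests that Kobe’s shots behave like independent shots and that the Hot Hand model is a myth.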
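
The two separate bar plots use different axes, so a side-by-side plot of proportions makes the comparison easier. The sketch below assumes a hypothetical helper, compare_streaks, and uses two short made-up vectors purely to show the call; in the lab it would be called as compare_streaks(kobe_streak, sim_streak, c("Kobe", "Simulated")).

```r
# Hypothetical helper: proportion of streaks at each length, on a shared scale
compare_streaks <- function(a, b, labels = c("a", "b")) {
  lengths_seen <- 0:max(a, b)                        # common set of streak lengths
  n_bins <- length(lengths_seen)
  props <- rbind(
    tabulate(a + 1, nbins = n_bins) / length(a),     # length 0 goes in bin 1
    tabulate(b + 1, nbins = n_bins) / length(b)
  )
  dimnames(props) <- list(labels, lengths_seen)
  props
}

# Made-up streak vectors, for illustration only (not the lab's data)
demo_a <- c(0, 0, 1, 2, 0, 1)
demo_b <- c(0, 1, 1, 3, 0, 0)

props <- compare_streaks(demo_a, demo_b, c("A", "B"))
barplot(props, beside = TRUE, legend.text = TRUE,
        xlab = "streak length", ylab = "proportion of streaks")
```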