What does a streak length of 1 mean, i.e. how many hits and misses are in a streak of 1? What about a streak length of 0?
- A streak of length 1 –> 1 hit followed by 1 miss.
- A streak of length 0 –> 0 hits and 1 miss.
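To make the counting rule concrete, here is a minimal base-R sketch of how a streak counter could work. The helper name `streak_lengths` is hypothetical and is not the lab's `calc_streak`; it only assumes the same convention, namely that every miss closes a streak and the hits just before it give the streak's length.

```r
# Hypothetical helper (not the lab's calc_streak): counts the hits
# that precede each miss in a vector of "H"/"M" outcomes.
streak_lengths <- function(shots) {
  hit <- as.integer(shots == "H")   # 1 for a hit, 0 for a miss
  padded <- c(0, hit, 0)            # sentinel misses at both ends
  miss_pos <- which(padded == 0)    # positions of all misses
  diff(miss_pos) - 1                # hits between consecutive misses
}

shots <- c("H", "M", "M", "H", "H", "M")
streak_lengths(shots)
# H M   -> streak of length 1
# M     -> streak of length 0
# H H M -> streak of length 2
# (the closing sentinel adds one trailing streak of length 0)
```

Under this convention the number of streaks is always one more than the number of misses, and the streak lengths add up to the number of hits.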
Describe the distribution of Kobe’s streak lengths from the 2009 NBA finals. What was his typical streak length? How long was his longest streak of baskets?
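As you can see in the summary below, Kobe’s mean streak length is 0.7632 and his longest streak is 4 baskets. The distribution is right-skewed: the barplot shows that the most frequent streak length is 0 (39 streaks), followed by 1 (24 streaks), so his typical streak length is 0, which is also the median. His longest streak in this data, 4, occurred only once.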
kobe_streak <- calc_streak(kobe$basket)
summary(kobe_streak)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.0000 0.0000 0.7632 1.0000 4.0000
table(kobe_streak)
## kobe_streak
## 0 1 2 3 4
## 39 24 6 6 1
barplot(table(kobe_streak))
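As a cross-check on the summary statistics, the streak lengths can be rebuilt from the table counts reported above without loading the `kobe` data. This is only a sketch: the names `streak_counts` and `rebuilt` are made up for illustration, and the counts are copied by hand from the `table(kobe_streak)` output.

```r
# Rebuild the streak lengths from the table counts reported above
streak_counts <- c("0" = 39, "1" = 24, "2" = 6, "3" = 6, "4" = 1)
rebuilt <- rep(as.integer(names(streak_counts)), times = streak_counts)

length(rebuilt)   # total number of streaks: 76
sum(rebuilt)      # total hits across all streaks: 58
mean(rebuilt)     # 58 / 76, about 0.7632
median(rebuilt)   # 0, since more than half of the streaks have length 0

# Share of streaks at each length
round(prop.table(table(rebuilt)), 3)
```

The mean of 0.7632 is pulled up by a few longer streaks, which is why the median of 0 is the better description of a typical streak here.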
In your simulation of flipping the unfair coin 100 times, how many flips came up heads?
In one of my simulations, I got 17 heads and 83 tails. In another run, I got exactly 20 heads and 80 tails.
outcomes <- c("heads", "tails")
sim_unfair_coin <- sample(outcomes, size = 100, replace = TRUE, prob = c(0.2, 0.8))
table(sim_unfair_coin)
## sim_unfair_coin
## heads tails
## 17 83
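Both results are in line with what the binomial model predicts for 100 flips with a 20% chance of heads. The sketch below works this out with base R's `dbinom`; the variable names are chosen here for illustration only.

```r
n_flips <- 100
p_heads <- 0.2

# Expected number of heads and its standard deviation
expected_heads <- n_flips * p_heads                   # 20
sd_heads <- sqrt(n_flips * p_heads * (1 - p_heads))   # 4

# Probability of exactly 17 and of exactly 20 heads
dbinom(c(17, 20), size = n_flips, prob = p_heads)

# Probability of landing within one standard deviation (16 to 24 heads)
sum(dbinom(16:24, size = n_flips, prob = p_heads))
```

With an expected count of 20 and a standard deviation of 4, a run with 17 heads sits less than one standard deviation below the expected value.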
What change needs to be made to the sample function so that it reflects a shooting percentage of 45%? Make this adjustment, then run a simulation to sample 133 shots. Assign the output of this simulation to a new object called sim_basket.
A prob argument needs to be added to the sample function, giving a probability of 0.45 for hits and 0.55 for misses.
outcomes <- c("H", "M")
sim_basket <- sample(outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
table(sim_basket)
## sim_basket
## H M
## 62 71
table(kobe$basket)
##
## H M
## 58 75
sim_streak <- calc_streak(sim_basket)
head(sim_streak)
## [1] 0 2 1 1 2 1
I ran the simulation several times to get a feel for the simulated streaks I would see. Below is a list of the mean and max streak lengths I observed. Based on these observations, the mean streak length for this simulation is around 0.81, which is close to Kobe’s mean streak length of 0.7632. Kobe’s max is 4, while the average max in the simulation is about 6 (6.125). These figures come from only a handful of runs (nine recorded means and eight recorded maxima), so they are rough estimates.
- Mean streak length: 0.8108, 0.942, 0.8611, 0.7632, 0.8356, 0.5952, 0.7179, 0.8108, 0.942
- Max: 6, 6, 8, 4, 7, 6, 4, 8
sim_basket <- sample(outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
sim_streak <- calc_streak(sim_basket)
summary(sim_streak)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.0000 0.0000 0.7867 1.0000 6.0000
table(sim_streak)
## sim_streak
## 0 1 2 3 4 5 6
## 42 20 6 4 1 1 1
# Saved streak means and maxima from the repeated runs
sim_streakMean <- c(0.8108, 0.942, 0.8611, 0.7632, 0.8356, 0.5952, 0.7179, 0.8108, 0.942)
sim_max <- c(6, 6, 8, 4, 7, 6, 4, 8)
mean(sim_streakMean)
## [1] 0.8087333
mean(sim_max)
## [1] 6.125
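Instead of recording the mean and max by hand after each run, the whole experiment could be repeated many times in one go. The following is a sketch under stated assumptions: `streak_lengths` is a hypothetical stand-in for the lab's `calc_streak` so that the code runs on its own, and the seed and the number of repetitions are arbitrary choices.

```r
# Hypothetical stand-in for calc_streak, so the sketch runs on its own
streak_lengths <- function(shots) {
  padded <- c(0, as.integer(shots == "H"), 0)
  diff(which(padded == 0)) - 1
}

set.seed(2009)   # arbitrary seed, only so the run is repeatable
n_sims <- 1000   # arbitrary number of repetitions

sim_stats <- replicate(n_sims, {
  shots <- sample(c("H", "M"), size = 133, replace = TRUE,
                  prob = c(0.45, 0.55))
  s <- streak_lengths(shots)
  c(mean = mean(s), max = max(s))
})

# sim_stats is a 2 x n_sims matrix: one column per simulated shooter
summary(sim_stats["mean", ])
table(sim_stats["max", ])

# Share of simulated shooters whose longest streak is 4 or shorter
mean(sim_stats["max", ] <= 4)
```

With this many repetitions, the summaries describe the range of means and maxima an independent 45% shooter produces, which gives a steadier baseline for comparison with Kobe’s 0.7632 and 4 than a few hand-recorded runs.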
I would expect the data from the simulation to vary each time the simulation is run, but I do expect the results to be somewhat similar. I expect the split of H and M to be close to 45% and 55%.
I ran the simulation several times, and one simulation result looked almost identical to Kobe’s streaks: it had the same max, a very close mean streak length, and a very similar distribution of streak lengths. For most of the other simulations I ran, however, the streaks were longer than Kobe’s. There was also one simulation where the mean streak length was around 0.5, which is lower than Kobe’s. I’m not sure whether this counts as evidence for or against the hot hand model, but Kobe’s streak data does appear to fit the pattern I observed in the simulations of independent shots.