Notes

Before we can answer any questions, we load the mosaic and oilabs packages and the Kobe Bryant data set.

library(mosaic)
library(oilabs)
data(kobe)

Total out of 6 points.

Question 1

Describe the distribution of streak lengths for the independent shooter. What is the typical streak length for this simulated independent shooter with a 45% shooting percentage? How long is the player’s longest streak of baskets in 133 shots?

(2 points)

set.seed(76)
outcomes <- c("H", "M")
sim_basket <- resample(outcomes, size = 133, prob = c(0.45, 0.55))
sim_streak <- calc_streak(sim_basket)
tally(~sim_streak)
## 
##  0  1  2  3  4  5  7 
## 35 17  7  5  2  1  1
barchart(tally(~sim_streak), horizontal=FALSE)

The distribution is right-skewed. The typical streak length for this kind of shooter is 0 or 1, as these are the most common values. The longest streak was 7 baskets.

Note that this is a geometric distribution in which a “success” is a “MISS”: a streak is a series of “HIT”s that ends at the first “MISS”, so its length is the number of hits before that miss.
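As a rough check on the geometric claim, the sketch below compares the observed counts from the tally() output with the counts a geometric distribution would predict. It uses only base R; the names observed, streak_len and expected are introduced here for illustration, and the comparison ignores the small edge effect from the shot sequence ending after 133 shots.

```r
# Observed streak counts, copied from the tally() output above
observed   <- c("0" = 35, "1" = 17, "2" = 7, "3" = 5, "4" = 2, "5" = 1, "7" = 1)
streak_len <- as.integer(names(observed))

# dgeom(k, prob) is P(k failures before the first success).
# Here a "success" is a miss (prob = 0.55) and each hit counts as a "failure".
expected <- sum(observed) * dgeom(streak_len, prob = 0.55)

# Side-by-side comparison, rounded to one decimal place
round(data.frame(length   = streak_len,
                 observed = as.vector(observed),
                 expected = expected), 1)
```

If the observed and expected columns are broadly similar, that is consistent with the independent shooter following a geometric pattern.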

Question 2

If you were to run the simulation of the independent shooter a second time, how would you expect its streak distribution to compare to the distribution from the question above? Exactly the same? Somewhat similar? Totally different? Explain your reasoning.

(2 points) Somewhat similar. There would be some variation since we are sampling at random, but the overall shape of the distribution would not differ by much. This assumes we don’t reset the seed with set.seed() between runs. If we set the same seed value before each simulation, we’d obtain identical, replicable pseudorandom results. If we set different seed values, we’d almost certainly obtain different results.
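The effect of the seed can be illustrated with a minimal sketch. It assumes base R's sample() with replace = TRUE as a stand-in for mosaic's resample(), and simulate_shots() is a hypothetical helper introduced only for this example.

```r
outcomes <- c("H", "M")

# Hypothetical helper: fix the seed, then draw 133 independent shots
simulate_shots <- function(seed) {
  set.seed(seed)
  sample(outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
}

same_a    <- simulate_shots(76)
same_b    <- simulate_shots(76)   # same seed as same_a
different <- simulate_shots(77)   # different seed

# Same seed: the two sequences match shot for shot
identical(same_a, same_b)

# Different seed: the sequences will generally differ,
# though the hit proportions should be in the same neighbourhood
mean(same_a == "H")
mean(different == "H")
```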

Question 3

How does Kobe Bryant’s distribution of streak lengths from the lab compare to the distribution of streak lengths for the simulated shooter? Using this comparison, do you have evidence that the hot hand model fits Kobe’s shooting patterns? Explain.

(2 points) We make the scales on the y-axis match so that things are easier to compare:

kobe_streak <- calc_streak(kobe$basket)
barchart(tally(~kobe_streak), horizontal=FALSE, ylim=c(0,40))

barchart(tally(~sim_streak), horizontal=FALSE, ylim=c(0,40))

We see that under the independence model (i.e. no hot hand), we expect to see some streaks of length 5 or even 7. Kobe Bryant’s shooting record had fewer long streaks. I argue that if the hot-hand model held for Kobe, we would have seen more streaks of relatively long length than the independent shooter produces. So we don’t have evidence to suggest that the hot hand model holds. If anything, we have mild evidence that the opposite holds!

Perhaps 133 shots is not enough? Just for fun, let’s increase the sample size to 10000 shots and observe the barchart:

sim_basket <- resample(outcomes, size = 10000, prob = c(0.45, 0.55))
sim_streak <- calc_streak(sim_basket)
barchart(tally(~sim_streak), horizontal=FALSE)

The simulated shooter produces a number of streaks longer than 4, which we do not see in Kobe’s real data.
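Because the two simulations differ greatly in size, raw counts of long streaks are hard to compare; proportions are a fairer yardstick. The sketch below is one way to do this in base R. Here streak_lengths() is a hypothetical stand-in for calc_streak(), written on the assumption that a streak is a run of hits ended by a miss or by the end of the sequence, and sample() again stands in for resample().

```r
# Hypothetical base-R stand-in for calc_streak(): lengths of runs of "H"
# that end at an "M" or at the end of the sequence
streak_lengths <- function(shots) {
  hit <- c(0, as.integer(shots == "H"), 0)  # pad both ends with a miss
  diff(which(hit == 0)) - 1                 # hits between consecutive misses
}

set.seed(76)
outcomes <- c("H", "M")
small <- sample(outcomes, size = 133,   replace = TRUE, prob = c(0.45, 0.55))
large <- sample(outcomes, size = 10000, replace = TRUE, prob = c(0.45, 0.55))

# Share of streaks longer than 4 in each simulation
mean(streak_lengths(small) > 4)
mean(streak_lengths(large) > 4)

# Theoretical share under independence: the first five shots of a streak all go in
0.45^5
```

The larger simulation should land close to the theoretical share, while the 133-shot simulation can vary noticeably from run to run, which is why a single small simulation gives only weak evidence either way.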