Basketball players who make several baskets in succession are described as having a hot hand. Fans and players have long believed in the hot hand phenomenon, which refutes the assumption that each shot is independent of the next.
Our investigation will focus on the performance of one player: Kobe Bryant of the Los Angeles Lakers. His performance against the Orlando Magic in the 2009 NBA finals earned him the title Most Valuable Player and many spectators commented on how he appeared to show a hot hand. Let’s load some data from those games and look at the first several rows.
load("more/kobe.RData")
head(kobe)
## vs game quarter time
## 1 ORL 1 1 9:47
## 2 ORL 1 1 9:07
## 3 ORL 1 1 8:11
## 4 ORL 1 1 7:41
## 5 ORL 1 1 7:03
## 6 ORL 1 1 6:01
## description basket
## 1 Kobe Bryant makes 4-foot two point shot H
## 2 Kobe Bryant misses jumper M
## 3 Kobe Bryant misses 7-foot jumper M
## 4 Kobe Bryant makes 16-foot jumper (Derek Fisher assists) H
## 5 Kobe Bryant makes driving layup H
## 6 Kobe Bryant misses jumper M
For example, in Game 1 Kobe had the following sequence of hits and misses from his nine shot attempts in the first quarter:
\[ \textrm{H M | M | H H M | M | M | M} \]
To verify this use the following command:
kobe$basket[1:9]
## [1] "H" "M" "M" "H" "H" "M" "M" "M" "M"
Within the nine shot attempts, there are six streaks, which are separated by a “|” above. Their lengths are one, zero, two, zero, zero, zero (in order of occurrence).
What does a streak length of 1 mean, i.e. how many hits and misses are in a streak of 1? What about a streak length of 0?
Answer:
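The streak definition above can be sketched in a few lines of R. This is only an illustration: streak_lengths is a hypothetical helper written for this example, not the lab's calc_streak, and it assumes that shots are coded "H" and "M" and that a streak is the run of hits ending at each miss. The lab's own function may treat the end of the sequence differently.

```r
# Hypothetical helper (not the lab's calc_streak): counts the hits
# that precede each miss; trailing hits form one final streak.
streak_lengths <- function(shots) {
  miss_pos <- which(shots == "M")
  # Hits before each miss = gap between consecutive misses, minus the miss itself
  lens <- diff(c(0, miss_pos)) - 1
  # Hits after the last miss (if any) are counted as a final, unfinished streak
  last_miss <- if (length(miss_pos) > 0) max(miss_pos) else 0
  trailing <- length(shots) - last_miss
  if (trailing > 0) lens <- c(lens, trailing)
  lens
}

first_quarter <- c("H", "M", "M", "H", "H", "M", "M", "M", "M")
streak_lengths(first_quarter)
# 1 0 2 0 0 0
```

Under this definition a streak of length 1 is one hit followed by one miss, and a streak of length 0 is a miss with no hit before it.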
The custom function calc_streak, which was loaded in with the data, may be used to calculate the lengths of all shooting streaks and then look at the distribution.
kobe_streak <- calc_streak(kobe$basket)
barplot(table(kobe_streak))
Note that instead of making a histogram, we chose to make a bar plot from a table of the streak data. A bar plot is preferable here since our variable is discrete – counts – instead of continuous.
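To see what the table step contributes, here is a small sketch using only the six streak lengths from the first-quarter example. The object names first_quarter_streaks and streak_counts are made up for this illustration.

```r
# Streak lengths from the nine first-quarter shots described earlier
first_quarter_streaks <- c(1, 0, 2, 0, 0, 0)

# table() counts how many streaks there are of each distinct length
streak_counts <- table(first_quarter_streaks)
streak_counts

# One bar per distinct length, with bar height equal to the count
barplot(streak_counts, xlab = "Streak length", ylab = "Number of streaks")
```

Each distinct length gets its own bar, so no binning decisions are needed as they would be for a histogram.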
Describe the distribution of Kobe’s streak lengths from the 2009 NBA finals. What was his typical streak length? How long was his longest streak of baskets?
summary(kobe_streak)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.0000 0.0000 0.7632 1.0000 4.0000
Answer:
-The distribution of Kobe’s streak lengths is unimodal and right skewed.
-The typical streak length was 0.
-The longest streak of baskets was length 4.
In your simulation of flipping the unfair coin 100 times, how many flips came up heads?
outcomes <- c("heads", "tails")
sim_unfair_coin <- sample(outcomes, size = 100, replace = TRUE, prob = c(0.2, 0.8))
table(sim_unfair_coin)
## sim_unfair_coin
## heads tails
## 21 79
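Because sample draws at random, the count of heads changes from run to run. A minimal sketch of making a run repeatable with set.seed follows. The seed value 123 is arbitrary, and n_heads and expected_heads are names introduced only for this example.

```r
set.seed(123)  # arbitrary seed, used only so the run can be repeated

outcomes <- c("heads", "tails")
sim_unfair_coin <- sample(outcomes, size = 100, replace = TRUE,
                          prob = c(0.2, 0.8))

# Observed number of heads in this particular run
n_heads <- sum(sim_unfair_coin == "heads")
n_heads

# Long-run average number of heads for 100 flips with P(heads) = 0.2
expected_heads <- 100 * 0.2
expected_heads
```

Any single run will usually land near, but not exactly on, the expected 20 heads.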
What change needs to be made to the sample function so that it reflects a shooting percentage of 45%? Make this adjustment, then run a simulation to sample 133 shots. Assign the output of this simulation to a new object called sim_basket.
outcomes <- c("H", "M")
sim_basket <- sample(outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
sim_basket
## [1] "H" "M" "M" "M" "M" "H" "H" "M" "M" "M" "M" "M" "H" "H" "M" "M" "H"
## [18] "M" "H" "M" "M" "H" "H" "H" "H" "M" "M" "M" "M" "M" "H" "M" "M" "H"
## [35] "H" "M" "M" "M" "M" "H" "M" "H" "M" "H" "M" "H" "M" "H" "M" "M" "M"
## [52] "M" "M" "H" "M" "M" "M" "H" "H" "M" "M" "M" "M" "M" "H" "H" "M" "H"
## [69] "M" "M" "H" "M" "H" "M" "H" "M" "H" "H" "M" "H" "H" "M" "M" "M" "M"
## [86] "M" "H" "M" "M" "M" "M" "M" "M" "H" "M" "M" "M" "M" "M" "M" "H" "M"
## [103] "M" "M" "H" "M" "H" "H" "M" "H" "M" "H" "M" "M" "H" "H" "H" "M" "M"
## [120] "H" "M" "H" "M" "H" "H" "H" "M" "H" "H" "H" "M" "H" "M"
kobe$basket
## [1] "H" "M" "M" "H" "H" "M" "M" "M" "M" "H" "H" "H" "M" "H" "H" "M" "M"
## [18] "H" "H" "H" "M" "M" "H" "M" "H" "H" "H" "M" "M" "M" "M" "M" "M" "H"
## [35] "M" "H" "M" "M" "H" "H" "H" "H" "M" "H" "M" "M" "H" "M" "M" "H" "M"
## [52] "M" "H" "M" "H" "H" "M" "M" "H" "M" "H" "H" "M" "H" "M" "M" "M" "H"
## [69] "M" "M" "M" "M" "H" "M" "H" "M" "M" "H" "M" "M" "H" "H" "M" "M" "M"
## [86] "M" "H" "H" "H" "M" "M" "H" "M" "M" "H" "M" "H" "H" "M" "H" "M" "M"
## [103] "H" "M" "M" "M" "H" "M" "H" "H" "H" "M" "H" "H" "H" "M" "H" "M" "H"
## [120] "M" "M" "M" "M" "M" "M" "H" "M" "H" "M" "M" "M" "M" "H"
sim_basket
## [1] "H" "M" "M" "M" "M" "H" "H" "M" "M" "M" "M" "M" "H" "H" "M" "M" "H"
## [18] "M" "H" "M" "M" "H" "H" "H" "H" "M" "M" "M" "M" "M" "H" "M" "M" "H"
## [35] "H" "M" "M" "M" "M" "H" "M" "H" "M" "H" "M" "H" "M" "H" "M" "M" "M"
## [52] "M" "M" "H" "M" "M" "M" "H" "H" "M" "M" "M" "M" "M" "H" "H" "M" "H"
## [69] "M" "M" "H" "M" "H" "M" "H" "M" "H" "H" "M" "H" "H" "M" "M" "M" "M"
## [86] "M" "H" "M" "M" "M" "M" "M" "M" "H" "M" "M" "M" "M" "M" "M" "H" "M"
## [103] "M" "M" "H" "M" "H" "H" "M" "H" "M" "H" "M" "M" "H" "H" "H" "M" "M"
## [120] "H" "M" "H" "M" "H" "H" "H" "M" "H" "H" "H" "M" "H" "M"
Both data sets represent the results of 133 shot attempts, each with the same shooting percentage of 45%. We know that our simulated data is from a shooter that has independent shots. That is, we know the simulated shooter does not have a hot hand.
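For an independent shooter, the number of hits before the next miss follows a geometric distribution, which gives a rough benchmark for what streak lengths to expect. The sketch below assumes a hit probability of exactly 0.45 and ignores the fact that the sequence stops after 133 shots. The names p_hit, p_miss, and expected_mean are introduced only for this example.

```r
p_hit  <- 0.45
p_miss <- 1 - p_hit

# P(streak length = k) = p_hit^k * p_miss, i.e. k hits and then a miss.
# dgeom() counts "failures before the first success", so the miss plays
# the role of the success here.
streak <- 0:4
prob   <- dgeom(streak, prob = p_miss)
round(data.frame(streak, prob), 3)

# Mean streak length under independence
expected_mean <- p_hit / p_miss
expected_mean
```

Under these assumptions more than half of all streaks have length 0, and the mean streak length is about 0.82, so a right-skewed distribution with a mode of 0 is what independence alone would produce.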
Using calc_streak, compute the streak lengths of sim_basket.
sim_streak <- calc_streak(sim_basket)
head(sim_streak)
## [1] 1 0 0 0 2 0
barplot(table(sim_streak), col = "lightgreen")
summary(sim_streak)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.0000 0.0000 0.6341 1.0000 4.0000
-The distribution is right skewed.
-The typical streak length is 0.
-The longest streak of baskets is length 4.
sim_basket <- sample(outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
sim_streak <- calc_streak(sim_basket)
summary(sim_streak)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.0000 0.0000 0.6962 1.0000 6.0000
If you were to run the simulation of the independent shooter a second time, how would you expect its streak distribution to compare to the distribution from the question above? Exactly the same? Somewhat similar? Totally different? Explain your reasoning.
Answer: I would expect the streak distribution to vary somewhat from run to run, because each simulation draws a new random sample, but the results should be somewhat similar. The shots are independent of each other, and the probability of making a shot does not change between the two simulations.
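One way to see how much an independent shooter varies is to repeat the simulation many times and record the longest streak from each run. The sketch below is self-contained, so it uses base R's rle instead of calc_streak. The helper longest_hit_run, the seed, and the choice of 1000 repetitions are assumptions made for this illustration.

```r
set.seed(2009)  # arbitrary seed, used only so the run can be repeated
outcomes <- c("H", "M")

# Hypothetical helper: length of the longest run of consecutive hits
longest_hit_run <- function(shots) {
  runs <- rle(shots)
  hit_runs <- runs$lengths[runs$values == "H"]
  if (length(hit_runs) == 0) 0L else max(hit_runs)
}

# Simulate 133 independent shots at 45%, 1000 times over,
# keeping the longest streak from each simulated series
max_streaks <- replicate(1000, {
  shots <- sample(outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
  longest_hit_run(shots)
})

table(max_streaks)
```

The spread of this table shows the range of longest streaks that chance alone produces, which is the context needed when judging any single run.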
How does Kobe Bryant’s distribution of streak lengths compare to the distribution of streak lengths for the simulated shooter? Using this comparison, do you have evidence that the hot hand model fits Kobe’s shooting patterns? Explain.
Answer: The distributions of streak lengths are similar for Kobe and for the simulated shooter. Both distributions are right skewed and both have a mode of 0. Because Kobe’s streaks look much like those of an independent shooter, there is not enough evidence that the hot hand model fits Kobe’s shooting patterns.