Probability

What does a streak length of 1 mean, i.e. how many hits and misses are in a streak of 1? What about a streak length of 0?
A streak of length 1 consists of one hit followed by one miss. A streak of length 0 consists of zero hits and one miss, that is, a miss that comes immediately after another miss.
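The counting rule can be sketched in a few lines of R. The helper below, `streak_lengths`, is a hypothetical stand-in written for illustration; it is not claimed to be the lab's `calc_streak`, only to follow the same rule of counting the hits that precede each miss. It assumes shots are coded "H" and "M" and treats the end of the sequence as closing a final streak.

```r
# Hypothetical helper (an assumption, not the lab's calc_streak):
# counts the number of hits before each miss.
streak_lengths <- function(shots) {
  is_miss <- shots == "M"
  # Pad with a "miss" at both ends so the first and last runs are closed off
  miss_pos <- which(c(TRUE, is_miss, TRUE))
  # Hits between two consecutive misses = gap between their positions minus 1
  diff(miss_pos) - 1
}

example_shots <- c("H", "M", "M", "H", "H", "M", "M", "M")
example_streaks <- streak_lengths(example_shots)
example_streaks
# 1 0 2 0 0 0
```

The first "H", "M" pair gives a streak of 1, the next lone "M" gives a streak of 0, and "H", "H", "M" gives a streak of 2. The trailing 0 comes from the padding at the end of the sequence.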
Describe the distribution of Kobe’s streak lengths from the 2009 NBA finals. What was his typical streak length? How long was his longest streak of baskets?
His most frequent (and therefore “typical”) streak length is 0, which is also the median; his longest streak of baskets was 4.

median(kobe_streak)

## [1] 0

table(kobe_streak)

## kobe_streak
##  0  1  2  3  4 
## 39 24  6  6  1

outcomes <- c("heads", "tails")
sample(outcomes, size = 1, replace = TRUE)

## [1] "heads"

sim_fair_coin <- sample(outcomes, size = 100, replace = TRUE)

sim_fair_coin

##   [1] "tails" "tails" "tails" "heads" "tails" "tails" "heads" "tails"
##   [9] "heads" "tails" "heads" "heads" "heads" "heads" "heads" "tails"
##  [17] "tails" "heads" "heads" "tails" "heads" "tails" "tails" "tails"
##  [25] "heads" "tails" "tails" "tails" "heads" "tails" "tails" "tails"
##  [33] "heads" "heads" "heads" "heads" "heads" "tails" "tails" "heads"
##  [41] "tails" "heads" "heads" "tails" "tails" "heads" "tails" "heads"
##  [49] "heads" "tails" "tails" "heads" "tails" "tails" "tails" "heads"
##  [57] "heads" "tails" "tails" "tails" "tails" "heads" "tails" "heads"
##  [65] "heads" "tails" "tails" "tails" "tails" "heads" "heads" "tails"
##  [73] "heads" "tails" "heads" "heads" "heads" "heads" "tails" "heads"
##  [81] "heads" "heads" "heads" "heads" "tails" "tails" "heads" "tails"
##  [89] "heads" "heads" "heads" "tails" "heads" "heads" "tails" "heads"
##  [97] "tails" "heads" "heads" "tails"

table(sim_fair_coin)

## sim_fair_coin
## heads tails 
##    51    49

sim_unfair_coin <- sample(outcomes, size = 100, replace = TRUE, prob = c(0.2, 0.8))

prob=c(0.2, 0.8) indicates that for the two elements in the outcomes vector, we want to select the first one, heads, with probability 0.2 and the second one, tails, with probability 0.8. Another way of thinking about this is to picture the outcome space as a bag of 10 chips, where 2 chips are labeled “head” and 8 chips are labeled “tail”. At each draw, the probability of drawing a chip that says “head” is therefore 20%, and the probability of drawing “tail” is 80%.
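The bag-of-chips picture can be tried out directly. The sketch below builds a literal bag of 10 chips and draws from it with replacement; the object names `bag` and `draws`, the seed, and the number of draws are all arbitrary choices made for illustration.

```r
# A literal bag of 10 chips: 2 labeled "heads", 8 labeled "tails"
bag <- c(rep("heads", 2), rep("tails", 8))

set.seed(1)  # arbitrary seed, only so that reruns give the same draws
draws <- sample(bag, size = 10000, replace = TRUE)

# Share of each label; these should land close to 0.2 and 0.8
prop.table(table(draws))
```

Drawing uniformly from this bag is equivalent to sampling from c("heads", "tails") with prob = c(0.2, 0.8), which is why the two descriptions can be used interchangeably.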

In your simulation of flipping the unfair coin 100 times, how many flips came up heads?
Please see table below. Heads came up 16 times.

table(sim_unfair_coin)

## sim_unfair_coin
## heads tails 
##    16    84

outcomes <- c("H", "M")
sim_basket <- sample(outcomes, size = 1, replace = TRUE)

What change needs to be made to the sample function so that it reflects a shooting percentage of 45%? Make this adjustment, then run a simulation to sample 133 shots. Assign the output of this simulation to a new object called sim_basket.
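Add prob = c(0.45, 0.55) to the sample call so that "H" is drawn with probability 0.45, and set size = 133. See code below.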

outcomes <- c("H", "M")
set.seed(133) # fixes the random seed so that the same sample can be reproduced
sim_basket <- sample(outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
table(sim_basket)

## sim_basket
##  H  M 
## 73 60

In this simulation, hits outnumbered misses 73 to 60 (55% to 45%), even though the probability of a hit was set to 0.45.
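As a rough check on how unusual 73 hits is for a 45% shooter, the hit count can be compared with a binomial model. This is a side calculation, not part of the original lab; it assumes the 133 shots are independent with hit probability 0.45, which is exactly how the simulation was set up.

```r
n_shots <- 133   # shots simulated in the text
p_hit   <- 0.45  # hit probability passed to sample()
hits    <- 73    # hits observed in the simulation

expected_hits <- n_shots * p_hit                      # 59.85
sd_hits       <- sqrt(n_shots * p_hit * (1 - p_hit))  # binomial standard deviation

# P(X >= 73) for X ~ Binomial(133, 0.45); lower.tail = FALSE gives P(X > 72)
p_at_least <- pbinom(hits - 1, size = n_shots, prob = p_hit, lower.tail = FALSE)

c(expected = expected_hits, sd = sd_hits, p_at_least = p_at_least)
```

The expected count is about 60 hits, so 73 sits a little over two standard deviations above it: an uncommon draw, but one that random sampling will occasionally produce.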

Comparing Kobe Bryant to the Independent Shooter

Using calc_streak, compute the streak lengths of sim_basket.

sim_player <- calc_streak(sim_basket)
sim_player

##  [1] 0 2 0 2 2 2 0 0 0 1 0 4 2 0 2 0 0 2 1 1 3 0 0 1 5 2 0 6 2 0 0 0 0 5 1
## [36] 4 0 1 2 0 1 1 3 1 1 1 0 0 1 2 3 0 0 1 1 2 1 0 1 0 0

Describe the distribution of streak lengths. What is the typical streak length for this simulated independent shooter with a 45% shooting percentage? How long is the player’s longest streak of baskets in 133 shots?
The most frequent streak length is 0 (25 of the 61 streaks), and the median streak length is 1, so the typical streak is very short. The longest streak of baskets is 6.

median(sim_player)

## [1] 1

table(sim_player)

## sim_player
##  0  1  2  3  4  5  6 
## 25 16 12  3  2  2  1

barplot(table(sim_player))

If you were to run the simulation of the independent shooter a second time, how would you expect its streak distribution to compare to the distribution from the question above? Exactly the same? Somewhat similar? Totally different? Explain your reasoning.
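I would expect the distribution to be somewhat similar, but not exactly the same. The probability of a hit is fixed at 0.45, so the overall shape should persist: mostly streaks of length 0 and 1, with few long streaks. Each run draws a new random sample, however, so the exact counts will differ. I’ve run the simulation several times without a fixed seed, and the streak counts, median and maximum streak length varied from run to run.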
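To see how much the streak distribution moves between runs, the simulation can be repeated a few times and summarized. The sketch below is self-contained: `streak_lengths` and `simulate_streaks` are hypothetical helpers written for illustration (the first is a stand-in for the lab's `calc_streak`, assumed to count the hits before each miss), and the seed and the number of repetitions are arbitrary.

```r
# Hypothetical stand-in for calc_streak: hits before each miss
streak_lengths <- function(shots) {
  diff(which(c(TRUE, shots == "M", TRUE))) - 1
}

# One run of the independent shooter, returning its streak lengths
simulate_streaks <- function(n_shots = 133, p_hit = 0.45) {
  shots <- sample(c("H", "M"), size = n_shots, replace = TRUE,
                  prob = c(p_hit, 1 - p_hit))
  streak_lengths(shots)
}

set.seed(2009)  # arbitrary seed so that the five runs can be reproduced
runs <- replicate(5, simulate_streaks(), simplify = FALSE)

# One column per run: median streak, longest streak, number of streaks
run_summary <- sapply(runs, function(s) {
  c(median = median(s), longest = max(s), n_streaks = length(s))
})
run_summary
```

Each column describes one run. Comparing the columns shows which features stay stable across runs and which depend on the particular random sample.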
How does Kobe Bryant’s distribution of streak lengths compare to the distribution of streak lengths for the simulated shooter? Using this comparison, do you have evidence that the hot hand model fits Kobe’s shooting patterns? Explain.
The two distributions differ. The simulated player made 73 shots (55%), while Kobe made 58 (44%). The simulated player’s streak lengths covered a wider range, including 3 streaks longer than Kobe’s maximum of 4. Kobe had more streaks of length 0 (39 versus 25) and of length 1 (24 versus 16), while the simulated player had twice as many streaks of length 2 (12 versus 6) and half as many of length 3 (3 versus 6). Overall, the independent shooter tended to produce longer streaks than Kobe did. A hot hand would show up as streaks longer than those of an independent shooter, so this comparison gives no evidence that the hot hand model fits Kobe’s shooting patterns.
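The figures used in this comparison can be recomputed from the two frequency tables shown earlier. The sketch below re-enters those counts by hand, so it does not need the original data; the object names are chosen for illustration, and Kobe's counts for lengths 5 and 6 are entered as 0 because his table stops at 4.

```r
streak_len <- 0:6

# Counts copied from table(kobe_streak) and table(sim_player)
kobe_counts <- c(39, 24, 6, 6, 1, 0, 0)
sim_counts  <- c(25, 16, 12, 3, 2, 2, 1)

# Hits implied by each table: streak length times number of streaks
kobe_hits <- sum(streak_len * kobe_counts)  # 58
sim_hits  <- sum(streak_len * sim_counts)   # 73

# Share of streaks at each length, so the two players are on the same scale
streak_share <- rbind(kobe = kobe_counts / sum(kobe_counts),
                      sim  = sim_counts / sum(sim_counts))
colnames(streak_share) <- streak_len
round(streak_share, 2)
```

Working with shares rather than raw counts matters here because the two players have different numbers of streaks (76 for Kobe, 61 for the simulated shooter).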