Sarah Wigodsky

DATA 606 Lab 2

Probability - Streaks in Basketball

  1. A streak length of 1 means that the player made one shot and then missed the next. A streak length of 0 means that the player missed a shot immediately after missing the previous one.
load("more/kobe.RData")
  1. The distribution of Kobe’s streak lengths is unimodal and skewed right. Because of the skew, the median is a better description of a typical streak than the mean.
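
The streak definition above can be made concrete with a small sketch. The helper below, `streak_lengths`, is a hypothetical stand-in written for illustration; it is not the lab's `calc_streak()` (which is loaded from `kobe.RData`), though it is meant to follow the same definition. It assumes shots are coded as "H" and "M" and treats the sequence as if a miss came just before the first shot and just after the last one.

```r
# Hypothetical helper (not the lab's calc_streak()): counts the hits ("H")
# between consecutive misses, padding the sequence with a miss at each end.
streak_lengths <- function(shots) {
  hit <- as.integer(shots == "H")   # 1 for a hit, 0 for a miss
  padded <- c(0L, hit, 0L)          # assumed miss before and after the sequence
  miss_pos <- which(padded == 0L)   # positions of every miss, padding included
  diff(miss_pos) - 1L               # number of hits between neighboring misses
}

shots <- c("H", "M", "M", "H", "H")  # made-up example sequence
streak_lengths(shots)
```

For this made-up sequence the result is 1, 0, 2: one hit and then a miss, a miss right after a miss, and two hits at the end.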

Calculating Kobe’s Median Streak

kobe_streak <- calc_streak(kobe$basket)
median(kobe_streak)
## [1] 0

The median streak is zero. This means that at least half of Kobe’s streaks had length zero, so a typical streak was a miss immediately following another miss. The longest streak was 4 hits in a row.
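
As a rough illustration of how these summaries are read, the sketch below uses a made-up vector, `toy_streak`, which is an assumption for demonstration and not Kobe's data. It shows that a median of 0 goes together with at least half of the values being 0, and how the longest streak and the full tabulation can be obtained.

```r
# Made-up streak lengths for illustration only; these are NOT Kobe's data.
toy_streak <- c(0, 0, 0, 0, 0, 1, 1, 2, 4)

median(toy_streak)      # middle value of the sorted streak lengths
mean(toy_streak == 0)   # share of streaks that are exactly zero
max(toy_streak)         # longest streak
table(toy_streak)       # number of streaks at each length
```

The same four calls could be applied to `kobe_streak` once the lab data are loaded.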

Simulating an Unfair Coin

outcomes <- c("heads", "tails")
sim_unfair_coin <- sample(outcomes, size = 100, replace = TRUE, prob = c(0.2,0.8))
sim_unfair_coin
##   [1] "tails" "heads" "tails" "heads" "tails" "tails" "tails" "tails"
##   [9] "tails" "tails" "tails" "tails" "tails" "tails" "heads" "tails"
##  [17] "tails" "heads" "tails" "heads" "heads" "tails" "tails" "tails"
##  [25] "tails" "tails" "tails" "tails" "tails" "tails" "tails" "tails"
##  [33] "tails" "tails" "tails" "tails" "tails" "tails" "tails" "tails"
##  [41] "tails" "heads" "tails" "tails" "heads" "tails" "heads" "tails"
##  [49] "tails" "heads" "tails" "tails" "tails" "tails" "tails" "tails"
##  [57] "tails" "tails" "tails" "tails" "tails" "tails" "tails" "tails"
##  [65] "tails" "tails" "tails" "tails" "tails" "tails" "tails" "tails"
##  [73] "heads" "tails" "tails" "heads" "tails" "tails" "tails" "tails"
##  [81] "tails" "tails" "tails" "tails" "tails" "tails" "tails" "tails"
##  [89] "tails" "tails" "tails" "heads" "tails" "tails" "tails" "heads"
##  [97] "heads" "tails" "tails" "heads"
table(sim_unfair_coin)
## sim_unfair_coin
## heads tails 
##    16    84
  1. There were 16 heads and 84 tails.
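
To put the 16 heads in context, the count of heads in 100 independent flips with a 0.2 chance of heads follows a binomial distribution. The short calculation below is a sketch of the expected count and its standard deviation; the variable names are chosen here for illustration.

```r
n <- 100         # number of flips, as in the simulation above
p_heads <- 0.2   # probability of heads passed to sample()

expected_heads <- n * p_heads                   # mean of the binomial count
sd_heads <- sqrt(n * p_heads * (1 - p_heads))   # standard deviation of the count

expected_heads
sd_heads
(16 - expected_heads) / sd_heads   # distance of the observed 16 heads from the mean, in SDs
```

The expected count is 20 heads with a standard deviation of 4, so the observed 16 heads is one standard deviation below the mean, which is ordinary sampling variation.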

Simulating Kobe Bryant’s shots - Question 4

outcomes <- c("H", "M")
sim_basket <- sample(outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
sim_basket
##   [1] "M" "H" "M" "M" "H" "M" "M" "H" "M" "H" "H" "M" "M" "M" "M" "M" "H"
##  [18] "M" "M" "M" "H" "M" "H" "H" "H" "H" "H" "M" "M" "H" "H" "H" "H" "M"
##  [35] "M" "M" "M" "M" "M" "M" "M" "H" "H" "M" "H" "M" "H" "H" "H" "H" "H"
##  [52] "M" "M" "H" "M" "H" "H" "H" "H" "H" "H" "H" "M" "M" "H" "M" "M" "M"
##  [69] "M" "M" "H" "M" "M" "H" "M" "H" "M" "M" "H" "M" "H" "H" "H" "M" "M"
##  [86] "M" "H" "M" "M" "H" "H" "H" "H" "M" "M" "M" "H" "M" "M" "M" "M" "H"
## [103] "H" "M" "M" "H" "M" "M" "H" "H" "M" "H" "H" "M" "H" "M" "H" "H" "H"
## [120] "M" "M" "H" "H" "H" "M" "H" "H" "H" "M" "H" "M" "H" "H"
table(sim_basket)
## sim_basket
##  H  M 
## 66 67

Comparing the Simulation to Kobe’s statistics

kobe$basket
##   [1] "H" "M" "M" "H" "H" "M" "M" "M" "M" "H" "H" "H" "M" "H" "H" "M" "M"
##  [18] "H" "H" "H" "M" "M" "H" "M" "H" "H" "H" "M" "M" "M" "M" "M" "M" "H"
##  [35] "M" "H" "M" "M" "H" "H" "H" "H" "M" "H" "M" "M" "H" "M" "M" "H" "M"
##  [52] "M" "H" "M" "H" "H" "M" "M" "H" "M" "H" "H" "M" "H" "M" "M" "M" "H"
##  [69] "M" "M" "M" "M" "H" "M" "H" "M" "M" "H" "M" "M" "H" "H" "M" "M" "M"
##  [86] "M" "H" "H" "H" "M" "M" "H" "M" "M" "H" "M" "H" "H" "M" "H" "M" "M"
## [103] "H" "M" "M" "M" "H" "M" "H" "H" "H" "M" "H" "H" "H" "M" "H" "M" "H"
## [120] "M" "M" "M" "M" "M" "M" "H" "M" "H" "M" "M" "M" "M" "H"
sim_basket
##   [1] "M" "H" "M" "M" "H" "M" "M" "H" "M" "H" "H" "M" "M" "M" "M" "M" "H"
##  [18] "M" "M" "M" "H" "M" "H" "H" "H" "H" "H" "M" "M" "H" "H" "H" "H" "M"
##  [35] "M" "M" "M" "M" "M" "M" "M" "H" "H" "M" "H" "M" "H" "H" "H" "H" "H"
##  [52] "M" "M" "H" "M" "H" "H" "H" "H" "H" "H" "H" "M" "M" "H" "M" "M" "M"
##  [69] "M" "M" "H" "M" "M" "H" "M" "H" "M" "M" "H" "M" "H" "H" "H" "M" "M"
##  [86] "M" "H" "M" "M" "H" "H" "H" "H" "M" "M" "M" "H" "M" "M" "M" "M" "H"
## [103] "H" "M" "M" "H" "M" "M" "H" "H" "M" "H" "H" "M" "H" "M" "H" "H" "H"
## [120] "M" "M" "H" "H" "H" "M" "H" "H" "H" "M" "H" "M" "H" "H"

Barplot for Simulated Data for Kobe’s shots

sim_streak <- calc_streak(sim_basket)
barplot(table(sim_streak))

Barplot for Kobe’s shots

kobe_streak <- calc_streak(kobe$basket)
barplot(table(kobe_streak))

The distribution of streak lengths in the simulation is unimodal and skews right, similar to the distribution for Kobe’s actual shots.

Median for Simulation of Kobe’s shots

median(sim_streak)
## [1] 0

Across repeated runs of the simulation, the median streak is sometimes 0 and sometimes 1. For an independent shooter who makes 45% of shots, a typical streak is therefore either a miss right after a miss (length 0) or a single hit followed by a miss (length 1). In the runs I observed, the longest simulated streak was 5 or 6 hits in a row, which is longer than Kobe’s actual longest streak of 4.

If I ran the simulation again, I would expect the results to be somewhat similar but not exactly the same. There will be some variability because each simulated shot is a random draw with a 45% chance of a hit and a 55% chance of a miss. However, the larger the sample size, the more similar I would expect the results of different runs to be.
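
The claim about sample size can be checked with a small experiment. The sketch below is not part of the original lab; the helper `hit_share`, the seed, the 1000 repetitions, and the larger sample size of 1330 are all assumptions chosen for illustration. It repeats the simulation many times at two sample sizes and compares how much the share of hits varies from run to run.

```r
set.seed(606)  # arbitrary seed, used only so this sketch is repeatable

# Hypothetical helper: share of hits in one simulated run of n independent shots
hit_share <- function(n, p_hit = 0.45) {
  shots <- sample(c("H", "M"), size = n, replace = TRUE,
                  prob = c(p_hit, 1 - p_hit))
  mean(shots == "H")
}

# Repeat the simulation 1000 times at each sample size
share_133  <- replicate(1000, hit_share(133))
share_1330 <- replicate(1000, hit_share(1330))

sd(share_133)    # run-to-run spread with 133 shots
sd(share_1330)   # run-to-run spread with ten times as many shots
```

The spread at the larger sample size should come out noticeably smaller, which is the sense in which larger samples give more similar results.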

I don’t see evidence that Kobe has a hot hand. The simulation, in which every shot is independent of the one before it, produced more streaks of 4 than Kobe actually made and also a streak of 6, which Kobe never reached. The simulated median streak was 0 or 1 depending on the run, so it was never lower than Kobe’s median of 0. Kobe’s results look like those of an independent shooter, and his streaks are, if anything, shorter. The idea that Kobe has a hot hand is not supported.
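
One way to make this comparison less dependent on a single run is to simulate many games of 133 independent shots and look at the longest streak in each. The sketch below goes beyond the original lab: `streak_lengths` is a hypothetical stand-in for `calc_streak()`, and the seed and the 1000 repetitions are arbitrary choices.

```r
set.seed(606)  # arbitrary seed, used only so this sketch is repeatable

# Hypothetical stand-in for calc_streak(): hits between consecutive misses,
# with an assumed miss before the first shot and after the last one.
streak_lengths <- function(shots) {
  padded <- c(0L, as.integer(shots == "H"), 0L)
  diff(which(padded == 0L)) - 1L
}

# Longest streak in each of 1000 simulated games of 133 independent shots
longest_streak <- replicate(1000, {
  sim <- sample(c("H", "M"), size = 133, replace = TRUE, prob = c(0.45, 0.55))
  max(streak_lengths(sim))
})

table(longest_streak)       # how often each longest-streak value occurs
mean(longest_streak >= 4)   # share of simulated games matching or beating Kobe's 4
```

If most simulated games contain a streak of 4 or longer, a longest streak of 4 in Kobe's data is unremarkable for an independent shooter.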