library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.3 v purrr 0.3.4
## v tibble 3.0.6 v dplyr 1.0.3
## v tidyr 1.1.2 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(openintro)
## Loading required package: airports
## Loading required package: cherryblossom
## Loading required package: usdata
glimpse(kobe_basket)
## Rows: 133
## Columns: 6
## $ vs <fct> ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ...
## $ game <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
## $ quarter <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3...
## $ time <fct> 9:47, 9:07, 8:11, 7:41, 7:03, 6:01, 4:07, 0:52, 0:00, 6...
## $ description <fct> Kobe Bryant makes 4-foot two point shot, Kobe Bryant mi...
## $ shot <chr> "H", "M", "M", "H", "H", "M", "M", "M", "M", "H", "H", ...
Answer: A streak length of 1 means Kobe made exactly one basket and then missed the next shot (H, M). A streak length of 0 means he missed with no made basket immediately before it, for example a miss that follows another miss (M, M).
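To make the streak definition concrete, here is a minimal base R sketch that counts the hits preceding each miss. The helper `streak_lengths()` is a hypothetical name introduced for illustration; it is not the `calc_streak()` function from openintro and may treat the end of the sequence differently. The example vector is the first nine shots shown in the `glimpse()` output above.

```r
# Hypothetical helper (not openintro's calc_streak): for every miss,
# count how many hits came since the previous miss.
streak_lengths <- function(shots) {
  is_miss <- shots == "M"
  # Running total of hits, read off at each miss
  hits_so_far <- cumsum(shots == "H")[is_miss]
  # Differences between consecutive misses give the streak lengths
  diff(c(0, hits_so_far))
}

# First nine shots from the glimpse() output
shots <- c("H", "M", "M", "H", "H", "M", "M", "M", "M")
streak_lengths(shots)
# 1 0 2 0 0 0
```

Hits at the very end of a sequence that are not followed by a miss are ignored by this sketch.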
kobe_streak <- calc_streak(kobe_basket$shot)
ggplot(data = kobe_streak, aes(x = length)) +
  geom_bar(fill = "blue")
### Exercise 2
summary(kobe_streak)
## length
## Min. :0.0000
## 1st Qu.:0.0000
## Median :0.0000
## Mean :0.7632
## 3rd Qu.:1.0000
## Max. :4.0000
Answer: Kobe's typical streak length was 0 baskets (the median), and his longest streak was 4 baskets in a row before a miss. The distribution is right-skewed, with a range of 4, a median of 0, and a mean of about 0.76.
coin_outcomes <- c("heads", "tails")
sample(coin_outcomes, size = 1, replace = TRUE)
## [1] "heads"
sim_fair_coin <- sample(coin_outcomes, size = 100, replace = TRUE)
sim_fair_coin
## [1] "tails" "heads" "tails" "heads" "tails" "heads" "heads" "heads" "tails"
## [10] "tails" "tails" "tails" "heads" "tails" "heads" "heads" "tails" "tails"
## [19] "tails" "heads" "tails" "heads" "heads" "tails" "tails" "heads" "heads"
## [28] "heads" "tails" "heads" "heads" "tails" "tails" "heads" "heads" "tails"
## [37] "tails" "heads" "tails" "heads" "heads" "heads" "heads" "heads" "heads"
## [46] "heads" "tails" "heads" "heads" "heads" "tails" "heads" "tails" "tails"
## [55] "heads" "heads" "heads" "tails" "heads" "tails" "tails" "tails" "heads"
## [64] "tails" "tails" "tails" "tails" "tails" "heads" "tails" "heads" "tails"
## [73] "heads" "heads" "heads" "heads" "heads" "tails" "tails" "heads" "tails"
## [82] "tails" "heads" "tails" "heads" "tails" "heads" "tails" "heads" "tails"
## [91] "tails" "tails" "tails" "heads" "heads" "tails" "heads" "heads" "heads"
## [100] "heads"
table(sim_fair_coin)
## sim_fair_coin
## heads tails
## 53 47
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE, prob = c(0.2, 0.8))
set.seed(35797)
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE, prob = c(0.2, 0.8))
sim_unfair_coin
## [1] "tails" "tails" "tails" "tails" "tails" "heads" "tails" "tails" "tails"
## [10] "tails" "tails" "heads" "heads" "tails" "heads" "heads" "heads" "tails"
## [19] "tails" "tails" "tails" "tails" "heads" "tails" "tails" "tails" "tails"
## [28] "tails" "tails" "tails" "tails" "tails" "tails" "tails" "tails" "heads"
## [37] "tails" "tails" "tails" "tails" "tails" "tails" "tails" "tails" "heads"
## [46] "heads" "tails" "tails" "tails" "heads" "heads" "tails" "tails" "heads"
## [55] "heads" "tails" "tails" "tails" "tails" "tails" "tails" "heads" "tails"
## [64] "heads" "tails" "tails" "heads" "tails" "tails" "tails" "tails" "heads"
## [73] "heads" "tails" "tails" "tails" "tails" "tails" "heads" "heads" "tails"
## [82] "heads" "tails" "tails" "heads" "tails" "tails" "tails" "tails" "tails"
## [91] "tails" "heads" "heads" "tails" "tails" "tails" "heads" "tails" "tails"
## [100] "tails"
table(sim_unfair_coin)
## sim_unfair_coin
## heads tails
## 26 74
Answer: The unfair coin came up heads 26 times in 100 flips.
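As a rough check on whether 26 heads is plausible for a coin with a 20% chance of heads, the count can be compared with the binomial mean and standard deviation. This is a sketch that assumes the 100 flips are independent, which is how `sample()` with `replace = TRUE` draws them.

```r
n <- 100        # number of flips
p <- 0.2        # probability of heads used in prob = c(0.2, 0.8)
observed <- 26  # heads counted in the table above

expected_heads <- n * p               # 20
sd_heads <- sqrt(n * p * (1 - p))     # 4
z <- (observed - expected_heads) / sd_heads
z
# 1.5
```

A result 1.5 standard deviations above the expected count is not unusual for a single run of 100 flips.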
shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 1, replace = TRUE)
set.seed(35797)
coin_outcomes2 <- c("H", "M")
sim_basket2 <- sample(coin_outcomes2, size = 133, replace = TRUE, prob = c(0.45, 0.55))
table(sim_basket2)
## sim_basket2
## H M
## 63 70
Answer: This simulation, which reflects a shooting percentage of 45%, gave 63 hits in 133 shots (about 47%).
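The simulated shooting percentage can be read straight off the counts in the table above. The sketch below re-enters those counts by hand; the names `sim_counts` and `sim_props` are introduced here for illustration only.

```r
# Counts copied from the table(sim_basket2) output
sim_counts <- c(H = 63, M = 70)

# Share of hits and misses among the 133 simulated shots
sim_props <- sim_counts / sum(sim_counts)
round(sim_props, 3)
```

On the raw vector, `mean(sim_basket2 == "H")` would give the same hit proportion.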
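# kobe_streak has only a `length` column, so table(kobe_streak$basket)
# returns an empty table; tabulate the raw shots instead.
kobetable <- table(kobe_basket$shot)
kobetable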
simbasket2Table <- table(sim_basket2)
simbasket2Table
## sim_basket2
## H M
## 63 70
Answer: Kobe's results seem to be similar to those of the simulated shooter, who does not have a hot hand.
sim_streak <- calc_streak(sim_basket2)
ggplot(data = sim_streak, aes(x = length)) +
  geom_bar(fill = "red")
table(sim_streak)
## sim_streak
## 0 1 2 3 4 5 6
## 40 14 9 4 2 1 1
Answer: The simulated shooter's typical streak length is 0, and his longest streak was 6 baskets, which occurred only once.
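The typical and longest streak can be confirmed from the frequency table above without rerunning the simulation. This sketch rebuilds the streak lengths from the printed counts; `streak_counts` and `sim_lengths` are names introduced here for illustration.

```r
# Frequencies copied from table(sim_streak): streak lengths 0 through 6
streak_counts <- c(40, 14, 9, 4, 2, 1, 1)
sim_lengths <- rep(0:6, times = streak_counts)

median(sim_lengths)  # typical streak length: 0
max(sim_lengths)     # longest streak: 6
sum(sim_lengths)     # total hits across all streaks: 63
mean(sim_lengths)    # average streak length, 63 / 71
```

The total of 63 hits agrees with the `table(sim_basket2)` output.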
Answer: I would expect the distribution to be somewhat similar rather than exactly the same. Because the shooting percentage stays below 50 percent, misses will still outnumber hits, and the longer the streak, the less often it occurs. What can differ is mainly how extreme the outliers are, since this is a random simulation. Another run might, for example, yield a longest streak of only 4, or one as long as 15.
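One way to see how much the longest streak moves from run to run is to repeat the simulation under several seeds. This is a sketch in base R: `longest_streak()` is a hypothetical helper built on `rle()`, and the seeds 1 to 5 are arbitrary choices for illustration, not values from the lab.

```r
# Hypothetical helper: longest run of consecutive hits in a shot vector
longest_streak <- function(shots) {
  runs <- rle(shots)
  hit_runs <- runs$lengths[runs$values == "H"]
  if (length(hit_runs) == 0) 0L else max(hit_runs)
}

# Arbitrary seeds, one independent shooter of 133 shots per seed
seeds <- 1:5
longest <- sapply(seeds, function(s) {
  set.seed(s)
  shots <- sample(c("H", "M"), size = 133, replace = TRUE,
                  prob = c(0.45, 0.55))
  longest_streak(shots)
})
longest
```

The five values will generally differ, which is the run-to-run variation in the tail that the answer describes.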
Answer: Given the similarities between the two distributions in shape, mean, and skew, the independent shooter model (no hot hand) matches Kobe's shooting patterns reasonably well. Given how well that model fits, it seems difficult to show that Kobe's shots depend on one another. The only notable difference between the two distributions is the presence of outliers in the simulation (streaks of 5 and 6). If we removed the streak of 6, the mean and standard deviation of the simulated shooter would be closer to Kobe's. If we simulated a much larger number of shots (e.g. 10,000), the proportion of hits would, by the law of large numbers, very likely be closer to 45 percent.
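The point about larger simulations can be illustrated by drawing increasingly long sequences of shots and tracking the hit proportion. This is a sketch, not part of the lab: the sample sizes are arbitrary, and the seed is simply reused from the earlier chunks.

```r
set.seed(35797)  # seed reused from the simulations above for convenience

# Arbitrary sample sizes, from the lab's 133 shots up to 100,000
sizes <- c(133, 1000, 10000, 100000)

# Proportion of hits for an independent 45% shooter at each size
hit_rate <- sapply(sizes, function(n) {
  shots <- sample(c("H", "M"), size = n, replace = TRUE,
                  prob = c(0.45, 0.55))
  mean(shots == "H")
})

data.frame(shots = sizes, hit_rate = hit_rate)
```

With more shots, the hit rate tends to settle near 0.45, although any single run can still deviate slightly.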