## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.0.6 ✓ dplyr 1.0.4
## ✓ tidyr 1.1.2 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## Loading required package: airports
## Loading required package: cherryblossom
## Loading required package: usdata
glimpse(kobe_basket)
## Rows: 133
## Columns: 6
## $ vs <fct> ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, OR…
## $ game <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ quarter <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, …
## $ time <fct> 9:47, 9:07, 8:11, 7:41, 7:03, 6:01, 4:07, 0:52, 0:00, 6:3…
## $ description <fct> Kobe Bryant makes 4-foot two point shot, Kobe Bryant miss…
## $ shot <chr> "H", "M", "M", "H", "H", "M", "M", "M", "M", "H", "H", "H…
Exercise 1: What does a streak length of 1 mean, i.e. how many hits and misses are in a streak of 1? What about a streak length of 0?
kobe_streak <- calc_streak(kobe_basket$shot)
Now display the distribution of the streak lengths:
ggplot(data = kobe_streak, aes(x = length)) +
  geom_bar() +
  ggtitle("kobe_streak")

The length of a shooting streak is the number of consecutive baskets made until a miss occurs. A streak length of 1 means one basket was made and a miss then ended the streak (one hit, one miss), written HM. A streak length of 0 means a single attempt was made and it missed (zero hits, one miss), written M.
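The streak definition above can be made concrete with a small sketch. The helper `streak_lengths()` below is a hypothetical stand-in written for illustration, not the openintro source of `calc_streak()`. It assumes shots are coded "H" and "M", and it assumes a miss bounds both ends of the sequence, so a sequence that ends in a miss contributes one trailing streak of length 0.

```r
# Hypothetical helper for illustration; not the openintro calc_streak() source.
streak_lengths <- function(shots) {
  hit <- as.integer(shots == "H")   # 1 for a hit, 0 for a miss
  padded <- c(0L, hit, 0L)          # assumed convention: a miss bounds both ends
  miss_pos <- which(padded == 0L)   # positions of every real or padded miss
  diff(miss_pos) - 1L               # number of hits between consecutive misses
}

shots <- c("H", "M", "M", "H", "H", "M")
streak_lengths(shots)
```

For this short sequence the streaks are HM (length 1), M (length 0), and HHM (length 2), followed by a final 0 that comes only from the padding assumption.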
Exercise 2
The bar chart above shows approximately 40 streaks with no baskets made (M), nearly 25 streaks with one basket (HM), approximately 6 streaks with 2 baskets (HHM), approximately 6 streaks with 3 baskets (HHHM), and about 2 streaks with 4 baskets (HHHHM). The distribution is skewed to the right: the most common streak length is 0, with nearly 40 such streaks, and the longest streak, 4 baskets, occurred only about twice.
### Simulate the independent shooter (control group) in R.
### Run the coin example simulation once.
coin_outcomes <- c("heads", "tails")
sample(coin_outcomes, size = 1, replace = TRUE)
## [1] "tails"
Now run the coin example 100 times.
coin_outcomes <- c("heads", "tails")
sample(coin_outcomes, size = 100, replace = TRUE)
## [1] "heads" "heads" "tails" "heads" "tails" "heads" "heads" "tails" "heads"
## [10] "tails" "tails" "tails" "heads" "heads" "heads" "heads" "heads" "tails"
## [19] "tails" "tails" "tails" "heads" "heads" "heads" "tails" "heads" "tails"
## [28] "tails" "heads" "heads" "heads" "heads" "tails" "heads" "heads" "tails"
## [37] "heads" "heads" "heads" "heads" "tails" "tails" "tails" "tails" "heads"
## [46] "heads" "heads" "tails" "tails" "tails" "tails" "heads" "tails" "tails"
## [55] "heads" "tails" "heads" "heads" "tails" "tails" "heads" "heads" "heads"
## [64] "heads" "heads" "heads" "heads" "heads" "heads" "heads" "tails" "tails"
## [73] "heads" "heads" "heads" "tails" "tails" "heads" "heads" "tails" "heads"
## [82] "heads" "tails" "heads" "tails" "heads" "tails" "heads" "heads" "heads"
## [91] "tails" "tails" "tails" "tails" "tails" "heads" "tails" "tails" "heads"
## [100] "tails"
### Now simulate flipping a fair coin 100 times.
sim_fair_coin <- sample(coin_outcomes, size = 100, replace = TRUE)
### Now print the fair coin simulation and display it as a table.
## [1] "heads" "tails" "heads" "heads" "heads" "tails" "tails" "heads" "heads"
## [10] "tails" "heads" "tails" "heads" "heads" "heads" "tails" "tails" "tails"
## [19] "tails" "tails" "heads" "heads" "heads" "heads" "heads" "tails" "tails"
## [28] "heads" "tails" "heads" "heads" "heads" "tails" "heads" "tails" "tails"
## [37] "heads" "heads" "heads" "tails" "tails" "heads" "heads" "tails" "heads"
## [46] "tails" "tails" "heads" "heads" "heads" "tails" "heads" "heads" "tails"
## [55] "heads" "tails" "heads" "heads" "tails" "tails" "heads" "heads" "heads"
## [64] "heads" "heads" "heads" "heads" "tails" "heads" "tails" "tails" "tails"
## [73] "heads" "tails" "heads" "heads" "heads" "heads" "heads" "tails" "tails"
## [82] "heads" "heads" "tails" "tails" "heads" "tails" "heads" "heads" "tails"
## [91] "heads" "heads" "tails" "heads" "tails" "heads" "tails" "heads" "tails"
## [100] "tails"
## sim_fair_coin
## heads tails
## 58 42
### Simulate an unfair coin that lands heads only 20% of the time.
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE, prob = c(0.2, 0.8))
## [1] "tails" "tails" "tails" "tails" "tails" "tails" "heads" "tails" "tails"
## [10] "tails" "tails" "tails" "tails" "heads" "tails" "tails" "tails" "tails"
## [19] "tails" "heads" "tails" "tails" "tails" "heads" "heads" "tails" "tails"
## [28] "heads" "tails" "tails" "tails" "tails" "heads" "tails" "tails" "tails"
## [37] "tails" "tails" "tails" "tails" "tails" "tails" "tails" "heads" "tails"
## [46] "heads" "tails" "tails" "tails" "heads" "tails" "tails" "tails" "tails"
## [55] "tails" "tails" "tails" "heads" "tails" "tails" "tails" "tails" "tails"
## [64] "tails" "tails" "tails" "tails" "tails" "tails" "tails" "tails" "tails"
## [73] "tails" "tails" "tails" "tails" "heads" "heads" "tails" "tails" "tails"
## [82] "tails" "tails" "tails" "tails" "heads" "tails" "tails" "tails" "tails"
## [91] "tails" "tails" "heads" "heads" "tails" "tails" "heads" "heads" "tails"
## [100] "heads"
## sim_unfair_coin
## heads tails
## 19 81
Exercise 3: In your simulation of flipping the unfair coin 100 times, how many flips came up heads? Include the code for sampling the unfair coin in your response. Since the markdown file will run the code, and generate a new sample each time you Knit it, you should also “set a seed” before you sample. Read more about setting a seed below.
The unfair coin simulation returned 19 heads and 81 tails. The flips were printed with sim_unfair_coin and counted with table(sim_unfair_coin).
Now simulate the independent shooter.
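The exercise asks for a seed to be set before sampling. A minimal sketch of how that might look follows; the seed value 1234 is an arbitrary assumption, and with a different seed (or none) the counts will generally differ from the 19 and 81 reported above.

```r
# The seed value is arbitrary; any fixed integer makes the sample reproducible.
set.seed(1234)
coin_outcomes <- c("heads", "tails")
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE,
                          prob = c(0.2, 0.8))
table(sim_unfair_coin)   # counts of heads and tails for this seed
```

Re-running the block with the same seed reproduces the same 100 flips, which keeps the written answer in step with the knitted output.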
shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 1, replace = TRUE)
## shot_outcomes
## H M
## 1 1
Exercise 4: What change needs to be made to the sample function so that it reflects a shooting percentage of 45%? Make this adjustment, then run a simulation to sample 133 shots. Assign the output of this simulation to a new object called sim_basket.
To reflect a shooting percentage of 45%, we add the prob argument, prob = c(0.45, 0.55), to sample and change size to 133.
shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
## sim_basket
## H M
## 55 78
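As a quick check on the prob argument, the share of hits in a simulated sample can be compared with the 45% target. This is only a sketch: the seed is an arbitrary assumption, and `observed_pct` and `expected_hits` are names introduced here for illustration.

```r
set.seed(2021)  # arbitrary seed, assumed only for reproducibility
shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 133, replace = TRUE,
                     prob = c(0.45, 0.55))

observed_pct <- mean(sim_basket == "H")   # share of hits in this one sample
expected_hits <- 133 * 0.45               # long-run average number of hits

observed_pct
expected_hits
```

Any single sample will wander around the target; the run tabulated above, for instance, produced 55 hits in 133 shots, or about 41%.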
### Now compare Kobe's data to the independent shooter (control group).
### Exercise 5: Using calc_streak, compute the streak lengths of sim_basket, and save the results in a data frame called sim_streak.
independent_streak <- calc_streak(sim_basket)
# calc_streak() already returns a data frame of streak lengths, so reuse it
sim_streak <- independent_streak
Generate a bar chart of the independent shooter's streak lengths:
ggplot(data = independent_streak, aes(x = length)) +
  geom_bar() +
  ggtitle("independent_streak")

### Exercise 6: Describe the distribution of streak lengths. What is the typical streak length for this simulated independent shooter with a 45% shooting percentage? How long is the player’s longest streak of baskets in 133 shots? Make sure to include a plot in your answer.
As defined above, the length of a shooting streak is the number of consecutive baskets made until a miss occurs. A streak of length 1 is one basket followed by a miss (HM); a streak of length 0 is a single missed attempt (M).
The independent_streak bar chart shows approximately 46 streaks with no baskets made (M), nearly 19 streaks with one basket (HM), approximately 8 streaks with 2 baskets (HHM), approximately 4 streaks with 3 baskets (HHHM), and about 3 streaks with 4 baskets (HHHHM). The typical (most common) streak length for the independent shooter is 0, with about 46 such streaks. The longest streak is 4 baskets, which occurred only about 3 times.
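For an independent shooter, a streak of length k needs k hits followed by one miss, so its probability is 0.45^k × 0.55, which is a geometric distribution. The sketch below tabulates those probabilities with base R's `dgeom()`. The total of 80 streaks is an assumption taken from the rough counts read off the chart, and the calculation ignores the fact that the sequence stops after 133 shots.

```r
p_hit <- 0.45
k <- 0:4

# k hits followed by one miss: p_hit^k * (1 - p_hit), a geometric distribution
p_streak <- dgeom(k, prob = 1 - p_hit)

n_streaks <- 80  # assumption: rough total of the counts read off the chart

expected <- data.frame(
  length = k,
  probability = round(p_streak, 3),
  expected_count = round(n_streaks * p_streak, 1)
)
expected
```

Under these assumptions the expected counts are about 44, 20, 9, 4, and 2 for lengths 0 through 4, in line with the 46, 19, 8, 4, and 3 read off the chart.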
### Exercise 7: If you were to run the simulation of the independent shooter a second time, how would you expect its streak distribution to compare to the distribution from the question above? Exactly the same? Somewhat similar? Totally different? Explain your reasoning.
shot_outcomes_2 <- c("H", "M")
sim_basket <- sample(shot_outcomes_2, size = 133, replace = TRUE, prob = c(0.45, 0.55))
## sim_basket
## H M
## 63 70
independent_streak_2 <- calc_streak(sim_basket)
ggplot(data = independent_streak_2, aes(x = length)) +
  geom_bar() +
  ggtitle("independent_streak_2")
Comparing the bar charts for independent_streak and independent_streak_2, we see similar but not identical distributions. Because the parameters (H 0.45 and M 0.55) were not changed from the independent_streak simulation, this is the expected outcome; only the random draws differ from run to run.
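One way to see how much the streak distribution moves between runs is to repeat the simulation many times. The sketch below does this for the longest streak in each run. `streak_lengths()` is a hypothetical helper standing in for `calc_streak()`, and the seed and the 1000 repetitions are arbitrary assumptions.

```r
# Hypothetical helper (not openintro's calc_streak()): hits between misses,
# assuming a miss bounds both ends of the shot sequence.
streak_lengths <- function(shots) {
  padded <- c(0L, as.integer(shots == "H"), 0L)
  diff(which(padded == 0L)) - 1L
}

set.seed(42)  # arbitrary seed
longest <- replicate(1000, {
  shots <- sample(c("H", "M"), size = 133, replace = TRUE,
                  prob = c(0.45, 0.55))
  max(streak_lengths(shots))   # longest run of hits in this simulated game
})

table(longest)   # how often each longest-streak length occurs
```

If the table concentrates on a narrow range of values, that supports the "somewhat similar" answer: individual runs differ, but they vary around the same underlying distribution.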
### Exercise 8: How does Kobe Bryant’s distribution of streak lengths compare to the distribution of streak lengths for the simulated shooter? Using this comparison, do you have evidence that the hot hand model fits Kobe’s shooting patterns? Explain.
Comparing the Kobe data with the simulated independent shooter data, there do not appear to be major differences. The bar charts for Kobe and for the independent shooter are both skewed to the right, and in both data sets the most common streak length is 0 (no baskets made). This analysis therefore does not convincingly show that Kobe’s performance on one shot depends on the previous shot. If making a basket depended on the prior shot, we should see more long streaks of made baskets in Kobe's data. Instead, the data show the opposite: only a few streaks contained a string of successful baskets.
I will note that I posted information to Discord about subsequent research into the behavior of random sequences. Professors Joshua Miller and Adam Sanjurjo identified errors in the way the data were sampled that introduced bias into the original analysis by Gilovich, Vallone, and Tversky mentioned in the Probability Lab text. I suspect that the lab, as we follow it, introduces the same bias into our analysis.
