library(tidyverse)
library(openintro)
glimpse(kobe_basket) # glimpse() gives a compact overview of the data set
## Rows: 133
## Columns: 6
## $ vs <fct> ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL~
## $ game <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1~
## $ quarter <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3~
## $ time <fct> 9:47, 9:07, 8:11, 7:41, 7:03, 6:01, 4:07, 0:52, 0:00, 6:35~
## $ description <fct> Kobe Bryant makes 4-foot two point shot, Kobe Bryant misse~
## $ shot <chr> "H", "M", "M", "H", "H", "M", "M", "M", "M", "H", "H", "H"~
## Gives a general idea of what the data look like: in the shot column, H is a hit (the shot went in) and M is a miss
A streak length of 1 means that Kobe made one shot and then missed the next attempt. A streak length of 0 means that he missed a shot immediately after another miss, so no baskets were made in between.
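The counting rule described above can be sketched in base R. This is only an illustration: `count_streaks()` is a hypothetical helper, not the openintro function, and it records a streak only once a miss closes it, so `calc_streak()` may treat the end of the sequence differently. The input is the first nine shots shown in the glimpse output.

```r
# Hypothetical helper (not part of openintro): each miss closes a streak,
# and the streak length is the number of hits since the previous miss.
count_streaks <- function(shots) {
  # Put a "miss" in front so the first streak has a starting point.
  padded <- c(0L, as.integer(shots == "H"))
  miss_positions <- which(padded == 0L)
  # Hits between two consecutive misses = gap between them minus 1.
  diff(miss_positions) - 1L
}

# First nine shots from the glimpse output above.
first_nine <- c("H", "M", "M", "H", "H", "M", "M", "M", "M")
count_streaks(first_nine)
# 1 0 2 0 0 0
```

The six misses give six streaks: one of length 1, one of length 2, and four of length 0.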
The code below calculates the streak lengths from the shot column of kobe_basket.
kobe_streak <- calc_streak(kobe_basket$shot)
ggplot(data = kobe_streak, aes(x = length)) +
  geom_bar() +
  scale_y_continuous(breaks = seq(0, 40, by = 5))
## Bar chart showing the distribution of streak lengths
## aes(x = length) maps the streak length to the x axis
## geom_bar() draws one bar per streak length, with height equal to its count
The distribution is skewed right, with 0 as the most common streak length. His typical streak length was 0, and his longest streak was 4 baskets in a row.
Simulating coin flips in R
coin_outcomes <- c("heads", "tails")
sample(coin_outcomes, size = 1, replace = TRUE)
## [1] "tails"
## Simulates a single coin flip in R
sim_fair_coin <- sample(coin_outcomes, size = 100, replace = TRUE)
## This chunk of code simulates flipping a fair coin 100 times and stores the results in sim_fair_coin
sim_fair_coin
## [1] "heads" "tails" "tails" "heads" "heads" "tails" "tails" "heads" "heads"
## [10] "tails" "tails" "heads" "heads" "tails" "heads" "tails" "tails" "tails"
## [19] "heads" "heads" "heads" "tails" "tails" "heads" "heads" "tails" "heads"
## [28] "tails" "heads" "heads" "heads" "heads" "tails" "tails" "tails" "tails"
## [37] "heads" "tails" "tails" "heads" "heads" "tails" "heads" "tails" "heads"
## [46] "heads" "tails" "heads" "tails" "heads" "heads" "tails" "tails" "tails"
## [55] "heads" "tails" "heads" "tails" "tails" "tails" "heads" "tails" "tails"
## [64] "tails" "heads" "heads" "heads" "heads" "tails" "heads" "heads" "tails"
## [73] "heads" "tails" "tails" "tails" "heads" "tails" "heads" "heads" "tails"
## [82] "heads" "heads" "tails" "tails" "tails" "tails" "heads" "tails" "heads"
## [91] "tails" "tails" "heads" "tails" "heads" "heads" "heads" "heads" "tails"
## [100] "tails"
table(sim_fair_coin)
## sim_fair_coin
## heads tails
## 49 51
## table() counts how many of the 100 flips came up heads and how many came up tails
set.seed(1738)
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE,
                          prob = c(0.2, 0.8))
table(sim_unfair_coin)
## sim_unfair_coin
## heads tails
## 23 77
## This coin is weighted: tails comes up with probability 0.8 and heads with probability 0.2
Only 23 of the 100 unfair coin flips came up heads.
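To compare a count like this with what the weights predict, the table can be converted to proportions and set next to the expected number of heads. The sketch below reruns the same simulation. The name `expected_heads` is introduced here for illustration, and the exact counts can depend on the R version, so no output is shown.

```r
coin_outcomes <- c("heads", "tails")

set.seed(1738)  # same seed as the chunk above
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE,
                          prob = c(0.2, 0.8))

# Observed share of heads and tails in the 100 flips.
observed_props <- prop.table(table(sim_unfair_coin))
observed_props

# Expected number of heads if P(heads) = 0.2: number of flips times probability.
expected_heads <- 100 * 0.2
expected_heads
```

With only 100 flips, the observed share of heads will usually land near 0.2 without matching it exactly.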
shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 1, replace = TRUE)
## Defines hit and miss as the two possible outcomes, like heads and tails, and simulates a single shot
To reflect a 45% shooting percentage, prob = c(0.45, 0.55) needs to be added to the call, and size needs to be 133 to match the number of shots in the Kobe data.
sim_basket <- sample(shot_outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
sim_streak <- calc_streak(sim_basket)
The typical streak length is still 0 for the independent shooter, and the longest streak of baskets in this run is 6.
ggplot(data = sim_streak, aes(x = length)) + geom_bar()
If the simulation were run again, the streak distribution would differ slightly, but its general shape would stay basically the same. The probabilities do not change between runs, so the distribution should not change much either. For example, the longest streak in the next run might be 4 instead of 6, but the plot would still be right skewed with its peak at 0.
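One way to see how much the longest streak moves from run to run is to repeat the simulation many times. The sketch below uses base R only. `longest_hit_run()` and `simulate_longest()` are hypothetical helpers written for this illustration, and the choice of 1000 repetitions is arbitrary.

```r
# Hypothetical helper: length of the longest run of consecutive hits.
longest_hit_run <- function(shots) {
  runs <- rle(shots)                        # run-length encoding of H/M
  hit_runs <- runs$lengths[runs$values == "H"]
  if (length(hit_runs) == 0) 0L else max(hit_runs)
}

# Simulate one independent shooter (133 shots, 45% hit rate) and
# return that shooter's longest streak.
simulate_longest <- function() {
  shots <- sample(c("H", "M"), size = 133, replace = TRUE,
                  prob = c(0.45, 0.55))
  longest_hit_run(shots)
}

set.seed(1738)  # only for reproducibility
longest <- replicate(1000, simulate_longest())

# How often each longest-streak value occurred across the 1000 runs.
table(longest)
```

The table shows the spread of longest streaks an independent shooter produces, which is the variation the paragraph above describes.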
The streak length distribution of the simulated shooter covers a wider range on the x axis and has a longer maximum streak. The general shapes of the two graphs are very similar, with the peak at a streak length of 0. Kobe’s shooting pattern does not fit the hot hand model. That model says the shots are dependent on each other: making one shot should raise the chance of making the next one. Kobe’s streaks look much like those of the independent shooter, so the data do not show that.
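The idea of dependence between shots can also be checked directly by comparing the overall hit rate with the hit rate right after a made shot. The sketch below does this for a simulated independent shooter, where the two rates should be close. `hit_rate_after_hit()` is a hypothetical helper, and the seed and the 10000 simulated shots are arbitrary choices for the illustration.

```r
# Hypothetical helper: share of shots made immediately after a made shot.
hit_rate_after_hit <- function(shots) {
  previous <- head(shots, -1)  # shots 1 to n-1
  current  <- tail(shots, -1)  # shots 2 to n
  mean(current[previous == "H"] == "H")
}

set.seed(2024)  # arbitrary seed, only for reproducibility
independent_shots <- sample(c("H", "M"), size = 10000, replace = TRUE,
                            prob = c(0.45, 0.55))

mean(independent_shots == "H")         # overall hit rate
hit_rate_after_hit(independent_shots)  # hit rate right after a hit
```

The same helper could be applied to kobe_basket$shot, with the caveat that this treats all 133 shots as one continuous sequence across games. Under a hot hand, the rate after a hit would be clearly higher than the overall rate.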