Lab 4 - Probability

Lab report

Load data:

library(openintro)

## Loading required package: airports

## Loading required package: cherryblossom

## Loading required package: usdata

data(kobe_basket)

Exercises:

Exercise 1:

head(kobe_basket)

## # A tibble: 6 × 6
##   vs     game quarter time  description                                    shot 
##   <fct> <int> <fct>   <fct> <fct>                                          <chr>
## 1 ORL       1 1       9:47  Kobe Bryant makes 4-foot two point shot        H    
## 2 ORL       1 1       9:07  Kobe Bryant misses jumper                      M    
## 3 ORL       1 1       8:11  Kobe Bryant misses 7-foot jumper               M    
## 4 ORL       1 1       7:41  Kobe Bryant makes 16-foot jumper (Derek Fishe… H    
## 5 ORL       1 1       7:03  Kobe Bryant makes driving layup                H    
## 6 ORL       1 1       6:01  Kobe Bryant misses jumper                      M

A shooting streak of length 1 means that Kobe made exactly one shot before missing: one hit (H) followed by the miss (M) that ends the streak. A shooting streak of length 0 means that there was no streak at all: a miss (M) that comes right after another miss (M), with no hits in between.
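To check this definition against the first six shots shown above (H, M, M, H, H, M), here is a minimal sketch that counts the hits in front of each miss. The helper name streak_lengths is my own, and it only counts streaks that are closed by a miss, so it may treat the very end of a sequence differently from calc_streak().

```r
# Hypothetical helper (not part of openintro): for every miss, count the
# hits that came since the previous miss.
streak_lengths <- function(shots) {
  miss_pos <- which(shots == "M")   # positions of the misses
  diff(c(0, miss_pos)) - 1          # gap between misses, minus the miss itself
}

first_six <- c("H", "M", "M", "H", "H", "M")
streak_lengths(first_six)
# 1 0 2
```

The first miss closes a streak of 1, the second miss comes right after a miss and so gives a streak of 0, and the last miss closes a streak of 2.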

Exercise 2:

kobe_streak<-calc_streak(kobe_basket$shot)$length
barplot(table(kobe_streak))

Kobe’s most typical streak length is 0, meaning no streak. In the bar plot, the x-axis shows the streak length and the y-axis shows the number of streaks, and there appear to be well over 30 streaks of length 0. His longest streak was 4 hits in a row, which he seems to have done only once or twice.

Exercise 3:

set.seed(1022002)
coin_outcomes<-c("heads", "tails")
sample(coin_outcomes, size=1, replace = TRUE)

## [1] "heads"

sim_fair_coin<-sample(coin_outcomes, size = 100, replace = TRUE)
sim_fair_coin

##   [1] "heads" "tails" "heads" "heads" "tails" "heads" "tails" "heads" "heads"
##  [10] "tails" "tails" "tails" "tails" "tails" "heads" "tails" "heads" "heads"
##  [19] "tails" "tails" "heads" "heads" "heads" "heads" "tails" "heads" "heads"
##  [28] "tails" "heads" "heads" "heads" "tails" "heads" "heads" "heads" "tails"
##  [37] "heads" "tails" "heads" "tails" "heads" "tails" "heads" "tails" "heads"
##  [46] "heads" "tails" "tails" "heads" "heads" "tails" "tails" "heads" "tails"
##  [55] "heads" "tails" "tails" "heads" "tails" "tails" "tails" "tails" "heads"
##  [64] "heads" "tails" "heads" "tails" "tails" "heads" "heads" "heads" "tails"
##  [73] "tails" "heads" "heads" "heads" "tails" "heads" "heads" "heads" "heads"
##  [82] "heads" "tails" "tails" "heads" "heads" "tails" "tails" "tails" "heads"
##  [91] "tails" "tails" "tails" "tails" "heads" "tails" "tails" "heads" "tails"
## [100] "heads"

table(sim_fair_coin)

## sim_fair_coin
## heads tails 
##    52    48

sim_unfair_coin<-sample(coin_outcomes, size = 100, replace = TRUE, prob = c(0.2, 0.8))
table(sim_unfair_coin)

## sim_unfair_coin
## heads tails 
##    19    81

In this simulation, 19 of the 100 flips came up “heads”. To generate the data I used sim_unfair_coin<-sample(coin_outcomes, size = 100, replace = TRUE, prob = c(0.2, 0.8)). To view the counts I used table(sim_unfair_coin), which produced the totals for “heads” and “tails” shown above under sim_unfair_coin.
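As a sanity check on the prob argument, the sketch below flips the same unfair coin many more times and looks at the share of heads, which should settle close to 0.2. The seed and the number of flips are arbitrary choices of mine, not part of the lab.

```r
set.seed(2024)  # arbitrary seed, used only so the run is reproducible

coin_outcomes <- c("heads", "tails")
many_flips <- sample(coin_outcomes, size = 100000, replace = TRUE,
                     prob = c(0.2, 0.8))

# Share of flips that came up heads; expected to be near 0.2
prop_heads <- mean(many_flips == "heads")
prop_heads
```

With only 100 flips, a count such as 19 heads is well within the usual run-to-run variation around 20.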

Exercise 4:

shot_outcomes<-c("H", "M")
set.seed(1112223)
sim_basket<-sample(shot_outcomes, size = 133, replace = TRUE, prob =c(0.45, 0.55))
table(sim_basket)

## sim_basket
##  H  M 
## 52 81

Since we want a shooting percentage of 45%, we use the “prob” argument with 0.45 for a hit and 0.55 for a miss, in the same order as shot_outcomes. Since we want to sample 133 shots, we change size from 1 to 133, and we assign the result to “sim_basket”.
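The simulated shooter made 52 of 133 shots, which is below 45%. A short worked calculation, assuming the shots are independent with a hit probability of 0.45 (so the number of hits is binomial), suggests that this is ordinary sampling variation rather than a mistake in the code.

```r
n_shots <- 133
p_hit   <- 0.45

expected_hits <- n_shots * p_hit                      # about 59.85 hits
sd_hits <- sqrt(n_shots * p_hit * (1 - p_hit))        # binomial standard deviation

# How far the observed 52 hits are from the expected count, in standard deviations
z <- (52 - expected_hits) / sd_hits
round(c(expected = expected_hits, sd = sd_hits, z = z), 2)
```

The observed count sits roughly 1.4 standard deviations below the expected value.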

On your own:

1:

sim_streak<-calc_streak(sim_basket)$length
barplot(table(sim_streak))

table(sim_streak)

## 
##  0  1  2  3  4 
## 49 21  8  1  3

2:

The distribution of streak lengths shows significantly more streaks of 0 than of any other length. With 49 streaks of 0, that length is the most typical in this simulation. The longest streak in the 133 simulated shots is 4 hits in a row, which happened 3 times.
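The same summary can be read off the table in code. This sketch re-enters the counts from the table(sim_streak) output by hand, so it runs without the openintro package; the variable names are my own.

```r
# Streak counts copied from the table(sim_streak) output
streak_counts <- c("0" = 49, "1" = 21, "2" = 8, "3" = 1, "4" = 3)

total_streaks <- sum(streak_counts)                           # 82 streaks in all
most_common   <- names(streak_counts)[which.max(streak_counts)]
longest       <- max(as.integer(names(streak_counts)))
share         <- round(streak_counts / total_streaks, 3)      # share of each length

most_common   # "0"
longest       # 4
share
```

About 60% of the simulated streaks have length 0.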

3:

I would expect the most typical streak length to still be 0, since that has been true for all the simulations I have run. I doubt that the other streak lengths would have exactly the same counts, but I imagine that a streak of 1 would be the second most common, because with random, independent shots a 1-shot streak is easier to obtain than any longer streak. Since the simulations are independent of one another, the exact counts could differ quite a bit from run to run.
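This expectation can be backed up with a short calculation. Assuming every shot is independent with a hit probability of 0.45, a streak of length k needs k hits followed by one miss, so its probability is 0.45^k * 0.55. The sketch below computes this for lengths 0 to 4 and checks it against R's built-in geometric distribution; the variable names are my own.

```r
p_hit <- 0.45
streak_len <- 0:4

# P(streak = k) = p_hit^k * (1 - p_hit): k hits, then the miss that ends the streak
p_streak <- p_hit^streak_len * (1 - p_hit)
names(p_streak) <- streak_len
round(p_streak, 3)

# dgeom() gives the same values when "success" is taken to be the miss
all.equal(unname(p_streak), dgeom(streak_len, prob = 1 - p_hit))
```

The probabilities fall from roughly 55% for a streak of 0 to about 25% for a streak of 1 and about 11% for a streak of 2, so under this assumption shorter streaks are always more likely than longer ones.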

4:

Kobe’s streak distribution looks similar to the one from the simulation, but Kobe has noticeably fewer streaks of 0, meaning he has fewer misses that directly follow a miss. Kobe also has more 3-shot streaks than the simulation, which suggests he shot somewhat better overall, with fewer empty streaks and more multi-shot streaks. I don’t think this data supports the claim that Kobe had “hot hands”. The simulated shooter’s shots are independent by construction, and Kobe’s distribution has the same overall shape, so his streaks are consistent with each shot being separate from the one before and the one after. The slightly higher number of 3-shot streaks is not enough to imply “hot hands”. The data does show that he was a more successful shooter than our simulated one.
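Comparing two bar plots by eye is easier when both sets of streaks are turned into proportions on the same scale. The sketch below is one way to do that; compare_streaks is a hypothetical helper of my own, and it expects two numeric vectors of streak lengths such as kobe_streak and sim_streak.

```r
# Hypothetical helper: share of streaks at each length, for two sets of streaks
compare_streaks <- function(a, b) {
  lengths <- 0:max(a, b)
  # tabulate() counts from 1, so shift the streak lengths up by one
  share <- function(s) tabulate(s + 1, nbins = length(lengths)) / length(s)
  data.frame(length = lengths, first = share(a), second = share(b))
}

# Intended use with the vectors computed earlier in this report:
# compare_streaks(kobe_streak, sim_streak)
```

Each column of the result sums to 1, so the rows can be compared directly even though the two vectors contain different numbers of streaks.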