Probability Lab

Author

Dom Ellis


Load packages

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(openintro)
Loading required package: airports
Loading required package: cherryblossom
Loading required package: usdata

View the Data

data("kobe_basket")
glimpse(kobe_basket)
Rows: 133
Columns: 6
$ vs          <fct> ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL…
$ game        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ quarter     <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3…
$ time        <fct> 9:47, 9:07, 8:11, 7:41, 7:03, 6:01, 4:07, 0:52, 0:00, 6:35…
$ description <fct> Kobe Bryant makes 4-foot two point shot, Kobe Bryant misse…
$ shot        <chr> "H", "M", "M", "H", "H", "M", "M", "M", "M", "H", "H", "H"…

Exercise 1

What does a streak length of 1 mean, i.e. how many hits and misses are in a streak of 1? What about a streak length of 0?

# Answer: a streak of 1 is one hit followed by one miss. A streak of 0 has no hits, only the miss that ends it.
kobe_streak <- calc_streak(kobe_basket$shot)
kobe_streak
   length
1       1
2       0
3       2
4       0
5       0
6       0
7       3
8       2
9       0
10      3
11      0
12      1
13      3
14      0
15      0
16      0
17      0
18      0
19      1
20      1
21      0
22      4
23      1
24      0
25      1
26      0
27      1
28      0
29      1
30      2
31      0
32      1
33      2
34      1
35      0
36      0
37      1
38      0
39      0
40      0
41      1
42      1
43      0
44      1
45      0
46      2
47      0
48      0
49      0
50      3
51      0
52      1
53      0
54      1
55      2
56      1
57      0
58      1
59      0
60      0
61      1
62      3
63      3
64      1
65      1
66      0
67      0
68      0
69      0
70      0
71      1
72      1
73      0
74      0
75      0
76      1
ggplot(data = kobe_streak, aes(x = length)) +
  geom_bar()
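
To make the streak definition from Exercise 1 concrete, here is a minimal base R sketch on a short made-up shot sequence. The helper `count_streaks()` is a hypothetical illustration, not the `calc_streak()` function from openintro, and it may treat the end of a sequence differently; it simply records how many hits came before each miss.

```r
# Hypothetical helper (not openintro's calc_streak): every miss closes a
# streak, and the streak length is the number of hits just before that miss.
count_streaks <- function(shots) {
  hits_in_a_row <- 0
  streaks <- c()
  for (shot in shots) {
    if (shot == "H") {
      hits_in_a_row <- hits_in_a_row + 1
    } else {
      streaks <- c(streaks, hits_in_a_row)  # a miss ends the current streak
      hits_in_a_row <- 0
    }
  }
  streaks
}

# Made-up sequence: H M | M | H H M
example_streaks <- count_streaks(c("H", "M", "M", "H", "H", "M"))
example_streaks
# streak lengths: 1 0 2
```

The middle value shows the streak of 0: the second miss follows another miss, so no hits precede it.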

Exercise 2

Describe the distribution of Kobe’s streak lengths from the 2009 NBA finals. What was his typical streak length? How long was his longest streak of baskets? Make sure to include the accompanying plot in your answer.

The distribution is right-skewed, and the most frequent (typical) streak length is 0. His longest streak was 4 baskets, which he achieved once. In my view, Kobe Bryant isn’t as good as people say he is.
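
The summary above can be checked with a few lines of R. This sketch rebuilds the streak lengths from counts tallied by hand from the printed `kobe_streak` output (39 zeros, 24 ones, 6 twos, 6 threes, 1 four); the object name `streak_lengths` is introduced here for illustration only. With the lab data loaded, `kobe_streak$length` could be used instead.

```r
# Counts tallied from the printed kobe_streak output (76 streaks in total)
streak_lengths <- rep(0:4, times = c(39, 24, 6, 6, 1))

table(streak_lengths)      # frequency of each streak length
median(streak_lengths)     # typical streak length: 0
max(streak_lengths)        # longest streak: 4
sum(streak_lengths == 4)   # the longest streak occurred once
mean(streak_lengths == 0)  # share of streaks with no hits
```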

coin_outcomes <- c("heads", "tails")
sample(coin_outcomes, size = 1, replace = TRUE)
[1] "heads"
sim_fair_coin <- sample(coin_outcomes, size = 100, replace = TRUE)
sim_fair_coin
  [1] "tails" "heads" "heads" "heads" "tails" "tails" "tails" "tails" "tails"
 [10] "tails" "heads" "tails" "heads" "tails" "tails" "tails" "tails" "tails"
 [19] "tails" "tails" "tails" "heads" "heads" "heads" "heads" "heads" "tails"
 [28] "heads" "tails" "heads" "heads" "heads" "tails" "tails" "tails" "heads"
 [37] "heads" "heads" "heads" "heads" "heads" "tails" "heads" "heads" "heads"
 [46] "tails" "heads" "heads" "heads" "heads" "tails" "tails" "tails" "tails"
 [55] "tails" "heads" "heads" "heads" "heads" "tails" "tails" "tails" "heads"
 [64] "heads" "heads" "tails" "heads" "tails" "heads" "tails" "heads" "heads"
 [73] "tails" "heads" "heads" "tails" "tails" "heads" "tails" "heads" "tails"
 [82] "tails" "tails" "tails" "heads" "tails" "tails" "tails" "heads" "tails"
 [91] "heads" "tails" "tails" "heads" "heads" "heads" "tails" "heads" "tails"
[100] "tails"
table(sim_fair_coin)
sim_fair_coin
heads tails 
   49    51 
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE, 
                          prob = c(0.2, 0.8))
sim_fair_coin <- sample(coin_outcomes, size = 100, replace = TRUE)
sim_fair_coin
  [1] "tails" "tails" "heads" "tails" "tails" "tails" "heads" "heads" "tails"
 [10] "heads" "tails" "tails" "tails" "tails" "tails" "heads" "heads" "heads"
 [19] "heads" "heads" "heads" "heads" "tails" "heads" "heads" "tails" "tails"
 [28] "tails" "heads" "tails" "heads" "heads" "heads" "tails" "heads" "heads"
 [37] "tails" "tails" "heads" "tails" "heads" "tails" "tails" "heads" "tails"
 [46] "tails" "heads" "heads" "heads" "heads" "tails" "tails" "heads" "tails"
 [55] "tails" "heads" "tails" "heads" "heads" "heads" "tails" "heads" "heads"
 [64] "heads" "tails" "heads" "tails" "tails" "heads" "heads" "tails" "tails"
 [73] "tails" "heads" "tails" "tails" "heads" "tails" "heads" "heads" "heads"
 [82] "heads" "heads" "tails" "heads" "heads" "tails" "tails" "tails" "tails"
 [91] "tails" "tails" "heads" "tails" "tails" "tails" "heads" "tails" "tails"
[100] "heads"
table(sim_fair_coin)
sim_fair_coin
heads tails 
   49    51 
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE, 
                          prob = c(0.2, 0.8))

Exercise 3

In your simulation of flipping the unfair coin 100 times, how many flips came up heads? Include the code for sampling the unfair coin in your response. Since the markdown file will run the code, and generate a new sample each time you Knit it, you should also “set a seed” before you sample. Read more about setting a seed below.

54/100 and 44/100
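
One way to make the Exercise 3 count reproducible is sketched below. The seed value 2024 is an arbitrary choice for illustration, and `n_heads` is a name introduced here. Because heads has probability 0.2, the count should land around 20 out of 100 on average, though any single run varies.

```r
set.seed(2024)  # arbitrary seed so the same sample is drawn on every Knit

coin_outcomes <- c("heads", "tails")
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE,
                          prob = c(0.2, 0.8))

table(sim_unfair_coin)                      # counts of heads and tails
n_heads <- sum(sim_unfair_coin == "heads")  # number of flips that came up heads
n_heads
```

Calling `set.seed()` immediately before `sample()` is what fixes the draw; changing the seed gives a different, but again repeatable, sample.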

Exercise 4

What change needs to be made to the sample function so that it reflects a shooting percentage of 45%? Make this adjustment, then run a simulation to sample 133 shots. Assign the output of this simulation to a new object called sim_basket.

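Set size to 133 and weight the outcomes so that H has probability 0.45 and M has probability 0.55. Here that is done by sampling from a pool of 9 H's and 11 M's, since 9/20 = 0.45.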

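# 50/50 shooter: with no prob argument, H and M are equally likely
shot_outcomesKobe <- c("H", "M")
sim_basketKobe <- sample(shot_outcomesKobe, size = 133, replace = TRUE)
# 45% shooter: a pool of 9 H's and 11 M's gives P(H) = 9/20 = 0.45
shot_outcomesRandom <- c(rep("H", 9), rep("M", 11))
sim_basketRandom <- sample(shot_outcomesRandom, size = 133, replace = TRUE)
# Proportion of H in the first sample minus proportion of M in the second
diffprops <- mean(sim_basketKobe == "H") - mean(sim_basketRandom == "M")
diffprops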
[1] 0.1278195
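
An alternative sketch for Exercise 4 passes the 45% shooting percentage through the `prob` argument of `sample()` and stores the result in `sim_basket`, as the exercise asks. The seed is an arbitrary assumption, and `shot_outcomes` is a name chosen here.

```r
set.seed(45)  # arbitrary seed, only for reproducibility

shot_outcomes <- c("H", "M")
# P(H) = 0.45 and P(M) = 0.55 model a 45% shooter
sim_basket <- sample(shot_outcomes, size = 133, replace = TRUE,
                     prob = c(0.45, 0.55))

table(sim_basket)        # hits and misses in the simulated 133 shots
mean(sim_basket == "H")  # simulated shooting percentage, near 0.45 on average
```

Using `prob` keeps the outcome vector at two elements and states the shooting percentage directly, which is equivalent in distribution to sampling from a pool of 9 H's and 11 M's.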