Lab 3, probability

library(tidyverse)
library(openintro)
library(wesanderson)
library(ggplot2)

glimpse(kobe_basket)

## Rows: 133
## Columns: 6
## $ vs          <fct> ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ORL, ...
## $ game        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
## $ quarter     <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3...
## $ time        <fct> 9:47, 9:07, 8:11, 7:41, 7:03, 6:01, 4:07, 0:52, 0:00, 6...
## $ description <fct> Kobe Bryant makes 4-foot two point shot, Kobe Bryant mi...
## $ shot        <chr> "H", "M", "M", "H", "H", "M", "M", "M", "M", "H", "H", ...

Exercise 1

What does a streak length of 1 mean, i.e. how many hits and misses are in a streak of 1? What about a streak length of 0?

> A streak length of 1 means one hit followed by one miss. A streak length of 0 means a miss with no hit before it, that is, a miss that comes directly after another miss (or a miss on the very first shot).
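To make the definition concrete, here is a minimal base-R sketch that counts the hits between consecutive misses. The helper name `streak_lengths` is hypothetical and is only meant to illustrate the idea; it is not presented as the code behind `calc_streak`. The input is the first eight shots shown in the `glimpse()` output above.

```r
# Hypothetical helper (an illustration, not the openintro code):
# a streak is the run of hits that ends at a miss.
streak_lengths <- function(shots) {
  is_hit <- as.integer(shots == "H")
  # Pad with a miss on both sides so every run of hits is bounded by misses.
  padded <- c(0L, is_hit, 0L)
  miss_positions <- which(padded == 0L)
  # Hits between two consecutive misses = distance between them minus one.
  diff(miss_positions) - 1L
}

shots <- c("H", "M", "M", "H", "H", "M", "M", "M")
streak_lengths(shots)
# 1 0 2 0 0 0
```

The first value is the single hit before the first miss, the `2` is the pair of hits in the middle, and every miss that directly follows another miss contributes a `0`. The final `0` comes from the padding: no hits follow the last miss.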

Exercise 2

Describe the distribution of Kobe’s streak lengths from the 2009 NBA finals. What was his typical streak length? How long was his longest streak of baskets? Make sure to include the accompanying plot in your answer.

> Kobe’s streak lengths follow a right-skewed distribution. His typical streak length is 0, the most common value with nearly 40 occurrences, and the counts fall off as the streaks get longer. His longest streak of baskets is 4.

# Insert code for Exercise 2 here

#calculating and seeing Kobe's streaks:

kobe_streak <- calc_streak(kobe_basket$shot)
ggplot(data = kobe_streak, aes(x = length)) + geom_bar()

Exercise 3

In your simulation of flipping the unfair coin 100 times, how many flips came up heads? Include the code for sampling the unfair coin in your response. Since the markdown file will run the code, and generate a new sample each time you Knit it, you should also “set a seed” before you sample. Read more about setting a seed below.

> The unfair coin came up heads 18 times and tails 82 times, as the table for sim_unfair_coin below shows.

# Insert code for Exercise 3 here

coin_outcomes <- c("heads", "tails")
sample(coin_outcomes, size = 1, replace = TRUE)

## [1] "tails"

# Fair
set.seed(100)
sim_fair_coin <- sample(coin_outcomes, size = 100, replace = TRUE)
table(sim_fair_coin)

## sim_fair_coin
## heads tails 
##    50    50

# Unfair
set.seed(100)
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE,
                          prob = c(0.2, 0.8))
table(sim_unfair_coin)

## sim_unfair_coin
## heads tails 
##    18    82
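As a small illustration of why the seed matters, the sketch below draws the unfair coin twice with the same seed and checks that both runs agree. The object names `first_run` and `second_run` are made up for this example.

```r
coin_outcomes <- c("heads", "tails")

# Same seed before each draw, so both calls see the same random numbers.
set.seed(100)
first_run <- sample(coin_outcomes, size = 100, replace = TRUE,
                    prob = c(0.2, 0.8))

set.seed(100)
second_run <- sample(coin_outcomes, size = 100, replace = TRUE,
                     prob = c(0.2, 0.8))

identical(first_run, second_run)
# TRUE
```

Without the second `set.seed(100)`, the two vectors would almost certainly differ, which is why an unseeded chunk gives new numbers on every knit.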

Exercise 4

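What change needs to be made to the sample function so that it reflects a shooting percentage of 45%? Make this adjustment, then run a simulation to sample 133 shots. Assign the output of this simulation to a new object called sim_basket.

> The prob argument has to be added: prob = c(0.45, 0.55) makes a hit ("H") come up with probability 0.45 and a miss ("M") with probability 0.55.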

set.seed(133)
shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 133, replace = TRUE,
                     prob = c(0.45, 0.55))
table(sim_basket)

## sim_basket
##  H  M 
## 73 60
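As a rough sanity check on this table, the number of hits in 133 independent shots at 45% should follow a binomial distribution. The sketch below works out the expected count and its standard deviation; the variable names are made up for this example.

```r
n_shots <- 133
p_hit <- 0.45

expected_hits <- n_shots * p_hit                 # 59.85
sd_hits <- sqrt(n_shots * p_hit * (1 - p_hit))   # about 5.74

observed_hits <- 73                              # from table(sim_basket) above
z <- (observed_hits - expected_hits) / sd_hits   # about 2.3
c(expected = expected_hits, sd = sd_hits, z = z)
```

So this particular seed happens to give a run that sits a little more than two standard deviations above the expected 60 or so hits. That is unusual but still within what chance alone can produce.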

Exercise 5

Using calc_streak, compute the streak lengths of sim_basket, and save the results in a data frame called sim_streak.

# Code for Exercises 5 and 6

library(ggplot2)
library(tidyr)

# Re-create sim_basket with the same seed as in Exercise 4
set.seed(133)
shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 133, replace = TRUE,
                     prob = c(0.45, 0.55))
table(sim_basket)

## sim_basket
##  H  M 
## 73 60

sim_streak <- calc_streak(sim_basket)
sim_streak

##    length
## 1       0
## 2       2
## 3       0
## 4       2
## 5       2
## 6       2
## 7       0
## 8       0
## 9       0
## 10      1
## 11      0
## 12      4
## 13      2
## 14      0
## 15      2
## 16      0
## 17      0
## 18      2
## 19      1
## 20      1
## 21      3
## 22      0
## 23      0
## 24      1
## 25      5
## 26      2
## 27      0
## 28      6
## 29      2
## 30      0
## 31      0
## 32      0
## 33      0
## 34      5
## 35      1
## 36      4
## 37      0
## 38      1
## 39      2
## 40      0
## 41      1
## 42      1
## 43      3
## 44      1
## 45      1
## 46      1
## 47      0
## 48      0
## 49      1
## 50      2
## 51      3
## 52      0
## 53      0
## 54      1
## 55      1
## 56      2
## 57      1
## 58      0
## 59      1
## 60      0
## 61      0

hist(sim_streak$length, main = "Length of Streaks", xlab = "Streak length", ylab = "Frequency", breaks = 10)

Exercise 6

Describe the distribution of streak lengths. What is the typical streak length for this simulated independent shooter with a 45% shooting percentage? How long is the player’s longest streak of baskets in 133 shots? Make sure to include a plot in your answer.

> The streak lengths follow a right-skewed distribution. The typical streak length is 0: 25 of the 61 streaks contain no hit at all. Among streaks with at least one hit, singles are the most common, followed by doubles and then triples, with quadruples and quintuples tied after that. The longest streak is 6 baskets, which occurs once. The breakdown of streaks is as follows:

length 0: 25x
singles: 16x
doubles: 12x
triples: 3x
quads: 2x
quint.: 2x
sext.: 1x
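The counts can be checked directly against the printed sim_streak output. The sketch below copies those 61 values into a plain vector (named `sim_lengths` here just for illustration) and tabulates them, so it does not depend on the random number generator.

```r
# Streak lengths copied from the printed sim_streak output above.
sim_lengths <- c(0, 2, 0, 2, 2, 2, 0, 0, 0, 1, 0, 4, 2, 0, 2, 0, 0, 2, 1, 1,
                 3, 0, 0, 1, 5, 2, 0, 6, 2, 0, 0, 0, 0, 5, 1, 4, 0, 1, 2, 0,
                 1, 1, 3, 1, 1, 1, 0, 0, 1, 2, 3, 0, 0, 1, 1, 2, 1, 0, 1, 0, 0)

table(sim_lengths)
# sim_lengths
#  0  1  2  3  4  5  6
# 25 16 12  3  2  2  1

max(sim_lengths)   # longest streak: 6
sum(sim_lengths)   # total hits: 73, matching table(sim_basket)
```

The total of 73 hits and the 61 streaks (60 misses plus one) both agree with the simulated shot table, which is a quick way to confirm that the tally is complete.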

Exercise 7

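If you were to run the simulation of the independent shooter a second time, how would you expect its streak distribution to compare to the distribution from the question above? Exactly the same? Somewhat similar? Totally different? Explain your reasoning.

> Somewhat similar. The individual shots, and therefore the exact streaks, would differ because each run draws a new random sample. The overall shape would stay much the same, though, because the probability of making each shot is still 45% and the shots are still independent. The rerun below bears this out: it produces 58 hits instead of 73 and a longest streak of 5 instead of 6, but streaks of length 0 are again the most common and longer streaks are increasingly rare.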

shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 133, replace = TRUE,
                     prob = c(0.45, 0.55))
table(sim_basket)

## sim_basket
##  H  M 
## 58 75

sim_streak <- calc_streak(sim_basket)
sim_streak

##    length
## 1       0
## 2       3
## 3       0
## 4       0
## 5       0
## 6       1
## 7       0
## 8       0
## 9       1
## 10      2
## 11      0
## 12      0
## 13      0
## 14      0
## 15      2
## 16      1
## 17      5
## 18      0
## 19      4
## 20      0
## 21      0
## 22      0
## 23      0
## 24      2
## 25      0
## 26      1
## 27      0
## 28      1
## 29      0
## 30      2
## 31      0
## 32      1
## 33      0
## 34      0
## 35      0
## 36      0
## 37      0
## 38      0
## 39      1
## 40      1
## 41      1
## 42      1
## 43      0
## 44      1
## 45      0
## 46      3
## 47      0
## 48      0
## 49      0
## 50      2
## 51      1
## 52      0
## 53      0
## 54      0
## 55      0
## 56      0
## 57      2
## 58      1
## 59      0
## 60      1
## 61      0
## 62      1
## 63      0
## 64      0
## 65      2
## 66      1
## 67      1
## 68      0
## 69      2
## 70      2
## 71      1
## 72      0
## 73      2
## 74      1
## 75      2
## 76      2
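To get a feel for how much one run can differ from the next, the simulation can be repeated many times while recording only the longest streak. This is a sketch under stated assumptions: `streak_lengths` is a hypothetical base-R stand-in for the streak calculation, and the seed and the 1000 repetitions are arbitrary choices.

```r
# Hypothetical helper: number of hits between consecutive misses.
streak_lengths <- function(shots) {
  padded <- c(0L, as.integer(shots == "H"), 0L)
  diff(which(padded == 0L)) - 1L
}

set.seed(2021)  # arbitrary seed, used only so the sketch is reproducible
longest <- replicate(1000, {
  shots <- sample(c("H", "M"), size = 133, replace = TRUE,
                  prob = c(0.45, 0.55))
  max(streak_lengths(shots))
})

# How often each "longest streak" value occurs across the 1000 runs.
table(longest)
```

The table should show the longest streak moving around from run to run while staying within a fairly narrow band, which is the "somewhat similar" behaviour described above.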

Exercise 8

How does Kobe Bryant’s distribution of streak lengths compare to the distribution of streak lengths for the simulated shooter? Using this comparison, do you have evidence that the hot hand model fits Kobe’s shooting patterns? Explain.

> Kobe Bryant’s streak distribution looks much like that of the independent shooter. Both are right-skewed: streaks of length 0 form the largest category, followed by singles, doubles, triples, and so on in roughly descending order. Kobe’s longest streak (4) is even shorter than the simulated shooter’s (6). Because Kobe’s streaks look like those of a shooter whose shots are independent, this comparison gives no evidence that the hot hand model fits his shooting patterns.

# Kobe's streak plot again, for comparison with the simulated shooter:

kobe_streak <- calc_streak(kobe_basket$shot)
ggplot(data = kobe_streak, aes(x = length)) + geom_bar()
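One way to see why an independent shooter produces this descending pattern: if every shot is a hit with probability p regardless of what came before, a streak of length k needs k hits followed by one miss, so its probability is p^k (1 - p). The sketch below evaluates this for p = 0.45, ignoring the small edge effect of the final streak being cut off by the end of the data. The variable names are made up for this example.

```r
p_hit <- 0.45
k <- 0:6

# Probability that a streak has exactly k hits before the miss that ends it.
prob_k <- p_hit^k * (1 - p_hit)

data.frame(streak_length = k, probability = round(prob_k, 2))
```

Length 0 has the highest probability (0.55) and each additional hit multiplies the probability by 0.45, so a right-skewed bar chart is exactly what independence predicts. A hot hand would instead show up as more long streaks than this pattern allows.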
