Lab-3---Probability.knit

library(tidyverse)
library(openintro)
library(ggplot2)

Exercise 1

What does a streak length of 1 mean, i.e. how many hits and misses are in a streak of 1? What about a streak length of 0?

A streak of length 1 consists of one hit followed by the miss that ends it, so it contains one hit and one miss. A streak of length 0 contains no hits and one miss: it is a miss that comes immediately after the miss that ended the previous streak.
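To check this reading of streak lengths, here is a minimal base R sketch that counts the hits between consecutive misses. The helper name streak_lengths and the short shot sequence are hypothetical and only for illustration; this is not necessarily how calc_streak is implemented in openintro.

```r
# Hypothetical helper: count the hits between consecutive misses.
streak_lengths <- function(shots) {
  hit <- shots == "H"
  # Pad with a miss on both sides so every streak is bounded by misses
  padded <- c(FALSE, hit, FALSE)
  miss_positions <- which(!padded)
  # The gap between two consecutive misses, minus one, is the number of hits
  diff(miss_positions) - 1
}

# Made-up sequence: one hit, a miss, another miss, then two hits
example_shots <- c("H", "M", "M", "H", "H")
example_streaks <- streak_lengths(example_shots)
example_streaks  # 1 0 2
```

The second miss directly follows the first, so it produces the streak of length 0.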

Exercise 2

Describe the distribution of Kobe’s streak lengths from the 2009 NBA finals. What was his typical streak length? How long was his longest streak of baskets? Make sure to include the accompanying plot in your answer.

# Calculate the streak lengths for all 133 shots and store them in a data frame called kobe_streak
kobe_streak <- calc_streak(kobe_basket$shot)
# Draw a bar plot to analyze the distribution of the streak lengths
ggplot(data = kobe_streak, aes(x = length)) +
  geom_bar()

Kobe’s streak lengths are right-skewed: most of his streaks are short. His typical streak length is 0, with 39 streaks of length 0, followed by 25 streaks of length 1. His longest streak of baskets was of length 4.

Exercise 3

In your simulation of flipping the unfair coin 100 times, how many flips came up heads? Include the code for sampling the unfair coin in your response. Since the markdown file will run the code, and generate a new sample each time you Knit it, you should also “set a seed” before you sample. Read more about setting a seed below.

# Set the seed for the simulation to my birth year
set.seed(1987)
# Create the vector results, which holds the two possible outcomes: "heads" and "tails"
results <- c("heads", "tails")
# Flip the coin 100 times with a 20% probability of heads and an 80% probability of tails
unfair_results <- sample(results, size = 100, replace = TRUE, prob = c(0.2, 0.8))
# Display the results
table(unfair_results)

## unfair_results
## heads tails 
##    14    86

The simulation of flipping the unfair coin 100 times produced 14 heads and 86 tails.
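As a rough check on whether 14 heads is a plausible result, the sketch below compares it with what a binomial model predicts. It assumes the number of heads follows a Binomial(100, 0.2) distribution, which matches how sample was called above; the variable names are my own.

```r
# Assumed model: number of heads ~ Binomial(n = 100, p = 0.2)
n_flips <- 100
p_heads <- 0.2

expected_heads <- n_flips * p_heads                    # 20
sd_heads <- sqrt(n_flips * p_heads * (1 - p_heads))    # 4

# How far the observed 14 heads is from the expected count, in standard deviations
z_observed <- (14 - expected_heads) / sd_heads         # -1.5

# Probability of seeing 14 or fewer heads under this model
p_at_most_14 <- pbinom(14, size = n_flips, prob = p_heads)
```

Under this model, 14 heads is 1.5 standard deviations below the expected 20, so it is on the low side but not unusual for a single run.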

Exercise 4

What change needs to be made to the sample function so that it reflects a shooting percentage of 45%? Make this adjustment, then run a simulation to sample 133 shots. Assign the output of this simulation to a new object called sim_basket.

# Set the seed for the simulation to my birth year
set.seed(1987)
# Create the vector shot_results, which holds the two possible outcomes: "H" (hit) and "M" (miss)
shot_results <- c("H", "M")
# Simulate 133 shots with a 0.45 probability of a hit and a 0.55 probability of a miss
sim_basket <- sample(shot_results, size = 133, replace = TRUE, prob = c(0.45, 0.55))
# Display the results
table(sim_basket)

## sim_basket
##  H  M 
## 53 80

To reflect a shooting percentage of 45%, the sample function needs the prob argument: prob = c(0.45, 0.55) assigns a probability of 0.45 to a hit and 1 - 0.45 = 0.55 to a miss. The size argument is also set to 133 to simulate 133 shots.

Exercise 5

Using calc_streak, compute the streak lengths of sim_basket, and save the results in a data frame called sim_streak.

# Set the seed for the simulation to my birth year
set.seed(1987)
# Create the vector shot_outcomes, which holds the two possible outcomes: "H" (hit) and "M" (miss)
shot_outcomes <- c("H", "M")
# Simulate 133 independent shots with a 45% chance of a hit
sim_basket <- sample(shot_outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
# Compute the streak lengths of the simulated shots
sim_streak <- calc_streak(sim_basket)


Exercise 6

Describe the distribution of streak lengths. What is the typical streak length for this simulated independent shooter with a 45% shooting percentage? How long is the player’s longest streak of baskets in 133 shots? Make sure to include a plot in your answer.

# Plot the distribution of the simulated streak lengths
ggplot(data = sim_streak, aes(x = length)) +
  geom_bar()

The typical streak length for this simulated independent shooter with a 45% shooting percentage is 0, which occurs 40 times. The player’s longest streak of baskets in 133 shots is 5.
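For comparison, here is a small sketch of what theory predicts for an independent shooter. It assumes each streak length is the number of hits before the next miss, which follows a geometric distribution with P(length = k) = 0.45^k * 0.55; this ignores the last streak, which is cut off by the end of the 133 shots. The variable names are my own.

```r
p_hit <- 0.45
p_miss <- 1 - p_hit
streak_length <- 0:5

# dgeom() counts "failures" (here: hits) before the first "success" (here: a miss)
streak_prob <- dgeom(streak_length, prob = p_miss)

# Theoretical share of streaks at each length
data.frame(length = streak_length, probability = round(streak_prob, 3))
```

Under this assumption more than half of all streaks (0.55) have length 0 and each extra basket multiplies the probability by 0.45, which agrees with 0 being the typical streak length and long streaks being rare.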

Exercise 7

If you were to run the simulation of the independent shooter a second time, how would you expect its streak distribution to compare to the distribution from the question above? Exactly the same? Somewhat similar? Totally different? Explain your reasoning.

If we were to run the simulation of the independent shooter a second time, I would expect its streak distribution to be somewhat similar to the one above, but not exactly the same. Each shot is drawn independently with the same probabilities (45% hit, 55% miss), so short streaks of length 0 and 1 should again be the most common. However, the exact counts and the longest streak will vary from run to run because the sampling is random. The distribution would be exactly the same only if the same seed were set before sampling.
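The effect of the seed can be illustrated with a short sketch. The helper simulate_shots and the second seed value 2024 are hypothetical choices for illustration; the sampling call itself is the same as in Exercise 4.

```r
shot_outcomes <- c("H", "M")

# Hypothetical helper: simulate 133 independent shots after setting a given seed
simulate_shots <- function(seed) {
  set.seed(seed)
  sample(shot_outcomes, size = 133, replace = TRUE, prob = c(0.45, 0.55))
}

run_a <- simulate_shots(1987)
run_b <- simulate_shots(1987)  # same seed as run_a
run_c <- simulate_shots(2024)  # different seed (arbitrary choice)

identical(run_a, run_b)  # TRUE: the same seed reproduces the same shots
table(run_a)
table(run_c)
```

Comparing the two tables shows how much the hit and miss counts move between runs even though the probabilities are unchanged.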

Exercise 8

How does Kobe Bryant’s distribution of streak lengths compare to the distribution of streak lengths for the simulated shooter? Using this comparison, do you have evidence that the hot hand model fits Kobe’s shooting patterns? Explain.

Kobe Bryant’s distribution of streak lengths is similar to that of the simulated shooter. In both, the most common streak length is 0 and the second most common is 1, and long streaks are rare (Kobe’s longest is 4, the simulated shooter’s is 5). Because Kobe’s streaks look like those of an independent shooter with a fixed 45% shooting percentage, this comparison provides no evidence that the hot hand model fits his shooting patterns.