{ {;;;} {*;;;}
Complete all Exercises, and submit answers to
Questions on the Coursera
platform.
In this lab we will explore the data using the `dplyr` package and visualize it using the `ggplot2` package for data visualization. The data can be found in the `statsr` package.

```{r load-packages, message=FALSE}
library(statsr)
library(dplyr)
library(ggplot2)
```

```{r load-data}
data(kobe_basket)
```

The `shot` variable in this dataset records whether each shot was a hit (`H`) or a miss (`M`). We use the `calc_streak` function to calculate the streak lengths, and store the results in `kobe_streak` as the `length` variable.

```{r calc-streak-kobe}
kobe_streak <- calc_streak(kobe_basket$shot)
```
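
To make the idea of a streak length concrete, here is a minimal base-R sketch of how such lengths could be computed by hand. It assumes a streak is the number of consecutive hits before a miss, so a miss that directly follows another miss counts as a streak of length 0. The helper `streak_lengths` and the vector `example_shots` are hypothetical names used only for illustration; this is not the `statsr` implementation of `calc_streak`.

```r
# Hypothetical helper: streak lengths under the assumed definition above.
streak_lengths <- function(shots) {
  hit <- as.integer(shots == "H")   # 1 for a hit, 0 for a miss
  padded <- c(0L, hit, 0L)          # pad with a "miss" at both ends
  miss_pos <- which(padded == 0L)   # positions of all misses (incl. padding)
  diff(miss_pos) - 1L               # number of hits between consecutive misses
}

# Made-up shot sequence, not taken from kobe_basket
example_shots <- c("H", "M", "M", "H", "H", "M", "H", "H", "H")
streak_lengths(example_shots)
# 1 0 2 3
```

The padding trick treats the start and end of the sequence as misses, so a run of hits at either end is still counted as a streak.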

```{r plot-streak-kobe}
ggplot(data = kobe_streak, aes(x = length)) +
  geom_histogram(binwidth = 1)
```


```{r head-tail}
coin_outcomes <- c("heads", "tails")
sample(coin_outcomes, size = 1, replace = TRUE)
```

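The vector `coin_outcomes` can be thought of as a hat with two slips of paper in it: one slip says `heads` and the other says `tails`. The function `sample` draws a slip from the hat. To simulate flipping a fair coin 100 times, adjust the `size` argument, which sets how many draws are made; the `replace = TRUE` argument indicates we put the slip of paper back in the hat before drawing again. Save the resulting vector in a new object called `sim_fair_coin`.

```{r sim-fair-coin}
sim_fair_coin <- sample(coin_outcomes, size = 100, replace = TRUE)
```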
To view the results of this simulation, type the name of the object and then use `table` to count up the number of heads and tails.

```{r table-sim-fair-coin}
sim_fair_coin
table(sim_fair_coin)
```
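
As a small sketch of what can be done with these counts, the snippet below turns them into proportions using base R's `prop.table`. The seed value is an arbitrary assumption so that the sketch is reproducible, and `counts` is a name introduced here for illustration.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

coin_outcomes <- c("heads", "tails")
sim_fair_coin <- sample(coin_outcomes, size = 100, replace = TRUE)

counts <- table(sim_fair_coin)   # number of heads and tails
counts
prop.table(counts)               # the same counts as proportions of all flips

# The share of heads can also be computed directly from the vector
mean(sim_fair_coin == "heads")
```

For a fair coin both proportions should be near 0.5, although any single run of 100 flips will usually differ a little from an exact even split.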
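Since there are only two elements in `coin_outcomes`, the probability that we “flip” heads is 0.5. To simulate an unfair coin, we can add an argument called `prob`, which provides a vector of two probability weights.

```{r sim-unfair-coin}
sim_unfair_coin <- sample(coin_outcomes, size = 100, replace = TRUE,
                          prob = c(0.2, 0.8))
```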
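`prob = c(0.2, 0.8)` indicates that for the two elements in the `coin_outcomes` vector, we want to select the first one, `heads`, with probability 0.2 and the second one, `tails`, with probability 0.8. Another way of thinking about this is to picture the hat as holding ten slips of paper, two that say `heads` and eight that say `tails`, so that each draw has a 20% chance of heads and an 80% chance of tails.
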
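
The effect of the weights can be checked with a quick simulation. The sketch below compares a weighted draw using `prob = c(0.2, 0.8)` with an unweighted draw from a hypothetical hat of ten slips, two `heads` and eight `tails`. The seed, the number of draws, and the names `n_draws`, `weighted`, `hat`, and `from_hat` are assumptions made for illustration.

```r
set.seed(42)      # arbitrary seed, only so the sketch is reproducible
n_draws <- 10000  # a large number of draws so the proportions settle down

# Weighted draw from the two outcomes
weighted <- sample(c("heads", "tails"), size = n_draws, replace = TRUE,
                   prob = c(0.2, 0.8))

# Unweighted draw from a hypothetical hat with 2 "heads" and 8 "tails" slips
hat <- rep(c("heads", "tails"), times = c(2, 8))
from_hat <- sample(hat, size = n_draws, replace = TRUE)

# Both proportions of heads should be close to 0.2
mean(weighted == "heads")
mean(from_hat == "heads")
```

The two approaches describe the same probability model; the `prob` argument simply avoids having to build the longer vector.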
Exercise: In your simulation of flipping the unfair
coin 100 times, how many flips came up heads?
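By default, if you don't provide a `prob` argument, all elements in the `coin_outcomes` vector have an equal probability of being drawn. If you want to learn more about `sample` or any other function, recall that you can check its help file with `?sample`.

```{r sim-basket}
shot_outcomes <- c("H", "M")
sim_basket <- sample(shot_outcomes, size = 1, replace = TRUE)
```
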
Exercise: What change needs to be made to the `sample` function so that it reflects a shooting percentage of 45%? Make this adjustment, then run a simulation to sample 133 shots. Assign the output of this simulation to a new object called `sim_basket`.

```{r}
# type your code for the Exercise here, and Knit

```

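Note that we've named the new vector `sim_basket`, the same name that we gave to the previous vector, so R overwrites the old object with the new one. With the results of the simulation saved as `sim_basket`, we have the data needed to compare Kobe's shots to those of the simulated independent shooter.
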
Exercise: Using `calc_streak`, compute the streak lengths of `sim_basket`, and save the results in a data frame called `sim_streak`. Note that since the `sim_basket` object is just a vector and not a variable in a data frame, we don't need to first select it from a data frame like we did earlier when we calculated the streak lengths for Kobe's shots.

```{r sim-streak-lengths}
# type your code for the Exercise here, and Knit

```

Exercise: Make a plot of the distribution of simulated
streak lengths of the independent shooter. What is the typical streak
length for this simulated independent shooter with a 45% shooting
percentage? How long is the player’s longest streak of baskets in 133
shots?

```{r plot-sim-streaks}
# type your code for the Exercise here, and Knit

```

Exercise: What concepts from the course videos are
covered in this lab? What
concepts, if any, are not covered in the videos? Have you seen these
concepts
elsewhere, e.g. textbook, previous labs, or practice problems?
This is a derivative of an OpenIntro lab, and is released under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States license.
}