Please answer these questions in a google document and share it with me when you have completed this lab.

Day 1: The Famous Unknown Saint Ann’s Basketball Player

Recall our observer who collects data on Saint Ann’s basketball players and find the distribution of true shooting percentages:

Bin	Bin Average	Proportion of Basketball Players
60 -70 %	0.65	0.06
50 - 60 %	0.55	0.25
40 - 50 %	0.45	0.38
30 - 40 %	0.35	0.25
20 - 30 %	0.25	0.06

In class, we (special thanks to Brooks) answered a few questions about this using R and I want to go over those answers now since they will be helpful later.

1) What is the average shooting percentage for Saint Ann’s Basketball players?

Reed recognized that the answer must be 45 % since the distribution of shooting abilities is symmetric around 45 %. We could also calculate this using our expected value/weighted average formula:

\[ E[X] = \Sigma \ x \cdot P(x)\] which in our case means:

x <- c(0.65, 0.55, 0.45, 0.35, 0.25) #shooting percentages for each bin
px <- c(0.06, 0.25, 0.38, 0.25, 0.06) #proportion of players in each bin

ex <- sum(x*px)
ex

## [1] 0.45

2) What is the standard deviation in shooting percentage for Saint Ann’s basketball players?

We calculated the variance using the formula formula:

\[ Var[X] = \Sigma \ P(x) \cdot (x-E[X])^2 \] and then took the square root of the variance to find the standard deviation:

varx <- sum(px*(x-ex)^2)
sqrt(varx)

## [1] 0.09899495

In other words, according to this (fake) data Saint Ann’s basketball players have mean true talent shooting ability of 45 % with a standard deviation of 10 %.

Now, here’s where things start to get tricky…

3) An unknown player takes one shot and makes it. How likely is she to be in each shooting percentage bucket?

This is where Bayes Formula comes in, written as an odds ratio we have:

\[odds(A|B) = odds(A) \cdot \frac{P(B|A)}{P(B|\bar{A})}\]

When instead of just \(A\) and \(\bar{A}\), not A, we have a distribution of possibilities we can write Bayes’ Formula as:

\[posterior\ ratio = prior\ ratio \cdot likelihood\ ratio\]

In this case, our prior distribution is the proportion of Saint Ann’s basketball players in each bucket. Since our unknown player made a basket, the likelihood is the likelihood that a player who belongs to each basket would make a bakset which is the same as the shooting percentage for each basket.

prior <- c(0.06, 0.25, 0.38, 0.25, 0.06) #proportion of players in each bin
likelihood <- c(0.65, 0.55, 0.45, 0.35, 0.25) #shooting percentages for each bin

posterior <- prior*likelihood
posterior

## [1] 0.0390 0.1375 0.1710 0.0875 0.0150

sum(posterior)

## [1] 0.45

Note, that our posterior’s don’t add to 1 which they should if these are the only possible shooting abilities. They are in the correct ratio relative to each other, however, so to convert these to proper probabilities we simply to to “normalize” them so that they add to 1. We can do that by calculating the posterior probabilities as:

posterior <- prior*likelihood/sum(prior*likelihood)
posterior

## [1] 0.08666667 0.30555556 0.38000000 0.19444444 0.03333333

sum(posterior)

## [1] 1

Q1: Based on the calculations above, what is the probability that our unknown shooter (who we saw make one shot) is a 50- 60 % shooter?

Q2: How does this compare to our prior probability that she was a 50-60 % shooter (prior meaning whot we thought before we saw her shoot)?

4) Now that we’ve updated our understanding of our unknown shooter’s ability to take into account that she made a shot, what is her expected true shooting ability?

To answer this question, we can use our same old formula for expected value:

\[E[X] = \Sigma \ x \cdot P(x)\]

but use our posterior probabilities in place of the prior probabilities (which were just the proportion of all players in each bin).

ex <- sum(x*posterior)
ex

## [1] 0.4717778

Q3: By how much did hitting one shot increase our estimate of her shooting ability?

5) What is the uncertainty (meaning standard deviation) in our estimate of her shooting ability?

Since we’ve only seen her shoot one shot, there’s still quite a bit of uncertainty in how good of a shooter she is. We can calculate our uncertainty using our formula for standard deviation:

\[ SD[X] = \sqrt{\Sigma \ P(x) \cdot (x-E[X])^2}\]

Once again, we’ll use our posterior probabilities in place of the prior probabilities.

sdx <- sqrt(sum(posterior*(x-ex)^2))
sdx

## [1] 0.09656981

Q4: How does our uncertainty in her shooting ability compare with the standard deviation in shooting ability for the population?

Collecting More Evidence

Now, let’s imagine that we stick around the gym for long enough to see our unknown shooter than nine more shots (ten total) and that all told she has made eight of the ten shots we have witnessed. What can we say about her shooting ability. Once again, we can use Bayes’ Formula to find the answer.

\[posterior = prior \cdot likelihood\] Our prior is the same as it was in the last problem (the population of Saint Ann’s shooters) but the likelihood had change. The likelihood we need is the likelihood that that a shooter would make 8 or 10 shoots given each of the possible ability levels. Thankfully, we can use the binomial formula (and R’s dbinom function) to figure this out.

x <- c(0.65, 0.55, 0.45, 0.35, 0.25) #possible shooting abilities
likelihood <- dbinom(8, 10, prob=x)
likelihood

## [1] 0.1756529531 0.0763025509 0.0228895894 0.0042813781 0.0003862381

Q5: Based on the calculation above, what is the probability that a 55 % shooter would make 8 or 10 shots?

The next part of our calculation is the same as before:

posterior <- prior*likelihood/sum(prior*likelihood)
posterior

## [1] 0.2674485147 0.4840748842 0.2207268085 0.0271617079 0.0005880847

Q5: Based on the calculation above, how likely is it that our shooter is a 55% shooter given that she made 8 of 10 shots?

Q6: Using code from above, calculate the expected value of her shooting ability. This is our best guess as to her true shooting ability or, put another way, if we were tasked with setting a betting line on the chance that she will make her next shot, this is the probability we would use.

Q7: Is your result above consistent with what you would expect based on regression towards the mean? Explain.

Q8: Using code from above, calculate the uncertainty (standard deviation) in our estimate of her shooting ability. Be careful to use the correct ex in your calculation.

Q9: How does the uncertainty in her shooting ability after 10 shots compare with the uncertainty in her shooting ability after 1 shot? Is that what you would expect? Explain.

Q10: Repeat your calculations above but for a shooter who has made 70 of 100 shots.

Day 2: Continuous Priors

On day 1, we assumed that there were only 5 possible shooting ability for our unknown hero which is of course unrealistic. A priori (meaning before we see her shoot) her true shooting percentage could be anything between 0 and 1. In what follows, we’re attempt to use something approaching a continuous prior allowing for the possibilities of any shooting ability. For the sake of our calculations, we’ll still approximate these continuous priors with using a finite number of possibilities.

A Uniform Prior

Let’s start with the simplest (although not the most reasonable) prior – a uniform prior. We’ll assume that “a priori” (before we saw her hit 8 of 10 free throws, say) that we thought that every possible ability level was equally likely. We can create and plot a uniform prior as follows:

prob.range <- seq(0, 1, .001)
unif.prior <- rep(1,length(prob.range))
plot(prob.range, unif.prior, type="l", main="Uniform Prior")

The next thing we need to is to estime the likelihood of hitting 8 of 10 shots given every possible level of ability.

likelihood <- dbinom(8, 10, prob.range)
plot(prob.range, likelihood, type="l", main="Likelihood Distribution")

Hitting 8 of 10 shots much more likely for a good shooter than a bad shooter and this will affect our posterior (our estimate of her shooting ability after seeing her results). To calculate the posterior, we multiply our prior by our likelihood and then normalize it so that the our posterior probabilities add to 1. Just as in day 1, this normalization can be done by simply dividing the products of likelihoods and priors by the sum of the products.

posterior <- unif.prior*likelihood/sum(unif.prior*likelihood)
plot(prob.range, posterior, type="l", main="Posterior Distribution")

Our posterior distirbution shows which possible abilities we find most likely in light of the results we’ve seem. Notice how the posterior distibution looks just like the likelihood distribution. This is because we used a uniform prior. Before trying a different prior, let’s summarize our posterior distribution. We can do this by finding its mean and its standard deviation. We can calculate these using the same formulas we used to find the expected value and standard deviation of random variables:

\[ E[X] = \Sigma x \cdot p(x) \] \[ VAR[X] = \Sigma (x - E[x])^2 \cdot p(x) \] \[ SD[X] = \sqrt Var[X] \]

(EX = sum(prob.range*posterior))

## [1] 0.75

(SDX = sqrt(sum((prob.range-EX)^2*posterior)))

## [1] 0.1200961

Q11: What would the mean and standard deviation of our posterior distribution be if she had taken 100 shots and made 70 of theem (Hint: try rewriting the code for the likelihoods and then rerunning the code that comes afterwards.)?

A Beta (and better) Prior

Statisticians spend a lot of time thinking about distributions. There’s a long list of notable distributions that a statistician has at their disposal some of which are continuous (like the normal distribution and the uniform distribution) and can take on any value within a certain range and some of them are discrete (the the binomial distribution which can only take on integer values).

One useful continuous distribution is the Beta distribution. We won’t get into the particulars here but for our purposes there are two nice things about the Beta distribution: 1) It takes on values between 0 and 1 (and surely someone’s shooting percentage falls within this range) and 2) It’s pretty flexible - we can make our Beta distribution take on a wide variety of shapes by changing its two parameters (usually called alpha and beta but R calls them shape1 and shape2).

Take a look a the beta distribution with parameters 45 and 55:

beta.prior <- dbeta(prob.range, 45, 55)
plot(prob.range, beta.prior, type="l", main="beta(45,55) Prior ")

Notice that this distribution peaks near 45%. This is no coincidence, as an alpha of 45 and beta of 55 means that “a priori” (before we’ve seen our basketball player’s performance) we have a mean expectation of 45 successes for every 55 failures. The total of our parameters (45 and 55) has a meaning as well – it tells us how much we regress someone’s observed data towards our prior (remember regression towards the mean?). With parameters 45 and 55, we would regress a player with 100 (45+55) shot attempts half the way towards the mean.

What if we think that expecting a 45% shooter is reasonable but think that’s way too much regression to the mean. Then we could try smaller parameters in the same ratio. Lets take a look at beta(4.5,5.5):

beta.prior <- dbeta(prob.range, 4.5, 5.5)
plot(prob.range, beta.prior, type="l", main="beta(4.5,5.5) Prior ")

Let’s evaluate our basketball player, who hit 8 of 10 shots, using this beta(4.5,5.5) prior rather than a uniform prior:

beta.prior <- dbeta(prob.range, 4.5, 5.5)
likelihood <- dbinom(8, 10, prob.range)
posterior <- beta.prior*likelihood/sum(beta.prior*likelihood)
plot(prob.range, posterior, type="l", main="Posterior Distribution")

(EX = sum(prob.range*posterior))

## [1] 0.625

(SDX = sqrt(sum((prob.range-EX)^2*posterior)))

## [1] 0.1056443

Q12: How does the mean and standard deviation of our posterior with a beta(4.5,5.5) prior compare with the mean and standard deviation with the uniform(0,1) prior? (Note: uniform(0,1) means that every value 0 to 1 is equally likely.)

Q13: Create a beta prior that you think is reasonable for a high school basketball player and find the posterior distribution and the mean and standard deviation of the posterior using your prior.

Q14 (Challenge): Try applying this idea to something other than shooting percentages. Some possibilities include winning percentages for teams, batting averages for batters, completion percentages for quarterbacks and penalty kick save percentages for goalies (but you can use whatever you like). Try to find a beta distribution that you think describes the population of true talent. Then, explore different possible results and see what inferences Bayes Formula would lead you to make about the player in question.

Q15 (Challenge): What if you didn’t want to merely guess at the prior distribution? How would you go about estimating the correct prior distribution using data?

Bayesian Analysis: a two-day lab

Sports Data Science

Day 1: The Famous Unknown Saint Ann’s Basketball Player

Collecting More Evidence

Day 2: Continuous Priors

A Uniform Prior

A Beta (and better) Prior