2E4. The Bayesian statistician Bruno de Finetti (1906–1985) began his 1973 book on probability theory with the declaration: “PROBABILITY DOES NOT EXIST.” The capitals appeared in the original, so I imagine de Finetti wanted us to shout this statement. What he meant is that probability is a device for describing uncertainty from the perspective of an observer with limited knowledge; it has no objective reality. Discuss the globe tossing example from the chapter, in light of this statement. What does it mean to say “the probability of water is 0.7”?
In contrast, Bayesian estimates are valid for any sample size. This does not mean that more data isn’t helpful—it certainly is. Rather, the estimates have a clear and valid interpretation, no matter the sample size. But the price for this power is dependency upon the initial plausibilities, the prior. If the prior is a bad one, then the resulting inference will be misleading.
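A minimal sketch of this point in base R. The helper names `grid_post` and `post_sd` and the toss counts (1 water in 1 toss, 7 in 10) are illustrative assumptions, not values from the text. The posterior is a properly normalized distribution in both cases; the smaller sample simply leaves it wider.

```r
# Hypothetical helper: grid posterior for `w` successes in `n` trials
# under a uniform prior. Names and counts are illustrative assumptions.
grid_post <- function(w, n, grid_points = 100) {
  grid <- seq(from = 0, to = 1, length.out = grid_points)
  prior <- rep(1, grid_points)
  likelihood <- dbinom(w, size = n, prob = grid)
  unnormalized <- prior * likelihood
  data.frame(grid = grid, posterior = unnormalized / sum(unnormalized))
}

# Hypothetical helper: standard deviation of a grid posterior
post_sd <- function(d) {
  m <- sum(d$grid * d$posterior)
  sqrt(sum((d$grid - m)^2 * d$posterior))
}

small <- grid_post(w = 1, n = 1)   # a single observation
large <- grid_post(w = 7, n = 10)  # ten observations

# Both posteriors sum to 1, so both are interpretable as distributions
sum(small$posterior)
sum(large$posterior)

# The spread shrinks as data accumulate
post_sd(small)
post_sd(large)
```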
Why sampling?
A grid is simply a selection of values that the parameter(s) of interest can take. It is a way to discretize a continuous distribution.
library(tibble)  # provides tibble()
# Store draws (1 = R, 0 = B)
draws <- c(1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1)
# We need to define how coarse the grid is
grid_points <- 100
# Define Bayes' theorem through grid approximation
grid_posterior <- tibble(
  # GRID OF PARAMETER VALUES
  grid = seq(from = 0,
             to = 1,
             length.out = grid_points),
  # UNINFORMATIVE PRIOR
  # LIKELIHOOD
  # POSTERIOR
)
An uninformative prior is a uniform distribution in which every parameter value has the same probability as every other.
library(tibble)  # provides tibble()
# Store draws (1 = R, 0 = B)
draws <- c(1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1)
# We need to define how coarse the grid is
grid_points <- 100
# Define Bayes' theorem through grid approximation
grid_posterior <- tibble(
  # GRID OF PARAMETER VALUES
  grid = seq(from = 0,
             to = 1,
             length.out = grid_points),
  # UNINFORMATIVE PRIOR
  prior = 1,
  # LIKELIHOOD
  # POSTERIOR
)
The likelihood is the probability of the data given a specific parameter value, P(D|p). Our data consist of red and black cards, so we are asking: what is the probability of observing the number of red cards we actually saw in N draws?
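As a concrete check of what the likelihood term computes, here is a small sketch for a single candidate value. The counts (7 red cards in 11 draws) come from the `draws` vector; the candidate proportion `p = 0.5` and the variable names are assumptions chosen for illustration.

```r
# Counts taken from the draws vector: 7 red cards in 11 draws.
n_red <- 7
n_draws <- 11
# Assumed candidate value for the proportion of red cards
p <- 0.5

# Binomial probability from the built-in density function
lik_dbinom <- dbinom(n_red, size = n_draws, prob = p)

# The same quantity written out: choose(N, k) * p^k * (1 - p)^(N - k)
lik_manual <- choose(n_draws, n_red) * p^n_red * (1 - p)^(n_draws - n_red)

# The two should agree up to floating-point error
all.equal(lik_dbinom, lik_manual)
```

Passing the whole grid as `prob` repeats this calculation once per candidate value, which is what the likelihood column does.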
library(tibble)  # provides tibble()
# Store draws (1 = R, 0 = B)
draws <- c(1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1)
# We need to define how coarse the grid is
grid_points <- 100
# Define Bayes' theorem through grid approximation
grid_posterior <- tibble(
  # GRID OF PARAMETER VALUES
  grid = seq(from = 0,
             to = 1,
             length.out = grid_points),
  # UNINFORMATIVE PRIOR
  prior = 1,
  # LIKELIHOOD
  likelihood = dbinom(sum(draws), size = length(draws), prob = grid),
  # POSTERIOR
)
We have the grid, the prior, and the likelihood. Let’s apply Bayes’ theorem.
library(tibble)  # provides tibble()
# Store draws (1 = R, 0 = B)
draws <- c(1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1)
# We need to define how coarse the grid is
grid_points <- 100
# Define Bayes' theorem through grid approximation
grid_posterior <- tibble(
  # GRID OF PARAMETER VALUES
  grid = seq(from = 0,
             to = 1,
             length.out = grid_points),
  # UNINFORMATIVE PRIOR
  prior = 1,
  # LIKELIHOOD
  likelihood = dbinom(sum(draws), size = length(draws), prob = grid),
  # POSTERIOR (normalized so that it sums to 1 over the grid)
  posterior = (prior * likelihood) / sum(prior * likelihood)
)
2M1. Recall the globe tossing model from the chapter. Compute and plot the grid approximate posterior distribution for each of the following sets of observations. In each case, assume a uniform prior for p. (1) W, W, W (2) W, W, W, L (3) L, W, W, L, W, W, W
2M2. Now assume a prior for p that is equal to zero when p < 0.5 and is a positive constant when p ≥ 0.5. Again compute and plot the grid approximate posterior distribution for each of the sets of observations in the problem just above.
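One way the step prior described in 2M2 might be expressed on a grid, shown only for the first set of observations (W, W, W). This is a sketch in base R, not a reference solution; the variable names and the choice of 100 grid points are assumptions.

```r
# Sketch only: step prior on a grid, for the observations W, W, W.
grid <- seq(from = 0, to = 1, length.out = 100)

# Zero below 0.5, a positive constant at or above 0.5.
# The constant's value does not matter because the posterior is normalized.
step_prior <- ifelse(grid < 0.5, 0, 1)

# Probability of 3 waters in 3 tosses for each candidate p
likelihood <- dbinom(3, size = 3, prob = grid)

# Normalize so the posterior sums to 1 over the grid
posterior <- step_prior * likelihood / sum(step_prior * likelihood)

plot(grid, posterior, type = "l", xlab = "p", ylab = "posterior")
```

The posterior is exactly zero wherever the prior is zero, no matter what the data say.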
If you don’t have a strong argument for any particular prior, then try different ones. Because the prior is an assumption, it should be interrogated like other assumptions: by altering it and checking how sensitive inference is to the assumption. No one is required to swear an oath to the assumptions of a model, and no set of assumptions deserves our obedience.
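A sensitivity check of this kind could be sketched as follows, reusing the card draws from the grid example. The second prior, `dbeta(grid, 2, 2)`, is an arbitrary alternative picked for illustration, and the names `priors` and `posterior_means` are hypothetical.

```r
# Card draws from the grid example (1 = R, 0 = B)
draws <- c(1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1)
grid <- seq(from = 0, to = 1, length.out = 100)
likelihood <- dbinom(sum(draws), size = length(draws), prob = grid)

# Two candidate priors; the "peaked" one is an arbitrary assumption
priors <- list(
  uniform = rep(1, length(grid)),
  peaked  = dbeta(grid, 2, 2)  # mildly favors values near 0.5
)

# Posterior mean of p under each prior
posterior_means <- sapply(priors, function(prior) {
  posterior <- prior * likelihood / sum(prior * likelihood)
  sum(grid * posterior)
})

round(posterior_means, 3)
```

If the summaries barely move across reasonable priors, the inference is not very sensitive to that assumption; if they move a lot, the prior deserves closer scrutiny.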