Bayesian inference is a method for figuring out unknown or unobservable quantities given known facts. In the case of the Enigma machine, Alan Turing wanted to figure out the unknown settings of the wheels and ultimately the meaning of the coded messages.
When analyzing data, we are also interested in learning about unknown quantities. For example, say that we are interested in how daily ice cream sales relate to the temperature, and we decide to use linear regression to investigate this.
Probability
A Bayesian model for the proportion of success
prop_model(data)
data is a vector of successes and failures represented by 1s and 0s.
Trying out prop_model
data <- c()
prop_model(data)
data <- c(0)
prop_model(data)
## Warning: `data_frame()` is deprecated, use `tibble()`.
## This warning is displayed once per session.
data <- c(1, 0, 0, 1)
prop_model(data)
Looking at the final probability distribution at n=4, what information does the model have regarding the underlying proportion of heads?
# Update the data and rerun prop_model
data <- c(1, 0, 0, 1)
prop_model(data)
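To make the updating concrete, here is a minimal sketch of the kind of computation a model like prop_model could perform. It is not the course's implementation: the name prop_model_sketch, the n_draws argument, and the uniform Beta(1, 1) prior are all assumptions made for illustration. Under that assumed prior, observing s successes and f failures gives a Beta(1 + s, 1 + f) posterior, which the sketch samples from.

```r
# Hypothetical stand-in for prop_model(); NOT the course's implementation.
# Assumes a uniform Beta(1, 1) prior on the underlying proportion.
prop_model_sketch <- function(data, n_draws = 10000, prior_a = 1, prior_b = 1) {
  n_success <- sum(data == 1)
  n_failure <- sum(data == 0)
  # Conjugate update: a Beta prior with 0/1 data gives a Beta posterior
  rbeta(n_draws, prior_a + n_success, prior_b + n_failure)
}

set.seed(42)  # make the random draws reproducible
data <- c(1, 0, 0, 1)
posterior_sketch <- prop_model_sketch(data)

# The draws should center near 3 / 6 = 0.5, the mean of Beta(3, 3)
mean(posterior_sketch)
```

With two successes and two failures the sketch's posterior is symmetric around 0.5 but still wide, which is one way to read "what the model knows" after only four observations.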
# Update the data and rerun prop_model
data <- c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
prop_model(data)
The model implemented in prop_model makes more sense here, since we had no idea how effective the drug would be. The final probability distribution (at n = 13) represents what the model now knows about the underlying proportion of cured zombies. What proportion of zombies would we expect to turn human if we administered this new drug to the whole zombie population?
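One rough way to put a number on that question is to compute a posterior mean by hand. The sketch below assumes a uniform Beta(1, 1) prior, which matches the "no clue" description but is not confirmed to be what prop_model uses internally; the variable names are illustrative.

```r
# Assumption: uniform Beta(1, 1) prior, so the posterior is
# Beta(1 + successes, 1 + failures).
data <- c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)

n_success <- sum(data)                  # 2 cured
n_failure <- length(data) - n_success   # 11 not cured

post_a <- 1 + n_success
post_b <- 1 + n_failure

# Mean of a Beta(a, b) distribution is a / (a + b); here 3 / 15
post_mean <- post_a / (post_a + post_b)
post_mean
```

Under these assumptions the expected proportion is 3 / 15 = 0.2, slightly higher than the raw rate of 2 / 13 because the uniform prior pulls the estimate toward 0.5.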
data <- c(1, 0, 0, 1, 0, 0,
          0, 0, 0, 0, 0, 0, 0)
# Extract and explore the posterior
posterior <- prop_model(data)
head(posterior)
## [1] 0.2490093 0.2043833 0.2509618 0.3857773 0.4716986 0.1224137
# Plot the histogram of the posterior
hist(posterior)
# Edit the histogram
hist(posterior, breaks = 30, xlim = c(0, 1), col = "palegreen4")
data <- c(1, 0, 0, 1, 0, 0,
          0, 0, 0, 0, 0, 0, 0)
posterior <- prop_model(data)
hist(posterior, breaks = 30, xlim = c(0, 1), col = "palegreen4")
# Calculate the median
median(posterior)
## [1] 0.1878043
# Calculate the 90% credible interval
quantile(posterior, c(0.05, 0.95))
##         5%        95%
## 0.06113353 0.38891029
# Calculate the probability that the underlying proportion is above 0.07
sum(posterior > 0.07) / length(posterior)
## [1] 0.931
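As a cross-check on the sample-based summaries, the same quantities can be computed from a closed-form posterior. This is a sketch that assumes a uniform Beta(1, 1) prior, so that 2 successes and 11 failures give a Beta(3, 12) posterior; whether prop_model uses exactly this prior is an assumption, and the variable names are illustrative.

```r
# Assumption: uniform Beta(1, 1) prior with 2 successes and 11 failures,
# giving a Beta(3, 12) posterior.
a <- 1 + 2
b <- 1 + 11

# Posterior median
analytic_median <- qbeta(0.5, a, b)

# 90% credible interval (5th and 95th percentiles)
analytic_interval <- qbeta(c(0.05, 0.95), a, b)

# Probability that the underlying proportion is above 0.07
analytic_prob <- pbeta(0.07, a, b, lower.tail = FALSE)

analytic_median
analytic_interval
analytic_prob
```

If the assumption holds, these values should land close to the sample-based results above, differing only by simulation noise. For the sample-based version, mean(posterior > 0.07) is an equivalent and slightly shorter way to write sum(posterior > 0.07) / length(posterior).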
Michael is a hybrid thinker and doer—a byproduct of being a StrengthsFinder “Learner” over time. With nearly 20 years of engineering, design, and product experience, he helps organizations identify market needs, mobilize internal and external resources, and deliver delightful digital customer experiences that align with business goals. He has been entrusted with problem-solving for brands—ranging from Fortune 500 companies to early-stage startups to not-for-profit organizations.
Michael earned his BS in Computer Science from New York Institute of Technology and his MBA from the University of Maryland, College Park. He is also a candidate to receive his MS in Applied Analytics from Columbia University.