This code-through explores the concept of Fermi Estimates and the
difference between Bayesian and Frequentist approaches in R. If this
sounds like a lot of jargon at first, don’t worry, I felt the same way
when I first heard it, and it takes a bit to become familiar with the
terms. The thing is, you probably already have an intuition for some of
these concepts.
We make a lot of decisions every day (some say thousands!). Sure, you could Google that number, but it's fun (and instructive) to estimate it yourself. Doing so teaches you how to break down problems, make informed guesses, and refine those guesses if you collect data.

Specifically, in this code-through you'll learn to:

1. Perform a quick Fermi Estimate in R
2. Understand why you might want to model uncertainty (instead of using a single guess)
3. Appreciate the key differences between Frequentist and Bayesian approaches
So who or what is a Fermi Estimate? Is it the same as a back-of-the-envelope calculation? For our purposes, yes. And what is a back-of-the-envelope calculation (BOTEC)? It's a rough calculation you could sketch out on the back of an envelope: simple arithmetic built from reasonable assumptions, aiming for the right order of magnitude rather than a precise answer.
Great, that explains the envelope, but what is a Fermi? The name comes from the physicist Enrico Fermi, who was famous for producing surprisingly accurate estimates from very little information.
So, yes, there is a difference, but not a big one. Fermi Estimates are a kind of BOTEC, typically applied to quantities that are hard to measure directly, often on grand scales.
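To make this concrete, here is a minimal sketch of Fermi's classic question, "How many piano tuners are in Chicago?" Every number below is a deliberately rough assumption; getting the order of magnitude right is the whole point.

# Fermi's classic estimate: how many piano tuners are in Chicago?
# Every number here is a rough assumption.
chicago_pop       <- 3e6      # people living in Chicago
people_per_home   <- 2        # average household size
homes_with_piano  <- 1 / 20   # fraction of households owning a piano
tunings_per_year  <- 1        # each piano tuned about once a year
tunings_per_tuner <- 4 * 5 * 50  # ~4 tunings/day, 5 days/week, 50 weeks/year

pianos <- (chicago_pop / people_per_home) * homes_with_piano
pianos * tunings_per_year / tunings_per_tuner

## [1] 75

A few dozen piano tuners is the right ballpark, and we got there with nothing but a handful of assumptions and basic arithmetic.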
Let’s do a back-of-the-envelope calculation to guess how many decisions you make in a day.
## Assume:
## - Awake ~16 hours/day
## - 1 "decision" every 2 minutes (i.e., 0.5 decisions/minute)
hours_awake <- 16
decisions_per_minute <- 0.5 # 1 decision every 2 min
daily_decisions <- hours_awake * 60 * decisions_per_minute
daily_decisions

## [1] 480
See how simple that was? Of course, we made some big assumptions (for instance, that the rate never changes), but it's a starting point.
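One quick way to see how much that single assumption matters is to rerun the arithmetic with a low and a high guess for the rate. The 0.3 and 0.7 bounds below are made-up illustrative values (we'll reuse this rough range shortly).

# How sensitive is the estimate to the assumed rate?
# 0.3 and 0.7 are illustrative low/high guesses
rates <- c(low = 0.3, guess = 0.5, high = 0.7)
rates * hours_awake * 60

##   low guess  high 
##   288   480   672 

A factor-of-two swing from one assumption is a good hint that a single number hides a lot of uncertainty.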
Before we talk about why Frequentist methods don’t work for BOTECs, let’s quickly compare the two main statistical approaches.
At a high level, Frequentists and Bayesians think about uncertainty differently:
Here’s a quick comparison table:
| Aspect | Frequentist Approach | Bayesian Approach |
|---|---|---|
| What is probability? | The long-run frequency of an event happening in repeated trials. | A degree of belief about an event, which updates with new data. |
| How do we treat parameters? | Fixed but unknown. There is one ‘true’ answer, we just don’t know it. | A probability distribution. We don’t assume a single ‘true’ value. |
| How do we get results? | Collect repeated samples and compute a confidence interval. | Start with a prior belief and update it using Bayes’ Theorem. |
| What do we need? | Lots of data! Frequentists rely on repeated sampling to estimate parameters. | A prior belief + data (if available). Can reason even with little data. |
| Can it handle BOTECs? | ❌ No—requires observed data and repeated trials. | ✅ Yes—can model uncertainty even without direct observations. |
A Frequentist approach is based on the idea that there is a fixed but unknown true parameter (e.g., “decisions per minute”) and that we can estimate it by collecting data and analyzing repeated samples.
But here’s the problem:
- BOTECs don't rely on observed data: they are based on logical reasoning and estimation, not repeatable experiments.
- Frequentist statistics assume repeated trials: without actual data from repeated decision-making observations, you cannot construct a confidence interval or perform hypothesis tests.
- Frequentists don't assign probabilities to parameters: they estimate values based on collected data, but in a BOTEC we aren't working with sampled data at all.
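For contrast, here is what the Frequentist route would actually require: observed data. The vector below is hypothetical (decisions-per-minute rates from eight imagined observation sessions, invented purely for illustration); only with a sample like this can you compute a confidence interval at all.

# Hypothetical observed rates (decisions per minute) from 8 sessions.
# These numbers are invented purely for illustration.
observed_rates <- c(0.45, 0.52, 0.38, 0.60, 0.47, 0.55, 0.41, 0.58)

# A Frequentist 95% confidence interval needs a sample like this
t.test(observed_rates)$conf.int

Remove observed_rates and there is nothing for t.test() to work with, which is exactly why Frequentist tools stall on a BOTEC.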
So, if Frequentist statistics can’t be used for BOTECs, what do we do instead?
Unlike Frequentists, Bayesians treat unknown parameters as probability distributions. This means we can start with an initial belief (a prior), then refine it with new information.
BOTECs are all about reasoning under uncertainty, which is exactly what Bayesian methods are designed to do.
Let's say we expect most people make between 0.3 and 0.7 decisions per minute, but we're not sure. Instead of picking one number, we can use a Beta(5,5) prior to represent this uncertainty. The Beta distribution is a natural choice because it lives on the interval from 0 to 1, which fits a per-minute rate that (under our framing) can't exceed one decision per minute.
What does this prior actually say? A Beta(5,5) distribution is:

- Centered around 0.5 (both parameters are equal)
- Somewhat confident, but not too rigid
- Symmetric, meaning it doesn't skew left or right
Think of Beta(5,5) like saying:
“I believe there’s a decent chance my rate is ~0.5, but it could be
lower or higher.”
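To put a number on "lower or higher", we can ask the prior directly for a central 90% interval using qbeta(). This is just a sanity check on what Beta(5,5) actually claims.

# Central 90% interval implied by the Beta(5,5) prior
qbeta(c(0.05, 0.95), shape1 = 5, shape2 = 5)

For Beta(5,5) this comes out to roughly 0.25 to 0.75 decisions per minute, comfortably covering our loose 0.3-to-0.7 hunch.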
Now, let’s visualize it:
set.seed(123) # for reproducibility
alpha_prior <- 5
beta_prior <- 5
prior_samples <- rbeta(10000, alpha_prior, beta_prior)
hist(prior_samples,
breaks = 30,
col = "skyblue",
main = "Beta(5,5) Prior Distribution",
xlab = "Decisions per Minute")
abline(v = mean(prior_samples), col = "red", lwd = 2)

Now, let's take our Beta(5,5) prior and use it to simulate a range of possible daily decisions.
hours_awake <- 16
minutes_awake <- hours_awake * 60
daily_decisions_dist <- prior_samples * minutes_awake
hist(daily_decisions_dist,
breaks = 30,
col = "skyblue",
main = "Distribution of Estimated Daily Decisions (Beta(5,5) Prior)",
xlab = "Daily Decisions")
abline(v = mean(daily_decisions_dist), col = "red", lwd = 2)
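Because we now have an entire distribution rather than a single number, we can summarize it any way we like; for example, a central 95% interval of plausible daily decision counts (a Bayesian credible interval):

# Central 95% interval of the simulated daily decisions
quantile(daily_decisions_dist, probs = c(0.025, 0.5, 0.975))

With these draws the interval lands at roughly 200 to 760 decisions per day, which is a far more honest statement than the single 480 we started with.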
Now that we have seen how Bayesian methods allow us to model uncertainty through a Beta(5,5) prior and simulate a range of daily decisions, let us revisit the comparison between the two major statistical viewpoints. Below is the comparison table again, this time including how each approach expresses uncertainty:
| Aspect | Frequentist Approach | Bayesian Approach |
|---|---|---|
| What is probability? | The long-run frequency of an event happening in repeated trials. | A degree of belief about an event that updates as new data become available. |
| How do we treat parameters? | Fixed but unknown, meaning there is one ‘true’ value that we do not know. | A probability distribution, meaning we represent our uncertainty about the parameter. |
| How do we express uncertainty? | Confidence intervals computed from repeated sampling. | Credible intervals that directly state the probability of the parameter lying in a given range. |
| What do we need? | A large amount of data from repeated experiments. | A prior belief, which can be updated even with limited data. |
| Can it handle BOTECs? | No, because it requires observed data. | Yes, it can model uncertainty even without direct observations. |
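One row in that table deserves a demonstration: updating beliefs as new data become available. Suppose you actually tracked yourself for an hour and made a decision in 24 of the 60 minutes; these counts are hypothetical, invented just to show the mechanics. Treating each minute as a yes/no trial, the Beta prior combines with that data through a standard conjugate update, so refining the estimate is a single line of arithmetic (reusing alpha_prior, beta_prior, and minutes_awake from above).

# Hypothetical data: a decision was made in 24 of 60 observed minutes
n_minutes   <- 60
n_decisions <- 24

# Conjugate Beta-Binomial update: Beta(5, 5) prior -> Beta(5 + 24, 5 + 36) posterior
alpha_post <- alpha_prior + n_decisions
beta_post  <- beta_prior + (n_minutes - n_decisions)

posterior_samples <- rbeta(10000, alpha_post, beta_post)
mean(posterior_samples) * minutes_awake  # updated daily estimate

The posterior mean sits near 0.41 decisions per minute, pulling the daily estimate down toward roughly 400; every additional observed minute would pull it further toward whatever the data say.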
We have covered a lot, including how to perform a Fermi Estimate to roughly gauge the number of decisions made per day, and why Bayesian reasoning is the natural choice for back-of-the-envelope calculations (BOTECs). In summary:
library(knitr)       # for kable()
library(kableExtra)  # for kable_styling()

decision_table <- data.frame(
Situation = c(
"Having a large dataset and repeated experiments",
"Quantifying uncertainty from limited or no data",
"Making a back-of-the-envelope calculation without data",
"Updating beliefs as new data become available"
),
Frequentist = c("Yes", "No", "No", "No"),
Bayesian = c("Yes", "Yes", "Yes", "Yes"),
stringsAsFactors = FALSE
)
kable(decision_table, format = "html", escape = FALSE, caption = "When to Use Each Approach") %>%
kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover"))

| Situation | Frequentist | Bayesian |
|---|---|---|
| Having a large dataset and repeated experiments | Yes | Yes |
| Quantifying uncertainty from limited or no data | No | Yes |
| Making a back-of-the-envelope calculation without data | No | Yes |
| Updating beliefs as new data become available | No | Yes |