From How to Read Numbers – Chapter 4
A biased sample occurs when the group you study isn’t representative of the larger population you’re trying to understand.
Even large samples can be misleading if they’re biased.
“Cheese on toast is Britain’s favourite lockdown snack!”
— The Sun, April 2020
👉 Key question: Does this sample reflect the whole population?
Many polls use convenience sampling:
- Twitter polls
- Online surveys
- Street interviews

But these groups are not random:
- Twitter users skew younger, more urban, more politically engaged.
- Landline phone polls miss mobile-only households.
More data ≠ better accuracy if the sample is biased.
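To make this concrete, here is a minimal simulation sketch. Every number in it is invented for illustration: a hypothetical population that is 30% "young" (70% back candidate A) and 70% "older" (40% back A), a small poll with the right mix, and a much larger poll in which young respondents make up 80% of the sample.

```python
import random

# Hypothetical population (all figures invented for illustration).
TRUE_SHARE_YOUNG = 0.30
SUPPORT = {"young": 0.70, "older": 0.40}  # probability of backing candidate A
TRUE_SUPPORT = (TRUE_SHARE_YOUNG * SUPPORT["young"]
                + (1 - TRUE_SHARE_YOUNG) * SUPPORT["older"])  # about 0.49

def run_poll(n, share_young, rng):
    """Return the fraction of n respondents backing A when a fraction
    `share_young` of the respondents comes from the young group."""
    yes = 0
    for _ in range(n):
        group = "young" if rng.random() < share_young else "older"
        if rng.random() < SUPPORT[group]:
            yes += 1
    return yes / n

rng = random.Random(0)  # fixed seed so the run is repeatable

# Small poll whose mix of respondents matches the population.
small_random = run_poll(1_000, TRUE_SHARE_YOUNG, rng)
# Much larger poll that over-represents young respondents.
large_biased = run_poll(100_000, 0.80, rng)

print(f"true support:          {TRUE_SUPPORT:.3f}")
print(f"small random poll:     {small_random:.3f}")
print(f"large biased poll:     {large_biased:.3f}")
```

The large poll settles very precisely on a value near 0.64, which is the support level among its own skewed respondents, not the population's 0.49. The small poll is noisier but centred on the right answer.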
After a 2019 UK TV debate:
- YouGov (landline + online): Johnson won 48%–46%
- Twitter polls: Corbyn won by wide margins

Why the difference?
- Twitter ≠ UK electorate
- Younger, more Labour-leaning users dominate Twitter
→ Sampling bias, not media bias.
When samples aren’t perfect, statisticians weight responses:
Example:
- Your survey has 40% women, but the population is 50% women.
- You give each woman’s response a weight of 1.25 (50 ÷ 40), and each man’s response a weight of about 0.83 (50 ÷ 60).
This works only if you know the true population characteristics (e.g., from census data).
But weighting can’t fix everything—especially if key groups are missing entirely.
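The weighting arithmetic can be sketched in a few lines. This is an illustrative sketch, not a production survey method: the function names `poststratify` and `weighted_mean` and the "yes" rates are hypothetical, and only the 40%/50% split comes from the example above.

```python
def poststratify(sample_share, population_share):
    """Weight for each group = population share / sample share."""
    weights = {}
    for group, pop in population_share.items():
        if sample_share.get(group, 0) == 0:
            # A group with no respondents cannot be weighted up from nothing.
            raise ValueError(f"no respondents in group {group!r}; cannot weight")
        weights[group] = pop / sample_share[group]
    return weights

def weighted_mean(group_means, sample_share, weights):
    """Combine per-group averages using the weighted share of each group."""
    num = sum(group_means[g] * sample_share[g] * weights[g] for g in weights)
    den = sum(sample_share[g] * weights[g] for g in weights)
    return num / den

sample_share = {"women": 0.40, "men": 0.60}      # who answered the survey
population_share = {"women": 0.50, "men": 0.50}  # e.g. from census data

weights = poststratify(sample_share, population_share)

# Hypothetical answers: share of each group saying "yes".
yes_rate = {"women": 0.60, "men": 0.40}

raw = sum(yes_rate[g] * sample_share[g] for g in sample_share)
adjusted = weighted_mean(yes_rate, sample_share, weights)

print(weights)                                     # women about 1.25, men about 0.83
print(f"raw: {raw:.2f}, weighted: {adjusted:.2f}")
```

With these made-up rates the unweighted estimate is 0.48 and the weighted one is 0.50. The `ValueError` branch reflects the limit noted above: if a group has no respondents at all, there is nothing to multiply.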
Bias isn’t just who you ask—it’s how you ask.
With 600 lives at risk:
“Should 200 lives be saved?” → 72% say yes
“Should 400 people die?” → Only 22% say yes
Same outcome, different framing.
This is the framing effect—a cognitive bias that distorts responses.
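In 1936, the Literary Digest polled about 2.4 million people and predicted Alf Landon would beat FDR in the US presidential election.
But its mailing list was drawn largely from telephone directories and car registrations, and in 1936, phones and cars = wealthy → Republican-leaning.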
Result? FDR won in a landslide.
Meanwhile, George Gallup surveyed just 50,000 people—but used a representative sample and got it right.
“A biased sample gives you precise answers to the wrong question.”
“All models are wrong, but some are useful.”
— George Box
Similarly:
> All samples are imperfect—but only unbiased ones are trustworthy.
Be skeptical. Ask about methods. Demand transparency.