Total target: ~6:45 to leave a 15-second buffer. Speaking pace: ~150 words/minute → script is ~1,000 words. Read this through once before presenting and rephrase awkward sentences in your own voice.
(Title slide / Section 1 of the report on screen.)
“Formula 1 has been racing since 1950 — that’s 75 seasons and over 26,000 individual race entries. When fans talk about how the sport has evolved, two themes dominate the conversation: who’s been winning, and how safety has improved. But there’s a more fundamental question that gets discussed surprisingly little: do the cars actually finish the races they start?
That question matters, because if the share of cars completing the race has shifted, then the very definition of ‘winning’ has shifted with it. A race where half the field breaks down is a different sport from a race where almost everyone makes it to the flag.”
(Section 2 on screen.)
“So my research question was simple:
Has the share of cars finishing F1 races changed between 1950 and 2024, and if so, what’s driving the change — mechanical failure or driver incidents?
And as a follow-on: if finishing has become more reliable, does qualifying performance now matter more, since fewer races are decided by attrition?”
(Section 3 — code, joins, and methodology.)
“I used the public Ergast Formula 1 dataset — 14 relational CSVs covering every race from 1950 through 2024. I joined the results table to the races and status tables, which gave me one row per car-per-race, tagged with a year and a finish-status text.
The trickiest part was that the raw status field has more than 130 distinct labels — everything from ‘Engine’ to ‘Spun off’ to ‘+2 Laps’ to ‘Did not prequalify’. I collapsed those into five categories using regex rules: Finished, Mechanical/Technical, Accident/Collision, DNS or DNQ, and Disqualified. A car classified as ‘+1 Lap’ counts as finished, because the rules consider it a finisher.
I then aggregated into 5-year periods to smooth out the year-to-year noise. The full dataset has 26,759 entries across 1,125 races — large enough that even small effects show up clearly.”
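(Presenter reference, not read aloud.) If someone asks how the status collapse and the 5-year binning work, the idea can be sketched roughly as below. This is a minimal illustration, not the report's actual rule set: the regex patterns, the fall-through to Mechanical/Technical, the bins anchored at 1950, and the toy rows are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical rules, checked in order. The real rule set has to cover
# 130+ distinct status labels; these patterns are only illustrative.
RULES = [
    ("Finished", re.compile(r"^(finished|\+\d+ laps?)$", re.I)),
    ("Disqualified", re.compile(r"disqualified", re.I)),
    ("DNS or DNQ", re.compile(r"did not (start|qualify|prequalify)", re.I)),
    ("Accident/Collision", re.compile(r"accident|collision|spun off", re.I)),
]


def categorize(status):
    """Map a raw status label to one of the five categories."""
    text = status.strip()
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    # Assumption: anything unmatched is treated as a mechanical retirement.
    return "Mechanical/Technical"


def period_start(year, first=1950, width=5):
    """Return the first year of the 5-year bin containing `year`."""
    return first + width * ((year - first) // width)


def finish_share_by_period(rows):
    """rows: iterable of (year, status) pairs, one per car-per-race."""
    finished = defaultdict(int)
    total = defaultdict(int)
    for year, status in rows:
        period = period_start(year)
        total[period] += 1
        finished[period] += categorize(status) == "Finished"
    return {p: finished[p] / total[p] for p in sorted(total)}


# Made-up rows purely to exercise the functions; not real results.
toy_rows = [
    (1951, "Engine"), (1952, "Finished"),
    (2021, "+1 Lap"), (2022, "Finished"),
    (2023, "Collision"), (2024, "Finished"),
]
shares = finish_share_by_period(toy_rows)
```

A fall-through default like this is exactly where labels such as ‘Withdrew’ can land in the wrong bucket, which is why the spot-check matters.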
(Section 4 — this is where you spend the most time. Three findings, three charts.)
(Show the stacked area chart.)
“Here’s the headline. The green band is cars that finished the race. In the 1950s it sits around 50%, half the field. It stays at or below that level all the way through to about 1995, dipping into the high 30s in the 1980s.
Then watch what happens. From the mid-90s the green band starts climbing sharply, and from 2010 onward it dominates the picture. By the 2020s, around 86% of all entries finish.
So the first finding is: the modern F1 car is dramatically more likely to make it to the chequered flag than its 1980s ancestor.”
(Show the two-line chart: red = mechanical, orange = accidents.)
“Now, you might assume that the gain comes from drivers crashing less — better safety, less aggressive driving, the post-Senna reforms. The data says no.
The red line, mechanical retirements, falls from about 40% of starts in the 50s and 60s to roughly 7% today. That’s a relative decrease of more than 80%.
The orange line — accidents and collisions — has been bouncing around the 6 to 17% range for 75 years with no consistent trend. Drivers aren’t crashing less. Cars are breaking less.
This is the core finding. The transformation of F1 over three-quarters of a century isn’t really a safety story — it’s a manufacturing-and-engineering story.”
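(Presenter reference, not read aloud.) In case the audience confuses the relative drop with the percentage-point drop, the arithmetic can be checked as below. It uses only the rounded figures quoted in the script (about 40% and about 7%), so the result is approximate; the function name is just for illustration.

```python
def relative_decrease(before, after):
    """Fractional drop relative to the starting value."""
    return (before - after) / before


mechanical_then = 40.0  # % of starts, 1950s-60s (rounded figure from the script)
mechanical_now = 7.0    # % of starts, today (rounded figure from the script)

point_drop = mechanical_then - mechanical_now                     # percentage points
relative_drop = relative_decrease(mechanical_then, mechanical_now)  # about 0.825
```

So the drop is 33 percentage points, or a little over 80% in relative terms.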
(Show the pole-to-win chart.)
“My second question was: does this change ripple into how races are won? Specifically, if unreliability used to be the great equalizer, removing it should make pre-race performance more predictive.
I tested this with the simplest possible measure: pole-to-win conversion. What share of pole-sitters actually win the race?
In the 50s and 60s it sat around 37%. By the 2010s and 2020s it’s about 51%. A linear trend through the data gives a positive slope that’s statistically significant at p < 0.005.
So qualifying really has gotten more decisive — and that’s consistent with the reliability story. When the field stops self-eliminating, you can’t inherit a win as easily; you have to start at the front.”
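(Presenter reference, not read aloud.) The pole-to-win measure and the trend line can be sketched as follows. This is a simplified illustration under stated assumptions: each row is a hypothetical (year, grid position, finishing position) tuple, the bins are anchored at 1950, the toy rows are invented, and the slope is a plain least-squares fit over period start years. It does not compute a p-value; a routine such as `scipy.stats.linregress` reports one, and the script does not say which tool produced the quoted figure.

```python
from collections import defaultdict


def period_start(year, first=1950, width=5):
    """Return the first year of the 5-year bin containing `year`."""
    return first + width * ((year - first) // width)


def pole_conversion(rows):
    """Share of pole-sitters (grid == 1) who won, per 5-year period."""
    poles = defaultdict(int)
    wins = defaultdict(int)
    for year, grid, finish_pos in rows:
        if grid == 1:
            period = period_start(year)
            poles[period] += 1
            wins[period] += finish_pos == 1
    return {p: wins[p] / poles[p] for p in sorted(poles)}


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Made-up rows purely to exercise the functions; not real results.
toy_rows = [
    (1950, 1, 1), (1951, 1, 3), (1952, 1, 2), (1953, 1, 5),
    (2020, 1, 1), (2021, 1, 1), (2022, 1, 2), (2023, 1, 1),
    (2021, 2, 1),  # non-pole winner, ignored by the measure
]
conversion = pole_conversion(toy_rows)
slope = ols_slope(list(conversion), list(conversion.values()))
```

A positive slope here means the conversion rate rises per calendar year; whether it is significant depends on the number of periods and their scatter, which the toy data cannot show.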
(Section 5 — implications and methodology critique.)
“Putting this together: there’s a popular narrative that modern F1 is ‘boring’ because overtaking is hard. My data suggests a complementary explanation. Races have also become more predictable because the field no longer eliminates itself. In the 1950s, roughly half of all entries failed to finish. Today, six out of seven finish. That removes a huge source of variance, and the rise in pole-to-win conversion is one downstream consequence.
For teams, this changes the calculus. Reliability used to be a moat. When around 86% of cars finish, finishing isn’t a differentiator anymore; it’s table stakes.”
“In terms of how trustworthy this is:
Strengths: this is the full population of F1 entries, not a sample, so the descriptive trends carry no sampling error. The chi-square test comparing 1950s–60s outcomes to 2010s outcomes returns a chi-squared of 1,765, overwhelming evidence that the distributions differ. And the reliability and pole-to-win findings point in the same direction independently.
Limitations — first, my status categorization is rule-based; some edge cases like ‘Withdrew’ could be misclassified, though a 50-row spot-check agreed with my labels over 95% of the time. Second, I’m showing correlation, not causation. I argue rising reliability enables rising pole-to-win conversion, but other factors — DRS, aero, tire strategy — plausibly contribute. A future analysis would test whether the pole effect holds after controlling for finish rate.
That’s it — happy to take questions.”
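(Presenter reference, not read aloud.) If asked what the chi-square test actually computes, the statistic can be sketched as below: observed counts per era and outcome category are compared with the counts expected if era and outcome were independent. The function name and the 2×2 toy table are assumptions for illustration, not the report's data; in practice a library routine such as `scipy.stats.chi2_contingency` would normally be used.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom.

    table: list of rows of observed counts (eras x outcome categories).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if era and outcome were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Made-up 2x2 counts purely to exercise the function; not real results.
toy_table = [
    [10, 20],  # era A: finished, not finished
    [20, 10],  # era B: finished, not finished
]
stat, dof = chi_squared(toy_table)
```

With two eras and five outcome categories, the same calculation would have 4 degrees of freedom, assuming every category is kept in the table.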
Q: Why use 5-year periods instead of yearly? A: To smooth single-season noise — championships have outlier years (e.g. 1982 was unusually attrition-heavy). Yearly data is in the appendix and shows the same trend.
Q: Could DNQ rule changes explain the 1990s jump? A: Partly. Pre-qualifying was abolished in 1993, which removes some “DNS/DNQ” entries from the denominator. But the same trend is visible if you restrict to just cars that started, so the rise in finishing is real.
Q: Why no driver-level analysis? A: The question is about the sport as a whole. A driver-level cut is a great follow-up — particularly whether reliability gains have been evenly distributed across teams or concentrated in the top constructors.
Q: Does this hold for sprint races / since 2021? A: I excluded sprints. They’re shorter and a different format. Including them slightly raises the modern finish rate but doesn’t change the trend.