Background
The aim of this analysis is to investigate driver behaviour using Formula 1 telemetry data and explore whether driver skill can be separated from car performance.
Telemetry data provides detailed information such as speed, throttle, and braking, which can be used to understand how drivers approach corners and control the car. By comparing drivers across multiple circuits, we can identify patterns and differences in driving style.
This study compares Charles Leclerc and Lewis Hamilton across three circuits: Australia, Bahrain, and Saudi Arabia. Both drove for Ferrari in 2025, so comparing them holds the car broadly constant and makes differences in driving behaviour easier to attribute to the driver. The selected circuits represent different track characteristics, allowing for a more comprehensive analysis.
library(f1dataR)
library(dplyr)
library(ggplot2)
library(tidyr)
setup_fastf1()
## Virtual environment 'f1dataR_env' removed.
## Using Python: /usr/bin/python3
## Creating virtual environment 'f1dataR_env' ...
## Done!
## Installing packages: pip, wheel, setuptools
## Virtual environment 'f1dataR_env' successfully created.
## Using virtual environment 'f1dataR_env' ...
Dataset Description
The dataset was obtained using the f1dataR package, which provides access to Formula 1 telemetry data.
The analysis includes three race sessions:
Australian Grand Prix (Round 1)
Bahrain Grand Prix (Round 2)
Saudi Arabian Grand Prix (Round 3)
Two drivers were analysed:
Charles Leclerc
Lewis Hamilton
The telemetry dataset includes the following key variables:
Speed (km/h): vehicle velocity
Throttle (%): percentage of acceleration input
Brake (TRUE/FALSE): whether braking is applied
Distance (metres): position along the track
Gear and RPM: engine and transmission data
Distance to driver ahead: proximity to other cars
process_race <- function(year, round, circuit_name) {
  # Fetch race-session ("R") telemetry for each driver
  lec <- load_driver_telemetry(year, round, "R", "LEC")
  ham <- load_driver_telemetry(year, round, "R", "HAM")
  lec$driver <- "Leclerc"
  ham$driver <- "Hamilton"
  # Keep samples more than 20 m behind the car ahead, or with no gap recorded
  lec <- lec[lec$distance_to_driver_ahead > 20 | is.na(lec$distance_to_driver_ahead), ]
  ham <- ham[ham$distance_to_driver_ahead > 20 | is.na(ham$distance_to_driver_ahead), ]
  # Label every row with the circuit so races can be stacked and faceted later
  lec$circuit <- circuit_name
  ham$circuit <- circuit_name
  bind_rows(lec, ham)
}
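# Round numbers index the season calendar. On the 2025 calendar, rounds 2 and 3
# were the Chinese and Japanese Grands Prix (Bahrain and Saudi Arabia were
# rounds 4 and 5), which matches the session dates 2025-03-23 and 2025-04-06
# in the summary below. Check that each round matches its circuit label
# before interpreting the per-circuit results.
aus <- process_race(2025, 1, "Australia")
bah <- process_race(2025, 2, "Bahrain")
sau <- process_race(2025, 3, "Saudi")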
all_telemetry <- bind_rows(aus, bah, sau)
summary(all_telemetry)
## date session_time time rpm
## Min. :2025-03-16 16:34:10 Min. :7497 Min. : 0.00 Min. : 5064
## 1st Qu.:2025-03-16 16:35:16 1st Qu.:7704 1st Qu.:21.16 1st Qu.: 9777
## Median :2025-03-23 19:22:22 Median :8122 Median :42.92 Median :10709
## Mean :2025-03-26 18:41:29 Mean :8173 Mean :43.82 Mean :10242
## 3rd Qu.:2025-04-06 16:16:23 3rd Qu.:8815 3rd Qu.:65.43 3rd Qu.:11170
## Max. :2025-04-06 16:23:20 Max. :8892 Max. :96.16 Max. :12304
## speed n_gear throttle brake
## Min. : 60.0 Min. :2.000 Min. : 0.00 Mode :logical
## 1st Qu.:168.0 1st Qu.:4.000 1st Qu.: 42.00 FALSE:3144
## Median :230.5 Median :6.000 Median : 99.00 TRUE :716
## Mean :219.8 Mean :5.727 Mean : 71.33
## 3rd Qu.:282.0 3rd Qu.:7.000 3rd Qu.:100.00
## Max. :322.0 Max. :8.000 Max. :100.00
## drs source relative_distance status
## Min. : 0.0000 Length:3860 Min. :9.800e-07 Length:3860
## 1st Qu.: 0.0000 Class :character 1st Qu.:2.313e-01 Class :character
## Median : 0.0000 Mode :character Median :4.694e-01 Mode :character
## Mean : 0.6446 Mean :4.806e-01
## 3rd Qu.: 0.0000 3rd Qu.:7.137e-01
## Max. :14.0000 Max. :9.997e-01
## x y z distance
## Min. :-13802 Min. :-7012.8 Min. : 74.94 Min. : 0
## 1st Qu.: -5581 1st Qu.:-1711.8 1st Qu.: 95.00 1st Qu.:1258
## Median : -1967 Median : 466.5 Median :148.01 Median :2567
## Mean : -2001 Mean : 1029.2 Mean :369.53 Mean :2648
## 3rd Qu.: 1243 3rd Qu.: 3016.5 3rd Qu.:775.50 3rd Qu.:3941
## Max. : 7502 Max. :11852.0 Max. :945.08 Max. :5772
## driver_ahead distance_to_driver_ahead driver_code
## Length:3860 Min. : 20.06 Length:3860
## Class :character 1st Qu.: 85.57 Class :character
## Mode :character Median : 482.82 Mode :character
## Mean : 485.10
## 3rd Qu.: 779.10
## Max. :1355.07
## driver circuit
## Length:3860 Length:3860
## Class :character Class :character
## Mode :character Mode :character
##
##
##
nrow(all_telemetry)
## [1] 3860
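Data cleaning removed observations where the distance to the driver ahead was 20 metres or less; samples with no recorded gap were kept. This reduces the influence of traffic on the measured driver behaviour. The dataset contains 3860 telemetry observations across three circuits. The time column spans 0 to about 96 seconds, which suggests that each driver contributes a single lap per race rather than the full race distance.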
Speed ranges from approximately 60 km/h to 322 km/h, while throttle values range from 0% to 100%. The brake is applied in 716 of the 3860 samples (about 18.5%), indicating that braking is concentrated around specific track sections such as corners.
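The traffic filter described above can be illustrated on a small synthetic data frame. This is a minimal sketch in base R: the values are invented, and `filter_traffic` and its `min_gap` argument are hypothetical names introduced here, not part of f1dataR.

```r
# Toy telemetry: gap in metres to the car ahead; NA means no gap was recorded.
telemetry <- data.frame(
  distance = c(0, 50, 100, 150, 200),
  speed = c(280, 285, 150, 90, 140),
  distance_to_driver_ahead = c(15, 20, 35, NA, 250)
)

# Hypothetical helper: keep samples with a gap strictly greater than min_gap,
# plus samples with a missing gap. The 20 m default mirrors the report's
# assumption and is not a physical constant.
filter_traffic <- function(df, min_gap = 20) {
  keep <- is.na(df$distance_to_driver_ahead) |
    df$distance_to_driver_ahead > min_gap
  df[keep, , drop = FALSE]
}

clean <- filter_traffic(telemetry)
removed <- nrow(telemetry) - nrow(clean)
print(clean)
print(removed)
```

Checking the `is.na()` condition first keeps rows with a missing gap. Without it, the comparison would yield `NA` and the row index would produce all-`NA` rows. Re-running the analysis with a different `min_gap` is one way to check how sensitive the results are to the threshold.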
Data Use Cases
Use Case 1: Driver Skill vs Car Performance
This use case examines whether drivers differ in cornering performance by analysing minimum and maximum speed within corners.
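get_corner_windows <- function(telemetry, corners) {
  # Sort corner markers by their distance along the lap
  corners <- corners[order(corners$distance), ]
  all_corners <- vector("list", nrow(corners))
  for (i in seq_len(nrow(corners))) {
    corner_dist <- corners$distance[i]
    # Neighbouring markers; the lap start and lap end stand in at the edges
    prev_dist <- if (i == 1) 0 else corners$distance[i - 1]
    next_dist <- if (i == nrow(corners)) max(telemetry$distance) else corners$distance[i + 1]
    # Each window runs from halfway to the previous marker to halfway to the next
    start_dist <- corner_dist - (corner_dist - prev_dist) / 2
    end_dist <- corner_dist + (next_dist - corner_dist) / 2
    data <- telemetry[telemetry$distance >= start_dist &
                        telemetry$distance <= end_dist, ]
    # rep() keeps the assignment valid when a window contains no samples
    data$corner <- rep(corners$number[i], nrow(data))
    all_corners[[i]] <- data
  }
  do.call(rbind, all_corners)
}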
bahrain_circuit <- load_circuit_details(2025, "Bahrain")
lec_corner <- get_corner_windows(bah[bah$driver=="Leclerc", ], bahrain_circuit$corners)
ham_corner <- get_corner_windows(bah[bah$driver=="Hamilton", ], bahrain_circuit$corners)
lec_metrics <- lec_corner %>%
  group_by(corner) %>%
  summarise(min_speed = min(speed), max_speed = max(speed))
ham_metrics <- ham_corner %>%
  group_by(corner) %>%
  summarise(min_speed = min(speed), max_speed = max(speed))
comparison <- lec_metrics %>%
  rename(lec_min = min_speed, lec_max = max_speed) %>%
  left_join(ham_metrics %>% rename(ham_min = min_speed, ham_max = max_speed),
            by = "corner")
comparison
## # A tibble: 15 × 5
## corner lec_min lec_max ham_min ham_max
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 90 306. 98 302
## 2 2 83 114 89 125
## 3 3 82 263. 91 263.
## 4 4 72 288 75 285
## 5 5 157. 250. 165. 248.
## 6 6 253 270. 252. 269
## 7 7 253 272 259 271
## 8 8 164 265 169 268
## 9 9 103 192 109 196
## 10 10 112. 281 132. 278
## 11 11 84 284 88 281.
## 12 12 286. 307 281 305
## 13 13 308. 319 306 316
## 14 14 60 317 63 315.
## 15 15 156 254 162 250
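The corner windows behind this table run from the midpoint to the previous corner marker to the midpoint to the next one. The sketch below reproduces only that boundary arithmetic on three invented marker positions. `corner_bounds`, `marker`, and `lap_length` are hypothetical names used for illustration.

```r
# Hypothetical helper: midpoint-based window bounds for each corner marker.
# Mirrors the start/end arithmetic of get_corner_windows, without the telemetry.
corner_bounds <- function(marker, lap_length) {
  marker <- sort(marker)
  prev_marker <- c(0, head(marker, -1))          # lap start before the first corner
  next_marker <- c(tail(marker, -1), lap_length) # lap end after the last corner
  data.frame(
    marker = marker,
    start = marker - (marker - prev_marker) / 2,
    end = marker + (next_marker - marker) / 2
  )
}

# Invented corner markers at 400 m, 600 m and 1400 m on a 2000 m lap
bounds <- corner_bounds(c(400, 600, 1400), lap_length = 2000)
print(bounds)
```

Two properties of this scheme affect how the table should be read. A window covers half of each neighbouring straight, so the maximum speed in a window mostly reflects the straight rather than the corner. Both bounds are inclusive, so a sample that falls exactly on a shared boundary is assigned to two windows.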
Hypothesis
If the two drivers differ in driving style, this should appear as differences in minimum corner speed, and in the throttle and braking behaviour that produces it.
Plot: Minimum Speed per Corner
plot_min <- comparison %>%
  pivot_longer(cols = c(lec_min, ham_min),
               names_to = "driver", values_to = "speed")
plot_min$driver <- ifelse(plot_min$driver == "lec_min", "Leclerc", "Hamilton")
ggplot(plot_min, aes(corner, speed, colour = driver)) +
  geom_line() + geom_point() +
  labs(title = "Minimum Speed per Corner") +
  theme_minimal()
Interpretation
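Minimum speeds follow the same pattern across corners for both drivers, but they are not identical. In the comparison table, Hamilton's minimum speed is higher in 12 of the 15 corner windows, with the largest gap at corner 10 (about 132 km/h against 112 km/h), while Leclerc is higher at corners 6, 12, and 13. With only one race analysed here, these differences are indicative rather than conclusive.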
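The per-corner comparison can also be quantified rather than read off the plot. The sketch below re-enters the minimum speeds from the comparison table as printed (tibble printing rounds to three significant figures, so the values are approximate) and computes the difference per corner. The variable names are introduced here for illustration.

```r
# Minimum speeds (km/h) per corner window, copied from the printed table.
# Values shown with a trailing "." in the tibble are rounded.
lec_min <- c(90, 83, 82, 72, 157, 253, 253, 164, 103, 112, 84, 286, 308, 60, 156)
ham_min <- c(98, 89, 91, 75, 165, 252, 259, 169, 109, 132, 88, 281, 306, 63, 162)

# Positive delta: Hamilton carried more minimum speed through that window
delta <- ham_min - lec_min

n_ham_higher <- sum(delta > 0)
largest_gap_corner <- which.max(abs(delta))

print(data.frame(corner = seq_along(delta), delta = delta))
print(n_ham_higher)
print(largest_gap_corner)
```

A paired comparison such as `t.test(ham_min, lec_min, paired = TRUE)` could be applied to the same vectors. With 15 windows from a single race, any such test should be treated as exploratory.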
Use Case 2: Throttle and Braking Behaviour
This use case examines how drivers differ in acceleration and braking patterns.
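ggplot(all_telemetry, aes(x = speed, fill = driver)) +
  # position = "identity" overlays the two drivers; the default stacks the bars
  geom_histogram(alpha = 0.5, bins = 50, position = "identity") +
  facet_wrap(~circuit) +
  labs(title = "Speed Distribution")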
Interpretation
Both drivers operate within similar speed ranges, suggesting comparable performance. However, slight differences in distribution shape may indicate variation in how each driver approaches acceleration and braking zones.
Throttle Distribution
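ggplot(all_telemetry, aes(x = throttle, fill = driver)) +
  # position = "identity" overlays the two drivers; the default stacks the bars
  geom_histogram(alpha = 0.5, bins = 50, position = "identity") +
  facet_wrap(~circuit) +
  labs(title = "Throttle Distribution")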
Interpretation
Throttle distributions indicate how aggressively drivers accelerate. Minor differences suggest variation in driving style.
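One way to put a number on throttle aggressiveness is the share of samples at or near full throttle for each driver. The sketch below uses invented throttle values. The 99% cut-off is an assumption chosen for illustration, and `full_share` is a hypothetical name.

```r
# Toy throttle traces (%), six samples per driver; values are invented.
telemetry <- data.frame(
  driver = rep(c("Leclerc", "Hamilton"), each = 6),
  throttle = c(100, 100, 60, 0, 100, 99,
               100, 80, 40, 0, 100, 100)
)

# Share of samples at or above the assumed full-throttle cut-off of 99%.
# The mean of a logical vector is the proportion of TRUE values.
full_share <- tapply(telemetry$throttle >= 99, telemetry$driver, mean)
print(full_share)
```

Applied to the real data, the same calculation grouped by both driver and circuit would give one figure per facet of the histogram, which is easier to compare than distribution shapes.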
Brake Usage
ggplot(all_telemetry, aes(x=brake, fill=driver)) +
geom_bar(position="dodge") +
facet_wrap(~circuit) +
labs(title="Brake Usage")
Interpretation
Brake usage patterns show how frequently drivers apply braking, which may reflect differences in corner approach strategy.
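Raw counts of braking samples depend on how many samples each driver has left after traffic filtering, so proportions are a fairer basis for comparison. The sketch below first computes the overall braking share from the summary counts reported earlier (3144 FALSE, 716 TRUE), then shows a per-driver proportion table on invented data.

```r
# Overall share of samples with the brake applied, from the summary output
brake_share_overall <- 716 / (3144 + 716)
print(round(brake_share_overall, 3))

# Toy data with unequal sample counts per driver; values are invented.
telemetry <- data.frame(
  driver = c(rep("Leclerc", 5), rep("Hamilton", 4)),
  brake = c(TRUE, FALSE, FALSE, FALSE, TRUE,
            TRUE, FALSE, FALSE, FALSE)
)

# Row-wise proportions: each driver's samples sum to 1
counts <- table(telemetry$driver, telemetry$brake)
shares <- prop.table(counts, margin = 1)
print(shares)
```

In the toy data Leclerc has more braking samples than Hamilton in absolute terms, and the proportion table shows how much of that difference remains once the unequal sample counts are accounted for.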
Limitations
Only two drivers were analysed
Car performance differences are not fully controlled
External factors such as tyre wear and weather are not included
Thresholds such as 20 m for traffic removal are assumptions
Conclusion
This analysis demonstrated how telemetry data can be used to compare driver behaviour across circuits. While speed metrics suggest broadly similar performance, differences in throttle and braking behaviour may indicate variations in driving style. Future work could include additional drivers, more circuits, and advanced statistical models such as mixed-effects models.