Background

The aim of this analysis is to investigate driver behaviour using Formula 1 telemetry data and explore whether driver skill can be separated from car performance.

Telemetry data provides detailed information such as speed, throttle, and braking, which can be used to understand how drivers approach corners and control the car. By comparing drivers across multiple circuits, we can identify patterns and differences in driving style.

This study compares Charles Leclerc and Lewis Hamilton across three circuits: Australia, Bahrain, and Saudi Arabia. The two are Ferrari teammates in 2025, so comparing them holds the car broadly constant, while the three circuits differ in layout and character, which gives a broader view of each driver's behaviour.

library(f1dataR)
library(dplyr)
library(ggplot2)
library(tidyr)

setup_fastf1()
## Virtual environment 'f1dataR_env' removed.
## Using Python: /usr/bin/python3
## Creating virtual environment 'f1dataR_env' ...
## Done!
## Installing packages: pip, wheel, setuptools
## Virtual environment 'f1dataR_env' successfully created.
## Using virtual environment 'f1dataR_env' ...

Dataset Description

The dataset was obtained using the f1dataR package, which accesses Formula 1 telemetry through the FastF1 Python library (hence the setup_fastf1() call above).

The analysis includes three race sessions:

Australian Grand Prix (Round 1)

Bahrain Grand Prix (Round 4)

Saudi Arabian Grand Prix (Round 5)

Two drivers were analysed:

Charles Leclerc

Lewis Hamilton

The telemetry dataset includes the following key variables:

Speed (km/h): vehicle velocity

Throttle (%): percentage of acceleration input

Brake (TRUE/FALSE): whether braking is applied

Distance (metres): position along the track

Gear and RPM: engine and transmission data

Distance to driver ahead: proximity to other cars

# Load race telemetry for both drivers at one event and drop samples in traffic.
process_race <- function(year, round, circuit_name) {
  # Load race ("R") telemetry for one driver and label it
  load_driver <- function(code, name) {
    tel <- load_driver_telemetry(year, round, "R", code)
    tel$driver <- name
    tel$circuit <- circuit_name

    # Keep samples more than 20 m behind the car ahead;
    # samples with no recorded gap (NA) are kept as well
    gap <- tel$distance_to_driver_ahead
    tel[gap > 20 | is.na(gap), ]
  }

  bind_rows(load_driver("LEC", "Leclerc"), load_driver("HAM", "Hamilton"))
}

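# 2025 calendar: Australia is round 1, Bahrain round 4, Saudi Arabia round 5
# (rounds 2 and 3 are China and Japan)
aus <- process_race(2025, 1, "Australia")
bah <- process_race(2025, 4, "Bahrain")
sau <- process_race(2025, 5, "Saudi")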

all_telemetry <- bind_rows(aus, bah, sau)

summary(all_telemetry)
##       date                      session_time       time            rpm       
##  Min.   :2025-03-16 16:34:10   Min.   :7497   Min.   : 0.00   Min.   : 5064  
##  1st Qu.:2025-03-16 16:35:16   1st Qu.:7704   1st Qu.:21.16   1st Qu.: 9777  
##  Median :2025-03-23 19:22:22   Median :8122   Median :42.92   Median :10709  
##  Mean   :2025-03-26 18:41:29   Mean   :8173   Mean   :43.82   Mean   :10242  
##  3rd Qu.:2025-04-06 16:16:23   3rd Qu.:8815   3rd Qu.:65.43   3rd Qu.:11170  
##  Max.   :2025-04-06 16:23:20   Max.   :8892   Max.   :96.16   Max.   :12304  
##      speed           n_gear         throttle        brake        
##  Min.   : 60.0   Min.   :2.000   Min.   :  0.00   Mode :logical  
##  1st Qu.:168.0   1st Qu.:4.000   1st Qu.: 42.00   FALSE:3144     
##  Median :230.5   Median :6.000   Median : 99.00   TRUE :716      
##  Mean   :219.8   Mean   :5.727   Mean   : 71.33                  
##  3rd Qu.:282.0   3rd Qu.:7.000   3rd Qu.:100.00                  
##  Max.   :322.0   Max.   :8.000   Max.   :100.00                  
##       drs             source          relative_distance      status         
##  Min.   : 0.0000   Length:3860        Min.   :9.800e-07   Length:3860       
##  1st Qu.: 0.0000   Class :character   1st Qu.:2.313e-01   Class :character  
##  Median : 0.0000   Mode  :character   Median :4.694e-01   Mode  :character  
##  Mean   : 0.6446                      Mean   :4.806e-01                     
##  3rd Qu.: 0.0000                      3rd Qu.:7.137e-01                     
##  Max.   :14.0000                      Max.   :9.997e-01                     
##        x                y                 z             distance   
##  Min.   :-13802   Min.   :-7012.8   Min.   : 74.94   Min.   :   0  
##  1st Qu.: -5581   1st Qu.:-1711.8   1st Qu.: 95.00   1st Qu.:1258  
##  Median : -1967   Median :  466.5   Median :148.01   Median :2567  
##  Mean   : -2001   Mean   : 1029.2   Mean   :369.53   Mean   :2648  
##  3rd Qu.:  1243   3rd Qu.: 3016.5   3rd Qu.:775.50   3rd Qu.:3941  
##  Max.   :  7502   Max.   :11852.0   Max.   :945.08   Max.   :5772  
##  driver_ahead       distance_to_driver_ahead driver_code       
##  Length:3860        Min.   :  20.06          Length:3860       
##  Class :character   1st Qu.:  85.57          Class :character  
##  Mode  :character   Median : 482.82          Mode  :character  
##                     Mean   : 485.10                            
##                     3rd Qu.: 779.10                            
##                     Max.   :1355.07                            
##     driver            circuit         
##  Length:3860        Length:3860       
##  Class :character   Class :character  
##  Mode  :character   Mode  :character  
##                                       
##                                       
## 
nrow(all_telemetry)
## [1] 3860

Data cleaning removed observations where the distance to the driver ahead was 20 metres or less, while keeping samples with no recorded car ahead. This reduces the influence of traffic on the measured driver behaviour. After cleaning, the dataset contains 3860 telemetry observations across the three circuits.

Speed ranges from 60 km/h to 322 km/h, and throttle from 0% to 100%. The brake is applied in 716 of the 3860 samples (about 19%), which is consistent with braking being concentrated in short sections of the lap such as corner entries.
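
The traffic filter can be illustrated on a few toy rows. This is a minimal sketch in base R: the data are invented and drop_traffic() is a hypothetical helper, not part of f1dataR; only the column name distance_to_driver_ahead comes from the telemetry above.

```r
# Toy telemetry; the gap column name mirrors the one used in process_race().
toy <- data.frame(
  speed = c(210, 215, 180, 250, 300),
  distance_to_driver_ahead = c(5, 20, 20.06, NA, 400)
)

# Hypothetical helper: keep samples whose gap is strictly above `min_gap`
# metres, plus samples where no car ahead was recorded (NA).
drop_traffic <- function(telemetry, min_gap = 20) {
  gap <- telemetry$distance_to_driver_ahead
  telemetry[is.na(gap) | gap > min_gap, , drop = FALSE]
}

clean <- drop_traffic(toy)
nrow(clean)  # 3: the 20.06 m, NA and 400 m samples remain
```

The is.na() term matters: without it, the comparison returns NA for missing gaps and the subset would contain an all-NA row instead of the original sample.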

Data Use Cases

Use Case 1: Driver Skill vs Car Performance

This use case examines whether the drivers differ in cornering performance by comparing the minimum and maximum speed each driver reaches in a window around every corner of the Bahrain circuit.

# Assign telemetry samples to corner windows. Each window runs from the
# midpoint between the previous corner and this one to the midpoint between
# this corner and the next.
get_corner_windows <- function(telemetry, corners) {
  corners <- corners[order(corners$distance), ]
  n <- nrow(corners)
  all_corners <- vector("list", n)

  for (i in seq_len(n)) {
    corner_dist <- corners$distance[i]
    # The lap start and the last telemetry sample bound the outer windows
    prev_dist <- if (i == 1) 0 else corners$distance[i - 1]
    next_dist <- if (i == n) max(telemetry$distance) else corners$distance[i + 1]

    start_dist <- corner_dist - (corner_dist - prev_dist) / 2
    end_dist <- corner_dist + (next_dist - corner_dist) / 2

    data <- telemetry[telemetry$distance >= start_dist &
                        telemetry$distance <= end_dist, ]
    data$corner <- corners$number[i]

    all_corners[[i]] <- data
  }
  do.call(rbind, all_corners)
}
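
The midpoint rule used for the windows can be checked on toy numbers. The sketch below is a vectorised restatement of the boundary arithmetic only, under assumed values: the three corner distances and the 1000 m lap length are invented, and lap_length stands in for max(telemetry$distance).

```r
# Hypothetical corner distances (metres from the start line) and lap length.
corner_dist <- c(100, 300, 700)
lap_length  <- 1000

# Neighbouring reference points: the lap start precedes the first corner,
# the lap end follows the last one.
prev_dist <- c(0, head(corner_dist, -1))
next_dist <- c(tail(corner_dist, -1), lap_length)

windows <- data.frame(
  corner = seq_along(corner_dist),
  start  = corner_dist - (corner_dist - prev_dist) / 2,
  end    = corner_dist + (next_dist - corner_dist) / 2
)
windows
# start: 50, 200, 500; end: 200, 500, 850
```

Adjacent windows share a boundary, and the function above includes both ends of each window, so a sample lying exactly on a boundary is assigned to both neighbouring corners. The windows also extend onto the straights, which is why the maximum speeds in each window can exceed 300 km/h.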

bahrain_circuit <- load_circuit_details(2025, "Bahrain")

lec_corner <- get_corner_windows(bah[bah$driver=="Leclerc", ], bahrain_circuit$corners)
ham_corner <- get_corner_windows(bah[bah$driver=="Hamilton", ], bahrain_circuit$corners)

lec_metrics <- lec_corner %>%
  group_by(corner) %>%
  summarise(min_speed = min(speed), max_speed = max(speed))

ham_metrics <- ham_corner %>%
  group_by(corner) %>%
  summarise(min_speed = min(speed), max_speed = max(speed))

comparison <- lec_metrics %>%
  rename(lec_min=min_speed, lec_max=max_speed) %>%
  left_join(ham_metrics %>% rename(ham_min=min_speed, ham_max=max_speed),
            by="corner")

comparison
## # A tibble: 15 × 5
##    corner lec_min lec_max ham_min ham_max
##     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
##  1      1     90     306.     98     302 
##  2      2     83     114      89     125 
##  3      3     82     263.     91     263.
##  4      4     72     288      75     285 
##  5      5    157.    250.    165.    248.
##  6      6    253     270.    252.    269 
##  7      7    253     272     259     271 
##  8      8    164     265     169     268 
##  9      9    103     192     109     196 
## 10     10    112.    281     132.    278 
## 11     11     84     284      88     281.
## 12     12    286.    307     281     305 
## 13     13    308.    319     306     316 
## 14     14     60     317      63     315.
## 15     15    156     254     162     250

Hypothesis

Drivers will show differences in throttle and braking behaviour, reflecting different driving styles.

Plot: Minimum Speed per Corner

plot_min <- comparison %>%
  pivot_longer(cols=c(lec_min, ham_min),
               names_to="driver", values_to="speed")

plot_min$driver <- ifelse(plot_min$driver=="lec_min","Leclerc","Hamilton")

ggplot(plot_min, aes(corner, speed, colour=driver)) +
  geom_line() + geom_point() +
  labs(title="Minimum Speed per Corner") +
  theme_minimal()

Interpretation

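Minimum speeds follow the same pattern from corner to corner for both drivers, and the gaps are small relative to the range of corner speeds (about 60 to 308 km/h), which suggests broadly comparable cornering ability. The differences are not random, however: Hamilton's minimum speed is higher in 12 of the 15 corner windows, typically by 3 to 9 km/h and by about 20 km/h at corner 10, while Leclerc records the higher maximum speed in most windows.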
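
The per-corner gaps can be summarised directly from the comparison table. The sketch below copies the minimum speeds as printed above, so values shown with a trailing dot have been rounded by the tibble display; the choice of summary statistics is an assumption, not part of the original analysis.

```r
# Minimum speeds (km/h) per corner window, copied from the comparison table
# as displayed (the tibble print rounds to three significant figures).
lec_min <- c(90, 83, 82, 72, 157, 253, 253, 164, 103, 112, 84, 286, 308, 60, 156)
ham_min <- c(98, 89, 91, 75, 165, 252, 259, 169, 109, 132, 88, 281, 306, 63, 162)

delta <- ham_min - lec_min   # positive: Hamilton carries more minimum speed

hamilton_higher <- sum(delta > 0)        # 12 of 15 corner windows
leclerc_higher  <- sum(delta < 0)        # 3 of 15 corner windows
largest_gap_at  <- which.max(abs(delta)) # corner 10
median_gap      <- median(delta)         # 6 km/h
```

Because each driver contributes one value per corner, working with the paired differences removes the large corner-to-corner variation and leaves only the driver gap.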

Use Case 2: Throttle and Braking Behaviour

This use case examines how drivers differ in acceleration and braking patterns.

# position="identity" overlays the two drivers instead of stacking the bars
ggplot(all_telemetry, aes(x=speed, fill=driver)) +
  geom_histogram(alpha=0.5, bins=50, position="identity") +
  facet_wrap(~circuit) +
  labs(title="Speed Distribution")

Interpretation

Both drivers operate within similar speed ranges at each circuit, suggesting comparable performance. However, slight differences in the shape of the distributions may indicate variation in how each driver approaches acceleration and braking zones.

Throttle Distribution

# position="identity" overlays the two drivers instead of stacking the bars
ggplot(all_telemetry, aes(x=throttle, fill=driver)) +
  geom_histogram(alpha=0.5, bins=50, position="identity") +
  facet_wrap(~circuit) +
  labs(title="Throttle Distribution")

Interpretation

Throttle distributions indicate how aggressively drivers accelerate. Minor differences suggest variation in driving style.
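
One way to put a number on the throttle histograms is the share of samples spent at full throttle and fully off the throttle. The sketch below uses invented data, and the 99% cut-off for "full throttle" is an assumption.

```r
# Toy throttle traces (%), five samples per driver.
toy <- data.frame(
  driver   = rep(c("Leclerc", "Hamilton"), each = 5),
  throttle = c(100, 100, 60, 0, 100,
               100,  99, 40, 0,  20)
)

# Share of samples at (near) full throttle and fully off the throttle.
full_share <- tapply(toy$throttle >= 99, toy$driver, mean)
off_share  <- tapply(toy$throttle == 0,  toy$driver, mean)

full_share  # Hamilton 0.4, Leclerc 0.6
off_share   # 0.2 for both drivers
```

On the real data the same calls could be grouped by list(all_telemetry$driver, all_telemetry$circuit) to obtain one value per driver and circuit.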

Brake Usage

ggplot(all_telemetry, aes(x=brake, fill=driver)) +
  geom_bar(position="dodge") +
  facet_wrap(~circuit) +
  labs(title="Brake Usage")

Interpretation

Brake usage patterns show how frequently drivers apply braking, which may reflect differences in corner approach strategy.
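
Raw counts of TRUE and FALSE depend on how many samples each driver has left after the traffic filter, so proportions and the number of separate braking zones are easier to compare. A minimal sketch on an invented brake trace, using run-length encoding from base R:

```r
# Toy brake trace for one lap, ordered by distance along the track.
brake <- c(FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE)

runs <- rle(brake)

n_zones     <- sum(runs$values)           # separate braking zones: 3
zone_length <- runs$lengths[runs$values]  # samples per zone: 2, 1, 3
brake_share <- mean(brake)                # fraction of samples braking: 0.6
```

This assumes the trace is ordered and belongs to a single driver and lap. Rows removed by the traffic filter leave gaps in the trace, which can merge or split runs, so zone counts on filtered data should be read with care.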

Limitations

Only two drivers were analysed

Car performance differences are not fully controlled

External factors such as tyre wear and weather are not included

Thresholds such as 20 m for traffic removal are assumptions
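
The effect of the traffic threshold can be checked by recomputing how much data survives under alternative cut-offs. The sketch below uses invented gap values, and the candidate thresholds of 10, 20 and 50 m are assumptions chosen for illustration.

```r
# Toy gaps to the car ahead (metres); NA means no car ahead was recorded.
gap <- c(5, 12, 18, 25, 40, 60, 150, NA, 800, 30)

thresholds <- c(10, 20, 50)

# Fraction of samples retained at each threshold (NA gaps are always kept).
retained <- sapply(thresholds, function(th) mean(is.na(gap) | gap > th))
names(retained) <- thresholds

retained  # 0.9, 0.7 and 0.4 for thresholds of 10, 20 and 50 m
```

If the per-corner results change little across a range of thresholds, the 20 m choice is unlikely to be driving the conclusions.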

Conclusion

This analysis demonstrated how telemetry data can be used to compare driver behaviour across circuits. While speed metrics suggest similar performance, differences in throttle and braking behaviour indicate variations in driving style. Future work will include additional drivers, more circuits, and advanced statistical models such as mixed-effects models.