Formula 1 Analysis

As the 2023 Formula One season draws near and fans gear up for the excitement, let’s take a deeper look into the world of F1. If you are new to Formula One, this analysis is for you. Using data sourced from Kaggle, we’ll use R to explore some common and not-so-common questions, bringing our own spin to the analysis. So fasten your seatbelts and join us on this journey into the world of F1!

Data Gathering and Manipulating

To start, I loaded the necessary packages and read in the data files using read.csv(). The four data files I used were results.csv, drivers.csv, races.csv, and constructors.csv.

##   resultId raceId driverId constructorId number grid position positionText
## 1        1     18        1             1     22    1        1            1
## 2        2     18        2             2      3    5        2            2
## 3        3     18        3             3      7    7        3            3
## 4        4     18        4             4      5   11        4            4
## 5        5     18        5             1     23    3        5            5
## 6        6     18        6             3      8   13        6            6
##   positionOrder points laps        time milliseconds fastestLap rank
## 1             1     10   58 1:34:50.616      5690616         39    2
## 2             2      8   58      +5.478      5696094         41    3
## 3             3      6   58      +8.163      5698779         41    5
## 4             4      5   58     +17.181      5707797         58    7
## 5             5      4   58     +18.014      5708630         43    1
## 6             6      3   57         \\N          \\N         50   14
##   fastestLapTime fastestLapSpeed statusId
## 1       1:27.452         218.300        1
## 2       1:27.739         217.586        1
## 3       1:28.090         216.719        1
## 4       1:28.603         215.464        1
## 5       1:27.418         218.385        1
## 6       1:29.639         212.974       11
##   driverId  driverRef number code forename    surname        dob nationality
## 1        1   hamilton     44  HAM    Lewis   Hamilton 1985-01-07     British
## 2        2   heidfeld    \\N  HEI     Nick   Heidfeld 1977-05-10      German
## 3        3    rosberg      6  ROS     Nico    Rosberg 1985-06-27      German
## 4        4     alonso     14  ALO Fernando     Alonso 1981-07-29     Spanish
## 5        5 kovalainen    \\N  KOV   Heikki Kovalainen 1981-10-19     Finnish
## 6        6   nakajima    \\N  NAK   Kazuki   Nakajima 1985-01-11    Japanese
##                                              url
## 1    http://en.wikipedia.org/wiki/Lewis_Hamilton
## 2     http://en.wikipedia.org/wiki/Nick_Heidfeld
## 3      http://en.wikipedia.org/wiki/Nico_Rosberg
## 4   http://en.wikipedia.org/wiki/Fernando_Alonso
## 5 http://en.wikipedia.org/wiki/Heikki_Kovalainen
## 6   http://en.wikipedia.org/wiki/Kazuki_Nakajima
##   raceId year round circuitId                  name       date     time
## 1      1 2009     1         1 Australian Grand Prix 2009-03-29 06:00:00
## 2      2 2009     2         2  Malaysian Grand Prix 2009-04-05 09:00:00
## 3      3 2009     3        17    Chinese Grand Prix 2009-04-19 07:00:00
## 4      4 2009     4         3    Bahrain Grand Prix 2009-04-26 12:00:00
## 5      5 2009     5         4    Spanish Grand Prix 2009-05-10 12:00:00
## 6      6 2009     6         6     Monaco Grand Prix 2009-05-24 12:00:00
##                                                       url fp1_date fp1_time
## 1 http://en.wikipedia.org/wiki/2009_Australian_Grand_Prix      \\N      \\N
## 2  http://en.wikipedia.org/wiki/2009_Malaysian_Grand_Prix      \\N      \\N
## 3    http://en.wikipedia.org/wiki/2009_Chinese_Grand_Prix      \\N      \\N
## 4    http://en.wikipedia.org/wiki/2009_Bahrain_Grand_Prix      \\N      \\N
## 5    http://en.wikipedia.org/wiki/2009_Spanish_Grand_Prix      \\N      \\N
## 6     http://en.wikipedia.org/wiki/2009_Monaco_Grand_Prix      \\N      \\N
##   fp2_date fp2_time fp3_date fp3_time quali_date quali_time sprint_date
## 1      \\N      \\N      \\N      \\N        \\N        \\N         \\N
## 2      \\N      \\N      \\N      \\N        \\N        \\N         \\N
## 3      \\N      \\N      \\N      \\N        \\N        \\N         \\N
## 4      \\N      \\N      \\N      \\N        \\N        \\N         \\N
## 5      \\N      \\N      \\N      \\N        \\N        \\N         \\N
## 6      \\N      \\N      \\N      \\N        \\N        \\N         \\N
##   sprint_time
## 1         \\N
## 2         \\N
## 3         \\N
## 4         \\N
## 5         \\N
## 6         \\N
##   constructorId constructorRef       name nationality
## 1             1        mclaren    McLaren     British
## 2             2     bmw_sauber BMW Sauber      German
## 3             3       williams   Williams     British
## 4             4        renault    Renault      French
## 5             5     toro_rosso Toro Rosso     Italian
## 6             6        ferrari    Ferrari     Italian
##                                                            url
## 1                         http://en.wikipedia.org/wiki/McLaren
## 2                      http://en.wikipedia.org/wiki/BMW_Sauber
## 3 http://en.wikipedia.org/wiki/Williams_Grand_Prix_Engineering
## 4          http://en.wikipedia.org/wiki/Renault_in_Formula_One
## 5             http://en.wikipedia.org/wiki/Scuderia_Toro_Rosso
## 6                http://en.wikipedia.org/wiki/Scuderia_Ferrari
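The loading step described above is not shown in the post, so here is a minimal sketch of how it could look. The helper name `load_f1_data` and the `data` folder are assumptions for illustration; only the four file names and `read.csv()` come from the text.

```r
# Hypothetical helper: read the four Kaggle CSV files from one folder.
# The function name and the default folder are assumptions, not from the post.
load_f1_data <- function(data_dir = "data") {
  files <- c("results", "drivers", "races", "constructors")
  tables <- lapply(files, function(name) {
    # Plain read.csv(), as in the text. Missing values stay as the text "\N";
    # adding na.strings = "\\N" would turn them into NA instead.
    read.csv(file.path(data_dir, paste0(name, ".csv")),
             stringsAsFactors = FALSE)
  })
  names(tables) <- files
  tables
}

# Usage (assumes the CSV files exist in ./data):
# f1_raw <- load_f1_data("data")
# head(f1_raw$results)
```

The `\\N` entries in the printed tables are how R displays the dataset's `\N` missing-value marker when it is read as text, which is why the `na.strings` option is worth considering before doing any arithmetic on those columns.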

Cleaning and Merging the Data

Next, I cleaned the column names using clean_names() from the janitor package and merged the four data frames into one using merge(). I also removed unnecessary columns and renamed some columns to make them more descriptive. Finally, I sorted the data frame by year, round, and position order.

##    year             gpName round driverId      driverName constructorId
## 1  2022 Bahrain Grand Prix     1      844         Leclerc             6
## 2  2022 Bahrain Grand Prix     1      832           Sainz             6
## 3  2022 Bahrain Grand Prix     1        1        Hamilton           131
## 4  2022 Bahrain Grand Prix     1      847         Russell           131
## 5  2022 Bahrain Grand Prix     1      825 Kevin Magnussen           210
## 6  2022 Bahrain Grand Prix     1      822          Bottas            51
## 7  2022 Bahrain Grand Prix     1      839            Ocon           214
## 8  2022 Bahrain Grand Prix     1      852         Tsunoda           213
## 9  2022 Bahrain Grand Prix     1        4          Alonso           214
## 10 2022 Bahrain Grand Prix     1      855            Zhou            51
##    constructorName grid positionOrder points    raceTime milliseconds
## 1          Ferrari    1             1     26 1:37:33.584      5853584
## 2          Ferrari    3             2     18      +5.598      5859182
## 3         Mercedes    5             3     15      +9.675      5863259
## 4         Mercedes    9             4     12     +11.211      5864795
## 5     Haas F1 Team    7             5     10     +14.754      5868338
## 6       Alfa Romeo    6             6      8     +16.119      5869703
## 7   Alpine F1 Team   11             7      6     +19.423      5873007
## 8       AlphaTauri   16             8      4     +20.386      5873970
## 9   Alpine F1 Team    8             9      2     +22.390      5875974
## 10      Alfa Romeo   15            10      1     +23.064      5876648
##    fastestLapRank fastestLapTime fastestLapSpeed driverNationality
## 1               1       1:34.570         206.018        Monegasque
## 2               3       1:35.740         203.501           Spanish
## 3               5       1:36.228         202.469           British
## 4               6       1:36.302         202.313           British
## 5               8       1:36.623         201.641            Danish
## 6               7       1:36.599         201.691           Finnish
## 7              14       1:37.110         200.630            French
## 8              13       1:37.104         200.642          Japanese
## 9              10       1:36.733         201.412           Spanish
## 10              9       1:36.685         201.512           Chinese
##    constructorNationality
## 1                 Italian
## 2                 Italian
## 3                  German
## 4                  German
## 5                American
## 6                   Swiss
## 7                  French
## 8                 Italian
## 9                  French
## 10                  Swiss
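As a rough illustration of the merge-and-sort step, the sketch below joins four tiny stand-in tables with base R's `merge()`. The toy rows reuse a few values from the output above, but the tables themselves are cut down for illustration, and the `janitor::clean_names()` call mentioned in the text is left out to keep the example free of extra packages.

```r
# Toy stand-ins for the four tables, with only the columns needed for the join.
results <- data.frame(raceId = c(1, 1), driverId = c(832, 844),
                      constructorId = c(6, 6),
                      positionOrder = c(2, 1), points = c(18, 26))
drivers <- data.frame(driverId = c(844, 832), surname = c("Leclerc", "Sainz"))
races <- data.frame(raceId = 1, year = 2022, round = 1,
                    name = "Bahrain Grand Prix")
constructors <- data.frame(constructorId = 6, name = "Ferrari")

# Rename columns first so the two "name" columns do not clash after merging.
names(races)[names(races) == "name"] <- "gpName"
names(constructors)[names(constructors) == "name"] <- "constructorName"
names(drivers)[names(drivers) == "surname"] <- "driverName"

# Join everything onto the results table, one key at a time.
f1 <- merge(results, races, by = "raceId")
f1 <- merge(f1, drivers, by = "driverId")
f1 <- merge(f1, constructors, by = "constructorId")

# Sort by year, round, and finishing order.
f1 <- f1[order(f1$year, f1$round, f1$positionOrder), ]
```

Naming the key explicitly in each `merge()` call avoids accidental joins on other shared columns, such as the `url` column that appears in several of the raw files.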

Drivers With Most Wins in Formula One

The following bar graph shows the number of wins for the top seven drivers in Formula One history, ranked in descending order by total wins. It reveals that Lewis Hamilton is currently the GOAT in terms of wins, with 103 victories to his name. He has surpassed the previous record holder, Michael Schumacher, who is widely regarded as one of the greatest drivers of all time and whose 91 wins are still an impressive feat. The graph also highlights the rise of Max Verstappen, who has already achieved 35 wins at a young age and is quickly climbing the ranks.
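The code behind this chart is not included in the post. A minimal sketch of the idea, assuming a win is any row with `positionOrder == 1` and using base R's `barplot()` in place of whatever plotting package was actually used, could look like this. The helper name `top_winners` and the toy data are made up for illustration.

```r
# Count race wins per driver; a win is assumed to be a row with positionOrder == 1.
top_winners <- function(f1, n = 7) {
  wins <- table(f1$driverName[f1$positionOrder == 1])
  head(sort(wins, decreasing = TRUE), n)
}

# Toy data (invented, not real results).
toy <- data.frame(
  driverName    = c("A", "B", "A", "C", "A", "B"),
  positionOrder = c(1, 2, 1, 1, 3, 1)
)

wins <- top_winners(toy, n = 2)

# Draw the bars in descending order of wins.
barplot(wins, main = "Race wins by driver", ylab = "Wins")
```

The same function applied to a `constructorName` column instead of `driverName` would give the constructor ranking discussed in the next section.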

Constructors with Most Race Wins

The following bar chart shows the number of wins for the top seven constructors in Formula One history. Ferrari, with its rich history and iconic red cars, stands out with the most wins by a significant margin. McLaren, with its distinctive orange livery and strong legacy in F1, comes in second place. Williams, which has produced some of the most successful cars in F1 history, takes the third spot. Mercedes, the dominant force in the sport in recent years, is not too far behind Williams. Red Bull, known for its aggressive and innovative approach to racing, comes in fifth. Lotus, a team that has fielded some of the most legendary drivers in F1 history, takes sixth place. Finally, Renault, now rebranded as Alpine, rounds out the top seven with its impressive record of wins.

Drivers with Most Wins for Each Constructor

Looking at the graph, we can see that Mercedes and Ferrari are the most successful constructors, with their top drivers (Hamilton and Schumacher, respectively) having won significantly more races than any other driver on this list. Red Bull and Vettel are also relatively successful, with Vettel winning most of his races with Red Bull.

It’s worth noting that Hamilton has had a particularly successful career, winning a record-breaking number of races and championships, and that Mercedes has been the dominant constructor in recent years, winning the constructors’ championship every year from 2014 to 2021. Similarly, Schumacher won most of his races with Ferrari, and is widely regarded as one of the greatest drivers in the history of the sport.
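One way to find the driver with the most wins for each constructor is to count wins per constructor-driver pair and keep the top row in each group. The sketch below does this in base R on invented data; the column names follow the merged table shown earlier, and everything else is an assumption about how the chart was built.

```r
# Toy winners table (invented): one row per race win.
wins <- data.frame(
  constructorName = c("X", "X", "X", "Y", "Y", "Y"),
  driverName      = c("A", "A", "B", "C", "D", "D")
)

# Count wins for every constructor-driver pair.
counts <- aggregate(n ~ constructorName + driverName,
                    data = transform(wins, n = 1), FUN = sum)

# Sort so each constructor's best driver comes first, then keep that first row.
counts <- counts[order(counts$constructorName, -counts$n), ]
top_driver <- counts[!duplicated(counts$constructorName), ]
```

If two drivers are tied for a constructor, this keeps whichever row sorts first, so ties would need separate handling in a real analysis.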


Average Speed for the Fastest Lap

The graph illustrates the average speed of the fastest lap in each Grand Prix from 2004 onward, excluding Grands Prix that were held fewer than 5 times in that period. The data is separated by Grand Prix using facet_grid, with each facet showing the trend in average fastest-lap speed for one Grand Prix. The points are colored by Grand Prix, with each Grand Prix assigned a different color.

The goal of this facet grid is to examine how the average speed of the fastest lap has changed over time for each Grand Prix. By separating the data into facets, we can see whether the trends differ significantly between Grands Prix, and whether some have seen a larger increase in speed over time than others. This can provide insights into the factors that may be contributing to changes in the speed of the fastest lap, such as changes in technology, driver skills, or track design.
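The data preparation behind the facet grid can be sketched in base R as follows. The toy data is invented, and the sketch assumes that `fastestLapSpeed` is already numeric; in the raw files that column contains `\N` text for some races, so it would likely need converting with `as.numeric()` first.

```r
# Toy data (invented): fastest-lap speeds in km/h, two drivers per race.
f1 <- data.frame(
  gpName = rep(c("GP One", "GP Two"), times = c(12, 4)),
  year   = c(rep(2003:2008, each = 2), rep(2005:2006, each = 2)),
  fastestLapSpeed = c(seq(200, 211, by = 1), 190, 192, 194, 196)
)

# Keep seasons from 2004 onward and average the speed per Grand Prix and year.
recent <- f1[f1$year >= 2004, ]
avg_speed <- aggregate(fastestLapSpeed ~ gpName + year,
                       data = recent, FUN = mean)

# Drop Grands Prix that appear in fewer than 5 seasons.
seasons <- table(avg_speed$gpName)
avg_speed <- avg_speed[avg_speed$gpName %in% names(seasons)[seasons >= 5], ]
```

With ggplot2 loaded, a call along the lines of `ggplot(avg_speed, aes(year, fastestLapSpeed, colour = gpName)) + geom_point() + facet_grid(rows = vars(gpName))` would then draw one panel per Grand Prix, which matches the layout described above.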

Looking at the facet grid, we can see that the average speed of the fastest lap for most of the Grand Prix circuits has remained relatively consistent over the years. However, there are a few Grand Prix where we can observe a significant change in the average speed over time.

For example, the Australian GP circuit was updated while the race was cancelled in 2020 and 2021 due to COVID-19. A fourth Drag Reduction System (DRS) zone was also planned, but it was dropped for safety reasons. The upgrade did make the circuit faster, and with the fourth DRS zone added for this season the cars should be able to go faster still: drivers are expected to exceed 330 km/h (205 mph) at the 2023 Australian GP.

The French Grand Prix was absent from the Formula One World Championship calendar between 2009 and 2017 because of a combination of factors, including financial difficulties faced by the organizers, lack of interest from potential host cities, and scheduling conflicts with other races.

The Monaco GP, considered one of the most iconic races in Formula One, if not the most iconic, has always been one of the slowest. Its narrow roads and crazy U-shaped turns make it difficult for drivers to overtake one another.

Largest Differences in Speed by Grand Prix

Grand Prix                  Difference in Speed (km/h)
Japanese Grand Prix         43.72077
British Grand Prix          38.49261
Chinese Grand Prix          37.17060
Brazilian Grand Prix        36.64566
Spanish Grand Prix          32.06235
Turkish Grand Prix          30.21350
Australian Grand Prix       29.59399
Abu Dhabi Grand Prix        28.60371
German Grand Prix           28.45907
European Grand Prix         25.52644
Bahrain Grand Prix          25.23509
United States Grand Prix    24.03391
Hungarian Grand Prix        23.23751
Italian Grand Prix          20.87135
Malaysian Grand Prix        20.37037
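The post does not say how the difference in speed was calculated. One plausible reading, shown here purely as an assumption, is the gap between the highest and lowest yearly average fastest-lap speed for each Grand Prix. The toy numbers below are invented.

```r
# Toy yearly averages (invented), in km/h.
avg_speed <- data.frame(
  gpName = c("GP One", "GP One", "GP One", "GP Two", "GP Two"),
  year   = c(2004, 2010, 2022, 2004, 2022),
  fastestLapSpeed = c(200, 215, 230, 190, 200)
)

# Assumed definition: highest yearly average minus lowest yearly average.
speed_diff <- aggregate(fastestLapSpeed ~ gpName, data = avg_speed,
                        FUN = function(x) max(x) - min(x))
names(speed_diff)[2] <- "differenceInSpeed"

# Largest difference first, as in the table above.
speed_diff <- speed_diff[order(-speed_diff$differenceInSpeed), ]
```

A different definition, such as last season minus first season, would rank the circuits differently, so the choice is worth stating alongside the table.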

Conclusion

Our analysis has provided some valuable insights into the world of Formula One racing. Looking at drivers, we have seen that Lewis Hamilton and Michael Schumacher have established themselves as two of the greatest drivers in the history of the sport, largely due to their ability to win races consistently. Looking at constructors, we have seen the dominance of teams like Ferrari and Mercedes, who have managed to build dynasties through a combination of talented drivers and innovative car designs.

Finally, our examination of average speeds for the fastest lap in each Grand Prix has shown that there have been some significant changes over the years, with some tracks experiencing much larger shifts than others. This information can be valuable for teams and drivers as they prepare for each season, as it allows them to anticipate the unique challenges posed by each track on the schedule.

As we look ahead to the upcoming season, there is reason to be excited about the potential for a three-way battle between Red Bull, Ferrari, and Mercedes. With talented drivers like Sergio Perez, Carlos Sainz, and George Russell (the “second” drivers) in the mix, there is every reason to believe that this season could be one of the most exciting in recent memory.

Throughout this analysis, we used the statistical programming language R to process the data and create the visualizations. Working in R let us surface patterns that are hard to spot by scanning the raw tables, the kind of insight that can be useful to anyone trying to understand what it takes to succeed in Formula One racing.