Final Project - STAT5110

The Hendrick Motorsports Conundrum

In 2024, NASCAR introduced a new rule for the 2025 season: each organization will only be allowed to hold up to 3 charters. This doesn’t affect most of the teams in the garage, as they have 3 or fewer charters, but it does affect both Hendrick Motorsports and Joe Gibbs Racing, which currently have 4 charters each. In order to go from 4 to 3 charters, each of those teams will eventually have to sell one of its charters. For Joe Gibbs, the decision seems easier than it does for Hendrick. In this project, I want to investigate Hendrick Motorsports’ performance, look at its drivers, and see if I can come to a conclusion on which driver should ultimately be the one that gets cut.

Load in the necessary libraries

library(ggplot2)
library(tidyverse)
library(readxl)
library(showtext)
library(viridis)
library(ggExtra)
library(plotly)
library(openxlsx)

#Add a new font for the plots
font_add_google("Pridi")

Load in the data that I will be using for the final project

#the main nascar data
nascar <- read.xlsx("nascar.xlsx", sheet = 1)
#All nascar data pre Elliott injury
pre_injury <-  read.xlsx("nascar.xlsx", sheet = 6)
#all nascar data post Elliott injury 
post_injury <-  read.xlsx("nascar.xlsx", sheet = 7)

Data Cleaning and Reorganization

#Look at Hendrick Motorsports - pre and post Chase Elliott injury
hendrick_preinjury <- pre_injury %>%
  filter(driver_name %in% c("Alex Bowman", "Chase Elliott",
                            "William Byron", "Kyle Larson")) %>%
  mutate(status = "Pre-Injury")
hendrick_postinjury <- post_injury %>%
  filter(driver_name %in% c("Alex Bowman", "Chase Elliott",
                            "William Byron", "Kyle Larson")) %>%
  mutate(status = "Post-Injury")

#Create two barcharts so we can tell a more complete story
Predata1 <- hendrick_preinjury %>%
  select(driver_name, num_races, avg_fin, avg_q1rank, avg_q2rank, avg_q3rank, avg_q4rank, wavg_speed, ARP, wARP, status) 
Predata2 <- hendrick_preinjury %>%
  select(driver_name, num_races, wins, stage_wins, dnf, poles, top5, top10, status) 
Postdata1 <- hendrick_postinjury %>%
  select(driver_name, num_races, avg_fin, avg_q1rank, avg_q2rank, avg_q3rank, avg_q4rank, wavg_speed, ARP, wARP, status) 
Postdata2 <- hendrick_postinjury %>%
  select(driver_name, num_races, wins, stage_wins, dnf, poles, top5, top10, status)

#Combine the previous sets for the two unique charts
speedstats <- rbind(Predata1, Postdata1)
faststats <- rbind(Predata2, Postdata2)

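#Elliott plot 1: select his rows by name rather than by row position
#(columns 3:9 are avg_fin through ARP; rows stay in Pre-Injury, Post-Injury order)
elliott_speed <- speedstats[speedstats$driver_name == "Chase Elliott", ]
CE <- as.matrix(elliott_speed[, 3:9]); rownames(CE) <- elliott_speed$status
#Elliott plot 2 (columns 3:7 are wins through top5)
elliott_fast <- faststats[faststats$driver_name == "Chase Elliott", ]
ce <- as.matrix(elliott_fast[, 3:7]); rownames(ce) <- elliott_fast$status

#Remove excess clutter
rm(elliott_speed, elliott_fast)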

Plots 1-2 : Base R

What am I investigating?

Chase Elliott was one of the best drivers in the garage from 2018 to 2021. He won a championship and many races while being the most popular driver in NASCAR. One of the biggest question marks of the last season and a half in NASCAR, however, has been Elliott himself. While Hendrick Motorsports has been the best team in the NASCAR garage over the last 3 years, racking up loads of wins and dominant performances, Chase Elliott has only 1 win over his last 80 races and has not been a part of this domination. Elliott broke his leg in a snowboarding accident early in the 2023 season, and since his return he has looked worse to the eye than he used to. I want to see if my eyes have been deceiving me, or if Chase Elliott really has taken a step back since his injury.

Has Chase Elliott regressed since his leg injury in 2023?

Plot

#Create a side-by-side barplot comparing Chase Elliott Pre-Post injury
par(mar = c(5,4,5,4)+.01)
#Create the barplot
barplot(CE, col = c("navyblue", "gold"), 
        cex.names = 0.7, las = 1, ylab = NA, 
        names.arg = c("Avg Fin", "Q1 Rank", "Q2 Rank", "Q3 Rank", "Q4 Rank", "wAvgSpeed", "ARP"),
        beside = TRUE)
#Set the y-axis
axis(2, at = seq(0, 16, by = 2), labels = F)
#Set the alt tick marks
axis(2, at = seq(1, 15, by = 2), labels = F, tck = -0.02)
#Main title text
mtext("Chase Elliott Next-Gen Comparison",
      line = 3, side = 3, cex = 1.3, font = 2)
#Subtitle text
mtext("Pre-Injury (38 races) VS Post-Injury (73 races)",
      line = 2, side = 3, cex = 0.9)
#Y-axis text
mtext("Position", side = 2, line = 3, cex = 1)
#x-axis text
mtext("Stat", side = 1, line = 3, cex = 1)
#legend set-up
legend("top", inset = c(-0.15, -0.15), legend=c("Pre-Injury", "Post-Injury"),
       pch= 15, col=c("navyblue", "gold"), cex=0.8,
       box.lty=0, bty = 'n', xpd = T, horiz = T)

#Plot 2 - Set the barplot
barplot(ce, col = c("navyblue", "gold"), beside = TRUE, cex.names = 0.8,
        names.arg = c("Wins", "Stage Wins", "DNF's", "Poles", "Top 5's"))
#Set y-axis
axis(2, at = seq(0, 20, by = 5), labels = F)
#y-axis alt tick marks
axis(2, at = seq(0, 20, by = 1), labels = F, tck = -0.02)
#Main title text
mtext("Chase Elliott Next-Gen Comparison",
      line = 3, side = 3, cex = 1.3, font = 2)
#Subtitle text
mtext("Pre-Injury (38 races) VS Post-Injury (73 races)",
      line = 2, side = 3, cex = 0.9)
#Y-axis text
mtext("Count", side = 2, line = 3, cex = 1)
#x-axis text
mtext("Stat", side = 1, line = 3, cex = 1)
#Set up the legend
legend("top", inset = c(-0.15, -0.15), legend=c("Pre-Injury", "Post-Injury"),
       pch= 15, col=c("navyblue", "gold"), cex=0.8,
       box.lty=0, bty = 'n', xpd = T, horiz = T)

Discussion of Plot

I do think that there is an argument to be made that the barcharts support the visual impression of Chase Elliott being worse since his injury. Starting with the first barchart, Chase Elliott ran about 2 positions worse post-injury in his Q2, Q3, and Q4 ranks. As the race went on, Chase Elliott and his team were no faster than they were in the first quarter, which is not a good sign. He was also worse in weighted average speed and average running position, which are metrics that measure how fast a driver is and where he runs compared to everyone around him. The more glaring differences are in the second plot. Pre-injury Chase Elliott had 5 wins in 38 races, while post-injury Chase Elliott has 1 win in 73 races. For a driver at an organization that wins a lot and values winning, this is an issue. The same can be said about poles (4 pre-injury to 1 post-injury). Lastly, while he has more top 5’s post-injury, he only has 7 more top 5’s than his pre-injury self in 35 more races. These plots do show that Chase Elliott has performed worse post-injury, but would it be enough for Hendrick to consider dropping him?
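Because the two samples differ in size (38 races vs. 73 races), raw counts can be hard to compare. The following is a minimal sketch that converts the win and pole counts quoted above into per-race percentages. The `elliott` data frame and its column names are made up for illustration and are not part of the project data.

```r
# Counts quoted in the discussion above; the data frame itself is illustrative
elliott <- data.frame(
  status = c("Pre-Injury", "Post-Injury"),
  races  = c(38, 73),
  wins   = c(5, 1),
  poles  = c(4, 1)
)

# Convert raw counts to a percentage of races run
elliott$win_pct  <- round(100 * elliott$wins  / elliott$races, 1)
elliott$pole_pct <- round(100 * elliott$poles / elliott$races, 1)

print(elliott)
```

On a per-race basis the drop is even clearer than in the raw counts: the win rate goes from roughly 13% of races to roughly 1%.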

Plot 3 : GGPlot

What am I investigating?

In order to decide who to drop when it comes time to go from 4 to 3 charters, we also need to compare Chase Elliott against his Hendrick teammates. Results are ultimately the most important metric, but teams are also looking for drivers who run well throughout the race. That is where Weighted Average Running Position (wARP) comes in. wARP is a metric that summarizes where a driver runs during a race, weighting each lap by how accurately it predicts where the driver will finish. Laps later in the race typically hold more weight.

How does Chase Elliott’s wARP compare to his teammates?
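The exact weighting behind wARP is not given here, so the following is only a sketch of the idea: a weighted mean of lap-by-lap running positions in which later laps count more. The linear weights, the made-up `running_pos` vector, and the helper name `weighted_arp` are all assumptions for illustration.

```r
# Hypothetical helper: weighted mean of lap-by-lap running positions.
# Linear weights (lap 1 = 1, lap 2 = 2, ...) are an assumption;
# the real wARP weights are not specified in this report.
weighted_arp <- function(running_pos, weights = seq_along(running_pos)) {
  stopifnot(length(running_pos) == length(weights))
  weighted.mean(running_pos, weights)
}

# Made-up 10-lap race: the driver starts mid-pack and runs up front late
running_pos <- c(20, 18, 15, 12, 10, 8, 6, 5, 4, 3)

arp  <- mean(running_pos)          # plain average running position
warp <- weighted_arp(running_pos)  # later laps count more

print(c(ARP = arp, wARP = warp))
```

In this toy race the weighted value is lower (better) than the plain average, because the driver spent the heavily weighted late laps near the front.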

Plot

#Keep only the 4 Hendrick drivers
hendrick <- nascar %>%
  filter(driver_name %in% c("Alex Bowman", "Chase Elliott",
                            "William Byron", "Kyle Larson"))

#allows for the new font to be used
showtext_auto()
#GGplot code
ggplot(hendrick, aes(x=as.factor(driver_num), wARP, 
                 fill = as.factor(driver_num)))+
  geom_violin() + #violin object
  scale_y_reverse() + #scale y-axis in reverse
  geom_boxplot(width = 0.3, color = "black", alpha = 0.2) + #boxplots over the violin plots
  labs(x = "Driver Number", y = "wARP") +
  ggtitle("Weighted Average Running Position for Hendrick Motorsports") +
  scale_fill_manual(values = c("5"="dodgerblue1","9"="royalblue4", 
                              "24"="yellowgreen","48"="purple4")) +
  theme(plot.title = element_text(family="Pridi", face = "bold", size = 40),
        axis.title.x = element_text(family="Pridi", size = 25),
        axis.title.y = element_text(family="Pridi", size = 25),
        axis.text = element_text(size = 25),
        strip.text = element_text(size = 25),
        legend.position = "none") +
  facet_wrap(~year) #Wrap it by year to show distributions for each year

Discussion of Plot

Looking at the violin plots, there are a few things to point out. The distributions are all very close. Chase Elliott (9) ran very well according to wARP in 2022, which we expected given his great pre-injury numbers. In the following years, however, his distribution is not noticeably different from those of his teammates. His median wARP is the best on the team in 2022, but then falls behind Kyle Larson (5) and William Byron (24) in 2023 and 2024. In 2025 so far, Elliott’s median wARP is last on the team. From 2023 on, Kyle Larson and William Byron seem to have overtaken Elliott as the best cars at Hendrick. However, Alex Bowman (48) has the worst distributions in 2022, 2023, and 2024, and nearly the worst in 2025. His performance, while not far behind, has been worse year after year in comparison to his Hendrick teammates. So who should be cut? A slightly declining Chase Elliott, or a slightly underperforming Alex Bowman?

Plot 4 : Plotly

What am I investigating?

The 2025 violin plot caught my eye because, for the first time in the Next-Gen era, Chase Elliott had the worst distribution in terms of wARP. I want to pivot from this variable slightly. As I mentioned before, results are the most important thing to a team. Finishing well on a weekly basis can make a team want to keep you around. Starting position is another kind of result. A good starting position reflects a team nailing the set-up of the car during the week leading up to the race. Both of these results matter to a team. I highlighted the 4 Hendrick cars while keeping the rest of the field in for comparison.

Whose average finish is better in relation to start position: Bowman or Elliott?

Plot

#Read in the 2025 summary data
nas25 <- read.xlsx("nascar.xlsx", sheet = 5); nas25$year <- 2025
#Flag the 4 Hendrick drivers
is_hendrick <- nas25$driver_name %in% c("Alex Bowman", "Chase Elliott",
                                         "Kyle Larson", "William Byron")
#Create a dummy variable for point size - Hendrick points are drawn larger
nas25$hendrick <- ifelse(is_hendrick, 9, 7)

#Create a variable for the point colors - one color per Hendrick driver, black for the rest
nas25$color <- ifelse(is_hendrick, nas25$driver_name, "Everyone Else")
#Create the plotly object
nas25 %>%
  group_by(driver_name) %>%
  plot_ly(x = ~avg_st, y = ~avg_fin, color = ~color,
          colors = c("purple", "navyblue", "black", "dodgerblue", "yellowgreen"),
          hoverinfo = "text",
          text = ~paste("Driver:", driver_name,
                        "<br>Start Position: ", round(avg_st,2),
                        "<br>Finish Position: ", round(avg_fin,2),
                        "<br>Driver Rating: ", round(avg_driverRTG,2))) %>%
  add_markers(showlegend = T, size = ~hendrick) %>%
  add_annotations(x = 12.5, y = 32.5, font = list(color = 'darkred',size = 14),
                  text = "Finished Below<br> Avg Start Pos",showarrow = F) %>%
  add_annotations(x = 32.5, y = 12.5,  font = list(color = 'darkgreen',size = 14),
                  text = "Finished Above<br> Avg Start Pos",showarrow = F) %>%
  layout(yaxis = list(title = "Finish Position"),
         xaxis = list(title = "Start Position",
                      range = list(0,40), dtick = 5, tick0 = 0, tickmode = "linear"),
         title = "2025 Season Average Start Position VS Finish Position",
         shapes = list(type = "line", x0 = 0, y0 = 0, x1 = 40, y1 = 40,
                       line = list(color = "black", width = 1)))

Discussion of Plot

There is a pretty glaring difference between Elliott and Bowman in this plotly scatterplot. Despite Elliott and Bowman having very similar average starting positions (15.82 to 16.18), the two finish about 6 and a half positions apart. Elliott has an average finish of 11.36, while Bowman has an average finish of 18. That is a major difference and shows that Elliott is getting far better results than both his starting position and his teammate Alex Bowman. In fact, just in terms of finish position, Hendrick has the top 3 cars (Byron - 8.91, Larson - 10.64, Elliott - 11.36), while Bowman lags significantly behind the rest. This suggests that despite a slight regression in weekly speed, Elliott is getting the most out of his equipment and getting good finishes, a complete contrast to Bowman.
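The gap can be made explicit by subtracting average finish from average start. This is a small sketch using only the averages quoted above, assuming the starting positions are listed in the same order as the names (Elliott 15.82, Bowman 16.18). The `drivers` data frame and the `pos_gained` column are illustrative names, not part of the project data.

```r
# Averages quoted in the discussion above (2025 season so far)
drivers <- data.frame(
  driver_name = c("Chase Elliott", "Alex Bowman"),
  avg_st      = c(15.82, 16.18),
  avg_fin     = c(11.36, 18.00)
)

# Positive values mean the driver finishes ahead of where he starts, on average
drivers$pos_gained <- drivers$avg_st - drivers$avg_fin

print(drivers)
```

Under that assumption Elliott gains about 4.5 positions per race on average, while Bowman loses about 1.8, which is the same story the distance from the diagonal line tells in the scatterplot.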

Plot 5 : Plotly

What am I investigating?

2024 was Elliott’s first full year back from injury, which makes it a good year for comparing the drivers at Hendrick. A really important metric is laps run inside the top 15. This metric reflects both performance and results: the more laps a driver runs inside the top 15, the faster the car is and the more likely the driver is to finish the race well. I wanted to look at the cumulative average T15 lap percentage for each of the Hendrick drivers for the 2024 season.

Is Alex Bowman behind his teammates in average top 15 laps percentage as well?

Plot

#Helper that builds the cumulative frames needed for the animated lines
#Code is from https://plotly.com/r/cumulative-animations/
accumulate_by <- function(dat, var) {
  var <- lazyeval::f_eval(var, dat)
  lvls <- plotly:::getLevels(var)
  dats <- lapply(seq_along(lvls), function(x) {
    cbind(dat[var %in% lvls[seq(1, x)], ], frame = lvls[[x]])
  })
  dplyr::bind_rows(dats)
}

#Create a cumulative T15 lap percentage variable for the 2024 season
nas2024 <- nascar %>%
  filter(year == 2024,
         driver_name %in% c("Alex Bowman", "Chase Elliott",
                            "William Byron", "Kyle Larson")) %>%
  arrange(race_num) %>% #cumsum() below assumes the races are in order
  group_by(driver_name) %>%
  mutate(cumulativet15 = cumsum(t15_laps) / cumsum(laps_ran)*100) %>%
  ungroup() %>%
  accumulate_by(~race_num) #allows for the frames to be specified on race_num

#Used for the color object on the animated plotly graph
#(only the 4 Hendrick drivers remain, so the driver name is the split)
nas2024$split <- nas2024$driver_name

#T15% - Animated Line plot
nas2024 %>%
  plot_ly(x = ~race_num, y = ~cumulativet15, frame = ~frame,
          type = 'scatter', mode = 'lines+markers', color = ~split, showlegend = T,
          colors = c("purple", "navyblue", "dodgerblue", "yellowgreen"),
          hoverinfo = "text",
          text = ~paste("<b>Driver:</b><i>", driver_name,
                        "</i><br><b>Current Race:</b><i>", race_num,
                        "</i><br><b>Current Race T15%:</b><i>", round(t15_lapsPCT,2),
                        "%</i><br><b>2024 T15%:</b><i>", round(cumulativet15,2), "%</i>")) %>%
  add_text(x = 20, y = 20, text = ~race_num, frame = ~race_num, showlegend = F,
           textfont = list(size = 50, color = toRGB("grey75"))) %>%
  layout(title = "<b>Cumulative Top 15 Laps Ran PCT<br>2024 Season</b>",
         yaxis = list(title = "<b>Cumulative T15 Lap PCT</b>",
                      range = list(0,100), dtick = 10, tick0 = 0, tickmode = "linear"),
         xaxis = list(title = "<b>Race Num</b>",
                      range = list(0,37), dtick = 4, tick0 = 0, tickmode = "linear")) %>%
  animation_opts(frame = 200, transition = 10, redraw = F) %>%
  animation_slider(hide = T) %>%
  animation_button(x = 1, xanchor = "right", y = 0, yanchor = "bottom")

Discussion of Plot

As expected, Alex Bowman was far behind his teammates in cumulative T15 lap percentage for the 2024 season. His season average was 62%, while his next closest teammate was William Byron (74%). Looking at the line chart, Alex Bowman had a rough stretch between races 22 and 26, with his line dipping from 59% to 52%. Over the same stretch, his teammates were all increasing or staying rather constant. The other notable thing here is that Chase Elliott ran more of his 2024 laps in the top 15 than William Byron did. While it was very close, this shows that Elliott was on par with William Byron despite the lack of wins. This is something that teams look for in a driver. Alex Bowman has been worse than his teammates in this metric, and by a considerable amount.
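The cumulative percentage in the plot is a ratio of running totals, so longer races carry more weight than shorter ones. The toy example below uses made-up lap counts for three races to show how that lap-weighted figure can differ from a simple running mean of the single-race percentages. Only the column names `race_num`, `t15_laps`, and `laps_ran` mirror the project data; everything else is illustrative.

```r
# Made-up three-race example; the lap counts are not real data
toy <- data.frame(
  race_num = 1:3,
  t15_laps = c(180, 50, 300),
  laps_ran = c(200, 100, 400)
)

# Single-race top-15 lap percentage
toy$race_pct <- 100 * toy$t15_laps / toy$laps_ran

# Lap-weighted cumulative percentage (same formula as cumulativet15 above)
toy$cumulativet15 <- 100 * cumsum(toy$t15_laps) / cumsum(toy$laps_ran)

# Unweighted running mean of the single-race percentages, for comparison
toy$running_mean <- cumsum(toy$race_pct) / seq_along(toy$race_pct)

print(toy)
```

After the third race the lap-weighted value is about 75.7%, while the unweighted running mean is about 71.7%, because the short second race (50 of 100 laps) counts for less when laps are pooled.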

Conclusions

When trying to answer the Hendrick Motorsports conundrum, there are a lot of things to consider. I do believe that no matter what, William Byron and Kyle Larson are safe from being cut. They are first and second among the 4 drivers in just about every metric and have somehow gotten better year after year. William Byron’s wARP for 2025 so far is especially strong. Additionally, I think that despite the slight regression since his injury, Chase Elliott is safe as well. He is finishing better than where he starts and is running nearly 3/4 of his laps in the top 15, which is a very high mark and one of the best in the entire series. Chase Elliott is also a NASCAR champion, a distinction only 36 drivers in the history of the sport have earned. Alex Bowman would be my candidate to be the one that gets cut. All four cars come out of the same shop, which should mean that the cars perform rather similarly. This is not the case with Bowman. His average finish being so much worse than his teammates’, coupled with his smaller share of laps in the top 15, is why I would choose him. Hendrick Motorsports is known for championships, wins, and dominant performances. Jimmie Johnson won 7 championships and 83 races. Jeff Gordon has 4 championships and 93 victories. Kyle Larson and Chase Elliott each have roughly 20 or more wins and a championship. William Byron has been as dominant as anyone over the last 4 years. Alex Bowman does not fit that mold, and the graphs and data support that claim.

Sam Wigton

2025-05-09
