# Analyzing Tire Pressure in NASCAR: Using Formulas in R
# In my analysis of tire pressure reduction in NASCAR race cars, I decided to leverage the Wilkinson-Rogers formula notation in R. This notation, particularly useful in statistical modeling, allows me to specify the relationships between different variables in the dataset, helping me understand which factors most influence tire pressure loss over multiple laps. Here, I’ll walk through how I use formulas to model tire pressure based on various environmental, driver, and vehicle conditions, using R's lm() function to fit linear regression models.
#
# The Basics of Formulas in R
# In R, formulas are fundamental for setting up statistical models. Each formula defines a dependent variable on the left side of the ~ operator and one or more independent variables on the right. For instance, to model how Tire_Pressure_FL_psi (the pressure in the front-left tire) changes over laps, I could create a formula like this:
library(readxl)
library(knitr)
library(kableExtra)
# Load the dataset
nascar_tire_pressure_analysis <- read_excel("C:/Users/jacob/Downloads/nascar_tire_pressure_analysis.xlsx")
View(nascar_tire_pressure_analysis)
# Define the formula for the model
my_formula <- formula(Tire_Pressure_FL_psi ~ Lap_Number + Ambient_Temperature_F + Track_Surface_Temperature_F)
# This formula object specifies that the model will predict Tire_Pressure_FL_psi based on Lap_Number, Ambient_Temperature_F, and Track_Surface_Temperature_F. When I pass this formula to lm(), R automatically estimates the regression coefficients for each independent variable, showing how they affect the dependent variable, Tire_Pressure_FL_psi.
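# As a small illustration of what a formula object holds, the sketch below builds the same formula and inspects it without fitting anything. The name demo_formula is introduced here for illustration only; class(), all.vars(), and terms() are base R functions.
demo_formula <- Tire_Pressure_FL_psi ~ Lap_Number + Ambient_Temperature_F + Track_Surface_Temperature_F
class(demo_formula)                        # a formula is its own class of object
all.vars(demo_formula)                     # every variable name, response first
attr(terms(demo_formula), "term.labels")   # right-hand-side terms only
# Because a formula stores only symbols, none of these calls needs the dataset to be loaded.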
# Fitting the Model
# Using lm() with a formula and a dataset allows me to fit a linear model. For example, in my analysis, I can use:
# Fit the model
mod1 <- lm(Tire_Pressure_FL_psi ~ Lap_Number + Ambient_Temperature_F + Track_Surface_Temperature_F, data = nascar_tire_pressure_analysis)
summary(mod1)
## Warning in summary.lm(mod1): essentially perfect fit: summary may be unreliable
##
## Call:
## lm(formula = Tire_Pressure_FL_psi ~ Lap_Number + Ambient_Temperature_F +
## Track_Surface_Temperature_F, data = nascar_tire_pressure_analysis)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.358e-14 -6.049e-16 3.093e-16 1.168e-15 3.095e-15
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.205e+01 1.410e-14 2.274e+15 <2e-16 ***
## Lap_Number -5.000e-02 2.392e-17 -2.090e+15 <2e-16 ***
## Ambient_Temperature_F 2.189e-16 1.199e-16 1.825e+00 0.0745 .
## Track_Surface_Temperature_F 1.258e-16 7.724e-17 1.629e+00 0.1102
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.384e-15 on 46 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 1.526e+30 on 3 and 46 DF, p-value: < 2.2e-16
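# This model shows how much tire pressure changes per lap and whether ambient and track surface temperatures matter. Here the Lap_Number coefficient is -0.05, a loss of 0.05 psi per lap from an intercept of 32.05 psi, while the two temperature coefficients are on the order of 1e-16, which is effectively zero. The "essentially perfect fit" warning and an R-squared of 1 indicate that pressure in this dataset is an exact linear function of lap number, so the standard errors and p-values should not be read as meaningful.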
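# The "essentially perfect fit" warning appears when the response is an exact linear function of the predictors. A minimal sketch of that situation follows, using a small made-up data frame: toy_laps and toy_fit are hypothetical names, and the values 32.05 and -0.05 are chosen only to mirror the estimates printed above.
toy_laps <- data.frame(Lap_Number = 1:50)
toy_laps$Tire_Pressure_FL_psi <- 32.05 - 0.05 * toy_laps$Lap_Number   # deterministic, no noise
toy_fit <- lm(Tire_Pressure_FL_psi ~ Lap_Number, data = toy_laps)
coef(toy_fit)                  # recovers the intercept and slope used to build the data
max(abs(residuals(toy_fit)))   # residuals sit at floating-point rounding level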
# Excluding the Intercept
# By default, lm() includes an intercept in the model. If I want to exclude it, which forces the fitted surface through the origin, I can add 0 + or - 1 to the formula:
# Exclude the intercept
coef(lm(Tire_Pressure_FL_psi ~ 0 + Lap_Number + Ambient_Temperature_F, data = nascar_tire_pressure_analysis))
## Lap_Number Ambient_Temperature_F
## -0.06033434 0.36068414
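# This setup is appropriate only when the response should be zero whenever every predictor is zero. Note how the estimates change: without an intercept, Ambient_Temperature_F takes a coefficient of about 0.36, most likely because it has to absorb the baseline pressure that the intercept captured in the earlier model.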
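# A quick check, on made-up data, that the two ways of dropping the intercept are equivalent. The data frame toy_origin, its columns x and y, and the fit names are hypothetical and not part of the NASCAR dataset.
set.seed(1)
toy_origin <- data.frame(x = 1:20)
toy_origin$y <- 5 + 2 * toy_origin$x + rnorm(20)
fit_zero  <- lm(y ~ 0 + x, data = toy_origin)   # intercept removed with 0 +
fit_minus <- lm(y ~ x - 1, data = toy_origin)   # intercept removed with - 1
all.equal(coef(fit_zero), coef(fit_minus))      # both spellings give the same single slope
coef(lm(y ~ x, data = toy_origin))              # with an intercept, for comparison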
# Modeling Interactions
# One of the strengths of formulas in R is the ability to include interactions between variables. For instance, the interaction between Lap_Number and Driver_Aggression_Level could reveal how aggressive driving over multiple laps impacts tire pressure. An interaction on its own is written with :, as in Lap_Number:Driver_Aggression_Level, while * adds the main effects as well, which is what the model below uses:
# Model interactions
coef(lm(Tire_Pressure_FL_psi ~ Lap_Number * Driver_Aggression_Level, data = nascar_tire_pressure_analysis))
## (Intercept) Lap_Number
## 3.205000e+01 -5.000000e-02
## Driver_Aggression_Level Lap_Number:Driver_Aggression_Level
## 1.124921e-15 -3.409218e-17
# It is usually advisable to include the main effects along with an interaction, which is why the model above uses *. The * symbol expands to both main effects (Lap_Number, Driver_Aggression_Level) and the interaction term (Lap_Number:Driver_Aggression_Level), so Lap_Number * Driver_Aggression_Level is shorthand for Lap_Number + Driver_Aggression_Level + Lap_Number:Driver_Aggression_Level.
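# One way to confirm what * expands to is to ask terms() for the term labels, which needs no data. The variable names are those used above; expand_terms is a hypothetical helper defined here for illustration.
expand_terms <- function(f) attr(terms(f), "term.labels")
star_terms  <- expand_terms(Tire_Pressure_FL_psi ~ Lap_Number * Driver_Aggression_Level)
long_terms  <- expand_terms(Tire_Pressure_FL_psi ~ Lap_Number + Driver_Aggression_Level + Lap_Number:Driver_Aggression_Level)
colon_terms <- expand_terms(Tire_Pressure_FL_psi ~ Lap_Number:Driver_Aggression_Level)
identical(star_terms, long_terms)   # the shorthand and the long form list the same terms
colon_terms                         # the : form alone contains only the interaction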
#
# Higher-Order Interactions
# To analyze more complex interactions, I can extend this approach to multiple variables. For example, I could model how tire pressure is influenced by interactions among Lap_Number, Ambient_Temperature_F, and Track_Surface_Temperature_F:
# Higher-order interactions
coef(lm(Tire_Pressure_FL_psi ~ Lap_Number * Ambient_Temperature_F * Track_Surface_Temperature_F, data = nascar_tire_pressure_analysis))
## (Intercept)
## 3.205000e+01
## Lap_Number
## -5.000000e-02
## Ambient_Temperature_F
## 2.422985e-14
## Track_Surface_Temperature_F
## 1.690219e-14
## Lap_Number:Ambient_Temperature_F
## -7.629100e-16
## Lap_Number:Track_Surface_Temperature_F
## -5.337269e-16
## Ambient_Temperature_F:Track_Surface_Temperature_F
## -1.880742e-16
## Lap_Number:Ambient_Temperature_F:Track_Surface_Temperature_F
## 5.947871e-18
# Exclude specific interactions
coef(lm(Tire_Pressure_FL_psi ~ Lap_Number * Ambient_Temperature_F * Track_Surface_Temperature_F - Lap_Number:Ambient_Temperature_F:Track_Surface_Temperature_F, data = nascar_tire_pressure_analysis))
## (Intercept)
## 3.205000e+01
## Lap_Number
## -5.000000e-02
## Ambient_Temperature_F
## 4.249384e-15
## Track_Surface_Temperature_F
## 2.898159e-15
## Lap_Number:Ambient_Temperature_F
## -1.655972e-17
## Lap_Number:Track_Surface_Temperature_F
## -9.004575e-18
## Ambient_Temperature_F:Track_Surface_Temperature_F
## -2.872339e-17
# The first formula above includes all main effects, all two-way interactions, and the three-way interaction, so I can see how the combination of these variables affects tire pressure. The second excludes the three-way interaction by subtracting it with -.
# Limit interactions to two-way
coef(lm(Tire_Pressure_FL_psi ~ (Lap_Number + Ambient_Temperature_F + Track_Surface_Temperature_F) ^ 2, data = nascar_tire_pressure_analysis))
## (Intercept)
## 3.205000e+01
## Lap_Number
## -5.000000e-02
## Ambient_Temperature_F
## 4.249384e-15
## Track_Surface_Temperature_F
## 2.898159e-15
## Lap_Number:Ambient_Temperature_F
## -1.655972e-17
## Lap_Number:Track_Surface_Temperature_F
## -9.004575e-18
## Ambient_Temperature_F:Track_Surface_Temperature_F
## -2.872339e-17
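# The ^ operator is an alternative: raising the sum of the predictors to the power 2 keeps only the main effects and the two-way interactions, which is why the coefficients above match those of the subtraction version.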
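# The matching coefficients in the two outputs above are expected if both formulas expand to the same set of terms. A sketch that compares the expansions directly with terms() follows; the short names y, a, b, and c are placeholders standing in for the response and the three predictors.
labels_subtract <- attr(terms(y ~ a * b * c - a:b:c), "term.labels")
labels_power    <- attr(terms(y ~ (a + b + c)^2), "term.labels")
setequal(labels_subtract, labels_power)   # TRUE when the two expansions contain the same terms
labels_power                              # main effects followed by the two-way interactions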
# Using All Variables in a Formula
# For a more comprehensive model, I can use the . symbol to include all variables in the dataset as predictors, except for the dependent variable. This shorthand is particularly helpful when I want to quickly analyze the effect of all available variables on tire pressure:
# Use all variables
coef(lm(Tire_Pressure_FL_psi ~ ., data = nascar_tire_pressure_analysis))
## (Intercept) Lap_Number
## 3.205000e+01 -5.000000e-02
## Ambient_Temperature_F Track_Surface_Temperature_F
## 8.521292e-17 1.968615e-17
## Tire_Pressure_FR_psi Tire_Pressure_RL_psi
## NA NA
## Tire_Pressure_RR_psi Tire_Wear_FL_Percent
## NA NA
## Tire_Wear_FR_Percent Tire_Wear_RL_Percent
## NA NA
## Tire_Wear_RR_Percent Speed_mph
## NA NA
## Driver_Aggression_Level Pit_Stops
## 1.526379e-16 -2.876368e-15
## Lap_Time_sec Humidity_Percent
## NA 3.132553e-16
## Wind_Speed_mph
## 2.593774e-16
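# This shorthand puts every other column into the model, which is a quick way to screen candidate predictors. The NA coefficients mark predictors that lm() could not estimate because they are exact linear combinations of other predictors in the model (here, for example, the other tire pressures and the tire wear columns), so they need to be dropped or examined before interpreting the fit.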
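# NA coefficients are what lm() reports for a predictor that is an exact linear combination of the others. The sketch below reproduces that with made-up data: toy_alias, its values, and alias_fit are hypothetical, and only the column names are borrowed from the dataset.
set.seed(42)
toy_alias <- data.frame(Lap_Number = 1:30)
toy_alias$Tire_Wear_FL_Percent <- 2 * toy_alias$Lap_Number   # perfectly collinear with Lap_Number
toy_alias$Tire_Pressure_FL_psi <- 32 - 0.05 * toy_alias$Lap_Number + rnorm(30, sd = 0.1)
alias_fit <- lm(Tire_Pressure_FL_psi ~ ., data = toy_alias)
coef(alias_fit)    # the collinear column is reported as NA
alias(alias_fit)   # lists which terms are linear combinations of others
# Running alias() on the full model is one way to see which columns of the real dataset are redundant.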
# Summary Table of Formula Notations for Tire Pressure Analysis
# Below is a summary table that provides examples of formula notations I used for tire pressure analysis in NASCAR race cars. The table illustrates different ways to specify relationships and interactions in the model, enhancing my understanding of how various factors impact tire pressure.
# The table is built with knitr and kableExtra, which were loaded at the top of this script; both packages are already installed on my local computer.
# Summary table of formula notations
formula_table <- data.frame(
  Formula_Notation = c("Basic Formula", "Without Intercept", "Two-Way Interaction",
                       "Higher-Order Interaction", "Exclude Higher Interaction", "All Variables"),
  Code_Example = c("Tire_Pressure_FL_psi ~ Lap_Number + Ambient_Temperature_F",
                   "Tire_Pressure_FL_psi ~ 0 + Lap_Number + Ambient_Temperature_F",
                   "Tire_Pressure_FL_psi ~ Lap_Number * Driver_Aggression_Level",
                   "Tire_Pressure_FL_psi ~ Lap_Number * Ambient_Temperature_F * Track_Surface_Temperature_F",
                   "Tire_Pressure_FL_psi ~ Lap_Number * Ambient_Temperature_F * Track_Surface_Temperature_F - Lap_Number:Ambient_Temperature_F:Track_Surface_Temperature_F",
                   "Tire_Pressure_FL_psi ~ ."),
  Description = c("Predicts tire pressure based on lap number and ambient temperature.",
                  "Excludes intercept to focus on independent variables only.",
                  "Includes interaction between lap number and driver aggression level.",
                  "Adds higher-order interaction among multiple environmental factors.",
                  "Removes three-way interaction but keeps main and two-way effects.",
                  "Uses all available variables as predictors for tire pressure.")
)
# Create the table
kable(formula_table, "html", col.names = c("Formula Notation", "Code Example", "Description")) %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed")) %>%
  column_spec(1, bold = TRUE, color = "white", background = "#4CAF50") %>%
  column_spec(2, background = "#E0F7FA") %>%
  column_spec(3, background = "#FFEBEE")
## | Formula Notation | Code Example | Description |
## |---|---|---|
## | Basic Formula | Tire_Pressure_FL_psi ~ Lap_Number + Ambient_Temperature_F | Predicts tire pressure based on lap number and ambient temperature. |
## | Without Intercept | Tire_Pressure_FL_psi ~ 0 + Lap_Number + Ambient_Temperature_F | Excludes intercept to focus on independent variables only. |
## | Two-Way Interaction | Tire_Pressure_FL_psi ~ Lap_Number * Driver_Aggression_Level | Includes interaction between lap number and driver aggression level. |
## | Higher-Order Interaction | Tire_Pressure_FL_psi ~ Lap_Number * Ambient_Temperature_F * Track_Surface_Temperature_F | Adds higher-order interaction among multiple environmental factors. |
## | Exclude Higher Interaction | Tire_Pressure_FL_psi ~ Lap_Number * Ambient_Temperature_F * Track_Surface_Temperature_F - Lap_Number:Ambient_Temperature_F:Track_Surface_Temperature_F | Removes three-way interaction but keeps main and two-way effects. |
## | All Variables | Tire_Pressure_FL_psi ~ . | Uses all available variables as predictors for tire pressure. |
# This table serves as a guide to different formula notations, helping me navigate various configurations of statistical models in R. By understanding these notations, I can better analyze the complex dynamics of tire pressure in NASCAR race cars and optimize strategies for performance and safety on the track.