November 02, 2025

Agenda

  • Data & questions
  • Quick data description
  • Visual exploration (ggplot & plotly)
  • Statistical analysis (linear regression)
  • Takeaways & next steps

Data & Questions

Dataset: Built-in R dataset mtcars (Motor Trend Car Road Tests)

Why this data?

  • Numeric variables, plus a few coded variables (cyl, am) that can be treated as categorical, suitable for multiple plot types
  • Good for demonstrating 2D and 3D visualizations
  • Supports a simple regression analysis

Guiding questions

  1. How does fuel efficiency (mpg) vary by number of cylinders?
  2. How are horsepower and weight associated with fuel efficiency?
  3. Which variables best explain variation in mpg?

Data Description

str(mtcars)
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
summary(mtcars)
##       mpg             cyl             disp             hp       
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0  
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
##       drat             wt             qsec             vs        
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
##        am              gear            carb      
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000  
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
##  Median :0.0000   Median :4.000   Median :2.000  
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812  
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000
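library(dplyr)  # provides %>% and mutate()

# Keep car names as a column and recode cyl and am as factors
cars <- mtcars %>%
  tibble::rownames_to_column(var = "model") %>%
  mutate(cyl = factor(cyl),
         am = factor(am, levels = c(0, 1), labels = c("Automatic", "Manual")))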
  • Observations: 32
  • Variables: 11
  • Key fields: mpg (fuel efficiency, miles per gallon), hp (gross horsepower), wt (weight, in 1000 lbs), cyl (cylinders), am (transmission)
  • Data source: Built-in R dataset (datasets package)

ggplot (1): Bar chart — Count of Cars by Cylinders
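
The plotting code for this slide is not shown in the deck. A minimal sketch of how such a bar chart could be produced is given below, assuming ggplot2 is installed; the object name `p_bar`, the axis labels, and the base-R rebuild of `cars` are illustrative choices, not the author's code.

```r
library(ggplot2)

# Rebuild the prepared data in base R so this block runs on its own
# (equivalent in effect to the dplyr pipeline in the Data Description slide).
cars <- transform(
  mtcars,
  cyl = factor(cyl),
  am  = factor(am, levels = c(0, 1), labels = c("Automatic", "Manual"))
)
cars$model <- rownames(mtcars)

# geom_bar() counts the rows in each cylinder group
p_bar <- ggplot(cars, aes(x = cyl)) +
  geom_bar() +
  labs(
    title = "Count of Cars by Cylinders",
    x = "Number of cylinders",
    y = "Number of cars"
  )

print(p_bar)
```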

ggplot (2): Boxplot — MPG by Cylinders
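
The code behind this boxplot is likewise not shown. The sketch below is one possible version, assuming ggplot2 is installed; `p_box` and `mpg_by_cyl` are hypothetical names. The group means are added as a numeric companion to the first guiding question.

```r
library(ggplot2)

# Rebuild the prepared data in base R so this block runs on its own
cars <- transform(mtcars, cyl = factor(cyl))

# One box per cylinder group, showing the spread of mpg within each group
p_box <- ggplot(cars, aes(x = cyl, y = mpg)) +
  geom_boxplot() +
  labs(
    title = "MPG by Cylinders",
    x = "Number of cylinders",
    y = "Miles per gallon (mpg)"
  )

print(p_box)

# Mean mpg per cylinder group, ordered by factor level (4, 6, 8)
mpg_by_cyl <- aggregate(mpg ~ cyl, data = cars, FUN = mean)
print(mpg_by_cyl)
```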

plotly (1): 2D Scatter — MPG vs Horsepower (size = weight)
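
The interactive plot itself does not survive in this text version. A rough sketch of an equivalent figure follows, assuming the plotly package is installed; mapping colour to `cyl` and the hover text to `model` are assumptions, since the slide title only states that marker size encodes weight.

```r
library(plotly)

# Rebuild the prepared data in base R so this block runs on its own
cars <- transform(mtcars, cyl = factor(cyl))
cars$model <- rownames(mtcars)

# Marker size encodes weight; colour by cylinder count is an added assumption
p_scatter <- plot_ly(
  data = cars,
  x = ~hp, y = ~mpg,
  size = ~wt, color = ~cyl,
  text = ~model,
  type = "scatter", mode = "markers"
)

# plotly::layout is called explicitly to avoid confusion with graphics::layout
p_scatter <- plotly::layout(
  p_scatter,
  title = "MPG vs Horsepower (size = weight)",
  xaxis = list(title = "Horsepower (hp)"),
  yaxis = list(title = "Miles per gallon (mpg)")
)

p_scatter
```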

plotly (2): 3D Scatter — MPG, Horsepower, and Weight
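
As with the 2D plot, only the slide title remains here. The following is a minimal sketch of a 3D scatter over the same three variables, assuming plotly is installed; the axis assignment (hp on x, wt on y, mpg on z) and colouring by `cyl` are illustrative guesses.

```r
library(plotly)

# Rebuild the prepared data in base R so this block runs on its own
cars <- transform(mtcars, cyl = factor(cyl))

# Each car is one point in (hp, wt, mpg) space
p_3d <- plot_ly(
  data = cars,
  x = ~hp, y = ~wt, z = ~mpg,
  color = ~cyl,
  type = "scatter3d", mode = "markers"
)

# 3D axis titles are set inside the `scene` element of the layout
p_3d <- plotly::layout(
  p_3d,
  title = "MPG, Horsepower, and Weight",
  scene = list(
    xaxis = list(title = "Horsepower (hp)"),
    yaxis = list(title = "Weight (1000 lbs)"),
    zaxis = list(title = "Miles per gallon (mpg)")
  )
)

p_3d
```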

Statistical Analysis

A multiple linear regression of fuel efficiency (mpg) on weight (wt) and horsepower (hp) shows that both predictors are negatively associated with miles per gallon. Holding the other predictor fixed, cars that are heavier or have higher horsepower tend to be less fuel efficient.
The model explains approximately 82.7% of the variation in fuel economy (adjusted R-squared: 81.5%). Weight shows the stronger association: each additional 1000 lbs is associated with about 3.88 fewer miles per gallon, while a 10 hp increase is associated with about 0.32 fewer miles per gallon.
Both coefficients are statistically significant at the p < 0.01 level (wt: p ≈ 1.1e-06; hp: p ≈ 0.0015), which is consistent with heavier and more powerful cars consuming more fuel.

## 
## Call:
## lm(formula = mpg ~ wt + hp, data = cars)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.941 -1.600 -0.182  1.050  5.854 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 37.22727    1.59879  23.285  < 2e-16 ***
## wt          -3.87783    0.63273  -6.129 1.12e-06 ***
## hp          -0.03177    0.00903  -3.519  0.00145 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.593 on 29 degrees of freedom
## Multiple R-squared:  0.8268, Adjusted R-squared:  0.8148 
## F-statistic: 69.21 on 2 and 29 DF,  p-value: 9.109e-12
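
The summary above can be reproduced, and the per-unit effects quoted in the text worked out, with a few lines of base R. This is a sketch that fits on `mtcars` directly, which gives the same coefficients as `cars` because `wt`, `hp`, and `mpg` are unchanged by the recoding. The example car (3000 lbs, 150 hp) is hypothetical and only illustrates how the fitted equation is used.

```r
# Fit the same model as in the slide
fit <- lm(mpg ~ wt + hp, data = mtcars)
print(summary(fit))

coefs <- coef(fit)
r2 <- summary(fit)$r.squared

# wt is measured in units of 1000 lbs, so its coefficient is already
# the expected change in mpg per additional 1000 lbs
effect_1000lb <- unname(coefs["wt"])

# hp is measured in single horsepower, so scale by 10 for a 10 hp change
effect_10hp <- 10 * unname(coefs["hp"])

cat(sprintf("R-squared: %.3f\n", r2))
cat(sprintf("Change in mpg per +1000 lbs: %.2f\n", effect_1000lb))
cat(sprintf("Change in mpg per +10 hp:    %.2f\n", effect_10hp))

# Predicted mpg for a hypothetical car (wt = 3 means 3000 lbs)
pred <- unname(predict(fit, newdata = data.frame(wt = 3, hp = 150)))
cat(sprintf("Predicted mpg at 3000 lbs, 150 hp: %.1f\n", pred))
```

The prediction is simply intercept + wt coefficient × 3 + hp coefficient × 150, using the estimates from the coefficient table.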

Diagnostics: Observed vs Fitted MPG
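
The diagnostic plot is missing from this text version. A minimal sketch of an observed-versus-fitted check is given below, using base graphics for brevity (the original slide may well have used ggplot2); `diag_df` is a hypothetical name. Points close to the dashed 45-degree line are cars whose mpg the model predicts well.

```r
# Refit the model from the Statistical Analysis slide
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Collect observed values, fitted values, and residuals per car
diag_df <- data.frame(
  model    = rownames(mtcars),
  observed = mtcars$mpg,
  fitted   = unname(fitted(fit)),
  residual = unname(resid(fit))
)

# Observed vs fitted, with the line observed = fitted for reference
plot(
  diag_df$fitted, diag_df$observed,
  xlab = "Fitted mpg", ylab = "Observed mpg",
  main = "Observed vs Fitted MPG"
)
abline(a = 0, b = 1, lty = 2)

# Cars with the largest absolute residuals are the worst-predicted ones
print(head(diag_df[order(-abs(diag_df$residual)), ], 3))
```

For a least-squares fit with an intercept, the squared correlation between observed and fitted values equals the model's R-squared, so this plot is a visual counterpart to the 82.7% figure.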

Takeaways & Next Steps

  • MPG drops as weight and horsepower increase.
  • Cylinder count segments the fuel-efficiency distribution.
  • The simple model (mpg ~ wt + hp) is a reasonable first pass.