This presentation explores the relationship between horsepower (hp) and fuel efficiency (mpg) in the mtcars dataset. We visualize the data with ggplot2 and an interactive plotly chart, and fit a simple linear regression of mpg on hp.
# Load and display the first few rows of the dataset
data(mtcars)
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
library(ggplot2)
# Create a scatter plot
ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point(color = "blue") +
  labs(
    title = "Scatter Plot of mpg vs hp",
    x = "Horsepower (hp)",
    y = "Miles per Gallon (mpg)"
  )
# Fit the regression model
model <- lm(mpg ~ hp, data = mtcars)
# Scatter plot with regression line
## `geom_smooth()` using formula = 'y ~ x'
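The message above is what `geom_smooth()` prints when no formula is supplied. A minimal sketch of a scatter plot with a fitted line, assuming `method = "lm"`, a red line, and the plot title shown (none of which the source confirms), might look like this:

```r
library(ggplot2)

# Same model as in the text: mpg regressed on hp
model <- lm(mpg ~ hp, data = mtcars)

# Scatter plot with a least-squares line overlaid.
# method = "lm" and the red color are assumptions for illustration.
p <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point(color = "blue") +
  geom_smooth(method = "lm", color = "red") +
  labs(
    title = "mpg vs hp with Regression Line",  # hypothetical title
    x = "Horsepower (hp)",
    y = "Miles per Gallon (mpg)"
  )

p
```

By default `geom_smooth()` also draws a shaded confidence band around the line; passing `se = FALSE` would hide it.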
#slide 4
coefficients(model)
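For a single predictor, the two values returned by `coefficients(model)` can be cross-checked against the closed-form least-squares formulas. The sketch below is only a sanity check; the names `slope`, `intercept`, and `manual` are introduced here for illustration.

```r
# Refit the model so the example is self-contained
model <- lm(mpg ~ hp, data = mtcars)

# Closed-form simple linear regression estimates:
#   slope     = cov(x, y) / var(x)
#   intercept = mean(y) - slope * mean(x)
slope <- cov(mtcars$hp, mtcars$mpg) / var(mtcars$hp)
intercept <- mean(mtcars$mpg) - slope * mean(mtcars$hp)

# Named vector in the same layout as coefficients(model)
manual <- c("(Intercept)" = intercept, hp = slope)

print(manual)
print(coefficients(model))
```

Both printouts should agree up to floating-point error.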
#slide 5
library(plotly)
plot_ly(data = mtcars, x = ~hp, y = ~mpg, type = "scatter", mode = "markers",
        marker = list(color = "blue")) %>%
  add_lines(x = ~hp, y = fitted(model), line = list(color = "red")) %>%
  layout(title = "Interactive Plot of mpg vs hp",
         xaxis = list(title = "Horsepower (hp)"),
         yaxis = list(title = "Miles per Gallon (mpg)"))
We can represent the regression equation using LaTeX notation:
\[ \hat{y} = \beta_0 + \beta_1 \times \text{hp} \]
Where:
- \(\hat{y}\) is the predicted miles per gallon (mpg)
- \(\beta_0\) is the intercept
- \(\beta_1\) is the slope coefficient for horsepower (hp)
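To make the equation concrete, the sketch below evaluates \(\hat{y}\) for one car two ways: by plugging into \(\beta_0 + \beta_1 \times \text{hp}\) directly and by calling `predict()`. The value `hp = 150` is an arbitrary example input, not taken from the presentation.

```r
# Refit the model so the example is self-contained
model <- lm(mpg ~ hp, data = mtcars)

# Hypothetical horsepower value chosen for illustration
new_hp <- 150

# Manual evaluation of beta_0 + beta_1 * hp
b <- coefficients(model)
manual_pred <- b[["(Intercept)"]] + b[["hp"]] * new_hp

# Same prediction through predict(); newdata must use the column name hp
auto_pred <- unname(predict(model, newdata = data.frame(hp = new_hp)))

cat("Manual prediction:   ", manual_pred, "\n")
cat("predict() prediction:", auto_pred, "\n")
```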
Here is the actual regression equation computed using R:
## The actual regression equation is: mpg = 30.1 + -0.07 * hp
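One way such a line could be produced is by rounding the fitted coefficients and pasting them into a string. This is a sketch, not necessarily the code used for the slide; the variable names `b` and `eq` are assumptions.

```r
# Refit the model so the example is self-contained
model <- lm(mpg ~ hp, data = mtcars)

# Round intercept and slope to two decimal places
b <- round(coefficients(model), 2)

# Build the equation string from the rounded coefficients
eq <- paste0("mpg = ", b[["(Intercept)"]], " + ", b[["hp"]], " * hp")

cat("The actual regression equation is:", eq, "\n")
```

Because the slope is negative, the string reads `+ -0.07`; formatting the sign separately would give the tidier `mpg = 30.1 - 0.07 * hp`.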
# Predicted mpg values
y_pred <- predict(model)
# Mean of actual mpg values
y_mean <- mean(mtcars$mpg)
# Calculate Sum of Squares due to Regression (SSR)
SSR <- sum((y_pred - y_mean)^2)
# Display the SSR value
cat("Sum of squares due to regression (SSR):", round(SSR, 2))
## Sum of squares due to regression (SSR): 678.37
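SSR is easier to interpret alongside the other two sums of squares. For a least-squares fit with an intercept, the total sum of squares splits as \(\text{SST} = \text{SSR} + \text{SSE}\), and \(R^2 = \text{SSR} / \text{SST}\). The sketch below checks both identities on this model; `SSE`, `SST`, and `r_squared` are names introduced here.

```r
# Refit the model so the example is self-contained
model <- lm(mpg ~ hp, data = mtcars)

y <- mtcars$mpg
y_pred <- predict(model)
y_mean <- mean(y)

# Regression, error (residual), and total sums of squares
SSR <- sum((y_pred - y_mean)^2)
SSE <- sum((y - y_pred)^2)
SST <- sum((y - y_mean)^2)

# Share of the variance in mpg explained by hp
r_squared <- SSR / SST

cat("SSR + SSE:", round(SSR + SSE, 2), " SST:", round(SST, 2), "\n")
cat("R-squared from SSR / SST:", round(r_squared, 4), "\n")
cat("R-squared from summary():", round(summary(model)$r.squared, 4), "\n")
```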