This project analyzes how different car characteristics, specifically horsepower and weight, influence fuel efficiency (mpg). The dataset used includes variables such as mpg, horsepower, weight, and origin. These variables allow for both quantitative and categorical analysis.
The goal of this project is to explore the relationship between horsepower, weight, and fuel efficiency using linear regression and data visualization.
library(tidyverse)
Rows: 398 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): origin, name
dbl (7): mpg, cylinders, displacement, horsepower, weight, acceleration, mod...
head(df)
# A tibble: 6 × 9
    mpg cylinders displacement horsepower weight acceleration model_year origin
  <dbl>     <dbl>        <dbl>      <dbl>  <dbl>        <dbl>      <dbl> <chr>
1    18         8          307        130   3504         12           70 usa
2    15         8          350        165   3693         11.5         70 usa
3    18         8          318        150   3436         11           70 usa
4    16         8          304        150   3433         12           70 usa
5    17         8          302        140   3449         10.5         70 usa
6    15         8          429        198   4341         10           70 usa
# ℹ 1 more variable: name <chr>
# Remove rows with missing values for accurate analysis
df <- df %>%
  drop_na()
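Dropping incomplete rows reduces the data from 398 rows to the 392 used in the rest of the analysis. Before dropping, it can help to see which columns hold the missing values. The sketch below shows one way to check, using a small made-up data frame (`toy` is hypothetical and is not the car dataset) and base R functions, so it runs without any packages.

```r
# Toy data frame, made up for illustration only (not the car dataset)
toy <- data.frame(
  mpg        = c(18, 15, NA, 16),
  horsepower = c(130, NA, 150, 150),
  weight     = c(3504, 3693, 3436, 3433)
)

# Count missing values in each column before dropping anything
na_counts <- colSums(is.na(toy))
print(na_counts)

# Keep only complete rows; roughly the base-R counterpart of tidyr::drop_na(toy)
toy_complete <- toy[complete.cases(toy), ]

# Report how many rows were removed
rows_dropped <- nrow(toy) - nrow(toy_complete)
print(rows_dropped)
```

Running the same two checks on `df` before calling `drop_na()` would show where the six removed rows come from.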
summary(df)
      mpg          cylinders      displacement     horsepower        weight
 Min.   : 9.00   Min.   :3.000   Min.   : 68.0   Min.   : 46.0   Min.   :1613
 1st Qu.:17.00   1st Qu.:4.000   1st Qu.:105.0   1st Qu.: 75.0   1st Qu.:2225
 Median :22.75   Median :4.000   Median :151.0   Median : 93.5   Median :2804
 Mean   :23.45   Mean   :5.472   Mean   :194.4   Mean   :104.5   Mean   :2978
 3rd Qu.:29.00   3rd Qu.:8.000   3rd Qu.:275.8   3rd Qu.:126.0   3rd Qu.:3615
 Max.   :46.60   Max.   :8.000   Max.   :455.0   Max.   :230.0   Max.   :5140
  acceleration     model_year       origin              name
 Min.   : 8.00   Min.   :70.00   Length:392         Length:392
 1st Qu.:13.78   1st Qu.:73.00   Class :character   Class :character
 Median :15.50   Median :76.00   Mode  :character   Mode  :character
 Mean   :15.54   Mean   :75.98
 3rd Qu.:17.02   3rd Qu.:79.00
 Max.   :24.80   Max.   :82.00
A multiple linear regression model was used to examine the relationship between horsepower, weight, and fuel efficiency (mpg). Both horsepower (p = 2.49e-05) and weight (p < 2e-16) are statistically significant predictors of mpg at the 0.05 level.
Horsepower has a negative coefficient (-0.047): holding weight constant, each additional unit of horsepower is associated with a decrease of about 0.047 mpg. Weight also has a negative coefficient (-0.0058): holding horsepower constant, each additional unit of weight is associated with a decrease of about 0.0058 mpg, so heavier cars tend to have lower mpg.
The adjusted R-squared value is approximately 0.7049, which means that about 70% of the variation in mpg can be explained by horsepower and weight. This suggests that the model fits the data well, although roughly 30% of the variation remains unexplained by these two predictors.
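As a check on the adjusted R-squared, the short sketch below recomputes it from the multiple R-squared (0.7064) and the sample size reported in the model output. It uses the standard adjustment formula; the variable names are chosen here for illustration, and the inputs are the rounded values from the output, so the result matches only to about four decimal places.

```r
# Values taken from the model summary output
r_squared <- 0.7064   # multiple R-squared (rounded as printed)
n         <- 392      # observations remaining after drop_na()
p         <- 2        # predictors: horsepower and weight

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)
print(round(adj_r_squared, 4))
```

With only two predictors and 392 observations the penalty is small, which is why the adjusted value sits so close to the unadjusted one.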
The regression equation is:
mpg = 45.64 − 0.047(horsepower) − 0.0058(weight)
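To see how the equation is used, the sketch below plugs in the first car from the `head(df)` output (130 horsepower, weight 3504, observed 18 mpg). The coefficients are copied from the model summary, and `predict_mpg` is a hypothetical helper written for this illustration; with the fitted model object available, `predict(model, newdata = ...)` would serve the same purpose.

```r
# Coefficients as reported in the model summary
b0   <- 45.6402108   # intercept
b_hp <- -0.0473029   # horsepower
b_wt <- -0.0057942   # weight

# Hypothetical helper: predicted mpg from the fitted equation
predict_mpg <- function(horsepower, weight) {
  b0 + b_hp * horsepower + b_wt * weight
}

# First row of the dataset: 130 horsepower, weight 3504, observed 18 mpg
fitted_mpg <- predict_mpg(130, 3504)
residual   <- 18 - fitted_mpg
print(round(c(fitted = fitted_mpg, residual = residual), 2))
```

The fitted value is about 19.19 mpg, so the model overestimates this car's fuel efficiency by roughly 1.19 mpg.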
# Fit a multiple linear regression of mpg on horsepower and weight
model <- lm(mpg ~ horsepower + weight, data = df)
summary(model)
Call:
lm(formula = mpg ~ horsepower + weight, data = df)

Residuals:
     Min       1Q   Median       3Q      Max
-11.0762  -2.7340  -0.3312   2.1752  16.2601

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) 45.6402108  0.7931958  57.540  < 2e-16 ***
horsepower  -0.0473029  0.0110851  -4.267 2.49e-05 ***
weight      -0.0057942  0.0005023 -11.535  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.24 on 389 degrees of freedom
Multiple R-squared:  0.7064, Adjusted R-squared:  0.7049
F-statistic: 467.9 on 2 and 389 DF,  p-value: < 2.2e-16
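The coefficient table can also be read in terms of t values and confidence intervals. The sketch below reproduces the t values and computes approximate 95% confidence intervals from the printed estimates and standard errors, assuming the usual t-based interval with 389 residual degrees of freedom. The variable names are illustrative; with the fitted object available, `confint(model)` is the direct route.

```r
# Estimates and standard errors copied from the coefficient table
est <- c(horsepower = -0.0473029, weight = -0.0057942)
se  <- c(horsepower =  0.0110851, weight =  0.0005023)
df_resid <- 389  # residual degrees of freedom

# t value = estimate / standard error
t_values <- est / se

# 95% confidence interval: estimate +/- critical t * standard error
t_crit <- qt(0.975, df_resid)
ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)

print(round(t_values, 3))
print(signif(ci, 3))
```

Both intervals lie entirely below zero, which is consistent with the small p-values for horsepower and weight.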
# Show the standard diagnostic plots in a 2 x 2 grid
par(mfrow = c(2, 2))
plot(model)
ggplot(df, aes(x = weight, y = mpg, color = origin)) +
  geom_point(size = 2, alpha = 0.8) +
  geom_smooth(method = "lm", se = FALSE) +
  labs(
    title = "Relationship Between Car Weight and Fuel Efficiency",
    subtitle = "Heavier cars tend to have lower miles per gallon",
    x = "Weight",
    y = "Miles per Gallon (MPG)",
    color = "Car Origin",
    caption = "Source: Car MPG Dataset"
  ) +
  scale_color_manual(values = c(
    "usa" = "#E41A1C",
    "europe" = "#377EB8",
    "japan" = "#4DAF4A"
  )) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 14, face = "bold"),
    plot.subtitle = element_text(size = 12),
    legend.position = "right"
  )