library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.0.6     ✓ dplyr   1.0.4
## ✓ tidyr   1.1.2     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
  1. Publish your project on RPubs and send me the link.

Here is the link to this project on RPubs: https://rpubs.com/ProfCelia/mpgProject

Abstract

I analyzed the mpg dataset available in the tidyverse, which contains data on 38 car models from 1999 to 2008.

Introduction

The data was sourced from the U.S. government (EPA and Department of Energy). The dataset contains information on 38 car models from 1999 to 2008. Although the dataset is outdated, the source website hosts data from 1984 to 2021 (https://fueleconomy.gov/feg/powerSearch.jsp?action=NoRecs).

Description of the Data

The variables in this dataset are: manufacturer (car maker), model (car model), displ (engine displacement in liters), year (year of manufacture), cyl (number of cylinders), trans (transmission type), drv (drive train), cty (city miles per gallon), hwy (highway miles per gallon), fl (fuel type), and class (car type).
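The description above can be checked against the data itself. The following is a minimal sketch, assuming the dplyr and ggplot2 packages are installed (mpg ships with ggplot2); the object name mpg_overview and its column names n_rows, n_models, and years are labels chosen here for illustration and are not part of the dataset.

```r
library(dplyr)
library(ggplot2)  # provides the mpg dataset

# Column names and types, one line per variable
glimpse(mpg)

# Number of rows, distinct car models, and model years present
# (mpg_overview and its column names are illustrative labels)
mpg_overview <- mpg %>%
  summarise(
    n_rows   = n(),
    n_models = n_distinct(model),
    years    = paste(sort(unique(year)), collapse = ", ")
  )
mpg_overview
```

The years column is worth a look: it lists which model years actually appear in the data, which may be fewer than the 1999 to 2008 range suggests.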

?mpg
mpg
## # A tibble: 234 x 11
##    manufacturer model    displ  year   cyl trans   drv     cty   hwy fl    class
##    <chr>        <chr>    <dbl> <int> <int> <chr>   <chr> <int> <int> <chr> <chr>
##  1 audi         a4         1.8  1999     4 auto(l… f        18    29 p     comp…
##  2 audi         a4         1.8  1999     4 manual… f        21    29 p     comp…
##  3 audi         a4         2    2008     4 manual… f        20    31 p     comp…
##  4 audi         a4         2    2008     4 auto(a… f        21    30 p     comp…
##  5 audi         a4         2.8  1999     6 auto(l… f        16    26 p     comp…
##  6 audi         a4         2.8  1999     6 manual… f        18    26 p     comp…
##  7 audi         a4         3.1  2008     6 auto(a… f        18    27 p     comp…
##  8 audi         a4 quat…   1.8  1999     4 manual… 4        18    26 p     comp…
##  9 audi         a4 quat…   1.8  1999     4 auto(l… 4        16    25 p     comp…
## 10 audi         a4 quat…   2    2008     4 manual… 4        20    28 p     comp…
## # … with 224 more rows
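# Scatter plot of engine displacement against highway mileage
ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy))

# Subset of 2-seater cars, used to highlight the outliers
mpg2 <- filter(mpg, class == "2seater")

# Same scatter plot with the 2-seaters overlaid in red
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point() +
  geom_point(data = mpg2, color = "red", size = 2)

# Highway mileage against number of cylinders
ggplot(data = mpg) + 
  geom_point(mapping = aes(x = hwy, y = cyl))

# Highway mileage against transmission type
ggplot(data = mpg) + 
  geom_point(mapping = aes(x = hwy, y = trans))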

Findings

Finding 1. I created a scatter plot of engine displacement (displ) against highway mileage (hwy), and it shows a strong negative relationship: cars with larger engines tend to get fewer highway miles per gallon. However, there are outliers, the 2-seater cars. These have a high engine displacement but higher highway mileage than expected for engines of that size. One possible explanation is that these 2-seaters are sports models, so the MPG they achieve may be higher than that of other high-displacement cars.
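The visual impression in Finding 1 can be backed with numbers. The sketch below is one possible check, not part of the original analysis: it computes the Pearson correlation between displ and hwy, then compares 2-seaters with other large-engine cars. The 5-liter cutoff and the names displ_hwy_cor, two_seater, and large_engine_summary are assumptions made here for illustration.

```r
library(dplyr)
library(ggplot2)  # provides the mpg dataset

# Pearson correlation between engine displacement and highway mpg;
# a negative value means larger engines go with lower mileage
displ_hwy_cor <- cor(mpg$displ, mpg$hwy)
displ_hwy_cor

# Compare 2-seaters with other cars of similarly large displacement
# (the 5-liter cutoff is an illustrative assumption)
large_engine_summary <- mpg %>%
  filter(displ >= 5) %>%
  mutate(two_seater = class == "2seater") %>%
  group_by(two_seater) %>%
  summarise(n = n(), mean_hwy = mean(hwy), .groups = "drop")
large_engine_summary
```

If the 2-seaters really are outliers, their mean_hwy should sit above that of the other large-engine cars in the second table.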

Finding 2. I also examined the relationship between the number of cylinders and highway miles per gallon. The plot indicates a negative relationship: the fewer cylinders a car has, the higher its highway mileage tends to be, and the more cylinders it has, the lower its highway mileage.
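Because cyl takes only a few distinct values, a grouped summary and a box plot may show this pattern more clearly than a scatter plot. The following is a sketch under that assumption; the names hwy_by_cyl and cyl_plot are chosen here for illustration.

```r
library(dplyr)
library(ggplot2)  # provides the mpg dataset

# Mean and median highway mpg for each cylinder count
hwy_by_cyl <- mpg %>%
  group_by(cyl) %>%
  summarise(
    n          = n(),
    mean_hwy   = mean(hwy),
    median_hwy = median(hwy),
    .groups    = "drop"
  )
hwy_by_cyl

# Box plot with cylinders treated as a category rather than a number
cyl_plot <- ggplot(mpg, aes(x = factor(cyl), y = hwy)) +
  geom_boxplot() +
  labs(x = "Number of cylinders", y = "Highway miles per gallon")
cyl_plot
```

Treating cyl as a factor avoids implying that values between the observed cylinder counts exist.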

Finding 3. I also analyzed the relationship between transmission type and highway miles per gallon and found no clear relationship. The data shows that both manual and automatic transmissions can have high and low highway miles per gallon.
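The trans column mixes the transmission type with a code in parentheses, such as auto(l5) or manual(m5), which makes the scatter plot hard to read. One way to check Finding 3 is to collapse trans into its type and summarise hwy per group. This is a sketch only; the column name trans_type and the object name hwy_by_trans are assumptions introduced here.

```r
library(dplyr)
library(ggplot2)  # provides the mpg dataset

# Drop the parenthesised code, e.g. "auto(l5)" becomes "auto"
hwy_by_trans <- mpg %>%
  mutate(trans_type = sub("\\(.*$", "", trans)) %>%
  group_by(trans_type) %>%
  summarise(
    n        = n(),
    mean_hwy = mean(hwy),
    min_hwy  = min(hwy),
    max_hwy  = max(hwy),
    .groups  = "drop"
  )
hwy_by_trans
```

Comparing min_hwy and max_hwy across the two rows shows how much the ranges overlap, while mean_hwy indicates whether either type tends to do better on average.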

Bibliography

U.S. Government Source for Fuel Economy Information. https://fueleconomy.gov/feg/powerSearch.jsp?action=NoRecs