library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.0.6 ✓ dplyr 1.0.4
## ✓ tidyr 1.1.2 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
view(mpg)
This is a data analysis of the mpg dataset. The main goal is to explore the relationship between the collected variables and fuel efficiency. The analysis starts with an introduction to the dataset and a description of its variables. Further analysis is supported by plots and other references.
This dataset contains fuel economy data from 1999 to 2008 for 38 popular models of cars. The source data is available at https://fueleconomy.gov/. Some of the data has since been updated, and the updated figures are available on the website.
There are 11 variables in this dataset. Categorical variables are manufacturer, model, year, number of cylinders, type of transmission, the type of drive train, fuel type, and class of car. Continuous variables are engine displacement in liters, city miles per gallon, and highway miles per gallon.
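As a quick check of the variable list above, the sketch below prints the structure of `mpg` and counts the distinct values of the variables treated here as categorical. The names `categorical_vars` and `distinct_counts` are introduced only for this illustration.

```r
library(tidyverse)

# Column names, types, and first few values of each variable
glimpse(mpg)

# Variables treated as categorical in this analysis (hypothetical helper name)
categorical_vars <- c("manufacturer", "model", "year", "cyl",
                      "trans", "drv", "fl", "class")

# Number of distinct values taken by each categorical variable
distinct_counts <- mpg %>%
  summarise(across(all_of(categorical_vars), n_distinct))
distinct_counts
```

Variables such as `year` and `cyl` are stored as integers, so treating them as categorical is a modelling choice rather than something the column types enforce.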
Engine displacement is the combined swept volume of the pistons inside the cylinders of an engine. It’s one of the determining factors of horsepower and of how much fuel a vehicle can consume. Generally speaking, the higher an engine’s displacement, the more fuel the engine can consume.
The cylinder is the power unit of an engine. Fuel is burned in the cylinder and converted to energy that powers the vehicle. The most common cylinder counts in a car are 4, 6, and 8. A vehicle with more cylinders generally has more power, better performance, and the ability to carry heavier loads. On the other hand, more cylinders also mean more fuel consumption.
There are two types of transmission: automatic and manual. Each has pros and cons. In terms of fuel efficiency, manual transmissions have historically been more efficient; however, the gap is closing as technology improves.
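The dataset itself can give a rough view of this comparison. The sketch below assumes that every `trans` value starts with either "auto" or "manual" followed by a detail in parentheses, and collapses it into a two-level `trans_type` column (a name introduced here for illustration). It does not control for engine size or vehicle class, so the result is only indicative.

```r
library(tidyverse)

# trans values look like "auto(l5)" or "manual(m6)";
# keep only the letters before the opening parenthesis
trans_summary <- mpg %>%
  mutate(trans_type = str_extract(trans, "^[a-z]+")) %>%
  group_by(trans_type) %>%
  summarise(n = n(), mean_cty = mean(cty), mean_hwy = mean(hwy))
trans_summary
```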
Four-wheel drive vehicles provide better traction than two-wheel drive vehicles. Most passenger vehicles are front-wheel drive. Some sports and performance vehicles are rear-wheel drive.
City mpg is usually lower than highway mpg because of frequent stopping and starting in city driving. Uninterrupted driving tends to burn less fuel, which is why highway mpg is higher.
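A minimal sketch of how this difference could be checked in the data: it compares the average city and highway mpg and reports the share of vehicles whose highway figure is the higher of the two. The name `mpg_gap` is introduced only for this example.

```r
library(tidyverse)

mpg_gap <- mpg %>%
  summarise(
    mean_cty = mean(cty),
    mean_hwy = mean(hwy),
    # proportion of rows where highway mpg exceeds city mpg
    share_hwy_higher = mean(hwy > cty)
  )
mpg_gap
```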
# Points to highlight: very high highway mpg
highlight1_df <- mpg %>%
  filter(hwy > 40)
# Points to highlight: large engines with relatively high highway mpg
highlight2_df <- mpg %>%
  filter(displ > 5 & hwy > 20)
# All points in black, with the two highlighted groups drawn on top in red
mpg %>%
  ggplot(aes(x = displ, y = hwy)) +
  geom_point() +
  geom_point(data = highlight1_df, aes(x = displ, y = hwy), color = 'red') +
  geom_point(data = highlight2_df, aes(x = displ, y = hwy), color = 'red') +
  xlab("Engine displacement, in litres") +
  ylab("Highway miles per gallon") +
  ggtitle("Scatterplot of Engine Displacement vs. Highway MPG")
We can see from the plot that engine displacement and highway miles per gallon are negatively correlated: as displacement increases, highway mpg tends to fall. There are some outliers at the upper left and lower right of the plot, highlighted in red.
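The direction of the relationship can also be checked numerically. The sketch below computes the Pearson correlation between the two variables; this is a supplementary check, not part of the original plot code, and `displ_hwy_cor` is a name introduced here.

```r
library(tidyverse)

# Pearson correlation between engine displacement and highway mpg;
# a negative value means mpg tends to fall as displacement rises
displ_hwy_cor <- cor(mpg$displ, mpg$hwy)
displ_hwy_cor
```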
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class)) +
  xlab("Engine displacement, in litres") +
  ylab("Highway miles per gallon") +
  ggtitle("Scatterplot of Engine Displacement vs. Highway MPG")
From the plot above, the outliers belong mainly to the subcompact and 2seater classes. Subcompact and 2seater vehicles have some similarities that allow them to be more energy-efficient. They are generally smaller in size and lighter in weight, and the subcompacts also have less powerful engines. These features help them consume less fuel.
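To see which classes the highlighted points belong to without reading colors off the plot, the sketch below reuses the two highlight filters from the first plot and counts the rows by class. The name `outlier_classes` is introduced only for this example.

```r
library(tidyverse)

# Same two conditions used for the red highlights in the first plot
outlier_classes <- mpg %>%
  filter(hwy > 40 | (displ > 5 & hwy > 20)) %>%
  count(class, sort = TRUE)
outlier_classes
```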
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = cty, color = drv, shape = drv)) +
  xlab("Engine displacement, in litres") +
  ylab("City miles per gallon") +
  ggtitle("Scatterplot of Engine Displacement vs. City MPG")
The plot above shows the relationship between city miles per gallon and engine displacement. As in plots 1 and 2, the correlation is negative: the higher an engine’s displacement, the more fuel it consumes and the lower its miles per gallon.
The front-wheel drive category stretches from 11 to 35 mpg. However, its values are generally higher than most of the rear-wheel and four-wheel drive vehicles. The rear-wheel drive category is not as common as the other two, and its mpg is between 10 and 17. All four-wheel drive vehicles are below 20 mpg.
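The ranges read off the plot can be cross-checked with a grouped summary. This is a minimal sketch; `drv_cty` and its column names are introduced here for illustration.

```r
library(tidyverse)

# City mpg range and vehicle count for each drive train type
# (f = front-wheel, r = rear-wheel, 4 = four-wheel drive)
drv_cty <- mpg %>%
  group_by(drv) %>%
  summarise(
    n = n(),
    min_cty = min(cty),
    median_cty = median(cty),
    max_cty = max(cty)
  )
drv_cty
```

The `n` column also shows how many vehicles fall in each category, which is what the popularity comparison rests on.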
mpg %>%
  mutate(year = factor(year, levels = c("1999", "2008"), ordered = TRUE)) %>%
  ggplot() +
  geom_point(mapping = aes(x = displ, y = hwy, color = year)) +
  xlab("Engine displacement, in litres") +
  ylab("Highway miles per gallon") +
  ggtitle("Scatterplot of Engine Displacement vs. Highway MPG")
The plot suggests that, for a given engine displacement, vehicles made in 2008 tend to have higher mpg. This is consistent with vehicles becoming more energy-efficient as technology develops, although a scatterplot alone cannot establish it.
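Because the two years overlap heavily in the scatterplot, a numeric comparison may help. The sketch below summarises displacement and highway mpg by year and then fits a simple linear model with a year indicator. The model is an addition for illustration, not part of the original analysis, and `year_summary` and `year_fit` are names introduced here.

```r
library(tidyverse)

# Average displacement and highway mpg for each model year
year_summary <- mpg %>%
  group_by(year) %>%
  summarise(n = n(), mean_displ = mean(displ), mean_hwy = mean(hwy))
year_summary

# Highway mpg modelled on displacement plus a year indicator; the year
# coefficient estimates the 2008 vs. 1999 difference at equal displacement
year_fit <- lm(hwy ~ displ + factor(year), data = mpg)
coef(year_fit)
```

Comparing the raw yearly means with the year coefficient shows whether any difference between the years depends on holding displacement fixed.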
mpg %>%
  # Ordered factor so the legend lists cylinder counts in increasing order
  mutate(cyl = factor(cyl, levels = c("4", "5", "6", "8"), ordered = TRUE)) %>%
  ggplot() +
  geom_point(mapping = aes(x = cty, y = hwy, color = cyl)) +
  xlab("City miles per gallon") +
  ylab("Highway miles per gallon") +
  ggtitle("Scatterplot of City MPG vs. Highway MPG")
This scatterplot shows city and highway mpg by number of cylinders. The number of cylinders clearly affects fuel consumption: vehicles with more cylinders consume more fuel.
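The same pattern can be summarised as group averages. This is a minimal sketch, with `cyl_summary` introduced only for this example.

```r
library(tidyverse)

# Average city and highway mpg for each cylinder count
cyl_summary <- mpg %>%
  group_by(cyl) %>%
  summarise(n = n(), mean_cty = mean(cty), mean_hwy = mean(hwy)) %>%
  arrange(cyl)
cyl_summary
```

The `n` column is worth reading alongside the means, since a cylinder count with few vehicles gives a less reliable average.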
In conclusion, engine displacement, number of cylinders, and type of drive train all have an impact on fuel consumption. The higher an engine’s displacement, the lower the mpg, meaning more fuel consumption. The more cylinders a vehicle has, the more powerful it is and the more fuel it consumes. The front-wheel drive category is the most energy-efficient and the most popular.
When selecting a vehicle to purchase, this data analysis could be a good reference. However, there are many other factors to consider before making the purchase. Always assess your own needs and consider your own situation.
What Is Engine Displacement? https://www.yourmechanic.com/article/what-is-engine-displacement
What Do Different Cylinder Numbers Mean in Regards to Engine Performance or Reliability? https://www.autoblog.com/2015/12/02/what-do-different-cylinder-numbers-mean-in-regards-to-engine-per/
Transmission Guide: Automatic vs Manual https://www.drivparts.com/parts-matter/learning-center/driver-education-and-vehicle-safety/manual-vs-automatic-car.html
Know the difference: two-wheel drive vs. four-wheel drive vs. all-wheel drive https://www.economical.com/en/blog/economical-blog/february-2017/two-wheel-four-wheel-all-wheel-drive-differences
2WD vs AWD vs 4WD https://www.consumerreports.org/cro/2012/12/2wd-awd-or-4wd-how-much-traction-do-you-need/index.htm