mpg - question list:
What is the most popular fuel type in this data set?
Regarding the fuel type variable, the value “d” represents diesel, “p” represents premium (petrol) and “r” represents regular (petrol). Do you think there is an effect of fuel type on fuel economy?
Do you think there is a difference in fuel economy between vehicles made in 1999 and 2008? (When plotting with the “year” variable, use “as.factor(year)” to convert it to a categorical variable. This will be explained in future classes.)
What happens if you make a scatter plot of “class” vs “drv”? Do you think this plot is useful or not?
library(tidyverse)
# Bar chart of the number of vehicles per fuel type
ggplot(data = mpg) +
  geom_bar(mapping = aes(x = fl, fill = fl)) +
  xlab("Fuel Type") +
  ggtitle("Distribution of Fuel Types") +
  theme(plot.title = element_text(hjust = 0.5))
Answer: As the bar chart shows, “r” (regular petrol) is the most common fuel type in this data set.
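The question list also asks whether fuel type affects fuel economy, but no plot is given for it here. One way to explore it, offered as a minimal sketch rather than the intended solution, is to compare highway mileage across fuel types with box plots and a small summary table. The name fuel_summary and the choice of hwy (rather than cty) are assumptions made for illustration.

```r
library(tidyverse)

# Number of vehicles and median mileage for each fuel type
# (fuel_summary is an illustrative name, not from the original exercise)
fuel_summary <- mpg %>%
  group_by(fl) %>%
  summarise(n = n(), median_cty = median(cty), median_hwy = median(hwy))
print(fuel_summary)

# Box plots of highway mileage by fuel type
ggplot(data = mpg, mapping = aes(x = fl, y = hwy)) +
  geom_boxplot() +
  labs(x = "Fuel Type", y = "Miles Per Gallon on Highway",
       title = "Fuel Economy (Highway) by Fuel Type") +
  theme(plot.title = element_text(hjust = 0.5))
```

When reading the result, check the n column first: a fuel type with only a handful of vehicles gives an unreliable box. Fuel type is also tied to vehicle class and engine size, so any visible difference is not necessarily caused by the fuel itself.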
# Box plots of city mileage by model year; stat_boxplot() adds caps to the whiskers
ggplot(data = mpg, mapping = aes(x = as.factor(year), y = cty)) +
  stat_boxplot(geom = "errorbar", width = 0.5) +
  geom_boxplot() +
  labs(x = "Year", y = "Miles Per Gallon in City",
       title = "Fuel Economy (City) Between 1999 and 2008") +
  theme(plot.title = element_text(hjust = 0.5))
Answer: The box plots for 1999 and 2008 largely overlap, so the figure shows no clear difference in city fuel economy between vehicles made in the two years. This is a visual judgement based on our data set, not the result of a statistical test.
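The visual comparison can be backed up with a few summary numbers per year. The following is a small sketch, not part of the original solution; year_summary is an illustrative name.

```r
library(tidyverse)

# Sample size, median, mean and standard deviation of city mileage per year
# (year_summary is an illustrative name, not from the original exercise)
year_summary <- mpg %>%
  group_by(year) %>%
  summarise(n = n(),
            median_cty = median(cty),
            mean_cty = mean(cty),
            sd_cty = sd(cty))
print(year_summary)
```

If the two medians and means are close relative to the standard deviations, that agrees with the overlapping boxes in the figure.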
# Scatter plot of two categorical variables: vehicle class vs drive train type
ggplot(data = mpg) +
  geom_point(mapping = aes(x = class, y = drv)) +
  labs(x = "vehicle class", y = "drive train type",
       title = "Vehicle Class vs Drive Train Type") +
  theme(plot.title = element_text(hjust = 0.5))
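Answer: This plot is of limited use. Both variables are categorical, so many vehicles are drawn on top of each other at the same point and the plot cannot show how many vehicles fall into each combination. It is still useful in the sense that it shows which combinations of class and drv exist and which do not. For example, all 2-seater cars are rear-wheel drive.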
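To see how many vehicles sit behind each point, the combinations can be counted directly, or the points can be sized by count with geom_count() from ggplot2. This is a sketch of one possible follow-up, not part of the original exercise; combo_counts is an illustrative name.

```r
library(tidyverse)

# Number of vehicles for each class / drive train combination that occurs in the data
# (combo_counts is an illustrative name, not from the original exercise)
combo_counts <- count(mpg, class, drv)
print(combo_counts)

# Same axes as the scatter plot, but point size reflects the number of vehicles
ggplot(data = mpg) +
  geom_count(mapping = aes(x = class, y = drv)) +
  labs(x = "vehicle class", y = "drive train type",
       title = "Vehicle Class vs Drive Train Type (Sized by Count)") +
  theme(plot.title = element_text(hjust = 0.5))
```

Combinations that never occur are simply absent from combo_counts, which matches the empty positions in the scatter plot.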