Regarding the mpg data set:

2. Regarding the fuel type variable, the value “d” represents diesel, “p” represents premium (petrol), and “r” represents regular (petrol). Do you think fuel type affects how many miles a vehicle can travel, on average, per gallon of fuel?

# ggplot2 provides both the plotting functions and the mpg data set
library(ggplot2)

ggplot(data = mpg) +
  geom_boxplot(mapping = aes(x = fl, y = hwy)) +
  xlab("Fuel Type") + ylab("Highway Miles Per Gallon") +
  ggtitle("Fuel Economy (Highway) vs Fuel Type") +
  theme(plot.title = element_text(hjust = 0.5))

ggplot(data = mpg) +
  geom_boxplot(mapping = aes(x = fl, y = cty)) +
  xlab("Fuel Type") + ylab("City Miles Per Gallon") +
  ggtitle("Fuel Economy (City) vs Fuel Type") +
  theme(plot.title = element_text(hjust = 0.5))

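Answer: The box plots show that the distributions of hwy and cty differ noticeably across fuel types, so fuel type does appear to be related to fuel economy. Among the three fuel types described above, diesel has the best fuel economy in both plots, although the data set contains only a few diesel vehicles.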
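The visual impression from the box plots can be checked with a short numeric summary. The following is a minimal sketch using base R functions on the same mpg data; the object names fl_counts and fl_medians are illustrative and not part of the original exercise.

```r
library(ggplot2)  # provides the mpg data set

# Number of vehicles per fuel type; very small groups make a box plot less reliable.
fl_counts <- table(mpg$fl)
print(fl_counts)

# Median highway and city mileage for each fuel type.
fl_medians <- aggregate(cbind(hwy, cty) ~ fl, data = mpg, FUN = median)
print(fl_medians)
```

Reading the counts next to the medians helps judge how much weight each box in the plots deserves.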

3. Do you think there is a difference in fuel economy for vehicles made in 1999 and 2008? (When plotting with the “year” variable, use as.factor(year) to convert it to a categorical variable. This will be explained in future classes.)

ggplot(data = mpg, mapping = aes(x = as.factor(year), y = cty)) +
  stat_boxplot(geom = "errorbar", width = 0.5) +
  geom_boxplot() +
  labs(x = "Year", y = "City Miles Per Gallon", title = "Fuel Economy (City) Between 1999 and 2008") +
  theme(plot.title = element_text(hjust = 0.5))

Answer: The box plots for 1999 and 2008 overlap almost entirely and have similar medians, so the graph shows no clear difference in city fuel economy between vehicles made in the two years.
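The plot above covers only cty, while the question asks about fuel economy in general. As a rough check, the sketch below summarises both cty and hwy by year with base R; the names year_medians and year_means are illustrative assumptions, not part of the original solution.

```r
library(ggplot2)  # provides the mpg data set

# Median city and highway mileage for each model year.
year_medians <- aggregate(cbind(cty, hwy) ~ year, data = mpg, FUN = median)
print(year_medians)

# Mean city and highway mileage for each model year.
year_means <- aggregate(cbind(cty, hwy) ~ year, data = mpg, FUN = mean)
print(year_means)
```

If the two rows of each table are close, that supports the reading of the box plot; a formal comparison would need a statistical test, which is outside the scope of this exercise.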

4. What happens if you make a scatter plot of class vs drv? Do you think this plot is useful or not?

ggplot(data = mpg) +
  geom_point(mapping = aes(x = class, y = drv)) +
  labs(x = "Vehicle Class", y = "Drive Train Type", title = "Vehicle Class vs Drive Train Type") +
  theme(plot.title = element_text(hjust = 0.5))

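Answer: This plot is useful in a limited way: it shows which combinations of class and drv exist in the data and which do not. For example, all pickups are four-wheel drive. Because both variables are categorical, however, vehicles with the same combination are drawn on top of each other, so the plot cannot show how many vehicles fall into each combination.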
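One possible way to see how many vehicles share each combination is sketched below. It uses a base R cross-tabulation and ggplot2's geom_count(), which sizes each point by the number of observations at that position. The names class_drv_counts and count_plot are illustrative, and this is only one of several alternatives to the plain scatter plot.

```r
library(ggplot2)  # provides geom_count() and the mpg data set

# Cross-tabulate class against drive train; a zero marks a combination that never occurs.
class_drv_counts <- table(mpg$class, mpg$drv)
print(class_drv_counts)

# Same axes as the scatter plot, but point size reflects the number of vehicles.
count_plot <- ggplot(data = mpg) +
  geom_count(mapping = aes(x = class, y = drv)) +
  labs(x = "Vehicle Class", y = "Drive Train Type", title = "Vehicle Class vs Drive Train Type") +
  theme(plot.title = element_text(hjust = 0.5))
print(count_plot)
```

The table gives the exact counts, while the sized points keep the layout of the original plot and make the dense combinations stand out.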