Module 2 Lesson 1 Application

Author

Jamal Rogers

Published

May 16, 2023

Let’s load the tidyverse to begin.

library(tidyverse)

We will use mpg, a data frame of fuel economy data that ships with ggplot2. Among the variables in mpg are:

  1. displ: A car’s engine size, in liters. A numerical variable.

  2. hwy: A car’s fuel efficiency on the highway, in miles per gallon (mpg). A car with a low fuel efficiency consumes more fuel than a car with a high fuel efficiency when they travel the same distance. A numerical variable.

  3. class: Type of car. A categorical variable.
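Before plotting, it can help to look at these three variables directly. The following is a minimal sketch, assuming the tidyverse is installed; the name vars is introduced here only for illustration.

```r
library(tidyverse)

# Keep only the three variables described above.
# `vars` is a hypothetical name used for this example.
vars <- select(mpg, displ, hwy, class)

# Show each column's type and its first few values.
glimpse(vars)

# List the distinct car classes in the data.
distinct(vars, class)
```

glimpse() should report displ and hwy as numeric columns and class as a character column, which matches the numerical and categorical descriptions in the list.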

Let’s start by visualizing the relationship between displ and hwy for various classes of cars. We can do this with a scatterplot where the numerical variables are mapped to the x and y aesthetics and the categorical variable is mapped to an aesthetic like color or shape.

ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point()
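The same plot can map class to shape instead of color. This is a sketch rather than a recommendation: ggplot2's default shape palette covers only six values, and mpg has seven classes, so the example supplies seven shape codes by hand with scale_shape_manual(). The specific codes 0 to 6 and the name p_shape are arbitrary choices made for this example.

```r
library(tidyverse)

# Map class to shape rather than color.
# `p_shape` is a hypothetical name used for this example.
p_shape <- ggplot(mpg, aes(x = displ, y = hwy, shape = class)) +
  geom_point() +
  # Seven classes need seven shapes; the codes 0:6 are an arbitrary pick.
  scale_shape_manual(values = 0:6)

p_shape
```

With this many categories, color is usually easier to read than shape, which is one reason the plot above uses color.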

We can also use facet_wrap() to split the plot into one panel per class.

ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  facet_wrap(~class)

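Across the 7 classes of cars, the panels show a negative association between displ and hwy: the bigger the engine, the lower the car’s fuel efficiency on the highway tends to be. This lets us answer the question of whether cars with big engines use more fuel than cars with small engines. In this data set, they generally do. Note that this is a pattern read from the plots, not proof that engine size alone causes the difference.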
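One way to check this visual reading is to compute the correlation between displ and hwy within each class. The following is a minimal sketch, assuming the tidyverse is installed; class_cor is a name introduced here for illustration, and Pearson correlation (the default of cor()) is only one possible summary.

```r
library(tidyverse)

# Correlation between engine size and highway mpg, computed per class.
# `class_cor` is a hypothetical name used for this example.
class_cor <- mpg %>%
  group_by(class) %>%
  summarise(
    n = n(),              # number of cars in the class
    r = cor(displ, hwy)   # Pearson correlation within the class
  )

class_cor
```

A negative r for a class agrees with the downward trend in its panel. Classes with a small n deserve extra caution, since a handful of points can produce a correlation that is not very stable.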