Find the mpg data in R. This is the dataset that you will use for the first three questions.
displ for each transmission type trans from the mpg data set. Hint: Can you figure out how to rotate the x-axis categories so they are all readable?
# Use geom_boxplot and add coord_flip() to swap the x-axis and y-axis so that the transmission labels are readable.
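library(ggplot2)  # provides ggplot() and the mpg dataset
data("mpg")
ggplot(mpg, aes(x = trans, y = displ)) +
  geom_boxplot() +
  labs(title = "Engine displacement for each transmission type",
       x = "Transmission type",
       y = "Engine displacement") +
  coord_flip()  # swap the axes so the trans labels do not overlap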
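The hint asks about rotating the x-axis categories. As an alternative to coord_flip(), the following is a minimal sketch that keeps trans on the x-axis and tilts the tick labels with theme(). The 45-degree angle, the hjust value, and the object name p_rotated are illustrative assumptions, not part of the original answer.

```r
library(ggplot2)

# Keep trans on the x-axis and rotate the tick labels instead of flipping
# the axes. The angle and hjust values are illustrative choices.
p_rotated <- ggplot(mpg, aes(x = trans, y = displ)) +
  geom_boxplot() +
  labs(x = "Transmission type", y = "Engine displacement") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

p_rotated
```

Either approach makes the labels readable; coord_flip() tends to be easier to read when the category names are long.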
class type in mpg.
# Use geom_bar and add geom_text to label each bar with its frequency.
ggplot(mpg, aes(x = class)) +
  geom_bar() +
  # after_stat(count) replaces the deprecated ..count.. notation
  geom_text(stat = "count", aes(label = after_stat(count)), vjust = -0.25) +
  labs(title = "The frequency of each `class` type",
       x = "Class",
       y = "Frequency")
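The numbers printed above the bars can be cross-checked without plotting. A small sketch using base R's table(); the name class_counts is only for illustration.

```r
library(ggplot2)  # loaded here only for the mpg dataset

# Count the rows in each class; these are the heights geom_bar() draws.
class_counts <- table(mpg$class)
class_counts

# Sanity check: the counts should add up to the number of rows in mpg.
sum(class_counts) == nrow(mpg)
```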
cyl type within class. Hint: You might have to use group or convert cyl to a factor (as.factor).
# Building on the class frequency chart, use as.factor to convert cyl into a factor and map it to fill. Then add position = "stack" to show the count of each cyl value within each class.
ggplot(mpg, aes(x = class)) +
  geom_bar(aes(fill = as.factor(cyl)), position = "stack") +
  labs(title = "The frequency of each `cyl` type within `class`",
       x = "Class",
       y = "Frequency")
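Stacked bars make the class totals easy to read, but the individual cyl counts are harder to compare. As a hedged alternative, the sketch below places the bars side by side with position = "dodge" and also builds the underlying cross-tabulation. The names p_dodge and cyl_by_class and the legend title "Cylinders" are assumptions added for illustration.

```r
library(ggplot2)

# One bar per cyl value inside each class instead of a single stacked bar.
p_dodge <- ggplot(mpg, aes(x = class, fill = as.factor(cyl))) +
  geom_bar(position = "dodge") +
  labs(x = "Class", y = "Frequency", fill = "Cylinders")

p_dodge

# Cross-tabulation of the counts behind the bars (rows: class, columns: cyl).
cyl_by_class <- table(mpg$class, mpg$cyl)
cyl_by_class
```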
4. Draw a scatter plot using ggplot showing the relationship between
cty and hwy. Explain the utility or lack of utility of this graphic.
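# Use geom_point to create a scatter plot. Because cty and hwy are recorded as whole numbers, many observations share the same coordinates and are drawn on top of each other (overplotting), so the plot hides how many cars sit at each point.
ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point()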
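To make the overplotting concrete, the sketch below counts how many rows repeat a (cty, hwy) pair that already appeared, and then draws the same data with geom_count(), which scales each point by the number of observations at that location. The names n_overlapping and p_count are illustrative only.

```r
library(ggplot2)

# Rows whose (cty, hwy) pair already occurred earlier in the data; each of
# these is drawn exactly on top of another point by geom_point().
n_overlapping <- sum(duplicated(mpg[, c("cty", "hwy")]))
n_overlapping

# geom_count() maps the number of observations at each location to point size.
p_count <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_count()

p_count
```

Jittering or a lower alpha value are other common ways to reveal the hidden points.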
mpg and write a brief summary about why you chose that visualization.
# I want to explore the relationship between hwy and displ with a scatter plot. To reduce overplotting, I apply geom_jitter. I also map color to class (the type of car) to add another level of information, and I use geom_smooth with method = lm to add a fitted line for each class.
ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  # geom_jitter() draws the points itself, so a separate geom_point()
  # layer would plot every observation twice
  geom_jitter(position = position_jitter(width = 0.5, height = 0.5)) +
  geom_smooth(se = FALSE, method = "lm") +
  labs(title = "Highway miles per gallon versus engine displacement for different types of car",
       x = "Engine displacement, in liters",
       y = "Highway miles per gallon") +
  theme(title = element_text(size = 8))
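Because color = class is set in the top-level aes(), geom_smooth() fits one line per class. If a single overall trend line is wanted instead, one possible sketch is to move the color mapping into the point layer only. The jitter amounts, the black line color, and the name p_single are assumptions for illustration.

```r
library(ggplot2)

# Color only the points by class; the smoother then sees all cars as one
# group and fits a single linear trend.
p_single <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_jitter(aes(color = class), width = 0.1, height = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "black") +
  labs(x = "Engine displacement, in liters",
       y = "Highway miles per gallon")

p_single
```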