Before we begin, we need to load the ‘ggplot2’ and ‘ggthemes’ libraries. We will be using the ‘mpg’ data set, which ships with ‘ggplot2’, for this project.
library(ggplot2)
library(ggthemes)
Create a bar graph for the “manufacturer” variable. Names of manufacturers should be displayed at a 90 degree angle.
For this question, we use a combination of the functions ggplot(), geom_bar() and theme() to create the bar graph.
# Creating a variable 'parta' to store the graph.
parta <- ggplot(mpg, aes(x = manufacturer)) +
  geom_bar(fill = "blue") + # Adding bars to the empty plot, colored blue.
  theme_gray() + # Applying theme for aesthetic purposes.
  theme(axis.text.x = element_text(angle = 90)) + # Rotating x axis labels by 90 degrees.
  labs(title = "Distribution of Manufacturer in the 'mpg' dataset",
       x = "Manufacturer",
       y = "Count") # Adding a title, and renaming the x and y axes.
# Printing graph
parta
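With angle = 90 alone, the rotated labels may not line up exactly with their tick marks. The following is a minimal sketch of one common adjustment, which also checks the bar heights numerically. vjust and hjust are standard element_text() arguments, while the names 'parta_aligned' and 'manufacturer_counts' are ours and only used for illustration.

```r
library(ggplot2)

# Same bar graph as 'parta', with the rotated labels anchored to their ticks.
# 'parta_aligned' is a hypothetical name used only in this sketch.
parta_aligned <- ggplot(mpg, aes(x = manufacturer)) +
  geom_bar(fill = "blue") +
  theme_gray() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)) +
  labs(title = "Distribution of Manufacturer in the 'mpg' dataset",
       x = "Manufacturer",
       y = "Count")

# geom_bar() counts the rows for each manufacturer; table() gives the same counts.
manufacturer_counts <- table(mpg$manufacturer)
manufacturer_counts

# Printing graph
parta_aligned
```

Whether the adjusted alignment looks better is a matter of taste; the original plot already satisfies the question.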
Create a graph for the ‘year’ variable.
This is similar to the previous question, except that, since it was not specified, we do not rotate the x axis labels. Another thing to note is that we treat the ‘year’ variable as a factor rather than an integer, so that the x axis shows discrete values.
# Creating a variable 'partb' to store the graph.
partb <- ggplot(mpg, aes(x = factor(year))) + # Treating the 'year' variable as a factor.
  geom_bar(fill = "blue") + # Adding bars to empty plot, colored blue.
  theme_gray() + # Applying theme for aesthetic reasons.
  labs(title = "Distribution of Year in the 'mpg' dataset",
       x = "Year",
       y = "Count") # Adding a title, and renaming the x and y axes.
# Printing graph
partb
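To see why the conversion to a factor matters, we can inspect the ‘year’ column directly. This is only a quick check, not part of the required answer, and the names 'year_levels' and 'year_counts' are ours.

```r
library(ggplot2)

# 'year' is stored as an integer column in 'mpg'.
class(mpg$year)

# Converting to a factor turns each distinct year into a discrete level.
year_levels <- levels(factor(mpg$year))
year_levels

# Number of cars per year, which should match the bar heights in 'partb'.
year_counts <- table(factor(mpg$year))
year_counts
```

With the integer version, ggplot2 would place the bars on a continuous axis, including the empty years between the recorded ones.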
Create a density curve for each of the quantitative variables (displ, cty, hwy), conditioning on each type of cylinder. You should overlay the 3 curves for displ, 3 for cty, and 3 for hwy. You should have 3 separate plots, each having 3 curves.
In this question, we use the geom_density() function to draw the density curves. Each of the ‘displ’, ‘cty’ and ‘hwy’ variables gets its own plot, with one curve per value of ‘cyl’ (mapped to color as a factor), so that the distributions can be compared across the number of cylinders in a car.
# Creating a variable 'partc1' to store the graph.
partc1 <- ggplot(mpg, aes(x = displ, color = factor(cyl))) + # Mapping x to 'displ', and coloring the curves by the 'cyl' variable.
  geom_density() + # Adding density curves.
  labs(title = "Density distribution of the variable 'displ'",
       x = "Displ",
       y = "Density",
       col = "Cyl") + # Adding title, and renaming x and y axes, and also the legend title.
  theme_gray() # Applying a theme.
# Printing the graph.
partc1
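One thing worth checking is how many distinct values ‘cyl’ takes. The question speaks of three curves per plot, while ‘cyl’ in ‘mpg’ has four values (4, 5, 6 and 8), with only a few 5-cylinder cars. If exactly three curves are wanted, one option is to drop the 5-cylinder rows first. This is our own reading of the question, and 'mpg_three_cyl' and 'partc1_three' are hypothetical names.

```r
library(ggplot2)

# How many cars there are for each number of cylinders.
cyl_counts <- table(mpg$cyl)
cyl_counts

# Assumption: the 5-cylinder group is excluded to leave three curves.
mpg_three_cyl <- subset(mpg, cyl != 5)

partc1_three <- ggplot(mpg_three_cyl, aes(x = displ, color = factor(cyl))) +
  geom_density() +
  labs(title = "Density distribution of the variable 'displ'",
       x = "Displ",
       y = "Density",
       color = "Cyl") +
  theme_gray()

# Printing the graph.
partc1_three
```

The plots in this report keep all four values of ‘cyl’.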
We then repeat this for the other two variables.
partc2 <- ggplot(mpg, aes(x = cty, color = factor(cyl))) + # This time mapped for 'cty'.
  geom_density() +
  labs(title = "Density distribution of the variable 'cty'",
       x = "Cty",
       y = "Density",
       col = "Cyl") +
  theme_gray()
partc2
partc3 <- ggplot(mpg, aes(x = hwy, color = factor(cyl))) + # This time mapped for 'hwy'.
  geom_density() +
  labs(title = "Density distribution of the variable 'hwy'",
       x = "Hwy",
       y = "Density",
       col = "Cyl") +
  theme_gray()
partc3
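Since the three density plots differ only in the variable on the x axis, the repetition could be folded into a small helper. The sketch below assumes a ggplot2 version that supports the .data pronoun inside aes(). The function 'density_by_cyl' and the list 'density_plots' are hypothetical names of ours, and the x axis label uses the raw variable name rather than the capitalized labels used above.

```r
library(ggplot2)

# Hypothetical helper: builds one density plot for the column named 'var_name',
# with one curve per value of 'cyl'.
density_by_cyl <- function(var_name) {
  ggplot(mpg, aes(x = .data[[var_name]], color = factor(cyl))) +
    geom_density() +
    labs(title = paste0("Density distribution of the variable '", var_name, "'"),
         x = var_name,
         y = "Density",
         color = "Cyl") +
    theme_gray()
}

# One plot each for 'displ', 'cty' and 'hwy'.
density_plots <- lapply(c("displ", "cty", "hwy"), density_by_cyl)

# Printing the graphs.
for (p in density_plots) print(p)
```

For only three plots, writing them out explicitly as above is just as reasonable; the helper mainly pays off if more variables are added later.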
Create side-by-side boxplots for ‘displ’ by ‘cyl’.
For this question, we use the geom_boxplot() function, with ‘cyl’ converted to a factor so that each number of cylinders gets its own box.
# Creating the variable 'partd' to store the graph.
partd <- ggplot(mpg, aes(x = factor(cyl), y = displ)) +
  geom_boxplot(fill = "blue") + # Adding boxplots to empty graph, colored blue.
  theme_gray() + # Adding a theme
  labs(title = "Side-by-side Boxplots for 'displ' by 'cyl'", # labs() takes 'title', not 'main'.
       x = "Cyl",
       y = "Displ") # Adding a main title, and renaming the x and y axes.
# Printing the graph.
partd
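As a rough numerical companion to the boxplots, we can compute the median of ‘displ’ for each value of ‘cyl’, which corresponds to the middle line of each box. This is only a sanity check of ours, using base R's aggregate(); 'displ_medians' is a hypothetical name.

```r
library(ggplot2)

# Median engine displacement for each number of cylinders.
displ_medians <- aggregate(displ ~ cyl, data = mpg, FUN = median)
displ_medians
```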