The objective of this problem set is to use ggplot2 to build graphs.
## ── Attaching packages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0 ✔ purrr 0.2.5
## ✔ tibble 1.4.2 ✔ dplyr 0.7.8
## ✔ tidyr 0.8.2 ✔ stringr 1.3.1
## ✔ readr 1.2.1 ✔ forcats 0.3.0
## ── Conflicts ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
Stacked bar chart for the mpg dataset.
# Create a base ggplot2 plot; the charts below are modifications of this base chart
baseplot_mpg <- ggplot(mpg)
baseplot_mpg +
  geom_bar(aes(class, fill = trans)) +
  scale_fill_discrete(name = "transmission")
This chart shows the number of vehicles in each class, with each bar split into stacked segments by transmission type.
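The counts behind the stacked bars can be checked without plotting. The following is a minimal sketch, assuming ggplot2 is installed so that the mpg dataset is available; the names counts and bar_heights are illustrative and not part of the original code.

```r
library(ggplot2)  # provides the mpg dataset

# Cross-tabulate transmission (rows) by vehicle class (columns).
# Each cell is the height of one stacked segment.
counts <- table(mpg$trans, mpg$class)

# Summing each column gives the total height of the bar for that class.
bar_heights <- colSums(counts)
print(bar_heights)
```

Because geom_bar() counts rows by default, the bar heights add up to the number of rows in mpg.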
Box plot for the mpg dataset.
# Reuse the base plot from above and create the required visualization
baseplot_mpg +
  geom_boxplot(aes(manufacturer, hwy)) +
  theme_classic() +
  coord_flip() +
  labs(y = "Highway Fuel Efficiency (miles per gallon)",
       x = "Vehicle Manufacturer")
This chart visualizes the distribution of the fuel efficiency per vehicle manufacturer.
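The middle line of each box is the median highway mileage for that manufacturer. As a rough cross-check, the medians can be computed directly; this sketch assumes ggplot2 is installed for the mpg dataset, and hwy_medians is an illustrative name.

```r
library(ggplot2)  # provides the mpg dataset

# Median highway mileage per manufacturer, sorted from lowest to highest.
hwy_medians <- sort(tapply(mpg$hwy, mpg$manufacturer, median))
print(hwy_medians)
```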
Density plot for the diamonds dataset.
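# theme_economist() and the economist scales come from the ggthemes package
library(ggthemes)
baseplot_diamonds <- ggplot(diamonds)
baseplot_diamonds +
  geom_density(aes(price,
                   fill = cut,
                   color = cut),
               alpha = 0.3,
               size = 0.6) +
  labs(title = "Diamond Price Density", x = "Diamond Price (USD)", y = "Density") +
  theme_economist() +
  scale_colour_economist() +
  # use the same palette for the fill so that fills match the outlines
  scale_fill_economist()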
This graph shows the estimated density of diamond prices for each cut. It uses the Economist theme from the ggthemes package.
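Each density curve is scaled so that the area under it is approximately 1, which means the curves compare the shape of the price distributions rather than the number of diamonds per cut. The sketch below illustrates this for one cut using base R's density() with its default settings; it assumes ggplot2 is installed for the diamonds dataset, and the variable names are illustrative.

```r
library(ggplot2)  # provides the diamonds dataset

# Kernel density estimate of price for a single cut.
ideal_prices <- diamonds$price[diamonds$cut == "Ideal"]
d <- density(ideal_prices)

# Approximate the area under the curve with the trapezoid rule.
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
print(area)
```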
Scatter plot with a fitted line for the iris dataset.
# Create a base plot with the required aesthetics
baseplot_iris <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length))
baseplot_iris +
  geom_point() +
  geom_smooth(method = lm) +
  labs(title = "Relationship between Petal and Sepal Length",
       x = "Iris Sepal Length",
       y = "Iris Petal Length") +
  theme_minimal()
This is a scatter plot with a linear regression line added by geom_smooth(method = lm).
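The line drawn by geom_smooth(method = lm) corresponds to an ordinary least squares fit of petal length on sepal length. A minimal sketch of the same fit outside ggplot2, using only base R (the name fit is illustrative):

```r
# Fit petal length as a linear function of sepal length.
fit <- lm(Petal.Length ~ Sepal.Length, data = iris)

# The intercept and slope of the fitted line.
print(coef(fit))
```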
Scatter plot comparing species in the iris dataset.
# Use the base plot from above and map color to Species
baseplot_iris +
  geom_point(aes(color = Species)) +
  geom_smooth(aes(color = Species), method = lm, se = FALSE) +
  labs(title = "Relationship between Petal and Sepal Length",
       subtitle = "Species level comparison",
       x = "Iris Sepal Length",
       y = "Iris Petal Length") +
  theme_minimal() +
  theme(panel.grid.major = element_line(size = 1),
        panel.grid.minor = element_line(size = 0.7))
This chart compares the linear fits for the different species, each shown in a different color.
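Mapping color to Species inside geom_smooth() fits a separate regression line for each species. The per-species coefficients can be sketched in base R as follows; species_fits is an illustrative name and not part of the original code.

```r
# Fit one linear model per species and collect the coefficients.
# The result is a matrix with one column per species.
species_fits <- sapply(
  split(iris, iris$Species),
  function(d) coef(lm(Petal.Length ~ Sepal.Length, data = d))
)
print(species_fits)
```

Comparing the Sepal.Length row across columns shows how the slope differs between species.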