Objectives

The objective of this problem set is to use ggplot2 to build graphs.

## ── Attaching packages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0     ✔ purrr   0.2.5
## ✔ tibble  1.4.2     ✔ dplyr   0.7.8
## ✔ tidyr   0.8.2     ✔ stringr 1.3.1
## ✔ readr   1.2.1     ✔ forcats 0.3.0
## ── Conflicts ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

Vis 1

Stacked bar chart for the mpg dataset.

# create a base ggplot2 plot; the charts below are modifications of this base plot
baseplot_mpg <- ggplot(mpg)

baseplot_mpg+
  geom_bar(aes(class,fill=trans))+
  scale_fill_discrete(name="transmission")

This chart shows the number of vehicles in each class, with each bar split into stacked segments by transmission type.
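The counts behind the stacked bars can be checked with a plain cross-tabulation. This is a minimal sketch, assuming ggplot2 is installed so that the mpg dataset is available; the names `class_by_trans` and `bar_heights` are illustrative and not part of the original problem set.

```r
library(ggplot2)  # provides the mpg dataset

# Cross-tabulate vehicle class against transmission type;
# each cell corresponds to the height of one stacked segment.
class_by_trans <- table(mpg$class, mpg$trans)

# Row sums give the total height of the bar for each class.
bar_heights <- rowSums(class_by_trans)

print(class_by_trans)
print(bar_heights)
```

Comparing this table with the chart is a quick way to confirm that geom_bar() is counting rows rather than summing a variable.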

Vis 2

Box plot for the mpg dataset.

#reuse the base plot from above and create the required vis
baseplot_mpg+
  geom_boxplot(aes(manufacturer,hwy))+
  theme_classic()+
  coord_flip()+
  labs(y="Highway Fuel Efficiency (miles per gallon)",
       x="Vehicle Manufacturer")

This chart shows the distribution of highway fuel efficiency for each vehicle manufacturer.
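The center line of each box is the median, so the ordering seen in the box plot can be reproduced numerically. The sketch below assumes ggplot2 is installed for the mpg dataset; `hwy_median` is an illustrative name.

```r
library(ggplot2)  # provides the mpg dataset

# Median highway mileage per manufacturer: the center line of each box.
hwy_median <- tapply(mpg$hwy, mpg$manufacturer, median)

# Sort from most to least efficient for easier comparison with the chart.
print(sort(hwy_median, decreasing = TRUE))
```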

Vis 3

Density plot for the diamonds dataset.

# ggthemes provides theme_economist() and scale_colour_economist()
library(ggthemes)

baseplot_diamonds <- ggplot(diamonds)
baseplot_diamonds+
  geom_density(aes(price,
                   fill=cut,
                   color=cut),
               alpha=0.3,
               size=0.6)+
  labs(title="Diamond Price Density",x="Diamond Price (USD)",y="Density")+
  theme_economist()+
  scale_colour_economist()

This graph shows the estimated price density for each diamond cut. It uses the Economist theme from the ggthemes package.
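Because each curve is a separate density estimate, the area under every curve is about 1 regardless of how many diamonds fall in that cut. The following sketch illustrates this with stats::density() at its default settings, which is an assumption about what matches the plot; `dens_by_cut` and `area` are illustrative names.

```r
library(ggplot2)  # provides the diamonds dataset

# Estimate a separate kernel density of price for each cut.
dens_by_cut <- lapply(split(diamonds$price, diamonds$cut), density)

# Approximate the area under each curve with a Riemann sum
# over the evenly spaced evaluation grid returned by density().
area <- sapply(dens_by_cut, function(d) sum(d$y) * diff(d$x[1:2]))

print(round(area, 2))
```

The curves therefore compare the shape of the price distribution across cuts, not the number of diamonds in each cut.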

Vis 4

Scatter plot with a fitted line for the iris dataset.

#create baseplot with the required aesthetics
baseplot_iris <- ggplot(iris,aes(x=Sepal.Length, y=Petal.Length))
baseplot_iris+
  geom_point()+
  geom_smooth(method = lm)+
  labs(title="Relationship between Petal and Sepal Length",
       x="Iris Sepal Length",
       y="Iris Petal Length")+
  theme_minimal()

This is a scatter plot with a linear regression line added by geom_smooth(method = lm).
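The fitted line can also be inspected directly. With method = lm and no formula given, geom_smooth() fits y ~ x, so an equivalent model is sketched below using the built-in iris data; `fit` is an illustrative name.

```r
# Fit the same simple linear regression that the smoothing layer draws.
fit <- lm(Petal.Length ~ Sepal.Length, data = iris)

# The intercept and slope of the fitted line.
print(coef(fit))
```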

Vis 5

Scatter plots by species for comparison, using the iris dataset.

# use the baseplot from above and add color as aesthetics.
baseplot_iris+
  geom_point(aes(color = Species))+
  geom_smooth(aes(color = Species),method = lm,se = FALSE)+
  labs(title="Relationship between Petal and Sepal Length",
       subtitle="Species level comparison",
       x="Iris Sepal Length",
       y="Iris Petal Length")+
  theme_minimal()+
  theme(panel.grid.major = element_line(size = 1),
        panel.grid.minor = element_line(size = 0.7))

This chart compares the linear fits for each species, with each species identified by a different color.
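Mapping color to Species groups the data, so the smoothing layer fits one line per species. A rough equivalent outside ggplot2 is sketched below with the built-in iris data; `fits` and `slopes` are illustrative names.

```r
# Fit one linear model per species, mirroring the per-group fits in the chart.
fits <- lapply(split(iris, iris$Species),
               function(d) lm(Petal.Length ~ Sepal.Length, data = d))

# Extract the slope of each species-level line for comparison.
slopes <- sapply(fits, function(f) coef(f)[["Sepal.Length"]])

print(slopes)
```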