Visualization

This assignment is for practicing ggplot2 and gaining experience with visualization. The objective is to write the code necessary to exactly recreate the provided graphics.

—“Experts from datascienceDojo”

ggplot2 has become the de facto standard for visualization in R. It helps you design print-quality graphs on the fly. The user has fine-grained control over layering graphical elements while building a visualization.

Every visualization in ggplot2 is composed of:

  1. Data

  2. Layers

  3. Scales

  4. Coordinates

  5. Faceting

  6. Themes

Data - the main ingredient of a visualization (required)

Layers - what you see on the plot

Scales - the mapping between data and output

Coordinates - the visualization perspective

Faceting - a visual drill-down into the data

Themes - controls the details of the display
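
As a rough illustration of how the six components fit together, the sketch below builds one plot that touches each of them. The choice of variables (displ, hwy, drv), the axis limits, and theme_minimal() are assumptions made for this example only; they are not part of the assignment graphics.

```r
library(ggplot2)

# Sketch: one plot that uses each of the six components on the built-in mpg data.
p <- ggplot(mpg, aes(x = displ, y = hwy)) +   # 1. Data (with aesthetic mappings)
  geom_point() +                              # 2. Layer: points
  scale_y_continuous(name = "Highway mpg") +  # 3. Scale: how hwy maps to the y-axis
  coord_cartesian(xlim = c(1, 7)) +           # 4. Coordinates: Cartesian, x limited to 1-7
  facet_wrap(~ drv) +                         # 5. Faceting: one panel per drive type
  theme_minimal()                             # 6. Theme: non-data display details

print(p)
```

Only the data and at least one layer are needed to draw something; ggplot2 supplies defaults for the scale, coordinate system, faceting, and theme when they are left out.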

Working with the grammar of graphics:

A ggplot2 visualization has three required components:

  1. Data

  2. Aesthetics

  3. Layers
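
The role of the three required components can be seen by building a plot in two steps. This is a minimal sketch; the variables displ and hwy and the object names p_base and p_scatter are chosen only for illustration.

```r
library(ggplot2)

# Data + aesthetics only: the axes are set up, but there is nothing to draw yet.
p_base <- ggplot(mpg, aes(x = displ, y = hwy))
length(p_base$layers)      # number of layers so far

# Adding a layer tells ggplot2 how to draw the mapped data.
p_scatter <- p_base + geom_point()
length(p_scatter$layers)   # one layer after adding geom_point()

print(p_scatter)
```

Printing p_base on its own should give an empty panel with labelled axes, which is a quick way to check the data and aesthetic mappings before choosing a geom.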

if(!require(ggplot2)) {install.packages("ggplot2")}
## Loading required package: ggplot2
library(ggplot2)

if(!require(ggthemes)) {install.packages("ggthemes")}
## Loading required package: ggthemes
library(ggthemes)

str(mpg)
## Classes 'tbl_df', 'tbl' and 'data.frame':    234 obs. of  11 variables:
##  $ manufacturer: chr  "audi" "audi" "audi" "audi" ...
##  $ model       : chr  "a4" "a4" "a4" "a4" ...
##  $ displ       : num  1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
##  $ year        : int  1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
##  $ cyl         : int  4 4 4 4 6 6 6 4 4 4 ...
##  $ trans       : chr  "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
##  $ drv         : chr  "f" "f" "f" "f" ...
##  $ cty         : int  18 21 20 21 16 18 18 18 16 20 ...
##  $ hwy         : int  29 29 31 30 26 26 27 26 25 28 ...
##  $ fl          : chr  "p" "p" "p" "p" ...
##  $ class       : chr  "compact" "compact" "compact" "compact" ...

Vis 1

This graphic is a traditional stacked bar chart. It uses the mpg dataset, which ships with the ggplot2 package.

         mpg$trans <- as.factor(mpg$trans)
         ggplot(mpg, aes(class, fill = trans)) + # Data and aesthetics (x-axis is "class")
           geom_bar() + # Layer: bar plot, stacked by default
           labs(fill = "Transmission") # Legend title
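
By default geom_bar() counts the rows for each x value, and the fill mapping splits every bar into one segment per transmission type. The sketch below computes the same counts with base R as a way to sanity-check the chart; the object names counts and bar_heights are made up for this example.

```r
library(ggplot2)  # provides the mpg dataset

# Rows = vehicle class, columns = transmission type; each cell is one bar segment.
counts <- table(mpg$class, mpg$trans)
counts

# Total height of each stacked bar (number of cars per class).
bar_heights <- rowSums(counts)
bar_heights
```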

Vis 2

This is a boxplot built using the mpg dataset. The classic theme is used to get a plain background.

        mpg$manufacturer <- as.factor(mpg$manufacturer)
        ggplot(mpg, aes(manufacturer, hwy)) + # Data and aesthetics
          theme_classic() + # Theme: plain background
          geom_boxplot() + # Layer: boxplot
          coord_flip() + # Coordinates: swap the x- and y-axes
          labs(y = "Highway Fuel Efficiency(miles/gallon) ", x = "Vehicle Manufacturer") # Labels

Vis 3

      ggplot(diamonds, aes(x = price, color = cut, fill = cut)) + # Data and aesthetics
        theme_economist() + # Economist theme from ggthemes
        geom_density(aes(fill = factor(cut)), alpha = 0.25) + # Transparent density plot
        labs(x = "Diamond Price (USD)", y = "Density", title = "Diamond Price Density ") # Labels

Vis 4

      ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) + # Data and aesthetics
        theme_bw() + # White background theme
        theme(panel.border = element_blank(), axis.ticks = element_blank()) + # Remove the border and the x- and y-axis ticks
        geom_point() + # Scatter plot
        geom_smooth(method = "lm") + # Curve fitting: linear regression
        labs(x = "Iris Sepal Length", y = "Iris Petal Length", title = "Relationship between Petal and Sepal Length") # Labels
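
The line drawn by geom_smooth(method = "lm") is an ordinary least-squares fit of y on x. As a sketch of what that layer computes, the same model can be fitted directly with lm(); the names fit, new_x, and band are illustrative, and the five evaluation points are an arbitrary choice.

```r
# Fit the same straight line that the smoothing layer draws.
fit <- lm(Petal.Length ~ Sepal.Length, data = iris)
coef(fit)   # intercept and slope of the fitted line

# 95% confidence interval for the fitted mean, comparable to the shaded band.
new_x <- data.frame(
  Sepal.Length = seq(min(iris$Sepal.Length), max(iris$Sepal.Length), length.out = 5)
)
band <- predict(fit, newdata = new_x, interval = "confidence", level = 0.95)
band
```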

Vis 5

      ggplot(data = iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) + # Data and aesthetics
        theme_bw() + # White background
        theme(legend.position = "bottom", # Legend below the plot
              panel.border = element_blank(), # Remove the border
              panel.grid.minor = element_blank(), # Remove the background grid
              panel.grid.major = element_blank()) +
        geom_point() + # Scatter plot
        geom_smooth(method = "lm", se = FALSE) + # Linear fit per species, no confidence band
        labs(x = "Iris Sepal Length", y = "Iris Petal Length", title = "Relationship between Petal and Sepal Length", subtitle = "Species level comparison") # Labels
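
Because color = Species is mapped in the top-level aes(), the smoothing layer fits one line per species rather than a single line for the whole dataset. A minimal sketch of the equivalent per-group regressions in base R follows; the names fits and slopes are assumptions for this example.

```r
# One simple linear regression per species, mirroring the grouped smoothing lines.
fits <- lapply(split(iris, iris$Species), function(d) {
  coef(lm(Petal.Length ~ Sepal.Length, data = d))
})

# Slope of each species' line.
slopes <- sapply(fits, function(b) b[["Sepal.Length"]])
slopes
```

Comparing these slopes with the single pooled fit from Vis 4 shows why the species-level view can tell a different story from the overall trend.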