Objectives

The objective of this problem set is to gain experience working with the ggplot2 package for data visualization.

Vis 1

This graphic is a traditional stacked bar chart. It uses the mpg dataset, which is built into the ggplot2 package, so you can access it directly with ggplot(mpg, ...). There is one modification to the defaults in this graphic: I renamed the legend for clarity.

library(datasets)                         #Load default datasets
library(ggplot2)                          #Load ggplot2 package
## Registered S3 methods overwritten by 'ggplot2':
##   method         from 
##   [.quosures     rlang
##   c.quosures     rlang
##   print.quosures rlang
ggplot(mpg)+                              #Create plot for mpg dataset
  geom_bar(aes(x=class,fill=trans))+      #Plot bar chart; x-axis = class, fill color = trans
  scale_fill_discrete(name="Transmission") #Name the legend Transmission
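
By default geom_bar() counts the rows in each group, so the bar segments above are counts of cars per class and transmission type. The sketch below makes that counting step explicit and then plots the precomputed counts with geom_col(). The name counts is only illustrative, and this is one possible way to reproduce the chart, not the required one.

```r
library(ggplot2)  # provides the mpg dataset

# Count cars for every class/transmission combination
counts <- as.data.frame(table(class = mpg$class, trans = mpg$trans))

# geom_col() uses the supplied heights instead of counting rows itself
p <- ggplot(counts, aes(x = class, y = Freq, fill = trans)) +
  geom_col() +
  labs(y = "count", fill = "Transmission")  # labs(fill = ...) also renames the legend
print(p)
```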

Vis 2

This boxplot is also built using the mpg dataset. Notice the changes in the axis labels and the altered theme_XXXX.

ggplot(mpg)+
  geom_boxplot(aes(manufacturer,hwy)) + #Boxplot of highway fuel efficiency by manufacturer
  coord_flip()+                         #Flip to horizontal view
  labs(y = "Highway Fuel Efficiency (mile/gallon)", x="Vehicle Manufacturer")+ # Assign x and y-labels
  theme_classic()                       #Use classic theme
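
The center line of each box is the median highway mileage for that manufacturer. As a rough check on what the plot shows, the sketch below computes those medians directly with base R's aggregate(). The name hwy_medians is illustrative only.

```r
library(ggplot2)  # provides the mpg dataset

# Median highway mileage per manufacturer (the center line of each box)
hwy_medians <- aggregate(hwy ~ manufacturer, data = mpg, FUN = median)

# Sort from most to least efficient for easier reading
hwy_medians <- hwy_medians[order(-hwy_medians$hwy), ]
print(hwy_medians)
```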

Vis 3

This graphic is built with another dataset, diamonds, which is also built into the ggplot2 package. For this one I used an additional package called ggthemes; load it with library(ggthemes) to reproduce this view.

library(ggthemes)
## Warning: package 'ggthemes' was built under R version 3.6.1
ggplot(diamonds)+
  geom_density(aes(price,        #Density plot of diamond price
                   fill=cut,     #Set legend = cut. Using fill colors to differentiate
                   color=cut),   #Using stroke colors to differentiate
               alpha=0.2,        #Set transparency level of fill colors
               size=0.6)+        #Set width of strokes
  labs(title = "Diamond Price Density",x="Diamond Price (USD)",y="Density")+ #Add title and X and Y Labels
  theme_economist()              #Use theme 'Economist'
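
Because this view depends on ggthemes, the plot fails if that package is not installed. A minimal sketch of one way to guard against that is shown below: it uses theme_economist() when ggthemes is available and otherwise falls back to a built-in theme. The fallback choice (theme_minimal()) and the name plot_theme are assumptions made for illustration.

```r
library(ggplot2)

# Pick the Economist theme if ggthemes is installed, else a built-in theme
plot_theme <- if (requireNamespace("ggthemes", quietly = TRUE)) {
  ggthemes::theme_economist()
} else {
  theme_minimal()
}

p <- ggplot(diamonds) +
  geom_density(aes(price, fill = cut, color = cut), alpha = 0.2) +
  labs(title = "Diamond Price Density", x = "Diamond Price (USD)", y = "Density") +
  plot_theme
print(p)
```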

Vis 4

For this plot we are changing vis idioms to a scatter plot framework. Additionally, I am using the ggplot2 package to fit a linear model to the data, all within the plot framework. There are edited labels and theme modifications as well.

ggplot(iris,
       aes(Sepal.Length,Petal.Length))+  # set x-axis as Sepal.Length; y-axis as Petal.Length
  geom_point()+                          # Scatterplot
  geom_smooth(method=lm)+                # Add regression line to scatterplot 
  labs(title="Relationship between Petal and Sepal Length",x="Iris Sepal Length",y="Iris Petal Length")+          # Label title, x and y labels
  theme_minimal()                        # Use theme 'minimal'
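
With method=lm and no formula given, geom_smooth() fits y ~ x, which here is Petal.Length regressed on Sepal.Length. The sketch below fits the same model with lm() so the intercept and slope behind the plotted line can be inspected. The name fit is illustrative.

```r
# iris ships with base R (datasets package), so no extra packages are needed
fit <- lm(Petal.Length ~ Sepal.Length, data = iris)

# Intercept and slope of the regression line drawn by geom_smooth(method = lm)
print(coef(fit))

# Fuller summary, including standard errors and R-squared
print(summary(fit))
```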

Vis 5

Finally, in this vis I extend the last example by plotting the same data but using an additional channel, color, to communicate species-level differences. Again I fit a linear model to the data, but this time one for each species, and I add further theme and labeling modifications.

ggplot(iris,
       aes(Sepal.Length,     #Set x-axis as Sepal Length
           Petal.Length,     #Set y-axis as Petal Length
           color=Species))+  #Set legend = Species; Use colors to differentiate Species
  geom_point()+                     # Scatterplot
  geom_smooth(method=lm,se=FALSE)+  # Draw regression line w/o confidence region
  labs(title="Relationship between Petal and Sepal Length",
       subtitle = "Species level comparison",
       x="Iris Sepal Length",
       y="Iris Petal Length")+      # Label title, subtitle, x and y labels
  theme_tufte()+                    # Use theme 'tufte' (from ggthemes)
  theme(legend.position="bottom")   # Position legend at the bottom
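
Mapping color=Species in the top-level aes() groups the data, so geom_smooth() fits a separate line for each species. A rough equivalent outside the plot is to split the data by species and fit one lm() per subset, as sketched below. The name species_fits is illustrative.

```r
# Fit Petal.Length ~ Sepal.Length separately within each species
species_fits <- lapply(split(iris, iris$Species), function(d) {
  coef(lm(Petal.Length ~ Sepal.Length, data = d))
})

# One intercept/slope pair per species, matching the three lines in the plot
print(species_fits)
```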