Objectives

The objective of this problem set is to gain experience working with the ggplot2 package for data visualization. To that end, I have provided a series of graphics, all created using the ggplot2 package. Your task for this assignment is to write the code necessary to recreate the provided graphics exactly.

When you have finished, submit a link to your file on rpubs.com. Be sure to set echo = TRUE for each graphic so that I can see both the visualization and the code required to create it.
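
One way to satisfy the echo requirement, sketched under the assumption that the submission is an R Markdown file rendered with knitr (the assignment itself only names rpubs.com), is to set the option once in a setup chunk instead of repeating it on every chunk:

```r
# Sketch only: assumes an R Markdown document rendered with knitr.
library(knitr)

# Make echo = TRUE the default for all later chunks, so each plot is
# shown together with the code that produced it.
opts_chunk$set(echo = TRUE)
```

Individual chunks can still override the default by setting echo in their own chunk header.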

Questions

  1. This graphic is a traditional stacked bar chart. It uses the mpg dataset, which is built into the ggplot2 package, so you can access it simply with ggplot(mpg, ...). There is one modification from the defaults in this graphic: I renamed the legend for clarity.
library(ggplot2)                             # Provides ggplot() and the mpg dataset
mpg.plot <- ggplot(mpg)                      # Create base plot for the mpg dataset
mpg.plot +
  geom_bar(aes(class, fill = trans)) +       # Stacked bar chart: x-axis is class, fill is trans
  scale_fill_discrete(name = "transmission") # Rename legend title
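
The heights of the stacked segments are counts of rows for each class and trans combination. As a quick check (a sketch, not part of the required answer; class_by_trans and bar_heights are illustrative names), the same counts can be tabulated directly:

```r
library(ggplot2)  # for the mpg dataset

# Count vehicles for every class/transmission pair; geom_bar() computes
# these same counts when no y aesthetic is supplied.
class_by_trans <- table(mpg$class, mpg$trans)
print(class_by_trans)

# Row sums give the total bar height for each class.
bar_heights <- rowSums(class_by_trans)
print(bar_heights)
```

Comparing these numbers against the provided graphic is a simple way to confirm that the right variables are mapped to the x-axis and to fill.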

  1. This boxplot is also built using the mpg dataset. Notice the changes in the axis labels and the altered theme (theme_XXXX).
mpg.plot +                                          # Reuse the base mpg plot created above
  geom_boxplot(aes(manufacturer, hwy)) +            # One box of highway mileage per manufacturer
  theme_classic() +                                 # Use the classic theme
  coord_flip() +                                    # Flip coordinates so manufacturers run down the vertical axis
  labs(y = "Highway Fuel Efficiency (mile/gallon)", x = "Vehicle Manufacturer")

  1. This graphic is built with another dataset, diamonds, which is also built into the ggplot2 package. For this one I used an additional package, ggthemes (loaded with library(ggthemes)); check it out to reproduce this view.
library(ggthemes)                         # Provides theme_economist()
ggplot(diamonds) +                        # Create plot for diamonds dataset
  geom_density(aes(price, fill = cut, color = cut), # Density of diamond price, one curve per cut
               alpha = 0.3, size = 0.6) +
  labs(title = "Diamond Price Density",
       x = "Diamond Price (USD$)", y = "Density") +
  theme_economist()                       # Use a theme similar to The Economist

  1. For this plot we switch visualization idioms to a scatter plot. Additionally, I use the ggplot2 package to fit a linear model to the data, all within the plot itself. There are edited labels and theme modifications as well.
ggplot(iris,                                                   # Create plot for iris dataset
       aes(Sepal.Length, Petal.Length)) +
  geom_point() +                                               # Use scatter plot
  geom_smooth(method = lm) +                                   # Fit and draw a linear regression line
  theme_minimal() +                                            # Use the "minimal" theme
  theme(panel.grid.major = element_line(size = 1),             # Set width of major grid line
        panel.grid.minor = element_line(size = 0.7)) +         # Set width of minor grid line
  labs(title = "Relationship between Petal and Sepal Length",
       x = "Iris Sepal Length", 
       y = "Iris Petal Length") 
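
With method = lm and no formula given, geom_smooth() fits y ~ x to the plotted aesthetics. A minimal sketch of the equivalent fit outside the plot, using base R's lm() (the name fit is only illustrative):

```r
# Same model that geom_smooth(method = lm) fits for this plot:
# Petal.Length regressed on Sepal.Length across all rows of iris.
fit <- lm(Petal.Length ~ Sepal.Length, data = iris)

coef(fit)                 # intercept and slope of the regression line
summary(fit)$r.squared    # share of variance explained by the line
```

Inspecting the coefficients this way can help confirm that the line drawn in the plot matches the model you intended.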

  1. Finally, in this visualization I extend the last example by plotting the same data but using an additional channel (color) to communicate species-level differences. Again I fit a linear model to the data, but this time one for each species, and add further theme and labeling modifications.
ggplot(iris,                                                   # Create plot for iris dataset
       aes(Sepal.Length,
           Petal.Length,
           color = Species)) +                                 # Map Species to color; it also becomes the legend
  geom_point() +                                               # Use scatter plot
  geom_smooth(method = lm, se = FALSE) +                       # Draw one regression line per species, without confidence region
  theme_pander() +                                             # Use the pander theme from ggthemes
  theme(text = element_text(family = "serif"),                 # Use a serif font (e.g. Times New Roman)
        axis.ticks = element_line(color = "black",
                                  size = 0.7),
        legend.position = "bottom",                            # Move legend to the bottom of plot
        legend.title = element_text(face = "plain"),
        plot.title = element_text(size = 14,                   
                             face = "plain")) +
  labs(title = "Relationship between Petal and Sepal Length",
       subtitle = "Species level comparison",
       x = "Iris Sepal Length",     
       y = "Iris Petal Length")
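
Because color = Species is set in the top-level aes(), geom_smooth() fits a separate line for each species. A sketch of the same per-group fits in base R (species_fits and species_coefs are illustrative names):

```r
# Split iris by species and fit Petal.Length ~ Sepal.Length within each group,
# mirroring the grouping that the color aesthetic induces in geom_smooth().
species_fits <- lapply(split(iris, iris$Species), function(d) {
  lm(Petal.Length ~ Sepal.Length, data = d)
})

# One column per species: intercept and slope of that species' line.
species_coefs <- sapply(species_fits, coef)
print(species_coefs)
```

Each column corresponds to one of the colored lines in the plot, which makes it easy to compare slopes across species numerically.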