Objectives

The objective of this problem set is to gain experience working with the ggplot2 package for data visualization. To do this, I have provided a series of graphics, all created using the ggplot2 package. Your objective for this assignment is to write the code necessary to exactly recreate the provided graphics.

When completed, submit a link to your file on rpubs.com. Be sure to include echo = TRUE for each graphic so that I can see both the visualization and the code required to create it.
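
As a convenience, echo = TRUE can be set once for the whole document instead of on every chunk. The snippet below is a minimal sketch that assumes the file is an R Markdown document knitted with knitr; it would go in a setup chunk near the top of the file.

```r
# Assumption: this runs inside a setup chunk of an R Markdown file.
library(knitr)

# Make every later chunk show its code alongside its output.
opts_chunk$set(echo = TRUE)
```

An individual chunk can still override this default in its own header if needed.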

Vis 1

This graphic is a traditional stacked bar chart. It uses the mpg dataset, which is built into the ggplot2 package, so you can access it simply with ggplot(mpg, ….). There is one modification beyond the defaults in this graphic: I renamed the legend for clarity.

library(ggplot2)
head(mpg)
## # A tibble: 6 x 11
##   manufacturer model displ  year   cyl trans  drv     cty   hwy fl    class
##   <chr>        <chr> <dbl> <int> <int> <chr>  <chr> <int> <int> <chr> <chr>
## 1 audi         a4      1.8  1999     4 auto(~ f        18    29 p     comp~
## 2 audi         a4      1.8  1999     4 manua~ f        21    29 p     comp~
## 3 audi         a4      2    2008     4 manua~ f        20    31 p     comp~
## 4 audi         a4      2    2008     4 auto(~ f        21    30 p     comp~
## 5 audi         a4      2.8  1999     6 auto(~ f        16    26 p     comp~
## 6 audi         a4      2.8  1999     6 manua~ f        18    26 p     comp~
ggplot(mpg, aes(class)) +
  geom_bar(aes(fill = trans)) +
  labs(fill = "Transmission")
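
The legend title can be renamed in more than one way. The sketch below shows three options that should produce the same legend title for this chart; the object names (p_labs, p_scale, p_guide) are illustrative only and not part of the assignment.

```r
library(ggplot2)

# Shared base plot: stacked bars of vehicle class, filled by transmission.
base <- ggplot(mpg, aes(class)) +
  geom_bar(aes(fill = trans))

# Option 1: labs() sets the title of the fill legend.
p_labs <- base + labs(fill = "Transmission")

# Option 2: name the fill scale directly.
p_scale <- base + scale_fill_discrete(name = "Transmission")

# Option 3: set the title on the legend guide.
p_guide <- base + guides(fill = guide_legend(title = "Transmission"))
```

labs() is usually the shortest choice when only the title changes; the scale-based form is handy when the scale is being customized anyway.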

Vis 2

This boxplot is also built using the mpg dataset. Notice the changed axis labels and the altered theme_XXXX.

ggplot(mpg, aes(x = manufacturer, y = hwy)) +
  geom_boxplot() +
  coord_flip() +  # flip so manufacturers run down the vertical axis
  scale_x_discrete(name = "Vehicle Manufacturer") +
  scale_y_continuous(name = "Highway Fuel Efficiency(miles/gallon)") +
  theme_classic()
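
A horizontal boxplot can also be drawn without coord_flip(). The sketch below assumes ggplot2 3.3.0 or later, where geom_boxplot() picks up its orientation from the aesthetics; the names p_flip and p_direct are illustrative only.

```r
library(ggplot2)

# Approach used above: map manufacturer to x, then flip the coordinates.
p_flip <- ggplot(mpg, aes(x = manufacturer, y = hwy)) +
  geom_boxplot() +
  coord_flip()

# Alternative (assumes ggplot2 >= 3.3.0): map manufacturer to y directly.
p_direct <- ggplot(mpg, aes(x = hwy, y = manufacturer)) +
  geom_boxplot()
```

With the direct mapping, axis names refer to the axes as drawn, so the manufacturer label would be set with labs(y = ...) rather than through the x scale.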

Vis 3

This graphic is built with another dataset, diamonds, which is also built into the ggplot2 package. For this one I used an additional package, ggthemes (loaded with library(ggthemes)); check it out to reproduce this view.

library(ggthemes)
head(diamonds)
## # A tibble: 6 x 10
##   carat cut       color clarity depth table price     x     y     z
##   <dbl> <ord>     <ord> <ord>   <dbl> <dbl> <int> <dbl> <dbl> <dbl>
## 1 0.23  Ideal     E     SI2      61.5    55   326  3.95  3.98  2.43
## 2 0.21  Premium   E     SI1      59.8    61   326  3.89  3.84  2.31
## 3 0.23  Good      E     VS1      56.9    65   327  4.05  4.07  2.31
## 4 0.290 Premium   I     VS2      62.4    58   334  4.2   4.23  2.63
## 5 0.31  Good      J     SI2      63.3    58   335  4.34  4.35  2.75
## 6 0.24  Very Good J     VVS2     62.8    57   336  3.94  3.96  2.48
ggplot(diamonds, aes(x = price, color = cut, fill = cut)) +
  geom_density(aes(fill = factor(cut)), alpha = 0.3) +
  ggtitle("Diamond Price Density") +
  labs(x = "Diamond Price(USD)", y = "Density") +
  theme_economist() + scale_colour_economist() +
  # theme() comes after the complete theme so it is not overwritten
  theme(legend.position = "top")
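
One detail worth knowing when combining theme() tweaks with a complete theme such as theme_economist(): a complete theme replaces any theme settings added before it, so the order of the calls matters. The sketch below illustrates this with theme_classic() from ggplot2 itself, so that it does not depend on ggthemes; the variable names are illustrative only.

```r
library(ggplot2)

# A tweak added BEFORE a complete theme is replaced by that theme.
tweak_first <- theme(legend.position = "bottom") + theme_classic()

# A tweak added AFTER a complete theme is kept.
tweak_last <- theme_classic() + theme(legend.position = "bottom")

# Compare the legend position stored in each resulting theme.
pos_first <- tweak_first[["legend.position"]]
pos_last <- tweak_last[["legend.position"]]
```

The same ordering rule should apply to theme_economist(), since it is also a complete theme.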

Vis 4

For this plot we change vis idioms to a scatter plot framework. Additionally, I use the ggplot2 package to fit a linear model to the data, all within the plot framework. There are edited labels and theme modifications as well.

head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point() +
  geom_smooth(method = lm) +
  labs(title = "Relationship between Petal and Sepal Length",
       x = "Iris Sepal Length", y = "Iris Petal Length") +
  theme_light()
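
To see the line that geom_smooth(method = lm) draws as numbers, the same model can be fitted directly with lm(). This is a minimal sketch using only base R; the object names fit and fit_coef are illustrative.

```r
# Fit petal length as a linear function of sepal length,
# the same relationship geom_smooth(method = lm) plots.
fit <- lm(Petal.Length ~ Sepal.Length, data = iris)

# Intercept and slope of the fitted line.
fit_coef <- coef(fit)
print(fit_coef)
```

summary(fit) gives standard errors as well, which relate to the shaded band geom_smooth() draws around the line by default.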

Vis 5

Finally, in this vis I extend the last example by plotting the same data but using an additional channel to communicate species-level differences. Again I fit a linear model to the data, but this time one for each species, and add additional theme and labeling modifications.

ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) +
  geom_point() +
  geom_smooth(method = lm, se = FALSE) +
  labs(title = "Relationship between Petal and Sepal Length",
       subtitle = "Species Level Comparison",
       x = "Iris Sepal Length", y = "Iris Petal Length") +
  theme(legend.position = "bottom",
        panel.background = element_blank(),
        panel.grid = element_blank())
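
Mapping color to Species makes geom_smooth() fit one line per species. The sketch below reproduces those per-species fits with base R, as a way to check the slopes by hand; the name species_fits is illustrative only.

```r
# Split the data by species and fit a separate linear model to each group,
# mirroring the one-line-per-species behaviour in the plot.
species_fits <- lapply(
  split(iris, iris$Species),
  function(d) coef(lm(Petal.Length ~ Sepal.Length, data = d))
)

# One intercept/slope pair per species.
print(species_fits)
```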