In this assignment, we’ll use ggplot2 to reverse-engineer a plot constructed using another package.

We’ll examine diamonds, a dataset containing the prices and other attributes of almost 54,000 diamonds. Run ?diamonds to learn about the data and each variable it contains.

library(lattice)
library(ggplot2)
data(diamonds)

Run the following code, which uses functions from the lattice package.

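# Scatterplot of price against carat, one panel per cut
xyplot(price ~ carat | cut,
       data = diamonds,
       panel = function(x, y, ...) {
         panel.xyplot(x, y, ...)                 # draw the points
         lm1 <- lm(y ~ x)                        # fit a regression within this panel
         panel.abline(a = lm1$coefficients[1],   # intercept
                      b = lm1$coefficients[2])   # slope
       },
       as.table = TRUE)                          # fill panels left to right, top to bottom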

This plots the number of carats on the x axis and the price on the y axis. Each panel shows the diamonds of one quality level (the cut variable). Within each panel, a linear regression of price on carat is fit to that panel's data and drawn as a straight line.

Your task is to reproduce this plot in ggplot2.

Initialize a ggplot object, mapping carat to the x axis and price to the y axis. You should get an empty plot: the axes are drawn and labeled, but no data appear yet.
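As a minimal sketch of this first step (the object name p is our own choice, not part of the assignment), the call below only sets up the data and the aesthetic mappings. Because no geom has been added, printing it should draw axes and an empty panel.

```r
library(ggplot2)

# Map carat to x and price to y. No layers are added here,
# so printing p shows only the axes and an empty panel.
# `p` is a hypothetical name chosen for illustration.
p <- ggplot(data = diamonds, mapping = aes(x = carat, y = price))
p
```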

Next, plot the points. (Hint: “geom_point”)
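One way this step might look is sketched below. The name p_points is hypothetical; the layer inherits the x and y mappings set in ggplot().

```r
library(ggplot2)

# Add a point layer on top of the initialized plot.
# geom_point() reuses the carat/price mappings from ggplot().
# `p_points` is a hypothetical name chosen for illustration.
p_points <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point()
p_points
```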

Next, we need to add a linear regression line. (Hint: The “geom_smooth” function fits various lines to data. Run ?geom_smooth to check the help documentation and make sure you fit a linear regression line.)
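A possible version of this step is sketched below, assuming we want a plain line with no confidence band, as in the lattice plot. Setting method = "lm" requests a linear model fit, and se = FALSE is our assumption for hiding the shaded interval that geom_smooth draws by default. The name p_line is hypothetical.

```r
library(ggplot2)

# Points plus a fitted straight line.
# method = "lm" fits a linear regression; formula = y ~ x states the model explicitly.
# se = FALSE (our assumption) drops the confidence band to mimic the lattice plot.
# `p_line` is a hypothetical name chosen for illustration.
p_line <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
p_line
```

At this stage there are no panels, so a single line is fit to all of the data.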

Finally, separate the plot into panels according to diamond quality (indicated by the “cut” variable). (Hint: You need “facet_grid”, and you will have to pass it a formula describing the panel layout. Look at the help docs and play around.)
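Putting the steps together, one possible solution is sketched below. The formula . ~ cut is our choice of layout: it places one panel per cut level in a single row. The name p_final and the se = FALSE setting are assumptions for illustration rather than requirements of the assignment.

```r
library(ggplot2)

# Full reconstruction: points, a per-panel regression line, and one panel per cut.
# In the formula, the left side gives rows and the right side gives columns;
# "." means "no faceting in this direction".
# `p_final` is a hypothetical name chosen for illustration.
p_final <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_grid(. ~ cut)
p_final
```

Because the smoothing is computed separately within each panel, each panel gets its own regression line, as in the lattice version. If you would rather have the panels wrap onto several rows, closer to the lattice layout, facet_wrap(~ cut) is an alternative worth trying.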