A while back, while completing the Coursera Data Science specialization, I wanted to plot a histogram overlaid with both a density estimate of the data and a normal density curve.
As an example, this is what I was aiming for:
library(ggplot2)
set.seed(1234)
dat <- data.frame(cond = factor(rep(c("A","B"), each=200)), rating = c(rnorm(200),rnorm(200, mean=.8)))
plot <- ggplot(dat, aes(x = rating))
plot <- plot + geom_histogram(aes(y = after_stat(density)), color = "black", fill = "steelblue", binwidth = 0.5, alpha = 0.2)
plot <- plot + geom_density()
plot <- plot + stat_function(fun = dnorm, colour = "red", args = list(mean = 0.3, sd = 1))
plot
The problem was that I wanted a legend explaining what the red and black curves were. This sounds like a simple task, but I couldn't figure out how to do it in ggplot.
In the end I asked the question on Stack Overflow, and a user called mpalanco provided a nice solution.
The key was to set the legend labels inside aes(), using aes(color = "xxx") for both curves, and then define the legend with scale_colour_manual("Density", values = c("red", "black")). Because the labels are mapped as an aesthetic rather than set as fixed colours, ggplot generates a legend for them. The colours in values are assigned to the labels in alphabetical order, so "Normal" is drawn in red and "Simulated" in black.
library(ggplot2)
set.seed(1234)
dat <- data.frame(cond = factor(rep(c("A","B"), each=200)), rating = c(rnorm(200),rnorm(200, mean=.8)))
plot <- ggplot(dat, aes(x = rating))
plot <- plot + geom_histogram(aes(y = after_stat(density)), color = "black", fill = "steelblue", binwidth = 0.5, alpha = 0.2)
plot <- plot + geom_density(aes(color = "Simulated"))
plot <- plot + stat_function(aes(color = "Normal"), fun = dnorm, args = list(mean = 0.3, sd = 1))
plot <- plot + scale_colour_manual("Density", values = c("red", "black"))
plot
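Because the values vector in the call above is unnamed, ggplot2 matches the colours to the labels in alphabetical order ("Normal", then "Simulated"). A small variation, offered here as a sketch and not as part of the original Stack Overflow answer, names each colour so the mapping no longer depends on that ordering. The names p and density_cols are my own, and after_stat(density) assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(1234)
dat <- data.frame(
  cond   = factor(rep(c("A", "B"), each = 200)),
  rating = c(rnorm(200), rnorm(200, mean = 0.8))
)

# Naming the colours ties each one to its legend label explicitly,
# instead of relying on the alphabetical order of the labels.
density_cols <- c("Normal" = "red", "Simulated" = "black")

p <- ggplot(dat, aes(x = rating)) +
  geom_histogram(aes(y = after_stat(density)), color = "black",
                 fill = "steelblue", binwidth = 0.5, alpha = 0.2) +
  geom_density(aes(color = "Simulated")) +
  stat_function(aes(color = "Normal"), fun = dnorm,
                args = list(mean = 0.3, sd = 1)) +
  scale_colour_manual("Density", values = density_cols)

p
```

With named values, renaming a label (say, "Simulated" to "Observed") without updating density_cols leaves that curve without a matching colour, which makes the mismatch easy to spot.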
Sometimes ggplot is not that intuitive, but once you get the hang of it, it is very powerful and flexible.