These are a few examples of settings for making graphs prettier, with a discussion of what is going on in each example.
They use the built-in iris data set, and the code chunk for each example is completely self-contained, so there is some repeated code that you would not need if you were running through the lot in one go and still had the earlier objects in memory.
First, a fairly basic boxplot:
data(iris)
boxplot(iris$Sepal.Length ~ iris$Species)
Now we can fancy this up by making and using a few colours:
data(iris)
setoscol = "#FF000088"
versicol = "#88880088"
virgicol = "#0000FF88"
collectivecols = c(setoscol,versicol,virgicol)
boxplot(iris$Sepal.Length ~ iris$Species, col=collectivecols)
In this particular example I am storing individual colours, then combining them with c() into a vector in the order I want them used.
One way of defining colours is by putting the colour name, as described in:
http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf
However, the colours in this example are written as a hash mark followed by four two-digit hexadecimal numbers giving the amount of red, green, blue, and the opacity (how solid the colour is), so #FF000088 is maximum red, no green, no blue, and about 50% see-through. For a sense of what the different amounts of red, green, and blue will give (before fading out with opacity), check the RGB hexadecimal numbers shown in:
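To make the hexadecimal notation a little more concrete, here is a small sketch (not part of the original examples) using the base R functions rgb() and col2rgb(). The variable name parts is just something chosen for this illustration.

```r
# Build the same colour from its components on a 0-255 scale
# (136 is the decimal value of hexadecimal 88)
setoscol = rgb(255, 0, 0, 136, maxColorValue=255)
print(setoscol)

# Split a colour string back into its red, green, blue and alpha (opacity) values
parts = col2rgb("#FF000088", alpha=TRUE)
print(parts)

# Opacity as a fraction of the maximum, a little over one half
print(parts["alpha", 1] / 255)
```

Going in this direction can be handy if you prefer to think of colours as amounts out of 255 rather than as hexadecimal digits.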
We can add more settings to the plot command to control the axes in various ways, including removing them.
data(iris)
setoscol = "#FF000088"
versicol = "#88880088"
virgicol = "#0000FF88"
collectivecols = c(setoscol,versicol,virgicol)
boxplot(iris$Sepal.Length ~ iris$Species, col=collectivecols, ylab="Sepal Length", frame.plot=FALSE, xaxt="n", ylim = c(4,max(iris$Sepal.Length)))
text(1,4,"Setosa")
text(2,4,"Versicolor")
text(3,4,"Virginica")
This particular graph also takes advantage of the fact that with R’s base plotting system you can add extra stuff to a graph after making it, like drawing extra things onto a graph on paper. In this case, it makes a plot with no x axis at all, then uses text() commands to add extra text labels at chosen points.
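As a possible alternative to placing each label with text(), the base axis() function can add all the labels in one step after the plot has been drawn. This is only a sketch of the same idea, not a replacement for the example above; here the labels sit in the margin below the plot rather than inside it, so no ylim adjustment is made.

```r
data(iris)
# Draw the boxplot with no x axis and no surrounding frame
boxplot(iris$Sepal.Length ~ iris$Species, ylab="Sepal Length", xlab="", frame.plot=FALSE, xaxt="n")
# Add labels under positions 1, 2 and 3; tick=FALSE leaves out the axis line and tick marks
axis(side=1, at=1:3, labels=c("Setosa","Versicolor","Virginica"), tick=FALSE)
```

The text() approach gives freer control over where each label goes, while axis() keeps the labels lined up with the boxes automatically.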
Another way of showing the distribution of a variable between different groups would be to make a series of histograms for each subgroup, using the par() settings to make several graphs in one.
data(iris)
par(mfrow=c(3,1))
hist(iris$Sepal.Length[iris$Species == "setosa"])
hist(iris$Sepal.Length[iris$Species == "versicolor"])
hist(iris$Sepal.Length[iris$Species == "virginica"])
par(mfrow=c(1,1))
Note that the final par() command resets the layout so that later plots are drawn one graph at a time.
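A slightly more general way of resetting, sketched here as an option, relies on the fact that par() returns the previous values of whatever settings it changes. Storing that result (in a variable I have called oldpar for this illustration) means you can restore the earlier layout without needing to know what it was.

```r
data(iris)
# par() hands back the old values of the settings it changes
oldpar = par(mfrow=c(3,1))
for (sp in levels(iris$Species)) {
  hist(iris$Sepal.Length[iris$Species == sp], main=sp, xlab="Sepal Length")
}
# Put back whatever layout was in place before
par(oldpar)
```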
However, these graphs would be much nicer if we used the graph settings to give each subgraph the same x and y axis range, and made the break points for the histograms in the same places.
data(iris)
xmin = 4
xmax = 8.5
xdist = c(xmin,xmax)
xbreaks = seq(from=xmin, to=xmax, by=0.25)
setoscol = "#FF000088"
versicol = "#88880088"
virgicol = "#0000FF88"
par(mfrow=c(3,1))
hist(iris$Sepal.Length[iris$Species == "setosa"], xlim=xdist, xlab="", main="Setosa", col=setoscol, ylim=c(0,20), breaks=xbreaks)
hist(iris$Sepal.Length[iris$Species == "versicolor"], xlim=xdist, xlab="", main="Versicolor", col=versicol, ylim=c(0,20), breaks=xbreaks)
hist(iris$Sepal.Length[iris$Species == "virginica"], xlim=xdist, xlab="Sepal Length", main="Virginica", col=virgicol, ylim=c(0,20), breaks=xbreaks)
par(mfrow=c(1,1))
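The axis limits and break points above were chosen by hand. If you would rather work them out from the data, something like the following sketch could be used. The names binwidth, groups, counts, and ymax are my own for this illustration, and rounding the range outwards to whole numbers is just one reasonable choice.

```r
data(iris)
binwidth = 0.25
# Round the range of the data outwards to whole numbers
xmin = floor(min(iris$Sepal.Length))
xmax = ceiling(max(iris$Sepal.Length))
xbreaks = seq(from=xmin, to=xmax, by=binwidth)
# Count the values in each bin for each species without drawing anything
groups = split(iris$Sepal.Length, iris$Species)
counts = sapply(groups, function(v) hist(v, breaks=xbreaks, plot=FALSE)$counts)
# The tallest bar across all three species sets the shared y axis limit
ymax = max(counts)
oldpar = par(mfrow=c(3,1))
for (sp in names(groups)) {
  hist(groups[[sp]], xlim=c(xmin,xmax), ylim=c(0,ymax), breaks=xbreaks, main=sp, xlab="Sepal Length")
}
par(oldpar)
```

Because the same breaks and limits are passed to every hist() call, the three panels stay directly comparable even if the data change.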
If I were only comparing two things, rather than making three histograms I could take advantage of see-through colours and make one overlapping histogram:
data(iris)
xmin = 4
xmax = 8.5
xdist = c(xmin,xmax)
xbreaks = seq(from=xmin, to=xmax, by=0.25)
setoscol = "#FF000088"
virgicol = "#0000FF88"
bothcol = "#9900AABB"
hist(iris$Sepal.Length[iris$Species == "setosa"], xlim=xdist, xlab="Sepal Length", col=setoscol, ylim=c(0,20), breaks=xbreaks, main="")
hist(iris$Sepal.Length[iris$Species == "virginica"], xlim=xdist, col=virgicol, breaks=xbreaks, add=TRUE)
legend("topright", legend=c("Setosa","Virginica", "Both"), fill=c(setoscol,virgicol, bothcol), inset= c(0.1,0.2),box.lwd = 0,box.col = "white")
In this case, I annotated the graph with an extra legend (adding in a combined zone colour on the legend) as an additional instruction after making the initial graph.
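If you would rather not write the see-through colours out in hexadecimal, one option is the base function adjustcolor(), which takes a named colour and scales its opacity. This is a sketch of a variation on the overlapping histogram, assuming half opacity is wanted; it leaves out the hand-picked "Both" colour and uses bty="n" to draw the legend without a box.

```r
data(iris)
# Half-transparent versions of named colours (alpha.f scales the opacity)
setoscol = adjustcolor("red", alpha.f=0.5)
virgicol = adjustcolor("blue", alpha.f=0.5)
xbreaks = seq(from=4, to=8.5, by=0.25)
hist(iris$Sepal.Length[iris$Species == "setosa"], xlim=c(4,8.5), ylim=c(0,20), breaks=xbreaks, col=setoscol, xlab="Sepal Length", main="")
hist(iris$Sepal.Length[iris$Species == "virginica"], breaks=xbreaks, col=virgicol, add=TRUE)
legend("topright", legend=c("Setosa","Virginica"), fill=c(setoscol,virgicol), bty="n")
```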
Note: This document may be updated as I receive feedback, so should not be regarded as permanent unchanging content.