Not so dumb things to do when presenting data

Rowan Trebilco
April 24th 2015

Making good data graphics

“While graphics technology is moving along at a rapid pace, the human visual system has remained the same.”

– Cleveland (1994) The elements of graphing data

examples of new advanced visualisations:

What I'll cover today

  • Tufte's rules and general principles

  • hints for customising plots in R base graphics

    • par settings
    • approaches for multipanel plotting

Resources:

  • “The Visual Display of Quantitative Information” by Edward Tufte (1st edition 1983)

  • “The Elements of Graphing Data” by William S. Cleveland (1st edition 1985)

  • “R Graphics 2nd edn.” by Paul Murrell (2011)

  • Books by Stephen Few and www.perceptualedge.com

  • Spoon said not to forget to mention: “Displaying Time Series, Spatial, and Space-Time Data with R” (Perpiñán Lamigueiro, 2014)

  • some great examples of data graphics: http://bost.ocks.org/mike/

Tufte's rules

Graphical excellence

Gives the viewer the greatest number of ideas

in the shortest time

with the least ink

in the smallest space

Tufte's rules: maximise the data/ink ratio

“Above all else, show the data”

The Visual Display of Quantitative Information

Tufte's rules: maximise the data/ink ratio

Bad

plot of chunk unnamed-chunk-1

Better

plot of chunk unnamed-chunk-2

Tufte's rules: maximise the data/ink ratio

Even better

plot of chunk unnamed-chunk-3

Tufte's rules: maximise the data/ink ratio

This

plot of chunk unnamed-chunk-4

vs. this

plot of chunk unnamed-chunk-5
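
The original plots are not reproduced here. As a rough sketch of the idea (the data and every styling choice below are assumptions made for illustration, not the slides' own example), the same points can be drawn once with heavy decoration and once with most of the non-data ink removed:

```r
# Made-up data for illustration
set.seed(1)
x <- 1:20
y <- x + rnorm(20, sd = 3)

out_file <- tempfile(fileext = ".pdf")   # hypothetical output path
pdf(out_file, width = 8, height = 4)
op <- par(mfrow = c(1, 2))

# Heavier version: big filled symbols, solid black grid, thick box
plot(x, y, pch = 22, bg = "grey", cex = 2, main = "More ink")
grid(col = "black", lty = 1)
box(lwd = 3)

# Lighter version: plain points, no box, horizontal tick labels
plot(x, y, pch = 16, bty = "n", las = 1, main = "Less ink")

par(op)
dev.off()
```

Opening the PDF and comparing the two panels side by side shows how little of the left-hand ink actually carries data.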

Tufte's rules: remove chartjunk

Show data variation, not design variation

Su 2008 Computational Statistics and Data Analysis 52, 4594-4601

Tufte's rules: remove chartjunk

Chartjunk includes most:

  • 3d plots (use colour, size, or contours to show additional dimensions)
  • pie charts

… but see Bateman et al. 2010 “Useful Junk? The Effects of Visual Embellishment on Comprehension and Memorability of Charts”

=> chartjunk can be useful for info-graphics?
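
On the pie chart point: one common alternative is a dot chart, which encodes values as positions along a shared scale rather than as angles and areas. A minimal sketch with made-up values (the data and labels are assumptions for illustration only):

```r
# Made-up shares, for illustration only
shares <- c(A = 30, B = 25, C = 20, D = 15, E = 10)

out_file <- tempfile(fileext = ".pdf")   # hypothetical output path
pdf(out_file, width = 8, height = 4)
op <- par(mfrow = c(1, 2))

# Values encoded as angles and areas
pie(shares, main = "Pie chart")

# Same values encoded as positions on a common axis
dotchart(sort(shares), pch = 16, xlim = c(0, max(shares)),
         xlab = "Share (%)", main = "Dot chart")

par(op)
dev.off()
```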

Tufte's rules: remove chartjunk and maximise data-ink

Maps are a point of reference for both maximising data-ink and minimising chartjunk

Be aware of the biases and limitations of human perceptions


We're good at perceiving:

  • 45 and 90 degree angles
  • lengths
  • relative colour and changes in colour intensity


We're bad at perceiving:

  • angles other than 45 and 90 degrees
  • areas
  • absolute colour and changes in colour hue

General principles for good graphics

choose a sensible colour scheme

library(RColorBrewer)
display.brewer.all()

plot of chunk unnamed-chunk-6



display.brewer.pal(5, "YlOrRd")

plot of chunk unnamed-chunk-7
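
As a sketch of how a palette picked this way might be applied to ordered data (the values are made up, and the fallback to base R's hcl.colors() when RColorBrewer is not installed is an addition for this example only):

```r
# Sequential palette with five steps, light to dark
if (requireNamespace("RColorBrewer", quietly = TRUE)) {
  pal <- RColorBrewer::brewer.pal(5, "YlOrRd")
} else {
  # Base R (>= 3.6) fallback; reversed so it also runs light to dark
  pal <- hcl.colors(5, "YlOrRd", rev = TRUE)
}

values <- c(3, 8, 12, 20, 31)            # made-up, increasing values

out_file <- tempfile(fileext = ".pdf")   # hypothetical output path
pdf(out_file, width = 6, height = 4)
barplot(values, col = pal, border = NA,
        names.arg = paste("Class", 1:5), las = 1)
dev.off()
```

A sequential scheme suits ordered values like these; for unordered categories a qualitative palette would probably be the better choice.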

General principles for good graphics

choose a sensible colour scheme

library(gplots)
m <- abs(matrix(1:120+rnorm(120), nrow=15, ncol=8))
barplot(m, col=rich.colors(15), main="\nrich.colors")

plot of chunk unnamed-chunk-8

good resource for rules on choosing colour schemes

General principles for good graphics

Carefully consider aspect ratio

General principles for good graphics

Line things up to facilitate comparisons

Making publication-quality graphics is time-consuming

don't be surprised if you spend lots of time customising and revising

try to make your processes as repeatable as possible

Hints for plotting in R base graphics

General hints

  • “build” graphs by using type = "n" and axes = FALSE, then sequentially adding elements

  • call a device driver from a script rather than saving from a window

    • use PDF for publication, and PNG for web graphics (and Word)
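
A minimal sketch of both hints together, assuming made-up data and a temporary output file (swap in a real path in practice): the plot starts as an empty canvas, elements are added one at a time, and the whole thing is written to a PDF device opened from the script.

```r
# Made-up data for illustration
set.seed(42)
x <- seq(0, 10, length.out = 30)
y <- 2 + 0.5 * x + rnorm(30, sd = 0.5)

out_file <- tempfile(fileext = ".pdf")   # hypothetical output path
pdf(out_file, width = 5, height = 4)     # open the device from the script

# Empty canvas: sets up the coordinate system but draws nothing
plot(x, y, type = "n", axes = FALSE, xlab = "x", ylab = "y")

axis(1, tcl = -0.3)                      # x axis with short ticks
axis(2, las = 1, tcl = -0.3)             # y axis with horizontal labels
points(x, y, pch = 16, col = "grey40")   # the data
abline(lm(y ~ x), lwd = 2)               # fitted line drawn on top

dev.off()                                # closing the device writes the file
```

Because every element is a separate call, each one can be restyled or dropped without touching the rest.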

Hints for plotting in R base graphics

par settings

a good habit to get into:

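pdf("filename.pdf")     # open a device; png() etc. work the same way
  op <- par(las = 1)    # put your own settings here; par() returns the old values
  # ... my plot
  par(op)               # restore the previous settings
dev.off()               # close the device to write the file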

par

useful par settings and what they control

  • mfrow : lets you split into sub figures (more on this to follow)
  • mar : width of the inner (figure) margins, in lines of text (b,l,t,r)
  • oma : width of the outer margins, in lines of text (b,l,t,r)

from Murrell 2011

par

useful par settings and what they control

  • las : text direction for axis labels
  • tcl : length of axis ticks
  • 'xpd = NA' : lets you plot outside the figure region
  • fg, col.axis, col.lab: colour of the non-data bits of the plot
  • family: font family (for journals that care)

  • par("usr"): returns the limits of the plotting area in user coordinates, as c(x1, x2, y1, y2)

    • e.g. par("usr")[2]

will return the RHS limit of the x axis (and par("usr")[4] the upper limit of the y axis)
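
One place par("usr") comes in handy is positioning annotations relative to the plot corners rather than to the data. A small sketch, assuming an arbitrary example plot and a panel label "(a)" chosen purely for illustration:

```r
out_file <- tempfile(fileext = ".pdf")   # hypothetical output path
pdf(out_file, width = 5, height = 4)
op <- par(mar = c(4, 4, 1, 1), las = 1, tcl = -0.3)

plot(1:10, (1:10)^2, xlab = "x", ylab = "y")

# Limits of the plotting region: c(x1, x2, y1, y2) in user coordinates
usr <- par("usr")

# Put a panel label just inside the top-left corner of the plot region;
# adj nudges the text right of and below the anchor point
text(usr[1], usr[4], "(a)", adj = c(-0.5, 1.5))

par(op)
dev.off()
```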

Hints for plotting in R base graphics: multipanel plotting

par(mfrow) (and mfcol)

layout()

Multipanel plotting: par(mfrow)

par(mfrow=c(2,2))
for(i in 1:4) plot(i, pch=i)

plot of chunk unnamed-chunk-9

Multipanel plotting: par(mfrow)

use par to shrink inner margins, for common axes

par(mfrow=c(2,2), mar = c(0,0,0,0), oma=c(2,2,2,2))

plot of chunk unnamed-chunk-10
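
The plotting code for this slide is not shown, so the following is only a sketch of one way to get panels with common axes. The data are made up, and the outer margins are widened to c(4,4,2,2) here (an assumption) to leave room for the shared axis labels:

```r
out_file <- tempfile(fileext = ".pdf")   # hypothetical output path
pdf(out_file, width = 6, height = 6)
op <- par(mfrow = c(2, 2), mar = c(0, 0, 0, 0), oma = c(4, 4, 2, 2), las = 1)

set.seed(1)
for (i in 1:4) {
  # Identical limits in every panel, so the shared axes apply to all of them
  plot(rnorm(20), rnorm(20), xlim = c(-3, 3), ylim = c(-3, 3),
       axes = FALSE, xlab = "", ylab = "")
  box()
  if (i %in% c(3, 4)) axis(1)   # x axis on the bottom row only
  if (i %in% c(1, 3)) axis(2)   # y axis on the left column only
}

# Common axis labels are written into the outer margins
mtext("x", side = 1, outer = TRUE, line = 2.5)
mtext("y", side = 2, outer = TRUE, line = 2.5, las = 0)

par(op)
dev.off()
```

With mfrow the panels fill row by row, which is why panels 3 and 4 form the bottom row and panels 1 and 3 the left column.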

Multipanel plotting: layout

m <- matrix(c(1,1,1,1,2,2,3,4), nrow=2, ncol=4)
layout(m)
layout.show(4)  # m defines four panels

plot of chunk unnamed-chunk-11
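
To see the same layout filled with actual plots, here is a sketch using made-up data; the choice of plot for each panel is arbitrary. Panels are filled in the order of the numbers in the layout matrix:

```r
# Same layout matrix as on the slide: filled column by column
m <- matrix(c(1, 1, 1, 1, 2, 2, 3, 4), nrow = 2, ncol = 4)

out_file <- tempfile(fileext = ".pdf")   # hypothetical output path
pdf(out_file, width = 8, height = 4)
n_panels <- layout(m)                    # layout() returns the number of figures
op <- par(mar = c(3, 3, 2, 1))

set.seed(1)
z <- rnorm(200)                          # made-up data
plot(z, type = "l", main = "1")          # columns 1-2, both rows
hist(z, main = "2")                      # column 3, both rows
boxplot(z, main = "3")                   # column 4, top row
plot(density(z), main = "4")             # column 4, bottom row

par(op)
dev.off()
```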

Par and multipanel example

Finally, a trick from Spoon

You can draw lines between panels using

grconvertX() and grconvertY()

Let me know if you want the code!
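
In the meantime, here is a minimal sketch of one way to do it. This is not Spoon's code, only an illustration of the general approach, with made-up plots: record a point's position in device-wide ("ndc") coordinates while the first panel is active, then convert it into the second panel's user coordinates and draw with clipping turned off.

```r
out_file <- tempfile(fileext = ".pdf")   # hypothetical output path
pdf(out_file, width = 8, height = 4)
op <- par(mfrow = c(1, 2), mar = c(4, 4, 1, 1))

# Panel 1: store the position of the point (5, 5) in normalised
# device coordinates, which are common to the whole device
plot(1:10, 1:10)
x_ndc <- grconvertX(5, from = "user", to = "ndc")
y_ndc <- grconvertY(5, from = "user", to = "ndc")

# Panel 2: express that stored position in this panel's user coordinates
plot(1:10, 10:1)
x_from <- grconvertX(x_ndc, from = "ndc", to = "user")
y_from <- grconvertY(y_ndc, from = "ndc", to = "user")

# xpd = NA turns off clipping, so the segment can cross between panels;
# it ends at the point (5, 6) in panel 2
segments(x_from, y_from, 5, 6, xpd = NA, col = "red")

par(op)
dev.off()
```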