Eric_Hirsch_607_DataScienceInContext

Eric Hirsch

2021-04-15

Principle 1. Style is substance

Elegance and style are not superfluous to good data design, let alone antithetical to it - they are central to readability and clarity.

Consider the elegance of this table:

Principle 2. Form is function

Placement, color, width - all express meaning. If it’s a side comment, put it on the side. For example, this would be a good place to make a side comment about the table to the left.

Airline  Status   Los.Angeles  Phoenix  San.Diego  San.Francisco  Seattle
ALASKA   on time          497      221        212            503     1841
NA       delayed           62       12         20            102      305
AMWEST   on time          694     4840        383            320      201
NA       delayed          117      415         65            129       61
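The percent-delay figures shown later in this document can be derived from the counts in the table above. The following is a minimal sketch of that arithmetic in plain Python rather than the R used for the figures. It assumes that each NA row belongs to the airline named on the row above it; the names CITIES, COUNTS and percent_delayed are illustrative and do not come from the original analysis.

```python
# Counts transcribed from the table above. Assumption: each "NA" row
# belongs to the airline named on the row above it.
CITIES = ["Los.Angeles", "Phoenix", "San.Diego", "San.Francisco", "Seattle"]

COUNTS = {
    "ALASKA": {"on time": [497, 221, 212, 503, 1841],
               "delayed": [62, 12, 20, 102, 305]},
    "AMWEST": {"on time": [694, 4840, 383, 320, 201],
               "delayed": [117, 415, 65, 129, 61]},
}


def percent_delayed(counts):
    """Return {airline: {city: percent of flights delayed}}."""
    result = {}
    for airline, status in counts.items():
        result[airline] = {
            city: 100 * late / (late + on_time)
            for city, on_time, late in zip(CITIES, status["on time"], status["delayed"])
        }
    return result


if __name__ == "__main__":
    # Print one line per airline and city, e.g. for plotting or checking by eye.
    for airline, by_city in percent_delayed(COUNTS).items():
        for city, pct in by_city.items():
            print(f"{airline:7s} {city:14s} {pct:5.1f}%")
```

Each percentage is delayed / (delayed + on time) for one airline in one city, so the two airlines can be compared city by city despite their very different flight volumes.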

Principle 3. Less is more

Why a box?

Consider this box plot. What does the box actually get you? What is the ratio of data expression to ink?

This boxplot is great for analysis …

… but this is cleaner, leaner and perhaps clearer for a presentation.

Or compare this example …

… to this:

Fig. 1: Percent delays by city and airline

Principle 4. Pictures are words

Consider these sparklines - they are graphical words.

In R we can put more data into a sparkline if we want to. Here we see IQR and min/max graphed onto a sparkline:

Tufte invented the sparkline - a small graph that shows the upward and downward trajectory of some variable in a concise graphical statement. For Tufte, the sparkline functions just as a word does: it is a small, discrete representation meant to express a single, though complex, idea.
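To make the "graphical word" idea concrete, here is a rough text-only analogue written in plain Python. It is a sketch under stated assumptions, not what the sparkline package renders: the function name sparkline and the choice of eight Unicode block characters are illustrative. Each value is scaled between the series minimum and maximum and drawn as a block of matching height.

```python
# Eight Unicode block characters, shortest to tallest.
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to a block whose height reflects its position between min and max."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series carries no trajectory; draw it at the baseline.
        return BLOCKS[0] * len(values)
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lo) / span * top)] for v in values)


if __name__ == "__main__":
    # Illustrative input: ALASKA's delayed-flight counts by city, taken from
    # the table earlier in this document (cities, not a time axis).
    delayed = [62, 12, 20, 102, 305]
    print(sparkline(delayed), f"min {min(delayed)}, max {max(delayed)}")
```

Printing the minimum and maximum next to the glyphs is a simple stand-in for the min/max annotation described above; the result sits inline in a sentence, which is the point of treating a sparkline as a word.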

Principle 5. Get off the grid

In the graph below, the lines are the same as in a line graph. But the y-axis has been removed; in its place, the names of the airlines appear on either side, with their traffic shown at various points. It is as if the sparkline words had been arranged in a paragraph. It is a very small change but a highly effective one:

Packages

Basic style, background color, font and sidebar: tufte

Boxplot: geom_boxplot() from ggplot2, with ggthemes

Minimal ink: theme_tufte() from ggthemes

Minimal barchart: ggplot2

Sparklines: sparkline

Slopegraph: slopegraph