Albert Y. Kim
Monday 2015/01/26
We restrict ourselves to data stored in a rectangular table called a data frame, which we say has the “tidy” property: each row is an observation and each column is a variable.
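A minimal sketch of such a data frame, where each row is an observation and each column is a variable. The name `weather` and all values are made up for illustration.

```r
# Hypothetical toy data: one row per observation (a city on a given day),
# one column per variable.
weather <- data.frame(
  city = c("Portland", "Portland", "Boston", "Boston"),
  day  = c(1, 2, 1, 2),
  temp = c(8.5, 9.1, -2.0, 0.4)   # made-up temperatures
)

str(weather)    # column names and types
nrow(weather)   # number of observations
ncol(weather)   # number of variables
```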
Wilkinson (2005) boils it down:
In brief, the grammar tells us that a statistical graphic is a mapping from data to aesthetic attributes (color, shape, size) of geometric objects (points, lines, bars).
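As a rough sketch of this idea in ggplot2, the code below maps three variables of the built-in `mtcars` data to the x position, y position, and color of points. The choice of data set and variables is ours, purely for illustration.

```r
library(ggplot2)

# Data -> aesthetics -> geometric object:
#   wt  (weight)            -> x position
#   mpg (miles per gallon)  -> y position
#   cyl (cylinders)         -> color
p <- ggplot(data = mtcars,
            mapping = aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point()   # the geometric object: points

p
```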
Famous graphical illustration by Minard of Napoleon's march to and retreat from Moscow in 1812
| Data (Variable) | Aesthetic | Geometric Object |
|---|---|---|
| longitude | x position | points |
| latitude | y position | points |
| army size | size = width | bars |
| army direction | color = brown or black | bars |
| date | | text |
| temperature | | lines |
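The army rows of the table can be sketched in ggplot2 roughly as follows. The data frame `troops` and all its values are made up for illustration and are not Minard's figures. ggplot2's `geom_path()` stands in here for the bars of the original graphic, and the `linewidth` aesthetic assumes ggplot2 3.4.0 or later (older versions use `size` instead).

```r
library(ggplot2)

# Made-up toy values for illustration only; these are NOT Minard's figures.
troops <- data.frame(
  long      = c(24, 28, 32, 37, 37, 32, 28, 24),
  lat       = c(55.0, 54.5, 55.0, 55.8, 55.6, 54.6, 54.2, 54.4),
  survivors = c(400, 300, 200, 100, 100, 50, 25, 10),  # thousands of men
  direction = rep(c("advance", "retreat"), each = 4)
)

minard <- ggplot(troops, aes(x = long, y = lat)) +     # longitude, latitude
  geom_path(aes(linewidth = survivors,                 # army size -> width
                color = direction,                     # army direction -> color
                group = direction),
            lineend = "round") +
  scale_color_manual(values = c(advance = "brown", retreat = "black"))

minard
```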
The plot may also contain statistical transformations of the data. Example: a histogram transforms numeric values into counts of how many fall into each bin.
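To see the transformation a histogram performs, one can inspect the binned data that ggplot2 computes before drawing. This is a sketch using the built-in `mtcars` data; the bin width of 5 and boundary of 10 are arbitrary choices for illustration.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 5, boundary = 10)

# layer_data() returns the data after the statistical transformation:
# one row per bin, with the bin edges and the count of values in it.
binned <- layer_data(p)
binned[, c("xmin", "xmax", "count")]

# Every observation lands in exactly one bin.
sum(binned$count) == nrow(mtcars)
```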
- aes: mappings of data to aesthetics we can perceive on a graphic: x/y position, color, size, and shape. Each aesthetic can be mapped to a variable in our data set. If not assigned, they are set to defaults.
- geom: geometric objects, i.e. the type of plot: points, lines, bars, etc.
- stat: statistical transformations to summarise data: smoothing, binning values into a histogram, or just itself (“identity”)
- facet: how to break up data into subsets and display broken down plots
- scales both
- coord: coordinate system for x/y values: typically cartesian.
- position: adjustments

Open ggplot.R in RStudio and do examples.
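As one possible illustration, the sketch below touches each of these components in a single plot, using the `mpg` data set that ships with ggplot2. The particular variables, axis label, and component choices are assumptions made for the example, not the contents of ggplot.R.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = class, fill = drv)) +         # aes: x position and fill
  geom_bar(stat = "count", position = "dodge") +       # geom, stat, position
  facet_wrap(~ year) +                                 # facet: one panel per year
  scale_y_continuous(name = "number of cars") +        # scales: y axis label
  coord_flip()                                         # coord: swap x and y

p
```

Dropping any of the last three lines falls back to the defaults: a single panel, default scales, and cartesian coordinates.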
?geom_line()

The ggplot2 book is on Moodle. To learn more, I suggest reading
The code for all examples in the book: http://ggplot2.org/book/