Note: These are examples from Wickham (2016, 2nd ed).
Chapter 2
Set up
library(ggplot2)
To view the data set, type mpg. To browse it more comfortably, use View(mpg) (note the capital V).
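Besides printing mpg or opening it in the viewer, a few base R functions give a quick overview of the data. This is a minimal sketch and not part of the book's examples; it only assumes that ggplot2 is installed.

```r
library(ggplot2)

# mpg ships with ggplot2, so no separate download is needed
dim(mpg)    # number of rows and columns
head(mpg)   # first six rows
str(mpg)    # column names and types
```

The columns used in the rest of these notes are displ, hwy, cty, class and drv.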
Key components
Let’s look at the components for creating a chart with ggplot2, using a scatterplot as the example.
ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

- mpg is a data set that comes with ggplot2
- Aesthetic mappings: engine displacement (displ) is mapped to the x axis, highway mileage (hwy) to the y axis
- A layer with points is added
The pattern shown here is fundamental for ggplot2:
- data and aesthetic mappings are provided in ggplot(), then
- layers are added with +.
A shorter version of the above, relying on the first two arguments of aes() being x and y, is
ggplot(mpg, aes(displ, hwy)) +
  geom_point()
This produces exactly the same output as the longer version above.
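The data-then-layers pattern also means that a plot is an ordinary R object that can be stored and extended step by step. The following is a minimal sketch; the names base and scatter are chosen here for illustration only.

```r
library(ggplot2)

# Data and aesthetic mappings only: no layer yet,
# so printing this object would show empty axes
base <- ggplot(mpg, aes(displ, hwy))

# Adding a layer with + returns a new plot object; base itself is unchanged
scatter <- base + geom_point()

# Printing the object renders the plot
print(scatter)
```

Storing the base plot this way makes it easy to try several layers on the same data and mappings.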
Colour, Size, Shape and other aesthetic attributes
Colour can help explain outliers, i.e. deviations from the general trend that mileage decreases with engine size. Here the vehicle class is mapped to colour, with city mileage (cty) on the y axis:
ggplot(mpg, aes(displ, cty, colour = class)) +
  geom_point()
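Shape and size are used in the same way as colour. The sketch below is not taken from the notes above; the choice of drv for shape and cyl for size is an assumption made for illustration. It also shows the difference between mapping a variable inside aes() and setting a fixed value outside it.

```r
library(ggplot2)

# Map drive train (categorical) to point shape
p_shape <- ggplot(mpg, aes(displ, cty, shape = drv)) +
  geom_point()

# Map number of cylinders (numeric) to point size
p_size <- ggplot(mpg, aes(displ, cty, size = cyl)) +
  geom_point()

# Setting, not mapping: a fixed colour goes outside aes()
p_fixed <- ggplot(mpg, aes(displ, cty)) +
  geom_point(colour = "blue")
```

Print any of the three objects, for example p_shape, to draw it.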

Facetting
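Facetting is an alternative to using aesthetics other than x and y for showing an additional categorical variable: the data are split into subsets and the same plot is drawn for each subset.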
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~class)
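The layout of the panels can be adjusted. As a small sketch, the ncol argument of facet_wrap() fixes the number of columns; the value 4 is an arbitrary choice for illustration.

```r
library(ggplot2)

# Same facetted scatterplot, with the panels arranged in four columns
p_facets <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~class, ncol = 4)

print(p_facets)
```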

Plot geoms
Plotting points is but one way to visualise data. Many others exist, such as line graphs and box plots. An important class of geoms is smoothers (note the relation to EDA terminology: smooths and roughs!).
Adding a smoother to a plot
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth()

For a small data set such as mpg, the default smoother is loess, which is non-linear. To fit a linear function instead, do this:
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

This also illustrates how to suppress the display of the confidence interval, using se = FALSE.
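For the loess smoother, the span argument controls how wiggly the fitted curve is. The following sketch compares a small and a large span; the values 0.2 and 1 are chosen for illustration only.

```r
library(ggplot2)

# Small span: a wiggly curve that follows local structure
p_wiggly <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x, span = 0.2)

# Large span: a much smoother curve
p_smooth <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x, span = 1)
```

Printing p_wiggly and p_smooth one after the other makes the effect of span visible.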
Boxplots and jittering
When combining a continuous variable, such as highway mileage, with a categorical one, such as drive train (drv), we are usually interested in how the continuous variable varies across the levels of the categorical variable:
ggplot(mpg, aes(drv, hwy)) +
  geom_point()

This may lead to overplotting, as shown above: many more points are plotted than are visible, because points with the same values sit on top of each other. Three geoms help to avoid that, namely jittering, boxplots and violin plots:
ggplot(mpg, aes(drv, hwy)) + geom_jitter()

ggplot(mpg, aes(drv, hwy)) + geom_boxplot()

ggplot(mpg, aes(drv, hwy)) + geom_violin()
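These geoms can be tuned and combined. The sketch below is an illustration rather than one of the book's examples: it restricts the jitter to the horizontal direction so that the hwy values stay exact, and overlays the jittered points on a boxplot. The width and alpha values are arbitrary choices.

```r
library(ggplot2)

# Jitter only horizontally; height = 0 leaves the hwy values unchanged
p_jitter <- ggplot(mpg, aes(drv, hwy)) +
  geom_jitter(width = 0.25, height = 0)

# Boxplot summary with the raw points overlaid, semi-transparent.
# outlier.shape = NA hides the boxplot's own outlier points,
# which would otherwise be drawn twice.
p_both <- ggplot(mpg, aes(drv, hwy)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.25, height = 0, alpha = 0.4)
```

The combined plot shows both the summary statistics and the distribution of the individual observations.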
