1. What’s gone wrong with this code? Why are the points not blue?
suppressPackageStartupMessages(library(tidyverse))
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, colour = "blue"))

The argument colour = "blue" is included within the mapping argument, and as such, it is treated as an aesthetic, which is a mapping between a variable and a value. In the expression colour = "blue", "blue" is interpreted as a categorical variable which only takes a single value, "blue". If this is confusing, consider how colour = 1:234 and colour = 1 are interpreted by aes().
The following code produces the expected result.
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy), colour = "blue")
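
One way to check the difference between mapping and setting an aesthetic is to inspect the data that ggplot2 computes for the layer. The sketch below assumes only that ggplot2 is installed and uses its layer_data() helper; the names p_mapped and p_set are illustrative and not part of the exercise.

```r
library(ggplot2)

# colour inside aes(): "blue" becomes a one-level categorical variable
p_mapped <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy, colour = "blue"))

# colour outside aes(): every point is set to the literal colour "blue"
p_set <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy), colour = "blue")

# Colours actually assigned to the points in each plot
mapped_colours <- unique(layer_data(p_mapped)$colour)  # chosen by the default discrete scale
set_colours <- unique(layer_data(p_set)$colour)        # the literal value passed in

mapped_colours
set_colours
```

In the mapped plot the colour comes from the default discrete colour scale, which is why the points are not blue and a legend appears.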

2. Which variables in mpg are categorical? Which variables are continuous? (Hint: type ?mpg to read the documentation for the dataset). How can you see this information when you run mpg?
The categorical variables in mpg are manufacturer, model, trans, drv, fl, and class.
The continuous variables in mpg are displ, year, cyl, cty, and hwy.
In the printed data frame, the angle brackets at the top of each column give the type of each variable.
mpg
Those with <chr> above their columns are categorical, while those with <dbl> or <int> are continuous. Alternatively, glimpse() displays the type of each column.
glimpse(mpg)
Observations: 234
Variables: 11
$ manufacturer <chr> "audi", "audi", "audi", "audi", "audi", "audi", "audi", "audi", "audi", "audi"...
$ model        <chr> "a4", "a4", "a4", "a4", "a4", "a4", "a4", "a4 quattro", "a4 quattro", "a4 quat...
$ displ        <dbl> 1.8, 1.8, 2.0, 2.0, 2.8, 2.8, 3.1, 1.8, 1.8, 2.0, 2.0, 2.8, 2.8, 3.1, 3.1, 2.8...
$ year         <int> 1999, 1999, 2008, 2008, 1999, 1999, 2008, 1999, 1999, 2008, 2008, 1999, 1999, ...
$ cyl          <int> 4, 4, 4, 4, 6, 6, 6, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 8, 8, 8, 8, 8, 8, 8, 8, 8, ...
$ trans        <chr> "auto(l5)", "manual(m5)", "manual(m6)", "auto(av)", "auto(l5)", "manual(m5)", ...
$ drv          <chr> "f", "f", "f", "f", "f", "f", "f", "4", "4", "4", "4", "4", "4", "4", "4", "4"...
$ cty          <int> 18, 21, 20, 21, 16, 18, 18, 18, 16, 20, 19, 15, 17, 17, 15, 15, 17, 16, 14, 11...
$ hwy          <int> 29, 29, 31, 30, 26, 26, 27, 26, 25, 28, 27, 25, 25, 25, 25, 24, 25, 23, 20, 15...
$ fl           <chr> "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p"...
$ class        <chr> "compact", "compact", "compact", "compact", "compact", "compact", "compact", "...
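
The same classification can be derived programmatically from the column types. The following is a minimal sketch that assumes, as in the text above, that character columns are categorical and numeric columns are continuous; the names categorical_vars and continuous_vars are illustrative.

```r
library(ggplot2)  # provides the mpg data set

# TRUE for each character column (treated here as categorical)
is_categorical <- vapply(mpg, is.character, logical(1))
# TRUE for each integer or double column (treated here as continuous)
is_continuous <- vapply(mpg, is.numeric, logical(1))

categorical_vars <- names(mpg)[is_categorical]
continuous_vars <- names(mpg)[is_continuous]

categorical_vars
continuous_vars
```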
3. Map a continuous variable to color, size, and shape. How do these aesthetics behave differently for categorical vs. continuous variables?
The variable cty, city miles per gallon, is a continuous variable.
ggplot(mpg, aes(x = displ, y = hwy, colour = cty)) +
geom_point()

Instead of using discrete colors, the continuous variable uses a scale that varies from a light to dark blue color.
ggplot(mpg, aes(x = displ, y = hwy, size = cty)) +
geom_point()

When mapped to size, the sizes of the points vary continuously as a function of the value of cty.
#ggplot(mpg, aes(x = displ, y = hwy, shape = cty)) + geom_point()
#> Error: A continuous variable can not be mapped to shape
When a continuous value is mapped to shape, it gives an error. Though we could split a continuous variable into discrete categories and use a shape aesthetic, this would conceptually not make sense. A numeric variable has an order, but shapes do not. It is clear that smaller points correspond to smaller values, or once the color scale is given, which colors correspond to larger or smaller values. But it is not clear whether a square is greater or less than a circle.
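
For completeness, the binning approach mentioned above might look like the following sketch. The break points (15 and 20) and the names mpg_binned, cty_bin, and p_binned are arbitrary choices made for illustration; they are not part of the exercise.

```r
library(ggplot2)

# Copy mpg and bin cty into three groups (the break points are arbitrary)
mpg_binned <- mpg
mpg_binned$cty_bin <- cut(mpg_binned$cty, breaks = c(0, 15, 20, Inf))

# A discrete variable can be mapped to shape without an error
p_binned <- ggplot(mpg_binned, aes(x = displ, y = hwy, shape = cty_bin)) +
  geom_point()
p_binned
```

The plot renders, but the legend is needed to tell which shape stands for which range, which illustrates why shape is a poor choice for ordered values.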
4. What happens if you map the same variable to multiple aesthetics?
ggplot(mpg, aes(x = displ, y = hwy, colour = hwy, size = displ)) +
geom_point()

In the above plot, hwy is mapped to both location on the y-axis and color, and displ is mapped to both location on the x-axis and size. The code works and produces a plot, even if it is a bad one. Mapping a single variable to multiple aesthetics adds no new information, so in most cases it is best avoided.
5. What does the stroke aesthetic do? What shapes does it work with? (Hint: use ?geom_point)
Stroke changes the size of the border for shapes (21-25). These are filled shapes in which the color and size of the border can differ from that of the filled interior of the shape.
For example:
ggplot(mtcars, aes(wt, mpg)) +
geom_point(shape = 21, colour = "black", fill = "white", size = 5, stroke = 5)

6. What happens if you map an aesthetic to something other than a variable name, like aes(colour = displ < 5)?
ggplot(mpg, aes(x = displ, y = hwy, colour = displ < 5)) +
geom_point()

Aesthetics can also be mapped to expressions like displ < 5. The ggplot() function behaves as if a temporary variable were added to the data with values equal to the result of the expression. In this case, the result of displ < 5 is a logical variable which takes values of TRUE or FALSE.
This also explains why, in the earlier exercise, the expression colour = "blue" created a categorical variable with only one category: "blue".
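
The "temporary variable" description can be checked with a short sketch that adds the logical column explicitly and compares the two plots. The names mpg_flagged, displ_lt_5, p_expr, and p_var are hypothetical and exist only for this illustration.

```r
library(ggplot2)

# Add the result of the expression as an explicit column
mpg_flagged <- mpg
mpg_flagged$displ_lt_5 <- mpg_flagged$displ < 5

# Aesthetic mapped to an expression
p_expr <- ggplot(mpg, aes(x = displ, y = hwy, colour = displ < 5)) +
  geom_point()

# Aesthetic mapped to the precomputed column
p_var <- ggplot(mpg_flagged, aes(x = displ, y = hwy, colour = displ_lt_5)) +
  geom_point()

# Both plots assign the same colour to every point
same_colours <- identical(layer_data(p_expr)$colour, layer_data(p_var)$colour)
same_colours
```

The two plots should differ only in the legend title, which is taken from the expression in one case and from the column name in the other.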