suppressPackageStartupMessages(library("tidyverse"))
package ‘tidyverse’ was built under R version 3.6.3

1. How can you tell if an object is a tibble? (Hint: try printing mtcars, which is a regular data frame).

When we print mtcars, it prints all the columns.

mtcars

But when we first convert mtcars to a tibble using as_tibble(), it prints only the first ten observations and only the columns that fit on the screen. The printed tibble also differs in other ways: it reports its dimensions and shows the type of each column.

as_tibble(mtcars)

You can use the function is_tibble() to check whether a data frame is a tibble or not. The mtcars data frame is not a tibble.

is_tibble(mtcars)
[1] FALSE

But the diamonds and flights data are tibbles.

is_tibble(ggplot2::diamonds)
[1] TRUE
is_tibble(nycflights13::flights)
[1] TRUE
is_tibble(as_tibble(mtcars))
[1] TRUE

More generally, you can use the class() function to find out the class of an object. Tibbles have the classes c("tbl_df", "tbl", "data.frame"), while ordinary data frames have only the class "data.frame".

class(mtcars)
[1] "data.frame"
class(ggplot2::diamonds)
[1] "tbl_df"     "tbl"        "data.frame"
class(nycflights13::flights)
[1] "tbl_df"     "tbl"        "data.frame"
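
Because class() returns a whole vector of class names, it can be more convenient to test for a single class with base R's inherits(). The following is a minimal sketch, assuming the tibble package is installed; the object names df and tbl are chosen here for illustration only.

```r
library(tibble)

df  <- mtcars             # a regular data frame
tbl <- as_tibble(mtcars)  # the same data as a tibble

# inherits() checks whether a class name appears anywhere in class(x)
df_is_tbl  <- inherits(df, "tbl_df")       # regular data frames lack "tbl_df"
tbl_is_tbl <- inherits(tbl, "tbl_df")      # tibbles have it
tbl_is_df  <- inherits(tbl, "data.frame")  # tibbles are still data frames
```

The last check shows why code written for data frames generally accepts tibbles as well: a tibble still inherits from "data.frame".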

If you are interested in reading more on R’s classes, read the chapters on object-oriented programming in Advanced R (http://adv-r.had.co.nz/S3.html).

2. Compare and contrast the following operations on a data.frame and equivalent tibble. What is different? Why might the default data frame behaviors cause you frustration?

df <- data.frame(abc = 1, xyz = "a")
df$x
[1] a
Levels: a
df[, "xyz"]
[1] a
Levels: a
df[, c("abc", "xyz")]
tbl <- as_tibble(df)
tbl$x
Unknown or uninitialised column: 'x'.
NULL
tbl[, "xyz"]
tbl[, c("abc", "xyz")]

On a data.frame, the $ operator does partial matching: it matches any column whose name starts with the name following it, as long as that match is unique. Since xyz is the only column that starts with x, the expression df$x is expanded to df$xyz. This behavior saves a few keystrokes, but it can result in accidentally using a different column than you intended. A tibble never does partial matching, so tbl$x returns NULL with a warning.

With a data.frame, the type of object that [ returns depends on the number of columns selected. If one column is selected, it returns a vector rather than a data.frame; if more than one is selected, it returns a data.frame. This is fine if you know what you are passing in, but suppose you wrote df[, vars], where vars is a variable. What that code returns then depends on length(vars), and you would have to write code to handle both cases or risk bugs. With a tibble, [ always returns a tibble.
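
If you have to work with a plain data.frame, one way to avoid both surprises is sketched below, using base R only. The variable names (vars, one_col_default, and so on) are made up for this illustration, and stringsAsFactors = FALSE is set explicitly so the result does not depend on the R version.

```r
df <- data.frame(abc = 1, xyz = "a", stringsAsFactors = FALSE)
vars <- "xyz"

# Default behavior: a single selected column is dropped to a vector
one_col_default <- df[, vars]

# drop = FALSE keeps the result a data.frame regardless of length(vars)
one_col_kept <- df[, vars, drop = FALSE]

# $ does partial matching on a data.frame, so this finds column xyz
partial <- df$x

# [[ uses exact matching by default, so this returns NULL instead
exact <- df[["x"]]
```

Using drop = FALSE and [[ gives a data.frame behavior that is closer to what a tibble does by default.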

3. If you have the name of a variable stored in an object, e.g. var <- "mpg", how can you extract the reference variable from a tibble?

You can use the double bracket, like df[[var]]. You cannot use the dollar sign, because df$var would look for a column named var.
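
As a small sketch of the difference, assuming the tibble package is installed and using mtcars converted to a tibble as stand-in data (the exercise itself does not name a dataset):

```r
library(tibble)

tbl <- as_tibble(mtcars)
var <- "mpg"

# [[ evaluates var and looks up the column named "mpg"
mpg_values <- tbl[[var]]

# $ takes the name literally and looks for a column called "var",
# which does not exist, so the result is NULL (with a warning)
literal <- suppressWarnings(tbl$var)
```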

4. Practice referring to non-syntactic names in the following data frame by:

  1. Extracting the variable called 1.
  2. Plotting a scatterplot of 1 vs 2.
  3. Creating a new column called 3 which is 2 divided by 1.
  4. Renaming the columns to one, two and three.

For this example, I’ll create a dataset called annoying with columns named 1 and 2.

annoying <- tibble(
  `1` = 1:10,
  `2` = `1` * 2 + rnorm(length(`1`))
)
  1. To extract the variable named 1:
annoying[["1"]]
 [1]  1  2  3  4  5  6  7  8  9 10

or

annoying$`1`
 [1]  1  2  3  4  5  6  7  8  9 10
  2. To create a scatter plot of 1 vs. 2:
ggplot(annoying, aes(x = `1`, y = `2`)) +
  geom_point()

  3. To add a new column 3 which is 2 divided by 1:
mutate(annoying, `3` = `2` / `1`)

or

annoying[["3"]] <- annoying$`2` / annoying$`1`

or

annoying[["3"]] <- annoying[["2"]] / annoying[["1"]]
  4. To rename the columns to one, two, and three, run:
annoying <- rename(annoying, one = `1`, two = `2`, three = `3`)
glimpse(annoying)
Observations: 10
Variables: 3
$ one   <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
$ two   <dbl> 1.795927, 2.367798, 6.048697, 8.507860, 9.895426, 10.586689, 13.874955, 13....
$ three <dbl> 1.795927, 1.183899, 2.016232, 2.126965, 1.979085, 1.764448, 1.982136, 1.690...

5. What does tibble::enframe() do? When might you use it?

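The function tibble::enframe() converts a named vector to a data frame (a tibble) with two columns, one holding the names and one holding the values. You might use it when a function returns a named vector and you want to continue working with the result as a data frame.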

enframe(c(a = 1, b = 2, c = 3))
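
The sketch below, which assumes the tibble package is installed, shows the column names that enframe() produces, how its name and value arguments change them, and the inverse function deframe(). The object names x, long, custom, and back are illustrative only.

```r
library(tibble)

x <- c(a = 1, b = 2, c = 3)

# Default column names are "name" and "value"
long <- enframe(x)

# The name and value arguments set the column names
custom <- enframe(x, name = "letter", value = "number")

# deframe() converts a two-column data frame back to a named vector
back <- deframe(long)
```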

6. What option controls how many additional column names are printed at the footer of a tibble?

The print() method for tibble objects is documented in ?print.tbl. The n_extra argument determines the number of extra columns to print information for.