suppressPackageStartupMessages(library("tidyverse"))

1. In code that doesn’t use stringr, you’ll often see paste() and paste0(). What’s the difference between the two functions? What stringr function are they equivalent to? How do the functions differ in their handling of NA?

The function paste() separates strings with a space by default (sep = " "), while paste0() uses no separator (sep = "").

paste("foo", "bar")
[1] "foo bar"
paste0("foo", "bar")
[1] "foobar"

Since str_c() does not separate strings with spaces by default, it is closer in behavior to paste0().

str_c("foo", "bar")
[1] "foobar"

However, str_c() and the paste functions handle NA differently. The function str_c() propagates NA: if any argument is a missing value, it returns a missing value. This is in line with how numeric R functions such as sum() and mean() handle missing values. The paste functions, by contrast, convert NA to the string "NA" and then treat it like any other string.

str_c("foo", NA)
[1] NA
paste("foo", NA)
[1] "foo NA"
paste0("foo", NA)
[1] "fooNA"
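
As a small sketch of how the two behaviors relate: stringr also provides str_replace_na(), which is not mentioned in the answer above. Assuming it is applied first, it turns missing values into the literal string "NA", so str_c() then gives the same result as paste0().

```r
library(stringr)

x <- c("foo", NA)

# str_c() propagates the missing value in the second element
str_c(x, "bar")

# Converting NA to the string "NA" first mimics the paste functions
str_c(str_replace_na(x), "bar")
paste0(x, "bar")
```

Which behavior is preferable depends on whether a missing value should stay visible as missing or be printed as text.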

2. In your own words, describe the difference between the sep and collapse arguments to str_c().

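The sep argument is the string inserted between the arguments to str_c() when they are combined element by element. The collapse argument is the string inserted between the elements of the resulting character vector when they are joined into a single string, i.e. a character vector of length one.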
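
A short illustration of the difference, using two made-up vectors x and y:

```r
library(stringr)

x <- c("a", "b", "c")
y <- c("1", "2", "3")

# sep goes between the arguments, element by element: the result has length 3
str_c(x, y, sep = "-")
# "a-1" "b-2" "c-3"

# collapse then joins the elements of that result into one string
str_c(x, y, sep = "-", collapse = ", ")
# "a-1, b-2, c-3"

# with a single argument only collapse has an effect
str_c(x, collapse = ", ")
# "a, b, c"
```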

3. Use str_length() and str_sub() to extract the middle character from a string. What will you do if the string has an even number of characters?

The following code extracts the middle character. If the string has an even number of characters, there are two middle characters and the choice between them is arbitrary. We select the character at position \(\lceil n / 2 \rceil\), the first of the two, because that expression also works when the string has length one. A more general method would let the user choose either of the two middle characters of an even-length string, i.e. the floor or the ceiling of \((n + 1) / 2\).

x <- c("a", "abc", "abcd", "abcde", "abcdef")
L <- str_length(x)
m <- ceiling(L / 2)
str_sub(x, m, m)
[1] "a" "b" "b" "c" "c"
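
The more general method mentioned above could be sketched as follows. The function name str_middle() and its side argument are hypothetical, introduced only for this illustration. For odd lengths both choices give the same position, and for even lengths they pick the left or right of the two middle characters.

```r
library(stringr)

# Hypothetical helper (not part of stringr): returns the middle character,
# letting the caller pick the left or right one for even-length strings.
str_middle <- function(x, side = c("left", "right")) {
  side <- match.arg(side)
  n <- str_length(x)
  # floor((n + 1) / 2) equals ceiling(n / 2), the choice used above
  m <- if (side == "left") floor((n + 1) / 2) else ceiling((n + 1) / 2)
  str_sub(x, m, m)
}

x <- c("a", "abc", "abcd", "abcde", "abcdef")
str_middle(x)
# "a" "b" "b" "c" "c"
str_middle(x, side = "right")
# "a" "b" "c" "c" "d"
```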

4. What does str_wrap() do? When might you want to use it?

The function str_wrap() wraps text so that it fits within a certain width by inserting newlines between words. This is useful when long strings of text need to be displayed or typeset within a fixed width.
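
A minimal example, with an arbitrary sentence and a width of 20 chosen for illustration. The exact line breaks are left to str_wrap(), so no output is shown here.

```r
library(stringr)

text <- "The quick brown fox jumps over the lazy dog and keeps running."

# Insert newlines so that each line fits within roughly 20 characters
wrapped <- str_wrap(text, width = 20)

# writeLines() prints the wrapped string with its line breaks rendered
writeLines(wrapped)
```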

5. What does str_trim() do? What’s the opposite of str_trim()?

The function str_trim() removes leading and trailing whitespace from a string. The side argument restricts trimming to the left or right side.

str_trim(" abc ")
[1] "abc"
str_trim(" abc ", side = "left")
[1] "abc "
str_trim(" abc ", side = "right")
[1] " abc"

The opposite of str_trim() is str_pad(), which adds characters (spaces by default) to one or both sides until the string reaches a given width.

str_pad("abc", 5, side = "both")
[1] " abc "
str_pad("abc", 4, side = "right")
[1] "abc "
str_pad("abc", 4, side = "left")
[1] " abc"
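
A few further calls may help show how str_pad() behaves. The inputs are illustrative only. The pad argument sets the padding character, strings already at least as wide as the requested width are returned unchanged, and str_trim() undoes padding with spaces.

```r
library(stringr)

# pad with a character other than a space (side defaults to "left")
str_pad("7", 3, pad = "0")
# "007"

# a string longer than the requested width is not truncated
str_pad("abcdef", 3)
# "abcdef"

# trimming removes the spaces that padding added
str_trim(str_pad("abc", 7, side = "both"))
# "abc"
```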

6. Write a function that turns (e.g.) a vector c("a", "b", "c") into the string "a, b, and c". Think carefully about what it should do if given a vector of length 0, 1, or 2.

See the Functions lecture for more details on writing R functions.

This function needs to handle four cases.

  1. n == 0: return an empty string, e.g. "".
  2. n == 1: return the original vector, e.g. "a".
  3. n == 2: return the two elements separated by “and”, e.g. "a and b".
  4. n > 2: return the first n - 1 elements, each followed by a comma, and then “and” followed by the last element, e.g. "a, b, and c".

str_commasep <- function(x, delim = ",") {
  n <- length(x)
  if (n == 0) {
    ""
  } else if (n == 1) {
    x
  } else if (n == 2) {
    # no comma before and when n == 2
    str_c(x[[1]], "and", x[[2]], sep = " ")
  } else {
    # commas after all n - 1 elements
    not_last <- str_c(x[seq_len(n - 1)], delim)
    # prepend "and" to the last element
    last <- str_c("and", x[[n]], sep = " ")
    # combine parts with spaces
    str_c(c(not_last, last), collapse = " ")
  }
}
str_commasep(character())
[1] ""
str_commasep("a")
[1] "a"
str_commasep(c("a", "b"))
[1] "a and b"
str_commasep(c("a", "b", "c"))
[1] "a, b, and c"
str_commasep(c("a", "b", "c", "d"))
[1] "a, b, c, and d"