suppressPackageStartupMessages(library("tidyverse"))
package 'tidyverse' was built under R version 3.6.3
1. How would you find all strings containing \ with regex() vs. with fixed()?
str_subset(c("a\\b", "ab"), "\\\\")
[1] "a\\b"
str_subset(c("a\\b", "ab"), fixed("\\"))
[1] "a\\b"
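Both calls return the same match, but the patterns go through a different number of escaping layers. The sketch below is one way to make that visible; the variable names x, regex_hits, and fixed_hits are illustrative and not part of the original solution.

```r
library(stringr)

x <- c("a\\b", "ab")

# writeLines() prints the string as stored: "a\\b" holds one backslash.
writeLines(x[1])

# The R literal "\\\\" is a two-character string, and "\\" is one character.
nchar("\\\\")
nchar("\\")

# Regex: R's parser turns "\\\\" into \\, which the regex engine
# then reads as a single escaped (literal) backslash.
regex_hits <- str_subset(x, regex("\\\\"))

# Fixed: R's parser turns "\\" into \, which is compared literally
# with no regex interpretation, so one level of escaping is enough.
fixed_hits <- str_subset(x, fixed("\\"))

identical(regex_hits, fixed_hits)
```

With regex() the backslash has to be escaped twice (once for the R string, once for the regex engine), while fixed() only needs the R string escape.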
2. What are the five most common words in sentences?
Using str_extract_all() with the argument boundary("word") extracts every word from sentences. The rest of the code uses dplyr functions to lowercase the words, count them, and keep the five most common.
tibble(word = unlist(str_extract_all(sentences, boundary("word")))) %>%
  mutate(word = str_to_lower(word)) %>%
  count(word, sort = TRUE) %>%
  head(5)
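To see what boundary("word") does before running it on the full sentences vector, it can help to try the same steps on a tiny input. The sketch below uses a made-up two-element vector (toy is hypothetical, not from the original) and swaps in base R's table() for dplyr::count() to keep the example small.

```r
library(stringr)

# Hypothetical toy input standing in for stringr::sentences.
toy <- c("The cat sat.", "A cat, the dog!")

# boundary("word") extracts words and drops punctuation and whitespace.
words <- unlist(str_extract_all(toy, boundary("word")))

# Lowercase so that "The" and "the" are counted together.
words <- str_to_lower(words)

# Count occurrences and sort from most to least frequent.
counts <- sort(table(words), decreasing = TRUE)

# Keep the two most common words in the toy data.
head(counts, 2)
```

Lowercasing before counting matters here: without it, "The" and "the" would be tallied as separate words.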