## -- Attaching packages ---------- tidyverse 1.3.0 --
## √ ggplot2 3.3.2     √ purrr   0.3.4
## √ tibble  3.0.3     √ dplyr   0.8.5
## √ tidyr   1.0.2     √ stringr 1.4.0
## √ readr   1.3.1     √ forcats 0.5.0
## -- Conflicts ------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

1.str_c

Join multiple strings into a single string.

Usage:
str_c(..., sep = "", collapse = NULL)

sep: String to insert between input vectors.

collapse: Optional string used to combine input vectors into a single string. If collapse = NULL (the default), the result is a character vector with length equal to the longest input vector. If collapse is non-NULL, the result is a character vector of length 1.
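The difference between sep and collapse is easiest to see side by side. The following is a minimal sketch, assuming stringr is installed; the vector x is illustrative and not taken from the original notes.

```r
library(stringr)

x <- c("a", "b", "c")

# sep is inserted between the arguments, element by element,
# so the result keeps one element per input element.
with_sep <- str_c(x, "1", sep = "-")
with_sep
#> [1] "a-1" "b-1" "c-1"

# collapse joins the elements of the result into one string.
with_collapse <- str_c(x, collapse = "-")
with_collapse
#> [1] "a-b-c"

# Both together: paste element-wise with sep, then collapse.
with_both <- str_c(x, "1", sep = "-", collapse = "+")
with_both
#> [1] "a-1+b-1+c-1"
```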

Example 1

## [1] "a b c d e f g h i j k l m n o p q r s t u v w x y z"
## [1] "a*b*c*d*e*f*g*h*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z"
## [1] "a&b&c&d&e&f&g&h&i&j&k&l&m&n&o&p&q&r&s&t&u&v&w&x&y&z"
##  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
## [20] "t" "u" "v" "w" "x" "y" "z"
##  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
## [20] "t" "u" "v" "w" "x" "y" "z"
##  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
## [20] "t" "u" "v" "w" "x" "y" "z"

Example 3

## [1] "我-和-你"
## [1] "我" "和" "你"
## [1] "我和你-"
## [1] "我-d" "和-d" "你-d"
## [1] "我和你"
## [1] "我-和-你"

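Summary

collapse joins the elements within a single vector. For example, to join c("我", "和", "你"), use str_c(c("我", "和", "你"), collapse = "-"), which gives "我-和-你". sep joins separate string arguments. For example, to join "我", "和" and "你", use str_c("我", "和", "你", sep = "-"), which also gives "我-和-你". In the third case, with neither collapse nor sep, the arguments are pasted together element by element (see the example above): str_c(c("我", "和", "你"), "-d") gives "我-d" "和-d" "你-d".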

2.str_conv

Specify the encoding of a string.

Usage:
str_conv(string, encoding)
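As a small illustration, the sketch below re-encodes a single byte that is not valid UTF-8 on its own. It assumes stringr is installed; the choice of byte 177 and the ISO-8859-2 encoding is only an example.

```r
library(stringr)

# In ISO-8859-2 (Latin-2), byte 177 encodes the Polish letter
# "a with ogonek" (Unicode code point U+0105).
x <- rawToChar(as.raw(177))

# Tell stringr how the bytes should be interpreted; the result is UTF-8.
converted <- str_conv(x, "ISO-8859-2")
converted
```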

4.str_detect

Detect the presence or absence of a pattern in a string.

Usage:
str_detect(string, pattern, negate = FALSE)
negate: If TRUE, return non-matching elements.
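A minimal sketch of str_detect() with and without negate, assuming stringr 1.4.0 or later (where negate is available). The fruit vector is illustrative.

```r
library(stringr)

fruit <- c("apple", "banana", "pear", "pineapple")

# Does each string contain an "a" anywhere?
has_a <- str_detect(fruit, "a")
#> [1] TRUE TRUE TRUE TRUE

# Anchors restrict the match to the start (^) or the end ($).
starts_a <- str_detect(fruit, "^a")
#> [1]  TRUE FALSE FALSE FALSE
ends_a <- str_detect(fruit, "a$")
#> [1] FALSE  TRUE FALSE FALSE

# negate = TRUE flips the result: TRUE where the pattern is absent.
no_b <- str_detect(fruit, "b", negate = TRUE)
#> [1]  TRUE FALSE  TRUE  TRUE
```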

Example 2

##  [1]  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
## [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [25] FALSE FALSE

5.str_dup

Duplicate and concatenate strings within a character vector.

Usage:
str_dup(string, times)
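The calls below are a sketch of how str_dup() vectorises over both arguments, assuming stringr is installed. The fruit vector is illustrative.

```r
library(stringr)

fruit <- c("apple", "pear", "banana")

# A single value of times is recycled across all strings.
twice <- str_dup(fruit, 2)
#> [1] "appleapple"   "pearpear"     "bananabanana"

# A vector of times is matched element by element.
varying <- str_dup(fruit, 1:3)
#> [1] "apple"              "pearpear"           "bananabananabanana"

# times = 0 gives an empty string, which is handy when building patterns.
banana_like <- str_c("ba", str_dup("na", 0:2))
#> [1] "ba"     "bana"   "banana"
```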

Example 1

## [1] "appleapple"   "pearpear"     "bananabanana"
## [1] "apple"              "pearpear"           "bananabananabanana"

Example 2

## [1] "ba"           "bana"         "banana"       "bananana"     "banananana"  
## [6] "bananananana"
## [1] "ba"           "baba"         "bababa"       "babababa"     "bababababa"  
## [6] "babababababa"
## [1] "别叨叨"         "别叨叨叨叨"     "别叨叨叨叨叨叨"

6.str_extract

Extract matching patterns from a string.

Usage:
str_extract(string, pattern)
str_extract_all(string, pattern, simplify = FALSE)

simplify: If FALSE, the default, returns a list of character vectors. If TRUE returns a character matrix.
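The sketch below contrasts str_extract() (first match only) with str_extract_all() (every match), assuming stringr is installed. The shopping_list vector is illustrative.

```r
library(stringr)

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")

# First digit in each string; NA where there is no match.
first_digit <- str_extract(shopping_list, "\\d")
#> [1] "4" NA  NA  "2"

# Every run of lowercase letters, as a list with one vector per string.
all_words <- str_extract_all(shopping_list, "[a-z]+")
all_words[[2]]
#> [1] "bag"   "of"    "flour"

# simplify = TRUE returns a character matrix padded with "".
word_matrix <- str_extract_all(shopping_list, "\\b[a-z]+\\b", simplify = TRUE)
dim(word_matrix)
#> [1] 4 3
```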

Example 2

## [1] "4" NA  NA  "2"
## [[1]]
## [1] "4"
## 
## [[2]]
## character(0)
## 
## [[3]]
## character(0)
## 
## [[4]]
## [1] "2"
##      [,1]
## [1,] "4" 
## [2,] ""  
## [3,] ""  
## [4,] "2"
## [1] "a" "b" "b" "m"
## [1] "apples" "bag"    "bag"    "milk"
## [[1]]
## [1] "a" "p" "p" "l" "e" "s" "x"
## 
## [[2]]
##  [1] "b" "a" "g" "o" "f" "f" "l" "o" "u" "r"
## 
## [[3]]
##  [1] "b" "a" "g" "o" "f" "s" "u" "g" "a" "r"
## 
## [[4]]
## [1] "m" "i" "l" "k"
## [[1]]
## [1] "apples" "x"     
## 
## [[2]]
## [1] "bag"   "of"    "flour"
## 
## [[3]]
## [1] "bag"   "of"    "sugar"
## 
## [[4]]
## [1] "milk"
## [1] "ap" "ba" "ba" "mi"
## [1] "appl" "bag"  "bag"  "milk"

Example 4

## [[1]]
## [1] "apples"
## 
## [[2]]
## [1] "bag"   "of"    "flour"
## 
## [[3]]
## [1] "bag"   "of"    "sugar"
## 
## [[4]]
## [1] "milk"
##      [,1]     [,2] [,3]   
## [1,] "apples" ""   ""     
## [2,] "bag"    "of" "flour"
## [3,] "bag"    "of" "sugar"
## [4,] "milk"   ""   ""

7.str_flatten

Flatten a string.

Usage:
str_flatten(string, collapse = "")
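A minimal sketch of str_flatten(), assuming stringr is installed. It always returns a single string, which makes it a stricter alternative to str_c(..., collapse = ...).

```r
library(stringr)

# Default: join with nothing in between.
alphabet <- str_flatten(letters)
#> [1] "abcdefghijklmnopqrstuvwxyz"

# With an explicit separator.
csv <- str_flatten(c("a", "b", "c"), collapse = ",")
#> [1] "a,b,c"
```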

Example 1

## [1] "abcdefghijklmnopqrstuvwxyz"
## [1] "a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-r-s-t-u-v-w-x-y-z"
## [1] "A*B*C*D*E*F*G*H*I*J*K*L*M*N*O*P*Q*R*S*T*U*V*W*X*Y*Z"

Example 2

## [1] "a,b,c"

8.str_glue

Format and interpolate a string with glue.

Usage:
str_glue(..., .sep = "", .envir = parent.frame())
str_glue_data(.x, ..., .sep = "", .envir = parent.frame(), .na = "NA")

.sep: [character(1): ""] Separator used to separate elements.

.envir: [environment: parent.frame()] Environment to evaluate each expression in. Expressions are evaluated from left to right. If .x is an environment, the expressions are evaluated in that environment and .envir is ignored.

.x: [listish] An environment, list or data frame used to lookup values.

.na: [character(1): "NA"] Value to replace NA values with. If NULL, missing values are propagated, that is, an NA result will cause NA output. Otherwise the value is replaced by the value of .na.
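The following sketch shows the two entry points, assuming stringr 1.3.0 or later (which depends on glue). The variable names and the small people data frame are made up for illustration.

```r
library(stringr)

name <- "Fred"
age <- 50

# Expressions inside {} are evaluated in the calling environment.
greeting <- str_glue("My name is {name} and I am {age} years old.")
greeting

# str_glue_data() looks the names up in a list or data frame instead,
# producing one string per row.
people <- data.frame(name = c("Ann", "Bob"), hp = c(110, 93))
report <- str_glue_data(people, "{name} has {hp} hp")
report
```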

Example 2

## 张三北风网北京海定
## 张三-北风网-北京海定

Note: str_glue_data() is useful in data pipelines:
mtcars %>% str_glue_data("{rownames(.)} has {hp} hp")

9.str_length

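Technically, this returns the number of "code points" in a string. One code point usually corresponds to one character, but not always. For example, a u with an umlaut might be represented as a single character or as the combination of a u and an umlaut.

Usage: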
str_length(string)
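The code-point caveat can be checked directly. This is a sketch assuming stringr and stringi are installed; it uses stringi::stri_trans_nfd() to decompose the precomposed character into a base letter plus a combining mark.

```r
library(stringr)

# NA stays NA; other inputs are measured element by element.
lens <- str_length(c("i", "like", "programming", NA))
#> [1]  1  4 11 NA

# Two representations of u with an umlaut.
u1 <- "\u00fc"                       # one precomposed code point
u2 <- stringi::stri_trans_nfd(u1)    # "u" followed by a combining diaeresis

# str_length() counts code points, so the two differ ...
len_u1 <- str_length(u1)   # 1
len_u2 <- str_length(u2)   # 2

# ... while str_count() with its default pattern counts characters.
count_u1 <- str_count(u1)  # 1
count_u2 <- str_count(u2)  # 1
```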

Example 1

##  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [1] NA
## [1] 3
## [1]  1  4 11 NA

Example 2

## [1] "ü"
## [1] "u<U+0308>"
## [1] 1
## [1] 2
## [1] 1
## [1] 1

10.str_locate

Locate the position of patterns in a string.

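Usage:
str_locate(string, pattern)
str_locate_all(string, pattern)

For str_locate, an integer matrix. The first column gives the start position of the match, and the second column gives the end position. For str_locate_all, a list of integer matrices.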

See also: str_extract() for a convenient way of extracting matches, stringi::stri_locate() for the underlying implementation.
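A short sketch of the two return shapes, assuming stringr is installed. The fruit vector is illustrative.

```r
library(stringr)

fruit <- c("apple", "banana", "pear", "pineapple")

# str_locate(): one row per input string, first match only.
first_a <- str_locate(fruit, "a")
first_a[, "start"]
#> [1] 1 2 3 5

# Strings without a match get NA in both columns.
first_e <- str_locate(fruit, "e")
first_e[2, ]   # "banana" contains no "e"

# str_locate_all(): a list with one matrix per string, one row per match.
all_a <- str_locate_all(fruit, "a")
all_a[[2]][, "start"]
#> [1] 2 4 6
```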

Example 1

##      start end
## [1,]     6   5
## [2,]     7   6
## [3,]     5   4
## [4,]    10   9
##      start end
## [1,]     1   1
## [2,]     2   2
## [3,]     3   3
## [4,]     5   5
##      start end
## [1,]     5   5
## [2,]    NA  NA
## [3,]     2   2
## [4,]     4   4
##      start end
## [1,]     1   1
## [2,]     1   1
## [3,]     1   1
## [4,]     1   1
## [[1]]
##      start end
## [1,]     1   1
## 
## [[2]]
##      start end
## [1,]     2   2
## [2,]     4   4
## [3,]     6   6
## 
## [[3]]
##      start end
## [1,]     3   3
## 
## [[4]]
##      start end
## [1,]     5   5
## [[1]]
##      start end
## [1,]     5   5
## 
## [[2]]
##      start end
## 
## [[3]]
##      start end
## [1,]     2   2
## 
## [[4]]
##      start end
## [1,]     4   4
## [2,]     9   9
## [[1]]
##      start end
## [1,]     1   1
## 
## [[2]]
##      start end
## [1,]     1   1
## 
## [[3]]
##      start end
## [1,]     1   1
## 
## [[4]]
##      start end
## [1,]     1   1
## [2,]     6   6
## [3,]     7   7
## [[1]]
##      start end
## [1,]     1   1
## [2,]     2   2
## [3,]     3   3
## [4,]     4   4
## [5,]     5   5
## 
## [[2]]
##      start end
## [1,]     1   1
## [2,]     2   2
## [3,]     3   3
## [4,]     4   4
## [5,]     5   5
## [6,]     6   6
## 
## [[3]]
##      start end
## [1,]     1   1
## [2,]     2   2
## [3,]     3   3
## [4,]     4   4
## 
## [[4]]
##       start end
##  [1,]     1   1
##  [2,]     2   2
##  [3,]     3   3
##  [4,]     4   4
##  [5,]     5   5
##  [6,]     6   6
##  [7,]     7   7
##  [8,]     8   8
##  [9,]     9   9

11.str_match

Extract matched groups from a string.

Usage:
str_match(string, pattern)
str_match_all(string, pattern)

For str_match, a character matrix. The first column is the complete match, followed by one column for each capture group. For str_match_all, a list of character matrices.

See also: str_extract() to extract the complete match, stringi::stri_match() for the underlying implementation.
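Capture groups are the parts of a pattern wrapped in parentheses; str_match() returns the whole match plus one column per group. The sketch below assumes stringr is installed; the strings vector and the phone pattern are illustrative and are not the data behind the output in these notes.

```r
library(stringr)

strings <- c("219 733 8965", "329-293-8753", "banana", "Work: 579-499-7527")

# Three groups: area code, exchange, line number.
# [- .] matches one separator (dash, space or dot) and is not captured.
phone <- "([2-9][0-9]{2})[- .]([0-9]{3})[- .]([0-9]{4})"

m <- str_match(strings, phone)
m
# Column 1 is the complete match, columns 2 to 4 are the three groups.
# A string with no match ("banana") gives a row of NA.

# str_extract() returns only what str_match() puts in column 1.
whole <- str_extract(strings, phone)
```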

Example 1 (not understood)

##  [1] "219 733 8965" "329-293-8753" NA             "595 794 7569" "387 287 6718"
##  [6] NA             "233.398.9187" "482 952 3315" "239 923 8115" "579-499-7527"
## [11] NA             "543.355.3679"
##       [,1]           [,2]  [,3]  [,4]  
##  [1,] "219 733 8965" "219" "733" "8965"
##  [2,] "329-293-8753" "329" "293" "8753"
##  [3,] NA             NA    NA    NA    
##  [4,] "595 794 7569" "595" "794" "7569"
##  [5,] "387 287 6718" "387" "287" "6718"
##  [6,] NA             NA    NA    NA    
##  [7,] "233.398.9187" "233" "398" "9187"
##  [8,] "482 952 3315" "482" "952" "3315"
##  [9,] "239 923 8115" "239" "923" "8115"
## [10,] "579-499-7527" "579" "499" "7527"
## [11,] NA             NA    NA    NA    
## [12,] "543.355.3679" "543" "355" "3679"
## [[1]]
## [1] "219 733 8965"
## 
## [[2]]
## [1] "329-293-8753"
## 
## [[3]]
## character(0)
## 
## [[4]]
## [1] "595 794 7569"
## 
## [[5]]
## [1] "387 287 6718"
## 
## [[6]]
## character(0)
## 
## [[7]]
## [1] "233.398.9187"
## 
## [[8]]
## [1] "482 952 3315"
## 
## [[9]]
## [1] "239 923 8115" "842 566 4692"
## 
## [[10]]
## [1] "579-499-7527"
## 
## [[11]]
## character(0)
## 
## [[12]]
## [1] "543.355.3679"
## [[1]]
##      [,1]           [,2]  [,3]  [,4]  
## [1,] "219 733 8965" "219" "733" "8965"
## 
## [[2]]
##      [,1]           [,2]  [,3]  [,4]  
## [1,] "329-293-8753" "329" "293" "8753"
## 
## [[3]]
##      [,1] [,2] [,3] [,4]
## 
## [[4]]
##      [,1]           [,2]  [,3]  [,4]  
## [1,] "595 794 7569" "595" "794" "7569"
## 
## [[5]]
##      [,1]           [,2]  [,3]  [,4]  
## [1,] "387 287 6718" "387" "287" "6718"
## 
## [[6]]
##      [,1] [,2] [,3] [,4]
## 
## [[7]]
##      [,1]           [,2]  [,3]  [,4]  
## [1,] "233.398.9187" "233" "398" "9187"
## 
## [[8]]
##      [,1]           [,2]  [,3]  [,4]  
## [1,] "482 952 3315" "482" "952" "3315"
## 
## [[9]]
##      [,1]           [,2]  [,3]  [,4]  
## [1,] "239 923 8115" "239" "923" "8115"
## [2,] "842 566 4692" "842" "566" "4692"
## 
## [[10]]
##      [,1]           [,2]  [,3]  [,4]  
## [1,] "579-499-7527" "579" "499" "7527"
## 
## [[11]]
##      [,1] [,2] [,3] [,4]
## 
## [[12]]
##      [,1]           [,2]  [,3]  [,4]  
## [1,] "543.355.3679" "543" "355" "3679"

Example 2 (not understood)

##      [,1]      [,2] [,3]
## [1,] "<a> <b>" "a"  "b" 
## [2,] "<a> <>"  "a"  ""  
## [3,] NA        NA   NA  
## [4,] NA        NA   NA  
## [5,] NA        NA   NA
## [[1]]
##      [,1]  [,2]
## [1,] "<a>" "a" 
## [2,] "<b>" "b" 
## 
## [[2]]
##      [,1]  [,2]
## [1,] "<a>" "a" 
## [2,] "<>"  ""  
## 
## [[3]]
##      [,1]  [,2]
## [1,] "<a>" "a" 
## 
## [[4]]
##      [,1] [,2]
## 
## [[5]]
##      [,1] [,2]
## [1,] NA   NA
## [1] "<a>" "<a>" "<a>" NA    NA
## [[1]]
## [1] "<a>" "<b>"
## 
## [[2]]
## [1] "<a>" "<>" 
## 
## [[3]]
## [1] "<a>"
## 
## [[4]]
## character(0)
## 
## [[5]]
## [1] NA

Example 3

##      [,1]
## [1,] "a" 
## [2,] NA  
## [3,] NA
##      [,1]
## [1,] NA  
## [2,] NA  
## [3,] NA
## [[1]]
##      [,1]
## [1,] "a" 
## 
## [[2]]
##      [,1]
## 
## [[3]]
##      [,1]
## [[1]]
##      [,1]
## 
## [[2]]
##      [,1]
## 
## [[3]]
##      [,1]
## [1] "a" NA  NA
## [1] NA NA NA
## [[1]]
## [1] "a"
## 
## [[2]]
## character(0)
## 
## [[3]]
## character(0)
## [[1]]
## character(0)
## 
## [[2]]
## character(0)
## 
## [[3]]
## character(0)

12.str_order

Order or sort a character vector.

Usage:
str_order(x, decreasing = FALSE, na_last = TRUE, locale = "en", numeric = FALSE, ...)
str_sort(x, decreasing = FALSE, na_last = TRUE, locale = "en", numeric = FALSE, ...)

decreasing: A boolean. If FALSE, the default, sorts from lowest to highest; if TRUE sorts from highest to lowest.

na_last: Where should NA go? TRUE at the end, FALSE at the beginning, NA dropped.

locale: In which locale should the sorting occur? Defaults to English. This ensures that code behaves the same way across platforms.

numeric: If TRUE, will sort digits numerically, instead of as strings.
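str_order() returns positions while str_sort() returns the sorted values. A minimal sketch, assuming stringr 1.3.0 or later (for the numeric argument); the vector x is illustrative.

```r
library(stringr)

x <- c("100a10", "100a5", "2b", "2a")

# Plain string sort compares character by character.
as_strings <- str_sort(x)
#> [1] "100a10" "100a5"  "2a"     "2b"

# numeric = TRUE compares runs of digits as numbers.
as_numbers <- str_sort(x, numeric = TRUE)
#> [1] "2a"     "2b"     "100a5"  "100a10"

# str_order() gives the index vector that produces the sorted result.
idx <- str_order(x)
x[idx]

# na_last = FALSE moves missing values to the front.
na_first <- str_sort(c("b", NA, "a"), na_last = FALSE)
#> [1] NA  "a" "b"
```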

Example 1

##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
## [26] 26
##  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
## [20] "t" "u" "v" "w" "x" "y" "z"
##  [1]  1  5  9 15 21  2  3  4  6  7  8 10 11 12 13 14 16 17 18 19 20 22 23 24 25
## [26] 26
##  [1] "a" "e" "i" "o" "u" "b" "c" "d" "f" "g" "h" "j" "k" "l" "m" "n" "p" "q" "r"
## [20] "s" "t" "v" "w" "x" "y" "z"

Example 3

## [1] NA      "book"  "cook"  "ebook" "edog"  "good"
## [1] 5 3 2 4 6 1

13.str_pad

Pad a string.

Usage:
str_pad(string, width, side = c("left", "right", "both"), pad = " ")

See also: str_trim() to remove whitespace; str_trunc() to decrease the maximum width of a string.
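A minimal sketch of the main str_pad() behaviours, assuming stringr is installed. Note that width is a minimum: longer strings are returned unchanged.

```r
library(stringr)

# Default side is "left", default pad is a space.
left <- str_pad("a", 5)
#> [1] "    a"

# Pad on the right with a custom character.
right <- str_pad("hadley", 10, side = "right", pad = "_")
#> [1] "hadley____"

# width and pad are vectorised too.
widths <- str_pad("a", c(2, 4))
#> [1] " a"   "   a"

# Strings already at least `width` long are not truncated.
unchanged <- str_pad("hadley", 3)
#> [1] "hadley"
```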

Example 1

##      [,1]                            
## [1,] "                        hadley"
## [2,] "hadley                        "
## [3,] "            hadley            "
## [1] "         a" "       abc" "    abcdef"
## [1] "    a"                "         a"           "                   a"
## [1] "---------a" "_________a" "         a"
## [1] "hadley"

Example 2

## [1] "--------我"
## [1] "______鲁迅"

14.str_remove

Remove matched patterns in a string.

Usage:
str_remove(string, pattern)
str_remove_all(string, pattern)

Alias for str_replace(string, pattern, "").

See also: str_replace() for the underlying implementation.
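The sketch below shows the first-match versus all-matches distinction, assuming stringr is installed. The fruits vector is illustrative.

```r
library(stringr)

fruits <- c("one apple", "two pears", "three bananas")

# Remove only the first vowel in each string.
first_removed <- str_remove(fruits, "[aeiou]")
#> [1] "ne apple"     "tw pears"     "thre bananas"

# Remove every vowel.
all_removed <- str_remove_all(fruits, "[aeiou]")
#> [1] "n ppl"    "tw prs"   "thr bnns"
```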

Example 1

## [1] "ne apple"     "tw pears"     "thre bananas"
## [1] "n ppl"    "tw prs"   "thr bnns"

Example 2

## [1] "和你"

15.str_replace & str_replace_na

Replace matched patterns in a string.

Usage:
str_replace(string, pattern, replacement)
str_replace_all(string, pattern, replacement)

pattern: Pattern to look for. The default interpretation is a regular expression, as described in stringi::stringi-search-regex. Control options with regex(). Match a fixed string (i.e. by comparing only bytes) using fixed(). This is fast, but approximate. Generally, for matching human text, you'll want coll(), which respects character matching rules for the specified locale.

See also: str_replace_na() to turn missing values into "NA"; stri_replace() for the underlying implementation.
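The replacement string can refer back to capture groups with \\1, \\2 and so on, and both pattern and replacement are vectorised. A sketch assuming stringr is installed; the fruits vector is illustrative.

```r
library(stringr)

fruits <- c("one apple", "two pears", "three bananas")

# "([aeiou])" captures the first vowel; "\\1\\1" writes it back twice,
# so the first vowel of each string is doubled.
doubled <- str_replace(fruits, "([aeiou])", "\\1\\1")
#> [1] "oone apple"     "twoo pears"     "threee bananas"

# A vector of replacements is matched to the strings element by element.
numbered <- str_replace(fruits, "[aeiou]", c("1", "2", "3"))
#> [1] "1ne apple"     "tw2 pears"     "thr3e bananas"

# A vector of patterns works the same way; no match leaves the string as is.
dashed <- str_replace(fruits, c("a", "e", "i"), "-")
#> [1] "one -pple"     "two p-ars"     "three bananas"

# str_replace_na() turns missing values into the string "NA".
no_na <- str_replace_na(c(NA, "abc", "def"))
#> [1] "NA"  "abc" "def"
```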

Example 2

## [1] "one apple" "two pears" NA
## [1] "ne apple"     "tw pears"     "thre bananas"

Example 3 (not understood)

## [1] "oone apple"     "twoo pears"     "threee bananas"

Example 4

## [1] "1ne apple"     "tw2 pears"     "thr3e bananas"
## [1] "one -pple"     "two p-ars"     "three bananas"

Example 6

## [1] "1"   "abc" "def"

16.str_split

Split up a string into pieces.

Usage:
str_split(string, pattern, n = Inf, simplify = FALSE)
str_split_fixed(string, pattern, n)

simplify: If FALSE, the default, returns a list of character vectors. If TRUE returns a character matrix.

For str_split_fixed, a character matrix with n columns. For str_split, a list of character vectors.
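A sketch of how n and the return type interact, assuming stringr is installed. The fruits vector is illustrative.

```r
library(stringr)

fruits <- c(
  "apples and oranges and pears and bananas",
  "pineapples and mangos and guavas"
)

# List of character vectors, one per input string.
pieces <- str_split(fruits, " and ")
pieces[[1]]
#> [1] "apples"  "oranges" "pears"   "bananas"

# n caps the number of pieces; the last piece keeps the remainder.
two_pieces <- str_split(fruits, " and ", n = 2)
two_pieces[[1]]
#> [1] "apples"                        "oranges and pears and bananas"

# str_split_fixed() always returns a matrix with exactly n columns.
fixed3 <- str_split_fixed(fruits, " and ", 3)
fixed3[1, 3]
#> [1] "pears and bananas"
```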

Example 1

## [[1]]
## [1] "apples"  "oranges" "pears"   "bananas"
## 
## [[2]]
## [1] "pineapples" "mangos"     "guavas"
##      [,1]         [,2]      [,3]     [,4]     
## [1,] "apples"     "oranges" "pears"  "bananas"
## [2,] "pineapples" "mangos"  "guavas" ""
## [[1]]
## [1] "apples"            "oranges"           "pears and bananas"
## 
## [[2]]
## [1] "pineapples" "mangos"     "guavas"
## [[1]]
## [1] "apples"                        "oranges and pears and bananas"
## 
## [[2]]
## [1] "pineapples"        "mangos and guavas"
## [[1]]
## [1] "apples"  "oranges" "pears"   "bananas"
## 
## [[2]]
## [1] "pineapples" "mangos"     "guavas"
##      [,1]         [,2]      [,3]               
## [1,] "apples"     "oranges" "pears and bananas"
## [2,] "pineapples" "mangos"  "guavas"
##      [,1]         [,2]      [,3]     [,4]     
## [1,] "apples"     "oranges" "pears"  "bananas"
## [2,] "pineapples" "mangos"  "guavas" ""

Example 2

## [[1]]
## [1] "你"   "我他"
##      [,1] [,2] [,3]
## [1,] "你" "我" "他"
## [[1]]
## [1] "你" "我" "他"

17.str_starts & str_ends

Detect the presence or absence of a pattern at the beginning or end of a string.

Usage:
str_starts(string, pattern, negate = FALSE)
str_ends(string, pattern, negate = FALSE)

negate: If TRUE, return non-matching elements.
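A minimal sketch, assuming stringr 1.4.0 or later (where str_starts() and str_ends() were introduced). The fruit vector is illustrative.

```r
library(stringr)

fruit <- c("apple", "banana", "pear", "pineapple")

# Which strings begin with "p"?
starts_p <- str_starts(fruit, "p")
#> [1] FALSE FALSE  TRUE  TRUE

# negate = TRUE: which strings do not begin with "p"?
not_starts_p <- str_starts(fruit, "p", negate = TRUE)
#> [1]  TRUE  TRUE FALSE FALSE

# Which strings end with "e"?
ends_e <- str_ends(fruit, "e")
#> [1]  TRUE FALSE FALSE  TRUE
```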

Example 1

## [1] FALSE FALSE  TRUE  TRUE
## [1]  TRUE  TRUE FALSE FALSE
## [1]  TRUE FALSE FALSE  TRUE
## [1] FALSE  TRUE  TRUE FALSE

Example 2

## [1]  TRUE FALSE

18.str_sub

Extract and replace substrings from a character vector.

Usage:
str_sub(string, start = 1L, end = -1L)
str_sub(string, start = 1L, end = -1L, omit_na = FALSE) <- value

omit_na: Single logical value. If TRUE, missing values in any of the arguments provided will result in an unchanged input.
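The sketch below covers extraction with positive and negative positions and the replacement form, assuming stringr is installed. The strings are illustrative.

```r
library(stringr)

hw <- "Hadley Wickham"

# Positions are 1-based and inclusive.
first_name <- str_sub(hw, 1, 6)
#> [1] "Hadley"

# Negative positions count back from the end of the string.
last_name <- str_sub(hw, -7)
#> [1] "Wickham"

# The replacement form modifies the variable in place.
x <- "BBCDEF"
str_sub(x, 1, 1) <- "A"
x
#> [1] "ABCDEF"
```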

Example 2

##      start end
## [1,]     2   2
## [2,]     5   5
## [3,]     9   9
## [4,]    13  13
## [1] "a" "e" "i" "a"
## [1] "a" "e" "i" "a"

Example 3

## [1] "我来自"
## [1] "中国"
## [1] "我来自中"
## [1] "中国"
## [1] "中国"
## [1] "我来自中"
## [1] "我" "来" "自" "中" "国"

19.str_subset & str_which

Keep strings matching a pattern, or find positions.

Usage:
str_subset(string, pattern, negate = FALSE)
str_which(string, pattern, negate = FALSE)
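str_subset() returns the matching strings, while str_which() returns their positions. A sketch assuming stringr is installed; the fruit vector is illustrative.

```r
library(stringr)

fruit <- c("apple", "banana", "pear", "pineapple")

# Strings that start with "p" ...
p_fruit <- str_subset(fruit, "^p")
#> [1] "pear"      "pineapple"

# ... and where they sit in the input.
p_index <- str_which(fruit, "^p")
#> [1] 3 4

# negate = TRUE keeps the strings that do not match.
not_p <- str_subset(fruit, "^p", negate = TRUE)
#> [1] "apple"  "banana"
```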

Example 2

## [1] "我来自中国" "我来自英国"
## [1] "我来自中国"

20.str_trim & str_squish

Trim whitespace from a string.

Usage:
str_trim(string, side = c("both", "left", "right"))
str_squish(string)

See also: str_pad() to add whitespace.
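str_trim() only touches the ends, whereas str_squish() also collapses repeated whitespace inside the string. A sketch assuming stringr is installed; the input strings are illustrative.

```r
library(stringr)

padded <- "  String with trailing and leading white space\t"

# Remove whitespace (spaces, tabs, newlines) from both ends.
trimmed <- str_trim(padded)
#> [1] "String with trailing and leading white space"

# Only the left side: the trailing tab survives.
left_only <- str_trim(padded, side = "left")

# str_squish() trims both ends and reduces inner runs to one space.
squished <- str_squish("  String with trailing,  middle, and leading white space\t")
#> [1] "String with trailing, middle, and leading white space"
```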

Example 1

## [1] "String with trailing and leading white space"
## [1] "String with trailing and leading white space"
## [1] "String with trailing and leading white space"
## [1] "String with trailing and leading white space"
## [1] "String with trailing, middle, and leading white space"
## [1] "String with excess, trailing and leading white space"

Example 2

## [1] "我来 自中国"
## [1] "我来 自中国"
## [1] "我来 自中国"

21.str_trunc

Truncate a character string.

Usage:
str_trunc(string, width, side = c("right", "left", "center"), ellipsis = "...")

side, ellipsis: Location and content of ellipsis that indicates content has been removed.

See also: str_pad() to increase the minimum width of a string.
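The side argument decides where the ellipsis goes, and the ellipsis counts towards width. A minimal sketch, assuming stringr is installed; the string x is illustrative.

```r
library(stringr)

x <- "This string is moderately long"

# Cut from the right (the default side).
right <- str_trunc(x, 20, "right")
#> [1] "This string is mo..."

# Cut from the left.
left <- str_trunc(x, 20, "left")
#> [1] "...s moderately long"

# Cut from the middle.
center <- str_trunc(x, 20, "center")
#> [1] "This stri...ely long"
```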

Example 1

##      [,1]                  
## [1,] "This string is mo..."
## [2,] "...s moderately long"
## [3,] "This stri...ely long"

22.str_view

View HTML rendering of regular expression match.

Usage:
str_view(string, pattern, match = NA)
str_view_all(string, pattern, match = NA)

match: If TRUE, shows only strings that match the pattern. If FALSE, shows only the strings that don’t match the pattern. Otherwise (the default, NA) displays both matches and non-matches.

Example 1

Example 2

23.str_wrap

Wrap strings into nicely formatted paragraphs.

Usage:
str_wrap(string, width = 80, indent = 0, exdent = 0)

width: positive integer giving target line width in characters. A width less than or equal to 1 will put each word on its own line.

indent: non-negative integer giving indentation of first line in each paragraph.

exdent: non-negative integer giving indentation of following lines in each paragraph.
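str_wrap() returns a single string with embedded newlines, so cat() is the usual way to display it. The sketch below assumes stringr is installed and uses a short made-up sentence rather than a long text; exact break positions are not shown because they depend on the wrapping algorithm.

```r
library(stringr)

x <- "The quick brown fox jumps over the lazy dog"

# Wrap to a narrow target width; line breaks are inserted as "\n".
wrapped <- str_wrap(x, width = 20)
cat(wrapped, "\n")

# indent affects the first line, exdent the lines that follow.
hanging <- str_wrap(x, width = 20, exdent = 2)
cat(hanging, "\n")
```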

Example 1

## R would not be what it is today without the invaluable help of these people
## outside of the R core team, who contributed by donating code, bug fixes and
## documentation: Valerio Aimale, Suharto Anggono, Thomas Baier, Henrik Bengtsson,
## Roger Bivand, Ben Bolker, David Brahm, G"oran Brostr"om, Patrick Burns, Vince
## Carey, Saikat DebRoy, Matt Dowle, Brian D'Urso, Lyndon Drake, Dirk Eddelbuettel,
## Claus Ekstrom, Sebastian Fischmeister, John Fox, Paul Gilbert, Yu Gong, Gabor
## Grothendieck, Frank E Harrell Jr, Peter M. Haverty, Torsten Hothorn, Robert
## King, Kjetil Kjernsmo, Roger Koenker, Philippe Lambert, Jan de Leeuw, Jim
## Lindsey, Patrick Lindsey, Catherine Loader, Gordon Maclean, Arni Magnusson, John
## Maindonald, David Meyer, Ei-ji Nakama, Jens Oehlschaegel, Steve Oncley, Richard
## O'Keefe, Hubert Palme, Roger D. Peng, Jose' C. Pinheiro, Tony Plate, Anthony
## Rossini, Jonathan Rougier, Petr Savicky, Guenther Sawitzki, Marc Schwartz, Arun
## Srinivasan, Detlef Steuer, Bill Simpson, Gordon Smyth, Adrian Trapletti, Terry
## Therneau, Rolf Turner, Bill Venables, Gregory R. Warnes, Andreas Weingessel,
## Morten Welinder, James Wettenhall, Simon Wood, and Achim Zeileis. Others have
## written code that has been adopted by R and is acknowledged in the code files,
## including
## R would not be what it is today
## without the invaluable help of these
## people outside of the R core team, who
## contributed by donating code, bug fixes
## and documentation: Valerio Aimale,
## Suharto Anggono, Thomas Baier, Henrik
## Bengtsson, Roger Bivand, Ben Bolker,
## David Brahm, G"oran Brostr"om, Patrick
## Burns, Vince Carey, Saikat DebRoy,
## Matt Dowle, Brian D'Urso, Lyndon Drake,
## Dirk Eddelbuettel, Claus Ekstrom,
## Sebastian Fischmeister, John Fox, Paul
## Gilbert, Yu Gong, Gabor Grothendieck,
## Frank E Harrell Jr, Peter M. Haverty,
## Torsten Hothorn, Robert King, Kjetil
## Kjernsmo, Roger Koenker, Philippe
## Lambert, Jan de Leeuw, Jim Lindsey,
## Patrick Lindsey, Catherine Loader,
## Gordon Maclean, Arni Magnusson, John
## Maindonald, David Meyer, Ei-ji Nakama,
## Jens Oehlschaegel, Steve Oncley, Richard
## O'Keefe, Hubert Palme, Roger D. Peng,
## Jose' C. Pinheiro, Tony Plate, Anthony
## Rossini, Jonathan Rougier, Petr Savicky,
## Guenther Sawitzki, Marc Schwartz, Arun
## Srinivasan, Detlef Steuer, Bill Simpson,
## Gordon Smyth, Adrian Trapletti, Terry
## Therneau, Rolf Turner, Bill Venables,
## Gregory R. Warnes, Andreas Weingessel,
## Morten Welinder, James Wettenhall, Simon
## Wood, and Achim Zeileis. Others have
## written code that has been adopted by R
## and is acknowledged in the code files,
## including
##   R would not be what it is today without the invaluable help
## of these people outside of the R core team, who contributed
## by donating code, bug fixes and documentation: Valerio
## Aimale, Suharto Anggono, Thomas Baier, Henrik Bengtsson,
## Roger Bivand, Ben Bolker, David Brahm, G"oran Brostr"om,
## Patrick Burns, Vince Carey, Saikat DebRoy, Matt Dowle,
## Brian D'Urso, Lyndon Drake, Dirk Eddelbuettel, Claus
## Ekstrom, Sebastian Fischmeister, John Fox, Paul Gilbert,
## Yu Gong, Gabor Grothendieck, Frank E Harrell Jr, Peter M.
## Haverty, Torsten Hothorn, Robert King, Kjetil Kjernsmo,
## Roger Koenker, Philippe Lambert, Jan de Leeuw, Jim Lindsey,
## Patrick Lindsey, Catherine Loader, Gordon Maclean, Arni
## Magnusson, John Maindonald, David Meyer, Ei-ji Nakama,
## Jens Oehlschaegel, Steve Oncley, Richard O'Keefe, Hubert
## Palme, Roger D. Peng, Jose' C. Pinheiro, Tony Plate, Anthony
## Rossini, Jonathan Rougier, Petr Savicky, Guenther Sawitzki,
## Marc Schwartz, Arun Srinivasan, Detlef Steuer, Bill Simpson,
## Gordon Smyth, Adrian Trapletti, Terry Therneau, Rolf Turner,
## Bill Venables, Gregory R. Warnes, Andreas Weingessel, Morten
## Welinder, James Wettenhall, Simon Wood, and Achim Zeileis.
## Others have written code that has been adopted by R and is
## acknowledged in the code files, including
## R would not be what it is today without the invaluable help
##   of these people outside of the R core team, who contributed
##   by donating code, bug fixes and documentation: Valerio
##   Aimale, Suharto Anggono, Thomas Baier, Henrik Bengtsson,
##   Roger Bivand, Ben Bolker, David Brahm, G"oran Brostr"om,
##   Patrick Burns, Vince Carey, Saikat DebRoy, Matt Dowle,
##   Brian D'Urso, Lyndon Drake, Dirk Eddelbuettel, Claus
##   Ekstrom, Sebastian Fischmeister, John Fox, Paul Gilbert,
##   Yu Gong, Gabor Grothendieck, Frank E Harrell Jr, Peter M.
##   Haverty, Torsten Hothorn, Robert King, Kjetil Kjernsmo,
##   Roger Koenker, Philippe Lambert, Jan de Leeuw, Jim Lindsey,
##   Patrick Lindsey, Catherine Loader, Gordon Maclean, Arni
##   Magnusson, John Maindonald, David Meyer, Ei-ji Nakama,
##   Jens Oehlschaegel, Steve Oncley, Richard O'Keefe, Hubert
##   Palme, Roger D. Peng, Jose' C. Pinheiro, Tony Plate, Anthony
##   Rossini, Jonathan Rougier, Petr Savicky, Guenther Sawitzki,
##   Marc Schwartz, Arun Srinivasan, Detlef Steuer, Bill Simpson,
##   Gordon Smyth, Adrian Trapletti, Terry Therneau, Rolf Turner,
##   Bill Venables, Gregory R. Warnes, Andreas Weingessel, Morten
##   Welinder, James Wettenhall, Simon Wood, and Achim Zeileis.
##   Others have written code that has been adopted by R and is
##   acknowledged in the code files, including
## R
##   would
##   not
##   be
##   what
##   it
##   is
##   today
##   without
##   the
##   invaluable
##   help
##   of
##   these
##   people
##   outside
##   of
##   the
##   R
##   core
##   team,
##   who
##   contributed
##   by
##   donating
##   code,
##   bug
##   fixes
##   and
##   documentation:
##   Valerio
##   Aimale,
##   Suharto
##   Anggono,
##   Thomas
##   Baier,
##   Henrik
##   Bengtsson,
##   Roger
##   Bivand,
##   Ben
##   Bolker,
##   David
##   Brahm,
##   G"oran
##   Brostr"om,
##   Patrick
##   Burns,
##   Vince
##   Carey,
##   Saikat
##   DebRoy,
##   Matt
##   Dowle,
##   Brian
##   D'Urso,
##   Lyndon
##   Drake,
##   Dirk
##   Eddelbuettel,
##   Claus
##   Ekstrom,
##   Sebastian
##   Fischmeister,
##   John
##   Fox,
##   Paul
##   Gilbert,
##   Yu
##   Gong,
##   Gabor
##   Grothendieck,
##   Frank
##   E
##   Harrell
##   Jr,
##   Peter
##   M.
##   Haverty,
##   Torsten
##   Hothorn,
##   Robert
##   King,
##   Kjetil
##   Kjernsmo,
##   Roger
##   Koenker,
##   Philippe
##   Lambert,
##   Jan
##   de
##   Leeuw,
##   Jim
##   Lindsey,
##   Patrick
##   Lindsey,
##   Catherine
##   Loader,
##   Gordon
##   Maclean,
##   Arni
##   Magnusson,
##   John
##   Maindonald,
##   David
##   Meyer,
##   Ei-
##   ji
##   Nakama,
##   Jens
##   Oehlschaegel,
##   Steve
##   Oncley,
##   Richard
##   O'Keefe,
##   Hubert
##   Palme,
##   Roger
##   D.
##   Peng,
##   Jose'
##   C.
##   Pinheiro,
##   Tony
##   Plate,
##   Anthony
##   Rossini,
##   Jonathan
##   Rougier,
##   Petr
##   Savicky,
##   Guenther
##   Sawitzki,
##   Marc
##   Schwartz,
##   Arun
##   Srinivasan,
##   Detlef
##   Steuer,
##   Bill
##   Simpson,
##   Gordon
##   Smyth,
##   Adrian
##   Trapletti,
##   Terry
##   Therneau,
##   Rolf
##   Turner,
##   Bill
##   Venables,
##   Gregory
##   R.
##   Warnes,
##   Andreas
##   Weingessel,
##   Morten
##   Welinder,
##   James
##   Wettenhall,
##   Simon
##   Wood,
##   and
##   Achim
##   Zeileis.
##   Others
##   have
##   written
##   code
##   that
##   has
##   been
##   adopted
##   by
##   R
##   and
##   is
##   acknowledged
##   in
##   the
##   code
##   files,
##   including

24.word

Extract words from a sentence.

Usage:
word(string, start = 1L, end = start, sep = fixed(" "))

sep: separator between words. Defaults to single space.
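A sketch of word() with positive and negative positions and a custom separator, assuming stringr is installed. The input strings are illustrative.

```r
library(stringr)

sentences <- c("Jane saw a cat", "Jane sat down")

# Second word of each sentence.
second <- word(sentences, 2)
#> [1] "saw" "sat"

# Negative positions count from the end: -1 is the last word.
last <- word(sentences, -1)
#> [1] "cat"  "down"

# A start and end together select a range of words.
rest <- word(sentences, 2, -1)
#> [1] "saw a cat" "sat down"

# sep can be any separator; here "words" are delimited by two dots,
# so the single dots stay inside each piece.
str <- "abc.def..123.4568.999"
first_piece <- word(str, 1, sep = fixed(".."))
#> [1] "abc.def"
second_piece <- word(str, 2, sep = fixed(".."))
#> [1] "123.4568.999"
```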

Example 1

## [1] "Jane" "Jane"
## [1] "saw" "sat"
## [1] "cat"  "down"
## [1] "saw a cat" "sat down"

Example 2

## [1] "Jane saw a cat" "saw a cat"      "a cat"
## [1] "Jane"           "Jane saw"       "Jane saw a"     "Jane saw a cat"

Example 3 (not understood)

## [1] "abc.def"
## [1] "123.4568.999"
## [1] "abc"

modifier functions

fixed

Compare literal bytes in the string. This is very fast, but not usually what you want for non-ASCII character sets.

Usage:
fixed(pattern, ignore_case = FALSE)

ignore_case: Should case differences be ignored in the match?

coll

Compare strings respecting standard collation rules.

Usage:
coll(pattern, ignore_case = FALSE, locale = "en", ...)

locale: Locale to use for comparisons. See stringi::stri_locale_list() for all possible options. Defaults to “en” (English) to ensure that the default collation is consistent across platforms.

regex

The default. Uses ICU regular expressions.

Usage:
regex(pattern, ignore_case = FALSE, multiline = FALSE, comments = FALSE, dotall = FALSE, ...)

multiline: If TRUE, $ and ^ match the beginning and end of each line. If FALSE, the default, only match the start and end of the input.

comments: If TRUE, white space and comments beginning with # are ignored. Escape literal spaces with "\\ " (a backslash followed by a space in the regular expression).

dotall: If TRUE, . will also match line terminators.
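The three modifiers above change how the same pattern string is interpreted. A minimal sketch, assuming stringr is installed; the pattern and strings are illustrative.

```r
library(stringr)

pattern <- "a.b"
strings <- c("abb", "a.b")

# Default (regex): "." matches any character, so both strings match.
as_regex <- str_detect(strings, pattern)
#> [1] TRUE TRUE

# fixed(): "." is a literal dot, compared byte by byte.
as_fixed <- str_detect(strings, fixed(pattern))
#> [1] FALSE  TRUE

# coll(): also literal, but compared using locale-aware collation rules.
as_coll <- str_detect(strings, coll(pattern))
#> [1] FALSE  TRUE

# regex() exposes options such as case-insensitive matching.
any_case <- str_detect("APPLE", regex("apple", ignore_case = TRUE))
#> [1] TRUE
```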

boundary

Match boundaries between things.

Usage:
boundary(type = c("character", "line_break", "sentence", "word"), skip_word_none = NA, ...)

character
Every character is a boundary.

line_break
Boundaries are places where it is acceptable to have a line break in the current locale.

sentence
The beginnings and ends of sentences are boundaries, using intelligent rules to avoid counting abbreviations.

word
The beginnings and ends of words are boundaries.

skip_word_none: Ignore “words” that don’t contain any characters or numbers - i.e. punctuation. Default NA will skip such “words” only when splitting on word boundaries.
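boundary() is most often used for counting or splitting on words. The sketch below assumes stringr is installed and compares it with splitting on a single space; the words string is illustrative.

```r
library(stringr)

words <- "These are   some words."

# Count words; punctuation and runs of spaces are not counted.
n_words <- str_count(words, boundary("word"))
#> [1] 4

# Splitting on a literal space leaves empty pieces and keeps the full stop.
by_space <- str_split(words, " ")[[1]]
#> [1] "These"  "are"    ""       ""       "some"   "words."

# Splitting on word boundaries returns just the words.
by_word <- str_split(words, boundary("word"))[[1]]
#> [1] "These" "are"   "some"  "words"
```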