1 Data in R

1.1 Modes and Classes

  • The mode function returns the mode of any object in R, and the class function returns its class.

  • The most commonly encountered modes of individual objects are numeric, character, and logical.

  • Matrices and arrays require all of their elements to be of the same mode.

  • Lists and Data Frames allow for multiple modes.

  • The typeof function returns the internal storage type of an object.

  • Factors represent categorical data.

  • The mode of a factor is numeric, because factors are stored internally as integer codes.

  • Date and Time classes : Date, POSIXlt, POSIXct

  • Lists can accommodate objects of different modes and lengths.

myList <- list(a = 1:3, b = c("Hero", "Zero"), factor(x = c("Male", "Female")))

sapply(X = myList, FUN = mode)
##           a           b             
##   "numeric" "character"   "numeric"
sapply(X = myList, FUN = class)
##           a           b             
##   "integer" "character"    "factor"
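The difference between mode, typeof and class is easiest to see side by side. The following is a small sketch; the object names in `objs` are made up for illustration and only base R functions are used.

```r
# Compare mode(), typeof() and class() on a few common objects.
objs <- list(
  int  = 1:3,
  dbl  = c(1.5, 2.5),
  chr  = "Hero",
  lgl  = TRUE,
  fct  = factor(c("Male", "Female")),
  date = as.Date("2012-10-01")
)

info <- data.frame(
  mode   = sapply(objs, mode),
  typeof = sapply(objs, typeof),
  class  = sapply(objs, function(o) class(o)[1]),
  stringsAsFactors = FALSE
)

# One row per object: factors and dates both have mode "numeric",
# but differ in storage type and class.
info
```

Note that the factor is stored as integer codes and the date as a double, even though both report a numeric mode.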

1.2 Data Storage in R

c() function - concatenate or combine

Read through the following examples and focus on the mode values.

x <- c(1, 2, 3, 4, 5)

x
## [1] 1 2 3 4 5
mode(x)
## [1] "numeric"
y <- c(1, 3, "Hero", TRUE)

y
## [1] "1"    "3"    "Hero" "TRUE"
mode(y)
## [1] "character"
z <- c(1, 2, 3, TRUE)
z
## [1] 1 2 3 1
mode(z)
## [1] "numeric"

Notice that when elements of different modes are combined with c, they are all coerced to a single common mode, so the mode of the resulting vector can differ from that of some of its parts.

Conversion Rules :

  • If any element is a character, other elements are converted to character.

  • If there is no character element, but numeric and logical elements are present, then TRUE is coerced to 1 and FALSE to 0.
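The two rules can be checked directly. This is a minimal sketch with made-up variable names; the last line uses the same logical-to-numeric coercion to count TRUE values.

```r
# Logical + numeric: TRUE becomes 1 and FALSE becomes 0.
num_lgl <- c(1.5, TRUE, FALSE)
num_lgl            # 1.5 1.0 0.0
mode(num_lgl)      # "numeric"

# Any character element turns every element into character.
mixed <- c(1, TRUE, "Hero")
mixed              # "1" "TRUE" "Hero"
mode(mixed)        # "character"

# The same coercion lets sum() count how many values are TRUE.
n_true <- sum(c(TRUE, FALSE, TRUE))
n_true             # 2
```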

Elements of a vector can be assigned names, which can then be used as subscripts to access the elements.

Names can be given directly in the call to c, or set afterwards with the names function.

x <- c(one = 1, two = 2, three = 3, four = 4)

x
##   one   two three  four 
##     1     2     3     4
x <- 1:4

names(x) <- c("a", "b", "c", "d")

x
## a b c d 
## 1 2 3 4
x <- 1:100

names(x)[1:3] <- c("a", "b", "c")

x
##    a    b    c <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
##    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15 
## <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
##   16   17   18   19   20   21   22   23   24   25   26   27   28   29   30 
## <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
##   31   32   33   34   35   36   37   38   39   40   41   42   43   44   45 
## <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
##   46   47   48   49   50   51   52   53   54   55   56   57   58   59   60 
## <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
##   61   62   63   64   65   66   67   68   69   70   71   72   73   74   75 
## <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
##   76   77   78   79   80   81   82   83   84   85   86   87   88   89   90 
## <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
##   91   92   93   94   95   96   97   98   99  100
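The examples above set names but do not use them as subscripts. A short sketch of name-based access, reusing the first named vector:

```r
x <- c(one = 1, two = 2, three = 3, four = 4)

x["two"]                       # a single element, with its name kept
picked <- x[c("four", "one")]  # several elements, in the order requested
picked

unname(x["three"])             # the bare value without the name
x[["three"]]                   # [[ ]] also drops the name
```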

If two vectors in an operation are not of the same length, R recycles the shorter vector until it matches the length of the longer one.

x <- 1:10

x + 1
##  [1]  2  3  4  5  6  7  8  9 10 11
x + c(11, 22)
##  [1] 12 24 14 26 16 28 18 30 20 32
# Notice the warning message
x + 1:3
## Warning in x + 1:3: longer object length is not a multiple of shorter
## object length
##  [1]  2  4  6  5  7  9  8 10 12 11

1.2.1 Arrays

  • Multidimensional extensions of vectors

  • All elements have the same mode
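As a quick illustration, a three-dimensional array can be built from a vector with the array function. The name `arr` is just a placeholder; elements are filled with the first index varying fastest.

```r
# A 2 x 3 x 4 array built from the integers 1 to 24.
arr <- array(1:24, dim = c(2, 3, 4))

dim(arr)        # 2 3 4
mode(arr)       # "numeric": one mode for every element
arr[1, 2, 1]    # 3
arr[2, 3, 4]    # 24, the last element
```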

1.2.2 Matrices

  • Matrices are stored internally as vectors, with the columns of the matrix stacked on top of each other.

  • The matrix function converts a vector to a matrix.

  • The nrow= and ncol= arguments to matrix specify the number of rows and columns, respectively. If only one of these arguments is given, the other will be calculated based on the length of the input data.

  • Since matrices are internally stored by columns, matrix assumes that the input vector will be converted to a matrix by columns; the byrow=TRUE argument can be used to override this in the more common case where the matrix needs to be read in by rows.

  • The class of a matrix is reported as matrix (from R 4.0.0 onwards, as both matrix and array).

  • dim - Retrieve or set the dimension of an object.

  • nrow and ncol return the number of rows or columns of a matrix, array, or data frame. NROW and NCOL do the same, but treat a vector as a 1-column matrix.

  • Names can be assigned to the rows and/or columns of a matrix through the dimnames= argument of the matrix function, or after the fact through the dimnames or row.names assignment functions.

  • Since the number of rows and columns of a matrix need not be the same, the value of dimnames must be a list; the first element is a vector of names for the rows, and the second is a vector of names for the columns.

  • To provide names for just one dimension of a matrix, use a value of NULL for the dimension for which you don’t wish to provide names.
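A few of these points in a compact sketch (variable names are illustrative only): the default column-wise fill versus byrow = TRUE, naming a single dimension with NULL, and the difference between nrow and NROW on a plain vector.

```r
# Default fill is column by column; byrow = TRUE fills row by row.
byCol <- matrix(1:6, nrow = 2)
byRow <- matrix(1:6, nrow = 2, byrow = TRUE)
byCol[1, ]    # 1 3 5
byRow[1, ]    # 1 2 3

# Name only the columns; NULL leaves the rows unnamed.
named <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))
named[, "b"]  # 3 4

# A plain vector has no dim attribute, so nrow() gives NULL,
# while NROW() and NCOL() treat it as a 1-column matrix.
nrow(1:6)     # NULL
NROW(1:6)     # 6
NCOL(1:6)     # 1
```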

myMatrix <- matrix(data = 1:20, nrow = 4, ncol = 5, byrow = TRUE, dimnames = list(c("Row 1", "Row 2", "Row 3", "Row 4"), c("Col 1", "Col 2", "Col 3", "Col 4", "Col 5")))

myMatrix
##       Col 1 Col 2 Col 3 Col 4 Col 5
## Row 1     1     2     3     4     5
## Row 2     6     7     8     9    10
## Row 3    11    12    13    14    15
## Row 4    16    17    18    19    20
myList <- list("Hello", "Honda", TRUE, c(1, 2, 3, 4), rnorm(n = 10))

myList
## [[1]]
## [1] "Hello"
## 
## [[2]]
## [1] "Honda"
## 
## [[3]]
## [1] TRUE
## 
## [[4]]
## [1] 1 2 3 4
## 
## [[5]]
##  [1]  0.73400033  0.68787261 -0.79371801 -1.26020268 -0.01372998
##  [6]  0.36396849  0.75590310 -0.36167884 -0.84678344 -0.09927101
sapply(X = myList, FUN = mode)
## [1] "character" "character" "logical"   "numeric"   "numeric"

List elements can also be named.

myList <- list(a = "A", b = c(2, 3, 4, 1), c = TRUE)

myList
## $a
## [1] "A"
## 
## $b
## [1] 2 3 4 1
## 
## $c
## [1] TRUE
myList <- list("Hello", "King", 1, 2, 3)

names(myList) <- letters[1:5]

myList
## $a
## [1] "Hello"
## 
## $b
## [1] "King"
## 
## $c
## [1] 1
## 
## $d
## [1] 2
## 
## $e
## [1] 3

1.2.3 Data Frame

  • Think of a data frame as a list in which every element has the same length

  • Mode of a data frame is list but class is data.frame
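A small sketch that checks both statements on a made-up data frame (the column names are placeholders):

```r
df <- data.frame(
  id   = 1:3,
  name = c("a", "b", "c"),
  ok   = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

mode(df)            # "list"
class(df)           # "data.frame"
is.list(df)         # TRUE

# Each column is one list element, and all have the same length.
sapply(df, length)  # 3 3 3
sapply(df, class)   # columns may still differ in class
```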

1.3 Testing for Modes and Classes

Functions beginning with is. can be used to test if an object is of a particular type.

Examples : is.list, is.factor, is.numeric, is.data.frame, and is.character.

methods function - List all available methods for an S3 generic function, or all methods for a class.
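A few of these tests in action, as a rough sketch. Note that is.numeric returns FALSE for a factor even though its mode is numeric. The output of methods depends on which packages are loaded, so none is shown for it.

```r
f <- factor(c("a", "b", "a"))

is.factor(f)             # TRUE
is.numeric(f)            # FALSE, although mode(f) is "numeric"
is.character("a")        # TRUE
is.list(data.frame())    # TRUE: a data frame is a list
is.data.frame(list())    # FALSE

# S3 methods available for the "factor" class in this session.
methods(class = "factor")
```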

1.4 Structure of R Objects

summary function - a generic function used to produce result summaries of the results of various model fitting functions.

str function - Compactly displays the structure of an arbitrary R object.

myList <- list(1:3, letters[1:3], rnorm(3))

summary(myList)
##      Length Class  Mode     
## [1,] 3      -none- numeric  
## [2,] 3      -none- character
## [3,] 3      -none- numeric
nestlist = list(a = list(matrix(rnorm(10),5,2),val = 3),b = list(sample(letters,10),values = runif(5)), c = list(list(1:10,1:20),list(1:5,1:10)))

summary(nestlist)
##   Length Class  Mode
## a 2      -none- list
## b 2      -none- list
## c 2      -none- list
str(nestlist)
## List of 3
##  $ a:List of 2
##   ..$    : num [1:5, 1:2] -2.148 -0.433 -0.583 0.588 -0.641 ...
##   ..$ val: num 3
##  $ b:List of 2
##   ..$       : chr [1:10] "q" "x" "z" "f" ...
##   ..$ values: num [1:5] 0.0331 0.4031 0.4118 0.1633 0.1677
##  $ c:List of 2
##   ..$ :List of 2
##   .. ..$ : int [1:10] 1 2 3 4 5 6 7 8 9 10
##   .. ..$ : int [1:20] 1 2 3 4 5 6 7 8 9 10 ...
##   ..$ :List of 2
##   .. ..$ : int [1:5] 1 2 3 4 5
##   .. ..$ : int [1:10] 1 2 3 4 5 6 7 8 9 10

The number of elements displayed for each component is controlled by the vec.len= argument, and can be set to 0 to suppress any values being printed; the depth of levels displayed for each object is controlled by the max.level= argument, which defaults to NA, meaning to display whatever depth of levels is actually encountered in the object.
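To see the effect of these two arguments, here is a sketch on a small nested list. The list `nested` is invented for this example and is not the nestlist object used earlier.

```r
nested <- list(
  a = list(x = 1:10, y = letters),
  b = list(list(1, 2), 3)
)

# Full structure, every level expanded.
str(nested)

# Only the top level: inner lists are summarised, not expanded.
str(nested, max.level = 1)

# All levels, but without printing any of the values.
str(nested, vec.len = 0)

# capture.output() is used here only to compare how much is printed.
full  <- capture.output(str(nested))
short <- capture.output(str(nested, max.level = 1))
length(short) < length(full)   # TRUE
```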

1.5 Conversion of Objects

Conversion routines begin with as.

table uses the cross-classifying factors to build a contingency table of the counts at each combination of factor levels.

numbers <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5, 6, 5, 6, 7, 8, 9)

tab <- table(numbers)

names(tab)
## [1] "1" "2" "3" "4" "5" "6" "7" "8" "9"
sum(as.numeric(names(tab)))
## [1] 45

Note - as. forms for many types of objects behave very differently than the function which bears the type’s name.

x <- 1:10

list(x)
## [[1]]
##  [1]  1  2  3  4  5  6  7  8  9 10
as.list(x)
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 2
## 
## [[3]]
## [1] 3
## 
## [[4]]
## [1] 4
## 
## [[5]]
## [1] 5
## 
## [[6]]
## [1] 6
## 
## [[7]]
## [1] 7
## 
## [[8]]
## [1] 8
## 
## [[9]]
## [1] 9
## 
## [[10]]
## [1] 10

1.6 Missing Values

NA - represents a missing value in R.

The generic function is.na indicates which elements are missing.

Inf and -Inf are positive and negative infinity, whereas NaN means ‘Not a Number’.
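The special values can be produced and told apart as in the following sketch. Note that is.na is TRUE for both NA and NaN, while is.nan is TRUE only for NaN.

```r
x <- c(1, NA, 1/0, -1/0, 0/0)
x               # 1 NA Inf -Inf NaN

is.na(x)        # FALSE  TRUE FALSE FALSE  TRUE
is.nan(x)       # FALSE FALSE FALSE FALSE  TRUE
is.infinite(x)  # FALSE FALSE  TRUE  TRUE FALSE
is.finite(x)    #  TRUE FALSE FALSE FALSE FALSE
```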

1.7 Working with Missing Values

The na.rm= argument can be set to TRUE to remove missing values from summary calculations such as mean, sum, etc.

x <- c(1, 2, 3, NA, NA, 2, 3, 4, NA)

x
## [1]  1  2  3 NA NA  2  3  4 NA
x[is.na(x)]
## [1] NA NA NA
x[!is.na(x)]
## [1] 1 2 3 2 3 4
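The effect of na.rm= on the same vector can be sketched as follows:

```r
x <- c(1, 2, 3, NA, NA, 2, 3, 4, NA)

mean(x)                 # NA: any missing value makes the result missing
mean(x, na.rm = TRUE)   # 2.5
sum(x, na.rm = TRUE)    # 15
sum(is.na(x))           # 3 missing values
```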

The statistical modeling functions (lm, glm, gam, etc.) all have an argument called na.action=, which allows you to specify a function that will be applied to the data frame specified by the data= argument before the modeling function processes the data.

na.fail returns the object if it does not contain any missing values, and signals an error otherwise. na.omit returns the object with incomplete cases removed. na.pass returns the object unchanged.

complete.cases - Returns a logical vector indicating which cases are complete, i.e., have no missing values.

Normally, missing values are not included when a variable is made into a factor; if you want the missing values to be considered a valid factor level, use the exclude=NULL argument to factor when the factor is first created.
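The following sketch pulls these pieces together on a tiny made-up data frame. The column names and values are invented for illustration; the functions are all from base R and the stats package.

```r
df <- data.frame(x = c(1, 2, NA, 4), y = c(2.1, NA, 6.2, 7.9))

complete.cases(df)        # TRUE FALSE FALSE TRUE
cleaned <- na.omit(df)    # keeps rows 1 and 4 only
nrow(cleaned)             # 2

# na.action is applied to the data before the model is fitted.
fit <- lm(y ~ x, data = df, na.action = na.omit)
nobs(fit)                 # 2 observations actually used

# na.fail signals an error as soon as a missing value is found.
res <- try(lm(y ~ x, data = df, na.action = na.fail), silent = TRUE)
inherits(res, "try-error")  # TRUE

# Missing values as a factor level.
g <- c("a", NA, "b", "a")
levels(factor(g))                  # "a" "b"
levels(factor(g, exclude = NULL))  # "a" "b" NA
```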

na.strings= argument of read.table can be passed a vector of character values that should be treated as missing values.
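As a sketch, suppose a file codes missing scores as -999 or as a single dot. The data and codes below are invented, and the text= argument of read.table stands in for a real file.

```r
raw <- "id score
1 10
2 -999
3 .
4 25"

dat <- read.table(text = raw, header = TRUE,
                  na.strings = c("NA", "-999", "."))

dat$score               # 10 NA NA 25
is.numeric(dat$score)   # TRUE: the column is still read as numbers
```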

2 Reading and Writing Data

2.1 Reading Vectors and Matrices

scan function - Reads data into a vector or list from the console or file. It is most appropriate when all the data to be read have the same mode.

scan(file = "FileToScan1.txt", what = "character")
## [1] "Hello" "Bye"   "Sigh"  "Why"   "Die"   "Tie"
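# Note: scan uses the type of each element of what, not its value, so the
# string "numeric" is read as character; use numeric() or 0 to read numbers.
scan(file = "FileToScan2.txt", what = list(col1 = "numeric", col2 = "character", col3 = "numeric"), multi.line = FALSE)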
## $col1
## [1] "1" "2" "3"
## 
## $col2
## [1] "Hello" "Bye"   "Die"  
## 
## $col3
## [1] "2" "3" "4"
x <- scan(file = "FileToScan3.txt")

x
##  [1] 1 2 3 3 4 5 4 5 6 5 6 7 6 7 8
matrix(x, ncol = 3, byrow = TRUE)
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    3    4    5
## [3,]    4    5    6
## [4,]    5    6    7
## [5,]    6    7    8
# I want to read only 1st and 3rd columns and skip reading the 2nd column

dat <- scan(file = "FileToScan2.txt", what = list(firstCol = "numeric", secondCol = NULL, thirdCol = "numeric"))

dat
## $firstCol
## [1] "1" "2" "3"
## 
## $secondCol
## NULL
## 
## $thirdCol
## [1] "2" "3" "4"
# You can see that a NULL is introduced. To remove it, do the following

dat <- cbind(dat$firstCol, dat$thirdCol)

dat
##      [,1] [,2]
## [1,] "1"  "2" 
## [2,] "2"  "3" 
## [3,] "3"  "4"
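In the outputs above the numeric columns come back as character strings, because scan looks at the type of each element of what and "numeric" is itself a string. A sketch of a typed version follows. It assumes the file contains the three lines shown in `txt` (inferred from the output above, not from the file itself) and uses the text= argument of scan in place of the file.

```r
# Assumed contents of FileToScan2.txt, one record per line.
txt <- c("1 Hello 2", "2 Bye 3", "3 Die 4")

# numeric() marks a numeric field, NULL skips a field.
dat <- scan(text = txt,
            what = list(firstCol = numeric(), secondCol = NULL,
                        thirdCol = numeric()),
            quiet = TRUE)

# Drop the NULL placeholder and bind the remaining columns.
dat <- dat[!sapply(dat, is.null)]
m <- do.call(cbind, dat)
m   # a 3 x 2 numeric matrix with columns firstCol and thirdCol
```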

2.2 Data Frames : read.table

read.table - Reads a file in table format and creates a data frame from it, with cases corresponding to lines and variables to fields in the file. Read the help file ?read.table for more information.

2.3 Comma and Tab Delimited Input Files

Read the help files for read.csv, read.csv2, read.delim. Precede the function name with ? while seeking help.
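The three functions differ mainly in their default separator and decimal mark. Here is a sketch with invented data, using the text= argument instead of real files; all three calls give the same data frame.

```r
# Comma separated, dot as decimal mark.
csv  <- "name,score\nAsha,9.5\nBen,7"
a <- read.csv(text = csv)

# Tab separated.
tsv  <- "name\tscore\nAsha\t9.5\nBen\t7"
b <- read.delim(text = tsv)

# Semicolon separated, comma as decimal mark.
csv2 <- "name;score\nAsha;9,5\nBen;7"
d <- read.csv2(text = csv2)

a$score   # 9.5 7.0
all.equal(a$score, b$score)   # TRUE
all.equal(a$score, d$score)   # TRUE
```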

2.4 Fixed Width Input Files

Read the help file - ?read.fwf for detailed information.
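As a minimal sketch, read.fwf splits each line at fixed character positions given by widths=. The records and column names below are made up, and a textConnection stands in for a file.

```r
# Each record: 2 characters of code, 3 digits, 1 flag character.
tc <- textConnection(c("AB123X", "CD456Y"))

fw <- read.fwf(tc, widths = c(2, 3, 1),
               col.names = c("code", "num", "flag"))
close(tc)

fw
fw$num   # 123 456
```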

2.5 Extracting Data from R Objects

Method dispatch allows R to examine the class of the arguments to a function, and to invoke a special version of the function designed for that class of object.

Generic Functions - provide Method Dispatch

Inheritance allows developers to create new classes that are similar to other classes and only methods that differ from the original class need to be provided.

When an object in R inherits the properties of an already defined object, its class attribute will be a vector containing the object’s class (in the first position), along with the classes from which it inherits.
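A toy S3 example may make dispatch and inheritance more concrete. The generic `describe` and the classes "dog", "cat" and "animal" are hypothetical names invented for this sketch.

```r
# A generic function: UseMethod() performs the dispatch.
describe <- function(x, ...) UseMethod("describe")

describe.default <- function(x, ...) "some object"
describe.animal  <- function(x, ...) "an animal"
# The dog method adds to the inherited one via NextMethod().
describe.dog     <- function(x, ...) paste("a dog, which is", NextMethod())

rex <- structure(list(name = "Rex"), class = c("dog", "animal"))
tom <- structure(list(name = "Tom"), class = c("cat", "animal"))

class(rex)               # "dog" "animal": own class first
inherits(rex, "animal")  # TRUE

describe(rex)   # "a dog, which is an animal"
describe(tom)   # "an animal": no cat method, so the parent's is used
describe(1:3)   # "some object"
```

Only the dog class needed its own method; the cat class simply falls back on the method for the class it inherits from.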

The apropos function, which searches for objects by name, can be used to find the available methods for a given S3 class, since S3 methods are conventionally named generic.class.

apropos(".*\\.lm$")
## [1] "as.lm"           "confint.lm"      "dummy.coef.lm"   "kappa.lm"       
## [5] "model.matrix.lm" "predict.lm"      "residuals.lm"    "summary.lm"

With S4 classes, generic functions can be identified by the presence of a call to the standardGeneric function inside the generic function’s definition.
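A small S4 sketch shows where the standardGeneric call appears. The class "Circle" and the generic `area` are hypothetical and defined only for this example.

```r
library(methods)

# A class with one numeric slot.
setClass("Circle", representation(r = "numeric"))

# The generic's body is just a call to standardGeneric().
setGeneric("area", function(shape) standardGeneric("area"))

# A method for the Circle class.
setMethod("area", "Circle", function(shape) pi * shape@r^2)

area                              # printing shows standardGeneric("area")
isGeneric("area")                 # TRUE
existsMethod("area", "Circle")    # TRUE
unit_area <- area(new("Circle", r = 1))
unit_area                         # 3.141593
```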

library(methods)

showMethods(classes = "mle")
## 
## Function "doPreJobValidation":
##  <not an S4 generic function>
## 
## Function "rxCompleteClusterJob":
##  <not an S4 generic function>
## 
## Function "rxGetHadoopJobId":
##  <not an S4 generic function>
## 
## Function "rxGetJobState":
##  <not an S4 generic function>
## 
## Function "rxOpenClass":
##  <not an S4 generic function>
## 
## Function "rxOpenType":
##  <not an S4 generic function>
## 
## Function "rxPrepareClusterJob":
##  <not an S4 generic function>
## 
## Function "rxPrintJobOutputFile":
##  <not an S4 generic function>
## 
## Function "rxReadAll":
##  <not an S4 generic function>
## 
## Function "rxRemoveClusterDataBlob":
##  <not an S4 generic function>
## 
## Function "rxStartClusterJob":
##  <not an S4 generic function>
## 
## Function "rxSyncJobResultsBlob":
##  <not an S4 generic function>

Later on I’ll post a tutorial on Object Oriented Programming in R which will dive into more detail.

2.6 Connections

Connections provide a flexible way for R to read data from a variety of sources, providing more complete control over the nature of the connection than simply specifying a file name as input to functions like read.table and scan.

file - files on the local system

pipe - output from a command

textConnection - treats text as a file

gzfile - local gzipped file

unz - local zip archive

bzfile - local bzipped file

url - remote file read via http

socketConnection - socket for client / server programs

rpage <- url("http://www.r-project.org","r")

rpage
##                description                      class 
## "http://www.r-project.org"                      "url" 
##                       mode                       text 
##                        "r"                     "text" 
##                     opened                   can read 
##                   "opened"                      "yes" 
##                  can write 
##                       "no"
lines = readLines(con = rpage, n = 5)
lines
## [1] "<!DOCTYPE html>"                                              
## [2] "<html lang=\"en\">"                                           
## [3] "  <head>"                                                     
## [4] "    <meta charset=\"utf-8\">"                                 
## [5] "    <meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\">"
close(rpage)

# Example of textConnection

tc <- textConnection(object = "2012-10-01 1 2
                     2012-12-13 2 3
                     2012-11-14 3 4")

tab <- read.table(file = tc, colClasses = c("Date", NA, NA))

class(tab$V1)
## [1] "Date"
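An open connection remembers its position, which is part of the extra control mentioned above. The following sketch writes a small gzipped file to a temporary location and reads it back in two steps. The file contents are invented for illustration.

```r
tmp <- tempfile(fileext = ".gz")

# Write three lines through a gzip connection.
con <- gzfile(tmp, open = "w")
writeLines(c("id,value", "1,10", "2,20"), con)
close(con)

# Read it back: the second readLines() continues where the first stopped.
con <- gzfile(tmp, open = "r")
header <- readLines(con, n = 1)
rest   <- readLines(con)
close(con)
unlink(tmp)

header   # "id,value"
rest     # "1,10" "2,20"
```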

2.7 Reading Large Data Files

I’ll cover this part in my Big Data Analytics with R Tutorial.

2.8 Generating Data

2.8.1 Sequences

1:10
##  [1]  1  2  3  4  5  6  7  8  9 10
seq(1, 15)
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
seq(0, 12, 2)
## [1]  0  2  4  6  8 10 12
seq(12, 0, -2)
## [1] 12 10  8  6  4  2  0
seq(from = 100, by = 100, length.out = 10)
##  [1]  100  200  300  400  500  600  700  800  900 1000

gl function - Generate factors by specifying the pattern of their levels.

myLevels <- data.frame(group = gl(3,10,length = 30),
                       subgroup = gl(5,2,length = 30),
                       obsv = gl(2,1,length = 30, ordered = TRUE))

head(myLevels)
##   group subgroup obsv
## 1     1        1    1
## 2     1        1    2
## 3     1        2    1
## 4     1        2    2
## 5     1        3    1
## 6     1        3    2

expand.grid function - Create a data frame from all combinations of the supplied vectors or factors

oddEvenDF = expand.grid(odd = seq(1,5,by = 2),even = seq(2,5,by = 2))

oddEvenDF
##   odd even
## 1   1    2
## 2   3    2
## 3   5    2
## 4   1    4
## 5   3    4
## 6   5    4

2.8.2 Random Numbers

List of Function Names and corresponding Distributions:

  • rbeta - Beta
  • rbinom - Binomial
  • rcauchy - Cauchy
  • rchisq - Chi-squared
  • rexp - Exponential
  • rf - F
  • rgamma - Gamma
  • rgeom - Geometric
  • rhyper - Hypergeometric
  • rlnorm - Log Normal
  • rlogis - Logistic
  • rmultinom - Multinomial
  • rnbinom - Negative Binomial
  • rnorm - Normal
  • rpois - Poisson
  • rsignrank - Signed Rank
  • rt - Student’s t
  • runif - Uniform
  • rweibull - Weibull
  • rwilcox - Wilcoxon Rank Sum

Also read ?set.seed help file.
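In short, setting the seed makes random draws reproducible. A sketch follows; the seed value 123 is arbitrary.

```r
set.seed(123)
a <- rnorm(3)

set.seed(123)
b <- rnorm(3)

identical(a, b)   # TRUE: same seed, same numbers

# Without resetting the seed, the next draw continues the stream.
d <- rnorm(3)
identical(a, d)   # FALSE
```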

2.9 Permutations

2.9.1 Random Permutations

sample function - takes a sample of the specified size from the elements of x, either with or without replacement.
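A few typical calls, as a sketch. The seed is set only to make the run repeatable, and the actual values drawn are not shown because they depend on the R version's sampling algorithm.

```r
set.seed(1)

# With no size given, sample() returns a random permutation of x.
perm <- sample(1:10)
perm

# A subset of three distinct elements (without replacement).
three <- sample(1:10, size = 3)
three

# Sampling with replacement allows repeats, e.g. five coin tosses.
tosses <- sample(c("H", "T"), size = 5, replace = TRUE)
tosses
```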

2.9.2 Enumerating All Permutations

permn function - Generates all permutations of the elements of x, in a minimal-change order. If x is a positive integer, returns all permutations of the elements of seq(x). If argument “fun” is not null, applies a function given by the argument to each point. “…” are passed unchanged to the function given by argument fun, if any.

library(combinat)
## 
## Attaching package: 'combinat'
## 
## The following object is masked from 'package:utils':
## 
##     combn
permn(1:5)
## [[1]]
## [1] 1 2 3 4 5
## 
## [[2]]
## [1] 1 2 3 5 4
## 
## [[3]]
## [1] 1 2 5 3 4
## 
## [[4]]
## [1] 1 5 2 3 4
## 
## [[5]]
## [1] 5 1 2 3 4
## 
## [[6]]
## [1] 5 1 2 4 3
## 
## [[7]]
## [1] 1 5 2 4 3
## 
## [[8]]
## [1] 1 2 5 4 3
## 
## [[9]]
## [1] 1 2 4 5 3
## 
## [[10]]
## [1] 1 2 4 3 5
## 
## [[11]]
## [1] 1 4 2 3 5
## 
## [[12]]
## [1] 1 4 2 5 3
## 
## [[13]]
## [1] 1 4 5 2 3
## 
## [[14]]
## [1] 1 5 4 2 3
## 
## [[15]]
## [1] 5 1 4 2 3
## 
## [[16]]
## [1] 5 4 1 2 3
## 
## [[17]]
## [1] 4 5 1 2 3
## 
## [[18]]
## [1] 4 1 5 2 3
## 
## [[19]]
## [1] 4 1 2 5 3
## 
## [[20]]
## [1] 4 1 2 3 5
## 
## [[21]]
## [1] 4 1 3 2 5
## 
## [[22]]
## [1] 4 1 3 5 2
## 
## [[23]]
## [1] 4 1 5 3 2
## 
## [[24]]
## [1] 4 5 1 3 2
## 
## [[25]]
## [1] 5 4 1 3 2
## 
## [[26]]
## [1] 5 1 4 3 2
## 
## [[27]]
## [1] 1 5 4 3 2
## 
## [[28]]
## [1] 1 4 5 3 2
## 
## [[29]]
## [1] 1 4 3 5 2
## 
## [[30]]
## [1] 1 4 3 2 5
## 
## [[31]]
## [1] 1 3 4 2 5
## 
## [[32]]
## [1] 1 3 4 5 2
## 
## [[33]]
## [1] 1 3 5 4 2
## 
## [[34]]
## [1] 1 5 3 4 2
## 
## [[35]]
## [1] 5 1 3 4 2
## 
## [[36]]
## [1] 5 1 3 2 4
## 
## [[37]]
## [1] 1 5 3 2 4
## 
## [[38]]
## [1] 1 3 5 2 4
## 
## [[39]]
## [1] 1 3 2 5 4
## 
## [[40]]
## [1] 1 3 2 4 5
## 
## [[41]]
## [1] 3 1 2 4 5
## 
## [[42]]
## [1] 3 1 2 5 4
## 
## [[43]]
## [1] 3 1 5 2 4
## 
## [[44]]
## [1] 3 5 1 2 4
## 
## [[45]]
## [1] 5 3 1 2 4
## 
## [[46]]
## [1] 5 3 1 4 2
## 
## [[47]]
## [1] 3 5 1 4 2
## 
## [[48]]
## [1] 3 1 5 4 2
## 
## [[49]]
## [1] 3 1 4 5 2
## 
## [[50]]
## [1] 3 1 4 2 5
## 
## [[51]]
## [1] 3 4 1 2 5
## 
## [[52]]
## [1] 3 4 1 5 2
## 
## [[53]]
## [1] 3 4 5 1 2
## 
## [[54]]
## [1] 3 5 4 1 2
## 
## [[55]]
## [1] 5 3 4 1 2
## 
## [[56]]
## [1] 5 4 3 1 2
## 
## [[57]]
## [1] 4 5 3 1 2
## 
## [[58]]
## [1] 4 3 5 1 2
## 
## [[59]]
## [1] 4 3 1 5 2
## 
## [[60]]
## [1] 4 3 1 2 5
## 
## [[61]]
## [1] 4 3 2 1 5
## 
## [[62]]
## [1] 4 3 2 5 1
## 
## [[63]]
## [1] 4 3 5 2 1
## 
## [[64]]
## [1] 4 5 3 2 1
## 
## [[65]]
## [1] 5 4 3 2 1
## 
## [[66]]
## [1] 5 3 4 2 1
## 
## [[67]]
## [1] 3 5 4 2 1
## 
## [[68]]
## [1] 3 4 5 2 1
## 
## [[69]]
## [1] 3 4 2 5 1
## 
## [[70]]
## [1] 3 4 2 1 5
## 
## [[71]]
## [1] 3 2 4 1 5
## 
## [[72]]
## [1] 3 2 4 5 1
## 
## [[73]]
## [1] 3 2 5 4 1
## 
## [[74]]
## [1] 3 5 2 4 1
## 
## [[75]]
## [1] 5 3 2 4 1
## 
## [[76]]
## [1] 5 3 2 1 4
## 
## [[77]]
## [1] 3 5 2 1 4
## 
## [[78]]
## [1] 3 2 5 1 4
## 
## [[79]]
## [1] 3 2 1 5 4
## 
## [[80]]
## [1] 3 2 1 4 5
## 
## [[81]]
## [1] 2 3 1 4 5
## 
## [[82]]
## [1] 2 3 1 5 4
## 
## [[83]]
## [1] 2 3 5 1 4
## 
## [[84]]
## [1] 2 5 3 1 4
## 
## [[85]]
## [1] 5 2 3 1 4
## 
## [[86]]
## [1] 5 2 3 4 1
## 
## [[87]]
## [1] 2 5 3 4 1
## 
## [[88]]
## [1] 2 3 5 4 1
## 
## [[89]]
## [1] 2 3 4 5 1
## 
## [[90]]
## [1] 2 3 4 1 5
## 
## [[91]]
## [1] 2 4 3 1 5
## 
## [[92]]
## [1] 2 4 3 5 1
## 
## [[93]]
## [1] 2 4 5 3 1
## 
## [[94]]
## [1] 2 5 4 3 1
## 
## [[95]]
## [1] 5 2 4 3 1
## 
## [[96]]
## [1] 5 4 2 3 1
## 
## [[97]]
## [1] 4 5 2 3 1
## 
## [[98]]
## [1] 4 2 5 3 1
## 
## [[99]]
## [1] 4 2 3 5 1
## 
## [[100]]
## [1] 4 2 3 1 5
## 
## [[101]]
## [1] 4 2 1 3 5
## 
## [[102]]
## [1] 4 2 1 5 3
## 
## [[103]]
## [1] 4 2 5 1 3
## 
## [[104]]
## [1] 4 5 2 1 3
## 
## [[105]]
## [1] 5 4 2 1 3
## 
## [[106]]
## [1] 5 2 4 1 3
## 
## [[107]]
## [1] 2 5 4 1 3
## 
## [[108]]
## [1] 2 4 5 1 3
## 
## [[109]]
## [1] 2 4 1 5 3
## 
## [[110]]
## [1] 2 4 1 3 5
## 
## [[111]]
## [1] 2 1 4 3 5
## 
## [[112]]
## [1] 2 1 4 5 3
## 
## [[113]]
## [1] 2 1 5 4 3
## 
## [[114]]
## [1] 2 5 1 4 3
## 
## [[115]]
## [1] 5 2 1 4 3
## 
## [[116]]
## [1] 5 2 1 3 4
## 
## [[117]]
## [1] 2 5 1 3 4
## 
## [[118]]
## [1] 2 1 5 3 4
## 
## [[119]]
## [1] 2 1 3 5 4
## 
## [[120]]
## [1] 2 1 3 4 5
factorial(x = 1:5)
## [1]   1   2   6  24 120

2.10 Working with Sequences

x <- c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, NA, NA, 11)

table(x)
## x
##  1  2  3  4  5  6  7  8  9 10 11 
##  2  2  2  2  2  1  1  1  1  1  1
unique(x)
##  [1]  1  2  3  4  5  6  7  8  9 10 NA 11
duplicated(x)
##  [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
## [12] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
!duplicated(x)
##  [1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE
## [12]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE

rle - run length encoding

sequence <- sample(1:10, size = 20, replace = TRUE)

sequence
##  [1]  6  6  7  3  1  3 10  9 10  9  9  5 10  4  8  5  5  1 10  8
rle(sequence)
## Run Length Encoding
##   lengths: int [1:17] 2 1 1 1 1 1 1 1 2 1 ...
##   values : int [1:17] 6 7 3 1 3 10 9 10 9 5 ...
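Because the sequence above is random, here is a deterministic sketch of the same ideas. It shows that rle records each run's length and value, that inverse.rle rebuilds the original vector, and that !duplicated can be used as a subscript to reproduce unique.

```r
x <- c(1, 1, 2, 2, 2, 3, 1, 1)

r <- rle(x)
r$lengths   # 2 3 1 2
r$values    # 1 2 3 1

# inverse.rle() reverses the encoding.
identical(inverse.rle(r), x)      # TRUE

# Longest run: its length and the value it repeats.
max(r$lengths)                    # 3
r$values[which.max(r$lengths)]    # 2

# Keeping the first occurrence of each value is the same as unique().
x[!duplicated(x)]                 # 1 2 3
```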