There are a number of operators that can be used to extract subsets of R objects.
[ always returns an object of the same class as the original and can be used to select more than one element (the one exception is matrix subsetting, where dimensions are dropped by default)
[[ is used to extract elements of a list or a data frame; it can only extract a single element, and the class of the returned object will not necessarily be a list or data frame
$ is used to extract elements of a list or data frame by name; its semantics are similar to those of [[.
x <- c("a", "b", "c", "c", "d", "a")
x[1]
## [1] "a"
x[1:4]
## [1] "a" "b" "c" "c"
x[x > "a"]
## [1] "b" "c" "c" "d"
u <- x > "a"
u
## [1] FALSE TRUE TRUE TRUE TRUE FALSE
x[u]
## [1] "b" "c" "c" "d"
x <- list(foo = 1:4, bar = 0.6)
x[1]
## $foo
## [1] 1 2 3 4
x[[1]]
## [1] 1 2 3 4
x$bar
## [1] 0.6
x[["bar"]]
## [1] 0.6
x["bar"]
## $bar
## [1] 0.6
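The examples above use a vector and a list; the same rules carry over to data frames. The sketch below uses a small, hypothetical data frame `df` (not part of the original notes) to show that single brackets keep the data frame class, while double brackets and the dollar sign return the column itself.

```r
# Hypothetical data frame used only for illustration
df <- data.frame(id = 1:3, score = c(0.5, 0.8, 0.9))

sub_single <- df["score"]    # single brackets keep the data frame class
sub_double <- df[["score"]]  # double brackets extract the column itself
sub_dollar <- df$score       # same result as df[["score"]]

class(sub_single)
## [1] "data.frame"
class(sub_double)
## [1] "numeric"
identical(sub_double, sub_dollar)
## [1] TRUE
```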
To extract multiple elements of a list, you must use single brackets.
x <- list(foo = 1:4, bar = 0.6, baz = "hello")
x[c(1,3)]
## $foo
## [1] 1 2 3 4
##
## $baz
## [1] "hello"
The [[ operator can be used with computed indices; $ can only be used with literal names.
x <- list(foo = 1:4, bar = 0.6, baz = "hello")
name <- "foo"
x[[name]] ## computed index for ‘foo’
## [1] 1 2 3 4
x$name ## element ‘name’ doesn’t exist!
## NULL
x$foo ## element ‘foo’ does exist
## [1] 1 2 3 4
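Computed indices matter most when the name is only known at run time. As a minimal sketch (the variable names `nm` and `lens` are illustrative assumptions), the loop below looks up each element by a name held in a variable, which is something `$` cannot do.

```r
x <- list(foo = 1:4, bar = 0.6, baz = "hello")

# Look up each element by a name stored in the variable nm
lens <- integer(0)
for (nm in names(x)) {
  lens[nm] <- length(x[[nm]])   # x$nm would look for an element called "nm"
}
lens
## foo bar baz 
##   4   1   1 
```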
The [[ operator can take an integer sequence, in which case it indexes recursively into nested elements.
x <- list(a = list(10, 12, 14), b = c(3.14, 2.81))
x[[c(1, 3)]]
## [1] 14
x[[1]][[3]]
## [1] 14
x[[c(2,1)]]
## [1] 3.14
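Recursive indexing with [[ also works with a character vector of names. The sketch below assumes a hypothetical nested list `cfg` (not from the original notes) and checks that the one-call form matches the chained form.

```r
# Hypothetical nested configuration list
cfg <- list(db = list(host = "localhost", port = 5432),
            debug = TRUE)

port_a <- cfg[[c("db", "port")]]   # recursive indexing in one call
port_b <- cfg[["db"]][["port"]]    # equivalent chained form
identical(port_a, port_b)
## [1] TRUE
```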
Matrices can be subset in the usual way with (i, j) type indices.
x <- matrix(1:6, 2,3)
x[1,2]
## [1] 3
x[2,1]
## [1] 2
Indices can also be missing.
x[1, ]
## [1] 1 3 5
x[, 2]
## [1] 3 4
By default, when a single element of a matrix is retrieved, it is returned as a vector of length 1 rather than a 1 × 1 matrix. This behavior can be turned off by setting drop = FALSE.
x <- matrix(1:6, 2, 3)
x[1,2]
## [1] 3
x[1,2,drop=FALSE]
##      [,1]
## [1,]    3
Similarly, subsetting a single column or a single row will give you a vector, not a matrix, unless drop = FALSE is set.
x <- matrix(1:6, 2, 3)
x[1, ]
## [1] 1 3 5
x[1, ,drop=FALSE]
##      [,1] [,2] [,3]
## [1,]    1    3    5
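One practical reason to care about dropped dimensions is that later code may expect a matrix. The following sketch (the names `row_vec` and `row_mat` are illustrative) compares the two results using `is.matrix()`, `dim()`, and `nrow()`.

```r
x <- matrix(1:6, 2, 3)

row_vec <- x[1, ]                 # dimensions dropped: plain integer vector
row_mat <- x[1, , drop = FALSE]   # dimensions kept: 1 x 3 matrix

is.matrix(row_vec)
## [1] FALSE
dim(row_mat)
## [1] 1 3
# nrow() relies on dimensions, so it is only informative for the matrix form
nrow(row_vec)
## NULL
nrow(row_mat)
## [1] 1
```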
A common task is to remove missing values (NAs).
x <- c(1, 2, NA, 4, NA, 5)
bad <- is.na(x)
x[!bad]
## [1] 1 2 4 5
What if there are multiple objects and you want to take the subset with no missing values in any of them?
x <- c(1, 2, NA, 4, NA, 5)
y <- c("a", "b", NA, "d", NA, "f")
good <- complete.cases(x, y)
good
## [1] TRUE TRUE FALSE TRUE FALSE TRUE
x[good]
## [1] 1 2 4 5
y[good]
## [1] "a" "b" "d" "f"
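`complete.cases()` also accepts a data frame, which is a convenient way to drop every row that has a missing value in any column. This is a minimal sketch with a hypothetical data frame `df`, not data from the original notes.

```r
# Hypothetical data frame with missing values in different rows
df <- data.frame(x = c(1, 2, NA, 4),
                 y = c("a", NA, "c", "d"))

keep <- complete.cases(df)   # TRUE only where every column is non-missing
keep
## [1]  TRUE FALSE FALSE  TRUE
clean <- df[keep, ]          # keep only the complete rows
nrow(clean)
## [1] 2
```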
Many operations in R are vectorized, which makes code more efficient, more concise, and easier to read.
x <- 1:4; y <- 6:9
x + y
## [1] 7 9 11 13
x > 2
## [1] FALSE FALSE TRUE TRUE
x >= 2
## [1] FALSE TRUE TRUE TRUE
y == 8
## [1] FALSE FALSE TRUE FALSE
x * y
## [1] 6 14 24 36
x / y
## [1] 0.1666667 0.2857143 0.3750000 0.4444444
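To make the conciseness point concrete, the sketch below writes the same addition twice: once as an explicit loop and once in vectorized form. The names `z_loop` and `z_vec` are illustrative only.

```r
x <- 1:4; y <- 6:9

# Explicit loop: add the vectors one element at a time
z_loop <- integer(length(x))
for (i in seq_along(x)) {
  z_loop[i] <- x[i] + y[i]
}

# Vectorized form: a single expression does the same work
z_vec <- x + y

identical(z_loop, z_vec)
## [1] TRUE
```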
#------------------------
x <- matrix(1:4, 2, 2); y <- matrix(rep(10, 4), 2, 2)
x
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
y
##      [,1] [,2]
## [1,]   10   10
## [2,]   10   10
x*y #element-wise multiplication
##      [,1] [,2]
## [1,]   10   30
## [2,]   20   40
x/y
##      [,1] [,2]
## [1,]  0.1  0.3
## [2,]  0.2  0.4
x %*% y #True matrix multiplication
##      [,1] [,2]
## [1,]   40   40
## [2,]   60   60