Jesse Yang
Jan 27th, 2017
x <- c(2, 4, 6, 8)
x[c(3, 1)]
[1] 6 2
x[c(2:4, 1, 3)]
[1] 4 6 8 2 6
x <- c(first = 2, second = 4,
third = 6, forth = 8)
x[c(3, 1)]
third first
6 2
x["second"]
second
4
x
first second third forth
2 4 6 8
x[-c(3, 2)]
first forth
2 8
x[c(-1, -2)]
third forth
6 8
x[c(-1, 2)]
Error in x[c(-1, 2)] : only 0's may be mixed with negative subscripts
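Negative indices only work with positions, so they cannot be used to drop elements by name. One possible workaround, sketched here with the named vector from above (keep and also_keep are illustrative names, not part of the original slides), is to build the index from names(x):

```r
x <- c(first = 2, second = 4, third = 6, forth = 8)

# Drop by name with a logical index: TRUE for every name we want to keep.
keep <- x[!(names(x) %in% c("second", "third"))]

# Equivalent: compute the names to keep, then index by name.
also_keep <- x[setdiff(names(x), c("second", "third"))]

identical(keep, also_keep)  # TRUE
```

Both forms leave first and forth, matching the result of x[-c(3, 2)].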
x <- matrix(c(2, 4, 6, 8, 10, 12), ncol = 3)
x
[,1] [,2] [,3]
[1,] 2 6 10
[2,] 4 8 12
x[2, 1] # row, column
[1] 4
x[1, ]
[1] 2 6 10
x[1, , drop = FALSE]
[,1] [,2] [,3]
[1,] 2 6 10
?"[" # opens the help page for the subsetting operators
str(x)
num [1:2, 1:3] 2 4 6 8 10 12
dim(x)
[1] 2 3
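The effect of drop = FALSE is easiest to see by checking the dimensions of the result. A minimal sketch using the same matrix (row_vec and row_mat are names chosen only for this illustration):

```r
x <- matrix(c(2, 4, 6, 8, 10, 12), ncol = 3)

row_vec <- x[1, ]                # dimensions dropped: a plain numeric vector
row_mat <- x[1, , drop = FALSE]  # dimensions kept: a 1 x 3 matrix

is.null(dim(row_vec))  # TRUE, a plain vector has no dim attribute
dim(row_mat)           # 1 3
```

Keeping the dimensions matters when later code expects a matrix, for example when it calls nrow() or does matrix multiplication.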
dat <- data.frame(x = c(1, 2),
y = c("A", "B"),
z = c("a", "b"))
dat
x y z
1 1 A a
2 2 B b
dat[1, 2] # access first row, second column
[1] A
Levels: A B
dat[1, 2]
dat[1, "y"] # Recommended!
dat[1, 'y']
dat[1, 2:2]
dat$y[1]
dat[["y"]][1]
dat[1, 2:2] is redundant, but it shows that sequences work
in subsetting, too.
Single and double quotes (" vs ') are interchangeable in R.
Always use names to access columns, if possible.
dat
x y z
1 1 A a
2 2 B b
row.names(dat) <- c("first", "second")
dat["first", "y", drop = FALSE]
y
first A
If there is no "," inside the brackets, it picks columns:
dat["y"]
y
first A
second B
dat[c(2, 3)]
y z
first A a
second B b
dat[, c(2, 3)] # better
y z
first A a
second B b
dat[, c("y", "z")] # even better
y z
first A a
second B b
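When only one column is selected, the comma form and the no-comma form return different kinds of objects. A small sketch of the difference, rebuilding the data frame so the example is self-contained (col_vec, col_df and col_df2 are illustrative names):

```r
dat <- data.frame(x = c(1, 2), y = c("A", "B"), z = c("a", "b"))

col_vec <- dat[, "y"]                # one column collapses to a vector (or factor)
col_df  <- dat[, "y", drop = FALSE]  # stays a one-column data frame
col_df2 <- dat["y"]                  # no comma: also a one-column data frame

is.data.frame(col_vec)  # FALSE
is.data.frame(col_df)   # TRUE
is.data.frame(col_df2)  # TRUE
```

With two or more columns, as in dat[, c("y", "z")], the result is a data frame either way.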
x <- list(x = c(1, 2, 3, 3, 4), y = c(4, 8),
z = c("burp", "blah"))
str(x)
List of 3
$ x: num [1:5] 1 2 3 3 4
$ y: num [1:2] 4 8
$ z: chr [1:2] "burp" "blah"
x[["z"]][1]
[1] "burp"
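For lists, single and double brackets do different things: [ returns a smaller list, while [[ extracts the element itself. A short sketch with the same list (sub_list, element and same are names made up for this example):

```r
x <- list(x = c(1, 2, 3, 3, 4), y = c(4, 8), z = c("burp", "blah"))

sub_list <- x["z"]    # single bracket: a list of length 1
element  <- x[["z"]]  # double bracket: the character vector itself
same     <- x$z       # $ is shorthand for [[ with a literal name

class(sub_list)           # "list"
class(element)            # "character"
identical(element, same)  # TRUE
```

This is why x[["z"]][1] works above: the first step pulls out the vector, the second indexes into it.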
What is behind dat[dat$x > 5, ]?
A logical vector of TRUE and FALSE values.
# constructing a vector of random numbers
x <- round(rnorm(n = 6, mean = 5, sd = 3))
x
[1] 9 -2 8 3 7 7
x < 5 # a logical expression
[1] FALSE TRUE FALSE TRUE FALSE FALSE
x is:
[1] 9 -2 8 3 7 7
# if we just pass in a vector of booleans
x[c(FALSE, TRUE)]
[1] -2 3 7
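The short logical vector works because R recycles it to the length of x. A sketch that makes the recycling explicit, with the values of x typed in by hand so the example is reproducible (short_index and full_index are illustrative names):

```r
x <- c(9, -2, 8, 3, 7, 7)

short_index <- c(FALSE, TRUE)
# Spell out what R does implicitly: repeat the pattern up to length(x).
full_index <- rep(short_index, length.out = length(x))

identical(x[short_index], x[full_index])  # TRUE
x[full_index]                             # -2 3 7
```

In effect, c(FALSE, TRUE) selects every second element.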
[1] 9 -2 8 3 7 7
y <- x < 5
y
[1] FALSE TRUE FALSE TRUE FALSE FALSE
x[y] == x[x < 5]
[1] TRUE TRUE
[1] 9 -2 8 3 7 7
# X & Y <-> intersect(x, y)
x[x < 5 & x > 2]
[1] 3
# X | Y <-> union(x, y)
x[x < 5 | x > 7]
[1] 9 -2 8 3
[1] 9 -2 8 3 7 7
# X & !Y <-> setdiff(x, y)
y3 <- x < 5 & !(x < 3) # !(x < 3) == x >= 3
y3
[1] FALSE FALSE FALSE TRUE FALSE FALSE
x[y3]
[1] 3
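The correspondence between logical operators and set operations in the comments above can be checked on the positions that each condition selects. A minimal sketch, again with x typed in by hand (a and b are illustrative names for the two conditions):

```r
x <- c(9, -2, 8, 3, 7, 7)

a <- x < 5
b <- x > 2

# which() turns a logical vector into the positions that are TRUE.
identical(which(a & b),  intersect(which(a), which(b)))    # TRUE
identical(which(a | b),  sort(union(which(a), which(b))))  # TRUE
identical(which(a & !b), setdiff(which(a), which(b)))      # TRUE
```

The sort() is needed only because union() keeps the order in which positions are first seen.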
dat <- data.frame(x = c(1, NA, 3),
y = c("A", "B", "B"),
z = c("a", "b", "c"))
dat
x y z
1 1 A a
2 NA B b
3 3 B c
dat[dat$x < 2, ]
x y z
1 1 A a
NA NA <NA> <NA>
dat$x < 2
[1] TRUE NA FALSE
dat[c(TRUE, NA, FALSE), ]
x y z
1 1 A a
NA NA <NA> <NA>
Wrap the condition in which() to drop the NA row:
dat[which(dat$x < 2), ]
x y z
1 1 A a
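Another way to avoid the NA row, shown here only as an alternative sketch, is to test for missing values explicitly. The names by_which and by_isna are made up for the example:

```r
dat <- data.frame(x = c(1, NA, 3),
                  y = c("A", "B", "B"),
                  z = c("a", "b", "c"))

# which() returns only the positions that are TRUE, so NA is skipped.
by_which <- dat[which(dat$x < 2), ]

# FALSE & NA is FALSE, so the explicit is.na() test removes the NA.
by_isna <- dat[!is.na(dat$x) & dat$x < 2, ]

nrow(by_which)  # 1
nrow(by_isna)   # 1
```

The explicit form is longer, but it states in the code that missing values are excluded on purpose.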
dat
x y z
1 1 A a
2 NA B b
3 3 B c
subset(dat, x < 2)
x y z
1 1 A a
dat[dat$x < 2, ]
x y z
1 1 A a
NA NA <NA> <NA>
The subset() function drops the NA rows for you. It is
still not recommended for production code, because its
non-standard evaluation of the condition can have
unanticipated consequences (see ?subset).
That's all you need to know about subsetting in R.
Questions? Email me: yang.jianc@husky.neu.edu