In contrast to a vector, in which all elements must be of the same mode, R’s list structure can combine objects of different types.
Technically, a list is a vector. Ordinary vectors—those of the type we’ve been using so far in this book—are termed atomic vectors, since their components cannot be broken down into smaller components. In contrast, lists are referred to as recursive vectors.
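The distinction between atomic and recursive vectors can be checked directly. The following is a minimal sketch using base R predicates; the objects v and l are made up for illustration.

```r
v <- c(1.5, 2, 3)            # an ordinary (atomic) vector
l <- list(1.5, "two", TRUE)  # a list with components of three modes

is.vector(v)     # TRUE
is.vector(l)     # TRUE: a list also counts as a vector
is.atomic(v)     # TRUE
is.atomic(l)     # FALSE
is.recursive(l)  # TRUE
```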
For our first look at lists, let’s consider an employee database. For each employee, we wish to store the name, salary, and a Boolean indicating union membership. Since we have three different modes here—character, numeric, and logical—it’s a perfect place for using lists. Our entire database might then be a list of lists, or some other kind of list such as a data frame, though we won’t pursue that here. We could create a list to represent our employee, Wizard, this way:
W <- list(name="Wizard", salary=9780000, union=TRUE)
We could print out W, either in full or by component.
Print in full
W
## $name
## [1] "Wizard"
##
## $salary
## [1] 9780000
##
## $union
## [1] TRUE
Print Salary only
W$salary
## [1] 9780000
Print Name only
W$name
## [1] "Wizard"
Print Union only
W$union
## [1] TRUE
Now that you’ve seen a simple example of creating a list, let’s look at how to access and work with lists.
You can access a list component in several different ways:
Access salary by component name with $
W$salary
## [1] 9780000
Access by component name in double brackets
W[["salary"]]
## [1] 9780000
Access by component index in double brackets
W[[2]]
## [1] 9780000
We can refer to list components by their numerical indices, treating the list as a vector. However, note that in this case, we use double brackets instead of single ones. So, there are three ways to access an individual component c of a list lst and return it in the data type of c:
• lst$c
• lst[["c"]]
• lst[[i]], where i is the index of c within lst
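Single brackets behave differently from double brackets, and a short comparison may help. This is a sketch only; the list emp is a hypothetical stand-in that mirrors the employee example.

```r
# Hypothetical list used only for this illustration
emp <- list(name = "Wizard", salary = 9780000, union = TRUE)

# Double brackets (like $) extract the component itself
pay <- emp[["salary"]]
class(pay)   # "numeric"

# Single brackets return a sublist holding the selected components
sub <- emp["salary"]
class(sub)   # "list"
names(sub)   # "salary"

# Single brackets can select several components at once
two <- emp[c("name", "union")]
length(two)  # 2
```

Double brackets are the usual choice when the value itself is needed. Single brackets are the usual choice when the result should remain a list.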
The operations of adding and deleting list elements arise in a surprising number of contexts. This is especially true for data structures in which lists form the foundation, such as data frames and R classes.
New components can be added after a list is created.
Initial assignment of list wiz, with component a set to "RWizard" and component b set to 42
wiz <- list(a="RWizard",b=42)
wiz
## $a
## [1] "RWizard"
##
## $b
## [1] 42
Adding component c, with value "coding", to list wiz
wiz$c <- "coding" # add a c component
wiz
## $a
## [1] "RWizard"
##
## $b
## [1] 42
##
## $c
## [1] "coding"
We confirmed that a third component, c, was added to the list wiz.
Adding components can also be done via a vector index.
Assigning 84 to the fourth component and logical values to the fifth, sixth, and seventh components
wiz[[4]] <- 84
wiz[5:7] <- c(FALSE,TRUE,TRUE)
wiz
## $a
## [1] "RWizard"
##
## $b
## [1] 42
##
## $c
## [1] "coding"
##
## [[4]]
## [1] 84
##
## [[5]]
## [1] FALSE
##
## [[6]]
## [1] TRUE
##
## [[7]]
## [1] TRUE
Components without names are shown by their index in double brackets
You can delete a list component by setting it to NULL.
wiz$b <- NULL
wiz
## $a
## [1] "RWizard"
##
## $c
## [1] "coding"
##
## [[3]]
## [1] 84
##
## [[4]]
## [1] FALSE
##
## [[5]]
## [1] TRUE
##
## [[6]]
## [1] TRUE
Confirming that component b is removed; the indices of the components after it moved up by 1
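Several components can also be deleted in one step by assigning NULL to a single-bracket selection. The sketch below uses a hypothetical list named parts so that wiz is left untouched.

```r
# Hypothetical list used only for this illustration
parts <- list(a = "RWizard", b = 42, c = "coding", 84)

# Remove two components at once
parts[c("b", "c")] <- NULL

length(parts)  # 2
names(parts)   # "a" ""
parts[[2]]     # 84, formerly at index 4
```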
You can also concatenate lists.
Using c() to concatenate two lists:
c(list("Hart", 556540, FALSE),list(9865))
## [[1]]
## [1] "Hart"
##
## [[2]]
## [1] 556540
##
## [[3]]
## [1] FALSE
##
## [[4]]
## [1] 9865
Since a list is a vector, you can obtain the number of components in a list via length().
length(wiz)
## [1] 6
length(W)
## [1] 3
If the components in a list do have tags, as is the case with name, salary, and union for W in Section 4.1, you can obtain them via names():
names(W)
## [1] "name" "salary" "union"
To obtain the values, use unlist():
unlistW <- unlist(W)
unlistW
## name salary union
## "Wizard" "9780000" "TRUE"
Using class(), we can see that unlist() returned a character vector, not a list
class(unlistW)
## [1] "character"
On the other hand, if we were to start with numbers, we would get numbers.
X <- list(a=565,b=1252,c=62513)
XX <- unlist(X)
class(XX)
## [1] "numeric"
Confirming that the unlisted list of numbers is a numeric vector.
Print XX
XX
## a b c
## 565 1252 62513
So the output of unlist() in this case was a numeric vector. What about a mixed case?
w <- list(a=5,b="xyz")
wu <- unlist(w)
class(wu)
## [1] "character"
R chose a type that can represent both values: character
Printing wu
wu
## a b
## "5" "xyz"
Here, R chose the least common denominator: character strings. This sounds like some kind of precedence structure, and it is.
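The precedence can be observed by unlisting small lists of mixed types. This is an illustrative sketch with made-up values. It relies on R's coercion order for the common atomic types: logical, then integer, then double, then character.

```r
# Logical and integer together yield integer
class(unlist(list(TRUE, 2L)))        # "integer"

# Integer and double together yield double, which class() reports as "numeric"
class(unlist(list(2L, 3.5)))         # "numeric"

# Anything mixed with a character string yields character
mixed <- unlist(list(TRUE, 2L, "x"))
class(mixed)                         # "character"
mixed                                # "TRUE" "2" "x"
```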
Two functions are handy for applying functions to lists: lapply and sapply.
The function lapply() (for list apply) works like the matrix apply() function, calling the specified function on each component of a list (or vector coerced to a list) and returning another list. Here’s an example:
Applying median() to each component of the list:
lapply(list(155:783,255:29565),median)
## [[1]]
## [1] 469
##
## [[2]]
## [1] 14910
R applied median() to 155:783 and to 255:29565, returning a list consisting of 469 and 14910. In some cases, such as the example here, the list returned by lapply() could be simplified to a vector or matrix. This is exactly what sapply() (for simplified [l]apply) does.
sapply(list(155:783,255:29565),median)
## [1] 469 14910
-> same values, different format
Here sapply() returned a numeric vector instead of a list. When the applied function returns vectors of the same length greater than 1, sapply() returns a matrix instead.
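To see sapply() produce a matrix, the applied function has to return more than one value per component. The following sketch uses range() on a made-up list named pieces.

```r
# Made-up input: two integer vectors of different lengths
pieces <- list(1:3, 25:29)

# range() returns two values per component, so sapply() builds a 2 x 2 matrix
# with one column per list component
r <- sapply(pieces, range)
dim(r)   # 2 2
r[, 1]   # 1 3
r[, 2]   # 25 29
```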
Extended Example: Back to the Abalone Data
Let’s use the lapply() function in our abalone gender example. Recall that at one point in that example, we wished to know the indices of the observations that were male, female, and infant. For an easy demonstration, let’s use the same test case: a vector of genders.
genders <- c("M","F","F","I","M","M","F")
lapply(c("M","F","I"),function(gender) which(genders==gender))
## [[1]]
## [1] 1 5 6
##
## [[2]]
## [1] 2 3 7
##
## [[3]]
## [1] 4
The output shows three indices for "M", three indices for "F", and one index for "I"
The lapply() function expects its first argument to be a list. Here it was a vector, but lapply() will coerce that vector to a list form. Also, lapply() expects its second argument to be a function. This could be the name of a function, as you saw before, or the actual code, as we have here. This is an anonymous function, which you’ll learn more about in Section 7.13.
Then lapply() calls that anonymous function on "M", then on "F", and then on "I". In that first case, the function calculates which(genders=="M"), giving us the vector of indices in genders of the males. After determining the indices for the females and infants, lapply() will return the three vectors in a list.
Note that even though the object of our main attention is the vector genders, it is not the first argument in the lapply() call in the example. Instead, that argument is an innocuous-looking vector of the three possible gender encodings. By contrast, genders is mentioned only briefly, inside the body of the anonymous function. This is a common situation in R. An even better way to do this will be presented in Section 6.2.2.
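One small refinement is to name the components of the result after the gender codes, so that each index vector can be looked up by code. This sketch is offered as an assumption about what a reader might find convenient, not as the approach promised for Section 6.2.2. The names codes and idx are introduced here for illustration.

```r
genders <- c("M", "F", "F", "I", "M", "M", "F")
codes <- c("M", "F", "I")

# Same computation as in the example, stored so the components can be named
idx <- lapply(codes, function(code) which(genders == code))
names(idx) <- codes

idx$F         # 2 3 7
lengths(idx)  # counts per code: M 3, F 3, I 1
```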
Lists can be recursive, meaning that you can have lists within lists. Here’s an example:
listOne <- list(u = 5852, v = 1982)
listTwo <- list(w = 13216)
listThree <- list(listOne,listTwo)
listThree
## [[1]]
## [[1]]$u
## [1] 5852
##
## [[1]]$v
## [1] 1982
##
##
## [[2]]
## [[2]]$w
## [1] 13216
-> making a list listThree whose two components are listOne and listTwo
Showing the length of listTwo
length(listTwo)
## [1] 1
This code makes listThree a two-component list, with each component itself also being a list. The concatenate function c() has an optional argument recursive, which controls whether flattening occurs when recursive lists are combined.
c(list(a=1546,b=0.322,c=list(d=231655,e=651659)))
## $a
## [1] 1546
##
## $b
## [1] 0.322
##
## $c
## $c$d
## [1] 231655
##
## $c$e
## [1] 651659
-> the component c of the outer list is itself a list.
c(list(a=8941,b=31562,c=list(d=17815,e=984989)),recursive=TRUE)
## a b c.d c.e
## 8941 31562 17815 984989
In the first case, we accepted the default value of recursive, which is FALSE, and obtained a recursive list, with the c component of the main list itself being another list. In the second call, with recursive set to TRUE, the nested list was flattened and we got a single named vector as a result; only the names look recursive. (It’s odd that setting recursive to TRUE gives a nonrecursive result.)
Recall that our first example of lists consisted of an employee database. I mentioned that since each employee was represented as a list, the entire database would be a list of lists. That is a concrete example of recursive lists.
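A list-of-lists employee database might look like the sketch below. The records and the name db are hypothetical and chosen only to illustrate the nesting.

```r
# Hypothetical employee records, each stored as its own list
db <- list(
  list(name = "Wizard", salary = 9780000, union = TRUE),
  list(name = "Hart",   salary = 556540,  union = FALSE)
)

# Each component of db is itself a list, so two levels of indexing reach a value
db[[2]]$name  # "Hart"

# sapply() pulls one field out of every record and simplifies to a vector
salaries <- sapply(db, function(emp) emp$salary)
salaries      # 9780000 556540
```

A data frame is often more convenient for records that share the same fields. The nested form shows how recursion in lists works.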