Lists

Julius Schmid

In contrast to a vector, in which all elements must be of the same mode, R’s list structure can combine objects of different types.

Creating Lists

Technically, a list is a vector. Ordinary vectors—those of the type we’ve been using so far in this book—are termed atomic vectors, since their components cannot be broken down into smaller components. In contrast, lists are referred to as recursive vectors.

For our first look at lists, let’s consider an employee database. For each employee, we wish to store the name, salary, and a Boolean indicating union membership. Since we have three different modes here (character, numeric, and logical), it’s a perfect place for using lists. Our entire database might then be a list of lists, or some other kind of list such as a data frame, though we won’t pursue that here. We could create a list to represent our employee, Marco, this way:

j <- list(name="Marco", salary=72000, union=TRUE)

We can print j either in full or by component. Start by printing j in full, which is done by simply entering the list name j:

j
$name
[1] "Marco"

$salary
[1] 72000

$union
[1] TRUE

Now we return the list entries individually. This works the same way as accessing a column of a data frame, using list_name$entry.

  1. Return the name entry:
j$name
[1] "Marco"
  2. Return the salary entry:
j$salary
[1] 72000
  3. Return the union entry:
j$union
[1] TRUE

General List Operations

Now that you’ve seen a simple example of creating a list, let’s look at how to access and work with lists.

List Indexing

You can access a list component in several different ways. The first method is the one we have seen above.

j$salary
[1] 72000

The second method is the syntax list_name[["entry"]]:

j[["salary"]]
[1] 72000

The third way is to use the numerical index of the entry that is to be returned. In this case, salary is the second entry, hence we return the entry with index 2:

j[[2]]
[1] 72000

We can refer to list components by their numerical indices, treating the list as a vector. However, note that in this case we use double brackets instead of single ones; single brackets would return a sublist rather than the component itself.

In summary, there are three ways to access an individual component of a list and return it in its own data type: list_name$entry, list_name[["entry"]], and list_name[[index]].
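The difference between single and double brackets can be seen directly. The following is a minimal sketch that rebuilds the list j from above; the variable names sub and val are introduced here only for illustration.

```r
# Recreate the employee list from the text
j <- list(name = "Marco", salary = 72000, union = TRUE)

# Single brackets return a sublist (the result is still a list)
sub <- j[2]
class(sub)                      # "list"

# Double brackets return the component itself
val <- j[[2]]
class(val)                      # "numeric"

# Single brackets can also select several components at once
names(j[c("name", "union")])    # "name" "union"
```

Single brackets are therefore the right tool when a smaller list is wanted, and double brackets (or $) when the value itself is needed.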

Adding and Deleting List Elements

The operations of adding and deleting list elements arise in a surprising number of contexts. This is especially true for data structures in which lists form the foundation, such as data frames and R classes.

New components can be added after a list is created.

z <- list(first="character",second=5)
z
$first
[1] "character"

$second
[1] 5

We can add an additional entry by assigning a value to a name that does not yet exist in the list:

z$third <- "soccer"
z
$first
[1] "character"

$second
[1] 5

$third
[1] "soccer"

The output confirms that a third component, named third, was added to the list.

Adding components can also be done via a vector index:

z[[4]] <- 32
z[5:7] <- c(FALSE,TRUE,TRUE)
z
$first
[1] "character"

$second
[1] 5

$third
[1] "soccer"

[[4]]
[1] 32

[[5]]
[1] FALSE

[[6]]
[1] TRUE

[[7]]
[1] TRUE

You can also delete a list component by setting it to NULL.

z$second <- NULL
z
$first
[1] "character"

$third
[1] "soccer"

[[3]]
[1] 32

[[4]]
[1] FALSE

[[5]]
[1] TRUE

[[6]]
[1] TRUE
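Several components can be removed in one step by assigning NULL through single brackets. This is a small sketch on a hypothetical list called items, not the z used above.

```r
# A small list with named components (hypothetical example data)
items <- list(first = "character", second = 5, third = "soccer", fourth = 32)

# Assigning NULL through single brackets removes every selected component
items[c("second", "fourth")] <- NULL

names(items)    # "first" "third"
length(items)   # 2
```

As with single deletions, the remaining components move up to fill the gaps.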

You can also concatenate lists.

c(list("Lucas", 83000, TRUE),list(5))
[[1]]
[1] "Lucas"

[[2]]
[1] 83000

[[3]]
[1] TRUE

[[4]]
[1] 5

Getting the Size of a List

Since a list is a vector, you can obtain the number of components in a list via length().

length(j)
[1] 3
length(z)
[1] 6

Accessing List Components and Values

If the components in a list do have tags, as is the case with name, salary, and union for j, you can obtain them via names():

names(j)
[1] "name"   "salary" "union" 

To obtain the values, use unlist():

ulj <- unlist(j)
ulj
   name  salary   union 
"Marco" "72000"  "TRUE" 

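The return value of unlist() is a vector, not a list. Because j mixes character, numeric, and logical components, all values were coerced to character strings, as class() confirms: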

class(ulj)
[1] "character"

On the other hand, if all the components are numbers, unlist() returns a numeric vector. We first create such a list:

z <- list(a=8,b=16,c=5)

We then apply unlist() to flatten the list into a vector:

y <- unlist(z)
y
 a  b  c 
 8 16  5 

As above, we apply class() to the unlisted y. Note that the list z itself has class "list", whereas y is numeric:

class(y)
[1] "numeric"

So the output of unlist() in this case was a numeric vector. What about a mixed case?

w <- list(a=52,b="random character")
wu <- unlist(w)
class(wu)
[1] "character"

Since the list contains one character component ("random character"), all components are coerced to character strings.

Let us print out the unlisted wu:

wu
                 a                  b 
              "52" "random character" 

Both 52 and random character are interpreted as character strings.
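The coercion performed by unlist() follows R's usual ordering of types, in which logical values can be represented as numbers and numbers as character strings; the result takes the most general type present. A minimal sketch with made-up values (v1 and v2 are illustrative names):

```r
# Logical and numeric components: the result is numeric (TRUE becomes 1)
v1 <- unlist(list(a = TRUE, b = 2.5))
class(v1)   # "numeric"
v1

# Adding a character component turns everything into character strings
v2 <- unlist(list(a = TRUE, b = 2.5, c = "x"))
class(v2)   # "character"
v2
```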

Applying Functions to Lists

Two functions are handy for applying functions to lists: lapply() and sapply().

Using the lapply() and sapply() Functions

The function lapply() (for list apply) works like the matrix apply() function, calling the specified function on each component of a list (or vector coerced to a list) and returning another list. Here’s an example:

lapply(list(5:7,25:35),median)
[[1]]
[1] 6

[[2]]
[1] 30

R applied median() to 5:7 and to 25:35, returning a list consisting of 6 and 30. In some cases, such as the example here, the list returned by lapply() could be simplified to a vector or matrix. This is exactly what sapply() (for simplified [l]apply) does.

sapply(list(5:7,25:35),median)
[1]  6 30

Using sapply() rather than lapply() gave us a plain numeric vector instead of a list.
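sapply() produces a matrix when the applied function returns a vector of the same length (greater than one) for every component. A minimal sketch, using range() in place of median() purely as an illustrative choice:

```r
# range() returns two values (minimum and maximum) per component
r <- sapply(list(5:7, 25:35), range)
r
dim(r)      # 2 2

# Each column holds the result for one list component
r[, 1]      # 5 7
r[, 2]      # 25 35
```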

Extended Example: Back to the Abalone Data

Let’s use the lapply() function in our abalone gender example. Recall that at one point in that example, we wished to know the indices of the observations that were male, female, and infant. For an easy demonstration, let’s use the same test case: a vector of genders.

g <- c("M","I","F","I","M","F","F")
lapply(c("F","M","I"),function(gender) which(g==gender))
[[1]]
[1] 3 6 7

[[2]]
[1] 1 5

[[3]]
[1] 2 4

The lapply() function expects its first argument to be a list. Here it was a vector, but lapply() will coerce that vector to a list form. Also, lapply() expects its second argument to be a function. This could be the name of a function, as you saw before, or the actual code, as we have here. This is an anonymous function, which you’ll learn more about in Section 7.13.

Then lapply() calls that anonymous function on "F", then on "M", and then on "I". In that first case, the function calculates which(g=="F"), giving us the vector of indices in g of the females. After determining the indices for the males and infants, lapply() will return the three vectors in a list.

Note that even though the object of our main attention is the vector g of genders, it is not the first argument in the lapply() call in the example. Instead, that argument is an innocuous-looking vector of the three possible gender encodings. By contrast, g appears only briefly, inside the anonymous function that is passed as the second argument. This is a common situation in R. An even better way to do this will be presented in Section 6.2.2.
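The list returned above is unnamed, so the reader has to remember which position belongs to which code. One possible refinement, sketched here with the illustrative names codes and idx, is to attach the gender codes as component names; this is only a convenience and not necessarily the approach of the later section.

```r
g <- c("M", "I", "F", "I", "M", "F", "F")
codes <- c("F", "M", "I")

# Same computation as in the text
idx <- lapply(codes, function(gender) which(g == gender))

# Label each component with the gender code it belongs to
names(idx) <- codes

idx$F    # 3 6 7
idx$M    # 1 5
idx$I    # 2 4
```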

Recursive Lists

Lists can be recursive, meaning that you can have lists within lists. Here’s an example:

b <- list(u = 5, v = 12)
c <- list(w = 13)
a <- list(b,c)

The list b has two entries, and the list c has one entry. The list a then combines b and c into a new list of lists. Let us see what this looks like when we print it:

a
[[1]]
[[1]]$u
[1] 5

[[1]]$v
[1] 12


[[2]]
[[2]]$w
[1] 13

The two lists b ([[1]]) and c ([[2]]) are printed, where b has two components and c has one. Let us print the length of a:

length(a)
[1] 2

The length of a is 2. This is because a contains 2 lists.

The code makes a into a two-component list, with each component itself also being a list. The concatenate function c() has an optional argument recursive, which controls whether flattening occurs when recursive lists are combined.

Now let us create a list p with 3 entries, where one component is a list itself:

p <- c(list(a=4,b=1,c=list(d=25,e=7)))
p
$a
[1] 4

$b
[1] 1

$c
$c$d
[1] 25

$c$e
[1] 7

The length of p is 3, since p has three entries:

length(p)
[1] 3

However, if we set the optional parameter recursive to TRUE, the result is flattened into a vector with 4 elements.

p2 <- c(list(a=4,b=1,c=list(d=25,e=7)),recursive=TRUE)
p2
  a   b c.d c.e 
  4   1  25   7 

We confirm this by applying the length() function again:

length(p2)
[1] 4

In the first case, we accepted the default value of recursive, which is FALSE, and obtained a recursive list, with the c component of the main list itself being another list. In the second call, with recursive set to TRUE, we got a single flat vector as a result; only the names look recursive. (It’s odd that setting recursive to TRUE gives a nonrecursive result.)

Recall that our first example of lists consisted of an employee database. As mentioned there, since each employee is represented as a list, the entire database would be a list of lists. That is a concrete example of recursive lists.
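Such a database can be sketched as follows. The two records reuse the values that appear earlier in this chapter; the names db and salaries are hypothetical and serve only this illustration.

```r
# Hypothetical two-employee database: a list whose components are lists
db <- list(
  list(name = "Marco", salary = 72000, union = TRUE),
  list(name = "Lucas", salary = 83000, union = TRUE)
)

# Reach into a nested component by chaining index operations
db[[2]]$name                    # "Lucas"

# Pull one field out of every employee record
salaries <- sapply(db, function(emp) emp$salary)
salaries                        # 72000 83000
```

Because every record has the same structure, sapply() can simplify the extracted salaries to a numeric vector.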
