10.5 Exercises

1. How can you tell if an object is a tibble? (Hint: try printing mtcars, which is a regular data frame).

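  • The printed output differs: a tibble shows its dimensions and column types, and prints only the first 10 rows.
  • The class differs: a tibble has the classes tbl_df and tbl in addition to data.frame.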
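Both differences can be checked directly. The following is a minimal sketch, assuming the tibble package is installed; it relies only on class(), as_tibble(), and is_tibble().

```r
library(tibble)

# A regular data frame has a single class
class(mtcars)
## [1] "data.frame"

# A tibble adds two classes in front of "data.frame"
class(as_tibble(mtcars))
## [1] "tbl_df"     "tbl"        "data.frame"

# is_tibble() tests for the tibble class directly
is_tibble(mtcars)
## [1] FALSE
is_tibble(as_tibble(mtcars))
## [1] TRUE
```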

2. Compare and contrast the following operations on a data.frame and equivalent tibble. What is different? Why might the default data frame behaviours cause you frustration?

library(tidyverse)  # assumed loaded; provides %>%, tibble(), select(), ggplot()
df <- data.frame(abc = 1, xyz = "a")

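## `$` applies partial matching to column names (`[[` does so only with exact = FALSE)
## Output below is from R < 4.0, where stringsAsFactors = TRUE was the default
## This is partially matched to xyz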
df$x
## [1] a
## Levels: a
## but this is not
df[, "x"]
## Error in `[.data.frame`(df, , "x"): undefined columns selected
## The index is a character vector both times, yet the class of the return value differs
df[, "xyz"] %>% class()
## [1] "factor"
df[, c("xyz", "abc")] %>% class()
## [1] "data.frame"
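For comparison, here is a sketch of the same operations on the equivalent tibble, assuming the tibble package is installed. The name tbl is chosen here for illustration and does not appear in the exercise.

```r
library(tibble)

tbl <- tibble(abc = 1, xyz = "a")

# No partial matching: `$` returns NULL and warns about the unknown column
tbl$x

# `[` always returns a tibble, whether one or several columns are selected
class(tbl[, "xyz"])
## [1] "tbl_df"     "tbl"        "data.frame"
class(tbl[, c("xyz", "abc")])
## [1] "tbl_df"     "tbl"        "data.frame"

# Use `[[` (or `$` with the full name) to get the column as a vector
tbl[["xyz"]]
## [1] "a"
```

With a data frame, code that works for two columns can silently return a vector when only one is selected, and a mistyped prefix can silently match the wrong column. The tibble behaviour makes both cases explicit.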

3. If you have the name of a variable stored in an object, e.g. var <- "mpg", how can you extract the reference variable from a tibble?

var <- "mpg"
## For a data.frame
mtcars[, var]
##  [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2
## [15] 10.4 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4
## [29] 15.8 19.7 15.0 21.4
## For a tibble
mtt <- as_tibble(mtcars)
## Works, but is ambiguous if mtt has a column named var; recent tidyselect versions warn
select(mtt, var)
## # A tibble: 32 x 1
##      mpg
##    <dbl>
##  1  21  
##  2  21  
##  3  22.8
##  4  21.4
##  5  18.7
##  6  18.1
##  7  14.3
##  8  24.4
##  9  22.8
## 10  19.2
## # … with 22 more rows
## Unquoting with !! passes the string "mpg" to select(), which removes the ambiguity
select(mtt, !!var)
## # A tibble: 32 x 1
##      mpg
##    <dbl>
##  1  21  
##  2  21  
##  3  22.8
##  4  21.4
##  5  18.7
##  6  18.1
##  7  14.3
##  8  24.4
##  9  22.8
## 10  19.2
## # … with 22 more rows
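Both select() calls above return a one-column tibble rather than the variable itself. As a sketch of alternatives, assuming dplyr 1.0 or later (which re-exports all_of() from tidyselect), `[[` extracts the column as a vector and all_of() states explicitly that var holds column names.

```r
library(tibble)
library(dplyr)

var <- "mpg"
mtt <- as_tibble(mtcars)

# `[[` looks up the column named by the string in var and returns a vector
head(mtt[[var]])
## [1] 21.0 21.0 22.8 21.4 18.7 18.1

# `$` does not evaluate var; it looks for a column literally named "var"
# mtt$var   # NULL, with a warning

# all_of() tells select() that var is an external character vector
select(mtt, all_of(var))
```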

4. Practice referring to non-syntactic names in the following data frame by:

annoying <- tibble(
  `1` = 1:10,
  `2` = `1` * 2 + rnorm(length(`1`))
)

1. Extracting the variable called 1.

annoying["1"]
## # A tibble: 10 x 1
##      `1`
##    <int>
##  1     1
##  2     2
##  3     3
##  4     4
##  5     5
##  6     6
##  7     7
##  8     8
##  9     9
## 10    10
annoying$`1`
##  [1]  1  2  3  4  5  6  7  8  9 10
select(annoying, `1`)
## # A tibble: 10 x 1
##      `1`
##    <int>
##  1     1
##  2     2
##  3     3
##  4     4
##  5     5
##  6     6
##  7     7
##  8     8
##  9     9
## 10    10

2. Plotting a scatterplot of 1 vs 2.

annoying %>%
  ggplot(aes(x = `1`, y = `2`)) +
  geom_point()

3. Creating a new column called 3 which is 2 divided by 1.

annoying <- annoying %>% mutate(`3` = `2` / `1`)

4. Renaming the columns to one, two and three.

annoying %>% rename(one = `1`, two = `2`, three = `3`)
## # A tibble: 10 x 3
##      one   two three
##    <int> <dbl> <dbl>
##  1     1  1.28  1.28
##  2     2  2.61  1.31
##  3     3  5.67  1.89
##  4     4  6.90  1.73
##  5     5 10.5   2.09
##  6     6 11.5   1.92
##  7     7 13.3   1.91
##  8     8 16.3   2.04
##  9     9 19.3   2.14
## 10    10 21.6   2.16

5. What does tibble::enframe() do? When might you use it?

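enframe() converts named atomic vectors or lists to one- or
two-column data frames. For a list, the result is a nested
tibble with a column of type list. For unnamed vectors, the
natural sequence is used as the name column. It is useful when
a function returns a named vector, such as quantile(), and you
want the names and values as columns for dplyr or ggplot2.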
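A short sketch of this behaviour, assuming the tibble package is installed. The vector x is made up for illustration.

```r
library(tibble)

# A named vector becomes a two-column tibble with columns name and value
x <- c(a = 1, b = 2, c = 3)
enframe(x)

# An unnamed vector gets the sequence 1, 2, 3 as its name column
enframe(c(5, 6, 7))

# deframe() goes the other way, back to a named vector
deframe(enframe(x))
```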