data.frame() and discard the worst.
tbl_df and tbl classes.
@JennyBryan ‘tibble-diff’ has a certain charm :) Sept 23, 2014
tibble(): Creates a tibble, similar to data.frame()
as_tibble(): Coerces an existing object (dataframe, table, matrix) into a tibble.
Sequential evaluation means that a column can be defined in terms of another column created earlier in the same call. data.frame() does not support this; tibble() does.
data.frame(x = 1:3, y = 15:17, z = x*y)
Error in data.frame(x = 1:3, y = 15:17, z = x * y): object 'x' not found
# Create a tibble
tibble(x = 1:3, y = 15:17, z = x*y,
listcol = list(1:5, seq(5, 10, 1), rep(1, 3)))
# A tibble: 3 x 4
x y z listcol
<int> <int> <int> <list>
1 1 15 15 <int [5]>
2 2 16 32 <dbl [6]>
3 3 17 51 <dbl [3]>
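The coercion done by as_tibble() can be sketched the same way. This is a minimal example, assuming the tibble package is installed; base_df and mat are made-up objects used only for illustration.

```r
library(tibble)

# A plain data frame and a matrix with column names (illustrative data)
base_df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))
mat <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("a", "b")))

# Coerce both objects to tibbles
tbl_from_df <- as_tibble(base_df)
tbl_from_mat <- as_tibble(mat)

class(tbl_from_df)   # includes "tbl_df"
names(tbl_from_mat)  # column names are taken from the matrix
```

The matrix is given column names up front because as_tibble() needs a name for every column.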
Dataframes can store more than atomic vectors: a column can also be a list. A list-column is a column in a dataframe that is a list. You can inspect, index, and compute on list-columns. Tibbles are still indexed the same way as a data.frame, with $ and [[.
# Tibble with list-column
my_tibble <- tibble(x = 1:3, y = 15:17, z = x*y,
listcol = list(1:5, seq(5, 10, 1), rep(1, 3)))
my_tibble$listcol
[[1]]
[1] 1 2 3 4 5
[[2]]
[1] 5 6 7 8 9 10
[[3]]
[1] 1 1 1
my_tibble$listcol[[1]]
[1] 1 2 3 4 5
my_tibble$listcol[[1]][[5]]
[1] 5
my_tibble$x
[1] 1 2 3
my_tibble[['y']]
[1] 15 16 17
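Computing on a list-column usually means applying a function to each element. The sketch below rebuilds my_tibble and uses base R's lengths() and vapply(); elem_lengths and elem_sums are names chosen here for illustration.

```r
library(tibble)

my_tibble <- tibble(x = 1:3, y = 15:17, z = x * y,
                    listcol = list(1:5, seq(5, 10, 1), rep(1, 3)))

# Number of values stored in each element of the list-column
elem_lengths <- lengths(my_tibble$listcol)

# Sum of each element; vapply() checks that every result is a single number
elem_sums <- vapply(my_tibble$listcol, sum, numeric(1))

elem_lengths
elem_sums
```

vapply() is used instead of sapply() so that an unexpected result shape fails loudly rather than silently changing the return type.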
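Vector recycling is the repetition of a shorter vector so that it matches the length of the other columns. tibble() only recycles vectors of length one.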
# Create a tibble with vector recycling
tibble(x = 1:3, y = 15:17, z = x*y,
listcol = list(1:5, seq(5, 10, 1), rep(1, 3)),
one_col = 1)
# A tibble: 3 x 5
x y z listcol one_col
<int> <int> <int> <list> <dbl>
1 1 15 15 <int [5]> 1
2 2 16 32 <dbl [6]> 1
3 3 17 51 <dbl [3]> 1
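To make the recycling rule concrete, the following sketch compares a length-one input with a length-two input. The object names (ok, res, base_res) are illustrative, and tryCatch() is only there so the failing call does not stop the script.

```r
library(tibble)

# A length-one vector is recycled to match the other columns
ok <- tibble(x = 1:4, flag = TRUE)

# A length-two vector is not recycled, even though 4 is a multiple of 2;
# the error is caught and replaced by a marker string
res <- tryCatch(
  tibble(x = 1:4, y = 1:2),
  error = function(e) "recycling error"
)

# data.frame() accepts the same input and repeats y
base_res <- data.frame(x = 1:4, y = 1:2)
```

The stricter rule in tibble() tends to surface length mismatches early instead of filling a column with silently repeated values.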
Tibbles allow column names that are not syntactically valid in R, as long as they are wrapped in backticks.
# tibble's column naming conventions
tibble(`0 1 Normal distribution` = rnorm(5),
`:)` = rep('Smile',5))
# A tibble: 5 x 2
`0 1 Normal distribution` `:)`
<dbl> <chr>
1 0.827 Smile
2 -0.764 Smile
3 -0.978 Smile
4 -0.468 Smile
5 -0.0950 Smile
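Non-syntactic names also need backticks (or a quoted string inside [[ ]]) when the columns are accessed later. A short sketch, with odd, base_default and base_kept as illustrative object names:

```r
library(tibble)

odd <- tibble(`0 1 Normal distribution` = rnorm(5),
              `:)` = rep("Smile", 5))

# Backticks with $, or a plain string with [[ ]]
smiles <- odd$`:)`
draws <- odd[["0 1 Normal distribution"]]

# data.frame() rewrites such names by default; check.names = FALSE keeps them
base_default <- data.frame(`:)` = 1)
base_kept <- data.frame(`:)` = 1, check.names = FALSE)
names(base_default)
names(base_kept)
```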
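Tibbles also differ from data.frame in the way they print: only the first rows and the columns that fit on screen are shown. The number of rows printed is controlled by the tibble.print_max and tibble.print_min options, which are set with options().
# Show off tibble's printing of a large dataframe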
df <- tibble(x = 1:1000, y = 2, z = x*y)
print(df)
# A tibble: 1,000 x 3
x y z
<int> <dbl> <dbl>
1 1 2 2
2 2 2 4
3 3 2 6
4 4 2 8
5 5 2 10
6 6 2 12
7 7 2 14
8 8 2 16
9 9 2 18
10 10 2 20
# ... with 990 more rows
# Class of the tibble
class(df)
[1] "tbl_df" "tbl" "data.frame"
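The printing behaviour can be adjusted per call or per session. The sketch below assumes the tibble.print_max and tibble.print_min options named above; the values 15, 20 and 5 are arbitrary choices for illustration.

```r
library(tibble)

df <- tibble(x = 1:1000, y = 2, z = x * y)

# Show 15 rows for this call only
print(df, n = 15)

# Change the session defaults: a tibble with more than print_max rows
# is truncated to print_min rows
old <- options(tibble.print_max = 20, tibble.print_min = 5)
print(df)

# Restore the previous settings
options(old)
```

Saving the return value of options() and passing it back afterwards keeps the change from leaking into the rest of the session.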
add_column(): Add one or more columns to an existing tibble
add_row(): Add one or more rows to an existing tibble
rownames_to_column(): Convert row names to a column
rowid_to_column(): Create a column of IDs using row indices
# Load ISLR library
library(ISLR)
# Show College data
head(College)
Private Apps Accept Enroll Top10perc
Abilene Christian University Yes 1660 1232 721 23
Adelphi University Yes 2186 1924 512 16
Adrian College Yes 1428 1097 336 22
Agnes Scott College Yes 417 349 137 60
Alaska Pacific University Yes 193 146 55 16
Albertson College Yes 587 479 158 38
Top25perc F.Undergrad P.Undergrad Outstate
Abilene Christian University 52 2885 537 7440
Adelphi University 29 2683 1227 12280
Adrian College 50 1036 99 11250
Agnes Scott College 89 510 63 12960
Alaska Pacific University 44 249 869 7560
Albertson College 62 678 41 13500
Room.Board Books Personal PhD Terminal
Abilene Christian University 3300 450 2200 70 78
Adelphi University 6450 750 1500 29 30
Adrian College 3750 400 1165 53 66
Agnes Scott College 5450 450 875 92 97
Alaska Pacific University 4120 800 1500 76 72
Albertson College 3335 500 675 67 73
S.F.Ratio perc.alumni Expend Grad.Rate
Abilene Christian University 18.1 12 7041 60
Adelphi University 12.2 16 10527 56
Adrian College 12.9 30 8735 54
Agnes Scott College 7.7 37 19016 59
Alaska Pacific University 11.9 2 10922 15
Albertson College 9.4 11 9727 55
# Convert to tibble and change rownames to a column
as_tibble(rownames_to_column(College, 'College'))
# A tibble: 777 x 19
College Private Apps Accept Enroll Top10perc Top25perc F.Undergrad
<chr> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Abilen~ Yes 1660 1232 721 23 52 2885
2 Adelph~ Yes 2186 1924 512 16 29 2683
3 Adrian~ Yes 1428 1097 336 22 50 1036
4 Agnes ~ Yes 417 349 137 60 89 510
5 Alaska~ Yes 193 146 55 16 44 249
6 Albert~ Yes 587 479 158 38 62 678
7 Albert~ Yes 353 340 103 17 45 416
8 Albion~ Yes 1899 1720 489 37 68 1594
9 Albrig~ Yes 1038 839 227 30 63 973
10 Alders~ Yes 582 498 172 21 44 799
# ... with 767 more rows, and 11 more variables: P.Undergrad <dbl>,
# Outstate <dbl>, Room.Board <dbl>, Books <dbl>, Personal <dbl>,
# PhD <dbl>, Terminal <dbl>, S.F.Ratio <dbl>, perc.alumni <dbl>,
# Expend <dbl>, Grad.Rate <dbl>
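The other helper functions listed earlier, add_column(), add_row() and rowid_to_column(), work along the same lines. This is a minimal sketch on a made-up tibble called small; the column names and values are illustrative only.

```r
library(tibble)

small <- tibble(x = 1:3, y = c("a", "b", "c"))

# Append a new column
with_col <- add_column(small, z = c(10, 20, 30))

# Append a new row; values are matched to columns by name
with_row <- add_row(small, x = 4L, y = "d")

# Add an integer ID column built from the row positions
with_id <- rowid_to_column(small, var = "id")

with_col
with_row
with_id
```

Each function returns a new tibble and leaves small unchanged, so the results have to be assigned to be kept.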