Chapters 2 and 3: Computing with R and R Command Patterns
Loading Data Sets
install.packages("nycflights13")
# or install it from the Packages pane in RStudio
library(nycflights13)
# or check the box next to nycflights13 in the Packages pane
data("flights",package="nycflights13")
Chapter 3: Chaining Syntax
suppressMessages(library(dplyr))
data("BabyNames",package="DataComputing")
The general chaining syntax:
#Object_Name <-
#Data_Table %>%
#function_name(arguments) %>%
#function_name(arguments)
Example 1:
MyBabies<-
BabyNames %>%
head(3)
MyBabies
## name sex count year
## 1 Mary F 7065 1880
## 2 Anna F 2604 1880
## 3 Emma F 2003 1880
This is equivalent to:
MyBabies <-
head(BabyNames,3)
MyBabies
## name sex count year
## 1 Mary F 7065 1880
## 2 Anna F 2604 1880
## 3 Emma F 2003 1880
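One way to check this equivalence is to compare the two forms with identical(). The sketch below uses the built-in mtcars data set in place of BabyNames, an assumption made only so that it runs without the DataComputing package. The names piped and nested are illustrative.

```r
suppressMessages(library(dplyr))  # provides the %>% operator

# mtcars ships with R; it stands in for BabyNames here (assumption)
piped  <- mtcars %>% head(3)   # chained form: the table is passed as the first argument
nested <- head(mtcars, 3)      # ordinary function-call form

# Compare the two results
identical(piped, nested)
```

The comparison should return TRUE, because the chain simply passes the left-hand side as the first argument of the function on the right.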
Example 2:
Princes <-
BabyNames %>%
filter(name=="Prince") %>%
group_by(year,sex) %>%
summarise(sum(count))
Princes
## Source: local data frame [194 x 3]
## Groups: year [?]
##
## year sex sum(count)
## (int) (chr) (int)
## 1 1880 M 16
## 2 1881 M 17
## 3 1882 M 18
## 4 1883 M 18
## 5 1884 M 17
## 6 1885 M 21
## 7 1886 M 22
## 8 1887 M 18
## 9 1888 M 19
## 10 1889 M 14
## .. ... ... ...
This is equivalent to:
Princes <- summarise(group_by(filter(BabyNames,name=="Prince"),year,sex),sum(count))
Princes
## Source: local data frame [194 x 3]
## Groups: year [?]
##
## year sex sum(count)
## (int) (chr) (int)
## 1 1880 M 16
## 2 1881 M 17
## 3 1882 M 18
## 4 1883 M 18
## 5 1884 M 17
## 6 1885 M 21
## 7 1886 M 22
## 8 1887 M 18
## 9 1888 M 19
## 10 1889 M 14
## .. ... ... ...
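The summary column above is labelled sum(count) because the summarise() call does not give it a name. A minimal sketch of naming it, assuming a small made-up table (toy) with the same columns as BabyNames; toy, toy_princes, and total are hypothetical names and the counts are invented for illustration:

```r
suppressMessages(library(dplyr))

# Hypothetical toy table with the same column layout as BabyNames
toy <- data.frame(
  name  = c("Prince", "Prince", "Prince", "Mary"),
  sex   = c("M", "M", "F", "F"),
  count = c(16L, 4L, 5L, 7065L),
  year  = c(1880L, 1880L, 1880L, 1880L),
  stringsAsFactors = FALSE
)

toy_princes <-
  toy %>%
  filter(name == "Prince") %>%     # keep only the rows for "Prince"
  group_by(year, sex) %>%          # one group per year/sex combination
  summarise(total = sum(count))    # name the summary column "total"

toy_princes
```

A named column such as total is easier to refer to in later steps of a chain than a column called sum(count).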
Data Frames
You can make data frames as follows:
a <- c(10,30,15)
b <- c("Bob","John","Ben")
df <- data.frame(a,b, stringsAsFactors = FALSE) # keep strings as character vectors, not factors
df
## a b
## 1 10 Bob
## 2 30 John
## 3 15 Ben
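To see what stringsAsFactors changes, the sketch below builds the same table both ways and inspects the class of the string column. The names as_chr and as_fct are illustrative. In R 4.0.0 and later, FALSE is already the default, so the argument mainly matters on older versions.

```r
a <- c(10, 30, 15)
b <- c("Bob", "John", "Ben")

as_chr <- data.frame(a, b, stringsAsFactors = FALSE)  # strings stay character
as_fct <- data.frame(a, b, stringsAsFactors = TRUE)   # strings become factors

class(as_chr$b)   # "character"
class(as_fct$b)   # "factor"
```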
You can rename the variables (columns) of your data frame as follows:
names(df) <- c("age","friend")
df
## age friend
## 1 10 Bob
## 2 30 John
## 3 15 Ben
You can look at elements of your data frame as follows:
df[1,2] #first case, second variable
## [1] "Bob"
df[,2] #second variable
## [1] "Bob" "John" "Ben"
df[,1] #first variable
## [1] 10 30 15
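Columns can also be selected by name rather than by position, which tends to be easier to read. A short sketch, rebuilding the same df so that it runs on its own:

```r
# Same data frame as above, created with its final column names
df <- data.frame(age = c(10, 30, 15),
                 friend = c("Bob", "John", "Ben"),
                 stringsAsFactors = FALSE)

df$friend        # same vector as df[, 2]
df[["age"]]      # same vector as df[, 1]
df[1, "friend"]  # same element as df[1, 2]
```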
How do you select the last two cases of the friend variable?
df[2:3,2]
## [1] "John" "Ben"
df[2:3,"friend"]
## [1] "John" "Ben"
df[-1,"friend"]
## [1] "John" "Ben"
How do you select the case with “Bob” as friend?
bob_friend <- c(TRUE,FALSE,FALSE) # logical vector: TRUE marks the cases to keep
df[bob_friend,]
## age friend
## 1 10 Bob
# general form: subset(my_data_frame, subset = some_condition)
subset(df, subset= friend=="Bob")
## age friend
## 1 10 Bob
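The logical vector does not have to be typed by hand; it can be computed with a comparison. A minimal sketch, rebuilding df so that it is self-contained (the age threshold of 12 and the name older are made up for illustration):

```r
df <- data.frame(age = c(10, 30, 15),
                 friend = c("Bob", "John", "Ben"),
                 stringsAsFactors = FALSE)

# Comparison returns one TRUE/FALSE per case
bob_friend <- df$friend == "Bob"   # TRUE FALSE FALSE
df[bob_friend, ]                   # the case whose friend is "Bob"

# The same pattern works for any condition, e.g. a hypothetical age cutoff
older <- df[df$age > 12, ]
older
```

Computing the condition keeps the selection correct if cases are added or reordered, whereas a hand-written vector would need to be updated.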
Lists
You can make a list as follows:
x <- df
y <- 7
z <- c(1,2)
my_list <- list(x,y,z)
my_list
## [[1]]
## age friend
## 1 10 Bob
## 2 30 John
## 3 15 Ben
##
## [[2]]
## [1] 7
##
## [[3]]
## [1] 1 2
You can name the elements of a list as follows:
names(my_list) <- c("data_frame", "num", "vec")
my_list
## $data_frame
## age friend
## 1 10 Bob
## 2 30 John
## 3 15 Ben
##
## $num
## [1] 7
##
## $vec
## [1] 1 2
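The names can also be supplied when the list is created, which avoids the separate names() step. A short sketch, with the data frame rebuilt so that it runs on its own (named_list is an illustrative name):

```r
x <- data.frame(age = c(10, 30, 15),
                friend = c("Bob", "John", "Ben"),
                stringsAsFactors = FALSE)
y <- 7
z <- c(1, 2)

# name = value pairs set the element names directly
named_list <- list(data_frame = x, num = y, vec = z)
names(named_list)
```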
You can select the “data_frame” element of your list, by position or by name, as follows:
my_list[[1]]
## age friend
## 1 10 Bob
## 2 30 John
## 3 15 Ben
my_list[["data_frame"]]
## age friend
## 1 10 Bob
## 2 30 John
## 3 15 Ben
You can select the second element of “vec” as follows:
my_list[["vec"]][2]
## [1] 2
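Named elements can also be reached with $, and it helps to know that single brackets return a smaller list while double brackets return the element itself. A sketch under the same setup, with the list rebuilt so that it is self-contained:

```r
my_list <- list(
  data_frame = data.frame(age = c(10, 30, 15),
                          friend = c("Bob", "John", "Ben"),
                          stringsAsFactors = FALSE),
  num = 7,
  vec = c(1, 2)
)

my_list$vec[2]             # same as my_list[["vec"]][2]
my_list$data_frame$friend  # a column of the data frame stored in the list

class(my_list["num"])      # "list": single brackets keep the list wrapper
class(my_list[["num"]])    # "numeric": double brackets extract the element
```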