2017-lec2

Today

DC chap 3: R Command Patterns
DataCamp Intro to R, chap 5: Data Frames
DataCamp Intro to R, chap 6: Lists

Chap 3 Chaining Syntax

The general chaining syntax:

#Object_Name <- 
#Data_Table %>%
#function_name(arguments) %>%
#function_name(arguments)

Example 1:

MyBabies<-
  BabyNames %>% 
  head(3)
MyBabies

name	sex	count	year
Mary	F	7065	1880
Anna	F	2604	1880
Emma	F	2003	1880

This is equivalent to:

MyBabies <-
  head(BabyNames,3)  
MyBabies

name	sex	count	year
Mary	F	7065	1880
Anna	F	2604	1880
Emma	F	2003	1880

Example 2:

Princes <- 
  BabyNames %>%
  filter(name=="Prince") %>%
  group_by(year,sex) %>%
  summarise(sum(count))
Princes

## Source: local data frame [194 x 3]
## Groups: year [?]
## 
##     year   sex `sum(count)`
##    <int> <chr>        <int>
## 1   1880     M           16
## 2   1881     M           17
## 3   1882     M           18
## 4   1883     M           18
## 5   1884     M           17
## 6   1885     M           21
## 7   1886     M           22
## 8   1887     M           18
## 9   1888     M           19
## 10  1889     M           14
## # ... with 184 more rows

This is equivalent to:

Princes <- summarise(group_by(filter(BabyNames,name=="Prince"),year,sex),sum(count))
Princes

## Source: local data frame [194 x 3]
## Groups: year [?]
## 
##     year   sex `sum(count)`
##    <int> <chr>        <int>
## 1   1880     M           16
## 2   1881     M           17
## 3   1882     M           18
## 4   1883     M           18
## 5   1884     M           17
## 6   1885     M           21
## 7   1886     M           22
## 8   1887     M           18
## 9   1888     M           19
## 10  1889     M           14
## # ... with 184 more rows

Note: If you look at the documentation for dplyr::filter, you will see that the first argument is the data table (BabyNames in this case). When chaining, leave this first argument out: the pipe supplies it for you, and writing it again breaks the pipeline and produces an error.
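The first-argument rule can be seen on a small example. This is a minimal sketch, assuming the dplyr package is installed; the data frame `toy` and its values are made up for illustration and are not the BabyNames data.

```r
library(dplyr)

# Hypothetical toy data, not the real BabyNames table
toy <- data.frame(
  name  = c("Prince", "Mary", "Prince"),
  count = c(16, 7065, 17),
  stringsAsFactors = FALSE
)

# Chained form: the pipe supplies `toy` as filter()'s first argument
piped <- toy %>% filter(name == "Prince")

# Nested form: the data table is written explicitly as the first argument
nested <- filter(toy, name == "Prince")

# Both forms keep the same two "Prince" rows
piped
nested

# Writing toy %>% filter(toy, name == "Prince") would expand to
# filter(toy, toy, name == "Prince"), which passes the table twice.
```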

In-class exercises

DC chapter 3 exercises

DataCamp Intro to R, chap 5: Data Frames

You can make data frames as follows:

a <- c(10,30,15)
b <- c("Bob","John","Ben")
df <- data.frame(a,b, stringsAsFactors = FALSE)  #want strings to be characters not factors
df

a	b
10	Bob
30	John
15	Ben

You can name the variables of your data frame as follows:

names(df) <- c("age","friend")
df

age	friend
10	Bob
30	John
15	Ben
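
The variable names can also be supplied when the data frame is built, which avoids the separate `names()` step. A minimal sketch using the same values; `df2` is a name introduced here for illustration.

```r
# Name the variables directly in the data.frame() call
df2 <- data.frame(
  age    = c(10, 30, 15),
  friend = c("Bob", "John", "Ben"),
  stringsAsFactors = FALSE  # keep strings as characters, not factors
)

names(df2)   # the variable names
df2$friend   # a variable can then be pulled out by name with $
```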

You can look at elements of your data frame as follows:

df[1,2] #first case, second variable

## [1] "Bob"

df[,2] #second variable

## [1] "Bob"  "John" "Ben"

df[,1] #first variable

## [1] 10 30 15

How do you select the last two cases of the friend variable?

df[2:3,2]

## [1] "John" "Ben"

df[2:3,"friend"]

## [1] "John" "Ben"

df[-1,"friend"]

## [1] "John" "Ben"

How do you select the case with “Bob” as friend?

bob_friend <- c(TRUE,FALSE,FALSE)
df[bob_friend,]

age	friend
10	Bob

#subset(my_data_frame, subset = some_condition)
subset(df, subset= friend=="Bob")

age	friend
10	Bob
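
The logical vector does not have to be typed by hand: a comparison on the friend variable produces it. A minimal sketch that rebuilds the `df` from above so it runs on its own; `is_bob` is a name introduced here for illustration.

```r
# Rebuild the example data frame
df <- data.frame(
  age    = c(10, 30, 15),
  friend = c("Bob", "John", "Ben"),
  stringsAsFactors = FALSE
)

# The comparison returns one TRUE/FALSE per case
is_bob <- df$friend == "Bob"
is_bob

# Use it as the row index; the empty column index keeps all variables
df[is_bob, ]
```

Computing the index this way keeps working if the cases are reordered or more are added.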

DataCamp Intro to R, chap 6: Lists

You can make a list as follows:

x <- df
y <- 7
z <- c(1,2)
my_list <- list(x,y,z)
my_list

## [[1]]
##   age friend
## 1  10    Bob
## 2  30   John
## 3  15    Ben
## 
## [[2]]
## [1] 7
## 
## [[3]]
## [1] 1 2

You can create a named list as follows:

names(my_list) <- c("data_frame", "num", "vec")
my_list

## $data_frame
##   age friend
## 1  10    Bob
## 2  30   John
## 3  15    Ben
## 
## $num
## [1] 7
## 
## $vec
## [1] 1 2

You can select the “data frame” element of your list as follows:

my_list[[1]]

age	friend
10	Bob
30	John
15	Ben

my_list[["data_frame"]]

age	friend
10	Bob
30	John
15	Ben

You can select the second element of “vec” as follows:

my_list[["vec"]][2]

## [1] 2
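
Single and double brackets behave differently on a list: `[ ]` returns a shorter list, while `[[ ]]` and `$` return the element itself. A minimal sketch with a smaller list; `small_list` and the other variable names are introduced here for illustration.

```r
small_list <- list(num = 7, vec = c(1, 2))

sub_list <- small_list["vec"]    # still a list, of length 1
element  <- small_list[["vec"]]  # the numeric vector itself
same     <- small_list$vec       # $ also extracts the element by name

class(sub_list)  # "list"
class(element)   # "numeric"
element[2]       # second element of the vector
```

This is why `my_list[["vec"]][2]` works: the double brackets pull out the vector first, and the single bracket then indexes into it.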

Replicator and Sequence functions

These functions are handy for making sequences of numbers.
seq is a generalization of “:”.
seq uses the argument by.

1:5

## [1] 1 2 3 4 5

5:1

## [1] 5 4 3 2 1

seq(0,11, by=2)

## [1]  0  2  4  6  8 10

seq(10,0, by=-2)

## [1] 10  8  6  4  2  0
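
With no `by`, `seq()` steps by 1 and matches `:`; with `by`, the sequence stops at the last value that does not pass the end point. A minimal sketch; the variable names are introduced here for illustration.

```r
colon_version <- 1:5
seq_version   <- seq(1, 5)         # default step is 1
all(colon_version == seq_version)  # every position agrees

evens <- seq(0, 11, by = 2)
max(evens)  # 10, because the next step (12) would pass 11
```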

The replicator function rep uses the arguments times and each:

rep(c(0,1), times=5)

##  [1] 0 1 0 1 0 1 0 1 0 1

rep(letters[1:5],each=2)

##  [1] "a" "a" "b" "b" "c" "c" "d" "d" "e" "e"

rep(1:3, each =2, times=3)

##  [1] 1 1 2 2 3 3 1 1 2 2 3 3 1 1 2 2 3 3

rep(1:3, times=3)

## [1] 1 2 3 1 2 3 1 2 3
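
When `each` and `times` are given together, `each` is applied first and the result is then repeated `times` times. A minimal sketch that checks this by building the same vector in two steps; `both` and `two_step` are names introduced here for illustration.

```r
# One call with both arguments
both <- rep(1:3, each = 2, times = 3)

# Two steps: repeat each element, then repeat the whole vector
two_step <- rep(rep(1:3, each = 2), times = 3)

identical(both, two_step)
length(both)  # 3 values * 2 each * 3 times = 18
```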

I-clicker questions

These are good midterm questions.

Solution: The answer is b. See below.

x<-1:4
x

## [1] 1 2 3 4

names(x) <- letters[1:4]
x

## a b c d 
## 1 2 3 4

x[1:2] <- 2:1
x

## a b c d 
## 2 1 3 4

x["a"] <- 100
x[x==100] <-  NA
x

##  a  b  c  d 
## NA  1  3  4

Solution: The answer is d.

rep(seq(0,6,by=2),each=5)

##  [1] 0 0 0 0 0 2 2 2 2 2 4 4 4 4 4 6 6 6 6 6

rep(1:5, times=4)

##  [1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5

rep(1:5, times=4) + rep(0:3, each=5)

##  [1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8

Adam Lucas
