Source file ⇒ 2017-midterm_review.Rmd
Tracing functions in question 1 and question 7
# Problem 1
#### Chunk a:
a <- c(1, 2, 3)
g <- function(rr) {
  h <- function(x) x + rr;
  return(h);
}
f <- function(a) {
  b <- list()
  for (i in 1:length(a)){
    b[[i]] <- g(a[[i]])
    print(b[[i]](10)) #LINE 1
  }
  return(b)
}
c <- f(a)
## [1] 11
## [1] 12
## [1] 13
By lazy evaluation, the argument rr of g is not evaluated until h is first called. Inside the for loop, LINE 1 makes that call right away:
b[[1]] <- g(a[[i]]); print(b[[1]](10)): this call of b[[1]] causes us to evaluate a[[i]] with i=1, giving 1+10=11.
b[[2]] <- g(a[[i]]); print(b[[2]](10)): this call of b[[2]] causes us to evaluate a[[i]] with i=2, giving 2+10=12.
b[[3]] <- g(a[[i]]); print(b[[3]](10)): this call of b[[3]] causes us to evaluate a[[i]] with i=3, giving 3+10=13.
# The only difference between these two chunks of code is whether LINE 1 is commented out or not
a <- c(1, 2, 3)
g <- function(rr) {
  h <- function(x) x + rr;
  return(h);
}
f <- function(a) {
  b <- list()
  for (i in 1:length(a)){
    b[[i]] <- g(a[[i]])
    #print(b[[i]](10)) #LINE 1
  }
  return(b)
}
c <- f(a)
print(c[[1]](10))
## [1] 13
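By lazy evaluation, the argument a[[i]] in
b[[1]] <- g(a[[i]])
b[[2]] <- g(a[[i]])
b[[3]] <- g(a[[i]])
isn't evaluated inside the loop, because no closure is called there. It is evaluated only when a closure is first called, here at print(c[[1]](10)), by which time the loop has finished and i=3.
So in effect we have
b[[1]] <- g(a[[3]])
b[[2]] <- g(a[[3]])
b[[3]] <- g(a[[3]])
so print(c[[1]](10)) is 3+10=13.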
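A common way to avoid this surprise is to force the argument inside g, so each closure keeps the value i had when it was created. The sketch below is a variation on the second chunk, not part of the exam question; the names g_forced and f_forced are made up for illustration.

```r
a <- c(1, 2, 3)

# Hypothetical variant of g: force() evaluates rr immediately,
# instead of leaving it as an unevaluated promise
g_forced <- function(rr) {
  force(rr)
  function(x) x + rr
}

f_forced <- function(a) {
  b <- list()
  for (i in seq_along(a)) {
    b[[i]] <- g_forced(a[[i]])   # no print() needed to trigger evaluation
  }
  b
}

closures <- f_forced(a)
# Call every closure with 10, after the loop has finished
results <- sapply(closures, function(h) h(10))
print(results)   # 11 12 13
```

With force(rr) in place, the result no longer depends on whether LINE 1 is commented out.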
What is the output of the following chunk?
i=3
a <- c(1, 2, 3)
g <- function(x){x+a[i]}
f <- function(a) {
b <- list()
for (i in 1:length(a)){
b[[i]] <- function(x) g(x)
print(b[[1]](10))
}
return(b)
}
c <- f(a)
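To check your answer, recall that R uses lexical scoping: the free variables a and i inside g are looked up where g was defined (the global environment), not where g is called. The small sketch below isolates that rule; the function caller and its local values are made up for illustration.

```r
i <- 3
a <- c(1, 2, 3)
g <- function(x) x + a[i]   # a and i are free variables, resolved globally

# Hypothetical caller with its own local i and a, which g never sees
caller <- function() {
  i <- 1
  a <- c(100, 200, 300)
  g(10)
}

scoped_result <- caller()
print(scoped_result)   # 13, i.e. 10 + a[3] using the global a and i
```

By the same reasoning, the loop variable i inside f in the chunk above is local to f, so every call of b[[1]](10) uses the global i=3.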
Here is a review of how sapply works:
mylist <- list(1,2,3)
myfunc <- function(x,y,z) x+y+z
mylist %>% sapply(myfunc,10,20)
## [1] 31 32 33
or
mylist %>% sapply(myfunc,y=10,z=20)
## [1] 31 32 33
Now for the question: it helps to analyze lapply() first.
myfun <- function(var1=c(1,2,3),var2=1,var3=c(1,2,3)){
  output=list()
  for(i in 1:length(var3)) {
    if(var3[i]==1) {output[[i]]=var1+var2}
    if(var3[i]==2) {output[[i]]=var1-var2}
    if(var3[i]==3) {output[[i]]=var1*var2}
  }
  return(output)
}
myvec=c(-1,0,1)
lapply(myvec,myfun,var2=2,var3=c(1,2))
## [[1]]
## [[1]][[1]]
## [1] 1
##
## [[1]][[2]]
## [1] -3
##
##
## [[2]]
## [[2]][[1]]
## [1] 2
##
## [[2]][[2]]
## [1] -2
##
##
## [[3]]
## [[3]][[1]]
## [1] 3
##
## [[3]][[2]]
## [1] -1
Understand what we get for each value of myvec.
var1=-1, var2=2, var3=c(1,2):
for loop with i=1, var3[1]=1, so output[[1]]=-1+2=1
for loop with i=2, var3[2]=2, so output[[2]]=-1-2=-3
var1=0, var2=2, var3=c(1,2):
for loop with i=1, var3[1]=1, so output[[1]]=0+2=2
for loop with i=2, var3[2]=2, so output[[2]]=0-2=-2
var1=1, var2=2, var3=c(1,2):
for loop with i=1, var3[1]=1, so output[[1]]=1+2=3
for loop with i=2, var3[2]=2, so output[[2]]=1-2=-1
Then the sapply() result is clear: sapply() simplifies the three inner lists into the columns of a matrix.
##      [,1] [,2] [,3]
## [1,] 1    2    3
## [2,] -3   -2   -1
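The sapply() call itself is not shown above. Assuming it was the lapply() call with sapply() swapped in, a sketch that reproduces the matrix looks like this (myfun is copied from the earlier chunk so the block runs on its own):

```r
myfun <- function(var1 = c(1, 2, 3), var2 = 1, var3 = c(1, 2, 3)) {
  output <- list()
  for (i in 1:length(var3)) {
    if (var3[i] == 1) output[[i]] <- var1 + var2
    if (var3[i] == 2) output[[i]] <- var1 - var2
    if (var3[i] == 3) output[[i]] <- var1 * var2
  }
  output
}

myvec <- c(-1, 0, 1)

# Assumed call: same arguments as the lapply() call above
res <- sapply(myvec, myfun, var2 = 2, var3 = c(1, 2))

dim(res)      # 2 rows (one per entry of var3), 3 columns (one per entry of myvec)
res[[2, 1]]   # -3, the subtraction result for var1 = -1
```

Because each call of myfun returns a list, res is a matrix whose cells are list elements, so use res[[row, col]] to pull out a single number.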
A related question is:
Notice that
lapply(a, function(x) x < 1.5)
## [[1]]
## [1] TRUE
##
## [[2]]
## [1] FALSE
##
## [[3]]
## [1] FALSE
is equivalent to:
list( 1 < 1.5, 2 < 1.5, 3 < 1.5)
## [[1]]
## [1] TRUE
##
## [[2]]
## [1] FALSE
##
## [[3]]
## [1] FALSE
so
a <- c(1, 2, 3)
b <- lapply(a, lapply, function(x) x < 1.5)
b
## [[1]]
## [[1]][[1]]
## [1] TRUE
##
##
## [[2]]
## [[2]][[1]]
## [1] FALSE
##
##
## [[3]]
## [[3]][[1]]
## [1] FALSE
is equivalent to
list( lapply(1, function(x) x < 1.5), lapply(2, function(x) x < 1.5), lapply(3, function(x) x < 1.5))
## [[1]]
## [[1]][[1]]
## [1] TRUE
##
##
## [[2]]
## [[2]][[1]]
## [1] FALSE
##
##
## [[3]]
## [[3]][[1]]
## [1] FALSE
which creates a list of lists.
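If you want to convince yourself of the equivalence, identical() can compare the two results. This is only a quick check; the helper name is_small is made up for illustration.

```r
a <- c(1, 2, 3)
is_small <- function(x) x < 1.5   # hypothetical name for the anonymous function

# Outer lapply passes each element of a to the inner lapply
nested <- lapply(a, lapply, is_small)

# The same thing written out by hand
by_hand <- list(lapply(1, is_small), lapply(2, is_small), lapply(3, is_small))

same <- identical(nested, by_hand)
print(same)             # TRUE
print(unlist(nested))   # TRUE FALSE FALSE
```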
Questions 18, 20, 21, 22, 25(c), and 26.
What is your question?
What is your question?
Practice writing functions:
Create a function roll() that rolls a die 6 times and returns how many times the number 2 came up. Hint: use sample(x, size, replace = FALSE, prob = NULL), where x is a vector and size is the number of items to choose.
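One possible sketch of roll() is below; it is not necessarily the posted solution. Note that replace must be set to TRUE, since the default replace = FALSE would return each face exactly once.

```r
roll <- function() {
  # Six independent rolls of a fair six-sided die
  rolls <- sample(1:6, size = 6, replace = TRUE)
  # Count how many of the rolls show a 2
  sum(rolls == 2)
}

set.seed(1)        # only so the example is reproducible
n_twos <- roll()
print(n_twos)      # an integer between 0 and 6
```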
MISPRINT. Sorry, this problem is missing directions on what genre and Sum_dN15_SG are. Make sure you understand the solution and you will be fine.
Writing a function to generate the Fibonacci sequence is a rite of passage for all programmers! Make sure you understand what each line of the solution code does.
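For practice, here is one minimal way to write such a function. It is a sketch, not the solution code referred to above, and it assumes the sequence starts 1, 1; the name fib is made up for illustration.

```r
fib <- function(n) {
  stopifnot(n >= 1)
  out <- numeric(n)          # pre-allocate the result vector
  out[1] <- 1
  if (n >= 2) out[2] <- 1
  if (n >= 3) {
    for (k in 3:n) {
      # each term is the sum of the two terms before it
      out[k] <- out[k - 1] + out[k - 2]
    }
  }
  out
}

fib10 <- fib(10)
print(fib10)   # 1 1 2 3 5 8 13 21 34 55
```

The guards on n matter because 3:n counts backwards when n is less than 3.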
Although writing functions is an important skill, on the midterm you are more likely to get a question asking you to interpret the output of a function.
#25c (actually do 25(a-c))
What is your question?
Make sure you can do this using the replication and sequence functions rep() and seq().
Recall
seq(0,11, by=2)
## [1] 0 2 4 6 8 10
rep(1:3, each =2, times=3)
## [1] 1 1 2 2 3 3 1 1 2 2 3 3 1 1 2 2 3 3
rep(1:25,each=2)
## [1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11 12
## [24] 12 13 13 14 14 15 15 16 16 17 17 18 18 19 19 20 20 21 21 22 22 23 23
## [47] 24 24 25 25
Not sure why you need the function split_chars(). You can just use strsplit("University of California","")
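A short sketch of what that call gives you. strsplit() returns a list with one element per input string, so [[1]] pulls out the character vector.

```r
# Splitting on the empty string yields one element per character
chars <- strsplit("University of California", "")[[1]]

length(chars)    # 24, spaces included
head(chars, 3)   # "U" "n" "i"
```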
1, 12, 19, 20
For every location, sex and month there are incomes; income is a categorical variable with levels poverty, low, mid and high. For every combination of location, sex and month we choose the max and min of income, which means the max and min alphabetically.
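To see what "alphabetically" means here, assuming income is stored as a character vector (the vector below is made up from the four level names):

```r
income <- c("poverty", "low", "mid", "high")

# On character vectors, min() and max() compare strings in sort order
lowest  <- min(income)   # "high"    (first alphabetically)
highest <- max(income)   # "poverty" (last alphabetically)
print(c(lowest, highest))
```

So the "min" is high and the "max" is poverty, which is not the economic ordering.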
What is your question?
14, 15, more specifically, how do we know the number of cases?
16, inner join, and could you please go over left and right joins?
Suppose you have these two tables:
The left table (LL in the code below), with variables clinicName, postalCode.
##   clinicName postalCode
## 1          A      22120
## 2          B      35752
## 3          C      56718
## 4          D      35752
## 5          E      67756
## 6          F      69129
## 7          G      73455
## 8          H      73455
## 9          I      76292
The right table (RR in the code below), with variables postalCode, over65, etc.
##   over65 postalCode
## 1   0.46      35752
## 2   0.72      22120
## 3   0.93      22120
## 4   0.26      92332
## 5   0.46      84739
## 6   0.94      67756
The diagram below shows the cases in the left and right tables. The lines show the matches between left and right. The cases connected by a match are the overlap cases; there are five of them in the diagram. Cases without a match also appear in both the left and right tables.
An inner join gives the matching pairs. Note that clinic A, which had two matches in the right table, appears twice, once for each matching pair in which clinic A is involved.
LL %>% inner_join(RR)
##   clinicName postalCode over65
## 1          A      22120   0.72
## 2          A      22120   0.93
## 3          B      35752   0.46
## 4          D      35752   0.46
## 5          E      67756   0.94
An outer join can include cases where there is no match. You might want to include the unmatched cases from the left table, from the right table, or from both tables.
LL %>% left_join( RR)
##    clinicName postalCode over65
## 1           A      22120   0.72
## 2           A      22120   0.93
## 3           B      35752   0.46
## 4           C      56718     NA
## 5           D      35752   0.46
## 6           E      67756   0.94
## 7           F      69129     NA
## 8           G      73455     NA
## 9           H      73455     NA
## 10          I      76292     NA
LL %>% right_join(RR)
##   clinicName postalCode over65
## 1          B      35752   0.46
## 2          D      35752   0.46
## 3          A      22120   0.72
## 4          A      22120   0.93
## 5       <NA>      92332   0.26
## 6       <NA>      84739   0.46
## 7          E      67756   0.94
LL %>% full_join(RR)
##    clinicName postalCode over65
## 1           A      22120   0.72
## 2           A      22120   0.93
## 3           B      35752   0.46
## 4           C      56718     NA
## 5           D      35752   0.46
## 6           E      67756   0.94
## 7           F      69129     NA
## 8           G      73455     NA
## 9           H      73455     NA
## 10          I      76292     NA
## 11       <NA>      92332   0.26
## 12       <NA>      84739   0.46
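On how to know the number of cases: count the matching pairs, then add the unmatched cases the join keeps. Here there are 5 matching pairs, 5 unmatched clinics in LL (C, F, G, H, I) and 2 unmatched rows in RR. The sketch below rebuilds the two tables and checks the counts. It uses base R merge() as a stand-in for the dplyr joins so that it runs without extra packages; the row counts are the same, though row and column order may differ from the output above.

```r
LL <- data.frame(
  clinicName = c("A", "B", "C", "D", "E", "F", "G", "H", "I"),
  postalCode = c(22120, 35752, 56718, 35752, 67756, 69129, 73455, 73455, 76292),
  stringsAsFactors = FALSE
)
RR <- data.frame(
  over65     = c(0.46, 0.72, 0.93, 0.26, 0.46, 0.94),
  postalCode = c(35752, 22120, 22120, 92332, 84739, 67756)
)

# inner: matching pairs only
n_inner <- nrow(merge(LL, RR, by = "postalCode"))
# left: matching pairs + unmatched rows of LL
n_left  <- nrow(merge(LL, RR, by = "postalCode", all.x = TRUE))
# right: matching pairs + unmatched rows of RR
n_right <- nrow(merge(LL, RR, by = "postalCode", all.y = TRUE))
# full: matching pairs + unmatched rows of both tables
n_full  <- nrow(merge(LL, RR, by = "postalCode", all = TRUE))

print(c(inner = n_inner, left = n_left, right = n_right, full = n_full))
# inner 5, left 10 (5 + 5), right 7 (5 + 2), full 12 (5 + 5 + 2)
```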
Q1, 12,14, 15,17,18,20,21,22,25