Source file ⇒ 2017-midterm_review.Rmd

Wing Vasiksiri

Tracing functions in question 1 and question 7

Problem 1

chunk a:

a <- c(1, 2, 3)
g <- function(rr) {
  h <- function(x) x + rr;
  return(h);
}
f <- function(a) {
  b <- list()
  for (i in 1:length(a)){
    b[[i]] <- g(a[[i]])
    print(b[[i]](10)) #LINE 1
  }
  return(b)
}
c <- f(a)

## [1] 11
## [1] 12
## [1] 13

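By lazy evaluation, the argument rr of g() is not evaluated when g(a[[i]]) is called; it is evaluated the first time the returned function h uses it. Inside the for loop, each function is called right after it is created:

b[[1]] <- g(a[[i]]) followed by print(b[[1]](10)): this call of b[[1]] causes us to evaluate a[[i]] with i=1, giving 1+10=11.

b[[2]] <- g(a[[i]]) followed by print(b[[2]](10)): this call of b[[2]] causes us to evaluate a[[i]] with i=2, giving 2+10=12.

b[[3]] <- g(a[[i]]) followed by print(b[[3]](10)): this call of b[[3]] causes us to evaluate a[[i]] with i=3, giving 3+10=13.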

chunk b:

# The only difference between these two chunks of code is whether LINE 1 is commented out or not
a <- c(1, 2, 3)
g <- function(rr) {
  h <- function(x) x + rr;
  return(h);
}
f <- function(a) {
  b <- list()
  for (i in 1:length(a)){
    b[[i]] <- g(a[[i]])
    #print(b[[i]](10)) #LINE 1
  }
  return(b)
}
c <- f(a)
print(c[[1]](10))

## [1] 13

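By lazy evaluation, the assignments b[[1]] <- g(a[[i]]), b[[2]] <- g(a[[i]]) and b[[3]] <- g(a[[i]]) each store a function, but the argument a[[i]] inside it is not evaluated until that function is first called.

The first call is print(c[[1]](10)), which happens after the loop has finished, at which time i=3 inside f. So the stored functions behave as if we had written

b[[1]] <- g(a[[3]]), b[[2]] <- g(a[[3]]), b[[3]] <- g(a[[3]])

so print(c[[1]](10)) is 3+10=13.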
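
If you wanted chunk b to behave like chunk a, one common remedy is to call force() on the argument so it is evaluated when g runs rather than when the returned function is first called. The sketch below is not part of the exam solution; the names g_forced, f_forced and c_forced are made up for illustration.

```r
a <- c(1, 2, 3)

# Hypothetical variant of g(): force(rr) evaluates the argument immediately,
# so each returned function remembers the value of a[[i]] at creation time.
g_forced <- function(rr) {
  force(rr)
  function(x) x + rr
}

# Same loop as chunk b, with nothing called inside the loop.
f_forced <- function(a) {
  b <- list()
  for (i in seq_along(a)) {
    b[[i]] <- g_forced(a[[i]])
  }
  b
}

c_forced <- f_forced(a)
print(c_forced[[1]](10))  # [1] 11
print(c_forced[[3]](10))  # [1] 13
```

With force() in place, the value of i at the time of the later call no longer matters.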

extra question for you:

What is the output of the following chunk?

i=3
a <- c(1, 2, 3)
g <- function(x){x+a[i]}
f <- function(a) {
  b <- list()
  for (i in 1:length(a)){
    b[[i]] <- function(x) g(x)
    print(b[[1]](10))
  }
  return(b)
}
c <- f(a)

7

Here is a review of how sapply works:

mylist <- list(1,2,3)
myfunc <- function(x,y,z) x+y+z

mylist %>% sapply(myfunc,10,20)

## [1] 31 32 33

mylist %>% sapply(myfunc,y=10,z=20)

## [1] 31 32 33
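
To see what the extra arguments are doing, it can help to write the same call with an explicit anonymous function. This is a small sketch without the pipe; the names explicit and shorthand are made up for illustration.

```r
mylist <- list(1, 2, 3)
myfunc <- function(x, y, z) x + y + z

# Each element of mylist is passed as the first argument (x);
# the extra arguments 10 and 20 are passed along as y and z.
explicit  <- sapply(mylist, function(elem) myfunc(elem, 10, 20))
shorthand <- sapply(mylist, myfunc, 10, 20)

identical(explicit, shorthand)  # TRUE
```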

Now for the question. It helps to analyze lapply() first:

myfun <- function(var1=c(1,2,3), var2=1, var3=c(1,2,3)) {
  output=list()
  for (i in 1:length(var3)) {
    if (var3[i]==1) {output[[i]]=var1+var2}
    if (var3[i]==2) {output[[i]]=var1-var2}
    if (var3[i]==3) {output[[i]]=var1*var2}
  }
  return(output)
}


myvec=c(-1,0,1)
lapply(myvec,myfun,var2=2,var3=c(1,2))

## [[1]]
## [[1]][[1]]
## [1] 1
## 
## [[1]][[2]]
## [1] -3
## 
## 
## [[2]]
## [[2]][[1]]
## [1] 2
## 
## [[2]][[2]]
## [1] -2
## 
## 
## [[3]]
## [[3]][[1]]
## [1] 3
## 
## [[3]][[2]]
## [1] -1

Understand what we get for each value of myvec.

var1=-1, var2=2, var3=c(1,2):
for loop with i=1, var3[1]=1 so output[[1]]=-1+2=1
for loop with i=2 var3[2]=2 so output[[2]]=-1-2=-3

var1=0, var2=2, var3=c(1,2):
for loop with i=1, var3[1]=1 so output[[1]]=0+2=2
for loop with i=2 var3[2]=2 so output[[2]]=0-2=-2

var1=1, var2=2, var3=c(1,2):
for loop with i=1, var3[1]=1 so output[[1]]=1+2=3
for loop with i=2 var3[2]=2 so output[[2]]=1-2=-1

Then the sapply() result is clear: the list computed for each value of myvec becomes one column.

##      [,1] [,2] [,3]
## [1,] 1    2    3   
## [2,] -3   -2   -1
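
The sapply() call itself is not shown in these notes. Assuming it is the lapply() call above with sapply() swapped in, the following sketch reproduces the matrix and checks its shape. Note that the result is a matrix whose cells are list elements, which is why it prints left-aligned.

```r
myfun <- function(var1 = c(1, 2, 3), var2 = 1, var3 = c(1, 2, 3)) {
  output <- list()
  for (i in seq_along(var3)) {
    if (var3[i] == 1) output[[i]] <- var1 + var2
    if (var3[i] == 2) output[[i]] <- var1 - var2
    if (var3[i] == 3) output[[i]] <- var1 * var2
  }
  output
}

myvec <- c(-1, 0, 1)

# Assumed call: same arguments as the lapply() example, simplified by sapply()
res <- sapply(myvec, myfun, var2 = 2, var3 = c(1, 2))

dim(res)      # 2 3: one row per entry of var3, one column per entry of myvec
is.list(res)  # TRUE: the cells are list elements, not plain numbers
res[[2, 1]]   # -3: the var3 == 2 result (var1 - var2) for myvec[1] = -1
```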

A related question is:

3

Notice that

lapply(a, function(x) x < 1.5)

## [[1]]
## [1] TRUE
## 
## [[2]]
## [1] FALSE
## 
## [[3]]
## [1] FALSE

is equivalent to:

list(  1 < 1.5, 2 < 1.5, 3 < 1.5)

## [[1]]
## [1] TRUE
## 
## [[2]]
## [1] FALSE
## 
## [[3]]
## [1] FALSE

a <- c(1, 2, 3)
b <- lapply(a, lapply, function(x) x < 1.5)
b

## [[1]]
## [[1]][[1]]
## [1] TRUE
## 
## 
## [[2]]
## [[2]][[1]]
## [1] FALSE
## 
## 
## [[3]]
## [[3]][[1]]
## [1] FALSE

is equivalent to

list( lapply(1, function(x) x < 1.5), lapply(2, function(x) x < 1.5), lapply(3, function(x) x < 1.5))

## [[1]]
## [[1]][[1]]
## [1] TRUE
## 
## 
## [[2]]
## [[2]][[1]]
## [1] FALSE
## 
## 
## [[3]]
## [[3]][[1]]
## [1] FALSE

creating a list of lists
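
If you want to convince yourself of the equivalence without comparing printed output by eye, identical() can do the comparison. This is only a quick check; the names is_small and by_hand are made up for illustration.

```r
a <- c(1, 2, 3)
is_small <- function(x) x < 1.5  # hypothetical name for the anonymous function

# Outer lapply() calls lapply(a[[i]], is_small) for each element of a
b <- lapply(a, lapply, is_small)

# The same list of lists, written out by hand
by_hand <- list(lapply(1, is_small), lapply(2, is_small), lapply(3, is_small))

identical(b, by_hand)  # TRUE
unlist(b)              # TRUE FALSE FALSE
```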

Yiming Shi

Questions 18, 20, 21, 22, 25(c), and 26.

18:

What is your question?

20:

What is your question?

Practice writing functions:

Create a function roll() that rolls a die 6 times and returns how many times the number 2 came up. Hint: use sample(x, size, replace = FALSE, prob = NULL), where x is the vector to sample from and size is the number of items to choose.
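
Try it yourself first. For comparison, here is a minimal sketch of one possible roll(), assuming a fair six-sided die; it is not an official solution. Note that replace = TRUE is needed, because the six rolls are independent and the same face can come up more than once.

```r
roll <- function() {
  # Six independent rolls of a fair die
  rolls <- sample(1:6, size = 6, replace = TRUE)
  # Comparing to 2 gives TRUE/FALSE; sum() counts the TRUEs
  sum(rolls == 2)
}

set.seed(133)  # arbitrary seed, only so the run is repeatable
roll()
```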

21

MISPRINT. Sorry, this problem is missing directions on what genre and Sum_dN15_SG are. Make sure you understand the solution and you will be fine.

22

Writing a function to generate the Fibonacci sequence is a rite of passage for all programmers! Make sure you understand what each line of the solution code does.

Although writing functions is an important skill for the midterm, you are more likely to get a question asking you to interpret the output of a function.
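
The exam's solution code is not reproduced in these notes. As a stand-in to practice tracing, here is a minimal sketch of one way to generate the sequence, assuming it starts 1, 1; the name fib_seq is made up and the exam's version may differ.

```r
fib_seq <- function(n) {
  stopifnot(n >= 1)
  fib <- numeric(n)       # vector of n zeros to fill in
  fib[1] <- 1
  if (n >= 2) fib[2] <- 1
  if (n >= 3) {
    # Each later term is the sum of the two terms before it
    for (k in 3:n) fib[k] <- fib[k - 1] + fib[k - 2]
  }
  fib
}

fib_seq(10)  # 1 1 2 3 5 8 13 21 34 55
```

The guard if (n >= 3) matters: without it, 3:n would count backwards for n = 1 or n = 2.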

25(c) (actually do 25(a-c))

What is your question?

Make sure you can do this using the replication and sequence functions rep() and seq().

Recall

seq(0,11, by=2)

## [1]  0  2  4  6  8 10

rep(1:3, each =2, times=3)

##  [1] 1 1 2 2 3 3 1 1 2 2 3 3 1 1 2 2 3 3

rep(1:25,each=2)

##  [1]  1  1  2  2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 11 12
## [24] 12 13 13 14 14 15 15 16 16 17 17 18 18 19 19 20 20 21 21 22 22 23 23
## [47] 24 24 25 25

26

Not sure why you need the function split_chars(). You can just use strsplit("University of California", ""), which returns a list whose first element is the vector of characters.
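
A short sketch of the strsplit() approach; the variable name chars is made up for illustration. The [[1]] is there because strsplit() returns a list with one element per input string.

```r
# Splitting on the empty string "" breaks the string into single characters
chars <- strsplit("University of California", "")[[1]]

length(chars)   # 24 (spaces count as characters)
head(chars, 3)  # "U" "n" "i"
```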

Yue She

1, 12, 19, 20

12

For every location, sex and month there are incomes. Income is a categorical variable with levels poverty, low, mid and high. Among the incomes for a particular location, sex and month we choose the max and min, which means the max and min alphabetically.
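
To see what "alphabetically" means here, this sketch applies max() and min() to the four levels, assuming income is stored as a character vector (the variable name income is made up for illustration).

```r
income <- c("poverty", "low", "mid", "high")

sort(income)  # "high" "low" "mid" "poverty"
max(income)   # "poverty": last in alphabetical order, not the highest income
min(income)   # "high": first in alphabetical order, not the lowest income
```

So the "max" income is poverty and the "min" income is high whenever both appear in a group, which is the opposite of what the words suggest.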

19

What is your question?

Leon Gutierrez

14, 15, more specifically, how do we know the number of cases?

16, inner join, and could you please go over left and right joins?

Suppose you have these two tables:

Left: cases are medical clinics. The variables: clinicName, postalCode.

##   clinicName postalCode
## 1          A      22120
## 2          B      35752
## 3          C      56718
## 4          D      35752
## 5          E      67756
## 6          F      69129
## 7          G      73455
## 8          H      73455
## 9          I      76292

Right: cases are postal codes. Variables reflect the demographics of that postal code: postalCode, over65, etc.

##   over65 postalCode
## 1   0.46      35752
## 2   0.72      22120
## 3   0.93      22120
## 4   0.26      92332
## 5   0.46      84739
## 6   0.94      67756

The diagram below shows the cases in the left and right tables. The lines show the matches between left and right. The cases connected by a match are the overlap cases; there are five of them in the diagram. Cases without a match also appear in both the left and right tables.

An inner join gives the matching pairs. Note that clinic A, which had two matches in the right table, appears twice, once for each matching pair in which clinic A is involved.

LL %>% inner_join(RR)

##   clinicName postalCode over65
## 1          A      22120   0.72
## 2          A      22120   0.93
## 3          B      35752   0.46
## 4          D      35752   0.46
## 5          E      67756   0.94

An outer join can include cases where there is no match. You might want to include the unmatched cases from the left table, from the right table, or from both tables.

Unmatched cases from the left table

LL %>% left_join( RR)

##    clinicName postalCode over65
## 1           A      22120   0.72
## 2           A      22120   0.93
## 3           B      35752   0.46
## 4           C      56718     NA
## 5           D      35752   0.46
## 6           E      67756   0.94
## 7           F      69129     NA
## 8           G      73455     NA
## 9           H      73455     NA
## 10          I      76292     NA

Unmatched cases from the right table

LL %>%  right_join(RR)

##   clinicName postalCode over65
## 1          B      35752   0.46
## 2          D      35752   0.46
## 3          A      22120   0.72
## 4          A      22120   0.93
## 5       <NA>      92332   0.26
## 6       <NA>      84739   0.46
## 7          E      67756   0.94

Unmatched cases from both tables

LL %>% full_join(RR)

##    clinicName postalCode over65
## 1           A      22120   0.72
## 2           A      22120   0.93
## 3           B      35752   0.46
## 4           C      56718     NA
## 5           D      35752   0.46
## 6           E      67756   0.94
## 7           F      69129     NA
## 8           G      73455     NA
## 9           H      73455     NA
## 10          I      76292     NA
## 11       <NA>      92332   0.26
## 12       <NA>      84739   0.46
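
The tables LL and RR are never constructed in these notes. The sketch below rebuilds them from the printed values so you can run the four joins yourself and compare row counts. It assumes the dplyr package is installed; the list name joins is made up for illustration.

```r
library(dplyr)

# Left table: one row per clinic
LL <- data.frame(
  clinicName = c("A", "B", "C", "D", "E", "F", "G", "H", "I"),
  postalCode = c(22120, 35752, 56718, 35752, 67756, 69129, 73455, 73455, 76292),
  stringsAsFactors = FALSE
)

# Right table: demographics by postal code (22120 appears twice)
RR <- data.frame(
  over65     = c(0.46, 0.72, 0.93, 0.26, 0.46, 0.94),
  postalCode = c(35752, 22120, 22120, 92332, 84739, 67756)
)

# Naming the key with by = avoids the "Joining, by" message.
# Recent dplyr versions may also warn about a many-to-many relationship,
# because postal codes repeat in both tables; that is expected here.
joins <- list(
  inner = inner_join(LL, RR, by = "postalCode"),
  left  = left_join(LL, RR, by = "postalCode"),
  right = right_join(LL, RR, by = "postalCode"),
  full  = full_join(LL, RR, by = "postalCode")
)

sapply(joins, nrow)  # inner 5, left 10, right 7, full 12
```

The row order you get may differ from the printouts above depending on your dplyr version, but the row counts should agree.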

Minzhi Zhang

Q1, 12,14, 15,17,18,20,21,22,25

2017 Stat 133 Midterm review
