In this session students will learn how to create an R Markdown document and the basics of programming through a case study of creating a randomized experiment. Topics covered:
.Rmd
matrix()
sample()
for loops
the %in% operator to define group membership
if()
c()
Consider the set up:
We want to give each experimental unit an ID. Since ultimately we
are going to randomly draw from this list of IDs, we can just assign
number IDs from 1 to 24. This can be done by using the colon (:).
The syntax for the colon function is starting integer : ending integer.
# STEP 1: Giving IDs to Experimental Units
# recall in our example from class there were 24 plants (experimental units)
# the colon is a way to create a vector of consecutive integers
ids<-1:24
# note that we are storing the output as a vector
ids
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
# INSERT CODE HERE #
# INSERT CODE HERE #
In the example we saw, the lab bench had 4 rows and 6 columns. We can
use the matrix function to organize all our experimental
units. Matrices have two dimensions, rows and columns (in that order). A
matrix is a very useful way to store numbers, and there are also special
mathematical operations that can be performed on matrices.
Start by reading the documentation about matrices:
# let's read the documentation about the matrix function to learn about its arguments
?matrix
# the inputs of this function are the data, nrow, and ncol
Now let’s organize our experimental units:
## STEP 2: Organizing the experimental units into rows and columns
# in the example we saw, the lab bench had 4 rows and 6 columns
labBench<-matrix(ids, nrow=4, ncol=6)
labBench
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 5 9 13 17 21
## [2,] 2 6 10 14 18 22
## [3,] 3 7 11 15 19 23
## [4,] 4 8 12 16 20 24
# this is a matrix!
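Notice that matrix() fills column by column by default, which is why IDs 1 to 4 run down the first column above. As a small sketch (the names byCol and byRow are made up for illustration and are not part of the lab), the byrow argument switches the fill direction:

```r
ids <- 1:24

# default: fill down each column first
byCol <- matrix(ids, nrow = 4, ncol = 6)

# byrow = TRUE: fill across each row first
byRow <- matrix(ids, nrow = 4, ncol = 6, byrow = TRUE)

byCol[1, ]  # first row of the column-filled matrix
## [1]  1  5  9 13 17 21
byRow[1, ]  # first row of the row-filled matrix
## [1] 1 2 3 4 5 6

# dim() reports the number of rows and columns (in that order)
dim(byRow)
## [1] 4 6
```

Either layout holds the same 24 IDs; only the position of each ID on the "bench" changes.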
ncol)
# INSERT CODE HERE #
nrow \(\times\) ncol is not equal to the length of the data?
# INSERT CODE HERE #
We want to randomly assign treatments to our experimental units to avoid confounding. We can use R to help us with this task.
The simplest form of an experimental design is a
Completely Randomized Design. In this design
we choose IDs and randomly assign them to treatments. In order to choose
which IDs will go in which treatments, we can use the sample() function.
First, let’s learn about the sample function:
?sample
sample
Let’s try it!
crd_samp<-sample(ids, replace=FALSE)
crd_samp
## [1] 3 18 13 16 1 19 5 11 9 10 7 21 24 23 4 15 14 17 22 2 6 20 12 8
ANSWER HERE:
ANSWER HERE:
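Because sample() is random, the draw shown above will differ each time the code is run. The following is a minimal sketch of how one might make a draw reproducible and check that it is a reordering of the IDs. The seed value 2024 and the names draw1 and draw2 are arbitrary choices for illustration, not part of the lab.

```r
ids <- 1:24

# set.seed() fixes the state of the random number generator,
# so the same "random" draw is produced every time
set.seed(2024)  # arbitrary seed, for illustration only
draw1 <- sample(ids, replace = FALSE)

set.seed(2024)
draw2 <- sample(ids, replace = FALSE)

identical(draw1, draw2)      # same seed, same draw
## [1] TRUE

# sampling all 24 IDs without replacement just reorders them:
# every ID appears exactly once
identical(sort(draw1), ids)
## [1] TRUE
```

Recording the seed in the .Rmd means the same randomization can be regenerated later, for example when the design needs to be audited.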
Now let’s try assigning our IDs to treatments using a matrix. Here
nrow will correspond to the number of treatments.
Let’s say that there are four treatments (A, B, C, and D).
## Completely randomized design
## choose ID's and randomly assign to treatments
# nrow will be the number of treatments
crd_mat<-matrix(crd_samp, nrow=4)
crd_mat
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 3 1 9 24 14 6
## [2,] 18 19 10 23 17 20
## [3,] 13 5 7 4 22 12
## [4,] 16 11 21 15 2 8
# we can also rename the rows
rownames(crd_mat)<-c("Treat A", "Treat B", "Treat C", "Treat D")
crd_mat
## [,1] [,2] [,3] [,4] [,5] [,6]
## Treat A 3 1 9 24 14 6
## Treat B 18 19 10 23 17 20
## Treat C 13 5 7 4 22 12
## Treat D 16 11 21 15 2 8
We can use the matrix from the previous step to know which
experimental units are in their respective treatments; however, it might
be easier if we made a map that showed where the treatments are
located. To accomplish this task we’ll need to understand
for loops, the %in% operator, and conditions.
for loops
For loops can be used to repeat the same basic task over and over again. Check this one out! What is it doing?
## FOR LOOPS
for(i in 1:5){
print(i)
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
%in% operator
This operator identifies whether the element specified on the left-hand side is contained in the set specified on the right-hand side, \(a \in B\).
## %in% operator
1 %in% c(1, 2, 3)
## [1] TRUE
5 %in% c(1, 2, 3)
## [1] FALSE
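The left-hand side of %in% does not have to be a single value. As a short sketch (the vector treatA is a made-up example set of IDs), a vector on the left is checked element by element:

```r
treatA <- c(3, 1, 9, 24, 14, 6)   # hypothetical set of IDs, for illustration

# a single value on the left gives a single TRUE/FALSE
9 %in% treatA
## [1] TRUE

# a vector on the left is checked element by element
c(1, 2, 3) %in% treatA
## [1]  TRUE FALSE  TRUE

# which() turns the logical vector into positions
which(1:24 %in% treatA)
## [1]  1  3  6  9 14 24
```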
This is a little complicated, so I will provide the code. Let’s pay careful attention to how each piece is working together.
## Making a map of this design
## here we will learn to create a loop and how to write conditionals
treats<-matrix(nrow=24) # create an empty 24 x 1 matrix (filled with NA) to hold the treatments later
for(i in 1:24){
if(i %in% crd_mat[1,]){
treats[i]<-"A"
}
if(i %in% crd_mat[2,]){
treats[i]<-"B"
}
if(i %in% crd_mat[3,]){
treats[i]<-"C"
}
if(i %in% crd_mat[4,]){
treats[i]<-"D"
}
}
treats
## [,1]
## [1,] "A"
## [2,] "D"
## [3,] "A"
## [4,] "C"
## [5,] "C"
## [6,] "A"
## [7,] "C"
## [8,] "D"
## [9,] "A"
## [10,] "B"
## [11,] "D"
## [12,] "C"
## [13,] "C"
## [14,] "A"
## [15,] "D"
## [16,] "D"
## [17,] "B"
## [18,] "B"
## [19,] "B"
## [20,] "B"
## [21,] "D"
## [22,] "C"
## [23,] "B"
## [24,] "A"
## make the map!
expDes<-matrix(treats, nrow=4)
expDes
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] "A" "C" "A" "C" "B" "D"
## [2,] "D" "A" "B" "A" "B" "C"
## [3,] "A" "C" "D" "D" "B" "B"
## [4,] "C" "D" "C" "D" "B" "A"
How does this look?
Does it appear “random”?
Note: We might be tempted to think that just because there are clusters of the same treatment together that the design is not random; however, we used a random mechanism to assign IDs to treatment. Using a random mechanism does not guarantee that there won’t be these clusters.
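The loop used above to fill treats can also be written without a loop. The sketch below is one possible alternative, not the lab's method: it re-enters the randomized IDs printed earlier so that it runs on its own, and the names treatsLoop and treatsVec are made up for illustration. It relies on row(), which returns the row number of every cell of a matrix, and on the built-in LETTERS vector.

```r
# the randomized IDs printed earlier, re-entered so this block is self-contained
crd_samp <- c(3, 18, 13, 16, 1, 19, 5, 11, 9, 10, 7, 21,
              24, 23, 4, 15, 14, 17, 22, 2, 6, 20, 12, 8)
crd_mat <- matrix(crd_samp, nrow = 4)

# loop version: same logic as the lab's loop, with the four if()
# statements folded into an inner loop over the treatment rows
treatsLoop <- character(24)
for (i in 1:24) {
  for (r in 1:4) {
    if (i %in% crd_mat[r, ]) {
      treatsLoop[i] <- LETTERS[r]
    }
  }
}

# loop-free version: for every cell of crd_mat, store the letter of
# its row at the position given by the ID held in that cell
treatsVec <- character(24)
treatsVec[as.vector(crd_mat)] <- LETTERS[as.vector(row(crd_mat))]

identical(treatsLoop, treatsVec)
## [1] TRUE

# reshaping gives the map of treatments on the bench
matrix(treatsVec, nrow = 4)
```

Both versions give the same assignment; the loop is easier to read step by step, while the indexed version is shorter once matrix indexing feels familiar.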
We have been using matrices a lot! When working with matrices it’s a good idea to know how we can call specific subsets within the matrix. Every cell of a matrix has an address, which is defined by the row and column that it’s in.
In the following examples we will see how this can be used:
# EXAMPLE:
expDes[3,2]
## [1] "C"
# INSERT CODE HERE #
# INSERT CODE HERE #
# INSERT CODE HERE #
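As a further illustration of matrix addresses, here is a small sketch on a toy matrix (the object m is made up for this example and is separate from the experiment). The row goes before the comma and the column after it; leaving one side blank selects everything along that dimension.

```r
m <- matrix(1:12, nrow = 3)   # toy 3 x 4 matrix, filled column by column

m[2, 3]           # single cell: row 2, column 3
## [1] 8
m[2, ]            # blank column index: the whole of row 2
## [1]  2  5  8 11
m[, 3]            # blank row index: the whole of column 3
## [1] 7 8 9
m[1:2, c(1, 4)]   # several rows and columns at once
##      [,1] [,2]
## [1,]    1   10
## [2,]    2   11
```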
If we know that there is a natural gradient or more homogeneous subgroups within our experiment, we might consider blocking to improve our design. The hallmark of a randomized complete block design is that every treatment must be present in every block. This allows us to avoid confounding treatment with block.
In this example columns are used as blocks. We will randomly assign where the treatments are placed within each block.
To do this we will see another way to store the output from our loop. This time we will do it by concatenating our loop output.
output<-c() # start with an empty vector
for(i in 1:6){
thisOut<-i
output<-c(output, thisOut) # the new output vector is the old output vector plus the new observation
print(output)
}
## [1] 1
## [1] 1 2
## [1] 1 2 3
## [1] 1 2 3 4
## [1] 1 2 3 4 5
## [1] 1 2 3 4 5 6
Let’s do this with our experiment!
## STEP: Blocked Design
## example: if there is a gradient across columns there are 6 blocks
# we will learn how to concatenate here
# start with an empty vector
blockTreats<-c()
for(i in 1:6){
thisSample<-sample(c("A", "B", "C", "D"), replace=FALSE)
blockTreats<-c(blockTreats, thisSample)
}
blockDes<-matrix(blockTreats, nrow=4)
blockDes
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] "B" "C" "A" "C" "D" "B"
## [2,] "D" "D" "B" "D" "A" "A"
## [3,] "A" "B" "D" "A" "C" "C"
## [4,] "C" "A" "C" "B" "B" "D"
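To confirm that a layout like the one above really is a complete block design, one can check that every column contains each treatment. The sketch below is an assumption-laden alternative rather than the lab's code: it builds the design with replicate() instead of a loop, and the names trts, blockAlt, and completeBlocks are made up for illustration. Its random layout will differ from the one printed above.

```r
trts <- c("A", "B", "C", "D")

# replicate() repeats the random draw 6 times and binds the results
# as the columns of a 4 x 6 matrix (one column per block)
blockAlt <- replicate(6, sample(trts, replace = FALSE))
dim(blockAlt)
## [1] 4 6

# check: does every block (column) contain all four treatments?
# with 4 cells per block, that means each treatment appears exactly once
completeBlocks <- apply(blockAlt, 2, function(block) setequal(block, trts))
all(completeBlocks)
## [1] TRUE
```

The same apply() check can be run on blockDes from the loop version.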