R stores things called objects in its memory, which is called the workspace. Objects are defined by you, the user. In R the = sign means ‘is’, and it is used to create objects. Instead of the = sign you can also use <-. The two are equivalent for creating objects, except that the = sign is also used to supply values for arguments inside function calls. Some prefer <- because it is less ambiguous and may be more readable. Others prefer the = sign because it requires fewer keystrokes and is used in other programming languages.
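The difference between the two operators only shows up inside a function call. The following is a minimal sketch; the object names a, b and d are hypothetical and are not used elsewhere in these exercises.

```r
# At the top level, both operators create an object in the workspace.
a = 5
b <- 5
a == b                 # TRUE

# Inside a call, `=` matches a value to the argument named x of mean();
# it does not create an object.
mean(x = c(1, 2, 3))   # 2

# Inside a call, `<-` still assigns: this creates d, then takes its mean.
mean(d <- c(1, 2, 3))  # 2
d                      # 1 2 3
```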
R also comes with built-in objects that contain functions. A function receives an input and returns an output, which you can save in an object. For example, the c() function creates a vector. A vector is a sequence of data elements of the same type.
x<- c(0, 1, 2)
x
## [1] 0 1 2
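Because all elements of a vector must share one type, R converts mixed input to a common type. A small sketch of this behaviour, using the hypothetical names mixed and nums:

```r
# Mixing numbers and text: every element is converted to character.
mixed <- c(0, 1, "two")
mixed          # "0" "1" "two"
class(mixed)   # "character"

# A purely numeric vector keeps its numeric type.
nums <- c(0, 1, 2)
class(nums)    # "numeric"
```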
Vectors can also be created using other vectors as input.
x2<- c(x, x, x, x, x)
x2
## [1] 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2
There are usually many ways to do the same thing in R. We can create the same vector using the rep() function. The first argument of the function is the object to be repeated, and the second argument is the number of times to repeat it.
x2v2<- rep(x, 5)
x2v2
## [1] 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2
This vector should be equivalent to the vector x2.
We can check whether the elements of x2 and x2v2 are equal by using the ‘==’ operator. This returns a logical vector of TRUE/FALSE values, one for each pair of elements.
x2v2 == x2
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
We can also check whether any elements of the two vectors differ, using the any() function together with the ‘!=’ operator, which means ‘not equal’.
any(x2v2 != x2)
## [1] FALSE
This returns a single TRUE/FALSE value indicating whether there are any non-equivalent elements between the two vectors. Here it returns FALSE, so the two vectors match element by element.
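Other base R functions collapse a comparison into a single answer in a similar way. The sketch below uses small hypothetical vectors v1, v2 and v3 rather than the vectors created above.

```r
v1 <- c(0, 1, 2, 0, 1, 2)
v2 <- rep(c(0, 1, 2), 2)

all(v1 == v2)       # TRUE: every pair of elements is equal
identical(v1, v2)   # TRUE: the two objects are exactly the same

# Change one element to see a mismatch being detected.
v3 <- v2
v3[1] <- 9
any(v1 != v3)       # TRUE: at least one pair differs
which(v1 != v3)     # 1: the position of the mismatch
```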
We can also randomize the contents of a vector using the sample() function. By default it shuffles the contents (sampling without replacement), but it can also be used to draw a random sample with replacement.
x2<- sample(x2)
x2
## [1] 1 0 1 0 2 2 2 1 2 2 0 0 1 1 0
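Sampling with replacement can be sketched as follows. The names alleles and draw and the seed value are assumptions made for illustration only; the seed is there so that the same draw can be repeated.

```r
set.seed(1)                 # hypothetical seed, only for repeatability
alleles <- c(0, 1, 2)

# Draw 15 values from the three possible genotype codes.
# replace = TRUE is required because 15 is more than length(alleles).
draw <- sample(alleles, size = 15, replace = TRUE)

length(draw)                # 15
table(draw)                 # counts per code; these need not be equal
```

Unlike shuffling a vector that holds each code exactly five times, a draw with replacement will generally not contain equal numbers of 0, 1 and 2.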
Names can be assigned to each element of a vector. First we create a vector of names. The paste() function can be used to create names that follow a pattern.
nm<- paste('loc', seq_along(x2), sep="_")
nm
## [1] "loc_1" "loc_2" "loc_3" "loc_4" "loc_5" "loc_6" "loc_7" "loc_8"
## [9] "loc_9" "loc_10" "loc_11" "loc_12" "loc_13" "loc_14" "loc_15"
Next we assign the names using the names() function.
names(x2)<-nm
x2
## loc_1 loc_2 loc_3 loc_4 loc_5 loc_6 loc_7 loc_8 loc_9 loc_10 loc_11
## 1 0 1 0 2 2 2 1 2 2 0
## loc_12 loc_13 loc_14 loc_15
## 0 1 1 0
Congratulations, you just simulated genotypes at 15 loci for a single individual.
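Once a vector has names, its elements can be looked up by name as well as by position. A minimal sketch with a hypothetical three-locus vector geno:

```r
geno <- c(1, 0, 2)
names(geno) <- paste("loc", seq_along(geno), sep = "_")

geno["loc_3"]               # the element named loc_3, value 2
geno[c("loc_1", "loc_3")]   # two elements at once, values 1 and 2
names(geno)                 # "loc_1" "loc_2" "loc_3"
```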
In this set of exercises, you will learn the basics of manipulating vectors that resemble vectors of genotypes.
We previously demonstrated how to create this vector. Here we create it again quickly. Note that it will look different each time you create it because the sample() function randomly shuffles the elements.
x<- c(0, 1, 2)
x2<- rep(x, 5)
x2<- sample(x2)
names(x2)<- paste('loc', seq_along(x2), sep="_")
x2
## loc_1 loc_2 loc_3 loc_4 loc_5 loc_6 loc_7 loc_8 loc_9 loc_10 loc_11
## 2 1 0 0 0 0 1 1 2 1 2
## loc_12 loc_13 loc_14 loc_15
## 0 2 2 1
This is a simulated genotype at 15 loci for a single individual. The genotypic values are coded 0, 1, 2, which represent the number of copies of the reference allele at each locus.
Typically, genotypic values are centered at zero. This makes the results of analysis easier to interpret. Center the genotypic values by subtracting 1 from your vector and save the result to a new object ‘x2b’.
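Arithmetic on a vector is applied to every element. As an illustration only, here is the same kind of operation on a hypothetical vector g (not the exercise vector), with the result stored under the hypothetical name g_centered:

```r
g <- c(0, 1, 2, 2, 0)   # hypothetical genotypes coded 0, 1, 2
g_centered <- g - 1     # 1 is subtracted from every element
g_centered              # -1 0 1 1 -1
```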
Tabulate how many of each genotype are in your vector using the function table(). Save the results in an object ‘tbx2’.
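To show what table() returns, here is a sketch on a hypothetical vector g. The result is a set of counts labelled by the distinct values, and a single count can be looked up by its label.

```r
g <- c(0, 1, 2, 2, 0, 2)   # hypothetical genotypes
counts <- table(g)
counts                     # two 0s, one 1, three 2s

# The labels are character strings, so quote them when looking one up.
counts["2"]                # 3
```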
Elements of a vector are retrieved using vector indexing. An index is a number that indicates the position(s) that you want to retrieve. For example, x2b[1] retrieves the first element of the vector (the index is 1), and x2b[1:3] retrieves elements 1 to 3 (the indices are 1, 2, 3).
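The sketch below applies these forms of indexing to a hypothetical vector g. The last two lines, negative and logical indices, go beyond what is described above and are included only as further illustrations of base R indexing.

```r
g <- c(-1, 0, 1, 1, -1)   # hypothetical centered genotypes

g[1]         # first element: -1
g[1:3]       # elements 1 to 3: -1 0 1
g[c(1, 5)]   # elements 1 and 5: -1 -1
g[-1]        # everything except the first element: 0 1 1 -1
g[g == 1]    # elements for which the condition is TRUE: 1 1
```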
Use the rnorm() function to create a vector of 15 numbers sampled from a normal distribution with mean zero and standard deviation one. This is a simulation of additive allele effects for each of the 15 loci in our genotypic vector x2b. Save this allele effect vector as an object ‘eff’. Note that in reality it is impossible to know the true values of allele effects.
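For reference, a call to rnorm() with its arguments written out looks like the sketch below. The object name demo_eff and the seed are assumptions made for illustration; without a seed, the numbers differ on every run.

```r
set.seed(42)   # hypothetical seed, only for repeatability

# n is the number of draws; mean and sd describe the normal distribution.
demo_eff <- rnorm(n = 15, mean = 0, sd = 1)

length(demo_eff)   # 15
mean(demo_eff)     # near 0, but not exactly 0 in a sample of 15
```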
We will discuss the meaning of true breeding value in depth in another lecture. Briefly, breeding value is the additive portion of the genotypic effect, where genotypic effect is the effect of the genotype on the phenotype. The breeding value of a genotype at a single locus is the additive allele effect multiplied by the coded genotype. You can compute these for all loci using x2b * eff. (The * symbol does element-wise multiplication of the two vectors.) The breeding value of an individual is then the sum of the breeding values across all loci. This can also be computed in a single step using matrix multiplication: x2b %*% eff, which returns the sum as a matrix with one row and one column. Try both methods (element-wise multiplication and matrix multiplication) for computing the breeding value of x2b and show your work.
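To see that the two routes agree, here is a small worked sketch on three hypothetical loci. The vectors g and a and their values are made up for illustration and are not the exercise objects.

```r
g <- c(-1, 0, 1)          # hypothetical centered genotypes at three loci
a <- c(0.5, -0.2, 0.3)    # hypothetical additive allele effects

# Route 1: element-wise products, then the sum across loci.
per_locus <- g * a        # -0.5 0.0 0.3
bv_sum <- sum(per_locus)  # -0.2

# Route 2: matrix multiplication gives the same sum in one step,
# returned as a matrix with one row and one column.
bv_mat <- g %*% a
dim(bv_mat)               # 1 1

all.equal(bv_sum, as.numeric(bv_mat))   # TRUE
```

The as.numeric() call drops the matrix structure so that the two results can be compared as plain numbers.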