R programming demonstration

Working with objects and vectors

R stores objects in its memory, which is called the workspace. Objects are defined by you, the user. In R the = sign assigns a value to a name, and it is used to create objects. Instead of the = sign you can also use <- . For creating objects the two are equivalent, but the = sign is also used to supply values for arguments in function calls. Some prefer <- because it is less ambiguous and may be more readable. Others prefer the = sign because it requires fewer keystrokes and is used in other programming languages.
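As a brief illustration of the two roles of the = sign, the sketch below creates two objects and then uses = to name arguments inside a call to round(). The object names a_demo, b_demo and r_demo are hypothetical and used only for this example.

```r
# Create the same object with each assignment operator (names are illustrative)
a_demo <- 3
b_demo = 3
identical(a_demo, b_demo)  # TRUE: both objects hold the same value

# Inside a function call, = names an argument instead of creating an object
r_demo <- round(x = 3.14159, digits = 2)
r_demo  # 3.14
```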

R also has built-in objects that contain functions. A function receives an input and then returns an output, which you can save in an object. For example, the c() function creates a vector. A vector is a sequence of data elements of the same type.

Create an object x which is a vector equal to 0, 1, 2.

x<- c(0, 1, 2)
x

## [1] 0 1 2

Create a vector from other vectors

Vectors can also be created using other vectors as input. The c() function combines them into a single flat vector.

x2<- c(x, x, x, x, x)
x2

##  [1] 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2

Create the same vector using the rep() function

There are usually many ways to do the same thing in R. We can create the same vector using the rep() function. The first argument of the function is the object to be repeated, and the second argument is the number of times to repeat it.

x2v2<- rep(x, 5)
x2v2

##  [1] 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2

This vector should be equivalent to the vector x2.
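rep() can also repeat each element in turn rather than the whole vector. A small sketch of the difference, using the times and each arguments and hypothetical object names (x_demo, whole_demo, each_demo):

```r
x_demo <- c(0, 1, 2)

# Repeat the whole vector twice
whole_demo <- rep(x_demo, times = 2)
whole_demo  # 0 1 2 0 1 2

# Repeat each element twice before moving to the next
each_demo <- rep(x_demo, each = 2)
each_demo   # 0 0 1 1 2 2
```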

Check if the vector elements are equal

We can check whether the elements of x2 and x2v2 are equal by using the ‘==’ operator. This compares the two vectors element by element and returns a logical vector of TRUE/FALSE values.

x2v2 == x2

##  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

Check if any of the vector elements are not equal

We can also check whether any elements of the two vectors differ by combining the any() function with the ‘!=’ operator, which means ‘not equal’.

any(x2v2 != x2)

## [1] FALSE

This returns a single TRUE or FALSE indicating whether any elements differ between the two vectors. Here the result is FALSE, so the vectors match at every position.
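Two related checks may also be useful here. This is a sketch using the base R functions all() and identical() on two small hypothetical vectors (v1_demo and v2_demo), not part of the original walkthrough:

```r
v1_demo <- c(0, 1, 2, 0, 1, 2)
v2_demo <- rep(c(0, 1, 2), 2)

# TRUE only if every element-wise comparison is TRUE
all_equal_demo <- all(v1_demo == v2_demo)
all_equal_demo   # TRUE

# TRUE only if the two objects are exactly the same (type, length, values)
identical_demo <- identical(v1_demo, v2_demo)
identical_demo   # TRUE
```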

Randomize the contents of the vector x2

We can also randomize the contents of a vector using the sample() function. By default it shuffles the contents (sampling without replacement), but it can also be used to draw a random sample with replacement.

x2<- sample(x2)
x2

##  [1] 1 0 1 0 2 2 2 1 2 2 0 0 1 1 0
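To draw with replacement instead of shuffling, sample() accepts the size and replace arguments. A minimal sketch, assuming we want 10 draws from the values 0, 1, 2; the name draw_demo and the seed value are arbitrary choices for illustration:

```r
# Fix the random seed so the draw can be reproduced (seed value is arbitrary)
set.seed(1)

# Draw 10 values from 0, 1, 2; the same value may be drawn more than once
draw_demo <- sample(c(0, 1, 2), size = 10, replace = TRUE)
length(draw_demo)  # 10
```

Without replace = TRUE, asking for more draws than there are elements would be an error, because each element can then be used only once.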

Create names for the elements of the vector

A name can be assigned to each element of a vector. First we create a vector of names. The paste() function can be used to create names that follow a pattern.

nm<- paste('loc', seq_along(x2), sep="_")
nm

##  [1] "loc_1"  "loc_2"  "loc_3"  "loc_4"  "loc_5"  "loc_6"  "loc_7"  "loc_8" 
##  [9] "loc_9"  "loc_10" "loc_11" "loc_12" "loc_13" "loc_14" "loc_15"
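The sep argument controls what paste() puts between the pieces. The sketch below compares the default separator (a space), an explicit separator, and paste0(), which uses no separator at all. The object names are hypothetical:

```r
# Default separator is a single space
space_demo <- paste("loc", 1:3)
space_demo  # "loc 1" "loc 2" "loc 3"

# Explicit separator, as used above
under_demo <- paste("loc", 1:3, sep = "_")
under_demo  # "loc_1" "loc_2" "loc_3"

# paste0() joins the pieces with no separator
p0_demo <- paste0("loc_", 1:3)
p0_demo     # "loc_1" "loc_2" "loc_3"
```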

Name the elements of the vector

Next we assign the names using the names() function.

names(x2)<-nm
x2

##  loc_1  loc_2  loc_3  loc_4  loc_5  loc_6  loc_7  loc_8  loc_9 loc_10 loc_11 
##      1      0      1      0      2      2      2      1      2      2      0 
## loc_12 loc_13 loc_14 loc_15 
##      0      1      1      0

Congratulations, you just simulated genotypes at 15 loci for a single individual.

Working with genotypic vectors

In this set of exercises, you will learn the basics of manipulating vectors that resemble vectors of genotypes.

Create the x2 vector

We previously demonstrated how to create this vector. Here we create it again quickly. Note that it will look different each time you create it because the sample() function randomly shuffles the elements.

x<- c(0, 1, 2)
x2<- rep(x, 5)
x2<- sample(x2)
names(x2)<- paste('loc', seq_along(x2), sep="_")
x2

##  loc_1  loc_2  loc_3  loc_4  loc_5  loc_6  loc_7  loc_8  loc_9 loc_10 loc_11 
##      2      1      0      0      0      0      1      1      2      1      2 
## loc_12 loc_13 loc_14 loc_15 
##      0      2      2      1

This is a simulated genotype at 15 loci for a single individual. The genotypic values are coded 0, 1, 2, which represent the number of copies of the reference allele at each locus.

1. Center the coded genotypes

Typically, genotypic values are centered at zero, which makes the results of an analysis easier to interpret. Center the genotypic values by subtracting 1 from your vector, so that the codes become -1, 0, 1, and save the result to a new object ‘x2b’.
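Subtracting a single number from a vector applies the subtraction to every element. A minimal sketch of the idea on a short hypothetical vector (g_demo), leaving the exercise on x2 to you:

```r
# A small illustrative genotype vector coded 0, 1, 2
g_demo <- c(2, 0, 1, 1, 0)

# The scalar 1 is subtracted from each element in turn
g_centered_demo <- g_demo - 1
g_centered_demo  # 1 -1 0 0 -1
```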

2. Calculate genotypic frequencies

Tabulate how many of each genotype are in your vector using the function table(). Save the results in an object ‘tbx2’.
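table() returns counts, labelled by the distinct values it finds. The sketch below shows this on a short hypothetical vector (g_demo) and, as one possible extra step, divides by the vector length to turn counts into proportions:

```r
g_demo <- c(2, 0, 1, 1, 0)

# Counts of each distinct value; the labels are the values as text
counts_demo <- table(g_demo)
counts_demo[["0"]]  # 2
counts_demo[["1"]]  # 2
counts_demo[["2"]]  # 1

# Proportions: counts divided by the number of elements
freq_demo <- counts_demo / length(g_demo)
freq_demo[["2"]]    # 0.2
```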

3. Subset the first 5 genotypes from the vector

This is done using vector indexing. An index is a number that indicates the position(s) that you want to retrieve. For example x2b[1] retrieves the first element of the vector (the index is 1). x2b[1:3] retrieves the elements 1 to 3 (the indices are 1,2,3).
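The same indexing ideas, sketched on a short hypothetical named vector (g_demo). Indexing by name is an addition to the text above, shown here because the vector in this exercise carries names:

```r
g_demo <- c(loc_1 = 1, loc_2 = -1, loc_3 = 0, loc_4 = 0, loc_5 = -1)

# Single position
first_demo <- g_demo[1]
first_demo   # loc_1: 1

# A range of positions
range_demo <- g_demo[2:4]
range_demo   # loc_2: -1, loc_3: 0, loc_4: 0

# Elements can also be retrieved by name
named_demo <- g_demo[c("loc_1", "loc_5")]
named_demo   # loc_1: 1, loc_5: -1
```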

4. Create a vector of random additive effects

Use the rnorm() function to create a vector of 15 numbers sampled from a normal distribution with a mean of zero and a standard deviation of one. This is a simulation of additive allele effects for each of the 15 loci in our genotypic vector x2b. Save this allele effect vector as an object ‘eff’. Note that in reality it is impossible to know the true values of allele effects.
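For reference, rnorm() takes the number of draws first, followed by the mean and sd arguments. A short sketch drawing 5 values into a hypothetical object eff_demo; the seed is an arbitrary choice that only makes the draw reproducible:

```r
# Fix the seed so the same random numbers are drawn each time (value is arbitrary)
set.seed(42)

# Five draws from a normal distribution with mean 0 and standard deviation 1
eff_demo <- rnorm(5, mean = 0, sd = 1)
length(eff_demo)  # 5
```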

5. Compute the true breeding value of an individual with the genotype x2b

We will discuss the meaning of true breeding value in depth in another lecture. Briefly, the breeding value is the additive portion of the genotypic effect, where the genotypic effect is the effect of the genotype on the phenotype. The breeding value of a genotype at a single locus is the additive allele effect * the coded genotype. You can compute these for all loci using x2b * eff. (The * symbol does element-wise multiplication of the two vectors.) The breeding value of an individual is then the sum of the breeding values across all loci. This can also be computed in a single step using matrix multiplication: x2b %*% eff, which returns the result as a 1 x 1 matrix. Try both methods (element-wise multiplication and matrix multiplication) for computing the breeding value of x2b and show your work.
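To see why the two methods agree, here is a small worked sketch with 5 loci. The genotype and effect values are made up for illustration (g_demo and eff_demo are hypothetical, fixed rather than random so the arithmetic can be checked by hand):

```r
g_demo   <- c(1, -1, 0, 0, -1)          # centered genotypes at 5 loci
eff_demo <- c(0.5, -0.2, 1.0, 0.3, 0.1) # illustrative additive effects

# Method 1: element-wise products, then sum across loci
per_locus_demo <- g_demo * eff_demo
per_locus_demo  # 0.5 0.2 0.0 0.0 -0.1
bv_sum_demo <- sum(per_locus_demo)

# Method 2: matrix multiplication gives the same sum as a 1 x 1 matrix
bv_mat_demo <- g_demo %*% eff_demo
dim(bv_mat_demo)  # 1 1

# Compare with a tolerance, since these are floating-point numbers
all.equal(bv_sum_demo, as.numeric(bv_mat_demo))  # TRUE
```

Both results are 0.6 up to floating-point rounding. as.numeric() drops the matrix dimensions so the two values can be compared directly.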