Welcome to R Programming

Master the fundamentals of R for livestock genomics and quantitative genetics research

1 Learning Objectives

By the end of this tutorial, you will be able to:

📊 Understand R and R Markdown basics
🔢 Perform algebraic operations in R
📦 Install and load R libraries
📁 Set working directories and manage files
📈 Read, manipulate, and visualize datasets
💾 Export data and create reproducible scripts

2 What is R?

R is a free software environment for statistical computing and graphics. One of its strengths is that you can make publication-quality plots.

RStudio is a flexible and multi-functional open-source IDE (integrated development environment) used as a graphical front-end to work with R.

You work in R by typing commands at the > prompt in the Console. After typing a command and its arguments, press Return (Enter) to run it. To run several commands, put each on its own line or separate them with a semicolon (;).
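As a small illustration of the two ways to separate commands, here is a sketch (the variable names first and second are made up for this example):

```r
# Two commands on one line, separated by a semicolon
first <- 2; second <- 3

# The same two commands written one per line (usually easier to read)
first <- 2
second <- 3

first + second  # 5
```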

2.1 Your First R Command

To run the code below, click anywhere in the code chunk and then click Run → Run Current Chunk.

"hello world!"

## [1] "hello world!"

# or

print("hello world!")

## [1] "hello world!"

💡 Pro Tip: Like learning any new language, if you get an error or are not sure how to do something, you can search online for help. There are many resources and forums for R!

3 Getting Help in R

R ships with built-in documentation for its functions and packages. Here are several ways to access help:

# Method 1: Using help()
help("print")

# Method 2: Using ? shortcut
?print

# Method 3: Get examples
example("print")

# Method 4: Start HTML help browser
help.start()
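When you do not remember a function's exact name, base R can also search by name or show a function's arguments. The sketch below uses apropos() and args(); the search pattern is only an example.

```r
# Find functions whose names start with "read.csv"
hits <- apropos("^read\\.csv")
print(hits)

# Show the arguments a function accepts without opening the full help page
args(read.csv)

# Search all help pages for a keyword (opens the help viewer, so left commented out here)
# ??regression
```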

3.1 Helpful Online Resources

📚 R Documentation

rdocumentation.org

Comprehensive documentation on R packages

❓ Stack Overflow

stackoverflow.com

Q&A for programming challenges

📋 RStudio Cheatsheets

rstudio.github.io

Quick reference guides

🎨 Tidyverse

tidyverse.org

Modern data science tools

4 R as a Calculator

R can perform all standard mathematical operations. Try these in the Console too!

1 + 1

## [1] 2

📝 Remember: You need to give instructions to R for everything. We are using RStudio as an interface to better manage our scripts, data, files, and figures.

5 Understanding R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents.

When you click the Knit button, a document will be generated that includes both content as well as the output of any embedded R code chunks.

For more details on R Markdown, visit: rmarkdown.rstudio.com

6 Basic Calculations

Some symbols are familiar (+, -, /), while others might be new (* for multiplication).

4 + 3  # Addition

## [1] 7

4 - 3  # Subtraction

## [1] 1

4 / 3  # Division

## [1] 1.333333

4 * 3  # Multiplication

## [1] 12
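A few more operators are worth knowing at this point. The following sketch shows exponents, remainders, integer division, and how parentheses change the order of operations:

```r
4^3          # Exponent: 64
4 %% 3       # Remainder after division (modulo): 1
4 %/% 3      # Integer division: 1

4 + 3 * 2    # Multiplication happens first: 10
(4 + 3) * 2  # Parentheses happen first: 14
```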

6.1 Working with Variables

We can assign values to variables and use them in calculations:

# Method 1: Using <- (preferred)
x <- 4

# Method 2: Using assign()
assign("y", 3)

# Method 3: Using =
z = 2

To see variable values, you can print them:

print(x)

## [1] 4

print(y)

## [1] 3

Now use these variables in calculations:

x + y

## [1] 7

x - y

## [1] 1

x / y

## [1] 1.333333

x * y

## [1] 12

3 + y

## [1] 6

6.1.1 📝 Exercise 1

Make a new R chunk and label it “Exercise 1”
Write and run four calculations using +, -, *, or /
Include both variables and integers
Compare results with a partner

💡 Hint

# Example:
a <- 10
b <- 5
a + b
a * 2

7 Working with Vectors

Vectors are sequences of data that can be numbers, characters, or logical values. All elements must be the same type.
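To see what "the same type" means in practice, the sketch below mixes types inside c(). R does not raise an error; it silently converts every element to one common type. The vector names are made up for this example.

```r
# Mixing numbers, text, and logicals: everything becomes character
mixed <- c(1, "a", TRUE)
mixed         # "1" "a" "TRUE"
class(mixed)  # "character"

# Mixing numbers and logicals: TRUE becomes 1, FALSE becomes 0
num_logical <- c(1, TRUE, FALSE)
num_logical         # 1 1 0
class(num_logical)  # "numeric"
```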

7.1 Creating Vectors

# Numeric vector
m <- c(1, 2, NA, 4)
m

## [1]  1  2 NA  4

# Check the type
is(m)

## [1] "numeric" "vector"

# Character vector
n <- c("A", "mango", NA)
n

## [1] "A"     "mango" NA

7.2 Logical Vectors

Logical vectors have three possible values: TRUE, FALSE, and NA.

temp <- m > 3
print(temp)

## [1] FALSE FALSE    NA  TRUE

Logical operators:

- < less than
- <= less than or equal to
- > greater than
- >= greater than or equal to
- == equal to
- != not equal to
- & and
- | or
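The operators can be combined. Here is a short sketch on a made-up vector (vals is not part of the course data):

```r
vals <- c(4, 2, 3, 8)

vals > 2 & vals < 8     # both conditions must hold: TRUE FALSE TRUE FALSE
vals == 2 | vals == 8   # either condition may hold: FALSE TRUE FALSE TRUE
vals != 3               # TRUE TRUE FALSE TRUE

# TRUE counts as 1, so sum() counts how many elements pass a test
sum(vals > 2)           # 3
```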

7.3 Indexing Vectors

v <- c(4, 2, 3, 8, 2, 2, 5)

# Get the fourth element
v[4]

## [1] 8

# Change the fourth element
v[4] <- 10
v

## [1]  4  2  3 10  2  2  5

# Use logical vectors to filter
w <- v < 5
v[w]

## [1] 4 2 3 2 2
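Indexing also accepts several positions at once, negative positions, and the output of which(). The sketch below uses made-up vectors; the last two lines show why missing values need care when filtering.

```r
scores <- c(4, 2, 3, 10, 2, 2, 5)

scores[c(1, 3)]      # first and third elements: 4 3
scores[-1]           # everything except the first: 2 3 10 2 2 5
which(scores == 2)   # positions where the value is 2: 2 5 6

# With missing values, a logical filter keeps the NA ...
with_na <- c(1, 2, NA, 4)
with_na[with_na > 3]          # NA 4

# ... while which() returns only the positions that are TRUE
with_na[which(with_na > 3)]   # 4
```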

8 Matrices

Matrices are rectangular arrays of data, all of the same type. They are fundamental for genomic relationship matrices (G-matrix) in quantitative genetics!

8.1 Creating Matrices

A <- matrix(
  # Sequence of elements  
  c(1, 2, 3, 4, 5, 6, 7, 8, 9), 
  nrow = 3,   # Number of rows
  ncol = 3,   # Number of columns
  byrow = TRUE  # Fill by row
)
 
# Naming rows and columns
rownames(A) <- c("a", "b", "c")
colnames(A) <- c("c", "d", "e")

B <- matrix(c(1, 6, 12, 4, 8, 15, 3, 14, 2), nrow = 3, ncol = 3)

print(A)

##   c d e
## a 1 2 3
## b 4 5 6
## c 7 8 9

print(B)

##      [,1] [,2] [,3]
## [1,]    1    4    3
## [2,]    6    8   14
## [3,]   12   15    2
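Once a matrix exists, you can check its size and pull out single entries, rows, or columns with [row, column] indexing. A minimal sketch, using a separate matrix A_demo (a made-up name) with the same values and labels as A:

```r
A_demo <- matrix(1:9, nrow = 3, byrow = TRUE,
                 dimnames = list(c("a", "b", "c"), c("c", "d", "e")))

dim(A_demo)        # rows and columns: 3 3
A_demo[2, 3]       # row 2, column 3: 6
A_demo["b", "d"]   # the same idea using names: 5
A_demo[1, ]        # the whole first row
A_demo[, "e"]      # the whole column named e
```

Leaving the row or column position empty selects everything along that dimension.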

8.2 Matrix Operations

# Addition
A + B

##    c  d  e
## a  2  6  6
## b 10 13 20
## c 19 23 11

# Subtraction
A - B

##    c  d  e
## a  0 -2  0
## b -2 -3 -8
## c -5 -7  7

# Matrix multiplication
A %*% B

##   [,1] [,2] [,3]
## a   49   65   37
## b  106  146   94
## c  163  227  151

# Transpose
t(A)

##   a b c
## c 1 4 7
## d 2 5 8
## e 3 6 9

# Inverse
solve(B)

##             [,1]        [,2]         [,3]
## [1,] -0.47087379  0.08980583  0.077669903
## [2,]  0.37864078 -0.08252427  0.009708738
## [3,] -0.01456311  0.08009709 -0.038834951
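Two details are easy to miss here. First, * multiplies matrices element by element, which is different from %*%. Second, not every square matrix has an inverse: the rows of A above are linearly dependent (row a plus row c equals twice row b), so solve(A) is expected to fail. The sketch below uses small made-up matrices P, Q, and S to illustrate both points.

```r
P <- matrix(c(1, 2, 3, 4), nrow = 2)   # columns are (1, 2) and (3, 4)
Q <- matrix(c(0, 1, 1, 0), nrow = 2)

P * Q     # element-wise product
P %*% Q   # matrix product

# A matrix times its inverse should give the identity matrix, up to rounding error
isTRUE(all.equal(P %*% solve(P), diag(2)))   # TRUE

# A singular matrix (same values as A): catch the error instead of stopping the script
S <- matrix(1:9, nrow = 3, byrow = TRUE)
result <- tryCatch(solve(S), error = function(e) conditionMessage(e))
result
```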

8.3 Understanding Matrix Inverse

Here’s how matrix inversion works for a 2×2 matrix:

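# Entries of the 2×2 matrix
# (a variable may be named c; R still finds the function c() when it is called)
a <- 2
b <- 5
c <- 9
d <- 7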

M <- matrix(c(a, c, b, d), nrow = 2, ncol = 2)
print(M)

##      [,1] [,2]
## [1,]    2    5
## [2,]    9    7

# Calculate determinant
determinant_M <- a*d - b*c

# Calculate adjoint, also called the adjugate (swap a↔d, negate b and c)
adjoint_M <- matrix(c(d, -c, -b, a), nrow = 2, ncol = 2)

# Inverse = (1/determinant) × adjoint
Inverse_M <- (1/determinant_M) * adjoint_M
print(Inverse_M)

##            [,1]        [,2]
## [1,] -0.2258065  0.16129032
## [2,]  0.2903226 -0.06451613

# Verify with R's solve() function
solve(M)

##            [,1]        [,2]
## [1,] -0.2258065  0.16129032
## [2,]  0.2903226 -0.06451613
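The same steps can be wrapped in a function that also guards against a zero determinant. This is only an illustrative sketch: inverse_2x2() is a hypothetical helper written for this tutorial, and solve() remains the function to use in real analyses.

```r
# Hypothetical helper: invert a 2x2 matrix via determinant and adjugate
inverse_2x2 <- function(mat) {
  stopifnot(is.matrix(mat), all(dim(mat) == c(2, 2)))
  det_mat <- mat[1, 1] * mat[2, 2] - mat[1, 2] * mat[2, 1]
  if (det_mat == 0) stop("matrix is singular: determinant is zero")
  # Swap the diagonal entries and negate the off-diagonal entries
  adjugate <- matrix(c(mat[2, 2], -mat[2, 1], -mat[1, 2], mat[1, 1]), nrow = 2)
  adjugate / det_mat
}

M_demo <- matrix(c(2, 9, 5, 7), nrow = 2)        # same values as M above

all.equal(inverse_2x2(M_demo), solve(M_demo))    # TRUE
round(M_demo %*% inverse_2x2(M_demo), 10)        # identity matrix
```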

8.3.1 📝 Exercise 2

Create a new R chunk labeled “Exercise 2”
Make at least 3 square matrices and save them as variables
Try: addition (+), subtraction (-), matrix multiplication (%*%), transpose (t()), and inverse (solve())

9 Installing and Loading Packages

⚠️ Important: You only need to install a package once, but you must load it every time you start a new R session.

9.1 Installing Packages

# Install once (then comment out with #)
install.packages('ggplot2')

The next time you run this script, comment it out:

# install.packages('ggplot2')  # Already installed!
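Instead of commenting the line in and out by hand, a script can check whether a package is already installed. A minimal sketch using requireNamespace() from base R; is_installed() is a made-up helper name:

```r
# Hypothetical helper: TRUE if the package can be found, FALSE otherwise
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

if (!is_installed("ggplot2")) {
  # Only a reminder is printed here; replace it with install.packages("ggplot2") if you prefer
  message("ggplot2 is not installed yet. Run install.packages('ggplot2') once.")
}
```

This keeps the script safe to re-run, because nothing is reinstalled when the package is already present.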

9.2 Loading Packages

library(ggplot2)  # For creating beautiful plots

9.3 Checking Installed Packages

library()  # List all installed packages
search()   # List attached packages and environments

9.3.1 📝 Exercise 3

Create a new R chunk labeled “Exercise 3”
Install the lme4 package (used for linear mixed models)
Load the lme4 library

# install.packages("lme4")
library(lme4)

10 Setting Your Working Directory

The working directory is where R looks for files to read and where it saves files you create.

10.1 Methods to Set Working Directory

Method 1: Using the RStudio interface. In the Files pane, click the cog icon → Set As Working Directory.

Method 2: Using code (recommended for reproducibility)

# Check current working directory
getwd()

# Set new working directory
setwd("~/Desktop/CTLGH_Training/Day1/")

# Verify the change
getwd()

# List files in working directory
dir()
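setwd() stops with an error if the folder does not exist, which is a common stumbling block when a script moves between computers. One way to handle this, sketched below with the course path from the example above, is to build the path with file.path() and check it with dir.exists() first:

```r
# Build the path piece by piece (same folder as in the example above)
course_dir <- file.path("~", "Desktop", "CTLGH_Training", "Day1")

if (dir.exists(course_dir)) {
  setwd(course_dir)
} else {
  message("Folder not found: ", course_dir)
}
```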

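💡 Note: When using R Markdown, code chunks run by default with the working directory set to the folder where your .Rmd file is saved. When knitting, a setwd() call inside one chunk does not carry over to later chunks.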

10.1.1 📝 Exercise 4

Create a new R chunk labeled “Exercise 4”
Set your working directory to your course folder using setwd()
List the contents with dir()

11 Working with Real Datasets

Now we move from toy examples to real livestock genetics data!

11.1 Reading Data

# Read CSV file
traits <- read.table("CT_traits_724_pc_res.csv", header = TRUE, sep = ",")

# Alternative for CSV
# traits <- read.csv("CT_traits_724_pc_res.csv")

# View first few rows
head(traits)

📊 Dataset: Scottish Blackface sheep carcass composition traits measured using CT scanning.

Publication: Genetics Selection Evolution (2016)
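After reading any file, it is worth checking that it arrived as expected: the number of rows and columns, the column names, and how many values are missing. The sketch below does this on a tiny data frame with made-up values (not the real sheep data) that is written to a temporary file and read back, so it runs without the course files.

```r
# Made-up toy values, NOT the real CT traits data
toy <- data.frame(
  id  = 1:4,
  sex = c(0, 1, 1, 0),
  LS  = c(1, 2, 2, 1),
  LW  = c(30.5, 28.1, NA, 33.0)
)

# Write the toy data to a temporary CSV file, then read it back
toy_file <- tempfile(fileext = ".csv")
write.csv(toy, toy_file, row.names = FALSE)
toy_read <- read.csv(toy_file)

dim(toy_read)              # rows and columns: 4 4
names(toy_read)            # column names
colSums(is.na(toy_read))   # missing values per column (one in LW)
```

The same three checks can be run on traits once the real file has been read.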

11.2 Variable Definitions

Variable	Description
`id`	Animal ID
`sex`	Sex (0/1)
`Year`	Year of measurement
`dob`	Date of birth
`litter`	Litter ID
`LS`	Litter size
`DamAge`	Age of dam
`Group`	Management group
`Line_a`	Line of lamb
`LW`	Live weight
`bon_area_ISC`	Bone area at ischium
`mus_density_TV8`	Muscle density of 8th thoracic vertebra
`PC1`, `PC2`, `PC3`	Principal components (population structure)
`rlw`	Residual live weight
`rbon_area`	Residual bone area
`rmus_den`	Residual muscle density

11.2.1 📝 Exercise 5

Create a new R chunk labeled “Exercise 5”
Read in the traits data using read.csv()
Display the first few rows with head()

12 Manipulating Datasets

12.1 Accessing Data Elements

We can extract specific parts of our data using $ notation or indexing:

# Summary of litter size using $
summary(traits$LS)

# Same using column number
summary(traits[, 6])

12.2 Understanding Data Structure

str(traits)

12.3 Converting Data Types

Some variables need to be converted to the correct type:

# Sex should be a factor, not numeric
summary(traits$sex)

# Convert to factor
traits$sex <- as.factor(traits$sex)
summary(traits$sex)

Common conversions:

- as.factor() → categorical variable
- as.numeric() → number
- as.character() → text
- as.integer() → whole number
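One conversion deserves a warning: calling as.numeric() directly on a factor returns the internal level codes, not the original numbers. The sketch below, with a made-up factor year_f, shows the problem and the usual workaround of converting to character first.

```r
year_f <- factor(c("2001", "1999", "2001"))

as.numeric(year_f)                 # level codes: 2 1 2
as.numeric(as.character(year_f))   # the actual values: 2001 1999 2001
```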

12.3.1 📝 Exercise 6

Create a new R chunk labeled “Exercise 6”
Convert the litter variable to a factor

traits$litter <- as.factor(traits$litter)
summary(traits$litter)

13 Creating Visualizations

Figures help us identify patterns and trends in data.

13.1 Basic Plotting

plot(traits$PC1, traits$PC2)

13.2 Enhanced Plotting

# Treat the line as a factor so it can pick colours and supply legend labels
traits$Line_a <- as.factor(traits$Line_a)

# One colour per line (this palette assumes at most seven lines)
line_cols <- c("red", "blue", "yellow", "green", "purple", "orange", "turquoise")

plot(traits$PC1, traits$PC2, 
     pch = 19,  # Solid circles
     col = line_cols[traits$Line_a],
     main = "Population Structure",
     xlab = "Principal Component 1",
     ylab = "Principal Component 2",
     cex = 1.5)

# Add legend
legend("topright", 
       legend = levels(traits$Line_a),
       col = line_cols[seq_along(levels(traits$Line_a))],
       pch = 19,
       title = "Line")
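To keep a base-graphics figure as a file, one option is to open a graphics device, draw the plot, and close the device again. The sketch below uses pdf() and randomly generated stand-in values (not real principal components), and saves to a temporary folder; the object and file names are made up.

```r
set.seed(1)             # make the made-up values repeatable
toy_pc1 <- rnorm(20)    # stand-in values, not real principal components
toy_pc2 <- rnorm(20)

plot_file <- file.path(tempdir(), "toy_pca_plot.pdf")

pdf(plot_file, width = 6, height = 5)   # open a PDF graphics device
plot(toy_pc1, toy_pc2,
     pch = 19,
     xlab = "Principal Component 1",
     ylab = "Principal Component 2")
invisible(dev.off())                    # close the device so the file is written

file.exists(plot_file)                  # TRUE
```

Forgetting dev.off() is a frequent cause of empty or unreadable figure files.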

13.3 Professional Plotting with ggplot2

library(ggplot2)

ggplot(traits, aes(x = PC1, y = PC2, color = factor(Line_a))) +
  geom_point(size = 3, alpha = 0.7) +
  theme_minimal() +
  labs(title = "Population Structure Analysis",
       subtitle = "Principal Components from SNP Array Data",
       x = "Principal Component 1",
       y = "Principal Component 2",
       color = "Line") +
  theme(
    plot.title = element_text(size = 16, face = "bold"),
    plot.subtitle = element_text(size = 12),
    axis.title = element_text(size = 12),
    legend.position = "bottom"
  )

13.3.1 📝 Exercise 7

Create a new R chunk labeled “Exercise 7”
Make a plot of PC2 vs PC3
Visit R Documentation to learn how to customize axis labels
Add informative x and y axis labels

14 Exporting Data

After cleaning and filtering data, we often need to export it for use in other programs (like BLUPF90).

14.1 Filtering and Exporting

# Filter data for year 2001
sheep_2001 <- subset(traits, Year == 2001)

# Write to file
write.table(sheep_2001, 
            "sheep_2001.txt", 
            quote = FALSE, 
            sep = ' ', 
            row.names = FALSE)
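subset() can also combine several conditions and keep only selected columns. The sketch below uses a small data frame with made-up values (not the real data), writes the filtered rows to a temporary file, and reads the file back to confirm that the export worked.

```r
# Made-up toy values, NOT the real sheep data
toy_traits <- data.frame(
  id   = 1:6,
  sex  = c(0, 1, 1, 0, 1, 0),
  Year = c(2000, 2001, 2001, 2001, 2002, 2001),
  LW   = c(31.2, 29.8, 33.1, 30.4, 28.9, 32.0)
)

# Keep animals coded sex == 1 that were measured in 2001; keep only id and LW
toy_2001 <- subset(toy_traits, Year == 2001 & sex == 1, select = c(id, LW))

# Export, then read the file back as a check
out_file <- tempfile(fileext = ".txt")
write.table(toy_2001, out_file, quote = FALSE, sep = " ", row.names = FALSE)
check <- read.table(out_file, header = TRUE)
check
```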

14.2 Saving Your Workspace

# Save entire workspace
save.image("saved_workspace.RData")

# Load workspace
load("saved_workspace.RData")

⚠️ Best Practice: Rather than saving workspaces, create reproducible scripts that regenerate your results!
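If you only need to keep one object, such as a cleaned data frame, base R's saveRDS() and readRDS() store and restore that single object without saving the whole workspace. A minimal sketch with made-up values and a temporary file:

```r
# Made-up toy values for illustration
toy_results <- data.frame(id = 1:3, value = c(0.5, 1.2, 0.9))

rds_file <- tempfile(fileext = ".rds")
saveRDS(toy_results, rds_file)     # save one object
restored <- readRDS(rds_file)      # load it back under any name you like

identical(toy_results, restored)   # TRUE
```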

15 Organizing Your Scripts

15.1 Script Organization Checklist

1. Clean Environment

rm(list = ls())  # Clear workspace

2. Set Working Directory

setwd("~/your/path/here")

3. Load Dependencies

# Install packages (if needed)
# install.packages("package_name")

# Load libraries
library(package_name)

4. Read Data

data <- read.csv("data_file.csv")

5. Analysis

# Your analysis code here
# Use comments to explain each step!

6. Export Results

write.table(results, "output.txt")
ggsave("plot.png")

💡 Remember: Use comments (#) throughout to explain where data comes from, what analyses are being done, etc. This helps you remember and helps when sharing scripts with others!

16 Summary

16.1 🎉 Congratulations!

You’ve completed the Introduction to R! You now know how to:

✅ Use R and R Markdown
✅ Perform algebraic operations
✅ Install and load libraries
✅ Set working directories
✅ Read and manipulate datasets
✅ Create informative plots
✅ Export data files
✅ Organize reproducible scripts

📚 Next Steps: Refer back to this document throughout the week. Copy and paste useful code snippets into your new scripts. Every time you start working with an R script, remember to set your working directory!

16.1.1 📝 Exercise 8

Click Knit at the top of this document
Choose “Knit to HTML”
The resulting HTML file will be saved in the same folder as your .Rmd file
Open it in your web browser to see the beautifully formatted result!

Introduction to R and R Markdown

A Practical Guide for Livestock Genomics Research

CTLGH Training Workshop

2026-02-16