This chapter is designed to introduce beginners to R, ensuring that trainees with no prior exposure to the software can grasp its basic functionalities. It begins with an overview of R, instructions for installing the software, and a description of the RStudio interface. Finally, it provides guidance on creating basic code.
Given the importance of RStudio, the following section outlines the steps to install both R and RStudio. Install R first, because RStudio needs an existing R installation to run.
To install R:
1. Visit the official R website: https://www.r-project.org/.
2. Click on the CRAN link located on the left side of the webpage.
3. Select a mirror from the list (e.g., 0-Cloud).
4. Choose the appropriate version for your operating system (Windows or Mac) and click Download R.
5. Click on Install R for the first time.
6. Download the installation file for your operating system (e.g., Download R for Windows).
7. Run the installation file and follow the on-screen instructions to complete the installation.
To install RStudio:
1. Visit the RStudio website: https://posit.co/products/open-source/rstudio/.
2. Click on the Download button.
3. Select the Free version of RStudio.
4. Click Download RStudio and follow the installation instructions.
When you launch RStudio, you will see an interface divided into four panels:
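Source Panel: where you write, edit, and save your scripts.
Environment and History Panel: lists the objects currently stored in memory and the commands you have already run.
Multipurpose Panel: groups the Files, Plots, Packages, and Help tabs.
Console Panel: where commands are executed and their output is printed.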
Packages are collections of R functions, data, and code that extend R’s capabilities. Some packages are included with the base R installation and are automatically accessible. These provide essential functions for statistical analysis and data visualization.
To use additional packages, you must load them into your R session
using the library() function. For example:
library(readxl)
Once you write the previous statement in the Source panel, the natural question is how to execute this line (i.e., to let R know that you want it to perform the instruction). There are two ways to do this:
First Option: Select the command line you want to run and press “Ctrl+Enter” on the keyboard (recommended option).
Second Option: Select the command line and click the “Run” button at the top of the Source panel.
Both methods will execute the selected command in the R console.
R is open-source software, and users continually develop new packages that simplify analytical tasks. Before you can use one of these packages, you must download and install it. There are two primary ways to install packages in R.
The first is through the RStudio interface: open the Packages tab, click
Install, and type the name of the package (e.g.,
ggplot2). As you type, RStudio will suggest available
packages to help avoid spelling errors.
You can also install packages directly from the command line using
the install.packages() function. This method requires you
to know the exact name of the package but ensures your code is
self-contained and reproducible. For example:
install.packages("ggplot2")
Note: Installing a package is only the first step.
To use it in your session, you must load it using the
library() function:
library(ggplot2)
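The two steps above (install once, load in every session) can be combined. The following is a minimal sketch of one possible approach, not a required pattern. The helper name load_or_install is hypothetical and not part of R; requireNamespace() and the character.only argument of library() are base R. The example uses stats, which ships with R, so nothing is downloaded.

```r
# Hypothetical helper (not part of R): install a package only if it is
# missing, then load it into the current session.
load_or_install = function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # needs an internet connection and a CRAN mirror
  }
  # character.only = TRUE tells library() that pkg holds the name as text
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# "stats" is included with base R, so this call only loads it
loaded = load_or_install("stats")
```

Replacing "stats" with "ggplot2" would trigger an installation the first time and a plain load afterwards.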
R uses various types of objects to store and manipulate data. Below, we explore the most common ones.
Variables in R can store different types of data, such as numeric,
character, or logical values. To create a variable, assign a value to it
using the = or <- operator.
# Examples of variables
A = 4.5 # Numeric
B = "House" # Character
C = FALSE # Logical
# Display the value of C
C
## [1] FALSE
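To check which type of data a variable holds, the base R function class() can be used. The short sketch below reuses the three variables defined above; the names typeA, typeB, and typeC are introduced here only for illustration.

```r
# Same variables as in the example above
A = 4.5
B = "House"
C = FALSE

# class() reports the type of each object
typeA = class(A)  # "numeric"
typeB = class(B)  # "character"
typeC = class(C)  # "logical"

# The <- operator is equivalent to = for assignment
D <- 10
```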
A vector is a collection of elements of the same type (e.g., all
numeric, all character, or all logical). Use the c()
function to create a vector.
# Create vectors
A = c(1, 2, 3) # Numeric vector
B = c("blue", "red", "yellow") # Character vector
C = c(TRUE, FALSE, FALSE) # Logical vector
# Display the vector B
B
## [1] "blue" "red" "yellow"
# Access specific elements of B
B[1:2]
## [1] "blue" "red"
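Because all elements of a vector must share one type, R converts mixed inputs to a common type. The sketch below illustrates this, along with two other common operations on vectors; the variable name mixed is an assumption made for the example.

```r
B = c("blue", "red", "yellow")

# Number of elements in the vector
length(B)         # 3

# A negative index drops that element instead of selecting it
B[-1]             # "red" "yellow"

# Mixing types: everything is converted to character
mixed = c(1, "a", TRUE)
class(mixed)      # "character"
mixed             # "1" "a" "TRUE"
```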
Matrices are two-dimensional objects that store elements of the same
type. They can be created by combining vectors using
rbind() (row-wise) or cbind()
(column-wise).
# Create vectors
vectorA = c(1, 2, 3)
vectorB = c(4, 5, 6)
# Combine vectors into matrices
Matrix1 = rbind(vectorA, vectorB) # Combine by row
Matrix2 = cbind(vectorA, vectorB) # Combine by column
# Display Matrix1
Matrix1
##         [,1] [,2] [,3]
## vectorA    1    2    3
## vectorB    4    5    6
# Display Matrix2
Matrix2
##      vectorA vectorB
## [1,]       1       4
## [2,]       2       5
## [3,]       3       6
# Check dimensions of Matrix2
dim(Matrix2)
## [1] 3 2
# Access specific elements of Matrix2
Matrix2[2, 2]
## vectorB
##       5
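Leaving one index empty selects a whole row or column. The following sketch rebuilds the two matrices from the example above and shows this, together with the base R function t(), which transposes a matrix.

```r
vectorA = c(1, 2, 3)
vectorB = c(4, 5, 6)
Matrix1 = rbind(vectorA, vectorB)  # 2 rows, 3 columns
Matrix2 = cbind(vectorA, vectorB)  # 3 rows, 2 columns

# Empty row index: the whole first column
firstColumn = Matrix2[, 1]   # 1 2 3

# Empty column index: the whole second row
secondRow = Matrix2[2, ]     # vectorA = 2, vectorB = 5

# Transposing Matrix1 gives the same values as Matrix2
transposed = t(Matrix1)
dim(transposed)              # 3 2
```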
Factors are used to represent categorical data. They assign labels to numeric values, making it easier to interpret data.
# Create a numeric vector
vectorA = c(1, 3, 2, 2, 4, 1)
# Convert to a factor with labels
factorA = factor(vectorA,
                 levels = c(1, 2, 3),
                 labels = c("Primary", "Secondary", "Tertiary"))
# Display factorA
factorA
## [1] Primary Tertiary Secondary Secondary <NA> Primary
## Levels: Primary Secondary Tertiary
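In the output above, the value 4 appears as <NA> because it is not among the declared levels. The sketch below shows how this could be checked with the base R functions levels(), table(), and is.na(); the names counts and nMissing are assumptions made for the example.

```r
vectorA = c(1, 3, 2, 2, 4, 1)
factorA = factor(vectorA,
                 levels = c(1, 2, 3),
                 labels = c("Primary", "Secondary", "Tertiary"))

# The labels defined for the factor
levels(factorA)                  # "Primary" "Secondary" "Tertiary"

# Frequency of each category (NA values are excluded by default)
counts = table(factorA)          # Primary 2, Secondary 2, Tertiary 1

# Number of values that did not match any level
nMissing = sum(is.na(factorA))   # 1
```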
Data frames are similar to matrices but can store columns of different data types, which makes them ideal for working with structured data.
# Create vectors
edu = c(1, 2, 3)
edu = factor(edu,
             levels = c(1, 2, 3),
             labels = c("Primary", "Secondary", "Tertiary"))
Name = c("Amira", "Jumana", "Carole")
Age = c(35, 32, 43)
Beneficiaries = c(TRUE, TRUE, FALSE)
# Combine vectors into a data frame
data = data.frame(Name, Age, Education = edu, Beneficiaries)
# Display the data frame
data
# Access specific columns
data$Education
## [1] Primary Secondary Tertiary
## Levels: Primary Secondary Tertiary
data$Age
## [1] 35 32 43
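Beyond selecting columns with $, a data frame can be inspected and filtered by rows. The following is a brief sketch using the same data as above; the names olderNames and AgeInMonths are hypothetical and introduced only for this example.

```r
edu = factor(c(1, 2, 3),
             levels = c(1, 2, 3),
             labels = c("Primary", "Secondary", "Tertiary"))
Name = c("Amira", "Jumana", "Carole")
Age = c(35, 32, 43)
Beneficiaries = c(TRUE, TRUE, FALSE)
data = data.frame(Name, Age, Education = edu, Beneficiaries)

# Number of rows and columns
nrow(data)   # 3
ncol(data)   # 4

# Keep only the rows where Age is greater than 33
older = data[data$Age > 33, ]
olderNames = as.character(older$Name)   # "Amira" "Carole"

# Add a new column computed from an existing one
data$AgeInMonths = data$Age * 12
```

The function str(data) prints a compact summary of each column and its type, which is a quick way to confirm that the data frame was built as intended.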
Functions are reusable blocks of code that perform specific tasks. R
comes with many built-in functions, such as c(),
rbind(), and factor(), which you have already
encountered, and it also allows users to define their own custom
functions. This section looks more closely at how functions work in R. For
example, the cbind() function combines objects by
columns, while rbind() combines objects by rows.
Both functions take arguments that specify the objects to be combined.
To see the arguments of an existing function, you can
use the ? operator or the help() function. For example,
running the command ?cbind displays the documentation
for the function, including its arguments and usage, as illustrated in
the image xxx
cbind(..., deparse.level = 1)
The ... argument refers to one or more vectors, matrices, or other R objects to be combined (by columns for cbind(), by rows for rbind()). These can be provided as named or unnamed arguments.
The deparse.level argument is an integer that controls how labels are constructed for non-matrix-like arguments. By default, it is set to 1.
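The effect of these two arguments can be seen in the column names of the result. The sketch below is illustrative only; the variable names withLabels, noLabels, and customLabels, and the argument names first and second, are assumptions made for the example.

```r
vectorA = c(1, 2, 3)
vectorB = c(4, 5, 6)

# Default (deparse.level = 1): labels are taken from the object names
withLabels = colnames(cbind(vectorA, vectorB))
# "vectorA" "vectorB"

# deparse.level = 0: no labels are constructed
noLabels = colnames(cbind(vectorA, vectorB, deparse.level = 0))
# NULL

# Named arguments: the names given in the call become the labels
customLabels = colnames(cbind(first = vectorA, second = vectorB))
# "first" "second"
```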
A function in R has the following structure:
function_name = function(arg1, arg2, ...) {
  # Code to execute
  return(result)
}
Example:
# Define a function to sum two numbers
sumNumbers = function(a, b) {
  y = a + b
  return(y)
}
# Call the function with 1 and 3 as arguments: y = 1 + 3, so the result is 4
sumNumbers(1, 3)
## [1] 4
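Arguments can also be given default values, which are used when the caller leaves them out. The following sketch shows one way to do this; the function powerNumber and its arguments base and exponent are hypothetical names chosen for illustration.

```r
# Hypothetical function: raise a number to a power.
# exponent defaults to 2 when it is not supplied.
powerNumber = function(base, exponent = 2) {
  result = base ^ exponent
  return(result)
}

# Uses the default exponent: 3 ^ 2
powerNumber(3)                       # 9

# Arguments matched by position: 2 ^ 3
powerNumber(2, 3)                    # 8

# Arguments matched by name, so the order does not matter
powerNumber(exponent = 3, base = 2)  # 8
```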
Conditional statements allow you to execute code based on specific
conditions. The if/else statement is commonly used for this
purpose.
if (condition) {
  # code to be executed if the condition is TRUE
} else {
  # code to be executed if the condition is FALSE
}
# Example of if/else
x = 3
if (x < 0) {
  print("Negative")
} else if (x > 0) {
  print("Positive")
} else {
  print("Zero")
}
## [1] "Positive"
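The if/else statement tests a single condition at a time. When the same test has to be applied to every element of a vector, the base R function ifelse() is one option. The sketch below repeats the example above on a vector of three values; the variable name signs is an assumption.

```r
x = c(-2, 0, 3)

# ifelse(test, value_if_TRUE, value_if_FALSE) works element by element;
# the second ifelse() handles the values that are not negative
signs = ifelse(x < 0, "Negative",
               ifelse(x > 0, "Positive", "Zero"))
signs   # "Negative" "Zero" "Positive"
```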
Loops are used to repeat a block of code multiple times. R supports
two main types of loops: while and for.
A while loop repeats code as long as a condition is
true.
while (test_expression) {
  # code to be executed
}
# Display numbers from 1 to 6
i = 1
while (i <= 6) {
  print(i)
  i = i + 1
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
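A while loop is most useful when the number of repetitions is not known in advance. As an illustrative sketch, the code below doubles a value until it exceeds 100 and counts how many doublings were needed; the variable names value and steps are assumptions made for the example.

```r
value = 1
steps = 0

# Keep doubling while the value has not yet passed 100
while (value <= 100) {
  value = value * 2
  steps = steps + 1
}

value   # 128
steps   # 7
```

If the condition never becomes FALSE, the loop runs forever, so the code inside the loop must change something that the condition depends on.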
A for loop iterates over a sequence of values.
for (element in sequence) {
  # code to be executed
}
# Sum numbers from 1 to 6
x = c(1, 2, 3, 4, 5, 6)
y = 0 # initialize y to 0
for (i in x) {
  y = y + i
}
y
## [1] 21
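A for loop can also iterate over positions rather than values, which is convenient when the results need to be stored. The sketch below uses the base R function seq_along() to do this and compares the loop above with the built-in sum(); the variable name squares is an assumption.

```r
x = c(1, 2, 3, 4, 5, 6)

# Create an empty numeric vector of the right length to hold the results
squares = numeric(length(x))

# seq_along(x) produces the positions 1, 2, ..., 6
for (i in seq_along(x)) {
  squares[i] = x[i] ^ 2
}
squares   # 1 4 9 16 25 36

# The built-in sum() gives the same total as the loop above
total = sum(x)   # 21
```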