Preamble
These notes show operations that can be performed in R using its built-in functions.
Built-in R functions
Functions that are already defined in the R programming environment are known as built-in functions. We shall look at various operations that can be done with these functions.
Reading Data
In R we can load data in three ways
- Assign manually
- Load data from local storage
- Load data from Package
Data assigning manually
Single value
We can assign a single value to a variable as follows
number1=234
number2=55.3644
variable_2="string"
.p=2
The value on the right side of ‘=’ is stored in the variable named on the left side.
Note that variable names are built from letters, digits, underscores and dots, and must start with either a letter or a dot. A name that starts with a dot cannot have a digit as its second character.
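The naming rule above can be checked with the built-in make.names() function, which rewrites any invalid name into a valid one. The sketch below treats a name as valid when make.names() leaves it unchanged; the candidate names are made up for illustration.

```r
# Hypothetical candidate names: the last three break the naming rule
candidates <- c("number1", "variable_2", ".p", "2abc", "_x", ".2p")

# make.names() rewrites invalid names, so an unchanged name is a valid one
is_valid <- candidates == make.names(candidates)

print(data.frame(name = candidates, valid = is_valid))
```

Reserved words such as if are also rewritten by make.names(), so this check rejects them as variable names too.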
Vector
A vector of n elements is created using c(), with the elements specified as c(element1,...,elementn)
vector=c(1,2,3,4)
Vector to data frame
Once we have created vectors, we can combine them into a data frame in either of the following ways
- merging vectors of the same size as rows
vector1=c(1,2,33,454,22)
vector2=c(2,43,32,123,2.3)
d=rbind(vector1,vector2)
d=data.frame(d)
- merging vectors of the same size as columns
vector1=c(1,2,33,454,22)
vector2=c(2,43,32,123,2.3)
d=cbind(vector1,vector2)
d=data.frame(d)
Data from local storage
If the data is in csv format we can load it using
da=read.csv("C:/Users/Sathyanaraynan/Documents/Sample.csv")
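Since the path above exists only on the author's machine, here is a minimal self-contained sketch of the same idea. It writes a small made-up data frame to a temporary CSV file and reads it back; the column names and values are assumptions for illustration only.

```r
# Hypothetical sample data, written to a temporary file so the example runs anywhere
sample_data <- data.frame(id = 1:3, score = c(90.5, 82, 77.25))
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_data, csv_path, row.names = FALSE)

# Read the file back; read.csv() treats the first line as column names by default
da <- read.csv(csv_path)
print(da)

# Remove the temporary file
unlink(csv_path)
```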
Data from package
For example, we can load the dataset “iris” from the package “datasets” by assigning it to a variable name. Since the datasets package is loaded by default, we can assign it as
data1=iris
To know more about data types and structures, please take a look at Introduction to Data Types and Structures in R
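As a small sketch of two other ways to reach a packaged dataset, we can name the package explicitly with ‘::’ or load a dataset by name with data(). The choice of mtcars here is only an illustration; it is another dataset shipped in the datasets package.

```r
# Refer to the dataset through its package namespace
data1 <- datasets::iris

# data() loads a dataset by name into the current workspace
data("mtcars", package = "datasets")

print(dim(data1))
print(dim(mtcars))
```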
Operators
Operators are symbols that are used to do assignment, arithmetic and logical operations with variables.
Assignment operator
Data can be assigned to a variable name in R using the operators ‘=’ and ‘<-’, which assign the value on the right side to the name on the left side, and ‘->’, which assigns the value on the left side to the name on the right side. Some examples are
variable=22
variable
## [1] 22
vect<-43+2
vect
## [1] 45
tan(45)->angle
angle
## [1] 1.619775
Addition operator
The ‘+’ operator is used to add two numeric values
23+55576
## [1] 55599
Subtraction operator
The ‘-’ operator is used to find the difference between two numbers
442-221.24
## [1] 220.76
Multiplication operator
The ‘*’ operator is used for multiplying two numbers
22.3*63.2
## [1] 1409.36
Division operator
The ‘/’ operator is used to divide one number by another. Dividing a non-zero number by zero gives Inf or -Inf instead of an error
99/123
## [1] 0.804878
-14/0
## [1] -Inf
Remainder operator
The ‘%%’ operator is used to find the remainder when one number is divided by another
35%%4
## [1] 3
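The remainder pairs naturally with the integer-division operator ‘%/%’, which is not covered above. A minimal sketch, using the same numbers, shows how the two results reconstruct the original value and how the remainder behaves for a negative number.

```r
a <- 35
b <- 4

quotient  <- a %/% b   # integer (floor) division
remainder <- a %% b    # remainder

# Quotient and remainder together reconstruct the original number
reconstructed <- quotient * b + remainder

# With a negative dividend, the remainder takes the sign of the divisor
neg_remainder <- -7 %% 3

print(c(quotient = quotient, remainder = remainder,
        reconstructed = reconstructed, neg_remainder = neg_remainder))
```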
Exponentiation operator
The ‘^’ operator is used to find the nth power of a number
78^7
## [1] 1.756557e+13
Comparison operator
We can check whether one number is equal to, not equal to, less than, less than or equal to, greater than, or greater than or equal to another using the following operators
num1=23.1433
num2=log(12353)
## Checking Equality
num1==num2
## [1] FALSE
## Inequality
num1!=num1
## [1] FALSE
## Less than
num1<num2
## [1] FALSE
## Less than or equal to
num1<=num1
## [1] TRUE
## Greater than
num1>num2
## [1] TRUE
## Greater than or equal to
num1>=num2
## [1] TRUE
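One caution when using ‘==’ on decimal numbers: values are stored in floating point, so two results that look equal may differ slightly. The sketch below contrasts exact comparison with the tolerance-based all.equal(), and also shows that comparison operators work element by element on vectors.

```r
sum_value <- 0.1 + 0.2

# Exact comparison fails because of floating-point rounding
exact_equal <- sum_value == 0.3

# all.equal() compares within a small tolerance; isTRUE() turns the result into TRUE/FALSE
approx_equal <- isTRUE(all.equal(sum_value, 0.3))

# Comparisons are vectorised: each pair of elements is compared in turn
elementwise <- c(1, 5, 10) > c(2, 5, 3)

print(exact_equal)
print(approx_equal)
print(elementwise)
```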
Checking existence
The %in% operator is used to check whether an element exists in a vector
1 %in% c(2,1,3,4)
## [1] TRUE
# Can also be used for strings
"in" %in% c("word","sentence","in")
## [1] TRUE
Logical AND, OR
The ‘&’ operator is used to check whether both conditions are true
x=34+4
(x>34)&(36==x)
## [1] FALSE
The ‘|’ operator is used to check whether at least one of two conditions is true
num=3244.22
(num<1)|(num>3244)
## [1] TRUE
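Both operators also work element by element on vectors, and ‘!’ negates a logical value. R additionally provides ‘&&’ and ‘||’, which are meant for single values, as in an if() condition. A short sketch with made-up values:

```r
x <- c(10, 38, 50)

both    <- (x > 34) & (x < 40)   # TRUE only where both conditions hold
either  <- (x < 20) | (x > 40)   # TRUE where at least one condition holds
negated <- !both                 # flips each logical value

# && and || expect single TRUE/FALSE values
num <- 3244.22
in_range <- (num > 1) && (num < 5000)

print(both)
print(either)
print(negated)
print(in_range)
```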
Dimension
We can find the dimensions of data using the following functions.
length() can be used to find the number of elements in a vector.
vector=c(1,2,33,21,243,533,22,12)
length(vector)
## [1] 8
The dimensions of a data frame can be obtained using the dim() function.
dim(data1)
## [1] 150 5
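dim() is also a quick way to see the difference between combining vectors with rbind() and with cbind(). The sketch below reuses the two vectors from the data frame section; nrow() and ncol() return the two dimensions separately.

```r
vector1 <- c(1, 2, 33, 454, 22)
vector2 <- c(2, 43, 32, 123, 2.3)

by_rows <- data.frame(rbind(vector1, vector2))  # each vector becomes a row
by_cols <- data.frame(cbind(vector1, vector2))  # each vector becomes a column

print(dim(by_rows))
print(dim(by_cols))

# nrow() and ncol() give the number of rows and columns individually
print(nrow(by_cols))
print(ncol(by_cols))
```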
Matrix operations
A matrix of size m x n can be created in R using matrix(values,nrow=m,ncol=n,byrow=TRUE/FALSE). nrow and ncol specify the number of rows and columns of the matrix, and byrow specifies whether the elements are filled row by row (TRUE) or column by column (FALSE, the default).
mat1=matrix(c(1,2,3,4,5,6,7,8,9),nrow=3,ncol=3,byrow=T);mat1
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
mat2=matrix(c(5,10,15,12,2,3,7,13,19),nrow=3);mat2
##      [,1] [,2] [,3]
## [1,]    5   12    7
## [2,]   10    2   13
## [3,]   15    3   19
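To make the effect of byrow concrete, the sketch below fills the same six values both ways. It spells out TRUE and FALSE in full, since the short forms T and F are ordinary variables that can be overwritten.

```r
values <- 1:6

# Fill row by row: the first row receives 1, 2, 3
filled_by_row <- matrix(values, nrow = 2, ncol = 3, byrow = TRUE)

# Fill column by column (the default): the first column receives 1, 2
filled_by_col <- matrix(values, nrow = 2, ncol = 3, byrow = FALSE)

print(filled_by_row)
print(filled_by_col)
```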
Transpose of matrix
The transpose can be obtained using t(matrix)
t(mat1)
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
Trace of matrix
The trace can be obtained using sum(diag(matrix))
sum(diag(mat2))
## [1] 26
Addition and Subtraction of matrices
Matrices of the same dimensions are added and subtracted element by element using ‘+’ and ‘-’, just as with numbers
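As a short illustration, the sketch below rebuilds the two matrices defined earlier and applies ‘+’ and ‘-’ to them. Both operations work element by element, so the matrices must have the same dimensions.

```r
mat1 <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3, byrow = TRUE)
mat2 <- matrix(c(5, 10, 15, 12, 2, 3, 7, 13, 19), nrow = 3)

mat_sum  <- mat1 + mat2   # element-wise sum
mat_diff <- mat1 - mat2   # element-wise difference

print(mat_sum)
print(mat_diff)
```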
Multiplication of matrices
Two matrices can be multiplied using ‘%*%’
mat1%*%mat2
##      [,1] [,2] [,3]
## [1,]   70   25   90
## [2,]  160   76  207
## [3,]  250  127  324
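It is easy to confuse ‘%*%’ with ‘*’. The sketch below, using the same two matrices, shows that ‘*’ multiplies element by element, while ‘%*%’ performs the matrix product, whose result depends on the order of the operands.

```r
mat1 <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3, byrow = TRUE)
mat2 <- matrix(c(5, 10, 15, 12, 2, 3, 7, 13, 19), nrow = 3)

elementwise    <- mat1 * mat2     # multiplies matching entries
matrix_product <- mat1 %*% mat2   # rows of mat1 times columns of mat2
reversed       <- mat2 %*% mat1   # generally differs from mat1 %*% mat2

print(elementwise)
print(matrix_product)
print(reversed)
```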
Determinant of matrix
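The determinant can be found using the det() function. Note that mat1 is singular: its exact determinant is 0, and the tiny non-zero value below is floating-point rounding error
det(mat1)
## [1] 6.661338e-16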
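A determinant this close to zero usually means the matrix is singular and the non-zero digits are rounding error. One way to check this, sketched below, is to compare the determinant against a small tolerance and to look at the rank reported by qr(). The tolerance 1e-10 is an illustrative choice, not a fixed rule.

```r
mat1 <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3, byrow = TRUE)

determinant_value <- det(mat1)

# Treat a determinant below the (assumed) tolerance as zero
is_singular <- abs(determinant_value) < 1e-10

# A 3 x 3 matrix with rank below 3 has no inverse
matrix_rank <- qr(mat1)$rank

print(is_singular)
print(matrix_rank)
```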
Inverse of matrix
The matrix inverse can be obtained using the solve() function
solve(mat2)
##             [,1]       [,2]        [,3]
## [1,] -0.01818182 -3.7636364  2.58181818
## [2,]  0.09090909 -0.1818182  0.09090909
## [3,]  0.00000000  3.0000000 -2.00000000
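We can verify an inverse by multiplying it with the original matrix, which should give the identity matrix up to rounding. solve() also accepts a second argument, in which case it solves a linear system directly. The right-hand side b below is a made-up example.

```r
mat2 <- matrix(c(5, 10, 15, 12, 2, 3, 7, 13, 19), nrow = 3)

inverse2 <- solve(mat2)

# mat2 %*% inverse2 should equal the 3 x 3 identity matrix, within rounding
identity_check <- isTRUE(all.equal(mat2 %*% inverse2, diag(3)))

# Solve mat2 %*% x = b for x without forming the inverse explicitly
b <- c(1, 2, 3)
x <- solve(mat2, b)
residual_ok <- isTRUE(all.equal(as.vector(mat2 %*% x), b))

print(identity_check)
print(x)
print(residual_ok)
```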
Data Wrangling with Built-in functions
Accessing data using position
Values in a data frame can be accessed using the row and column numbers where they occur. In general, a data frame is indexed as data[m,n], where m is the row number and n is the column number. Let us see some examples below
Accessing row
The fiftieth row can be extracted using
data1[50,]
##    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 50            5         3.3          1.4         0.2  setosa
Accessing column
The third column can be extracted using
data1[,3]
## [1] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 1.5 1.3 1.4
## [19] 1.7 1.5 1.7 1.5 1.0 1.7 1.9 1.6 1.6 1.5 1.4 1.6 1.6 1.5 1.5 1.4 1.5 1.2
## [37] 1.3 1.4 1.3 1.5 1.3 1.3 1.3 1.6 1.9 1.4 1.6 1.4 1.5 1.4 4.7 4.5 4.9 4.0
## [55] 4.6 4.5 4.7 3.3 4.6 3.9 3.5 4.2 4.0 4.7 3.6 4.4 4.5 4.1 4.5 3.9 4.8 4.0
## [73] 4.9 4.7 4.3 4.4 4.8 5.0 4.5 3.5 3.8 3.7 3.9 5.1 4.5 4.5 4.7 4.4 4.1 4.0
## [91] 4.4 4.6 4.0 3.3 4.2 4.2 4.2 4.3 3.0 4.1 6.0 5.1 5.9 5.6 5.8 6.6 4.5 6.3
## [109] 5.8 6.1 5.1 5.3 5.5 5.0 5.1 5.3 5.5 6.7 6.9 5.0 5.7 4.9 6.7 4.9 5.7 6.0
## [127] 4.8 4.9 5.6 5.8 6.1 6.4 5.6 5.1 5.6 6.1 5.6 5.5 4.8 5.4 5.6 5.1 5.1 5.9
## [145] 5.7 5.2 5.0 5.2 5.4 5.1
Accessing particular value
The value in the thirty-fifth row and third column can be obtained using
data1[35,3]
## [1] 1.5
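The same bracket notation accepts ranges, column names and logical conditions. The sketch below shows a few such variations on the iris data; the threshold 6.5 is chosen only for illustration.

```r
data1 <- iris

# Rows 1 to 3 of two columns selected by name
first_rows <- data1[1:3, c("Sepal.Length", "Species")]

# drop = FALSE keeps a single column as a data frame instead of a vector
one_col_df <- data1[, 3, drop = FALSE]

# Rows that satisfy a condition on a column
long_petals <- data1[data1$Petal.Length > 6.5, ]

print(first_rows)
print(dim(one_col_df))
print(long_petals)
```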
Data description
Using the str() function we can see the structure of the data
str(data1)
## 'data.frame': 150 obs. of 5 variables:
## $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
## $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
## $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
## $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
## $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
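Alongside str(), a few other built-in functions give a quick description of the data. The sketch below is one possible combination: the class of every column, a count per factor level, and a numeric summary of one column.

```r
# Class of every column in the data frame
column_classes <- sapply(iris, class)

# Number of observations in each level of the Species factor
species_counts <- table(iris$Species)

# Minimum, quartiles, mean and maximum of one numeric column
length_summary <- summary(iris$Sepal.Length)

print(column_classes)
print(species_counts)
print(length_summary)
```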
Extracting a particular column of data and its first 6 values
‘$’ can be used to extract a particular column from the data, and the head() function can be used to fetch only the first 6 values
# Let us look at the first 6 values of Sepal.Length in iris
head(iris$Sepal.Length)
## [1] 5.1 4.9 4.7 4.6 5.0 5.4
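head() returns 6 values by default, and its n argument changes that number. A brief sketch also shows tail(), which works from the end of the data, and the ‘[[ ]]’ notation, which extracts the same column as ‘$’.

```r
# First three values instead of the default six
first_three <- head(iris$Sepal.Length, n = 3)

# Last three values of the same column
last_three <- tail(iris$Sepal.Length, n = 3)

# [[ ]] with the column name returns the same vector as $
same_column <- identical(iris$Sepal.Length, iris[["Sepal.Length"]])

print(first_three)
print(last_three)
print(same_column)
```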
Final Note
We have seen important topics such as
- Loading data
- Usage of operators
- Matrix operations
- Data wrangling
The above are only a few of the methods available in R; many more options can be explored.