
Preamble

These notes show operations that can be performed in R using its built-in functions.

Built-in R functions

Functions that are already defined in the R programming environment are known as built-in functions. We shall look at various operations that can be done with these functions.

Reading Data

In R we can load data in three ways:

  1. Assign manually
  2. Load data from local storage
  3. Load data from Package

Data assigning manually

Single value

We can assign a single value to a variable as follows

number1=234
number2=55.3644
variable_2="string"
.p=2

The value on the right side of ‘=’ is stored under the variable name on the left side.
Note that variable names are built from letters, digits, underscores and dots, and must start with a letter or with a dot that is not followed by a digit.

Vector

A vector of n elements is created with c(), listing the elements as c(element1,...,elementn)

vector=c(1,2,3,4)

Vector to data frame

Once we have created vectors, we can combine them into a data frame in either of the following ways

  1. merging vectors of the same size as rows
vector1=c(1,2,33,454,22)
vector2=c(2,43,32,123,2.3)
d=rbind(vector1,vector2)
d=data.frame(d)
  2. merging vectors of the same size as columns
vector1=c(1,2,33,454,22)
vector2=c(2,43,32,123,2.3)
d=cbind(vector1,vector2)
d=data.frame(d)
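
The two-step pattern above (bind, then convert) can often be shortened by passing the vectors straight to data.frame(). The following is a minimal sketch that reuses the vector names from the examples above; the object names d_direct and d_rows are only illustrative.

```r
vector1 <- c(1, 2, 33, 454, 22)
vector2 <- c(2, 43, 32, 123, 2.3)

# Each vector becomes one column; column names are taken from the vector names
d_direct <- data.frame(vector1, vector2)
dim(d_direct)    # 5 rows, 2 columns
names(d_direct)  # "vector1" "vector2"

# Binding the vectors as rows gives the transposed layout: 2 rows, 5 columns
d_rows <- data.frame(rbind(vector1, vector2))
dim(d_rows)
```

Which layout is appropriate depends on whether each vector holds one variable (columns) or one observation (rows).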

Data from local storage

If the data are stored in CSV format, we can load them using read.csv()

da=read.csv("C:/Users/Sathyanaraynan/Documents/Sample.csv")
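
Since the path above exists only on the author's machine, the round trip below may be easier to try out. It is a sketch under stated assumptions: the data frame sample_df and its columns are made up for illustration, and the file is written to a temporary location created by tempfile().

```r
# Write a small, made-up data frame to a temporary CSV file
tmp <- tempfile(fileext = ".csv")
sample_df <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.8))
write.csv(sample_df, tmp, row.names = FALSE)

# Read it back; by default the first line of the file is used as the header
da_check <- read.csv(tmp)
str(da_check)

# Remove the temporary file
unlink(tmp)
```

Setting row.names = FALSE when writing avoids an extra column of row numbers appearing when the file is read back.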

Data from package

For example, we can load a data set called “iris” from the package “datasets” by assigning it to a variable name. Since this package is loaded by default, we can simply write

data1=iris

To learn more about data types and structures, see Introduction to Data Types and Structures in R.

Operators

Operators are symbols that are used to do assignment, arithmetic and logical operations with variables.

Assignment operator

Data can be assigned to a variable name in R using ‘=’ or ‘<-’, which assign the value on the right side to the name on the left, and ‘->’, which assigns the value on the left side to the name on the right. Some examples follow; note that tan() expects its argument in radians.

variable=22
variable
## [1] 22
vect<-43+2
vect
## [1] 45
tan(45)->angle
angle
## [1] 1.619775

Addition operator

The ‘+’ operator is used to add two numbers

23+55576
## [1] 55599

Subtraction operator

The ‘-’ operator is used to find difference between two numbers

442-221.24
## [1] 220.76

Multiplication operator

The ‘*’ operator is used for multiplying two numbers

22.3*63.2
## [1] 1409.36

Division operator

The ‘/’ operator is used to divide one number by another. Dividing a non-zero number by zero gives Inf or -Inf rather than an error.

99/123
## [1] 0.804878
-14/0
## [1] -Inf

Remainder operator

The ‘%%’ operator is used to find the remainder when one number is divided by another

35%%4
## [1] 3
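
The remainder pairs naturally with integer division. The sketch below uses ‘%/%’, which is not covered in the text above but is a built-in R operator, to show how quotient and remainder together recover the original number. The variable names are illustrative.

```r
quotient  <- 35 %/% 4   # integer division: 8
remainder <- 35 %% 4    # remainder: 3

# Quotient times divisor plus remainder gives back the original number
quotient * 4 + remainder == 35   # TRUE

# With a negative dividend the remainder takes the sign of the divisor
(-35) %% 4    # 1
(-35) %/% 4   # -9
```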

Exponentiation operator

The ‘^’ operator is used to find nth power of a number

78^7
## [1] 1.756557e+13

Comparison operator

We can check whether one number is equal to, not equal to, less than, less than or equal to, greater than, or greater than or equal to another using the following operators

num1=23.1433
num2=log(12353)

## Checking Equality
num1==num2
## [1] FALSE
## Inequality 
num1!=num2
## [1] TRUE
## Less than
num1<num2
## [1] FALSE
## Less than or equal to
num1<=num1
## [1] TRUE
## Greater than
num1>num2
## [1] TRUE
## Greater than or equal to
num1>=num2
## [1] TRUE

Checking existence

The ‘%in%’ operator checks whether an element is present in a vector

1 %in% c(2,1,3,4)
## [1] TRUE
# Can also be used for strings
"in" %in% c("word","sentence","in")
## [1] TRUE
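
The left-hand side of %in% may itself be a vector, in which case one TRUE/FALSE is returned per element. A small sketch with made-up values (the names wanted and pool are illustrative):

```r
wanted <- c(1, 7, 4)
pool   <- c(2, 1, 3, 4)

# One logical value per element of `wanted`
found <- wanted %in% pool
found                 # TRUE FALSE TRUE

# The logical vector can be used to keep only the matching elements
wanted[found]         # 1 4
```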

Logical AND, OR

The ‘&’ operator checks whether both conditions are true

x=34+4
(x>34)&(36==x)
## [1] FALSE

The ‘|’ operator checks whether at least one of two conditions is true

num=3244.22
(num<1)|(num>3244)
## [1] TRUE
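
Both operators also work element by element on vectors, which is how they are typically used when filtering data. The sketch below uses made-up values; any() and all() are built-in functions that summarise a logical vector.

```r
x <- c(10, 38, 50)

# Element-wise AND: TRUE only where both conditions hold
in_range <- (x > 34) & (x < 45)
in_range              # FALSE TRUE FALSE

# Element-wise OR: TRUE where at least one condition holds
outside <- (x < 20) | (x > 45)
outside               # TRUE FALSE TRUE

any(in_range)         # TRUE: at least one element is in range
all(in_range)         # FALSE: not every element is in range
```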

Dimension

We can find the dimensions of data using the following functions.
length() returns the number of elements in a vector.

vector=c(1,2,33,21,243,533,22,12)
length(vector)
## [1] 8

The dimensions of a data frame (number of rows, then number of columns) can be obtained using the dim() function.

dim(data1)
## [1] 150   5
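
If only one of the two dimensions is needed, the built-in functions nrow() and ncol() return it directly. A short sketch, repeating the assignment of iris to data1 so that it runs on its own:

```r
data1 <- iris

nrow(data1)     # number of rows: 150
ncol(data1)     # number of columns: 5

# For a data frame, length() counts columns, not individual values
length(data1)   # 5
```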

Matrix operations

A matrix of size m x n can be created in R using matrix(values,nrow=m,ncol=n,byrow=TRUE/FALSE). nrow and ncol specify the number of rows and columns of the matrix, and byrow specifies whether the elements are filled row by row (TRUE) or column by column (FALSE, the default).

mat1=matrix(c(1,2,3,4,5,6,7,8,9),nrow=3,ncol=3,byrow=T);mat1
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
mat2=matrix(c(5,10,15,12,2,3,7,13,19),nrow=3);mat2
##      [,1] [,2] [,3]
## [1,]    5   12    7
## [2,]   10    2   13
## [3,]   15    3   19
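
The effect of byrow is easiest to see by filling the same values both ways. A minimal sketch with illustrative names:

```r
values <- 1:6

# Default (byrow = FALSE): values run down the columns
by_col <- matrix(values, nrow = 2, ncol = 3)
by_col
# first row is 1 3 5, second row is 2 4 6

# byrow = TRUE: values run along the rows
by_row <- matrix(values, nrow = 2, ncol = 3, byrow = TRUE)
by_row
# first row is 1 2 3, second row is 4 5 6
```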

Transpose of matrix

The transpose can be obtained using t(matrix)

t(mat1)
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9

Trace of matrix

The trace (the sum of the diagonal elements) can be obtained using sum(diag(matrix))

sum(diag(mat2))
## [1] 26

Addition and Subtraction of matrices

Matrices of the same dimensions are added and subtracted element by element with ‘+’ and ‘-’, just as numbers are
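
As a brief illustration, the sketch below redefines the two matrices used in this section so that it can be run on its own; mat_sum and mat_diff are illustrative names.

```r
mat1 <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3, byrow = TRUE)
mat2 <- matrix(c(5, 10, 15, 12, 2, 3, 7, 13, 19), nrow = 3)

# Element-wise sum: entry [1,1] is 1 + 5 = 6
mat_sum <- mat1 + mat2
mat_sum

# Element-wise difference: entry [1,1] is 1 - 5 = -4
mat_diff <- mat1 - mat2
mat_diff
```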

Multiplication of matrices

Two matrices can be multiplied using ‘%*%’, provided the number of columns of the first equals the number of rows of the second

mat1%*%mat2
##      [,1] [,2] [,3]
## [1,]   70   25   90
## [2,]  160   76  207
## [3,]  250  127  324
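
A common source of confusion is that ‘*’ applied to matrices multiplies element by element and is not the matrix product. The sketch below contrasts the two, redefining the matrices from this section so that it runs on its own.

```r
mat1 <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3, byrow = TRUE)
mat2 <- matrix(c(5, 10, 15, 12, 2, 3, 7, 13, 19), nrow = 3)

# Element-wise product: entry [1,1] is 1 * 5 = 5
elementwise <- mat1 * mat2
elementwise[1, 1]

# Matrix product: entry [1,1] is 1*5 + 2*10 + 3*15 = 70
matprod <- mat1 %*% mat2
matprod[1, 1]
```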

Determinant of matrix

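The determinant can be found using the det() function. Note that mat1 is singular, so its exact determinant is 0; the tiny non-zero value printed below is floating-point rounding error.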

det(mat1)
## [1] 6.661338e-16

Inverse of matrix

The inverse of a non-singular square matrix can be obtained using the solve() function

solve(mat2)
##             [,1]       [,2]        [,3]
## [1,] -0.01818182 -3.7636364  2.58181818
## [2,]  0.09090909 -0.1818182  0.09090909
## [3,]  0.00000000  3.0000000 -2.00000000
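
One way to check an inverse is to multiply it by the original matrix and compare the result with the identity matrix. The sketch below does this for mat2, redefined here so that it runs on its own. It uses all.equal() rather than ‘==’ because the product is only equal to the identity up to floating-point error.

```r
mat2 <- matrix(c(5, 10, 15, 12, 2, 3, 7, 13, 19), nrow = 3)

# mat2 is invertible because its determinant (55) is non-zero
det(mat2)

inv2 <- solve(mat2)

# The product should be the 3 x 3 identity matrix, up to rounding
check <- mat2 %*% inv2
round(check, 10)
isTRUE(all.equal(check, diag(3)))   # TRUE
```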

Data Wrangling with Built-in functions

Accessing data using position

Values in a data frame can be accessed using the row and column numbers at which they occur. In general, an element is addressed as data[m,n], where m is the row number and n is the column number. Let us see some examples below

Accessing row

The fiftieth row can be extracted using

data1[50,]
##    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 50            5         3.3          1.4         0.2  setosa

Accessing column

The third column can be extracted using

data1[,3]
##   [1] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 1.5 1.3 1.4
##  [19] 1.7 1.5 1.7 1.5 1.0 1.7 1.9 1.6 1.6 1.5 1.4 1.6 1.6 1.5 1.5 1.4 1.5 1.2
##  [37] 1.3 1.4 1.3 1.5 1.3 1.3 1.3 1.6 1.9 1.4 1.6 1.4 1.5 1.4 4.7 4.5 4.9 4.0
##  [55] 4.6 4.5 4.7 3.3 4.6 3.9 3.5 4.2 4.0 4.7 3.6 4.4 4.5 4.1 4.5 3.9 4.8 4.0
##  [73] 4.9 4.7 4.3 4.4 4.8 5.0 4.5 3.5 3.8 3.7 3.9 5.1 4.5 4.5 4.7 4.4 4.1 4.0
##  [91] 4.4 4.6 4.0 3.3 4.2 4.2 4.2 4.3 3.0 4.1 6.0 5.1 5.9 5.6 5.8 6.6 4.5 6.3
## [109] 5.8 6.1 5.1 5.3 5.5 5.0 5.1 5.3 5.5 6.7 6.9 5.0 5.7 4.9 6.7 4.9 5.7 6.0
## [127] 4.8 4.9 5.6 5.8 6.1 6.4 5.6 5.1 5.6 6.1 5.6 5.5 4.8 5.4 5.6 5.1 5.1 5.9
## [145] 5.7 5.2 5.0 5.2 5.4 5.1

Accessing particular value

The value in the thirty-fifth row and third column can be obtained using

data1[35,3]
## [1] 1.5
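
The same bracket notation also accepts ranges, vectors of positions and column names. The sketch below shows a few such variations on the iris data; it repeats the assignment to data1 so that it runs on its own.

```r
data1 <- iris

# Rows 1 to 3, all columns
data1[1:3, ]

# Rows 1 to 3, first and fifth columns only
subset_small <- data1[1:3, c(1, 5)]
dim(subset_small)                 # 3 rows, 2 columns

# A column can be selected by name instead of by position
by_name <- data1[35, "Petal.Length"]
by_name                           # 1.5, the same value as data1[35,3]
```

Selecting by name is usually more robust than selecting by position, because it keeps working if the column order changes.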

Data description

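Using the str() function we can inspect the structure of the data: the number of observations and variables, and the type and first few values of each column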

str(data1)
## 'data.frame':    150 obs. of  5 variables:
##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

Extracting a particular column and its first 6 values

‘$’ can be used to extract a particular column from a data frame by name, and the head() function returns only the first 6 values by default

# Let us look at the first 6 values of the Sepal.Length column of iris
head(iris$Sepal.Length)
## [1] 5.1 4.9 4.7 4.6 5.0 5.4
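
The number of values returned by head() can be changed with its n argument, and head() also works on a whole data frame. A short sketch with illustrative object names:

```r
# First three values of a single column
first_three <- head(iris$Sepal.Length, n = 3)
first_three            # 5.1 4.9 4.7

# First two rows of the whole data frame
top_rows <- head(iris, n = 2)
top_rows
```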

Final Note

We have covered the following topics

  • Loading data
  • Usage of operators
  • Matrix operations
  • Data wrangling

These are only a few of the built-in functions available in R; many more can be explored through the documentation that ships with R.