Introduction

How many times have you had to repeat a task several times, changing just one small input, to get the correct result? Have you ever wished for magical powers to do it without much effort? Well, you don’t need magical powers to do it in R!

Quick Recap of Some Basic Loops

A while loop keeps repeating a set of commands as long as its condition is true, and stops as soon as the condition becomes false.

i <- 0
cond <- TRUE
while(cond)
{
  print(i)
  i <- i+1
  if(i>2)
  {
    cond <- FALSE
  }
}
## [1] 0
## [1] 1
## [1] 2

(Note that ‘if’ is a conditional statement, not a loop.)

A for loop also repeats a set of commands. It is generally used when you know, before the loop starts, how many times the code needs to be executed.

i = 0
for(i in 1:3)
{
  print(i)
}
## [1] 1
## [1] 2
## [1] 3

Did you notice that ‘i’ is not explicitly incremented in the for loop? The loop assigns the next value of 1:3 to ‘i’ on every pass.
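
Because a for loop simply walks through a vector, it is not limited to counting. Here is a small sketch of that idea; the vector `fruits` and the index `k` are made-up names used only for illustration.

```r
# Hypothetical example data: any vector can drive a for loop
fruits <- c("apple", "banana", "cherry")

# The loop variable takes each element in turn; no counter is needed
for(fruit in fruits)
{
  print(fruit)
}

# seq_along() gives the positions 1, 2, 3 when the index itself is useful
for(k in seq_along(fruits))
{
  print(paste(k, fruits[k]))
}
```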

Why Functions If We Have Loops?

Suppose you have to run a set of commands on one variable and then run the same commands on another variable.

i = 0
for(i in 1:4)
{
  print(i)
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
j = 0
for(j in 1:4)
{
  print(j)
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4

So, we are still repeating the code.

Functions rescue us from this cycle of repetition.
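
As a preview, the two loops above could be collapsed into one definition that is called twice. This is only a sketch: `printUpTo` is a made-up name, and the syntax is explained in the sections that follow.

```r
# Hypothetical helper: print every whole number from 1 up to n
printUpTo <- function(n)
{
  for(k in seq_len(n))
  {
    print(k)
  }
}

printUpTo(4)   # does the work of the first loop
printUpTo(4)   # does the work of the second loop
```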

What are Functions?

Functions are sets of commands that are written once and can be executed several times without repeating the code. They can be reused in different R scripts as well. If you have used any commands in R, those are functions written by someone else to make your life easier. This document concentrates only on writing functions.

To create and use functions, you do not need to install any additional package (unless you use package-specific commands inside the function). The ability to define functions is part of base R.

Syntax

Let’s start with a simple function that calculates the area of a circle.

Input

Radius r

Output

area

Formula

area of circle with radius r = pi * r * r

areaC <- function(r)
{
  area <- pi*r*r
  
  return(area)
}

You must have observed that the function is stored in a variable. We can call the function by using the variable name.

If you are wondering why I have not passed the value of ‘pi’, it is because ‘pi’ is a built-in constant in R.

pi
## [1] 3.141593

Now that we have our function ready, let’s play with it.

areaC(2.22)
## [1] 15.48303
ans <- areaC(8438)
ans
## [1] 223680907

Did you notice how the parameter is passed and how the answer is stored in a new variable?
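
Since arithmetic in R works element by element, the same function also accepts a whole vector of radii. The sketch below repeats the definition so that it runs on its own; `radii` and `areas` are made-up names.

```r
# Same definition as above, repeated so this snippet is self-contained
areaC <- function(r)
{
  area <- pi*r*r
  return(area)
}

# Passing a vector returns one area per radius
radii <- c(1, 2, 3)
areas <- areaC(radii)
areas
```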

Let’s make it a little more difficult! What if I want to pass multiple parameters and get multiple output values?

Input

length l and width w

Output

area; perimeter

Formula

area = l * w

perimeter = 2 * (l + w)

PropR <- function(l,w)
{
  areaR <- l*w
  perimeter <- 2*(l+w)
  return(c(areaR,perimeter))
}

PropR(2,4)
## [1]  8 12
RectangleProperties <- PropR(4.78,2.55)
RectangleProperties
## [1] 12.189 14.660
class(RectangleProperties)
## [1] "numeric"

Did you spot ‘c’ in the return statement? We are still returning a single object, but we have combined multiple outputs into one vector!
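
One possible alternative is to return a named list, so that each output can be picked out by name instead of by position. This is a sketch, and `PropRList` is a made-up name rather than part of the example above.

```r
# Hypothetical variant of PropR() that labels its outputs
PropRList <- function(l,w)
{
  return(list(area = l*w, perimeter = 2*(l+w)))
}

props <- PropRList(4.78,2.55)
props$area
## [1] 12.189
props$perimeter
## [1] 14.66
```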

If you have an eye for detail, you may have spotted a slip in PropR(2,4). In this call the width was passed first and the length second, but our function accepts length first and then width, and R matches unnamed arguments by position. It did not matter here, because swapping l and w changes neither the area nor the perimeter, but in general it is important to follow the sequence.
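
To see a case where the order does matter, and how naming the arguments avoids the problem, consider this sketch. `aspectRatio` is a made-up function used only for illustration.

```r
# Hypothetical function whose result depends on argument order
aspectRatio <- function(l,w)
{
  return(l/w)
}

# Positional matching: the first value becomes l, the second becomes w
aspectRatio(2,4)
## [1] 0.5

# Named arguments can be given in any order
aspectRatio(w = 2, l = 4)
## [1] 2
```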

Recursive Functions

Let’s make it a little more complex. We will write a factorial function that also checks for invalid input.

The factorial of a non-negative whole number n (written as ‘n!’) is the product of all the positive whole numbers less than or equal to n.

5! = 1 * 2 * 3 * 4 * 5

Input

n

Output

n!

Any guesses about the factorial of zero? (Hint: answer is in the code below.)

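fact <- function(n)
{
  # Reject negative or fractional input with an error
  if(n<0 || n!=round(n))
    stop("Please enter valid number for factorial")
  # Base cases: 0! and 1! are both 1
  if(n==0 || n==1)
    return(1)
  # Recursive case: n! = n * (n-1)!
  return(n*fact(n-1))
}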

fact(5)
## [1] 120
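
If a function signals invalid input with stop(), the caller can decide what to do about it by using tryCatch(). The sketch below assumes a strict variant named `factStrict` (a made-up name) so that it runs on its own.

```r
# Hypothetical strict factorial: raises an error for invalid input
factStrict <- function(n)
{
  if(n<0 || n!=round(n))
    stop("Please enter valid number for factorial")
  if(n<=1)
    return(1)
  return(n*factStrict(n-1))
}

# tryCatch() evaluates the call and passes any error to the handler,
# which reports the message and returns NA instead of stopping the script
result <- tryCatch(factStrict(-3), error = function(e) {
  message("Caught: ", conditionMessage(e))
  NA
})
result
## [1] NA
```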

This is a recursive function: it calls itself with a smaller argument each time until it reaches the base case (n equal to 1 or 0). Functions can also contain loops or calls to other functions.

factorials <- c(fact(0))

for(i in 1:5)
  factorials <- append(factorials, fact(i))

plot(c(0:5),factorials, col= 'blue', xlab = "Number n", ylab = "Factorial")
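
The same values can be built without recursion and without growing a vector with append(). This is a sketch under the assumption that a loop-based helper is acceptable; `factLoop` is a made-up name.

```r
# Hypothetical loop-based factorial: multiply 1 * 2 * ... * n
factLoop <- function(n)
{
  result <- 1
  for(k in seq_len(n))   # seq_len(0) is empty, so factLoop(0) stays 1
    result <- result * k
  return(result)
}

# sapply() calls factLoop() once per element and collects the results
factorials <- sapply(0:5, factLoop)
factorials
## [1]   1   1   2   6  24 120
```

Base R also ships its own factorial() function, which can be used to check the results.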

Data Breaches

Now, let’s take a look at a real dataset of data breaches:

# To run this code install and load packages
# install.packages("tidyverse")
library(tidyverse)
## -- Attaching packages ------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.2.1     v purrr   0.3.2
## v tibble  2.1.3     v dplyr   0.8.3
## v tidyr   0.8.3     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.4.0
## -- Conflicts ---------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
#install.packages("rvest")
library(rvest)
## Loading required package: xml2
## 
## Attaching package: 'rvest'
## The following object is masked from 'package:purrr':
## 
##     pluck
## The following object is masked from 'package:readr':
## 
##     guess_encoding
url <- "https://en.wikipedia.org/wiki/List_of_data_breaches"
BreachData <- url %>%
  read_html() %>%
  html_nodes(xpath='//*[@id="mw-content-text"]/div/table') %>%
  html_table()

BreachData <- BreachData[[1]]
# This is an example of a function. It prints very basic information about a data frame. You can try expanding it to explore the data frame further.

ExploreData <- function(data)
{
  
  print("printing top 2 rows")
  print(head(data,2))
    
  print("Printing Summary")
  print(summary(data))
  
}

ExploreData(BreachData)
## [1] "printing top 2 rows"
##                  Entity Year   Records Organization type        Method
## 1 21st Century Oncology 2016 2,200,000        healthcare        hacked
## 2 Accendo Insurance Co. 2011   175,350        healthcare poor security
##   Sources
## 1  [5][6]
## 2  [7][8]
## [1] "Printing Summary"
##     Entity              Year             Records         
##  Length:287         Length:287         Length:287        
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##  Organization type     Method            Sources         
##  Length:287         Length:287         Length:287        
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character
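
The summary shows that every column, including Year and Records, was read as text. One way to extend the exploration is a small helper that converts such columns to numbers. This is only a sketch: `ToNumber` and `breachSample` are made-up names, the two rows are copied from the output above, and the real table may contain entries that need more cleaning.

```r
# Hypothetical helper: turn strings such as "2,200,000" into numbers
ToNumber <- function(x)
{
  cleaned <- gsub(",", "", x, fixed = TRUE)      # drop thousands separators
  return(suppressWarnings(as.numeric(cleaned)))  # non-numeric text becomes NA
}

# Small stand-in for the scraped table, so the snippet runs offline
breachSample <- data.frame(
  Entity  = c("21st Century Oncology", "Accendo Insurance Co."),
  Year    = c("2016", "2011"),
  Records = c("2,200,000", "175,350"),
  stringsAsFactors = FALSE
)

breachSample$Year    <- ToNumber(breachSample$Year)
breachSample$Records <- ToNumber(breachSample$Records)
summary(breachSample$Records)
```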

Keep Exploring!

Now that you have a good understanding of functions, explore how you can reuse them in other scripts. (HINT: source(), packages)
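
As a starting point for the first hint, here is a minimal sketch of source(). It writes a function to a temporary file only so that the example is self-contained; in practice the file would be a script you keep alongside your project. `areaSquare` and `scriptPath` are made-up names.

```r
# Write a function definition to a temporary script file
scriptPath <- tempfile(fileext = ".R")
writeLines(c(
  "areaSquare <- function(side)",
  "{",
  "  return(side*side)",
  "}"
), scriptPath)

# source() runs the file, which defines areaSquare() in the current session
source(scriptPath)
areaSquare(3)
## [1] 9
```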