Creating a three-column data frame

The data frame has three columns, each containing 10,000,000 random integers from 1 to 100.

library(dplyr)
# replace = TRUE samples with replacement, so the same value can occur more than once.
col1 <- sample(1:100, 10000000, replace = TRUE)
col2 <- sample(1:100, 10000000, replace = TRUE)
col3 <- sample(1:100, 10000000, replace = TRUE)

df <- data.frame(col1 = col1, col2 = col2, col3 = col3)
head(df)

Using dplyr’s mutate and case_when functions as an alternative to nested ifelse statements.

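# ELSE is only a readable alias for TRUE, the catch-all condition in case_when.
ELSE <- TRUE
start.time <- Sys.time()
# mutate() already evaluates column names inside df, so no with(., ...) wrapper is needed.
df2 <- df %>% mutate(result = case_when(
  col1 > 20 & col2 < 50 & col3 > 20 ~ 1,
  col1 < 15 | (col2 > 50 & col3 > 30) ~ 2,
  ELSE ~ 3
))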
end.time <- Sys.time()
time.taken <- end.time - start.time

Preview of the resulting data frame

head(df2)

Time taken to create the ‘result’ column

time.taken
## Time difference of 1.565432 secs
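
To make the three branches easier to follow, here is a minimal sketch on a tiny data frame. The name `small` and its six rows are hypothetical values chosen for illustration, so that every branch is hit at least once; the conditions are the same as above, with a literal TRUE as the catch-all.

```r
library(dplyr)

# Hypothetical six-row data frame (not part of the benchmark data).
small <- data.frame(
  col1 = c(25, 10, 25, 18, 10, 30),
  col2 = c(40, 40, 60, 60, 60, 50),
  col3 = c(30, 30, 40, 10, 10, 25)
)

small <- small %>%
  mutate(result = case_when(
    col1 > 20 & col2 < 50 & col3 > 20   ~ 1,  # all three conditions hold
    col1 < 15 | (col2 > 50 & col3 > 30) ~ 2,  # either alternative holds
    TRUE                                ~ 3   # everything else
  ))

small$result
## [1] 1 2 2 3 2 3
```

Rows 4 and 6 match neither of the first two conditions, so they fall through to the catch-all and get 3.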

Reproducing the previous result using nested ifelse statements

df3 <- df
start.time <- Sys.time()
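# with() evaluates the conditions on df3's own columns rather than on the global vectors.
df3$result <- with(df3, ifelse(col1 > 20 & col2 < 50 & col3 > 20, 1,
  ifelse(col1 < 15 | (col2 > 50 & col3 > 30), 2, 3)))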
end.time <- Sys.time()
time.taken <- end.time - start.time

Preview of the resulting data frame

head(df3)

Time taken using nested ifelse statements

time.taken
## Time difference of 1.96998 secs
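
A single Sys.time() measurement can vary from run to run, and comparing previews with head() only checks the first few rows. The sketch below is one possible way to check both points; it is not the procedure used above. It assumes a smaller sample of 100,000 rows and an arbitrary seed so that it runs quickly and repeatably, and the names `small_df`, `res_case_when` and `res_ifelse` are hypothetical. It times each approach with base R's system.time() and compares the full result columns with identical().

```r
library(dplyr)

set.seed(42)  # assumed seed, only so the sketch is repeatable
n <- 1e5      # assumed size, much smaller than the 10,000,000 rows above

small_df <- data.frame(
  col1 = sample(1:100, n, replace = TRUE),
  col2 = sample(1:100, n, replace = TRUE),
  col3 = sample(1:100, n, replace = TRUE)
)

# system.time() returns user, system and elapsed seconds for one expression.
t_case_when <- system.time(
  res_case_when <- small_df %>%
    mutate(result = case_when(
      col1 > 20 & col2 < 50 & col3 > 20   ~ 1,
      col1 < 15 | (col2 > 50 & col3 > 30) ~ 2,
      TRUE                                ~ 3
    ))
)

t_ifelse <- system.time(
  res_ifelse <- with(small_df, ifelse(
    col1 > 20 & col2 < 50 & col3 > 20, 1,
    ifelse(col1 < 15 | (col2 > 50 & col3 > 30), 2, 3)
  ))
)

# Both approaches should produce exactly the same column.
identical(res_case_when$result, res_ifelse)

# Elapsed times depend on the machine and on the dplyr version.
t_case_when["elapsed"]
t_ifelse["elapsed"]
```

Repeating the timings several times and comparing medians would give a steadier picture than any single run.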