1 Context


I want to know how many of my LinkedIn connections work in HR.


2 User-defined function


# Returns TRUE if the connection's position is related to HR, FALSE otherwise

isHR <- function(my_string){
  
  my_string <- tolower(my_string) # We use lower case to compare strings
  
  # List of words related to HR
  hr_words <- c("hr", "recruitment", "recruiter", "recruiting", "talent", "human", "resources", 
                "acquisition", "people", "headhunter", "headhunting", # English
                "rrhh", "recursos", "humanos", "talento", "gente", "selección", "personal",
                "rr.", "hh.", "rr.hh.") #Spanish
  
  my_words <- strsplit(my_string, split = " ", fixed = TRUE)[[1]] # We split the string into words
  
  # TRUE if at least one HR word appears among the words of the position
  return(any(hr_words %in% my_words))
  
}
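
As a quick sanity check, the function can be tried on a few job titles. The titles below are invented for illustration, and the short fallback definition (with a reduced word list) is only there so the snippet runs on its own; it is skipped when the full isHR() above is already loaded.

```r
# Fallback so the snippet runs on its own (reduced word list, an assumption);
# skipped when the full isHR() defined above is already in the session
if (!exists("isHR")) {
  isHR <- function(my_string) {
    hr_words <- c("hr", "recruiter", "talent", "human", "resources",
                  "rrhh", "recursos", "humanos")
    my_words <- strsplit(tolower(my_string), split = " ", fixed = TRUE)[[1]]
    any(hr_words %in% my_words)
  }
}

# Hypothetical job titles, invented for illustration
isHR("HR Manager")                    # TRUE: "hr" is one of the words
isHR("Senior Technical Recruiter")    # TRUE: "recruiter" is one of the words
isHR("Director de Recursos Humanos")  # TRUE: Spanish keywords match
isHR("Data Scientist")                # FALSE: no HR word
isHR("HR/Talent Lead")                # FALSE: "hr/talent" stays a single word
```

The last call shows a limitation of splitting on spaces only: words joined by punctuation are not separated, so such titles are missed.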

3 Libraries


library(data.table)
library(ggplot2)
library(plyr)

4 Data import


# Reading the table of my LinkedIn connections
LinkedIn <- fread("my_connections.csv", header = TRUE, stringsAsFactors = FALSE)

# Summary of the table
summary(LinkedIn)
##    Position         Connected On      
##  Length:1745        Length:1745       
##  Class :character   Class :character  
##  Mode  :character   Mode  :character
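
With the table loaded, one possible way to count the HR-related connections is to apply isHR() to every value of the Position column. The sketch below is an assumption about how this could be done, not the original analysis: it uses a small invented table named connections (hypothetical) instead of the real file, and a fallback isHR() with a reduced word list so that it runs on its own.

```r
# Fallback so the snippet runs on its own (reduced word list, an assumption);
# skipped when the full isHR() from section 2 is already in the session
if (!exists("isHR")) {
  isHR <- function(my_string) {
    hr_words <- c("hr", "recruiter", "talent", "human", "resources")
    my_words <- strsplit(tolower(my_string), split = " ", fixed = TRUE)[[1]]
    any(hr_words %in% my_words)
  }
}

# Hypothetical stand-in for the imported table: invented rows, Position column only
connections <- data.frame(
  Position = c("HR Manager", "Data Scientist",
               "Talent Acquisition Specialist", "Software Engineer"),
  stringsAsFactors = FALSE
)

# isHR() handles one string at a time, so it is applied position by position
connections$is_hr <- vapply(connections$Position, isHR, logical(1),
                            USE.NAMES = FALSE)

n_hr     <- sum(connections$is_hr)   # number of HR-related connections
share_hr <- mean(connections$is_hr)  # proportion of HR-related connections

table(connections$is_hr)
```

On the real data, the same lines should work with LinkedIn$Position in place of connections$Position, since a table read with fread also supports the $ accessor.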

6 Conclusion


The best way to solve a problem is to write down which task we need to accomplish, then cut this task into pieces that can be attacked and solved one by one.

A well-written, well-documented function can save us hundreds of hours in the future, so it pays to write code carefully.

In this case, a simple function has helped us answer the question: how many HR-related people are there among my LinkedIn connections?