Employer Review About Their Organisation

Context

Every organization has pros and cons that its employees feel should be made public, so that people who want to join the organization can base their decisions on reviews from those who have worked there.

Content

This is textual data in the form of a JSON file. It has more than 145k records. Each record has attributes such as:

  • Review Title
  • Review Body
  • Review Rating
  • Reviewed Company
  • Review description

Acknowledgements

All thanks to indeed.com for making this data public and easily available.


1. Load Required Libraries

library(tidyverse)
library(jsonlite)
library(lubridate)

2. Download Datasets

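# Download results.json manually from the Kaggle page below and place it in the working directory.
# (download.file() would also need a destfile argument, and Kaggle downloads typically require signing in.)
# https://www.kaggle.com/takkimsncn/employer-review-about-their-organisation/data?select=results.json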
er <- fromJSON('results.json')
str(er)
'data.frame':   145209 obs. of  5 variables:
 $ ReviewTitle   : chr  "Productive" "Stressful" "Good Company for Every employee" "Productive" ...
 $ CompleteReview: chr  "Good company, cool workplace, work load little bit higher. Clean environment, disciplined, good cantin, big cam"| __truncated__ "1. Need to work on boss's whims and fancies 2. Priorities keep changing 3. No regards for work life balance 4. "| __truncated__ "Good company for every Engineers dream, Full Mediclaim for entired family, Free transport services from company"| __truncated__ "I am just pass out bsc in chemistry Typical day at work Mangement Work place good The most enjoyable part of th"| __truncated__ ...
 $ URL           : chr  "https://in.indeed.com/cmp/Reliance-Industries-Ltd/reviews" "https://in.indeed.com/cmp/Reliance-Industries-Ltd/reviews" "https://in.indeed.com/cmp/Reliance-Industries-Ltd/reviews" "https://in.indeed.com/cmp/Reliance-Industries-Ltd/reviews" ...
 $ Rating        : chr  "3.0" "3.0" "5.0" "5.0" ...
 $ ReviewDetails : chr  "(Current Employee)  -  Ghansoli  -  August 30, 2021" "(Former Employee)  -   -  August 26, 2021" "(Former Employee)  -   -  August 17, 2021" "(Current Employee)  -   -  August 17, 2021" ...

3. Process Data

er$com <- sapply(strsplit(er$URL,'/'),'[[',5) %>% trimws(which='both')
er$CF <- sapply(strsplit(er$ReviewDetails,'-'),'[[',1) %>% trimws(which='both')
er$location <- sapply(strsplit(er$ReviewDetails,'-'),'[[',2) %>% trimws(which='both')
er$date <- sapply(strsplit(er$ReviewDetails,'-'),'[[',3) %>% trimws(which='both')
er$date <- as.Date(er$date,'%B %d, %Y')
str(er)
'data.frame':   145209 obs. of  9 variables:
 $ ReviewTitle   : chr  "Productive" "Stressful" "Good Company for Every employee" "Productive" ...
 $ CompleteReview: chr  "Good company, cool workplace, work load little bit higher. Clean environment, disciplined, good cantin, big cam"| __truncated__ "1. Need to work on boss's whims and fancies 2. Priorities keep changing 3. No regards for work life balance 4. "| __truncated__ "Good company for every Engineers dream, Full Mediclaim for entired family, Free transport services from company"| __truncated__ "I am just pass out bsc in chemistry Typical day at work Mangement Work place good The most enjoyable part of th"| __truncated__ ...
 $ URL           : chr  "https://in.indeed.com/cmp/Reliance-Industries-Ltd/reviews" "https://in.indeed.com/cmp/Reliance-Industries-Ltd/reviews" "https://in.indeed.com/cmp/Reliance-Industries-Ltd/reviews" "https://in.indeed.com/cmp/Reliance-Industries-Ltd/reviews" ...
 $ Rating        : chr  "3.0" "3.0" "5.0" "5.0" ...
 $ ReviewDetails : chr  "(Current Employee)  -  Ghansoli  -  August 30, 2021" "(Former Employee)  -   -  August 26, 2021" "(Former Employee)  -   -  August 17, 2021" "(Current Employee)  -   -  August 17, 2021" ...
 $ com           : chr  "Reliance-Industries-Ltd" "Reliance-Industries-Ltd" "Reliance-Industries-Ltd" "Reliance-Industries-Ltd" ...
 $ CF            : chr  "(Current Employee)" "(Former Employee)" "(Former Employee)" "(Current Employee)" ...
 $ location      : chr  "Ghansoli" "" "" "" ...
 $ date          : Date, format: "2021-08-30" "2021-08-26" ...
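
Splitting ReviewDetails on every '-' assumes that neither the job title nor the location contains a hyphen. The sketch below shows one possible alternative that uses a single regular expression instead. It is only an illustration: the third input string and the column names `title` and `status` are made up for this example and are not part of the dataset or the original notebook.

```r
# Two strings copied from the str() output above, plus one hypothetical
# string with hyphens in the job title and the location.
details <- c(
  "(Current Employee)  -  Ghansoli  -  August 30, 2021",
  "(Former Employee)  -   -  August 26, 2021",
  "Officer - Sales (Former Employee)  -  Navi-Mumbai  -  August 17, 2021"
)

# Groups: optional job title, Current/Former, location, date.
pattern <- paste0(
  "^(.*)\\((Current|Former) Employee\\)",
  "\\s+-\\s+(.*?)\\s*-\\s+",
  "([A-Za-z]+ [0-9]{1,2}, [0-9]{4})$"
)

# regexec() returns the full match followed by the four groups.
# Strings that do not match would yield character(0) and need separate handling.
m <- regmatches(details, regexec(pattern, details, perl = TRUE))
parts <- do.call(rbind, m)

parsed <- data.frame(
  title    = trimws(parts[, 2]),
  status   = parts[, 3],
  location = trimws(parts[, 4]),
  # %B relies on English month names in the current locale.
  date     = as.Date(parts[, 5], "%B %d, %Y"),
  stringsAsFactors = FALSE
)
print(parsed)
```

With the naive split, the second piece of the third string would be part of the job title rather than the location, which is the kind of row that ends up with an unparseable date.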

4. Convert column types for manipulation (lower-case text, numeric Rating, factors)

er <- er %>% mutate_at(-c(4,9),tolower) %>% mutate_at(4,as.numeric) %>% 
    mutate_at(6:8,as.factor)
er$CF01 <- str_sub(er$CF,-18,-1) %>% trimws(which='both')
er$CF01 <- as.factor(er$CF01)
summary(er)
 ReviewTitle        CompleteReview         URL                Rating     
 Length:145209      Length:145209      Length:145209      Min.   :1.000  
 Class :character   Class :character   Class :character   1st Qu.:4.000  
 Mode  :character   Mode  :character   Mode  :character   Median :4.000  
                                                          Mean   :4.054  
                                                          3rd Qu.:5.000  
                                                          Max.   :5.000  
                                                                         
 ReviewDetails                                   com       
 Length:145209      tata-consultancy-services-(tcs):14441  
 Class :character   ibm                            :10820  
 Mode  :character   infosys                        :10696  
                    accenture                      :10137  
                    cognizant-technology-solutions : 9626  
                    hdfc-bank                      : 6749  
                    (Other)                        :82740  
                                CF                              location     
 (former employee)               :79193                             :129943  
 (current employee)              :65493   india                     :  4367  
 officer   (former employee)     :  103   bangalore urban, karnataka:   989  
 officer   (current employee)    :   91   hp                        :   279  
 health care   (current employee):   20   in                        :   242  
 employee   (current employee)   :   19   kerala                    :   228  
 (Other)                         :  290   (Other)                   :  9161  
      date                            CF01      
 Min.   :2011-08-31   (current employee):65747  
 1st Qu.:2015-06-11   (former employee) :79461  
 Median :2017-04-18   head row          :    1  
 Mean   :2017-02-09                             
 3rd Qu.:2018-09-21                             
 Max.   :2021-09-08                             
 NA's   :142                                    
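
The summary above shows that CF sometimes carries a job title in front of the employment status, and that taking the last 18 characters leaves one stray level ("head row"). As a hedged alternative, the status could be extracted by pattern instead of by position. The sketch below uses base R on a few made-up values; the vector `cf` and the result `status` are illustrative names only.

```r
# Example values modelled on the levels shown in the summary above.
cf <- c("(current employee)",
        "officer   (former employee)",
        "health care   (current employee)",
        "head row")

pattern <- "\\((current|former) employee\\)"

# regexpr() returns -1 where there is no match; those entries stay NA.
hit <- regexpr(pattern, cf)
status <- rep(NA_character_, length(cf))
status[hit > 0] <- regmatches(cf, hit)

status <- factor(status)
print(table(status, useNA = "ifany"))
```

Values without a recognisable status become NA here, so they would be removed together with the other NA rows instead of forming their own factor level.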

5. Drop the CF column and rows with NA values

er <- er[-7] %>% drop_na

6. Check variables

colnames(er)
[1] "ReviewTitle"    "CompleteReview" "URL"            "Rating"        
[5] "ReviewDetails"  "com"            "location"       "date"          
[9] "CF01"          

7. Plot Employers Rated by Current and Former Employees
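
The figure for this step is not reproduced in the text. A minimal sketch of how such a plot could be built is given below, assuming the cleaned data frame has the columns com, Rating and date shown in step 6. The data frame `toy` and its values are made up so that the example runs on its own.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Toy stand-in for the cleaned `er` data frame (values are invented).
toy <- data.frame(
  com    = c("accenture", "accenture", "accenture", "ibm", "ibm", "ibm"),
  Rating = c(5, 3, 2, 4, 4, 5),
  date   = as.Date(c("2019-03-01", "2019-07-15", "2020-02-10",
                     "2019-05-20", "2020-06-30", "2020-11-05"))
)

# Mean rating per employer and year.
yearly <- toy %>%
  mutate(year = year(date)) %>%
  group_by(com, year) %>%
  summarize(Rating = mean(Rating), .groups = "drop")

# One panel per employer, each with its own axis ranges.
p <- ggplot(yearly, aes(year, Rating, group = 1)) +
  geom_line() +
  facet_wrap(~com, scales = "free") +
  ggtitle("Mean rating per year (current and former employees)") +
  theme(text = element_text(size = 10))
print(p)
```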

8. Plot Employers Rated by Current Employees
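
The figure for this step is also missing from the text. One way to restrict the yearly averages to current employees is sketched below, assuming the CF01 factor uses the level "(current employee)" as in the summary of step 4. The data frame `toy_cf` is invented for illustration.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Toy stand-in for the cleaned data, including the CF01 status column.
toy_cf <- data.frame(
  com    = rep("accenture", 4),
  CF01   = factor(c("(current employee)", "(former employee)",
                    "(current employee)", "(current employee)")),
  Rating = c(5, 1, 3, 4),
  date   = as.Date(c("2019-03-01", "2019-07-15", "2020-02-10", "2020-09-01"))
)

# Keep current employees only, then average per employer and year.
yearly_current <- toy_cf %>%
  filter(CF01 == "(current employee)") %>%
  mutate(year = year(date)) %>%
  group_by(com, year) %>%
  summarize(Rating = mean(Rating), .groups = "drop")

p_current <- ggplot(yearly_current, aes(year, Rating, group = 1)) +
  geom_line() +
  facet_wrap(~com, scales = "free") +
  ggtitle("Mean rating per year (current employees)")
print(p_current)
```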

9. Select an employer of interest and check its good and bad reviews

In the plot above, the line for Accenture declines, which suggests that employees have become less satisfied with Accenture over time.

  • Positive Comments on Accenture
er %>% subset(com=='accenture' & Rating > 4.5,select=1:2) %>% as_tibble
  • Negative Comments on Accenture
er %>% subset(com=='accenture' & Rating < 3,select=1:2) %>% as_tibble
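
Reading a decline off a plot can be backed up with a simple numeric check. The sketch below fits a straight line to the yearly mean ratings of a single employer using base R. The data frame `toy_acc` contains invented values, and a linear fit is only one rough way to summarise a trend, not the method used in the original analysis.

```r
# Invented ratings for a single employer across three years.
toy_acc <- data.frame(
  Rating = c(5, 4, 4, 3, 3, 2),
  date   = as.Date(c("2018-01-10", "2018-06-01", "2019-02-11",
                     "2019-09-23", "2020-04-05", "2020-12-19"))
)
toy_acc$year <- as.integer(format(toy_acc$date, "%Y"))

# Mean rating per year, then a linear fit of rating against year.
yearly_acc <- aggregate(Rating ~ year, data = toy_acc, FUN = mean)
fit <- lm(Rating ~ year, data = yearly_acc)

# A negative slope means the yearly mean rating falls over time.
slope <- unname(coef(fit)["year"])
print(yearly_acc)
print(slope)
```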