We import the headcount_VRG_LATAM dataset, which contains information about the new agents hired by the company, and create the head_count object to make the dataset easier to access.
library(readxl)
head_count <- read_excel("C:/Users/ASUS/Downloads/Headcount_VRG_LATAM_1_.xlsx", sheet = "VRG Headcount")
head_count
## # A tibble: 181 × 19
## N Start_Date Tenure_Months Name Tittle Program_Supported
## <dbl> <dttm> <dbl> <chr> <chr> <chr>
## 1 1 2021-05-03 00:00:00 35.7 Manuel Alej… Recru… MP High Volume
## 2 2 2021-06-28 00:00:00 33.9 Valeria Enr… Recru… MP High Volume
## 3 3 2021-07-12 00:00:00 33.4 Cinthya Ste… Recru… MP High Volume
## 4 4 2021-08-25 00:00:00 31.9 Junior Alon… Recru… MP High Volume
## 5 5 2021-12-06 00:00:00 28.5 Grecia Hele… Recru… MP High Volume
## 6 6 2021-12-06 00:00:00 28.5 Claudia Ale… Recru… MP High Volume
## 7 7 2021-12-13 00:00:00 28.3 Sara Correa… Recru… MP Skilled Tech
## 8 8 2021-12-20 00:00:00 28.0 Iveet Allis… Recru… MP High Volume
## 9 9 2021-12-20 00:00:00 28.0 Juan José R… Recru… MP High Volume
## 10 10 2021-12-20 00:00:00 28.0 Karla Rocio… Recru… MP High Volume
## # ℹ 171 more rows
## # ℹ 13 more variables: Team_Lead <chr>, Email_Address <chr>, Extension <chr>,
## # `US ID` <chr>, `.com Email Address` <chr>, City <chr>, Country <chr>,
## # Month <dbl>, Year <dbl>, Country_ID <chr>, Phone_Number <dbl>, DOB <chr>,
## # Rate <dbl>
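read_excel() raises an error when the requested sheet name does not exist, so it can help to list the workbook's sheets before reading. The sketch below uses the example workbook that ships with readxl as a stand-in (an assumption for illustration, not the headcount file) so that it runs on any machine.

```r
library(readxl)

# Example workbook bundled with readxl; it stands in for the headcount file
example_path <- readxl_example("datasets.xlsx")

# List the sheet names before choosing one
sheets <- excel_sheets(example_path)
print(sheets)

# Read one sheet by name, in the same way as sheet = "VRG Headcount"
example_data <- read_excel(example_path, sheet = sheets[1])
print(dim(example_data))
```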
In the summary we find 19 variables with information about the newly hired agents. The variables analyzed here are the hired agent's role (the “Tittle” column), the team lead, the supported program, and the value of the assigned resources, recorded as “Rate”.
summary(head_count)
## N Start_Date Tenure_Months
## Min. : 1 Min. :2021-05-03 00:00:00.00 Min. : 0.200
## 1st Qu.: 46 1st Qu.:2023-05-09 00:00:00.00 1st Qu.: 4.470
## Median : 91 Median :2023-09-04 00:00:00.00 Median : 7.270
## Mean : 91 Mean :2023-07-16 08:53:02.31 Mean : 8.921
## 3rd Qu.:136 3rd Qu.:2023-11-27 00:00:00.00 3rd Qu.:11.200
## Max. :181 Max. :2024-04-03 00:00:00.00 Max. :35.730
##
## Name Tittle Program_Supported Team_Lead
## Length:181 Length:181 Length:181 Length:181
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## Email_Address Extension US ID .com Email Address
## Length:181 Length:181 Length:181 Length:181
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## City Country Month Year
## Length:181 Length:181 Min. : 1.000 Min. :2021
## Class :character Class :character 1st Qu.: 3.000 1st Qu.:2023
## Mode :character Mode :character Median : 7.000 Median :2023
## Mean : 6.133 Mean :2023
## 3rd Qu.: 9.000 3rd Qu.:2023
## Max. :12.000 Max. :2024
##
## Country_ID Phone_Number DOB Rate
## Length:181 Min. :3.003e+09 Length:181 Min. :1977
## Class :character 1st Qu.:3.103e+09 Class :character 1st Qu.:1977
## Mode :character Median :3.144e+09 Mode :character Median :1977
## Mean :3.130e+09 Mean :2064
## 3rd Qu.:3.179e+09 3rd Qu.:2089
## Max. :3.244e+09 Max. :2313
## NA's :144
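The summary reports 144 NA's in Rate out of 181 rows, so the Rate statistics rest on only 37 agents. One quick way to check for missing values in every column is to count them per column. The sketch below uses a small made-up data frame (toy_data is hypothetical) in place of head_count.

```r
# Hypothetical toy data standing in for head_count
toy_data <- data.frame(
  Tittle = c("Recruiter A", "Recruiter B", "Recruiter C", "Recruiter D"),
  Rate   = c(1977, NA, NA, 2313)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy_data))
print(na_counts)

# Share of rows with a missing Rate
na_share <- mean(is.na(toy_data$Rate))
print(na_share)
```

With the real data, the same two calls would be applied to head_count and head_count$Rate.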
We then use the subset function with its select argument to keep only the columns we consider most relevant for this analysis: N, Tenure_Months, Tittle, Program_Supported, Team_Lead, and Rate.
dataset <- subset(head_count, select= c(1,3,5:7,19))
dataset
## # A tibble: 181 × 6
## N Tenure_Months Tittle Program_Supported Team_Lead Rate
## <dbl> <dbl> <chr> <chr> <chr> <dbl>
## 1 1 35.7 Recruiter A MP High Volume Cinthya Oré Chávez 1977
## 2 2 33.9 Recruiter D MP High Volume Cinthya Oré Chávez 2313
## 3 3 33.4 Recruiter C MP High Volume Cinthya Oré Chávez 2201
## 4 4 31.9 Recruiter D MP High Volume Cinthya Oré Chávez 2313
## 5 5 28.5 Recruiter B MP High Volume Cinthya Oré Chávez 2089
## 6 6 28.5 Recruiter B MP High Volume Cinthya Oré Chávez 2089
## 7 7 28.3 Recruiter D MP Skilled Tech Nelly Gomez 2313
## 8 8 28.0 Recruiter C MP High Volume Cinthya Oré Chávez 2201
## 9 9 28.0 Recruiter D MP High Volume Cinthya Oré Chávez 2313
## 10 10 28.0 Recruiter B MP High Volume Cinthya Oré Chávez 2089
## # ℹ 171 more rows
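Selecting columns by position gives the wrong columns without any error if the spreadsheet's column order changes. A sketch of the same kind of selection by name, shown on a hypothetical two-row data frame (toy_head_count is made up and reuses only some of the column names above):

```r
# Hypothetical miniature version of head_count
toy_head_count <- data.frame(
  N = c(1, 2),
  Start_Date = as.Date(c("2021-05-03", "2021-06-28")),
  Tenure_Months = c(35.7, 33.9),
  Tittle = c("Recruiter A", "Recruiter D"),
  Rate = c(1977, 2313)
)

# Keep columns by name instead of by position
toy_dataset <- subset(toy_head_count, select = c(N, Tenure_Months, Tittle, Rate))
print(names(toy_dataset))
```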
Next we create several charts to visualize how the data behave.
First we create bar charts showing the number of agents by job title, by team lead, and by supported program.
library(ggplot2)
# Bar chart of the number of agents per job title
ggplot(dataset, aes(x = Tittle)) +
  geom_bar() +
  labs(x = "Job title", y = "Number of employees",
       title = "Number of agents by job title")
# Create the chart and rotate the x-axis labels
ggplot(dataset, aes(x = Team_Lead)) +
  geom_bar() + # geom_bar() is used rather than geom_histogram() to count frequencies
  labs(x = "Team Lead", y = "Number of employees", title = "Number of agents by Team Lead") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
# Bar chart of the number of agents per supported program
ggplot(dataset, aes(x = Program_Supported)) +
  geom_bar() +
  labs(x = "Program", y = "Number of employees",
       title = "Number of agents by program")
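The bar heights in these charts are plain category counts, so the same numbers can be checked in a frequency table. A minimal sketch on a hypothetical vector of program names (toy_programs is made up; the real values would come from dataset$Program_Supported):

```r
# Hypothetical program labels standing in for dataset$Program_Supported
toy_programs <- c("MP High Volume", "MP High Volume",
                  "MP Skilled Tech", "MP High Volume")

# Count agents per program and sort from most to least frequent
program_counts <- sort(table(toy_programs), decreasing = TRUE)
print(program_counts)
```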