The dataset consists of IT support tickets from a small liberal arts college in NYC. The tickets cover technical, computer, software, and access issues for students, faculty, and staff from the start of 2021 through Nov 4, 2022. This includes one full academic year, Aug 2021 - June 2022 (fall and spring). Columns that identify the college, its students, and its employees have been removed.
#Libraries
library(tidyverse)
library(lubridate)
library(psych)
library(ggplot2)
library(scales)
#Load the data
it_support_tix <- read.csv("https://raw.githubusercontent.com/johnnydrodriguez/data606project/main/IT_Tickets_2022_2021.csv", na.strings=c("","NA"))
#Converts character date column into dates
it_support_tix$resolved_at <- mdy_hm(it_support_tix$resolved_at)
it_support_tix$opened_at <- mdy_hm(it_support_tix$opened_at)
#Calculates the ticket age (date resolved - date opened)
it_support_tix <- it_support_tix %>%
  mutate(age_at_resolution_days = round(difftime(resolved_at, opened_at, units = "days"), digits = 2))
#To create summary statistics, the age_at_resolution is converted to numeric
it_support_tix$age_at_resolution_days <- as.numeric(it_support_tix$age_at_resolution_days)
glimpse(it_support_tix)
## Rows: 14,069
## Columns: 12
## $ number <chr> "INC0120526", "INC0120422", "INC0120775", "INC0…
## $ contact_type <chr> "Self-service", "Email", "Phone", "Email", "Ema…
## $ u_wait_reason <chr> NA, NA, NA, NA, NA, NA, "Waiting for Pickup", N…
## $ assignment_group <chr> "Service Desk", "Service Desk", "Service Desk",…
## $ closed_at <chr> "11/4/22 18:00", "11/4/22 18:00", NA, "11/4/22 …
## $ resolved_at <dttm> 2022-11-01 18:00:00, 2022-11-01 18:00:00, NA, …
## $ u_subcategory <chr> "Computer (Desktop/Laptop)", "ERP", "Desktop Ap…
## $ u_symptom <chr> "How To/Question", "Configure/Modify", "Securit…
## $ opened_at <dttm> 2022-10-21 17:03:00, 2022-10-18 10:52:00, 2022…
## $ sys_mod_count <int> 11, 9, 5, 7, 14, 14, 14, 4, 1, 3, 4, 4, 2, 12, …
## $ reassignment_count <int> 0, 2, 0, 1, 1, 2, 1, 1, 0, 0, 0, 0, 0, 1, 5, 0,…
## $ age_at_resolution_days <dbl> 11.04, 14.30, NA, 0.31, NA, NA, NA, 0.06, NA, 0…
Does the contact type (the method by which the user first initiates the support ticket) predict the age of the ticket at resolution?
Why this matters to IT operations managers: IT support principles typically favor resolving an issue on first contact and in the shortest possible time. IT operations managers therefore try to funnel requests into the contact channels that let IT analysts resolve issues as quickly as possible.
Each case represents an IT support incident, i.e., a user has been affected by a technical issue that needs to be resolved. There are 14,069 cases and 12 variables.
Each incident is either system-generated, when the support request is made through email or the self-service portal, or manually created by an IT analyst, when a user makes a support request by phone or walks into the support office.
This is an observational study.
The data is exported from an IT support database which stores data on each support interaction.
The dependent variable is the ticket age at resolution, in days (date resolved - date opened). This value is numeric.
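As a quick check of this definition, the sketch below recomputes the age for the first ticket shown in the glimpse() output (opened 2022-10-21 17:03, resolved 2022-11-01 18:00). It uses base R date parsing rather than lubridate's mdy_hm() so that it runs on its own; the input strings and the UTC time zone are assumptions made for illustration.

```r
# Timestamps written in the export's "m/d/yy H:M" style (assumed format),
# parsed in UTC so no daylight-saving shift enters the difference
opened   <- as.POSIXct("10/21/22 17:03", format = "%m/%d/%y %H:%M", tz = "UTC")
resolved <- as.POSIXct("11/1/22 18:00",  format = "%m/%d/%y %H:%M", tz = "UTC")

# Elapsed time in days, rounded to two decimals, then stored as a plain number
age_days <- as.numeric(round(difftime(resolved, opened, units = "days"), digits = 2))
age_days
```

The result is 11.04 days, which agrees with the first age_at_resolution_days value in the glimpse() output.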
The independent variable is the contact type, i.e., one of four methods a user can use to initiate a support request: email, phone, walk-in, or self-service. This value is categorical.
The distribution of the age at resolution is heavily skewed to the right (mean 10.53 days, median 1.19 days, skewness 5.98). Analyses that depend on a normal approximation may therefore not be appropriate.
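One way to see why the skew matters, and one possible way around it, is sketched below. The data are simulated for illustration only and are not the ticket export; the two channel labels, the lognormal parameters, and the choice of a Kruskal-Wallis rank-sum test are all assumptions, not the method this project commits to.

```r
# Simulated, illustrative data only (NOT the ticket export):
# right-skewed ages for two hypothetical contact channels
set.seed(606)
toy <- data.frame(
  contact_type = rep(c("Email", "Phone"), each = 200),
  age_days     = c(rlnorm(200, meanlog = 1, sdlog = 1.5),
                   rlnorm(200, meanlog = 0, sdlog = 1.5))
)

# With a long right tail, the mean sits well above the median in each group
group_means   <- tapply(toy$age_days, toy$contact_type, mean)
group_medians <- tapply(toy$age_days, toy$contact_type, median)
group_means
group_medians

# Kruskal-Wallis rank-sum test: compares the groups using ranks,
# so it does not rely on normally distributed ages
kw <- kruskal.test(age_days ~ contact_type, data = toy)
kw$p.value
```

Because the test works on ranks, a handful of very old tickets cannot dominate the comparison the way they dominate a difference in means.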
# Summary stats of the age at resolution
describe(it_support_tix$age_at_resolution_days)
## vars n mean sd median trimmed mad min max range skew kurtosis
## X1 1 13849 10.53 30.06 1.19 3.8 1.73 -3 438.21 441.21 5.98 45.47
## se
## X1 0.26
# More Summary stats of the age at resolution
summary(it_support_tix$age_at_resolution_days)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## -3.00 0.06 1.19 10.53 9.01 438.21 220
#Percentage of support tickets by contact type
prop.table(table(it_support_tix$contact_type, useNA='ifany')) * 100
##
## Email Phone Self-service Walk-in <NA>
## 59.734167 12.723008 22.105338 4.193617 1.243870
# Summary stats of age at resolution grouped by contact type
describeBy(it_support_tix$age_at_resolution_days,
group = it_support_tix$contact_type, mat=TRUE)
## item group1 vars n mean sd median trimmed mad
## X11 1 Email 1 8287 11.271945 31.41242 2.03 4.254765 2.965200
## X12 2 Phone 1 1775 5.052231 22.52696 0.04 1.204898 0.044478
## X13 3 Self-service 1 3038 13.044668 31.85594 2.91 5.162418 4.255062
## X14 4 Walk-in 1 576 5.262066 20.81400 0.04 1.097900 0.044478
## min max range skew kurtosis se
## X11 -3.00 438.21 441.21 6.030520 46.40055 0.3450665
## X12 -0.03 328.51 328.54 8.992980 94.96848 0.5346915
## X13 -2.99 328.96 331.95 4.649317 26.39643 0.5779583
## X14 -0.08 257.91 257.99 7.706789 72.17335 0.8672502
# Distribution of tickets by age at resolution
ggplot(it_support_tix, aes(x=age_at_resolution_days)) + geom_histogram(binwidth = 20)