The analysis of the given datasets which are as following is done on R.

1. Service Desk
2. Outage Notification
3. Change Request Form

We have included the code within the documentation within the document to let the technical users leverage it.First we will be analyzing the Service Desk dataset.

library(readxl)
ServiceDeskModified <- read_excel("D:/UIC/3rd Semester/Capstone/datafrommwrd/ServiceDeskModified.xlsx")
Sdm<-ServiceDeskModified

Function to know which column has categorical values so that it can be converted to factor variables.Have to use nrow , length does not work for a tibble

differentValues<-data.frame()
for(i in 1:ncol(Sdm)){
  differentValues[i,1]<-nrow(unique(Sdm[,i]))
}
colsFactor<-Sdm[!seq(from = 1,to = ncol(Sdm),by = 1) %in% which(differentValues>20)]
Sdm[,c(2:4,6,7,8)]<-lapply(colsFactor,FUN =factor)

Adding few columns after change in requirements

library(readxl)
ServiceDesk_Data <- read_excel("D:/UIC/MyResearch/Mwrd/mwrdAnalysis/ServiceDesk Data.xlsx", 
    col_types = c("numeric", "numeric", "text", 
        "text", "text", "text", "text", "text", 
        "text", "text", "date", "date", "text", 
        "text", "numeric", "numeric", "numeric", 
        "text", "text", "text", "text", "text", 
        "text", "text", "text", "text", "text", 
        "text", "text", "text", "text", "text"))
ServiceDesk_Data<-ServiceDesk_Data[,c(4,11,12,18)]
Sdm<-cbind(Sdm,ServiceDesk_Data)

Let’s take a look at the dataset and the variables after a bit of modification

Service Desk
ID STATE PRIORITY CATEGORY SUBCATEGORY IS_SERVICE_REQUEST SITE DEPARTMENT DESCRIPTION_NO_HTML CREATED_AT UPDATED_AT CREATED_BY
18622955 Assigned Medium IT Audio/Visual/Conference In-District Video Conference Setup TRUE Stickney Water Reclamation Plant Monitoring & Research The IT Department is happy to assist with your upcoming event. Services offered are: -Audio/Video Conference Bridge -Recoding A/V Live streaming (Board Room Only) -Projector -Setup and Breakdown of Laptops -Printer -Onsite IT Support Assistance Needed Please share the Room, Date/Time and description of your event. 2017-10-04 13:22:47 2017-10-04 14:05:43 OKAL, JOHN
18622953 Assigned None Core-Applications MWRD Website Requests TRUE MOB General Administration The ITD department will process website update requests submitted through the Service Desk. Please provide the following information to initiate the process. Name of QA Approver(s) - Please add as a CC to the request if they are in addition to yourself. Is this a Press Release ? Is this a New Page or URL ? URL path of the content to be updated Posting deadline or time details Who can see this content (Public, Internal, Both) New/Replacement content description Please include attachment files as needed using the “Attach Files” hyperlink 2017-10-04 13:22:43 2017-10-04 13:23:21 LARKIN, DIANE
18622457 New Medium ServiceDesk - Triage NA FALSE Northside/Obrien Plant Monitoring & Research The chip key for the pool vehicle 1704 is not functioning. Gets an error message like this “key not in the file system” Raju Antony Environmental Specialist MWRDGC|Industrial Waste Division O’Brien Water Reclamation Plant 3500 Howard St., Skokie, Illinois 600076-4013 p: 847-568-8237 | f: 847-568-8240 raju.antony@mwrd.org 2017-10-04 13:09:20 2017-10-04 13:09:20 ANTONY, RAJU
18622304 Resolved Medium ServiceDesk - Triage Re-Entered as a Service Request FALSE Stickney Water Reclamation Plant Monitoring & Research See Below. Marie Biron Adminstrative Specialist Environmental Monitoring and Research Division Monitoring and Research Department 6001 W Pershing Road Cicero, IL 60804 PH:708.588.4264 Email: marie.biron@mwrd.org From: Biron, Marie Sent: Wednesday, October 04, 2017 1:03 PM To: Help Desk (MOB) HelpD@mwrd.org Cc: Nwoko, Marcel NwokoM@mwrd.org; Weaver, John WeaverJ@mwrd.org Subject: Special M&R Seminar We have a special seminar taking place tomorrow in the following conference rooms: Egan Auditorium KWRP Admin Building Conference Room LASMA Auditorium NSWRP Admin Building Room 115 PCBX Conference Room (swrp) ALD Tech Services Conference Room (swrp) 7th Floor Corner Training Room MOBA The seminar is taking place from 8:45 am to 9:45 am. I will need assistance making sure the rooms are all connected. I have already notified the contacts for each room. Marie Biron Adminstrative Specialist Environmental Monitoring and Research Division Monitoring and Research Department 6001 W Pershing Road Cicero, IL 60804 PH:708.588.4264 Email: marie.biron@mwrd.org 2017-10-04 13:05:15 2017-10-04 13:23:58 BIRON, MARIE
18622045 Resolved Medium Security Modification Network Account Password Reset TRUE Stickney Water Reclamation Plant Maintenance & Operations The IT Service desk can assist you with unlocking and/or resetting your Network, SAP, ReadSoft, Portal or voicemail password. Please provide your Employees ID, network login username and a contact telephone number where you can be reached. For security reasons, the service desk will need to contact you on your phone number and request an additional security details before modifying your account. 2017-10-04 12:59:56 2017-10-04 13:02:33 OKAL, JOHN

Exploratory Analysis

How many tickets have been resolved

From the plot we see that majority of them are closed ? We don’t have the date embedded in the data source file else we would have calculated

What category are most of my tickets ?

From the plot we see that majority of them are closed ? We don’t have the date embedded in the data source file else we would have calculated

Which Plant are raising the most requests for tickets ?

From the plot we see that majority of them are closed ? We don’t have the date embedded in the data source file else we would have calculated

Which department is raising most of the tickets ?

From the plot we see that majority of them are closed

We want to see who has created most of the tickets

library(knitr)
Sdm$CREATED_BY<-as.factor(Sdm$CREATED_BY)
Most_Tickets_Created<-as.data.frame(head(sort(table(Sdm$CREATED_BY),decreasing = T),n=50))
kable(Most_Tickets_Created,caption = "Most_Tickets_Created")
Most_Tickets_Created
Var1 Freq
UKPONG, IMEH 2116
OKAL, JOHN 1899
JONES, RON 837
WRIGHT, SEMAJ 232
SMITH, CHRISTOPHER 219
LARKIN, DIANE 202
KODY, JOSEPH 137
YOUNG, KEVIN 104
REICHLING, JOHN 95
RODGERS, GERALD 85
JEFFERSON, DEJIAH 82
NWOKO, MARCEL 58
ROBERTS, LASHUN 55
CHAPPELL, YVONNE 53
MCENERY, THOMAS 52
FOY, CASSANDRA 48
MALONE, LARRY 48
SANCHEZ, VICTOR 41
JOHNSTON, SAM 40
MIKA, KEVIN 31
SIA, JOEL 28
KELLY, SEAN 26
DIZON, ARTURO 25
KANE, PATRICK 25
ECKLUND, SCOTT 24
RAHMAN, ABDUR 23
LOPATKA, MARK 20
MCCAGUE, RAYMOND 18
THOMPSON, RONALD 17
KAZMIERCZAK, KIM 16
MONDLOCK, ROGER 16
KHALID, ARSHAD 15
MUSCARELLO, THOMAS 15
WOJKOVICH, RICHARD 15
CARTER, HUGH 13
DOMINA, JOSHUA 12
EBERHART, ERIC 11
JACKSON, CHENTILE 11
WARREN, ETHAN 11
COLVIN, CYNTHIA 10
SMITH, ROGER 10
CHENG, ERVIN 9
SENESE, SERGIO 9
BONNER, ROXANNE 8
KONKAPAKA, SHARMA 8
SCHRAMM, MICHAEL 8
SONTA, BIRUTE 8
LABAN, AHMAD 7
PATLOGAN, ILA 7
WRIGHT, SHARON 7

Going a bit deeper into Exploratory analysis

Since our concerned departments are MOB and MOBA we are filtering data and then analyzing by department

library(dplyr)
g4<-ggplot(data=(Sdm %>% filter(SITE %in% c("MOBA","MOB"))),aes(x=SITE,Y=DEPARTMENT))+
geom_bar(aes(x=SITE,fill=DEPARTMENT))

plot(g4)

We see that most of the tickets raised are for Information Technology department in MOB and Engineering department in MOBA

Let’s us see the priority of tickets in these two departments

library(dplyr)
g4<-ggplot(data=(Sdm %>% filter(SITE %in% c("MOBA","MOB"))),aes(x=SITE,Y=PRIORITY))+
geom_bar(aes(x=SITE,fill=PRIORITY))

plot(g4)

In both the departments tickets generated are of priority medium

Text Mining on the Description column

The approach is to to count the frequency of words that matter and remove those words out which are of common English and to remove those words which have zero information content ( from our perspective and client’s perspective )

Step 1 : Collapse all the descriptions into a paragraph Step 2 : Replace punctuations such as ,_:/ etc. Step 3 : Repace all the digits Step 3 : Replace single letter words Step 4 : Replace stopwords words such as a,and,of,the Step 5 : Delete additional words in a retrospective manner Step 6 : Make a textbag of words and after that create a frequency table

##### Text Mining
text<-paste(Sdm$DESCRIPTION_NO_HTML,collapse = "")
write(text,"description.txt")
#punctuation replacement
text2<-gsub(pattern = "\\W",replacement = " ",text)
# digits replacement
text2<-gsub(pattern="\\d",replacement = "",text2)
text2<-gsub(pattern = "\\b[A-z]\\b{1}",replacement = "",text2)

library(tm)
# words removed with no information
text2<-removeWords(text2,stopwords())
library(stringr)
textbagWithoutCleaning<-str_split(text2,pattern="\\s+")
textbagWithoutCleaning<-unlist(textbagWithoutCleaning)
write.csv(head(sort(table(textbagWithoutCleaning),decreasing = T),n=300),file = "withoutcleaning.csv")
text2<-removeWords(text2,words=c("mwrd","Please","org","User","can","will","The","Thanks","user","From","Thank","To","reached","get","MWRD","see","th","like","This","Can","able","If","We","For","know","use","It","Hi","need","needed","needs","please","Help","My","now","trying","used","help","also","via"))
textbag<-str_split(text2,pattern="\\s+")
textbag<-unlist(textbag)
write.csv(head(sort(table(textbag),decreasing = T),n=300),file = "withcleaning.csv")

We will be creating a wordcloud as well as the frequencies of the word

Creating a wordcloud

library(RColorBrewer)
library(wordcloud)
wordcloud(textbag,min.freq = 250,random.order = FALSE,colors =brewer.pal(8,"Dark2"),scale = c(4,0.5))

Correlations

Change Request Form

library(readr)
Change_Request_Form<- read_csv("D:/UIC/3rd Semester/Capstone/datafrommwrd/Change Request Form Modified.csv")

A Look at the Dataset

kable(Change_Request_Form[1:5,],caption = "Change Request Form")
Change Request Form
Status Indicator Create Date Maintenance Date Responsible IT Area Impact Category Purpose Group Emergency / Time Senstive
NA 10/5/2017 13:52 10/19/2017 GIS Minor Preventive NA
NA 10/5/2017 11:42 10/19/2017 Network Infrastructure Major Enhancement NA
NA 10/5/2017 11:38 10/18/2017 Network Infrastructure Major Enhancement NA
NA 9/29/2017 10:36 10/11/2017 Desktop Engineering (Citrix, SCCM) Minor Enhancement NA
Follow-up (See Status Notes) 9/27/2017 10:53 9/28/2017 Desktop Engineering (Citrix, SCCM) Minor Fix / Patch NA

We see that our data frame has dates.But while importing from a CSV it is read as a character.To convert this column for meaningful purposes we will convert it regex in R.Now we will extract the time and add a column for time

library("dplyr")
Change_Request_Form<-mutate(.data = Change_Request_Form,CreatedTime=gsub(pattern ="[0-9]{1,}/[0-9]{1,}/[0-9]{1,}" ,replacement ="" ,x = Change_Request_Form$`Create Date`))

We will pull the dates now.

library("dplyr")
Change_Request_Form<-mutate(.data = Change_Request_Form,CreatedDate=gsub(pattern ="([0-9]{1,}/[0-9]{1,}/[0-9]{1,}).*" ,replacement ="\\1" ,x = Change_Request_Form$`Create Date`))

We need to delete the existing column for Date time and create proper formatting for Date and Time

Change_Request_Form$CreatedDate<-as.Date(x = Change_Request_Form$CreatedDate,format = "%m/%d/%Y")
Change_Request_Form$CreatedTime<-trimws(Change_Request_Form$CreatedTime,which="left")
library("lubridate")
Change_Request_Form$CreatedTime<-hm(Change_Request_Form$CreatedTime)
Change_Request_Form$`Maintenance Date`<-as.Date(x = Change_Request_Form$`Maintenance Date`,format = "%m/%d/%Y")
Change_Request_Form<-Change_Request_Form[,c(1,8,9,3,4,5,6,7)]

We will be creating a difference in dates which is the most important parameter. This will explain us in how many days the response team responded

Change_Request_Form<-mutate(.data = Change_Request_Form,DiffDays= difftime(time1 = Change_Request_Form$`Maintenance Date`,time2 = Change_Request_Form$CreatedDate,units = "days"))

We can get the descriptive statistics for DiffDays

summary(as.numeric(Change_Request_Form$DiffDays))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   -5.00    1.00    9.00   16.14   17.00  262.00
DiffDays<-as.data.frame(as.numeric(Change_Request_Form$DiffDays))

We can see that Min=-5 and Max=262.The mean is 16 days and Median is 9 days. We would like to know how many tickets are solved within a given time duration. We will be creating bins for different tickets

  • Bins
  • less than 3-days
  • less than 10-days but greater than 3
  • 10 or more days
D<-data.frame(IntervalDays=c("(-7)-0","0-7","7-14","14-292"),MaintainFreq=c("13","92","59","70"))
View(D)
g5<-ggplot(data=D)+
geom_col(aes(x=IntervalDays,y=MaintainFreq))
plot(g5)

kable(D[1:4,],caption = "Maintainenance")
Maintainenance
IntervalDays MaintainFreq
(-7)-0 13
0-7 92
7-14 59
14-292 70

We see that most of the tickets are resolved in more than 11 days.The ideal time should be decided by the organization but we have decided a time of 0-3 days as ideal time. We have observed that some tickets have been resolved even before the Create Date

g6<-ggplot(data = Change_Request_Form)+
geom_bar(aes(x=Change_Request_Form$`Responsible IT Area`,fill=Change_Request_Form$`Purpose Group`))+
theme(axis.text.x =element_text(angle = 45, vjust = 0.5 , hjust = 0.5))  

plot(g6)

Analysis of Outage Notification Form

library(readr)
Onm<- read_csv("D:/UIC/MyResearch/Mwrd/mwrdAnalysis/Outage Notification Modified.csv", 
    col_types = cols(`Create Date` = col_datetime(format = "%m/%d/%Y %H:%M"), 
        `Outage End Date` = col_date(format = "%m/%d/%Y"), 
        `Outage Start Date` = col_date(format = "%m/%d/%Y")))

A look at the dataset

kable(Onm[1:5,],caption = "Outage Notification Form")
Outage Notification Form
Create Date Status Indicator Outage Start Date Outage End Date Description of Outage IT Area
2017-10-03 11:06:00 Follow-up 2017-10-02 2017-10-03 Automatic interpretation and transfer to Verify, automatic transfer to Readsoft Process Director not working. ERP / SAP
2017-10-03 10:47:00 Follow-up 2017-09-28 2017-09-28 End users called when none of their newly scanned invoices were getting into SAP. Windows logs show this error:

Faulting application name: eiitrp.exe, version: 5.8.16168.110, time stamp: 0x5762601c Faulting module name: MSVCR120.dll, version: 12.0.21005.1, time stamp: 0x524f7ce6 Exception code: 0xc0000409 Fault offset: 0x000a7666 Faulting process id: 0x8544 Faulting application start time: 0x01d33889d465efdb Faulting application path: E:Files (x86).exe Faulting module path: C:32120.dll Report Id: 12edd495-a47d-11e7-80d4-005056843318 Faulting package full name: Faulting package-relative application ID:

This error would repeat every time the transfer app would run.

Determined a large document in image import might have initially caused the problem; this DLL might have been removed or altered in memory to prevent further damage and was not able to be re-intialized in this session of windows. ERP / SAP
2017-10-03 09:32:00 Reviewed 2017-09-30 2017-10-02 After removing index.html and index.htm from the 2 portal servers; MWRDNEPP1 and MWRDNEPP2, the District’s website to the outside was up and down depending on which server the user was connected to. ERP / SAP
2017-10-02 10:46:00 Reviewed 2017-10-02 2017-10-02 Between 8:48am and 9:52am users could not use the Local Sewers application. (LSS) Multi-Team
2017-09-29 16:03:00 Reviewed 2017-09-29 2017-09-29 During the AT&T DDoS readiness test, public access to mwrd.org was blocked during mitigation of the simulated attack. Network team was told by the mitigation team the blocking of access to the IP address had been cleared, which was not true. Network Infrastructure

How long it takes to close the ticket ?

library("knitr")
library("dplyr")
Onm<-mutate(.data = Onm,OnmDiffDays=difftime(time1 = Onm$`Outage End Date`,time2 = Onm$`Outage Start Date`,units = "days"))
summary(as.numeric(Onm$OnmDiffDays))

Min. 1st Qu. Median Mean 3rd Qu. Max. NA’s 0.0000 0.0000 0.0000 0.7651 1.0000 30.0000 2

setNames(as.data.frame(table(as.numeric(Onm$OnmDiffDays))),c("Outage Lasted","Frequency"))

Outage Lasted Frequency 1 0 121 2 1 25 3 2 7 4 3 3 5 4 4 6 5 2 7 6 1 8 7 1 9 10 1 10 30 1

Which IT area the tickets belong to ?

g7<-ggplot(data = Onm)+
geom_bar(aes(Onm$`IT Area`,fill=Onm$`IT Area`))+
theme(axis.text.x = element_blank())
plot(g7)

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.