The analysis of the given datasets which are as following is done on R.

1. Service Desk

2. Outage Notification

3. Change Request Form

We have included the code within the documentation within the document to let the technical users leverage it.First we will be analyzing the Service Desk dataset.

library(readxl)
ServiceDeskModified <- read_excel("D:/UIC/3rd Semester/Capstone/datafrommwrd/ServiceDeskModified.xlsx")
Sdm<-ServiceDeskModified

Function to know which column has categorical values so that it can be converted to factor variables.Have to use nrow , length does not work for a tibble

differentValues<-data.frame()
for(i in 1:ncol(Sdm)){
  differentValues[i,1]<-nrow(unique(Sdm[,i]))
}
colsFactor<-Sdm[!seq(from = 1,to = ncol(Sdm),by = 1) %in% which(differentValues>20)]
Sdm[,c(2:4,6,7,8)]<-lapply(colsFactor,FUN =factor)

Adding few columns after change in requirements

library(readxl)
ServiceDesk_Data <- read_excel("D:/UIC/MyResearch/Mwrd/mwrdAnalysis/ServiceDesk Data.xlsx", 
    col_types = c("numeric", "numeric", "text", 
        "text", "text", "text", "text", "text", 
        "text", "text", "date", "date", "text", 
        "text", "numeric", "numeric", "numeric", 
        "text", "text", "text", "text", "text", 
        "text", "text", "text", "text", "text", 
        "text", "text", "text", "text", "text"))
ServiceDesk_Data<-ServiceDesk_Data[,c(4,11,12,18)]
Sdm<-cbind(Sdm,ServiceDesk_Data)

Let’s take a look at the dataset and the variables after a bit of modification

Service Desk
ID	STATE	PRIORITY	CATEGORY	SUBCATEGORY	IS_SERVICE_REQUEST	SITE	DEPARTMENT	DESCRIPTION_NO_HTML	CREATED_AT	UPDATED_AT	CREATED_BY
18622955	Assigned	Medium	IT Audio/Visual/Conference	In-District Video Conference Setup	TRUE	Stickney Water Reclamation Plant	Monitoring & Research	The IT Department is happy to assist with your upcoming event. Services offered are: -Audio/Video Conference Bridge -Recoding A/V Live streaming (Board Room Only) -Projector -Setup and Breakdown of Laptops -Printer -Onsite IT Support Assistance Needed Please share the Room, Date/Time and description of your event.	2017-10-04 13:22:47	2017-10-04 14:05:43	OKAL, JOHN
18622953	Assigned	None	Core-Applications	MWRD Website Requests	TRUE	MOB	General Administration	The ITD department will process website update requests submitted through the Service Desk. Please provide the following information to initiate the process. Name of QA Approver(s) - Please add as a CC to the request if they are in addition to yourself. Is this a Press Release ? Is this a New Page or URL ? URL path of the content to be updated Posting deadline or time details Who can see this content (Public, Internal, Both) New/Replacement content description Please include attachment files as needed using the “Attach Files” hyperlink	2017-10-04 13:22:43	2017-10-04 13:23:21	LARKIN, DIANE
18622457	New	Medium	ServiceDesk - Triage	NA	FALSE	Northside/Obrien Plant	Monitoring & Research	The chip key for the pool vehicle 1704 is not functioning. Gets an error message like this “key not in the file system” Raju Antony Environmental Specialist MWRDGC\|Industrial Waste Division O’Brien Water Reclamation Plant 3500 Howard St., Skokie, Illinois 600076-4013 p: 847-568-8237 \| f: 847-568-8240 raju.antony@mwrd.org	2017-10-04 13:09:20	2017-10-04 13:09:20	ANTONY, RAJU
18622304	Resolved	Medium	ServiceDesk - Triage	Re-Entered as a Service Request	FALSE	Stickney Water Reclamation Plant	Monitoring & Research	See Below. Marie Biron Adminstrative Specialist Environmental Monitoring and Research Division Monitoring and Research Department 6001 W Pershing Road Cicero, IL 60804 PH:708.588.4264 Email: marie.biron@mwrd.org From: Biron, Marie Sent: Wednesday, October 04, 2017 1:03 PM To: Help Desk (MOB) HelpD@mwrd.org Cc: Nwoko, Marcel NwokoM@mwrd.org; Weaver, John WeaverJ@mwrd.org Subject: Special M&R Seminar We have a special seminar taking place tomorrow in the following conference rooms: Egan Auditorium KWRP Admin Building Conference Room LASMA Auditorium NSWRP Admin Building Room 115 PCBX Conference Room (swrp) ALD Tech Services Conference Room (swrp) 7th Floor Corner Training Room MOBA The seminar is taking place from 8:45 am to 9:45 am. I will need assistance making sure the rooms are all connected. I have already notified the contacts for each room. Marie Biron Adminstrative Specialist Environmental Monitoring and Research Division Monitoring and Research Department 6001 W Pershing Road Cicero, IL 60804 PH:708.588.4264 Email: marie.biron@mwrd.org	2017-10-04 13:05:15	2017-10-04 13:23:58	BIRON, MARIE
18622045	Resolved	Medium	Security Modification	Network Account Password Reset	TRUE	Stickney Water Reclamation Plant	Maintenance & Operations	The IT Service desk can assist you with unlocking and/or resetting your Network, SAP, ReadSoft, Portal or voicemail password. Please provide your Employees ID, network login username and a contact telephone number where you can be reached. For security reasons, the service desk will need to contact you on your phone number and request an additional security details before modifying your account.	2017-10-04 12:59:56	2017-10-04 13:02:33	OKAL, JOHN

Exploratory Analysis

How many tickets have been resolved

From the plot we see that majority of them are closed ? We don’t have the date embedded in the data source file else we would have calculated

What category are most of my tickets ?

From the plot we see that majority of them are closed ? We don’t have the date embedded in the data source file else we would have calculated

Which Plant are raising the most requests for tickets ?

From the plot we see that majority of them are closed ? We don’t have the date embedded in the data source file else we would have calculated

Which department is raising most of the tickets ?

From the plot we see that majority of them are closed

We want to see who has created most of the tickets

library(knitr)
Sdm$CREATED_BY<-as.factor(Sdm$CREATED_BY)
Most_Tickets_Created<-as.data.frame(head(sort(table(Sdm$CREATED_BY),decreasing = T),n=50))
kable(Most_Tickets_Created,caption = "Most_Tickets_Created")

Most_Tickets_Created
Var1	Freq
UKPONG, IMEH	2116
OKAL, JOHN	1899
JONES, RON	837
WRIGHT, SEMAJ	232
SMITH, CHRISTOPHER	219
LARKIN, DIANE	202
KODY, JOSEPH	137
YOUNG, KEVIN	104
REICHLING, JOHN	95
RODGERS, GERALD	85
JEFFERSON, DEJIAH	82
NWOKO, MARCEL	58
ROBERTS, LASHUN	55
CHAPPELL, YVONNE	53
MCENERY, THOMAS	52
FOY, CASSANDRA	48
MALONE, LARRY	48
SANCHEZ, VICTOR	41
JOHNSTON, SAM	40
MIKA, KEVIN	31
SIA, JOEL	28
KELLY, SEAN	26
DIZON, ARTURO	25
KANE, PATRICK	25
ECKLUND, SCOTT	24
RAHMAN, ABDUR	23
LOPATKA, MARK	20
MCCAGUE, RAYMOND	18
THOMPSON, RONALD	17
KAZMIERCZAK, KIM	16
MONDLOCK, ROGER	16
KHALID, ARSHAD	15
MUSCARELLO, THOMAS	15
WOJKOVICH, RICHARD	15
CARTER, HUGH	13
DOMINA, JOSHUA	12
EBERHART, ERIC	11
JACKSON, CHENTILE	11
WARREN, ETHAN	11
COLVIN, CYNTHIA	10
SMITH, ROGER	10
CHENG, ERVIN	9
SENESE, SERGIO	9
BONNER, ROXANNE	8
KONKAPAKA, SHARMA	8
SCHRAMM, MICHAEL	8
SONTA, BIRUTE	8
LABAN, AHMAD	7
PATLOGAN, ILA	7
WRIGHT, SHARON	7

Going a bit deeper into Exploratory analysis

Since our concerned departments are MOB and MOBA we are filtering data and then analyzing by department

library(dplyr)
g4<-ggplot(data=(Sdm %>% filter(SITE %in% c("MOBA","MOB"))),aes(x=SITE,Y=DEPARTMENT))+
geom_bar(aes(x=SITE,fill=DEPARTMENT))

plot(g4)

We see that most of the tickets raised are for Information Technology department in MOB and Engineering department in MOBA

Let’s us see the priority of tickets in these two departments

library(dplyr)
g4<-ggplot(data=(Sdm %>% filter(SITE %in% c("MOBA","MOB"))),aes(x=SITE,Y=PRIORITY))+
geom_bar(aes(x=SITE,fill=PRIORITY))

plot(g4)

In both the departments tickets generated are of priority medium

Text Mining on the Description column

The approach is to to count the frequency of words that matter and remove those words out which are of common English and to remove those words which have zero information content ( from our perspective and client’s perspective )

Step 1 : Collapse all the descriptions into a paragraph Step 2 : Replace punctuations such as ,_:/ etc. Step 3 : Repace all the digits Step 3 : Replace single letter words Step 4 : Replace stopwords words such as a,and,of,the Step 5 : Delete additional words in a retrospective manner Step 6 : Make a textbag of words and after that create a frequency table

##### Text Mining
text<-paste(Sdm$DESCRIPTION_NO_HTML,collapse = "")
write(text,"description.txt")
#punctuation replacement
text2<-gsub(pattern = "\\W",replacement = " ",text)
# digits replacement
text2<-gsub(pattern="\\d",replacement = "",text2)
text2<-gsub(pattern = "\\b[A-z]\\b{1}",replacement = "",text2)

library(tm)
# words removed with no information
text2<-removeWords(text2,stopwords())
library(stringr)
textbagWithoutCleaning<-str_split(text2,pattern="\\s+")
textbagWithoutCleaning<-unlist(textbagWithoutCleaning)
write.csv(head(sort(table(textbagWithoutCleaning),decreasing = T),n=300),file = "withoutcleaning.csv")
text2<-removeWords(text2,words=c("mwrd","Please","org","User","can","will","The","Thanks","user","From","Thank","To","reached","get","MWRD","see","th","like","This","Can","able","If","We","For","know","use","It","Hi","need","needed","needs","please","Help","My","now","trying","used","help","also","via"))
textbag<-str_split(text2,pattern="\\s+")
textbag<-unlist(textbag)
write.csv(head(sort(table(textbag),decreasing = T),n=300),file = "withcleaning.csv")

We will be creating a wordcloud as well as the frequencies of the word

Creating a wordcloud

library(RColorBrewer)
library(wordcloud)
wordcloud(textbag,min.freq = 250,random.order = FALSE,colors =brewer.pal(8,"Dark2"),scale = c(4,0.5))

Correlations

Change Request Form

library(readr)
Change_Request_Form<- read_csv("D:/UIC/3rd Semester/Capstone/datafrommwrd/Change Request Form Modified.csv")

A Look at the Dataset

kable(Change_Request_Form[1:5,],caption = "Change Request Form")

Change Request Form
Status Indicator	Create Date	Maintenance Date	Responsible IT Area	Impact Category	Purpose Group	Emergency / Time Senstive
NA	10/5/2017 13:52	10/19/2017	GIS	Minor	Preventive	NA
NA	10/5/2017 11:42	10/19/2017	Network Infrastructure	Major	Enhancement	NA
NA	10/5/2017 11:38	10/18/2017	Network Infrastructure	Major	Enhancement	NA
NA	9/29/2017 10:36	10/11/2017	Desktop Engineering (Citrix, SCCM)	Minor	Enhancement	NA
Follow-up (See Status Notes)	9/27/2017 10:53	9/28/2017	Desktop Engineering (Citrix, SCCM)	Minor	Fix / Patch	NA

We see that our data frame has dates.But while importing from a CSV it is read as a character.To convert this column for meaningful purposes we will convert it regex in R.Now we will extract the time and add a column for time

library("dplyr")
Change_Request_Form<-mutate(.data = Change_Request_Form,CreatedTime=gsub(pattern ="[0-9]{1,}/[0-9]{1,}/[0-9]{1,}" ,replacement ="" ,x = Change_Request_Form$`Create Date`))

We will pull the dates now.

library("dplyr")
Change_Request_Form<-mutate(.data = Change_Request_Form,CreatedDate=gsub(pattern ="([0-9]{1,}/[0-9]{1,}/[0-9]{1,}).*" ,replacement ="\\1" ,x = Change_Request_Form$`Create Date`))

We need to delete the existing column for Date time and create proper formatting for Date and Time

Change_Request_Form$CreatedDate<-as.Date(x = Change_Request_Form$CreatedDate,format = "%m/%d/%Y")
Change_Request_Form$CreatedTime<-trimws(Change_Request_Form$CreatedTime,which="left")
library("lubridate")
Change_Request_Form$CreatedTime<-hm(Change_Request_Form$CreatedTime)
Change_Request_Form$`Maintenance Date`<-as.Date(x = Change_Request_Form$`Maintenance Date`,format = "%m/%d/%Y")
Change_Request_Form<-Change_Request_Form[,c(1,8,9,3,4,5,6,7)]

We will be creating a difference in dates which is the most important parameter. This will explain us in how many days the response team responded

Change_Request_Form<-mutate(.data = Change_Request_Form,DiffDays= difftime(time1 = Change_Request_Form$`Maintenance Date`,time2 = Change_Request_Form$CreatedDate,units = "days"))

We can get the descriptive statistics for DiffDays

summary(as.numeric(Change_Request_Form$DiffDays))

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   -5.00    1.00    9.00   16.14   17.00  262.00

DiffDays<-as.data.frame(as.numeric(Change_Request_Form$DiffDays))

We can see that Min=-5 and Max=262.The mean is 16 days and Median is 9 days. We would like to know how many tickets are solved within a given time duration. We will be creating bins for different tickets

Bins
less than 3-days
less than 10-days but greater than 3
10 or more days

D<-data.frame(IntervalDays=c("(-7)-0","0-7","7-14","14-292"),MaintainFreq=c("13","92","59","70"))
View(D)
g5<-ggplot(data=D)+
geom_col(aes(x=IntervalDays,y=MaintainFreq))
plot(g5)

kable(D[1:4,],caption = "Maintainenance")

Maintainenance
IntervalDays	MaintainFreq
(-7)-0	13
0-7	92
7-14	59
14-292	70

We see that most of the tickets are resolved in more than 11 days.The ideal time should be decided by the organization but we have decided a time of 0-3 days as ideal time. We have observed that some tickets have been resolved even before the Create Date

g6<-ggplot(data = Change_Request_Form)+
geom_bar(aes(x=Change_Request_Form$`Responsible IT Area`,fill=Change_Request_Form$`Purpose Group`))+
theme(axis.text.x =element_text(angle = 45, vjust = 0.5 , hjust = 0.5))  

plot(g6)

Analysis of Outage Notification Form

library(readr)
Onm<- read_csv("D:/UIC/MyResearch/Mwrd/mwrdAnalysis/Outage Notification Modified.csv", 
    col_types = cols(`Create Date` = col_datetime(format = "%m/%d/%Y %H:%M"), 
        `Outage End Date` = col_date(format = "%m/%d/%Y"), 
        `Outage Start Date` = col_date(format = "%m/%d/%Y")))

A look at the dataset

kable(Onm[1:5,],caption = "Outage Notification Form")

Outage Notification Form
Create Date	Status Indicator	Outage Start Date	Outage End Date	Description of Outage	IT Area
2017-10-03 11:06:00	Follow-up	2017-10-02	2017-10-03	Automatic interpretation and transfer to Verify, automatic transfer to Readsoft Process Director not working.	ERP / SAP
2017-10-03 10:47:00	Follow-up	2017-09-28	2017-09-28	End users called when none of their newly scanned invoices were getting into SAP. Windows logs show this error:

Faulting application name: eiitrp.exe, version: 5.8.16168.110, time stamp: 0x5762601c Faulting module name: MSVCR120.dll, version: 12.0.21005.1, time stamp: 0x524f7ce6 Exception code: 0xc0000409 Fault offset: 0x000a7666 Faulting process id: 0x8544 Faulting application start time: 0x01d33889d465efdb Faulting application path: E:Files (x86).exe Faulting module path: C:32120.dll Report Id: 12edd495-a47d-11e7-80d4-005056843318 Faulting package full name: Faulting package-relative application ID:

This error would repeat every time the transfer app would run.

Determined a large document in image import might have initially caused the problem; this DLL might have been removed or altered in memory to prevent further damage and was not able to be re-intialized in this session of windows. ERP / SAP
2017-10-03 09:32:00 Reviewed 2017-09-30 2017-10-02 After removing index.html and index.htm from the 2 portal servers; MWRDNEPP1 and MWRDNEPP2, the District’s website to the outside was up and down depending on which server the user was connected to. ERP / SAP
2017-10-02 10:46:00 Reviewed 2017-10-02 2017-10-02 Between 8:48am and 9:52am users could not use the Local Sewers application. (LSS) Multi-Team
2017-09-29 16:03:00 Reviewed 2017-09-29 2017-09-29 During the AT&T DDoS readiness test, public access to mwrd.org was blocked during mitigation of the simulated attack. Network team was told by the mitigation team the blocking of access to the IP address had been cleared, which was not true. Network Infrastructure

How long it takes to close the ticket ?

library("knitr")
library("dplyr")
Onm<-mutate(.data = Onm,OnmDiffDays=difftime(time1 = Onm$`Outage End Date`,time2 = Onm$`Outage Start Date`,units = "days"))
summary(as.numeric(Onm$OnmDiffDays))

Min. 1st Qu. Median Mean 3rd Qu. Max. NA’s 0.0000 0.0000 0.0000 0.7651 1.0000 30.0000 2

setNames(as.data.frame(table(as.numeric(Onm$OnmDiffDays))),c("Outage Lasted","Frequency"))

Outage Lasted Frequency 1 0 121 2 1 25 3 2 7 4 3 3 5 4 4 6 5 2 7 6 1 8 7 1 9 10 1 10 30 1

Which IT area the tickets belong to ?

g7<-ggplot(data = Onm)+
geom_bar(aes(Onm$`IT Area`,fill=Onm$`IT Area`))+
theme(axis.text.x = element_blank())
plot(g7)

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.

MWRDAnalysis

UIC

1. Service Desk

2. Outage Notification

3. Change Request Form

Exploratory Analysis

Going a bit deeper into Exploratory analysis

Text Mining on the Description column

Correlations

Change Request Form

A Look at the Dataset

Analysis of Outage Notification Form