Here we take an example of a business that uses many machines to build their final products. As the supply chain is stopped every-time a machine break, the manager asked a consulting firm to build a predictive model that finds which machine is going to break next and why.
The solution will predict which machine will break next and propose some reasons about why these machines have different lifetimes. A survival analysis is a good choice as we can visualise the machine’s lifetime very easily.
The business has data on each machine for the last pasts years and they are able to provide some more information such as the team that used it and its provider. These information seems sufficient to start an analysis.
# Load our dataset
maintenance = read.table('maintenance_data.csv',sep=';',header=TRUE)
# Set the variable's types
maintenance$lifetime <- as.numeric(maintenance$lifetime)
headmaintenance <- head(maintenance, n=20)
library(DT)
## Warning: package 'DT' was built under R version 3.5.2
datatable(headmaintenance, options = list(pageLength = 5))
As we summarise the data, we can see that the business have used 1000 machines. Machine have an average lifetime of 55 weeks, with some brand new machines and others that are running since almost two years. In our dataset almost 40 % of the machines have being broken in the past two years.
# Librairy
library(ggplot2)
library(tidyr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggvis)
## Warning: package 'ggvis' was built under R version 3.5.2
##
## Attaching package: 'ggvis'
## The following object is masked from 'package:ggplot2':
##
## resolution
library(corrplot)
## Warning: package 'corrplot' was built under R version 3.5.2
## corrplot 0.84 loaded
# display the summary
summary(maintenance)
## lifetime broken pressureInd moistureInd
## Min. : 1.0 Min. :0.000 Min. : 33.48 Min. : 58.55
## 1st Qu.:34.0 1st Qu.:0.000 1st Qu.: 85.56 1st Qu.: 92.77
## Median :60.0 Median :0.000 Median : 97.22 Median : 99.43
## Mean :55.2 Mean :0.397 Mean : 98.60 Mean : 99.38
## 3rd Qu.:80.0 3rd Qu.:1.000 3rd Qu.:112.25 3rd Qu.:106.12
## Max. :93.0 Max. :1.000 Max. :173.28 Max. :128.60
## temperatureInd team provider
## Min. : 42.28 TeamA:336 Provider1:254
## 1st Qu.: 87.68 TeamB:356 Provider2:266
## Median :100.59 TeamC:308 Provider3:242
## Mean :100.63 Provider4:238
## 3rd Qu.:113.66
## Max. :172.54
The first graph show the statistical distribution of a machine’s lifetime. We can see that on average machines break after 80 weeks, but they can break from within 60 to 100 weeks. The second and third graphs show that for team C or provider 3 their machines tend to break several weeks before the others. Below we present the correlation between all the others variables that aren’t factors.
## When do machine breack on average?
maintenance_broken <- maintenance %>% filter(broken == 1)
par(mfrow=c(1,3))
boxplot(lifetime~broken,data=maintenance_broken, main="Borken machines", xlab="", ylab="Lifetime",col="#357EC7")
boxplot(lifetime~team,data=maintenance_broken, main="Per team", xlab="", ylab="",col="#357EC7")
boxplot(lifetime~provider,data=maintenance_broken, main="Per provider", xlab="", ylab="",col="#357EC7")
# Correlation matrix
cor(maintenance_cor <- maintenance %>% select(lifetime:temperatureInd))
## lifetime broken pressureInd moistureInd
## lifetime 1.0000000000 0.70265621 0.0009427588 0.001196246
## broken 0.7026562129 1.00000000 -0.0289424521 -0.019520365
## pressureInd 0.0009427588 -0.02894245 1.0000000000 0.020542548
## moistureInd 0.0011962464 -0.01952037 0.0205425477 1.000000000
## temperatureInd 0.0017444443 0.01536360 0.0036412524 -0.009842375
## temperatureInd
## lifetime 0.001744444
## broken 0.015363604
## pressureInd 0.003641252
## moistureInd -0.009842375
## temperatureInd 1.000000000
As we have seen above, all variables tell us something about when a machine will break. We decide to use them all into our survival analysis model. We create a model using the gaussian method and use directly the maintenance dataset presented above. You can find here the coefficients for all our variables and just below see the survival plot and the summary of our survival model.
library(survival)
## Warning: package 'survival' was built under R version 3.5.2
# Choose the dependant variables to be used in the survival regression model.
dependantvars = Surv(maintenance$lifetime, maintenance$broken)
# Create model (use the gaussian method)
survreg = survreg(dependantvars~pressureInd+moistureInd+temperatureInd+team+provider, dist="gaussian",data=maintenance)
print(survreg)
## Call:
## survreg(formula = dependantvars ~ pressureInd + moistureInd +
## temperatureInd + team + provider, data = maintenance, dist = "gaussian")
##
## Coefficients:
## (Intercept) pressureInd moistureInd temperatureInd
## 8.035095e+01 -7.139623e-04 6.012531e-03 -1.042198e-02
## teamTeamB teamTeamC providerProvider2 providerProvider3
## -5.669818e-02 -6.217791e+00 1.249442e+01 -1.438417e+01
## providerProvider4
## 7.919537e+00
##
## Scale= 0.4755612
##
## Loglik(model)= -270.1 Loglik(intercept only)= -1557
## Chisq= 2573.75 on 8 degrees of freedom, p= <2e-16
## n= 1000
library(GGally)
## Warning: package 'GGally' was built under R version 3.5.2
##
## Attaching package: 'GGally'
## The following object is masked from 'package:dplyr':
##
## nasa
maintenance_graph <- survfit(Surv(lifetime,broken) ~ 1, data = maintenance)
ggsurv(maintenance_graph)
## Warning: Removed 1 rows containing missing values (geom_path).
## Warning: Removed 1 rows containing missing values (geom_path).
maintenance_graph2 <- survfit(Surv(lifetime,broken) ~ team, data = maintenance)
ggsurv(maintenance_graph2)
maintenance_graph3 <- survfit(Surv(lifetime,broken) ~ provider, data = maintenance)
ggsurv(maintenance_graph3)
summary(survreg)
##
## Call:
## survreg(formula = dependantvars ~ pressureInd + moistureInd +
## temperatureInd + team + provider, data = maintenance, dist = "gaussian")
## Value Std. Error z p
## (Intercept) 8.04e+01 2.94e-01 273.57 <2e-16
## pressureInd -7.14e-04 1.22e-03 -0.59 0.557
## moistureInd 6.01e-03 2.40e-03 2.51 0.012
## temperatureInd -1.04e-02 1.21e-03 -8.59 <2e-16
## teamTeamB -5.67e-02 5.88e-02 -0.96 0.335
## teamTeamC -6.22e+00 6.13e-02 -101.39 <2e-16
## providerProvider2 1.25e+01 6.67e-02 187.46 <2e-16
## providerProvider3 -1.44e+01 6.27e-02 -229.24 <2e-16
## providerProvider4 7.92e+00 7.06e-02 112.23 <2e-16
## Log(scale) -7.43e-01 3.54e-02 -21.00 <2e-16
##
## Scale= 0.476
##
## Gaussian distribution
## Loglik(model)= -270.1 Loglik(intercept only)= -1557
## Chisq= 2573.75 on 8 degrees of freedom, p= 0
## Number of Newton-Raphson Iterations: 12
## n= 1000
Using our survival analysis, we can now predict which machine will break next and therefore prioritise the maintenance on these machines to avoid the supply chain to stop. Here we display the 20 machines that should be changed this month and we order them by their remaining lifetime.
# Predict
# p = percentile = 0,5 = expected median
Ebreak=predict(survreg, newdata=maintenance, type="quantile", p=.5)
# Make forecast
Forecast=data.frame(Ebreak)
Forecast$lifetime=maintenance$lifetime
Forecast$broken=maintenance$broken
# Computed Expected Remaining Lifetime (remainingLT)
Forecast$RemainingLT=Forecast$Ebreak-maintenance$lifetime
# Order the elements by Expected Remaining Lifetime
Forecast=Forecast[order(Forecast$RemainingLT),]
# Keep only those who are not broken yet
ActionsPriority=Forecast[Forecast$broken==0,]
ActionsPriorityDT <- head(ActionsPriority, n=20)
datatable(ActionsPriorityDT)
The first action should be to change the machines with a remaining lifetime inferior to one week, and re-run our program every few days to change the next machine that will break. Here we have a data table that will be automatically emailed to the manager every week. Also, we attach the ID of the machine that has to be changed.
ActionsPriority$class <- cut(ActionsPriority$RemainingLT, c(-10,1,4,1000))
levels(ActionsPriority$class) <- c('Urgent', 'Medium', 'good')
summary(ActionsPriority)
## Ebreak lifetime broken RemainingLT
## Min. :58.87 Min. : 1.00 Min. :0 Min. :-0.1807
## 1st Qu.:73.68 1st Qu.:21.00 1st Qu.:0 1st Qu.:21.8005
## Median :81.67 Median :40.00 Median :0 Median :39.4834
## Mean :80.62 Mean :40.11 Mean :0 Mean :40.5067
## 3rd Qu.:87.84 3rd Qu.:57.50 3rd Qu.:0 3rd Qu.:58.2953
## Max. :93.05 Max. :90.00 Max. :0 Max. :91.2483
## class
## Urgent: 7
## Medium: 21
## good :575
##
##
##
The second action is to show these figures to team C and provider 3 and monitor the improvement. Now the management could focus on the enhancement of the machines’ lifetimes and see what are the best methods. Of course we should interview the workers as we could have new insights on how we could push the lifetime of each machine further.