Report Content
Executive Summary
The effectiveness of condition-based maintenance relies on the quality of the predictive methods. Beyond that, understanding how each variable relates to the failure makes it possible to intervene earlier and more effectively, reducing the cost of unplanned shutdowns.
Machine learning techniques give a clear view of each variable and help establish the right moment to start planning the intervention.
“The data was obtained from a numerical simulator of a naval vessel (Frigate) characterized by a Gas Turbine (GT) propulsion plant. The different blocks forming the complete simulator (Propeller, Hull, GT, Gear Box and Controller) have been developed and fine tuned over the year on several similar real propulsion plants. In view of these observations the available data are in agreement with a possible real vessel.”
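A series of measures (16 variables) that indirectly represent the state of the system subject to performance decay has been acquired and stored in the dataset over the parameter space. These variables, together with the two decay coefficients, are listed in the output of str(turbine) in the Data Analysis section.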
Conclusions
The results show that it is possible to predict when maintenance planning should start from the combined values of the variables. They also identify the main variables that should be carefully monitored: a) the rate of revolutions of the generator, b) the injection control of the turbine, and c) the exhaust gas pressure of the turbine.
Note: The CART (Classification and Regression Tree) method used in this project did not achieve high accuracy: on the testing set the model proposes the right decision about 78% of the time for the Compressor decay and 79% for the Turbine decay. A Random Forest could probably improve on this. CART was chosen for its clearer graphical representation.
Data Analysis
Loading the libraries necessary to run the code
library(dplyr)
library(caret)
library(rattle)
library(ggplot2)
library(rpart)
library(rpart.plot)
library(knitr)
Loading the files necessary to develop the project
turbine <- read.csv("J:/turbinea.csv", sep=";", dec=",")
turbine<- select(turbine, -(19:20))
str(turbine)
## 'data.frame': 11934 obs. of 18 variables:
## $ Lever_position : num 1.14 2.09 3.14 4.16 5.14 6.18 7.15 8.21 9.3 1.14 ...
## $ Ship_speed : num 3 6 9 12 15 18 21 24 27 3 ...
## $ Gas_turbine_shaft_torque : num 290 6960 8380 14700 21600 29800 39000 51000 72800 380 ...
## $ Turb_revolutions_rate : num 1350 1380 1390 1550 1920 2310 2680 3090 3560 1360 ...
## $ Generator_revolutions_rate : num 6680 6830 7110 7790 8490 8830 9130 9320 9780 6680 ...
## $ Starboard_Propeller_torque : num 7.58 28.2 60.4 114 175 246 332 438 645 7.92 ...
## $ Port_propeller_torque : num 7.58 28.2 60.4 114 175 246 332 438 645 7.92 ...
## $ Turb_exit_temp_HP : num 464 635 606 661 731 800 855 952 1120 464 ...
## $ Gas_turbine_compressor_inlet_air_temp_T1 : num 288 288 288 288 288 288 288 288 288 288 ...
## $ Gas_turbine_compressor_outlet_air_temp_T2: num 551 582 588 614 646 676 700 742 789 551 ...
## $ High_pressure_turbine_exit_pressure : num 1.1 1.33 1.39 1.66 2.08 2.5 2.96 3.58 4.5 1.1 ...
## $ Gas_turbine_compressor_inlet_air_temp_P1 : num 0.998 0.998 0.998 0.998 0.998 0.998 0.998 0.998 0.998 0.998 ...
## $ Gas_turbine_compressor_outlet_air_temp_P2: num 5.95 7.28 7.57 9.01 11.2 13.4 15.7 18.6 22.8 5.96 ...
## $ Turbine_exhaust_gas_pressure : num 1.02 1.02 1.02 1.02 1.03 1.03 1.04 1.04 1.05 1.02 ...
## $ Turbine_injection_control : num 7.14 10.7 13.1 18.1 26.4 35.8 45.9 62.4 92.6 3.88 ...
## $ Fuel_flow : num 0.082 0.287 0.259 0.358 0.522 0.708 0.908 1.24 1.83 0.079 ...
## $ Compressor_decay : num 0.95 0.95 0.95 0.95 0.95 0.95 0.95 0.95 0.95 0.95 ...
## $ Turbine_decay : num 0.975 0.975 0.975 0.975 0.975 0.975 0.975 0.975 0.975 0.976 ...
# Label each observation by comparing its decay coefficient with the column mean
turbine <- mutate(turbine,
  V19 = ifelse(Compressor_decay >= mean(Compressor_decay), "Compressor_E", "Compressor_Non"),
  V20 = ifelse(Turbine_decay >= mean(Turbine_decay), "Turbine_E", "Turbine_Non"))
Defining the columns V19 and V20 as factors
turbine$V19<- as.factor(turbine$V19)
turbine$V20<- as.factor(turbine$V20)
turbine<- select(turbine, -Turbine_decay,-Compressor_decay, -Lever_position )
Eliminating variables with missing values
turbine <- turbine[, colSums(is.na(turbine)) ==0]
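The filter above keeps only the columns that contain no missing value. A minimal sketch of the same idiom on a made-up three-column data frame (the values and the `toy` name are illustrative assumptions, not taken from the dataset):

```r
# Toy data frame: only `torque` contains a missing value
toy <- data.frame(
  speed     = c(3, 6, 9),
  torque    = c(290, NA, 8380),
  fuel_flow = c(0.08, 0.29, 0.26)
)

# Count missing values per column
na_counts <- colSums(is.na(toy))

# Keep the columns with zero missing values; drop = FALSE keeps a data frame
# even if a single column survives
complete_cols <- toy[, na_counts == 0, drop = FALSE]
names(complete_cols)
```

Note that this drops a whole column as soon as it has one missing value, which is only reasonable when missing values are concentrated in a few columns.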
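Eliminating near-zero-variance predictors
# Compute the metrics on the 15 predictors only (columns 16:17 are V19 and V20)
nzv <- nearZeroVar(turbine[, -(16:17)], saveMetrics=TRUE)
# Keep the predictors that are not near-zero-variance, plus the two label columns
turbine <- turbine[, c(!nzv$nzv, TRUE, TRUE)]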
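To make the near-zero-variance criterion concrete, here is a rough base-R sketch of the rule, assuming caret's documented defaults (a frequency ratio above 95/5 combined with at most 10% unique values, or a single unique value). The function `nzv_sketch` and the toy columns are hypothetical and are not part of caret:

```r
# Simplified stand-in for the near-zero-variance rule (not caret's own code)
nzv_sketch <- function(x, freq_cut = 95 / 5, unique_cut = 10) {
  counts <- sort(table(x), decreasing = TRUE)
  zero_var <- length(counts) <= 1                       # constant column
  freq_ratio <- if (zero_var) 0 else counts[[1]] / counts[[2]]
  pct_unique <- 100 * length(counts) / length(x)
  zero_var || (freq_ratio > freq_cut && pct_unique <= unique_cut)
}

toy <- data.frame(
  inlet_temp = rep(288, 100),                           # constant, like T1 in str(turbine)
  speed      = rep(seq(3, 27, by = 3), length.out = 100),
  torque     = seq(1, 100)
)

# TRUE marks a column that the rule would discard
flags <- sapply(toy, nzv_sketch)
flags
```

In the str(turbine) output, the two compressor inlet columns show the same value in every displayed row, which is the kind of predictor this step is meant to discard.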
Eliminating collinearity
corrM <- cor(turbine[,-(14:15)])
high <- findCorrelation( corrM, cutoff =.95)
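# Drop the flagged columns; the guard avoids dropping every column when nothing is flagged
if (length(high) > 0) turbine <- turbine[, -high]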
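The collinearity filter can be illustrated in base R. The sketch below is a simplified stand-in for findCorrelation: it drops the second member of every pair whose absolute correlation exceeds the cutoff, whereas caret's function uses its own rule to decide which member of a pair to remove. The toy data and all names are assumptions made for illustration:

```r
set.seed(1)
x <- rnorm(200)
toy <- data.frame(
  shaft_torque     = x,
  propeller_torque = 2 * x + rnorm(200, sd = 0.01),  # almost a rescaled copy
  exhaust_pressure = rnorm(200)                      # unrelated to the others
)

# Absolute correlations, each pair counted once (upper triangle only)
corr <- abs(cor(toy))
corr[lower.tri(corr, diag = TRUE)] <- 0

# Column indices to drop: the second variable of every pair above the cutoff
drop_idx <- unique(which(corr > 0.95, arr.ind = TRUE)[, "col"])

# Guard against an empty index: negative indexing with integer(0) selects
# no columns at all instead of keeping everything
kept <- if (length(drop_idx) > 0) toy[, -drop_idx, drop = FALSE] else toy
names(kept)

# The pitfall the guard protects against: this data frame has zero columns
empty <- toy[, -integer(0), drop = FALSE]
ncol(empty)
```

The guard matters whenever no pair exceeds the cutoff, because the index vector is then empty.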
Splitting the data into training (70%) and testing (30%) sets to validate the model
set.seed(1000)
inTrain<- createDataPartition(turbine$V19, p=0.7, list=FALSE)
training <- turbine[inTrain,]
testing <- turbine[-inTrain,]
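For a factor outcome, createDataPartition samples within each class so that the class proportions are roughly preserved in both sets. A minimal base-R sketch of that idea follows. It is an approximation written for illustration, not caret's implementation, and the label vector is invented:

```r
set.seed(1000)

# Invented labels: 60 observations of one class, 40 of the other
labels <- factor(rep(c("Compressor_E", "Compressor_Non"), times = c(60, 40)))

# Row numbers grouped by class
rows_by_class <- split(seq_along(labels), labels)

# Draw 70% of the rows inside each class, then merge the draws
in_train <- sort(unlist(lapply(rows_by_class, function(rows) {
  rows[sample.int(length(rows), size = round(0.7 * length(rows)))]
}), use.names = FALSE))

train_labels <- labels[in_train]
test_labels  <- labels[-in_train]

# Both sets keep the 60/40 class balance
table(train_labels)
table(test_labels)
```

Sampling by position with sample.int avoids the surprise that sample(x, ...) treats a single number x as the range 1:x.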
modFit_C<- train(V19~., method="rpart", data= training)
modFit_C
## CART
##
## 8354 samples
## 4 predictor
## 2 classes: 'Compressor_E', 'Compressor_Non'
##
## No pre-processing
## Resampling: Bootstrapped (25 reps)
##
## Summary of sample sizes: 8354, 8354, 8354, 8354, 8354, 8354, ...
##
## Resampling results across tuning parameters:
##
## cp Accuracy Kappa Accuracy SD Kappa SD
## 0.02002442 0.8091380 0.61721279 0.05165803 0.10388136
## 0.03137973 0.6157775 0.22028919 0.09855196 0.20281364
## 0.08595849 0.5287422 0.03973594 0.01962790 0.04232297
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was cp = 0.02002442.
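One point worth checking in this setup: `training` still holds both label columns, so the formula `V19~.` also offers V20 as a candidate predictor (and `V20~.` offers V19). The sketch below shows one way to build formulas restricted to the sensor variables. The toy data frame reuses a few values from the str(turbine) output, but its labels are invented, and the names `form_C` and `form_T` are hypothetical:

```r
toy_training <- data.frame(
  Generator_revolutions_rate = c(6680, 6830, 7110, 7790),
  Turbine_injection_control  = c(7.14, 10.7, 13.1, 18.1),
  V19 = factor(c("Compressor_E", "Compressor_Non", "Compressor_E", "Compressor_Non")),
  V20 = factor(c("Turbine_E", "Turbine_E", "Turbine_Non", "Turbine_Non"))
)

# With the dot, every other column becomes a predictor, including V20
dot_terms <- attr(terms(V19 ~ ., data = toy_training), "term.labels")
dot_terms

# Explicit formulas that leave both label columns out of the predictors
predictors <- setdiff(names(toy_training), c("V19", "V20"))
form_C <- reformulate(predictors, response = "V19")
form_T <- reformulate(predictors, response = "V20")
form_C
form_T
```

Such a formula could be passed to train() in place of `V19~.`. The results reported in this document were obtained with the dot formula and would need to be recomputed.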
print(modFit_C$finalModel)
## n= 8354
##
## node), split, n, loss, yval, (yprob)
## * denotes terminal node
##
## 1) root 8354 4095 Compressor_E (0.50981566 0.49018434)
## 2) Generator_revolutions_rate< 9765 7976 3730 Compressor_E (0.53234704 0.46765296)
## 4) Generator_revolutions_rate>=9525 553 89 Compressor_E (0.83905967 0.16094033) *
## 5) Generator_revolutions_rate< 9525 7423 3641 Compressor_E (0.50949751 0.49050249)
## 10) Generator_revolutions_rate< 9315 7166 3384 Compressor_E (0.52777003 0.47222997)
## 20) Generator_revolutions_rate< 6635 240 14 Compressor_E (0.94166667 0.05833333) *
## 21) Generator_revolutions_rate>=6635 6926 3370 Compressor_E (0.51342766 0.48657234)
## 42) Turbine_injection_control< 12.25 1483 547 Compressor_E (0.63115307 0.36884693) *
## 43) Turbine_injection_control>=12.25 5443 2620 Compressor_Non (0.48135220 0.51864780)
## 86) Generator_revolutions_rate>=7425 4411 2040 Compressor_E (0.53751984 0.46248016)
## 172) Generator_revolutions_rate< 7745 353 6 Compressor_E (0.98300283 0.01699717) *
## 173) Generator_revolutions_rate>=7745 4058 2024 Compressor_Non (0.49876787 0.50123213)
## 346) Generator_revolutions_rate>=8145 3466 1576 Compressor_E (0.54529717 0.45470283)
## 692) Generator_revolutions_rate< 8475 321 9 Compressor_E (0.97196262 0.02803738) *
## 693) Generator_revolutions_rate>=8475 3145 1567 Compressor_E (0.50174881 0.49825119)
## 1386) Generator_revolutions_rate>=8645 2538 1107 Compressor_E (0.56382979 0.43617021)
## 2772) Generator_revolutions_rate< 8815 500 57 Compressor_E (0.88600000 0.11400000) *
## 2773) Generator_revolutions_rate>=8815 2038 988 Compressor_Non (0.48478901 0.51521099)
## 5546) Generator_revolutions_rate>=8980 1579 649 Compressor_E (0.58898037 0.41101963)
## 11092) Generator_revolutions_rate< 9125 408 32 Compressor_E (0.92156863 0.07843137) *
## 11093) Generator_revolutions_rate>=9125 1171 554 Compressor_Non (0.47309991 0.52690009)
## 22186) Generator_revolutions_rate>=9220 655 199 Compressor_E (0.69618321 0.30381679) *
## 22187) Generator_revolutions_rate< 9220 516 98 Compressor_Non (0.18992248 0.81007752) *
## 5547) Generator_revolutions_rate< 8980 459 58 Compressor_Non (0.12636166 0.87363834) *
## 1387) Generator_revolutions_rate< 8645 607 147 Compressor_Non (0.24217463 0.75782537) *
## 347) Generator_revolutions_rate< 8145 592 134 Compressor_Non (0.22635135 0.77364865) *
## 87) Generator_revolutions_rate< 7425 1032 249 Compressor_Non (0.24127907 0.75872093) *
## 11) Generator_revolutions_rate>=9315 257 0 Compressor_Non (0.00000000 1.00000000) *
## 3) Generator_revolutions_rate>=9765 378 13 Compressor_Non (0.03439153 0.96560847) *
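Each line of the tree printout gives the number of observations in the node (n), the number that do not belong to the predicted class (loss), the predicted class (yval), and the class proportions (yprob). As a worked example, the proportions of node 4 above can be recovered from its n and loss:

```r
# Node 4: Generator_revolutions_rate>=9525, n = 553, loss = 89, yval = Compressor_E
n    <- 553
loss <- 89

# Share of the node that is NOT Compressor_E (second yprob value)
p_other <- loss / n

# Share of the node that is Compressor_E (first yprob value)
p_predicted <- 1 - p_other

round(c(Compressor_E = p_predicted, Compressor_Non = p_other), 8)
```

A terminal node (marked with *) with a small loss relative to n, such as node 11 with a loss of 0 out of 257, is one where the rule is almost always right on the training data.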
predictions_C <- predict(modFit_C, newdata=testing)
confusionMatrix(predictions_C, testing$V19)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Compressor_E Compressor_Non
## Compressor_E 1481 421
## Compressor_Non 344 1334
##
## Accuracy : 0.7863
## 95% CI : (0.7725, 0.7996)
## No Information Rate : 0.5098
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.5721
## Mcnemar's Test P-Value : 0.006
##
## Sensitivity : 0.8115
## Specificity : 0.7601
## Pos Pred Value : 0.7787
## Neg Pred Value : 0.7950
## Prevalence : 0.5098
## Detection Rate : 0.4137
## Detection Prevalence : 0.5313
## Balanced Accuracy : 0.7858
##
## 'Positive' Class : Compressor_E
##
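The main statistics in this output follow directly from the four counts of the confusion matrix. The sketch below recomputes them in base R from the counts printed above, with Compressor_E as the positive class (the variable names are illustrative):

```r
# Rows are predictions, columns are the reference (true) classes
cm <- matrix(c(1481, 344, 421, 1334), nrow = 2,
             dimnames = list(Prediction = c("Compressor_E", "Compressor_Non"),
                             Reference  = c("Compressor_E", "Compressor_Non")))

# Share of all test observations that were classified correctly
accuracy <- sum(diag(cm)) / sum(cm)

# Share of true Compressor_E cases that were predicted as Compressor_E
sensitivity <- cm[1, 1] / sum(cm[, 1])

# Share of true Compressor_Non cases that were predicted as Compressor_Non
specificity <- cm[2, 2] / sum(cm[, 2])

# Cohen's kappa: agreement beyond what the marginal totals would give by chance
expected    <- sum(rowSums(cm) * colSums(cm)) / sum(cm)^2
cohen_kappa <- (accuracy - expected) / (1 - expected)

round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, kappa = cohen_kappa), 4)
```

The kappa of about 0.57 shows that, once chance agreement is removed, the tree captures a bit more than half of the attainable agreement, which is consistent with the moderate accuracy noted in the Conclusions.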
cart <- rpart(V19~., data=training, method="class")
prp(cart)
Varimpcompr <- varImp(modFit_C, scale=TRUE)
plot(Varimpcompr, main="Critical variables for the Compressor Decay")
Using the CART method (through the caret package) for the analysis of the Turbine Decay
modFit_T<- train(V20~., method="rpart", data= training)
modFit_T
## CART
##
## 8354 samples
## 4 predictor
## 2 classes: 'Turbine_E', 'Turbine_Non'
##
## No pre-processing
## Resampling: Bootstrapped (25 reps)
##
## Summary of sample sizes: 8354, 8354, 8354, 8354, 8354, 8354, ...
##
## Resampling results across tuning parameters:
##
## cp Accuracy Kappa Accuracy SD Kappa SD
## 0.01539572 0.7780961 0.5559191 0.01825474 0.03668805
## 0.01551600 0.7775923 0.5549126 0.01820551 0.03659322
## 0.02612461 0.6178924 0.2395829 0.11630961 0.22862241
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was cp = 0.01539572.
print(modFit_T$finalModel)
## n= 8354
##
## node), split, n, loss, yval, (yprob)
## * denotes terminal node
##
## 1) root 8354 4157 Turbine_Non (0.49760594 0.50239406)
## 2) Turbine_injection_control< 90.65 8191 4050 Turbine_E (0.50555488 0.49444512)
## 4) Generator_revolutions_rate>=9765 249 26 Turbine_E (0.89558233 0.10441767) *
## 5) Generator_revolutions_rate< 9765 7942 3918 Turbine_Non (0.49332662 0.50667338)
## 10) Turbine_injection_control< 88.65 7709 3819 Turbine_E (0.50460501 0.49539499)
## 20) Generator_revolutions_rate>=9745 142 5 Turbine_E (0.96478873 0.03521127) *
## 21) Generator_revolutions_rate< 9745 7567 3753 Turbine_Non (0.49596934 0.50403066)
## 42) Turbine_injection_control< 60.45 7040 3413 Turbine_E (0.51519886 0.48480114)
## 84) Generator_revolutions_rate>=9305 288 16 Turbine_E (0.94444444 0.05555556) *
## 85) Generator_revolutions_rate< 9305 6752 3355 Turbine_Non (0.49688981 0.50311019)
## 170) Turbine_injection_control< 44.15 6136 2970 Turbine_E (0.51597132 0.48402868)
## 340) Generator_revolutions_rate>=9125 270 7 Turbine_E (0.97407407 0.02592593) *
## 341) Generator_revolutions_rate< 9125 5866 2903 Turbine_Non (0.49488578 0.50511422)
## 682) Turbine_injection_control< 34.35 5182 2474 Turbine_E (0.52257816 0.47742184)
## 1364) Generator_revolutions_rate>=8805 343 24 Turbine_E (0.93002915 0.06997085) *
## 1365) Generator_revolutions_rate< 8805 4839 2389 Turbine_Non (0.49369704 0.50630296)
## 2730) Turbine_injection_control< 25.15 4064 1935 Turbine_E (0.52386811 0.47613189)
## 5460) Generator_revolutions_rate>=8465 397 42 Turbine_E (0.89420655 0.10579345) *
## 5461) Generator_revolutions_rate< 8465 3667 1774 Turbine_Non (0.48377420 0.51622580)
## 10922) Turbine_injection_control< 17.15 2723 1238 Turbine_E (0.54535439 0.45464561)
## 21844) Generator_revolutions_rate>=7725 384 39 Turbine_E (0.89843750 0.10156250) *
## 21845) Generator_revolutions_rate< 7725 2339 1140 Turbine_Non (0.48738777 0.51261223)
## 43690) Turbine_injection_control< 12.35 1674 718 Turbine_E (0.57108722 0.42891278)
## 87380) Generator_revolutions_rate>=6805 810 197 Turbine_E (0.75679012 0.24320988) *
## 87381) Generator_revolutions_rate< 6805 864 343 Turbine_Non (0.39699074 0.60300926) *
## 43691) Turbine_injection_control>=12.35 665 184 Turbine_Non (0.27669173 0.72330827) *
## 10923) Turbine_injection_control>=17.15 944 289 Turbine_Non (0.30614407 0.69385593) *
## 2731) Turbine_injection_control>=25.15 775 260 Turbine_Non (0.33548387 0.66451613) *
## 683) Turbine_injection_control>=34.35 684 195 Turbine_Non (0.28508772 0.71491228)
## 1366) Generator_revolutions_rate>=8835 320 143 Turbine_E (0.55312500 0.44687500)
## 2732) Turbine_injection_control< 43.35 171 21 Turbine_E (0.87719298 0.12280702) *
## 2733) Turbine_injection_control>=43.35 149 27 Turbine_Non (0.18120805 0.81879195) *
## 1367) Generator_revolutions_rate< 8835 364 18 Turbine_Non (0.04945055 0.95054945) *
## 171) Turbine_injection_control>=44.15 616 189 Turbine_Non (0.30681818 0.69318182)
## 342) Generator_revolutions_rate>=9135 310 137 Turbine_E (0.55806452 0.44193548)
## 684) Turbine_injection_control< 59.45 179 25 Turbine_E (0.86033520 0.13966480) *
## 685) Turbine_injection_control>=59.45 131 19 Turbine_Non (0.14503817 0.85496183) *
## 343) Generator_revolutions_rate< 9135 306 16 Turbine_Non (0.05228758 0.94771242) *
## 43) Turbine_injection_control>=60.45 527 126 Turbine_Non (0.23908918 0.76091082) *
## 11) Turbine_injection_control>=88.65 233 28 Turbine_Non (0.12017167 0.87982833) *
## 3) Turbine_injection_control>=90.65 163 16 Turbine_Non (0.09815951 0.90184049) *
predictions_T <- predict(modFit_T, newdata=testing)
confusionMatrix(predictions_T, testing$V20)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Turbine_E Turbine_Non
## Turbine_E 1224 157
## Turbine_Non 586 1613
##
## Accuracy : 0.7925
## 95% CI : (0.7788, 0.8056)
## No Information Rate : 0.5056
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.586
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.6762
## Specificity : 0.9113
## Pos Pred Value : 0.8863
## Neg Pred Value : 0.7335
## Prevalence : 0.5056
## Detection Rate : 0.3419
## Detection Prevalence : 0.3858
## Balanced Accuracy : 0.7938
##
## 'Positive' Class : Turbine_E
##
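The very small McNemar p-value reflects how unbalanced the two kinds of error are for the turbine model: 586 Turbine_E cases were predicted as Turbine_Non, against 157 errors in the opposite direction. A short sketch reproduces the test from the printed counts, assuming the reported value comes from the standard McNemar test with continuity correction (the default of `mcnemar.test` in base R):

```r
# Rows are predictions, columns are the reference (true) classes
cm_t <- matrix(c(1224, 586, 157, 1613), nrow = 2,
               dimnames = list(Prediction = c("Turbine_E", "Turbine_Non"),
                               Reference  = c("Turbine_E", "Turbine_Non")))

# Only the two off-diagonal cells (the two kinds of error) enter the statistic
res_t <- mcnemar.test(cm_t)
res_t$statistic
res_t$p.value

# Same test on the compressor counts, where the two error types are closer
cm_c <- matrix(c(1481, 344, 421, 1334), nrow = 2)
res_c <- mcnemar.test(cm_c)
res_c$p.value
```

In practical terms, the turbine tree misses about a third of the Turbine_E cases (sensitivity 0.6762) while rarely mislabelling Turbine_Non cases (specificity 0.9113), so the overall accuracy hides a clear bias toward predicting Turbine_Non.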
cart <- rpart(V20~., data=training, method="class")
prp(cart)
Varimpturb <- varImp(modFit_T, scale=TRUE)
plot(Varimpturb, main="Critical variables for the Turbine Decay")
Note from the author: This activity has a purely didactic purpose, aiming to improve the use of machine learning techniques.
Source of the data: http://archive.ics.uci.edu/ml/datasets/Condition+Based+Maintenance+of+Naval+Propulsion+Plants
July/28/2015