The first step is to read the dataset:
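# Set the working directory to the project folder (machine-specific path).
setwd('/home/jb/git/PMO-HTK')
# Read the semicolon-separated file, skipping its first line; keep text as character.
dd <- read.csv(file = "PMO-HKT.csv", sep = ";", stringsAsFactors = FALSE, skip = 1, header = TRUE)
# Keep a copy of the first row in ut, then drop that row from dd.
ut <- dd[1, ]
dd <- dd[-1, ]
# Columns 4 to 9 use a decimal comma: replace it with a point and convert to numeric.
for (i in 4:9) {
  dd[, i] <- as.numeric(gsub(',', '.', dd[, i]))
}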
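The conversion applied to columns 4 to 9 can be illustrated on a small vector. This is only a sketch: the values in `raw` are made up for illustration and are not taken from the dataset.

```r
# Made-up strings with decimal commas, as they might appear in a
# semicolon-separated export (illustrative values, not from PMO-HKT.csv).
raw <- c("12,5", "0,75", "100")

# Replace the decimal comma with a point, then convert to numeric.
# fixed = TRUE treats "," as a literal character rather than a regex.
clean <- as.numeric(gsub(",", ".", raw, fixed = TRUE))

print(clean)
```

Note that `read.csv2()` already defaults to `sep = ";"` and `dec = ","`. Whether it would make the loop unnecessary here depends on the file layout: if a non-numeric row follows the header, the columns would probably still be read as character.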
Next, it is time to see what the data show:
The first plot explores the relationship between KPI4 and KPI5. It shows an implausibly exact linear relationship.
The second shows the relationship between time delay and cost overrun, which is explained as ‘natural’.
The relationship between KPI1 and KPI2 is also quite curious and deserves a closer look.
The relationship between KPI1 and KPI4 is hard to understand, as 100% of the KPI does not reflect what higher values of KPI4 mean.
The implementation of KPI5 is not fully clear. What does NHTotal mean? It seems that KPI4 + KPI5 = 100% at all times. Is this true? If so, why?
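One way to check the suspicion that the two indicators always add up to 100% is to test the sum directly and to fit a line. The sketch below uses a toy data frame; the column names `KPI4` and `KPI5` and all values are hypothetical stand-ins, since the real column names in `dd` are not shown here.

```r
# Toy data standing in for dd. Column names and values are hypothetical.
toy <- data.frame(
  KPI4 = c(10, 35.5, 60, 82.25),
  KPI5 = c(90, 64.5, 40, 17.75)
)

# Does the sum equal 100 in every row (up to rounding error)?
kpi_sum <- toy$KPI4 + toy$KPI5
always_100 <- all(abs(kpi_sum - 100) < 1e-8)

# If KPI5 = 100 - KPI4 holds exactly, a linear fit has intercept 100 and slope -1.
fit <- lm(KPI5 ~ KPI4, data = toy)

print(always_100)
print(coef(fit))
```

If the same check passes on the real data, the linear relationship is a consequence of how the indicators are defined, and one of the two carries no extra information.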
The meaning of KPI3 is also unclear. How is the project delay measured?
What causes such a dramatic cutoff?
## Warning in par(op0): graphical parameter "cin" cannot be set
## Warning in par(op0): graphical parameter "cra" cannot be set
## Warning in par(op0): graphical parameter "csi" cannot be set
## Warning in par(op0): graphical parameter "cxy" cannot be set
## Warning in par(op0): graphical parameter "din" cannot be set
## Warning in par(op0): graphical parameter "page" cannot be set
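Warnings of the form `graphical parameter "cin" cannot be set` typically appear when the settings were saved with a plain `par()` call, which also returns read-only parameters, and then restored with `par(op0)`. A minimal sketch of the usual way to avoid them follows. It assumes that `op0` was created this way, which is not visible in the code shown here, and the plots are placeholders.

```r
# Open a null PDF device so the sketch runs without a display or output file.
pdf(NULL)

# Save only the parameters that can be set again later.
op0 <- par(no.readonly = TRUE)

# Change the layout and draw two placeholder plots (not the report's data).
par(mfrow = c(1, 2), mar = c(4, 4, 1, 1))
plot(1:10, (1:10)^2, xlab = "x", ylab = "x squared")
plot(1:10, sqrt(1:10), xlab = "x", ylab = "square root of x")

# Restore the saved settings; read-only parameters are not in op0.
par(op0)

restored_mfrow <- par("mfrow")
dev.off()
```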
After the previous criticism, we kept some of the KPIs as
```