1 Effect of UPSTM-Based Decorrelation on Feature Discovery

Here I showcase how to use the BSWiMS feature selection/modeling function coupled with the Goal Driven Sparse Transformation Matrix (UPSTM) as a pre-processing step to decorrelate highly correlated features. The aims are:

To improve model performance by uncovering the hidden information between correlated features.
To simplify the interpretation of the machine learning models.

This demo will use:

FRESA.CAD::IDeA(). For the decorrelation of multidimensional data sets.
- FRESA.CAD::getLatentCoefficients(). For the extraction of the model of the newly discovered decorrelated features.
FRESA.CAD::randomCV(). For the cross-validation of the machine learning models.
FRESA.CAD::BSWiMS.model(). For the generation of bootstrapped logistic models.
- FRESA.CAD::summary(). For the summary description of the BSWiMS model.
FRESA.CAD::predictionStats_binary(). For describing the performance of the model.
gplots::heatmap.2(). For displaying the correlation matrix.
igraph::graph_from_adjacency_matrix(). For the display of the network of BSWiMS formulas.
vioplot::vioplot(). For the display of the z-distribution of significant features.

1.0.1 Loading the libraries

library("FRESA.CAD")
library(readxl)
library(vioplot)
library(igraph)

op <- par(no.readonly = TRUE)
pander::panderOptions('digits', 3)
pander::panderOptions('table.split.table', 400)
pander::panderOptions('keep.trailing.zeros',TRUE)

1.1 Material and Methods

1.1.1 Signed Log Transform

The following function will be used to transform all the continuous features of the data.

signedlog <- function(x) { return (sign(x)*log10(abs(1.0e6*x)+1.0)-6)}
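
To make the behavior of `signedlog()` concrete, the sketch below evaluates it on a few hand-picked values (the inputs are illustrative, not taken from the data). Note that the `-6` offset is subtracted after the sign is applied, so the function is not odd-symmetric: zero maps to -6 rather than 0.

```r
# Same definition as above, repeated so the snippet runs on its own
signedlog <- function(x) { return (sign(x)*log10(abs(1.0e6*x)+1.0)-6)}

x <- c(-1, -1e-6, 0, 1e-6, 1, 1000)   # illustrative inputs
y <- signedlog(x)

# One row per input; the transform is monotonically increasing
print(data.frame(x = x, signedlog = round(y, 4)))
```

For large positive inputs the result approaches `log10(x)`, since the factor of 1.0e6 and the offset of 6 cancel.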

1.2 Data: The Parkinson Data-Set

The data to process is described in:

Erdogdu Sakar, Betul, Gorkem Serbes, and C. Okan Sakar. “Analyzing the effectiveness of vocal features in early telediagnosis of Parkinson’s disease.” PloS one 12, no. 8 (2017): e0182428.

The data was obtained from the UCI ML repository:

https://archive.ics.uci.edu/ml/datasets/Parkinson%27s+Disease+Classification

I added a column to the data identifying the repeated experiments.


pd_speech_features <- as.data.frame(read_excel("~/GitHub/FCA/Data/pd_speech_features.xlsx",sheet = "pd_speech_features", range = "A2:ACB758"))

##The fraction of samples in the training set

trainFraction=0.65

## The file with the codes to create shorter names
namecode <- read.csv("~/GitHub/FCA/Data/Parkinson_names.csv")

1.2.1 The Average of the Three Repetitions

Each subject had three repeated observations. Here I’ll use the average of the three experiments per subject.

rep1Parkison <- subset(pd_speech_features,RID==1)
rownames(rep1Parkison) <- rep1Parkison$id
rep1Parkison$id <- NULL
rep1Parkison$RID <- NULL
rep1Parkison[,1:ncol(rep1Parkison)] <- sapply(rep1Parkison,as.numeric)

rep2Parkison <- subset(pd_speech_features,RID==2)
rownames(rep2Parkison) <- rep2Parkison$id
rep2Parkison$id <- NULL
rep2Parkison$RID <- NULL
rep2Parkison[,1:ncol(rep2Parkison)] <- sapply(rep2Parkison,as.numeric)

rep3Parkison <- subset(pd_speech_features,RID==3)
rownames(rep3Parkison) <- rep3Parkison$id
rep3Parkison$id <- NULL
rep3Parkison$RID <- NULL
rep3Parkison[,1:ncol(rep3Parkison)] <- sapply(rep3Parkison,as.numeric)

whof <- !(colnames(rep1Parkison) %in% c("gender","class"));
avgParkison <- rep1Parkison;
avgParkison[,whof] <- (rep1Parkison[,whof] + rep2Parkison[,whof] + rep3Parkison[,whof])/3
## I apply the log transform to the data
avgParkison[,whof] <- signedlog(avgParkison[,whof])
pander::pander(table(avgParkison$class))

0	1
64	188
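
The three per-repetition subsets above are averaged column by column. The same operation can be sketched in one step with base R's `aggregate()`; the toy data frame and its column names (`id`, `RID`, `f1`, `f2`) are made up for illustration and are not part of the Parkinson data.

```r
# Toy data: 2 subjects x 3 repetitions, two numeric features (hypothetical)
toy <- data.frame(id  = rep(c("s1", "s2"), each = 3),
                  RID = rep(1:3, times = 2),
                  f1  = c(1, 2, 3, 10, 20, 30),
                  f2  = c(0.1, 0.2, 0.3, 1, 1, 1))

# Average every feature within each subject
avgToy <- aggregate(cbind(f1, f2) ~ id, data = toy, FUN = mean)
rownames(avgToy) <- avgToy$id
avgToy$id <- NULL
print(avgToy)
```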

1.2.2 Correlation Matrix of the Parkinson Data

The heat-map of the correlation:

cormat <- cor(avgParkison[,colnames(avgParkison)!="class"],method="spearman")
cormat[is.na(cormat)] <- 0
gplots::heatmap.2(abs(cormat),
                  trace = "none",
                  scale = "none",
                  mar = c(10,10),
                  col=rev(heat.colors(5)),
                  main = "Raw Correlation",
                  cexRow = 0.35,
                  cexCol = 0.35,
                  key.title=NA,
                  key.xlab="Spearman Correlation",
                  xlab="Feature", ylab="Feature")
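
A heat map gives a qualitative picture. A complementary summary is the list of feature pairs whose absolute Spearman correlation exceeds the 0.8 threshold used later for decorrelation. The sketch below uses simulated data with made-up feature names, not the Parkinson features.

```r
set.seed(1)
n <- 100
a <- rnorm(n)
toy <- data.frame(a = a,
                  b = a + rnorm(n, sd = 0.1),  # nearly a copy of a
                  c = rnorm(n))                # independent noise

cm <- abs(cor(toy, method = "spearman"))

# Keep the upper triangle only so each pair is reported once
highPairs <- which(cm > 0.8 & upper.tri(cm), arr.ind = TRUE)
pairsTable <- data.frame(feature1 = rownames(cm)[highPairs[, "row"]],
                         feature2 = colnames(cm)[highPairs[, "col"]],
                         rho = cm[highPairs])
print(pairsTable)
```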

1.2.3 Training and Testing Sets

We divided the data into training and testing sets.

set.seed(2)
caseSet <- subset(avgParkison, class == 1)
controlSet <- subset(avgParkison, class == 0)
caseTrainSize <- nrow(caseSet)*trainFraction;
controlTrainSize <- nrow(controlSet)*trainFraction;
sampleCaseTrain <- sample(nrow(caseSet),caseTrainSize)
sampleControlTrain <- sample(nrow(controlSet),controlTrainSize)
trainSet <- rbind(caseSet[sampleCaseTrain,], controlSet[sampleControlTrain,])
testSet <-  rbind(caseSet[-sampleCaseTrain,],controlSet[-sampleControlTrain,])
pander::pander(table(trainSet$class))

0	1
41	122

pander::pander(table(testSet$class))

0	1
23	66
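
The split above samples cases and controls separately so that both sets keep the original class ratio. Because `nrow(caseSet)*trainFraction` is generally not an integer, `sample()` truncates it, as the training counts of 41 and 122 show. The sketch below makes the truncation explicit with `floor()`; the helper name `stratifiedSplit` and the toy data are hypothetical.

```r
stratifiedSplit <- function(data, outcome, fraction) {
  # Row indices grouped by class, then sampled within each class
  byClass <- split(seq_len(nrow(data)), data[[outcome]])
  trainIdx <- unlist(lapply(byClass, function(idx) {
    # idx[sample.int(...)] avoids sample()'s special case for length-one vectors
    idx[sample.int(length(idx), floor(length(idx) * fraction))]
  }))
  list(train = data[trainIdx, , drop = FALSE],
       test  = data[-trainIdx, , drop = FALSE])
}

set.seed(2)
# Toy data with the same class sizes as the averaged data set (64 vs 188)
toy <- data.frame(x = rnorm(252), class = rep(c(0, 1), times = c(64, 188)))
parts <- stratifiedSplit(toy, "class", 0.65)
print(table(parts$train$class))
print(table(parts$test$class))
```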

1.2.3.1 Decorrelation: Training and Testing Sets Creation

I compute a decorrelated version of the training and testing sets using the IDeA() function of FRESA.CAD. The first decorrelation will be driven by features associated with the outcome. The second decorrelation will find the UPSTM without the outcome restriction.

## The UPSTM transformation driven by the Outcome
deTrain <- IDeA(trainSet,Outcome="class",thr=0.8,verbose = TRUE,skipRelaxed=FALSE)

Included: 679 , Uni p: 0.0159386 , Uncorrelated Base: 167 , Outcome-Driven Size: 67 , Base Size: 196

1 <R=1.000,w= 1,N= 373>, Top: 95( 2 )1 : 95 : 0.975,<|>Tot Used: 317 , Added: 224 , Zero Std: 0 , Max Cor: 1.000

2 <R=1.000,w= 1,N= 373>, Top: 24( 7 )1 : 24 : 0.975,<|>Tot Used: 353 , Added: 55 , Zero Std: 0 , Max Cor: 0.998

3 <R=0.998,w= 1,N= 373>, Top: 14( 3 )1 : 14 : 0.974,<|>Tot Used: 368 , Added: 16 , Zero Std: 0 , Max Cor: 0.974

4 <R=0.974,w= 2,N= 208>, Top: 76( 2 )=( 1 )2 : 76 : 0.966,<|>Tot Used: 441 , Added: 102 , Zero Std: 0 , Max Cor: 0.984

5 <R=0.984,w= 2,N= 208>, Top: 16( 2 )1 : 16 : 0.942,<|>Tot Used: 443 , Added: 19 , Zero Std: 0 , Max Cor: 0.967

6 <R=0.967,w= 3,N= 162>, Top: 60( 1 )1 : 60 : 0.884,<|>Tot Used: 477 , Added: 71 , Zero Std: 0 , Max Cor: 0.983

7 <R=0.983,w= 3,N= 162>, Top: 16( 1 )1 : 16 : 0.891,<|>Tot Used: 478 , Added: 19 , Zero Std: 0 , Max Cor: 0.909

8 <R=0.909,w= 3,N= 162>, Top: 35( 1 )1 : 35 : 0.854,<|>Tot Used: 493 , Added: 39 , Zero Std: 0 , Max Cor: 0.985

9 <R=0.985,w= 3,N= 162>, Top: 6( 1 )1 : 6 : 0.893,<|>Tot Used: 493 , Added: 6 , Zero Std: 0 , Max Cor: 0.877

10 <R=0.877,w= 4,N= 158>, Top: 64( 1 )1 : 64 : 0.800,<|>Tot Used: 527 , Added: 69 , Zero Std: 0 , Max Cor: 0.962

11 <R=0.962,w= 4,N= 158>, Top: 8( 1 )1 : 8 : 0.831,<|>Tot Used: 527 , Added: 10 , Zero Std: 0 , Max Cor: 0.883

12 <R=0.883,w= 5,N= 19>, Top: 9( 1 )1 : 9 : 0.800,<|>Tot Used: 527 , Added: 8 , Zero Std: 0 , Max Cor: 0.800

13 <R=0.000,w= 5,N= 19>

[ 13 ], 0.7999961 Decor Dimension: 527 . Cor to Base: 168 , ABase: 21 , Outcome Base: 33

deTest <- predictDecorrelate(deTrain,testSet)

## The UPSTM transformation without outcome
deTrainU <- IDeA(trainSet,thr=0.8,verbose = TRUE,skipRelaxed=FALSE)

Included: 679 , Uni p: 0.0159386 , Uncorrelated Base: 186 , Outcome-Driven Size: 0 , Base Size: 186

1 <R=1.000,w= 1,N= 373>, Top: 93( 2 )1 : 93 : 0.975,<|>Tot Used: 323 , Added: 231 , Zero Std: 0 , Max Cor: 1.000

2 <R=1.000,w= 1,N= 373>, Top: 20( 7 )1 : 20 : 0.975,<|>Tot Used: 354 , Added: 53 , Zero Std: 0 , Max Cor: 0.998

3 <R=0.998,w= 1,N= 373>, Top: 11( 3 )1 : 11 : 0.974,<|>Tot Used: 369 , Added: 13 , Zero Std: 0 , Max Cor: 0.974

4 <R=0.974,w= 2,N= 208>, Top: 73( 2 )1 : 73 : 0.937,<|>Tot Used: 438 , Added: 103 , Zero Std: 0 , Max Cor: 0.984

5 <R=0.984,w= 2,N= 208>, Top: 14( 2 )1 : 14 : 0.942,<|>Tot Used: 442 , Added: 16 , Zero Std: 0 , Max Cor: 0.944

6 <R=0.944,w= 3,N= 176>, Top: 62( 2 )=2 : 62 : 0.916,<|>Tot Used: 477 , Added: 83 , Zero Std: 0 , Max Cor: 0.981

7 <R=0.981,w= 3,N= 176>, Top: 13( 1 )1 : 13 : 0.891,<|>Tot Used: 485 , Added: 16 , Zero Std: 0 , Max Cor: 0.897

8 <R=0.897,w= 4,N= 202>, Top: 71( 6 )1 : 71 : 0.800,<|>Tot Used: 520 , Added: 102 , Zero Std: 0 , Max Cor: 0.979

9 <R=0.979,w= 4,N= 202>, Top: 19( 1 )1 : 19 : 0.840,<|>Tot Used: 521 , Added: 19 , Zero Std: 0 , Max Cor: 0.919

10 <R=0.919,w= 5,N= 22>, Top: 10( 1 )1 : 10 : 0.800,<|>Tot Used: 522 , Added: 12 , Zero Std: 0 , Max Cor: 0.905

11 <R=0.905,w= 5,N= 22>, Top: 1( 1 )1 : 1 : 0.800,<|>Tot Used: 522 , Added: 1 , Zero Std: 0 , Max Cor: 0.799

12 <R=0.000,w= 6,N= 0>

[ 12 ], 0.7994925 Decor Dimension: 522 . Cor to Base: 217 , ABase: 39 , Outcome Base: 0

deTestU <- predictDecorrelate(deTrainU,testSet)
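
The `La_` variables produced by the transformation are linear combinations of correlated features. As a rough intuition only (this is not the IDeA algorithm, whose internals are not shown here), subtracting the linear fit of one feature on another leaves a residual that is linearly uncorrelated with the predictor. The simulated data and variable names below are assumptions.

```r
set.seed(3)
n <- 200
x <- rnorm(n)
y <- 0.9 * x + rnorm(n, sd = 0.3)      # y is strongly correlated with x

beta <- coef(lm(y ~ x))["x"]           # least-squares slope of y on x
La_y <- y - beta * x                   # residual-like latent variable

# Pearson correlation with x before and after removing the linear part
print(round(c(before = cor(x, y), after = cor(x, La_y)), 3))
```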

1.2.4 Distance map

tranformed2 <- colnames(deTrain)[str_detect(colnames(deTrain),"La_")]
tranformed <- str_remove_all(tranformed2,"La_")

dsubs <- as.matrix(dist(as.matrix(trainSet[,tranformed]),"euclidean"))

gplots::heatmap.2(dsubs,
                  trace = "none",
                  scale = "none",
                  mar = c(10,10),
                  col=rev(heat.colors(5)),
                  main = "Original: Train Distances",
                  cexRow = 0.35,
                  cexCol = 0.35,
                  key.title=NA,
                  key.xlab="Distance",
                  xlab="Subject", ylab="Subject")



dsubsD <- as.matrix(dist(as.matrix(deTrain[,tranformed2]),"euclidean"))

gplots::heatmap.2(dsubsD,
                  trace = "none",
                  scale = "none",
                  mar = c(10,10),
                  col=rev(heat.colors(5)),
                  main = "Transformed: Train Distances",
                  cexRow = 0.35,
                  cexCol = 0.35,
                  key.title=NA,
                  key.xlab="Distance",
                  xlab="Subject", ylab="Subject")


diff <- dsubs - dsubsD
gplots::heatmap.2(diff,
                  trace = "none",
                  scale = "none",
                  mar = c(10,10),
                  col=rev(heat.colors(5)),
                  main = "Distances Diff",
                  cexRow = 0.35,
                  cexCol = 0.35,
                  key.title=NA,
                  key.xlab="Distance Diff",
                  xlab="Subject", ylab="Subject")

1.2.4.1 Correlation Matrix of the Decorrelated Test Data

The correlation heat map of the decorrelated testing set.

cormat <- cor(deTest[,colnames(deTest)!="class"],method="spearman")
cormat[is.na(cormat)] <- 0
gplots::heatmap.2(abs(cormat),
                  trace = "none",
                  scale = "none",
                  mar = c(10,10),
                  col=rev(heat.colors(5)),
                  main = "Test Set Correlation after UPSTM",
                  cexRow = 0.35,
                  cexCol = 0.35,
                  key.title=NA,
                  key.xlab="Spearman Correlation",
                  xlab="Feature", ylab="Feature")

1.2.5 Holdout Cross-Validation

Before doing the feature analysis, I’ll explore BSWiMS modeling using the holdout cross-validation method of FRESA.CAD. The purpose of the cross-validation is to observe and estimate the performance gain of decorrelation.

par(op)
par(mfrow=c(1,3))

## The Raw validation
cvBSWiMSRaw <- randomCV(avgParkison,
                "class",
                fittingFunction= BSWiMS.model,
                classSamplingType = "Pro",
                trainFraction = trainFraction,
                repetitions = 150
)

.[++++++++++–+++-]..[+++-+++++-+-].[++++++++++++++++++++]…[+++-].[++++-+++-++-+++-]..[++++–].[++++++++++++++++++++]…[+++++++++++++-+–]..[++++++++++++–]..[++++++-+–]10 Tested: 250 Avg. Selected: 29.4 Min Tests: 1 Max Tests: 7 Mean Tests: 3.56 . MAD: 0.3091545

.[++++-].[++++++++++++++++++++]…[++++++++++++++++++++]…[++++++++++++++++++++]…[+++++++++—+++++]..[++–].[++++++++++++++++-+-]..[++++++++++++++++++++]…[+++++++++++++-++-]..[++++++++++++++++++++]..20 Tested: 252 Avg. Selected: 34.45 Min Tests: 2 Max Tests: 14 Mean Tests: 7.063492 . MAD: 0.3012962

.[++++++++++++++++++++]…[+++++++++++++++++–]..[++++++++++++++++++++]…[+++++++-+++–]..[+++++++++++-]..[+++++++-+++++++++++]..[+++++-+-].[+++++++++++++++-+++]..[++++++++++++++++-]..[++++++++++++++++++++]..30 Tested: 252 Avg. Selected: 36.63333 Min Tests: 4 Max Tests: 19 Mean Tests: 10.59524 . MAD: 0.3090249

.[+++-].[++++++++++++++++++-]..[++++++++++++++++++++]…[++++++++++++++++++++]…[++++++++++++-++++++]..[++++++++++++++++-++]..[+++++++++++++++++++-]..[++++++++++++++++++++]…[++++++++++—+++-]..[++++++++-++-++++-].40 Tested: 252 Avg. Selected: 38 Min Tests: 5 Max Tests: 22 Mean Tests: 14.12698 . MAD: 0.3067733

.[++++++++-+++++-++-]..[+++++++++++-]..[+++++++++++++–]..[+++++++++++++-]..[+++++++++++++–+++]..[++++++-].[++++++++++++-]..[++++++-].[++++++++++++++++-++]..[+++++++++++++++-].50 Tested: 252 Avg. Selected: 37.26 Min Tests: 9 Max Tests: 26 Mean Tests: 17.65873 . MAD: 0.3060329

.[+++++-+++++++++++-]..[+++++++++++++++++++-]..[++++++++++-++-++-]..[++++++++–++++–]..[+++++++-++++—]..[++++++++++–]..[++++++-+++-].[++++++++++-++++++++]..[+++++++++–+-+–]..[+++++++++++++++–+].60 Tested: 252 Avg. Selected: 37.05 Min Tests: 10 Max Tests: 31 Mean Tests: 21.19048 . MAD: 0.305701

.[++++++++++++++++++++]…[++++++++++++++++++++]…[++++++++++++++++++++]…[++++++++++++++–]..[++++++++++++++++++-]..[+++++-].[++++++++++++++++++++]…[+++++++++++++++++–]..[+++-+++++-].[++++++++++++++++++++]..70 Tested: 252 Avg. Selected: 37.45714 Min Tests: 11 Max Tests: 35 Mean Tests: 24.72222 . MAD: 0.3059183

.[++++++++++++++++++++]…[++++++++++++++++++-]..[+++++++-+++++-]..[+++++++++++++++-++-]..[+++++-].[++++++++++++++++-+-]..[++++++++++++++++++++]…[+++++++++-++-++–]..[+++++++++++++-++-]..[++++++++++++++++++++]..80 Tested: 252 Avg. Selected: 37.7375 Min Tests: 17 Max Tests: 40 Mean Tests: 28.25397 . MAD: 0.3073354

.[+++++++++++-+—]..[++++++++++++++++++-]..[++++++++++++++++++++]…[++++++++++++++++++++]…[+++++++++-].[++++++++++++++++++++]…[++++++++++++++++++++]…[+++++++++++++-++–]..[+++++++++++++++++++-]..[+-++-]90 Tested: 252 Avg. Selected: 38.17778 Min Tests: 20 Max Tests: 43 Mean Tests: 31.78571 . MAD: 0.3077018

.[++++++++-+++++++-+]..[++++++++++++++++-++]..[++++++++++++++++++++]…[++++++++++++++++++++]…[++++++-+++++-]..[+++++++++++++++++–]..[++++++++++++++++++++]…[++++++++++++++++++++]…[++++++++++++-]..[+++++++++-+++++-++].100 Tested: 252 Avg. Selected: 38.56 Min Tests: 21 Max Tests: 46 Mean Tests: 35.31746 . MAD: 0.3085001

.[++++++++++++++++++++]…[+++++++++++++++-+-]..[+++++++-+-++-]..[++++++++++++++++++++]…[++++++++++++++++++++]…[++++++++-].[++++++-+++++++++-+]..[++++++++++++++++++++]…[++++++++++++++++++-]..[+++–]110 Tested: 252 Avg. Selected: 38.81818 Min Tests: 23 Max Tests: 51 Mean Tests: 38.84921 . MAD: 0.3073142

.[+++++++++-++++-+++]..[++++++++-++++++++++]..[+++++-+-].[+-+++++–].[+-+++-++++++-]..[++++++++++++++++++++]…[++++++++++++++++++-]..[++++++++-+-].[++++++++++++++++++++]…[++++++++++++++++-++].120 Tested: 252 Avg. Selected: 38.725 Min Tests: 29 Max Tests: 54 Mean Tests: 42.38095 . MAD: 0.306978

.[+++++++++++-++++–]..[++++++++++++++++++++]…[+++++++++++++++-+++]..[+++++++++++++–+++]..[++++++++++++++++++++]…[++++++++++++-+++–]..[++++++++++++++++++++]…[++++++++++++++++-]..[++++++++++++++++++++]…[+++++++++++++++–+].130 Tested: 252 Avg. Selected: 39.3 Min Tests: 33 Max Tests: 58 Mean Tests: 45.9127 . MAD: 0.307711

.[++++++++++++++++++++]…[++++++++++++++++++++]…[++++++++++++++++++-]..[+++++++++++++++–+]..[++++++++++++-++–]..[++++++++++++++++-++]..[++++++++++++++++++++]…[+++++++++++++++—]..[+++++++++++++-]..[+++++++++++++++-].140 Tested: 252 Avg. Selected: 39.57143 Min Tests: 36 Max Tests: 62 Mean Tests: 49.44444 . MAD: 0.3082386

.[++++++++++++++++++++]…[++++++++++++++++++++]…[++++++++++++++++++++]…[++++++++++++++++++++]…[++++++++++++-]..[++++++++++++-++++++]..[+++++-].[++++++++++++++++++++]…[++++++++++–++-]..[++++++++++++++++++++]..150 Tested: 252 Avg. Selected: 39.68667 Min Tests: 39 Max Tests: 65 Mean Tests: 52.97619 . MAD: 0.309539


bpraw <- predictionStats_binary(cvBSWiMSRaw$medianTest,"BSWiMS RAW",cex=0.60)

BSWiMS RAW

pander::pander(bpraw$CM.analysis$tab)

	Outcome +	Outcome -	Total
Test +	144	20	164
Test -	44	44	88
Total	188	64	252

pander::pander(bpraw$accc)

est	lower	upper
0.746	0.688	0.799

pander::pander(bpraw$aucs)

est	lower	upper
0.84	0.785	0.894

pander::pander(bpraw$berror)

50%	2.5%	97.5%
0.271	0.211	0.343
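
The summary statistics can be cross-checked against the confusion matrix above. The sketch below recomputes accuracy, sensitivity, specificity, and the balanced error from the four cell counts. The point estimate of the balanced error (about 0.273) differs slightly from the reported 0.271, presumably because the reported value is the bootstrap median (the `50%` column).

```r
# Cell counts copied from the BSWiMS RAW confusion matrix
TP <- 144; FP <- 20
FN <- 44;  TN <- 44

accuracy    <- (TP + TN) / (TP + FP + FN + TN)
sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
balancedErr <- 1 - (sensitivity + specificity) / 2

print(round(c(accuracy = accuracy, sensitivity = sensitivity,
              specificity = specificity, balancedErr = balancedErr), 3))
```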


## The validation with Outcome-driven Decorrelation
cvBSWiMSDeCor <- randomCV(avgParkison,
                "class",
                trainSampleSets= cvBSWiMSRaw$trainSamplesSets,
                fittingFunction= filteredFit,
                fitmethod=BSWiMS.model,
                filtermethod=NULL,
                DECOR = TRUE,
                DECOR.control=list(Outcome="class",thr=0.8,skipRelaxed=FALSE)
)

.[++++++++-].[++++++-+-].[++++++-+-].[++–].[++++-].[+++–].[+++++++-+++-]..[+++++++++++++-]..[+++++-].[+++-++-]10 Tested: 250 Avg. Selected: 22.4 Min Tests: 1 Max Tests: 7 Mean Tests: 3.56 . MAD: 0.2871137

.[++++-+-+-].[++++-].[++++-+++–].[++++++++++++++-+++-]..[+++++++-].[++-+++-].[+++++++++++++++-+++]..[++-].[+++-+-++-].[+++-]20 Tested: 252 Avg. Selected: 23.6 Min Tests: 2 Max Tests: 14 Mean Tests: 7.063492 . MAD: 0.2826612

.[+++++-+++-].[+++++++-+++-]..[+++++++-].[++++++++-++-]..[++++++-].[++++—].[++-].[++++++-].[+++–].[++++++–+++-]30 Tested: 252 Avg. Selected: 23.6 Min Tests: 4 Max Tests: 19 Mean Tests: 10.59524 . MAD: 0.2878354

.[+++++++++++—-]..[++++-+++++-++-+++]..[+++++-].[+++-].[+++++++++++-]..[+++++++++-].[+++++++++++-++-]..[+++++++++-].[+–+-].[+++++—]40 Tested: 252 Avg. Selected: 24.6 Min Tests: 5 Max Tests: 22 Mean Tests: 14.12698 . MAD: 0.2909058

.[+++++-].[+++-+++–].[++++-].[+++++++++++++++-]..[++++++++-].[+++—-].[++++++++-].[++++++-].[++++++-].[+++++—]50 Tested: 252 Avg. Selected: 24.38 Min Tests: 9 Max Tests: 26 Mean Tests: 17.65873 . MAD: 0.2902138

.[+++++++-].[++++++-+++++++—]..[+++++-].[+++++-].[++++++++-].[++-++-].[+++++++++++-+-+++-]..[+++++++++-].[+++++++-].[+++++++++++++++-].60 Tested: 252 Avg. Selected: 25.28333 Min Tests: 10 Max Tests: 31 Mean Tests: 21.19048 . MAD: 0.2936879

.[+++-+-].[+-].[+++++++-+-+-].[++++++-].[+++++-++-].[+++++++++-].[++++++++-].[+++-+-].[+++++–+-].[+++–]70 Tested: 252 Avg. Selected: 24.54286 Min Tests: 11 Max Tests: 35 Mean Tests: 24.72222 . MAD: 0.2904197

.[++-].[++++++-++++–+-]..[++++++++++++++-]..[+++-].[++—++-].[++++++++-].[++++++++++++++++–]..[+++++++-].[+++-].[+++++++++-]80 Tested: 252 Avg. Selected: 24.6875 Min Tests: 17 Max Tests: 40 Mean Tests: 28.25397 . MAD: 0.291087

.[+++-].[+++++-].[++++–+—].[+++++++++-+-+-+-]..[++++++–].[++++++++++-]..[+++++++-].[++-].[+++++–].[+++-]90 Tested: 252 Avg. Selected: 24.07778 Min Tests: 20 Max Tests: 43 Mean Tests: 31.78571 . MAD: 0.2897524

.[+++++-].[++++++++++++++-++++]..[++++-].[++++-].[+++-].[+++++++++++++++-+-]..[+++++-++++-].[++++++++-++-+++-]..[+-+—].[++++++-]100 Tested: 252 Avg. Selected: 24.22 Min Tests: 21 Max Tests: 46 Mean Tests: 35.31746 . MAD: 0.2903004

.[++++-].[+++++-].[+++-].[++++++++++-+-++++-]..[+++++++++++-+-+–]..[++++++++–].[+++++++-].[++++-++—].[++++++++++-+++-]..[+++-++++++-+-].110 Tested: 252 Avg. Selected: 24.72727 Min Tests: 23 Max Tests: 51 Mean Tests: 38.84921 . MAD: 0.2896679

.[++++-+++-++-].[+-].[+++-].[+++++–].[+++++++++++-]..[++++++-++++-]..[++++++++-+++++-]..[+++++-].[++-+-++-].[+++-]120 Tested: 252 Avg. Selected: 24.64167 Min Tests: 29 Max Tests: 54 Mean Tests: 42.38095 . MAD: 0.28616

.[++++++++-].[++++–+-].[+–].[+++++++++++++++-]..[+++—].[+++++++++-+++++++++]..[+++-+++++-++-]..[+++-+++-].[++++++-+-].[+++++++++–]130 Tested: 252 Avg. Selected: 24.86923 Min Tests: 33 Max Tests: 58 Mean Tests: 45.9127 . MAD: 0.2869853

.[++++++++++++-]..[+++++++-].[+-].[++-].[++++-].[+++++-].[+++++++++++-]..[++++++++–].[++++++-].[+++++-]140 Tested: 252 Avg. Selected: 24.62143 Min Tests: 36 Max Tests: 62 Mean Tests: 49.44444 . MAD: 0.2860115

.[+++++-].[+++-].[++++++-].[+++–].[+++-+–].[+++-++-++-].[++++++++++-+–]..[+++-++-].[++++-].[+++++++-+–]150 Tested: 252 Avg. Selected: 24.32667 Min Tests: 39 Max Tests: 65 Mean Tests: 52.97619 . MAD: 0.2861055


bpDecor <- predictionStats_binary(cvBSWiMSDeCor$medianTest,"BSWiMS Outcome-Driven UPSTM",cex=0.60)

BSWiMS Outcome-Driven UPSTM

pander::pander(bpDecor$CM.analysis$tab)

	Outcome +	Outcome -	Total
Test +	165	14	179
Test -	23	50	73
Total	188	64	252

pander::pander(bpDecor$accc)

est	lower	upper
0.853	0.803	0.894

pander::pander(bpDecor$aucs)

est	lower	upper
0.867	0.809	0.925

pander::pander(bpDecor$berror)

50%	2.5%	97.5%
0.169	0.118	0.23


### Here we test whether the outcome-driven decorrelation ROC is superior to the RAW ROC (one-sided DeLong test).
pander::pander(roc.test(bpDecor$ROC.analysis$roc.predictor,bpraw$ROC.analysis$roc.predictor,alternative = "greater"))

DeLong’s test for two correlated ROC curves: `bpDecor$ROC.analysis$roc.predictor` and `bpraw$ROC.analysis$roc.predictor`
Test statistic	P value	Alternative hypothesis	AUC of roc1	AUC of roc2
1.78	0.0376 *	greater	0.867	0.84


### Testing probability improvement
iprob <- .Call("improveProbCpp",cvBSWiMSRaw$medianTest[,2],
               cvBSWiMSDeCor$medianTest[,2],
               cvBSWiMSRaw$medianTest[,1]);
pander::pander(iprob)

z.idi: 1.99
z.nri: 0.225
idi: 0.0303
nri: 0.0326
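
For reference, the integrated discrimination improvement (IDI) has a simple closed form: the gain in mean predicted probability among events minus the gain among non-events. The sketch below computes it in base R on simulated predictions. It illustrates the standard definition and is not claimed to reproduce the internals of `improveProbCpp`; the helper name `idiSketch` is hypothetical.

```r
idiSketch <- function(pOld, pNew, outcome) {
  # Change in mean predicted probability within each outcome group
  gainEvents    <- mean(pNew[outcome == 1]) - mean(pOld[outcome == 1])
  gainNonEvents <- mean(pNew[outcome == 0]) - mean(pOld[outcome == 0])
  gainEvents - gainNonEvents
}

set.seed(4)
outcome <- rep(c(0, 1), each = 50)
# Simulated probabilities: pNew separates the classes more than pOld
pOld <- plogis(rnorm(100, mean = ifelse(outcome == 1, 0.5, -0.5)))
pNew <- plogis(rnorm(100, mean = ifelse(outcome == 1, 1.5, -1.5)))
print(idiSketch(pOld, pNew, outcome))
```

A positive value means the new predictions move events toward 1 and non-events toward 0 on average.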

### Testing improving accuracy
testRaw <- (cvBSWiMSRaw$medianTest[,1]-cvBSWiMSRaw$medianTest[,2])<0.5
testDecor <- (cvBSWiMSDeCor$medianTest[,1]-cvBSWiMSDeCor$medianTest[,2])<0.5
pander::pander(mcnemar.test(testRaw,testDecor))

McNemar’s Chi-squared test with continuity correction: `testRaw` and `testDecor`
Test statistic	df	P value
17.4	1	3.04e-05 * * *
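
McNemar's test compares two classifiers on the same subjects through their discordant pairs. A minimal sketch on simulated predictions follows. Here correctness is defined as `abs(outcome - prediction) < 0.5` so that errors in both classes count, which is an assumption of this example rather than the definition used in the code above.

```r
set.seed(5)
outcome <- rep(c(0, 1), times = c(60, 140))
# Hypothetical predicted probabilities from a weaker and a stronger model
predA <- plogis(rnorm(200, mean = ifelse(outcome == 1, 0.5, -0.5)))
predB <- plogis(rnorm(200, mean = ifelse(outcome == 1, 2.0, -2.0)))

# Fixed factor levels guarantee a 2 x 2 table even if one level is absent
correctA <- factor(abs(outcome - predA) < 0.5, levels = c(FALSE, TRUE))
correctB <- factor(abs(outcome - predB) < 0.5, levels = c(FALSE, TRUE))

tab <- table(A = correctA, B = correctB)
print(tab)
print(mcnemar.test(tab))
```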


## The validation of Decorrelation without the outcome restriction
cvBSWiMSDeCorU <- randomCV(avgParkison,
                "class",
                trainSampleSets= cvBSWiMSRaw$trainSamplesSets,
                fittingFunction= filteredFit,
                fitmethod=BSWiMS.model,
                filtermethod=NULL,
                DECOR = TRUE,
                DECOR.control=list(thr=0.8,skipRelaxed=FALSE)
)

.[++-++++-].[++++++++-+–].[+++++-].[+++++++++-].[++++-].[++–].[++++–].[+++++++-++–].[++—-].[++-]10 Tested: 250 Avg. Selected: 18.9 Min Tests: 1 Max Tests: 7 Mean Tests: 3.56 . MAD: 0.2873746

.[++-].[+++++–].[+++–].[+++-+++++-].[+-].[+++++++-].[+++++++-].[++-++–].[+++-+-+-].[++++++-+-]20 Tested: 252 Avg. Selected: 17.6 Min Tests: 2 Max Tests: 14 Mean Tests: 7.063492 . MAD: 0.2700011

.[++++++++-+++-]..[++++++++++-++–]..[++++++-+-].[+++++++++-].[+++++-+–].[+++++++++++++++-]..[+++++++++++++-]..[+++-].[++-+++—].[++++-++++++–].30 Tested: 252 Avg. Selected: 22.3 Min Tests: 4 Max Tests: 19 Mean Tests: 10.59524 . MAD: 0.2791296

.[++++++–+++-].[++++++-].[++-+-].[+++-].[++++++-].[++++++-+++–].[++++-].[++-++-+++–].[++++–].[+–]40 Tested: 252 Avg. Selected: 21.725 Min Tests: 5 Max Tests: 22 Mean Tests: 14.12698 . MAD: 0.2726633

.[++++++—].[+++++++-++-].[+++++++-].[++++++-].[++++++++++-++-]..[++++++++++++-]..[++++++-].[++-++++-].[+++-].[++++—+-]50 Tested: 252 Avg. Selected: 22.28 Min Tests: 9 Max Tests: 26 Mean Tests: 17.65873 . MAD: 0.2762398

.[++++++++-].[+++++++++-++–]..[++++++-].[+++++-].[++++++++-].[+++++–].[++-+–].[++++++-+—].[+++++-].[+++++++-+++-+-].60 Tested: 252 Avg. Selected: 22.75 Min Tests: 10 Max Tests: 31 Mean Tests: 21.19048 . MAD: 0.2804432

.[+++++++-+++-+-]..[+++-++++-].[+++-].[+-+++-].[++++-].[++++++++-].[++++++++++-+–]..[++-++++-].[++++++++-+++-]..[++-+-]70 Tested: 252 Avg. Selected: 22.97143 Min Tests: 11 Max Tests: 35 Mean Tests: 24.72222 . MAD: 0.2820656

.[++++-].[+++++-+—+-].[+++++++++++++++++-]..[++—].[++–].[++++++++-+++–+-]..[++++++++++++++–+-]..[++++++–].[+++++++++-].[++++++-]80 Tested: 252 Avg. Selected: 23.65 Min Tests: 17 Max Tests: 40 Mean Tests: 28.25397 . MAD: 0.2829365

.[+++++++-+++-]..[+++-+++++-+-].[+++–].[+++++++–+++++-]..[+++++—].[++++++–+-].[++++++—-].[+++++++–].[+++++++-].[++++-]90 Tested: 252 Avg. Selected: 23.78889 Min Tests: 20 Max Tests: 43 Mean Tests: 31.78571 . MAD: 0.280273

.[+++-+++-+-].[+++++++++-].[++-].[+++++-].[++++++++—++-]..[++++-].[++++++++–+-].[+++++++-].[++++-].[+++++—]100 Tested: 252 Avg. Selected: 23.72 Min Tests: 21 Max Tests: 46 Mean Tests: 35.31746 . MAD: 0.2804125

.[+++—-].[++++++-].[++-].[+++++++++++++++–]..[++++++++++-+-]..[++-].[++++++++++-]..[+++++–+-].[+++++-+-+-].[++-]110 Tested: 252 Avg. Selected: 23.7 Min Tests: 23 Max Tests: 51 Mean Tests: 38.84921 . MAD: 0.2798014

.[++++++++-].[++-].[+++-].[++++++++++-+-++-]..[+++++-+++++-]..[+++++++++-].[++++++++++++-]..[++++-].[++–++-].[++-]120 Tested: 252 Avg. Selected: 23.71667 Min Tests: 29 Max Tests: 54 Mean Tests: 42.38095 . MAD: 0.2785596

.[++++-].[+-++-].[+—-].[+++++++++++—–]..[+++-+-].[++++++++-+—].[++-++-+-++–].[+++++-].[++++++++++++-]..[++++-+-]130 Tested: 252 Avg. Selected: 23.7 Min Tests: 33 Max Tests: 58 Mean Tests: 45.9127 . MAD: 0.2792973

.[++++++–].[+++-+++–].[++-].[++–].[++++-+–].[++++–].[+++–].[++++++-].[++++-+-].[+++++-]140 Tested: 252 Avg. Selected: 23.19286 Min Tests: 36 Max Tests: 62 Mean Tests: 49.44444 . MAD: 0.2762984

.[+-].[+++–].[++-++++-].[++++–].[+++-++-].[++-].[+++++-].[+++–].[+++++++-].[++++-]150 Tested: 252 Avg. Selected: 22.69333 Min Tests: 39 Max Tests: 65 Mean Tests: 52.97619 . MAD: 0.2738196


bpDecorU <- predictionStats_binary(cvBSWiMSDeCorU$medianTest,"BSWiMS Data Driven UPSTM",cex=0.60)

BSWiMS Data Driven UPSTM

pander::pander(bpDecorU$CM.analysis$tab)

	Outcome +	Outcome -	Total
Test +	167	14	181
Test -	21	50	71
Total	188	64	252

pander::pander(bpDecorU$accc)

est	lower	upper
0.861	0.812	0.901

pander::pander(bpDecorU$aucs)

est	lower	upper
0.877	0.819	0.935

pander::pander(bpDecorU$berror)

50%	2.5%	97.5%
0.163	0.112	0.223


### Here we test whether the blind decorrelation ROC is superior to the RAW ROC (one-sided DeLong test).

pander::pander(roc.test(bpDecorU$ROC.analysis$roc.predictor,bpraw$ROC.analysis$roc.predictor,alternative = "greater"))

DeLong’s test for two correlated ROC curves: `bpDecorU$ROC.analysis$roc.predictor` and `bpraw$ROC.analysis$roc.predictor`
Test statistic	P value	Alternative hypothesis	AUC of roc1	AUC of roc2
2.26	0.0118 *	greater	0.877	0.84

par(op)

## Testing probability improvement
iprob <- .Call("improveProbCpp",cvBSWiMSRaw$medianTest[,2],cvBSWiMSDeCorU$medianTest[,2],cvBSWiMSRaw$medianTest[,1]);
pander::pander(iprob)

z.idi: 3.28
z.nri: 1.91
idi: 0.0551
nri: 0.273


## Testing accuracy improvement
testDecorU <- (cvBSWiMSDeCorU$medianTest[,1]-cvBSWiMSDeCorU$medianTest[,2])<0.5
pander::pander(mcnemar.test(testRaw,testDecorU))

McNemar’s Chi-squared test with continuity correction: `testRaw` and `testDecorU`
Test statistic	df	P value
15.6	1	7.77e-05 * * *

1.3 The Raw Model vs. the Decorrelated-Based Model

After demonstrating that decorrelation is able to improve BSWiMS model performance, I’ll focus on showcasing its ability to discover new features associated with the outcome.

First, I’ll compute the BSWiMS models for the original data and for the decorrelated data sets. Each model is estimated on the training set with 20 BSWiMS repeats (NumberofRepeats = 20) and tested on the holdout test set. After that, I’ll test the statistical difference between the ROC curves.

par(op)
par(mfrow=c(1,3))

bm <- BSWiMS.model(class~.,trainSet,NumberofRepeats = 20)

[++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++]…………………………………

bpraw <- predictionStats_binary(cbind(testSet$class,predict(bm,testSet)),"BSWiMS RAW",cex=0.60)

BSWiMS RAW


bmd <- BSWiMS.model(class~.,deTrain,NumberofRepeats = 20)

[+++++++-++++-++-+++++++-++–++++++++++++–+++++++-++-++++++++++++-++++++++++++-++-++++++++++++-+-+-+++++++++++-++++++-+-++++++-++++++++++-++++++++++–++++++++++-+++-++-++++++-++++-+-++++-+-+++–++++++++-++++++++++–+–+++++++++-++++++-++++++-+++-+–++++++-+-++++–]…………………

bpdecor <- predictionStats_binary(cbind(deTest$class,predict(bmd,deTest)),"Outcome-Driven Decorrelation",cex=0.60)

Outcome-Driven Decorrelation


## Comparing the two ROC curves
pander::pander(roc.test(bpdecor$ROC.analysis$roc.predictor,bpraw$ROC.analysis$roc.predictor,alternative = "greater"))

DeLong’s test for two correlated ROC curves: `bpdecor$ROC.analysis$roc.predictor` and `bpraw$ROC.analysis$roc.predictor`
Test statistic	P value	Alternative hypothesis	AUC of roc1	AUC of roc2
2.03	0.0209 *	greater	0.895	0.842

## Comparing the test accuracy
testRaw <- (testSet$class-predict(bm,testSet))<0.5
testDecor <- (deTest$class-predict(bmd,deTest))<0.5
pander::pander(mcnemar.test(testRaw,testDecor))

McNemar’s Chi-squared test with continuity correction: `testRaw` and `testDecor`
Test statistic	df	P value
4.9	1	0.0269 *



bmdU <- BSWiMS.model(class~.,deTrainU,NumberofRepeats = 20)

[+-++–++++++-++—-+++++++-++-++–+++++++-+++—-+++++++-++++–+++—–++++—+++++–+++-++++++–+++-+++—+++++++–++++++-+-+-+-+++++++–+++++++—]……….

bpdecorU <- predictionStats_binary(cbind(deTestU$class,predict(bmdU,deTestU)),"Blind Decorrelation",cex=0.60)

Blind Decorrelation


## Comparing the test curves
pander::pander(roc.test(bpdecorU$ROC.analysis$roc.predictor,bpraw$ROC.analysis$roc.predictor,alternative = "greater"))

DeLong’s test for two correlated ROC curves: `bpdecorU$ROC.analysis$roc.predictor` and `bpraw$ROC.analysis$roc.predictor`
Test statistic	P value	Alternative hypothesis	AUC of roc1	AUC of roc2
2.01	0.0224 *	greater	0.906	0.842

## Comparing the accuracy
testDecorU <- (deTestU$class-predict(bmdU,deTestU))<0.5
pander::pander(mcnemar.test(testRaw,testDecorU))

McNemar’s Chi-squared test with continuity correction: `testRaw` and `testDecorU`
Test statistic	df	P value
1.07	1	0.302


par(op)

1.4 The Feature Associations

I’ll print the graph showing the association between features. Each feature cluster represents a logistic regression formula (formula nugget) discovered by the BSWiMS method. The figure will plot:

Raw formula network
Outcome-driven network
Blind network

The plots will show only features whose formula occurrence is at least 50%, and only feature-to-feature associations of at least 25%.

par(op)
par(mfrow=c(1,3))
### The raw model

pander::pander(nrow(bm$bagging$formulaNetwork))



cmax <- apply(bm$bagging$formulaNetwork,2,max)
cnames <- names(cmax[cmax>=0.5])
cmax <- cmax[cmax>=0.5]
adma <- bm$bagging$formulaNetwork[cnames,cnames]

for (cx in c(1:nrow(namecode)))
{
  cnames <- str_replace_all(cnames,namecode[cx,1],namecode[cx,2])
}
cnames <- str_replace_all(cnames,"_","")
cnames <- str_replace_all(cnames,"th","")
rownames(adma) <- cnames
colnames(adma) <- cnames
names(cmax) <- cnames
adma[adma<0.25] <- 0;
gr <- graph_from_adjacency_matrix(adma,mode = "undirected",diag = FALSE,weighted=TRUE)
gr$layout <- layout_with_fr

fc <- cluster_optimal(gr)
plot(fc, gr,
     vertex.size=20*cmax,
     vertex.label.cex=0.5,
     vertex.label.dist=0,
     main="Original Feature Association")



### The Outcome Driven Model

pander::pander(nrow(bmd$bagging$formulaNetwork))



cmax <- apply(bmd$bagging$formulaNetwork,2,max)
cnames <- names(cmax[cmax>=0.5])
outcomeNames <- cnames

cmax <- cmax[cmax>=0.5]
adma <- bmd$bagging$formulaNetwork[cnames,cnames]

for (cx in c(1:nrow(namecode)))
{
  cnames <- str_replace_all(cnames,namecode[cx,1],namecode[cx,2])
}
cnames <- str_replace_all(cnames,"_","")
cnames <- str_replace_all(cnames,"th","")
rownames(adma) <- cnames
colnames(adma) <- cnames
names(cmax) <- cnames
adma[adma<0.25] <- 0;
gr <- graph_from_adjacency_matrix(adma,mode = "undirected",diag = FALSE,weighted=TRUE)
gr$layout <- layout_with_fr

fc <- cluster_optimal(gr)
clusterOutcome <- fc
clusterOutcome$names <- outcomeNames

plot(fc, gr,
     vertex.size=20*cmax,
     vertex.label.cex=0.5,
     vertex.label.dist=0,
     main="Outcome-Driven Decorrelation")


### The Blind Decorrelation

pander::pander(nrow(bmdU$bagging$formulaNetwork))



cmax <- apply(bmdU$bagging$formulaNetwork,2,max)
cnames <- names(cmax[cmax>=0.5])
cmax <- cmax[cmax>=0.5]
adma <- bmdU$bagging$formulaNetwork[cnames,cnames]

for (cx in c(1:nrow(namecode)))
{
  cnames <- str_replace_all(cnames,namecode[cx,1],namecode[cx,2])
}
cnames <- str_replace_all(cnames,"_","")
cnames <- str_replace_all(cnames,"th","")
rownames(adma) <- cnames
colnames(adma) <- cnames
names(cmax) <- cnames
adma[adma<0.25] <- 0;
gr <- graph_from_adjacency_matrix(adma,mode = "undirected",diag = FALSE,weighted=TRUE)
gr$layout <- layout_with_fr

fc <- cluster_optimal(gr)
plot(fc, gr,
     vertex.size=20*cmax,
     vertex.label.cex=0.5,
     vertex.label.dist=0,
     main="Blind Decorrelation")

1.4.1 Feature Analysis of Models

The analysis of the features required to predict the outcome consists of the following steps:

Analysis of the BSWiMS bagged models using the summary function.
Analysis of the sparse decorrelation matrix (GDSTM).
Analysis of the univariate association of the features in both models.
Report of the new features not found by the analysis of the original data.

par(op)
par(mfrow=c(1,1))
## 1 Get the Model Features
smOriginal <- summary(bm)
rawnames <- rownames(smOriginal$coefficients)

### From Driven Decorrelation
smDecor <- summary(bmd)
decornames <- rownames(smDecor$coefficients)

### From Blind Decorrelation
smDecorU <- summary(bmdU)
decornamesU <- rownames(smDecorU$coefficients)



## 2 Get the decorrelation matrix formulas
dc <- getLatentCoefficients(deTrain)
### 2a Get only the ones that were decorrelated by the decorrelation-based model
deNames_in_dc <- decornames[decornames %in% names(dc)]
selectedlist <- dc[deNames_in_dc]
theDeFormulas <- selectedlist
pander::pander(selectedlist)

La_tqwt_entropy_log_dec_4:
tqwt_entropy_log_dec_1	tqwt_entropy_log_dec_4
-1.09	1

La_tqwt_TKEO_std_dec_28:
tqwt_TKEO_mean_dec_28	tqwt_TKEO_std_dec_28
-0.73	1

La_tqwt_entropy_log_dec_31:
tqwt_entropy_log_dec_31	tqwt_entropy_log_dec_35
1	-0.731

La_tqwt_stdValue_dec_1:
tqwt_entropy_shannon_dec_1	tqwt_entropy_shannon_dec_2	tqwt_stdValue_dec_1	tqwt_stdValue_dec_2
-0.524	0.448	1	-0.849

La_std_12th_delta:
std_MFCC_12th_coef	std_12th_delta
-0.96	1

La_tqwt_energy_dec_5:
tqwt_energy_dec_4	tqwt_energy_dec_5
-0.933	1

La_tqwt_TKEO_mean_dec_17:
tqwt_TKEO_mean_dec_17	tqwt_minValue_dec_17
1	2.31

La_tqwt_kurtosisValue_dec_3:
tqwt_kurtosisValue_dec_2	tqwt_kurtosisValue_dec_3
-0.939	1

La_tqwt_kurtosisValue_dec_31:
tqwt_kurtosisValue_dec_31	tqwt_kurtosisValue_dec_33
1	-0.921

La_locShimmer:
locShimmer	apq3Shimmer
1	-0.957

La_std_MFCC_5th_coef:
std_MFCC_5th_coef	std_5th_delta
1	-0.865

La_tqwt_TKEO_std_dec_17:
tqwt_TKEO_std_dec_17	tqwt_minValue_dec_17
1	2.12

La_tqwt_TKEO_std_dec_32:
tqwt_TKEO_mean_dec_33	tqwt_TKEO_std_dec_32
-0.882	1

La_det_LT_entropy_shannon_1_coef:
det_TKEO_mean_1_coef	det_LT_entropy_shannon_1_coef
-0.781	1

La_tqwt_kurtosisValue_dec_2:
tqwt_kurtosisValue_dec_2	tqwt_kurtosisValue_dec_4
1	-0.917

La_tqwt_maxValue_dec_1:
tqwt_minValue_dec_1	tqwt_maxValue_dec_1	tqwt_skewnessValue_dec_1
0.949	1	-0.0151

La_std_4th_delta:
std_MFCC_4th_coef	std_4th_delta
-0.97	1

La_tqwt_minValue_dec_11:
tqwt_minValue_dec_10	tqwt_minValue_dec_11
-0.989	1

La_tqwt_minValue_dec_20:
tqwt_TKEO_mean_dec_20	tqwt_minValue_dec_20
0.431	1

La_std_10th_delta:
std_MFCC_10th_coef	std_10th_delta
-1.05	1

La_tqwt_minValue_dec_17:
tqwt_entropy_shannon_dec_17	tqwt_minValue_dec_17
0.586	1

La_app_LT_entropy_log_3_coef:
app_LT_entropy_log_1_coef	app_LT_entropy_log_2_coef	app_LT_entropy_log_3_coef
1.59	-2.69	1

La_tqwt_energy_dec_33:
tqwt_energy_dec_31	tqwt_energy_dec_33
-0.884	1

La_std_3rd_delta:
std_MFCC_3rd_coef	std_3rd_delta
-0.95	1

La_std_MFCC_2nd_coef:
std_MFCC_2nd_coef	std_2nd_delta
1	-0.828

La_tqwt_TKEO_std_dec_10:
tqwt_entropy_shannon_dec_9	tqwt_entropy_shannon_dec_10	tqwt_TKEO_std_dec_9	tqwt_TKEO_std_dec_10
0.722	-1.02	-0.716	1

La_tqwt_entropy_log_dec_29:
tqwt_entropy_log_dec_28	tqwt_entropy_log_dec_29
-1.01	1

La_tqwt_kurtosisValue_dec_32:
tqwt_kurtosisValue_dec_32	tqwt_kurtosisValue_dec_33
1	-1.02
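
Each entry above can be read as a linear combination of original features; for example, La_locShimmer = locShimmer - 0.957*apq3Shimmer. The sketch below shows, on hypothetical toy data, how such a combination removes the correlation with the paired feature. The least-squares fit is an assumption made only for this illustration; the document does not state how IDeA() estimates its coefficients.

```r
# Toy data (hypothetical): two strongly correlated features
set.seed(1)
x1 <- rnorm(200)
x2 <- 0.9 * x1 + rnorm(200, sd = 0.3)
toy <- data.frame(x1 = x1, x2 = x2)

# Assumption for illustration only: estimate the coefficient by least squares
beta <- unname(coef(lm(x2 ~ x1, data = toy))["x1"])

# Coefficient vector in the same layout as the printed entries
la_x2_coef <- c(x1 = -beta, x2 = 1)

# The latent variable is the weighted sum of the listed features
La_x2 <- as.numeric(as.matrix(toy[, names(la_x2_coef)]) %*% la_x2_coef)

cor(toy$x1, toy$x2)  # strong correlation before the transformation
cor(toy$x1, La_x2)   # essentially zero afterwards
```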

names(selectedlist) <- NULL
### 2b Get the names of the original features

allDevar <- unique(c(names(unlist(selectedlist)),decornames))
allDevar <- allDevar[!str_detect(allDevar,"La_")]
#allDevar <- str_remove(allDevar,"Ba_")
allDevar <- unique(allDevar)


# The analysis of the blind decorrelation

dcU <- getLatentCoefficients(deTrainU)
### 2a Get only the ones that were decorrelated by the decorrelation-based model
deNames_in_dcU <- decornamesU[decornamesU %in% names(dcU)]
selectedlistU <- dcU[deNames_in_dcU]
pander::pander(selectedlistU)

La_locShimmer:
locShimmer	apq3Shimmer
1	-0.957

La_std_12th_delta:
std_MFCC_12th_coef	std_12th_delta
-0.96	1

La_tqwt_TKEO_mean_dec_33:
tqwt_TKEO_mean_dec_33	tqwt_TKEO_std_dec_32
1	-1.04

La_std_4th_delta:
std_MFCC_4th_coef	std_4th_delta
-0.97	1

La_tqwt_TKEO_std_dec_33:
tqwt_TKEO_std_dec_32	tqwt_TKEO_std_dec_33
-0.986	1

La_std_MFCC_2nd_coef:
std_MFCC_2nd_coef	std_2nd_delta
1	-0.828

La_tqwt_entropy_shannon_dec_33:
tqwt_entropy_shannon_dec_31	tqwt_entropy_shannon_dec_33
-1.07	1

La_tqwt_kurtosisValue_dec_33:
tqwt_kurtosisValue_dec_32	tqwt_kurtosisValue_dec_33
-0.883	1

La_tqwt_TKEO_std_dec_36:
tqwt_entropy_shannon_dec_35	tqwt_TKEO_std_dec_36
-1.07	1

La_std_3rd_delta:
std_MFCC_3rd_coef	std_3rd_delta
-0.95	1

La_std_10th_delta:
std_MFCC_10th_coef	std_10th_delta
-1.05	1

La_tqwt_entropy_log_dec_16:
tqwt_entropy_log_dec_16	tqwt_minValue_dec_17
1	0.39

La_app_LT_TKEO_mean_4_coef:
app_entropy_shannon_8_coef	app_LT_entropy_shannon_8_coef	app_LT_TKEO_mean_4_coef	app_LT_TKEO_std_2_coef
-0.195	0.95	1	-0.969

La_tqwt_TKEO_std_dec_28:
tqwt_TKEO_mean_dec_28	tqwt_TKEO_std_dec_28
-0.73	1

La_tqwt_maxValue_dec_1:
tqwt_minValue_dec_1	tqwt_maxValue_dec_1	tqwt_skewnessValue_dec_1
0.949	1	-0.0151

La_tqwt_TKEO_mean_dec_17:
tqwt_TKEO_mean_dec_17	tqwt_minValue_dec_17
1	2.31

La_tqwt_entropy_shannon_dec_1:
tqwt_entropy_shannon_dec_1	tqwt_entropy_shannon_dec_4
1	-0.669

La_tqwt_TKEO_std_dec_10:
tqwt_entropy_shannon_dec_9	tqwt_entropy_shannon_dec_10	tqwt_TKEO_std_dec_9	tqwt_TKEO_std_dec_10
0.722	-1.02	-0.716	1

La_tqwt_kurtosisValue_dec_4:
tqwt_kurtosisValue_dec_2	tqwt_kurtosisValue_dec_4
-0.86	1

La_tqwt_TKEO_std_dec_17:
tqwt_TKEO_std_dec_17	tqwt_minValue_dec_17
1	2.12

La_minIntensity:
minIntensity	maxIntensity
1	-1.34

names(selectedlistU) <- NULL
### 2b Get the names of the original features

allDevarU <- unique(c(names(unlist(selectedlistU)),decornamesU))
allDevarU <- allDevarU[!str_detect(allDevarU,"La_")]
#allDevarU <- str_remove(allDevarU,"Ba_")
allDevarU <- unique(allDevarU)

pander::pander(c(length(rawnames),length(decornames),length(decornamesU)))

68, 61 and 41

pander::pander(c(length(rawnames),length(allDevar),length(allDevarU)))

68, 88 and 63



### 2c Get only the new features not found in the original analysis
dvar <- allDevar[!(allDevar %in% rawnames)] 

### 2d Get the decorrelated variables that have new features
newvars <- character();
for (cvar in deNames_in_dc)
{
  lvar <- dc[cvar]
  names(lvar) <- NULL
  lvar <- names(unlist(lvar))
  if (length(lvar[lvar %in% dvar]) > 0)
  {
     newvars <- append(newvars,cvar)
  }
}
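
The loop above can also be expressed without explicit appending. The following is a minimal sketch on hypothetical toy inputs (`dc_toy`, `raw_toy`), assuming the latent coefficients are a named list of named numeric vectors, which is how the code in this section treats the output of getLatentCoefficients().

```r
# Hypothetical latent formulas, shaped like the getLatentCoefficients() output
dc_toy <- list(
  La_a = c(f1 = -0.9, a = 1),
  La_b = c(f2 = 1, b = -0.8)
)
raw_toy <- c("f1", "a", "f2")  # hypothetical features of the raw model

# Original features used by the formulas but missing from the raw model
used <- unique(unlist(lapply(dc_toy, names)))
dvar_toy <- setdiff(used, raw_toy)

# Latent variables whose formula includes at least one such feature
has_new <- vapply(dc_toy,
                  function(coefs) any(names(coefs) %in% dvar_toy),
                  logical(1))
newvars_toy <- names(dc_toy)[has_new]
print(newvars_toy)
```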

## 3 Here are the univariate z values of the original set
#pander::pander(bm$univariate[dvar,])
## 4 Here are the univariate z values of the decorrelated set
#pander::pander(bmd$univariate[newvars,])

## 4a The scatter plot of the decorrelated vs original univariate values

zvalueNew <- bmd$univariate[newvars,]
rownames(zvalueNew) <- str_remove(rownames(zvalueNew),"La_")
#rownames(zvalueNew) <- str_remove(rownames(zvalueNew),"Ba_")

zvaluePrePost <- bm$univariate[rownames(zvalueNew),c(1,3)]
zvaluePrePost$Name <- NULL
zvaluePrePost$NewZ <- zvalueNew[rownames(zvaluePrePost),"ZUni"]
pander::pander(zvaluePrePost)

	ZUni	NewZ
tqwt_entropy_log_dec_4	2.628	3.27
tqwt_TKEO_std_dec_28	1.166	3.68
tqwt_entropy_log_dec_31	0.922	4.29
tqwt_stdValue_dec_1	1.354	3.00
std_12th_delta	4.431	4.27
tqwt_energy_dec_5	1.933	3.26
tqwt_TKEO_mean_dec_17	4.905	3.48
tqwt_kurtosisValue_dec_3	0.652	2.94
tqwt_kurtosisValue_dec_31	0.444	3.68
locShimmer	3.390	5.02
std_MFCC_5th_coef	2.208	2.36
tqwt_TKEO_std_dec_17	4.631	3.15
tqwt_TKEO_std_dec_32	1.647	3.77
det_LT_entropy_shannon_1_coef	1.842	3.72
tqwt_kurtosisValue_dec_2	0.341	2.95
tqwt_maxValue_dec_1	4.731	3.28
std_4th_delta	3.943	3.67
tqwt_minValue_dec_11	5.493	3.34
tqwt_minValue_dec_20	1.424	3.03
std_10th_delta	4.761	3.43
tqwt_minValue_dec_17	4.068	1.86
app_LT_entropy_log_3_coef	1.902	3.70
tqwt_energy_dec_33	1.027	5.13
std_3rd_delta	2.899	4.02
std_MFCC_2nd_coef	0.628	4.68
tqwt_TKEO_std_dec_10	4.060	2.56
tqwt_entropy_log_dec_29	1.425	4.17
tqwt_kurtosisValue_dec_32	0.433	3.98

plot(zvaluePrePost,
     xlim=c(-0.5,6.5),
     ylim=c(0,7),
     xlab="Original Z",
     ylab="Decorrelated Z",
     main="Univariate IDI Z Values",
     pch=3,cex=0.5,
     col="red")
abline(v=1.96,col="blue")
abline(h=1.96,col="blue")
text(zvaluePrePost$ZUni,zvaluePrePost$NewZ,rownames(zvaluePrePost),srt=65,cex=0.75)
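
The quadrants of the scatter plot can also be listed by name. The sketch below copies four rows from the table above and uses the same 1.96 cut-off as the blue reference lines; the variable names `z`, `gained` and `lost` are introduced here only for illustration.

```r
# A few rows copied from the table above
z <- data.frame(
  ZUni = c(0.922, 0.628, 4.068, 3.390),
  NewZ = c(4.29, 4.68, 1.86, 5.02),
  row.names = c("tqwt_entropy_log_dec_31", "std_MFCC_2nd_coef",
                "tqwt_minValue_dec_17", "locShimmer")
)
thr <- 1.96  # same threshold as the blue reference lines

# Features that cross the threshold only after decorrelation, and the reverse
gained <- rownames(z)[z$ZUni < thr & z$NewZ >= thr]
lost   <- rownames(z)[z$ZUni >= thr & z$NewZ < thr]
print(gained)
print(lost)
```

Applied to the full zvaluePrePost data frame, the same two expressions would list every feature in the upper-left and lower-right quadrants of the plot.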

1.4.2 The Summary of the Decorrelation-Based Model

Here I print the summary statistics of the logistic models found by BSWiMS using the original and the transformed datasets. After that, I show the characteristics of the features not found by the original analysis.


pander::pander(smOriginal$coefficients)

	Estimate	lower	OR	upper	u.Accuracy	r.Accuracy	full.Accuracy	u.AUC	r.AUC	full.AUC	IDI	NRI	z.IDI	z.NRI	Delta.AUC	Frequency
tqwt_TKEO_mean_dec_13	-0.04922	0.927	0.952	0.977	0.702	0.738	0.701	0.720	0.524	0.720	2.06e-01	0.883	4.65	4.52	0.19564	0.70
tqwt_TKEO_mean_dec_16	-0.03488	0.949	0.966	0.982	0.686	0.729	0.686	0.733	0.527	0.734	2.03e-01	1.010	4.57	5.42	0.20667	0.30
tqwt_TKEO_std_dec_13	-0.01292	0.980	0.987	0.995	0.702	0.723	0.699	0.720	0.561	0.720	1.94e-01	0.845	4.47	4.29	0.15866	0.25
tqwt_TKEO_std_dec_11	-0.01907	0.971	0.981	0.992	0.665	0.601	0.733	0.678	0.624	0.761	1.95e-01	0.887	4.47	4.61	0.13704	0.20
tqwt_meanValue_dec_25	-0.51076	0.438	0.600	0.822	0.748	0.741	0.741	0.500	0.765	0.765	6.82e-13	0.866	4.37	4.80	0.00000	0.95
tqwt_TKEO_mean_dec_12	-0.06032	0.912	0.941	0.972	0.725	0.704	0.725	0.706	0.575	0.721	1.92e-01	0.822	4.36	4.15	0.14616	0.90
tqwt_stdValue_dec_6	-0.01631	0.975	0.984	0.993	0.646	0.644	0.718	0.682	0.663	0.727	1.92e-01	0.968	4.35	5.14	0.06448	0.10
tqwt_entropy_log_dec_11	-0.57558	0.403	0.562	0.784	0.691	0.656	0.731	0.693	0.639	0.749	1.73e-01	0.868	4.16	4.49	0.11037	0.80
tqwt_entropy_shannon_dec_16	-0.04819	0.928	0.953	0.978	0.674	0.724	0.675	0.734	0.530	0.732	1.71e-01	0.927	4.11	4.95	0.20155	0.40
tqwt_minValue_dec_13	0.01272	1.004	1.013	1.021	0.675	0.736	0.676	0.710	0.521	0.710	1.69e-01	0.780	4.09	3.93	0.18932	0.15
tqwt_entropy_shannon_dec_12	-0.04580	0.930	0.955	0.981	0.716	0.695	0.714	0.717	0.612	0.726	1.70e-01	0.787	4.07	3.93	0.11422	0.85
tqwt_stdValue_dec_13	-0.05310	0.920	0.948	0.978	0.691	0.665	0.696	0.718	0.624	0.732	1.63e-01	0.835	4.03	4.25	0.10757	0.55
tqwt_entropy_shannon_dec_13	-0.03065	0.953	0.970	0.987	0.686	0.667	0.687	0.714	0.611	0.725	1.58e-01	0.770	3.96	3.86	0.11413	0.45
tqwt_stdValue_dec_11	-0.01135	0.982	0.989	0.995	0.669	0.712	0.673	0.680	0.565	0.689	1.62e-01	0.680	3.88	3.31	0.12358	0.15
tqwt_TKEO_std_dec_12	-0.05920	0.910	0.943	0.976	0.723	0.701	0.732	0.719	0.639	0.746	1.56e-01	0.797	3.86	4.01	0.10638	1.00
minIntensity	-1.12118	0.168	0.326	0.633	0.606	0.699	0.698	0.682	0.676	0.739	1.54e-01	0.750	3.79	3.87	0.06324	0.95
tqwt_TKEO_std_dec_7	-0.03302	0.949	0.968	0.986	0.638	0.628	0.714	0.676	0.658	0.750	1.50e-01	0.750	3.79	3.72	0.09135	0.45
tqwt_meanValue_dec_10	0.48705	1.199	1.628	2.209	0.748	0.728	0.728	0.500	0.759	0.759	5.95e-12	0.790	3.77	4.33	0.00000	0.15
tqwt_entropy_log_dec_16	-0.18595	0.739	0.830	0.934	0.686	0.689	0.702	0.697	0.611	0.723	1.48e-01	0.760	3.76	3.76	0.11239	1.00
std_6th_delta_delta	1.26571	1.786	3.546	7.039	0.669	0.710	0.748	0.702	0.721	0.774	1.52e-01	0.747	3.75	3.71	0.05303	1.00
std_5th_delta_delta	0.04320	1.019	1.044	1.070	0.588	0.759	0.687	0.635	0.626	0.700	1.52e-01	0.635	3.72	3.14	0.07319	0.10
tqwt_maxValue_dec_7	-0.04908	0.922	0.952	0.983	0.634	0.678	0.710	0.677	0.670	0.730	1.49e-01	0.779	3.67	3.93	0.06007	0.50
std_11th_delta_delta	0.29019	1.101	1.337	1.622	0.686	0.682	0.731	0.706	0.650	0.747	1.42e-01	0.696	3.62	3.43	0.09761	0.90
tqwt_minValue_dec_12	0.25000	1.105	1.284	1.492	0.718	0.669	0.748	0.738	0.698	0.775	1.35e-01	0.853	3.60	4.36	0.07680	1.00
tqwt_minValue_dec_11	0.17136	1.071	1.187	1.316	0.688	0.681	0.734	0.705	0.677	0.763	1.34e-01	0.658	3.57	3.23	0.08559	1.00
std_delta_delta_log_energy	0.25885	1.118	1.295	1.502	0.717	0.708	0.750	0.725	0.726	0.768	1.36e-01	0.688	3.53	3.36	0.04253	1.00
std_Log_energy	0.12746	1.039	1.136	1.242	0.657	0.731	0.721	0.685	0.677	0.724	1.43e-01	0.688	3.52	3.45	0.04627	0.90
std_9th_delta	0.23816	1.088	1.269	1.479	0.663	0.726	0.710	0.704	0.688	0.724	1.40e-01	0.752	3.52	3.79	0.03629	0.80
tqwt_entropy_log_dec_12	-0.44976	0.480	0.638	0.847	0.739	0.678	0.741	0.730	0.703	0.759	1.28e-01	0.843	3.46	4.30	0.05577	1.00
std_delta_log_energy	0.25998	1.111	1.297	1.513	0.707	0.702	0.745	0.714	0.715	0.769	1.29e-01	0.636	3.44	3.09	0.05375	1.00
std_9th_delta_delta	0.48995	1.197	1.632	2.226	0.688	0.704	0.734	0.718	0.714	0.759	1.29e-01	0.723	3.41	3.58	0.04519	1.00
tqwt_kurtosisValue_dec_28	-0.02222	0.965	0.978	0.992	0.676	0.665	0.720	0.671	0.672	0.737	1.29e-01	0.705	3.40	3.46	0.06535	0.40
std_6th_delta	0.76070	1.346	2.140	3.401	0.634	0.714	0.736	0.666	0.719	0.763	1.27e-01	0.636	3.39	3.08	0.04449	1.00
tqwt_stdValue_dec_12	-0.05717	0.910	0.944	0.980	0.726	0.655	0.719	0.721	0.686	0.740	1.24e-01	0.738	3.39	3.63	0.05334	0.95
mean_MFCC_3rd_coef	0.00236	1.001	1.002	1.004	0.626	0.658	0.717	0.645	0.686	0.727	1.28e-01	0.634	3.36	3.06	0.04090	0.20
std_7th_delta	0.37797	1.148	1.459	1.855	0.671	0.708	0.717	0.707	0.706	0.739	1.27e-01	0.820	3.33	4.18	0.03266	1.00
tqwt_kurtosisValue_dec_18	0.15096	1.053	1.163	1.284	0.652	0.679	0.697	0.681	0.653	0.720	1.24e-01	0.693	3.32	3.40	0.06683	0.55
std_8th_delta	0.35184	1.125	1.422	1.797	0.661	0.727	0.731	0.686	0.718	0.750	1.26e-01	0.686	3.25	3.36	0.03271	1.00
mean_MFCC_2nd_coef	0.00314	1.001	1.003	1.005	0.761	0.648	0.715	0.654	0.688	0.722	1.15e-01	0.626	3.23	3.38	0.03319	0.25
tqwt_maxValue_dec_11	-0.17309	0.754	0.841	0.939	0.689	0.693	0.745	0.704	0.712	0.774	1.07e-01	0.566	3.23	2.74	0.06261	1.00
std_8th_delta_delta	0.31912	1.119	1.376	1.692	0.684	0.708	0.720	0.711	0.720	0.745	1.21e-01	0.721	3.21	3.58	0.02592	1.00
tqwt_entropy_log_dec_13	-0.15799	0.765	0.854	0.953	0.711	0.650	0.708	0.716	0.678	0.734	1.09e-01	0.739	3.13	3.66	0.05530	1.00
tqwt_meanValue_dec_18	-0.28498	0.608	0.752	0.930	0.748	0.746	0.746	0.500	0.772	0.772	3.46e-13	0.655	3.10	3.40	0.00000	0.90
std_7th_delta_delta	0.31081	1.109	1.365	1.679	0.690	0.713	0.731	0.718	0.727	0.748	1.14e-01	0.691	3.09	3.41	0.02109	1.00
std_MFCC_6th_coef	0.37724	1.120	1.458	1.898	0.580	0.683	0.738	0.604	0.693	0.761	1.08e-01	0.528	3.08	2.53	0.06795	0.55
IMF_SNR_SEO	0.01885	1.006	1.019	1.032	0.611	0.692	0.741	0.606	0.721	0.758	1.11e-01	0.736	3.07	3.63	0.03729	0.15
std_MFCC_8th_coef	0.27809	1.093	1.321	1.596	0.652	0.702	0.699	0.674	0.680	0.729	1.09e-01	0.669	3.04	3.25	0.04907	0.85
tqwt_kurtosisValue_dec_27	-0.00626	0.990	0.994	0.998	0.690	0.674	0.697	0.666	0.617	0.702	1.04e-01	0.540	3.01	2.59	0.08437	0.15
tqwt_maxValue_dec_12	-0.10713	0.835	0.898	0.966	0.713	0.710	0.736	0.733	0.723	0.759	9.32e-02	0.574	3.00	2.74	0.03641	1.00
tqwt_energy_dec_6	-0.00469	0.992	0.995	0.999	0.600	0.668	0.724	0.634	0.701	0.750	9.68e-02	0.691	2.98	3.38	0.04878	0.10
tqwt_energy_dec_12	-0.01750	0.971	0.983	0.995	0.672	0.644	0.717	0.673	0.687	0.739	9.14e-02	0.623	2.94	3.00	0.05150	0.45
std_10th_delta_delta	0.20112	1.055	1.223	1.417	0.678	0.709	0.715	0.696	0.711	0.741	1.02e-01	0.573	2.94	2.74	0.03040	1.00
std_10th_delta	0.06796	1.018	1.070	1.125	0.656	0.699	0.716	0.681	0.695	0.723	1.01e-01	0.488	2.85	2.34	0.02796	0.35
tqwt_maxValue_dec_10	-0.00289	0.995	0.997	0.999	0.651	0.650	0.684	0.689	0.679	0.719	8.77e-02	0.569	2.84	2.72	0.04023	0.10
tqwt_entropy_log_dec_35	-0.02821	0.953	0.972	0.992	0.661	0.684	0.720	0.645	0.687	0.732	9.47e-02	0.683	2.82	3.35	0.04546	0.10
locAbsJitter	0.01135	1.003	1.011	1.020	0.631	0.665	0.671	0.655	0.694	0.696	9.42e-02	0.534	2.82	2.54	0.00246	0.15
tqwt_kurtosisValue_dec_26	-0.07301	0.881	0.930	0.980	0.760	0.659	0.723	0.661	0.700	0.734	9.17e-02	0.581	2.79	3.02	0.03390	0.95
IMF_NSR_TKEO	-0.06870	0.883	0.934	0.987	0.620	0.729	0.768	0.593	0.742	0.782	8.49e-02	0.497	2.78	2.39	0.03979	0.40
mean_delta_log_energy	-0.00972	0.982	0.990	0.998	0.742	0.710	0.753	0.658	0.728	0.764	8.28e-02	0.625	2.74	3.24	0.03611	1.00
std_5th_delta	0.29966	1.082	1.349	1.684	0.620	0.686	0.704	0.657	0.710	0.742	8.96e-02	0.494	2.73	2.38	0.03220	0.90
tqwt_meanValue_dec_11	0.07639	1.020	1.079	1.143	0.748	0.724	0.724	0.500	0.743	0.743	-3.34e-13	0.557	2.70	2.91	0.00000	0.65
tqwt_kurtosisValue_dec_20	0.24883	1.050	1.283	1.566	0.664	0.671	0.722	0.686	0.718	0.754	8.35e-02	0.675	2.69	3.29	0.03562	1.00
tqwt_kurtosisValue_dec_36	0.05679	1.013	1.058	1.106	0.679	0.714	0.761	0.708	0.726	0.773	8.28e-02	0.721	2.65	3.58	0.04631	1.00
tqwt_entropy_log_dec_34	-0.03346	0.944	0.967	0.991	0.674	0.689	0.727	0.662	0.693	0.740	8.11e-02	0.617	2.64	2.97	0.04619	0.15
tqwt_kurtosisValue_dec_34	0.02517	1.006	1.025	1.046	0.617	0.684	0.730	0.640	0.695	0.747	7.45e-02	0.580	2.59	2.78	0.05202	0.35
apq11Shimmer	0.00313	1.001	1.003	1.006	0.649	0.684	0.700	0.658	0.701	0.715	7.58e-02	0.671	2.52	3.25	0.01388	0.10
tqwt_kurtosisValue_dec_1	-0.00556	0.990	0.994	0.999	0.623	0.757	0.787	0.620	0.766	0.790	5.77e-02	0.545	2.43	2.60	0.02452	0.10
tqwt_kurtosisValue_dec_35	0.08600	1.015	1.090	1.170	0.644	0.706	0.744	0.668	0.718	0.759	6.80e-02	0.596	2.42	2.88	0.04069	1.00
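
In the table above, the OR column is consistent with the exponential of the Estimate column, as expected for the odds ratio of a logistic coefficient. A quick check on three rows copied from the table (the variable names are introduced here only for illustration):

```r
# Estimate and OR values copied from three rows of the table above
est <- c(minIntensity = -1.12118,
         std_6th_delta_delta = 1.26571,
         tqwt_meanValue_dec_25 = -0.51076)
or_reported <- c(0.326, 3.546, 0.600)

# Odds ratio implied by each logistic coefficient
or_from_est <- exp(est)
print(round(or_from_est, 3))
```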


pander::pander(smDecor$coefficients)

	Estimate	lower	OR	upper	u.Accuracy	r.Accuracy	full.Accuracy	u.AUC	r.AUC	full.AUC	IDI	NRI	z.IDI	z.NRI	Delta.AUC	Frequency
mean_MFCC_2nd_coef	0.0168	1.01e+00	1.01696	1.026	0.761	0.674	0.774	0.654	0.689	0.773	1.64e-01	0.718	4.11	3.848	0.08480	0.50
tqwt_meanValue_dec_11	-0.0262	9.60e-01	0.97410	0.989	0.748	0.744	0.744	0.500	0.753	0.753	1.33e-14	0.737	4.02	3.999	0.00000	0.15
tqwt_maxValue_dec_12	-0.3024	6.32e-01	0.73901	0.865	0.714	0.715	0.780	0.733	0.719	0.794	1.51e-01	0.729	4.00	3.603	0.07494	1.00
mean_MFCC_3rd_coef	0.0255	1.01e+00	1.02587	1.040	0.626	0.700	0.780	0.645	0.712	0.786	1.63e-01	0.696	3.99	3.422	0.07427	0.55
La_tqwt_entropy_log_dec_4	-2.4537	2.09e-02	0.08598	0.354	0.615	0.696	0.780	0.624	0.700	0.784	1.57e-01	0.833	3.94	4.213	0.08343	0.50
std_MFCC_6th_coef	1.0443	1.54e+00	2.84154	5.234	0.577	0.733	0.773	0.604	0.719	0.777	1.56e-01	0.764	3.91	3.911	0.05744	0.60
std_delta_delta_log_energy	0.5652	1.31e+00	1.75978	2.361	0.716	0.749	0.801	0.725	0.745	0.798	1.55e-01	0.797	3.87	3.985	0.05296	1.00
tqwt_energy_dec_6	-0.1151	8.34e-01	0.89131	0.952	0.598	0.692	0.762	0.633	0.704	0.774	1.44e-01	0.781	3.72	3.912	0.07002	0.55
tqwt_kurtosisValue_dec_20	0.9820	1.47e+00	2.66987	4.833	0.663	0.704	0.763	0.686	0.717	0.783	1.41e-01	0.885	3.67	4.560	0.06649	0.90
tqwt_entropy_log_dec_11	-0.7823	2.86e-01	0.45733	0.732	0.691	0.679	0.744	0.693	0.689	0.762	1.35e-01	0.827	3.62	4.184	0.07386	1.00
La_tqwt_TKEO_std_dec_28	-0.0827	8.76e-01	0.92066	0.967	0.632	0.678	0.731	0.625	0.683	0.736	1.36e-01	0.612	3.50	2.949	0.05358	0.35
La_tqwt_entropy_log_dec_31	0.6173	1.27e+00	1.85390	2.707	0.613	0.657	0.747	0.637	0.672	0.749	1.29e-01	0.790	3.49	4.007	0.07727	0.50
apq11Shimmer	0.1937	1.09e+00	1.21368	1.356	0.656	0.669	0.733	0.662	0.668	0.746	1.26e-01	0.749	3.48	3.696	0.07837	0.50
tqwt_entropy_log_dec_35	-0.6965	3.18e-01	0.49834	0.781	0.662	0.707	0.769	0.648	0.716	0.776	1.25e-01	0.742	3.46	3.673	0.05982	0.85
f2	-0.1130	8.31e-01	0.89313	0.960	0.605	0.698	0.749	0.592	0.697	0.764	1.23e-01	0.540	3.44	2.570	0.06644	0.10
tqwt_kurtosisValue_dec_18	0.8663	1.37e+00	2.37798	4.124	0.652	0.720	0.757	0.676	0.722	0.769	1.29e-01	0.748	3.43	3.717	0.04708	0.80
tqwt_meanValue_dec_25	-0.3364	5.76e-01	0.71435	0.886	0.748	0.760	0.760	0.500	0.768	0.768	2.67e-13	0.787	3.41	4.100	0.00000	0.50
tqwt_entropy_log_dec_16	-0.3471	5.71e-01	0.70670	0.875	0.687	0.708	0.761	0.697	0.706	0.775	1.20e-01	0.775	3.37	3.846	0.06949	0.95
tqwt_energy_dec_11	-0.0153	9.76e-01	0.98478	0.994	0.659	0.678	0.731	0.649	0.669	0.751	1.25e-01	0.721	3.35	3.553	0.08157	0.10
tqwt_kurtosisValue_dec_33	0.0829	1.03e+00	1.08641	1.143	0.534	0.670	0.742	0.583	0.675	0.743	1.19e-01	0.800	3.33	4.030	0.06842	0.30
La_tqwt_stdValue_dec_1	-0.5159	4.27e-01	0.59695	0.835	0.674	0.627	0.713	0.628	0.644	0.721	1.12e-01	0.659	3.29	3.244	0.07729	0.15
La_std_12th_delta	1.7791	1.87e+00	5.92477	18.723	0.669	0.725	0.766	0.672	0.727	0.774	1.20e-01	0.639	3.29	3.113	0.04639	1.00
tqwt_energy_dec_12	-0.0666	8.95e-01	0.93561	0.978	0.672	0.696	0.759	0.674	0.708	0.774	1.08e-01	0.679	3.28	3.316	0.06552	0.65
La_tqwt_energy_dec_5	-0.1450	7.89e-01	0.86505	0.949	0.609	0.717	0.764	0.635	0.715	0.777	1.16e-01	0.806	3.27	4.074	0.06184	0.25
La_tqwt_TKEO_mean_dec_17	-0.2082	7.10e-01	0.81206	0.929	0.643	0.693	0.732	0.671	0.692	0.746	1.20e-01	0.759	3.26	3.778	0.05400	0.40
La_tqwt_kurtosisValue_dec_3	0.1117	1.04e+00	1.11817	1.201	0.708	0.666	0.740	0.650	0.687	0.757	1.12e-01	0.843	3.25	4.363	0.06985	0.15
La_tqwt_kurtosisValue_dec_31	-0.4262	4.92e-01	0.65301	0.866	0.637	0.725	0.770	0.611	0.750	0.787	1.16e-01	0.681	3.25	3.373	0.03735	1.00
VFER_SNR_TKEO	0.0330	1.01e+00	1.03356	1.056	0.685	0.727	0.774	0.688	0.735	0.779	1.12e-01	0.797	3.24	4.010	0.04388	0.15
La_locShimmer	1.5022	1.64e+00	4.49134	12.337	0.664	0.744	0.783	0.704	0.757	0.797	7.16e-02	0.678	3.15	3.373	0.03927	1.00
std_11th_delta	0.3676	1.13e+00	1.44423	1.851	0.636	0.706	0.745	0.657	0.709	0.758	1.12e-01	0.668	3.15	3.273	0.04872	0.70
La_std_MFCC_5th_coef	-1.0723	1.72e-01	0.34221	0.679	0.599	0.764	0.798	0.573	0.757	0.804	1.04e-01	0.628	3.14	3.087	0.04727	0.15
tqwt_entropy_shannon_dec_36	-0.0271	9.56e-01	0.97325	0.990	0.679	0.701	0.729	0.662	0.716	0.744	1.12e-01	0.671	3.13	3.295	0.02761	0.30
La_tqwt_TKEO_std_dec_17	-0.4358	4.67e-01	0.64673	0.896	0.621	0.708	0.751	0.626	0.714	0.765	1.09e-01	0.636	3.12	3.124	0.05068	0.50
La_tqwt_TKEO_std_dec_32	0.2590	1.08e+00	1.29565	1.547	0.624	0.722	0.773	0.619	0.740	0.776	1.06e-01	0.486	3.11	2.324	0.03565	1.00
La_det_LT_entropy_shannon_1_coef	0.0876	1.02e+00	1.09160	1.164	0.634	0.685	0.742	0.637	0.713	0.757	1.01e-01	0.668	3.06	3.258	0.04435	0.40
La_tqwt_kurtosisValue_dec_2	-0.1319	8.03e-01	0.87640	0.957	0.686	0.720	0.771	0.641	0.724	0.779	1.03e-01	0.694	3.04	3.446	0.05500	0.35
La_tqwt_maxValue_dec_1	-0.7541	2.85e-01	0.47042	0.777	0.605	0.697	0.745	0.598	0.708	0.757	1.02e-01	0.523	3.04	2.486	0.04931	0.55
tqwt_kurtosisValue_dec_28	-0.0420	9.31e-01	0.95889	0.988	0.678	0.678	0.720	0.672	0.688	0.740	1.06e-01	0.678	3.04	3.296	0.05240	0.65
tqwt_meanValue_dec_18	-0.3003	5.89e-01	0.74057	0.931	0.748	0.767	0.767	0.500	0.777	0.777	2.78e-13	0.771	2.99	3.953	0.00000	0.60
std_MFCC_8th_coef	0.5357	1.16e+00	1.70867	2.516	0.653	0.690	0.740	0.674	0.715	0.756	1.03e-01	0.654	2.97	3.184	0.04117	0.85
std_5th_delta	0.4696	1.16e+00	1.59940	2.200	0.622	0.707	0.726	0.659	0.706	0.750	1.04e-01	0.573	2.97	2.770	0.04388	0.70
locAbsJitter	0.0512	1.02e+00	1.05251	1.090	0.632	0.730	0.777	0.654	0.726	0.780	9.26e-02	0.550	2.94	2.628	0.05406	0.20
La_std_4th_delta	0.9385	1.33e+00	2.55616	4.906	0.692	0.712	0.758	0.686	0.723	0.772	9.72e-02	0.732	2.92	3.625	0.04916	0.80
La_tqwt_minValue_dec_11	0.0812	1.03e+00	1.08454	1.144	0.612	0.676	0.734	0.628	0.675	0.741	9.19e-02	0.528	2.86	2.522	0.06641	0.15
La_tqwt_minValue_dec_20	-0.1514	7.77e-01	0.85950	0.951	0.624	0.695	0.734	0.619	0.701	0.741	9.34e-02	0.578	2.85	2.752	0.04080	0.15
IMF_SNR_entropy	0.0323	1.01e+00	1.03284	1.057	0.608	0.727	0.761	0.607	0.728	0.775	8.94e-02	0.621	2.85	3.014	0.04693	0.15
tqwt_kurtosisValue_dec_35	0.1116	1.03e+00	1.11804	1.214	0.641	0.716	0.766	0.668	0.730	0.783	8.13e-02	0.626	2.79	3.028	0.05306	0.95
f1	-0.2683	6.24e-01	0.76464	0.936	0.599	0.727	0.758	0.598	0.736	0.764	8.67e-02	0.440	2.79	2.094	0.02858	0.40
La_std_10th_delta	1.1297	1.34e+00	3.09474	7.143	0.678	0.729	0.761	0.660	0.730	0.764	8.75e-02	0.671	2.74	3.287	0.03438	0.70
La_tqwt_minValue_dec_17	-0.2422	6.51e-01	0.78491	0.947	0.546	0.749	0.754	0.564	0.755	0.775	8.46e-02	0.719	2.65	3.550	0.02018	0.15
La_app_LT_entropy_log_3_coef	-6.8711	4.42e-06	0.00104	0.244	0.607	0.740	0.767	0.618	0.753	0.774	7.38e-02	0.549	2.62	2.675	0.02066	0.10
minIntensity	-2.3412	1.70e-02	0.09621	0.544	0.611	0.751	0.761	0.682	0.757	0.780	8.36e-02	0.567	2.59	2.834	0.02276	1.00
La_tqwt_energy_dec_33	-0.2641	6.26e-01	0.76791	0.942	0.748	0.776	0.801	0.736	0.772	0.798	6.82e-02	0.523	2.56	2.500	0.02660	1.00
mean_delta_log_energy	-0.0201	9.65e-01	0.98012	0.996	0.742	0.748	0.774	0.658	0.763	0.783	7.49e-02	0.639	2.54	3.306	0.01965	0.90
La_std_3rd_delta	0.7956	1.20e+00	2.21574	4.085	0.729	0.742	0.775	0.712	0.755	0.783	7.35e-02	0.643	2.54	3.157	0.02783	1.00
IMF_NSR_TKEO	-0.0308	9.46e-01	0.96963	0.994	0.620	0.738	0.761	0.592	0.758	0.767	7.25e-02	0.512	2.51	2.454	0.00896	0.15
numPulses	-0.0298	9.50e-01	0.97065	0.992	0.632	0.732	0.732	0.644	0.682	0.747	6.94e-02	0.592	2.35	2.838	0.06504	0.10
La_std_MFCC_2nd_coef	-1.5069	6.06e-02	0.22159	0.810	0.703	0.766	0.801	0.686	0.776	0.798	6.50e-02	0.635	2.26	3.071	0.02282	1.00
La_tqwt_TKEO_std_dec_10	-0.3678	5.06e-01	0.69225	0.947	0.628	0.746	0.773	0.620	0.744	0.774	6.06e-02	0.627	2.26	3.076	0.02981	0.40
La_tqwt_entropy_log_dec_29	-0.0875	8.49e-01	0.91618	0.989	0.515	0.694	0.728	0.649	0.705	0.746	5.40e-02	0.630	2.23	3.493	0.04116	0.25
La_tqwt_kurtosisValue_dec_32	-0.4351	4.41e-01	0.64723	0.949	0.617	0.756	0.784	0.606	0.774	0.796	5.39e-02	0.162	2.11	0.745	0.02186	1.00


pander::pander(smDecorU$coefficients)

	Estimate	lower	OR	upper	u.Accuracy	r.Accuracy	full.Accuracy	u.AUC	r.AUC	full.AUC	IDI	NRI	z.IDI	z.NRI	Delta.AUC	Frequency
tqwt_entropy_shannon_dec_11	-0.2508	6.78e-01	7.78e-01	8.93e-01	0.667	0.719	0.782	0.683	0.728	0.793	1.44e-01	0.729	3.91	3.59	0.06517	0.80
La_locShimmer	3.3361	3.80e+00	2.81e+01	2.08e+02	0.663	0.708	0.782	0.704	0.718	0.793	1.05e-01	0.813	3.78	4.14	0.07484	0.85
La_std_12th_delta	3.3775	4.48e+00	2.93e+01	1.92e+02	0.667	0.689	0.780	0.672	0.688	0.785	1.45e-01	0.765	3.78	3.82	0.09675	0.65
tqwt_TKEO_mean_dec_7	-0.0213	9.67e-01	9.79e-01	9.91e-01	0.624	0.676	0.743	0.661	0.682	0.757	1.44e-01	0.700	3.75	3.43	0.07532	0.10
La_tqwt_TKEO_mean_dec_33	-0.5643	3.95e-01	5.69e-01	8.19e-01	0.631	0.711	0.786	0.626	0.718	0.787	1.22e-01	0.517	3.41	2.46	0.06904	0.85
std_5th_delta	0.6315	1.26e+00	1.88e+00	2.80e+00	0.622	0.725	0.755	0.661	0.733	0.768	1.26e-01	0.704	3.33	3.53	0.03575	0.40
tqwt_kurtosisValue_dec_28	-0.0460	9.26e-01	9.55e-01	9.85e-01	0.678	0.722	0.735	0.672	0.668	0.741	1.26e-01	0.706	3.33	3.46	0.07301	0.30
mean_MFCC_2nd_coef	0.0116	1.00e+00	1.01e+00	1.02e+00	0.761	0.709	0.766	0.654	0.711	0.768	1.19e-01	0.645	3.33	3.50	0.05701	0.30
tqwt_kurtosisValue_dec_36	0.2019	1.07e+00	1.22e+00	1.39e+00	0.679	0.707	0.786	0.708	0.717	0.790	1.18e-01	0.700	3.33	3.46	0.07323	0.90
La_std_4th_delta	1.2987	1.58e+00	3.66e+00	8.50e+00	0.692	0.738	0.766	0.689	0.733	0.772	1.16e-01	0.804	3.28	4.04	0.03942	0.55
tqwt_energy_dec_12	-0.0709	8.88e-01	9.32e-01	9.77e-01	0.672	0.721	0.779	0.674	0.734	0.796	9.95e-02	0.662	3.27	3.23	0.06243	0.40
La_tqwt_TKEO_std_dec_33	-0.0809	8.71e-01	9.22e-01	9.77e-01	0.615	0.744	0.785	0.647	0.740	0.788	9.97e-02	0.582	3.26	2.79	0.04736	0.15
La_std_MFCC_2nd_coef	-4.2406	1.01e-03	1.44e-02	2.05e-01	0.703	0.779	0.840	0.686	0.776	0.838	1.13e-01	0.660	3.26	3.21	0.06262	1.00
La_tqwt_entropy_shannon_dec_33	-0.3589	5.44e-01	6.98e-01	8.97e-01	0.664	0.717	0.786	0.680	0.735	0.789	9.29e-02	0.486	3.07	2.31	0.05438	1.00
La_tqwt_kurtosisValue_dec_33	2.2832	2.18e+00	9.81e+00	4.41e+01	0.693	0.776	0.833	0.680	0.776	0.832	9.06e-02	0.584	3.06	2.82	0.05571	1.00
La_tqwt_TKEO_std_dec_36	0.2751	1.08e+00	1.32e+00	1.60e+00	0.618	0.730	0.780	0.591	0.745	0.791	9.02e-02	0.604	3.02	2.95	0.04554	0.50
La_std_3rd_delta	1.1897	1.45e+00	3.29e+00	7.47e+00	0.729	0.739	0.770	0.712	0.724	0.777	9.93e-02	0.713	2.98	3.55	0.05292	0.60
IMF_NSR_TKEO	-0.0707	8.85e-01	9.32e-01	9.81e-01	0.619	0.708	0.762	0.592	0.709	0.770	9.69e-02	0.512	2.97	2.49	0.06172	0.15
std_MFCC_8th_coef	1.2268	1.49e+00	3.41e+00	7.80e+00	0.654	0.757	0.778	0.675	0.757	0.790	1.01e-01	0.697	2.92	3.46	0.03342	0.65
std_delta_log_energy	1.1784	1.44e+00	3.25e+00	7.31e+00	0.708	0.795	0.840	0.712	0.795	0.838	9.37e-02	0.653	2.91	3.19	0.04337	1.00
La_std_10th_delta	0.2471	1.07e+00	1.28e+00	1.53e+00	0.674	0.710	0.753	0.654	0.704	0.761	9.16e-02	0.662	2.89	3.23	0.05700	0.10
tqwt_meanValue_dec_18	-0.1582	7.49e-01	8.54e-01	9.73e-01	0.737	0.788	0.788	0.500	0.799	0.799	1.04e-13	0.788	2.84	4.08	0.00000	0.45
La_tqwt_entropy_log_dec_16	-0.1368	7.87e-01	8.72e-01	9.67e-01	0.683	0.698	0.738	0.664	0.722	0.751	9.50e-02	0.737	2.81	3.64	0.02922	0.10
La_app_LT_TKEO_mean_4_coef	-23.0048	7.56e-18	1.02e-10	1.38e-03	0.711	0.731	0.777	0.688	0.742	0.787	9.17e-02	0.665	2.80	3.29	0.04514	0.45
maxIntensity	-0.5928	3.58e-01	5.53e-01	8.53e-01	0.602	0.719	0.748	0.685	0.739	0.775	8.36e-02	0.695	2.77	3.57	0.03532	0.15
La_tqwt_TKEO_std_dec_28	-0.0654	8.94e-01	9.37e-01	9.81e-01	0.630	0.726	0.756	0.625	0.734	0.767	8.91e-02	0.498	2.76	2.37	0.03266	0.15
tqwt_meanValue_dec_25	-0.2819	5.95e-01	7.54e-01	9.56e-01	0.748	0.772	0.772	0.500	0.774	0.774	3.08e-13	0.670	2.73	3.41	0.00000	0.25
mean_delta_log_energy	-0.0368	9.37e-01	9.64e-01	9.91e-01	0.742	0.727	0.798	0.658	0.742	0.805	7.88e-02	0.613	2.73	3.13	0.06359	0.60
tqwt_meanValue_dec_11	-0.1927	7.21e-01	8.25e-01	9.44e-01	0.748	0.774	0.774	0.500	0.772	0.772	3.20e-14	0.500	2.72	2.54	0.00000	0.20
tqwt_kurtosisValue_dec_20	2.0598	1.72e+00	7.84e+00	3.57e+01	0.672	0.798	0.824	0.689	0.793	0.827	8.04e-02	0.746	2.69	3.72	0.03423	0.90
std_11th_delta	0.0785	1.01e+00	1.08e+00	1.15e+00	0.633	0.733	0.748	0.661	0.735	0.753	8.18e-02	0.601	2.61	2.92	0.01833	0.15
La_tqwt_maxValue_dec_1	-0.1033	8.31e-01	9.02e-01	9.79e-01	0.605	0.708	0.748	0.598	0.703	0.750	7.45e-02	0.452	2.61	2.13	0.04663	0.10
La_tqwt_TKEO_mean_dec_17	-0.2284	6.65e-01	7.96e-01	9.52e-01	0.644	0.744	0.770	0.670	0.753	0.778	8.44e-02	0.748	2.60	3.75	0.02445	0.25
La_tqwt_entropy_shannon_dec_1	0.0796	1.02e+00	1.08e+00	1.15e+00	0.590	0.765	0.790	0.563	0.775	0.794	7.15e-02	0.509	2.52	2.41	0.01919	0.10
La_tqwt_TKEO_std_dec_10	-0.2455	6.27e-01	7.82e-01	9.77e-01	0.619	0.737	0.771	0.614	0.748	0.779	6.70e-02	0.572	2.45	2.74	0.03104	0.20
La_tqwt_kurtosisValue_dec_4	0.0423	1.01e+00	1.04e+00	1.08e+00	0.665	0.757	0.781	0.635	0.772	0.799	6.03e-02	0.470	2.30	2.25	0.02677	0.15
VFER_SNR_TKEO	0.0249	1.00e+00	1.03e+00	1.05e+00	0.683	0.750	0.780	0.685	0.766	0.794	5.52e-02	0.662	2.21	3.19	0.02841	0.10
f1	-0.0929	8.40e-01	9.11e-01	9.89e-01	0.603	0.726	0.776	0.598	0.737	0.774	5.51e-02	0.241	2.14	1.11	0.03775	0.10
tqwt_kurtosisValue_dec_17	0.0753	1.01e+00	1.08e+00	1.16e+00	0.635	0.785	0.798	0.676	0.788	0.807	5.14e-02	0.425	2.09	2.02	0.01890	0.10
La_tqwt_TKEO_std_dec_17	-0.1817	6.99e-01	8.34e-01	9.95e-01	0.624	0.758	0.765	0.627	0.763	0.770	5.54e-02	0.581	1.98	2.81	0.00738	0.25
La_minIntensity	-1.3346	6.85e-02	2.63e-01	1.01e+00	0.578	0.734	0.746	0.628	0.736	0.757	4.74e-02	0.412	1.85	1.97	0.02099	0.15


## Let's focus on the new features

decorCoeff <- smDecor$coefficients[newvars,];
ncoef <- dc[newvars]
cnames <- lapply(ncoef,names)
names(cnames) <- NULL;
decorCoeff$Elements <- lapply(cnames,paste,collapse="+")
pander::pander(decorCoeff)

	Estimate	lower	OR	upper	u.Accuracy	r.Accuracy	full.Accuracy	u.AUC	r.AUC	full.AUC	IDI	NRI	z.IDI	z.NRI	Delta.AUC	Frequency	Elements
La_tqwt_entropy_log_dec_4	-2.4537	2.09e-02	0.08598	0.354	0.615	0.696	0.780	0.624	0.700	0.784	0.1573	0.833	3.94	4.213	0.0834	0.50	tqwt_entropy_log_dec_1+tqwt_entropy_log_dec_4
La_tqwt_TKEO_std_dec_28	-0.0827	8.76e-01	0.92066	0.967	0.632	0.678	0.731	0.625	0.683	0.736	0.1358	0.612	3.50	2.949	0.0536	0.35	tqwt_TKEO_mean_dec_28+tqwt_TKEO_std_dec_28
La_tqwt_entropy_log_dec_31	0.6173	1.27e+00	1.85390	2.707	0.613	0.657	0.747	0.637	0.672	0.749	0.1287	0.790	3.49	4.007	0.0773	0.50	tqwt_entropy_log_dec_31+tqwt_entropy_log_dec_35
La_tqwt_stdValue_dec_1	-0.5159	4.27e-01	0.59695	0.835	0.674	0.627	0.713	0.628	0.644	0.721	0.1116	0.659	3.29	3.244	0.0773	0.15	tqwt_entropy_shannon_dec_1+tqwt_entropy_shannon_dec_2+tqwt_stdValue_dec_1+tqwt_stdValue_dec_2
La_std_12th_delta	1.7791	1.87e+00	5.92477	18.723	0.669	0.725	0.766	0.672	0.727	0.774	0.1199	0.639	3.29	3.113	0.0464	1.00	std_MFCC_12th_coef+std_12th_delta
La_tqwt_energy_dec_5	-0.1450	7.89e-01	0.86505	0.949	0.609	0.717	0.764	0.635	0.715	0.777	0.1156	0.806	3.27	4.074	0.0618	0.25	tqwt_energy_dec_4+tqwt_energy_dec_5
La_tqwt_TKEO_mean_dec_17	-0.2082	7.10e-01	0.81206	0.929	0.643	0.693	0.732	0.671	0.692	0.746	0.1198	0.759	3.26	3.778	0.0540	0.40	tqwt_TKEO_mean_dec_17+tqwt_minValue_dec_17
La_tqwt_kurtosisValue_dec_3	0.1117	1.04e+00	1.11817	1.201	0.708	0.666	0.740	0.650	0.687	0.757	0.1118	0.843	3.25	4.363	0.0698	0.15	tqwt_kurtosisValue_dec_2+tqwt_kurtosisValue_dec_3
La_tqwt_kurtosisValue_dec_31	-0.4262	4.92e-01	0.65301	0.866	0.637	0.725	0.770	0.611	0.750	0.787	0.1163	0.681	3.25	3.373	0.0373	1.00	tqwt_kurtosisValue_dec_31+tqwt_kurtosisValue_dec_33
La_locShimmer	1.5022	1.64e+00	4.49134	12.337	0.664	0.744	0.783	0.704	0.757	0.797	0.0716	0.678	3.15	3.373	0.0393	1.00	locShimmer+apq3Shimmer
La_std_MFCC_5th_coef	-1.0723	1.72e-01	0.34221	0.679	0.599	0.764	0.798	0.573	0.757	0.804	0.1042	0.628	3.14	3.087	0.0473	0.15	std_MFCC_5th_coef+std_5th_delta
La_tqwt_TKEO_std_dec_17	-0.4358	4.67e-01	0.64673	0.896	0.621	0.708	0.751	0.626	0.714	0.765	0.1092	0.636	3.12	3.124	0.0507	0.50	tqwt_TKEO_std_dec_17+tqwt_minValue_dec_17
La_tqwt_TKEO_std_dec_32	0.2590	1.08e+00	1.29565	1.547	0.624	0.722	0.773	0.619	0.740	0.776	0.1064	0.486	3.11	2.324	0.0356	1.00	tqwt_TKEO_mean_dec_33+tqwt_TKEO_std_dec_32
La_det_LT_entropy_shannon_1_coef	0.0876	1.02e+00	1.09160	1.164	0.634	0.685	0.742	0.637	0.713	0.757	0.1011	0.668	3.06	3.258	0.0443	0.40	det_TKEO_mean_1_coef+det_LT_entropy_shannon_1_coef
La_tqwt_kurtosisValue_dec_2	-0.1319	8.03e-01	0.87640	0.957	0.686	0.720	0.771	0.641	0.724	0.779	0.1027	0.694	3.04	3.446	0.0550	0.35	tqwt_kurtosisValue_dec_2+tqwt_kurtosisValue_dec_4
La_tqwt_maxValue_dec_1	-0.7541	2.85e-01	0.47042	0.777	0.605	0.697	0.745	0.598	0.708	0.757	0.1018	0.523	3.04	2.486	0.0493	0.55	tqwt_minValue_dec_1+tqwt_maxValue_dec_1+tqwt_skewnessValue_dec_1
La_std_4th_delta	0.9385	1.33e+00	2.55616	4.906	0.692	0.712	0.758	0.686	0.723	0.772	0.0972	0.732	2.92	3.625	0.0492	0.80	std_MFCC_4th_coef+std_4th_delta
La_tqwt_minValue_dec_11	0.0812	1.03e+00	1.08454	1.144	0.612	0.676	0.734	0.628	0.675	0.741	0.0919	0.528	2.86	2.522	0.0664	0.15	tqwt_minValue_dec_10+tqwt_minValue_dec_11
La_tqwt_minValue_dec_20	-0.1514	7.77e-01	0.85950	0.951	0.624	0.695	0.734	0.619	0.701	0.741	0.0934	0.578	2.85	2.752	0.0408	0.15	tqwt_TKEO_mean_dec_20+tqwt_minValue_dec_20
La_std_10th_delta	1.1297	1.34e+00	3.09474	7.143	0.678	0.729	0.761	0.660	0.730	0.764	0.0875	0.671	2.74	3.287	0.0344	0.70	std_MFCC_10th_coef+std_10th_delta
La_tqwt_minValue_dec_17	-0.2422	6.51e-01	0.78491	0.947	0.546	0.749	0.754	0.564	0.755	0.775	0.0846	0.719	2.65	3.550	0.0202	0.15	tqwt_entropy_shannon_dec_17+tqwt_minValue_dec_17
La_app_LT_entropy_log_3_coef	-6.8711	4.42e-06	0.00104	0.244	0.607	0.740	0.767	0.618	0.753	0.774	0.0738	0.549	2.62	2.675	0.0207	0.10	app_LT_entropy_log_1_coef+app_LT_entropy_log_2_coef+app_LT_entropy_log_3_coef
La_tqwt_energy_dec_33	-0.2641	6.26e-01	0.76791	0.942	0.748	0.776	0.801	0.736	0.772	0.798	0.0682	0.523	2.56	2.500	0.0266	1.00	tqwt_energy_dec_31+tqwt_energy_dec_33
La_std_3rd_delta	0.7956	1.20e+00	2.21574	4.085	0.729	0.742	0.775	0.712	0.755	0.783	0.0735	0.643	2.54	3.157	0.0278	1.00	std_MFCC_3rd_coef+std_3rd_delta
La_std_MFCC_2nd_coef	-1.5069	6.06e-02	0.22159	0.810	0.703	0.766	0.801	0.686	0.776	0.798	0.0650	0.635	2.26	3.071	0.0228	1.00	std_MFCC_2nd_coef+std_2nd_delta
La_tqwt_TKEO_std_dec_10	-0.3678	5.06e-01	0.69225	0.947	0.628	0.746	0.773	0.620	0.744	0.774	0.0606	0.627	2.26	3.076	0.0298	0.40	tqwt_entropy_shannon_dec_9+tqwt_entropy_shannon_dec_10+tqwt_TKEO_std_dec_9+tqwt_TKEO_std_dec_10
La_tqwt_entropy_log_dec_29	-0.0875	8.49e-01	0.91618	0.989	0.515	0.694	0.728	0.649	0.705	0.746	0.0540	0.630	2.23	3.493	0.0412	0.25	tqwt_entropy_log_dec_28+tqwt_entropy_log_dec_29
La_tqwt_kurtosisValue_dec_32	-0.4351	4.41e-01	0.64723	0.949	0.617	0.756	0.784	0.606	0.774	0.796	0.0539	0.162	2.11	0.745	0.0219	1.00	tqwt_kurtosisValue_dec_32+tqwt_kurtosisValue_dec_33
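
As a reading aid for the table above: the OR column appears to be the exponential of the logistic Estimate, which is the usual odds-ratio interpretation of a logistic coefficient. A minimal check, using three rows copied from the table (the variable names `est` and `reportedOR` are illustrative and not part of the original analysis):

```r
# Estimates and ORs copied from three rows of the table above
est <- c(La_tqwt_entropy_log_dec_4  = -2.4537,
         La_std_12th_delta          =  1.7791,
         La_tqwt_entropy_log_dec_31 =  0.6173)
reportedOR <- c(0.08598, 5.92477, 1.85390)

# Odds ratio implied by a logistic coefficient
computedOR <- exp(est)

# Relative difference between implied and reported ORs
# (small differences are expected because the estimates are rounded)
relDiff <- abs(computedOR - reportedOR) / reportedOR
print(round(cbind(Estimate = est, OR = computedOR, relDiff = relDiff), 5))
```

Under this reading, an OR below 1 (negative Estimate) means that larger values of the decorrelated feature are associated with lower odds of the outcome.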

1.5 Differences Between Blind vs. Outcome-Driven Decorrelation

In this section I show how the unaltered basis vectors differ between the outcome-driven transformation and the blind decorrelation transformation.

par(op)
par(mfrow=c(1,1))


smDecorU <- summary(bmdU)
decornamesU <- rownames(smDecorU$coefficients)

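## Unaltered basis features: model terms whose names do not contain "La_".
## str_detect() comes from the stringr package.
get_La_names <- decornames[!stringr::str_detect(decornames,"La_")]
get_La_namesU <- decornamesU[!stringr::str_detect(decornamesU,"La_")]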

unn <- bmd$univariate[,3]
names(unn) <- rownames(bmd$univariate)
pander::pander(as.matrix(unn[get_La_names]))

mean_MFCC_2nd_coef	3.85
tqwt_meanValue_dec_11	3.38
tqwt_maxValue_dec_12	5.99
mean_MFCC_3rd_coef	3.42
std_MFCC_6th_coef	3.55
std_delta_delta_log_energy	6.15
tqwt_energy_dec_6	2.77
tqwt_kurtosisValue_dec_20	4.85
tqwt_entropy_log_dec_11	4.83
apq11Shimmer	4.44
tqwt_entropy_log_dec_35	3.56
f2	3.34
tqwt_kurtosisValue_dec_18	4.55
tqwt_meanValue_dec_25	2.16
tqwt_entropy_log_dec_16	5.02
tqwt_energy_dec_11	3.44
tqwt_kurtosisValue_dec_33	2.06
tqwt_energy_dec_12	4.45
VFER_SNR_TKEO	3.59
std_11th_delta	4.42
tqwt_entropy_shannon_dec_36	3.98
tqwt_kurtosisValue_dec_28	4.51
tqwt_meanValue_dec_18	1.77
std_MFCC_8th_coef	4.40
std_5th_delta	4.28
locAbsJitter	4.18
IMF_SNR_entropy	3.22
tqwt_kurtosisValue_dec_35	4.43
f1	4.02
minIntensity	6.02
mean_delta_log_energy	3.92
IMF_NSR_TKEO	2.92
numPulses	3.80

pander::pander(summary(unn[get_La_names]))

Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
1.77	3.42	3.98	3.98	4.45	6.15


unnU <- bmdU$univariate[,3]
names(unnU) <- rownames(bmdU$univariate)
pander::pander(as.matrix(unnU[get_La_namesU]))

tqwt_entropy_shannon_dec_11	4.99
tqwt_TKEO_mean_dec_7	4.36
std_5th_delta	4.28
tqwt_kurtosisValue_dec_28	4.51
mean_MFCC_2nd_coef	3.85
tqwt_kurtosisValue_dec_36	5.83
tqwt_energy_dec_12	4.45
IMF_NSR_TKEO	2.92
std_MFCC_8th_coef	4.40
std_delta_log_energy	5.87
tqwt_meanValue_dec_18	1.77
maxIntensity	4.82
tqwt_meanValue_dec_25	2.16
mean_delta_log_energy	3.92
tqwt_meanValue_dec_11	3.38
tqwt_kurtosisValue_dec_20	4.85
std_11th_delta	4.42
VFER_SNR_TKEO	3.59
f1	4.02
tqwt_kurtosisValue_dec_17	4.27

pander::pander(summary(unnU[get_La_namesU]))

Min.	1st Qu.	Median	Mean	3rd Qu.	Max.
1.77	3.79	4.32	4.13	4.59	5.87

#boxplot(unn[get_La_names],unnU[get_La_namesU],xlab=c("Method"),ylab="Z",main="Z Values of Basis Features")

x1 <- unn[get_La_names]
x2 <- unnU[get_La_namesU]
X3 <- x1[!(get_La_names %in% get_La_namesU)]
X4 <- x2[!(get_La_namesU %in% get_La_names)]
vioplot(x1, x2, X3,X4, 
        names = c("Outcome-Driven", 
                  "Blind",
                  "Not in Blind",
                  "Not in Outcome-Driven"),
        ylab="Z IDI",
   col="gold")
title("Violin Plots of Unaltered-Basis")


sameFeatures <- get_La_names[get_La_names %in% get_La_namesU]
pander::pander(as.matrix(unn[sameFeatures]))

mean_MFCC_2nd_coef	3.85
tqwt_meanValue_dec_11	3.38
tqwt_kurtosisValue_dec_20	4.85
tqwt_meanValue_dec_25	2.16
tqwt_energy_dec_12	4.45
VFER_SNR_TKEO	3.59
std_11th_delta	4.42
tqwt_kurtosisValue_dec_28	4.51
tqwt_meanValue_dec_18	1.77
std_MFCC_8th_coef	4.40
std_5th_delta	4.28
f1	4.02
mean_delta_log_energy	3.92
IMF_NSR_TKEO	2.92

## The features in Outcome-Driven not in Blind
pander::pander(as.matrix(x1[!(get_La_names %in% get_La_namesU)]))

tqwt_maxValue_dec_12	5.99
mean_MFCC_3rd_coef	3.42
std_MFCC_6th_coef	3.55
std_delta_delta_log_energy	6.15
tqwt_energy_dec_6	2.77
tqwt_entropy_log_dec_11	4.83
apq11Shimmer	4.44
tqwt_entropy_log_dec_35	3.56
f2	3.34
tqwt_kurtosisValue_dec_18	4.55
tqwt_entropy_log_dec_16	5.02
tqwt_energy_dec_11	3.44
tqwt_kurtosisValue_dec_33	2.06
tqwt_entropy_shannon_dec_36	3.98
locAbsJitter	4.18
IMF_SNR_entropy	3.22
tqwt_kurtosisValue_dec_35	4.43
minIntensity	6.02
numPulses	3.80


## The features in Blind not in Outcome-Driven
pander::pander(as.matrix(x2[!(get_La_namesU %in% get_La_names)]))

tqwt_entropy_shannon_dec_11	4.99
tqwt_TKEO_mean_dec_7	4.36
tqwt_kurtosisValue_dec_36	5.83
std_delta_log_energy	5.87
maxIntensity	4.82
tqwt_kurtosisValue_dec_17	4.27
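
The comparison above boils down to set operations on the names of two z-value vectors. A minimal sketch with base R, using a handful of values copied from the listings above (the vectors `zOutcome` and `zBlind` are illustrative stand-ins for `unn[get_La_names]` and `unnU[get_La_namesU]`):

```r
# Toy stand-ins for the two sets of unaltered basis features
zOutcome <- c(mean_MFCC_2nd_coef = 3.85, tqwt_maxValue_dec_12 = 5.99, f1 = 4.02)
zBlind   <- c(mean_MFCC_2nd_coef = 3.85, maxIntensity = 4.82, f1 = 4.02)

# Features kept unaltered by both transformations
shared <- intersect(names(zOutcome), names(zBlind))

# Features unaltered by only one of the transformations
onlyOutcome <- setdiff(names(zOutcome), names(zBlind))
onlyBlind   <- setdiff(names(zBlind), names(zOutcome))

print(zOutcome[shared])
print(zOutcome[onlyOutcome])
print(zBlind[onlyBlind])
```

In the listings above the same partition gives 14 shared features, 19 found only in the outcome-driven basis, and 6 found only in the blind basis, which matches the 33 and 20 totals of the two models.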

1.5.1 The Final Table

I’ll create a table subset of the logistic model from the Outcome-Driven decorrelated data.

The table will have:

The top associated features described by the feature network, as well as the new features.
1. For decorrelated features, it will provide the decorrelation formula
Nugget labels
1. The label of the nugget as found by the clustering procedure
The feature coefficient
The feature odds ratios and their corresponding 95% CI


## The features in top nugget
clusterFeatures <- clusterOutcome$names
## The new features 
discoveredFeatures <- newvars[zvaluePrePost$ZUni<1.96]

tablefinal <- smDecor$coefficients[unique(c(clusterFeatures,discoveredFeatures)),
                                   c("Estimate",
                                     "lower",
                                     "OR",
                                     "upper",
                                     "full.AUC",
                                     "Delta.AUC",
                                     "z.IDI",
                                     "Frequency")]

nugget <- clusterOutcome$membership
names(nugget) <- clusterOutcome$names
tablefinal$Nugget <- nugget[rownames(tablefinal)]
tablefinal$Nugget[is.na(tablefinal$Nugget)] <- "D"
## Build a readable decorrelation formula for each decorrelated feature
deFormula <- character(length(theDeFormulas))
names(deFormula) <- names(theDeFormulas)
for (dx in names(deFormula))
{
  coef <- theDeFormulas[[dx]]
  cname <- names(theDeFormulas[[dx]])
  names(cname) <- cname
  for (cf in names(coef))
  {
    if (cf != dx)
    {
      if (coef[cf]>0)
      {
        ## Positive terms get an explicit "+" sign
        deFormula[dx] <- paste(deFormula[dx],
                               sprintf("+ %5.3f*%s",coef[cf],cname[cf]))
      }
      else
      {
        ## Negative terms already carry their own sign
        deFormula[dx] <- paste(deFormula[dx],
                               sprintf("%5.3f*%s",coef[cf],cname[cf]))
      }
    }
  }
}
tablefinal$DecorFormula <- deFormula[rownames(tablefinal)]
pander::pander(tablefinal)

	Estimate	lower	OR	upper	full.AUC	Delta.AUC	z.IDI	Frequency	Nugget	DecorFormula
std_delta_delta_log_energy	0.5652	1.31e+00	1.75978	2.361	0.798	0.0530	3.87	1.00	1	NA
La_tqwt_energy_dec_33	-0.2641	6.26e-01	0.76791	0.942	0.798	0.0266	2.56	1.00	1	-0.884*tqwt_energy_dec_31 + 1.000*tqwt_energy_dec_33
La_std_MFCC_2nd_coef	-1.5069	6.06e-02	0.22159	0.810	0.798	0.0228	2.26	1.00	1	+ 1.000*std_MFCC_2nd_coef -0.828*std_2nd_delta
tqwt_maxValue_dec_12	-0.3024	6.32e-01	0.73901	0.865	0.794	0.0749	4.00	1.00	2	NA
La_locShimmer	1.5022	1.64e+00	4.49134	12.337	0.797	0.0393	3.15	1.00	2	+ 1.000*locShimmer -0.957*apq3Shimmer
La_tqwt_kurtosisValue_dec_32	-0.4351	4.41e-01	0.64723	0.949	0.796	0.0219	2.11	1.00	2	+ 1.000*tqwt_kurtosisValue_dec_32 -1.018*tqwt_kurtosisValue_dec_33
minIntensity	-2.3412	1.70e-02	0.09621	0.544	0.780	0.0228	2.59	1.00	3	NA
La_tqwt_kurtosisValue_dec_31	-0.4262	4.92e-01	0.65301	0.866	0.787	0.0373	3.25	1.00	3	+ 1.000*tqwt_kurtosisValue_dec_31 -0.921*tqwt_kurtosisValue_dec_33
La_tqwt_TKEO_std_dec_32	0.2590	1.08e+00	1.29565	1.547	0.776	0.0356	3.11	1.00	4	-0.882*tqwt_TKEO_mean_dec_33 + 1.000*tqwt_TKEO_std_dec_32
La_std_3rd_delta	0.7956	1.20e+00	2.21574	4.085	0.783	0.0278	2.54	1.00	3	-0.950*std_MFCC_3rd_coef + 1.000*std_3rd_delta
tqwt_kurtosisValue_dec_20	0.9820	1.47e+00	2.66987	4.833	0.783	0.0665	3.67	0.90	3	NA
tqwt_meanValue_dec_18	-0.3003	5.89e-01	0.74057	0.931	0.777	0.0000	2.99	0.60	3	NA
tqwt_entropy_log_dec_11	-0.7823	2.86e-01	0.45733	0.732	0.762	0.0739	3.62	1.00	4	NA
tqwt_entropy_log_dec_16	-0.3471	5.71e-01	0.70670	0.875	0.775	0.0695	3.37	0.95	5	NA
La_std_12th_delta	1.7791	1.87e+00	5.92477	18.723	0.774	0.0464	3.29	1.00	4	-0.960*std_MFCC_12th_coef + 1.000*std_12th_delta
mean_delta_log_energy	-0.0201	9.65e-01	0.98012	0.996	0.783	0.0196	2.54	0.90	4	NA
La_std_4th_delta	0.9385	1.33e+00	2.55616	4.906	0.772	0.0492	2.92	0.80	5	-0.970*std_MFCC_4th_coef + 1.000*std_4th_delta
std_5th_delta	0.4696	1.16e+00	1.59940	2.200	0.750	0.0439	2.97	0.70	4	NA
tqwt_kurtosisValue_dec_35	0.1116	1.03e+00	1.11804	1.214	0.783	0.0531	2.79	0.95	5	NA
std_MFCC_8th_coef	0.5357	1.16e+00	1.70867	2.516	0.756	0.0412	2.97	0.85	6	NA
tqwt_energy_dec_12	-0.0666	8.95e-01	0.93561	0.978	0.774	0.0655	3.28	0.65	7	NA
std_11th_delta	0.3676	1.13e+00	1.44423	1.851	0.758	0.0487	3.15	0.70	8	NA
La_tqwt_TKEO_std_dec_17	-0.4358	4.67e-01	0.64673	0.896	0.765	0.0507	3.12	0.50	9	+ 1.000*tqwt_TKEO_std_dec_17 + 2.124*tqwt_minValue_dec_17
tqwt_kurtosisValue_dec_28	-0.0420	9.31e-01	0.95889	0.988	0.740	0.0524	3.04	0.65	10	NA
tqwt_kurtosisValue_dec_18	0.8663	1.37e+00	2.37798	4.124	0.769	0.0471	3.43	0.80	5	NA
tqwt_meanValue_dec_25	-0.3364	5.76e-01	0.71435	0.886	0.768	0.0000	3.41	0.50	4	NA
tqwt_energy_dec_6	-0.1151	8.34e-01	0.89131	0.952	0.774	0.0700	3.72	0.55	5	NA
tqwt_entropy_log_dec_35	-0.6965	3.18e-01	0.49834	0.781	0.776	0.0598	3.46	0.85	10	NA
mean_MFCC_3rd_coef	0.0255	1.01e+00	1.02587	1.040	0.786	0.0743	3.99	0.55	5	NA
apq11Shimmer	0.1937	1.09e+00	1.21368	1.356	0.746	0.0784	3.48	0.50	11	NA
La_tqwt_entropy_log_dec_31	0.6173	1.27e+00	1.85390	2.707	0.749	0.0773	3.49	0.50	12	+ 1.000*tqwt_entropy_log_dec_31 -0.731*tqwt_entropy_log_dec_35
La_tqwt_maxValue_dec_1	-0.7541	2.85e-01	0.47042	0.777	0.757	0.0493	3.04	0.55	11	+ 0.949*tqwt_minValue_dec_1 + 1.000*tqwt_maxValue_dec_1 -0.015*tqwt_skewnessValue_dec_1
La_std_10th_delta	1.1297	1.34e+00	3.09474	7.143	0.764	0.0344	2.74	0.70	13	-1.049*std_MFCC_10th_coef + 1.000*std_10th_delta
mean_MFCC_2nd_coef	0.0168	1.01e+00	1.01696	1.026	0.773	0.0848	4.11	0.50	13	NA
La_tqwt_entropy_log_dec_4	-2.4537	2.09e-02	0.08598	0.354	0.784	0.0834	3.94	0.50	5	-1.090*tqwt_entropy_log_dec_1 + 1.000*tqwt_entropy_log_dec_4
std_MFCC_6th_coef	1.0443	1.54e+00	2.84154	5.234	0.777	0.0574	3.91	0.60	13	NA
La_tqwt_TKEO_std_dec_28	-0.0827	8.76e-01	0.92066	0.967	0.736	0.0536	3.50	0.35	D	-0.730*tqwt_TKEO_mean_dec_28 + 1.000*tqwt_TKEO_std_dec_28
La_tqwt_stdValue_dec_1	-0.5159	4.27e-01	0.59695	0.835	0.721	0.0773	3.29	0.15	D	-0.524*tqwt_entropy_shannon_dec_1 + 0.448*tqwt_entropy_shannon_dec_2 + 1.000*tqwt_stdValue_dec_1 -0.849*tqwt_stdValue_dec_2
La_tqwt_energy_dec_5	-0.1450	7.89e-01	0.86505	0.949	0.777	0.0618	3.27	0.25	D	-0.933*tqwt_energy_dec_4 + 1.000*tqwt_energy_dec_5
La_tqwt_kurtosisValue_dec_3	0.1117	1.04e+00	1.11817	1.201	0.757	0.0698	3.25	0.15	D	-0.939*tqwt_kurtosisValue_dec_2 + 1.000*tqwt_kurtosisValue_dec_3
La_det_LT_entropy_shannon_1_coef	0.0876	1.02e+00	1.09160	1.164	0.757	0.0443	3.06	0.40	D	-0.781*det_TKEO_mean_1_coef + 1.000*det_LT_entropy_shannon_1_coef
La_tqwt_kurtosisValue_dec_2	-0.1319	8.03e-01	0.87640	0.957	0.779	0.0550	3.04	0.35	D	+ 1.000*tqwt_kurtosisValue_dec_2 -0.917*tqwt_kurtosisValue_dec_4
La_tqwt_minValue_dec_20	-0.1514	7.77e-01	0.85950	0.951	0.741	0.0408	2.85	0.15	D	+ 0.431*tqwt_TKEO_mean_dec_20 + 1.000*tqwt_minValue_dec_20
La_app_LT_entropy_log_3_coef	-6.8711	4.42e-06	0.00104	0.244	0.774	0.0207	2.62	0.10	D	+ 1.587*app_LT_entropy_log_1_coef -2.692*app_LT_entropy_log_2_coef + 1.000*app_LT_entropy_log_3_coef
La_tqwt_entropy_log_dec_29	-0.0875	8.49e-01	0.91618	0.989	0.746	0.0412	2.23	0.25	D	-1.015*tqwt_entropy_log_dec_28 + 1.000*tqwt_entropy_log_dec_29
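
The formula strings in the DecorFormula column can also be built without explicit sign handling. The sketch below is an alternative, not the code used to produce the table: it relies on the `+` flag of base R `sprintf()` to print the sign of every coefficient. The helper name `formatDeFormula` is hypothetical, and the example assumes each decorrelation formula is a named numeric vector, as the loop above implies; the coefficients are copied from the La_tqwt_energy_dec_33 row.

```r
# Hypothetical helper: turn a named coefficient vector into a formula string.
# The "%+.3f" format always prints the sign, so no if/else branch is needed.
formatDeFormula <- function(coef) {
  paste(sprintf("%+.3f*%s", coef, names(coef)), collapse = " ")
}

# Coefficients copied from the La_tqwt_energy_dec_33 row of the table above
exampleCoef <- c(tqwt_energy_dec_31 = -0.884, tqwt_energy_dec_33 = 1.000)
exampleFormula <- formatDeFormula(exampleCoef)
print(exampleFormula)
# "-0.884*tqwt_energy_dec_31 +1.000*tqwt_energy_dec_33"
```

The only visible difference from the table is spacing: this version prints `+1.000` where the loop prints `+ 1.000`.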

1.5.1.1 Saving all the generated data

save.image("~/GitHub/FCA/ParkinsonDemo.RData")

Decorrelation-Based Feature Discovery: Parkinson

Jose Tamez

2022-10-02