Data Conditioning and Visualization using FRESA.CAD

Abstract

Medical data arrives in many forms: continuous, dichotomous, ordinal or categorical and in many distributions. Furthermore, many measures have anthropometric associations, are mutually correlated and some of them varied between gender, race or ethnicity. The wide presentation of the data may affect the machine learning process. Therefore, data conditioning (Cleaning, adjusting and normalization) is key in handling this wide presentation of data types. FRESA.CAD provides a set of tools that allows a quick analysis, adjusting, normalization and visualization of the high dimensionality of data sets that are typical in the design of diagnostic models.

Introduction

This vignette will show how to use FRESA.CAD for raw data visualization, univariate analysis and conditioning tasks on high-dimensional data sets. It will provide working samples of data being conditioned to address biases and anthropometric associations.

Overview

For this vignette, I will use the publicly available data sets of the osteoarthritis initiative (OAI) on the demographics, physical examination, joint-space width and radiologist scores. The joint space width consists of a set of evenly spaced measurements between the tibial-plateau and the distal femur condyles. The measurements were done on plain films and repeated each year for a period of four years. The same films Ire scored by an expert radiologist that provided their interpretation on the degree of joint space narrowing and their Kellgren and Lawrence (KL) scores. The physical examination data provided the height, weight, and BMI of all subjects. I complemented this information with the site ID, in order to adjust for site bias.

Preparation

Loading the libraries

library("FRESA.CAD")
library("epiR")

Loading the data sets.

#Load the JSW Data
OAI_JSW <- read.delim("OAI_JSW_Analysis.txt",na.strings=c("NA","","#N/A","#DIV/0!"))
# just substract 1 from the knee side variable
OAI_JSW$SIDEC = OAI_JSW$SIDE - 1

#Load the Variables to Be processed
OAI_JSWVarDescription <- read.delim("OAI_JSWDescription.txt")
rownames(OAI_JSWVarDescription) <- OAI_JSWVarDescription$Var

#Load the  Test Re-test scores
JSNTestRetestScores <- read.delim("TestRetestBU.txt")


plot(ecdf(OAI_JSW$HEIGHT),main="Height",col="black",lwd = 2,verticals = TRUE, do.points = FALSE,xlim=c(1,2),ylim=c(0,1));
par(new=TRUE)
cca <- OAI_JSW[complete.cases(OAI_JSW),]
plot(density(cca$HEIGHT),xlim=c(1,2),ylim=c(0,5))

plot(density(cca$JSW150))

plot(density(cca$CVJSW))

Heat Maps Auxilary Plots

The first step in our analysis is to visualize the data content using Heat maps. To look at the data, I will create the layout of the Heat-map with an extra plot function to see the JSN scores and the type of features.

# the colors of the type of JSW measurement. These are categorical description of the bars 
lType <- c("Original","Derived")
lTypecolors <- c("green","Orange")

feattype <- c("Medial","Lateral","Distance","Location","Joint")
featcolor <- c("green","blue","orange","yellow","red")



# the rowcolors is a global variable that will have the ROI colors of the features
# lJSNBar is a global variable that will have the lateral JSN 
# mJSNBar is a global variable that will have the medial JSN 

# the colors for the JSN scores and KL scores. These are ordinal
colorJSNpalette <- c("blue","cyan","red","yellow")

# the colors for the heatmap These are continous
colorpalette <- c("cyan","blue","black","red","yellow")

# the Layoout of the plot will be on 6 rows and four columns
# Color Key     |             |            |Col Dendogram
#               |             |            |ColSideColors
#               |             |            |Lateral JSN
#               |             |            |Medial  JSN
# Row Dendogram |RowSideCOlors|Feature Type|HeatMap
#               |             |            | 

lmat <- rbind( c(6,0, 0,5), c(0,0,0,2),c(0,0,0,8),c(0,0,0,9),c(0,0,0,10), c(4,1,7,3), c(0,0,0,0) )
# the height is 5.0in
lhei <- c(0.9, 0.15,0.15,0.15,0.15,3.0,0.20) 
# the width is 6.0in
lwid <- c(1.0, 0.15,0.15,4.2 )

myplot <- function() {
            op <- par(no.readonly = TRUE)
              rindx <- get('rowInd',parent.frame())
              cindx <- get('colInd',parent.frame())
              marg <- get('margins',parent.frame())
              cnames <- rownames(get('orderData',parent.frame(2)))
              nr <-  length(featureColor)
              par(mar = c(marg[1], 0, 0, 0.5),cex = 0.66)
              image(rbind(1:nr), col = featureColor[rindx], axes = FALSE)

            nr <-  length(mJSNBar)
            index <- pmin(floor(nr*(0.9999999*mJSNBar[cnames])/4)+1,nr);
             colcolors <- colorRampPalette(colorJSNpalette)(n = nr)[index];
             par(mar = c(0.5, 0, 0, marg[2]),cex = 0.66)
             image(cbind(1:nr), col = colcolors[cindx], axes = FALSE)
             mtext("mJSN", side=4, line=1, cex=0.4,las=1, col="black")
            index <- pmin(floor(nr*(0.9999999*lJSNBar[cnames])/4)+1,nr);
             colcolors <- colorRampPalette(colorJSNpalette)(n = nr)[index];
             par(mar = c(0.5, 0, 0, marg[2]),cex = 0.66)
             image(cbind(1:nr), col = colcolors[cindx], axes = FALSE)
             mtext("lJSN", side=4, line=1, cex=0.4,las=1, col="black")

             index <- pmin(floor(nr*(0.9999999*Womac[cnames])/50)+1,nr);
             colcolors <- colorRampPalette(colorpalette)(n = nr)[index];
             par(mar = c(0.5, 0, 0, marg[2]),cex = 0.66)
             image(cbind(1:nr), col = colcolors[cindx], axes = FALSE)
             mtext("Total WOMAC", side=4, line=1, cex=0.4,las=1, col="black")

             
             
              par(op)

}

Once I have defined the auxiliary information to be presented in the heat-map, I will plot the baseline data with complete cases.

baseline <- subset(OAI_JSW,VISITID==0 & BLKL > -1 & Y2DeltaJSN >= 0 )
baseline <- baseline[complete.cases(baseline),]
rownames(baseline) <- baseline$IDSIDE

op <- par(no.readonly = TRUE)
featureColor <- as.character(OAI_JSWVarDescription[,"Type.Color"])
lJSNBar <- baseline$lJSN
names(lJSNBar) <- rownames(baseline)
mJSNBar <- baseline$mJSN
names(mJSNBar) <- rownames(baseline)
Womac <- baseline$BLTOTALWOMAC
names(Womac) <- rownames(baseline)

hm <- heatMaps(variableList = OAI_JSWVarDescription,Outcome = "BLKL",data = baseline,hCluster= TRUE,Scale=TRUE,theFiveColors=colorpalette,outcomeColors=colorJSNpalette,transpose=TRUE,xlab="Baseline",ylab="Features",cexRow=0.50,cexCol=0.10,srtCol=35,reorderfun=function(d, w) reorder(d, w, agglo.FUN = mean),distfun=function(c) dist(c, method = "manhattan"),keysize=1.0,RowSideColors=as.character(OAI_JSWVarDescription[,"Region.Color"]),lmat=lmat, lhei=lhei, lwid=lwid,extrafun=myplot,key.par=list(cex=0.5))

par(new=TRUE,mar=c(1,2,0,0),omd=c(0.02,0.98,0.02,0.98))
plot(c(0,5), c(0,1), type='n', bty='n', xaxt='n', xlab='', yaxt='n', ylab='',main='')
lg <- legend("bottomleft",legend=feattype,fill=featcolor,cex=0.5)
legend(lg$rect$left+lg$rect$w,lg$rect$top,legend=lType,fill=lTypecolors,cex=0.5)

par(op)

Longitudinal plots of unadjusted data

I will review the longitudinal behavior of the raw data divide by radiological osteoarthritis (ROA) status (ROA)

par(mfrow=c(4,4),cex=0.3)
OAI_JSW$RADOA <- 1*(OAI_JSW$BLKL >= 2)
OAI_JSW$YearsFromBL <- OAI_JSW$DAYSFROMBL/356.25

oasubset <- subset(OAI_JSW,BLKL<4)
table(oasubset$VISITID)
oasubset$VISITID <- oasubset$VISITID+1
cs1CorCAR1 <- nlme::corCAR1(0.99,form = ~ YearsFromBL | IDSIDE)

tsa <- timeSerieAnalysis(OAI_JSWVarDescription,"I(YearsFromBL-2)*RADOA",oasubset,timevar="VISITID",contime="YearsFromBL",Outcome="RADOA",cs1CorCAR1 ,timesign= "+",Ptoshow = c(2,3,4),plegend= c("Visit","ROA","Visit:ROA"),catgo.names=c("Normal", "ROA"),description="Description") 

par(mfrow=c(1,1),cex=1)


tvaluesUnadjusted <- tsa$t.values

Adjusting for Site Bias

I can notice that the cohort of subjects with KL<2 is progressing. I will look at the subjects stratified by Site. First, I get the set of subjects that did not progress in two years and had a normal looking knee.

JSWtimecontrol <- subset(OAI_JSW, BLKL == 0 & Y2DeltaJSN==0 & BLTOTALWOMAC < 10)

Then we look how those subjects beheave by site.

par(mfrow=c(4,4),cex=0.3)
OAI_JSWVarDescription$FullDescription <- paste("Site 1:",OAI_JSWVarDescription$Description)
timeSerieAnalysis(OAI_JSWVarDescription,"1+DaysFomStart",subset(JSWtimecontrol,SITE==1),timevar="VISITID",contime="DaysFomStart",Outcome=".", cs1CorCAR1,timesign= "+",Ptoshow = c(2),plegend= c("Days"),description="FullDescription") 


par(mfrow=c(1,1),cex=1)

par(mfrow=c(4,4),cex=0.3)
OAI_JSWVarDescription$FullDescription <- paste("Site 2:",OAI_JSWVarDescription$Description)
timeSerieAnalysis(OAI_JSWVarDescription,"1+DaysFomStart",subset(JSWtimecontrol,SITE==2),timevar="VISITID",contime="DaysFomStart",Outcome=".", cs1CorCAR1,timesign= "+",Ptoshow = c(2),plegend= c("Days"),description="FullDescription") 


par(mfrow=c(1,1),cex=1)

par(mfrow=c(4,4),cex=0.3)
OAI_JSWVarDescription$FullDescription <- paste("Site 3:",OAI_JSWVarDescription$Description)
timeSerieAnalysis(OAI_JSWVarDescription,"1+DaysFomStart",subset(JSWtimecontrol,SITE==3),timevar="VISITID",contime="DaysFomStart",Outcome=".", cs1CorCAR1,timesign= "+",Ptoshow = c(2),plegend= c("Days"),description="FullDescription") 


par(mfrow=c(1,1),cex=1)

par(mfrow=c(4,4),cex=0.3)
OAI_JSWVarDescription$FullDescription <- paste("Site 4:",OAI_JSWVarDescription$Description)
timeSerieAnalysis(OAI_JSWVarDescription,"1+DaysFomStart",subset(JSWtimecontrol,SITE==4),timevar="VISITID",contime="DaysFomStart",Outcome=".", cs1CorCAR1,timesign= "+",Ptoshow = c(2),plegend= c("Days"),description="FullDescription") 


par(mfrow=c(1,1),cex=1)

par(mfrow=c(4,4),cex=0.3)
OAI_JSWVarDescription$FullDescription <- paste("Site 5:",OAI_JSWVarDescription$Description)
timeSerieAnalysis(OAI_JSWVarDescription,"1+DaysFomStart",subset(JSWtimecontrol,SITE==5),timevar="VISITID",contime="DaysFomStart",Outcome=".",cs1CorCAR1,timesign= "+",Ptoshow = c(2),plegend= c("Days"),description="FullDescription") 

par(mfrow=c(1,1),cex=1)

The plots show that there are several sites whose quantitative measurements are drifting. So I need to adjust for this drift. I used the feature adjustment function from FRESA.CAD. The function will adjust the features only if the fitting procedure reduces the residual in a significant way (p<0.05), and it is set to use the GLS procedure to handle the correlated longitudinal measures.


OAI_JSW.adj <- featureAdjustment(OAI_JSWVarDescription, baseModel="1+DaysFomStart", strata = "SITE",data=OAI_JSW,JSWtimecontrol, type = "GLS", pvalue = 0.05,correlationGroup = "IDSIDE")

Once adjusted, the time drift has been corrected


JSWtimecontrol.adj <- subset(OAI_JSW.adj, BLKL == 0 & Y2DeltaJSN==0 & BLTOTALWOMAC < 10)
#JSWtimecontrol.adj <- JSWtimecontrol.adj[complete.cases(JSWtimecontrol.adj),]

par(mfrow=c(4,4),cex=0.3)
OAI_JSWVarDescription$FullDescription <- paste("Site 1:",OAI_JSWVarDescription$Description)
tsa <- timeSerieAnalysis(OAI_JSWVarDescription,"1+DaysFomStart",subset(JSWtimecontrol.adj,SITE==1),timevar="VISITID",contime="DaysFomStart",Outcome=".", cs1CorCAR1,timesign= "+",Ptoshow = c(2),plegend= c("Days"),description="FullDescription") 


par(mfrow=c(1,1),cex=1)

par(mfrow=c(4,4),cex=0.3)

OAI_JSWVarDescription$FullDescription <- paste("Site 2:",OAI_JSWVarDescription$Description)
tsa <- timeSerieAnalysis(OAI_JSWVarDescription,"1+DaysFomStart",subset(JSWtimecontrol.adj,SITE==2),timevar="VISITID",contime="DaysFomStart",Outcome=".", cs1CorCAR1,timesign= "+",Ptoshow = c(2),plegend= c("Days"),description="FullDescription") 


par(mfrow=c(1,1),cex=1)

par(mfrow=c(4,4),cex=0.3)
OAI_JSWVarDescription$FullDescription <- paste("Site 3:",OAI_JSWVarDescription$Description)
tsa <- timeSerieAnalysis(OAI_JSWVarDescription,"1+DaysFomStart",subset(JSWtimecontrol.adj,SITE==3),timevar="VISITID",contime="DaysFomStart",Outcome=".", cs1CorCAR1,timesign= "+",Ptoshow = c(2),plegend= c("Days"),description="FullDescription") 


par(mfrow=c(1,1),cex=1)

par(mfrow=c(4,4),cex=0.3)
OAI_JSWVarDescription$FullDescription <- paste("Site 4:",OAI_JSWVarDescription$Description)
tsa <- timeSerieAnalysis(OAI_JSWVarDescription,"1+DaysFomStart",subset(JSWtimecontrol.adj,SITE==4),timevar="VISITID",contime="DaysFomStart",Outcome=".", cs1CorCAR1,timesign= "+",Ptoshow = c(2),plegend= c("Days"),description="FullDescription") 


par(mfrow=c(1,1),cex=1)

par(mfrow=c(4,4),cex=0.3)
OAI_JSWVarDescription$FullDescription <- paste("Site 5:",OAI_JSWVarDescription$Description)
tsa <- timeSerieAnalysis(OAI_JSWVarDescription,"1+DaysFomStart",subset(JSWtimecontrol.adj,SITE==5),timevar="VISITID",contime="DaysFomStart",Outcome=".", cs1CorCAR1,timesign= "+",Ptoshow = c(2),plegend= c("Days"),description="FullDescription") 

par(mfrow=c(1,1),cex=1)

Once adjusted I will check again how the non-OA group compares to the ROA group

par(mfrow=c(4,4),cex=0.3)
OAI_JSW.adj$RADOA <- 1*(OAI_JSW.adj$BLKL >= 2)
OAI_JSW.adj$YearsFromBL <- OAI_JSW.adj$DAYSFROMBL/356.25
OAI_JSWVarDescription$FullDescription <- paste("Adjusted: ",OAI_JSWVarDescription$Description)
tsa <- timeSerieAnalysis(OAI_JSWVarDescription,"I(YearsFromBL-2)*RADOA",subset(OAI_JSW.adj,BLKL<4),timevar="VISITID",contime="YearsFromBL",Outcome="RADOA", cs1CorCAR1,timesign= "+",Ptoshow = c(2,3,4),plegend= c("Visit","ROA","Visit:ROA"),catgo.names=c("Normal", "ROA"),description="FullDescription") 

par(mfrow=c(1,1),cex=1)


tvaluesSiteAdjusted <- tsa$t.values

Adjusting for Age Height and Gender

Now, I will adjust for age, anthropometric (height) and gender associations in joint space width measurements


#lets use the subjects that did not progress to get anthrometric (height) and gender associations
JSWcontrol.BL <- subset(OAI_JSW.adj, VISITID==0 & BLKL == 0 &  Y2DeltaJSN==0 & BLTOTALWOMAC < 10)
#JSWcontrol.BL <- JSWcontrol.BL[complete.cases(JSWcontrol.BL),]

#We adjust for those differences
OAI_JSW.full.adj <- featureAdjustment(OAI_JSWVarDescription, baseModel="1+AGE+HEIGHT+CFWDTH
", strata = "Gender", data=OAI_JSW.adj,JSWcontrol.BL, type = "LM", pvalue = 0.05)

Once adjusted for gender, height, and age; I’ll plot the new longitudinal behavior.

par(mfrow=c(4,4),cex=0.3)
OAI_JSW.full.adj$RADOA <- 1*(OAI_JSW.full.adj$BLKL >= 2)
OAI_JSW.full.adj$YearsFromBL <- OAI_JSW.full.adj$DAYSFROMBL/356.25
OAI_JSWVarDescription$FullDescription <- paste("Full Adjusted: ",OAI_JSWVarDescription$Description)
tsa <- timeSerieAnalysis(OAI_JSWVarDescription,"I(YearsFromBL-2)*RADOA",subset(OAI_JSW.full.adj,BLKL<4),timevar="VISITID",contime="YearsFromBL",Outcome="RADOA", cs1CorCAR1,timesign= "+",Ptoshow = c(2,3,4),plegend= c("Visit","ROA","Visit:ROA"),catgo.names=c("Normal", "ROA"),description="FullDescription") 

par(mfrow=c(1,1),cex=1)

tvaluesSiteAgeAdjusted <- tsa$t.values

Quantile Normalization

After that, I can do a quantile normalization using the behavior of the reference control. For this, I will use only the left knee and subjects that had a KL=0 and a total WOMAC score lower than 10.


JSWBLadjc <- subset(OAI_JSW.full.adj,SIDE==2 & VISITID==0 & BLKL == 0 & Y2DeltaJSN==0 & BLTOTALWOMAC < 10)
JSWBLadjc <- JSWBLadjc[complete.cases(JSWBLadjc),]
OAI_JSW.full.adj.norm <- rankInverseNormalDataFrame(OAI_JSWVarDescription, OAI_JSW.full.adj, JSWBLadjc,strata="Gender")


plot(density(JSWBLadjc$JSW150))

plot(density(cca$JSW150,na.rm=TRUE))

plot(density(subset(cca,mJSN==0 & BLKL==0 & VISITID==0)$JSW150,na.rm=TRUE))

plot(density(subset(OAI_JSW.full.adj.norm,mJSN==0 & BLKL==0 & VISITID==0)$JSW150,na.rm=TRUE))

plot(density(subset(cca,mJSN>0 & VISITID==0)$JSW150,na.rm=TRUE))

plot(density(subset(OAI_JSW.full.adj.norm,mJSN>0 & VISITID==0)$JSW150,na.rm=TRUE))


plot(density(subset(cca,mJSN>0 & VISITID==0)$CVJSW))

plot(density(subset(OAI_JSW.full.adj,VISITID==0)$CVJSW,na.rm=TRUE))

plot(density(subset(OAI_JSW.full.adj.norm,VISITID==0)$CVJSW,na.rm=TRUE))

plot(density(subset(OAI_JSW.full.adj.norm,mJSN==0 & BLKL==0 & VISITID==0)$CVJSW,na.rm=TRUE))

plot(density(subset(OAI_JSW.full.adj,mJSN>0 & VISITID==0)$CVJSW,na.rm=TRUE))

plot(density(subset(OAI_JSW.full.adj.norm,mJSN>0 & VISITID==0)$CVJSW,na.rm=TRUE))



bln <- subset(OAI_JSW.full.adj.norm,OAI_JSW.full.adj.norm$VISITID==0)
blo <- subset(OAI_JSW,VISITID==0)
cor.test(bln$mJSN,bln$JSW250,na.rm=TRUE,method="spearman")
cor.test(blo$mJSN,blo$JSW250,na.rm=TRUE,method="spearman")
cor.test(blo$BMI,blo$JSW250,na.rm=TRUE,method="spearman")
blkl01n <- subset(bln,bln$BLKL<2)
cor.test(blkl01n$BMI,blkl01n$JSW250,na.rm=TRUE,method="spearman")
cor.test(blkl01n$Gender,blkl01n$JSW250,na.rm=TRUE,method="spearman")
cor.test(blkl01n$HEIGHT,blkl01n$JSW250,na.rm=TRUE,method="spearman")
blkl01o <- subset(blo,blo$BLKL<2)
cor.test(blkl01o$BMI,blkl01o$JSW250,na.rm=TRUE,method="spearman")
cor.test(blkl01o$BMI,blkl01o$LJSW750,na.rm=TRUE,method="spearman")
cor.test(blkl01o$BMANG,blkl01o$LJSW750,na.rm=TRUE,method="spearman")
cor.test(blkl01o$BMANG,blkl01o$JSW250,na.rm=TRUE,method="spearman")
cor.test(blkl01o$BMANG,blkl01o$BMI,na.rm=TRUE,method="spearman")
cor.test(blkl01o$Gender,blkl01o$JSW250,na.rm=TRUE,method="spearman")
cor.test(blkl01o$HEIGHT,blkl01o$JSW250,na.rm=TRUE,method="spearman")

JSWn <- blkl01o$JSW250/blkl01o$HEIGHT
cor.test(blkl01o$Gender,JSWn,na.rm=TRUE,method="spearman")



cor.test(OAI_JSW$mJSN,OAI_JSW$JSW250,na.rm=TRUE,method="spearman")

cor.test(OAI_JSW.full.adj.norm$lJSN,OAI_JSW.full.adj.norm$LJSW750,na.rm=TRUE,method="spearman")
cor.test(OAI_JSW$lJSN,OAI_JSW$LJSW750,na.rm=TRUE,method="spearman")

I’ll check again the longitudinal behavior. This time the units of the y-axis are in standard deviations.

par(mfrow=c(4,4),cex=0.3)
OAI_JSW.full.adj.norm$RADOA <- 1*(OAI_JSW.full.adj.norm$BLKL >= 2)
OAI_JSW.full.adj.norm$YearsFromBL <- OAI_JSW.full.adj.norm$DAYSFROMBL/356.25
OAI_JSWVarDescription$FullDescription <- paste("Normalized: ",OAI_JSWVarDescription$Description)
tsa <- timeSerieAnalysis(OAI_JSWVarDescription,"I(YearsFromBL-2)*RADOA",subset(OAI_JSW.full.adj.norm,BLKL<4),timevar="VISITID",contime="YearsFromBL",Outcome="RADOA", cs1CorCAR1,timesign= "+",Ptoshow = c(2,3,4),plegend= c("Visit","ROA","Visit:ROA"),catgo.names=c("Normal", "ROA"),description="FullDescription") 

par(mfrow=c(1,1),cex=1)

tvaluesSiteAgeAdjustedNormalized <- tsa$t.values

Comparing the t values after adjustment


tvaluesTable <- cbind(tvaluesUnadjusted[,3],tvaluesSiteAdjusted[,3],tvaluesSiteAgeAdjusted[,3],tvaluesSiteAgeAdjustedNormalized[,3])
colnames(tvaluesTable) <- c("Original","Site Adjusted","Gender, Age & Heigth","Normalized")

tvaluesTableSlope <- cbind(tvaluesUnadjusted[,4],tvaluesSiteAdjusted[,4],tvaluesSiteAgeAdjusted[,4],tvaluesSiteAgeAdjustedNormalized[,4])
colnames(tvaluesTableSlope) <- c("Original","Site Adjusted","Gender, Age & Heigth","Normalized")


# Table of Differences between ROA and normal after 2 years

pander::pandoc.table(tvaluesTable)
  Original Site Adjusted Gender, Age & Heigth Normalized
JSW150 -20.06 -19.9 -18.39 -19.93
JSW175 -19.84 -19.62 -18.37 -20.1
JSW200 -19.6 -19.36 -18.44 -20.6
JSW225 -19.58 -19.35 -18.64 -19.92
JSW250 -18.97 -18.57 -18.12 -18.8
JSW275 -17.63 -17.21 -16.83 -17.43
JSW300 -15.66 -15.38 -15.12 -15.64
LJSW700 -13.98 -13.85 -12.6 -11.49
LJSW725 -11.62 -11.39 -10.27 -9.189
LJSW750 -9.538 -9.299 -8.135 -7.416
LJSW775 -7.692 -7.457 -6.591 -5.834
LJSW800 -5.84 -5.616 -4.576 -3.793
LJSW825 -4.008 -3.781 -2.618 -2.364
LJSW850 -2.734 -2.542 -1.544 -1.804
MCMJSW -22.02 -21.82 -20.37 -21.4
TPCFDS 1.241 0.3034 0.5206 -0.2217
XMJSW 1.319 1.319 1.631 1.054
meanMJSW -19.4 -19.08 -18.35 -19.64
StdMJSW 4.23 4.23 4.294 3.883
meanLJSW -7.531 -7.316 -6.411 -5.775
stdLJSW -9.237 -9.46 -8.783 -8.766
MtoLDiff 23.4 23.85 24.9 24.69
mCV 17.26 17.47 17.02 16.19
lCV 2.73 2.73 2.829 2.472
medialSlope 2.822 2.822 2.628 2.602
LateralSlope 15.22 15.52 14.12 13.48
JointSlope 5.107 5.1 5.861 7.45
DiffSlope -9.292 -9.553 -9.072 -8.725
MaxSlope 3.979 3.979 3.528 3.501
MedSlope 1.253 1.253 0.9979 1.016
MedCurvature 2.124 2.124 2.653 2.554
MedIntersept -19.34 -19.5 -18.66 -20.02
LatSlope 14.31 14.37 13.18 12.77
LatCurvature -5.658 -5.658 -5.988 -5.871
LatIntersept -6.055 -6.318 -4.525 -3.896
MaxCV 20.51 20.57 20.4 19.92
MeanJSW -15.15 -14.72 -14.34 -14.55
STDJSW 21.19 21.64 23.37 23.28
CVJSW 26.52 26.78 26.72 26.5

# Table of the rate of change between no-ROA and ROA at two years
pander::pandoc.table(tvaluesTableSlope)
  Original Site Adjusted Gender, Age & Heigth Normalized
JSW150 -3.882 -3.665 -3.594 -3.467
JSW175 -4.339 -4.069 -4.014 -4.115
JSW200 -4.58 -4.273 -4.231 -4.461
JSW225 -5.267 -4.926 -4.899 -4.988
JSW250 -5.276 -4.663 -4.638 -4.632
JSW275 -5.21 -4.547 -4.492 -4.731
JSW300 -3.527 -3.03 -2.987 -3.21
LJSW700 -2.475 -2.169 -2.094 -1.909
LJSW725 -2.395 -1.924 -1.863 -1.723
LJSW750 -2.244 -1.736 -1.667 -1.421
LJSW775 -1.732 -1.237 -1.152 -0.9762
LJSW800 -1.185 -0.7277 -0.6654 -0.4019
LJSW825 -0.9047 -0.4325 -0.3697 -0.2982
LJSW850 -0.5854 -0.2118 -0.1232 -0.05759
MCMJSW -3.548 -3.291 -3.205 -3.409
TPCFDS -0.8563 -1.354 -1.354 -1.236
XMJSW 1.498 1.498 1.419 0.8401
meanMJSW -5.001 -4.508 -4.469 -4.455
StdMJSW -0.03128 -0.03128 -0.06137 -0.5709
meanLJSW -1.307 -0.8302 -0.7227 -0.4912
stdLJSW 1.252 1.078 1.112 0.9428
MtoLDiff 6.039 6.463 6.511 7.007
mCV 7.26 7.454 7.41 7.393
lCV 3.971 3.971 3.992 4.008
medialSlope -0.328 -0.328 -0.3905 -0.7924
LateralSlope 1.911 2.266 2.318 2.114
JointSlope 2.139 2.225 2.162 2.143
DiffSlope -1.824 -2.075 -2.136 -1.958
MaxSlope 0.2066 0.2066 0.2326 -0.1231
MedSlope -0.8422 -0.8422 -1.093 -1.259
MedCurvature 0.6832 0.6832 0.6609 0.7992
MedIntersept -5.192 -5.448 -5.713 -5.775
LatSlope 2.61 2.688 2.768 2.567
LatCurvature -1.152 -1.152 -1.039 -0.9385
LatIntersept -1.061 -1.392 -1.723 -1.436
MaxCV 8.878 8.935 8.99 9.03
MeanJSW -3.684 -2.816 -2.739 -2.814
STDJSW 6.585 6.959 7.062 7.333
CVJSW 10 10.23 10.28 10.4

Final Heat Maps

baseline <- subset(OAI_JSW.full.adj.norm,VISITID==0 & Y2DeltaJSN >= 0 )
baseline <- baseline[complete.cases(baseline),]
rownames(baseline) <- baseline$IDSIDE

I’ll do a final plot of the normalized data.


featureColor <- as.character(OAI_JSWVarDescription[,"Type.Color"])
lJSNBar <- baseline$lJSN
names(lJSNBar) <- rownames(baseline)
mJSNBar <- baseline$mJSN
names(mJSNBar) <- rownames(baseline)
hm <- NULL;

op <- par(no.readonly = TRUE)
hm <- heatMaps(variableList = OAI_JSWVarDescription,Outcome = "BLKL",data = baseline,hCluster= TRUE,theFiveColors=colorpalette,outcomeColors=colorJSNpalette,transpose=TRUE,xlab="Baseline",ylab="Features",cexRow=0.50,cexCol=0.10,srtCol=35,reorderfun=function(d, w) reorder(d, w, agglo.FUN = mean),distfun=function(c) dist(c, method = "manhattan"),keysize=1.0,RowSideColors=as.character(OAI_JSWVarDescription[,"Region.Color"]),lmat=lmat, lhei=lhei, lwid=lwid,extrafun=myplot,key.par=list(cex=0.5))
par(new=TRUE,mar=c(1,2,0,0),omd=c(0.02,0.98,0.02,0.98))
plot(c(0,5), c(0,1), type='n', bty='n', xaxt='n', xlab='', yaxt='n', ylab='',main='')
lg <- legend("bottomleft",legend=feattype,fill=featcolor,cex=0.5)
legend(lg$rect$left+lg$rect$w,lg$rect$top,legend=lType,fill=lTypecolors,cex=0.5)

par(op)
op <- par(no.readonly = TRUE)
hm <- heatMaps(variableList = OAI_JSWVarDescription,Outcome = "BLKL",data = baseline,hCluster= FALSE,theFiveColors=colorpalette,outcomeColors=colorJSNpalette,transpose=TRUE,xlab="Baseline",ylab="Features",cexRow=0.50,cexCol=0.10,srtCol=35,reorderfun=function(d, w) reorder(d, w, agglo.FUN = mean),distfun=function(c) dist(c, method = "manhattan"),keysize=1.0,RowSideColors=as.character(OAI_JSWVarDescription[,"Region.Color"]),lmat=lmat, lhei=lhei, lwid=lwid,extrafun=myplot,key.par=list(cex=0.5))
par(new=TRUE,mar=c(1,2,0,0),omd=c(0.02,0.98,0.02,0.98))
plot(c(0,5), c(0,1), type='n', bty='n', xaxt='n', xlab='', yaxt='n', ylab='',main='')
lg <- legend("bottomleft",legend=feattype,fill=featcolor,cex=0.5)
legend(lg$rect$left+lg$rect$w,lg$rect$top,legend=lType,fill=lTypecolors,cex=0.5)

par(op)

Univariate Analysis to KL

Once I have normalized all the data, I will perform a univariate analysis of all the variables. First, I will study the association to the baseline KL

rawbase <- subset(OAI_JSW,VISITID==0 & Y2DeltaJSN >= 0)
rownames(rawbase) <- rawbase$IDSIDE

#baseline <- subset(OAI_JSW.full.adj,VISITID==0 & Y2DeltaJSN >= 0 )
baseline <- subset(OAI_JSW.full.adj.norm,VISITID==0 & Y2DeltaJSN >= 0 )
rownames(baseline) <- baseline$IDSIDE
baseline <- baseline[complete.cases(baseline),]
rawbase <- rawbase[rownames(baseline),]

UniRankFeaturesRaw <- univariateRankVariables(variableList = OAI_JSWVarDescription,
                                                formula = "BLKL ~ 1+BMI+HEIGHT+Gender+AnySurgery",
                                                Outcome = "BLKL",
                                                data = rawbase, 
                                                categorizationType = "Raw", 
                                                type = "LM", 
                                                rankingTest = "Ztest",
                                                description = "Description",
                                               uniType="Regression")

UniRankFeaturesNorm<- univariateRankVariables(variableList = OAI_JSWVarDescription,
                                                formula = "BLKL ~ 1+BMI+AnySurgery",
                                                Outcome = "BLKL",
                                                data = baseline, 
                                                categorizationType = "Raw", 
                                                type = "LM", 
                                                rankingTest = "Ztest",
                                                description = "Description",
                                              raw.dataFrame = rawbase,
                                               uniType="Regression")

topnames <- rownames(UniRankFeaturesNorm)[1:15]
Cstat <- abs(cbind(UniRankFeaturesRaw[topnames,"ZGLM"],UniRankFeaturesNorm[topnames,"ZGLM"]))
colnames(Cstat) <- c("Original","Normalized")
rownames(Cstat) <- topnames
ymax <- 1.2*max(Cstat)
barplot(t(Cstat),beside=TRUE,legend = colnames(Cstat),ylim=c(0,ymax),las=2,cex.names=0.5)



BLDescriptive <- cbind(UniRankFeaturesNorm$cohortMean,UniRankFeaturesNorm$cohortStd,UniRankFeaturesNorm$spearman.r,UniRankFeaturesNorm$Beta,UniRankFeaturesNorm$ZGLM)
rownames(BLDescriptive) <- UniRankFeaturesNorm$descrip
colnames(BLDescriptive) <- c("Mean","Std","Sperman r","Beta","Z")
pander::pandoc.table(BLDescriptive[1:15,])
  Mean Std Sperman r Beta Z
Coefficient of variation of JSW 0.2256 0.1188 0.5674 0.2131 44.66
Medial to Lateral Absolute Difference 1.909 1.332 0.5114 0.2745 40.53
Standard deviation of the JSW 1.335 0.5805 0.4837 0.2616 38.43
medial JSW at x=0.200 [mm] 4.878 1.303 -0.403 -0.2402 31.1
Mean: medial JSW (Calc) 5.357 1.294 -0.4041 -0.2493 30.33
Medial Quadratic Fitting Intercept 0.05855 0.01469 -0.4057 -0.2466 30.21
medial JSW at x=0.225 [mm] 5.153 1.313 -0.4026 -0.2424 30.11
medial JSW at x=0.175 [mm] 4.693 1.308 -0.4033 -0.2557 29.93
medial minimum JSW [mm] 4.135 1.287 -0.4229 -0.2479 29.62
medial JSW at x=0.250 [mm] 5.527 1.344 -0.3964 -0.2455 28.96
medial JSW at x=0.150 [mm] 4.579 1.323 -0.4046 -0.2517 28.82
medial JSW at x=0.275 [mm] 6.016 1.392 -0.369 -0.2613 26.68
Maximum cofficient of variation 0.1926 0.1277 0.3951 0.133 26.66
medial JSW at x=0.300 [mm] 6.653 1.448 -0.3366 -0.2635 23.9
Mean JSW 6.145 1.121 -0.3306 -0.2355 22.27

par(cex=0.5)
ymax <- 1.2*max(abs(tvaluesTable))
barplot(t(abs(tvaluesTable[topnames,])),beside=TRUE,legend = colnames(tvaluesTable),ylim=c(0,ymax),las=2)


ymax <- 1.2*max((tvaluesTableSlope))
barplot(t(abs(tvaluesTableSlope[topnames,])),beside=TRUE,legend = colnames(tvaluesTableSlope),ylim=c(0,ymax),las=2)

par(cex=1.0)

Univariate Analysis to 2 Year Change in KL

I´ll check the association to a change in KL score from BL. I am checking only subjects with a BLKL<3

rawbase <- subset(OAI_JSW,VISITID==0 & Y2DeltaJSN >= 0 & BLKL < 4 & BLKL >= 2)
rownames(rawbase) <- rawbase$IDSIDE

#baseline <- subset(OAI_JSW.full.adj,VISITID==0 & Y2DeltaJSN >= 0 )
baseline <- subset(OAI_JSW.full.adj.norm,VISITID==0 & Y2DeltaJSN >= 0 & BLKL < 4 & BLKL >= 2)
rownames(baseline) <- baseline$IDSIDE
baseline <- baseline[complete.cases(baseline),]
rawbase <- rawbase[rownames(baseline),]

rawbase$progress <- 1*(rawbase$Y2DeltaJSN > 0)
baseline$progress <- 1*(baseline$Y2DeltaJSN > 0)

UniRankFeaturesRaw <- univariateRankVariables(variableList = OAI_JSWVarDescription,
                                                formula = "progress ~ 1+BMI+HEIGHT+Gender+AnySurgery",
                                                Outcome = "progress",
                                                data = rawbase, 
                                                categorizationType = "Raw", 
                                                type = "LOGIT", 
                                                rankingTest = "zIDI",
                                                description = "Description",
                                               uniType="Binary")

UniRankFeaturesNorm<- univariateRankVariables(variableList = OAI_JSWVarDescription,
                                                formula = "progress ~ 1+BMI+AnySurgery",
                                                Outcome = "progress",
                                                data = baseline, 
                                                categorizationType = "Raw", 
                                                type = "LOGIT", 
                                                rankingTest = "zIDI",
                                                description = "Description",
                                              raw.dataFrame = rawbase,
                                               uniType="Binary")

topnames <- rownames(UniRankFeaturesNorm)[1:15]
Cstat <- abs(cbind(UniRankFeaturesRaw[topnames,"ZGLM"],UniRankFeaturesNorm[topnames,"ZGLM"]))
colnames(Cstat) <- c("Original","Normalized")
rownames(Cstat) <- topnames
ymax <- 1.2*max(Cstat)
barplot(t(Cstat),beside=TRUE,legend = colnames(Cstat),ylim=c(0,ymax),las=2,cex.names=0.5)



BLDescriptive <- cbind(UniRankFeaturesNorm$controlMean,UniRankFeaturesNorm$controlStd,UniRankFeaturesNorm$caseMean,UniRankFeaturesNorm$caseStd,UniRankFeaturesNorm$ROCAUC,UniRankFeaturesNorm$kendall.r,UniRankFeaturesNorm$Beta,UniRankFeaturesNorm$ZGLM)
rownames(BLDescriptive) <- UniRankFeaturesNorm$descrip
colnames(BLDescriptive) <- c("Control Mean","Control Std","Case Mean","Case Std","ROCAUC","Kendall","Beta","Z")
pander::pandoc.table(BLDescriptive[1:15,])
Table continues below
  Control Mean Control Std Case Mean
Coefficient of variation of JSW 0.2158 0.08775 0.2844
Standard deviation of the JSW 1.306 0.4932 1.595
Medial to Lateral Absolute Difference 1.881 1.152 2.553
Maximum cofficient of variation 0.178 0.07265 0.2261
Mean JSW 6.172 1.084 5.714
medial JSW at x=0.200 [mm] 4.872 1.148 4.337
Medial Quadratic Fitting Intercept 0.0586 0.01325 0.05233
medial JSW at x=0.225 [mm] 5.149 1.165 4.609
Mean: medial JSW (Calc) 5.346 1.151 4.838
medial JSW at x=0.175 [mm] 4.675 1.156 4.152
medial JSW at x=0.250 [mm] 5.522 1.208 4.999
medial JSW at x=0.275 [mm] 6.011 1.26 5.521
medial minimum JSW [mm] 4.09 1.138 3.599
medial JSW at x=0.150 [mm] 4.547 1.167 4.056
medial JSW at x=0.300 [mm] 6.647 1.335 6.191
  Case Std ROCAUC Kendall Beta Z
Coefficient of variation of JSW 0.1173 0.6847 0.2162 0.2802 13.55
Standard deviation of the JSW 0.6582 0.6419 0.1607 0.2761 11.47
Medial to Lateral Absolute Difference 1.496 0.6419 0.1602 0.2716 11.24
Maximum cofficient of variation 0.1029 0.6572 0.1811 0.2926 11.28
Mean JSW 1.138 0.6398 -0.1612 -0.3313 9.763
medial JSW at x=0.200 [mm] 1.43 0.6238 -0.1427 -0.2527 9.844
Medial Quadratic Fitting Intercept 0.01599 0.626 -0.1438 -0.2663 9.692
medial JSW at x=0.225 [mm] 1.411 0.6255 -0.1434 -0.2597 9.665
Mean: medial JSW (Calc) 1.395 0.6213 -0.1384 -0.2585 9.44
medial JSW at x=0.175 [mm] 1.461 0.6185 -0.1361 -0.2604 9.173
medial JSW at x=0.250 [mm] 1.409 0.623 -0.1403 -0.2615 9.249
medial JSW at x=0.275 [mm] 1.458 0.6148 -0.1289 -0.2742 8.472
medial minimum JSW [mm] 1.462 0.6097 -0.1292 -0.2297 8.157
medial JSW at x=0.150 [mm] 1.519 0.6089 -0.1265 -0.2335 8.079
medial JSW at x=0.300 [mm] 1.523 0.6076 -0.1195 -0.2684 7.564

Univariate analysis to WOMAC

Now I study the association to total WOMAC

rawbase <- subset(OAI_JSW,VISITID==0 )
rownames(rawbase) <- rawbase$IDSIDE

baseline <- subset(OAI_JSW.full.adj.norm,VISITID==0 )
rownames(baseline) <- baseline$IDSIDE
baseline <- baseline[complete.cases(baseline),]
rawbase <- rawbase[rownames(baseline),]


UniRankFeaturesRawWomac <- univariateRankVariables(variableList = OAI_JSWVarDescription,
                                                formula = "BLTOTALWOMAC ~ 1+BMI+HEIGHT+Gender+AnySurgery",
                                                Outcome = "BLTOTALWOMAC",
                                                data = rawbase, 
                                                categorizationType = "Raw", 
                                                type = "LM", 
                                                rankingTest = "Ztest",
                                                description = "Description",
                                               uniType="Regression")

UniRankFeaturesNormWomac <- univariateRankVariables(variableList = OAI_JSWVarDescription,
                                                formula = "BLTOTALWOMAC ~ 1+BMI+AnySurgery",
                                                Outcome = "BLTOTALWOMAC",
                                                data = baseline, 
                                                categorizationType = "Raw", 
                                                type = "LM", 
                                                rankingTest = "Ztest",
                                                description = "Description",
                                              raw.dataFrame = rawbase,
                                               uniType="Regression")

topnames <- rownames(UniRankFeaturesNormWomac)[1:15]
Cstat <- abs(cbind(UniRankFeaturesRawWomac[topnames,"ZGLM"],UniRankFeaturesNormWomac[topnames,"ZGLM"]))
colnames(Cstat) <- c("Original","Normalized")
rownames(Cstat) <- topnames
ymax <- 1.2*max(Cstat)
barplot(t(Cstat),beside=TRUE,legend = colnames(Cstat),ylim=c(0,ymax),las=2,cex.names=0.5)


BLDescriptive <- cbind(UniRankFeaturesNormWomac$cohortMean,UniRankFeaturesNormWomac$cohortStd,UniRankFeaturesNormWomac$spearman.r,UniRankFeaturesNormWomac$Beta,UniRankFeaturesNormWomac$ZGLM)
rownames(BLDescriptive) <- UniRankFeaturesNormWomac$descrip
colnames(BLDescriptive) <- c("Mean","Std","Sperman r","Beta","Z")
pander::pandoc.table(BLDescriptive[1:15,])
  Mean Std Sperman r Beta Z
Coefficient of variation of JSW 0.2278 0.1221 0.136 0.7364 9.627
Mean JSW 6.116 1.138 -0.1316 -1.285 8.629
lateral JSW at x=0.700 [mm] 7.685 1.868 -0.1277 -1.308 8.036
lateral JSW at x=0.725 [mm] 7.437 1.766 -0.1133 -1.145 7.687
Medial to Lateral Absolute Difference 1.921 1.354 0.1096 0.8094 7.63
Medial Quadratic Fitting Intercept 0.05835 0.01479 -0.1012 -0.8813 7.272
Maximum cofficient of variation 0.1945 0.1352 0.1218 0.5066 7.244
lateral JSW at x=0.750 [mm] 7.14 1.706 -0.09307 -0.9958 7.045
Mean: medial JSW (Calc) 5.335 1.303 -0.09535 -0.8569 7.021
Standard deviation of the JSW 1.339 0.5907 0.09998 0.7309 6.945
medial JSW at x=0.225 [mm] 5.132 1.322 -0.0953 -0.8199 6.859
medial JSW at x=0.200 [mm] 4.859 1.311 -0.09404 -0.7892 6.842
medial JSW at x=0.250 [mm] 5.504 1.354 -0.09339 -0.8108 6.479
medial JSW at x=0.275 [mm] 5.99 1.4 -0.09135 -0.9164 6.404
medial JSW at x=0.300 [mm] 6.625 1.456 -0.0858 -1.009 6.358

Univariate analysis to 2Y Change WOMAC

Now I study the association to the 2Y change in total WOMAC

rawbase <- subset(OAI_JSW,VISITID==0  & BLKL < 4 & BLKL >= 1 & BLTOTALWOMAC <25 )
rownames(rawbase) <- rawbase$IDSIDE

baseline <- subset(OAI_JSW.full.adj.norm,VISITID==0 & BLKL < 4 & BLKL >= 1 & BLTOTALWOMAC <25 )
rownames(baseline) <- baseline$IDSIDE
baseline <- baseline[complete.cases(baseline),]
rawbase <- rawbase[rownames(baseline),]

rawbase$Worsen <- 1*(rawbase$WOMAC2YTotalChange > 5)
baseline$Worsen <- 1*(baseline$WOMAC2YTotalChange > 5)

UniRankFeaturesRawWomac <- univariateRankVariables(variableList = OAI_JSWVarDescription,
                                                formula = "Worsen ~ 1+BLTOTALWOMAC+BMI+HEIGHT+Gender+AnySurgery",
                                                Outcome = "Worsen",
                                                data = rawbase, 
                                                categorizationType = "Raw", 
                                                type = "LOGIT", 
                                                rankingTest = "zIDI",
                                                description = "Description",
                                               uniType="Binary")

UniRankFeaturesNormWomac <- univariateRankVariables(variableList = OAI_JSWVarDescription,
                                                formula = "Worsen ~ 1+BLTOTALWOMAC+BMI+AnySurgery",
                                                Outcome = "Worsen",
                                                data = baseline, 
                                                categorizationType = "Raw", 
                                                type = "LOGIT", 
                                                rankingTest = "zIDI",
                                                description = "Description",
                                              raw.dataFrame = rawbase,
                                               uniType="Binary")

topnames <- rownames(UniRankFeaturesNormWomac)[1:15]
Cstat <- abs(cbind(UniRankFeaturesRawWomac[topnames,"ZGLM"],UniRankFeaturesNormWomac[topnames,"ZGLM"]))
colnames(Cstat) <- c("Original","Normalized")
rownames(Cstat) <- topnames
ymax <- 1.2*max(Cstat)
barplot(t(Cstat),beside=TRUE,legend = colnames(Cstat),ylim=c(0,ymax),las=2,cex.names=0.5)


BLDescriptive <- cbind(UniRankFeaturesNormWomac$controlMean,UniRankFeaturesNormWomac$controlStd,UniRankFeaturesNormWomac$caseMean,UniRankFeaturesNormWomac$caseStd,UniRankFeaturesNormWomac$ROCAUC,UniRankFeaturesNormWomac$kendall.r,UniRankFeaturesNormWomac$Beta,UniRankFeaturesNormWomac$ZGLM)
rownames(BLDescriptive) <- UniRankFeaturesNormWomac$descrip
colnames(BLDescriptive) <- c("Control Mean","Control Std","Case Mean","Case Std","ROCAUC","Kendall","Beta","Z")
pander::pandoc.table(BLDescriptive[1:15,])
Table continues below
  Control Mean Control Std Case Mean
Mean JSW 6.237 1.091 6.008
lateral JSW at x=0.725 [mm] 7.633 1.678 7.302
lateral JSW at x=0.700 [mm] 7.877 1.787 7.542
lateral JSW at x=0.750 [mm] 7.328 1.613 7.018
medial JSW at x=0.200 [mm] 4.925 1.151 4.735
Medial Quadratic Fitting Intercept 0.05883 0.01306 0.05702
Mean: Lateral JSW (Calc) 6.886 1.504 6.624
Maximum cofficient of variation 0.1788 0.07493 0.1904
Mean: medial JSW (Calc) 5.403 1.157 5.216
medial JSW at x=0.225 [mm] 5.198 1.169 5.013
x coordinate of minimum JSW 0.1237 0.04101 0.1295
lateral JSW at x=0.775 [mm] 7.011 1.584 6.733
Lateral Quadratic Fitting Intercept 0.07529 0.0165 0.07282
medial JSW at x=0.175 [mm] 4.729 1.159 4.546
medial JSW at x=0.250 [mm] 5.578 1.214 5.385
  Case Std ROCAUC Kendall Beta Z
Mean JSW 1.108 0.5955 -0.06759 -0.1351 4.136
lateral JSW at x=0.725 [mm] 1.71 0.5903 -0.06073 -0.1173 3.595
lateral JSW at x=0.700 [mm] 1.772 0.5878 -0.05662 -0.1198 3.37
lateral JSW at x=0.750 [mm] 1.677 0.5895 -0.05766 -0.1045 3.398
medial JSW at x=0.200 [mm] 1.221 0.5864 -0.04067 -0.07795 2.982
Medial Quadratic Fitting Intercept 0.01384 0.5861 -0.0425 -0.08286 2.984
Mean: Lateral JSW (Calc) 1.589 0.5893 -0.05218 -0.09444 3.285
Maximum cofficient of variation 0.08607 0.5907 0.04055 0.07825 3.124
Mean: medial JSW (Calc) 1.204 0.5865 -0.04194 -0.08018 2.882
medial JSW at x=0.225 [mm] 1.223 0.5855 -0.04096 -0.07731 2.838
x coordinate of minimum JSW 0.04354 0.5848 0.04233 0.1273 2.894
lateral JSW at x=0.775 [mm] 1.676 0.588 -0.05265 -0.09501 3.15
Lateral Quadratic Fitting Intercept 0.01748 0.5884 -0.05128 -0.09333 3.136
medial JSW at x=0.175 [mm] 1.232 0.5858 -0.04023 -0.07519 2.636
medial JSW at x=0.250 [mm] 1.241 0.5859 -0.039 -0.07786 2.724

Univariate analysis to 2Y Change Symptoms

Now I study the association to the 2Y change in symptoms

rawbase <- subset(OAI_JSW,VISITID==0  & BLKL < 4 & BLKL >= 1 & BLTOTALWOMAC <25 & PainMedication == 0)
rownames(rawbase) <- rawbase$IDSIDE

baseline <- subset(OAI_JSW.full.adj.norm,VISITID==0 & BLKL < 4 & BLKL >= 1 & BLTOTALWOMAC <25 & PainMedication == 0 )
rownames(baseline) <- baseline$IDSIDE
baseline <- baseline[complete.cases(baseline),]
rawbase <- rawbase[rownames(baseline),]


UniRankFeaturesRawWomac <- univariateRankVariables(variableList = OAI_JSWVarDescription,
                                                formula = "X2YIncreaseSymptoms ~ 1+BLTOTALWOMAC+BMI+HEIGHT+Gender+AnySurgery",
                                                Outcome = "X2YIncreaseSymptoms",
                                                data = rawbase, 
                                                categorizationType = "Raw", 
                                                type = "LOGIT", 
                                                rankingTest = "zIDI",
                                                description = "Description",
                                               uniType="Binary")

UniRankFeaturesNormWomac <- univariateRankVariables(variableList = OAI_JSWVarDescription,
                                                formula = "X2YIncreaseSymptoms ~ 1+BLTOTALWOMAC+BMI+AnySurgery",
                                                Outcome = "X2YIncreaseSymptoms",
                                                data = baseline, 
                                                categorizationType = "Raw", 
                                                type = "LOGIT", 
                                                rankingTest = "zIDI",
                                                description = "Description",
                                              raw.dataFrame = rawbase,
                                               uniType="Binary")

topnames <- rownames(UniRankFeaturesNormWomac)[1:15]
Cstat <- abs(cbind(UniRankFeaturesRawWomac[topnames,"ZGLM"],UniRankFeaturesNormWomac[topnames,"ZGLM"]))
colnames(Cstat) <- c("Original","Normalized")
rownames(Cstat) <- topnames
ymax <- 1.2*max(Cstat)
barplot(t(Cstat),beside=TRUE,legend = colnames(Cstat),ylim=c(0,ymax),las=2,cex.names=0.5)


BLDescriptive <- cbind(UniRankFeaturesNormWomac$controlMean,UniRankFeaturesNormWomac$controlStd,UniRankFeaturesNormWomac$caseMean,UniRankFeaturesNormWomac$caseStd,UniRankFeaturesNormWomac$ROCAUC,UniRankFeaturesNormWomac$kendall.r,UniRankFeaturesNormWomac$Beta,UniRankFeaturesNormWomac$ZGLM)
rownames(BLDescriptive) <- UniRankFeaturesNormWomac$descrip
colnames(BLDescriptive) <- c("Control Mean","Control Std","Case Mean","Case Std","ROCAUC","Kendall","Beta","Z")
pander::pandoc.table(BLDescriptive[1:15,])
Table continues below
  Control Mean Control Std Case Mean
Mean JSW 6.279 1.077 5.99
Medial Quadratic Fitting Intercept 0.05903 0.01271 0.05651
medial JSW at x=0.200 [mm] 4.964 1.114 4.694
medial JSW at x=0.225 [mm] 5.236 1.138 4.97
medial JSW at x=0.175 [mm] 4.771 1.115 4.505
Mean: medial JSW (Calc) 5.44 1.125 5.186
Maximum cofficient of variation 0.1757 0.07148 0.192
medial JSW at x=0.150 [mm] 4.648 1.13 4.399
medial JSW at x=0.250 [mm] 5.614 1.188 5.363
Coefficient of variation of Medial JSW 0.1508 0.06598 0.164
medial minimum JSW [mm] 4.185 1.102 3.975
Mean: Lateral JSW (Calc) 6.932 1.488 6.615
lateral JSW at x=0.725 [mm] 7.692 1.664 7.336
lateral JSW at x=0.750 [mm] 7.374 1.597 7.028
Coefficient of variation of JSW 0.2122 0.08153 0.228
  Case Std ROCAUC Kendall Beta Z
Mean JSW 1.113 0.6111 -0.09425 -0.1884 5.198
Medial Quadratic Fitting Intercept 0.01316 0.6022 -0.07049 -0.1446 4.59
medial JSW at x=0.200 [mm] 1.202 0.6019 -0.06976 -0.1334 4.529
medial JSW at x=0.225 [mm] 1.2 0.601 -0.06902 -0.1367 4.434
medial JSW at x=0.175 [mm] 1.221 0.6 -0.06653 -0.1379 4.286
Mean: medial JSW (Calc) 1.185 0.6002 -0.06447 -0.1322 4.197
Maximum cofficient of variation 0.08823 0.6009 0.06065 0.1245 4.452
medial JSW at x=0.150 [mm] 1.242 0.5969 -0.05814 -0.1216 3.751
medial JSW at x=0.250 [mm] 1.218 0.5981 -0.05961 -0.1241 3.833
Coefficient of variation of Medial JSW 0.08559 0.5931 0.04271 0.1177 3.628
medial minimum JSW [mm] 1.196 0.5947 -0.05167 -0.1062 3.37
Mean: Lateral JSW (Calc) 1.597 0.5995 -0.0697 -0.1205 3.812
lateral JSW at x=0.725 [mm] 1.764 0.5974 -0.06724 -0.1276 3.612
lateral JSW at x=0.750 [mm] 1.717 0.5983 -0.06946 -0.1241 3.7
Coefficient of variation of JSW 0.09576 0.594 0.04704 0.08578 3.607