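Uncertainty in hierarchical clustering (hc) analysis can be assessed by computing a p-value for each cluster via bootstrap resampling. The pvclust package reports two kinds of values: the approximately unbiased (AU) p-value, computed by multiscale bootstrap resampling, and the bootstrap probability (BP) value, computed by ordinary bootstrap resampling. The BP value is the more biased of the two, so the AU p-value is generally the one to rely on.
The following performs hc analysis with multiscale bootstrap resampling using 10 bootstrap repetitions (nboot = 10), complete linkage for cluster joining, and a correlation-based dissimilarity matrix. Note that pvclust clusters the columns of its input, and that 10 repetitions only keep the example fast; many more are needed for stable p-values. Clusters in the dendrogram with an AU value of at least 0.95 are highlighted with red rectangles. The significant clusters are then selected with pvpick().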
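To make the BP idea concrete, the following is a minimal base R sketch of the ordinary bootstrap probability: rows are resampled with replacement, the columns are re-clustered, and for each cluster of the original tree the fraction of replicates in which it reappears is recorded. The simulated data, the helper names (cluster_cols, node_keys) and the number of replicates are assumptions made for illustration. The sketch does not implement the multiscale step that yields AU p-values.

```r
# Base R sketch of the ordinary bootstrap probability (BP); all names are illustrative.
set.seed(42)                      # assumption: fixed seed for reproducibility
n <- 30
sig_a <- rnorm(n)
sig_b <- rnorm(n)
# Six columns to cluster: s1-s3 follow sig_a, s4-s6 follow sig_b
x <- cbind(sapply(1:3, function(i) sig_a + rnorm(n, sd = 0.3)),
           sapply(1:3, function(i) sig_b + rnorm(n, sd = 0.3)))
colnames(x) <- paste0("s", 1:6)

# Cluster the columns with correlation-based dissimilarity and complete linkage
cluster_cols <- function(m) hclust(as.dist(1 - cor(m)), method = "complete")

# One key per internal node: the sorted column indices of its members
node_keys <- function(hc) {
  members <- vector("list", nrow(hc$merge))
  for (i in seq_len(nrow(hc$merge))) {
    members[[i]] <- sort(unlist(lapply(hc$merge[i, ], function(j) {
      if (j < 0) -j else members[[j]]   # negative = single column, positive = earlier merge
    })))
  }
  vapply(members, paste, character(1), collapse = ",")
}

ref_keys <- node_keys(cluster_cols(x))
n_rep <- 200                      # assumption: number of bootstrap replicates
hits <- integer(length(ref_keys))
for (k in seq_len(n_rep)) {
  idx <- sample(nrow(x), replace = TRUE)          # resample rows with replacement
  boot_keys <- node_keys(cluster_cols(x[idx, , drop = FALSE]))
  hits <- hits + (ref_keys %in% boot_keys)        # count clusters that reappear
}
bp <- setNames(hits / n_rep, ref_keys)
print(bp)
```

The root cluster, which contains all columns, reappears in every replicate by construction, so only the values of the smaller clusters are informative.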
library(pvclust)
y <- matrix(rnorm(500), 50, 10, dimnames = list(paste("g", 1:50, sep = ""), paste("t", 1:10, sep = "")))
pv <- pvclust(scale(t(y)), method.dist = "correlation", method.hclust = "complete", nboot = 10)
## Bootstrap (r = 0.5)... Done.
## Bootstrap (r = 0.6)... Done.
## Bootstrap (r = 0.7)... Done.
## Bootstrap (r = 0.8)... Done.
## Bootstrap (r = 0.9)... Done.
## Bootstrap (r = 1.0)... Done.
## Bootstrap (r = 1.1)... Done.
## Bootstrap (r = 1.2)... Done.
## Bootstrap (r = 1.3)... Done.
## Bootstrap (r = 1.4)... Done.
plot(pv, hang=-1)
pvrect(pv, alpha = 0.95)
clsig <- unlist(pvpick(pv, alpha = 0.95, pv = "au", type = "geq", max.only = TRUE)$clusters)
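For orientation, the tree that the p-values are attached to can be sketched in base R alone. The block below assumes that method.dist = "correlation" corresponds to one minus the Pearson correlation between columns, which is how the pvclust documentation describes it; the seed and the object names (x, d, hc) are assumptions made here for illustration.

```r
# Base R sketch of the clustering step only (no bootstrap, no p-values).
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
y <- matrix(rnorm(500), 50, 10,
            dimnames = list(paste0("g", 1:50), paste0("t", 1:10)))
x <- scale(t(y))                 # rows: t1..t10, columns: g1..g50

# Correlation-based dissimilarity between the columns of x
d <- as.dist(1 - cor(x))

# Complete linkage on that dissimilarity matrix
hc <- hclust(d, method = "complete")
plot(hc, hang = -1)
```

Because the Pearson correlation is unchanged by centering and scaling each column, the scale() call should not affect this particular dissimilarity; it is kept here only to mirror the call above.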
Plot a dendrogram where the significant clusters are highlighted in red.
source("http://faculty.ucr.edu/~tgirke/Documents/R_BioCond/My_R_Scripts/dendroCol.R")
dend_colored <- dendrapply(as.dendrogram(pv$hclust), dendroCol, keys=clsig, xPar="edgePar", bgr="black", fgr="red", pch=20)
plot(dend_colored, horiz = TRUE)
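The coloring above depends on a script loaded from a remote URL. If that script is unavailable, a similar effect for the leaf labels can be sketched with base R alone. The function color_leaves below is hypothetical and is not a reimplementation of dendroCol.R; the USArrests data and the two chosen labels are stand-ins, not the result of any significance test.

```r
# Base R sketch: color the leaf labels whose names are listed in 'keys'.
color_leaves <- function(node, keys, fgr = "red", bgr = "black") {
  if (is.leaf(node)) {
    hit <- attr(node, "label") %in% keys
    # lab.col sets the label color; pch = NA suppresses the leaf point symbol
    attr(node, "nodePar") <- list(lab.col = if (hit) fgr else bgr, pch = NA)
  }
  node
}

hc <- hclust(dist(USArrests[1:10, ]), method = "complete")
keys <- c("Alabama", "Alaska")   # assumption: labels picked only for illustration
dend <- dendrapply(as.dendrogram(hc), color_leaves, keys = keys)
plot(dend, horiz = TRUE)
```

With the pvclust result from above, as.dendrogram(pv$hclust) and keys = clsig would take the place of the stand-in tree and labels.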