library(bigMap)
# source aux. stuff (graph plot function)
source('../graphs.R')
Run pt-SNE on Sierpinski-3d with perplexity 1948 (95% of the data set size) and a decreasing range of thread-ratio. We show:
load('../s3d.RData')
threads <- c(1, 3, 4, 5, 6, 8)
# compute bandwiths ($\beta_{i}$) for $ppx=1948$,
g.0 <- bdm.init(s3d$data, is.distance = T, ppx = 1948, threads = 4)
# run pt-SNE
g.list <- lapply(threads, function(z)
{
g <- bdm.ptsne(s3d$data, g.0, theta = 0.0, threads = z, layers = ifelse(z > 1, 2, 1))
g <- bdm.hlCorr(s3d$data, g, threads = 4)
g <- bdm.knp(s3d$data, g, threads = 4)
g
})
save(g.list, file = './s3d_ppx_1948.RData')
nulL <- lapply(g.list, function(g) bdm.cost(g))
Note that the global structure is more or less preserved (we are using a very high perplexity) but the local structure gets worst as the thread-ratio is decreased.
nulL <- lapply(g.list, function(g) graph.plot(g, s3d$edges))
In terms of HL-correlation, the loss in structure (on average) is minimal.
hlTable <- sapply(g.list, function(g) mean(g$hlC))
hlTable <- matrix(round(hlTable, 4), nrow = 1)
threadRatio <- sapply(g.list, function(g) g$ptsne$layers /g$ptsne$threads)
colnames(hlTable) <- round(threadRatio, 2)
rownames(hlTable) <- c('<hlC>')
knitr::kable(hlTable, caption = 'hl-Correlation by thread-ratio') %>%
kable_styling(full_width = F)
| 1 | 0.67 | 0.5 | 0.4 | 0.33 | 0.25 | |
|---|---|---|---|---|---|---|
| <hlC> | 0.7451 | 0.7468 | 0.747 | 0.7526 | 0.7486 | 0.7509 |
In terms of kNP, the global structure (linAUC) is notably preserved (with the exception of g5 with thread-ratio = 0.33, blue line) but the loss in local structre (logAUC) is clearly perceptible.
bdm.knp.plot(g.list)
Note the correlation between decreasing thread-ratio and decreasing running time.
rTimes <- sapply(g.list, function(g) c(g$ppx$t[3], g$t$epoch, g$t$ptsne[3], sum(c(g$ppx$t[3], g$t$ptsne[3]))))
rTimes <- round(rTimes, 2)
threadRatio <- sapply(g.list, function(g) g$ptsne$layers /g$ptsne$threads)
colnames(rTimes) <- round(threadRatio, 2)
rownames(rTimes) <- c('betas', 'epoch', 'ptSNE', 'total')
knitr::kable(rTimes, caption = 'Computation times (s) by thread-ratio') %>%
kable_styling(full_width = F)
| 1 | 0.67 | 0.5 | 0.4 | 0.33 | 0.25 | |
|---|---|---|---|---|---|---|
| betas | 0.74 | 0.74 | 0.74 | 0.74 | 0.74 | 0.74 |
| epoch | 2.18 | 0.97 | 0.62 | 0.54 | 0.42 | 0.32 |
| ptSNE | 66.04 | 29.68 | 19.04 | 16.60 | 13.00 | 9.96 |
| total | 66.78 | 30.42 | 19.78 | 17.34 | 13.74 | 10.69 |
Run on: Intel(R) Xeon(R) CPU E31225 @ 3.10GHz, 4 cores, 16GB RAM.