Abstract
Co-clustering with the blockcluster package
require(UsingR)
require(ggplot2)
require(tidyr)
require(dplyr)
require(plotly)
require(coefplot)
require(blockcluster)
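Co-clustering algorithms fall into two groups: those that require the number of clusters to be specified in advance and those that do not. The infinite relational model is an example of an algorithm that does not require it. Here we use an algorithm that takes the number of row clusters and column clusters as input.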
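As a rough illustration of what fixing the number of clusters means, the sketch below simulates a binary matrix with 2 row clusters and 3 column clusters in base R. Every cell in a block shares one Bernoulli probability, which is the structure a Bernoulli latent block model assumes. The matrix size, the probabilities in block_prob, and all variable names are made up for this example and are not taken from the blockcluster data.

```r
# Minimal sketch: simulate block-structured binary data (hypothetical sizes and probabilities)
set.seed(1)
n_rows <- 200
n_cols <- 90

# Latent cluster label for every row (2 clusters) and every column (3 clusters)
row_cluster <- sample(1:2, n_rows, replace = TRUE)
col_cluster <- sample(1:3, n_cols, replace = TRUE)

# One success probability per (row cluster, column cluster) block
block_prob <- matrix(c(0.1, 0.8, 0.1,
                       0.3, 0.1, 0.9), nrow = 2, byrow = TRUE)

# Expand block probabilities to one probability per cell, then draw 0/1 values
cell_prob <- block_prob[row_cluster, col_cluster]
x <- matrix(rbinom(n_rows * n_cols, size = 1, prob = cell_prob), n_rows, n_cols)

# Empirical mean of each block; it should be close to block_prob
block_mean <- matrix(NA_real_, nrow = 2, ncol = 3)
for (k in 1:2) {
  for (l in 1:3) {
    block_mean[k, l] <- mean(x[row_cluster == k, col_cluster == l])
  }
}
round(block_mean, 2)
```

A co-clustering algorithm works in the opposite direction: given only x and the pair (2, 3), it tries to recover row_cluster, col_cluster, and the block parameters.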
defaultstrategy <- coclusterStrategy()
summary(defaultstrategy)
## ******************************************************************
## Algorithm: BEM
## Initialization method(There is no default value):
## Stopping Criteria: Parameter
##
## Various Iterations
## ******************
## Number of global iterations while running initialization: 10
## Number of iterations for internal E-step: 5
## Number of EM iterations used during xem: 50
## Number of EM iterations used during XEM: 500
## Number of xem iterations: 5
## Number of tries: 2
##
## Various epsilons
## ****************
## Tolerance value used while initialization: 0.01
## Tolerance value for internal E-step: 0.01
## Tolerance value used during xem: 1e-04
## Tolerance value used during XEM: 1e-10
## Hyper-parameters: 1 1
## ******************************************************************
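The defaults printed above can be overridden when the strategy object is created. The following is a minimal sketch, assuming the argument names nbtry, nbxem, and algo of coclusterStrategy() match the installed blockcluster version. The values 5 and 10 are arbitrary choices for illustration, not recommendations.

```r
library(blockcluster)

# Sketch: more restarts than the default (assumed arguments: algo, nbtry, nbxem)
customstrategy <- coclusterStrategy(
  algo  = "BEM",  # same algorithm as the default shown in the summary
  nbtry = 5,      # number of tries (default 2)
  nbxem = 10      # number of xem iterations (default 5)
)
summary(customstrategy)
```

The object can then be passed to the fitting function through its strategy argument, for example coclusterBinary(binarydata, nbcocluster = c(2, 3), strategy = customstrategy). More tries cost more time but reduce the risk of ending in a poor local optimum.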
data("binarydata")
binarydata[1:10,1:10]
##       V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
##  [1,]  1  1  1  0  1  1  1  1  1   0
##  [2,]  0  0  1  0  0  0  0  0  0   1
##  [3,]  0  1  1  0  1  1  1  1  1   0
##  [4,]  0  0  0  0  1  0  1  1  0   1
##  [5,]  0  0  0  1  1  1  1  1  1   0
##  [6,]  1  1  1  1  0  0  1  1  0   1
##  [7,]  0  0  0  1  0  0  0  1  0   1
##  [8,]  0  0  0  1  0  0  0  0  0   1
##  [9,]  0  0  1  0  0  1  1  1  1   0
## [10,]  0  0  0  1  0  1  0  0  0   1
out<-coclusterBinary(binarydata, nbcocluster=c(2,3))
## Co-Clustering successfully terminated!
summary(out)
## ******************************************************************
## Model Family : Bernoulli Latent block model
## Model Name : pik_rhol_epsilonkl
## Co-Clustering Type : Unsupervised
## ICL value: -45557.07
##
## Model Parameters..
##
## Class Mean:
##       [,1]  [,2]  [,3]
## [1,] FALSE  TRUE FALSE
## [2,] FALSE FALSE  TRUE
##
## Class Dispersion:
##            [,1]      [,2]      [,3]
## [1,] 0.30176927 0.2003679 0.1006314
## [2,] 0.09798014 0.3022391 0.1011803
##
## Row proportions: 0.382 0.618
## Column proportions: 0.29 0.37 0.34
## Pseudo-likelihood: -0.4552043
## ******************************************************************
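Because the number of clusters has to be supplied, one possible way to choose it is to fit several candidate pairs and compare the ICL value reported in the summary. The sketch below assumes that the fitted object exposes this value in a slot named ICLvalue and that a larger ICL indicates a better model. The candidate grid of 2 to 3 clusters per dimension is an arbitrary choice for illustration.

```r
library(blockcluster)
data("binarydata")

# Candidate (row clusters, column clusters) pairs; hypothetical small grid
candidates <- expand.grid(rows = 2:3, cols = 2:3)
candidates$icl <- NA_real_

for (i in seq_len(nrow(candidates))) {
  fit <- coclusterBinary(
    binarydata,
    nbcocluster = c(candidates$rows[i], candidates$cols[i])
  )
  # Assumed slot name; keep NA if the run failed and no value is available
  icl <- fit@ICLvalue
  if (length(icl) == 1) candidates$icl[i] <- icl
}

# Sort so that the largest ICL comes first (NA values go last)
candidates[order(-candidates$icl), ]
```

The fits start from random initializations, so the ranking may change slightly between runs. Increasing nbtry in the strategy is one way to make it more stable.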
plot(out)
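Beyond the plot, the cluster assignments themselves are often needed. The following sketch assumes the fitted object stores row and column labels in slots named rowclass and colclass. It uses them to count cluster sizes and to reorder the matrix so that rows and columns of the same cluster sit next to each other.

```r
library(blockcluster)
data("binarydata")

out <- coclusterBinary(binarydata, nbcocluster = c(2, 3))

# Assumed slots: one label per row and one label per column
row_labels <- out@rowclass
col_labels <- out@colclass

# Cluster sizes (labels may start at 0 rather than 1, depending on the version)
table(row_labels)
table(col_labels)

# Reorder rows and columns by cluster label to make the block structure visible
reordered <- binarydata[order(row_labels), order(col_labels)]
dim(reordered)
```

Sorting with order() works whether the labels start at 0 or at 1, so the sketch does not depend on the indexing convention. The plot method also appears to accept a type argument, for example plot(out, type = "distribution"), but check the package documentation for the options your version supports.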