Loading the libraries
> library(readr)
> library(dplyr)
> library(clustrd)
> library(knitr)
Loading and displaying the data
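> # Read the raw data
> mori1 <- read_csv("mori1.csv")
> # Keep the code column and the variables used for clustering
> sui1 <- select(mori1, code, rate, industry1, aging:taxgain)
> # Keep the rows whose five-digit code starts with 13 (Tokyo)
> sui1.tokyo <- sui1[which(as.numeric(sui1$code) %/% 1000 == 13),]
> # Show the first 10 rows as a table
> kable(sui1.tokyo[1:10,])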
| code | rate | industry1 | aging | unemployment | taxgain |
|---|---|---|---|---|---|
| 13101 | 19.59 | 0.0003645 | 0.1761120 | 0.0181717 | 0.1134598 |
| 13102 | 19.77 | 0.0003888 | 0.1607417 | 0.0240528 | 0.1948065 |
| 13103 | 17.35 | 0.0006927 | 0.1754911 | 0.0273618 | 0.5428776 |
| 13104 | 20.16 | 0.0006748 | 0.1956889 | 0.0343030 | 0.3382430 |
| 13105 | 11.96 | 0.0006616 | 0.1909031 | 0.0272838 | 0.2612840 |
| 13106 | 18.35 | 0.0006166 | 0.2352163 | 0.0376425 | 0.1590275 |
| 13107 | 17.11 | 0.0007521 | 0.2270851 | 0.0371070 | 0.1969722 |
| 13108 | 16.95 | 0.0006992 | 0.2108695 | 0.0351113 | 0.4161563 |
| 13109 | 10.32 | 0.0009192 | 0.2022644 | 0.0329637 | 0.3759405 |
| 13110 | 11.20 | 0.0017310 | 0.1988243 | 0.0311272 | 0.3609426 |
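The Tokyo filter above relies on integer division: dividing a five-digit code by 1000 and discarding the remainder leaves the leading two digits. A minimal base-R sketch on made-up codes follows; the vector `codes` and its values are hypothetical and chosen only for illustration.

```r
# Hypothetical five-digit codes stored as character, as they might be after read_csv()
codes <- c("13101", "13102", "14100", "01100", "13229")

# Integer division by 1000 drops the last three digits, leaving the leading digits
prefix <- as.numeric(codes) %/% 1000

# Logical mask for codes whose prefix is 13
is_13 <- prefix == 13

# Subset with the mask; which() also drops any NA produced by a failed conversion
codes_13 <- codes[which(is_13)]
print(codes_13)
```

Wrapping the mask in `which()` is not strictly required, but it avoids `NA` rows in the subset if a code cannot be converted to a number.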
> sui1.outRKM.1 <- cluspca(sui1.tokyo[, -1], nclus = 3, ndim = 2,
+ method = "RKM",
+ rotation = "varimax",
+ scale = TRUE, nstart = 10, center = TRUE, seed = 1234)
> summary(sui1.outRKM.1)
Solution with 3 clusters of sizes 51 (82.3%), 8 (12.9%), 3 (4.8%) in 2 dimensions. Variables were mean centered and standardized.
Cluster centroids:
           Dim.1   Dim.2
Cluster 1 0.5672 -0.1095
Cluster 2 0.5672 -0.1095
Cluster 3 0.5672 -0.1095
Variable scores:
               Dim.1   Dim.2
rate          0.0812 -0.7557
industry1    -0.6449  0.0022
aging        -0.4729 -0.5230
unemployment  0.4759 -0.3941
taxgain       0.3569  0.0085
Within cluster sum of squares by cluster:
[1] 27.8825 20.5320 3.7710
(between_SS / total_SS = 0 %)
Clustering vector:
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[36] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 2 2 2 2 2 2 3 2 2
Objective criterion value: 73.1532
Available output:
[1] "obscoord" "attcoord" "centroid" "cluster" "criterion"
[6] "size" "odata" "scale" "center" "nstart"
> plot(sui1.outRKM.1, cludesc = TRUE)
> sui1.outRKM.2 <- cluspca(sui1.tokyo[, -1], nclus = 3, ndim = 2,
+ method = "RKM",
+ rotation = "none",
+ scale = TRUE, nstart = 10, center = TRUE, seed = 1234)
> summary(sui1.outRKM.2)
Solution with 3 clusters of sizes 51 (82.3%), 8 (12.9%), 3 (4.8%) in 2 dimensions. Variables were mean centered and standardized.
Cluster centroids:
            Dim.1   Dim.2
Cluster 1  0.5523  0.1693
Cluster 2 -3.2806  0.4883
Cluster 3 -0.6414 -4.1806
Variable scores:
               Dim.1   Dim.2
rate          0.4262 -0.6293
industry1    -0.5706 -0.3006
aging        -0.1723 -0.6837
unemployment  0.6051 -0.1248
taxgain       0.3112  0.1749
Within cluster sum of squares by cluster:
[1] 27.8825 20.5320 3.7710
(between_SS / total_SS = 75.25 %)
Clustering vector:
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[36] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 2 2 2 2 2 2 3 2 2
Objective criterion value: 73.1532
Available output:
[1] "obscoord" "attcoord" "centroid" "cluster" "criterion"
[6] "size" "odata" "scale" "center" "nstart"
> plot(sui1.outRKM.2, cludesc = TRUE)
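The two RKM summaries above report identical cluster sizes, clustering vectors, within-cluster sums of squares, and objective value. That is what one would expect from an orthogonal rotation, which leaves distances in the reduced space unchanged. The base-R sketch below illustrates the property with `stats::varimax` on the built-in `USArrests` data. It uses plain PCA rather than reduced k-means and is not the data analysed here, so it is an illustration of the general idea only.

```r
# Plain PCA on a built-in dataset (illustration only; not reduced k-means)
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)
L <- pca$rotation[, 1:2]   # variable loadings on the first two components
S <- pca$x[, 1:2]          # observation scores on the same components

# Varimax rotation of the loadings; rotmat is the 2 x 2 rotation matrix
vm <- varimax(L)
R <- vm$rotmat

# Apply the same rotation to the observation scores
S_rot <- S %*% R

# R is orthogonal, so R'R is the identity matrix
orthogonal <- isTRUE(all.equal(crossprod(R), diag(2)))

# Pairwise distances between observations are unchanged by the rotation
same_dist <- isTRUE(all.equal(as.vector(dist(S)), as.vector(dist(S_rot))))

print(c(orthogonal = orthogonal, same_dist = same_dist))
```

Because distances are preserved, any distance-based partition of the rotated scores matches the partition of the unrotated ones; the rotation changes how the axes are read, not which observations group together.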
> sui1.outFKM.1 <- cluspca(sui1.tokyo[, -1], nclus = 3, ndim = 2,
+ method = "FKM",
+ rotation = "varimax",
+ scale = TRUE, nstart = 10, center = TRUE, seed = 1234)
> summary(sui1.outFKM.1)
Solution with 3 clusters of sizes 30 (48.4%), 26 (41.9%), 6 (9.7%) in 2 dimensions. Variables were mean centered and standardized.
Cluster centroids:
           Dim.1   Dim.2
Cluster 1 0.5507 -0.2206
Cluster 2 0.5507 -0.2206
Cluster 3 0.5507 -0.2206
Variable scores:
               Dim.1   Dim.2
rate          0.5946 -0.1470
industry1     0.0169  0.6093
aging        -0.6889 -0.3545
unemployment -0.1755  0.6824
taxgain      -0.3753  0.1260
Within cluster sum of squares by cluster:
[1] 4.5353 9.8674 2.8766
(between_SS / total_SS = 0 %)
Clustering vector:
[1] 2 2 2 2 2 2 2 1 1 1 1 3 2 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1 2 1 1 1 2 2 2
[36] 2 2 2 2 2 1 1 1 2 1 1 1 2 1 1 3 2 3 3 1 3 1 3 2 2 2 2
Objective criterion value: 17.2792
Available output:
[1] "obscoord" "attcoord" "centroid" "cluster" "criterion"
[6] "size" "odata" "scale" "center" "nstart"
> plot(sui1.outFKM.1, cludesc = TRUE)
> sui1.outFKM.2 <- cluspca(sui1.tokyo[, -1], nclus = 3, ndim = 2,
+ method = "FKM",
+ rotation = "none",
+ scale = TRUE, nstart = 10, center = TRUE, seed = 1234)
> summary(sui1.outFKM.2)
Solution with 3 clusters of sizes 30 (48.4%), 26 (41.9%), 6 (9.7%) in 2 dimensions. Variables were mean centered and standardized.
Cluster centroids:
            Dim.1   Dim.2
Cluster 1 -0.1670  0.2605
Cluster 2  0.5641 -0.1839
Cluster 3 -1.6091 -0.5054
Variable scores:
               Dim.1   Dim.2
rate          0.6030 -0.1076
industry1    -0.0233  0.6091
aging        -0.6640 -0.3991
unemployment -0.2200  0.6693
taxgain      -0.3828  0.1011
Within cluster sum of squares by cluster:
[1] 4.5353 9.8674 2.8766
(between_SS / total_SS = 62.74 %)
Clustering vector:
[1] 2 2 2 2 2 2 2 1 1 1 1 3 2 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1 2 1 1 1 2 2 2
[36] 2 2 2 2 2 1 1 1 2 1 1 1 2 1 1 3 2 3 3 1 3 1 3 2 2 2 2
Objective criterion value: 17.2792
Available output:
[1] "obscoord" "attcoord" "centroid" "cluster" "criterion"
[6] "size" "odata" "scale" "center" "nstart"
> plot(sui1.outFKM.2, cludesc = TRUE)
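Each summary above reports a `between_SS / total_SS` percentage. As a rough illustration of what such a ratio measures, the sketch below computes it for ordinary k-means (`stats::kmeans`) on the built-in `iris` measurements, first from the fitted object and then by hand. How `clustrd` derives its own figure in the reduced space is not shown in this transcript, so this should be read as the general idea, not as a reproduction of the values above.

```r
set.seed(1234)
X <- scale(iris[, 1:4])                 # centre and standardise the variables
km <- kmeans(X, centers = 3, nstart = 10)

# Ratio taken from the components returned by kmeans()
ratio_fit <- km$betweenss / km$totss

# Same ratio by hand: total SS around the grand mean minus the within-cluster SS
grand_mean <- colMeans(X)
total_ss <- sum(sweep(X, 2, grand_mean)^2)
within_ss <- sum(sapply(1:3, function(k) {
  Xk <- X[km$cluster == k, , drop = FALSE]
  sum(sweep(Xk, 2, colMeans(Xk))^2)
}))
ratio_hand <- (total_ss - within_ss) / total_ss

print(round(100 * c(fitted = ratio_fit, by_hand = ratio_hand), 2))
```

A ratio near 100 % means most of the variation lies between cluster centres; a ratio near 0 % means the clusters barely separate along the measured directions.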