Maureen Corrales
Gabriel Cordero
Felipe Jiménez
setwd("C:/Users/Usuario/Desktop/Datos_Mineria")
REC <- read.table("EjemploAlgoritmosRecomendacion.csv", header = TRUE, sep = ";",
dec = ",")
suppressMessages(library(ggplot2))
suppressMessages(library(FactoMineR))
suppressWarnings(library(cluster))
suppressWarnings(library(grid))
suppressWarnings(library(vcd))
suppressMessages(library(rattle))
Part a. k-means with iter.max=200, k=4, and verification of Fisher's theorem
head(REC)
## X Velocidad.Entrega Precio Durabilidad Imagen.Producto
## 1 Adam 2.05 0.30 3.45 2.35
## 2 Anna 0.90 1.50 3.15 3.30
## 3 Bernard 1.70 2.60 2.85 3.00
## 4 Edward 1.35 0.50 3.55 2.95
## 5 Emilia 3.00 0.45 4.80 3.90
## 6 Fabian 0.95 1.65 3.95 2.40
## Valor.Educativo Servicio.Retorno Tamano.Paquete Calidad.Producto
## 1 2.4 2.3 2.60 2.10
## 2 2.5 4.0 4.20 2.15
## 3 4.3 2.7 4.10 2.60
## 4 1.8 2.3 3.90 1.95
## 5 3.4 4.6 2.25 3.40
## 6 2.6 1.9 4.85 2.20
## Numero.Estrellas
## 1 1.7
## 2 2.8
## 3 3.3
## 4 1.7
## 5 4.3
## 6 3.0
str(REC)
## 'data.frame': 100 obs. of 10 variables:
## $ X : Factor w/ 100 levels "Adam","Anna",..: 1 2 3 4 5 11 79 19 100 20 ...
## $ Velocidad.Entrega: num 2.05 0.9 1.7 1.35 3 0.95 2.3 0.65 2.75 2 ...
## $ Precio : num 0.3 1.5 2.6 0.5 0.45 1.65 1.2 2.1 0.8 1.75 ...
## $ Durabilidad : num 3.45 3.15 2.85 3.55 4.8 3.95 4.75 3.1 4.7 3.25 ...
## $ Imagen.Producto : num 2.35 3.3 3 2.95 3.9 2.4 3.3 2.55 2.35 3 ...
## $ Valor.Educativo : num 2.4 2.5 4.3 1.8 3.4 2.6 3.5 2.8 3.5 3.7 ...
## $ Servicio.Retorno : num 2.3 4 2.7 2.3 4.6 1.9 4.5 2.2 3 3.2 ...
## $ Tamano.Paquete : num 2.6 4.2 4.1 3.9 2.25 4.85 3.8 3.45 3.8 4.35 ...
## $ Calidad.Producto : num 2.1 2.15 2.6 1.95 3.4 2.2 2.9 2.15 2.7 2.7 ...
## $ Numero.Estrellas : num 1.7 2.8 3.3 1.7 4.3 3 3.1 2.9 4.8 3.9 ...
K4 <- kmeans(REC[, 2:10], 4, iter.max = 200, nstart = 50)
K4
## K-means clustering with 4 clusters of sizes 27, 29, 14, 30
##
## Cluster means:
## Velocidad.Entrega Precio Durabilidad Imagen.Producto Valor.Educativo
## 1 1.463 1.8926 3.380 3.020 3.348
## 2 2.214 0.8845 4.443 2.328 3.103
## 3 2.571 0.8857 4.682 3.096 3.443
## 4 1.202 0.9683 3.635 2.333 2.100
## Servicio.Retorno Tamano.Paquete Calidad.Producto Numero.Estrellas
## 1 3.044 3.948 2.389 3.185
## 2 2.179 2.764 2.557 3.528
## 3 3.579 3.489 2.961 4.271
## 4 2.367 3.765 1.948 2.097
##
## Clustering vector:
## [1] 4 1 1 4 3 4 3 1 3 1 4 2 4 4 3 2 1 2 3 3 4 4 1 4 2 2 4 3 2 1 1 1 2 1 4
## [36] 4 1 2 4 4 4 3 2 2 4 2 2 1 2 2 2 1 1 4 4 1 3 3 2 1 2 2 2 1 4 2 3 1 2 1
## [71] 1 2 2 3 4 2 2 2 4 2 2 1 1 4 1 4 1 4 4 3 1 2 1 4 4 4 3 4 1 4
##
## Within cluster sum of squares by cluster:
## [1] 56.17 55.38 27.98 64.46
## (between_SS / total_SS = 52.9 %)
##
## Available components:
##
## [1] "cluster" "centers" "totss" "withinss"
## [5] "tot.withinss" "betweenss" "size" "iter"
## [9] "ifault"
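# Verification of Fisher's theorem: total inertia = within-cluster + between-cluster
# (all.equal() compares floating-point sums with a tolerance, unlike ==)
isTRUE(all.equal(K4$totss, K4$tot.withinss + K4$betweenss))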
## [1] TRUE
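The decomposition checked above can also be computed by hand. The following is a minimal sketch on synthetic data; the matrix `X`, the seed and the choice of 3 clusters are assumptions made only for illustration and are not part of the exercise.

```r
set.seed(1)                                   # illustrative seed (assumption)
X <- matrix(rnorm(200), ncol = 4)             # 50 synthetic points, 4 variables
km <- kmeans(X, centers = 3, iter.max = 200, nstart = 10)

g <- colMeans(X)                              # global centroid
# Total inertia: squared distances from every point to the global centroid
total <- sum(sweep(X, 2, g)^2)
# Within-cluster inertia: squared distances from every point to its own center
within <- sum((X - km$centers[km$cluster, ])^2)
# Between-cluster inertia: size-weighted squared distances from centers to g
between <- sum(km$size * rowSums(sweep(km$centers, 2, g)^2))

# Compare with a tolerance, since these are floating-point sums
fisher_ok <- isTRUE(all.equal(total, within + between))
fisher_ok
```

The three hand-computed terms should agree with `km$totss`, `km$tot.withinss` and `km$betweenss` up to rounding.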
Part b. k-means with iter.max=200, k=2, and horizontal-vertical interpretation
K2 <- kmeans(REC[, 2:10], 2, iter.max = 200)
# Interpretation
centros <- K2$centers
centros
## Velocidad.Entrega Precio Durabilidad Imagen.Producto Valor.Educativo
## 1 2.161 1.192 4.292 2.708 3.353
## 2 1.264 1.170 3.526 2.521 2.382
## Servicio.Retorno Tamano.Paquete Calidad.Producto Numero.Estrellas
## 1 2.771 3.254 2.676 3.745
## 2 2.536 3.769 2.030 2.333
rownames(centros) <- c("Cluster1", "Cluster2")
centros <- t(centros)
atributo <- c("Velocidad.Entrega", "Precio", "Durabilidad", "Imagen.Producto",
"Valor.Educativo", "Servicio.Retorno", "Tamano.Paquete", "Calidad.Producto",
"Numero.Estrella")
cluster1 <- cbind(cbind(centros[, 1], "1"), atributo)
cluster2 <- cbind(cbind(centros[, 2], "2"), atributo)
clusters <- rbind(cluster1, cluster2)
colnames(clusters) <- c("Centro", "Cluster", "Atributo")
rownames(clusters) <- c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11",
"12", "13", "14", "15", "16", "17", "18")
clusters <- as.data.frame(clusters)
clusters$Centro <- as.numeric(as.character(clusters$Centro))
clusters
## Centro Cluster Atributo
## 1 2.161 1 Velocidad.Entrega
## 2 1.192 1 Precio
## 3 4.292 1 Durabilidad
## 4 2.708 1 Imagen.Producto
## 5 3.353 1 Valor.Educativo
## 6 2.771 1 Servicio.Retorno
## 7 3.254 1 Tamano.Paquete
## 8 2.676 1 Calidad.Producto
## 9 3.745 1 Numero.Estrella
## 10 1.264 2 Velocidad.Entrega
## 11 1.170 2 Precio
## 12 3.526 2 Durabilidad
## 13 2.521 2 Imagen.Producto
## 14 2.382 2 Valor.Educativo
## 15 2.536 2 Servicio.Retorno
## 16 3.769 2 Tamano.Paquete
## 17 2.030 2 Calidad.Producto
## 18 2.333 2 Numero.Estrella
g1 <- ggplot(data = clusters, aes(x = Atributo, y = Centro, colour = Cluster, fill = Cluster)) +
    geom_bar(stat = "identity", position = "dodge") +
    scale_colour_brewer() +
    ggtitle("Distribution of the attributes by cluster") +
    theme(text = element_text(size = 12, face = "italic")) +
    labs(x = "Attribute", y = "Center") +
    ylim(0, 5)
g1
Interpretation: Two groups of customers emerge, and both rate price very similarly (and low). Group 1 gives higher ratings for quality, durability, number of stars, return service and delivery speed. Group 2, in turn, gives a higher rating only for package size.
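The long table of centers used for this kind of bar chart can be built more compactly. The sketch below is one possible way, shown on synthetic data; the data frame `X`, its column names `A`, `B`, `C` and the object name `centros_largo` are hypothetical and chosen only for the example.

```r
set.seed(1)                                   # illustrative seed (assumption)
X <- as.data.frame(matrix(runif(60), ncol = 3,
                          dimnames = list(NULL, c("A", "B", "C"))))
km <- kmeans(X, centers = 2, iter.max = 200, nstart = 10)

# as.table() followed by as.data.frame() melts the centers matrix into one
# row per (cluster, attribute) pair; the centers stay numeric throughout
centros_largo <- as.data.frame(as.table(km$centers), stringsAsFactors = FALSE)
colnames(centros_largo) <- c("Cluster", "Atributo", "Centro")
centros_largo
```

Because the centers are never pasted together with character columns, no `as.numeric(as.character(...))` conversion is needed afterwards.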
Part c. k-means with 50 runs and iter.max=200, to plot the Jambu elbow
InerciaIC <- rep(0, 50)
for (k in 1:50) {
    # iter.max = 200 as stated in the heading; nstart = 50 gives the 50 runs
    K <- kmeans(REC[, 2:10], k, iter.max = 200, nstart = 50)
    InerciaIC[k] <- K$tot.withinss
}
plot(InerciaIC, col = "skyblue", type = "b")
Comment: The decrease in inertia levels off from about 2 or 3 clusters onward, since the changes after those points are small.
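Reading the elbow can be supported by looking at the relative drop in inertia between consecutive values of k. The following is a rough sketch on synthetic data with two well-separated groups; `X`, `ks`, `inercia` and `caida` are hypothetical names, and the seed is arbitrary.

```r
set.seed(1)                                   # illustrative seed (assumption)
# Two synthetic groups centered at (0, 0) and (5, 5)
X <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 5), ncol = 2))

ks <- 1:10
# Within-cluster inertia for each k
inercia <- sapply(ks, function(k) {
  kmeans(X, centers = k, iter.max = 200, nstart = 20)$tot.withinss
})

# Relative drop in inertia when going from k to k + 1
caida <- -diff(inercia) / inercia[-length(inercia)]

plot(ks, inercia, type = "b", xlab = "k", ylab = "Within-cluster inertia")
```

A large value of `caida` followed by small ones marks the elbow; here the largest drop is expected between k = 1 and k = 2, matching how the data were generated.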
Part d. k-means with 7 clusters, to make recommendations for Leo, Teresa and Justin
K7 <- kmeans(REC[, 2:10], 7, iter.max = 200, nstart = 30)
RECK7 <- cbind(REC, K7$cluster)
RECK7
## X Velocidad.Entrega Precio Durabilidad Imagen.Producto
## 1 Adam 2.05 0.30 3.45 2.35
## 2 Anna 0.90 1.50 3.15 3.30
## 3 Bernard 1.70 2.60 2.85 3.00
## 4 Edward 1.35 0.50 3.55 2.95
## 5 Emilia 3.00 0.45 4.80 3.90
## 6 Fabian 0.95 1.65 3.95 2.40
## 7 Philip 2.30 1.20 4.75 3.30
## 8 Frank 0.65 2.10 3.10 2.55
## 9 Xavier 2.75 0.80 4.70 2.35
## 10 Gabriel 2.00 1.75 3.25 3.00
## 11 Marisol 1.20 0.80 4.40 2.40
## 12 Henry 1.95 1.10 4.55 2.30
## 13 Irene 1.40 0.70 4.05 1.90
## 14 Isabelle 1.85 0.75 4.30 2.85
## 15 Isidore 2.35 0.65 4.95 3.35
## 16 Joseph 1.70 1.00 4.85 2.35
## 17 Eugene 1.60 2.05 2.85 2.55
## 18 Eugenia 2.45 0.90 3.85 2.15
## 19 Eunice 2.65 0.70 4.85 3.05
## 20 Eva 2.35 0.65 4.95 3.35
## 21 Evdokia 1.65 0.45 4.30 2.00
## 22 Fedir 1.70 0.20 4.15 1.25
## 23 Felix 1.50 2.00 4.55 3.55
## 24 Fialka 1.20 0.75 3.35 2.40
## 25 Flavia 2.55 0.70 4.35 2.40
## 26 Flora 2.30 1.05 3.95 2.90
## 27 Florent 1.20 0.75 3.30 2.40
## 28 Florence 2.60 0.65 4.85 3.05
## 29 Hannah 1.75 1.40 4.95 1.75
## 30 Helen 2.05 1.85 2.95 2.75
## 31 Herman 1.50 1.60 3.00 2.65
## 32 Hilary 1.40 1.90 4.45 3.45
## 33 Lourdes 2.60 1.00 4.65 2.95
## 34 Isadore 1.70 1.85 3.20 2.85
## 35 Ivan 1.20 0.50 3.85 1.70
## 36 Jacob 0.90 1.65 3.75 2.25
## 37 Jeremiah 1.80 2.00 2.90 2.90
## 38 Jervis 2.00 0.45 4.55 2.70
## 39 Joachim 0.00 1.05 3.45 2.70
## 40 John 1.20 1.00 3.20 2.25
## 41 Santiago 0.95 1.70 3.80 2.30
## 42 Josephine 2.95 0.45 4.80 3.90
## 43 Judith 2.45 1.15 4.65 2.25
## 44 Justin 2.50 0.65 4.30 2.35
## 45 Kalyna 1.00 1.30 3.25 1.85
## 46 Larissa 2.50 1.25 4.70 2.30
## 47 Lawrence 1.55 0.95 5.00 2.25
## 48 Leon 1.70 1.95 2.80 2.80
## 49 Leonard 2.90 0.10 4.40 2.25
## 50 Leonid 2.70 1.05 4.00 1.50
## 51 Lesia 1.85 0.35 4.10 3.00
## 52 Leo 1.30 2.40 4.10 2.50
## 53 Louise 2.25 2.05 3.15 2.95
## 54 Lubomyr 1.40 1.20 3.35 2.45
## 55 Lydia 1.90 0.40 4.35 1.45
## 56 Magdalyna 1.45 1.30 3.85 3.50
## 57 Maksym 2.45 2.20 3.70 3.45
## 58 Marcel 2.70 1.25 4.80 2.75
## 59 Margaret 2.15 0.90 3.80 2.70
## 60 Maria 1.15 2.25 4.00 2.35
## 61 Marian 1.55 0.95 4.95 2.25
## 62 Marianna 2.55 0.95 4.60 2.90
## 63 Markian 2.05 0.55 4.65 2.75
## 64 Marko 1.50 1.90 2.75 2.45
## 65 Martha 0.55 1.00 3.60 2.35
## 66 Martin 1.85 0.70 4.50 2.25
## 67 Maryna 2.10 1.25 4.60 3.10
## 68 Matthew 0.80 2.25 3.20 2.65
## 69 Maura 2.65 0.85 4.25 1.85
## 70 Maya 1.15 1.85 4.15 2.60
## 71 Maximillian 1.80 2.70 2.95 3.10
## 72 Melania 2.80 1.10 4.10 1.55
## 73 Methodius 1.80 1.10 4.95 2.40
## 74 Michael 2.60 0.65 4.55 2.25
## 75 Michaelina 1.50 1.00 3.30 3.30
## 76 Mina 2.10 1.20 4.70 2.45
## 77 Monica 1.90 0.40 4.15 3.05
## 78 Mykyta 1.65 1.30 4.85 1.65
## 79 Myron 0.50 0.95 3.55 2.25
## 80 Myroslav 2.25 0.80 4.35 2.30
## 81 Myroslava 2.75 0.90 4.35 1.90
## 82 Salome 1.70 2.30 2.75 4.10
## 83 Samuel 0.80 1.40 3.05 3.20
## 84 Sandra 1.15 1.85 3.80 2.50
## 85 Sarah 1.30 1.50 4.25 3.00
## 86 Savina 1.25 1.55 3.50 2.10
## 87 Sebastian 1.20 1.45 4.20 2.95
## 88 Sophia 1.05 1.75 3.70 2.40
## 89 Stephan 1.45 0.60 3.65 3.05
## 90 Stephania 2.15 1.25 4.65 3.15
## 91 Susanna 1.50 1.40 3.90 3.55
## 92 Sylvan 2.40 0.85 3.80 2.10
## 93 Sylvester 1.55 2.10 2.55 3.90
## 94 Tamara 0.95 1.35 2.50 2.45
## 95 Theodore 2.00 0.25 3.35 2.25
## 96 Teofan 0.30 0.80 3.20 2.50
## 97 Teofil 3.05 0.25 4.60 2.40
## 98 Teofila 1.00 1.40 2.60 2.50
## 99 Teon 1.55 1.10 3.35 3.40
## 100 Teresa 1.25 0.90 4.50 2.50
## Valor.Educativo Servicio.Retorno Tamano.Paquete Calidad.Producto
## 1 2.4 2.3 2.60 2.10
## 2 2.5 4.0 4.20 2.15
## 3 4.3 2.7 4.10 2.60
## 4 1.8 2.3 3.90 1.95
## 5 3.4 4.6 2.25 3.40
## 6 2.6 1.9 4.85 2.20
## 7 3.5 4.5 3.80 2.90
## 8 2.8 2.2 3.45 2.15
## 9 3.5 3.0 3.80 2.70
## 10 3.7 3.2 4.35 2.70
## 11 2.0 2.8 2.90 2.15
## 12 3.0 2.5 4.15 2.50
## 13 2.1 1.4 3.30 2.20
## 14 2.7 3.7 3.35 2.50
## 15 3.0 2.6 3.40 2.95
## 16 2.7 1.7 2.40 2.35
## 17 3.6 2.9 3.10 2.20
## 18 3.4 1.5 2.95 2.80
## 19 3.3 3.9 3.40 2.95
## 20 3.0 2.6 3.40 3.00
## 21 2.1 1.8 3.15 2.25
## 22 1.2 1.7 2.60 1.65
## 23 3.5 3.4 4.20 2.60
## 24 1.9 2.5 3.60 1.85
## 25 3.3 2.6 1.90 2.45
## 26 3.4 2.8 2.35 2.95
## 27 1.9 2.5 3.60 1.85
## 28 3.2 3.9 3.35 2.90
## 29 3.1 1.7 2.70 2.70
## 30 3.9 3.0 4.20 2.55
## 31 3.1 3.0 4.00 1.65
## 32 3.3 3.2 4.10 2.50
## 33 3.7 2.4 2.30 3.05
## 34 3.5 3.4 4.20 1.90
## 35 1.7 1.1 3.10 2.05
## 36 2.5 2.4 3.80 1.80
## 37 3.7 2.5 4.65 2.40
## 38 2.4 2.6 3.65 2.55
## 39 1.1 2.6 4.45 1.95
## 40 2.1 2.2 4.40 1.65
## 41 2.6 2.5 3.85 1.85
## 42 3.4 4.6 2.25 3.35
## 43 3.6 1.3 3.10 2.95
## 44 3.1 2.5 1.85 2.40
## 45 2.4 1.7 4.25 1.60
## 46 3.7 1.4 3.15 3.00
## 47 2.6 3.2 1.90 2.45
## 48 3.6 2.3 4.55 2.35
## 49 3.0 2.4 3.35 2.45
## 50 3.8 1.4 2.60 1.90
## 51 2.1 2.5 2.60 2.50
## 52 3.6 2.5 4.50 2.60
## 53 4.3 3.4 4.40 2.75
## 54 2.5 2.6 4.60 1.85
## 55 1.6 2.1 2.80 1.85
## 56 2.8 3.6 3.85 2.10
## 57 4.6 4.0 4.80 3.10
## 58 4.0 3.0 3.85 3.00
## 59 3.1 2.5 2.20 2.80
## 60 3.3 2.2 4.35 2.50
## 61 2.6 3.1 1.90 2.40
## 62 3.6 2.3 2.25 3.05
## 63 2.5 2.7 3.70 2.65
## 64 3.4 2.6 3.00 2.10
## 65 1.6 3.2 5.00 1.70
## 66 2.6 2.3 3.40 2.45
## 67 3.3 3.9 3.65 3.00
## 68 3.0 2.5 3.55 2.25
## 69 3.5 1.9 2.40 2.15
## 70 3.0 2.3 4.55 2.40
## 71 4.5 2.9 4.20 2.70
## 72 4.0 1.6 2.65 1.95
## 73 2.9 1.9 2.45 2.45
## 74 3.3 2.7 3.65 2.55
## 75 2.4 2.7 4.10 2.05
## 76 3.2 2.7 4.25 2.60
## 77 2.2 2.6 2.65 2.55
## 78 2.9 1.5 2.60 2.55
## 79 1.5 3.1 4.95 1.65
## 80 3.1 2.1 3.40 2.55
## 81 3.6 2.1 2.45 2.25
## 82 4.0 4.4 3.15 2.80
## 83 2.3 3.8 4.10 2.05
## 84 3.0 2.5 3.70 2.20
## 85 2.8 2.8 3.40 2.80
## 86 2.8 2.2 4.50 1.85
## 87 2.7 2.7 3.35 2.75
## 88 2.8 2.3 3.60 2.15
## 89 2.0 2.5 4.00 2.00
## 90 3.4 4.0 3.70 3.05
## 91 3.0 3.8 3.95 2.20
## 92 3.3 1.4 2.90 2.75
## 93 3.6 4.0 2.95 2.60
## 94 2.2 2.5 4.10 1.80
## 95 2.2 2.1 2.50 2.00
## 96 0.7 2.1 4.20 1.70
## 97 3.3 2.8 3.55 2.60
## 98 2.4 2.7 4.20 1.85
## 99 2.6 2.9 4.20 2.15
## 100 2.2 3.0 3.00 2.20
## Numero.Estrellas K7$cluster
## 1 1.7 7
## 2 2.8 4
## 3 3.3 6
## 4 1.7 1
## 5 4.3 2
## 6 3.0 4
## 7 3.1 2
## 8 2.9 4
## 9 4.8 2
## 10 3.9 6
## 11 1.7 7
## 12 3.2 3
## 13 2.4 7
## 14 2.3 3
## 15 3.9 2
## 16 3.4 5
## 17 2.3 4
## 18 2.5 5
## 19 3.9 2
## 20 4.0 2
## 21 2.6 7
## 22 2.0 7
## 23 4.0 3
## 24 2.1 1
## 25 3.4 5
## 26 3.4 5
## 27 2.1 1
## 28 3.9 2
## 29 3.4 5
## 30 3.1 6
## 31 2.8 4
## 32 3.8 3
## 33 4.5 5
## 34 3.2 6
## 35 2.0 7
## 36 2.4 4
## 37 2.9 6
## 38 3.1 3
## 39 1.4 1
## 40 1.3 1
## 41 2.5 4
## 42 4.3 2
## 43 3.8 5
## 44 3.3 5
## 45 2.3 1
## 46 3.9 5
## 47 4.0 5
## 48 2.8 4
## 49 4.2 5
## 50 3.8 5
## 51 2.6 7
## 52 3.8 3
## 53 3.5 6
## 54 1.7 1
## 55 2.4 7
## 56 3.2 3
## 57 4.7 6
## 58 5.0 2
## 59 3.1 5
## 60 3.5 3
## 61 3.9 5
## 62 4.5 5
## 63 3.2 3
## 64 2.1 4
## 65 2.5 1
## 66 3.0 7
## 67 4.4 2
## 68 3.1 4
## 69 4.3 5
## 70 3.4 3
## 71 3.5 6
## 72 4.0 5
## 73 3.6 5
## 74 4.5 2
## 75 2.6 4
## 76 3.4 3
## 77 2.7 7
## 78 3.2 5
## 79 2.4 1
## 80 4.1 5
## 81 4.4 5
## 82 3.2 6
## 83 2.6 4
## 84 2.2 4
## 85 3.8 3
## 86 2.8 4
## 87 3.6 3
## 88 2.1 4
## 89 1.9 1
## 90 4.5 2
## 91 3.4 3
## 92 2.4 5
## 93 2.8 6
## 94 2.1 1
## 95 1.6 7
## 96 1.0 1
## 97 4.5 2
## 98 2.3 4
## 99 2.7 4
## 100 1.8 7
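# Rename only the first column; assigning a single name to colnames(RECK7)
# would set every other column name to NA
colnames(RECK7)[1] <- "ID"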
# recomienda() requires a data frame whose customer identifier column is named ID.
# Parameters: DF: data frame. C: customer ID. P1: position of the ID column.
# P2: position of the cluster column.
recomienda <- function(DF, C, P1, P2) {
    # Cluster of the target customer, and the customer's identifier
    Cluster <- subset(DF[, P2], DF$ID == C)
    Cliente <- subset(DF[, P1], DF$ID == C)
    # Customers that belong to the same cluster as the target
    grupo <- subset(DF, match(DF[, P2], Cluster, nomatch = 0) == 1)
    # Pairwise Euclidean distances. The ID column is not numeric, so it is
    # coerced to NA (hence the warnings below) and dist() excludes it
    Y <- as.matrix(dist(grupo, method = "euclidean"))
    rownames(Y) <- grupo[, P1]
    colnames(Y) <- grupo[, P1]
    # Distances from the target customer to every other customer in the cluster
    r <- data.frame(subset(Y, colnames(Y) != Cliente, rownames(Y) == Cliente))
    r$ClienteRecomendado <- rownames(r)
    colnames(r) <- c("Distancia", "ClienteRecomendado")
    # Larger points mark closer (more similar) customers
    p <- ggplot(data = r, aes(Distancia, ClienteRecomendado)) +
        geom_point(aes(size = 1/Distancia), shape = 21, colour = "black", fill = "darkred") +
        theme_bw() +
        theme(panel.grid.major.x = element_blank(),
              panel.grid.minor.x = element_blank(),
              panel.grid.major.y = element_line(colour = "grey60", linetype = "dashed")) +
        ggtitle("Distance to the customers in the cluster") +
        theme(text = element_text(size = 14, face = "italic")) +
        labs(x = "Distance", y = "Recommended customer")
    return(p)
}
# Validation
recomienda(RECK7, "Teresa", 1, 11)
## Warning: NAs introduced by coercion
recomienda(RECK7, "Leo", 1, 11)
## Warning: NAs introduced by coercion
recomienda(RECK7, "Justin", 1, 11)
## Warning: NAs introduced by coercion
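The coercion warnings come from passing the non-numeric ID column to `dist()`. The sketch below shows one way to compute the same kind of within-cluster neighbour ranking using only the numeric columns; the function name `vecinos`, the toy data frame `clientes` and its customer names are all hypothetical and are not the data of this exercise.

```r
# Toy data: three customers in cluster 1, one in cluster 2 (assumption)
clientes <- data.frame(ID = c("Ana", "Luis", "Marta", "Pablo"),
                       x = c(0, 1, 3, 10),
                       y = c(0, 0, 0, 0),
                       cluster = c(1, 1, 1, 2),
                       stringsAsFactors = FALSE)

vecinos <- function(df, id, vars) {
  # Keep only the customers in the same cluster as the target
  mismos <- df[df$cluster == df$cluster[df$ID == id], ]
  # Distances on the numeric columns only, so nothing is coerced to NA
  D <- as.matrix(dist(mismos[, vars], method = "euclidean"))
  dimnames(D) <- list(mismos$ID, mismos$ID)
  # Distances from the target to the others, closest first
  otros <- setdiff(mismos$ID, id)
  sort(setNames(D[id, otros], otros))
}

vecinos(clientes, "Ana", c("x", "y"))
# Luis = 1, Marta = 3; Pablo is left out because he is in another cluster
```

The closest customers in the returned vector are the natural candidates for a recommendation.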
Part a. Description of the data
This data file contains football players from the game PES2012; the full file has 1964 rows, from which a random sample of 1000 players is drawn below. The variables are 5 factors created by a factor analysis of various football-ability variables, plus the player's value in dollars.
Part b1. k-means with iter.max=200, k=4, and verification of Fisher's theorem
setwd("C:/Users/Usuario/Desktop/Datos_Mineria")
PES <- read.table("pesfactores.csv", header = TRUE, sep = ";", dec = ",")
head(PES)
## Nombre Costo Tecnica JuegoAereo Velocidad Balones.Largos
## 1 MANATOZ 11472139 80.8 85.00 69.67 82.6
## 2 QUAGLIARELLA 4828579 77.8 80.00 82.33 73.0
## 3 PELLISSIER 4585035 77.0 78.75 85.67 72.0
## 4 JEDA 3782562 78.2 81.00 83.00 75.6
## 5 JELEN 3338251 76.0 78.75 81.00 69.2
## 6 J. MARTINEZ 2946648 79.8 81.25 80.00 78.0
## Compromiso
## 1 80.00
## 2 84.67
## 3 80.67
## 4 79.33
## 5 79.67
## 6 76.00
str(PES)
## 'data.frame': 1964 obs. of 7 variables:
## $ Nombre : Factor w/ 1938 levels "?BRAH?M TORAMAN",..: 1116 1464 1406 874 878 846 1058 1743 1262 909 ...
## $ Costo : int 11472139 4828579 4585035 3782562 3338251 2946648 3400479 8151984 6288443 4698810 ...
## $ Tecnica : num 80.8 77.8 77 78.2 76 79.8 77.6 79.4 81.6 82.8 ...
## $ JuegoAereo : num 85 80 78.8 81 78.8 ...
## $ Velocidad : num 69.7 82.3 85.7 83 81 ...
## $ Balones.Largos: num 82.6 73 72 75.6 69.2 78 74.6 75.2 83.6 79.2 ...
## $ Compromiso : num 80 84.7 80.7 79.3 79.7 ...
# Draw a random sample of size 1000 from the file
PESM <- PES[sample(1:nrow(PES), 1000, replace = FALSE), ]
summary(PESM)
## Nombre Costo Tecnica JuegoAereo
## ANDERSON: 3 Min. : 2527420 Min. :55.4 Min. :62.0
## ALEX : 2 1st Qu.: 3320736 1st Qu.:72.2 1st Qu.:73.0
## MAICON : 2 Median : 4354808 Median :75.4 Median :75.8
## MARVEAUX: 2 Mean : 6401813 Mean :75.4 Mean :75.9
## ROSSI : 2 3rd Qu.: 6827653 3rd Qu.:78.5 3rd Qu.:78.8
## SILVA : 2 Max. :66596855 Max. :92.4 Max. :89.0
## (Other) :987
## Velocidad Balones.Largos Compromiso
## Min. :63.7 Min. :57.2 Min. :66.0
## 1st Qu.:74.0 1st Qu.:69.6 1st Qu.:72.7
## Median :77.3 Median :73.4 Median :75.0
## Mean :77.1 Mean :73.5 Mean :75.4
## 3rd Qu.:80.3 3rd Qu.:77.4 3rd Qu.:78.0
## Max. :94.0 Max. :88.8 Max. :89.0
##
DF <- PESM[, 2:ncol(PESM)]
# Standardize the variables: center each column on its mean and divide by its
# standard deviation
estandariza <- function(M) {
    M2 <- scale(as.matrix(M), center = TRUE, scale = TRUE)
    # Return a plain matrix: drop the row names and the attributes added by scale()
    dimnames(M2) <- list(NULL, colnames(M))
    attr(M2, "scaled:center") <- NULL
    attr(M2, "scaled:scale") <- NULL
    return(M2)
}
DFE <- cbind(as.data.frame(PESM[, 1]), estandariza(DF))
K4 <- kmeans(DFE[, 2:7], 4, iter.max = 200, nstart = 50)
DFE <- cbind(DFE, K4$cluster)
summary(DFE)
## PESM[, 1] Costo Tecnica JuegoAereo
## ANDERSON: 3 Min. :-0.608 Min. :-4.186 Min. :-3.601
## ALEX : 2 1st Qu.:-0.484 1st Qu.:-0.667 1st Qu.:-0.752
## MAICON : 2 Median :-0.321 Median : 0.003 Median :-0.040
## MARVEAUX: 2 Mean : 0.000 Mean : 0.000 Mean : 0.000
## ROSSI : 2 3rd Qu.: 0.067 3rd Qu.: 0.642 3rd Qu.: 0.737
## SILVA : 2 Max. : 9.448 Max. : 3.563 Max. : 3.392
## (Other) :987
## Velocidad Balones.Largos Compromiso K4$cluster
## Min. :-2.827 Min. :-3.187 Min. :-2.422 Min. :1.00
## 1st Qu.:-0.655 1st Qu.:-0.764 1st Qu.:-0.703 1st Qu.:1.00
## Median : 0.046 Median :-0.022 Median :-0.101 Median :3.00
## Mean : 0.000 Mean : 0.000 Mean : 0.000 Mean :2.67
## 3rd Qu.: 0.677 3rd Qu.: 0.759 3rd Qu.: 0.672 3rd Qu.:4.00
## Max. : 3.550 Max. : 2.986 Max. : 3.509 Max. :4.00
##
head(DFE)
## PESM[, 1] Costo Tecnica JuegoAereo Velocidad Balones.Largos
## 1 UGOL -0.56729 -0.373991 -1.72357 -1.84603 -0.33460
## 2 CISMA -0.09808 0.002974 -0.55799 -0.23427 0.79845
## 3 BOUTARAU -0.54219 1.343292 -2.30636 0.60665 1.11102
## 4 KWAK TAE HWI -0.50980 -1.714309 0.41333 -1.14526 -1.07695
## 5 REO COKER -0.53275 -0.332106 -0.03995 -0.58465 -0.17832
## 6 TOY -0.46985 0.170514 -0.10471 0.04604 0.09518
## Compromiso K4$cluster
## 1 1.01608 1
## 2 -0.10134 4
## 3 -1.82045 4
## 4 -1.04685 1
## 5 -0.78899 1
## 6 -0.01539 4
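# Verification of Fisher's theorem: total inertia = within-cluster + between-cluster
# (all.equal() compares floating-point sums with a tolerance, unlike ==)
isTRUE(all.equal(K4$totss, K4$tot.withinss + K4$betweenss))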
## [1] TRUE
Part b2. k-means with iter.max=200, k=4, and horizontal-vertical interpretation
centros <- K4$centers
centros
## Costo Tecnica JuegoAereo Velocidad Balones.Largos Compromiso
## 1 -0.2344 -0.9645 0.78965 -0.8556 -0.9916 -0.1431
## 2 5.0861 1.9024 0.40207 1.2282 1.5204 1.4197
## 3 0.1743 0.7154 0.09994 0.1369 0.7030 0.9099
## 4 -0.2561 0.1801 -0.84036 0.6003 0.2402 -0.7103
rownames(centros) <- c("Cluster1", "Cluster2", "Cluster3", "Cluster4")
centros <- t(centros)
centros
## Cluster1 Cluster2 Cluster3 Cluster4
## Costo -0.2344 5.0861 0.17428 -0.2561
## Tecnica -0.9645 1.9024 0.71545 0.1801
## JuegoAereo 0.7897 0.4021 0.09994 -0.8404
## Velocidad -0.8556 1.2282 0.13692 0.6003
## Balones.Largos -0.9916 1.5204 0.70297 0.2402
## Compromiso -0.1431 1.4197 0.90988 -0.7103
caract <- c("Costo", "Tecnica", "Juego.Aereo", "Velocidad", "Balones.Largos",
"Compromiso")
grupo1 <- cbind(cbind(centros[, 1], "1"), caract)
grupo2 <- cbind(cbind(centros[, 2], "2"), caract)
grupo3 <- cbind(cbind(centros[, 3], "3"), caract)
grupo4 <- cbind(cbind(centros[, 4], "4"), caract)
grupos <- rbind(grupo1, grupo2, grupo3, grupo4)
colnames(grupos) <- c("Centro", "Grupo", "Caracteristica")
rownames(grupos) <- c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11",
"12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23",
"24")
grupos <- as.data.frame(grupos)
grupos$Centro <- as.numeric(as.character(grupos$Centro))
grupos
## Centro Grupo Caracteristica
## 1 -0.23444 1 Costo
## 2 -0.96445 1 Tecnica
## 3 0.78965 1 Juego.Aereo
## 4 -0.85559 1 Velocidad
## 5 -0.99156 1 Balones.Largos
## 6 -0.14314 1 Compromiso
## 7 5.08612 2 Costo
## 8 1.90237 2 Tecnica
## 9 0.40207 2 Juego.Aereo
## 10 1.22820 2 Velocidad
## 11 1.52041 2 Balones.Largos
## 12 1.41970 2 Compromiso
## 13 0.17428 3 Costo
## 14 0.71545 3 Tecnica
## 15 0.09994 3 Juego.Aereo
## 16 0.13692 3 Velocidad
## 17 0.70297 3 Balones.Largos
## 18 0.90988 3 Compromiso
## 19 -0.25610 4 Costo
## 20 0.18007 4 Tecnica
## 21 -0.84036 4 Juego.Aereo
## 22 0.60034 4 Velocidad
## 23 0.24023 4 Balones.Largos
## 24 -0.71029 4 Compromiso
# No ylim() here: the standardized centers range from negative values to above 5,
# so a fixed (0, 3) window would drop those bars
g2 <- ggplot(data = grupos, aes(x = Caracteristica, y = Centro, colour = Grupo, fill = Grupo)) +
    geom_bar(stat = "identity", position = "dodge") +
    scale_colour_brewer() +
    ggtitle("Distribution of the attributes by group") +
    theme(text = element_text(size = 12, face = "italic")) +
    labs(x = "Characteristic", y = "Center")
g2
Interpretation: the interpretation changes depending on the sample selected, because the 1000 players are drawn at random each time the code is run.
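If a stable interpretation is wanted, the sample can be made reproducible by fixing the random seed before sampling. This is a minimal sketch of the idea; the data frame `datos` is a stand-in for the full table and the seed value 123 is arbitrary, both being assumptions for illustration.

```r
datos <- data.frame(id = 1:20)        # stand-in for the full table (assumption)

set.seed(123)                         # any fixed integer works; 123 is arbitrary
muestra1 <- datos[sample(1:nrow(datos), 10, replace = FALSE), , drop = FALSE]

set.seed(123)                         # same seed, so the same rows are drawn
muestra2 <- datos[sample(1:nrow(datos), 10, replace = FALSE), , drop = FALSE]

identical(muestra1, muestra2)
```

With the seed fixed, the clusters and their centers would be the same on every run, at the cost of describing one particular sample.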
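Part b3. k-means with 50 runs and iter.max=200, to plot the Jambu elbow
InerciaIC <- rep(0, 50)
for (k in 1:50) {
    # iter.max is not passed here, so kmeans() uses its default of 10 iterations,
    # which is what the convergence warnings below report; pass iter.max = 200
    # to match the heading
    K <- kmeans(DFE[, 2:7], k, nstart = 50)
    InerciaIC[k] <- K$tot.withinss
}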
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
## Warning: did not converge in 10 iterations
plot(InerciaIC, col = "skyblue", type = "b")
Comment: The number of groups is set to 4, following the procedure above.
Part c. Hierarchical clustering, to compare results
# Hierarchical clustering
PESJ <- dist(DFE[, 2:7], method = "euclidean")
# "ward" was renamed "ward.D" in R 3.1.0; the method itself is unchanged
ClusterWard <- hclust(PESJ, method = "ward.D")
plot(ClusterWard)
rect.hclust(ClusterWard, k = 4, border = "red")
GrupoWard <- cutree(ClusterWard, k = 4)
centrosJ <- centers.hclust(DFE[, 2:7], ClusterWard, nclust = 4, use.median = FALSE)
PESJ <- cbind(DFE, GrupoWard)
res <- PCA(DFE[, 2:7], scale.unit = TRUE, ncp = 6, graph = FALSE)
res.hcpc <- HCPC(res, nb.clust = -1, consol = TRUE, min = 4, max = 4, graph = FALSE)
plot.HCPC(res.hcpc, choice = "bar")
plot.HCPC(res.hcpc, choice = "map")
plot.HCPC(res.hcpc, choice = "3D.map", angle = 60)
# Comparison
PESF <- cbind(DFE, PESJ$GrupoWard)
head(PESF)
## PESM[, 1] Costo Tecnica JuegoAereo Velocidad Balones.Largos
## 1 UGOL -0.56729 -0.373991 -1.72357 -1.84603 -0.33460
## 2 CISMA -0.09808 0.002974 -0.55799 -0.23427 0.79845
## 3 BOUTARAU -0.54219 1.343292 -2.30636 0.60665 1.11102
## 4 KWAK TAE HWI -0.50980 -1.714309 0.41333 -1.14526 -1.07695
## 5 REO COKER -0.53275 -0.332106 -0.03995 -0.58465 -0.17832
## 6 TOY -0.46985 0.170514 -0.10471 0.04604 0.09518
## Compromiso K4$cluster PESJ$GrupoWard
## 1 1.01608 1 1
## 2 -0.10134 4 2
## 3 -1.82045 4 2
## 4 -1.04685 1 3
## 5 -0.78899 1 1
## 6 -0.01539 4 1
# k-means centers
cbind(t(centros), K4$size)
## Costo Tecnica JuegoAereo Velocidad Balones.Largos Compromiso
## Cluster1 -0.2344 -0.9645 0.78965 -0.8556 -0.9916 -0.1431
## Cluster2 5.0861 1.9024 0.40207 1.2282 1.5204 1.4197
## Cluster3 0.1743 0.7154 0.09994 0.1369 0.7030 0.9099
## Cluster4 -0.2561 0.1801 -0.84036 0.6003 0.2402 -0.7103
##
## Cluster1 329
## Cluster2 23
## Cluster3 293
## Cluster4 355
# Hierarchical clustering centers
centrosJ
## Costo Tecnica JuegoAereo Velocidad Balones.Largos Compromiso
## [1,] -0.2031 -0.3605 0.5412 -0.2698 -0.4973 0.04216
## [2,] -0.1470 0.2461 -0.8913 0.8205 0.4268 -0.71531
## [3,] -0.4084 -1.4435 0.6546 -1.4213 -1.3764 -0.53571
## [4,] 0.6085 0.9534 -0.1921 0.2787 0.9298 0.88585
# The k-means grouping yields a better partition of the players, since the
# group with the best average values is, in general, the best in all
# variables.
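Besides comparing centers, the two partitions can be compared directly with a cross-tabulation of the cluster labels. The following is a rough sketch on synthetic data; `X`, `grupo_hc`, `tabla` and `centros_hc` are hypothetical names, and `"ward.D"` is the current name of the method that older R versions call `"ward"`.

```r
set.seed(1)                                   # illustrative seed (assumption)
# Two synthetic groups centered at (0, 0) and (6, 6)
X <- rbind(matrix(rnorm(60, mean = 0), ncol = 2),
           matrix(rnorm(60, mean = 6), ncol = 2))

km <- kmeans(X, centers = 2, iter.max = 200, nstart = 20)
hc <- hclust(dist(X, method = "euclidean"), method = "ward.D")
grupo_hc <- cutree(hc, k = 2)

# Rows: k-means clusters; columns: hierarchical clusters.
# Labels are arbitrary, so agreement shows up as one large count per row
tabla <- table(kmeans = km$cluster, jerarquico = grupo_hc)
tabla

# Cluster means under the hierarchical partition, comparable to km$centers
centros_hc <- aggregate(as.data.frame(X), by = list(grupo = grupo_hc), FUN = mean)
centros_hc
```

When a row of `tabla` spreads over several columns, the two methods disagree on how to split that k-means cluster.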
# Select the group with the highest values to display
subset(PESJ, K4$cluster == 3)
## PESM[, 1] Costo Tecnica JuegoAereo Velocidad
## 12 ROUDET -0.361012 -0.206451 -0.62274 0.67673
## 21 G. SIGURÐSSON -0.549373 0.882558 -0.62274 -0.09411
## 25 KABORÉ -0.261954 0.421823 -0.36373 0.67673
## 27 A. SVENSSON 0.030566 1.468947 -0.49323 -0.65473
## 28 FALCAO -0.128073 0.589363 2.09694 0.74681
## 32 THIL -0.550065 0.463708 0.80185 0.18619
## 40 POLI -0.005684 1.301407 -0.36373 -0.37442
## 43 SIONKO 0.762860 0.547478 0.08956 1.72788
## 46 NEY SANTOS 0.348266 0.673133 -0.10471 0.46650
## 47 IRELAND -0.533064 1.091983 -0.42848 0.46650
## 52 LUKOVI? 0.013348 0.296168 0.02480 -0.16419
## 58 MORRONE -0.546109 0.044859 0.08956 -0.58465
## 61 PLAIL 0.563511 1.008213 -0.42848 0.53658
## 63 DIDOT -0.504150 1.343292 -1.20553 -0.37442
## 72 FUCILE 0.268418 0.254284 -0.75225 0.32635
## 75 G. DELVECCHIO -0.325233 0.296168 1.38464 -0.44450
## 78 COSSU 0.307918 2.432301 -0.94652 0.95704
## 80 ZATOJO -0.543999 0.631248 -0.23422 -0.93503
## 85 ZAKI -0.141998 0.338053 1.90268 0.74681
## 87 WHITEHEAD -0.509259 -0.038911 -0.29897 -0.65473
## 93 MAXI PEREIRA 0.047900 -0.625300 -0.16946 -0.09411
## 96 MIGLIACCIO -0.523303 -0.206451 0.73710 -0.30434
## 97 BRIGHI 0.108223 0.631248 0.47808 -0.23427
## 98 MULGREW -0.416044 0.128629 -0.36373 -0.79488
## 100 ZAMBROTTA 0.231916 0.631248 -0.16946 0.32635
## 105 F. NAVARRO 1.318519 0.212399 0.54284 0.39642
## 106 GOVOU -0.413925 0.882558 0.28382 0.81688
## 107 SEITARIDIS 0.537902 0.086744 0.21906 0.25627
## 117 MASCHERANO 1.083619 0.589363 1.12563 0.11612
## 118 L. JACOBSEN -0.482671 -0.541531 -0.36373 -0.65473
## 119 PATEIRO -0.561314 0.547478 -0.81701 -0.51457
## 120 BAUTHÉAC -0.604377 -0.164566 -0.16946 -0.02404
## 123 ESCUDÉ 0.681947 0.338053 1.64366 -0.93503
## 125 CHALMÉ 0.128180 -0.206451 -0.29897 0.25627
## 131 ROSSI 1.138225 1.385177 0.21906 1.79796
## 132 ÁNGEL 0.767421 0.463708 -0.42848 -0.23427
## 133 AROMGA -0.341707 0.882558 0.93136 0.95704
## 138 QUAGLIARELLA -0.246925 0.505593 1.06087 1.09719
## 142 ROSENTHAL -0.543933 0.547478 -0.29897 -1.28542
## 146 NESTA 2.406796 0.002974 1.57891 -0.58465
## 147 LANDZAAT 0.508041 0.715018 -0.10471 -0.37442
## 152 JAKOB POULSEN -0.108331 0.924443 0.08956 -0.51457
## 153 AQUILANI 0.816477 1.971567 -0.81701 -0.02404
## 160 PAVLYUCHENKO -0.603446 0.002974 0.67235 -0.16419
## 162 JESÚS GÁMEZ 0.093148 -0.332106 0.28382 -0.02404
## 165 SIMON DAVIES 0.162815 0.966328 -1.27029 0.74681
## 170 MATUZALEM 0.917022 1.887797 -0.16946 0.39642
## 174 MARGAIRAZ -0.531852 -0.122681 -0.16946 -0.65473
## 177 WILKSHIRE 0.304126 0.296168 -0.29897 0.18619
## 178 BOSINGWA 2.573423 0.756903 -0.42848 0.88696
## 179 J. CASQUERO -0.477174 0.505593 -0.36373 -0.51457
## 180 LEKO -0.392067 0.505593 -0.36373 -0.72480
## 184 CHAMAKH -0.309317 0.715018 2.03219 1.02711
## 191 MANU DEL MORAL -0.605773 0.547478 0.34857 0.67673
## 192 COLUCCI 0.042483 0.798788 -0.10471 -0.30434
## 200 DEL PIERO 0.049188 2.390416 0.08956 -0.09411
## 203 S. CAZORLA 2.023223 1.594602 -1.20553 1.58773
## 205 POGATETZ -0.127348 -0.122681 1.12563 -1.42557
## 208 PEDRO MENDES -0.541758 -0.457761 0.02480 -0.72480
## 212 REYES 0.912963 1.510832 -0.29897 1.23734
## 225 JÚLIO BAPTISTA -0.104292 0.421823 2.35596 0.46650
## 227 KI SUNG YUENG -0.279983 1.133868 -1.39980 -1.35549
## 231 C. LEDESMA 1.046809 1.762142 -0.10471 -0.16419
## 236 WILSHERE 0.400910 1.929682 -1.14078 1.30742
## 239 LAZZARI -0.597396 0.673133 0.41333 0.25627
## 244 PJANI? -0.093761 1.971567 -1.39980 0.74681
## 259 GERA -0.546109 1.175752 0.99612 0.67673
## 260 HAZARD 1.139405 1.301407 -0.94652 1.86804
## 262 E. JUÁREZ -0.312256 -0.206451 -1.07602 0.04604
## 263 SHALNIV -0.519120 0.086744 0.93136 -0.93503
## 268 TIOTÉ -0.021845 0.673133 0.54284 0.04604
## 273 GALLOPPA -0.065818 1.217637 -0.68750 -0.16419
## 277 F. DELLA ROCCA -0.322257 0.840673 -0.23422 -0.09411
## 279 PAROLO -0.414632 0.463708 -0.10471 -0.23427
## 280 INLER 2.063907 1.343292 0.80185 -0.02404
## 282 ARMAND 1.939043 0.882558 0.93136 0.11612
## 284 VAN NISTELROOY -0.103122 0.798788 2.16170 -1.00511
## 285 YESTE 0.335670 1.929682 0.34857 0.11612
## 287 KENNEDY -0.267293 0.966328 -0.23422 0.18619
## 291 BONNART -0.130266 -0.499646 -0.55799 0.95704
## 293 PASTORE 0.989997 2.348531 0.15431 0.67673
## 298 DUDKA -0.385736 -0.122681 0.80185 -0.23427
## 299 SALVIO 0.525959 0.840673 1.44940 0.88696
## 300 SÁNCHEZ -0.371005 0.296168 0.47808 0.18619
## 301 ALONSO -0.483282 0.547478 -1.01127 0.46650
## 307 KILBANE -0.437062 0.673133 0.73710 -0.16419
## 311 WITSEL -0.223830 0.589363 0.15431 0.39642
## 317 IGNASHEVITCH 0.374092 0.338053 1.51415 -1.98618
## 319 SCULLI -0.478358 0.212399 1.25514 1.02711
## 320 SCHWAAB -0.285954 -0.499646 -0.03995 -0.23427
## 326 HAMSIK 1.669599 2.013451 0.67235 0.60665
## 336 ENOH -0.515191 0.547478 -0.49323 -0.72480
## 339 LEÃO -0.257975 0.840673 0.34857 -0.93503
## 342 S. DALMAT -0.186180 1.385177 -0.68750 0.04604
## 344 GIGGS 2.305428 2.264761 -0.68750 0.32635
## 355 PEDRETTI 1.173921 1.887797 -0.62274 -0.79488
## 357 MESTO 0.306469 0.631248 -0.68750 1.16727
## 359 P. AIMAR -0.062728 1.720257 -0.23422 0.88696
## 360 MEHMET TOPUZ 0.798967 0.756903 -0.42848 0.39642
## 368 BARI? ÖZBEK -0.525470 0.086744 0.28382 0.39642
## 370 JALLET 0.259600 0.715018 -1.20553 0.60665
## 372 F. PORCARI -0.576052 0.128629 -0.16946 -0.44450
## 380 ANSALDI 0.128180 0.421823 0.15431 -0.02404
## 382 T. KRISTENSEN -0.335549 0.338053 -0.42848 -0.79488
## 384 ARDA TURAN 0.911686 1.594602 -0.68750 1.16727
## 385 GUTI -0.229366 2.432301 -0.62274 -1.28542
## 388 GORTER -0.293189 0.421823 -0.49323 -0.30434
## 394 CABAYE -0.235831 0.463708 0.34857 0.32635
## 395 LEE YONG RAE -0.300439 0.798788 -0.81701 -0.44450
## 397 RIVIÈRE -0.290776 0.296168 1.51415 1.58773
## 398 ZOKORA 0.197013 0.798788 0.99612 0.25627
## 399 GAITAN 1.077384 1.259522 -0.03995 1.44757
## 400 RAÚL ALBIOL 1.068167 0.463708 1.44940 -0.23427
## 404 PITAU -0.335220 0.673133 -0.03995 -1.07519
## 405 VAN PERSIE 1.446958 2.264761 -0.10471 1.02711
## 407 JOHNSON 0.526669 -0.332106 0.73710 0.74681
## 410 VILLA 1.723746 1.259522 1.25514 1.44757
## 412 GARAY 1.227537 -0.499646 1.31989 -0.58465
## 415 KANUEHIHO -0.482060 0.002974 -0.29897 -0.37442
## 420 JEUNECHAMP -0.216278 0.296168 0.47808 0.25627
## 426 MATHIEU 0.245677 0.296168 0.21906 0.32635
## 428 KÄLLSTRÖM 0.342153 0.882558 -0.55799 -0.16419
## 431 NJANGO -0.171577 0.421823 1.57891 1.09719
## 433 PARK JI SUNG 0.065394 1.259522 0.93136 1.30742
## 434 AYHAN AKMAN -0.599257 0.170514 0.86661 0.04604
## 445 TOROSIDIS 0.884681 -0.332106 0.60759 0.60665
## 447 JOÃO PEREIRA 1.123306 0.212399 -0.29897 1.16727
## 450 RIVERA -0.116688 0.924443 -1.01127 0.11612
## 455 HUNT -0.285150 0.254284 -0.55799 0.67673
## 456 MAXWELL 1.455225 0.882558 -0.68750 1.51765
## 459 MAURI -0.540385 0.840673 0.67235 -0.37442
## 463 MELLTHROP -0.362616 0.715018 -0.03995 0.04604
## 465 ENDO -0.160674 1.259522 -1.33504 -1.70588
## 469 MARTÍ -0.528723 -0.080796 -0.36373 -1.14526
## 470 GAGO 0.106836 1.217637 -0.49323 -0.30434
## 480 SEAN DAVIS -0.293189 0.673133 -0.23422 -1.14526
## 481 PARKER 0.679142 0.505593 0.67235 0.11612
## 484 HIGUAÍN 0.513358 0.715018 0.93136 1.44757
## 486 KATSOURANIS -0.128173 0.463708 1.31989 -0.65473
## 488 EVRA 2.046163 0.463708 -0.75225 1.44757
## 489 NOBOA -0.345763 0.756903 -0.42848 -0.58465
## 490 LUCAS 0.032899 0.715018 0.67235 -0.30434
## 493 BI?AN -0.482060 0.212399 0.93136 -1.14526
## 497 GROßKREUTZ -0.155551 -0.373991 -0.16946 0.53658
## 499 ISLA 1.082329 0.547478 0.41333 0.60665
## 503 EBOUÉ 1.618330 0.296168 0.47808 1.09719
## 505 ADAM 0.931963 1.427062 -0.62274 0.04604
## 508 CHIVU 0.583227 1.385177 0.73710 -0.30434
## 510 GARICS -0.254329 -0.038911 -0.16946 0.04604
## 512 BOLATTI -0.321447 0.631248 0.80185 -0.72480
## 514 TRÉMOULINAS 1.549333 0.673133 -0.55799 1.72788
## 516 C. RODRÍGUEZ -0.133406 -0.499646 -0.23422 0.53658
## 519 E. BARRETO -0.145981 0.882558 -0.03995 -0.02404
## 521 NEYMAR 1.256389 1.887797 -0.03995 2.35857
## 523 ADRIANO 1.742366 0.798788 -0.62274 1.44757
## 527 SEYDOU KEITA 1.619225 0.966328 1.96743 0.39642
## 529 ETOO 1.222514 1.259522 0.80185 2.21842
## 534 ESTRADA -0.067880 1.343292 -1.52931 -0.16419
## 535 PEPE 2.195036 0.756903 2.09694 1.16727
## 539 TISSONE -0.331167 0.547478 -0.10471 0.04604
## 549 KARNA -0.295349 1.050098 0.34857 1.37750
## 550 DEROIN -0.518433 0.547478 -1.46455 0.11612
## 552 ELANO -0.215536 1.636487 -0.29897 0.25627
## 555 DUNN -0.402635 1.050098 -0.10471 -0.44450
## 556 BELLAMY -0.291969 0.044859 0.02480 2.35857
## 559 CALLEJÓN 0.155047 0.505593 0.21906 1.23734
## 563 BALZARETTI 0.602973 0.296168 0.15431 0.60665
## 567 DUFF 0.273442 1.175752 -0.42848 0.95704
## 573 MARESCA -0.548829 0.589363 0.08956 -0.86496
## 574 GIACOMAZZI -0.601118 1.008213 0.99612 -0.79488
## 575 ANDERSON 0.938233 0.840673 0.08956 0.60665
## 576 JONÁS GUTIÉRREZ 0.112350 1.091983 -0.55799 0.25627
## 579 AGGER 0.950677 0.254284 1.25514 -0.44450
## 591 MASCARA -0.389252 1.678372 -1.07602 0.81688
## 593 SCALONI -0.404046 -0.080796 -0.55799 -0.51457
## 600 MILNER 0.817244 0.756903 -0.03995 0.67673
## 604 WRIGHT-PHILLIPS -0.008705 0.212399 -0.29897 1.51765
## 606 DIAMANTI -0.402949 1.301407 -1.33504 0.81688
## 610 PIENAAR -0.075108 1.427062 0.47808 0.81688
## 611 DI VAIO -0.475953 0.505593 0.93136 0.46650
## 613 LEDESMA -0.374242 0.379938 -0.03995 -0.93503
## 616 FANNI 0.772972 0.673133 1.38464 0.32635
## 623 LUIS GARCÍA -0.089608 1.385177 0.93136 1.30742
## 625 BERBATOV 0.277374 1.887797 1.19038 -0.79488
## 627 HONDA -0.404751 1.175752 0.47808 -0.58465
## 630 ÉVER BANEGA 0.752279 1.762142 -0.36373 -0.58465
## 635 IAQUINTA -0.601584 -0.038911 1.57891 0.04604
## 640 R. HAMOUMA -0.256180 0.924443 -0.75225 0.53658
## 641 SUTTER -0.106058 -0.332106 -0.03995 0.04604
## 642 ROMAN EREMENKO -0.394883 0.715018 -0.36373 -0.51457
## 646 ROMAO -0.474732 0.379938 -0.03995 -0.58465
## 651 ROLFES 2.106806 1.175752 0.86661 -0.79488
## 654 RAKYTSKYI -0.423448 0.547478 -0.03995 -0.86496
## 655 VIROTA 0.093455 0.296168 1.31989 0.25627
## 658 RAÚL GARCÍA 0.218404 1.008213 -0.10471 -0.37442
## 671 WHELAN -0.117218 0.715018 -0.36373 -1.00511
## 673 AGUIRREGARAY 0.027709 -0.373991 -0.03995 0.18619
## 677 JEANMA 0.424351 0.421823 1.19038 -0.23427
## 678 TYMOSHCHUK -0.604842 -0.038911 0.28382 -1.98618
## 679 JAVI MÁRQUEZ 0.041053 1.133868 -0.68750 -0.16419
## 680 DEJAN STANKOVI? 0.760495 1.552717 1.06087 0.39642
## 682 CANDREVA -0.027936 1.552717 0.08956 0.74681
## 684 L. OLÍMPIO -0.323876 0.547478 -0.10471 -0.72480
## 694 ARONICA -0.272329 -0.164566 0.54284 -0.58465
## 695 FÉRET -0.246925 1.594602 -0.36373 -0.72480
## 697 MARCOS SENNA 0.588811 1.175752 0.28382 -0.37442
## 699 WARNOCK 0.123669 0.296168 -0.29897 0.32635
## 700 PAULO FERREIRA -0.485115 -0.122681 -0.23422 -0.58465
## 703 KARAGOUNIS -0.547741 1.091983 -0.42848 0.60665
## 708 LEMAÎTRE -0.407968 0.338053 0.28382 -0.79488
## 714 FABIO SIMPLICIO -0.214693 0.631248 0.02480 0.04604
## 715 SCHAARS -0.220142 0.756903 -0.42848 -0.65473
## 719 G. FERNANDES -0.491033 -0.122681 -0.03995 -0.23427
## 723 TUNCAY ?ANLI -0.540670 0.296168 0.99612 0.81688
## 728 A. HUGHES -0.464365 0.002974 0.08956 -0.44450
## 730 RAKITI? 0.008336 1.720257 -0.62274 0.11612
## 737 MVILA -0.087604 -0.164566 1.19038 0.53658
## 742 DIEGO CASTRO 0.963589 0.715018 -0.81701 1.02711
## 747 ILUCZNICA -0.205785 0.128629 0.67235 -0.65473
## 748 EMERTON -0.594140 -0.248336 -0.29897 0.04604
## 750 THIAGO 0.346173 1.762142 0.21906 0.32635
## 751 BIONDINI -0.350632 0.421823 -0.23422 -0.16419
## 754 PARK CHU YOUNG -0.444923 0.254284 2.09694 0.60665
## 763 LUCHO 0.421149 1.343292 1.25514 -0.51457
## 767 NAINGGOLAN 0.071168 0.966328 0.15431 0.32635
## 769 SILVA 1.212269 2.306646 -0.81701 1.86804
## 770 MANATOZ 0.795808 1.133868 2.35596 -1.56572
## 787 DOMIZZI -0.225675 0.128629 1.12563 -0.30434
## 798 VUCINIC -0.211854 1.091983 0.73710 0.81688
## 799 PERII? -0.560856 1.008213 0.86661 -0.09411
## 800 CLEVERLEY -0.096879 0.966328 -0.62274 0.04604
## 804 CEARÁ 0.174958 0.212399 -0.81701 0.88696
## 805 MEHMET TOPAL -0.082071 0.254284 0.80185 -0.23427
## 806 L. BIGLIA -0.235831 0.840673 -0.23422 -0.30434
## 811 PULZETTI -0.545810 0.044859 -0.23422 0.18619
## 812 OLI? -0.535236 0.296168 1.25514 1.02711
## 817 ARTETA 0.579911 2.432301 -0.94652 0.88696
## 820 EMERSE FAÉ -0.404134 0.086744 -0.10471 0.53658
## 826 ABRIEL -0.595535 1.175752 -0.49323 -0.02404
## 829 BRUNO AMARO -0.522762 0.338053 0.47808 -1.07519
## 834 CODREA -0.411861 1.008213 -0.68750 -0.65473
## 835 S. LARSSON -0.225617 1.091983 -1.14078 -0.02404
## 836 J. MOUTINHO 0.382453 1.552717 -1.65881 0.46650
## 848 ?ORLUKA -0.119810 0.212399 0.34857 -0.30434
## 853 AFELLAY 0.039889 1.678372 -0.62274 1.30742
## 854 R. CARVALHO 1.927972 0.379938 1.96743 -1.14526
## 855 BOVO 0.156159 -0.038911 1.64366 -1.28542
## 860 BARRY 0.382453 1.217637 -0.42848 -0.93503
## 862 MATUIDI 1.287096 1.008213 -0.29897 1.16727
## 870 MARVEAUX -0.270067 -0.164566 -0.16946 0.67673
## 873 BAINES 2.090806 0.505593 -0.49323 0.39642
## 877 JOSÉ PEDRO -0.540670 -0.457761 -0.10471 0.32635
## 878 JONATHAN 0.246946 0.296168 -0.36373 0.81688
## 881 SRNA 0.065394 0.547478 -0.81701 -0.30434
## 889 VÄYRYNEN -0.548285 0.631248 -0.68750 -0.58465
## 894 GONALONS -0.545021 0.421823 0.41333 -0.51457
## 895 KHEDIRA 0.140575 0.254284 0.99612 0.04604
## 898 ALEX 0.083820 2.474186 0.21906 -0.79488
## 899 DANIC 1.097446 0.086744 0.08956 0.95704
## 903 REGINALDO -0.375908 0.631248 -0.23422 0.95704
## 904 KOMPANY 2.129641 0.924443 1.83793 -0.37442
## 905 DAVID PIZARRO 1.739402 2.725496 -1.33504 1.16727
## 909 ARSHAVIN 2.065063 1.552717 -1.01127 1.44757
## 913 STROOTMAN -0.171577 1.050098 -0.55799 -1.00511
## 916 SERVET ÇETIN 0.250749 -0.038911 1.12563 -0.86496
## 919 JEROME LEROY -0.602049 1.301407 -0.88176 -0.02404
## 922 C. SEEDORF 1.836482 2.641726 0.41333 -0.09411
## 923 DONADEL 0.484902 1.427062 0.02480 -0.09411
## 930 PINZI -0.089558 0.212399 -0.10471 0.18619
## 932 LESOIMIER -0.485115 0.254284 -0.75225 0.60665
## 940 CAIÇARA 0.663178 0.882558 -1.27029 0.32635
## 945 JEDA -0.411102 0.589363 1.31989 1.23734
## 948 R. MEIRELES 1.395199 1.427062 0.86661 0.04604
## 949 PISZCZEK 0.214494 -0.038911 0.08956 0.60665
## 951 CLICHY 2.088093 0.338053 -1.01127 2.28850
## 953 MARTIN -0.268752 1.468947 -1.20553 0.81688
## 955 C. MARCHENA -0.368905 0.631248 0.60759 -1.70588
## 957 CARLOS MARTINS -0.429828 0.715018 -0.23422 0.39642
## 958 JOÃO ALVES -0.376610 0.840673 -1.01127 -0.37442
## 959 COLMAN 0.463292 1.385177 -1.65881 -0.51457
## 960 PONZIO -0.387143 0.296168 -0.16946 -0.16419
## 962 STEVEN DAVIS 0.210884 0.631248 -0.16946 1.23734
## 964 CORGNET -0.099125 0.170514 -0.42848 0.18619
## 965 GRAVA 0.162711 -0.164566 -0.16946 0.25627
## 976 ROLANDO 1.374108 0.421823 1.77317 -0.30434
## 979 SAGNA 1.300718 -0.122681 -0.36373 1.09719
## 983 HEINZE 1.646410 0.086744 2.09694 -0.09411
## 984 VIDAL -0.024889 0.547478 1.25514 0.18619
## 989 JERTHSKI 1.438317 1.427062 0.54284 1.37750
## 993 J. ILI?I? -0.342519 1.175752 0.80185 0.25627
## 995 ENRIQUE 0.683142 0.631248 -0.88176 0.18619
## 997 STANKEVICIUS -0.302052 0.379938 1.06087 -0.51457
## 1000 CASITOLYUK -0.359299 1.259522 -0.62274 -0.16419
## Balones.Largos Compromiso K4$cluster GrupoWard
## 12 0.83752 0.84417 3 2
## 21 1.15009 0.24248 3 4
## 25 0.72031 0.24248 3 2
## 27 1.42358 1.10204 3 4
## 28 0.05611 0.93012 3 1
## 32 0.99381 0.67226 3 4
## 40 0.72031 -0.35921 3 4
## 43 0.72031 0.07057 3 2
## 46 0.83752 -0.10134 3 2
## 47 -0.02204 1.44586 3 4
## 52 0.99381 0.32844 3 4
## 58 0.01703 2.30541 3 4
## 61 1.18916 0.67226 3 4
## 63 1.22823 1.44586 3 4
## 72 0.75938 0.67226 3 2
## 75 0.40774 0.07057 3 1
## 78 1.73615 0.84417 3 4
## 80 0.01703 -0.01539 3 4
## 85 0.25146 -0.87494 3 1
## 87 0.75938 1.01608 3 4
## 93 -0.06111 2.13350 3 4
## 96 -0.88159 2.13350 3 1
## 97 0.48588 1.61777 3 4
## 98 1.73615 -0.27325 3 4
## 100 0.40774 1.70372 3 4
## 105 0.48588 0.58630 3 2
## 106 0.17332 1.10204 3 4
## 107 -0.02204 0.32844 3 2
## 117 -0.37367 2.64923 3 4
## 118 0.60310 1.18799 3 4
## 119 0.68124 1.10204 3 4
## 120 0.87659 1.27395 3 4
## 123 0.09518 1.35990 3 1
## 125 0.75938 0.75821 3 4
## 131 1.22823 0.58630 3 4
## 132 0.32960 0.93012 3 4
## 133 1.07195 -0.27325 3 1
## 138 -0.10018 2.39137 3 4
## 142 0.17332 -0.10134 3 4
## 146 -0.72531 1.53181 3 1
## 147 0.48588 1.61777 3 4
## 152 1.42358 0.67226 3 4
## 153 1.69708 -0.10134 3 4
## 160 0.09518 0.24248 3 1
## 162 -0.25646 1.10204 3 4
## 165 1.26730 0.67226 3 4
## 170 1.26730 0.41439 3 4
## 174 -0.02204 0.84417 3 1
## 177 1.38451 1.01608 3 4
## 178 -0.17832 0.32844 3 4
## 179 0.52496 1.61777 3 4
## 180 0.99381 -0.01539 3 4
## 184 0.52496 -0.01539 3 1
## 191 -0.02204 0.58630 3 4
## 192 0.72031 1.10204 3 4
## 200 2.20500 2.21946 3 4
## 203 1.42358 1.44586 3 4
## 205 0.91566 0.50035 3 1
## 208 0.09518 1.53181 3 4
## 212 1.54080 -0.27325 3 4
## 225 0.72031 0.24248 3 1
## 227 1.42358 0.41439 3 4
## 231 1.69708 1.70372 3 4
## 236 0.60310 1.01608 3 4
## 239 0.72031 0.07057 3 1
## 244 2.08779 0.24248 3 4
## 259 0.36867 0.32844 3 1
## 260 0.68124 0.41439 3 2
## 262 -0.72531 1.44586 3 4
## 263 0.40774 1.01608 3 1
## 268 0.17332 0.15652 3 1
## 273 0.87659 -0.01539 3 4
## 277 0.32960 0.32844 3 4
## 279 0.17332 0.84417 3 4
## 280 0.91566 2.21946 3 4
## 282 1.46266 1.70372 3 4
## 284 0.32960 0.50035 3 1
## 285 1.89243 -0.10134 3 4
## 287 1.42358 0.24248 3 4
## 291 -0.10018 1.27395 3 4
## 293 1.42358 -0.53112 3 4
## 298 0.52496 0.58630 3 1
## 299 1.15009 0.41439 3 4
## 300 -0.33460 -0.01539 3 1
## 301 0.75938 1.10204 3 4
## 307 1.18916 1.10204 3 4
## 311 -0.17832 0.75821 3 4
## 317 0.99381 0.58630 3 1
## 319 -0.21739 1.78968 3 1
## 320 0.29053 0.58630 3 1
## 326 1.77522 1.87563 3 4
## 336 0.17332 0.32844 3 4
## 339 -0.02204 1.27395 3 1
## 342 1.38451 -0.01539 3 4
## 344 1.69708 1.61777 3 4
## 355 2.08779 1.70372 3 4
## 357 0.52496 0.41439 3 2
## 359 1.38451 -0.10134 3 4
## 360 1.57987 0.50035 3 4
## 368 -0.02204 1.01608 3 4
## 370 1.65801 0.41439 3 4
## 372 0.29053 0.41439 3 4
## 380 0.79845 0.93012 3 4
## 382 0.29053 0.07057 3 4
## 384 1.03288 1.01608 3 4
## 385 1.18916 0.50035 3 4
## 388 1.18916 0.24248 3 4
## 394 1.65801 1.18799 3 4
## 395 0.79845 1.78968 3 4
## 397 -0.17832 0.24248 3 1
## 398 0.52496 1.61777 3 4
## 399 1.65801 0.67226 3 4
## 400 0.05611 -0.27325 3 1
## 404 -0.13925 1.87563 3 1
## 405 2.63478 -0.53112 3 4
## 407 -0.02204 0.32844 3 2
## 410 1.65801 2.04755 3 4
## 412 0.95473 -0.27325 3 1
## 415 0.83752 0.75821 3 4
## 420 0.01703 0.75821 3 4
## 426 1.03288 0.32844 3 4
## 428 2.08779 -0.35921 3 4
## 431 -0.33460 1.61777 3 1
## 433 0.44681 3.16497 3 4
## 434 -0.21739 1.35990 3 1
## 445 0.05611 -0.01539 3 2
## 447 1.26730 1.53181 3 4
## 450 0.60310 1.10204 3 4
## 455 0.52496 1.27395 3 2
## 456 0.87659 1.27395 3 4
## 459 0.48588 1.01608 3 1
## 463 -0.37367 1.01608 3 4
## 465 1.42358 1.18799 3 4
## 469 0.68124 1.87563 3 4
## 470 -0.68624 1.70372 3 4
## 480 0.79845 0.41439 3 4
## 481 0.32960 2.64923 3 4
## 484 0.83752 1.18799 3 4
## 486 0.36867 1.70372 3 1
## 488 -0.21739 0.93012 3 4
## 489 0.79845 0.93012 3 4
## 490 -0.06111 0.84417 3 1
## 493 0.32960 0.58630 3 1
## 497 -0.29553 0.67226 3 4
## 499 0.17332 2.56328 3 4
## 503 -0.10018 0.15652 3 2
## 505 2.12686 1.35990 3 4
## 508 2.08779 1.53181 3 4
## 510 0.05611 1.01608 3 4
## 512 0.17332 0.24248 3 1
## 514 1.22823 1.01608 3 4
## 516 -0.45181 1.78968 3 4
## 519 1.61894 -0.44516 3 4
## 521 0.72031 -0.01539 3 4
## 523 1.03288 1.44586 3 4
## 527 1.46266 1.78968 3 4
## 529 0.64217 2.04755 3 4
## 534 1.38451 0.67226 3 4
## 535 -0.02204 -0.18730 3 4
## 539 0.64217 0.32844 3 4
## 549 0.05611 0.75821 3 4
## 550 0.68124 1.01608 3 4
## 552 1.34544 1.18799 3 4
## 555 1.07195 0.50035 3 4
## 556 0.60310 1.61777 3 4
## 559 0.91566 0.50035 3 2
## 563 0.87659 1.53181 3 4
## 567 0.95473 1.70372 3 4
## 573 0.64217 -0.18730 3 4
## 574 0.52496 1.27395 3 1
## 575 1.50173 0.07057 3 4
## 576 0.05611 0.84417 3 4
## 579 1.30637 0.84417 3 4
## 591 1.54080 1.35990 3 4
## 593 -0.17832 0.50035 3 1
## 600 1.15009 0.67226 3 4
## 604 0.25146 1.10204 3 4
## 606 1.89243 0.58630 3 4
## 610 0.68124 0.84417 3 4
## 611 0.52496 1.18799 3 1
## 613 -0.33460 1.10204 3 1
## 616 0.48588 1.18799 3 4
## 623 1.73615 1.35990 3 4
## 625 0.17332 0.15652 3 1
## 627 1.73615 0.84417 3 4
## 630 0.83752 0.58630 3 4
## 635 0.09518 1.61777 3 1
## 640 0.87659 1.27395 3 4
## 641 -0.41274 0.75821 3 4
## 642 0.48588 0.24248 3 4
## 646 0.72031 0.32844 3 4
## 651 0.72031 1.87563 3 4
## 654 1.54080 0.50035 3 4
## 655 1.97058 1.87563 3 4
## 658 0.95473 -0.27325 3 4
## 671 1.22823 -0.27325 3 4
## 673 -0.21739 0.75821 3 4
## 677 0.87659 0.58630 3 4
## 678 0.75938 1.53181 3 1
## 679 1.77522 0.93012 3 4
## 680 1.18916 2.47732 3 4
## 682 1.03288 1.53181 3 4
## 684 0.36867 0.93012 3 4
## 694 -0.10018 0.84417 3 1
## 695 1.61894 0.84417 3 4
## 697 1.73615 1.70372 3 4
## 699 -0.17832 1.10204 3 4
## 700 0.52496 0.41439 3 4
## 703 0.95473 1.44586 3 4
## 708 0.64217 -0.18730 3 4
## 714 0.83752 0.67226 3 4
## 715 0.83752 1.27395 3 4
## 719 0.09518 1.18799 3 4
## 723 -0.13925 1.18799 3 1
## 728 -0.37367 1.10204 3 1
## 730 1.57987 0.41439 3 4
## 737 -0.33460 1.01608 3 1
## 742 0.40774 1.27395 3 4
## 747 0.09518 1.87563 3 1
## 748 0.64217 0.84417 3 4
## 750 1.11102 0.24248 3 4
## 751 -0.10018 2.04755 3 4
## 754 0.87659 0.93012 3 1
## 763 1.42358 2.82114 3 4
## 767 0.32960 1.70372 3 4
## 769 1.18916 1.01608 3 4
## 770 1.77522 1.18799 3 1
## 787 1.34544 0.41439 3 1
## 798 0.64217 1.10204 3 4
## 799 0.32960 0.15652 3 1
## 800 0.64217 0.58630 3 4
## 804 0.83752 0.58630 3 2
## 805 0.40774 0.75821 3 1
## 806 0.32960 0.93012 3 4
## 811 0.01703 1.10204 3 4
## 812 -0.10018 2.47732 3 4
## 817 2.00965 1.61777 3 4
## 820 0.36867 0.84417 3 4
## 826 1.26730 1.35990 3 4
## 829 1.46266 0.15652 3 4
## 834 0.60310 0.07057 3 4
## 835 1.50173 -0.01539 3 4
## 836 1.18916 2.30541 3 4
## 848 -0.37367 0.15652 3 1
## 853 1.18916 -0.01539 3 4
## 854 -0.41274 1.87563 3 1
## 855 1.42358 0.07057 3 1
## 860 1.22823 1.10204 3 4
## 862 1.07195 1.35990 3 4
## 870 1.11102 2.04755 3 4
## 873 1.57987 1.18799 3 4
## 877 0.75938 0.84417 3 4
## 878 -0.33460 0.93012 3 4
## 881 1.61894 1.61777 3 4
## 889 0.75938 0.07057 3 4
## 894 -0.17832 -0.01539 3 1
## 895 -0.17832 1.27395 3 1
## 898 1.46266 -0.27325 3 4
## 899 1.46266 1.18799 3 4
## 903 0.79845 0.32844 3 2
## 904 0.40774 0.93012 3 4
## 905 1.38451 1.61777 3 4
## 909 0.60310 1.78968 3 4
## 913 1.54080 0.75821 3 4
## 916 -0.29553 1.87563 3 1
## 919 0.91566 0.67226 3 4
## 922 1.81429 1.35990 3 4
## 923 0.95473 2.04755 3 4
## 930 -0.21739 1.70372 3 4
## 932 0.56403 0.67226 3 2
## 940 0.99381 0.32844 3 4
## 945 0.40774 1.01608 3 1
## 948 1.38451 2.47732 3 4
## 949 -0.25646 0.67226 3 2
## 951 -0.02204 0.75821 3 4
## 953 0.95473 1.10204 3 4
## 955 -0.02204 0.75821 3 1
## 957 1.34544 1.44586 3 4
## 958 0.91566 1.01608 3 4
## 959 1.22823 0.41439 3 4
## 960 0.64217 1.18799 3 4
## 962 0.56403 1.35990 3 4
## 964 -0.13925 1.18799 3 4
## 965 -0.45181 1.27395 3 4
## 976 -0.45181 0.84417 3 1
## 979 -0.52996 0.67226 3 2
## 983 0.68124 1.78968 3 4
## 984 0.83752 1.61777 3 4
## 989 0.75938 1.27395 3 4
## 993 0.79845 0.32844 3 1
## 995 0.13425 0.32844 3 4
## 997 0.87659 0.32844 3 1
## 1000 1.46266 0.07057 3 4
Tabla <- table(K4$cluster, PESJ$GrupoWard)
Tabla
##
## 1 2 3 4
## 1 198 0 131 0
## 2 0 0 0 23
## 3 62 19 0 212
## 4 91 233 0 31
assocstats(Tabla)
## X^2 df P(> X^2)
## Likelihood Ratio 1195.5 9 0
## Pearson 1126.5 9 0
##
## Phi-Coefficient : 1.061
## Contingency Coeff.: 0.728
## Cramer's V : 0.613
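Comparison: The phi coefficient (1.061) points to a strong association between the k-means and hierarchical (Ward) groupings. Because phi is not bounded by 1 for tables larger than 2x2, Cramér's V (0.613) is the easier value to interpret, and it also indicates a fairly strong association.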
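The three association measures can be recovered by hand from the Pearson chi-square statistic. The sketch below retypes the cross-tabulation printed above (so it does not need the `K4` or `PESJ` objects) and uses base R's `chisq.test`; the names `n`, `chi2`, `phi`, `contingency` and `cramer_v` are introduced here only for illustration.

```r
# Cross-tabulation copied from the output above (rows: k-means, columns: Ward)
Tabla <- matrix(c(198,   0, 131,   0,
                    0,   0,   0,  23,
                   62,  19,   0, 212,
                   91, 233,   0,  31),
                nrow = 4, byrow = TRUE,
                dimnames = list(kmeans = 1:4, ward = 1:4))

n <- sum(Tabla)  # total number of observations

# Pearson chi-square; the warning about small expected counts is suppressed
chi2 <- unname(suppressWarnings(chisq.test(Tabla))$statistic)

phi         <- sqrt(chi2 / n)                            # can exceed 1 beyond 2x2
contingency <- sqrt(chi2 / (chi2 + n))                   # always below 1
cramer_v    <- sqrt(chi2 / (n * (min(dim(Tabla)) - 1)))  # rescaled to [0, 1]

round(c(phi = phi, contingency = contingency, cramer_v = cramer_v), 3)
```

Dividing by `min(rows, columns) - 1` is what keeps Cramér's V inside [0, 1], which is why it is usually preferred over phi for a 4x4 table like this one.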
Part a.
Advantages: . Like k-means, it can process large volumes of data.
. It is a partitional algorithm that needs only a single pass over the input data to return a result, which makes it fast.
. K-means requires the number of clusters to be chosen in advance, so it gives no reference for how the instances within a cluster relate to one another; one only knows that they are similar. The leader method fixes a minimum similarity instead, which gives a better idea of how the resulting clusters are formed.
. Every similarity, including the minimum similarity, is guaranteed to lie between 0 and 1, which makes it easier to keep in mind the degree of similarity that is wanted.
. The number of clusters is determined by the algorithm rather than chosen a priori as in k-means.
Disadvantages: . The user must define a threshold d, the maximum accepted distance, and a poor choice of this parameter can lead to poor results. This drawback is to some extent shared with k-means, where the number of clusters k must be selected and a poor choice of k can likewise lead to poor results.
. As with k-means, convergence to local optima can produce poor results.
. The result depends on the order of the input data and on the order in which each point is compared with the leaders, which can alter the outcome and yield incorrect results.
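The single-pass behaviour and the dependence on input order can be seen in a minimal sketch of the leader method. This is an illustrative implementation under stated assumptions, not the course's reference code: the function name `leader_cluster`, the use of Euclidean distance, and the toy data are all assumptions made here.

```r
# Leader clustering: one pass over the rows of x with distance threshold d.
leader_cluster <- function(x, d) {
  x <- as.matrix(x)
  leaders <- integer(0)         # row indices of the current leaders
  cluster <- integer(nrow(x))   # cluster label assigned to each row
  for (i in seq_len(nrow(x))) {
    assigned <- FALSE
    for (j in seq_along(leaders)) {
      # Euclidean distance between row i and the j-th leader
      if (sqrt(sum((x[i, ] - x[leaders[j], ])^2)) <= d) {
        cluster[i] <- j         # join the first leader that is close enough
        assigned <- TRUE
        break
      }
    }
    if (!assigned) {            # no leader within d: row i becomes a new leader
      leaders <- c(leaders, i)
      cluster[i] <- length(leaders)
    }
  }
  list(cluster = cluster, leaders = leaders)
}

# Toy one-dimensional data (made up for illustration)
x <- matrix(c(0, 1, 2), ncol = 1)
res_a <- leader_cluster(x, d = 1)                              # order 0, 1, 2
res_b <- leader_cluster(x[c(2, 1, 3), , drop = FALSE], d = 1)  # order 1, 0, 2

length(res_a$leaders)   # number of clusters with the first ordering
length(res_b$leaders)   # number of clusters with the second ordering
```

The same three points and the same threshold give a different number of clusters depending only on which point is read first, which is the order dependence listed among the disadvantages.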
Part b.
Proposed modifications: . As mentioned above, the comparison order and the input order can affect the results, so some proposals change the way each cluster is traversed, now going from newest to oldest. This change does not affect performance, but it also does not remove the dependence on the comparison order, which remains.
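A small sketch can show why the newest-to-oldest traversal leaves the order dependence in place. It assumes that "traversing from newest to oldest" means checking the most recently created leader first; the function `leader_cluster_order`, its `newest_first` flag, the Euclidean distance, and the toy data are hypothetical and introduced only for illustration.

```r
# Leader clustering with a switch for the order in which leaders are checked.
leader_cluster_order <- function(x, d, newest_first = FALSE) {
  x <- as.matrix(x)
  leaders <- integer(0)         # row indices of the current leaders
  cluster <- integer(nrow(x))   # cluster label assigned to each row
  for (i in seq_len(nrow(x))) {
    idx <- seq_along(leaders)   # oldest leader first by default
    if (newest_first) idx <- rev(idx)
    hit <- 0L
    for (j in idx) {
      if (sqrt(sum((x[i, ] - x[leaders[j], ])^2)) <= d) {
        hit <- j                # first leader (in traversal order) within d
        break
      }
    }
    if (hit == 0L) {            # no leader within d: row i becomes a new leader
      leaders <- c(leaders, i)
      hit <- length(leaders)
    }
    cluster[i] <- hit
  }
  cluster
}

# The third point (1) lies within d = 1 of both leaders (0 and 2)
x <- matrix(c(0, 2, 1), ncol = 1)
oldest <- leader_cluster_order(x, d = 1)
newest <- leader_cluster_order(x, d = 1, newest_first = TRUE)

oldest   # labels when leaders are checked oldest first
newest   # labels when leaders are checked newest first
```

Both variants do the same amount of work per point, but a point that is within the threshold of several leaders is still assigned according to whichever leader is checked first, so reversing the traversal changes the assignment without removing the dependence.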
Part a.
Part b.
Part c.
Part d.