UAS Metode Multivariat

#Nomor 5

library(readr)
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(FactoMineR)
library(factoextra)
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa

Dataset:

# Select relevant numerical columns (columns 2 to 11)
numerical_data <- data[, 2:11]

# Perform PCA
pca_result <- PCA(numerical_data, scale.unit = TRUE, graph = FALSE)
# Scree plot (Explained variance)
fviz_screeplot(pca_result, addlabels = TRUE, ylim = c(0, 50))

# Biplot of PCA (First two principal components)
fviz_pca_biplot(pca_result, 
                repel = TRUE,        # Avoid text overlapping
                col.var = "steelblue", # Color of variable vectors
                col.ind = data$Rank, # Color individuals by Rank
                gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"))

# Summary of PCA result
summary(pca_result)
## 
## Call:
## PCA(X = numerical_data, scale.unit = TRUE, graph = FALSE) 
## 
## 
## Eigenvalues
##                        Dim.1   Dim.2   Dim.3   Dim.4   Dim.5   Dim.6   Dim.7
## Variance               3.272   1.737   1.405   1.057   0.685   0.599   0.451
## % of var.             32.719  17.371  14.049  10.569   6.848   5.993   4.512
## Cumulative % of var.  32.719  50.090  64.140  74.708  81.556  87.548  92.061
##                        Dim.8   Dim.9  Dim.10
## Variance               0.397   0.215   0.182
## % of var.              3.969   2.148   1.822
## Cumulative % of var.  96.030  98.178 100.000
## 
## Individuals (the 10 first)
##                  Dist    Dim.1    ctr   cos2    Dim.2    ctr   cos2    Dim.3
## 1            |  2.369 |  0.792  0.467  0.112 |  0.772  0.836  0.106 |  0.827
## 2            |  3.507 |  1.235  1.137  0.124 |  0.575  0.464  0.027 |  2.141
## 3            |  3.396 |  1.358  1.375  0.160 |  0.484  0.329  0.020 |  1.956
## 4            |  2.763 | -0.610  0.277  0.049 | -0.875  1.074  0.100 |  0.890
## 5            |  3.018 | -0.586  0.256  0.038 |  2.131  6.376  0.499 | -1.225
## 6            |  2.428 |  0.357  0.095  0.022 | -1.685  3.986  0.482 |  0.767
## 7            |  2.563 |  0.272  0.055  0.011 | -1.094  1.680  0.182 | -1.283
## 8            |  2.561 |  0.588  0.257  0.053 |  0.231  0.075  0.008 | -0.418
## 9            |  3.742 | -1.995  2.968  0.284 |  0.561  0.442  0.022 | -0.730
## 10           |  2.794 | -1.546  1.782  0.306 |  0.488  0.335  0.031 |  0.841
##                 ctr   cos2  
## 1             1.187  0.122 |
## 2             7.960  0.373 |
## 3             6.644  0.332 |
## 4             1.375  0.104 |
## 5             2.606  0.165 |
## 6             1.020  0.100 |
## 7             2.857  0.250 |
## 8             0.303  0.027 |
## 9             0.925  0.038 |
## 10            1.227  0.091 |
## 
## Variables
##                 Dim.1    ctr   cos2    Dim.2    ctr   cos2    Dim.3    ctr
## X100m        | -0.775 18.344  0.600 |  0.187  2.016  0.035 | -0.184  2.420
## Long.jump    |  0.742 16.822  0.550 | -0.345  6.869  0.119 |  0.182  2.363
## Shot.put     |  0.623 11.844  0.388 |  0.598 20.607  0.358 | -0.023  0.039
## High.jump    |  0.572  9.998  0.327 |  0.350  7.064  0.123 | -0.260  4.794
## X400m        | -0.680 14.116  0.462 |  0.569 18.666  0.324 |  0.131  1.230
## X110m.hurdle | -0.746 17.020  0.557 |  0.229  3.013  0.052 | -0.093  0.611
## Discus       |  0.552  9.328  0.305 |  0.606 21.162  0.368 |  0.043  0.131
## Pole.vault   |  0.050  0.077  0.003 | -0.180  1.873  0.033 |  0.692 34.061
## Javeline     |  0.277  2.347  0.077 |  0.317  5.784  0.100 | -0.390 10.807
## X1500m       | -0.058  0.103  0.003 |  0.474 12.946  0.225 |  0.782 43.543
##                cos2  
## X100m         0.034 |
## Long.jump     0.033 |
## Shot.put      0.001 |
## High.jump     0.067 |
## X400m         0.017 |
## X110m.hurdle  0.009 |
## Discus        0.002 |
## Pole.vault    0.479 |
## Javeline      0.152 |
## X1500m        0.612 |

(a) Banyaknya Komponen utama yang dihasilkan dan interpretasinya

Dari hasil PCA sebelumnya:

Berdasarkan kumulatif variansi yang didapatkan pada analisis, 4 komponen utama cukup untuk menjelaskan sekitar 74.71% dari variasi total data, yang sudah signifikan untuk banyak aplikasi, terutama ketika varians lebih dari 70% dianggap cukup.

Interpretasi:

sehingga banyak komponen utama yang dihasilkan untuk menjelaskan variasi data adalah 4 komponen utama

(b) Buatkan scatter plot data dalam ruang dua dimensi menggunakan komponen utama pertama dan kedua. Apa Pola atau klaster yang terlihat?

# Extract the scores for the first two principal components
pca_scores <- data.frame(pca_result$ind$coord)

# Add rank information for coloring
pca_scores$Rank <- data$Rank

# Scatter plot for the first two principal components
ggplot(pca_scores, aes(x = Dim.1, y = Dim.2, color = Rank)) +
  geom_point(size = 3, alpha = 0.7) +
  scale_color_gradient(low = "blue", high = "red") +
  labs(title = "PCA Scatter Plot: First Two Principal Components",
       x = "Principal Component 1",
       y = "Principal Component 2",
       color = "Rank") +
  theme_minimal()

Analisis Pola atau Klaster:

Distribusi Titik:

Clustering:

Separasi Kelas:

(c) hubungan antara hasil PCA yang diperoleh dengan variabel/cabang lomba serta sampel/atlet yang diperoleh.

Hubungan dengan variabel (cabang lomba):

Hubungan dengan Sampel (Atlet):

(d) Komponen utama mana yang paling berkaitan dengan performa atlet di cabang tertentu.

Untuk menentukan komponen utama yang paling berkaitan dengan performa atlet pada cabang tertentu, perlu menganalisis loading atau koefisien variabel asli (cabang lomba) pada masing-masing komponen utama.

# Extract loading values
loading_matrix <- data.frame(pca_result$var$coord)
loading_matrix$Variable <- rownames(loading_matrix)

# Reshape data for visualization
library(reshape2)
loading_melted <- melt(loading_matrix, id.vars = "Variable")

# Visualize loading values with a heatmap
ggplot(loading_melted, aes(x = variable, y = Variable, fill = value)) +
  geom_tile(color = "white") +
  scale_fill_gradient2(low = "blue", high = "red", mid = "white", midpoint = 0) +
  labs(title = "PCA Loading Heatmap",
       x = "Principal Components",
       y = "Variables (Cabang Lomba)",
       fill = "Loading") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

dari visualisasi heatmap tersebut dapat diketahui bahwa:

(e) berdasarkan analisis, urutan atlet yang memiliki performa terbaik.

# Calculate rankings based on the first principal component
data$Performance_Score <- pca_result$ind$coord[, 1]

# Rank athletes
data <- data[order(-data$Performance_Score), ]
print(data)
##        Athlets X100m Long.jump Shot.put High.jump X400m X110m.hurdle Discus
## 16      Karpov 10.50      7.81    15.93      2.09 46.81        13.97  51.65
## 14      Sebrle 10.85      7.84    16.36      2.12 48.36        14.05  48.72
## 15        Clay 10.44      7.96    15.23      2.06 49.19        14.13  50.11
## 17       Macey 10.89      7.47    15.73      2.15 48.97        14.56  48.34
## 18     Warners 10.62      7.74    14.48      1.97 47.97        14.01  43.73
## 22     Bernard 10.69      7.48    14.80      2.12 49.13        14.17  44.75
## 3       KARPOV 11.02      7.30    14.77      2.04 48.37        14.09  48.95
## 2         CLAY 10.76      7.40    14.26      1.86 49.37        14.05  50.72
## 19   Zsivoczky 10.91      7.14    15.31      2.12 49.40        14.95  45.62
## 20       Hernu 10.97      7.19    14.65      2.03 48.73        14.25  44.72
## 27       Smith 10.85      6.81    15.24      1.91 49.27        14.01  49.02
## 1       SEBRLE 11.04      7.58    14.83      2.07 49.81        14.69  43.75
## 8     McMULLEN 10.83      7.31    13.76      2.13 49.91        14.38  44.41
## 24   Pogorelov 10.95      7.31    15.10      2.06 50.79        14.21  44.60
## 29    Ojaniemi 10.68      7.50    14.97      1.94 49.12        15.01  40.35
## 6      WARNERS 11.11      7.60    14.31      1.98 48.68        14.23  41.10
## 28   Averyanov 10.55      7.34    14.44      1.94 49.72        14.39  39.88
## 21        Nool 10.80      7.53    14.26      1.88 48.81        14.80  42.05
## 7    ZSIVOCZKY 11.13      7.30    13.48      2.01 48.62        14.17  45.67
## 25  Schoenbeck 10.90      7.30    14.77      1.88 50.30        14.34  44.41
## 23    Schwarzl 10.98      7.49    14.01      1.94 49.76        14.25  42.43
## 26      Barras 11.14      6.99    14.91      1.94 49.41        14.37  44.83
## 32       Drews 10.87      7.38    13.07      1.88 48.51        14.01  40.11
## 35       Gomez 11.08      7.26    14.57      1.85 48.61        14.41  40.95
## 31          Qi 11.06      7.34    13.55      1.97 49.65        14.78  45.13
## 30     Smirnov 10.89      7.07    13.88      1.94 49.11        14.77  42.47
## 5       YURKOV 11.34      7.09    15.19      2.10 50.42        15.31  46.26
## 4      BERNARD 11.02      7.23    14.25      1.92 48.93        14.99  40.87
## 34       Terek 10.92      6.94    15.15      1.94 49.56        15.12  45.62
## 39 Korkizoglou 10.86      7.07    14.81      1.94 51.16        14.96  46.07
## 33 Parkhomenko 11.14      6.61    15.69      2.03 51.04        14.88  41.90
## 11      BARRAS 11.33      6.97    14.09      1.95 49.48        14.48  42.10
## 36        Turi 11.08      6.91    13.62      2.03 51.67        14.26  39.83
## 10       HERNU 11.37      7.56    14.41      1.86 51.10        15.06  44.99
## 38   Karlivans 11.33      7.26    13.30      1.97 50.54        14.98  43.34
## 9    MARTINEAU 11.64      6.81    14.57      1.95 50.14        14.93  47.60
## 12        NOOL 11.33      7.27    12.68      1.98 49.20        15.29  37.92
## 37     Lorenzo 11.10      7.03    13.22      1.85 49.34        15.38  40.22
## 40       Uldal 11.23      6.99    13.53      1.85 50.95        15.09  43.01
## 41     Casarsa 11.36      6.68    14.92      1.94 53.20        15.39  48.66
## 13 BOURGUIGNON 11.36      6.80    13.46      1.86 51.16        15.67  40.49
##    Pole.vault Javeline X1500m Rank Points Competition Performance_Score
## 16       4.60    55.54 278.11    3   8725    OlympicG       4.619987275
## 14       5.00    70.52 280.01    1   8893    OlympicG       4.038448501
## 15       4.90    69.71 282.00    2   8820    OlympicG       3.919365157
## 17       4.40    58.46 265.42    4   8414    OlympicG       2.233460566
## 18       4.90    55.39 278.05    5   8343    OlympicG       2.168396445
## 22       4.40    55.27 276.31    9   8225    OlympicG       1.906334368
## 3        4.92    50.31 300.20    3   8099    Decastar       1.358214936
## 2        4.92    60.15 301.50    2   8122    Decastar       1.234990563
## 19       4.70    63.45 269.54    6   8287    OlympicG       0.925132183
## 20       4.80    57.76 264.35    7   8237    OlympicG       0.889037852
## 27       4.20    61.52 272.74   14   8023    OlympicG       0.870310570
## 1        5.02    63.19 291.70    1   8217    Decastar       0.791627717
## 8        4.42    56.37 285.10    8   7995    Decastar       0.587516189
## 24       5.00    53.45 287.63   11   8084    OlympicG       0.539677028
## 29       4.60    59.26 275.71   16   8006    OlympicG       0.380113999
## 6        4.92    51.77 278.10    6   8030    Decastar       0.356889530
## 28       4.80    54.51 271.02   15   8021    OlympicG       0.349155138
## 21       5.40    61.33 276.33    8   8235    OlympicG       0.295305667
## 7        4.42    55.37 268.00    7   8004    Decastar       0.271774781
## 25       5.00    60.89 278.82   12   8077    OlympicG       0.114430985
## 23       5.10    56.32 273.56   10   8102    OlympicG       0.081078659
## 26       4.60    64.55 267.09   13   8067    OlympicG       0.002145203
## 32       5.00    51.53 274.21   19   7926    OlympicG      -0.248684024
## 35       4.40    60.71 269.70   22   7865    OlympicG      -0.289889208
## 31       4.50    60.79 272.63   18   7934    OlympicG      -0.434466691
## 30       4.70    60.88 263.31   17   7993    OlympicG      -0.484514213
## 5        4.72    63.44 276.40    5   8036    Decastar      -0.585968338
## 4        5.32    62.77 280.10    4   8067    Decastar      -0.609515083
## 34       5.30    50.62 290.36   21   7893    OlympicG      -0.681953059
## 39       4.70    53.05 317.00   26   7573    OlympicG      -0.957829813
## 33       4.80    65.82 277.94   20   7918    OlympicG      -1.069429104
## 11       4.72    55.40 282.00   11   7708    Decastar      -1.341652727
## 36       4.80    59.34 290.01   23   7708    OlympicG      -1.541813056
## 10       4.82    57.19 285.10   10   7733    Decastar      -1.546076462
## 38       4.50    52.92 278.67   25   7583    OlympicG      -1.994368727
## 9        4.92    52.33 262.10    9   7802    Decastar      -1.995359298
## 12       4.62    57.44 266.60   12   7651    Decastar      -2.344973806
## 37       4.50    58.36 263.08   24   7592    OlympicG      -2.408509980
## 40       4.50    60.00 281.70   27   7495    OlympicG      -2.562259591
## 41       4.40    58.62 296.12   28   7404    OlympicG      -2.857088268
## 13       5.02    54.68 291.70   13   7313    Decastar      -3.979041865

Tabel diatas menunjukkan ranking dari semua atlet dalam dataset, atlet dengan performa terbaik dapat diidentifikasi berdasarkan nilai PC1 yang lebih tinggi yaitu Karpov dengan skor 4.61. yang kemungkinan mencerminkan atribut utama seperti kecepatan atau kekuatan.