Multivariate analysis is a family of statistical methods for analysing data with more than one variable at the same time. It aims to identify the relationships among variables and the patterns present in the data. With multivariate analysis, data with many variables can be understood more comprehensively, which supports more accurate decisions and conclusions.
The dataset used is Concrete Compressive Strength, which has 1,030 rows and 9 numeric columns suitable for multivariate analysis.
# Load the readr package
library("readr")
# Read the data
data_ganjil <- read_csv("Concrete_Data.csv")
## Rows: 1030 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (9): Cement (component 1)(kg in a m^3 mixture), Blast Furnace Slag (comp...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Show the first few rows of the data
head(data_ganjil)
At this stage the target variable, Concrete Compressive Strength, which is the last column, is removed from the dataset. The subsequent analysis uses only the remaining feature columns.
# Drop the target column (the last column)
data_fitur <- data_ganjil[, -ncol(data_ganjil)]
# Check the result
colnames(data_fitur)
## [1] "Cement (component 1)(kg in a m^3 mixture)"
## [2] "Blast Furnace Slag (component 2)(kg in a m^3 mixture)"
## [3] "Fly Ash (component 3)(kg in a m^3 mixture)"
## [4] "Water (component 4)(kg in a m^3 mixture)"
## [5] "Superplasticizer (component 5)(kg in a m^3 mixture)"
## [6] "Coarse Aggregate (component 6)(kg in a m^3 mixture)"
## [7] "Fine Aggregate (component 7)(kg in a m^3 mixture)"
## [8] "Age (day)"
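The column names above are long enough to make later output hard to read. As an optional sketch that is not part of the original analysis, the names could be shortened by dropping everything from the first opening parenthesis onward. The vector `long_names` simply copies the names printed above, and `short_names` is an illustrative name.

```r
# Column names copied from the output above
long_names <- c(
  "Cement (component 1)(kg in a m^3 mixture)",
  "Blast Furnace Slag (component 2)(kg in a m^3 mixture)",
  "Fly Ash (component 3)(kg in a m^3 mixture)",
  "Water (component 4)(kg in a m^3 mixture)",
  "Superplasticizer (component 5)(kg in a m^3 mixture)",
  "Coarse Aggregate (component 6)(kg in a m^3 mixture)",
  "Fine Aggregate (component 7)(kg in a m^3 mixture)",
  "Age (day)"
)

# Remove the first " (" and everything that follows it
short_names <- sub(" \\(.*$", "", long_names)
print(short_names)
```

If this were adopted, `colnames(data_fitur) <- short_names` would apply the shorter labels to the feature data.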
# Count the missing values in each column
sapply(data_fitur, function(x) sum(is.na(x)))
## Cement (component 1)(kg in a m^3 mixture)
## 0
## Blast Furnace Slag (component 2)(kg in a m^3 mixture)
## 0
## Fly Ash (component 3)(kg in a m^3 mixture)
## 0
## Water (component 4)(kg in a m^3 mixture)
## 0
## Superplasticizer (component 5)(kg in a m^3 mixture)
## 0
## Coarse Aggregate (component 6)(kg in a m^3 mixture)
## 0
## Fine Aggregate (component 7)(kg in a m^3 mixture)
## 0
## Age (day)
## 0
The check above shows zero missing values in every variable, so the data are complete and can be used directly in the subsequent analysis.
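The same check can be written more compactly. The sketch below uses a small made-up data frame, `df`, rather than the concrete data, so that the counts are easy to verify by eye.

```r
# Small illustrative data frame with two missing values (not the concrete data)
df <- data.frame(a = c(1, NA, 3), b = c(4, 5, 6), c = c(NA, 8, 9))

# Count missing values per column
na_per_column <- colSums(is.na(df))
print(na_per_column)

# TRUE if at least one value is missing anywhere in the data frame
has_missing <- anyNA(df)
print(has_missing)
```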
At this stage the feature data are scaled. Scaling puts the variables on a common scale so that each one carries balanced weight in the multivariate analysis. Standardization (z-score) is used: each column is centered on its mean and divided by its standard deviation, so that every variable has mean 0 and standard deviation 1.
# Scale the feature data
data_scaled <- scale(data_fitur)
# Show the first few rows of the scaled data
head(data_scaled)
## Cement (component 1)(kg in a m^3 mixture)
## [1,] 2.4767147
## [2,] 2.4767147
## [3,] 0.4912044
## [4,] 0.4912044
## [5,] -0.7900477
## [6,] -0.1451157
## Blast Furnace Slag (component 2)(kg in a m^3 mixture)
## [1,] -0.8564702
## [2,] -0.8564702
## [3,] 0.7951464
## [4,] 0.7951464
## [5,] 0.6780844
## [6,] 0.4648230
## Fly Ash (component 3)(kg in a m^3 mixture)
## [1,] -0.8467207
## [2,] -0.8467207
## [3,] -0.8467207
## [4,] -0.8467207
## [5,] -0.8467207
## [6,] -0.8467207
## Water (component 4)(kg in a m^3 mixture)
## [1,] -0.9162182
## [2,] -0.9162182
## [3,] 2.1743108
## [4,] 2.1743108
## [5,] 0.4885677
## [6,] 2.1743108
## Superplasticizer (component 5)(kg in a m^3 mixture)
## [1,] -0.6199241
## [2,] -0.6199241
## [3,] -1.0384398
## [4,] -1.0384398
## [5,] -1.0384398
## [6,] -1.0384398
## Coarse Aggregate (component 6)(kg in a m^3 mixture)
## [1,] 0.86274101
## [2,] 1.05565758
## [3,] -0.52625830
## [4,] -0.52625830
## [5,] 0.07049696
## [6,] -0.52625830
## Fine Aggregate (component 7)(kg in a m^3 mixture) Age (day)
## [1,] -1.2170672 -0.2795973
## [2,] -1.2170672 -0.2795973
## [3,] -2.2398245 3.5513405
## [4,] -2.2398245 5.0552210
## [5,] 0.6475939 4.9760694
## [6,] -1.2919031 0.7018826
After scaling, the data are ready for multivariate analysis: the correlation matrix, the variance-covariance matrix, and the eigenvalues and eigenvectors.
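To make the z-score step concrete, the following sketch reproduces `scale()` by hand. It uses three columns of the built-in `mtcars` data as a stand-in for `data_fitur`; the names `x`, `z_manual`, and `z_scale` are illustrative.

```r
# Illustrative numeric data (built-in mtcars), standing in for data_fitur
x <- as.matrix(mtcars[, c("mpg", "hp", "wt")])

# Z-score by hand: subtract each column mean, then divide by each column's sample SD
centered <- sweep(x, 2, colMeans(x), "-")
z_manual <- sweep(centered, 2, apply(x, 2, sd), "/")

# scale() performs the same centering and scaling by default
z_scale <- scale(x)

# Compare the two results, ignoring the attributes that scale() attaches
print(all.equal(z_manual, z_scale, check.attributes = FALSE))

# Column means are numerically zero and column SDs are one
print(round(colMeans(z_scale), 10))
print(apply(z_scale, 2, sd))
```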
The correlation matrix shows the relationships among the feature variables. Correlation values range from -1 to 1: values near 1 indicate a strong positive relationship, values near -1 a strong negative relationship, and values near 0 a weak or absent linear relationship.
# Compute the correlation matrix
cor_matrix <- cor(data_scaled)
# Display the correlation matrix
cor_matrix
## Cement (component 1)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) 1.00000000
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) -0.27519344
## Fly Ash (component 3)(kg in a m^3 mixture) -0.39747544
## Water (component 4)(kg in a m^3 mixture) -0.08154361
## Superplasticizer (component 5)(kg in a m^3 mixture) 0.09277137
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.10935604
## Fine Aggregate (component 7)(kg in a m^3 mixture) -0.22272017
## Age (day) 0.08194726
## Blast Furnace Slag (component 2)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) -0.27519344
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) 1.00000000
## Fly Ash (component 3)(kg in a m^3 mixture) -0.32356947
## Water (component 4)(kg in a m^3 mixture) 0.10728594
## Superplasticizer (component 5)(kg in a m^3 mixture) 0.04337574
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.28399823
## Fine Aggregate (component 7)(kg in a m^3 mixture) -0.28159326
## Age (day) -0.04424580
## Fly Ash (component 3)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) -0.397475440
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) -0.323569468
## Fly Ash (component 3)(kg in a m^3 mixture) 1.000000000
## Water (component 4)(kg in a m^3 mixture) -0.257043997
## Superplasticizer (component 5)(kg in a m^3 mixture) 0.377339559
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.009976788
## Fine Aggregate (component 7)(kg in a m^3 mixture) 0.079076351
## Age (day) -0.154370165
## Water (component 4)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) -0.08154361
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) 0.10728594
## Fly Ash (component 3)(kg in a m^3 mixture) -0.25704400
## Water (component 4)(kg in a m^3 mixture) 1.00000000
## Superplasticizer (component 5)(kg in a m^3 mixture) -0.65746444
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.18231167
## Fine Aggregate (component 7)(kg in a m^3 mixture) -0.45063498
## Age (day) 0.27760443
## Superplasticizer (component 5)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) 0.09277137
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) 0.04337574
## Fly Ash (component 3)(kg in a m^3 mixture) 0.37733956
## Water (component 4)(kg in a m^3 mixture) -0.65746444
## Superplasticizer (component 5)(kg in a m^3 mixture) 1.00000000
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.26630276
## Fine Aggregate (component 7)(kg in a m^3 mixture) 0.22250149
## Age (day) -0.19271652
## Coarse Aggregate (component 6)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) -0.109356039
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) -0.283998230
## Fly Ash (component 3)(kg in a m^3 mixture) -0.009976788
## Water (component 4)(kg in a m^3 mixture) -0.182311668
## Superplasticizer (component 5)(kg in a m^3 mixture) -0.266302755
## Coarse Aggregate (component 6)(kg in a m^3 mixture) 1.000000000
## Fine Aggregate (component 7)(kg in a m^3 mixture) -0.178505755
## Age (day) -0.003015507
## Fine Aggregate (component 7)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) -0.22272017
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) -0.28159326
## Fly Ash (component 3)(kg in a m^3 mixture) 0.07907635
## Water (component 4)(kg in a m^3 mixture) -0.45063498
## Superplasticizer (component 5)(kg in a m^3 mixture) 0.22250149
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.17850575
## Fine Aggregate (component 7)(kg in a m^3 mixture) 1.00000000
## Age (day) -0.15609405
## Age (day)
## Cement (component 1)(kg in a m^3 mixture) 0.081947264
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) -0.044245801
## Fly Ash (component 3)(kg in a m^3 mixture) -0.154370165
## Water (component 4)(kg in a m^3 mixture) 0.277604429
## Superplasticizer (component 5)(kg in a m^3 mixture) -0.192716518
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.003015507
## Fine Aggregate (component 7)(kg in a m^3 mixture) -0.156094049
## Age (day) 1.000000000
library(corrplot)
## corrplot 0.95 loaded
# Build the colour palette (light to dark green)
colour <- colorRampPalette(c(
  "#e9edc9",
  "#ccd5ae",
  "#a3b18a",
  "#6b705c",
  "#3a5a40"
))
# Visualize the correlation matrix
corrplot(cor_matrix,
         method = "color",
         type = "upper",
         col = colour(200),
         tl.col = "black",
         tl.cex = 0.3,
         title = "Correlation Matrix Visualization",
         mar = c(0, 0, 2, 0))
## Warning in ind1:ind2: numerical expression has 2 elements: only the first used
In the correlation plot, the palette runs from light to dark green across the range -1 to 1, so the darkest cells mark strong positive correlations, the lightest cells mark strong negative correlations, and mid-tones mark values near zero. The strongest relationship among the features is negative: Water and Superplasticizer (-0.66), followed by Water and Fine Aggregate (-0.45) and Cement and Fly Ash (-0.40). The largest positive correlation is between Fly Ash and Superplasticizer (0.38). Most of the remaining pairs have absolute correlations below 0.3, meaning that a change in one variable is only weakly associated with a change in the other.
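Reading the strongest pairs off a large printed matrix is error-prone, so one option is to list each pair once and sort by absolute correlation. The sketch below is an assumption-laden illustration: it uses five columns of the built-in `mtcars` data in place of `data_scaled`, and `cm`, `idx`, and `pairs_df` are hypothetical names.

```r
# Illustrative data (built-in mtcars), standing in for data_scaled
cm <- cor(mtcars[, c("mpg", "disp", "hp", "wt", "qsec")])

# Keep each pair once by taking the upper triangle without the diagonal
idx <- which(upper.tri(cm), arr.ind = TRUE)
pairs_df <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)

# Sort by absolute correlation, strongest first
pairs_df <- pairs_df[order(-abs(pairs_df$r)), ]
print(pairs_df, row.names = FALSE)
```

Applied to `cor_matrix`, the same steps would put the strongest positive and negative pairs at the top of the table.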
The variance-covariance matrix shows the spread (variance) of each variable and the joint variation (covariance) of each pair of variables. Variances lie on the diagonal and indicate how widely each variable is dispersed, while covariances lie off the diagonal and indicate the direction of the relationship between two variables. A positive covariance means the two variables tend to move in the same direction; a negative covariance means they tend to move in opposite directions. Because the matrix is computed here from the standardized data, every variance equals 1 and every covariance equals the corresponding correlation, so the output below is identical to the correlation matrix.
# Compute the variance-covariance matrix
cov_matrix <- cov(data_scaled)
# Display the variance-covariance matrix
cov_matrix
## Cement (component 1)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) 1.00000000
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) -0.27519344
## Fly Ash (component 3)(kg in a m^3 mixture) -0.39747544
## Water (component 4)(kg in a m^3 mixture) -0.08154361
## Superplasticizer (component 5)(kg in a m^3 mixture) 0.09277137
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.10935604
## Fine Aggregate (component 7)(kg in a m^3 mixture) -0.22272017
## Age (day) 0.08194726
## Blast Furnace Slag (component 2)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) -0.27519344
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) 1.00000000
## Fly Ash (component 3)(kg in a m^3 mixture) -0.32356947
## Water (component 4)(kg in a m^3 mixture) 0.10728594
## Superplasticizer (component 5)(kg in a m^3 mixture) 0.04337574
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.28399823
## Fine Aggregate (component 7)(kg in a m^3 mixture) -0.28159326
## Age (day) -0.04424580
## Fly Ash (component 3)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) -0.397475440
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) -0.323569468
## Fly Ash (component 3)(kg in a m^3 mixture) 1.000000000
## Water (component 4)(kg in a m^3 mixture) -0.257043997
## Superplasticizer (component 5)(kg in a m^3 mixture) 0.377339559
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.009976788
## Fine Aggregate (component 7)(kg in a m^3 mixture) 0.079076351
## Age (day) -0.154370165
## Water (component 4)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) -0.08154361
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) 0.10728594
## Fly Ash (component 3)(kg in a m^3 mixture) -0.25704400
## Water (component 4)(kg in a m^3 mixture) 1.00000000
## Superplasticizer (component 5)(kg in a m^3 mixture) -0.65746444
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.18231167
## Fine Aggregate (component 7)(kg in a m^3 mixture) -0.45063498
## Age (day) 0.27760443
## Superplasticizer (component 5)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) 0.09277137
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) 0.04337574
## Fly Ash (component 3)(kg in a m^3 mixture) 0.37733956
## Water (component 4)(kg in a m^3 mixture) -0.65746444
## Superplasticizer (component 5)(kg in a m^3 mixture) 1.00000000
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.26630276
## Fine Aggregate (component 7)(kg in a m^3 mixture) 0.22250149
## Age (day) -0.19271652
## Coarse Aggregate (component 6)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) -0.109356039
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) -0.283998230
## Fly Ash (component 3)(kg in a m^3 mixture) -0.009976788
## Water (component 4)(kg in a m^3 mixture) -0.182311668
## Superplasticizer (component 5)(kg in a m^3 mixture) -0.266302755
## Coarse Aggregate (component 6)(kg in a m^3 mixture) 1.000000000
## Fine Aggregate (component 7)(kg in a m^3 mixture) -0.178505755
## Age (day) -0.003015507
## Fine Aggregate (component 7)(kg in a m^3 mixture)
## Cement (component 1)(kg in a m^3 mixture) -0.22272017
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) -0.28159326
## Fly Ash (component 3)(kg in a m^3 mixture) 0.07907635
## Water (component 4)(kg in a m^3 mixture) -0.45063498
## Superplasticizer (component 5)(kg in a m^3 mixture) 0.22250149
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.17850575
## Fine Aggregate (component 7)(kg in a m^3 mixture) 1.00000000
## Age (day) -0.15609405
## Age (day)
## Cement (component 1)(kg in a m^3 mixture) 0.081947264
## Blast Furnace Slag (component 2)(kg in a m^3 mixture) -0.044245801
## Fly Ash (component 3)(kg in a m^3 mixture) -0.154370165
## Water (component 4)(kg in a m^3 mixture) 0.277604429
## Superplasticizer (component 5)(kg in a m^3 mixture) -0.192716518
## Coarse Aggregate (component 6)(kg in a m^3 mixture) -0.003015507
## Fine Aggregate (component 7)(kg in a m^3 mixture) -0.156094049
## Age (day) 1.000000000
library(corrplot)
# Colour palette
color <- colorRampPalette(c("forestgreen", "darkolivegreen3", "lightsteelblue3"))
# Visualize the variance-covariance matrix
corrplot(cov_matrix,
         method = "color",
         type = "upper",
         col = color(200),
         tl.col = "black",
         tl.cex = 0.3)
In the plot of the variance-covariance matrix, the diagonal cells show the variance of each variable. After standardization every variance is 1, so the diagonal is drawn at the upper end of the palette (lightsteelblue3). The off-diagonal cells show the covariances between variables: cells closer to forestgreen indicate more negative covariance, and cells closer to lightsteelblue3 indicate more positive covariance. Since the data are standardized, this plot conveys the same information as the correlation plot, only with a different palette.
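The relationship between the two matrices can be checked numerically: the covariance matrix of standardized data equals the correlation matrix of the original data. The sketch below uses three columns of the built-in `mtcars` data as a stand-in for the concrete features; `cov_raw`, `cov_std`, and `cor_raw` are illustrative names.

```r
# Illustrative data (built-in mtcars), standing in for data_fitur
x <- mtcars[, c("mpg", "hp", "wt")]

cov_raw <- cov(x)         # covariance on the original scale
cov_std <- cov(scale(x))  # covariance after z-score standardization
cor_raw <- cor(x)         # correlation of the original data

# Covariance of standardized data matches the correlation matrix
print(all.equal(cov_std, cor_raw, check.attributes = FALSE))

# Variances differ widely on the raw scale but are all 1 after scaling
print(diag(cov_raw))
print(diag(cov_std))
```

Computing `cov(data_fitur)` on the unscaled features instead would give variances in the original units, which is the case where the covariance and correlation matrices genuinely differ.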
Eigenvalues and eigenvectors describe the magnitude and direction of the variation in the data. They are usually computed from the covariance or correlation matrix to identify the principal components that account for most of the variation.
An eigenvalue gives the amount of variation explained by its component. The larger the eigenvalue, the greater that component's contribution to explaining the variation in the data.
# Compute the eigenvalues of the covariance matrix
eigen_value <- eigen(cov_matrix)$values
# Display the eigenvalues
eigen_value
## [1] 2.27988839 1.41621134 1.34024303 1.01415438 0.95160179 0.79013022 0.17772856
## [8] 0.03004229
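The largest eigenvalue belongs to the first component (PC1) at 2.2799, so PC1 explains the most variation. Because the eigenvalues of a standardized 8-variable matrix sum to 8, this is 2.2799 / 8, or about 28.5% of the total. PC2 to PC4 also have eigenvalues above 1 (1.4162, 1.3402, and 1.0142), and together the first four components explain about 75.6% of the variation. PC5 to PC8 have eigenvalues below 1, so under the common eigenvalue-greater-than-one rule of thumb (the Kaiser criterion) each explains less than a single standardized variable would. PC5 (0.9516) falls only just below that threshold, whereas PC7 (0.1777) and PC8 (0.0300) contribute very little.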
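The share of variance behind each eigenvalue can be computed directly. The sketch below copies the eigenvalues printed above into a vector, so it runs without the data file; `prop_var`, `cum_var`, `summary_table`, and `n_kaiser` are illustrative names.

```r
# Eigenvalues copied from the output above
eigen_value <- c(2.27988839, 1.41621134, 1.34024303, 1.01415438,
                 0.95160179, 0.79013022, 0.17772856, 0.03004229)

# Proportion of total variance explained by each component
prop_var <- eigen_value / sum(eigen_value)

# Cumulative proportion of variance explained
cum_var <- cumsum(prop_var)

summary_table <- data.frame(
  component  = paste0("PC", seq_along(eigen_value)),
  eigenvalue = eigen_value,
  proportion = round(prop_var, 4),
  cumulative = round(cum_var, 4)
)
print(summary_table, row.names = FALSE)

# Number of components with eigenvalue above 1 (Kaiser rule of thumb)
n_kaiser <- sum(eigen_value > 1)
print(n_kaiser)
```

The eigenvalue-greater-than-one cut-off is only a heuristic; a scree plot or a target cumulative proportion are common alternatives.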
# Compute the eigenvectors of the covariance matrix
eigen_vector <- eigen(cov_matrix)$vectors
# Display the eigenvectors
eigen_vector
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.09827295 -0.11181022 0.814495336 -0.05437612 0.14788131 -0.20312941
## [2,] 0.17725317 0.68562442 -0.173400934 -0.36269994 -0.02121136 0.30495397
## [3,] -0.39464178 -0.14379962 -0.407775045 0.22654071 0.54994390 -0.18309239
## [4,] 0.54705427 0.05292130 -0.213084327 0.29601729 0.07046483 -0.36612798
## [5,] -0.50591697 0.28360405 0.234191279 -0.03741495 0.35441099 0.19324298
## [6,] 0.03805569 -0.63034067 -0.172563917 -0.54574680 -0.03310011 0.31451971
## [7,] -0.40190575 -0.01956876 -0.004845761 0.38554226 -0.70110560 0.09236092
## [8,] 0.29152151 -0.12567848 0.100978731 0.52788520 0.22809163 0.74389043
## [,7] [,8]
## [1,] 0.22208449 -0.44612725
## [2,] 0.22837173 -0.43735666
## [3,] 0.35236521 -0.38191098
## [4,] -0.52417861 -0.38874361
## [5,] -0.66463655 -0.05176469
## [6,] -0.22701428 -0.34935768
## [7,] -0.03908382 -0.43337671
## [8,] 0.06925024 -0.01289534
The rows of the eigenvector matrix follow the column order of the feature data. On the first principal component (PC1), Water (variable 4) and Superplasticizer (variable 5) contribute most, with coefficients of 0.5471 and -0.5059. On PC2, Blast Furnace Slag (variable 2) and Coarse Aggregate (variable 6) dominate, with 0.6856 and -0.6303. On PC3, Cement (variable 1) and Fly Ash (variable 3) have the largest coefficients, 0.8145 and -0.4078. The overall sign of an eigenvector is arbitrary, so only the relative signs within a component are meaningful: opposite signs indicate that the two variables move in opposite directions along that component. Each principal component is therefore formed by a different set of variables with unequal weights.
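Two small additions can make eigenvector output easier to read and to trust: labelling the rows and columns, and checking the result against `prcomp()`. The sketch below is illustrative only. It uses five columns of the built-in `mtcars` data in place of the concrete features, and `eig`, `coefs`, `residual`, and `pca` are hypothetical names.

```r
# Illustrative data (built-in mtcars), standing in for data_fitur
x <- mtcars[, c("mpg", "disp", "hp", "wt", "qsec")]
R <- cor(x)
eig <- eigen(R)

# Label rows with variable names and columns with component names
coefs <- eig$vectors
dimnames(coefs) <- list(colnames(x), paste0("PC", seq_len(ncol(x))))
print(round(coefs, 3))

# Defining property: R v = lambda v for every eigenvector v
residual <- R %*% eig$vectors - eig$vectors %*% diag(eig$values)
print(max(abs(residual)))

# prcomp() on standardized data gives the same components up to sign
pca <- prcomp(x, scale. = TRUE)
print(all.equal(pca$sdev^2, eig$values))
print(all.equal(abs(unname(pca$rotation)), abs(eig$vectors)))
```

The absolute values are compared in the last line because `eigen()` and `prcomp()` may return the same component with opposite signs.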