The Concrete Compressive Strength dataset comes from the UCI Machine Learning Repository. It records the quantities of the ingredients in each concrete mix together with the age of the concrete, which are used to predict its compressive strength.
The analysis covers:
- Correlation matrix
- Variance-covariance matrix
- Eigenvalues and eigenvectors
https://archive.ics.uci.edu/dataset/165/concrete+compressive+strength
library(readxl)
library(corrplot)
## corrplot 0.95 loaded
library(knitr)
data <- read_excel("Concrete_Data.xls")
data <- as.data.frame(data)
kable(head(data, 10))
| Cement (component 1)(kg in a m^3 mixture) | Blast Furnace Slag (component 2)(kg in a m^3 mixture) | Fly Ash (component 3)(kg in a m^3 mixture) | Water (component 4)(kg in a m^3 mixture) | Superplasticizer (component 5)(kg in a m^3 mixture) | Coarse Aggregate (component 6)(kg in a m^3 mixture) | Fine Aggregate (component 7)(kg in a m^3 mixture) | Age (day) | Concrete compressive strength(MPa, megapascals) |
|---|---|---|---|---|---|---|---|---|
| 540.0 | 0.0 | 0 | 162 | 2.5 | 1040.0 | 676.0 | 28 | 79.98611 |
| 540.0 | 0.0 | 0 | 162 | 2.5 | 1055.0 | 676.0 | 28 | 61.88737 |
| 332.5 | 142.5 | 0 | 228 | 0.0 | 932.0 | 594.0 | 270 | 40.26954 |
| 332.5 | 142.5 | 0 | 228 | 0.0 | 932.0 | 594.0 | 365 | 41.05278 |
| 198.6 | 132.4 | 0 | 192 | 0.0 | 978.4 | 825.5 | 360 | 44.29608 |
| 266.0 | 114.0 | 0 | 228 | 0.0 | 932.0 | 670.0 | 90 | 47.02985 |
| 380.0 | 95.0 | 0 | 228 | 0.0 | 932.0 | 594.0 | 365 | 43.69830 |
| 380.0 | 95.0 | 0 | 228 | 0.0 | 932.0 | 594.0 | 28 | 36.44777 |
| 266.0 | 114.0 | 0 | 228 | 0.0 | 932.0 | 670.0 | 28 | 45.85429 |
| 475.0 | 0.0 | 0 | 228 | 0.0 | 932.0 | 594.0 | 28 | 39.28979 |
# Rename the columns so the labels are shorter and do not overlap in plots
colnames(data) <- c(
"Cement",
"Slag",
"FlyAsh",
"Water",
"Superplasticizer",
"CoarseAgg",
"FineAgg",
"Age",
"Strength"
)
head(data)
## Cement Slag FlyAsh Water Superplasticizer CoarseAgg FineAgg Age Strength
## 1 540.0 0.0 0 162 2.5 1040.0 676.0 28 79.98611
## 2 540.0 0.0 0 162 2.5 1055.0 676.0 28 61.88737
## 3 332.5 142.5 0 228 0.0 932.0 594.0 270 40.26954
## 4 332.5 142.5 0 228 0.0 932.0 594.0 365 41.05278
## 5 198.6 132.4 0 192 0.0 978.4 825.5 360 44.29608
## 6 266.0 114.0 0 228 0.0 932.0 670.0 90 47.02985
struktur_df <- data.frame(
Nama_Variabel = names(data),
Tipe_Data = sapply(data, class),
row.names = NULL
)
kable(struktur_df)
| Nama_Variabel | Tipe_Data |
|---|---|
| Cement | numeric |
| Slag | numeric |
| FlyAsh | numeric |
| Water | numeric |
| Superplasticizer | numeric |
| CoarseAgg | numeric |
| FineAgg | numeric |
| Age | numeric |
| Strength | numeric |
All variables in the dataset are numeric.
fitur <- data[, -9] # Drop the Strength column (the response), keeping the 8 features
head(fitur)
## Cement Slag FlyAsh Water Superplasticizer CoarseAgg FineAgg Age
## 1 540.0 0.0 0 162 2.5 1040.0 676.0 28
## 2 540.0 0.0 0 162 2.5 1055.0 676.0 28
## 3 332.5 142.5 0 228 0.0 932.0 594.0 270
## 4 332.5 142.5 0 228 0.0 932.0 594.0 365
## 5 198.6 132.4 0 192 0.0 978.4 825.5 360
## 6 266.0 114.0 0 228 0.0 932.0 670.0 90
cor_matrix <- cor(fitur)
kable(round(cor_matrix, 2))
| | Cement | Slag | FlyAsh | Water | Superplasticizer | CoarseAgg | FineAgg | Age |
|---|---|---|---|---|---|---|---|---|
| Cement | 1.00 | -0.28 | -0.40 | -0.08 | 0.09 | -0.11 | -0.22 | 0.08 |
| Slag | -0.28 | 1.00 | -0.32 | 0.11 | 0.04 | -0.28 | -0.28 | -0.04 |
| FlyAsh | -0.40 | -0.32 | 1.00 | -0.26 | 0.38 | -0.01 | 0.08 | -0.15 |
| Water | -0.08 | 0.11 | -0.26 | 1.00 | -0.66 | -0.18 | -0.45 | 0.28 |
| Superplasticizer | 0.09 | 0.04 | 0.38 | -0.66 | 1.00 | -0.27 | 0.22 | -0.19 |
| CoarseAgg | -0.11 | -0.28 | -0.01 | -0.18 | -0.27 | 1.00 | -0.18 | 0.00 |
| FineAgg | -0.22 | -0.28 | 0.08 | -0.45 | 0.22 | -0.18 | 1.00 | -0.16 |
| Age | 0.08 | -0.04 | -0.15 | 0.28 | -0.19 | 0.00 | -0.16 | 1.00 |
Based on the correlation matrix, most pairs of variables show weak to moderate linear relationships. The strongest is between Water and Superplasticizer (-0.66), followed by Water and FineAgg (-0.45) and Cement and FlyAsh (-0.40), which points to a meaningful linear relationship between these mix components. No correlation comes close to ±1, so there is no indication of strong pairwise multicollinearity, and each variable still carries relatively distinct, non-redundant information.
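The ranking of pairs described above can be read off the matrix programmatically. The sketch below is a minimal illustration, not part of the original analysis: the helper `rank_pairs` and the object `cor_demo` are hypothetical names, and `cor_demo` holds only four of the eight variables, with values copied from the rounded table. With the data loaded, the same helper could be applied to `cor_matrix` directly.

```r
# Subset of the correlation table above (rounded values, four variables only).
demo_vars <- c("Cement", "FlyAsh", "Water", "Superplasticizer")
cor_demo <- matrix(
  c( 1.00, -0.40, -0.08,  0.09,
    -0.40,  1.00, -0.26,  0.38,
    -0.08, -0.26,  1.00, -0.66,
     0.09,  0.38, -0.66,  1.00),
  nrow = 4, byrow = TRUE, dimnames = list(demo_vars, demo_vars)
)

# Hypothetical helper: list each variable pair once (upper triangle)
# and sort by absolute correlation, strongest first.
rank_pairs <- function(m) {
  idx <- which(upper.tri(m), arr.ind = TRUE)
  pairs <- data.frame(
    var1 = rownames(m)[idx[, "row"]],
    var2 = colnames(m)[idx[, "col"]],
    r    = m[idx],
    stringsAsFactors = FALSE
  )
  pairs[order(-abs(pairs$r)), ]
}

top_pairs <- rank_pairs(cor_demo)
print(top_pairs, row.names = FALSE)
```

Sorting by absolute value keeps strong negative relationships, such as Water and Superplasticizer, at the top of the list.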
corrplot(cor_matrix,
method = "color",
type = "upper",
addCoef.col = "black",
tl.col = "black",
tl.srt = 45,
number.cex = 0.7)
cov_matrix <- cov(fitur)
kable(round(cov_matrix, 2))
| | Cement | Slag | FlyAsh | Water | Superplasticizer | CoarseAgg | FineAgg | Age |
|---|---|---|---|---|---|---|---|---|
| Cement | 10921.74 | -2481.36 | -2658.35 | -181.99 | 57.91 | -888.61 | -1866.15 | 540.99 |
| Slag | -2481.36 | 7444.08 | -1786.61 | 197.68 | 22.36 | -1905.21 | -1947.91 | -241.15 |
| FlyAsh | -2658.35 | -1786.61 | 4095.55 | -351.30 | 144.25 | -49.64 | 405.74 | -624.06 |
| Water | -181.99 | 197.68 | -351.30 | 456.06 | -83.87 | -302.72 | -771.57 | 374.50 |
| Superplasticizer | 57.91 | 22.36 | 144.25 | -83.87 | 35.68 | -123.69 | 106.56 | -72.72 |
| CoarseAgg | -888.61 | -1905.21 | -49.64 | -302.72 | -123.69 | 6045.66 | -1112.80 | -14.81 |
| FineAgg | -1866.15 | -1947.91 | 405.74 | -771.57 | 106.56 | -1112.80 | 6428.10 | -790.57 |
| Age | 540.99 | -241.15 | -624.06 | 374.50 | -72.72 | -14.81 | -790.57 | 3990.44 |
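The diagonal entries are the variances of the individual variables, and the off-diagonal entries are the covariances between pairs of variables. A positive covariance means two variables tend to move in the same direction, and a negative covariance means they tend to move in opposite directions. Because covariance depends on the scale of the variables, its magnitude is not directly comparable across pairs; the correlation matrix serves that purpose.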
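The link between the covariance and correlation matrices can be checked on a single pair. The sketch below is illustrative only: `cov_ws`, `sd_ws`, `r_manual` and `cor_ws` are hypothetical names, and the three numbers are the rounded Water and Superplasticizer entries from the covariance table, so the result matches the correlation table only up to rounding.

```r
# Variances and covariance of Water and Superplasticizer, copied from the
# covariance table above (rounded to two decimals).
ws_vars <- c("Water", "Superplasticizer")
cov_ws <- matrix(c(456.06, -83.87,
                   -83.87,  35.68),
                 nrow = 2, byrow = TRUE, dimnames = list(ws_vars, ws_vars))

# Manual standardization: r = cov(x, y) / (sd(x) * sd(y))
sd_ws <- sqrt(diag(cov_ws))
r_manual <- cov_ws["Water", "Superplasticizer"] / prod(sd_ws)

# The same conversion with the built-in stats::cov2cor()
cor_ws <- cov2cor(cov_ws)

print(round(r_manual, 2))
print(round(cor_ws, 2))
```

Dividing by the two standard deviations removes the units, which is why the correlation is comparable across pairs while the raw covariance is not.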
eigen_result <- eigen(cov_matrix)
eigen_result$values
## [1] 12840.97152 9809.73610 7284.34193 4243.67465 3979.16746 1176.42112
## [7] 71.66399 11.33366
eigen_result$vectors
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.905642491 -0.032638607 0.15480715 -0.008242651 0.15137736 -0.3065154
## [2,] -0.262539831 -0.786053324 0.07291600 -0.199058277 0.10670802 -0.4534540
## [3,] -0.238615941 0.303014979 -0.05149092 0.687223886 0.17758357 -0.5123562
## [4,] 0.005566835 -0.076263559 -0.04145565 0.075552203 -0.09842420 0.4824817
## [5,] -0.001306160 0.005093971 0.02406543 0.020513644 0.02293166 -0.1044518
## [6,] -0.009104736 0.274574303 -0.76069849 -0.480046914 0.07636126 -0.2707187
## [7,] -0.210131322 0.450692923 0.61077597 -0.485145472 -0.13283562 -0.2571290
## [8,] 0.098367597 -0.069853972 -0.11857274 0.126850611 -0.94893247 -0.2341287
## [,7] [,8]
## [1,] -0.1943806101 -0.007910220
## [2,] -0.2261845864 -0.009246849
## [3,] -0.2867754410 0.005607725
## [4,] -0.8246302637 -0.253446680
## [5,] 0.2332324978 -0.965991173
## [6,] -0.1859495571 -0.041496031
## [7,] -0.2445950510 -0.026831816
## [8,] 0.0003334611 0.002108410
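Each eigenvalue is the variance of the data along its component, so its share of the sum of all eigenvalues is the proportion of total variance that component explains. Each eigenvector gives the direction of that component, that is, the weight of every variable in it. The rows of the eigenvector matrix follow the column order of `fitur`; the first eigenvector, for example, is dominated by Cement (loading 0.91).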
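To make the share of variance per component explicit, the eigenvalues can be turned into proportions and cumulative proportions. This is a minimal sketch that retypes the eigenvalues printed above instead of recomputing them; the object names and the `PC1`, `PC2`, ... labels are assumptions added for illustration. With the data loaded, `eigen_result$values` could be used in place of the literal vector.

```r
# Eigenvalues as printed above for eigen(cov_matrix)
eig_values <- c(12840.97152, 9809.73610, 7284.34193, 4243.67465,
                3979.16746, 1176.42112, 71.66399, 11.33366)

# Share of total variance per component, and the running total
prop_var <- eig_values / sum(eig_values)
cum_var  <- cumsum(prop_var)

var_table <- data.frame(
  Component  = paste0("PC", seq_along(eig_values)),  # illustrative labels
  Eigenvalue = round(eig_values, 2),
  Proportion = round(prop_var, 3),
  Cumulative = round(cum_var, 3)
)
print(var_table, row.names = FALSE)
```

The `Cumulative` column is the usual guide for deciding how many components to keep.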
Based on this multivariate analysis, the relationships among the feature variables are generally weak to moderate. The eigendecomposition of the covariance matrix shows that a few components carry most of the variance: the first three eigenvalues account for about 76% of the total and the first five for about 97%. This analysis can serve as a basis for dimensionality reduction methods such as Principal Component Analysis (PCA).
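As a pointer toward that next step, the sketch below shows how the eigendecomposition relates to `prcomp()`. It is a hedged illustration, not the author's analysis: it uses four columns of R's built-in `mtcars` dataset as a stand-in so that it runs without the concrete data file, and all object names are assumptions. Covariance-based PCA reproduces the eigenvalues of `cov()`, while setting `scale. = TRUE` reproduces those of `cor()`, which may be worth considering here because the feature variances differ widely (35.68 for Superplasticizer against 10921.74 for Cement).

```r
# Stand-in data: four numeric columns from the built-in mtcars dataset
# (an assumption for illustration; this is not the concrete data).
X <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Covariance-based PCA: component variances equal the eigenvalues of cov(X)
pca_cov <- prcomp(X, center = TRUE, scale. = FALSE)
eig_cov <- eigen(cov(X))$values

# Correlation-based PCA: standardizing first matches the eigenvalues of cor(X)
pca_cor <- prcomp(X, center = TRUE, scale. = TRUE)
eig_cor <- eigen(cor(X))$values

print(all.equal(pca_cov$sdev^2, eig_cov))
print(all.equal(pca_cor$sdev^2, eig_cor))

# Cumulative share of variance under the correlation-based version
print(round(cumsum(eig_cor) / sum(eig_cor), 3))
```

With the concrete data loaded, replacing `X` by `fitur` would give the corresponding components for the eight mix features.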