Data Preparation
# Load library
library(readxl)
library(knitr)
# Load dataset
Raisin_Dataset <- read_excel("Raisin_Dataset.xlsx")
The dataset is loaded from an Excel file into R for use in the
subsequent statistical analysis.
Numeric Data Selection
numeric_data <- Raisin_Dataset[sapply(Raisin_Dataset, is.numeric)]
str(numeric_data)
## tibble [900 × 7] (S3: tbl_df/tbl/data.frame)
## $ Area : num [1:900] 87524 75166 90856 45928 79408 ...
## $ MajorAxisLength: num [1:900] 442 407 442 287 352 ...
## $ MinorAxisLength: num [1:900] 253 243 266 209 291 ...
## $ Eccentricity : num [1:900] 0.82 0.802 0.798 0.685 0.564 ...
## $ ConvexArea : num [1:900] 90546 78789 93717 47336 81463 ...
## $ Extent : num [1:900] 0.759 0.684 0.638 0.7 0.793 ...
## $ Perimeter : num [1:900] 1184 1122 1209 844 1073 ...
Interpretation of Results
The output shows that numeric_data is a tibble (a data frame subclass)
with 900 observations and 7 variables, all of them numeric. Every column
is reported as type num, which indicates that the non-numeric variables
were excluded successfully. All remaining variables are therefore ready
for the correlation, covariance, and eigen analyses.
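The column filter above can be illustrated on a small toy data frame. This is a minimal sketch: the Area and Extent values are the first three observations printed by str(), while the Label column and its values are hypothetical stand-ins for a non-numeric variable. vapply() is used here instead of sapply() only because it guarantees a logical result.

```r
# Toy data: two numeric columns plus one hypothetical character column
toy <- data.frame(
  Area   = c(87524, 75166, 90856),
  Extent = c(0.759, 0.684, 0.638),
  Label  = c("a", "b", "a"),        # hypothetical non-numeric variable
  stringsAsFactors = FALSE
)

# One TRUE/FALSE flag per column: TRUE where the column is numeric
is_num <- vapply(toy, is.numeric, logical(1))

# Logical indexing keeps only the columns flagged TRUE
toy_numeric <- toy[is_num]

names(toy_numeric)
```

The same indexing works on a tibble, which is why the original call returns a tibble rather than a plain data frame.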
Correlation Matrix
cor_matrix <- cor(numeric_data)
cor_matrix
## Area MajorAxisLength MinorAxisLength Eccentricity
## Area 1.00000000 0.9327744 0.9066499 0.3361066
## MajorAxisLength 0.93277443 1.0000000 0.7280302 0.5836084
## MinorAxisLength 0.90664987 0.7280302 1.0000000 -0.0276835
## Eccentricity 0.33610660 0.5836084 -0.0276835 1.0000000
## ConvexArea 0.99591967 0.9450309 0.8956513 0.3482103
## Extent -0.01349934 -0.2038656 0.1453215 -0.3610615
## Perimeter 0.96135172 0.9779780 0.8274170 0.4478452
## ConvexArea Extent Perimeter
## Area 0.99591967 -0.01349934 0.9613517
## MajorAxisLength 0.94503093 -0.20386556 0.9779780
## MinorAxisLength 0.89565132 0.14532153 0.8274170
## Eccentricity 0.34821030 -0.36106149 0.4478452
## ConvexArea 1.00000000 -0.05480247 0.9766122
## Extent -0.05480247 1.00000000 -0.1734489
## Perimeter 0.97661223 -0.17344893 1.0000000
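Interpretation of Results
The size-related variables are strongly and positively correlated: Area
and ConvexArea (0.996), MajorAxisLength and Perimeter (0.978), ConvexArea
and Perimeter (0.977), and Area and Perimeter (0.961). Extent is only
weakly correlated with every other variable (absolute values of at most
0.361), and MinorAxisLength and Eccentricity are nearly uncorrelated
(-0.028).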
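Reading a 7 × 7 matrix by eye is error-prone, so one option is to list each variable pair once and sort by absolute correlation. The sketch below does this on a 3 × 3 sub-matrix whose values are copied from the printed cor_matrix; the names cor_sub and pair_table are illustrative, and the same steps should apply to the full matrix.

```r
# Sub-matrix copied from the printed correlation output
vars <- c("Area", "ConvexArea", "Extent")
cor_sub <- matrix(
  c( 1.00000000,  0.99591967, -0.01349934,
     0.99591967,  1.00000000, -0.05480247,
    -0.01349934, -0.05480247,  1.00000000),
  nrow = 3, byrow = TRUE, dimnames = list(vars, vars)
)

# The matrix is symmetric, so the upper triangle holds each pair exactly once
idx <- which(upper.tri(cor_sub), arr.ind = TRUE)

pair_table <- data.frame(
  var1 = rownames(cor_sub)[idx[, "row"]],
  var2 = colnames(cor_sub)[idx[, "col"]],
  r    = cor_sub[idx]
)

# Strongest relationships first, regardless of sign
pair_table <- pair_table[order(-abs(pair_table$r)), ]
pair_table
```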
Variance-Covariance Matrix
cov_matrix <- cov(numeric_data)
cov_matrix
## Area MajorAxisLength MinorAxisLength Eccentricity
## Area 1.521165e+09 4.221378e+06 1.767671e+06 1.183972e+03
## MajorAxisLength 4.221378e+06 1.346415e+04 4.222916e+03 6.116279e+00
## MinorAxisLength 1.767671e+06 4.222916e+03 2.498890e+03 -1.249887e-01
## Eccentricity 1.183972e+03 6.116279e+00 -1.249887e-01 8.157415e-03
## ConvexArea 1.583600e+09 4.470629e+06 1.825348e+06 1.282186e+03
## Extent -2.815116e+01 -1.264820e+00 3.884178e-01 -1.743625e-03
## Perimeter 1.026472e+07 3.106672e+04 1.132335e+04 1.107340e+01
## ConvexArea Extent Perimeter
## Area 1.583600e+09 -2.815116e+01 1.026472e+07
## MajorAxisLength 4.470629e+06 -1.264820e+00 3.106672e+04
## MinorAxisLength 1.825348e+06 3.884178e-01 1.132335e+04
## Eccentricity 1.282186e+03 -1.743625e-03 1.107340e+01
## ConvexArea 1.662135e+09 -1.194617e+02 1.090014e+07
## Extent -1.194617e+02 2.858848e-03 -2.538891e+00
## Perimeter 1.090014e+07 -2.538891e+00 7.494690e+04
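Interpretation of Results
The diagonal holds the variance of each variable and the off-diagonal
entries hold the covariances. Because covariance depends on the
measurement scale, Area and ConvexArea (variances of about 1.52e+09 and
1.66e+09) dominate the matrix, while Eccentricity and Extent have
variances below 0.01. The sign of each covariance matches the sign of
the corresponding correlation.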
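The link between the covariance and correlation matrices can be checked directly: each correlation is the covariance divided by the product of the two standard deviations. The sketch below uses a 2 × 2 block copied from the printed cov_matrix (cov_sub is an illustrative name) and compares the manual result with base R's cov2cor(). Because the printed values are rounded to seven significant digits, the result should agree with cor_matrix only to about that precision.

```r
# 2 x 2 block copied from the printed covariance output
vars <- c("Area", "MajorAxisLength")
cov_sub <- matrix(
  c(1.521165e+09, 4.221378e+06,
    4.221378e+06, 1.346415e+04),
  nrow = 2, byrow = TRUE, dimnames = list(vars, vars)
)

# Standard deviations are the square roots of the diagonal (the variances)
sds <- sqrt(diag(cov_sub))

# Manual standardisation: r_ij = cov_ij / (sd_i * sd_j)
r_manual <- cov_sub["Area", "MajorAxisLength"] /
  (sds["Area"] * sds["MajorAxisLength"])

# cov2cor() applies the same scaling to every entry of the matrix
cor_sub <- cov2cor(cov_sub)

unname(r_manual)
cor_sub["Area", "MajorAxisLength"]
```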
Eigenvalues and Eigenvectors
Eigen <- eigen(cov_matrix)
Eigen$values
## [1] 3.176903e+09 6.484070e+06 3.411647e+03 5.871721e+02 4.592383e+01
## [6] 1.781033e-03 1.565458e-03
Eigen$vectors
## [,1] [,2] [,3] [,4] [,5]
## [1,] -6.911978e-01 7.225900e-01 0.0099167946 2.336904e-03 0.002339507
## [2,] -1.935429e-03 -6.101735e-03 0.5800095074 -5.162209e-01 -0.630129625
## [3,] -7.998219e-04 2.427483e-03 -0.2137523625 6.499835e-01 -0.729259089
## [4,] -5.492743e-07 -4.737215e-06 0.0010540138 -1.661067e-03 0.001299584
## [5,] -7.226472e-01 -6.910123e-01 -0.0159270484 -5.209011e-03 -0.001481976
## [6,] 3.330314e-08 9.602308e-06 -0.0003483057 7.835626e-05 -0.001134514
## [7,] -4.712872e-03 -1.795989e-02 0.7858407300 5.576756e-01 0.266659316
## [,6] [,7]
## [1,] 1.174284e-06 6.537297e-06
## [2,] 4.904183e-04 6.364268e-04
## [3,] 8.891598e-04 -2.278570e-03
## [4,] -2.806809e-02 -9.996032e-01
## [5,] -1.815070e-08 -6.392384e-06
## [6,] -9.996054e-01 2.806617e-02
## [7,] -5.255305e-04 2.634392e-04
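Interpretation of Results
The first eigenvalue (3.176903e+09) accounts for about 99.8% of the
total variance (the sum of all eigenvalues), and the second
(6.484070e+06) for about 0.2%. The first eigenvector loads almost
entirely on the first and fifth variables, Area (-0.691) and ConvexArea
(-0.723), which reflects their very large variances in the covariance
matrix rather than any standardized measure of importance.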
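Since the eigenvalues of a covariance matrix sum to the total variance, dividing each eigenvalue by that sum gives the share of variance along its eigenvector. The sketch below uses the eigenvalues exactly as printed by Eigen$values; the names eig_values, prop_var, and cum_var are illustrative.

```r
# Eigenvalues copied from the printed Eigen$values output
eig_values <- c(3.176903e+09, 6.484070e+06, 3.411647e+03, 5.871721e+02,
                4.592383e+01, 1.781033e-03, 1.565458e-03)

# Share of total variance carried by each eigenvector
prop_var <- eig_values / sum(eig_values)

# Running total, useful for deciding how many components to keep
cum_var <- cumsum(prop_var)

round(data.frame(eigenvalue = eig_values,
                 proportion = prop_var,
                 cumulative = cum_var), 6)
```

If the variables are meant to contribute on a comparable scale, running eigen() on cor_matrix instead of cov_matrix is a common alternative; that is a design choice and not something the original analysis does.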