1 MATRIX OPERATIONS

1.1 Addition

Matrix addition combines two matrices by summing their corresponding entries, that is, the entries in the same position. It is defined only when both matrices have the same order (dimensions).

Example: \[ \boldsymbol{A} = \begin{bmatrix} p & q \\ r & s \end{bmatrix} + \begin{bmatrix} v & w \\ x & y \end{bmatrix} = \begin{bmatrix} p+v & q+w \\ r+x & s+y \end{bmatrix} \]

R syntax for matrix addition:

# input matrices
X <- matrix(c(1,2,3,
              4,5,6,
              7,8,9), nrow=3, byrow=TRUE);X
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
Y <- matrix(c(10,11,12,
              13,14,15,
              16,17,18), nrow=3, byrow=TRUE);Y
##      [,1] [,2] [,3]
## [1,]   10   11   12
## [2,]   13   14   15
## [3,]   16   17   18
# addition
X + Y
##      [,1] [,2] [,3]
## [1,]   11   13   15
## [2,]   17   19   21
## [3,]   23   25   27
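
The same-order requirement can be checked before adding. The following is a minimal sketch, assuming a hypothetical 2 × 2 matrix `Z` that is not part of the original example; it shows that R refuses to add matrices of different orders.

```r
# X is the 3 x 3 matrix from the example; Z is a hypothetical 2 x 2 matrix
X <- matrix(c(1, 2, 3,
              4, 5, 6,
              7, 8, 9), nrow = 3, byrow = TRUE)
Z <- matrix(c(1, 2,
              3, 4), nrow = 2, byrow = TRUE)

# addition is defined only when the dimensions agree
same_order <- identical(dim(X), dim(Z))
same_order

# tryCatch() captures the error message instead of stopping the script
msg <- tryCatch(X + Z, error = function(e) conditionMessage(e))
msg
```

Comparing `dim()` first is a cheap guard when matrices come from user input or from files.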

1.2 Subtraction

Matrix subtraction works like addition: entries in the same position are subtracted from one another. The requirement is also the same, namely that both matrices have the same order.

Example: \[ \boldsymbol{A} = \begin{bmatrix} p & q \\ r & s \end{bmatrix} - \begin{bmatrix} v & w \\ x & y \end{bmatrix} = \begin{bmatrix} p-v & q-w \\ r-x & s-y \end{bmatrix} \]

R syntax for matrix subtraction:

# input matrices
X <- matrix(c(1,2,3,
              4,5,6,
              7,8,9), nrow=3, byrow=TRUE);X
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
Y <- matrix(c(10,11,12,
              13,14,15,
              16,17,18), nrow=3, byrow=TRUE);Y
##      [,1] [,2] [,3]
## [1,]   10   11   12
## [2,]   13   14   15
## [3,]   16   17   18
# subtraction
X - Y
##      [,1] [,2] [,3]
## [1,]   -9   -9   -9
## [2,]   -9   -9   -9
## [3,]   -9   -9   -9

1.3 Multiplication

  1. Multiplying a matrix by a scalar

Every entry of the matrix is multiplied by the scalar k.

Example: \[ k \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} ka & kb \\ kc & kd \end{bmatrix} \]

R syntax for multiplying a matrix by a scalar:

# input scalar and matrix
k <- 2
Y <- matrix(c(10,11,12,
              13,14,15,
              16,17,18), nrow=3, byrow=TRUE);Y
##      [,1] [,2] [,3]
## [1,]   10   11   12
## [2,]   13   14   15
## [3,]   16   17   18
# scalar multiplication
k*Y
##      [,1] [,2] [,3]
## [1,]   20   22   24
## [2,]   26   28   30
## [3,]   32   34   36
  2. Multiplying two matrices

The product AB is formed by multiplying each row of A with each column of B, entry by entry, and summing the products. It is defined only when the number of columns of A equals the number of rows of B.

Example:

Suppose we have the matrices:

\[ A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1q} \\ a_{21} & a_{22} & \dots & a_{2q} \\ \vdots & \vdots & \ddots & \vdots \\ a_{p1} & a_{p2} & \dots & a_{pq} \end{bmatrix}_{p \times q}, \quad B = \begin{bmatrix} b_{11} & b_{12} & \dots & b_{1n} \\ b_{21} & b_{22} & \dots & b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ b_{q1} & b_{q2} & \dots & b_{qn} \end{bmatrix}_{q \times n} \]

Then the product \(AB\) is:

\[ AB = \begin{bmatrix} \sum_{k=1}^{q} a_{1k} b_{k1} & \sum_{k=1}^{q} a_{1k} b_{k2} & \dots & \sum_{k=1}^{q} a_{1k} b_{kn} \\ \sum_{k=1}^{q} a_{2k} b_{k1} & \sum_{k=1}^{q} a_{2k} b_{k2} & \dots & \sum_{k=1}^{q} a_{2k} b_{kn} \\ \vdots & \vdots & \ddots & \vdots \\ \sum_{k=1}^{q} a_{pk} b_{k1} & \sum_{k=1}^{q} a_{pk} b_{k2} & \dots & \sum_{k=1}^{q} a_{pk} b_{kn} \end{bmatrix}_{p \times n} \]
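
The summation formula corresponds to the `%*%` operator in R. The following is a small sketch, assuming two illustrative matrices `A` (2 × 3) and `B` (3 × 2) that do not appear in the original text.

```r
# A is 2 x 3 and B is 3 x 2, so AB is defined (columns of A = rows of B)
A <- matrix(c(1, 2, 3,
              4, 5, 6), nrow = 2, byrow = TRUE)
B <- matrix(c( 7,  8,
               9, 10,
              11, 12), nrow = 3, byrow = TRUE)

# matrix product: entry (i, j) is the sum over k of A[i, k] * B[k, j]
AB <- A %*% B
AB

# entry (1, 1) computed directly from the summation formula
entry_11 <- sum(A[1, ] * B[, 1])
entry_11

# the result has as many rows as A and as many columns as B
dim(AB)
```

Note that `A * B` is the element-wise product in R, not the matrix product, and it requires both matrices to have the same order.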

1.4 Transpose

The transpose of a matrix is a new matrix obtained by turning its rows into columns, or equivalently its columns into rows.

Example: \[ \boldsymbol{A} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad \boldsymbol{A}' = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \]

R syntax for the transpose:

# input matrices
X <- matrix(c(1,2,3,
              4,5,6,
              7,8,9), nrow=3, byrow=TRUE);X
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
Y <- matrix(c(10,11,12,
              13,14,15,
              16,17,18), nrow=3, byrow=TRUE);Y
##      [,1] [,2] [,3]
## [1,]   10   11   12
## [2,]   13   14   15
## [3,]   16   17   18
# transpose of each matrix
transX = t(X); transX
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
transY = t(Y); transY
##      [,1] [,2] [,3]
## [1,]   10   13   16
## [2,]   11   14   17
## [3,]   12   15   18

1.5 Inverse

The inverse of a square matrix A, written A⁻¹, is the matrix that gives the identity matrix I when multiplied with A. It exists only when det(A) ≠ 0.

Example: \[ \boldsymbol{A} = \begin{bmatrix} x & w \\ y & z \end{bmatrix} \]

\[ \boldsymbol{A}^{-1} = \frac{1}{\det(A)} \, \mathrm{adj}(A) = \frac{1}{xz - wy} \begin{bmatrix} z & -w \\ -y & x \end{bmatrix} \]

R syntax for the matrix inverse:

# input matrix
X <- matrix(c(4, 2,
              3, 1), nrow = 2, byrow = TRUE)

# matrix inverse
inv_X = solve(X); inv_X
##      [,1] [,2]
## [1,] -0.5    1
## [2,]  1.5   -2
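
The adjugate formula can be checked against `solve()`. The following is a minimal sketch for the 2 × 2 case only; the names `adj_X` and `inv_manual` are introduced here for illustration.

```r
X <- matrix(c(4, 2,
              3, 1), nrow = 2, byrow = TRUE)

# adjugate of a 2 x 2 matrix: swap the diagonal, negate the off-diagonal
adj_X <- matrix(c( X[2, 2], -X[1, 2],
                  -X[2, 1],  X[1, 1]), nrow = 2, byrow = TRUE)

# inverse from the formula A^{-1} = adj(A) / det(A)
inv_manual <- adj_X / det(X)

# compare with solve() and confirm that X times its inverse is the identity
all.equal(inv_manual, solve(X))
round(X %*% inv_manual, 10)
```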

1.6 Determinant

The determinant is a single scalar value computed from the entries of a square matrix. It is written det(A) or |A|.

Example: \[ \boldsymbol{A} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

\[ \det(A) = ad - bc \]

R syntax for the determinant:

# input matrices
X <- matrix(c(1,2,3,
              4,5,6,
              7,8,9), nrow=3, byrow=TRUE);X
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
Y <- matrix(c(10,11,12,
              13,14,15,
              16,17,18), nrow=3, byrow=TRUE);Y
##      [,1] [,2] [,3]
## [1,]   10   11   12
## [2,]   13   14   15
## [3,]   16   17   18
# determinant of each matrix
det(X)
## [1] 6.661338e-16
det(Y)
## [1] 0
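
Both X and Y are singular (their rows are linearly dependent), so the exact determinant of each is 0; the value 6.661338e-16 reported for X is floating-point rounding error. The following sketch, which assumes a hypothetical 2 × 2 matrix `M`, checks the formula ad − bc by hand and shows one way to recognise a numerically zero determinant.

```r
# 2 x 2 case: det = ad - bc (M is a hypothetical example matrix)
M <- matrix(c(4, 2,
              3, 1), nrow = 2, byrow = TRUE)
det_manual <- M[1, 1] * M[2, 2] - M[1, 2] * M[2, 1]
det_manual
all.equal(det_manual, det(M))

# X from the example: the rows are linearly dependent
X <- matrix(c(1, 2, 3,
              4, 5, 6,
              7, 8, 9), nrow = 3, byrow = TRUE)

# numerical rank below 3 means X is singular
rank_X <- qr(X)$rank
rank_X

# all.equal() compares with a tolerance, so rounding error is ignored
is_zero <- isTRUE(all.equal(det(X), 0))
is_zero
```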

1.7 Accessing matrix elements in R:

A <- matrix(21:40, nrow=4, ncol=5)

A
##      [,1] [,2] [,3] [,4] [,5]
## [1,]   21   25   29   33   37
## [2,]   22   26   30   34   38
## [3,]   23   27   31   35   39
## [4,]   24   28   32   36   40
A[,2]         # column 2
## [1] 25 26 27 28
A[3,]         # row 3
## [1] 23 27 31 35 39
A[3,2]        # cell (3, 2)
## [1] 27
A[c(1,3),2]   # cells (1, 2) and (3, 2)
## [1] 25 27
A[,1:3]       # columns 1, 2, 3
##      [,1] [,2] [,3]
## [1,]   21   25   29
## [2,]   22   26   30
## [3,]   23   27   31
## [4,]   24   28   32
A[2:4,]       # rows 2, 3, 4
##      [,1] [,2] [,3] [,4] [,5]
## [1,]   22   26   30   34   38
## [2,]   23   27   31   35   39
## [3,]   24   28   32   36   40

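2 EIGENVALUES AND EIGENVECTORS

2.1 Eigenvalues

An eigenvalue λ of a square matrix A is a scalar for which Av = λv holds for some nonzero vector v (an eigenvector). It tells how much that eigenvector is stretched or shrunk when it is multiplied by A. The eigenvalues are the roots of the characteristic equation det(A − λI) = 0.

Example: \[ \boldsymbol A = \begin{bmatrix} 2 & 0 \\ 1 & 3 \end{bmatrix} \]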

\[ \det(A - \lambda I) = \det \begin{bmatrix} 2-\lambda & 0 \\ 1 & 3-\lambda \end{bmatrix} \]

\[ \det(A - \lambda I) = (2-\lambda)(3-\lambda) - (1)(0) \]

\[ \det(A - \lambda I) = \lambda^2 -5\lambda +6 \]

\[ \det(A - \lambda I) = (\lambda - 2)(\lambda - 3) \]

Setting \(\det(A - \lambda I) = 0\) gives: \[ \lambda_1 = 2, \quad \lambda_2 = 3 \]

2.2 Eigenvectors

An eigenvector is a nonzero vector that stays on the same line through the origin when multiplied by A: it is only scaled by its eigenvalue.

Example: \[ \text{Matrix: } \boldsymbol A = \begin{bmatrix} 2 & 0 \\ 1 & 3 \end{bmatrix} \]

\[ \text{Eigenvalues: } \lambda_1 = 2, \quad \lambda_2 = 3 \]

\[ \text{For } \lambda_1 = 2: (A - 2I)v = 0 \quad \Rightarrow \quad \begin{bmatrix} 0 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

\[ y = -x \quad \Rightarrow \quad v_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]

\[ \text{For } \lambda_2 = 3: (A - 3I)v = 0 \quad \Rightarrow \quad \begin{bmatrix} -1 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

\[ x = 0 \quad \Rightarrow \quad v_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

R syntax for eigenvalues and eigenvectors (X and Y are the 3 × 3 matrices defined in Section 1.6):

eigX = eigen(X); eigX
## eigen() decomposition
## $values
## [1]  1.611684e+01 -1.116844e+00 -1.303678e-15
## 
## $vectors
##            [,1]        [,2]       [,3]
## [1,] -0.2319707 -0.78583024  0.4082483
## [2,] -0.5253221 -0.08675134 -0.8164966
## [3,] -0.8186735  0.61232756  0.4082483
eigY = eigen(Y); eigY
## eigen() decomposition
## $values
## [1]  4.242429e+01 -4.242853e-01 -8.760878e-16
## 
## $vectors
##            [,1]        [,2]       [,3]
## [1,] -0.4481957 -0.73921067  0.4082483
## [2,] -0.5688793 -0.03327957 -0.8164966
## [3,] -0.6895629  0.67265152  0.4082483
eigvalX = eigX$values; eigvalX
## [1]  1.611684e+01 -1.116844e+00 -1.303678e-15
eigvalY = eigY$values; eigvalY
## [1]  4.242429e+01 -4.242853e-01 -8.760878e-16
eigvecX = eigX$vectors; eigvecX
##            [,1]        [,2]       [,3]
## [1,] -0.2319707 -0.78583024  0.4082483
## [2,] -0.5253221 -0.08675134 -0.8164966
## [3,] -0.8186735  0.61232756  0.4082483
eigvecY = eigY$vectors; eigvecY
##            [,1]        [,2]       [,3]
## [1,] -0.4481957 -0.73921067  0.4082483
## [2,] -0.5688793 -0.03327957 -0.8164966
## [3,] -0.6895629  0.67265152  0.4082483
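
The hand-worked 2 × 2 example from Sections 2.1 and 2.2 can be checked with `eigen()` as well. The following is a minimal sketch; the names `eigA` and `v_lambda2` are introduced here for illustration. `eigen()` returns unit-length eigenvectors whose sign is arbitrary, so they are rescaled before comparing with the hand-derived vectors.

```r
A <- matrix(c(2, 0,
              1, 3), nrow = 2, byrow = TRUE)
eigA <- eigen(A)
eigA$values

# each column of $vectors should satisfy A v = lambda v
for (i in seq_along(eigA$values)) {
  v <- eigA$vectors[, i]
  print(all.equal(as.vector(A %*% v), eigA$values[i] * v))
}

# pick the eigenvector that belongs to the smaller eigenvalue (lambda = 2)
v_lambda2 <- eigA$vectors[, which.min(eigA$values)]

# dividing by the first component removes the arbitrary length and sign
v_lambda2 / v_lambda2[1]
```

Any nonzero multiple of an eigenvector is again an eigenvector, which is why the software output and a hand calculation can differ by a constant factor.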

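3 SINGULAR VALUE DECOMPOSITION (SVD)

The singular value decomposition (SVD) factors a matrix A into three matrices, A = UDV', where the columns of U and V are orthonormal (the left and right singular vectors) and D is a diagonal matrix holding the non-negative singular values. The factors help reveal the structure of A.

R syntax for the SVD: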

library(MASS)
A <- matrix(c(5,-3,6,2,-4,8,-2,5,-1,7,3,9), 4, 3, byrow=TRUE)
A
##      [,1] [,2] [,3]
## [1,]    5   -3    6
## [2,]    2   -4    8
## [3,]   -2    5   -1
## [4,]    7    3    9
svd_result <- svd(A)
singular_value <- svd_result$d ; singular_value
## [1] 16.07076  7.41936  3.11187
U <- svd_result$u ; U
##            [,1]       [,2]       [,3]
## [1,] -0.5046975  0.2278362 -0.3742460
## [2,] -0.5178195  0.4138180  0.7413297
## [3,]  0.1646416 -0.6063789  0.5337354
## [4,] -0.6708477 -0.6396483 -0.1596770
V <- svd_result$v ; V
##            [,1]        [,2]       [,3]
## [1,] -0.5341591 -0.17494276 -0.8270847
## [2,]  0.1490928 -0.98251336  0.1115295
## [3,] -0.8321330 -0.06373793  0.5509011
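
One way to confirm the decomposition is to multiply the three factors back together. The following sketch reuses the matrix A from the example; the names `s` and `A_rebuilt` are introduced here for illustration.

```r
A <- matrix(c(5, -3, 6, 2, -4, 8, -2, 5, -1, 7, 3, 9), 4, 3, byrow = TRUE)
s <- svd(A)

# rebuild A from the three factors: U (4 x 3), D (3 x 3 diagonal), V (3 x 3)
A_rebuilt <- s$u %*% diag(s$d) %*% t(s$v)
all.equal(A_rebuilt, A)

# the columns of U and V are orthonormal, so U'U and V'V are identity matrices
round(crossprod(s$u), 10)
round(crossprod(s$v), 10)

# the squared singular values equal the eigenvalues of A'A
all.equal(s$d^2, eigen(crossprod(A), symmetric = TRUE)$values)
```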

4 DISTANCE MATRICES

A distance matrix records how far apart, or how dissimilar, each pair of observations (rows) in a dataset is.

R syntax for distance matrices (a standardized random sample of 15 states from USArrests is prepared first):

set.seed(321)
ss <- sample(1:50, 15)
df <- USArrests[ss, ]
df.scaled <- scale(df); df.scaled
##                  Murder     Assault   UrbanPop        Rape
## Wyoming      -0.3721741 -0.02296746 -0.3418930 -0.62039386
## Illinois      0.4221896  1.02244775  1.2520675  0.62633064
## Mississippi   1.6799322  1.14124493 -1.4507350 -0.39776448
## Kansas       -0.5486994 -0.56943449  0.0739228 -0.26418686
## New York      0.5766492  1.08184634  1.4599754  0.93801176
## Kentucky      0.2677300 -0.64071280 -0.8963140 -0.51650015
## Oklahoma     -0.4163054 -0.14176464  0.2125281  0.03265231
## Hawaii       -0.7031590 -1.38913505  1.2520675  0.06233622
## Missouri      0.1132704  0.17898775  0.3511333  1.24969289
## New Mexico    0.6428462  1.45011760  0.3511333  1.82852926
## Louisiana     1.5254725  1.02244775  0.0739228  0.35917539
## South Dakota -1.0341439 -0.91394632 -1.3814324 -1.03596869
## Iowa         -1.3871944 -1.27033787 -0.5498008 -1.25859806
## North Dakota -1.6961136 -1.40101477 -1.4507350 -1.85227639
## Texas         0.9296998  0.45222127  1.0441596  0.84896001
## attr(,"scaled:center")
##     Murder    Assault   UrbanPop       Rape 
##   8.486667 162.933333  64.933333  19.780000 
## attr(,"scaled:scale")
##    Murder   Assault  UrbanPop      Rape 
##  4.531929 84.177081 14.429467  6.737655

4.1 Euclidean distance

The Euclidean distance is the straight-line (shortest) distance between two points in p-dimensional space: \( d(x, y) = \sqrt{\sum_{j=1}^{p} (x_j - y_j)^2} \).

Example applications:
- computing the straight-line distance between two (GPS) coordinates
- clustering (K-Means, hierarchical), where objects that are close together are merged

R syntax for the Euclidean distance:

library(factoextra)
dist.eucl <- dist(df.scaled, method = "euclidean"); dist.eucl
##                Wyoming  Illinois Mississippi    Kansas  New York  Kentucky
## Illinois     2.4122476                                                    
## Mississippi  2.6164146 3.1543527                                          
## Kansas       0.7934567 2.3786048   3.1993198                              
## New York     2.7921742 0.4095812   3.3878156 2.7128511                    
## Kentucky     1.0532156 2.9515362   2.3433244 1.2948587 3.2757206          
## Oklahoma     0.8659748 1.8685718   2.9986711 0.5547563 2.2043102 1.4993175
## Hawaii       2.2322175 2.7203365   4.4270510 1.4800030 2.9246694 2.5403456
## Missouri     2.0625111 1.4167282   3.0563398 1.8349434 1.5351057 2.3176129
## New Mexico   3.1109091 1.5775154   3.0617092 3.1551035 1.4705638 3.4011133
## Louisiana    2.4137967 1.6360410   1.7133330 2.6879097 1.7776353 2.4609320
## South Dakota 1.5765126 3.9457686   3.4644086 1.7515852 4.3067435 1.5082173
## Iowa         1.7426214 3.9154083   4.0958166 1.6038155 4.2724405 1.9508929
## North Dakota 2.5296038 4.8794481   4.4694938 2.6181473 5.2524274 2.5546862
## Texas        2.4496576 0.8218968   2.9692463 2.3259192 0.8377979 2.6949264
##               Oklahoma    Hawaii  Missouri New Mexico Louisiana South Dakota
## Illinois                                                                    
## Mississippi                                                                 
## Kansas                                                                      
## New York                                                                    
## Kentucky                                                                    
## Oklahoma                                                                    
## Hawaii       1.6491638                                                      
## Missouri     1.3724911 2.3123720                                            
## New Mexico   2.6268378 3.7154012 1.4937447                                  
## Louisiana    2.2916633 3.5012381 1.8909275  1.7882330                       
## South Dakota 2.1588538 2.9115203 3.2767510  4.4281177 3.7902169             
## Iowa         2.1130016 2.3395756 3.3845451  4.6758935 4.0922753    0.9964108
## North Dakota 3.0891779 3.4578871 4.3173165  5.5131433 4.8442635    1.1604313
## Texas        1.8768374 2.5920693 1.1756214  1.5867966 1.3643137    3.8935265
##                   Iowa North Dakota
## Illinois                           
## Mississippi                        
## Kansas                             
## New York                           
## Kentucky                           
## Oklahoma                           
## Hawaii                             
## Missouri                           
## New Mexico                         
## Louisiana                          
## South Dakota                       
## Iowa                               
## North Dakota 1.1298867             
## Texas        3.9137858    4.8837032
fviz_dist(dist.eucl)

4.2 Chebyshev distance

The Chebyshev distance is the largest absolute difference between two points over all coordinates: \( d(x, y) = \max_{j} |x_j - y_j| \). It focuses on the dimension in which the two points differ most.

Example applications:
- the number of moves a chess king needs to travel between two squares equals the Chebyshev distance
- multivariate quality control, for example checking a product's dimensions (width, length, height), where the focus is on the worst dimension

R syntax for the Chebyshev distance:

dist.cheb <- dist(df.scaled, method = "maximum"); dist.cheb
##                Wyoming  Illinois Mississippi    Kansas  New York  Kentucky
## Illinois     1.5939604                                                    
## Mississippi  2.0521063 2.7028025                                          
## Kansas       0.5464670 1.5918822   2.2286315                              
## New York     1.8018683 0.3116811   2.9107104 1.6512808                    
## Kentucky     0.6399041 2.1483815   1.7819577 0.9702368 2.3562894          
## Oklahoma     0.6530462 1.1642124   2.0962376 0.4276699 1.2474473 1.1088421
## Hawaii       1.5939604 2.4115828   2.7028025 1.1781447 2.4709814 2.1483815
## Missouri     1.8700867 0.9009342   1.8018683 1.5138797 1.1088421 1.7661930
## New Mexico   2.4489231 1.2021986   2.2262937 2.0927161 1.1088421 2.3450294
## Louisiana    1.8976467 1.1781447   1.5246578 2.0741719 1.3860526 1.6631605
## South Dakota 1.0395394 2.6334999   2.7140760 1.4553552 2.8414078 1.3018739
## Iowa         1.2473704 2.2927856   3.0671266 0.9944112 2.3521842 1.6549244
## North Dakota 1.3780473 2.7028025   3.3760458 1.5880895 2.9107104 1.9638436
## Texas        1.4693539 0.5702265   2.4948946 1.4783991 0.6296251 1.9404736
##               Oklahoma    Hawaii  Missouri New Mexico Louisiana South Dakota
## Illinois                                                                    
## Mississippi                                                                 
## Kansas                                                                      
## New York                                                                    
## Kentucky                                                                    
## Oklahoma                                                                    
## Hawaii       1.2473704                                                      
## Missouri     1.2170406 1.5681228                                            
## New Mexico   1.7958770 2.8392526 1.2711298                                  
## Louisiana    1.9417780 2.4115828 1.4122022  1.4693539                       
## South Dakota 1.5939604 2.6334999 2.2856616  2.8644979 2.5596164             
## Iowa         1.2912504 1.8018683 2.5082909  3.0871273 2.9126670    0.8316315
## North Dakota 1.8849287 2.7028025 3.1019693  3.6808057 3.2215862    0.8163077
## Texas        1.3460052 1.8413563 0.8164294  0.9978963 0.9702368    2.4255920
##                   Iowa North Dakota
## Illinois                           
## Mississippi                        
## Kansas                             
## New York                           
## Kentucky                           
## Oklahoma                           
## Hawaii                             
## Missouri                           
## New Mexico                         
## Louisiana                          
## South Dakota                       
## Iowa                               
## North Dakota 0.9009342             
## Texas        2.3168942    2.7012364
fviz_dist(dist.cheb)

4.3 Manhattan distance

The Manhattan distance is the sum of the absolute coordinate differences, \( d(x, y) = \sum_{j=1}^{p} |x_j - y_j| \), like walking along a grid of city streets.

Example applications:
- computing distances in a warehouse or street grid where diagonal paths are not possible
- computing distances between documents based on word frequencies (NLP)

R syntax for the Manhattan distance:

dist.man <- dist(df.scaled, method = "manhattan"); dist.man
##                 Wyoming   Illinois Mississippi     Kansas   New York   Kentucky
## Illinois      4.6804639                                                        
## Mississippi   4.5477901  5.1034373                                             
## Kansas        1.4950151  4.6314334   5.5975464                                 
## New York      5.4139111  0.7334472   5.4091682  5.3648806                      
## Kentucky      1.9159642  5.1088324   3.8673166  2.1102578  5.8422796           
## Oklahoma      1.3703957  3.6359252   5.4729270  0.9955082  4.3693724  2.8409781
## Hawaii        3.9738430  4.1009258   8.0763743  2.4788279  4.8343730  4.4465291
## Missouri      3.2505127  2.6766756   5.9782446  3.2014823  2.7867606  3.9878005
## New Mexico    5.6300548  2.7514592   5.3741207  5.5810243  2.4338278  6.0584233
## Louisiana     4.3384469  2.5485829   2.5548545  4.2894164  2.9731109  4.7668154
## South Dakota  3.0080629  7.6885267   5.4767741  3.0570933  8.4219740  2.5796943
## Iowa          3.1085028  7.7889667   7.2404771  3.1575333  8.5224139  3.3731605
## North Dakota  5.0427114  9.7231753   7.3728174  5.0917419 10.4566225  4.6143429
## Texas         4.6324690  1.5082739   5.1808752  4.5834386  1.4875431  5.0608376
##                Oklahoma     Hawaii   Missouri New Mexico  Louisiana
## Illinois                                                           
## Mississippi                                                        
## Kansas                                                             
## New York                                                           
## Kentucky                                                           
## Oklahoma                                                           
## Hawaii        2.6034473                                            
## Missouri      2.2059740  4.4728430                                 
## New Mexico    4.5855161  6.8523850  2.3795420                      
## Louisiana     3.5711187  6.1151982  3.4233902  3.0568606           
## South Dakota  4.0526016  4.5379784  6.2585756  8.6381176  7.3465098
## Iowa          4.1530415  3.9256352  6.3590155  8.7385576  7.4469497
## North Dakota  6.0872501  5.6222495  8.2932241 10.6727662  9.3811583
## Texas         3.5879303  4.4687467  2.1834220  2.9573454  2.6260207
##              South Dakota       Iowa North Dakota
## Illinois                                         
## Mississippi                                      
## Kansas                                           
## New York                                         
## Kentucky                                         
## Oklahoma                                         
## Hawaii                                           
## Missouri                                         
## New Mexico                                       
## Louisiana                                        
## South Dakota                                     
## Iowa            1.7637030                        
## North Dakota    2.0346485  1.9342086             
## Texas           7.6405319  7.7409718    9.6751804
fviz_dist(dist.man)
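
The Euclidean, Chebyshev, and Manhattan distances can be compared on a single pair of points. The following is a minimal sketch, assuming two hypothetical points `p` and `q` that are not taken from the data; each distance is computed from its definition and then with `dist()`.

```r
# two hypothetical points in 3-dimensional space
p <- c(1, 2, 3)
q <- c(4, 0, 3)

d_euclid    <- sqrt(sum((p - q)^2))   # straight-line distance
d_chebyshev <- max(abs(p - q))        # largest coordinate difference
d_manhattan <- sum(abs(p - q))        # sum of coordinate differences
c(euclidean = d_euclid, chebyshev = d_chebyshev, manhattan = d_manhattan)

# dist() gives the same values when the points are stacked as rows
pts <- rbind(p, q)
d_dist <- c(as.numeric(dist(pts, method = "euclidean")),
            as.numeric(dist(pts, method = "maximum")),
            as.numeric(dist(pts, method = "manhattan")))
d_dist
```

For any pair of points the Chebyshev distance is never larger than the Euclidean distance, which in turn is never larger than the Manhattan distance.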

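4.4 Mahalanobis distance

The Mahalanobis distance measures the distance between points while accounting for the scale (variance) of the variables and the correlations between them.

Example applications:
- detecting unusual financial transactions
- separating groups with different variances and correlations (discriminant analysis)

R syntax for the Mahalanobis distance (note that mahalanobis() returns the squared distance of each observation from the given center, not a pairwise distance matrix):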

center <- colMeans(df.scaled)
cov_matrix <- cov(df.scaled)
dist.mah <- mahalanobis(df.scaled, center, cov_matrix)
dist.mah
##      Wyoming     Illinois  Mississippi       Kansas     New York     Kentucky 
##    1.4833812    4.3099818    7.1777961    0.4220713    3.7281363    3.8406053 
##     Oklahoma       Hawaii     Missouri   New Mexico    Louisiana South Dakota 
##    0.4255505    7.0025193    4.7697204    7.7270275    3.1431292    2.7059402 
##         Iowa North Dakota        Texas 
##    2.2743931    4.4972638    2.4924839

5 MEAN VECTOR

The mean vector is the vector that holds the mean of each variable in a multivariate dataset.

5.1 Mean matrix

The mean matrix is formed by repeating the mean vector in every row, once per observation, so that all rows of the matrix are identical to the mean vector.

R syntax for the mean vector, from which the mean matrix is built:

# input lizard data
BB = c(6.2,11.5,8.7,10.1,7.8,6.9,12.0,3.1,14.8,9.4)
PM = c(61,73,68,70,64,60,76,49,84,71)
RTB = c(115,138,127,123,131,120,143,95,160,128)
lizard = as.matrix(cbind(BB,PM,RTB)); lizard
##         BB PM RTB
##  [1,]  6.2 61 115
##  [2,] 11.5 73 138
##  [3,]  8.7 68 127
##  [4,] 10.1 70 123
##  [5,]  7.8 64 131
##  [6,]  6.9 60 120
##  [7,] 12.0 76 143
##  [8,]  3.1 49  95
##  [9,] 14.8 84 160
## [10,]  9.4 71 128
# mean vector, computed in two equivalent ways
vecMeans = as.matrix(colMeans(lizard)); vecMeans
##       [,1]
## BB    9.05
## PM   67.60
## RTB 128.00
vecRata = matrix(c(mean(BB), mean(PM), mean(RTB)), nrow=3, ncol=1); vecRata
##        [,1]
## [1,]   9.05
## [2,]  67.60
## [3,] 128.00
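
The mean matrix itself, with one copy of the mean vector per observation, can be built by multiplying a column of ones by the transposed mean vector. The following is a minimal sketch; the names `xbar`, `mean_matrix`, and `centered` are introduced here for illustration.

```r
BB  <- c(6.2, 11.5, 8.7, 10.1, 7.8, 6.9, 12.0, 3.1, 14.8, 9.4)
PM  <- c(61, 73, 68, 70, 64, 60, 76, 49, 84, 71)
RTB <- c(115, 138, 127, 123, 131, 120, 143, 95, 160, 128)
lizard <- cbind(BB, PM, RTB)

n    <- nrow(lizard)
xbar <- colMeans(lizard)            # mean vector (length 3)

# mean matrix: a column of ones times the transposed mean vector (n x 3)
mean_matrix <- matrix(1, n, 1) %*% t(xbar)
mean_matrix

# subtracting it centers the data, so every column mean becomes 0
centered <- lizard - mean_matrix
round(colMeans(centered), 10)
```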

5.2 Covariance matrix

The covariance matrix is a square matrix that shows how much two or more random variables vary together. Its main diagonal holds the variance of each variable, and its off-diagonal entries hold the covariance of each pair of variables.

General form: \[ \Sigma = \begin{bmatrix} \mathrm{Var}(X_1) & \mathrm{Cov}(X_1, X_2) & \cdots & \mathrm{Cov}(X_1, X_p) \\ \mathrm{Cov}(X_2, X_1) & \mathrm{Var}(X_2) & \cdots & \mathrm{Cov}(X_2, X_p) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}(X_p, X_1) & \mathrm{Cov}(X_p, X_2) & \cdots & \mathrm{Var}(X_p) \end{bmatrix} \]

R syntax for the covariance matrix:

# input lizard data
BB = c(6.2,11.5,8.7,10.1,7.8,6.9,12.0,3.1,14.8,9.4)
PM = c(61,73,68,70,64,60,76,49,84,71)
RTB = c(115,138,127,123,131,120,143,95,160,128)
lizard = as.matrix(cbind(BB,PM,RTB)); lizard
##         BB PM RTB
##  [1,]  6.2 61 115
##  [2,] 11.5 73 138
##  [3,]  8.7 68 127
##  [4,] 10.1 70 123
##  [5,]  7.8 64 131
##  [6,]  6.9 60 120
##  [7,] 12.0 76 143
##  [8,]  3.1 49  95
##  [9,] 14.8 84 160
## [10,]  9.4 71 128
# covariance matrix
varkov = cov(lizard); varkov
##           BB        PM       RTB
## BB  10.98056  31.80000  54.96667
## PM  31.80000  94.04444 160.22222
## RTB 54.96667 160.22222 300.66667

5.3 Correlation matrix

The correlation matrix is a matrix whose entries are correlation coefficients, each lying in the interval [−1, 1], with every diagonal entry equal to one.

General form: \[ R = \begin{bmatrix} 1 & r_{12} & r_{13} & \cdots & r_{1p} \\ r_{21} & 1 & r_{23} & \cdots & r_{2p} \\ r_{31} & r_{32} & 1 & \cdots & r_{3p} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ r_{p1} & r_{p2} & r_{p3} & \cdots & 1 \end{bmatrix} \]

R syntax for the correlation matrix:

# input lizard data
BB = c(6.2,11.5,8.7,10.1,7.8,6.9,12.0,3.1,14.8,9.4)
PM = c(61,73,68,70,64,60,76,49,84,71)
RTB = c(115,138,127,123,131,120,143,95,160,128)
lizard = as.matrix(cbind(BB,PM,RTB)); lizard
##         BB PM RTB
##  [1,]  6.2 61 115
##  [2,] 11.5 73 138
##  [3,]  8.7 68 127
##  [4,] 10.1 70 123
##  [5,]  7.8 64 131
##  [6,]  6.9 60 120
##  [7,] 12.0 76 143
##  [8,]  3.1 49  95
##  [9,] 14.8 84 160
## [10,]  9.4 71 128
# correlation matrix
korel = cor(lizard); korel
##            BB        PM       RTB
## BB  1.0000000 0.9895743 0.9566313
## PM  0.9895743 1.0000000 0.9528259
## RTB 0.9566313 0.9528259 1.0000000

5.4 Standardized matrix

The standardized matrix is obtained by transforming the original data so that every variable has mean zero and standard deviation one. Each observation has the variable's mean subtracted from it and is then divided by that variable's standard deviation.

R syntax for the standardization steps using matrix algebra (centering, covariance, and scaling by the standard deviations, which yields the correlation matrix R):

# input lizard data
BB = c(6.2,11.5,8.7,10.1,7.8,6.9,12.0,3.1,14.8,9.4)
PM = c(61,73,68,70,64,60,76,49,84,71)
RTB = c(115,138,127,123,131,120,143,95,160,128)
lizard = as.matrix(cbind(BB,PM,RTB)); lizard
##         BB PM RTB
##  [1,]  6.2 61 115
##  [2,] 11.5 73 138
##  [3,]  8.7 68 127
##  [4,] 10.1 70 123
##  [5,]  7.8 64 131
##  [6,]  6.9 60 120
##  [7,] 12.0 76 143
##  [8,]  3.1 49  95
##  [9,] 14.8 84 160
## [10,]  9.4 71 128
# standardization via matrix algebra
n = nrow(lizard);n
## [1] 10
u = matrix(1,n,1); u   # column vector of ones
##       [,1]
##  [1,]    1
##  [2,]    1
##  [3,]    1
##  [4,]    1
##  [5,]    1
##  [6,]    1
##  [7,]    1
##  [8,]    1
##  [9,]    1
## [10,]    1
xbar = cbind((1/n)*t(u)%*%lizard); xbar   # mean vector as a 1 x 3 row
##        BB   PM RTB
## [1,] 9.05 67.6 128
D = lizard - u %*% xbar; D   # centered data (deviations from the mean)
##          BB    PM RTB
##  [1,] -2.85  -6.6 -13
##  [2,]  2.45   5.4  10
##  [3,] -0.35   0.4  -1
##  [4,]  1.05   2.4  -5
##  [5,] -1.25  -3.6   3
##  [6,] -2.15  -7.6  -8
##  [7,]  2.95   8.4  15
##  [8,] -5.95 -18.6 -33
##  [9,]  5.75  16.4  32
## [10,]  0.35   3.4   0
S = (1/(n-1))*t(D)%*%D; S   # sample covariance matrix
##           BB        PM       RTB
## BB  10.98056  31.80000  54.96667
## PM  31.80000  94.04444 160.22222
## RTB 54.96667 160.22222 300.66667
Ds = diag(sqrt(diag(S))); Ds   # diagonal matrix of standard deviations
##          [,1]     [,2]     [,3]
## [1,] 3.313692 0.000000  0.00000
## [2,] 0.000000 9.697651  0.00000
## [3,] 0.000000 0.000000 17.33974
R = solve(Ds) %*% S %*% solve(Ds); R   # correlation matrix
##           [,1]      [,2]      [,3]
## [1,] 1.0000000 0.9895743 0.9566313
## [2,] 0.9895743 1.0000000 0.9528259
## [3,] 0.9566313 0.9528259 1.0000000
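
The steps above end with the correlation matrix R; the standardized data matrix itself can be obtained by dividing the centered data by the standard deviations. The following is a minimal sketch, assuming the name `Z` for the standardized matrix (a name not used in the original) and using `sweep()` and `crossprod()` as shortcuts for the centering and covariance steps.

```r
BB  <- c(6.2, 11.5, 8.7, 10.1, 7.8, 6.9, 12.0, 3.1, 14.8, 9.4)
PM  <- c(61, 73, 68, 70, 64, 60, 76, 49, 84, 71)
RTB <- c(115, 138, 127, 123, 131, 120, 143, 95, 160, 128)
lizard <- cbind(BB, PM, RTB)

n  <- nrow(lizard)
D  <- sweep(lizard, 2, colMeans(lizard))   # centered data
S  <- crossprod(D) / (n - 1)               # covariance matrix
Ds <- diag(sqrt(diag(S)))                  # diagonal matrix of standard deviations

# standardized data: divide each centered column by its standard deviation
Z <- D %*% solve(Ds)
colnames(Z) <- colnames(lizard)

# same result as the built-in scale(), apart from its extra attributes
all.equal(Z, scale(lizard), check.attributes = FALSE)

# every column now has mean 0 and standard deviation 1
round(colMeans(Z), 10)
apply(Z, 2, sd)

# the covariance matrix of Z is the correlation matrix of the original data
all.equal(cov(Z), cor(lizard))
```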