Canonical Correlation Analysis

Definition

Canonical correlation analysis measures the strength of the relationship between two sets (groups) of variables, something an ordinary pairwise linear correlation cannot capture. Examples:
- a set of social variables and a set of economic variables
- a set of environmental variables and a set of health variables
- a set of education variables and a set of health variables
- a set of socioeconomic variables and a set of sanitation variables

Canonical correlation analysis measures the correlation between a linear combination of the variables in one group and a linear combination of the variables in the other group, with the combinations chosen so that this correlation is as large as possible.
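As a minimal sketch of this definition, the base R code below computes canonical correlations directly from the covariance matrices of two variable sets and compares them with stats::cancor. The built-in LifeCycleSavings data and the way it is split into two sets are assumptions chosen only for illustration; they are not part of the case study that follows.

```r
# Illustrative data (an assumption): the LifeCycleSavings data set shipped with R
pop  <- as.matrix(LifeCycleSavings[, c("pop15", "pop75")])     # first set
econ <- as.matrix(LifeCycleSavings[, c("sr", "dpi", "ddpi")])  # second set

# Covariance blocks within and between the two sets
Sxx <- cov(pop)
Syy <- cov(econ)
Sxy <- cov(pop, econ)

# The eigenvalues of Sxx^-1 Sxy Syy^-1 Syx are the squared canonical correlations
M  <- solve(Sxx) %*% Sxy %*% solve(Syy) %*% t(Sxy)
r2 <- sort(Re(eigen(M)$values), decreasing = TRUE)
can_cor_manual <- sqrt(r2)

# The same quantities from base R
can_cor_base <- stats::cancor(pop, econ)$cor

print(can_cor_manual)
print(can_cor_base)
```

The number of canonical correlations equals the size of the smaller set, here two.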

Case Example

This example uses the chemicaldata dataset (Box and Youle 1955; Rencher 2002), available in the ACSWR package. The dataset contains the results of a chemical reaction experiment with the following variables:
- X1 = temperature
- X2 = concentration
- X3 = time
- Y1 = percentage of unchanged material
- Y2 = percentage of material converted to the desired product
- Y3 = percentage of material converted to an undesired product

Loading the Packages and Dataset

# Load the libraries
library(CCA)
library(candisc)
## 
## Attaching package: 'candisc'
## The following object is masked from 'package:stats':
## 
##     cancor
library(ACSWR)
# Load the data
data(chemicaldata)
chem <- chemicaldata
X <- chem[, 4:6]  # Independent variables (x1, x2, x3)
Y <- chem[, 1:3]  # Dependent variables (y1, y2, y3)

Correlation Plots

library(GGally)
# Correlations among the independent variables (X)
ggpairs(X, title = "Correlations Among the Independent Variables (X)")

# Correlations among the dependent variables (Y)
ggpairs(Y, title = "Correlations Among the Dependent Variables (Y)")

Canonical Correlation Between X and Y

chem_cc2 <- candisc::cancor(X, Y) 
summary(chem_cc2)
## 
## Canonical correlation analysis of:
##   3   X  variables:  x1, x2, x3 
##   with    3   Y  variables:  y1, y2, y3 
## 
##      CanR   CanRSQ     Eigen  percent    cum                          scree
## 1 0.98153 0.963395 26.318349 99.60771  99.61 ******************************
## 2 0.30199 0.091200  0.100353  0.37981  99.99                               
## 3 0.05733 0.003287  0.003298  0.01248 100.00                               
## 
## Test of H0: The canonical correlations in the 
## current row and all that follow are zero
## 
##      CanR LR test stat approx F numDF  denDF   Pr(> F)    
## 1 0.98153      0.03316  10.7870     9 31.789 1.884e-07 ***
## 2 0.30199      0.90581   0.3549     4 28.000    0.8384    
## 3 0.05733      0.99671   0.0495     1 15.000    0.8270    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Raw canonical coefficients
## 
##    X  variables: 
##       Xcan1    Xcan2     Xcan3
## x1 -0.16062 -0.06944 -0.049094
## x2 -0.14861 -0.12159  0.191179
## x3 -0.21568  0.58392  0.037688
## 
##    Y  variables: 
##       Ycan1   Ycan2   Ycan3
## y1 0.170794 0.62594 0.37732
## y2 0.069097 0.72999 0.21730
## y3 0.085825 0.71274 0.53900

The code above uses the cancor function from the candisc package. The output shows that the first canonical correlation is 0.9815, the second is 0.3020, and the third is 0.0573.

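The squared first canonical correlation, 0.963394567, is the proportion of the variance of U1 (the first canonical variate of X) that can be explained by the Y variables. By symmetry, it is also the proportion of the variance of V1 (the first canonical variate of Y) that can be explained by the X variables.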
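The remaining columns of the first table can be reproduced from CanRSQ alone. The sketch below assumes the usual relation Eigen = r^2 / (1 - r^2) and copies the rounded CanRSQ values from the output above, so its results agree with the printed table only up to rounding.

```r
# Squared canonical correlations copied from the CanRSQ column above
can_rsq <- c(0.963395, 0.091200, 0.003287)

# "Eigen" column: r^2 / (1 - r^2)
eigen_col <- can_rsq / (1 - can_rsq)

# "percent" and "cum" columns: share of each eigenvalue in the total
percent <- 100 * eigen_col / sum(eigen_col)
cum     <- cumsum(percent)

print(round(data.frame(CanRSQ = can_rsq, Eigen = eigen_col, percent, cum), 4))
```

The first dimension accounts for almost all of the total, which is why the scree column shows a bar only in the first row.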

Kemudian fungsi dari masing-masing variat kanonik adalah sebagai berikut:

Canonical variates of the dependent set (Y variables):
- V1 = 0.1708 y1 + 0.0691 y2 + 0.0858 y3
- V2 = 0.6259 y1 + 0.7300 y2 + 0.7127 y3
- V3 = 0.3773 y1 + 0.2173 y2 + 0.5390 y3

Canonical variates of the independent set (X variables):
- U1 = -0.1606 x1 - 0.1486 x2 - 0.2157 x3
- U2 = -0.0694 x1 - 0.1216 x2 + 0.5839 x3
- U3 = -0.0491 x1 + 0.1912 x2 + 0.0377 x3
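How canonical variates are built from coefficients can be sketched with base R alone. The example below is an illustration under assumptions: it uses the built-in LifeCycleSavings data rather than chemicaldata, and stats::cancor rather than candisc::cancor, whose coefficients may be scaled differently. It centers each set, multiplies by the coefficient matrix, and checks that the correlation within each pair of variates equals the corresponding canonical correlation.

```r
# Illustrative data (an assumption): two variable sets from LifeCycleSavings
x <- as.matrix(LifeCycleSavings[, c("pop15", "pop75")])
y <- as.matrix(LifeCycleSavings[, c("sr", "dpi", "ddpi")])

fit <- stats::cancor(x, y)
k   <- length(fit$cor)  # number of canonical dimensions

# Canonical variates = centered data times the coefficient matrix
x_variates <- sweep(x, 2, fit$xcenter) %*% fit$xcoef[, 1:k, drop = FALSE]
y_variates <- sweep(y, 2, fit$ycenter) %*% fit$ycoef[, 1:k, drop = FALSE]

# Correlation within each pair of variates
pair_cor <- diag(cor(x_variates, y_variates))
print(cbind(pair_cor, canonical = fit$cor))
```

Because a correlation does not depend on the scale of either variate, the check holds whichever normalization the coefficients use.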

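Dimension Reduction Analysis

The table headed "Test of H0: The canonical correlations in the current row and all that follow are zero" in the summary output above tests, row by row, whether the remaining canonical correlations are zero. Row 1 is significant (approx F = 10.7870, p = 1.884e-07), while rows 2 and 3 are not (p = 0.8384 and p = 0.8270). Only the first pair of canonical variates is therefore needed to describe the relationship between the two sets.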
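The LR test stat column in the summary output above is Wilks' Lambda for the canonical correlations in the current row and all that follow. A short sketch, using the rounded CanR values copied from that output (so agreement is only up to rounding):

```r
# Canonical correlations copied from the CanR column above
can_r <- c(0.98153, 0.30199, 0.05733)

# Wilks' Lambda for row k is the product of (1 - r_i^2) over rows i >= k
wilks_seq <- rev(cumprod(rev(1 - can_r^2)))

print(round(wilks_seq, 5))
```

Values close to 1 in rows 2 and 3 indicate that little association is left once the first dimension is removed.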

Correlation Between the Original Variables and the Canonical Variates

To obtain the correlations between the original variables and their own canonical variates (x with the X variates U, y with the Y variates V), as well as the correlations between the original variables and the canonical variates of the opposite set (x with V, y with U), use the cc function from the CCA package:

res.cc <- cc(X,Y)
res.cc$scores
## $xscores
##               [,1]        [,2]        [,3]
##  [1,]  2.230504539 -1.16584543 -0.67087105
##  [2,]  1.152094308  1.75375064 -0.48242915
##  [3,]  0.758876244 -0.84911235  0.74275969
##  [4,]  0.111830106  0.90264530  0.85582484
##  [5,] -0.104246314 -0.93557992 -0.70407353
##  [6,] -0.751292453  0.81617773 -0.59100839
##  [7,] -0.847292173 -1.54351231  0.25182217
##  [8,] -1.494338312  0.20824534  0.36488731
##  [9,]  0.003791896 -0.01646731  0.07587565
## [10,] -1.602376522 -0.71086727 -0.41506187
## [11,]  1.609960313  0.67793265  0.56681317
## [12,] -0.739253963 -0.62439970  1.03177136
## [13,]  0.746837755  0.59146508 -0.88002005
## [14,] -0.643254243  1.73529034  0.18894079
## [15,]  0.650838034 -1.76822496 -0.03718949
## [16,] -0.487807733  0.20103131 -1.84890543
## [17,] -0.487807733  0.20103131 -1.84890543
## [18,] -0.053531875  0.26321977  1.69988471
## [19,] -0.053531875  0.26321977  1.69988471
## 
## $yscores
##              [,1]       [,2]         [,3]
##  [1,]  2.09778362 -1.1041111  0.611132567
##  [2,]  1.29398950 -0.5219566 -0.686230525
##  [3,]  0.67109241 -0.2051390 -1.266743130
##  [4,]  0.01937793 -0.6597517 -1.469499523
##  [5,] -0.14651111 -0.3299210 -1.649740910
##  [6,] -0.61377318 -0.7334426 -0.614024293
##  [7,] -0.88747010 -0.6938741 -0.516054185
##  [8,] -1.50650496 -0.5426948  2.432288140
##  [9,] -0.06693847  0.2285540  0.068736060
## [10,] -1.55848793 -2.1700100 -0.004206386
## [11,]  1.80117275 -0.3206936  1.243361361
## [12,] -0.18408905  0.3482939  0.150233450
## [13,]  0.73985934  1.0849391  0.680133375
## [14,] -0.95917850  0.5800591  0.567780019
## [15,]  0.51920558 -0.4512992  1.216491428
## [16,] -0.51895573  1.2512678 -0.064387766
## [17,] -0.40410523  1.6144702 -0.497884842
## [18,] -0.36477628  0.9335076 -0.109333771
## [19,]  0.06830943  1.6918019 -0.092051069
## 
## $corr.X.xscores
##          [,1]       [,2]       [,3]
## x1 -0.7002371 -0.2147010 -0.6808608
## x2 -0.2303823 -0.1471813  0.9619052
## x3 -0.4415774  0.8719836  0.2113147
## 
## $corr.Y.xscores
##          [,1]        [,2]        [,3]
## y1  0.9770283 -0.02342752 -0.00320610
## y2 -0.5921615  0.15884355 -0.03437028
## y3 -0.8455922 -0.02060390  0.02884731
## 
## $corr.X.yscores
##          [,1]        [,2]        [,3]
## x1 -0.6873014 -0.06483844 -0.03903680
## x2 -0.2261263 -0.04444788  0.05515033
## x3 -0.4334200  0.26333395  0.01211562
## 
## $corr.Y.yscores
##          [,1]        [,2]        [,3]
## y1  0.9954170 -0.07757607 -0.05591924
## y2 -0.6033066  0.52598224 -0.59946959
## y3 -0.8615072 -0.06822615  0.50314074

Correlation between X and U (the canonical variates of X)

res.cc$scores$corr.X.xscores
##          [,1]       [,2]       [,3]
## x1 -0.7002371 -0.2147010 -0.6808608
## x2 -0.2303823 -0.1471813  0.9619052
## x3 -0.4415774  0.8719836  0.2113147

Correlation between Y and V (the canonical variates of Y)

res.cc$scores$corr.Y.yscores
##          [,1]        [,2]        [,3]
## y1  0.9954170 -0.07757607 -0.05591924
## y2 -0.6033066  0.52598224 -0.59946959
## y3 -0.8615072 -0.06822615  0.50314074

Correlation between X and V (cross-loadings)

res.cc$scores$corr.X.yscores
##          [,1]        [,2]        [,3]
## x1 -0.6873014 -0.06483844 -0.03903680
## x2 -0.2261263 -0.04444788  0.05515033
## x3 -0.4334200  0.26333395  0.01211562

Correlation between Y and U (cross-loadings)

res.cc$scores$corr.Y.xscores
##          [,1]        [,2]        [,3]
## y1  0.9770283 -0.02342752 -0.00320610
## y2 -0.5921615  0.15884355 -0.03437028
## y3 -0.8455922 -0.02060390  0.02884731
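The four matrices are linked: each cross-loading equals the corresponding own-set loading multiplied by the canonical correlation of that dimension. The sketch below checks this for the X variables, using values copied from the output above (corr.X.xscores, corr.X.yscores, and the rounded CanR column), so the match is only up to rounding.

```r
# Canonical correlations copied from the CanR column
can_r <- c(0.98153, 0.30199, 0.05733)

# Loadings of X on its own canonical variates (corr.X.xscores)
load_X <- matrix(c(-0.7002371, -0.2147010, -0.6808608,
                   -0.2303823, -0.1471813,  0.9619052,
                   -0.4415774,  0.8719836,  0.2113147),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("x1", "x2", "x3"), NULL))

# Cross-loadings of X on the Y variates as printed (corr.X.yscores)
cross_X_printed <- matrix(c(-0.6873014, -0.06483844, -0.03903680,
                            -0.2261263, -0.04444788,  0.05515033,
                            -0.4334200,  0.26333395,  0.01211562),
                          nrow = 3, byrow = TRUE,
                          dimnames = list(c("x1", "x2", "x3"), NULL))

# Multiply column k of the loadings by the k-th canonical correlation
cross_X <- sweep(load_X, 2, can_r, `*`)

print(round(cross_X, 5))
print(max(abs(cross_X - cross_X_printed)))
```

This is why the cross-loadings in the second and third dimensions are so small: the loadings are scaled down by canonical correlations of about 0.30 and 0.06.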

Hypothesis Test

# Extract the canonical correlations
canonical_cor <- res.cc$cor  # Canonical correlations
eigen_values <- canonical_cor^2  # Squared canonical correlations

# Sample size (n), number of independent (p) and dependent (q) variables
n <- nrow(X)
p <- ncol(X)
q <- ncol(Y)

# Compute Wilks' Lambda
wilks_lambda <- prod(1 - eigen_values)

# Test statistic (Bartlett's chi-square approximation).
# Testing that all canonical correlations are zero uses p * q degrees of freedom.
df <- p * q
test_stat <- -((n - 1) - 0.5 * (p + q + 1)) * log(wilks_lambda)

# P-value
p_value <- pchisq(test_stat, df = df, lower.tail = FALSE)

list(
  "Wilks' Lambda" = wilks_lambda,
  "Test Statistic" = test_stat,
  "Degrees of Freedom" = df,
  "P-value" = signif(p_value, 2)  # rounded to 2 significant digits
)
## $`Wilks' Lambda`
## [1] 0.03315764
## 
## $`Test Statistic`
## [1] 49.39399
## 
## $`Degrees of Freedom`
## [1] 9
## 
## $`P-value`
## [1] 1.4e-07
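The same chi-square approximation can be applied after removing the first k canonical dimensions, in which case Wilks' Lambda uses only the remaining correlations and the degrees of freedom become (p - k)(q - k). The function below is a sketch under assumptions: bartlett_seq is a hypothetical helper name, the multiplier is kept the same for every k (some texts adjust it), the canonical correlations are the rounded CanR values from the cancor summary, and n = 19 is taken from the 19 rows of scores shown earlier.

```r
# Hypothetical helper: Bartlett's chi-square approximation for each k = 0, 1, ...
bartlett_seq <- function(can_r, n, p, q) {
  k     <- seq_along(can_r) - 1                  # dimensions already removed
  wilks <- rev(cumprod(rev(1 - can_r^2)))        # product over the remaining r_i
  stat  <- -((n - 1) - 0.5 * (p + q + 1)) * log(wilks)
  df    <- (p - k) * (q - k)
  data.frame(removed = k, wilks = wilks, stat = stat, df = df,
             p_value = pchisq(stat, df = df, lower.tail = FALSE))
}

res <- bartlett_seq(c(0.98153, 0.30199, 0.05733), n = 19, p = 3, q = 3)
print(res)
```

The first row tests all three canonical correlations at once; the later rows test what remains after each dimension is removed.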