Canonical correlation analysis measures the strength of the relationship between two sets (groups) of variables, something an ordinary linear correlation between two single variables cannot capture. Examples:
- A set of social variables and a set of economic variables
- A set of environmental variables and a set of health variables
- A set of education variables and a set of health variables
- A set of socioeconomic variables and a set of sanitation variables
Canonical correlation analysis measures the correlation between a linear combination of the variables in one group and a linear combination of the variables in the other group.
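As a minimal sketch of this definition, the canonical correlations can be obtained as the square roots of the eigenvalues of Sxx^-1 Sxy Syy^-1 Syx. The example uses R's built-in LifeCycleSavings data purely as a stand-in for two variable sets (an assumption for illustration, not the dataset analysed in this document) and compares the result with stats::cancor.

```r
# Stand-in data (assumption): two variable sets from a built-in dataset
X <- as.matrix(LifeCycleSavings[, c("pop15", "pop75")])
Y <- as.matrix(LifeCycleSavings[, c("sr", "dpi", "ddpi")])

# Within-set and between-set covariance matrices
Sxx <- cov(X)
Syy <- cov(Y)
Sxy <- cov(X, Y)

# Eigenvalues of Sxx^-1 Sxy Syy^-1 Syx are the squared canonical correlations
M   <- solve(Sxx, Sxy) %*% solve(Syy, t(Sxy))
ev  <- eigen(M)
rho <- sqrt(Re(ev$values))

# First pair of linear combinations (canonical variates)
a1 <- Re(ev$vectors[, 1])
b1 <- solve(Syy, t(Sxy)) %*% a1
u1 <- X %*% a1
v1 <- Y %*% b1

rho                       # canonical correlations from the eigenvalues
abs(cor(u1, v1))          # correlation of the first pair of variates
stats::cancor(X, Y)$cor   # reference values from base R
```

The correlation between u1 and v1 should agree with the first element of rho, which is the sense in which the first canonical correlation is the largest correlation attainable between linear combinations of the two sets.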
This example uses the chemicaldata dataset (Box and Youle 1955; Rencher 2002), available in the ACSWR package. The dataset contains the results of a chemical reaction experiment with the following variables:
- X1 = temperature
- X2 = concentration
- X3 = time
- Y1 = percentage of material that remained unchanged
- Y2 = percentage of material converted as desired
- Y3 = percentage of material converted in an undesired way
# Load libraries
library(CCA)
library(candisc)
##
## Attaching package: 'candisc'
## The following object is masked from 'package:stats':
##
## cancor
library(ACSWR)
# Load data
data(chemicaldata)
chem <- chemicaldata
X <- chem[, 4:6] # Independent variables (x1, x2, x3)
Y <- chem[, 1:3] # Dependent variables (y1, y2, y3)
library(GGally)
# Correlations among the independent variables (X)
ggpairs(X, title = "Correlations Among the Independent Variables (X)")
# Correlations among the dependent variables (Y)
ggpairs(Y, title = "Correlations Among the Dependent Variables (Y)")
chem_cc2 <- candisc::cancor(X, Y)
summary(chem_cc2)
##
## Canonical correlation analysis of:
## 3 X variables: x1, x2, x3
## with 3 Y variables: y1, y2, y3
##
## CanR CanRSQ Eigen percent cum scree
## 1 0.98153 0.963395 26.318349 99.60771 99.61 ******************************
## 2 0.30199 0.091200 0.100353 0.37981 99.99
## 3 0.05733 0.003287 0.003298 0.01248 100.00
##
## Test of H0: The canonical correlations in the
## current row and all that follow are zero
##
## CanR LR test stat approx F numDF denDF Pr(> F)
## 1 0.98153 0.03316 10.7870 9 31.789 1.884e-07 ***
## 2 0.30199 0.90581 0.3549 4 28.000 0.8384
## 3 0.05733 0.99671 0.0495 1 15.000 0.8270
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Raw canonical coefficients
##
## X variables:
## Xcan1 Xcan2 Xcan3
## x1 -0.16062 -0.06944 -0.049094
## x2 -0.14861 -0.12159 0.191179
## x3 -0.21568 0.58392 0.037688
##
## Y variables:
## Ycan1 Ycan2 Ycan3
## y1 0.170794 0.62594 0.37732
## y2 0.069097 0.72999 0.21730
## y3 0.085825 0.71274 0.53900
The code above uses the cancor function from the candisc package. The output shows a first canonical correlation of 0.9815, a second of 0.3020, and a third of 0.0573.
The squared first canonical correlation (CanRSQ), 0.963394567, is the proportion of variance in U1 that can be explained by the Y variable set and, equally, the proportion of variance in V1 that can be explained by the X variable set.
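The other columns of the summary table follow from the squared canonical correlations. The sketch below copies the CanRSQ values from the output and rebuilds the CanR, Eigen, percent, and cum columns, assuming Eigen = CanRSQ / (1 - CanRSQ); the results agree with the printed table up to rounding.

```r
# Squared canonical correlations copied from the CanRSQ column of the output
can_rsq <- c(0.963395, 0.091200, 0.003287)

can_r   <- sqrt(can_rsq)                 # CanR column
eigen_v <- can_rsq / (1 - can_rsq)       # Eigen column
percent <- 100 * eigen_v / sum(eigen_v)  # percent column
cum     <- cumsum(percent)               # cum column

round(data.frame(CanR = can_r, CanRSQ = can_rsq,
                 Eigen = eigen_v, percent = percent, cum = cum), 4)
```

Because the first eigenvalue accounts for almost all of the total, the first pair of canonical variates carries nearly all of the relationship between the two sets.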
The canonical variates, written with the raw coefficients from the output, are as follows (U denotes the canonical variates of the X set and V those of the Y set):
Canonical variates of the independent set (X variables):
- U1 = -0.1606 x1 - 0.1486 x2 - 0.2157 x3
- U2 = -0.0694 x1 - 0.1216 x2 + 0.5839 x3
- U3 = -0.0491 x1 + 0.1912 x2 + 0.0377 x3
Canonical variates of the dependent set (Y variables):
- V1 = 0.1708 y1 + 0.0691 y2 + 0.0858 y3
- V2 = 0.6259 y1 + 0.7300 y2 + 0.7127 y3
- V3 = 0.3773 y1 + 0.2173 y2 + 0.5390 y3
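To illustrate how such coefficient equations turn into scores, the following sketch centers each variable set, multiplies by the coefficient matrices, and checks that each pair of resulting variates has the reported canonical correlation. It relies on stats::cancor and the built-in LifeCycleSavings data as stand-ins (assumptions for illustration); the coefficient scaling used by stats::cancor may differ from the raw coefficients printed by candisc, but the correlations between paired variates are unaffected by scaling.

```r
# Stand-in data (assumption): two variable sets from a built-in dataset
X <- as.matrix(LifeCycleSavings[, c("pop15", "pop75")])
Y <- as.matrix(LifeCycleSavings[, c("sr", "dpi", "ddpi")])

fit <- stats::cancor(X, Y)

# Center each set with the means stored in the fit, then apply the coefficients
Xc <- sweep(X, 2, fit$xcenter)
Yc <- sweep(Y, 2, fit$ycenter)
U  <- Xc %*% fit$xcoef   # canonical variates of the X set
V  <- Yc %*% fit$ycoef   # canonical variates of the Y set

# Correlation within each pair (U_k, V_k) equals the k-th canonical correlation
k <- length(fit$cor)
pair_cor <- diag(cor(U[, 1:k], V[, 1:k]))

rbind(pair_cor = pair_cor, cancor = fit$cor)
```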
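In the test of H0 (the canonical correlations in the current row and all that follow are zero), only the first row is significant (p = 1.884e-07); the second and third rows are not (p = 0.8384 and p = 0.8270). Only the first pair of canonical variates therefore shows a significant relationship.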
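The LR test statistic in each row is a Wilks' lambda computed from the canonical correlations in that row and all rows below it. A small sketch, using the CanRSQ values copied from the output, reproduces that column up to rounding:

```r
# Squared canonical correlations copied from the CanRSQ column of the output
can_rsq <- c(0.963395, 0.091200, 0.003287)

# Lambda_k = product over i >= k of (1 - r_i^2)
wilks_seq <- rev(cumprod(rev(1 - can_rsq)))

round(wilks_seq, 5)   # compare with the "LR test stat" column
```

A value close to 0 indicates a strong remaining relationship, while a value close to 1 indicates that little is left once the earlier pairs are removed.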
To obtain the correlations between the original variables and their own canonical variates (canonical loadings), as well as the correlations between the original variables and the canonical variates of the opposite set (cross-loadings), use the cc function from the CCA package:
res.cc <- cc(X,Y)
res.cc$scores
## $xscores
## [,1] [,2] [,3]
## [1,] 2.230504539 -1.16584543 -0.67087105
## [2,] 1.152094308 1.75375064 -0.48242915
## [3,] 0.758876244 -0.84911235 0.74275969
## [4,] 0.111830106 0.90264530 0.85582484
## [5,] -0.104246314 -0.93557992 -0.70407353
## [6,] -0.751292453 0.81617773 -0.59100839
## [7,] -0.847292173 -1.54351231 0.25182217
## [8,] -1.494338312 0.20824534 0.36488731
## [9,] 0.003791896 -0.01646731 0.07587565
## [10,] -1.602376522 -0.71086727 -0.41506187
## [11,] 1.609960313 0.67793265 0.56681317
## [12,] -0.739253963 -0.62439970 1.03177136
## [13,] 0.746837755 0.59146508 -0.88002005
## [14,] -0.643254243 1.73529034 0.18894079
## [15,] 0.650838034 -1.76822496 -0.03718949
## [16,] -0.487807733 0.20103131 -1.84890543
## [17,] -0.487807733 0.20103131 -1.84890543
## [18,] -0.053531875 0.26321977 1.69988471
## [19,] -0.053531875 0.26321977 1.69988471
##
## $yscores
## [,1] [,2] [,3]
## [1,] 2.09778362 -1.1041111 0.611132567
## [2,] 1.29398950 -0.5219566 -0.686230525
## [3,] 0.67109241 -0.2051390 -1.266743130
## [4,] 0.01937793 -0.6597517 -1.469499523
## [5,] -0.14651111 -0.3299210 -1.649740910
## [6,] -0.61377318 -0.7334426 -0.614024293
## [7,] -0.88747010 -0.6938741 -0.516054185
## [8,] -1.50650496 -0.5426948 2.432288140
## [9,] -0.06693847 0.2285540 0.068736060
## [10,] -1.55848793 -2.1700100 -0.004206386
## [11,] 1.80117275 -0.3206936 1.243361361
## [12,] -0.18408905 0.3482939 0.150233450
## [13,] 0.73985934 1.0849391 0.680133375
## [14,] -0.95917850 0.5800591 0.567780019
## [15,] 0.51920558 -0.4512992 1.216491428
## [16,] -0.51895573 1.2512678 -0.064387766
## [17,] -0.40410523 1.6144702 -0.497884842
## [18,] -0.36477628 0.9335076 -0.109333771
## [19,] 0.06830943 1.6918019 -0.092051069
##
## $corr.X.xscores
## [,1] [,2] [,3]
## x1 -0.7002371 -0.2147010 -0.6808608
## x2 -0.2303823 -0.1471813 0.9619052
## x3 -0.4415774 0.8719836 0.2113147
##
## $corr.Y.xscores
## [,1] [,2] [,3]
## y1 0.9770283 -0.02342752 -0.00320610
## y2 -0.5921615 0.15884355 -0.03437028
## y3 -0.8455922 -0.02060390 0.02884731
##
## $corr.X.yscores
## [,1] [,2] [,3]
## x1 -0.6873014 -0.06483844 -0.03903680
## x2 -0.2261263 -0.04444788 0.05515033
## x3 -0.4334200 0.26333395 0.01211562
##
## $corr.Y.yscores
## [,1] [,2] [,3]
## y1 0.9954170 -0.07757607 -0.05591924
## y2 -0.6033066 0.52598224 -0.59946959
## y3 -0.8615072 -0.06822615 0.50314074
Correlations between X and U (the canonical variates of X)
res.cc$scores$corr.X.xscores
## [,1] [,2] [,3]
## x1 -0.7002371 -0.2147010 -0.6808608
## x2 -0.2303823 -0.1471813 0.9619052
## x3 -0.4415774 0.8719836 0.2113147
Correlations between Y and V (the canonical variates of Y)
res.cc$scores$corr.Y.yscores
## [,1] [,2] [,3]
## y1 0.9954170 -0.07757607 -0.05591924
## y2 -0.6033066 0.52598224 -0.59946959
## y3 -0.8615072 -0.06822615 0.50314074
Correlations between X and V (the canonical variates of Y)
res.cc$scores$corr.X.yscores
## [,1] [,2] [,3]
## x1 -0.6873014 -0.06483844 -0.03903680
## x2 -0.2261263 -0.04444788 0.05515033
## x3 -0.4334200 0.26333395 0.01211562
Correlations between Y and U (the canonical variates of X)
res.cc$scores$corr.Y.xscores
## [,1] [,2] [,3]
## y1 0.9770283 -0.02342752 -0.00320610
## y2 -0.5921615 0.15884355 -0.03437028
## y3 -0.8455922 -0.02060390 0.02884731
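The four matrices are linked: the correlation of a variable with the opposite set's k-th canonical variate equals its correlation with its own set's k-th variate multiplied by the k-th canonical correlation. The sketch below checks this for the X variables, using values copied from the output above; it should reproduce corr.X.yscores up to rounding.

```r
# Canonical correlations copied from the CanR column of the summary output
can_r <- c(0.98153, 0.30199, 0.05733)

# corr.X.xscores copied from the output above
load_x <- matrix(c(-0.7002371, -0.2147010, -0.6808608,
                   -0.2303823, -0.1471813,  0.9619052,
                   -0.4415774,  0.8719836,  0.2113147),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("x1", "x2", "x3"), NULL))

# Scale column k by the k-th canonical correlation
cross_x <- sweep(load_x, 2, can_r, "*")

round(cross_x, 4)   # compare with res.cc$scores$corr.X.yscores
```

This also explains why the second and third columns of corr.X.yscores and corr.Y.xscores are close to zero: they are scaled by canonical correlations of only 0.302 and 0.057.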
Hypothesis Test
# Extract the canonical correlations
canonical_cor <- res.cc$cor
eigen_values <- canonical_cor^2 # Squared canonical correlations
# Sample size (n), number of independent (p) and dependent (q) variables
n <- nrow(X)
p <- ncol(X)
q <- ncol(Y)
# Compute Wilks' Lambda for H0: all canonical correlations are zero
wilks_lambda <- prod(1 - eigen_values)
# Test statistic (Bartlett's chi-square approximation, p * q degrees of freedom)
df <- p * q
test_stat <- -((n - 1) - 0.5 * (p + q + 1)) * log(wilks_lambda)
# P-value
p_value <- pchisq(test_stat, df = df, lower.tail = FALSE)
list(
"Wilks' Lambda" = wilks_lambda,
"Test Statistic" = test_stat,
"Degrees of Freedom" = df,
"P-value" = p_value
)
## $`Wilks' Lambda`
## [1] 0.03315764
##
## $`Test Statistic`
## [1] 49.39399
##
## $`Degrees of Freedom`
## [1] 9
##
## $`P-value`
## [1] 1.4e-07
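The same chi-square approximation can be applied row by row, testing whether the k-th and all later canonical correlations are zero, with (p - k + 1)(q - k + 1) degrees of freedom. The sketch below is an illustration under stated assumptions: the Wilks' lambda values are copied from the LR test stat column of the summary output, n = 19 is taken from the number of rows in the scores, and the same multiplier is used for every row, which is a simplification (some texts adjust it for later rows).

```r
# Wilks' lambda per row, copied from the "LR test stat" column of the output
wilks <- c(0.03316, 0.90581, 0.99671)
n <- 19; p <- 3; q <- 3   # 19 observations, 3 X variables, 3 Y variables

k      <- seq_along(wilks)
chi_sq <- -((n - 1) - 0.5 * (p + q + 1)) * log(wilks)
df     <- (p - k + 1) * (q - k + 1)   # 9, 4, 1
p_val  <- pchisq(chi_sq, df = df, lower.tail = FALSE)

data.frame(k, wilks, chi_sq, df, p_val)
```

Under these assumptions only the first row should come out significant, in line with the F approximation reported by summary().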