This case study examines the relationship between education level and marital status in the adult population, based on census data.
The variables used in this analysis are education level (education) and marital status (marital_status).
Both variables are categorical, so they are suitable for Correspondence Analysis.
The aim of this analysis is to describe how the categories of education level are associated with the categories of marital status.
The dataset used in this analysis is secondary data obtained from a public repository, the UCI Machine Learning Repository. The dataset chosen is the Adult (Census Income) dataset, which contains census data on adults in the United States.
https://archive.ics.uci.edu/dataset/2/adult
The dataset records a range of demographic characteristics for each individual, such as age, education level, marital status, occupation, hours worked per week, and other socio-economic information. It is widely used for teaching and research in statistics and data science because it is open, well documented, and has a large number of observations.
From all the available variables, two categorical variables are selected for this analysis: education and marital_status.
# 1. Load the data directly from the public repository
#    (strip.white = TRUE trims the blank that follows each comma in the raw file)
url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"
adult <- read.csv(url, header = FALSE, stringsAsFactors = TRUE, strip.white = TRUE)
# 2. Name the columns according to the UCI documentation
colnames(adult) <- c(
  "age","workclass","fnlwgt","education","education_num",
  "marital_status","occupation","relationship","race","sex",
  "capital_gain","capital_loss","hours_per_week","native_country","income"
)
# 3. Select the two categorical variables
data_ca <- adult[, c("education", "marital_status")]
# 4. Rebuild the factors so that only observed levels remain (tidy for the contingency table)
data_ca$education <- factor(data_ca$education)
data_ca$marital_status <- factor(data_ca$marital_status)
# 5. Contingency table
table_count <- xtabs(~ education + marital_status, data = data_ca)
table_count
## marital_status
## education Divorced Married-AF-spouse Married-civ-spouse
## 10th 120 0 349
## 11th 130 0 354
## 12th 39 0 130
## 1st-4th 10 0 81
## 5th-6th 20 0 172
## 7th-8th 73 0 359
## 9th 64 0 230
## Assoc-acdm 203 2 460
## Assoc-voc 234 1 689
## Bachelors 546 4 2768
## Doctorate 33 0 286
## HS-grad 1613 13 4845
## Masters 233 0 1003
## Preschool 1 0 20
## Prof-school 55 0 412
## Some-college 1069 3 2818
## marital_status
## education Married-spouse-absent Never-married Separated Widowed
## 10th 15 361 49 39
## 11th 19 586 48 38
## 12th 8 232 14 10
## 1st-4th 12 39 9 17
## 5th-6th 20 89 18 14
## 7th-8th 14 113 23 64
## 9th 9 155 33 23
## Assoc-acdm 12 337 30 23
## Assoc-voc 13 362 42 41
## Bachelors 68 1795 92 82
## Doctorate 7 73 7 7
## HS-grad 121 3089 406 414
## Masters 17 404 25 41
## Preschool 4 22 1 3
## Prof-school 3 93 8 5
## Some-college 76 2933 220 172
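Fields in adult.data are separated by a comma followed by a space, so whether the factor levels carry a leading blank depends on the strip.white argument of read.csv. The following is a minimal sketch on two made-up inline records with three illustrative fields; the object names raw, padded, and clean are assumptions for the example only.

```r
# Two records in the same "comma + space" layout as adult.data,
# reduced to three illustrative fields (age, education, marital status).
raw <- "39, Bachelors, Never-married\n50, HS-grad, Divorced"

# Default (strip.white = FALSE): the blank after each comma stays in the level.
padded <- read.csv(text = raw, header = FALSE, stringsAsFactors = TRUE)
levels(padded$V2)

# strip.white = TRUE trims blanks around unquoted fields.
clean <- read.csv(text = raw, header = FALSE, stringsAsFactors = TRUE,
                  strip.white = TRUE)
levels(clean$V2)
```

With padded levels, a comparison such as education == "HS-grad" would match nothing, which is why trimming at load time is usually the safer choice.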
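rn <- sum(table_count)   # grand total of the table (not used in the steps below)
P <- table_count         # P holds the raw counts here, not proportions (table_count / rn)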
P
## marital_status
## education Divorced Married-AF-spouse Married-civ-spouse
## 10th 120 0 349
## 11th 130 0 354
## 12th 39 0 130
## 1st-4th 10 0 81
## 5th-6th 20 0 172
## 7th-8th 73 0 359
## 9th 64 0 230
## Assoc-acdm 203 2 460
## Assoc-voc 234 1 689
## Bachelors 546 4 2768
## Doctorate 33 0 286
## HS-grad 1613 13 4845
## Masters 233 0 1003
## Preschool 1 0 20
## Prof-school 55 0 412
## Some-college 1069 3 2818
## marital_status
## education Married-spouse-absent Never-married Separated Widowed
## 10th 15 361 49 39
## 11th 19 586 48 38
## 12th 8 232 14 10
## 1st-4th 12 39 9 17
## 5th-6th 20 89 18 14
## 7th-8th 14 113 23 64
## 9th 9 155 33 23
## Assoc-acdm 12 337 30 23
## Assoc-voc 13 362 42 41
## Bachelors 68 1795 92 82
## Doctorate 7 73 7 7
## HS-grad 121 3089 406 414
## Masters 17 404 25 41
## Preschool 4 22 1 3
## Prof-school 3 93 8 5
## Some-college 76 2933 220 172
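In the usual formulation of correspondence analysis, P is the table of counts divided by its grand total, so that its entries are proportions that sum to 1 and its margins are the row and column masses. The code above computes the grand total rn but assigns the counts themselves to P. A minimal sketch of the normalised version, on a 3 x 3 subset of the counts printed above (the names N_sub, P_sub, r_sub, and c_sub are illustrative, not part of the original analysis):

```r
# 3 x 3 subset of the contingency table printed above (counts).
N_sub <- matrix(c( 546, 2768, 1795,
                  1613, 4845, 3089,
                   233, 1003,  404),
                nrow = 3, byrow = TRUE,
                dimnames = list(
                  education = c("Bachelors", "HS-grad", "Masters"),
                  marital_status = c("Divorced", "Married-civ-spouse",
                                     "Never-married")))

n_sub <- sum(N_sub)        # grand total of the subset
P_sub <- N_sub / n_sub     # correspondence matrix: proportions, not counts

r_sub <- rowSums(P_sub)    # row masses
c_sub <- colSums(P_sub)    # column masses

sum(P_sub)                 # equals 1 up to floating-point error
```

Because this is only a subset of the table, the masses it produces are not those of the full analysis; the sketch is meant to show the normalisation step alone.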
Row weight vector (row totals)
r <- rowSums(P)
r
## 10th 11th 12th 1st-4th 5th-6th
## 933 1175 433 168 333
## 7th-8th 9th Assoc-acdm Assoc-voc Bachelors
## 646 514 1067 1382 5355
## Doctorate HS-grad Masters Preschool Prof-school
## 413 10501 1723 51 576
## Some-college
## 7291
Column weight vector (column totals)
c <- colSums(P)
c
## Divorced Married-AF-spouse Married-civ-spouse
## 4443 23 14976
## Married-spouse-absent Never-married Separated
## 418 10683 1025
## Widowed
## 993
Dr <- diag(r)
Dc <- diag(c)
Dr
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## [1,] 933 0 0 0 0 0 0 0 0 0 0 0 0
## [2,] 0 1175 0 0 0 0 0 0 0 0 0 0 0
## [3,] 0 0 433 0 0 0 0 0 0 0 0 0 0
## [4,] 0 0 0 168 0 0 0 0 0 0 0 0 0
## [5,] 0 0 0 0 333 0 0 0 0 0 0 0 0
## [6,] 0 0 0 0 0 646 0 0 0 0 0 0 0
## [7,] 0 0 0 0 0 0 514 0 0 0 0 0 0
## [8,] 0 0 0 0 0 0 0 1067 0 0 0 0 0
## [9,] 0 0 0 0 0 0 0 0 1382 0 0 0 0
## [10,] 0 0 0 0 0 0 0 0 0 5355 0 0 0
## [11,] 0 0 0 0 0 0 0 0 0 0 413 0 0
## [12,] 0 0 0 0 0 0 0 0 0 0 0 10501 0
## [13,] 0 0 0 0 0 0 0 0 0 0 0 0 1723
## [14,] 0 0 0 0 0 0 0 0 0 0 0 0 0
## [15,] 0 0 0 0 0 0 0 0 0 0 0 0 0
## [16,] 0 0 0 0 0 0 0 0 0 0 0 0 0
## [,14] [,15] [,16]
## [1,] 0 0 0
## [2,] 0 0 0
## [3,] 0 0 0
## [4,] 0 0 0
## [5,] 0 0 0
## [6,] 0 0 0
## [7,] 0 0 0
## [8,] 0 0 0
## [9,] 0 0 0
## [10,] 0 0 0
## [11,] 0 0 0
## [12,] 0 0 0
## [13,] 0 0 0
## [14,] 51 0 0
## [15,] 0 576 0
## [16,] 0 0 7291
Dc
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 4443 0 0 0 0 0 0
## [2,] 0 23 0 0 0 0 0
## [3,] 0 0 14976 0 0 0 0
## [4,] 0 0 0 418 0 0 0
## [5,] 0 0 0 0 10683 0 0
## [6,] 0 0 0 0 0 1025 0
## [7,] 0 0 0 0 0 0 993
### Row Profile Analysis
F <- solve(Dr) %*% P   # row profiles: each row of P divided by its row total
F
## marital_status
## Divorced Married-AF-spouse Married-civ-spouse
## [1,] 0.12861736 0.0000000000 0.3740622
## [2,] 0.11063830 0.0000000000 0.3012766
## [3,] 0.09006928 0.0000000000 0.3002309
## [4,] 0.05952381 0.0000000000 0.4821429
## [5,] 0.06006006 0.0000000000 0.5165165
## [6,] 0.11300310 0.0000000000 0.5557276
## [7,] 0.12451362 0.0000000000 0.4474708
## [8,] 0.19025305 0.0018744142 0.4311153
## [9,] 0.16931983 0.0007235890 0.4985528
## [10,] 0.10196078 0.0007469655 0.5169001
## [11,] 0.07990315 0.0000000000 0.6924939
## [12,] 0.15360442 0.0012379773 0.4613846
## [13,] 0.13522925 0.0000000000 0.5821242
## [14,] 0.01960784 0.0000000000 0.3921569
## [15,] 0.09548611 0.0000000000 0.7152778
## [16,] 0.14661912 0.0004114662 0.3865039
## marital_status
## Married-spouse-absent Never-married Separated Widowed
## [1,] 0.016077170 0.3869239 0.05251876 0.041800643
## [2,] 0.016170213 0.4987234 0.04085106 0.032340426
## [3,] 0.018475751 0.5357968 0.03233256 0.023094688
## [4,] 0.071428571 0.2321429 0.05357143 0.101190476
## [5,] 0.060060060 0.2672673 0.05405405 0.042042042
## [6,] 0.021671827 0.1749226 0.03560372 0.099071207
## [7,] 0.017509728 0.3015564 0.06420233 0.044747082
## [8,] 0.011246485 0.3158388 0.02811621 0.021555764
## [9,] 0.009406657 0.2619392 0.03039074 0.029667149
## [10,] 0.012698413 0.3352007 0.01718021 0.015312792
## [11,] 0.016949153 0.1767554 0.01694915 0.016949153
## [12,] 0.011522712 0.2941625 0.03866298 0.039424817
## [13,] 0.009866512 0.2344748 0.01450958 0.023795705
## [14,] 0.078431373 0.4313725 0.01960784 0.058823529
## [15,] 0.005208333 0.1614583 0.01388889 0.008680556
## [16,] 0.010423810 0.4022768 0.03017419 0.023590728
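The product solve(Dr) %*% P drops the education labels, which is why the rows above are printed as [1,] to [16,]. As a hedged alternative, base R's prop.table computes the same row profiles and keeps the dimnames. The sketch below uses two complete rows of the contingency table (Doctorate and Preschool); the object names are illustrative.

```r
status <- c("Divorced", "Married-AF-spouse", "Married-civ-spouse",
            "Married-spouse-absent", "Never-married", "Separated", "Widowed")

# Two complete rows of the contingency table (counts).
N_rows <- matrix(c(33, 0, 286, 7, 73, 7, 7,
                    1, 0,  20, 4, 22, 1, 3),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(education = c("Doctorate", "Preschool"),
                                 marital_status = status))

# Row profiles with labels preserved: each row divided by its own total.
F_named <- prop.table(N_rows, margin = 1)

# Same numbers through the diagonal-matrix route used in the text.
F_solve <- solve(diag(rowSums(N_rows))) %*% N_rows

round(F_named, 4)
rowSums(F_named)   # every row profile sums to 1
```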
Based on the row profiles, the relationship between education level and marital status can be read as follows. Each value is the proportion of respondents at a given education level who hold a given marital status, and the rows follow the order of the contingency table ([1,] is 10th and [16,] is Some-college).
In the Never-married column, the highest proportions belong to 12th (0.54) and 11th (0.50), followed by Preschool (0.43) and Some-college (0.40). Respondents whose schooling ended in the upper secondary grades, or who have some college education, are therefore the most likely to be never married, while Prof-school (0.16), 7th-8th (0.17), and Doctorate (0.18) have the lowest proportions.
In the Married-civ-spouse column, Prof-school (0.72) and Doctorate (0.69) have the highest proportions, followed by Masters (0.58). Bachelors (0.52) is also above most lower levels, although 7th-8th (0.56) and 5th-6th (0.52) are comparably high. Overall, respondents with higher education tend to be married (civilian spouse).
In the Divorced column, HS-grad has the largest count (1,613), but its proportion (0.15) is only the third highest, after Assoc-acdm (0.19) and Assoc-voc (0.17), and close to Some-college (0.15). Divorce is therefore relatively most common among respondents with an associate degree, and least common at the Preschool, 1st-4th, and 5th-6th levels (0.02 to 0.06).
In the Separated column, the largest proportions are found at the lower education levels: 9th (0.064), 5th-6th (0.054), 1st-4th (0.054), and 10th (0.053). HS-grad (0.039) and Some-college (0.030) lie in the middle, and the smallest proportions belong to Prof-school (0.014), Masters (0.015), Bachelors (0.017), and Doctorate (0.017). Respondents with lower education are thus more likely to be separated than those with higher education.
In the Widowed column, the proportions are largest for 1st-4th (0.101) and 7th-8th (0.099), followed by Preschool (0.059). HS-grad (0.039) is moderate, and Prof-school (0.009), Bachelors (0.015), and Doctorate (0.017) are the lowest. Widowed status is therefore relatively more common among respondents with little formal education.
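Reading a column of the row-profile table means comparing proportions, not counts. A large category such as HS-grad has the largest count in almost every column simply because of its size, while the row profiles remove that size effect. The sketch below contrasts the two readings on five complete rows of the contingency table; the choice of rows and the object names are illustrative, so the winners are only the best among these five rows.

```r
status <- c("Divorced", "Married-AF-spouse", "Married-civ-spouse",
            "Married-spouse-absent", "Never-married", "Separated", "Widowed")
edu5 <- c("12th", "Assoc-acdm", "HS-grad", "Prof-school", "Some-college")

# Five complete rows of the contingency table (counts).
N_rows <- matrix(c(  39,  0,  130,   8,  232,  14,  10,   # 12th
                    203,  2,  460,  12,  337,  30,  23,   # Assoc-acdm
                   1613, 13, 4845, 121, 3089, 406, 414,   # HS-grad
                     55,  0,  412,   3,   93,   8,   5,   # Prof-school
                   1069,  3, 2818,  76, 2933, 220, 172),  # Some-college
                 nrow = 5, byrow = TRUE, dimnames = list(edu5, status))

profiles <- prop.table(N_rows, margin = 1)   # row profiles

# For each marital status: which of the five rows is largest?
top_by_count   <- edu5[apply(N_rows,   2, which.max)]
top_by_profile <- edu5[apply(profiles, 2, which.max)]

data.frame(status, top_by_count, top_by_profile)
```

Among these five rows, HS-grad leads every column by count, whereas by proportion the leaders include Assoc-acdm for Divorced, Prof-school for Married-civ-spouse, and 12th for Never-married.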
mass_c <- colSums(F)   # column sums of the row profiles (they add up to 16, the number of rows)
mass_c
## Divorced Married-AF-spouse Married-civ-spouse
## 1.778409077 0.004994412 7.653936973
## Married-spouse-absent Never-married Separated
## 0.387146767 5.010812307 0.542613706
## Widowed
## 0.622086757
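These values are the column sums of the row profiles, so they add up to 16 (the number of education categories) rather than to 1, and they weight every education level equally regardless of its size. They are dominated by Married-civ-spouse (7.65) and Never-married (5.01), the two statuses that take the largest share within most education levels and therefore carry most of the weight in the structure of the correspondence analysis.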
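In standard correspondence analysis the column masses are the column sums of the proportion matrix, which is the same as the average of the row profiles weighted by the row masses. The unweighted column sums of the row profiles are a different quantity. The sketch below shows both on a 3 x 3 subset of the counts, treated as a small table of its own; all object names are illustrative.

```r
# 3 x 3 subset of the contingency table (counts).
N_sub <- matrix(c( 546, 2768, 1795,
                  1613, 4845, 3089,
                   233, 1003,  404),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("Bachelors", "HS-grad", "Masters"),
                                c("Divorced", "Married-civ-spouse",
                                  "Never-married")))

P_sub <- N_sub / sum(N_sub)              # proportions
r_sub <- rowSums(P_sub)                  # row masses
c_sub <- colSums(P_sub)                  # column masses
F_sub <- prop.table(P_sub, margin = 1)   # row profiles

# Unweighted sum of the row profiles: totals the number of rows (3), not 1.
unweighted <- colSums(F_sub)

# Mass-weighted average of the row profiles: reproduces the column masses.
weighted <- drop(r_sub %*% F_sub)

rbind(unweighted, weighted, c_sub)
```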
G <- P %*% solve(Dc)   # column profiles: each column of P divided by its column total
G
##
## education [,1] [,2] [,3] [,4] [,5]
## 10th 0.0270087779 0.00000000 0.023303953 0.035885167 0.033792006
## 11th 0.0292595093 0.00000000 0.023637821 0.045454545 0.054853506
## 12th 0.0087778528 0.00000000 0.008680556 0.019138756 0.021716746
## 1st-4th 0.0022507315 0.00000000 0.005408654 0.028708134 0.003650660
## 5th-6th 0.0045014630 0.00000000 0.011485043 0.047846890 0.008330993
## 7th-8th 0.0164303399 0.00000000 0.023971688 0.033492823 0.010577553
## 9th 0.0144046815 0.00000000 0.015357906 0.021531100 0.014509033
## Assoc-acdm 0.0456898492 0.08695652 0.030715812 0.028708134 0.031545446
## Assoc-voc 0.0526671168 0.04347826 0.046006944 0.031100478 0.033885613
## Bachelors 0.1228899392 0.17391304 0.184829060 0.162679426 0.168023963
## Doctorate 0.0074274139 0.00000000 0.019097222 0.016746411 0.006833287
## HS-grad 0.3630429890 0.56521739 0.323517628 0.289473684 0.289150988
## Masters 0.0524420437 0.00000000 0.066973825 0.040669856 0.037817093
## Preschool 0.0002250731 0.00000000 0.001335470 0.009569378 0.002059347
## Prof-school 0.0123790232 0.00000000 0.027510684 0.007177033 0.008705420
## Some-college 0.2406031960 0.13043478 0.188167735 0.181818182 0.274548348
##
## education [,6] [,7]
## 10th 0.0478048780 0.039274924
## 11th 0.0468292683 0.038267875
## 12th 0.0136585366 0.010070493
## 1st-4th 0.0087804878 0.017119839
## 5th-6th 0.0175609756 0.014098691
## 7th-8th 0.0224390244 0.064451158
## 9th 0.0321951220 0.023162135
## Assoc-acdm 0.0292682927 0.023162135
## Assoc-voc 0.0409756098 0.041289023
## Bachelors 0.0897560976 0.082578046
## Doctorate 0.0068292683 0.007049345
## HS-grad 0.3960975610 0.416918429
## Masters 0.0243902439 0.041289023
## Preschool 0.0009756098 0.003021148
## Prof-school 0.0078048780 0.005035247
## Some-college 0.2146341463 0.173212487
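In the column profiles above, HS-grad has the largest entry in every column, largely because it is the biggest education category. One hedged way to separate size from association is to divide each column profile by the row masses: a ratio above 1 means the education level is over-represented within that marital status. The sketch uses the row totals and the Divorced and Widowed columns as printed earlier; the names row_total, G_cols, and ratio are illustrative.

```r
edu <- c("10th", "11th", "12th", "1st-4th", "5th-6th", "7th-8th", "9th",
         "Assoc-acdm", "Assoc-voc", "Bachelors", "Doctorate", "HS-grad",
         "Masters", "Preschool", "Prof-school", "Some-college")

# Row totals and two full columns, copied from the output above.
row_total <- c(933, 1175, 433, 168, 333, 646, 514, 1067, 1382, 5355,
               413, 10501, 1723, 51, 576, 7291)
divorced  <- c(120, 130, 39, 10, 20, 73, 64, 203, 234, 546,
               33, 1613, 233, 1, 55, 1069)
widowed   <- c(39, 38, 10, 17, 14, 64, 23, 23, 41, 82,
               7, 414, 41, 3, 5, 172)

N_cols <- cbind(Divorced = divorced, Widowed = widowed)
rownames(N_cols) <- edu

G_cols   <- prop.table(N_cols, margin = 2)   # column profiles (columns sum to 1)
row_mass <- row_total / sum(row_total)       # share of each education level overall

# row_mass has one entry per row, so it is recycled down each column.
ratio <- G_cols / row_mass

round(ratio[c("HS-grad", "7th-8th", "Assoc-acdm"), ], 2)
```

On this reading, HS-grad is only mildly over-represented among the widowed, while a much smaller category such as 7th-8th is over-represented by a far larger factor.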
mass_r <- rowSums(G)   # row sums of the column profiles (they add up to 7, the number of columns)
mass_r
## 10th 11th 12th 1st-4th 5th-6th
## 0.20706971 0.23830252 0.08204294 0.06591851 0.10382406
## 7th-8th 9th Assoc-acdm Assoc-voc Bachelors
## 0.17136259 0.12115998 0.27604619 0.28940305 0.98466958
## Doctorate HS-grad Masters Preschool Prof-school
## 0.06398295 2.64341867 0.26358208 0.01718603 0.06861229
## Some-college
## 1.40341888
P0 <- r %*% t(c)   # outer product of the raw row and column totals (not divided by the grand total)
P0
## Divorced Married-AF-spouse Married-civ-spouse Married-spouse-absent
## [1,] 4145319 21459 13972608 389994
## [2,] 5220525 27025 17596800 491150
## [3,] 1923819 9959 6484608 180994
## [4,] 746424 3864 2515968 70224
## [5,] 1479519 7659 4987008 139194
## [6,] 2870178 14858 9674496 270028
## [7,] 2283702 11822 7697664 214852
## [8,] 4740681 24541 15979392 446006
## [9,] 6140226 31786 20696832 577676
## [10,] 23792265 123165 80196480 2238390
## [11,] 1834959 9499 6185088 172634
## [12,] 46655943 241523 157262976 4389418
## [13,] 7655289 39629 25803648 720214
## [14,] 226593 1173 763776 21318
## [15,] 2559168 13248 8626176 240768
## [16,] 32393913 167693 109190016 3047638
## Never-married Separated Widowed
## [1,] 9967239 956325 926469
## [2,] 12552525 1204375 1166775
## [3,] 4625739 443825 429969
## [4,] 1794744 172200 166824
## [5,] 3557439 341325 330669
## [6,] 6901218 662150 641478
## [7,] 5491062 526850 510402
## [8,] 11398761 1093675 1059531
## [9,] 14763906 1416550 1372326
## [10,] 57207465 5488875 5317515
## [11,] 4412079 423325 410109
## [12,] 112182183 10763525 10427493
## [13,] 18406809 1766075 1710939
## [14,] 544833 52275 50643
## [15,] 6153408 590400 571968
## [16,] 77889753 7473275 7239963
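The entries of P0 above are products of raw totals (for example 933 x 4443 = 4145319 for 10th and Divorced), so they are not on the scale of the observed counts. Under independence the expected count is that product divided by the grand total, which for this cell is about 127.3 against 120 observed. Equivalently, with masses instead of totals, the outer product gives expected proportions. A minimal sketch on a 3 x 3 subset of the counts, treated as its own table, cross-checked against the expected counts reported by chisq.test; the object names are illustrative.

```r
# 3 x 3 subset of the contingency table (counts).
N_sub <- matrix(c( 546, 2768, 1795,
                  1613, 4845, 3089,
                   233, 1003,  404),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("Bachelors", "HS-grad", "Masters"),
                                c("Divorced", "Married-civ-spouse",
                                  "Never-married")))

n_sub <- sum(N_sub)
P_sub <- N_sub / n_sub
r_sub <- rowSums(P_sub)          # row masses
c_sub <- colSums(P_sub)          # column masses

# Expected proportions under independence: outer product of the masses.
P0_sub <- r_sub %o% c_sub

# Rescaled to counts, this matches the expected table from chisq.test().
expected_counts <- chisq.test(N_sub)$expected
all.equal(P0_sub * n_sub, expected_counts, check.attributes = FALSE)
```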
Dev <- P - P0   # deviation of the observed counts from P0
Dev
## marital_status
## education Divorced Married-AF-spouse Married-civ-spouse
## 10th -4145199 -21459 -13972259
## 11th -5220395 -27025 -17596446
## 12th -1923780 -9959 -6484478
## 1st-4th -746414 -3864 -2515887
## 5th-6th -1479499 -7659 -4986836
## 7th-8th -2870105 -14858 -9674137
## 9th -2283638 -11822 -7697434
## Assoc-acdm -4740478 -24539 -15978932
## Assoc-voc -6139992 -31785 -20696143
## Bachelors -23791719 -123161 -80193712
## Doctorate -1834926 -9499 -6184802
## HS-grad -46654330 -241510 -157258131
## Masters -7655056 -39629 -25802645
## Preschool -226592 -1173 -763756
## Prof-school -2559113 -13248 -8625764
## Some-college -32392844 -167690 -109187198
## marital_status
## education Married-spouse-absent Never-married Separated Widowed
## 10th -389979 -9966878 -956276 -926430
## 11th -491131 -12551939 -1204327 -1166737
## 12th -180986 -4625507 -443811 -429959
## 1st-4th -70212 -1794705 -172191 -166807
## 5th-6th -139174 -3557350 -341307 -330655
## 7th-8th -270014 -6901105 -662127 -641414
## 9th -214843 -5490907 -526817 -510379
## Assoc-acdm -445994 -11398424 -1093645 -1059508
## Assoc-voc -577663 -14763544 -1416508 -1372285
## Bachelors -2238322 -57205670 -5488783 -5317433
## Doctorate -172627 -4412006 -423318 -410102
## HS-grad -4389297 -112179094 -10763119 -10427079
## Masters -720197 -18406405 -1766050 -1710898
## Preschool -21314 -544811 -52274 -50640
## Prof-school -240765 -6153315 -590392 -571963
## Some-college -3047562 -77886820 -7473055 -7239791
S <- solve(sqrt(Dr)) %*% Dev %*% solve(sqrt(Dc))   # deviations scaled by the square roots of the row and column totals
S
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] -2035.9467 -146.48891 -3737.9018 -624.4710 -3156.9791 -977.8686
## [2,] -2284.7899 -164.39282 -4194.7696 -700.7938 -3542.7889 -1097.3965
## [3,] -1386.9899 -99.79479 -2546.4383 -425.4151 -2150.6453 -666.1809
## [4,] -863.9468 -62.16108 -1586.1293 -264.9528 -1339.6514 -414.9482
## [5,] -1216.3384 -87.51571 -2233.0840 -373.0335 -1886.0703 -584.1995
## [6,] -1694.1169 -121.89340 -3110.2698 -519.6152 -2626.9739 -813.6977
## [7,] -1511.1499 -108.72902 -2774.3835 -463.5019 -2343.2354 -725.7989
## [8,] -2177.2173 -156.64291 -3997.3081 -667.8188 -3376.1053 -1045.7605
## [9,] -2477.8535 -178.28068 -4549.2257 -760.0329 -3842.2886 -1190.1538
## [10,] -4877.6196 -350.93732 -8954.9396 -1496.0795 -7563.3243 -2342.7956
## [11,] -1354.5822 -97.46281 -2486.8687 -415.4756 -2100.4602 -650.6235
## [12,] -6830.2794 -491.42344 -12540.0672 -2095.0360 -10591.3174 -3280.6574
## [13,] -2766.7351 -199.07034 -5079.5317 -848.6342 -4290.2216 -1328.9187
## [14,] -476.0158 -34.24909 -873.9199 -145.9795 -738.0982 -228.6329
## [15,] -1599.7056 -115.09996 -2936.8950 -490.6750 -2480.5689 -768.3645
## [16,] -5691.3773 -409.49603 -10449.1330 -1745.7050 -8825.1848 -2733.6487
## [,7]
## [1,] -962.4921
## [2,] -1080.1384
## [3,] -655.7050
## [4,] -408.3993
## [5,] -575.0139
## [6,] -800.8433
## [7,] -714.3920
## [8,] -1029.3129
## [9,] -1171.4282
## [10,] -2305.9382
## [11,] -640.3866
## [12,] -3229.0347
## [13,] -1307.9973
## [14,] -225.0267
## [15,] -756.2790
## [16,] -2690.6540
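When P holds proportions and r and c are masses, the matrix of standardized residuals has entries strictly between -1 and 1, and the sum of its squares (the total inertia) equals the Pearson chi-square statistic divided by the grand total. Values in the thousands, as above, are a sign that counts and products of totals are being mixed. A minimal sketch of that check on a 3 x 3 subset of the counts, treated as its own table; the object names are illustrative.

```r
# 3 x 3 subset of the contingency table (counts).
N_sub <- matrix(c( 546, 2768, 1795,
                  1613, 4845, 3089,
                   233, 1003,  404),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("Bachelors", "HS-grad", "Masters"),
                                c("Divorced", "Married-civ-spouse",
                                  "Never-married")))

n_sub <- sum(N_sub)
P_sub <- N_sub / n_sub
r_sub <- rowSums(P_sub)
c_sub <- colSums(P_sub)

# Standardized residuals: Dr^(-1/2) (P - r c') Dc^(-1/2), with masses.
S_sub <- diag(1 / sqrt(r_sub)) %*%
  (P_sub - r_sub %o% c_sub) %*%
  diag(1 / sqrt(c_sub))

inertia <- sum(S_sub^2)                                    # total inertia
chi2    <- unname(chisq.test(N_sub, correct = FALSE)$statistic)

all.equal(inertia, chi2 / n_sub)                           # the two agree
range(S_sub)                                               # stays inside (-1, 1)
```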
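# Singular value decomposition of S
svd_S <- svd(S)
U <- svd_S$u          # left singular vectors (education categories)
D <- diag(svd_S$d)    # singular values on the diagonal
V <- svd_S$v          # right singular vectors (marital-status categories)
# Plot coordinates: singular vectors scaled by the singular values
X_plot <- U %*% D
Y_plot <- V %*% D
rownames(X_plot) <- rownames(P)
rownames(Y_plot) <- colnames(P)
# Education categories as blue circles; axis limits cover both point sets
plot(X_plot[, 1], X_plot[, 2],
     xlab = "Dimension 1",
     ylab = "Dimension 2",
     pch = 16,
     col = "blue",
     xlim = range(c(X_plot[, 1], Y_plot[, 1])),
     ylim = range(c(X_plot[, 2], Y_plot[, 2])))
text(X_plot[, 1], X_plot[, 2], rownames(X_plot), pos = 3, col = "blue")
# Marital-status categories as red triangles
points(Y_plot[, 1], Y_plot[, 2], pch = 17, col = "red")
text(Y_plot[, 1], Y_plot[, 2], rownames(Y_plot), pos = 3, col = "red")
# Reference lines through the origin
abline(h = 0, v = 0, lty = 2)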
INTERPRETATION:
1. Never-married lies far to the right on Dimension 1 and is closest to the lower-to-middle education levels Some-college and HS-grad. This indicates that the never-married group tends to be associated with these education levels.
2. Married-civ-spouse and Married-spouse-absent lie near the centre and relatively close to intermediate education levels such as Assoc-acdm, Assoc-voc, and Bachelors, which points to a more varied, less extreme education profile.
3. Widowed, Divorced, and Separated lie on the right side of Dimension 1 but closer to the zero line of Dimension 2, which indicates an education pattern that is not very specific yet differs from that of the married groups.
4. Higher education levels (Masters and Prof-school) tend to lie in the upper part of Dimension 2 and relatively close to Married-civ-spouse, which indicates a link between higher education and married status.
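The map interpreted above comes from svd(S) with S built from raw counts and products of totals. In that setting the first dimension can largely reflect category size rather than association. As a hedged cross-check, the sketch below runs the standard computation on the full contingency table: proportions, masses, standardized residuals, and principal coordinates (singular vectors rescaled by the masses). The counts are copied from the table printed earlier; names such as cm, row_pc, and col_pc are illustrative, and no output is shown because the sketch was not run as part of the original analysis.

```r
edu <- c("10th", "11th", "12th", "1st-4th", "5th-6th", "7th-8th", "9th",
         "Assoc-acdm", "Assoc-voc", "Bachelors", "Doctorate", "HS-grad",
         "Masters", "Preschool", "Prof-school", "Some-college")
status <- c("Divorced", "Married-AF-spouse", "Married-civ-spouse",
            "Married-spouse-absent", "Never-married", "Separated", "Widowed")

# Counts copied row by row from the contingency table printed earlier.
N <- matrix(c(
   120,  0,  349,  15,  361,  49,  39,
   130,  0,  354,  19,  586,  48,  38,
    39,  0,  130,   8,  232,  14,  10,
    10,  0,   81,  12,   39,   9,  17,
    20,  0,  172,  20,   89,  18,  14,
    73,  0,  359,  14,  113,  23,  64,
    64,  0,  230,   9,  155,  33,  23,
   203,  2,  460,  12,  337,  30,  23,
   234,  1,  689,  13,  362,  42,  41,
   546,  4, 2768,  68, 1795,  92,  82,
    33,  0,  286,   7,   73,   7,   7,
  1613, 13, 4845, 121, 3089, 406, 414,
   233,  0, 1003,  17,  404,  25,  41,
     1,  0,   20,   4,   22,   1,   3,
    55,  0,  412,   3,   93,   8,   5,
  1069,  3, 2818,  76, 2933, 220, 172),
  nrow = 16, byrow = TRUE, dimnames = list(edu, status))

n  <- sum(N)               # grand total
P  <- N / n                # correspondence matrix (proportions)
r  <- rowSums(P)           # row masses
cm <- colSums(P)           # column masses ("cm" avoids reusing the name of base::c)
E  <- r %o% cm             # expected proportions under independence
S  <- (P - E) / sqrt(E)    # standardized residuals

dec     <- svd(S)
inertia <- dec$d^2                        # principal inertias
share   <- 100 * inertia / sum(inertia)   # percent of total inertia per dimension

# Principal coordinates: row i of u is divided by sqrt(r[i]), then scaled by d.
row_pc <- (dec$u / sqrt(r))  %*% diag(dec$d)
col_pc <- (dec$v / sqrt(cm)) %*% diag(dec$d)
rownames(row_pc) <- edu
rownames(col_pc) <- status

round(share[1:2], 1)       # share of inertia displayed by a two-dimensional map
```

The first two columns of row_pc and col_pc can be passed to the same plot, points, and text calls used above. The share vector indicates how much of the total association a two-dimensional map actually displays, which is worth reporting next to any interpretation of the map.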