library(knitr)
library(kableExtra)
library(epitools)
library(vcd)
library(DescTools)
library(ggplot2)
library(dplyr)
library(tidyr)
library(scales)
library(RColorBrewer)
Two-way contingency tables are one of the main tools in categorical data analysis for exploring and testing the relationship between two categorical variables. This assignment carries out inferential analysis for two cases: a 2×2 table relating smoking status to lung cancer, and a 2×3 table relating gender to political party identification.
In the analysis of 2×2 contingency tables, three main measures of association quantify the strength of the relationship between two categorical variables: Risk Difference (RD), Risk Ratio (RR), and Odds Ratio (OR).
Definition: The Risk Difference (RD) is an absolute measure of association: the difference in the probability of the event (risk) between the exposed and the unexposed group.
\[\text{RD} = p_1 - p_2\]
where \(p_1\) is the proportion of events in the exposed group and \(p_2\) the proportion in the unexposed group.
Interpretation:
For example, \(\text{RD} = 0{,}25\) means the risk in the exposed group is 25 percentage points higher than in the unexposed group. RD is intuitive because it states the difference in probability directly, and it is useful for measuring absolute impact in public health policy.
Definition: The Risk Ratio (RR), or relative risk, is a relative measure of association that compares the risk in the two groups proportionally.
\[\text{RR} = \frac{p_1}{p_2}\]
Interpretation:
For example, \(\text{RR} = 2\) means the exposed group has twice the risk of the unexposed group. RR is easy to interpret, but it cannot be computed directly in a case-control design because the proportion of cases is fixed by the investigator.
Definition: The Odds Ratio (OR) compares the odds of the event between two groups. The odds are the ratio of the probability that the event occurs to the probability that it does not: \(\text{odds} = p/(1-p)\).
\[\text{OR} = \frac{p_1/(1-p_1)}{p_2/(1-p_2)} = \frac{ad}{bc}\]
where \(a, b, c, d\) are the cell frequencies of the 2×2 table.
Interpretation:
For example, \(\text{OR} = 3\) means the odds of the event in the exposed group are three times those in the unexposed group. OR is the most flexible of the three measures: it is valid for all study designs, including case-control. When the event is rare (rare disease assumption, prevalence < 10%), OR approximates RR.
| Measure | Null value | Type | Main use |
|---|---|---|---|
| RD | 0 | Absolute | Cohort, cross-sectional; policy impact |
| RR | 1 | Relative | Cohort, cross-sectional |
| OR | 1 | Relative (odds) | All designs, including case-control |
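To make the rare-event remark concrete, the R sketch below compares RR and OR for two pairs of risks. The helper name `assoc_from_risks` and the risk values are illustrative assumptions, not part of the original analysis or its data.

```r
# Hypothetical helper (not from the original report): RD, RR and OR
# computed from the risk in the exposed (p1) and unexposed (p2) group.
assoc_from_risks <- function(p1, p2) {
  c(RD = p1 - p2,
    RR = p1 / p2,
    OR = (p1 / (1 - p1)) / (p2 / (1 - p2)))
}

rare   <- assoc_from_risks(0.02, 0.01)  # rare outcome (made-up risks)
common <- assoc_from_risks(0.50, 0.25)  # common outcome (made-up risks)

# Both rows have RR = 2, but OR stays close to RR only in the rare row.
print(round(rbind(rare, common), 4))
```

With a rare outcome the OR stays close to the RR; with a common outcome the OR is noticeably further from 1 than the RR.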
The analysis covers four hypothesis-testing methods: the two-proportion test, the Pearson chi-square test, the likelihood ratio test (\(G^2\)), and Fisher's exact test; plus chi-square partitioning for the 2×3 case.
Main references:
The data describe the relationship between smoking status (Smoker vs Non-Smoker) and lung cancer (Cancer (+) vs Control (-)) in a case-control study design.
tabel1 <- matrix(
c(688, 650, 21, 59),
nrow = 2,
byrow = TRUE,
dimnames = list(
"Status Merokok" = c("Smoker", "Non-Smoker"),
"Status Kanker" = c("Cancer (+)", "Control (-)")
)
)
tabel1_margin <- addmargins(tabel1)
kable(tabel1_margin,
caption = "Tabel 1. Tabel Kontingensi 2x2: Status Merokok dan Kanker Paru",
align = "c") |>
kable_styling(bootstrap_options = c("striped","hover","bordered"),
full_width = FALSE, position = "center") |>
row_spec(nrow(tabel1_margin), bold=TRUE, background="#dce8f5") |>
column_spec(ncol(tabel1_margin), bold=TRUE, background="#dce8f5") |>
row_spec(0, bold=TRUE, color="white", background="#2166ac")
| | Cancer (+) | Control (-) | Sum |
|---|---|---|---|
| Smoker | 688 | 650 | 1338 |
| Non-Smoker | 21 | 59 | 80 |
| Sum | 709 | 709 | 1418 |
Cell notation for the 2×2 table:
| | Cancer (+) | Control (−) | Total |
|---|---|---|---|
| Smoker | \(a = 688\) | \(b = 650\) | \(n_{1+} = 1338\) |
| Non-Smoker | \(c = 21\) | \(d = 59\) | \(n_{2+} = 80\) |
| Total | \(n_{+1} = 709\) | \(n_{+2} = 709\) | \(n = 1418\) |
Estimated proportion of lung cancer in each group:
\[\hat{p}_1 = \frac{a}{n_{1+}} = \frac{688}{1338} = 0{,}5142\]
\[\hat{p}_2 = \frac{c}{n_{2+}} = \frac{21}{80} = 0{,}2625\]
## Proporsi Smoker (p1_hat): 0.5142
## Proporsi Non-Smoker (p2_hat): 0.2625
Interpretation: The proportion of lung cancer is 0.5142 (51.42%) in the Smoker group and 0.2625 (26.25%) in the Non-Smoker group. Descriptively, the proportion is higher among smokers; because the design is case-control, these sample proportions are not estimates of population risk.
The Wilson score method is used because it is more accurate than the Wald interval, especially for proportions close to 0 or 1 (Agresti, 2013):
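The printed proportions come from a chunk that is not shown in the report, and later code uses the names `a`, `b`, `c`, `d`, `n1`, `n2`, `n`, `p1_hat`, `p2_hat` and `z95` without defining them. A minimal sketch of definitions consistent with how those names are used (an assumption, not the author's code):

```r
# Contingency table exactly as defined earlier in the report
tabel1 <- matrix(c(688, 650, 21, 59), nrow = 2, byrow = TRUE,
                 dimnames = list("Status Merokok" = c("Smoker", "Non-Smoker"),
                                 "Status Kanker"  = c("Cancer (+)", "Control (-)")))

# Assumed definitions, inferred from the later code
a <- tabel1[1, 1]; b <- tabel1[1, 2]   # Smoker row
c <- tabel1[2, 1]; d <- tabel1[2, 2]   # Non-Smoker row (c() still works as a function)
n1 <- a + b                            # row total, Smoker
n2 <- c + d                            # row total, Non-Smoker
n  <- n1 + n2                          # grand total
p1_hat <- a / n1                       # proportion of cases among smokers
p2_hat <- c / n2                       # proportion of cases among non-smokers
z95 <- qnorm(0.975)                    # critical value for a 95% interval

cat("Proporsi Smoker (p1_hat):", round(p1_hat, 4), "\n")
cat("Proporsi Non-Smoker (p2_hat):", round(p2_hat, 4), "\n")
```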
\[\text{CI}_{95\%}(p) = \frac{\hat{p} + \dfrac{z^2}{2n} \pm z\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n} + \dfrac{z^2}{4n^2}}}{1 + \dfrac{z^2}{n}}\]
For Smoker (\(\hat{p}_1 = 0{,}5142,\ n_1 = 1338,\ z_{0.025} = 1{,}96\)):
\[\text{Lower bound} = \frac{0{,}5142 + \frac{3{,}8416}{2\times1338} - 1{,}96\sqrt{\frac{0{,}5142\times0{,}4858}{1338} + \frac{3{,}8416}{4\times1338^2}}}{1 + \frac{3{,}8416}{1338}} = \frac{0{,}5156 - 1{,}96\times0{,}01368}{1{,}00287} = \frac{0{,}4888}{1{,}00287} \approx 0{,}4874\]
\[\text{Upper bound} = \frac{0{,}5156 + 1{,}96\times0{,}01368}{1{,}00287} = \frac{0{,}5424}{1{,}00287} \approx 0{,}5409\]
For Non-Smoker (\(\hat{p}_2 = 0{,}2625,\ n_2 = 80\)):
\[\text{CI}_{95\%}(\hat{p}_2) \approx \left[\frac{0{,}2865 - 1{,}96\times0{,}05070}{1{,}04802};\ \frac{0{,}2865 + 1{,}96\times0{,}05070}{1{,}04802}\right] = [0{,}1786;\ 0{,}3682]\]
ci_wilson <- function(x, n_obs, conf = 0.95) {
z <- qnorm(1 - (1 - conf)/2)
p <- x / n_obs
lo <- (p + z^2/(2*n_obs) - z*sqrt(p*(1-p)/n_obs + z^2/(4*n_obs^2))) / (1 + z^2/n_obs)
hi <- (p + z^2/(2*n_obs) + z*sqrt(p*(1-p)/n_obs + z^2/(4*n_obs^2))) / (1 + z^2/n_obs)
c(estimate = p, lower = lo, upper = hi)
}
ci_p1 <- ci_wilson(a, n1)
ci_p2 <- ci_wilson(c, n2)
ci_prop_df <- data.frame(
Kelompok = c("Smoker","Non-Smoker"),
n = c(n1, n2),
Proporsi = c(round(p1_hat,4), round(p2_hat,4)),
"CI Lower 95%" = c(round(ci_p1["lower"],4), round(ci_p2["lower"],4)),
"CI Upper 95%" = c(round(ci_p1["upper"],4), round(ci_p2["upper"],4)),
check.names = FALSE
)
kable(ci_prop_df,
caption = "Tabel 2. Estimasi Proporsi dan CI 95% (Wilson Score)",
align = "c") |>
kable_styling(bootstrap_options = c("striped","hover","bordered"),
full_width = FALSE, position="center") |>
row_spec(0, bold=TRUE, color="white", background="#2166ac")
| Kelompok | n | Proporsi | CI Lower 95% | CI Upper 95% |
|---|---|---|---|---|
| Smoker | 1338 | 0.5142 | 0.4874 | 0.5409 |
| Non-Smoker | 80 | 0.2625 | 0.1786 | 0.3682 |
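As a cross-check of the Wilson limits, base R's `prop.test()` with `correct = FALSE` returns the same score interval for a single proportion. The sketch below restates the formula under the illustrative name `wilson` (the report's own function is `ci_wilson()`) and compares the two.

```r
# Wilson score interval, same formula as ci_wilson() in the report
wilson <- function(x, n_obs, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  p <- x / n_obs
  centre <- p + z^2 / (2 * n_obs)
  half   <- z * sqrt(p * (1 - p) / n_obs + z^2 / (4 * n_obs^2))
  c(lower = (centre - half) / (1 + z^2 / n_obs),
    upper = (centre + half) / (1 + z^2 / n_obs))
}

w1 <- wilson(688, 1338)   # Smoker
w2 <- wilson(21, 80)      # Non-Smoker

# prop.test() without continuity correction gives the Wilson interval
pt1 <- as.numeric(prop.test(688, 1338, correct = FALSE)$conf.int)
pt2 <- as.numeric(prop.test(21, 80, correct = FALSE)$conf.int)

print(round(rbind(manual_smoker = w1, proptest_smoker = pt1,
                  manual_nonsmoker = w2, proptest_nonsmoker = pt2), 4))
```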
Manual calculation:
\[\text{RD} = \hat{p}_1 - \hat{p}_2 = 0{,}5142 - 0{,}2625 = 0{,}2517\]
\[\text{SE}(\text{RD}) = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = \sqrt{\frac{0{,}5142\times0{,}4858}{1338} + \frac{0{,}2625\times0{,}7375}{80}}\]
\[= \sqrt{\frac{0{,}24982}{1338} + \frac{0{,}19359}{80}} = \sqrt{0{,}000187 + 0{,}002420} = \sqrt{0{,}002607} = 0{,}05106\]
\[\text{CI}_{95\%}(\text{RD}) = 0{,}2517 \pm 1{,}96 \times 0{,}05106 = 0{,}2517 \pm 0{,}1001 = [0{,}1516;\ 0{,}3518]\]
RD <- p1_hat - p2_hat
SE_RD <- sqrt(p1_hat*(1-p1_hat)/n1 + p2_hat*(1-p2_hat)/n2)
CI_RD <- c(RD - z95*SE_RD, RD + z95*SE_RD)
cat("RD :", round(RD, 4), "\n")
## RD : 0.2517
## SE(RD) : 0.0511
## 95% CI RD : [ 0.1516 ; 0.3518 ]
Interpretation: \(\text{RD} = 0.2517\), i.e. the proportion of lung cancer among smokers is 25.17 percentage points higher in absolute terms than among non-smokers. The 95% CI [\(0.1516\); \(0.3518\)] excludes 0, indicating a statistically significant difference.
Manual calculation:
\[\text{RR} = \frac{\hat{p}_1}{\hat{p}_2} = \frac{0{,}5142}{0{,}2625} = 1{,}9589\]
\[\text{SE}(\ln\text{RR}) = \sqrt{\frac{1-\hat{p}_1}{a} + \frac{1-\hat{p}_2}{c}} = \sqrt{\frac{0{,}4858}{688} + \frac{0{,}7375}{21}} = \sqrt{0{,}000706 + 0{,}035119} = \sqrt{0{,}035825} = 0{,}1893\]
\[\ln\text{RR} = \ln(1{,}9589) = 0{,}6724\]
\[\text{CI}_{95\%}(\ln\text{RR}) = 0{,}6724 \pm 1{,}96 \times 0{,}1893 = [0{,}3014;\ 1{,}0434]\]
\[\text{CI}_{95\%}(\text{RR}) = \left[e^{0{,}3014};\ e^{1{,}0434}\right] \approx [1{,}352;\ 2{,}839]\]
RR <- p1_hat / p2_hat
SE_lnRR <- sqrt((1-p1_hat)/a + (1-p2_hat)/c)
CI_RR <- exp(log(RR) + c(-1,1)*z95*SE_lnRR)
cat("RR :", round(RR, 4), "\n")
## RR : 1.9589
## ln(RR) : 0.6724
## SE(ln RR) : 0.1893
## 95% CI RR : [ 1.3517 ; 2.8387 ]
Interpretation: \(\text{RR} = 1.9589\), i.e. the proportion of lung cancer among Smokers is 1.96 times that among Non-Smokers. The 95% CI [\(1.3517\); \(2.8387\)] excludes 1.
A case-control design does not allow valid estimation of RD and RR, because the proportion of cases is fixed by the investigator. OR is therefore the most appropriate measure of association for these data (Fleiss et al., 2003).
Manual calculation:
\[\text{OR} = \frac{ad}{bc} = \frac{688 \times 59}{650 \times 21} = \frac{40{.}592}{13{.}650} = 2{,}9738\]
\[\text{SE}(\ln\text{OR}) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} = \sqrt{\frac{1}{688} + \frac{1}{650} + \frac{1}{21} + \frac{1}{59}}\]
\[= \sqrt{0{,}001453 + 0{,}001538 + 0{,}047619 + 0{,}016949} = \sqrt{0{,}067559} = 0{,}2599\]
\[\ln\text{OR} = \ln(2{,}9738) = 1{,}0898\]
\[\text{CI}_{95\%}(\ln\text{OR}) = 1{,}0898 \pm 1{,}96 \times 0{,}2599 = [0{,}5804;\ 1{,}5992]\]
\[\text{CI}_{95\%}(\text{OR}) = \left[e^{0{,}5804};\ e^{1{,}5992}\right] \approx [1{,}787;\ 4{,}949]\]
OR <- (a * d) / (b * c)
SE_lnOR <- sqrt(1/a + 1/b + 1/c + 1/d)
CI_OR <- exp(log(OR) + c(-1,1)*z95*SE_lnOR)
cat("OR :", round(OR, 4), "\n")
## OR : 2.9738
## ln(OR) : 1.0898
## SE(ln OR) : 0.2599
## 95% CI OR : [ 1.7867 ; 4.9494 ]
Interpretation: \(\text{OR} = 2.9738\), i.e. the odds of lung cancer among Smokers are 2.97 times those among Non-Smokers. The 95% CI [\(1.7867\); \(4.9494\)] lies well above 1, indicating a strong and significant association.
asosiasi_df <- data.frame(
"Ukuran Asosiasi" = c("Risk Difference (RD)","Risk Ratio (RR)","Odds Ratio (OR)"),
"Estimasi" = c(round(RD,4), round(RR,4), round(OR,4)),
"CI Lower 95%" = c(round(CI_RD[1],4), round(CI_RR[1],4), round(CI_OR[1],4)),
"CI Upper 95%" = c(round(CI_RD[2],4), round(CI_RR[2],4), round(CI_OR[2],4)),
"Nilai Null" = c("0","1","1"),
"Kesimpulan" = rep("Signifikan", 3),
check.names = FALSE
)
kable(asosiasi_df,
caption = "Tabel 3. Ringkasan Ukuran Asosiasi dan CI 95%",
align = "c") |>
kable_styling(bootstrap_options = c("striped","hover","bordered"),
full_width = TRUE) |>
row_spec(0, bold=TRUE, color="white", background="#2166ac") |>
column_spec(6, bold=TRUE, color="#1a7a1a")
| Ukuran Asosiasi | Estimasi | CI Lower 95% | CI Upper 95% | Nilai Null | Kesimpulan |
|---|---|---|---|---|---|
| Risk Difference (RD) | 0.2517 | 0.1516 | 0.3518 | 0 | Signifikan |
| Risk Ratio (RR) | 1.9589 | 1.3517 | 2.8387 | 1 | Signifikan |
| Odds Ratio (OR) | 2.9738 | 1.7867 | 4.9494 | 1 | Signifikan |
Hypotheses:
\[H_0: p_1 = p_2 \quad \text{vs} \quad H_1: p_1 \neq p_2\]
Test statistic (pooled):
\[z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\!\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}\]
Manual calculation:
Pooled proportion: \[\hat{p} = \frac{a + c}{n} = \frac{688 + 21}{1418} = \frac{709}{1418} = 0{,}5000\]
Standard error: \[\text{SE} = \sqrt{0{,}5000 \times 0{,}5000 \times \left(\frac{1}{1338}+\frac{1}{80}\right)} = \sqrt{0{,}2500 \times \left(0{,}000747 + 0{,}012500\right)} = \sqrt{0{,}2500 \times 0{,}013247} = \sqrt{0{,}003312} = 0{,}05755\]
Test statistic (the last step uses unrounded inputs): \[z = \frac{0{,}5142 - 0{,}2625}{0{,}05755} = \frac{0{,}2517}{0{,}05755} \approx 4{,}3737\]
Two-sided \(p\)-value: \[p = 2 \times P(Z > 4{,}3737) = 2 \times (1 - \Phi(4{,}3737)) \approx 1{,}22 \times 10^{-5}\]
p_pool <- (a + c) / n
SE_pool <- sqrt(p_pool*(1-p_pool)*(1/n1 + 1/n2))
z_stat <- (p1_hat - p2_hat) / SE_pool
p_val_z <- 2 * pnorm(-abs(z_stat))
cat("p_pool :", round(p_pool, 4), "\n")
## p_pool : 0.5
## SE pool : 0.0575
## z : 4.3737
## p-value : 1.222e-05
## --- Konfirmasi prop.test() ---
##
## 2-sample test for equality of proportions without continuity correction
##
## data: c(a, c) out of c(n1, n2)
## X-squared = 19.129, df = 1, p-value = 1.222e-05
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## 0.1516343 0.3517663
## sample estimates:
## prop 1 prop 2
## 0.5142003 0.2625000
Decision and interpretation: \(z = 4.3737\), \(p \approx 1.22 \times 10^{-5} \ll \alpha = 0{,}05\). \(H_0\) is rejected: the proportions differ significantly between Smoker and Non-Smoker.
Hypotheses:
\[H_0: \text{smoking status is independent of cancer status} \quad \text{vs} \quad H_1: \text{there is an association}\]
Degrees of freedom: \(df = (2-1)(2-1) = 1\)
Manual calculation, expected frequencies:
\[E_{11} = \frac{n_{1+} \times n_{+1}}{n} = \frac{1338 \times 709}{1418} = \frac{948{.}642}{1418} = 669\]
\[E_{12} = \frac{n_{1+} \times n_{+2}}{n} = \frac{1338 \times 709}{1418} = 669 \quad \text{(equal to }E_{11}\text{ because }n_{+1}=n_{+2}\text{)}\]
\[E_{21} = \frac{n_{2+} \times n_{+1}}{n} = \frac{80 \times 709}{1418} = \frac{56{.}720}{1418} = 40\]
\[E_{22} = \frac{80 \times 709}{1418} = 40\]
Chi-square statistic:
\[\chi^2 = \frac{(688-669)^2}{669} + \frac{(650-669)^2}{669} + \frac{(21-40)^2}{40} + \frac{(59-40)^2}{40}\]
\[= \frac{(19)^2}{669} + \frac{(-19)^2}{669} + \frac{(-19)^2}{40} + \frac{(19)^2}{40}\]
\[= \frac{361}{669} + \frac{361}{669} + \frac{361}{40} + \frac{361}{40}\]
\[= 0{,}5396 + 0{,}5396 + 9{,}0250 + 9{,}0250 = 19{,}1292\]
E11 <- (n1*(a+c))/n; E12 <- (n1*(b+d))/n
E21 <- (n2*(a+c))/n; E22 <- (n2*(b+d))/n
E_mat <- matrix(c(E11,E12,E21,E22), 2, 2, byrow=TRUE, dimnames=dimnames(tabel1))
cat("Frekuensi Harapan:\n")
## Frekuensi Harapan:
## Status Kanker
## Status Merokok Cancer (+) Control (-)
## Smoker 669 669
## Non-Smoker 40 40
chi2_stat <- sum((tabel1 - E_mat)^2 / E_mat)
p_chi <- pchisq(chi2_stat, df=1, lower.tail=FALSE)
cat("\nChi-square :", round(chi2_stat,4), "| df: 1 | p-value:", format(p_chi, scientific=TRUE), "\n\n")
##
## Chi-square : 19.1292 | df: 1 | p-value: 1.221601e-05
## --- Konfirmasi chisq.test() ---
##
## Pearson's Chi-squared test
##
## data: tabel1
## X-squared = 19.129, df = 1, p-value = 1.222e-05
Decision and interpretation: \(\chi^2 = 19.1292\), \(df = 1\), \(p \approx 1.22 \times 10^{-5}\). \(H_0\) is rejected: there is a significant association between smoking and lung cancer. Note that \(z^2 = (4.3737)^2 = 19.1292 = \chi^2\), which illustrates the equivalence of the two tests for a 2×2 table.
Test statistic:
\[G^2 = 2\sum_{i,j} O_{ij} \ln\!\left(\frac{O_{ij}}{E_{ij}}\right)\]
Manual calculation:
\[G^2 = 2\!\left[688\ln\!\left(\frac{688}{669}\right) + 650\ln\!\left(\frac{650}{669}\right) + 21\ln\!\left(\frac{21}{40}\right) + 59\ln\!\left(\frac{59}{40}\right)\right]\]
\[= 2\!\left[688\ln(1{,}02840) + 650\ln(0{,}97160) + 21\ln(0{,}52500) + 59\ln(1{,}47500)\right]\]
\[= 2\!\left[688(0{,}02801) + 650(-0{,}02881) + 21(-0{,}64436) + 59(0{,}38866)\right]\]
\[= 2\!\left[19{,}27 + (-18{,}73) + (-13{,}53) + 22{,}93\right] = 2 \times 9{,}94 = 19{,}88\]
G2_stat <- 2 * sum(tabel1 * log(tabel1 / E_mat))
p_G2 <- pchisq(G2_stat, df=1, lower.tail=FALSE)
cat("G2 statistik:", round(G2_stat,4), "| df: 1 | p-value:", format(p_G2, scientific=TRUE), "\n\n")
## G2 statistik: 19.878 | df: 1 | p-value: 8.25441e-06
## --- Konfirmasi GTest() ---
##
## Log likelihood ratio (G-test) test of independence without correction
##
## data: tabel1
## G = 19.878, X-squared df = 1, p-value = 8.254e-06
Decision and interpretation: \(G^2 = 19.878\), \(p \approx 8.25 \times 10^{-6}\). \(H_0\) is rejected. \(G^2\) differs slightly from the Pearson \(\chi^2\) because it is based on the log-likelihood ratio rather than on squared deviations; the two statistics are asymptotically equivalent under \(H_0\), and both detect the association.
This test computes exact probabilities from the hypergeometric distribution without relying on an asymptotic approximation, which is particularly useful when \(n\) is small or \(E_{ij} < 5\).
Hypergeometric formula:
\[P(X = a \mid n_{1+}, n_{2+}, n_{+1}) = \frac{\dbinom{n_{1+}}{a}\dbinom{n_{2+}}{n_{+1}-a}}{\dbinom{n}{n_{+1}}} = \frac{\dbinom{1338}{688}\dbinom{80}{21}}{\dbinom{1418}{709}}\]
The \(p\)-value is the sum of the probabilities of all configurations that are as extreme as, or more extreme than, the observed data.
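The chunk that calls the test is not shown in the report, although a later chunk uses an object named `ft`. The sketch below assumes `ft` is the result of `fisher.test()` and reproduces the two-sided p-value by hand with `dhyper()`, summing every table whose probability is no larger than that of the observed one. The tolerance factor mirrors the one used inside `fisher.test()`, and the variable names are illustrative.

```r
tabel1 <- matrix(c(688, 650, 21, 59), nrow = 2, byrow = TRUE)

# Assumed call behind the printed output
ft <- fisher.test(tabel1)

# Hypergeometric set-up: number of smokers among the cases, margins fixed
m_cases    <- sum(tabel1[, 1])   # column total, Cancer (+)
m_controls <- sum(tabel1[, 2])   # column total, Control (-)
k_smokers  <- sum(tabel1[1, ])   # row total, Smoker

support <- max(0, k_smokers - m_controls):min(k_smokers, m_cases)
probs   <- dhyper(support, m_cases, m_controls, k_smokers)
p_obs   <- dhyper(tabel1[1, 1], m_cases, m_controls, k_smokers)

# Two-sided p-value: tables no more probable than the observed one
p_manual <- sum(probs[probs <= p_obs * (1 + 1e-7)])

cat("p-value (fisher.test):", format(ft$p.value, scientific = TRUE), "\n")
cat("p-value (manual)     :", format(p_manual, scientific = TRUE), "\n")
```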
## OR (MLE) : 2.9716
## 95% CI OR : [ 1.7556 ; 5.2107 ]
## p-value : 1.476303e-05
##
## Fisher's Exact Test for Count Data
##
## data: tabel1
## p-value = 1.476e-05
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
## 1.755611 5.210711
## sample estimates:
## odds ratio
## 2.971634
Decision and interpretation: \(p \approx 1.48 \times 10^{-5}\). \(H_0\) is rejected. The OR reported by Fisher's exact test (a conditional maximum likelihood estimate) is 2.9716, very close to the sample estimate \(ad/bc = 2.9738\).
comp_df <- data.frame(
"Metode Uji" = c("Uji Dua Proporsi (Z)","Chi-Square Pearson",
"Likelihood Ratio (G2)","Fisher Exact Test"),
"H0" = rep("Tidak ada asosiasi", 4),
"Statistik Uji" = c(paste0("Z = ", round(z_stat,4)),
paste0("chi2 = ",round(chi2_stat,4)),
paste0("G2 = ", round(G2_stat,4)),
"Hipergeometrik (exact)"),
"df" = c("1","1","1","—"),
"p-value" = c(format(p_val_z, scientific=TRUE, digits=3),
format(p_chi, scientific=TRUE, digits=3),
format(p_G2, scientific=TRUE, digits=3),
format(ft$p.value, scientific=TRUE, digits=3)),
"Keputusan" = rep("Tolak H0", 4),
"Catatan" = c("Z^2 = chi^2 (ekuivalen)",
"Asimptotik; syarat E>=5",
"Asimptotik; berbasis log-likelihood",
"Exact; valid untuk n kecil"),
check.names = FALSE
)
kable(comp_df,
caption = "Tabel 4. Perbandingan Hasil Keempat Metode Pengujian",
align = "c") |>
kable_styling(bootstrap_options = c("striped","hover","bordered"),
full_width = TRUE) |>
row_spec(0, bold=TRUE, color="white", background="#2166ac") |>
column_spec(6, bold=TRUE, color="#1a7a1a")
| Metode Uji | H0 | Statistik Uji | df | p-value | Keputusan | Catatan |
|---|---|---|---|---|---|---|
| Uji Dua Proporsi (Z) | Tidak ada asosiasi | Z = 4.3737 | 1 | 1.22e-05 | Tolak H0 | Z^2 = chi^2 (ekuivalen) |
| Chi-Square Pearson | Tidak ada asosiasi | chi2 = 19.1292 | 1 | 1.22e-05 | Tolak H0 | Asimptotik; syarat E>=5 |
| Likelihood Ratio (G2) | Tidak ada asosiasi | G2 = 19.878 | 1 | 8.25e-06 | Tolak H0 | Asimptotik; berbasis log-likelihood |
| Fisher Exact Test | Tidak ada asosiasi | Hipergeometrik (exact) | — | 1.48e-05 | Tolak H0 | Exact; valid untuk n kecil |
Discussion:
mosaic(tabel1,
shade = TRUE,
legend = TRUE,
main = "Mosaic Plot: Status Merokok vs Kanker Paru",
labeling = labeling_border(rot_labels=c(0,0,0,0)),
gp = shading_hcl)
Figure 1. Mosaic plot: smoking status vs lung cancer
prop_df <- data.frame(
Kelompok = c("Smoker","Non-Smoker"),
Proporsi = c(p1_hat, p2_hat),
Lower = c(ci_p1["lower"], ci_p2["lower"]),
Upper = c(ci_p1["upper"], ci_p2["upper"])
)
ggplot(prop_df, aes(x=Kelompok, y=Proporsi, fill=Kelompok)) +
geom_col(width=0.5, alpha=0.9, color="white") +
geom_errorbar(aes(ymin=Lower, ymax=Upper),
width=0.12, linewidth=1.1, color="#222222") +
geom_text(aes(label=paste0(round(Proporsi*100,2),"%")),
vjust=-2.0, fontface="bold", size=5) +
scale_fill_manual(values=c("Non-Smoker"="#4393c3", "Smoker"="#d6604d")) +
scale_y_continuous(labels=percent_format(), limits=c(0,0.72)) +
labs(title = "Proporsi Kejadian Kanker Paru per Kelompok",
subtitle = "Error bar = CI 95% (Wilson Score)",
x=NULL, y="Proporsi", fill=NULL) +
theme_minimal(base_size=13) +
theme(legend.position = "none",
plot.title = element_text(face="bold", size=14),
plot.subtitle = element_text(color="grey40"),
panel.grid.major.x = element_blank())
Figure 2. Proportion of lung cancer per group with 95% CI
tabel2 <- matrix(
c(495, 272, 590,
330, 265, 498),
nrow = 2,
byrow = TRUE,
dimnames = list(
"Gender" = c("Female","Male"),
"Partai" = c("Democrat","Republican","Independent")
)
)
tabel2_margin <- addmargins(tabel2)
kable(tabel2_margin,
caption = "Tabel 5. Tabel Kontingensi 2x3: Gender dan Identifikasi Partai Politik",
align = "c") |>
kable_styling(bootstrap_options = c("striped","hover","bordered"),
full_width = FALSE, position="center") |>
row_spec(nrow(tabel2_margin), bold=TRUE, background="#dce8f5") |>
column_spec(ncol(tabel2_margin), bold=TRUE, background="#dce8f5") |>
row_spec(0, bold=TRUE, color="white", background="#2166ac")
| | Democrat | Republican | Independent | Sum |
|---|---|---|---|---|
| Female | 495 | 272 | 590 | 1357 |
| Male | 330 | 265 | 498 | 1093 |
| Sum | 825 | 537 | 1088 | 2450 |
Under \(H_0\) of independence: \(E_{ij} = \dfrac{n_{i+} \cdot n_{+j}}{n}\)
Manual calculation (\(n = 2450\), \(n_{F+} = 1357\), \(n_{M+} = 1093\)):
\[E_{F,\text{Dem}} = \frac{1357 \times 825}{2450} = \frac{1{.}119{.}525}{2450} = 456{,}949 \qquad E_{F,\text{Rep}} = \frac{1357 \times 537}{2450} = \frac{728{.}709}{2450} = 297{,}432\]
\[E_{F,\text{Ind}} = \frac{1357 \times 1088}{2450} = \frac{1{.}476{.}416}{2450} = 602{,}619 \qquad E_{M,\text{Dem}} = \frac{1093 \times 825}{2450} = \frac{901{.}725}{2450} = 368{,}051\]
\[E_{M,\text{Rep}} = \frac{1093 \times 537}{2450} = \frac{586{.}941}{2450} = 239{,}568 \qquad E_{M,\text{Ind}} = \frac{1093 \times 1088}{2450} = \frac{1{.}189{.}184}{2450} = 485{,}381\]
n_row2 <- rowSums(tabel2)
n_col2 <- colSums(tabel2)
n2_tot <- sum(tabel2)
E2 <- outer(n_row2, n_col2) / n2_tot
E2_margin <- addmargins(E2)
rownames(E2_margin)[nrow(E2_margin)] <- "Total"
colnames(E2_margin)[ncol(E2_margin)] <- "Total"
kable(round(E2_margin, 3),
caption = "Tabel 6. Frekuensi Harapan (E_ij) di Bawah H0 Independensi",
align = "c") |>
kable_styling(bootstrap_options = c("striped","hover","bordered"),
full_width = FALSE, position="center") |>
row_spec(nrow(E2_margin), bold=TRUE, background="#dce8f5") |>
column_spec(ncol(E2_margin), bold=TRUE, background="#dce8f5") |>
row_spec(0, bold=TRUE, color="white", background="#2166ac")
| | Democrat | Republican | Independent | Total |
|---|---|---|---|---|
| Female | 456.949 | 297.432 | 602.619 | 1357 |
| Male | 368.051 | 239.568 | 485.381 | 1093 |
| Total | 825.000 | 537.000 | 1088.000 | 2450 |
cat("Minimum E_ij:", round(min(E2),3),
"->", ifelse(min(E2)>=5,"Syarat E>=5 terpenuhi ✓","TIDAK terpenuhi"), "\n")
## Minimum E_ij: 239.568 -> Syarat E>=5 terpenuhi ✓
Hypotheses:
\[H_0: \text{gender and party are independent} \quad \text{vs} \quad H_1: \text{there is an association}\]
Degrees of freedom: \(df = (2-1)(3-1) = 2\)
Manual calculation:
\[\chi^2 = \frac{(495-456{,}95)^2}{456{,}95} + \frac{(272-297{,}43)^2}{297{,}43} + \frac{(590-602{,}62)^2}{602{,}62} + \frac{(330-368{,}05)^2}{368{,}05} + \frac{(265-239{,}57)^2}{239{,}57} + \frac{(498-485{,}38)^2}{485{,}38}\]
\[= \frac{(38{,}05)^2}{456{,}95} + \frac{(-25{,}43)^2}{297{,}43} + \frac{(-12{,}62)^2}{602{,}62} + \frac{(-38{,}05)^2}{368{,}05} + \frac{(25{,}43)^2}{239{,}57} + \frac{(12{,}62)^2}{485{,}38}\]
\[= \frac{1447{,}80}{456{,}95} + \frac{646{,}69}{297{,}43} + \frac{159{,}26}{602{,}62} + \frac{1447{,}80}{368{,}05} + \frac{646{,}69}{239{,}57} + \frac{159{,}26}{485{,}38}\]
\[= 3{,}167 + 2{,}175 + 0{,}264 + 3{,}934 + 2{,}700 + 0{,}328 = 12{,}568\]
cs2 <- chisq.test(tabel2, correct=FALSE)
chi2_manual <- sum((tabel2 - E2)^2 / E2)
cat("Chi-square manual :", round(chi2_manual,4), "\n")
## Chi-square manual : 12.5693
## Chi-square (chisq.test): 12.5693
## df : 2
## p-value : 1.86475e-03
cramer_V <- sqrt(cs2$statistic / (n2_tot * (min(nrow(tabel2),ncol(tabel2))-1)))
cat("Cramer's V :", round(cramer_V,4), "\n\n")
## Cramer's V : 0.0716
##
## Pearson's Chi-squared test
##
## data: tabel2
## X-squared = 12.569, df = 2, p-value = 0.001865
Decision and interpretation: \(\chi^2 = 12.5693\), \(df = 2\), \(p \approx 0.0019\). \(H_0\) is rejected: there is a significant association between gender and party identification. Cramér’s \(V = 0.0716\) indicates that the association is very weak, even though it is statistically significant at this sample size (\(n = 2450\)).
Pearson residual: \[r_{ij} = \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}}}\]
Standardized (adjusted) residual: \[r_{ij}^{\text{std}} = \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}(1-p_{i+})(1-p_{+j})}}\]
Manual calculation for the Female-Democrat cell:
\[r_{F,D} = \frac{495 - 456{,}95}{\sqrt{456{,}95}} = \frac{38{,}05}{21{,}38} = 1{,}780\]
Marginal proportions: \(p_{F+} = 1357/2450 = 0{,}5539\); \(p_{+D} = 825/2450 = 0{,}3367\)
\[r_{F,D}^{\text{std}} = \frac{38{,}05}{\sqrt{456{,}95 \times (1-0{,}5539) \times (1-0{,}3367)}} = \frac{38{,}05}{\sqrt{456{,}95 \times 0{,}4461 \times 0{,}6633}} = \frac{38{,}05}{\sqrt{135{,}23}} = \frac{38{,}05}{11{,}63} = 3{,}272\]
Because \(|r_{F,D}^{\text{std}}| = 3{,}272 > 2\), this cell contributes significantly to the chi-square statistic.
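The same formula can be applied to all six cells at once. The sketch below computes the standardized residuals directly from the definition and compares them with the `stdres` component returned by `chisq.test()`. The names `E_hat`, `p_row`, `p_col` and `std_manual` are illustrative, not from the report.

```r
tabel2 <- matrix(c(495, 272, 590,
                   330, 265, 498),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Gender = c("Female", "Male"),
                                 Partai = c("Democrat", "Republican", "Independent")))

n_tot <- sum(tabel2)
E_hat <- outer(rowSums(tabel2), colSums(tabel2)) / n_tot  # expected counts
p_row <- rowSums(tabel2) / n_tot                          # p_{i+}
p_col <- colSums(tabel2) / n_tot                          # p_{+j}

# (O - E) / sqrt(E * (1 - p_{i+}) * (1 - p_{+j})) for every cell
std_manual <- (tabel2 - E_hat) / sqrt(E_hat * outer(1 - p_row, 1 - p_col))

print(round(std_manual, 4))
print(round(chisq.test(tabel2, correct = FALSE)$stdres, 4))
```

In a table with two rows, the standardized residuals in each column are equal in magnitude and opposite in sign, so only three distinct values appear.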
pearson_res <- cs2$residuals
std_res <- cs2$stdres
res_df <- data.frame(
Sel = c("Female-Democrat","Female-Republican","Female-Independent",
"Male-Democrat","Male-Republican","Male-Independent"),
O = as.vector(t(tabel2)),
E = round(as.vector(t(E2)),3),
"Pearson Residual" = round(as.vector(t(pearson_res)),4),
"Standardized Residual" = round(as.vector(t(std_res)),4),
"Signifikan (|r|>2)" = ifelse(abs(as.vector(t(std_res)))>2,"Ya","Tidak"),
check.names = FALSE
)
kable(res_df,
caption = "Tabel 7. Residual Pearson dan Standardized Residual",
align = "c") |>
kable_styling(bootstrap_options = c("striped","hover","bordered"),
full_width = FALSE, position="center") |>
row_spec(0, bold=TRUE, color="white", background="#2166ac") |>
row_spec(which(abs(as.vector(t(std_res)))>2), bold=TRUE, background="#fff3b0")
| Sel | O | E | Pearson Residual | Standardized Residual | Signifikan (\|r\|>2) |
|---|---|---|---|---|---|
| Female-Democrat | 495 | 456.949 | 1.7801 | 3.2724 | Ya |
| Female-Republican | 272 | 297.432 | -1.4747 | -2.4986 | Ya |
| Female-Independent | 590 | 602.619 | -0.5140 | -1.0322 | Tidak |
| Male-Democrat | 330 | 368.051 | -1.9834 | -3.2724 | Ya |
| Male-Republican | 265 | 239.568 | 1.6431 | 2.4986 | Ya |
| Male-Independent | 498 | 485.381 | 0.5728 | 1.0322 | Tidak |
res_long <- as.data.frame(as.table(std_res))
colnames(res_long) <- c("Gender","Partai","Residual")
res_long$label <- round(res_long$Residual, 3)
res_long$text_col <- ifelse(abs(res_long$Residual) > 1.5, "white", "#333333")
ggplot(res_long, aes(x=Partai, y=Gender, fill=Residual)) +
geom_tile(color="white", linewidth=1.2) +
geom_text(aes(label=label, color=text_col), size=6, fontface="bold") +
scale_color_identity() +
scale_fill_gradient2(low="#b2182b", mid="#f7f7f7", high="#2166ac",
midpoint=0, name="Std.\nResidual", limits=c(-4,4)) +
labs(title = "Heatmap Standardized Residuals",
subtitle = "Biru = lebih tinggi dari harapan | Merah = lebih rendah dari harapan",
x="Partai Politik", y="Gender") +
theme_minimal(base_size=13) +
theme(plot.title = element_text(face="bold", size=14),
plot.subtitle = element_text(color="grey40"),
axis.text = element_text(size=12, face="bold"),
panel.grid = element_blank())
Figure 3. Heatmap of standardized residuals: gender vs party
Interpretation of the residuals:
- Female-Democrat (\(r^{\text{std}} = 3.272\), \(|r|>2\), significant): more women identify as Democrat than expected under independence.
- Male-Democrat (\(r^{\text{std}} = -3.272\), \(|r|>2\), significant): fewer men identify as Democrat than expected.
- Female-Republican (\(r^{\text{std}} = -2.499\)) and Male-Republican (\(r^{\text{std}} = 2.499\)) also exceed the \(|r|>2\) threshold: fewer women and more men identify as Republican than expected.
- The two Independent cells have \(|r^{\text{std}}| = 1.032 < 2\) and are not individually significant.
Chi-square partitioning splits the overall test (\(df=2\)) into two orthogonal tests (\(df=1\) each):
\[\chi^2_{\text{total}}(df=2) \approx \chi^2_{\text{Dem vs Rep}}(df=1) + \chi^2_{\text{(Dem+Rep) vs Ind}}(df=1)\]
Sub-table with only the Democrat and Republican columns (\(n = 1362\)):
| | Democrat | Republican | Total |
|---|---|---|---|
| Female | 495 | 272 | 767 |
| Male | 330 | 265 | 595 |
| Total | 825 | 537 | 1362 |
Expected frequencies:
\[E_{F,D}^{(1)} = \frac{767 \times 825}{1362} = \frac{632{.}775}{1362} = 464{,}59 \qquad E_{F,R}^{(1)} = \frac{767 \times 537}{1362} = 302{,}41\]
\[E_{M,D}^{(1)} = \frac{595 \times 825}{1362} = 360{,}41 \qquad E_{M,R}^{(1)} = \frac{595 \times 537}{1362} = 234{,}59\]
Chi-square statistic:
\[\chi^2_1 = \frac{(495-464{,}59)^2}{464{,}59} + \frac{(272-302{,}41)^2}{302{,}41} + \frac{(330-360{,}41)^2}{360{,}41} + \frac{(265-234{,}59)^2}{234{,}59}\]
\[= \frac{(30{,}41)^2}{464{,}59} + \frac{(-30{,}41)^2}{302{,}41} + \frac{(-30{,}41)^2}{360{,}41} + \frac{(30{,}41)^2}{234{,}59}\]
\[= \frac{924{,}77}{464{,}59} + \frac{924{,}77}{302{,}41} + \frac{924{,}77}{360{,}41} + \frac{924{,}77}{234{,}59} = 1{,}990 + 3{,}059 + 2{,}565 + 3{,}942 = 11{,}556\]
sub1 <- tabel2[, c("Democrat","Republican")]
cs_sub1 <- chisq.test(sub1, correct=FALSE)
cat("Sub-tabel Partisi 1:\n"); print(addmargins(sub1))
## Sub-tabel Partisi 1:
## Partai
## Gender Democrat Republican Sum
## Female 495 272 767
## Male 330 265 595
## Sum 825 537 1362
cat("\nChi-square:", round(cs_sub1$statistic,4),
"| df:", cs_sub1$parameter,
"| p-value:", format(cs_sub1$p.value, scientific=TRUE), "\n")
##
## Chi-square: 11.5545 | df: 1 | p-value: 6.758479e-04
Sub-table combining Democrat and Republican versus Independent (\(n = 2450\)):
| | Dem+Rep | Independent | Total |
|---|---|---|---|
| Female | 767 | 590 | 1357 |
| Male | 595 | 498 | 1093 |
| Total | 1362 | 1088 | 2450 |
Expected frequencies:
\[E_{F,DR}^{(2)} = \frac{1357 \times 1362}{2450} = \frac{1{.}848{.}234}{2450} = 754{,}38 \qquad E_{F,I}^{(2)} = \frac{1357 \times 1088}{2450} = 602{,}62\]
\[E_{M,DR}^{(2)} = \frac{1093 \times 1362}{2450} = 607{,}62 \qquad E_{M,I}^{(2)} = \frac{1093 \times 1088}{2450} = 485{,}38\]
Chi-square statistic:
\[\chi^2_2 = \frac{(767-754{,}38)^2}{754{,}38} + \frac{(590-602{,}62)^2}{602{,}62} + \frac{(595-607{,}62)^2}{607{,}62} + \frac{(498-485{,}38)^2}{485{,}38}\]
\[= \frac{(12{,}62)^2}{754{,}38} + \frac{(-12{,}62)^2}{602{,}62} + \frac{(-12{,}62)^2}{607{,}62} + \frac{(12{,}62)^2}{485{,}38}\]
\[= 0{,}211 + 0{,}264 + 0{,}262 + 0{,}328 = 1{,}065\]
tabel2_p2 <- cbind(
"Dem+Rep" = tabel2[,"Democrat"] + tabel2[,"Republican"],
"Independent" = tabel2[,"Independent"]
)
cs_sub2 <- chisq.test(tabel2_p2, correct=FALSE)
cat("Sub-tabel Partisi 2:\n"); print(addmargins(tabel2_p2))
## Sub-tabel Partisi 2:
## Dem+Rep Independent Sum
## Female 767 590 1357
## Male 595 498 1093
## Sum 1362 1088 2450
cat("\nChi-square:", round(cs_sub2$statistic,4),
"| df:", cs_sub2$parameter,
"| p-value:", format(cs_sub2$p.value, scientific=TRUE), "\n")
##
## Chi-square: 1.0654 | df: 1 | p-value: 3.01979e-01
chi_sum <- cs_sub1$statistic + cs_sub2$statistic
partisi_df <- data.frame(
"Uji" = c("Chi-Square Keseluruhan (2x3)",
"Partisi 1: Dem vs Rep (df=1)",
"Partisi 2: (Dem+Rep) vs Ind (df=1)",
"Jumlah Partisi (df=2)"),
"Chi-Square" = c(round(cs2$statistic,4),
round(cs_sub1$statistic,4),
round(cs_sub2$statistic,4),
round(chi_sum,4)),
"df" = c(2,1,1,2),
"p-value" = c(format(cs2$p.value, scientific=TRUE, digits=3),
format(cs_sub1$p.value, scientific=TRUE, digits=3),
format(cs_sub2$p.value, scientific=TRUE, digits=3),
format(pchisq(chi_sum,2,lower.tail=FALSE), scientific=TRUE, digits=3)),
# Partition 2 has p = 0.302 > 0.05, so H0 is not rejected for it
"Keputusan" = c("Tolak H0","Tolak H0","Gagal tolak H0","-"),
check.names = FALSE
)
kable(partisi_df,
caption = "Tabel 8. Perbandingan Chi-Square Keseluruhan vs Partisi",
align = "c") |>
kable_styling(bootstrap_options = c("striped","hover","bordered"),
full_width = FALSE, position="center") |>
row_spec(0, bold=TRUE, color="white", background="#2166ac") |>
row_spec(4, bold=TRUE, background="#dce8f5")
| Uji | Chi-Square | df | p-value | Keputusan |
|---|---|---|---|---|
| Chi-Square Keseluruhan (2x3) | 12.5693 | 2 | 1.86e-03 | Tolak H0 |
| Partisi 1: Dem vs Rep (df=1) | 11.5545 | 1 | 6.76e-04 | Tolak H0 |
| Partisi 2: (Dem+Rep) vs Ind (df=1) | 1.0654 | 1 | 3.02e-01 | Gagal tolak H0 |
| Jumlah Partisi (df=2) | 12.6200 | 2 | 1.82e-03 | - |
cat("Aditivitas:", round(cs_sub1$statistic,4), "+", round(cs_sub2$statistic,4),
"=", round(chi_sum,4), "~=", round(cs2$statistic,4), "\n")
## Aditivitas: 11.5545 + 1.0654 = 12.62 ~= 12.5693
Discussion:
- Partition 1 (Dem vs Rep): \(\chi^2 = 11.5545\), \(p \approx 6.76 \times 10^{-4}\), highly significant; the gender difference is clearest in the choice between Democrat and Republican.
- Partition 2 ((Dem+Rep) vs Ind): \(\chi^2 = 1.0654\), \(p \approx 0.302\), not significant; gender does not meaningfully distinguish identifiers with the two major parties from Independents.
- The partition \(\chi^2\) values sum to \(12.62\), close to the overall \(\chi^2\) of \(12.5693\). The agreement is approximate because Pearson statistics are not exactly additive under this partition.
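The Pearson statistics above add up only approximately. The likelihood ratio statistic \(G^2\), by contrast, partitions exactly under this scheme (Dem vs Rep, then Dem+Rep vs Ind). The sketch below checks this on the same table. The helper `g2()` and the object names are illustrative, and the helper assumes no zero cells.

```r
# Likelihood ratio statistic for a two-way table (assumes all counts > 0)
g2 <- function(tab) {
  E_hat <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  2 * sum(tab * log(tab / E_hat))
}

tabel2 <- matrix(c(495, 272, 590,
                   330, 265, 498),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Gender = c("Female", "Male"),
                                 Partai = c("Democrat", "Republican", "Independent")))

part1 <- tabel2[, c("Democrat", "Republican")]        # Dem vs Rep
part2 <- cbind("Dem+Rep"   = rowSums(part1),          # (Dem+Rep) vs Ind
               Independent = tabel2[, "Independent"])

g_total <- g2(tabel2)
g_part1 <- g2(part1)
g_part2 <- g2(part2)

cat("G2 total      :", round(g_total, 4), "\n")
cat("G2 partitions :", round(g_part1, 4), "+", round(g_part2, 4),
    "=", round(g_part1 + g_part2, 4), "\n")
```

If exact additivity matters for the write-up, reporting \(G^2\) for the partitions is one option; the Pearson version remains a reasonable approximation here.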
kontrib <- (tabel2 - E2)^2 / E2
persen_kontrib <- kontrib / sum(kontrib) * 100
kont_df <- as.data.frame(as.table(round(persen_kontrib,2)))
colnames(kont_df) <- c("Gender","Partai","Kontribusi (%)")
kont_df <- kont_df[order(-kont_df[,"Kontribusi (%)"]),]
kont_df[,"Rank"] <- 1:nrow(kont_df)
kable(kont_df,
caption = "Tabel 9. Kontribusi Setiap Sel terhadap Chi-Square Total (%)",
align="c", row.names=FALSE) |>
kable_styling(bootstrap_options = c("striped","hover","bordered"),
full_width=FALSE, position="center") |>
row_spec(0, bold=TRUE, color="white", background="#2166ac") |>
row_spec(1:2, bold=TRUE, background="#fff3b0")

| Gender | Partai | Kontribusi (%) | Rank |
|---|---|---|---|
| Male | Democrat | 31.30 | 1 |
| Female | Democrat | 25.21 | 2 |
| Male | Republican | 21.48 | 3 |
| Female | Republican | 17.30 | 4 |
| Male | Independent | 2.61 | 5 |
| Female | Independent | 2.10 | 6 |
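
Each percentage in Tabel 9 is a cell's share of the Pearson statistic, and that share equals the squared Pearson residual divided by the total. A minimal sketch on hypothetical counts (not the report's data), using the `expected` and `residuals` components returned by base R's `chisq.test()`; the object names ending in `_demo` are illustrative only:

```r
# Hypothetical 2x3 counts, for illustration only (not the report's data)
tab <- matrix(c(60, 40, 50,
                40, 50, 60),
              nrow = 2, byrow = TRUE,
              dimnames = list(Gender = c("Female", "Male"),
                              Partai = c("Democrat", "Republican", "Independent")))

ct <- chisq.test(tab, correct = FALSE)

# Cell contribution computed directly: (O - E)^2 / E
kontrib_demo <- (tab - ct$expected)^2 / ct$expected

# Same quantity via Pearson residuals: r = (O - E) / sqrt(E), so r^2 = (O - E)^2 / E
kontrib_resid <- ct$residuals^2

# Percentage share of each cell; the shares sum to 100
persen_demo <- kontrib_demo / sum(kontrib_demo) * 100
print(round(persen_demo, 2))
```

Because the contributions are squared residuals, the sign of the deviation is lost; the residuals themselves show whether a cell is over- or under-represented relative to independence.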
kont_all <- as.data.frame(as.table(round(persen_kontrib,2)))
colnames(kont_all) <- c("Gender","Partai","Kontribusi")
kont_all$Sel <- paste(kont_all$Gender, kont_all$Partai, sep="\n")
kont_all$Partai <- factor(kont_all$Partai,
levels=c("Democrat","Republican","Independent"))
ggplot(kont_all, aes(x=reorder(Sel,-Kontribusi), y=Kontribusi, fill=Partai)) +
geom_col(alpha=0.9, color="white", linewidth=0.5) +
geom_text(aes(label=paste0(round(Kontribusi,1),"%")),
vjust=-0.4, size=4.2, fontface="bold") +
scale_fill_manual(values=c("Democrat" = "#2166ac",
"Republican" = "#d6604d",
"Independent" = "#1a9850")) +
scale_y_continuous(limits=c(0,35)) +
labs(title="Kontribusi Setiap Sel terhadap Chi-Square Total",
x="Sel (Gender x Partai)", y="Kontribusi (%)", fill="Partai") +
theme_minimal(base_size=13) +
theme(plot.title = element_text(face="bold", size=14),
panel.grid.major.x = element_blank())

Figure 4. Contribution of Each Cell to the Total Chi-Square
mosaic(tabel2,
shade = TRUE,
legend = TRUE,
main = "Mosaic Plot: Gender x Identifikasi Partai Politik",
labeling = labeling_border(rot_labels=c(0,0,0,0)),
gp = shading_hcl)

Figure 5. Mosaic Plot: Gender vs Political Party
prop2 <- as.data.frame(prop.table(tabel2, margin=1))
colnames(prop2) <- c("Gender","Partai","Proporsi")
prop2$Partai <- factor(prop2$Partai,
levels=c("Democrat","Republican","Independent"))
ggplot(prop2, aes(x=Gender, y=Proporsi, fill=Partai)) +
geom_col(position="fill", alpha=0.9, width=0.55,
color="white", linewidth=0.5) +
geom_text(aes(label=paste0(round(Proporsi*100,1),"%")),
position=position_fill(vjust=0.5),
color="white", fontface="bold", size=5) +
scale_fill_manual(values=c("Democrat" = "#2166ac",
"Republican" = "#d6604d",
"Independent" = "#1a9850")) +
scale_y_continuous(labels=percent_format()) +
labs(title = "Distribusi Identifikasi Partai per Gender",
subtitle = "Proporsi baris (row percentage)",
x=NULL, y="Proporsi", fill="Partai Politik") +
theme_minimal(base_size=13) +
theme(plot.title = element_text(face="bold", size=14),
plot.subtitle = element_text(color="grey40"),
panel.grid.major.x = element_blank())

Figure 6. Distribution of Party Identification Proportions by Gender
kesimpulan_df <- data.frame(
"Kasus" = c("Kasus 1 (2x2)","Kasus 2 (2x3)"),
"Variabel" = c("Merokok – Kanker Paru","Gender – Partai Politik"),
"Asosiasi" = c(paste0("RD=",round(RD,3),"; RR=",round(RR,3),"; OR=",round(OR,3)),
paste0("V=",round(cramer_V,3))),
# unname() drops the "X-squared" name so it does not leak into the row names
"Chi-Square" = c(round(chi2_stat,3), round(unname(cs2$statistic),3)),
"p-value" = c(format(p_chi, scientific=TRUE, digits=2),
format(cs2$p.value, scientific=TRUE, digits=2)),
"Keputusan" = c("Tolak H0","Tolak H0"),
"Temuan Utama" = c("OR=2,97; asosiasi kuat & signifikan",
"Democrat paling membedakan gender"),
check.names = FALSE
)
kable(kesimpulan_df,
caption = "Tabel 10. Ringkasan Kesimpulan Kedua Kasus",
align = "c") |>
kable_styling(bootstrap_options = c("striped","hover","bordered"),
full_width = TRUE) |>
row_spec(0, bold=TRUE, color="white", background="#2166ac") |>
column_spec(6, bold=TRUE, color="#1a7a1a")

| Kasus | Variabel | Asosiasi | Chi-Square | p-value | Keputusan | Temuan Utama |
|---|---|---|---|---|---|---|
| Kasus 1 (2x2) | Merokok – Kanker Paru | RD=0.252; RR=1.959; OR=2.974 | 19.129 | 1.2e-05 | Tolak H0 | OR=2,97; asosiasi kuat & signifikan |
| Kasus 2 (2x3) | Gender – Partai Politik | V=0.072 | 12.569 | 1.9e-03 | Tolak H0 | Democrat paling membedakan gender |
Both cases illustrate the value of a comprehensive inferential analysis: beyond the significance test, it also estimates association measures with their confidence intervals and examines cell contributions through residuals. In Case 2, for example, the test rejects H0 while \(V = 0.072\) indicates only a weak association. Taken together, these components give a fuller and more substantive picture of the relationship between categorical variables than a p-value alone.
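
Tabel 10 reports Cramér's V for Case 2. As a reminder of how that measure relates to the chi-square statistic, the sketch below computes \(V = \sqrt{\chi^2 / (n \cdot \min(r-1,\, c-1))}\) by hand on hypothetical counts (not the report's data). The helper name `cramers_v()` is introduced here for illustration only.

```r
# Hypothetical 2x3 counts, for illustration only (not the report's data)
tab <- matrix(c(60, 40, 50,
                40, 50, 60),
              nrow = 2, byrow = TRUE,
              dimnames = list(Gender = c("Female", "Male"),
                              Partai = c("Democrat", "Republican", "Independent")))

# Cramer's V from the uncorrected Pearson chi-square statistic
cramers_v <- function(m) {
  x2 <- unname(chisq.test(m, correct = FALSE)$statistic)
  n  <- sum(m)                          # total sample size
  k  <- min(nrow(m) - 1, ncol(m) - 1)   # equals 1 for any 2 x c table
  sqrt(x2 / (n * k))
}

v <- cramers_v(tab)
cat("Cramer's V:", round(v, 3), "\n")
```

For a table with two rows, `k` is 1 and the formula reduces to \(\sqrt{\chi^2 / n}\), so with a large sample a clearly significant \(\chi^2\) can still correspond to a small V.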