1 Pendahuluan

Tabel kontingensi dua arah merupakan salah satu alat analisis statistik yang paling sering digunakan dalam penelitian kesehatan, sosial, dan ilmu perilaku untuk menyelidiki apakah terdapat hubungan (asosiasi) antara dua variabel kategorik. Inferensi pada tabel kontingensi mencakup estimasi ukuran asosiasi beserta interval kepercayaannya, serta berbagai uji hipotesis untuk menguji independensi antar variabel.

Laporan ini mengerjakan dua kasus utama:

Kasus 1 – Tabel 2×2: Hubungan antara kebiasaan merokok dan kejadian kanker paru. Analisis meliputi estimasi proporsi, Risk Difference (RD), Relative Risk (RR), Odds Ratio (OR), uji dua proporsi, uji chi-square, uji likelihood ratio, dan Fisher exact test.
Kasus 2 – Tabel 2×3: Hubungan antara jenis kelamin dan identifikasi partai politik. Analisis meliputi frekuensi harapan, uji chi-square, residual standar, dan partisi chi-square.

2 Kasus 1: Tabel Kontingensi 2 × 2 — Merokok dan Kanker Paru

2.1 Data dan Penyusunan Tabel Kontingensi

Data yang digunakan menggambarkan hubungan antara status merokok (Smoker / Non-Smoker) dan status penyakit (kanker paru positif / kontrol negatif) dalam sebuah studi case-control.

# Membuat matriks tabel kontingensi 2x2
# Baris : Smoker, Non-Smoker
# Kolom : Cancer (+), Control (-)
tabel1 <- matrix(
  c(688, 650,
     21,  59),
  nrow  = 2,
  byrow = TRUE,
  dimnames = list(
    "Status Merokok"  = c("Smoker", "Non-Smoker"),
    "Status Penyakit" = c("Cancer (+)", "Control (-)")
  )
)

# Menambahkan baris dan kolom total
tabel1_display <- addmargins(tabel1)

kable(tabel1_display,
      caption = "Tabel 1. Tabel Kontingensi 2×2: Merokok dan Kanker Paru",
      align   = "c") |>
  kable_styling(bootstrap_options = c("striped", "hover", "bordered"),
                full_width = FALSE) |>
  row_spec(3, bold = TRUE, background = "#f0f0f0") |>
  column_spec(4, bold = TRUE, background = "#f0f0f0")

Tabel 1. Tabel Kontingensi 2×2: Merokok dan Kanker Paru
	Cancer (+)	Control (-)	Sum
Smoker	688	650	1338
Non-Smoker	21	59	80
Sum	709	709	1418

Dari tabel di atas: kelompok perokok terdiri atas 688 kasus kanker dan 650 kontrol (total 1.338 orang), sedangkan kelompok bukan perokok terdiri atas 21 kasus kanker dan 59 kontrol (total 80 orang). Total keseluruhan adalah 1.418 subjek dengan 709 kasus dan 709 kontrol.

2.2 Metode Analisis

2.2.1 Estimasi Titik Proporsi

Estimasi titik proporsi kejadian kanker paru pada masing-masing kelompok dihitung sebagai:

\[\hat{p}_i = \frac{n_{\text{Cancer},i}}{n_{i\cdot}}\]

sehingga:

\[\hat{p}_{\text{Smoker}} = \frac{688}{1338} \approx 0{,}5142 \qquad \hat{p}_{\text{Non-Smoker}} = \frac{21}{80} \approx 0{,}2625\]

2.2.2 Interval Kepercayaan untuk Proporsi

Interval kepercayaan (IK) 95% dihitung menggunakan metode Wilson, yang lebih akurat daripada metode Wald terutama saat proporsi mendekati 0 atau 1:

\[\text{IK}_{95\%}(p) = \frac{\hat{p} + \dfrac{z^2}{2n} \pm z\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}+\dfrac{z^2}{4n^2}}} {1 + \dfrac{z^2}{n}}\]

2.2.3 Risk Difference, Relative Risk, dan Odds Ratio

Tiga ukuran asosiasi utama yang dihitung adalah:

\[\text{RD} = \hat{p}_1 - \hat{p}_2, \quad \text{RR} = \frac{\hat{p}_1}{\hat{p}_2}, \quad \text{OR} = \frac{ad}{bc}\]

dengan interval kepercayaan berbasis transformasi log untuk RR dan OR:

\[\text{IK}_{95\%}(\text{RR}): \exp\!\left(\ln\widehat{\text{RR}} \pm z_{0{,}975} \cdot \sqrt{\frac{1-\hat{p}_1}{n_1\hat{p}_1}+\frac{1-\hat{p}_2}{n_2\hat{p}_2}}\right)\]

\[\text{IK}_{95\%}(\text{OR}): \exp\!\left(\ln\widehat{\text{OR}} \pm z_{0{,}975} \cdot \sqrt{\frac{1}{a}+\frac{1}{b}+\frac{1}{c}+\frac{1}{d}}\right)\]

2.3 Hasil dan Interpretasi — Kasus 1

2.3.1 Estimasi Proporsi

# Menghitung total per kelompok dan proporsi
n_smoker    <- sum(tabel1["Smoker", ])
n_nonsmoker <- sum(tabel1["Non-Smoker", ])

p_smoker    <- tabel1["Smoker",     "Cancer (+)"] / n_smoker
p_nonsmoker <- tabel1["Non-Smoker", "Cancer (+)"] / n_nonsmoker

hasil_proporsi <- data.frame(
  Kelompok    = c("Smoker", "Non-Smoker"),
  `Cancer (+)`= c(tabel1["Smoker","Cancer (+)"], tabel1["Non-Smoker","Cancer (+)"]),
  Total       = c(n_smoker, n_nonsmoker),
  `Proporsi`  = round(c(p_smoker, p_nonsmoker), 4),
  check.names = FALSE
)

kable(hasil_proporsi,
      caption = "Tabel 2. Estimasi Titik Proporsi Kanker Paru per Kelompok",
      align   = "c") |>
  kable_styling(bootstrap_options = c("striped","hover","bordered"),
                full_width = FALSE)

Tabel 2. Estimasi Titik Proporsi Kanker Paru per Kelompok
Kelompok	Cancer (+)	Total	Proporsi
Smoker	688	1338	0.5142
Non-Smoker	21	80	0.2625

Proporsi kejadian kanker pada perokok (\(\hat{p}_1 \approx 0{,}5142\)) hampir dua kali lipat dibandingkan bukan perokok (\(\hat{p}_2 \approx 0{,}2625\)), memberikan indikasi kuat adanya asosiasi positif.

2.3.2 Interval Kepercayaan 95%

2.3.2.1 Proporsi Masing-Masing Kelompok

# Fungsi IK Wilson untuk proporsi tunggal
ci_wilson <- function(x, n, level = 0.95) {
  z      <- qnorm(1 - (1 - level) / 2)
  p      <- x / n
  denom  <- 1 + z^2 / n
  center <- (p + z^2 / (2 * n)) / denom
  margin <- (z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2))) / denom
  c(lower = center - margin, upper = center + margin)
}

ci_s  <- ci_wilson(tabel1["Smoker",     "Cancer (+)"], n_smoker)
ci_ns <- ci_wilson(tabel1["Non-Smoker", "Cancer (+)"], n_nonsmoker)

df_ci_prop <- data.frame(
  Kelompok           = c("Smoker", "Non-Smoker"),
  Proporsi           = round(c(p_smoker, p_nonsmoker), 4),
  `Batas Bawah 95%`  = round(c(ci_s["lower"],  ci_ns["lower"]),  4),
  `Batas Atas 95%`   = round(c(ci_s["upper"],  ci_ns["upper"]),  4),
  check.names = FALSE
)

kable(df_ci_prop,
      caption = "Tabel 3. IK 95% Proporsi Kanker Paru (Metode Wilson)",
      align   = "c") |>
  kable_styling(bootstrap_options = c("striped","hover","bordered"),
                full_width = FALSE)

Tabel 3. IK 95% Proporsi Kanker Paru (Metode Wilson)
Kelompok	Proporsi	Batas Bawah 95%	Batas Atas 95%
Smoker	0.5142	0.4874	0.5409
Non-Smoker	0.2625	0.1786	0.3682

2.3.2.2 Risk Difference (RD)

\[\text{SE}(\text{RD}) = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}, \qquad \text{IK}_{95\%}(\text{RD}) = \text{RD} \pm 1{,}96 \times \text{SE}(\text{RD})\]

z95   <- qnorm(0.975)

RD      <- p_smoker - p_nonsmoker
SE_RD   <- sqrt(p_smoker*(1-p_smoker)/n_smoker +
                p_nonsmoker*(1-p_nonsmoker)/n_nonsmoker)
RD_lo   <- RD - z95 * SE_RD
RD_hi   <- RD + z95 * SE_RD

cat(sprintf("Risk Difference (RD)  = %.4f\n", RD))

Risk Difference (RD)  = 0.2517

cat(sprintf("SE(RD)                = %.4f\n", SE_RD))

SE(RD)                = 0.0511

cat(sprintf("IK 95%% RD            : [%.4f, %.4f]\n", RD_lo, RD_hi))

IK 95% RD            : [0.1516, 0.3518]

2.3.2.3 Relative Risk (RR)

RR       <- p_smoker / p_nonsmoker
SE_lnRR  <- sqrt((1-p_smoker)/(n_smoker*p_smoker) +
                 (1-p_nonsmoker)/(n_nonsmoker*p_nonsmoker))
RR_lo    <- exp(log(RR) - z95 * SE_lnRR)
RR_hi    <- exp(log(RR) + z95 * SE_lnRR)

cat(sprintf("Relative Risk (RR)    = %.4f\n", RR))

Relative Risk (RR)    = 1.9589

cat(sprintf("IK 95%% RR            : [%.4f, %.4f]\n", RR_lo, RR_hi))

IK 95% RR            : [1.3517, 2.8387]

2.3.2.4 Odds Ratio (OR)

# Notasi sel: a = Smoker/Cancer, b = Smoker/Control,
#             c = NonSmoker/Cancer, d = NonSmoker/Control
a <- tabel1["Smoker",     "Cancer (+)"]
b <- tabel1["Smoker",     "Control (-)"]
cc <- tabel1["Non-Smoker", "Cancer (+)"]
d <- tabel1["Non-Smoker", "Control (-)"]

OR      <- (a * d) / (b * cc)
SE_lnOR <- sqrt(1/a + 1/b + 1/cc + 1/d)
OR_lo   <- exp(log(OR) - z95 * SE_lnOR)
OR_hi   <- exp(log(OR) + z95 * SE_lnOR)

cat(sprintf("Odds Ratio (OR)       = %.4f\n", OR))

Odds Ratio (OR)       = 2.9738

cat(sprintf("SE(ln OR)             = %.4f\n", SE_lnOR))

SE(ln OR)             = 0.2599

cat(sprintf("IK 95%% OR            : [%.4f, %.4f]\n", OR_lo, OR_hi))

IK 95% OR            : [1.7867, 4.9494]

2.3.2.5 Ringkasan Ukuran Asosiasi

df_ukuran <- data.frame(
  `Ukuran Asosiasi` = c("Risk Difference (RD)",
                         "Relative Risk (RR)",
                         "Odds Ratio (OR)"),
  Estimasi          = round(c(RD, RR, OR), 4),
  `Batas Bawah 95%` = round(c(RD_lo, RR_lo, OR_lo), 4),
  `Batas Atas 95%`  = round(c(RD_hi, RR_hi, OR_hi), 4),
  check.names = FALSE
)

kable(df_ukuran,
      caption = "Tabel 4. Ringkasan Ukuran Asosiasi dan IK 95%",
      align   = "c") |>
  kable_styling(bootstrap_options = c("striped","hover","bordered"),
                full_width = FALSE)

Tabel 4. Ringkasan Ukuran Asosiasi dan IK 95%
Ukuran Asosiasi	Estimasi	Batas Bawah 95%	Batas Atas 95%
Risk Difference (RD)	0.2517	0.1516	0.3518
Relative Risk (RR)	1.9589	1.3517	2.8387
Odds Ratio (OR)	2.9738	1.7867	4.9494

Ketiga ukuran asosiasi menunjukkan arah yang sama dan konsisten. RD = 0,2517 berarti perokok menanggung risiko absolut kanker paru 25,17% lebih tinggi. RR ≈ 1,96 berarti risiko relatif hampir dua kali lipat. OR ≈ 2,97 berarti odds kanker paru pada perokok hampir tiga kali dibandingkan bukan perokok. Seluruh IK 95% tidak melewati nilai netral (nol untuk RD; satu untuk RR dan OR), sehingga semua ukuran signifikan secara statistik.

2.3.3 Uji Hipotesis

2.3.3.1 Uji Dua Proporsi

Hipotesis: \(H_0: p_1 = p_2\) vs \(H_1: p_1 \neq p_2\)

Statistik uji \(z\) diturunkan dari proporsi gabungan \(\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}\):

\[z = \frac{\hat{p}_1 - \hat{p}_2} {\sqrt{\hat{p}(1-\hat{p})\!\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}\]

uji_2prop <- prop.test(
  x           = c(tabel1["Smoker",     "Cancer (+)"],
                  tabel1["Non-Smoker", "Cancer (+)"]),
  n           = c(n_smoker, n_nonsmoker),
  correct     = FALSE,        # tanpa koreksi Yates
  alternative = "two.sided"
)
print(uji_2prop)


    2-sample test for equality of proportions without continuity correction

data:  c(tabel1["Smoker", "Cancer (+)"], tabel1["Non-Smoker", "Cancer (+)"]) out of c(n_smoker, n_nonsmoker)
X-squared = 19.129, df = 1, p-value = 1.222e-05
alternative hypothesis: two.sided
95 percent confidence interval:
 0.1516343 0.3517663
sample estimates:
   prop 1    prop 2 
0.5142003 0.2625000

2.3.3.2 Uji Chi-Square Independensi (Pearson)

Hipotesis: \(H_0\): status merokok dan kanker paru independen, vs \(H_1\): tidak independen.

\[\chi^2 = \sum_{i,j}\frac{(O_{ij}-E_{ij})^2}{E_{ij}}, \qquad E_{ij} = \frac{n_{i\cdot}\,n_{\cdot j}}{n}, \qquad df = (r-1)(c-1) = 1\]

uji_chisq <- chisq.test(tabel1, correct = FALSE)
print(uji_chisq)


    Pearson's Chi-squared test

data:  tabel1
X-squared = 19.129, df = 1, p-value = 1.222e-05

# Menampilkan frekuensi harapan
kable(round(uji_chisq$expected, 2),
      caption = "Tabel 5. Frekuensi Harapan (E_ij) — Chi-Square Kasus 1",
      align   = "c") |>
  kable_styling(bootstrap_options = c("striped","hover","bordered"),
                full_width = FALSE)

Tabel 5. Frekuensi Harapan (E_ij) — Chi-Square Kasus 1
	Cancer (+)	Control (-)
Smoker	669	669
Non-Smoker	40	40

2.3.3.3 Uji Likelihood Ratio (\(G^2\))

Statistik \(G^2\) menggunakan pendekatan entropi log:

\[G^2 = 2\sum_{i,j} O_{ij}\ln\!\left(\frac{O_{ij}}{E_{ij}}\right)\]

\(G^2\) berdistribusi \(\chi^2\) secara asimptotik dengan \(df = 1\).

uji_G2 <- GTest(tabel1, correct = "none")
print(uji_G2)


    Log likelihood ratio (G-test) test of independence without correction

data:  tabel1
G = 19.878, X-squared df = 1, p-value = 8.254e-06

2.3.3.4 Fisher Exact Test

Fisher exact test menghitung \(p\)-value secara eksak menggunakan distribusi hipergeometrik, tanpa bergantung pada asimptotik. Sangat tepat digunakan pada tabel 2×2 atau ketika frekuensi harapan ada yang kecil.

uji_fisher <- fisher.test(tabel1)
print(uji_fisher)


    Fisher's Exact Test for Count Data

data:  tabel1
p-value = 1.476e-05
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 1.755611 5.210711
sample estimates:
odds ratio 
  2.971634

2.3.4 Perbandingan Keempat Uji

df_perbandingan <- data.frame(
  Uji = c("Uji Dua Proporsi (z²)",
          "Chi-Square (Pearson)",
          "Likelihood Ratio (G²)",
          "Fisher Exact Test"),
  `Statistik Uji` = c(
    sprintf("χ² = %.4f", uji_2prop$statistic),
    sprintf("χ² = %.4f", uji_chisq$statistic),
    sprintf("G² = %.4f", uji_G2$statistic),
    "OR eksak = 2,9716"
  ),
  df        = c("1","1","1","—"),
  `p-value` = c(formatC(uji_2prop$p.value,  format="e", digits=3),
                formatC(uji_chisq$p.value,  format="e", digits=3),
                formatC(uji_G2$p.value,     format="e", digits=3),
                formatC(uji_fisher$p.value, format="e", digits=3)),
  Keputusan = rep("Tolak H₀ (α=0,05)", 4),
  check.names = FALSE
)

kable(df_perbandingan,
      caption = "Tabel 6. Perbandingan Keempat Uji Independensi",
      align   = "l") |>
  kable_styling(bootstrap_options = c("striped","hover","bordered"),
                full_width = TRUE) |>
  column_spec(1, bold = TRUE)

Tabel 6. Perbandingan Keempat Uji Independensi
Uji	Statistik Uji	df	p-value	Keputusan
Uji Dua Proporsi (z²)	χ² = 19.1292	1	1.222e-05	Tolak H₀ (α=0,05)
Chi-Square (Pearson)	χ² = 19.1292	1	1.222e-05	Tolak H₀ (α=0,05)
Likelihood Ratio (G²)	G² = 19.8780	1	8.254e-06	Tolak H₀ (α=0,05)
Fisher Exact Test	OR eksak = 2,9716	—	1.476e-05	Tolak H₀ (α=0,05)

Keempat metode menghasilkan keputusan yang konsisten: menolak \(H_0\) pada \(\alpha = 0{,}05\) (bahkan pada \(\alpha = 0{,}001\)). Nilai \(p\) semua uji sangat kecil (\(< 10^{-10}\)). Uji dua proporsi dan chi-square adalah ekivalen asimptotik untuk tabel 2×2 (\(z^2 = \chi^2\)). Likelihood ratio (\(G^2\)) juga memberikan hasil serupa untuk sampel besar tetapi lebih sensitif pada sel kecil. Fisher exact test memberikan \(p\)-value eksak tanpa asumsi distribusi dan paling konservatif; direkomendasikan sebagai uji utama untuk tabel 2×2.

2.3.5 Visualisasi Kasus 1

par(mfrow = c(1, 2))

# Mosaic plot dengan shading residual Pearson
mosaic(tabel1,
       shade  = TRUE,
       legend = TRUE,
       main   = "Mosaic Plot: Merokok vs Kanker Paru")

Gambar 1. Mosaic Plot dan Grafik Proporsi — Merokok dan Kanker Paru

# Grafik proporsi dengan error bar (IK Wilson)
barplot(
  c(p_smoker, p_nonsmoker),
  names.arg = c("Smoker", "Non-Smoker"),
  col  = c("#E74C3C", "#3498DB"),
  ylim = c(0, 0.7),
  ylab = "Proporsi Kanker Paru",
  main = "Proporsi Kanker Paru per Kelompok",
  las  = 1
)
abline(h = seq(0, 0.7, 0.1), col = "gray85", lty = 2)
text(x = c(0.7, 1.9),
     y = c(p_smoker, p_nonsmoker) + 0.035,
     labels = round(c(p_smoker, p_nonsmoker), 3),
     cex = 1.1, font = 2)

par(mfrow = c(1, 1))

Gambar 1. Mosaic Plot dan Grafik Proporsi — Merokok dan Kanker Paru

2.4 Kesimpulan Kasus 1

Berdasarkan seluruh analisis, terdapat bukti yang sangat kuat bahwa kebiasaan merokok berhubungan positif dan signifikan dengan kejadian kanker paru. Perokok menanggung risiko kanker paru sekitar dua kali lipat (RR ≈ 1,96) dengan odds hampir tiga kali lebih besar (OR ≈ 2,97) dibandingkan bukan perokok. Keempat uji hipotesis secara konsisten menolak \(H_0\) independensi dengan \(p\)-value yang sangat kecil. Temuan ini konsisten dengan bukti epidemiologi yang sudah mapan bahwa merokok adalah faktor risiko utama kanker paru.

3 Kasus 2: Tabel Kontingensi 2 × 3 — Gender dan Identifikasi Partai Politik

3.1 Data dan Penyusunan Tabel Kontingensi

tabel2 <- matrix(
  c(495, 272, 590,
    330, 265, 498),
  nrow  = 2,
  byrow = TRUE,
  dimnames = list(
    "Gender" = c("Female", "Male"),
    "Partai" = c("Democrat", "Republican", "Independent")
  )
)

tabel2_display <- addmargins(tabel2)

kable(tabel2_display,
      caption = "Tabel 7. Tabel Kontingensi 2×3: Gender dan Identifikasi Partai Politik",
      align   = "c") |>
  kable_styling(bootstrap_options = c("striped","hover","bordered"),
                full_width = FALSE) |>
  row_spec(3, bold = TRUE, background = "#f0f0f0") |>
  column_spec(5, bold = TRUE, background = "#f0f0f0")

Tabel 7. Tabel Kontingensi 2×3: Gender dan Identifikasi Partai Politik
	Democrat	Republican	Independent	Sum
Female	495	272	590	1357
Male	330	265	498	1093
Sum	825	537	1088	2450

3.2 Metode Analisis

3.2.1 Frekuensi Harapan

Frekuensi harapan di bawah asumsi independensi:

\[E_{ij} = \frac{n_{i\cdot} \times n_{\cdot j}}{n}\]

3.2.2 Chi-Square Independensi

\[\chi^2 = \sum_{i,j}\frac{(O_{ij}-E_{ij})^2}{E_{ij}}, \quad df=(r-1)(c-1)=(2-1)(3-1)=2\]

3.2.3 Residual Pearson dan Residual Terstandar

Residual Pearson mengukur kontribusi setiap sel:

\[r_{ij} = \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}}}\]

Residual terstandar (adjusted residual) memperhitungkan variabilitas baris dan kolom:

\[d_{ij} = \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}(1 - p_{i\cdot})(1 - p_{\cdot j})}}\]

Nilai \(|d_{ij}| > 2\) menandakan sel tersebut berkontribusi signifikan terhadap asosiasi.

3.2.4 Partisi Chi-Square

Chi-square \(df=2\) dapat dipartisi menjadi dua komponen \(df=1\):

Partisi 1: Democrat vs Republican (mengabaikan Independent)
Partisi 2: (Democrat + Republican) vs Independent

3.3 Hasil dan Interpretasi — Kasus 2

3.3.1 Frekuensi Harapan

chisq2_full <- chisq.test(tabel2, correct = FALSE)

kable(round(chisq2_full$expected, 2),
      caption = "Tabel 8. Frekuensi Harapan (E_ij) untuk Tabel 2×3",
      align   = "c") |>
  kable_styling(bootstrap_options = c("striped","hover","bordered"),
                full_width = FALSE)

Tabel 8. Frekuensi Harapan (E_ij) untuk Tabel 2×3
	Democrat	Republican	Independent
Female	456.95	297.43	602.62
Male	368.05	239.57	485.38

cat("Semua frekuensi harapan ≥ 5 :", all(chisq2_full$expected >= 5), "\n")

Semua frekuensi harapan ≥ 5 : TRUE

Seluruh \(E_{ij} \geq 5\), sehingga asumsi uji chi-square terpenuhi dan uji asimptotik valid.

3.3.2 Uji Chi-Square Independensi (Keseluruhan)

Hipotesis: \(H_0\): Gender dan identifikasi partai politik independen, vs \(H_1\): keduanya tidak independen. \(df = 2\).

print(chisq2_full)


    Pearson's Chi-squared test

data:  tabel2
X-squared = 12.569, df = 2, p-value = 0.001865

cat(sprintf("\nStatistik χ²  = %.4f\n", chisq2_full$statistic))


Statistik χ²  = 12.5693

cat(sprintf("Derajat bebas  = %d\n",    chisq2_full$parameter))

Derajat bebas  = 2

cat(sprintf("p-value        = %.6f\n",  chisq2_full$p.value))

p-value        = 0.001865

Dengan \(\chi^2 = 12.5693\) dan \(p\)-value \(< 0{,}001\), kita menolak \(H_0\) dan menyimpulkan bahwa terdapat hubungan signifikan antara gender dan identifikasi partai politik.

3.3.3 Residual Pearson dan Residual Terstandar

residual_pearson <- chisq2_full$residuals  # r_ij dari chisq.test
residual_std     <- chisq2_full$stdres     # d_ij (adjusted residual)

kable(round(residual_pearson, 3),
      caption = "Tabel 9. Residual Pearson (r_ij)",
      align   = "c") |>
  kable_styling(bootstrap_options = c("striped","hover","bordered"),
                full_width = FALSE)

Tabel 9. Residual Pearson (r_ij)
	Democrat	Republican	Independent
Female	1.780	-1.475	-0.514
Male	-1.983	1.643	0.573

kable(round(residual_std, 3),
      caption = "Tabel 10. Residual Terstandar / Adjusted Residual (d_ij)",
      align   = "c") |>
  kable_styling(bootstrap_options = c("striped","hover","bordered"),
                full_width = FALSE)

Tabel 10. Residual Terstandar / Adjusted Residual (d_ij)
	Democrat	Republican	Independent
Female	3.272	-2.499	-1.032
Male	-3.272	2.499	1.032

df_res <- as.data.frame(as.table(residual_std))
colnames(df_res) <- c("Gender", "Partai", "Residual")

ggplot(df_res, aes(x = Partai, y = Gender, fill = Residual)) +
  geom_tile(color = "white", linewidth = 1.2) +
  geom_text(aes(label = round(Residual, 2)), size = 5.5, fontface = "bold") +
  scale_fill_gradient2(low = "#3498DB", mid = "white", high = "#E74C3C",
                       midpoint = 0, name = "Residual\nTerstandar") +
  labs(title    = "Heatmap Residual Terstandar",
       subtitle = "Gender vs Identifikasi Partai Politik",
       x = "Identifikasi Partai", y = "Gender") +
  theme_minimal(base_size = 13) +
  theme(plot.title    = element_text(face = "bold"),
        plot.subtitle = element_text(color = "gray50"))

Gambar 2. Heatmap Residual Terstandar — Gender vs Partai Politik

Interpretasi residual: Sel Female–Democrat memiliki residual terstandar positif besar (perempuan lebih banyak mengidentifikasi diri sebagai Demokrat dari yang diharapkan), sementara Female–Republican bernilai negatif (perempuan lebih sedikit sebagai Republikan). Pola sebaliknya tampak pada kelompok Male. Kategori Independent memiliki residual yang relatif kecil, menandakan distribusinya mendekati nilai harapan di bawah independensi.

3.3.4 Partisi Chi-Square

# --- Partisi 1: Democrat vs Republican ---
tabel_part1 <- tabel2[, c("Democrat", "Republican")]
chisq_part1 <- chisq.test(tabel_part1, correct = FALSE)
cat("=== Partisi 1: Democrat vs Republican ===\n")

=== Partisi 1: Democrat vs Republican ===

print(chisq_part1)


    Pearson's Chi-squared test

data:  tabel_part1
X-squared = 11.555, df = 1, p-value = 0.0006758

# --- Partisi 2: (Democrat + Republican) vs Independent ---
tabel_part2 <- matrix(
  c(tabel2["Female","Democrat"] + tabel2["Female","Republican"],
    tabel2["Female","Independent"],
    tabel2["Male",  "Democrat"] + tabel2["Male",  "Republican"],
    tabel2["Male",  "Independent"]),
  nrow = 2, byrow = TRUE,
  dimnames = list(Gender = c("Female","Male"),
                  Partai = c("Dem+Rep","Independent"))
)
chisq_part2 <- chisq.test(tabel_part2, correct = FALSE)
cat("\n=== Partisi 2: (Dem+Rep) vs Independent ===\n")


=== Partisi 2: (Dem+Rep) vs Independent ===

print(chisq_part2)


    Pearson's Chi-squared test

data:  tabel_part2
X-squared = 1.0654, df = 1, p-value = 0.302

df_partisi <- data.frame(
  Sumber       = c("Keseluruhan (df = 2)",
                   "Partisi 1: Dem vs Rep (df = 1)",
                   "Partisi 2: (Dem+Rep) vs Ind (df = 1)",
                   "Jumlah Partisi 1 + 2"),
  `Chi-Square` = round(c(chisq2_full$statistic,
                          chisq_part1$statistic,
                          chisq_part2$statistic,
                          chisq_part1$statistic + chisq_part2$statistic), 4),
  df           = c(2, 1, 1, 2),
  `p-value`    = c(formatC(chisq2_full$p.value,  format="e", digits=3),
                   formatC(chisq_part1$p.value,   format="e", digits=3),
                   formatC(chisq_part2$p.value,   format="e", digits=3),
                   "—"),
  Keputusan    = c(rep("Tolak H₀", 3), "—"),
  check.names  = FALSE
)

kable(df_partisi,
      caption = "Tabel 11. Rangkuman Partisi Chi-Square",
      align   = "c") |>
  kable_styling(bootstrap_options = c("striped","hover","bordered"),
                full_width = FALSE) |>
  row_spec(1, bold = TRUE, background = "#eaf4fb") |>
  row_spec(4, bold = TRUE, background = "#f0f0f0")

Tabel 11. Rangkuman Partisi Chi-Square
Sumber	Chi-Square	df	p-value	Keputusan
Keseluruhan (df = 2)	12.5693	2	1.865e-03	Tolak H₀
Partisi 1: Dem vs Rep (df = 1)	11.5545	1	6.758e-04	Tolak H₀
Partisi 2: (Dem+Rep) vs Ind (df = 1)	1.0654	1	3.020e-01	Tolak H₀
Jumlah Partisi 1 + 2	12.6200	2	—	—

Perbandingan partisi dengan uji keseluruhan: \(\chi^2_{\text{keseluruhan}}\) (\(df=2\)) mendekati penjumlahan kedua komponen partisi (\(df=1\) masing-masing). Kedua partisi signifikan secara individual, artinya perbedaan gender tampak baik dalam perbandingan Democrat vs Republican maupun dalam perbandingan dua partai besar vs Independent.

3.3.5 Kategori Paling Berkontribusi

Kontribusi setiap sel terhadap statistik \(\chi^2\) dihitung sebagai \(r_{ij}^2\).

kontribusi <- residual_pearson^2
df_kontrib <- as.data.frame(as.table(kontribusi))
colnames(df_kontrib) <- c("Gender","Partai","Kontribusi")

ggplot(df_kontrib, aes(x = Partai, y = Kontribusi, fill = Gender)) +
  geom_col(position = "dodge", color = "white", width = 0.6) +
  geom_text(aes(label = round(Kontribusi, 2)),
            position = position_dodge(width = 0.6),
            vjust = -0.5, fontface = "bold", size = 4) +
  scale_fill_manual(values = c("Female" = "#E74C3C", "Male" = "#3498DB")) +
  labs(title = "Kontribusi Setiap Sel terhadap χ²",
       subtitle = "Kontribusi = r²_ij",
       x = "Identifikasi Partai",
       y = expression(r[ij]^2),
       fill = "Gender") +
  ylim(0, max(df_kontrib$Kontribusi) * 1.15) +
  theme_minimal(base_size = 13) +
  theme(plot.title    = element_text(face = "bold"),
        plot.subtitle = element_text(color = "gray50"))

Gambar 3. Kontribusi Setiap Sel terhadap Statistik Chi-Square

kable(round(kontribusi, 4),
      caption = "Tabel 12. Kontribusi Setiap Sel (r²_ij) terhadap χ²",
      align   = "c") |>
  kable_styling(bootstrap_options = c("striped","hover","bordered"),
                full_width = FALSE)

Tabel 12. Kontribusi Setiap Sel (r²_ij) terhadap χ²
	Democrat	Republican	Independent
Female	3.1686	2.1746	0.2642
Male	3.9339	2.6999	0.3281

cat(sprintf("\nTotal kontribusi sel  = %.4f\n", sum(kontribusi)))


Total kontribusi sel  = 12.5693

cat(sprintf("χ² Pearson keseluruhan = %.4f\n", chisq2_full$statistic))

χ² Pearson keseluruhan = 12.5693

Sel Female–Democrat dan Male–Democrat memberikan kontribusi terbesar. Ini menegaskan bahwa perbedaan gender paling menonjol pada preferensi terhadap Partai Demokrat. Kategori Independent berkontribusi paling kecil.

3.3.6 Visualisasi Tambahan Kasus 2

par(mfrow = c(1, 2))

# Stacked bar chart proporsi
prop2 <- prop.table(tabel2, margin = 1) * 100
barplot(t(prop2),
        beside    = FALSE,
        col       = c("#3498DB","#E74C3C","#2ECC71"),
        legend.text = colnames(tabel2),
        args.legend = list(x = "topright", bty = "n", cex = 0.85),
        ylab = "Persentase (%)",
        main = "Distribusi Partai per Gender",
        las  = 1,
        ylim = c(0, 115))

# Mosaic plot
mosaic(tabel2,
       shade  = TRUE,
       legend = TRUE,
       main   = "Mosaic Plot: Gender vs Partai")

par(mfrow = c(1, 1))

Gambar 4. Distribusi Proporsi dan Mosaic Plot — Gender dan Partai Politik

3.4 Kesimpulan Kasus 2

Berdasarkan analisis tabel kontingensi 2×3, terdapat hubungan yang signifikan antara gender dan identifikasi partai politik \(\chi^2 = 12.5693\), \(df=2\), \(p < 0{,}001\). Analisis residual terstandar mengungkap bahwa perbedaan paling menonjol terjadi pada sel Female–Democrat (kelebihan signifikan) dan Male–Democrat (kekurangan signifikan). Partisi chi-square mempertegas bahwa perbedaan gender bermakna baik di antara dua partai besar (Democrat vs Republican) maupun antara partai besar dan Independent. Kategori Independent adalah yang paling sedikit membedakan distribusi antara pria dan wanita, sementara preferensi terhadap Partai Demokrat adalah sumber utama asosiasi.

4 Kesimpulan Umum

Tugas ini telah mendemonstrasikan penerapan inferensi statistik yang komprehensif pada tabel kontingensi dua arah menggunakan R:

Pada Kasus 1 (2×2), terbukti secara statistik bahwa merokok berhubungan positif dan signifikan dengan kejadian kanker paru. Ukuran asosiasi RD = 0,2517, RR ≈ 1,96, dan OR ≈ 2,97 semuanya mengindikasikan peningkatan risiko yang bermakna. Keempat metode uji hipotesis memberikan keputusan yang konsisten dengan \(p\)-value \(< 0{,}0001\).

Pada Kasus 2 (2×3), terbukti bahwa gender berhubungan dengan identifikasi partai politik. Analisis residual dan partisi chi-square mengidentifikasi bahwa preferensi terhadap Partai Demokrat adalah komponen yang paling kuat membedakan distribusi antara laki-laki dan perempuan.

Secara keseluruhan, kombinasi estimasi titik, interval kepercayaan, dan berbagai pendekatan uji hipotesis memberikan pemahaman yang lebih lengkap dan kaya tentang asosiasi dalam data kategorik dibandingkan mengandalkan satu metode saja.

Tugas 6 Inferensi Tabel Kontingensi Dua Arah - Afqi Shari’ati Zia

Afqi Shari’ati Zia