1. JACKKNIFE

Konsep

Jackknife adalah metode resampling yang bekerja dengan cara membuang satu observasi ke-i secara bergantian (leave-one-out), lalu menghitung ulang statistik pada setiap subset.

Tujuan utama:

  • Mengestimasi bias dari suatu statistik
  • Mengestimasi standard error dari suatu statistik

Formula:

  • Estimasi jackknife ke-i: \(\hat{\theta}_{(i)} = \hat{\theta}\) dihitung tanpa observasi ke-\(i\)
  • Bias: \(\widehat{\text{bias}} = (n-1)\left(\bar{\hat{\theta}}_{(\cdot)} - \hat{\theta}\right)\)
  • Standard Error: \(\widehat{SE} = \sqrt{\frac{n-1}{n} \sum_{i=1}^{n}\left(\hat{\theta}_{(i)} - \bar{\hat{\theta}}_{(\cdot)}\right)^2}\)

1.1 Jackknife dari Scratch

set.seed(123)
x <- rnorm(100)
n <- length(x)

# --- Statistik asli ---
theta.hat <- mean(x)

# --- Leave-one-out estimates ---
theta.jack <- numeric(n)
for (i in 1:n) {
  theta.jack[i] <- mean(x[-i])   # hitung mean tanpa obs ke-i
}

# --- Bias ---
bias <- (n - 1) * (mean(theta.jack) - theta.hat)

# --- Standard Error ---
jack.se <- sqrt((n - 1) * mean((theta.jack - mean(theta.jack))^2))

# --- Confidence Interval 95% ---
t.val  <- qt(0.025, df = n - 1, lower.tail = FALSE)
lower  <- theta.hat - t.val * jack.se
upper  <- theta.hat + t.val * jack.se

cat("Estimasi        :", round(theta.hat, 6), "\n")
## Estimasi        : 0.090406
cat("Bias Jackknife  :", round(bias, 6),      "\n")
## Bias Jackknife  : 0
cat("Standard Error  :", round(jack.se, 6),   "\n")
## Standard Error  : 0.091282
cat("CI 95%          : [", round(lower, 6), ",", round(upper, 6), "]\n")
## CI 95%          : [ -0.090717 , 0.271528 ]

1.2 Jackknife dengan Package bootstrap

# install.packages("bootstrap")
library(bootstrap)

# Definisikan fungsi statistik (harus menerima indeks)
theta_fn <- function(idx, data) mean(data[idx])

hasil_jack <- jackknife(1:n, theta_fn, x)

cat("Bias (package)  :", round(hasil_jack$jack.bias,  6), "\n")
## Bias (package)  : 0
cat("SE   (package)  :", round(hasil_jack$jack.se,    6), "\n")
## SE   (package)  : 0.091282

1.3 Perbandingan Scratch vs Package

Komponen Scratch (manual loop) Package bootstrap
Cara kerja Loop for, buang obs ke-i satu per satu jackknife(indices, fn, data)
Fleksibilitas Bisa untuk statistik apa pun (custom formula) Perlu wrapper fungsi berindeks
Bias formula (n-1) * (mean(jack) - theta.hat) Dihitung otomatis
SE formula sqrt((n-1)/n * sum(...)) Dihitung otomatis

Kesimpulan: Scratch lebih transparan dan cocok untuk ujian. Package lebih praktis di dunia nyata.


1.4 Fungsi Jackknife Universal (Reusable)

# Fungsi jackknife yang bisa menerima statistik apa pun
jknife <- function(x, fxn, ci = 0.95) {
  theta.hat  <- fxn(x)
  n          <- length(x)
  theta.jack <- numeric(n)

  for (i in 1:n) theta.jack[i] <- fxn(x[-i])

  bias   <- (n - 1) * (mean(theta.jack) - theta.hat)
  se     <- sqrt((n - 1) * mean((theta.jack - mean(theta.jack))^2))
  alpha  <- 1 - ci
  t.val  <- qt(alpha / 2, df = n - 1, lower.tail = FALSE)

  list(
    est        = theta.hat,
    bias       = bias,
    se         = se,
    ci         = c(theta.hat - t.val * se, theta.hat + t.val * se),
    replicates = theta.jack
  )
}

# Contoh: estimasi mean
hasil <- jknife(x, mean)
cat("Est:", round(hasil$est,  6), "\n")
## Est: 0.090406
cat("Bias:", round(hasil$bias, 6), "\n")
## Bias: 0
cat("SE:  ", round(hasil$se,   6), "\n")
## SE:   0.091282
cat("CI:  [", round(hasil$ci[1], 6), ",", round(hasil$ci[2], 6), "]\n")
## CI:  [ -0.090717 , 0.271528 ]
# Contoh: estimasi median
hasil_med <- jknife(x, median)
cat("\nJackknife Median:\n")
## 
## Jackknife Median:
cat("Est:", round(hasil_med$est, 6), "| Bias:", round(hasil_med$bias, 6),
    "| SE:", round(hasil_med$se, 6), "\n")
## Est: 0.061756 | Bias: 0 | SE: 0.087082

2. BOOTSTRAP

Konsep

Bootstrap adalah metode resampling yang bekerja dengan cara mengambil sampel dengan pengembalian (sampling with replacement) sebanyak \(B\) kali dari data asli.

Tujuan utama:

  • Mengestimasi distribusi sampling dari suatu statistik
  • Mengestimasi standard error dan confidence interval tanpa asumsi distribusi

Formula:

  • Ambil \(B\) sampel bootstrap \(x^{*(b)}\) dari \(x\) (dengan pengembalian, ukuran \(n\))
  • Hitung \(\hat{\theta}^{*(b)}\) untuk setiap sampel
  • SE Bootstrap: \(\widehat{SE}_B = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left(\hat{\theta}^{*(b)} - \bar{\hat{\theta}}^*\right)^2}\)
  • Bias Bootstrap: \(\widehat{\text{bias}}_B = \bar{\hat{\theta}}^* - \hat{\theta}\)

2.1 Bootstrap dari Scratch

set.seed(42)
x    <- c(25, 10, 3, 2, 7, 20, 30, 9, 15)
n    <- length(x)
B    <- 1000     # jumlah iterasi bootstrap

theta.hat   <- mean(x)
theta.boot  <- numeric(B)

for (b in 1:B) {
  x.star       <- sample(x, size = n, replace = TRUE)  # resample dengan pengembalian
  theta.boot[b] <- mean(x.star)
}

# Standard Error Bootstrap
boot.se   <- sd(theta.boot)

# Bias Bootstrap
boot.bias <- mean(theta.boot) - theta.hat

# Confidence Interval — Percentile Method
ci.lower  <- quantile(theta.boot, 0.025)
ci.upper  <- quantile(theta.boot, 0.975)

cat("Estimasi Asli   :", round(theta.hat, 4),  "\n")
## Estimasi Asli   : 13.4444
cat("Bias Bootstrap  :", round(boot.bias, 4),  "\n")
## Bias Bootstrap  : 0.0214
cat("SE Bootstrap    :", round(boot.se,   4),  "\n")
## SE Bootstrap    : 3.1541
cat("CI 95% (percentile): [", round(ci.lower, 4), ",", round(ci.upper, 4), "]\n")
## CI 95% (percentile): [ 7.8889 , 19.8917 ]
# Visualisasi distribusi bootstrap
hist(theta.boot, breaks = 30, col = "steelblue", border = "white",
     main = "Distribusi Bootstrap (Mean)", xlab = expression(hat(theta)^"*"))
abline(v = theta.hat, col = "red",    lwd = 2, lty = 2)
abline(v = ci.lower,  col = "orange", lwd = 2, lty = 3)
abline(v = ci.upper,  col = "orange", lwd = 2, lty = 3)
legend("topright",
       legend = c("Estimasi Asli", "CI 95%"),
       col    = c("red", "orange"), lty = c(2, 3), lwd = 2)


2.2 Bootstrap dengan Package boot

library(boot)

# Fungsi statistik HARUS menerima (data, indices)
mean_fn <- function(data, indices) mean(data[indices])

hasil_boot <- boot(data = x, statistic = mean_fn, R = 1000)
print(hasil_boot)
## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = x, statistic = mean_fn, R = 1000)
## 
## 
## Bootstrap Statistics :
##     original  bias    std. error
## t1* 13.44444   0.019    2.958989
# Confidence Interval (4 metode sekaligus)
boot.ci(hasil_boot, type = c("norm", "basic", "perc", "bca"))
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 1000 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = hasil_boot, type = c("norm", "basic", "perc", 
##     "bca"))
## 
## Intervals : 
## Level      Normal              Basic         
## 95%   ( 7.63, 19.22 )   ( 7.67, 19.00 )  
## 
## Level     Percentile            BCa          
## 95%   ( 7.89, 19.22 )   ( 8.11, 19.44 )  
## Calculations and Intervals on Original Scale

2.3 Perbandingan Scratch vs Package

Komponen Scratch (manual loop) Package boot
Cara kerja sample(..., replace=TRUE) dalam loop for boot(data, statistic, R)
Fungsi stat Tulis bebas di dalam loop Harus wrapper function(data, indices)
CI Manual quantile() (percentile) boot.ci() — 4 metode sekaligus
Output Vektor replikasi manual Objek boot dengan plot bawaan

Perbedaan kunci Jackknife vs Bootstrap: - Jackknife: deterministik, leave-one-out, cocok untuk bias & SE - Bootstrap: stokastik (random), sampling ulang \(B\) kali, lebih fleksibel untuk CI


3. OLS (Ordinary Least Squares)

Konsep

OLS adalah metode estimasi parameter regresi linear yang meminimalkan jumlah kuadrat residual antara nilai aktual \(y_i\) dan nilai prediksi \(\hat{y}_i\).

Model:

\[y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2)\]

Estimator OLS (solusi closed-form):

\[\hat{\boldsymbol{\beta}} = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}\]

Standard Error koefisien:

\[\widehat{Var}(\hat{\boldsymbol{\beta}}) = \hat{\sigma}^2 (\mathbf{X}^\top \mathbf{X})^{-1}, \quad \hat{\sigma}^2 = \frac{RSS}{n - p}\]

dengan \(p\) = jumlah parameter (termasuk intercept), \(RSS = \sum(y_i - \hat{y}_i)^2\).


3.1 OLS dari Scratch (Matrix Algebra)

# --- Data contoh ---
set.seed(123)
n   <- 50
x1  <- rnorm(n, mean = 5, sd = 2)
x2  <- rnorm(n, mean = 3, sd = 1)
y   <- 2 + 3*x1 - 1.5*x2 + rnorm(n, sd = 2)  # true: b0=2, b1=3, b2=-1.5

# --- Matriks desain X (dengan kolom 1 untuk intercept) ---
X   <- cbind(1, x1, x2)   # dimensi: n x 3
p   <- ncol(X)             # jumlah parameter

# --- Estimasi OLS: beta_hat = (X'X)^{-1} X'y ---
XtX      <- t(X) %*% X
Xty      <- t(X) %*% y
beta_hat <- solve(XtX) %*% Xty

cat("Koefisien OLS (scratch):\n")
## Koefisien OLS (scratch):
cat("  Intercept (b0):", round(beta_hat[1], 4), "\n")
##   Intercept (b0): 2.432
cat("  b1            :", round(beta_hat[2], 4), "\n")
##   b1            : 3.0252
cat("  b2            :", round(beta_hat[3], 4), "\n")
##   b2            : -1.8392
# --- Residual dan RSS ---
y_hat  <- X %*% beta_hat
resid  <- y - y_hat
RSS    <- sum(resid^2)

# --- Estimasi varians galat ---
sigma2 <- RSS / (n - p)

# --- Varians-kovarians koefisien ---
Var_beta <- sigma2 * solve(XtX)
SE_beta  <- sqrt(diag(Var_beta))

# --- t-statistik dan p-value ---
t_stat  <- beta_hat / SE_beta
p_val   <- 2 * pt(abs(t_stat), df = n - p, lower.tail = FALSE)

# --- R-squared ---
SST  <- sum((y - mean(y))^2)
R2   <- 1 - RSS / SST
R2_adj <- 1 - (1 - R2) * (n - 1) / (n - p)

# --- Tabel ringkasan ---
tabel_ols <- data.frame(
  Koefisien = c("Intercept", "x1", "x2"),
  Estimate  = round(beta_hat, 4),
  Std_Error = round(SE_beta,  4),
  t_value   = round(t_stat,   4),
  p_value   = round(p_val,    6)
)
print(tabel_ols)
##    Koefisien Estimate Std_Error t_value  p_value
##    Intercept   2.4320     1.314  1.8509 0.070481
## x1        x1   3.0252     0.154 19.6427 0.000000
## x2        x2  -1.8392     0.315 -5.8394 0.000000
cat("\nR-squared    :", round(R2, 4), "\n")
## 
## R-squared    : 0.9012
cat("Adj R-squared:", round(R2_adj, 4), "\n")
## Adj R-squared: 0.897
cat("Sigma (galat):", round(sqrt(sigma2), 4), "\n")
## Sigma (galat): 1.995

3.2 OLS dengan Fungsi lm()

df <- data.frame(y = y, x1 = x1, x2 = x2)

model_lm <- lm(y ~ x1 + x2, data = df)
summary(model_lm)
## 
## Call:
## lm(formula = y ~ x1 + x2, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.7627 -1.4811 -0.1275  1.0503  4.5409 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    2.432      1.314   1.851   0.0705 .  
## x1             3.025      0.154  19.643  < 2e-16 ***
## x2            -1.839      0.315  -5.839 4.71e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.995 on 47 degrees of freedom
## Multiple R-squared:  0.9012, Adjusted R-squared:  0.897 
## F-statistic: 214.4 on 2 and 47 DF,  p-value: < 2.2e-16
# Koefisien saja
coef(model_lm)
## (Intercept)          x1          x2 
##    2.432023    3.025152   -1.839216
# Confidence interval koefisien
confint(model_lm, level = 0.95)
##                  2.5 %    97.5 %
## (Intercept) -0.2114061  5.075452
## x1           2.7153258  3.334977
## x2          -2.4728442 -1.205588
# Nilai fitted dan residual
# fitted(model_lm)
# residuals(model_lm)

3.3 Perbandingan Scratch vs lm()

Komponen Scratch (matrix algebra) lm()
Estimasi \(\hat\beta\) solve(t(X) %*% X) %*% t(X) %*% y lm(y ~ x1 + x2)
SE koefisien sqrt(diag(sigma2 * solve(XtX))) Otomatis di summary()
1 - RSS/SST summary()$r.squared
CI Manual dari qt() × SE confint(model)
Kemampuan Transparan, cocok untuk ujian Ringkas, lengkap, siap pakai

3.4 Pemeriksaan Asumsi OLS

par(mfrow = c(2, 2))
plot(model_lm)

par(mfrow = c(1, 1))

# Uji normalitas residual
shapiro.test(residuals(model_lm))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(model_lm)
## W = 0.97735, p-value = 0.4463

4 plot diagnostik: 1. Residuals vs Fitted — cek linearitas & homoskedastisitas 2. Q-Q Plot — cek normalitas residual 3. Scale-Location — cek homoskedastisitas 4. Residuals vs Leverage — cek outlier/influential points


4. MLE (Maximum Likelihood Estimation)

Konsep

MLE mencari nilai parameter \(\theta\) yang memaksimalkan fungsi likelihood \(L(\theta)\), yaitu probabilitas mengamati data yang ada jika parameter adalah \(\theta\).

Ide inti:

\[\hat{\theta}_{MLE} = \arg\max_\theta \, L(\theta \mid \mathbf{x}) = \arg\max_\theta \prod_{i=1}^n f(x_i \mid \theta)\]

Dalam praktik, kita maksimalkan log-likelihood (lebih mudah secara komputasi):

\[\ell(\theta) = \log L(\theta) = \sum_{i=1}^n \log f(x_i \mid \theta)\]

MLE untuk distribusi Normal \(X \sim N(\mu, \sigma^2)\):

\[\hat{\mu}_{MLE} = \bar{x}, \qquad \hat{\sigma}^2_{MLE} = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2\]

(Catatan: \(\hat{\sigma}^2_{MLE}\) adalah estimator biased — pembaginya \(n\), bukan \(n-1\))


4.1 MLE dari Scratch — Distribusi Normal

set.seed(42)
x <- rnorm(100, mean = 5, sd = 2)

# --- Log-likelihood untuk distribusi Normal ---
loglik_normal <- function(params, x) {
  mu    <- params[1]
  sigma <- params[2]
  if (sigma <= 0) return(-Inf)   # sigma harus positif
  sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))
}

# --- Optim: MAKSIMALKAN log-likelihood (minimasi negatif-nya) ---
neg_loglik <- function(params) -loglik_normal(params, x)

hasil_optim <- optim(
  par     = c(mu = 0, sigma = 1),   # nilai awal
  fn      = neg_loglik,
  method  = "L-BFGS-B",
  lower   = c(-Inf, 1e-6),          # sigma > 0
  hessian = TRUE
)

mu_mle    <- hasil_optim$par[1]
sigma_mle <- hasil_optim$par[2]

cat("MLE mu    :", round(mu_mle,    4), "| True: 5\n")
## MLE mu    : 5.065 | True: 5
cat("MLE sigma :", round(sigma_mle, 4), "| True: 2\n")
## MLE sigma : 2.0723 | True: 2
cat("Rumus cepat mu_mle    =", round(mean(x), 4), "\n")
## Rumus cepat mu_mle    = 5.065
cat("Rumus cepat sigma_mle =", round(sqrt(mean((x - mean(x))^2)), 4), "\n")
## Rumus cepat sigma_mle = 2.0723
# --- Standard Error dari Hessian ---
Fisher_info <- hasil_optim$hessian       # Hessian dari -loglik ≈ Fisher info
SE_mle      <- sqrt(diag(solve(Fisher_info)))
cat("\nSE mu    :", round(SE_mle[1], 4), "\n")
## 
## SE mu    : 0.2072
cat("SE sigma :", round(SE_mle[2], 4), "\n")
## SE sigma : 0.1465

4.2 MLE dari Scratch — Distribusi Poisson

set.seed(1)
x_pois <- rpois(80, lambda = 3.5)

# Log-likelihood Poisson: sum(x*log(lambda) - lambda - log(x!))
loglik_pois <- function(lambda, x) {
  if (lambda <= 0) return(-Inf)
  sum(dpois(x, lambda = lambda, log = TRUE))
}

# Grid search sederhana (alternatif optim)
lambdas   <- seq(0.1, 10, by = 0.01)
ll_values <- sapply(lambdas, loglik_pois, x = x_pois)
lambda_mle_grid <- lambdas[which.max(ll_values)]

# Solusi analitik: lambda_MLE = mean(x)
lambda_mle_analitik <- mean(x_pois)

cat("MLE lambda (grid search)  :", lambda_mle_grid,      "\n")
## MLE lambda (grid search)  : 3.6
cat("MLE lambda (rumus analitik):", lambda_mle_analitik,  "\n")
## MLE lambda (rumus analitik): 3.6
cat("True lambda               : 3.5\n")
## True lambda               : 3.5
# Plot log-likelihood
plot(lambdas, ll_values, type = "l", col = "steelblue", lwd = 2,
     main = "Log-Likelihood Poisson", xlab = expression(lambda), ylab = "log L")
abline(v = lambda_mle_analitik, col = "red", lwd = 2, lty = 2)
legend("topright", legend = "MLE", col = "red", lty = 2, lwd = 2)


4.3 MLE dengan Package MASS (fitdistr)

library(MASS)

# MLE untuk Normal
fit_normal <- fitdistr(x, "normal")
print(fit_normal)
##      mean         sd    
##   5.0650296   2.0722742 
##  (0.2072274) (0.1465319)
# MLE untuk Poisson
fit_pois <- fitdistr(x_pois, "Poisson")
print(fit_pois)
##     lambda 
##   3.600000 
##  (0.212132)

4.4 MLE untuk Regresi Linear (ekuivalen OLS)

# Pada regresi linear dengan asumsi normalitas,
# MLE menghasilkan estimator beta yang SAMA dengan OLS

loglik_regresi <- function(params, y, X) {
  k     <- ncol(X)
  beta  <- params[1:k]
  sigma <- params[k + 1]
  if (sigma <= 0) return(-Inf)
  mu_hat <- X %*% beta
  sum(dnorm(y, mean = mu_hat, sd = sigma, log = TRUE))
}

X_mat  <- cbind(1, x1, x2)
inits  <- c(rep(0, ncol(X_mat)), 1)

res <- optim(
  par    = inits,
  fn     = function(p) -loglik_regresi(p, y, X_mat),
  method = "L-BFGS-B",
  lower  = c(rep(-Inf, ncol(X_mat)), 1e-6)
)

cat("MLE beta (regresi):\n")
## MLE beta (regresi):
cat("  Intercept:", round(res$par[1], 4), "\n")
##   Intercept: 2.4321
cat("  b1       :", round(res$par[2], 4), "\n")
##   b1       : 3.0251
cat("  b2       :", round(res$par[3], 4), "\n")
##   b2       : -1.8392
cat("  sigma    :", round(res$par[4], 4), "\n\n")
##   sigma    : 1.9342
cat("Bandingkan dengan OLS (lm()):\n")
## Bandingkan dengan OLS (lm()):
print(coef(model_lm))
## (Intercept)          x1          x2 
##    2.432023    3.025152   -1.839216

4.5 Perbandingan Scratch vs Package

Komponen Scratch (optim) Package MASS::fitdistr
Definisi fungsi Tulis sendiri loglik Tinggal sebut nama distribusi
Optimisasi optim() dengan negatif log-lik Otomatis
SE sqrt(diag(solve(hessian))) Langsung tersedia di output
Fleksibilitas Bisa untuk distribusi/model apa pun Terbatas pada distribusi standar

Hubungan OLS & MLE: Ketika \(\varepsilon \sim N(0, \sigma^2)\), estimator MLE untuk \(\beta\) identik dengan estimator OLS. Perbedaannya: MLE menggunakan \(n\) sebagai pembagi \(\hat\sigma^2\), sementara OLS menggunakan \(n-p\) (unbiased).


5. ANOVA

Konsep Umum

ANOVA (Analysis of Variance) menguji apakah rata-rata beberapa kelompok berbeda secara signifikan dengan mempartisi total keragaman data menjadi komponen-komponen.

Prinsip umum:

\[F = \frac{\text{Kuadrat Tengah Perlakuan (KTP)}}{\text{Kuadrat Tengah Galat (KTG)}}\]

Tolak \(H_0\) jika \(F_{hitung} > F_{tabel}\) atau \(p\)-value \(< \alpha\).


5.1 RAL (Rancangan Acak Lengkap)

Kapan digunakan: Satu faktor perlakuan, unit eksperimen homogen.

Model: \(Y_{ij} = \mu + \tau_i + \varepsilon_{ij}\)

Hipotesis: \(H_0: \mu_1 = \mu_2 = \cdots = \mu_t\) vs \(H_1:\) minimal satu berbeda

Scratch (Manual)

data_ral <- data.frame(
  Pakan = rep(c("A","B","C","D"), each = 5),
  Berat = c(42,45,43,44,46, 50,52,49,51,53, 47,48,46,49,47, 58,60,57,59,61)
)

y   <- data_ral$Berat
grp <- data_ral$Pakan
N   <- length(y)
k   <- length(unique(grp))

grand_mean  <- mean(y)
mean_grp    <- tapply(y, grp, mean)
n_grp       <- tapply(y, grp, length)

# Jumlah Kuadrat
JKT <- sum((y - grand_mean)^2)
JKP <- sum(n_grp * (mean_grp - grand_mean)^2)
JKG <- JKT - JKP

# Derajat Bebas
db_P <- k - 1
db_G <- N - k
db_T <- N - 1

# Kuadrat Tengah & F
KTP     <- JKP / db_P
KTG     <- JKG / db_G
F_hit   <- KTP / KTG
p_value <- pf(F_hit, db_P, db_G, lower.tail = FALSE)

tabel_ral <- data.frame(
  Sumber = c("Perlakuan","Galat","Total"),
  db     = c(db_P, db_G, db_T),
  JK     = round(c(JKP, JKG, JKT), 3),
  KT     = round(c(KTP, KTG, NA),  3),
  F_hit  = round(c(F_hit, NA, NA), 3),
  p_val  = round(c(p_value, NA, NA), 6)
)
print(tabel_ral)
##      Sumber db     JK      KT  F_hit p_val
## 1 Perlakuan  3 621.35 207.117 94.144     0
## 2     Galat 16  35.20   2.200     NA    NA
## 3     Total 19 656.55      NA     NA    NA

Package aov()

model_ral <- aov(Berat ~ Pakan, data = data_ral)
summary(model_ral)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Pakan        3  621.4   207.1   94.14 2.22e-10 ***
## Residuals   16   35.2     2.2                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Uji lanjut (Post Hoc)
library(agricolae)
lsd_ral <- LSD.test(model_ral, "Pakan", alpha = 0.05, console = FALSE)
lsd_ral$groups
##   Berat groups
## D  59.0      a
## B  51.0      b
## C  47.4      c
## A  44.0      d
hsd_ral <- HSD.test(model_ral, "Pakan", alpha = 0.05, console = FALSE)
hsd_ral$groups
##   Berat groups
## D  59.0      a
## B  51.0      b
## C  47.4      c
## A  44.0      d

5.2 RAKL (Rancangan Acak Kelompok Lengkap)

Kapan digunakan: Satu faktor perlakuan + satu sumber keragaman yang dikendalikan (blok). Unit eksperimen tidak homogen.

Model: \(Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}\)

Scratch (Manual)

data_rak <- data.frame(
  Blok   = factor(rep(paste0("K", 1:5), each = 4)),
  Metode = factor(rep(paste0("M", 1:4), times = 5)),
  Tinggi = c(31,35,33,39, 29,34,32,37, 33,38,35,42, 30,36,34,40, 32,37,34,41)
)

y   <- data_rak$Tinggi
grp <- data_rak$Metode
blk <- data_rak$Blok
N   <- length(y)
t   <- nlevels(grp)
b   <- nlevels(blk)

grand_mean   <- mean(y)
mean_perlk   <- tapply(y, grp, mean)
mean_blok    <- tapply(y, blk, mean)

JKT <- sum((y - grand_mean)^2)
JKP <- b * sum((mean_perlk - grand_mean)^2)
JKB <- t * sum((mean_blok  - grand_mean)^2)
JKG <- JKT - JKP - JKB

db_P <- t - 1
db_B <- b - 1
db_G <- (t - 1) * (b - 1)
db_T <- N - 1

KTP   <- JKP / db_P
KTB   <- JKB / db_B
KTG   <- JKG / db_G
F_P   <- KTP / KTG
F_B   <- KTB / KTG
pP    <- pf(F_P, db_P, db_G, lower.tail = FALSE)
pB    <- pf(F_B, db_B, db_G, lower.tail = FALSE)

tabel_rak <- data.frame(
  Sumber = c("Perlakuan","Blok","Galat","Total"),
  db     = c(db_P, db_B, db_G, db_T),
  JK     = round(c(JKP, JKB, JKG, JKT), 3),
  KT     = round(c(KTP, KTB, KTG, NA),  3),
  F_hit  = round(c(F_P, F_B, NA, NA),   3),
  p_val  = round(c(pP, pB, NA, NA),     6)
)
print(tabel_rak)
##      Sumber db    JK     KT  F_hit p_val
## 1 Perlakuan  3 209.8 69.933 262.25 0e+00
## 2      Blok  4  36.8  9.200  34.50 2e-06
## 3     Galat 12   3.2  0.267     NA    NA
## 4     Total 19 249.8     NA     NA    NA

Package aov()

model_rak <- aov(Tinggi ~ Metode + Blok, data = data_rak)
summary(model_rak)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Metode       3  209.8   69.93   262.2 3.35e-11 ***
## Blok         4   36.8    9.20    34.5 1.71e-06 ***
## Residuals   12    3.2    0.27                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Uji lanjut pada Metode
lsd_rak <- LSD.test(model_rak, "Metode", console = FALSE)
lsd_rak$groups
##    Tinggi groups
## M4   39.8      a
## M2   36.0      b
## M3   33.6      c
## M1   31.0      d
hsd_rak <- HSD.test(model_rak, "Metode", console = FALSE)
hsd_rak$groups
##    Tinggi groups
## M4   39.8      a
## M2   36.0      b
## M3   33.6      c
## M1   31.0      d

5.3 RBSL (Rancangan Bujur Sangkar Latin)

Kapan digunakan: Satu faktor perlakuan, dua sumber keragaman yang dikendalikan (baris & kolom).

Model: \(Y_{ijk} = \mu + \alpha_i + \beta_j + \tau_k + \varepsilon_{ijk}\), dengan \(p \times p\) layout.

Scratch (Manual)

rbsl <- data.frame(
  baris     = factor(rep(1:4, each = 4)),
  kolom     = factor(rep(1:4, times = 4)),
  perlakuan = factor(c("A","B","C","D", "B","C","D","A",
                       "C","D","A","B", "D","A","B","C")),
  Y         = c(24,22,20,28, 21,23,26,31, 18,30,29,27, 24,31,26,23)
)

p  <- nlevels(rbsl$perlakuan)
N  <- p^2
G  <- sum(rbsl$Y)

sum_baris <- tapply(rbsl$Y, rbsl$baris,     sum)
sum_kolom <- tapply(rbsl$Y, rbsl$kolom,     sum)
sum_perl  <- tapply(rbsl$Y, rbsl$perlakuan, sum)

FK  <- G^2 / N
JKT <- sum(rbsl$Y^2) - FK
JKB <- sum(sum_baris^2) / p - FK
JKK <- sum(sum_kolom^2) / p - FK
JKP <- sum(sum_perl^2)  / p - FK
JKG <- JKT - JKB - JKK - JKP

db_B <- p - 1; db_K <- p - 1; db_P <- p - 1
db_G <- (p-1)*(p-2); db_T <- N - 1

KTB <- JKB/db_B; KTK <- JKK/db_K; KTP <- JKP/db_P; KTG <- JKG/db_G
F_B <- KTB/KTG;  F_K <- KTK/KTG;  F_P <- KTP/KTG

tabel_rbsl <- data.frame(
  Sumber = c("Baris","Kolom","Perlakuan","Galat","Total"),
  db     = c(db_B, db_K, db_P, db_G, db_T),
  JK     = round(c(JKB, JKK, JKP, JKG, JKT), 3),
  KT     = round(c(KTB, KTK, KTP, KTG, NA),  3),
  F_hit  = round(c(F_B, F_K, F_P, NA, NA),   3),
  p_val  = round(c(pf(F_B,db_B,db_G,lower.tail=F),
                   pf(F_K,db_K,db_G,lower.tail=F),
                   pf(F_P,db_P,db_G,lower.tail=F), NA, NA), 5)
)
print(tabel_rbsl)
##      Sumber db      JK     KT  F_hit   p_val
## 1     Baris  3  16.688  5.562  3.761 0.07865
## 2     Kolom  3  71.188 23.729 16.042 0.00285
## 3 Perlakuan  3 139.688 46.562 31.479 0.00046
## 4     Galat  6   8.875  1.479     NA      NA
## 5     Total 15 236.438     NA     NA      NA

Package aov()

model_rbsl <- aov(Y ~ baris + kolom + perlakuan, data = rbsl)
summary(model_rbsl)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## baris        3  16.69    5.56   3.761 0.078652 .  
## kolom        3  71.19   23.73  16.042 0.002853 ** 
## perlakuan    3 139.69   46.56  31.479 0.000456 ***
## Residuals    6   8.87    1.48                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Uji lanjut
lsd_rbsl <- LSD.test(model_rbsl, "perlakuan", console = FALSE)
lsd_rbsl$groups
##       Y groups
## A 28.75      a
## D 27.00      a
## B 24.00      b
## C 21.00      c
hsd_rbsl <- HSD.test(model_rbsl, "perlakuan", console = FALSE)
hsd_rbsl$groups
##       Y groups
## A 28.75      a
## D 27.00      a
## B 24.00      b
## C 21.00      c

5.4 Faktorial RAL

Kapan digunakan: Dua atau lebih faktor perlakuan, unit eksperimen homogen.

Model 2 faktor: \(Y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \varepsilon_{ijk}\)

Kunci: Periksa interaksi terlebih dahulu!

  • Jika interaksi signifikan → uji lanjut pada kombinasi perlakuan atau simple effect
  • Jika interaksi tidak signifikan → uji lanjut pada main effect

Scratch (Manual)

fakt <- data.frame(
  bahan = factor(rep(1:3, each = 12)),
  suhu  = factor(rep(rep(c(15,70,125), each = 4), times = 3), levels = c(15,70,125)),
  Y     = c(130,155,74,180, 34,40,80,75, 20,70,82,58,
            150,188,159,126, 136,122,106,115, 25,70,58,45,
            138,110,168,160, 174,120,150,139, 96,104,82,60)
)

a <- nlevels(fakt$bahan); b <- nlevels(fakt$suhu); n <- 4
G  <- sum(fakt$Y); FK <- G^2 / (a*b*n)

sum_A   <- tapply(fakt$Y, fakt$bahan, sum)
sum_B   <- tapply(fakt$Y, fakt$suhu,  sum)
sum_sel <- tapply(fakt$Y, list(fakt$bahan, fakt$suhu), sum)

JKT    <- sum(fakt$Y^2) - FK
JK_A   <- sum(sum_A^2)/(b*n) - FK
JK_B   <- sum(sum_B^2)/(a*n) - FK
JK_sel <- sum(sum_sel^2)/n   - FK
JK_AB  <- JK_sel - JK_A - JK_B
JK_G   <- JKT - JK_sel

db_A <- a-1; db_B <- b-1; db_AB <- (a-1)*(b-1)
db_G <- a*b*(n-1); db_T <- a*b*n - 1
KTG  <- JK_G / db_G

tabel_fakt <- data.frame(
  Sumber = c("Bahan (A)","Suhu (B)","Interaksi AB","Galat","Total"),
  db     = c(db_A, db_B, db_AB, db_G, db_T),
  JK     = round(c(JK_A, JK_B, JK_AB, JK_G, JKT), 3),
  KT     = round(c(JK_A/db_A, JK_B/db_B, JK_AB/db_AB, KTG, NA), 3),
  F_hit  = round(c((JK_A/db_A)/KTG, (JK_B/db_B)/KTG,
                   (JK_AB/db_AB)/KTG, NA, NA), 3),
  p_val  = round(c(pf((JK_A/db_A)/KTG, db_A, db_G, lower.tail=F),
                   pf((JK_B/db_B)/KTG, db_B, db_G, lower.tail=F),
                   pf((JK_AB/db_AB)/KTG, db_AB, db_G, lower.tail=F), NA, NA), 5)
)
print(tabel_fakt)
##         Sumber db        JK        KT  F_hit   p_val
## 1    Bahan (A)  2 10683.722  5341.861  7.911 0.00198
## 2     Suhu (B)  2 39118.722 19559.361 28.968 0.00000
## 3 Interaksi AB  4  9613.778  2403.444  3.560 0.01861
## 4        Galat 27 18230.750   675.213     NA      NA
## 5        Total 35 77646.972        NA     NA      NA

Package aov()

model_fakt <- aov(Y ~ bahan * suhu, data = fakt)   # bahan*suhu = bahan + suhu + bahan:suhu
summary(model_fakt)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## bahan        2  10684    5342   7.911  0.00198 ** 
## suhu         2  39119   19559  28.968 1.91e-07 ***
## bahan:suhu   4   9614    2403   3.560  0.01861 *  
## Residuals   27  18231     675                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# --- Uji lanjut: interaksi SIGNIFIKAN → kombinasi perlakuan ---
lsd_komb   <- LSD.test(model_fakt, c("bahan","suhu"), console = FALSE)
lsd_komb$groups
##            Y groups
## 2:15  155.75      a
## 3:70  145.75      a
## 3:15  144.00      a
## 1:15  134.75      a
## 2:70  119.75     ab
## 3:125  85.50     bc
## 1:125  57.50      c
## 1:70   57.25      c
## 2:125  49.50      c
tukey_komb <- HSD.test(model_fakt, c("bahan","suhu"), console = FALSE)
tukey_komb$groups
##            Y groups
## 2:15  155.75      a
## 3:70  145.75     ab
## 3:15  144.00     ab
## 1:15  134.75     ab
## 2:70  119.75     ab
## 3:125  85.50     bc
## 1:125  57.50      c
## 1:70   57.25      c
## 2:125  49.50      c
# --- Uji lanjut: Simple Effect (B dalam setiap level A) ---
hasil_simple <- list()
for (level_suhu in levels(fakt$suhu)) {
  sub <- subset(fakt, suhu == level_suhu)
  mod <- aov(Y ~ bahan, data = sub)
  hasil_simple[[level_suhu]] <- LSD.test(mod, "bahan", console = FALSE)$groups
}
print(hasil_simple)
## $`15`
##        Y groups
## 2 155.75      a
## 3 144.00      a
## 1 134.75      a
## 
## $`70`
##        Y groups
## 3 145.75      a
## 2 119.75      a
## 1  57.25      b
## 
## $`125`
##      Y groups
## 3 85.5      a
## 1 57.5     ab
## 2 49.5      b

5.5 Faktorial RAKL

Kapan digunakan: Dua atau lebih faktor + blok (unit eksperimen tidak homogen).

Model: \(Y_{ijkl} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \rho_k + \varepsilon_{ijkl}\)

rakl <- expand.grid(
  filter   = factor(c("T1","T2")),
  operator = factor(c("O1","O2","O3","O4")),
  lokasi   = factor(c("Rendah","Sedang","Tinggi"), levels = c("Rendah","Sedang","Tinggi"))
)
rakl$Y <- c(90,86, 96,84, 100,92, 92,81,
            102,87, 106,90, 105,97, 96,80,
            114,93, 112,91, 108,95, 98,83)

# Blok (operator) sebagai term aditif
model_rakl <- aov(Y ~ operator + lokasi * filter, data = rakl)
summary(model_rakl)
##               Df Sum Sq Mean Sq F value   Pr(>F)    
## operator       3  402.2   134.1  12.089 0.000277 ***
## lokasi         2  335.6   167.8  15.132 0.000253 ***
## filter         1 1066.7  1066.7  96.192 6.45e-08 ***
## lokasi:filter  2   77.1    38.5   3.476 0.057507 .  
## Residuals     15  166.3    11.1                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Interaksi TIDAK signifikan → uji lanjut pada main effect
lsd_lok <- LSD.test(model_rakl, "lokasi", console = FALSE)
lsd_lok$groups
##             Y groups
## Tinggi 99.250      a
## Sedang 95.375      b
## Rendah 90.125      c
lsd_fil <- LSD.test(model_rakl, "filter", console = FALSE)
lsd_fil$groups
##           Y groups
## T1 101.5833      a
## T2  88.2500      b
hsd_lok <- HSD.test(model_rakl, "lokasi", console = FALSE)
hsd_lok$groups
##             Y groups
## Tinggi 99.250      a
## Sedang 95.375      a
## Rendah 90.125      b

5.6 Ringkasan Pemilihan Rancangan

Rancangan Faktor Perlakuan Sumber Keragaman Dikontrol Interaksi
RAL 1 Tidak ada (unit homogen) -
RAKL 1 1 arah (blok) -
RBSL 1 2 arah (baris & kolom) -
Faktorial RAL 2+ Tidak ada (unit homogen) Ada
Faktorial RAKL 2+ 1 arah (blok) Ada

5.7 Ringkasan Keputusan Uji Lanjut

Setelah ANOVA:
├── RAL / RAKL / RBSL (satu faktor)
│   └── Perlakuan signifikan? → LSD atau Tukey HSD langsung
│
└── Faktorial (dua faktor)
    ├── Interaksi SIGNIFIKAN?
    │   ├── Tujuan: cari perlakuan terbaik → uji lanjut KOMBINASI perlakuan
    │   └── Tujuan: jelaskan pola → uji SIMPLE EFFECT
    └── Interaksi TIDAK signifikan?
        └── Uji lanjut pada MAIN EFFECT yang signifikan

Ringkasan Akhir

Metode Tujuan Scratch (inti) Package
Jackknife Bias & SE suatu statistik Loop LOO, hitung bias & SE manual bootstrap::jackknife()
Bootstrap Distribusi sampling, SE, CI sample(..., replace=TRUE) dalam loop boot::boot()
OLS Estimasi koefisien regresi solve(t(X)%*%X) %*% t(X)%*%y lm()
MLE Estimasi parameter distribusi optim() minimasi -loglik MASS::fitdistr()
ANOVA Bandingkan rata-rata ≥2 kelompok Hitung JK, KT, F manual aov() + agricolae