Sebuah studi pengembangan proses dilakukan untuk menyelidiki pengaruh 4 faktor terhadap yield (hasil) dalam satuan lbs. Keempat faktor tersebut adalah:
| Faktor | Variabel | Level Rendah (−) | Level Tinggi (+) |
|---|---|---|---|
| A | Time (h) | 2.5 | 3 |
| B | Concentration (%) | 14 | 18 |
| C | Pressure (psi) | 60 | 80 |
| D | Temperature (°C) | 225 | 250 |
Desain yang digunakan adalah 2⁴ Full Factorial Design dengan satu replikasi (single replicate), menghasilkan 2⁴ = 16 run eksperimen.
# ============================================================
# Input data eksperimen 2^4 factorial design
# ============================================================
# Data run (urutan aktual, bukan run order)
run_number <- 1:16
actual_order <- c(5, 9, 3, 13, 6, 7, 14, 1, 6, 11, 2, 15, 4, 16, 10, 12)
yield <- c(12, 18, 13, 16, 17, 15, 20, 15, 10, 25, 13, 24, 19, 21, 17, 23)
# Matriks desain (−1 = low, +1 = high)
A <- c(-1, +1, -1, +1, -1, +1, -1, +1, -1, +1, -1, +1, -1, +1, -1, +1)
B <- c(-1, -1, +1, +1, -1, -1, +1, +1, -1, -1, +1, +1, -1, -1, +1, +1)
C <- c(-1, -1, -1, -1, +1, +1, +1, +1, -1, -1, -1, -1, +1, +1, +1, +1)
D <- c(-1, -1, -1, -1, -1, -1, -1, -1, +1, +1, +1, +1, +1, +1, +1, +1)
# Buat data frame
df <- data.frame(
Run = run_number,
Order = actual_order,
A = A,
B = B,
C = C,
D = D,
Yield = yield
)
# Tampilkan data
knitr::kable(df,
caption = "Data Eksperimen 2⁴ Factorial Design",
align = "c"
)| Run | Order | A | B | C | D | Yield |
|---|---|---|---|---|---|---|
| 1 | 5 | -1 | -1 | -1 | -1 | 12 |
| 2 | 9 | 1 | -1 | -1 | -1 | 18 |
| 3 | 3 | -1 | 1 | -1 | -1 | 13 |
| 4 | 13 | 1 | 1 | -1 | -1 | 16 |
| 5 | 6 | -1 | -1 | 1 | -1 | 17 |
| 6 | 7 | 1 | -1 | 1 | -1 | 15 |
| 7 | 14 | -1 | 1 | 1 | -1 | 20 |
| 8 | 1 | 1 | 1 | 1 | -1 | 15 |
| 9 | 6 | -1 | -1 | -1 | 1 | 10 |
| 10 | 11 | 1 | -1 | -1 | 1 | 25 |
| 11 | 2 | -1 | 1 | -1 | 1 | 13 |
| 12 | 15 | 1 | 1 | -1 | 1 | 24 |
| 13 | 4 | -1 | -1 | 1 | 1 | 19 |
| 14 | 16 | 1 | -1 | 1 | 1 | 21 |
| 15 | 10 | -1 | 1 | 1 | 1 | 17 |
| 16 | 12 | 1 | 1 | 1 | 1 | 23 |
Pada 2⁴ factorial design dengan satu replikasi, terdapat 15 efek yang dapat diestimasi: 4 main effects, 6 two-factor interactions, 4 three-factor interactions, dan 1 four-factor interaction.
# ============================================================
# Hitung semua contrast dan effect estimates menggunakan
# metode yates atau formula langsung: Effect = Contrast / (n/2)
# Untuk 2^4 dengan n=16: Effect = Contrast / 8
# ============================================================
n <- nrow(df)
k <- 4 # jumlah faktor
# Fungsi untuk menghitung effect estimate
calc_effect <- function(contrast_vec, y) {
sum(contrast_vec * y) / (n / 2)
}
y <- df$Yield
# ---- Main Effects ----
eff_A <- calc_effect(A, y)
eff_B <- calc_effect(B, y)
eff_C <- calc_effect(C, y)
eff_D <- calc_effect(D, y)
# ---- Two-Factor Interactions ----
eff_AB <- calc_effect(A * B, y)
eff_AC <- calc_effect(A * C, y)
eff_AD <- calc_effect(A * D, y)
eff_BC <- calc_effect(B * C, y)
eff_BD <- calc_effect(B * D, y)
eff_CD <- calc_effect(C * D, y)
# ---- Three-Factor Interactions ----
eff_ABC <- calc_effect(A * B * C, y)
eff_ABD <- calc_effect(A * B * D, y)
eff_ACD <- calc_effect(A * C * D, y)
eff_BCD <- calc_effect(B * C * D, y)
# ---- Four-Factor Interaction ----
eff_ABCD <- calc_effect(A * B * C * D, y)
# Rangkum semua efek
effects_df <- data.frame(
Term = c("A", "B", "C", "D",
"AB", "AC", "AD", "BC", "BD", "CD",
"ABC", "ABD", "ACD", "BCD", "ABCD"),
Effect = round(c(eff_A, eff_B, eff_C, eff_D,
eff_AB, eff_AC, eff_AD, eff_BC, eff_BD, eff_CD,
eff_ABC, eff_ABD, eff_ACD, eff_BCD, eff_ABCD), 4)
)
knitr::kable(effects_df,
caption = "Estimasi Semua Efek dalam 2⁴ Factorial Design",
align = "c"
)| Term | Effect |
|---|---|
| A | 4.50 |
| B | 0.50 |
| C | 2.00 |
| D | 3.25 |
| AB | -0.75 |
| AC | -4.25 |
| AD | 4.00 |
| BC | 0.25 |
| BD | 0.00 |
| CD | 0.00 |
| ABC | 1.00 |
| ABD | 0.75 |
| ACD | -0.25 |
| BCD | -0.75 |
| ABCD | 1.00 |
# ============================================================
# Normal Probability Plot of Effects
# Efek yang signifikan akan menyimpang dari garis lurus
# ============================================================
effects_vec <- effects_df$Effect
effect_names <- effects_df$Term
# Hitung normal scores (Daniel's method)
m <- length(effects_vec)
i_rank <- rank(effects_vec)
p_i <- (i_rank - 0.5) / m
z_scores <- qnorm(p_i)
# Buat data frame untuk plot
plot_df <- data.frame(
Term = effect_names,
Effect = effects_vec,
z_score = z_scores
)
plot_df <- plot_df[order(plot_df$Effect), ]
# Identifikasi efek besar (|effect| > threshold)
threshold <- 2.0
plot_df$significant <- abs(plot_df$Effect) > threshold
# Plot
par(mar = c(5, 5, 4, 2))
plot(
plot_df$Effect, plot_df$z_score,
xlab = "Effect Estimate",
ylab = "Normal Score (z)",
main = "Normal Probability Plot of Effects\n(2⁴ Factorial Design - Soal 9-15)",
pch = 16,
col = ifelse(plot_df$significant, "red", "steelblue"),
cex = 1.3,
xlim = range(effects_vec) * 1.2
)
# Tambahkan garis referensi (fit dari efek tidak signifikan)
nonsig <- plot_df[!plot_df$significant, ]
fit_ref <- lm(z_score ~ Effect, data = nonsig)
abline(fit_ref, lty = 2, col = "gray50", lwd = 1.5)
abline(v = 0, lty = 3, col = "gray70")
# Label efek yang signifikan
sig_pts <- plot_df[plot_df$significant, ]
text(
sig_pts$Effect, sig_pts$z_score,
labels = sig_pts$Term,
pos = ifelse(sig_pts$Effect > 0, 4, 2),
col = "red",
cex = 0.9,
font = 2
)
# Label untuk efek tidak signifikan (opsional, lebih kecil)
nonsig_pts <- plot_df[!plot_df$significant, ]
text(
nonsig_pts$Effect, nonsig_pts$z_score,
labels = nonsig_pts$Term,
pos = 3,
col = "steelblue",
cex = 0.7
)
legend(
"topleft",
legend = c("Efek Signifikan (|eff| > 2)", "Efek Tidak Signifikan"),
col = c("red", "steelblue"),
pch = 16,
cex = 0.85,
bty = "n"
)Interpretasi Bagian (a):
Dari Normal Probability Plot di atas, efek yang menyimpang jauh dari garis referensi (garis lurus) dianggap signifikan secara statistik. Berdasarkan plot:
Karena hanya ada satu replikasi, tidak ada derajat bebas untuk error murni. Kita menggunakan efek-efek yang tampak tidak signifikan dari Normal Probability Plot sebagai error term (pendekatan Daniel).
# ============================================================
# ANOVA menggunakan faktor sebagai variabel
# Kita bangun model penuh dulu, lalu tentukan error term
# berdasarkan normal probability plot
# ============================================================
# Konversi ke faktor untuk ANOVA
df$fA <- factor(df$A, labels = c("Low", "High"))
df$fB <- factor(df$B, labels = c("Low", "High"))
df$fC <- factor(df$C, labels = c("Low", "High"))
df$fD <- factor(df$D, labels = c("Low", "High"))
# Fit model penuh (semua main effects dan 2FI)
model_full <- lm(
Yield ~ A * B * C * D,
data = df
)
# Tampilkan ANOVA tabel penuh
anova_full <- anova(model_full)
knitr::kable(
round(as.data.frame(anova_full), 4),
caption = "ANOVA Tabel — Model Penuh 2⁴ (Semua Efek)",
align = "c"
)| Df | Sum Sq | Mean Sq | F value | Pr(>F) | |
|---|---|---|---|---|---|
| A | 1 | 81.00 | 81.00 | NaN | NaN |
| B | 1 | 1.00 | 1.00 | NaN | NaN |
| C | 1 | 16.00 | 16.00 | NaN | NaN |
| D | 1 | 42.25 | 42.25 | NaN | NaN |
| A:B | 1 | 2.25 | 2.25 | NaN | NaN |
| A:C | 1 | 72.25 | 72.25 | NaN | NaN |
| B:C | 1 | 0.25 | 0.25 | NaN | NaN |
| A:D | 1 | 64.00 | 64.00 | NaN | NaN |
| B:D | 1 | 0.00 | 0.00 | NaN | NaN |
| C:D | 1 | 0.00 | 0.00 | NaN | NaN |
| A:B:C | 1 | 4.00 | 4.00 | NaN | NaN |
| A:B:D | 1 | 2.25 | 2.25 | NaN | NaN |
| A:C:D | 1 | 0.25 | 0.25 | NaN | NaN |
| B:C:D | 1 | 2.25 | 2.25 | NaN | NaN |
| A:B:C:D | 1 | 4.00 | 4.00 | NaN | NaN |
| Residuals | 0 | 0.00 | NaN | NA | NA |
# ============================================================
# ANOVA dengan error term dari pooling efek tidak signifikan
# Berdasarkan normal plot: gunakan efek-efek kecil sebagai error
# Efek yang dimasukkan model: A, B, D, AB, AD (signifikan)
# Sisanya di-pool ke error
# ============================================================
# Model tereduksi berdasarkan normal probability plot
model_reduced <- lm(
Yield ~ A + B + D + A:B + A:D,
data = df
)
anova_reduced <- anova(model_reduced)
knitr::kable(
round(as.data.frame(anova_reduced), 4),
caption = "ANOVA Tabel — Model Tereduksi (Efek Signifikan Saja)",
align = "c"
)| Df | Sum Sq | Mean Sq | F value | Pr(>F) | |
|---|---|---|---|---|---|
| A | 1 | 81.00 | 81.000 | 8.0000 | 0.0179 |
| B | 1 | 1.00 | 1.000 | 0.0988 | 0.7598 |
| D | 1 | 42.25 | 42.250 | 4.1728 | 0.0683 |
| A:B | 1 | 2.25 | 2.250 | 0.2222 | 0.6475 |
| A:D | 1 | 64.00 | 64.000 | 6.3210 | 0.0307 |
| Residuals | 10 | 101.25 | 10.125 | NA | NA |
##
## Call:
## lm(formula = Yield ~ A + B + D + A:B + A:D, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.125 -2.375 0.000 1.688 4.875
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 17.3750 0.7955 21.842 9.06e-10 ***
## A 2.2500 0.7955 2.828 0.0179 *
## B 0.2500 0.7955 0.314 0.7598
## D 1.6250 0.7955 2.043 0.0683 .
## A:B -0.3750 0.7955 -0.471 0.6475
## A:D 2.0000 0.7955 2.514 0.0307 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.182 on 10 degrees of freedom
## Multiple R-squared: 0.653, Adjusted R-squared: 0.4794
## F-statistic: 3.763 on 5 and 10 DF, p-value: 0.03545
Interpretasi Bagian (b):
Dari ANOVA model tereduksi:
# ============================================================
# Analisis Residual untuk Model Tereduksi
# ============================================================
df$fitted <- fitted(model_reduced)
df$residuals <- residuals(model_reduced)
# Layout 2x2 untuk 4 plot diagnostik
par(mfrow = c(2, 2), mar = c(4, 4, 3, 2))
# ---- Plot 1: Normal Q-Q Plot of Residuals ----
qqnorm(df$residuals,
main = "Normal Q-Q Plot of Residuals",
pch = 16, col = "steelblue", cex = 1.1
)
qqline(df$residuals, col = "red", lwd = 2)
# ---- Plot 2: Residuals vs Fitted ----
plot(df$fitted, df$residuals,
xlab = "Fitted Values",
ylab = "Residuals",
main = "Residuals vs Fitted Values",
pch = 16, col = "steelblue", cex = 1.1
)
abline(h = 0, lty = 2, col = "red", lwd = 1.5)
text(df$fitted, df$residuals,
labels = df$Run, pos = 3, cex = 0.6, col = "gray40"
)
# ---- Plot 3: Residuals vs Run Order ----
plot(df$Order, df$residuals,
xlab = "Run Order (Actual)",
ylab = "Residuals",
main = "Residuals vs Run Order",
pch = 16, col = "darkorange", cex = 1.1,
type = "b"
)
abline(h = 0, lty = 2, col = "red", lwd = 1.5)
# ---- Plot 4: Histogram Residuals ----
hist(df$residuals,
breaks = 6,
col = "steelblue",
border = "white",
xlab = "Residuals",
main = "Histogram of Residuals",
freq = FALSE
)
curve(
dnorm(x, mean = mean(df$residuals), sd = sd(df$residuals)),
add = TRUE, col = "red", lwd = 2
)# Tabel residual
resid_table <- data.frame(
Run = df$Run,
Yield = df$Yield,
Fitted = round(df$fitted, 3),
Residual = round(df$residuals, 3)
)
knitr::kable(
resid_table,
caption = "Tabel Nilai Fitted dan Residual",
align = "c"
)| Run | Yield | Fitted | Residual |
|---|---|---|---|
| 1 | 12 | 14.875 | -2.875 |
| 2 | 18 | 16.125 | 1.875 |
| 3 | 13 | 16.125 | -3.125 |
| 4 | 16 | 15.875 | 0.125 |
| 5 | 17 | 14.875 | 2.125 |
| 6 | 15 | 16.125 | -1.125 |
| 7 | 20 | 16.125 | 3.875 |
| 8 | 15 | 15.875 | -0.875 |
| 9 | 10 | 14.125 | -4.125 |
| 10 | 25 | 23.375 | 1.625 |
| 11 | 13 | 15.375 | -2.375 |
| 12 | 24 | 23.125 | 0.875 |
| 13 | 19 | 14.125 | 4.875 |
| 14 | 21 | 23.375 | -2.375 |
| 15 | 17 | 15.375 | 1.625 |
| 16 | 23 | 23.125 | -0.125 |
# Uji normalitas residual
shapiro_test <- shapiro.test(df$residuals)
cat("Shapiro-Wilk Test for Residuals:\n")## Shapiro-Wilk Test for Residuals:
## W = 0.9685
## p-value = 0.8145
cat(sprintf("\n Kesimpulan: %s\n",
ifelse(shapiro_test$p.value > 0.05,
"Residual berdistribusi normal (gagal tolak H₀)",
"Ada indikasi ketidaknormalan residual (tolak H₀)"
)
))##
## Kesimpulan: Residual berdistribusi normal (gagal tolak H₀)
Interpretasi Bagian (c):
Normal Q-Q Plot: Titik-titik residual mengikuti garis diagonal dengan cukup baik, menunjukkan bahwa asumsi normalitas residual terpenuhi.
Residuals vs Fitted: Tidak ada pola sistematis yang jelas (seperti funnel atau kurva), sehingga asumsi homoskedastisitas (varians konstan) tampaknya terpenuhi.
Residuals vs Run Order: Tidak ada tren naik/turun yang jelas terhadap run order, mengindikasikan tidak ada efek waktu (time drift) atau masalah independence.
Histogram: Distribusi residual mendekati simetris dan bell-shaped, mendukung asumsi normalitas.
Shapiro-Wilk Test: Nilai p-value > 0.05 menunjukkan residual berdistribusi normal secara formal.
Kesimpulan: Tidak ditemukan masalah serius pada residual. Model tereduksi cukup memadai untuk mendeskripsikan data.
Jika satu atau lebih faktor tidak signifikan, desain 2⁴ dapat dikollapskan (collapsed) menjadi desain 2³ dengan dua replikasi. Faktor yang dihilangkan memberikan replikasi alami.
# ============================================================
# Faktor C (Pressure) tidak signifikan → collapsible ke 2^3
# Dengan menghilangkan C, setiap kombinasi ABD akan muncul 2x
# (sekali saat C = low, sekali saat C = high)
# ============================================================
cat("Effect Estimates:\n")## Effect Estimates:
## A (Time) = 4.500
## B (Concentration) = 0.500
## C (Pressure) = 2.000 ← TIDAK SIGNIFIKAN
## D (Temperature) = 3.250
## AB = -0.750
## AC = -4.250
## AD = 4.000
## BC = 0.250
## BD = 0.000
## CD = 0.000
##
## Karena faktor C (Pressure) tidak signifikan,
## desain dapat dikollapskan menjadi 2³ dengan 2 replikasi.
# ============================================================
# Collapsed 2^3 Design: faktor A, B, D
# Setiap kombinasi ABD muncul 2x (C = -1 dan C = +1)
# ============================================================
# Definisi label treatment combination
df$combo_ABD <- paste0(
ifelse(df$A == -1, "a-", "a+"),
ifelse(df$B == -1, "b-", "b+"),
ifelse(df$D == -1, "d-", "d+")
)
# Hitung rata-rata dan range untuk tiap kombinasi
library(dplyr)
cube_data <- df %>%
group_by(A, B, D, combo_ABD) %>%
summarise(
Yield_1 = sort(Yield)[1],
Yield_2 = sort(Yield)[2],
Mean_Yield = mean(Yield),
Range = max(Yield) - min(Yield),
.groups = "drop"
) %>%
arrange(A, B, D)
# Beri label lebih mudah dibaca
cube_data$Label <- paste0(
"A=", ifelse(cube_data$A == -1, "Low", "High"), ", ",
"B=", ifelse(cube_data$B == -1, "Low", "High"), ", ",
"D=", ifelse(cube_data$D == -1, "Low", "High")
)
knitr::kable(
cube_data[, c("Label", "Yield_1", "Yield_2", "Mean_Yield", "Range")],
col.names = c("Treatment (A, B, D)", "Rep 1 (C=Low)", "Rep 2 (C=High)",
"Mean Yield", "Range"),
caption = "2³ Collapsed Design: Rata-rata dan Range Yield per Titik Kubus",
align = "c"
)| Treatment (A, B, D) | Rep 1 (C=Low) | Rep 2 (C=High) | Mean Yield | Range |
|---|---|---|---|---|
| A=Low, B=Low, D=Low | 12 | 17 | 14.5 | 5 |
| A=Low, B=Low, D=High | 10 | 19 | 14.5 | 9 |
| A=Low, B=High, D=Low | 13 | 20 | 16.5 | 7 |
| A=Low, B=High, D=High | 13 | 17 | 15.0 | 4 |
| A=High, B=Low, D=Low | 15 | 18 | 16.5 | 3 |
| A=High, B=Low, D=High | 21 | 25 | 23.0 | 4 |
| A=High, B=High, D=Low | 15 | 16 | 15.5 | 1 |
| A=High, B=High, D=High | 23 | 24 | 23.5 | 1 |
# ============================================================
# Sketsa kubus 2^3 dengan Mean dan Range pada tiap titik
# ============================================================
library(scatterplot3d)
# Koordinat titik kubus
cube_pts <- cube_data
x_vals <- ifelse(cube_pts$A == -1, 0, 1) # A
y_vals <- ifelse(cube_pts$B == -1, 0, 1) # B
z_vals <- ifelse(cube_pts$D == -1, 0, 1) # D
# Plot 3D kubus
s3d <- scatterplot3d(
x = x_vals,
y = y_vals,
z = z_vals,
xlab = "A: Time (2.5 / 3 h)",
ylab = "B: Conc. (14 / 18 %)",
zlab = "D: Temp (225 / 250 °C)",
main = "Collapsed 2³ Design (Faktor C dihilangkan)\nMean (Range) Yield di Setiap Titik Kubus",
pch = 16,
color = "steelblue",
cex.symbols = 2,
xlim = c(-0.2, 1.3),
ylim = c(-0.2, 1.3),
zlim = c(-0.2, 1.3),
type = "h",
grid = TRUE,
box = FALSE,
angle = 45
)
# Tambahkan label Mean (Range) di setiap titik
label_text <- paste0(
round(cube_pts$Mean_Yield, 1),
"\n(", cube_pts$Range, ")"
)
for (i in 1:nrow(cube_pts)) {
s3d$points3d(
x_vals[i], y_vals[i], z_vals[i],
type = "p", pch = 16, col = "steelblue", cex = 2
)
}
# Tambahkan text label
coords <- s3d$xyz.convert(x_vals, y_vals, z_vals)
text(
coords$x, coords$y + 0.05,
labels = paste0(round(cube_pts$Mean_Yield, 1),
" (", cube_pts$Range, ")"),
cex = 0.8,
col = "darkred",
font = 2
)## === RINGKASAN COLLAPSED 2³ DESIGN ===
## Faktor yang dipertahankan:
## A = Time (2.5 h vs 3 h)
## B = Concentration (14% vs 18%)
## D = Temperature (225°C vs 250°C)
## Faktor yang dihilangkan:
## C = Pressure (tidak signifikan)
## Setiap kombinasi A,B,D memiliki 2 observasi:
## Replikasi 1: C = Low (60 psi)
## Replikasi 2: C = High (80 psi)
## Mean Yield per sudut kubus:
print(data.frame(
Corner = cube_data$Label,
Mean = round(cube_data$Mean_Yield, 2),
Range = cube_data$Range
))## Corner Mean Range
## 1 A=Low, B=Low, D=Low 14.5 5
## 2 A=Low, B=Low, D=High 14.5 9
## 3 A=Low, B=High, D=Low 16.5 7
## 4 A=Low, B=High, D=High 15.0 4
## 5 A=High, B=Low, D=Low 16.5 3
## 6 A=High, B=Low, D=High 23.0 4
## 7 A=High, B=High, D=Low 15.5 1
## 8 A=High, B=High, D=High 23.5 1
Interpretasi Bagian (d):
Apakah desain dapat dikollapskan?
Ya. Karena faktor C (Pressure) tidak signifikan
berdasarkan Normal Probability Plot dan ANOVA, desain 2⁴ dapat
dikollapskan menjadi desain 2³ dengan 2
replikasi.
Mekanisme Collapsing:
Dengan mengabaikan faktor C, setiap kombinasi level faktor A, B, dan D
muncul dua kali — satu saat C = Low (60 psi) dan satu
saat C = High (80 psi). Ini memberikan replikasi
alami.
Informasi pada Titik Kubus:
Observasi dari Kubus: