Bayesianische “Power Analyse” für Mittelwertsunterschiede
10.12.2024
Definition 1 (Power) \[ \text{Power} = 1 - \beta \]
\(\beta\) = Fehler 2. Art
Definition 2 (Cohen’s d) \[ d = \frac{{\mu}_{1}-{\mu}_{0}}{\sigma} \]
Definition 3 (Stichrpobenermittlung mit Cohen’s d) \[ n = \frac{2\cdot(Z_{\alpha/2}+Z_{\beta})^2}{d^2} \]
\[ \text{Bayes-Faktor}_{10} = \frac{Data | H_1}{Data | H_0} \]
Wie viele Daten brauchen wir, um einen klaren Bayes-Faktor zu erhalten?
| Bayes Factor (BF) | Classification |
|---|---|
| BF < 1 | Bad (Evidence against) |
| 1 ≤ BF < 3 | Weak Evidence |
| 3 ≤ BF < 10 | Moderate Evidence |
| 10 ≤ BF < 30 | Strong Evidence |
| BF ≥ 30 | Very Strong Evidence (Good) |
Definition 4 (Bayes-Power) \[ \begin{align} \text{Power}_{BF} &= 1 - P(LB < BF < UB) \\ &= P(\text{Bayes Faktor ist conclusive}) \end{align} \]
Ist Gruppe 1 “genug sicher” besser als Gruppe 2?
Wir möchten die minimale Stichprobengrösse finden, wo wir in 80% der Fälle \(BF \ge 3\) erreichen.
für jede Stichprobengrösse (pro Gruppe) von 21 bis 200:
for(i in (21:200)) { ## Arbitrary decision to test between 21 and 200 n <- i # set number (this is always PER group)
print(paste("sample size", n))Zieh diese Stichprobengrösse aus der Grundgesamtheit 100 mal
for (j in (1:100)) {
subset_kurs <- sample(1:10000, size = n) # personen 1 bis 10000 im datensatz sind in der Gruppe "Kurs"
subset_nokurs <- sample(10001:20000, size = n)
print(paste("subset-test", length(subset_kurs))) # Personen 10001 bis 20'000 sind in der Gruppe "nokurs"
df2_subset <- df2 %>% filter(X %in% c(subset_kurs,subset_nokurs))Teste jedes Mal das Nullmodell gegen das Alternativmodell und notier den Bayes-Faktor
Füge den Bayes-Faktor in eine Tabelle ein
Zähle innerhalb dieser Tabelle, wie viele BF inkonklusiv sind
Füge diese Information in einer zweiten Tabelle zur “Übersicht pro n” hinzu
Rinse and repeat
In diesem Beispiel:
## libraries
library(rstanarm)
library(dplyr)
library(bayestestR)
library(BayesFactor)
library(ggplot2)
n = 10000
mean_kurs <- 75
mean_nokurs <- 70
sd <- 10
group_kurs <- rnorm(n, mean_kurs, sd)
group_nokurs <- rnorm(n, mean_nokurs, sd)
df <- data.frame(
score = c(group_kurs,group_nokurs),
group=factor(rep(c("group_kurs","group_nokurs"),each=n))
)
## write.csv2(df,"df.csv") ## for reproducability
## Perform T-Test
df2 <- read.csv2("df.csv")
df2_small <- subset(df2, X < 21 | (X > 10000 & X < 10021))
mean(df2_small$score[df2_small$group == "group_kurs"])
mean(df2_small$score[df2_small$group == "group_nokurs"])
sd(df2_small$score[df2_small$group == "group_kurs"])
sd(df2_small$score[df2_small$group == "group_nokurs"])
model_small <- stan_glm(score ~ group, data = df2_small, family = gaussian(),diagnostic_file = "model_small.csv")
summary(model_small)
model_zero <- stan_glm(score ~ 1, data = df2_small, family = gaussian(),diagnostic_file = "model_zero.csv")
summary(model_zero)
bayesfactor_models(model_zero,model_small)
## Generate random subsets
n = 21
subset_kurs <- sample(1:10000, size = n)
subset_nokurs <- sample(10001:20000,size = n)
df2_subset <- df2 %>% filter(X %in% c(subset_kurs,subset_nokurs))
test <- ttestBF(formula = score ~ group, data = df2_subset)
bf <- test %>% data.frame() %>% select(bf) %>% as.numeric()
bf
## Parteh parteh, we are doing this ** again and again **
bayes_factors <- data.frame(
it = numeric(),
n = numeric(),
bf_conclusive = numeric(),
bf_inconclusive = numeric()
)
bayes_factors_within <- data.frame(
it = numeric(),
bf = numeric()
)
for(i in (21:200)) { ## Arbitrary decision to test between 21 and 200
n <- i # set number (this is always PER group)
print(paste("sample size", n))
for (j in (1:100)) {
subset_kurs <- sample(1:10000, size = n)
subset_nokurs <- sample(10001:20000, size = n)
print(paste("subset-test", length(subset_kurs)))
df2_subset <- df2 %>% filter(X %in% c(subset_kurs,subset_nokurs))
test <- ttestBF(formula = score ~ group, data = df2_subset)
bf <- test %>% data.frame() %>% select(bf) %>% as.numeric()
bayes_factors_within <- rbind(bayes_factors_within, data.frame(it = j, bf = as.numeric(bf)))
}
bayes_conclusive <- nrow(subset(bayes_factors_within, (bf <= 1/3) | (bf > 3)))
bayes_inconclusive <- nrow(subset(bayes_factors_within, (bf > 1/3) & (bf <= 3)))
bayes_factors <- rbind(bayes_factors, data.frame(it = i, n = n, bf_conclusive = as.numeric(bayes_conclusive),bf_inconclusive=as.numeric(bayes_inconclusive)))
rm(subset_kurs)
rm(subset_nokurs)
rm(bayes_conclusive)
rm(bayes_inconclusive)
bayes_factors_within <- data.frame(
it = numeric(),
bf = numeric()
)
}
bayes_factors$bayes_power <- 1-(bayes_factors$bf_inconclusive)/100
## visualize
## sample size from where it is consistent
lowest_n <- bayes_factors %>%
arrange(bayes_power) %>%
filter(bayes_power >= 0.8) %>%
slice_min(n) %>% pull(n)
condition_met <- bayes_factors %>%
arrange(n) %>%
mutate(
condition_met = sapply(1:nrow(bayes_factors), function(i) {
current_val <- bayes_power[i]
subsequent_vals <- bayes_power[(i + 1):nrow(bayes_factors)]
current_val >= 0.8 && all(subsequent_vals >= 0.8)
})
) %>%
filter(condition_met == TRUE) %>%
slice_min(n) %>% pull(n)
ggplot(bayes_factors, aes(x = n, y = bayes_power)) +
geom_point(color = "#0028A5", size = 2) +
geom_hline(yintercept = 0.8, # Horizontal line at y = 0.8
color = "#FC4C02", linetype = "dashed", size = 1) +
geom_vline(xintercept = lowest_n,
color = "#F3AB00",linetype="dashed",size=1) +
geom_vline(xintercept = condition_met,
color = "#A4D233",linetype="dashed",size=1) +
scale_y_log10() +
labs(
title = "Bayes Factor vs Sample Size",
x = "Sample Size (n)",
y = "Bayes Power"
) +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5)
)
## export plot
ggsave("plot.png",width=8,height=6)Advanced Data Analytics - HS2024