We have two groups: group \(1\) (systemic population-based approach) and group \(2\) (family history approach). Let \(p_i\) be the prevalence rate in group \(i\), that is, the number of individuals carrying the mutations divided by the total number of individuals in the group, for \(i \in \{1, 2\}\). We will conduct a hypothesis test comparing the two proportions.
Four quantities enter a power calculation: the sample size, the effect size, the significance level, and the power. Fixing any three determines the fourth.
To set the effect size, we need to assume a prevalence rate for group \(1\) and for group \(2\). This report considers a range of prevalence rates that we may plausibly see in the study.
Note that for Cohen’s h, an \(h\) near 0.2 is a small effect, an \(h\) near 0.5 is a medium effect, and an \(h\) near 0.8 is a large effect.
We calculate Cohen’s h effect size in the following way:
\(h = \varphi_1 - \varphi_2\), where \(\varphi_i = 2\arcsin(\sqrt{p_i})\) is the arcsine square-root (angular) transformation of \(p_i\).
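As a quick check of the formula, the sketch below computes Cohen’s h directly in base R. The helper name `cohen_h` and the prevalences 0.20 and 0.15 are illustrative assumptions, not values fixed by the study design.

```r
# Cohen's h from two proportions via the arcsine square-root transformation.
# `cohen_h` is a hypothetical helper name used only for illustration.
cohen_h <- function(p1, p2) {
  phi1 <- 2 * asin(sqrt(p1))
  phi2 <- 2 * asin(sqrt(p2))
  phi1 - phi2
}

# Illustrative prevalences (assumed values)
h_example <- cohen_h(0.20, 0.15)
print(h_example)
```

The sign of \(h\) follows the order of the arguments: it is positive when \(p_1 > p_2\) and negative when the proportions are swapped.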
The one-sided hypothesis test we consider is
\(H_0: h = 0\) vs. \(H_a: h > 0\). We want to test whether group \(1\) (systemic population-based approach) detects a significantly greater proportion of individuals with the mutations than group \(2\) (family history approach).
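For intuition, the power of this one-sided test with \(n\) individuals per group can be approximated in closed form as \(1 - \Phi(z_{1-\alpha} - h\sqrt{n/2})\), where \(\Phi\) is the standard normal distribution function. The sketch below implements that normal approximation in base R. The function name `power_one_sided` and the example inputs (\(h = 0.2\), \(n = 200\), \(\alpha = 0.05\)) are assumptions chosen for illustration.

```r
# Approximate power of a one-sided two-proportion test with equal group sizes.
# `power_one_sided` is a hypothetical helper; alpha = 0.05 is an assumed level.
power_one_sided <- function(h, n, alpha = 0.05) {
  z_crit <- qnorm(1 - alpha)                           # one-sided critical value
  pnorm(z_crit - h * sqrt(n / 2), lower.tail = FALSE)  # P(reject H0 | effect size h)
}

# Illustrative inputs (assumed values)
power_example <- power_one_sided(h = 0.2, n = 200)
print(power_example)
```

When \(h = 0\) the expression reduces to \(\alpha\), and it increases with both \(h\) and \(n\). The values should agree closely with those returned by `pwr.2p.test` with `alternative = "greater"`.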
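# Library for power calculations using test of proportions
library(pwr)
# Grid of Cohen's h effect sizes (one row per value)
h = data.frame(h = seq(from = 0.08, to = 0.30, by = 0.02))
# Candidate sample sizes per group
sample_size = seq(from = 100, to = 250, by = 10)
# Add one column of power values per sample size
# (one-sided test; sig.level is left at the pwr default of 0.05)
for (i in seq_along(sample_size)) {
  h[, i + 1] = pwr.2p.test(h = h[, 1], n = sample_size[i], alternative = "greater")$power
}
names(h)[-1] = sample_size
# Transpose so that rows are sample sizes and columns are effect sizes
df = data.frame(t(h))
names(df) = as.character(df[1, ])
df$size = rownames(df)
df = df[-1, ]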
Next, we plot power against sample size for each effect size.
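library(data.table)
# Reshape to long format: one row per (sample size, effect size) pair
df2 = melt(setDT(df), id.vars = "size", variable.name = "h")
# Store sample size as a number so it is plotted on a continuous axis
df2$size = as.numeric(df2$size)
df2$h = factor(df2$h)
names(df2)[3] = "power"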
library(ggplot2)
# Power curves, one per effect size, with a reference line at 80% power
ggplot(df2, aes(x = size, y = power, group = h, color = h)) +
  geom_point() +
  geom_line() +
  guides(color = guide_legend(title = "Cohen's h")) +
  labs(x = "Sample Size (per group)", y = "Power") +
  geom_hline(yintercept = 0.8)
Note that we may observe a small effect size in our project. Consider the case where \(p_1 = 0.20\) is the proportion of individuals with mutations in group \(1\) (systemic population-based approach) and \(p_2 = 0.15\) is the proportion in group \(2\) (family history approach). Cohen’s h is then 0.13, which is a small effect. With a small effect size, we need a larger sample size to reach the same power.
p1 = 0.2
p2 = 0.15
ES.h(p1, p2)
## [1] 0.1318964
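To make the sample-size implication concrete, the sketch below inverts the normal approximation to the one-sided test and solves for the per-group \(n\) needed to detect \(h \approx 0.13\). It assumes a one-sided \(\alpha = 0.05\) and a target power of 0.80 (the level marked by the horizontal line in the figure). The helper name `n_per_group` is hypothetical.

```r
# Per-group sample size for a one-sided two-proportion test (normal approximation).
# `n_per_group` is a hypothetical helper; alpha and power are assumed targets.
n_per_group <- function(h, alpha = 0.05, power = 0.80) {
  2 * ((qnorm(1 - alpha) + qnorm(power)) / h)^2
}

# Effect size for p1 = 0.20 vs. p2 = 0.15, computed from the arcsine formula
h_small <- 2 * asin(sqrt(0.20)) - 2 * asin(sqrt(0.15))

# Round up to the next whole individual
n_required <- ceiling(n_per_group(h_small))
print(n_required)
```

Under these assumptions the requirement comes to roughly 711 individuals per group, well above the 100 to 250 range examined in the power curves.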
This analysis applies to all individuals who have consented and for whom we have data. If we assume a 20% dropout rate, we may set the target sample size to \(n \times 1.2\).
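The dropout adjustment is simple arithmetic, sketched below in base R. The per-group \(n\) of 200 is an illustrative value and `inflate_for_dropout` is a hypothetical helper. It shows the \(n \times 1.2\) rule alongside the alternative \(n / (1 - 0.2)\), which is sometimes preferred because it leaves \(n\) completers after 20% dropout. Which convention to adopt is a design choice that this report does not settle.

```r
# Enrollment targets under an assumed dropout rate.
# `inflate_for_dropout` is a hypothetical helper used only for illustration.
inflate_for_dropout <- function(n, dropout = 0.20) {
  # Round before taking the ceiling to guard against floating-point noise
  c(multiply = ceiling(round(n * (1 + dropout), 6)),  # n x 1.2 rule
    divide   = ceiling(round(n / (1 - dropout), 6)))  # n / 0.8 alternative
}

# Illustrative per-group sample size (assumed value)
targets <- inflate_for_dropout(200)
print(targets)
```

For \(n = 200\), the two rules give 240 and 250 individuals per group respectively.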