Let \(X_{1,1}, X_{1,2}, \cdots, X_{1,n}\) represent relative abundances of a microbial feature measured from \(n\) individuals in Group 1. Let \(X_{2,1}, X_{2,2}, \cdots, X_{2,n}\) represent relative abundances of the same microbial feature measured from \(n\) individuals in Group 2. After a data transformation such that \(Y_{i,j}=\arcsin\sqrt{X_{i,j}}\), assume that \(Y_{1,1}, Y_{1,2}, \cdots, Y_{1,n} \sim N(\mu, \sigma^{2})\) and \(Y_{2,1}, Y_{2,2}, \cdots, Y_{2,n} \sim N(\mu+\delta, \sigma^{2})\). (These are essentially the assumptions of the shiny app power calculation.) As such, \(\delta\) is the (constant) effect size on the arcsin-sqrt scale.
Applying the delta method with \(g(y)=\sin^2{y}\), whose derivative is \(g'(y)=2\sin{y}\cos{y}=\sin(2y)\), we find that since \(\sqrt{n}[\bar{Y}_{1,n} - \mu] \xrightarrow{\text{D}} N(0, \sigma^2)\), then \(\sqrt{n}[\sin^2(\bar{Y}_{1,n}) - \sin^2{\mu}] \xrightarrow{\text{D}} N(0, \sigma^2\sin^2(2\mu))\). Therefore, back-transforming the sample means of the arcsin-sqrt-transformed data will give us, for large \(n\), estimates for the back-transformations of the population means, \(\sin^2{\mu}\) and \(\sin^2(\mu+\delta)\). Thus, we can estimate the expected difference in means (i.e. effect size) on the relative abundance scale as \(\sin^2(\bar{Y}_{2,n})-\sin^2(\bar{Y}_{1,n})\), which converges in probability to \(\sin^2(\mu+\delta)-\sin^2{\mu}\).
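The delta-method statement can be checked by simulation. The sketch below is only an illustration: the values of `mu`, `sigma`, `n` and `reps` are arbitrary assumptions, not values from the shiny app, and the function name is made up for this example. It draws many samples on the arcsin-sqrt scale, back-transforms each sample mean with \(\sin^2\), and compares the empirical variance of \(\sqrt{n}[\sin^2(\bar{Y}) - \sin^2{\mu}]\) with \(\sigma^2\sin^2(2\mu)\), the variance implied by the delta method for \(g(y)=\sin^2{y}\).

```python
import numpy as np


def simulate_back_transformed_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size `n` from N(mu, sigma^2) and
    return sin^2 of each sample mean (the back-transformed mean)."""
    rng = np.random.default_rng(seed)
    y = rng.normal(mu, sigma, size=(reps, n))
    return np.sin(y.mean(axis=1)) ** 2


# Illustrative values only (assumptions, not taken from the app).
mu, sigma, n, reps = 0.3, 0.1, 400, 5000

est = simulate_back_transformed_means(mu, sigma, n, reps)
target = np.sin(mu) ** 2

# Variance of sqrt(n) * (estimate - target) across replicates.
empirical_var = np.var(np.sqrt(n) * (est - target))
# Delta-method variance: sigma^2 * g'(mu)^2 with g'(mu) = sin(2 mu).
delta_method_var = sigma**2 * np.sin(2 * mu) ** 2

print(f"mean of estimates:     {est.mean():.5f}")
print(f"sin^2(mu):             {target:.5f}")
print(f"empirical variance:    {empirical_var:.6f}")
print(f"delta-method variance: {delta_method_var:.6f}")
```

With settings like these the two variances should agree to within simulation error, and the agreement should tighten as `reps` grows.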
For a fixed effect size \(\delta\) on the arcsin-sqrt scale, the corresponding effect size on the relative abundance scale depends on the baseline mean. The relationship is nonlinear, and in fact follows a concave down pattern (see Figure). To give a conservative estimate of the kind of effect sizes on the relative abundance scale that we can expect to detect with a given level of power, we can find the maximum value of this curve. The maximum value signifies the smallest relative abundance difference that is guaranteed to be detectable with nominal power \(1-\beta\) at all baseline relative abundance values.
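To make the shape of this curve concrete, here is a minimal sketch that evaluates the relative-abundance effect size over a grid of Group 1 baseline abundances and locates its maximum numerically. The function name `abundance_effect`, the grid size, and the choice \(\delta = 0.1\) are assumptions made for illustration. The grid is capped so that \(\mu+\delta\) stays at or below \(\pi/2\), which keeps the Group 2 mean inside the valid range.

```python
import numpy as np


def abundance_effect(baseline, delta):
    """Difference in back-transformed means, sin^2(mu + delta) - sin^2(mu),
    where mu = arcsin(sqrt(baseline)) is the Group 1 mean on the
    arcsin-sqrt scale."""
    mu = np.arcsin(np.sqrt(baseline))
    return np.sin(mu + delta) ** 2 - np.sin(mu) ** 2


delta = 0.1  # illustrative effect size on the arcsin-sqrt scale (assumption)

# Largest baseline for which mu + delta does not exceed pi/2.
max_baseline = np.sin(np.pi / 2 - delta) ** 2
baselines = np.linspace(0.0, max_baseline, 2001)
effects = abundance_effect(baselines, delta)

i = int(np.argmax(effects))
print(f"effect at baseline 0:   {effects[0]:.4f}")
print(f"largest effect:         {effects[i]:.4f}")
print(f"baseline at the peak:   {baselines[i]:.4f}")
print(f"effect at max baseline: {effects[-1]:.4f}")
```

The effect is small near both ends of the range and largest for intermediate baselines, which is why the peak of the curve serves as the conservative value.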
To find the mean relative abundance value at which we have the largest minimum detectable effect size, we take the derivative with respect to \(\mu\), the Group 1 mean on the arcsin-sqrt scale, and set it equal to zero: \[\begin{equation} \begin{split} \displaystyle \frac{d}{d\mu} \left[\sin^2(\mu+\delta)-\sin^2(\mu)\right] & = 0 \\ \sin(2\mu+2\delta) - \sin(2\mu) & = 0 \\ \sin(2\mu)\cos(2\delta)+\cos(2\mu)\sin(2\delta) - \sin(2\mu) & = 0 \end{split} \end{equation}\]
Using trigonometric identities, restricting \(\mu\) to fall between \(\arcsin \sqrt{0} = 0\) and \(\arcsin\sqrt{1} = \frac{\pi}{2}\), and arbitrarily choosing \(\delta>0\) (i.e. the group with the smaller mean is Group 1), we find that:
\[\mu = \displaystyle \frac{\pi}{4}- \displaystyle \frac{1}{2}\delta\] which is the Group 1 mean on the arcsin-sqrt scale. Consequently, the Group 2 mean is \(\mu + \delta = \frac{\pi}{4} + \frac{1}{2}\delta\). I’m skipping the algebraic steps here, but an intuitive graphical explanation is as follows: From above,
\[\begin{equation} \begin{split} \sin(2\mu+2\delta) - \sin(2\mu) & = 0 \\ \sin(2\mu+2\delta) & = \sin(2\mu) \end{split} \end{equation}\]
If we again restrict \(\mu\) between 0 and \(\pi/2\), we can visualize the problem like this:
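The sine curve is symmetric about its peak at \(\pi/2\), so in order for \(\sin(2\mu+2\delta) = \sin(2\mu)\) to hold with \(\delta>0\), the two angles \(2\mu\) and \(2\mu+2\delta\) must sit at equal distances on either side of \(\pi/2\). Their midpoint must therefore satisfy \(2\mu+\delta = \pi/2\), which gives \(\mu = \pi/4 - \delta/2\), as stated above. It is straightforward to show that the function in question is concave down on its whole support, using the second derivative (not shown).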
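For readers who want the algebra, one possible route is sketched here. It assumes \(0<\delta<\pi/2\) and \(0 \le \mu \le \pi/2-\delta\); the upper bound on \(\mu\) is an added assumption that keeps the Group 2 mean \(\mu+\delta\) within range. The sum-to-product identity \(\sin A - \sin B = 2\cos\frac{A+B}{2}\sin\frac{A-B}{2}\) with \(A = 2\mu+2\delta\) and \(B = 2\mu\) gives:

\[\begin{equation} \begin{split} \sin(2\mu+2\delta) - \sin(2\mu) & = 2\cos(2\mu+\delta)\sin(\delta) = 0 \\ \cos(2\mu+\delta) & = 0 \quad \text{(since } \sin\delta \neq 0\text{)} \\ 2\mu+\delta & = \frac{\pi}{2} \quad \text{(since } \delta \le 2\mu+\delta \le \pi-\delta\text{)} \end{split} \end{equation}\]

Under the same assumptions, the second derivative and the value of the curve at its maximum are:

\[\begin{equation} \begin{split} \frac{d^2}{d\mu^2} \left[\sin^2(\mu+\delta)-\sin^2(\mu)\right] & = 2\cos(2\mu+2\delta) - 2\cos(2\mu) = -4\sin(2\mu+\delta)\sin(\delta) < 0 \\ \sin^2\left(\frac{\pi}{4}+\frac{\delta}{2}\right) - \sin^2\left(\frac{\pi}{4}-\frac{\delta}{2}\right) & = \sin\left(\frac{\pi}{2}\right)\sin(\delta) = \sin(\delta) \end{split} \end{equation}\]

The last line uses \(\sin^2 a - \sin^2 b = \sin(a+b)\sin(a-b)\). If this reading is right, the largest effect size on the relative abundance scale for a given \(\delta\) is simply \(\sin(\delta)\), attained when the Group 1 and Group 2 mean relative abundances are \((1-\sin\delta)/2\) and \((1+\sin\delta)/2\).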