Pulseq FAM Project Power Analysis
Statistical Power and Sample Size
For Aim 3A, the primary endpoint is the reproducibility (repeatability) coefficient for whole-liver PDFF, \[ RC = 1.96 \sqrt{2}\,\sigma, \] where \(\sigma\) is the within-subject standard deviation across repeated measurements. We justify the sample size by the expected width of the 95% confidence interval (CI) for \(RC\). Under the standard normal-errors model for replicate measurements, \(\nu\hat{\sigma}^2/\sigma^2\) follows a chi-square distribution with \(\nu\) degrees of freedom, and because \(RC\) is proportional to \(\sigma\), a 95% CI for \(RC\) is \[ RC_L = \hat{RC}\sqrt{\frac{\nu}{\chi^2_{0.975,\nu}}}, \qquad RC_U = \hat{RC}\sqrt{\frac{\nu}{\chi^2_{0.025,\nu}}}, \] where \(\nu\) is the effective degrees of freedom for the within-subject variance estimate and \(\chi^2_{p,\nu}\) is the \(p\)th quantile of the chi-square distribution with \(\nu\) degrees of freedom. With \(n = 30\) participants scanned on \(K = 3\) vendor platforms, a conservative approximation is \(\nu \approx n(K-1) = 60\). This yields relative CI multipliers of approximately 0.85 and 1.22, so the CI width is about \((1.22 - 0.85)\hat{RC} \approx 0.37\hat{RC}\). For example, if \(\hat{RC} \approx 2.5\%\), as suggested by our preliminary Pulseq-FAM data, the expected 95% CI is approximately \([2.1\%,\,3.0\%]\) (width about 0.9 percentage points). This anticipated interval lies entirely below literature benchmarks for conventional PDFF reproducibility (e.g., an ROI-level reproducibility coefficient of 4.12% in a meta-analysis; Yokoo et al. 2018) and is consistent with recent vendor-agnostic, motion-insensitive PDFF mapping results using Pulseq-based FAM acquisitions (Tang et al. 2025, 2026).
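The CI multipliers quoted for Aim 3A can be checked numerically. The sketch below is illustrative only and is not part of the analysis plan: the function names `chi2_cdf`, `chi2_ppf`, and `rc_ci_multipliers` are hypothetical, and the chi-square quantiles are obtained with a standard-library series expansion and bisection rather than a statistics package.

```python
import math


def chi2_cdf(x: float, df: float) -> float:
    """Chi-square CDF via the power series of the regularized lower
    incomplete gamma function P(df/2, x/2)."""
    if x <= 0.0:
        return 0.0
    a, t = df / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of the series
    total = term
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= t / (a + n)  # all terms are positive, so no cancellation
        total += term
    return total * math.exp(a * math.log(t) - t - math.lgamma(a))


def chi2_ppf(p: float, df: float) -> float:
    """Chi-square quantile found by bisection on chi2_cdf."""
    lo, hi = 0.0, df + 10.0
    while chi2_cdf(hi, df) < p:  # widen the bracket until it contains p
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rc_ci_multipliers(df: float, level: float = 0.95) -> tuple[float, float]:
    """Return (lower, upper) factors that multiply the estimated RC."""
    alpha = 1.0 - level
    lower = math.sqrt(df / chi2_ppf(1.0 - alpha / 2.0, df))
    upper = math.sqrt(df / chi2_ppf(alpha / 2.0, df))
    return lower, upper


if __name__ == "__main__":
    n, k = 30, 3               # participants and vendor platforms (from the text)
    df = n * (k - 1)           # conservative degrees of freedom, nu = 60
    lo, hi = rc_ci_multipliers(df)
    rc_hat = 2.5               # preliminary RC estimate in percent (from the text)
    print(f"nu={df}: multipliers {lo:.2f}, {hi:.2f}; "
          f"CI [{rc_hat * lo:.1f}%, {rc_hat * hi:.1f}%]")
```

Changing `n`, `k`, or `rc_hat` in the final block gives a quick view of how the expected interval would respond to a different design; those alternative values would be assumptions, not planned study parameters.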
For Aim 3B, the primary goal is to characterize how reproducibility changes with liver iron concentration (LIC) and field strength, and to determine whether FAM-based acquisitions mitigate iron-related loss of precision. The primary analysis will stratify participants by iron level (LIC < 3 vs. ≥ 3 mg/g) and estimate \(RC\) within each stratum for both Pulseq-FAM and conventional acquisitions. We again justify the sample size by the within-stratum 95% CI width for \(RC\), using the same chi-square formula as for Aim 3A. With \(n \approx 15\) participants per LIC stratum and at least \(K = 2\) replicate acquisitions per stratum-specific condition, a conservative approximation is \(\nu \approx n(K-1) = 15\), which yields relative CI multipliers of approximately 0.74 and 1.55. Thus, the within-stratum 95% CI width is about \((1.55 - 0.74)\hat{RC} \approx 0.81\hat{RC}\). This precision is adequate because our pilot data suggest large, practically meaningful differences in variability under elevated iron. For context, a recent multi-center, multi-vendor phantom validation reported PDFF reproducibility coefficients of 6.2% under a conventional 3D protocol and 3.8% under an optimized 3D protocol, with variability increasing at higher PDFF and higher R2* (Starekova et al. 2025). In our ongoing BFAM repeatability study (unpublished), we observe substantially larger \(RC\) for conventional methods in iron-overloaded livers and materially lower \(RC\) for FAM-based acquisitions. With \(n \approx 30\) in total, the stratum-specific CIs will be sufficiently informative to quantify iron-dependent degradation and to support comparisons between Pulseq-FAM and conventional acquisitions. Further gains in precision are expected when all repeated measurements are modeled jointly in a variance-component framework.
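As a sanity check on the within-stratum calculation, a small Monte Carlo simulation can confirm that the chi-square interval with \(\nu = 15\) attains roughly nominal coverage. This is a minimal sketch under stated assumptions: normally distributed errors, a common within-subject variance, exactly 15 participants with 2 replicates each, and a hypothetical true \(RC\) of 4.0% chosen only for illustration. The function name `simulate_coverage` is likewise hypothetical, and the chi-square quantiles are standard tabulated values for 15 degrees of freedom.

```python
import math
import random

N_SUBJECTS = 15                      # participants per LIC stratum (from the text)
N_REPS = 2                           # replicate acquisitions per condition
DF = N_SUBJECTS * (N_REPS - 1)       # nu = 15
CHI2_025, CHI2_975 = 6.262, 27.488   # tabulated chi-square quantiles for df = 15
RC_FACTOR = 1.96 * math.sqrt(2.0)    # RC = 1.96 * sqrt(2) * sigma


def simulate_coverage(true_rc: float = 4.0, n_sims: int = 5000, seed: int = 0) -> float:
    """Fraction of simulated studies whose 95% CI contains the true RC.

    true_rc is a hypothetical value; coverage does not depend on its scale.
    """
    rng = random.Random(seed)
    sigma = true_rc / RC_FACTOR
    mult_lo = math.sqrt(DF / CHI2_975)   # about 0.74
    mult_hi = math.sqrt(DF / CHI2_025)   # about 1.55
    covered = 0
    for _sim in range(n_sims):
        ss_within = 0.0
        for _subj in range(N_SUBJECTS):
            # The subject's true PDFF cancels out of the within-subject
            # sum of squares, so it is set to zero here.
            reps = [rng.gauss(0.0, sigma) for _ in range(N_REPS)]
            mean = sum(reps) / N_REPS
            ss_within += sum((y - mean) ** 2 for y in reps)
        rc_hat = RC_FACTOR * math.sqrt(ss_within / DF)
        if rc_hat * mult_lo <= true_rc <= rc_hat * mult_hi:
            covered += 1
    return covered / n_sims


if __name__ == "__main__":
    print(f"Empirical coverage: {simulate_coverage():.3f}")
```

Under these assumptions the empirical coverage should fall close to 0.95. Departures from normality or unequal variances across participants, neither of which is modeled here, could change that.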