Sample size calculations for RCC studies

Author

Lu Mao

Pallavi/Andrew project

  • Looking at low grade vs. high grade for MRI-based AI classifier
  • Previously published accuracy of 0.77 (95% CI 0.68–0.84) (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7658976/)
  • This is compared to biopsy accuracy, which is 52-76%

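Assume that the AI classifier has an accuracy of 0.77 and biopsy has an accuracy of 0.64 (the midpoint of the 52–76% range). We want to test the null hypothesis that the AI classifier and biopsy are equally accurate, against the alternative that their accuracies differ (we expect the AI classifier to be more accurate). We will use a two-sided McNemar test with a significance level of 0.05 and a power of 0.80. We expect that most of the cases correctly classified by biopsy will also be correctly classified by the AI classifier. As a sensitivity analysis, we consider a range of 0–0.05 for the proportion (p10) of cases classified correctly by biopsy but incorrectly by AI; given the assumed accuracies, the proportion classified correctly by AI but incorrectly by biopsy is then p10 + 0.13. The corresponding sample sizes (n), counted as cases assessed by both methods, are listed below.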

p10 n
0.00 58
0.01 68
0.02 77
0.03 86
0.04 96
0.05 105
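
The report does not state which sample size formula was used. As a minimal sketch, the Python code below assumes the normal-approximation formula for the McNemar test usually attributed to Connor (1987), with the result rounded up; under that assumption it appears to reproduce the table above. The function name `mcnemar_sample_size` and its parameter names are illustrative and do not come from the original analysis.

```python
from math import ceil, sqrt
from statistics import NormalDist


def mcnemar_sample_size(p10, delta, alpha=0.05, power=0.80):
    """Paired sample size for a two-sided McNemar test (normal approximation).

    p10   : proportion of cases correct by biopsy but incorrect by AI
    delta : assumed accuracy difference (AI minus biopsy), so p01 = p10 + delta
    """
    p01 = p10 + delta   # correct by AI, incorrect by biopsy
    psi = p10 + p01     # total proportion of discordant pairs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Assumed formula (Connor-type); the source does not name its method.
    n = (z_alpha * sqrt(psi) + z_beta * sqrt(psi - delta ** 2)) ** 2 / delta ** 2
    return ceil(n)


# MRI project: AI accuracy 0.77 vs. biopsy accuracy 0.64
delta = 0.77 - 0.64
mri_sizes = {
    p10: mcnemar_sample_size(p10, delta)
    for p10 in (0.00, 0.01, 0.02, 0.03, 0.04, 0.05)
}
for p10, n in mri_sizes.items():
    print(f"{p10:.2f} {n}")
```

The required sample size grows with p10 because a larger discordant proportion adds variance without changing the accuracy difference being tested.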

Ivan/Andrew project

  • Looking at low grade vs. high grade for ultrasound-based AI classifier
  • Previously published accuracy of 0.779 (https://bmcmedimaging.biomedcentral.com/articles/10.1186/s12880-024-01317-1)
  • This is compared to biopsy accuracy, which is 52-76%

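Assume that the AI classifier has an accuracy of 0.78 and biopsy has an accuracy of 0.64 (the midpoint of the 52–76% range). We want to test the null hypothesis that the AI classifier and biopsy are equally accurate, against the alternative that their accuracies differ (we expect the AI classifier to be more accurate). We will use a two-sided McNemar test with a significance level of 0.05 and a power of 0.80. We expect that most of the cases correctly classified by biopsy will also be correctly classified by the AI classifier. As a sensitivity analysis, we consider a range of 0–0.05 for the proportion (p10) of cases classified correctly by biopsy but incorrectly by AI; given the assumed accuracies, the proportion classified correctly by AI but incorrectly by biopsy is then p10 + 0.14. The corresponding sample sizes (n), counted as cases assessed by both methods, are listed below.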

p10 n
0.00 54
0.01 62
0.02 70
0.03 78
0.04 86
0.05 94
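
As a worked check of the first row of this table, the calculation below assumes a Connor-type normal-approximation formula for the McNemar test; the report does not state which formula was used, so this is a sketch under that assumption. The symbols δ (accuracy difference) and ψ (total proportion of discordant pairs) are introduced here for illustration only.

```latex
\[
n = \frac{\left( z_{1-\alpha/2}\sqrt{\psi} + z_{1-\beta}\sqrt{\psi - \delta^{2}} \right)^{2}}{\delta^{2}},
\qquad \delta = 0.78 - 0.64 = 0.14,
\qquad \psi = 2\,p_{10} + \delta .
\]
For $p_{10} = 0$, $\psi = 0.14$, and with $z_{0.975} \approx 1.960$, $z_{0.80} \approx 0.842$:
\[
n \approx \frac{\left( 1.960\sqrt{0.14} + 0.842\sqrt{0.14 - 0.0196} \right)^{2}}{0.0196}
  \approx \frac{(0.733 + 0.292)^{2}}{0.0196}
  \approx 53.6 ,
\]
which rounds up to $n = 54$.
```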