| readers | bleeder_cases | nonbleeder_cases | total_cases | imaging_conditions | reads_per_reader | total_reads |
|---|---|---|---|---|---|---|
| 2 | 50 | 50 | 100 | 2 | 200 | 400 |
| 3 | 50 | 50 | 100 | 2 | 200 | 600 |
| 4 | 50 | 50 | 100 | 2 | 200 | 800 |
Assessment of AI-generated spectral CT series from conventional CT scans
Study question
To compare radiologists’ performance when reading conventional CT images versus AI-generated spectral CT images for detecting active bleeding.
The proposed cohort includes:
- 50 cases with active bleeding
- 50 cases without active bleeding
Statistical planning assumptions
We consider 2, 3, and 4 readers.
For planning purposes, assume:
- each reader reads each case under both imaging conditions
- primary endpoint: AUC for detecting active bleeding
- baseline AUC with conventional CT: 0.75
- possible AUC with AI spectral CT: 0.80, 0.83, or 0.85
- one-sided test for improvement with AI spectral CT
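
The planning assumptions above define a small grid of scenarios (reader count by assumed AI AUC). A minimal sketch of that grid follows; the constant names and the `auc_gain` field are illustrative choices, not part of the study protocol.

```python
from itertools import product

# Values taken from the planning assumptions; names are illustrative.
BASELINE_AUC = 0.75              # assumed AUC with conventional CT
AI_AUCS = (0.80, 0.83, 0.85)     # candidate AUCs with AI spectral CT
READER_COUNTS = (2, 3, 4)

# One scenario per (reader count, assumed AI AUC) combination.
scenarios = [
    {
        "readers": readers,
        "auc_conventional": BASELINE_AUC,
        "auc_ai": auc_ai,
        # Improvement the one-sided test is meant to detect.
        "auc_gain": round(auc_ai - BASELINE_AUC, 2),
    }
    for readers, auc_ai in product(READER_COUNTS, AI_AUCS)
]

for scenario in scenarios:
    print(scenario)
```

Each reader count is paired with AUC gains of 0.05, 0.08, and 0.10 over the conventional CT baseline.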
Reader workload
With 50 bleeders and 50 non-bleeders, each reader would perform 200 reads: 100 conventional CT reads and 100 AI spectral CT reads.
The total workload would be 400 reads with 2 readers, 600 reads with 3 readers, and 800 reads with 4 readers.
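
The workload arithmetic can be checked with a short sketch. The function name `workload` and its parameters are hypothetical; the defaults reflect the 50/50 cohort and the two imaging conditions described above.

```python
def workload(readers, bleeders=50, non_bleeders=50, conditions=2):
    """Return (reads per reader, total reads) for a fully crossed design
    in which every reader reads every case under every imaging condition."""
    reads_per_reader = (bleeders + non_bleeders) * conditions
    return reads_per_reader, reads_per_reader * readers

for readers in (2, 3, 4):
    per_reader, total = workload(readers)
    print(f"{readers} readers: {per_reader} reads each, {total} reads in total")
```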
Precision for sensitivity and specificity
For each imaging condition, the number of positive-case reader interpretations is:
\[ 50 \times \text{number of readers}. \]
The same is true for negative-case reader interpretations.
| readers | positive_reader_interpretations | negative_reader_interpretations |
|---|---|---|
| 2 | 100 | 100 |
| 3 | 150 | 150 |
| 4 | 200 | 200 |
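The table below shows approximate 95% confidence intervals for sensitivity or specificity at several possible observed values, with interpretations pooled across readers. The limits are consistent with the Wilson score method, which treats the pooled interpretations as independent.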
| readers | n_reader_interpretations | assumed_sensitivity_or_specificity | lower_95_ci | upper_95_ci |
|---|---|---|---|---|
| 2 | 100 | 0.75 | 0.657 | 0.825 |
| 2 | 100 | 0.80 | 0.711 | 0.867 |
| 2 | 100 | 0.85 | 0.767 | 0.907 |
| 2 | 100 | 0.90 | 0.826 | 0.945 |
| 3 | 150 | 0.75 | 0.675 | 0.812 |
| 3 | 150 | 0.80 | 0.729 | 0.856 |
| 3 | 150 | 0.85 | 0.784 | 0.898 |
| 3 | 150 | 0.90 | 0.842 | 0.938 |
| 4 | 200 | 0.75 | 0.686 | 0.805 |
| 4 | 200 | 0.80 | 0.739 | 0.850 |
| 4 | 200 | 0.85 | 0.794 | 0.893 |
| 4 | 200 | 0.90 | 0.851 | 0.934 |
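
The interval limits in the table appear to match the Wilson score interval applied to the pooled reader interpretations. The sketch below computes them under that assumption; the function name `wilson_interval` is illustrative, and pooling reads across readers as if they were independent is a simplification of this sketch.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(p_hat, n, confidence=0.95):
    """Two-sided Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

CASES_PER_CLASS = 50  # bleeders (or non-bleeders) per imaging condition

for readers in (2, 3, 4):
    n = CASES_PER_CLASS * readers  # pooled reader interpretations
    for p_hat in (0.75, 0.80, 0.85, 0.90):
        lower, upper = wilson_interval(p_hat, n)
        print(f"readers={readers} n={n} p={p_hat:.2f} "
              f"95% CI=({lower:.3f}, {upper:.3f})")
```

Because the same readers interpret the same cases, reads are correlated in practice, so intervals that account for this clustering would likely be wider than those printed here.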
Power results for the current 50/50 cohort and for larger non-bleeder cohorts
The following scenarios use 50 bleeder cases with 50, 75, or 100 non-bleeder cases.
| bleeder_cases | nonbleeder_cases | readers | assumed_auc_conventional | assumed_auc_ai | estimated_power |
|---|---|---|---|---|---|
| 50 | 50 | 2 | 0.75 | 0.80 | 0.17 |
| 50 | 50 | 2 | 0.75 | 0.83 | 0.24 |
| 50 | 50 | 2 | 0.75 | 0.85 | 0.35 |
| 50 | 50 | 3 | 0.75 | 0.80 | 0.34 |
| 50 | 50 | 3 | 0.75 | 0.83 | 0.61 |
| 50 | 50 | 3 | 0.75 | 0.85 | 0.76 |
| 50 | 50 | 4 | 0.75 | 0.80 | 0.48 |
| 50 | 50 | 4 | 0.75 | 0.83 | 0.82 |
| 50 | 50 | 4 | 0.75 | 0.85 | 0.96 |
| 50 | 75 | 2 | 0.75 | 0.80 | 0.19 |
| 50 | 75 | 2 | 0.75 | 0.83 | 0.29 |
| 50 | 75 | 2 | 0.75 | 0.85 | 0.38 |
| 50 | 75 | 3 | 0.75 | 0.80 | 0.40 |
| 50 | 75 | 3 | 0.75 | 0.83 | 0.65 |
| 50 | 75 | 3 | 0.75 | 0.85 | 0.79 |
| 50 | 75 | 4 | 0.75 | 0.80 | 0.57 |
| 50 | 75 | 4 | 0.75 | 0.83 | 0.86 |
| 50 | 75 | 4 | 0.75 | 0.85 | 0.97 |
| 50 | 100 | 2 | 0.75 | 0.80 | 0.20 |
| 50 | 100 | 2 | 0.75 | 0.83 | 0.32 |
| 50 | 100 | 2 | 0.75 | 0.85 | 0.39 |
| 50 | 100 | 3 | 0.75 | 0.80 | 0.41 |
| 50 | 100 | 3 | 0.75 | 0.83 | 0.71 |
| 50 | 100 | 3 | 0.75 | 0.85 | 0.84 |
| 50 | 100 | 4 | 0.75 | 0.80 | 0.60 |
| 50 | 100 | 4 | 0.75 | 0.83 | 0.91 |
| 50 | 100 | 4 | 0.75 | 0.85 | 0.99 |
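Scenarios with power \(\geq 0.8\): with 4 readers, an assumed AI AUC of 0.83 or 0.85 reaches this threshold at every cohort size (power 0.82 to 0.99); with 3 readers, only an assumed AI AUC of 0.85 with 100 non-bleeder cases does (power 0.84). No 2-reader scenario reaches 0.8.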