The figures above show that the scores generally differ significantly between active and inactive cases (\(p\)-values from the Wilcoxon rank-sum test). However, MRI appears to discriminate less well than PET and PET/MR.
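As a minimal sketch of the rank-sum comparison, the code below implements the two-sided Wilcoxon rank-sum test with a tie-corrected normal approximation and no continuity correction. The function names and the example scores are hypothetical and are not the study data. Statistical packages may apply an exact method or a continuity correction by default, so their \(p\)-values can differ slightly from this approximation.

```python
import math


def midranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (U, p), where U is the Mann-Whitney statistic for sample x.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all values identical: no evidence of a difference
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p


# Hypothetical ordinal scores, for illustration only.
active_scores = [3, 4, 4, 5, 5]
inactive_scores = [1, 1, 2, 2, 3]
u_stat, p_value = rank_sum_test(active_scores, inactive_scores)
```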
We plot the receiver operating characteristic (ROC) curves for each modality below, along with the area under the curve (AUC) and 95% confidence limits (CL). Table 1 shows the \(p\)-values comparing the modality-specific AUCs by the DeLong et al. (1988) test.
| | MR AUC (95% CL) | PET AUC (95% CL) | PET/MR AUC (95% CL) | MR v PET | MR v PET/MR | PET v PET/MR |
|---|---|---|---|---|---|---|
| Overall | 0.803 (0.699, 0.907) | 0.876 (0.79, 0.963) | 0.892 (0.813, 0.971) | 0.152 | 0.020 | 0.562 |
| A colon/TI/cecum | 0.902 (0.783, 1) | 0.907 (0.795, 1) | 0.956 (0.877, 1) | 0.948 | 0.307 | 0.220 |
| Colon T | 0.708 (0.404, 1) | 0.917 (0.794, 1) | 0.931 (0.82, 1) | 0.051 | 0.051 | 0.386 |
| Rectum/sigmoid/descending | 0.679 (0.433, 0.925) | 0.792 (0.563, 1) | 0.775 (0.554, 0.996) | 0.343 | 0.294 | 0.774 |
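Overall, the AUCs suggest PET/MR > PET > MRI in terms of diagnostic performance, although only the overall MR v PET/MR comparison is statistically significant at the 0.05 level (\(p = 0.020\)).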
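The AUC comparison can be sketched as follows: a minimal standard-library implementation of the empirical AUC and the paired DeLong et al. (1988) test for two modalities scored on the same cases. The function names and the example scores are hypothetical. The confidence limits here use the DeLong variance with a normal approximation, truncated to \([0, 1]\); this is an assumption, since the report does not state how its limits were computed.

```python
import math


def placements(cases, controls):
    """Placement values: V10 per active case, V01 per inactive case."""
    def psi(x, y):
        return 1.0 if x > y else 0.5 if x == y else 0.0
    v10 = [sum(psi(x, y) for y in controls) / len(controls) for x in cases]
    v01 = [sum(psi(x, y) for x in cases) / len(cases) for y in controls]
    return v10, v01


def cov(a, b):
    """Sample covariance with an (n - 1) denominator."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)


def auc_with_ci(cases, controls, z_crit=1.959964):
    """Empirical AUC with approximate 95% limits from the DeLong variance."""
    v10, v01 = placements(cases, controls)
    auc = sum(v10) / len(v10)
    se = math.sqrt(cov(v10, v10) / len(v10) + cov(v01, v01) / len(v01))
    return auc, max(0.0, auc - z_crit * se), min(1.0, auc + z_crit * se)


def delong_test(cases_a, controls_a, cases_b, controls_b):
    """Paired DeLong test; modalities a and b score the same subjects in the same order.

    Returns (auc_a, auc_b, two-sided p-value).
    """
    a10, a01 = placements(cases_a, controls_a)
    b10, b01 = placements(cases_b, controls_b)
    m, n = len(a10), len(a01)
    auc_a, auc_b = sum(a10) / m, sum(b10) / m
    # Variance of the AUC difference from the placement covariances.
    var = ((cov(a10, a10) + cov(b10, b10) - 2 * cov(a10, b10)) / m
           + (cov(a01, a01) + cov(b01, b01) - 2 * cov(a01, b01)) / n)
    if var <= 0:
        raise ValueError("zero variance: the AUC difference cannot be tested")
    z = (auc_a - auc_b) / math.sqrt(var)
    return auc_a, auc_b, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical scores for the same four active and four inactive cases.
cases_mr, controls_mr = [2, 3, 1, 3], [1, 2, 0, 1]
cases_pet, controls_pet = [3, 3, 2, 3], [0, 1, 0, 2]
auc_mr, auc_pet, p_delong = delong_test(cases_mr, controls_mr,
                                        cases_pet, controls_pet)
mr_auc, mr_lo, mr_hi = auc_with_ci(cases_mr, controls_mr)
```

The pairing matters: because each case is scored by every modality, the covariance terms between modalities reduce the variance of the AUC difference relative to treating the two AUCs as independent.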
The distributions of the change score (\(-1, 0, 1, 2, 3\)) for active and inactive cases are plotted below (\(p\)-values from the Wilcoxon rank-sum test). The scores for inactive cases mostly remain unchanged (i.e., change score 0), whereas those for active cases mostly increase (i.e., change score 1–3). This explains the improvement in diagnostic performance of PET/MR over MRI.
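The tabulation behind such a plot can be sketched as below. This assumes the change score is the PET/MR score minus the MRI score for the same segment, which the text implies but does not define explicitly. The function name and the example values are hypothetical.

```python
from collections import Counter


def change_score_table(mr_scores, petmr_scores, active):
    """Count change scores (assumed: PET/MR score minus MRI score) by status."""
    table = {True: Counter(), False: Counter()}
    for mr, petmr, is_active in zip(mr_scores, petmr_scores, active):
        table[bool(is_active)][petmr - mr] += 1
    return table


# Hypothetical per-segment scores, for illustration only.
mr = [1, 0, 2, 1, 0, 1]
petmr = [3, 0, 3, 1, 0, 0]
status = [True, False, True, False, False, False]
table = change_score_table(mr, petmr, status)
```

In this toy example the active segments gain 1 or 2 points, while the inactive ones stay at 0 or drop by 1, which is the pattern described above.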
Boxplots of SUVmax in active and inactive cases are shown below (\(p\)-values from the Wilcoxon rank-sum test). SUVmax is higher in active cases, to a degree that varies by segment.
Boxplots and ROC curves for patient-level status are plotted below by modality. It appears that PET/MR > MRI > PET in terms of diagnostic performance, although none of the pairwise AUC differences is statistically significant at the 0.05 level.
| | MR AUC (95% CL) | PET AUC (95% CL) | PET/MR AUC (95% CL) | MR v PET | MR v PET/MR | PET v PET/MR |
|---|---|---|---|---|---|---|
| Patient-level | 0.873 (0.737, 1) | 0.828 (0.664, 0.993) | 0.931 (0.839, 1) | 0.667 | 0.349 | 0.07 |