Multi-Center Multi-Vendor Validation of Liver QSM in Patients with Iron Overload

Subject characteristics

Descriptive statistics for subject-level variables are tabulated below by site and overall. Quantitative variables are summarized as median (interquartile range, IQR, given as first and third quartiles) and categorical variables as N (%).

Table 1. Patient-level summary.
	UW (N = 53)	UTSW (N = 42)	JHU (N = 29)	Stanford (N = 37)	Overall (N = 161)
Age	50 (23, 59)	46 (36, 56.8)	47 (20, 59)	17 (13, 22)	41 (19, 56)
Sex - m	34 (64.2%)	25 (59.5%)	15 (51.7%)	20 (54.1%)	94 (58.4%)
Sex - f	19 (35.8%)	17 (40.5%)	14 (48.3%)	17 (45.9%)	67 (41.6%)
Weight	75.3 (65.4, 97.1)	83.6 (61.5, 105.7)	66.2 (59, 79.4)	52.5 (35.2, 67.2)	69 (58.8, 90.1)
Height	172.7 (162.6, 178)	172.7 (162.5, 182.9)	167.6 (160, 174)	155 (139.3, 165.1)	168 (158, 177.9)
Ferriscan	3.4 (1.9, 6.5)	1.4 (0.7, 2.5)	3.2 (1.2, 7)	3 (1.4, 7.7)	2.6 (1.2, 6.5)
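
As an illustration of the "median (IQR)" summaries in Table 1, the sketch below computes a median with first and third quartiles for one variable. The function names are hypothetical, the input values are made up, and the quartile convention (linear interpolation, Python's "inclusive" method) is an assumption; the software used for this report may use a different quartile rule.

```python
import statistics


def median_iqr(values):
    """Return (median, Q1, Q3) for a list of numbers.

    Assumption: quartiles use linear interpolation between order
    statistics (the "inclusive" method); other conventions exist.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), q1, q3


def format_median_iqr(values, digits=1):
    """Format the summary in the 'median (Q1, Q3)' style used in Table 1."""
    med, q1, q3 = median_iqr(values)
    return f"{round(med, digits):g} ({round(q1, digits):g}, {round(q3, digits):g})"


# Made-up ages, for illustration only (not study data).
example_ages = [13, 17, 22, 15, 30]
print(format_median_iqr(example_ages))  # prints "17 (15, 22)"
```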

Linear regressions of QSM vs R2* and QSM vs LIC

Numbers of observations per site, field strength (FS), and test/retest are tabulated below.

Table 2. Number (n) of observations per site, FS, and test/retest.
Site	FS	Retest	n
UW	1.5	test	47
UW	1.5	retest	10
UW	3.0	test	50
UW	3.0	retest	7
UTSW	1.5	test	42
UTSW	1.5	retest	33
UTSW	3.0	test	42
UTSW	3.0	retest	20
JHU	1.5	test	18
JHU	1.5	retest	6
JHU	3.0	test	24
JHU	3.0	retest	4
Stanford	1.5	test	32
Stanford	3.0	test	25

QSM vs R2*

Site-specific regression lines of QSM vs R2* for each FS and test/retest scenario are plotted below.

For each pair of sites, we perform an F-test with 2 numerator degrees of freedom (one each for intercept and slope) of the null hypothesis that the two sites share the same regression line. The p-values are summarized in Table 3a.

Table 3a. P-values for pairwise tests of site-specific regression lines for QSM vs R2*
	UW vs UTSW	UW vs JHU	UW vs Stanford	UTSW vs JHU	UTSW vs Stanford	JHU vs Stanford
1.5T	0.324	0.16	0.054	0.556	0.003	0.067
3.0T	0.385	<0.001	<0.001	<0.001	0.205	<0.001
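
The pairwise comparison described above can be sketched as a test of coincident regression lines: fit one pooled line and two site-specific lines, then compare their residual sums of squares. This is a minimal sketch under stated assumptions, not the code used for this report: the function names are hypothetical, ordinary least squares is assumed with QSM as the response, and the example data are made up. With exactly 2 numerator degrees of freedom the F-distribution tail probability has a closed form, which the sketch uses in place of a statistics library.

```python
def rss_of_line(x, y):
    """Residual sum of squares of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return syy - sxy ** 2 / sxx


def compare_lines(x1, y1, x2, y2):
    """F-test (2 numerator df) that two groups share one regression line.

    Returns (F statistic, p-value). Assumes ordinary least squares and a
    common residual variance in both groups.
    """
    # Full model: separate intercept and slope per group (4 parameters).
    rss_separate = rss_of_line(x1, y1) + rss_of_line(x2, y2)
    # Reduced model: one common line for both groups (2 parameters).
    rss_pooled = rss_of_line(list(x1) + list(x2), list(y1) + list(y2))
    df_resid = len(x1) + len(x2) - 4
    f_stat = ((rss_pooled - rss_separate) / 2) / (rss_separate / df_resid)
    # Exact upper tail of F(2, df_resid); valid only for 2 numerator df.
    p_value = (1 + 2 * f_stat / df_resid) ** (-df_resid / 2)
    return f_stat, p_value


# Made-up data: the second group's line is shifted upward by 2 units.
x = [1, 2, 3, 4, 5, 6]
site_a = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
site_b = [3.1, 3.9, 5.2, 5.8, 7.1, 7.9]
f_stat, p_value = compare_lines(x, site_a, x, site_b)
print(f"F = {f_stat:.1f}, p = {p_value:.2g}")
```

A small p-value indicates that the two lines differ in intercept, slope, or both; the test does not say which.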

QSM vs LIC

Site-specific regression lines of QSM vs liver iron concentration (LIC) for each FS and test/retest scenario are plotted below. Test data show stronger associations than retest data.

Similarly, we perform F-tests to compare the regression lines between pairs of sites. The p-values are summarized in Table 3b.

Table 3b. P-values for pairwise tests of site-specific regression lines for QSM vs LIC
	UW vs UTSW	UW vs JHU	UW vs Stanford	UTSW vs JHU	UTSW vs Stanford	JHU vs Stanford
1.5T	0.298	0.069	0.052	0.469	0.002	0.023
3.0T	0.066	<0.001	<0.001	0.195	0.16	0.003

Repeatability & reproducibility of QSM

Test-retest repeatability

Bland-Altman plots (test-retest difference vs mean) of QSM are shown below by FS and overall. Data points are color-coded by site.

The bias (mean test-retest difference), repeatability coefficient (RC; the range covering 95% of test-retest differences), and intraclass correlation coefficient (ICC) with its 95% confidence interval (CI) and p-value (for testing ICC = 0) are presented in Table 4 below.

Table 4. Test-retest repeatability analysis of QSM.
FS	Bias	RC	ICC (95% CI)	P
1.5T	-0.005	0.228	0.96 (0.93, 0.977)	<0.001
3T	0.005	0.166	0.984 (0.968, 0.992)	<0.001
Overall	-0.001	0.205	0.971 (0.955, 0.981)	<0.001
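
The quantities in Table 4 can be sketched for paired test-retest measurements as follows. Several choices here are assumptions rather than the report's stated method: the difference is taken as test minus retest, RC is computed as 1.96 times the standard deviation of the differences, and the ICC is the one-way random-effects, single-measurement form. The function name is hypothetical, the data are made up, and the CI and p-value for the ICC are omitted.

```python
import math
import statistics


def repeatability(test, retest):
    """Return (bias, RC, ICC) for paired test-retest measurements.

    Assumptions: difference = test - retest; RC = 1.96 * SD of the
    differences; ICC is the one-way random-effects ICC(1,1) with k = 2.
    """
    n = len(test)
    diffs = [t - r for t, r in zip(test, retest)]
    bias = statistics.mean(diffs)
    rc = 1.96 * statistics.stdev(diffs)

    # One-way ANOVA mean squares with k = 2 measurements per subject.
    subject_means = [(t + r) / 2 for t, r in zip(test, retest)]
    grand_mean = statistics.mean(subject_means)
    ms_between = 2 * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(d ** 2 for d in diffs) / (2 * n)
    icc = (ms_between - ms_within) / (ms_between + ms_within)
    return bias, rc, icc


# Made-up paired values, for illustration only (not study data).
test_scan = [1.0, 2.0, 3.0, 4.0]
retest_scan = [1.2, 1.8, 3.2, 3.8]
bias, rc, icc = repeatability(test_scan, retest_scan)
print(f"bias = {bias:.3f}, RC = {rc:.3f}, ICC = {icc:.3f}")
```

Other RC and ICC definitions are in common use (for example, RC based on the within-subject standard deviation, or two-way ICC forms), so the values from this sketch need not match a given software package.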

Field strength reproducibility

The Bland-Altman plot for reproducibility between 1.5T and 3T is shown below, with bias, ICC (95% CI), and p-value indicated on the figure.

Sex differences in LIC

LIC values are compared between females and males by site and overall in Table 5 below. The p-values are based on the Wilcoxon rank sum test.

Table 5. Median (IQR) of LIC by site and sex.
Site	Female	Male	P
UW	3.8 (2.3, 6.1)	3.3 (1.7, 7.3)	0.926
UTSW	2.4 (1.4, 5.8)	0.9 (0.7, 1.6)	0.021
JHU	5.4 (2, 7.3)	2.4 (1, 5.2)	0.149
Stanford	6.2 (2.3, 9.4)	2.5 (1.2, 3.9)	0.037
Overall	3.9 (1.9, 6.9)	2 (1, 4.1)	0.003
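
For reference, a two-sample Wilcoxon rank sum test of the kind used for Table 5 can be sketched as below. This sketch uses the normal approximation with a tie correction and a continuity correction, which is an assumption: the software used for this report may compute exact p-values for small samples, so results can differ slightly. The function name is hypothetical and the data are made up.

```python
import math


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank sum (Mann-Whitney) test.

    Returns (U statistic for sample a, p-value). Uses the normal
    approximation with tie and continuity corrections (an assumption).
    """
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    values = [v for v, _ in pooled]
    n1, n2 = len(a), len(b)
    n = n1 + n2

    # Assign midranks to tied values and accumulate the tie correction.
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and values[j] == values[i]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        tie_term += (j - i) ** 3 - (j - i)
        i = j

    rank_sum_a = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values tied: no evidence of a difference
    z = max(abs(u - mean_u) - 0.5, 0.0) / math.sqrt(var_u)
    p_value = math.erfc(z / math.sqrt(2))  # two-sided normal tail
    return u, p_value


# Made-up LIC-like values, for illustration only (not study data).
female = [3.8, 5.4, 6.2, 2.4, 7.3]
male = [3.3, 0.9, 2.5, 1.2, 1.7]
u_stat, p_value = rank_sum_test(female, male)
print(f"U = {u_stat:g}, p = {p_value:.3f}")
```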

The following boxplot visualizes the comparisons.

Sex differences in QSM vs (R2*, LIC)

Sex-specific regression lines of QSM vs R2* for each FS and test/retest scenario are plotted below, with F-test p-values comparing the two lines indicated on the plots.

Similarly, sex-specific regression lines of QSM vs LIC for each FS and test/retest scenario are plotted below, with F-test p-values comparing the two lines indicated on the plots.