code written: 2020-01-12
last ran: 2020-05-06


Something has apparently gone very wrong: the data processing has (presumably) improved, but our white matter segmentation is worse than before. My Slurm job was interrupted and then resumed, which may be a (or the) source of our issues. I will have to run it yet again…

Notes: This script summarizes the key output values from Slicer for all tracts. We have data for n=34 tracts (n=60 unique tract-by-hemisphere combinations), whereas we expected data from n=41 tracts.

The data we have available is summarized below (collapsed across all sites):


Data summary.

Missing participants. In total, we have data from only n=393 participants, whereas n=412 were expected (a shortfall of 19). The missing participants are SPN01_CMP_0219, SPN01_CMP_0220, SPN01_ZHP_0063, SPN01_ZHP_0082, SPN01_ZHP_0125, SPN01_ZHP_0156, SPN01_ZHP_0158, SPN01_ZHP_0159, SPN01_ZHP_0160, SPN01_ZHP_0161, SPN01_ZHP_0163, SPN01_ZHP_0165, SPN01_ZHP_0166, SPN01_ZHP_0167, SPN01_ZHP_0168, SPN01_ZHP_0169, SPN01_ZHP_0170, SPN01_ZHP_0171, SPN01_ZHP_0172. I need to follow up to understand why the pipeline failed for them, since these participants do have DWI data that passed QC. I believe it is a queue / Slurm error.
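As a quick check on where the failures cluster, the missing IDs listed above can be tallied by site. This is a minimal Python sketch, not part of the original script. It assumes the middle field of each ID (CMP, ZHP) is the site code, and the variable names are illustrative.

```python
from collections import Counter

# Missing participant IDs, copied from the list above.
missing_ids = """
SPN01_CMP_0219 SPN01_CMP_0220 SPN01_ZHP_0063 SPN01_ZHP_0082 SPN01_ZHP_0125
SPN01_ZHP_0156 SPN01_ZHP_0158 SPN01_ZHP_0159 SPN01_ZHP_0160 SPN01_ZHP_0161
SPN01_ZHP_0163 SPN01_ZHP_0165 SPN01_ZHP_0166 SPN01_ZHP_0167 SPN01_ZHP_0168
SPN01_ZHP_0169 SPN01_ZHP_0170 SPN01_ZHP_0171 SPN01_ZHP_0172
""".split()

# Assumption: IDs follow STUDY_SITE_NUMBER, so the middle field is the site.
missing_by_site = Counter(pid.split("_")[1] for pid in missing_ids)

print(len(missing_ids))       # 19
print(dict(missing_by_site))  # {'CMP': 2, 'ZHP': 17}
```

If the site assumption holds, 17 of the 19 failures come from ZHP, most of them in a near-consecutive run of IDs, which would fit a single interrupted batch.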

Missing tracts. The count variable in the table above indicates the number of participants with data for a given tract, and percent indicates the corresponding percentage of participants. Some tracts have data from far fewer participants than others. In total, we have data for 14846 tracts out of a possible maximum of 23580 (393 participants × 60 tract-by-hemisphere combinations), i.e., 62.96%.
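The coverage figure can be reproduced from the numbers already reported. The sketch below is an illustrative Python check, not the original script. It assumes the possible maximum is the participant count times the 60 tract-by-hemisphere combinations, which does reproduce the reported denominator.

```python
n_participants = 393   # participants with any data
n_combinations = 60    # unique tract-by-hemisphere combinations
n_observed = 14846     # tracts with data, as reported above

# Assumption: every participant could contribute every combination.
n_possible = n_participants * n_combinations
coverage = 100 * n_observed / n_possible

print(n_possible)          # 23580
print(f"{coverage:.2f}%")  # 62.96%
```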

Missing tracts by site. Previously, it looked as though tract missingness depended on site. That now appears to be less the case, but a very large number of tracts are still missing:

Missing tracts by participant. Every participant is missing at least 12 of the 60 tract-by-hemisphere combinations, and many are missing far more. The distribution is as follows:

Tracts missing   Percent missing   Participant count
            12             20.00                   1
            13             21.67                   4
            14             23.33                   8
            15             25.00                  13
            16             26.67                  24
            17             28.33                  20
            18             30.00                  37
            19             31.67                  28
            20             33.33                  33
            21             35.00                  31
            22             36.67                  32
            23             38.33                  31
            24             40.00                  28
            25             41.67                  17
            26             43.33                  13
            27             45.00                  15
            28             46.67                   8
            29             48.33                   8
            30             50.00                   9
            31             51.67                   6
            32             53.33                   4
            33             55.00                   6
            34             56.67                   4
            35             58.33                   2
            36             60.00                   4
            37             61.67                   2
            38             63.33                   2
            43             71.67                   2
            49             81.67                   1
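The per-participant distribution above can be cross-checked against the overall totals. This is a hedged Python sketch with values transcribed from the table and illustrative variable names. It confirms that the participant counts sum to 393 and that the implied number of missing tracts equals 23580 − 14846.

```python
# Values transcribed from the table above.
tracts_missing = list(range(12, 39)) + [43, 49]
participant_count = [1, 4, 8, 13, 24, 20, 37, 28, 33, 31, 32, 31, 28, 17, 13,
                     15, 8, 8, 9, 6, 4, 6, 4, 2, 4, 2, 2, 2, 1]
assert len(tracts_missing) == len(participant_count)

total_participants = sum(participant_count)
total_missing = sum(t * c for t, c in zip(tracts_missing, participant_count))
mean_missing = total_missing / total_participants

print(total_participants)     # 393
print(total_missing)          # 8734, i.e. 23580 - 14846
print(f"{mean_missing:.1f}")  # 22.2 combinations missing per participant, on average
```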

Visualization: Number of Fibers.

The following plot shows the raw values of the number-of-fibers variable for the n=393 participants summarized above, separated by tract and hemisphere (n=74) and coloured by site / scanner. Outlier values are apparent…