Chemical compounds study
Quality Control Report
Cohort; chemical compounds;
- Aim
The aim of the analysis was to assess the concentrations of compounds in biological samples collected from cohort subjects.
- Sample size and selection of samples for the analysis
In total, 957 samples from 857 unique participants were collected (Figure 1).
- Descriptive analysis of compounds
- Limits of detection (LOD) and quantification (LOQ)
| Compound | n | %<LOD | n<LOD | %<LLOQ | n<LLOQ | %>ULOQ | n>ULOQ |
|---|---|---|---|---|---|---|---|
| Compound 1 | 826 | 0.6 | 5 | 0.6 | 5 | 0.1 | 1 |
| Compound 2 | 830 | 0.7 | 6 | 0.7 | 6 | 6.5 | 54 |
| Compound 3 | 830 | 0.2 | 2 | 0.2 | 2 | 1.6 | 13 |
| Compound 4 | 830 | 0.6 | 5 | 0.6 | 5 | 0.1 | 1 |
| Compound 5 | 830 | 0.4 | 3 | 0.4 | 3 | 0.5 | 4 |
| Compound 6 | 830 | 0.4 | 3 | 0.4 | 3 | 0.1 | 1 |
| Compound 7 | 830 | 0.6 | 5 | 1.2 | 10 | 0.0 | 0 |
| Compound 8 | 828 | 3.1 | 26 | 4.1 | 34 | 0.0 | 0 |
| Compound 9 | 830 | 0.2 | 2 | 0.2 | 2 | 70.5 | 585 |
LODs and lower and upper limits of quantification (LLOQ, ULOQ) for each compound are given in Table 1. LODs and LOQs were calculated for 50 mg of sample. Compound concentrations provided by the analytical lab were adjusted for sample weight (the raw concentration was divided by the sample weight). For a reference, see Author (2021).
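The weight adjustment described above can be sketched as follows. This is a minimal illustration only: the function name, the record layout and the use of milligrams are assumptions, not the analytical lab's actual procedure.

```python
def adjust_for_weight(raw_concentration, sample_weight_mg):
    """Return the raw concentration divided by the sample weight.

    The argument names and the mg unit are assumptions for illustration.
    """
    if sample_weight_mg <= 0:
        raise ValueError("sample weight must be positive")
    return raw_concentration / sample_weight_mg


# Hypothetical records: (sample ID, raw concentration, sample weight in mg)
records = [("s1", 250.0, 50.0), ("s2", 120.0, 40.0), ("s3", 90.0, 4.5)]

# Weight-adjusted concentration per sample ID
adjusted = {sid: adjust_for_weight(raw, w) for sid, raw, w in records}
```

Because the divisor is the sample weight, very light samples (such as the hypothetical 4.5 mg one) yield comparatively large adjusted values, which is one reason low-weight samples deserve a closer look.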
- Numerical summary of compound concentrations
| Compound | Min | 5% | 25% | Median | Mean | 75% | 95% | Max |
|---|---|---|---|---|---|---|---|---|
| Compound 1 | <LOD | 1.6 | 2.7 | 3.9 | 5.1 | 5.9 | 12.4 | 83.7 |
| Compound 2 | <LOD | 0.3 | 0.6 | 0.9 | 1.5 | 1.6 | 4.3 | 45.3 |
| Compound 3 | <LOD | 2.3 | 3.9 | 5.3 | 6.2 | 7.3 | 12.8 | 101.5 |
| Compound 4 | <LOD | 8.3 | 17.7 | 32.6 | 38.2 | 48.5 | 82.9 | 938.2 |
| Compound 5 | <LOD | 4.6 | 7.8 | 11.4 | 14.3 | 16.3 | 32.8 | 434.4 |
| Compound 6 | <LOD | 2.7 | 5.0 | 7.0 | 8.3 | 9.6 | 16.6 | 177.4 |
| Compound 7 | <LOD | 1.2 | 2.4 | 3.5 | 3.9 | 4.9 | 7.7 | 26.5 |
| Compound 8 | <LOD | 0.2 | 0.3 | 0.5 | 0.7 | 0.9 | 1.8 | 5.6 |
| Compound 9 | <LOD | 49.6 | 125.0 | 228.6 | 381.2 | 438.9 | 1168.8 | 6466.0 |
The detection rate was very high for all compounds: at most 3.1% of values were below the LOD. Compound 7 and Compound 8 had the highest numbers of values below the LLOQ (10 and 34, respectively; Table 2). Compound 9 had a very high number of values above the ULOQ (585 of 830, 70.5%), which may be problematic (Table 2). Imputation should be considered.
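Counts and percentages of values below the LOD, below the LLOQ and above the ULOQ can be tabulated per compound along the following lines. This is a sketch under stated assumptions: the function name, the example values and the limits are hypothetical, missing values are represented as `None`, and values below the LOD are also counted as below the LLOQ (which matches the pattern in the table above, where n<LLOQ is never smaller than n<LOD).

```python
def censoring_summary(values, lod, lloq, uloq):
    """Count and percentage of values <LOD, <LLOQ and >ULOQ.

    Missing values (None) are excluded from the denominator n.
    """
    observed = [v for v in values if v is not None]
    n = len(observed)
    counts = {
        "n<LOD": sum(v < lod for v in observed),
        "n<LLOQ": sum(v < lloq for v in observed),  # includes values <LOD
        "n>ULOQ": sum(v > uloq for v in observed),
    }
    summary = {"n": n, **counts}
    for key, count in counts.items():
        # "n<LOD" -> "%<LOD", rounded to one decimal as in the report tables
        summary["%" + key[1:]] = round(100 * count / n, 1) if n else None
    return summary


# Hypothetical concentrations for one compound (None = missing value)
values = [0.05, 0.3, 1.2, 4.0, 55.0, None, 0.15, 9.9, 120.0, 2.5]
result = censoring_summary(values, lod=0.1, lloq=0.2, uloq=100.0)
```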
- Missing compound data
Only a few values were missing, all of them for Compound 8 and Compound 1 (Figure 2). Seven samples (fakeID1, fakeID2, fakeID3, fakeID4, fakeID5, fakeID6, fakeID7) were sent to the analytical lab but were not processed because the sample amount was too small or the sample was lost; they are not included in the QC report and will not be imputed. Among the remaining samples, the only missing values were 2 values for Compound 8 and 4 values for Compound 1 that did not pass quality control (see the Outliers section). These values will not be imputed.
- Impact of data imputation
The ranges of the imputed values look plausible.
- Compound correlation structure
As expected, the highest correlations are observed between Compound 1 and its metabolites and between Compound 4 and its metabolites (Figure 5).
- Outliers
Of the final 830 available samples, 4 contained outlying values for Compound 1 (fakeID9, fakeID10, fakeID11, fakeID12) and 2 for Compound 8 (fakeID13, fakeID14), attributed to potential contamination. The analytical lab flagged these values for removal, and we followed this recommendation. These values are not included in the QC report and will not be imputed.
One additional sample (fakeID15) was flagged by the analytical lab as an outlier for all metabolites because of an extremely low concentration of the internal standard. This sample was removed before the analyses that follow, is not included in the QC report, and will not be imputed.
- PCA: individual contribution
In the PCA, the first principal component (PC1) explains the majority of the variability (59.4%). Two more IDs are potential outliers: fakeID16 and fakeID17 (see Figure 6). The analytical lab did not identify any technical reason for these, so they are most probably biological rather than technical outliers; however, fakeID20 also provided a very low amount of sample (<5 mg). We recommend a sensitivity analysis without these samples.
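For readers who want to reproduce the two quantities discussed here, the share of variance explained by PC1 and each sample's contribution to PC1 can be sketched as below. This is not the code used for the report: it assumes standardised (mean 0, SD 1) concentrations, finds PC1 by simple power iteration, and defines a sample's contribution as its squared PC1 score as a percentage of the sum of squared scores. The function name and the data are hypothetical.

```python
import math
from statistics import mean, stdev


def first_component(rows, n_iter=200):
    """Return (share of variance explained by PC1, per-sample contributions in %)."""
    n_cols = len(rows[0])
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    # Standardise each column, so the PCA is run on the correlation matrix.
    z = [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]
    n = len(z)
    cov = [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(n_cols)]
           for i in range(n_cols)]
    # Power iteration: converges to the eigenvector with the largest eigenvalue
    # (assumes the start vector is not orthogonal to that eigenvector).
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(n_cols)) for i in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(n_cols))
                     for i in range(n_cols))
    explained = eigenvalue / sum(cov[i][i] for i in range(n_cols))
    # PC1 score of each sample and its share of the total squared score.
    scores = [sum(a * b for a, b in zip(row, v)) for row in z]
    total = sum(s * s for s in scores)
    contributions = [100 * s * s / total for s in scores]
    return explained, contributions


# Hypothetical data: 6 samples x 3 compounds; the last sample is extreme.
rows = [
    [1.0, 2.0, 1.5],
    [1.2, 2.1, 1.4],
    [0.9, 1.8, 1.6],
    [1.1, 2.2, 1.3],
    [1.0, 1.9, 1.7],
    [5.0, 9.0, 6.0],
]
explained, contributions = first_component(rows)
```

In this toy example the extreme sample dominates PC1, which is the pattern that would flag an ID as a potential outlier in a contribution plot.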
- PCA: effect of covariates
There is a strong effect of batch and sample weight, with little impact of other factors (Figure 7).
- Exploration of the sample weight effect
- Visual inspection of associations between compound concentrations and sample weight
fakeID18 appears to be an outlier (Figure 8), although visual inspection is not accurate in this case. The analytical lab did not identify any technical reason for this, so it is most probably a biological rather than a technical outlier. We recommend a sensitivity analysis without this sample.
- Numerical analysis of the linear effects and correlation between compound concentrations and sample weight
| Compound | Estimate (CI) | Overall p-val. | rho |
|---|---|---|---|
| Compound 1 | 0.02 (0.01; 0.02) | 0.00 | 0.15 |
| Compound 2 | 0.01 (0.01; 0.02) | 0.00 | 0.10 |
| Compound 3 | 0.02 (0.01; 0.02) | 0.00 | 0.15 |
| Compound 4 | 0.02 (0.02; 0.03) | 0.00 | 0.17 |
| Compound 5 | 0.02 (0.01; 0.02) | 0.00 | 0.14 |
| Compound 6 | 0.02 (0.01; 0.02) | 0.00 | 0.18 |
| Compound 7 | 0.02 (0.01; 0.02) | 0.00 | 0.13 |
| Compound 8 | 0.01 (0.00; 0.01) | 0.02 | 0.04 |
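The kind of estimates shown in the table can be illustrated with the sketch below, which computes an ordinary least squares slope of concentration on sample weight and a rank correlation. It rests on assumptions: the report does not state the model specification, any transformation of the concentrations, or whether rho is Spearman's or Pearson's coefficient, so the simple linear fit, the Spearman coefficient, the function names and the data are all illustrative. Confidence intervals and p-values are omitted.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical sample weights (mg) and weight-adjusted concentrations
weight = [10, 20, 30, 40, 50]
conc = [2.0, 2.1, 2.6, 2.5, 3.0]

slope = ols_slope(weight, conc)
rho = spearman_rho(weight, conc)
```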