Chemical compounds study

Quality Control Report

Author

Paulina Jedynak (paulina.jedynak@gmail.com)

Published

June 12, 2023

Abstract
Disclaimer: this is an example quality report on simulated data. The descriptions and conclusions presented may not be accurate and may not reflect a scientifically correct approach; they serve an illustrative purpose only.
Keywords

Cohort; chemical compounds

  1. Aim

The aim of the analysis was to assess the concentrations of compounds in biological samples collected from cohort subjects.

  1. Sample size and selection of samples for the analysis

In total, 957 samples from 857 unique participants were collected (Figure 1):

Figure 1: Study flowchart.
  1. Descriptive analysis of compounds
  1. Limits of detection (LOD) and quantification (LOQ)
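Table 1: Number and percentage of values below the LOD, below the LLOQ and above the ULOQ for compound amounts adjusted for sample weight (ng/g).
| Compound | n | % <LOD | n <LOD | % <LLOQ | n <LLOQ | % >ULOQ | n >ULOQ |
|---|---|---|---|---|---|---|---|
| Compound 1 | 826 | 0.6 | 5 | 0.6 | 5 | 0.1 | 1 |
| Compound 2 | 830 | 0.7 | 6 | 0.7 | 6 | 6.5 | 54 |
| Compound 3 | 830 | 0.2 | 2 | 0.2 | 2 | 1.6 | 13 |
| Compound 4 | 830 | 0.6 | 5 | 0.6 | 5 | 0.1 | 1 |
| Compound 5 | 830 | 0.4 | 3 | 0.4 | 3 | 0.5 | 4 |
| Compound 6 | 830 | 0.4 | 3 | 0.4 | 3 | 0.1 | 1 |
| Compound 7 | 830 | 0.6 | 5 | 1.2 | 10 | 0.0 | 0 |
| Compound 8 | 828 | 3.1 | 26 | 4.1 | 34 | 0.0 | 0 |
| Compound 9 | 830 | 0.2 | 2 | 0.2 | 2 | 70.5 | 585 |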

The number and percentage of values below the LOD, below the lower limit of quantification (LLOQ) and above the upper limit of quantification (ULOQ) are given for each compound in Table 1. LODs and LOQs were calculated for 50 mg of sample. Compound concentrations provided by the analytical lab were adjusted for sample weight (the raw concentration was divided by the sample weight). For reference, see Author (2021).
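The weight adjustment described above can be sketched as follows. This is a minimal illustration, not the analytical lab's procedure: the function name `adjust_for_weight`, the input units (ng for the raw amount, g for the sample weight) and the example numbers are assumptions made for the sketch.

```python
def adjust_for_weight(raw_amount_ng, sample_weight_g):
    """Divide a raw compound amount by the sample weight.

    Units are an assumption of this sketch: ng in, g in, ng/g out.
    """
    if sample_weight_g <= 0:
        raise ValueError("sample weight must be positive")
    return raw_amount_ng / sample_weight_g


# Hypothetical example: 0.25 ng measured in a 50 mg (0.05 g) sample.
adjusted = adjust_for_weight(0.25, 0.05)
print(f"{adjusted:.2f} ng/g")
```

Rejecting non-positive weights up front avoids silently producing infinite or negative concentrations for samples with a recording error.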

  1. Numerical summary of compounds concentrations
Table 2: Compound concentrations.
| Compound | Min | 5% | 25% | Median | Mean | 75% | 95% | Max |
|---|---|---|---|---|---|---|---|---|
| Compound 1 | <LOD | 1.6 | 2.7 | 3.9 | 5.1 | 5.9 | 12.4 | 83.7 |
| Compound 2 | <LOD | 0.3 | 0.6 | 0.9 | 1.5 | 1.6 | 4.3 | 45.3 |
| Compound 3 | <LOD | 2.3 | 3.9 | 5.3 | 6.2 | 7.3 | 12.8 | 101.5 |
| Compound 4 | <LOD | 8.3 | 17.7 | 32.6 | 38.2 | 48.5 | 82.9 | 938.2 |
| Compound 5 | <LOD | 4.6 | 7.8 | 11.4 | 14.3 | 16.3 | 32.8 | 434.4 |
| Compound 6 | <LOD | 2.7 | 5.0 | 7.0 | 8.3 | 9.6 | 16.6 | 177.4 |
| Compound 7 | <LOD | 1.2 | 2.4 | 3.5 | 3.9 | 4.9 | 7.7 | 26.5 |
| Compound 8 | <LOD | 0.2 | 0.3 | 0.5 | 0.7 | 0.9 | 1.8 | 5.6 |
| Compound 9 | <LOD | 49.6 | 125.0 | 228.6 | 381.2 | 438.9 | 1168.8 | 6466.0 |

The detection rate was very high for all compounds (at most 3.1% of values <LOD; Table 1). Compound 7 and Compound 8 had the highest numbers of <LLOQ values (10 and 34, respectively; Table 1). Compound 9 had a very high number of values >ULOQ (585 of 830, 70.5%; Table 1), which may be problematic. Consider imputation.
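Counts of this kind can be tallied per compound with a few lines of code. The sketch below is illustrative only: the function name `count_limit_flags`, the limit values and the toy data are hypothetical, and it assumes that values below the LOD are also counted as below the LLOQ, which is how the counts in Table 1 appear to be nested.

```python
def count_limit_flags(values, lod, lloq, uloq):
    """Return n and (count, percent) of values <LOD, <LLOQ and >ULOQ.

    Missing values (None) are excluded from n. Values below the LOD
    are also counted as below the LLOQ (assumed nesting).
    """
    observed = [v for v in values if v is not None]
    n = len(observed)
    if n == 0:
        raise ValueError("no observed values")
    counts = {
        "<LOD": sum(v < lod for v in observed),
        "<LLOQ": sum(v < lloq for v in observed),
        ">ULOQ": sum(v > uloq for v in observed),
    }
    flags = {k: (c, round(100 * c / n, 1)) for k, c in counts.items()}
    return n, flags


# Hypothetical concentrations and limits, for illustration only.
values = [0.1, 0.4, 2.0, 3.5, 12.0, None]
n, flags = count_limit_flags(values, lod=0.2, lloq=0.5, uloq=10.0)
print(n, flags)
```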

  1. Compounds missing data
Figure 2: Percentage of missing values per compound.

Only a few values for Compound 8 and Compound 1 were missing (Figure 2). Seven samples (fakeID1, fakeID2, fakeID3, fakeID4, fakeID5, fakeID6, fakeID7) were sent to the analytical lab but were not processed (because the sample amount was too small or the sample was lost); they are not included in the QC report and will not be imputed. Among the remaining samples, there were no missing values except for 2 values for Compound 8 and 4 values for Compound 1 that did not pass quality control (see the Outliers section). These values will not be imputed.
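As a rough illustration of how the percentages of missing values can be derived, the sketch below rebuilds the counts reported in this section (830 samples, 4 missing values for Compound 1, 2 for Compound 8) as simulated records. The function name `percent_missing` and the record layout (one dict per sample, `None` for a missing value) are assumptions of the example.

```python
def percent_missing(records, compounds):
    """Percentage of records with a missing (None) value, per compound."""
    if not records:
        raise ValueError("no records supplied")
    return {
        c: round(100 * sum(r.get(c) is None for r in records) / len(records), 2)
        for c in compounds
    }


# Simulated layout mirroring the counts reported in this section:
# 830 samples, 4 missing values for Compound 1 and 2 for Compound 8.
records = [{"Compound 1": 1.0, "Compound 8": 1.0} for _ in range(830)]
for r in records[:4]:
    r["Compound 1"] = None
for r in records[4:6]:
    r["Compound 8"] = None

missing = percent_missing(records, ["Compound 1", "Compound 8"])
print(missing)
```

Under these assumptions both compounds have well under 1% missing values.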

  1. Impact of data imputation
Figure 3: Comparison of ln-transformed non-imputed and imputed values for samples adjusted and unadjusted for weight.

The ranges of the imputed values look plausible.

Figure 4: Histograms of imputed compound concentration values for samples adjusted and unadjusted for weight.
  1. Compounds correlation structure
Figure 5: Correlation structure for the compounds.

As expected, the highest correlations are observed for Compound 1 and its metabolites and Compound 4 and its metabolites (Figure 5).
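The report does not state which correlation coefficient underlies Figure 5. As one possible sketch, the code below computes a Spearman (rank) correlation matrix in plain Python; the choice of Spearman, the function names and the toy data are all assumptions of the example.

```python
from math import sqrt


def ranks(xs):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def correlation_matrix(data):
    """Pairwise Spearman correlations for a dict of name -> values."""
    names = list(data)
    return {a: {b: spearman(data[a], data[b]) for b in names} for a in names}


# Hypothetical toy data: a parent compound, a metabolite, an unrelated series.
data = {
    "parent": [1.0, 2.0, 3.0, 4.0, 5.0],
    "metabolite": [2.0, 4.0, 5.0, 9.0, 20.0],
    "other": [5.0, 4.0, 3.0, 2.0, 1.0],
}
corr = correlation_matrix(data)
print(corr["parent"]["metabolite"], corr["parent"]["other"])
```

A rank-based coefficient is less sensitive to the heavy right tails visible in the concentration summaries than a Pearson coefficient on raw values would be.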

  1. Outliers

Of the final 830 available samples, 4 contained outlying Compound 1 values (fakeID9, fakeID10, fakeID11, fakeID12) and 2 contained outlying Compound 8 values (fakeID13, fakeID14), due to potential contamination. The analytical lab recommended removing these values, and we followed this recommendation. These values were not included in the QC report and will not be imputed.

One additional sample (fakeID15) was flagged by the analytical lab as an outlier for all metabolites because of an extremely low concentration of the internal standard. This sample was removed prior to the following analyses, was not included in the QC report and will not be imputed.

PCA - individual contribution

Figure 6: PCA-based individual contribution (the quality of the individuals on the factor map).

The PCA shows that the majority of variability (59.4%) is explained by the first principal component. Two more IDs are potential outliers: fakeID16 and fakeID17 (see Figure 6). The analytical lab did not find any reason for these, so they are most probably biological rather than technical outliers; however, fakeID20 also provided a very low amount of sample (<5 mg). We recommend a sensitivity analysis without these samples.
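To illustrate what the share of variability explained by the first principal component means, the sketch below estimates it as the largest eigenvalue of the covariance matrix divided by the total variance, using power iteration in plain Python. It is not the method used for Figure 6: the function names, the use of unscaled centered data and the toy inputs are assumptions, and a dedicated PCA routine would be preferable in practice.

```python
from math import sqrt


def covariance_matrix(rows):
    """Sample covariance matrix of rows (samples) by columns (variables)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_pc_variance_share(rows, n_iter=500):
    """Largest eigenvalue of the covariance matrix over its trace."""
    cov = covariance_matrix(rows)
    p = len(cov)
    # Asymmetric start vector, so it is unlikely to be orthogonal
    # to the leading eigenvector.
    v = [float(i + 1) for i in range(p)]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("data have no variance along the start vector")
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    total_variance = sum(cov[i][i] for i in range(p))
    return eigenvalue / total_variance


# Hypothetical toy data: two perfectly collinear variables.
collinear = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
share = first_pc_variance_share(collinear)
print(round(share, 3))
```

With perfectly collinear variables a single component carries all of the variance, so the share approaches 1; with uncorrelated variables of equal variance it falls to 1 divided by the number of variables.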

PCA - effect of covariates

Figure 7: PCA with potentially influential factors overlaid.

There is a strong effect of batch and sample weight, with little impact of other factors (Figure 7).

Exploration of the sample weight effect

Visual inspection of associations between compound concentrations and sample weight

Figure 8: Fitted associations between compounds’ concentrations and sample weight.

fakeID18 appears to be an outlier (Figure 8), although visual inspection is not accurate in this case. The analytical lab did not find any reason for this, so it is most probably a biological rather than a technical outlier. We recommend a sensitivity analysis without this sample.

Numerical analysis of the linear effects and correlation between compound concentrations and sample weight

Table 3: Regression estimates for single linear regressions with sample weight predicting compound concentrations. Only estimates for associations with an overall-effect p-value < 0.1 or a linear-association p-value < 0.1 are displayed.
| Compound | Estimate (CI) | Overall p-value | rho |
|---|---|---|---|
| Compound 1 | 0.02 (0.01; 0.02) | 0.00 | 0.15 |
| Compound 2 | 0.01 (0.01; 0.02) | 0.00 | 0.10 |
| Compound 3 | 0.02 (0.01; 0.02) | 0.00 | 0.15 |
| Compound 4 | 0.02 (0.02; 0.03) | 0.00 | 0.17 |
| Compound 5 | 0.02 (0.01; 0.02) | 0.00 | 0.14 |
| Compound 6 | 0.02 (0.01; 0.02) | 0.00 | 0.18 |
| Compound 7 | 0.02 (0.01; 0.02) | 0.00 | 0.13 |
| Compound 8 | 0.01 (0; 0.01) | 0.02 | 0.04 |
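A slope estimate with a confidence interval of the kind shown in Table 3 can be sketched with ordinary least squares. The code below is an illustration under stated assumptions, not the model used for the table: the function name, the toy data and the normal approximation for the interval (reasonable only for large samples) are choices made for the example, and the report does not say whether concentrations were transformed before fitting.

```python
from math import sqrt
from statistics import NormalDist


def slope_with_ci(x, y, level=0.95):
    """OLS slope of y on x with a normal-approximation confidence interval."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares and standard error of the slope.
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    se = sqrt(rss / (n - 2) / sxx)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return slope, (slope - z * se, slope + z * se)


# Hypothetical sample weights (mg) and compound concentrations.
weight = [10.0, 20.0, 30.0, 40.0, 50.0]
concentration = [1.1, 1.9, 3.2, 3.9, 5.1]
slope, (lower, upper) = slope_with_ci(weight, concentration)
print(round(slope, 3), round(lower, 3), round(upper, 3))
```

For small samples a t-distribution quantile would give a wider and more appropriate interval than the normal quantile used here.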

Bibliography

Author. 2021. “Title of the Publication.” J Chromatogr A 1624 (August): 24534654.