* These authors contributed equally to this work.
1 Barcelona Institute for Global Health.
2 Department of Psychiatry and Psychotherapy, Heidelberg
University.
3 Univ Rennes, Inserm, EHESP, Irset.
✉ Correspondence: Augusto Anguita-Ruiz <augusto.anguita@isglobal.org>
The current report serves as an online companion for the manuscript “To be supplied [”Title”. Author X et al. 2022.]”. With this analysis, we demonstrate the suitability of our method “dsLassoCov” to perform feature-selection in a real scenario with complex exposome data and under the privacy-protected federated DataSHIELD system.
In this section, we will describe the data employed for the showcase, giving details about the experimental design, the type of features available, and the structure and organization of the data in the Opal infrastructure.
The research dataset employed here derives from the HELIX (Human Early-Life Exposome) Project. The HELIX project gathers data from 6 longitudinal-based European birth cohorts with the aim of evaluating the effect of environmental risk factors on mothers’s and children’s health. HELIX cohorts include the BIB (Born in Bradford) (United Kingdom), EDEN (Étude des Déterminants pré et postnatals du développement et de la santé de l’ENfant) (France), INMA (INfancia y Medio Ambiente) (Spain), KANC (Kaunus Cohort) (Lithuania), MoBa (Norwegian Mother and Child Cohort Study) (Norway), and Rhea (Mother-Child Cohort in Crete) (Greece). General details of the study design can be found in the Figure 1. The whole HELIX dataset includes a total of 31,472 mother-child pairs. Among them, a subcohort of 1,298 children (approximately 200 children in each cohort) was selected in this study according to the following criteria of eligibility: 1) age 6 to 11 years at the moment of outcome evaluation; 2) complete address history; and 3) no serious health problems that may affect the clinical testing or the child safety.
Figure 1. General overview of the HELIX research project.
In the 1,301 children, a wide range of environmental exposures were evaluated to define the early-life exposome during two time periods: the prenatal pregnancy period and postnatal period (childhood age 6 to 11 years). Collected exposures comprise the three main parts of the exposome: outdoor exposures, chemical exposures, and lifestyle factors. All variables incorporated in the dataset have been appropriately pre-processed previous to analysis (normalized and scaled, outliers removed, and missing values imputated). Regarding phenotype data, a wide range of health outcomes are also available for the postnatal (childhood) period including outcomes related to (1) obesity and cardiometabolic health, (2) respiratory health, and (3) cognition and mental health.
HELIX Exposome data included information for 208 exposures and 88 phenotypes (Table 1).
| Cohort | N Individuals | N Predictors | N Phenotypes |
|---|---|---|---|
| BIB_cohort | 202 | 208 | 88 |
| EDEN_cohort | 198 | 208 | 88 |
| INMA-SAB_cohort | 223 | 208 | 88 |
| KANC_cohort | 204 | 208 | 88 |
| MoBA_cohort | 272 | 208 | 88 |
| Rhea_cohort | 199 | 208 | 88 |
| Combined Studies | 1298 | 208 | 88 |
The outcome of interest in the present study was the blood pressure at the postnatal period (childhood). Specifically, systolic and diastolic blood pressure values (SBP and DBP). These outcomes have been previously investigated in this population and the main findings of the research can be found in (Warembourg et al. 2019) and (Warembourg et al. 2021). In this showcase, available exposure data comprised 115 postnatal and 93 prenatal variables, respectively. Among them, we focused only in continuous exposures (77 postnatal and 69 prenatal, respectively). A separate model was run for each outcome (SBP and DBP) and each time point (prenatal and postnatal).
Nine confounders were identified from a Directed Acyclic Graph (Figure 2), including cohort of recruitment, maternal age (continuous in years), maternal educational level (low, middle, or high), self-reported maternal pre-pregnancy body mass index (continuous in kg/m2), parity (nulliparous, primiparous, or multiparous), native (if the child family is native from country of recruitment), and child age, height and sex. Cohort of recruitment was treated as six dummy variables (one for each population), and only five of them were finally included as confounders.
Figure 2. Direct Acyclic Graph design to identified confounder in the association between the early-life exposome and child blood pressure.
The dataset was uploaded to the Opal BRGE site hosted by the Bioinformatic Research Group in Epidemiology of ISGlobal, simulating a single-site DataSHIELD infrastructure (Figure 3). Details for accessing the server can be found in the respective DataSHIELD analysis section. Data were stored under the form of .csv (one per cohort, outcome and time point), and can be easily loaded as tables into the DataSHIELD environment. A summary of the structure and organization of the data in the Opal for the HELIX project is illustrated in Figure 3.
Figure 3. Input HELIX dataset organized by cohort in the Opal server.
In this section, we show how to load the data into the DataSHIELD environment. We start by creating the connection to the opal server using an user, who have DataSHIELD permissions to Opal servers. Please, note that in our example, all datasets are hosted in the same Opal but each cohort sub-dataset is accessed separately.
# Loading DataSHIELD required packages
require(DSOpal)
require(DSI)
require(dsBaseClient)
require(dsExposomeClient)
require(dsMTLClient)
require(dsHelper)
# Loading additional required packages
require(dplyr)
require(ggplot2)
require(ggrepel)
require(tidyverse)
require(ggrepel)
require(reshape2)
require(RColorBrewer)
require(kableExtra)
require(grid)
require(gridExtra)
require(lattice)
require(ggpubr)
# Create connections
builder <- DSI::newDSLoginBuilder()
builder$append(server = "BIB", url = "https://opal.isglobal.org/repo",
user = "invited", password = "12345678Aa@",
profile = "rock-lasso")
builder$append(server = "EDEN", url = "https://opal.isglobal.org/repo",
user = "invited", password = "12345678Aa@",
profile = "rock-lasso")
builder$append(server = "KANC", url = "https://opal.isglobal.org/repo",
user = "invited", password = "12345678Aa@",
profile = "rock-lasso")
builder$append(server = "MoBA", url = "https://opal.isglobal.org/repo",
user = "invited", password = "12345678Aa@",
profile = "rock-lasso")
builder$append(server = "Rhea", url = "https://opal.isglobal.org/repo",
user = "invited", password = "12345678Aa@",
profile = "rock-lasso")
builder$append(server = "INMASAB", url = "https://opal.isglobal.org/repo",
user = "invited", password = "12345678Aa@",
profile = "rock-lasso")
logindata <- builder$build()
conns <- DSI::datashield.login(logins = logindata)Then, we can load all the .csv tables available in the Opal server, corresponding to each cohort/phenotype, using the DSI::datashield.assign.table() function. This function takes the connections to the server created in the previous code chunk to assign all available objects from the the Opal, to an R object in the DataSHIELD remote session.
# We assign the resources for the DBP outcome
DSI::datashield.assign.table(conns[1], "dataDBP_PRE", "HELIX.data1DBP_pre", async = FALSE)
DSI::datashield.assign.table(conns[2], "dataDBP_PRE", "HELIX.data2DBP_pre", async = FALSE)
DSI::datashield.assign.table(conns[3], "dataDBP_PRE", "HELIX.data3DBP_pre", async = FALSE)
DSI::datashield.assign.table(conns[4], "dataDBP_PRE", "HELIX.data4DBP_pre", async = FALSE)
DSI::datashield.assign.table(conns[5], "dataDBP_PRE", "HELIX.data5DBP_pre", async = FALSE)
DSI::datashield.assign.table(conns[6], "dataDBP_PRE", "HELIX.data6DBP_pre", async = FALSE)
DSI::datashield.assign.table(conns[1], "dataDBP_POS", "HELIX.data1DBP_POS", async = FALSE)
DSI::datashield.assign.table(conns[2], "dataDBP_POS", "HELIX.data2DBP_POS", async = FALSE)
DSI::datashield.assign.table(conns[3], "dataDBP_POS", "HELIX.data3DBP_POS", async = FALSE)
DSI::datashield.assign.table(conns[4], "dataDBP_POS", "HELIX.data4DBP_POS", async = FALSE)
DSI::datashield.assign.table(conns[5], "dataDBP_POS", "HELIX.data5DBP_POS", async = FALSE)
DSI::datashield.assign.table(conns[6], "dataDBP_POS", "HELIX.data6DBP_POS", async = FALSE)
# We assign the resources for the SBP outcome
DSI::datashield.assign.table(conns[1], "dataSBP_PRE", "HELIX.data1SBP_PRE", async = FALSE)
DSI::datashield.assign.table(conns[2], "dataSBP_PRE", "HELIX.data2SBP_PRE", async = FALSE)
DSI::datashield.assign.table(conns[3], "dataSBP_PRE", "HELIX.data3SBP_PRE", async = FALSE)
DSI::datashield.assign.table(conns[4], "dataSBP_PRE", "HELIX.data4SBP_PRE", async = FALSE)
DSI::datashield.assign.table(conns[5], "dataSBP_PRE", "HELIX.data5SBP_PRE", async = FALSE)
DSI::datashield.assign.table(conns[6], "dataSBP_PRE", "HELIX.data6SBP_PRE", async = FALSE)
DSI::datashield.assign.table(conns[1], "dataSBP_POS", "HELIX.data1SBP_POS", async = FALSE)
DSI::datashield.assign.table(conns[2], "dataSBP_POS", "HELIX.data2SBP_POS", async = FALSE)
DSI::datashield.assign.table(conns[3], "dataSBP_POS", "HELIX.data3SBP_POS", async = FALSE)
DSI::datashield.assign.table(conns[4], "dataSBP_POS", "HELIX.data4SBP_POS", async = FALSE)
DSI::datashield.assign.table(conns[5], "dataSBP_POS", "HELIX.data5SBP_POS", async = FALSE)
DSI::datashield.assign.table(conns[6], "dataSBP_POS", "HELIX.data6SBP_POS", async = FALSE)Since dsLassoCov functions require data of each cohort to be passed as separate matrices for predictors and the outcome, we still needed to do some additional data preparation (i.e., predictor data were assigned to a matrix called “X”, and the outcome to an object called “Y”).
# Checks on the DBP dataset
ds.class("dataDBP_POS")
ds.dim("dataDBP_POS")
# Checks on the SBP dataset
ds.class("dataSBP_POS")
ds.dim("dataSBP_POS")
# Assign outcome data to a separate object called Y
ds.assign(toAssign='dataDBP_POS$Y', newobj='Y_DBP', datasources = conns)
ds.assign(toAssign='dataSBP_POS$Y', newobj='Y_SBP', datasources = conns)
# Create a vector with all ones
ds.make(toAssign = "Y_DBP-Y_DBP+1",newobj = "ONES",datasources = conns)
# Select predictors and assign them to a separate object called X (we will exclude here one of the cohorts dummy variables (Cohort EDEN) to avoid perfect collinearity between predictors)
ds.dataFrameSubset(df.name = 'dataDBP_PRE', V1.name = "ONES", V2.name = "ONES", Boolean.operator = "==",keep.cols = c(1,3:84),
newobj = 'X_DBP_PRE', datasources = conns)
ds.dataFrameSubset(df.name = 'dataDBP_POS', V1.name = "ONES", V2.name = "ONES", Boolean.operator = "==",keep.cols = c(1,3:92),
newobj = 'X_DBP_POS', datasources = conns)
ds.dataFrameSubset(df.name = 'dataSBP_PRE', V1.name = "ONES", V2.name = "ONES", Boolean.operator = "==",keep.cols = c(1,3:84),
newobj = 'X_SBP_PRE', datasources = conns)
ds.dataFrameSubset(df.name = 'dataSBP_POS', V1.name = "ONES", V2.name = "ONES", Boolean.operator = "==",keep.cols = c(1,3:92),
newobj = 'X_SBP_POS', datasources = conns)
# Checks on created files
ds.dim('X_DBP_PRE')
ds.dim('X_DBP_POS')
ds.dim('X_SBP_PRE')
ds.dim('X_SBP_POS')
ds.length('Y_DBP')
ds.length('Y_SBP')
# See covariates names:
ds.colnames('X_DBP_PRE')[[1]][1:13]
ds.colnames('X_SBP_POS')[[1]][1:13]
# We coerce both Xs and Y objects into matrix-type objects
ds.asMatrix(x.name = 'X_DBP_PRE', newobj = 'X_DBP_PRE', datasources = conns)
ds.asMatrix(x.name = 'X_DBP_POS', newobj = 'X_DBP_POS', datasources = conns)
ds.asMatrix(x.name = 'X_SBP_PRE', newobj = 'X_SBP_PRE', datasources = conns)
ds.asMatrix(x.name = 'X_SBP_POS', newobj = 'X_SBP_POS', datasources = conns)
ds.asMatrix(x.name = 'Y_DBP', newobj = 'Y_DBP', datasources = conns)
ds.asMatrix(x.name = 'Y_SBP', newobj = 'Y_SBP', datasources = conns)
# We assign objects from the remote session to R objects
# in the client-side
X_DBP="X_DBP_PRE"; X_DBP="X_DBP_POS"; Y_DBP="Y_DBP"
X_SBP="X_SBP_PRE"; X_SBP="X_SBP_POS"; Y_SBP="Y_SBP"The first step of the analysis involved performing a 5-folds cross-validation procedure over a sequence of 50 lambda values (estimated from the data), for the identification of the optimal lambda hyperparameter, which was selected according to MSE criterion.
# The "opts" parameter allows controlling the optimization algorithm employed to minimize the sum of squared errors (SSE) (objective function) during the coefficients estimation
opts=list(); opts$init=0; opts$maxIter=100; opts$tol=0.001; opts$ter=2;# Identification of the optimal lambda value by k-fold cross-validation for each outcome
cvResult_DBP_PRE <- ds.LassoCov_CVInSite(X=X_DBP_PRE, Y=Y_DBP, type="regress", nlambda=50, lam_ratio=0.01,
opts=opts, covar=c(1:13), datasources=conns, nDigits=4,
nfolds=5)
cvResult_DBP_POS <- ds.LassoCov_CVInSite(X=X_DBP_POS, Y=Y_DBP, type="regress", nlambda=50, lam_ratio=0.01,
opts=opts, covar=c(1:13), datasources=conns, nDigits=4,
nfolds=5)
cvResult_SBP_PRE <- ds.LassoCov_CVInSite(X=X_SBP_PRE, Y=Y_SBP, type="regress", nlambda=50, lam_ratio=0.01,
opts=opts, covar=c(1:13), datasources=conns, nDigits=4,
nfolds=5)
cvResult_SBP_POS <- ds.LassoCov_CVInSite(X=X_SBP_POS, Y=Y_SBP, type="regress", nlambda=50, lam_ratio=0.01,
opts=opts, covar=c(1:13), datasources=conns, nDigits=4,
nfolds=5)[1] 0.1263654
[1] 0.01969534
[1] 0.04561056
[1] 0.02002656
# PRENATAL
# Prepare model results for plot
dataset <- na.omit(reshape2::melt(cvResult_DBP_PRE$mse_fold))
LambdaVal <- round(apply(cvResult_DBP_PRE$lam_seq,2,mean),2)
LambdaVal <- paste(unique(dataset$Var2),"=",LambdaVal)
# Set range of colors for lambdas
fun_color_range <- colorRampPalette(c("#08737f", "#f95559")) # Apply colorRampPalette
my_colors <- fun_color_range(50)
# Boxplot showing the averaged MSE obtained for lambda values over folds
G1 <- ggplot(dataset,aes(x=Var1,y=value,fill=Var2))+geom_boxplot()+ labs(x = "Averaged lambda over folds",
y = "Mean squared error", fill='Lambda value') + scale_fill_manual(values=my_colors,labels = LambdaVal) + theme(legend.position = "none")+ ggtitle("DBP Prenatal")
# POSTNATAL
# Prepare model results for plot
dataset <- na.omit(reshape2::melt(cvResult_DBP_POS$mse_fold))
LambdaVal <- round(apply(cvResult_DBP_POS$lam_seq,2,mean),2)
LambdaVal <- paste(unique(dataset$Var2),"=",LambdaVal)
# Set range of colors for lambdas
fun_color_range <- colorRampPalette(c("#08737f", "#f95559")) # Apply colorRampPalette
my_colors <- fun_color_range(50)
# Boxplot showing the averaged MSE obtained for lambda values over folds
G2 <- ggplot(dataset,aes(x=Var1,y=value,fill=Var2))+geom_boxplot()+ labs(x = "Averaged lambda over folds",
y = "Mean squared error", fill='Lambda value') + scale_fill_manual(values=my_colors,labels = LambdaVal) + theme(legend.position = "none")+ ggtitle("DBP Postnatal")# PRENATAL
# Prepare model results for plot
dataset <- na.omit(reshape2::melt(cvResult_SBP_PRE$mse_fold))
LambdaVal <- round(apply(cvResult_SBP_PRE$lam_seq,2,mean),2)
LambdaVal <- paste(unique(dataset$Var2),"=",LambdaVal)
# Set range of colors for lambdas
fun_color_range <- colorRampPalette(c("#08737f", "#f95559")) # Apply colorRampPalette
my_colors <- fun_color_range(50)
# Boxplot showing the averaged MSE obtained for lambda values over folds
G3 <- ggplot(dataset,aes(x=Var1,y=value,fill=Var2))+geom_boxplot()+ labs(x = "Averaged lambda over folds",
y = "Mean squared error", fill='Lambda value') + scale_fill_manual(values=my_colors,labels = LambdaVal) + theme(legend.position = "none")+ ggtitle("SBP Prenatal")
# POSTNATAL
# Prepare model results for plot
dataset <- na.omit(reshape2::melt(cvResult_SBP_POS$mse_fold))
LambdaVal <- round(apply(cvResult_SBP_POS$lam_seq,2,mean),2)
LambdaVal <- paste(unique(dataset$Var2),"=",LambdaVal)
# Set range of colors for lambdas
fun_color_range <- colorRampPalette(c("#08737f", "#f95559")) # Apply colorRampPalette
my_colors <- fun_color_range(50)
# Boxplot showing the averaged MSE obtained for lambda values over folds
G4 <- ggplot(dataset,aes(x=Var1,y=value,fill=Var2))+geom_boxplot()+ labs(x = "Averaged lambda over folds",
y = "Mean squared error", fill='Lambda value') + scale_fill_manual(values=my_colors,labels = LambdaVal) + theme(legend.position = "none")+ ggtitle("SBP Postnatal")
ggarrange(G1, G2, G3, G4, ncol = 2, nrow = 2, labels=c("A","B","C","D"))The optimal lambda was 0.13 for the prenatal and 0.02 for the postnatal period. Predictors 1 to 13 in the dataset were defined as the adjusting covariates.
# Solver of Lasso Regression
OptimalModel_DBP_PRE <-ds.LassoCov_Train(X=X_DBP_PRE, Y=Y_DBP,type = "regress", lambda=cvResult_DBP$lambda.min, covar=c(1:13), opts=opts, datasources=conns, nDigits=15)
names(OptimalModel_DBP_PRE$ws) <- ds.colnames("X_DBP_PRE")[[1]]
OptimalModel_DBP_POS <- ds.LassoCov_Train(X=X_DBP_POS, Y=Y_DBP,type = "regress", lambda=cvResult_DBP$lambda.min, covar=c(1:13), opts=opts, datasources=conns, nDigits=15)
names(OptimalModel_DBP_POS$ws) <- ds.colnames("X_DBP_POS")[[1]]Below, the number and additional details for selected predictors in each model are shown.
[1] 13
# Show estimated coefficients and additional metadata for selected variables
toplot <- data.frame(OptimalModel_DBP_PRE$ws,ds.colnames("X_DBP_PRE")[[1]])
colnames(toplot) <- c("Coefficient","Index")
toplot_ <- codebook_vars[toplot[,2],c("Period","Group")]
toplot_save <- cbind(toplot,toplot_)
toplot_save <- toplot_save[which(toplot_save$Coefficient!=0),c(2,1,3,4)]
rownames(toplot_save) <- NULL
knitr::kable(toplot_save[order(abs(toplot_save$Coefficient),decreasing=T),], caption = "DBP Prenatal Optimal model") %>% row_spec(0,bold=TRUE) %>% kable_styling()| Index | Coefficient | Period | Group | |
|---|---|---|---|---|
| 1 | h_cohort_BIB | 0.4702208 | Pregnancy | Key Covariates |
| 4 | h_cohort_RHEA | -0.3599177 | Pregnancy | Key Covariates |
| 3 | h_cohort_MOBA | -0.1713993 | Pregnancy | Key Covariates |
| 5 | h_cohort_INMA | 0.1677149 | Pregnancy | Key Covariates |
| 10 | h_native_None | -0.0888654 | Pregnancy | Key Covariates |
| 13 | hs_c_height_None | 0.0883577 | Postnatal | Key Covariates |
| 6 | h_age_None | 0.0602509 | Pregnancy | Key Covariates |
| 8 | h_mbmi_None | 0.0496987 | Pregnancy | Key Covariates |
| 7 | h_edumc_None | -0.0483261 | Pregnancy | Key Covariates |
| 12 | hs_child_age_days_None | 0.0466813 | Postnatal | Key Covariates |
| 9 | h_parity_None | -0.0406308 | Pregnancy | Key Covariates |
| 2 | h_cohort_KANC | -0.0338111 | Pregnancy | Key Covariates |
| 11 | e3_sex_None | -0.0195808 | Pregnancy | Key Covariates |
toplot$vargroup <- rep(NA,nrow(toplot))
toplot$vargroup[which(toplot$Coefficient > 0)] <- "Positive"
toplot$vargroup[which(toplot$Coefficient == 0)] <- "Null"
toplot$vargroup[which(toplot$Coefficient < 0)] <- "Negative"
G5 <- ggplot(toplot, aes(x=Index, y=Coefficient, group = vargroup, label=Index)) +
geom_point(size=2,aes(color = vargroup)) +
scale_color_manual(values=c("#CD534CFF","black","#227CAD")) +
geom_hline(yintercept=0, linetype='dashed', col = "#868686FF") +
geom_label_repel(data = subset(toplot, toplot$vargroup!="Null" ),
size = 2,
box.padding = 0.5,
point.padding = 0.5,
force = 100,
segment.size = 0.2,
segment.color = "grey50", max.overlaps = 8
) +
theme(text = element_text(size=6), axis.text.x = element_text(angle = 90),legend.position = "none") + ggtitle("DBP Prenatal Optimal model") + xlab("Exposures") + ylab("Coefficients")
# POSTNATAL
# Get the number of selected variables
sum(OptimalModel_DBP_POS$ws!=0)[1] 43
# Show estimated coefficients and additional metadata for selected variables
toplot <- data.frame(OptimalModel_DBP_POS$ws,ds.colnames("X_DBP_POS")[[1]])
colnames(toplot) <- c("Coefficient","Index")
toplot_ <- codebook_vars[toplot[,2],c("Period","Group")]
toplot_save <- cbind(toplot,toplot_)
toplot_save <- toplot_save[which(toplot_save$Coefficient!=0),c(2,1,3,4)]
rownames(toplot_save) <- NULL
knitr::kable(toplot_save[order(abs(toplot_save$Coefficient),decreasing=T),], caption = "DBP Postnatal Optimal model") %>% row_spec(0,bold=TRUE) %>%
kable_styling()| Index | Coefficient | Period | Group | |
|---|---|---|---|---|
| 1 | h_cohort_BIB | 0.4518531 | Pregnancy | Key Covariates |
| 4 | h_cohort_RHEA | -0.3514934 | Pregnancy | Key Covariates |
| 5 | h_cohort_INMA | 0.1744509 | Pregnancy | Key Covariates |
| 3 | h_cohort_MOBA | -0.1676391 | Pregnancy | Key Covariates |
| 10 | h_native_None | -0.0866137 | Pregnancy | Key Covariates |
| 13 | hs_c_height_None | 0.0756864 | Postnatal | Key Covariates |
| 6 | h_age_None | 0.0661424 | Pregnancy | Key Covariates |
| 28 | hs_mg_c_Log2 | 0.0589452 | Postnatal | Essential minerals |
| 20 | hs_cu_c_Log2 | 0.0537502 | Postnatal | Metals |
| 43 | PSS_4_Score_None | 0.0433349 | Postnatal | Others |
| 12 | hs_child_age_days_None | 0.0396376 | Postnatal | Key Covariates |
| 25 | hs_mbzp_cadj_Log2 | -0.0390572 | Postnatal | Phthalates |
| 9 | h_parity_None | -0.0387779 | Pregnancy | Key Covariates |
| 7 | h_edumc_None | -0.0380214 | Pregnancy | Key Covariates |
| 31 | hs_na_c_Log2 | -0.0355180 | Postnatal | Essential minerals |
| 27 | hs_mep_cadj_Log2 | -0.0347788 | Postnatal | Phthalates |
| 8 | h_mbmi_None | 0.0325115 | Pregnancy | Key Covariates |
| 21 | hs_dde_cadj_Log2 | -0.0315272 | Postnatal | OCs |
| 2 | h_cohort_KANC | -0.0313477 | Pregnancy | Key Covariates |
| 41 | hs_tm_mt_hs_h_None | -0.0302424 | Postnatal | Meteorological |
| 16 | hs_as_c_Log2 | -0.0288183 | Postnatal | Metals |
| 17 | hs_bupa_cadj_Log2 | -0.0270073 | Postnatal | Phenols |
| 11 | e3_sex_None | -0.0236309 | Pregnancy | Key Covariates |
| 35 | hs_pbde47_cadj_Log2 | 0.0209138 | Postnatal | PBDEs |
| 34 | hs_pbde153_cadj_Log2 | -0.0202860 | Postnatal | PBDEs |
| 40 | hs_pm25abs_yr_hs_h_Log | 0.0145727 | Postnatal | Air Pollution |
| 42 | hs_trcs_cadj_Log2 | 0.0132693 | Postnatal | Phenols |
| 39 | hs_pfos_c_Log2 | 0.0116524 | Postnatal | PFASs |
| 22 | hs_dmp_cadj_Log2 | -0.0110715 | Postnatal | OP Pesticides |
| 26 | hs_mehp_cadj_Log2 | -0.0107084 | Postnatal | Phthalates |
| 32 | hs_oxominp_cadj_Log2 | 0.0104273 | Postnatal | Phthalates |
| 15 | hs_accesspoints300_h_Log | -0.0086248 | Postnatal | Built Environment |
| 14 | h_Absorbance_Log | 0.0081270 | Postnatal | Indoor air |
| 37 | hs_pcb138_cadj_Log2 | -0.0069165 | Postnatal | OCs |
| 23 | hs_etpa_cadj_Log2 | 0.0063827 | Postnatal | Phenols |
| 29 | hs_mo_c_Log2 | -0.0032549 | Postnatal | Metals |
| 19 | hs_co_c_Log2 | -0.0030013 | Postnatal | Metals |
| 18 | hs_cd_c_Log2 | 0.0027984 | Postnatal | Metals |
| 36 | hs_pcb118_cadj_Log2 | -0.0024705 | Postnatal | OCs |
| 30 | hs_mvpa_prd_alt_None | 0.0015177 | Postnatal | Lifestyle |
| 24 | hs_k_c_Log2 | 0.0012254 | Postnatal | Essential minerals |
| 38 | hs_pcb170_cadj_Log2 | -0.0004701 | Postnatal | OCs |
| 33 | hs_pb_c_Log2 | -0.0001753 | Postnatal | Metals |
toplot$vargroup <- rep(NA,nrow(toplot))
toplot$vargroup[which(toplot$Coefficient > 0)] <- "Positive"
toplot$vargroup[which(toplot$Coefficient == 0)] <- "Null"
toplot$vargroup[which(toplot$Coefficient < 0)] <- "Negative"
G6 <- ggplot(toplot, aes(x=Index, y=Coefficient, group = vargroup, label=Index)) +
geom_point(size=2,aes(color = vargroup)) +
scale_color_manual(values=c("#CD534CFF","black","#227CAD")) +
geom_hline(yintercept=0, linetype='dashed', col = "#868686FF") +
geom_label_repel(data = subset(toplot, toplot$Coefficient > quantile(abs(toplot$Coefficient[toplot$Coefficient>0]),0.15) | toplot$Coefficient < -quantile(abs(toplot$Coefficient[toplot$Coefficient<0]) ,0.45)),
size = 2,
box.padding = 0.5,
point.padding = 0.5,
force = 100,
segment.size = 0.2,
segment.color = "grey50", max.overlaps = 8
) +
theme(text = element_text(size=6), axis.text.x = element_text(angle = 90),legend.position = "none") + ggtitle("DBP Postnatal Optimal model") + xlab("Exposures") + ylab("Coefficients")
ggarrange(G5, G6, ncol = 2, labels=c("A","B"))The number of predictors remaining in the final models was 13 (only confounders) for the prenatal period, and 43 (13 confounders + 30 exposures) for the postnatal period. Intriguingly, in the prenatal period, a model with only adjusting covariates was enough to explain all the variance in DBP. On the other hand, in the postnatal period, the model evidenced how exposure to metals such as copper, or chemicals like PFAS and phenols was associated with higher blood pressure levels in children. Likewise, exposure to a high-stress environment (measured as perceived stress score) was associated with higher DBP. Interestingly, some urban environment factors such as the accesibility to public transport were evidenced as protective factors for high blood pressure. Among all exposures remaining in the final model as predictors of DBP, copper blood levels were one of the (non-confounder) exposures showing a higher effect size (B=0.05). Interestingly, many of these findings, especially those related to copper exposure or the effect of the urban environment on blood pressure levels were already described in our previous publications (Warembourg et al. 2019) and (Warembourg et al. 2021).
The optimal lambda was 0.05 for the prenatal and 0.02 for the postnatal period. Predictors 1 to 13 in the dataset were defined as the adjusting covariates.
# Solver of Lasso Regression
OptimalModel_SBP_PRE <- ds.LassoCov_Train(X=X_SBP_PRE, Y=Y_SBP,type = "regress", lambda=cvResult_SBP$lambda.min, covar=c(1:13), opts=opts, datasources=conns, nDigits=15)
names(OptimalModel_SBP_PRE$w) <- ds.colnames("X_SBP_PRE")[[1]]
OptimalModel_SBP_POS <- ds.LassoCov_Train(X=X_SBP_POS, Y=Y_SBP,type = "regress", lambda=cvResult_SBP$lambda.min, covar=c(1:13), opts=opts, datasources=conns, nDigits=15)
names(OptimalModel_SBP_POS$w) <- ds.colnames("X_SBP_POS")[[1]]Below, the number and additional details for selected predictors in each model are shown.
[1] 14
# Show estimated coefficients and additional metadata for selected variables
toplot <- data.frame(OptimalModel_SBP_PRE$ws,ds.colnames("X_SBP_PRE")[[1]])
colnames(toplot) <- c("Coefficient","Index")
toplot_ <- codebook_vars[toplot[,2],c("Period","Group")]
toplot_save <- cbind(toplot,toplot_)
toplot_save <- toplot_save[which(toplot_save$Coefficient!=0),c(2,1,3,4)]
rownames(toplot_save) <- NULL
knitr::kable(toplot_save[order(abs(toplot_save$Coefficient),decreasing = T),], caption = "SBP Prenatal Optimal model") %>% row_spec(0,bold=TRUE) %>%
kable_styling()| Index | Coefficient | Period | Group | |
|---|---|---|---|---|
| 1 | h_cohort_BIB | 0.5046683 | Pregnancy | Key Covariates |
| 13 | hs_c_height_None | 0.3476911 | Postnatal | Key Covariates |
| 2 | h_cohort_KANC | -0.3461856 | Pregnancy | Key Covariates |
| 5 | h_cohort_INMA | 0.2611074 | Pregnancy | Key Covariates |
| 3 | h_cohort_MOBA | -0.2104300 | Pregnancy | Key Covariates |
| 4 | h_cohort_RHEA | -0.1598021 | Pregnancy | Key Covariates |
| 6 | h_age_None | 0.0788615 | Pregnancy | Key Covariates |
| 12 | hs_child_age_days_None | 0.0784751 | Postnatal | Key Covariates |
| 8 | h_mbmi_None | 0.0772208 | Pregnancy | Key Covariates |
| 10 | h_native_None | -0.0583598 | Pregnancy | Key Covariates |
| 7 | h_edumc_None | -0.0477134 | Pregnancy | Key Covariates |
| 11 | e3_sex_None | 0.0385127 | Pregnancy | Key Covariates |
| 14 | h_fdensity300_preg_Log | -0.0198690 | Pregnancy | Built Environment |
| 9 | h_parity_None | -0.0110181 | Pregnancy | Key Covariates |
toplot$vargroup <- rep(NA,nrow(toplot))
toplot$vargroup[which(toplot$Coefficient > 0)] <- "Positive"
toplot$vargroup[which(toplot$Coefficient == 0)] <- "Null"
toplot$vargroup[which(toplot$Coefficient < 0)] <- "Negative"
G7 <- ggplot(toplot, aes(x=Index, y=Coefficient, group = vargroup, label=Index)) +
geom_point(size=2,aes(color = vargroup)) +
scale_color_manual(values=c("#CD534CFF","black","#227CAD")) +
geom_hline(yintercept=0, linetype='dashed', col = "#868686FF") +
geom_label_repel(data = subset(toplot, toplot$vargroup!="Null" ),
size = 2,
box.padding = 0.5,
point.padding = 0.5,
force = 100,
segment.size = 0.2,
segment.color = "grey50", max.overlaps = 8
) +
theme(text = element_text(size=6), axis.text.x = element_text(angle = 90),legend.position = "none") + ggtitle("SBP Prenatal Optimal model") + xlab("Exposures") + ylab("Coefficients")
# POSTNATAL
# Get the number of selected variables
sum(OptimalModel_SBP_POS$ws!=0)[1] 39
# Show estimated coefficients and additional metadata for selected variables
toplot <- data.frame(OptimalModel_SBP_POS$ws,ds.colnames("X_SBP_POS")[[1]])
colnames(toplot) <- c("Coefficient","Index")
toplot_ <- codebook_vars[toplot[,2],c("Period","Group")]
toplot_save <- cbind(toplot,toplot_)
toplot_save <- toplot_save[which(toplot_save$Coefficient!=0),c(2,1,3,4)]
rownames(toplot_save) <- NULL
knitr::kable(toplot_save[order(abs(toplot_save$Coefficient),decreasing = T),], caption = "SBP Postnatal Optimal model") %>% row_spec(0,bold=TRUE) %>%
kable_styling()| Index | Coefficient | Period | Group | |
|---|---|---|---|---|
| 1 | h_cohort_BIB | 0.4363117 | Pregnancy | Key Covariates |
| 13 | hs_c_height_None | 0.3183230 | Postnatal | Key Covariates |
| 2 | h_cohort_KANC | -0.2730910 | Pregnancy | Key Covariates |
| 5 | h_cohort_INMA | 0.2488328 | Pregnancy | Key Covariates |
| 3 | h_cohort_MOBA | -0.2245687 | Pregnancy | Key Covariates |
| 4 | h_cohort_RHEA | -0.1375123 | Pregnancy | Key Covariates |
| 6 | h_age_None | 0.0887953 | Pregnancy | Key Covariates |
| 21 | hs_dde_cadj_Log2 | -0.0750493 | Postnatal | OCs |
| 10 | h_native_None | -0.0669917 | Pregnancy | Key Covariates |
| 8 | h_mbmi_None | 0.0607844 | Pregnancy | Key Covariates |
| 12 | hs_child_age_days_None | 0.0564032 | Postnatal | Key Covariates |
| 31 | hs_mg_c_Log2 | 0.0531831 | Postnatal | Essential minerals |
| 11 | e3_sex_None | 0.0486127 | Pregnancy | Key Covariates |
| 27 | hs_hcb_cadj_Log2 | -0.0481409 | Postnatal | OCs |
| 39 | PSS_4_Score_None | 0.0380141 | Postnatal | Others |
| 28 | hs_mbzp_cadj_Log2 | -0.0281945 | Postnatal | Phthalates |
| 17 | hs_as_c_Log2 | -0.0265266 | Postnatal | Metals |
| 33 | hs_pbde153_cadj_Log2 | -0.0260163 | Postnatal | PBDEs |
| 37 | hs_pfunda_c_Log2 | 0.0239332 | Postnatal | PFASs |
| 7 | h_edumc_None | -0.0232285 | Pregnancy | Key Covariates |
| 24 | hs_dmtp_cadj_Log2 | -0.0217748 | Postnatal | OP Pesticides |
| 32 | hs_na_c_Log2 | -0.0208633 | Postnatal | Essential minerals |
| 14 | h_Absorbance_Log | 0.0197715 | Postnatal | Indoor air |
| 29 | hs_mehp_cadj_Log2 | -0.0187647 | Postnatal | Phthalates |
| 36 | hs_pfoa_c_Log2 | 0.0108690 | Postnatal | PFASs |
| 20 | hs_cu_c_Log2 | 0.0105436 | Postnatal | Metals |
| 34 | hs_pbde47_cadj_Log2 | 0.0082825 | Postnatal | PBDEs |
| 9 | h_parity_None | -0.0080572 | Pregnancy | Key Covariates |
| 30 | hs_mep_cadj_Log2 | -0.0080463 | Postnatal | Phthalates |
| 16 | hs_accesspoints300_h_Log | -0.0071812 | Postnatal | Built Environment |
| 38 | hs_pm25abs_yr_hs_h_Log | 0.0068807 | Postnatal | Air Pollution |
| 15 | h_Benzene_Log | 0.0058062 | Postnatal | Indoor air |
| 18 | hs_bupa_cadj_Log2 | -0.0049075 | Postnatal | Phenols |
| 23 | hs_dmp_cadj_Log2 | -0.0034749 | Postnatal | OP Pesticides |
| 26 | hs_frichness300_h_None | -0.0032700 | Postnatal | Built Environment |
| 35 | hs_pfna_c_Log2 | 0.0022736 | Postnatal | PFASs |
| 25 | hs_etpa_cadj_Log2 | 0.0019301 | Postnatal | Phenols |
| 22 | hs_dep_cadj_Log2 | -0.0019300 | Postnatal | OP Pesticides |
| 19 | hs_cd_c_Log2 | 0.0000674 | Postnatal | Metals |
toplot$vargroup <- rep(NA,nrow(toplot))
toplot$vargroup[which(toplot$Coefficient > 0)] <- "Positive"
toplot$vargroup[which(toplot$Coefficient == 0)] <- "Null"
toplot$vargroup[which(toplot$Coefficient < 0)] <- "Negative"
G8 <- ggplot(toplot, aes(x=Index, y=Coefficient, group = vargroup, label=Index)) +
geom_point(size=2,aes(color = vargroup)) +
scale_color_manual(values=c("#CD534CFF","black","#227CAD")) +
geom_hline(yintercept=0, linetype='dashed', col = "#868686FF") +
geom_label_repel(data = subset(toplot, toplot$Coefficient > quantile(abs(toplot$Coefficient[toplot$Coefficient>0]),0.15) | toplot$Coefficient < -quantile(abs(toplot$Coefficient[toplot$Coefficient<0]) ,0.45)),
size = 2,
box.padding = 0.5,
point.padding = 0.5,
force = 100,
segment.size = 0.2,
segment.color = "grey50", max.overlaps = 8
) +
theme(text = element_text(size=6), axis.text.x = element_text(angle = 90),legend.position = "none") + ggtitle("SBP Postnatal Optimal model") + xlab("Exposures") + ylab("Coefficients")
ggarrange(G7, G8, ncol = 2, labels=c("A","B"))The number of predictors remaining in the final models was 14 (13 confounders + 1 exposure) for the prenatal period, and 39 (13 confounders + 26 exposures) for the postnatal period. Interestingly, a higher number of facility types near children’s residences in both pregnancy and postnatal periods was associated with lower SBP values (this was already observed by (Warembourg et al. 2021)). For the postnatal period, results were pretty concordant with those extracted for DBP, remaining the most interesting associations, such as those revealed for high-stress environment or copper blood levels.
In both cases (DBP and SBP), unadjusted models revealed similar findings to adjusted models in terms of selected variables and direction of associations. Nevertheless, in comparison to dsLassoCov, the uncorrected dsLasso approach yielded less sparse (and therefore less interpretable) models as well as selected some unexpected variables with extremely high effect sizes and underestimated the effect of some other exposures, such as those identified as key confounders. For example, in the case of the postnatal SBP unadjusted model, the blood levels of hexachlorobenzene were selected as one of the exposures with the highest effect sizes, even above key known-exposures such as the cohort of recruitment or the maternal BMI. Likewise, unadjusted models evidenced unexpected associations such as an inverse proportional relationship between the total traffic load of all roads in 100 m buffer at home and SBP. This is an unexpected finding since previous studies have shown how noise and air pollution in cities is a well-known trigger of high blood pressure and cardiovascular disease.
[1] 49
# Show estimated coefficients and additional metadata for selected variables
toplot <- data.frame(OptimalModel_SBP_POS_nocovs$w,ds.colnames("X_SBP_POS")[[1]])
colnames(toplot) <- c("Coefficient","Index")
toplot_ <- codebook_vars[toplot[,2],c("Period","Group")]
toplot_save <- cbind(toplot,toplot_)
toplot_save <- toplot_save[which(toplot_save$Coefficient!=0),c(2,1,3,4)]
rownames(toplot_save) <- NULL
knitr::kable(toplot_save[order(abs(toplot_save$Coefficient),decreasing=T),], caption = "SBP Postnatal Unadjusted Optimal model") %>% row_spec(0,bold=TRUE) %>%
kable_styling()| Index | Coefficient | Period | Group | |
|---|---|---|---|---|
| 9 | hs_c_height_None | 0.2287006 | Postnatal | Key Covariates |
| 25 | hs_hcb_cadj_Log2 | -0.1448063 | Postnatal | OCs |
| 19 | hs_dde_cadj_Log2 | -0.0938842 | Postnatal | OCs |
| 8 | hs_child_age_days_None | 0.0751858 | Postnatal | Key Covariates |
| 6 | h_native_None | -0.0684704 | Pregnancy | Key Covariates |
| 32 | hs_mg_c_Log2 | 0.0583768 | Postnatal | Essential minerals |
| 3 | h_age_None | 0.0561176 | Pregnancy | Key Covariates |
| 49 | PSS_4_Score_None | 0.0538807 | Postnatal | Others |
| 5 | h_mbmi_None | 0.0518649 | Pregnancy | Key Covariates |
| 28 | hs_mbzp_cadj_Log2 | -0.0473453 | Postnatal | Phthalates |
| 37 | hs_no2_yr_hs_h_Log | 0.0452616 | Postnatal | Air Pollution |
| 45 | hs_pm25abs_yr_hs_h_Log | 0.0404114 | Postnatal | Air Pollution |
| 42 | hs_pfoa_c_Log2 | 0.0397292 | Postnatal | PFASs |
| 11 | h_NO2_Log | 0.0369238 | Postnatal | Indoor air |
| 7 | e3_sex_None | 0.0339399 | Pregnancy | Key Covariates |
| 13 | hs_accesspoints300_h_Log | 0.0242645 | Postnatal | Built Environment |
| 47 | hs_trafload_h_pow1over3 | -0.0226582 | Postnatal | Traffic |
| 10 | h_Absorbance_Log | 0.0197312 | Postnatal | Indoor air |
| 40 | hs_pbde47_cadj_Log2 | 0.0185071 | Postnatal | PBDEs |
| 39 | hs_pbde153_cadj_Log2 | -0.0176491 | Postnatal | PBDEs |
| 18 | hs_cu_c_Log2 | 0.0174570 | Postnatal | Metals |
| 20 | hs_dmp_cadj_Log2 | -0.0173723 | Postnatal | OP Pesticides |
| 4 | h_edumc_None | -0.0156090 | Pregnancy | Key Covariates |
| 34 | hs_mnbp_cadj_Log2 | -0.0151162 | Postnatal | Phthalates |
| 30 | hs_mehp_cadj_Log2 | -0.0145265 | Postnatal | Phthalates |
| 26 | hs_hg_c_Log2 | 0.0142944 | Postnatal | Metals |
| 15 | hs_bupa_cadj_Log2 | -0.0138145 | Postnatal | Phenols |
| 24 | hs_frichness300_h_None | -0.0137729 | Postnatal | Built Environment |
| 21 | hs_dmtp_cadj_Log2 | -0.0135055 | Postnatal | OP Pesticides |
| 27 | hs_landuseshan300_h_None | 0.0108409 | Postnatal | Built Environment |
| 22 | hs_etpa_cadj_Log2 | 0.0100900 | Postnatal | Phenols |
| 31 | hs_mepa_cadj_Log2 | 0.0099626 | Postnatal | Phenols |
| 14 | hs_as_c_Log2 | -0.0098313 | Postnatal | Metals |
| 46 | hs_se_c_Log2 | 0.0088805 | Postnatal | Essential minerals |
| 36 | hs_ndvi100_h_None | -0.0082579 | Postnatal | Natural Spaces |
| 43 | hs_pfunda_c_Log2 | 0.0063179 | Postnatal | PFASs |
| 35 | hs_na_c_Log2 | -0.0042086 | Postnatal | Essential minerals |
| 33 | hs_mibp_cadj_Log2 | 0.0040371 | Postnatal | Phthalates |
| 41 | hs_pfhxs_c_Log2 | 0.0028548 | Postnatal | PFASs |
| 2 | h_cohort_MOBA | -0.0025974 | Pregnancy | Key Covariates |
| 1 | h_cohort_BIB | 0.0025684 | Pregnancy | Key Covariates |
| 38 | hs_ohminp_cadj_Log2 | 0.0011654 | Postnatal | Phthalates |
| 17 | hs_cs_c_Log2 | 0.0007918 | Postnatal | Metals |
| 44 | hs_pm25_yr_hs_h_None | 0.0007645 | Postnatal | Air Pollution |
| 48 | hs_uvdvf_mt_hs_h_None | 0.0006950 | Postnatal | Meteorological |
| 12 | h_TEX_Log | -0.0006882 | Postnatal | Indoor air |
| 23 | hs_fdensity300_h_Log | -0.0003913 | Postnatal | Built Environment |
| 16 | hs_cd_c_Log2 | 0.0000234 | Postnatal | Metals |
| 29 | hs_mecpp_cadj_Log2 | -0.0000119 | Postnatal | Phthalates |
With this showcase, we thereby demonstrate that our method dsLassoCov and the DataSHIELD infrastructure are powerful tools for performing feature selection in complex scenarios such as the case of exposome research datasets, in which confounding effects can be strong biasing research findings. The fact of two papers already published with a similar approach showing coincident results ((Warembourg et al. 2019) and (Warembourg et al. 2021)) also add robustness to our statements. Specially in the case of sensitive multi-centre biomedical research studies, the use of federated-analysis platforms such as DataSHIELD should be promoted.
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C
[3] LC_TIME=es_ES.UTF-8 LC_COLLATE=es_ES.UTF-8
[5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=es_ES.UTF-8
[7] LC_PAPER=es_ES.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] ggpubr_0.4.0 lattice_0.20-45 gridExtra_2.3
[4] kableExtra_1.3.4 RColorBrewer_1.1-3 reshape2_1.4.4
[7] forcats_0.5.2 stringr_1.4.0 purrr_0.3.4
[10] readr_2.1.2 tidyr_1.2.1 tibble_3.1.5
[13] tidyverse_1.3.1 ggrepel_0.9.1 ggplot2_3.3.6
[16] dplyr_1.0.10 dsHelper_0.4.18.9000 dsMTLClient_0.9
[19] corpcor_1.6.9 dsExposomeClient_2.0.4 dsBaseClient_6.1.1
[22] DSOpal_1.3.1 DSI_1.4.0 R6_2.5.1
[25] progress_1.2.2 opalr_3.1.1 httr_1.4.4
loaded via a namespace (and not attached):
[1] nlme_3.1-159 fs_1.5.2 lubridate_1.8.0 webshot_0.5.3
[5] tools_4.2.0 backports_1.2.1 bslib_0.4.0 utf8_1.2.2
[9] metafor_3.8-1 DBI_1.1.2 colorspace_2.0-3 withr_2.5.0
[13] tidyselect_1.1.2 prettyunits_1.1.1 curl_4.3.2 compiler_4.2.0
[17] cli_3.4.0 rvest_1.0.2 xml2_1.3.3 labeling_0.4.2
[21] sass_0.4.2 scales_1.2.1 systemfonts_1.0.4 digest_0.6.28
[25] rmarkdown_2.16 svglite_2.1.0 pkgconfig_2.0.3 htmltools_0.5.3
[29] labelled_2.9.1 dbplyr_2.2.1 fastmap_1.1.0 highr_0.9
[33] rlang_1.0.5 readxl_1.4.1 rstudioapi_0.14 farver_2.1.1
[37] jquerylib_0.1.4 generics_0.1.2 jsonlite_1.7.2 car_3.1-0
[41] magrittr_2.0.1 metadat_1.2-0 Matrix_1.4-1 Rcpp_1.0.9
[45] munsell_0.5.0 fansi_0.5.0 abind_1.4-5 lifecycle_1.0.1
[49] stringi_1.7.5 yaml_2.3.5 carData_3.0-4 mathjaxr_1.6-0
[53] plyr_1.8.7 crayon_1.4.1 cowplot_1.1.1 haven_2.5.1
[57] hms_1.1.2 knitr_1.40 pillar_1.8.1 ggsignif_0.6.3
[61] codetools_0.2-18 reprex_2.0.2 glue_1.6.2 evaluate_0.16
[65] modelr_0.1.9 vctrs_0.4.1 tzdb_0.3.0 cellranger_1.1.0
[69] gtable_0.3.0 assertthat_0.2.1 cachem_1.0.6 xfun_0.32
[73] mime_0.12 broom_1.0.1 rstatix_0.7.0 viridisLite_0.4.1
[77] ellipsis_0.3.2