Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company while, at the same time, more rapidly identifying molecules that have sufficient permeability to become a drug:
Start R and use these commands to load the data:
library(AppliedPredictiveModeling)
data(permeability)
permeability <- data.frame(permeability)
dim(permeability)
## [1] 165 1
The matrix fingerprints contains the 1,107 binary molecular predictors for the 165 compounds, while permeability contains the permeability response.
The fingerprint predictors indicate the presence or absence of substructures of a molecule and are often sparse, meaning that relatively few of the molecules contain each substructure. Filter out the predictors that have low frequencies using the nearZeroVar function from the caret package.
How many predictors are left for modeling? 388 predictors remain in the fingerprints dataset after removing low-variability features with the nearZeroVar function.
head(permeability)
## permeability
## 1 12.520
## 2 1.120
## 3 19.405
## 4 1.730
## 5 1.680
## 6 0.510
dim(fingerprints)
## [1] 165 1107
library(caret)
# Identify the near-zero-variance predictors
nzv_idx <- nearZeroVar(fingerprints)
# Remove these predictors from the dataset
ft_fingerprints <- fingerprints[, -nzv_idx]
# Get the dimensions of the filtered dataset to see how many predictors are left
dim(ft_fingerprints)
## [1] 165 388
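To see why particular fingerprints were flagged, nearZeroVar can also return its screening metrics (an optional check; its output is not part of the original analysis):
# Inspect the metrics behind the filter: freqRatio, percentUnique, zeroVar, nzv
nzv_metrics <- nearZeroVar(fingerprints, saveMetrics = TRUE)
head(nzv_metrics)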
Split the data into a training and a test set, pre-process the data, and tune a PLS model. How many latent variables are optimal and what is the corresponding resampled estimate of R2?
Based on the resampling summary, the optimal number of latent variables for the partial least squares (PLS) model is 3. For the model with 3 latent variables (ncomp = 3), the resampled R² estimate is 0.5613346.
str(permeability)
## 'data.frame': 165 obs. of 1 variable:
## $ permeability: num 12.52 1.12 19.41 1.73 1.68 ...
combined_data <- data.frame(cbind(permeability, ft_fingerprints))
set.seed(123)
idx <- createDataPartition(combined_data$permeability, p = 0.8, list = FALSE)
train <- combined_data[idx,]
test <- combined_data[-idx,]
train_control <- trainControl(method = "cv", number = 10, search = "grid")
# Define the tuneGrid to specify the range of components to evaluate
tune_grid <- expand.grid(.ncomp = 1:20)
# PLS model
set.seed(123)
pls_model <- train(permeability ~ ., data = train,
method = "pls",
tuneGrid = tune_grid,
trControl = train_control,
preProc = c("center", "scale"),
metric = "RMSE")
print(pls_model)
## Partial Least Squares
##
## 133 samples
## 388 predictors
##
## Pre-processing: centered (388), scaled (388)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 120, 119, 118, 120, 121, 119, ...
## Resampling results across tuning parameters:
##
## ncomp RMSE Rsquared MAE
## 1 13.36372 0.3382069 10.237820
## 2 11.47594 0.5347231 8.245104
## 3 11.43313 0.5613346 8.816852
## 4 11.51915 0.5487654 8.915167
## 5 11.50333 0.5490984 8.703944
## 6 11.47723 0.5342174 8.540912
## 7 11.49816 0.5410694 8.847240
## 8 11.60656 0.5519058 9.008913
## 9 11.79130 0.5618654 8.932466
## 10 12.10267 0.5458038 9.186885
## 11 12.33462 0.5275119 9.395033
## 12 12.54970 0.5170371 9.587169
## 13 12.61600 0.5048875 9.563882
## 14 12.77412 0.4911729 9.714216
## 15 12.82636 0.4918708 9.846474
## 16 12.90771 0.4799217 9.845704
## 17 12.95694 0.4783518 9.847771
## 18 13.10433 0.4703636 9.932178
## 19 13.04104 0.4750762 9.902920
## 20 13.04802 0.4773802 9.894081
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was ncomp = 3.
plot(pls_model)
Predict the response for the test set. What is the test set estimate of R2?
The test set estimate of R² is 0.2695454, well below the resampled estimate of 0.5613.
# Evaluate test-set predictions with postResample from the caret package
predictions <- predict(pls_model, newdata = test)
results <- postResample(pred = predictions, obs = test$permeability)
print(results)
## RMSE Rsquared MAE
## 12.7568188 0.2695454 8.4155989
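Note that postResample reports Rsquared as the squared correlation between observed and predicted values, so the figure above can be reproduced by hand:
# Equivalent manual computation of the reported Rsquared
cor(predictions, test$permeability)^2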
Try building other models discussed in this chapter. Do any have better predictive performance?
Ridge regression: penalizes the sum of the squared regression coefficients (an L2 penalty). This shrinkage helps prevent overfitting and reduces the size of the coefficients, but performs no variable selection; all predictors stay in the model.
Lasso: penalizes the sum of the absolute values of the regression coefficients (an L1 penalty). The lasso shrinks some coefficients exactly to zero, which provides a built-in variable selection effect.
Elastic net: combines the ridge and lasso approaches with a mixture of L1 and L2 penalties, and can outperform the lasso when predictors are highly correlated. The penalized objective is sketched below.
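Concretely, glmnet (used below) minimizes the penalized least-squares objective

$$\min_{\beta_0,\,\beta}\; \frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i-\beta_0-x_i^{\top}\beta\bigr)^2+\lambda\Bigl[\tfrac{1-\alpha}{2}\lVert\beta\rVert_2^2+\alpha\lVert\beta\rVert_1\Bigr],$$

where alpha = 0 gives ridge regression, alpha = 1 gives the lasso, and intermediate values give the elastic net.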
library(glmnet)
# Build numeric design matrices for glmnet (drop the intercept column)
X_train <- model.matrix(permeability ~ ., data = train)[, -1]
Y_train <- train$permeability
X_test <- model.matrix(permeability ~ ., data = test)[, -1]
Y_test <- test$permeability
# find ridge lambda
set.seed(123)
cv_ridge <- cv.glmnet(X_train, Y_train, alpha = 0)
opt_lambda_ridge <- cv_ridge$lambda.min
opt_lambda_ridge
## [1] 105.3515
# find lasso lambda
set.seed(123)
cv_lasso <- cv.glmnet(X_train, Y_train, alpha = 1)
opt_lambda_lasso <- cv_lasso$lambda.min
opt_lambda_lasso
## [1] 0.4455569
# elastic lambda
set.seed(123)
cv_elastic <- cv.glmnet(X_train, Y_train, alpha = 0.5)
opt_lambda_elastic <- cv_elastic$lambda.min
opt_lambda_elastic
## [1] 0.9335449
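As a side note, cv.glmnet also reports lambda.1se, the largest lambda whose cross-validated error is within one standard error of the minimum; it is a more conservative alternative to lambda.min (not used below):
# Optional: the one-standard-error choice of lambda
cv_elastic$lambda.1se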
# ridge regression
ridge_pred <- predict(cv_ridge, s = opt_lambda_ridge, newx = X_test)
ridge_res <- postResample(ridge_pred, Y_test)
# lasso regression
lasso_pred <- predict(cv_lasso, s = opt_lambda_lasso, newx = X_test)
lasso_res <- postResample(lasso_pred, Y_test)
# elasticNet regression
elastic_pred <- predict(cv_elastic, s = opt_lambda_elastic, newx = X_test)
elastic_res <- postResample(elastic_pred, Y_test)
print(list(ridge = ridge_res, lasso = lasso_res, elastic_net = elastic_res))
## $ridge
## RMSE Rsquared MAE
## 11.0203929 0.3266984 7.6511575
##
## $lasso
## RMSE Rsquared MAE
## 11.0642660 0.3881333 7.3758503
##
## $elastic_net
## RMSE Rsquared MAE
## 10.9494859 0.3956523 7.3143429
Would you recommend any of your models to replace the permeability laboratory experiment?
Of all the models, elastic net has the lowest RMSE, meaning its predictions have the smallest average error on the test set.
Elastic net also has the highest R², indicating that it best explains the variability in the test set.
Elastic net has the lowest MAE as well, i.e., the smallest average absolute prediction error on the test set.
Overall, elastic net provides better predictive performance on this dataset than ridge regression, the lasso, or PLS. Even so, a test-set R² of roughly 0.40 leaves most of the variability in permeability unexplained, so none of these models is accurate enough to replace the laboratory experiment outright; at best they could serve as an inexpensive screen before confirmatory testing. These results also show that, when choosing a model for a particular dataset, it is important to compare multiple models rather than relying on a single one.
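For convenience, the test-set metrics reported above can be collected into one table (a small helper snippet; at this point results still holds the PLS metrics computed earlier):
# Side-by-side test-set comparison of the four models
rbind(PLS = results, Ridge = ridge_res, Lasso = lasso_res, ElasticNet = elastic_res)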
A chemical manufacturing process for a pharmaceutical product was discussed in Sect. 1.4. In this problem, the objective is to understand the relationship between biological measurements of the raw materials (predictors), measurements of the manufacturing process (predictors), and the response of product yield. Biological predictors cannot be changed but can be used to assess the quality of the raw material before processing. On the other hand, manufacturing process predictors can be changed in the manufacturing process. Improving product yield by 1 % will boost revenue by approximately one hundred thousand dollars per batch:
Start R and use these commands to load the data:
library(AppliedPredictiveModeling)
data(ChemicalManufacturingProcess)
ChemicalManufacturingProcess <- data.frame(ChemicalManufacturingProcess)
dim(ChemicalManufacturingProcess)
## [1] 176 58
head(ChemicalManufacturingProcess)
## Yield BiologicalMaterial01 BiologicalMaterial02 BiologicalMaterial03
## 1 38.00 6.25 49.58 56.97
## 2 42.44 8.01 60.97 67.48
## 3 42.03 8.01 60.97 67.48
## 4 41.42 8.01 60.97 67.48
## 5 42.49 7.47 63.33 72.25
## 6 43.57 6.12 58.36 65.31
## BiologicalMaterial04 BiologicalMaterial05 BiologicalMaterial06
## 1 12.74 19.51 43.73
## 2 14.65 19.36 53.14
## 3 14.65 19.36 53.14
## 4 14.65 19.36 53.14
## 5 14.02 17.91 54.66
## 6 15.17 21.79 51.23
## BiologicalMaterial07 BiologicalMaterial08 BiologicalMaterial09
## 1 100 16.66 11.44
## 2 100 19.04 12.55
## 3 100 19.04 12.55
## 4 100 19.04 12.55
## 5 100 18.22 12.80
## 6 100 18.30 12.13
## BiologicalMaterial10 BiologicalMaterial11 BiologicalMaterial12
## 1 3.46 138.09 18.83
## 2 3.46 153.67 21.05
## 3 3.46 153.67 21.05
## 4 3.46 153.67 21.05
## 5 3.05 147.61 21.05
## 6 3.78 151.88 20.76
## ManufacturingProcess01 ManufacturingProcess02 ManufacturingProcess03
## 1 NA NA NA
## 2 0.0 0 NA
## 3 0.0 0 NA
## 4 0.0 0 NA
## 5 10.7 0 NA
## 6 12.0 0 NA
## ManufacturingProcess04 ManufacturingProcess05 ManufacturingProcess06
## 1 NA NA NA
## 2 917 1032.2 210.0
## 3 912 1003.6 207.1
## 4 911 1014.6 213.3
## 5 918 1027.5 205.7
## 6 924 1016.8 208.9
## ManufacturingProcess07 ManufacturingProcess08 ManufacturingProcess09
## 1 NA NA 43.00
## 2 177 178 46.57
## 3 178 178 45.07
## 4 177 177 44.92
## 5 178 178 44.96
## 6 178 178 45.32
## ManufacturingProcess10 ManufacturingProcess11 ManufacturingProcess12
## 1 NA NA NA
## 2 NA NA 0
## 3 NA NA 0
## 4 NA NA 0
## 5 NA NA 0
## 6 NA NA 0
## ManufacturingProcess13 ManufacturingProcess14 ManufacturingProcess15
## 1 35.5 4898 6108
## 2 34.0 4869 6095
## 3 34.8 4878 6087
## 4 34.8 4897 6102
## 5 34.6 4992 6233
## 6 34.0 4985 6222
## ManufacturingProcess16 ManufacturingProcess17 ManufacturingProcess18
## 1 4682 35.5 4865
## 2 4617 34.0 4867
## 3 4617 34.8 4877
## 4 4635 34.8 4872
## 5 4733 33.9 4886
## 6 4786 33.4 4862
## ManufacturingProcess19 ManufacturingProcess20 ManufacturingProcess21
## 1 6049 4665 0.0
## 2 6097 4621 0.0
## 3 6078 4621 0.0
## 4 6073 4611 0.0
## 5 6102 4659 -0.7
## 6 6115 4696 -0.6
## ManufacturingProcess22 ManufacturingProcess23 ManufacturingProcess24
## 1 NA NA NA
## 2 3 0 3
## 3 4 1 4
## 4 5 2 5
## 5 8 4 18
## 6 9 1 1
## ManufacturingProcess25 ManufacturingProcess26 ManufacturingProcess27
## 1 4873 6074 4685
## 2 4869 6107 4630
## 3 4897 6116 4637
## 4 4892 6111 4630
## 5 4930 6151 4684
## 6 4871 6128 4687
## ManufacturingProcess28 ManufacturingProcess29 ManufacturingProcess30
## 1 10.7 21.0 9.9
## 2 11.2 21.4 9.9
## 3 11.1 21.3 9.4
## 4 11.1 21.3 9.4
## 5 11.3 21.6 9.0
## 6 11.4 21.7 10.1
## ManufacturingProcess31 ManufacturingProcess32 ManufacturingProcess33
## 1 69.1 156 66
## 2 68.7 169 66
## 3 69.3 173 66
## 4 69.3 171 68
## 5 69.4 171 70
## 6 68.2 173 70
## ManufacturingProcess34 ManufacturingProcess35 ManufacturingProcess36
## 1 2.4 486 0.019
## 2 2.6 508 0.019
## 3 2.6 509 0.018
## 4 2.5 496 0.018
## 5 2.5 468 0.017
## 6 2.5 490 0.018
## ManufacturingProcess37 ManufacturingProcess38 ManufacturingProcess39
## 1 0.5 3 7.2
## 2 2.0 2 7.2
## 3 0.7 2 7.2
## 4 1.2 2 7.2
## 5 0.2 2 7.3
## 6 0.4 2 7.2
## ManufacturingProcess40 ManufacturingProcess41 ManufacturingProcess42
## 1 NA NA 11.6
## 2 0.1 0.15 11.1
## 3 0.0 0.00 12.0
## 4 0.0 0.00 10.6
## 5 0.0 0.00 11.0
## 6 0.0 0.00 11.5
## ManufacturingProcess43 ManufacturingProcess44 ManufacturingProcess45
## 1 3.0 1.8 2.4
## 2 0.9 1.9 2.2
## 3 1.0 1.8 2.3
## 4 1.1 1.8 2.1
## 5 1.1 1.7 2.1
## 6 2.2 1.8 2.0
The matrix processPredictors contains the 57 predictors (12 describing the input biological material and 45 describing the process predictors) for the 176 manufacturing runs. yield contains the percent yield for each run.
A small percentage of cells in the predictor set contain missing values. Use an imputation function to fill in these missing values (e.g., see Sect. 3.8).
The preProcess function from the caret package makes it easy to normalize and standardize numerical data; the transformation parameters estimated from one dataset (e.g., minima, maxima, and means) can then be applied to other datasets.
Its method argument accepts a variety of options, including several imputation methods.
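For example, besides the bagged-tree imputation used below, preProcess also supports k-nearest-neighbor and median imputation (a sketch of the alternatives, not run here):
# Alternative imputation methods; note that knnImpute also centers and scales
pp_knn <- preProcess(ChemicalManufacturingProcess, method = "knnImpute")
pp_median <- preProcess(ChemicalManufacturingProcess, method = "medianImpute")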
A total of 106 missing values are found; after imputation, all of them are filled in.
sum(is.na(ChemicalManufacturingProcess))
## [1] 106
# Impute missing values with bagged trees: preProcess with method = "bagImpute"
preProc_bagImpute <- preProcess(ChemicalManufacturingProcess, method = "bagImpute")
ChemicalManufacturing_imputed <- predict(preProc_bagImpute, ChemicalManufacturingProcess)
# head(ChemicalManufacturing_imputed)
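A quick sanity check confirms that the imputation filled everything in:
# Expect zero remaining missing values after imputation
sum(is.na(ChemicalManufacturing_imputed))  # should return 0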
Split the data into a training and a test set, pre-process the data, and tune a model of your choice from this chapter. What is the optimal value of the performance metric?
set.seed(123)
idx <- createDataPartition(ChemicalManufacturing_imputed$Yield, p = 0.8, list = FALSE)
trainData <- ChemicalManufacturing_imputed[idx, ]
testData <- ChemicalManufacturing_imputed[-idx, ]
# Pre-processing: center and scale (note that this also standardizes the response, Yield)
preProcValues <- preProcess(trainData, method = c("center", "scale"))
trainTransformed <- predict(preProcValues, trainData)
testTransformed <- predict(preProcValues, testData)
train_Control <- trainControl(method = "repeatedcv",
number = 10,
repeats = 5,
search = "random",
verboseIter = FALSE)
# Tune an elastic net model; the explicit tuneGrid below overrides the
# search = "random" setting above
set.seed(123)
elasticNetModel <- train(Yield ~ ., data = trainTransformed, method = "glmnet",
                         trControl = train_Control,
                         preProcess = c("center", "scale"),
                         tuneGrid = expand.grid(alpha = seq(0.1, 0.9, length = 5),
                                                lambda = seq(0.001, 0.1, length = 10)))
print(elasticNetModel)
## glmnet
##
## 144 samples
## 57 predictor
##
## Pre-processing: centered (57), scaled (57)
## Resampling: Cross-Validated (10 fold, repeated 5 times)
## Summary of sample sizes: 128, 129, 129, 130, 128, 131, ...
## Resampling results across tuning parameters:
##
## alpha lambda RMSE Rsquared MAE
## 0.1 0.001 1.4728939 0.4422797 0.8249679
## 0.1 0.012 1.2474370 0.4859755 0.7403540
## 0.1 0.023 1.0543531 0.5155199 0.6700336
## 0.1 0.034 0.9472468 0.5382407 0.6259226
## 0.1 0.045 0.8732550 0.5568439 0.5942464
## 0.1 0.056 0.8251770 0.5690333 0.5740835
## 0.1 0.067 0.7897225 0.5794584 0.5594033
## 0.1 0.078 0.7662538 0.5878980 0.5492984
## 0.1 0.089 0.7490635 0.5934242 0.5416789
## 0.1 0.100 0.7346713 0.5975277 0.5359323
## 0.3 0.001 1.4028493 0.4532756 0.7936012
## 0.3 0.012 1.0140657 0.5183948 0.6537358
## 0.3 0.023 0.7713894 0.5740693 0.5619155
## 0.3 0.034 0.7151728 0.5969650 0.5365402
## 0.3 0.045 0.6778878 0.6174814 0.5180478
## 0.3 0.056 0.6524814 0.6293953 0.5077112
## 0.3 0.067 0.6412216 0.6337643 0.5038699
## 0.3 0.078 0.6352123 0.6351584 0.5021600
## 0.3 0.089 0.6280494 0.6371890 0.5011059
## 0.3 0.100 0.6221133 0.6391117 0.5007075
## 0.5 0.001 1.3460463 0.4617316 0.7774093
## 0.5 0.012 0.8404671 0.5527583 0.5880960
## 0.5 0.023 0.7028604 0.5971653 0.5334852
## 0.5 0.034 0.6518359 0.6263128 0.5092681
## 0.5 0.045 0.6382732 0.6333190 0.5034377
## 0.5 0.056 0.6237467 0.6374631 0.4989981
## 0.5 0.067 0.6170054 0.6380079 0.4993364
## 0.5 0.078 0.6142154 0.6392484 0.5004876
## 0.5 0.089 0.6213648 0.6333980 0.5056347
## 0.5 0.100 0.6297891 0.6293997 0.5105756
## 0.7 0.001 1.2898458 0.4666477 0.7569651
## 0.7 0.012 0.7491001 0.5748260 0.5540368
## 0.7 0.023 0.6647825 0.6172889 0.5148625
## 0.7 0.034 0.6342972 0.6344650 0.5025408
## 0.7 0.045 0.6204255 0.6359786 0.4991112
## 0.7 0.056 0.6112393 0.6415000 0.4971846
## 0.7 0.067 0.6181303 0.6352467 0.5038792
## 0.7 0.078 0.6300169 0.6289287 0.5108579
## 0.7 0.089 0.6406006 0.6247339 0.5158828
## 0.7 0.100 0.6469739 0.6210692 0.5198486
## 0.9 0.001 1.2419506 0.4703187 0.7399417
## 0.9 0.012 0.7133570 0.5902641 0.5379205
## 0.9 0.023 0.6461513 0.6290761 0.5063579
## 0.9 0.034 0.6227863 0.6361440 0.4990327
## 0.9 0.045 0.6080100 0.6434100 0.4958768
## 0.9 0.056 0.6167691 0.6356380 0.5039422
## 0.9 0.067 0.6323943 0.6290842 0.5119506
## 0.9 0.078 0.6384946 0.6256581 0.5158222
## 0.9 0.089 0.6468450 0.6198389 0.5220562
## 0.9 0.100 0.6566066 0.6128519 0.5291462
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were alpha = 0.9 and lambda = 0.045.
plot(elasticNetModel)
For the elastic net model:
The lowest mean RMSE occurs at alpha = 0.9 and lambda = 0.045, with an RMSE of 0.6080100.
The same setting also gives the highest mean R², 0.6434100.
Predict the response for the test set. What is the value of the performance metric and how does this compare with the resampled performance metric on the training set?
The test set estimate of R² is 0.5715289, somewhat lower than the resampled training estimate of 0.6434.
predictions <- predict(elasticNetModel, newdata = testTransformed)
# Caution: predictions are on the standardized Yield scale, while testData$Yield
# is on the original scale, so the RMSE and MAE below are not meaningful;
# Rsquared is scale-invariant and unaffected
results <- postResample(pred = predictions, obs = testData$Yield)
print(results)
## RMSE Rsquared MAE
## 40.0566969 0.5715289 40.0345323
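To put RMSE and MAE on a meaningful scale, compare against the transformed test response instead (a corrected sketch; its output is not part of the original):
# Evaluate on the same standardized scale the model was trained on
postResample(pred = predictions, obs = testTransformed$Yield)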
Which predictors are most important in the model you have trained? Do either the biological or process predictors dominate the list?
Manufacturing process predictors dominate the list (16 of the top 20 variables), indicating that, in this model, variables related to the manufacturing process play a more important role in predicting the response than the biological material variables.
importance <- varImp(elasticNetModel, scale = FALSE)
importance
## glmnet variable importance
##
## only 20 most important variables shown (out of 57)
##
## Overall
## ManufacturingProcess32 0.410973
## ManufacturingProcess09 0.242202
## ManufacturingProcess17 0.197045
## BiologicalMaterial06 0.117321
## ManufacturingProcess06 0.103519
## ManufacturingProcess37 0.099345
## BiologicalMaterial05 0.091001
## ManufacturingProcess45 0.086781
## ManufacturingProcess04 0.065001
## ManufacturingProcess34 0.054550
## ManufacturingProcess43 0.040188
## ManufacturingProcess13 0.035020
## ManufacturingProcess15 0.034661
## ManufacturingProcess36 0.031018
## ManufacturingProcess44 0.030957
## ManufacturingProcess39 0.019626
## ManufacturingProcess07 0.012088
## BiologicalMaterial09 0.011601
## ManufacturingProcess01 0.008796
## BiologicalMaterial07 0.002457
library(dplyr)
library(ggplot2)
top15 <- importance$importance %>%
as.data.frame() %>%
mutate(Variable = row.names(.)) %>%
slice_max(order_by = Overall, n = 15) %>%
mutate(Scaled = (Overall - min(Overall)) / (max(Overall) - min(Overall)) * 100) %>%
arrange(desc(Scaled))
ggplot(top15, aes(x = reorder(Variable, Scaled), y = Scaled)) +
geom_bar(stat = "identity") +
coord_flip() +
labs(x = "Variable", y = "Scaled Importance", title = "Top 15 Variable Importance") +
theme_minimal()
Explore the relationships between each of the top predictors and the response. How could this information be helpful in improving yield in future runs of the manufacturing process?
Identify key factors: Correlation analysis helps identify the factors with the greatest impact on yield. Variables with a strong positive correlation are good candidates for optimization, since increasing them may improve yield; reducing or controlling strongly negatively correlated variables can prevent yield loss.
Material selection: Biological materials that are highly correlated with yield may be critical to the quality of the final product. This knowledge can guide material selection or changes to material processing steps.
Resource allocation: Knowing which factors matter most allows resources such as time, budget, and personnel to be directed to the aspects of the process that will best increase yield.
selected_variables <- c("ManufacturingProcess32", "ManufacturingProcess09", "ManufacturingProcess17",
"BiologicalMaterial06", "ManufacturingProcess06", "ManufacturingProcess37",
"BiologicalMaterial05", "ManufacturingProcess45", "ManufacturingProcess04",
"ManufacturingProcess34", "ManufacturingProcess43", "ManufacturingProcess13",
"ManufacturingProcess15", "ManufacturingProcess36", "ManufacturingProcess44",
"ManufacturingProcess39", "ManufacturingProcess07", "BiologicalMaterial09",
"ManufacturingProcess01", "BiologicalMaterial07")
library(corrplot)
selected_data <- trainTransformed %>%
  select(all_of(selected_variables))
# Pairwise correlations among the top 20 predictors
cor_matrix <- cor(selected_data, use = "complete.obs")
corrplot(cor_matrix, type = "upper",
tl.col = "indianred", tl.srt = 90, method = "color", diag = FALSE)
knitr::kable(cor_matrix, digits = 2)
| | ManufacturingProcess32 | ManufacturingProcess09 | ManufacturingProcess17 | BiologicalMaterial06 | ManufacturingProcess06 | ManufacturingProcess37 | BiologicalMaterial05 | ManufacturingProcess45 | ManufacturingProcess04 | ManufacturingProcess34 | ManufacturingProcess43 | ManufacturingProcess13 | ManufacturingProcess15 | ManufacturingProcess36 | ManufacturingProcess44 | ManufacturingProcess39 | ManufacturingProcess07 | BiologicalMaterial09 | ManufacturingProcess01 | BiologicalMaterial07 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ManufacturingProcess32 | 1.00 | 0.05 | 0.04 | 0.64 | 0.22 | -0.20 | 0.17 | 0.04 | -0.48 | 0.01 | 0.15 | -0.06 | 0.38 | -0.80 | 0.07 | 0.02 | 0.03 | 0.06 | -0.40 | -0.14 |
| ManufacturingProcess09 | 0.05 | 1.00 | -0.68 | 0.28 | 0.41 | 0.15 | -0.01 | -0.19 | -0.14 | 0.05 | 0.13 | -0.79 | -0.01 | -0.11 | -0.11 | -0.18 | -0.04 | 0.11 | 0.05 | 0.11 |
| ManufacturingProcess17 | 0.04 | -0.68 | 1.00 | 0.01 | -0.20 | -0.01 | 0.05 | 0.19 | 0.01 | -0.07 | 0.00 | 0.76 | 0.02 | 0.02 | 0.16 | 0.19 | -0.06 | -0.07 | -0.24 | 0.06 |
| BiologicalMaterial06 | 0.64 | 0.28 | 0.01 | 1.00 | 0.31 | 0.02 | 0.38 | -0.07 | -0.48 | -0.11 | 0.10 | -0.13 | 0.42 | -0.58 | -0.08 | -0.14 | -0.04 | 0.26 | -0.23 | -0.03 |
| ManufacturingProcess06 | 0.22 | 0.41 | -0.20 | 0.31 | 1.00 | 0.09 | 0.00 | -0.02 | -0.06 | -0.04 | 0.01 | -0.38 | 0.12 | -0.28 | 0.07 | 0.00 | -0.02 | 0.09 | -0.14 | -0.07 |
| ManufacturingProcess37 | -0.20 | 0.15 | -0.01 | 0.02 | 0.09 | 1.00 | 0.08 | -0.16 | 0.08 | -0.12 | -0.02 | -0.01 | 0.04 | 0.11 | -0.21 | -0.23 | -0.49 | -0.07 | -0.01 | -0.02 |
| BiologicalMaterial05 | 0.17 | -0.01 | 0.05 | 0.38 | 0.00 | 0.08 | 1.00 | -0.17 | -0.25 | 0.01 | -0.01 | -0.01 | 0.35 | -0.15 | -0.25 | -0.09 | -0.05 | -0.33 | 0.00 | -0.11 |
| ManufacturingProcess45 | 0.04 | -0.19 | 0.19 | -0.07 | -0.02 | -0.16 | -0.17 | 1.00 | -0.01 | 0.04 | 0.08 | 0.15 | -0.07 | 0.00 | 0.88 | 0.81 | -0.03 | 0.16 | -0.19 | 0.00 |
| ManufacturingProcess04 | -0.48 | -0.14 | 0.01 | -0.48 | -0.06 | 0.08 | -0.25 | -0.01 | 1.00 | -0.15 | -0.32 | 0.13 | -0.22 | 0.36 | -0.01 | -0.05 | -0.02 | 0.07 | 0.58 | -0.16 |
| ManufacturingProcess34 | 0.01 | 0.05 | -0.07 | -0.11 | -0.04 | -0.12 | 0.01 | 0.04 | -0.15 | 1.00 | 0.02 | -0.01 | -0.11 | 0.03 | 0.03 | 0.07 | 0.10 | -0.04 | -0.22 | 0.02 |
| ManufacturingProcess43 | 0.15 | 0.13 | 0.00 | 0.10 | 0.01 | -0.02 | -0.01 | 0.08 | -0.32 | 0.02 | 1.00 | -0.12 | -0.03 | -0.08 | 0.06 | 0.08 | -0.07 | 0.11 | -0.07 | -0.02 |
| ManufacturingProcess13 | -0.06 | -0.79 | 0.76 | -0.13 | -0.38 | -0.01 | -0.01 | 0.15 | 0.13 | -0.01 | -0.12 | 1.00 | 0.29 | 0.11 | 0.12 | 0.15 | -0.06 | -0.02 | -0.13 | -0.07 |
| ManufacturingProcess15 | 0.38 | -0.01 | 0.02 | 0.42 | 0.12 | 0.04 | 0.35 | -0.07 | -0.22 | -0.11 | -0.03 | 0.29 | 1.00 | -0.31 | -0.03 | 0.02 | -0.07 | -0.07 | -0.05 | -0.19 |
| ManufacturingProcess36 | -0.80 | -0.11 | 0.02 | -0.58 | -0.28 | 0.11 | -0.15 | 0.00 | 0.36 | 0.03 | -0.08 | 0.11 | -0.31 | 1.00 | -0.04 | 0.04 | 0.16 | -0.11 | 0.28 | 0.12 |
| ManufacturingProcess44 | 0.07 | -0.11 | 0.16 | -0.08 | 0.07 | -0.21 | -0.25 | 0.88 | -0.01 | 0.03 | 0.06 | 0.12 | -0.03 | -0.04 | 1.00 | 0.85 | 0.00 | 0.18 | -0.20 | 0.01 |
| ManufacturingProcess39 | 0.02 | -0.18 | 0.19 | -0.14 | 0.00 | -0.23 | -0.09 | 0.81 | -0.05 | 0.07 | 0.08 | 0.15 | 0.02 | 0.04 | 0.85 | 1.00 | 0.03 | 0.01 | -0.18 | 0.02 |
| ManufacturingProcess07 | 0.03 | -0.04 | -0.06 | -0.04 | -0.02 | -0.49 | -0.05 | -0.03 | -0.02 | 0.10 | -0.07 | -0.06 | -0.07 | 0.16 | 0.00 | 0.03 | 1.00 | -0.02 | 0.04 | -0.04 |
| BiologicalMaterial09 | 0.06 | 0.11 | -0.07 | 0.26 | 0.09 | -0.07 | -0.33 | 0.16 | 0.07 | -0.04 | 0.11 | -0.02 | -0.07 | -0.11 | 0.18 | 0.01 | -0.02 | 1.00 | 0.04 | 0.10 |
| ManufacturingProcess01 | -0.40 | 0.05 | -0.24 | -0.23 | -0.14 | -0.01 | 0.00 | -0.19 | 0.58 | -0.22 | -0.07 | -0.13 | -0.05 | 0.28 | -0.20 | -0.18 | 0.04 | 0.04 | 1.00 | -0.15 |
| BiologicalMaterial07 | -0.14 | 0.11 | 0.06 | -0.03 | -0.07 | -0.02 | -0.11 | 0.00 | -0.16 | 0.02 | -0.02 | -0.07 | -0.19 | 0.12 | 0.01 | 0.02 | -0.04 | 0.10 | -0.15 | 1.00 |
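The matrix above captures inter-predictor relationships; to relate each top predictor directly to the response, the same variables can be correlated with Yield (a sketch; these values are not part of the original output):
# Correlation of each top predictor with the (standardized) response
cor(trainTransformed[, selected_variables], trainTransformed$Yield)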