6.2. Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug:

  (a) Start R and use these commands to load the data:

library(AppliedPredictiveModeling)
data(permeability)

The matrix fingerprints contains the 1,107 binary molecular predictors for the 165 compounds, while permeability contains the permeability response.

  (b) The fingerprint predictors indicate the presence or absence of substructures of a molecule and are often sparse, meaning that relatively few of the molecules contain each substructure. Filter out the predictors that have low frequencies using the nearZeroVar function from the caret package. How many predictors are left for modeling?
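library(caret)   # nearZeroVar(), train(), createDataPartition()
library(dplyr)   # %>%, mutate(), filter(), select()

fingerprints %>%
  nearZeroVar() %>%
  length()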
## [1] 719

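nearZeroVar() returns the indices of the predictors it flags as near-zero variance, so 719 is the number of predictors to remove. That leaves 1,107 - 719 = 388 predictors for modeling.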
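
To make the meaning of the returned indices concrete, here is a minimal base R sketch of a near-zero-variance rule on toy data. The helper `flag_sparse()` is hypothetical and only approximates the idea behind `nearZeroVar()` (a dominant-value frequency ratio combined with a low share of unique values); the cutoffs 95/5 and 10 are assumed here to mirror caret's defaults.

```r
# Hypothetical helper: a simplified near-zero-variance rule (not caret's code).
flag_sparse <- function(x, freq_cut = 95 / 5, unique_cut = 10) {
  flagged <- vapply(seq_len(ncol(x)), function(j) {
    counts <- sort(table(x[, j]), decreasing = TRUE)
    if (length(counts) < 2) return(TRUE)          # constant column
    freq_ratio <- counts[[1]] / counts[[2]]       # most common vs. second most common
    pct_unique <- 100 * length(counts) / nrow(x)  # share of distinct values
    freq_ratio > freq_cut && pct_unique <= unique_cut
  }, logical(1))
  which(flagged)                                  # positions of columns to DROP
}

toy <- cbind(
  balanced = rep(c(0, 1), 50),    # 50 zeros, 50 ones
  rare     = c(1, rep(0, 99)),    # a single 1 in 100 rows
  constant = rep(0, 100)          # no variation at all
)

drop_idx <- flag_sparse(toy)
kept     <- toy[, -drop_idx, drop = FALSE]        # negate the index to filter

length(drop_idx)                # number of columns removed
ncol(toy) - length(drop_idx)    # number of columns left for modeling
```

The count of flagged columns and the count of remaining columns are different quantities; the exercise asks for the second one.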

  (c) Split the data into a training and a test set, pre-process the data, and tune a PLS model. How many latent variables are optimal and what is the corresponding resampled estimate of R2?

Here I will use an 80/20 training/test split and tune a PLS model.

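# Caution: nearZeroVar() returns the flagged (sparse) columns, so this keeps them;
# negate the index to drop them instead. Results below use the line as written.
df <- as.data.frame(fingerprints[, nearZeroVar(fingerprints)]) %>%
  mutate(y = permeability)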

set.seed(123)

train_data <- createDataPartition(df$y, times = 1, p = 0.8, list = FALSE)
train_x <- df[train_data, ]
test_x <- df[-train_data, ]

pls_model <- train(
  y ~ ., data = train_x, method = "pls",
  center = TRUE,
  trControl = trainControl("cv", number = 10),
  tuneLength = 25
)

# Plot model RMSE vs different values of components
title <- paste("RMSE Minimized at",
               pls_model$bestTune$ncomp,
               "Components")
plot(pls_model, main = title)

pls_model$results %>%
  filter(ncomp == pls_model$bestTune$ncomp) %>%
  select(ncomp, RMSE, Rsquared) %>%
  kable() %>%
  kable_styling()
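ncomp     RMSE  Rsquared
    3 14.56647 0.2808471

The optimal model uses 3 latent variables, with a resampled (10-fold cross-validated) RMSE of about 14.57 and R2 of about 0.28.
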
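
The "resampled estimate" above comes from 10-fold cross-validation. As a rough illustration of what that means, the following base R sketch runs 10-fold cross-validation by hand on synthetic data with a plain linear model. The data, the model, and all variable names are assumptions made for the example; it is not how caret implements `trainControl("cv", number = 10)` internally.

```r
set.seed(123)
n <- 100
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 3 * toy$x1 - 2 * toy$x2 + rnorm(n)

k <- 10
fold_id <- sample(rep(seq_len(k), length.out = n))  # random fold label per row

# Fit on k - 1 folds, evaluate on the held-out fold, repeat for every fold
cv_metrics <- t(vapply(seq_len(k), function(f) {
  fit  <- lm(y ~ x1 + x2, data = toy[fold_id != f, ])
  held <- toy[fold_id == f, ]
  pred <- predict(fit, newdata = held)
  c(RMSE = sqrt(mean((pred - held$y)^2)),
    Rsquared = cor(pred, held$y)^2)
}, c(RMSE = 0, Rsquared = 0)))

colMeans(cv_metrics)   # the resampled estimates: averages over the 10 folds
```

Each row is predicted exactly once by a model that never saw it, which is why the averaged metrics are a fairer guide than the fit on the full training set.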
  (d) Predict the response for the test set. What is the test set estimate of R2?
# Make predictions
pls_predictions <- predict(pls_model, test_x)
# Model performance metrics
results <- data.frame(Model = "PLS",
                      RMSE = RMSE(pls_predictions, test_x$y),
                      Rsquared = R2(pls_predictions, test_x$y))
results
##              Model     RMSE  Rsquared
## permeability   PLS 11.52993 0.1796387
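
The two metrics reported here measure different things, which is worth keeping in mind when comparing models. The sketch below defines both in base R, assuming (as I understand caret's default) that R2 is the squared correlation between predictions and observations. The function names `rmse()` and `r2_corr()` and the toy vectors are made up for illustration.

```r
rmse    <- function(pred, obs) sqrt(mean((pred - obs)^2))
r2_corr <- function(pred, obs) cor(pred, obs)^2   # squared Pearson correlation

obs  <- c(1, 2, 3, 4)
pred <- c(1.5, 2.5, 2.5, 4.5)

rmse(pred, obs)
r2_corr(pred, obs)

# A constant bias inflates RMSE but leaves the correlation-based R2 unchanged
biased <- pred + 10
rmse(biased, obs)
r2_corr(biased, obs)
```

Because the correlation-based R2 ignores bias and scale, a model can rank differently on RMSE than on R2, as happens with some of the models compared later.
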
  (e) Try building other models discussed in this chapter. Do any have better predictive performance?

PCR Model

pcr_model <- train(
  y ~ ., data = train_x, method = "pcr",
  center = TRUE,
  trControl = trainControl("cv", number = 10),
  tuneLength = 25
)

title <- paste("RMSE Minimized at",
               pcr_model$bestTune$ncomp,
               "Components")
plot(pcr_model, main = title)

# Make predictions
pcr_predictions <- predict(pcr_model, test_x)
# Model performance metrics
pcr_results <- data.frame(Model = "PCR",
                          RMSE = RMSE(pcr_predictions, test_x$y),
                          Rsquared = R2(pcr_predictions, test_x$y))
pcr_results 
##              Model     RMSE   Rsquared
## permeability   PCR 12.98751 0.02725374
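
For intuition about what PCR does, here is a small base R sketch on synthetic data: compute principal components of the predictors without looking at the response, then regress the response on the first k component scores. The data and the helper `pcr_fit()` are assumptions for illustration only, not the internals of `method = "pcr"`.

```r
set.seed(1)
n <- 60
p <- 5
x <- matrix(rnorm(n * p), n, p)
y <- drop(x %*% c(2, -1, 0, 0, 0.5)) + rnorm(n, sd = 0.1)

# Step 1: unsupervised PCA of the predictors (the response is not used)
pca <- prcomp(x, center = TRUE, scale. = FALSE)

# Step 2: ordinary least squares on the first k component scores
pcr_fit <- function(k) {
  scores <- pca$x[, seq_len(k), drop = FALSE]
  lm(y ~ scores)
}

fit_k2   <- pcr_fit(2)   # PCR with 2 components
fit_full <- pcr_fit(p)   # all components: same fit as OLS on x
ols      <- lm(y ~ x)

c(k2 = deviance(fit_k2), full = deviance(fit_full), ols = deviance(ols))
```

Since the components are chosen to explain variance in the predictors rather than in the response, the leading components need not be the predictive ones. PLS, by contrast, uses the response when building its components.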

Ridge Regression

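library(glmnet)

# glmnet needs numeric matrices; model.matrix() also adds an intercept column
x <- model.matrix(y ~ ., data = train_x)
x_test <- model.matrix(y ~ ., data = test_x)
# Choose lambda by cross-validation, then refit at lambda.min (alpha = 0 is ridge)
rr_cv <- cv.glmnet(x, train_x$y, alpha = 0)
rr_model <- glmnet(x, train_x$y, alpha = 0, lambda = rr_cv$lambda.min)
rr_predictions <- as.vector(predict(rr_model, x_test))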
rr_results <- data.frame(Model = "Ridge Regression",
                         RMSE = RMSE(rr_predictions, test_x$y),
                         Rsquared = R2(rr_predictions, test_x$y))
rr_results 
##                         Model     RMSE  Rsquared
## permeability Ridge Regression 11.84515 0.1322738
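
As a reminder of what the ridge penalty does to the coefficients, the following base R sketch uses the textbook closed form, beta = (X'X + lambda I)^-1 X'y, on centered synthetic data. The helper `ridge_coef()` is hypothetical, and its lambda is not on the same scale as glmnet's lambda (glmnet standardizes the predictors and scales the loss differently), so the numbers are not comparable to the fit above.

```r
# Hypothetical helper: textbook ridge solution on centered data
ridge_coef <- function(x, y, lambda) {
  xc <- scale(x, center = TRUE, scale = FALSE)
  yc <- y - mean(y)
  drop(solve(crossprod(xc) + lambda * diag(ncol(xc)), crossprod(xc, yc)))
}

set.seed(7)
x <- matrix(rnorm(200), 50, 4)
y <- drop(x %*% c(1, -2, 0.5, 0)) + rnorm(50, sd = 0.3)

b0   <- ridge_coef(x, y, 0)     # no penalty: ordinary least squares slopes
b10  <- ridge_coef(x, y, 10)
b100 <- ridge_coef(x, y, 100)

rbind(b0, b10, b100)            # coefficients shrink toward zero as lambda grows
```

Ridge shrinks all coefficients but does not set any of them exactly to zero, so every predictor stays in the model.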

Lasso Regression

lr_cv <- cv.glmnet(x, train_x$y, alpha = 1)
lr_model <- glmnet(x, train_x$y, alpha = 1, lambda = lr_cv$lambda.min)
lr_predictions <- as.vector(predict(lr_model, x_test))
lr_results <- data.frame(Model = "Lasso Regression",
                         RMSE = RMSE(lr_predictions, test_x$y),
                         Rsquared = R2(lr_predictions, test_x$y))
lr_results 
##                         Model     RMSE  Rsquared
## permeability Lasso Regression 9.179522 0.5487934
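
One reason the lasso can behave very differently from ridge on sparse fingerprint data is that its penalty sets coefficients exactly to zero. A minimal sketch of the soft-thresholding operator that underlies coordinate-wise lasso updates is shown below; the function name `soft_threshold()` and the inputs are illustrative assumptions, and this is not glmnet's actual code.

```r
# Soft-thresholding: shrink toward zero by gamma, and snap small values to zero
soft_threshold <- function(z, gamma) sign(z) * pmax(abs(z) - gamma, 0)

z <- c(-3, -0.5, 0.2, 2)
soft_threshold(z, gamma = 1)
```

Values whose magnitude is below the threshold become exactly zero, which is how the lasso performs variable selection while it fits.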

Elastic Net Regression

en_model <- train(
  y ~ ., data = train_x, method = "glmnet",
  trControl = trainControl("cv", number = 10),
  tuneLength = 10
)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
# Best tuning parameters
en_model$bestTune
##   alpha   lambda
## 8   0.1 4.269261
en_predictions <- predict(en_model, test_x)
# Model performance metrics
en_results <- data.frame(Model = "Elastic Net Regression",
                         RMSE = RMSE(en_predictions, test_x$y),
                         Rsquared = R2(en_predictions, test_x$y))
en_results
##                               Model     RMSE  Rsquared
## permeability Elastic Net Regression 10.12057 0.3955913
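
To interpret the tuned alpha, it helps to write out the elastic net penalty in the form documented for glmnet: lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta))). The sketch below just evaluates that expression; `enet_penalty()` and the example coefficients are made up for illustration.

```r
# Elastic net penalty in the glmnet parameterization
enet_penalty <- function(beta, lambda, alpha) {
  lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
}

beta <- c(1, -2)
enet_penalty(beta, lambda = 2, alpha = 1)    # pure lasso (L1) penalty
enet_penalty(beta, lambda = 2, alpha = 0)    # pure ridge (L2) penalty
enet_penalty(beta, lambda = 2, alpha = 0.1)  # mostly ridge, a little lasso
```

An alpha of 0.1, as selected above, therefore sits close to the ridge end of the range while still allowing some coefficients to be zeroed out.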

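Summary

The table below ranks the models by R2. The PLS row is taken from pls_model$results, so it reports the cross-validated estimate on the training set, whereas the other rows are test-set estimates. The PLS test-set values computed earlier (RMSE 11.53, R2 0.18) would leave the ranking unchanged.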

pls_model$results %>%
  filter(ncomp == pls_model$bestTune$ncomp) %>%
  mutate("Model" = "PLS") %>%
  select(Model, RMSE, Rsquared) %>%
  as.data.frame() %>%
  bind_rows(pcr_results) %>%
  bind_rows(rr_results) %>%
  bind_rows(lr_results) %>%
  bind_rows(en_results) %>%
  arrange(desc(Rsquared))
##                                   Model      RMSE   Rsquared
## permeability...1       Lasso Regression  9.179522 0.54879339
## permeability...2 Elastic Net Regression 10.120573 0.39559125
## ...3                                PLS 14.566470 0.28084706
## permeability...4       Ridge Regression 11.845152 0.13227384
## permeability...5                    PCR 12.987506 0.02725374
  (f) Would you recommend any of your models to replace the permeability laboratory experiment?

No. The best model (lasso) reaches a test-set R2 of only about 0.55, and the others are considerably lower, so none of them explains enough of the variation in permeability to replace the laboratory experiment.

6.3. A chemical manufacturing process for a pharmaceutical product was discussed in Sect. 1.4. In this problem, the objective is to understand the relationship between biological measurements of the raw materials (predictors), measurements of the manufacturing process (predictors), and the response of product yield.

Biological predictors cannot be changed but can be used to assess the quality of the raw material before processing. On the other hand, manufacturing process predictors can be changed in the manufacturing process. Improving product yield by 1% will boost revenue by approximately one hundred thousand dollars per batch:

  (a) Start R and use these commands to load the data:
library(AppliedPredictiveModeling)
data(ChemicalManufacturingProcess)

The matrix processPredictors contains the 57 predictors (12 describing the input biological material and 45 describing the process predictors) for the 176 manufacturing runs. yield contains the percent yield for each run.

  (b) A small percentage of cells in the predictor set contain missing values. Use an imputation function to fill in these missing values (e.g., see Sect. 3.8).

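I will use k-nearest-neighbor (KNN) imputation. Note that caret's knnImpute method also centers and scales every column, including Yield, so the values below are in standardized units.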

set.seed(123)

knn_model <- preProcess(ChemicalManufacturingProcess, "knnImpute")
df <- predict(knn_model, ChemicalManufacturingProcess)
head(df)
##        Yield BiologicalMaterial01 BiologicalMaterial02 BiologicalMaterial03
## 1 -1.1792673           -0.2261036           -1.5140979          -2.68303622
## 2  1.2263678            2.2391498            1.3089960          -0.05623504
## 3  1.0042258            2.2391498            1.3089960          -0.05623504
## 4  0.6737219            2.2391498            1.3089960          -0.05623504
## 5  1.2534583            1.4827653            1.8939391           1.13594780
## 6  1.8386128           -0.4081962            0.6620886          -0.59859075
##   BiologicalMaterial04 BiologicalMaterial05 BiologicalMaterial06
## 1            0.2201765            0.4941942           -1.3828880
## 2            1.2964386            0.4128555            1.1290767
## 3            1.2964386            0.4128555            1.1290767
## 4            1.2964386            0.4128555            1.1290767
## 5            0.9414412           -0.3734185            1.5348350
## 6            1.5894524            1.7305423            0.6192092
##   BiologicalMaterial07 BiologicalMaterial08 BiologicalMaterial09
## 1           -0.1313107            -1.233131           -3.3962895
## 2           -0.1313107             2.282619           -0.7227225
## 3           -0.1313107             2.282619           -0.7227225
## 4           -0.1313107             2.282619           -0.7227225
## 5           -0.1313107             1.071310           -0.1205678
## 6           -0.1313107             1.189487           -1.7343424
##   BiologicalMaterial10 BiologicalMaterial11 BiologicalMaterial12
## 1            1.1005296            -1.838655           -1.7709224
## 2            1.1005296             1.393395            1.0989855
## 3            1.1005296             1.393395            1.0989855
## 4            1.1005296             1.393395            1.0989855
## 5            0.4162193             0.136256            1.0989855
## 6            1.6346255             1.022062            0.7240877
##   ManufacturingProcess01 ManufacturingProcess02 ManufacturingProcess03
## 1              0.2154105              0.5662872              0.3765810
## 2             -6.1497028             -1.9692525              0.1979962
## 3             -6.1497028             -1.9692525              0.1087038
## 4             -6.1497028             -1.9692525              0.4658734
## 5             -0.2784345             -1.9692525              0.1087038
## 6              0.4348971             -1.9692525              0.5551658
##   ManufacturingProcess04 ManufacturingProcess05 ManufacturingProcess06
## 1              0.5655598            -0.44593467             -0.5414997
## 2             -2.3669726             0.99933318              0.9625383
## 3             -3.1638563             0.06246417             -0.1117745
## 4             -3.3232331             0.42279841              2.1850322
## 5             -2.2075958             0.84537219             -0.6304083
## 6             -1.2513352             0.49486525              0.5550403
##   ManufacturingProcess07 ManufacturingProcess08 ManufacturingProcess09
## 1             -0.1596700             -0.3095182             -1.7201524
## 2             -0.9580199              0.8941637              0.5883746
## 3              1.0378549              0.8941637             -0.3815947
## 4             -0.9580199             -1.1119728             -0.4785917
## 5              1.0378549              0.8941637             -0.4527258
## 6              1.0378549              0.8941637             -0.2199332
##   ManufacturingProcess10 ManufacturingProcess11 ManufacturingProcess12
## 1            -0.07700901            -0.09157342             -0.4806937
## 2             0.52297397             1.08204765             -0.4806937
## 3             0.31428424             0.55112383             -0.4806937
## 4            -0.02483658             0.80261406             -0.4806937
## 5            -0.39004361             0.10403009             -0.4806937
## 6             0.28819802             1.41736795             -0.4806937
##   ManufacturingProcess13 ManufacturingProcess14 ManufacturingProcess15
## 1             0.97711512              0.8093999              1.1846438
## 2            -0.50030980              0.2775205              0.9617071
## 3             0.28765016              0.4425865              0.8245152
## 4             0.28765016              0.7910592              1.0817499
## 5             0.09066017              2.5334227              3.3282665
## 6            -0.50030980              2.4050380              3.1396277
##   ManufacturingProcess16 ManufacturingProcess17 ManufacturingProcess18
## 1              0.3303945              0.9263296              0.1505348
## 2              0.1455765             -0.2753953              0.1559773
## 3              0.1455765              0.3655246              0.1831898
## 4              0.1967569              0.3655246              0.1695836
## 5              0.4754056             -0.3555103              0.2076811
## 6              0.6261033             -0.7560852              0.1423710
##   ManufacturingProcess19 ManufacturingProcess20 ManufacturingProcess21
## 1              0.4563798              0.3109942              0.2109804
## 2              1.5095063              0.1849230              0.2109804
## 3              1.0926437              0.1849230              0.2109804
## 4              0.9829430              0.1562704              0.2109804
## 5              1.6192070              0.2938027             -0.6884239
## 6              1.9044287              0.3998171             -0.5599376
##   ManufacturingProcess22 ManufacturingProcess23 ManufacturingProcess24
## 1             0.05833309              0.8317688              0.8907291
## 2            -0.72230090             -1.8147683             -1.0060115
## 3            -0.42205706             -1.2132826             -0.8335805
## 4            -0.12181322             -0.6117969             -0.6611496
## 5             0.77891831              0.5911745              1.5804530
## 6             1.07916216             -1.2132826             -1.3508734
##   ManufacturingProcess25 ManufacturingProcess26 ManufacturingProcess27
## 1              0.1200183              0.1256347              0.3460352
## 2              0.1093082              0.1966227              0.1906613
## 3              0.1842786              0.2159831              0.2104362
## 4              0.1708910              0.2052273              0.1906613
## 5              0.2726365              0.2912733              0.3432102
## 6              0.1146633              0.2417969              0.3516852
##   ManufacturingProcess28 ManufacturingProcess29 ManufacturingProcess30
## 1              0.7826636              0.5943242              0.7566948
## 2              0.8779201              0.8347250              0.7566948
## 3              0.8588688              0.7746248              0.2444430
## 4              0.8588688              0.7746248              0.2444430
## 5              0.8969714              0.9549255             -0.1653585
## 6              0.9160227              1.0150257              0.9615956
##   ManufacturingProcess31 ManufacturingProcess32 ManufacturingProcess33
## 1             -0.1952552             -0.4568829              0.9890307
## 2             -0.2672523              1.9517531              0.9890307
## 3             -0.1592567              2.6928719              0.9890307
## 4             -0.1592567              2.3223125              1.7943843
## 5             -0.1412574              2.3223125              2.5997378
## 6             -0.3572486              2.6928719              2.5997378
##   ManufacturingProcess34 ManufacturingProcess35 ManufacturingProcess36
## 1             -1.7202722            -0.88694718             -0.6557774
## 2              1.9568096             1.14638329             -0.6557774
## 3              1.9568096             1.23880740             -1.8000420
## 4              0.1182687             0.03729394             -1.8000420
## 5              0.1182687            -2.55058120             -2.9443066
## 6              0.1182687            -0.51725073             -1.8000420
##   ManufacturingProcess37 ManufacturingProcess38 ManufacturingProcess39
## 1             -1.1540243              0.7174727              0.2317270
## 2              2.2161351             -0.8224687              0.2317270
## 3             -0.7046697             -0.8224687              0.2317270
## 4              0.4187168             -0.8224687              0.2317270
## 5             -1.8280562             -0.8224687              0.2981503
## 6             -1.3787016             -0.8224687              0.2317270
##   ManufacturingProcess40 ManufacturingProcess41 ManufacturingProcess42
## 1             0.05969714            -0.06900773             0.20279570
## 2             2.14909691             2.34626280            -0.05472265
## 3            -0.46265281            -0.44058781             0.40881037
## 4            -0.46265281            -0.44058781            -0.31224099
## 5            -0.46265281            -0.44058781            -0.10622632
## 6            -0.46265281            -0.44058781             0.15129203
##   ManufacturingProcess43 ManufacturingProcess44 ManufacturingProcess45
## 1             2.40564734            -0.01588055             0.64371849
## 2            -0.01374656             0.29467248             0.15220242
## 3             0.10146268            -0.01588055             0.39796046
## 4             0.21667191            -0.01588055            -0.09355562
## 5             0.21667191            -0.32643359            -0.09355562
## 6             1.48397347            -0.01588055            -0.33931365
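
For readers who want to see the idea behind KNN imputation, here is a simplified base R sketch on a toy matrix. The helper `knn_impute()` is hypothetical: it standardizes the columns, finds the k nearest complete rows using only the columns observed in the incomplete row, and averages their values. caret's implementation may differ in details such as which rows are eligible as neighbors.

```r
# Hypothetical helper: simplified KNN imputation on standardized data
knn_impute <- function(x, k = 5) {
  z <- scale(x)                                   # NA-aware centering and scaling
  complete <- which(complete.cases(z))
  for (i in which(!complete.cases(z))) {
    miss <- is.na(z[i, ])
    # Euclidean distance to every complete row, over the observed columns only
    diffs <- sweep(z[complete, !miss, drop = FALSE], 2, z[i, !miss])
    d  <- sqrt(rowSums(diffs^2))
    nn <- complete[order(d)[seq_len(min(k, length(complete)))]]
    z[i, miss] <- colMeans(z[nn, miss, drop = FALSE])  # average of the neighbors
  }
  z
}

toy <- cbind(a = c(1, 2, 3, 4, 5, 6),
             b = c(NA, 2.1, 2.9, 4.2, 5.1, 5.8))

imp <- knn_impute(toy, k = 2)
imp[1, ]   # row 1 now has an imputed, standardized value for b
```

The output is on the standardized scale, which matches the behavior seen in the head(df) output above where Yield is no longer a percentage.
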
  (c) Split the data into a training and a test set, pre-process the data, and tune a model of your choice from this chapter. What is the optimal value of the performance metric?

I will split the data 80/20 into training and test sets.

# Drop any near-zero-variance columns by name
nzv_cols <- nearZeroVar(df, names = TRUE)
df <- df %>%
  select(-all_of(nzv_cols))

set.seed(123)

train_data <- createDataPartition(df$Yield, times = 1, p = 0.8, list = FALSE)
train_x <- df[train_data, ]
test_x <- df[-train_data, ]
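
The split above is stratified on the response rather than purely random. As I understand it, for a numeric outcome createDataPartition() bins the response by quantiles and samples within each bin; the base R sketch below imitates that idea. The helper `stratified_split()`, the choice of 5 bins, and the toy response are assumptions for illustration.

```r
# Hypothetical helper: sample a fraction p of rows within each quantile bin of y
stratified_split <- function(y, p = 0.8, groups = 5) {
  breaks <- unique(quantile(y, probs = seq(0, 1, length.out = groups + 1)))
  bins   <- cut(y, breaks = breaks, include.lowest = TRUE)
  idx <- unlist(lapply(split(seq_along(y), bins), function(i) {
    i[sample.int(length(i), size = ceiling(p * length(i)))]
  }))
  sort(unname(idx))
}

set.seed(123)
y <- rnorm(100)
train_idx <- stratified_split(y, p = 0.8)
test_idx  <- setdiff(seq_along(y), train_idx)

c(train = length(train_idx), test = length(test_idx))
```

Sampling within bins keeps the distribution of the response similar in the training and test sets, which matters with only 176 runs.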

As in Exercise 6.2, I will use a PLS model.

pls_model <- train(
  Yield ~ ., data = train_x, method = "pls",
  center = TRUE,
  scale = TRUE,
  trControl = trainControl("cv", number = 10),
  tuneLength = 25
)

# Plot model RMSE vs different values of components
title <- paste("Training Set RMSE Minimized at",
               pls_model$bestTune$ncomp,
               "Components")
plot(pls_model, main = title)

pls_model$results %>%
  filter(ncomp == pls_model$bestTune$ncomp) %>%
  select(ncomp, RMSE, Rsquared) 
##   ncomp      RMSE  Rsquared
## 1     3 0.6606301 0.6025962

With 3 components, the resampled RMSE is 0.66 and the resampled R2 is 0.60 (both in standardized units, because Yield was centered and scaled during imputation).

  (d) Predict the response for the test set. What is the value of the performance metric and how does this compare with the resampled performance metric on the training set?
# Make predictions
pls_predictions <- predict(pls_model, test_x)
# Model performance metrics
results <- data.frame(RMSE = RMSE(pls_predictions, test_x$Yield),
           Rsquared = R2(pls_predictions, test_x$Yield))

results
##        RMSE  Rsquared
## 1 0.7472301 0.4690064

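On the test set the RMSE is about 0.75 and R2 is about 0.47, compared with a resampled RMSE of 0.66 and R2 of 0.60 for the 3-component model on the training set. The test-set performance is somewhat worse than the resampled estimate suggested.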

  (e) Which predictors are most important in the model you have trained? Do either the biological or process predictors dominate the list?

I will treat the variables with a varImp score greater than or equal to 50 as the “important” ones.

pls_importance <- varImp(pls_model)$importance %>%
  as.data.frame() %>%
  rownames_to_column("Variable") %>%
  filter(Overall >= 50) %>%
  arrange(desc(Overall)) %>%
  mutate(importance = row_number())
varImp(pls_model) %>%
  plot(., top = max(pls_importance$importance), main = "Important Variables")

pls_importance %>%
  mutate(Variable = gsub("[0-9]+", "", Variable)) %>%
  group_by(Variable) %>%
  tally() %>%
  arrange(desc(n)) 
## # A tibble: 2 × 2
##   Variable                 n
##   <chr>                <int>
## 1 BiologicalMaterial       8
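## 2 ManufacturingProcess     8

Neither group dominates by count: 8 biological and 8 process predictors pass the importance threshold.
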
  (f) Explore the relationships between each of the top predictors and the response. How could this information be helpful in improving yield in future runs of the manufacturing process?
important_vars <- df %>%
  select(Yield, all_of(pls_importance$Variable))

important_vars_p <- cor.mtest(important_vars)$p

important_vars %>%
  cor() 
##                             Yield ManufacturingProcess32 ManufacturingProcess17
## Yield                   1.0000000             0.60833215          -0.4258068718
## ManufacturingProcess32  0.6083321             1.00000000           0.0160417779
## ManufacturingProcess17 -0.4258069             0.01604178           1.0000000000
## ManufacturingProcess13 -0.5036797            -0.10120679           0.7824134530
## ManufacturingProcess09  0.5034705             0.04100301          -0.7154560357
## ManufacturingProcess36 -0.5257521            -0.79074701          -0.0020023828
## ManufacturingProcess06  0.3918329             0.21107014          -0.2589184161
## ManufacturingProcess33  0.4259162             0.85503352           0.1031513306
## BiologicalMaterial06    0.4781634             0.60059580           0.0060040026
## BiologicalMaterial03    0.4450860             0.53185738          -0.0976050220
## BiologicalMaterial08    0.3809402             0.46509386           0.0366214316
## BiologicalMaterial02    0.4815158             0.62983209           0.0238757276
## ManufacturingProcess11  0.3541010            -0.04733682          -0.5431531836
## BiologicalMaterial12    0.3674976             0.38777603           0.0188428565
## BiologicalMaterial11    0.3549143             0.41303985           0.0008172166
## BiologicalMaterial01    0.3589380             0.58074472           0.0847218165
## BiologicalMaterial04    0.3798401             0.57339290           0.0681689648
##                        ManufacturingProcess13 ManufacturingProcess09
## Yield                             -0.50367972             0.50347051
## ManufacturingProcess32            -0.10120679             0.04100301
## ManufacturingProcess17             0.78241345            -0.71545604
## ManufacturingProcess13             1.00000000            -0.79135366
## ManufacturingProcess09            -0.79135366             1.00000000
## ManufacturingProcess36             0.10373819            -0.05878034
## ManufacturingProcess06            -0.41417324             0.37310580
## ManufacturingProcess33            -0.02674363            -0.03184275
## BiologicalMaterial06              -0.12186756             0.23005968
## BiologicalMaterial03              -0.13369531             0.21460099
## BiologicalMaterial08              -0.12879997             0.25382693
## BiologicalMaterial02              -0.11246895             0.21884418
## ManufacturingProcess11            -0.59713215             0.72290852
## BiologicalMaterial12              -0.11198335             0.24585610
## BiologicalMaterial11              -0.06622217             0.16169992
## BiologicalMaterial01              -0.05656480             0.15278122
## BiologicalMaterial04              -0.04342685             0.14854273
##                        ManufacturingProcess36 ManufacturingProcess06
## Yield                            -0.525752053              0.3918329
## ManufacturingProcess32           -0.790747007              0.2110701
## ManufacturingProcess17           -0.002002383             -0.2589184
## ManufacturingProcess13            0.103738191             -0.4141732
## ManufacturingProcess09           -0.058780337              0.3731058
## ManufacturingProcess36            1.000000000             -0.2532658
## ManufacturingProcess06           -0.253265795              1.0000000
## ManufacturingProcess33           -0.697075809              0.1363429
## BiologicalMaterial06             -0.536334146              0.2350466
## BiologicalMaterial03             -0.472963248              0.1938400
## BiologicalMaterial08             -0.431153749              0.2552878
## BiologicalMaterial02             -0.566668428              0.2622260
## ManufacturingProcess11            0.040543807              0.3172338
## BiologicalMaterial12             -0.378081244              0.2613239
## BiologicalMaterial11             -0.341479471              0.1795052
## BiologicalMaterial01             -0.482185491              0.1706736
## BiologicalMaterial04             -0.422013882              0.1173219
##                        ManufacturingProcess33 BiologicalMaterial06
## Yield                              0.42591617          0.478163422
## ManufacturingProcess32             0.85503352          0.600595801
## ManufacturingProcess17             0.10315133          0.006004003
## ManufacturingProcess13            -0.02674363         -0.121867557
## ManufacturingProcess09            -0.03184275          0.230059682
## ManufacturingProcess36            -0.69707581         -0.536334146
## ManufacturingProcess06             0.13634291          0.235046564
## ManufacturingProcess33             1.00000000          0.537843544
## BiologicalMaterial06               0.53784354          1.000000000
## BiologicalMaterial03               0.48154317          0.872363670
## BiologicalMaterial08               0.40861823          0.650342532
## BiologicalMaterial02               0.58248354          0.954311305
## ManufacturingProcess11            -0.10471326          0.110193475
## BiologicalMaterial12               0.33847699          0.812853967
## BiologicalMaterial11               0.34575678          0.775535740
## BiologicalMaterial01               0.52397577          0.652343094
## BiologicalMaterial04               0.51085539          0.651072303
##                        BiologicalMaterial03 BiologicalMaterial08
## Yield                            0.44508598           0.38094021
## ManufacturingProcess32           0.53185738           0.46509386
## ManufacturingProcess17          -0.09760502           0.03662143
## ManufacturingProcess13          -0.13369531          -0.12879997
## ManufacturingProcess09           0.21460099           0.25382693
## ManufacturingProcess36          -0.47296325          -0.43115375
## ManufacturingProcess06           0.19384003           0.25528776
## ManufacturingProcess33           0.48154317           0.40861823
## BiologicalMaterial06             0.87236367           0.65034253
## BiologicalMaterial03             1.00000000           0.56141220
## BiologicalMaterial08             0.56141220           1.00000000
## BiologicalMaterial02             0.86079011           0.76120292
## ManufacturingProcess11          -0.08441979           0.23193745
## BiologicalMaterial12             0.69731478           0.77795072
## BiologicalMaterial11             0.71227148           0.80030352
## BiologicalMaterial01             0.57619985           0.77996317
## BiologicalMaterial04             0.58487370           0.71596496
##                        BiologicalMaterial02 ManufacturingProcess11
## Yield                            0.48151579             0.35410099
## ManufacturingProcess32           0.62983209            -0.04733682
## ManufacturingProcess17           0.02387573            -0.54315318
## ManufacturingProcess13          -0.11246895            -0.59713215
## ManufacturingProcess09           0.21884418             0.72290852
## ManufacturingProcess36          -0.56666843             0.04054381
## ManufacturingProcess06           0.26222601             0.31723376
## ManufacturingProcess33           0.58248354            -0.10471326
## BiologicalMaterial06             0.95431130             0.11019348
## BiologicalMaterial03             0.86079011            -0.08441979
## BiologicalMaterial08             0.76120292             0.23193745
## BiologicalMaterial02             1.00000000             0.12330531
## ManufacturingProcess11           0.12330531             1.00000000
## BiologicalMaterial12             0.77934185             0.14321007
## BiologicalMaterial11             0.77168821             0.09038107
## BiologicalMaterial01             0.73931488             0.05962337
## BiologicalMaterial04             0.74881669             0.14613178
##                        BiologicalMaterial12 BiologicalMaterial11
## Yield                            0.36749764         0.3549143462
## ManufacturingProcess32           0.38777603         0.4130398504
## ManufacturingProcess17           0.01884286         0.0008172166
## ManufacturingProcess13          -0.11198335        -0.0662221726
## ManufacturingProcess09           0.24585610         0.1616999198
## ManufacturingProcess36          -0.37808124        -0.3414794713
## ManufacturingProcess06           0.26132389         0.1795052247
## ManufacturingProcess33           0.33847699         0.3457567766
## BiologicalMaterial06             0.81285397         0.7755357399
## BiologicalMaterial03             0.69731478         0.7122714753
## BiologicalMaterial08             0.77795072         0.8003035151
## BiologicalMaterial02             0.77934185         0.7716882097
## ManufacturingProcess11           0.14321007         0.0903810728
## BiologicalMaterial12             1.00000000         0.9037208723
## BiologicalMaterial11             0.90372087         1.0000000000
## BiologicalMaterial01             0.51872558         0.6117038001
## BiologicalMaterial04             0.47088117         0.6202809021
##                        BiologicalMaterial01 BiologicalMaterial04
## Yield                            0.35893797           0.37984010
## ManufacturingProcess32           0.58074472           0.57339290
## ManufacturingProcess17           0.08472182           0.06816896
## ManufacturingProcess13          -0.05656480          -0.04342685
## ManufacturingProcess09           0.15278122           0.14854273
## ManufacturingProcess36          -0.48218549          -0.42201388
## ManufacturingProcess06           0.17067357           0.11732187
## ManufacturingProcess33           0.52397577           0.51085539
## BiologicalMaterial06             0.65234309           0.65107230
## BiologicalMaterial03             0.57619985           0.58487370
## BiologicalMaterial08             0.77996317           0.71596496
## BiologicalMaterial02             0.73931488           0.74881669
## ManufacturingProcess11           0.05962337           0.14613178
## BiologicalMaterial12             0.51872558           0.47088117
## BiologicalMaterial11             0.61170380           0.62028090
## BiologicalMaterial01             1.00000000           0.81963941
## BiologicalMaterial04             0.81963941           1.00000000

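From the above I can see that most of the top predictors are positively correlated with yield. ManufacturingProcess32 has the strongest correlation (0.61), while ManufacturingProcess36 (-0.53), ManufacturingProcess13 (-0.50), and ManufacturingProcess17 (-0.43) are negatively correlated with it. Because the process predictors can be changed, these relationships suggest which settings to raise (for example ManufacturingProcess32 and ManufacturingProcess09) and which to lower (ManufacturingProcess13, 17, and 36) in future runs, while the biological predictors can be used to screen raw material before processing. Many of these predictors are also strongly correlated with each other, so any change should be confirmed experimentally.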
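
A compact way to read a large correlation matrix like the one above is to keep only each predictor's correlation with the response, attach a p-value, and sort by absolute strength. The base R sketch below does this on synthetic data; the data frame `toy`, its column names, and the effect sizes are invented for the example, and it uses `cor.test()` rather than corrplot's `cor.mtest()`.

```r
set.seed(42)
n <- 100
toy <- data.frame(p1 = rnorm(n), p2 = rnorm(n), p3 = rnorm(n))
toy$yield <- 0.8 * toy$p1 - 0.6 * toy$p2 + rnorm(n, sd = 0.5)

preds <- setdiff(names(toy), "yield")

summary_tbl <- data.frame(
  predictor = preds,
  r = unname(vapply(preds, function(v) cor(toy[[v]], toy$yield), numeric(1))),
  p_value = unname(vapply(preds, function(v) {
    cor.test(toy[[v]], toy$yield)$p.value
  }, numeric(1)))
)

# Strongest relationships first, regardless of sign
summary_tbl <- summary_tbl[order(-abs(summary_tbl$r)), ]
summary_tbl
```

The sign of r then indicates the direction in which an adjustable process setting would need to move, keeping in mind that correlation alone does not establish that changing the setting will change the yield.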