Concrete Mixtures

Concrete is an integral part of most industrialized societies. It is used to some extent in nearly all structures and in many roads. One of the main properties of interest (besides cost) is the compressive strength of the hardened concrete. The composition of many concretes includes a number of dry ingredients that are mixed with water and then allowed to set and harden. Given its abundance and critical role in infrastructure, the composition is important and has been widely studied. The objective of this assignment is to create models that help to find potential recipes to maximize compressive strength.

A standard type of experimental setup for this scenario is called a mixture design. Here, boundaries on the upper and lower limits of the mixture proportion for each ingredient are used to create multiple mixtures that methodically fill the space within the boundaries. The ingredients used in the experimental setup were cement, blast furnace slag, fly ash, water, superplasticizer, coarse aggregate, and fine aggregate.

There is also an additional non-mixture factor related to compressive strength: the age of the mixture (at testing). Since this is not an ingredient, it is usually referred to as a process factor.

Assignment

Create a predictive model for the compressive strength of different concrete mixtures. The trainingdata data set can be used to train the model. The testdata data set contains the mixtures for which the compressive strength needs to be predicted. Write the predicted compressive strength of each mixture to a file with write.csv so that it contains only the index and the compressive strength. This can be done by creating a data frame with only the index and the compressive strength (say, predictions) and then issuing the command: write.csv(predictions, file = "predictions.csv", row.names = FALSE)
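
A minimal sketch of this step, assuming a fitted caret model named finalFit (a hypothetical name; any of the models trained below could be used) and the index_testdata and testdata objects created in the Import data section:

# predict on the test mixtures and write index + prediction to file
predicted_strength <- predict(finalFit, newdata = testdata)
predictions <- data.frame(index = index_testdata,
                          CompressiveStrength = predicted_strength)
write.csv(predictions, file = "predictions.csv", row.names = FALSE)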

Sources

Introduction

Below is a brief summary of the methods I used, including their effect on the predictions.

Decrease in model accuracy:

Increase in model accuracy:

I find it remarkable that all the adjustments to the training data led to a decrease in model accuracy. In the end, simply feeding all the data into the most complex model gave the best results.

General

library(doParallel)

# set working directory
setwd("C:/directories/r_directory/pdm")

# start a worker cluster, leaving one core free
n_cores <- detectCores() - 1
cl <- makeCluster(n_cores)
registerDoParallel(cl)
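
Once all resampling is finished it is good practice to release the workers again; a minimal sketch, to be run once at the very end of the analysis:

# shut down the cluster and fall back to sequential execution
stopCluster(cl)
registerDoSEQ()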

Import data

# import training data
dataset <- read.csv("trainingdata.csv") 

# set predictor values
predictors <- dataset[,1:8]

# set response values
response <- dataset[,9]

# import test data
testdata <- read.csv("testdata.csv")

# save index
index_testdata <- testdata[,1]

# test data without index
testdata <- testdata[,2:9]
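
A quick sanity check after importing can catch column mismatches early; a small optional sketch, assuming the test set uses the same eight predictor names in the same order as the training set:

# the test predictors should line up with the training predictors
stopifnot(identical(names(testdata), names(predictors)))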

Exploratory data analysis

library(skimr)
library(DataExplorer)
library(ggthemes)
library(ggplot2)
library(GGally)
library(PerformanceAnalytics)

# general overview data set
head(dataset)
##   Cement BlastFurnaceSlag FlyAsh Water Superplasticizer CoarseAggregate
## 1  540.0              0.0      0   162              2.5            1055
## 2  332.5            142.5      0   228              0.0             932
## 3  332.5            142.5      0   228              0.0             932
## 4  266.0            114.0      0   228              0.0             932
## 5  380.0             95.0      0   228              0.0             932
## 6  266.0            114.0      0   228              0.0             932
##   FineAggregate Age CompressiveStrength
## 1           676  28               61.89
## 2           594 270               40.27
## 3           594 365               41.05
## 4           670  90               47.03
## 5           594 365               43.70
## 6           670  28               45.85
skim(dataset)
Data summary
Name dataset
Number of rows 900
Number of columns 9
_______________________
Column type frequency:
numeric 9
________________________
Group variables None

Variable type: numeric

skim_variable        n_missing complete_rate   mean     sd     p0    p25    p50     p75   p100  hist
Cement                       0             1 281.14 105.76 102.00 190.70 266.10  350.00  540.0  ▆▇▆▃▂
BlastFurnaceSlag             0             1  73.00  85.67   0.00   0.00  21.00  142.50  359.4  ▇▂▃▁▁
FlyAsh                       0             1  54.93  64.21   0.00   0.00   0.00  118.30  200.1  ▇▁▂▂▁
Water                        0             1 181.58  21.37 121.80 164.90 185.00  192.00  247.0  ▁▅▇▂▁
Superplasticizer             0             1   6.25   5.97   0.00   0.00   6.50   10.30   32.2  ▇▇▁▁▁
CoarseAggregate              0             1 973.29  78.09 801.00 932.00 968.00 1029.40 1145.0  ▃▅▇▅▂
FineAggregate                0             1 773.16  80.84 594.00 728.68 778.90  824.00  992.6  ▂▃▇▃▁
Age                          0             1  45.61  63.36   1.00   7.00  28.00   56.00  365.0  ▇▁▁▁▁
CompressiveStrength          0             1  35.55  16.69   2.33  23.61  33.95   45.37   82.6  ▅▇▇▃▂
summary(dataset)
##      Cement      BlastFurnaceSlag     FlyAsh           Water      
##  Min.   :102.0   Min.   :  0.0    Min.   :  0.00   Min.   :121.8  
##  1st Qu.:190.7   1st Qu.:  0.0    1st Qu.:  0.00   1st Qu.:164.9  
##  Median :266.1   Median : 21.0    Median :  0.00   Median :185.0  
##  Mean   :281.1   Mean   : 73.0    Mean   : 54.93   Mean   :181.6  
##  3rd Qu.:350.0   3rd Qu.:142.5    3rd Qu.:118.30   3rd Qu.:192.0  
##  Max.   :540.0   Max.   :359.4    Max.   :200.10   Max.   :247.0  
##  Superplasticizer CoarseAggregate  FineAggregate        Age        
##  Min.   : 0.000   Min.   : 801.0   Min.   :594.0   Min.   :  1.00  
##  1st Qu.: 0.000   1st Qu.: 932.0   1st Qu.:728.7   1st Qu.:  7.00  
##  Median : 6.500   Median : 968.0   Median :778.9   Median : 28.00  
##  Mean   : 6.246   Mean   : 973.3   Mean   :773.2   Mean   : 45.61  
##  3rd Qu.:10.300   3rd Qu.:1029.4   3rd Qu.:824.0   3rd Qu.: 56.00  
##  Max.   :32.200   Max.   :1145.0   Max.   :992.6   Max.   :365.00  
##  CompressiveStrength
##  Min.   : 2.33      
##  1st Qu.:23.61      
##  Median :33.95      
##  Mean   :35.55      
##  3rd Qu.:45.37      
##  Max.   :82.60
# pair plot
ggpairs(dataset)

# histograms
plot_histogram(dataset, ggtheme=theme_few())

# box plots
boxplot(dataset, col = "blue")

# scatter plots by CompressiveStrength
plot_scatterplot(dataset, by = "CompressiveStrength", ggtheme=theme_few())

# correlation plot
plot_correlation(dataset, ggtheme = theme_few())

# count zeros per column and express them as a percentage of the total number of rows
zero_count <- colSums(dataset==0)
zero_count
##              Cement    BlastFurnaceSlag              FlyAsh               Water 
##                   0                 417                 490                   0 
##    Superplasticizer     CoarseAggregate       FineAggregate                 Age 
##                 327                   0                   0                   0 
## CompressiveStrength 
##                   0
zero_pct <- colSums(dataset==0)/nrow(dataset)*100
zero_pct
##              Cement    BlastFurnaceSlag              FlyAsh               Water 
##             0.00000            46.33333            54.44444             0.00000 
##    Superplasticizer     CoarseAggregate       FineAggregate                 Age 
##            36.33333             0.00000             0.00000             0.00000 
## CompressiveStrength 
##             0.00000
# skewness
skewness(dataset)
##             Cement BlastFurnaceSlag    FlyAsh      Water Superplasticizer
## Skewness 0.5247352        0.7951561 0.5148602 0.05496505         0.896554
##          CoarseAggregate FineAggregate      Age CompressiveStrength
## Skewness     -0.01441195    -0.2375712 3.224019           0.4450269
plot_qq(dataset, ggtheme=theme_few())

Comments

General overview: there are no missing values in any of the columns.

Histograms: the data are not normally distributed; we should consider transforming the features toward normal distributions during preprocessing.

Box plots: there are outliers in the predictors BlastFurnaceSlag, Water, Superplasticizer, FineAggregate and Age. Removing the rows with outliers would discard around 10% of the data, so we will instead try substituting them with the mean and check the results.

Scatter plots: based on the scatter plots, the feature Cement appears to be most correlated with the response CompressiveStrength. We will confirm this with a correlation overview and plot.

Correlation plot: we can observe a high positive correlation between the response and Cement. Age and Superplasticizer are the other two factors most strongly related to the response. There is also a strong negative correlation between Superplasticizer and Water.

Count zeros: based on the plots we see many zeros in the features BlastFurnaceSlag, FlyAsh and Superplasticizer. We should explore this further.

Skewness: skewness is clearly visible, especially in the features with many zeros. Age has some extreme outliers at the top. We should transform these distributions toward normality during preprocessing; a possible approach is sketched below.
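
A minimal sketch of the preprocessing hinted at above, using caret::preProcess for a Yeo-Johnson transformation with centering and scaling, plus a simple IQR-based replacement of outliers by the column mean. This is illustrative only; as noted in the Introduction, these adjustments ultimately did not improve accuracy.

library(caret)

# estimate transformations on the training predictors and apply to both sets
pp <- preProcess(predictors, method = c("YeoJohnson", "center", "scale"))
predictors_pp <- predict(pp, predictors)
testdata_pp   <- predict(pp, testdata)

# replace values outside 1.5 * IQR with the mean of the remaining values
replace_outliers <- function(x) {
  qs  <- quantile(x, c(0.25, 0.75))
  iqr <- qs[2] - qs[1]
  out <- x < qs[1] - 1.5 * iqr | x > qs[2] + 1.5 * iqr
  x[out] <- mean(x[!out])
  x
}
predictors_clean <- as.data.frame(lapply(predictors, replace_outliers))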

Preprocessing

library(caret)
library(mlbench)
library(tidyverse)
library(dplyr)
library(dlookr)

# check if there is near zero variance
nzv <- nearZeroVar(predictors, saveMetrics = TRUE)
nzv
##                  freqRatio percentUnique zeroVar   nzv
## Cement            1.125000     29.555556   FALSE FALSE
## BlastFurnaceSlag 15.444444     19.222222   FALSE FALSE
## FlyAsh           30.625000     16.333333   FALSE FALSE
## Water             2.191489     20.777778   FALSE FALSE
## Superplasticizer 10.218750     11.777778   FALSE FALSE
## CoarseAggregate   1.225000     30.000000   FALSE FALSE
## FineAggregate     1.000000     32.222222   FALSE FALSE
## Age               3.066116      1.555556   FALSE FALSE
#no near zero variance so we don't use nzv as method in preProc

# check if there are linear dependencies
lindep <- findLinearCombos(predictors)
lindep
## $linearCombos
## list()
## 
## $remove
## NULL
#no linear dependencies

# check if there is significant correlation based on a .75 cutoff, to confirm what we have seen in the plots earlier
cor <- cor(predictors)
cor
##                       Cement BlastFurnaceSlag      FlyAsh       Water
## Cement            1.00000000      -0.27613504 -0.40435177 -0.07700660
## BlastFurnaceSlag -0.27613504       1.00000000 -0.31911721  0.09937857
## FlyAsh           -0.40435177      -0.31911721  1.00000000 -0.23894626
## Water            -0.07700660       0.09937857 -0.23894626  1.00000000
## Superplasticizer  0.08073602       0.05986934  0.36711238 -0.65179537
## CoarseAggregate  -0.10723333      -0.28199425 -0.01368126 -0.18203813
## FineAggregate    -0.23137835      -0.26633897  0.07629013 -0.45991660
## Age               0.08642213      -0.04364486 -0.16027293  0.27899737
##                  Superplasticizer CoarseAggregate FineAggregate          Age
## Cement                 0.08073602    -0.107233332   -0.23137835  0.086422133
## BlastFurnaceSlag       0.05986934    -0.281994254   -0.26633897 -0.043644863
## FlyAsh                 0.36711238    -0.013681262    0.07629013 -0.160272926
## Water                 -0.65179537    -0.182038132   -0.45991660  0.278997375
## Superplasticizer       1.00000000    -0.266468752    0.22047749 -0.192093136
## CoarseAggregate       -0.26646875     1.000000000   -0.18317900 -0.004315883
## FineAggregate          0.22047749    -0.183178995    1.00000000 -0.157180793
## Age                   -0.19209314    -0.004315883   -0.15718079  1.000000000
high_cor <- findCorrelation(cor, cutoff = .75)
high_cor
## integer(0)
# no features exceed a correlation cutoff of .75, which confirms the conclusion based on the correlation plot. We will not use "corr" as a method in preProc and we will keep all the features
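
For completeness, had findCorrelation returned any column indices, the corresponding predictors could be dropped before training; a small sketch of that step (a no-op here since high_cor is empty):

# drop highly correlated predictors, if any were flagged
if (length(high_cor) > 0) {
  predictors <- predictors[, -high_cor]
}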

# feature importance using Recursive Feature Elimination (RFE)
set.seed(123)
ctrl <- rfeControl(functions = rfFuncs, 
                   method = "cv",
                   number = 10,
                   verbose = FALSE)

lmProfile <- rfe(x = predictors, 
                 y = response, 
                 sizes = c(1:8),
                 rfeControl = ctrl)

lmProfile
## 
## Recursive feature selection
## 
## Outer resampling method: Cross-Validated (10 fold) 
## 
## Resampling performance over subset size:
## 
##  Variables   RMSE Rsquared    MAE RMSESD RsquaredSD  MAESD Selected
##          1 12.820   0.4061 10.190 1.0823    0.10511 0.9413         
##          2  8.992   0.7101  7.035 0.9194    0.06239 0.7053         
##          3  7.292   0.8216  5.638 0.8096    0.05095 0.3940         
##          4  6.527   0.8729  5.070 0.6500    0.03824 0.2964         
##          5  6.401   0.8896  5.021 0.6837    0.03395 0.3438         
##          6  5.189   0.9125  3.858 0.7156    0.02927 0.2802        *
##          7  5.390   0.9093  3.986 0.7370    0.03216 0.2763         
##          8  5.552   0.9065  4.138 0.7181    0.03302 0.2689         
## 
## The top 5 variables (out of 6):
##    Age, Cement, Water, FineAggregate, BlastFurnaceSlag
predictors(lmProfile)
## [1] "Age"              "Cement"           "Water"            "FineAggregate"   
## [5] "BlastFurnaceSlag" "Superplasticizer"
plot(lmProfile, type = c("g", "o"))

# based on the RFE, Age, Cement, Water, FineAggregate and BlastFurnaceSlag are the most important features
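
If we did want to train only on the RFE-selected features, the predictor sets could be subset as sketched below; the models in this report keep all eight predictors:

# keep only the variables selected by RFE (illustrative, not used below)
selected_vars  <- predictors(lmProfile)
predictors_rfe <- predictors[, selected_vars]
testdata_rfe   <- testdata[, selected_vars]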

# set training parameters
ctrl <- trainControl(method = "repeatedcv",
                     number = 20,
                     repeats = 20)

ctrl_cv <- trainControl(method = "cv",
                        number = 20)

ctrl_xgb <- trainControl(method = "cv",
                         number = 20,
                         search = "grid")

Training Top 5 Models

# neural networks
set.seed(123)
nnetGrid <- expand.grid(.decay = c(0, 0.01, .1), 
                        .size = c(1:10), 
                        .bag = FALSE)
nnetGrid
##    .decay .size  .bag
## 1    0.00     1 FALSE
## 2    0.01     1 FALSE
## 3    0.10     1 FALSE
## 4    0.00     2 FALSE
## 5    0.01     2 FALSE
## 6    0.10     2 FALSE
## 7    0.00     3 FALSE
## 8    0.01     3 FALSE
## 9    0.10     3 FALSE
## 10   0.00     4 FALSE
## 11   0.01     4 FALSE
## 12   0.10     4 FALSE
## 13   0.00     5 FALSE
## 14   0.01     5 FALSE
## 15   0.10     5 FALSE
## 16   0.00     6 FALSE
## 17   0.01     6 FALSE
## 18   0.10     6 FALSE
## 19   0.00     7 FALSE
## 20   0.01     7 FALSE
## 21   0.10     7 FALSE
## 22   0.00     8 FALSE
## 23   0.01     8 FALSE
## 24   0.10     8 FALSE
## 25   0.00     9 FALSE
## 26   0.01     9 FALSE
## 27   0.10     9 FALSE
## 28   0.00    10 FALSE
## 29   0.01    10 FALSE
## 30   0.10    10 FALSE
nnetFit <- train(predictors,
                 response, 
                 method = "avNNet",
                 tuneGrid = nnetGrid, 
                 trControl = ctrl_cv,
                 linout = TRUE, 
                 trace = FALSE, 
                 MaxNWts = 10 * (ncol(predictors) + 1) + 10 + 1,
                 maxit = 500, 
                 preProc = c("center", "scale"))
nnetFit
## Model Averaged Neural Network 
## 
## 900 samples
##   8 predictor
## 
## Pre-processing: centered (8), scaled (8) 
## Resampling: Cross-Validated (20 fold) 
## Summary of sample sizes: 855, 855, 856, 856, 856, 854, ... 
## Resampling results across tuning parameters:
## 
##   decay  size  RMSE       Rsquared   MAE     
##   0.00    1     9.381062  0.6911776  7.002066
##   0.00    2     6.740511  0.8377439  5.191630
##   0.00    3     6.208132  0.8611464  4.778825
##   0.00    4     5.802836  0.8799794  4.447783
##   0.00    5     5.524865  0.8935214  4.179284
##   0.00    6     5.388293  0.8970321  4.054678
##   0.00    7     5.298037  0.9024527  3.871275
##   0.00    8    10.456101  0.8627619  4.721305
##   0.00    9     5.118550  0.9067706  3.693407
##   0.00   10     6.258228  0.8607140  4.074554
##   0.01    1     9.416876  0.6920283  7.095914
##   0.01    2     6.897797  0.8300779  5.300253
##   0.01    3     6.183066  0.8647398  4.757632
##   0.01    4     5.729807  0.8837714  4.434294
##   0.01    5     5.309166  0.9004170  4.040946
##   0.01    6     5.293512  0.9008021  4.029641
##   0.01    7     5.204859  0.9044074  3.862407
##   0.01    8     4.952935  0.9132856  3.681698
##   0.01    9     4.878235  0.9163351  3.656338
##   0.01   10     4.995252  0.9110841  3.787124
##   0.10    1     9.383546  0.6909523  7.029851
##   0.10    2     6.924770  0.8274929  5.358764
##   0.10    3     6.239736  0.8600777  4.815771
##   0.10    4     5.696332  0.8843682  4.352744
##   0.10    5     5.446549  0.8950531  4.152041
##   0.10    6     5.329909  0.8996663  4.011057
##   0.10    7     5.106442  0.9083157  3.872992
##   0.10    8     5.014059  0.9106457  3.752738
##   0.10    9     4.951479  0.9131644  3.716294
##   0.10   10     4.918789  0.9145168  3.683385
## 
## Tuning parameter 'bag' was held constant at a value of FALSE
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were size = 9, decay = 0.01 and bag = FALSE.
plot(nnetFit)

plot(varImp(nnetFit))

# random forests
set.seed(123)
rfGrid <- data.frame(mtry = 1:ncol(predictors))
rfGrid
##   mtry
## 1    1
## 2    2
## 3    3
## 4    4
## 5    5
## 6    6
## 7    7
## 8    8
rfFit <- train(x = predictors, 
               y = response, 
               method = "rf", 
               tuneGrid = rfGrid, 
               ntree = 2000, 
               importance = TRUE, 
               trControl = ctrl_cv)
rfFit
## Random Forest 
## 
## 900 samples
##   8 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (20 fold) 
## Summary of sample sizes: 855, 855, 856, 856, 856, 854, ... 
## Resampling results across tuning parameters:
## 
##   mtry  RMSE      Rsquared   MAE     
##   1     7.264047  0.8594038  5.738772
##   2     5.320665  0.9150545  3.962373
##   3     4.886979  0.9219525  3.554419
##   4     4.774547  0.9223704  3.442779
##   5     4.743583  0.9217818  3.391820
##   6     4.736136  0.9213526  3.367641
##   7     4.744854  0.9206342  3.362591
##   8     4.759586  0.9198757  3.372045
## 
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was mtry = 6.
plot(rfFit)

rfImp <- varImp(rfFit, competes = FALSE)
plot(rfImp)

# gradient boosting machines
set.seed(123)
gbmGrid <- expand.grid(interaction.depth = seq(1, 7, by = 2),
                       n.trees = seq(100, 2000, by = 50),
                       n.minobsinnode = 11,
                       shrinkage = c(0.01, 0.1))
gbmGrid
##     interaction.depth n.trees n.minobsinnode shrinkage
## 1                   1     100             11      0.01
## 2                   3     100             11      0.01
## 3                   5     100             11      0.01
## 4                   7     100             11      0.01
## 5                   1     150             11      0.01
## 6                   3     150             11      0.01
## 7                   5     150             11      0.01
## 8                   7     150             11      0.01
## 9                   1     200             11      0.01
## 10                  3     200             11      0.01
## 11                  5     200             11      0.01
## 12                  7     200             11      0.01
## 13                  1     250             11      0.01
## 14                  3     250             11      0.01
## 15                  5     250             11      0.01
## 16                  7     250             11      0.01
## 17                  1     300             11      0.01
## 18                  3     300             11      0.01
## 19                  5     300             11      0.01
## 20                  7     300             11      0.01
## 21                  1     350             11      0.01
## 22                  3     350             11      0.01
## 23                  5     350             11      0.01
## 24                  7     350             11      0.01
## 25                  1     400             11      0.01
## 26                  3     400             11      0.01
## 27                  5     400             11      0.01
## 28                  7     400             11      0.01
## 29                  1     450             11      0.01
## 30                  3     450             11      0.01
## 31                  5     450             11      0.01
## 32                  7     450             11      0.01
## 33                  1     500             11      0.01
## 34                  3     500             11      0.01
## 35                  5     500             11      0.01
## 36                  7     500             11      0.01
## 37                  1     550             11      0.01
## 38                  3     550             11      0.01
## 39                  5     550             11      0.01
## 40                  7     550             11      0.01
## 41                  1     600             11      0.01
## 42                  3     600             11      0.01
## 43                  5     600             11      0.01
## 44                  7     600             11      0.01
## 45                  1     650             11      0.01
## 46                  3     650             11      0.01
## 47                  5     650             11      0.01
## 48                  7     650             11      0.01
## 49                  1     700             11      0.01
## 50                  3     700             11      0.01
## 51                  5     700             11      0.01
## 52                  7     700             11      0.01
## 53                  1     750             11      0.01
## 54                  3     750             11      0.01
## 55                  5     750             11      0.01
## 56                  7     750             11      0.01
## 57                  1     800             11      0.01
## 58                  3     800             11      0.01
## 59                  5     800             11      0.01
## 60                  7     800             11      0.01
## 61                  1     850             11      0.01
## 62                  3     850             11      0.01
## 63                  5     850             11      0.01
## 64                  7     850             11      0.01
## 65                  1     900             11      0.01
## 66                  3     900             11      0.01
## 67                  5     900             11      0.01
## 68                  7     900             11      0.01
## 69                  1     950             11      0.01
## 70                  3     950             11      0.01
## 71                  5     950             11      0.01
## 72                  7     950             11      0.01
## 73                  1    1000             11      0.01
## 74                  3    1000             11      0.01
## 75                  5    1000             11      0.01
## 76                  7    1000             11      0.01
## 77                  1    1050             11      0.01
## 78                  3    1050             11      0.01
## 79                  5    1050             11      0.01
## 80                  7    1050             11      0.01
## 81                  1    1100             11      0.01
## 82                  3    1100             11      0.01
## 83                  5    1100             11      0.01
## 84                  7    1100             11      0.01
## 85                  1    1150             11      0.01
## 86                  3    1150             11      0.01
## 87                  5    1150             11      0.01
## 88                  7    1150             11      0.01
## 89                  1    1200             11      0.01
## 90                  3    1200             11      0.01
## 91                  5    1200             11      0.01
## 92                  7    1200             11      0.01
## 93                  1    1250             11      0.01
## 94                  3    1250             11      0.01
## 95                  5    1250             11      0.01
## 96                  7    1250             11      0.01
## 97                  1    1300             11      0.01
## 98                  3    1300             11      0.01
## 99                  5    1300             11      0.01
## 100                 7    1300             11      0.01
## 101                 1    1350             11      0.01
## 102                 3    1350             11      0.01
## 103                 5    1350             11      0.01
## 104                 7    1350             11      0.01
## 105                 1    1400             11      0.01
## 106                 3    1400             11      0.01
## 107                 5    1400             11      0.01
## 108                 7    1400             11      0.01
## 109                 1    1450             11      0.01
## 110                 3    1450             11      0.01
## 111                 5    1450             11      0.01
## 112                 7    1450             11      0.01
## 113                 1    1500             11      0.01
## 114                 3    1500             11      0.01
## 115                 5    1500             11      0.01
## 116                 7    1500             11      0.01
## 117                 1    1550             11      0.01
## 118                 3    1550             11      0.01
## 119                 5    1550             11      0.01
## 120                 7    1550             11      0.01
## 121                 1    1600             11      0.01
## 122                 3    1600             11      0.01
## 123                 5    1600             11      0.01
## 124                 7    1600             11      0.01
## 125                 1    1650             11      0.01
## 126                 3    1650             11      0.01
## 127                 5    1650             11      0.01
## 128                 7    1650             11      0.01
## 129                 1    1700             11      0.01
## 130                 3    1700             11      0.01
## 131                 5    1700             11      0.01
## 132                 7    1700             11      0.01
## 133                 1    1750             11      0.01
## 134                 3    1750             11      0.01
## 135                 5    1750             11      0.01
## 136                 7    1750             11      0.01
## 137                 1    1800             11      0.01
## 138                 3    1800             11      0.01
## 139                 5    1800             11      0.01
## 140                 7    1800             11      0.01
## 141                 1    1850             11      0.01
## 142                 3    1850             11      0.01
## 143                 5    1850             11      0.01
## 144                 7    1850             11      0.01
## 145                 1    1900             11      0.01
## 146                 3    1900             11      0.01
## 147                 5    1900             11      0.01
## 148                 7    1900             11      0.01
## 149                 1    1950             11      0.01
## 150                 3    1950             11      0.01
## 151                 5    1950             11      0.01
## 152                 7    1950             11      0.01
## 153                 1    2000             11      0.01
## 154                 3    2000             11      0.01
## 155                 5    2000             11      0.01
## 156                 7    2000             11      0.01
## 157                 1     100             11      0.10
## 158                 3     100             11      0.10
## 159                 5     100             11      0.10
## 160                 7     100             11      0.10
## 161                 1     150             11      0.10
## 162                 3     150             11      0.10
## 163                 5     150             11      0.10
## 164                 7     150             11      0.10
## 165                 1     200             11      0.10
## 166                 3     200             11      0.10
## 167                 5     200             11      0.10
## 168                 7     200             11      0.10
## 169                 1     250             11      0.10
## 170                 3     250             11      0.10
## 171                 5     250             11      0.10
## 172                 7     250             11      0.10
## 173                 1     300             11      0.10
## 174                 3     300             11      0.10
## 175                 5     300             11      0.10
## 176                 7     300             11      0.10
## 177                 1     350             11      0.10
## 178                 3     350             11      0.10
## 179                 5     350             11      0.10
## 180                 7     350             11      0.10
## 181                 1     400             11      0.10
## 182                 3     400             11      0.10
## 183                 5     400             11      0.10
## 184                 7     400             11      0.10
## 185                 1     450             11      0.10
## 186                 3     450             11      0.10
## 187                 5     450             11      0.10
## 188                 7     450             11      0.10
## 189                 1     500             11      0.10
## 190                 3     500             11      0.10
## 191                 5     500             11      0.10
## 192                 7     500             11      0.10
## 193                 1     550             11      0.10
## 194                 3     550             11      0.10
## 195                 5     550             11      0.10
## 196                 7     550             11      0.10
## 197                 1     600             11      0.10
## 198                 3     600             11      0.10
## 199                 5     600             11      0.10
## 200                 7     600             11      0.10
## 201                 1     650             11      0.10
## 202                 3     650             11      0.10
## 203                 5     650             11      0.10
## 204                 7     650             11      0.10
## 205                 1     700             11      0.10
## 206                 3     700             11      0.10
## 207                 5     700             11      0.10
## 208                 7     700             11      0.10
## 209                 1     750             11      0.10
## 210                 3     750             11      0.10
## 211                 5     750             11      0.10
## 212                 7     750             11      0.10
## 213                 1     800             11      0.10
## 214                 3     800             11      0.10
## 215                 5     800             11      0.10
## 216                 7     800             11      0.10
## 217                 1     850             11      0.10
## 218                 3     850             11      0.10
## 219                 5     850             11      0.10
## 220                 7     850             11      0.10
## 221                 1     900             11      0.10
## 222                 3     900             11      0.10
## 223                 5     900             11      0.10
## 224                 7     900             11      0.10
## 225                 1     950             11      0.10
## 226                 3     950             11      0.10
## 227                 5     950             11      0.10
## 228                 7     950             11      0.10
## 229                 1    1000             11      0.10
## 230                 3    1000             11      0.10
## 231                 5    1000             11      0.10
## 232                 7    1000             11      0.10
## 233                 1    1050             11      0.10
## 234                 3    1050             11      0.10
## 235                 5    1050             11      0.10
## 236                 7    1050             11      0.10
## 237                 1    1100             11      0.10
## 238                 3    1100             11      0.10
## 239                 5    1100             11      0.10
## 240                 7    1100             11      0.10
## 241                 1    1150             11      0.10
## 242                 3    1150             11      0.10
## 243                 5    1150             11      0.10
## 244                 7    1150             11      0.10
## 245                 1    1200             11      0.10
## 246                 3    1200             11      0.10
## 247                 5    1200             11      0.10
## 248                 7    1200             11      0.10
## 249                 1    1250             11      0.10
## 250                 3    1250             11      0.10
## 251                 5    1250             11      0.10
## 252                 7    1250             11      0.10
## 253                 1    1300             11      0.10
## 254                 3    1300             11      0.10
## 255                 5    1300             11      0.10
## 256                 7    1300             11      0.10
## 257                 1    1350             11      0.10
## 258                 3    1350             11      0.10
## 259                 5    1350             11      0.10
## 260                 7    1350             11      0.10
## 261                 1    1400             11      0.10
## 262                 3    1400             11      0.10
## 263                 5    1400             11      0.10
## 264                 7    1400             11      0.10
## 265                 1    1450             11      0.10
## 266                 3    1450             11      0.10
## 267                 5    1450             11      0.10
## 268                 7    1450             11      0.10
## 269                 1    1500             11      0.10
## 270                 3    1500             11      0.10
## 271                 5    1500             11      0.10
## 272                 7    1500             11      0.10
## 273                 1    1550             11      0.10
## 274                 3    1550             11      0.10
## 275                 5    1550             11      0.10
## 276                 7    1550             11      0.10
## 277                 1    1600             11      0.10
## 278                 3    1600             11      0.10
## 279                 5    1600             11      0.10
## 280                 7    1600             11      0.10
## 281                 1    1650             11      0.10
## 282                 3    1650             11      0.10
## 283                 5    1650             11      0.10
## 284                 7    1650             11      0.10
## 285                 1    1700             11      0.10
## 286                 3    1700             11      0.10
## 287                 5    1700             11      0.10
## 288                 7    1700             11      0.10
## 289                 1    1750             11      0.10
## 290                 3    1750             11      0.10
## 291                 5    1750             11      0.10
## 292                 7    1750             11      0.10
## 293                 1    1800             11      0.10
## 294                 3    1800             11      0.10
## 295                 5    1800             11      0.10
## 296                 7    1800             11      0.10
## 297                 1    1850             11      0.10
## 298                 3    1850             11      0.10
## 299                 5    1850             11      0.10
## 300                 7    1850             11      0.10
## 301                 1    1900             11      0.10
## 302                 3    1900             11      0.10
## 303                 5    1900             11      0.10
## 304                 7    1900             11      0.10
## 305                 1    1950             11      0.10
## 306                 3    1950             11      0.10
## 307                 5    1950             11      0.10
## 308                 7    1950             11      0.10
## 309                 1    2000             11      0.10
## 310                 3    2000             11      0.10
## 311                 5    2000             11      0.10
## 312                 7    2000             11      0.10
gbmFit <- train(x = predictors, 
                y = response, 
                method = "gbm",
                tuneGrid = gbmGrid,
                trControl = ctrl,
                verbose = FALSE)
gbmFit
## Stochastic Gradient Boosting 
## 
## 900 samples
##   8 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (20 fold, repeated 20 times) 
## Summary of sample sizes: 855, 855, 856, 856, 856, 854, ... 
## Resampling results across tuning parameters:
## 
##   shrinkage  interaction.depth  n.trees  RMSE       Rsquared   MAE      
##   0.01       1                   100     13.666520  0.6551054  10.900110
##   0.01       1                   150     12.613273  0.6830760  10.068949
##   0.01       1                   200     11.779890  0.6972649   9.427925
##   0.01       1                   250     11.111855  0.7122930   8.893464
##   0.01       1                   300     10.552272  0.7300101   8.428726
##   0.01       1                   350     10.071879  0.7464196   8.030580
##   0.01       1                   400      9.656905  0.7608141   7.685632
##   0.01       1                   450      9.292617  0.7737359   7.382511
##   0.01       1                   500      8.969311  0.7851264   7.114585
##   0.01       1                   550      8.678086  0.7950275   6.873445
##   0.01       1                   600      8.418381  0.8035855   6.659639
##   0.01       1                   650      8.185273  0.8108008   6.461801
##   0.01       1                   700      7.974631  0.8171946   6.283892
##   0.01       1                   750      7.784333  0.8227216   6.122466
##   0.01       1                   800      7.614436  0.8274067   5.978000
##   0.01       1                   850      7.464265  0.8312918   5.850963
##   0.01       1                   900      7.328040  0.8347681   5.738070
##   0.01       1                   950      7.207202  0.8377441   5.637400
##   0.01       1                  1000      7.097933  0.8404494   5.547481
##   0.01       1                  1050      7.001105  0.8428097   5.469410
##   0.01       1                  1100      6.914179  0.8450386   5.400450
##   0.01       1                  1150      6.835279  0.8471380   5.339248
##   0.01       1                  1200      6.763299  0.8491083   5.283400
##   0.01       1                  1250      6.698125  0.8509298   5.232141
##   0.01       1                  1300      6.638306  0.8526741   5.184085
##   0.01       1                  1350      6.584007  0.8542949   5.139143
##   0.01       1                  1400      6.533514  0.8558289   5.097311
##   0.01       1                  1450      6.485838  0.8573181   5.057114
##   0.01       1                  1500      6.441891  0.8587193   5.018472
##   0.01       1                  1550      6.400730  0.8600526   4.981941
##   0.01       1                  1600      6.361558  0.8613289   4.947148
##   0.01       1                  1650      6.324935  0.8625174   4.914859
##   0.01       1                  1700      6.291395  0.8636311   4.885884
##   0.01       1                  1750      6.257794  0.8647462   4.856314
##   0.01       1                  1800      6.225380  0.8658365   4.828284
##   0.01       1                  1850      6.195533  0.8668396   4.801954
##   0.01       1                  1900      6.168129  0.8677547   4.777961
##   0.01       1                  1950      6.142243  0.8685957   4.755182
##   0.01       1                  2000      6.117107  0.8694574   4.733181
##   0.01       3                   100     11.476781  0.7470971   9.173866
##   0.01       3                   150     10.057987  0.7784653   8.023591
##   0.01       3                   200      9.028011  0.8028283   7.200153
##   0.01       3                   250      8.257895  0.8213980   6.594992
##   0.01       3                   300      7.666751  0.8357258   6.125281
##   0.01       3                   350      7.213305  0.8464275   5.755245
##   0.01       3                   400      6.862768  0.8547287   5.456521
##   0.01       3                   450      6.590829  0.8612759   5.219178
##   0.01       3                   500      6.378977  0.8666179   5.026619
##   0.01       3                   550      6.206140  0.8712990   4.863850
##   0.01       3                   600      6.061343  0.8754620   4.723662
##   0.01       3                   650      5.938052  0.8791947   4.601442
##   0.01       3                   700      5.831261  0.8825375   4.494099
##   0.01       3                   750      5.736366  0.8855863   4.398001
##   0.01       3                   800      5.652335  0.8883150   4.312914
##   0.01       3                   850      5.577834  0.8907611   4.237319
##   0.01       3                   900      5.510820  0.8929844   4.170542
##   0.01       3                   950      5.450256  0.8949961   4.110789
##   0.01       3                  1000      5.395854  0.8968095   4.057207
##   0.01       3                  1050      5.346542  0.8984504   4.008285
##   0.01       3                  1100      5.301998  0.8999478   3.963774
##   0.01       3                  1150      5.260633  0.9013342   3.923439
##   0.01       3                  1200      5.222805  0.9026026   3.885654
##   0.01       3                  1250      5.187647  0.9037870   3.851005
##   0.01       3                  1300      5.154718  0.9048872   3.818475
##   0.01       3                  1350      5.122984  0.9059343   3.787267
##   0.01       3                  1400      5.094057  0.9068873   3.758504
##   0.01       3                  1450      5.066414  0.9078034   3.730842
##   0.01       3                  1500      5.040295  0.9086745   3.705878
##   0.01       3                  1550      5.015242  0.9095019   3.680782
##   0.01       3                  1600      4.992754  0.9102513   3.659092
##   0.01       3                  1650      4.970866  0.9109811   3.638221
##   0.01       3                  1700      4.948907  0.9117050   3.617057
##   0.01       3                  1750      4.928597  0.9123701   3.598054
##   0.01       3                  1800      4.909402  0.9130089   3.579657
##   0.01       3                  1850      4.891609  0.9135860   3.561766
##   0.01       3                  1900      4.873949  0.9141629   3.544988
##   0.01       3                  1950      4.857290  0.9147048   3.528855
##   0.01       3                  2000      4.840937  0.9152515   3.513576
##   0.01       5                   100     10.559654  0.7869783   8.422737
##   0.01       5                   150      9.044945  0.8149391   7.210264
##   0.01       5                   200      8.013034  0.8353634   6.399440
##   0.01       5                   250      7.284967  0.8505883   5.819730
##   0.01       5                   300      6.765496  0.8618104   5.384891
##   0.01       5                   350      6.391794  0.8702927   5.051467
##   0.01       5                   400      6.113163  0.8771524   4.790720
##   0.01       5                   450      5.898335  0.8828520   4.580586
##   0.01       5                   500      5.726549  0.8877347   4.408275
##   0.01       5                   550      5.584822  0.8919571   4.263674
##   0.01       5                   600      5.467020  0.8955642   4.143718
##   0.01       5                   650      5.365305  0.8987756   4.040511
##   0.01       5                   700      5.278189  0.9015538   3.951869
##   0.01       5                   750      5.203752  0.9039379   3.876106
##   0.01       5                   800      5.136340  0.9061113   3.809596
##   0.01       5                   850      5.076626  0.9080489   3.750265
##   0.01       5                   900      5.023541  0.9097585   3.698860
##   0.01       5                   950      4.975168  0.9113134   3.651673
##   0.01       5                  1000      4.931358  0.9127281   3.609521
##   0.01       5                  1050      4.890689  0.9140303   3.570739
##   0.01       5                  1100      4.854839  0.9151876   3.537632
##   0.01       5                  1150      4.820342  0.9162929   3.505709
##   0.01       5                  1200      4.789538  0.9172859   3.476596
##   0.01       5                  1250      4.760771  0.9182015   3.450087
##   0.01       5                  1300      4.733875  0.9190606   3.425018
##   0.01       5                  1350      4.709072  0.9198499   3.402279
##   0.01       5                  1400      4.686273  0.9205721   3.381239
##   0.01       5                  1450      4.663345  0.9213067   3.360723
##   0.01       5                  1500      4.641330  0.9220054   3.341102
##   0.01       5                  1550      4.620639  0.9226566   3.322452
##   0.01       5                  1600      4.600773  0.9232849   3.304958
##   0.01       5                  1650      4.581923  0.9238826   3.288066
##   0.01       5                  1700      4.564533  0.9244181   3.272624
##   0.01       5                  1750      4.548865  0.9249051   3.258558
##   0.01       5                  1800      4.533205  0.9253905   3.244754
##   0.01       5                  1850      4.517527  0.9258834   3.230630
##   0.01       5                  1900      4.503087  0.9263376   3.217277
##   0.01       5                  1950      4.488720  0.9267790   3.205055
##   0.01       5                  2000      4.474157  0.9272319   3.192294
##   0.01       7                   100     10.041819  0.8138495   8.015326
##   0.01       7                   150      8.481977  0.8361965   6.776675
##   0.01       7                   200      7.467280  0.8532296   5.961306
##   0.01       7                   250      6.777678  0.8663061   5.393915
##   0.01       7                   300      6.300029  0.8762426   4.971094
##   0.01       7                   350      5.962009  0.8840055   4.652550
##   0.01       7                   400      5.710217  0.8903938   4.404960
##   0.01       7                   450      5.517159  0.8956369   4.209906
##   0.01       7                   500      5.363273  0.9000430   4.052576
##   0.01       7                   550      5.237406  0.9037713   3.926177
##   0.01       7                   600      5.132814  0.9069429   3.819864
##   0.01       7                   650      5.046161  0.9095850   3.731905
##   0.01       7                   700      4.970531  0.9119290   3.655322
##   0.01       7                   750      4.905565  0.9139559   3.590203
##   0.01       7                   800      4.849534  0.9156892   3.535911
##   0.01       7                   850      4.799360  0.9172602   3.487956
##   0.01       7                   900      4.754030  0.9186663   3.445635
##   0.01       7                   950      4.713601  0.9199192   3.408599
##   0.01       7                  1000      4.676275  0.9210802   3.373873
##   0.01       7                  1050      4.642902  0.9221102   3.342772
##   0.01       7                  1100      4.610896  0.9231098   3.313855
##   0.01       7                  1150      4.581561  0.9240154   3.287416
##   0.01       7                  1200      4.553670  0.9248831   3.262394
##   0.01       7                  1250      4.528602  0.9256516   3.239945
##   0.01       7                  1300      4.503637  0.9264183   3.217620
##   0.01       7                  1350      4.481271  0.9271193   3.198131
##   0.01       7                  1400      4.459539  0.9277703   3.178788
##   0.01       7                  1450      4.439335  0.9283903   3.160744
##   0.01       7                  1500      4.420429  0.9289724   3.143770
##   0.01       7                  1550      4.402666  0.9294954   3.127099
##   0.01       7                  1600      4.384941  0.9300227   3.111219
##   0.01       7                  1650      4.368778  0.9305082   3.095954
##   0.01       7                  1700      4.353111  0.9309850   3.081759
##   0.01       7                  1750      4.337178  0.9314558   3.067502
##   0.01       7                  1800      4.323119  0.9318726   3.054624
##   0.01       7                  1850      4.309069  0.9322904   3.041862
##   0.01       7                  1900      4.295091  0.9327032   3.029550
##   0.01       7                  1950      4.282014  0.9330952   3.017206
##   0.01       7                  2000      4.269354  0.9334735   3.005996
##   0.10       1                   100      7.066337  0.8396175   5.517732
##   0.10       1                   150      6.431465  0.8583025   4.999250
##   0.10       1                   200      6.118111  0.8688041   4.728490
##   0.10       1                   250      5.925450  0.8756672   4.563115
##   0.10       1                   300      5.800108  0.8802354   4.449750
##   0.10       1                   350      5.709749  0.8836028   4.363592
##   0.10       1                   400      5.642262  0.8860642   4.299499
##   0.10       1                   450      5.583663  0.8883094   4.247942
##   0.10       1                   500      5.538416  0.8899870   4.205484
##   0.10       1                   550      5.497250  0.8914992   4.165832
##   0.10       1                   600      5.467134  0.8926840   4.137389
##   0.10       1                   650      5.436025  0.8938329   4.106521
##   0.10       1                   700      5.409284  0.8948276   4.079563
##   0.10       1                   750      5.389421  0.8956096   4.060712
##   0.10       1                   800      5.371604  0.8962965   4.043509
##   0.10       1                   850      5.348599  0.8971093   4.023129
##   0.10       1                   900      5.333036  0.8977493   4.006564
##   0.10       1                   950      5.316400  0.8984013   3.993238
##   0.10       1                  1000      5.306270  0.8987512   3.980394
##   0.10       1                  1050      5.292625  0.8992656   3.967062
##   0.10       1                  1100      5.281080  0.8996747   3.954370
##   0.10       1                  1150      5.268421  0.9001361   3.943372
##   0.10       1                  1200      5.262407  0.9003550   3.935463
##   0.10       1                  1250      5.249760  0.9008182   3.924264
##   0.10       1                  1300      5.245265  0.9009617   3.919312
##   0.10       1                  1350      5.239526  0.9012138   3.911851
##   0.10       1                  1400      5.231996  0.9015164   3.905540
##   0.10       1                  1450      5.220723  0.9019317   3.898143
##   0.10       1                  1500      5.213992  0.9021510   3.889985
##   0.10       1                  1550      5.208490  0.9023749   3.886010
##   0.10       1                  1600      5.199746  0.9026593   3.877420
##   0.10       1                  1650      5.197730  0.9027752   3.873181
##   0.10       1                  1700      5.192467  0.9029650   3.868750
##   0.10       1                  1750      5.187184  0.9031622   3.863822
##   0.10       1                  1800      5.185281  0.9032526   3.861641
##   0.10       1                  1850      5.178522  0.9035221   3.855724
##   0.10       1                  1900      5.173185  0.9036802   3.851525
##   0.10       1                  1950      5.169971  0.9038424   3.848815
##   0.10       1                  2000      5.162622  0.9041475   3.841168
##   0.10       3                   100      5.469965  0.8935541   4.110766
##   0.10       3                   150      5.116505  0.9057135   3.766024
##   0.10       3                   200      4.912146  0.9126823   3.581519
##   0.10       3                   250      4.784843  0.9169434   3.462063
##   0.10       3                   300      4.687229  0.9201160   3.375329
##   0.10       3                   350      4.609170  0.9226274   3.304036
##   0.10       3                   400      4.544139  0.9246628   3.250289
##   0.10       3                   450      4.490430  0.9264022   3.203538
##   0.10       3                   500      4.444126  0.9278545   3.165182
##   0.10       3                   550      4.404236  0.9290896   3.128428
##   0.10       3                   600      4.363404  0.9303121   3.094755
##   0.10       3                   650      4.336233  0.9311255   3.069342
##   0.10       3                   700      4.309303  0.9319312   3.047166
##   0.10       3                   750      4.284892  0.9326847   3.028099
##   0.10       3                   800      4.261016  0.9333998   3.009906
##   0.10       3                   850      4.238128  0.9340857   2.992411
##   0.10       3                   900      4.222802  0.9345374   2.978883
##   0.10       3                   950      4.201946  0.9351286   2.961686
##   0.10       3                  1000      4.183039  0.9356892   2.947786
##   0.10       3                  1050      4.166973  0.9361434   2.933935
##   0.10       3                  1100      4.156366  0.9364907   2.922927
##   0.10       3                  1150      4.142264  0.9369015   2.911497
##   0.10       3                  1200      4.126254  0.9373285   2.898417
##   0.10       3                  1250      4.118437  0.9375404   2.890676
##   0.10       3                  1300      4.107853  0.9378472   2.881702
##   0.10       3                  1350      4.101030  0.9380341   2.875392
##   0.10       3                  1400      4.089167  0.9383224   2.864676
##   0.10       3                  1450      4.083406  0.9385038   2.858361
##   0.10       3                  1500      4.072651  0.9387885   2.849262
##   0.10       3                  1550      4.065400  0.9390033   2.841541
##   0.10       3                  1600      4.057730  0.9392113   2.832617
##   0.10       3                  1650      4.051363  0.9393623   2.826580
##   0.10       3                  1700      4.045395  0.9395699   2.821649
##   0.10       3                  1750      4.044156  0.9395739   2.818845
##   0.10       3                  1800      4.038861  0.9397549   2.813236
##   0.10       3                  1850      4.034443  0.9398499   2.808963
##   0.10       3                  1900      4.030638  0.9399713   2.804195
##   0.10       3                  1950      4.027842  0.9400279   2.801621
##   0.10       3                  2000      4.024769  0.9401271   2.797304
##   0.10       5                   100      5.034595  0.9088243   3.694425
##   0.10       5                   150      4.758379  0.9179949   3.439827
##   0.10       5                   200      4.585626  0.9235025   3.285723
##   0.10       5                   250      4.475932  0.9269322   3.187707
##   0.10       5                   300      4.381672  0.9297944   3.105669
##   0.10       5                   350      4.314372  0.9318725   3.047069
##   0.10       5                   400      4.257906  0.9335693   3.001173
##   0.10       5                   450      4.209904  0.9349909   2.959170
##   0.10       5                   500      4.174407  0.9360637   2.924560
##   0.10       5                   550      4.145280  0.9368662   2.896924
##   0.10       5                   600      4.115156  0.9377069   2.870903
##   0.10       5                   650      4.085346  0.9385295   2.844407
##   0.10       5                   700      4.067767  0.9390199   2.827928
##   0.10       5                   750      4.050826  0.9394967   2.809247
##   0.10       5                   800      4.037966  0.9398198   2.796835
##   0.10       5                   850      4.026852  0.9401125   2.785431
##   0.10       5                   900      4.018786  0.9403712   2.775263
##   0.10       5                   950      4.009202  0.9406232   2.765515
##   0.10       5                  1000      4.000686  0.9408470   2.756278
##   0.10       5                  1050      3.994534  0.9410176   2.747719
##   0.10       5                  1100      3.989532  0.9411346   2.742564
##   0.10       5                  1150      3.985342  0.9412306   2.733707
##   0.10       5                  1200      3.980246  0.9413931   2.728132
##   0.10       5                  1250      3.979727  0.9413756   2.724443
##   0.10       5                  1300      3.974031  0.9415448   2.717135
##   0.10       5                  1350      3.975728  0.9415199   2.715761
##   0.10       5                  1400      3.971615  0.9416226   2.711335
##   0.10       5                  1450      3.968586  0.9416967   2.706696
##   0.10       5                  1500      3.966840  0.9417641   2.702839
##   0.10       5                  1550      3.963532  0.9418597   2.700126
##   0.10       5                  1600      3.966621  0.9417777   2.699232
##   0.10       5                  1650      3.965503  0.9418077   2.693930
##   0.10       5                  1700      3.966328  0.9417702   2.693546
##   0.10       5                  1750      3.967436  0.9417408   2.690555
##   0.10       5                  1800      3.965903  0.9417760   2.687753
##   0.10       5                  1850      3.967347  0.9417213   2.688377
##   0.10       5                  1900      3.967195  0.9417066   2.685244
##   0.10       5                  1950      3.963634  0.9418031   2.681715
##   0.10       5                  2000      3.963537  0.9417834   2.678780
##   0.10       7                   100      4.791694  0.9170049   3.468091
##   0.10       7                   150      4.536828  0.9250729   3.245165
##   0.10       7                   200      4.389111  0.9296456   3.116329
##   0.10       7                   250      4.277937  0.9330143   3.020017
##   0.10       7                   300      4.197062  0.9353746   2.947913
##   0.10       7                   350      4.134860  0.9371745   2.892307
##   0.10       7                   400      4.093696  0.9383219   2.852477
##   0.10       7                   450      4.058564  0.9393228   2.820127
##   0.10       7                   500      4.029676  0.9401106   2.790373
##   0.10       7                   550      4.010132  0.9406704   2.769735
##   0.10       7                   600      3.992169  0.9411283   2.751660
##   0.10       7                   650      3.977431  0.9415334   2.731740
##   0.10       7                   700      3.968707  0.9417521   2.721572
##   0.10       7                   750      3.963295  0.9418939   2.712830
##   0.10       7                   800      3.957281  0.9420724   2.703369
##   0.10       7                   850      3.953283  0.9421194   2.695623
##   0.10       7                   900      3.950373  0.9422095   2.689001
##   0.10       7                   950      3.949391  0.9422335   2.681343
##   0.10       7                  1000      3.941443  0.9424583   2.673505
##   0.10       7                  1050      3.941453  0.9424711   2.669269
##   0.10       7                  1100      3.940261  0.9425079   2.666278
##   0.10       7                  1150      3.939780  0.9425429   2.662768
##   0.10       7                  1200      3.941778  0.9424621   2.661057
##   0.10       7                  1250      3.941306  0.9424762   2.656433
##   0.10       7                  1300      3.940841  0.9424906   2.655260
##   0.10       7                  1350      3.938082  0.9425546   2.649270
##   0.10       7                  1400      3.939006  0.9425158   2.647333
##   0.10       7                  1450      3.941716  0.9424365   2.647498
##   0.10       7                  1500      3.940881  0.9424714   2.642444
##   0.10       7                  1550      3.944226  0.9423854   2.644032
##   0.10       7                  1600      3.945520  0.9423610   2.642675
##   0.10       7                  1650      3.948077  0.9422745   2.643473
##   0.10       7                  1700      3.946046  0.9423124   2.639839
##   0.10       7                  1750      3.947414  0.9422707   2.639811
##   0.10       7                  1800      3.948256  0.9422531   2.639432
##   0.10       7                  1850      3.951906  0.9421578   2.639203
##   0.10       7                  1900      3.950592  0.9422017   2.636294
##   0.10       7                  1950      3.950885  0.9421827   2.634344
##   0.10       7                  2000      3.957035  0.9420202   2.636694
## 
## Tuning parameter 'n.minobsinnode' was held constant at a value of 11
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were n.trees = 1350, interaction.depth =
##  7, shrinkage = 0.1 and n.minobsinnode = 11.
plot(gbmFit)

# extreme gradient boosting
set.seed(123)
xgbGrid <-  expand.grid(max_depth = c(3, 5, 7),
                        nrounds = (1:10)*70,
                        eta = 0.3,
                        gamma = 0,
                        subsample = 1,
                        min_child_weight = 1,
                        colsample_bytree = 0.6)

xgbFit <- train(x = predictors,
                y = response,
                method = "xgbTree",
                trControl = ctrl_xgb,
                tuneGrid = xgbGrid)
xgbFit
## eXtreme Gradient Boosting 
## 
## 900 samples
##   8 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (20 fold) 
## Summary of sample sizes: 855, 855, 856, 856, 856, 854, ... 
## Resampling results across tuning parameters:
## 
##   max_depth  nrounds  RMSE      Rsquared   MAE     
##   3           70      4.548472  0.9258530  3.319408
##   3          140      4.193718  0.9361447  2.986624
##   3          210      4.026018  0.9409575  2.814816
##   3          280      3.957934  0.9426267  2.753618
##   3          350      3.915965  0.9437347  2.705431
##   3          420      3.885122  0.9446051  2.670619
##   3          490      3.880767  0.9447276  2.661558
##   3          560      3.866485  0.9450859  2.638666
##   3          630      3.858587  0.9453751  2.630057
##   3          700      3.857600  0.9454180  2.628142
##   5           70      4.182646  0.9368702  2.910676
##   5          140      4.043065  0.9407298  2.742854
##   5          210      4.013970  0.9415822  2.692528
##   5          280      4.013485  0.9416043  2.681317
##   5          350      4.008992  0.9416721  2.668952
##   5          420      4.009771  0.9416212  2.666352
##   5          490      4.011120  0.9415464  2.662795
##   5          560      4.013019  0.9414771  2.661483
##   5          630      4.013576  0.9414463  2.661160
##   5          700      4.018622  0.9412778  2.664534
##   7           70      4.269227  0.9347829  2.839591
##   7          140      4.243791  0.9354771  2.798705
##   7          210      4.230965  0.9358187  2.784395
##   7          280      4.228003  0.9358945  2.781854
##   7          350      4.227725  0.9358949  2.781597
##   7          420      4.227896  0.9358805  2.781953
##   7          490      4.228015  0.9358739  2.781983
##   7          560      4.228072  0.9358722  2.781918
##   7          630      4.228072  0.9358722  2.781918
##   7          700      4.228072  0.9358722  2.781918
## 
## Tuning parameter 'eta' was held constant at a value of 0.3
## Tuning parameter 'gamma' was held constant at a value of 0
## Tuning parameter 'colsample_bytree' was held constant at a value of 0.6
## Tuning parameter 'min_child_weight' was held constant at a value of 1
## Tuning parameter 'subsample' was held constant at a value of 1
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were nrounds = 700, max_depth = 3, eta
##  = 0.3, gamma = 0, colsample_bytree = 0.6, min_child_weight = 1 and subsample
##  = 1.
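
As a side note (my addition, not part of the original run): the winning hyperparameter combination and its resampled performance can also be pulled directly from the train object.

# selected hyperparameters of the XGB model
xgbFit$bestTune

# resampled performance of the selected combination
getTrainPerf(xgbFit)
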
plot(xgbFit)

# cubist
set.seed(123)
cbGrid <- expand.grid(committees = c(1:10, 20, 50, 75, 123), 
                      neighbors = c(0, 1, 5, 9))
cbGrid
##    committees neighbors
## 1           1         0
## 2           2         0
## 3           3         0
## 4           4         0
## 5           5         0
## 6           6         0
## 7           7         0
## 8           8         0
## 9           9         0
## 10         10         0
## 11         20         0
## 12         50         0
## 13         75         0
## 14        123         0
## 15          1         1
## 16          2         1
## 17          3         1
## 18          4         1
## 19          5         1
## 20          6         1
## 21          7         1
## 22          8         1
## 23          9         1
## 24         10         1
## 25         20         1
## 26         50         1
## 27         75         1
## 28        123         1
## 29          1         5
## 30          2         5
## 31          3         5
## 32          4         5
## 33          5         5
## 34          6         5
## 35          7         5
## 36          8         5
## 37          9         5
## 38         10         5
## 39         20         5
## 40         50         5
## 41         75         5
## 42        123         5
## 43          1         9
## 44          2         9
## 45          3         9
## 46          4         9
## 47          5         9
## 48          6         9
## 49          7         9
## 50          8         9
## 51          9         9
## 52         10         9
## 53         20         9
## 54         50         9
## 55         75         9
## 56        123         9
cubistFit <- train(predictors,
                   response,
                   method = "cubist",
                   tuneGrid = cbGrid,
                   trControl = ctrl,
                   preProc = "BoxCox")
cubistFit
## Cubist 
## 
## 900 samples
##   8 predictor
## 
## Pre-processing: Box-Cox transformation (5) 
## Resampling: Cross-Validated (20 fold, repeated 20 times) 
## Summary of sample sizes: 855, 855, 856, 856, 856, 854, ... 
## Resampling results across tuning parameters:
## 
##   committees  neighbors  RMSE      Rsquared   MAE     
##    1          0          5.757070  0.8814646  4.124494
##    1          5          5.012061  0.9084429  3.407996
##    1          9          5.231181  0.9008091  3.601144
##   10          0          5.112298  0.9067381  3.768867
##   10          5          4.418290  0.9285085  3.036641
##   10          9          4.623403  0.9223662  3.211760
##   20          0          5.060287  0.9085702  3.739905
##   20          5          4.383041  0.9294864  3.017908
##   20          9          4.586975  0.9234415  3.188513
## 
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were committees = 20 and neighbors = 5.
plot(cubistFit, auto.key = list(columns = 4, lines = TRUE))

plot(varImp(cubistFit))

Summary of all models

# summary output of all models; two resamples objects are needed because GBM and
# Cubist were tuned with repeated 20-fold CV while the other models used a single
# 20-fold CV, and resamples() can only combine models that share the same resamples
resamples_cv <- resamples(list("Neural Net" = nnetFit,
                               "RF" = rfFit,
                               "XGB" = xgbFit))

resamples_rcv <- resamples(list("GBM" = gbmFit,
                                "Cubist" = cubistFit))

summary(resamples_rcv)
## 
## Call:
## summary.resamples(object = resamples_rcv)
## 
## Models: GBM, Cubist 
## Number of resamples: 400 
## 
## MAE 
##            Min.  1st Qu.   Median     Mean  3rd Qu.     Max. NA's
## GBM    1.663437 2.360057 2.604865 2.649270 2.912577 3.974157    0
## Cubist 1.804833 2.673383 2.955732 3.017908 3.326497 4.624829    0
## 
## RMSE 
##            Min.  1st Qu.   Median     Mean  3rd Qu.     Max. NA's
## GBM    2.172531 3.234712 3.753491 3.938082 4.479314 7.459319    0
## Cubist 2.287482 3.641158 4.226884 4.383041 4.919362 7.725057    0
## 
## Rsquared 
##             Min.   1st Qu.    Median      Mean   3rd Qu.      Max. NA's
## GBM    0.8080794 0.9298449 0.9508306 0.9425546 0.9635267 0.9857974    0
## Cubist 0.7644881 0.9145283 0.9381322 0.9294864 0.9528790 0.9835193    0
summary(resamples_cv)
## 
## Call:
## summary.resamples(object = resamples_cv)
## 
## Models: Neural Net, RF, XGB 
## Number of resamples: 20 
## 
## MAE 
##                Min.  1st Qu.   Median     Mean  3rd Qu.     Max. NA's
## Neural Net 2.809372 3.365640 3.647068 3.656338 3.875043 4.317641    0
## RF         2.598364 2.790693 3.421066 3.367641 3.669695 4.301928    0
## XGB        1.861652 2.327385 2.546302 2.628142 2.740838 3.687222    0
## 
## RMSE 
##                Min.  1st Qu.   Median     Mean  3rd Qu.     Max. NA's
## Neural Net 3.496230 4.282147 4.969973 4.878235 5.461264 5.911077    0
## RF         3.442241 4.066484 4.391610 4.736136 5.314887 6.810052    0
## XGB        2.851883 3.263398 3.626988 3.857600 4.023123 6.223857    0
## 
## Rsquared 
##                 Min.   1st Qu.    Median      Mean   3rd Qu.      Max. NA's
## Neural Net 0.8742637 0.8956146 0.9195148 0.9163351 0.9326244 0.9491027    0
## RF         0.8117342 0.9068681 0.9303838 0.9213526 0.9442502 0.9633361    0
## XGB        0.8473354 0.9421326 0.9545396 0.9454180 0.9620434 0.9658337    0

Model ranking based on mean RMSE

Note that GBM and Cubist were evaluated with repeated 20-fold cross-validation (400 resamples), whereas the other three models used a single 20-fold cross-validation (20 resamples), so the mean RMSE values are not computed on identical resamples. The ranking below uses the means reported above; a sketch that extracts them programmatically follows the list.

  1. XGB (chosen for the final predictions)
  2. GBM
  3. Cubist
  4. RF
  5. Neural Net
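
A minimal sketch of how the mean RMSE values can be extracted (my addition; it assumes the two resamples objects from the previous section are still in memory):

# mean resampled RMSE per model, sorted from best to worst
rmse_cv  <- summary(resamples_cv)$statistics$RMSE[, "Mean"]
rmse_rcv <- summary(resamples_rcv)$statistics$RMSE[, "Mean"]
sort(c(rmse_cv, rmse_rcv))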

Interpretable machine learning

library(iml)

# wrap the fitted XGB model and the training data for the iml package
predictor <- Predictor$new(xgbFit, data = predictors, y = response)

# permutation feature importance based on the RMSE loss
imp <- FeatureImp$new(predictor, loss = "rmse")
plot(imp)

imp$results
##            feature importance.05 importance importance.95 permutation.error
## 1              Age      9.600921   9.646738      9.889071         14.750397
## 2           Cement      8.778072   8.909604      9.212497         13.623278
## 3            Water      5.260389   5.526491      5.639696          8.450311
## 4 BlastFurnaceSlag      4.470963   4.521810      4.687091          6.914098
## 5    FineAggregate      2.542928   2.597556      2.702018          3.971807
## 6 Superplasticizer      2.515451   2.575576      2.608171          3.938198
## 7  CoarseAggregate      2.237426   2.307224      2.328720          3.527874
## 8           FlyAsh      1.616057   1.621778      1.655662          2.479788
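
By default FeatureImp reports importance as the ratio of the permutation error to the original model error, so the unpermuted RMSE can be backed out of the table above. A quick sanity check (my addition, assuming the default compare = "ratio"):

# importance = permutation.error / original RMSE, so this division should return
# roughly the same value (the original in-sample RMSE) for every feature
imp$results$permutation.error / imp$results$importance
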
# accumulated local effects (ALE) of Age on the predicted compressive strength
ale <- FeatureEffect$new(predictor, feature = "Age")
ale$plot()

# overall interaction strength of each predictor with all other predictors
interact <- Interaction$new(predictor)
plot(interact)

# two-way interaction strength of every predictor with Age
interact <- Interaction$new(predictor, feature = "Age")
plot(interact)

# surrogate decision tree (depth 2) approximating the XGB predictions
tree <- TreeSurrogate$new(predictor, maxdepth = 2)
plot(tree)

# local surrogate (LIME-style) model explaining the prediction for the first mixture
lime.explain <- LocalModel$new(predictor, x.interest = predictors[1, ])
lime.explain$results
##                        beta x.recoded     effect x.original          feature
## Cement           0.04050841     540.0 21.8745395        540           Cement
## Superplasticizer 0.46646311       2.5  1.1661578        2.5 Superplasticizer
## Age              0.03379576      28.0  0.9462812         28              Age
##                         feature.value
## Cement                     Cement=540
## Superplasticizer Superplasticizer=2.5
## Age                            Age=28
plot(lime.explain)

# Shapley values for the prediction of the first mixture
shapley <- Shapley$new(predictor, x.interest = predictors[1, ])
shapley$plot()

Apply XGB

# check the fit of the chosen model (XGB) on the training data
predictions <- predict(xgbFit, predictors)
plot(response, predictions, col = "blue")
abline(0, 1, col = "red")
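
To put numbers next to the plot (my addition): caret's postResample computes RMSE, R-squared and MAE for these in-sample predictions; they are optimistic compared with the cross-validated results reported earlier.

# in-sample performance of the chosen XGB model on the training data
postResample(pred = predictions, obs = response)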

# predict XGB on the test data
predictions <- predict(xgbFit, testdata)

Results

# combine the index and the predicted compressive strength in a data frame
final_pred <- data.frame(index = index_testdata,
                         CompressiveStrength = predictions)
write.csv(final_pred, file = "predictions.csv", row.names = FALSE)

# stop core cluster
stopCluster(cl)