Question 6.2

Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug:

Part A

Start R and use these commands to load the data:

library(AppliedPredictiveModeling)
data(permeability)

The matrix fingerprints contains the 1,107 binary molecular predictors for the 165 compounds, while permeability contains the permeability response.
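Because fingerprints is a wide 0/1 matrix, a compact summary can be easier to read than printing its rows. The following is only a sketch: summarize_binary_matrix and the toy matrix are hypothetical names introduced here for illustration, not part of the exercise or of the AppliedPredictiveModeling package.

```r
# Hypothetical helper (an assumption, not from the exercise): summarise a
# 0/1 matrix without printing every column.
summarize_binary_matrix <- function(x) {
  stopifnot(is.matrix(x))
  col_freq <- colMeans(x)  # share of rows with a 1 in each column
  list(
    n_rows     = nrow(x),
    n_cols     = ncol(x),
    all_binary = all(x %in% c(0, 1)),
    # columns that take a single value (all 0 or all 1)
    n_constant = sum(col_freq == 0 | col_freq == 1)
  )
}

# Small made-up matrix so the sketch runs without the package installed.
toy <- matrix(c(0, 1, 1,
                0, 0, 1,
                0, 1, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(NULL, c("X1", "X2", "X3")))
toy_summary <- summarize_binary_matrix(toy)
str(toy_summary)

# With the real data loaded as described above, the same call would be:
# summarize_binary_matrix(fingerprints)
```

If the data match the description above, n_rows and n_cols for fingerprints should come out as 165 and 1,107; the count of constant columns is not stated in the exercise and would have to be read from the output.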

library(AppliedPredictiveModeling)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
data(permeability)
head(fingerprints)
##   X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21
## 1  0  0  0  0  0  1  1  1  0   0   0   1   0   1   0   0   1   1   1   0   0
## 2  0  0  0  0  0  0  1  1  0   0   0   1   1   1   0   0   1   1   1   0   0
## 3  0  0  0  0  0  1  1  1  0   0   0   0   1   1   0   0   1   1   1   0   0
## 4  0  0  0  0  0  0  1  1  0   0   0   1   1   1   0   0   1   1   1   0   0
## 5  0  0  0  0  0  0  1  1  0   0   0   1   1   1   0   0   1   1   1   0   0
## 6  0  0  0  0  0  0  1  1  0   0   0   1   1   1   0   0   1   1   1   0   0
##   X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 X35 X36 X37 X38 X39 X40
## 1   1   1   1   0   0   0   0   0   1   1   1   1   1   0   0   0   0   0   0
## 2   1   1   1   0   0   0   0   0   1   1   1   1   1   0   0   0   0   0   0
## 3   1   1   1   0   0   0   0   0   1   1   1   1   1   0   0   0   0   0   0
## 4   1   1   1   0   0   0   0   0   1   1   1   1   1   0   0   0   0   0   0
## 5   1   1   1   0   0   0   0   0   1   1   1   1   1   0   0   0   0   0   0
## 6   1   1   1   0   0   0   0   0   1   1   1   1   1   0   0   0   0   0   0
##   X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 X59
## 1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 2   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 3   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 4   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 5   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 6   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
##   X60 X61 X62 X63 X64 X65 X66 X67 X68 X69 X70 X71 X72 X73 X74 X75 X76 X77 X78
## 1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 2   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 3   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 4   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 5   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 6   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
##   X79 X80 X81 X82 X83 X84 X85 X86 X87 X88 X89 X90 X91 X92 X93 X94 X95 X96 X97
## 1   0   0   1   1   1   1   1   1   1   1   1   1   1   1   0   0   0   1   1
## 2   0   0   1   1   1   1   1   1   0   0   1   1   1   1   0   0   0   1   0
## 3   0   0   1   1   1   1   1   1   0   0   1   1   1   1   0   0   0   1   0
## 4   0   0   1   1   1   1   1   1   0   0   1   1   1   1   0   0   0   1   0
## 5   0   0   1   1   1   1   1   1   0   0   1   1   1   1   0   0   0   1   0
## 6   0   0   1   1   1   1   1   1   0   0   1   1   1   1   0   0   0   1   0
##   X98 X99 X100 X101 X102 X103 X104 X105 X106 X107 X108 X109 X110 X111 X112 X113
## 1   1   0    0    1    1    0    0    0    0    0    0    0    0    0    0    0
## 2   0   0    0    1    0    0    0    0    0    0    0    0    0    0    0    0
## 3   0   0    0    1    0    1    0    0    0    0    1    0    0    1    0    0
## 4   0   0    0    1    0    0    0    0    0    0    0    0    0    0    0    0
## 5   0   0    0    1    0    0    0    0    0    0    0    0    0    0    0    0
## 6   0   0    0    1    0    0    0    0    0    0    0    0    0    0    0    0
##   X114 X115 X116 X117 X118 X119 X120 X121 X122 X123 X124 X125 X126 X127 X128
## 1    0    1    1    1    0    0    0    0    0    0    1    0    0    0    0
## 2    0    1    1    1    0    0    0    0    0    0    1    0    0    0    0
## 3    0    1    1    1    0    0    0    0    0    0    1    0    0    0    0
## 4    0    1    1    1    0    0    0    0    0    0    1    0    0    0    0
## 5    0    1    1    1    0    0    0    0    0    0    1    0    0    0    0
## 6    0    1    1    1    0    0    0    0    0    0    1    0    0    0    0
##   X129 X130 X131 X132 X133 X134 X135 X136 X137 X138 X139 X140 X141 X142 X143
## 1    0    0    0    0    0    0    0    0    1    0    0    0    0    0    1
## 2    0    0    0    0    0    0    0    0    1    0    0    0    1    0    1
## 3    0    0    0    0    0    0    0    0    1    0    0    0    0    0    1
## 4    0    0    0    0    0    0    0    0    1    0    0    0    0    0    1
## 5    0    0    0    0    0    0    0    0    1    0    0    0    0    0    1
## 6    0    0    0    0    0    0    0    0    1    0    0    0    0    0    1
##   X144 X145 X146 X147 X148 X149 X150 X151 X152 X153 X154 X155 X156 X157 X158
## 1    1    1    0    1    0    0    1    1    1    1    1    1    1    1    1
## 2    1    1    0    1    0    0    1    1    1    1    1    0    1    1    1
## 3    1    1    0    1    0    0    1    1    1    1    1    0    0    1    1
## 4    1    1    0    1    0    0    1    1    1    1    1    0    1    1    1
## 5    1    1    0    1    0    0    1    1    1    1    1    0    1    1    1
## 6    1    1    0    1    0    0    1    1    1    1    1    0    1    1    1
##   X159 X160 X161 X162 X163 X164 X165 X166 X167 X168 X169 X170 X171 X172 X173
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    1    0    1    1    1    1    1    1    1    1    1    1    1    1    1
## 3    1    0    1    1    1    1    1    1    1    1    1    1    1    1    1
## 4    1    0    1    1    1    1    1    1    1    1    1    1    1    1    1
## 5    1    0    1    1    1    1    1    1    1    1    1    1    1    1    1
## 6    1    0    1    1    1    1    1    1    1    1    1    1    1    1    1
##   X174 X175 X176 X177 X178 X179 X180 X181 X182 X183 X184 X185 X186 X187 X188
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 3    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 4    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 5    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 6    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
##   X189 X190 X191 X192 X193 X194 X195 X196 X197 X198 X199 X200 X201 X202 X203
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 3    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 4    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 5    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 6    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
##   X204 X205 X206 X207 X208 X209 X210 X211 X212 X213 X214 X215 X216 X217 X218
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    1    1    1    1    1    1    1    1    1    1    1    1    1    1    0
## 3    1    1    1    1    1    1    1    1    1    1    1    1    1    1    0
## 4    1    1    1    1    1    1    1    1    1    1    1    1    1    1    0
## 5    1    1    1    1    1    1    1    1    1    1    1    1    1    1    0
## 6    1    1    1    1    1    1    1    1    1    1    1    1    1    1    0
##   X219 X220 X221 X222 X223 X224 X225 X226 X227 X228 X229 X230 X231 X232 X233
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    1    1    1    0    1    1    0    1    1    1    1    0    1    1    1
## 3    1    1    0    0    0    0    0    0    0    0    0    1    0    0    0
## 4    1    1    1    0    1    1    0    1    1    1    1    0    1    1    1
## 5    1    1    1    0    1    1    0    1    1    1    1    0    1    1    1
## 6    1    1    1    0    1    1    0    1    1    1    1    0    1    1    1
##   X234 X235 X236 X237 X238 X239 X240 X241 X242 X243 X244 X245 X246 X247 X248
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    1    1    1    1    0    1    1    1    0    0    1    1    1    1    1
## 3    0    0    0    0    1    1    1    1    0    0    1    1    1    1    1
## 4    1    1    1    1    1    1    1    1    0    0    1    1    1    1    1
## 5    1    1    1    1    0    1    1    1    0    0    1    1    1    1    1
## 6    1    1    1    1    1    1    1    1    0    0    1    1    1    1    1
##   X249 X250 X251 X252 X253 X254 X255 X256 X257 X258 X259 X260 X261 X262 X263
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    1    0    0    0    1    1    1    1    1    0    0    1    1    1    1
## 3    1    0    0    0    1    1    1    1    1    0    0    1    1    1    1
## 4    1    0    0    0    1    1    1    1    1    0    0    1    1    1    1
## 5    1    0    0    0    1    1    1    1    1    0    0    1    1    1    1
## 6    1    0    0    0    1    1    1    1    1    0    0    1    1    1    1
##   X264 X265 X266 X267 X268 X269 X270 X271 X272 X273 X274 X275 X276 X277 X278
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    0
## 2    0    1    1    0    1    1    1    0    0    0    1    0    0    1    1
## 3    0    1    1    0    1    1    1    0    0    0    1    0    0    1    0
## 4    0    1    1    0    1    1    1    0    0    0    1    0    0    1    0
## 5    0    1    1    0    1    1    1    0    0    0    1    0    0    1    1
## 6    0    1    1    0    1    1    1    0    0    0    1    0    0    1    0
##   X279 X280 X281 X282 X283 X284 X285 X286 X287 X288 X289 X290 X291 X292 X293
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    1    1    0    0    0    0    0    0    0    0    0    0    0    1
## 5    1    1    1    1    0    1    1    1    0    0    1    1    1    1    1
## 6    0    1    1    0    0    0    0    0    0    0    0    0    0    0    1
##   X294 X295 X296 X297 X298 X299 X300 X301 X302 X303 X304 X305 X306 X307 X308
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 3    0    0    1    1    0    0    0    1    1    1    0    0    0    0    0
## 4    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 5    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 6    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
##   X309 X310 X311 X312 X313 X314 X315 X316 X317 X318 X319 X320 X321 X322 X323
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    1    1    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 4    1    1    0    1    1    0    0    0    0    0    0    0    0    0    0
## 5    1    1    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    1    1    0    1    1    0    0    0    0    0    0    0    0    0    0
##   X324 X325 X326 X327 X328 X329 X330 X331 X332 X333 X334 X335 X336 X337 X338
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 4    0    1    1    1    1    1    1    1    1    1    0    1    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    1    1    1    1    1    1    1    1    1    0    1    0    0    0
##   X339 X340 X341 X342 X343 X344 X345 X346 X347 X348 X349 X350 X351 X352 X353
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    1    1    1    1    1    1    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    1    1    1    1    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0
## 6    0    0    0    0    0    0    1    0    1    1    0    1    1    1    1
##   X354 X355 X356 X357 X358 X359 X360 X361 X362 X363 X364 X365 X366 X367 X368
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X369 X370 X371 X372 X373 X374 X375 X376 X377 X378 X379 X380 X381 X382 X383
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X384 X385 X386 X387 X388 X389 X390 X391 X392 X393 X394 X395 X396 X397 X398
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X399 X400 X401 X402 X403 X404 X405 X406 X407 X408 X409 X410 X411 X412 X413
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X414 X415 X416 X417 X418 X419 X420 X421 X422 X423 X424 X425 X426 X427 X428
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X429 X430 X431 X432 X433 X434 X435 X436 X437 X438 X439 X440 X441 X442 X443
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X444 X445 X446 X447 X448 X449 X450 X451 X452 X453 X454 X455 X456 X457 X458
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X459 X460 X461 X462 X463 X464 X465 X466 X467 X468 X469 X470 X471 X472 X473
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X474 X475 X476 X477 X478 X479 X480 X481 X482 X483 X484 X485 X486 X487 X488
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X489 X490 X491 X492 X493 X494 X495 X496 X497 X498 X499 X500 X501 X502 X503
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X504 X505 X506 X507 X508 X509 X510 X511 X512 X513 X514 X515 X516 X517 X518
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X519 X520 X521 X522 X523 X524 X525 X526 X527 X528 X529 X530 X531 X532 X533
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X534 X535 X536 X537 X538 X539 X540 X541 X542 X543 X544 X545 X546 X547 X548
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X549 X550 X551 X552 X553 X554 X555 X556 X557 X558 X559 X560 X561 X562 X563
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X564 X565 X566 X567 X568 X569 X570 X571 X572 X573 X574 X575 X576 X577 X578
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X579 X580 X581 X582 X583 X584 X585 X586 X587 X588 X589 X590 X591 X592 X593
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X594 X595 X596 X597 X598 X599 X600 X601 X602 X603 X604 X605 X606 X607 X608
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X609 X610 X611 X612 X613 X614 X615 X616 X617 X618 X619 X620 X621 X622 X623
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X624 X625 X626 X627 X628 X629 X630 X631 X632 X633 X634 X635 X636 X637 X638
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X639 X640 X641 X642 X643 X644 X645 X646 X647 X648 X649 X650 X651 X652 X653
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X654 X655 X656 X657 X658 X659 X660 X661 X662 X663 X664 X665 X666 X667 X668
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X669 X670 X671 X672 X673 X674 X675 X676 X677 X678 X679 X680 X681 X682 X683
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X684 X685 X686 X687 X688 X689 X690 X691 X692 X693 X694 X695 X696 X697 X698
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X699 X700 X701 X702 X703 X704 X705 X706 X707 X708 X709 X710 X711 X712 X713
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X714 X715 X716 X717 X718 X719 X720 X721 X722 X723 X724 X725 X726 X727 X728
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X729 X730 X731 X732 X733 X734 X735 X736 X737 X738 X739 X740 X741 X742 X743
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X744 X745 X746 X747 X748 X749 X750 X751 X752 X753 X754 X755 X756 X757 X758
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X759 X760 X761 X762 X763 X764 X765 X766 X767 X768 X769 X770 X771 X772 X773
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X774 X775 X776 X777 X778 X779 X780 X781 X782 X783 X784 X785 X786 X787 X788
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X789 X790 X791 X792 X793 X794 X795 X796 X797 X798 X799 X800 X801 X802 X803
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X804 X805 X806 X807 X808 X809 X810 X811 X812 X813 X814 X815 X816 X817 X818
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X819 X820 X821 X822 X823 X824 X825 X826 X827 X828 X829 X830 X831 X832 X833
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X834 X835 X836 X837 X838 X839 X840 X841 X842 X843 X844 X845 X846 X847 X848
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X849 X850 X851 X852 X853 X854 X855 X856 X857 X858 X859 X860 X861 X862 X863
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X864 X865 X866 X867 X868 X869 X870 X871 X872 X873 X874 X875 X876 X877 X878
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X879 X880 X881 X882 X883 X884 X885 X886 X887 X888 X889 X890 X891 X892 X893
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X894 X895 X896 X897 X898 X899 X900 X901 X902 X903 X904 X905 X906 X907 X908
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X909 X910 X911 X912 X913 X914 X915 X916 X917 X918 X919 X920 X921 X922 X923
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X924 X925 X926 X927 X928 X929 X930 X931 X932 X933 X934 X935 X936 X937 X938
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X939 X940 X941 X942 X943 X944 X945 X946 X947 X948 X949 X950 X951 X952 X953
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X954 X955 X956 X957 X958 X959 X960 X961 X962 X963 X964 X965 X966 X967 X968
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X969 X970 X971 X972 X973 X974 X975 X976 X977 X978 X979 X980 X981 X982 X983
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X984 X985 X986 X987 X988 X989 X990 X991 X992 X993 X994 X995 X996 X997 X998
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X999 X1000 X1001 X1002 X1003 X1004 X1005 X1006 X1007 X1008 X1009 X1010 X1011
## 1    0     0     0     0     0     0     0     0     0     0     0     0     0
## 2    0     0     0     0     0     0     0     0     0     0     0     0     0
## 3    0     0     0     0     0     0     0     0     0     0     0     0     0
## 4    0     0     0     0     0     0     0     0     0     0     0     0     0
## 5    0     0     0     0     0     0     0     0     0     0     0     0     0
## 6    0     0     0     0     0     0     0     0     0     0     0     0     0
##   X1012 X1013 X1014 X1015 X1016 X1017 X1018 X1019 X1020 X1021 X1022 X1023 X1024
## 1     0     0     0     0     0     0     0     0     0     0     0     0     0
## 2     0     0     0     0     0     0     0     0     0     0     0     0     0
## 3     0     0     0     0     0     0     0     0     0     0     0     0     0
## 4     0     0     0     0     0     0     0     0     0     0     0     0     0
## 5     0     0     0     0     0     0     0     0     0     0     0     0     0
## 6     0     0     0     0     0     0     0     0     0     0     0     0     0
##   X1025 X1026 X1027 X1028 X1029 X1030 X1031 X1032 X1033 X1034 X1035 X1036 X1037
## 1     0     0     0     0     0     0     0     0     0     0     0     0     0
## 2     0     0     0     0     0     0     0     0     0     0     0     0     0
## 3     0     0     0     0     0     0     0     0     0     0     0     0     0
## 4     0     0     0     0     0     0     0     0     0     0     0     0     0
## 5     0     0     0     0     0     0     0     0     0     0     0     0     0
## 6     0     0     0     0     0     0     0     0     0     0     0     0     0
##   X1038 X1039 X1040 X1041 X1042 X1043 X1044 X1045 X1046 X1047 X1048 X1049 X1050
## 1     0     0     0     0     0     0     0     0     0     0     0     0     0
## 2     0     0     0     0     0     0     0     0     0     0     0     0     0
## 3     0     0     0     0     0     0     0     0     0     0     0     0     0
## 4     0     0     0     0     0     0     0     0     0     0     0     0     0
## 5     0     0     0     0     0     0     0     0     0     0     0     0     0
## 6     0     0     0     0     0     0     0     0     0     0     0     0     0
##   X1051 X1052 X1053 X1054 X1055 X1056 X1057 X1058 X1059 X1060 X1061 X1062 X1063
## 1     0     0     0     0     0     0     0     0     0     0     0     0     0
## 2     0     0     0     0     0     0     0     0     0     0     0     0     0
## 3     0     0     0     0     0     0     0     0     0     0     0     0     0
## 4     0     0     0     0     0     0     0     0     0     0     0     0     0
## 5     0     0     0     0     0     0     0     0     0     0     0     0     0
## 6     0     0     0     0     0     0     0     0     0     0     0     0     0
##   X1064 X1065 X1066 X1067 X1068 X1069 X1070 X1071 X1072 X1073 X1074 X1075 X1076
## 1     0     0     0     0     0     0     0     0     0     0     0     0     0
## 2     0     0     0     0     0     0     0     0     0     0     0     0     0
## 3     0     0     0     0     0     0     0     0     0     0     0     0     0
## 4     0     0     0     0     0     0     0     0     0     0     0     0     0
## 5     0     0     0     0     0     0     0     0     0     0     0     0     0
## 6     0     0     0     0     0     0     0     0     0     0     0     0     0
##   X1077 X1078 X1079 X1080 X1081 X1082 X1083 X1084 X1085 X1086 X1087 X1088 X1089
## 1     0     0     0     0     0     0     0     0     0     0     0     0     0
## 2     0     0     0     0     0     0     0     0     0     0     0     0     0
## 3     0     0     0     0     0     0     0     0     0     0     0     0     0
## 4     0     0     0     0     0     0     0     0     0     0     0     0     0
## 5     0     0     0     0     0     0     0     0     0     0     0     0     0
## 6     0     0     0     0     0     0     0     0     0     0     0     0     0
##   X1090 X1091 X1092 X1093 X1094 X1095 X1096 X1097 X1098 X1099 X1100 X1101 X1102
## 1     0     0     0     0     0     0     0     0     0     0     0     0     0
## 2     0     0     0     0     0     0     0     0     0     0     0     0     0
## 3     0     0     0     0     0     0     0     0     0     0     0     0     0
## 4     0     0     0     0     0     0     0     0     0     0     0     0     0
## 5     0     0     0     0     0     0     0     0     0     0     0     0     0
## 6     0     0     0     0     0     0     0     0     0     0     0     0     0
##   X1103 X1104 X1105 X1106 X1107
## 1     0     0     0     0     0
## 2     0     0     0     0     0
## 3     0     0     0     0     0
## 4     0     0     0     0     0
## 5     0     0     0     0     0
## 6     0     0     0     0     0

Part B

The fingerprint predictors indicate the presence or absence of substructures of a molecule and are often sparse, meaning that relatively few of the molecules contain each substructure. Filter out the predictors that have low frequencies using the nearZeroVar function from the caret package. How many predictors are left for modeling?

library(caret)
## Warning: package 'caret' was built under R version 4.3.3
## Loading required package: ggplot2
## Loading required package: lattice
nearZeroPreds <- nearZeroVar(fingerprints)
nearZeroPreds
##   [1]    7    8    9   10   13   14   17   18   19   22   23   24   30   31   32
##  [16]   33   34   45   77   81   82   83   84   85   89   90   91   92   95  100
##  [31]  104  105  106  107  109  110  112  113  114  115  116  117  119  120  122
##  [46]  123  124  128  131  132  134  135  136  137  139  140  144  145  147  148
##  [61]  149  151  155  160  161  164  165  166  216  217  218  219  220  222  243
##  [76]  252  259  273  275  277  282  283  287  288  289  292  346  347  348  349
##  [91]  350  351  352  353  354  363  364  365  369  375  379  384  391  393  397
## [106]  399  402  404  405  407  408  409  410  411  412  413  414  415  416  417
## [121]  418  419  420  421  422  423  424  425  426  427  428  429  430  431  432
## [136]  433  434  435  436  437  438  439  440  441  442  443  444  445  446  447
## [151]  448  449  450  451  452  453  454  455  456  457  458  459  460  461  462
## [166]  463  464  465  466  467  468  469  470  471  472  473  474  475  476  477
## [181]  478  479  480  481  482  483  484  485  486  487  488  489  490  491  492
## [196]  493  494  495  498  500  501  502  513  523  525  526  527  528  530  531
## [211]  532  533  534  535  536  537  538  539  540  541  542  543  544  545  546
## [226]  547  548  550  552  555  562  563  564  566  567  569  570  572  575  578
## [241]  579  580  581  582  583  584  585  586  587  588  589  596  605  606  607
## [256]  608  609  610  611  612  614  615  616  617  618  619  620  622  623  624
## [271]  625  626  627  628  629  630  631  632  633  634  635  636  637  638  639
## [286]  640  641  642  643  644  645  646  647  648  649  650  651  652  653  654
## [301]  655  656  657  658  659  660  661  662  663  664  665  666  667  668  669
## [316]  670  671  672  673  674  675  676  677  678  680  681  682  683  684  685
## [331]  686  687  688  689  690  691  692  693  694  695  696  697  706  707  708
## [346]  709  710  711  712  713  714  715  716  717  718  720  721  722  723  724
## [361]  725  726  727  728  729  730  731  734  735  736  737  738  739  740  741
## [376]  742  743  744  745  746  747  748  749  756  757  758  759  760  761  762
## [391]  763  764  765  766  767  768  769  770  771  772  777  778  779  781  783
## [406]  784  785  786  787  788  789  790  791  794  796  797  799  802  803  804
## [421]  807  808  809  810  811  814  815  816  817  818  819  820  821  822  823
## [436]  824  825  826  827  828  829  830  831  832  833  834  835  836  837  838
## [451]  839  840  841  842  843  844  845  846  847  848  849  850  851  852  853
## [466]  854  855  856  857  858  859  860  861  862  863  864  865  866  867  868
## [481]  869  870  871  872  873  874  875  876  877  878  879  880  881  882  883
## [496]  884  885  886  887  888  889  890  891  892  893  894  895  896  897  898
## [511]  899  900  901  902  903  904  905  906  907  908  909  910  911  912  913
## [526]  914  915  916  917  918  919  920  921  922  923  924  925  926  927  928
## [541]  929  930  931  932  933  934  935  936  937  938  939  940  941  942  943
## [556]  944  945  946  947  948  949  950  951  952  953  954  955  956  957  958
## [571]  959  960  961  962  963  964  965  966  967  968  969  970  971  972  973
## [586]  974  975  976  977  978  979  980  981  982  983  984  985  986  987  988
## [601]  989  990  991  992  993  994  995  996  997  998  999 1000 1001 1002 1003
## [616] 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018
## [631] 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033
## [646] 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048
## [661] 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063
## [676] 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078
## [691] 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093
## [706] 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107
length(nearZeroPreds)
## [1] 719

Of the 1,107 predictors, 719 are flagged by nearZeroVar as having near-zero variance.
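To make the filtering criterion concrete, here is a minimal base R sketch of the two statistics that nearZeroVar is documented to use: the frequency ratio (most common value over second most common) and the percentage of unique values. The toy matrix and its column names are invented for illustration, and the thresholds mirror caret's documented defaults (freqCut = 95/5, uniqueCut = 10). This is an approximation of the idea, not caret's actual implementation.

```r
# Toy binary matrix: 100 hypothetical compounds, 3 hypothetical fingerprint columns.
toy <- cbind(
  common   = rep(c(0, 1), times = c(60, 40)),  # substructure present in 40 of 100 rows
  rare     = rep(c(0, 1), times = c(98, 2)),   # substructure present in only 2 rows
  constant = rep(0, 100)                       # never present: zero variance
)

# Frequency ratio: count of the most common value / count of the runner-up.
freq_ratio <- apply(toy, 2, function(col) {
  counts <- sort(table(col), decreasing = TRUE)
  if (length(counts) < 2) return(Inf)  # a single distinct value (assumed convention here)
  as.numeric(counts[1] / counts[2])
})

# Percentage of distinct values relative to the number of rows.
pct_unique <- apply(toy, 2, function(col) 100 * length(unique(col)) / length(col))

# Flag columns that exceed the frequency cutoff and have few unique values.
flagged <- which(freq_ratio > 95 / 5 & pct_unique <= 10)
names(flagged)
```

In this toy case only the balanced column survives; the rare and constant columns are flagged, which matches the intuition that a substructure found in almost no compounds carries little information for a model.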

Now let's filter out these predictors and see how many remain.

filtered_fingerprints <- fingerprints[, -nearZeroPreds]
head(filtered_fingerprints)
##   X1 X2 X3 X4 X5 X6 X11 X12 X15 X16 X20 X21 X25 X26 X27 X28 X29 X35 X36 X37 X38
## 1  0  0  0  0  0  1   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0
## 2  0  0  0  0  0  0   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0
## 3  0  0  0  0  0  1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 4  0  0  0  0  0  0   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0
## 5  0  0  0  0  0  0   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0
## 6  0  0  0  0  0  0   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0
##   X39 X40 X41 X42 X43 X44 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58
## 1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 2   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 3   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 4   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 5   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 6   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
##   X59 X60 X61 X62 X63 X64 X65 X66 X67 X68 X69 X70 X71 X72 X73 X74 X75 X76 X78
## 1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 2   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 3   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 4   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 5   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## 6   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
##   X79 X80 X86 X87 X88 X93 X94 X96 X97 X98 X99 X101 X102 X103 X108 X111 X118
## 1   0   0   1   1   1   0   0   1   1   1   0    1    1    0    0    0    0
## 2   0   0   1   0   0   0   0   1   0   0   0    1    0    0    0    0    0
## 3   0   0   1   0   0   0   0   1   0   0   0    1    0    1    1    1    0
## 4   0   0   1   0   0   0   0   1   0   0   0    1    0    0    0    0    0
## 5   0   0   1   0   0   0   0   1   0   0   0    1    0    0    0    0    0
## 6   0   0   1   0   0   0   0   1   0   0   0    1    0    0    0    0    0
##   X121 X125 X126 X127 X129 X130 X133 X138 X141 X142 X143 X146 X150 X152 X153
## 1    0    0    0    0    0    0    0    0    0    0    1    0    1    1    1
## 2    0    0    0    0    0    0    0    0    1    0    1    0    1    1    1
## 3    0    0    0    0    0    0    0    0    0    0    1    0    1    1    1
## 4    0    0    0    0    0    0    0    0    0    0    1    0    1    1    1
## 5    0    0    0    0    0    0    0    0    0    0    1    0    1    1    1
## 6    0    0    0    0    0    0    0    0    0    0    1    0    1    1    1
##   X154 X156 X157 X158 X159 X162 X163 X167 X168 X169 X170 X171 X172 X173 X174
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 3    1    0    1    1    1    1    1    1    1    1    1    1    1    1    1
## 4    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 5    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 6    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
##   X175 X176 X177 X178 X179 X180 X181 X182 X183 X184 X185 X186 X187 X188 X189
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 3    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 4    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 5    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 6    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
##   X190 X191 X192 X193 X194 X195 X196 X197 X198 X199 X200 X201 X202 X203 X204
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 3    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 4    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 5    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 6    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
##   X205 X206 X207 X208 X209 X210 X211 X212 X213 X214 X215 X221 X223 X224 X225
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    1    1    1    1    1    1    1    1    1    1    1    1    1    1    0
## 3    1    1    1    1    1    1    1    1    1    1    1    0    0    0    0
## 4    1    1    1    1    1    1    1    1    1    1    1    1    1    1    0
## 5    1    1    1    1    1    1    1    1    1    1    1    1    1    1    0
## 6    1    1    1    1    1    1    1    1    1    1    1    1    1    1    0
##   X226 X227 X228 X229 X230 X231 X232 X233 X234 X235 X236 X237 X238 X239 X240
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    1    1    1    1    0    1    1    1    1    1    1    1    0    1    1
## 3    0    0    0    0    1    0    0    0    0    0    0    0    1    1    1
## 4    1    1    1    1    0    1    1    1    1    1    1    1    1    1    1
## 5    1    1    1    1    0    1    1    1    1    1    1    1    0    1    1
## 6    1    1    1    1    0    1    1    1    1    1    1    1    1    1    1
##   X241 X242 X244 X245 X246 X247 X248 X249 X250 X251 X253 X254 X255 X256 X257
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    1    0    1    1    1    1    1    1    0    0    1    1    1    1    1
## 3    1    0    1    1    1    1    1    1    0    0    1    1    1    1    1
## 4    1    0    1    1    1    1    1    1    0    0    1    1    1    1    1
## 5    1    0    1    1    1    1    1    1    0    0    1    1    1    1    1
## 6    1    0    1    1    1    1    1    1    0    0    1    1    1    1    1
##   X258 X260 X261 X262 X263 X264 X265 X266 X267 X268 X269 X270 X271 X272 X274
## 1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 2    0    1    1    1    1    0    1    1    0    1    1    1    0    0    1
## 3    0    1    1    1    1    0    1    1    0    1    1    1    0    0    1
## 4    0    1    1    1    1    0    1    1    0    1    1    1    0    0    1
## 5    0    1    1    1    1    0    1    1    0    1    1    1    0    0    1
## 6    0    1    1    1    1    0    1    1    0    1    1    1    0    0    1
##   X276 X278 X279 X280 X281 X284 X285 X286 X290 X291 X293 X294 X295 X296 X297
## 1    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    1    1
## 4    0    0    0    1    1    0    0    0    0    0    1    1    1    1    1
## 5    0    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 6    0    0    0    1    1    0    0    0    0    0    1    1    1    1    1
##   X298 X299 X300 X301 X302 X303 X304 X305 X306 X307 X308 X309 X310 X311 X312
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    1    1    1    1    1    1    1    1    1    1    1    1    1    0    0
## 3    0    0    0    1    1    1    0    0    0    0    0    1    1    1    1
## 4    1    1    1    1    1    1    1    1    1    1    1    1    1    0    1
## 5    1    1    1    1    1    1    1    1    1    1    1    1    1    0    0
## 6    1    1    1    1    1    1    1    1    1    1    1    1    1    0    1
##   X313 X314 X315 X316 X317 X318 X319 X320 X321 X322 X323 X324 X325 X326 X327
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 4    1    0    0    0    0    0    0    0    0    0    0    0    1    1    1
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    1    0    0    0    0    0    0    0    0    0    0    0    1    1    1
##   X328 X329 X330 X331 X332 X333 X334 X335 X336 X337 X338 X339 X340 X341 X342
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## 4    1    1    1    1    1    1    0    1    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    1    1    1    1    1    1    0    1    0    0    0    0    0    0    0
##   X343 X344 X345 X355 X356 X357 X358 X359 X360 X361 X362 X366 X367 X368 X370
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    1    1    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0
##   X371 X372 X373 X374 X376 X377 X378 X380 X381 X382 X383 X385 X386 X387 X388
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X389 X390 X392 X394 X395 X396 X398 X400 X401 X403 X406 X496 X497 X499 X503
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X504 X505 X506 X507 X508 X509 X510 X511 X512 X514 X515 X516 X517 X518 X519
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X520 X521 X522 X524 X529 X549 X551 X553 X554 X556 X557 X558 X559 X560 X561
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X565 X568 X571 X573 X574 X576 X577 X590 X591 X592 X593 X594 X595 X597 X598
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X599 X600 X601 X602 X603 X604 X613 X621 X679 X698 X699 X700 X701 X702 X703
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X704 X705 X719 X732 X733 X750 X751 X752 X753 X754 X755 X773 X774 X775 X776
## 1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
##   X780 X782 X792 X793 X795 X798 X800 X801 X805 X806 X812 X813
## 1    0    0    0    0    0    0    0    0    0    0    0    0
## 2    0    0    0    0    0    0    0    0    0    0    0    0
## 3    0    0    0    0    0    0    0    0    0    0    0    0
## 4    0    0    0    0    0    0    0    0    0    0    0    0
## 5    0    0    0    0    0    0    0    0    0    0    0    0
## 6    0    0    0    0    0    0    0    0    0    0    0    0
ncol(filtered_fingerprints)
## [1] 388

We are left with 388 predictors (1,107 - 719) that were not flagged as near-zero variance.
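One caveat about dropping columns with a negative index: it works here because 719 columns were flagged, but if the filter had returned an empty vector, the same expression would select zero columns rather than keeping all of them. The sketch below shows this on a toy matrix; drop_columns is a hypothetical helper written for illustration, not part of caret.

```r
# Toy matrix standing in for the fingerprint matrix.
toy <- matrix(c(0, 1, 1, 0, 1, 0), nrow = 3,
              dimnames = list(NULL, c("A", "B")))
drop_idx <- integer(0)  # what an index-based filter returns when nothing is flagged

# -integer(0) is still integer(0), so this selects no columns at all.
ncol(toy[, -drop_idx, drop = FALSE])

# Hypothetical guard: return the input unchanged when there is nothing to drop.
drop_columns <- function(x, idx) {
  if (length(idx) == 0) return(x)
  x[, -idx, drop = FALSE]
}

ncol(drop_columns(toy, drop_idx))  # both columns kept
ncol(drop_columns(toy, 2L))        # column "B" removed
```

Using drop = FALSE also keeps the result a matrix when only one column remains.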

Part C

First, split the data into training and test sets. I will use a roughly 70/30 train/test split.

set.seed(123)  # for reproducibility
train_index <- createDataPartition(permeability, p = 0.7, list = FALSE)

X_train <- filtered_fingerprints[train_index, ]
y_train <- permeability[train_index]

X_test <- filtered_fingerprints[-train_index, ]
y_test <- permeability[-train_index]
dim(X_train)
## [1] 117 388
length(y_train)
## [1] 117
dim(X_test)
## [1]  48 388
length(y_test)
## [1] 48
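For a numeric response, createDataPartition is documented to group the outcome by percentiles and sample within each group, so that the training and test sets cover a similar range of the response. The base R sketch below illustrates that idea on simulated data (not the permeability values); the quintile bins and the per-bin rounding are assumptions made for this illustration and may differ from caret's internals.

```r
set.seed(123)
y <- rexp(165, rate = 0.1)  # simulated skewed response, NOT the permeability data

# Bin the response into quintile groups.
breaks <- quantile(y, probs = seq(0, 1, by = 0.2))
bins <- cut(y, breaks = breaks, include.lowest = TRUE)

# Sample about 70% of the row indices within each bin.
train_idx <- sort(unlist(lapply(split(seq_along(y), bins), function(rows) {
  rows[sample.int(length(rows), size = ceiling(0.7 * length(rows)))]
}), use.names = FALSE))

length(train_idx)              # training set size
length(y) - length(train_idx)  # test set size

# Compare the response distribution in the two sets.
summary(y[train_idx])
summary(y[-train_idx])
```

Because the sample size is rounded within each bin, the training fraction can come out slightly above 70%, which is consistent with the 117/48 split seen above.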

Now that we have training and test sets, let's check whether the data needs any preprocessing.

summary(X_train)
##        X1              X2               X3               X4        
##  Min.   :0.000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.265   Mean   :0.2564   Mean   :0.2393   Mean   :0.2393  
##  3rd Qu.:1.000   3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##        X5               X6             X11              X12        
##  Min.   :0.0000   Min.   :0.000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.000   Median :0.0000   Median :1.0000  
##  Mean   :0.2393   Mean   :0.453   Mean   :0.1538   Mean   :0.7179  
##  3rd Qu.:0.0000   3rd Qu.:1.000   3rd Qu.:0.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.000   Max.   :1.0000   Max.   :1.0000  
##       X15              X16              X20              X21        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.1111   Mean   :0.2821   Mean   :0.2393   Mean   :0.2393  
##  3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X25              X26              X27              X28        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.2991   Mean   :0.2393   Mean   :0.2393   Mean   :0.2393  
##  3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X29              X35              X36              X37        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.2393   Mean   :0.2479   Mean   :0.1966   Mean   :0.2991  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X38              X39              X40              X41       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.000  
##  Mean   :0.2393   Mean   :0.2393   Mean   :0.2393   Mean   :0.359  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:1.000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.000  
##       X42              X43              X44              X46        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.3504   Mean   :0.3504   Mean   :0.3504   Mean   :0.2393  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X47              X48              X49              X50        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.2393   Mean   :0.2222   Mean   :0.2991   Mean   :0.2393  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X51             X52              X53              X54        
##  Min.   :0.000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.359   Mean   :0.3504   Mean   :0.3504   Mean   :0.2393  
##  3rd Qu.:1.000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
##  Max.   :1.000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X55              X56              X57              X58        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.2393   Mean   :0.2393   Mean   :0.2222   Mean   :0.2222  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X59              X60              X61              X62        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.2393   Mean   :0.2222   Mean   :0.2222   Mean   :0.2393  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X63              X64              X65              X66        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.2393   Mean   :0.2393   Mean   :0.2393   Mean   :0.2222  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X67              X68              X69              X70        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.2222   Mean   :0.2222   Mean   :0.2222   Mean   :0.2393  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X71              X72              X73             X74       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.000  
##  Median :0.0000   Median :0.0000   Median :0.000   Median :0.000  
##  Mean   :0.2991   Mean   :0.2991   Mean   :0.265   Mean   :0.265  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.000   3rd Qu.:1.000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.000   Max.   :1.000  
##       X75             X76             X78             X79        
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.0000  
##  1st Qu.:0.000   1st Qu.:0.000   1st Qu.:0.000   1st Qu.:0.0000  
##  Median :0.000   Median :0.000   Median :0.000   Median :0.0000  
##  Mean   :0.265   Mean   :0.265   Mean   :0.359   Mean   :0.3504  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.0000  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.0000  
##       X80             X86              X87              X88        
##  Min.   :0.000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.000   Median :1.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.359   Mean   :0.7009   Mean   :0.4359   Mean   :0.1624  
##  3rd Qu.:1.000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
##  Max.   :1.000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X93              X94               X96              X97        
##  Min.   :0.0000   Min.   :0.00000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:1.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.00000   Median :1.0000   Median :1.0000  
##  Mean   :0.2564   Mean   :0.06838   Mean   :0.7949   Mean   :0.5299  
##  3rd Qu.:1.0000   3rd Qu.:0.00000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.00000   Max.   :1.0000   Max.   :1.0000  
##       X98              X99               X101             X102       
##  Min.   :0.0000   Min.   :0.00000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.00000   Median :1.0000   Median :0.0000  
##  Mean   :0.2564   Mean   :0.09402   Mean   :0.7009   Mean   :0.4188  
##  3rd Qu.:1.0000   3rd Qu.:0.00000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.00000   Max.   :1.0000   Max.   :1.0000  
##       X103             X108             X111             X118        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.00000  
##  Mean   :0.1709   Mean   :0.1709   Mean   :0.4444   Mean   :0.06838  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:0.00000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.00000  
##       X121             X125             X126              X127        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.0000   Median :0.0000   Median :0.00000   Median :0.00000  
##  Mean   :0.3419   Mean   :0.1538   Mean   :0.07692   Mean   :0.07692  
##  3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.00000  
##       X129             X130             X133             X138       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.1709   Mean   :0.1538   Mean   :0.2821   Mean   :0.2821  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X141             X142             X143             X146        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:1.0000   1st Qu.:0.00000  
##  Median :0.0000   Median :0.0000   Median :1.0000   Median :0.00000  
##  Mean   :0.2991   Mean   :0.2393   Mean   :0.8803   Mean   :0.05983  
##  3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:0.00000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.00000  
##       X150             X152             X153             X154       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.5983   Mean   :0.6239   Mean   :0.6154   Mean   :0.6154  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X156             X157             X158             X159       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:1.0000   1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.7863   Mean   :0.8034   Mean   :0.5641   Mean   :0.5214  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X162             X163             X167             X168       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.6239   Mean   :0.6154   Mean   :0.6154   Mean   :0.6154  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X169             X170             X171             X172       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:1.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.6154   Mean   :0.6154   Mean   :0.9402   Mean   :0.6154  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X173             X174             X175             X176       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.6154   Mean   :0.6154   Mean   :0.6068   Mean   :0.6068  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X177             X178             X179             X180       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.6068   Mean   :0.6068   Mean   :0.6239   Mean   :0.6154  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X181             X182             X183             X184       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.5983   Mean   :0.6325   Mean   :0.6154   Mean   :0.6068  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X185             X186             X187             X188       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.6068   Mean   :0.6068   Mean   :0.6239   Mean   :0.6154  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X189             X190             X191             X192       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.6154   Mean   :0.5983   Mean   :0.5983   Mean   :0.6154  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X193             X194             X195             X196       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.6068   Mean   :0.6068   Mean   :0.6239   Mean   :0.6239  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X197             X198             X199             X200       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.6239   Mean   :0.6154   Mean   :0.6154   Mean   :0.5983  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X201             X202             X203             X204       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.5983   Mean   :0.5983   Mean   :0.5983   Mean   :0.6154  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X205             X206             X207             X208       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.6154   Mean   :0.6325   Mean   :0.6325   Mean   :0.5983  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X209             X210             X211             X212       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.5983   Mean   :0.5983   Mean   :0.5983   Mean   :0.6068  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X213             X214             X215             X221       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:1.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.6068   Mean   :0.6068   Mean   :0.9402   Mean   :0.7436  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X223             X224             X225              X226       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:1.0000  
##  Median :1.0000   Median :1.0000   Median :0.00000   Median :1.0000  
##  Mean   :0.7436   Mean   :0.7436   Mean   :0.04274   Mean   :0.7949  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.00000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.0000  
##       X227             X228             X229             X230       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:1.0000   1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :0.0000  
##  Mean   :0.7949   Mean   :0.7949   Mean   :0.6325   Mean   :0.2393  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X231            X232            X233            X234      
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:1.000   1st Qu.:1.000   1st Qu.:1.000   1st Qu.:1.000  
##  Median :1.000   Median :1.000   Median :1.000   Median :1.000  
##  Mean   :0.812   Mean   :0.812   Mean   :0.812   Mean   :0.812  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.000  
##       X235             X236             X237             X238       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :0.0000  
##  Mean   :0.5726   Mean   :0.6325   Mean   :0.5556   Mean   :0.2821  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X239             X240             X241             X242       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:1.0000   1st Qu.:1.0000   1st Qu.:1.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :0.0000  
##  Mean   :0.8034   Mean   :0.8034   Mean   :0.7607   Mean   :0.4701  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X244             X245             X246             X247       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:1.0000   1st Qu.:1.0000   1st Qu.:1.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.8034   Mean   :0.8034   Mean   :0.8034   Mean   :0.7265  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X248             X249             X250             X251      
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.000  
##  1st Qu.:0.0000   1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.000  
##  Median :1.0000   Median :1.0000   Median :0.0000   Median :0.000  
##  Mean   :0.5812   Mean   :0.7607   Mean   :0.4701   Mean   :0.188  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.000  
##       X253             X254             X255             X256       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:1.0000   1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.8034   Mean   :0.8034   Mean   :0.7265   Mean   :0.5812  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X257            X258              X260            X261       
##  Min.   :0.000   Min.   :0.00000   Min.   :0.000   Min.   :0.0000  
##  1st Qu.:0.000   1st Qu.:0.00000   1st Qu.:0.000   1st Qu.:1.0000  
##  Median :1.000   Median :0.00000   Median :1.000   Median :1.0000  
##  Mean   :0.547   Mean   :0.09402   Mean   :0.735   Mean   :0.7607  
##  3rd Qu.:1.000   3rd Qu.:0.00000   3rd Qu.:1.000   3rd Qu.:1.0000  
##  Max.   :1.000   Max.   :1.00000   Max.   :1.000   Max.   :1.0000  
##       X262             X263             X264             X265      
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.000  
##  1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.000  
##  Median :1.0000   Median :1.0000   Median :0.0000   Median :1.000  
##  Mean   :0.7521   Mean   :0.6667   Mean   :0.3846   Mean   :0.735  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.000  
##       X266             X267             X268             X269       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.7521   Mean   :0.5128   Mean   :0.6752   Mean   :0.5214  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X270             X271            X272             X274       
##  Min.   :0.0000   Min.   :0.000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.000   Median :0.0000   Median :1.0000  
##  Mean   :0.4786   Mean   :0.359   Mean   :0.4103   Mean   :0.6154  
##  3rd Qu.:1.0000   3rd Qu.:1.000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.000   Max.   :1.0000   Max.   :1.0000  
##       X276             X278             X279             X280       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.1709   Mean   :0.1795   Mean   :0.1795   Mean   :0.1795  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X281             X284             X285             X286       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.1795   Mean   :0.1795   Mean   :0.1795   Mean   :0.1795  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X290             X291             X293              X294        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.0000   Median :0.0000   Median :0.00000   Median :0.00000  
##  Mean   :0.1795   Mean   :0.1795   Mean   :0.07692   Mean   :0.07692  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.00000  
##       X295              X296             X297             X298       
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.00000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.08547   Mean   :0.2991   Mean   :0.1282   Mean   :0.1795  
##  3rd Qu.:0.00000   3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X299             X300             X301             X302       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.1795   Mean   :0.1795   Mean   :0.2991   Mean   :0.2991  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X303             X304             X305             X306        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.00000  
##  Mean   :0.1282   Mean   :0.1795   Mean   :0.1795   Mean   :0.05983  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.00000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.00000  
##       X307              X308              X309             X310       
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.00000   Median :0.00000   Median :0.0000   Median :0.0000  
##  Mean   :0.07692   Mean   :0.07692   Mean   :0.1197   Mean   :0.1709  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.0000   Max.   :1.0000  
##       X311             X312             X313             X314       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.2222   Mean   :0.2735   Mean   :0.2479   Mean   :0.2821  
##  3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X315            X316              X317              X318        
##  Min.   :0.000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :1.000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.641   Mean   :0.08547   Mean   :0.08547   Mean   :0.08547  
##  3rd Qu.:1.000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##       X319            X320             X321             X322       
##  Min.   :0.000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.188   Mean   :0.1795   Mean   :0.1795   Mean   :0.2308  
##  3rd Qu.:0.000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X323             X324             X325             X326       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.2051   Mean   :0.2051   Mean   :0.2735   Mean   :0.2479  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X327             X328             X329             X330       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.2479   Mean   :0.2479   Mean   :0.1368   Mean   :0.2479  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X331             X332             X333             X334        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.00000  
##  Mean   :0.2479   Mean   :0.2479   Mean   :0.2479   Mean   :0.09402  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.00000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.00000  
##       X335             X336             X337              X338      
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.000  
##  Median :0.0000   Median :0.0000   Median :0.00000   Median :0.000  
##  Mean   :0.1368   Mean   :0.1026   Mean   :0.07692   Mean   :0.359  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:1.000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.000  
##       X339            X340             X341            X342       
##  Min.   :0.000   Min.   :0.0000   Min.   :0.000   Min.   :0.0000  
##  1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.0000  
##  Median :0.000   Median :0.0000   Median :0.000   Median :0.0000  
##  Mean   :0.359   Mean   :0.1197   Mean   :0.359   Mean   :0.3419  
##  3rd Qu.:1.000   3rd Qu.:0.0000   3rd Qu.:1.000   3rd Qu.:1.0000  
##  Max.   :1.000   Max.   :1.0000   Max.   :1.000   Max.   :1.0000  
##       X343             X344             X345              X355      
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.000  
##  Median :0.0000   Median :0.0000   Median :0.00000   Median :0.000  
##  Mean   :0.3419   Mean   :0.1197   Mean   :0.05983   Mean   :0.265  
##  3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:1.000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.000  
##       X356             X357             X358             X359      
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.000  
##  Mean   :0.2821   Mean   :0.2308   Mean   :0.2735   Mean   :0.188  
##  3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:0.000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.000  
##       X360            X361             X362             X366       
##  Min.   :0.000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.188   Mean   :0.1282   Mean   :0.1709   Mean   :0.1966  
##  3rd Qu.:0.000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X367             X368             X370             X371       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.2308   Mean   :0.2821   Mean   :0.2308   Mean   :0.2308  
##  3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X372              X373              X374             X376        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.0000   Median :0.00000  
##  Mean   :0.06838   Mean   :0.06838   Mean   :0.1282   Mean   :0.07692  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.0000   Max.   :1.00000  
##       X377              X378              X380              X381        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.08547   Mean   :0.08547   Mean   :0.08547   Mean   :0.08547  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##       X382              X383              X385              X386        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.08547   Mean   :0.08547   Mean   :0.08547   Mean   :0.08547  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##       X387              X388              X389              X390        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.08547   Mean   :0.08547   Mean   :0.08547   Mean   :0.08547  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##       X392              X394              X395              X396        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.08547   Mean   :0.08547   Mean   :0.08547   Mean   :0.08547  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##       X398              X400              X401              X403        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.08547   Mean   :0.08547   Mean   :0.08547   Mean   :0.08547  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##       X406              X496             X497             X499       
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.00000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.08547   Mean   :0.1197   Mean   :0.1282   Mean   :0.1368  
##  3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X503             X504             X505             X506       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.1197   Mean   :0.2479   Mean   :0.2479   Mean   :0.2479  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X507             X508             X509             X510       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.1795   Mean   :0.1709   Mean   :0.2222   Mean   :0.1026  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X511              X512             X514              X515        
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.0000   Median :0.00000   Median :0.00000  
##  Mean   :0.08547   Mean   :0.1111   Mean   :0.08547   Mean   :0.08547  
##  3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.0000   Max.   :1.00000   Max.   :1.00000  
##       X516             X517             X518              X519        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.0000   Median :0.0000   Median :0.00000   Median :0.00000  
##  Mean   :0.1026   Mean   :0.1026   Mean   :0.08547   Mean   :0.09402  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.00000  
##       X520             X521             X522              X524       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.00000   Median :0.0000  
##  Mean   :0.1111   Mean   :0.1111   Mean   :0.08547   Mean   :0.1026  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.0000  
##       X529              X549             X551              X553        
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.0000   Median :0.00000   Median :0.00000  
##  Mean   :0.08547   Mean   :0.2222   Mean   :0.05128   Mean   :0.05128  
##  3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.0000   Max.   :1.00000   Max.   :1.00000  
##       X554              X556             X557             X558       
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.00000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.05128   Mean   :0.2222   Mean   :0.2222   Mean   :0.2222  
##  3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X559             X560             X561              X565       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.00000   Median :0.0000  
##  Mean   :0.2222   Mean   :0.2222   Mean   :0.05983   Mean   :0.2222  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.0000  
##       X568              X571            X573             X574      
##  Min.   :0.00000   Min.   :0.000   Min.   :0.0000   Min.   :0.000  
##  1st Qu.:0.00000   1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.000  
##  Median :0.00000   Median :0.000   Median :0.0000   Median :0.000  
##  Mean   :0.05128   Mean   :0.188   Mean   :0.2137   Mean   :0.188  
##  3rd Qu.:0.00000   3rd Qu.:0.000   3rd Qu.:0.0000   3rd Qu.:0.000  
##  Max.   :1.00000   Max.   :1.000   Max.   :1.0000   Max.   :1.000  
##       X576            X577             X590              X591        
##  Min.   :0.000   Min.   :0.0000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.000   Median :0.0000   Median :0.00000   Median :0.00000  
##  Mean   :0.188   Mean   :0.1111   Mean   :0.07692   Mean   :0.07692  
##  3rd Qu.:0.000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.000   Max.   :1.0000   Max.   :1.00000   Max.   :1.00000  
##       X592              X593              X594              X595        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.09402   Mean   :0.09402   Mean   :0.09402   Mean   :0.07692  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##       X597             X598             X599             X600       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.1197   Mean   :0.1282   Mean   :0.1709   Mean   :0.1197  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X601             X602             X603             X604       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.1709   Mean   :0.1795   Mean   :0.1709   Mean   :0.1197  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X613              X621              X679              X698        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.05128   Mean   :0.05128   Mean   :0.09402   Mean   :0.04274  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##       X699              X700             X701             X702       
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.00000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.08547   Mean   :0.1111   Mean   :0.1111   Mean   :0.1111  
##  3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##       X703              X704              X705             X719        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.0000   Median :0.00000  
##  Mean   :0.08547   Mean   :0.07692   Mean   :0.1111   Mean   :0.04274  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.0000   Max.   :1.00000  
##       X732              X733              X750              X751        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.05128   Mean   :0.05128   Mean   :0.05983   Mean   :0.08547  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##       X752              X753              X754              X755        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.05983   Mean   :0.05983   Mean   :0.08547   Mean   :0.08547  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##       X773              X774              X775              X776        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.08547   Mean   :0.08547   Mean   :0.08547   Mean   :0.08547  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##       X780              X782              X792              X793        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.05128   Mean   :0.05128   Mean   :0.05128   Mean   :0.05128  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##       X795              X798              X800              X801        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.06838   Mean   :0.06838   Mean   :0.05128   Mean   :0.05128  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##       X805              X806              X812              X813        
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.06838   Mean   :0.05128   Mean   :0.05983   Mean   :0.05128  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000

All 388 of our predictors are binary (0/1), so they are already on a common scale and do not need any further preprocessing or normalization.
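The claim that every predictor is binary can be checked directly rather than read off the summary. The sketch below uses a small made-up matrix, `toy_fp`, as a stand-in for the fingerprint data; the matrix and its values are assumptions for illustration only.

```r
# Toy stand-in for the fingerprint matrix (hypothetical data, not the real set)
toy_fp <- matrix(c(0, 1, 1, 0,
                   1, 1, 0, 0,
                   0, 0, 1, 1),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(NULL, paste0("X", 1:4)))

# TRUE for each column that contains only 0s and 1s
is_binary <- apply(toy_fp, 2, function(col) all(col %in% c(0, 1)))

# TRUE only if every predictor is binary
all_binary <- all(is_binary)
print(all_binary)
```

Assuming `X_train` is a numeric matrix, the same idea applied to the real data would be `all(X_train %in% c(0, 1))`.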

Now let's tune a PLS model.

ctrl <- trainControl(method = "cv", number = 10)

pls_model <- train(
  x = X_train,
  y = y_train,
  method = "pls",
  tuneLength = 30,  # try up to 30 latent variables
  trControl = ctrl)

plot(pls_model)

pls_model$results
##    ncomp     RMSE  Rsquared       MAE   RMSESD RsquaredSD    MAESD
## 1      1 13.99057 0.2632922 10.990276 2.666266  0.2042171 2.329243
## 2      2 12.36473 0.3927908  8.739689 3.094271  0.2891991 2.228934
## 3      3 12.28337 0.4203126  8.988762 3.219348  0.2831496 2.259523
## 4      4 12.61118 0.4200091  9.744753 3.367148  0.2821755 2.064688
## 5      5 12.47332 0.4398682  9.363258 3.937844  0.2764332 2.876394
## 6      6 12.27722 0.4454306  9.154514 4.328274  0.2746557 3.228768
## 7      7 12.42897 0.4293957  9.409811 4.435112  0.2695746 3.368292
## 8      8 12.54118 0.4303471  9.644543 4.060634  0.2725991 3.160871
## 9      9 13.15973 0.4013557 10.178219 4.148851  0.2794787 3.160699
## 10    10 13.46107 0.3842828 10.324400 4.188993  0.2740585 3.153917
## 11    11 13.92600 0.3705277 10.589860 4.267311  0.2761292 3.272115
## 12    12 14.22097 0.3555753 10.756449 4.276586  0.2833077 3.225461
## 13    13 14.59459 0.3399373 11.034311 4.142828  0.2891632 3.002470
## 14    14 15.14949 0.3132850 11.297229 3.985753  0.2880336 2.920489
## 15    15 15.52003 0.2982553 11.538648 3.962501  0.2860026 2.716527
## 16    16 15.91983 0.2824934 11.888595 3.916828  0.2843008 2.538151
## 17    17 16.35493 0.2673083 12.270187 3.725130  0.2769962 2.536216
## 18    18 16.67870 0.2619140 12.485529 3.867268  0.2722359 2.626051
## 19    19 16.78528 0.2610212 12.582949 4.032575  0.2756357 2.652309
## 20    20 17.02744 0.2493200 12.804788 4.134513  0.2804832 2.713170
## 21    21 16.95746 0.2565461 12.823946 3.816953  0.2750488 2.398808
## 22    22 17.01795 0.2571057 12.857521 3.827173  0.2725787 2.486868
## 23    23 17.06730 0.2542541 12.951830 3.577713  0.2646589 2.465520
## 24    24 17.39181 0.2481465 13.124719 3.647713  0.2574169 2.548013
## 25    25 17.65716 0.2380617 13.262988 3.727740  0.2541769 2.661649
## 26    26 17.89782 0.2361610 13.449007 3.805809  0.2475813 2.783008
## 27    27 18.30201 0.2275197 13.704734 3.662568  0.2400155 2.673213
## 28    28 18.56365 0.2213178 13.978130 3.678705  0.2344072 2.710218
## 29    29 18.75258 0.2173062 14.054935 3.687866  0.2331955 2.792637
## 30    30 18.97718 0.2139653 14.242906 3.567992  0.2329017 2.761886

Now let's find the optimal number of components and the corresponding R squared.

pls_model$bestTune$ncomp
## [1] 6
pls_model$results %>%
  filter(ncomp == pls_model$bestTune$ncomp)
##   ncomp     RMSE  Rsquared      MAE   RMSESD RsquaredSD    MAESD
## 1     6 12.27722 0.4454306 9.154514 4.328274  0.2746557 3.228768

Cross-validation selects 6 components, which gives the lowest RMSE (12.28). The plot shows the RMSE staying roughly flat from 2 to 8 components and rising steadily from 9 components onward, so a smaller model such as 3 components (RMSE 12.28) would perform almost as well. The cross-validated R squared for the 6-component model is about 0.445.
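Because the RMSE curve is nearly flat over several component counts, it can help to compare the plain minimum against the one-standard-error rule, which prefers the simplest model whose RMSE is within one standard error of the best. The sketch below re-types the first ten rows of the resampling table above into a data frame (`cv` and `n_folds` are names introduced here for illustration) and applies both rules by hand in base R.

```r
# First ten rows of the resampling table above (RMSE and its fold-to-fold SD)
cv <- data.frame(
  ncomp  = 1:10,
  RMSE   = c(13.99057, 12.36473, 12.28337, 12.61118, 12.47332,
             12.27722, 12.42897, 12.54118, 13.15973, 13.46107),
  RMSESD = c(2.666266, 3.094271, 3.219348, 3.367148, 3.937844,
             4.328274, 4.435112, 4.060634, 4.148851, 4.188993)
)
n_folds <- 10  # matches trainControl(method = "cv", number = 10)

# Rule 1: the component count with the lowest cross-validated RMSE
best <- cv[which.min(cv$RMSE), ]

# Rule 2: the smallest component count whose RMSE is within one
# standard error (SD / sqrt(folds)) of the best RMSE
threshold    <- best$RMSE + best$RMSESD / sqrt(n_folds)
one_se_ncomp <- min(cv$ncomp[cv$RMSE <= threshold])

print(best$ncomp)
print(one_se_ncomp)
```

On these numbers the minimum sits at 6 components, while the one-standard-error rule settles on 2, which reflects how wide the fold-to-fold variation is with so few compounds. caret exposes a similar rule through `trainControl(selectionFunction = "oneSE")`, though its result on the full model object may differ from this hand calculation.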

Part D

Predict the response for the test set. What is the test set estimate of R2?

pls_predictions <- predict(pls_model, newdata = X_test)
pls_test_values <- data.frame(obs = y_test,pred = pls_predictions)
defaultSummary(pls_test_values)
##     RMSE Rsquared      MAE 
## 9.604105 0.578010 7.116708

We get a test set R squared of about 0.58.
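For reference, the three numbers reported by `defaultSummary` can be reproduced by hand. The sketch below uses made-up `obs` and `pred` vectors (not the permeability data) and assumes, as caret does for regression, that R squared is the squared correlation between observed and predicted values.

```r
# Hypothetical observed and predicted values, for illustration only
obs  <- c(1, 2, 3, 4)
pred <- c(1.5, 1.5, 3.5, 3.5)

# Root mean squared error: typical size of a prediction error
rmse <- sqrt(mean((pred - obs)^2))

# R squared as the squared correlation between observed and predicted
r2 <- cor(obs, pred)^2

# Mean absolute error
mae <- mean(abs(pred - obs))

print(c(RMSE = rmse, Rsquared = r2, MAE = mae))
```

Because this R squared is a squared correlation, it measures how well the predictions track the ordering and spread of the observations, not whether they are free of bias, so it is worth reading alongside the RMSE.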

Part E

Try building other models discussed in this chapter. Do any have better predictive performance?

Let's try a linear model (lm) first.

X_train_lm <- as.data.frame(X_train)
X_test_lm <- as.data.frame(X_test)
lm_model <- lm(y_train ~.,data=X_train_lm)
summary(lm_model)
## 
## Call:
## lm(formula = y_train ~ ., data = X_train_lm)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -16.992  -1.436   0.000   1.496  16.992 
## 
## Coefficients: (301 not defined because of singularities)
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept)   21.75781   16.68495   1.304  0.20248   
## X1            91.05271   39.12365   2.327  0.02714 * 
## X2           -26.32381   37.39691  -0.704  0.48711   
## X3           107.66697   57.87925   1.860  0.07303 . 
## X4                  NA         NA      NA       NA   
## X5                  NA         NA      NA       NA   
## X6             5.31207    3.24128   1.639  0.11204   
## X11          -24.72696   15.54502  -1.591  0.12253   
## X12           17.23917   18.80956   0.917  0.36696   
## X15          -26.73304   21.52307  -1.242  0.22416   
## X16           10.38152    8.10295   1.281  0.21027   
## X20                 NA         NA      NA       NA   
## X21                 NA         NA      NA       NA   
## X25          -16.34576   12.39314  -1.319  0.19751   
## X26                 NA         NA      NA       NA   
## X27                 NA         NA      NA       NA   
## X28                 NA         NA      NA       NA   
## X29                 NA         NA      NA       NA   
## X35          -16.44797   22.65900  -0.726  0.47372   
## X36           10.04017   15.52915   0.647  0.52302   
## X37                 NA         NA      NA       NA   
## X38                 NA         NA      NA       NA   
## X39                 NA         NA      NA       NA   
## X40                 NA         NA      NA       NA   
## X41          -41.04600   26.66559  -1.539  0.13458   
## X42           -1.79470   33.73087  -0.053  0.95793   
## X43                 NA         NA      NA       NA   
## X44                 NA         NA      NA       NA   
## X46                 NA         NA      NA       NA   
## X47                 NA         NA      NA       NA   
## X48         -117.99150   58.67730  -2.011  0.05372 . 
## X49                 NA         NA      NA       NA   
## X50                 NA         NA      NA       NA   
## X51                 NA         NA      NA       NA   
## X52                 NA         NA      NA       NA   
## X53                 NA         NA      NA       NA   
## X54                 NA         NA      NA       NA   
## X55                 NA         NA      NA       NA   
## X56                 NA         NA      NA       NA   
## X57                 NA         NA      NA       NA   
## X58                 NA         NA      NA       NA   
## X59                 NA         NA      NA       NA   
## X60                 NA         NA      NA       NA   
## X61                 NA         NA      NA       NA   
## X62                 NA         NA      NA       NA   
## X63                 NA         NA      NA       NA   
## X64                 NA         NA      NA       NA   
## X65                 NA         NA      NA       NA   
## X66                 NA         NA      NA       NA   
## X67                 NA         NA      NA       NA   
## X68                 NA         NA      NA       NA   
## X69                 NA         NA      NA       NA   
## X70                 NA         NA      NA       NA   
## X71                 NA         NA      NA       NA   
## X72                 NA         NA      NA       NA   
## X73                 NA         NA      NA       NA   
## X74                 NA         NA      NA       NA   
## X75                 NA         NA      NA       NA   
## X76                 NA         NA      NA       NA   
## X78                 NA         NA      NA       NA   
## X79                 NA         NA      NA       NA   
## X80                 NA         NA      NA       NA   
## X86          -40.02772   38.08069  -1.051  0.30188   
## X87           91.44258   53.80166   1.700  0.09991 . 
## X88           25.08852   58.29007   0.430  0.67008   
## X93           -1.54417    3.75482  -0.411  0.68391   
## X94           -5.92987   59.59247  -0.100  0.92142   
## X96          -14.63353   25.89929  -0.565  0.57641   
## X97                 NA         NA      NA       NA   
## X98          -35.39208   33.81212  -1.047  0.30387   
## X99           10.46745   21.92733   0.477  0.63668   
## X101                NA         NA      NA       NA   
## X102         -34.27179   39.62013  -0.865  0.39413   
## X103          12.61535   22.22723   0.568  0.57470   
## X108                NA         NA      NA       NA   
## X111          32.96925   39.04938   0.844  0.40542   
## X118         -24.45919   12.75555  -1.918  0.06507 . 
## X121         -24.05350   32.22390  -0.746  0.46141   
## X125          84.16069   43.36335   1.941  0.06206 . 
## X126         -20.28304   21.52307  -0.942  0.35378   
## X127                NA         NA      NA       NA   
## X129                NA         NA      NA       NA   
## X130                NA         NA      NA       NA   
## X133                NA         NA      NA       NA   
## X138                NA         NA      NA       NA   
## X141           8.24932    3.79875   2.172  0.03821 * 
## X142                NA         NA      NA       NA   
## X143         -23.38986   15.16888  -1.542  0.13393   
## X146          -8.73395   79.12507  -0.110  0.91287   
## X150         -53.51175   35.72791  -1.498  0.14500   
## X152                NA         NA      NA       NA   
## X153          60.12744   36.70351   1.638  0.11219   
## X154                NA         NA      NA       NA   
## X156          16.54078   14.89454   1.111  0.27590   
## X157          62.41889   50.10856   1.246  0.22285   
## X158         -10.61579   24.02420  -0.442  0.66185   
## X159           4.44864   18.08771   0.246  0.80745   
## X162                NA         NA      NA       NA   
## X163                NA         NA      NA       NA   
## X167                NA         NA      NA       NA   
## X168                NA         NA      NA       NA   
## X169                NA         NA      NA       NA   
## X170                NA         NA      NA       NA   
## X171                NA         NA      NA       NA   
## X172                NA         NA      NA       NA   
## X173                NA         NA      NA       NA   
## X174                NA         NA      NA       NA   
## X175                NA         NA      NA       NA   
## X176                NA         NA      NA       NA   
## X177                NA         NA      NA       NA   
## X178                NA         NA      NA       NA   
## X179                NA         NA      NA       NA   
## X180                NA         NA      NA       NA   
## X181                NA         NA      NA       NA   
## X182         -11.12379   14.68256  -0.758  0.45479   
## X183                NA         NA      NA       NA   
## X184                NA         NA      NA       NA   
## X185                NA         NA      NA       NA   
## X186                NA         NA      NA       NA   
## X187                NA         NA      NA       NA   
## X188                NA         NA      NA       NA   
## X189                NA         NA      NA       NA   
## X190                NA         NA      NA       NA   
## X191                NA         NA      NA       NA   
## X192                NA         NA      NA       NA   
## X193                NA         NA      NA       NA   
## X194                NA         NA      NA       NA   
## X195                NA         NA      NA       NA   
## X196                NA         NA      NA       NA   
## X197                NA         NA      NA       NA   
## X198                NA         NA      NA       NA   
## X199                NA         NA      NA       NA   
## X200                NA         NA      NA       NA   
## X201                NA         NA      NA       NA   
## X202                NA         NA      NA       NA   
## X203                NA         NA      NA       NA   
## X204                NA         NA      NA       NA   
## X205                NA         NA      NA       NA   
## X206                NA         NA      NA       NA   
## X207                NA         NA      NA       NA   
## X208                NA         NA      NA       NA   
## X209                NA         NA      NA       NA   
## X210                NA         NA      NA       NA   
## X211                NA         NA      NA       NA   
## X212                NA         NA      NA       NA   
## X213                NA         NA      NA       NA   
## X214                NA         NA      NA       NA   
## X215                NA         NA      NA       NA   
## X221          32.59700   16.93457   1.925  0.06410 . 
## X223                NA         NA      NA       NA   
## X224                NA         NA      NA       NA   
## X225          34.43026   41.76048   0.824  0.41640   
## X226         -61.63237   28.66969  -2.150  0.04005 * 
## X227                NA         NA      NA       NA   
## X228                NA         NA      NA       NA   
## X229          32.27830   50.46355   0.640  0.52743   
## X230         -79.16118   47.50128  -1.667  0.10638   
## X231                NA         NA      NA       NA   
## X232                NA         NA      NA       NA   
## X233                NA         NA      NA       NA   
## X234                NA         NA      NA       NA   
## X235         -22.95537   22.78222  -1.008  0.32198   
## X236                NA         NA      NA       NA   
## X237           5.11373   25.14573   0.203  0.84027   
## X238          94.28135   38.30746   2.461  0.02004 * 
## X239                NA         NA      NA       NA   
## X240                NA         NA      NA       NA   
## X241          28.51292   38.95339   0.732  0.47006   
## X242         -52.54391   34.64402  -1.517  0.14017   
## X244                NA         NA      NA       NA   
## X245                NA         NA      NA       NA   
## X246                NA         NA      NA       NA   
## X247          -0.80546   29.28971  -0.027  0.97825   
## X248         -21.23364   36.21432  -0.586  0.56219   
## X249                NA         NA      NA       NA   
## X250                NA         NA      NA       NA   
## X251          14.76330   77.22922   0.191  0.84973   
## X253                NA         NA      NA       NA   
## X254                NA         NA      NA       NA   
## X255                NA         NA      NA       NA   
## X256                NA         NA      NA       NA   
## X257          26.49963   30.70300   0.863  0.39517   
## X258         -35.05909   20.15904  -1.739  0.09262 . 
## X260                NA         NA      NA       NA   
## X261                NA         NA      NA       NA   
## X262         -33.80935   50.61650  -0.668  0.50945   
## X263                NA         NA      NA       NA   
## X264                NA         NA      NA       NA   
## X265                NA         NA      NA       NA   
## X266                NA         NA      NA       NA   
## X267                NA         NA      NA       NA   
## X268                NA         NA      NA       NA   
## X269                NA         NA      NA       NA   
## X270                NA         NA      NA       NA   
## X271                NA         NA      NA       NA   
## X272          12.12996   25.34775   0.479  0.63585   
## X274                NA         NA      NA       NA   
## X276                NA         NA      NA       NA   
## X278          -2.85268    4.74243  -0.602  0.55217   
## X279                NA         NA      NA       NA   
## X280          -0.62424    7.51866  -0.083  0.93440   
## X281                NA         NA      NA       NA   
## X284                NA         NA      NA       NA   
## X285                NA         NA      NA       NA   
## X286                NA         NA      NA       NA   
## X290                NA         NA      NA       NA   
## X291                NA         NA      NA       NA   
## X293          28.78680   18.63848   1.544  0.13332   
## X294                NA         NA      NA       NA   
## X295         -30.06120   15.49440  -1.940  0.06214 . 
## X296                NA         NA      NA       NA   
## X297                NA         NA      NA       NA   
## X298                NA         NA      NA       NA   
## X299                NA         NA      NA       NA   
## X300                NA         NA      NA       NA   
## X301                NA         NA      NA       NA   
## X302                NA         NA      NA       NA   
## X303                NA         NA      NA       NA   
## X304                NA         NA      NA       NA   
## X305                NA         NA      NA       NA   
## X306          -8.73133   12.46353  -0.701  0.48917   
## X307                NA         NA      NA       NA   
## X308                NA         NA      NA       NA   
## X309                NA         NA      NA       NA   
## X310                NA         NA      NA       NA   
## X311          15.35786   16.64430   0.923  0.36377   
## X312         -63.00108   34.06227  -1.850  0.07459 . 
## X313                NA         NA      NA       NA   
## X314                NA         NA      NA       NA   
## X315          -7.76439    7.79402  -0.996  0.32739   
## X316           9.89220   14.59697   0.678  0.50334   
## X317                NA         NA      NA       NA   
## X318                NA         NA      NA       NA   
## X319          37.27258   33.85179   1.101  0.27993   
## X320                NA         NA      NA       NA   
## X321                NA         NA      NA       NA   
## X322                NA         NA      NA       NA   
## X323                NA         NA      NA       NA   
## X324                NA         NA      NA       NA   
## X325                NA         NA      NA       NA   
## X326                NA         NA      NA       NA   
## X327                NA         NA      NA       NA   
## X328                NA         NA      NA       NA   
## X329         -27.63501   21.83687  -1.266  0.21576   
## X330                NA         NA      NA       NA   
## X331                NA         NA      NA       NA   
## X332                NA         NA      NA       NA   
## X333                NA         NA      NA       NA   
## X334          18.27825   16.70182   1.094  0.28279   
## X335                NA         NA      NA       NA   
## X336                NA         NA      NA       NA   
## X337         -36.88771   21.11743  -1.747  0.09126 . 
## X338          -5.21076   18.19264  -0.286  0.77659   
## X339                NA         NA      NA       NA   
## X340         -29.35301   16.98895  -1.728  0.09467 . 
## X341                NA         NA      NA       NA   
## X342          39.84663   17.37555   2.293  0.02927 * 
## X343                NA         NA      NA       NA   
## X344                NA         NA      NA       NA   
## X345          -4.38150   10.09999  -0.434  0.66763   
## X355                NA         NA      NA       NA   
## X356                NA         NA      NA       NA   
## X357         -20.79701   17.79084  -1.169  0.25193   
## X358         -33.67432   11.49169  -2.930  0.00654 **
## X359         -26.47488   26.01400  -1.018  0.31723   
## X360                NA         NA      NA       NA   
## X361          -8.78529   10.88965  -0.807  0.42637   
## X362                NA         NA      NA       NA   
## X366                NA         NA      NA       NA   
## X367                NA         NA      NA       NA   
## X368                NA         NA      NA       NA   
## X370          26.99739   20.70275   1.304  0.20247   
## X371                NA         NA      NA       NA   
## X372                NA         NA      NA       NA   
## X373                NA         NA      NA       NA   
## X374          -0.59805    7.52041  -0.080  0.93716   
## X376         -12.63455   24.63548  -0.513  0.61193   
## X377                NA         NA      NA       NA   
## X378                NA         NA      NA       NA   
## X380                NA         NA      NA       NA   
## X381                NA         NA      NA       NA   
## X382                NA         NA      NA       NA   
## X383                NA         NA      NA       NA   
## X385                NA         NA      NA       NA   
## X386                NA         NA      NA       NA   
## X387                NA         NA      NA       NA   
## X388                NA         NA      NA       NA   
## X389                NA         NA      NA       NA   
## X390                NA         NA      NA       NA   
## X392                NA         NA      NA       NA   
## X394                NA         NA      NA       NA   
## X395                NA         NA      NA       NA   
## X396                NA         NA      NA       NA   
## X398                NA         NA      NA       NA   
## X400                NA         NA      NA       NA   
## X401                NA         NA      NA       NA   
## X403                NA         NA      NA       NA   
## X406                NA         NA      NA       NA   
## X496           6.79975    9.83342   0.691  0.49475   
## X497                NA         NA      NA       NA   
## X499                NA         NA      NA       NA   
## X503         -20.91595   19.74615  -1.059  0.29823   
## X504                NA         NA      NA       NA   
## X505                NA         NA      NA       NA   
## X506                NA         NA      NA       NA   
## X507          14.12446   16.88804   0.836  0.40979   
## X508                NA         NA      NA       NA   
## X509          11.71199   20.24931   0.578  0.56747   
## X510                NA         NA      NA       NA   
## X511                NA         NA      NA       NA   
## X512                NA         NA      NA       NA   
## X514                NA         NA      NA       NA   
## X515                NA         NA      NA       NA   
## X516                NA         NA      NA       NA   
## X517                NA         NA      NA       NA   
## X518                NA         NA      NA       NA   
## X519                NA         NA      NA       NA   
## X520                NA         NA      NA       NA   
## X521                NA         NA      NA       NA   
## X522                NA         NA      NA       NA   
## X524                NA         NA      NA       NA   
## X529                NA         NA      NA       NA   
## X549                NA         NA      NA       NA   
## X551                NA         NA      NA       NA   
## X553                NA         NA      NA       NA   
## X554                NA         NA      NA       NA   
## X556                NA         NA      NA       NA   
## X557                NA         NA      NA       NA   
## X558                NA         NA      NA       NA   
## X559                NA         NA      NA       NA   
## X560                NA         NA      NA       NA   
## X561                NA         NA      NA       NA   
## X565                NA         NA      NA       NA   
## X568                NA         NA      NA       NA   
## X571         -19.80546   14.51603  -1.364  0.18294   
## X573                NA         NA      NA       NA   
## X574                NA         NA      NA       NA   
## X576                NA         NA      NA       NA   
## X577                NA         NA      NA       NA   
## X590                NA         NA      NA       NA   
## X591                NA         NA      NA       NA   
## X592                NA         NA      NA       NA   
## X593                NA         NA      NA       NA   
## X594                NA         NA      NA       NA   
## X595                NA         NA      NA       NA   
## X597                NA         NA      NA       NA   
## X598                NA         NA      NA       NA   
## X599                NA         NA      NA       NA   
## X600                NA         NA      NA       NA   
## X601                NA         NA      NA       NA   
## X602                NA         NA      NA       NA   
## X603                NA         NA      NA       NA   
## X604                NA         NA      NA       NA   
## X613                NA         NA      NA       NA   
## X621                NA         NA      NA       NA   
## X679                NA         NA      NA       NA   
## X698                NA         NA      NA       NA   
## X699                NA         NA      NA       NA   
## X700                NA         NA      NA       NA   
## X701                NA         NA      NA       NA   
## X702                NA         NA      NA       NA   
## X703                NA         NA      NA       NA   
## X704                NA         NA      NA       NA   
## X705                NA         NA      NA       NA   
## X719                NA         NA      NA       NA   
## X732          -5.60363    5.92410  -0.946  0.35201   
## X733                NA         NA      NA       NA   
## X750          -0.02374   12.62240  -0.002  0.99851   
## X751                NA         NA      NA       NA   
## X752                NA         NA      NA       NA   
## X753                NA         NA      NA       NA   
## X754                NA         NA      NA       NA   
## X755                NA         NA      NA       NA   
## X773                NA         NA      NA       NA   
## X774                NA         NA      NA       NA   
## X775                NA         NA      NA       NA   
## X776                NA         NA      NA       NA   
## X780                NA         NA      NA       NA   
## X782                NA         NA      NA       NA   
## X792                NA         NA      NA       NA   
## X793                NA         NA      NA       NA   
## X795                NA         NA      NA       NA   
## X798                NA         NA      NA       NA   
## X800                NA         NA      NA       NA   
## X801                NA         NA      NA       NA   
## X805                NA         NA      NA       NA   
## X806                NA         NA      NA       NA   
## X812                NA         NA      NA       NA   
## X813                NA         NA      NA       NA   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.669 on 29 degrees of freedom
## Multiple R-squared:  0.9437, Adjusted R-squared:  0.7747 
## F-statistic: 5.584 on 87 and 29 DF,  p-value: 1.027e-06
lm_predictions <- predict(lm_model, newdata = X_test_lm)
## Warning in predict.lm(lm_model, newdata = X_test_lm): prediction from
## rank-deficient fit; attr(*, "non-estim") has doubtful cases
lm_test_values <- data.frame(obs = y_test,pred = lm_predictions)
defaultSummary(lm_test_values)
##       RMSE   Rsquared        MAE 
## 30.3087993  0.2130654 20.4964880

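The lm model gives a test R-squared of 0.21 (RMSE 30.31), which is not an improvement over the PLS model. The fit is also rank-deficient: many coefficients are NA, and a training R-squared of 0.94 against a test R-squared of 0.21 points to heavy overfitting.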
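To see where the NA coefficients and the rank-deficiency warning come from, here is a minimal sketch on made-up toy data (not the permeability data). It assumes nothing beyond base R. With more predictors than observations, lm() can only estimate as many coefficients as the rank of the design matrix allows and reports the rest as NA. The same thing happens above: 87 predictors plus the intercept are estimated from 117 training samples, leaving 29 residual degrees of freedom.

```r
# Toy example (hypothetical data): 10 observations, 15 binary predictors
set.seed(1)
n <- 10
p <- 15
X <- matrix(rbinom(n * p, size = 1, prob = 0.5), nrow = n,
            dimnames = list(NULL, paste0("X", seq_len(p))))
y <- rnorm(n)
toy <- data.frame(y = y, X)

fit <- lm(y ~ ., data = toy)

# Including the intercept, at most n coefficients can be estimated;
# lm() marks the ones it cannot estimate (aliased) as NA
n_estimated <- sum(!is.na(coef(fit)))
n_aliased   <- sum(is.na(coef(fit)))
cat("estimated:", n_estimated, " aliased (NA):", n_aliased, "\n")
cat("rank of the fit:", fit$rank, "\n")
```

Predictions from a fit like this depend on which columns happened to be dropped, which is why predict() warns about "doubtful cases".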

Let's try ridge regression.

library(elasticnet)
## Warning: package 'elasticnet' was built under R version 4.3.3
## Loading required package: lars
## Warning: package 'lars' was built under R version 4.3.3
## Loaded lars 1.3
ridge_grid <- data.frame(.lambda = seq(0, .1, length = 15))
set.seed(100)
ridgeRegFit <- train(X_train_lm, y_train, method = "ridge",tuneGrid = ridge_grid,trControl = ctrl)
ridgeRegFit
## Ridge Regression 
## 
## 117 samples
## 388 predictors
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 105, 106, 105, 105, 105, 106, ... 
## Resampling results across tuning parameters:
## 
##   lambda       RMSE          Rsquared   MAE         
##   0.000000000  3.420508e+20  0.1511077  1.852177e+20
##   0.007142857  5.488365e+03  0.2478913  3.583052e+03
##   0.014285714  1.668384e+01  0.3015268  1.243951e+01
##   0.021428571  1.603268e+01  0.3181464  1.204384e+01
##   0.028571429  1.565874e+01  0.3292670  1.181993e+01
##   0.035714286  1.541853e+01  0.3378542  1.167494e+01
##   0.042857143  1.519109e+01  0.3455063  1.147924e+01
##   0.050000000  1.504670e+01  0.3514188  1.136776e+01
##   0.057142857  1.498083e+01  0.3530273  1.128466e+01
##   0.064285714  1.481946e+01  0.3620584  1.118037e+01
##   0.071428571  1.473146e+01  0.3664619  1.110786e+01
##   0.078571429  1.466120e+01  0.3704889  1.104677e+01
##   0.085714286  1.459697e+01  0.3743362  1.099071e+01
##   0.092857143  1.454235e+01  0.3778743  1.094360e+01
##   0.100000000  1.451263e+01  0.3807330  1.090971e+01
## 
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was lambda = 0.1.
ridge_preds <- predict(ridgeRegFit, newdata = X_test_lm)
ridge_results_test <- data.frame(obs = y_test,pred = ridge_preds)
defaultSummary(ridge_results_test)
##       RMSE   Rsquared        MAE 
## 11.2148255  0.5311375  8.7431850

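The tuned ridge model does give a better test R-squared of 0.53, but its RMSE is slightly higher than that of the PLS model. Note that the selected lambda of 0.1 is the largest value in the grid and the cross-validated RMSE was still decreasing there, so a wider grid might improve the fit further.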
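The huge RMSE at lambda = 0 and the sensible values once lambda is positive can be illustrated with the textbook closed form for ridge coefficients. This is a sketch on hypothetical toy data using base R only. It is not how the elasticnet package computes its solution, and its lambda is not on the same scale as the one tuned above. The point is only that adding lambda to the diagonal makes a singular cross-product matrix invertible and shrinks the coefficients as lambda grows.

```r
# Toy example (hypothetical data): more predictors than observations
set.seed(1)
n <- 10
p <- 15
X <- scale(matrix(rnorm(n * p), nrow = n))  # centered and scaled predictors
y <- rnorm(n)
y <- y - mean(y)                            # centered response, no intercept needed

XtX <- crossprod(X)        # p x p matrix, but its rank is below p
rank_XtX <- qr(XtX)$rank

# Textbook ridge solution: (X'X + lambda * I)^(-1) X'y
ridge_coef <- function(lambda) {
  solve(XtX + lambda * diag(p), crossprod(X, y))
}

lambdas <- c(0.01, 0.1, 1, 10)
coef_norms <- sapply(lambdas, function(l) sqrt(sum(ridge_coef(l)^2)))

cat("rank of X'X:", rank_XtX, "out of", p, "\n")
print(data.frame(lambda = lambdas, coef_norm = coef_norms))
```

The coefficient norm drops as lambda increases, which is the shrinkage that keeps the ridge fit stable where the unpenalized fit is not.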

Let's try an elastic net model.

enetGrid <- expand.grid(.lambda = c(0, 0.01, .1),.fraction = seq(.05, 1, length = 20))
set.seed(100)
enetTune <- train(X_train_lm, y_train,method = "enet",tuneGrid = enetGrid,trControl = ctrl)
enetTune
## Elasticnet 
## 
## 117 samples
## 388 predictors
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 105, 106, 105, 105, 105, 106, ... 
## Resampling results across tuning parameters:
## 
##   lambda  fraction  RMSE          Rsquared   MAE         
##   0.00    0.05      1.711668e+19  0.2298755  9.263289e+18
##   0.00    0.10      3.423267e+19  0.1865586  1.852815e+19
##   0.00    0.15      5.133658e+19  0.1635670  2.778953e+19
##   0.00    0.20      6.843741e+19  0.1532548  3.705001e+19
##   0.00    0.25      8.553824e+19  0.1455426  4.631049e+19
##   0.00    0.30      1.026391e+20  0.1400844  5.557097e+19
##   0.00    0.35      1.197399e+20  0.1358755  6.483145e+19
##   0.00    0.40      1.368407e+20  0.1336335  7.409193e+19
##   0.00    0.45      1.539416e+20  0.1324509  8.335241e+19
##   0.00    0.50      1.710424e+20  0.1322443  9.261289e+19
##   0.00    0.55      1.881432e+20  0.1326111  1.018734e+20
##   0.00    0.60      2.052441e+20  0.1330571  1.111339e+20
##   0.00    0.65      2.223449e+20  0.1333027  1.203943e+20
##   0.00    0.70      2.394458e+20  0.1322226  1.296548e+20
##   0.00    0.75      2.565466e+20  0.1312588  1.389153e+20
##   0.00    0.80      2.736475e+20  0.1312852  1.481758e+20
##   0.00    0.85      2.907483e+20  0.1368224  1.574363e+20
##   0.00    0.90      3.078491e+20  0.1425114  1.666967e+20
##   0.00    0.95      3.249500e+20  0.1470285  1.759572e+20
##   0.00    1.00      3.420508e+20  0.1511077  1.852177e+20
##   0.01    0.05      1.172782e+01  0.5088716  8.659022e+00
##   0.01    0.10      1.195538e+01  0.4768486  8.923971e+00
##   0.01    0.15      1.232927e+01  0.4433672  9.321255e+00
##   0.01    0.20      1.258429e+01  0.4269542  9.477163e+00
##   0.01    0.25      1.300489e+01  0.4092402  9.756145e+00
##   0.01    0.30      1.351828e+01  0.3893960  1.014309e+01
##   0.01    0.35      1.402245e+01  0.3697915  1.050417e+01
##   0.01    0.40      1.437275e+01  0.3573559  1.078588e+01
##   0.01    0.45      1.468341e+01  0.3491428  1.102210e+01
##   0.01    0.50      1.497927e+01  0.3425462  1.123914e+01
##   0.01    0.55      1.529098e+01  0.3336770  1.144284e+01
##   0.01    0.60      1.562572e+01  0.3224807  1.167871e+01
##   0.01    0.65      1.596847e+01  0.3119691  1.190944e+01
##   0.01    0.70      1.625201e+01  0.3050226  1.212393e+01
##   0.01    0.75      1.650293e+01  0.3002499  1.228912e+01
##   0.01    0.80      1.671373e+01  0.2971098  1.241612e+01
##   0.01    0.85      1.692242e+01  0.2930528  1.253676e+01
##   0.01    0.90      1.711738e+01  0.2894612  1.267407e+01
##   0.01    0.95      1.720204e+01  0.2886183  1.273618e+01
##   0.01    1.00      1.725984e+01  0.2883368  1.277181e+01
##   0.10    0.05      1.241968e+01  0.5109244  9.698252e+00
##   0.10    0.10      1.157462e+01  0.5189631  8.349230e+00
##   0.10    0.15      1.174377e+01  0.4996312  8.602587e+00
##   0.10    0.20      1.201987e+01  0.4802363  8.842522e+00
##   0.10    0.25      1.229078e+01  0.4606681  9.143619e+00
##   0.10    0.30      1.251363e+01  0.4442865  9.357213e+00
##   0.10    0.35      1.267395e+01  0.4356267  9.505066e+00
##   0.10    0.40      1.288603e+01  0.4261813  9.692488e+00
##   0.10    0.45      1.310680e+01  0.4184598  9.872842e+00
##   0.10    0.50      1.335035e+01  0.4096525  1.004887e+01
##   0.10    0.55      1.356841e+01  0.4019001  1.021269e+01
##   0.10    0.60      1.376548e+01  0.3953202  1.036546e+01
##   0.10    0.65      1.392382e+01  0.3906756  1.048748e+01
##   0.10    0.70      1.403235e+01  0.3885828  1.057177e+01
##   0.10    0.75      1.412779e+01  0.3872153  1.064735e+01
##   0.10    0.80      1.421758e+01  0.3858977  1.070786e+01
##   0.10    0.85      1.430631e+01  0.3845511  1.077067e+01
##   0.10    0.90      1.437890e+01  0.3835340  1.081909e+01
##   0.10    0.95      1.444966e+01  0.3821148  1.086451e+01
##   0.10    1.00      1.451263e+01  0.3807330  1.090971e+01
## 
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were fraction = 0.1 and lambda = 0.1.
enet_preds <- predict(enetTune, newdata = X_test_lm)
enet_results_test <- data.frame(obs = y_test,pred = enet_preds)
defaultSummary(enet_results_test)
##       RMSE   Rsquared        MAE 
## 11.4641223  0.3528985  7.4471473

We get a test R-squared of 0.35 (RMSE 11.46), which is not better than the PLS or ridge models.

Part F

Yes, I would recommend the ridge model with lambda = 0.1 for the experiment. It gave an improved test R-squared, and even though its RMSE was slightly higher than that of the PLS model, I think the higher R-squared could make it the better model.
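Since this recommendation trades a higher R-squared against a slightly higher RMSE, it helps to see how the two can disagree. The sketch below recomputes the three numbers by hand on made-up values in base R. It assumes, as caret does by default, that Rsquared is the squared correlation between observed and predicted values. The helper name summarise_fit is hypothetical and is not a caret function.

```r
# Hypothetical observed values and two sets of predictions
obs    <- c(2, 4, 6, 8, 10)
pred_a <- c(3, 3, 7, 7, 11)   # close to obs, but not perfectly correlated
pred_b <- 2 * obs + 5         # perfectly correlated, but badly calibrated

# Same three metrics that defaultSummary() reports
summarise_fit <- function(obs, pred) {
  c(RMSE     = sqrt(mean((obs - pred)^2)),
    Rsquared = cor(obs, pred)^2,
    MAE      = mean(abs(obs - pred)))
}

res_a <- summarise_fit(obs, pred_a)
res_b <- summarise_fit(obs, pred_b)
print(rbind(a = res_a, b = res_b))
```

Prediction set b has the higher R-squared and the much larger RMSE, so a model that ranks compounds well can still be off in absolute permeability. Which metric matters more depends on whether the predictions are used for ranking or as actual permeability values.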

Question 6.3

A chemical manufacturing process for a pharmaceutical product was discussed in Sect. 1.4. In this problem, the objective is to understand the relationship between biological measurements of the raw materials (predictors), measurements of the manufacturing process (predictors), and the response of product yield. Biological predictors cannot be changed but can be used to assess the quality of the raw material before processing. On the other hand, manufacturing process predictors can be changed in the manufacturing process. Improving product yield by 1% will boost revenue by approximately one hundred thousand dollars per batch:

Part A

Start R and use these commands to load the data:

library(AppliedPredictiveModeling)
data(chemicalManufacturing)

The matrix processPredictors contains the 57 predictors (12 describing the input biological material and 45 describing the process predictors) for the 176 manufacturing runs. yield contains the percent yield for each run.

data(ChemicalManufacturingProcess)

Part B

A small percentage of cells in the predictor set contain missing values. Use an imputation function to fill in these missing values (e.g., see Sect. 3.8).

sum(is.na(ChemicalManufacturingProcess))        
## [1] 106
colSums(is.na(ChemicalManufacturingProcess))    
##                  Yield   BiologicalMaterial01   BiologicalMaterial02 
##                      0                      0                      0 
##   BiologicalMaterial03   BiologicalMaterial04   BiologicalMaterial05 
##                      0                      0                      0 
##   BiologicalMaterial06   BiologicalMaterial07   BiologicalMaterial08 
##                      0                      0                      0 
##   BiologicalMaterial09   BiologicalMaterial10   BiologicalMaterial11 
##                      0                      0                      0 
##   BiologicalMaterial12 ManufacturingProcess01 ManufacturingProcess02 
##                      0                      1                      3 
## ManufacturingProcess03 ManufacturingProcess04 ManufacturingProcess05 
##                     15                      1                      1 
## ManufacturingProcess06 ManufacturingProcess07 ManufacturingProcess08 
##                      2                      1                      1 
## ManufacturingProcess09 ManufacturingProcess10 ManufacturingProcess11 
##                      0                      9                     10 
## ManufacturingProcess12 ManufacturingProcess13 ManufacturingProcess14 
##                      1                      0                      1 
## ManufacturingProcess15 ManufacturingProcess16 ManufacturingProcess17 
##                      0                      0                      0 
## ManufacturingProcess18 ManufacturingProcess19 ManufacturingProcess20 
##                      0                      0                      0 
## ManufacturingProcess21 ManufacturingProcess22 ManufacturingProcess23 
##                      0                      1                      1 
## ManufacturingProcess24 ManufacturingProcess25 ManufacturingProcess26 
##                      1                      5                      5 
## ManufacturingProcess27 ManufacturingProcess28 ManufacturingProcess29 
##                      5                      5                      5 
## ManufacturingProcess30 ManufacturingProcess31 ManufacturingProcess32 
##                      5                      5                      0 
## ManufacturingProcess33 ManufacturingProcess34 ManufacturingProcess35 
##                      5                      5                      5 
## ManufacturingProcess36 ManufacturingProcess37 ManufacturingProcess38 
##                      5                      0                      0 
## ManufacturingProcess39 ManufacturingProcess40 ManufacturingProcess41 
##                      0                      1                      1 
## ManufacturingProcess42 ManufacturingProcess43 ManufacturingProcess44 
##                      0                      0                      0 
## ManufacturingProcess45 
##                      0

We have 106 missing values, all in the ManufacturingProcess predictors. Let's use KNN to impute the data with the preProcess function from the caret package.

# Create a pre-processing object to apply KNN imputation
preProc <- preProcess(ChemicalManufacturingProcess, method = "knnImpute")

# Apply the imputation to the data
imputed_data <- predict(preProc, ChemicalManufacturingProcess)
sum(is.na(imputed_data))
## [1] 0
head(imputed_data)
##        Yield BiologicalMaterial01 BiologicalMaterial02 BiologicalMaterial03
## 1 -1.1792673           -0.2261036           -1.5140979          -2.68303622
## 2  1.2263678            2.2391498            1.3089960          -0.05623504
## 3  1.0042258            2.2391498            1.3089960          -0.05623504
## 4  0.6737219            2.2391498            1.3089960          -0.05623504
## 5  1.2534583            1.4827653            1.8939391           1.13594780
## 6  1.8386128           -0.4081962            0.6620886          -0.59859075
##   BiologicalMaterial04 BiologicalMaterial05 BiologicalMaterial06
## 1            0.2201765            0.4941942           -1.3828880
## 2            1.2964386            0.4128555            1.1290767
## 3            1.2964386            0.4128555            1.1290767
## 4            1.2964386            0.4128555            1.1290767
## 5            0.9414412           -0.3734185            1.5348350
## 6            1.5894524            1.7305423            0.6192092
##   BiologicalMaterial07 BiologicalMaterial08 BiologicalMaterial09
## 1           -0.1313107            -1.233131           -3.3962895
## 2           -0.1313107             2.282619           -0.7227225
## 3           -0.1313107             2.282619           -0.7227225
## 4           -0.1313107             2.282619           -0.7227225
## 5           -0.1313107             1.071310           -0.1205678
## 6           -0.1313107             1.189487           -1.7343424
##   BiologicalMaterial10 BiologicalMaterial11 BiologicalMaterial12
## 1            1.1005296            -1.838655           -1.7709224
## 2            1.1005296             1.393395            1.0989855
## 3            1.1005296             1.393395            1.0989855
## 4            1.1005296             1.393395            1.0989855
## 5            0.4162193             0.136256            1.0989855
## 6            1.6346255             1.022062            0.7240877
##   ManufacturingProcess01 ManufacturingProcess02 ManufacturingProcess03
## 1              0.2154105              0.5662872              0.3765810
## 2             -6.1497028             -1.9692525              0.1979962
## 3             -6.1497028             -1.9692525              0.1087038
## 4             -6.1497028             -1.9692525              0.4658734
## 5             -0.2784345             -1.9692525              0.1087038
## 6              0.4348971             -1.9692525              0.5551658
##   ManufacturingProcess04 ManufacturingProcess05 ManufacturingProcess06
## 1              0.5655598            -0.44593467             -0.5414997
## 2             -2.3669726             0.99933318              0.9625383
## 3             -3.1638563             0.06246417             -0.1117745
## 4             -3.3232331             0.42279841              2.1850322
## 5             -2.2075958             0.84537219             -0.6304083
## 6             -1.2513352             0.49486525              0.5550403
##   ManufacturingProcess07 ManufacturingProcess08 ManufacturingProcess09
## 1             -0.1596700             -0.3095182             -1.7201524
## 2             -0.9580199              0.8941637              0.5883746
## 3              1.0378549              0.8941637             -0.3815947
## 4             -0.9580199             -1.1119728             -0.4785917
## 5              1.0378549              0.8941637             -0.4527258
## 6              1.0378549              0.8941637             -0.2199332
##   ManufacturingProcess10 ManufacturingProcess11 ManufacturingProcess12
## 1            -0.07700901            -0.09157342             -0.4806937
## 2             0.52297397             1.08204765             -0.4806937
## 3             0.31428424             0.55112383             -0.4806937
## 4            -0.02483658             0.80261406             -0.4806937
## 5            -0.39004361             0.10403009             -0.4806937
## 6             0.28819802             1.41736795             -0.4806937
##   ManufacturingProcess13 ManufacturingProcess14 ManufacturingProcess15
## 1             0.97711512              0.8093999              1.1846438
## 2            -0.50030980              0.2775205              0.9617071
## 3             0.28765016              0.4425865              0.8245152
## 4             0.28765016              0.7910592              1.0817499
## 5             0.09066017              2.5334227              3.3282665
## 6            -0.50030980              2.4050380              3.1396277
##   ManufacturingProcess16 ManufacturingProcess17 ManufacturingProcess18
## 1              0.3303945              0.9263296              0.1505348
## 2              0.1455765             -0.2753953              0.1559773
## 3              0.1455765              0.3655246              0.1831898
## 4              0.1967569              0.3655246              0.1695836
## 5              0.4754056             -0.3555103              0.2076811
## 6              0.6261033             -0.7560852              0.1423710
##   ManufacturingProcess19 ManufacturingProcess20 ManufacturingProcess21
## 1              0.4563798              0.3109942              0.2109804
## 2              1.5095063              0.1849230              0.2109804
## 3              1.0926437              0.1849230              0.2109804
## 4              0.9829430              0.1562704              0.2109804
## 5              1.6192070              0.2938027             -0.6884239
## 6              1.9044287              0.3998171             -0.5599376
##   ManufacturingProcess22 ManufacturingProcess23 ManufacturingProcess24
## 1             0.05833309              0.8317688              0.8907291
## 2            -0.72230090             -1.8147683             -1.0060115
## 3            -0.42205706             -1.2132826             -0.8335805
## 4            -0.12181322             -0.6117969             -0.6611496
## 5             0.77891831              0.5911745              1.5804530
## 6             1.07916216             -1.2132826             -1.3508734
##   ManufacturingProcess25 ManufacturingProcess26 ManufacturingProcess27
## 1              0.1200183              0.1256347              0.3460352
## 2              0.1093082              0.1966227              0.1906613
## 3              0.1842786              0.2159831              0.2104362
## 4              0.1708910              0.2052273              0.1906613
## 5              0.2726365              0.2912733              0.3432102
## 6              0.1146633              0.2417969              0.3516852
##   ManufacturingProcess28 ManufacturingProcess29 ManufacturingProcess30
## 1              0.7826636              0.5943242              0.7566948
## 2              0.8779201              0.8347250              0.7566948
## 3              0.8588688              0.7746248              0.2444430
## 4              0.8588688              0.7746248              0.2444430
## 5              0.8969714              0.9549255             -0.1653585
## 6              0.9160227              1.0150257              0.9615956
##   ManufacturingProcess31 ManufacturingProcess32 ManufacturingProcess33
## 1             -0.1952552             -0.4568829              0.9890307
## 2             -0.2672523              1.9517531              0.9890307
## 3             -0.1592567              2.6928719              0.9890307
## 4             -0.1592567              2.3223125              1.7943843
## 5             -0.1412574              2.3223125              2.5997378
## 6             -0.3572486              2.6928719              2.5997378
##   ManufacturingProcess34 ManufacturingProcess35 ManufacturingProcess36
## 1             -1.7202722            -0.88694718             -0.6557774
## 2              1.9568096             1.14638329             -0.6557774
## 3              1.9568096             1.23880740             -1.8000420
## 4              0.1182687             0.03729394             -1.8000420
## 5              0.1182687            -2.55058120             -2.9443066
## 6              0.1182687            -0.51725073             -1.8000420
##   ManufacturingProcess37 ManufacturingProcess38 ManufacturingProcess39
## 1             -1.1540243              0.7174727              0.2317270
## 2              2.2161351             -0.8224687              0.2317270
## 3             -0.7046697             -0.8224687              0.2317270
## 4              0.4187168             -0.8224687              0.2317270
## 5             -1.8280562             -0.8224687              0.2981503
## 6             -1.3787016             -0.8224687              0.2317270
##   ManufacturingProcess40 ManufacturingProcess41 ManufacturingProcess42
## 1             0.05969714            -0.06900773             0.20279570
## 2             2.14909691             2.34626280            -0.05472265
## 3            -0.46265281            -0.44058781             0.40881037
## 4            -0.46265281            -0.44058781            -0.31224099
## 5            -0.46265281            -0.44058781            -0.10622632
## 6            -0.46265281            -0.44058781             0.15129203
##   ManufacturingProcess43 ManufacturingProcess44 ManufacturingProcess45
## 1             2.40564734            -0.01588055             0.64371849
## 2            -0.01374656             0.29467248             0.15220242
## 3             0.10146268            -0.01588055             0.39796046
## 4             0.21667191            -0.01588055            -0.09355562
## 5             0.21667191            -0.32643359            -0.09355562
## 6             1.48397347            -0.01588055            -0.33931365

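Now we have no missing values in our dataset. Note that the knnImpute method also centers and scales every column it is given, including Yield, which is why the values above are standardized rather than in their original units.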
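To make the imputation step less of a black box, here is a rough base R sketch of the idea behind KNN imputation on a hypothetical toy data frame. It standardizes the columns, finds the nearest complete rows using the columns that are observed, and averages their values for the missing cell. The function impute_knn and the choice k = 2 are made up for illustration. caret's own implementation may differ in its details (its default is k = 5).

```r
# Toy data (hypothetical): one missing value in column c
toy <- data.frame(
  a = c(1, 2, 3, 4, 5, 6),
  b = c(2, 4, 6, 8, 10, 12),
  c = c(1.0, 2.1, NA, 3.9, 5.2, 6.1)
)

# Step 1: center and scale each column; scale() skips NAs for the mean and sd
z <- scale(toy)

# Step 2: fill each incomplete row from its k nearest complete rows,
# measuring distance only on the columns that row has observed
impute_knn <- function(z, k = 2) {
  donors <- which(complete.cases(z))
  for (i in which(!complete.cases(z))) {
    miss    <- is.na(z[i, ])
    diffs   <- sweep(z[donors, !miss, drop = FALSE], 2, z[i, !miss])
    dists   <- sqrt(rowSums(diffs^2))
    nearest <- donors[order(dists)[seq_len(k)]]
    z[i, miss] <- colMeans(z[nearest, miss, drop = FALSE])
  }
  z
}

z_imputed <- impute_knn(z, k = 2)

# Step 3 (optional): map the imputed value back to the original units
centers   <- attr(z, "scaled:center")
scales    <- attr(z, "scaled:scale")
c_imputed <- z_imputed[3, "c"] * scales["c"] + centers["c"]
print(unname(c_imputed))
```

Step 3 is the same back-transformation that would be needed to report Yield predictions in percent rather than in standardized units.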

Part C

Split the data into a training and a test set, pre-process the data, and tune a model of your choice from this chapter. What is the optimal value of the performance metric?

First, split the data into a training set and a test set with a 70/30 split.

set.seed(123)
train_index <- createDataPartition(imputed_data$Yield, p = 0.7, list = FALSE)

X_train <- imputed_data[train_index, ]
X_train <- X_train %>%
  select(-Yield)
y_train <- imputed_data$Yield[train_index]

X_test <- imputed_data[-train_index, ] %>%
  select(-Yield)
y_test <- imputed_data$Yield[-train_index]
head(X_test)
##    BiologicalMaterial01 BiologicalMaterial02 BiologicalMaterial03
## 1            -0.2261036            -1.514098          -2.68303622
## 2             2.2391498             1.308996          -0.05623504
## 3             2.2391498             1.308996          -0.05623504
## 4             2.2391498             1.308996          -0.05623504
## 5             1.4827653             1.893939           1.13594780
## 10            0.7403878             1.960861           1.08846043
##    BiologicalMaterial04 BiologicalMaterial05 BiologicalMaterial06
## 1             0.2201765            0.4941942            -1.382888
## 2             1.2964386            0.4128555             1.129077
## 3             1.2964386            0.4128555             1.129077
## 4             1.2964386            0.4128555             1.129077
## 5             0.9414412           -0.3734185             1.534835
## 10            1.8881010            0.4453910             1.550852
##    BiologicalMaterial07 BiologicalMaterial08 BiologicalMaterial09
## 1            -0.1313107            -1.233131           -3.3962895
## 2            -0.1313107             2.282619           -0.7227225
## 3            -0.1313107             2.282619           -0.7227225
## 4            -0.1313107             2.282619           -0.7227225
## 5            -0.1313107             1.071310           -0.1205678
## 10           -0.1313107             2.001950            0.6742764
##    BiologicalMaterial10 BiologicalMaterial11 BiologicalMaterial12
## 1             1.1005296            -1.838655            -1.770922
## 2             1.1005296             1.393395             1.098986
## 3             1.1005296             1.393395             1.098986
## 4             1.1005296             1.393395             1.098986
## 5             0.4162193             0.136256             1.098986
## 10            1.7514590             1.503343             1.616086
##    ManufacturingProcess01 ManufacturingProcess02 ManufacturingProcess03
## 1               0.2154105              0.5662872              0.3765810
## 2              -6.1497028             -1.9692525              0.1979962
## 3              -6.1497028             -1.9692525              0.1087038
## 4              -6.1497028             -1.9692525              0.4658734
## 5              -0.2784345             -1.9692525              0.1087038
## 10              0.4348971             -1.9692525              0.4658734
##    ManufacturingProcess04 ManufacturingProcess05 ManufacturingProcess06
## 1               0.5655598            -0.44593467             -0.5414997
## 2              -2.3669726             0.99933318              0.9625383
## 3              -3.1638563             0.06246417             -0.1117745
## 4              -3.3232331             0.42279841              2.1850322
## 5              -2.2075958             0.84537219             -0.6304083
## 10              0.9799394             0.06901570              0.8884478
##    ManufacturingProcess07 ManufacturingProcess08 ManufacturingProcess09
## 1              -0.1596700             -0.3095182             -1.7201524
## 2              -0.9580199              0.8941637              0.5883746
## 3               1.0378549              0.8941637             -0.3815947
## 4              -0.9580199             -1.1119728             -0.4785917
## 5               1.0378549              0.8941637             -0.4527258
## 10             -0.9580199             -1.1119728              0.9375635
##    ManufacturingProcess10 ManufacturingProcess11 ManufacturingProcess12
## 1             -0.07700901            -0.09157342             -0.4806937
## 2              0.52297397             1.08204765             -0.4806937
## 3              0.31428424             0.55112383             -0.4806937
## 4             -0.02483658             0.80261406             -0.4806937
## 5             -0.39004361             0.10403009             -0.4806937
## 10             1.20121560             1.13793436             -0.4806937
##    ManufacturingProcess13 ManufacturingProcess14 ManufacturingProcess15
## 1              0.97711512              0.8093999              1.1846438
## 2             -0.50030980              0.2775205              0.9617071
## 3              0.28765016              0.4425865              0.8245152
## 4              0.28765016              0.7910592              1.0817499
## 5              0.09066017              2.5334227              3.3282665
## 10            -0.20482482             -0.1443149              0.6530254
##    ManufacturingProcess16 ManufacturingProcess17 ManufacturingProcess18
## 1               0.3303945              0.9263296             0.15053478
## 2               0.1455765             -0.2753953             0.15597729
## 3               0.1455765              0.3655246             0.18318982
## 4               0.1967569              0.3655246             0.16958356
## 5               0.4754056             -0.3555103             0.20768110
## 10              0.1370464              0.7660996             0.08250345
##    ManufacturingProcess19 ManufacturingProcess20 ManufacturingProcess21
## 1               0.4563798              0.3109942              0.2109804
## 2               1.5095063              0.1849230              0.2109804
## 3               1.0926437              0.1849230              0.2109804
## 4               0.9829430              0.1562704              0.2109804
## 5               1.6192070              0.2938027             -0.6884239
## 10              1.3778655              0.1648662              1.4958436
##    ManufacturingProcess22 ManufacturingProcess23 ManufacturingProcess24
## 1              0.05833309              0.8317688              0.8907291
## 2             -0.72230090             -1.8147683             -1.0060115
## 3             -0.42205706             -1.2132826             -0.8335805
## 4             -0.12181322             -0.6117969             -0.6611496
## 5              0.77891831              0.5911745              1.5804530
## 10            -0.42205706             -1.2132826             -0.8335805
##    ManufacturingProcess25 ManufacturingProcess26 ManufacturingProcess27
## 1               0.1200183              0.1256347              0.3460352
## 2               0.1093082              0.1966227              0.1906613
## 3               0.1842786              0.2159831              0.2104362
## 4               0.1708910              0.2052273              0.1906613
## 5               0.2726365              0.2912733              0.3432102
## 10              0.1735685              0.2568549              0.2471609
##    ManufacturingProcess28 ManufacturingProcess29 ManufacturingProcess30
## 1               0.7826636              0.5943242              0.7566948
## 2               0.8779201              0.8347250              0.7566948
## 3               0.8588688              0.7746248              0.2444430
## 4               0.8588688              0.7746248              0.2444430
## 5               0.8969714              0.9549255             -0.1653585
## 10              0.9160227              1.0150257              0.6542445
##    ManufacturingProcess31 ManufacturingProcess32 ManufacturingProcess33
## 1              -0.1952552             -0.4568829              0.9890307
## 2              -0.2672523              1.9517531              0.9890307
## 3              -0.1592567              2.6928719              0.9890307
## 4              -0.1592567              2.3223125              1.7943843
## 5              -0.1412574              2.3223125              2.5997378
## 10             -0.3032508              1.0253547              0.9890307
##    ManufacturingProcess34 ManufacturingProcess35 ManufacturingProcess36
## 1              -1.7202722            -0.88694718             -0.6557774
## 2               1.9568096             1.14638329             -0.6557774
## 3               1.9568096             1.23880740             -1.8000420
## 4               0.1182687             0.03729394             -1.8000420
## 5               0.1182687            -2.55058120             -2.9443066
## 10              0.1182687            -0.70209896             -0.6557774
##    ManufacturingProcess37 ManufacturingProcess38 ManufacturingProcess39
## 1              -1.1540243              0.7174727              0.2317270
## 2               2.2161351             -0.8224687              0.2317270
## 3              -0.7046697             -0.8224687              0.2317270
## 4               0.4187168             -0.8224687              0.2317270
## 5              -1.8280562             -0.8224687              0.2981503
## 10              1.7667805              0.7174727              0.1653036
##    ManufacturingProcess40 ManufacturingProcess41 ManufacturingProcess42
## 1              0.05969714            -0.06900773             0.20279570
## 2              2.14909691             2.34626280            -0.05472265
## 3             -0.46265281            -0.44058781             0.40881037
## 4             -0.46265281            -0.44058781            -0.31224099
## 5             -0.46265281            -0.44058781            -0.10622632
## 10            -0.46265281            -0.44058781             0.04828469
##    ManufacturingProcess43 ManufacturingProcess44 ManufacturingProcess45
## 1              2.40564734            -0.01588055             0.64371849
## 2             -0.01374656             0.29467248             0.15220242
## 3              0.10146268            -0.01588055             0.39796046
## 4              0.21667191            -0.01588055            -0.09355562
## 5              0.21667191            -0.32643359            -0.09355562
## 10            -0.12895579             0.29467248             0.64371849
head(y_test)
## [1] -1.1792673  1.2263678  1.0042258  0.6737219  1.2534583  1.2317859
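For a numeric response, createDataPartition does not sample rows purely at random: it bins the response into quantile groups and samples within each group, so the training and test sets see a similar range of Yield. The sketch below imitates that idea in base R on a simulated response. The vector y, the five groups, and the 70% fraction are illustrative assumptions, not the real data or caret's exact internals.

```r
# Toy response standing in for Yield (simulated, not the real data)
set.seed(123)
y <- rnorm(176)

# Bin the response into 5 quantile groups
breaks <- quantile(y, probs = seq(0, 1, length.out = 6))
groups <- cut(y, breaks = breaks, include.lowest = TRUE)

# Sample about 70% of the row indices within each group
train_idx <- unlist(lapply(split(seq_along(y), groups), function(idx) {
  sample(idx, size = ceiling(0.7 * length(idx)))
}))
train_idx <- sort(unname(train_idx))

# Compare the response distribution in the two sets
summary(y[train_idx])
summary(y[-train_idx])
```

If the two summaries look alike, the split has preserved the distribution of the response, which is the property the stratified sampling is meant to provide.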

Now I will tune an elastic net model, centering and scaling the predictors as a pre-processing step.

# Set up cross-validation
train_control <- trainControl(method = "cv", number = 10)

# Grid of alpha (0 = ridge, 1 = lasso) and lambda values
elastic_grid <- expand.grid(
  alpha = seq(0, 1, length = 11),             # alpha from 0 to 1
  lambda = 10^seq(-3, 1, length = 20)         # lambda from 0.001 to 10
)

# Train Elastic Net model
elastic_model <- train(
  x = X_train,
  y = y_train,
  method = "glmnet",
  preProcess = c("center", "scale"),
  tuneGrid = elastic_grid,
  trControl = train_control,
  metric = "RMSE"
)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,
## : There were missing values in resampled performance measures.
print(elastic_model)
## glmnet 
## 
## 124 samples
##  57 predictor
## 
## Pre-processing: centered (57), scaled (57) 
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 112, 112, 111, 112, 112, 112, ... 
## Resampling results across tuning parameters:
## 
##   alpha  lambda        RMSE       Rsquared    MAE      
##   0.0     0.001000000  1.3278634  0.57844003  0.7322428
##   0.0     0.001623777  1.3278634  0.57844003  0.7322428
##   0.0     0.002636651  1.3278634  0.57844003  0.7322428
##   0.0     0.004281332  1.3278634  0.57844003  0.7322428
##   0.0     0.006951928  1.3278634  0.57844003  0.7322428
##   0.0     0.011288379  1.3278634  0.57844003  0.7322428
##   0.0     0.018329807  1.3278634  0.57844003  0.7322428
##   0.0     0.029763514  1.3278634  0.57844003  0.7322428
##   0.0     0.048329302  1.3278634  0.57844003  0.7322428
##   0.0     0.078475997  1.2300121  0.58560774  0.6990532
##   0.0     0.127427499  1.0757928  0.58809733  0.6543799
##   0.0     0.206913808  0.9352806  0.60056785  0.6118112
##   0.0     0.335981829  0.8128609  0.62111809  0.5750638
##   0.0     0.545559478  0.7230554  0.63363659  0.5510543
##   0.0     0.885866790  0.6801405  0.62672074  0.5498386
##   0.0     1.438449888  0.6806812  0.61159822  0.5620915
##   0.0     2.335721469  0.7140542  0.58759375  0.5801108
##   0.0     3.792690191  0.7590269  0.54773764  0.6178966
##   0.0     6.158482111  0.8017282  0.51054849  0.6548829
##   0.0    10.000000000  0.8413307  0.48185532  0.6898674
##   0.1     0.001000000  3.0087671  0.48936502  1.2435410
##   0.1     0.001623777  2.8062625  0.48849372  1.1863329
##   0.1     0.002636651  2.5945288  0.48817210  1.1280650
##   0.1     0.004281332  2.4034946  0.48885555  1.0730579
##   0.1     0.006951928  2.1813942  0.49041605  1.0103106
##   0.1     0.011288379  1.9152732  0.49521083  0.9325039
##   0.1     0.018329807  1.5953278  0.51530578  0.8346973
##   0.1     0.029763514  1.3130720  0.55484985  0.7418870
##   0.1     0.048329302  1.1663597  0.51991223  0.6933085
##   0.1     0.078475997  0.9962708  0.55920964  0.6348725
##   0.1     0.127427499  0.8321493  0.59661714  0.5787555
##   0.1     0.206913808  0.7480890  0.64693615  0.5527826
##   0.1     0.335981829  0.7261603  0.63678699  0.5560211
##   0.1     0.545559478  0.6958242  0.63573021  0.5606597
##   0.1     0.885866790  0.6864220  0.62948205  0.5732344
##   0.1     1.438449888  0.7514296  0.59530876  0.6243326
##   0.1     2.335721469  0.8328162  0.58876228  0.6788964
##   0.1     3.792690191  0.9445853  0.57391567  0.7618832
##   0.1     6.158482111  1.0196045  0.25071699  0.8202739
##   0.1    10.000000000  1.0198547         NaN  0.8204133
##   0.2     0.001000000  2.8369084  0.48956677  1.1961205
##   0.2     0.001623777  2.6200354  0.48872833  1.1352437
##   0.2     0.002636651  2.3894620  0.48842518  1.0712227
##   0.2     0.004281332  2.2369144  0.48838790  1.0285503
##   0.2     0.006951928  2.0607848  0.48835277  0.9794875
##   0.2     0.011288379  1.7394811  0.49930887  0.8838380
##   0.2     0.018329807  1.3859265  0.55753389  0.7686266
##   0.2     0.029763514  1.2022810  0.51134306  0.7104849
##   0.2     0.048329302  0.9600991  0.56014169  0.6246727
##   0.2     0.078475997  0.8012848  0.58112656  0.5712338
##   0.2     0.127427499  0.6980526  0.65102489  0.5364270
##   0.2     0.206913808  0.6855033  0.64305293  0.5386345
##   0.2     0.335981829  0.6564882  0.64706234  0.5388780
##   0.2     0.545559478  0.6912773  0.62073411  0.5738587
##   0.2     0.885866790  0.7439466  0.60432458  0.6204757
##   0.2     1.438449888  0.8388899  0.59629036  0.6819404
##   0.2     2.335721469  0.9732801  0.54788151  0.7833792
##   0.2     3.792690191  1.0198547         NaN  0.8204133
##   0.2     6.158482111  1.0198547         NaN  0.8204133
##   0.2    10.000000000  1.0198547         NaN  0.8204133
##   0.3     0.001000000  2.6175980  0.49046849  1.1352789
##   0.3     0.001623777  2.4094439  0.48959488  1.0771305
##   0.3     0.002636651  2.2668367  0.48920268  1.0372521
##   0.3     0.004281332  2.2159958  0.48698178  1.0246847
##   0.3     0.006951928  1.9921816  0.48719973  0.9612176
##   0.3     0.011288379  1.5431340  0.52647945  0.8263773
##   0.3     0.018329807  1.3069131  0.51163221  0.7482595
##   0.3     0.029763514  1.0266852  0.53947924  0.6540259
##   0.3     0.048329302  0.8605860  0.56780265  0.5917000
##   0.3     0.078475997  0.7098592  0.61085249  0.5455314
##   0.3     0.127427499  0.6565133  0.64785418  0.5266847
##   0.3     0.206913808  0.6486920  0.64984376  0.5308735
##   0.3     0.335981829  0.6733742  0.63050094  0.5562003
##   0.3     0.545559478  0.7169880  0.60901626  0.5990750
##   0.3     0.885866790  0.7989384  0.60137281  0.6537125
##   0.3     1.438449888  0.9368342  0.56876327  0.7539952
##   0.3     2.335721469  1.0198547         NaN  0.8204133
##   0.3     3.792690191  1.0198547         NaN  0.8204133
##   0.3     6.158482111  1.0198547         NaN  0.8204133
##   0.3    10.000000000  1.0198547         NaN  0.8204133
##   0.4     0.001000000  2.4216218  0.49126526  1.0805883
##   0.4     0.001623777  2.2506819  0.49029862  1.0329423
##   0.4     0.002636651  2.2488466  0.48882061  1.0330879
##   0.4     0.004281332  2.1745715  0.48475964  1.0141832
##   0.4     0.006951928  1.8307266  0.49020008  0.9157939
##   0.4     0.011288379  1.4143658  0.55623674  0.7785372
##   0.4     0.018329807  1.2032051  0.50379335  0.7163990
##   0.4     0.029763514  0.9467232  0.55482025  0.6221988
##   0.4     0.048329302  0.7724333  0.58089073  0.5642119
##   0.4     0.078475997  0.6327641  0.65837670  0.5152799
##   0.4     0.127427499  0.6442275  0.65030477  0.5252044
##   0.4     0.206913808  0.6591120  0.64257879  0.5389385
##   0.4     0.335981829  0.6929674  0.61175246  0.5779363
##   0.4     0.545559478  0.7459763  0.61047037  0.6154863
##   0.4     0.885866790  0.8631670  0.58751789  0.6963850
##   0.4     1.438449888  1.0100894  0.44491963  0.8123530
##   0.4     2.335721469  1.0198547         NaN  0.8204133
##   0.4     3.792690191  1.0198547         NaN  0.8204133
##   0.4     6.158482111  1.0198547         NaN  0.8204133
##   0.4    10.000000000  1.0198547         NaN  0.8204133
##   0.5     0.001000000  2.2404096  0.49251034  1.0302696
##   0.5     0.001623777  2.2081982  0.49103984  1.0210415
##   0.5     0.002636651  2.2386623  0.48798706  1.0311877
##   0.5     0.004281332  2.1484796  0.48252147  1.0079457
##   0.5     0.006951928  1.6781831  0.50014704  0.8714623
##   0.5     0.011288379  1.3474871  0.52942518  0.7645652
##   0.5     0.018329807  1.0606103  0.52610095  0.6688384
##   0.5     0.029763514  0.8884373  0.56095049  0.6023078
##   0.5     0.048329302  0.7257194  0.59467872  0.5517087
##   0.5     0.078475997  0.6383156  0.65165076  0.5206514
##   0.5     0.127427499  0.6500751  0.64859065  0.5296612
##   0.5     0.206913808  0.6713140  0.62787604  0.5526607
##   0.5     0.335981829  0.7037244  0.61384881  0.5874274
##   0.5     0.545559478  0.7848592  0.59567070  0.6421593
##   0.5     0.885866790  0.9262629  0.55984204  0.7454041
##   0.5     1.438449888  1.0198547         NaN  0.8204133
##   0.5     2.335721469  1.0198547         NaN  0.8204133
##   0.5     3.792690191  1.0198547         NaN  0.8204133
##   0.5     6.158482111  1.0198547         NaN  0.8204133
##   0.5    10.000000000  1.0198547         NaN  0.8204133
##   0.6     0.001000000  2.0969150  0.49392220  0.9900562
##   0.6     0.001623777  2.1654235  0.49183428  1.0093508
##   0.6     0.002636651  2.2085024  0.48686630  1.0233229
##   0.6     0.004281332  2.0506276  0.48180414  0.9809582
##   0.6     0.006951928  1.5615566  0.51413999  0.8358062
##   0.6     0.011288379  1.2789354  0.50059321  0.7449696
##   0.6     0.018329807  0.9966740  0.54783884  0.6407564
##   0.6     0.029763514  0.8282476  0.56812988  0.5821954
##   0.6     0.048329302  0.6710768  0.62110178  0.5349909
##   0.6     0.078475997  0.6416944  0.65012048  0.5228561
##   0.6     0.127427499  0.6564132  0.64396779  0.5341661
##   0.6     0.206913808  0.6819957  0.61514307  0.5662901
##   0.6     0.335981829  0.7222390  0.61017412  0.5962515
##   0.6     0.545559478  0.8226084  0.58914255  0.6674269
##   0.6     0.885866790  0.9846851  0.48368905  0.7915703
##   0.6     1.438449888  1.0198547         NaN  0.8204133
##   0.6     2.335721469  1.0198547         NaN  0.8204133
##   0.6     3.792690191  1.0198547         NaN  0.8204133
##   0.6     6.158482111  1.0198547         NaN  0.8204133
##   0.6    10.000000000  1.0198547         NaN  0.8204133
##   0.7     0.001000000  2.0215001  0.49554078  0.9693061
##   0.7     0.001623777  2.1444386  0.49219271  1.0046150
##   0.7     0.002636651  2.1838946  0.48508052  1.0171857
##   0.7     0.004281332  1.9353842  0.48307894  0.9490804
##   0.7     0.006951928  1.4604545  0.53214083  0.8027846
##   0.7     0.011288379  1.2189734  0.49925398  0.7249038
##   0.7     0.018329807  0.9532214  0.55112392  0.6256745
##   0.7     0.029763514  0.7877433  0.57449408  0.5701790
##   0.7     0.048329302  0.6326735  0.65531217  0.5186896
##   0.7     0.078475997  0.6460885  0.64799454  0.5270335
##   0.7     0.127427499  0.6613950  0.63602842  0.5396400
##   0.7     0.206913808  0.6885419  0.61069259  0.5754734
##   0.7     0.335981829  0.7449782  0.59888766  0.6130356
##   0.7     0.545559478  0.8608030  0.57863352  0.6941457
##   0.7     0.885866790  1.0189670  0.09719598  0.8199775
##   0.7     1.438449888  1.0198547         NaN  0.8204133
##   0.7     2.335721469  1.0198547         NaN  0.8204133
##   0.7     3.792690191  1.0198547         NaN  0.8204133
##   0.7     6.158482111  1.0198547         NaN  0.8204133
##   0.7    10.000000000  1.0198547         NaN  0.8204133
##   0.8     0.001000000  1.9803278  0.49693842  0.9581675
##   0.8     0.001623777  2.1269033  0.49236223  1.0011804
##   0.8     0.002636651  2.1534563  0.48306032  1.0091017
##   0.8     0.004281332  1.8191562  0.48669080  0.9157293
##   0.8     0.006951928  1.3787232  0.55237878  0.7715350
##   0.8     0.011288379  1.1040501  0.51194879  0.6868823
##   0.8     0.018329807  0.9195601  0.55494207  0.6139070
##   0.8     0.029763514  0.7583896  0.58073546  0.5623739
##   0.8     0.048329302  0.6391604  0.65043843  0.5224692
##   0.8     0.078475997  0.6500160  0.64608063  0.5293914
##   0.8     0.127427499  0.6676606  0.62659872  0.5482547
##   0.8     0.206913808  0.6946373  0.61378373  0.5792340
##   0.8     0.335981829  0.7654323  0.59338946  0.6257220
##   0.8     0.545559478  0.9003351  0.56183306  0.7239460
##   0.8     0.885866790  1.0198547         NaN  0.8204133
##   0.8     1.438449888  1.0198547         NaN  0.8204133
##   0.8     2.335721469  1.0198547         NaN  0.8204133
##   0.8     3.792690191  1.0198547         NaN  0.8204133
##   0.8     6.158482111  1.0198547         NaN  0.8204133
##   0.8    10.000000000  1.0198547         NaN  0.8204133
##   0.9     0.001000000  1.9360099  0.49913431  0.9459655
##   0.9     0.001623777  2.0962142  0.49255828  0.9929895
##   0.9     0.002636651  2.1839145  0.48064011  1.0180043
##   0.9     0.004281332  1.7254098  0.49143439  0.8881214
##   0.9     0.006951928  1.3165029  0.53446410  0.7579816
##   0.9     0.011288379  1.0471004  0.53190116  0.6640137
##   0.9     0.018329807  0.8813975  0.55919137  0.6005711
##   0.9     0.029763514  0.7273519  0.59069050  0.5536868
##   0.9     0.048329302  0.6420416  0.64830024  0.5232645
##   0.9     0.078475997  0.6530477  0.64438935  0.5305766
##   0.9     0.127427499  0.6759427  0.61614774  0.5573518
##   0.9     0.206913808  0.7051069  0.61029897  0.5824587
##   0.9     0.335981829  0.7856098  0.59041976  0.6392198
##   0.9     0.545559478  0.9443861  0.52063402  0.7589195
##   0.9     0.885866790  1.0198547         NaN  0.8204133
##   0.9     1.438449888  1.0198547         NaN  0.8204133
##   0.9     2.335721469  1.0198547         NaN  0.8204133
##   0.9     3.792690191  1.0198547         NaN  0.8204133
##   0.9     6.158482111  1.0198547         NaN  0.8204133
##   0.9    10.000000000  1.0198547         NaN  0.8204133
##   1.0     0.001000000  1.8979329  0.50150778  0.9353285
##   1.0     0.001623777  2.0631665  0.49271186  0.9840520
##   1.0     0.002636651  2.1613334  0.47862605  1.0119807
##   1.0     0.004281332  1.6409139  0.49699392  0.8628744
##   1.0     0.006951928  1.2820817  0.50448769  0.7494401
##   1.0     0.011288379  0.9991887  0.54497460  0.6423042
##   1.0     0.018329807  0.8521514  0.56329564  0.5903246
##   1.0     0.029763514  0.6896843  0.60626850  0.5417622
##   1.0     0.048329302  0.6445061  0.64705751  0.5251471
##   1.0     0.078475997  0.6575286  0.64052716  0.5329462
##   1.0     0.127427499  0.6815510  0.61052230  0.5646903
##   1.0     0.206913808  0.7160579  0.60493605  0.5893166
##   1.0     0.335981829  0.8073021  0.58486432  0.6547974
##   1.0     0.545559478  0.9848073  0.46274114  0.7908575
##   1.0     0.885866790  1.0198547         NaN  0.8204133
##   1.0     1.438449888  1.0198547         NaN  0.8204133
##   1.0     2.335721469  1.0198547         NaN  0.8204133
##   1.0     3.792690191  1.0198547         NaN  0.8204133
##   1.0     6.158482111  1.0198547         NaN  0.8204133
##   1.0    10.000000000  1.0198547         NaN  0.8204133
## 
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were alpha = 0.7 and lambda = 0.0483293.
plot(elastic_model)

I chose RMSE as my performance metric. The output above shows that the elastic net model with the lowest cross-validated RMSE has alpha = 0.7 and lambda = 0.0483293.
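The warning about missing values in the resampled performance measures, and the NaN entries in the Rsquared column, most likely come from the heavily penalized models: when lambda is large enough that every coefficient is shrunk to zero, the model predicts the same value for every sample and the correlation with the observed response is undefined. The base R sketch below reproduces that situation with made-up numbers (obs and pred_constant are hypothetical, not taken from the model).

```r
# Hypothetical observed values for one resampling fold
obs <- c(-1.2, 0.4, 1.1, -0.3, 0.8)

# An intercept-only model predicts the same value for every sample
pred_constant <- rep(mean(obs), length(obs))

# The correlation is undefined because the predictions have zero variance
r <- suppressWarnings(cor(pred_constant, obs))
r

# RMSE can still be computed, which is why those rows keep an RMSE value
rmse_constant <- sqrt(mean((obs - pred_constant)^2))
rmse_constant

# Averaging a statistic that is missing in every fold gives NaN
mean(c(NA_real_, NA_real_), na.rm = TRUE)
```

This also explains why all of those rows share the same RMSE: once every coefficient is zero, increasing lambda further no longer changes the predictions.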

elastic_model$results %>%
  filter(RMSE == min(RMSE))
##   alpha    lambda      RMSE  Rsquared       MAE    RMSESD RsquaredSD     MAESD
## 1   0.7 0.0483293 0.6326735 0.6553122 0.5186896 0.2545696  0.1801891 0.1793219

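For this model we get a cross-validated RMSE of 0.63 and an R-squared of 0.66. Because Yield appears to have been centered and scaled along with the predictors earlier (see the values of y_test above), this RMSE is in standardized units rather than raw yield units.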

Part D

Predict the response for the test set. What is the value of the performance metric and how does this compare with the resampled performance metric on the training set?

elastic_preds <- predict(elastic_model, newdata = X_test)
test_metrics <- postResample(pred = elastic_preds, obs = y_test)
test_metrics
##      RMSE  Rsquared       MAE 
## 0.6205362 0.6118390 0.5122034

When predicting on the test set, we get an RMSE of 0.62 and an R-squared of 0.61. These are very close to the cross-validated RMSE of 0.63 and R-squared of 0.66 from the training set, which suggests that the model generalizes well and is not overfit.
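To make the three numbers reported by postResample concrete, here is a small base R sketch that computes them by hand on made-up vectors (obs and pred are illustrative, not the real test set). It assumes caret's default behavior for regression, where Rsquared is the squared Pearson correlation between predictions and observations.

```r
# Made-up observed and predicted values (illustrative only)
obs  <- c(1, 2, 3, 4)
pred <- c(1.5, 1.5, 3.5, 3.5)

# Root mean squared error: typical size of a prediction error
rmse <- sqrt(mean((pred - obs)^2))

# R-squared as the squared correlation between predictions and observations
rsq <- cor(pred, obs)^2

# Mean absolute error: average magnitude of the errors
mae <- mean(abs(pred - obs))

c(RMSE = rmse, Rsquared = rsq, MAE = mae)
```

Because this R-squared is a squared correlation, it measures how well the predictions track the observations but not whether they are on the right scale, so it is worth reading it together with RMSE.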

Part E

Which predictors are most important in the model you have trained? Do either the biological or process predictors dominate the list?

var_imp <- varImp(elastic_model, scale = TRUE)
var_imp
## glmnet variable importance
## 
##   only 20 most important variables shown (out of 57)
## 
##                        Overall
## ManufacturingProcess32 100.000
## ManufacturingProcess17  59.374
## ManufacturingProcess09  58.765
## ManufacturingProcess06  38.084
## ManufacturingProcess37  34.288
## BiologicalMaterial06    33.910
## ManufacturingProcess34  30.906
## ManufacturingProcess39  27.145
## ManufacturingProcess36  27.115
## ManufacturingProcess13  25.148
## ManufacturingProcess45  22.298
## BiologicalMaterial05    20.844
## ManufacturingProcess07  17.147
## ManufacturingProcess04  17.145
## ManufacturingProcess15   9.090
## ManufacturingProcess18   8.573
## ManufacturingProcess43   8.022
## ManufacturingProcess42   5.534
## ManufacturingProcess23   5.323
## ManufacturingProcess19   5.150

Of the top 20 predictors, 18 are manufacturing process predictors and only 2 are biological materials. The two biological materials have scaled importance scores of about 33.9 and 20.8, well below the leading manufacturing predictors (100, 59.4, and 58.8). The manufacturing process predictors therefore dominate the list and appear to be much more important to the model.

Part F

Explore the relationships between each of the top predictors and the response. How could this information be helpful in improving yield in future runs of the manufacturing process?

# Reuse the importance table computed in Part E and keep the six highest scores
imp <- var_imp$importance
top_vars <- rownames(imp)[order(-imp$Overall)][1:6]
top_vars
## [1] "ManufacturingProcess32" "ManufacturingProcess17" "ManufacturingProcess09"
## [4] "ManufacturingProcess06" "ManufacturingProcess37" "BiologicalMaterial06"

These are the top 6 predictors, so let's use ggplot to explore their relationship to the response, Yield.

library(ggplot2)
# Plot each top variable against Yield
for (var in top_vars) {
  print(
    # .data[[var]] looks up the column by name and replaces the deprecated aes_string()
    ggplot(imputed_data, aes(x = .data[[var]], y = Yield)) +
      geom_point() +
      labs(title = paste("Yield vs", var), x = var)
  )
}

Looking at the scatter plots above, we can see some clear relationships between the predictors and the response, Yield. The relationships are listed below:

  • ManufacturingProcess32: Slight positive correlation
  • ManufacturingProcess17: Slight negative correlation
  • ManufacturingProcess09: Moderate positive correlation
  • ManufacturingProcess06: Strong positive correlation
  • ManufacturingProcess37: No correlation, white noise.
  • BiologicalMaterial06: Very slight positive correlation.
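Visual judgments like these can be backed up with numbers by computing the correlation of each predictor with the response. The sketch below shows the pattern on simulated data: the data frame toy and its columns x_pos, x_neg, and x_noise are hypothetical stand-ins, and in this analysis they would be replaced by imputed_data and the names in top_vars.

```r
# Simulated data with one positive, one negative, and one unrelated predictor
set.seed(1)
n <- 100
toy <- data.frame(
  x_pos   = rnorm(n),
  x_neg   = rnorm(n),
  x_noise = rnorm(n)
)
toy$Yield <- 0.8 * toy$x_pos - 0.5 * toy$x_neg + rnorm(n, sd = 0.5)

# Correlation of each predictor with the response
predictors <- c("x_pos", "x_neg", "x_noise")
cors <- sapply(toy[predictors], function(x) cor(x, toy$Yield))

# Sort from most positive to most negative for easier reading
round(sort(cors, decreasing = TRUE), 2)
```

A correlation only captures the linear part of a relationship, so it complements the scatter plots rather than replacing them.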