start h2o
library(h2o)
h2o.init()
H2O is not running yet, starting it now...
Note: In case of errors look at the following log files:
C:\Users\r631758\AppData\Local\Temp\1\Rtmp4y1xDl/h2o_r631758_started_from_r.out
C:\Users\r631758\AppData\Local\Temp\1\Rtmp4y1xDl/h2o_r631758_started_from_r.err
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
Starting H2O JVM and connecting: . Connection successful!
R is connected to the H2O cluster:
H2O cluster uptime: 1 seconds 743 milliseconds
H2O cluster version: 3.14.0.3
H2O cluster version age: 13 days
H2O cluster name: H2O_started_from_R_r631758_dkn515
H2O cluster total nodes: 1
H2O cluster total memory: 3.48 GB
H2O cluster total cores: 8
H2O cluster allowed cores: 8
H2O cluster healthy: TRUE
H2O Connection ip: localhost
H2O Connection port: 54321
H2O Connection proxy: NA
H2O Internal Security: FALSE
H2O API Extensions: Algos, AutoML, Core V3, Core V4
R Version: R version 3.4.2 (2017-09-28)
h2o.removeAll() ## clean slate, in case the cluster was already running
[1] 0
LOAD DATA
cov<-h2o.importFile(path="https://s3.amazonaws.com/h2o-public-test-data/bigdata/laptop/covtype/covtype.full.csv")
|=====================================================================================| 100%
dim(cov)
[1] 581012 13
splits <- h2o.splitFrame(cov, ratios=c(0.6,0.2), destination_frames=c("train.hex", "valid.hex", "test.hex"), seed=1234)
train<-splits[[1]]
valid<-splits[[2]]
test<-splits[[3]]
Use scatter plots via binning (works for categorical and numeric columns) to get more familiar with the dataset.
par(mfrow=c(1,1))
plot(h2o.tabulate(cov, "Elevation", "Cover_Type"))

plot(h2o.tabulate(cov, "Horizontal_Distance_To_Roadways", "Cover_Type"))

plot(h2o.tabulate(cov, "Soil_Type", "Cover_Type"))

plot(h2o.tabulate(cov, "Horizontal_Distance_To_Roadways", "Elevation" ))

set response and predictors
response<-"Cover_Type"
predictors<-setdiff(names(cov), response)
predictors
[1] "Elevation" "Aspect"
[3] "Slope" "Horizontal_Distance_To_Hydrology"
[5] "Vertical_Distance_To_Hydrology" "Horizontal_Distance_To_Roadways"
[7] "Hillshade_9am" "Hillshade_Noon"
[9] "Hillshade_3pm" "Horizontal_Distance_To_Fire_Points"
[11] "Wilderness_Area" "Soil_Type"
first DL model
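The call that produced model.DL1 does not appear in this log. Below is a minimal sketch consistent with the summary that follows (model key dl_model_first, two hidden Rectifier layers of 200 units, roughly one epoch, variable importances enabled); the exact argument values are an assumption:
model.DL1 <- h2o.deeplearning(
  model_id="dl_model_first",       ## model key referenced below
  training_frame=train,
  validation_frame=valid,          ## validation metrics are reported in the summary
  x=predictors,
  y=response,
  #activation="Rectifier",         ## default
  #hidden=c(200,200),              ## default: two hidden layers with 200 neurons each
  epochs=1,                        ## one pass over the data for a quick first model
  variable_importances=TRUE        ## shown at the end of the summary
)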
summary(model.DL1)
Model Details:
==============
H2OMultinomialModel: deeplearning
Model Key: dl_model_first
Status of Neuron Layers: predicting Cover_Type, 7-class classification, multinomial distribution, CrossEntropy loss, 53,007 weights/biases, 632.3 KB, 349,383 training samples, mini-batch size 1
layer units type dropout l1 l2 mean_rate rate_rms momentum mean_weight
1 1 56 Input 0.00 %
2 2 200 Rectifier 0.00 % 0.000000 0.000000 0.053888 0.217150 0.000000 -0.009997
3 3 200 Rectifier 0.00 % 0.000000 0.000000 0.009023 0.008028 0.000000 -0.025903
4 4 7 Softmax 0.000000 0.000000 0.125248 0.307327 0.000000 -0.308807
weight_rms mean_bias bias_rms
1
2 0.118891 0.031963 0.113345
3 0.118803 0.727847 0.373808
4 0.494686 -0.479510 0.136315
H2OMultinomialMetrics: deeplearning
** Reported on training data. **
** Metrics reported on temporary training frame with 10019 samples **
Training Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.1381905
RMSE: (Extract with `h2o.rmse`) 0.3717398
Logloss: (Extract with `h2o.logloss`) 0.4395264
Mean Per-Class Error: 0.3533335
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,train = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 2482 1120 5 0 3 3 66 0.3254 = 1,197 / 3,679
class_2 148 4635 66 0 15 22 4 0.0521 = 255 / 4,890
class_3 0 24 496 10 0 59 0 0.1579 = 93 / 589
class_4 0 0 21 21 0 4 0 0.5435 = 25 / 46
class_5 3 97 2 0 65 3 0 0.6176 = 105 / 170
class_6 0 50 106 1 0 134 0 0.5395 = 157 / 291
class_7 67 17 0 0 0 0 270 0.2373 = 84 / 354
Totals 2700 5943 696 32 83 225 340 0.1912 = 1,916 / 10,019
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,train = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.808763
2 2 0.982633
3 3 0.997505
4 4 0.999501
5 5 1.000000
6 6 1.000000
7 7 1.000000
H2OMultinomialMetrics: deeplearning
** Reported on validation data. **
** Metrics reported on full validation frame **
Validation Set Metrics:
=====================
Extract validation frame with `h2o.getFrame("valid.hex")`
MSE: (Extract with `h2o.mse`) 0.1409691
RMSE: (Extract with `h2o.rmse`) 0.3754585
Logloss: (Extract with `h2o.logloss`) 0.4486927
Mean Per-Class Error: 0.339294
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,valid = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 28498 13402 21 0 36 13 530 0.3295 = 14,002 / 42,500
class_2 2089 53039 735 2 209 263 43 0.0593 = 3,341 / 56,380
class_3 0 254 6230 68 2 589 0 0.1278 = 913 / 7,143
class_4 0 0 237 286 0 39 0 0.4911 = 276 / 562
class_5 20 1091 74 0 678 7 0 0.6374 = 1,192 / 1,870
class_6 0 580 1119 24 2 1739 0 0.4980 = 1,725 / 3,464
class_7 817 134 0 0 0 0 3148 0.2320 = 951 / 4,099
Totals 31424 68500 8416 380 927 2650 3721 0.1931 = 22,400 / 116,018
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,valid = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.806926
2 2 0.982563
3 3 0.997595
4 4 0.999647
5 5 1.000000
6 6 1.000000
7 7 1.000000
Scoring History:
timestamp duration training_speed epochs iterations samples training_rmse
1 2017-10-06 09:02:46 0.000 sec 0.00000 0 0.000000
2 2017-10-06 09:02:49 5.046 sec 11394 obs/sec 0.10045 1 35060.000000 0.44480
3 2017-10-06 09:03:06 22.086 sec 16911 obs/sec 0.90087 9 314418.000000 0.37490
4 2017-10-06 09:03:09 25.304 sec 17211 obs/sec 1.00105 10 349383.000000 0.37174
training_logloss training_classification_error validation_rmse validation_logloss
1
2 0.61934 0.26090 0.44649 0.62490
3 0.45158 0.19363 0.38334 0.46998
4 0.43953 0.19124 0.37546 0.44869
validation_classification_error
1
2 0.26311
3 0.20249
4 0.19307
Variable Importances: (Extract with `h2o.varimp`)
=================================================
Variable Importances:
variable relative_importance scaled_importance percentage
1 Wilderness_Area.area_0 1.000000 1.000000 0.031628
2 Elevation 0.983167 0.983167 0.031096
3 Horizontal_Distance_To_Roadways 0.981207 0.981207 0.031034
4 Horizontal_Distance_To_Fire_Points 0.841802 0.841802 0.026625
5 Wilderness_Area.area_2 0.820543 0.820543 0.025953
---
variable relative_importance scaled_importance percentage
51 Soil_Type.type_14 0.432849 0.432849 0.013690
52 Hillshade_3pm 0.431671 0.431671 0.013653
53 Slope 0.428544 0.428544 0.013554
54 Aspect 0.280167 0.280167 0.008861
55 Soil_Type.missing(NA) 0.000000 0.000000 0.000000
56 Wilderness_Area.missing(NA) 0.000000 0.000000 0.000000
Since we are using R to control H2O, the focus will be on model performance. To retrieve the model by its key, we can simply type:
h2o.getModel("dl_model_first")
Add early stopping
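As with the first model, the training call is missing from the log. A sketch consistent with the summary below (model key dl_model_faster, three hidden layers of 32 units, early stopping on misclassification with a large epoch budget); the argument values are assumptions:
model.DL2 <- h2o.deeplearning(
  model_id="dl_model_faster",
  training_frame=train,
  validation_frame=valid,
  x=predictors,
  y=response,
  hidden=c(32,32,32),                  ## small network, runs faster
  epochs=1000000,                      ## large budget; early stopping ends training much sooner
  score_validation_samples=10000,      ## downsample validation set for faster scoring
  stopping_rounds=2,
  stopping_metric="misclassification", ## could also be "logloss" or "RMSE"
  stopping_tolerance=0.01              ## stop when misclassification does not improve by >=1% for 2 scoring events
)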
summary(model.DL2)
Model Details:
==============
H2OMultinomialModel: deeplearning
Model Key: dl_model_faster
Status of Neuron Layers: predicting Cover_Type, 7-class classification, multinomial distribution, CrossEntropy loss, 4,167 weights/biases, 57.0 KB, 5,100,821 training samples, mini-batch size 1
layer units type dropout l1 l2 mean_rate rate_rms momentum mean_weight
1 1 56 Input 0.00 %
2 2 32 Rectifier 0.00 % 0.000000 0.000000 0.043453 0.203151 0.000000 -0.007334
3 3 32 Rectifier 0.00 % 0.000000 0.000000 0.000282 0.000193 0.000000 -0.055414
4 4 32 Rectifier 0.00 % 0.000000 0.000000 0.000754 0.001095 0.000000 0.061181
5 5 7 Softmax 0.000000 0.000000 0.125158 0.310982 0.000000 -4.875788
weight_rms mean_bias bias_rms
1
2 0.311607 0.441150 0.441827
3 0.372292 0.534934 0.465208
4 0.571504 0.997466 0.938149
5 3.544398 -2.336402 0.512782
H2OMultinomialMetrics: deeplearning
** Reported on training data. **
** Metrics reported on temporary training frame with 10014 samples **
Training Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.1204502
RMSE: (Extract with `h2o.rmse`) 0.3470594
Logloss: (Extract with `h2o.logloss`) 0.3975742
Mean Per-Class Error: 0.211248
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,train = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 2981 564 1 0 25 4 31 0.1733 = 625 / 3,606
class_2 444 4223 71 1 79 53 5 0.1339 = 653 / 4,876
class_3 0 13 578 5 2 54 0 0.1135 = 74 / 652
class_4 0 0 8 31 0 1 0 0.2250 = 9 / 40
class_5 2 41 5 0 106 1 0 0.3161 = 49 / 155
class_6 1 16 94 7 1 209 0 0.3628 = 119 / 328
class_7 52 3 0 0 0 0 302 0.1541 = 55 / 357
Totals 3480 4860 757 44 213 322 338 0.1582 = 1,584 / 10,014
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,train = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.841821
2 2 0.983223
3 3 0.997404
4 4 0.999800
5 5 1.000000
6 6 1.000000
7 7 1.000000
H2OMultinomialMetrics: deeplearning
** Reported on validation data. **
** Metrics reported on temporary validation frame with 9993 samples **
Validation Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.1201998
RMSE: (Extract with `h2o.rmse`) 0.3466984
Logloss: (Extract with `h2o.logloss`) 0.3923519
Mean Per-Class Error: 0.2282578
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,valid = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 3028 595 0 0 24 0 30 0.1765 = 649 / 3,677
class_2 482 4155 70 0 79 45 2 0.1403 = 678 / 4,833
class_3 0 16 533 5 0 64 0 0.1375 = 85 / 618
class_4 0 0 17 32 0 2 0 0.3725 = 19 / 51
class_5 1 39 4 0 131 3 0 0.2640 = 47 / 178
class_6 1 22 71 3 3 179 0 0.3584 = 100 / 279
class_7 51 2 0 0 0 0 304 0.1485 = 53 / 357
Totals 3563 4829 695 40 237 293 336 0.1632 = 1,631 / 9,993
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,valid = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.836786
2 2 0.985790
3 3 0.997899
4 4 0.999400
5 5 1.000000
6 6 1.000000
7 7 1.000000
Scoring History:
timestamp duration training_speed epochs iterations samples
1 2017-10-06 09:12:47 0.000 sec 0.00000 0 0.000000
2 2017-10-06 09:12:48 1.054 sec 112810 obs/sec 0.28605 1 99837.000000
3 2017-10-06 09:12:54 6.693 sec 138898 obs/sec 2.57767 9 899645.000000
4 2017-10-06 09:12:59 11.897 sec 146100 obs/sec 4.87009 17 1699733.000000
5 2017-10-06 09:13:04 16.994 sec 149838 obs/sec 7.16276 25 2499911.000000
6 2017-10-06 09:13:09 22.012 sec 152423 obs/sec 9.45770 33 3300879.000000
7 2017-10-06 09:13:15 27.613 sec 154283 obs/sec 12.03534 42 4200514.000000
8 2017-10-06 09:13:20 33.145 sec 155935 obs/sec 14.61490 51 5100821.000000
training_rmse training_logloss training_classification_error validation_rmse
1
2 0.43345 0.57840 0.25285 0.43537
3 0.38097 0.46431 0.19762 0.38214
4 0.35902 0.41841 0.17306 0.36160
5 0.35432 0.40474 0.17006 0.35413
6 0.34744 0.39598 0.16018 0.35014
7 0.35340 0.41231 0.16856 0.35832
8 0.34706 0.39757 0.15818 0.34670
validation_logloss validation_classification_error
1
2 0.58195 0.25508
3 0.47078 0.19584
4 0.42128 0.17752
5 0.40627 0.16782
6 0.39457 0.16452
7 0.41678 0.17632
8 0.39235 0.16321
Variable Importances: (Extract with `h2o.varimp`)
=================================================
Variable Importances:
variable relative_importance scaled_importance percentage
1 Wilderness_Area.area_3 1.000000 1.000000 0.035017
2 Elevation 0.951407 0.951407 0.033315
3 Horizontal_Distance_To_Roadways 0.927762 0.927762 0.032487
4 Wilderness_Area.area_0 0.865524 0.865524 0.030308
5 Wilderness_Area.area_1 0.834018 0.834018 0.029205
---
variable relative_importance scaled_importance percentage
51 Soil_Type.type_14 0.299532 0.299532 0.010489
52 Vertical_Distance_To_Hydrology 0.215133 0.215133 0.007533
53 Slope 0.197478 0.197478 0.006915
54 Aspect 0.058334 0.058334 0.002043
55 Soil_Type.missing(NA) 0.000000 0.000000 0.000000
56 Wilderness_Area.missing(NA) 0.000000 0.000000 0.000000
plot(model.DL2)

tuning
model.DL3 <- h2o.deeplearning(
model_id="dl_model_tuned",
training_frame=train,
validation_frame=valid,
x=predictors,
y=response,
overwrite_with_best_model=F, ## Return the final model after 10 epochs, even if not the best
hidden=c(128,128,128), ## more hidden layers -> more complex interactions
epochs=10, ## to keep it short enough
score_validation_samples=10000, ## downsample validation set for faster scoring
score_duty_cycle=0.025, ## don't score more than 2.5% of the wall time
adaptive_rate=F, ## manually tuned learning rate
rate=0.01,
rate_annealing=2e-6,
momentum_start=0.2, ## manually tuned momentum
momentum_stable=0.4,
momentum_ramp=1e7,
l1=1e-5, ## add some L1/L2 regularization
l2=1e-5,
max_w2=10 ## helps stability for Rectifier
)
|=====================================================================================| 100%
summary(model.DL3)
Model Details:
==============
H2OMultinomialModel: deeplearning
Model Key: dl_model_tuned
Status of Neuron Layers: predicting Cover_Type, 7-class classification, multinomial distribution, CrossEntropy loss, 41,223 weights/biases, 332.0 KB, 3,496,969 training samples, mini-batch size 1
layer units type dropout l1 l2 mean_rate rate_rms momentum mean_weight
1 1 56 Input 0.00 %
2 2 128 Rectifier 0.00 % 0.000010 0.000010 0.001251 0.000000 0.269939 -0.010811
3 3 128 Rectifier 0.00 % 0.000010 0.000010 0.001251 0.000000 0.269939 -0.048135
4 4 128 Rectifier 0.00 % 0.000010 0.000010 0.001251 0.000000 0.269939 -0.066310
5 5 7 Softmax 0.000010 0.000010 0.001251 0.000000 0.269939 -0.017610
weight_rms mean_bias bias_rms
1
2 0.315516 -0.105030 0.298393
3 0.217276 0.952453 0.371964
4 0.207231 0.873646 0.183867
5 0.270758 0.021364 2.382584
H2OMultinomialMetrics: deeplearning
** Reported on training data. **
** Metrics reported on temporary training frame with 10101 samples **
Training Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.05807823
RMSE: (Extract with `h2o.rmse`) 0.2409943
Logloss: (Extract with `h2o.logloss`) 0.190546
Mean Per-Class Error: 0.1655679
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,train = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 3353 298 1 0 3 1 14 0.0864 = 317 / 3,670
class_2 245 4601 14 0 21 7 3 0.0593 = 290 / 4,891
class_3 0 18 621 2 3 31 0 0.0800 = 54 / 675
class_4 0 0 13 22 0 2 0 0.4054 = 15 / 37
class_5 4 29 0 0 125 0 0 0.2089 = 33 / 158
class_6 2 17 44 1 0 242 0 0.2092 = 64 / 306
class_7 37 3 0 0 0 0 324 0.1099 = 40 / 364
Totals 3641 4966 693 25 152 283 341 0.0805 = 813 / 10,101
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,train = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.919513
2 2 0.996436
3 3 0.999703
4 4 1.000000
5 5 1.000000
6 6 1.000000
7 7 1.000000
H2OMultinomialMetrics: deeplearning
** Reported on validation data. **
** Metrics reported on temporary validation frame with 10123 samples **
Validation Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.06016481
RMSE: (Extract with `h2o.rmse`) 0.2452852
Logloss: (Extract with `h2o.logloss`) 0.199066
Mean Per-Class Error: 0.1338916
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,valid = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 3429 300 0 0 4 0 24 0.0873 = 328 / 3,757
class_2 229 4614 9 0 25 8 3 0.0561 = 274 / 4,888
class_3 0 19 566 5 1 28 0 0.0856 = 53 / 619
class_4 0 0 8 31 0 3 0 0.2619 = 11 / 42
class_5 6 25 3 0 116 0 0 0.2267 = 34 / 150
class_6 1 14 21 0 0 275 0 0.1158 = 36 / 311
class_7 36 1 0 0 0 0 319 0.1039 = 37 / 356
Totals 3701 4973 607 36 146 314 346 0.0764 = 773 / 10,123
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,valid = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.923639
2 2 0.997432
3 3 0.999704
4 4 1.000000
5 5 1.000000
6 6 1.000000
7 7 1.000000
Scoring History:
timestamp duration training_speed epochs iterations samples
1 2017-10-06 09:21:56 0.000 sec 0.00000 0 0.000000
2 2017-10-06 09:22:00 3.916 sec 27705 obs/sec 0.28664 1 100043.000000
3 2017-10-06 09:22:10 14.041 sec 36952 obs/sec 1.43284 5 500082.000000
4 2017-10-06 09:22:19 23.156 sec 40075 obs/sec 2.57771 9 899658.000000
5 2017-10-06 09:22:28 32.045 sec 41703 obs/sec 3.72233 13 1299150.000000
6 2017-10-06 09:22:37 41.005 sec 42600 obs/sec 4.86899 17 1699351.000000
7 2017-10-06 09:22:46 50.730 sec 42499 obs/sec 6.01686 21 2099974.000000
8 2017-10-06 09:22:56 1 min 0.400 sec 42448 obs/sec 7.16110 25 2499332.000000
9 2017-10-06 09:23:05 1 min 9.397 sec 41354 obs/sec 8.01937 28 2798880.000000
10 2017-10-06 09:23:15 1 min 18.986 sec 41484 obs/sec 9.16190 32 3197639.000000
11 2017-10-06 09:23:22 1 min 26.241 sec 41567 obs/sec 10.01954 35 3496969.000000
training_rmse training_logloss training_classification_error validation_rmse
1
2 0.43813 0.58807 0.25433 0.43908
3 0.34886 0.38898 0.15979 0.35134
4 0.31460 0.32190 0.12939 0.31554
5 0.29547 0.28554 0.11266 0.29710
6 0.27646 0.25116 0.10177 0.27936
7 0.27000 0.23831 0.09573 0.27495
8 0.25563 0.21730 0.08583 0.25924
9 0.25111 0.20968 0.08286 0.25832
10 0.24559 0.20142 0.07979 0.25097
11 0.24099 0.19055 0.08049 0.24529
validation_logloss validation_classification_error
1
2 0.58501 0.25664
3 0.38723 0.16764
4 0.31874 0.12980
5 0.28290 0.11854
6 0.25225 0.10669
7 0.24612 0.10106
8 0.22099 0.08970
9 0.21883 0.09068
10 0.20744 0.08496
11 0.19907 0.07636
Variable Importances: (Extract with `h2o.varimp`)
=================================================
Variable Importances:
variable relative_importance scaled_importance percentage
1 Elevation 1.000000 1.000000 0.049328
2 Horizontal_Distance_To_Roadways 0.950752 0.950752 0.046899
3 Horizontal_Distance_To_Fire_Points 0.938702 0.938702 0.046304
4 Wilderness_Area.area_0 0.788530 0.788530 0.038897
5 Soil_Type.type_31 0.549967 0.549967 0.027129
---
variable relative_importance scaled_importance percentage
51 Soil_Type.type_7 0.160953 0.160953 0.007940
52 Soil_Type.type_13 0.159431 0.159431 0.007864
53 Soil_Type.type_14 0.159139 0.159139 0.007850
54 Soil_Type.type_6 0.141617 0.141617 0.006986
55 Soil_Type.missing(NA) 0.000000 0.000000 0.000000
56 Wilderness_Area.missing(NA) 0.000000 0.000000 0.000000
Let’s compare the training error with the validation and test set errors
h2o.performance(model.DL3, train=T)
H2OMultinomialMetrics: deeplearning
** Reported on training data. **
** Metrics reported on temporary training frame with 10101 samples **
Training Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.05807823
RMSE: (Extract with `h2o.rmse`) 0.2409943
Logloss: (Extract with `h2o.logloss`) 0.190546
Mean Per-Class Error: 0.1655679
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,train = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 3353 298 1 0 3 1 14 0.0864 = 317 / 3,670
class_2 245 4601 14 0 21 7 3 0.0593 = 290 / 4,891
class_3 0 18 621 2 3 31 0 0.0800 = 54 / 675
class_4 0 0 13 22 0 2 0 0.4054 = 15 / 37
class_5 4 29 0 0 125 0 0 0.2089 = 33 / 158
class_6 2 17 44 1 0 242 0 0.2092 = 64 / 306
class_7 37 3 0 0 0 0 324 0.1099 = 40 / 364
Totals 3641 4966 693 25 152 283 341 0.0805 = 813 / 10,101
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,train = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.919513
2 2 0.996436
3 3 0.999703
4 4 1.000000
5 5 1.000000
6 6 1.000000
7 7 1.000000
h2o.performance(model.DL3, valid=T)
H2OMultinomialMetrics: deeplearning
** Reported on validation data. **
** Metrics reported on temporary validation frame with 10123 samples **
Validation Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.06016481
RMSE: (Extract with `h2o.rmse`) 0.2452852
Logloss: (Extract with `h2o.logloss`) 0.199066
Mean Per-Class Error: 0.1338916
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,valid = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 3429 300 0 0 4 0 24 0.0873 = 328 / 3,757
class_2 229 4614 9 0 25 8 3 0.0561 = 274 / 4,888
class_3 0 19 566 5 1 28 0 0.0856 = 53 / 619
class_4 0 0 8 31 0 3 0 0.2619 = 11 / 42
class_5 6 25 3 0 116 0 0 0.2267 = 34 / 150
class_6 1 14 21 0 0 275 0 0.1158 = 36 / 311
class_7 36 1 0 0 0 0 319 0.1039 = 37 / 356
Totals 3701 4973 607 36 146 314 346 0.0764 = 773 / 10,123
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,valid = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.923639
2 2 0.997432
3 3 0.999704
4 4 1.000000
5 5 1.000000
6 6 1.000000
7 7 1.000000
h2o.performance(model.DL3, newdata=train) ## full training data
H2OMultinomialMetrics: deeplearning
Test Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.05680386
RMSE: (Extract with `h2o.rmse`) 0.2383356
Logloss: (Extract with `h2o.logloss`) 0.1880695
Mean Per-Class Error: 0.1371149
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>, <data>)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 116065 10335 5 0 131 22 562 0.0870 = 11,055 / 127,120
class_2 7652 161227 400 1 645 319 98 0.0535 = 9,115 / 170,342
class_3 1 478 19793 91 62 1017 0 0.0769 = 1,649 / 21,442
class_4 0 0 313 1260 0 85 0 0.2400 = 398 / 1,658
class_5 119 1155 70 0 4347 28 1 0.2400 = 1,373 / 5,720
class_6 43 476 1159 35 7 8713 0 0.1649 = 1,720 / 10,433
class_7 1094 104 0 0 1 0 11101 0.0975 = 1,199 / 12,300
Totals 124974 173775 21740 1387 5193 10184 11762 0.0760 = 26,509 / 349,015
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>, <data>)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.924046
2 2 0.996510
3 3 0.999682
4 4 0.999948
5 5 0.999997
6 6 1.000000
7 7 1.000000
h2o.performance(model.DL3, newdata=valid) ## full validation data
H2OMultinomialMetrics: deeplearning
Test Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.06270704
RMSE: (Extract with `h2o.rmse`) 0.2504137
Logloss: (Extract with `h2o.logloss`) 0.2078286
Mean Per-Class Error: 0.1537241
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>, <data>)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 38571 3687 1 0 37 8 196 0.0924 = 3,929 / 42,500
class_2 2811 53002 137 0 254 137 39 0.0599 = 3,378 / 56,380
class_3 0 198 6526 31 24 364 0 0.0864 = 617 / 7,143
class_4 0 0 122 403 0 37 0 0.2829 = 159 / 562
class_5 48 410 33 0 1368 11 0 0.2684 = 502 / 1,870
class_6 14 209 408 14 5 2814 0 0.1876 = 650 / 3,464
class_7 376 26 0 0 1 0 3696 0.0983 = 403 / 4,099
Totals 41820 57532 7227 448 1689 3371 3931 0.0831 = 9,638 / 116,018
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>, <data>)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.916927
2 2 0.995656
3 3 0.999560
4 4 0.999931
5 5 0.999991
6 6 1.000000
7 7 1.000000
h2o.performance(model.DL3, newdata=test) ## full test data
H2OMultinomialMetrics: deeplearning
Test Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.06255105
RMSE: (Extract with `h2o.rmse`) 0.2501021
Logloss: (Extract with `h2o.logloss`) 0.2079834
Mean Per-Class Error: 0.143468
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>, <data>)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 38269 3715 0 0 38 1 197 0.0936 = 3,951 / 42,220
class_2 2778 53200 141 0 264 143 53 0.0597 = 3,379 / 56,579
class_3 0 201 6544 44 23 357 0 0.0872 = 625 / 7,169
class_4 0 0 90 414 0 23 0 0.2144 = 113 / 527
class_5 45 386 29 0 1428 15 0 0.2496 = 475 / 1,903
class_6 26 181 414 20 5 2824 0 0.1862 = 646 / 3,470
class_7 441 26 0 0 0 0 3644 0.1136 = 467 / 4,111
Totals 41559 57709 7218 478 1758 3363 3894 0.0833 = 9,656 / 115,979
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>, <data>)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.916744
2 2 0.995413
3 3 0.999543
4 4 0.999922
5 5 0.999991
6 6 1.000000
7 7 1.000000
To confirm that the confusion matrix reported on the test set above is correct, we make a prediction on the test set and compare it against the actual labels explicitly:
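The assignment that created pred is not shown in the log; presumably it is a prediction of model.DL3 on the test frame:
pred <- h2o.predict(model.DL3, test)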
pred
predict class_1 class_2 class_3 class_4 class_5 class_6
1 class_1 0.6104956948 0.3882843120 5.102326e-04 1.422036e-04 1.348964e-04 1.638090e-04
2 class_1 0.9998101910 0.0001751107 1.213332e-06 1.774480e-08 2.334101e-08 9.071854e-08
3 class_1 0.9997581288 0.0002416531 1.693165e-08 1.481961e-08 1.157406e-08 2.459106e-08
4 class_1 0.9990958104 0.0008910997 4.660538e-06 1.096123e-06 9.205112e-07 3.310508e-07
5 class_2 0.0268636285 0.9730607424 5.180056e-06 5.601822e-07 4.797164e-05 7.898646e-06
6 class_5 0.0002249955 0.1834546000 1.399325e-04 2.806930e-07 8.161799e-01 1.937054e-07
class_7
1 2.688517e-04
2 1.335316e-05
3 1.501860e-07
4 6.081686e-06
5 1.401853e-05
6 1.013247e-07
[115979 rows x 8 columns]
test$Accuracy <- pred$predict == test$Cover_Type  ## 1 where the prediction matches the actual class, 0 otherwise
test$Accuracy
Accuracy
1 0
2 1
3 1
4 1
5 1
6 1
[115979 rows x 1 column]
1 - mean(test$Accuracy)  ## test-set misclassification rate
[1] 0.08325645
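For an explicit confusion-matrix comparison, the predicted and actual columns can also be pulled into R and cross-tabulated. A minimal sketch (as.data.frame downloads the columns into local R memory, which is fine at this size):
pred_local   <- as.data.frame(pred$predict)$predict        ## predicted class labels
actual_local <- as.data.frame(test$Cover_Type)$Cover_Type  ## actual class labels
table(Actual = actual_local, Predicted = pred_local)       ## rows: actual, columns: predicted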
Hyper-parameter tuning with grid search
sampled_train <- train[1:10000,]  ## use a 10,000-row subset to speed up the grid search
hyper_params <- list(
hidden=list(c(32,32,32),c(64,64)),
input_dropout_ratio=c(0,0.05),
rate=c(0.01,0.02),
rate_annealing=c(1e-8,1e-7,1e-6)
)
grid <- h2o.grid(
algorithm="deeplearning",
grid_id="dl_grid",
training_frame=sampled_train,
validation_frame=valid,
x=predictors,
y=response,
epochs=10,
stopping_metric="misclassification",
stopping_tolerance=1e-2, ## stop when misclassification does not improve by >=1% for 2 scoring events
stopping_rounds=2,
score_validation_samples=10000, ## downsample validation set for faster scoring
score_duty_cycle=0.025, ## don't score more than 2.5% of the wall time
adaptive_rate=F, ## manually tuned learning rate
momentum_start=0.5, ## manually tuned momentum
momentum_stable=0.9,
momentum_ramp=1e7,
l1=1e-5,
l2=1e-5,
activation=c("Rectifier"),
max_w2=10, ## can help improve stability for Rectifier
hyper_params=hyper_params
)
|=====================================================================================| 100%
grid
H2O Grid Details
================
Grid ID: dl_grid
Used hyper parameters:
- hidden
- input_dropout_ratio
- rate
- rate_annealing
Number of models: 24
Number of failed models: 0
Hyper-Parameter Search Summary: ordered by increasing logloss
hidden input_dropout_ratio rate rate_annealing model_ids logloss
1 [64, 64] 0.05 0.02 1.0E-6 dl_grid_model_23 0.569671198145718
2 [64, 64] 0.05 0.02 1.0E-7 dl_grid_model_15 0.5853816041731887
3 [64, 64] 0.05 0.01 1.0E-6 dl_grid_model_19 0.5873452379123708
4 [32, 32, 32] 0.0 0.02 1.0E-8 dl_grid_model_4 0.589141734307819
5 [64, 64] 0.0 0.01 1.0E-8 dl_grid_model_1 0.5897429442995311
---
hidden input_dropout_ratio rate rate_annealing model_ids logloss
19 [32, 32, 32] 0.05 0.02 1.0E-8 dl_grid_model_6 0.6300737705366444
20 [32, 32, 32] 0.0 0.02 1.0E-7 dl_grid_model_12 0.6317759057671989
21 [32, 32, 32] 0.05 0.01 1.0E-6 dl_grid_model_18 0.6616221471542507
22 [64, 64] 0.05 0.02 1.0E-8 dl_grid_model_7 0.6674139018531181
23 [64, 64] 0.0 0.01 1.0E-7 dl_grid_model_9 0.6756402490627954
24 [32, 32, 32] 0.05 0.02 1.0E-7 dl_grid_model_14 0.6799036040013375
Which model has the lowest validation error?
grid<-h2o.getGrid("dl_grid", sort_by="err", decreasing = FALSE)
grid
H2O Grid Details
================
Grid ID: dl_grid
Used hyper parameters:
- hidden
- input_dropout_ratio
- rate
- rate_annealing
Number of models: 24
Number of failed models: 0
Hyper-Parameter Search Summary: ordered by increasing err
hidden input_dropout_ratio rate rate_annealing model_ids err
1 [64, 64] 0.05 0.02 1.0E-6 dl_grid_model_23 0.24419184365340513
2 [32, 32, 32] 0.0 0.02 1.0E-8 dl_grid_model_4 0.25191370911621436
3 [32, 32, 32] 0.0 0.01 1.0E-7 dl_grid_model_8 0.25526791089705
4 [64, 64] 0.0 0.01 1.0E-8 dl_grid_model_1 0.25543532214831727
5 [32, 32, 32] 0.0 0.01 1.0E-8 dl_grid_model_0 0.25556331703422813
---
hidden input_dropout_ratio rate rate_annealing model_ids err
19 [64, 64] 0.05 0.01 1.0E-7 dl_grid_model_11 0.27376690533015113
20 [64, 64] 0.0 0.01 1.0E-7 dl_grid_model_9 0.27814437112577484
21 [32, 32, 32] 0.05 0.02 1.0E-8 dl_grid_model_6 0.2803159052284315
22 [64, 64] 0.05 0.02 1.0E-8 dl_grid_model_7 0.2812312011229196
23 [32, 32, 32] 0.05 0.01 1.0E-6 dl_grid_model_18 0.2835508618112982
24 [32, 32, 32] 0.05 0.02 1.0E-7 dl_grid_model_14 0.289982075283808
Which model has the lowest validation logloss?
grid<-h2o.getGrid("dl_grid", sort_by="logloss", decreasing = FALSE)
grid
H2O Grid Details
================
Grid ID: dl_grid
Used hyper parameters:
- hidden
- input_dropout_ratio
- rate
- rate_annealing
Number of models: 24
Number of failed models: 0
Hyper-Parameter Search Summary: ordered by increasing logloss
hidden input_dropout_ratio rate rate_annealing model_ids logloss
1 [64, 64] 0.05 0.02 1.0E-6 dl_grid_model_23 0.569671198145718
2 [64, 64] 0.05 0.02 1.0E-7 dl_grid_model_15 0.5853816041731887
3 [64, 64] 0.05 0.01 1.0E-6 dl_grid_model_19 0.5873452379123708
4 [32, 32, 32] 0.0 0.02 1.0E-8 dl_grid_model_4 0.589141734307819
5 [64, 64] 0.0 0.01 1.0E-8 dl_grid_model_1 0.5897429442995311
---
hidden input_dropout_ratio rate rate_annealing model_ids logloss
19 [32, 32, 32] 0.05 0.02 1.0E-8 dl_grid_model_6 0.6300737705366444
20 [32, 32, 32] 0.0 0.02 1.0E-7 dl_grid_model_12 0.6317759057671989
21 [32, 32, 32] 0.05 0.01 1.0E-6 dl_grid_model_18 0.6616221471542507
22 [64, 64] 0.05 0.02 1.0E-8 dl_grid_model_7 0.6674139018531181
23 [64, 64] 0.0 0.01 1.0E-7 dl_grid_model_9 0.6756402490627954
24 [32, 32, 32] 0.05 0.02 1.0E-7 dl_grid_model_14 0.6799036040013375
Find the best model and its full set of parameters
grid@summary_table[1,]
Hyper-Parameter Search Summary: ordered by increasing logloss
hidden input_dropout_ratio rate rate_annealing model_ids logloss
1 [64, 64] 0.05 0.02 1.0E-6 dl_grid_model_23 0.569671198145718
best_model <- h2o.getModel(grid@model_ids[[1]])
best_model
Model Details:
==============
H2OMultinomialModel: deeplearning
Model ID: dl_grid_model_23
Status of Neuron Layers: predicting Cover_Type, 7-class classification, multinomial distribution, CrossEntropy loss, 8,263 weights/biases, 71.9 KB, 100,000 training samples, mini-batch size 1
layer units type dropout l1 l2 mean_rate rate_rms momentum mean_weight
1 1 56 Input 5.00 %
2 2 64 Rectifier 0.00 % 0.000010 0.000010 0.018182 0.000000 0.504000 -0.019257
3 3 64 Rectifier 0.00 % 0.000010 0.000010 0.018182 0.000000 0.504000 -0.091553
4 4 7 Softmax 0.000010 0.000010 0.018182 0.000000 0.504000 0.019541
weight_rms mean_bias bias_rms
1
2 0.257166 -0.081523 0.248525
3 0.206619 0.720783 0.231155
4 0.388730 -0.022784 1.173140
H2OMultinomialMetrics: deeplearning
** Reported on training data. **
** Metrics reported on full training frame **
Training Set Metrics:
=====================
Extract training frame with `h2o.getFrame("RTMP_sid_9fdd_111")`
MSE: (Extract with `h2o.mse`) 0.1719534
RMSE: (Extract with `h2o.rmse`) 0.4146726
Logloss: (Extract with `h2o.logloss`) 0.5305122
Mean Per-Class Error: 0.4056287
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,train = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 2799 835 1 0 0 0 53 0.2411 = 889 / 3,688
class_2 657 4051 76 0 14 32 5 0.1622 = 784 / 4,835
class_3 0 34 559 20 0 17 0 0.1127 = 71 / 630
class_4 0 0 19 25 0 0 0 0.4318 = 19 / 44
class_5 9 108 4 0 34 1 0 0.7821 = 122 / 156
class_6 0 48 186 1 0 74 0 0.7605 = 235 / 309
class_7 117 1 0 0 0 0 220 0.3491 = 118 / 338
Totals 3582 5077 845 46 48 124 278 0.2238 = 2,238 / 10,000
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,train = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.776200
2 2 0.973200
3 3 0.997000
4 4 0.999300
5 5 1.000000
6 6 1.000000
7 7 1.000000
H2OMultinomialMetrics: deeplearning
** Reported on validation data. **
** Metrics reported on temporary validation frame with 10029 samples **
Validation Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.1840918
RMSE: (Extract with `h2o.rmse`) 0.4290592
Logloss: (Extract with `h2o.logloss`) 0.5696712
Mean Per-Class Error: 0.4315904
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,valid = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 2668 910 0 0 0 0 50 0.2646 = 960 / 3,628
class_2 739 4064 82 0 10 39 3 0.1768 = 873 / 4,937
class_3 0 41 501 24 0 19 0 0.1436 = 84 / 585
class_4 0 0 25 31 0 1 0 0.4561 = 26 / 57
class_5 3 114 6 0 29 1 0 0.8105 = 124 / 153
class_6 1 44 203 2 0 63 0 0.7987 = 250 / 313
class_7 132 0 0 0 0 0 224 0.3708 = 132 / 356
Totals 3543 5173 817 57 39 123 277 0.2442 = 2,449 / 10,029
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,valid = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.755808
2 2 0.970884
3 3 0.995413
4 4 0.999103
5 5 1.000000
6 6 1.000000
7 7 1.000000
print(best_model@allparameters)
$model_id
[1] "dl_grid_model_23"
$training_frame
[1] "RTMP_sid_9fdd_111"
$validation_frame
[1] "valid.hex"
$nfolds
[1] 0
$keep_cross_validation_predictions
[1] FALSE
$keep_cross_validation_fold_assignment
[1] FALSE
$fold_assignment
[1] "AUTO"
$ignore_const_cols
[1] TRUE
$score_each_iteration
[1] FALSE
$balance_classes
[1] FALSE
$max_after_balance_size
[1] 5
$max_confusion_matrix_size
[1] 20
$max_hit_ratio_k
[1] 0
$overwrite_with_best_model
[1] TRUE
$use_all_factor_levels
[1] TRUE
$standardize
[1] TRUE
$activation
[1] "Rectifier"
$hidden
[1] 64 64
$epochs
[1] 10
$train_samples_per_iteration
[1] -2
$target_ratio_comm_to_comp
[1] 0.05
$seed
[1] 5.242031e+18
$adaptive_rate
[1] FALSE
$rho
[1] 0.99
$epsilon
[1] 1e-08
$rate
[1] 0.02
$rate_annealing
[1] 1e-06
$rate_decay
[1] 1
$momentum_start
[1] 0.5
$momentum_ramp
[1] 1e+07
$momentum_stable
[1] 0.9
$nesterov_accelerated_gradient
[1] TRUE
$input_dropout_ratio
[1] 0.05
$l1
[1] 1e-05
$l2
[1] 1e-05
$max_w2
[1] 10
$initial_weight_distribution
[1] "UniformAdaptive"
$initial_weight_scale
[1] 1
$loss
[1] "Automatic"
$distribution
[1] "AUTO"
$quantile_alpha
[1] 0.5
$tweedie_power
[1] 1.5
$huber_alpha
[1] 0.9
$score_interval
[1] 5
$score_training_samples
[1] 10000
$score_validation_samples
[1] 10000
$score_duty_cycle
[1] 0.025
$classification_stop
[1] 0
$regression_stop
[1] 1e-06
$stopping_rounds
[1] 2
$stopping_metric
[1] "misclassification"
$stopping_tolerance
[1] 0.01
$max_runtime_secs
[1] 1.797693e+308
$score_validation_sampling
[1] "Uniform"
$diagnostics
[1] TRUE
$fast_mode
[1] TRUE
$force_load_balance
[1] TRUE
$variable_importances
[1] TRUE
$replicate_training_data
[1] TRUE
$single_node_mode
[1] FALSE
$shuffle_training_data
[1] FALSE
$missing_values_handling
[1] "MeanImputation"
$quiet_mode
[1] FALSE
$autoencoder
[1] FALSE
$sparse
[1] FALSE
$col_major
[1] FALSE
$average_activation
[1] 0
$sparsity_beta
[1] 0
$max_categorical_features
[1] 2147483647
$reproducible
[1] FALSE
$export_weights_and_biases
[1] FALSE
$mini_batch_size
[1] 1
$categorical_encoding
[1] "AUTO"
$elastic_averaging
[1] FALSE
$elastic_averaging_moving_rate
[1] 0.9
$elastic_averaging_regularization
[1] 0.001
$x
[1] "Soil_Type" "Wilderness_Area"
[3] "Elevation" "Aspect"
[5] "Slope" "Horizontal_Distance_To_Hydrology"
[7] "Vertical_Distance_To_Hydrology" "Horizontal_Distance_To_Roadways"
[9] "Hillshade_9am" "Hillshade_Noon"
[11] "Hillshade_3pm" "Horizontal_Distance_To_Fire_Points"
$y
[1] "Cover_Type"
print(h2o.performance(best_model, valid=T))
H2OMultinomialMetrics: deeplearning
** Reported on validation data. **
** Metrics reported on temporary validation frame with 10029 samples **
Validation Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.1840918
RMSE: (Extract with `h2o.rmse`) 0.4290592
Logloss: (Extract with `h2o.logloss`) 0.5696712
Mean Per-Class Error: 0.4315904
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,valid = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 2668 910 0 0 0 0 50 0.2646 = 960 / 3,628
class_2 739 4064 82 0 10 39 3 0.1768 = 873 / 4,937
class_3 0 41 501 24 0 19 0 0.1436 = 84 / 585
class_4 0 0 25 31 0 1 0 0.4561 = 26 / 57
class_5 3 114 6 0 29 1 0 0.8105 = 124 / 153
class_6 1 44 203 2 0 63 0 0.7987 = 250 / 313
class_7 132 0 0 0 0 0 224 0.3708 = 132 / 356
Totals 3543 5173 817 57 39 123 277 0.2442 = 2,449 / 10,029
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,valid = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.755808
2 2 0.970884
3 3 0.995413
4 4 0.999103
5 5 1.000000
6 6 1.000000
7 7 1.000000
print(h2o.logloss(best_model, valid=T))
[1] 0.5696712
Random Hyper-Parameter Search
hyper_params <- list(
activation=c("Rectifier","Tanh","Maxout","RectifierWithDropout","TanhWithDropout","MaxoutWithDropout"),
hidden=list(c(20,20),c(50,50),c(30,30,30),c(25,25,25,25)),
input_dropout_ratio=c(0,0.05),
l1=seq(0,1e-4,1e-6),
l2=seq(0,1e-4,1e-6)
)
search_criteria = list(strategy = "RandomDiscrete", max_runtime_secs = 360, max_models = 100, seed=1234567, stopping_rounds=5, stopping_tolerance=1e-2)
dl_random_grid <- h2o.grid(
algorithm="deeplearning",
grid_id = "dl_grid_random",
training_frame=sampled_train,
validation_frame=valid,
x=predictors,
y=response,
epochs=1,
stopping_metric="logloss",
stopping_tolerance=1e-2, ## stop when logloss does not improve by >=1% for 2 scoring events
stopping_rounds=2,
score_validation_samples=10000, ## downsample validation set for faster scoring
score_duty_cycle=0.025, ## don't score more than 2.5% of the wall time
max_w2=10, ## can help improve stability for Rectifier
hyper_params = hyper_params,
search_criteria = search_criteria
)
|=====================================================================================| 100%
grid <- h2o.getGrid("dl_grid_random",sort_by="logloss",decreasing=FALSE)
grid
H2O Grid Details
================
Grid ID: dl_grid_random
Used hyper parameters:
- activation
- hidden
- input_dropout_ratio
- l1
- l2
Number of models: 100
Number of failed models: 0
Hyper-Parameter Search Summary: ordered by increasing logloss
activation hidden input_dropout_ratio l1 l2 model_ids
1 Rectifier [50, 50] 0.0 9.7E-5 2.0E-6 dl_grid_random_model_44
2 Rectifier [50, 50] 0.0 2.0E-5 4.9E-5 dl_grid_random_model_65
3 Tanh [30, 30, 30] 0.0 5.7E-5 9.8E-5 dl_grid_random_model_47
4 Tanh [30, 30, 30] 0.0 3.3E-5 1.1E-5 dl_grid_random_model_48
5 Maxout [30, 30, 30] 0.0 6.0E-5 9.1E-5 dl_grid_random_model_3
logloss
1 0.6605639097651821
2 0.6644398487564103
3 0.6679234738744023
4 0.6683151852389394
5 0.6720450370479197
---
activation hidden input_dropout_ratio l1 l2
95 MaxoutWithDropout [30, 30, 30] 0.0 4.8E-5 8.1E-5
96 MaxoutWithDropout [25, 25, 25, 25] 0.0 8.8E-5 4.3E-5
97 MaxoutWithDropout [25, 25, 25, 25] 0.05 5.0E-5 1.5E-5
98 MaxoutWithDropout [25, 25, 25, 25] 0.0 2.2E-5 8.2E-5
99 MaxoutWithDropout [25, 25, 25, 25] 0.05 5.9E-5 3.2E-5
100 MaxoutWithDropout [25, 25, 25, 25] 0.0 6.7E-5 1.5E-5
model_ids logloss
95 dl_grid_random_model_72 1.366595350132344
96 dl_grid_random_model_99 1.4608458870815466
97 dl_grid_random_model_59 1.52063747492084
98 dl_grid_random_model_73 1.6602057928125216
99 dl_grid_random_model_55 1.732893001138338
100 dl_grid_random_model_87 2.0260795967015732
grid@summary_table[1,]
Hyper-Parameter Search Summary: ordered by increasing logloss
activation hidden input_dropout_ratio l1 l2 model_ids
1 Rectifier [50, 50] 0.0 9.7E-5 2.0E-6 dl_grid_random_model_44
logloss
1 0.6605639097651821
best_model <- h2o.getModel(grid@model_ids[[1]]) ## model with lowest logloss
best_model
Model Details:
==============
H2OMultinomialModel: deeplearning
Model ID: dl_grid_random_model_44
Status of Neuron Layers: predicting Cover_Type, 7-class classification, multinomial distribution, CrossEntropy loss, 5,757 weights/biases, 75.1 KB, 10,835 training samples, mini-batch size 1
layer units type dropout l1 l2 mean_rate rate_rms momentum mean_weight
1 1 56 Input 0.00 %
2 2 50 Rectifier 0.00 % 0.000097 0.000002 0.052188 0.210668 0.000000 -0.002207
3 3 50 Rectifier 0.00 % 0.000097 0.000002 0.003062 0.001594 0.000000 -0.001186
4 4 7 Softmax 0.000097 0.000002 0.009306 0.068111 0.000000 -0.075202
weight_rms mean_bias bias_rms
1
2 0.139820 0.436044 0.067020
3 0.143040 0.983014 0.034908
4 0.430450 -0.026954 0.014811
H2OMultinomialMetrics: deeplearning
** Reported on training data. **
** Metrics reported on full training frame **
Training Set Metrics:
=====================
Extract training frame with `h2o.getFrame("RTMP_sid_9fdd_111")`
MSE: (Extract with `h2o.mse`) 0.2219503
RMSE: (Extract with `h2o.rmse`) 0.471116
Logloss: (Extract with `h2o.logloss`) 0.673122
Mean Per-Class Error: 0.5545455
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,train = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 2678 972 2 0 0 2 34 0.2739 = 1,010 / 3,688
class_2 999 3660 120 0 4 44 8 0.2430 = 1,175 / 4,835
class_3 0 33 554 0 0 43 0 0.1206 = 76 / 630
class_4 0 0 43 0 0 1 0 1.0000 = 44 / 44
class_5 3 133 12 0 7 1 0 0.9551 = 149 / 156
class_6 0 54 167 0 0 88 0 0.7152 = 221 / 309
class_7 193 0 1 0 0 0 144 0.5740 = 194 / 338
Totals 3873 4852 899 0 11 179 186 0.2869 = 2,869 / 10,000
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,train = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.713100
2 2 0.959900
3 3 0.993400
4 4 0.998100
5 5 0.999800
6 6 0.999900
7 7 1.000000
H2OMultinomialMetrics: deeplearning
** Reported on validation data. **
** Metrics reported on temporary validation frame with 9999 samples **
Validation Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.2180249
RMSE: (Extract with `h2o.rmse`) 0.4669313
Logloss: (Extract with `h2o.logloss`) 0.6605639
Mean Per-Class Error: 0.5684699
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,valid = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 2652 965 1 0 0 2 45 0.2764 = 1,013 / 3,665
class_2 1017 3761 117 0 7 38 2 0.2390 = 1,181 / 4,942
class_3 0 18 511 0 0 51 0 0.1190 = 69 / 580
class_4 0 0 48 0 0 0 0 1.0000 = 48 / 48
class_5 5 139 11 0 8 3 0 0.9518 = 158 / 166
class_6 0 45 157 0 0 81 0 0.7138 = 202 / 283
class_7 214 0 0 0 0 0 101 0.6794 = 214 / 315
Totals 3888 4928 845 0 15 175 148 0.2885 = 2,885 / 9,999
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,valid = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.711471
2 2 0.962696
3 3 0.992799
4 4 0.997400
5 5 1.000000
6 6 1.000000
7 7 1.000000
look at the model with the lowest validation misclassification rate
grid <- h2o.getGrid("dl_grid_random",sort_by="err",decreasing=FALSE)
best_model <- h2o.getModel(grid@model_ids[[1]]) ## model with lowest classification error (on validation, since it was available during training)
h2o.confusionMatrix(best_model,valid=T)
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 2419 1147 1 0 0 0 92 0.3389 = 1,240 / 3,659
class_2 742 3985 52 0 12 42 11 0.1773 = 859 / 4,844
class_3 0 74 441 0 8 84 0 0.2735 = 166 / 607
class_4 0 0 42 0 0 15 0 1.0000 = 57 / 57
class_5 0 152 3 0 0 0 0 1.0000 = 155 / 155
class_6 0 75 101 0 1 121 0 0.5940 = 177 / 298
class_7 143 1 0 0 0 0 239 0.3760 = 144 / 383
Totals 3304 5434 640 0 21 262 342 0.2797 = 2,798 / 10,003
best_params <- best_model@allparameters
best_params$activation
[1] "Rectifier"
best_params$hidden
[1] 50 50
best_params$input_dropout_ratio
[1] 0
best_params$l1
[1] 3.5e-05
best_params$l2
[1] 5.5e-05
do checkpointing
max_epochs <- 12 ## Add two more epochs
m_cont <- h2o.deeplearning(
model_id="dl_model_tuned_continued",
checkpoint="dl_model_tuned",
training_frame=train,
validation_frame=valid,
x=predictors,
y=response,
hidden=c(128,128,128), ## more hidden layers -> more complex interactions
epochs=max_epochs, ## hopefully long enough to converge (otherwise restart again)
stopping_metric="logloss", ## logloss is directly optimized by Deep Learning
stopping_tolerance=1e-2, ## stop when validation logloss does not improve by >=1% for 2 scoring events
stopping_rounds=2,
score_validation_samples=10000, ## downsample validation set for faster scoring
score_duty_cycle=0.025, ## don't score more than 2.5% of the wall time
adaptive_rate=F, ## manually tuned learning rate
rate=0.01,
rate_annealing=2e-6,
momentum_start=0.2, ## manually tuned momentum
momentum_stable=0.4,
momentum_ramp=1e7,
l1=1e-5, ## add some L1/L2 regularization
l2=1e-5,
max_w2=10 ## helps stability for Rectifier
)
|=====================================================================================| 100%
summary(m_cont)
Model Details:
==============
H2OMultinomialModel: deeplearning
Model Key: dl_model_tuned_continued
Status of Neuron Layers: predicting Cover_Type, 7-class classification, multinomial distribution, CrossEntropy loss, 41,223 weights/biases, 331.5 KB, 3,897,371 training samples, mini-batch size 1
layer units type dropout l1 l2 mean_rate rate_rms momentum mean_weight
1 1 56 Input 0.00 %
2 2 128 Rectifier 0.00 % 0.000010 0.000010 0.001137 0.000000 0.277947 -0.010811
3 3 128 Rectifier 0.00 % 0.000010 0.000010 0.001137 0.000000 0.277947 -0.048135
4 4 128 Rectifier 0.00 % 0.000010 0.000010 0.001137 0.000000 0.277947 -0.066310
5 5 7 Softmax 0.000010 0.000010 0.001137 0.000000 0.277947 -0.017610
weight_rms mean_bias bias_rms
1
2 0.315516 -0.105030 0.298393
3 0.217276 0.952453 0.371964
4 0.207231 0.873646 0.183867
5 0.270758 0.021364 2.382584
H2OMultinomialMetrics: deeplearning
** Reported on training data. **
** Metrics reported on temporary training frame with 9958 samples **
Training Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.05657894
RMSE: (Extract with `h2o.rmse`) 0.2378633
Logloss: (Extract with `h2o.logloss`) 0.1879827
Mean Per-Class Error: 0.1447977
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,train = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 3258 302 0 0 3 1 18 0.0905 = 324 / 3,582
class_2 203 4735 14 0 15 10 2 0.0490 = 244 / 4,979
class_3 0 15 546 2 2 31 0 0.0839 = 50 / 596
class_4 0 0 7 35 0 1 0 0.1860 = 8 / 43
class_5 1 33 5 0 92 1 0 0.3030 = 40 / 132
class_6 0 16 38 1 1 224 0 0.2000 = 56 / 280
class_7 35 0 0 0 0 0 311 0.1012 = 35 / 346
Totals 3497 5101 610 38 113 268 331 0.0760 = 757 / 9,958
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,train = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.923981
2 2 0.996485
3 3 0.999699
4 4 1.000000
5 5 1.000000
6 6 1.000000
7 7 1.000000
H2OMultinomialMetrics: deeplearning
** Reported on validation data. **
** Metrics reported on temporary validation frame with 9983 samples **
Validation Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.06307016
RMSE: (Extract with `h2o.rmse`) 0.2511377
Logloss: (Extract with `h2o.logloss`) 0.2102238
Mean Per-Class Error: 0.1646872
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,valid = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 3292 310 0 0 1 3 17 0.0914 = 331 / 3,623
class_2 224 4620 18 0 19 13 5 0.0570 = 279 / 4,899
class_3 0 25 553 3 2 35 0 0.1052 = 65 / 618
class_4 0 0 11 37 0 3 0 0.2745 = 14 / 51
class_5 5 38 2 0 102 0 0 0.3061 = 45 / 147
class_6 1 23 34 1 2 226 0 0.2125 = 61 / 287
class_7 36 2 0 0 0 0 320 0.1061 = 38 / 358
Totals 3558 5018 618 41 126 280 342 0.0834 = 833 / 9,983
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,valid = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.916558
2 2 0.994791
3 3 0.999299
4 4 0.999699
5 5 0.999900
6 6 1.000000
7 7 1.000000
Scoring History:
timestamp duration training_speed epochs iterations samples
1 2017-10-06 09:21:56 0.000 sec 0.00000 0 0.000000
2 2017-10-06 09:22:00 3.916 sec 27705 obs/sec 0.28664 1 100043.000000
3 2017-10-06 09:22:10 14.041 sec 36952 obs/sec 1.43284 5 500082.000000
4 2017-10-06 09:22:19 23.156 sec 40075 obs/sec 2.57771 9 899658.000000
5 2017-10-06 09:22:28 32.045 sec 41703 obs/sec 3.72233 13 1299150.000000
6 2017-10-06 09:22:37 41.005 sec 42600 obs/sec 4.86899 17 1699351.000000
7 2017-10-06 09:22:46 50.730 sec 42499 obs/sec 6.01686 21 2099974.000000
8 2017-10-06 09:22:56 1 min 0.400 sec 42448 obs/sec 7.16110 25 2499332.000000
9 2017-10-06 09:23:05 1 min 9.397 sec 41354 obs/sec 8.01937 28 2798880.000000
10 2017-10-06 09:23:15 1 min 18.986 sec 41484 obs/sec 9.16190 32 3197639.000000
11 2017-10-06 09:23:22 1 min 26.241 sec 41567 obs/sec 10.01954 35 3496969.000000
12 2017-10-06 10:25:03 1 min 29.328 sec 41377 obs/sec 10.30675 36 3597210.000000
13 2017-10-06 10:25:11 1 min 37.672 sec 40979 obs/sec 11.16677 39 3897371.000000
14 2017-10-06 10:25:11 1 min 37.850 sec 40978 obs/sec 11.16677 39 3897371.000000
training_rmse training_logloss training_classification_error validation_rmse
1
2 0.43813 0.58807 0.25433 0.43908
3 0.34886 0.38898 0.15979 0.35134
4 0.31460 0.32190 0.12939 0.31554
5 0.29547 0.28554 0.11266 0.29710
6 0.27646 0.25116 0.10177 0.27936
7 0.27000 0.23831 0.09573 0.27495
8 0.25563 0.21730 0.08583 0.25924
9 0.25111 0.20968 0.08286 0.25832
10 0.24559 0.20142 0.07979 0.25097
11 0.24099 0.19055 0.08049 0.24529
12 0.23784 0.18875 0.07602 0.25179
13 0.23658 0.18653 0.07542 0.25430
14 0.23786 0.18798 0.07602 0.25114
validation_logloss validation_classification_error
1
2 0.58501 0.25664
3 0.38723 0.16764
4 0.31874 0.12980
5 0.28290 0.11854
6 0.25225 0.10669
7 0.24612 0.10106
8 0.22099 0.08970
9 0.21883 0.09068
10 0.20744 0.08496
11 0.19907 0.07636
12 0.21513 0.08294
13 0.21704 0.08695
14 0.21022 0.08344
Variable Importances: (Extract with `h2o.varimp`)
=================================================
Variable Importances:
variable relative_importance scaled_importance percentage
1 Elevation 1.000000 1.000000 0.049328
2 Horizontal_Distance_To_Roadways 0.950752 0.950752 0.046899
3 Horizontal_Distance_To_Fire_Points 0.938702 0.938702 0.046304
4 Wilderness_Area.area_0 0.788530 0.788530 0.038897
5 Soil_Type.type_31 0.549967 0.549967 0.027129
---
variable relative_importance scaled_importance percentage
51 Soil_Type.type_7 0.160953 0.160953 0.007940
52 Soil_Type.type_13 0.159431 0.159431 0.007864
53 Soil_Type.type_14 0.159139 0.159139 0.007850
54 Soil_Type.type_6 0.141617 0.141617 0.006986
55 Soil_Type.missing(NA) 0.000000 0.000000 0.000000
56 Wilderness_Area.missing(NA) 0.000000 0.000000 0.000000
plot(m_cont)

save model
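The h2o.saveModel() call itself is not in the log; judging from the printed path below, it was along these lines (the directory comes from the output; force=TRUE is an assumption):
path <- h2o.saveModel(m_cont,
                      path="C:\\Users\\r631758\\Desktop\\r631758\\R codes\\H2O\\exercise\\mybest_deeplearning_covtype_model",
                      force=TRUE)   ## overwrite if a file with the same name already exists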
load the model
print(path)
[1] "C:\\Users\\r631758\\Desktop\\r631758\\R codes\\H2O\\exercise\\mybest_deeplearning_covtype_model\\dl_model_tuned_continued"
m_loaded<-h2o.loadModel(path)
summary(m_loaded)
Model Details:
==============
H2OMultinomialModel: deeplearning
Model Key: dl_model_tuned_continued
Status of Neuron Layers: predicting Cover_Type, 7-class classification, multinomial distribution, CrossEntropy loss, 41,223 weights/biases, 331.5 KB, 3,897,371 training samples, mini-batch size 1
layer units type dropout l1 l2 mean_rate rate_rms momentum mean_weight
1 1 56 Input 0.00 %
2 2 128 Rectifier 0.00 % 0.000010 0.000010 0.001137 0.000000 0.277947 -0.010811
3 3 128 Rectifier 0.00 % 0.000010 0.000010 0.001137 0.000000 0.277947 -0.048135
4 4 128 Rectifier 0.00 % 0.000010 0.000010 0.001137 0.000000 0.277947 -0.066310
5 5 7 Softmax 0.000010 0.000010 0.001137 0.000000 0.277947 -0.017610
weight_rms mean_bias bias_rms
1
2 0.315516 -0.105030 0.298393
3 0.217276 0.952453 0.371964
4 0.207231 0.873646 0.183867
5 0.270758 0.021364 2.382584
H2OMultinomialMetrics: deeplearning
** Reported on training data. **
** Metrics reported on temporary training frame with 9958 samples **
Training Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.05657894
RMSE: (Extract with `h2o.rmse`) 0.2378633
Logloss: (Extract with `h2o.logloss`) 0.1879827
Mean Per-Class Error: 0.1447977
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,train = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 3258 302 0 0 3 1 18 0.0905 = 324 / 3,582
class_2 203 4735 14 0 15 10 2 0.0490 = 244 / 4,979
class_3 0 15 546 2 2 31 0 0.0839 = 50 / 596
class_4 0 0 7 35 0 1 0 0.1860 = 8 / 43
class_5 1 33 5 0 92 1 0 0.3030 = 40 / 132
class_6 0 16 38 1 1 224 0 0.2000 = 56 / 280
class_7 35 0 0 0 0 0 311 0.1012 = 35 / 346
Totals 3497 5101 610 38 113 268 331 0.0760 = 757 / 9,958
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,train = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.923981
2 2 0.996485
3 3 0.999699
4 4 1.000000
5 5 1.000000
6 6 1.000000
7 7 1.000000
H2OMultinomialMetrics: deeplearning
** Reported on validation data. **
** Metrics reported on temporary validation frame with 9983 samples **
Validation Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.06307016
RMSE: (Extract with `h2o.rmse`) 0.2511377
Logloss: (Extract with `h2o.logloss`) 0.2102238
Mean Per-Class Error: 0.1646872
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,valid = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 3292 310 0 0 1 3 17 0.0914 = 331 / 3,623
class_2 224 4620 18 0 19 13 5 0.0570 = 279 / 4,899
class_3 0 25 553 3 2 35 0 0.1052 = 65 / 618
class_4 0 0 11 37 0 3 0 0.2745 = 14 / 51
class_5 5 38 2 0 102 0 0 0.3061 = 45 / 147
class_6 1 23 34 1 2 226 0 0.2125 = 61 / 287
class_7 36 2 0 0 0 0 320 0.1061 = 38 / 358
Totals 3558 5018 618 41 126 280 342 0.0834 = 833 / 9,983
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,valid = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.916558
2 2 0.994791
3 3 0.999299
4 4 0.999699
5 5 0.999900
6 6 1.000000
7 7 1.000000
Scoring History:
timestamp duration training_speed epochs iterations samples
1 2017-10-06 09:21:56 0.000 sec 0.00000 0 0.000000
2 2017-10-06 09:22:00 3.916 sec 27705 obs/sec 0.28664 1 100043.000000
3 2017-10-06 09:22:10 14.041 sec 36952 obs/sec 1.43284 5 500082.000000
4 2017-10-06 09:22:19 23.156 sec 40075 obs/sec 2.57771 9 899658.000000
5 2017-10-06 09:22:28 32.045 sec 41703 obs/sec 3.72233 13 1299150.000000
6 2017-10-06 09:22:37 41.005 sec 42600 obs/sec 4.86899 17 1699351.000000
7 2017-10-06 09:22:46 50.730 sec 42499 obs/sec 6.01686 21 2099974.000000
8 2017-10-06 09:22:56 1 min 0.400 sec 42448 obs/sec 7.16110 25 2499332.000000
9 2017-10-06 09:23:05 1 min 9.397 sec 41354 obs/sec 8.01937 28 2798880.000000
10 2017-10-06 09:23:15 1 min 18.986 sec 41484 obs/sec 9.16190 32 3197639.000000
11 2017-10-06 09:23:22 1 min 26.241 sec 41567 obs/sec 10.01954 35 3496969.000000
12 2017-10-06 10:25:03 1 min 29.328 sec 41377 obs/sec 10.30675 36 3597210.000000
13 2017-10-06 10:25:11 1 min 37.672 sec 40979 obs/sec 11.16677 39 3897371.000000
14 2017-10-06 10:25:11 1 min 37.850 sec 40978 obs/sec 11.16677 39 3897371.000000
training_rmse training_logloss training_classification_error validation_rmse
1
2 0.43813 0.58807 0.25433 0.43908
3 0.34886 0.38898 0.15979 0.35134
4 0.31460 0.32190 0.12939 0.31554
5 0.29547 0.28554 0.11266 0.29710
6 0.27646 0.25116 0.10177 0.27936
7 0.27000 0.23831 0.09573 0.27495
8 0.25563 0.21730 0.08583 0.25924
9 0.25111 0.20968 0.08286 0.25832
10 0.24559 0.20142 0.07979 0.25097
11 0.24099 0.19055 0.08049 0.24529
12 0.23784 0.18875 0.07602 0.25179
13 0.23658 0.18653 0.07542 0.25430
14 0.23786 0.18798 0.07602 0.25114
validation_logloss validation_classification_error
1
2 0.58501 0.25664
3 0.38723 0.16764
4 0.31874 0.12980
5 0.28290 0.11854
6 0.25225 0.10669
7 0.24612 0.10106
8 0.22099 0.08970
9 0.21883 0.09068
10 0.20744 0.08496
11 0.19907 0.07636
12 0.21513 0.08294
13 0.21704 0.08695
14 0.21022 0.08344
Variable Importances: (Extract with `h2o.varimp`)
=================================================
Variable Importances:
variable relative_importance scaled_importance percentage
1 Elevation 1.000000 1.000000 0.049328
2 Horizontal_Distance_To_Roadways 0.950752 0.950752 0.046899
3 Horizontal_Distance_To_Fire_Points 0.938702 0.938702 0.046304
4 Wilderness_Area.area_0 0.788530 0.788530 0.038897
5 Soil_Type.type_31 0.549967 0.549967 0.027129
---
variable relative_importance scaled_importance percentage
51 Soil_Type.type_7 0.160953 0.160953 0.007940
52 Soil_Type.type_13 0.159431 0.159431 0.007864
53 Soil_Type.type_14 0.159139 0.159139 0.007850
54 Soil_Type.type_6 0.141617 0.141617 0.006986
55 Soil_Type.missing(NA) 0.000000 0.000000 0.000000
56 Wilderness_Area.missing(NA) 0.000000 0.000000 0.000000
cross validation
For N-fold cross-validation, specify nfolds>1 instead of (or in addition to) a validation frame, and N+1 models will be built: one model on the full training data, and N models each trained with a different 1/N of the data held out (several holdout strategies are available). Each of those N models then scores its held-out fold, and the combined holdout predictions over the full training data are scored to obtain the cross-validation metrics.
dlmodel <- h2o.deeplearning(
x=predictors,
y=response,
training_frame=train,
hidden=c(10,10),
epochs=1,
nfolds=5,
fold_assignment="Modulo" # can be "AUTO", "Modulo", "Random" or "Stratified"
)
|
| | 0%
|
|================= | 20%
|
|================================== | 40%
|
|====================================================== | 64%
|
|========================================================================= | 85%
|
|==================================================================================== | 99%
|
|=====================================================================================| 100%
dlmodel
Model Details:
==============
H2OMultinomialModel: deeplearning
Model ID: DeepLearning_model_R_1507305162426_1
Status of Neuron Layers: predicting Cover_Type, 7-class classification, multinomial distribution, CrossEntropy loss, 757 weights/biases, 14.7 KB, 364,594 training samples, mini-batch size 1
layer units type dropout l1 l2 mean_rate rate_rms momentum mean_weight
1 1 56 Input 0.00 %
2 2 10 Rectifier 0.00 % 0.000000 0.000000 0.060172 0.232994 0.000000 -0.004632
3 3 10 Rectifier 0.00 % 0.000000 0.000000 0.000835 0.000636 0.000000 -0.039448
4 4 7 Softmax 0.000000 0.000000 0.044997 0.189531 0.000000 -1.296691
weight_rms mean_bias bias_rms
1
2 0.207322 0.277274 0.310572
3 0.415587 0.846932 0.386649
4 1.890843 -0.734529 0.451531
H2OMultinomialMetrics: deeplearning
** Reported on training data. **
** Metrics reported on temporary training frame with 10076 samples **
Training Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.1934122
RMSE: (Extract with `h2o.rmse`) 0.4397866
Logloss: (Extract with `h2o.logloss`) 0.6023489
Mean Per-Class Error: 0.429122
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>,train = TRUE)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
class_1 class_2 class_3 class_4 class_5 class_6 class_7 Error Rate
class_1 2800 772 3 0 0 8 84 0.2364 = 867 / 3,667
class_2 966 3773 88 0 2 40 3 0.2256 = 1,099 / 4,872
class_3 0 46 560 14 0 32 0 0.1411 = 92 / 652
class_4 0 0 19 34 0 1 0 0.3704 = 20 / 54
class_5 5 125 8 0 13 0 0 0.9139 = 138 / 151
class_6 1 61 182 3 0 61 0 0.8019 = 247 / 308
class_7 113 4 0 0 0 0 255 0.3145 = 117 / 372
Totals 3885 4781 860 51 15 142 342 0.2561 = 2,580 / 10,076
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,train = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.743946
2 2 0.966356
3 3 0.994442
4 4 0.997916
5 5 0.999702
6 6 1.000000
7 7 1.000000
H2OMultinomialMetrics: deeplearning
** Reported on cross-validation data. **
** 5-fold cross-validation on training data (Metrics computed for combined holdout predictions) **
Cross-Validation Set Metrics:
=====================
Extract cross-validation frame with `h2o.getFrame("train.hex")`
MSE: (Extract with `h2o.mse`) 0.1939382
RMSE: (Extract with `h2o.rmse`) 0.4403842
Logloss: (Extract with `h2o.logloss`) 0.6046599
Mean Per-Class Error: 0.4962742
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,xval = TRUE)`
=======================================================================
Top-7 Hit Ratios:
k hit_ratio
1 1 0.739653
2 2 0.968211
3 3 0.994040
4 4 0.998459
5 5 0.999734
6 6 0.999905
7 7 1.000000
Cross-Validation Metrics Summary:
mean sd cv_1_valid cv_2_valid cv_3_valid cv_4_valid
accuracy 0.73965305 0.0017962467 0.7390227 0.74208844 0.7349827 0.7411859
err 0.26034698 0.0017962467 0.26097733 0.25791156 0.26501727 0.2588141
err_count 18173.0 125.383415 18217.0 18003.0 18499.0 18066.0
logloss 0.60465986 0.0036992545 0.612202 0.59924364 0.6091049 0.6035284
max_per_class_error 0.93898463 0.017054334 0.9501748 0.9222615 0.9105403 0.93233746
mean_per_class_accuracy 0.5039658 0.015003628 0.4644207 0.50871575 0.52845037 0.5115771
mean_per_class_error 0.49603423 0.015003628 0.53557926 0.49128422 0.47154963 0.48842287
mse 0.1939382 0.001298748 0.19492394 0.19286226 0.19710253 0.19240879
r2 0.90045166 0.0015482176 0.9002755 0.90029657 0.8968938 0.9010371
rmse 0.44037923 0.0014718772 0.44150192 0.43916085 0.4439623 0.43864426
cv_5_valid
accuracy 0.74098533
err 0.25901467
err_count 18080.0
logloss 0.5992204
max_per_class_error 0.9796092
mean_per_class_accuracy 0.5066649
mean_per_class_error 0.4933351
mse 0.1923935
r2 0.9037552
rmse 0.43862683
Regression and Binary Classification
summary(dlmodel)
Model Details:
==============
H2ORegressionModel: deeplearning
Model Key: DeepLearning_model_R_1507305162426_6
Status of Neuron Layers: predicting bin_response, regression, gaussian distribution, Quadratic loss, 691 weights/biases, 13.9 KB, 38,275 training samples, mini-batch size 1
layer units type dropout l1 l2 mean_rate rate_rms momentum mean_weight
1 1 56 Input 0.00 %
2 2 10 Rectifier 0.00 % 0.000000 0.000000 0.060061 0.229440 0.000000 -0.001721
3 3 10 Rectifier 0.00 % 0.000000 0.000000 0.001306 0.000772 0.000000 -0.020185
4 4 1 Linear 0.000000 0.000000 0.000391 0.000183 0.000000 -0.005553
weight_rms mean_bias bias_rms
1
2 0.191966 0.433413 0.173285
3 0.320735 0.957016 0.031317
4 0.444293 0.008284 0.000000
H2ORegressionMetrics: deeplearning
** Reported on training data. **
** Metrics reported on temporary training frame with 10100 samples **
MSE: 0.1481445
RMSE: 0.3848954
MAE: 0.3119092
RMSLE: 0.2723728
Mean Residual Deviance : 0.1481445
Scoring History:
timestamp duration training_speed epochs iterations samples training_rmse
1 2017-10-06 11:04:09 0.000 sec 0.00000 0 0.000000
2 2017-10-06 11:04:09 0.197 sec 200611 obs/sec 0.01035 1 3611.000000 0.41927
3 2017-10-06 11:04:09 0.329 sec 281433 obs/sec 0.10967 11 38275.000000 0.38490
training_deviance training_mae
1
2 0.17579 0.37207
3 0.14814 0.31191
Variable Importances: (Extract with `h2o.varimp`)
=================================================
Variable Importances:
variable relative_importance scaled_importance percentage
1 Elevation 1.000000 1.000000 0.032023
2 Soil_Type.type_30 0.778499 0.778499 0.024930
3 Soil_Type.type_19 0.766471 0.766471 0.024545
4 Soil_Type.type_6 0.750092 0.750092 0.024020
5 Soil_Type.type_1 0.744390 0.744390 0.023838
---
variable relative_importance scaled_importance percentage
51 Soil_Type.type_2 0.395219 0.395219 0.012656
52 Aspect 0.378613 0.378613 0.012124
53 Slope 0.360744 0.360744 0.011552
54 Soil_Type.type_35 0.340908 0.340908 0.010917
55 Soil_Type.missing(NA) 0.000000 0.000000 0.000000
56 Wilderness_Area.missing(NA) 0.000000 0.000000 0.000000
true binomial model
train$bin_response <- as.factor(train$bin_response) ##make categorical
dlmodel <- h2o.deeplearning(
x=predictors,
y="bin_response",
training_frame=train,
hidden=c(10,10),
epochs=0.1
#balance_classes=T ## enable this for high class imbalance
)
|
| | 0%
|
|================================== | 40%
|
|=====================================================================================| 100%
summary(dlmodel) ## Now the model metrics contain AUC for binary classification
Model Details:
==============
H2OBinomialModel: deeplearning
Model Key: DeepLearning_model_R_1507305162426_7
Status of Neuron Layers: predicting bin_response, 2-class classification, bernoulli distribution, CrossEntropy loss, 702 weights/biases, 14.0 KB, 34,987 training samples, mini-batch size 1
layer units type dropout l1 l2 mean_rate rate_rms momentum mean_weight
1 1 56 Input 0.00 %
2 2 10 Rectifier 0.00 % 0.000000 0.000000 0.075014 0.243454 0.000000 0.009294
3 3 10 Rectifier 0.00 % 0.000000 0.000000 0.001173 0.000875 0.000000 -0.074586
4 4 2 Softmax 0.000000 0.000000 0.002222 0.001456 0.000000 0.732539
weight_rms mean_bias bias_rms
1
2 0.184678 0.484191 0.124705
3 0.291599 0.980515 0.048223
4 1.550770 -0.006356 0.013262
H2OBinomialMetrics: deeplearning
** Reported on training data. **
** Metrics reported on temporary training frame with 9989 samples **
MSE: 0.1513175
RMSE: 0.3889955
LogLoss: 0.4570643
Mean Per-Class Error: 0.2616744
AUC: 0.8506438
Gini: 0.7012876
Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
0 1 Error Rate
0 2202 1425 0.392887 =1425/3627
1 830 5532 0.130462 =830/6362
Totals 3032 6957 0.225748 =2255/9989
Maximum Metrics: Maximum metrics at their respective thresholds
metric threshold value idx
1 max f1 0.467751 0.830693 244
2 max f2 0.210867 0.902244 348
3 max f0point5 0.683535 0.842757 159
4 max accuracy 0.531648 0.778757 217
5 max precision 0.999340 1.000000 0
6 max recall 0.053739 1.000000 398
7 max specificity 0.999340 1.000000 0
8 max absolute_mcc 0.666616 0.531593 165
9 max min_per_class_accuracy 0.627737 0.769569 180
10 max mean_per_class_accuracy 0.683535 0.775308 159
Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`
Scoring History:
timestamp duration training_speed epochs iterations samples training_rmse
1 2017-10-06 11:06:57 0.000 sec 0.00000 0 0.000000
2 2017-10-06 11:06:57 0.204 sec 342800 obs/sec 0.00982 1 3428.000000 0.41152
3 2017-10-06 11:06:57 0.367 sec 291558 obs/sec 0.10024 10 34987.000000 0.38900
training_logloss training_auc training_lift training_classification_error
1
2 0.51174 0.81409 1.57010 0.25838
3 0.45706 0.85064 1.57010 0.22575
Variable Importances: (Extract with `h2o.varimp`)
=================================================
Variable Importances:
variable relative_importance scaled_importance percentage
1 Soil_Type.type_5 1.000000 1.000000 0.030491
2 Soil_Type.type_4 0.907483 0.907483 0.027670
3 Soil_Type.type_36 0.897845 0.897845 0.027376
4 Elevation 0.866505 0.866505 0.026420
5 Soil_Type.type_16 0.811355 0.811355 0.024739
---
variable relative_importance scaled_importance percentage
51 Soil_Type.type_17 0.412993 0.412993 0.012592
52 Aspect 0.395864 0.395864 0.012070
53 Hillshade_9am 0.394204 0.394204 0.012020
54 Vertical_Distance_To_Hydrology 0.241319 0.241319 0.007358
55 Soil_Type.missing(NA) 0.000000 0.000000 0.000000
56 Wilderness_Area.missing(NA) 0.000000 0.000000 0.000000
plot(h2o.performance(dlmodel)) ## display ROC curve

h2o.shutdown(prompt=FALSE)
[1] TRUE
---
title: "forest cover type DL"
output: html_notebook
---

#start h2o
```{r}
library(h2o)
h2o.init()
h2o.removeAll()
```

#LOAD DATA
```{r}
cov<-h2o.importFile(path="https://s3.amazonaws.com/h2o-public-test-data/bigdata/laptop/covtype/covtype.full.csv")
dim(cov)
splits<-h2o.splitFrame(cov,ratio=c(0.6,0.2),destination_frames = c("train.hex", "valid.hex", "test.hex"), seed=1234)
train<-splits[[1]]
valid<-splits[[2]]
test<-splits[[3]]
```

#scatter plot via binning (works for categorical and numeric columns) to get more familiar with the dataset.
```{r}
par(mfrow=c(1,1))
plot(h2o.tabulate(cov, "Elevation", "Cover_Type"))
plot(h2o.tabulate(cov, "Horizontal_Distance_To_Roadways", "Cover_Type"))
plot(h2o.tabulate(cov, "Soil_Type",                       "Cover_Type"))
plot(h2o.tabulate(cov, "Horizontal_Distance_To_Roadways", "Elevation" ))
```

#set response and predictors
```{r}
response<-"Cover_Type"
predictors<-setdiff(names(cov), response)
predictors
```

#first DL model
```{r}
model.DL1<-h2o.deeplearning(x=predictors, y=response, training_frame = train, validation_frame = valid, model_id="dl_model_first", epochs=1, variable_importances=T)
summary(model.DL1)
```
#### The model can also be inspected in the H2O Flow web UI at http://localhost:54321.
#### Since we are using R to control H2O, the focus here stays on model performance in R,
#### but in Flow you can simply type:
####  getModel "dl_model_first"
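
#### As an aside, the same model can be fetched back into this R session by its model_id
#### (a minimal sketch; h2o.getModel is the standard h2o accessor and m_first is just an
#### illustrative name):
```{r}
## R-side equivalent of Flow's getModel: fetch the model from the cluster by its id
m_first <- h2o.getModel("dl_model_first")
h2o.performance(m_first, valid = TRUE)
```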

```{r}
head(as.data.frame(h2o.varimp(model.DL1)))
```

#Add early stopping
```{r}
model.DL2<-h2o.deeplearning(x=predictors, y=response, training_frame = train, validation_frame = valid, model_id="dl_model_faster", hidden=c(32,32,32), epochs = 100000, score_validation_samples = 10000, stopping_rounds=2, stopping_metric="misclassification", stopping_tolerance = 0.01)
summary(model.DL2)
plot(model.DL2)
```

#tuning
```{r}
model.DL3 <- h2o.deeplearning(
  model_id="dl_model_tuned", 
  training_frame=train, 
  validation_frame=valid, 
  x=predictors, 
  y=response, 
  overwrite_with_best_model=F,    ## Return the final model after 10 epochs, even if not the best
  hidden=c(128,128,128),          ## more hidden layers -> more complex interactions
  epochs=10,                      ## to keep it short enough
  score_validation_samples=10000, ## downsample validation set for faster scoring
  score_duty_cycle=0.025,         ## don't score more than 2.5% of the wall time
  adaptive_rate=F,                ## manually tuned learning rate
  rate=0.01, 
  rate_annealing=2e-6,            
  momentum_start=0.2,             ## manually tuned momentum
  momentum_stable=0.4, 
  momentum_ramp=1e7, 
  l1=1e-5,                        ## add some L1/L2 regularization
  l2=1e-5,
  max_w2=10                       ## helps stability for Rectifier
) 
summary(model.DL3)
```

#Let's compare the training error with the validation and test set errors
```{r}
h2o.performance(model.DL3, train=T)
h2o.performance(model.DL3, valid=T)
h2o.performance(model.DL3, newdata=train)    ## full training data
h2o.performance(model.DL3, newdata=valid)    ## full validation data
h2o.performance(model.DL3, newdata=test)     ## full test data
```

#To confirm that the reported validation metrics generalize, we make a prediction on the held-out test set and compute the test-set classification error explicitly (a confusion-matrix cross-check follows below):
```{r}
pred<-h2o.predict(model.DL3, test)
pred
test$Accuracy<-pred$predict==test$Cover_Type  ## TRUE where the prediction matches the actual class
test$Accuracy
1-mean(test$Accuracy)                         ## test-set classification error
```
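
#As a cross-check (a sketch, assuming model.DL3 and test are still in scope; perf_test is just an illustrative name), the confusion matrix on the test frame can also be computed directly and compared with the per-class errors reported above:
```{r}
## Confusion matrix on the held-out test frame, computed explicitly
perf_test <- h2o.performance(model.DL3, newdata = test)
h2o.confusionMatrix(perf_test)
```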

#Hyper-parameter tuning with grid search
```{r}
sampled_train<-train[1:10000,]

hyper_params <- list(
  hidden=list(c(32,32,32),c(64,64)),
  input_dropout_ratio=c(0,0.05),
  rate=c(0.01,0.02),
  rate_annealing=c(1e-8,1e-7,1e-6)
)

grid <- h2o.grid(
  algorithm="deeplearning",
  grid_id="dl_grid", 
  training_frame=sampled_train,
  validation_frame=valid, 
  x=predictors, 
  y=response,
  epochs=10,
  stopping_metric="misclassification",
  stopping_tolerance=1e-2,        ## stop when misclassification does not improve by >=1% for 2 scoring events
  stopping_rounds=2,
  score_validation_samples=10000, ## downsample validation set for faster scoring
  score_duty_cycle=0.025,         ## don't score more than 2.5% of the wall time
  adaptive_rate=F,                ## manually tuned learning rate
  momentum_start=0.5,             ## manually tuned momentum
  momentum_stable=0.9, 
  momentum_ramp=1e7, 
  l1=1e-5,
  l2=1e-5,
  activation=c("Rectifier"),
  max_w2=10,                      ## can help improve stability for Rectifier
  hyper_params=hyper_params
)
grid
```

#which model has the lowest validation error
```{r}
grid<-h2o.getGrid("dl_grid", sort_by="err", decreasing = FALSE)
grid
```

#which model has the lowest validation logloss
```{r}
grid<-h2o.getGrid("dl_grid", sort_by="logloss", decreasing = FALSE)
grid

```

#Find the best model and its full set of parameters
```{r}
grid@summary_table[1,]
best_model <- h2o.getModel(grid@model_ids[[1]])
best_model

print(best_model@allparameters)
print(h2o.performance(best_model, valid=T))
print(h2o.logloss(best_model, valid=T))
```
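
#As an optional follow-up (sketch only, not run above), the selected model can also be scored on the untouched test frame to check that the validation-based ranking generalizes; perf_best_test is an illustrative name:
```{r}
## Score the best grid model on the test frame, which was not used for training or model selection
perf_best_test <- h2o.performance(best_model, newdata = test)
h2o.logloss(perf_best_test)
h2o.confusionMatrix(perf_best_test)
```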

#Random Hyper-Parameter Search

```{r}
hyper_params <- list(
  activation=c("Rectifier","Tanh","Maxout","RectifierWithDropout","TanhWithDropout","MaxoutWithDropout"),
  hidden=list(c(20,20),c(50,50),c(30,30,30),c(25,25,25,25)),
  input_dropout_ratio=c(0,0.05),
  l1=seq(0,1e-4,1e-6),
  l2=seq(0,1e-4,1e-6)
)


search_criteria = list(strategy = "RandomDiscrete", max_runtime_secs = 360, max_models = 100, seed=1234567, stopping_rounds=5, stopping_tolerance=1e-2)
dl_random_grid <- h2o.grid(
  algorithm="deeplearning",
  grid_id = "dl_grid_random",
  training_frame=sampled_train,
  validation_frame=valid, 
  x=predictors, 
  y=response,
  epochs=1,
  stopping_metric="logloss",
  stopping_tolerance=1e-2,        ## stop when logloss does not improve by >=1% for 2 scoring events
  stopping_rounds=2,
  score_validation_samples=10000, ## downsample validation set for faster scoring
  score_duty_cycle=0.025,         ## don't score more than 2.5% of the wall time
  max_w2=10,                      ## can help improve stability for Rectifier
  hyper_params = hyper_params,
  search_criteria = search_criteria
)                                
grid <- h2o.getGrid("dl_grid_random",sort_by="logloss",decreasing=FALSE)
grid

grid@summary_table[1,]
best_model <- h2o.getModel(grid@model_ids[[1]]) ## model with lowest logloss
best_model

```

#look at the model with the lowest validation misclassification rate
```{r}
grid <- h2o.getGrid("dl_grid_random",sort_by="err",decreasing=FALSE)
best_model <- h2o.getModel(grid@model_ids[[1]]) ## model with lowest classification error (on validation, since it was available during training)
h2o.confusionMatrix(best_model,valid=T)
best_params <- best_model@allparameters
best_params$activation
best_params$hidden
best_params$input_dropout_ratio
best_params$l1
best_params$l2
```
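
#A possible next step (a sketch under the assumption that the grid winner is worth refining; the model_id "dl_final_from_grid" and the epoch count are arbitrary choices here) is to reuse the winning hyper-parameters for a longer run on the full training frame:
```{r}
## Retrain with the hyper-parameters of the best random-search model, for more epochs
final_from_grid <- h2o.deeplearning(
  x=predictors,
  y=response,
  training_frame=train,
  validation_frame=valid,
  model_id="dl_final_from_grid",
  activation=best_params$activation,
  hidden=best_params$hidden,
  input_dropout_ratio=best_params$input_dropout_ratio,
  l1=best_params$l1,
  l2=best_params$l2,
  epochs=10
)
summary(final_from_grid)
```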

#checkpointing: continue training dl_model_tuned for two more epochs
```{r}
max_epochs <- 12 ## Add two more epochs

m_cont <- h2o.deeplearning(
  model_id="dl_model_tuned_continued", 
  checkpoint="dl_model_tuned", 
  training_frame=train, 
  validation_frame=valid, 
  x=predictors, 
  y=response, 
  hidden=c(128,128,128),          ## more hidden layers -> more complex interactions
  epochs=max_epochs,              ## hopefully long enough to converge (otherwise restart again)
  stopping_metric="logloss",      ## logloss is directly optimized by Deep Learning
  stopping_tolerance=1e-2,        ## stop when validation logloss does not improve by >=1% for 2 scoring events
  stopping_rounds=2,
  score_validation_samples=10000, ## downsample validation set for faster scoring
  score_duty_cycle=0.025,         ## don't score more than 2.5% of the wall time
  adaptive_rate=F,                ## manually tuned learning rate
  rate=0.01, 
  rate_annealing=2e-6,            
  momentum_start=0.2,             ## manually tuned momentum
  momentum_stable=0.4, 
  momentum_ramp=1e7, 
  l1=1e-5,                        ## add some L1/L2 regularization
  l2=1e-5,
  max_w2=10                       ## helps stability for Rectifier
) 
summary(m_cont)
plot(m_cont)
```
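
#To see whether the two extra epochs helped, a quick sketch comparing the validation logloss of the continued model with that of the original dl_model_tuned (both models are still on the cluster):
```{r}
## Validation logloss before and after continuing training from the checkpoint
h2o.logloss(h2o.performance(model.DL3, valid=TRUE))
h2o.logloss(h2o.performance(m_cont, valid=TRUE))
```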

#save model 
```{r}
path<-h2o.saveModel(m_cont, path="./mybest_deeplearning_covtype_model", force=TRUE)
```

#load the model
```{r}
print(path)
m_loaded<-h2o.loadModel(path)
summary(m_loaded)
```

#cross validation
#For N-fold cross-validation, specify nfolds>1 instead of (or in addition to) a validation frame, and N+1 models will be built: one model on the full training data, and N models each trained with a different 1/N of the data held out (several holdout strategies are available). Each of those N models then scores its held-out fold, and the combined holdout predictions over the full training data are scored to obtain the cross-validation metrics.

```{r}
dlmodel <- h2o.deeplearning(
  x=predictors,
  y=response, 
  training_frame=train,
  hidden=c(10,10),
  epochs=1,
  nfolds=5,
  fold_assignment="Modulo" # can be "AUTO", "Modulo", "Random" or "Stratified"
  )
dlmodel
```
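
#The N fold models mentioned above can also be inspected individually, and the metrics on the combined holdout predictions are available via xval=TRUE (a short sketch using standard h2o accessors; cv_models is an illustrative name):
```{r}
## The 5 fold models built during cross-validation
cv_models <- h2o.cross_validation_models(dlmodel)
length(cv_models)
## Metrics computed on the combined holdout predictions
h2o.performance(dlmodel, xval=TRUE)
```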

#Regression and Binary Classification
```{r}
train$bin_response<-ifelse(train[,response]=="class_1", 0,1)
dlmodel <- h2o.deeplearning(
  x=predictors,
  y="bin_response", 
  training_frame=train,
  hidden=c(10,10),
  epochs=0.1
)
summary(dlmodel)
```
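
#Because bin_response is numeric at this point, H2O treats this as a regression problem (Gaussian distribution, quadratic loss); a small sketch pulling the corresponding training metrics (perf_reg is an illustrative name):
```{r}
## Regression metrics on the training data (bin_response is still numeric here)
perf_reg <- h2o.performance(dlmodel, train=TRUE)
h2o.rmse(perf_reg)
h2o.mse(perf_reg)
```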

#true binomial model
```{r}
train$bin_response <- as.factor(train$bin_response) ##make categorical
dlmodel <- h2o.deeplearning(
  x=predictors,
  y="bin_response", 
  training_frame=train,
  hidden=c(10,10),
  epochs=0.1
  #balance_classes=T    ## enable this for high class imbalance
)
summary(dlmodel) ## Now the model metrics contain AUC for binary classification
plot(h2o.performance(dlmodel)) ## display ROC curve
```
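
#A short sketch extracting the AUC and the confusion matrix at the F1-optimal threshold from the binomial model's training metrics (perf_bin is an illustrative name):
```{r}
## AUC and confusion matrix (at the F1-optimal threshold) on the training data
perf_bin <- h2o.performance(dlmodel, train=TRUE)
h2o.auc(perf_bin)
h2o.confusionMatrix(perf_bin)
```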

```{r}
h2o.shutdown(prompt=FALSE)
```


