library(h2o)
h2o.init()
H2O is not running yet, starting it now...
Note: In case of errors look at the following log files:
/tmp/Rtmp101q6c/file18d1e8b7725/h2o_r3032219_started_from_r.out
/tmp/Rtmp101q6c/file18d10572ba3/h2o_r3032219_started_from_r.err
Starting H2O JVM and connecting: ... Connection successful!
R is connected to the H2O cluster:
H2O cluster uptime: 2 seconds 708 milliseconds
H2O cluster timezone: UTC
H2O data parsing timezone: UTC
H2O cluster version: 3.44.0.3
H2O cluster version age: 2 years, 1 month and 16 days
H2O cluster name: H2O_started_from_R_r3032219_fom207
H2O cluster total nodes: 1
H2O cluster total memory: 0.24 GB
H2O cluster total cores: 1
H2O cluster allowed cores: 1
H2O cluster healthy: TRUE
H2O Connection ip: localhost
H2O Connection port: 54321
H2O Connection proxy: NA
H2O Internal Security: FALSE
R Version: R version 4.5.2 (2025-10-31)
h2o.init(nthreads = -1)
Connection successful!
R is connected to the H2O cluster:
H2O cluster uptime: 1 minutes 55 seconds
H2O cluster timezone: UTC
H2O data parsing timezone: UTC
H2O cluster version: 3.44.0.3
H2O cluster version age: 2 years, 1 month and 16 days
H2O cluster name: H2O_started_from_R_r3032219_fom207
H2O cluster total nodes: 1
H2O cluster total memory: 0.18 GB
H2O cluster total cores: 1
H2O cluster allowed cores: 1
H2O cluster healthy: TRUE
H2O Connection ip: localhost
H2O Connection port: 54321
H2O Connection proxy: NA
H2O Internal Security: FALSE
R Version: R version 4.5.2 (2025-10-31)
datasets <- "https://raw.githubusercontent.com/DarrenCook/h2o/bk/datasets/"
data <- h2o.importFile(paste0(datasets, "iris_wheader.csv"))
y <- "class"
x <- setdiff(names(data), y)
parts <- h2o.splitFrame(data, 0.8)
# In R, h2o.splitFrame() takes an H2O frame and returns a list of the splits,
# which are assigned to train and test for readability:
train <- parts[[1]]
test <- parts[[2]]
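As a rough illustration of why the two parts are only approximately 80% and 20% of the rows, here is a minimal base R sketch of a probabilistic split. It is not H2O's internal code: the built-in `iris` data frame stands in for the H2O frame, and the names `in_train`, `parts_demo`, `train_demo`, `test_demo` and the seed value are assumptions made for this example. `h2o.splitFrame()` also accepts a `seed` argument if a repeatable split is wanted.

```r
# Base R sketch of a probabilistic 80/20 split (illustrative only, not H2O's code).
set.seed(42)                          # hypothetical seed, chosen only for repeatability
df <- iris                            # built-in iris data, standing in for the H2O frame

# Each row is assigned to the training part with probability 0.8,
# so the part sizes are close to 80/20 but not exact.
in_train <- runif(nrow(df)) < 0.8

parts_demo <- list(df[in_train, ], df[!in_train, ])
train_demo <- parts_demo[[1]]
test_demo  <- parts_demo[[2]]

# The two parts always cover every row exactly once.
c(train = nrow(train_demo), test = nrow(test_demo))
```

Changing the seed changes how many rows land in each part, which is one reason a test set drawn this way may not contain exactly 20% of the data.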
m <- h2o.deeplearning(x, y, train)
p <- h2o.predict(m, test)
h2o.mse(m)
[1] 0.1714094
h2o.confusionMatrix(m)
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
as.data.frame(p)
as.data.frame( h2o.cbind(p$predict, test$class) )
The H2O model performed very well overall, correctly identifying all Setosa and Virginica samples in the table above. The only errors occurred at the boundary between Versicolor and Virginica, where the model misclassified 4 Versicolor samples as Virginica.
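The same kind of actual-versus-predicted comparison can be sketched in base R with `table()`. The labels below are made up for illustration and are not the model's real predictions; `classes`, `actual`, `predicted`, `cm` and `per_class_error` are names assumed for this example.

```r
# Hypothetical labels for illustration only; NOT the model's real output.
classes   <- c("setosa", "versicolor", "virginica")
actual    <- factor(c("setosa", "setosa", "versicolor", "versicolor",
                      "virginica", "virginica"), levels = classes)
predicted <- factor(c("setosa", "setosa", "versicolor", "virginica",
                      "virginica", "virginica"), levels = classes)

# Rows are the actual class, columns the predicted class.
cm <- table(Actual = actual, Predicted = predicted)
print(cm)

# Per-class error: share of each actual class predicted as something else.
per_class_error <- 1 - diag(cm) / rowSums(cm)
print(per_class_error)
```

Off-diagonal cells show which classes are confused with each other; in this toy data the only non-zero off-diagonal cell is versicolor predicted as virginica.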
mean(p$predict == test$class)
[1] 0.8857143
The model classified 88.6% of the unseen test samples correctly and misclassified the remaining 11.4%.
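The arithmetic behind these percentages can be checked with a few lines of base R. This is only a sanity check on the printed numbers: the variable names are assumptions, and the implied test-set size is an inference from the reported accuracy and the 4 misclassifications mentioned above, not a value printed by H2O.

```r
accuracy   <- 0.8857143        # value printed by mean(p$predict == test$class)
error_rate <- 1 - accuracy     # share of test rows that were misclassified

n_errors   <- 4                # misclassifications reported in the text
# Implied number of test rows (an inference, not a value H2O printed).
n_test_est <- n_errors / error_rate

round(c(accuracy_pct = 100 * accuracy,
        error_pct    = 100 * error_rate,
        n_test_est   = n_test_est), 1)
```

If the 4 errors all come from the test split, the numbers are consistent with a test set of roughly 35 rows.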
h2o.performance(m, test)
H2OMultinomialMetrics: deeplearning
Test Set Metrics:
=====================
MSE: (Extract with `h2o.mse`) 0.0968934
RMSE: (Extract with `h2o.rmse`) 0.311277
Logloss: (Extract with `h2o.logloss`) 0.3721349
Mean Per-Class Error: 0.1333333
AUC: (Extract with `h2o.auc`) NaN
AUCPR: (Extract with `h2o.aucpr`) NaN
Confusion Matrix: Extract with `h2o.confusionMatrix(<model>, <data>)`)
=========================================================================
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>, <data>)`
=======================================================================
Top-3 Hit Ratios:
NANANA