Install h2o
install.packages("h2o")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.5'
## (as 'lib' is unspecified)
Load the package to check that the installation worked
library(h2o)
##
## ----------------------------------------------------------------------
##
## Your next step is to start H2O:
## > h2o.init()
##
## For H2O package documentation, ask for help:
## > ??h2o
##
## After starting H2O, you can use the Web UI at http://localhost:54321
## For more information visit https://docs.h2o.ai
##
## ----------------------------------------------------------------------
##
## Attaching package: 'h2o'
## The following objects are masked from 'package:stats':
##
## cor, sd, var
## The following objects are masked from 'package:base':
##
## &&, %*%, %in%, ||, apply, as.factor, as.numeric, colnames,
## colnames<-, ifelse, is.character, is.factor, is.numeric, log,
## log10, log1p, log2, round, signif, trunc
Start h2o
h2o.init()
## Connection successful!
##
## R is connected to the H2O cluster:
## H2O cluster uptime: 10 minutes 36 seconds
## H2O cluster timezone: UTC
## H2O data parsing timezone: UTC
## H2O cluster version: 3.44.0.3
## H2O cluster version age: 2 years, 1 month and 10 days
## H2O cluster name: H2O_started_from_R_r3583867_mha688
## H2O cluster total nodes: 1
## H2O cluster total memory: 0.17 GB
## H2O cluster total cores: 1
## H2O cluster allowed cores: 1
## H2O cluster healthy: TRUE
## H2O Connection ip: localhost
## H2O Connection port: 54321
## H2O Connection proxy: NA
## H2O Internal Security: FALSE
## R Version: R version 4.5.2 (2025-10-31)
## Warning in h2o.clusterInfo():
## Your H2O cluster version is (2 years, 1 month and 10 days) old. There may be a newer version available.
## Please download and install the latest version from: https://h2o-release.s3.amazonaws.com/h2o/latest_stable.html
datasets <- "https://raw.githubusercontent.com/DarrenCook/h2o/bk/datasets/"
data <- h2o.importFile(paste0(datasets, "iris_wheader.csv"))
## |======================================================================| 100%
y <- "class"
x <- setdiff(names(data), y)
parts <- h2o.splitFrame(data, 0.8)
train <- parts[[1]]
test <- parts[[2]]
m <- h2o.deeplearning(x, y, train)
## |======================================================================| 100%
p <- h2o.predict(m, test)
## |======================================================================| 100%
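The test set in this run holds 33 rows, not exactly 20% of the data (30 rows, assuming the file contains the usual 150 iris rows). That is consistent with h2o.splitFrame assigning rows at random and not cutting at an exact boundary. The sketch below only mimics that idea in base R on the built-in iris data; it is not H2O's implementation, and the names local_data, u, local_train, and local_test, as well as the seed, are hypothetical.

```r
# Illustration only: a probabilistic 80/20 split on a local data frame.
# This mimics the idea of a random split; it is not H2O's own code.
set.seed(42)                            # hypothetical seed, for repeatability
local_data <- iris                      # built-in stand-in for the imported frame
u <- runif(nrow(local_data))            # one uniform draw per row
local_train <- local_data[u < 0.8, ]    # roughly 80% of the rows
local_test  <- local_data[u >= 0.8, ]   # the remaining rows

# The sizes are close to 120 and 30 but usually not exact
c(train = nrow(local_train), test = nrow(local_test))
```

With H2O itself, passing a seed, for example h2o.splitFrame(data, 0.8, seed = 123), should make the split repeatable from run to run.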
Performance on the training data
h2o.mse(m)
## [1] 0.1204832
h2o.confusionMatrix(m)
Display the predictions as a table
as.data.frame(p)
Predicted vs. actual class
as.data.frame(h2o.cbind(p$predict, test$class))
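Once the predicted and actual labels are in a local data frame, base R can cross-tabulate them. This is a minimal sketch, assuming the combined frame has the columns predict and class; the six rows below are made-up values for illustration, not output from the model, and the names results and xtab are hypothetical.

```r
# Hypothetical local copy of predicted and actual labels (made-up rows)
results <- data.frame(
  predict = c("Iris-setosa", "Iris-versicolor", "Iris-versicolor",
              "Iris-virginica", "Iris-versicolor", "Iris-setosa"),
  class   = c("Iris-setosa", "Iris-versicolor", "Iris-virginica",
              "Iris-virginica", "Iris-versicolor", "Iris-setosa")
)

# Rows are actual classes, columns are predicted classes
xtab <- table(actual = results$class, predicted = results$predict)
xtab

# Correct predictions sit on the diagonal. This only holds when both
# columns contain the same set of labels, as they do in this example.
sum(diag(xtab))
```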
Fraction of test rows that the H2O model classified correctly
mean(p$predict == test$class)
## [1] 0.6363636
Performance on the test data
h2o.performance(m, test)
## H2OMultinomialMetrics: deeplearning
##
## Test Set Metrics:
## =====================
##
## MSE: (Extract with `h2o.mse`) 0.2762632
## RMSE: (Extract with `h2o.rmse`) 0.5256075
## Logloss: (Extract with `h2o.logloss`) 0.8617307
## Mean Per-Class Error: 0.25
## AUC: (Extract with `h2o.auc`) NaN
## AUCPR: (Extract with `h2o.aucpr`) NaN
## Confusion Matrix: Extract with `h2o.confusionMatrix(<model>, <data>)`)
## =========================================================================
## Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
##                 Iris-setosa Iris-versicolor Iris-virginica  Error Rate
## Iris-setosa               6               0              0 0.0000 = 0 / 6
## Iris-versicolor           0              11              0 0.0000 = 0 / 11
## Iris-virginica            0              12              4 0.7500 = 12 / 16
## Totals                    6              23              4 0.3636 = 12 / 33
##
## Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>, <data>)`
## =======================================================================
## Top-3 Hit Ratios:
##   k hit_ratio
## 1 1  0.636364
## 2 2  1.000000
## 3 3  1.000000
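The summary figures in the test-set report can be cross-checked by hand from the confusion matrix. The base R sketch below re-enters the counts printed in that matrix; the object names classes, cm, accuracy, per_class_error, and mean_per_class_error are only for illustration.

```r
# Counts copied from the test-set confusion matrix above
classes <- c("Iris-setosa", "Iris-versicolor", "Iris-virginica")
cm <- matrix(c(6,  0, 0,
               0, 11, 0,
               0, 12, 4),
             nrow = 3, byrow = TRUE,
             dimnames = list(actual = classes, predicted = classes))

# Overall accuracy: correct predictions (diagonal) over all test rows
accuracy <- sum(diag(cm)) / sum(cm)            # 21 / 33

# Error per actual class: 1 minus the share of that row on the diagonal
per_class_error <- 1 - diag(cm) / rowSums(cm)  # 0, 0, 0.75

# Unweighted average of the per-class errors
mean_per_class_error <- mean(per_class_error)  # 0.25

round(accuracy, 6)
per_class_error
mean_per_class_error
```

The accuracy of 21 / 33 agrees with both the top-1 hit ratio and the value returned by mean(p$predict == test$class), and every misclassified row is an Iris-virginica flower predicted as Iris-versicolor.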