R Notebook

Install h2o

install.packages("h2o")

## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.5'
## (as 'lib' is unspecified)

Load the package to check that the installation worked

library(h2o)

## 
## ----------------------------------------------------------------------
## 
## Your next step is to start H2O:
##     > h2o.init()
## 
## For H2O package documentation, ask for help:
##     > ??h2o
## 
## After starting H2O, you can use the Web UI at http://localhost:54321
## For more information visit https://docs.h2o.ai
## 
## ----------------------------------------------------------------------

## 
## Attaching package: 'h2o'

## The following objects are masked from 'package:stats':
## 
##     cor, sd, var

## The following objects are masked from 'package:base':
## 
##     &&, %*%, %in%, ||, apply, as.factor, as.numeric, colnames,
##     colnames<-, ifelse, is.character, is.factor, is.numeric, log,
##     log10, log1p, log2, round, signif, trunc

Start h2o

h2o.init()

##  Connection successful!
## 
## R is connected to the H2O cluster: 
##     H2O cluster uptime:         10 minutes 36 seconds 
##     H2O cluster timezone:       UTC 
##     H2O data parsing timezone:  UTC 
##     H2O cluster version:        3.44.0.3 
##     H2O cluster version age:    2 years, 1 month and 10 days 
##     H2O cluster name:           H2O_started_from_R_r3583867_mha688 
##     H2O cluster total nodes:    1 
##     H2O cluster total memory:   0.17 GB 
##     H2O cluster total cores:    1 
##     H2O cluster allowed cores:  1 
##     H2O cluster healthy:        TRUE 
##     H2O Connection ip:          localhost 
##     H2O Connection port:        54321 
##     H2O Connection proxy:       NA 
##     H2O Internal Security:      FALSE 
##     R Version:                  R version 4.5.2 (2025-10-31)

## Warning in h2o.clusterInfo(): 
## Your H2O cluster version is (2 years, 1 month and 10 days) old. There may be a newer version available.
## Please download and install the latest version from: https://h2o-release.s3.amazonaws.com/h2o/latest_stable.html

Import the iris data, split it, and train a model

datasets <- "https://raw.githubusercontent.com/DarrenCook/h2o/bk/datasets/"
data <- h2o.importFile(paste0(datasets, "iris_wheader.csv"))

##   |======================================================================| 100%

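# Response column and predictor columns
y <- "class"
x <- setdiff(names(data), y)

# Split the rows roughly 80/20 into training and test frames
parts <- h2o.splitFrame(data, 0.8)
train <- parts[[1]]
test <- parts[[2]]

# Train a deep learning model with default settings
m <- h2o.deeplearning(x, y, train)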

##   |======================================================================| 100%

p <- h2o.predict(m, test)

##   |======================================================================| 100%
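
Neither the split nor the training run above fixes a random seed, so the metrics will change from run to run. `h2o.splitFrame` also assigns rows probabilistically, so the 80/20 ratio is only approximate. The following is a minimal sketch of a repeatable variant, assuming the `data`, `x` and `y` objects defined above. The seed value and the `_seeded` names are illustrative choices, not part of the original notebook.

```r
# Assumes library(h2o) is loaded, a cluster is running,
# and data, x, y exist as defined earlier in this notebook.
# The seed value 123 is arbitrary; any fixed integer works.
parts_seeded <- h2o.splitFrame(data, ratios = 0.8, seed = 123)
train_seeded <- parts_seeded[[1]]
test_seeded <- parts_seeded[[2]]

# Check the actual split sizes; they are close to 80/20 but not exact.
nrow(train_seeded)
nrow(test_seeded)

# reproducible = TRUE trains on a single thread so that, together with a
# fixed seed, repeated runs give the same model. It is slower on large data.
m_seeded <- h2o.deeplearning(x, y, train_seeded,
                             seed = 123, reproducible = TRUE)
```

With a fixed seed, the same rows land in the test frame on every run, which makes the accuracy figures comparable between runs.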

Performance on the training data

h2o.mse(m)

## [1] 0.1204832

h2o.confusionMatrix(m)
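
When called with only a model, `h2o.mse` and `h2o.confusionMatrix` report metrics computed on the training data, so they tend to look better than results on unseen rows. A short sketch of extracting the same metrics for the held-out rows, assuming `m` and `test` from above (`perf` is an illustrative name):

```r
# Assumes m (the trained model) and test (the held-out H2OFrame) exist.
# Score the model on the test frame and keep the metrics object.
perf <- h2o.performance(m, newdata = test)

# Pull individual metrics out of the metrics object.
h2o.mse(perf)
h2o.logloss(perf)
h2o.confusionMatrix(perf)
```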

Display as a table

as.data.frame(p)

Predicted vs Actual

as.data.frame(h2o.cbind(p$predict, test$class))

Fraction of test rows that the H2O model classified correctly

mean(p$predict == test$class)

## [1] 0.6363636

h2o.performance(m, test)

## H2OMultinomialMetrics: deeplearning
## 
## Test Set Metrics: 
## =====================
## 
## MSE: (Extract with `h2o.mse`) 0.2762632
## RMSE: (Extract with `h2o.rmse`) 0.5256075
## Logloss: (Extract with `h2o.logloss`) 0.8617307
## Mean Per-Class Error: 0.25
## AUC: (Extract with `h2o.auc`) NaN
## AUCPR: (Extract with `h2o.aucpr`) NaN
## Confusion Matrix: Extract with `h2o.confusionMatrix(<model>, <data>)`)
## =========================================================================
## Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
##                 Iris-setosa Iris-versicolor Iris-virginica  Error      Rate
## Iris-setosa               6               0              0 0.0000 =   0 / 6
## Iris-versicolor           0              11              0 0.0000 =  0 / 11
## Iris-virginica            0              12              4 0.7500 = 12 / 16
## Totals                    6              23              4 0.3636 = 12 / 33
## 
## Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>, <data>)`
## =======================================================================
## Top-3 Hit Ratios: 
##   k hit_ratio
## 1 1  0.636364
## 2 2  1.000000
## 3 3  1.000000
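
The summary figures in this report can be checked by hand from the confusion matrix. The base R sketch below re-enters the counts printed above and recomputes the accuracy, the overall error, and the mean per-class error. The object names (`cm`, `accuracy`, and so on) are illustrative and not part of H2O.

```r
# Counts copied from the test-set confusion matrix above
# (rows = actual class, columns = predicted class).
classes <- c("Iris-setosa", "Iris-versicolor", "Iris-virginica")
cm <- matrix(
  c(6,  0, 0,
    0, 11, 0,
    0, 12, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(actual = classes, predicted = classes)
)

# Correct predictions sit on the diagonal.
correct <- sum(diag(cm))        # 21 rows
total <- sum(cm)                # 33 rows
accuracy <- correct / total     # 21 / 33
overall_error <- 1 - accuracy   # 12 / 33

# Error rate within each actual class, then the unweighted average.
per_class_error <- 1 - diag(cm) / rowSums(cm)
mean_per_class_error <- mean(per_class_error)

print(round(c(accuracy = accuracy,
              overall_error = overall_error,
              mean_per_class_error = mean_per_class_error), 4))
```

The accuracy of 21 / 33 matches both the `mean(p$predict == test$class)` result and the top-1 hit ratio. All 12 errors are Iris-virginica rows predicted as Iris-versicolor, which is why the mean per-class error (0.25) is lower than the overall error (0.3636): the two error-free classes pull the unweighted average down.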