October 01, 2015

Agenda

A short presentation and then a demo:

  • About Us
  • Intro
  • The Problem: Operationalizing Predictive Models
  • Solutions
  • Case Study: Iris Flower Predictor
  • Demo
  • Q & A

About Us

Intro

Common examples of data driven products:

  • Loan/credit approval
  • Recommendation systems (movies, products, news feeds)
  • Quoting premiums; claims estimates
  • Churn reduction / customer retention

Intro

  • Netflix Movie Recommendation System

Intro

  • Amazon Product Recommendation System

The Problem: Operationalizing Predictive Models (1/2)

Credit: Nick Elprin.
Image used with permission


  • Data scientists may not be good at web programming or app dev
  • Software engineers may not be good at machine learning and statistical analysis

The Problem: Operationalizing Predictive Models (2/2)

  • Different languages are good for different tasks
    Credit: Nick Elprin.
    Image used with permission

Solutions

Why Use Predictive API Engines

  • I do not have time/skills for every single task
  • I want to focus on understanding my problem and improving models

Predictive API Engines, Part 1: Domino Data Lab

How Domino Works

Credit: Nick Elprin.
Image used with permission


  • Data scientists focus on developing & improving models
  • Software engineers focus on maintaining the apps

Demo

  • 1st: Use random forests to predict the flower species in the iris dataset
  • 2nd: Turn the model into a web service
  • 3rd: Call the web service in a sample app

library(xtable)
data(iris)
print(xtable(head(iris, 5)), type = "html", include.rownames = FALSE)

Sepal.Length  Sepal.Width  Petal.Length  Petal.Width  Species
        5.10         3.50          1.40         0.20  setosa
        4.90         3.00          1.40         0.20  setosa
        4.70         3.20          1.30         0.20  setosa
        4.60         3.10          1.50         0.20  setosa
        5.00         3.60          1.40         0.20  setosa

Let's Do It Together

Simple WebServer Application

Action Shot (1/2)

Action Shot (2/2)

Web Sample

  • Fill out the form with input values and click "make prediction".
  • The predicted species is displayed in the web app.

Relevant Code (1/2)

Formulate Request

MODEL_ATTRS = ['sepal_length', 'sepal_width',
               'petal_length', 'petal_width']

# Runs inside the web app's request handler, where `request` is the
# framework's request object (assumed here to be Flask's).
if request.method == 'POST':  # the form was submitted
    # Collect the form values in the order the model expects
    attributes = [request.form[field] for field in MODEL_ATTRS]
    result = fetch_score(attributes)
    result = result.json()  # decode the JSON response body
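
Form values arrive as strings, so it can help to convert and validate them before calling the API. The following is a minimal sketch, not the demo app's actual code: `parse_attributes` is a hypothetical helper, and it assumes the form behaves like a dictionary with a `get` method (a plain dict is used here in place of the web framework's form object).

```python
MODEL_ATTRS = ['sepal_length', 'sepal_width',
               'petal_length', 'petal_width']


def parse_attributes(form):
    """Return the model inputs as floats, in MODEL_ATTRS order.

    Hypothetical helper: raises ValueError naming the first field that
    is missing or not a number.
    """
    attributes = []
    for field in MODEL_ATTRS:
        raw = form.get(field, '')
        try:
            attributes.append(float(raw))
        except (TypeError, ValueError):
            raise ValueError('%s must be a number, got %r' % (field, raw))
    return attributes


# A plain dict stands in for the submitted form.
form = {'sepal_length': '5.1', 'sepal_width': '3.5',
        'petal_length': '1.4', 'petal_width': '0.2'}
print(parse_attributes(form))  # [5.1, 3.5, 1.4, 0.2]
```

Rejecting bad input in the app keeps malformed requests from reaching the prediction endpoint at all.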

Relevant Code (2/2)

API Call

import requests

API_KEY = 'YOUR_API_KEY'  # placeholder; use your own Domino API key

def fetch_score(attributes):
    # POST the model inputs to the published endpoint as JSON
    return requests.post(
        "https://app.dominodatalab.com/v1/SparkIQLabs/helloWorld/endpoint",
        headers={
            'X-Domino-Api-Key': API_KEY,
            'Content-Type': 'application/json'
        },
        json={"parameters": attributes}
    )

Best Practices for Using a Predictive API Engine

  • Separate training, initialization, and prediction
  • Make prediction functions thread-safe
  • Leverage persistence/serialization tools (e.g. pickle)
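
The three practices above can be sketched together in a few lines. This is an illustration under stated assumptions, not the demo's model: a nearest-centroid classifier stands in for the random forest so the example needs only the standard library, the training rows are a handful of iris rows chosen for illustration, and the function names (`train`, `save_model`, `init`, `predict`) are hypothetical.

```python
import math
import os
import pickle
import tempfile
import threading

# A handful of iris rows for illustration (two per species).
TRAINING_ROWS = [
    ((5.1, 3.5, 1.4, 0.2), 'setosa'),
    ((4.9, 3.0, 1.4, 0.2), 'setosa'),
    ((7.0, 3.2, 4.7, 1.4), 'versicolor'),
    ((6.4, 3.2, 4.5, 1.5), 'versicolor'),
    ((6.3, 3.3, 6.0, 2.5), 'virginica'),
    ((5.8, 2.7, 5.1, 1.9), 'virginica'),
]


def train(rows):
    """Training step (run offline): one centroid per species."""
    sums, counts = {}, {}
    for features, species in rows:
        acc = sums.setdefault(species, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[species] = counts.get(species, 0) + 1
    return {s: tuple(v / counts[s] for v in acc) for s, acc in sums.items()}


def save_model(model, path):
    """Persist the trained model so the API never retrains per request."""
    with open(path, 'wb') as fh:
        pickle.dump(model, fh)


_model = None
_model_lock = threading.Lock()


def init(path):
    """Initialization step: load the pickled model once per process.

    Only unpickle files you trust; pickle can execute arbitrary code.
    """
    global _model
    with _model_lock:
        if _model is None:
            with open(path, 'rb') as fh:
                _model = pickle.load(fh)
    return _model


def predict(sepal_length, sepal_width, petal_length, petal_width):
    """Prediction step: only reads shared state, so concurrent calls are safe."""
    if _model is None:
        raise RuntimeError('call init() before predict()')
    point = (sepal_length, sepal_width, petal_length, petal_width)
    return min(_model, key=lambda s: math.dist(point, _model[s]))


# Train and save once, then initialize and predict as the API would.
model_path = os.path.join(tempfile.mkdtemp(), 'iris_model.pkl')
save_model(train(TRAINING_ROWS), model_path)
init(model_path)
print(predict(5.0, 3.6, 1.4, 0.2))  # setosa
```

Because `predict` never mutates the loaded model, several request threads can call it at once; the lock is only needed around the one-time load.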

What your Predictive API Engine should have for Production

  • Very low latency
  • Zero-downtime upgrades
  • High availability
  • Reproducibility
  • Logging
  • Security
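
If your engine does not provide latency and logging out of the box, you can approximate both around your own prediction function. The sketch below assumes a hypothetical `log_latency` decorator and a placeholder `predict` rule standing in for a real model; it uses only the standard library.

```python
import functools
import logging
import time

logger = logging.getLogger('predict_api')


def log_latency(func):
    """Log each call's duration in milliseconds; log failures with a traceback."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            logger.exception('%s failed', func.__name__)
            raise
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        logger.info('%s took %.2f ms', func.__name__, elapsed_ms)
        return result
    return wrapper


@log_latency
def predict(petal_length):
    # Placeholder rule standing in for a real model.
    return 'setosa' if petal_length < 2.5 else 'other'


logging.basicConfig(level=logging.INFO)
print(predict(1.4))  # setosa
```

Per-call timings in the log make it easier to notice when a new model version slows predictions down.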

Thank you