Thursday June 26, 2014

Background

And you may ask yourself, well… how did I get here?

Talking Heads

Analytics

Learning from data

All truth passes through three stages. First, it is ridiculed. Second, it is violently opposed. Third, it is accepted as being self-evident.

Arthur Schopenhauer, German philosopher (1788 – 1860)

Goals for Analytics

Test an existing state of knowledge.

Discover a new state of knowledge.

Use knowledge to improve the decision-making process.

Reporting: measure outcomes (wins, speed, distance, etc.)

Modeling: predict outcomes

Challenges for Analytics

More data > less data

More kinds of data > fewer kinds of data

Conveying the significance of findings to stakeholders (coaches, players, GMs, fans, etc.)

Lack of subject matter expertise

Examples

Luke: All right, I'll give it a try.

Yoda: No. Try not. Do… or do not. There is no try.

Which Shooting Performance is Better?

Scenario A: Player hits 7 FG on 12 FGA — all 2-PT — for a total of 14 points.

Scenario B: Player hits 6 FG on 12 FGA — four 2-PT FG and two 3-PT FG — for a total of 14 points.

  • Scenario B includes one more missed shot (6 vs. 5), which gives the offense one more chance at an offensive rebound.
  • A possession is worth ~1 pt, and the probability of getting an offensive rebound is ~30% (0.3).
  • Therefore, this "extra" missed field goal is worth ~0.3 pts to the offense.
  • Analyzing strategies and outcomes such as this requires knowledge of "states".

http://www.d3coder.com/thecity/2010/12/07/debate-which-shooting-performance-is-better/
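The arithmetic behind this comparison can be sketched in a few lines of Python. This is only an illustration using the slide's round numbers (a possession worth about 1 point, a 30% offensive-rebound rate). The function name `expected_points` and its parameters are hypothetical, and the sketch ignores everything else about the game state.

```python
def expected_points(points_scored, fga, fg_made, oreb_prob=0.3, possession_value=1.0):
    """Points scored plus the expected value of second chances after misses.

    Assumes each missed shot is rebounded by the offense with probability
    oreb_prob, and that a regained possession is worth possession_value points.
    """
    misses = fga - fg_made
    return points_scored + misses * oreb_prob * possession_value


# Scenario A: 7-for-12, all 2-PT (5 misses).
scenario_a = expected_points(points_scored=14, fga=12, fg_made=7)
# Scenario B: 6-for-12, four 2-PT and two 3-PT (6 misses).
scenario_b = expected_points(points_scored=14, fga=12, fg_made=6)

print(f"Scenario A: {scenario_a:.1f}")
print(f"Scenario B: {scenario_b:.1f}")
print(f"Edge for B: {scenario_b - scenario_a:.1f}")
```

Both scenarios score the same 14 points, so under these assumptions the whole difference comes from the one extra miss in Scenario B.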

ezPM Model for NBA Player Valuation

NBA Draft Combine Measurements

NBA Scoring Efficiency Frontier

"There was a key statistic missing."

Aging Curves

Visualizing Team Units

Visualizing Opponent Matchups

Clustering Scoring Styles

Play Rates and Efficiency

SportVU

nbawowy! Demo

MEAN (MongoDB + Express + AngularJS + Node.js)

> db.nbc_pbp_2014_final.findOne(
    {"type":"fga","shooter":"Stephen Curry","opponent":"Thunder","value":3},
    {"Warriors":1,"Thunder":1,"made":1,"release":1,"coords":1,"distance":1,"assist":1})
{
    "Thunder" : [
        "Russell Westbrook",
        "Thabo Sefolosha",
        "Kevin Durant",
        "Serge Ibaka",
        "Steven Adams"
    ],
    "Warriors" : [
        "Stephen Curry",
        "Klay Thompson",
        "Andre Iguodala",
        "David Lee",
        "Andrew Bogut"
    ],
    "_id" : ObjectId("535317c85bca6d54dd00df01"),
    "assist" : null,
    "coords" : {"x" : 18,"y" : 27},
    "distance" : 27,
    "made" : true,
    "release" : "pull up jump shot"
}
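
For readers less familiar with MongoDB, the query above has two parts: a filter (which documents match) and a projection (which fields come back). The Python sketch below mimics that behavior on an in-memory list of dictionaries. It is not how nbawowy! is implemented. `find_one` is a hypothetical helper, and the sample record simply combines the filter values from the query with the fields in the output above.

```python
def find_one(docs, query, projection):
    """Return the first doc whose fields equal every value in query.

    Only projected fields are returned, plus "_id", which MongoDB
    includes by default. Supports exact-match filters only.
    """
    wanted = {"_id"} | {key for key, keep in projection.items() if keep}
    for doc in docs:
        if all(doc.get(key) == value for key, value in query.items()):
            return {key: val for key, val in doc.items() if key in wanted}
    return None


# Sample record assembled from the query and output shown above.
shots = [{
    "_id": "535317c85bca6d54dd00df01",
    "type": "fga",
    "shooter": "Stephen Curry",
    "opponent": "Thunder",
    "value": 3,
    "Thunder": ["Russell Westbrook", "Thabo Sefolosha", "Kevin Durant",
                "Serge Ibaka", "Steven Adams"],
    "Warriors": ["Stephen Curry", "Klay Thompson", "Andre Iguodala",
                 "David Lee", "Andrew Bogut"],
    "assist": None,
    "coords": {"x": 18, "y": 27},
    "distance": 27,
    "made": True,
    "release": "pull up jump shot",
}]

result = find_one(
    shots,
    {"type": "fga", "shooter": "Stephen Curry", "opponent": "Thunder", "value": 3},
    {"Warriors": 1, "Thunder": 1, "made": 1, "release": 1,
     "coords": 1, "distance": 1, "assist": 1},
)
print(result["release"], result["distance"], result["made"])
```

Storing both on-court lineups inside each play-by-play record is what lets a single query return who was on the floor for a given shot.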

Tools of the trade

Developer Tools

Scripting languages

  • Python
  • Ruby
  • R

Web Frameworks

  • AngularJS
  • Ember.js
  • Backbone.js
  • Node.js (server)

Database

  • SQL (MySQL, Postgres)
  • NoSQL (MongoDB, Cassandra, CouchDB, …)

"Big Data"

  • Amazon Web Services (S3, EMR, EC2, Redshift, …)
  • Hadoop
  • MapReduce, Pig, Hive, Storm, Spark, etc.
  • AMQP, RabbitMQ, ZeroMQ

Viz Tools

Questions?

http://www.meetup.com/Bay-Area-Sports-Analytics-Meetup/

I know Kung Fu

Neo, The Matrix