Mathsport 2019: Establishing a performance edge in P2P betting

C. Donovan, C. Bleak & B. Caneco
02 July 2019

University of St Andrews & DMP Statistical Solutions UK Ltd.

Overview

  • Betting basics
  • The P2P world of gambling
  • Genuine edges can be hard to establish
  • Determining your power
  • Real-world example
  • Recap

Odds

We're basically considering odds over time.

  • There are different types, all somewhat related to the probability of an outcome
  • Fractional: e.g. 2/1 - bet £1 to potentially profit £2
  • Decimal: the multiples of your stake returned (including stake)

NB - these aren't standard statistical odds \( p/(1-p) \)

Odds and probs

Odds may be directly related to probabilities:

  • Consider a fair coin
  • Rational/informed/numerate players will only agree on decimal odds of 2, fractional 1/1
  • For games with calculable probs, rational odds are clear

These situations are in the minority - usually \( p \) is estimated

Honest Bob, local bookie

Observe that at time \( t \), the odds on offer don't simply reflect probabilities

  • Must be an estimate, the world is complex
  • The inverted decimal odds don't sum to one
  • These will also shift through time to balance Bob's book - whereas the underlying \( p \) may remain constant

Honest Bob

P2P trading

Gambling has gotten very modern:

  • We can play on both sides now - you can “sell” or “buy” odds
  • Acting as bookie, we lay - we sell some odds \( X \), making money if the event \( A \) doesn't happen (we're gambling against it)
  • Acting as punter, we back - we offer to buy some odds \( X \), making money if \( A \) does happen (we're gambling for it)
  • There is almost direct analogy with going long/short on stocks

P2P trading

Some dogs…

They look like this

P2P trading

Can be seen like this

P2P trading

  • There are 1000s of methods you might devise to make money - but which ones are actually effective?
  • To assess this, you need a good grasp on variance and statistical fundamentals
  • Power and Gambler's Ruin are classics that need transferring to the P2P area

Power

  • This basically quantifies our ability to detect signal of specific sizes for a system (that comprises signal and noise)
  • Closely related to type-1 error (false positives) - which drives our thresholds for \( p \)-values in statistical testing
  • In short: “Given an effect of size \( X \), how likely are we to find it?”
  • Relatedly, “How much data do we need to be confident in finding the signal?”

Do I have an edge? (some sums)

The basic problem is commonplace in statistics - the building blocks are Bernoulli/Binomial:

  • Tests and CIs on \( p \)
  • Some complications:
    • We have “portfolios” of bets which are a mix of RVs
    • Independence issues
    • Market forces through time and stochasticity in matching

Do I have an edge? (some sums)

  • Assume we have a system that operates at odds \( \omega \) - without edge, we have implied \( p = 1/\omega \)
  • Over the course of 1000 plays, the sampling distribution for \( p \) is roughly Normal (if \( p \) isn't very small or large)
  • The central 95% of this is

\[ p \pm z_{0.025} \sqrt{p(1-p)/1000} \]

Do I have an edge? (some sums)

  • To make this concrete - assume I'm laying at odds around 40
n <- 1000; alpha <- 0.05; p <- 1/40
lower <- p - qnorm(1 - alpha/2)*sqrt(p*(1-p)/n)
upper <- p + qnorm(1 - alpha/2)*sqrt(p*(1-p)/n)
lower; upper
[1] 0.01532345
[1] 0.03467655
  • So observed proportions of losses of 0.015 to 0.035 are pretty common
  • [practically, that is huge]

Do I have an edge? (some sums)

By simulation under the Null:

  # ... some other code
  # empirically
  quantile(sampDistp, c(0.025, 0.975))
 2.5% 97.5% 
0.016 0.035 
  quantile(returnDist, c(0.025, 0.975))
 2.5% 97.5% 
 -400   360 
  # probability of finding p>0.1 after 1000 trials
  prob10percent <- length(which(returnDist/n >= 0.1))/reps
  prob10percent
[1] 0.329
  • So we have about a 32.9% chance of finding some “edge” of 10% or more, where none exists, after 1000 trades (scary!)
  • [Could have gotten that from Binomial first principles: 22 or fewer losses would turn a profit of 100+ units on 1000 trials - see the sketch below]
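A minimal first-principles version of that check (a sketch, assuming unit lay stakes at decimal odds 40, so each loss costs 39 units and each win gains 1):

  # Binomial check under the Null: with k losses out of n lays, the return is n - 40k,
  # so a return of 100+ units needs k <= 22
  n <- 1000; p <- 1/40
  pbinom(22, size = n, prob = p)  # roughly 0.3, in the same ballpark as the simulated 0.329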

Do I have an edge? (simulating)

By simulation under the Null (sampling distribution for profit units):

plot of chunk unnamed-chunk-6

Do I have an edge? (simulating stuff)

Useful to look at some trading histories

plot of chunk unnamed-chunk-8

  • This is just 20 random traces - some might convince you to mortgage your house. Scary!
  • We're really prone to being fooled by long odds.

Strategies are more like portfolios

  • It's unlikely you're simply gambling at one particular set of odds
  • We can do similar calcs for a mix, but we need to know the mix

For example:

plot of chunk unnamed-chunk-10

Betting mixes

We can repeat the exercise as before.

  # empirically
  quantile(sampDistp, c(0.025, 0.975))
    2.5%    97.5% 
0.147975 0.192000 
  quantile(returnDist, c(0.025, 0.975))
    2.5%    97.5% 
-143.025  139.000 
  prob10percent <- length(which(returnDist/n >= 0.1))/reps
  prob10percent
[1] 0.076
  • So for this mix, there's a 7.6% chance of finding a spurious 10% edge after 1000 bets (average odds are 6) - a sketch of such a simulation follows
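A minimal sketch of such a Null simulation for a mixed lay portfolio (the odds mix below is assumed for illustration - the actual mix comes from the distribution of positions plotted earlier):

  # Null simulation for a mix of lay positions (assumed odds mix, averaging ~6)
  set.seed(42)
  n <- 1000; reps <- 10000
  odds <- sample(c(3, 5, 8, 12), size = n, replace = TRUE, prob = c(0.3, 0.35, 0.25, 0.1))
  returnDist <- replicate(reps, {
    eventWins <- rbinom(n, 1, 1/odds)             # outcomes at the fair implied probabilities
    sum(ifelse(eventWins == 1, -(odds - 1), 1))   # layer pays (odds - 1) if the event wins, keeps the 1-unit stake otherwise
  })
  quantile(returnDist, c(0.025, 0.975))           # central 95% of Null returns
  mean(returnDist/n >= 0.1)                       # chance of a spurious 10% "edge"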

Betting mixes - sampling dist under H0

This leads to a distribution of returns like this after 1000 bets with this distribution of positions:

plot of chunk unnamed-chunk-13

Do I have an edge? (simulating stuff)

plot of chunk unnamed-chunk-14

  • Less scary, I think - mainly a function of the lower odds

Back to power

  • Power calculations require some speculative size of signal
  • Here I'll assume a 10% edge is a minimum requirement to be profitable (commissions eat into things)
  • A 10% edge here means the probability of winning is 10% less than the odds imply (I'm laying)
  • The same bet-portfolio applies, i.e. I need to know the distribution of betting odds achieved - a sketch of such a power calculation follows
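A minimal sketch of such a power calculation, for a single assumed odds level rather than the actual bet-portfolio (so the numbers are illustrative only):

  # Power by simulation: Null p0 = 1/omega vs a 10% lay edge, p1 = 0.9/omega (assumed values)
  set.seed(1)
  omega <- 6; n <- 1000; reps <- 10000
  p0 <- 1/omega; p1 <- 0.9/omega
  crit <- qbinom(0.05, n, p0) - 1            # largest loss count still "surprisingly good" at the 5% level under the Null
  lossesUnderEdge <- rbinom(reps, n, p1)     # losses per 1000 lays when the edge is real
  mean(lossesUnderEdge <= crit)              # estimated power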

Back to power

The sampling distribution is a bit like this:

plot of chunk unnamed-chunk-17

With central 95% of:

  # empirically
  quantile(sampDistp_edge10, c(0.025, 0.975))
 2.5% 97.5% 
0.132 0.176 
  quantile(returnDist_edge10, c(0.025, 0.975))
   2.5%   97.5% 
-46.000 220.025 

Do I have an edge? (simulating stuff)

Comparing a speculative edge to one without:

plot of chunk unnamed-chunk-19

  • The general trend contributed by the edge is clear

Do I have an edge? (simulating stuff)

Comparing a speculative edge (pink) to one without (blue):

plot of chunk unnamed-chunk-20

  • Which gives a power of 24.4% if we use a type-1 error of 5%
  • Not great - even with 1000 bets, we'd be unlikely to establish our 10% edge

Do I have an edge? (simulating stuff)

Increasing our data helps of course (1000 bets):

plot of chunk unnamed-chunk-21

Versus 10k bets - power >90%:

plot of chunk unnamed-chunk-22

Two-point distributions

Take \( Y \) to be the outcome (1 a win, 0 a loss), \( P(Y=1) = p \) and \( P(Y=0) = 1-p \)

For laying, the return \( R \) is given by

\[ R = \begin{cases} -S(\omega - 1) & \mbox{if } Y = 1\\ S & \mbox{if } Y = 0\\ \end{cases} \]

Two-point distributions

With a few sums (a quick derivation follows the bullets)

  • \( E[R] = 0 \) if \( p \) = \( \omega^{-1} \) (not accounting for commission)
  • \( V[R] = \omega-1 \)
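A quick derivation (assuming a unit stake, \( S = 1 \), and fair implied probability \( p = 1/\omega \)):

\[ E[R] = -p(\omega - 1) + (1 - p) = -\frac{\omega - 1}{\omega} + \frac{\omega - 1}{\omega} = 0 \]

\[ V[R] = E[R^2] = p(\omega - 1)^2 + (1 - p) = \frac{(\omega - 1)^2 + (\omega - 1)}{\omega} = \omega - 1 \]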

The second is of particular interest: if we assume independence of events, then we can get the variance of the total return over many such bets by simply summing.

Two-point distributions

More generally, we'd have a mix of positions

  • \( V[R_i] = \omega_i-1 \)
  • so the variance of the total return over \( n \) such bets, under independence, is \( \sum_{i=1}^{n} V[R_i] = \sum_{i=1}^{n} (\omega_i - 1) \)

So we can go directly to the return distribution - which will also be predictable in shape for large numbers of bets (Normal - yay for the statistical free lunch that is the CLT).
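A minimal sketch of that shortcut (the odds vector here is a hypothetical stand-in for a system's filled lay positions):

  # Normal approximation to the Null return distribution, using V[R_i] = omega_i - 1 and independence
  odds <- rep(c(3, 5, 8, 12), times = c(300, 350, 250, 100))   # hypothetical 1000 filled positions
  sdReturn <- sqrt(sum(odds - 1))                              # sd of the total Null return
  qnorm(c(0.025, 0.975), mean = 0, sd = sdReturn)              # approximate central 95% of Null returns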

Caveats

  • We're assuming independence, but generally multiple positions might be sought within a single event
  • Normality is asymptotic - so long odds with small N won't be so predictable
  • However, you need the distribution of \( p \) or \( \omega \) for your system
  • This could easily be estimated by running the system on historical data

Example

  • Data are collected from global horse racing events with reasonable liquidity
  • This is 200+ events per day
  • Odds data are collected on a 1/3-second basis for a period prior to, and during, each event
  • Currently approximately 2TB of uncompressed data per annum
  • Automated system(s) trained on historic data will take positions pre-play

Example

  • Analysis over historic data provides a distribution of positions taken
  • Entry rate (i.e. positions filled) is approximately 40%
  • Projected return is approx 16% per position, post-commission
  • How much data is required to establish this is genuine?

Example

This system has a distribution of positions like this (from running it on historic data):

plot of chunk unnamed-chunk-24

Example

  • From which we can derive a variance of returns after 1000 bets of about 8200.
  • Suggesting an approximate Normal(0, 90) return distribution
  • So really, unless our system is thought to achieve more than 180 units on 1000 bets, it will still look like noise at this point (quick check below)
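A quick check of those numbers (the variance of roughly 8200 comes from summing \( \omega_i - 1 \) over the filled positions):

  sqrt(8200)       # ~90.5, the sd behind the Normal(0, 90) approximation
  2 * sqrt(8200)   # ~181, so the 180-unit threshold is roughly two sds under the Null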

The simulated returns, over about 2500 trades, look like the following:

Example

Performance on historical data

plot of chunk unnamed-chunk-26

Example

Is it any good?

  • Apparent returns are about 16% per position taken, post-commission
  • With no edge at all, returns of up to around 280 units would still be unremarkable - we've hit 400
  • \( p \) is approximately 0.002 under \( N(0, 140) \) (quick check below). An edge seems likely (even not accounting for commission), but the magnitude is uncertain (and this is not real trading at this point).
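A quick check of that tail probability under the Normal(0, 140) Null:

  1 - pnorm(400, mean = 0, sd = 140)   # ~0.002, the chance of reaching 400 units with no edge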

Example

Is it any good?

  • How much actual trading do we need to do to be confident there really is a double digit return?
  • We'd calculate low power for, say, 4000 future bets.

Example

plot of chunk unnamed-chunk-27

Example

Historic and trading combined

plot of chunk unnamed-chunk-28

Example

Compared to Null simulations

plot of chunk unnamed-chunk-30

Example

Trading and historic

Example

[without the unfilled bets]

Trading and historic

NB - The evils of commission

  • Bookies effectively profit by offering poor odds - refer to the bookies' over-round
  • P2P exchanges levy a commission on winnings - say 5% for Betfair (variable - up to 60%), 2% for Betdaq (some caveats), and others
  • This can be substantial! Consider a system offering 10 units won to 9 units lost - commission is now actually 50% of profits (quick sum below).
  • Hence this really needs consideration in all calculations
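The arithmetic behind that example (a sketch, assuming a 5% commission charged on the winning side only):

  # 10 units won, 9 units lost; commission taken at 5% of winnings
  winnings <- 10; losses <- 9; commissionRate <- 0.05
  grossProfit <- winnings - losses             # 1 unit before commission
  commission  <- commissionRate * winnings     # 0.5 units
  commission / grossProfit                     # commission is 50% of the gross profit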

Main points

  • The general availability of lay betting is dangerous
    • Liabilities are easily large (\( S(\omega-1) \))
    • Medium to long odds laying gives the appearance of success - it is very easy to be fooled by randomness
  • You should know the properties of the Null position before you start betting your house
    • Simple simulations and sums are informative

Main points

  • Power is quite low for any form of longer odds gambling
  • Even large data isn't sufficient to offer confidence for long-odds approaches
  • Consider the Premier League - some 380 games per season - years of data may be required
  • These calculations are also for filled bets - placing a lot of bets doesn't equal a lot of data

Questions?

Roland Langrock

This guy