C. Donovan, C. Bleak & B. Caneco
02 July 2019
University of St Andrews & DMP Statistical Solutions UK Ltd.
We're basically considering odds over time.
NB - these aren't standard statistical odds \( p/(1-p) \)
Odds may be directly related to probabilities:
These situations are in the minority - usually \( p \) is estimated
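As a minimal sketch of the direct case, assuming decimal odds \( \omega \) (the convention that the lay-return formula later in these slides is consistent with): a fair price for an event of known probability \( p \) is \( \omega = 1/p \), which is not the statistical odds \( p/(1-p) \). The function names are illustrative and not from the original slides.

```r
# Fair decimal odds for a known win probability p (assumes no margin in the price).
fair_decimal_odds <- function(p) {
  stopifnot(all(p > 0 & p <= 1))
  1 / p
}

# Probability implied by decimal odds, ignoring any margin built into the price.
implied_prob <- function(omega) {
  stopifnot(all(omega >= 1))
  1 / omega
}

# Standard statistical odds in favour, for comparison.
statistical_odds <- function(p) p / (1 - p)

p <- 1 / 40
fair_decimal_odds(p)   # decimal odds of 40
statistical_odds(p)    # 1/39, a different quantity
```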
Observe that at time \( t \), the odds on offer don't simply reflect probabilities
Gambling has gotten very modern:
Some dogs…
The basic problem is commonplace in statistics - the building blocks are Bernoulli/Binomial:
\[ p \pm z_{\alpha/2} \sqrt{p(1-p)/1000}, \qquad \alpha = 0.05 \]
n <- 1000; alpha <- 0.05; p <- 1/40
# qnorm(1 - alpha/2) is the positive critical value (about 1.96)
z <- qnorm(1 - alpha/2)
upper <- p + z*sqrt(p*(1-p)/n)
lower <- p - z*sqrt(p*(1-p)/n)
upper;lower
[1] 0.03467655
[1] 0.01532345
By simulation under the Null:
# ... some other code
# empirically
quantile(sampDistp, c(0.025, 0.975))
2.5% 97.5%
0.016 0.035
quantile(returnDist, c(0.025, 0.975))
2.5% 97.5%
-400 360
# probability of an average return per bet of at least 0.1 units after 1000 trials
prob10percent <- length(which(returnDist/n >= 0.1))/reps
prob10percent
[1] 0.329
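The simulation code itself is elided above. A minimal sketch that is consistent with the printed quantiles, assuming each of the n bets is a unit-stake lay at the fair decimal odds of 1/p = 40, and assuming reps = 1000 replicates (both are assumptions, not stated on the slide):

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 1000; p <- 1/40; reps <- 1000
omega <- 1 / p                   # fair decimal odds (assumption)

# number of winning selections in each replicate of n bets
wins <- rbinom(reps, size = n, prob = p)

# sampling distribution of the estimated win probability
sampDistp <- wins / n

# unit-stake lay: +1 when the selection loses, -(omega - 1) when it wins
returnDist <- (n - wins) - (omega - 1) * wins

quantile(sampDistp, c(0.025, 0.975))
quantile(returnDist, c(0.025, 0.975))
mean(returnDist / n >= 0.1)      # same quantity as prob10percent
```

Under these assumptions the total return is n - 40 × wins, so win counts of 35 and 16 correspond to returns of -400 and 360, in line with the quantiles shown.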
By simulation under the Null (sampling distribution for profit units):
Useful to look at some trading histories
For example:
We can repeat as per previous.
# empirically
quantile(sampDistp, c(0.025, 0.975))
2.5% 97.5%
0.147975 0.192000
quantile(returnDist, c(0.025, 0.975))
2.5% 97.5%
-143.025 139.000
prob10percent <- length(which(returnDist/n >= 0.1))/reps
prob10percent
[1] 0.076
This leads to a distribution of returns like this after 1000 bets with this distribution of positions:
The sampling distribution is a bit like this:
With central 95% of:
# empirically
quantile(sampDistp_edge10, c(0.025, 0.975))
2.5% 97.5%
0.132 0.176
quantile(returnDist_edge10, c(0.025, 0.975))
2.5% 97.5%
-46.000 220.025
Comparing a speculative edge to that without:
Comparing a speculative edge (pink) to that without (blue):
Increasing our data helps of course (1000 bets):
Versus 10k bets - power >90%:
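The effect of sample size on power can be sketched by simulation. The following is illustrative only: it assumes unit-stake lays at a single decimal odds value, a one-sided 5% test against the null of fair odds, and a true edge of 10% return per unit staked. The odds value and the function name are hypothetical, and this is not the mix of positions behind the plots.

```r
set.seed(1)

# Estimated power to detect an edge when laying n unit-stake bets at decimal odds omega.
# edge = expected return per unit staked under the alternative (0 under the null).
lay_power <- function(n, omega = 6, edge = 0.1, alpha = 0.05, reps = 5000) {
  p_null <- 1 / omega            # fair odds: E[R] = 1 - p * omega = 0
  p_alt  <- (1 - edge) / omega   # E[R] = 1 - p * omega = edge
  total_return <- function(p) n - omega * rbinom(reps, size = n, prob = p)
  crit <- quantile(total_return(p_null), 1 - alpha)  # null critical value
  mean(total_return(p_alt) > crit)                   # share of rejections
}

power_1k  <- lay_power(1000)
power_10k <- lay_power(10000)
c(power_1k, power_10k)
```

Under these assumed settings, power should sit well below 90% at 1,000 bets and above it at 10,000.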
Take \( Y \) to be the outcome (1 a win, 0 a loss), \( P(Y=1) = p \) and \( P(Y=0) = 1-p \)
For laying, the return \( R \) is given by
\[ R = \begin{cases} -S(\omega - 1) & \mbox{if } Y = 1\\ S & \mbox{if } Y = 0\\ \end{cases} \]
With a few sums
The second is of particular interest. If we assume independence of events then we can get the variance of many such events by simple summing.
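For reference, the sums can be worked through directly from the definition of \( R \) above, writing \( R = S(1 - \omega Y) \). The expressions from the original slide are not reproduced here, so this is a reconstruction from the definitions rather than a copy of them:

\[ E[R] = S(1-p) - S(\omega - 1)p = S(1 - \omega p) \]

\[ \mathrm{Var}[R] = S^2 \omega^2 \, \mathrm{Var}[Y] = S^2 \omega^2 p(1-p) \]

For \( n \) independent lays with stakes \( S_i \), odds \( \omega_i \) and win probabilities \( p_i \), the variances add:

\[ \mathrm{Var}\left[\sum_{i=1}^{n} R_i\right] = \sum_{i=1}^{n} S_i^2 \omega_i^2 p_i(1-p_i) \]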
More generally, we'd have a mix of positions
So we can go directly to the return distribution - which will also be predictable in shape for large numbers of bets (Normal - yay for the statistical free lunch that is the CLT).
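A minimal sketch of this idea, assuming independent unit-stake lays whose odds are fair under the null. The mix of odds below is hypothetical and is not the system's actual distribution of positions. For a lay, \( E[R] = S(1 - \omega p) \) and \( \mathrm{Var}[R] = S^2 \omega^2 p(1-p) \), so the mean and variance of the total follow by summing, and the Normal approximation can be checked against simulation:

```r
set.seed(1)

# Hypothetical mix of lay positions: decimal odds, stakes, and win probabilities.
omega <- sample(c(2, 4, 6, 10, 20), size = 1000, replace = TRUE)
S     <- rep(1, length(omega))   # unit stakes
p     <- 1 / omega               # null: odds are fair

# Analytic mean and variance of the total return, assuming independent bets.
mu     <- sum(S * (1 - omega * p))
sigma2 <- sum(S^2 * omega^2 * p * (1 - p))

# Simulated total returns for comparison.
reps <- 2000
sim_total <- replicate(reps, {
  y <- rbinom(length(p), size = 1, prob = p)   # 1 = selection wins
  sum(ifelse(y == 1, -S * (omega - 1), S))
})

# Normal (CLT) approximation to the central 95% versus the empirical quantiles.
mu + qnorm(c(0.025, 0.975)) * sqrt(sigma2)
quantile(sim_total, c(0.025, 0.975))
```

The two intervals should be close for this many bets, so the simulation step can be skipped once the mix of positions is known.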
This system has a distribution of positions like this (from running it on historic data):
The simulated returns look like the following, which is about 2500 trades:
Performance on historical data
Is it any good?
Historic and trading combined
Compared to Null simulations
[without the unfilled bets]
Roland Langrock