Women’s World Cup 2023 Prediction Competition

Milt Mavrakakis & Iain Gourlay / Smartodds

What we do

  • We predict outcomes of professional sports on behalf of our clients.

  • Started with football, now also working on American football, baseball, basketball, cricket, golf, ice hockey, tennis, and more.

  • We’re always recruiting. Join us!

smartodds.co.uk/jobs/quantitative-analyst

Are goals Poisson?

How do we test this?

  • Fit Poisson distribution to home and away goals (separately).

  • Compare observed and expected goals.

  • Perform goodness-of-fit test.

(This is the most common approach in the literature.)
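The three steps above can be sketched as follows. This is a minimal illustration, not the code behind the results in this talk: the function name `poisson_gof` and the binning (0 to 5 goals, then 6 or more) are assumptions, chosen so that the degrees of freedom come out at 5.

```python
import numpy as np
from scipy import stats


def poisson_gof(goals, max_bin=6):
    """Chi-squared goodness-of-fit test of a single Poisson to goal counts.

    Counts are binned as 0, 1, ..., max_bin - 1, and max_bin-or-more.
    """
    goals = np.asarray(goals)
    n = goals.size
    lam = goals.mean()  # maximum-likelihood estimate of the Poisson rate

    # Observed frequencies; the top bin collects everything >= max_bin.
    observed = np.bincount(np.minimum(goals, max_bin), minlength=max_bin + 1)

    # Expected frequencies under Poisson(lam); the top bin takes the upper tail.
    probs = stats.poisson.pmf(np.arange(max_bin + 1), lam)
    probs[-1] = stats.poisson.sf(max_bin - 1, lam)
    expected = n * probs

    chi_sq = float(((observed - expected) ** 2 / expected).sum())
    df = (max_bin + 1) - 1 - 1  # bins, minus one, minus one fitted parameter
    p_value = float(stats.chi2.sf(chi_sq, df))
    return chi_sq, df, p_value
```

Calling `poisson_gof(home_goals)` and `poisson_gof(away_goals)` separately, where both arrays are hypothetical inputs holding one goal count per match, gives one row each of a chiSq / df / pValue table.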

Some results (EPL 2010-2015)

       chiSq df pValue
home 13.4194  5 0.0197
away 23.4891  5 0.0003

A hypothesis test

  • What are we actually testing?

\[ Y^{(h)}_i \stackrel{\text{iid}}\sim \text{Poisson}(\lambda); \; i=1,...,N \\ Y^{(a)}_i \stackrel{\text{iid}}\sim \text{Poisson}(\mu); \; i=1,...,N \]

  • Is this a sensible test?

A better assumption

  • It makes more sense to assume that

\[ Y^{(h)}_i \stackrel{\text{ind}}\sim \text{Poisson}(\lambda_i); \; i=1,...,N \\ Y^{(a)}_i \stackrel{\text{ind}}\sim \text{Poisson}(\mu_i); \; i=1,...,N \]

  • We want \(\lambda_i\) and \(\mu_i\) to depend on some covariates \(X_i\) and some parameters \(\theta\).

  • Rates that vary from match to match can easily look like overdispersion in the aggregated data.
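One way to see why, as a sketch: assume the match rates \(\lambda_i\) can be treated as draws from some distribution with mean \(\bar\lambda\) and variance \(\sigma^2_\lambda\) (notation introduced here for illustration only). The law of total variance then gives, for a randomly chosen match,

\[ \text{E}[Y^{(h)}] = \bar\lambda, \qquad \text{Var}(Y^{(h)}) = \text{E}[\text{Var}(Y^{(h)} \mid \lambda)] + \text{Var}(\text{E}[Y^{(h)} \mid \lambda]) = \bar\lambda + \sigma^2_\lambda \]

so the pooled variance exceeds the pooled mean whenever the rates differ at all, even though every individual match is exactly Poisson. The same argument applies to away goals with \(\mu_i\).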

A simple model

  • Each team has an attack rating (\(\alpha\)) and a defence rating (\(\beta\)).

  • We also have home advantage (\(\eta\)) and a global mean (\(\gamma\)).

  • All of these are constant over time.

  • We set

\[ \log(\lambda_i) = \alpha_{\text{home}(i)} + \beta_{\text{away}(i)} + \gamma + \eta/2 \\ \log(\mu_i) = \alpha_{\text{away}(i)} + \beta_{\text{home}(i)} + \gamma - \eta/2 \]
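The two equations translate directly into code. The sketch below assumes the ratings are stored in dictionaries keyed by team name. The function name `scoring_rates` and the example teams are hypothetical.

```python
import math


def scoring_rates(alpha, beta, home, away, gamma, eta):
    """Expected home and away goals for one match under the simple model.

    alpha, beta: dicts mapping team name to attack and defence rating.
    gamma: global mean (log scale); eta: home advantage (log scale).
    """
    log_lambda = alpha[home] + beta[away] + gamma + eta / 2
    log_mu = alpha[away] + beta[home] + gamma - eta / 2
    return math.exp(log_lambda), math.exp(log_mu)


# Two hypothetical, evenly matched teams: only home advantage separates them.
alpha = {"A": 0.0, "B": 0.0}
beta = {"A": 0.0, "B": 0.0}
lam, mu = scoring_rates(alpha, beta, "A", "B", gamma=0.0, eta=0.2)
```

Note that \(\beta\) enters the opponent's scoring rate, so under this parameterisation a larger \(\beta\) means more goals conceded.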

A simulated example

  • Sample \(\alpha\), \(\beta\) from a Gaussian. Assume \(\alpha\) and \(\beta\) are positively correlated.

  • Set reasonable values for \(\eta\) and \(\gamma\).

  • Generate 5 seasons’ worth of data using the independent Poisson assumption.

  • Summary statistics:

    mean(HG) sd(HG) mean(AG) sd(AG)
EPL    1.564  1.305    1.183  1.150
Sim    1.556  1.330    1.186  1.149
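A simulation along these lines might look like the sketch below. The values used in the talk are not given, so every default here (20 teams, rating standard deviation 0.2, correlation 0.5, \(\gamma = 0.3\), \(\eta = 0.3\)) is an illustrative assumption, as is the double round-robin fixture list. It will not reproduce the exact figures in the table.

```python
import numpy as np


def simulate_seasons(n_teams=20, n_seasons=5, sd=0.2, rho=0.5,
                     gamma=0.3, eta=0.3, seed=0):
    """Simulate home and away goals with fixed team ratings."""
    rng = np.random.default_rng(seed)

    # Attack and defence ratings: bivariate Gaussian, positively correlated.
    cov = sd**2 * np.array([[1.0, rho], [rho, 1.0]])
    ratings = rng.multivariate_normal([0.0, 0.0], cov, size=n_teams)
    alpha, beta = ratings[:, 0], ratings[:, 1]

    # Double round robin: every ordered (home, away) pair with home != away,
    # repeated once per season because ratings are constant over time.
    home, away = np.nonzero(~np.eye(n_teams, dtype=bool))
    home = np.tile(home, n_seasons)
    away = np.tile(away, n_seasons)

    # Match-specific rates from the simple model, then independent Poisson.
    lam = np.exp(alpha[home] + beta[away] + gamma + eta / 2)
    mu = np.exp(alpha[away] + beta[home] + gamma - eta / 2)
    return rng.poisson(lam), rng.poisson(mu)


hg, ag = simulate_seasons()
summary = (hg.mean(), hg.std(ddof=1), ag.mean(), ag.std(ddof=1))
```

The arrays `hg` and `ag` can then be passed through the same goodness-of-fit test as the real data.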

Simulation results

       chiSq df pValue
home 24.3305  5 0.0002
away 12.6575  5 0.0268

Moral of the story

  • Don’t rely too much on empirical/aggregated data.

  • Make sure that plots and tests are relevant to your modelling assumptions.

  • Answer the question by formulating a model.

  • Always a good idea to simulate from your model.

Modelling women’s football

  • Conventional wisdom: extreme scorelines are overrepresented (relative to independent Poisson).

  • Different to men’s football, where low scorelines are overrepresented.

  • Is this true?
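To make "relative to independent Poisson" concrete, the sketch below builds the scoreline percentages implied by one pair of rates, using the same 0 to 6+ layout as the tables that follow. The function names are hypothetical. As the earlier sections caution, this single-rate grid is only an aggregated baseline: a mismatch against it does not by itself rule out independent Poisson goals with match-specific rates.

```python
import math

import numpy as np


def poisson_probs(rate, top=6):
    """P(Y = 0), ..., P(Y = top - 1), P(Y >= top) for Y ~ Poisson(rate)."""
    pmf = [math.exp(-rate) * rate**k / math.factorial(k) for k in range(top)]
    return np.array(pmf + [1.0 - sum(pmf)])


def independent_poisson_grid(lam, mu, top=6):
    """Scoreline percentages: rows are home goals, columns are away goals."""
    return 100.0 * np.outer(poisson_probs(lam, top), poisson_probs(mu, top))


# Illustrative rates only; not estimated from the international data.
grid = independent_poisson_grid(1.5, 1.2)
```

Subtracting such a grid from an observed percentage table shows which scorelines occur more or less often than the baseline predicts.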

International results

Women:

         awayGoals
homeGoals    0    1    2    3    4    5   6+
       0  5.68 7.02 4.70 3.29 2.62 1.90 3.24
       1  8.58 7.61 4.67 2.37 1.16 0.57 0.82
       2  6.85 5.78 2.59 1.29 0.54 0.13 0.15
       3  4.89 3.11 1.51 0.52 0.12 0.08 0.03
       4  4.00 1.68 0.60 0.25 0.08 0.02 0.02
       5  2.62 1.16 0.29 0.08 0.02 0.03 0.00
       6+ 5.96 1.07 0.29 0.02 0.00 0.00 0.00

Men:

         awayGoals
homeGoals     0     1     2     3     4     5    6+
       0   9.20  7.87  5.17  2.32  1.15  0.42  0.62
       1  11.74 10.64  5.12  1.93  0.85  0.26  0.22
       2   8.60  7.64  3.67  1.38  0.41  0.14  0.08
       3   4.90  3.44  1.60  0.42  0.17  0.02  0.01
       4   2.75  1.63  0.70  0.23  0.06  0.00  0.00
       5   1.34  0.65  0.19  0.07  0.00  0.00  0.00
       6+  1.63  0.62  0.14  0.01  0.01  0.00  0.00

International results - comparison

Difference (women minus men):

         awayGoals
homeGoals     0     1     2     3     4     5    6+
       0  -3.52 -0.85 -0.46  0.97  1.47  1.48  2.63
       1  -3.16 -3.03 -0.45  0.43  0.31  0.31  0.60
       2  -1.74 -1.86 -1.08 -0.09  0.13  0.00  0.07
       3  -0.01 -0.33 -0.08  0.10 -0.05  0.06  0.02
       4   1.24  0.05 -0.09  0.02  0.02  0.02  0.02
       5   1.28  0.51  0.09  0.01  0.02  0.03  0.00
       6+  4.33  0.45  0.15  0.00 -0.01  0.00  0.00

  • What can we conclude?

Thank you

Get in touch!

Milt Mavrakakis miltiadis.mavrakakis@smartodds.co.uk

Iain Gourlay iain.gourlay@smartodds.co.uk