Using some MLS data, I wanted to illustrate some issues that I think are important to be wary of when reading about ELO ratings. I’m not saying that ELO ratings are invalid - but rather that, depending upon the choices we make (regarding the constant used, for instance), we may be addressing different questions from those we think we are. I also caution against using these ratings to rank teams as 1st, 2nd, 3rd etc. These ratings should be interpreted more as a guide to the respective difference in win probability between two teams - not a categorical assessment of who is definitively better. I prefer the Glicko ratings system as it adds a deviation measure that gets at this issue.

I have written this very quickly and without too much discussion. I hope it’s easy to follow as I didn’t initially write it with the idea that I would post it and haven’t had time to edit. I might possibly add to it in the future.

 

Using MLS 2015 data

Thanks to Tom Worville for providing me with the MLS results data. Here are some initial ELO ratings of the MLS teams using data from the previous season:

##                      team  elo
## 1            Chicago Fire 1950
## 2         Colorado Rapids 1905
## 3           Columbus Crew 2050
## 4               DC United 2075
## 5               FC Dallas 2050
## 6          Houston Dynamo 1905
## 7      Los Angeles Galaxy 2160
## 8         Montreal Impact 1900
## 9  New England Revolution 2025
## 10          New York City 1900
## 11     New York Red Bulls 2025
## 12           Orlando City 1920
## 13     Philadelphia Union 2000
## 14       Portland Timbers 2000
## 15         Real Salt Lake 2075
## 16   San Jose Earthquakes 1925
## 17       Seattle Sounders 2075
## 18   Sporting Kansas City 2035
## 19             Toronto FC 1950
## 20    Vancouver Whitecaps 2010

 

These will be the starting point, though we could have decided to start every team at the same value. When we have only 20 weeks’ worth of data to explore changes in ratings, I think it makes sense to use previous ratings as our starting point. However, when teams change a lot from season to season, it may make sense to reset the ratings. Alternatively, with the Glicko system (below) we could keep the ratings but increase the deviation (our uncertainty) at the change of each season.

 

Below are the results of the 2015 MLS season. I’m showing the first 6 and last 6 rows. There are 201 results in total, broken down into 20 weeks. The result variable = 1 if home win, 0 if away win and 0.5 if a tie:

##         Date                 home                visitor result week
## 1 2015-03-09     Seattle Sounders New England Revolution    1.0    1
## 2 2015-03-09 Sporting Kansas City     New York Red Bulls    0.5    1
## 3 2015-03-08            FC Dallas   San Jose Earthquakes    1.0    1
## 4 2015-03-08       Houston Dynamo          Columbus Crew    1.0    1
## 5 2015-03-08         Orlando City          New York City    0.5    1
## 6 2015-03-08     Portland Timbers         Real Salt Lake    0.5    1
##           Date                 home              visitor result week
## 196 2015-07-18     Portland Timbers  Vancouver Whitecaps    0.5   20
## 197 2015-07-18       Real Salt Lake       Houston Dynamo    1.0   20
## 198 2015-07-18     Seattle Sounders      Colorado Rapids    0.0   20
## 199 2015-07-18 Sporting Kansas City      Montreal Impact    1.0   20
## 200 2015-07-18   Los Angeles Galaxy San Jose Earthquakes    1.0   20
## 201 2015-07-18           Toronto FC   Philadelphia Union    1.0   20

 

Some Considerations

ELO ratings are re-calculated every week. If a team plays twice in that week then only one new rating is calculated. The ELO rating contains a constant ‘k’, which sets how far a rating moves after each result. This can be ‘optimally’ worked out based upon historical data (i.e. knowing that a difference of N ratings points between two teams should lead to a win probability of x% - e.g. a 100 point rating difference ought to be a 64% win probability for the superior team). However, it can be difficult to have sufficient historical data to accurately calculate ‘k’. Further, the distribution of win probabilities over a range of ratings differences may not perfectly fit the curve/distribution that ELO ratings are based upon.

For soccer, generally a k of between 20 and 40 appears to be satisfactory. It’s also possible to use different k values depending on whether the winner was expected or unexpected, but I won’t develop this idea here.
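
To make the role of k concrete, here is a minimal Python sketch of the standard ELO expected-score curve and a once-per-week update. It is not the code used to produce the tables in this post. The function names, the 400-point scale and the choice to code results from the rated team’s point of view (rather than home/away as in the results table) are assumptions made for illustration.

```python
def expected_score(rating, opponent_rating, scale=400.0):
    """Expected result (between 0 and 1) for `rating` against `opponent_rating`.

    Uses the usual logistic ELO curve; the 400-point scale is an assumption.
    """
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / scale))


def update_for_week(rating, games, k):
    """Return a single new rating from all of a team's games in one week.

    `games` is a list of (opponent_rating, result) pairs, with result coded
    from this team's point of view: 1 win, 0.5 tie, 0 loss. Every game is
    scored against the ratings held at the start of the week.
    """
    surprise = sum(result - expected_score(rating, opp) for opp, result in games)
    return rating + k * surprise


if __name__ == "__main__":
    # A 100-point favourite has an expected score of about 0.64.
    print(round(expected_score(2100, 2000), 2))
    # The same result moves the rating further as k grows.
    for k in (20, 30, 40):
        print(k, update_for_week(2000, [(2000, 1.0)], k))
```

With two evenly rated teams, a win is worth k/2 points, so doubling k doubles how much weight the most recent result carries.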

Here are the final ratings after week 20 for three different constants:

 

k = 20

## 
## Elo Ratings For 20 Players Playing 201 Games
## 
##                    Player Rating Games Win Draw Loss Lag
## 1      Los Angeles Galaxy   2114    22   9    7    6   0
## 2    Sporting Kansas City   2082    18   9    6    3   0
## 3               FC Dallas   2063    20  10    5    5   0
## 4               DC United   2052    22  10    5    7   0
## 5        Seattle Sounders   2038    21  10    2    9   0
## 6           Columbus Crew   2032    21   8    6    7   0
## 7      New York Red Bulls   2030    19   8    5    6   0
## 8          Real Salt Lake   2026    21   6    8    7   0
## 9        Portland Timbers   2021    21   9    5    7   0
## 10    Vancouver Whitecaps   2021    21  10    3    8   0
## 11 New England Revolution   1991    22   7    6    9   0
## 12             Toronto FC   1978    18   8    3    7   0
## 13     Philadelphia Union   1969    21   6    4   11   0
## 14   San Jose Earthquakes   1953    19   7    4    8   0
## 15        Colorado Rapids   1953    20   5    9    6   0
## 16           Orlando City   1939    20   6    6    8   0
## 17         Houston Dynamo   1928    20   6    6    8   0
## 18           Chicago Fire   1921    19   5    3   11   0
## 19        Montreal Impact   1918    17   6    3    8   0
## 20          New York City   1906    20   5    6    9   0

 

k = 30

## 
## Elo Ratings For 20 Players Playing 201 Games
## 
##                    Player Rating Games Win Draw Loss Lag
## 1    Sporting Kansas City   2104    18   9    6    3   0
## 2      Los Angeles Galaxy   2103    22   9    7    6   0
## 3               FC Dallas   2071    20  10    5    5   0
## 4               DC United   2039    22  10    5    7   0
## 5      New York Red Bulls   2030    19   8    5    6   0
## 6        Portland Timbers   2030    21   9    5    7   0
## 7           Columbus Crew   2027    21   8    6    7   0
## 8     Vancouver Whitecaps   2021    21  10    3    8   0
## 9        Seattle Sounders   2021    21  10    2    9   0
## 10         Real Salt Lake   2009    21   6    8    7   0
## 11             Toronto FC   1989    18   8    3    7   0
## 12 New England Revolution   1975    22   7    6    9   0
## 13        Colorado Rapids   1973    20   5    9    6   0
## 14     Philadelphia Union   1962    21   6    4   11   0
## 15   San Jose Earthquakes   1961    19   7    4    8   0
## 16           Orlando City   1944    20   6    6    8   0
## 17         Houston Dynamo   1932    20   6    6    8   0
## 18        Montreal Impact   1924    17   6    3    8   0
## 19          New York City   1909    20   5    6    9   0
## 20           Chicago Fire   1909    19   5    3   11   0

 

k = 40

## 
## Elo Ratings For 20 Players Playing 201 Games
## 
##                    Player Rating Games Win Draw Loss Lag
## 1    Sporting Kansas City   2124    18   9    6    3   0
## 2      Los Angeles Galaxy   2097    22   9    7    6   0
## 3               FC Dallas   2078    20  10    5    5   0
## 4        Portland Timbers   2038    21   9    5    7   0
## 5      New York Red Bulls   2029    19   8    5    6   0
## 6               DC United   2027    22  10    5    7   0
## 7           Columbus Crew   2024    21   8    6    7   0
## 8     Vancouver Whitecaps   2020    21  10    3    8   0
## 9        Seattle Sounders   2004    21  10    2    9   0
## 10             Toronto FC   2000    18   8    3    7   0
## 11         Real Salt Lake   1997    21   6    8    7   0
## 12        Colorado Rapids   1991    20   5    9    6   0
## 13   San Jose Earthquakes   1966    19   7    4    8   0
## 14     Philadelphia Union   1959    21   6    4   11   0
## 15 New England Revolution   1959    22   7    6    9   0
## 16           Orlando City   1947    20   6    6    8   0
## 17         Houston Dynamo   1933    20   6    6    8   0
## 18        Montreal Impact   1930    17   6    3    8   0
## 19          New York City   1913    20   5    6    9   0
## 20           Chicago Fire   1898    19   5    3   11   0
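
One quick way to see how much the ordering depends on k is to read the row positions off the three tables and measure how far each team moves. The Python sketch below does this for a handful of teams. The positions are copied from the printed tables (so teams with the same rounded rating keep the order shown), and the helper names are my own.

```python
# Row positions read directly from the k = 20, k = 30 and k = 40 tables.
positions = {
    "Sporting Kansas City": (2, 1, 1),
    "Los Angeles Galaxy": (1, 2, 2),
    "DC United": (4, 4, 6),
    "Seattle Sounders": (5, 9, 9),
    "Portland Timbers": (9, 6, 4),
    "Real Salt Lake": (8, 10, 11),
    "New England Revolution": (11, 12, 15),
    "Colorado Rapids": (15, 13, 12),
}


def rank_spread(ranks):
    """Difference between a team's best and worst position across constants."""
    return max(ranks) - min(ranks)


def by_spread(table):
    """Team names ordered from most to least affected by the choice of k."""
    return sorted(table, key=lambda team: rank_spread(table[team]), reverse=True)


if __name__ == "__main__":
    for team in by_spread(positions):
        print(f"{team:<24} {positions[team]}  spread = {rank_spread(positions[team])}")
```

Portland Timbers move five places between k = 20 and k = 40, while the top two only swap with each other.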

 

Overall these are pretty similar, though notably the top teams and bottom teams do shuffle around. The plot below shows the rank order changes over a range of constants:

 

Glicko

An improvement to the ELO ratings system (in my view) is the Glicko ratings system developed by Mark Glickman - see link below. Its major advantage is a ‘deviation’ parameter, which provides a measure of uncertainty about the rating.

For the purpose of this illustration, I’ve initially assigned a deviation of 100 ratings points at week 1. This means, for instance, that LA Galaxy’s initial rating of 2160 is better read as likely lying in the range 2060-2260 (one deviation either side). As we gain more information from more results, the deviation (our uncertainty in the rating) gets smaller.
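
For anyone who wants to see why the deviation shrinks, here is a minimal Python sketch of a single Glicko (version 1) rating-period update, following the formulas in Glickman’s description of the system. It is a sketch only: the function names are mine, the opponent in the example is hypothetical, and the package used for the tables in this post may differ in details such as how the deviation is inflated before each period.

```python
import math

Q = math.log(10) / 400.0


def g(rd):
    """Weight that shrinks a game's influence as the opponent's deviation grows."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def expected(rating, opp_rating, opp_rd):
    """Expected result against an opponent, allowing for their uncertainty."""
    return 1.0 / (1.0 + 10.0 ** (-g(opp_rd) * (rating - opp_rating) / 400.0))


def glicko_update(rating, rd, games):
    """Return (new_rating, new_rd) after one rating period.

    `games` is a list of (opp_rating, opp_rd, result) with result coded
    1 win, 0.5 tie, 0 loss from the rated team's point of view.
    """
    information = 0.0  # accumulates 1 / d^2
    surprise = 0.0     # accumulates g * (result - expected)
    for opp_rating, opp_rd, result in games:
        e = expected(rating, opp_rating, opp_rd)
        information += Q**2 * g(opp_rd) ** 2 * e * (1.0 - e)
        surprise += g(opp_rd) * (result - e)
    precision = 1.0 / rd**2 + information
    return rating + (Q / precision) * surprise, math.sqrt(1.0 / precision)


if __name__ == "__main__":
    # LA Galaxy's starting values, against a hypothetical 2000-rated opponent.
    print(glicko_update(2160, 100, [(2000, 100, 1.0)]))
```

Every game adds to the precision term, so the new deviation is always below the old one within a rating period, whatever the result.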

There is also a constant value associated with the Glicko ratings system - typically constant values are smaller than those used with the ELO system. For this illustration, I shall just use one constant value (10).
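
Assuming the constant here plays the role of Glickman’s c, it controls how quickly the deviation grows back as time passes without new information. A minimal Python sketch of that step is below. The function name, the cap of 350 and the treatment of time as whole rating periods are assumptions taken from Glickman’s description, not from the code behind this post.

```python
import math


def inflate_deviation(rd, c, periods=1, max_rd=350.0):
    """Deviation after `periods` rating periods have elapsed.

    Uncertainty grows with the constant c and is capped at max_rd,
    the deviation usually given to an unrated team.
    """
    return min(math.sqrt(rd**2 + c**2 * periods), max_rd)


if __name__ == "__main__":
    # A deviation of 70 barely moves after one week with c = 10 ...
    print(round(inflate_deviation(70, 10), 2))
    # ... but a much larger c, e.g. at a season change, widens it noticeably.
    print(round(inflate_deviation(70, 60), 2))
```

A larger value could be applied once at the change of season to reflect roster turnover, while keeping the ratings themselves.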

## 
## Glicko Ratings For 20 Players Playing 201 Games
## 
##                    Player Rating Deviation Games Win Draw Loss Lag
## 1    Sporting Kansas City   2110     73.11    18   9    6    3   0
## 2      Los Angeles Galaxy   2094     69.89    22   9    7    6   0
## 3               FC Dallas   2074     70.99    20  10    5    5   0
## 4               DC United   2047     69.80    22  10    5    7   0
## 5      New York Red Bulls   2035     71.40    19   8    5    6   0
## 6        Portland Timbers   2032     69.58    21   9    5    7   0
## 7        Seattle Sounders   2030     70.04    21  10    2    9   0
## 8     Vancouver Whitecaps   2027     69.80    21  10    3    8   0
## 9           Columbus Crew   2021     69.18    21   8    6    7   0
## 10         Real Salt Lake   2005     69.70    21   6    8    7   0
## 11             Toronto FC   1990     72.18    18   8    3    7   0
## 12 New England Revolution   1979     68.41    22   7    6    9   0
## 13        Colorado Rapids   1977     71.29    20   5    9    6   0
## 14   San Jose Earthquakes   1971     72.07    19   7    4    8   0
## 15     Philadelphia Union   1951     70.75    21   6    4   11   0
## 16           Orlando City   1948     71.14    20   6    6    8   0
## 17         Houston Dynamo   1937     71.21    20   6    6    8   0
## 18        Montreal Impact   1928     73.47    17   6    3    8   0
## 19          New York City   1904     71.23    20   5    6    9   0
## 20           Chicago Fire   1902     71.77    19   5    3   11   0

 

Here you can see that we have a deviation measure of uncertainty about each final rating. We can plot that like this:

 

 

Interim Final thoughts

There are many other things that can be looked at using ELO/Glicko type models. A major one is how to differentially address the value of home wins versus away wins (or for that matter home ties versus away ties). I can look at that in more detail in a future post. I hope this shows that the choice of the constant in either the ELO or Glicko rating system is not trivial. Depending on what values are chosen, the ratings output can address different questions (i.e. which teams are ‘hot’ versus which teams are consistently strong over long periods).

Further, I hope that it’s evident that ELO shouldn’t be seen as a way of perfectly ranking teams as 1st, 2nd, 3rd, etc. Rather, by using Glicko, we can get some sense of how confident we can be that two teams really differ in their rating. In sports like soccer, and especially in MLS, it’s typical for teams to be very similar to one another in ratings.
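
As a rough illustration of that last point, the Python sketch below builds an interval around a few of the final Glicko ratings from the table above and checks whether the intervals overlap. The ratings and deviations are copied from that table. The helper names and the use of 1.96 deviations for an approximate 95% interval are my choices, and overlapping intervals are only an informal guide, not a formal test.

```python
# (rating, deviation) pairs copied from the Glicko table above.
final = {
    "Sporting Kansas City": (2110, 73.11),
    "Los Angeles Galaxy": (2094, 69.89),
    "Chicago Fire": (1902, 71.77),
}


def interval(rating, deviation, z=1.96):
    """Lower and upper bounds of rating +/- z deviations."""
    half_width = z * deviation
    return rating - half_width, rating + half_width


def overlaps(a, b):
    """True if two (low, high) intervals share any points."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    skc = interval(*final["Sporting Kansas City"])
    lag = interval(*final["Los Angeles Galaxy"])
    chi = interval(*final["Chicago Fire"])
    print("SKC vs LA Galaxy overlap:", overlaps(skc, lag))
    print("SKC vs Chicago Fire overlap:", overlaps(skc, chi))
```

With 1.96 deviations either side, even the top and bottom teams in the table have overlapping intervals. It takes a narrower band of one deviation (z = 1) to separate them.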