The Dixon-Coles model relies on a time-dependent weight function that makes the most recent games count for more than games in the more distant past when determining a team’s strength. The properties of this weight function have a surprising impact on the predicted chances of a home team winning their next game.
Before examining the changes brought by different weight functions, we’ll look at the functions themselves.
The original model by Dixon and Coles uses an exponential decay function, which produces a weight curve that looks like this over the past three years:
Here, more recent games (on dates furthest right) have higher weight than more distant games (further left). The empty spaces in this chart correspond to the offseason dates where no hockey was played.
You can change the steepness of the slope by modifying the xi value used in the function:
The best results from this method came with xi = 0.00426; this value has remained essentially stable for the past 10 years.
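For reference, the exponential weighting can be sketched in a few lines of R. This is a minimal sketch rather than the model’s actual code: the function name `dc_weight`, the argument `days_ago`, and the assumption that time is measured in days before the prediction date are all illustrative.

```r
# Hypothetical helper: exponential time-decay weight for a past game.
# Assumes `days_ago` is the number of days between the game and the
# date being predicted (the time unit is an assumption, not from the post).
dc_weight <- function(days_ago, xi = 0.00426) {
  exp(-xi * days_ago)
}

# Weights for games played today, ~6 months, 1 year and 3 years ago
days_ago <- c(0, 182, 365, 1095)
weights <- dc_weight(days_ago)
print(round(weights, 3))

# Days until a game's weight falls to half that of a game played today
half_life <- log(2) / 0.00426
print(round(half_life, 1))
```

If time really is measured in days, xi = 0.00426 corresponds to a half-life of roughly 163 days, so a game from about five and a half months ago counts half as much as one played today. A larger xi shortens that half-life and steepens the curve.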
Experience with the model showed an opportunity to reduce its habit of consistently over- or under-rating certain teams’ chances in their games. For example, it was noted in 2022-2023 that New Jersey was always under-supported, and this didn’t improve as the season went on.
A new weighting design was conceived with a more logistic curve shape.
We can also tune the cutoff point by changing the upsilon value:
While these values don’t give the curve shape that was originally envisioned, the best result comes from xi = 0.003 and upsilon = 150.
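The exact formula for the new weight isn’t given here, so the following is only a guess at its general shape: a logistic curve whose steepness is set by xi and whose midpoint (the cutoff) is set by upsilon, both taken to be in days. The function name `logistic_weight` and this particular parameterization are assumptions for illustration and may not match the model’s implementation.

```r
# Hypothetical logistic time weight (the post does not give the formula).
# `xi` controls the steepness and `upsilon` the midpoint, assumed in days.
logistic_weight <- function(days_ago, xi = 0.003, upsilon = 150) {
  1 / (1 + exp(xi * (days_ago - upsilon)))
}

# Weights for games played today, at the cutoff, 1 year and 3 years ago
days_ago <- c(0, 150, 365, 1095)
weights <- logistic_weight(days_ago)
print(round(weights, 3))
```

In this parameterization the weight is exactly 0.5 when `days_ago` equals upsilon, and a larger xi gives a sharper drop around that cutoff.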
We can compare the game results predicted by models using the two types of weights.
The total accuracy of the original method for 2022-2023 was 60.07% correct, with a Log Loss of 0.6669. The total accuracy of the new method for the same period was 60.57% correct, with a Log Loss of 0.6684. This is a slight improvement in accuracy but a slightly worse (higher) Log Loss; lower is better for that metric.
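As a reminder of what these two metrics measure, here is a small sketch of how they can be computed for home-win probabilities. The function names and the toy data are illustrative only, and the sketch assumes results are coded strictly as 1 (home win) or 0 (home loss), which may differ from how `Result` is coded in the model.

```r
# Accuracy: share of games where the side favoured by the model (p > 0.5) won.
accuracy <- function(p_home, home_win) {
  mean((p_home > 0.5) == (home_win == 1))
}

# Binary log loss; probabilities are clipped to avoid log(0).
log_loss <- function(p_home, home_win, eps = 1e-15) {
  p <- pmin(pmax(p_home, eps), 1 - eps)
  -mean(home_win * log(p) + (1 - home_win) * log(1 - p))
}

# Toy data for illustration only (not results from the model)
p_home   <- c(0.60, 0.40, 0.70, 0.55)
home_win <- c(1, 0, 0, 1)

acc <- accuracy(p_home, home_win)
ll <- log_loss(p_home, home_win)
baseline <- log_loss(rep(0.5, 4), home_win)  # coin-flip forecast
print(c(accuracy = acc, log_loss = ll, baseline = baseline))
```

A forecast of 0.5 for every game scores log(2), about 0.693, so both Log Loss values reported above beat a coin flip, though by a small margin.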
If we plot the old and new home win odds against each other, we see there’s certainly a correlation but also wide variation in the predicted results:
There are clearly some results that are very different from each other, especially near the critical 0.5 value!
Let’s see if this reduces the habitual over- or under-prediction of particular teams.
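One way to put numbers on that scatter is to count the games where the two models pick different winners, that is, where the predictions fall on opposite sides of 0.5. The sketch below uses a made-up stand-in for the `res` data frame; only the column names `OldHomeWin` and `NewHomeWin` come from the code in this post.

```r
# Toy stand-in for the `res` data frame (values are made up)
res_toy <- data.frame(
  GameID     = 1:5,
  OldHomeWin = c(0.62, 0.48, 0.55, 0.51, 0.40),
  NewHomeWin = c(0.58, 0.53, 0.49, 0.56, 0.35)
)

# TRUE where the two models favour different teams
flipped <- (res_toy$OldHomeWin > 0.5) != (res_toy$NewHomeWin > 0.5)

n_flipped <- sum(flipped)
mean_abs_diff <- mean(abs(res_toy$OldHomeWin - res_toy$NewHomeWin))
correlation <- cor(res_toy$OldHomeWin, res_toy$NewHomeWin)

# Games where the predicted favourite changed
print(res_toy[flipped, ])
```

Games near 0.5 need only a small change in probability to flip the predicted winner, which is why differences in that region matter most for accuracy.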
library(dplyr)
# Attach the actual result to each predicted game
sc <- scores[, c("GameID", "Result")]
res <- unique(dplyr::left_join(res, sc, by = "GameID"))
# One row per team per game; away-team values are the complement of the home values
res2 <- data.frame(Team = c(res$HomeTeam, res$AwayTeam),
  OldWin = c(res$OldHomeWin, 1 - res$OldHomeWin),
  NewWin = c(res$NewHomeWin, 1 - res$NewHomeWin),
  Result = c(res$Result, 1 - res$Result))
# A prediction is correct when it falls on the same side of 0.5 as the result
res2 %>%
  mutate(CorrectNew = ifelse((.data$NewWin > 0.5 & .data$Result > 0.5) | (.data$NewWin < 0.5 & .data$Result < 0.5), 1, 0),
    CorrectOld = ifelse((.data$OldWin > 0.5 & .data$Result > 0.5) | (.data$OldWin < 0.5 & .data$Result < 0.5), 1, 0)) %>%
  group_by(Team) %>%
  summarise(OldCorrect = sum(CorrectOld),
    OldAccuracy = sum(CorrectOld) / n(),
    NewCorrect = sum(CorrectNew),
    NewAccuracy = sum(CorrectNew) / n(),
    TotalGames = n()) %>%
  knitr::kable()
Team | OldCorrect | OldAccuracy | NewCorrect | NewAccuracy | TotalGames |
---|---|---|---|---|---|
Anaheim Ducks | 60 | 0.7317073 | 60 | 0.7317073 | 82 |
Arizona Coyotes | 53 | 0.6463415 | 56 | 0.6829268 | 82 |
Boston Bruins | 65 | 0.7303371 | 67 | 0.7528090 | 89 |
Buffalo Sabres | 44 | 0.5365854 | 45 | 0.5487805 | 82 |
Calgary Flames | 40 | 0.4878049 | 43 | 0.5243902 | 82 |
Carolina Hurricanes | 63 | 0.6494845 | 64 | 0.6597938 | 97 |
Chicago Blackhawks | 56 | 0.6829268 | 57 | 0.6951220 | 82 |
Colorado Avalanche | 54 | 0.6067416 | 54 | 0.6067416 | 89 |
Columbus Blue Jackets | 54 | 0.6585366 | 55 | 0.6707317 | 82 |
Dallas Stars | 58 | 0.5742574 | 62 | 0.6138614 | 101 |
Detroit Red Wings | 47 | 0.5731707 | 46 | 0.5609756 | 82 |
Edmonton Oilers | 51 | 0.5425532 | 52 | 0.5531915 | 94 |
Florida Panthers | 62 | 0.6019417 | 59 | 0.5728155 | 103 |
Los Angeles Kings | 54 | 0.6136364 | 49 | 0.5568182 | 88 |
Minnesota Wild | 50 | 0.5681818 | 53 | 0.6022727 | 88 |
Montreal Canadiens | 51 | 0.6219512 | 47 | 0.5731707 | 82 |
Nashville Predators | 43 | 0.5243902 | 46 | 0.5609756 | 82 |
New Jersey Devils | 54 | 0.5744681 | 54 | 0.5744681 | 94 |
New York Islanders | 49 | 0.5568182 | 52 | 0.5909091 | 88 |
New York Rangers | 49 | 0.5505618 | 52 | 0.5842697 | 89 |
Ottawa Senators | 49 | 0.5975610 | 51 | 0.6219512 | 82 |
Philadelphia Flyers | 57 | 0.6951220 | 56 | 0.6829268 | 82 |
Pittsburgh Penguins | 44 | 0.5365854 | 45 | 0.5487805 | 82 |
San Jose Sharks | 53 | 0.6463415 | 53 | 0.6463415 | 82 |
Seattle Kraken | 60 | 0.6250000 | 60 | 0.6250000 | 96 |
St. Louis Blues | 45 | 0.5487805 | 47 | 0.5731707 | 82 |
Tampa Bay Lightning | 52 | 0.5909091 | 51 | 0.5795455 | 88 |
Toronto Maple Leafs | 53 | 0.5698925 | 50 | 0.5376344 | 93 |
Vancouver Canucks | 53 | 0.6463415 | 55 | 0.6707317 | 82 |
Vegas Golden Knights | 61 | 0.5865385 | 62 | 0.5961538 | 104 |
Washington Capitals | 48 | 0.5853659 | 44 | 0.5365854 | 82 |
Winnipeg Jets | 50 | 0.5747126 | 49 | 0.5632184 | 87 |
Note this includes both regular season and playoff games.
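Per-team accuracy shows how often each model picks the right winner, but it doesn’t directly show whether a team is over- or under-rated. One possible complement, sketched here on made-up data, is to compare each team’s average predicted win probability with its actual win rate. The toy data frame mimics the columns of `res2`; the `OldBias` and `NewBias` columns are hypothetical names, and `Result` is assumed to be coded as 1 for a win and 0 for a loss.

```r
# Toy stand-in for `res2` (made-up values, Result coded 1 = win, 0 = loss)
res2_toy <- data.frame(
  Team   = c("A", "A", "A", "B", "B", "B"),
  OldWin = c(0.40, 0.45, 0.50, 0.60, 0.65, 0.55),
  NewWin = c(0.55, 0.60, 0.50, 0.60, 0.60, 0.50),
  Result = c(1, 1, 0, 1, 0, 0)
)

# Mean predicted win probability and actual win rate per team
bias <- aggregate(cbind(OldWin, NewWin, Result) ~ Team,
                  data = res2_toy, FUN = mean)

# Positive = team over-predicted, negative = team under-predicted
bias$OldBias <- bias$OldWin - bias$Result
bias$NewBias <- bias$NewWin - bias$Result
print(bias)
```

A team whose bias stays strongly negative under both weightings would be consistently under-supported in the sense described earlier, even if its accuracy figure looks unremarkable.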
As for betting results, odds have been sent directly for evaluation.