Introduction

The Dixon-Coles model relies on a time-dependent weight function, so that the most recent games count for more than games in the more distant past when determining a team’s strength. The properties of this weight function have a surprising impact on the predicted chances of a home team winning their next game.

Weighting Functions

Before examining the changes brought by different weight functions, we’ll look at the functions themselves.

Original Dixon-Coles Weights

The original model by Dixon and Coles uses an exponential decay function, which produces a weight curve that looks like this over the past three years:

Here, more recent games (on dates furthest right) have higher weight than more distant games (further left). The empty spaces in this chart correspond to the offseason dates where no hockey was played.

You can change the steepness of the slope by modifying the $\xi$ value used in the function:

The best results from this method came with $\xi$ = 0.00426, a value that has remained essentially stable for the past 10 years.
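As a rough illustration of the exponential decay described above, here is a minimal sketch in base R. It assumes game age is measured in days, which the post does not state, and the function name `dc_weights` and its arguments are hypothetical rather than taken from the model's code.

```r
# Exponential time-decay weights, in the style of Dixon and Coles.
# Assumption: game age is measured in days (the unit is not stated in the post).
# `dc_weights` and its argument names are hypothetical, for illustration only.
dc_weights <- function(game_dates, current_date, xi = 0.00426) {
  age_days <- as.numeric(as.Date(current_date) - as.Date(game_dates))
  w <- exp(-xi * age_days)
  w[age_days < 0] <- 0  # games after current_date get no weight
  w
}

game_dates <- as.Date(c("2023-09-26", "2023-04-14", "2022-10-07"))
weights <- dc_weights(game_dates, current_date = "2023-09-26")

# Age at which a game's weight falls to one half, in the same unit as age_days
half_life <- log(2) / 0.00426
```

Under the days assumption, a larger xi shortens the half-life and makes the curve steeper, while a smaller xi lets older seasons keep more influence.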

New Logistic Weights

Working with the model showed an opportunity to improve on its persistent over- or under-trusting of certain teams’ chances in their games. For example, in 2022-2023 New Jersey was consistently under-supported, and this didn’t improve as the season went on.

A new weighting design was conceived with a more logistic curve shape.

We can also tune the cutoff point by changing the upsilon value:

While these values don’t give the curve shape that was originally conceived, the best result comes from $\xi$ = 0.003 and upsilon = 150.
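The post does not give the formula for the new weights, so the following is only a generic sketch of what a logistic-shaped decay can look like. The parameters `midpoint_days` and `steepness` are hypothetical; how xi and upsilon map onto them in the actual model is not stated, so the tuned values above are deliberately not plugged in here.

```r
# One possible logistic-shaped weight curve. This is NOT the post's formula,
# which is not given; `midpoint_days` and `steepness` are hypothetical names.
logistic_weights <- function(age_days, midpoint_days, steepness) {
  # Close to 1 for recent games, exactly 0.5 at the midpoint,
  # and approaching 0 for old games
  1 / (1 + exp(steepness * (age_days - midpoint_days)))
}

age_days <- c(0, 100, 200, 300, 400)
w <- logistic_weights(age_days, midpoint_days = 200, steepness = 0.05)
```

Compared with exponential decay, a curve like this keeps recent games near full weight and then drops off around a cutoff, which is the kind of behaviour a tunable cutoff point suggests.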

Compare Game Results

We can compare the game results predicted by a model using each of the two types of weights.

The original method was 60.07 % accurate for 2022-2023, with a Log Loss of 0.6669. The new method was 60.57 % accurate over the same period, with a Log Loss of 0.6684. This is a slight improvement in accuracy but a slightly worse Log Loss (the value rose, and lower is better for that metric).
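For readers who want to reproduce these two metrics, here is a small base R sketch. It assumes `result` is coded 1 for a home win and 0 otherwise, uses invented toy values, and the function names are illustrative rather than the model's own.

```r
# Accuracy and log loss for home-win probabilities against binary outcomes.
# Assumption: `result` is 1 for a home win and 0 otherwise; toy data only.
accuracy <- function(p_home, result) {
  # A probability of exactly 0.5 is treated here as predicting the away team
  mean((p_home > 0.5) == (result > 0.5))
}

log_loss <- function(p_home, result, eps = 1e-15) {
  p <- pmin(pmax(p_home, eps), 1 - eps)  # clip to avoid log(0)
  -mean(result * log(p) + (1 - result) * log(1 - p))
}

p_home <- c(0.62, 0.48, 0.55, 0.71)
result <- c(1, 0, 0, 1)

acc <- accuracy(p_home, result)
ll <- log_loss(p_home, result)
```

As a point of reference, always predicting 0.5 gives a log loss of log(2), about 0.693, so both values reported above sit only modestly below a coin flip.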

If we plot the old and new home win odds against each other, we see there’s certainly correlation but a wide variation in the predicted results:

There are clearly some results that are very different from each other, especially near the critical 0.5 value!

Let’s see if this reduces the habitual over- or under-prediction of particular teams.
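One way to quantify those differences is to count the games where the two models land on opposite sides of 0.5. The sketch below uses the `OldHomeWin` and `NewHomeWin` column names from the code in this post, but the data frame `res_toy` and its values are invented for illustration.

```r
# Toy stand-in for the post's `res` data frame; all values are invented.
res_toy <- data.frame(
  GameID = 1:5,
  OldHomeWin = c(0.52, 0.47, 0.61, 0.49, 0.55),
  NewHomeWin = c(0.48, 0.46, 0.66, 0.53, 0.58)
)

# TRUE where the two models pick different winners (opposite sides of 0.5)
res_toy$Flipped <- (res_toy$OldHomeWin > 0.5) != (res_toy$NewHomeWin > 0.5)
# Size of the shift in predicted home win probability
res_toy$Shift <- abs(res_toy$NewHomeWin - res_toy$OldHomeWin)

flipped_games <- res_toy[res_toy$Flipped, ]
```

Only flipped games can change the accuracy figure, whereas every shift, flipped or not, feeds into the log loss.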

library(dplyr)

# Attach the actual result of each game to the predictions
sc <- scores[, c("GameID", "Result")]
res <- unique(dplyr::left_join(res, sc, by = "GameID"))

# One row per team per game: the away side gets the complementary
# win probability and result
res2 <- data.frame(Team = c(res$HomeTeam, res$AwayTeam),
                   OldWin = c(res$OldHomeWin, 1 - res$OldHomeWin),
                   NewWin = c(res$NewHomeWin, 1 - res$NewHomeWin),
                   Result = c(res$Result, 1 - res$Result))

# A prediction is correct when it falls on the same side of 0.5 as the result
res2 %>%
  mutate(CorrectNew = ifelse((.data$NewWin > 0.5 & .data$Result > 0.5) | (.data$NewWin < 0.5 & .data$Result < 0.5), 1, 0),
         CorrectOld = ifelse((.data$OldWin > 0.5 & .data$Result > 0.5) | (.data$OldWin < 0.5 & .data$Result < 0.5), 1, 0)) %>%
  group_by(Team) %>%
  summarise(OldCorrect = sum(CorrectOld),
            OldAccuracy = sum(CorrectOld) / n(),
            NewCorrect = sum(CorrectNew),
            NewAccuracy = sum(CorrectNew) / n(),
            TotalGames = n()) %>%
  knitr::kable()

Team	OldCorrect	OldAccuracy	NewCorrect	NewAccuracy	TotalGames
Anaheim Ducks	60	0.7317073	60	0.7317073	82
Arizona Coyotes	53	0.6463415	56	0.6829268	82
Boston Bruins	65	0.7303371	67	0.7528090	89
Buffalo Sabres	44	0.5365854	45	0.5487805	82
Calgary Flames	40	0.4878049	43	0.5243902	82
Carolina Hurricanes	63	0.6494845	64	0.6597938	97
Chicago Blackhawks	56	0.6829268	57	0.6951220	82
Colorado Avalanche	54	0.6067416	54	0.6067416	89
Columbus Blue Jackets	54	0.6585366	55	0.6707317	82
Dallas Stars	58	0.5742574	62	0.6138614	101
Detroit Red Wings	47	0.5731707	46	0.5609756	82
Edmonton Oilers	51	0.5425532	52	0.5531915	94
Florida Panthers	62	0.6019417	59	0.5728155	103
Los Angeles Kings	54	0.6136364	49	0.5568182	88
Minnesota Wild	50	0.5681818	53	0.6022727	88
Montreal Canadiens	51	0.6219512	47	0.5731707	82
Nashville Predators	43	0.5243902	46	0.5609756	82
New Jersey Devils	54	0.5744681	54	0.5744681	94
New York Islanders	49	0.5568182	52	0.5909091	88
New York Rangers	49	0.5505618	52	0.5842697	89
Ottawa Senators	49	0.5975610	51	0.6219512	82
Philadelphia Flyers	57	0.6951220	56	0.6829268	82
Pittsburgh Penguins	44	0.5365854	45	0.5487805	82
San Jose Sharks	53	0.6463415	53	0.6463415	82
Seattle Kraken	60	0.6250000	60	0.6250000	96
St. Louis Blues	45	0.5487805	47	0.5731707	82
Tampa Bay Lightning	52	0.5909091	51	0.5795455	88
Toronto Maple Leafs	53	0.5698925	50	0.5376344	93
Vancouver Canucks	53	0.6463415	55	0.6707317	82
Vegas Golden Knights	61	0.5865385	62	0.5961538	104
Washington Capitals	48	0.5853659	44	0.5365854	82
Winnipeg Jets	50	0.5747126	49	0.5632184	87

Note this includes both regular season and playoff games.

As for betting results, odds have been sent directly for evaluation.

Comparison of Historical Weight Function Results

Phil Bulsink

2023-09-26
