Introduction

The Dixon-Coles model relies on a time-dependent weight function, so that the most recent games count for more than games in the more distant past when determining a team’s strength. The properties of this weight function have a surprising impact on the predicted chances of a home team winning their next game.

Weighting Functions

Before examining the changes brought by different weight functions, we’ll look at the functions themselves.

Original Dixon-Coles Weights

The original model by Dixon and Coles uses an exponential decay function, which produces a weight curve that looks like this over the past three years:

Here, more recent games (on dates furthest right) have higher weight than more distant games (further left). The empty spaces in this chart correspond to the offseason dates where no hockey was played.

You can change the steepness of the slope by modifying the xi value used in the function:

The best results from this method came with xi = 0.00426. This value has remained essentially stable for the past 10 years.
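As a rough illustration of the exponential weighting described above, here is a minimal sketch in R. The function name `dc_weight_exp` and the choice to measure a game's age in days are assumptions made for this example; the post does not state the time unit that xi applies to.

```r
# Sketch of an exponential time-decay weight: w = exp(-xi * age).
# Assumptions: age is measured in days before the prediction date, and
# dc_weight_exp is a hypothetical name, not a function from the model's code.
dc_weight_exp <- function(game_dates, current_date, xi = 0.00426) {
  days_ago <- as.numeric(difftime(as.Date(current_date), as.Date(game_dates),
                                  units = "days"))
  w <- exp(-xi * days_ago)
  w[days_ago < 0] <- 0  # games after the prediction date get no weight
  w
}

# Games played today, one year ago, and two years ago
dates <- as.Date(c("2023-04-01", "2022-04-01", "2021-04-01"))
w_exp <- dc_weight_exp(dates, current_date = "2023-04-01")
round(w_exp, 3)
```

Under the days assumption, xi = 0.00426 corresponds to a half-life of log(2) / 0.00426, or roughly 163 days, so a game from one year ago carries a little over a fifth of the weight of a game played today. A larger xi shortens the half-life and steepens the curve.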

New Logistic Weights

Experience with the model suggested an opportunity to correct its tendency to persistently over- or under-rate certain teams’ chances in their games. For example, during the 2022-2023 season New Jersey was consistently underrated, and this didn’t improve as the season went on.

A new weighting scheme was conceived with a more logistic curve shape.

We can also tune the cutoff point by changing the upsilon value:

While the resulting curve isn’t the shape that was originally envisioned, the best results come from xi = 0.003 and upsilon = 150.
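The post does not give the exact formula for the logistic weights, so the following is only a guess at one plausible form: a logistic curve in the game's age, with xi controlling the steepness and upsilon the midpoint where the weight equals 0.5. The function name, the parameterisation, and the use of days are all assumptions.

```r
# Hypothetical logistic weight; NOT confirmed as the author's formula.
# Assumed form: w = 1 / (1 + exp(xi * (age - upsilon))), with age in days.
# xi sets how sharply the weight falls off, upsilon the age at which w = 0.5.
dc_weight_logistic <- function(days_ago, xi = 0.003, upsilon = 150) {
  w <- 1 / (1 + exp(xi * (days_ago - upsilon)))
  w[days_ago < 0] <- 0  # games after the prediction date get no weight
  w
}

days_ago <- c(0, 150, 365, 730)
w_logistic <- dc_weight_logistic(days_ago)
round(w_logistic, 3)
```

With these assumed definitions, xi = 0.003 gives a gentle slope rather than a sharp cutoff around upsilon, which would be consistent with the remark that the tuned curve is not the shape originally intended. A much larger xi would produce something closer to a step at upsilon days.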

Compare Game Results

We can compare the game results predicted by models using each of the two types of weights.

The total accuracy of the original method for 2022-2023 was 60.07 % correct, with a Log Loss of 0.6669. The total accuracy of the new method over the same period was 60.57 % correct, with a Log Loss of 0.6684. This is a slight improvement in accuracy but a slightly worse (higher) Log Loss, since lower is better for that metric.
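For readers unfamiliar with the two metrics, here is a small sketch of how accuracy and Log Loss can be computed from predicted home win probabilities. The numbers are made up for illustration, and the example assumes a binary outcome (1 for a home win, 0 otherwise), which may differ from how `Result` is encoded in the actual data.

```r
# Made-up probabilities and binary outcomes, for illustration only.
p_home   <- c(0.62, 0.48, 0.55, 0.71)  # predicted home win probability
home_win <- c(1, 0, 0, 1)              # assumed encoding: 1 = home win

# Accuracy: share of games where the favoured side actually won
accuracy <- mean((p_home > 0.5) == (home_win == 1))

# Log Loss: mean negative log-likelihood; clip to avoid log(0)
eps <- 1e-15
p <- pmin(pmax(p_home, eps), 1 - eps)
log_loss <- -mean(home_win * log(p) + (1 - home_win) * log(1 - p))

accuracy
log_loss
```

As a point of reference, a model that always predicts 0.5 scores a Log Loss of log(2), about 0.6931, so both reported values are only modestly better than a coin flip. Accuracy only looks at which side of 0.5 a prediction falls on, while Log Loss also penalises overconfidence, which is how the two metrics can move in opposite directions.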

If we plot the old and new home win odds against each other, we see there’s certainly correlation but a wide variation in the predicted results:

There are clearly some results that are very different from each other, especially near the critical 0.5 value!
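One way to quantify that disagreement is to count the games where the two models land on opposite sides of 0.5. The sketch below uses a made-up data frame, `res_demo`, that borrows the `OldHomeWin` and `NewHomeWin` column names from the code in this post; the values are invented.

```r
# Invented predictions for four games; column names follow the post's `res`.
res_demo <- data.frame(
  OldHomeWin = c(0.52, 0.47, 0.61, 0.49),
  NewHomeWin = c(0.48, 0.45, 0.66, 0.53)
)

# TRUE where the two models pick different winners
flipped <- (res_demo$OldHomeWin > 0.5) != (res_demo$NewHomeWin > 0.5)
n_flipped <- sum(flipped)

# Largest absolute change in predicted home win probability
largest_gap <- max(abs(res_demo$NewHomeWin - res_demo$OldHomeWin))

n_flipped
largest_gap
```

Games that flip sides are the only ones that can change the accuracy figure, so this count gives a sense of how much room there is for the two weightings to differ on that metric.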

Let’s see if this improves the performance for habitually over- or under-predicting a team.

library(dplyr)  # provides %>%, left_join, mutate, group_by, summarise

sc <- scores[, c("GameID", "Result")]  # actual result for each game
res <- unique(dplyr::left_join(res, sc, by = "GameID"))
res2 <- data.frame(Team = c(res$HomeTeam, res$AwayTeam),  # one row per team per game
                   OldWin = c(res$OldHomeWin, 1 - res$OldHomeWin),
                   NewWin = c(res$NewHomeWin, 1 - res$NewHomeWin),
                   Result = c(res$Result, 1 - res$Result))
res2 %>%
  mutate("CorrectNew" = ifelse(((.data$NewWin > 0.5 & .data$Result > 0.5) | (.data$NewWin < 0.5 & .data$Result < 0.5)), 1, 0),  # 1 when prediction and result fall on the same side of 0.5
         "CorrectOld" = ifelse(((.data$OldWin > 0.5 & .data$Result > 0.5) | (.data$OldWin < 0.5 & .data$Result < 0.5)), 1, 0)) %>%
  group_by(Team) %>%
  summarise("OldCorrect" = sum(CorrectOld),
            "OldAccuracy" = sum(CorrectOld) / n(),
            "NewCorrect" = sum(CorrectNew),
            "NewAccuracy" = sum(CorrectNew) / n(),
            "TotalGames" = n()) %>%
  knitr::kable()

| Team | OldCorrect | OldAccuracy | NewCorrect | NewAccuracy | TotalGames |
|:---|---:|---:|---:|---:|---:|
| Anaheim Ducks | 60 | 0.7317073 | 60 | 0.7317073 | 82 |
| Arizona Coyotes | 53 | 0.6463415 | 56 | 0.6829268 | 82 |
| Boston Bruins | 65 | 0.7303371 | 67 | 0.7528090 | 89 |
| Buffalo Sabres | 44 | 0.5365854 | 45 | 0.5487805 | 82 |
| Calgary Flames | 40 | 0.4878049 | 43 | 0.5243902 | 82 |
| Carolina Hurricanes | 63 | 0.6494845 | 64 | 0.6597938 | 97 |
| Chicago Blackhawks | 56 | 0.6829268 | 57 | 0.6951220 | 82 |
| Colorado Avalanche | 54 | 0.6067416 | 54 | 0.6067416 | 89 |
| Columbus Blue Jackets | 54 | 0.6585366 | 55 | 0.6707317 | 82 |
| Dallas Stars | 58 | 0.5742574 | 62 | 0.6138614 | 101 |
| Detroit Red Wings | 47 | 0.5731707 | 46 | 0.5609756 | 82 |
| Edmonton Oilers | 51 | 0.5425532 | 52 | 0.5531915 | 94 |
| Florida Panthers | 62 | 0.6019417 | 59 | 0.5728155 | 103 |
| Los Angeles Kings | 54 | 0.6136364 | 49 | 0.5568182 | 88 |
| Minnesota Wild | 50 | 0.5681818 | 53 | 0.6022727 | 88 |
| Montreal Canadiens | 51 | 0.6219512 | 47 | 0.5731707 | 82 |
| Nashville Predators | 43 | 0.5243902 | 46 | 0.5609756 | 82 |
| New Jersey Devils | 54 | 0.5744681 | 54 | 0.5744681 | 94 |
| New York Islanders | 49 | 0.5568182 | 52 | 0.5909091 | 88 |
| New York Rangers | 49 | 0.5505618 | 52 | 0.5842697 | 89 |
| Ottawa Senators | 49 | 0.5975610 | 51 | 0.6219512 | 82 |
| Philadelphia Flyers | 57 | 0.6951220 | 56 | 0.6829268 | 82 |
| Pittsburgh Penguins | 44 | 0.5365854 | 45 | 0.5487805 | 82 |
| San Jose Sharks | 53 | 0.6463415 | 53 | 0.6463415 | 82 |
| Seattle Kraken | 60 | 0.6250000 | 60 | 0.6250000 | 96 |
| St. Louis Blues | 45 | 0.5487805 | 47 | 0.5731707 | 82 |
| Tampa Bay Lightning | 52 | 0.5909091 | 51 | 0.5795455 | 88 |
| Toronto Maple Leafs | 53 | 0.5698925 | 50 | 0.5376344 | 93 |
| Vancouver Canucks | 53 | 0.6463415 | 55 | 0.6707317 | 82 |
| Vegas Golden Knights | 61 | 0.5865385 | 62 | 0.5961538 | 104 |
| Washington Capitals | 48 | 0.5853659 | 44 | 0.5365854 | 82 |
| Winnipeg Jets | 50 | 0.5747126 | 49 | 0.5632184 | 87 |

Note that this includes both regular season and playoff games, which is why TotalGames exceeds 82 for teams that made the playoffs.

As for betting results, odds have been sent directly for evaluation.