goalie carey: A Statistical Tribute

Background

There is a strongly-held belief among Montreal Canadiens fans that the club’s starting goaltender Carey Price (aka goalie carey) plays exceptionally well on Saturday nights. I mean exceptionally well, even compared to his usual exceptionally-good play. See here a traditional reddit ritual the day after a Saturday night win:

While I have never doubted this, I was curious about how big the “Saturday night effect” really is. It’s surprisingly hard to find NHL statistics broken down by day of the week, so I did a little bit of scraping and analysis.

The Basics

Including playoffs, Price has played in 551 total NHL games. He was awarded a decision^1 in 538 of those games. His record is

285-198-55

To begin, let’s focus on just the regular season, as the playoffs are a different beast, where every night is Saturday night. Carey has played in 497 regular season games, with 488 decisions, for a record of

262-171-55

Without further ado, Carey Price’s regular season record on Saturday nights is

90-31-18

for a win percentage of 64.7%. Compare this to his regular season record on all of the other nights:

172-140-37

On the other nights of the week, Carey wins 49.3% of his games.
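The two percentages above are just wins divided by total decisions. As a quick check, here is a minimal Python sketch of that arithmetic (my own illustration, not the R code behind this post; the helper name `win_pct` is made up):

```python
# Records are (wins, regulation losses, overtime losses), copied from the text above.
saturday = (90, 31, 18)
other_nights = (172, 140, 37)

def win_pct(record):
    """Wins as a percentage of all decisions (W + L + OTL)."""
    wins, losses, ot_losses = record
    return 100 * wins / (wins + losses + ot_losses)

sat_pct = win_pct(saturday)
other_pct = win_pct(other_nights)
print(f"Saturday:     {sat_pct:.1f}%")
print(f"Other nights: {other_pct:.1f}%")
print(f"Difference:   {sat_pct - other_pct:.1f} points")
```

Note that overtime losses count in the denominator here, which is what reproduces the 64.7% and 49.3% figures.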

Some Pretty Pictures

Wow, that seems like a pretty big difference. Here it is graphically:

But just from those numbers we can’t tell the whole story: do other days stand out? Are Saturdays good or is some non-Saturday day just really bad? Here are Carey’s percentage wins, losses and overtime losses for each day of the week:

That’s pretty impressive! We see that only on Tuesdays, Thursdays and Saturdays does Carey get the win a majority of the time. However, Saturday really is a stand out day, with a 10-point higher win percentage than any other night. Interestingly, Friday is Carey’s worst night, and it is the only night with more regulation losses than wins (Sunday and Wednesday are exactly even).

There’s one problem: by looking at percentages only we may have a skewed idea of what is going on. This percentage view, while optimal for comparing days of the week, sells short how much Carey wins overall, because it does not take into account the fact that more games are played on certain days. So now let’s look at the raw count of each result on each day:

Aha! Price wins the biggest majority of his games on the days when the most games are played. Furthermore, his weakest day, Friday, is not that big of a game night. In fact, the three days we picked out as being Carey’s best are also the days with the most games:

Cool. So far everything we’ve looked at has been record-based, which means that it includes the play of the Canadiens as a whole. Is it really Carey that plays well on Saturdays, or are the Habs just better in general? Because the two are so linked, this would be a good question to look into in its own right, by pulling in a bunch of team stats, looking at games without Price, etc. For now I’m happy to just look at Price’s save percentage and call it a day:

Indeed, Price’s best save percentage is on Saturdays. That doesn’t mean his play is solely responsible for the high proportion of Saturday wins, but it does indicate that Carey plays at his best on Saturdays.

A Simple Statistical Test in Much Detail

CAUTION: I’m going to go into a lot of detail for those interested in what statistical tests really mean. If you don’t care or already know this, the TL;DR is that Carey really does win “statistically significantly” more on Saturdays.

Looking at these plots is nice, and they show that Carey’s record is better on Saturdays than the other nights of the week. But we wonder whether this is just a fluke. After all, if you flipped coins every day of the week, some days are going to have more heads than others. This sort of question is where “statistical significance” comes into play. Stated very simply, the idea of statistical significance just involves combining the difference in win percentage with information about sample size. For better or for worse, the standard approach to determining statistical significance uses “classical” statistical hypothesis testing.
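To make the coin-flip point concrete, here is a small Python simulation (an illustration only, not part of the original R analysis). It assumes a hypothetical 70 games on each day of the week, which is roughly 488 decisions spread evenly, and treats every game as a fair coin flip, so any gap between days is pure chance:

```python
import random

random.seed(2015)  # fixed seed so the run is repeatable

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
GAMES_PER_DAY = 70  # hypothetical: about 488 decisions split evenly over 7 days

# Every "game" is a fair coin flip, so no day has any real advantage.
win_pct_by_day = {}
for day in DAYS:
    wins = sum(random.random() < 0.5 for _ in range(GAMES_PER_DAY))
    win_pct_by_day[day] = 100 * wins / GAMES_PER_DAY

for day, pct in win_pct_by_day.items():
    print(f"{day}: {pct:.1f}%")

spread = max(win_pct_by_day.values()) - min(win_pct_by_day.values())
print(f"Gap between best and worst day: {spread:.1f} points")
```

Changing the seed changes which day comes out on top. The question for the real data is whether the Saturday gap is bigger than this kind of noise can plausibly produce.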

I had one specific question, that I had in my head before I saw any data:

“Does Carey Price really win more on Saturday Nights?”

To rigorously interpret a statistical test, it’s important to know the research question ahead of time and stick to it. The hypotheses we are testing are:

\[\begin{equation} \begin{aligned} \text{Null Hypothesis }(H_0) & \; \text{: Carey wins the same proportion of games regardless of whether it is Saturday or not} \\ \text{Alt. Hypothesis }(H_1) & \; \text{: Carey wins a higher proportion of games on Saturdays than on other days} \end{aligned} \end{equation}\]

Our test will either present strong enough evidence to reject the null hypothesis, or it won’t. The evidence will come in the form of a test statistic, a number that summarizes information about the relative proportions of wins on each type of night, along with the sample sizes. The classical approach is pretty much: if this test statistic is big enough, we will reject \(H_0\).

We choose a statistical model for the data, which involves making some assumptions and simplifications. My model views each game as a Bernoulli trial, from one of two populations (Saturday games and non-Saturday games). This means that the Saturday games in our dataset are a random sample from a hypothetical infinite population of Saturday games, that all share a common probability of a win, \(p_1\). Similarly the non-Saturday games are a sample from a population with a possibly different chance of winning, \(p_2\). Our hypothesis test can then be stated:

\[\begin{equation} \begin{aligned} H_0: \; & p_1 = p_2 \\ H_1: \; & p_1 > p_2 \end{aligned} \end{equation}\]

Now, we form our test statistic \(Z\), which depends on the observed proportions of wins, as well as the number of games worth of data we have:

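\[ Z = \frac{\hat{p}_1 - \hat{p}_2}{ \sqrt{ \hat{p} ( 1 - \hat{p} ) [ (1/n_1) + (1/n_2) ] } } \]

where \(\hat{p}_1\) is the observed Saturday win percentage, \(\hat{p}_2\) is the observed non-Saturday win percentage, \(n_1\) is the number of Saturday games, \(n_2\) is the number of non-Saturday games, and \(\hat{p}\) is the overall (pooled) win percentage. The hats mark quantities computed from the data, as opposed to the unknown population probabilities \(p_1\) and \(p_2\) in the hypotheses. Why this statistic? See here.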
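Plugging the regular season records into this statistic is simple arithmetic. Here is a minimal Python sketch of the calculation (written for illustration, not the R code used to build this post), using the Saturday record 90-31-18 and the non-Saturday record 172-140-37 from earlier:

```python
from math import sqrt

# Saturday and non-Saturday regular season records from the text.
sat_wins, n1 = 90, 90 + 31 + 18        # 139 Saturday decisions
other_wins, n2 = 172, 172 + 140 + 37   # 349 non-Saturday decisions

p1 = sat_wins / n1                       # observed Saturday win proportion
p2 = other_wins / n2                     # observed non-Saturday win proportion
p = (sat_wins + other_wins) / (n1 + n2)  # pooled (overall) win proportion

# Two-proportion Z statistic with a pooled standard error.
z = (p1 - p2) / sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
print(f"Z = {z:.3f}")
```

The pooled proportion is used in the denominator because, under the null hypothesis, both groups share the same win probability, so all 488 decisions estimate it together.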

For our data, \(Z \approx 3.092\). As I mentioned, the larger \(Z\), the stronger the evidence that Carey wins more on Saturdays. Two things can increase \(Z\): seeing a bigger difference in win percentage, or observing more games. From the test statistic, we can compute the infamous “\(p\)-value”. The \(p\)-value is the probability, if the null hypothesis were true, that our test statistic would come out to be \(3.092\) or bigger. In plainer speak, the \(p\)-value is the chance that we would see Carey win at least this much more on Saturdays, even if the Saturday effect isn’t real.

For our data, the \(p\)-value is \(9.94\times 10^{-4}\), or almost exactly \(0.001\). This means that the chance of seeing such a big \(Z\) value given no real improvement on Saturdays is about \(1/1000\). At any reasonable significance level, I conclude Mr. Saturday Night is statistically legit.
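For the curious, the one-sided \(p\)-value is just the upper tail area of a standard normal distribution beyond \(Z\). Here is a hedged Python sketch using only the standard library (the helper name `upper_tail` is my own, and it relies on the usual large-sample normal approximation for \(Z\)):

```python
from math import erfc, sqrt

def upper_tail(z):
    """P(Z >= z) for a standard normal variable, via the complementary error function."""
    return 0.5 * erfc(z / sqrt(2))

z = 3.092  # test statistic reported in the text
p_value = upper_tail(z)
print(f"one-sided p-value = {p_value:.2e}")
```

Because the alternative hypothesis is one-sided (Saturdays are better, not merely different), only the upper tail is counted. A two-sided test would double this value.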

The Playoffs

As a Habs fan, I hate to say it, but in the playoffs the Mr. Saturday Night effect is not present. If anything, it’s reversed. Remember that the sample size of playoff games is small, so this data is not as informative. I’ve run out of steam on this whole thing, so I’m just going to present the playoff results without any hard-hitting analysis. All ears to comments on this reversal.

First off, the win percentage plot again:

And here is the raw count plot:

Saturdays look rough. Does Carey really play worse on Saturdays in the playoffs though? Let’s look at the SV% plot:

I hate to even bring this up because Carey Price is the closest thing to a god that I believe in, but if the Habs are going to win a cup, the Saturday night magic must continue on through to the postseason.

If you’re wondering, including the playoff data in all of the previous analysis doesn’t change anything much. The number of playoff games (54) is so small compared to regular season games (497) that the effect is minimal.
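The overall playoff numbers are not listed separately above, but they follow by subtracting the regular season totals from the career totals quoted in The Basics. Here is a small Python sketch of that bookkeeping (illustrative only, and it says nothing about the Saturday split, which is not given in the text):

```python
# (wins, losses, overtime losses) and games played, as quoted in the text.
all_games = {"record": (285, 198, 55), "played": 551}
regular_season = {"record": (262, 171, 55), "played": 497}

# Playoffs = everything minus the regular season, component by component.
playoff_record = tuple(
    a - r for a, r in zip(all_games["record"], regular_season["record"])
)
playoff_games = all_games["played"] - regular_season["played"]
playoff_share = 100 * playoff_games / all_games["played"]

print("Playoff record (W-L-OTL):", "-".join(map(str, playoff_record)))
print(f"Playoff games: {playoff_games} ({playoff_share:.1f}% of all games)")
```

The overtime-loss column comes out to zero because playoff games have no overtime-loss category: a loss in overtime is just a loss.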

Some notes on how this was made

I created this using solely R-based tools, or other tools with interfaces for R. The data was collected using Selenium, which automates web browsers to collect information that is generated by scripts on web pages. Selenium bindings for R were provided by the RSelenium package. The data was manipulated using the fantastic data.table package, and the plots generated with minimal fuss using the well-known ggplot2 package. All exploration on my end was done using these tools.
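The core manipulation step is a group-by on day of the week. The real work was done in R with data.table. What follows is only a rough Python sketch of the same idea, and the rows in `games` are made-up placeholders, not real results:

```python
from collections import Counter, defaultdict
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Made-up placeholder rows (date, result), NOT real game results.
games = [
    (date(2015, 2, 7), "W"),
    (date(2015, 2, 10), "L"),
    (date(2015, 2, 12), "OTL"),
    (date(2015, 2, 14), "W"),
]

# Tally W / L / OTL separately for each day of the week.
tally = defaultdict(Counter)
for played_on, result in games:
    tally[DAYS[played_on.weekday()]][result] += 1

for day in DAYS:
    if day in tally:
        counts = tally[day]
        total = sum(counts.values())
        win_share = 100 * counts["W"] / total
        print(f"{day}: {counts['W']}-{counts['L']}-{counts['OTL']} ({win_share:.1f}% wins)")
```

With the scraped game log in place of the placeholder rows, the same tally would give the per-day records behind the plots.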

This document itself was created using knitr and RMarkdown, with the code living within the document itself. The beauty of all of this is that as the stats are updated on the web, the new plots and analysis can be automatically refreshed. I started this document last week, and today I was able to easily update it to include Carey’s Saturday night win against Ottawa yesterday.

If you’ve got any thoughts on what could make this more useful, more correct or more entertaining, let me know. If you have any questions you want me to check out using this data, let me know. Or do it yourself. If you have any questions about doing that, ask, and I’m sure plenty of other nerds and I will be glad to help.