Good National Hockey League (NHL) forwards can have a positive impact on their team while they are on the ice. The best forwards in the league help their teams retain possession of the puck at a higher rate than the average NHL forward. To see which forwards NHL coaches deem the best on a given team, we can look at the amount of time on ice a forward receives per game at even strength. Even strength here means the game is being played at 5 players versus 5 players (5v5), which excludes any power play or penalty kill game states.
It’s a widely held belief that the varying skill levels of NHL players have an impact on goal-scoring and goals generated. However, it is also believed that turning a shot attempt into a goal requires a substantial amount of luck.
NHL teams divide their forwards into four lines of three players each. The top two lines on any team typically receive the most 5v5 time on ice per game. I would like to see if the first line forwards (by time on ice), denoted as “firstliners”, have a greater impact on a team’s shooting percentage than the average NHL forward. For my project I’ve considered all 1106 forwards who have played an NHL game since 2007. However, I’ve narrowed my selection down to the 576 forwards who have played at least 100 games in that time span. Since a season is 82 games long, 100 games is a decent cutoff for players who have seen regular game time over the course of at least one season. This helps eliminate much of the noise from players whose game logs are too small a sample to be informative.
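The filtering described above could be sketched in R as follows. This is only an illustration on a made-up data frame: the games-played column `GP`, the toy values, and the 13-minute cutoff for firstliners are assumptions (the cutoff is a guess based on the minimum `TOI.Gm` among firstliners in the summaries that follow), not the code actually used for this project.

```r
# Toy data standing in for the real player table (all values are made up).
# GP (games played) is a hypothetical column name; TOI.Gm and OSh. match
# the column names used in this write-up.
forwards <- data.frame(
  Player = c("A", "B", "C", "D", "E"),
  GP     = c(45, 120, 300, 98, 510),
  TOI.Gm = c(9.1, 11.8, 13.4, 12.0, 14.2),
  OSh.   = c(6.0, 7.4, 8.9, 7.1, 9.3)
)

# Keep forwards with at least 100 games played.
atleast100Gm <- subset(forwards, GP >= 100)

# Assumed definition: firstliners average at least 13 minutes of 5v5
# time on ice per game.
firstliners <- subset(atleast100Gm, TOI.Gm >= 13)

nrow(atleast100Gm)
nrow(firstliners)
```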
summary(atleast100Gm$TOI.Gm)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.90 10.30 11.80 11.44 13.00 16.20
summary(atleast100Gm$OSh.)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.500 6.700 7.600 7.605 8.500 11.300
summary(firstliners$TOI.Gm)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 13.00 13.30 13.50 13.74 14.10 16.20
summary(firstliners$OSh.)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 6.300 8.200 8.600 8.658 9.200 11.300
Here is a histogram of all 576 forwards’ team shooting percentages. In other words, what percentage of shots did player x’s team turn into goals while player x was on the ice?
hist(atleast100Gm$OSh.)
Here is a histogram of team shooting percentages with the first line forwards (by time on ice) on the ice.
hist(firstliners$OSh.)
Null hypothesis: The mean team shooting percentage of the firstliners is equal to the mean team shooting percentage of all 576 forwards with at least 100 games of experience.
Alternative hypothesis: The mean team shooting percentage of the firstliners is greater than the mean team shooting percentage of all 576 forwards.
Below is the summary of team shooting percentage for all 576 forwards, followed by the summary for just the firstliners.
summary(atleast100Gm$OSh.)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.500 6.700 7.600 7.605 8.500 11.300
summary(firstliners$OSh.)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 6.300 8.200 8.600 8.658 9.200 11.300
The population mean of all 576 forwards’ team shooting percentages is 7.605. The observed mean for the firstliners’ team shooting percentage is 8.658.
Below is the standard deviation of the team shooting percentages of all 576 forwards. I also used R to calculate the standard error, based on the 576 forwards.
sd(atleast100Gm$OSh.)
## [1] 1.27156
(sd(atleast100Gm$OSh.))/sqrt(576)
## [1] 0.05298167
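As a quick check of the arithmetic, the standard error above can be reproduced from the reported standard deviation alone. The helper name `std_error` is made up for this sketch; the inputs are the figures printed above.

```r
# Standard error of a mean: standard deviation divided by sqrt(n).
std_error <- function(s, n) s / sqrt(n)

pop_sd <- 1.27156   # sd(atleast100Gm$OSh.) as reported above
n_all  <- 576       # number of forwards with at least 100 games

se_all <- std_error(pop_sd, n_all)
se_all
```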
The null hypothesis is that there is no difference between the average shooting percentage of first liners and the average shooting percentage of all 576 forwards who have played at least 100 games since 2007.
mean(firstliners$OSh.)-mean(atleast100Gm$OSh.)
## [1] 1.052951
Calculate the z-score: the difference between the two means, minus the expected difference of 0 shooting percentage points, divided by the standard deviation of all 576 forwards.
((mean(firstliners$OSh.)-mean(atleast100Gm$OSh.))-0)/(sd(atleast100Gm$OSh.))
## [1] 0.8280784
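The same z-score can be recomputed from the summary figures reported above, which makes the scaling explicit. The function name `z_score` and its arguments are made up for this sketch. The optional `n` argument is an addition for comparison only: a z statistic for the mean of a group of size n is normally scaled by the standard error sd/sqrt(n) rather than by the standard deviation itself. The number of firstliners is not stated here, so that value is left for the reader to supply.

```r
# z = (observed difference - expected difference) / scale
# With n = 1 the scale is the population standard deviation, as in the
# calculation above; with n > 1 it is the standard error sd / sqrt(n).
z_score <- function(d, s, expected = 0, n = 1) {
  (d - expected) / (s / sqrt(n))
}

mean_diff <- 1.052951  # mean(firstliners$OSh.) - mean(atleast100Gm$OSh.)
pop_sd    <- 1.27156   # sd(atleast100Gm$OSh.)

z <- z_score(mean_diff, pop_sd)
round(z, 3)  # 0.828
```

With the real data loaded, `z_score(mean_diff, pop_sd, n = nrow(firstliners))` would give the standard-error version of the statistic.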
The z-score is 0.828, which we’ll round to 0.83 for the table lookup.
The p-value comes out to 0.2033. This is the upper-tail probability of a value more than 0.83 standard deviations above the mean of the 576-forward population.
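Instead of a table lookup, the upper-tail probability can be computed with R’s built-in `pnorm`. A minimal sketch, using the rounded z-score from above (the variable names are only illustrative):

```r
z_rounded <- 0.83

# P(Z > z) for a standard normal variable (one-sided, upper tail).
p_value <- pnorm(z_rounded, lower.tail = FALSE)
round(p_value, 4)  # 0.2033
```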
Given the fairly high p-value, which is above the standard 5% significance level, I fail to reject the null hypothesis. In other words, this test does not rule out that the firstliners’ higher mean team shooting percentage, relative to all 576 forwards in question, is due to chance.