110 Final: NHL Injuries

Author

Z Griffin

A Comparison of NHL Head Injuries Approximately 20 Years Apart

2005-2006 and 2024-2025 Seasons

Kris Letang, Pittsburgh Penguins, 2017, in a helmet with visor. Source: Wikimedia Commons.


Nathan MacKinnon of the Colorado Avalanche took a puck to the face on May 11th, 2026, during a playoff game. Source: ESPN, Associated Press

Introduction

My data set is the NHL Injury Database, compiled by LW3H from data from the teams, puckpedia.com, cbssports.com, the former capfriendly.com, tsn.ca, and sportsforcaster.com. It consists of injury data for all NHL teams from the 2000-2001 season to midway through 2025-2026, when I downloaded it. It is posted on the NHL Injury Viz blog page and on Tableau Public. I joined this with team rosters and player stats from Hockey-reference.com, which uses official NHL data provided by Sportradar.

Variable (dataset) | Definition | Notes
--- | --- | ---
Season (both) | Season of play, presented as the year it ended | either 2006 or 2025
Team (both) | Team the player was signed with | used in joining the datasets
Position (injuries) | Position played (Forward, Defense, or Goalie) | also indicates retired; filtered for only Forward and Defense
fullname (injuries) / Player (roster) | Player's given name and surname | had to wrangle the injuries dataset to create this column for joining to the rosters
InjuryType (injuries) | What the injury or illness was |
injurygroup (created) | InjuryType classified by area of body | head, upper body, lower body, or other (sickness, undisclosed, and injuries with no specific location)
GamesMissed (injuries) | How many games were missed |
Nationality (roster) | What country the player is from | modified from original column "Birth" due to formatting
Pos (roster) | Also position, but classified as Center, Left Wing, Right Wing, Defense, and Goalie |
Age (roster) | Player's age as of February of that season |
height (roster) | Height in inches | wrangled from the original column
Wt (roster) | Weight in pounds |
shoots (roster) | Side the player shoots the puck from | L or R; often opposite of the dominant hand
Exp (Experience) (roster) | Years in the league before that season |
GP (Games Played) (roster) | How many games played that season | specifically for that team
Goals (roster) | Goals scored |
Assists (roster) | Goals the player contributed to | usually by making one of the last two passes
PTS (Points) (roster) | Sum of Goals and Assists |
+/- (plus minus) (roster) | Differential of goals scored for and against when the player is on the ice |
PIM (roster) | Penalties in minutes | minor penalties are 2 min, major is 5, game misconduct is 10
TOI (roster) | Time on ice, total for the whole season | displayed as mm:ss
ATOI (roster) | Average time on ice, TOI/Games played | displayed as mm:ss
TOI_min (roster) | Time on ice, total | converted to minutes as a decimal
ATOI_min (roster) | Average time on ice, TOI/Games played | converted to minutes as a decimal

I have been a hockey fan for years, following the Pittsburgh Penguins, and it has bugged me that, at a time when more and more attention is being paid to head injuries and concussions in sports, National Hockey League players do not wear full face masks. Collegiate hockey requires full facial protection. Players in the new Professional Women's Hockey League wear helmets with cages. Instead, skaters in the NHL wear visors that protect the eyes only, and those were only made mandatory starting in the 2013-2014 season.

In 2012-2013, around 70% of the league was voluntarily wearing visors when New York Rangers defenseman Marc Staal was hit in the eye by a puck, and missed the remainder of the season (Johnson). Staal was not wearing a visor. In June, the NHL and NHL Players Association voted for a visor mandate that included a grandfather clause for anyone who had already played 25 games. Before Staal’s injury, a mandate had been rejected (Halford).

At the beginning of the 2005-2006 season, 38% of the NHL were wearing visors (CBC). In 2024-2025 and currently, only four players do not (0.46% of the league, in March of 2026): Jamie Benn of the Dallas Stars, Zach Bogosian of the Minnesota Wild, Ryan O’Reilly of the Nashville Predators, and Ryan Reaves of the San Jose Sharks (Johnson, Trettenero). I am comparing head injuries from two seasons approximately two decades apart to see how the increased safety measures, including visors, have performed over the years. The seasons are approximately, rather than exactly, two decades apart because the current season (2025-2026) was still in progress when I started this project, and 2004-2005 was not played.

Two important notes about this dataset:

One: Injuries are not necessarily independent, as the same players can get repeat injuries (Blake Lizotte of the Penguins does, with concussions in 2025), and players are often injuring each other.

Two: Only injuries that caused the player to miss subsequent games appear in this database. The injury to Nathan MacKinnon pictured above likely would not appear in this dataset later this week, as he came back in that game, and I doubt he will miss a game.

Data

Load Libraries and Data

library(tidyverse)
library(readxl)
Warning: package 'readxl' was built under R version 4.5.3
library(webshot2)
Warning: package 'webshot2' was built under R version 4.5.3
library(tidymodels)
library(RColorBrewer)
library(highcharter)
library(patchwork)
Warning: package 'patchwork' was built under R version 4.5.3
library(ggfortify)
setwd("~/Schol Stuff/Montgomery College 2025/Data 110 Data Visualization/110 Final Project")

nhlinjury <- read_csv("NHL Injury Database_data.csv") 
 
Roster06 <- read_excel("2006_2025Rosters.xlsx", sheet = 1)
Roster25 <- read_excel("2006_2025Rosters.xlsx", sheet = 2)

Preview and Clean The Rosters

Starting with the rosters and player stats: two different tables from Hockey-reference.com were combined per team in Excel, and not all columns were used.

head(Roster06)
# A tibble: 6 × 21
  Season Team    Player    Birth Nationality Pos     Age Ht       Wt `S/C` Exp  
   <dbl> <chr>   <chr>     <chr> <chr>       <chr> <dbl> <chr> <dbl> <chr> <chr>
1   2006 Anaheim François… ca CA CA          D        25 5-11    208 L/-   1    
2   2006 Anaheim Kip Bren… ca CA CA          LW       25 6-4     230 L/-   3    
3   2006 Anaheim Ilya Bry… su SU SU          G        25 6-3     213 -/L   2    
4   2006 Anaheim Keith Ca… us US US          D        35 6-1     207 L/-   13   
5   2006 Anaheim Joe DiPe… ca CA CA          D        26 6-2     199 L/-   1    
6   2006 Anaheim Sergei F… su SU SU          C        36 6-2     207 L/-   14   
# ℹ 10 more variables: `Birth Date` <chr>, Summary <chr>, GP <dbl>,
#   Goals <dbl>, Assists <dbl>, PTS <dbl>, `+/-` <dbl>, PIM <dbl>, TOI <chr>,
#   ATOI <chr>
head(Roster25)
# A tibble: 6 × 21
  Season Team    Player    Birth Nationality Pos     Age Ht       Wt `S/C` Exp  
   <dbl> <chr>   <chr>     <chr> <chr>       <chr> <dbl> <chr> <dbl> <chr> <chr>
1   2025 Anaheim Leo Carl… se SE SE          C        20 6-3     208 L/-   1    
2   2025 Anaheim Sam Cola… us US US          RW       23 6-2     213 R/-   1    
3   2025 Anaheim Lukáš Do… cz CZ CZ          G        24 6-2     190 -/L   3    
4   2025 Anaheim Brian Du… us US US          D        33 6-4     215 L/-   11   
5   2025 Anaheim Robby Fa… ca CA CA          C        29 5-11    185 L/-   8    
6   2025 Anaheim Cam Fowl… ca CA CA          D        33 6-2     213 L/-   14   
# ℹ 10 more variables: `Birth Date` <chr>, Summary <chr>, GP <dbl>,
#   Goals <dbl>, Assists <dbl>, PTS <dbl>, `+/-` <dbl>, PIM <dbl>, TOI <chr>,
#   ATOI <chr>

Nationality was created because the lowercase characters in ‘Birth’ were a country flag image on the website, so only the uppercase country code was kept.

Bind the rosters

Combine the two roster dataframes into one before any other cleaning is done to them:

# change Roster06 Goals to numeric so the bind works
Roster06$Goals <- as.numeric(Roster06$Goals) 

Rosterall <- bind_rows(Roster06, Roster25)

dim(Rosterall)
[1] 2205   21
head(Rosterall)
# A tibble: 6 × 21
  Season Team    Player    Birth Nationality Pos     Age Ht       Wt `S/C` Exp  
   <dbl> <chr>   <chr>     <chr> <chr>       <chr> <dbl> <chr> <dbl> <chr> <chr>
1   2006 Anaheim François… ca CA CA          D        25 5-11    208 L/-   1    
2   2006 Anaheim Kip Bren… ca CA CA          LW       25 6-4     230 L/-   3    
3   2006 Anaheim Ilya Bry… su SU SU          G        25 6-3     213 -/L   2    
4   2006 Anaheim Keith Ca… us US US          D        35 6-1     207 L/-   13   
5   2006 Anaheim Joe DiPe… ca CA CA          D        26 6-2     199 L/-   1    
6   2006 Anaheim Sergei F… su SU SU          C        36 6-2     207 L/-   14   
# ℹ 10 more variables: `Birth Date` <chr>, Summary <chr>, GP <dbl>,
#   Goals <dbl>, Assists <dbl>, PTS <dbl>, `+/-` <dbl>, PIM <dbl>, TOI <chr>,
#   ATOI <chr>

Making Numeric and Separating Columns (or both)

Time on Ice (TOI) and Average Time on Ice (ATOI) need to be converted to numeric

Experience (Exp) (years in league) needs the R for rookie replaced with 0, and converted to numeric

# ms() (from lubridate, attached by recent versions of tidyverse) parses the mm:ss format; as.numeric() converts to total seconds; divide by 60 to get minutes
Rosterall$TOI_min <- as.numeric(ms(Rosterall$TOI))/60 

Rosterall$ATOI_min <-as.numeric(ms(Rosterall$ATOI))/60

Rosterall$Exp <- as.numeric(gsub("R", "0", Rosterall$Exp))


head(Rosterall)
# A tibble: 6 × 23
  Season Team    Player    Birth Nationality Pos     Age Ht       Wt `S/C`   Exp
   <dbl> <chr>   <chr>     <chr> <chr>       <chr> <dbl> <chr> <dbl> <chr> <dbl>
1   2006 Anaheim François… ca CA CA          D        25 5-11    208 L/-       1
2   2006 Anaheim Kip Bren… ca CA CA          LW       25 6-4     230 L/-       3
3   2006 Anaheim Ilya Bry… su SU SU          G        25 6-3     213 -/L       2
4   2006 Anaheim Keith Ca… us US US          D        35 6-1     207 L/-      13
5   2006 Anaheim Joe DiPe… ca CA CA          D        26 6-2     199 L/-       1
6   2006 Anaheim Sergei F… su SU SU          C        36 6-2     207 L/-      14
# ℹ 12 more variables: `Birth Date` <chr>, Summary <chr>, GP <dbl>,
#   Goals <dbl>, Assists <dbl>, PTS <dbl>, `+/-` <dbl>, PIM <dbl>, TOI <chr>,
#   ATOI <chr>, TOI_min <dbl>, ATOI_min <dbl>
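
The mm:ss conversion can be checked by hand. The block below is a minimal sketch in base R, not the code used in this project: `mmss_to_min` is a hypothetical helper name and the input values are made up. It splits each string on the colon and adds the seconds as a fraction of a minute, which should give the same decimal minutes as the `ms()` approach above.

```r
# Convert "mm:ss" strings to decimal minutes without lubridate.
# mmss_to_min is a hypothetical helper, not part of the original analysis.
mmss_to_min <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) {
    # anything that is not exactly "minutes:seconds" (including NA) becomes NA
    if (length(p) != 2) return(NA_real_)
    as.numeric(p[1]) + as.numeric(p[2]) / 60
  }, numeric(1))
}

toi <- c("1234:30", "18:45", NA)  # made-up example values
mmss_to_min(toi)                  # 1234.50, 18.75, NA
```

Returning `NA` for malformed input keeps missing ice times missing instead of stopping with an error.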

Ht (height) in feet-inches needs to be separated and recombined. S/C (shoots/catches) needs to be separated as well

Rosterall2 <- Rosterall |>
  
  separate(Ht, into = c("feet", "inch"), sep = "-", convert = TRUE) |> 
  mutate(height = as.numeric(feet)*12 + as.numeric(inch), .after = "inch") |> #multiply feet by 12, add to inches to get height in inches as new column 
  
  separate(`S/C`, into= c("shoots", "catches"), sep ="/", convert = TRUE)  # separate S/C and names them; not concerned about the dashes in place of NAs as Catches is only relevant for goalies and they should be the only ones without "shoots" and I'm filtering them out! 
  


head(Rosterall2)
# A tibble: 6 × 26
  Season Team    Player   Birth Nationality Pos     Age  feet  inch height    Wt
   <dbl> <chr>   <chr>    <chr> <chr>       <chr> <dbl> <int> <int>  <dbl> <dbl>
1   2006 Anaheim Françoi… ca CA CA          D        25     5    11     71   208
2   2006 Anaheim Kip Bre… ca CA CA          LW       25     6     4     76   230
3   2006 Anaheim Ilya Br… su SU SU          G        25     6     3     75   213
4   2006 Anaheim Keith C… us US US          D        35     6     1     73   207
5   2006 Anaheim Joe DiP… ca CA CA          D        26     6     2     74   199
6   2006 Anaheim Sergei … su SU SU          C        36     6     2     74   207
# ℹ 15 more variables: shoots <chr>, catches <chr>, Exp <dbl>,
#   `Birth Date` <chr>, Summary <chr>, GP <dbl>, Goals <dbl>, Assists <dbl>,
#   PTS <dbl>, `+/-` <dbl>, PIM <dbl>, TOI <chr>, ATOI <chr>, TOI_min <dbl>,
#   ATOI_min <dbl>
Rosterall3 <- Rosterall2 |>
  # removing a few columns I absolutely don't need for ease of viewing
  select(-Birth, -feet, -inch, -catches, -Summary)

head(Rosterall3)
# A tibble: 6 × 21
  Season Team    Player        Nationality Pos     Age height    Wt shoots   Exp
   <dbl> <chr>   <chr>         <chr>       <chr> <dbl>  <dbl> <dbl> <chr>  <dbl>
1   2006 Anaheim François Bea… CA          D        25     71   208 L          1
2   2006 Anaheim Kip Brennan   CA          LW       25     76   230 L          3
3   2006 Anaheim Ilya Bryzgal… SU          G        25     75   213 -          2
4   2006 Anaheim Keith Carney  US          D        35     73   207 L         13
5   2006 Anaheim Joe DiPenta   CA          D        26     74   199 L          1
6   2006 Anaheim Sergei Fedor… SU          C        36     74   207 L         14
# ℹ 11 more variables: `Birth Date` <chr>, GP <dbl>, Goals <dbl>,
#   Assists <dbl>, PTS <dbl>, `+/-` <dbl>, PIM <dbl>, TOI <chr>, ATOI <chr>,
#   TOI_min <dbl>, ATOI_min <dbl>
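
As a cross-check on the height and shoots/catches wrangling, here is a small base R sketch of the same arithmetic. The helper name `ht_to_inches` is hypothetical, and recoding the dash to `NA` is an optional extra step that the pipeline above does not perform (it leaves the dash in place for goalies).

```r
# Feet-inches strings such as "5-11" to total inches (feet * 12 + inches).
# ht_to_inches is a hypothetical helper name.
ht_to_inches <- function(ht) {
  parts <- strsplit(ht, "-", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) * 12 + as.numeric(p[2]), numeric(1))
}

ht_to_inches(c("5-11", "6-4"))  # 71 76, matching the first two roster rows

# Shoots/catches: keep the part before the slash, then treat "-" as missing.
sc <- c("L/-", "-/L", "R/-")
shoots <- sub("/.*$", "", sc)
shoots[shoots == "-"] <- NA
shoots                          # "L" NA "R"
```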

Cleaning the Injury Database

head(nhlinjury)
# A tibble: 6 × 8
  Season  Team    Position Player   `Injury Type` `Cap Hit`  Chip `Games Missed`
  <chr>   <chr>   <chr>    <chr>    <chr>             <dbl> <dbl>          <dbl>
1 2000/01 Anaheim F        Kariya,… Foot                 NA    NA             16
2 2000/01 Anaheim F        Leclerc… Abdominal            NA    NA             13
3 2000/01 Anaheim F        Leclerc… Knee                 NA    NA             15
4 2000/01 Anaheim F        McDonal… Concussion           NA    NA              7
5 2000/01 Anaheim F        McInnis… Groin                NA    NA              3
6 2000/01 Anaheim F        McInnis… Neck                 NA    NA              3

Remove spaces in column names

Fix Season to be just the ending year of the season

names(nhlinjury) <- gsub(" ", "", names(nhlinjury))

str_sub(nhlinjury$Season, start=3, end = 5) <- ""


head(nhlinjury)
# A tibble: 6 × 8
  Season Team    Position Player         InjuryType CapHit  Chip GamesMissed
  <chr>  <chr>   <chr>    <chr>          <chr>       <dbl> <dbl>       <dbl>
1 2001   Anaheim F        Kariya, Paul   Foot           NA    NA          16
2 2001   Anaheim F        Leclerc, Mike  Abdominal      NA    NA          13
3 2001   Anaheim F        Leclerc, Mike  Knee           NA    NA          15
4 2001   Anaheim F        McDonald, Andy Concussion     NA    NA           7
5 2001   Anaheim F        McInnis, Marty Groin          NA    NA           3
6 2001   Anaheim F        McInnis, Marty Neck           NA    NA           3
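
The season fix works by deleting characters 3 through 5 of a string like "2000/01", leaving the century digits followed by the two-digit ending year. An equivalent way to see this, sketched in base R with made-up input values:

```r
# "2000/01" -> "2001": keep the first two characters (the century)
# and the last two characters (the year the season ended).
season <- c("2000/01", "2005/06", "2024/25")
end_year <- paste0(substr(season, 1, 2), substr(season, 6, 7))
end_year  # "2001" "2006" "2025"
```

Either version assumes both years of a season fall in the same century. A season written as "1999/00" would come out as "1900", which does not matter here because the database starts in 2000-2001.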

Then separate Player from its "Surname, Given" format and recombine it as "Given Surname" to match the rosters

nhlinjury1 <- nhlinjury |>
  filter(Season %in% c('2006','2025')) |>  
  
  separate(Player, into = c('surname', 'given'), sep =",", convert = TRUE) |>
  mutate(fullname = paste(given, surname, sep = " "), .after = 'given')

head(nhlinjury)
# A tibble: 6 × 8
  Season Team    Position Player         InjuryType CapHit  Chip GamesMissed
  <chr>  <chr>   <chr>    <chr>          <chr>       <dbl> <dbl>       <dbl>
1 2001   Anaheim F        Kariya, Paul   Foot           NA    NA          16
2 2001   Anaheim F        Leclerc, Mike  Abdominal      NA    NA          13
3 2001   Anaheim F        Leclerc, Mike  Knee           NA    NA          15
4 2001   Anaheim F        McDonald, Andy Concussion     NA    NA           7
5 2001   Anaheim F        McInnis, Marty Groin          NA    NA           3
6 2001   Anaheim F        McInnis, Marty Neck           NA    NA           3
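
The name reordering can also be done in a single step with a regular expression. This is only a sketch of an alternative, not the method used in this project, and it assumes each name contains exactly one comma. It also drops the space after the comma, so no later trimming is needed for that column.

```r
# "Surname, Given" -> "Given Surname" in one substitution.
# Group 1 is everything before the comma, group 2 is everything after it,
# with surrounding whitespace excluded from both.
player <- c("Kariya, Paul", "Leclerc, Mike")
fullname <- sub("^\\s*([^,]+),\\s*(.+?)\\s*$", "\\2 \\1", player, perl = TRUE)
fullname  # "Paul Kariya" "Mike Leclerc"
```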

Group Injury Types

sort(unique(nhlinjury1$InjuryType))
 [1] "Abdominal"     "Adductor"      "Ankle"         "Appendectomy" 
 [5] "Arm"           "Back"          "Bicep"         "Blood clots"  
 [9] "Bronchitis"    "Cancer"        "Charley horse" "Cheek"        
[13] "Chest"         "Collarbone"    "Concussion"    "Dehydration"  
[17] "Dental"        "Dizziness"     "Ear"           "Elbow"        
[21] "Eye"           "Facial"        "Finger"        "Flu"          
[25] "Foot"          "Groin"         "Hamstring"     "Hand"         
[29] "Head"          "Heart"         "Heel"          "Hepatitis"    
[33] "Hernia"        "Hip"           "Illness"       "Infection"    
[37] "Jaw"           "Knee"          "Larynx"        "Leg"          
[41] "Lower body"    "Mid-body"      "Neck"          "Nose"         
[45] "Oblique"       "Orbital bone"  "Pectoral"      "Pelvis"       
[49] "Quadriceps"    "Respiratory"   "Ribs"          "Shoulder"     
[53] "Sinus"         "Sternum"       "Tailbone"      "Thigh"        
[57] "Thumb"         "Toe"           "Torso"         "Undisclosed"  
[61] "Upper body"    "Wrist"        

Group for injuries to the head, upper body, lower body, and illness or unspecified location injuries

head <- c("Cheek", "Concussion", "Dizziness", "Ear", "Eye", "Facial", "Head", "Jaw", "Nose", "Orbital bone")

upper <- c("Abdominal", "Arm", "Back", "Bicep", "Chest", "Collarbone", "Elbow", "Finger", "Hand", "Heart", "Larynx", "Mid-body", "Oblique", "Pectoral", "Ribs", "Shoulder", "Sternum", "Thumb", "Torso", "Upper body", "Wrist")

lower <- c("Adductor", "Ankle", "Charley horse", "Foot", "Groin", "Hamstring", "Heel", "Hip", "Knee", "Leg", "Lower body", "Pelvis", "Quadriceps", "Tailbone", "Thigh", "Toe")

other <- c("Appendectomy", "Blood clots", "Bronchitis", "Cancer", "Dehydration", "Dental", "Hepatitis", "Hernia", "Illness", "Infection", "Sinus", "Undisclosed")

injury2 <- nhlinjury1 |>
  mutate(injurygroup = case_when(
    InjuryType %in% head ~ "head",
    InjuryType %in% upper ~ "upper", 
    InjuryType %in% lower ~ "lower",
    TRUE ~ "other"), .after = InjuryType) 

head(injury2)    
# A tibble: 6 × 11
  Season Team    Position surname  given  fullname InjuryType injurygroup CapHit
  <chr>  <chr>   <chr>    <chr>    <chr>  <chr>    <chr>      <chr>        <dbl>
1 2006   Anaheim F        Brennan  " Kip" " Kip B… Shoulder   upper           NA
2 2006   Anaheim F        Fedorov  " Ser… " Serge… Groin      lower           NA
3 2006   Anaheim F        Fedoruk  " Tod… " Todd … Back       upper           NA
4 2006   Anaheim F        Getzlaf  " Rya… " Ryan … Shoulder   upper           NA
5 2006   Anaheim F        Hedström " Jon… " Jonat… Groin      lower           NA
6 2006   Anaheim F        Konopka  " Zen… " Zenon… Ankle      lower           NA
# ℹ 2 more variables: Chip <dbl>, GamesMissed <dbl>
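
Because the final `TRUE ~ "other"` branch catches everything, any injury label that is misspelled in, or left out of, the grouping vectors is classified as "other" without a warning. A quick check with `setdiff()` can list those labels before grouping. The sketch below uses a deliberately small, made-up set of labels (with one deliberate misspelling) instead of the full vectors.

```r
# Toy grouping vectors; "Quadricepts" is misspelled on purpose.
groups <- list(
  head  = c("Concussion", "Eye"),
  lower = c("Knee", "Quadricepts"),
  other = c("Illness")
)
# Toy stand-in for sort(unique(nhlinjury1$InjuryType))
observed <- c("Concussion", "Eye", "Knee", "Quadriceps", "Neck", "Illness")

listed <- unlist(groups, use.names = FALSE)

# Labels in the data that no vector mentions: these fall through to "other".
unmatched <- setdiff(observed, listed)
unmatched  # "Quadriceps" "Neck"

# Labels in the vectors that never occur in the data: likely typos.
unused <- setdiff(listed, observed)
unused     # "Quadricepts"
```

Running the same two `setdiff()` calls against the real vectors would show which labels reach "other" only through the fallback.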

Join the dataframes

First trim all whitespace from the joining columns

Rosterall3$Player <-trimws(Rosterall3$Player) # weirdly this did not seem to work on every row and I had to manually remove spaces from the Excel file

injury2$fullname <- trimws(injury2$fullname)

Rosterall3$Team <- trimws(Rosterall3$Team)

injury2$Team <- trimws(injury2$Team)

Rosterall3$Season <-trimws(Rosterall3$Season)

injury2$Season <- trimws(injury2$Season)
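
One possible reason `trimws()` can appear to miss some rows is a non-breaking space, which is common in text copied from web pages. By default `trimws()` only strips ASCII spaces, tabs, and line breaks. Whether that was the cause here is an assumption, since the original Excel file was fixed by hand. The sketch below shows the difference on a made-up value.

```r
# A name padded with non-breaking spaces (U+00A0), as web tables often are.
x <- "\u00a0Kip Brennan\u00a0"

default_trim <- trimws(x)                           # default: [ \t\r\n] only
unicode_trim <- trimws(x, whitespace = "[\\h\\v]")  # any horizontal/vertical space

default_trim == "Kip Brennan"  # FALSE: the non-breaking spaces are still there
unicode_trim == "Kip Brennan"  # TRUE
```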

Complete the join

joined <- left_join(injury2, Rosterall3, by = c("Season" = "Season", "fullname" = "Player", "Team" = "Team"), relationship = "many-to-many")

head(joined)
# A tibble: 6 × 29
  Season Team    Position surname  given  fullname InjuryType injurygroup CapHit
  <chr>  <chr>   <chr>    <chr>    <chr>  <chr>    <chr>      <chr>        <dbl>
1 2006   Anaheim F        Brennan  " Kip" Kip Bre… Shoulder   upper           NA
2 2006   Anaheim F        Fedorov  " Ser… Sergei … Groin      lower           NA
3 2006   Anaheim F        Fedoruk  " Tod… Todd Fe… Back       upper           NA
4 2006   Anaheim F        Getzlaf  " Rya… Ryan Ge… Shoulder   upper           NA
5 2006   Anaheim F        Hedström " Jon… Jonatha… Groin      lower           NA
6 2006   Anaheim F        Konopka  " Zen… Zenon K… Ankle      lower           NA
# ℹ 20 more variables: Chip <dbl>, GamesMissed <dbl>, Nationality <chr>,
#   Pos <chr>, Age <dbl>, height <dbl>, Wt <dbl>, shoots <chr>, Exp <dbl>,
#   `Birth Date` <chr>, GP <dbl>, Goals <dbl>, Assists <dbl>, PTS <dbl>,
#   `+/-` <dbl>, PIM <dbl>, TOI <chr>, ATOI <chr>, TOI_min <dbl>,
#   ATOI_min <dbl>

Please note that joins sometimes failed due to the player never actually playing a game (during the regular season) with the team they were signed to when reported as injured. In some cases, this is because they missed the entire season due to injury. In others, it is because they were traded or otherwise changed teams mid-season, or were signed to an NHL level contract, but played solely for the minor league affiliate team.
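
To see which injury rows found no roster match, the join keys can be compared directly. The sketch below is base R on made-up data frames ("Player A" and "Player B" are placeholders), and `make_key` is a hypothetical helper. With dplyr, `anti_join()` on the same three keys should give the equivalent result.

```r
# Toy stand-ins for injury2 and Rosterall3 with only the key columns.
injuries <- data.frame(
  Season   = c("2006", "2006", "2025"),
  Team     = c("Anaheim", "Boston", "Pittsburgh"),
  fullname = c("Kip Brennan", "Player A", "Player B"),
  stringsAsFactors = FALSE
)
roster <- data.frame(
  Season = c("2006", "2025"),
  Team   = c("Anaheim", "Pittsburgh"),
  Player = c("Kip Brennan", "Player B"),
  stringsAsFactors = FALSE
)

# Build one composite key per row so a single %in% covers all three columns.
make_key <- function(season, team, name) paste(season, team, name, sep = "|")

inj_key    <- make_key(injuries$Season, injuries$Team, injuries$fullname)
roster_key <- make_key(roster$Season, roster$Team, roster$Player)

# Injury rows whose season/team/name combination is absent from the roster.
unmatched <- injuries[!inj_key %in% roster_key, ]
unmatched
```

Listing these rows makes it possible to separate expected misses (players who never dressed for that team) from misses caused by spelling or whitespace differences in names.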

Filter for only active Forwards and Defensemen

Goalies have completely different stats that I did not grab. They also had all been wearing full face masks.

The “retired” status is for players who haven’t played since a previous season due to injuries or physical condition but had not officially retired. Those players would matter for salary cap investigations, but I’m not looking at money at all.

unique(joined$Position)
[1] "F"             "D"             "G"             "D \"retired\""
[5] "G \"retired\""
join1 <- joined |>
  filter(Position %in% c("F", "D")) |>
  select(-CapHit, -Chip) # remove the columns concerning injury time effect on team's salary cap, as that is outside of the scope of this project

head(join1)
# A tibble: 6 × 27
  Season Team    Position surname  given       fullname   InjuryType injurygroup
  <chr>  <chr>   <chr>    <chr>    <chr>       <chr>      <chr>      <chr>      
1 2006   Anaheim F        Brennan  " Kip"      Kip Brenn… Shoulder   upper      
2 2006   Anaheim F        Fedorov  " Sergei"   Sergei Fe… Groin      lower      
3 2006   Anaheim F        Fedoruk  " Todd"     Todd Fedo… Back       upper      
4 2006   Anaheim F        Getzlaf  " Ryan"     Ryan Getz… Shoulder   upper      
5 2006   Anaheim F        Hedström " Jonathan" Jonathan … Groin      lower      
6 2006   Anaheim F        Konopka  " Zenon"    Zenon Kon… Ankle      lower      
# ℹ 19 more variables: GamesMissed <dbl>, Nationality <chr>, Pos <chr>,
#   Age <dbl>, height <dbl>, Wt <dbl>, shoots <chr>, Exp <dbl>,
#   `Birth Date` <chr>, GP <dbl>, Goals <dbl>, Assists <dbl>, PTS <dbl>,
#   `+/-` <dbl>, PIM <dbl>, TOI <chr>, ATOI <chr>, TOI_min <dbl>,
#   ATOI_min <dbl>

Filter for just the Head group injuries

headjoin <- join1 |> 
  filter(injurygroup == "head")

head(headjoin)
# A tibble: 6 × 27
  Season Team    Position surname     given      fullname InjuryType injurygroup
  <chr>  <chr>   <chr>    <chr>       <chr>      <chr>    <chr>      <chr>      
1 2006   Anaheim F        Lupul       " Joffrey" Joffrey… Concussion head       
2 2006   Anaheim F        Niedermayer " Rob"     Rob Nie… Concussion head       
3 2006   Anaheim F        Perry       " Corey"   Corey P… Concussion head       
4 2006   Anaheim D        Marshall    " Jason"   Jason M… Nose       head       
5 2006   Anaheim D        Salei       " Ruslan"  Ruslan … Orbital b… head       
6 2006   Boston  D        Leetch      " Brian"   Brian L… Head       head       
# ℹ 19 more variables: GamesMissed <dbl>, Nationality <chr>, Pos <chr>,
#   Age <dbl>, height <dbl>, Wt <dbl>, shoots <chr>, Exp <dbl>,
#   `Birth Date` <chr>, GP <dbl>, Goals <dbl>, Assists <dbl>, PTS <dbl>,
#   `+/-` <dbl>, PIM <dbl>, TOI <chr>, ATOI <chr>, TOI_min <dbl>,
#   ATOI_min <dbl>

Model: Head Injury Severity (GamesMissed)

Check for collinearity using library DataExplorer on the numeric predictor variables

library(DataExplorer)
headjoinnumeric <- headjoin |>
  select(Age, height, Wt, Exp, GP, Goals, Assists, PTS, `+/-`, PIM, TOI_min, ATOI_min)

plot_correlation(headjoinnumeric)

Age and Experience are highly correlated, as are height and Wt (weight). PTS correlates highly with Goals, Assists, and TOI_min, and TOI_min correlates highly with Games Played (GP). Goals and Assists are right on the border of being a judgment call.
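
The same judgment can be made from the numbers behind the plot. The sketch below uses simulated data, not the project data, and the 0.7 cutoff is an assumed threshold chosen only for illustration. It computes a correlation matrix with `cor()` and lists each pair whose absolute correlation exceeds the cutoff.

```r
set.seed(110)
n <- 50

# Simulated players: Exp is built from Age, Wt is independent of both.
age <- round(runif(n, 19, 38))
toy <- data.frame(
  Age = age,
  Exp = pmax(age - 20 + round(rnorm(n, 0, 1.5)), 0),
  Wt  = round(rnorm(n, 200, 15))
)

cm <- cor(toy, use = "pairwise.complete.obs")

# Look only above the diagonal so each pair is reported once.
cutoff <- 0.7  # assumed threshold
idx <- which(abs(cm) > cutoff & upper.tri(cm), arr.ind = TRUE)

high_pairs <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)
high_pairs
```

On the real data, `toy` would be replaced by the numeric columns selected for the correlation plot.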

Create model

Fit1 uses Season, InjuryType, Nationality, Wt (weight), shoots, sqrt(Exp), log(GP) (games played), sqrt(PTS), +/-, log(PIM) (penalty minutes), and Average Time on Ice (ATOI_min).

NOTE: The log of Games Played is still not normally distributed. It is highly skewed left, with the curve not coming all the way down, but at least it is no longer nearly linear.

fit1 <- lm(data = headjoin, GamesMissed ~ Season + InjuryType + Nationality + Wt + shoots + sqrt(Exp) + log(GP) + sqrt(PTS) + `+/-` + log(PIM) + ATOI_min)
summary(fit1)

Call:
lm(formula = GamesMissed ~ Season + InjuryType + Nationality + 
    Wt + shoots + sqrt(Exp) + log(GP) + sqrt(PTS) + `+/-` + log(PIM) + 
    ATOI_min, data = headjoin)

Residuals:
    Min      1Q  Median      3Q     Max 
-14.850  -5.099  -0.629   3.570  38.892 

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)    
(Intercept)             24.48455   18.18412   1.346 0.181904    
Season2025              -3.60691    2.87303  -1.255 0.212930    
InjuryTypeDizziness      6.15671    7.24136   0.850 0.397712    
InjuryTypeEar           -3.66600    9.36280  -0.392 0.696419    
InjuryTypeEye           -7.91756    3.46729  -2.283 0.025020 *  
InjuryTypeFacial        -2.64872    2.82568  -0.937 0.351353    
InjuryTypeHead          -1.12452    2.54506  -0.442 0.659780    
InjuryTypeJaw            3.95409    3.72504   1.061 0.291622    
InjuryTypeNose         -10.44233    5.39302  -1.936 0.056323 .  
InjuryTypeOrbital bone   2.69040    5.16185   0.521 0.603644    
NationalityCH           -6.78397    9.48158  -0.715 0.476364    
NationalityCS          -10.39441    4.35599  -2.386 0.019354 *  
NationalityCZ           -2.20512    6.61380  -0.333 0.739685    
NationalityDE           -8.00487    6.58099  -1.216 0.227380    
NationalityFI           -1.93630    4.94727  -0.391 0.696539    
NationalityRU           -0.79012    6.88161  -0.115 0.908875    
NationalitySE           -3.33688    3.64323  -0.916 0.362431    
NationalitySU          -13.26471    4.15151  -3.195 0.001992 ** 
NationalityUS           -4.44688    2.32255  -1.915 0.059067 .  
Wt                       0.10349    0.07024   1.473 0.144532    
shootsR                 -3.59790    1.88068  -1.913 0.059271 .  
sqrt(Exp)                2.49500    0.98588   2.531 0.013318 *  
log(GP)                 -8.59201    2.33869  -3.674 0.000428 ***
sqrt(PTS)                0.37239    0.75258   0.495 0.622069    
`+/-`                    0.16899    0.11255   1.502 0.137103    
log(PIM)                -0.75355    1.60864  -0.468 0.640730    
ATOI_min                -0.20308    0.26323  -0.771 0.442658    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.546 on 81 degrees of freedom
Multiple R-squared:  0.5291,    Adjusted R-squared:  0.378 
F-statistic:   3.5 on 26 and 81 DF,  p-value: 8.53e-06
autoplot(fit1, nrow=2, ncol=2)
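
The adjusted R-squared in the fit1 summary can be reproduced from the other reported numbers, which also shows how many observations the model actually used. With 26 estimated slopes plus an intercept and 81 residual degrees of freedom, the number of complete rows is 81 + 26 + 1 = 108. A short arithmetic sketch, using only values copied from the summary output:

```r
# Values copied from summary(fit1)
r2       <- 0.5291  # Multiple R-squared
df_resid <- 81      # residual degrees of freedom
p        <- 26      # number of predictors (model df in the F-statistic)

n <- df_resid + p + 1  # rows with no missing values in any model variable

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid
round(adj_r2, 3)       # 0.378, matching the summary
```

Because the penalty grows with the number of predictors, adjusted R-squared is the fairer number to compare as terms are removed.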

Backwards regression!

Remove sqrt(PTS) as highest p-value

fit2<- lm(data = headjoin, GamesMissed ~ Season + InjuryType + Nationality + Wt + shoots + sqrt(Exp) + log(GP)  + `+/-` + log(PIM) + ATOI_min)
autoplot(fit2)
Warning: Removed 78 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 8 rows containing missing values or values outside the scale range
(`geom_line()`).

summary(fit2)

Call:
lm(formula = GamesMissed ~ Season + InjuryType + Nationality + 
    Wt + shoots + sqrt(Exp) + log(GP) + `+/-` + log(PIM) + ATOI_min, 
    data = headjoin)

Residuals:
    Min      1Q  Median      3Q     Max 
-14.359  -5.281  -0.349   3.654  39.098 

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)    
(Intercept)             22.39299   17.60439   1.272 0.206965    
Season2025              -3.71798    2.85103  -1.304 0.195854    
InjuryTypeDizziness      6.63284    7.14402   0.928 0.355901    
InjuryTypeEar           -2.95560    9.20937  -0.321 0.749077    
InjuryTypeEye           -7.87710    3.45033  -2.283 0.025019 *  
InjuryTypeFacial        -2.38757    2.76314  -0.864 0.390067    
InjuryTypeHead          -1.23550    2.52346  -0.490 0.625718    
InjuryTypeJaw            3.88595    3.70531   1.049 0.297373    
InjuryTypeNose         -10.59943    5.35882  -1.978 0.051294 .  
InjuryTypeOrbital bone   2.34068    5.08964   0.460 0.646812    
NationalityCH           -6.90016    9.43492  -0.731 0.466655    
NationalityCS          -10.43284    4.33520  -2.407 0.018352 *  
NationalityCZ           -1.86709    6.54806  -0.285 0.776258    
NationalityDE           -7.38899    6.43239  -1.149 0.254013    
NationalityFI           -1.81302    4.91818  -0.369 0.713349    
NationalityRU           -0.58785    6.83776  -0.086 0.931698    
NationalitySE           -3.19155    3.61461  -0.883 0.379840    
NationalitySU          -13.44670    4.11610  -3.267 0.001589 ** 
NationalityUS           -4.20381    2.25953  -1.860 0.066402 .  
Wt                       0.10503    0.06985   1.504 0.136504    
shootsR                 -3.57266    1.87131  -1.909 0.059738 .  
sqrt(Exp)                2.58077    0.96604   2.671 0.009106 ** 
log(GP)                 -8.05480    2.06186  -3.907 0.000192 ***
`+/-`                    0.18387    0.10796   1.703 0.092314 .  
log(PIM)                -0.69294    1.59656  -0.434 0.665415    
ATOI_min                -0.14517    0.23469  -0.619 0.537932    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.507 on 82 degrees of freedom
Multiple R-squared:  0.5277,    Adjusted R-squared:  0.3837 
F-statistic: 3.664 on 25 and 82 DF,  p-value: 4.738e-06
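
Each removal step can be written with `update()` instead of retyping the whole formula, and `drop1()` gives one F-test per term. That matters for factors such as InjuryType and Nationality, because the summary table shows a separate p-value for every level and not one for the variable as a whole. The sketch below runs on simulated data with made-up effect sizes, so its numbers say nothing about the real model.

```r
set.seed(110)
n <- 80

# Simulated stand-in for headjoin with one factor and two numeric predictors.
toy <- data.frame(
  InjuryType = factor(sample(c("Concussion", "Eye", "Jaw"), n, replace = TRUE)),
  GP  = sample(5:82, n, replace = TRUE),
  PTS = sample(0:60, n, replace = TRUE)
)
# Games missed depends on log(GP) only; PTS has no true effect here.
toy$GamesMissed <- round(pmax(30 - 6 * log(toy$GP) + rnorm(n, 0, 4), 1))

full <- lm(GamesMissed ~ InjuryType + log(GP) + sqrt(PTS), data = toy)

# One F-test per term: the factor is tested as a whole, not level by level.
drop1(full, test = "F")

# Remove a single term while keeping the rest of the formula unchanged.
reduced <- update(full, . ~ . - sqrt(PTS))
formula(reduced)

# Partial F-test comparing the reduced model with the full model.
anova(reduced, full)
```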

Remove log(PIM) (Penalties in Minutes) as highest p-value

fit3<- lm(data = headjoin, GamesMissed ~ Season + InjuryType + Nationality + Wt + shoots + sqrt(Exp) + log(GP)  + `+/-` + ATOI_min)
summary(fit3)

Call:
lm(formula = GamesMissed ~ Season + InjuryType + Nationality + 
    Wt + shoots + sqrt(Exp) + log(GP) + `+/-` + ATOI_min, data = headjoin)

Residuals:
    Min      1Q  Median      3Q     Max 
-14.635  -5.171  -0.451   3.810  39.592 

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)    
(Intercept)             22.66532   17.50698   1.295  0.19903    
Season2025              -2.90853    2.14583  -1.355  0.17896    
InjuryTypeDizziness      5.96455    6.94192   0.859  0.39270    
InjuryTypeEar           -2.27784    9.03152  -0.252  0.80150    
InjuryTypeEye           -7.93150    3.43115  -2.312  0.02328 *  
InjuryTypeFacial        -2.36556    2.74914  -0.860  0.39201    
InjuryTypeHead          -1.34985    2.49737  -0.541  0.59029    
InjuryTypeJaw            3.71219    3.66557   1.013  0.31414    
InjuryTypeNose         -10.93661    5.27622  -2.073  0.04129 *  
InjuryTypeOrbital bone   1.82742    4.92608   0.371  0.71161    
NationalityCH           -7.54787    9.27049  -0.814  0.41787    
NationalityCS          -10.22047    4.28639  -2.384  0.01939 *  
NationalityCZ           -1.87073    6.51596  -0.287  0.77475    
NationalityDE           -6.92372    6.31135  -1.097  0.27580    
NationalityFI           -1.80487    4.89404  -0.369  0.71322    
NationalityRU           -0.57744    6.80420  -0.085  0.93257    
NationalitySE           -2.87013    3.52059  -0.815  0.41727    
NationalitySU          -13.10794    4.02161  -3.259  0.00162 ** 
NationalityUS           -4.13462    2.24285  -1.843  0.06883 .  
Wt                       0.10052    0.06873   1.463  0.14737    
shootsR                 -3.56224    1.86199  -1.913  0.05918 .  
sqrt(Exp)                2.59108    0.96102   2.696  0.00849 ** 
log(GP)                 -8.63934    1.55356  -5.561 3.19e-07 ***
`+/-`                    0.17042    0.10290   1.656  0.10147    
ATOI_min                -0.13084    0.23122  -0.566  0.57302    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.465 on 83 degrees of freedom
Multiple R-squared:  0.5266,    Adjusted R-squared:  0.3897 
F-statistic: 3.847 on 24 and 83 DF,  p-value: 2.519e-06
autoplot(fit3)
Warning: Removed 79 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_line()`).

Remove ATOI_min as highest p-value

fit4 <- lm(data = headjoin, GamesMissed ~ Season + InjuryType + Nationality + Wt + shoots + sqrt(Exp) + log(GP)  + `+/-`)
summary(fit4)

Call:
lm(formula = GamesMissed ~ Season + InjuryType + Nationality + 
    Wt + shoots + sqrt(Exp) + log(GP) + `+/-`, data = headjoin)

Residuals:
    Min      1Q  Median      3Q     Max 
-15.001  -5.505  -0.466   3.586  39.089 

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)    
(Intercept)             20.62437   17.06194   1.209 0.230134    
Season2025              -2.87566    2.13634  -1.346 0.181902    
InjuryTypeDizziness      5.54681    6.87457   0.807 0.422026    
InjuryTypeEar           -2.26454    8.99487  -0.252 0.801842    
InjuryTypeEye           -7.82027    3.41163  -2.292 0.024393 *  
InjuryTypeFacial        -2.57684    2.71262  -0.950 0.344864    
InjuryTypeHead          -1.41008    2.48498  -0.567 0.571929    
InjuryTypeJaw            3.66398    3.64972   1.004 0.318306    
InjuryTypeNose         -10.71722    5.24062  -2.045 0.043982 *  
InjuryTypeOrbital bone   1.87917    4.90526   0.383 0.702619    
NationalityCH           -8.36226    9.12095  -0.917 0.361862    
NationalityCS          -10.34608    4.26328  -2.427 0.017372 *  
NationalityCZ           -2.28949    6.44755  -0.355 0.723408    
NationalityDE           -7.17378    6.27033  -1.144 0.255839    
NationalityFI           -1.39259    4.81988  -0.289 0.773350    
NationalityRU           -1.04309    6.72687  -0.155 0.877144    
NationalitySE           -3.03840    3.49378  -0.870 0.386966    
NationalitySU          -13.49368    3.94734  -3.418 0.000974 ***
NationalityUS           -4.05468    2.22932  -1.819 0.072506 .  
Wt                       0.10947    0.06662   1.643 0.104060    
shootsR                 -3.48010    1.84879  -1.882 0.063249 .  
sqrt(Exp)                2.38495    0.88570   2.693 0.008553 ** 
log(GP)                 -8.97538    1.42975  -6.278 1.44e-08 ***
`+/-`                    0.15879    0.10042   1.581 0.117581    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.431 on 84 degrees of freedom
Multiple R-squared:  0.5248,    Adjusted R-squared:  0.3946 
F-statistic: 4.033 on 23 and 84 DF,  p-value: 1.377e-06
autoplot(fit4) 
Warning: Removed 86 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_line()`).

Remove plus/minus (`+/-`), the numeric predictor with the highest p-value (0.118)

fit5 <- lm(data = headjoin, GamesMissed ~ Season + InjuryType + Nationality + Wt + shoots + sqrt(Exp) + log(GP))
summary(fit5)

Call:
lm(formula = GamesMissed ~ Season + InjuryType + Nationality + 
    Wt + shoots + sqrt(Exp) + log(GP), data = headjoin)

Residuals:
    Min      1Q  Median      3Q     Max 
-15.953  -4.918  -1.057   3.759  40.357 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)             20.3074    17.2107   1.180  0.24132    
Season2025              -2.4643     2.1391  -1.152  0.25253    
InjuryTypeDizziness      3.2552     6.7791   0.480  0.63233    
InjuryTypeEar           -0.7754     9.0240  -0.086  0.93173    
InjuryTypeEye           -7.1363     3.4138  -2.090  0.03957 *  
InjuryTypeFacial        -2.1056     2.7199  -0.774  0.44098    
InjuryTypeHead          -1.1257     2.5002  -0.450  0.65369    
InjuryTypeJaw            3.9530     3.6772   1.075  0.28541    
InjuryTypeNose         -10.8382     5.2861  -2.050  0.04342 *  
InjuryTypeOrbital bone   2.8721     4.9077   0.585  0.55994    
NationalityCH           -8.3281     9.2011  -0.905  0.36796    
NationalityCS           -8.5467     4.1447  -2.062  0.04225 *  
NationalityCZ           -3.2541     6.4750  -0.503  0.61657    
NationalityDE           -7.2434     6.3253  -1.145  0.25536    
NationalityFI           -0.1708     4.7993  -0.036  0.97170    
NationalityRU            1.3249     6.6157   0.200  0.84175    
NationalitySE           -1.9379     3.4539  -0.561  0.57621    
NationalitySU          -12.5180     3.9331  -3.183  0.00204 ** 
NationalityUS           -4.3815     2.2392  -1.957  0.05367 .  
Wt                       0.1091     0.0672   1.624  0.10812    
shootsR                 -3.3162     1.8621  -1.781  0.07850 .  
sqrt(Exp)                2.1075     0.8758   2.406  0.01827 *  
log(GP)                 -8.9007     1.4415  -6.174 2.19e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.505 on 85 degrees of freedom
Multiple R-squared:  0.5106,    Adjusted R-squared:  0.384 
F-statistic: 4.031 on 22 and 85 DF,  p-value: 1.738e-06
autoplot(fit5) 
Warning: Removed 88 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_line()`).

Adjusted R-squared decreased from 0.3946 (fit4) to 0.384 (fit5), so the best model is fit4
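The elimination steps above go by the p-values of individual coefficients, which is awkward for factors such as InjuryType and Nationality that contribute many coefficients each. A minimal sketch of one alternative, assuming term-level F-tests from `drop1()` are acceptable for this purpose: it uses the built-in `mtcars` data as a stand-in (headjoin is not reproduced here), so the variable names below are illustrative and not part of my analysis.

```r
# Stand-in data: mtcars ships with base R; cyl is turned into a factor
# so the model has a multi-level categorical term, like Nationality.
cars <- transform(mtcars, cyl = factor(cyl))

full <- lm(mpg ~ wt + hp + qsec + cyl, data = cars)

# One F-test per term (not per coefficient), so a factor gets a single p-value
term_tests <- drop1(full, test = "F")
print(term_tests)

# Term with the largest p-value; the "<none>" row is NA and is ignored
pvals <- setNames(term_tests[["Pr(>F)"]], rownames(term_tests))
weakest <- names(which.max(pvals))

# Refit without that term and compare adjusted R-squared
reduced <- update(full, as.formula(paste(". ~ . -", weakest)))
adj_r2 <- c(full = summary(full)$adj.r.squared,
            reduced = summary(reduced)$adj.r.squared)
print(round(adj_r2, 4))
```

If the reduced model has the lower adjusted R-squared, the dropped term was still pulling its weight, which is the same stopping rule used for fit4 versus fit5.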

Final Model:

fit4:

GamesMissed_hat = 20.62437 - 2.87566(Season2025) + 5.54681(InjuryTypeDizziness) - 2.26454(InjuryTypeEar) - 7.82027(InjuryTypeEye) - 2.57684(InjuryTypeFacial) - 1.41008(InjuryTypeHead) + 3.66398(InjuryTypeJaw)
- 10.71722(InjuryTypeNose) + 1.87917(InjuryTypeOrbital bone) - 8.36226(NationalityCH) - 10.34608(NationalityCS) - 2.28949(NationalityCZ) - 7.17378(NationalityDE) - 1.39259(NationalityFI) - 1.04309(NationalityRU) - 3.03840(NationalitySE) - 13.49368(NationalitySU) - 4.05468(NationalityUS) + 0.10947(Wt) - 3.48010(shootsR) + 2.38495(sqrt(Exp)) - 8.97538(log(GP)) + 0.15879(`+/-`)
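As a worked example of reading the fit4 coefficients, here is a small sketch that plugs in a hypothetical player. The player's values (200 lb, 4 years of experience, 50 games played, plus/minus of 0) are invented for illustration. The player is assumed to sit at every reference level (2006 season, Canadian, left shot, and the baseline injury type, which I assume is Concussion because it does not appear among the dummy variables), so all dummy terms are zero.

```r
# Coefficients copied from the fit4 summary output (numeric terms only;
# every dummy variable is 0 for a player at the reference levels)
b <- c("(Intercept)" = 20.62437,
       "Wt"          = 0.10947,
       "sqrt(Exp)"   = 2.38495,
       "log(GP)"     = -8.97538,
       "+/-"         = 0.15879)

# Hypothetical player, values invented for illustration
x <- c("(Intercept)" = 1,
       "Wt"          = 200,       # pounds
       "sqrt(Exp)"   = sqrt(4),   # 4 years of experience
       "log(GP)"     = log(50),   # 50 games played
       "+/-"         = 0)

# Predicted games missed = sum of coefficient * value
pred <- sum(b * x[names(b)])
print(round(pred, 1))
```

For this made-up player the model predicts about 12 games missed. Because log(GP) has a large negative coefficient, the prediction drops quickly as games played goes up.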

Adjusted R-squared value: 0.3946

My model explains only 39.46% of the variation in GamesMissed (after adjusting for the number of predictors), so it is not a very good model.
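The adjusted values can be checked by hand from the numbers in the summary output. A short sketch, assuming the usual formula 1 - (1 - R²)(n - 1)/(n - p - 1), where n = 108 is the residual degrees of freedom plus the number of coefficients (84 + 24 for fit4) and p is the number of predictor coefficients; the helper name `adj_r2` is my own.

```r
# Adjusted R-squared from multiple R-squared, sample size n,
# and number of predictor coefficients p (intercept not counted)
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

n <- 108  # 84 residual df + 24 coefficients in fit4 (85 + 23 in fit5)

fit4_adj <- adj_r2(0.5248, n, p = 23)
fit5_adj <- adj_r2(0.5106, n, p = 22)

print(round(c(fit4 = fit4_adj, fit5 = fit5_adj), 4))
```

Both results agree with the reported 0.3946 and 0.384 up to rounding of the inputs, and they show why fit5 is penalized: its R-squared fell by more than the saving from having one fewer predictor.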

Because NationalityCA is the reference level (absorbed into the intercept) and all the other nationalities have negative coefficients, the model suggests Canadians had the most severe injuries. Only CS and SU differ significantly from Canada at the 0.05 level, though. I highly doubt Canadians actually get injured more severely; this is probably due to Canada and the US having the highest numbers of players in the league. (While three of the four players who don’t wear visors are Canadian, none of them got head injuries in 2025.)
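The nationality coefficients are all differences from the reference level, so their signs depend on which country is the baseline. A minimal sketch with toy data (the numbers are invented for illustration and are not from headjoin) showing how the same data give different-looking coefficients after `relevel()`:

```r
# Toy data, invented for illustration only
toy <- data.frame(
  GamesMissed = c(10, 12, 3, 5, 4, 6),
  Nationality = factor(c("CA", "CA", "SE", "SE", "US", "US"))
)

# The first factor level is the reference and is absorbed into the intercept
print(levels(toy$Nationality))

fit_ca <- lm(GamesMissed ~ Nationality, data = toy)
print(coef(fit_ca))   # SE and US are measured relative to CA

# Same data, but with US as the reference level
toy$Nationality <- relevel(toy$Nationality, ref = "US")
fit_us <- lm(GamesMissed ~ Nationality, data = toy)
print(coef(fit_us))   # CA and SE are now measured relative to US

# Group sizes matter too: small groups give noisy coefficients
print(table(toy$Nationality))
```

With CA as the baseline both other coefficients are negative; with US as the baseline the CA coefficient is positive. The fitted values are identical either way, so "all negative" only says that the reference group has the highest average, not that the difference is reliable.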

Visualizations

Count the number of each injury group for 2006 and 2025 seasons

groupcount <- join1 |>
  group_by(Season, injurygroup) |>
  count()

Count the number of Injury Types in the head injury group for 2006 and 2025 seasons

headcount <- headjoin |>
  group_by(Season, InjuryType) |>
  count()

Visualization 1:

p1 <- groupcount |>
  ggplot(aes(reorder(injurygroup, n), n, fill = Season)) +
  geom_col() +
  labs(title = "Injuries* by Body Part Area",
       x = "Injury Body Area",
       y = "Count",
       fill = "Season") +
  scale_fill_manual(values = c("#b4d4ed", "#83b6de")) +
  theme_bw()
p2 <- headcount |>
  ggplot(aes(reorder(InjuryType, n), n, fill = Season)) +
  geom_col() +
  coord_flip() +
  labs(title = "Head Injuries* to NHL Players in 2006 & 2025",
       x = "Injury",
       y = "Count",
       fill = "Season") +
  scale_fill_manual(values = c("#b4d4ed", "#83b6de")) +
  theme_bw()

# plot_layout() shares the legend; plot_annotation() gives the joint title,
# the subtitle that explains the asterisk, and the caption
p1 + p2 + plot_layout(guides = "collect") + plot_annotation(
  title = "Injuries* to NHL Players in 2005-2006 & 2024-2025",
  subtitle = "*that required absence from 1+ games",
  caption = "Source: puckpedia.com, cbssports.com, the former capfriendly.com, tsn.ca, and sportsforcaster.com")

The first chart shows that injuries in general have decreased in the 2024-2025 season compared to 2005-2006. Upper body injuries stayed about the same, and “other” increased, but as “other” includes illness, that category is not as useful.

The head injuries chart shows nearly all types of head injuries have decreased! Concussions and eye injuries in particular show a noticeable drop. This is consistent with the research by Benson et al. in youth and collegiate hockey. It is notable that “Facial” injuries look to be about equal between the two seasons, along with “jaw” injuries; the jaw is not covered by the visor. “Facial” can include parts of the face that are covered by the visor as well as parts that are not.

Visualization 2:

highchart() |>
  hc_add_series(headjoin,
                type = "scatter",
                hcaes(x = "ATOI_min",
                y = "GamesMissed",
                group = "Season"))|>
  hc_colors(c("#b4d4ed","#83b6de")) |>
  hc_add_theme(hc_theme_handdrawn()) |>
  hc_title(text = "Games Missed for Head Injuries vs Average Time on Ice<br>2006 and 2025") |>
  hc_xAxis(title=list(text="Average Time on Ice (minutes)")) |>
  hc_yAxis(title = list(text="Games Missed due to Injury")) |>
  hc_caption(text = "Source: puckpedia.com, cbssports.com, the former capfriendly.com, tsn.ca, and sportsforcaster.com, and sportsradar", align = "right") |>
  hc_legend(align = "right", verticalAlign = "top") |>
  hc_tooltip(pointFormat = "<b>{point.fullname}</b>, {point.Pos} <br> {point.Team}, {point.Season} <br> <b>Injury</b>: {point.InjuryType}<br> <b>ATOI: </b> {point.ATOI} mm:ss <br> <b>Games Missed:</b> {point.GamesMissed} <b> Games Played:</b> {point.GP}")

This graph shows games missed due to head injury against average time on ice (ATOI) in minutes. Although ATOI_min was taken out of my linear model, I was interested in which players were getting injured, and ATOI can be a proxy for how good a player is, as the better players play more per game. ATOI_min was not a significant predictor in fit3 (p = 0.573), so the model gives no evidence of a connection between player quality and severity of injury, but it is interesting to note that severity appears lower for 2025.

I took advantage of highcharter’s interactive tooltips to have the mouseover display the player’s name, position (the more detailed one from the roster), team and injury year, as well as ATOI (in the original mm:ss for readability), games missed, and games played.
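The tooltip shows ATOI as mm:ss while the x-axis uses decimal minutes. A minimal sketch of how such a conversion could be done in base R; the helper name `atoi_to_min` is hypothetical and this is not necessarily how ATOI_min was built in my wrangling step.

```r
# Hypothetical helper: convert "mm:ss" strings to decimal minutes
atoi_to_min <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  # minutes + seconds / 60 for each "mm:ss" string
  vapply(parts,
         function(p) as.numeric(p[1]) + as.numeric(p[2]) / 60,
         numeric(1))
}

# Example values invented for illustration
example_min <- atoi_to_min(c("18:30", "09:45"))
print(example_min)
```

Here "18:30" becomes 18.5 minutes and "09:45" becomes 9.75 minutes, which is the scale used on the chart's x-axis.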

References

Benson, B. W., et al. “The Impact of Face Shield Use on Concussions in Ice Hockey: A Multivariate Analysis.” British Journal of Sports Medicine, vol. 36, no. 1, Feb. 2002, pp. 27–32. Original Articles. bjsm.bmj.com, https://doi.org/10.1136/bjsm.36.1.27.

CBC. “Use of Visors in NHL on the Rise: Report.” CBC Sports, 18 Oct. 2005. CBC.ca, https://www.cbc.ca/sports/hockey/use-of-visors-in-nhl-on-the-rise-report-1.528068.

Halford, Mike. “NHL Makes Visors Mandatory for New Players.” NBC Sports, 4 June 2013, https://www.nbcsports.com/nhl/news/nhl-makes-visors-mandatory-for-new-players.

Miller, Michael. Pittsburgh Penguins Defensemen Kris Letang. 9 Dec. 2017, Own work. Wikimedia Commons, https://commons.wikimedia.org/wiki/File:Kris_Letang_2017-12-09_17960_%281%29.jpg.

Johnston, Chris. “Why 4 NHL Players Still Don’t Wear Visors, and Who Will Be the Last Standing?” The New York Times, 13 Mar. 2026. NYTimes.com, https://www.nytimes.com/athletic/7108800/2026/03/12/nhl-players-visor-holdouts/.

Associated Press. “Avs Star MacKinnon Bloodied after Puck to Face from Teammate.” ESPN.com, 12 May 2026, https://www.espn.com/nhl/story/_/id/48746176/avs-star-mackinnon-bloodied-puck-face-teammate.

Trettenero, Brady. “NHL Players Without Visors (2024).” Gino Hard, 5 Apr. 2023, https://www.ginohard.com/nhl-visorless-players/.