# Create variables for average goals per game and winning percentage
library(dplyr)

team_data <- team_data %>%
  group_by(Team) %>%
  mutate(avg_goals_per_game = GF / GP, # Goals For / Games Played
         win_pct = W / GP) %>%          # Number of Wins / Total Number of Games
  # Assign each team to its conference
  mutate(Conference = case_when(
    Team %in% c(
      "Anaheim Ducks", "Arizona Coyotes", "Calgary Flames",
      "Chicago Blackhawks", "Colorado Avalanche", "Dallas Stars",
      "Edmonton Oilers", "Los Angeles Kings", "Minnesota Wild",
      "Nashville Predators", "San Jose Sharks", "Seattle Kraken",
      "St. Louis Blues", "Vancouver Canucks", "Vegas Golden Knights",
      "Winnipeg Jets", "Utah Hockey Club"
    ) ~ "Western",
    Team %in% c(
      "Boston Bruins", "Buffalo Sabres", "Carolina Hurricanes",
      "Columbus Blue Jackets", "Detroit Red Wings", "Florida Panthers",
      "Montréal Canadiens", "New Jersey Devils",
      "New York Islanders", "New York Rangers", "Ottawa Senators",
      "Philadelphia Flyers", "Pittsburgh Penguins",
      "Tampa Bay Lightning", "Toronto Maple Leafs",
      "Washington Capitals"
    ) ~ "Eastern"
  )) %>%
  # Assign each team to its division
  mutate(Division = case_when(
    Team %in% c(
      "Boston Bruins", "Buffalo Sabres", "Detroit Red Wings",
      "Florida Panthers", "Montréal Canadiens", "Ottawa Senators",
      "Tampa Bay Lightning", "Toronto Maple Leafs"
    ) ~ "Atlantic",
    Team %in% c(
      "Carolina Hurricanes", "Columbus Blue Jackets",
      "New Jersey Devils", "New York Islanders", "New York Rangers",
      "Philadelphia Flyers", "Pittsburgh Penguins",
      "Washington Capitals"
    ) ~ "Metropolitan",
    Team %in% c(
      "Arizona Coyotes", "Chicago Blackhawks", "Colorado Avalanche",
      "Dallas Stars", "Minnesota Wild", "Nashville Predators",
      "St. Louis Blues", "Winnipeg Jets", "Utah Hockey Club"
    ) ~ "Central",
    Team %in% c(
      "Anaheim Ducks", "Calgary Flames", "Edmonton Oilers",
      "Los Angeles Kings", "San Jose Sharks", "Seattle Kraken",
      "Vancouver Canucks", "Vegas Golden Knights"
    ) ~ "Pacific"
  ))
Assignment 7
Analysis of NHL Goals from the 2021-2025 Seasons
Introduction
Hockey is a fast-paced game in which many skilled players must work together strategically to succeed. Strategies for succeeding in hockey are built on knowledge of the game, instincts, and data. Looking at National Hockey League (NHL) data over a period of time can bring trends to the forefront that cannot be discovered in a single season’s data. The question investigated in this analysis is:
Is there a relationship between a team’s average goals per game and its winning percentage?
This question will be answered in several steps, starting with getting the data. The first step is to scrape a data table of team stats from NHL.com and then clean it to ensure all data is in the correct form. From there, data wrangling transforms the data to gain more insights. Lastly, a visual will be created to answer the question. The analysis uses data from the 2021-2025 regular seasons (4 seasons), but it could be extended with additional seasons to see whether the trend holds over a longer period of time.
Steps
1. Scrape Data
Good practice for web scraping is to self-identify to the website before scraping any data. This means telling the website your name and a way to contact you if they have any questions about what you are doing with the data you gathered. When self-identifying to NHL.com, however, the data cannot be scraped, because the page only returns data after confirming that the user is capable of receiving ads. To retrieve the data, an automated browser needs to be used.
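A minimal sketch of both ideas is shown here. It is not necessarily the code used for this report. It assumes the rvest and httr packages (rvest 1.0.4 or later for `read_html_live()`, which drives a headless Chrome through chromote). The contact string, the helper names, and the `"table"` selector are hypothetical, and the page URL is left as a parameter because the exact address is not given above.

```r
library(rvest)

# Hypothetical contact details; replace with your own name and email
contact <- "Jane Doe (jane.doe@example.com), student project"

# Self-identified request: the user agent string tells the site who is asking.
# This suits static pages, but not pages that build their tables in the browser.
polite_session <- function(url) {
  session(url, httr::user_agent(contact))
}

# Automated browser: load the page in headless Chrome so that content rendered
# by JavaScript is present, then read the first table on the page.
# The "table" selector is an assumption about the page layout, and a short
# Sys.sleep() before html_element() may be needed if the table loads slowly.
scrape_first_table <- function(url) {
  page <- read_html_live(url)
  page %>%
    html_element("table") %>%
    html_table()
}
```

Calling `scrape_first_table()` with the address of the team stats page would return a tibble, provided the table has finished rendering.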
2. Data Wrangling
Now that we have gathered the data we want to use, we have to ensure all data types were correctly assigned and create any new variables needed for the visual.
All data is in the correct format, so the only data wrangling needed to answer this question is to create variables for the average number of goals per game played and the winning percentage. To take the analysis further, two additional variables will be created. The conference and division variables will allow us to see if there is a relationship between average number of goals scored per game and winning percentage at the league, conference, and divisional levels.
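For illustration, the type check and the two derived variables can be sketched on a small table. The rows below are made up and are not NHL data. The text-to-number conversion shows what would be needed if the counts had been scraped as text, which was not the case here.

```r
library(dplyr)

# Made-up rows for illustration only; these are not real NHL figures
toy <- tibble(
  Team = c("Team A", "Team B"),
  GP   = c("82", "82"),   # counts deliberately stored as text
  W    = c("50", "30"),
  GF   = c("280", "210")
)

toy_clean <- toy %>%
  mutate(across(c(GP, W, GF), as.numeric)) %>%  # convert text counts to numbers
  mutate(avg_goals_per_game = GF / GP,          # Goals For / Games Played
         win_pct = W / GP)                      # Wins / Games Played

glimpse(toy_clean)  # lists every column with its type
```

`glimpse()` prints each column with its type, which makes it easy to confirm that the count columns are numeric before dividing.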
3. Create Visual
Next, we create a scatter plot of each team’s average number of goals per game against its winning percentage.
ggplot(team_data, aes(x = avg_goals_per_game, y = win_pct, color = Conference)) +
  geom_point(size = 4) + # Size of points
  geom_smooth(method = "lm", color = "navy") + # Regression line
  scale_y_continuous(labels = scales::percent) + # % labels
  labs(
    title = "Relationship Between Goals Per Game and Win Percentage",
    x = "Average Goals Per Game",
    y = "Winning Percentage"
  )
`geom_smooth()` using formula = 'y ~ x'
ggplot(team_data, aes(x = avg_goals_per_game, y = win_pct, color = Division)) +
  geom_point(size = 4) + # Size of points
  geom_smooth(method = "lm", color = "navy") + # Regression line
  scale_y_continuous(labels = scales::percent) + # % labels
  labs(
    title = "Relationship Between Goals Per Game and Win Percentage",
    x = "Average Goals Per Game",
    y = "Winning Percentage"
  )
`geom_smooth()` using formula = 'y ~ x'
Analysis
The first visual, a scatter plot of each team’s average number of goals scored per game against its winning percentage, colored by conference, shows a positive correlation. The points are fairly evenly spread, with a few more above the regression line than below. Looking at each conference, the Western Conference is more tightly clustered around the trend line, whereas the Eastern Conference is more spread out. There are more Western Conference teams below a 40% winning percentage and around 2.5-2.6 average goals per game.
Looking at the same relationship by division, there is no trend or pattern tied to a particular division. Each division has teams both above and below the line at high and low average goals per game. From this we know that the relationship between average number of goals scored per game and winning percentage differs slightly between conferences but not noticeably between divisions.
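One way to put numbers on these visual comparisons is to compute the Pearson correlation overall and within each group. The sketch below is not part of the original analysis. It uses made-up values so that it runs on its own, with column names mirroring those in `team_data`. Swapping `Conference` for `Division` gives the division-level version.

```r
library(dplyr)

# Made-up values for illustration only; these are not real NHL figures
toy <- tibble(
  Conference         = rep(c("Eastern", "Western"), each = 4),
  avg_goals_per_game = c(2.6, 2.9, 3.1, 3.4, 2.5, 2.8, 3.2, 3.5),
  win_pct            = c(0.40, 0.48, 0.55, 0.60, 0.38, 0.47, 0.56, 0.63)
)

# League-wide correlation between goals per game and winning percentage
overall_r <- cor(toy$avg_goals_per_game, toy$win_pct)

# The same correlation computed separately within each conference
cor_by_group <- toy %>%
  group_by(Conference) %>%
  summarise(r = cor(avg_goals_per_game, win_pct),
            n = n(),
            .groups = "drop")

cor_by_group
```

A value of `r` closer to 1 within a group corresponds to points sitting more tightly around an upward-sloping line, which is the pattern described for the Western Conference.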