*Set up the table and load libraries
library(ggplot2)
nhl <- read.csv("NHLteamstats.csv")
*Plotting the goal rate versus the situation
ggplot(nhl, aes(x = Situation, y = GF60)) + # Load the data and specify the axes
facet_grid(. ~ Situation, space = "free", scales = "free_x") + # Divide the grid up based on situation
geom_point() + # Add points
stat_summary(fun = mean, geom = "text", label = "----", size = 10, color = "black") +
# Mark the mean of each situation with a row of dashes (fun replaces fun.y, which is deprecated since ggplot2 3.3.0)
theme(axis.text.x = element_blank(), axis.ticks.x = element_blank()) + # Remove the x-axis ticks and labels
ylab("Goals per 60 minutes") # Change the y-axis title
*This plot shows that something is wrong with the data. The goal rate is low at even strength and high on the power play, as expected, but it is also high when the team is a man down. Inspecting the data shows that for 2008-2012 the rows labelled 4v5 actually contain 5v4 data. This will need to be fixed.
*Until then, use only the 5v5 stats.
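*One way to check which seasons are affected is to tabulate the mean goal rate for every year and situation. The helper below is a minimal sketch and not part of the original analysis: the name mean_rate_by_year is hypothetical, and it assumes the data frame has the Year, Situation and GF60 columns used in the plot above.

```r
# Hypothetical helper: mean goal rate for each Year x Situation combination.
# Uses only base R; assumes columns Year, Situation and GF60 exist.
mean_rate_by_year <- function(df) {
  # aggregate() returns one row per combination that occurs in the data
  out <- aggregate(GF60 ~ Year + Situation, data = df, FUN = mean)
  # Sort so that the seasons of one situation can be read top to bottom
  out[order(out$Situation, out$Year), ]
}
```

*Calling mean_rate_by_year(nhl) should list the seasons side by side. Seasons whose 4v5 mean sits near the 5v4 means rather than below the 5v5 means are the likely mislabelled ones.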
*Next we look at the relationship between shot accuracy and goal rate, in both even-strength and power-play situations.
ggplot(subset(nhl, Situation != "4v5"), aes(x = Sh., y = GF60, color = as.factor(Year))) +
# Load the data, remove 4v5 because of errors, use year for color
facet_grid(Year ~ Situation, space = "free", scales = "free_x") + # Facet the grid by situation and year
geom_point() + # Add points
geom_smooth(method = "lm") + # Add a linear regression to each
scale_colour_discrete(name = "Year") #Label the legend
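*To read the fitted lines as numbers rather than by eye, the slope of each panel's regression can be extracted with lm. This is a sketch under the same column-name assumptions as the plot (Year, Situation, Sh., GF60); the helper name slope_by_panel is hypothetical.

```r
# Hypothetical helper: slope of GF60 on Sh. for each Year x Situation panel.
# Mirrors the default y ~ x fit that geom_smooth(method = "lm") draws per panel.
slope_by_panel <- function(df) {
  # One data frame per combination that actually occurs in the data
  panels <- split(df, list(df$Year, df$Situation), drop = TRUE)
  data.frame(
    panel = names(panels),
    # coef()[2] is the slope term of the simple linear regression
    slope = vapply(
      panels,
      function(p) unname(coef(lm(GF60 ~ Sh., data = p))[2]),
      numeric(1)
    ),
    row.names = NULL
  )
}
```

*For example, slope_by_panel(subset(nhl, Situation != "4v5")) gives one slope per panel of the plot above.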