library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6      ✔ purrr   0.3.5 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.2      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(readxl)
data <- read_excel("../00_data/myData.xlsx")
## New names:
## • `` -> `...1`
data
## # A tibble: 4,810 × 24
##     ...1  rank position hand  player   years total…¹ status yr_st…² season   age
##    <dbl> <dbl> <chr>    <chr> <chr>    <chr>   <dbl> <chr>    <dbl> <chr>  <dbl>
##  1     1     1 C        Left  Wayne G… 1979…     894 Retir…    1979 1978-…    18
##  2     2     1 C        Left  Wayne G… 1979…     894 Retir…    1979 1978-…    18
##  3     3     1 C        Left  Wayne G… 1979…     894 Retir…    1979 1978-…    18
##  4     4     1 C        Left  Wayne G… 1979…     894 Retir…    1979 1979-…    19
##  5     5     1 C        Left  Wayne G… 1979…     894 Retir…    1979 1980-…    20
##  6     6     1 C        Left  Wayne G… 1979…     894 Retir…    1979 1981-…    21
##  7     7     1 C        Left  Wayne G… 1979…     894 Retir…    1979 1982-…    22
##  8     8     1 C        Left  Wayne G… 1979…     894 Retir…    1979 1983-…    23
##  9     9     1 C        Left  Wayne G… 1979…     894 Retir…    1979 1984-…    24
## 10    10     1 C        Left  Wayne G… 1979…     894 Retir…    1979 1985-…    25
## # … with 4,800 more rows, 13 more variables: team <chr>, league <chr>,
## #   season_games <dbl>, goals <dbl>, assists <dbl>, points <dbl>,
## #   plus_minus <chr>, penalty_min <dbl>, goals_even <chr>,
## #   goals_power_play <chr>, goals_short_handed <chr>, goals_game_winner <chr>,
## #   headshot <chr>, and abbreviated variable names ¹​total_goals, ²​yr_start

Which position scores the most goals on average?

The data and variables used in the analysis include which positions players played and how many goals and assists they scored in a given season.

Relevant R code

data %>%
    
    ggplot(aes(x = goals, y = position)) +
    geom_boxplot()

data %>%
    
    ggplot(aes(x = goals, y = position)) +
    geom_tile()

Based on the plots created above it looks as if the “C” and “RW” positions have the highest amount of goals on average.