Question 1
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 4.0.0 ✔ tibble 3.3.0
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.1.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
data1 <- read_csv("https://raw.githubusercontent.com/vaiseys/dav-course/refs/heads/main/Data/nfl_salaries.csv")
Rows: 800 Columns: 11
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (11): year, Cornerback, Defensive Lineman, Linebacker, Offensive Lineman...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Question 2
data2 <- data1 |>
  pivot_longer(cols = -year, names_to = "position", values_to = "salary")
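To make the reshaping step concrete, here is a minimal sketch on a tiny made-up table. The tibble `wide` and its values are hypothetical and only mimic the layout of the real file (one column per position); it assumes the `tidyr` and `tibble` packages are installed.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data in the same wide layout: one column per position.
wide <- tibble(
  year = c(2011, 2012),
  Quarterback = c(1.5e6, 2.0e6),
  Cornerback = c(8.0e5, 9.0e5)
)

# Every column except `year` is stacked into (position, salary) pairs,
# so 2 rows x 2 position columns become 4 rows.
long <- wide |>
  pivot_longer(cols = -year, names_to = "position", values_to = "salary")

print(long)
```

Each original row yields one output row per position column, which is the one-observation-per-row shape that the filtering and grouping steps in later questions rely on.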
Question 3
data3 <- data2 |>
  filter(position == "Quarterback")
data3 |>
  ggplot(aes(x = salary)) +
  geom_histogram(binwidth = 1e6, boundary = 0, closed = "left", na.rm = TRUE) +
  facet_wrap(~year, ncol = 3) +
  scale_x_continuous(
    breaks = scales::breaks_pretty(n = 6),
    labels = scales::label_number(scale = 1e-6, accuracy = 1)
  ) +
  labs(
    title = "Quarterback Salaries by Year",
    x = "Salary (M)",
    y = "Count"
  )
What patterns do you notice?
In every year from 2011 to 2019, the leftmost bar (salaries under $1 million) is the tallest, so the largest single group of quarterbacks, roughly 40%, earned less than $1 million. That percentage fluctuates from year to year.
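The share read off the histogram can also be computed directly. The sketch below uses a small hypothetical tibble `qb` with invented salaries; with the real data, `data3` would take its place. The cutoff `salary < 1e6` matches the first histogram bin, which covers [0, 1e6) because of `boundary = 0` and `closed = "left"`.

```r
library(dplyr)
library(tibble)

# Hypothetical toy salaries; not taken from the real dataset.
qb <- tibble(
  year = c(2011, 2011, 2011, 2011, 2012, 2012, 2012, 2012),
  salary = c(5e5, 8e5, 3e6, NA, 7e5, 2e6, 4e6, 1.2e7)
)

# Share of non-missing salaries below $1 million, per year.
# mean() of a logical vector gives the proportion of TRUE values.
share_under_1m <- qb |>
  filter(!is.na(salary)) |>
  group_by(year) |>
  summarize(share = mean(salary < 1e6), .groups = "drop")

print(share_under_1m)
```

Dropping missing salaries first keeps the denominator consistent with the histogram, which also ignores them via `na.rm = TRUE`.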
Question 4
data4 <- data2 |>
  group_by(year, position) |>
  summarize(
    avg_salary = mean(salary, na.rm = TRUE),
    .groups = "drop"
  ) |>
  arrange(year, position)
view(data4)
Question 5
data4 |>
  ggplot(aes(x = year, y = avg_salary, color = position)) +
  geom_line(linewidth = 1) +
  geom_point(size = 1.5) +
  scale_y_continuous(
    labels = scales::label_number(scale = 1e-6, accuracy = 0.1)
  ) +
  labs(
    title = "Average NFL Salaries by Position (2011–2019)",
    x = "Year",
    y = "Average Salary (Million $)",
    color = "Position"
  )
Trend 1: Quarterbacks earn higher salaries than almost all other positions in every year.
Trend 2: Safeties, special teamers, tight ends, and wide receivers lag far behind other positions in both salary levels and growth trends.
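One way to back up the growth comparison with numbers is to compute each position's percentage change between its first and last year. This is only a sketch: the tibble `avg` and its values are hypothetical stand-ins with the same columns as `data4`, and the `growth` table is an illustration rather than part of the assignment.

```r
library(dplyr)
library(tibble)

# Hypothetical averages in the same shape as `data4`; values are invented.
avg <- tibble(
  year = c(2011, 2019, 2011, 2019),
  position = c("Quarterback", "Quarterback", "Safety", "Safety"),
  avg_salary = c(3.0e6, 6.0e6, 1.5e6, 1.8e6)
)

# Percentage change in average salary from the earliest to the latest year,
# computed separately for each position and sorted from fastest to slowest.
growth <- avg |>
  group_by(position) |>
  arrange(year, .by_group = TRUE) |>
  summarize(
    first_avg = first(avg_salary),
    last_avg = last(avg_salary),
    pct_change = 100 * (last_avg - first_avg) / first_avg,
    .groups = "drop"
  ) |>
  arrange(desc(pct_change))

print(growth)
```

Positions near the bottom of such a table would be the ones with the slowest growth, which makes a claim like Trend 2 checkable against the line chart.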