Assignment 7

Introduction

This assignment analyzes shifts in team-level NBA statistics since 2010 and examines the assumption that “you need an elite offense to find success.” It also explores which metrics may determine whether a team reaches the postseason.

Prompt for Investigation:

Have scoring trends changed since 2010, and if so, which metrics are associated with team success?

Loading Libraries

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.6
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.1     ✔ tibble    3.3.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(rvest)

Attaching package: 'rvest'

The following object is masked from 'package:readr':

    guess_encoding
library(xml2)
library(stringr)
library(readr)

Data Sourcing

All of the data used comes from Basketball Reference (basketball-reference.com) and was ethically web scraped from the site's tables.
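The scraping code itself is not shown in this report, so the following is only a minimal sketch of how a stats table can be turned into a data frame with rvest. The table id (`team-stats`), the team names, the values, and the `season` tag are all made-up placeholders, and the HTML is built inline so the example runs without touching the live site. For a real page, `read_html()` would take the page URL instead, with a pause between requests.

```r
library(rvest)

# Hypothetical stand-in for a per-game team table. The id, teams, and
# numbers are invented for illustration and are not real NBA values.
page <- minimal_html('
  <table id="team-stats">
    <thead><tr><th>Rk</th><th>Team</th><th>3PA</th><th>PTS</th></tr></thead>
    <tbody>
      <tr><td>1</td><td>Team A</td><td>35.1</td><td>115.2</td></tr>
      <tr><td>2</td><td>Team B</td><td>33.4</td><td>112.8</td></tr>
    </tbody>
  </table>
')

# Select the table by its (hypothetical) id and parse it into a tibble.
scraped <- page %>%
  html_element("table#team-stats") %>%
  html_table()

# Tag the rows with a season so tables from several years can be stacked.
scraped$season <- 2024

scraped
```

Repeating this for each season and binding the results together would give one long table with a `season` column, which is the shape the charts below expect.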

The link to my dataset is provided here: https://myxavier-my.sharepoint.com/:x:/g/personal/skikunr_xavier_edu/IQDCg21NxgM6Rpvdfc_jGh6wATR0GhfITyrBeTRbCZEYJpw?download=1

Ethics Statement

In compliance with robots.txt protocols and terms of service, I conducted a non-profit educational study of NBA scoring trends from 2010 to 2026. The data was ethically sourced from Sports Reference to analyze long-term shifts in team offensive performance.

Three-Point Conversion Rate: Playoff vs Non-Playoff Teams

nba_df <- read_csv("~/Desktop/OneDrive - Xavier University/BAIS 462 Spring 2026/nba_df.csv")
Rows: 527 Columns: 27
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (1): Team
dbl (26): Rk, G, MP, FG, FGA, FG%, 3P, 3PA, 3P%, 2P, 2PA, 2P%, FT, FTA, FT%,...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
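nba_df %>%
  # Exclude the "League Average" rows so they are not averaged into either group
  filter(!str_detect(Team, regex("League", ignore_case = TRUE))) %>%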
  mutate(`3P%` = as.numeric(str_remove(`3P%`, "%"))) %>%
  group_by(season, playoffs) %>%
  summarise(
    avg_3p = mean(`3P%`, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  ggplot(aes(x = season, y = avg_3p, color = factor(playoffs))) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 1) +
  labs(
    title = "3P% Over Time: Playoff vs Non-Playoff Teams (2010–2026)",
    x = "Season",
    y = "Average 3P%",
    color = "Playoffs (1 = Yes, 0 = No)"
  ) +
  theme_minimal()

As the chart above shows, playoff teams have consistently shot the three-pointer at a higher percentage on average; however, the gap appears to have narrowed since 2024. This could be due to pace-of-play and style changes that favor teams who draw fouls on their opponents and excel at the free throw line.
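To check the gap numerically rather than by eye, the per-season difference between the two groups can be computed directly. This is a rough sketch: `three_pt_gap()` is a hypothetical helper, it assumes `playoffs` is coded 0/1 and that league-average rows have already been removed, and the toy rows are invented for illustration only.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: playoff minus non-playoff average 3P% per season.
three_pt_gap <- function(df) {
  df %>%
    filter(!is.na(playoffs)) %>%
    group_by(season, playoffs) %>%
    summarise(avg_3p = mean(`3P%`, na.rm = TRUE), .groups = "drop") %>%
    # One column per group: playoffs_0 and playoffs_1
    pivot_wider(names_from = playoffs, values_from = avg_3p,
                names_prefix = "playoffs_") %>%
    mutate(gap = playoffs_1 - playoffs_0) %>%
    arrange(season)
}

# Made-up toy rows (not real NBA values), just to show the output shape.
toy <- tibble(
  season   = c(2023, 2023, 2024, 2024),
  playoffs = c(1, 0, 1, 0),
  `3P%`    = c(0.37, 0.35, 0.36, 0.355)
)

gap_tbl <- three_pt_gap(toy)
gap_tbl
```

Calling `three_pt_gap(nba_df)` on the real data would return one row per season; a shrinking `gap` column in the most recent seasons would support the narrowing described above.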

League Average Scoring & Shooting Summary

nba_df <- read_csv("~/Library/CloudStorage/OneDrive-XavierUniversity/BAIS 462 Spring 2026/nba_df.csv")
nba_df %>%
  filter(season >= 2010, season <= 2026) %>%
  filter(str_detect(Team, regex("League", ignore_case = TRUE))) %>%
  mutate(`3P%` = as.numeric(str_remove(`3P%`, "%"))) %>%
  select(season, `3P%`, `3PA`, PTS) %>%
  pivot_longer(cols = -season, names_to = "metric", values_to = "value") %>%
  ggplot(aes(x = season, y = value)) +
  geom_line(linewidth = 1.2, color = "steelblue") +
  geom_point(size = 2, color = "steelblue") +
  facet_wrap(~metric, scales = "free_y", ncol = 1) +
  labs(
    title = "NBA League Averages (2010–2026)",
    x = "Season",
    y = "Value"
  ) +
  theme_minimal()

The chart above shows league-average trends for three key metrics: three-pointers attempted per game (3PA), three-point percentage (3P%), and points per game (PTS). Although three-point percentage has not risen in a steady, linear way, both attempts and points per game have increased consistently. This relationship makes sense: a three-pointer is worth more than a two-point field goal, so points per game would naturally rise as attempts increase, provided accuracy holds roughly steady.
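One way to put a number on the relationship between attempts and scoring is a simple Pearson correlation across the league-average rows. The sketch below is illustrative only: `league_correlation()` is a hypothetical helper, and the toy values are invented. A high correlation would be consistent with the reasoning above but would not by itself show that attempts cause the scoring increase, since other factors such as pace also change over time.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: correlation between league-average 3PA and PTS.
league_correlation <- function(df) {
  league <- df %>%
    filter(str_detect(Team, regex("League", ignore_case = TRUE)))
  cor(league$`3PA`, league$PTS, use = "complete.obs")
}

# Made-up toy rows (not real NBA values). The non-league row is ignored.
toy <- tibble(
  Team   = c("League Average", "League Average", "League Average",
             "League Average", "Team A"),
  season = c(2010, 2011, 2012, 2013, 2013),
  `3PA`  = c(18, 20, 22, 24, 30),
  PTS    = c(100, 101, 102, 103, 90)
)

r <- league_correlation(toy)
r
```

On the real data the call would be `league_correlation(nba_df)`.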

League Average Points Per Game

nba_df <- read_csv("~/Library/CloudStorage/OneDrive-XavierUniversity/BAIS 462 Spring 2026/nba_df.csv")
nba_df %>%
  filter(str_detect(Team, regex("League", ignore_case = TRUE))) %>%
  ggplot(aes(x = season, y = PTS)) +
  geom_line(linewidth = 1.2, color = "red") +
  geom_point(size = 2, color = "black") +
  labs(
    title = "NBA League Average Points Per Game Over Time",
    x = "Season",
    y = "Points Per Game (PPG)"
  ) +
  theme_minimal()

This chart repeats the points-per-game series from the chart above on its own, larger scale so that visual anomalies are easier to spot, such as the sharp declines in 2012-2013 and 2022-2023. One possible explanation is that each dip marks the start of a new offensive era: scoring drops during the adjustment period, then recovers as more teams adopt the new style and lift the league average.
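The dips can also be located without reading them off the chart by computing the season-over-season change in league-average points per game. This is a minimal sketch: `ppg_change()` is a hypothetical helper, and the toy values are invented for illustration.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: year-over-year change in league-average PPG.
ppg_change <- function(df) {
  df %>%
    filter(str_detect(Team, regex("League", ignore_case = TRUE))) %>%
    arrange(season) %>%
    # lag() gives the previous season's value; the first season has none (NA)
    mutate(ppg_change = PTS - lag(PTS)) %>%
    select(season, PTS, ppg_change)
}

# Made-up toy rows (not real NBA values).
toy <- tibble(
  Team   = rep("League Average", 3),
  season = c(2011, 2012, 2013),
  PTS    = c(100, 98, 101)
)

changes <- ppg_change(toy)
changes
```

Sorting the output of `ppg_change(nba_df)` by `ppg_change` would list the largest single-season drops first.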

Three-Point Attempts for Playoff vs Non-Playoff Teams

nba_df <- read_csv("~/Library/CloudStorage/OneDrive-XavierUniversity/BAIS 462 Spring 2026/nba_df.csv")
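nba_df %>%
  # Exclude the "League Average" rows so they are not averaged into either group
  filter(!str_detect(Team, regex("League", ignore_case = TRUE))) %>%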
  group_by(season, playoffs) %>%
  summarise(
    avg_3PA = mean(`3PA`, na.rm = TRUE),
    .groups = "drop") %>%
  ggplot(aes(x = factor(season), y = avg_3PA)) +
  geom_col(fill = "orange") +
  facet_wrap(~playoffs, labeller = labeller(
    playoffs = c("0" = "Non-Playoff Teams", "1" = "Playoff Teams")
  )) +
  labs(
    title = "3PT Attempts Over Time: Playoff vs Non-Playoff Teams",
    x = "Season",
    y = "Average 3PA"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

The chart above compares the average number of three-pointers attempted by playoff and non-playoff teams in each season. Interestingly, although the two groups differ noticeably in three-point percentage (seen in an earlier graph), they attempt virtually the same number of threes. This points to one of two conclusions: either shooting efficiency separates the two groups, or other factors, such as free throw rate and free throw percentage, supply the missing piece of the puzzle.
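The claim that the two groups attempt about the same number of threes could be checked more formally with a two-sample comparison within a single season. The sketch below uses a Welch t-test, which is a technique swapped in for illustration and not part of the original analysis. `compare_3pa()` is a hypothetical helper, it assumes `playoffs` is coded 0/1 with league-average rows removed, and the toy values are invented.

```r
library(dplyr)

# Hypothetical helper: Welch t-test of 3PA between playoff and
# non-playoff teams within one season.
compare_3pa <- function(df, yr) {
  one_season <- filter(df, season == yr, !is.na(playoffs))
  t.test(`3PA` ~ playoffs, data = one_season)
}

# Made-up toy rows (not real NBA values): both groups average 35 attempts.
toy <- tibble(
  season   = rep(2024, 6),
  playoffs = c(0, 0, 0, 1, 1, 1),
  `3PA`    = c(33, 35, 37, 34, 35, 36)
)

result <- compare_3pa(toy, 2024)
result$estimate  # group means
result$p.value   # a large p-value gives no evidence of a difference
```

Testing one season at a time avoids mixing early low-volume seasons with recent high-volume ones, which would inflate the variance within each group.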

Free Throw % for Playoff vs Non-Playoff Teams

nba_df <- read_csv("~/Library/CloudStorage/OneDrive-XavierUniversity/BAIS 462 Spring 2026/nba_df.csv")
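nba_df %>%
  # Exclude the "League Average" rows so they are not averaged into either group
  filter(!str_detect(Team, regex("League", ignore_case = TRUE))) %>%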
  group_by(season, playoffs) %>%
  summarise(
    avg_FTP = mean(`FT%`, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  ggplot(aes(x = factor(season), y = avg_FTP)) +
  geom_col(fill = "darkgreen") +
  coord_cartesian(ylim = c(0.7, 0.8)) +
  facet_wrap(~playoffs, labeller = labeller(
    playoffs = c("0" = "Non-Playoff Teams", "1" = "Playoff Teams")
  )) +
  labs(
    title = "Free Throw Percentage Over Time: Playoff vs Non-Playoff Teams",
    x = "Season",
    y = "Free Throw %"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Building on the previous visual, the picture becomes clearer. When teams are split into those that made the playoffs and those that did not, both groups follow a similar shape over time, but efficiency stands out as a strong candidate for what separates playoff teams from the rest. In a league where narrow margins can decide a season, scoring efficiency can make or break a team's postseason hopes.

Conclusion

The evidence gathered from the 2010 through 2026 seasons suggests that the NBA has moved into an era of unprecedented offensive optimization. While the data highlights significant increases in three-point attempts and points per game, these trends must be contextualized within broader qualitative changes, such as shifting defensive philosophies and officiating standards that favor offensive flow. The phasing out of traditional post-centric roles in favor of hybrid, multi-dimensional players reflects a league-wide move toward versatility. Consequently, the ‘Modern 5’ has become the definitive symbol of this era, bridging the gap between historical physical dominance and contemporary technical skill.