Introduction

The National Basketball Association (NBA) has evolved significantly over the past decades, with new trends and standout performances emerging every season. In this report, I analyze five key areas using historical NBA data (2003–2022) to uncover insights:

  1. Underrated Scorers Over the Years: Some players have posted high scoring averages but were overlooked in MVP discussions because their teams lacked success. I explore these players and their impact.

  2. Home-Court Advantage Trends: Home teams historically win about 60% of games. I analyze whether this trend has changed over time.

  3. Defensive Impact Players: Not all great defenders receive awards. I examine players with high steals and blocks but little formal recognition.

  4. Teams with the Most Close Game Wins: Clutch performance defines strong teams. I identify those that excelled in games decided by five points or fewer.

  5. Evolution of Three-Point Shooting: The reliance on three-point shots has increased dramatically. I track this evolution over time.

1. Underrated Scorers Over the Years

Findings: The bar chart shows five prolific scoring seasons by players who did not win MVP. Tracy McGrady’s 32.1 PPG in 2003 is the highest, followed by Bradley Beal’s 31.3 (2021), Dwyane Wade’s 30.2 (2009), Damian Lillard’s 30.0 (2020), and Carmelo Anthony’s 28.7 (2013). In each case, the MVP award that season went to a player on a team with a better record or a player with stronger all-around numbers. This suggests that scoring alone, while important, is not the sole criterion for MVP: team success and other contributions weigh heavily. These players were “underrated” in the MVP race despite their scoring prowess.

library(ggplot2)

# High-scoring seasons by players who did not win MVP that year
underrated_players <- data.frame(
  Player = c("Tracy McGrady (2003)", "Dwyane Wade (2009)",
             "Carmelo Anthony (2013)", "Damian Lillard (2020)", "Bradley Beal (2021)"),
  PPG = c(32.1, 30.2, 28.7, 30.0, 31.3)
)

# Horizontal bar chart, sorted by PPG, with value labels
ggplot(underrated_players, aes(x = reorder(Player, PPG), y = PPG, fill = Player)) +
  geom_col() +
  geom_text(aes(label = PPG), hjust = -0.15, size = 2.85) +
  coord_flip() +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "Top Scoring Seasons by Players without MVP Awards",
       x = "Player (Season)", y = "Points Per Game") +
  theme_minimal() +
  theme(legend.position = "none")

3. Defensive Impact Players

Findings: The scatter plot highlights players who averaged roughly 1.5+ steals and 1.5+ blocks per game, a rare combination of defensive stats. Notable names include Andrei Kirilenko (around 1.9 SPG and 2.8 BPG at his peak) and Shawn Marion (about 2.0 SPG and 1.7 BPG in 2006). Both filled the stat sheet defensively; in 2004 Kirilenko was the only player to rank in the top five in both steals and blocks per game (1.9 and 2.8). Gerald Wallace, Josh Smith, Nerlens Noel, DeMarcus Cousins, and Andre Drummond also appear in this elite quadrant of the chart. None of these players ever won a Defensive Player of the Year award, and only a few made All-Defensive Teams in their careers. For instance, Marion led his team in both steals and blocks in 2005-06 but was never named DPOY. This shows that although they all had significant defensive impact as measured by steals and blocks, factors such as team defense and traditional metrics (like defensive rebounds or opponent FG%) may have overshadowed their contributions. Some players can be defensive anchors and playmakers without receiving top-tier defensive accolades.

library(ggplot2)

# Steals and blocks per game for each highlighted player
defensive_players <- data.frame(
  Player = c("Andrei Kirilenko", "Shawn Marion", "Gerald Wallace",
             "Josh Smith", "Nerlens Noel",
             "DeMarcus Cousins", "Andre Drummond"),
  StealsPG = c(1.9, 2.0, 2.0, 1.5, 1.6, 1.5, 1.7),
  BlocksPG = c(2.8, 1.7, 1.6, 2.0, 1.7, 1.6, 1.7)
)

# Scatter plot with each point labeled by player name
ggplot(defensive_players, aes(x = StealsPG, y = BlocksPG)) +
  geom_point(color = "lightblue", size = 3) +
  geom_text(aes(label = Player), vjust = -0.5, hjust = 0.5, size = 4) +
  labs(title = "Steals and Blocks per Game: Underrated Defensive Stars",
       x = "Steals per Game", y = "Blocks per Game") +
  theme_minimal()
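
The "1.5+ steals and 1.5+ blocks" cutoff described in the findings can be checked directly against the plotted values. The following is a minimal sketch in base R, not part of the original analysis: the `threshold` variable and the `MeetsBoth` and `Stocks` (steals plus blocks) columns are illustrative names introduced here, and the data frame simply repeats the figures from this section.

```r
# Per-game figures copied from the data frame in this section
defensive_players <- data.frame(
  Player = c("Andrei Kirilenko", "Shawn Marion", "Gerald Wallace",
             "Josh Smith", "Nerlens Noel",
             "DeMarcus Cousins", "Andre Drummond"),
  StealsPG = c(1.9, 2.0, 2.0, 1.5, 1.6, 1.5, 1.7),
  BlocksPG = c(2.8, 1.7, 1.6, 2.0, 1.7, 1.6, 1.7),
  stringsAsFactors = FALSE
)

# Assumed cutoff, taken from the "roughly 1.5+" wording in the findings
threshold <- 1.5

# Flag players who meet the cutoff in both categories
defensive_players$MeetsBoth <- defensive_players$StealsPG >= threshold &
  defensive_players$BlocksPG >= threshold

# Combined steals + blocks per game (illustrative summary column)
defensive_players$Stocks <- defensive_players$StealsPG + defensive_players$BlocksPG

# Rank players from highest to lowest combined total
ranked <- defensive_players[order(-defensive_players$Stocks), ]
print(ranked)
```

Every listed player clears the cutoff in both categories, and Kirilenko ranks first on the combined total because of his 2.8 blocks per game.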

4. Teams with the Most Close Game Wins

Findings: The chart ranks the 10 teams with the most close-game wins (games decided by five points or fewer) from 2003–2022. The Boston Celtics lead with 292 wins, followed by the Dallas Mavericks (287), Miami Heat (278), and Houston Rockets and Los Angeles Lakers (267 each). The Portland Trail Blazers (266), Cleveland Cavaliers (264), San Antonio Spurs (264), Denver Nuggets (261), and Brooklyn Nets (259) complete the list. These teams performed well under pressure across different eras, often behind star players known for clutch performances. A high number of close wins indicates strong late-game execution, but it can also reflect a team that simply plays many tight games rather than winning comfortably.

library(dplyr)
library(ggplot2)
library(viridis)  # provides scale_fill_viridis()

# Keep games decided by five points or fewer and record the winning team
close_games <- games %>%
  filter(abs(PTS_home - PTS_away) <= 5) %>%
  mutate(WinnerID = ifelse(HOME_TEAM_WINS == 1, HOME_TEAM_ID, VISITOR_TEAM_ID))

# Count close-game wins per team and attach team nicknames
close_wins <- close_games %>%
  group_by(WinnerID) %>%
  summarise(Wins = n()) %>%
  inner_join(teams %>% select(TEAM_ID, NICKNAME), by = c("WinnerID" = "TEAM_ID")) %>%
  arrange(desc(Wins))

top_close <- head(close_wins, 10)

# Horizontal bar chart; reorder by Wins so the leader appears at the top after coord_flip()
ggplot(top_close, aes(x = reorder(NICKNAME, Wins), y = Wins, fill = NICKNAME)) +
  geom_col() +
  geom_text(aes(label = Wins), vjust = 0.2, hjust = -0.25, size = 2.5) +
  coord_flip() +
  scale_fill_viridis(discrete = TRUE, option = "C") +
  labs(title = "Teams with Most Wins in Games Decided by ≤5 Points (2003–2022)",
       x = "Team", y = "Number of Close-Game Wins") +
  theme_minimal() +
  theme(legend.position = "none")
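
The findings note that a raw count of close wins can reflect how often a team plays tight games as well as how well it closes them. One way to separate the two is a close-game win percentage (close wins divided by close games played). The following is a minimal sketch in base R under stated assumptions: the `games` data frame here is a made-up toy example that only reuses the column names from the analysis, and `ClosePlayed`, `CloseWins`, and `CloseWinPct` are illustrative names that do not appear in the original.

```r
# Toy game log with made-up values; column names follow the analysis above
games <- data.frame(
  HOME_TEAM_ID    = c(1, 2, 3, 1, 2, 3),
  VISITOR_TEAM_ID = c(2, 3, 1, 3, 1, 2),
  PTS_home        = c(101, 99, 110, 95, 120, 100),
  PTS_away        = c(98, 103, 108, 97, 100, 104),
  HOME_TEAM_WINS  = c(1, 0, 1, 0, 1, 0)
)

# Keep games decided by five points or fewer and record the winner
close <- games[abs(games$PTS_home - games$PTS_away) <= 5, ]
close$WinnerID <- ifelse(close$HOME_TEAM_WINS == 1,
                         close$HOME_TEAM_ID, close$VISITOR_TEAM_ID)

# Each close game counts once for the home team and once for the visitor
played <- table(c(close$HOME_TEAM_ID, close$VISITOR_TEAM_ID))

# Count wins on the same set of teams so teams with zero wins are kept
wins <- table(factor(close$WinnerID, levels = names(played)))

close_summary <- data.frame(
  TEAM_ID     = as.integer(names(played)),
  ClosePlayed = as.integer(played),
  CloseWins   = as.integer(wins)
)
close_summary$CloseWinPct <- close_summary$CloseWins / close_summary$ClosePlayed

# Sort by win percentage rather than raw win count
close_summary <- close_summary[order(-close_summary$CloseWinPct), ]
print(close_summary)
```

Ranking by `CloseWinPct` alongside the raw count would show whether a team near the top of the chart wins a large share of its close games or simply plays many of them.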

5. Evolution of Three-Point Shooting

Findings: The upward trend in the chart is clear. In the early 2000s, NBA teams attempted approximately 15–18 three-pointers per game on average. The rate increased modestly in the mid-2000s and then began rising more rapidly around 2013. By the late 2010s, the average team was attempting more than 30 threes per game, and by 2018–19 the league average was about 32 attempts per game. Overall, attempts rose from approximately 15 per game in 2003 to approximately 35 by 2020, more than doubling the three-point attempt rate. This aligns with a strategic shift: teams learned that the extra point on a made three can compensate for a lower shooting percentage. The shift was spurred by successful three-point-oriented teams (e.g., the Golden State Warriors’ championship years and analytics-driven strategies by teams such as the Houston Rockets). The plateau in the mid-30s during 2020–2022 may indicate that the rate of increase is leveling off as teams balance their shot selection. The NBA’s reliance on the three-pointer has grown substantially, and the shot is now a centerpiece of modern offense in a way it was not in the early 2000s.

library(dplyr)
library(ggplot2)

# Compute average 3-point attempts per team per game by season
three_trend <- games_details %>%
  group_by(SEASON, GAME_ID, TEAM_ID) %>%
  # .groups = "drop" returns ungrouped output and avoids the summarise() grouping message
  summarise(team3PA = sum(FG3A, na.rm = TRUE), .groups = "drop") %>%
  group_by(SEASON) %>%
  summarise(Avg3PA = mean(team3PA))

# Line chart with labels (linewidth replaces size for lines in ggplot2 >= 3.4.0)
ggplot(three_trend, aes(x = SEASON, y = Avg3PA)) +
  geom_line(color = "purple", linewidth = 1.2) +
  geom_point(color = "black", size = 2) +
  geom_text(aes(label = round(Avg3PA, 1)), vjust = -0.5, size = 3.25) +
  labs(title = "Rise of Three-Point Attempts Over Time (2003–2022)",
       x = "Season", y = "Avg 3-Point Attempts per Team per Game") +
  theme_minimal()
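
The "more than doubling" figure in the findings is simple arithmetic on the approximate endpoints quoted there (about 15 attempts per game in 2003 and about 35 by 2020). The following sketch works through that calculation and, as an added assumption, converts it to an implied average annual growth rate by treating the increase as a constant compound rate. The variable names are illustrative, and the inputs are the rounded values from the text rather than the exact `Avg3PA` values.

```r
# Approximate endpoints quoted in the findings (rounded, not exact Avg3PA values)
start_3pa <- 15
end_3pa   <- 35
n_years   <- 2020 - 2003

# Overall growth multiple between the two seasons
growth_ratio <- end_3pa / start_3pa

# Implied average annual growth, assuming a constant compound rate
annual_rate <- growth_ratio^(1 / n_years) - 1

cat(sprintf("Growth ratio: %.2fx\n", growth_ratio))
cat(sprintf("Implied average annual growth: %.1f%%\n", 100 * annual_rate))
```

With these rounded inputs the ratio is about 2.33 and the implied rate is roughly 5% per year, although the chart suggests the increase was concentrated after 2013 rather than spread evenly.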