2026-03-24

Dataset Overview and Source

All Seasons Dataset: An NBA dataset that tracks player stats from 1996 to 2022. This analysis uses only the 1997-98 season. The variables used are:

  • player_name: Player’s Name
  • team_abbreviation: Player’s Team Abbreviation
  • division: Player’s Team’s Division
  • age: Age of Player
  • pts: Average Points Per Game
  • reb: Average Rebounds Per Game
  • ast: Average Assists Per Game
  • ts_pct: True Shooting Percentage (Weighted Shot Make Percentage)

Data Setup:

This code shows how the data is prepared for my analysis. The analysis focuses on offense and starts with teams before moving to players.

# Packages used below: dplyr for the pipeline, tidyr for drop_na()
library(dplyr)
library(tidyr)

# Load the .csv
seasons_df <- read.csv("all_seasons.csv")

# Keep the 1997-98 season, drop rows with missing values,
# and assign each team to its division
seasons9798_df <- seasons_df %>%
  filter(season == "1997-98") %>%
  drop_na() %>%
  mutate(division = case_when(
    team_abbreviation %in% c("BOS", "MIA", "NJN", "NYK",
                             "ORL", "PHI", "WAS") ~ "Atlantic",
    team_abbreviation %in% c("ATL", "CHH", "CHI", "CLE", "DET",
                             "IND", "MIL", "TOR") ~ "Central",
    team_abbreviation %in% c("DAL", "DEN", "HOU", "MIN",
                             "SAS", "UTA", "VAN") ~ "Midwest",
    team_abbreviation %in% c("GSW", "LAC", "LAL", "PHX",
                             "POR", "SAC", "SEA") ~ "Pacific"
  ))

Average Team Points per game (ggplotly)

Takeaway - This graph displays the best offensive teams in the league: the Lakers (LAL), Nets (NJN), and Suns (PHX). In my next plots, I want to understand what makes them so good.
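The plotting code for this slide is not shown. As a minimal sketch, assuming that "average team points" means the mean of each player's pts within a team, the ranking could be computed as follows. The toy_df rows are made-up values for illustration only, not figures from the dataset.

```r
library(dplyr)

# Hypothetical toy rows standing in for seasons9798_df (values are made up)
toy_df <- data.frame(
  team_abbreviation = c("LAL", "LAL", "NJN", "NJN", "PHX", "PHX"),
  pts               = c(20, 10, 18, 8, 12, 12)
)

# Assumption: a team's scoring is summarized as the mean of its players' pts
team_pts <- toy_df %>%
  group_by(team_abbreviation) %>%
  summarise(avg_pts = mean(pts), .groups = "drop") %>%
  arrange(desc(avg_pts))

print(team_pts)
```

A table like team_pts could then be passed to ggplot2 and wrapped in ggplotly() to get the interactive bar chart.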

Average Team Age (ggplotly)

Takeaway - Many of the best offensive teams in the league this season have an average age of 26-28 (middle-aged by NBA standards). Next, I want to find out why a middle-aged team tends to have more offensive success.

Games Played vs Player Count (ggplotly)

Takeaway - Only about 10% of the league’s players play every game, so availability is key and likely correlates with offensive success. If players of certain ages play more games than others, that could explain the pattern, so my next plot looks at games played by age.

Age vs Average Games Played (plot_ly)

Takeaway - Players aged 24-30 usually play more games than players of other ages. This may explain why teams with an average age of around 26-28 are better offensively: their players are available for more games. This consistency leads to lineups with more chemistry and continuity, which creates more offensive success.
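The code behind this plot is not shown. A minimal sketch of the underlying summary might look like the following, assuming the data has a games-played column named gp (this column is not in the variable list above, so the name is an assumption). The rows are made-up values for illustration.

```r
library(dplyr)

# Hypothetical toy rows; `gp` (games played) is an assumed column name
toy_df <- data.frame(
  age = c(22, 22, 26, 26, 34, 34),
  gp  = c(40, 50, 78, 82, 30, 50)
)

# Average games played and player count at each age
avg_gp_by_age <- toy_df %>%
  group_by(age) %>%
  summarise(avg_gp = mean(gp), n_players = n(), .groups = "drop")

print(avg_gp_by_age)
```

Keeping n_players next to the average makes it easier to spot ages where the mean rests on only a handful of players.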

Player’s Average Pts, Ast, and Rebs (plot_ly)

Takeaway - To understand the top-scoring teams, I also need to understand their players. Each of these teams (such as LAL and NJN) has more than two players who average ~15+ points. In addition, many have multiple players (~5) who average 10+ points a game, showing that many players on a team can score effectively, whether they are stars or role players. How offensively efficient are these players?

True Shooting % vs Points (ggplotly)

True Shooting (TS) [Efficiency] = PTS / ( 2 * (FGA + (0.44 * FTA)))
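As a quick worked example of the formula above, here is a small helper applied to a hypothetical stat line. The function name true_shooting and the input numbers are illustrative assumptions, not values from the dataset.

```r
# True shooting percentage from points, field goal attempts, and free throw attempts
true_shooting <- function(pts, fga, fta) {
  pts / (2 * (fga + 0.44 * fta))
}

# Hypothetical stat line: 20 points on 15 field goal attempts and 5 free throw attempts
ts <- true_shooting(pts = 20, fga = 15, fta = 5)
print(round(ts, 3))
```

For this stat line the result is about 0.581, which sits inside the 40% to 60% band where most players fall.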

Takeaway - Most players have a true shooting percentage between 40% and 60%, meaning they are all at roughly the same efficiency. So, it seems that True Shooting % does not correlate strongly with more points per game for a player. I’ll analyze this relationship next and try to explain it.

Statistical Analysis: Code

## 
## Call:
## lm(formula = pts ~ ts_pct, data = seasons9798_df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -14.938  -3.798  -1.305   3.122  19.885 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -3.881      1.399  -2.774  0.00578 ** 
## ts_pct        23.819      2.769   8.601  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.401 on 437 degrees of freedom
## Multiple R-squared:  0.1448, Adjusted R-squared:  0.1428 
## F-statistic: 73.97 on 1 and 437 DF,  p-value: < 2.2e-16
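The slide shows only the model output, not the call. A summary in this format comes from fitting lm() and calling summary() on the result. The sketch below does that on synthetic data so it runs on its own; sim_df and its values are simulated for illustration and are not the 1997-98 data.

```r
# Synthetic data for illustration only; these are NOT the 1997-98 values
set.seed(42)
n <- 200
sim_df <- data.frame(ts_pct = runif(n, min = 0.40, max = 0.60))
sim_df$pts <- -4 + 24 * sim_df$ts_pct + rnorm(n, sd = 5)

# Same model form as in the output above: points regressed on true shooting
fit <- lm(pts ~ ts_pct, data = sim_df)
fit_summary <- summary(fit)
print(fit_summary)

# Pull out the pieces discussed on the slides
slope  <- coef(fit)[["ts_pct"]]
r2     <- fit_summary$r.squared
adj_r2 <- fit_summary$adj.r.squared
```

With the real data, the same two calls would be run with data = seasons9798_df.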

Statistical Analysis: Analysis and Meaning

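Relationship - The correlation is weak: the adjusted R^2 is 0.1428 (multiple R^2 of 0.1448), so true shooting explains only about 14% of the variation in points per game. However, the relationship is still statistically significant because the p-value is extremely small (< 2.2e-16), which is strong evidence that a positive association exists.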
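The numbers in the regression summary can be cross-checked with a little arithmetic. The sketch below copies the values from that output and recovers the correlation, the adjusted R^2, and the F-statistic. The variable names are illustrative.

```r
# Values copied from the regression summary above
r2      <- 0.1448       # multiple R-squared
t_slope <- 8.601        # t value for ts_pct
df_res  <- 437          # residual degrees of freedom
n       <- df_res + 2   # observations: residual df plus two estimated coefficients

# Correlation between pts and ts_pct (positive because the slope is positive)
r <- sqrt(r2)

# Adjusted R-squared for a model with one predictor
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_res

# In simple regression the F-statistic is the squared t value of the slope
f_stat <- t_slope^2

print(round(c(r = r, adj_r2 = adj_r2, f_stat = f_stat), 4))
```

This gives a correlation of roughly 0.38, an adjusted R^2 that matches the reported 0.1428, and an F value that agrees with the reported 73.97 up to rounding of the t value.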

Analysis - Being an efficient scorer does not strongly correlate with averaging more points.

This suggests that a team’s coaching matters significantly. With most of the league at around the same level of efficiency, being able to put players in better positions for shots matters more to a team’s success. For instance, the Lakers coaching staff leverages Shaq’s prolific scoring in the paint by putting him in such positions more often, helping him average nearly 30 points a game.

In addition, a player’s efficiency might dip because of the role expected of them. For instance, since Jordan is an amazing scorer, he will take more difficult shots at the end of the shot clock because he has a higher probability of making them than his teammates do.

Final Takeaways

Player Longevity Matters: The best offensive teams, such as the Lakers and Nets, were middle-aged teams. Players on a middle-aged team usually play in more games and build more chemistry, which makes the team more successful.

Higher Efficiency does not equal more points: Although it seems counter-intuitive, most of the league has similar shooting efficiency, so efficiency alone explains little of the difference in scoring. Coaches putting players in better spots helps them score more points, such as putting Shaq in the paint or Karl Malone in the post.

Regular Season Offensive Success does not equal Post Season Success: Although the Lakers, Suns, and Nets were incredible offensive teams in the regular season, none of them made the Finals, as the Bulls played the Jazz that year. Defensive rating and experience matter in the playoffs as well, which could explain these teams’ lack of success.

Thank you!