2026-03-24

Dataset Overview and Source

All Seasons Dataset: An NBA dataset that tracks player stats from 1996 to 2022. This analysis uses only the 1997-98 season. The variables used are:

  • player_name: Player’s Name
  • team_abbreviation: Player’s Team Abbreviation
  • division: Division of Player’s Team (added during data setup)
  • age: Age of Player
  • pts: Average Points Per Game
  • reb: Average Rebounds Per Game
  • ast: Average Assists Per Game
  • ts_pct: True Shooting Percentage (Weighted Shot Make Percentage)

Data Setup:

This shows how the data is loaded and prepared for analysis. My analysis focuses on offense, starting with teams and then moving to players.

# Packages that provide %>%, filter(), mutate(), case_when(), and drop_na()
library(dplyr)
library(tidyr)

# Loads in the .csv
seasons_df <- read.csv("all_seasons.csv")

# Keep the 1997-98 season, drop rows with missing values,
# and assign each team to its division
seasons9798_df <- seasons_df %>%
  filter(season == "1997-98") %>%
  drop_na() %>%
  mutate(division = case_when(
    team_abbreviation %in% c("BOS", "MIA", "NJN", "NYK",
                             "ORL", "PHI", "WAS") ~ "Atlantic",
    team_abbreviation %in% c("ATL", "CHH", "CHI", "CLE", "DET",
                             "IND", "MIL", "TOR") ~ "Central",
    team_abbreviation %in% c("DAL", "DEN", "HOU", "MIN",
                             "SAS", "UTA", "VAN") ~ "Midwest",
    team_abbreviation %in% c("GSW", "LAC", "LAL", "PHX",
                             "POR", "SAC", "SEA") ~ "Pacific"
  ))
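Because case_when() returns NA for any team that matches none of the four lists, it can help to confirm that the lists cover every team. The following is a minimal base R sketch, not part of the original analysis; the names divisions, all_teams, and unmapped_teams are made up for illustration, and the abbreviations are copied from the setup code.

```r
# Division membership copied from the case_when() call in the setup code.
divisions <- list(
  Atlantic = c("BOS", "MIA", "NJN", "NYK", "ORL", "PHI", "WAS"),
  Central  = c("ATL", "CHH", "CHI", "CLE", "DET", "IND", "MIL", "TOR"),
  Midwest  = c("DAL", "DEN", "HOU", "MIN", "SAS", "UTA", "VAN"),
  Pacific  = c("GSW", "LAC", "LAL", "PHX", "POR", "SAC", "SEA")
)

# Flatten the four lists into one vector of team abbreviations.
all_teams <- unlist(divisions, use.names = FALSE)

# No team should appear in more than one division.
stopifnot(anyDuplicated(all_teams) == 0)

# Hypothetical helper: returns the abbreviations in `teams`
# that none of the four division lists contains.
unmapped_teams <- function(teams) {
  setdiff(unique(teams), all_teams)
}
```

Calling unmapped_teams(seasons9798_df$team_abbreviation) after the setup step would list any abbreviation whose division came out as NA; an empty result means every row was assigned a division.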

Bar-Chart of Average Points per game per Team (ggplotly)

Takeaway - The best offensive teams in the league are LAL, MIN, and NJN. I will analyze why.
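The chart code is not shown on this slide, so the following is only a sketch of how such a team-level summary might be built. It assumes "average points per game per team" means the mean of each team's player averages in pts; the helper names team_avg_points and plot_team_points are hypothetical, and the toy rows are made up purely to show the shape of the result.

```r
library(dplyr)

# Mean of player scoring averages for each team, highest first.
# `df` is assumed to have the columns team_abbreviation and pts.
team_avg_points <- function(df) {
  df %>%
    group_by(team_abbreviation) %>%
    summarise(avg_pts = mean(pts), .groups = "drop") %>%
    arrange(desc(avg_pts))
}

# Hypothetical plotting helper (not the original chart code):
# a bar chart ordered by scoring, converted to an interactive plot.
plot_team_points <- function(team_df) {
  p <- ggplot2::ggplot(
    team_df,
    ggplot2::aes(x = reorder(team_abbreviation, -avg_pts), y = avg_pts)
  ) +
    ggplot2::geom_col() +
    ggplot2::labs(x = "Team", y = "Average points per game")
  plotly::ggplotly(p)
}

# Made-up rows, NOT from the dataset, only to illustrate the output.
toy_df <- data.frame(
  team_abbreviation = c("AAA", "AAA", "BBB", "BBB"),
  pts = c(10, 20, 5, 7)
)
toy_summary <- team_avg_points(toy_df)
```

With the real data, plot_team_points(team_avg_points(seasons9798_df)) would draw one bar per team, ordered from highest to lowest average.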

Bar-Chart of Average Age per Team (ggplotly)

Takeaway - Many of the best offensive teams, such as NJN and MIN, have an average age in the middle of the range (26-28).

Line Graph of Games Played vs Player Count (ggplotly)

Takeaway - Most players do not play every game, so availability matters and likely correlates with offensive success.

Box-Plot of Age vs Average Games Played (plot_ly)

Takeaway - Players in the middle age range (24-30) usually play more games than players in other age ranges.

3-Variable Plot of Average Points, Assists, and Rebounds per Team (plot_ly)

Takeaway - Top-scoring teams (such as LAL and NJN) all have more than two players who average roughly 15 or more points per game.

True Shooting % vs Points (ggplotly)

True Shooting % (TS%) [Efficiency] = PTS / (2 * (FGA + 0.44 * FTA)), where FGA is field-goal attempts and FTA is free-throw attempts.
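To make the formula concrete, here is a small sketch that computes it directly. The dataset already supplies ts_pct, so this is illustrative only; the function name true_shooting and the input numbers are assumptions, not values from the data.

```r
# True shooting percentage from points, field-goal attempts (fga),
# and free-throw attempts (fta), following the formula above.
true_shooting <- function(pts, fga, fta) {
  pts / (2 * (fga + 0.44 * fta))
}

# Illustrative numbers only (not from the dataset):
# 20 points on 15 field-goal attempts and 5 free-throw attempts.
example_ts <- true_shooting(pts = 20, fga = 15, fta = 5)
```

Here example_ts works out to 20 / 34.4, or about 0.58, which would sit inside the 50%-60% band where most of the league falls.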

Takeaway - True Shooting % does not seem to correlate strongly with points per game. I’ll analyze this relationship next.

Statistical Analysis: Average PTS vs TS Code

## 
## Call:
## lm(formula = pts ~ ts_pct, data = seasons9798_df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -14.938  -3.798  -1.305   3.122  19.885 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -3.881      1.399  -2.774  0.00578 ** 
## ts_pct        23.819      2.769   8.601  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.401 on 437 degrees of freedom
## Multiple R-squared:  0.1448, Adjusted R-squared:  0.1428 
## F-statistic: 73.97 on 1 and 437 DF,  p-value: < 2.2e-16

Statistical Analysis: Analysis and Meaning

Relationship - The relationship is weak, with an adjusted R^2 of 0.1428, so TS% explains only about 14% of the variation in points per game. However, it is still statistically significant: the p-value is extremely small (< 2e-16), so a positive association does exist.
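The size of the effect can be read off the regression output with a little arithmetic. The sketch below copies the coefficients and the multiple R-squared from that output; the variable and function names are made up for illustration.

```r
# Values copied from the lm() summary output shown earlier.
intercept <- -3.881
slope     <- 23.819
r_squared <- 0.1448  # multiple R-squared

# In a one-predictor linear model, the correlation r has magnitude
# sqrt(R-squared) and the same sign as the slope.
r <- sign(slope) * sqrt(r_squared)

# Fitted points per game at a given true shooting percentage
# (ts_pct is on a 0-1 scale, as in the dataset).
predict_pts <- function(ts_pct) {
  intercept + slope * ts_pct
}

pts_at_50 <- predict_pts(0.50)
pts_at_60 <- predict_pts(0.60)

# Change in fitted scoring when TS% rises from 50% to 60%.
gain_50_to_60 <- pts_at_60 - pts_at_50
```

This gives r of about 0.38, and a fitted average of about 8.0 points at 50% TS versus about 10.4 at 60% TS. A ten-point jump in efficiency is therefore associated with only about 2.4 more points per game.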

Analysis - Being an efficient scorer does not translate strongly into averaging more points. In addition, the plot suggests that most of the league falls within 50%-60% TS.

I believe this means a team’s coaching matters significantly. With most of the league at around the same level of efficiency, putting players in better positions for shots matters more to a team’s success. Shaq averages about 28 points per game thanks to his prolific scoring in the paint, which the Lakers’ coaches leverage.

In addition, a player’s efficiency might dip because of the role they are expected to play. For instance, since Jordan is an amazing scorer, he will take more difficult shots at the end of the shot clock, as he has a higher probability of making them than his teammates do.

Final Takeaways

Player Longevity Matters: The best offensive teams, such as the Lakers and Nets, both had younger rosters. A younger roster usually means players appear in more games and build more chemistry, which can make the team more successful.

Higher Efficiency does not equal more points: Although it seems counterintuitive, most of the league has similar shooting efficiency, so efficiency alone explains little of the difference in scoring. Coaches who put players in better spots allow them to score more points, such as putting Shaq in the paint or Karl Malone in the post.

Regular Season Offensive Success does not equal Post Season Success: Although the Lakers, Wolves, and Nets were all incredible offensive teams in the regular season, none of them made the Finals, as the Bulls played the Jazz that year. Defensive rating and experience matter in the playoffs as well, which could explain these teams’ lack of success.

Thank you!