Roster Construction in Today’s College Basketball

Introduction

For this research, I examine roster construction among the best teams in college basketball, focusing on the top 68 teams because 68 teams make the tournament every year. I use player ratings provided in a data frame from Evan Miya (EvanMiya CBB Analytics - College Basketball Team, Player, and Lineup Metrics), restricted to players with a minimum of 500 possessions played, along with team metrics I web-scraped from Ken Pomeroy (2026 Pomeroy College Basketball Ratings). All data is from the 2026 season. From Evan Miya, the main stat is BPR (along with its offensive and defensive splits, OBPR and DBPR), which Miya defines as follows: “BPR: Bayesian Performance Rating is the sum of a player’s OBPR and DBPR. This rating is the ultimate measure of a player’s overall value to his team when he is on the floor. BPR is interpreted as the number of points per 100 possessions better than the opponent the player’s team is expected to be if the player were on the court with 9 other average players. A higher rating is better.” My intention is to discover how elite teams construct their rosters, and to provide insight into how coaches can navigate the transfer portal and supplement their rosters by following the construction of the top teams in the nation.
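The chunks later in this report use a `top_68` data frame whose construction is not shown. As a rough sketch of how the filtering and joining described above might look, the example below uses two small toy tables. The column names `possessions`, `team`, and the toy table names are assumptions for illustration, and every value is made up; the real EvanMiya and KenPom tables may be laid out differently.

```r
library(dplyr)

# Toy stand-ins for the two sources; every value here is made up for illustration.
miya_toy <- data.frame(
  name        = c("Player A", "Player B", "Player C", "Player D"),
  team        = c("Team X", "Team X", "Team Y", "Team Z"),
  possessions = c(1200, 350, 900, 1100),  # assumed column name
  obpr        = c(4.1, 0.5, 2.0, 1.2),
  dbpr        = c(2.3, 0.2, 1.5, 0.4)
)
kenpom_toy <- data.frame(
  team_name = c("Team X", "Team Y", "Team Z"),
  team_rank = c(3, 41, 112)
)

top_68_toy <- miya_toy %>%
  filter(possessions >= 500) %>%            # keep players with at least 500 possessions
  mutate(bpr = obpr + dbpr) %>%             # BPR is the sum of OBPR and DBPR
  rename(team_name = team) %>%              # align the join key with the team table
  inner_join(kenpom_toy, by = "team_name") %>%
  filter(team_rank <= 68)                   # keep only teams ranked in the top 68

print(top_68_toy)
```

In this toy example, Player B is dropped for having too few possessions and Player D is dropped because Team Z is ranked outside the top 68.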

Objective 1 (The “Perfect” Roster)

To begin, I would like to establish a baseline, and with the college basketball season over, we have a perfect starting point. The 2025-2026 National Champions were the Michigan Wolverines, so they are our “Perfect” Roster. The first step is to view their trends as a baseline: not an end-all-be-all, but something to compare against the trends of all the other teams.

Michigan_trends <- Miya_data %>% 
  filter(team == "Michigan")
# First question: how is Michigan's talent distributed across the players who contributed to the team (minimum 500 possessions)?
Michigan_trends %>%
  ggplot(aes(x = reorder(name, -bpr), y = bpr)) +
  geom_col() +
  labs(title = "Michigan BPR distribution",
       y = "BPR by Player",
       x = "Player Name")

Looking at Michigan’s BPR roster outlook, we see one star player (Yaxel Lendeborg), a strong and fairly even next four to round out the starting five, and four more contributors who are all clearly positive players, which shows the importance of depth.

Objective 2 (The Not So Perfect Outcome)

As NIL continues to ramp up, we see fewer and fewer upsets in the tournament every year. This year, however, we still saw 12 seed High Point upset 5 seed Wisconsin. I believe it is valuable to also look at how Wisconsin’s roster was built, and to see later, when comparing it to Michigan and other top teams, whether any large discrepancies appear.

Wisco_trends <- Miya_data %>% 
  filter(team == "Wisconsin")
# Same question for Wisconsin: how is the talent distributed across the players who contributed to the team (minimum 500 possessions)?
Wisco_trends %>%
  ggplot(aes(x = reorder(name, -bpr), y = bpr)) +
  geom_col() +
  labs(title = "Wisconsin BPR distribution",
       y = "BPR by Player",
       x = "Player Name")

As expected, Wisconsin has a much different roster construction than Michigan. The first major observation is that there is no clearly defined “superstar” by BPR standards, but there are three very good players, which by itself should not be the biggest issue. However, it seems that Wisconsin focused very heavily on those three and not on the supporting cast around them, as we see massive drop-offs after the top three players. As the old saying goes, you really should not put all your eggs in one basket.
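The drop-off described above is read off the bar charts by eye. One possible way to put a number on it is to compare the average BPR of a team’s top three players with the average of everyone else in the rotation. The sketch below does this for two hypothetical rosters; the team labels and BPR values are made up and are not the real Michigan or Wisconsin numbers, and the top-three cutoff is an assumption chosen to mirror the discussion.

```r
library(dplyr)

# Made-up BPR values for two hypothetical rosters (not real team data).
roster_toy <- data.frame(
  team = rep(c("Deep Team", "Top-Heavy Team"), each = 6),
  bpr  = c(8, 5, 5, 4, 4, 3,    # one star, then a steady rotation
           6, 6, 5, 1, 0, -1)   # three strong players, then a cliff
)

depth_profile <- roster_toy %>%
  group_by(team) %>%
  summarise(
    top3_mean = mean(sort(bpr, decreasing = TRUE)[1:3]),    # three best players
    rest_mean = mean(sort(bpr, decreasing = TRUE)[-(1:3)]), # everyone else
    drop_off  = top3_mean - rest_mean                       # larger = steeper fall after the top three
  )

print(depth_profile)
```

A larger `drop_off` indicates a roster that leans harder on its top three players. The same summary could be run on the real per-team data to compare every top 68 roster on one scale.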

Objective 3 (Team BPR)

As a quick check, looking for any large outliers in the relationship between team ranking and total team BPR could provide some useful insight.

top_68_trends <- top_68 %>% 
  group_by(team_name) %>% 
  summarise(
    team_rank = first(team_rank),
    team_bpr = sum(bpr)
  )

top_68_trends %>%
  ggplot(aes(x = team_rank, y = team_bpr)) +
  geom_point() +
  geom_smooth()+
  labs(title = "Team BPR from Top Teams",
       y = "Team BPR",
       x = "Team Rank")

As expected, team rank and total team BPR are heavily correlated: better-ranked teams have higher team BPR, with no large outliers on either side of the scale.
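The strength of this relationship is judged from the scatterplot. A correlation coefficient is one way to back it up with a number. The sketch below uses Spearman’s rank correlation from base R, which is a choice made here (not something the original analysis used) because team rank is an ordinal variable. The values are made up; the real input would be the team-level summary built above.

```r
# Made-up team-level values for illustration only.
trends_toy <- data.frame(
  team_rank = c(1, 5, 12, 20, 33, 47, 60, 68),
  team_bpr  = c(38, 35, 30, 31, 24, 20, 17, 15)
)

# Spearman's rho compares the orderings of the two variables.
# A value near -1 means team BPR falls steadily as the rank number gets worse.
rho <- cor(trends_toy$team_rank, trends_toy$team_bpr, method = "spearman")
print(rho)
```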

Objective 4 (The Star Player)

Next, as noted with Michigan and Wisconsin, having a clearly defined “star” player may be strongly related to winning games. Looking at what “share” of the team’s BPR the top player holds may show how important that role is.

star_share <- top_68 %>% 
  group_by(team_name) %>% 
  summarise(
    team_bpr = sum(bpr),
    top_player_bpr = max(bpr),
    star_share = top_player_bpr / team_bpr,
    team_rank = first(team_rank)
  )

star_share %>%
  ggplot(aes(x = team_rank, y = star_share)) +
  geom_point() +
  geom_smooth()+
  labs(title = "Star Share vs Team Rank",
       y = "Star Share",
       x = "Team Rank")

The correlation between star share and team rank does not seem very strong, giving some evidence that the hypothesis about its importance may be incorrect, and that Wisconsin’s failures stem more from a lack of depth than from the lack of a defined number one option. The wide gap in star share among the top 5 teams shows that reliance on a star player can vary a lot while still producing excellent results.

Objective 5 (Lineup trends)

Evan Miya provides a stat called role, on a scale from 1 to 5: a 1 is a pure shot creator offensively, and a 5 is a pure receiver who is reliant on others. So how important is this to teams? Do top teams carry more creators, more receivers, or a strong combination of both? Also, what positions do teams carry on average, with 1 being a point guard (PG) and 5 being a center?

Positional_value <- top_68 %>%
  group_by(team_name) %>%
  summarise(
    team_role = mean(role),
    team_rank = first(team_rank),
    team_position = mean(position)
  )
Positional_value %>%
  ggplot(aes(x = team_rank, y = team_role)) +
  geom_point() +
  geom_smooth()+
  labs(title = "Average Role vs Team Rank",
       y = "Role",
       x = "Team Rank")

Positional_value %>%
  ggplot(aes(x = team_rank, y = team_position)) +
  geom_point() +
  geom_smooth()+
  labs(title = "Average Position vs Team Rank",
       y = "Position",
       x = "Team Rank")

As expected, teams average around 3 on the role scale, leaning slightly toward the receiver-heavy end of the spectrum, likely because there is little room for multiple creators on the floor at once. Looking at positions, many lineups lean toward the bigger side, although some top teams are very guard heavy. Outside of the top half, however, the guard-heavy teams show much more variability than the teams with a 3+ average position value.
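The variability claim can be checked by splitting teams at an average position value of 3 and comparing the spread of team rank in each group. The sketch below shows one way to do that. The values are made up, and the `lineup_type` labels are hypothetical names introduced here; the real input would be the per-team table computed above.

```r
library(dplyr)

# Made-up team-level values for illustration only.
position_toy <- data.frame(
  team_rank     = c(2, 9, 30, 55, 66, 8, 15, 22, 28, 35),
  team_position = c(2.6, 2.8, 2.7, 2.9, 2.5, 3.1, 3.2, 3.0, 3.3, 3.1)
)

lineup_spread <- position_toy %>%
  mutate(lineup_type = if_else(team_position < 3, "guard-heavy", "3+ average")) %>%
  group_by(lineup_type) %>%
  summarise(
    n_teams    = n(),
    best_rank  = min(team_rank),  # ceiling: the best-ranked team in the group
    worst_rank = max(team_rank),  # floor: the worst-ranked team in the group
    rank_sd    = sd(team_rank)    # spread of ranks within the group
  )

print(lineup_spread)
```

A higher `rank_sd` for the guard-heavy group would support the observation that those lineups are less predictable, while `best_rank` and `worst_rank` give a quick read on ceiling and floor.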

Objective 6 (Specialists)

Another big question: do top teams carry offensive or defensive specialists? Here we focus on players who are far better on one end of the court than the other. Some say that is a dying breed in basketball, but how true is that in modern roster construction?

top_68 %>%
  ggplot(aes(x = box_obpr, y = box_dbpr)) +
  geom_point() +
  geom_smooth()+
  labs(title = "Are there many specialists?",
       y = "DBPR",
       x = "OBPR")

The idea that specialists are dying out appears to be correct: among the top teams, there are very few players you could call “specialists” at all.
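“Specialist” is not given a numeric definition here. One possible working definition is a player whose offensive and defensive ratings differ by more than some cutoff. The sketch below labels players that way. The cutoff of 3 points per 100 possessions is a hypothetical threshold chosen only for illustration, the values are made up, and the example uses the `obpr` and `dbpr` columns rather than the `box_` versions plotted above.

```r
library(dplyr)

# Made-up player values for illustration only.
players_toy <- data.frame(
  name = c("Player A", "Player B", "Player C", "Player D"),
  obpr = c(3.0, 5.5, 0.2, 2.1),
  dbpr = c(2.5, 0.3, 4.0, 1.8)
)

gap_cutoff <- 3  # hypothetical threshold, in points per 100 possessions

specialists <- players_toy %>%
  mutate(
    gap = obpr - dbpr,  # positive = better on offense, negative = better on defense
    specialist_type = case_when(
      gap >=  gap_cutoff ~ "offensive specialist",
      gap <= -gap_cutoff ~ "defensive specialist",
      TRUE               ~ "two-way"
    )
  )

print(count(specialists, specialist_type))
```

Counting the labels on the real data would turn “not very many” into an actual share of players, and the cutoff can be adjusted to see how sensitive that share is.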

Objective 7 (Do teams rely on one guy?)

Finally, I want to look at the correlation between each team’s top OBPR and DBPR players and the team’s overall offensive and defensive efficiency, to see whether the top-end players are keeping these teams afloat.

top_players <- top_68 %>% 
  group_by(team_name) %>% 
  summarise(
    team_offense = first(adj_team_off_eff),
    top_player_obpr = max(obpr),
    team_defense = first(adj_team_def_eff),
    top_player_dbpr = max(dbpr)
  )

top_players %>%
  ggplot(aes(x = team_offense, y = top_player_obpr)) +
  geom_point() +
  geom_smooth()+
  labs(title = "How much do star players carry offense",
       y = "OBPR",
       x = "Team Offensive Efficiency")

top_players %>%
  ggplot(aes(x = team_defense, y = top_player_dbpr)) +
  geom_point() +
  geom_smooth()+
  labs(title = "How much do star players carry defense",
       y = "DBPR",
       x = "Team Defensive Efficiency")

Comparing these two graphs, it becomes much clearer that one player can “carry” your offense, as star player OBPR and team offensive efficiency are clearly correlated, but the answer is a lot less clear with defense. While some of the top teams have the best defenders, there are also quite a few teams whose top defender falls below the trend line yet whose defense is still much better than that of teams above the trend line.
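To compare the two relationships on the same footing, the correlation can be computed separately for offense and defense. The sketch below does this with base R on made-up values; the real input would be the per-team table built above. Note that for adjusted defensive efficiency a lower number means a better defense, so a strong defensive link would show up as a negative correlation.

```r
# Made-up team-level values for illustration only.
carry_toy <- data.frame(
  team_offense    = c(126, 123, 121, 118, 116, 113),
  top_player_obpr = c(6.1, 5.4, 5.0, 4.2, 3.9, 3.1),
  team_defense    = c(91, 93, 95, 96, 98, 100),
  top_player_dbpr = c(3.0, 4.1, 2.2, 3.8, 2.9, 3.3)
)

# Pearson correlation between the top player's rating and the team's efficiency.
offense_r <- cor(carry_toy$team_offense, carry_toy$top_player_obpr)
defense_r <- cor(carry_toy$team_defense, carry_toy$top_player_dbpr)

# Compare absolute values, since the expected sign differs between the two ends.
print(c(offense = abs(offense_r), defense = abs(defense_r)))
```

If the absolute offensive correlation is clearly larger on the real data, that would support the reading that a single player can carry an offense more than a defense.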

Key Takeaways

Roster construction in any sport is a question that I feel will never have a definite answer, but these trends give us more clarity on what leads to success. First, having depth surrounding a star player seems to be one of the most important factors. Wisconsin, for example, had three players who were BPR “stars” but then a huge roster drop-off, whereas Michigan had one very highly rated player, a big drop, and then a steady level throughout the rest of the rotation. We also see that some of the very elite teams are more guard heavy than forward and center heavy. However, if your goal is to be a good team, the floor for lineups with a 3+ average position value is higher, while the <3 lineups seem to have the higher ceiling. Another big thing to note is that elite teams are not carrying many, if any, players who cannot at least competently play both sides of the floor. Having players who are not two-way players can make your offense or defense much more stagnant and make the players around them worse. Finally, a star player seems able to shoulder much more of the offensive burden than the defensive burden, so targeting an elite scorer and surrounding him with four quality defenders may be a team’s best bet for putting the best product on the court.