Introduction

The area I decided to research is college basketball. I chose it because I attend Xavier University, which has a great basketball program. College basketball has interested me since I was a kid and was one of the main reasons I chose to attend Xavier.

To research this topic, I found a data set on Kaggle. The data is from the 2021 season and provides statistics for every Division I college basketball team. It includes defensive and offensive metrics as well as each team's conference and the seed it was given in the NCAA Tournament. I intend to use this data set to explore how these metrics relate to team success.

Data Dictionary

Field    Type     Description
TEAM     text     The Division I college basketball school
CONF     text     The athletic conference in which the school participates
G        numeric  Number of games played
W        numeric  Number of games won
ADJOE    numeric  Adjusted Offensive Efficiency (an estimate of the offensive efficiency, in points scored per 100 possessions)
ADJDE    numeric  Adjusted Defensive Efficiency (an estimate of the defensive efficiency, in points allowed per 100 possessions)
BARTHAG  numeric  Power Rating (chance of beating an average Division I team)
EFG_O    numeric  Effective Field Goal Percentage Shot
EFG_D    numeric  Effective Field Goal Percentage Allowed
TOR      numeric  Turnover Percentage Allowed (Turnover Rate)
TORD     numeric  Turnover Percentage Committed (Steal Rate)
ORB      numeric  Offensive Rebound Rate
DRB      numeric  Offensive Rebound Rate Allowed
FTR      numeric  Free Throw Rate (how often the given team shoots free throws)
FTRD     numeric  Free Throw Rate Allowed
2P_O     numeric  Two-Point Shooting Percentage
2P_D     numeric  Two-Point Shooting Percentage Allowed
3P_O     numeric  Three-Point Shooting Percentage
3P_D     numeric  Three-Point Shooting Percentage Allowed
ADJ_T    numeric  Adjusted Tempo (an estimate of the tempo, in possessions per 40 minutes, a team would have against the team…)
WAB      numeric  Wins Above Bubble (the bubble refers to the cutoff between making the NCAA March Madness Tournament and not…)
SEED     numeric  Seed in the NCAA March Madness Tournament

Variability of Adjusted Offensive Efficiency by Conference (Power 6)

The first thing I wanted to look at was Adjusted Offensive Efficiency for the Power 6 conferences: the ACC, Big 12, Big 10, Big East, PAC 12, and SEC. This question interests me because it shows which conferences are better offensively.

Looking at the boxplot, we can see that the Big 10 has the highest median offensive efficiency. This means the typical Big 10 team scores more points per 100 possessions than the typical team in the other conferences. The SEC had the lowest median offensive efficiency. There are many reasons why a conference can have a low or high median offensive efficiency. One reason could be defense. Maybe the SEC plays the best defense, and that is why its offenses do not score as much. Maybe the Big 10 plays bad defense, which leads to a lot of scoring.
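The center line of each box is a per-conference median, which can be sketched in a few lines of Python. This is only an illustration: the conference codes in `POWER_6` are an assumption about how the CONF field is spelled, the function name `median_by_conference` is hypothetical, and the sample rows are made up rather than taken from the data set.

```python
from collections import defaultdict
from statistics import median

# Assumed spellings of the Power 6 conferences in the CONF field.
POWER_6 = {"ACC", "B10", "B12", "BE", "P12", "SEC"}


def median_by_conference(rows, metric="ADJOE", conferences=POWER_6):
    """Return {conference: median of `metric`} for the selected conferences."""
    values = defaultdict(list)
    for row in rows:
        if row["CONF"] in conferences:
            values[row["CONF"]].append(float(row[metric]))
    return {conf: median(vals) for conf, vals in values.items()}


# Made-up rows for illustration only; these are not values from the data set.
sample_rows = [
    {"TEAM": "Team A", "CONF": "B10", "ADJOE": 115.0},
    {"TEAM": "Team B", "CONF": "B10", "ADJOE": 109.0},
    {"TEAM": "Team C", "CONF": "B10", "ADJOE": 112.0},
    {"TEAM": "Team D", "CONF": "SEC", "ADJOE": 104.0},
    {"TEAM": "Team E", "CONF": "SEC", "ADJOE": 108.0},
    {"TEAM": "Team F", "CONF": "WCC", "ADJOE": 120.0},  # not Power 6, so skipped
]

medians = median_by_conference(sample_rows)

# Print conferences from highest to lowest median.
for conf, value in sorted(medians.items(), key=lambda item: item[1], reverse=True):
    print(conf, value)
```

Passing `metric="ADJDE"` or `metric="SEED"` would give the same summary for the other metrics, as long as rows with a missing value are filtered out first.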

Variability of Adjusted Defensive Efficiency by Conference (Power 6)

Now let’s look at defensive efficiency for the Power 6. To do this we will build the same boxplot but use the Adjusted Defensive Efficiency metric. Looking at defensive efficiency will help us answer the questions we had after looking at offensive efficiency.

When looking at this boxplot, keep in mind that lower medians are better. Defensive efficiency measures the number of points a team gives up per 100 possessions, so fewer points allowed means better defense.

This boxplot helps answer some of the questions we had. The SEC is one of the best conferences in terms of defensive efficiency. It is not the best, but its teams play great defense, which is most likely why the conference's offensive efficiency is so low. The best defensive conference is the Big 10. This is very surprising because the Big 10 also had the best offensive efficiency. Although this may seem confusing, there is an explanation. In 2021, when this data was collected, the Big 10 had some of the best teams in terms of adjusted efficiency margin. This metric, created by Ken Pomeroy, subtracts ADJDE from ADJOE to produce a single number for ranking teams. A team with a large margin scores a lot and allows little, so the Big 10 had some of the top teams in both offensive and defensive efficiency. The worst conference in terms of defensive efficiency was the PAC 12, which gave up almost 98 points per 100 possessions. That is very poor.
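To make the margin concrete, here is a minimal sketch of the subtraction described above. The function name `efficiency_margin` and the three teams are hypothetical, and the numbers are made up for illustration rather than taken from the data set.

```python
def efficiency_margin(adjoe, adjde):
    """Adjusted efficiency margin: points scored minus points allowed per 100 possessions."""
    return adjoe - adjde


# Made-up values for illustration only; not taken from the data set.
teams = {
    "Team A": {"ADJOE": 118.0, "ADJDE": 92.0},   # strong offense and strong defense
    "Team B": {"ADJOE": 105.0, "ADJDE": 98.0},   # average on both ends
    "Team C": {"ADJOE": 96.0, "ADJDE": 104.0},   # allows more than it scores
}

margins = {
    name: efficiency_margin(stats["ADJOE"], stats["ADJDE"])
    for name, stats in teams.items()
}

# Rank teams from the largest margin to the smallest.
ranked = sorted(margins, key=margins.get, reverse=True)
print(ranked)
```

Because a high ADJOE and a low ADJDE both push the margin up, a team (or a conference full of such teams) can sit near the top of both the offensive and the defensive rankings at once.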

Variability of Seed In NCAA Tournament by Conference (Power 6)

After looking at offensive and defensive efficiency by conference, I thought it would be interesting to look at the variability of seeds for the Power 6 conferences. This will show us which conferences had the higher and lower median seeds in the NCAA Tournament. In the NCAA Tournament each team is given a seed. Seeds range from 1 to 16, with four teams given each seed, so there are four 1 seeds, four 2 seeds, and so on. Lower seed numbers are better: the 1 seeds are the best teams and the 16 seeds are the worst.

This boxplot shows us that the Big 12 and Big 10 had the best seeds in the tournament. This means that both of these conferences had some very good teams in the tournament and not many bad teams. The ACC, on the other hand, had the highest median seed number, above 7.5. This means that some of its tournament teams were weak relative to the field. Although the ACC had the highest median seed, that does not necessarily mean it is a bad conference. Making the tournament is an achievement in itself and deserves praise. However, the ACC teams that made the tournament were not as good as the tournament teams from the other Power 6 conferences.

Relationship between Wins and 3 Point Shooting Percentage

In college basketball, and basketball in general, the ability to hit 3-point shots is more important than it has ever been. That is why I wanted to see the relationship between 3-point percentage and the number of wins a team had during the 2021 season. I presume that teams with high 3-point percentages win more games.

Teams at the bottom left of this scatterplot were very bad at shooting 3-pointers and did not win many games in the 2021 season. Teams near the top right are the opposite: they were very good at shooting 3s and also won a lot of games. Chicago State stands out as the worst team. They shot only around 25% from 3 and won 3 games in the 2021 season. That abysmal 3-point percentage would not win anyone many games. Even my school, Xavier University, would struggle to win games if it shot only 25% from 3-point range. Among the best teams on this graph are Gonzaga, Illinois, and that season's national champion, Baylor. Baylor shot over 40% from 3, which is incredible. This is one of the many reasons why they won the National Championship that year. In fact, in the championship game they shot 43.5% from 3. Any team shooting that well from 3 would be almost unbeatable. The outliers include American and UNC Greensboro. American shot around 37% from 3 yet won fewer than 5 games. UNC Greensboro, on the other hand, shot around 30% yet had over 20 wins. This goes to show that although 3-point percentage is important, many other factors lead to teams winning and losing games.
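One way to put a number on how closely wins track 3-point percentage is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it with the standard formula. The function name `pearson_r` is hypothetical and the five data points are made up for illustration, not read from the data set.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Made-up values for illustration only; not taken from the data set.
three_point_pct = [28.0, 31.0, 33.0, 35.0, 38.0]  # stand-in for 3P_O
wins = [6, 14, 11, 18, 22]                        # stand-in for W

r = pearson_r(three_point_pct, wins)
print(round(r, 3))
```

A value near 1 would support the idea that better 3-point shooting goes with more wins, while outliers like the ones described above pull the value toward 0.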

Relationship Between Turnover Rate and Offensive Efficiency

In almost every sport there are turnovers. A turnover in basketball is when a team gives up possession of the basketball before taking a shot or free throw. Turnovers are important because they waste your own offensive possessions and give the other team extra ones. Turnovers can cause teams to lose games. That is why I want to look at how turnovers affect offensive efficiency.

This scatterplot shows a clear negative relationship between turnovers and offensive efficiency. As turnover rate goes up, offensive efficiency goes down. Teams like Maine, Mississippi Valley State, and South Carolina State have some of the worst turnover rates in college basketball. Their awful turnover rates are one of the biggest reasons why their offensive efficiency is so low. If you turn the ball over, you will not have as many opportunities to score points, which leads to low offensive efficiency. Teams like Iowa and Villanova have very low turnover rates as well as some of the best offensive efficiency ratings. They make the most of their offensive possessions and value the basketball. This is one of the reasons why they were so successful in the 2021 season.
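The direction and steepness of a trend like this can be summarized with the slope of a least-squares line. The sketch below fits one by hand. The function name `least_squares_line` is hypothetical and the values are made up for illustration, so the resulting slope says nothing about the real data.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up values for illustration only; not taken from the data set.
turnover_rate = [15.0, 17.0, 19.0, 21.0, 23.0]   # stand-in for TOR
adjoe = [114.0, 109.0, 104.0, 100.0, 93.0]       # stand-in for ADJOE

slope, intercept = least_squares_line(turnover_rate, adjoe)
print(slope < 0)  # a negative slope means efficiency falls as turnovers rise
```

The slope is read as the change in points per 100 possessions for each one-point increase in turnover rate.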

Scraping Secondary Data From Kenpom

The data that I collected comes from Ken Pomeroy’s rankings on kenpom.com. Every year since 2002, Ken Pomeroy has ranked all of the Division I college basketball teams by Adjusted Efficiency Margin. This is a calculation he created himself: it subtracts Adjusted Defensive Efficiency from Adjusted Offensive Efficiency. I wanted to scrape this data so that I could see this year’s rankings and stats instead of the 2021 data used in the previous graphs.
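As a rough illustration of what scraping a ratings table involves, the sketch below pulls rows out of an HTML table using only Python's standard library. It is not the method used for this report. The markup in `SAMPLE_HTML`, the column names, and the values are all made-up stand-ins, and the real page's structure and terms of use would need to be checked before fetching anything from it.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical markup for illustration; not the real page's structure or data.
SAMPLE_HTML = """
<table>
  <tr><th>Team</th><th>AdjO</th><th>AdjD</th></tr>
  <tr><td>Team A</td><td>118.0</td><td>92.0</td></tr>
  <tr><td>Team B</td><td>105.0</td><td>98.0</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)

# The first row holds the column names; the rest are one team each.
header, *body = parser.rows
teams = [dict(zip(header, row)) for row in body]
for team in teams:
    team["AdjO"] = float(team["AdjO"])
    team["AdjD"] = float(team["AdjD"])
print(teams)
```

In practice the HTML string would come from a downloaded page rather than a literal, and the parsed rows would then be saved or loaded into whatever tool is used for plotting.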

Who Are The Best And Worst Teams In College Basketball in 2023

Using the 2023 data, I wanted to see who the best and worst overall teams are in college basketball this year. To do this I plotted the relationship between Adjusted Offensive Efficiency and Adjusted Defensive Efficiency.

This scatterplot shows Adjusted Offensive Efficiency on the x-axis and Adjusted Defensive Efficiency on the y-axis. The teams at the bottom right of the graph score the most points and give up the fewest. The teams in the top left are the opposite. We can see that teams like Gonzaga, Baylor, and Houston are some of the best overall teams this year. They have great offenses and great defenses. Teams like Chicago State and Mississippi Valley State are the worst. They have both horrendous defenses and horrendous offenses.