The U.S. Olympic team has a rich history at the Olympic Games, beginning with the 1896 Athens Games, where the U.S. sent 14 athletes. To the 2024 Olympic Games the U.S. sent 592 athletes, making it the biggest team at the Games. Early U.S. success came in track and field and swimming, sports that later produced champions such as Carl Lewis and Michael Phelps. In 1992 the U.S. sent the Dream Team to the Games, establishing itself as the dominant force in basketball at the Summer Olympics. The 2024 U.S. Olympic men’s basketball team is the focus of this case study.
ESPN NBA stat leaders for the 2023-24 season https://www.espn.com/nba/stats/_/season/2024/seasontype/2
Basketball Reference https://www.basketball-reference.com/leagues/NBA_2024_totals.html
This is a case study for anyone interested in the 2024 Olympic men’s basketball team or the NBA.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.0 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.2.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(lubridate)
library(dplyr)
library(ggplot2)
library(tidyr)
For this project I will use a spreadsheet that I created from ESPN and Basketball Reference statistics for the 2023-24 season.
NBA <- read.csv("/cloud/project/FINAL NBA STAT SHEET - NBA STAT SHEET - top 50 US NBA players (1).csv")
top_5_USA <- read.csv("/cloud/project/Final NBA stat sheet_4 - Sheet3.csv")
top_5_NBA <- read.csv("/cloud/project/_FINAL NBA STAT SHEET_ - Sheet1.csv")
top_12_NBA <- read.csv("/cloud/project/FINAL NBA STAT SHEET_2 - Sheet1.csv")
top_12_nonUSA_team <- read.csv("/cloud/project/FINAL NBA STAT SHEET_3 - Sheet1.csv")
I already checked the data in Google Sheets. I just need to make sure that everything was imported correctly by using the View() and head() functions.
head(top_5_USA)
## NAME RANKING TEAM_USA TEAM WINS WIN_RANK AVG_ATT AVG_ATT_RANK
## 1 Lebron James 1 yes LAL 47 8 18903 10
## 2 Jayson Tatum 3 yes BOS 64 1 19156 8
## 3 Joel Embiid 5 yes PHI 47 8 20041 3
## 4 Kevin Durant 5 yes PHX 49 6 17071 23
## 5 Anthony Davis 8 yes LAL 47 8 18903 10
## POSITION GP PPG PTS_RNK FGM FGM_RNK FGA FG. FG._RANK X3_PM X3_PM_RANK
## 1 F 71 25.7 8 9.6 5 17.9 54.0 6 2.1 14
## 2 F 74 26.9 4 9.1 7 19.3 47.1 22 3.1 4
## 3 C 66 33.1 1 11.0 1 20.1 54.8 5 1.0 22
## 4 F 75 27.1 3 10.0 3 19.1 52.3 9 2.2 13
## 5 C 76 24.7 10 9.4 6 16.9 55.6 4 0.4 24
## X3_PM. X3_PM._RANK REBOUNDS RBNDS_RANK AST ASTS_RANK STL STL_RANK BLK
## 1 41.0 9 7.3 9 8.3 3 1.3 5 0.5
## 2 37.6 14 8.1 8 4.9 18 1.0 8 0.6
## 3 33.0 30 10.2 5 4.2 22 1.0 8 1.7
## 4 41.3 7 6.6 12 5.0 17 0.9 9 1.2
## 5 27.1 33 12.6 1 3.5 26 1.2 6 2.3
## BLK_RANK TO X TOTAL_RANKING
## 1 12 NA NA 89
## 2 11 NA NA 105
## 3 3 NA NA 108
## 4 6 NA NA 108
## 5 1 NA NA 129
colnames(NBA)
## [1] "NAME" "RANKING" "TEAM_USA" "TEAM"
## [5] "WINS" "WIN_RANK" "AVG_ATT" "AVG_ATT_RANK"
## [9] "POSITION" "GP" "PPG" "PTS_RNK"
## [13] "FGM" "FGM_RNK" "FGA" "FG."
## [17] "FG._RANK" "X3_PM" "X3_PM_RANK" "X3_PM."
## [21] "X3_PM._RANK" "REBOUNDS" "RBNDS_RANK" "AST"
## [25] "ASTS_RANK" "STL" "STL_RANK" "BLK"
## [29] "BLK_RANK" "TO" "X" "TOTAL_RANKING"
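The printed rows above show TO and X as NA for every displayed player. A minimal sketch of how that check could be automated is below. The data frame `toy_nba` is a hypothetical stand-in built from a few printed values, not the real import; the imported `NBA` data frame would be substituted in practice.

```r
library(dplyr)

# Hypothetical stand-in for the imported sheet; values copied from the head() output.
toy_nba <- data.frame(
  NAME = c("Lebron James", "Jayson Tatum", "Joel Embiid"),
  PPG  = c(25.7, 26.9, 33.1),
  TO   = c(NA, NA, NA),
  X    = c(NA, NA, NA)
)

# Count missing values in each column.
na_counts <- colSums(is.na(toy_nba))

# Columns in which every row is missing.
empty_cols <- names(na_counts)[na_counts == nrow(toy_nba)]

# Drop the all-NA columns before analysis.
toy_clean <- toy_nba %>% select(-all_of(empty_cols))

empty_cols
names(toy_clean)
```

Columns that come back entirely empty usually point to blank spreadsheet columns that were exported along with the data, so removing them should not lose information.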
The data was taken from ESPN’s season datasets and cross-referenced with Basketball Reference (https://www.basketball-reference.com/leagues/NBA_2024_totals.html) to ensure its accuracy. The data was then put into a Google spreadsheet designed to use offensive and defensive stats from the 2023-24 season to see which players ranked highest across those stats. Because I used only one season’s stats to determine the outcomes of this study, a more in-depth analysis would require looking at each player’s entire NBA career.
The spreadsheet was organized to rank the top 45 U.S.-born NBA players in offensive and defensive categories. Each category was ranked, with 1 being the best and 45 being the worst. The rankings were then added up, giving us the best and worst players in this dataset (the lowest total is the best). I also included the season win total of each player’s team and that team’s average attendance per home game. Attendance was an attempt to capture a player’s popularity. The win category was included to represent a player’s efforts on the court, with success measured in wins. Again, this data set only represents the 2023-24 season.
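The ranking itself was built in the spreadsheet, but the same arithmetic can be sketched in R. The example below copies the rank columns for two players from the printed output and sums them; the column name `TOTAL_CHECK` is hypothetical. The second part assumes the category ranks are dense ranks (tied players share a rank and no rank is skipped), which is an inference from the printed PTS_RNK values rather than something the spreadsheet confirms.

```r
library(dplyr)

# Rank columns for two players, copied from the printed head() output.
ranks <- data.frame(
  NAME         = c("Lebron James", "Jayson Tatum"),
  WIN_RANK     = c(8, 1),
  AVG_ATT_RANK = c(10, 8),
  PTS_RNK      = c(8, 4),
  FGM_RNK      = c(5, 7),
  FG._RANK     = c(6, 22),
  X3_PM_RANK   = c(14, 4),
  X3_PM._RANK  = c(9, 14),
  RBNDS_RANK   = c(9, 8),
  ASTS_RANK    = c(3, 18),
  STL_RANK     = c(5, 8),
  BLK_RANK     = c(12, 11)
)

# Sum every column whose name ends in _RANK or _RNK.
ranks <- ranks %>%
  mutate(TOTAL_CHECK = rowSums(across(matches("_(RANK|RNK)$"))))

ranks$TOTAL_CHECK  # 89 and 105, the TOTAL_RANKING values printed for these players

# Assumed ranking rule: dense rank, highest value first.
ppg <- c(33.1, 28.7, 27.1, 27.1, 26.9)  # Embiid, Brunson, Durant, Booker, Tatum
ppg_rank <- dense_rank(desc(ppg))
ppg_rank           # 1 2 3 3 4, the PTS_RNK values printed for these players
```

Recomputing the totals this way is a quick guard against copy errors in the spreadsheet, since any mismatch with TOTAL_RANKING would stand out.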
I am now going to compare the stats of the 2024 men’s Olympic basketball team (12 players) with those of the top 12 players in my data set.
Let’s explore the 2024 Olympic team and view the data to see what we find.
NBA_2 <- filter (NBA, TEAM_USA == "yes")
head(NBA_2, n=12)
## NAME RANKING TEAM_USA TEAM WINS WIN_RANK AVG_ATT AVG_ATT_RANK
## 1 Lebron James 1 yes LAL 47 8 18903 10
## 2 Jayson Tatum 3 yes BOS 64 1 19156 8
## 3 Joel Embiid 5 yes PHI 47 8 20041 3
## 4 Kevin Durant 5 yes PHX 49 6 17071 23
## 5 Anthony Davis 8 yes LAL 47 8 18903 10
## 6 Devin Booker 10 yes PHX 49 6 17071 23
## 7 Anthony Edwards 12 yes MIN 56 3 18024 16
## 8 Stepehn Curry 13 yes GS 47 8 18064 15
## 9 Bam Adebayo 17 yes MIA 46 9 19749 4
## 10 D White 19 yes BOS 64 1 19156 8
## 11 Jrue Holiday 21 yes BOS 64 1 19156 8
## 12 Tyrese Haliburton 23 yes IND 47 8 16525 27
## POSITION GP PPG PTS_RNK FGM FGM_RNK FGA FG. FG._RANK X3_PM X3_PM_RANK
## 1 F 71 25.7 8 9.6 5 17.9 54.0 6 2.1 14
## 2 F 74 26.9 4 9.1 7 19.3 47.1 22 3.1 4
## 3 C 66 33.1 1 11.0 1 20.1 54.8 5 1.0 22
## 4 F 75 27.1 3 10.0 3 19.1 52.3 9 2.2 13
## 5 C 76 24.7 10 9.4 6 16.9 55.6 4 0.4 24
## 6 G 68 27.1 3 9.4 6 19.2 49.2 15 2.2 13
## 7 G 79 25.9 7 9.1 7 19.7 46.1 26 2.4 11
## 8 G 74 26.4 6 8.8 10 19.5 45.0 29 4.8 1
## 9 C 71 19.3 28 7.5 21 14.3 52.1 10 0.2 25
## 10 G 73 15.2 34 5.3 32 11.7 49.3 14 2.7 8
## 11 G 69 12.5 36 4.8 34 10.0 48.0 17 2.0 15
## 12 G 69 20.1 23 7.2 24 15.2 47.7 19 2.8 7
## X3_PM. X3_PM._RANK REBOUNDS RBNDS_RANK AST ASTS_RANK STL STL_RANK BLK
## 1 41.0 9 7.3 9 8.3 3 1.3 5 0.5
## 2 37.6 14 8.1 8 4.9 18 1.0 8 0.6
## 3 33.0 30 10.2 5 4.2 22 1.0 8 1.7
## 4 41.3 7 6.6 12 5.0 17 0.9 9 1.2
## 5 27.1 33 12.6 1 3.5 26 1.2 6 2.3
## 6 36.4 18 4.5 22 6.9 6 0.9 9 0.4
## 7 35.7 21 5.4 16 5.1 16 1.3 5 0.5
## 8 40.8 10 4.5 22 5.1 16 0.7 11 0.4
## 9 35.7 21 10.4 4 3.9 24 1.1 7 0.9
## 10 39.6 12 4.2 25 5.2 15 1.0 8 1.2
## 11 42.9 2 5.4 16 4.8 19 0.9 9 0.8
## 12 36.4 18 3.9 27 10.9 1 1.2 6 0.7
## BLK_RANK TO X TOTAL_RANKING
## 1 12 NA NA 89
## 2 11 NA NA 105
## 3 3 NA NA 108
## 4 6 NA NA 108
## 5 1 NA NA 129
## 6 13 NA NA 134
## 7 12 NA NA 140
## 8 13 NA NA 141
## 9 8 NA NA 161
## 10 6 NA NA 163
## 11 9 NA NA 166
## 12 10 NA NA 170
The first thing you see is that the U.S. team’s players ranked anywhere from 1 to 23, meaning that, according to the stats I used, the U.S. team was not made up of the 12 top-ranked players. Let’s explore the top 12 performing players according to my data analysis.
head(NBA, n=12)
## NAME RANKING TEAM_USA TEAM WINS WIN_RANK AVG_ATT AVG_ATT_RANK
## 1 Lebron James 1 yes LAL 47 8 18903 10
## 2 Kyrie Irving 2 no DAL 50 5 20217 2
## 3 Jayson Tatum 3 yes BOS 64 1 19156 8
## 4 Kawhi Leonard 4 no LAC 51 4 18945 9
## 5 Joel Embiid 5 yes PHI 47 8 20041 3
## 6 Kevin Durant 5 yes PHX 49 6 17071 23
## 7 Jalen Brunson 6 no NY 50 5 19728 5
## 8 De'Aaron Fox 7 no SAC 46 9 17927 17
## 9 Anthony Davis 8 yes LAL 47 8 18903 10
## 10 Tyrese Maxey 9 no PHI 47 8 20041 3
## 11 Devin Booker 10 yes PHX 49 6 17071 23
## 12 Paul George 11 no LAC 51 4 18945 9
## POSITION GP PPG PTS_RNK FGM FGM_RNK FGA FG. FG._RANK X3_PM X3_PM_RANK
## 1 F 71 25.7 8 9.6 5 17.9 54.0 6 2.1 14
## 2 G 58 25.6 9 9.7 4 19.5 49.7 13 3.0 5
## 3 F 74 26.9 4 9.1 7 19.3 47.1 22 3.1 4
## 4 F 68 23.7 13 9.0 8 17.1 52.5 7 2.1 14
## 5 C 66 33.1 1 11.0 1 20.1 54.8 5 1.0 22
## 6 F 75 27.1 3 10.0 3 19.1 52.3 9 2.2 13
## 7 G 77 28.7 2 10.3 2 21.4 47.9 18 2.7 8
## 8 G 74 26.6 5 9.7 4 20.9 46.5 23 2.9 6
## 9 C 76 24.7 10 9.4 6 16.9 55.6 4 0.4 24
## 10 G 70 25.9 7 9.1 7 20.3 45.0 29 3.0 5
## 11 G 68 27.1 3 9.4 6 19.2 49.2 15 2.2 13
## 12 F 74 22.6 17 7.9 17 16.7 47.1 22 3.3 3
## X3_PM. X3_PM._RANK REBOUNDS RBNDS_RANK AST ASTS_RANK STL STL_RANK BLK
## 1 41.0 9 7.3 9 8.3 3 1.3 5 0.5
## 2 41.1 8 5.0 20 5.2 15 1.3 5 0.5
## 3 37.6 14 8.1 8 4.9 18 1.0 8 0.6
## 4 41.7 4 6.1 13 3.6 25 1.6 2 0.9
## 5 33.0 30 10.2 5 4.2 22 1.0 8 1.7
## 6 41.3 7 6.6 12 5.0 17 0.9 9 1.2
## 7 40.1 11 3.6 30 6.7 7 0.9 9 0.2
## 8 36.9 17 4.6 21 5.6 12 2.0 1 0.4
## 9 27.1 33 12.6 1 3.5 26 1.2 6 2.3
## 10 37.3 15 3.7 29 6.2 9 1.0 8 0.5
## 11 36.4 18 4.5 22 6.9 6 0.9 9 0.4
## 12 41.3 7 5.2 18 3.5 26 1.5 3 0.5
## BLK_RANK TO X TOTAL_RANKING
## 1 12 NA NA 89
## 2 12 NA NA 98
## 3 11 NA NA 105
## 4 8 NA NA 107
## 5 3 NA NA 108
## 6 6 NA NA 108
## 7 15 NA NA 112
## 8 13 NA NA 128
## 9 1 NA NA 129
## 10 12 NA NA 132
## 11 13 NA NA 134
## 12 12 NA NA 138
According to our data, only 6 of the top 12 performing players in the 2023-24 season made the U.S. Olympic team. The outliers in this data are the highly ranked players who did not make the team: Kyrie Irving (ranked 2), Kawhi Leonard (ranked 4), and Jalen Brunson (ranked 6).
Let’s put some graphs together to try to figure out what the key factors in choosing the Olympic team could have been, because according to our ranking system the 12 players on the team were clearly not the 12 best choices.
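The count of Olympic selections among the top 12 can be computed rather than read off the table. This is a minimal sketch using a hypothetical data frame `top12` whose names, rankings, and TEAM_USA values are copied from the head(NBA, n = 12) output.

```r
library(dplyr)

# Hypothetical stand-in for the first 12 rows of NBA, copied from the printed output.
top12 <- data.frame(
  NAME = c("Lebron James", "Kyrie Irving", "Jayson Tatum", "Kawhi Leonard",
           "Joel Embiid", "Kevin Durant", "Jalen Brunson", "De'Aaron Fox",
           "Anthony Davis", "Tyrese Maxey", "Devin Booker", "Paul George"),
  RANKING  = c(1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10, 11),
  TEAM_USA = c("yes", "no", "yes", "no", "yes", "yes",
               "no", "no", "yes", "no", "yes", "no")
)

# How many of the top 12 were and were not selected.
selection_counts <- top12 %>% count(TEAM_USA)

# Players left off the team, best ranking first.
omitted <- top12 %>%
  filter(TEAM_USA == "no") %>%
  arrange(RANKING) %>%
  pull(NAME)

selection_counts
omitted
```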
top_12_NBA %>%
filter(TEAM_USA %in% c("yes", "no")) %>%
ggplot(aes(x = NAME, y = PPG, group = TEAM_USA, color = TEAM_USA)) +
theme(axis.text.x = element_text (angle = 45, hjust = 1, vjust = 1)) +
labs(title = "Top 12 NBA players points per game") +
geom_line()
top_12_NBA %>%
filter(TEAM_USA %in% c("yes", "no")) %>%
ggplot(aes(x = NAME, y = X3_PM., group = TEAM_USA, color = TEAM_USA)) +
theme(axis.text.x = element_text (angle = 45, hjust = 1, vjust = 1)) +
labs(title = "Top 12 NBA players 3 point shooting percentage") +
geom_line()
NBA %>%
filter(TEAM_USA %in% c("yes", "no")) %>%
ggplot(aes(x = NAME, y = PPG, group = TEAM_USA, color = TEAM_USA)) +
theme(axis.text.x = element_text (angle = 45, hjust = 1, vjust = 1)) +
labs(title = "Top 45 NBA players points per game") +
geom_line()
According to our data, there is no clear relationship between a player’s chances of making the team and either a high points-per-game average or a high 3-point percentage.
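A numeric summary can back up what the line charts suggest. The sketch below compares group averages for selected and non-selected players. The data frame `top12_stats` is a hypothetical stand-in holding only the 12 players printed by head(NBA, n = 12), so it illustrates the technique and does not describe all 45 players.

```r
library(dplyr)

# Values copied from the head(NBA, n = 12) output; a 12-player subset only.
top12_stats <- data.frame(
  NAME = c("Lebron James", "Kyrie Irving", "Jayson Tatum", "Kawhi Leonard",
           "Joel Embiid", "Kevin Durant", "Jalen Brunson", "De'Aaron Fox",
           "Anthony Davis", "Tyrese Maxey", "Devin Booker", "Paul George"),
  TEAM_USA = c("yes", "no", "yes", "no", "yes", "yes",
               "no", "no", "yes", "no", "yes", "no"),
  PPG    = c(25.7, 25.6, 26.9, 23.7, 33.1, 27.1, 28.7, 26.6, 24.7, 25.9, 27.1, 22.6),
  X3_PM. = c(41.0, 41.1, 37.6, 41.7, 33.0, 41.3, 40.1, 36.9, 27.1, 37.3, 36.4, 41.3)
)

# Average scoring and 3-point percentage for each group.
group_means <- top12_stats %>%
  group_by(TEAM_USA) %>%
  summarise(
    n           = n(),
    mean_ppg    = mean(PPG),
    mean_3p_pct = mean(X3_PM.)
  )

group_means
```

Running the same summary on the full NBA data frame would give a fairer comparison, since the non-selected players in this subset are by construction among the highest ranked.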
Let’s explore some defensive stats to see if we see any trends in those categories.
NBA %>%
filter(TEAM_USA %in% c("yes", "no")) %>%
ggplot(aes(x = NAME, y = REBOUNDS, group = TEAM_USA, color = TEAM_USA)) +
theme(axis.text.x = element_text (angle = 45, hjust = 1, vjust = 1)) +
labs(title = "Top 45 NBA players rebounds per game") +
geom_line()
NBA %>%
filter(TEAM_USA %in% c("yes", "no")) %>%
ggplot(aes(x = NAME, y = STL, group = TEAM_USA, color = TEAM_USA)) +
theme(axis.text.x = element_text (angle = 45, hjust = 1, vjust = 1)) +
labs(title = "Top 45 NBA players steals per game") +
geom_line()
Defensive categories do not appear to have driven the final decisions. Although the top rebounder did make the team, there were many players in the top 5 of this category who did not.
Finally, for fun, I am going to see if I can draw any insights from the data collected on average attendance per game and wins per season.
ggplot(NBA, aes(x = TEAM_USA, y = AVG_ATT)) +
geom_point() + geom_smooth() +
labs(title = "Average attendance per game of team = Olympics?")
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
ggplot(NBA, aes(x = TEAM_USA, y = WINS)) +
geom_point() + geom_smooth() +
labs(title = "Does a winning team = Olympics?")
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Attendance did not make a difference in a player’s chances of making the Olympic team, but it did seem to help if the player was on a winning team. Players on the teams with the most wins in the 2023-24 season appeared to have a better chance of making the U.S. Olympic team.
Team USA had a very talented team and ultimately won the gold medal, but according to our data analysis the team was not the best it could have been based on the stats from the 2023-24 season. The U.S. roster consisted of players ranked from number 1 to number 23. The most noticeable player who did not make the team was Jalen Brunson. He was one of the leaders in points per game and was ranked 6 overall in our ranking system. He also played on a team with a winning record and in one of the biggest markets in the country, New York City, making him a fan favorite. It raises the question: why did Jalen Brunson not make the team?
The conclusions I did draw were that Team USA clearly valued players who were on winning NBA teams. They also favored players who had a strong 3-point shot. They also valued play outside of shooting, as shown by the rebound graph. Overall, I concluded that Team USA built a team that was strong in all aspects of the game. They had players with strong shooting skills, strong defense, and players who knew how to win. They put together a well-rounded team that would bring home the gold. A more in-depth analysis spanning a longer period of time would give more insight into why the U.S. chose the players it did, but according to the 2023-24 NBA season stats the 12 players were not the best choices by my data analysis. Thank you for taking the time to read this case study.
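Because TEAM_USA is a yes/no category, a box plot with the individual points overlaid may show the group difference more directly than a smoother, which is designed for a continuous x axis. This is a sketch under that assumption; `wins_df` is a hypothetical stand-in that contains only the 12 players printed by head(NBA, n = 12), so it shows the technique and should not be read as the result for all 45 players.

```r
library(ggplot2)

# Values copied from the head(NBA, n = 12) output; a 12-player subset only.
wins_df <- data.frame(
  TEAM_USA = c("yes", "no", "yes", "no", "yes", "yes",
               "no", "no", "yes", "no", "yes", "no"),
  WINS     = c(47, 50, 64, 51, 47, 49, 50, 46, 47, 47, 49, 51)
)

# Median team wins per group as a quick numeric summary.
median_wins <- tapply(wins_df$WINS, wins_df$TEAM_USA, median)

# Box plot per group, with each player's team wins drawn as a jittered point.
p <- ggplot(wins_df, aes(x = TEAM_USA, y = WINS)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.1, height = 0) +
  labs(title = "Team wins by Olympic selection (top 12 only)")

median_wins
p
```

The same plot with `y = AVG_ATT` on the full NBA data frame would give the matching view for attendance.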