Exploring Performance Metrics in NCAA Division 1 Volleyball Teams, 2022-2023 Season
Dataset: The dataset used in this analysis is sourced from the NCAA Division 1 volleyball teams’ performance during the 2022-2023 season. The dataset contains 334 rows and 14 columns, representing various metrics for each team. These metrics include performance indicators such as aces per set, assists per set, team attacks per set, blocks per set, digs per set, hitting percentage, kills per set, opponent hitting percentage, win-loss record, and more.
Data Source: The data was collected by the SCORE Sports Data Repository.
Variables:
Team: Name of the college volleyball team.
Conference: The conference to which the team belongs.
region: The region to which the team belongs.
aces_per_set: Average number of serves leading directly to a point per set.
assists_per_set: Average number of sets, passes, or digs resulting in a kill per set.
team_attacks_per_set: Average number of times the ball is sent to the opponent’s court per set.
blocks_per_set: Average number of times the ball is blocked per set.
digs_per_set: Average number of successful passes after an opponent’s attack per set.
hitting_pctg: Hitting percentage, conventionally computed as kills minus attack errors, divided by total attack attempts. Stored as a proportion (for example, 0.208).
kills_per_set: Average number of hits resulting in a point per set.
opp_hitting_pctg: Hitting percentage achieved by the team’s opponents.
W: Number of team wins for the season.
L: Number of team losses for the season.
win_loss_pctg: Proportion of matches won, computed as wins divided by the total matches of the season.
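As a quick check on the last definition, the win-loss percentage can be recomputed from W and L. The sketch below uses a small made-up data frame (the team names and records are hypothetical, not rows from the dataset) and assumes the value is stored as a proportion between 0 and 1, which matches the summary output shown in section 2.

```r
# Hypothetical records for three teams (not taken from the dataset)
records <- data.frame(
  Team = c("Team A", "Team B", "Team C"),
  W = c(20, 15, 5),
  L = c(10, 15, 25)
)

# Proportion of matches won: wins divided by total matches played
records$win_loss_pctg <- records$W / (records$W + records$L)

print(records)
```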
Reason for Choosing Topic and Dataset:
Volleyball is a sport that involves a blend of athleticism, strategy, and teamwork. Analyzing performance metrics in NCAA Division 1 volleyball teams allows for a deeper understanding of the factors contributing to team success. This dataset provides an opportunity to explore which performance metrics are associated with winning. Also, I had never paid attention to volleyball before, and I was curious about the sport and what it’s about.
1. The necessary libraries
library(readr)
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(ggplot2)
Warning: package 'ggplot2' was built under R version 4.3.3
library(tidyr)
library(broom) # suggested by ChatGPT, so I gave it a try
library(highcharter)
Warning: package 'highcharter' was built under R version 4.3.3
Registered S3 method overwritten by 'quantmod':
method from
as.zoo.data.frame zoo
Rows: 334 Columns: 14
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): Team, Conference, region
dbl (11): aces_per_set, assists_per_set, team_attacks_per_set, blocks_per_se...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
2. Data cleaning and exploration
# Display structure of the dataset
str(volleyball_data)
Team Conference region aces_per_set
Length:334 Length:334 Length:334 Min. :0.900
Class :character Class :character Class :character 1st Qu.:1.310
Mode :character Mode :character Mode :character Median :1.455
Mean :1.465
3rd Qu.:1.610
Max. :2.330
assists_per_set team_attacks_per_set blocks_per_set digs_per_set
Min. : 4.44 Min. :24.25 Min. :0.600 Min. : 7.42
1st Qu.:10.87 1st Qu.:33.35 1st Qu.:1.810 1st Qu.:13.33
Median :11.54 Median :34.47 Median :2.070 Median :14.32
Mean :11.43 Mean :34.46 Mean :2.057 Mean :14.35
3rd Qu.:12.14 3rd Qu.:35.88 3rd Qu.:2.300 3rd Qu.:15.35
Max. :13.80 Max. :39.78 Max. :3.330 Max. :18.53
hitting_pctg kills_per_set opp_hitting_pctg W
Min. :0.0790 Min. : 4.92 Min. :0.1280 Min. : 0.00
1st Qu.:0.1830 1st Qu.:11.78 1st Qu.:0.1870 1st Qu.:10.00
Median :0.2080 Median :12.46 Median :0.2055 Median :15.00
Mean :0.2079 Mean :12.37 Mean :0.2076 Mean :15.13
3rd Qu.:0.2330 3rd Qu.:13.14 3rd Qu.:0.2270 3rd Qu.:19.00
Max. :0.3360 Max. :14.75 Max. :0.3380 Max. :31.00
NA's :2
L win_loss_pctg
Min. : 1.00 Min. :0.0000
1st Qu.:11.00 1st Qu.:0.3450
Median :15.00 Median :0.5155
Mean :14.72 Mean :0.4996
3rd Qu.:19.00 3rd Qu.:0.6352
Max. :31.00 Max. :0.9660
# Check for missing values
sum(is.na(volleyball_data))
[1] 3
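The total of 3 does not say where the gaps are. A minimal sketch of a per-column count, using a small hypothetical data frame in place of volleyball_data (the values are made up for illustration):

```r
# Hypothetical stand-in for volleyball_data with a few missing values
toy <- data.frame(
  hitting_pctg = c(0.21, NA, 0.25, NA),
  W = c(12, 18, NA, 9),
  L = c(15, 10, 20, 14)
)

# Number of missing values in each column
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Row numbers that contain at least one missing value
rows_with_na <- which(!complete.cases(toy))
print(rows_with_na)
```

On the real data, the same idea would be colSums(is.na(volleyball_data)).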
# Explore unique values in categorical variables
unique(volleyball_data$Conference)
lm_model <- lm(win_loss_pctg ~ ., data = selected_data)
summary(lm_model)
Call:
lm(formula = win_loss_pctg ~ ., data = selected_data)
Residuals:
ALL 331 residuals are 0: no residual degrees of freedom!
Coefficients: (45 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.394 NaN NaN NaN
TeamAbilene Christian -0.125 NaN NaN NaN
TeamAir Force 0.106 NaN NaN NaN
TeamAkron -0.187 NaN NaN NaN
TeamAlabama -0.061 NaN NaN NaN
TeamAlabama A&M -0.182 NaN NaN NaN
TeamAlabama St. 0.177 NaN NaN NaN
TeamAlcorn -0.222 NaN NaN NaN
TeamAmerican 0.106 NaN NaN NaN
TeamApp State 0.249 NaN NaN NaN
TeamArizona 0.122 NaN NaN NaN
TeamArizona St. 0.012 NaN NaN NaN
TeamArk.-Pine Bluff -0.104 NaN NaN NaN
TeamArkansas 0.306 NaN NaN NaN
TeamArkansas St. -0.104 NaN NaN NaN
TeamArmy West Point 0.273 NaN NaN NaN
TeamAuburn 0.316 NaN NaN NaN
TeamAustin Peay 0.050 NaN NaN NaN
TeamBall St. 0.333 NaN NaN NaN
TeamBaylor 0.387 NaN NaN NaN
TeamBelmont -0.136 NaN NaN NaN
TeamBethune-Cookman -0.094 NaN NaN NaN
TeamBinghamton 0.260 NaN NaN NaN
TeamBoise St. 0.123 NaN NaN NaN
TeamBoston College 0.255 NaN NaN NaN
TeamBowling Green 0.294 NaN NaN NaN
TeamBradley -0.071 NaN NaN NaN
TeamBrown 0.206 NaN NaN NaN
TeamBryant 0.219 NaN NaN NaN
TeamBucknell 0.183 NaN NaN NaN
TeamBuffalo 0.182 NaN NaN NaN
TeamButler 0.122 NaN NaN NaN
TeamBYU 0.365 NaN NaN NaN
TeamCal Poly 0.173 NaN NaN NaN
TeamCal St. Fullerton 0.070 NaN NaN NaN
TeamCalifornia -0.161 NaN NaN NaN
TeamCalifornia Baptist 0.068 NaN NaN NaN
TeamCampbell 0.239 NaN NaN NaN
TeamCanisius 0.070 NaN NaN NaN
TeamCentral Ark. 0.239 NaN NaN NaN
TeamCentral Conn. St. 0.068 NaN NaN NaN
TeamCentral Mich. 0.231 NaN NaN NaN
TeamCharleston So. 0.090 NaN NaN NaN
TeamCharlotte 0.013 NaN NaN NaN
TeamChattanooga 0.121 NaN NaN NaN
TeamChicago St. -0.046 NaN NaN NaN
TeamCincinnati -0.027 NaN NaN NaN
TeamClemson 0.025 NaN NaN NaN
TeamCleveland St. 0.058 NaN NaN NaN
TeamCoastal Carolina 0.173 NaN NaN NaN
TeamCol. of Charleston 0.073 NaN NaN NaN
TeamColgate 0.406 NaN NaN NaN
TeamColorado 0.251 NaN NaN NaN
TeamColorado St. 0.239 NaN NaN NaN
TeamColumbia -0.133 NaN NaN NaN
TeamCoppin St. 0.282 NaN NaN NaN
TeamCornell -0.133 NaN NaN NaN
TeamCreighton 0.450 NaN NaN NaN
TeamCSU Bakersfield -0.175 NaN NaN NaN
TeamCSUN -0.113 NaN NaN NaN
TeamDartmouth 0.246 NaN NaN NaN
TeamDavidson 0.282 NaN NaN NaN
TeamDayton 0.194 NaN NaN NaN
TeamDelaware 0.213 NaN NaN NaN
TeamDelaware St. 0.380 NaN NaN NaN
TeamDenver 0.219 NaN NaN NaN
TeamDePaul -0.061 NaN NaN NaN
TeamDrake 0.395 NaN NaN NaN
TeamDuke 0.158 NaN NaN NaN
TeamDuquesne -0.127 NaN NaN NaN
TeamEast Carolina -0.019 NaN NaN NaN
TeamEastern Ill. 0.106 NaN NaN NaN
TeamEastern Ky. 0.106 NaN NaN NaN
TeamEastern Mich. -0.227 NaN NaN NaN
TeamEastern Wash. -0.015 NaN NaN NaN
TeamElon 0.087 NaN NaN NaN
TeamETSU 0.330 NaN NaN NaN
TeamEvansville 0.063 NaN NaN NaN
TeamFairfield 0.387 NaN NaN NaN
TeamFDU 0.121 NaN NaN NaN
TeamFGCU 0.394 NaN NaN NaN
TeamFIU -0.118 NaN NaN NaN
TeamFla. Atlantic 0.106 NaN NaN NaN
TeamFlorida 0.412 NaN NaN NaN
TeamFlorida A&M 0.224 NaN NaN NaN
TeamFlorida St. 0.239 NaN NaN NaN
TeamFordham 0.054 NaN NaN NaN
TeamFresno St. -0.161 NaN NaN NaN
TeamFurman -0.027 NaN NaN NaN
TeamGa. Southern 0.177 NaN NaN NaN
TeamGardner-Webb -0.073 NaN NaN NaN
TeamGeorge Mason -0.153 NaN NaN NaN
TeamGeorge Washington 0.151 NaN NaN NaN
TeamGeorgetown -0.256 NaN NaN NaN
TeamGeorgia 0.348 NaN NaN NaN
TeamGeorgia St. -0.153 NaN NaN NaN
TeamGeorgia Tech 0.330 NaN NaN NaN
TeamGonzaga -0.187 NaN NaN NaN
TeamGrambling 0.162 NaN NaN NaN
TeamGrand Canyon 0.227 NaN NaN NaN
TeamGreen Bay 0.282 NaN NaN NaN
TeamHampton -0.236 NaN NaN NaN
TeamHarvard -0.167 NaN NaN NaN
TeamHawaii 0.365 NaN NaN NaN
TeamHigh Point 0.303 NaN NaN NaN
TeamHofstra 0.192 NaN NaN NaN
TeamHoly Cross -0.283 NaN NaN NaN
TeamHouston 0.488 NaN NaN NaN
TeamHouston Christian 0.282 NaN NaN NaN
TeamHoward 0.273 NaN NaN NaN
TeamIdaho -0.251 NaN NaN NaN
TeamIdaho St. 0.039 NaN NaN NaN
TeamIllinois 0.106 NaN NaN NaN
TeamIllinois St. 0.020 NaN NaN NaN
TeamIndiana 0.106 NaN NaN NaN
TeamIndiana St. -0.279 NaN NaN NaN
TeamIona 0.192 NaN NaN NaN
TeamIowa -0.071 NaN NaN NaN
TeamIowa St. 0.231 NaN NaN NaN
TeamIUPUI -0.094 NaN NaN NaN
TeamJackson St. 0.039 NaN NaN NaN
TeamJacksonville -0.001 NaN NaN NaN
TeamJacksonville St. 0.406 NaN NaN NaN
TeamJames Madison 0.434 NaN NaN NaN
TeamKansas 0.239 NaN NaN NaN
TeamKansas City -0.071 NaN NaN NaN
TeamKansas St. 0.123 NaN NaN NaN
TeamKennesaw St. 0.249 NaN NaN NaN
TeamKent St. 0.054 NaN NaN NaN
TeamKentucky 0.339 NaN NaN NaN
TeamLafayette -0.046 NaN NaN NaN
TeamLamar University -0.084 NaN NaN NaN
TeamLehigh 0.070 NaN NaN NaN
TeamLiberty 0.325 NaN NaN NaN
TeamLipscomb 0.158 NaN NaN NaN
TeamLittle Rock -0.168 NaN NaN NaN
TeamLIU 0.125 NaN NaN NaN
TeamLMU 0.249 NaN NaN NaN
TeamLong Beach St. 0.285 NaN NaN NaN
TeamLouisiana 0.142 NaN NaN NaN
TeamLouisiana Tech -0.015 NaN NaN NaN
TeamLouisville 0.518 NaN NaN NaN
TeamLoyola Chicago 0.341 NaN NaN NaN
TeamLoyola Maryland 0.046 NaN NaN NaN
TeamLSU 0.139 NaN NaN NaN
TeamManhattan -0.360 NaN NaN NaN
TeamMarist 0.227 NaN NaN NaN
TeamMarquette 0.485 NaN NaN NaN
TeamMarshall -0.061 NaN NaN NaN
TeamMaryland 0.106 NaN NaN NaN
TeamMcNeese 0.135 NaN NaN NaN
TeamMemphis 0.151 NaN NaN NaN
TeamMercer -0.049 NaN NaN NaN
TeamMiami 0.239 NaN NaN NaN
TeamMiami (OH) -0.153 NaN NaN NaN
TeamMichigan 0.173 NaN NaN NaN
TeamMichigan St. 0.025 NaN NaN NaN
TeamMiddle Tenn. 0.123 NaN NaN NaN
TeamMilwaukee -0.061 NaN NaN NaN
TeamMinnesota 0.316 NaN NaN NaN
TeamMississippi St. 0.142 NaN NaN NaN
TeamMissouri -0.073 NaN NaN NaN
TeamMissouri St. -0.094 NaN NaN NaN
TeamMontana 0.192 NaN NaN NaN
TeamMontana St. 0.073 NaN NaN NaN
TeamMorehead St. 0.073 NaN NaN NaN
TeamMorgan St. -0.291 NaN NaN NaN
TeamMurray St. 0.020 NaN NaN NaN
TeamN.C. A&T -0.048 NaN NaN NaN
TeamN.C. Central -0.125 NaN NaN NaN
TeamNavy 0.177 NaN NaN NaN
TeamNC State 0.158 NaN NaN NaN
TeamNebraska 0.418 NaN NaN NaN
TeamNevada 0.089 NaN NaN NaN
TeamNew Hampshire 0.261 NaN NaN NaN
TeamNew Mexico 0.192 NaN NaN NaN
TeamNew Mexico St. 0.177 NaN NaN NaN
TeamNew Orleans 0.090 NaN NaN NaN
TeamNiagara -0.061 NaN NaN NaN
TeamNicholls -0.161 NaN NaN NaN
TeamNIU 0.213 NaN NaN NaN
TeamNJIT -0.118 NaN NaN NaN
TeamNorfolk St. -0.114 NaN NaN NaN
TeamNorth Ala. 0.070 NaN NaN NaN
TeamNorth Carolina 0.192 NaN NaN NaN
TeamNorth Dakota 0.006 NaN NaN NaN
TeamNorth Dakota St. 0.242 NaN NaN NaN
TeamNorth Florida 0.020 NaN NaN NaN
TeamNorth Texas 0.122 NaN NaN NaN
TeamNortheastern 0.125 NaN NaN NaN
TeamNorthern Ariz. -0.177 NaN NaN NaN
TeamNorthern Colo. 0.316 NaN NaN NaN
TeamNorthern Ky. 0.187 NaN NaN NaN
TeamNorthwestern 0.168 NaN NaN NaN
TeamNorthwestern St. 0.200 NaN NaN NaN
TeamNotre Dame -0.037 NaN NaN NaN
TeamOakland -0.027 NaN NaN NaN
TeamOhio 0.231 NaN NaN NaN
TeamOhio St. 0.294 NaN NaN NaN
TeamOklahoma 0.142 NaN NaN NaN
TeamOld Dominion 0.035 NaN NaN NaN
TeamOle Miss -0.001 NaN NaN NaN
TeamOmaha 0.251 NaN NaN NaN
TeamOral Roberts -0.094 NaN NaN NaN
TeamOregon 0.418 NaN NaN NaN
TeamOregon St. -0.161 NaN NaN NaN
TeamPacific 0.168 NaN NaN NaN
TeamPenn -0.311 NaN NaN NaN
TeamPenn St. 0.371 NaN NaN NaN
TeamPepperdine 0.239 NaN NaN NaN
TeamPittsburgh 0.492 NaN NaN NaN
TeamPortland -0.049 NaN NaN NaN
TeamPortland St. 0.187 NaN NaN NaN
TeamPrairie View -0.144 NaN NaN NaN
TeamPresbyterian -0.073 NaN NaN NaN
TeamPrinceton 0.446 NaN NaN NaN
TeamProvidence 0.006 NaN NaN NaN
TeamPurdue 0.262 NaN NaN NaN
TeamPurdue Fort Wayne -0.084 NaN NaN NaN
TeamQuinnipiac 0.089 NaN NaN NaN
TeamRadford 0.035 NaN NaN NaN
TeamRhode Island -0.200 NaN NaN NaN
TeamRice 0.477 NaN NaN NaN
TeamRider -0.039 NaN NaN NaN
TeamRobert Morris -0.094 NaN NaN NaN
TeamRutgers -0.144 NaN NaN NaN
TeamSacramento St. 0.106 NaN NaN NaN
TeamSacred Heart 0.283 NaN NaN NaN
TeamSaint Francis 0.035 NaN NaN NaN
TeamSaint Louis 0.187 NaN NaN NaN
TeamSaint Mary's -0.048 NaN NaN NaN
TeamSaint Peter's -0.363 NaN NaN NaN
TeamSam Houston -0.108 NaN NaN NaN
TeamSamford 0.200 NaN NaN NaN
TeamSan Diego 0.545 NaN NaN NaN
TeamSan Diego St. -0.039 NaN NaN NaN
TeamSan Francisco 0.089 NaN NaN NaN
TeamSan Jose St. 0.306 NaN NaN NaN
TeamSanta Clara 0.012 NaN NaN NaN
TeamSeattle U -0.212 NaN NaN NaN
TeamSeton Hall 0.106 NaN NaN NaN
TeamSFA 0.445 NaN NaN NaN
TeamSiena 0.039 NaN NaN NaN
TeamSIUE 0.039 NaN NaN NaN
TeamSMU 0.294 NaN NaN NaN
TeamSouth Alabama 0.187 NaN NaN NaN
TeamSouth Carolina 0.070 NaN NaN NaN
TeamSouth Dakota 0.485 NaN NaN NaN
TeamSouth Dakota St. 0.151 NaN NaN NaN
TeamSouth Fla. -0.061 NaN NaN NaN
TeamSoutheast Mo. St. 0.121 NaN NaN NaN
TeamSoutheastern La. 0.364 NaN NaN NaN
TeamSouthern California 0.273 NaN NaN NaN
TeamSouthern Ill. 0.187 NaN NaN NaN
TeamSouthern Miss. 0.242 NaN NaN NaN
TeamSouthern U. -0.287 NaN NaN NaN
TeamSouthern Utah -0.102 NaN NaN NaN
TeamSt. Francis Brooklyn 0.058 NaN NaN NaN
TeamSt. John's 0.194 NaN NaN NaN
TeamStanford 0.450 NaN NaN NaN
TeamStetson 0.123 NaN NaN NaN
TeamStony Brook -0.009 NaN NaN NaN
TeamSyracuse -0.001 NaN NaN NaN
TeamTCU 0.213 NaN NaN NaN
TeamTemple -0.071 NaN NaN NaN
TeamTennessee 0.154 NaN NaN NaN
TeamTennessee St. 0.149 NaN NaN NaN
TeamTennessee Tech 0.154 NaN NaN NaN
TeamTexas 0.572 NaN NaN NaN
TeamTexas A&M 0.054 NaN NaN NaN
TeamTexas Southern -0.073 NaN NaN NaN
TeamTexas St. 0.380 NaN NaN NaN
TeamTexas Tech 0.158 NaN NaN NaN
TeamThe Citadel -0.027 NaN NaN NaN
TeamToledo 0.200 NaN NaN NaN
TeamTowson 0.541 NaN NaN NaN
TeamTroy 0.187 NaN NaN NaN
TeamTulane -0.104 NaN NaN NaN
TeamTulsa 0.025 NaN NaN NaN
TeamUAB 0.050 NaN NaN NaN
TeamUAlbany -0.135 NaN NaN NaN
TeamUC Davis 0.106 NaN NaN NaN
TeamUC Irvine 0.273 NaN NaN NaN
TeamUC Riverside -0.256 NaN NaN NaN
TeamUC Santa Barbara 0.273 NaN NaN NaN
TeamUCF 0.539 NaN NaN NaN
TeamUCLA 0.158 NaN NaN NaN
TeamUConn 0.154 NaN NaN NaN
TeamUIC 0.294 NaN NaN NaN
TeamUIW -0.187 NaN NaN NaN
TeamULM -0.175 NaN NaN NaN
TeamUMBC 0.260 NaN NaN NaN
TeamUMES -0.061 NaN NaN NaN
TeamUNC Asheville -0.240 NaN NaN NaN
TeamUNC Greensboro -0.061 NaN NaN NaN
TeamUNCW -0.154 NaN NaN NaN
TeamUNI 0.377 NaN NaN NaN
TeamUNLV 0.445 NaN NaN NaN
TeamUSC Upstate -0.187 NaN NaN NaN
TeamUT Arlington 0.236 NaN NaN NaN
TeamUT Martin 0.263 NaN NaN NaN
TeamUtah 0.090 NaN NaN NaN
TeamUtah St. 0.273 NaN NaN NaN
TeamUtah Valley 0.227 NaN NaN NaN
TeamUTEP 0.173 NaN NaN NaN
TeamUTRGV 0.400 NaN NaN NaN
TeamUTSA -0.086 NaN NaN NaN
TeamValparaiso 0.273 NaN NaN NaN
TeamVCU 0.058 NaN NaN NaN
TeamVillanova -0.061 NaN NaN NaN
TeamVirginia 0.020 NaN NaN NaN
TeamVirginia Tech -0.015 NaN NaN NaN
TeamWake Forest 0.154 NaN NaN NaN
TeamWashington 0.251 NaN NaN NaN
TeamWashington St. 0.303 NaN NaN NaN
TeamWeber St. 0.192 NaN NaN NaN
TeamWest Virginia -0.153 NaN NaN NaN
TeamWestern Caro. 0.212 NaN NaN NaN
TeamWestern Ill. -0.261 NaN NaN NaN
TeamWestern Ky. 0.485 NaN NaN NaN
TeamWestern Mich. 0.173 NaN NaN NaN
TeamWichita St. 0.187 NaN NaN NaN
TeamWilliam & Mary 0.068 NaN NaN NaN
TeamWinthrop 0.177 NaN NaN NaN
TeamWisconsin 0.481 NaN NaN NaN
TeamWofford 0.158 NaN NaN NaN
TeamWright St. 0.481 NaN NaN NaN
TeamWyoming -0.061 NaN NaN NaN
TeamXavier 0.242 NaN NaN NaN
TeamYale 0.491 NaN NaN NaN
TeamYoungstown St. 0.073 NaN NaN NaN
ConferenceACC NA NA NA NA
ConferenceAmerica East NA NA NA NA
ConferenceASUN NA NA NA NA
ConferenceAtlantic 10 NA NA NA NA
ConferenceBig 12 NA NA NA NA
ConferenceBig East NA NA NA NA
ConferenceBig Sky NA NA NA NA
ConferenceBig South NA NA NA NA
ConferenceBig Ten NA NA NA NA
ConferenceBig West NA NA NA NA
ConferenceC-USA NA NA NA NA
ConferenceCAA NA NA NA NA
ConferenceDI Independent NA NA NA NA
ConferenceHorizon NA NA NA NA
ConferenceIvy League NA NA NA NA
ConferenceMAAC NA NA NA NA
ConferenceMAC NA NA NA NA
ConferenceMEAC NA NA NA NA
ConferenceMountain West NA NA NA NA
ConferenceMVC NA NA NA NA
ConferenceNEC NA NA NA NA
ConferenceOVC NA NA NA NA
ConferencePac-12 NA NA NA NA
ConferencePatriot NA NA NA NA
ConferenceSEC NA NA NA NA
ConferenceSoCon NA NA NA NA
ConferenceSouthland NA NA NA NA
ConferenceSummit League NA NA NA NA
ConferenceSun Belt NA NA NA NA
ConferenceSWAC NA NA NA NA
ConferenceWAC NA NA NA NA
ConferenceWCC NA NA NA NA
regionMidwest NA NA NA NA
regionSouth NA NA NA NA
regionSoutheast NA NA NA NA
regionWest NA NA NA NA
aces_per_set NA NA NA NA
assists_per_set NA NA NA NA
blocks_per_set NA NA NA NA
digs_per_set NA NA NA NA
hitting_pctg NA NA NA NA
kills_per_set NA NA NA NA
opp_hitting_pctg NA NA NA NA
W NA NA NA NA
L NA NA NA NA
Residual standard error: NaN on 0 degrees of freedom
(3 observations deleted due to missingness)
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 330 and 0 DF, p-value: NA
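The output above reports zero residual degrees of freedom because the formula win_loss_pctg ~ . includes Team, and every team name is unique. The model gets one dummy variable per row and reproduces each value exactly, so the remaining coefficients cannot be estimated. W and L also determine win_loss_pctg directly, so they are not informative predictors. One possible fix, sketched below on simulated data (the numbers and the choice of predictors are assumptions for illustration, not results from this dataset), is to regress on the numeric performance metrics only.

```r
set.seed(1)

# Simulated stand-in for selected_data: 50 teams, each with a unique name
n <- 50
sim <- data.frame(
  Team = paste("Team", seq_len(n)),
  hitting_pctg = runif(n, 0.10, 0.30),
  opp_hitting_pctg = runif(n, 0.15, 0.30),
  aces_per_set = runif(n, 1.0, 2.0)
)
sim$win_loss_pctg <- 0.5 + 3 * sim$hitting_pctg - 3 * sim$opp_hitting_pctg +
  rnorm(n, sd = 0.05)

# Saturated model: one level of Team per row leaves no residual degrees of freedom
saturated <- lm(win_loss_pctg ~ ., data = sim)
df.residual(saturated)

# Dropping the identifier column leaves a model that can actually be estimated
numeric_only <- lm(win_loss_pctg ~ hitting_pctg + opp_hitting_pctg + aces_per_set,
                   data = sim)
df.residual(numeric_only)
summary(numeric_only)$adj.r.squared
```

With the identifier removed, the residual degrees of freedom equal the number of rows minus the number of estimated coefficients, and the standard errors, p-values, and adjusted R-squared become meaningful.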
5. Analysis of linear regression model
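# Equation for the model: intercept plus coefficient * term name for each estimated term
coefs <- round(coef(lm_model), 3)
coefs <- coefs[!is.na(coefs)]
equation <- paste("win_loss_pctg =", coefs[1], "+",
                  paste(coefs[-1], "*", names(coefs)[-1], collapse = " + "))
# P-values
p_values <- tidy(lm_model)$p.value
# Adjusted R-squared value (ChatGPT helped with the R-squared step)
adjusted_r_squared <- summary(lm_model)$adj.r.squared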
6. Explore variables for final visualization
# Scatter plot of aces_per_set vs. win_loss_pctg
selected_data %>%
  ggplot(aes(x = aces_per_set, y = win_loss_pctg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +
  labs(title = "Relationship between Aces per Set and Win-Loss Percentage",
       x = "Aces per Set",
       y = "Win-Loss Percentage") +
  theme_minimal() +
  theme(plot.title = element_text(size = 14, face = "bold"))
`geom_smooth()` using formula = 'y ~ x'
# Scatter plot of assists_per_set vs. win_loss_pctg
selected_data %>%
  ggplot(aes(x = assists_per_set, y = win_loss_pctg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "green") +
  labs(title = "Relationship between Assists per Set and Win-Loss Percentage",
       x = "Assists per Set",
       y = "Win-Loss Percentage") +
  theme_minimal() +
  theme(plot.title = element_text(size = 14, face = "bold"))
`geom_smooth()` using formula = 'y ~ x'
# Scatter plot of hitting_pctg vs. win_loss_pctg
selected_data %>%
  ggplot(aes(x = hitting_pctg, y = win_loss_pctg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Relationship between Hitting Percentage and Win-Loss Percentage",
       x = "Hitting Percentage",
       y = "Win-Loss Percentage") +
  theme_minimal() +
  theme(plot.title = element_text(size = 14, face = "bold"))
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_point()`).
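The warnings above appear to come from two rows with a missing hitting_pctg. To put a number on the relationships shown in the scatter plots, one option is a Pearson correlation that skips incomplete pairs. A sketch with made-up values (not from the dataset):

```r
# Hypothetical values standing in for two columns of selected_data
hitting_pctg <- c(0.18, 0.21, NA, 0.25, 0.30, 0.15)
win_loss_pctg <- c(0.35, 0.50, 0.45, 0.62, 0.80, 0.30)

# Default behaviour: any missing value makes the result NA
cor(hitting_pctg, win_loss_pctg)

# use = "complete.obs" drops pairs where either value is missing
r <- cor(hitting_pctg, win_loss_pctg, use = "complete.obs")
print(r)
```

A value close to 1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one. Correlation alone does not show that one metric causes wins.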
library(plotly)
Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':
last_plot
The following object is masked from 'package:stats':
filter
The following object is masked from 'package:graphics':
layout
7. Final visualizations using ggplot2
I REALLY wanted to use highcharter for all the graphs, but for some reason, when I went in to make some adjustments, it kept giving me an error.
# Bar plot of wins by conference
wins_by_conference <- selected_data %>%
  group_by(Conference) %>%
  summarise(total_wins = sum(W)) %>%
  arrange(desc(total_wins))

wins_by_conference_plot <- ggplot(wins_by_conference,
                                  aes(x = reorder(Conference, total_wins), y = total_wins, fill = Conference)) +
  geom_bar(stat = "identity") +
  labs(title = "Total Wins by Conference",
       x = "Conference",
       y = "Total Wins") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

wins_by_conference_plotly <- ggplotly(wins_by_conference_plot) %>%
  layout(xaxis = list(tickangle = -45))
wins_by_conference_plotly
# Line plot of win-loss percentage by region
win_loss_by_region <- selected_data %>%
  group_by(region) %>%
  summarise(avg_win_loss_pctg = mean(win_loss_pctg)) %>%
  arrange(desc(avg_win_loss_pctg))

# group = 1 is needed so geom_line() connects points across a categorical x axis
win_loss_by_region_plot <- ggplot(win_loss_by_region,
                                  aes(x = region, y = avg_win_loss_pctg, group = 1)) +
  geom_line(color = "#FF5722") +
  geom_point(color = "#FF5722") +
  labs(title = "Average Win-Loss Percentage by Region",
       x = "Region",
       y = "Average Win-Loss Percentage") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

win_loss_by_region_plotly <- ggplotly(win_loss_by_region_plot)
win_loss_by_region_plotly
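If win_loss_pctg has a missing value in any region, mean() returns NA for that whole region. A minimal sketch of the same per-region average with missing values skipped, on a made-up data frame (region labels and values are hypothetical). Base R's tapply() is used here instead of dplyr only to keep the example dependency-free:

```r
# Hypothetical stand-in for selected_data
toy <- data.frame(
  region = c("West", "West", "South", "South", "Midwest"),
  win_loss_pctg = c(0.60, 0.40, 0.70, NA, 0.55)
)

# Without na.rm, the South average is NA because one value is missing
tapply(toy$win_loss_pctg, toy$region, mean)

# With na.rm = TRUE, missing values are skipped within each region
avg_by_region <- tapply(toy$win_loss_pctg, toy$region, mean, na.rm = TRUE)
print(avg_by_region)
```

In the dplyr pipeline, the equivalent change is mean(win_loss_pctg, na.rm = TRUE) inside summarise().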
# Bubble plot of hitting percentage vs. win-loss percentage by conference
bubble_plot_data <- selected_data %>%
  mutate(size = W + L)

bubble_plot <- ggplot(bubble_plot_data,
                      aes(x = hitting_pctg, y = win_loss_pctg, size = size, color = Conference)) +
  geom_point(alpha = 0.7) +
  labs(title = "Hitting Percentage vs. Win-Loss Percentage by Conference",
       x = "Hitting Percentage",
       y = "Win-Loss Percentage",
       size = "Total Matches",
       color = "Conference") +
  theme_minimal()

bubble_plotly <- ggplotly(bubble_plot)
bubble_plotly
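library(highcharter)
# Create a bar graph of wins by conference using highcharter.
# Uses the per-conference totals in wins_by_conference so that each bar is one conference;
# passing the raw volleyball_data columns would draw one bar per team.
highchart() %>%
  hc_chart(type = "column") %>%
  hc_title(text = "Wins by Conference") %>%
  hc_xAxis(categories = wins_by_conference$Conference) %>%
  hc_yAxis(title = list(text = "Number of Wins")) %>%
  hc_add_series(name = "Wins", data = wins_by_conference$total_wins, colorByPoint = TRUE)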
This was my first time making a stream graph, and I couldn’t get the multiple colors to work.
# I really wanted to try a stream graph, so I had ChatGPT help
# Prepare the stream graph
stream_data <- volleyball_data %>%
  select(Conference, W) %>%
  group_by(Conference) %>%
  summarize(Total_Wins = sum(W)) %>%
  arrange(desc(Total_Wins)) %>%
  mutate(Conference = factor(Conference, levels = unique(Conference)))

# highcharter
highchart() %>%
  hc_chart(type = "streamgraph") %>%
  hc_title(text = "Total Wins by Conference") %>%
  hc_xAxis(type = "category", categories = stream_data$Conference) %>%
  hc_series(list(name = "Wins", data = stream_data$Total_Wins)) %>%
  hc_colors(c("blue", "green", "red", "orange", "purple")) # Custom colors
# I wanted to get the conferences to have different colors, but I couldn’t get it to work
# Prepare the data for the stream graph
stream_data_kills <- volleyball_data %>%
  select(Conference, kills_per_set) %>%
  group_by(Conference) %>%
  summarize(Total_Kills_Per_Set = sum(kills_per_set)) %>%
  arrange(desc(Total_Kills_Per_Set)) %>%
  mutate(Conference = factor(Conference, levels = unique(Conference)))

# highcharter stream graph for kills_per_set
highchart() %>%
  hc_chart(type = "streamgraph") %>%
  hc_title(text = "Total Kills per Set by Conference") %>%
  hc_xAxis(type = "category", categories = stream_data_kills$Conference) %>%
  hc_series(list(name = "Kills per Set", data = stream_data_kills$Total_Kills_Per_Set)) %>%
  hc_colors(c("#4CAF50", "#FFC107", "#2196F3", "#FF5722", "#673AB7")) # Custom colors
Background Research
The dataset for this analysis was obtained from a sports repository, a hub that gathers and shares datasets on different sports. These repositories are useful for researchers, analysts, and fans, as they offer a wide range of datasets covering various sports and events. They collect data from official sources like sports leagues and organizations, ensuring that the datasets are accurate and reliable for analysis.
Visualization Analysis
The visualizations created from the dataset provide insights into various aspects of NCAA Division 1 volleyball teams’ performance:
Total Wins by Conference (Bar Plot): This visualization displays the total number of wins for each conference in the dataset. It reveals the distribution of wins among different conferences, with some conferences having significantly more wins than others. For example, conferences like the Big Ten and Pac-12 may stand out with higher total wins compared to others.
Average Win-Loss Percentage by Region (Line Plot): This visualization shows the average win-loss percentage for teams grouped by region. It allows for comparisons of performance across different regions, highlighting potential regional disparities in team success. It could be interesting to investigate any patterns or trends in win-loss percentages across regions, such as whether certain regions consistently produce stronger teams.
Hitting Percentage vs. Win-Loss Percentage by Conference (Bubble Plot): This interactive visualization depicts the relationship between hitting percentage and win-loss percentage. Each bubble represents a team, with bubble size indicating the total number of matches that team played (W + L) and bubble color indicating its conference. This plot enables the identification of teams and conferences with high hitting percentages and win-loss percentages, as well as any outliers or unexpected patterns.
While the visualizations provide valuable insights into NCAA Division 1 volleyball team performance, there are some aspects that could have been explored further or included:
It would have been interesting to delve deeper into the relationship between specific performance metrics (e.g., aces per set, assists per set) and win-loss percentages using more advanced statistical analyses.
Additional demographic or contextual variables, such as team composition (e.g., number of returning players, average player height) or coaching staff characteristics, could have provided further insights into factors influencing team success.
Exploring temporal trends in team performance over multiple seasons could have provided a more comprehensive understanding of the dynamics of collegiate volleyball competition.
Overall, the visualizations offer a glimpse into the performance landscape of NCAA Division 1 volleyball teams and lay the groundwork for further analysis and exploration of this fascinating domain.