Data 110 final project redo

Volleyball From NCAA.com

Exploring Performance Metrics in NCAA Division 1 Volleyball Teams, 2022-2023 Season

Dataset: The dataset used in this analysis describes the performance of NCAA Division 1 volleyball teams during the 2022-2023 season. It contains 334 rows and 14 columns, with each row holding the metrics for one team. These metrics include performance indicators such as aces per set, assists per set, team attacks per set, blocks per set, digs per set, hitting percentage, kills per set, opponent hitting percentage, win-loss record, and more.

Data Source: The data was collected by the SCORE Sports Data Repository.

Variables:

Team: Name of the college volleyball team.

Conference: The conference to which the team belongs.

Region: The region to which the team belongs.

Aces_per_set: Average number of serves leading to a point per set.

Assists_per_set: Average number of sets, passes, or digs resulting in a kill per set.

Team_attacks_per_set: Average number of times the ball is sent to the opponent’s court per set.

Blocks_per_set: Average number of times the ball is blocked per set.

Digs_per_set: Average number of successful passes after an opponent’s attack per set.

Hitting_pctg: Kills minus attack errors, divided by total attack attempts. Stored as a proportion (for example, 0.208) rather than a percent.

Kills_per_set: Average number of hits resulting in a point per set.

Opp_hitting_pctg: Hitting percentage of the team’s opponents, stored as a proportion.

W: Number of team wins for the season.

L: Number of team losses for the season.

Win_loss_pctg: Proportion of matches won, computed as wins divided by total matches played in the season (W / (W + L)).
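As a quick check of the last definition, win_loss_pctg can be recomputed from W and L. The sketch below uses the W and L values of the first three teams as they appear in the str() output further down; the data frame name `check` is only for illustration.

```r
# First three teams as they appear in the str() output of the dataset
check <- data.frame(
  Team = c("Lafayette", "Delaware St.", "Yale"),
  W = c(8, 24, 23),
  L = c(15, 7, 3)
)

# Wins divided by total matches played, rounded to three decimals
check$win_loss_pctg <- round(check$W / (check$W + check$L), 3)
print(check)
```

The rounded results (0.348, 0.774, 0.885) agree with the first three win_loss_pctg values in the dataset.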

Reason for Choosing Topic and Dataset:

Volleyball is a sport that involves a blend of athleticism, strategy, and teamwork. Analyzing performance metrics in NCAA Division 1 volleyball teams allows for a deeper understanding of the factors contributing to team success. This dataset provides an opportunity to explore which on-court statistics are most closely associated with winning. I had also never paid attention to volleyball before, and I was curious about the sport and how it works.

1. The necessary libraries

library(readr)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(ggplot2)
Warning: package 'ggplot2' was built under R version 4.3.3
library(tidyr)
library(broom) # suggested by ChatGPT, so I gave it a try
library(highcharter)
Warning: package 'highcharter' was built under R version 4.3.3
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 
# Loading dataset
volleyball_data <- read_csv("volleyball_ncaa_div1_2022_23.csv")
Rows: 334 Columns: 14
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (3): Team, Conference, region
dbl (11): aces_per_set, assists_per_set, team_attacks_per_set, blocks_per_se...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

2. Data cleaning and exploration

# Display structure of the dataset
str(volleyball_data)
spc_tbl_ [334 × 14] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ Team                : chr [1:334] "Lafayette" "Delaware St." "Yale" "Coppin St." ...
 $ Conference          : chr [1:334] "Patriot" "MEAC" "Ivy League" "MEAC" ...
 $ region              : chr [1:334] "East" "Southeast" "East" "Southeast" ...
 $ aces_per_set        : num [1:334] 2.33 2.2 2.15 2.15 2.03 1.98 1.93 1.91 1.91 1.91 ...
 $ assists_per_set     : num [1:334] 11 11.4 12.6 10.6 11.6 ...
 $ team_attacks_per_set: num [1:334] 34.5 30 35.4 32.5 34.1 ...
 $ blocks_per_set      : num [1:334] 1.31 2.17 1.82 1.81 1.83 2.39 1.73 1.85 1.36 1.73 ...
 $ digs_per_set        : num [1:334] 13.6 12.6 15.3 14.2 14.3 ...
 $ hitting_pctg        : num [1:334] 0.18 0.25 0.242 0.194 0.201 0.226 0.227 0.238 0.248 0.255 ...
 $ kills_per_set       : num [1:334] 11.9 12.1 13.9 11.5 12.4 ...
 $ opp_hitting_pctg    : num [1:334] 0.227 0.137 0.155 0.17 0.188 0.175 0.202 0.23 0.237 0.179 ...
 $ W                   : num [1:334] 8 24 23 23 18 17 19 16 18 20 ...
 $ L                   : num [1:334] 15 7 3 11 13 13 13 13 13 10 ...
 $ win_loss_pctg       : num [1:334] 0.348 0.774 0.885 0.676 0.581 0.567 0.594 0.552 0.581 0.667 ...
 - attr(*, "spec")=
  .. cols(
  ..   Team = col_character(),
  ..   Conference = col_character(),
  ..   region = col_character(),
  ..   aces_per_set = col_double(),
  ..   assists_per_set = col_double(),
  ..   team_attacks_per_set = col_double(),
  ..   blocks_per_set = col_double(),
  ..   digs_per_set = col_double(),
  ..   hitting_pctg = col_double(),
  ..   kills_per_set = col_double(),
  ..   opp_hitting_pctg = col_double(),
  ..   W = col_double(),
  ..   L = col_double(),
  ..   win_loss_pctg = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 
# Check summary statistics
summary(volleyball_data)
     Team            Conference           region           aces_per_set  
 Length:334         Length:334         Length:334         Min.   :0.900  
 Class :character   Class :character   Class :character   1st Qu.:1.310  
 Mode  :character   Mode  :character   Mode  :character   Median :1.455  
                                                          Mean   :1.465  
                                                          3rd Qu.:1.610  
                                                          Max.   :2.330  
                                                                         
 assists_per_set team_attacks_per_set blocks_per_set   digs_per_set  
 Min.   : 4.44   Min.   :24.25        Min.   :0.600   Min.   : 7.42  
 1st Qu.:10.87   1st Qu.:33.35        1st Qu.:1.810   1st Qu.:13.33  
 Median :11.54   Median :34.47        Median :2.070   Median :14.32  
 Mean   :11.43   Mean   :34.46        Mean   :2.057   Mean   :14.35  
 3rd Qu.:12.14   3rd Qu.:35.88        3rd Qu.:2.300   3rd Qu.:15.35  
 Max.   :13.80   Max.   :39.78        Max.   :3.330   Max.   :18.53  
                                                                     
  hitting_pctg    kills_per_set   opp_hitting_pctg       W        
 Min.   :0.0790   Min.   : 4.92   Min.   :0.1280   Min.   : 0.00  
 1st Qu.:0.1830   1st Qu.:11.78   1st Qu.:0.1870   1st Qu.:10.00  
 Median :0.2080   Median :12.46   Median :0.2055   Median :15.00  
 Mean   :0.2079   Mean   :12.37   Mean   :0.2076   Mean   :15.13  
 3rd Qu.:0.2330   3rd Qu.:13.14   3rd Qu.:0.2270   3rd Qu.:19.00  
 Max.   :0.3360   Max.   :14.75   Max.   :0.3380   Max.   :31.00  
 NA's   :2                                                        
       L         win_loss_pctg   
 Min.   : 1.00   Min.   :0.0000  
 1st Qu.:11.00   1st Qu.:0.3450  
 Median :15.00   Median :0.5155  
 Mean   :14.72   Mean   :0.4996  
 3rd Qu.:19.00   3rd Qu.:0.6352  
 Max.   :31.00   Max.   :0.9660  
                                 
# Check for missing values
sum(is.na(volleyball_data))
[1] 3
# Explore unique values in categorical variables
unique(volleyball_data$Conference)
 [1] "Patriot"        "MEAC"           "Ivy League"     "Atlantic 10"   
 [5] "C-USA"          "SoCon"          "ASUN"           "MVC"           
 [9] "SEC"            "MAAC"           "SWAC"           "Sun Belt"      
[13] "ACC"            "WCC"            "CAA"            "Southland"     
[17] "NEC"            "Mountain West"  "Big East"       "Big Ten"       
[21] "America East"   "Big 12"         "WAC"            "Big South"     
[25] "Pac-12"         "Horizon"        "Big Sky"        "Summit League" 
[29] "Big West"       "MAC"            "OVC"            "AAC"           
[33] "DI Independent" NA              
unique(volleyball_data$region)
[1] "East"      "Southeast" "South"     "Midwest"   "West"     
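The single total from sum(is.na(...)) does not say which columns hold the missing values. In the output above, summary() reports 2 NA's under hitting_pctg and unique() shows an NA among the conferences, which accounts for the total of 3. A per-column count makes this explicit. The sketch below uses a small made-up data frame (`toy`, not the real data) with the same kind of gaps.

```r
# Toy data frame (hypothetical values) with NAs in two columns
toy <- data.frame(
  Team = c("A", "B", "C", "D"),
  Conference = c("X", NA, "Y", "Y"),
  hitting_pctg = c(0.21, NA, 0.19, NA)
)

# Number of missing values in each column
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Rows with at least one missing value; lm() drops such rows by default
incomplete_rows <- toy[!complete.cases(toy), ]
print(incomplete_rows)
```

Running the same two calls on volleyball_data would show which teams are affected before any modeling is done.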

3. Select relevant columns for analysis

selected_data <- volleyball_data %>%
  select(Team, Conference, region, aces_per_set, assists_per_set, blocks_per_set, digs_per_set,
         hitting_pctg, kills_per_set, opp_hitting_pctg, W, L, win_loss_pctg)

4. Perform linear regression

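# Fit a linear model with every other column as a predictor.
# Note: `.` also pulls in Team, which is unique to each row, so the fit below is saturated.
lm_model <- lm(win_loss_pctg ~ ., data = selected_data)
summary(lm_model)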

Call:
lm(formula = win_loss_pctg ~ ., data = selected_data)

Residuals:
ALL 331 residuals are 0: no residual degrees of freedom!

Coefficients: (45 not defined because of singularities)
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)                 0.394        NaN     NaN      NaN
TeamAbilene Christian      -0.125        NaN     NaN      NaN
TeamAir Force               0.106        NaN     NaN      NaN
TeamAkron                  -0.187        NaN     NaN      NaN
TeamAlabama                -0.061        NaN     NaN      NaN
TeamAlabama A&M            -0.182        NaN     NaN      NaN
TeamAlabama St.             0.177        NaN     NaN      NaN
TeamAlcorn                 -0.222        NaN     NaN      NaN
TeamAmerican                0.106        NaN     NaN      NaN
TeamApp State               0.249        NaN     NaN      NaN
TeamArizona                 0.122        NaN     NaN      NaN
TeamArizona St.             0.012        NaN     NaN      NaN
TeamArk.-Pine Bluff        -0.104        NaN     NaN      NaN
TeamArkansas                0.306        NaN     NaN      NaN
TeamArkansas St.           -0.104        NaN     NaN      NaN
TeamArmy West Point         0.273        NaN     NaN      NaN
TeamAuburn                  0.316        NaN     NaN      NaN
TeamAustin Peay             0.050        NaN     NaN      NaN
TeamBall St.                0.333        NaN     NaN      NaN
TeamBaylor                  0.387        NaN     NaN      NaN
TeamBelmont                -0.136        NaN     NaN      NaN
TeamBethune-Cookman        -0.094        NaN     NaN      NaN
TeamBinghamton              0.260        NaN     NaN      NaN
TeamBoise St.               0.123        NaN     NaN      NaN
TeamBoston College          0.255        NaN     NaN      NaN
TeamBowling Green           0.294        NaN     NaN      NaN
TeamBradley                -0.071        NaN     NaN      NaN
TeamBrown                   0.206        NaN     NaN      NaN
TeamBryant                  0.219        NaN     NaN      NaN
TeamBucknell                0.183        NaN     NaN      NaN
TeamBuffalo                 0.182        NaN     NaN      NaN
TeamButler                  0.122        NaN     NaN      NaN
TeamBYU                     0.365        NaN     NaN      NaN
TeamCal Poly                0.173        NaN     NaN      NaN
TeamCal St. Fullerton       0.070        NaN     NaN      NaN
TeamCalifornia             -0.161        NaN     NaN      NaN
TeamCalifornia Baptist      0.068        NaN     NaN      NaN
TeamCampbell                0.239        NaN     NaN      NaN
TeamCanisius                0.070        NaN     NaN      NaN
TeamCentral Ark.            0.239        NaN     NaN      NaN
TeamCentral Conn. St.       0.068        NaN     NaN      NaN
TeamCentral Mich.           0.231        NaN     NaN      NaN
TeamCharleston So.          0.090        NaN     NaN      NaN
TeamCharlotte               0.013        NaN     NaN      NaN
TeamChattanooga             0.121        NaN     NaN      NaN
TeamChicago St.            -0.046        NaN     NaN      NaN
TeamCincinnati             -0.027        NaN     NaN      NaN
TeamClemson                 0.025        NaN     NaN      NaN
TeamCleveland St.           0.058        NaN     NaN      NaN
TeamCoastal Carolina        0.173        NaN     NaN      NaN
TeamCol. of Charleston      0.073        NaN     NaN      NaN
TeamColgate                 0.406        NaN     NaN      NaN
TeamColorado                0.251        NaN     NaN      NaN
TeamColorado St.            0.239        NaN     NaN      NaN
TeamColumbia               -0.133        NaN     NaN      NaN
TeamCoppin St.              0.282        NaN     NaN      NaN
TeamCornell                -0.133        NaN     NaN      NaN
TeamCreighton               0.450        NaN     NaN      NaN
TeamCSU Bakersfield        -0.175        NaN     NaN      NaN
TeamCSUN                   -0.113        NaN     NaN      NaN
TeamDartmouth               0.246        NaN     NaN      NaN
TeamDavidson                0.282        NaN     NaN      NaN
TeamDayton                  0.194        NaN     NaN      NaN
TeamDelaware                0.213        NaN     NaN      NaN
TeamDelaware St.            0.380        NaN     NaN      NaN
TeamDenver                  0.219        NaN     NaN      NaN
TeamDePaul                 -0.061        NaN     NaN      NaN
TeamDrake                   0.395        NaN     NaN      NaN
TeamDuke                    0.158        NaN     NaN      NaN
TeamDuquesne               -0.127        NaN     NaN      NaN
TeamEast Carolina          -0.019        NaN     NaN      NaN
TeamEastern Ill.            0.106        NaN     NaN      NaN
TeamEastern Ky.             0.106        NaN     NaN      NaN
TeamEastern Mich.          -0.227        NaN     NaN      NaN
TeamEastern Wash.          -0.015        NaN     NaN      NaN
TeamElon                    0.087        NaN     NaN      NaN
TeamETSU                    0.330        NaN     NaN      NaN
TeamEvansville              0.063        NaN     NaN      NaN
TeamFairfield               0.387        NaN     NaN      NaN
TeamFDU                     0.121        NaN     NaN      NaN
TeamFGCU                    0.394        NaN     NaN      NaN
TeamFIU                    -0.118        NaN     NaN      NaN
TeamFla. Atlantic           0.106        NaN     NaN      NaN
TeamFlorida                 0.412        NaN     NaN      NaN
TeamFlorida A&M             0.224        NaN     NaN      NaN
TeamFlorida St.             0.239        NaN     NaN      NaN
TeamFordham                 0.054        NaN     NaN      NaN
TeamFresno St.             -0.161        NaN     NaN      NaN
TeamFurman                 -0.027        NaN     NaN      NaN
TeamGa. Southern            0.177        NaN     NaN      NaN
TeamGardner-Webb           -0.073        NaN     NaN      NaN
TeamGeorge Mason           -0.153        NaN     NaN      NaN
TeamGeorge Washington       0.151        NaN     NaN      NaN
TeamGeorgetown             -0.256        NaN     NaN      NaN
TeamGeorgia                 0.348        NaN     NaN      NaN
TeamGeorgia St.            -0.153        NaN     NaN      NaN
TeamGeorgia Tech            0.330        NaN     NaN      NaN
TeamGonzaga                -0.187        NaN     NaN      NaN
TeamGrambling               0.162        NaN     NaN      NaN
TeamGrand Canyon            0.227        NaN     NaN      NaN
TeamGreen Bay               0.282        NaN     NaN      NaN
TeamHampton                -0.236        NaN     NaN      NaN
TeamHarvard                -0.167        NaN     NaN      NaN
TeamHawaii                  0.365        NaN     NaN      NaN
TeamHigh Point              0.303        NaN     NaN      NaN
TeamHofstra                 0.192        NaN     NaN      NaN
TeamHoly Cross             -0.283        NaN     NaN      NaN
TeamHouston                 0.488        NaN     NaN      NaN
TeamHouston Christian       0.282        NaN     NaN      NaN
TeamHoward                  0.273        NaN     NaN      NaN
TeamIdaho                  -0.251        NaN     NaN      NaN
TeamIdaho St.               0.039        NaN     NaN      NaN
TeamIllinois                0.106        NaN     NaN      NaN
TeamIllinois St.            0.020        NaN     NaN      NaN
TeamIndiana                 0.106        NaN     NaN      NaN
TeamIndiana St.            -0.279        NaN     NaN      NaN
TeamIona                    0.192        NaN     NaN      NaN
TeamIowa                   -0.071        NaN     NaN      NaN
TeamIowa St.                0.231        NaN     NaN      NaN
TeamIUPUI                  -0.094        NaN     NaN      NaN
TeamJackson St.             0.039        NaN     NaN      NaN
TeamJacksonville           -0.001        NaN     NaN      NaN
TeamJacksonville St.        0.406        NaN     NaN      NaN
TeamJames Madison           0.434        NaN     NaN      NaN
TeamKansas                  0.239        NaN     NaN      NaN
TeamKansas City            -0.071        NaN     NaN      NaN
TeamKansas St.              0.123        NaN     NaN      NaN
TeamKennesaw St.            0.249        NaN     NaN      NaN
TeamKent St.                0.054        NaN     NaN      NaN
TeamKentucky                0.339        NaN     NaN      NaN
TeamLafayette              -0.046        NaN     NaN      NaN
TeamLamar University       -0.084        NaN     NaN      NaN
TeamLehigh                  0.070        NaN     NaN      NaN
TeamLiberty                 0.325        NaN     NaN      NaN
TeamLipscomb                0.158        NaN     NaN      NaN
TeamLittle Rock            -0.168        NaN     NaN      NaN
TeamLIU                     0.125        NaN     NaN      NaN
TeamLMU                     0.249        NaN     NaN      NaN
TeamLong Beach St.          0.285        NaN     NaN      NaN
TeamLouisiana               0.142        NaN     NaN      NaN
TeamLouisiana Tech         -0.015        NaN     NaN      NaN
TeamLouisville              0.518        NaN     NaN      NaN
TeamLoyola Chicago          0.341        NaN     NaN      NaN
TeamLoyola Maryland         0.046        NaN     NaN      NaN
TeamLSU                     0.139        NaN     NaN      NaN
TeamManhattan              -0.360        NaN     NaN      NaN
TeamMarist                  0.227        NaN     NaN      NaN
TeamMarquette               0.485        NaN     NaN      NaN
TeamMarshall               -0.061        NaN     NaN      NaN
TeamMaryland                0.106        NaN     NaN      NaN
TeamMcNeese                 0.135        NaN     NaN      NaN
TeamMemphis                 0.151        NaN     NaN      NaN
TeamMercer                 -0.049        NaN     NaN      NaN
TeamMiami                   0.239        NaN     NaN      NaN
TeamMiami (OH)             -0.153        NaN     NaN      NaN
TeamMichigan                0.173        NaN     NaN      NaN
TeamMichigan St.            0.025        NaN     NaN      NaN
TeamMiddle Tenn.            0.123        NaN     NaN      NaN
TeamMilwaukee              -0.061        NaN     NaN      NaN
TeamMinnesota               0.316        NaN     NaN      NaN
TeamMississippi St.         0.142        NaN     NaN      NaN
TeamMissouri               -0.073        NaN     NaN      NaN
TeamMissouri St.           -0.094        NaN     NaN      NaN
TeamMontana                 0.192        NaN     NaN      NaN
TeamMontana St.             0.073        NaN     NaN      NaN
TeamMorehead St.            0.073        NaN     NaN      NaN
TeamMorgan St.             -0.291        NaN     NaN      NaN
TeamMurray St.              0.020        NaN     NaN      NaN
TeamN.C. A&T               -0.048        NaN     NaN      NaN
TeamN.C. Central           -0.125        NaN     NaN      NaN
TeamNavy                    0.177        NaN     NaN      NaN
TeamNC State                0.158        NaN     NaN      NaN
TeamNebraska                0.418        NaN     NaN      NaN
TeamNevada                  0.089        NaN     NaN      NaN
TeamNew Hampshire           0.261        NaN     NaN      NaN
TeamNew Mexico              0.192        NaN     NaN      NaN
TeamNew Mexico St.          0.177        NaN     NaN      NaN
TeamNew Orleans             0.090        NaN     NaN      NaN
TeamNiagara                -0.061        NaN     NaN      NaN
TeamNicholls               -0.161        NaN     NaN      NaN
TeamNIU                     0.213        NaN     NaN      NaN
TeamNJIT                   -0.118        NaN     NaN      NaN
TeamNorfolk St.            -0.114        NaN     NaN      NaN
TeamNorth Ala.              0.070        NaN     NaN      NaN
TeamNorth Carolina          0.192        NaN     NaN      NaN
TeamNorth Dakota            0.006        NaN     NaN      NaN
TeamNorth Dakota St.        0.242        NaN     NaN      NaN
TeamNorth Florida           0.020        NaN     NaN      NaN
TeamNorth Texas             0.122        NaN     NaN      NaN
TeamNortheastern            0.125        NaN     NaN      NaN
TeamNorthern Ariz.         -0.177        NaN     NaN      NaN
TeamNorthern Colo.          0.316        NaN     NaN      NaN
TeamNorthern Ky.            0.187        NaN     NaN      NaN
TeamNorthwestern            0.168        NaN     NaN      NaN
TeamNorthwestern St.        0.200        NaN     NaN      NaN
TeamNotre Dame             -0.037        NaN     NaN      NaN
TeamOakland                -0.027        NaN     NaN      NaN
TeamOhio                    0.231        NaN     NaN      NaN
TeamOhio St.                0.294        NaN     NaN      NaN
TeamOklahoma                0.142        NaN     NaN      NaN
TeamOld Dominion            0.035        NaN     NaN      NaN
TeamOle Miss               -0.001        NaN     NaN      NaN
TeamOmaha                   0.251        NaN     NaN      NaN
TeamOral Roberts           -0.094        NaN     NaN      NaN
TeamOregon                  0.418        NaN     NaN      NaN
TeamOregon St.             -0.161        NaN     NaN      NaN
TeamPacific                 0.168        NaN     NaN      NaN
TeamPenn                   -0.311        NaN     NaN      NaN
TeamPenn St.                0.371        NaN     NaN      NaN
TeamPepperdine              0.239        NaN     NaN      NaN
TeamPittsburgh              0.492        NaN     NaN      NaN
TeamPortland               -0.049        NaN     NaN      NaN
TeamPortland St.            0.187        NaN     NaN      NaN
TeamPrairie View           -0.144        NaN     NaN      NaN
TeamPresbyterian           -0.073        NaN     NaN      NaN
TeamPrinceton               0.446        NaN     NaN      NaN
TeamProvidence              0.006        NaN     NaN      NaN
TeamPurdue                  0.262        NaN     NaN      NaN
TeamPurdue Fort Wayne      -0.084        NaN     NaN      NaN
TeamQuinnipiac              0.089        NaN     NaN      NaN
TeamRadford                 0.035        NaN     NaN      NaN
TeamRhode Island           -0.200        NaN     NaN      NaN
TeamRice                    0.477        NaN     NaN      NaN
TeamRider                  -0.039        NaN     NaN      NaN
TeamRobert Morris          -0.094        NaN     NaN      NaN
TeamRutgers                -0.144        NaN     NaN      NaN
TeamSacramento St.          0.106        NaN     NaN      NaN
TeamSacred Heart            0.283        NaN     NaN      NaN
TeamSaint Francis           0.035        NaN     NaN      NaN
TeamSaint Louis             0.187        NaN     NaN      NaN
TeamSaint Mary's           -0.048        NaN     NaN      NaN
TeamSaint Peter's          -0.363        NaN     NaN      NaN
TeamSam Houston            -0.108        NaN     NaN      NaN
TeamSamford                 0.200        NaN     NaN      NaN
TeamSan Diego               0.545        NaN     NaN      NaN
TeamSan Diego St.          -0.039        NaN     NaN      NaN
TeamSan Francisco           0.089        NaN     NaN      NaN
TeamSan Jose St.            0.306        NaN     NaN      NaN
TeamSanta Clara             0.012        NaN     NaN      NaN
TeamSeattle U              -0.212        NaN     NaN      NaN
TeamSeton Hall              0.106        NaN     NaN      NaN
TeamSFA                     0.445        NaN     NaN      NaN
TeamSiena                   0.039        NaN     NaN      NaN
TeamSIUE                    0.039        NaN     NaN      NaN
TeamSMU                     0.294        NaN     NaN      NaN
TeamSouth Alabama           0.187        NaN     NaN      NaN
TeamSouth Carolina          0.070        NaN     NaN      NaN
TeamSouth Dakota            0.485        NaN     NaN      NaN
TeamSouth Dakota St.        0.151        NaN     NaN      NaN
TeamSouth Fla.             -0.061        NaN     NaN      NaN
TeamSoutheast Mo. St.       0.121        NaN     NaN      NaN
TeamSoutheastern La.        0.364        NaN     NaN      NaN
TeamSouthern California     0.273        NaN     NaN      NaN
TeamSouthern Ill.           0.187        NaN     NaN      NaN
TeamSouthern Miss.          0.242        NaN     NaN      NaN
TeamSouthern U.            -0.287        NaN     NaN      NaN
TeamSouthern Utah          -0.102        NaN     NaN      NaN
TeamSt. Francis Brooklyn    0.058        NaN     NaN      NaN
TeamSt. John's              0.194        NaN     NaN      NaN
TeamStanford                0.450        NaN     NaN      NaN
TeamStetson                 0.123        NaN     NaN      NaN
TeamStony Brook            -0.009        NaN     NaN      NaN
TeamSyracuse               -0.001        NaN     NaN      NaN
TeamTCU                     0.213        NaN     NaN      NaN
TeamTemple                 -0.071        NaN     NaN      NaN
TeamTennessee               0.154        NaN     NaN      NaN
TeamTennessee St.           0.149        NaN     NaN      NaN
TeamTennessee Tech          0.154        NaN     NaN      NaN
TeamTexas                   0.572        NaN     NaN      NaN
TeamTexas A&M               0.054        NaN     NaN      NaN
TeamTexas Southern         -0.073        NaN     NaN      NaN
TeamTexas St.               0.380        NaN     NaN      NaN
TeamTexas Tech              0.158        NaN     NaN      NaN
TeamThe Citadel            -0.027        NaN     NaN      NaN
TeamToledo                  0.200        NaN     NaN      NaN
TeamTowson                  0.541        NaN     NaN      NaN
TeamTroy                    0.187        NaN     NaN      NaN
TeamTulane                 -0.104        NaN     NaN      NaN
TeamTulsa                   0.025        NaN     NaN      NaN
TeamUAB                     0.050        NaN     NaN      NaN
TeamUAlbany                -0.135        NaN     NaN      NaN
TeamUC Davis                0.106        NaN     NaN      NaN
TeamUC Irvine               0.273        NaN     NaN      NaN
TeamUC Riverside           -0.256        NaN     NaN      NaN
TeamUC Santa Barbara        0.273        NaN     NaN      NaN
TeamUCF                     0.539        NaN     NaN      NaN
TeamUCLA                    0.158        NaN     NaN      NaN
TeamUConn                   0.154        NaN     NaN      NaN
TeamUIC                     0.294        NaN     NaN      NaN
TeamUIW                    -0.187        NaN     NaN      NaN
TeamULM                    -0.175        NaN     NaN      NaN
TeamUMBC                    0.260        NaN     NaN      NaN
TeamUMES                   -0.061        NaN     NaN      NaN
TeamUNC Asheville          -0.240        NaN     NaN      NaN
TeamUNC Greensboro         -0.061        NaN     NaN      NaN
TeamUNCW                   -0.154        NaN     NaN      NaN
TeamUNI                     0.377        NaN     NaN      NaN
TeamUNLV                    0.445        NaN     NaN      NaN
TeamUSC Upstate            -0.187        NaN     NaN      NaN
TeamUT Arlington            0.236        NaN     NaN      NaN
TeamUT Martin               0.263        NaN     NaN      NaN
TeamUtah                    0.090        NaN     NaN      NaN
TeamUtah St.                0.273        NaN     NaN      NaN
TeamUtah Valley             0.227        NaN     NaN      NaN
TeamUTEP                    0.173        NaN     NaN      NaN
TeamUTRGV                   0.400        NaN     NaN      NaN
TeamUTSA                   -0.086        NaN     NaN      NaN
TeamValparaiso              0.273        NaN     NaN      NaN
TeamVCU                     0.058        NaN     NaN      NaN
TeamVillanova              -0.061        NaN     NaN      NaN
TeamVirginia                0.020        NaN     NaN      NaN
TeamVirginia Tech          -0.015        NaN     NaN      NaN
TeamWake Forest             0.154        NaN     NaN      NaN
TeamWashington              0.251        NaN     NaN      NaN
TeamWashington St.          0.303        NaN     NaN      NaN
TeamWeber St.               0.192        NaN     NaN      NaN
TeamWest Virginia          -0.153        NaN     NaN      NaN
TeamWestern Caro.           0.212        NaN     NaN      NaN
TeamWestern Ill.           -0.261        NaN     NaN      NaN
TeamWestern Ky.             0.485        NaN     NaN      NaN
TeamWestern Mich.           0.173        NaN     NaN      NaN
TeamWichita St.             0.187        NaN     NaN      NaN
TeamWilliam & Mary          0.068        NaN     NaN      NaN
TeamWinthrop                0.177        NaN     NaN      NaN
TeamWisconsin               0.481        NaN     NaN      NaN
TeamWofford                 0.158        NaN     NaN      NaN
TeamWright St.              0.481        NaN     NaN      NaN
TeamWyoming                -0.061        NaN     NaN      NaN
TeamXavier                  0.242        NaN     NaN      NaN
TeamYale                    0.491        NaN     NaN      NaN
TeamYoungstown St.          0.073        NaN     NaN      NaN
ConferenceACC                  NA         NA      NA       NA
ConferenceAmerica East         NA         NA      NA       NA
ConferenceASUN                 NA         NA      NA       NA
ConferenceAtlantic 10          NA         NA      NA       NA
ConferenceBig 12               NA         NA      NA       NA
ConferenceBig East             NA         NA      NA       NA
ConferenceBig Sky              NA         NA      NA       NA
ConferenceBig South            NA         NA      NA       NA
ConferenceBig Ten              NA         NA      NA       NA
ConferenceBig West             NA         NA      NA       NA
ConferenceC-USA                NA         NA      NA       NA
ConferenceCAA                  NA         NA      NA       NA
ConferenceDI Independent       NA         NA      NA       NA
ConferenceHorizon              NA         NA      NA       NA
ConferenceIvy League           NA         NA      NA       NA
ConferenceMAAC                 NA         NA      NA       NA
ConferenceMAC                  NA         NA      NA       NA
ConferenceMEAC                 NA         NA      NA       NA
ConferenceMountain West        NA         NA      NA       NA
ConferenceMVC                  NA         NA      NA       NA
ConferenceNEC                  NA         NA      NA       NA
ConferenceOVC                  NA         NA      NA       NA
ConferencePac-12               NA         NA      NA       NA
ConferencePatriot              NA         NA      NA       NA
ConferenceSEC                  NA         NA      NA       NA
ConferenceSoCon                NA         NA      NA       NA
ConferenceSouthland            NA         NA      NA       NA
ConferenceSummit League        NA         NA      NA       NA
ConferenceSun Belt             NA         NA      NA       NA
ConferenceSWAC                 NA         NA      NA       NA
ConferenceWAC                  NA         NA      NA       NA
ConferenceWCC                  NA         NA      NA       NA
regionMidwest                  NA         NA      NA       NA
regionSouth                    NA         NA      NA       NA
regionSoutheast                NA         NA      NA       NA
regionWest                     NA         NA      NA       NA
aces_per_set                   NA         NA      NA       NA
assists_per_set                NA         NA      NA       NA
blocks_per_set                 NA         NA      NA       NA
digs_per_set                   NA         NA      NA       NA
hitting_pctg                   NA         NA      NA       NA
kills_per_set                  NA         NA      NA       NA
opp_hitting_pctg               NA         NA      NA       NA
W                              NA         NA      NA       NA
L                              NA         NA      NA       NA

Residual standard error: NaN on 0 degrees of freedom
  (3 observations deleted due to missingness)
Multiple R-squared:      1, Adjusted R-squared:    NaN 
F-statistic:   NaN on 330 and 0 DF,  p-value: NA
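The output above shows a model with no residual degrees of freedom. The formula win_loss_pctg ~ . includes Team, and each team appears only once, so every row gets its own coefficient and the fit is perfect by construction. The remaining terms (Conference, region, and all the per-set metrics) are then reported as NA because nothing is left to estimate them. The sketch below reproduces the effect on simulated data and contrasts it with a fit on numeric predictors only. The data frame `toy`, its values, and the choice of predictors are assumptions for illustration, not the project's data.

```r
set.seed(1)
n <- 30

# Simulated stand-in for the team data (not the real values)
toy <- data.frame(
  Team = paste0("team_", seq_len(n)),   # unique label per row
  hitting_pctg = runif(n, 0.10, 0.30),
  blocks_per_set = runif(n, 1, 3)
)
toy$win_loss_pctg <- 0.1 + 1.5 * toy$hitting_pctg + rnorm(n, sd = 0.05)

# Including the unique label uses one coefficient per row: nothing is left over
saturated <- lm(win_loss_pctg ~ ., data = toy)
print(df.residual(saturated))

# Dropping the label leaves residual degrees of freedom for standard errors and p-values
numeric_only <- lm(win_loss_pctg ~ hitting_pctg + blocks_per_set, data = toy)
print(df.residual(numeric_only))
```

For the volleyball data, a comparable approach would be to leave Team out of the formula. W and L would probably need to go as well, since win_loss_pctg is computed directly from them.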

5. Analysis of linear regression model

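# Equation for the model: pair each coefficient with its term name
coefs <- round(coef(lm_model), 3)
model_terms <- c(coefs[1], paste0(coefs[-1], " * ", names(coefs)[-1]))
equation <- paste("win_loss_pctg =", paste(model_terms, collapse = " + "))

# P-values (tidy() comes from the broom package)
p_values <- broom::tidy(lm_model)$p.value

# Adjusted R-squared value (with help from ChatGPT)
adjusted_r_squared <- summary(lm_model)$adj.r.squared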
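
The summary above reports 0 residual degrees of freedom, an R-squared of exactly 1, and NA for every standard error. That pattern is what a saturated model looks like: it estimates as many coefficients as there are observations (330 terms plus the intercept for the 331 rows left after missing values were dropped). A likely cause is that a column with one unique value per row, such as Team, was included as a predictor. This is only an assumption, because the model call is not shown in this section. The sketch below reproduces the effect on made-up data (the `toy` data frame and all of its values are hypothetical) and shows that leaving the identifier out restores usable degrees of freedom.

```r
# Hypothetical data: one row per team, with a unique team label
set.seed(110)
n <- 30
toy <- data.frame(
  team             = paste0("Team", seq_len(n)),  # unique per row, like a team name
  hitting_pctg     = runif(n, 0.10, 0.35),
  opp_hitting_pctg = runif(n, 0.10, 0.35)
)
toy$win_loss_pctg <- 0.5 + 1.5 * toy$hitting_pctg -
  1.5 * toy$opp_hitting_pctg + rnorm(n, sd = 0.05)

# Using every column, including the identifier, gives one dummy per team
saturated <- lm(win_loss_pctg ~ ., data = toy)
df.residual(saturated)  # 0: nothing left over to estimate the error

# Dropping the identifier leaves a model that can actually be assessed
reduced <- lm(win_loss_pctg ~ hitting_pctg + opp_hitting_pctg, data = toy)
df.residual(reduced)    # 27: n minus three coefficients
summary(reduced)$adj.r.squared
```

With the identifier removed, the standard errors, p-values, and adjusted R-squared become finite and can be interpreted.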

6. Explore variables for final visualization

# Scatter plot of aces_per_set vs. win_loss_pctg
selected_data %>%
  ggplot(aes(x = aces_per_set, y = win_loss_pctg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +
  labs(title = "Relationship between Aces per Set and Win-Loss Percentage",
       x = "Aces per Set",
       y = "Win-Loss Percentage") +
  theme_minimal() +
  theme(plot.title = element_text(size = 14, face = "bold"))
`geom_smooth()` using formula = 'y ~ x'

# Scatter plot of assists_per_set vs. win_loss_pctg
selected_data %>%
  ggplot(aes(x = assists_per_set, y = win_loss_pctg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "green") +
  labs(title = "Relationship between Assists per Set and Win-Loss Percentage",
       x = "Assists per Set",
       y = "Win-Loss Percentage") +
  theme_minimal() +
  theme(plot.title = element_text(size = 14, face = "bold"))
`geom_smooth()` using formula = 'y ~ x'

# Scatter plot of hitting_pctg vs. win_loss_pctg
selected_data %>%
  ggplot(aes(x = hitting_pctg, y = win_loss_pctg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Relationship between Hitting Percentage and Win-Loss Percentage",
       x = "Hitting Percentage",
       y = "Win-Loss Percentage") +
  theme_minimal() +
  theme(plot.title = element_text(size = 14, face = "bold"))
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_point()`).
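
The two warnings above say that two rows were dropped from the plot. The earlier plots of win_loss_pctg produced no such warning, so the missing values are most likely in hitting_pctg. A small sketch of how missing values can be counted and removed before plotting follows; the `toy` data frame is hypothetical and only stands in for selected_data.

```r
# Hypothetical stand-in for selected_data with two missing hitting percentages
toy <- data.frame(
  hitting_pctg  = c(0.21, NA, 0.18, NA, 0.25),
  win_loss_pctg = c(0.60, 0.45, 0.40, 0.52, 0.71)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
na_counts

# Keep only the rows where every column is present
complete_rows <- toy[complete.cases(toy), ]
nrow(complete_rows)
```

Filtering explicitly like this makes it clear which teams are left out, instead of relying on ggplot2 to drop them silently.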

library(plotly)

Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':

    last_plot
The following object is masked from 'package:stats':

    filter
The following object is masked from 'package:graphics':

    layout

7. Final visualizations using ggplot2

I REALLY wanted to use highcharter for all the graphs, but for some reason when I went in to make some adjustments it kept giving me an error.

# Bar plot of wins by conference
wins_by_conference <- selected_data %>%
  group_by(Conference) %>%
  summarise(total_wins = sum(W)) %>%
  arrange(desc(total_wins))

wins_by_conference_plot <- ggplot(wins_by_conference, aes(x = reorder(Conference, total_wins), y = total_wins, fill = Conference)) +
  geom_col() +
  labs(title = "Total Wins by Conference",
       x = "Conference",
       y = "Total Wins") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

wins_by_conference_plotly <- ggplotly(wins_by_conference_plot) %>%
  layout(xaxis = list(tickangle = -45))

wins_by_conference_plotly
# Line plot of win-loss percentage by region
win_loss_by_region <- selected_data %>%
  group_by(region) %>%
  summarise(avg_win_loss_pctg = mean(win_loss_pctg)) %>%
  arrange(desc(avg_win_loss_pctg))

# group = 1 lets geom_line() connect the points across the discrete region axis
win_loss_by_region_plot <- ggplot(win_loss_by_region, aes(x = region, y = avg_win_loss_pctg, group = 1)) +
  geom_line(color = "#FF5722") +
  geom_point(color = "#FF5722") +
  labs(title = "Average Win-Loss Percentage by Region",
       x = "Region",
       y = "Average Win-Loss Percentage") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

win_loss_by_region_plotly <- ggplotly(win_loss_by_region_plot)


win_loss_by_region_plotly
# Bubble plot of hitting percentage vs. win-loss percentage by conference
bubble_plot_data <- selected_data %>%
  mutate(size = W + L)

bubble_plot <- ggplot(bubble_plot_data, aes(x = hitting_pctg, y = win_loss_pctg, size = size, color = Conference)) +
  geom_point(alpha = 0.7) +
  labs(title = "Hitting Percentage vs. Win-Loss Percentage by Conference",
       x = "Hitting Percentage",
       y = "Win-Loss Percentage",
       size = "Total Matches",
       color = "Conference") +
  theme_minimal()

bubble_plotly <- ggplotly(bubble_plot)

bubble_plotly
library(highcharter)

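# Create a bar graph of wins by conference using highcharter
# Aggregate first so each column is one conference rather than one team
conference_wins <- volleyball_data %>%
  group_by(Conference) %>%
  summarise(total_wins = sum(W))

highchart() %>%
  hc_chart(type = "column") %>%
  hc_title(text = "Wins by Conference") %>%
  hc_xAxis(categories = conference_wins$Conference) %>%
  hc_yAxis(title = list(text = "Number of Wins")) %>%
  hc_add_series(name = "Wins", data = conference_wins$total_wins, colorByPoint = TRUE)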

This was my first time doing a stream graph, and I couldn't get the multiple colors to work.

# I really wanted to try a stream graph, so I had ChatGPT help
# Prepare the stream graph
stream_data <- volleyball_data %>%
  select(Conference, W) %>%
  group_by(Conference) %>%
  summarize(Total_Wins = sum(W)) %>%
  arrange(desc(Total_Wins)) %>%
  mutate(Conference = factor(Conference, levels = unique(Conference)))

# highcharter
highchart() %>%
  hc_chart(type = "streamgraph") %>%
  hc_title(text = "Total Wins by Conference") %>%
  hc_xAxis(type = "category", categories = stream_data$Conference) %>%
  hc_series(list(name = "Wins", data = stream_data$Total_Wins)) %>%
  hc_colors(c("blue", "green", "red", "orange", "purple")) # Custom colors
# I wanted the conferences to have different colors but couldn't get it to work:
# hc_colors() assigns one color per series, and this chart has only one series
# Prepare the data for the stream graph
stream_data_kills <- volleyball_data %>%
  select(Conference, kills_per_set) %>%
  group_by(Conference) %>%
  summarize(Total_Kills_Per_Set = sum(kills_per_set)) %>%
  arrange(desc(Total_Kills_Per_Set)) %>%
  mutate(Conference = factor(Conference, levels = unique(Conference)))

# highcharter stream graph for kills_per_set
highchart() %>%
  hc_chart(type = "streamgraph") %>%
  hc_title(text = "Total Kills per Set by Conference") %>%
  hc_xAxis(type = "category", categories = stream_data_kills$Conference) %>%
  hc_series(list(name = "Kills per Set", data = stream_data_kills$Total_Kills_Per_Set)) %>%
  hc_colors(c("#4CAF50", "#FFC107", "#2196F3", "#FF5722", "#673AB7")) # Custom colors

Background Research

The dataset for this analysis was obtained from the SCORE Sports Data Repository, a hub that gathers and shares datasets on different sports. Repositories like this are useful for researchers, analysts, and fans because they offer a wide range of datasets covering various sports and events. The data is drawn from official sources such as sports leagues and organizations, which supports its accuracy and reliability for analysis.

Visualization Analysis

The visualizations created from the dataset provide insights into various aspects of NCAA Division 1 volleyball teams’ performance:

Total Wins by Conference (Bar Plot): This visualization displays the total number of wins for each conference in the dataset. It reveals the distribution of wins among different conferences, with some conferences having significantly more wins than others. For example, conferences like the Big Ten and Pac-12 may stand out with higher total wins compared to others.
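
Total wins depend on how many teams a conference has, so it can help to look at wins per team alongside the totals. The sketch below uses a small hypothetical data frame (`toy`, with made-up conferences "A" and "B") and base R's `aggregate()`; it is not the output of the actual dataset.

```r
# Hypothetical team-level rows; the real values live in selected_data
toy <- data.frame(
  Conference = c("A", "A", "A", "A", "B", "B"),
  W          = c(10, 12, 8, 9, 20, 18)
)

# Total wins and number of teams per conference
totals <- aggregate(W ~ Conference, data = toy, FUN = sum)
teams  <- aggregate(W ~ Conference, data = toy, FUN = length)

conference_summary <- data.frame(
  Conference    = totals$Conference,
  total_wins    = totals$W,
  teams         = teams$W,
  wins_per_team = totals$W / teams$W
)

# Rank conferences by total wins, highest first
conference_summary[order(-conference_summary$total_wins), ]
```

In this made-up example conference A has more total wins only because it has more teams, while conference B wins more per team. The same comparison on the real data would show whether the leading conferences stand out for the same reason.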

Hitting Percentage vs. Win-Loss Percentage by Conference (Bubble Plot): This interactive visualization depicts the relationship between hitting percentage and win-loss percentage. Each bubble represents a team, with bubble size indicating the total number of matches that team played (wins plus losses) and bubble color indicating its conference. This plot makes it possible to identify teams and conferences that combine high hitting percentages with high win-loss percentages, as well as any outliers or unexpected patterns.

While the visualizations provide valuable insights into NCAA Division 1 volleyball team performance, there are some aspects that could have been explored further or included:

It would have been interesting to delve deeper into the relationship between specific performance metrics (e.g., aces per set, assists per set) and win-loss percentages using more advanced statistical analyses.
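
One modest first step in that direction would be a correlation matrix of the numeric metrics against win-loss percentage. The sketch below runs on simulated values (the `toy` data frame and the relationship built into it are hypothetical), so the numbers it prints say nothing about the real teams.

```r
# Simulated stand-in for the numeric columns of selected_data
set.seed(110)
n <- 50
toy <- data.frame(
  aces_per_set    = rnorm(n, mean = 1.5, sd = 0.3),
  assists_per_set = rnorm(n, mean = 11.5, sd = 1.0),
  hitting_pctg    = rnorm(n, mean = 0.20, sd = 0.04)
)
# Assumed relationship, for illustration only
toy$win_loss_pctg <- 0.5 + 3 * (toy$hitting_pctg - 0.20) + rnorm(n, sd = 0.05)
toy$hitting_pctg[c(3, 17)] <- NA  # mimic the two missing values

# Pairwise correlations, skipping missing values pair by pair
cor_matrix <- cor(toy, use = "pairwise.complete.obs")

# Correlation of each metric with win-loss percentage
round(cor_matrix[, "win_loss_pctg"], 2)
```

Metrics with correlations close to zero would be weak candidates for a regression model, while strongly correlated metrics would be worth a closer look.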

Additional demographic or contextual variables, such as team composition (e.g., number of returning players, average player height) or coaching staff characteristics, could have provided further insights into factors influencing team success.

Overall, the visualizations offer a glimpse into the performance landscape of NCAA Division 1 volleyball teams and lay the groundwork for further analysis and exploration of this fascinating domain.