Data analysis performed for Charlotte Lemgart’s physiology of exercise lab report

A preliminary analysis can be found at: https://rpubs.com/sean_collins/LemgartAnalysis

Presented below is a more in depth analysis of the questions of interest for this report including the use of linear and multivariable regression.

Research has shown mixed relationships between anaerobic and aerobic capacity. In this study hockey players participating in the NHL combine were tested both anaerobically and aerobically. The goal of this analysis is to test whether there is a relationship between their peak aerobic power and anaerobic power. Associations will be tested between the max aerobic test variables (both VO2 max in ml/kg/min and peak aerobic test watts), Wingate peak watts, and vertical jump average watts (Lewis equation, jump with a pause).

Underlying this analysis is the belief that balanced and similiar training between the athletes would result in a positive relationship between these variables. However, it is possible that factors such as types of training varying between players, or personal characteristics may result in a weak or no relationship between these variables.

# Load necessary R packages
library(dplyr) #For data manipulation
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(reshape2) #For data manipulation related to plotting
library(ggplot2) #For plotting
library(pastecs) #For descriptive statistics
## Loading required package: boot
## 
## Attaching package: 'pastecs'
## 
## The following objects are masked from 'package:dplyr':
## 
##     first, last
# Set working directory - mac file system
setwd("~/physExLab/NHLcombinedata")

# Load data
nhldata<-read.csv("data.csv")

#Create weight in kg variable
nhldata<-mutate(nhldata, Weight_kg = Weight_lb/2.2)

#Create aerobic test peak watts / kg variable
nhldata<-mutate(nhldata, VO2max_watts_kg= VO2max_watts/Weight_kg)

#Create Lewis average watts / kg
nhldata<-mutate(nhldata, LewisAvg_.Pause_watts_kg= LewisAvg_.Pause_watts/Weight_kg)

#Create dataframe with only the variables we are interested in
nhldata<-select(nhldata,Weight_kg, WingatePeak_Watts, PeakWingate_W_kg, VO2max_l_min, VO2max_ml_kg_min, VO2max_watts, VO2max_watts_kg, LewisAvg_.Pause_watts, LewisAvg_.Pause_watts_kg)

# Limit to cases that have complete data 
nhldata <- nhldata[complete.cases(nhldata), ]
str(nhldata)
## 'data.frame':    94 obs. of  9 variables:
##  $ Weight_kg               : num  95.1 80 87.4 82.1 81.2 ...
##  $ WingatePeak_Watts       : int  1136 1017 1171 933 1089 1223 1210 917 1047 1191 ...
##  $ PeakWingate_W_kg        : num  12 12.7 13.4 11.4 13.4 14.1 13 12.8 13.9 14 ...
##  $ VO2max_l_min            : num  4.75 3.99 4.94 5.64 4.92 4.41 5.19 4.17 4.39 5.45 ...
##  $ VO2max_ml_kg_min        : num  50 49.9 56.5 68.7 60.6 50.6 55.8 57.9 58 63.8 ...
##  $ VO2max_watts            : int  520 400 520 520 520 520 520 440 480 520 ...
##  $ VO2max_watts_kg         : num  5.47 5 5.95 6.33 6.41 ...
##  $ LewisAvg_.Pause_watts   : num  1390 1330 1372 1357 1370 ...
##  $ LewisAvg_.Pause_watts_kg: num  14.6 16.6 15.7 16.5 16.9 ...
summary(nhldata)
##    Weight_kg      WingatePeak_Watts PeakWingate_W_kg  VO2max_l_min  
##  Min.   : 69.68   Min.   : 784      Min.   : 9.00    Min.   :3.640  
##  1st Qu.: 80.49   1st Qu.:1050      1st Qu.:12.80    1st Qu.:4.383  
##  Median : 84.95   Median :1142      Median :13.40    Median :4.725  
##  Mean   : 84.73   Mean   :1134      Mean   :13.41    Mean   :4.773  
##  3rd Qu.: 89.07   3rd Qu.:1195      3rd Qu.:14.00    3rd Qu.:5.185  
##  Max.   :109.41   Max.   :1502      Max.   :15.90    Max.   :6.280  
##  VO2max_ml_kg_min  VO2max_watts   VO2max_watts_kg LewisAvg_.Pause_watts
##  Min.   :44.90    Min.   :400.0   Min.   :4.396   Min.   :1125         
##  1st Qu.:51.77    1st Qu.:480.0   1st Qu.:5.649   1st Qu.:1301         
##  Median :56.30    Median :520.0   Median :6.064   Median :1413         
##  Mean   :56.41    Mean   :505.5   Mean   :5.981   Mean   :1409         
##  3rd Qu.:60.58    3rd Qu.:550.0   3rd Qu.:6.334   3rd Qu.:1500         
##  Max.   :68.70    Max.   :640.0   Max.   :7.556   Max.   :1865         
##  LewisAvg_.Pause_watts_kg
##  Min.   :12.85           
##  1st Qu.:15.90           
##  Median :16.62           
##  Mean   :16.63           
##  3rd Qu.:17.17           
##  Max.   :20.63

For a Descriptive summary of the variables, the univariate distributinos (boxplots) and additional bivariate distributions see the preliminary analysis.

Bivariate distributions (i.e. scatter plots)

This second draft attempts to fit the data with a function (i.e. linear model - either univariable or multivariable). The focus is on the plots that test the primary questions for this analysis - between the max aerobic test variables (both VO2 max in ml/kg/min and peak aerobic test watts), Wingate peak watts, and vertical jump average watts (Lewis equation, jump with a pause).

There are 9 total regression models presented below. The follow from the relevant scatterplot. If the scatter plot included a third (or fourth) variable (i.e. body mass, vertical jump) then those variables are added to the model (i.e. multivariable regression).

Compare Wingate peak watts to VO2 max watts

qplot(WingatePeak_Watts, VO2max_watts,data=nhldata,geom=c("point", "smooth"),method="lm")

Fit 1 - linear regression of Wingate Peak watts and VO2 Max Watts
fit1 <- lm(VO2max_watts ~ WingatePeak_Watts, data=nhldata) #fit model
summary(fit1) # show results 
## 
## Call:
## lm(formula = VO2max_watts ~ WingatePeak_Watts, data = nhldata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -117.29  -26.06    9.67   34.26  133.04 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       446.67364   51.22149   8.720 1.12e-13 ***
## WingatePeak_Watts   0.05192    0.04491   1.156    0.251    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 55.19 on 92 degrees of freedom
## Multiple R-squared:  0.01432,    Adjusted R-squared:  0.00361 
## F-statistic: 1.337 on 1 and 92 DF,  p-value: 0.2506

Compare VO2 max ml kg min to peak wingate watt kg

qplot(PeakWingate_W_kg, VO2max_ml_kg_min,data=nhldata,geom=c("point", "smooth"),method="lm")

Fit 2
fit2 <- lm(VO2max_ml_kg_min ~ PeakWingate_W_kg, data=nhldata) #fit model
summary(fit2) # show results 
## 
## Call:
## lm(formula = VO2max_ml_kg_min ~ PeakWingate_W_kg, data = nhldata)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.8880  -4.2729   0.1479   3.7004  11.3993 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       66.8039     6.4834  10.304   <2e-16 ***
## PeakWingate_W_kg  -0.7747     0.4815  -1.609    0.111    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.356 on 92 degrees of freedom
## Multiple R-squared:  0.02736,    Adjusted R-squared:  0.01679 
## F-statistic: 2.588 on 1 and 92 DF,  p-value: 0.1111

Compare lewis average pause watts to the peak watts with body mass

qplot(LewisAvg_.Pause_watts, WingatePeak_Watts,data=nhldata,geom=c("point", "smooth"),method="lm",size = Weight_kg)

Fit 3
fit3 <- lm(WingatePeak_Watts ~ LewisAvg_.Pause_watts, data=nhldata) #fit model
summary(fit3) # show results 
## 
## Call:
## lm(formula = WingatePeak_Watts ~ LewisAvg_.Pause_watts, data = nhldata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -509.95  -58.60    1.74   54.52  241.61 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           542.1278   112.4754   4.820 5.65e-06 ***
## LewisAvg_.Pause_watts   0.4197     0.0794   5.287 8.35e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 112.2 on 92 degrees of freedom
## Multiple R-squared:  0.233,  Adjusted R-squared:  0.2247 
## F-statistic: 27.95 on 1 and 92 DF,  p-value: 8.349e-07
Fit 4 (Fit 3 with addition of body mass predictor)
fit4 <- lm(WingatePeak_Watts ~ LewisAvg_.Pause_watts + Weight_kg, data=nhldata) #fit model
summary(fit4) # show results 
## 
## Call:
## lm(formula = WingatePeak_Watts ~ LewisAvg_.Pause_watts + Weight_kg, 
##     data = nhldata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -402.81  -53.85   -0.13   48.82  204.00 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            39.95617  139.51511   0.286    0.775    
## LewisAvg_.Pause_watts   0.03005    0.10340   0.291    0.772    
## Weight_kg              12.40815    2.41479   5.138 1.57e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 99.34 on 91 degrees of freedom
## Multiple R-squared:  0.4055, Adjusted R-squared:  0.3924 
## F-statistic: 31.03 on 2 and 91 DF,  p-value: 5.298e-11

Compare lewis average pause watts to VO2 max watts with body mass

qplot(LewisAvg_.Pause_watts, VO2max_watts,data=nhldata,geom=c("point", "smooth"),method="lm",size = Weight_kg)

Fit 5
fit5 <- lm(VO2max_watts ~ LewisAvg_.Pause_watts, data=nhldata) #fit model
summary(fit5) # show results 
## 
## Call:
## lm(formula = VO2max_watts ~ LewisAvg_.Pause_watts, data = nhldata)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -116.658  -27.888    2.771   30.792  134.354 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           420.74465   55.00359   7.649 1.92e-11 ***
## LewisAvg_.Pause_watts   0.06017    0.03883   1.550    0.125    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 54.88 on 92 degrees of freedom
## Multiple R-squared:  0.02544,    Adjusted R-squared:  0.01485 
## F-statistic: 2.402 on 1 and 92 DF,  p-value: 0.1246
Fit 6 (Fit 5 with addition of body mass predictor))
fit6 <- lm(VO2max_watts ~ LewisAvg_.Pause_watts + Weight_kg, data=nhldata) #fit model
summary(fit6)
## 
## Call:
## lm(formula = VO2max_watts ~ LewisAvg_.Pause_watts + Weight_kg, 
##     data = nhldata)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -127.471  -26.897   -2.307   29.437  114.384 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           211.30789   70.87414   2.981  0.00368 ** 
## LewisAvg_.Pause_watts  -0.10235    0.05253  -1.949  0.05443 .  
## Weight_kg               5.17497    1.22672   4.219  5.8e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 50.47 on 91 degrees of freedom
## Multiple R-squared:  0.1849, Adjusted R-squared:  0.1669 
## F-statistic: 10.32 on 2 and 91 DF,  p-value: 9.147e-05

Is there is a pattern in the distribution of vertical jump power across the bivariate distribution of wingate and aerobic power?

qplot(WingatePeak_Watts, VO2max_watts,data=nhldata,geom=c("point", "smooth"),method="lm",size = LewisAvg_.Pause_watts)

Fit 7
fit7 <- lm(VO2max_watts ~ WingatePeak_Watts + LewisAvg_.Pause_watts, data=nhldata) #fit model
summary(fit7)
## 
## Call:
## lm(formula = VO2max_watts ~ WingatePeak_Watts + LewisAvg_.Pause_watts, 
##     data = nhldata)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -117.630  -27.850    1.321   32.168  133.711 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           407.65399   61.81971   6.594 2.75e-09 ***
## WingatePeak_Watts       0.02415    0.05120   0.472    0.638    
## LewisAvg_.Pause_watts   0.05004    0.04452   1.124    0.264    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 55.11 on 91 degrees of freedom
## Multiple R-squared:  0.02782,    Adjusted R-squared:  0.00645 
## F-statistic: 1.302 on 2 and 91 DF,  p-value: 0.277

Is there a pattern in the distribution of vertical jump power and body mass across the bivariate distribution of wingate and aerobic power?

qplot(WingatePeak_Watts, VO2max_watts,data=nhldata,geom=c("point", "smooth"),method="lm",size = LewisAvg_.Pause_watts, color = Weight_kg)

Fit 8
fit8 <- lm(VO2max_watts ~ WingatePeak_Watts + LewisAvg_.Pause_watts + Weight_kg, data=nhldata) #fit model
summary(fit8)
## 
## Call:
## lm(formula = VO2max_watts ~ WingatePeak_Watts + LewisAvg_.Pause_watts + 
##     Weight_kg, data = nhldata)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -114.400  -25.728   -2.793   28.298  116.700 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           214.89815   70.17471   3.062   0.0029 ** 
## WingatePeak_Watts      -0.08985    0.05270  -1.705   0.0917 .  
## LewisAvg_.Pause_watts  -0.09965    0.05201  -1.916   0.0585 .  
## Weight_kg               6.28990    1.37900   4.561  1.6e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 49.95 on 90 degrees of freedom
## Multiple R-squared:  0.2104, Adjusted R-squared:  0.184 
## F-statistic: 7.992 on 3 and 90 DF,  p-value: 8.818e-05
Fit 9 - tests Fit 8 with the addition of an interaction term between Vertical Power and Body mass
fit9 <- lm(VO2max_watts ~ WingatePeak_Watts + LewisAvg_.Pause_watts + Weight_kg + (LewisAvg_.Pause_watts * Weight_kg), data=nhldata) #fit model
summary(fit9)
## 
## Call:
## lm(formula = VO2max_watts ~ WingatePeak_Watts + LewisAvg_.Pause_watts + 
##     Weight_kg + (LewisAvg_.Pause_watts * Weight_kg), data = nhldata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -114.42  -25.20   -2.81   28.23  116.62 
## 
## Coefficients:
##                                   Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                      1.992e+02  4.682e+02   0.425   0.6715  
## WingatePeak_Watts               -8.989e-02  5.301e-02  -1.696   0.0934 .
## LewisAvg_.Pause_watts           -8.838e-02  3.362e-01  -0.263   0.7932  
## Weight_kg                        6.473e+00  5.568e+00   1.163   0.2481  
## LewisAvg_.Pause_watts:Weight_kg -1.303e-04  3.839e-03  -0.034   0.9730  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 50.23 on 89 degrees of freedom
## Multiple R-squared:  0.2104, Adjusted R-squared:  0.1749 
## F-statistic: 5.928 on 4 and 89 DF,  p-value: 0.0002824

Suggestions

based on the above the only thing I would take away from this that you might want to include in your report would be the relationship between the vertical jump power and the wingate peak power. This is not a fascinating finding, but there is a clear impact of body mass (kg) in the graphic that the linear models also confirm.

The figure of interest is this one:

Fit 3 - shows a positive slope coefficient and R^2= 0.2416
## 
## Call:
## lm(formula = WingatePeak_Watts ~ LewisAvg_.Pause_watts, data = nhldata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -509.95  -58.60    1.74   54.52  241.61 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           542.1278   112.4754   4.820 5.65e-06 ***
## LewisAvg_.Pause_watts   0.4197     0.0794   5.287 8.35e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 112.2 on 92 degrees of freedom
## Multiple R-squared:  0.233,  Adjusted R-squared:  0.2247 
## F-statistic: 27.95 on 1 and 92 DF,  p-value: 8.349e-07
Fit 4 - adds body mass and now has an R^2= 0.4143
fit4 <- lm(WingatePeak_Watts ~ LewisAvg_.Pause_watts + Weight_kg, data=nhldata) #fit model
summary(fit4) # show results 
## 
## Call:
## lm(formula = WingatePeak_Watts ~ LewisAvg_.Pause_watts + Weight_kg, 
##     data = nhldata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -402.81  -53.85   -0.13   48.82  204.00 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            39.95617  139.51511   0.286    0.775    
## LewisAvg_.Pause_watts   0.03005    0.10340   0.291    0.772    
## Weight_kg              12.40815    2.41479   5.138 1.57e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 99.34 on 91 degrees of freedom
## Multiple R-squared:  0.4055, Adjusted R-squared:  0.3924 
## F-statistic: 31.03 on 2 and 91 DF,  p-value: 5.298e-11

Together Fit 3 and Fit 4 demonstrate that a higher vertical jump (increased explosive power) predicts a higher peak Wingate power (increased power in 5 seconds); and that much of this relationship is influenced by body mass (when body mass is added it is the stronger predictor of Wingate power)