Data analysis performed for Charlotte Lemgart’s physiology of exercise lab report
A preliminary analysis can be found at: https://rpubs.com/sean_collins/LemgartAnalysis
Presented below is a more in depth analysis of the questions of interest for this report including the use of linear and multivariable regression.
Research has shown mixed relationships between anaerobic and aerobic capacity. In this study hockey players participating in the NHL combine were tested both anaerobically and aerobically. The goal of this analysis is to test whether there is a relationship between their peak aerobic power and anaerobic power. Associations will be tested between the max aerobic test variables (both VO2 max in ml/kg/min and peak aerobic test watts), Wingate peak watts, and vertical jump average watts (Lewis equation, jump with a pause).
Underlying this analysis is the belief that balanced and similiar training between the athletes would result in a positive relationship between these variables. However, it is possible that factors such as types of training varying between players, or personal characteristics may result in a weak or no relationship between these variables.
# Load necessary R packages
library(dplyr) #For data manipulation
##
## Attaching package: 'dplyr'
##
## The following objects are masked from 'package:stats':
##
## filter, lag
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(reshape2) #For data manipulation related to plotting
library(ggplot2) #For plotting
library(pastecs) #For descriptive statistics
## Loading required package: boot
##
## Attaching package: 'pastecs'
##
## The following objects are masked from 'package:dplyr':
##
## first, last
# Set working directory - mac file system
setwd("~/physExLab/NHLcombinedata")
# Load data
nhldata<-read.csv("data.csv")
#Create weight in kg variable
nhldata<-mutate(nhldata, Weight_kg = Weight_lb/2.2)
#Create aerobic test peak watts / kg variable
nhldata<-mutate(nhldata, VO2max_watts_kg= VO2max_watts/Weight_kg)
#Create Lewis average watts / kg
nhldata<-mutate(nhldata, LewisAvg_.Pause_watts_kg= LewisAvg_.Pause_watts/Weight_kg)
#Create dataframe with only the variables we are interested in
nhldata<-select(nhldata,Weight_kg, WingatePeak_Watts, PeakWingate_W_kg, VO2max_l_min, VO2max_ml_kg_min, VO2max_watts, VO2max_watts_kg, LewisAvg_.Pause_watts, LewisAvg_.Pause_watts_kg)
# Limit to cases that have complete data
nhldata <- nhldata[complete.cases(nhldata), ]
str(nhldata)
## 'data.frame': 94 obs. of 9 variables:
## $ Weight_kg : num 95.1 80 87.4 82.1 81.2 ...
## $ WingatePeak_Watts : int 1136 1017 1171 933 1089 1223 1210 917 1047 1191 ...
## $ PeakWingate_W_kg : num 12 12.7 13.4 11.4 13.4 14.1 13 12.8 13.9 14 ...
## $ VO2max_l_min : num 4.75 3.99 4.94 5.64 4.92 4.41 5.19 4.17 4.39 5.45 ...
## $ VO2max_ml_kg_min : num 50 49.9 56.5 68.7 60.6 50.6 55.8 57.9 58 63.8 ...
## $ VO2max_watts : int 520 400 520 520 520 520 520 440 480 520 ...
## $ VO2max_watts_kg : num 5.47 5 5.95 6.33 6.41 ...
## $ LewisAvg_.Pause_watts : num 1390 1330 1372 1357 1370 ...
## $ LewisAvg_.Pause_watts_kg: num 14.6 16.6 15.7 16.5 16.9 ...
summary(nhldata)
## Weight_kg WingatePeak_Watts PeakWingate_W_kg VO2max_l_min
## Min. : 69.68 Min. : 784 Min. : 9.00 Min. :3.640
## 1st Qu.: 80.49 1st Qu.:1050 1st Qu.:12.80 1st Qu.:4.383
## Median : 84.95 Median :1142 Median :13.40 Median :4.725
## Mean : 84.73 Mean :1134 Mean :13.41 Mean :4.773
## 3rd Qu.: 89.07 3rd Qu.:1195 3rd Qu.:14.00 3rd Qu.:5.185
## Max. :109.41 Max. :1502 Max. :15.90 Max. :6.280
## VO2max_ml_kg_min VO2max_watts VO2max_watts_kg LewisAvg_.Pause_watts
## Min. :44.90 Min. :400.0 Min. :4.396 Min. :1125
## 1st Qu.:51.77 1st Qu.:480.0 1st Qu.:5.649 1st Qu.:1301
## Median :56.30 Median :520.0 Median :6.064 Median :1413
## Mean :56.41 Mean :505.5 Mean :5.981 Mean :1409
## 3rd Qu.:60.58 3rd Qu.:550.0 3rd Qu.:6.334 3rd Qu.:1500
## Max. :68.70 Max. :640.0 Max. :7.556 Max. :1865
## LewisAvg_.Pause_watts_kg
## Min. :12.85
## 1st Qu.:15.90
## Median :16.62
## Mean :16.63
## 3rd Qu.:17.17
## Max. :20.63
For a Descriptive summary of the variables, the univariate distributinos (boxplots) and additional bivariate distributions see the preliminary analysis.
This second draft attempts to fit the data with a function (i.e. linear model - either univariable or multivariable). The focus is on the plots that test the primary questions for this analysis - between the max aerobic test variables (both VO2 max in ml/kg/min and peak aerobic test watts), Wingate peak watts, and vertical jump average watts (Lewis equation, jump with a pause).
There are 9 total regression models presented below. The follow from the relevant scatterplot. If the scatter plot included a third (or fourth) variable (i.e. body mass, vertical jump) then those variables are added to the model (i.e. multivariable regression).
qplot(WingatePeak_Watts, VO2max_watts,data=nhldata,geom=c("point", "smooth"),method="lm")
fit1 <- lm(VO2max_watts ~ WingatePeak_Watts, data=nhldata) #fit model
summary(fit1) # show results
##
## Call:
## lm(formula = VO2max_watts ~ WingatePeak_Watts, data = nhldata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -117.29 -26.06 9.67 34.26 133.04
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 446.67364 51.22149 8.720 1.12e-13 ***
## WingatePeak_Watts 0.05192 0.04491 1.156 0.251
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 55.19 on 92 degrees of freedom
## Multiple R-squared: 0.01432, Adjusted R-squared: 0.00361
## F-statistic: 1.337 on 1 and 92 DF, p-value: 0.2506
qplot(PeakWingate_W_kg, VO2max_ml_kg_min,data=nhldata,geom=c("point", "smooth"),method="lm")
fit2 <- lm(VO2max_ml_kg_min ~ PeakWingate_W_kg, data=nhldata) #fit model
summary(fit2) # show results
##
## Call:
## lm(formula = VO2max_ml_kg_min ~ PeakWingate_W_kg, data = nhldata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.8880 -4.2729 0.1479 3.7004 11.3993
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 66.8039 6.4834 10.304 <2e-16 ***
## PeakWingate_W_kg -0.7747 0.4815 -1.609 0.111
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.356 on 92 degrees of freedom
## Multiple R-squared: 0.02736, Adjusted R-squared: 0.01679
## F-statistic: 2.588 on 1 and 92 DF, p-value: 0.1111
qplot(LewisAvg_.Pause_watts, WingatePeak_Watts,data=nhldata,geom=c("point", "smooth"),method="lm",size = Weight_kg)
fit3 <- lm(WingatePeak_Watts ~ LewisAvg_.Pause_watts, data=nhldata) #fit model
summary(fit3) # show results
##
## Call:
## lm(formula = WingatePeak_Watts ~ LewisAvg_.Pause_watts, data = nhldata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -509.95 -58.60 1.74 54.52 241.61
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 542.1278 112.4754 4.820 5.65e-06 ***
## LewisAvg_.Pause_watts 0.4197 0.0794 5.287 8.35e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 112.2 on 92 degrees of freedom
## Multiple R-squared: 0.233, Adjusted R-squared: 0.2247
## F-statistic: 27.95 on 1 and 92 DF, p-value: 8.349e-07
fit4 <- lm(WingatePeak_Watts ~ LewisAvg_.Pause_watts + Weight_kg, data=nhldata) #fit model
summary(fit4) # show results
##
## Call:
## lm(formula = WingatePeak_Watts ~ LewisAvg_.Pause_watts + Weight_kg,
## data = nhldata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -402.81 -53.85 -0.13 48.82 204.00
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 39.95617 139.51511 0.286 0.775
## LewisAvg_.Pause_watts 0.03005 0.10340 0.291 0.772
## Weight_kg 12.40815 2.41479 5.138 1.57e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 99.34 on 91 degrees of freedom
## Multiple R-squared: 0.4055, Adjusted R-squared: 0.3924
## F-statistic: 31.03 on 2 and 91 DF, p-value: 5.298e-11
qplot(LewisAvg_.Pause_watts, VO2max_watts,data=nhldata,geom=c("point", "smooth"),method="lm",size = Weight_kg)
fit5 <- lm(VO2max_watts ~ LewisAvg_.Pause_watts, data=nhldata) #fit model
summary(fit5) # show results
##
## Call:
## lm(formula = VO2max_watts ~ LewisAvg_.Pause_watts, data = nhldata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -116.658 -27.888 2.771 30.792 134.354
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 420.74465 55.00359 7.649 1.92e-11 ***
## LewisAvg_.Pause_watts 0.06017 0.03883 1.550 0.125
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 54.88 on 92 degrees of freedom
## Multiple R-squared: 0.02544, Adjusted R-squared: 0.01485
## F-statistic: 2.402 on 1 and 92 DF, p-value: 0.1246
fit6 <- lm(VO2max_watts ~ LewisAvg_.Pause_watts + Weight_kg, data=nhldata) #fit model
summary(fit6)
##
## Call:
## lm(formula = VO2max_watts ~ LewisAvg_.Pause_watts + Weight_kg,
## data = nhldata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -127.471 -26.897 -2.307 29.437 114.384
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 211.30789 70.87414 2.981 0.00368 **
## LewisAvg_.Pause_watts -0.10235 0.05253 -1.949 0.05443 .
## Weight_kg 5.17497 1.22672 4.219 5.8e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 50.47 on 91 degrees of freedom
## Multiple R-squared: 0.1849, Adjusted R-squared: 0.1669
## F-statistic: 10.32 on 2 and 91 DF, p-value: 9.147e-05
qplot(WingatePeak_Watts, VO2max_watts,data=nhldata,geom=c("point", "smooth"),method="lm",size = LewisAvg_.Pause_watts)
fit7 <- lm(VO2max_watts ~ WingatePeak_Watts + LewisAvg_.Pause_watts, data=nhldata) #fit model
summary(fit7)
##
## Call:
## lm(formula = VO2max_watts ~ WingatePeak_Watts + LewisAvg_.Pause_watts,
## data = nhldata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -117.630 -27.850 1.321 32.168 133.711
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 407.65399 61.81971 6.594 2.75e-09 ***
## WingatePeak_Watts 0.02415 0.05120 0.472 0.638
## LewisAvg_.Pause_watts 0.05004 0.04452 1.124 0.264
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 55.11 on 91 degrees of freedom
## Multiple R-squared: 0.02782, Adjusted R-squared: 0.00645
## F-statistic: 1.302 on 2 and 91 DF, p-value: 0.277
qplot(WingatePeak_Watts, VO2max_watts,data=nhldata,geom=c("point", "smooth"),method="lm",size = LewisAvg_.Pause_watts, color = Weight_kg)
fit8 <- lm(VO2max_watts ~ WingatePeak_Watts + LewisAvg_.Pause_watts + Weight_kg, data=nhldata) #fit model
summary(fit8)
##
## Call:
## lm(formula = VO2max_watts ~ WingatePeak_Watts + LewisAvg_.Pause_watts +
## Weight_kg, data = nhldata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -114.400 -25.728 -2.793 28.298 116.700
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 214.89815 70.17471 3.062 0.0029 **
## WingatePeak_Watts -0.08985 0.05270 -1.705 0.0917 .
## LewisAvg_.Pause_watts -0.09965 0.05201 -1.916 0.0585 .
## Weight_kg 6.28990 1.37900 4.561 1.6e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 49.95 on 90 degrees of freedom
## Multiple R-squared: 0.2104, Adjusted R-squared: 0.184
## F-statistic: 7.992 on 3 and 90 DF, p-value: 8.818e-05
fit9 <- lm(VO2max_watts ~ WingatePeak_Watts + LewisAvg_.Pause_watts + Weight_kg + (LewisAvg_.Pause_watts * Weight_kg), data=nhldata) #fit model
summary(fit9)
##
## Call:
## lm(formula = VO2max_watts ~ WingatePeak_Watts + LewisAvg_.Pause_watts +
## Weight_kg + (LewisAvg_.Pause_watts * Weight_kg), data = nhldata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -114.42 -25.20 -2.81 28.23 116.62
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.992e+02 4.682e+02 0.425 0.6715
## WingatePeak_Watts -8.989e-02 5.301e-02 -1.696 0.0934 .
## LewisAvg_.Pause_watts -8.838e-02 3.362e-01 -0.263 0.7932
## Weight_kg 6.473e+00 5.568e+00 1.163 0.2481
## LewisAvg_.Pause_watts:Weight_kg -1.303e-04 3.839e-03 -0.034 0.9730
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 50.23 on 89 degrees of freedom
## Multiple R-squared: 0.2104, Adjusted R-squared: 0.1749
## F-statistic: 5.928 on 4 and 89 DF, p-value: 0.0002824
based on the above the only thing I would take away from this that you might want to include in your report would be the relationship between the vertical jump power and the wingate peak power. This is not a fascinating finding, but there is a clear impact of body mass (kg) in the graphic that the linear models also confirm.
##
## Call:
## lm(formula = WingatePeak_Watts ~ LewisAvg_.Pause_watts, data = nhldata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -509.95 -58.60 1.74 54.52 241.61
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 542.1278 112.4754 4.820 5.65e-06 ***
## LewisAvg_.Pause_watts 0.4197 0.0794 5.287 8.35e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 112.2 on 92 degrees of freedom
## Multiple R-squared: 0.233, Adjusted R-squared: 0.2247
## F-statistic: 27.95 on 1 and 92 DF, p-value: 8.349e-07
fit4 <- lm(WingatePeak_Watts ~ LewisAvg_.Pause_watts + Weight_kg, data=nhldata) #fit model
summary(fit4) # show results
##
## Call:
## lm(formula = WingatePeak_Watts ~ LewisAvg_.Pause_watts + Weight_kg,
## data = nhldata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -402.81 -53.85 -0.13 48.82 204.00
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 39.95617 139.51511 0.286 0.775
## LewisAvg_.Pause_watts 0.03005 0.10340 0.291 0.772
## Weight_kg 12.40815 2.41479 5.138 1.57e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 99.34 on 91 degrees of freedom
## Multiple R-squared: 0.4055, Adjusted R-squared: 0.3924
## F-statistic: 31.03 on 2 and 91 DF, p-value: 5.298e-11
Together Fit 3 and Fit 4 demonstrate that a higher vertical jump (increased explosive power) predicts a higher peak Wingate power (increased power in 5 seconds); and that much of this relationship is influenced by body mass (when body mass is added it is the stronger predictor of Wingate power)