Data analysis

We have collect our data, fantastic! What are we going to do with it! Well first lest read the data in. We have two files, the first is our participant information, stature, age, mass etc. and the second is our data file “Sprint_data”. Make sure both these files are save neatly in your Human Movement folder and that you have set this folder as your working directory (in files click on more and then “set working directory” - you can copy the code into your script like I have done below:

setwd("~/Library/CloudStorage/OneDrive-TeessideUniversity/Work/Teaching/Human Movement/2023/R") # specify the folder where Sprint_data.xlsx exists on your device

Installing / loading packages

Remember we need some key packages to do something in R and once installed we need to “load” these. The code before is naming the packages we needed and installing any that are not installed and loading all of them for you - check for any error messages or if it asks you to restart R :

library(dplyr)
library(readxl)
library(ggplot2)
library(Rmisc)
library(psych)

You can now read your data in:

data <- read_excel("Sprint_data.xlsx")
head(data)
## # A tibble: 6 × 6
##   ID    Condition Trial   Five  Twenty Sex  
##   <chr> <chr>     <chr>   <chr> <chr>  <chr>
## 1 P12   Control   Trial_1 1.07  3.13   M    
## 2 P13   Away      Trial_1 1.27  3.75   F    
## 3 P15   Towards   Trial_1 1.12  3.3    M    
## 4 P19   Away      Trial_1 1.06  3.13   M    
## 5 P21   Away      Trial_1 2.38  8.15   M    
## 6 P23   Towards   Trial_1 1.1   3.33   M

Making sure our data is ready for analysis

Remember we need to make sure that R understands if our data is a number or a factor (e.g. Condition) The script below will assign the correct data type to each column - remember to check this you can use the function str(data):

data$ID<-as.factor(data$ID)
data$Condition<-as.factor(data$Condition)
data$Trial<-as.factor(data$Trial)
data$Five<-as.numeric(data$Five)
data$Twenty<-as.numeric(data$Twenty)
data$Sex<-as.factor(data$Sex)
head(data)
## # A tibble: 6 × 6
##   ID    Condition Trial    Five Twenty Sex  
##   <fct> <fct>     <fct>   <dbl>  <dbl> <fct>
## 1 P12   Control   Trial_1  1.07   3.13 M    
## 2 P13   Away      Trial_1  1.27   3.75 F    
## 3 P15   Towards   Trial_1  1.12   3.3  M    
## 4 P19   Away      Trial_1  1.06   3.13 M    
## 5 P21   Away      Trial_1  2.38   8.15 M    
## 6 P23   Towards   Trial_1  1.1    3.33 M

Calculating the mean of both trials

You will be able to see from the data that we have both trials for each participant in each condition. We want to take the average (mean) of these so we are going to use a bit of code to do this for us. This uses the aggregate and cbind functions to take the mean “Five” and “Twenty” time of each participant (ID) in each condition:

# Calculate the mean of Five and Ten, grouped by ID and Condition
mydata <- aggregate(cbind(Five, Twenty) ~ ID + Condition + Sex, data = data, FUN = mean)
head(mydata)
##    ID Condition Sex  Five Twenty
## 1 P10      Away   F 1.180  3.560
## 2 P13      Away   F 1.250  3.765
## 3 P14      Away   F 1.145  3.600
## 4 P10   Control   F 1.195  3.620
## 5 P13   Control   F 1.270  3.795
## 6 P14   Control   F 1.150  3.630

Plotting

So now we have data ready for analysis. Let’s just take a look at our data first, lets plot a simple box plot:

ggplot(mydata, aes(Condition, Twenty))+
  geom_boxplot()

You can make this look nice by adding different elements to it, including colour or a colour fill. Below I’ve added fill = Condition. You can change colour schemes, adding the line of code” “scale_fill_viridis_d(option =”viridis”, direction = 1)” uses a really nice colour palette for example. Also play with adding themes like “theme_classic”.

ggplot(mydata, aes(Condition, Twenty, fill = Condition))+
  geom_boxplot()

Assumption checking

We can see that there are a few outlines in our data. The first stage of any analysis is to check the data fits our assumptions of “normality”. For this we will use a histogram and look to see if it looks close to a normal “bell shaped curve”:

hist(mydata$Twenty)

Again this demonstrates the out liers - our data has a long tail! We can also use ggplot to graph this data and can use both a histogram and density plot:

# Histogram with kernel density
ggplot(mydata, aes(x = Twenty)) + 
  geom_histogram(aes(y = ..density..),
                 colour = 1, fill = "white") +
  geom_density()+theme_classic()
## Warning: The dot-dot notation (`..density..`) was deprecated in ggplot2 3.4.0.
## ℹ Please use `after_stat(density)` instead.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Use Google to find ways to amend this chart - think about using colors etc.

Q-Q plot

The Q-Q plot, or quantile-quantile plot, is a graphical tool to help us assess if a set of data plausibly came from some theoretical distribution such as a Normal or exponential. For example, if we run a statistical analysis that assumes our dependent variable is Normally distributed, we can use a Normal Q-Q plot to check that assumption. It's just a visual check, not an air-tight proof, so it is somewhat subjective. But itallows us to see at-a-glance if our assumption is plausible, and if not, how the assumption is violated and what data points contribute to the violation.

A Q-Q plot is a scatterplot created by plotting two sets of quantiles against one another. If both sets of quantiles came from the same distribution, we should see the points forming a line that's roughly straight.

We’ll use the library “car” so you might need to install the package if you don’t have it already:

library("car")
## Loading required package: carData
## 
## Attaching package: 'car'
## The following object is masked from 'package:psych':
## 
##     logit
## The following object is masked from 'package:dplyr':
## 
##     recode
qqPlot(mydata$Twenty)

## [1] 57 23

In this case it is really clear our data is not normally distributed, but we might want a formal test to check

Shapiro-Wilk normality test

The Shapiro–Wilk test is a test of normality and simply tells us if the data is significantly different from normal. Here the null hypothesis is that the data will be normal and the test gives us a p-value for the likelihood that we would see the distribution we do if the null hypothesis is true. Remember p-values are crude and in reality why is p=0.51 any better than p=0.49. As such this is also not “air-tight”.

The code below runs the Shapiro-Wilk test for Twenty meter time and we get W = 0.43, p = <0.0001) Note: e-14 means there are 14 0’s in front of the 6.82!

shapiro.test(mydata$Twenty)
## 
##  Shapiro-Wilk normality test
## 
## data:  mydata$Twenty
## W = 0.43028, p-value = 6.828e-14

Dealing with outliers

So our data clearly isn’t normal and we probably do need to address the outliers here - there is no right or wrong way to do this but its important we document what and how we do it. Here I have written code that you can copy that is going to delete any rows from our data where Five meter time is more than 3 standard deviations over the mean. We are going to call the new data set “my_data clean”

# Calculate the mean and standard deviation of the Five variable
mean_Five <- mean(mydata$Five)
sd_Five <- sd(mydata$Five)

# Define a threshold for outliers as 2 standard deviations from the mean
threshold <- 3 * sd_Five

# Remove rows containing outliers from the Five variable
mydata_clean <- mydata[abs(mydata$Five - mean_Five) < threshold, ]

So lets create our box plot again:

box_2<-ggplot(mydata_clean, aes(Condition, Twenty, fill = Condition))+
  geom_boxplot()+
  geom_jitter(alpha = 0.2)+
  labs(title="Box plot with jittered points", y="Twenty meter time (s)")+ 
  scale_fill_viridis_d(option = "viridis", direction = 1) +
  theme_classic() + 
      theme(legend.position = "none")
box_2

ggsave(
  "box_2.png", dpi = 320,
)
## Saving 7 x 5 in image

You should now have a go at running your assumption checks again for the clean_data - you will find again that these data are not normally distributed which poses us a problem.

Summarising our data

If the data is normally distributed we would summarise our data first using mean ± standard deviation for each group. summarySE is a great way to do this:

summarySE(data = mydata_clean, measurevar = "Twenty", groupvars = "Condition")
##   Condition  N   Twenty        sd         se        ci
## 1      Away 19 3.225789 0.2281950 0.05235152 0.1099865
## 2   Control 19 3.226842 0.2447403 0.05614727 0.1179610
## 3   Towards 19 3.227105 0.2404759 0.05516895 0.1159057

Non-parametric statistics

So if we are following the undergraduate textbook we would need to run a test to see if we can reject the null hypothesis. That test would need to be one that did not assume the data to be normally distributed: We have three conditions so usually we would use a test called and Analysis of Variance (ANOVA) but here we want to find it’s non-parametric alternative.

The Kruskal-Wallis test

The Kruskal-Wallis H test (sometimes also called the "one-way ANOVA on ranks") is a rank-based nonparametric test that can be used to determine if there are statistically significant differences between two or more groups of an independent variable on a continuous or ordinal dependent variable. It is considered the nonparametric alternative to the one-way ANOVA, and an extension of the Mann-Whitney U test to allow the comparison of more than two independent groups. As the Kruskal-Wallis H test does not assume normality in the data and is much less sensitive to outliers, it can be used when these assumptions have been violated and the use of a one-way ANOVA is inappropriate. In addition, if your data is ordinal, a one-way ANOVA is inappropriate, but the Kruskal-Wallis H test is not.

# install.packages(ggpubr)
# install.packages(rstatix)
library(ggpubr)
library(rstatix)
res.kruskal <- mydata_clean %>% kruskal_test(Five ~ Condition)
res.kruskal
## # A tibble: 1 × 6
##   .y.       n statistic    df     p method        
## * <chr> <int>     <dbl> <int> <dbl> <chr>         
## 1 Five     57     0.285     2 0.867 Kruskal-Wallis

Here we get a non-significant result (p = 0.867) with a very small effect size (- 0.03). Note: <0.2 is usually deemed trivial. So we cannot reject the null hypothesis here.

mydata_clean %>%kruskal_effsize(Five ~ Condition)
## # A tibble: 1 × 5
##   .y.       n effsize method  magnitude
## * <chr> <int>   <dbl> <chr>   <ord>    
## 1 Five     57 -0.0318 eta2[H] small

Out of interest what would we have got if we had run an ANOVA: here we get a slightly different p-value, even higher p=0.946, so again we cannot reject the null.

anova = aov(Five ~ Condition, mydata_clean)
summary(anova)
##             Df  Sum Sq  Mean Sq F value Pr(>F)
## Condition    2 0.00053 0.000266   0.055  0.946
## Residuals   54 0.26088 0.004831

Often statisticians would not go any further now, they’d be content with saying there were no significant differences between condition and thus not enough evidence to reject the null. However, I think it is useful still to quantify the between conditions e.g., away - control, toward - control and away - control. These are termed “post-hoc” tests.

There are a few ways of doing this the first is the Wilcoxon rank test (a non-parametric t-test) and because our chance of making a type 1 error increase the more tests we do we need to adjust the p-values accordingly - below we’ve chosen a Bonferroni correction for this:

pairwise.wilcox.test(mydata_clean$Five, mydata_clean$Condition, p.adjust.method = "bonferroni")
## Warning in wilcox.test.default(xi, xj, paired = paired, ...): cannot compute
## exact p-value with ties

## Warning in wilcox.test.default(xi, xj, paired = paired, ...): cannot compute
## exact p-value with ties

## Warning in wilcox.test.default(xi, xj, paired = paired, ...): cannot compute
## exact p-value with ties
## 
##  Pairwise comparisons using Wilcoxon rank sum test with continuity correction 
## 
## data:  mydata_clean$Five and mydata_clean$Condition 
## 
##         Away Control
## Control 1    -      
## Towards 1    1      
## 
## P value adjustment method: bonferroni

This shows us we have p-values of 1- NOTE: it is very unusually to get p = 1 (that’s the highest value you can get (100% probability).

Approach 2 - Dunn's Test (correct approacgh for kruskal-wallis test)

The Dunn’s test is perhaps a more appropriate method after using the Kruskal-wallis test. This gives us the same adjusted p-values.

pwc <- mydata_clean %>%
dunn_test(Five ~ Condition, p.adjust.method = "bonferroni")
pwc
## # A tibble: 3 × 9
##   .y.   group1  group2     n1    n2 statistic     p p.adj p.adj.signif
## * <chr> <chr>   <chr>   <int> <int>     <dbl> <dbl> <dbl> <chr>       
## 1 Five  Away    Control    19    19   -0.455  0.649     1 ns          
## 2 Five  Away    Towards    19    19    0.0147 0.988     1 ns          
## 3 Five  Control Towards    19    19    0.470  0.639     1 ns

But let’s do all this a better way

Remember, most of the stats we will do will come down to linear models: indeed an ANOVA is a linear model and a Kruskall-Wallis is basically a linear model on the ranked data (in stead of the raw data!).

Really it is not the data that needs to be normal, it is the “residuals” from the model, so a better approach is to run the model and test if it meets the necessary assumptions.

We are going to use three packages for this: lme4 runs the linear models, performance can be used to check these and emmeans allows us to look at the difference between groups.

Below we are running a model (called m) for five meter time by condition.

library(lme4)
library(performance)
library(emmeans)

m=lm(Five ~ 1  + Condition, mydata_clean)
m
## 
## Call:
## lm(formula = Five ~ 1 + Condition, data = mydata_clean)
## 
## Coefficients:
##      (Intercept)  ConditionControl  ConditionTowards  
##        1.0784211        -0.0060526         0.0007895

So we have our model, first let’s check it, we can run check_model(m) from the performance packages an this will visualize how the model fits. We can also check specific assumptions e.g., normality and heteroscedasticity (is the variance in the data uniform?)

check_model(m)

check_normality(m)     # shapiro.test, however, visual inspection (e.g. Q-Q plots) are preferable
## Warning: Non-normality of residuals detected (p = 0.013).
check_heteroscedasticity(m)
## OK: Error variance appears to be homoscedastic (p = 0.587).

So the model fits reasonably well but there the. residuals are significantly different from normal and the q-q plot shows variance is not uniform.

Maybe there is something that can explain this in our data? We have both male and female participants - what if we added “Sex” in the model as a co variate!

m2=lm(Five ~ 1  + Condition + Sex, mydata_clean)
check_model(m2)
## Variable `Component` is not in your data frame :/

check_normality(m2)     # shapiro.test, however, visual inspection (e.g. Q-Q plots) are preferable
## OK: residuals appear as normally distributed (p = 0.331).
check_heteroscedasticity(m2)
## OK: Error variance appears to be homoscedastic (p = 0.969).

Hey presto!! we have a model that fits nicely and we can use. So let’s see the results:

summary(m2)
## 
## Call:
## lm(formula = Five ~ 1 + Condition + Sex, data = mydata_clean)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.109973 -0.026815  0.003975  0.025027  0.088185 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       1.1978655  0.0172926  69.270  < 2e-16 ***
## ConditionControl -0.0060526  0.0146733  -0.412    0.682    
## ConditionTowards  0.0007895  0.0146733   0.054    0.957    
## SexM             -0.1418403  0.0164280  -8.634  1.1e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04523 on 53 degrees of freedom
## Multiple R-squared:  0.5853, Adjusted R-squared:  0.5618 
## F-statistic: 24.94 on 3 and 53 DF,  p-value: 3.38e-10
m2
## 
## Call:
## lm(formula = Five ~ 1 + Condition + Sex, data = mydata_clean)
## 
## Coefficients:
##      (Intercept)  ConditionControl  ConditionTowards              SexM  
##        1.1978655        -0.0060526         0.0007895        -0.1418403

Wow - that’s a lot of data, but basically this gives us the results of an ANOVA with an F-statistic and a p-value. The model shows significant differences but these are for Sex and not for Condition. You would write this F(3,53) = 29.9, p =<0.001 . Out of interest our first model without Sex showed F(2,54) = 0.06, p = 0.95.

Great but that doesn’t tell us anything useful, what is the actual difference between the conditions in seconds?

To compare the means between the conditions we can use the estimated marginal means (emmeans) packages and add confidence intervals, here I have set them at 95% . This shows us our Away condition mean was 1.13, our control was 1.12 and toward was 1.13, It also shows us the difference between the conditions and the confidence intervals, this can go in a nice table in our results e.g., Away - Control, 0.006 95% CI -0.029 to 0.041 p = 0.911

emm <- emmeans(m2, pairwise ~ Condition)
emm
## $emmeans
##  Condition emmean     SE df lower.CL upper.CL
##  Away        1.13 0.0118 53      1.1     1.15
##  Control     1.12 0.0118 53      1.1     1.14
##  Towards     1.13 0.0118 53      1.1     1.15
## 
## Results are averaged over the levels of: Sex 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast           estimate     SE df t.ratio p.value
##  Away - Control     0.006053 0.0147 53   0.412  0.9106
##  Away - Towards    -0.000789 0.0147 53  -0.054  0.9984
##  Control - Towards -0.006842 0.0147 53  -0.466  0.8873
## 
## Results are averaged over the levels of: Sex 
## P value adjustment: tukey method for comparing a family of 3 estimates
confint(emm, level = 0.95)
## $emmeans
##  Condition emmean     SE df lower.CL upper.CL
##  Away        1.13 0.0118 53      1.1     1.15
##  Control     1.12 0.0118 53      1.1     1.14
##  Towards     1.13 0.0118 53      1.1     1.15
## 
## Results are averaged over the levels of: Sex 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast           estimate     SE df lower.CL upper.CL
##  Away - Control     0.006053 0.0147 53  -0.0293   0.0414
##  Away - Towards    -0.000789 0.0147 53  -0.0362   0.0346
##  Control - Towards -0.006842 0.0147 53  -0.0422   0.0285
## 
## Results are averaged over the levels of: Sex 
## Confidence level used: 0.95 
## Conf-level adjustment: tukey method for comparing a family of 3 estimates

So we have our results!

There are no differences between the conditions, therefore it does not matter if we say run as fast as you can or give the external cues. Not quite, just because there is no evidence the external cues did not work doesn’t mean there is evidence they are the same. What we need to do is something called and equivalence test. Here we need to see if our differences between groups and the 90% confidence intervals for this different sit within a range we would determine to be “equivalent’ (see the work of Lakens e.g., 2017. Datson et al., 2022 comprehensively looked into what was deemed an important change in 5m speed in women’s soccer and suggested ~0.07 seconds. We can actually see from above our entire 95% CI sit between - 0.07 and +0.07 so we can be pretty sure that these data are statistically equivalent.

Lakens, D., 2017. Equivalence tests: A practical primer for t tests, correlations, and meta-analyses. Social psychological and personality science, 8(4), pp.355-362.

Datson, N., Lolli, L., Drust, B., Atkinson, G., Weston, M. and Gregson, W., 2022. Inter-methodological quantification of the target change for performance test outcomes relevant to elite female soccer players. Science and Medicine in Football, 6(2), pp.248-261.