In this workbook we will simulate some data representing a simple cross-over design in which we look at sprint speed after two different conditions: a basic warm-up (warm-up A, the control condition) versus a “special” warm-up (warm-up B, which might include a fancy activation exercise).
Hopefully our study has been designed well and we have followed the guidelines in the CONSORT statement extension for cross-over designs. Remember that the strength of this design is that we compare within participants, whereas a traditional parallel-groups design compares two groups of different people.
Dwan, K., Li, T., Altman, D.G. and Elbourne, D., 2019. CONSORT 2010 statement: extension to randomised crossover trials. BMJ, 366.
For this workbook we are going to simulate 5 metre and 20 metre sprint times for 30 participants. We will draw the data from normal distributions using rnorm(), create data for each condition at both distances, and combine everything into one data frame that we can analyse.
set.seed(123)
# Create data for condition_A
Five_A <- rnorm(30, 1.03, 0.03)
Twenty_A <- rnorm(30, 3.06, 0.21)
# Create a dataframe with condition_A data
A <- data.frame(Five = Five_A, Twenty = Twenty_A, Condition = rep("condition_A", 30))
# Generate data for condition_B
Five_B <- rnorm(30, 0.99, 0.04)
Twenty_B <- rnorm(30, 3.01, 0.19)
# Create a dataframe with condition_B data
B <- data.frame(Five = Five_B, Twenty = Twenty_B, Condition = rep("condition_B", 30))
# Combine data for condition_A and condition_B in long format
AB <- rbind(A, B)
# Print the first few rows of the combined dataframe
head(AB)
## Five Twenty Condition
## 1 1.013186 3.149557 condition_A
## 2 1.023095 2.998035 condition_A
## 3 1.076761 3.247976 condition_A
## 4 1.032115 3.244408 condition_A
## 5 1.033879 3.232532 condition_A
## 6 1.081452 3.204614 condition_A
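One caveat with the simulation above: every rnorm() call is independent, so the simulated times for warm-up A and warm-up B are uncorrelated. In a real cross-over study the same athletes run under both conditions, so their times are usually correlated. Below is a minimal sketch of one way to build that in, assuming each athlete has a stable ability shared across conditions. The names `n`, `athlete`, `Twenty_A_cor`, `Twenty_B_cor` and `AB_cor`, and the standard deviations 0.18 and 0.08, are illustrative assumptions and not part of the workbook's data.

```r
set.seed(123)
n <- 30  # number of participants (same as in the workbook)

# Each athlete has a stable 20 m ability that carries across both conditions
athlete <- rnorm(n, mean = 0, sd = 0.18)  # assumed between-athlete SD

# Observed time = condition mean + athlete effect + trial-to-trial noise
Twenty_A_cor <- 3.06 + athlete + rnorm(n, mean = 0, sd = 0.08)  # assumed noise SD
Twenty_B_cor <- 3.01 + athlete + rnorm(n, mean = 0, sd = 0.08)

# Long-format data frame with an explicit participant ID
AB_cor <- data.frame(
  ID = factor(rep(seq_len(n), times = 2)),
  Twenty = c(Twenty_A_cor, Twenty_B_cor),
  Condition = rep(c("condition_A", "condition_B"), each = n)
)

# Within-participant correlation between the two conditions
cor(Twenty_A_cor, Twenty_B_cor)
```

Keeping an ID column makes the pairing explicit, so the analysis does not have to rely on the rows of each condition being in the same participant order.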
Remember we need to check whether the data are normally distributed (they should be, as we made them that way!). Below I have plotted a histogram with a kernel density curve and then run the Shapiro-Wilk test. NOTE: there are better ways to do this, but they are beyond the scope of this workbook.
We might also want a Q-Q plot, for which we can use qqPlot() from the car package (remember to install that package if you haven’t already!).
# Load ggplot2 (needed for all the plots in this workbook)
library(ggplot2)
# Histogram on the density scale with a kernel density curve on top;
# after_stat(density) replaces the deprecated ..density.. notation
ggplot(AB, aes(x = Twenty)) +
  geom_histogram(aes(y = after_stat(density)),
                 colour = 1, fill = "white") +
  geom_density() +
  theme_classic()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
shapiro.test(AB$Twenty)
##
## Shapiro-Wilk normality test
##
## data: AB$Twenty
## W = 0.98847, p-value = 0.8431
library("car")
qqPlot(AB$Twenty)
## [1] 14 37
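For the paired t-test used later in this workbook, the normality assumption applies to the within-participant differences, not to the pooled times from both conditions. Below is a minimal sketch of that check. It assumes that row i of condition A and row i of condition B belong to the same participant (the workbook's data frame has no participant ID column), and `diff_20` and `sw` are names introduced here for illustration.

```r
# Recreate the simulated data exactly as above (same seed and draw order)
set.seed(123)
Five_A   <- rnorm(30, 1.03, 0.03)
Twenty_A <- rnorm(30, 3.06, 0.21)
Five_B   <- rnorm(30, 0.99, 0.04)
Twenty_B <- rnorm(30, 3.01, 0.19)

# Paired differences: one value per participant (A minus B)
diff_20 <- Twenty_A - Twenty_B

# Shapiro-Wilk test on the differences
sw <- shapiro.test(diff_20)
sw

# Base-R Q-Q plot of the differences with a reference line
qqnorm(diff_20)
qqline(diff_20)
```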
Our data appear to be normally distributed, so let’s visualise them and run our analysis, starting with a simple box plot:
ggplot(AB, aes(Condition, Twenty, fill = Condition))+
geom_boxplot()
There are lots of ways to change this plot: try adding a title or axis titles, changing the colours, using a different theme (e.g., theme_classic()) or removing the legend. You can also assign the plot to a name and save it as a picture in your working directory.
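# Store the plot in an object so it can be printed and saved
plot1 <- ggplot(AB, aes(Condition, Twenty, fill = Condition)) +
  geom_boxplot()
plot1
# Save the named plot explicitly rather than relying on the last plot drawn
ggsave("plot1.png", plot = plot1, dpi = 320)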
## Saving 7 x 5 in image
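The customisations suggested above can be sketched as follows. This is one possible version, not the only way to do it: the title, axis labels, colours and the object name `plot2` are illustrative choices, and the data are recreated here so the block runs on its own.

```r
library(ggplot2)

# Recreate the 20 m data exactly as above (same seed and draw order)
set.seed(123)
Five_A   <- rnorm(30, 1.03, 0.03)
Twenty_A <- rnorm(30, 3.06, 0.21)
Five_B   <- rnorm(30, 0.99, 0.04)
Twenty_B <- rnorm(30, 3.01, 0.19)
AB <- data.frame(
  Twenty = c(Twenty_A, Twenty_B),
  Condition = rep(c("condition_A", "condition_B"), each = 30)
)

plot2 <- ggplot(AB, aes(Condition, Twenty, fill = Condition)) +
  geom_boxplot() +
  # Title and axis titles
  labs(title = "20 m sprint time by warm-up condition",
       x = "Warm-up condition",
       y = "20 m sprint time (s)") +
  # Friendlier axis labels and custom fill colours
  scale_x_discrete(labels = c(condition_A = "Warm-up A",
                              condition_B = "Warm-up B")) +
  scale_fill_manual(values = c(condition_A = "grey70",
                               condition_B = "steelblue")) +
  # Cleaner theme, and no legend because the x-axis already names the groups
  theme_classic() +
  theme(legend.position = "none")
plot2
```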
So it looks like sprinting is faster in condition B than in condition A. Assuming we are using a frequentist approach, we should have stated our hypotheses beforehand and now need to test them, here with a paired t-test:
# Subset the data to get only Twenty data for condition_A and condition_B
Twenty_A <- AB$Twenty[AB$Condition == "condition_A"]
Twenty_B <- AB$Twenty[AB$Condition == "condition_B"]
# Conduct a paired t-test to compare mean 20 m times between condition_A and condition_B
# (pairing relies on both vectors listing the participants in the same order)
t.test(Twenty_A, Twenty_B, paired = TRUE)
##
## Paired t-test
##
## data: Twenty_A and Twenty_B
## t = 2.4382, df = 29, p-value = 0.02112
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 0.01697044 0.19360946
## sample estimates:
## mean difference
## 0.1052899
This shows a statistically significant difference between conditions in 20 m sprint time, with warm-up B 0.11 s (95% CI 0.02 to 0.19 s) faster than warm-up A, the control. You can report just this in your results or, if you want, also give the test statistics: t(29) = 2.44, p = 0.021.
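To avoid copying numbers by hand from the console, the values for the write-up can be pulled straight from the object that t.test() returns. Below is a minimal sketch that recreates the simulated data so it runs on its own; `res`, `mean_diff`, `ci` and `report` are names introduced here for illustration.

```r
# Recreate the simulated data exactly as above (same seed and draw order)
set.seed(123)
Five_A   <- rnorm(30, 1.03, 0.03)
Twenty_A <- rnorm(30, 3.06, 0.21)
Five_B   <- rnorm(30, 0.99, 0.04)
Twenty_B <- rnorm(30, 3.01, 0.19)

# Store the test result instead of only printing it
res <- t.test(Twenty_A, Twenty_B, paired = TRUE)

# Pull out the pieces needed for reporting
mean_diff <- unname(res$estimate)  # mean of (A minus B), in seconds
ci        <- res$conf.int          # 95% confidence interval

# Build a formatted sentence fragment for the results section
report <- sprintf(
  "Mean difference = %.2f s (95%% CI %.2f to %.2f), t(%d) = %.2f, p = %.3f",
  mean_diff, ci[1], ci[2],
  as.integer(res$parameter), res$statistic, res$p.value
)
report
```

Because the difference is calculated as A minus B and these are sprint times, a positive value means times were shorter (faster) after warm-up B.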