We delve into the world of statistical analysis to understand how different factors impact heart rate variability. We’ll use real-world medical data and the power of R to conduct a Repeated Measures ANOVA, allowing us to uncover valuable insights from repeated measurements.
This article will equip you with the skills to confidently analyze repeated measurements and draw meaningful conclusions using Repeated Measures ANOVA in R.
Join us on this journey to unlock the secrets hidden within heart rate data!
https://www.youtube.com/@RVStats_ENG
Repeated Measures ANOVA and Mixed ANOVA are used to analyze data with repeated measurements, but they differ in terms of their underlying assumptions and the types of factors they incorporate.
Here’s a comparison of the two approaches:
Repeated Measures ANOVA:
Assumption: Repeated Measures ANOVA assumes sphericity, which means that the variances of the differences between all pairs of within-subjects factor levels are equal. This assumption can be restrictive and is often violated in real-world data.
Factors: Repeated Measures ANOVA deals only with within-subjects factors. It analyzes the effect of a single independent variable (factor) with multiple levels on a dependent variable, where each level is measured repeatedly within the same subjects.
Design: It is suitable when there’s a single independent variable with multiple levels, and each participant provides measurements for each level of that variable.
Interaction: Repeated Measures ANOVA only tests the main effect of the within-subjects factor and its interactions.
Example: Analyzing the effect of different time points (e.g., Before and After) on blood pressure in a single treatment group.
Mixed ANOVA (Split-Plot ANOVA):
Assumption: Mixed ANOVA relaxes the assumption of sphericity by incorporating both between-subjects and within-subjects factors. It can handle unbalanced designs more effectively.
Design: It is suitable when there are two or more independent variables, where at least one is between-subjects (grouping variable) and another is within-subjects (repeated measurements).
The repeated measures ANOVA makes the following assumptions about the data:
No significant outliers in any cell of the design.
Normality: the outcome (or dependent) variable should be approximately normally distributed in each cell of the design.
Assumption of sphericity: the variance of the differences between groups should be equal. Automatically reported when using the R function anova_test().
\(SS_T = \sum_{i=1}^{k} \sum_{j=1}^{n_i}(x_{ij} - \bar{x})^2\)
\(SS_B = SS_{group} = \sum_{i=1}^{k} n_i (\bar{x_i} - \bar{x})^2\)
\(SS_W = SS_{subjects}= \sum_{i=1}^{n}\sum_{j=1}^{k}(x_{ij} - \bar{x_i})^2\)
\(SS_E = SS_W - SS_B\)
For other names, we have \(SS_{group}=SS_{Model}\)
Understand the importance of heart rate variability and how it can provide insights into overall health and well-being.
Tere are a few examples of medical procedures that could be mentioned as examples for cases similar to the heart rate variability data provided:
Exercise Stress Test: Explore how heart rate changes before, during, and after an exercise stress test, which measures the heart’s response to physical activity.
Infusion Therapy: Analyze heart rate variations before, during, and after infusion therapy sessions, which involve administering fluids or medications intravenously.
Pain Management Interventions: Investigate heart rate changes during pain management interventions like acupuncture sessions, where patients experience varying levels of discomfort.
Respiratory Treatments: Study heart rate variability during respiratory treatments such as inhalation therapy, assessing the impact on heart rate due to improved breathing.
Psychological Stress Testing: Analyze heart rate responses during psychological stress tests, which simulate stressful situations to observe the heart’s reaction.
Pre- and Post-Operative Monitoring: Explore changes in heart rate before, during, and after surgical procedures, focusing on how anesthesia and surgery affect heart rate.
Invasive Medical Procedures: Investigate heart rate variability in response to invasive procedures like endoscopy, where patients experience physiological changes.
Temperature Variations: Study heart rate variations due to temperature changes, such as exposure to cold or heat therapy sessions.
Cognitive Tasks: Analyze heart rate responses during cognitive tasks or memory tests to understand the influence of mental activity on heart rate.
Nutritional Interventions: Explore heart rate changes before, during, and after nutritional interventions, like glucose tolerance tests or meal consumption.
Take a closer look at our medical data, which includes heart rate measurements taken before, during, and after a medical procedure for multiple patients.
# Load necessary packages
library(tibble)
# Create a dataframe with medical application data in wide format
medical_data <- tribble(
~Patient_ID, ~Before, ~During, ~After,
1, 75, 85, 80,
2, 80, 90, 82,
3, 70, 75, 74,
4, 85, 95, 88,
5, 78, 88, 81
)
# Print the dataframe
print(medical_data)
## # A tibble: 5 × 4
## Patient_ID Before During After
## <dbl> <dbl> <dbl> <dbl>
## 1 1 75 85 80
## 2 2 80 90 82
## 3 3 70 75 74
## 4 4 85 95 88
## 5 5 78 88 81
In this example, we consider a study that measures the heart rate of patients before, during and after one treatment intervention. Each patient’s heart rate is measured multiple times to account for the repeated measures within the same individuals.
The concept of repeated measurements is applied and that’s why Repeated Measures ANOVA is the ideal statistical tool for analyzing this data.
To perform repeated measures anova in R, we identify subject as within subject variable and treat it as a random factor.
Hypothesis 1: Main Effect of Time Point
H0: \(μ_{Before} = μ_{During} = μ_{After}\)
(There is no difference in mean heart rate across time points.)
Ha: At least one pair of means is different (There is a significant difference in mean heart rate across time points.)
Hypothesis 2: Interaction Effect
H0: The effect of time point on heart rate is the same for all patients. Ha: The effect of time point on heart rate varies across patients.
Explore two popular R packages, rstatix`` andnlme`, to
perform Repeated Measures ANOVA. Compare their advantages and usage.
##reading libraries to use
library(rstatix)
##
## Attaching package: 'rstatix'
## The following object is masked from 'package:stats':
##
## filter
library(ggpubr)
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 4.2.3
library(nlme)
library(car)
## Loading required package: carData
Before exploring and modelling ANOVA is necessary to pass the data to long format.
To convert the wide format data to long format, we can use the pivot_longer function from the tidyr package:
# Load necessary packages
library(tidyr)
# Convert to long format
medical_data <- pivot_longer(
data = medical_data,
cols = c(Before, During, After),
names_to = "Time_Point",
values_to = "Heart_Rate"
)
And converting Patient ID to factor (Although for certain packages is not necessary):
medical_data$Patient_ID = as.factor(medical_data$Patient_ID)
medical_data$Time_Point = factor(medical_data$Time_Point,
levels=c("Before","During","After"))
# Print the dataframe in long format
print(medical_data)
## # A tibble: 15 × 3
## Patient_ID Time_Point Heart_Rate
## <fct> <fct> <dbl>
## 1 1 Before 75
## 2 1 During 85
## 3 1 After 80
## 4 2 Before 80
## 5 2 During 90
## 6 2 After 82
## 7 3 Before 70
## 8 3 During 75
## 9 3 After 74
## 10 4 Before 85
## 11 4 During 95
## 12 4 After 88
## 13 5 Before 78
## 14 5 During 88
## 15 5 After 81
Let’s create some tables to summarize the data
medical_data %>%
group_by(Time_Point) %>%
get_summary_stats(Heart_Rate, type = "mean_sd")
## # A tibble: 3 × 5
## Time_Point variable n mean sd
## <fct> <fct> <dbl> <dbl> <dbl>
## 1 Before Heart_Rate 5 77.6 5.60
## 2 During Heart_Rate 5 86.6 7.44
## 3 After Heart_Rate 5 81 5
To explore the differences (if any) some plots are shown:
Create a box plot and add points corresponding to individual values:
bxp <- ggboxplot(medical_data, x = "Time_Point", y = "Heart_Rate", add = "point")
bxp
# Create an interaction plot
interaction.plot(
x.factor = medical_data$Time_Point,
trace.factor = medical_data$Patient_ID,
response = medical_data$Heart_Rate,
xlab="Time",
ylab = "Heart Rate",
trace.label = "Patient"
)
## testing outliers
medical_data %>%
group_by(Time_Point) %>%
identify_outliers(Heart_Rate)
## # A tibble: 3 × 5
## Time_Point Patient_ID Heart_Rate is.outlier is.extreme
## <fct> <fct> <dbl> <lgl> <lgl>
## 1 During 3 75 TRUE FALSE
## 2 After 3 74 TRUE FALSE
## 3 After 4 88 TRUE FALSE
No extreme value has been identified.
What to do if there are outliers?
In the situation where we have extreme outliers, we can include the outlier in the analysis anyway if we do not believe the result will be substantially affected.
This can be evaluated by comparing the result of the ANOVA with and without the outlier.
#Normality assumption
medical_data %>%
group_by(Time_Point) %>%
shapiro_test(Heart_Rate)
## # A tibble: 3 × 4
## Time_Point variable statistic p
## <fct> <chr> <dbl> <dbl>
## 1 Before Heart_Rate 0.998 0.998
## 2 During Heart_Rate 0.950 0.740
## 3 After Heart_Rate 0.958 0.794
The Heart Rate score was normally distributed at each time point, as assessed by Shapiro-Wilk’s test (p > 0.05).
Walk through the process of setting up and running the analysis using
the ez package. Interpret the output to understand the
effects of time points on heart rate.
# Perform Repeated Measures ANOVA with the rstatix package
rm_anova <- anova_test(
data = medical_data,
dv = Heart_Rate,
wid = Patient_ID,
within = Time_Point,
type = 3 # Specifies repeated measures design
)
# Print the results
rm_anova
## ANOVA Table (type III tests)
##
## $ANOVA
## Effect DFn DFd F p p<.05 ges
## 1 Time_Point 2 8 43.943 4.85e-05 * 0.316
##
## $`Mauchly's Test for Sphericity`
## Effect W p p<.05
## 1 Time_Point 0.358 0.215
##
## $`Sphericity Corrections`
## Effect GGe DF[GG] p[GG] p[GG]<.05 HFe DF[HF] p[HF]
## 1 Time_Point 0.609 1.22, 4.87 0.001 * 0.735 1.47, 5.88 0.000399
## p[HF]<.05
## 1 *
# ANOVA table
get_anova_table(rm_anova)
## ANOVA Table (type III tests)
##
## Effect DFn DFd F p p<.05 ges
## 1 Time_Point 2 8 43.943 4.85e-05 * 0.316
The Heart Rate was statistically significantly different at the different time points during the intervention, F(2, 8) = 43.943, p < 0.0001, eta2[g] = 0.316.
nlme
PackageFollow step-by-step instructions to perform Repeated Measures ANOVA
using the nlme package. Understand how to handle repeated
measures and interpret the results.
# Perform Repeated Measures ANOVA with the nlme package
rm_anova_nlme <- lme(
Heart_Rate ~ Time_Point,
random = ~1 | Patient_ID,
data = medical_data
)
# Print the results
print(summary(rm_anova_nlme))
## Linear mixed-effects model fit by REML
## Data: medical_data
## AIC BIC logLik
## 74.40574 76.83027 -32.20287
##
## Random effects:
## Formula: ~1 | Patient_ID
## (Intercept) Residual
## StdDev: 5.903389 1.532971
##
## Fixed effects: Heart_Rate ~ Time_Point
## Value Std.Error DF t-value p-value
## (Intercept) 77.6 2.727636 8 28.449541 0.000
## Time_PointDuring 9.0 0.969536 8 9.282791 0.000
## Time_PointAfter 3.4 0.969536 8 3.506832 0.008
## Correlation:
## (Intr) Tm_PnD
## Time_PointDuring -0.178
## Time_PointAfter -0.178 0.500
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -1.99524517 -0.33260188 0.06212067 0.57226684 1.00546393
##
## Number of Observations: 15
## Number of Groups: 5
# Print the results
print(anova(rm_anova_nlme))
## numDF denDF F-value p-value
## (Intercept) 1 8 937.3720 <.0001
## Time_Point 2 8 43.9433 <.0001
Discover how to perform a Mixed ANOVA, combining both between-subjects and within-subjects factors.
# Perform Mixed ANOVA
model = aov(Heart_Rate ~ Time_Point + Error(Patient_ID/Time_Point), data = medical_data)
summary(model)
##
## Error: Patient_ID
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 4 427.6 106.9
##
## Error: Patient_ID:Time_Point
## Df Sum Sq Mean Sq F value Pr(>F)
## Time_Point 2 206.5 103.27 43.94 4.85e-05 ***
## Residuals 8 18.8 2.35
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Here, having Patient ID as factor was important to have
same results than previous packages.
library(lmerTest)
## Warning: package 'lmerTest' was built under R version 4.2.3
## Loading required package: lme4
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
##
## Attaching package: 'lme4'
## The following object is masked from 'package:nlme':
##
## lmList
##
## Attaching package: 'lmerTest'
## The following object is masked from 'package:lme4':
##
## lmer
## The following object is masked from 'package:stats':
##
## step
# Fit a mixed-effects ANOVA model
mixed_anova <- lmer(Heart_Rate ~ Time_Point + (1 | Patient_ID), data = medical_data)
# Print the results
print(summary(mixed_anova))
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: Heart_Rate ~ Time_Point + (1 | Patient_ID)
## Data: medical_data
##
## REML criterion at convergence: 64.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.99525 -0.33260 0.06212 0.57227 1.00546
##
## Random effects:
## Groups Name Variance Std.Dev.
## Patient_ID (Intercept) 34.85 5.903
## Residual 2.35 1.533
## Number of obs: 15, groups: Patient_ID, 5
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 77.6000 2.7276 4.3553 28.450 4.09e-06 ***
## Time_PointDuring 9.0000 0.9695 8.0000 9.283 1.48e-05 ***
## Time_PointAfter 3.4000 0.9695 8.0000 3.507 0.008 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) Tm_PnD
## Tim_PntDrng -0.178
## Tim_PntAftr -0.178 0.500
pwc <- medical_data %>%
pairwise_t_test(
Heart_Rate ~ Time_Point, paired = TRUE,
p.adjust.method = "bonferroni"
)
pwc
## # A tibble: 3 × 10
## .y. group1 group2 n1 n2 statistic df p p.adj p.adj.si…¹
## * <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 Heart_Rate Before During 5 5 -9 4 0.000844 0.003 **
## 2 Heart_Rate Before After 5 5 -6.67 4 0.003 0.008 **
## 3 Heart_Rate During After 5 5 4.48 4 0.011 0.033 *
## # … with abbreviated variable name ¹p.adj.signif
All the pairwise differences are statistically significant.
Post-hoc analyses with a Bonferroni adjustment revealed that all the pairwise differences, between time points, were statistically significantly different (p <= 0.05).
pwc <- pwc %>% add_xy_position(x = "Time_Point")
bxp +
stat_pvalue_manual(pwc) +
labs(
subtitle = get_test_label(rm_anova, detailed = TRUE),
caption = get_pwc_label(pwc)
)
p-values are below a chosen significance level (e.g., 0.05), we reject the null hypotheses and conclude that there are significant differences in heart rate across time points and No interaction effect between time points and patients.