Exploring Heart Rate Variability: Repeated Measures ANOVA in R

We delve into the world of statistical analysis to understand how different factors impact heart rate variability. We’ll use real-world medical data and the power of R to conduct a Repeated Measures ANOVA, allowing us to uncover valuable insights from repeated measurements.

This article will equip you with the skills to confidently analyze repeated measurements and draw meaningful conclusions using Repeated Measures ANOVA in R.

Join us on this journey to unlock the secrets hidden within heart rate data!

https://www.youtube.com/@RVStats_ENG


WHY REPEATED MEASURES ANOVA

Repeated Measures ANOVA and Mixed ANOVA are used to analyze data with repeated measurements, but they differ in terms of their underlying assumptions and the types of factors they incorporate.

Here’s a comparison of the two approaches:

Repeated Measures ANOVA:

  1. Assumption: Repeated Measures ANOVA assumes sphericity, which means that the variances of the differences between all pairs of within-subjects factor levels are equal. This assumption can be restrictive and is often violated in real-world data.

  2. Factors: Repeated Measures ANOVA deals only with within-subjects factors. It analyzes the effect of a single independent variable (factor) with multiple levels on a dependent variable, where each level is measured repeatedly within the same subjects.

  3. Design: It is suitable when there’s a single independent variable with multiple levels, and each participant provides measurements for each level of that variable.

  4. Interaction: Repeated Measures ANOVA only tests the main effect of the within-subjects factor and its interactions.

  5. Example: Analyzing the effect of different time points (e.g., Before and After) on blood pressure in a single treatment group.

Mixed ANOVA (Split-Plot ANOVA):

  1. Assumption: Mixed ANOVA relaxes the assumption of sphericity by incorporating both between-subjects and within-subjects factors. It can handle unbalanced designs more effectively.

  2. Design: It is suitable when there are two or more independent variables, where at least one is between-subjects (grouping variable) and another is within-subjects (repeated measurements).

Assumptions for repeated measures ANOVA

The repeated measures ANOVA makes the following assumptions about the data:

  • No significant outliers in any cell of the design.

  • Normality: the outcome (or dependent) variable should be approximately normally distributed in each cell of the design.

  • Assumption of sphericity: the variance of the differences between groups should be equal. Automatically reported when using the R function anova_test().

TABLES


Anova table.
Anova table.


\(SS_T = \sum_{i=1}^{k} \sum_{j=1}^{n_i}(x_{ij} - \bar{x})^2\)

\(SS_B = SS_{group} = \sum_{i=1}^{k} n_i (\bar{x_i} - \bar{x})^2\)

\(SS_W = SS_{subjects}= \sum_{i=1}^{n}\sum_{j=1}^{k}(x_{ij} - \bar{x_i})^2\)

\(SS_E = SS_W - SS_B\)


For other names, we have \(SS_{group}=SS_{Model}\)


Introduction to Heart Rate Variability

Understand the importance of heart rate variability and how it can provide insights into overall health and well-being.

Tere are a few examples of medical procedures that could be mentioned as examples for cases similar to the heart rate variability data provided:

  1. Exercise Stress Test: Explore how heart rate changes before, during, and after an exercise stress test, which measures the heart’s response to physical activity.

  2. Infusion Therapy: Analyze heart rate variations before, during, and after infusion therapy sessions, which involve administering fluids or medications intravenously.

  3. Pain Management Interventions: Investigate heart rate changes during pain management interventions like acupuncture sessions, where patients experience varying levels of discomfort.

  4. Respiratory Treatments: Study heart rate variability during respiratory treatments such as inhalation therapy, assessing the impact on heart rate due to improved breathing.

  5. Psychological Stress Testing: Analyze heart rate responses during psychological stress tests, which simulate stressful situations to observe the heart’s reaction.

  6. Pre- and Post-Operative Monitoring: Explore changes in heart rate before, during, and after surgical procedures, focusing on how anesthesia and surgery affect heart rate.

  7. Invasive Medical Procedures: Investigate heart rate variability in response to invasive procedures like endoscopy, where patients experience physiological changes.

  8. Temperature Variations: Study heart rate variations due to temperature changes, such as exposure to cold or heat therapy sessions.

  9. Cognitive Tasks: Analyze heart rate responses during cognitive tasks or memory tests to understand the influence of mental activity on heart rate.

  10. Nutritional Interventions: Explore heart rate changes before, during, and after nutritional interventions, like glucose tolerance tests or meal consumption.

DATA OVERVIEW

Take a closer look at our medical data, which includes heart rate measurements taken before, during, and after a medical procedure for multiple patients.

# Load necessary packages
library(tibble)

# Create a dataframe with medical application data in wide format
medical_data <- tribble(
  ~Patient_ID, ~Before, ~During, ~After,
  1, 75, 85, 80,
  2, 80, 90, 82,
  3, 70, 75, 74,
  4, 85, 95, 88,
  5, 78, 88, 81
)

# Print the dataframe
print(medical_data)
## # A tibble: 5 × 4
##   Patient_ID Before During After
##        <dbl>  <dbl>  <dbl> <dbl>
## 1          1     75     85    80
## 2          2     80     90    82
## 3          3     70     75    74
## 4          4     85     95    88
## 5          5     78     88    81

Exploring Repeated Measures ANOVA

In this example, we consider a study that measures the heart rate of patients before, during and after one treatment intervention. Each patient’s heart rate is measured multiple times to account for the repeated measures within the same individuals.

The concept of repeated measurements is applied and that’s why Repeated Measures ANOVA is the ideal statistical tool for analyzing this data.

To perform repeated measures anova in R, we identify subject as within subject variable and treat it as a random factor.


HYPOTHESIS

Hypothesis 1: Main Effect of Time Point

H0: \(μ_{Before} = μ_{During} = μ_{After}\)

(There is no difference in mean heart rate across time points.)

Ha: At least one pair of means is different (There is a significant difference in mean heart rate across time points.)

Hypothesis 2: Interaction Effect

H0: The effect of time point on heart rate is the same for all patients. Ha: The effect of time point on heart rate varies across patients.

Choosing the Right Package

Explore two popular R packages, rstatix`` andnlme`, to perform Repeated Measures ANOVA. Compare their advantages and usage.

##reading libraries to use

library(rstatix)
## 
## Attaching package: 'rstatix'
## The following object is masked from 'package:stats':
## 
##     filter
library(ggpubr)
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 4.2.3
library(nlme)
library(car)
## Loading required package: carData

PREPARING DATA

Before exploring and modelling ANOVA is necessary to pass the data to long format.

To convert the wide format data to long format, we can use the pivot_longer function from the tidyr package:

# Load necessary packages
library(tidyr)

# Convert to long format
medical_data <- pivot_longer(
  data = medical_data,
  cols = c(Before, During, After),
  names_to = "Time_Point",
  values_to = "Heart_Rate"
)

And converting Patient ID to factor (Although for certain packages is not necessary):

medical_data$Patient_ID = as.factor(medical_data$Patient_ID)

medical_data$Time_Point = factor(medical_data$Time_Point,
                                    levels=c("Before","During","After"))

# Print the dataframe in long format
print(medical_data)
## # A tibble: 15 × 3
##    Patient_ID Time_Point Heart_Rate
##    <fct>      <fct>           <dbl>
##  1 1          Before             75
##  2 1          During             85
##  3 1          After              80
##  4 2          Before             80
##  5 2          During             90
##  6 2          After              82
##  7 3          Before             70
##  8 3          During             75
##  9 3          After              74
## 10 4          Before             85
## 11 4          During             95
## 12 4          After              88
## 13 5          Before             78
## 14 5          During             88
## 15 5          After              81

DATA EXPLORATION

Let’s create some tables to summarize the data

medical_data %>%
  group_by(Time_Point) %>%
  get_summary_stats(Heart_Rate, type = "mean_sd")
## # A tibble: 3 × 5
##   Time_Point variable       n  mean    sd
##   <fct>      <fct>      <dbl> <dbl> <dbl>
## 1 Before     Heart_Rate     5  77.6  5.60
## 2 During     Heart_Rate     5  86.6  7.44
## 3 After      Heart_Rate     5  81    5

VISUALIZATION

To explore the differences (if any) some plots are shown:

Create a box plot and add points corresponding to individual values:

bxp <- ggboxplot(medical_data, x = "Time_Point", y = "Heart_Rate", add = "point")

bxp

# Create an interaction plot
interaction.plot(
  x.factor = medical_data$Time_Point,
  trace.factor = medical_data$Patient_ID,
  response = medical_data$Heart_Rate,
  xlab="Time",
  ylab = "Heart Rate",
  trace.label = "Patient"
)

MODELLING ANOVA REPEATED MEASURES

TEST ASSUMPTIONS

## testing outliers

medical_data %>%
  group_by(Time_Point) %>%
  identify_outliers(Heart_Rate)
## # A tibble: 3 × 5
##   Time_Point Patient_ID Heart_Rate is.outlier is.extreme
##   <fct>      <fct>           <dbl> <lgl>      <lgl>     
## 1 During     3                  75 TRUE       FALSE     
## 2 After      3                  74 TRUE       FALSE     
## 3 After      4                  88 TRUE       FALSE

No extreme value has been identified.

What to do if there are outliers?

In the situation where we have extreme outliers, we can include the outlier in the analysis anyway if we do not believe the result will be substantially affected.

This can be evaluated by comparing the result of the ANOVA with and without the outlier.

#Normality assumption
medical_data %>%
  group_by(Time_Point) %>%
  shapiro_test(Heart_Rate)
## # A tibble: 3 × 4
##   Time_Point variable   statistic     p
##   <fct>      <chr>          <dbl> <dbl>
## 1 Before     Heart_Rate     0.998 0.998
## 2 During     Heart_Rate     0.950 0.740
## 3 After      Heart_Rate     0.958 0.794

The Heart Rate score was normally distributed at each time point, as assessed by Shapiro-Wilk’s test (p > 0.05).

Conducting Repeated Measures ANOVA with `rstatix`` Package

Walk through the process of setting up and running the analysis using the ez package. Interpret the output to understand the effects of time points on heart rate.

# Perform Repeated Measures ANOVA with the rstatix package
rm_anova <- anova_test(
  data = medical_data,
  dv = Heart_Rate,
  wid = Patient_ID,
  within = Time_Point,
  type = 3  # Specifies repeated measures design
)

# Print the results
rm_anova
## ANOVA Table (type III tests)
## 
## $ANOVA
##       Effect DFn DFd      F        p p<.05   ges
## 1 Time_Point   2   8 43.943 4.85e-05     * 0.316
## 
## $`Mauchly's Test for Sphericity`
##       Effect     W     p p<.05
## 1 Time_Point 0.358 0.215      
## 
## $`Sphericity Corrections`
##       Effect   GGe     DF[GG] p[GG] p[GG]<.05   HFe     DF[HF]    p[HF]
## 1 Time_Point 0.609 1.22, 4.87 0.001         * 0.735 1.47, 5.88 0.000399
##   p[HF]<.05
## 1         *
# ANOVA table

get_anova_table(rm_anova)
## ANOVA Table (type III tests)
## 
##       Effect DFn DFd      F        p p<.05   ges
## 1 Time_Point   2   8 43.943 4.85e-05     * 0.316

The Heart Rate was statistically significantly different at the different time points during the intervention, F(2, 8) = 43.943, p < 0.0001, eta2[g] = 0.316.

Conducting Repeated Measures ANOVA with nlme Package

Follow step-by-step instructions to perform Repeated Measures ANOVA using the nlme package. Understand how to handle repeated measures and interpret the results.

# Perform Repeated Measures ANOVA with the nlme package
rm_anova_nlme <- lme(
  Heart_Rate ~ Time_Point,
  random = ~1 | Patient_ID,
  data = medical_data
)

# Print the results
print(summary(rm_anova_nlme))
## Linear mixed-effects model fit by REML
##   Data: medical_data 
##        AIC      BIC    logLik
##   74.40574 76.83027 -32.20287
## 
## Random effects:
##  Formula: ~1 | Patient_ID
##         (Intercept) Residual
## StdDev:    5.903389 1.532971
## 
## Fixed effects:  Heart_Rate ~ Time_Point 
##                  Value Std.Error DF   t-value p-value
## (Intercept)       77.6  2.727636  8 28.449541   0.000
## Time_PointDuring   9.0  0.969536  8  9.282791   0.000
## Time_PointAfter    3.4  0.969536  8  3.506832   0.008
##  Correlation: 
##                  (Intr) Tm_PnD
## Time_PointDuring -0.178       
## Time_PointAfter  -0.178  0.500
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -1.99524517 -0.33260188  0.06212067  0.57226684  1.00546393 
## 
## Number of Observations: 15
## Number of Groups: 5
# Print the results
print(anova(rm_anova_nlme))
##             numDF denDF  F-value p-value
## (Intercept)     1     8 937.3720  <.0001
## Time_Point      2     8  43.9433  <.0001

Mixed ANOVA and lmerTest package (Bonus Content)

Discover how to perform a Mixed ANOVA, combining both between-subjects and within-subjects factors.

# Perform Mixed ANOVA

model = aov(Heart_Rate ~ Time_Point + Error(Patient_ID/Time_Point), data = medical_data)

summary(model)
## 
## Error: Patient_ID
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals  4  427.6   106.9               
## 
## Error: Patient_ID:Time_Point
##            Df Sum Sq Mean Sq F value   Pr(>F)    
## Time_Point  2  206.5  103.27   43.94 4.85e-05 ***
## Residuals   8   18.8    2.35                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Here, having Patient ID as factor was important to have same results than previous packages.

library(lmerTest)
## Warning: package 'lmerTest' was built under R version 4.2.3
## Loading required package: lme4
## Loading required package: Matrix
## 
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack
## 
## Attaching package: 'lme4'
## The following object is masked from 'package:nlme':
## 
##     lmList
## 
## Attaching package: 'lmerTest'
## The following object is masked from 'package:lme4':
## 
##     lmer
## The following object is masked from 'package:stats':
## 
##     step
# Fit a mixed-effects ANOVA model
mixed_anova <- lmer(Heart_Rate ~ Time_Point + (1 | Patient_ID), data = medical_data)

# Print the results
print(summary(mixed_anova))
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: Heart_Rate ~ Time_Point + (1 | Patient_ID)
##    Data: medical_data
## 
## REML criterion at convergence: 64.4
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.99525 -0.33260  0.06212  0.57227  1.00546 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  Patient_ID (Intercept) 34.85    5.903   
##  Residual                2.35    1.533   
## Number of obs: 15, groups:  Patient_ID, 5
## 
## Fixed effects:
##                  Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)       77.6000     2.7276  4.3553  28.450 4.09e-06 ***
## Time_PointDuring   9.0000     0.9695  8.0000   9.283 1.48e-05 ***
## Time_PointAfter    3.4000     0.9695  8.0000   3.507    0.008 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Tm_PnD
## Tim_PntDrng -0.178       
## Tim_PntAftr -0.178  0.500

pairwise comparisons

pwc <- medical_data %>%
  pairwise_t_test(
    Heart_Rate ~ Time_Point, paired = TRUE,
    p.adjust.method = "bonferroni"
    )
pwc
## # A tibble: 3 × 10
##   .y.        group1 group2    n1    n2 statistic    df        p p.adj p.adj.si…¹
## * <chr>      <chr>  <chr>  <int> <int>     <dbl> <dbl>    <dbl> <dbl> <chr>     
## 1 Heart_Rate Before During     5     5     -9        4 0.000844 0.003 **        
## 2 Heart_Rate Before After      5     5     -6.67     4 0.003    0.008 **        
## 3 Heart_Rate During After      5     5      4.48     4 0.011    0.033 *         
## # … with abbreviated variable name ¹​p.adj.signif

All the pairwise differences are statistically significant.

Visualization: box plots with p-values

Post-hoc analyses with a Bonferroni adjustment revealed that all the pairwise differences, between time points, were statistically significantly different (p <= 0.05).

pwc <- pwc %>% add_xy_position(x = "Time_Point")
bxp + 
  stat_pvalue_manual(pwc) +
  labs(
    subtitle = get_test_label(rm_anova, detailed = TRUE),
    caption = get_pwc_label(pwc)
  )

Conclusion and Applications

p-values are below a chosen significance level (e.g., 0.05), we reject the null hypotheses and conclude that there are significant differences in heart rate across time points and No interaction effect between time points and patients.