Determinants of Pulse Rate Variability in Response to Acute Physical Activity: An Analysis of Classroom Experiment Data

Author

Dennis Baidoo

Published

June 21, 2025

Abstract

This study examines physiological and lifestyle factors associated with pulse rate changes following brief physical activity, using data from classroom experiments conducted at The University of Queensland between 1993 and 1998. Among 110 student participants, cardiovascular response differed sharply between conditions: pulse rate rose by a mean of 51.4 bpm in students who ran in place, compared with a mean change of -1.0 bpm in students who sat. Baseline pulse rate showed a modest association with exercise frequency, whereas post-activity pulse change was determined mainly by activity type rather than by individual characteristics. Because response variability was far larger among runners than among sitters, pulse change was modelled with separate linear regressions for each group, illustrating one way to handle heteroscedastic data in exercise science research.

1. Introduction

Pulse rate response to physical activity is a basic measurement in exercise physiology and health research. This investigation analyzes data from introductory statistics students who measured their pulse rate before and after a brief period of either running in place or sitting. The assignment procedure changed during the study period to address potential compliance issues, which allows examination of both physiological responses and methodological considerations. The study addresses three questions: the fidelity of random assignment to activity conditions, the determinants of baseline pulse rate, and the factors influencing pulse rate change after activity. These questions are relevant to both exercise science and the design of classroom-based physiological experiments.

2. Methods

The study population comprised 110 undergraduate students enrolled in introductory statistics courses between 1993 and 1998. Participants first recorded their resting pulse rate (Pulse1), were then randomly assigned via coin toss (1993-1994) or pre-assigned forms (1995-1998) to either run in place or sit quietly for one minute, and finally recorded a second pulse measurement (Pulse2). Participants also reported height, weight, age, gender, smoking status, alcohol consumption, and exercise frequency. The change in pulse rate (Pulse2 - Pulse1) was compared between activity groups using Welch’s t-test. Predictors of baseline pulse rate were examined using linear regression with stepwise selection based on BIC, and pulse change was modelled separately for runners and sitters to account for heteroscedasticity. Before regression modelling, one unusual observation (ID 73) and any records with an age of 30 or above or a height of 100 cm or below were excluded. Model diagnostics included residual analysis, normality tests, and variance inflation factors.
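The first study question, the fidelity of random assignment, can be examined by asking whether the share of runners is consistent with a fair coin. The sketch below is one possible check, not the procedure used in this analysis: `check_assignment()` is a hypothetical helper, the 0.5 target proportion is an assumption, and the data are simulated for illustration.

```r
# Hypothetical helper: exact binomial test of the proportion of runners
# against an expected proportion (0.5 for a fair coin; an assumption).
check_assignment <- function(ran, p_expected = 0.5) {
  ran   <- ran[!is.na(ran)]
  n_ran <- sum(ran == "Ran")
  n_all <- length(ran)
  test  <- binom.test(x = n_ran, n = n_all, p = p_expected)
  list(
    n_ran    = n_ran
  , n_all    = n_all
  , prop_ran = n_ran / n_all
  , p_value  = test$p.value
  )
}

# Simulated assignments (illustrative only, not the classroom data)
set.seed(1)
ran_sim <- factor(
  sample(c("Ran", "Sat"), size = 100, replace = TRUE)
, levels = c("Ran", "Sat")
)
res_sim <- check_assignment(ran_sim)

# A perfectly balanced split, used as a sanity check
res_balanced <- check_assignment(factor(rep(c("Ran", "Sat"), each = 50)))
```

With the data loaded as in the Results section, the same helper could be applied to `dat_pulse$Ran`, either overall or within each year, to compare the coin-toss and pre-assigned-form periods.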

3. Results

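The two activity groups differed clearly: runners showed a mean pulse increase of 51.4 bpm (SD = 21.4), compared with a mean change of -1.0 bpm (SD = 3.8) among sitters. The difference was statistically significant (Welch’s t(47.3) = 16.6, p < 0.001; 95% CI for the difference in means: 46.1 to 58.7 bpm). Shapiro-Wilk tests did not indicate departures from normality in either group (runners: W = 0.974, p = 0.39; sitters: W = 0.983, p = 0.54).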
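The fractional degrees of freedom in the test above come from the Welch-Satterthwaite approximation, which does not pool the two group variances. As a rough sketch, the statistic can be reproduced from group means, standard deviations, and sample sizes. `welch_summary()` is a hypothetical helper, and the data are simulated because the group sizes are not reported in the text.

```r
# Hypothetical helper: Welch's t statistic, Welch-Satterthwaite degrees of
# freedom, and two-sided p-value from summary statistics.
welch_summary <- function(m1, s1, n1, m2, s2, n2) {
  v1     <- s1^2 / n1                    # squared standard error, group 1
  v2     <- s2^2 / n2                    # squared standard error, group 2
  t_stat <- (m1 - m2) / sqrt(v1 + v2)
  df     <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  p      <- 2 * pt(-abs(t_stat), df)
  c(t = t_stat, df = df, p = p)
}

# Simulated pulse changes (illustrative only, not the classroom data)
set.seed(42)
x <- rnorm(40, mean = 50, sd = 20)  # stand-in for runners
y <- rnorm(60, mean = 0,  sd = 4)   # stand-in for sitters

by_hand  <- welch_summary(mean(x), sd(x), length(x), mean(y), sd(y), length(y))
built_in <- t.test(x, y, var.equal = FALSE)

# The hand computation should agree with t.test()
all.equal(unname(by_hand["t"]),  unname(built_in$statistic))
all.equal(unname(by_hand["df"]), unname(built_in$parameter))
```

When one group is far more variable than the other, the degrees of freedom fall close to that group's sample size minus one, which is why they are well below the total sample size here.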

library(erikmisc)
library(tidyverse)
ggplot2::theme_set(ggplot2::theme_bw())  # set theme_bw for all plots
# First, download the data to your computer,
#   save in the same folder as this qmd file.

# read the data
dat_pulse <-
  readr::read_table(
    file = "ADA2_CL_28_StatQualExam201408_PulseRates.dat"
  , show_col_types = FALSE
  ) |>
  dplyr::bind_rows(
    # Observations to predict are the last two rows of the dataset
    # Because we include them with the other data,
    #   all of the formatting is the same and the predictions will be made
    #   automatically since the response variable is NA (missing)
    read.table(
      text = "
Height Weight Age Gender Smokes Alcohol Exercise Ran Pulse1 Pulse2 Year
165    67     20  1      2      1       2        1   NA     NA     97
165    67     20  1      2      1       2        2   NA     NA     97
"
    , header = TRUE
    )
  ) |>
  mutate(
    # ID numbers we can use for removing unusual observations
    ID         = 1:n()
    # create response difference variable
  , Pulse_diff = Pulse2 - Pulse1
    # create factor variables
  , Gender     = Gender   |> factor(levels = c(1, 2)   , labels = c("Male", "Female"))
  , Smokes     = Smokes   |> factor(levels = c(1, 2)   , labels = c("Yes", "No"))
  , Alcohol    = Alcohol  |> factor(levels = c(1, 2)   , labels = c("Yes", "No"))
  , Exercise   = Exercise |> factor(levels = c(1, 2, 3), labels = c("High", "Moderate", "Low"))
  , Ran        = Ran      |> factor(levels = c(1, 2)   , labels = c("Ran", "Sat"))
  , Year       = Year     |> factor()
  )

str(dat_pulse)
tibble [112 × 13] (S3: tbl_df/tbl/data.frame)
 $ Height    : num [1:112] 173 179 167 195 173 184 162 169 164 168 ...
 $ Weight    : num [1:112] 57 58 62 84 64 74 57 55 56 60 ...
 $ Age       : num [1:112] 18 19 18 18 18 22 20 18 19 23 ...
 $ Gender    : Factor w/ 2 levels "Male","Female": 2 2 2 1 2 1 2 2 2 1 ...
 $ Smokes    : Factor w/ 2 levels "Yes","No": 2 2 2 2 2 2 2 2 2 2 ...
 $ Alcohol   : Factor w/ 2 levels "Yes","No": 1 1 1 1 1 1 1 1 1 1 ...
 $ Exercise  : Factor w/ 3 levels "High","Moderate",..: 2 2 1 1 3 3 2 2 1 2 ...
 $ Ran       : Factor w/ 2 levels "Ran","Sat": 2 1 1 2 2 1 2 2 2 1 ...
 $ Pulse1    : num [1:112] 86 82 96 71 90 78 68 71 68 88 ...
 $ Pulse2    : num [1:112] 88 150 176 73 88 141 72 77 68 150 ...
 $ Year      : Factor w/ 5 levels "93","95","96",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ ID        : int [1:112] 1 2 3 4 5 6 7 8 9 10 ...
 $ Pulse_diff: num [1:112] 2 68 80 2 -2 63 4 6 0 62 ...
# Observations to predict are the last two rows of the dataset
tail(dat_pulse, 4)
# A tibble: 4 × 13
  Height Weight   Age Gender Smokes Alcohol Exercise Ran   Pulse1 Pulse2 Year 
   <dbl>  <dbl> <dbl> <fct>  <fct>  <fct>   <fct>    <fct>  <dbl>  <dbl> <fct>
1    170     65    18 Male   No     Yes     High     Sat       69     64 98   
2    185     85    19 Male   No     Yes     Moderate Sat       75     68 98   
3    165     67    20 Male   No     Yes     Moderate Ran       NA     NA 97   
4    165     67    20 Male   No     Yes     Moderate Sat       NA     NA 97   
# ℹ 2 more variables: ID <int>, Pulse_diff <dbl>
# select only numeric variables
# convert to long format
# plot histogram of each variable, facet by variable

dat_plot <-
  dat_pulse |>
  select(
    where(is.numeric)
  , Ran
  ) |>
  pivot_longer(
    cols = -c(ID, Ran)
  , names_to  = "Var"
  , values_to = "Value"
  )

p <- ggplot(dat_plot, aes(x = Value, fill = Ran))
p <- p + theme_bw()
p <- p + geom_histogram()
p <- p + facet_wrap(~ Var, drop = TRUE, scales = "free")
p <- p + labs(
                title     = "Plots of numeric variables"
              )
print(p)
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: Removed 9 rows containing non-finite outside the scale range
(`stat_bin()`).

p <- ggplot(dat_pulse, aes(x = ID |> reorder(Pulse_diff), y = Pulse_diff, fill = Ran))
p <- p + theme_bw()
p <- p + geom_col()
p <- p + coord_flip()
#p <- p + facet_wrap(~ Ran, drop = TRUE)
p <- p + labs(
                title     = "Sorted pulse difference by ID number"
              , x         = "ID ordered by Pulse_diff"
              , y         = "Pulse difference"
              )
print(p)
Warning: Removed 3 rows containing missing values or values outside the scale range
(`geom_col()`).

print(shapiro.test(dat_pulse[dat_pulse$Ran=="Ran",]$Pulse_diff))

    Shapiro-Wilk normality test

data:  dat_pulse[dat_pulse$Ran == "Ran", ]$Pulse_diff
W = 0.9741, p-value = 0.3899
print(shapiro.test(dat_pulse[dat_pulse$Ran=="Sat",]$Pulse_diff))

    Shapiro-Wilk normality test

data:  dat_pulse[dat_pulse$Ran == "Sat", ]$Pulse_diff
W = 0.98306, p-value = 0.5375
# Perform two-sample t-test
t_test <- t.test(Pulse_diff ~ Ran, data = dat_pulse, var.equal = FALSE)
print(t_test)

    Welch Two Sample t-test

data:  Pulse_diff by Ran
t = 16.64, df = 47.28, p-value < 2.2e-16
alternative hypothesis: true difference in means between group Ran and group Sat is not equal to 0
95 percent confidence interval:
 46.05825 58.72436
sample estimates:
mean in group Ran mean in group Sat 
          51.3913           -1.0000 
dat_pulse2 <-
  dat_pulse |>
  filter(
    # Remove individual outliers, or ...
    ID %notin% c(73)
    # ... remove outliers using "Plots of numeric variables"
  , Age    < 30
  , Height > 100
  )
# Full main-effects model for baseline pulse rate
lm_fit_init <-
  lm(
    Pulse1 ~ (Age + Height + Weight + Gender + Exercise + Smokes + Alcohol)
  , data = dat_pulse2
  )

e_plot_lm_diagnostics(lm_fit_init, sw_plot_set = "simple")

summary(lm_fit_init)

Call:
lm(formula = Pulse1 ~ (Age + Height + Weight + Gender + Exercise + 
    Smokes + Alcohol), data = dat_pulse2)

Residuals:
     Min       1Q   Median       3Q      Max 
-25.9603  -7.3176   0.8837   6.9766  27.6041 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)   
(Intercept)      102.22386   32.76266   3.120   0.0024 **
Age               -0.39227    0.52523  -0.747   0.4570   
Height            -0.11081    0.18483  -0.600   0.5503   
Weight            -0.05364    0.10829  -0.495   0.6215   
GenderFemale       0.57497    2.95586   0.195   0.8462   
ExerciseModerate   5.09493    3.27289   1.557   0.1229   
ExerciseLow        8.61266    3.54836   2.427   0.0171 * 
SmokesNo          -2.37937    3.54412  -0.671   0.5036   
AlcoholNo         -1.41838    2.28743  -0.620   0.5367   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.38 on 94 degrees of freedom
  (3 observations deleted due to missingness)
Multiple R-squared:  0.125, Adjusted R-squared:  0.05058 
F-statistic: 1.679 on 8 and 94 DF,  p-value: 0.1136
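# Stepwise selection in both directions, up to all two-way interactions
lm_fit_BIC <-
  step(
    lm_fit_init
  , scope =
      list(
        upper = Pulse1 ~ (Age + Height + Weight + Gender + Exercise + Year + Smokes + Alcohol)^2
      , lower = Pulse1 ~ 1
      )
  , direction = "both"
  , test      = "F"
  , trace     = 0
    # BIC penalty is log(n); nobs() counts only the rows lm() actually used,
    #   whereas nrow(dat_pulse2) would also count rows dropped for missingness
  , k         = log(nobs(lm_fit_init))
  )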
lm_fit_final <- lm_fit_BIC
e_plot_lm_diagnostics(lm_fit_final, sw_plot_set = "simple")

out_cont2 <-
  e_plot_model_contrasts(
    fit = lm_fit_BIC
  , dat_cont = dat_pulse2
  , sw_print = FALSE
  , sw_TWI_plots_keep = c("singles", "both", "all")[2]
  , sw_TWI_both_orientation = c("wide", "tall")[1]
)
out_cont2$plots
$Height

p <- ggplot(dat_pulse, aes(x = Pulse_diff, fill = Ran))
p <- p + theme_bw()
p <- p + geom_histogram()
p <- p + facet_wrap(~ Ran, drop = TRUE)#, scales = "free")
p <- p + labs(
                title     = "3. Pulse difference by Ran"
              )
print(p)
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: Removed 3 rows containing non-finite outside the scale range
(`stat_bin()`).

## Model for the Sat group
# Drop rows with any missing value (this includes the two prediction rows)
dat_pulse2 <- na.omit(dat_pulse2)
dat_pulse2_sat <- dat_pulse2[dat_pulse2$Ran == "Sat", ]
lm_fit_init_sat <-
  lm(
    Pulse_diff ~ (Age + Height + Weight + Gender + Exercise + Smokes + Alcohol)
  , data = dat_pulse2_sat
  )
e_plot_lm_diagnostics(lm_fit_init_sat, sw_plot_set = "simple")

summary(lm_fit_init_sat)

Call:
lm(formula = Pulse_diff ~ (Age + Height + Weight + Gender + Exercise + 
    Smokes + Alcohol), data = dat_pulse2_sat)

Residuals:
    Min      1Q  Median      3Q     Max 
-9.3588 -2.4770  0.1289  2.3347  6.6665 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)  
(Intercept)      -32.09412   16.13627  -1.989   0.0521 .
Age                0.25253    0.25285   0.999   0.3226  
Height             0.19040    0.09467   2.011   0.0496 *
Weight            -0.10980    0.05350  -2.052   0.0453 *
GenderFemale       0.81898    1.47227   0.556   0.5805  
ExerciseModerate  -1.23356    1.68665  -0.731   0.4679  
ExerciseLow       -0.32349    1.79461  -0.180   0.8577  
SmokesNo           1.19062    1.60906   0.740   0.4627  
AlcoholNo         -1.20813    1.10279  -1.096   0.2784  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.847 on 51 degrees of freedom
Multiple R-squared:  0.1286,    Adjusted R-squared:  -0.008078 
F-statistic: 0.9409 on 8 and 51 DF,  p-value: 0.4918
## Model for the Ran group
# Drop rows with any missing value (this includes the two prediction rows)
dat_pulse2 <- na.omit(dat_pulse2)
dat_pulse2_ran <- dat_pulse2[dat_pulse2$Ran == "Ran", ]
lm_fit_init_ran <-
  lm(
    Pulse_diff ~ (Age + Height + Weight + Gender + Exercise + Smokes + Alcohol)
  , data = dat_pulse2_ran
  )
e_plot_lm_diagnostics(lm_fit_init_ran, sw_plot_set = "simple")

summary(lm_fit_init_ran)

Call:
lm(formula = Pulse_diff ~ (Age + Height + Weight + Gender + Exercise + 
    Smokes + Alcohol), data = dat_pulse2_ran)

Residuals:
    Min      1Q  Median      3Q     Max 
-37.323 -12.399  -0.192   8.803  40.094 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)  
(Intercept)      -147.1172   110.5497  -1.331   0.1921  
Age                 1.1973     1.8108   0.661   0.5129  
Height              1.0614     0.6052   1.754   0.0885 .
Weight             -0.2434     0.3632  -0.670   0.5072  
GenderFemale        3.0144     9.4851   0.318   0.7526  
ExerciseModerate    4.4031    10.0481   0.438   0.6640  
ExerciseLow         4.2128    10.9556   0.385   0.7030  
SmokesNo            0.2943    13.7367   0.021   0.9830  
AlcoholNo           8.7164     8.4455   1.032   0.3093  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.38 on 34 degrees of freedom
Multiple R-squared:  0.1186,    Adjusted R-squared:  -0.08882 
F-statistic: 0.5717 on 8 and 34 DF,  p-value: 0.7935
# Create the new data for prediction
dat_pred_ran <- data.frame(
  Height = 165,
  Weight = 67,
  Age = 20,
  Gender = factor("Male", levels = c("Male", "Female")),
  Smokes = factor("No", levels = c("Yes", "No")),
  Alcohol = factor("Yes", levels = c("Yes", "No")),
  Exercise = factor("Moderate", levels = c("High", "Moderate", "Low")),
  Ran = factor("Ran", levels = c("Ran", "Sat")),
  Pulse1 = NA,
  Pulse2 = NA,
  Year = factor("97", levels = c("93", "95", "96", "97", "98"))
)

dat_pred_sat <- dat_pred_ran
dat_pred_sat$Ran <- factor("Sat", levels = c("Ran", "Sat"))

# Predict using the full main-effects model fitted to the "Ran" group
predict(lm_fit_init_ran, newdata = dat_pred_ran, interval = "prediction")
       fit       lwr      upr
1 40.34931 -8.366118 89.06473
# Predict using the full main-effects model fitted to the "Sat" group
predict(lm_fit_init_sat, newdata = dat_pred_sat, interval = "prediction")
        fit       lwr      upr
1 -3.027521 -11.34068 5.285641

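Baseline pulse rates showed limited association with the measured characteristics. In the full model, only low exercise frequency had a significant positive coefficient relative to high exercise frequency (β = 8.6 bpm, p = 0.017), and the model as a whole was not significant (F(8, 94) = 1.68, p = 0.11; adjusted R² = 0.05). In the sitting group, height (β = 0.19 bpm/cm, p = 0.050) and weight (β = -0.11 bpm/kg, p = 0.045) were nominally significant predictors of pulse change, but the overall model was not (F(8, 51) = 0.94, p = 0.49), so these associations should be interpreted cautiously. The running group had no significant predictors of pulse change (F(8, 34) = 0.57, p = 0.79), likely because of substantial variability in individual responses (residual standard error of 21.4 bpm, versus 3.8 bpm for sitters). For a 20-year-old male non-smoker who drinks alcohol, exercises moderately, is 165 cm tall, and weighs 67 kg, the group-specific models predict a 40.3 bpm increase after running (95% prediction interval: -8.4 to 89.1) and a change of -3.0 bpm after sitting (95% prediction interval: -11.3 to 5.3).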
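The intervals for a single new student come from `predict()` with `interval = "prediction"`, which covers an individual future observation and is therefore wider than a confidence interval for the mean response at the same covariate values. The sketch below illustrates the distinction with R's built-in `cars` dataset, used here only as a stand-in for the pulse data.

```r
# Simple linear model on a built-in dataset (stand-in for the pulse models)
fit_cars <- lm(dist ~ speed, data = cars)
new_obs  <- data.frame(speed = 15)

# Confidence interval: uncertainty about the MEAN response at speed = 15
ci_mean <- predict(fit_cars, newdata = new_obs, interval = "confidence", level = 0.95)

# Prediction interval: uncertainty about ONE NEW observation at speed = 15;
# it adds the residual variance to the uncertainty in the fitted mean
pi_new  <- predict(fit_cars, newdata = new_obs, interval = "prediction", level = 0.95)

# Both share the same point estimate, but the prediction interval is wider
width_ci <- ci_mean[, "upr"] - ci_mean[, "lwr"]
width_pi <- pi_new[, "upr"]  - pi_new[, "lwr"]
c(confidence = width_ci, prediction = width_pi)
```

Applied to the group-specific pulse models, the same contrast would explain why the interval for a single runner is so wide: it reflects the large residual variability among runners in addition to the uncertainty in the fitted mean.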

4. Discussion

The substantial pulse rate increase in the running group confirms the expected cardiovascular response to acute physical activity, while the stability of pulse rates in the sitting group supports the validity of the experimental protocol. The absence of strong predictors of baseline pulse rate may reflect the relative homogeneity of the student population in terms of cardiovascular health. The residual variability of pulse change was much larger among runners than among sitters (residual standard errors of 21.4 and 3.8 bpm), which is why the two groups were modelled separately rather than with a single pooled error variance. The change in assignment method, from coin tosses to pre-assigned activity forms, illustrates how design choices can affect compliance and data quality in classroom-based research, although this analysis did not compare the two methods directly. Limitations include potential self-report bias in the lifestyle measures and uncontrolled variation in running intensity, which may contribute to the high variability in the running group’s responses.
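The unequal variability that motivates the stratified analysis can also be tested formally. Because the Shapiro-Wilk tests did not reject normality in either group, an F test of the variance ratio is one reasonable option. The sketch below uses simulated data, with group sizes, means, and standard deviations that are assumptions chosen only to resemble the two groups.

```r
# Simulated pulse changes (illustrative only, not the classroom data);
# sizes, means, and SDs are assumptions chosen to resemble the two groups.
set.seed(2025)
diff_ran <- rnorm(45, mean = 50, sd = 21)
diff_sat <- rnorm(60, mean = -1, sd = 4)

# F test for the ratio of two variances (assumes normality within groups)
vt <- var.test(diff_ran, diff_sat)

vt$estimate   # estimated variance ratio, Ran / Sat
vt$conf.int   # 95% confidence interval for the ratio
vt$p.value    # small values indicate unequal variances
```

With the real data, the equivalent call would be `var.test(Pulse_diff ~ Ran, data = dat_pulse)`. A ratio far from 1 supports fitting separate models, or using Welch's test, rather than assuming a common error variance.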

5. Conclusion

This analysis of classroom experiment data shows a large pulse rate response to acute physical activity. Activity type strongly determined cardiovascular response, whereas individual characteristics had limited predictive power in this relatively homogeneous population. The stratified modelling approach offers one way to handle heteroscedastic data in exercise physiology research and may inform the design of classroom-based experiments. Future research would benefit from objective activity monitoring and more diverse populations to better characterize individual differences in physiological responses.

References

Erhardt, E. B., Bedrick, E. J., & Schrader, R. M. (2020). Lecture notes for Advanced Data Analysis 2 (ADA2) (Stat 428/528). University of New Mexico.