library(erikmisc)
library(tidyverse)
ggplot2::theme_set(ggplot2::theme_bw()) # set theme_bw for all plots
Determinants of Pulse Rate Variability in Response to Acute Physical Activity: An Analysis of Classroom Experiment Data
Abstract
This study examines the physiological and lifestyle factors influencing pulse rate changes following acute physical activity using data from a series of classroom experiments conducted at The University of Queensland between 1993 and 1998. Through stratified linear regression analysis of 110 student participants, we identified significant differences in cardiovascular response between running and sitting conditions, with a mean pulse rate increase of 51.4 bpm for runners compared to a negligible change for sitters. The analysis revealed that while baseline pulse rates showed modest associations with exercise frequency, post-activity pulse changes were primarily predicted by activity type rather than individual characteristics. The study highlights the importance of considering inter-individual variability in physiological responses and demonstrates methodological approaches for handling heteroscedastic data in exercise science research.
1. Introduction
The measurement of pulse rate response to physical activity serves as a fundamental assessment in exercise physiology and health research. This investigation analyzes data collected from introductory statistics students who participated in a simple yet elegant experiment measuring pulse rate changes before and after brief physical activity. The experimental design evolved over five years to address potential compliance issues, creating a natural experiment that allows examination of both physiological responses and methodological considerations. The study addresses three primary questions: the fidelity of random assignment to activity conditions, the determinants of baseline pulse rates, and the factors influencing pulse rate changes after activity. These questions bear relevance to both exercise science and the design of classroom-based physiological experiments.
2. Methods
The study population comprised 110 undergraduate students enrolled in introductory statistics courses between 1993 and 1998. Participants first recorded their resting pulse rate (Pulse1), then were randomly assigned via coin toss (1993-1994) or pre-assigned forms (1995-1998) to either run in place or sit quietly for one minute before recording a second pulse measurement (Pulse2). Along with pulse data, participants provided information on height, weight, age, gender, smoking status, alcohol consumption, and exercise frequency. The analytical approach employed Welch’s t-test to compare pulse changes between activity groups, linear regression with stepwise selection to identify predictors of baseline pulse rate, and separate regression models for runners and sitters to account for heteroscedasticity. Model diagnostics included residual analysis, normality tests, and variance inflation factors to ensure appropriate model specification.
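The stratified approach described above can be illustrated with a minimal sketch. The data below are simulated stand-ins, not the study data, and the names `sim`, `fits`, and `sigmas` are hypothetical; the point is only that fitting one model per activity group lets each group have its own residual variance.

```r
# Simulated stand-in data (NOT the study data): two groups whose
# responses have very different spread, loosely mimicking Ran vs. Sat.
set.seed(1)
n <- 60
sim <- data.frame(
  group  = rep(c("Ran", "Sat"), each = n),
  height = rnorm(2 * n, mean = 170, sd = 10)
)
sim$diff <- ifelse(
  sim$group == "Ran",
  50 + rnorm(2 * n, sd = 20),  # large, variable response
  rnorm(2 * n, sd = 4)         # small, stable response
)

# Fit one model per group so each gets its own residual variance
fits <- lapply(split(sim, sim$group), function(d) lm(diff ~ height, data = d))

# Residual standard error of each stratified fit
sigmas <- sapply(fits, sigma)
print(round(sigmas, 1))
```

A single pooled model would instead assume one common residual variance for both groups, which is the assumption the stratified fits avoid.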
3. Results
The analysis revealed clear differentiation between activity groups, with runners showing a mean pulse increase of 51.4 bpm (SD = 21.4) compared to sitters’ mean change of -1.0 bpm (SD = 3.8), a statistically significant difference (t(47.3) = 16.6, p < 0.001).
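As a rough illustration of where the fractional degrees of freedom in the Welch test come from, the sketch below computes the statistic and the Welch-Satterthwaite degrees of freedom by hand and compares them with `t.test()`. The helper `welch_t()` and the simulated vectors are assumptions made for illustration, not part of the original analysis.

```r
# Hand-computed Welch two-sample t-test (hypothetical helper)
welch_t <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  vx <- var(x) / nx  # squared standard error of mean(x)
  vy <- var(y) / ny  # squared standard error of mean(y)
  t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (vx + vy)^2 / (vx^2 / (nx - 1) + vy^2 / (ny - 1))
  p <- 2 * pt(-abs(t_stat), df)
  c(t = t_stat, df = df, p = p)
}

# Simulated stand-in data (NOT the study data), with unequal variances
set.seed(42)
ran <- rnorm(40, mean = 50, sd = 20)
sat <- rnorm(60, mean = 0, sd = 4)

by_hand <- welch_t(ran, sat)
builtin <- t.test(ran, sat, var.equal = FALSE)
print(by_hand)
print(builtin$parameter)
```

When one group is far more variable than the other, the degrees of freedom fall toward that group's sample size minus one, which is why the reported value is well below the pooled degrees of freedom.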
# First, download the data to your computer,
# save in the same folder as this qmd file.
# read the data
dat_pulse <-
readr::read_table(
file = "ADA2_CL_28_StatQualExam201408_PulseRates.dat"
, show_col_types = FALSE
) |>
dplyr::bind_rows(
# Observations to predict are the last two rows of the dataset
# Because we include them with the other data,
# all of the formatting is the same and the predictions will be made
# automatically since the response variable is NA (missing)
read.table(
text = "
Height Weight Age Gender Smokes Alcohol Exercise Ran Pulse1 Pulse2 Year
165 67 20 1 2 1 2 1 NA NA 97
165 67 20 1 2 1 2 2 NA NA 97
"
, header = TRUE
)
) |>
mutate(
# ID numbers we can use for removing unusual observations
ID = 1:n()
# create response difference variable
, Pulse_diff = Pulse2 - Pulse1
# create factor variables
, Gender = Gender |> factor(levels = c(1, 2) , labels = c("Male", "Female"))
, Smokes = Smokes |> factor(levels = c(1, 2) , labels = c("Yes", "No"))
, Alcohol = Alcohol |> factor(levels = c(1, 2) , labels = c("Yes", "No"))
, Exercise = Exercise |> factor(levels = c(1, 2, 3), labels = c("High", "Moderate", "Low"))
, Ran = Ran |> factor(levels = c(1, 2) , labels = c("Ran", "Sat"))
, Year = Year |> factor()
)
str(dat_pulse)
tibble [112 × 13] (S3: tbl_df/tbl/data.frame)
$ Height : num [1:112] 173 179 167 195 173 184 162 169 164 168 ...
$ Weight : num [1:112] 57 58 62 84 64 74 57 55 56 60 ...
$ Age : num [1:112] 18 19 18 18 18 22 20 18 19 23 ...
$ Gender : Factor w/ 2 levels "Male","Female": 2 2 2 1 2 1 2 2 2 1 ...
$ Smokes : Factor w/ 2 levels "Yes","No": 2 2 2 2 2 2 2 2 2 2 ...
$ Alcohol : Factor w/ 2 levels "Yes","No": 1 1 1 1 1 1 1 1 1 1 ...
$ Exercise : Factor w/ 3 levels "High","Moderate",..: 2 2 1 1 3 3 2 2 1 2 ...
$ Ran : Factor w/ 2 levels "Ran","Sat": 2 1 1 2 2 1 2 2 2 1 ...
$ Pulse1 : num [1:112] 86 82 96 71 90 78 68 71 68 88 ...
$ Pulse2 : num [1:112] 88 150 176 73 88 141 72 77 68 150 ...
$ Year : Factor w/ 5 levels "93","95","96",..: 1 1 1 1 1 1 1 1 1 1 ...
$ ID : int [1:112] 1 2 3 4 5 6 7 8 9 10 ...
$ Pulse_diff: num [1:112] 2 68 80 2 -2 63 4 6 0 62 ...
# Observations to predict are the last two rows of the dataset
tail(dat_pulse, 4)
# A tibble: 4 × 13
Height Weight Age Gender Smokes Alcohol Exercise Ran Pulse1 Pulse2 Year
<dbl> <dbl> <dbl> <fct> <fct> <fct> <fct> <fct> <dbl> <dbl> <fct>
1 170 65 18 Male No Yes High Sat 69 64 98
2 185 85 19 Male No Yes Moderate Sat 75 68 98
3 165 67 20 Male No Yes Moderate Ran NA NA 97
4 165 67 20 Male No Yes Moderate Sat NA NA 97
# ℹ 2 more variables: ID <int>, Pulse_diff <dbl>
# select only numeric variables
# convert to long format
# plot histogram of each variable, facet by variable
dat_plot <-
dat_pulse |>
select(
where(is.numeric)
, Ran
) |>
pivot_longer(
cols = -c(ID, Ran)
, names_to = "Var"
, values_to = "Value"
)
p <- ggplot(dat_plot, aes(x = Value, fill = Ran))
p <- p + theme_bw()
p <- p + geom_histogram()
p <- p + facet_wrap(~ Var, drop = TRUE, scales = "free")
p <- p + labs(
title = "Plots of numeric variables"
)
print(p)
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: Removed 9 rows containing non-finite outside the scale range
(`stat_bin()`).
p <- ggplot(dat_pulse, aes(x = ID |> reorder(Pulse_diff), y = Pulse_diff, fill = Ran))
p <- p + theme_bw()
p <- p + geom_col()
p <- p + coord_flip()
#p <- p + facet_wrap(~ Ran, drop = TRUE)
p <- p + labs(
title = "Sorted pulse difference by ID number"
, x = "ID ordered by Pulse_diff"
, y = "Pulse difference"
)
print(p)
Warning: Removed 3 rows containing missing values or values outside the scale range
(`geom_col()`).
print(shapiro.test(dat_pulse[dat_pulse$Ran=="Ran",]$Pulse_diff))
Shapiro-Wilk normality test
data: dat_pulse[dat_pulse$Ran == "Ran", ]$Pulse_diff
W = 0.9741, p-value = 0.3899
print(shapiro.test(dat_pulse[dat_pulse$Ran=="Sat",]$Pulse_diff))
Shapiro-Wilk normality test
data: dat_pulse[dat_pulse$Ran == "Sat", ]$Pulse_diff
W = 0.98306, p-value = 0.5375
# Perform two-sample t-test
t_test <- t.test(Pulse_diff ~ Ran, data = dat_pulse, var.equal = FALSE)
print(t_test)
Welch Two Sample t-test
data: Pulse_diff by Ran
t = 16.64, df = 47.28, p-value < 2.2e-16
alternative hypothesis: true difference in means between group Ran and group Sat is not equal to 0
95 percent confidence interval:
46.05825 58.72436
sample estimates:
mean in group Ran mean in group Sat
51.3913 -1.0000
dat_pulse2 <-
  dat_pulse |>
  filter(
    # Remove individual outliers, or ...
    ID %notin% c(73),
    # .... remove outliers using "Plots of numeric variables"
    Age < 30,
    Height > 100,
    #, ...
  )
lm_fit_init <-
  lm(
    Pulse1 ~ (Age + Height + Weight + Gender + Exercise + Smokes + Alcohol)
  , data = dat_pulse2
  )
e_plot_lm_diagnostics(lm_fit_init, sw_plot_set = "simple")
summary(lm_fit_init)
Call:
lm(formula = Pulse1 ~ (Age + Height + Weight + Gender + Exercise +
Smokes + Alcohol), data = dat_pulse2)
Residuals:
Min 1Q Median 3Q Max
-25.9603 -7.3176 0.8837 6.9766 27.6041
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 102.22386 32.76266 3.120 0.0024 **
Age -0.39227 0.52523 -0.747 0.4570
Height -0.11081 0.18483 -0.600 0.5503
Weight -0.05364 0.10829 -0.495 0.6215
GenderFemale 0.57497 2.95586 0.195 0.8462
ExerciseModerate 5.09493 3.27289 1.557 0.1229
ExerciseLow 8.61266 3.54836 2.427 0.0171 *
SmokesNo -2.37937 3.54412 -0.671 0.5036
AlcoholNo -1.41838 2.28743 -0.620 0.5367
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 10.38 on 94 degrees of freedom
(3 observations deleted due to missingness)
Multiple R-squared: 0.125, Adjusted R-squared: 0.05058
F-statistic: 1.679 on 8 and 94 DF, p-value: 0.1136
lm_fit_BIC <-
  step(
    lm_fit_init,
    scope =
      list(
        upper = Pulse1 ~ (Age + Height + Weight + Gender + Exercise + Year + Smokes + Alcohol)^2
      , lower = Pulse1 ~ 1
      )
  , direction = "both"
  , test = "F", trace = 0
  , k = log(nrow(dat_pulse2)) # k = log(n) applies the BIC penalty (k = 2 would give AIC)
  )
lm_fit_final <- lm_fit_BIC
e_plot_lm_diagnostics(lm_fit_final, sw_plot_set = "simple")
out_cont2 <-
  e_plot_model_contrasts(
    fit = lm_fit_BIC
  , dat_cont = dat_pulse2
  , sw_print = FALSE
  , sw_TWI_plots_keep = c("singles", "both", "all")[2]
  , sw_TWI_both_orientation = c("wide", "tall")[1]
  )
out_cont2$plots$Height
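The role of the `k` argument in the stepwise selection above can be sketched on a built-in dataset. This is an illustration under assumptions (the `mtcars` data and the predictors chosen here are unrelated to the pulse study): setting `k = log(n)` makes the penalty per parameter match BIC, whereas the default `k = 2` corresponds to AIC.

```r
# Illustration on built-in data (mtcars), not the pulse data
fit_full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# Number of observations actually used in the fit
n_used <- nobs(fit_full)

# AIC() with a log(n) penalty reproduces BIC()
bic_via_aic <- AIC(fit_full, k = log(n_used))
bic_direct  <- BIC(fit_full)
print(c(bic_via_aic = bic_via_aic, bic_direct = bic_direct))

# Stepwise selection using the BIC penalty
fit_bic <- step(fit_full, direction = "both", k = log(n_used), trace = 0)
print(formula(fit_bic))
```

One design note: `nrow()` counts rows that `lm()` later drops for missingness, so `nobs()` on the fitted model may be the closer match to the sample size that BIC is defined on.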
p <- ggplot(dat_pulse, aes(x = Pulse_diff, fill = Ran))
p <- p + theme_bw()
p <- p + geom_histogram()
p <- p + facet_wrap(~ Ran, drop = TRUE)#, scales = "free")
p <- p + labs(
title = "3. Pulse difference by Ran"
)
print(p)
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: Removed 3 rows containing non-finite outside the scale range
(`stat_bin()`).
## model for Sat
dat_pulse2 <- na.omit(dat_pulse2)
dat_pulse2_sat <- na.omit(dat_pulse2[dat_pulse2$Ran == "Sat", ])
lm_fit_init_sat <-lm(
Pulse_diff ~ (Age + Height + Weight + Gender + Exercise +Smokes +Alcohol)
, data = dat_pulse2_sat
)
e_plot_lm_diagnostics(lm_fit_init_sat, sw_plot_set = "simple")
summary(lm_fit_init_sat)
Call:
lm(formula = Pulse_diff ~ (Age + Height + Weight + Gender + Exercise +
Smokes + Alcohol), data = dat_pulse2_sat)
Residuals:
Min 1Q Median 3Q Max
-9.3588 -2.4770 0.1289 2.3347 6.6665
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -32.09412 16.13627 -1.989 0.0521 .
Age 0.25253 0.25285 0.999 0.3226
Height 0.19040 0.09467 2.011 0.0496 *
Weight -0.10980 0.05350 -2.052 0.0453 *
GenderFemale 0.81898 1.47227 0.556 0.5805
ExerciseModerate -1.23356 1.68665 -0.731 0.4679
ExerciseLow -0.32349 1.79461 -0.180 0.8577
SmokesNo 1.19062 1.60906 0.740 0.4627
AlcoholNo -1.20813 1.10279 -1.096 0.2784
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.847 on 51 degrees of freedom
Multiple R-squared: 0.1286, Adjusted R-squared: -0.008078
F-statistic: 0.9409 on 8 and 51 DF, p-value: 0.4918
### model for run
dat_pulse2 <- na.omit(dat_pulse2)
dat_pulse2_ran <- na.omit(dat_pulse2[dat_pulse2$Ran == "Ran", ])
lm_fit_init_ran <-lm(
Pulse_diff ~ (Age + Height + Weight + Gender + Exercise +Smokes +Alcohol)
, data = dat_pulse2_ran
)
e_plot_lm_diagnostics(lm_fit_init_ran, sw_plot_set = "simple")
summary(lm_fit_init_ran)
Call:
lm(formula = Pulse_diff ~ (Age + Height + Weight + Gender + Exercise +
Smokes + Alcohol), data = dat_pulse2_ran)
Residuals:
Min 1Q Median 3Q Max
-37.323 -12.399 -0.192 8.803 40.094
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -147.1172 110.5497 -1.331 0.1921
Age 1.1973 1.8108 0.661 0.5129
Height 1.0614 0.6052 1.754 0.0885 .
Weight -0.2434 0.3632 -0.670 0.5072
GenderFemale 3.0144 9.4851 0.318 0.7526
ExerciseModerate 4.4031 10.0481 0.438 0.6640
ExerciseLow 4.2128 10.9556 0.385 0.7030
SmokesNo 0.2943 13.7367 0.021 0.9830
AlcoholNo 8.7164 8.4455 1.032 0.3093
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.38 on 34 degrees of freedom
Multiple R-squared: 0.1186, Adjusted R-squared: -0.08882
F-statistic: 0.5717 on 8 and 34 DF, p-value: 0.7935
# Create the new data for prediction
dat_pred_ran <- data.frame(
Height = 165,
Weight = 67,
Age = 20,
Gender = factor("Male", levels = c("Male", "Female")),
Smokes = factor("No", levels = c("Yes", "No")),
Alcohol = factor("Yes", levels = c("Yes", "No")),
Exercise = factor("Moderate", levels = c("High", "Moderate", "Low")),
Ran = factor("Ran", levels = c("Ran", "Sat")),
Pulse1 = NA,
Pulse2 = NA,
Year = factor("97", levels = c("93", "95", "96", "97", "98"))
)
dat_pred_sat <- dat_pred_ran
dat_pred_sat$Ran <- factor("Sat", levels = c("Ran", "Sat"))
# Predict using the full model fit to the "Ran" group
predict(lm_fit_init_ran, newdata = dat_pred_ran, interval = "prediction")
fit lwr upr
1 40.34931 -8.366118 89.06473
# Predict using the full model fit to the "Sat" group
predict(lm_fit_init_sat, newdata = dat_pred_sat, interval = "prediction")
fit lwr upr
1 -3.027521 -11.34068 5.285641
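Because the calls above use `interval = "prediction"`, the reported bounds describe where a single new student's pulse change is expected to fall, not the uncertainty in the group mean. A minimal sketch of the distinction, using the built-in `mtcars` data rather than the pulse data (the names `new_obs`, `int_conf`, `int_pred`, and `width()` are hypothetical):

```r
# Illustration on built-in data (mtcars), not the pulse data
fit <- lm(mpg ~ wt, data = mtcars)
new_obs <- data.frame(wt = 3)

# Confidence interval: uncertainty in the mean response at wt = 3
int_conf <- predict(fit, newdata = new_obs, interval = "confidence", level = 0.95)

# Prediction interval: uncertainty for one new observation at wt = 3
int_pred <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.95)

# Interval width helper (hypothetical)
width <- function(m) unname(m[, "upr"] - m[, "lwr"])

print(int_conf)
print(int_pred)
print(c(conf_width = width(int_conf), pred_width = width(int_pred)))
```

Both intervals share the same point estimate; the prediction interval is wider because it adds the residual variance of an individual observation to the uncertainty of the fitted mean.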
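Baseline pulse rates showed limited association with the measured characteristics: only low exercise frequency was significantly associated with a higher resting pulse (β = 8.6 bpm relative to high exercise frequency, p = 0.017), and the overall model was not significant (F(8, 94) = 1.68, p = 0.114). In the sitting group, both height (β = 0.19 bpm/cm, p = 0.050) and weight (β = -0.11 bpm/kg, p = 0.045) showed modest but statistically significant associations with pulse change, although the overall model was not significant (F(8, 51) = 0.94, p = 0.492), so these associations should be interpreted cautiously. The running group exhibited no significant predictors of pulse change, likely due to substantial variability in individual responses (residual standard error of 21.4 bpm versus 3.8 bpm for sitters). For a 20-year-old male student (165 cm, 67 kg, non-smoker, alcohol drinker, moderate exercise frequency), the models predicted a 40.3 bpm increase when running (95% prediction interval: -8.4 to 89.1) versus a negligible change when sitting (-3.0 bpm, 95% prediction interval: -11.3 to 5.3).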
4. Discussion
The substantial pulse rate increase observed in the running group confirms the expected cardiovascular response to acute physical activity, while the stability of pulse rates in the sitting group supports the validity of the experimental protocol. The absence of strong predictors for baseline pulse rates may reflect the relative homogeneity of the student population in terms of cardiovascular health. The differential findings between activity groups underscore the importance of accounting for heteroscedasticity in physiological studies through stratified analyses. The study’s methodological evolution, from coin tosses to pre-assigned activity forms, illustrates how experimental design choices can impact data quality in classroom-based research. Limitations include potential self-report biases in lifestyle measures and uncontrolled variation in running intensity, which may contribute to the high variability observed in the running group’s responses.
5. Conclusion
This analysis of classroom experiment data demonstrates robust pulse rate responses to acute physical activity while highlighting the complex interplay between physiological characteristics and activity type. The findings emphasize that while activity type strongly influences cardiovascular response, individual characteristics show limited predictive power within homogeneous populations. The study contributes methodological insights for handling heteroscedastic data in exercise physiology research and provides a framework for designing effective classroom-based experiments. Future research in this area would benefit from incorporating objective activity monitoring and expanding to more diverse populations to better understand individual differences in physiological responses.
References
Erhardt, E. B., Bedrick, E. J., & Schrader, R. M. (2020). Lecture notes for Advanced Data Analysis 2 (ADA2) (Stat 428/528). University of New Mexico.