Term Project Description
My project builds upon the study referenced below. I have proposed a
potential follow-up experiment to extend the study’s findings. The
project utilizes simulated data, structured and analyzed as if collected
from a real experiment. While the outcomes do not provide scientific
evidence, they showcase my skills in formulating and addressing relevant
biological questions through statistics, experimental design, and
programming.
Article citation:
Buijze et al. 2016: The Effect of Cold Showering on Health and Work:
A Randomized Controlled Trial
Brief statement on the findings from the original article that led
to your follow-up experiment:
My chosen paper is Buijze et al., “The Effect of Cold Showering on
Health and Work: A Randomized Controlled Trial”. The study aimed to
determine the cumulative effect of a routine (hot-to-)cold shower on
sickness, quality of life and work productivity. A total of 3018
participants were enrolled; loss to follow-up was 12% after 30 days and
19.6% after 90 days. Participants were divided into four groups: three
treatment groups, which ended their showers with 30, 60 or 90 seconds
of cold water, and one control group. The authors found a 29% reduction
in reported sickness absence days in the cold shower groups compared to
the control group. The duration of the cold shower had no significant
impact on outcomes, and there were no significant group effects on
illness days (Buijze et al., 2016).
The Question
How does regular exposure to cold showers affect plasma cortisol
levels in treatment vs. control groups?
Disclaimer: This project analyzes simulated data. The questions and
hypotheses are real, but the results and conclusions are not.
Rationale and Background:
The chosen paper studied the effect of cold showers on reported
sickness absence days and on illness days, and found a reduction in
reported sickness absence days. However, we do not know how exactly
cold showers affect more specific body responses and markers of health,
such as white blood cells, inflammatory markers and hormones such as
plasma cortisol. Cortisol is a steroid hormone produced by the adrenal
glands and has many diverse physiological functions. It is one of the
body's stress hormones and is secreted in higher quantities in response
to stress. In the short term, it enhances the activity of specific
immune cells, such as Natural Killer cells, and promotes the production
of pro-inflammatory cytokines, including IL-6 and TNF-α. However,
chronic exposure to high cortisol levels can lead to immune
dysregulation and immunosuppression (Alotiby 2024). In my experiment, I
want to study how regular cold showering affects plasma cortisol. The
referenced paper found that cold showering can lead to a reduction in
reported sickness absence days and may reduce the severity of sickness.
If cold showering strengthens the immune system, then plasma cortisol
levels might be lower in the cold shower treatment group than in the
control group.
Hypotheses
A Statistical Null Hypothesis:
There is no difference between the mean plasma cortisol level of the
treatment group and the mean plasma cortisol level of the control group.
A Statistical Alternative Hypothesis:
There is a difference between the mean plasma cortisol level of the
treatment group and the mean plasma cortisol level of the control group.
Experimental Design
The experiment is an unblinded randomized controlled trial; neither the
intervention nor the outcome assessment can be blinded. A total of 2000
participants are recruited through social media and advertisements and
randomly assigned to a control group or a treatment group. The
treatment group is instructed to shower as warm and as long as
preferred, but to end each shower with a 30-second exposure to cold
water at the coldest available temperature, using a stopwatch to track
the time. The control group is a negative control and is instructed to
shower as they prefer. Plasma cortisol levels are measured at the
laboratory after 30 days, at the same time of day for every participant
to avoid circadian differences. To prevent bias, the control and
treatment groups receive the same measurements and handling except for
the manipulation, which is the cold shower. The collected data will be
analyzed in RStudio, using a two-sample t-test to test for a difference
between the groups.
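The random assignment described above could be carried out along the following lines. This is only a minimal sketch: the participant IDs, the group labels, and the seed are illustrative assumptions, not part of the original design.

```r
# Sketch: simple 1:1 randomization of 2000 hypothetical participant IDs.
set.seed(42)  # arbitrary seed so the allocation is reproducible

n_total <- 2000
participants <- data.frame(id = seq_len(n_total))

# Shuffle a balanced vector of labels so each arm receives exactly 1000 people
participants$group <- sample(rep(c("Control", "Treatment"), each = n_total / 2))

# Confirm the allocation is balanced
table(participants$group)
```

Shuffling a balanced label vector, rather than drawing each label independently, guarantees equal group sizes.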
Explanatory and Response Variables:
The explanatory (independent) variable is exposure to cold showers
(treatment vs. control group), and the response (dependent) variable is
the plasma cortisol level.
Alpha:
Alpha = 0.05. This value is the standard used in scientific research.
It corresponds to a 95% confidence level; for normally distributed
data, about 95% of values fall within roughly two (more precisely,
1.96) standard deviations of the mean.
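The link between alpha = 0.05 and the "two standard deviations" rule of thumb can be checked numerically. The short sketch below assumes a standard normal distribution; the variable names are illustrative.

```r
alpha <- 0.05

# Two-sided critical value of the standard normal distribution (about 1.96)
z_crit <- qnorm(1 - alpha / 2)

# Probability mass within exactly 2 standard deviations (about 0.954)
coverage_2sd <- pnorm(2) - pnorm(-2)

# Probability mass within the critical value (0.95 by construction)
coverage_crit <- pnorm(z_crit) - pnorm(-z_crit)

c(z_crit = z_crit, coverage_2sd = coverage_2sd, coverage_crit = coverage_crit)
```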
Sample size:
Sample size is 1000 for each of the treatment and control groups.
Total sample size is 2000.
Sample size justification:
A sample size of 1000 per group ensures at least 80% power at a 5%
significance level to detect small to moderate effect sizes.
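One way to make this justification concrete is to ask base R's power.t.test() which standardized effect size is detectable with 1000 participants per group. This is a sketch assuming a two-sided, two-sample t-test; leaving sd at its default of 1 means the returned delta is expressed in standard deviation units (Cohen's d).

```r
# Solve for the smallest detectable difference (delta) given n, alpha and power.
pw <- power.t.test(n = 1000,            # participants per group
                   sig.level = 0.05,
                   power = 0.80,
                   type = "two.sample",
                   alternative = "two.sided")

# With sd = 1 (the default), delta is in standard deviation units
pw$delta
```

Under these assumptions the detectable effect is roughly 0.125 standard deviations, which is conventionally regarded as a small effect.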
Data Analysis Plan
A two-sample t-test will be used for the analysis.
My research uses two groups: a cold shower treatment group and a
control group. The outcomes of one group do not affect the outcomes of
the other group, and there is no pairing or connection between the
individuals in the two groups, so the groups are independent. A
two-sample t-test is therefore the appropriate test for my experiment.
A paired t-test is not chosen since the data are not paired
observations.
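To show what the two-sample test computes, the sketch below reproduces the t statistic and degrees of freedom by hand and compares them with t.test(). Note that R's t.test() defaults to Welch's version, which does not assume equal variances. The data are simulated, and the means and standard deviation used here are hypothetical values chosen for illustration only.

```r
set.seed(1)
control   <- rnorm(1000, mean = 10, sd = 1.5)  # hypothetical values
treatment <- rnorm(1000, mean = 12, sd = 1.5)  # hypothetical values

# Per-group variance of the mean
v1 <- var(control) / length(control)
v2 <- var(treatment) / length(treatment)

# Welch t statistic: difference in means over its standard error
t_manual <- (mean(control) - mean(treatment)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom
df_manual <- (v1 + v2)^2 /
  (v1^2 / (length(control) - 1) + v2^2 / (length(treatment) - 1))

# Built-in test (var.equal = FALSE by default, i.e. Welch)
res <- t.test(control, treatment)

c(t_manual = t_manual, t_builtin = unname(res$statistic),
  df_manual = df_manual, df_builtin = unname(res$parameter))
```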
Assumptions and Exploratory Data Analysis (EDA)
The data are assumed to follow a normal distribution. Participants are
sampled randomly. The number of experimental units is equal between the
two groups, and the observations are independent of each other.
# Load the tidyverse, read the simulated data, and reshape it to long format.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.0 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
wltuntable <- read.csv("wltun.csv", row.names = 1)
head(wltuntable)
TidytableCortisol <- pivot_longer(wltuntable, cols = 1:2,
                                  names_to = "Groups",
                                  values_to = "Plasma_Cortisol")
head(TidytableCortisol)
HistogramCortisol <- ggplot(TidytableCortisol, aes(x = Plasma_Cortisol)) +
  geom_histogram()
HistogramCortisol
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

ks.test(TidytableCortisol$Plasma_Cortisol,y="pnorm",mean(TidytableCortisol$Plasma_Cortisol),sd(TidytableCortisol$Plasma_Cortisol))
##
## Asymptotic one-sample Kolmogorov-Smirnov test
##
## data: TidytableCortisol$Plasma_Cortisol
## D = 0.13062, p-value < 2.2e-16
## alternative hypothesis: two-sided
z_scores <- scale(TidytableCortisol$Plasma_Cortisol)
which(abs(z_scores) > 3) # Rows with potential outliers
## [1] 500
# The histogram suggests an outlier among the higher plasma cortisol
# values. The KS test gives p < 0.05, so the data deviate significantly
# from a normal distribution, and the z-score check flags row 500 as a
# potential outlier. Taking 20 mcg/dL as the upper limit for plasma
# cortisol, I will filter out values above that limit with the code below.
filteredTidytableCortisol<-TidytableCortisol|>filter(Plasma_Cortisol<20)
head(filteredTidytableCortisol)
# Outliers have been removed with filter(). The filtered tidy table is
# used to build a new histogram and to rerun the KS test.
NewHistogramCortisol<-ggplot(filteredTidytableCortisol,aes(x=Plasma_Cortisol))+geom_histogram()
NewHistogramCortisol
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

ks.test(x=filteredTidytableCortisol$Plasma_Cortisol,y="pnorm",mean(filteredTidytableCortisol$Plasma_Cortisol),sd(filteredTidytableCortisol$Plasma_Cortisol))
##
## Asymptotic one-sample Kolmogorov-Smirnov test
##
## data: filteredTidytableCortisol$Plasma_Cortisol
## D = 0.019087, p-value = 0.4602
## alternative hypothesis: two-sided
# The new histogram looks much closer to a normal distribution. The KS
# test gives p = 0.4602, which is greater than 0.05, so there is no
# evidence against normality.
Interpretation of EDA:
First, I built a histogram and performed the KS test to visualize the
data and test for normality. The histogram showed a non-normal,
right-skewed shape, suggesting the presence of an outlier. The data did
not pass the initial KS test (p < 2.2e-16), indicating a significant
departure from normality. The z-score check flagged a potential outlier
in row 500, which was removed using the filter() function. Using the
filtered table, I built a new histogram and repeated the KS test. The
new histogram has an approximately normal shape, and the KS test gives
p = 0.4602, so there is no evidence that the filtered data deviate from
a normal distribution.
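As a complementary check, normality can also be assessed within each group separately, since pooling two groups with different means produces a mixture of two distributions. The R documentation for ks.test() also notes that the one-sample test expects parameters specified in advance and not estimated from the data, so a test such as Shapiro-Wilk may be a reasonable cross-check. The sketch below uses simulated data with hypothetical means and standard deviation; only the column names mirror the tidy table above.

```r
set.seed(2)

# Simulated stand-in for the tidy table (values are hypothetical)
sim <- data.frame(
  Groups = rep(c("Control", "Treatment"), each = 1000),
  Plasma_Cortisol = c(rnorm(1000, mean = 10, sd = 1.5),
                      rnorm(1000, mean = 12, sd = 1.5))
)

# Shapiro-Wilk p-value computed separately for each group
p_by_group <- tapply(sim$Plasma_Cortisol, sim$Groups,
                     function(x) shapiro.test(x)$p.value)
p_by_group
```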
Primary Statistical Analysis
# Welch two-sample t-test comparing plasma cortisol between the two groups.
t.test(Plasma_Cortisol~Groups,data=filteredTidytableCortisol)
##
## Welch Two Sample t-test
##
## data: Plasma_Cortisol by Groups
## t = -36.462, df = 1994.4, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group Plasma.Cortisol.in.Control.Group and group Plasma.Cortisol.in.Treatment.Group is not equal to 0
## 95 percent confidence interval:
## -2.333195 -2.095018
## sample estimates:
## mean in group Plasma.Cortisol.in.Control.Group
## 9.992535
## mean in group Plasma.Cortisol.in.Treatment.Group
## 12.206641
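With samples this large, even tiny differences reach statistical significance, so it can help to report a standardized effect size next to the p-value. The sketch below defines a hypothetical helper, cohens_d(), using the pooled standard deviation, and applies it to simulated data with assumed means and standard deviation. It is not part of the original analysis, and its output is not a result of this project.

```r
# Hypothetical helper: Cohen's d using the pooled standard deviation
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

set.seed(3)
control   <- rnorm(1000, mean = 10, sd = 1.5)  # hypothetical values
treatment <- rnorm(1000, mean = 12, sd = 1.5)  # hypothetical values

# Negative d means the first group (control) has the lower mean
d <- cohens_d(control, treatment)
d
```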
Data Visualization
# Boxplot of plasma cortisol by group, with letter labels above each box.
ggplot(filteredTidytableCortisol, aes(x = Groups, y = Plasma_Cortisol)) +
  geom_boxplot(fill = c("blue", "orange")) +
  theme_classic() +
  annotate("text", x = 1, y = 16, label = "A") +
  annotate("text", x = 2, y = 17, label = "B") +
  ylab("Plasma Cortisol (mcg/dL)")

Conclusions
From the results of the t-test, the p-value is less than 2.2e-16, which
is much less than 0.05. Therefore the null hypothesis can be rejected.
There is a significant difference between the mean plasma cortisol
level of the control group and that of the treatment group. We can say
with 95% confidence that the true difference in means (control minus
treatment) lies between -2.333195 and -2.095018 mcg/dL. Since 0 is not
included within the confidence interval, we can say that the mean of
the control group is significantly lower than the mean of the treatment
group. The results agree with the alternative hypothesis. However, they
do not agree with my initial expectation in the background, where I
suggested that plasma cortisol might be lower in the cold shower
treatment group than in the control group. Confounding variables could
be present, since immune function can be affected by a variety of
factors such as diet, sleep, physical exercise and stress. Furthermore,
participants could not be tracked to verify their compliance, and a
baseline cortisol level was not collected, which rules out a paired
t-test analysis. If there were no limitations, I would design a better
experiment that measures these other variables and keeps them at the
same level across groups, collects baseline cortisol, and tracks
compliance.
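If a baseline cortisol measurement were available, one possible analysis (named plainly here as an alternative, not the method used in this project) would be a linear model that adjusts the follow-up value for baseline, often called ANCOVA. The sketch below is entirely simulated: the data-generating model, the assumed treatment shift of 1 mcg/dL, and all variable names are hypothetical.

```r
set.seed(4)
n <- 1000  # participants per group

group    <- factor(rep(c("Control", "Treatment"), each = n))
baseline <- rnorm(2 * n, mean = 10, sd = 1.5)  # hypothetical baseline cortisol

# Hypothetical model: follow-up tracks baseline, plus an assumed treatment shift
assumed_shift <- 1
followup <- 4 + 0.6 * baseline +
  assumed_shift * (group == "Treatment") +
  rnorm(2 * n, sd = 1)

# Adjust the group comparison for baseline cortisol
fit <- lm(followup ~ group + baseline)

# Estimated treatment effect after adjusting for baseline
coef(summary(fit))["groupTreatment", ]
```

Other measured covariates, such as sleep or exercise, could be added to the model formula in the same way.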
Citations
Alotiby, A. (2024). Immunology of Stress: A review article. Journal
of Clinical Medicine, 13(21), 6394. https://doi.org/10.3390/jcm13216394
Buijze, G. A., Sierevelt, I. N., Van Der Heijden, B. C. J. M.,
Dijkgraaf, M. G., & Frings-Dresen, M. H. W. (2016). The effect of
cold showering on health and work: a randomized controlled trial. PLoS
ONE, 11(9), e0161749. https://doi.org/10.1371/journal.pone.0161749
OpenAI. (2024). ChatGPT (June 8 Version) [Large language model].
Retrieved from https://chat.openai.com/