Term Project Description

My project builds upon the study referenced below. I have proposed a potential follow-up experiment to extend the study’s findings. The project utilizes simulated data, structured and analyzed as if collected from a real experiment. While the outcomes do not provide scientific evidence, they showcase my skills in formulating and addressing relevant biological questions through statistics, experimental design, and programming.

Article citation:

Buijze et al. 2016: The Effect of Cold Showering on Health and Work: A Randomized Controlled Trial

Brief statement on the findings from the original article that led to your follow-up experiment:

My chosen paper is Buijze et al., “The Effect of Cold Showering on Health and Work: A Randomized Controlled Trial”. The study aimed to determine the cumulative effect of a routine (hot-to-)cold shower on sickness, quality of life and work productivity. A total of 3018 participants were enrolled; loss to follow-up was 12% after 30 days and 19.6% after 90 days. Participants were divided into four groups: three treatment groups, which ended their showers with 30, 60 or 90 seconds of cold water, and one control group. The authors found a 29% reduction in reported sickness absence days for the cold shower groups compared to the control group. The duration of the cold shower had no significant impact on outcomes, and no significant group effects were reported for illness days (Buijze et al., 2016).


The Question

How does regular exposure to cold showers affect plasma cortisol levels in treatment vs. control groups?

Disclaimer: This project analyzes simulated data. The questions and hypotheses are real, but the results and conclusions are not.

Rationale and Background:

The chosen paper studied the effect of cold showers on reported sickness absence days and illness days, and found a reduction in reported sickness absence days. However, we do not know how cold showers affect more specific body responses and markers of health, such as white blood cells, inflammatory markers and hormones such as plasma cortisol. Cortisol is a steroid hormone produced by the adrenal glands, with many diverse physiological functions. It is one of the body's stress hormones and is secreted in higher quantities in response to stress. In the short term, it enhances the activity of specific immune cells, such as natural killer cells, and promotes the production of pro-inflammatory cytokines, including IL-6 and TNF-α. However, chronic exposure to high cortisol levels can lead to immune dysregulation and immunosuppression (Alotiby 2024). In my experiment, I want to study how regular cold showering affects plasma cortisol. The referenced paper found that cold showering can lead to a reduction in reported sickness absence days and may reduce the severity of sickness. If cold showering strengthens the immune system, then plasma cortisol levels might be lower in the cold shower treatment group than in the control group.


Hypotheses

A Statistical Null Hypothesis:

There is no difference between the mean plasma cortisol level of the treatment group and the mean plasma cortisol level of the control group.

A Statistical Alternative Hypothesis:

There is a difference between the mean plasma cortisol level of the treatment group and the mean plasma cortisol level of the control group.


Experimental Design

The experiment is an unblinded randomized controlled trial; neither the intervention nor the outcome assessment can be blinded. A total of 2000 participants are recruited through social media and advertisements and are randomly assigned to the control or the treatment group. The treatment group is instructed to shower as warm and as long as preferred, but to end each shower with 30 seconds of cold water at the coldest available temperature, using a stopwatch to track the time. The control group is a negative control and is instructed to shower as preferred. Plasma cortisol levels are measured in the laboratory after 30 days, at the same time of day for all participants to avoid circadian differences. To prevent bias, the control and treatment groups receive the same measurements and handling except for the manipulation, which is the cold shower. The collected data will be analyzed in RStudio using a two-sample t-test.
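
The random assignment step described above could be carried out along the following lines. This is only a minimal sketch: the participant IDs, the seed and the object names are hypothetical and are not part of the project's actual workflow.

```r
# Hypothetical sketch: randomly assign 2000 participant IDs to two equal arms.
set.seed(42)  # arbitrary seed, used only to make the illustration reproducible
n_total <- 2000
participants <- data.frame(id = seq_len(n_total))

# Shuffle a balanced vector of labels so that each arm gets exactly 1000 people
participants$group <- sample(rep(c("control", "treatment"), each = n_total / 2))

# Check the resulting group sizes
table(participants$group)
```

Shuffling a balanced label vector, rather than drawing each label independently, guarantees the equal group sizes stated in the design.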

Explanatory and Response Variables:

The explanatory (independent) variable is exposure to cold showers (treatment vs. control group), and the response (dependent) variable is the plasma cortisol level.

Alpha:

Alpha = 0.05. This value is the standard used in scientific research. It corresponds to a 95% confidence level; under a normal distribution, about 95% of values fall within approximately two (1.96) standard deviations of the mean.
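
The link between alpha and the "two standard deviations" rule of thumb can be checked with a short calculation. The sketch below uses only base R functions and assumes a standard normal distribution; the variable names are illustrative.

```r
alpha <- 0.05

# Two-sided critical value of the standard normal distribution
z_crit <- qnorm(1 - alpha / 2)
z_crit

# Probability mass within +/- z_crit standard deviations of the mean
coverage <- pnorm(z_crit) - pnorm(-z_crit)
coverage
```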

Sample size:

Sample size is 1000 for each of the treatment and control groups. Total sample size is 2000.

Sample size justification:

A sample size of 1000 per group ensures at least 80% power at a 5% significance level to detect small-to-moderate effect sizes.
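
One way to make this justification concrete is to ask what standardized difference a two-sample t-test can detect with 1000 participants per group. The sketch below uses `power.t.test()` from base R and assumes a standard deviation of 1, so the result is expressed in standard deviation units; it is an illustration, not the calculation used to plan the project.

```r
# Solve for the smallest standardized difference (delta, with sd = 1)
# detectable with 80% power at alpha = 0.05 and n = 1000 per group.
pw <- power.t.test(n = 1000, sd = 1, sig.level = 0.05, power = 0.80,
                   type = "two.sample", alternative = "two.sided")

# Minimum detectable difference in standard deviation units
pw$delta
```

Under these assumptions the detectable difference is roughly an eighth of a standard deviation, which is conventionally regarded as a small effect.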


Data Analysis Plan

A two-sample t-test will be used for the analysis.

My research uses two groups: a cold shower treatment group and a control group. The outcomes of one group do not affect the outcomes of the other group, and there is no pairing or connection between the individuals in the groups, so they are independent. A two-sample t-test is therefore the appropriate test for my experiment. A paired t-test is not chosen because the data are not paired observations.
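
As a rough illustration of this plan, the sketch below simulates two independent groups and runs a two-sample t-test in two variants. The means, standard deviation, seed and group labels are hypothetical values chosen for the example, not the project's data. Note that `t.test()` defaults to the Welch version, which does not assume equal variances.

```r
set.seed(1)  # arbitrary seed for a reproducible illustration

# Hypothetical parameters for illustration only (not the project's data)
control   <- rnorm(1000, mean = 10, sd = 1.5)
treatment <- rnorm(1000, mean = 12, sd = 1.5)

sim <- data.frame(
  Plasma_Cortisol = c(control, treatment),
  Groups = rep(c("Control", "Treatment"), each = 1000)
)

# Welch two-sample t-test (default: var.equal = FALSE)
welch <- t.test(Plasma_Cortisol ~ Groups, data = sim)

# Classical two-sample t-test with pooled variance
pooled <- t.test(Plasma_Cortisol ~ Groups, data = sim, var.equal = TRUE)

welch$method
pooled$method
```

With equal group sizes and similar variances the two variants give nearly identical results, so the Welch default is a reasonable choice here.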


Assumptions and Exploratory Data Analysis (EDA)

The data are assumed to follow a normal distribution. Participants are selected by random sampling. The number of experimental units is equal in the two groups, and the observations are independent of each other.

#Load packages, import the data, reshape it to long (tidy) format, and check normality and outliers.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.0     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
wltuntable<-read.csv("wltun.csv",row.names = 1)

head(wltuntable)
TidytableCortisol<-pivot_longer(wltuntable,cols =1:2,names_to = "Groups",values_to = "Plasma_Cortisol")

head(TidytableCortisol)
HistogramCortisol<-ggplot(TidytableCortisol,aes(x=Plasma_Cortisol))+geom_histogram()

HistogramCortisol
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

ks.test(TidytableCortisol$Plasma_Cortisol,y="pnorm",mean(TidytableCortisol$Plasma_Cortisol),sd(TidytableCortisol$Plasma_Cortisol))
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  TidytableCortisol$Plasma_Cortisol
## D = 0.13062, p-value < 2.2e-16
## alternative hypothesis: two-sided
z_scores <- scale(TidytableCortisol$Plasma_Cortisol)
which(abs(z_scores) > 3)  # Rows with potential outliers
## [1] 500
#The shape of the histogram suggests that an outlier is present among the higher plasma cortisol values. The KS test gives a p-value below 0.05, which means the data deviate from a normal distribution. The z-score check flags a potential outlier in row 500. Taking 20 mcg/dL as the upper limit for plasma cortisol, I will filter out values of 20 mcg/dL or more. The following code does the filtering.

filteredTidytableCortisol<-TidytableCortisol|>filter(Plasma_Cortisol<20)

head(filteredTidytableCortisol)
#Outliers have been filtered out using the filter() function. The new filtered tidy table will be used to construct a histogram. I will also perform a KS test on the new table.


NewHistogramCortisol<-ggplot(filteredTidytableCortisol,aes(x=Plasma_Cortisol))+geom_histogram()

NewHistogramCortisol
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

ks.test(x=filteredTidytableCortisol$Plasma_Cortisol,y="pnorm",mean(filteredTidytableCortisol$Plasma_Cortisol),sd(filteredTidytableCortisol$Plasma_Cortisol))
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  filteredTidytableCortisol$Plasma_Cortisol
## D = 0.019087, p-value = 0.4602
## alternative hypothesis: two-sided
#The new histogram shows a more normal-looking distribution. The KS test gives a p-value of 0.4602, which is greater than 0.05, so there is no evidence against normality.

Interpretation of EDA:

First, I built a histogram and performed the KS test to visualize the data and test for normality. The histogram showed a non-normal, right-skewed shape, suggesting the presence of an outlier. The data did not pass the initial KS test (p-value < 2.2e-16), indicating non-normality. The z-score check revealed an outlier in row 500, which was removed by using the filter() function to keep only values below 20 mcg/dL. Using the filtered table, I built a new histogram and repeated the KS test. The new histogram shows an approximately normal shape, and the KS test gives a p-value of 0.4602, so there is no evidence against normality.
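
The normality check above pools both groups into one vector. A possible complement is to test each group separately, since pooling two groups with different means produces a mixture that need not look normal even when each group does. The sketch below does this with the Shapiro-Wilk test on simulated data; the simulated values, seed and object names are hypothetical and are not the project's data. It is also worth noting that the KS test p-value is only approximate when the mean and standard deviation are estimated from the same sample.

```r
set.seed(7)  # arbitrary seed for a reproducible illustration

# Hypothetical data in the same long format as the filtered tidy table
sim <- data.frame(
  Plasma_Cortisol = c(rnorm(1000, mean = 10, sd = 1.5),
                      rnorm(1000, mean = 12, sd = 1.5)),
  Groups = rep(c("Control", "Treatment"), each = 1000)
)

# Shapiro-Wilk p-value for each group separately
p_by_group <- tapply(sim$Plasma_Cortisol, sim$Groups,
                     function(x) shapiro.test(x)$p.value)
p_by_group
```

A p-value above alpha in each group means there is no evidence against normality within that group; it does not prove that the data are normal.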


Primary Statistical Analysis

#Welch two-sample t-test comparing mean plasma cortisol between the control and treatment groups.

t.test(Plasma_Cortisol~Groups,data=filteredTidytableCortisol)
## 
##  Welch Two Sample t-test
## 
## data:  Plasma_Cortisol by Groups
## t = -36.462, df = 1994.4, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group Plasma.Cortisol.in.Control.Group and group Plasma.Cortisol.in.Treatment.Group is not equal to 0
## 95 percent confidence interval:
##  -2.333195 -2.095018
## sample estimates:
##   mean in group Plasma.Cortisol.in.Control.Group 
##                                         9.992535 
## mean in group Plasma.Cortisol.in.Treatment.Group 
##                                        12.206641
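
The numbers in this output can be cross-checked by hand. The sketch below takes the reported group means, t statistic and degrees of freedom from the output above, and recovers the difference in means, its standard error and the 95% confidence interval. The variable names are illustrative, and the standard error is back-calculated from the t statistic rather than read from the data.

```r
# Values copied from the t-test output above
mean_control   <- 9.992535
mean_treatment <- 12.206641
t_stat <- -36.462
df     <- 1994.4

# Difference in means (control minus treatment)
diff_means <- mean_control - mean_treatment

# Standard error implied by t = difference / standard error
se_diff <- diff_means / t_stat

# 95% confidence interval: difference +/- t quantile * standard error
ci <- diff_means + c(-1, 1) * qt(0.975, df) * se_diff

diff_means
se_diff
ci
```

The recomputed interval agrees with the reported one up to rounding, which confirms that the interval describes the control mean minus the treatment mean.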

Data Visualization

#Boxplot of plasma cortisol by group; the letters A and B mark groups that differ significantly.

ggplot(filteredTidytableCortisol,aes(x=Groups,y=Plasma_Cortisol))+geom_boxplot(fill=c("blue","orange"))+theme_classic()+annotate("text",x=1,y=16,label="A")+annotate("text",x=2,y=17,label="B")+ylab("Plasma Cortisol (mcg/dL)")


Conclusions

From the results of the t-test, the p-value is less than 2.2e-16, which is far below 0.05. Therefore, the null hypothesis can be rejected: there is a significant difference between the mean plasma cortisol level of the treatment group and that of the control group. With 95% confidence, the true difference in means (control minus treatment) lies between -2.333195 and -2.095018 mcg/dL. Since 0 is not included in the confidence interval, the mean of the control group is significantly lower than the mean of the treatment group. The results agree with the alternative hypothesis. However, they do not agree with my initial expectation in the background section, where I suggested that plasma cortisol might be lower in the cold shower treatment group than in the control group. Confounding variables could be present, since immune function can be affected by a variety of factors such as diet, sleep, physical exercise and stress. In addition, participants' compliance could not be tracked, and baseline cortisol was not collected, which ruled out a paired t-test analysis. If there were no limitations, I would design a better experiment that measures these other variables and keeps them at the same level across groups, tracks compliance, and collects baseline cortisol.


Citations

Alotiby, A. (2024). Immunology of Stress: A review article. Journal of Clinical Medicine, 13(21), 6394. https://doi.org/10.3390/jcm13216394

Buijze, G. A., Sierevelt, I. N., Van Der Heijden, B. C. J. M., Dijkgraaf, M. G., & Frings-Dresen, M. H. W. (2016). The effect of cold showering on health and work: a randomized controlled trial. PLoS ONE, 11(9), e0161749. https://doi.org/10.1371/journal.pone.0161749

OpenAI. (2024). ChatGPT (June 8 Version) [Large language model]. Retrieved from https://chat.openai.com/