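library(readr)
library(mosaic)  # needed for favstats(), diffmean(), shuffle(), do(), set.rseed(); also loads lattice for bwplot()
Data <- read_csv("~/Documents/MATH 247/Islands Mini Project 3 - Sheet1(1).csv")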
It is well known that exercise has positive effects on health, such as reducing heart issues, maintaining a healthy weight, and improving mood. One question that has been studied but remains unresolved is the effect of exercise on cognitive ability. I wanted to study whether high-intensity exercise, specifically, has an effect on cognitive ability. The article “Effects of Physical Exercise on Cognitive Functioning and Wellbeing: Biological and Psychological Benefits” by Mandolesi et al. found that exercise did affect cognitive ability and that physical activity could act as protection against neurodegeneration. A different study by Brown et al. examined cognitive change in older adults who took part in 6 months of either high-intensity or moderate-intensity exercise. Their results showed no significant improvement in cognitive ability compared to a control group. Since the two studies reached different conclusions about exercise and cognitive ability, I wanted to see whether exercise had any effect on cognitive ability among the islanders. The population parameter of interest for this study is the difference between the long-run mean change in time taken to complete a memory game for those who do high-intensity interval training (HIIT) for a week and the long-run mean change for those who do not, among islanders aged 25 to 65. Before starting, I suspected that the true value of this parameter was different from zero.
My observational units were 40 islanders, aged 25-65, who were randomly sampled from the three islands. I drew the sample in stages with a random number generator. First, I generated a number between 1 and 3 to choose an island. I then used the generator to select one of the cities on that island, then a household in that city, and finally one of the adults aged 25-65 in that household. I repeated the whole process for each of the 40 individuals in my sample. The response rate was fairly high. If a person declined consent, I went one step back and randomly picked a new household from the same city. I used the same protocol if the selected household had no one aged 25-65. Once I had all 40, I used the random number generator to select 20 people for my treatment group, and the rest became my control group. I had everyone do the memory game on day one and recorded their times, then I had my treatment group do high-intensity interval training (20 minutes) once every day for the next week. When the week was up, I had everyone do the memory game again. The two variables I recorded were whether or not someone did HIIT for the week and the time in seconds that it took the individual to complete the memory game. I recorded the difference between each person’s initial time and the time they got on the second attempt a week later, which I later used to compute my statistic. Unfortunately, my sample ended up containing only 39 people, as one of them left midway through the study.
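The random assignment step described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the participant IDs 1 to 40 and the seed are hypothetical, and the study itself used a random number generator rather than this exact code.

```r
# Hypothetical participant IDs; the seed is arbitrary and only makes the sketch reproducible.
set.seed(247)
ids <- 1:40

# Draw 20 IDs without replacement for the treatment (HIIT) group.
treatment_ids <- sort(sample(ids, size = 20))

# Everyone not drawn goes to the control group.
group <- ifelse(ids %in% treatment_ids, "T", "C")
print(table(group))
```

Because `sample()` draws without replacement by default, every participant lands in exactly one group and each group has 20 members.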
The first variable is categorical: whether or not a person did HIIT for 20 minutes each day for a week. The second variable is quantitative: the time it took a person to complete the memory game. I had each person play the memory game before the week started and any HIIT was done, and again after the week. I then found the difference between each person’s initial and final times. From the boxplot below, we can see that the two groups, T (those who did HIIT) and C (those who didn’t do HIIT), are very similar, so there doesn’t seem to be much of an association between doing HIIT and the difference between a person’s initial and final times in the memory game. The means of the two distributions are 0.237 and -1.165 for the HIIT and No HIIT groups, respectively. The medians are -0.6 and -0.3, with IQRs of 7.55 and 6.625. The standard deviations (5.19 and 5.56) show that the variability of the two groups is similar as well, lending credence to the idea that there might not be an association between the two variables.
bwplot(Group ~ Difference, horizontal = TRUE, data = Data)
favstats(Difference ~ Group, data = Data)
##   Group   min     Q1 median  Q3  max       mean       sd  n missing
## 1     C -15.1 -4.825   -0.3 1.8  6.8 -1.1650000 5.560602 20       0
## 2     T  -6.0 -3.650   -0.6 3.9 11.1  0.2368421 5.185258 19       0
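The IQRs quoted in the text follow directly from the quartiles in the `favstats()` output. As a quick check, the sketch below re-enters those summary values by hand (the data frame name `summary_stats` is just for illustration) and computes the IQR and variance for each group.

```r
# Quartiles and standard deviations copied from the favstats() output above.
summary_stats <- data.frame(
  Group = c("C", "T"),
  Q1    = c(-4.825, -3.650),
  Q3    = c(1.8, 3.9),
  sd    = c(5.560602, 5.185258)
)

# IQR = Q3 - Q1; variance = sd^2
summary_stats$IQR      <- summary_stats$Q3 - summary_stats$Q1
summary_stats$variance <- summary_stats$sd^2
print(summary_stats)
```

This gives an IQR of 6.625 for the control group and 7.55 for the HIIT group, with variances of roughly 30.9 and 26.9.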
The population for my study is all people aged 25-65 on the three islands. The response is the change in performance (time) between the initial memory game and the memory game a week later. The population parameter is the difference between the long-run mean change for those who do high-intensity interval training for a week and the long-run mean change for those who do not, ages 25 to 65. The null hypothesis is that the average change does not differ between people who performed HIIT and people who did not. The alternative hypothesis is that the average change does differ between people who performed HIIT and people who did not.
\[H_0: \mu_{HIIT} - \mu_{NoHIIT} = 0\]
\[H_a: \mu_{HIIT} - \mu_{NoHIIT} \neq 0\]
It’s important to go over what errors could occur from the results of the study, and what those errors would mean. A type 1 error in this study would mean that we falsely conclude that performing HIIT for a week affects the time taken to complete the memory game. A type 2 error would mean that performing HIIT for a week does in fact affect the time taken to complete the memory game, but we have failed to reject the null hypothesis that it doesn’t. Neither error is particularly bad, but a type 2 error could be worse, because people would then miss out on the beneficial effects of HIIT.
The sample that we obtained can reasonably be considered representative of the population of interest because it was randomly selected through the data collection method described above. Because the observational units were also randomly assigned to the two groups, a significant difference could be interpreted as a causal effect of HIIT.
Before using a theory-based approach, the validity conditions must be checked. Since I didn’t have at least 20 observations in each group (the HIIT group had 19) and I didn’t know how symmetric the population distributions are, I started with a simulation-based approach. Shuffling the differences across the two groups 1,000 times gave a simulated p-value of 0.433. Since the validity condition was short by only one observational unit, and the sample distributions in both groups are roughly symmetric and bell-shaped rather than heavily skewed, I think it is reasonable to use a theory-based approach as well. The theory-based approach gave a standardized statistic of -0.81465 and a theory-based p-value of 0.42. R computes this statistic as control minus HIIT, so its sign is the opposite of the observed difference in means of 1.402.
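set.rseed(117)  # fix the random seed so the simulation is reproducible
# Shuffle the differences across the groups 1000 times, recording the difference in means each time
Data.null <- do(1000) * diffmean(shuffle(Difference) ~ Group, data = Data)
# Plot the null distribution; 1.402 is the observed difference in sample means (T - C), rounded
dotPlot(~ diffmean, data = Data.null,
        width = 0.2, cex = 1,
        main = "Simulated Null Distribution of the difference in sample means",
        xlab = "difference in sample means",
        groups = (diffmean <= -1.402 | diffmean >= 1.402))  # highlights the p-value region
# Two-sided p-value: proportion of shuffles at least as extreme as the observed difference
p_value <- prop(~ (diffmean <= -1.402 | diffmean >= 1.402), data = Data.null)
cat("two-sided p-value is", p_value)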
## two-sided p-value is 0.433
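To make the shuffling logic concrete, here is a minimal base-R sketch of the same kind of two-sided permutation test. The function `perm_p_value()` and the toy vectors are hypothetical and made up purely for illustration; they are not the study data, and this is not the mosaic code used above.

```r
# Two-sided permutation p-value for a difference in means between two groups.
# `response` is numeric, `group` has exactly two distinct labels.
perm_p_value <- function(response, group, reps = 1000) {
  stopifnot(length(response) == length(group))
  lv <- sort(unique(group))
  stopifnot(length(lv) == 2)

  # Difference in means: second label minus first label (e.g. T - C).
  diff_means <- function(y) mean(y[group == lv[2]]) - mean(y[group == lv[1]])

  observed  <- diff_means(response)
  # Shuffle the responses across the fixed group labels, as shuffle() does.
  simulated <- replicate(reps, diff_means(sample(response)))

  # Small tolerance so shuffles that tie the observed value still count as "at least as extreme".
  mean(abs(simulated) >= abs(observed) - 1e-9)
}

# Toy data (made up): two clearly separated groups of five.
set.seed(117)
toy_response <- c(1, 2, 3, 4, 5, 101, 102, 103, 104, 105)
toy_group    <- rep(c("C", "T"), each = 5)
toy_p        <- perm_p_value(toy_response, toy_group)
print(toy_p)
```

Comparing against the unrounded observed statistic, with a small tolerance, means that shuffles reproducing exactly the observed difference are counted. A cutoff rounded slightly above the observed value (for example 1.402 when the statistic is 1.401842) would leave such ties out, which can make the simulated p-value marginally smaller.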
histogram(~Difference | Group, data = Data, width = 4, layout = c(1, 2))
t.test(Difference ~ Group, data = Data)
##
## Welch Two Sample t-test
##
## data: Difference by Group
## t = -0.81465, df = 36.989, p-value = 0.4205
## alternative hypothesis: true difference in means between group C and group T is not equal to 0
## 95 percent confidence interval:
## -4.888526 2.084842
## sample estimates:
## mean in group C mean in group T
##      -1.1650000       0.2368421
diffmean(Difference ~ Group, data = Data)
## diffmean
## 1.401842
pval(t.test(Difference ~ Group, data = Data))
## p.value
## 0.4204865
With a p-value of 0.42, the probability of observing a difference in sample means at least as extreme as our statistic of 1.402, in either direction, is about 42% if the null hypothesis is true. Since the p-value is large, we cannot reject the null hypothesis. In the context of the study, the data do not give us statistical evidence to support the alternative hypothesis that HIIT is associated with the change in time it takes to complete a memory game. Finally, the 95% confidence interval runs from -4.889 to 2.085, which gives the range of plausible values for our population parameter. R computed this interval as control minus HIIT, so we are 95% confident that the long-run mean change for the No HIIT group minus the long-run mean change for the HIIT group lies between -4.889 and 2.085 seconds. Equivalently, the HIIT minus No HIIT difference lies between -2.085 and 4.889 seconds. The confidence interval contains zero, which supports the conclusion that we cannot reject the null hypothesis, since zero is a plausible value of the parameter.
confint(t.test(Difference ~ Group, data = Data))
##   mean in group C mean in group T     lower    upper level
## 1          -1.165       0.2368421 -4.888526 2.084842  0.95
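The theory-based results can be reproduced by hand from the group summaries, which helps show where the standardized statistic, degrees of freedom, p-value, and confidence interval come from. The sketch below re-enters the means, standard deviations, and sample sizes from the `favstats()` output and applies the usual Welch formulas; the variable names are only illustrative.

```r
# Summary statistics copied from the favstats() output.
mean_c <- -1.1650000; sd_c <- 5.560602; n_c <- 20
mean_t <-  0.2368421; sd_t <- 5.185258; n_t <- 19

# Squared standard error contributed by each group.
v_c <- sd_c^2 / n_c
v_t <- sd_t^2 / n_t
se  <- sqrt(v_c + v_t)

# Welch t statistic, in the same direction as t.test(): C minus T.
t_stat <- (mean_c - mean_t) / se

# Welch-Satterthwaite degrees of freedom.
df <- (v_c + v_t)^2 / (v_c^2 / (n_c - 1) + v_t^2 / (n_t - 1))

# Two-sided p-value.
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval for C minus T, then flipped to T minus C.
margin        <- qt(0.975, df) * se
ci_c_minus_t  <- (mean_c - mean_t) + c(-1, 1) * margin
ci_t_minus_c  <- rev(-ci_c_minus_t)

print(c(t = t_stat, df = df, p = p_value))
print(ci_c_minus_t)
print(ci_t_minus_c)
```

Flipping the interval only changes the direction of the comparison (HIIT minus No HIIT instead of No HIIT minus HIIT). It contains zero either way, so the conclusion is unchanged.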
In conclusion, I did not find enough evidence to support the alternative hypothesis that the average difference between initial and final memory game times differs between people who performed HIIT and people who did not. My findings agree with the observations of Brown et al. Future work in this area should consider using a longer study period, incorporating different cognitive tasks and workouts, and breaking the analysis down by age group.
Bibliography
Belinda M. Brown, et al., High-intensity exercise and cognitive function in cognitively normal older adults: a pilot randomized clinical trial, Alzheimer’s Research & Therapy, Volume 13, 2021, Article 33, https://doi.org/10.1186/s13195-021-00774-y
Laura Mandolesi, et al., Effects of Physical Exercise on Cognitive Functioning and Wellbeing: Biological and Psychological Benefits, Frontiers in Psychology, Volume 9, 2018, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5934999/