Math 247 Final Project Report

Introduction and Literature Review

My research explored the effect of prescribed exercise on focus. The study sampled adults (age>18) from the simulated world of The Islands, so the target population is all adult Islanders. We used vigilance testing as a proxy variable for focus.\(^1\) The population parameter of interest is the long-run effect of acute exercise on focus: if all adult Islanders were subjected to this experiment, the difference in mean vigilance test score between the exercise and no-exercise groups.

Vigilance testing is well established as a proxy variable for focus.\(^1\) It is performed by having the participant react as quickly as possible to a stimulus over an extended period of time, after which the mean reaction time is calculated.\(^1\) Vigilance testing is commonly used in sleep studies.\(^1\) However, while the Islands user interface offers a “vigilance test,” its description indicates that the test is technically a “cancellation test.”\(^2\) This form of testing involves finding every instance of a certain symbol (such as the letter Z) on a page full of similar symbols (such as a page full of letters).\(^2\) Because cancellation testing has also been established as a measure of focus and used in other studies,\(^3\) this is a non-issue, and for this study the terms vigilance testing and cancellation testing are used interchangeably.

Previous studies have shown an increase in performance (faster response) on vigilance tests during moderate-effort exercise, compared to hard effort and no effort.\(^{4,5}\) While these studies showed an effect during exercise, they did not explore effects immediately afterward.

This guided my study toward the effect of exercise on vigilance immediately after exercise, which seemed like a natural direction in which to continue this research. As moderate exercise has been shown to increase focus during the activity,\(^{4,5}\) a decrease in vigilance test times is predicted directly after exercise, indicating increased focus.

Methods

The observational units were adult Islanders from any of the towns in the Islands. Thirty participants were selected through random sampling. Each time a selected Islander refused to take part in the study, the sampling was restarted from scratch for that slot, so the replacement was not necessarily from the same town. This was performed using the Python code included below. The sampling method did not ensure that participants were selected from every town, so if there was a cultural or location effect, it was not controlled for. This may also mean that if Islanders from a certain place refused to participate at a higher rate, they may not have been sampled, or may have been sampled at a lower rate. In the future, this sampling bias should be controlled for through a larger sample size or by ensuring an even number of participants are selected from each town. Controls were not in place for other possible confounding variables such as age, sex, etc.

The response variable was measured by assigning participants to take a vigilance test and recording the time it took them to complete it. Half the participants took the test directly after being assigned 30 minutes of moderate exercise (a 30 min run), and the other half were only assigned to take the test. The start of the runs was staggered by 30 seconds so that each participant in the exercise group could begin the test as soon as they finished their run.

A possible area of concern for this form of testing is the well-documented confounding variable of sleep.\(^1\) While all participants took the test at a relatively similar time (within 45 minutes of each other), they may have been in different parts of their circadian rhythm. As this form of testing has already been shown to be sensitive to tiredness and differences in circadian rhythm,\(^1\) it would be important to control for this variable, especially considering the small sample size.

import csv
import random

# Read town names from the input CSV (assumes one column with a header row)
def read_towns(filename):
    towns = []
    with open(filename, newline='') as csvfile:
        # Name the single column explicitly so each row can be read by key
        reader = csv.DictReader(csvfile, fieldnames=['town'])
        next(reader, None)  # Skip the header row; remove this line if the CSV has no header
        for row in reader:
            towns.append(row['town'])
    return towns

# Get random participant data with skip option
def get_random_participant(towns):
    while True:
        town = random.choice(towns)
        print(f"\nSelected Town: {town}")
        
        num_houses = int(input("How many houses are in the town? "))
        house_number = random.randint(1, num_houses)
        
        print(f"\nSelected House: #{house_number}")
        
        num_people = int(input(f"How many people live in house #{house_number}? "))
        
        selected_person = random.randint(1, num_people)
        
        name = input(f"Name of participant #{selected_person} in house #{house_number} (or type 'skip' to try again): ")
        
        if name.lower() != 'skip':
            return {'town': town, 'house_number': house_number, 'name': name}
        else:
            print("Skipping... choosing everything again from scratch.")
            
# Main logic
def main():
    input_file = 'towns.csv'  # Change this if needed
    output_file = 'selected_participants.csv'
    
    towns = read_towns(input_file)
    participants = []

    for i in range(30):
        print(f"\n--- Selecting participant {i + 1} ---")
        participant = get_random_participant(towns)
        participants.append(participant)

    # Write selected participants to output CSV
    with open(output_file, mode='w', newline='') as csvfile:
        fieldnames = ['town', 'house_number', 'name']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(participants)

    print(f"\n:) 30 participants selected and saved to '{output_file}'")

if __name__ == '__main__':
    main()
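
The even-allocation alternative suggested in the Methods (an equal number of participants from each town) could be sketched roughly as follows. This is only an illustration and was not used in the study: the helper name `allocate_per_town` and the example town names are hypothetical, and the function only decides how many participants to draw from each town. House and person selection would then proceed as in the script above.

```python
import random

def allocate_per_town(towns, total, rng=random):
    """Split `total` participants as evenly as possible across `towns`.

    Hypothetical helper: every town gets total // len(towns) slots, and any
    remainder is handed out one slot at a time to randomly chosen towns.
    """
    if not towns:
        raise ValueError("need at least one town")
    base, extra = divmod(total, len(towns))
    quota = {town: base for town in towns}
    # Give the leftover slots to `extra` distinct towns chosen at random
    for town in rng.sample(towns, extra):
        quota[town] += 1
    return quota

if __name__ == "__main__":
    # Placeholder town names, not the real list of Islands towns
    example_towns = ["Town A", "Town B", "Town C", "Town D"]
    print(allocate_per_town(example_towns, 30))
```

With this kind of allocation no town can be left out, at the cost of no longer giving every Islander the same chance of selection when towns differ in size.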

Results and Discussion

bwplot(exercise ~ vigilance,
       horizontal = TRUE,
       main = "Vigilance test time between groups",
       xlab = "Vigilance Test (response time, min)", # Label for x-axis (horizontal axis)
       ylab = "Exercise (yes/no)", # Label for y-axis (vertical axis)
       data = data)

The population is the adults of the Islands, and the parameter is the difference in long-run mean vigilance test scores with and without exercise. Unfortunately, there does not appear to be a difference between the groups. While the mean of the “yes” group is in fact lower, it is lower by a very small amount. The median of the “no” group is actually lower, and the median may be the better statistic to consider because the “no” group has a much lower outlier than the “yes” group.

The null hypothesis is that the mean vigilance test score is not lower in the exercise group than in the no exercise group (\(H_0: \mu_1 - \mu_2 = 0\)), and the alternative hypothesis is that the mean vigilance test score is lower in the exercise group than in the no exercise group (\(H_a: \mu_1 - \mu_2 > 0\)), where \(\mu_1\) is the mean vigilance test score of the no exercise group and \(\mu_2\) is the mean vigilance test score of the exercise group.

A type I error is rejecting the null hypothesis when it is in fact true. For this study, this would be concluding that there is a lower mean vigilance test score in the exercise group when there is in fact no difference. A type II error is not rejecting the null hypothesis when it is false. For this study, this would be concluding that there is not a lower mean vigilance test score in the exercise group when a difference does in fact exist.

As discussed earlier, while any Islander over the age of 18 did have the opportunity to be chosen, the small sample size (n=30) coupled with the completely random sampling likely led to some towns being overrepresented while others are not represented at all. This could have been addressed with random sampling stratified by town, but because this was not done, I do not believe I have a representative sample of the Islands (the population of interest). This also makes the study prone to strong effects of confounding variables such as sleep.\(^1\)

A two-sample t-test was used to analyze our data. The validity conditions for a two-sample t-test were not fully met, as there were only 15 participants per group while the test requires at least 20 (the validity condition of normal data was met, as shown by the normal-appearing box plot). The theory-based t-test is shown below, but a simulation-based p-value from shuffling the group labels was also found, with very similar results (\(p_{sim}=0.400\)).

udelta<-diffmean(vigilance ~ exercise, data = data)
tstat<-stat(t.test(vigilance[exercise == "y"], vigilance[exercise == "n"]))
cat("the difference in means is", (-udelta),"\nthe t-statistic is", (tstat))

## the difference in means is 0.14 
## the t-statistic is -0.2751275

The theory-based t-statistic is small in magnitude and unlikely to be statistically significant.
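
To make explicit what the t-statistic measures, here is a minimal Python sketch of the Welch (unequal-variance) two-sample t-statistic, which is the form R's `t.test` uses by default. The function name `welch_t` is hypothetical and the numbers are made-up placeholder values chosen for easy arithmetic, not the study's measurements.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t, df) for the difference mean(sample_a) - mean(sample_b)."""
    # Squared standard error contributed by each group
    se2_a = variance(sample_a) / len(sample_a)
    se2_b = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df

# Placeholder times in minutes (NOT the study's data)
exercise = [1.0, 2.0, 3.0, 4.0, 5.0]
no_exercise = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, df = welch_t(exercise, no_exercise)
print(t_stat, df)
```

The sign of the statistic follows the order of the arguments: listing the exercise group first, as in the R call above, gives a negative value when the exercise group has the lower mean.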

two.sided.p.value<-pval(t.test(vigilance ~ exercise, data = data))
cat("the one-sided p-value is",(two.sided.p.value/2))

## the one-sided p-value is 0.3926695

This p-value is very large, so we do not have enough evidence to reject our null hypothesis. From these data, exercise does not appear to have a statistically significant effect on focus as assessed through a vigilance test. We cannot accept the alternative hypothesis.
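
Halving the two-sided p-value is only valid when the observed difference points in the direction of the alternative hypothesis, as it does here (the estimated \(\mu_1 - \mu_2\) is 0.14, which is greater than 0). The small Python sketch below spells out that rule for a symmetric test statistic such as t. The helper name `one_sided_p` is hypothetical, and the inputs are the values reported in this section.

```python
def one_sided_p(two_sided_p, estimate, alternative="greater"):
    """Convert a two-sided p-value from a symmetric test (such as a t-test)
    into a one-sided p-value for the stated alternative."""
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    # Does the estimate point the way the alternative hypothesis predicts?
    in_direction = estimate > 0 if alternative == "greater" else estimate < 0
    half = two_sided_p / 2
    return half if in_direction else 1 - half

# Reported values: one-sided p = 0.3926695, so two-sided p = 2 * 0.3926695;
# estimated mu_1 - mu_2 (no exercise minus exercise) = 0.14
print(one_sided_p(2 * 0.3926695, 0.14))
```

Had the exercise group instead shown the higher mean, the same two-sided p-value would correspond to a one-sided p-value above 0.5.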

set.rseed(58)
data.null <- do(10000) * diffmean(shuffle(vigilance) ~ exercise, data = data)

dotPlot(~ diffmean,
data = data.null,
main="Simulated null distribution of the difference in sample means",
xlab="difference in sample means",
width = 0.25,
cex = 1,
groups = (diffmean >= 0.14))

simp_value <- mean(data.null$diffmean >= 0.14)
cat("the simulated p value is", (simp_value))

## the simulated p value is 0.3999

The simulated p-value of 0.3999 is very close to the theoretical p-value of 0.393. I suspect this is because, although the sample size is not large enough, the data are approximately normal. Neither p-value indicates that we can reject the null hypothesis.
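
The shuffling simulation above can also be sketched in plain Python. This is a rough equivalent under stated assumptions, not the code used for the analysis: the function name `permutation_p_value` is hypothetical, the data are made-up placeholders, and Python's random number generator differs from R's, so the seed will not reproduce the value reported above.

```python
import random
from statistics import mean

def permutation_p_value(no_exercise, exercise, reps=10_000, seed=58):
    """One-sided permutation p-value for mean(no_exercise) - mean(exercise) > 0."""
    rng = random.Random(seed)
    observed = mean(no_exercise) - mean(exercise)
    pooled = list(no_exercise) + list(exercise)
    n_no = len(no_exercise)
    hits = 0
    for _ in range(reps):
        # Shuffling the pooled times breaks any link between group and score
        rng.shuffle(pooled)
        diff = mean(pooled[:n_no]) - mean(pooled[n_no:])
        if diff >= observed:
            hits += 1
    # Proportion of shuffles at least as extreme as the observed difference
    return hits / reps

# Placeholder times in minutes (NOT the study's data)
no_exercise = [5.1, 4.8, 5.6, 5.0, 4.9]
exercise = [5.0, 4.7, 5.3, 5.2, 4.6]
print(permutation_p_value(no_exercise, exercise, reps=2_000))
```

Because the groups are redrawn from the pooled scores on every repetition, the result does not rely on the normality or sample-size conditions of the theory-based test.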

confint(t.test(vigilance[exercise == "y"], vigilance[exercise == "n"]))

The calculated confidence interval means that we are 95% confident that the population difference in mean vigilance test scores lies in the range [-1.18, 0.90]. Because the t.test call lists the exercise group first, this interval is for \(\mu_2 - \mu_1\), where \(\mu_1\) is the mean vigilance test score of the no exercise group and \(\mu_2\) is the mean vigilance test score of the exercise group. The equivalent interval for \(\mu_1 - \mu_2\) is [-0.90, 1.18]. Either interval includes 0, so we cannot reject the null hypothesis of \(\mu_1 - \mu_2=0\). This is the same conclusion we came to through our two-sample t-test.

Conclusion

We did not find a statistically significant effect of exercise on focus as assessed by group mean vigilance test score (t=-0.275, \(p_{sim}\)=0.400). This does not match our expectations based on existing literature.\(^{4,5}\) While we did randomly sample from our population, our small sample size compared to the population of the Islands (n=30) and the lack of stratification by town mean that our data are likely prone to incidental sampling bias. This makes our conclusion hard to generalize to the population of the Islands despite our random sampling. Further research should revisit this question because of the flaws of this study and the potential of the research for real-world applications; for example, school and work schedules could be organized to improve focus and productivity. In future research, a larger sample size should be used, and a more in-depth experiment could be performed. For example, instead of dividing the participants into two groups and comparing exercise against a no-exercise control, a different control could be used, and vigilance tests could be performed before and after the treatment as a measure of change in vigilance. This study assumed that the baseline vigilance for both groups would be the same, which may not have been true given the small sample size.

Bibliography

1. Van Dongen HPA, Dinges DF. Sleep, Circadian Rhythms, and Psychomotor Vigilance. Clinics in Sports Medicine. 2005;24(2):237-249. doi:10.1016/j.csm.2004.12.007
2. Rorden C, Karnath HO. A simple measure of neglect severity. Neuropsychologia. 2010;48(9):2758-2763. doi:10.1016/j.neuropsychologia.2010.04.018
3. Casagrande M, Ferrara M, Curcio G, Porcu S. Assessing nighttime vigilance through a three-letter cancellation task (3-LCT): effects of daytime sleep with temazepam or placebo. Physiology & Behavior. 1999;68(1-2):251-256. doi:10.1016/S0031-9384(99)00144-4
4. González Fernández FT, Etnier JL, Zabala M, Sanabria Lucena D. Vigilance performance during acute exercise. Published online 2017. doi:10.7352/IJSP.2017.48.435
5. González-Fernández FT, Latorre-Román PÁ, Parraga-Montilla J, Castillo-Rodriguez A, Clemente FM. Effect of Exercise Intensity on Psychomotor Vigilance During an Incremental Endurance Exercise in Under-19 Soccer Players. Motor Control. 2022;26(4):661-676. doi:10.1123/mc.2022-0033
6. Wendel E. Statistics with Applications Final Presentation. 2025. http://rpubs.com/therealmaxrebo/1310974


Evan Wendel