Introduction

Is there a difference in the stress levels of students who are pursuing a STEM major and students who are pursuing a non-STEM major?

The goal of this study is to determine whether there is a significant difference between the mean stress levels of students enrolled in STEM-related departments at their university and students enrolled in non-STEM departments. The associated population parameter is the difference between those population means: the STEM mean minus the non-STEM mean.

Differences between STEM and non-STEM majors have been observed before, mainly in attitudes about the importance of, and confidence in, science and scientific methods in society (Cotner et al., 2017). Studies have also compared the performance of different ethnic minorities in STEM, showing lower performance in minority groups (Whalen and Shelley, 2010). These differences could conceivably contribute to a difference in the general stress levels of STEM students for a variety of reasons, and the goal of this study is to determine whether such a difference in stress actually emerges.

Before collecting any data, I expected the sampled STEM students to have a higher mean level of stress than non-STEM students. This expectation was motivated largely by my own perception of the difficulty of some STEM courses, and of how lacking foundational knowledge in a topic can lead to very stressful situations in which a student falls behind or loses the motivation to succeed. I believed this, along with the social pressures surrounding scientific advancement today, could result in higher mean stress levels in STEM students.

Data Collection Methods

Observations were taken from a randomly selected sample, blocked (stratified) by department, of all students at the University of Arcadia in the town of Arcadia, situated on Providence island in the online “The Islands” population simulation. First, a total sample size was chosen; then each department's sub-sample size was set to the total sample size multiplied by that department's share of the whole student population. As a result, the sample had the same proportional makeup as the student population in terms of department, giving a representative, randomly sampled mini-population.
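The proportional allocation described above can be sketched in a few lines of R. This is only an illustration: the department names other than Science, Math and Health, all enrollment counts, and the `roster` data frame are hypothetical placeholders, not the real University of Arcadia figures or the external script actually used.

```r
# Hypothetical enrollment counts per department (NOT the real Arcadia numbers)
dept_sizes <- c(Science = 300, Math = 150, Health = 250, Arts = 200, Business = 100)
total_n <- 60  # intended total sample size, as in this study

# Proportional allocation: each department's share of the population times total_n.
# With other counts, rounding may not sum exactly to total_n and would need adjusting.
alloc <- round(total_n * dept_sizes / sum(dept_sizes))

# Hypothetical student roster with one row per student
set.seed(247)
roster <- data.frame(
  id = seq_len(sum(dept_sizes)),
  department = rep(names(dept_sizes), times = dept_sizes)
)

# Simple random sample without replacement inside each department
sampled <- do.call(rbind, lapply(names(alloc), function(d) {
  pool <- roster[roster$department == d, ]
  pool[sample(nrow(pool), alloc[[d]]), ]
}))

print(alloc)
print(table(sampled$department))
```

Because the draw is made within each department separately, the sample's departmental makeup matches the population's by construction, which is the property this study relies on.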

Students enrolled in the Science, Math and Health departments were classified as STEM, and students in all other departments as non-STEM. This provided a fairly even split of the population between the two groups. Stress was measured via salivary cortisol testing. Cortisol is the hormone released by the adrenal glands in response to stress, and it is responsible for many of the physical side effects of stress. Salivary cortisol was chosen as the measurement method because blood tests can themselves induce stress in the subject, leading to inaccurate results or false positives (Bozovic et al., 2013).

The initial sample size was chosen to be 60 students in total. However, after the sampling took place, 33% of the chosen students (20 of 60) declined to take part in the study, leaving a total sample size of 40 students, with 18 STEM and 22 non-STEM. This was a much lower participation rate than expected. Because the sampling was done with an external script that did not store the students already chosen, I was unable to draw replacements while keeping the blocked design. Time constraints also contributed to these issues.

All subjects were sampled once in the same one-hour period between 1:00PM and 2:00PM, in an attempt to limit the effect of the natural variation in cortisol levels over the course of the day.

Descriptive Statistics

library(mosaic)  # provides favstats() and pval()
favstats(cortisol ~ isSTEM, data = Mini_Project_3_Data_Sheet1)

Groups labeled “TRUE” represent STEM students, and those labeled “FALSE” represent non-STEM students.

As the summary statistics show, the two groups are quite comparable. Both have very similar minimum and maximum values, as well as medians. They differ slightly in mean (0.174 for STEM versus 0.169 for non-STEM) and in standard deviation (0.045 versus 0.052), although these differences are still quite small.

Below is a pair of box plots to visually represent the data.

bwplot(isSTEM ~ cortisol, 
       horizontal = TRUE, 
       main="Cortisol Box-plots, STEM & not STEM",
       data = Mini_Project_3_Data_Sheet1)

As shown by these plots, the centers of the two distributions overlap almost entirely, suggesting little or no association between the two variables. The STEM (TRUE) group displays longer tails, with its most extreme values automatically flagged by R as outliers. A less summarized view of both distributions is given by histograms.

histogram(~cortisol | isSTEM, data = Mini_Project_3_Data_Sheet1, width = 0.025, layout = c(1, 2))

These histograms show the clear similarity between the two groups. They are centered near the same value and have nearly the same extrema, though the STEM (TRUE) group shows a tighter center than the non-STEM (FALSE) group, as reflected by a lower STEM standard deviation.

Analysis of Results

The population of this study is the students of the University of Arcadia, and the parameter is the difference in mean salivary cortisol level between STEM and non-STEM students. Therefore, the null and alternative hypotheses are as follows:

Null Hypothesis

The mean cortisol levels of the two population groups are the same: \[H_0:\mu_\text{STEM}-\mu_\text{non-STEM}=0\]

Alternative Hypothesis

The mean cortisol level of the STEM student population is greater than the mean of the non-STEM population: \[H_a:\mu_\text{STEM}-\mu_\text{non-STEM}>0\]

In this case, a type I error would be rejecting the null hypothesis when it is actually true: concluding that the STEM mean is higher than the non-STEM mean when in reality the two population means are equal. A type II error would be the opposite: failing to reject the null hypothesis when the STEM mean really is higher than the non-STEM mean.

Because of the sampling method used, this study’s sample should be fairly representative of the population it was drawn from, the student population of the University of Arcadia. The random, blocked approach yields a sample whose departmental makeup closely matches the population's while limiting selection bias, although the 33% of selected students who declined to participate could introduce some nonresponse bias. Inherent differences between universities also mean this sample can’t realistically be generalized past the University of Arcadia, as students weren’t sampled from other universities.

T-statistic

The validity conditions for a theory-based two-sample t-test are that both groups have symmetric distributions, or that each group has at least 20 observations without any strong skew. The recorded sample sizes are 18 and 22, so the STEM group falls slightly short of 20. However, both sample distributions are roughly symmetric, without any particular skew in a single direction, so the symmetry condition is met and the test can be treated as valid.

In order to calculate the t-statistic, we first need to know the sample statistic and standard error.

Sample statistic

\[ \begin{aligned} \bar{x}_\text{STEM}-\bar{x}_\text{not STEM} &= 0.1740000-0.1688182 \\ &= 0.0051818 \end{aligned} \]

Standard error

\[ \begin{aligned} SE &= \sqrt{\frac{s_\text{not STEM}^2}{n_\text{not STEM}}+\frac{s_\text{STEM}^2}{n_\text{STEM}}} \\ &= \sqrt{\frac{0.05205733^2}{22}+\frac{0.04460678^2}{18}} \\ &= \sqrt{0.0002337227}=0.01528799 \end{aligned} \]

T-statistic

\[t=\frac{\bar{x}_\text{STEM}-\bar{x}_\text{not STEM}}{SE}=\frac{0.0051818}{0.01528799}=0.3389458\]
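The hand calculation above can be double-checked in R using only the summary statistics quoted in this section. This is a minimal sketch; the variable names are my own and are not part of the original analysis.

```r
# Summary statistics as reported in this section
mean_stem    <- 0.1740000; sd_stem    <- 0.04460678; n_stem    <- 18
mean_nonstem <- 0.1688182; sd_nonstem <- 0.05205733; n_nonstem <- 22

# Sample statistic: STEM mean minus non-STEM mean
diff_means <- mean_stem - mean_nonstem

# Unpooled standard error of the difference in means
se <- sqrt(sd_nonstem^2 / n_nonstem + sd_stem^2 / n_stem)

# t-statistic: how many standard errors the observed difference is from 0
t_stat <- diff_means / se

cat("difference:", diff_means, " SE:", se, " t:", t_stat, "\n")
```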

We can run a t-test function in R on the sample data to give us the p-value.

P-value

theory.based.pval=pval(t.test(cortisol ~ isSTEM, data= Mini_Project_3_Data_Sheet1))
cat("Theory-based two-sided p-value according to R: ",theory.based.pval)

## Theory-based two-sided p-value according to R:  0.7365206

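As the alternative hypothesis specifies “greater than 0” rather than “not equal to 0,” a one-sided p-value is more appropriate. Because the observed difference (STEM minus non-STEM) is positive, and so lies in the direction of the alternative, the one-sided p-value is half of the two-sided p-value.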

\[\text{one-sided p-value}=0.7365206/2=\textbf{0.3682603}\]
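As a rough check, the one-sided p-value can also be computed directly from the summary statistics instead of halving the two-sided value. The sketch below assumes the p-value above came from R's default Welch test with Welch-Satterthwaite degrees of freedom, which the unpooled standard error suggests; the variable names are my own.

```r
# Summary statistics as reported in this section
mean_stem    <- 0.1740000; sd_stem    <- 0.04460678; n_stem    <- 18
mean_nonstem <- 0.1688182; sd_nonstem <- 0.05205733; n_nonstem <- 22

# Per-group variance contributions to the standard error
v_stem    <- sd_stem^2 / n_stem
v_nonstem <- sd_nonstem^2 / n_nonstem

t_stat <- (mean_stem - mean_nonstem) / sqrt(v_stem + v_nonstem)

# Welch-Satterthwaite degrees of freedom (assumed to match R's default t.test)
df <- (v_stem + v_nonstem)^2 /
  (v_stem^2 / (n_stem - 1) + v_nonstem^2 / (n_nonstem - 1))

# One-sided p-value for H_a: mu_STEM - mu_nonSTEM > 0 (upper tail)
p_one_sided <- pt(t_stat, df, lower.tail = FALSE)

# Two-sided p-value for comparison
p_two_sided <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)

cat("df:", df, " one-sided p:", p_one_sided, " two-sided p:", p_two_sided, "\n")
```

With the raw data, the same one-sided test could likely be requested from `t.test` itself through its `alternative` argument. Since R computes non-STEM minus STEM here, the matching choice would be `alternative = "less"`.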

Assuming the null hypothesis is true, there is about a 37% probability of observing a difference in sample means (STEM minus non-STEM) as large as or larger than the one recorded in this study.

This p-value is very large, and offers no statistical evidence against the null hypothesis. Therefore, the null hypothesis cannot be rejected.

In the context of this study, there is not enough evidence to support the hypothesis that the mean cortisol levels, and therefore stress levels, of STEM students are higher than those of non-STEM students.

The same R function used to find the p-value can be used to find the 95% confidence interval.

theory.confint=confint(t.test(cortisol ~ isSTEM, data= Mini_Project_3_Data_Sheet1))
cat('[',theory.confint$lower,',',theory.confint$upper,']')

## [ -0.03613344 , 0.0257698 ]

For the parameter actually being investigated (STEM minus non-STEM), the interval’s bounds are [-0.0258, 0.0361]. R orders the groups with FALSE before TRUE, so it computes the non-STEM mean minus the STEM mean, and the outputted interval has the opposite sign of the parameter of interest.

This confidence interval implies that we are 95% confident that the true population parameter, the true difference between the mean cortisol levels of the STEM and non-STEM populations, is between -0.0258 and 0.0361. The null hypothesis states that the mean levels are the same, meaning the difference between them is equal to 0. Since 0 is contained within this confidence interval, the null value is plausible at 95% confidence. This agrees with the conclusion from the p-value, as both the p-value and the confidence interval show that there is not enough evidence to reject the null hypothesis.
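The sign flip can be checked by rebuilding the interval from the summary statistics with the subtraction done in the intended order. This is a sketch that assumes the Welch-Satterthwaite degrees of freedom used by R's default t-test; the variable names are my own.

```r
# Summary statistics as reported in this section
mean_stem    <- 0.1740000; sd_stem    <- 0.04460678; n_stem    <- 18
mean_nonstem <- 0.1688182; sd_nonstem <- 0.05205733; n_nonstem <- 22

v_stem    <- sd_stem^2 / n_stem
v_nonstem <- sd_nonstem^2 / n_nonstem
se <- sqrt(v_stem + v_nonstem)

# Welch-Satterthwaite degrees of freedom (assumed to match R's default t.test)
df <- (v_stem + v_nonstem)^2 /
  (v_stem^2 / (n_stem - 1) + v_nonstem^2 / (n_nonstem - 1))

# 95% interval for STEM minus non-STEM: estimate +/- critical value * SE
diff_means <- mean_stem - mean_nonstem
margin <- qt(0.975, df) * se
ci_stem_minus_nonstem <- c(lower = diff_means - margin, upper = diff_means + margin)

# Reversing the subtraction negates both bounds and swaps their roles
ci_nonstem_minus_stem <- c(lower = -diff_means - margin, upper = -diff_means + margin)

print(round(ci_stem_minus_nonstem, 4))
print(round(ci_nonstem_minus_stem, 4))
```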

Conclusion

Even though the null hypothesis couldn’t be rejected and the alternative hypothesis couldn’t be shown as correct, I still learned from this study.

The one-sided p-value of 0.368 was far above the 0.05 threshold needed to reject the null hypothesis at the 5% significance level, and the confidence interval [-0.0258, 0.0361] led to the same conclusion. Even though the null couldn’t be rejected, these results are still meaningful. As far as the University of Arcadia is concerned, the observed difference in stress between STEM and non-STEM students is small enough to be explained by random chance alone. Of course, this result doesn’t prove that there is no association, just that there is not enough evidence to reject the null hypothesis.

These results cannot be generalized past the University of Arcadia. Different universities can have very different attitudes towards certain subjects, with different levels of difficulty, so results from a single university can’t be generalized to all universities in a region.

If I were to do this study again, first and foremost I would change the way I sampled. For an observational study, the more observations the better, and with the script I used for the sampling, all I would have to do is change a single parameter for the sample size. However, that script isn’t even strictly necessary, especially when comparing students from specific departments. The script would be better suited to a study that compares two experimental groups, both meant to represent samples of the entire student population.

However, following the blocked sampling, a further study I’d like to see would be a comparison of the mean cortisol levels of students in each individual department at the university. This could be accomplished with a theory-based ANOVA test, and wouldn’t require any complicated sampling scripts, as each category would need to represent its respective department rather than the population of the entire university. The ANOVA test could be used to determine if an actual difference exists between some of the groups, then a series of confidence intervals comparing each group could be used to determine exactly how different each group is from one another. That, however, is for someone in the future to determine.
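A minimal sketch of what that follow-up analysis might look like in R is given below. The data are simulated placeholders, not measurements from the Islands, and the department names other than Science, Math and Health are hypothetical. Tukey's HSD is used here as one common way to get the pairwise confidence intervals; it is my substitution, not a method prescribed by this report.

```r
set.seed(247)

# Simulated placeholder data: 12 students in each of 5 departments,
# all drawn from the same distribution (so no true differences exist)
departments <- c("Science", "Math", "Health", "Arts", "Business")
sim <- data.frame(
  department = factor(rep(departments, each = 12)),
  cortisol   = rnorm(60, mean = 0.17, sd = 0.05)
)

# One-way ANOVA: is there evidence that at least one department mean differs?
fit <- aov(cortisol ~ department, data = sim)
print(summary(fit))

# Pairwise 95% confidence intervals for every pair of departments,
# adjusted for multiple comparisons
pairwise <- TukeyHSD(fit, conf.level = 0.95)
print(pairwise)
```

With five departments there are ten pairwise comparisons, which is why an adjustment for multiple comparisons is worth considering before reading the individual intervals.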

Bibliography: references to literature

Bozovic, Djordje, Maja Racic, and Nedeljka Ivkovic. “Salivary cortisol levels as a biological marker of stress reaction.” Med Arch 67.5 (2013): 374-377,(https://d1wqtxts1xzle7.cloudfront.net/46732115/Salivary_cortisol_levels_as_a_biological20160623-12013-1aaa7w4-libre.pdf?1466688824=&response-content-disposition=inline%3B+filename%3DSalivary_Cortisol_Levels_as_a_Biological.pdf&Expires=1747246896&Signature=H3x-4rxrmXit74hGIbuJjw3hp3Yl7CNaPkUUQidUfuAp0u6Ndr9rcquv3BG9~6HZMbLeVoEoOy770RNw7pCP7b3y27xog4wWPLxvdPrIot7HOhkui3zdvdGr2H48BE~pLRp9JZoqI-7-FUE9CMK0dsIDfoVRMhXCBlkwkKiod6dS0nEfXiWIfu~ryVZYIfS3NTyYJIRFSr4xFtY472YKGtX5F4iY4a6k5~jYXs5FR54I4G7GejGKsw9JWwGeAPWF6B-Ff9ofZgUK8V9aqHbmGvj7F0H-lo~0u4aDFMICvS74eS9x0ejxfPEDwTEoAmHRbOwpcVrBald~YuJGwBXLFg__&Key-Pair-Id=APKAJLOHF5GGSLRBV4ZA#page=67)

Cotner, Sehoya, Seth Thompson, and Robin Wright. “Do biology majors really differ from non–STEM majors?.” CBE—Life Sciences Education 16.3 (2017): ar48,(https://www.lifescied.org/doi/full/10.1187/cbe.16-11-0329)

Whalen, Donald F., and Mack C. Shelley. “Academic success for STEM and non-STEM majors.” Journal of STEM Education: Innovations and research 11.1 (2010),(https://www.jstem.org/jstem/index.php/JSTEM/article/view/1470)

Math 247 Final Project Report

Kieran Young