Math 247 Final Project Report

In this study, I was interested in how time spent outdoors in the sun affects reported happiness. My research question was: Is there a difference in reported happiness levels between students who sunbathe and those who are not exposed to the sun? The population parameter of interest was the difference between two proportions: the proportion of all students at Hofn University within “The Islands” simulation who would be categorized as “happy” after sunbathing for 30 minutes, and the proportion who would be categorized as “happy” without sunbathing. To gauge the significance of the research question, I looked for studies that might support an alternative hypothesis. One study in China that ran for 8 years found a statistically significant positive correlation between sun exposure on the interview date and subjective well-being (Liu et al., 2025). A review article that analyzed 7 studies found that 6 of the 7 demonstrated a benefit of exposure to ultraviolet radiation for mood (Veleva et al., 2018). The third and final paper I read discusses the importance of vitamin D, which one gets from the sun, and the detriment of vitamin D deficiency to global health. The physical health effects of vitamin D deficiency are jarring, including a role in 17 kinds of cancer (Naeem, 2010). After this research, I suspected that the actual value of the parameter, the difference in proportions, would be positive, meaning that the sun group would report a higher rate of happiness than the no-sun group. However, for the purposes of the study, I used a two-sided alternative hypothesis to test for any difference and did not assume a direction within the hypothesis test.

The observational units were the individual students from Hofn University in the study, and the sampling frame was the list of students at Hofn University. The participants’ ages were widely spread, ranging from 19 to 78 years old. From the sampling frame of 473 student names, I had ChatGPT randomly select 100 observational units, of whom 80 consented, for a response rate of 80%. There were some issues with using ChatGPT to randomly select and assign participants. After I input the sampling frame, about half of the 100 names it returned were real names from the list, while the other half were either completely made up or repeats of names it had already given earlier in the list. This made finding participants somewhat difficult. I kept having it produce names until I had 100 that were real within The Islands, and then I asked those 100 for consent. The 80 participants were then randomly assigned to 2 groups: sun and no sun. Repeat visits weren’t a big issue: I either only interviewed participants with the 3 questions, or had them do the 30-minute task “Sunbathe 30 minutes” and then interviewed them within 10 minutes of completing it. Random sampling error is possible, as this sample represents only 16.91% of the population of Hofn University students. However, the large sample size and random sampling methods should have reduced sampling error. I tried to avoid non-sampling errors by inputting data carefully as I went and using the same questions every time. However, measurement error is possible, as self-reported responses may be inaccurate.
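
Because ChatGPT returned invented and repeated names, the selection and assignment steps described above could instead be carried out directly in R. The following is a minimal sketch, not the procedure actually used in this study: the placeholder student IDs, the seed, the object names, the choice of which invitees consent, and the even 40/40 split are all assumptions made for illustration.

```r
# Hypothetical sampling frame: placeholder IDs standing in for the 473 student names
frame <- paste0("student_", seq_len(473))

set.seed(247)  # arbitrary seed, only so the draw can be reproduced

# Simple random sample of 100 students without replacement,
# so no name can repeat and every name comes from the frame
invited <- sample(frame, size = 100)

# Placeholder for the students who consent (assumed here to be the first 80 invited)
consented <- invited[1:80]

# Random assignment: shuffle 40 "Sun" and 40 "No Sun" labels across the consenting students
assignment <- data.frame(
  student = consented,
  group   = sample(rep(c("Sun", "No Sun"), each = 40))
)

table(assignment$group)
```

Sampling without replacement from the frame itself rules out both problems noted above, since every selected name exists in the frame and appears only once.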

I chose two binary, categorical variables. The explanatory variable was whether the participant spent time outdoors. I measured this variable by either having participants complete the activity “Sunbathe 30 mins” or having them do no activity before my interview. The response variable was whether they were happy or not, for which I had to use a proxy variable, as happiness cannot be directly measured. I measured happiness by asking three questions, to which participants could respond on a scale of 0-4: “not at all” (0), “a little” (1), “moderately” (2), “quite a bit” (3), and “extremely” (4). The word choice for the three questions comes from the POMS (Shacham, 1983), which is built into the programming of The Islands. I asked how “discouraged”, “unhappy”, and “worthless” they felt, all under the umbrella term of depression. I averaged each participant’s three scores; if the average was less than 1, the participant was categorized as happy, and if it was greater than or equal to 1, the participant was categorized as not happy. This was an easy variable to measure with the islanders, and none withdrew consent after being interviewed, but because I used a self-report method, it is unclear how reliable their answers are. After analyzing the data, I determined that the proportion categorized as happy was 0.44 in the sun group and 0.39 in the no-sun group. Looking at the table the other way, 0.43 of the happy participants and 0.38 of the not-happy participants were in the sun group. Because the proportion categorized as happy is similar in the two groups, there does not appear to be a strong association. I included a bar graph of the data and a mosaic plot, which makes the comparison easier to see. In the mosaic plot, there does not appear to be any strong association between the two variables.
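
The coding rule described above (average the three 0-4 items, then categorize averages below 1 as happy) can be sketched in R as follows. This is only an illustration: the item scores are hypothetical values, not data from the study, and the object and column names are assumptions.

```r
# Hypothetical item scores (0-4) for four respondents; not data from the study
responses <- data.frame(
  discouraged = c(0, 2, 1, 1),
  unhappy     = c(1, 3, 0, 1),
  worthless   = c(0, 1, 1, 1)
)

# Average the three depression items for each respondent
items <- c("discouraged", "unhappy", "worthless")
responses$avg_score <- rowMeans(responses[, items])

# Averages below 1 are coded "Happy"; averages of 1 or more are coded "Not Happy"
responses$mood <- ifelse(responses$avg_score < 1, "Happy", "Not Happy")

responses
```

The fourth hypothetical respondent averages exactly 1 and is therefore coded “Not Happy”, which shows how the boundary case is handled under this rule.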

library(mosaic)   # do(), bargraph(), prop()
library(dplyr)    # mutate() and the %>% pipe
library(forcats)  # fct_relevel()

Sunny <- 
  rbind(
    do(18) * data.frame(mood = "Happy", sight = "Sun"),
    do(24) * data.frame(mood = "Happy", sight = "No Sun"),
    do(23) * data.frame(mood = "Not Happy", sight = "Sun"),
    do(38) * data.frame(mood = "Not Happy", sight = "No Sun")
  )

Sunny <- Sunny %>% 
  mutate(sight = factor(sight)) %>% 
  mutate(sight = fct_relevel(sight, c("Sun", "No Sun")))

table_sunny <- table(Sunny$sight, Sunny$mood)
table_sunny

##         
##          Happy Not Happy
##   Sun       18        23
##   No Sun    24        38

bargraph(~mood, group = sight, data = Sunny, auto.key = TRUE)

mosaicplot(mood ~ sight, data = Sunny)

prop1 <- round(prop(~sight + mood, data = Sunny), 2)

prop2 <- round(prop(~mood + sight, data = Sunny), 2)

prop1

##     prop_Sun.Happy prop_Sun.Not Happy 
##               0.43               0.38

prop2

##    prop_Happy.Sun prop_Happy.No Sun 
##              0.44              0.39

Again, the population in my study is the students at Hofn University, and the parameter is the difference between the true proportion of all students at Hofn University who would be categorized as “happy” after sunbathing for 30 minutes and the true proportion who would be categorized as “happy” without sunbathing. The null hypothesis is that there is no difference in the proportion categorized as happy between students who sunbathe for 30 minutes and those who do not (\(H_0: \pi_{sun} = \pi_{no\ sun}\)). The alternative hypothesis is that there is a difference between students who sunbathe for 30 minutes and those who do not (\(H_A: \pi_{sun} \neq \pi_{no\ sun}\)). Either a Type I or a Type II error was possible in this study. A Type I error, or false alarm, would occur if I concluded from my data that there was a difference in happiness between students who sunbathe for 30 minutes and those who do not, and thus rejected the null hypothesis, when the null hypothesis was really true. A Type II error, or missed opportunity, would occur if the null hypothesis of no difference was really false, but my data led me not to reject it. My measurements can reasonably be considered a representative sample from the population of interest, as I used a complete sampling frame and both random selection and random assignment. The standardized statistic, z, came out to be -1.578896. This is the number of standard errors by which the observed difference in sample proportions falls below the hypothesized difference of zero. Its absolute value is below 2, so on its own it provides only modest evidence against the null hypothesis. The validity conditions for the theory-based method were met, with random sampling and random assignment and at least 10 successes and 10 failures in each group.
The two-sided p-value of 0.11436 means that, assuming the null hypothesis is true, there is an 11.436% probability of obtaining a difference in sample proportions at least as extreme as the one observed by random chance alone. Because this p-value is greater than the conventional 0.05 significance level, it provides little evidence against the null hypothesis, and I fail to reject \(H_0\). I do not have sufficient evidence to conclude that there is a difference in happiness between students who sunbathe for 30 minutes and those who do not, so I cannot conclude that sun exposure has an effect on students’ happiness levels at Hofn University. The 95% confidence interval was (-0.39, 0.04). This means that I am 95% confident that the true difference (sun minus no sun) in the proportion of Hofn University students who would be categorized as “happy” is between -0.39 and 0.04. Because the interval includes zero, a true difference of zero is plausible, which mirrors the conclusion from the z-statistic and p-value. Additionally, the relative risk computed from the contingency table was 1.134146, meaning that in that table the sun group was about 13.4% more likely to be categorized as happy than the no-sun group.
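
For reference, the z-statistic reported above follows from the pooled two-proportion formula. The worked version below is a sketch that plugs in the counts from the test code in this report (18 of 41 categorized as happy in the sun group, 24 of 39 in the no-sun group), with intermediate values rounded; the symbol names are chosen here for illustration.

\[
\hat{p} = \frac{18 + 24}{41 + 39} = 0.525, \qquad
z = \frac{\hat{p}_{sun} - \hat{p}_{no\ sun}}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\frac{1}{n_{sun}} + \frac{1}{n_{no\ sun}}\right)}}
  = \frac{0.4390 - 0.6154}{\sqrt{0.525 \times 0.475 \times \left(\frac{1}{41} + \frac{1}{39}\right)}}
  \approx \frac{-0.1764}{0.1117} \approx -1.58
\]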

risk_sun <- table_sunny["Sun", "Happy"] / sum(table_sunny["Sun", ])

risk_nosun <- table_sunny["No Sun", "Happy"] / sum(table_sunny["No Sun", ])

RR <- risk_sun / risk_nosun
RR

## [1] 1.134146

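# Cell counts for the two-proportion test.
# no_sun_not_happy is 15 here but 38 in the Sunny data frame above;
# 15 is the value that matches the 80 consenting participants (41 + 39).
sun_happy <- 18
sun_not_happy <- 23
no_sun_happy <- 24
no_sun_not_happy <- 15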

n_sun <- sun_happy + sun_not_happy
n_no_sun <- no_sun_happy + no_sun_not_happy

p1 <- sun_happy / n_sun 
p2 <- no_sun_happy / n_no_sun

P <- (sun_happy + no_sun_happy) / (n_sun + n_no_sun)

z_statistic <- (p1 - p2) / sqrt(P * (1 - P) * ((1 / n_sun) + (1 / n_no_sun)))
z_statistic

## [1] -1.578896

p_value <- 2 * (1 - pnorm(abs(z_statistic)))
p_value

## [1] 0.11436

# Unpooled standard error of the difference, based on the individual sample proportions
SE.diff.CI <- sqrt(p1 * (1 - p1) / n_sun + p2 * (1 - p2) / n_no_sun)

# Margin of error for a 95% CI
MoE <- 1.96 * SE.diff.CI

LB <- p1 - p2 - MoE  # lower limit of 95% CI
UB <- p1 - p2 + MoE  # upper limit of 95% CI
round(cbind(LB, UB), 2)

##         LB   UB
## [1,] -0.39 0.04
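
As a cross-check on the hand calculation above, base R's prop.test() can run the same two-proportion comparison. This is a sketch rather than part of the original analysis, and the object names are illustrative. With correct = FALSE (no continuity correction), the square root of its chi-squared statistic should equal the absolute value of the z-statistic, and its interval is computed with the same unpooled standard error as the interval above.

```r
# Happy counts and group sizes, in the order Sun, No Sun (same counts as the test above)
happy <- c(18, 24)
n     <- c(41, 39)

# Two-sided test of equal proportions without continuity correction
check <- prop.test(x = happy, n = n, alternative = "two.sided",
                   conf.level = 0.95, correct = FALSE)

sqrt(unname(check$statistic))  # absolute value of the z-statistic
check$p.value                  # two-sided p-value
round(check$conf.int, 2)       # 95% CI for the sun minus no-sun difference
```

Agreement between the two approaches suggests the manual formulas were coded as intended; it does not say anything about whether the cell counts themselves are correct.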

This study aimed to determine whether sunbathing for 30 minutes affects happiness levels among students at Hofn University. After randomly sampling 100 students and obtaining consent from 80, I randomly assigned them to either a sun or no-sun group. Happiness was measured using a self-report scale, and the results showed no statistically significant difference between the two groups. The z-statistic of -1.58 and p-value of 0.11436 indicated little evidence against the null hypothesis, and the 95% confidence interval for the difference in proportions (-0.39 to 0.04) included zero, so no difference between the groups remains plausible. I therefore fail to reject the null hypothesis and cannot conclude that sun exposure has an effect on students’ happiness levels at Hofn University. These are not the results I expected or hoped for, as the literature I read before the study and my personal experience led me to expect support for the alternative hypothesis. The study met the validity conditions, including random sampling, random assignment, and the success-failure condition. I believe that these results are generalizable to the larger population of students at Hofn University at the time the sample was taken, as I used a complete sampling frame and a relatively large sample size and had a fairly high response rate of 80%. In the future, I would consider using a more nuanced happiness scale or exploring the duration or frequency of sun exposure. I would also have to consider the ethics of unprotected or prolonged sun exposure with respect to skin health. Future research questions include: Does the duration or frequency of sun exposure affect happiness more strongly? Would similar results be found at different universities or in different climates?

Bibliography

Shuai Liu, Xi Zhang, Chenghao Zhao, Happiness in the sky: The effect of sunshine exposure on subjective well-being, The Journal of Positive Psychology, 2025, ISSN 1743-9760, https://doi.org/10.1080/19485565.2025.2487977

Boriana I. Veleva, R. L. van Bezooijen, V. G. M. Chel, M. E. Numans, M. A. A. Caljouw, Effect of ultraviolet light on mood, depressive disorders and well-being, Photodermatology, Photoimmunology & Photomedicine, Volume 34, Issue 5, 2018, Pages 288–297, ISSN 0905-4383, https://doi.org/10.1111/phpp.12396

Zulfiqarali Naeem, Vitamin D deficiency - An ignored epidemic, International Journal of Health Sciences (Qassim), Volume 4, Issue 1, 2010, Pages V–VI, ISSN 1658-3639, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3068797/

Madeleina Shear