Project 1: Exploring Gender Patterns in Stress and Social Media Use Among DATA1X01 Students

An Investigation into Digital Engagement and Stress Variability

Authors

Kye Messer

Shivaani Pillai

Celina Yoon

Mengxian Xu

Haichen Jiang

Yuze Chai Chai

Executive Summary

Dear Elise Magatova,

Based on our analysis of a 28-variable survey with a 94% student response rate:

  • Social media use is only weakly correlated with stress (r ≈ 0.08).

  • Females exhibit slightly greater and more consistent stress levels than other genders.

Possible confounding variables and the extent of data cleaning required highlight the need for more in-depth investigation of these intricate relationships.

Initial Data Analysis

Data Source

The data was collected via an online census from the DATA1001/1901 unit, with a 94% response rate from students. The survey captured 28 variables related to student behaviour and experiences.

Data Structure

The dataset includes 28 variables, with our analysis focusing on three key variables: social media use (hours per day, quantitative continuous), gender (categorical, later converted to a factor), and stress level (self-reported on a 1–10 scale, converted to numeric for analysis).

Limitations

Potential issues include self-reporting inaccuracies, observer bias, and the inclusion of misreported data (e.g., extreme social media hours). These may affect confidence in the results, data interpretation, and the resulting recommendations.

Assumptions

We assumed that extreme values, such as unrealistic social media usage (e.g., 16 hours or more per day), are due to misreporting. We also assumed that respondents interpreted the survey questions uniformly.
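One way to probe the misreporting assumption is to check how much the result moves when the cutoff changes. The sketch below is illustrative only: `toy` is simulated data rather than the survey responses, and the cutoffs other than 16 are hypothetical choices.

```r
# Simulated stand-in for the survey (NOT real responses)
set.seed(42)
toy <- data.frame(
  social_media_use = c(runif(95, 0, 10), 18, 20, 22, 24, 24),
  stress           = sample(0:10, 100, replace = TRUE)
)

# Candidate cutoffs; Inf means no filtering at all
cutoffs <- c(12, 16, 24, Inf)

# For each cutoff, record how many rows survive and the resulting correlation
sensitivity <- data.frame(
  cutoff    = cutoffs,
  rows_kept = sapply(cutoffs, function(k) sum(toy$social_media_use < k)),
  r         = sapply(cutoffs, function(k) {
    kept <- toy[toy$social_media_use < k, ]
    cor(kept$social_media_use, kept$stress)
  })
)
print(sensitivity)
```

If the correlation changes little across cutoffs, the conclusion does not hinge on where the line for "unrealistic" usage is drawn.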

Data Cleaning

Data cleaning involved removing rows with missing values using drop_na(), filtering out social media use values of 16 hours or more, converting the stress variable from character to numeric, and recoding gender as a factor. The code chunk below demonstrates these steps:

# Read and clean data
library(tidyverse)
datas <- read.csv("data1001_survey_data_2025_S1.csv")
datas <- drop_na(datas)
datas <- filter(datas, social_media_use < 16)

# Transform variables
datas$stress <- as.numeric(datas$stress)
datas$gender <- factor(datas$gender)
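To make the effect of each cleaning step visible, the row count can be recorded after every stage. The following is a minimal sketch on a hand-made data frame: `toy` is hypothetical, not the survey file, and base R's complete.cases() is used here as a stand-in for drop_na().

```r
# Hypothetical toy data standing in for the survey file (not real responses)
toy <- data.frame(
  social_media_use = c(2, 5, NA, 24, 3.5, 16, 1),
  stress           = c("6", "7", "5", "9", NA, "4", "3"),
  gender           = c("Female", "Male", "Female", "Male",
                       "Female", "Male", "Non-binary"),
  stringsAsFactors = FALSE
)
n_raw <- nrow(toy)

# Step 1: drop rows with any missing value (base R analogue of drop_na())
complete   <- toy[complete.cases(toy), ]
n_complete <- nrow(complete)

# Step 2: keep rows strictly below 16 hours, matching the report's filter
kept   <- complete[complete$social_media_use < 16, ]
n_kept <- nrow(kept)

# Step 3: convert variable types for analysis
kept$stress <- as.numeric(kept$stress)
kept$gender <- factor(kept$gender)

# Summary of how many rows remain after each step
audit <- data.frame(
  step = c("raw", "complete cases", "under 16 hours"),
  rows = c(n_raw, n_complete, n_kept)
)
print(audit)
```

Reporting these counts alongside the analysis shows how much data each assumption removes. Note that a strict `< 16` filter also drops responses of exactly 16 hours.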

Research Question 1

Is there a difference in self-reported stress levels between gender groups among DATA1X01 students?


In general, females report slightly higher median stress levels than other groups, in line with Backović et al. (2012). The variability observed across genders highlights the need for more specific questions or standardised measures, such as the Maslach Burnout Inventory-Student Survey (MBI-SS), to better understand the relationship between gender and stress. Figure 1 does, however, warrant further investigation into the reduction of student stress.

library(tidyverse)  # also loads ggplot2

# Read and clean the survey data
datas <- read.csv("data1001_survey_data_2025_S1.csv")
datas <- drop_na(datas)
datas <- filter(datas, social_media_use < 16)

# Store the converted variables in the data frame so that ggplot() uses them
datas$stress <- as.numeric(datas$stress)
datas$gender <- factor(datas$gender)

# Horizontal boxplots of stress by gender
ggplot(datas, aes(x = stress, y = gender, fill = gender)) +
  geom_boxplot() +
  labs(title = "Figure 1: Understanding Perceived Stress and Gender", x = "Stress", y = "Gender")

Figure 1 illustrates the relationship between students’ stress levels and gender, with stress measured on a scale from 0–10. The data is categorised into four gender groups: female, male, non-binary/third gender, and prefer not to say. The median stress level for females is 6.25, whereas the median stress level for males is 5. Females exhibit the smallest IQR of all gender groups, a finding also reported by Worly et al. (2018), which indicates more consistent stress levels within this group. The lack of a strong association between gender and stress may be due to the subjective nature of self-reported stress levels, particularly as other studies report notably higher perceived stress in females (p < 0.05) (Ashraf and Nawaz, 2022). The question is also phrased broadly, so participants may interpret “stress” differently and differ in their capacity for introspection.
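The group medians and IQRs discussed above can be computed directly rather than read off the boxplot. This is a minimal sketch, assuming a data frame with `gender` and `stress` columns; the values in `toy` were made up by hand for illustration and are not survey responses.

```r
# Hypothetical stress scores by gender (illustrative only, not survey data)
toy <- data.frame(
  gender = c("Female", "Female", "Female", "Female",
             "Male", "Male", "Male", "Male"),
  stress = c(6, 6.5, 7, 6, 2, 5, 5, 9)
)

# Median and interquartile range of stress within each gender group
med <- tapply(toy$stress, toy$gender, median)
iqr <- tapply(toy$stress, toy$gender, IQR)

# Collect the per-group statistics in one table
summary_table <- data.frame(
  gender = names(med),
  median = as.numeric(med),
  iqr    = as.numeric(iqr)
)
print(summary_table)
```

A smaller IQR for one group means its middle 50% of responses sit closer together, which is the sense in which stress is described as "more consistent" for that group.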

Contextualising Findings

In contrast to Graves et al. (2021), who found significantly higher stress in females, DATA1X01 responses span the full range of the scale, which weakens the association between perceived stress and gender. Central tendencies appear similar across groups, but the interquartile ranges differ markedly, suggesting that the groups diverge more in the spread of stress than in its typical level.

Research Question 2

What is the association between daily social media use and self-reported stress levels among DATA1X01 students?

Preliminary analysis indicates that higher daily social media use is weakly correlated with higher self-reported stress among students. This association may arise from intensified social comparison, fear of missing out (FOMO), and information overload, which can heighten stress (Hall et al., 2021; Matthes et al., 2019). Our findings (see Figure 2) point to the possible value of fostering mindful usage and exploring targeted interventions to mitigate these effects.

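library(tidyverse)  # also loads ggplot2

# Read and clean the survey data
datas <- read.csv("data1001_survey_data_2025_S1.csv")
datas <- drop_na(datas)
datas <- filter(datas, social_media_use < 16)

# Linear model with social media use as the response and stress as the predictor
# (the fitted line in Figure 2 instead regresses stress on social media use)
lm(social_media_use ~ stress, data = datas)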

Call:
lm(formula = social_media_use ~ stress, data = datas)

Coefficients:
(Intercept)       stress  
    3.79196      0.09089  
ggplot(datas, aes(x = social_media_use, y = stress)) +
  geom_point(position = "jitter") +
  geom_smooth(method = "lm", se = FALSE) +
  labs(title = "Figure 2: Exploring Social Media Use and Perceived Stress",
       x = "Hours of social media use", y = "Perceived stress level")
`geom_smooth()` using formula = 'y ~ x'

cor(datas$stress, datas$social_media_use)
[1] 0.08260798
# Refit the model and plot its residuals against the fitted values
model <- lm(social_media_use ~ stress, data = datas)

ggplot(model, aes(x = .fitted, y = .resid)) +
  geom_point(position = "jitter") +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "red") +
  labs(x = "Fitted values", y = "Residuals", title = "Residual Plot")

The correlation between social media use and stress in Figure 2 is weakly positive (r ≈ 0.08), and the scatterplot shows no strong linear pattern, suggesting that increased social media use may not meaningfully raise stress. The clustering of points, coupled with heteroscedastic residuals, indicates that a linear model is not the best fit for these data. Confounders such as work obligations, family responsibilities, and broader social and health influences could shape both stress levels and social media use. Further analysis of these confounding variables could clarify these relationships and their impact on stress outcomes.
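The fitted model above regresses social media use on stress, while the line in Figure 2 regresses stress on social media use, so it may help to see how the two slopes relate to the correlation coefficient. The sketch below uses simulated data (`toy` and its generating parameters are hypothetical, not the survey) and also shows cor.test() as one way to attach a confidence interval to r.

```r
# Simulated stand-in data (NOT the survey): weak positive relationship
set.seed(1)
n <- 200
social_media_use <- runif(n, 0, 15)
stress <- pmin(pmax(round(5 + 0.05 * social_media_use + rnorm(n, sd = 2)), 0), 10)
toy <- data.frame(social_media_use, stress)

# Pearson correlation with a 95% confidence interval and p-value
ct <- cor.test(toy$stress, toy$social_media_use)
r  <- unname(ct$estimate)

# Fit the regression in both directions
fit_stress <- lm(stress ~ social_media_use, data = toy)  # stress as response
fit_use    <- lm(social_media_use ~ stress, data = toy)  # use as response

slope_stress <- unname(coef(fit_stress)["social_media_use"])
slope_use    <- unname(coef(fit_use)["stress"])

# Each slope is r rescaled by a ratio of standard deviations,
# so the product of the two slopes equals r squared
print(all.equal(slope_stress, r * sd(toy$stress) / sd(toy$social_media_use)))
print(all.equal(slope_stress * slope_use, r^2))

print(ct$conf.int)
```

The correlation is the same whichever variable is treated as the response, but the slope and intercept are not, so reported coefficients should be read alongside the direction of the model.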

Contextualising Findings

Our analysis of DATA1X01 students shows a weak positive correlation between daily social media use and stress, and the patterned residuals suggest that a simple linear model misses other influences, possibly confounders. This is consistent with Nygaard et al. (2024), who reported a modest association with stress at baseline but no longitudinal effects, and it emphasises the need for more research into mediating factors.

Acknowledgements

ISI Professional Ethics Compliance

Through explicit documentation of our data cleaning, analytic techniques, and limitations, we upheld the ISI shared value of transparency. Additionally, by aiming for objective, reproducible results throughout our work, we complied with the ethical principle of professional integrity.

Meetings

Our group met once, on 19/3/2025 from 17:00 to 18:00, to establish a timeline and delegate tasks.

The most important outcome of this meeting was the decision to collect all drafts and code chunks in a Google Doc, so that Kye could compile and format them into a Quarto document. We then briefly discussed how we would carry out the tasks, for example, deciding which graphical representations would best suit our data.

References

Ashraf, Hira, and Noman Nawaz. “A Comparative Study of Life Satisfaction and Psychological Stress Levels Among Male and Female Allied Health College Students.” Journal of Social & Health Sciences 1 (December 29, 2022): 22–29. https://doi.org/10.58398/0001.000004.

Backović, Dušan V, Jelena Ilić Zivojinović, Jadranka Maksimović, and Miloš Maksimović. “Gender Differences in Academic Stress and Burnout Among Medical Students in Final Years of Education.” Psychiatria Danubina 24, no. 2 (June 1, 2012): 175–81. https://pubmed.ncbi.nlm.nih.gov/22706416.

Graves, B. Sue, Michael E. Hall, Carolyn Dias-Karch, Michael H. Haischer, and Christine Apter. “Gender Differences in Perceived Stress and Coping Among College Students.” PLoS ONE 16, no. 8 (August 12, 2021): e0255634. https://doi.org/10.1371/journal.pone.0255634.

Hall, Jeffrey A., Ric G. Steele, Jennifer L. Christofferson, and Teodora Mihailova. “Development and Initial Evaluation of a Multidimensional Digital Stress Scale.” Psychological Assessment 33, no. 3 (January 28, 2021): 230–242. https://doi.org/10.1037/pas0000979.

Matthes, Jörg, Kathrin Karsay, Desirée Schmuck, and Anja Stevic. “‘Too Much to Handle’: Impact of Mobile Social Networking Sites on Information Overload, Depressive Symptoms, and Well-being.” Computers in Human Behavior 105 (December 4, 2019): 106217. https://doi.org/10.1016/j.chb.2019.106217.

Nygaard, Mette, Thea Otte Andersen, and Naja Hulvej Rod. “Can Social Connections Become Stressful? Exploring the Link Between Social Media Use and Perceived Stress in Cross-sectional and Longitudinal Analyses of 25,053 Adults.” Journal of Mental Health (March 28, 2024): 1–9. https://doi.org/10.1080/09638237.2024.2332802.

Worly, Brett, Nicole Verbeck, Curt Walker, and Daniel M Clinchot. “Burnout, Perceived Stress, and Empathic Concern: Differences in Female and Male Millennial Medical Students.” Psychology Health & Medicine 24, no. 4 (October 7, 2018): 429–438. https://doi.org/10.1080/13548506.2018.1529329.

Various workshop queries.

AI Statement

This report was developed with the assistance of ChatGPT, specifically the o3-mini-high model. AI was used throughout the project period (10/3/2025 - 21/4/2025), in particular for formatting responses prior to final submission.