Author

550845168 ; 530511074 ; 550157094 ; 550358901 ; 550391160 ; 550603465


Exploring How Personal Interest and Stress Level Correlate with Time Spent Studying in University Students



Executive summary

A box plot showed that students with greater interest in Data Science generally reported more study hours per week, indicating that intrinsic motivation is linked to stronger study habits. The scatterplot of stress against study hours, however, showed a negligible relationship, suggesting that perceived stress does not meaningfully predict weekly study time.


Exploratory Data Analysis (EDA)

Code
library(tidyverse)  # %>%, dplyr verbs, drop_na() and ggplot2
library(showtext)   # showtext_auto(), used for the plot fonts

df <- read.csv("data1001_survey_data_2025.csv")

clean_data <- df %>%
  filter(
    consent == "I consent to take part in the study",
    # remove implausible ages such as 4 by keeping only ages above 17
    age > 17,
    # remove hours above 22, based on various universities' recommended unit study hours per week
    hours_studying <= 22,
    # keep goals between 50 and 100, as we are only interested in participants who intend to pass the course
    mark_goal >= 50, mark_goal <= 100
  ) %>%
  select(mark_goal, hours_studying, data_interest, stress) %>%
  drop_na()

clean_data_numeric <- clean_data %>%
  mutate(
    stress = as.numeric(stress),
    hours  = as.numeric(hours_studying)
  ) %>% 
  drop_na()

The data was sourced from a survey completed by 2955 DATA1X01 students during Semester 2 2025, covering 28 variables about their university life. For our analysis, we focused on three variables: students’ interest in Data Science from 0 (No Interest) to 10 (Extremely Interested), hours students hoped to study DATA1X01 per week, and student self-reported stress from 0 (No Stress) to 10 (Worst Stress Imaginable). We classified the variables as quantitative discrete, quantitative continuous, and quantitative discrete, respectively.

Limitations

Potential limitations include the integrity of answers, as some students could have given unrealistic or joke responses. Study hours could also be influenced by course content difficulty, acting as a confounding variable. Finally, some students did not consent to participate, which reduced the sample size and may limit how representative the data is.

Assumptions

We assumed that students provided honest and reasonable responses and that reported study hours reflect actual studying time (not time spent distracted, multitasking, or attending lectures). We cleaned the data to include only consenting responses and restricted study hours to a plausible range of 0–22 per week to make the results more reliable.
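To see how much each cleaning rule costs in sample size, the filters can be applied one at a time while counting the rows that remain. The sketch below is illustrative only: it uses base R and a small invented data frame (`toy`) whose column names mirror the cleaning code, not the actual survey file.

```r
# Toy stand-in for the survey data. Column names follow the cleaning code,
# but every value here is invented for illustration.
toy <- data.frame(
  consent        = c(rep("I consent to take part in the study", 7), "I do not consent"),
  age            = c(19, 4, 18, 21, 17, 20, 22, 19),
  hours_studying = c(5, 6, 40, 8, 3, 10, 2, 4),
  mark_goal      = c(75, 80, 65, 30, 90, 85, NA, 70)
)

# The same conditions as the filter() call, kept as unevaluated expressions
# so they can be applied one after another.
steps <- list(
  consent   = quote(consent == "I consent to take part in the study"),
  age       = quote(age > 17),
  hours     = quote(hours_studying <= 22),
  mark_goal = quote(mark_goal >= 50 & mark_goal <= 100)
)

remaining <- toy
retained <- c(raw = nrow(toy))
for (name in names(steps)) {
  keep <- eval(steps[[name]], remaining)
  # Like dplyr::filter(), treat NA conditions as "do not keep".
  remaining <- remaining[!is.na(keep) & keep, , drop = FALSE]
  retained[name] <- nrow(remaining)
}

# Rows left after each successive step.
print(retained)
```

Running the same loop on the real data frame would show which rule removes the most responses, which helps judge how representative the cleaned sample is.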


Research Question 1

How does interest in data science correlate with time spent studying DATA1X01?

This boxplot shows the relationship between students’ self-rated interest in Data Science (0–10) and their reported weekly study hours. There is a positive association between them: the higher the interest, the higher the median weekly study time. Students with higher interest levels also show greater variation in their study time, with some studying over 10 hours per week while others stay within a range of 4 to 6 hours.

Code
showtext_auto()

ggplot(clean_data, aes(x = factor(data_interest), y = hours_studying)) +
  geom_boxplot(fill = "darkseagreen", size = 0.8, colour = "darkorchid4", alpha = 0.5) +
  labs(
    title = "Reported Hours of Study Per Week by Interest Level",
    x = "Interest in Data Science",
    y = "Reported Hours of Study per Week"
  ) +
  theme_bw(base_family = "mono") +
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5, size = 30, colour = "#1f4e5f"),
    axis.title = element_text(face = "bold", size = 14),
    axis.text = element_text(face = "bold", size = 12)
  )

Students with low interest (0–2) report relatively few study hours, with medians around 2–3 hours per week. From interest levels 3–6, study hours gradually increase, with medians shifting upward to 4–5 hours. At high interest (7–10), both the median and spread of study hours increase noticeably, with medians reaching 5–6 hours and more variability (some students studying 15+ hours). Outliers exist across all levels, showing a few students report unusually high study hours regardless of interest.

To summarize, higher interest in Data Science is generally associated with more study hours, and students with stronger interest display greater variation in study time. This suggests that intrinsic motivation is associated not only with higher typical effort but also with a wider range of engagement.
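The medians and spreads read off the boxplot can be backed by a numerical summary per interest level, and a rank correlation is one reasonable way to quantify a monotonic trend between two ordinal or discrete variables. The following is a minimal sketch in base R on simulated data; the data frame `toy`, the simulated trend, and the choice of Spearman's coefficient are assumptions for illustration, not part of the original analysis.

```r
set.seed(1)

# Simulated stand-in: 20 students at each interest level 0-10, with study
# hours that rise with interest plus noise (invented values, clipped to 0-22).
toy <- data.frame(data_interest = rep(0:10, each = 20))
toy$hours_studying <- pmin(22, pmax(0,
  round(2 + 0.4 * toy$data_interest + rnorm(nrow(toy), sd = 2))
))

# Count, median and interquartile range of study hours at each interest level.
summary_by_interest <- aggregate(
  hours_studying ~ data_interest,
  data = toy,
  FUN = function(h) c(n = length(h), median = median(h), iqr = IQR(h))
)
# aggregate() returns a matrix column; flatten it into ordinary columns.
summary_by_interest <- do.call(data.frame, summary_by_interest)
print(summary_by_interest)

# Spearman's rank correlation between interest level and study hours.
rho <- cor(toy$data_interest, toy$hours_studying, method = "spearman")
print(rho)
```

Applied to `clean_data`, the same two steps would give the exact medians and spreads behind the verbal description above.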

Research Question 2

Does perceived stress predict weekly study hours among DATA1X01 students?

The scatterplot of perceived stress against weekly study hours shows points widely dispersed with no clear pattern. The fitted regression line has a slight positive slope, suggesting that higher stress levels are associated with slightly more study time. The model estimated a slope of 0.13 hours per stress unit, with an intercept of 4.55 hours. The correlation was r = 0.035, and the model’s R² = 0.003 indicates that stress explains less than 0.3% of the variance in study hours.

Code
# Plot

showtext_auto()

ggplot(clean_data_numeric, aes(stress, hours)) +
  geom_point(alpha = 0.15, size = 3, colour = "darkgreen") +
  geom_smooth(method = "lm", se = TRUE, colour = "darkorchid4", fill = "violet") +
  labs(
    x = "Perceived stress",
    y = "Weekly study hours",
    title = "Reported Hours of Study Per Week vs Stress"
  ) +
  theme_bw(base_family = "mono") +
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5, size = 30, colour = "#1f4e5f"),
    axis.title = element_text(face = "bold", size = 14),
    axis.text = element_text(face = "bold", size = 12)
  )

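Although the slope is statistically significant (p = 0.003), the effect size is negligible; in a large sample, even a very small slope can reach significance. The residuals versus fitted values plot shows a wide scatter with no structure, which indicates that a straight-line fit is not missing an obvious pattern, while the width of the scatter is consistent with the model's very weak explanatory power.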
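The quantities quoted for this model (slope, intercept, p-value, r and R²) can all be extracted from a single fitted `lm` object. The sketch below shows one way to do so in base R on simulated data; the data frame `toy` and its parameters are invented for illustration and will not reproduce the reported values. It also shows that, in a regression with one predictor, R² equals the square of Pearson's r, which offers a quick consistency check on reported figures.

```r
set.seed(42)

# Simulated stand-in for clean_data_numeric: a weak linear trend in noisy
# data (all values invented for illustration).
n <- 500
toy <- data.frame(stress = sample(0:10, n, replace = TRUE))
toy$hours <- 4.5 + 0.1 * toy$stress + rnorm(n, sd = 3)

# Simple linear regression of study hours on stress.
fit <- lm(hours ~ stress, data = toy)
coefs <- summary(fit)$coefficients

intercept <- coefs["(Intercept)", "Estimate"]
slope     <- coefs["stress", "Estimate"]
p_value   <- coefs["stress", "Pr(>|t|)"]

# Pearson correlation and the model's R-squared.
r         <- cor(toy$stress, toy$hours)
r_squared <- summary(fit)$r.squared

print(c(intercept = intercept, slope = slope, p_value = p_value,
        r = r, r_squared = r_squared))

# With one predictor, R-squared is the squared Pearson correlation.
print(all.equal(r^2, r_squared))
```

A residuals versus fitted plot for the same model can be drawn with `plot(fitted(fit), resid(fit))`.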

In practical terms, perceived stress does not meaningfully predict how much students study each week. For the client, this suggests that reducing stress alone is unlikely to change study behaviours; combining wellbeing support with strategies that build motivation and study skills may be more effective.


Articles

Naeem, I., Aparicio-Ting, F. E., & Dyjur, P. (2020). Student Stress and Academic Satisfaction: A Mixed Methods Exploratory Study. International Journal of Innovative Business Strategies, 6(1), 388–395. https://doi.org/10.20533/ijibs.2046.3626.2020.0050

Zubair, T., Qazi, U., Faisal, S. M., & Khalid Khan, A. (2024). The impact of study hours on academic performance: A statistical analysis of students’ grades. International Journal of Multidisciplinary Research and Growth Evaluation, 5(3), 720–728. https://doi.org/10.54660/.ijmrge.2024.5.3.720-728


Acknowledgements

AI Usage Statement

We acknowledge using generative AI tools, specifically ChatGPT 5, to help prepare this report. The tool was used to check grammar, improve the structure, and make our R code clearer. While it supported drafting and expression, all analysis and conclusions are our own work and judgment. ChatGPT also helped explain complex concepts in a straightforward way, making them easier to communicate and understand.

Meeting Schedule

Date | Time | Attendance | Minutes
05/09 | 13:00 - 14:00 | Astha, Brent, Kris, Jarif, Josephine
  • Brainstormed research questions
  • Established a project timeline
  • Allocated roles

09/09 | 16:00 - 17:00 | Astha, Deevesh, Kris, Jarif
  • Reviewed progress on Research Question 1
  • Changed Research Question 2
  • Decided on the limitations and assumptions
  • Planned tasks for the next week

17/09 | 17:00 - 18:30 | Astha, Brent, Deevesh, Kris, Jarif, Josephine
  • Reviewed progress on Research Questions 1 and 2
  • Worked on the professional standard of report and acknowledgements
  • Wrote the executive summary
  • Split tasks for the presentation
  • Tabulated group contribution and meeting hours
  • Compiled everything to R and completed final edit

Group Contribution

Task | Group Member Allocated
Executive Summary | All
Exploratory Data Analysis | Astha, Jarif
Research Question 1 | Josephine, Kris
Research Question 2 | Deevesh, Kris, Josephine
Articles 1 and 2 | Astha, Brent, Josephine
Acknowledgement | All
Professional Standard of Report | Astha, Brent, Jarif
Presentation | All

Professional Standard of Report

We adhered to the shared value of truthfulness and integrity by ensuring our analyses relied solely on the data, avoiding bias or preconceived conclusions. We also followed the ethical principle of transparency, exposing our data cleaning steps, displaying R code used for key findings, and documenting limitations so that our methods and results are clear and reproducible.