# Load packages (tidyverse already attaches readr, dplyr and ggplot2)
library(tidyverse)
library(plotly)

# Read the raw survey export
data1001_survey_data_2026S1_1_1_ <- read_csv("data1001_survey_data_2026S1-1 (1).csv")

# Keep consenting respondents with a valid learner style and in-range scores
filtered_data <- data1001_survey_data_2026S1_1_1_ %>%
  filter(consent == "I consent to take part in the study") %>%
  filter(learner_style %in% c("Style 1", "Style 2", "Style 3")) %>%
  filter(data_interest >= 0 & data_interest <= 10) %>%
  filter(stress >= 0 & stress <= 10) %>%
  na.omit()  # drops rows with a missing value in any column
Recommendation/Insight
We aimed to examine the relationship between interest in DATA1X01 and learning-styles, to determine whether styles may predict wellbeing outcomes due to an indirect relationship between interest and stress.
Styles predict student interest, but not stress, with no indirect pathway between interest and stress levels.
Holistic approaches to addressing student wellbeing are needed.
Evidence
Overview
Data was sourced from a survey of the 2026 University of Sydney DATA1X01 cohort, which explored 26 different variables and was completed by 4099 consenting students.
Analysed Variables:
Learner Style (Qualitative - nominal)
DATA1X01 Interest Level (Quantitative - discrete)
Stress (Quantitative - discrete)
Limitations
Data was collected in Week 1, and thus does not capture changes over the semester. Additionally, there is subjectivity in the self-report method of collection, with perceptions of “stress”, “interest” and “learning-style” varying between individuals. Finally, since only consenting responses were analysed, our data represents a sample of the total cohort, reducing its generalisability.
Assumptions
We assumed that all responses were truthful, independent and reflected sufficient levels of self-insight, thereby minimising systematic bias. Furthermore, we assumed the survey to be valid, with high test-retest reliability. For Analysis 1, we assumed learning-style options encompassed all student preferences - an overgeneralisation, as approaches to learning may extend beyond these three categories. For Analysis 2, we assumed student interest level to influence stress, though causality could also be reversed.
Data Cleaning
The raw dataset consisted of 6504 entries. Participants who did not consent were excluded, leaving 4099 responses for analysis. The dataset was filtered to omit non-attempts at the Interest Level and Stress score questions.
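One way to check the row counts reported at each cleaning step is to record how many rows survive each rule. The sketch below is illustrative only: it uses base R so it runs without extra packages, borrows the column names from the report's code, and works on a small invented data frame (including a hypothetical non-consent label), not the survey data.

```r
# Hypothetical toy data using the column names from the report's code;
# every value here is invented for illustration only.
toy <- data.frame(
  consent = c("I consent to take part in the study",
              "I consent to take part in the study",
              "I do not consent",  # hypothetical non-consent label
              "I consent to take part in the study"),
  learner_style = c("Style 1", "Style 2", "Style 3", "Style 1"),
  data_interest = c(7, 4, 5, NA),
  stress        = c(6, 11, 3, 2),
  stringsAsFactors = FALSE
)

# Apply each cleaning rule in turn and record the rows remaining.
step_counts <- c(raw = nrow(toy))

consented <- toy[toy$consent == "I consent to take part in the study", ]
step_counts["consented"] <- nrow(consented)

# which() drops NA comparisons, mirroring how dplyr::filter() treats them.
in_range <- consented[which(
  consented$data_interest >= 0 & consented$data_interest <= 10 &
    consented$stress >= 0 & consented$stress <= 10
), ]
step_counts["in_range"] <- nrow(in_range)

print(step_counts)
```

Applied to the real data, the same pattern would show how many responses each filter removes, which makes the reported totals easier to audit.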
Research Question 1
Do certain learner styles correlate with higher interest levels in the DATA1X01 unit?
Code
# Boxplot of interest level for each learner style
boxplot_learnervdata <- ggplot(filtered_data, aes(x = learner_style, y = data_interest)) +
  geom_boxplot(colour = "deeppink1", fill = "lightpink", linewidth = 1) +
  theme_classic() +
  labs(title = "Interest Level by Learner Style", x = "Learner Style", y = "Interest Level")

# Render as an interactive plotly widget
ggplotly(boxplot_learnervdata, tooltip = "text")
Style 1 displayed the highest interest (median = 7), compared to Style 2 (median = 4) and Style 3 (median = 5). Variability within each group was consistently moderate (Style 1 IQR = 3, Style 2 IQR = 3, Style 3 IQR = 4), making self-described learning-styles a moderately reliable predictor of student interest. Studies have documented the beneficial effect of proximal goal setting on students’ perceived self-efficacy, which is positively related to intrinsic interest (Bandura & Schunk, 1981). Thus, Style 1’s high interest may reflect how critical-thinking approaches better equip students to be proximally challenged in DATA1X01. Moreover, Style 2’s reliance on memorisation may be less effective for problem-solving in DATA1X01; by rendering challenges more distal, it may impede the development of the self-efficacy that promotes intrinsic interest. Finally, Style 3, which did not have a specified problem-solving approach, may instead hybridise conceptualising (Style 1) and rote-learning (Style 2), resulting in a degree of proximal challenge that fosters interest levels between those of Styles 1 and 2.
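The per-style medians and IQRs quoted above can be computed directly rather than read off the boxplot. The following is a minimal base R sketch, assuming the same column names as the report's code; the data frame is invented for illustration and does not reproduce the survey values.

```r
# Hypothetical interest scores; invented values, not the survey data.
toy <- data.frame(
  learner_style = rep(c("Style 1", "Style 2", "Style 3"), each = 5),
  data_interest = c(5, 6, 7, 8, 9,   2, 3, 4, 5, 6,   3, 4, 5, 7, 8)
)

# Median and interquartile range of interest within each learner style.
medians <- tapply(toy$data_interest, toy$learner_style, median)
iqrs    <- tapply(toy$data_interest, toy$learner_style, IQR)

# Collect both statistics into one table, one row per style.
summary_table <- data.frame(
  learner_style = names(medians),
  median = as.numeric(medians),
  iqr = as.numeric(iqrs)
)
print(summary_table)
```

Replacing `toy` with `filtered_data` would give the summary statistics for the cleaned survey responses.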
Research Question 2
Is there a negative linear correlation between students’ level of interest in DATA1X01 and their reported stress levels?
Code
# Stress vs interest, one panel per learner style
style_interest_stress <- ggplot(filtered_data, aes(x = data_interest, y = stress)) +
  geom_point(colour = "pink") +
  facet_wrap(~ learner_style) +
  geom_smooth(method = "lm", colour = "hotpink", se = FALSE) +
  theme_classic() +
  labs(title = "Stress by Data Interest and Learner Style", x = "Data Interest", y = "Stress Level")

# Stress vs interest for all respondents
scatterplot_interestvstress <- ggplot(filtered_data, aes(x = data_interest, y = stress)) +
  geom_point(colour = "black") +
  geom_smooth(method = "lm", colour = "hotpink", se = FALSE) +
  theme_classic() +
  labs(title = "Stress by Data Interest", x = "Data Interest", y = "Stress Level")

# Simple linear regression of stress on interest
linear_interestvstress <- lm(stress ~ data_interest, data = filtered_data)

# Residuals vs fitted values, used to check for homoscedasticity
residual_interestvstress <- ggplot(linear_interestvstress, aes(x = .fitted, y = .resid)) +
  geom_point(colour = "black") +
  geom_hline(yintercept = 0, colour = "purple", linetype = "dashed") +
  coord_cartesian(ylim = c(-10, 10)) +
  theme_classic() +
  labs(title = "Stress vs Data Interest Residual Plot", x = "Fitted Values", y = "Residuals")
The scatterplot of stress against interest level displayed no clustering, a feature mirrored in the low correlation (r = 0.00538, 3 s.f.). Thus, despite the residual plot displaying homoscedasticity, linear regression was not a suitable model to relate the variables, and the regression line showed no discernible indirect relationship, a conclusion that held even when the data were split by learner style. This indicates that student stress is multifaceted, and more heavily modulated by factors other than subjective unit interest. Recent analytic studies support this, identifying performance and career, as opposed to interest, as the key determinants of the negative academic experiences that most contributed to student stress (de Filippis & Foysal, 2024). Alongside these poor academic experiences, psychological, physiological and environmental conditions were also noted to be significant risk factors.
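The correlation coefficient reported above does not appear in the code shown, so the sketch below illustrates one way it could be obtained with base R's `cor()`, both overall and within each learner style. The data frame is invented for illustration and the column names are assumed to match the report's code; the printed values will not match the survey results.

```r
# Hypothetical scores; invented values, not the survey data.
toy <- data.frame(
  learner_style = rep(c("Style 1", "Style 2"), each = 4),
  data_interest = c(2, 4, 6, 8,  1, 3, 5, 7),
  stress        = c(5, 7, 3, 5,  6, 4, 6, 4)
)

# Pearson correlation coefficient across all respondents.
r_overall <- cor(toy$data_interest, toy$stress, method = "pearson")

# The same coefficient within each learner style.
r_by_style <- sapply(
  split(toy, toy$learner_style),
  function(d) cor(d$data_interest, d$stress)
)

# For simple linear regression, R-squared equals the squared correlation.
fit <- lm(stress ~ data_interest, data = toy)
r_squared <- summary(fit)$r.squared

print(signif(r_overall, 3))  # rounded to 3 significant figures
print(r_by_style)
```

Because R-squared equals the squared correlation in a one-predictor model, a correlation near zero implies the fitted line explains almost none of the variation in stress, which is consistent with the conclusion that a linear model is unsuitable here.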
Articles
Bandura, A., & Schunk, D. H. (1981). Cultivating competence, self-efficacy, and intrinsic interest through proximal self-motivation. Journal of Personality and Social Psychology, 41(3), 586–598. https://psycnet.apa.org/record/1982-07527-001
de Filippis, R., & Foysal, A. A. (2024). Comprehensive analysis of stress factors affecting students: A machine learning approach. Discover Artificial Intelligence, 4, 62. https://link.springer.com/article/10.1007/s44163-024-00169-6
Claude Sonnet 4.6 (developed by Anthropic) was used during this project (https://claude.ai). Claude was used to assist with debugging and formatting RStudio code chunks for the graphs, customising HTML theme, and creating interactive Quarto tabs. AI was not used to generate written analysis, interpret results, or source literature.
Professional Statement
We have upheld the shared professional value of respect in ensuring subject confidentiality by omitting non-consenting respondents, whilst operating without detracting from the work of others. Additionally, our report maintains the ethical principle of pursuing objectivity, seeking to transparently analyse and report on the data with acknowledgement of limits and assumptions.