Introduction
This study investigates how three lifestyle factors (screen time, caffeine intake, and physical activity) relate to sleep quality in students. Sleep quality was measured on an ordinal scale from 1 to 10, reflecting levels from “very poor” to “excellent.” Ordinal regression was chosen to analyze this outcome because it preserves the ordered nature of the dependent variable and models predictors’ effects across multiple thresholds of sleep quality. The data come from a survey of randomly selected students who reported their daily screen time, caffeine consumption, and frequency of physical activity (Jamal, 2023).
The analysis centers on whether screen time significantly predicts sleep quality after accounting for caffeine intake and physical activity. The results provide insight into the broader question of how lifestyle behaviors influence health and well-being.
Ordinal regression suits this question better than linear regression, which treats the outcome as continuous and equally spaced. The working hypothesis is that screen time significantly predicts sleep quality when controlling for caffeine intake and physical activity.
Methodology
Ordinal regression models the relationship between predictors and an outcome measured on an ordinal scale. Unlike linear regression, which assumes a continuous response variable, ordinal regression accounts for the ranked but discrete nature of ordinal outcomes, such as the sleep quality scale in this study. The specific model used here is the proportional odds (cumulative logit) model, which assumes that each predictor has the same effect on the log-odds of falling at or below a category, whichever threshold between categories is considered.
In the proportional odds model, thresholds act as cut-points on an underlying latent continuous variable. Each threshold marks the boundary between two adjacent categories, such as between “poor” and “fair” or between “very good” and “excellent” sleep quality. Each predictor coefficient estimates the change in the log-odds of exceeding any given threshold for a one-unit increase in that predictor, and the logit link relates the cumulative probabilities to a linear function of the predictors.
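Written out for this study's three predictors, the model can be sketched as follows. The form assumes the parameterization used by `clm` in the `ordinal` package, where the linear predictor is subtracted from the threshold; the symbols $Y$, $\theta_j$, and $\beta_k$ are notation introduced here for illustration.

```latex
\[
\operatorname{logit} P(Y \le j)
  = \log \frac{P(Y \le j)}{P(Y > j)}
  = \theta_j - \left( \beta_1\,\text{Screen\_Time}
      + \beta_2\,\text{Caffeine\_Intake}
      + \beta_3\,\text{Physical\_Activity} \right),
  \qquad j = 1, \dots, 9
\]
```

Under this sign convention a positive coefficient lowers $P(Y \le j)$ and so shifts probability toward higher sleep quality levels. The same $\beta_k$ appear at every threshold $\theta_j$, which is the proportional odds assumption.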
The validity of the proportional odds model rests on several key assumptions. First, the proportional odds assumption requires that the effects of predictors remain consistent across all levels of the outcome variable. Second, the observations must be independent, ensuring no hidden correlations distort the results. Lastly, the predictors should not exhibit multicollinearity, as excessive correlations can undermine the interpretability of coefficients. The first and third assumptions were checked with diagnostic procedures: a nominal test of proportional odds and variance inflation factors (VIFs).
Ordinal regression is particularly well-suited for studies like this one, where the goal is to understand not just whether a predictor influences the outcome but how it affects the distribution of responses across ordered categories. By focusing on thresholds, this method provides a detailed view of how predictors interact with the ordinal structure of the outcome variable.
library(readr)    # read_csv()
library(ordinal)  # clm(), nominal_test()
student_sleep_patterns <- read_csv("/Users/jinyu/Downloads/student_sleep_patterns.csv")
## Rows: 500 Columns: 14
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Gender, University_Year
## dbl (12): Student_ID, Age, Sleep_Duration, Study_Hours, Screen_Time, Caffein...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Treat the 1-10 sleep quality rating as an ordered factor
student_sleep_patterns$Sleep_Quality <- factor(student_sleep_patterns$Sleep_Quality,
                                               levels = as.character(1:10),
                                               ordered = TRUE)
# Fit the proportional odds (cumulative logit) model
model_ordinal <- clm(Sleep_Quality ~ Screen_Time + Caffeine_Intake + Physical_Activity,
                     data = student_sleep_patterns)
summary(model_ordinal)
## formula: Sleep_Quality ~ Screen_Time + Caffeine_Intake + Physical_Activity
## data: student_sleep_patterns
##
##  link  threshold nobs logLik   AIC     niter max.grad cond.H
##  logit flexible  500  -1145.51 2315.01 4(0)  4.31e-08 7.1e+05
##
## Coefficients:
##                     Estimate Std. Error z value Pr(>|z|)
## Screen_Time        0.0175555  0.0907781   0.193    0.847
## Caffeine_Intake   -0.0076299  0.0469101  -0.163    0.871
## Physical_Activity -0.0006206  0.0022351  -0.278    0.781
##
## Threshold coefficients:
##      Estimate Std. Error z value
## 1|2  -1.89728    0.31995  -5.930
## 2|3  -1.25657    0.31047  -4.047
## 3|4  -0.71316    0.30702  -2.323
## 4|5  -0.32020    0.30580  -1.047
## 5|6   0.01027    0.30538   0.034
## 6|7   0.47603    0.30570   1.557
## 7|8   0.88208    0.30696   2.874
## 8|9   1.31183    0.31057   4.224
## 9|10  2.18442    0.32712   6.678
Results
The results of the ordinal regression analysis did not reveal any statistically significant relationships between the predictors and sleep quality. Screen time, the primary variable of interest, had an estimated coefficient of 0.0176 on the log-odds scale (odds ratio ≈ 1.02, p = 0.847), providing no evidence of an association with sleep quality. Similarly, the coefficients for caffeine intake and physical activity were close to zero, with p-values of 0.871 and 0.781, respectively.
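The coefficients reported above are on the log-odds scale, so exponentiating them gives odds ratios that are easier to read. The following is a minimal sketch using base R only, with the estimates and standard errors copied from the model summary; the Wald-type 95% intervals and the object names (`est`, `se`, `odds_ratios`) are choices made for this illustration, not part of the original analysis.

```r
# Estimates and standard errors copied from the summary(model_ordinal) output.
est <- c(Screen_Time = 0.0175555, Caffeine_Intake = -0.0076299,
         Physical_Activity = -0.0006206)
se  <- c(Screen_Time = 0.0907781, Caffeine_Intake = 0.0469101,
         Physical_Activity = 0.0022351)

z_crit <- qnorm(0.975)  # two-sided 95% critical value

# Exponentiate each coefficient and its Wald limits to reach the odds-ratio scale.
odds_ratios <- data.frame(
  OR    = exp(est),
  lower = exp(est - z_crit * se),
  upper = exp(est + z_crit * se),
  row.names = names(est)
)
print(round(odds_ratios, 3))
```

Every interval computed this way contains 1, which is the odds-ratio counterpart of the non-significant p-values.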
The threshold coefficients are the estimated cut-points between adjacent categories on the latent scale. For instance, the threshold separating the lowest two levels of sleep quality, “very poor” (1) and “poor” (2), was estimated at -1.8973, while the threshold between the highest two levels, “very good” (9) and “excellent” (10), was 2.1844. The cut-points increase steadily across the scale, which indicates that responses were spread across all ten levels rather than concentrated in a few. They do not, by themselves, measure the influence of the predictors.
The model had a log-likelihood of -1145.51 and an AIC of 2315.01. These are relative measures: they are useful for comparing this model with alternatives, such as an intercept-only model, but they do not by themselves show how well the model fits. Given that none of the predictors was significant, most of the variability in sleep quality remains unexplained, underscoring the complexity of sleep quality as a behavioral outcome.
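The relationship between the two reported fit statistics can be checked by hand, since AIC is defined as -2 times the log-likelihood plus twice the number of estimated parameters. The sketch below uses base R and the reported values. The uniform-guess reference point is an assumption added here for scale only; it is not a fitted model and not the intercept-only model.

```r
log_lik  <- -1145.51   # reported log-likelihood
n_params <- 3 + 9      # 3 slope coefficients + 9 thresholds
n_obs    <- 500        # reported number of observations
n_levels <- 10         # sleep quality levels

# AIC = -2 * logLik + 2 * (number of estimated parameters)
aic_by_hand <- -2 * log_lik + 2 * n_params

# Crude reference point (an assumption of this sketch, not a fitted model):
# the log-likelihood if every student had probability 1/10 for each level.
log_lik_uniform <- n_obs * log(1 / n_levels)

cat("AIC by hand:", round(aic_by_hand, 2), "\n")
cat("Uniform-guess logLik:", round(log_lik_uniform, 2), "\n")
```

The hand-computed AIC agrees with the reported 2315.01 up to rounding of the log-likelihood. The uniform reference comes to about -1151.29, so the fitted model improves on equal-probability guessing by fewer than 6 log-likelihood units.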
Predicted probabilities for a hypothetical student with 4 hours of screen time, 2 cups of caffeine intake, and a Physical_Activity value of 3 (in the dataset's units) were spread almost evenly across sleep quality levels. The probabilities were 12.45% for “very poor” (1) and 10.61% for “excellent” (10), and the intermediate categories ranged from 8.16% (level 5) to 11.51% (level 9).
Model Diagnostics
Proportional Odds Assumption: The nominal test did not indicate a violation of this assumption, supporting the use of the proportional odds model.
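# Likelihood-ratio test of the proportional odds assumption for each predictor
nominal_test(model_ordinal)
# VIFs depend only on the predictors, so a linear model on the numeric outcome suffices
library(car)  # assumed source of vif()
vif(lm(as.numeric(Sleep_Quality) ~ Screen_Time + Caffeine_Intake + Physical_Activity,
       data = student_sleep_patterns))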
##       Screen_Time   Caffeine_Intake Physical_Activity
##          1.004018          1.003444          1.002047
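The VIF values reported above follow from the definition VIF_j = 1 / (1 - R²_j), where R²_j comes from regressing predictor j on the other predictors. The sketch below illustrates that definition in base R. It runs on simulated data because the study file is not bundled here; the helper `vif_by_hand` and the simulated value ranges are hypothetical and chosen only for illustration.

```r
set.seed(1)  # simulated, hypothetical predictors; not the study data
n <- 500
sim <- data.frame(
  Screen_Time       = runif(n, 1, 4),
  Caffeine_Intake   = rpois(n, 2),
  Physical_Activity = runif(n, 0, 120)
)

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all of the other predictors.
vif_by_hand <- function(predictors) {
  sapply(names(predictors), function(v) {
    others <- setdiff(names(predictors), v)
    fit <- lm(reformulate(others, response = v), data = predictors)
    1 / (1 - summary(fit)$r.squared)
  })
}

vifs <- vif_by_hand(sim)
print(round(vifs, 4))
```

Values close to 1, as in the study output, mean that each predictor is almost uncorrelated with the others, so multicollinearity is not a concern for interpreting the coefficients.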
Discussion
This study highlights the challenges of modeling a multifaceted outcome like sleep quality. The analysis did not identify any significant relationships between the predictors and sleep quality. One possible explanation is that unmeasured variables, such as stress levels, dietary habits, or the content viewed during screen time, play more substantial roles in influencing sleep.
The study has several strengths. The use of ordinal regression preserved the ordering of the sleep quality measure rather than treating it as continuous or unordered. The proportional odds assumption was checked with a diagnostic test, which supports the choice of model. The dataset also includes detailed sleep metrics, such as weekday and weekend start and end times, which offer a fuller view of students’ sleep patterns even though they were not used in this model. Finally, predicted probabilities derived from the model show how predictor values translate into the likelihood of each sleep quality level.
Despite its strengths, the study is not without limitations. The scope of predictors was relatively narrow, excluding other potentially important factors like stress levels, screen content type, and sleep environment, which could offer additional explanatory power. Furthermore, the reliance on self-reported measures of screen time, caffeine intake, and physical activity introduces potential biases, as participants may underestimate or overestimate their behaviors. Finally, the results are specific to the surveyed student population and may not generalize to other groups, such as working adults or individuals from different age ranges.
Looking ahead, future research should aim to expand the range of predictors included in the analysis, particularly factors like stress levels and screen content type, which might interact with screen time to affect sleep. Employing wearable devices to capture objective data on sleep quality and physical activity could also enhance the reliability of the results. Finally, exploring non-linear relationships or interactions among predictors might uncover more nuanced effects that were not apparent in this study.
Conclusions
In conclusion, this analysis demonstrated the application of ordinal regression to explore ordered categorical outcomes like sleep quality. While the predictors of screen time, caffeine intake, and physical activity were not significant in this dataset, the findings underscore the complexity of sleep quality and its multifactorial nature. By selecting a model that aligns with the structure of the outcome variable, this study provides a foundation for future research into the determinants of sleep, offering valuable insights for public health and behavioral science.
The findings of this study indicate that screen time, caffeine intake, and physical activity are not significant predictors of sleep quality in the dataset analyzed; none showed a statistically significant association with the ordered levels of sleep quality. The estimated thresholds increased monotonically across the scale, as the model requires, and the separate diagnostic check of the proportional odds assumption supported the suitability of the ordinal regression model used.
Several points follow from these findings. Although demographics such as gender, university year, and age were not included in the regression analysis, they may provide valuable context. For example, older students or those in advanced academic years might experience sleep disturbances due to increased academic or personal responsibilities. Such variables could be added to the proportional odds model, which already captures the ordered structure of the outcome variable.
These results suggest that while screen time, caffeine intake, and physical activity might influence sleep quality in other contexts, their effects appear minimal in this particular dataset. This outcome underscores the multifaceted nature of sleep quality, which is likely shaped by a combination of unmeasured variables such as stress, dietary habits, and the type of content consumed during screen time.
This analysis underscores the importance of selecting appropriate statistical models for ordinal outcomes, ensuring the preservation of the data’s structure. While the predictors examined here did not significantly affect sleep quality, this study serves as a foundation for more comprehensive research into the determinants of sleep. Expanding the range of predictors and incorporating objective measures will be critical steps in uncovering the complex interactions that shape sleep behaviors.
By demonstrating the utility of ordinal regression and identifying its limitations, this study contributes to a growing body of literature aimed at understanding sleep quality—a vital aspect of health and well-being.
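# Hypothetical student used for the predicted probabilities
new_data <- data.frame(Screen_Time = 4, Caffeine_Intake = 2, Physical_Activity = 3)
# Predicted probability of each sleep quality level (1-10)
predict(model_ordinal, newdata = new_data, type = "prob")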
## $fit
##           1          2         3          4         5         6          7
## 1 0.1245118 0.08803057 0.1047473 0.09045398 0.0815511 0.1148888 0.09195596
##            8         9        10
## 1 0.08266806 0.1151031 0.1060893
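The predicted probabilities above can be reproduced by hand from the reported estimates, which shows how the thresholds and coefficients combine. The sketch below uses base R only and assumes the cumulative logit form P(Y ≤ j) = plogis(θ_j − η), with η the linear predictor; the names `beta`, `theta`, `eta`, and `probs` are introduced here for illustration.

```r
# Coefficients and thresholds copied from the summary(model_ordinal) output.
beta <- c(Screen_Time = 0.0175555, Caffeine_Intake = -0.0076299,
          Physical_Activity = -0.0006206)
theta <- c(-1.89728, -1.25657, -0.71316, -0.32020, 0.01027,
           0.47603, 0.88208, 1.31183, 2.18442)

# Same hypothetical student as in new_data.
x <- c(Screen_Time = 4, Caffeine_Intake = 2, Physical_Activity = 3)

eta      <- sum(beta * x[names(beta)])   # linear predictor
cum_prob <- plogis(theta - eta)          # P(Y <= j) for j = 1..9
probs    <- diff(c(0, cum_prob, 1))      # P(Y = j) for j = 1..10
names(probs) <- 1:10

print(round(probs, 4))
```

Because the linear predictor is close to zero for this student, the category probabilities are driven almost entirely by the spacing of the thresholds, which is why they stay near 10% at every level.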
References
Jamal, Arsalan. (2023). Student Sleep Patterns [Data set]. Kaggle. https://www.kaggle.com/datasets/arsalanjamal002/student-sleep-patterns