# load
library(ggplot2)
library(dplyr)
data = read.csv("student_lifestyle_dataset.csv")
My project will explore the relationship between sleep, stress, and academic performance among students. By analyzing these variables, my goal is to better understand whether lifestyle factors such as sleep duration and stress levels influence GPA.
# Order the stress levels so plots and tables run from Low to High
data$Stress_Level = factor(
  data$Stress_Level,
  levels = c("Low", "Moderate", "High")
)
# Horizontal boxplots of sleep duration, one per stress level
ggplot(
  data,
  aes(
    x = Sleep_Hours_Per_Day,
    y = Stress_Level
  )
) +
  geom_boxplot(
    col = "black",
    fill = "steelblue",
    alpha = .5
  ) +
  labs(
    title = "Sleep duration by stress level",
    x = "Sleep (hours per day)",
    y = "Stress Level"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 16, face = "bold"),
    axis.title = element_text(size = 12)
  )
data %>%
  group_by(Stress_Level) %>%
  summarize(
    mean_sleep = mean(Sleep_Hours_Per_Day),
    median_sleep = median(Sleep_Hours_Per_Day),
    min_sleep = min(Sleep_Hours_Per_Day),
    max_sleep = max(Sleep_Hours_Per_Day)
  )
## # A tibble: 3 × 5
## Stress_Level mean_sleep median_sleep min_sleep max_sleep
## <fct> <dbl> <dbl> <dbl> <dbl>
## 1 Low 8.06 8 6 10
## 2 Moderate 7.95 7.9 6 10
## 3 High 7.05 6.8 5 10
The boxplot shows the distribution of sleep duration at each stress level. Students with low stress have the highest median sleep (8 hours), while students with high stress have the lowest (6.8 hours).
The spread also differs: students with high stress show greater variability in sleep duration, as seen in the larger IQR and wider range in the boxplot (5 to 10 hours, versus 6 to 10 hours for low stress). In contrast, students with low stress show a more consistent sleep pattern.
The group means support the same pattern: mean sleep is 7.05 hours for high-stress students and 8.06 hours for low-stress students.
Overall, these results suggest a negative relationship between stress and sleep, where higher stress is associated with both reduced and more variable sleep duration.
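The claims about spread and group differences above are read off the boxplot, so they can also be checked numerically. What follows is a minimal sketch and not part of the original analysis: `toy_sleep` is a small made-up data frame standing in for the real dataset, `sleep_spread()` is a hypothetical helper, and the Kruskal-Wallis test is an assumed choice (one of several reasonable options) for comparing sleep across the three stress groups.

```r
# Toy data with made-up values, used only so the sketch runs on its own.
# In the report, `data` read from student_lifestyle_dataset.csv would be used instead.
toy_sleep <- data.frame(
  Stress_Level = factor(
    rep(c("Low", "Moderate", "High"), each = 3),
    levels = c("Low", "Moderate", "High")
  ),
  Sleep_Hours_Per_Day = c(7.5, 8.0, 8.5, 7.0, 8.0, 9.0, 5.0, 7.0, 9.0)
)

# Hypothetical helper: IQR and standard deviation of sleep within each stress level.
sleep_spread <- function(df) {
  groups <- split(df$Sleep_Hours_Per_Day, df$Stress_Level)
  data.frame(
    Stress_Level = names(groups),
    iqr_sleep = unname(vapply(groups, IQR, numeric(1))),
    sd_sleep = unname(vapply(groups, sd, numeric(1)))
  )
}

spread <- sleep_spread(toy_sleep)
print(spread)

# Rank-based test of whether sleep differs across stress levels (assumed choice of test).
kw <- kruskal.test(Sleep_Hours_Per_Day ~ Stress_Level, data = toy_sleep)
print(kw)
```

On the real data, calling `sleep_spread(data)` would give the per-group IQR and standard deviation behind the variability claim, and the test's p-value would indicate whether the difference between groups is larger than chance alone would suggest.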
# Scatterplot of sleep against GPA with a fitted linear trend line
datagpa = ggplot(
  data,
  aes(
    x = GPA,
    y = Sleep_Hours_Per_Day
  )
) +
  geom_point() +
  labs(
    title = "Relationship between Sleep and GPA",
    x = "GPA",
    y = "Sleep (hours per day)"
  ) +
  geom_smooth(method = "lm", se = FALSE) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 16, face = "bold"),
    axis.title = element_text(size = 12)
  )
# Print the stored plot; assigning it to a variable does not display it
datagpa
cor(data$Sleep_Hours_Per_Day, data$GPA)
## [1] -0.004278441
The scatterplot shows the relationship between sleep and GPA. The points are widely dispersed with no clear upward or downward trend, suggesting little or no linear relationship between the two variables.
The correlation supports this reading: the correlation coefficient r is close to 0 (about -0.004), indicating that sleep duration alone is not a strong predictor of academic performance. Other factors, such as stress, time management, or study habits, may play a more significant role.
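A single value from `cor()` does not say how uncertain the estimate is. As a minimal sketch, and not part of the original analysis, base R's `cor.test()` returns the same Pearson coefficient together with a confidence interval and p-value. The data frame `toy_gpa` holds made-up values so the example runs on its own, and `check_correlation()` is a hypothetical helper name.

```r
# Toy data with made-up values; the report would use data$Sleep_Hours_Per_Day and data$GPA.
toy_gpa <- data.frame(
  Sleep_Hours_Per_Day = c(6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5),
  GPA = c(3.1, 2.9, 3.3, 3.0, 3.2, 2.8, 3.4, 3.0)
)

# Hypothetical helper: Pearson correlation with its 95% confidence interval and p-value.
check_correlation <- function(x, y) {
  res <- cor.test(x, y, method = "pearson")
  list(
    r = unname(res$estimate),
    p_value = res$p.value,
    conf_int = as.numeric(res$conf.int)
  )
}

result <- check_correlation(toy_gpa$Sleep_Hours_Per_Day, toy_gpa$GPA)
print(result)
```

If the confidence interval on the real data is narrow and centered near zero, that would back up the reading that sleep and GPA have essentially no linear relationship in this sample.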
Overall, the analysis suggests that stress and sleep are related, with higher stress levels associated with reduced and more variable sleep. However, sleep duration alone does not appear to have a strong relationship with GPA.
In contrast, stress may have a more meaningful impact on academic performance, as higher stress levels are associated with slightly lower GPAs. These findings indicate that academic success is influenced by multiple factors, and further analysis could explore additional variables such as study habits or time management.
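The association between stress level and GPA mentioned above is not computed in the code shown in this report. A minimal sketch of how it could be checked is given here, assuming the same `Stress_Level` and `GPA` columns; `toy_stress` contains made-up values that say nothing about the real data, and `gpa_by_stress()` is a hypothetical helper name.

```r
# Toy data with made-up values, only to make the sketch runnable.
toy_stress <- data.frame(
  Stress_Level = factor(
    rep(c("Low", "Moderate", "High"), each = 2),
    levels = c("Low", "Moderate", "High")
  ),
  GPA = c(3.0, 3.2, 3.3, 3.5, 2.9, 3.1)
)

# Hypothetical helper: mean GPA for each stress level, in Low/Moderate/High order.
gpa_by_stress <- function(df) {
  out <- aggregate(GPA ~ Stress_Level, data = df, FUN = mean)
  names(out)[2] <- "mean_gpa"
  out
}

gpa_summary <- gpa_by_stress(toy_stress)
print(gpa_summary)
```

Running `gpa_by_stress(data)` on the real dataset would show whether mean GPA actually falls as stress rises, and by how much.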