Sleep patterns, wellbeing, and academic performance among 253 college students.
library(dplyr)
library(knitr)
library(kableExtra)
library(Lock5Data)
data("SleepStudy")
# Keep the four columns shown in the preview table
SleepStudy_subset <- SleepStudy %>%
select(GPA, LarkOwl, ClassesMissed, AnxietyStatus)
my_table <- kable(head(SleepStudy_subset, 10),
col.names = c("GPA", "Chronotype", "Classes Missed", "Anxiety Status"),
caption = "Student Sleep Study Data (First 10 Rows)",
align = c("c","c","c","c"))
kable_classic(my_table, full_width = FALSE, html_font = "Arial") %>%
column_spec(2, extra_css = "text-align: center;") %>%
column_spec(3, extra_css = "text-align: center;") %>%
column_spec(4, extra_css = "text-align: center;")
| GPA | Chronotype | Classes Missed | Anxiety Status |
|---|---|---|---|
| 3.60 | Neither | 0 | normal |
| 3.24 | Neither | 0 | normal |
| 2.97 | Owl | 12 | severe |
| 3.76 | Lark | 0 | normal |
| 3.20 | Owl | 4 | severe |
| 3.50 | Neither | 0 | moderate |
| 3.35 | Lark | 2 | normal |
| 3.00 | Lark | 0 | normal |
| 4.00 | Neither | 0 | severe |
| 2.90 | Neither | 0 | moderate |
Four variables selected for analysis:
1. GPA (numeric, continuous): Grade point average on a 0.0–4.0 scale. Continuous because it can take any value within the range (e.g., 3.47), not just whole numbers.
2. AnxietyScore (numeric, discrete/ordinal): Self-reported anxiety score, recorded as an integer. Discrete because it is measured in whole-number units; treated as ordinal because higher scores indicate greater severity, but the intervals between scores may not be equal.
3. ClassesMissed (numeric, discrete): Count of classes missed in a semester. Discrete because it counts events (0, 1, 2, …) and has a natural zero.
4. LarkOwl (categorical, nominal): Self-identified chronotype: “Lark” (morning), “Neither”, or “Owl” (evening). Nominal because the three categories have no inherent order.
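The classification above can be sanity-checked in base R. The sketch below rebuilds the ten rows printed in the preview table rather than loading the full dataset, so it runs without any packages. The name `first_ten` and the storage types chosen here (integer for the count, factor for the chronotype) are assumptions made for illustration; the column types in the packaged data may differ. AnxietyScore does not appear in the preview table, so it is left out.

```r
# First ten rows as printed in the preview table above.
# AnxietyScore is not shown there, so it is omitted from this check.
first_ten <- data.frame(
  GPA           = c(3.60, 3.24, 2.97, 3.76, 3.20, 3.50, 3.35, 3.00, 4.00, 2.90),
  LarkOwl       = factor(c("Neither", "Neither", "Owl", "Lark", "Owl",
                           "Neither", "Lark", "Lark", "Neither", "Neither")),
  ClassesMissed = c(0L, 0L, 12L, 0L, 4L, 0L, 2L, 0L, 0L, 0L)
)

# Storage class of each column
var_types <- sapply(first_ten, class)
print(var_types)

# A count variable should hold non-negative whole numbers
is_count <- all(first_ten$ClassesMissed >= 0 &
                first_ten$ClassesMissed == round(first_ten$ClassesMissed))

# A nominal variable should be an unordered factor
lark_is_ordered <- is.ordered(first_ten$LarkOwl)
```

Here `is_count` being `TRUE` and `lark_is_ordered` being `FALSE` match the "discrete count" and "nominal" labels given in the list.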
summary_table <- data.frame(
Statistic = c("Min", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max", "Std Dev"),
GPA = round(c(as.vector(summary(SleepStudy$GPA)),
sd(SleepStudy$GPA, na.rm = TRUE)), 2),
AnxietyScore = round(c(as.vector(summary(SleepStudy$AnxietyScore)),
sd(SleepStudy$AnxietyScore, na.rm = TRUE)), 2),
ClassesMissed = round(c(as.vector(summary(SleepStudy$ClassesMissed)),
sd(SleepStudy$ClassesMissed, na.rm = TRUE)), 2)
)
my_summary <- kable(summary_table,
col.names = c("Statistic", "GPA", "Anxiety Score", "Classes Missed"),
caption = "Base R Summary Statistics",
align = c("l", "c", "c", "c"))
kable_classic(my_summary, full_width = FALSE, html_font = "Arial") %>%
row_spec(4, bold = TRUE, background = "#c0c0c0") %>%
row_spec(7, bold = TRUE, background = "#c0c0c0")
| Statistic | GPA | Anxiety Score | Classes Missed |
|---|---|---|---|
| Min | 2.00 | 0.00 | 0.00 |
| 1st Qu. | 3.00 | 1.00 | 0.00 |
| Median | 3.30 | 4.00 | 1.00 |
| Mean | 3.24 | 5.37 | 2.21 |
| 3rd Qu. | 3.50 | 8.00 | 3.00 |
| Max | 4.00 | 26.00 | 20.00 |
| Std Dev | 0.40 | 5.20 | 3.24 |
# Frequency and Proportion Table: LarkOwl
freq_lark <- table(SleepStudy$LarkOwl)
prop_lark <- prop.table(freq_lark)
lark_table <- data.frame(
Chronotype = names(freq_lark),
Frequency = as.vector(freq_lark),
Proportion = round(as.vector(prop_lark), 3)
)
my_lark <- kable(lark_table,
col.names = c("Chronotype", "Frequency", "Proportion"),
caption = "Frequency and Proportion Table: Chronotype (LarkOwl)",
align = c("l", "c", "c"))
kable_classic(my_lark, full_width = FALSE, html_font = "Arial") %>%
column_spec(1, bold = TRUE)
| Chronotype | Frequency | Proportion |
|---|---|---|
| Lark | 41 | 0.162 |
| Neither | 163 | 0.644 |
| Owl | 49 | 0.194 |
library(ggplot2)
library(Lock5Data)
data("SleepStudy")
ggplot(SleepStudy, aes(x = GPA)) +
geom_histogram(binwidth = 0.1, fill = "skyblue", color = "white") +
geom_vline(aes(xintercept = mean(GPA, na.rm = TRUE)),
color = "black", linewidth = 1, linetype = "dashed") +
annotate("text", x = mean(SleepStudy$GPA, na.rm = TRUE) - 0.15,
y = 25, label = "Mean", color = "black", size = 4.5) +
labs(
title = "Distribution of Student GPA",
x = "Grade Point Average (0.0 - 4.0)",
y = "Number of Students"
) +
theme_bw()
ggplot(SleepStudy, aes(x = AnxietyScore)) +
geom_histogram(binwidth = 1, fill = "skyblue", color = "white") +
geom_vline(aes(xintercept = mean(AnxietyScore, na.rm = TRUE)),
color = "black", linewidth = 1, linetype = "dashed") +
annotate("text", x = mean(SleepStudy$AnxietyScore, na.rm = TRUE) + 2,
y = 25, label = "Mean", color = "black", size = 4.5) +
labs(
title = "Distribution of Student Anxiety",
x = "Anxiety Score",
y = "Number of Students"
) +
theme_bw()
ggplot(SleepStudy, aes(x = LarkOwl)) +
geom_bar(fill = "skyblue", color = "white") +
labs(
title = "Distribution of Student Chronotype",
x = "Chronotype",
y = "Number of Students"
) +
scale_x_discrete(labels = c("Lark" = "Lark (Early Riser)",
"Neither" = "Neither",
"Owl" = "Owl (Night Owl)")) +
theme_bw()
ggplot(SleepStudy, aes(x = LarkOwl, y = GPA, fill = LarkOwl)) +
geom_boxplot(color = "black", linewidth = 0.8) +
scale_fill_manual(values = c("Lark" = "skyblue",
"Neither" = "steelblue",
"Owl" = "midnightblue")) +
scale_x_discrete(labels = c("Lark" = "Lark (Early Riser)",
"Neither" = "Neither",
"Owl" = "Owl (Night Owl)")) +
labs(
title = "GPA by Student Chronotype",
x = "Chronotype",
y = "Grade Point Average (0.0 - 4.0)"
) +
theme_bw() +
theme(legend.position = "none",
panel.grid.major = element_line(linewidth = 0.5, color = "grey70"),
panel.grid.minor = element_line(linewidth = 0.5, color = "grey90"),
axis.line = element_line(linewidth = 0.5, color = "black"))
ggplot(SleepStudy, aes(x = LarkOwl, y = AnxietyScore, fill = LarkOwl)) +
geom_boxplot(color = "black", linewidth = 0.8) +
scale_fill_manual(values = c("Lark" = "skyblue",
"Neither" = "steelblue",
"Owl" = "midnightblue")) +
scale_x_discrete(labels = c("Lark" = "Lark (Early Riser)",
"Neither" = "Neither",
"Owl" = "Owl (Night Owl)")) +
labs(
title = "Anxiety Score by Student Chronotype",
x = "Chronotype",
y = "Anxiety Score"
) +
theme_bw() +
theme(legend.position = "none",
panel.grid.major = element_line(linewidth = 0.5, color = "grey70"),
panel.grid.minor = element_line(linewidth = 0.5, color = "grey90"),
axis.line = element_line(linewidth = 0.5, color = "black"))
ggplot(SleepStudy, aes(x = LarkOwl, y = ClassesMissed, fill = LarkOwl)) +
geom_boxplot(color = "black", linewidth = 0.8) +
scale_fill_manual(values = c("Lark" = "skyblue",
"Neither" = "steelblue",
"Owl" = "midnightblue")) +
scale_x_discrete(labels = c("Lark" = "Lark (Early Riser)",
"Neither" = "Neither",
"Owl" = "Owl (Night Owl)")) +
labs(
title = "Classes Missed by Student Chronotype",
x = "Chronotype",
y = "Classes Missed"
) +
theme_bw() +
theme(legend.position = "none",
panel.grid.major = element_line(linewidth = 0.5, color = "grey70"),
panel.grid.minor = element_line(linewidth = 0.5, color = "grey90"),
axis.line = element_line(linewidth = 0.5, color = "black"))
ggplot(SleepStudy, aes(x = AnxietyScore, y = GPA)) +
geom_point(color = "steelblue", size = 2, alpha = 0.6) +
geom_smooth(method = "lm", color = "black", linewidth = 0.8,
linetype = "dashed", se = FALSE) +
labs(
title = "GPA vs Anxiety Score",
x = "Anxiety Score",
y = "Grade Point Average (0.0 - 4.0)"
) +
theme_bw() +
theme(legend.position = "none",
panel.grid.major = element_line(linewidth = 0.5, color = "grey70"),
panel.grid.minor = element_line(linewidth = 0.5, color = "grey90"),
axis.line = element_line(linewidth = 0.5, color = "black"))
The SleepStudy dataset is directly relevant to educational psychology and student well-being research. It captures the intersection of mental health indicators (anxiety, depression, stress, happiness) and academic performance (GPA, classes missed), enabling investigation of how a student’s sense of well-being and belonging at university relates to their engagement and achievement outcomes. This is particularly pertinent to student retention research and the design of institutional support programs.
The analysis of the SleepStudy dataset reveals several meaningful patterns across academic performance, chronotype, and student well-being. The GPA histogram displays a left-skewed distribution, with the majority of students scoring above 3.0 out of 4.0 and a mean of approximately 3.24. The median (3.30) sits slightly above the mean, indicating that a small number of low-performing students pull the average down. The standard deviation of 0.40 suggests most students fall within a relatively narrow band of academic performance.
The anxiety score histogram is strongly right-skewed, with most students reporting low anxiety but a notable tail reaching a maximum of 26. The standard deviation (5.20) is nearly as large as the mean (5.37), which highlights the variability: while most students report little anxiety, a minority report disproportionately high anxiety and may warrant targeted support from student services.
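The skew directions read off the histograms can be cross-checked from the "Base R Summary Statistics" table alone. The sketch below uses Pearson's median skewness coefficient, 3 × (mean − median) / SD, together with the coefficient of variation (SD / mean). Both are rough summary-based indicators chosen here for illustration, not measures used in the original analysis, and the data frame name `stats` is hypothetical.

```r
# Mean, median and standard deviation copied from the
# "Base R Summary Statistics" table.
stats <- data.frame(
  variable = c("GPA", "AnxietyScore", "ClassesMissed"),
  mean     = c(3.24, 5.37, 2.21),
  median   = c(3.30, 4.00, 1.00),
  sd       = c(0.40, 5.20, 3.24)
)

# Pearson's median skewness: negative suggests left skew,
# positive suggests right skew.
stats$pearson_skew <- round(3 * (stats$mean - stats$median) / stats$sd, 2)

# Coefficient of variation: spread relative to the mean.
stats$cv <- round(stats$sd / stats$mean, 2)

print(stats)
```

The coefficient is negative for GPA and positive for AnxietyScore and ClassesMissed, in line with the left and right skew described above. The coefficient of variation is far larger for AnxietyScore than for GPA, which is the sense in which anxiety is the more variable measure.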
The boxplot of GPA by chronotype reveals that Lark students achieve marginally higher and more consistent GPAs, while Owl students show the greatest variability in academic outcomes. This suggests that chronotype misalignment with early class scheduling may contribute to unpredictable academic performance among night-oriented students.
The scatter plot of GPA versus anxiety score shows a modest negative trend: as anxiety increases, GPA tends to decrease. The considerable scatter around the trend line, however, indicates that anxiety alone does not determine academic outcomes.
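A trend like the one in the scatter plot can be quantified with a correlation coefficient and a fitted slope. The helper below, `trend_summary()`, is a hypothetical function written for this sketch and uses base R only. It is demonstrated on small toy vectors that are not SleepStudy values, so no result for the real data is implied.

```r
# Hypothetical helper (not part of the original analysis): summarise the
# linear trend between two numeric vectors.
trend_summary <- function(x, y) {
  ok  <- complete.cases(x, y)   # drop pairs with missing values
  fit <- lm(y[ok] ~ x[ok])      # straight-line fit, as in geom_smooth(method = "lm")
  c(r         = cor(x[ok], y[ok]),
    slope     = unname(coef(fit)[2]),
    r_squared = summary(fit)$r.squared)
}

# Toy vectors for illustration only; these are NOT SleepStudy values.
toy_anxiety <- c(0, 2, 4, 6, 8)
toy_gpa     <- c(3.6, 3.5, 3.3, 3.2, 3.0)

toy_trend <- trend_summary(toy_anxiety, toy_gpa)
print(round(toy_trend, 3))
```

With the package data loaded, the same call would be `trend_summary(SleepStudy$AnxietyScore, SleepStudy$GPA)`. A small `r_squared` there would correspond to the wide scatter around the dashed line.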
Taken together, these findings suggest that students who identify as night owls, report high anxiety, and frequently miss classes may represent an at-risk profile.
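One way to make this profile operational is a simple logical filter. The sketch below applies it to the ten rows printed in the preview table, so it runs without the package. The cutoff of 3 missed classes is an assumption borrowed from the third quartile in the summary table, and the names `first_ten`, `missed_cutoff` and `at_risk` are hypothetical; none of this is a validated screening rule.

```r
# Chronotype, classes missed and anxiety status for the ten rows
# printed in the preview table.
first_ten <- data.frame(
  LarkOwl       = c("Neither", "Neither", "Owl", "Lark", "Owl",
                    "Neither", "Lark", "Lark", "Neither", "Neither"),
  ClassesMissed = c(0, 0, 12, 0, 4, 0, 2, 0, 0, 0),
  AnxietyStatus = c("normal", "normal", "severe", "normal", "severe",
                    "moderate", "normal", "normal", "severe", "moderate")
)

# Assumed cutoff: the third quartile of ClassesMissed from the summary table.
missed_cutoff <- 3

# Flag students who meet all three conditions of the profile.
first_ten$at_risk <- first_ten$LarkOwl == "Owl" &
  first_ten$AnxietyStatus == "severe" &
  first_ten$ClassesMissed >= missed_cutoff

n_flagged <- sum(first_ten$at_risk)
print(first_ten[first_ten$at_risk, ])
```

In these ten rows, the two Owl students with severe anxiety (12 and 4 classes missed) are the only ones flagged. Requiring all three conditions at once is a deliberately strict choice; a looser rule, such as any two of the three, would flag more students.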