This study investigates whether high levels of social media use are associated with poor academic performance, reduced participation in physical activities, diminished social interaction, and increased levels of stress, anxiety, addiction, and depression.
(a) Age demographics by gender
(b) Distribution of sleep hours data
(c) Distribution of data on time spent on screen
(d) Distribution of daily social media usage data
(e) Distribution of platform usage data
(f) Distribution of academic performance data
(g) Distribution of data on time spent on physical activities
(h) Distribution of data on social interaction levels
(i) Distribution of data on stress levels
(j) Distribution of data on anxiety levels
This section examines whether mental‑health outcomes differ between male and female students by comparing their stress, anxiety, and social‑media addiction levels.
This section examines the associations between lifestyle habits (sleep duration, screen time, and daily social media usage) and depression status.
Sleep duration by depression status
This section evaluates whether students’ academic performance differs according to the social media platforms they use. (Academics and platform)
This section investigates whether students’ engagement in physical activity differs across social media platform groups. (Physical and platform)
Mental health scores, including stress, anxiety, and addiction, are compared across genders.
The analysis compares depressed and non‑depressed groups across key lifestyle variables, including daily activities, sleep duration, and screen time.
Average daily social media hours by depression label
Average stress levels by depression label
This analysis compared the average GPA of students grouped by depression label to examine whether academic performance varies across these categories. (Average GPA by depression status)
This analysis examined the relationship between depression and physical activity by assessing how activity levels vary across individuals with different depression scores. (Average physical activity hours)
This analysis explored how key behavioral and mental‑health variables relate to one another by examining correlations among screen time, age, anxiety, academic performance, depression, stress, sleep duration, and addiction levels. (Relationship between variables)
This analysis evaluated a range of behavioral, lifestyle, and platform‑related factors to determine which variables meaningfully predict depression by comparing each factor’s relationship to depression status. (prediction)
ggplot(data=social_media_data)+
geom_bar(mapping=aes(x=age, fill=gender), position="dodge")+
labs(title = "Participants’ age demographics by gender")
ggplot(social_media_data, aes(x = sleep_hours)) +
geom_histogram(binwidth = 1, fill = "steelblue", color = "white") +
geom_freqpoly(binwidth = 1, color = "red", linewidth = 1)+
labs(
title = "Distribution of Sleep Hours",
x = "Sleep Hours",
y = "Count"
) +
theme_minimal()
ggplot(social_media_data, aes(x = screen_time_before_sleep)) +
geom_histogram(aes(y = after_stat(count / sum(count) * 100)),
binwidth = 1, fill = "steelblue", color = "white") +
geom_freqpoly(aes(y = after_stat(count / sum(count) * 100)),
binwidth = 1, color = "red", linewidth = 1) +
labs(
title = "Screen Time Before Sleep",
x = "Screen time in hours",
y = "Percentage (%)"
) +
theme_minimal()
platform_summary <- social_media_data %>%
count(platform_usage) %>%
mutate(percent = n / sum(n) * 100,
label = paste0(round(percent, 1), "%"))
ggplot(platform_summary, aes(x = "", y = percent, fill = platform_usage)) +
geom_col(width = 1) +
coord_polar(theta = "y") +
geom_text(aes(label = label),
position = position_stack(vjust = 0.5),
color = "white",
size = 4) +
labs(
title = "Platform Usage",
y = "Percentage",
x = ""
) +
theme_void()
ggplot(social_media_data, aes(x = academic_performance)) +
geom_histogram(binwidth = 1, fill = "steelblue", color = "white") +
geom_freqpoly(binwidth = 1, color = "red", linewidth = 1)+
labs(
title = "Academic performance",
x = "GPA",
y = "# of students"
) +
theme_minimal()
ggplot(social_media_data, aes(x = physical_activity)) +
geom_histogram(
binwidth = 1, fill = "steelblue", color = "white") +
geom_freqpoly(binwidth = 1, color = "red", linewidth = 1)+
labs(
title = "Physical activity",
x = "Time in hours",
y = "# of students"
) +
theme_minimal()
ggplot(social_media_data, aes(x = stress_level)) +
geom_histogram(binwidth = 1, fill = "steelblue", color = "white") +
geom_freqpoly(binwidth = 1, color = "red", linewidth = 1)+
labs(
title = "Stress level",
x = "stress levels",
y = "# of students"
) +
theme_minimal()
ggplot(social_media_data, aes(x =anxiety_level)) +
geom_histogram(binwidth = 1, fill = "steelblue", color = "white") +
geom_freqpoly(binwidth = 1, color = "red", linewidth = 1)+
labs(
title = "Anxiety level",
x = "anxiety levels",
y = "# of students"
) +
theme_minimal()
ggplot(social_media_data, aes(x = addiction_level)) +
geom_histogram(binwidth = 1, fill = "steelblue", color = "white") +
geom_freqpoly(binwidth = 1, color = "red", linewidth = 1)+
labs(
title = "Addiction level",
x = "addiction levels",
y = "# of students"
) +
theme_minimal()
depression_summary <- social_media_data %>%
count(depression_label) %>%
mutate(percent = n / sum(n) * 100,
label = paste0(round(percent, 1), "%"))
ggplot(depression_summary, aes(x = "", y = percent, fill = depression_label)) +
geom_col(width = 1) +
coord_polar(theta = "y") +
geom_text(aes(label = label),
position = position_stack(vjust = 0.5),
color = "white",
size = 5) +
labs(
title = "Depression Label",
x = "",
y = "Percentage"
) +
theme_void()
ggplot(social_media_data, aes(x =gender, y = stress_level, fill = gender)) +
geom_boxplot() +
labs(
x = "gender",
y = "stress level",
title = "Stress level by gender"
) +
theme_minimal()
ggplot(social_media_data, aes(x =gender, y =anxiety_level, fill = gender)) +
geom_boxplot() +
labs(
x = "gender",
y = "anxiety level",
title = "Anxiety level by gender"
) +
theme_minimal()
ggplot(social_media_data, aes(x =gender, y = addiction_level, fill = gender)) +
geom_boxplot() +
labs(
x = "gender",
y = "addiction level",
title = "Addiction level by gender"
) +
theme_minimal()
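# Map the 0/1 depression label to readable group names (0 = No Depression, 1 = Depression)
social_media_data$depression <- factor(social_media_data$depression_label, levels = c(0, 1), labels = c("No Depression", "Depression"))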
ggplot(social_media_data, aes(x = depression, y = sleep_hours, fill = depression)) +
geom_boxplot() +
labs(
x = "Depression Status",
y = "Sleep Hours",
title = "Sleep Duration by Depression Status"
) +
theme_minimal()
The median sleep duration is 6.5 hours for the “No Depression” group and 4.5 hours for the “Depression” group. This suggests a potential association between depression and shorter sleep duration.
There is no overlap between the middle 50% (the interquartile ranges) of the two groups, which points to a substantial difference. Non-overlapping boxes alone do not establish statistical significance, so the difference is tested formally below.
The “No Depression” group has a larger interquartile range than the “Depression” group, indicating greater variability in sleep duration among those without depression.
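The quartile comparison described above can also be checked numerically rather than read off the boxplot. The following is a minimal sketch on simulated data: the data frame `toy` and its values are assumptions made for illustration, while the column names mirror those used in this report.

```r
# Hypothetical stand-in for social_media_data; the values are simulated, not the study's.
set.seed(1)
toy <- data.frame(
  depression = factor(
    rep(c("No Depression", "Depression"), times = c(200, 30)),
    levels = c("No Depression", "Depression")
  ),
  sleep_hours = c(rnorm(200, mean = 6.5, sd = 1.0), rnorm(30, mean = 4.7, sd = 0.5))
)

# First quartile, median, third quartile, and IQR of sleep hours per group
by_group <- split(toy$sleep_hours, toy$depression)
quartiles <- t(sapply(by_group, function(x) {
  c(q1 = quantile(x, 0.25, names = FALSE),
    median = median(x),
    q3 = quantile(x, 0.75, names = FALSE),
    iqr = IQR(x))
}))
print(round(quartiles, 2))

# Two boxes overlap only if the larger Q1 does not exceed the smaller Q3
boxes_overlap <- max(quartiles[, "q1"]) <= min(quartiles[, "q3"])
print(boxes_overlap)
```

Replacing `toy` with `social_media_data` would give the quartiles behind the boxplot, assuming the `depression` factor has already been created.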
These findings support the hypothesis that individuals with depression tend to sleep fewer hours than those without depression. This is confirmed using Welch's two-sample t-test:
t.test(sleep_hours ~ depression, data = social_media_data)
##
## Welch Two Sample t-test
##
## data: sleep_hours by depression
## t = 15.745, df = 40.991, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group No Depression and group Depression is not equal to 0
## 95 percent confidence interval:
## 1.510624 1.955162
## sample estimates:
## mean in group No Depression mean in group Depression
## 6.494183 4.761290
Since the p‑value (< 2.2e-16) is below 0.05, the difference is statistically significant: mean sleep duration differs between depressed and non‑depressed participants, by an estimated 1.51 to 1.96 hours (95% confidence interval).
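The numbers quoted in this interpretation can be pulled directly from the object returned by `t.test()` instead of being copied from the printed output. The sketch below uses simulated data (the `toy` data frame is an assumption) and adds Cohen's d with a pooled standard deviation as an effect-size measure; the effect size is an illustrative addition and is not part of the original analysis.

```r
# Hypothetical stand-in for social_media_data; the values are simulated.
set.seed(42)
toy <- data.frame(
  depression = factor(
    rep(c("No Depression", "Depression"), times = c(200, 30)),
    levels = c("No Depression", "Depression")
  ),
  sleep_hours = c(rnorm(200, mean = 6.5, sd = 1.0), rnorm(30, mean = 4.8, sd = 0.5))
)

# Welch two-sample t-test (t.test() does not assume equal variances by default)
res <- t.test(sleep_hours ~ depression, data = toy)

# Difference in means (first factor level minus second) and its 95% CI
mean_diff <- unname(res$estimate[1] - res$estimate[2])
ci <- res$conf.int

# Cohen's d using the pooled standard deviation (illustrative effect size)
g <- split(toy$sleep_hours, toy$depression)
n1 <- length(g[[1]])
n2 <- length(g[[2]])
pooled_sd <- sqrt(((n1 - 1) * var(g[[1]]) + (n2 - 1) * var(g[[2]])) / (n1 + n2 - 2))
cohens_d <- mean_diff / pooled_sd

cat(sprintf("Mean difference: %.2f hours (95%% CI %.2f to %.2f)\n", mean_diff, ci[1], ci[2]))
cat("p-value:", format.pval(res$p.value), "\n")
cat(sprintf("Cohen's d: %.2f\n", cohens_d))
```

Reporting the interval and an effect size alongside the p-value shows how large the difference is, not only that it is unlikely to be zero.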
ggplot(social_media_data, aes(x = depression, y = screen_time_before_sleep, fill = depression)) +
geom_boxplot() +
labs(
x = "Depression Status",
y = "screen time before sleep",
title = "Screen time before sleep by Depression Status"
) +
theme_minimal()
wilcox.test(screen_time_before_sleep ~ depression, data = social_media_data)
##
## Wilcoxon rank sum test with continuity correction
##
## data: screen_time_before_sleep by depression
## W = 19199, p-value = 0.5707
## alternative hypothesis: true location shift is not equal to 0
Since the p‑value (0.5707) is greater than 0.05, there is no evidence that screen time before sleep differs between depressed and non‑depressed participants.
The relationship between academic performance and platform usage is assessed.
GPA_mean<-social_media_data %>%
group_by(platform_usage) %>%
summarise(avg_GPA = mean(academic_performance, na.rm = TRUE))%>%
print()
## # A tibble: 3 × 2
## platform_usage avg_GPA
## <chr> <dbl>
## 1 Both 2.98
## 2 Instagram 3.00
## 3 TikTok 3.00
ggplot(data =GPA_mean) +
geom_bar(
aes(x = platform_usage, y =avg_GPA, fill = platform_usage),
stat = "identity"
) +
geom_text(
aes(
x = factor(platform_usage),
y = avg_GPA,
label = round(avg_GPA, 2)
),
vjust = -0.5,
size = 3.3
) +
labs(
x = "Social media platform",
y = "Average GPA",
title = "Average GPA by Platform Usage"
)
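The group means above are nearly identical, but a comparison of means alone does not say whether the small differences could be due to chance. One way to check is a one-way ANOVA, with a Kruskal-Wallis test as a rank-based alternative. Neither test appears in the original analysis; the sketch below is an assumption about how such a check could look, run on simulated data (`toy`) that reuses this report's column names.

```r
# Hypothetical stand-in for social_media_data; GPA values are simulated
# with the same mean in every platform group.
set.seed(7)
toy <- data.frame(
  platform_usage = factor(rep(c("Both", "Instagram", "TikTok"), each = 100)),
  academic_performance = rnorm(300, mean = 3.0, sd = 0.5)
)

# One-way ANOVA: do mean GPAs differ across platform groups?
fit <- aov(academic_performance ~ platform_usage, data = toy)
anova_p <- summary(fit)[[1]][["Pr(>F)"]][1]

# Kruskal-Wallis test: rank-based alternative without the normality assumption
kw <- kruskal.test(academic_performance ~ platform_usage, data = toy)

cat("ANOVA p-value:", format.pval(anova_p), "\n")
cat("Kruskal-Wallis p-value:", format.pval(kw$p.value), "\n")
```

The same two calls could be applied to `physical_activity` to test the platform comparison for physical activity.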
The relationship between physical activity and platform usage is assessed.
physical_activity_mean<-social_media_data %>%
group_by(platform_usage) %>%
summarise(avg_physical = mean(physical_activity, na.rm = TRUE))%>%
print()
## # A tibble: 3 × 2
## platform_usage avg_physical
## <chr> <dbl>
## 1 Both 1.02
## 2 Instagram 1.04
## 3 TikTok 0.982
ggplot(data = physical_activity_mean) +
geom_bar(
aes(x = platform_usage, y =avg_physical, fill = platform_usage),
stat = "identity"
) +
geom_text(
aes(
x = factor(platform_usage),
y = avg_physical,
label = round(avg_physical, 2)
),
vjust = -0.5,
size = 3.3
) +
labs(
x = "Social media platform",
y = "Mean physical activity time (in hours)",
title = "Mean physical activity time by platform usage group"
)
stress_gender_mean<-social_media_data %>%
group_by(gender) %>%
summarise(avg_stress = mean(stress_level, na.rm = TRUE))%>%
print()
## # A tibble: 2 × 2
## gender avg_stress
## <chr> <dbl>
## 1 female 5.42
## 2 male 5.47
ggplot(data =stress_gender_mean ) +
geom_bar(
aes(x = gender, y =avg_stress, fill = gender),
stat = "identity"
) +
labs(
x = "gender",
y = "average stress level",
title = "Average stress level by gender"
)
anxiety_gender_mean<-social_media_data %>%
group_by(gender) %>%
summarise(avg_anxiety = mean(anxiety_level, na.rm = TRUE))%>%
print()
## # A tibble: 2 × 2
## gender avg_anxiety
## <chr> <dbl>
## 1 female 5.69
## 2 male 5.59
ggplot(data =anxiety_gender_mean ) +
geom_bar(
aes(x = gender, y =avg_anxiety, fill = gender),
stat = "identity"
) +
geom_text(
aes(
x = factor(gender),
y = avg_anxiety,
label = round(avg_anxiety, 2)
),
vjust = -0.5,
size = 3.3
) +
labs(
x = "gender",
y = "average anxiety level",
title = "Average anxiety level by gender"
)
addiction_gender_mean<-social_media_data %>%
group_by(gender) %>%
summarise(avg_addiction = mean(addiction_level, na.rm = TRUE))%>%
print()
## # A tibble: 2 × 2
## gender avg_addiction
## <chr> <dbl>
## 1 female 5.49
## 2 male 5.64
ggplot(data =addiction_gender_mean ) +
geom_bar(
aes(x = gender, y =avg_addiction, fill = gender),
stat = "identity"
) +
geom_text(
aes(
x = factor(gender),
y = avg_addiction,
label = round(avg_addiction, 2)
),
vjust = -0.5,
size = 3.3
) +
labs(
x = "gender",
y = "average addiction level",
title = "Average addiction level by gender"
)
sleep_depression_mean<-social_media_data %>%
group_by(depression_label) %>%
summarise(avg_sleep_hours= mean(sleep_hours, na.rm = TRUE))
ggplot(data = sleep_depression_mean) +
geom_bar(
aes(
x = factor(depression_label),
y = avg_sleep_hours,
fill = factor(depression_label)
),
stat = "identity"
) +
geom_text(
aes(
x = factor(depression_label),
y = avg_sleep_hours,
label = round(avg_sleep_hours, 2)
),
vjust = -0.5,
size = 3.3
) +
scale_fill_manual(
values = c("0" = "steelblue", "1" = "tomato"),
name = "Depression label"
) +
labs(
x = "Depression label",
y = "Average sleep hours",
title = "Average sleep hours by depression Label"
) +
theme_minimal()
The data indicate a negative association between sleep duration and depression label: participants labeled as depressed sleep fewer hours on average (4.76 vs. 6.49 hours). The association does not by itself establish the direction of the effect.
Notably, both groups average less than the recommended 7–9 hours of sleep for adults.
screen_depression_mean<-social_media_data %>%
group_by(depression_label) %>%
summarise(avg_screen_hours= mean(screen_time_before_sleep, na.rm = TRUE))
ggplot(data = screen_depression_mean) +
geom_bar(
aes(
x = factor(depression_label),
y = avg_screen_hours,
fill = factor(depression_label)
),
stat = "identity"
) +
geom_text(
aes(
x = factor(depression_label),
y = avg_screen_hours,
label = round(avg_screen_hours, 2)
),
vjust = -0.5,
size = 3.3
) +
scale_fill_manual(
values = c("0" = "steelblue", "1" = "tomato"),
name = "Depression label"
) +
labs(
x = "Depression label",
y = "Average screen time before sleep",
title = "Average screen time before sleep by depression Label"
) +
theme_minimal()
stress_depression_mean<-social_media_data %>%
group_by(depression_label) %>%
summarise(avg_stress = mean(stress_level, na.rm = TRUE))
ggplot(data = stress_depression_mean) +
geom_bar(
aes(
x = factor(depression_label),
y = avg_stress,
fill = factor(depression_label)
),
stat = "identity"
) +
geom_text(
aes(
x = factor(depression_label),
y = avg_stress,
label = round(avg_stress, 2)
),
vjust = -0.5,
size = 3.3
) +
scale_fill_manual(
values = c("0" = "steelblue", "1" = "tomato"),
name = "Depression label"
) +
labs(
x = "Depression label",
y = "Average stress levels",
title = "Average stress levels by depression Label"
) +
theme_minimal()
anxiety_depression_mean<-social_media_data %>%
group_by(depression_label) %>%
summarise(avg_anxiety = mean(anxiety_level, na.rm = TRUE))
ggplot(data = anxiety_depression_mean) +
geom_bar(
aes(
x = factor(depression_label),
y = avg_anxiety,
fill = factor(depression_label)
),
stat = "identity"
) +
geom_text(
aes(
x = factor(depression_label),
y = avg_anxiety,
label = round(avg_anxiety, 2)
),
vjust = -0.5,
size = 3.3
) +
scale_fill_manual(
values = c("0" = "steelblue", "1" = "tomato"),
name = "Depression label"
) +
labs(
x = "Depression label",
y = "Average anxiety levels",
title = "Average anxiety levels by depression Label"
) +
theme_minimal()
There is a strong positive association between depression and anxiety in this dataset.
Participants labeled as depressed exhibit higher average anxiety levels than those who are not.
This pattern aligns with established psychological research showing that these conditions frequently co‑occur.
addiction_depression_mean<-social_media_data %>%
group_by(depression_label) %>%
summarise(avg_addiction = mean(addiction_level, na.rm = TRUE))
ggplot(data = addiction_depression_mean) +
geom_bar(
aes(
x = factor(depression_label),
y = avg_addiction,
fill = factor(depression_label)
),
stat = "identity"
) +
geom_text(
aes(
x = factor(depression_label),
y = avg_addiction,
label = round(avg_addiction, 2)
),
vjust = -0.5,
size = 3.3
) +
scale_fill_manual(
values = c("0" = "steelblue", "1" = "tomato"),
name = "Depression label"
) +
labs(
x = "Depression label",
y = "Average addiction levels",
title = "Average addiction levels by depression Label"
) +
theme_minimal()
gpa_depression_mean<-social_media_data %>%
group_by(depression_label) %>%
summarise(avg_gpa = mean(academic_performance, na.rm = TRUE))
ggplot(data = gpa_depression_mean) +
geom_bar(
aes(
x = factor(depression_label),
y = avg_gpa,
fill = factor(depression_label)
),
stat = "identity"
) +
geom_text(
aes(
x = factor(depression_label),
y = avg_gpa,
label = round(avg_gpa, 2)
),
vjust = -0.5,
size = 3.3
) +
scale_fill_manual(
values = c("0" = "steelblue", "1" = "tomato"),
name = "Depression label"
) +
labs(
x = "Depression label",
y = "Average GPA",
title = "Average GPA by depression Label"
) +
theme_minimal()
physical_depression_mean<-social_media_data %>%
group_by(depression_label) %>%
summarise(avg_physical_hours= mean(physical_activity, na.rm = TRUE))
ggplot(data =physical_depression_mean) +
geom_bar(
aes(
x = factor(depression_label),
y = avg_physical_hours,
fill = factor(depression_label)
),
stat = "identity"
) +
geom_text(
aes(
x = factor(depression_label),
y = avg_physical_hours,
label = round(avg_physical_hours, 2)
),
vjust = -0.5,
size = 3.3
) +
scale_fill_manual(
values = c("0" = "steelblue", "1" = "tomato"),
name = "Depression label"
) +
labs(
x = "Depression label",
y = "Average physical activity hours",
title = "Average physical activity hours by depression Label"
) +
theme_minimal()
There is a slight negative correlation between depression and physical activity in this dataset: participants labeled as depressed report somewhat less physical activity on average.
This pattern is consistent with psychological research showing that depressive symptoms often coincide with lower levels of physical activity.
corr_matrix<-social_media_data%>%
select(1,3,5,6,7,8,10,11,12,13)%>%
cor()
corrplot(corr_matrix,
method = "color",
tl.col = "black"
)
Screen time before sleep vs age: There is a slight positive correlation between age and screen time before sleep, indicating that older participants spend somewhat more time on screens before bed, though the relationship is weak.
Anxiety level vs Academic performance: The heatmap shows a weak negative correlation, suggesting that higher anxiety may be linked to slightly lower academic performance.
Depression vs daily social media usage: There is a mild positive correlation between depression and daily social media use, indicating that increased social media time may be linked to higher reports of depression.
Depression vs Sleep hours: There is a clear negative relationship, with participants experiencing depression reporting fewer hours of sleep on average.
Depression vs Stress Level: Depression shows a moderate positive correlation with stress, indicating that higher stress is linked to a greater likelihood of depression.
Depression vs Anxiety level: There is a positive relationship between depression and anxiety, with higher anxiety levels associated with increased reports of depression.
Addiction level vs Sleep hours: The heatmap indicates a slight negative correlation, suggesting that higher addiction levels are linked to reduced sleep duration, though the effect is small.
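Descriptions such as "slight", "mild", and "moderate" are easier to verify when the coefficients are listed next to them. The sketch below is one possible way to turn a correlation matrix into a ranked list of variable pairs using base R only. The data are simulated and the `toy` data frame is an assumption; with the real data, the `corr_matrix` computed earlier could be used in its place.

```r
# Hypothetical stand-in for the numeric columns of social_media_data (simulated).
set.seed(3)
n <- 300
stress <- rnorm(n, mean = 5.5, sd = 2)
toy <- data.frame(
  stress_level = stress,
  anxiety_level = 0.6 * stress + rnorm(n, mean = 2, sd = 1.5),  # built to correlate with stress
  sleep_hours = rnorm(n, mean = 6.3, sd = 1),
  academic_performance = rnorm(n, mean = 3, sd = 0.5)
)

# Pearson correlations, ignoring missing values pair by pair
corr_matrix <- cor(toy, use = "pairwise.complete.obs")

# Keep each pair once (upper triangle) and sort by absolute correlation
idx <- which(upper.tri(corr_matrix), arr.ind = TRUE)
ranked <- data.frame(
  var1 = rownames(corr_matrix)[idx[, "row"]],
  var2 = colnames(corr_matrix)[idx[, "col"]],
  r = corr_matrix[idx]
)
ranked <- ranked[order(-abs(ranked$r)), ]
print(ranked, row.names = FALSE, digits = 2)
```

A table like this makes it straightforward to state each relationship with its coefficient instead of a qualitative label.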
| Factor | Predictor of Depression | Explanation |
|---|---|---|
| Sleep duration | Yes | Depressed students sleep ~2 hours less; no overlap in IQR; highly significant difference. |
| Daily social media usage | Yes | Depressed students use social media more (6 hrs vs. 4.5 hrs). |
| Stress levels | Yes | Depressed students report ~58% higher stress. |
| Anxiety levels | Yes | Strong positive correlation; anxiety and depression co‑occur. |
| Screen Time Before Sleep | No | No significant difference; p = 0.57. |
| Social Media Addiction | No | Average addiction levels nearly identical across groups. |
| Academic Performance (GPA) | No | GPA is almost the same for depressed and non‑depressed students. |
| Physical Activity | No | Slight negative correlation, but not strong enough to predict depression. |
| Platform Used (TikTok/Instagram/Both) | No | Stress/anxiety vary slightly by platform, but depression rates do not. |
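The table above judges each factor separately. A logistic regression is one way to assess several factors jointly, since it estimates how each one relates to the odds of a positive depression label while holding the others fixed. This model is not part of the original analysis; the sketch below is an illustration on simulated data, where the `toy` data frame and the effect sizes used to generate it are assumptions.

```r
# Hypothetical data: the label is simulated so that fewer sleep hours and higher
# stress raise the odds of depression, while screen time has no effect by construction.
set.seed(11)
n <- 500
sleep_hours <- rnorm(n, mean = 6.3, sd = 1.2)
stress_level <- round(runif(n, min = 1, max = 10))
screen_time_before_sleep <- runif(n, min = 0, max = 4)
linear_pred <- -1.0 - 1.2 * (sleep_hours - 6.3) + 0.4 * (stress_level - 5.5)
toy <- data.frame(
  depression_label = rbinom(n, size = 1, prob = plogis(linear_pred)),
  sleep_hours, stress_level, screen_time_before_sleep
)

# Logistic regression of the binary label on the candidate predictors
fit <- glm(
  depression_label ~ sleep_hours + stress_level + screen_time_before_sleep,
  data = toy, family = binomial
)

# Odds ratios (exponentiated coefficients) with their Wald-test p-values
odds_ratios <- exp(coef(fit))
p_values <- summary(fit)$coefficients[, "Pr(>|z|)"]
print(round(cbind(odds_ratio = odds_ratios, p_value = p_values), 3))
```

An odds ratio below 1 means the odds of a positive label fall as the predictor increases, and an odds ratio above 1 means they rise.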