Partona Anastasia 1180042
FMAI-08 Applied Statistics
Forensic Medicine, Anthropology and Imaging - University of Crete
Nowadays, almost every teenager uses social media and looks at a screen daily. While these apps are great for staying in touch with friends, parents and teachers worry about how they affect teenagers’ mental health. Some research shows that spending time online, especially before bed, can make teens feel more stressed and anxious (Twenge et al., 2018).
Instead of just looking at “screen time” in general, we look at two different habits: daytime social media use and late-night screen use. Using social media during the day may contribute to anxiety because teens constantly compare themselves to others (Vannucci et al., 2017). On the other hand, looking at screens right before bed is known to disrupt sleep, which makes it harder to handle emotions the next day (Cain & Gradisar, 2010; Orben & Przybylski, 2019).
To what extent do daily social media hours and screen time before sleep predict anxiety, stress, and depression levels in teenagers?
The data used in this analysis were obtained from the Teenager Mental Health data set available on Kaggle by Algozee (n.d.). The data set was assembled to study how social media use affects the mental health of teenagers. It covers 1,200 teenagers aged 13 to 19 and 13 variables, including daily habits and outcomes such as social media hours, sleep, stress, anxiety and physical activity. It can be used to analyze behavior and to build machine learning models that predict mental health risk, which makes it useful for basic research and for models aimed at early detection of mental health issues in teenagers.
#Loading Necessary Packages
library(tidyverse)
## Warning: package 'ggplot2' was built under R version 4.5.3
## Warning: package 'purrr' was built under R version 4.5.3
## Warning: package 'dplyr' was built under R version 4.5.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.2.1 ✔ readr 2.2.0
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.3 ✔ tibble 3.3.1
## ✔ lubridate 1.9.5 ✔ tidyr 1.3.2
## ✔ purrr 1.2.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
#Importing Data
teen_data <- read.csv("Teen_Mental_Health_Dataset.csv", stringsAsFactors = TRUE)
#Viewing the data
head(teen_data)
## age gender daily_social_media_hours platform_usage sleep_hours
## 1 14 male 7.9 Instagram 7.4
## 2 19 female 1.9 TikTok 8.0
## 3 17 female 1.3 Instagram 7.6
## 4 15 male 7.4 TikTok 6.9
## 5 15 female 4.7 Both 4.9
## 6 19 female 7.4 Both 4.4
## screen_time_before_sleep academic_performance physical_activity
## 1 2.9 3.01 1.5
## 2 2.9 3.22 0.8
## 3 0.5 3.92 0.0
## 4 1.6 3.48 0.8
## 5 3.0 2.37 1.4
## 6 2.4 2.63 0.6
## social_interaction_level stress_level anxiety_level addiction_level
## 1 low 2 2 1
## 2 high 8 1 10
## 3 high 2 4 2
## 4 medium 1 7 9
## 5 medium 3 5 2
## 6 high 3 5 7
## depression_label
## 1 0
## 2 0
## 3 0
## 4 0
## 5 0
## 6 0
# View() opens an interactive viewer, so only call it in an interactive session
if (interactive()) View(teen_data)
str(teen_data)
## 'data.frame': 1200 obs. of 13 variables:
## $ age : int 14 19 17 15 15 19 18 16 19 15 ...
## $ gender : Factor w/ 2 levels "female","male": 2 1 1 2 1 1 1 2 1 2 ...
## $ daily_social_media_hours: num 7.9 1.9 1.3 7.4 4.7 7.4 2.5 4 3.3 1.9 ...
## $ platform_usage : Factor w/ 3 levels "Both","Instagram",..: 2 3 2 3 1 1 2 1 3 3 ...
## $ sleep_hours : num 7.4 8 7.6 6.9 4.9 4.4 6.4 4.2 5 4.9 ...
## $ screen_time_before_sleep: num 2.9 2.9 0.5 1.6 3 2.4 2.4 0.5 2.1 1.5 ...
## $ academic_performance : num 3.01 3.22 3.92 3.48 2.37 2.63 2.63 2.4 2.04 3.77 ...
## $ physical_activity : num 1.5 0.8 0 0.8 1.4 0.6 0.7 1.3 0.9 1.1 ...
## $ social_interaction_level: Factor w/ 3 levels "high","low","medium": 2 1 1 3 3 1 2 2 1 1 ...
## $ stress_level : int 2 8 2 1 3 3 2 6 1 1 ...
## $ anxiety_level : int 2 1 4 7 5 5 2 10 10 1 ...
## $ addiction_level : int 1 10 2 9 2 7 5 5 9 4 ...
## $ depression_label : int 0 0 0 0 0 0 0 0 0 0 ...
summary(teen_data)
## age gender daily_social_media_hours platform_usage
## Min. :13.00 female:585 Min. :1.000 Both :391
## 1st Qu.:14.00 male :615 1st Qu.:2.800 Instagram:411
## Median :16.00 Median :4.500 TikTok :398
## Mean :15.93 Mean :4.537
## 3rd Qu.:18.00 3rd Qu.:6.300
## Max. :19.00 Max. :8.000
## sleep_hours screen_time_before_sleep academic_performance
## Min. :4.000 Min. :0.50 Min. :2.00
## 1st Qu.:5.200 1st Qu.:1.10 1st Qu.:2.50
## Median :6.500 Median :1.80 Median :2.99
## Mean :6.449 Mean :1.74 Mean :2.99
## 3rd Qu.:7.600 3rd Qu.:2.40 3rd Qu.:3.48
## Max. :9.000 Max. :3.00 Max. :4.00
## physical_activity social_interaction_level stress_level anxiety_level
## Min. :0.000 high :369 Min. : 1.000 Min. : 1.000
## 1st Qu.:0.500 low :415 1st Qu.: 3.000 1st Qu.: 3.000
## Median :1.000 medium:416 Median : 5.000 Median : 6.000
## Mean :1.014 Mean : 5.446 Mean : 5.637
## 3rd Qu.:1.500 3rd Qu.: 8.000 3rd Qu.: 8.000
## Max. :2.000 Max. :10.000 Max. :10.000
## addiction_level depression_label
## Min. : 1.000 Min. :0.00000
## 1st Qu.: 3.000 1st Qu.:0.00000
## Median : 6.000 Median :0.00000
## Mean : 5.565 Mean :0.02583
## 3rd Qu.: 8.000 3rd Qu.:0.00000
## Max. :10.000 Max. :1.00000
#Checking for missing values
colSums(is.na(teen_data))
## age gender daily_social_media_hours
## 0 0 0
## platform_usage sleep_hours screen_time_before_sleep
## 0 0 0
## academic_performance physical_activity social_interaction_level
## 0 0 0
## stress_level anxiety_level addiction_level
## 0 0 0
## depression_label
## 0
There are no missing values, so no rows need to be removed from the data set.
#Exploring the data
#Summarizing Social Media Hours
mean(teen_data$daily_social_media_hours)
## [1] 4.536667
median(teen_data$daily_social_media_hours)
## [1] 4.5
sd(teen_data$daily_social_media_hours)
## [1] 2.029599
#Summarizing Screen Time Before Bed
mean(teen_data$screen_time_before_sleep)
## [1] 1.740333
median(teen_data$screen_time_before_sleep)
## [1] 1.8
sd(teen_data$screen_time_before_sleep)
## [1] 0.7166598
#Summarizing Stress Levels
mean(teen_data$stress_level)
## [1] 5.445833
median(teen_data$stress_level)
## [1] 5
sd(teen_data$stress_level)
## [1] 2.90329
Depression is coded as a binary label (0 = No, 1 = Yes), so a mean or standard deviation is not very informative. A frequency table of counts and percentages is a more appropriate way to summarize how many teenagers are labeled as depressed.
#Depression - Table 1
table(teen_data$depression_label)
##
## 0 1
## 1169 31
Table 1 shows the exact count of “Yes” and “No”.
#Depression - Table 2
prop.table(table(teen_data$depression_label)) * 100
##
## 0 1
## 97.416667 2.583333
Table 2 shows the percentage breakdown.
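Tables 1 and 2 can also be combined into a single summary. The snippet below is a minimal sketch of that idea; `label_summary` is a hypothetical helper that is not part of the original analysis, and the input vector is rebuilt from the counts printed in Table 1 so the snippet runs without the CSV file.

```r
# Hypothetical helper: counts and percentages of a binary (0/1) label
# in one data frame.
label_summary <- function(x) {
  counts <- table(x)
  data.frame(
    label   = names(counts),
    n       = as.vector(counts),
    percent = round(100 * as.vector(prop.table(counts)), 2)
  )
}

# Rebuild the label from the counts reported in Table 1 (1169 "No", 31 "Yes")
depression <- rep(c(0, 1), times = c(1169, 31))
label_summary(depression)

# With the loaded data this would be:
# label_summary(teen_data$depression_label)
```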
# Histogram for daily social media hours
hist(teen_data$daily_social_media_hours,
main = "Distribution of Daily Social Media Use",
xlab = "Hours per Day",
col = "skyblue",
border = "black")
# Histogram for screen time before bed
hist(teen_data$screen_time_before_sleep,
main = "Distribution of Screen Time Before Sleep",
xlab = "Hours Before Bed",
col = "lightgreen",
border = "black")
The two histograms above visualize social media habits, that is, how many hours teens spend on screens. Each histogram splits the hours into intervals and shows how many teens fall into each one, so we can see which amounts of time are most common.
# Does daily social media predict anxiety?
plot(teen_data$daily_social_media_hours, teen_data$anxiety_level,
main = "Daily Social Media Hours vs. Anxiety Level",
xlab = "Social Media Hours",
ylab = "Anxiety Score (1-10)",
pch = 16, col = "darkblue")
# Add a red trend line
abline(lm(anxiety_level ~ daily_social_media_hours, data = teen_data), col = "red", lwd = 2)
# Does screen time before sleep predict stress?
plot(teen_data$screen_time_before_sleep, teen_data$stress_level,
main = "Screen Time Before Sleep vs. Stress Level",
xlab = "Screen Time Before Sleep (Hours)",
ylab = "Stress Score (1-10)",
pch = 16, col = "darkgreen")
# Add a red trend line
abline(lm(stress_level ~ screen_time_before_sleep, data = teen_data), col = "red", lwd = 2)
The scatter plots above visualize the relationships, to see whether more social media hours or more screen time before bed go together with higher anxiety or stress. The call to abline() adds the fitted straight line from a simple linear regression, which shows whether the trend goes up or down.
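A trend line shows the direction of a relationship but not its strength. As a minimal sketch (not part of the original analysis), the Pearson correlation and its test can put a number on each scatter plot; for a single predictor its p-value matches the slope test from lm(). `cor_summary` is a hypothetical helper, and `demo` is synthetic stand-in data (not the Kaggle data) that only exists so the snippet runs on its own.

```r
# Hypothetical helper: Pearson correlation with 95% CI and p-value.
cor_summary <- function(x, y) {
  ct <- cor.test(x, y, method = "pearson")
  c(r       = unname(ct$estimate),
    lower   = ct$conf.int[1],
    upper   = ct$conf.int[2],
    p_value = ct$p.value)
}

# Synthetic stand-in data, NOT the real data set
set.seed(1)
demo <- data.frame(
  daily_social_media_hours = runif(200, 1, 8),
  anxiety_level            = sample(1:10, 200, replace = TRUE)
)
round(cor_summary(demo$daily_social_media_hours, demo$anxiety_level), 3)

# With the loaded data this would be:
# cor_summary(teen_data$daily_social_media_hours, teen_data$anxiety_level)
# cor_summary(teen_data$screen_time_before_sleep, teen_data$stress_level)
```

Because the scores are whole numbers from 1 to 10, a rank-based alternative (`method = "spearman"` in cor.test()) may also be worth reporting.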
# Compare social media hours for teens with and without depression
boxplot(daily_social_media_hours ~ depression_label, data = teen_data,
main = "Daily Social Media Hours by Depression Diagnosis",
xlab = "Is the Teen Labeled with Depression?",
ylab = "Daily Social Media Hours",
col = c("lightgray", "coral"))
As mentioned previously, depression is a binary (categorical) variable, so a scatter plot would not be informative. A boxplot is a better option for seeing whether teens who are labeled as depressed spend more hours on social media than those who are not.
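The visual comparison in the boxplot can be supplemented with group medians and a two-sample test. The sketch below uses the Wilcoxon rank-sum test, which is one reasonable choice and not a step taken in the original analysis. `compare_groups` is a hypothetical helper, and `demo` is synthetic stand-in data used only to make the snippet self-contained.

```r
# Hypothetical helper: per-group medians, group sizes and a
# Wilcoxon rank-sum test for a two-level grouping variable.
compare_groups <- function(value, group) {
  group <- factor(group)
  stopifnot(nlevels(group) == 2)
  list(
    medians = tapply(value, group, median),
    n       = table(group),
    p_value = wilcox.test(value ~ group, exact = FALSE)$p.value
  )
}

# Synthetic stand-in data, NOT the real data set
set.seed(2)
demo <- data.frame(
  daily_social_media_hours = round(runif(300, 1, 8), 1),
  depression_label         = rep(c(0, 1), times = c(270, 30))
)
compare_groups(demo$daily_social_media_hours, demo$depression_label)

# With the loaded data this would be:
# compare_groups(teen_data$daily_social_media_hours, teen_data$depression_label)
```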
#Linear Regressions for Anxiety and Stress - Continuous Scores
#Anxiety
anxiety_model <- lm(anxiety_level ~ daily_social_media_hours + screen_time_before_sleep, data = teen_data)
summary(anxiety_model)
##
## Call:
## lm(formula = anxiety_level ~ daily_social_media_hours + screen_time_before_sleep,
## data = teen_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.810 -2.602 0.303 2.396 4.543
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.53500 0.28036 19.742 <2e-16 ***
## daily_social_media_hours 0.03979 0.04073 0.977 0.329
## screen_time_before_sleep -0.04530 0.11535 -0.393 0.695
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.861 on 1197 degrees of freedom
## Multiple R-squared: 0.0009035, Adjusted R-squared: -0.0007658
## F-statistic: 0.5412 on 2 and 1197 DF, p-value: 0.5822
#Stress
stress_model <- lm(stress_level ~ daily_social_media_hours + screen_time_before_sleep, data = teen_data)
summary(stress_model)
##
## Call:
## lm(formula = stress_level ~ daily_social_media_hours + screen_time_before_sleep,
## data = teen_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.6403 -2.4564 -0.3523 2.5547 4.7368
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.31317 0.28464 18.666 <2e-16 ***
## daily_social_media_hours 0.04441 0.04135 1.074 0.283
## screen_time_before_sleep -0.03954 0.11711 -0.338 0.736
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.904 on 1197 degrees of freedom
## Multiple R-squared: 0.001037, Adjusted R-squared: -0.0006316
## F-statistic: 0.6216 on 2 and 1197 DF, p-value: 0.5373
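A p-value alone does not show how precisely a slope is estimated, so confidence intervals are a useful companion to the two linear model summaries. From the printed estimate and standard error for daily_social_media_hours in the anxiety model, a 95% interval is roughly 0.040 ± 1.96 × 0.041, or about -0.04 to 0.12, which includes zero. The sketch below shows how confint() produces such intervals; `coef_table` is a hypothetical helper, and the model is fitted to synthetic stand-in data so that the snippet runs without the CSV file.

```r
# Hypothetical helper: coefficient estimates with confidence intervals.
coef_table <- function(model, level = 0.95) {
  ci <- confint(model, level = level)
  data.frame(
    term      = rownames(ci),
    estimate  = unname(coef(model)),
    lower     = ci[, 1],
    upper     = ci[, 2],
    row.names = NULL
  )
}

# Synthetic stand-in data, NOT the real data set
set.seed(4)
demo <- data.frame(
  daily_social_media_hours = runif(200, 1, 8),
  screen_time_before_sleep = runif(200, 0.5, 3),
  anxiety_level            = sample(1:10, 200, replace = TRUE)
)
demo_model <- lm(anxiety_level ~ daily_social_media_hours + screen_time_before_sleep,
                 data = demo)
coef_table(demo_model)

# With the fitted models this would be:
# coef_table(anxiety_model)
# coef_table(stress_model)
```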
#Logistic Regression for Depression - Diagnostic Indicator
depression_model <- glm(depression_label ~ daily_social_media_hours + screen_time_before_sleep,
data = teen_data, family = binomial)
summary(depression_model)
##
## Call:
## glm(formula = depression_label ~ daily_social_media_hours + screen_time_before_sleep,
## family = binomial, data = teen_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -7.6169 1.0687 -7.127 1.03e-12 ***
## daily_social_media_hours 0.7550 0.1465 5.154 2.55e-07 ***
## screen_time_before_sleep -0.2186 0.2578 -0.848 0.396
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 287.87 on 1199 degrees of freedom
## Residual deviance: 244.55 on 1197 degrees of freedom
## AIC: 250.55
##
## Number of Fisher Scoring iterations: 8
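Our statistical analysis separates the continuous symptom scores from the binary depression label. Neither daily social media hours nor screen time before sleep is a statistically significant predictor of anxiety (\(p = 0.329\) and \(p = 0.695\)) or stress (\(p = 0.283\) and \(p = 0.736\)), and both linear models explain almost none of the variance (\(R^2 \approx 0.001\)). In the logistic regression, however, daily social media hours are a highly significant predictor of the depression label (\(p < 0.001\)). The coefficient of 0.755 corresponds to an odds ratio of \(e^{0.755} \approx 2.13\), so each additional daily hour is associated with roughly double the odds of being labeled as depressed, holding screen time before sleep constant. Screen time before sleep is not a significant predictor of depression (\(p = 0.396\)). Because only 31 of the 1,200 teenagers carry the depression label, this estimate should be interpreted with caution.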
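The coefficients in the logistic regression output are on the log-odds scale, which is hard to read directly. Exponentiating a coefficient gives an odds ratio. The sketch below does this for daily_social_media_hours using the estimate (0.7550) and standard error (0.1465) copied from the printed summary; `wald_odds_ratio` is a hypothetical helper, and the interval is a simple Wald approximation rather than a result reported in the original analysis.

```r
# Hypothetical helper: odds ratio and Wald confidence interval from a
# log-odds coefficient and its standard error.
wald_odds_ratio <- function(estimate, std_error, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  c(odds_ratio = exp(estimate),
    lower      = exp(estimate - z * std_error),
    upper      = exp(estimate + z * std_error))
}

# Values copied from the printed summary for daily_social_media_hours
round(wald_odds_ratio(0.7550, 0.1465), 2)
# odds_ratio is about 2.13, with an interval of roughly 1.6 to 2.8

# With the fitted object, all odds ratios at once:
# exp(coef(depression_model))
```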
The data shows that the average teenager in this study spends 4.54 hours a day on social media and 1.74 hours looking at a screen right before going to sleep. At the same time, the average anxiety score is 5.64 and the average stress score is 5.45 (out of 10).
These average scores sit near the midpoint of the scale. Vannucci et al. (2017) found that emerging adults who spent more time on social media each day reported higher anxiety, but our regression models did not detect such a relationship in this sample, so the averages cannot be attributed to the roughly 4.5 hours of daily social media use on their own.
In our data set, teenagers who spend a lot of time on screens right before bed also tend to get less sleep and get lower grades.
This matches research by Cain and Gradisar (2010), who showed that using electronics right before bed delays bedtime and causes sleep loss. Furthermore, Orben and Przybylski (2019) point out that social media does the most damage to a teen’s well-being when it replaces healthy habits such as getting enough sleep or exercising.
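The statement that teens with more screen time before bed sleep less and get lower grades is not accompanied by an analysis in this report. A minimal sketch of how it could be checked is shown below, using Spearman rank correlations between screen_time_before_sleep and the two other columns. `rank_cor` is a hypothetical helper, and `demo` is synthetic stand-in data, so the numbers it prints say nothing about the real data set.

```r
# Hypothetical helper: Spearman rank correlations between one predictor
# and several other columns of a data frame.
rank_cor <- function(data, predictor, others) {
  sapply(others, function(v) {
    cor(data[[predictor]], data[[v]], method = "spearman")
  })
}

# Synthetic stand-in data, NOT the real data set
set.seed(6)
demo <- data.frame(
  screen_time_before_sleep = round(runif(200, 0.5, 3), 1),
  sleep_hours              = round(runif(200, 4, 9), 1),
  academic_performance     = round(runif(200, 2, 4), 2)
)
round(rank_cor(demo, "screen_time_before_sleep",
               c("sleep_hours", "academic_performance")), 3)

# With the loaded data this would be:
# rank_cor(teen_data, "screen_time_before_sleep",
#          c("sleep_hours", "academic_performance"))
```

Clearly negative values in the real data would support the statement, while values near zero would not.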
Finally, about 2.58% of the teenagers in this data set (31 of 1,200) are labeled as having depression. Although this is a small percentage, the logistic regression tests whether heavier social media use is associated with a higher risk of carrying that label. This links to the work of Twenge et al. (2018), who reported that depressive symptoms among US adolescents increased after 2010, around the time smartphones and social media became popular.
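To relate the logistic regression to the 2.58% base rate, the fitted coefficients can be turned into predicted probabilities. The sketch below plugs the coefficients printed in the model summary into the inverse logit function plogis(), holding screen time before sleep at its sample mean of 1.74 hours. `predicted_risk` is a hypothetical helper, and the chosen hour values are illustrative.

```r
# Coefficients copied from the printed logistic regression summary
b0       <- -7.6169   # intercept
b_hours  <-  0.7550   # daily_social_media_hours
b_screen <- -0.2186   # screen_time_before_sleep

# Hypothetical helper: predicted probability of the depression label.
predicted_risk <- function(hours, screen_before_sleep) {
  plogis(b0 + b_hours * hours + b_screen * screen_before_sleep)
}

# Illustrative values across the observed range of 1 to 8 hours
hours <- c(2, 4, 6, 8)
data.frame(hours = hours,
           risk  = round(predicted_risk(hours, screen_before_sleep = 1.74), 4))

# With the fitted object the same numbers come from:
# predict(depression_model,
#         newdata = data.frame(daily_social_media_hours = hours,
#                              screen_time_before_sleep = 1.74),
#         type = "response")
```

Under these coefficients the predicted probability rises from well under 1% at 2 hours a day to roughly 12% at 8 hours a day.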
In conclusion, this report answers our research question as follows: in this sample, daily social media hours are a strong predictor of the depression label, whereas neither daily social media hours nor screen time before sleep predicts anxiety or stress scores.
The analysis therefore draws a line between day-to-day emotional scores and the more serious outcome. Daily social media use and late-night screen exposure are not statistically significant predictors of a teenager’s anxiety or stress score, but each additional hour of daily social media use is associated with significantly higher odds of being labeled as depressed (\(p < 0.001\)). This suggests that stress and anxiety levels vary with factors other than the two screen habits studied here, while heavy daily social media use goes together with a higher probability of the depression label. Because this is an association in observed data rather than the result of an experiment, it does not show that social media use causes depression.
Algozee. (n.d.). Teenager mental health [Data set]. Kaggle. Retrieved June 24, 2026, from https://www.kaggle.com/datasets/algozee/teenager-menthal-healy?resource=download
Cain, N., & Gradisar, M. (2010). Electronic media use and sleep in school-aged children and adolescents: A review. Sleep Medicine, 11(8), 735-742.
Orben, A., & Przybylski, A. K. (2019). The association between adolescent well-being and digital technology use. Nature Human Behaviour, 3(2), 173-182.
Twenge, J. M., Joiner, T. E., Rogers, M. L., & Martin, G. N. (2018). Increases in depressive symptoms, suicide-related outcomes, and suicide rates among US adolescents after 2010 and income-level links to increased new media screen time. Clinical Psychological Science, 6(1), 3-17.
Vannucci, A., Flannery, K. M., & Ohannessian, C. M. (2017). Social media use and anxiety in emerging adults. Journal of Affective Disorders, 207, 163-166.