Abstract
As technology advances and society’s dependence on it deepens, there is
growing concern that excessive technology use raises users’ stress
levels. The American Psychological Association found that high
attachment to devices and constant use of technology are associated with
higher stress levels, and that about a fifth of Americans identify the
use of technology as a significant source of stress (American
Psychological Association, 2017). Examining the relationship between
technology usage and stress is therefore important for understanding how
technology may contribute to stress among its users.
Problem
As noted above, technology usage has been associated with higher stress
levels among users. Factors other than technology can also influence
stress and wellness, including but not limited to age, gender, and
physical activity levels (Idrees, Blal et al.). Understanding which
factors affect stress can help people manage it and improve personal
wellness.
Purpose
The purpose of this project is to determine if stress levels differ
significantly by gender or age group, if screen time is correlated with
mental health scores, if stress and physical activity are independent,
and if regression models can predict stress levels based on tech use.
Understanding how tech use impacts wellness can inform healthier digital
habits and public health strategies.
The Dataset
The dataset used in this project was published on Kaggle by Nagpal
Prabhavalkar. It contains information on 5,000 participants and 25
variables. The variables are as
follows: the user id of the participant, the age of the participant, the
participant’s gender, daily screen time (in hours), daily phone usage
(in hours), daily laptop usage (in hours), daily tablet usage (in
hours), daily tv usage (in hours), daily social media usage (in hours),
daily work related technology usage (in hours), entertainment hours,
gaming hours, sleep duration (in hours), sleep quality (rated on a scale
of 1 to 5), mood rating (rated on a scale of 1 to 10), stress level
(rated on a scale of 1 to 10), physical activity (in hours per week),
location type (rated as rural, suburban, or urban), mental health score
(rated on a scale of 1 to 100), if the participant uses wellness apps
(true or false), if the participant eats healthy (true or false),
caffeine intake (in milligrams per day), weekly anxiety score (rated
from 0 to 20), weekly depression score (from 0 to 20) and the
participant’s mindfulness score (in minutes per day).
This project focuses on gender, age, daily screen time, stress level,
mental health score, physical activity, mood rating, sleep quality, and
location type. Note that stress level, mental health score, mood rating,
and sleep quality are subjective self-ratings, so participants may not
have interpreted or answered these scales consistently.
This project analyzes the relationship between technology usage
(daily screen time, phone usage) and wellness outcomes (stress level,
mental health score, sleep quality). We investigate differences across
age groups, genders, and location types.
Research Questions:
• Do stress levels differ significantly by gender or age group?
• Is screen time correlated with mental health scores?
• Are stress and physical activity independent?
• Can regression models predict stress from tech use?
Relevance: Understanding how tech use impacts wellness can inform
healthier digital habits and public health strategies.
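# Load packages used throughout the analysis
library(dplyr) # %>%, group_by(), summarise(), select()
library(ggplot2) # all plots
library(infer) # specify(), generate(), calculate(), get_ci()
library(patchwork) # p1 + p2 side-by-side plot layouts
# Load the data
data <- read.csv("Tech_Use_Stress_Wellness.csv")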
# Exploratory Data Analysis
#View(data)
#str(data)
#glimpse(data)
#summary(data)
ggplot(data, aes(x = daily_screen_time_hours)) +
geom_histogram(binwidth = 1, fill = "steelblue", color = "white") +
labs(title = "Distribution of Daily Screen Time")
# Daily screen time by gender
ggplot(data, aes(x = daily_screen_time_hours, fill = gender)) +
geom_histogram() +
labs(title = "Daily Screen Time per Gender",
x = "Daily Screen Time (hours)",
y = "Count per Gender") +
theme_minimal()
# Correlation between daily screen time and stress level
ggplot(data, aes(x = daily_screen_time_hours, y = stress_level, color = gender)) +
geom_point(alpha = 0.6) +
geom_smooth(method = "lm", se = FALSE) +
labs(title = "Screen Time vs Stress by Gender",
x = "Daily Screen Time (hours)",
y = "Stress Level") + theme_minimal()
# Distribution of Screen Time and Physical Activity
ggplot(data, aes(x = daily_screen_time_hours)) +
geom_histogram(binwidth = 1, fill = "steelblue", color = "white") +
labs(title = "Distribution of Daily Screen Time",
x = "Daily Screen Time (hours)",
y = "Count") +
theme_minimal()
ggplot(data, aes(x = physical_activity_hours_per_week)) +
geom_histogram(binwidth = 1, fill = "darkgreen", color = "white") +
labs(title = "Distribution of Weekly Physical Activity",
x = "Physical Activity (hours per week)",
y = "Count") +
theme_minimal()
# Wellness
ggplot(data, aes(x = daily_screen_time_hours, y = mental_health_score)) +
geom_point(alpha = 0.6, color = "steelblue", size = 2) + # softer points
geom_smooth(method = "lm", se = TRUE, color = "darkred", linetype = "dashed", linewidth = 1.2) + # linewidth replaces size for lines in ggplot2 >= 3.4.0
labs(
title = "Daily Screen Time vs Mental Health Score",
x = "Daily Screen Time (hours)",
y = "Mental Health Score") + theme_minimal()
# Summary statistics
data %>%
group_by(gender) %>%
summarise(mean_stress = mean(stress_level, na.rm = TRUE),
mean_wellness = mean(mental_health_score, na.rm = TRUE))
## # A tibble: 3 × 3
## gender mean_stress mean_wellness
## <chr> <dbl> <dbl>
## 1 Female 5.72 64.7
## 2 Male 5.69 64.9
## 3 Other 6.07 63.4
data %>%
group_by(location_type) %>%
summarise(mean_stress = mean(stress_level, na.rm = TRUE),
mean_wellness = mean(mental_health_score, na.rm = TRUE))
## # A tibble: 3 × 3
## location_type mean_stress mean_wellness
## <chr> <dbl> <dbl>
## 1 Rural 5.64 65.2
## 2 Suburban 5.63 65.1
## 3 Urban 5.81 64.4
data %>%
group_by(age) %>%
summarise(mean_stress = mean(stress_level, na.rm = TRUE),
mean_wellness = mean(mental_health_score, na.rm = TRUE))
## # A tibble: 60 × 3
## age mean_stress mean_wellness
## <int> <dbl> <dbl>
## 1 15 8.96 52.3
## 2 16 8.56 54.3
## 3 17 8.99 52.1
## 4 18 8.71 52.6
## 5 19 8.9 52.5
## 6 20 8.94 52.4
## 7 21 8.13 55.1
## 8 22 8.89 52.4
## 9 23 8.68 52.5
## 10 24 8.94 51.2
## # ℹ 50 more rows
# Mean daily screen time by location type
data %>%
group_by(location_type) %>%
summarise(mean_screen_time = mean(daily_screen_time_hours, na.rm = TRUE))
## # A tibble: 3 × 2
## location_type mean_screen_time
## <chr> <dbl>
## 1 Rural 5.00
## 2 Suburban 4.94
## 3 Urban 5.11
# 95% bootstrap percentile CI for the overall mean stress level,
# using the infer package
boot_dist <- data %>%
specify(response = stress_level) %>%
generate(reps = 1000, type = "bootstrap") %>%
calculate(stat = "mean")
boot_ci <- boot_dist %>%
get_ci(level = 0.95, type = "percentile")
boot_ci
## # A tibble: 1 × 2
## lower_ci upper_ci
## <dbl> <dbl>
## 1 5.63 5.80
# Manual Bootstrap for mean stress level from Chap05Bootstrap.Rmd Chihara, L., & Hesterberg, T. (2019). Mathematical Statistics with Resampling and R.
x <- data$stress_level
n <- length(x)
N <- 10^4
data.mean<-numeric(N)
#set.seed(2025)
for (i in 1:N)
{
boot_sample <- sample(x, n, replace = TRUE)
data.mean[i] <- mean(boot_sample)
}
# Check normality
hist(data.mean, main = "Bootstrap distribution of means")
abline(v = mean(data.mean), col = "blue", lty = 2)
qqnorm(data.mean)
qqline(data.mean)
#bootstrap mean
mean(data.mean)
## [1] 5.717703
#bootstrap standard error or std dev of the boot means
sd(data.mean)
## [1] 0.04138756
# 95% boot percentile CI
quantile(data.mean, c(0.025, 0.975))
## 2.5% 97.5%
## 5.63760 5.79861
# Hypothesis test
# Example: Is mean stress > 8?
t.test(data$stress_level, mu = 8, alternative = "greater")
##
## One Sample t-test
##
## data: data$stress_level
## t = -55.345, df = 4999, p-value = 1
## alternative hypothesis: true mean is greater than 8
## 95 percent confidence interval:
## 5.650578 Inf
## sample estimates:
## mean of x
## 5.7184
# Compare stress by gender: gender has 3 levels, so a single two-sample t-test is insufficient; use pairwise t-tests with Bonferroni adjustment
pairwise.t.test(data$stress_level, data$gender, p.adjust.method = "bonferroni")
##
## Pairwise comparisons using t tests with pooled SD
##
## data: data$stress_level and data$gender
##
## Female Male
## Male 1.00 -
## Other 0.34 0.24
##
## P value adjustment method: bonferroni
# Correlation matrix for key numeric variables
numeric_vars <- data %>%
select(stress_level,
daily_screen_time_hours,
mental_health_score,
physical_activity_hours_per_week)
cor_matrix <- cor(numeric_vars, use = "complete.obs")
cor_matrix
## stress_level daily_screen_time_hours
## stress_level 1.0000000 0.6656726
## daily_screen_time_hours 0.6656726 1.0000000
## mental_health_score -0.9038402 -0.6218312
## physical_activity_hours_per_week -0.7421300 -0.4626024
## mental_health_score
## stress_level -0.9038402
## daily_screen_time_hours -0.6218312
## mental_health_score 1.0000000
## physical_activity_hours_per_week 0.7858499
## physical_activity_hours_per_week
## stress_level -0.7421300
## daily_screen_time_hours -0.4626024
## mental_health_score 0.7858499
## physical_activity_hours_per_week 1.0000000
# One-sample CI and hypothesis test for mean stress level
mean_stress <- mean(data$stress_level, na.rm = TRUE)
sd_stress <- sd(data$stress_level, na.rm = TRUE)
n_stress <- sum(!is.na(data$stress_level))
# 95% CI for population mean stress
se_stress <- sd_stress / sqrt(n_stress)
lower_ci <- mean_stress - 1.96 * se_stress
upper_ci <- mean_stress + 1.96 * se_stress
c(lower_ci = lower_ci, upper_ci = upper_ci)
## lower_ci upper_ci
## 5.637599 5.799201
# One-sample t-test: is mean stress > 5?
t.test(data$stress_level, mu = 5, alternative = "greater")
##
## One Sample t-test
##
## data: data$stress_level
## t = 17.426, df = 4999, p-value < 2.2e-16
## alternative hypothesis: true mean is greater than 5
## 95 percent confidence interval:
## 5.650578 Inf
## sample estimates:
## mean of x
## 5.7184
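# Correlation between activity and stress
# (full column name used; relying on `$` partial matching is fragile)
cor.test(data$physical_activity_hours_per_week, data$stress_level)
##
## Pearson's product-moment correlation
##
## data: data$physical_activity_hours_per_week and data$stress_level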
## t = -78.278, df = 4998, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.7543317 -0.7294157
## sample estimates:
## cor
## -0.74213
# Linear regression: stress ~ screen time
lm_model <- lm(stress_level ~ daily_screen_time_hours, data = data)
summary(lm_model)
##
## Call:
## lm(formula = stress_level ~ daily_screen_time_hours, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.6869 -1.6127 -0.0125 1.5731 6.2244
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.39280 0.08988 4.37 1.27e-05 ***
## daily_screen_time_hours 1.05711 0.01676 63.06 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.176 on 4998 degrees of freedom
## Multiple R-squared: 0.4431, Adjusted R-squared: 0.443
## F-statistic: 3977 on 1 and 4998 DF, p-value: < 2.2e-16
# Plot regression
ggplot(data, aes(x = daily_screen_time_hours, y = stress_level)) +
geom_point(alpha = 0.6, color = "steelblue") +
geom_smooth(method = "lm", se = TRUE, color = "darkred", linetype = "dashed") +
labs(title = "Screen Time vs Stress Level")+ theme_minimal()
# Simple linear regression: mental health ~ screen time
lm_mh_screen <- lm(mental_health_score ~ daily_screen_time_hours, data = data)
summary(lm_mh_screen)
##
## Call:
## lm(formula = mental_health_score ~ daily_screen_time_hours, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -32.476 -7.280 -0.032 7.308 37.495
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 87.11546 0.42374 205.59 <2e-16 ***
## daily_screen_time_hours -4.43626 0.07903 -56.13 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.26 on 4998 degrees of freedom
## Multiple R-squared: 0.3867, Adjusted R-squared: 0.3866
## F-statistic: 3151 on 1 and 4998 DF, p-value: < 2.2e-16
# 95% CI for the screen-time slope
confint(lm_mh_screen, parm = "daily_screen_time_hours", level = 0.95)
## 2.5 % 97.5 %
## daily_screen_time_hours -4.591194 -4.281327
# Simple linear regression: stress ~ physical activity
lm_stress_activity <- lm(stress_level ~ physical_activity_hours_per_week, data = data)
summary(lm_stress_activity)
##
## Call:
## lm(formula = stress_level ~ physical_activity_hours_per_week,
## data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.625 -1.418 -0.058 1.753 9.921
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.23193 0.04236 194.31 <2e-16 ***
## physical_activity_hours_per_week -0.94517 0.01207 -78.28 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.954 on 4998 degrees of freedom
## Multiple R-squared: 0.5508, Adjusted R-squared: 0.5507
## F-statistic: 6127 on 1 and 4998 DF, p-value: < 2.2e-16
# 95% CI for the physical-activity slope
confint(lm_stress_activity,
parm = "physical_activity_hours_per_week",
level = 0.95)
## 2.5 % 97.5 %
## physical_activity_hours_per_week -0.9688433 -0.9215002
# Simple linear regression: stress ~ mental health score
lm_stress_mh <- lm(stress_level ~ mental_health_score, data = data)
summary(lm_stress_mh)
##
## Call:
## lm(formula = stress_level ~ mental_health_score, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.2690 -0.8606 -0.0499 0.9084 5.1334
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.748775 0.089018 210.6 <2e-16 ***
## mental_health_score -0.201191 0.001347 -149.3 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.247 on 4998 degrees of freedom
## Multiple R-squared: 0.8169, Adjusted R-squared: 0.8169
## F-statistic: 2.23e+04 on 1 and 4998 DF, p-value: < 2.2e-16
# 95% CI for the slope (effect of mental health on stress)
confint(lm_stress_mh, parm = "mental_health_score", level = 0.95)
## 2.5 % 97.5 %
## mental_health_score -0.2038321 -0.1985499
# ANOVA: stress by age group
data$age_group <- cut(
data$age,
breaks = c(0, 25, 40, 60, 80, 100),
labels = c("≤25", "26-40", "41-60", "61-80", "81+"),
right = TRUE
)
anova_age <- aov(stress_level ~ age_group, data = data)
summary(anova_age)
## Df Sum Sq Mean Sq F value Pr(>F)
## age_group 3 10928 3643 576.8 <2e-16 ***
## Residuals 4996 31551 6
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Diagnostics: Q-Q plot
plot(anova_age, which = 2)
# Permutation ANOVA for robustness
set.seed(2025)
observed <- anova(lm(stress_level ~ age_group, data = data))$`F value`[1]
N <- 1000
results <- numeric(N)
for (i in 1:N) {
index <- sample(seq_along(data$stress_level))
stress_perm <- data$stress_level[index]
results[i] <- anova(lm(stress_perm ~ age_group, data = data))$`F value`[1]
}
perm_p <- (sum(results >= observed) + 1) / (N + 1)
perm_p
## [1] 0.000999001
# Post-hoc Tukey HSD
TukeyHSD(anova_age)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = stress_level ~ age_group, data = data)
##
## $age_group
## diff lwr upr p adj
## 26-40-≤25 -3.2093500 -3.4917621 -2.9269378 0.0000000
## 41-60-≤25 -3.6099167 -3.8781558 -3.3416777 0.0000000
## 61-80-≤25 -4.3777632 -4.6641864 -4.0913399 0.0000000
## 41-60-26-40 -0.4005668 -0.6415778 -0.1595558 0.0001164
## 61-80-26-40 -1.1684132 -1.4295117 -0.9073148 0.0000000
## 61-80-41-60 -0.7678465 -1.0135454 -0.5221475 0.0000000
# Plots side by side
p1 <- ggplot(data, aes(x = age_group, y = stress_level)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Stress Level by Age Group") + theme_minimal()
p2 <- ggplot(data, aes(x = age_group, y = daily_screen_time_hours)) +
geom_jitter(width = 0.2, alpha = 0.5, color = "darkgreen") +
labs(title = "Daily Screen Time by Age Group") + theme_minimal()
p1 + p2
Stress by Age Group
A one-way ANOVA was conducted to examine differences in stress levels
across age groups. One group (81+) had no observations, so the analysis
effectively compared four age groups (≤25, 26–40, 41–60, 61–80). The
ANOVA indicated a strong overall effect of age group on stress, F(3,
4996) = 576.8, p < 0.001.
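The empty 81+ group is why the ANOVA has 3 rather than 4 numerator degrees of freedom. A minimal sketch of the binning, assuming purely for illustration that ages run from 15 to 74 (the per-age summary lists 60 distinct ages starting at 15, but the exact range is an assumption) and using an ASCII label `<=25` in place of the original label:

```r
# Hypothetical ages: 60 consecutive values starting at 15 (an assumption,
# not read from the dataset).
ages <- 15:74

# Same breaks as the analysis; "<=25" stands in for the original label.
age_group <- cut(
  ages,
  breaks = c(0, 25, 40, 60, 80, 100),
  labels = c("<=25", "26-40", "41-60", "61-80", "81+"),
  right = TRUE
)

# Count observations per level; the "81+" level exists but is empty.
group_counts <- table(age_group)
print(group_counts)

# Numerator degrees of freedom = number of non-empty groups - 1.
n_nonempty <- sum(group_counts > 0)
df_between <- n_nonempty - 1
print(df_between)
```

Calling `droplevels()` on the factor before fitting would remove the unused level explicitly, though `aov()` arrives at the same degrees of freedom either way.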
Tukey post-hoc comparisons showed that the youngest group (≤25) reported
substantially higher mean stress than all older groups (differences ≈
3.2–4.4 units on the stress scale, all p < 0.001). Stress decreased
steadily with age: each successive age group had significantly lower
mean stress than the one before it (all adjusted p < 0.001), and the
61–80 group had the lowest stress overall.
A permutation ANOVA with 1,000 permutations yielded p ≈ 0.001, the
smallest value attainable with that many permutations. This is
consistent with the classical ANOVA and reinforces that the age-related
differences in stress are highly unlikely to be due to chance.
Visualizations of stress and daily screen time by age group suggested
that younger participants reported both higher screen time and higher
stress, pointing to a potential link between intensive tech use and
stress in younger participants.
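The F statistic shows that the group means differ, but not how much of the variation in stress age group accounts for. As a rough supplement, eta squared can be computed directly from the sums of squares printed in the ANOVA table. The variable names below are illustrative, and the inputs are the rounded values from the output, so the results are approximate.

```r
# Values copied from the printed ANOVA table (rounded in the output).
ss_between <- 10928   # Sum Sq for age_group
ss_within  <- 31551   # Sum Sq for Residuals
df_between <- 3
df_within  <- 4996

# Eta squared: share of total variation in stress attributed to age group.
eta_sq <- ss_between / (ss_between + ss_within)

# Recompute F from the same table as a consistency check.
f_stat <- (ss_between / df_between) / (ss_within / df_within)

print(round(c(eta_squared = eta_sq, F = f_stat), 3))
```

With these inputs, eta squared comes out at roughly 0.26, meaning age group accounts for about a quarter of the variance in stress, and the recomputed F agrees with the reported 576.8.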
# ANOVA: stress by location type
anova_loc <- aov(stress_level ~ location_type, data = data)
summary(anova_loc)
## Df Sum Sq Mean Sq F value Pr(>F)
## location_type 2 39 19.309 2.273 0.103
## Residuals 4997 42441 8.493
# Diagnostics
plot(anova_loc, which = 2)
# Permutation ANOVA
set.seed(2025)
observed <- anova(lm(stress_level ~ location_type, data = data))$`F value`[1]
N <- 1000
results <- numeric(N)
for (i in 1:N) {
index <- sample(seq_along(data$stress_level))
stress_perm <- data$stress_level[index]
results[i] <- anova(lm(stress_perm ~ location_type, data = data))$`F value`[1]
}
perm_p <- (sum(results >= observed) + 1) / (N + 1)
perm_p
## [1] 0.1288711
# Post-hoc Tukey HSD
TukeyHSD(anova_loc)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = stress_level ~ location_type, data = data)
##
## $location_type
## diff lwr upr p adj
## Suburban-Rural -0.01390902 -0.29024488 0.2624268 0.9923526
## Urban-Rural 0.16735340 -0.08479586 0.4195027 0.2650430
## Urban-Suburban 0.18126242 -0.04329960 0.4058244 0.1409578
# Plots side by side
p3 <- ggplot(data, aes(x = location_type, y = stress_level)) +
geom_boxplot(fill = "lightblue") +
labs(title = "Stress Level by Location") + theme_minimal()
p4 <- ggplot(data, aes(x = location_type, y = daily_screen_time_hours)) +
geom_jitter(width = 0.2, alpha = 0.5, color = "darkgreen") +
labs(title = "Daily Screen Time by Location") + theme_minimal()
p3 + p4
Stress by Location Type
A one-way ANOVA compared mean stress levels across location types
(Urban, Suburban, Rural). The overall effect of location was not
statistically significant, F(2, 4997) = 2.27, p = 0.10. A permutation
ANOVA produced a similar p-value (≈ 0.13), indicating that any observed
differences in mean stress by location could plausibly be due to random
variation.
Tukey post-hoc comparisons showed that the estimated mean stress in
urban areas was slightly higher than in rural and suburban areas, but
all confidence intervals included zero and all adjusted p-values
exceeded 0.10, so no pairwise differences met conventional significance
thresholds.
Boxplots and jittered scatterplots suggest a small trend toward higher
stress and higher screen time in urban participants, but given the
non-significant test results, these patterns should be interpreted
cautiously as suggestive rather than conclusive.
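Both permutation p-values follow from the formula used in the code, (k + 1) / (N + 1), where k is the number of permuted F statistics at least as large as the observed one. A short sketch of that arithmetic, using a hypothetical helper named `perm_p_value`:

```r
# Permutation p-value with the +1 correction used in the analysis code.
# k: number of permuted statistics >= observed; N: number of permutations.
perm_p_value <- function(k, N) (k + 1) / (N + 1)

N <- 1000

# Smallest attainable p-value: no permuted statistic reaches the observed one.
p_min <- perm_p_value(0, N)

# Invert the formula to recover k from the reported location-type p-value.
p_location <- 0.1288711
k_location <- round(p_location * (N + 1) - 1)

print(p_min)
print(k_location)
```

With N = 1000 the floor is about 0.001, which is the value the age-group test returned. The location-type p-value corresponds to 128 of the 1,000 permuted F statistics matching or exceeding the observed one.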
# Chi-square test for independence of mood and sleep quality
tab <- table(data$mood_rating, data$sleep_quality)
chisq.test(tab)
##
## Pearson's Chi-squared test
##
## data: tab
## X-squared = 1083.8, df = 360, p-value < 2.2e-16
chi <- chisq.test(tab)
#chi$stdres
# Identify cells with large standardized residuals
thr <- 3
idx <- which(chi$stdres > thr, arr.ind = TRUE)
cbind(
mood_rating = rownames(chi$stdres)[idx[,"row"]],
sleep_quality = colnames(chi$stdres)[idx[,"col"]],
std_residual = chi$stdres[idx]
)
## mood_rating sleep_quality std_residual
## [1,] "1.1" "1" "9.57135908716979"
## [2,] "1" "2" "5.09738157829577"
## [3,] "2.8" "2" "3.382164807067"
## [4,] "3.7" "2" "3.2831630034317"
## [5,] "4" "2" "4.39496047660441"
## [6,] "1" "3" "14.3292517589722"
## [7,] "1.2" "3" "4.35712315816821"
## [8,] "1.5" "3" "3.42716588748514"
## [9,] "1.6" "3" "3.15054601419805"
## [10,] "7.5" "5" "3.94815298706102"
## [11,] "7.8" "5" "5.50404644494962"
## [12,] "7.9" "5" "4.74986302156519"
## [13,] "8" "5" "4.76489626598922"
## [14,] "8.2" "5" "3.2582363043431"
## [15,] "8.3" "5" "4.4404007644662"
## [16,] "8.4" "5" "4.19693667031868"
## [17,] "8.6" "5" "4.19166505701785"
## [18,] "8.7" "5" "3.7209925989437"
## [19,] "8.9" "5" "4.63503982505443"
## [20,] "9.3" "5" "3.39793654540615"
## [21,] "9.4" "5" "3.17726105895486"
## [22,] "9.5" "5" "3.02666882845495"
## [23,] "9.6" "5" "3.33977948902774"
## [24,] "10" "5" "7.52878571901207"
# Bin mood ratings into categories
data$mood_cat <- cut(data$mood_rating,
breaks = c(0, 3, 7, 10),
labels = c("Low", "Medium", "High"),
include.lowest = TRUE)
# Drop rows with missing values
clean_df <- na.omit(data[, c("mood_cat", "sleep_quality")])
# Force both variables to be factors with explicit levels
clean_df$mood_cat <- factor(clean_df$mood_cat,
levels = c("Low", "Medium", "High"))
clean_df$sleep_quality <- factor(clean_df$sleep_quality,
levels = c("1","2","3","4","5"))
# Build contingency table
tab <- table(clean_df$mood_cat, clean_df$sleep_quality)
library(knitr)
# Convert to data frame for kable
df_tab <- as.data.frame.matrix(tab)
# Create kable
kable(df_tab, caption = "Contingency Table of Mood Category by Sleep Quality",
align = "c")
Table: Contingency Table of Mood Category by Sleep Quality
| Mood category | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Low | 1 | 22 | 599 | 1096 | 177 |
| Medium | 0 | 9 | 336 | 1224 | 445 |
| High | 0 | 0 | 44 | 578 | 469 |
Independence of Mood and Sleep Quality
A chi‑square test of independence was performed to assess the
relationship between mood ratings and sleep quality. The test indicated
a statistically significant association, χ²(360) = 1083.8, p < 0.001.
Because the raw mood-by-sleep table has many sparse cells (only one
participant reported a sleep quality of 1), the chi-square approximation
should be read with some caution. Standardized residuals showed that the
largest contributions to the statistic came from participants with very
low mood ratings (≈1–2) combined with sleep quality ratings of 1–3,
where the two largest residuals were about 9.6 and 14.3. This suggests
that participants reporting lower sleep quality were disproportionately
likely to also report very low mood. Conversely, participants with high
mood ratings (≈7.5–10) paired with excellent sleep quality (rating = 5)
also showed large positive residuals (≈3–7.5), indicating that good
sleep was associated with higher mood scores.
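The binned contingency table is built in the code but not itself tested. The sketch below shows one way that could be done. It re-enters the printed counts by hand and merges sleep ratings 1–3 into one column so that no expected count is near zero; the merge is an assumption of this sketch, not a step in the original analysis. Cramér's V is added as a simple effect-size measure.

```r
# Counts copied from the mood category by sleep quality table.
counts <- matrix(
  c(1, 22, 599, 1096, 177,
    0,  9, 336, 1224, 445,
    0,  0,  44,  578, 469),
  nrow = 3, byrow = TRUE,
  dimnames = list(mood = c("Low", "Medium", "High"),
                  sleep = c("1", "2", "3", "4", "5"))
)

# Merge the sparse sleep ratings 1-3 (assumption made for this sketch).
collapsed <- cbind("1-3" = rowSums(counts[, 1:3]), counts[, 4:5])

# Chi-square test of independence on the resulting 3 x 3 table.
res <- chisq.test(collapsed)

# Cramer's V: sqrt(chi2 / (n * (min(rows, cols) - 1))).
n <- sum(collapsed)
cramers_v <- sqrt(unname(res$statistic) / (n * (min(dim(collapsed)) - 1)))

print(res)
print(cramers_v)
```

Binning gives up the fine-grained residual pattern but yields a test whose approximation is easier to trust, and Cramér's V puts the strength of the association on a 0 to 1 scale.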
# Bootstrap CI for mean stress level
# Manual bootstrap from Chap05Bootstrap.Rmd
x <- data$stress_level
n <- length(x)
N <- 1000
data.mean <- numeric(N)
for (i in 1:N) {
boot_sample <- sample(x, n, replace = TRUE)
data.mean[i] <- mean(boot_sample)
}
mean(data.mean)
## [1] 5.716836
sd(data.mean)
## [1] 0.04237383
quantile(data.mean, c(0.025, 0.975))
## 2.5% 97.5%
## 5.636770 5.802415
# Compare to parametric t-CI
t.test(x)$conf.int
## [1] 5.63758 5.79922
## attr(,"conf.level")
## [1] 0.95
Bootstrap vs Parametric Confidence Intervals
To evaluate the robustness of the mean stress estimate, both a
parametric t‑based confidence interval and bootstrap percentile
intervals were computed. The parametric 95% CI for mean stress was
narrow (5.638 to 5.799) and centered on the sample mean of 5.72. The
bootstrap percentile CI from 1,000 resamples (5.637 to 5.802) closely
matched it, with differences only in the third decimal place. This
agreement suggests that the parametric assumptions were reasonable for
this dataset, and the bootstrap provides reassurance that the interval
does not depend on normality of the stress scores. Together, these
analyses indicate that stress levels vary meaningfully across age
groups, that mood and sleep are interrelated, and that bootstrap methods
can be used to check parametric inference.
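The parametric interval can be reproduced from the summary statistics reported earlier, which makes the comparison with the bootstrap interval concrete. The sketch below backs the standard error out of the one-sample t statistic for mu = 5. Because the inputs are rounded printed values, the result is approximate.

```r
# Reported values from the one-sample t-test against mu = 5.
mean_stress <- 5.7184
t_stat      <- 17.426
mu0         <- 5
df          <- 4999

# Standard error implied by t = (mean - mu0) / se.
se_stress <- (mean_stress - mu0) / t_stat

# Two-sided 95% t interval.
t_crit <- qt(0.975, df)
t_ci <- mean_stress + c(-1, 1) * t_crit * se_stress

# Bootstrap percentile interval reported from the 1,000-resample run.
boot_ci <- c(5.636770, 5.802415)

print(round(se_stress, 5))
print(round(t_ci, 4))
print(round(boot_ci - t_ci, 4))  # gap between the two intervals
```

The implied standard error (about 0.0412) is close to the bootstrap standard errors of 0.0414 and 0.0424 reported in the two bootstrap runs, and the two intervals differ by only a few thousandths.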
Conclusion
This study investigated how technology use relates to stress and broader
wellness outcomes through ANOVA, chi-square tests, correlation,
regression, and bootstrap methods. Stress levels differed significantly
across age groups, with younger participants (≤25) reporting the highest
stress and stress decreasing steadily with age. Gender differences were
minimal (no pairwise comparison was significant after Bonferroni
adjustment), and location type (Urban, Suburban, Rural) showed no
statistically significant effect, though urban participants displayed a
small, non-significant trend toward higher stress and screen time.
Regression analyses showed that daily screen time was a strong positive
predictor of stress (about 1.06 points per additional hour, R² ≈ 0.44)
and a negative predictor of mental health score (about −4.4 points per
additional hour, R² ≈ 0.39). Physical activity was strongly negatively
correlated with stress (r ≈ −0.74), so stress and physical activity are
clearly not independent in this sample. Mood and sleep quality were also
closely linked: chi-square tests indicated that poor sleep was
disproportionately associated with very low mood, while excellent sleep
aligned with high mood ratings. Bootstrap confidence intervals for mean
stress closely matched parametric t-based intervals, reinforcing the
robustness of that estimate.
Overall, these results are consistent with the concern raised in the
abstract that technology use is associated with higher stress and lower
wellness, particularly among younger participants, while also showing
that wellness indicators such as sleep, mood, and physical activity are
interrelated. The findings suggest that healthier digital habits,
combined with lifestyle interventions, may help mitigate stress and
promote better overall wellness.
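For a simple linear regression, R² equals the squared Pearson correlation, which gives a quick consistency check between the correlation matrix and the regression summaries. The sketch below uses the printed values. The helper `predict_stress` and its 6-hour input are purely illustrative.

```r
# Correlations from the correlation matrix (as printed).
r <- c(stress_vs_screen   =  0.6656726,
       mh_vs_screen       = -0.6218312,
       stress_vs_activity = -0.7421300,
       stress_vs_mh       = -0.9038402)

# Multiple R-squared values from the four regression summaries,
# listed in the same order as the correlations above.
r2_reported <- c(0.4431, 0.3867, 0.5508, 0.8169)

# In simple regression, R-squared is the squared correlation.
r2_from_cor <- r^2
print(round(cbind(r2_from_cor, r2_reported), 4))

# Fitted line for stress ~ screen time, using the reported coefficients.
predict_stress <- function(hours) 0.39280 + 1.05711 * hours
print(predict_stress(6))  # illustrative input: 6 hours of screen time
```

The squared correlations match the reported R² values to four decimal places, which confirms that the regression and correlation outputs describe the same relationships.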
Insights:
• Stress varies substantially across age, with younger adults (≤25)
reporting the highest levels and older groups reporting lower
stress.
• Mood and sleep quality are strongly associated, consistent with the
idea that sleep and emotional wellbeing are intertwined.
• Bootstrap methods produced confidence intervals that closely matched
parametric t-intervals, increasing confidence in the stability of the
stress estimates.
Limitations:
• Data are self‑reported, which may introduce bias or measurement
error.
• The study is observational, limiting causal inference.
• Unequal group sizes and potential confounders (e.g., occupation,
health status) may influence results.
References
• Kaggle Dataset: Tech Use & Stress Wellness.
https://www.kaggle.com/datasets/nagpalprabhavalkar/tech-use-and-stress-wellness?resource=download
• Chihara, L., & Hesterberg, T. (2019). Mathematical Statistics with
Resampling and R.
• American Psychological Association (2017). Stress in America: Coping
with Change. Stress in America™ Survey.