This analysis explores confidence intervals using the Social Media and Entertainment Dataset. Key objectives:
We choose:
# Create new calculated columns
data <- data %>%
  mutate(
    Engagement_Score = `Daily Social Media Time (hrs)` / Age,
    Adjusted_Sleep_Quality = `Sleep Quality (scale 1-10)` / Age
  )
# Display first few rows with new columns
head(data)
## # A tibble: 6 × 42
## `User ID` Age Gender Country Daily Social Media Tim…¹ Daily Entertainment …²
## <dbl> <dbl> <chr> <chr> <dbl> <dbl>
## 1 1 32 Other Germany 4.35 4.08
## 2 2 62 Other India 4.96 4.21
## 3 3 51 Female USA 6.78 1.77
## 4 4 44 Female India 5.06 9.21
## 5 5 21 Other Germany 2.57 1.3
## 6 6 21 Male Canada 4.69 1.7
## # ℹ abbreviated names: ¹`Daily Social Media Time (hrs)`,
## # ²`Daily Entertainment Time (hrs)`
## # ℹ 36 more variables: `Social Media Platforms Used` <dbl>,
## # `Primary Platform` <chr>, `Daily Messaging Time (hrs)` <dbl>,
## # `Daily Video Content Time (hrs)` <dbl>, `Daily Gaming Time (hrs)` <dbl>,
## # Occupation <chr>, `Marital Status` <chr>, `Monthly Income (USD)` <dbl>,
## # `Device Type` <chr>, `Internet Speed (Mbps)` <dbl>, …
ggplot(data, aes(x = Adjusted_Sleep_Quality, y = Engagement_Score)) +
  geom_point(alpha = 0.5, color = "darkgreen") +
  geom_smooth(method = "lm", color = "black") +
  labs(
    title = "Adjusted Sleep Quality vs Engagement Score",
    x = "Adjusted Sleep Quality",
    y = "Engagement Score"
  ) +
  theme_minimal()
Observations:
We compute correlation coefficients to measure relationships.
cor_age_social <- cor(data$Age, data$`Daily Social Media Time (hrs)`, use = "complete.obs")
cor_sleep_engagement <- cor(data$Adjusted_Sleep_Quality, data$Engagement_Score, use = "complete.obs")
# Display results
cor_results <- tibble(
  Relationship = c("Age vs Social Media Time", "Adjusted Sleep Quality vs Engagement Score"),
  Correlation_Coefficient = c(cor_age_social, cor_sleep_engagement)
)
cor_results
## # A tibble: 2 × 2
## Relationship Correlation_Coefficient
## <chr> <dbl>
## 1 Age vs Social Media Time -0.00223
## 2 Adjusted Sleep Quality vs Engagement Score 0.430
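Both Adjusted_Sleep_Quality and Engagement_Score are divided by Age, and a shared denominator can by itself produce a positive correlation between two ratios. The sketch below illustrates that effect on simulated data only. The variables `social_hrs`, `sleep_quality`, and `age` are hypothetical stand-ins drawn independently of one another; they are not taken from the dataset, and the result says nothing definitive about how much of the 0.430 above arises this way.

```r
# Simulated, mutually independent stand-in variables (hypothetical values,
# not the Social Media and Entertainment Dataset).
set.seed(2024)
n <- 5000
social_hrs    <- runif(n, min = 0, max = 8)        # daily hours
sleep_quality <- sample(1:10, n, replace = TRUE)   # 1-10 scale
age           <- sample(18:65, n, replace = TRUE)  # years

# Correlation between the raw variables: close to zero by construction.
cor_raw <- cor(social_hrs, sleep_quality)

# Correlation after dividing both by the same denominator (age).
cor_ratio <- cor(social_hrs / age, sleep_quality / age)

cor_raw
cor_ratio
```

If the same comparison is run on the real columns (raw sleep quality vs. raw social media time, next to the two age-adjusted columns), a large gap between the two coefficients would suggest that the division by Age, rather than the underlying behaviour, drives much of the relationship.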
Interpretation:
We construct 95% confidence intervals for:
ci_engagement_score <- data %>%
  specify(response = Engagement_Score) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean") %>%
  get_confidence_interval(level = 0.95, type = "percentile")
ci_engagement_score
## # A tibble: 1 × 2
## lower_ci upper_ci
## <dbl> <dbl>
## 1 0.134 0.135
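To make the percentile bootstrap less of a black box, here is a minimal base R sketch of roughly what the pipeline above does: resample with replacement, compute the mean of each resample, and take the 2.5% and 97.5% quantiles of those means. The vector `x` is simulated and hypothetical; it only stands in for Engagement_Score, so the numbers it produces are not results for this dataset.

```r
# Hypothetical stand-in for the Engagement_Score column (simulated values).
set.seed(123)
x <- runif(300, min = 0.02, max = 0.25)

# 1000 bootstrap resamples: draw length(x) values with replacement each time
# and record the mean of every resample.
boot_means <- replicate(
  1000,
  mean(sample(x, size = length(x), replace = TRUE))
)

# Percentile interval: the middle 95% of the bootstrap means.
ci <- quantile(boot_means, probs = c(0.025, 0.975))
ci
```

The width of such an interval shrinks as the sample grows, because the resampled means vary less from one resample to the next.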
Interpretation:
Key Findings:
1. Social Media Time vs Age
Observations: