Final Project

Author

Shannon Miller

Social Media’s Influence on Feelings of Depression in Teens and Young Adults

Introduction

As technology has become more commonplace, so have social deficiencies and mental illnesses. Though we all grow up with evolving technology, it often seems that these advancements have progressed faster than our society has. With few age restrictions and even fewer content filters, social media, online communication, and gaming have transformed from a form of community into an imminent threat to community.

As a result of its popularity among children and teens, social media has become a pressing concern for parents and mental health professionals. Though the safety and mental wellness of users, especially young users, should be paramount to social media corporations, this responsibility is often delegated to the users themselves, leaving children and adolescents to navigate potential dangers on their own.

With rates of childhood and adolescent mental illness growing, it is crucial to examine how social media use may contribute to these trends. This analysis focuses on how social media use relates to feelings of depression in international participants aged 13 to 21 years.

Methods

Using a pre-existing dataset detailing stress, happiness, income, education, and more, I examined the correlation between social media usage and depression in teens and young adults. To determine whether there was a significant correlation between social media usage and depression scores in participants aged 13 to 21 years, I used Pearson’s product-moment correlation test.

To conduct this test, I first filtered the age range of interest into a separate dataset. Using this subset, I ran a Pearson correlation test of the association between social media usage in minutes per day and self-reported depression scores. I then used ggplot2 and ggpubr to create a scatter plot with a linear regression line, representing the data visually.

All code was run in RStudio using the packages dplyr, gganimate, ggExtra, ggplot2, ggpubr, gtsummary, knitr, readr, shiny, tidyr, broom, and tibble.

Descriptive Statistics

Summary statistics for participants aged 13 to 21 years show that this age group has an average “depression score” of 9 out of 30 (mean = 9, SD = 6). This is especially concerning when considered alongside social media usage: participants in this age group spend an average of nearly three hours on social media per day (mean = 178 minutes, SD = 132).

library(shiny)
library(dplyr)

library(ggplot2)
library(ggpubr)
library(gtsummary)
library(broom)
library(knitr)

finaldata <- read.csv("finaldata.csv")

youth <- finaldata %>%
  filter(age >= 13 & age <= 21)
youthtable <- youth %>%
  dplyr::select(depression_score, social_media_mins) %>%
  tbl_summary(
    statistic = list(all_continuous() ~ "{mean} ({sd}) [{min}, {max}]"),
    label = list(
      depression_score ~ "Depression Score",
      social_media_mins ~ "Social Media Usage (mins)"
    )
  ) %>%
  modify_header(label ~ "**Variable**") %>% 
  bold_labels()

youthtable
Variable                     N = 945¹
Depression Score             9 (6) [0, 27]
Social Media Usage (mins)    178 (132) [17, 631]
¹ Mean (SD) [Min, Max]

By graphing these variables by age, we can visualize trends in social media usage and depression across the adolescent group. The graph shows that 16-year-olds, on average, use social media the least and have the lowest depression scores. In contrast, 14-year-olds, who use social media the second most, have the highest depression scores in the “youth” group of this data.

age_trends <- youth %>%
  group_by(age) %>%
  summarize(
    avg_social = mean(social_media_mins, na.rm = TRUE),
    avg_dep = mean(depression_score, na.rm = TRUE)
  )

ggplot(age_trends, aes(x = age)) +
  geom_line(aes(y = avg_social, color = "Social Media (Mins)"), linewidth = 1.2) +
  geom_point(aes(y = avg_social, color = "Social Media (Mins)"), size = 3) +
  geom_line(aes(y = avg_dep * 20, color = "Depression Score (Scaled)"), linewidth = 1.2, linetype = "dashed") +
  geom_point(aes(y = avg_dep * 20, color = "Depression Score (Scaled)"), size = 3) +
  scale_y_continuous(
    name = "Daily Social Media Usage (Mins)",
    sec.axis = sec_axis(~./20, name = "Average Depression Score")
  ) +
  scale_x_continuous(breaks = 13:21) +
  labs(title = "Social Media Usage and Depression Trends in Adolescents",
       x = "Age",
       color = "Metric") +
  theme_classic() +
  theme(legend.position = "bottom")
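
The plot code above draws depression scores on the minutes axis by multiplying them by 20, and `sec_axis(~./20)` divides by the same factor so the right-hand axis reads in score units. The short sketch below checks that the two transformations undo each other. The `toy_trends` values are made up for illustration and are not taken from finaldata.csv.

```r
# Hypothetical age-level averages, for illustration only (not from finaldata.csv)
toy_trends <- data.frame(
  age = c(13, 14, 15),
  avg_social = c(180, 190, 170),
  avg_dep = c(9.0, 9.5, 8.5)
)

scale_factor <- 20  # same constant used in the plot code above

# Forward transform: put depression scores on the minutes scale for plotting
toy_trends$dep_scaled <- toy_trends$avg_dep * scale_factor

# Inverse transform: what sec_axis(~./20) applies to the axis labels
toy_trends$dep_recovered <- toy_trends$dep_scaled / scale_factor

print(toy_trends)
```

A factor of 20 is close to the ratio of the two group means reported earlier (178 minutes to a score of 9), which is likely why the two lines sit at a similar height on the plot.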

Results

With the Pearson correlation test, we obtained the correlation coefficient (r), the t-statistic, and the p-value. Together, these values indicate whether there is a significant correlation between social media usage and depression scores in the youth age group.

The test yielded r = 0.0032, t(943) = 0.0984, and p = 0.9216.

pearson_result <- cor.test(youth$social_media_mins, youth$depression_score, method = "pearson")

correlation_table <- tidy(pearson_result) %>%
  select(
    `Correlation (r)` = estimate,
    `t-statistic` = statistic,
    `Degrees of Freedom (df)` = parameter,
    `p-value` = p.value,
    `Lower CI (95%)` = conf.low,
    `Upper CI (95%)` = conf.high
  )

kable(correlation_table, digits = 4, caption = "Pearson Correlation Table (N = 945)")
Pearson Correlation Table (N = 945)

Correlation (r)   t-statistic   Degrees of Freedom (df)   p-value   Lower CI (95%)   Upper CI (95%)
0.0032            0.0984        943                       0.9216    -0.0606          0.067
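
As a sanity check, the t-statistic, p-value, and confidence interval in the Pearson correlation table above can be approximately recovered from r and N alone. The sketch below uses the rounded r from the table, so small differences in the last decimal place are expected. The variable names are illustrative and not part of the original analysis.

```r
# Values taken from the Pearson correlation table above
r <- 0.0032   # correlation coefficient (rounded to 4 decimals)
n <- 945      # sample size
df <- n - 2   # degrees of freedom for the t-test

# t-statistic for testing whether the true correlation is zero
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval via the Fisher z-transformation
z <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

print(round(c(t = t_stat, p = p_value, lower = ci[1], upper = ci[2]), 4))
```

The confidence interval spans zero, which is consistent with the non-significant p-value.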

# Initialize the plot and add layers before labels/themes
ggplot(youth, aes(x = social_media_mins, y = depression_score)) +
  geom_point(alpha = 0.4, color = "salmon", size = 2) +
  geom_smooth(method = "lm", color = "navy", fill = "grey50", linewidth = 1.1, level = 0.95) +
  stat_cor(method = "pearson", label.x = 30, label.y = 25, size = 4.5) +
  labs(
    title = "Relationship Between Daily Social Media Usage and Depression Scores",
    subtitle = "Pearson Correlation Analysis (N = 945)",
    x = "Daily Social Media Usage (Mins)",
    y = "Depression Score (0-30)"
  ) +
  theme_pubr(base_size = 12, legend = "none") +
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5, face = "italic", color = "black"),
    axis.title = element_text(face = "bold")
  )

Discussion

The Pearson correlation test shows no statistically significant correlation between social media usage and depression scores in this age group (r = 0.0032, p = 0.92, well above the 0.05 threshold).

Though the analysis found no significant correlation between these variables, this doesn’t necessarily mean there is no relationship at all. Because the data rely on self-report measures, there may be issues with the analysis’s internal validity, and outliers in this large sample could have skewed the results.

Furthermore, the age group examined (13 to 21) is already at high risk of experiencing depressive feelings, which may have affected our results. As I am unable to determine whether participants included in this dataset were pre-screened for psychiatric illness, it is also possible that underlying disorders influenced the depression scores reported.

Though this analysis found that there was no significant relationship between feelings of depression and social media usage, it is important to examine these questions nonetheless. Future researchers should expand upon this research by longitudinally observing the impact of social media use in childhood/adolescence on mental health into adulthood.
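
One way a follow-up analysis might probe the outlier concern raised above is to compare Pearson’s r with Spearman’s rank correlation, which is less sensitive to extreme values. This is a minimal sketch and was not part of the original analysis. The `sim` data frame is simulated, with ranges borrowed from the summary table, and it stands in for the real `youth` data.

```r
set.seed(123)  # reproducible simulated data

# Simulated stand-ins for the real variables (assumption: not finaldata.csv)
sim <- data.frame(
  social_media_mins = runif(200, min = 17, max = 631),
  depression_score = sample(0:27, 200, replace = TRUE)
)

pearson_sim <- cor.test(sim$social_media_mins, sim$depression_score,
                        method = "pearson")

# Spearman works on ranks, so extreme values have less leverage;
# exact = FALSE avoids warnings caused by tied depression scores
spearman_sim <- cor.test(sim$social_media_mins, sim$depression_score,
                         method = "spearman", exact = FALSE)

print(c(pearson_r = unname(pearson_sim$estimate),
        spearman_rho = unname(spearman_sim$estimate)))
```

If the two coefficients differ substantially on the real data, that would suggest outliers or a non-linear pattern are influencing the Pearson result.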

Shiny App

This app generates a scatter plot and a descriptive table of depression scores by level of social media usage for a single age. Enter an age from 13 to 21, and the app filters the dataset to that age and updates both outputs.

Shiny App Link: https://2ty909-shannon-miller.shinyapps.io/Final/

library(shiny)
library(dplyr)
library(ggplot2)
library(ggpubr)
library(gtsummary)
library(broom)
library(knitr)

finaldata <- read.csv("finaldata.csv")

ui <- fluidPage(
  titlePanel("Social Media Usage and Depression Analysis"),
  
  sidebarLayout(
    sidebarPanel(
      numericInput("selected_age",
                   "Type Age (13-21):",
                   value = 16,
                   min = 13,
                   max = 21,
                   step = 1),
      
      helpText("This application filters the dataset for the specific age typed above and updates the statistical analyses.")
    ),
    
    mainPanel(
      h4("Summary Statistics Table"),
      tableOutput("stats_table"),
      
      hr(),
      
      h4("Visual Trend Analysis"),
      plotOutput("scatter_plot")
    )
  )
)

server <- function(input, output) {
  
  filtered_data <- reactive({
    req(input$selected_age)
    finaldata %>%
      filter(age == input$selected_age)
  })
  
  output$stats_table <- renderTable({
    req(filtered_data())
    
    filtered_data() %>%
      mutate(
        Usage_Group = cut(
          social_media_mins,
          breaks = c(0, 60, 120, 180, 240, 300, Inf),
          labels = c("<1 hr", "1-2 hrs", "2-3 hrs", "3-4 hrs", "4-5 hrs", "5+ hrs"),
          include.lowest = TRUE
        )
      ) %>%
      group_by(Usage_Group) %>%
      summarise(
        `Sample Size (n)` = n(),
        `Average Depression Score` = mean(depression_score, na.rm = TRUE),
        `Standard Deviation (SD)` = sd(depression_score, na.rm = TRUE)
      )
  }, digits = 2)
  
  output$scatter_plot <- renderPlot({
    req(filtered_data())
    
    ggplot(filtered_data(), aes(x = social_media_mins, y = depression_score)) +
      geom_point(alpha = 0.4, color = "salmon", size = 2) +
      geom_smooth(method = "lm", color = "navy", fill = "grey50", linewidth = 1.1, level = 0.95) +
      stat_cor(method = "pearson", size = 4.5) +  # text size is set with size, not linewidth
      labs(
        x = "Daily Social Media Usage (Minutes)",
        y = "Depression Score (0-30)"
      ) +
      theme_pubr(base_size = 12, legend = "none") +
      theme(
        axis.title = element_text(face = "bold")
      )
  })
}

shinyApp(ui = ui, server = server)
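
The summary table in the app groups participants with `cut()`. The sketch below reuses the app's breaks and labels on a few hypothetical usage values placed on and around the bin edges, to show which group each one falls into.

```r
# Same breaks and labels as the Usage_Group column in the app
breaks <- c(0, 60, 120, 180, 240, 300, Inf)
labels <- c("<1 hr", "1-2 hrs", "2-3 hrs", "3-4 hrs", "4-5 hrs", "5+ hrs")

# Hypothetical usage values chosen to sit on and around the bin edges
mins <- c(0, 60, 61, 120, 300, 301)

groups <- cut(mins, breaks = breaks, labels = labels, include.lowest = TRUE)

print(data.frame(mins, groups))
```

Because `cut()` uses right-closed intervals by default, a value of exactly 60 minutes lands in "<1 hr" and exactly 300 minutes lands in "4-5 hrs". Passing `right = FALSE` would assign edge values to the upper bin instead.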
