Introduction
For my topic, I chose to look into depression in students and related
factors such as students' academic pressure, study satisfaction, age,
and more. The data comes from Kaggle. For cleaning, I used rename from
dplyr to rename some of the columns for better readability. The data was
collected from willing, anonymous participants of wide-ranging
demographics from January through June 2023. I chose this topic because
I wanted to see for myself how certain behaviors of mine could be
affecting my academic performance. I also thought it would be
interesting to see how I could change these behaviors, or inform others
on how to change them, for potentially better academic performance. The
topic also interested me because I have always been into psychology and
the development of different psychological traits (like depression). For
background, I read through the Wikipedia page on depression to better
understand whether any correlations between depression and academic
behaviors had already been found.
Citation: Wikipedia contributors. (2024, December 16). Depression
(mood). Wikipedia. https://en.wikipedia.org/wiki/Depression_(mood)
library(tidyverse) #Load libraries & dataset
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(plotly)
##
## Attaching package: 'plotly'
##
## The following object is masked from 'package:ggplot2':
##
## last_plot
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following object is masked from 'package:graphics':
##
## layout
library(reshape2)
##
## Attaching package: 'reshape2'
##
## The following object is masked from 'package:tidyr':
##
## smiths
studentDepression <- read_csv("DepressionData.csv")
## Rows: 502 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (6): Gender, Sleep Duration, Dietary Habits, Have you ever had suicidal ...
## dbl (5): Age, Academic Pressure, Study Satisfaction, Study Hours, Financial ...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
studentDepression <- studentDepression %>% # Rename columns for better readability and access
  rename(
    Study_Satisfaction = `Study Satisfaction`,
    Academic_Pressure = `Academic Pressure`,
    Sleep_Duration = `Sleep Duration`,
    Study_Hours = `Study Hours`,
    Financial_Stress = `Financial Stress`
  )
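The renaming step above could also be done for every column in one pass. The following is a minimal sketch, not the method used in this report: it uses dplyr's rename_with on a small made-up data frame (the name `toy` and its values are hypothetical) to replace spaces in column names with underscores.

```r
library(dplyr)

# Hypothetical toy data with column names in the same style as the dataset;
# check.names = FALSE keeps the spaces in the names
toy <- data.frame(
  `Study Satisfaction` = c(3, 4),
  `Academic Pressure` = c(2, 5),
  Age = c(21, 24),
  check.names = FALSE
)

# Replace every space in every column name with an underscore
toy_renamed <- toy %>%
  rename_with(~ gsub(" ", "_", .x, fixed = TRUE))

names(toy_renamed)
```

Applying this to the full dataset would also rename columns that the later chart code refers to with spaces (for example `Dietary Habits`), so those references would need to be updated to match.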
model <- lm(Study_Satisfaction ~ Academic_Pressure + Study_Hours, studentDepression) # Linear model
summary(model) #Summarize model
##
## Call:
## lm(formula = Study_Satisfaction ~ Academic_Pressure + Study_Hours,
## data = studentDepression)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.2965 -1.0838 -0.0581 1.0145 2.1417
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.396401 0.175419 19.362 <2e-16 ***
## Academic_Pressure -0.099905 0.044066 -2.267 0.0238 *
## Study_Hours -0.003215 0.016367 -0.196 0.8443
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.369 on 499 degrees of freedom
## Multiple R-squared: 0.01039, Adjusted R-squared: 0.006427
## F-statistic: 2.62 on 2 and 499 DF, p-value: 0.07378
plot(model) # Plot the model




This box plot visualization shows that study satisfaction of
depressed males is somewhat higher than that of depressed females.
Interestingly, for non-depressed males and females, this gap in study
satisfaction disappears.
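The code for this box plot does not appear in the report. A plot of this kind could be sketched roughly as follows, assuming the Gender, Depression, and Study_Satisfaction columns are used and that Depression holds "Yes"/"No" values. The data frame `toy` and all of its values are invented for illustration and are not taken from the dataset.

```r
library(ggplot2)

# Hypothetical toy data; values are invented for illustration only
toy <- data.frame(
  Gender = rep(c("Male", "Female"), each = 4),
  Depression = rep(c("Yes", "No"), times = 4),
  Study_Satisfaction = c(4, 3, 5, 3, 2, 3, 3, 4)
)

# One box per Depression/Gender combination, grouped side by side
box_plot <- ggplot(toy, aes(x = Depression, y = Study_Satisfaction, fill = Gender)) +
  geom_boxplot() +
  labs(
    title = "Study Satisfaction by Depression Status and Gender",
    x = "Depression Status",
    y = "Study Satisfaction",
    fill = "Gender"
  ) +
  theme_light()
box_plot
```

Swapping `toy` for studentDepression would draw the same layout from the real data, provided the column names match.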
bar_chart <- ggplot(studentDepression, aes(x = `Dietary Habits`, fill = `Have you ever had suicidal thoughts ?`)) + #Create bar chart
geom_bar(position = "fill") +
labs(
title = "Proportion of Suicidal Thoughts by Dietary Habits",
x = "Dietary Habits",
y = "Proportion",
fill = "Suicidal Thoughts",
caption = "Data Source: https://www.kaggle.com/datasets/ikynahidwin/depression-student-dataset"
) +
scale_fill_manual(values = c("#F8766D", "#00BFC4")) +
theme_light()
bar_chart

This bar chart shows that suicidal thoughts are slightly more
prevalent among students with unhealthy dietary habits than among those
with healthy dietary habits. This suggests that a good diet is one of
many factors that could potentially help ward off depressive
attitudes.
density_plot <- ggplot(studentDepression, aes(x = Sleep_Duration, fill = Depression)) + #Create interactive density plot
geom_density(alpha = 0.6) +
labs(
title = "Sleep Duration by Depression Status",
x = "Sleep Duration (Hours)",
y = "Density",
fill = "Depression Status",
caption = "Data Source: https://www.kaggle.com/datasets/ikynahidwin/depression-student-dataset"
) +
scale_fill_manual(values = c("#8DA0CB", "#FC8D62")) +
theme_bw() +
theme(
plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
axis.text = element_text(size = 10),
axis.title = element_text(size = 12),
legend.title = element_text(size = 12)
)
interactive_density_plot <- ggplotly(density_plot)
interactive_density_plot
This density plot shows that college students most often get very
few hours of sleep. This is likely an unfortunate by-product of the
demanding college workload, and this lack of sleep can cause
complications in students' mental health as well as decreased academic
performance.
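The column specification printed by read_csv lists Sleep Duration as a character column, so a grouped bar chart of counts may be a more direct way to view the same comparison. The sketch below is only an illustration: the data frame `toy`, its values, and the category labels are hypothetical and may not match the labels in the actual dataset.

```r
library(ggplot2)

# Hypothetical toy data; the category labels are assumed, not taken from the dataset
toy <- data.frame(
  Sleep_Duration = c("Less than 5 hours", "5-6 hours", "7-8 hours",
                     "More than 8 hours", "Less than 5 hours", "7-8 hours"),
  Depression = c("Yes", "No", "No", "Yes", "Yes", "No")
)

# Order the categories from least to most sleep so the x axis reads naturally
toy$Sleep_Duration <- factor(
  toy$Sleep_Duration,
  levels = c("Less than 5 hours", "5-6 hours", "7-8 hours", "More than 8 hours")
)

# Count students in each sleep category, split by depression status
sleep_bar <- ggplot(toy, aes(x = Sleep_Duration, fill = Depression)) +
  geom_bar(position = "dodge") +
  labs(
    title = "Sleep Duration by Depression Status",
    x = "Sleep Duration",
    y = "Number of Students",
    fill = "Depression Status"
  ) +
  theme_bw()
sleep_bar
```

Setting the factor levels explicitly keeps the categories in a meaningful order instead of the default alphabetical one.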