How Lifestyle Habits Affect Sleep Quality and Daily Performance

Authors

Stephanie De Los Santos Avila
Denisse Hernandez
Marshall Carl

Introduction

Sleep is essential to our health and to how we function every day. However, many people sleep poorly because of stress, too much screen time, long work hours, or caffeine late in the day. When people don’t sleep well, they feel tired, can’t focus, and don’t perform as well during the day. This project looks at how daily habits affect sleep quality and performance. Using this data, we can identify which habits are associated with better sleep and which with worse sleep. These findings can help people make better choices and improve their daily lives.

Project Goal

The purpose of this project is to study how lifestyle and behavioral habits affect sleep quality and daily performance.

Data

We obtained a sleep health dataset from Kaggle (Sleep_Health_Dataset) with 100,000 records. It includes variables related to sleep, lifestyle habits, and performance, such as sleep quality, stress, screen time, and cognitive performance. Each row represents one individual’s daily data.

sleep_health_dataset <- read.csv("sleep_health_dataset.csv")

library(tidyverse)


library(knitr)
data.frame(Variable_Names = names(sleep_health_dataset)) %>%
  knitr::kable(
    caption = "Variable Names in Sleep Health Dataset"
  )

Variable Names in Sleep Health Dataset
Variable_Names
person_id
age
gender
occupation
bmi
country
sleep_duration_hrs
sleep_quality_score
rem_percentage
deep_sleep_percentage
sleep_latency_mins
wake_episodes_per_night
caffeine_mg_before_bed
alcohol_units_before_bed
screen_time_before_bed_mins
exercise_day
steps_that_day
nap_duration_mins
stress_score
work_hours_that_day
chronotype
mental_health_condition
heart_rate_resting_bpm
sleep_aid_used
shift_work
room_temperature_celsius
weekend_sleep_diff_hrs
season
day_type
cognitive_performance_score
sleep_disorder_risk
felt_rested

You can explore the data interactively using the search box. Only the first 1,000 rows are shown, because the full dataset is too large for client-side DataTables.

library(DT)
# Show a subset; the full dataset is too big for client-side DataTables
datatable(head(sleep_health_dataset, 1000))

Analysis

Target Variable Analysis

Data Wrangling

#library(tidyverse)
#glimpse(sleep_health_dataset)

sleep_health_dataset <- sleep_health_dataset |>
  select(age, gender, country, occupation,
         sleep_duration_hrs, sleep_quality_score,
         stress_score, mental_health_condition,
         screen_time_before_bed_mins,
         caffeine_mg_before_bed,
         alcohol_units_before_bed,
         exercise_day, work_hours_that_day,
         day_type, cognitive_performance_score)

sleep_health_dataset <- sleep_health_dataset |>
  filter(!is.na(sleep_quality_score),
         !is.na(stress_score),
         !is.na(cognitive_performance_score),
         !is.na(sleep_duration_hrs))

sleep_health_dataset <- sleep_health_dataset |>
  distinct()

# Label each row by sleep quality: scores of 75 or above count as good sleep
sleep_health_dataset <- sleep_health_dataset |>
  mutate(sleep_group = if_else(sleep_quality_score >= 75,
                               "Good Sleep", "Poor Sleep"))

head(sleep_health_dataset$sleep_group)

[1] "Poor Sleep" "Poor Sleep" "Poor Sleep" "Poor Sleep" "Poor Sleep"
[6] "Poor Sleep"
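
Because the first six rows all fall into "Poor Sleep", it may be worth checking how the cutoff of 75 relates to the actual range of sleep_quality_score and how many rows land in each group. The following is a minimal sketch on a small made-up data frame; toy_scores and its values are hypothetical and not taken from the Kaggle file. The same range() and count() calls could be run on sleep_health_dataset.

```r
library(dplyr)

# Hypothetical toy data standing in for sleep_health_dataset
toy_scores <- data.frame(sleep_quality_score = c(42, 58, 75, 81, 90, 66))

# Same grouping rule as in the report: 75 or above is "Good Sleep"
toy_scores <- toy_scores |>
  mutate(sleep_group = if_else(sleep_quality_score >= 75,
                               "Good Sleep", "Poor Sleep"))

# Range of the score, to confirm the cutoff is on the same scale as the data
score_range <- range(toy_scores$sleep_quality_score)

# Number of rows in each group
group_counts <- count(toy_scores, sleep_group)

print(score_range)
print(group_counts)
```

If the real scores never reach 75, every row would be labeled "Poor Sleep", and the cutoff would need to be reconsidered.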

Data Visualization

Histogram for sleep quality score

library(ggplot2)
ggplot(sleep_health_dataset, aes(x = sleep_quality_score))+
  geom_histogram(bins = 20, fill = "darkblue")+
  labs(
    title = "Histogram of Sleep Quality Score",
    x = "Sleep Quality Score",
    y = "Count"
  )

Top 10 best sleepers vs. the 10 worst sleepers
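
One possible way to build this comparison is to take the ten highest and ten lowest sleep quality scores with dplyr's slice_max() and slice_min(). The sketch below uses a hypothetical toy data frame (toy_sleep, filled with random values on assumed scales) in place of the real data, so the numbers carry no meaning. Only the column names sleep_quality_score and stress_score come from the dataset, and sleeper_group is a helper column introduced for illustration.

```r
library(dplyr)

set.seed(1)
# Hypothetical toy data; values and scales are assumptions, not from the Kaggle file
toy_sleep <- data.frame(
  sleep_quality_score = runif(50, min = 0, max = 100),
  stress_score        = runif(50, min = 0, max = 10)
)

# Ten best and ten worst sleepers by sleep quality score
best_10 <- toy_sleep |>
  slice_max(sleep_quality_score, n = 10, with_ties = FALSE) |>
  mutate(sleeper_group = "Top 10")
worst_10 <- toy_sleep |>
  slice_min(sleep_quality_score, n = 10, with_ties = FALSE) |>
  mutate(sleeper_group = "Bottom 10")

# Stack both groups and compare averages between them
comparison <- bind_rows(best_10, worst_10) |>
  group_by(sleeper_group) |>
  summarise(mean_sleep_quality = mean(sleep_quality_score),
            mean_stress = mean(stress_score))

print(comparison)
```

Setting with_ties = FALSE keeps each group at exactly ten rows even when several people share the same score.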

graphic 3

graphic 4

graphic 5

graphic 6

graphic 7

graphic 8

graphic 9

graphic 10

graphic 11

Conclusion

Contact Information

mcarl@students.kennesaw.edu
dherna84@students.kennesaw.edu
edloss2@students.kennesaw.edu

library(leaflet)
leaflet() %>%
  addTiles() %>%
  addMarkers(
    lng = -84.615,
    lat = 34.023,
    popup = "Kennesaw State University"
  )