For my R assignment, I used the “Sleep Health and Lifestyle” dataset from kaggle.com. The dataset covers roughly 400 individuals and includes variables such as stress level, sleep disorders, physical activity, heart rate, occupation, sleep duration and quality, BMI category, gender, age, and blood pressure. Through this analysis, I hope to discover patterns and relationships between an individual’s sleep, health, and lifestyle.
# -----------------------R Project--------------------
getwd()
## [1] "/Users/ayomide3/Downloads"
setwd("/Users/ayomide3/Downloads")
if(!file.exists("R_datafiles")) dir.create("R_datafiles")
sleep_data <- read.csv("sleep_dataset.csv", header = TRUE, sep = ",")
library(ggplot2)
library(scales)
library(RColorBrewer)
library(ggthemes)
library(plyr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:plyr':
##
## arrange, count, desc, failwith, id, mutate, rename, summarise,
## summarize
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(plotly)
##
## Attaching package: 'plotly'
## The following objects are masked from 'package:plyr':
##
## arrange, mutate, rename, summarise
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
These are my visualizations
Visualization #1
#---------------Stacked Bar Chart------------------
ggplot(sleep_data, aes(x = Occupation, fill = Sleep.Disorder)) +
geom_bar(position = "stack") +
labs(title = "Sleep Disorder By Occupation", x = "Occupation", y = "Count") +
theme_light() +
theme(plot.title = element_text(hjust = 0.5)) +
scale_fill_brewer(palette = "Paired")
According to the stacked bar chart, the occupations with the highest counts in the dataset are nurses, doctors, and engineers. The majority of nurses have sleep apnea, while the majority of accountants, doctors, engineers, and lawyers have no sleep disorder. For teachers and salespeople, insomnia is the largest category.
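The "majority" statements above can be checked numerically with a cross-tabulation. The following is a minimal sketch in base R, assuming the same column names as in the chart code (Occupation, Sleep.Disorder); the `toy` data frame is an invented stand-in for sleep_data, so its numbers are illustrative only.

```r
# Toy stand-in for sleep_data; the values are invented for illustration only.
toy <- data.frame(
  Occupation     = c("Nurse", "Nurse", "Nurse", "Doctor", "Doctor",
                     "Teacher", "Teacher", "Teacher"),
  Sleep.Disorder = c("Sleep Apnea", "Sleep Apnea", "None", "None", "None",
                     "Insomnia", "Insomnia", "None")
)

# Counts of each sleep disorder within each occupation (what the stacked bars show).
disorder_counts <- table(toy$Occupation, toy$Sleep.Disorder)

# Row-wise proportions: the share of each disorder within an occupation.
disorder_share <- prop.table(disorder_counts, margin = 1)

# Name of the largest category for each occupation.
top_category <- colnames(disorder_counts)[apply(disorder_counts, 1, which.max)]
names(top_category) <- rownames(disorder_counts)

print(round(disorder_share, 2))
print(top_category)
```

Replacing `toy` with sleep_data should give the shares behind each stacked bar. A share above 0.5 in a row would support a "majority" claim for that occupation.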
Visualization #2
#-------------Heatmap------------------------------
ggplot(sleep_data, aes(x = Stress.Level, y = Physical.Activity.Level, fill = Quality.of.Sleep)) +
geom_tile(color = "black") +
geom_text(aes(label = comma(Quality.of.Sleep))) +
labs(title = "Sleep Quality by Stress and Physical Activity",
x = "Stress Level (1 - 10)",
y = "Physical Activity (min/day)",
fill = "Sleep Quality") +
theme_minimal()
According to the heatmap, sleep quality tends to fall as stress level rises and to rise with the amount of daily physical activity. Those who are physically active for more than 75 minutes a day and have a stress level of 2 have a sleep quality of 9. In contrast, those with higher stress levels tend to exercise less and have lower sleep quality scores. There are two noticeable outliers: individuals with a high stress level who exercise for 90 minutes a day, and individuals with a stress level of 5 who exercise for 90 minutes a day.
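The direction of these relationships can also be checked outside the plot. The sketch below, in base R, averages sleep quality within each stress and activity cell (several individuals can share a cell, and the raw heatmap draws their tiles on top of one another) and computes rank correlations. The `toy` data frame is invented for illustration, and the choice of Spearman correlation is an assumption rather than something the analysis above used.

```r
# Toy stand-in for sleep_data; the values are invented for illustration only.
toy <- data.frame(
  Stress.Level            = c(3, 3, 4, 5, 6, 7, 8, 8),
  Physical.Activity.Level = c(75, 75, 60, 60, 45, 45, 30, 30),
  Quality.of.Sleep        = c(9, 8, 8, 7, 6, 6, 4, 5)
)

# One row per (stress, activity) cell, holding the mean sleep quality.
cell_means <- aggregate(
  Quality.of.Sleep ~ Stress.Level + Physical.Activity.Level,
  data = toy, FUN = mean
)

# Spearman rank correlations, chosen here because the scores are ordinal ratings.
cor_stress   <- cor(toy$Stress.Level, toy$Quality.of.Sleep, method = "spearman")
cor_activity <- cor(toy$Physical.Activity.Level, toy$Quality.of.Sleep,
                    method = "spearman")

print(cell_means)
print(c(stress = cor_stress, activity = cor_activity))
```

Passing a table like `cell_means` to the heatmap code instead of the raw data would give exactly one tile and one label per cell.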
Visualization #3
# -----------------Bar Graph------------------------
ggplot(sleep_data, aes(x = Quality.of.Sleep, y = Daily.Steps)) +
geom_bar(fill = "lightblue", stat = "identity") +
scale_y_continuous(labels = scales::comma) +
labs(title = "Quality of Sleep vs Daily Steps", x = "Quality of Sleep (1-10 Scale)", y = "Daily Steps")
For the bar graph, I compared total daily steps across sleep quality scores. Those with a sleep quality of 8 have a total of over 800,000 steps. Surprisingly, those with a sleep quality of 9 have only a little over 400,000 steps, while those with a sleep quality of 6 or 7 have over 500,000 steps. Additionally, those with a sleep quality of 4 or 5 have significantly fewer steps. Because each bar is a sum over individuals, these totals also reflect how many people fall at each score.
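Since geom_bar(stat = "identity") stacks every individual's steps, a bar's height depends on group size as well as on how much each person walks. The following base R sketch, using an invented `toy` data frame, puts the group size, total, and mean side by side to show how the two summaries can rank groups differently.

```r
# Toy stand-in for sleep_data; the values are invented for illustration only.
toy <- data.frame(
  Quality.of.Sleep = c(8, 8, 8, 8, 9, 9, 5),
  Daily.Steps      = c(7000, 7000, 8000, 6000, 9000, 9000, 4000)
)

# Group size, total steps, and mean steps for each sleep quality score.
n_people    <- tapply(toy$Daily.Steps, toy$Quality.of.Sleep, length)
total_steps <- tapply(toy$Daily.Steps, toy$Quality.of.Sleep, sum)
mean_steps  <- tapply(toy$Daily.Steps, toy$Quality.of.Sleep, mean)

steps_summary <- data.frame(
  Quality.of.Sleep = as.integer(names(total_steps)),
  n     = as.vector(n_people),
  total = as.vector(total_steps),
  mean  = as.vector(mean_steps)
)

print(steps_summary)
```

In this toy example the score of 8 has the largest total only because it has the most people, while the score of 9 has the higher mean. Running the same summary on sleep_data would show whether the real totals follow a similar pattern.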
Visualization #4
#-----------------Multiple Line Plot----------------
ggplot(sleep_data, aes(x = Stress.Level, y = Sleep.Duration, color = Sleep.Disorder)) +
geom_smooth(se = FALSE) +
labs(title = "Effect of Stress Level on Sleep Duration",
x = "Stress Level (1-10)", y = "Sleep Duration (hours)", color = "Sleep Disorder") +
theme_minimal() +
theme(plot.title = element_text(hjust=0.5)) +
scale_y_continuous(labels=comma)
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
The multiple line plot has three smoothed lines, one for each sleep disorder group. Sleep apnea has the most consistent line, with sleep duration falling steadily as stress level rises. For the no-disorder and insomnia groups, sleep duration drops at a stress level of 4; it then increases and declines again around a stress level of 5.5.
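The smoothed lines come from a loess fit, so it can help to back the visual impression with one number per group. The sketch below, in base R with an invented `toy` data frame, computes the rank correlation between stress level and sleep duration within each sleep disorder group. Using Spearman correlation here is an assumption for illustration, not part of the original analysis.

```r
# Toy stand-in for sleep_data; the values are invented for illustration only.
toy <- data.frame(
  Sleep.Disorder = rep(c("None", "Insomnia", "Sleep Apnea"), each = 4),
  Stress.Level   = c(3, 4, 5, 6,   4, 5, 7, 8,   3, 5, 7, 8),
  Sleep.Duration = c(8.0, 7.2, 7.6, 7.0,   6.8, 6.9, 6.3, 6.0,   8.1, 7.4, 6.5, 6.1)
)

# Split the rows by disorder group, then correlate stress with duration in each.
by_disorder <- split(toy, toy$Sleep.Disorder)
stress_cor <- sapply(by_disorder, function(g) {
  cor(g$Stress.Level, g$Sleep.Duration, method = "spearman")
})

print(round(stress_cor, 2))
```

A value near -1 for a group would correspond to a line that falls steadily, while values closer to 0 would match the less regular lines that dip and recover.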
Visualization #5
#-----------------Nested Pie Chart-------------------
sleep_summary <- sleep_data %>%
group_by(Sleep.Disorder, BMI.Category) %>%
summarise(Count = n(), .groups = 'drop')
sleep_disorder_counts <- sleep_summary %>%
group_by(Sleep.Disorder) %>%
summarise(Count = sum(Count), .groups = 'drop')
fig <- plot_ly() %>%
# Inner Pie Chart (Sleep Disorders)
add_trace(
data = sleep_disorder_counts,
labels = ~Sleep.Disorder,
values = ~Count,
type = "pie",
hole = 0.4,
textinfo = "label+percent",
marker = list(line = list(color = "white", width = 1)),
domain = list(x = c(0, 1), y = c(0, 1))
) %>%
# Outer Pie Chart (BMI Categories)
add_trace(
data = sleep_summary,
labels = ~BMI.Category,
values = ~Count,
type = "pie",
hole = 0.8,
textinfo = "label+percent",
marker = list(line = list(color = "white", width = 1)),
domain = list(x = c(0, 1), y = c(0, 1))
) %>%
layout(
title = "Nested Pie Chart: BMI Categories Within Sleep Disorders",
showlegend = TRUE
)
fig
The nested pie chart combines a pie chart of the BMI categories with a pie chart of the sleep disorder groups. The outer ring shows that over 50% of the individuals have a normal weight, while 39% are overweight and 2.67% are obese. The inner ring shows the three sleep disorder categories: 58.6% of individuals have no sleep disorder, 20.9% have sleep apnea, and 20.6% have insomnia.
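The percentages on both rings can be reproduced as a table, which is easier to quote than reading labels off the chart. The following is a minimal base R sketch: `pct_share` is a hypothetical helper introduced only for this example, and the `toy` data frame is invented, so its percentages are not the dataset's.

```r
# Toy stand-in for sleep_data; the values are invented for illustration only.
toy <- data.frame(
  Sleep.Disorder = c(rep("None", 6), rep("Sleep Apnea", 2), rep("Insomnia", 2)),
  BMI.Category   = c("Normal", "Normal", "Normal", "Normal", "Normal", "Overweight",
                     "Overweight", "Obese", "Overweight", "Normal")
)

# Hypothetical helper: percentage share of each category in a vector.
pct_share <- function(x) round(100 * prop.table(table(x)), 1)

disorder_pct <- pct_share(toy$Sleep.Disorder)  # inner ring
bmi_pct      <- pct_share(toy$BMI.Category)    # outer ring

# BMI shares within each sleep disorder group (each row sums to about 100).
bmi_within_disorder <- round(
  100 * prop.table(table(toy$Sleep.Disorder, toy$BMI.Category), margin = 1), 1
)

print(disorder_pct)
print(bmi_pct)
print(bmi_within_disorder)
```

The last table is the one that speaks to BMI categories within each sleep disorder, since each of its rows describes a single disorder group rather than the whole sample.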
This analysis looked at the relationships between sleep health, lifestyle factors, and physiological metrics. Key findings suggest that sleep disorders are associated with multiple factors, including stress level, BMI category, and physical activity. Individuals with higher stress and lower physical activity tended to report poorer sleep quality, while BMI categories also showed notable patterns in the prevalence of sleep disorders such as insomnia and sleep apnea.
By visualizing these interactions through various charts, we gained a better understanding of how daily habits and health indicators contribute to sleep quality and overall well-being. These insights could help inform healthier lifestyle choices and highlight the importance of stress management and physical activity in promoting better sleep.