This project uses data to explore the connection between sleep habits, daily activities, and health outcomes. Using Python to create visualizations, we look at how factors like physical activity, stress, occupation, and even gender might affect how well people sleep. We also take a closer look at sleep disorders like insomnia and sleep apnea, and how they show up across different groups of people.
The goal of this analysis is to better understand the patterns in sleep and lifestyle data so we can learn what habits might lead to better health and rest.
Source: https://www.kaggle.com/datasets/uom190346a/sleep-health-and-lifestyle-dataset/data
The dataset includes information from 374 people and has 13 columns, or types of data. It focuses on sleep habits, daily routines, and overall health. Each row in the dataset represents one person and includes the following details:
Basic Info such as age, gender, and job.
Sleep Details such as how many hours they sleep, how good their sleep is (rated from 1 to 10), and if they have any sleep disorders like insomnia or sleep apnea.
Daily Lifestyle Habits like how many minutes they exercise, how many steps they take, and how stressed they feel (also rated from 1 to 10).
Health Measurements such as their heart rate and blood pressure, plus a BMI Category, which tells whether a person is normal, overweight, or obese.
This analysis explores the relationships between sleep, physical activity, stress, and overall lifestyle habits using visualizations. The charts reveal key patterns, such as how higher physical activity is linked to longer sleep and lower stress, and how different occupations affect average sleep duration. We also see that sleep disorders vary by gender, and that people with better sleep quality tend to take more daily steps, especially those with a normal or healthy BMI. These findings help highlight the strong connection between our daily routines and our sleep health.
This multi-line plot compares average heart rate (y-axis) to sleep duration (x-axis), broken down by the BMI categories recorded in the dataset: Normal, Normal Weight, Overweight, and Obese.
Each line shows how the average heart rate changes for people in that BMI group based on how long they sleep.
# Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Load Data
df = pd.read_csv("/Users/jayfreeportillo/Downloads/Sleep_health_and_lifestyle_dataset.csv")
# Round the sleep duration to 1 decimal place and save it in a new column called 'Sleep Bin'
df['Sleep Bin'] = df['Sleep Duration'].round(1)
# Group the data by sleep hours and BMI type, then get the average heart rate for each group
grouped = df.groupby(['Sleep Bin', 'BMI Category'])['Heart Rate'].mean().reset_index()
# Set up the plot size
fig, ax = plt.subplots(figsize=(16, 9))
# Set custom colors for each BMI category
colors = {
    'Normal': 'green',
    'Overweight': 'orange',
    'Obese': 'red',
    'Normal Weight': '#7570b3'
}
# Plot one line for each BMI category
for category, group in grouped.groupby('BMI Category'):
    group = group.sort_values('Sleep Bin')  # Sort the data by sleep hours
    ax.plot(group['Sleep Bin'], group['Heart Rate'],  # X and Y data
            label=category,                      # Label shown in the legend
            color=colors.get(category, 'gray'),  # Use the custom color (gray if unknown)
            marker='o',                          # Add markers on the line
            linewidth=2.5)                       # Line thickness
# Add title and axis labels
ax.set_title("Heart Rate by Sleep Duration across BMI Categories", fontsize=20, pad=20)
ax.set_xlabel("Sleep Duration (Hours)", fontsize=16)
ax.set_ylabel("Average Heart Rate (bpm)", fontsize=16)
# Adjust size of tick labels
ax.tick_params(axis='both', labelsize=14)
# Add grid to the plot
ax.grid(True, linestyle='--', alpha=0.5)
# Add a legend with a title and place it at the top right
ax.legend(title="BMI Category", title_fontsize=16, fontsize=14, loc='upper right', frameon=True)
# Adjust spacing so labels fit inside the figure
plt.tight_layout()
# Display the plot
plt.show()
Key Insights
Obese individuals consistently have the highest heart rates, regardless of how much they sleep. Their average heart rate stays around 83–85 bpm across sleep durations, which might suggest higher baseline cardiovascular stress.
People in the Normal and Normal Weight categories tend to have lower heart rates, especially when sleep duration increases. Their heart rate gradually drops as sleep improves, especially past the 7-hour mark.
Overweight individuals show more fluctuation, but overall, their heart rate decreases as sleep duration increases, similar to other groups except Obese.
In general, more sleep is linked to lower heart rates, especially for those in the Normal and Overweight categories. This supports the idea that longer, better sleep can contribute to better heart health.
What can be learned?
Sleep matters: Across most groups, getting more sleep appears to be connected to a healthier (lower) resting heart rate.
BMI and heart rate are connected: People with higher BMI (Obese) consistently have higher heart rates, even when sleep duration increases. This might point to additional stress on the heart, regardless of sleep.
Better sleep may lower risk: Lower heart rates are often associated with better cardiovascular health. So improving sleep, especially among overweight or normal-weight individuals, may reduce health risks over time.
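One way to put a number on the pattern in the line plot is to compute the correlation between sleep duration and heart rate inside each BMI category. The sketch below is only an illustration: the `toy` values are made up (they are not rows from the Kaggle dataset), and `sleep_heart_rate_corr` is a hypothetical helper name. Only the column names follow the analysis above.

```python
import pandas as pd

# Hypothetical toy data, NOT taken from the real dataset.
toy = pd.DataFrame({
    "BMI Category": ["Normal"] * 4 + ["Obese"] * 4,
    "Sleep Duration": [6.0, 6.5, 7.5, 8.0, 6.0, 6.5, 7.5, 8.0],
    "Heart Rate": [74, 72, 68, 66, 84, 85, 83, 84],
})


def sleep_heart_rate_corr(df):
    """Pearson correlation between sleep duration and heart rate per BMI category."""
    return {
        category: group["Sleep Duration"].corr(group["Heart Rate"])
        for category, group in df.groupby("BMI Category")
    }


corr_by_bmi = sleep_heart_rate_corr(toy)
for category, value in corr_by_bmi.items():
    print(f"{category}: {value:.2f}")
```

A strongly negative value means heart rate falls as sleep increases, while a value near zero matches a flat line like the Obese group in the chart. With the real data, the same helper could be called on `df` directly, keeping in mind that small groups give unstable correlations.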
This nested pie chart shows how sleep disorders are distributed by gender.
The outer ring shows the percentage of people in each sleep disorder group: None, Insomnia, and Sleep Apnea.
The inner ring breaks down each group into male and female percentages. The total number of people included is 374.
# Imports
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Load data
df = pd.read_csv("/Users/jayfreeportillo/Downloads/Sleep_health_and_lifestyle_dataset.csv")
# A missing value in 'Sleep Disorder' means no disorder, so label it "None"
df["Sleep Disorder"] = df["Sleep Disorder"].fillna("None")
# Group data by Sleep Disorder and Gender
grouped = df.groupby(['Sleep Disorder', 'Gender']).size().reset_index(name='Count')
# Outer ring (Sleep Disorders total counts)
outer_data = grouped.groupby('Sleep Disorder')['Count'].sum().reset_index()
outer_labels = outer_data['Sleep Disorder']
outer_counts = outer_data['Count']
# Inner ring (Gender counts per disorder)
inner_labels = grouped['Gender']
inner_counts = grouped['Count']
# Color setup
cmap = plt.get_cmap('tab20c')
outer_color_refs = [0, 4, 8, 12]
outer_colors = [cmap(i) for i in outer_color_refs]
all_refs = np.arange(16)
inner_color_refs = [i for i in all_refs if i not in outer_color_refs][:len(inner_counts)]
inner_colors = [cmap(i) for i in inner_color_refs]
# Figure and axis
fig, ax = plt.subplots(figsize=(10, 10))
# Outer pie chart (Sleep Disorders)
ax.pie(
    outer_counts,
    radius=1,
    labels=outer_labels,
    labeldistance=1.1,
    colors=outer_colors,
    wedgeprops=dict(width=0.3, edgecolor='white'),
    autopct='%1.1f%%',
    pctdistance=0.85,
    textprops={'fontsize': 13}
)
# Inner pie chart (Gender)
ax.pie(
    inner_counts,
    radius=0.7,
    labels=inner_labels,
    labeldistance=0.7,
    colors=inner_colors,
    wedgeprops=dict(width=0.3, edgecolor='white'),
    autopct='%1.1f%%',
    pctdistance=0.64,
    textprops={'fontsize': 10}
)
# Add center total text
total = df.shape[0]
ax.text(0, 0, f'Total\n{total}', ha='center', va='center', fontsize=16)
# Title and layout
plt.title("Sleep Disorders by Gender (Including None)", fontsize=18, pad=30)
plt.axis('equal')
plt.tight_layout()
plt.show()
Key Insights
Most people (58.6%) do not have a sleep disorder. This is the largest group in the chart.
Insomnia and Sleep Apnea are nearly equally common, each affecting about 20% of the sample.
Sleep disorders differ by gender: More men experience Insomnia (11.0%) than women (9.6%), while Sleep Apnea is much more common in women (17.9%) than men (2.9%).
Among people with no disorder, there are more males (36.6%) than females (21.9%).
What Can Be Learned?
Sleep disorders are fairly common, affecting about 41% of people in the sample.
Gender appears to influence the type of sleep disorder: Males are slightly more likely to experience Insomnia, while Sleep Apnea is much more common among females in this dataset.
Overall, the nested pie chart shows that most people do not have a sleep disorder, but those who do experience different disorders depending on gender. This perhaps suggests a need for gender-aware approaches to sleep health education and treatment.
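The percentages quoted from the inner ring are shares of the whole sample, so they can also be checked as a table instead of being read off the chart. The sketch below assumes a tiny made-up CSV (`csv_text` is hypothetical and only mimics the two columns used) and uses `pd.crosstab` with `normalize="all"`. Depending on the pandas version, the text "None" may be read as a missing value, which is why the `fillna` step from the analysis is repeated here.

```python
import io

import pandas as pd

# Hypothetical miniature CSV, NOT the real dataset.
csv_text = """Gender,Sleep Disorder
Male,None
Male,Insomnia
Female,Sleep Apnea
Female,None
Female,Sleep Apnea
Male,None
"""

df_toy = pd.read_csv(io.StringIO(csv_text))

# "None" may have been parsed as missing; restore it as an explicit label.
df_toy["Sleep Disorder"] = df_toy["Sleep Disorder"].fillna("None")

# Share of the whole sample in each (disorder, gender) cell, in percent.
share = pd.crosstab(
    df_toy["Sleep Disorder"], df_toy["Gender"], normalize="all"
) * 100

print(share.round(1))
```

Because every cell is divided by the total number of rows, all cells together add up to 100%, which matches how the inner ring of the pie chart is labeled.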
This heatmap displays the average number of daily steps for people grouped by two things: BMI category and sleep quality category.
Each box shows how active people are, in steps, based on their body weight category and how well they sleep.
# Import
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.ticker import FuncFormatter
# Load data
df = pd.read_csv("/Users/jayfreeportillo/Downloads/Sleep_health_and_lifestyle_dataset.csv")
# Defining bins and labels to categorize Quality of Sleep
bins = [0, 3, 6, 8, 10]
labels = ['Poor', 'Average', 'Good', 'Excellent']
# New categorical column for Sleep Quality based on defined bins
df['Sleep Quality Category'] = pd.cut(df['Quality of Sleep'], bins=bins, labels=labels)
# Pivot table: Average Daily Steps for each BMI & Sleep Quality combo
heatmap_df = pd.pivot_table(
    df,
    index='BMI Category',
    columns='Sleep Quality Category',
    values='Daily Steps',
    aggfunc='mean',
    observed=False  # Include all categories
)
# Adding commas to large numbers in the color bar
comma_fmt = FuncFormatter(lambda x, _: f"{int(x):,}")
# Set up the plot figure
fig, ax = plt.subplots(figsize=(10, 8))
sns.set(font_scale=1.1) # Font scale
# Draw the heatmap
sns.heatmap(
    heatmap_df,
    annot=True,                                # Show values inside each box
    fmt=".0f",                                 # Format numbers
    cmap="coolwarm",                           # Color palette
    linewidths=0.5,                            # Thin lines between cells
    linecolor='white',
    square=True,
    annot_kws={"size": 13, "weight": "bold"},  # Style for the annotations
    cbar_kws={"format": comma_fmt}             # Comma formatting for the color bar
)
# Customize tick labels
plt.xticks(rotation=15, fontsize=12)
plt.yticks(rotation=0, fontsize=12)
# Adding chart title and axis labels
plt.title("Avg Daily Steps by BMI Category and Sleep Quality", fontsize=18, pad=15)
plt.xlabel("Sleep Quality Category", fontsize=14, labelpad=10)
plt.ylabel("BMI Category", fontsize=14, labelpad=10)
# Add color bar label
cbar = ax.collections[0].colorbar
cbar.set_label("Avg Daily Steps", fontsize=12, weight='bold')
# Display the plot
plt.tight_layout()
plt.show()
Key Insights
Obese individuals take the fewest steps across all levels of sleep quality, as low as 3,125 steps per day.
People in the Normal or Normal Weight categories tend to be more active, especially when they report better sleep. For example, those with Normal Weight and Excellent sleep take 8,750 steps, the highest in the chart.
Better sleep doesn't always mean more steps: In the Normal BMI group, people with Good sleep quality actually take more steps than those with Excellent sleep, while in the Overweight group, people with Average sleep have the highest step count.
What can be Learned?
Physical activity is strongly connected to both BMI and sleep quality, but not in a perfect pattern.
People with lower BMI tend to take more steps, which may help them maintain healthier weight and better sleep.
Good sleep may encourage more physical activity, but other factors like lifestyle and job may also affect step counts.
The heatmap shows that people with lower BMI and better sleep quality tend to take more daily steps. Obese individuals are the least active across all sleep categories. This highlights a strong link between physical activity, sleep quality, and body weight, suggesting that promoting better sleep and movement could support healthier lifestyles.
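The sleep quality categories in the heatmap come from binning the 1 to 10 rating with `pd.cut`, and the exact cut points affect how the chart reads. The short sketch below reuses the same `bins` and `labels` as the analysis and maps each possible rating to its category. The `ratings` series and `mapping` dictionary are illustrative names, not part of the original code.

```python
import pandas as pd

# Same bin edges and labels as in the heatmap code.
bins = [0, 3, 6, 8, 10]
labels = ["Poor", "Average", "Good", "Excellent"]

# Every possible rating on the 1-10 scale.
ratings = pd.Series(range(1, 11), name="Quality of Sleep")

# pd.cut uses right-inclusive intervals by default: (0, 3], (3, 6], (6, 8], (8, 10].
categories = pd.cut(ratings, bins=bins, labels=labels)

mapping = dict(zip(ratings, categories))
for rating, category in mapping.items():
    print(rating, "->", category)
```

Because the intervals include their right edge, a rating of 8 counts as Good and only ratings of 9 or 10 count as Excellent. Different bin edges would move people between columns of the heatmap.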
This chart compares how much sleep people in different jobs get on average per night. Each bar shows the average sleep duration for that occupation.
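The bars are colored green when an occupation's average is more than 1% above the mean, red when it is more than 1% below the mean, and yellow when it is within 1% of the mean.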
# Import
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.patches as mpatches
# Load the dataset
file_path = "/Users/jayfreeportillo/Downloads/Sleep_health_and_lifestyle_dataset.csv"
df = pd.read_csv(file_path)
# Remove rows with missing values in key columns
df = df.dropna(subset=["Occupation", "Sleep Duration"])
# Calculate average sleep duration for each occupation and sort it
occupation_sleep = df.groupby("Occupation")["Sleep Duration"].mean().sort_values()
# Calculate the mean of the per-occupation averages (each occupation counts equally)
mean_sleep_duration = occupation_sleep.mean()
# Function to assign color based on how far value is from mean
def pick_colors(value, mean):
    if value > mean * 1.01:    # More than 1% above mean
        return "green"
    elif value < mean * 0.99:  # More than 1% below mean
        return "red"
    else:                      # Within 1% of mean
        return "yellow"
# Apply the color function to each value
bar_colors = [pick_colors(value, mean_sleep_duration) for value in occupation_sleep]
# Create the figure and axis
fig, ax = plt.subplots(figsize=(12, 8))
# Plot the bar chart with colors
bars = ax.bar(occupation_sleep.index, occupation_sleep.values, color=bar_colors, alpha=0.8)
# Add value labels on top of each bar
for bar, value in zip(bars, occupation_sleep.values):
    ax.text(bar.get_x() + bar.get_width() / 2, value + 0.1, f"{value:.1f}",
            ha="center", fontsize=12, fontweight="bold", color="black", fontfamily="Arial")
# Draw a horizontal line for the mean sleep duration
ax.axhline(mean_sleep_duration, color="black", linestyle="--", linewidth=2)
ax.text(0, mean_sleep_duration + 0.1,
        f"Mean: {mean_sleep_duration:.1f}", fontsize=12, fontweight="bold",
        color="black", fontfamily="Arial")
# Rotate x-axis labels for readability
plt.xticks(rotation=45, ha="right", fontsize=12, fontfamily="Arial")
# Label the axes and set chart title
ax.set_xlabel("Occupation", fontsize=14, fontweight="bold")
ax.set_ylabel("Average Sleep Duration (Hours)", fontsize=14, fontweight="bold")
ax.set_title("Average Sleep Duration by Occupation", fontsize=16, fontweight="bold")
# Remove top and right border lines
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
# Create legend patches for color categories
above = mpatches.Patch(color="green", label="Above Average")
below = mpatches.Patch(color="red", label="Below Average")
close = mpatches.Patch(color="yellow", label="Within 1% of Average")
# Add legend to the plot
ax.legend(handles=[above, close, below], fontsize=12, loc="upper left")
# Display the plot
plt.show()
Key Insights:
What can be Learned?
Occupation impacts sleep duration, possibly due to job demands, stress, or work-life balance. People in technical or high-pressure roles like Sales or Science may be sacrificing sleep. On the other hand, roles like Engineer, Lawyer, and Nurse show better sleep, possibly due to more structured work or better routines.
Overall, people in different jobs get very different amounts of sleep. Engineers sleep the most, while people in sales-related jobs sleep the least. This shows how occupation can affect sleep habits, which is important for both personal well-being and job performance.
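The dashed reference line in the bar chart is the mean of the per-occupation averages, so every occupation counts equally no matter how many people it contains. That can differ from the mean over all individuals when group sizes are uneven. The sketch below shows the difference on made-up numbers (the `toy` rows are hypothetical, not from the dataset) and repeats the 1% color rule under the assumed helper name `classify`.

```python
import pandas as pd

# Hypothetical toy data: four engineers and one sales representative.
toy = pd.DataFrame({
    "Occupation": ["Engineer"] * 4 + ["Sales Representative"],
    "Sleep Duration": [8.0, 8.0, 8.0, 8.0, 6.0],
})

# Average per occupation, as in the bar chart.
occupation_sleep = toy.groupby("Occupation")["Sleep Duration"].mean()

# Mean of the group averages: each occupation has equal weight.
mean_of_group_means = occupation_sleep.mean()

# Mean over all people: larger groups have more weight.
overall_mean = toy["Sleep Duration"].mean()


def classify(value, mean, tolerance=0.01):
    """Return the bar color for a value relative to the mean (1% band by default)."""
    if value > mean * (1 + tolerance):
        return "green"
    if value < mean * (1 - tolerance):
        return "red"
    return "yellow"


print(mean_of_group_means, overall_mean)
print({job: classify(v, mean_of_group_means) for job, v in occupation_sleep.items()})
```

Here the two means differ (7.0 versus 7.6 hours), so which one serves as the reference changes which bars count as above or below average. Either choice can be reasonable, as long as the chart states which one it uses.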
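This scatterplot compares Physical Activity Level (x-axis) with Sleep Duration in hours (y-axis).
Each dot represents a person, and the color shows their stress level: cooler (blue) colors mean lower stress and warmer (red) colors mean higher stress.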
# Import
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the dataset
file_path = "/Users/jayfreeportillo/Downloads/Sleep_health_and_lifestyle_dataset.csv"
df = pd.read_csv(file_path)
# Removing rows with missing values in key columns
df = df.dropna(subset=["Physical Activity Level", "Sleep Duration", "Stress Level"])
# Create a figure and axis for plotting
fig, ax = plt.subplots(figsize=(16, 10))
# Create scatter plot
scatter = ax.scatter(
    df["Physical Activity Level"],  # x-axis
    df["Sleep Duration"],           # y-axis
    c=df["Stress Level"],           # color each point by stress level
    cmap="coolwarm",                # color theme
    s=100,                          # size of points
    alpha=0.75,                     # transparency of points
    edgecolors="black"              # black border for visibility
)
# Add color bar to show stress level scale
cbar = fig.colorbar(scatter, ax=ax)
cbar.set_label("Stress Level", fontsize=14) # label for color bar
cbar.ax.tick_params(labelsize=12) # font size for color bar ticks
# Add grid for easier reading
ax.grid(True, linestyle="--", alpha=0.6)
# Labeling axes and set title
ax.set_xlabel("Physical Activity Level", fontsize=16, fontweight="bold")
ax.set_ylabel("Sleep Duration (Hours)", fontsize=16, fontweight="bold")
ax.set_title("Physical Activity vs. Sleep Duration (Colored by Stress Level)", fontsize=18, fontweight="bold")
# Adjust axis tick font sizes
ax.tick_params(axis="both", labelsize=14)
# Show the final plot
plt.show()
Key Insights:
What can be Learned?
There seems to be a positive connection between physical activity and better sleep. Lower stress may be linked to staying active and getting more sleep. Encouraging daily movement might be a helpful way to improve sleep and reduce stress levels naturally.
This scatterplot shows that people who are more active tend to sleep longer and report less stress. It highlights how physical activity may support better sleep and lower stress, making it a valuable habit for overall well-being.
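The visual impression from the scatterplot can be backed up with correlation coefficients between the three variables. The sketch below is a minimal example on made-up values (the `toy` frame is hypothetical and only borrows the column names), using `DataFrame.corr` with Pearson correlation.

```python
import pandas as pd

# Hypothetical toy data, NOT taken from the real dataset.
toy = pd.DataFrame({
    "Physical Activity Level": [30, 45, 60, 75, 90],
    "Sleep Duration": [6.0, 6.4, 7.1, 7.6, 8.0],
    "Stress Level": [8, 7, 5, 4, 3],
})

# Pairwise Pearson correlations between all three columns.
corr = toy.corr(method="pearson")

activity_vs_sleep = corr.loc["Physical Activity Level", "Sleep Duration"]
activity_vs_stress = corr.loc["Physical Activity Level", "Stress Level"]

print(corr.round(2))
```

A positive value for activity versus sleep and a negative value for activity versus stress would be consistent with the pattern described above. A correlation describes association only, so it does not show that exercise causes longer sleep or lower stress.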
These visualizations help us better understand how these variables interact and why they matter for overall health and well-being.
The scatterplot revealed that higher physical activity is generally associated with longer sleep duration and lower stress levels: individuals who exercised more tended to sleep longer and report less stress. This reinforces the idea that regular movement is not only good for physical health but also plays a vital role in improving sleep and mental well-being.
The bar chart of average sleep by occupation showed that jobs with more structure or flexibility, such as Engineer and Lawyer, are associated with longer sleep, while roles like Sales Representative and Scientist tend to be linked to shorter sleep. This highlights how work, schedules, and responsibilities could directly impact sleep quality and quantity.
In the heatmap of daily steps by BMI and sleep quality, individuals with lower BMI and better sleep quality consistently took more steps each day, suggesting that active lifestyles may both reflect and support healthier sleep patterns. In contrast, those in the Obese category were less active across all sleep quality levels, pointing to a potential cycle of inactivity and poor rest that could lead to longer-term health challenges.
The nested pie chart of sleep disorders by gender revealed that while most people do not suffer from a sleep disorder, insomnia is more common among males in this sample, and females are more likely to report sleep apnea. This highlights the importance of recognizing how different genders may experience sleep problems in unique ways.
The line plot of heart rate across sleep duration and BMI categories showed that higher BMI is linked to consistently higher heart rates, regardless of sleep duration. This may point to deeper cardiovascular stress in people with obesity, emphasizing the importance of combined interventions focused on sleep, activity, and weight management.
While these results are insightful, they also open up opportunities for further exploration.
Future research could:
More broadly, this kind of analysis can help inform public health policies, workplace wellness programs, and personal health goals. It could encourage both individuals and communities to prioritize sleep as a key component of a healthy lifestyle.
knitr::include_graphics("/Users/jayfreeportillo/Downloads/sleep.png")
Improve Sleep