Introduction

This analysis uses data to explore the connection between sleep habits, daily activities, and health outcomes. Using Python to create visualizations, we look at how factors such as physical activity, stress, occupation, and gender might affect how well people sleep. We also take a closer look at sleep disorders like insomnia and sleep apnea, and how they show up across different groups of people.

The goal of this analysis is to better understand the patterns in sleep and lifestyle data so we can learn what habits might lead to better health and rest.

Source: https://www.kaggle.com/datasets/uom190346a/sleep-health-and-lifestyle-dataset/data

Dataset

The dataset includes information from 374 people and has 13 columns, each holding a different type of data. It focuses on sleep habits, daily routines, and overall health. Each row in the dataset represents one person and includes the following details:

  • Basic Info like: age, gender, and job.

  • Sleep Details such as how many hours they sleep, how good their sleep is (rated from 1 to 10), and if they have any sleep disorders like insomnia or sleep apnea.

  • Daily Lifestyle Habits like how many minutes they exercise, how many steps they take, and how stressed they feel (also rated from 1 to 10).

  • Health Measurements such as their heart rate and blood pressure, plus BMI Category, which tells whether a person is normal, overweight, or obese.
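
A quick structural check can confirm a description like the one above before any plotting. The sketch below is illustrative only: `describe_columns` is a hypothetical helper, and the three-row frame is made-up toy data that borrows a few of the dataset's column names rather than real records.

```python
import pandas as pd

def describe_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, missing-value count, unique-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "unique": df.nunique(),  # NaN is not counted as a unique value
    })

# Toy rows (not real data) using a few of the dataset's column names
toy = pd.DataFrame({
    "Gender": ["Male", "Female", "Female"],
    "Sleep Duration": [6.1, 7.8, 8.0],
    "Sleep Disorder": [None, "Insomnia", "Sleep Apnea"],
})

summary = describe_columns(toy)
print(toy.shape)  # (rows, columns)
print(summary)
```

On the real file, the same helper could be called on the frame returned by `pd.read_csv`. A non-zero "missing" count for Sleep Disorder would explain why the pie-chart code later fills that column with "None".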

Findings

This analysis explores the relationships between sleep, physical activity, stress, and overall lifestyle habits using visualizations. The charts reveal key patterns, such as how higher physical activity is linked to longer sleep and lower stress, and how different occupations affect average sleep duration. We also see that sleep disorders vary by gender, and that people with better sleep quality tend to take more daily steps, especially those with a normal or healthy BMI. These findings help highlight the strong connection between our daily routines and our sleep health.

Heart Rate by Sleep across BMI

This multi-line plot compares average heart rate (y-axis) to sleep duration (x-axis), broken down by BMI category:

  • Green = Normal
  • Purple = Normal Weight
  • Red = Obese
  • Orange = Overweight

Each line shows how the heart rate changes for people in that BMI group based on how long they sleep.

# Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load Data
df = pd.read_csv("/Users/jayfreeportillo/Downloads/Sleep_health_and_lifestyle_dataset.csv")

# Round the sleep duration to 1 decimal place and save it in a new column called 'Sleep Bin'
df['Sleep Bin'] = df['Sleep Duration'].round(1)

# Group the data by sleep hours and BMI type, then get the average heart rate for each group
grouped = df.groupby(['Sleep Bin', 'BMI Category'])['Heart Rate'].mean().reset_index()

# Set up the plot size
fig, ax = plt.subplots(figsize=(16, 9))

# Set custom colors for each BMI category
colors = {
    'Normal': 'green',
    'Overweight': 'orange',
    'Obese': 'red',
    'Normal Weight': '#7570b3'
}

# Plot one line for each BMI category
for category, group in grouped.groupby('BMI Category'):
    group = group.sort_values('Sleep Bin')  # Sort the data by sleep hours
    ax.plot(group['Sleep Bin'], group['Heart Rate'],  # X and Y data
            label=category,                           # Add label to the line
            color=colors.get(category, 'gray'),       # Use the mapped color (gray if unknown)
            marker='o',                               # Add markers on the line
            linewidth=2.5)                            # Line thickness

# Add title and axis labels
ax.set_title("Heart Rate by Sleep Duration across BMI Categories", fontsize=20, pad=20)
ax.set_xlabel("Sleep Duration (Hours)", fontsize=16)
ax.set_ylabel("Average Heart Rate (bpm)", fontsize=16)

# Adjust size of tick labels
ax.tick_params(axis='both', labelsize=14)

# Add grid to the plot
ax.grid(True, linestyle='--', alpha=0.5)

# Add a legend with a title and place it at the top right
ax.legend(title="BMI Category", title_fontsize=16, fontsize=14, loc='upper right', frameon=True)

# Adjust the layout so labels are not clipped
plt.tight_layout()

# Display the plot
plt.show()

Key Insights

  • Obese individuals consistently have the highest heart rates, regardless of how much they sleep. Their average heart rate stays around 83–85 bpm across sleep durations, which might suggest higher baseline cardiovascular stress.

  • People in the Normal and Normal Weight categories tend to have lower heart rates, especially when sleep duration increases. Their heart rate gradually drops as sleep improves, especially past the 7-hour mark.

  • Overweight individuals show more fluctuation, but overall, their heart rate decreases as sleep duration increases, similar to other groups except Obese.

In general, more sleep is linked to lower heart rates, especially for those in the Normal and Overweight categories. This supports the idea that longer, better sleep can contribute to better heart health.

What can be learned?

  • Sleep matters: Across most groups, getting more sleep appears to be connected to a healthier (lower) resting heart rate.

  • BMI and heart rate are connected: People with higher BMI (Obese) consistently have higher heart rates, even when sleep duration increases. This might point to additional stress on the heart, regardless of sleep.

  • Better sleep may lower risk: Lower heart rates are often associated with better cardiovascular health. So improving sleep, especially among overweight or normal-weight individuals, may reduce health risks over time.
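
One way to put a number on the trends read off the line plot is a per-group correlation between sleep duration and heart rate. The following is a minimal sketch, not part of the original analysis: `sleep_hr_correlation` is a hypothetical helper, and the rows are made-up toy values chosen so that one group falls steadily while the other stays nearly flat.

```python
import pandas as pd

def sleep_hr_correlation(df: pd.DataFrame) -> pd.Series:
    """Pearson correlation of Sleep Duration vs Heart Rate within each BMI category."""
    result = {
        category: group["Sleep Duration"].corr(group["Heart Rate"])
        for category, group in df.groupby("BMI Category")
    }
    return pd.Series(result, name="sleep_vs_heart_rate_corr")

# Toy rows (not real data): heart rate falls with sleep for 'Normal',
# and stays high and almost flat for 'Obese'
toy = pd.DataFrame({
    "BMI Category": ["Normal"] * 3 + ["Obese"] * 3,
    "Sleep Duration": [6.0, 7.0, 8.0, 6.0, 7.0, 8.0],
    "Heart Rate": [72, 70, 68, 84, 84, 85],
})

corr = sleep_hr_correlation(toy)
print(corr)
```

A strongly negative value for a group would match the "more sleep, lower heart rate" reading, while a value near zero or above would match the flatter Obese line. Correlation alone does not show that sleep causes the change.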

Sleep Disorders by Gender

This nested pie chart shows how sleep disorders are spread out by gender.

The outer ring shows the total percentage of people with each sleep disorder type:

  • None (no disorder)
  • Insomnia
  • Sleep Apnea

The inner ring breaks down each group into male and female percentages. The total number of people included is 374.

# Imports
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load data
df = pd.read_csv("/Users/jayfreeportillo/Downloads/Sleep_health_and_lifestyle_dataset.csv")

# Label missing values (people with no disorder) as "None"
df["Sleep Disorder"] = df["Sleep Disorder"].fillna("None")

# Group data by Sleep Disorder and Gender
grouped = df.groupby(['Sleep Disorder', 'Gender']).size().reset_index(name='Count')

# Outer ring (Sleep Disorders total counts)
outer_data = grouped.groupby('Sleep Disorder')['Count'].sum().reset_index()
outer_labels = outer_data['Sleep Disorder']
outer_counts = outer_data['Count']

# Inner ring (Gender counts per disorder)
inner_labels = grouped['Gender']
inner_counts = grouped['Count']

# Color setup
cmap = plt.get_cmap('tab20c')
outer_color_refs = [0, 4, 8, 12]
outer_colors = [cmap(i) for i in outer_color_refs]

all_refs = np.arange(16)
inner_color_refs = [i for i in all_refs if i not in outer_color_refs][:len(inner_counts)]
inner_colors = [cmap(i) for i in inner_color_refs]

# Figure and axis
fig, ax = plt.subplots(figsize=(10, 10))

# Outer pie chart (Sleep Disorders)
ax.pie(
    outer_counts,
    radius=1,
    labels=outer_labels,
    labeldistance=1.1,
    colors=outer_colors,
    wedgeprops=dict(width=0.3, edgecolor='white'),
    autopct='%1.1f%%',
    pctdistance=0.85,
    textprops={'fontsize': 13}
)
# Inner pie chart (Gender) 
ax.pie(
    inner_counts,
    radius=0.7,
    labels=inner_labels,
    labeldistance=0.7,
    colors=inner_colors,
    wedgeprops=dict(width=0.3, edgecolor='white'),
    autopct='%1.1f%%',
    pctdistance=0.64,
    textprops={'fontsize': 10}
)
# Add center total text
total = df.shape[0]
ax.text(0, 0, f'Total\n{total}', ha='center', va='center', fontsize=16)

# Title and layout
plt.title("Sleep Disorders by Gender (Including None)", fontsize=18, pad=30)
plt.axis('equal')
plt.tight_layout()
plt.show()

Key Insights

  • Most people (58.6%) do not have a sleep disorder. This is the largest group in the chart.

  • Insomnia and Sleep Apnea are nearly equally common, each affecting about 20% of the sample.

  • Sleep disorders differ by gender: Measured as a share of the full sample, more men experience Insomnia (11.0%) than women (9.6%), while Sleep Apnea is much more common in women (17.9%) than men (2.9%).

  • Among people with no disorder, there are more males (36.6%) than females (21.9%).

What can be learned?

  • Sleep disorders are fairly common, affecting about 41% of people in the sample.

  • Gender appears to influence the type of sleep disorder: Males are slightly more likely to experience Insomnia, while Sleep Apnea is much more common among females in this dataset.

Overall, the nested pie chart shows that most people do not have a sleep disorder, but those who do tend to experience different disorders depending on gender. This may suggest a need for gender-aware approaches to sleep health education and treatment.
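
The percentages quoted in this section are shares of the whole sample, which a cross-tabulation can reproduce without reading them off the chart. The sketch below assumes the same column names as the dataset. `disorder_gender_share` is a hypothetical helper, and the four toy rows are made up so that each combination is exactly 25%.

```python
import pandas as pd

def disorder_gender_share(df: pd.DataFrame) -> pd.DataFrame:
    """Percentage of the full sample in each Sleep Disorder x Gender cell."""
    disorder = df["Sleep Disorder"].fillna("None")  # missing means no disorder
    table = pd.crosstab(disorder, df["Gender"], normalize="all") * 100
    table["Total"] = table.sum(axis=1)  # share per disorder (the outer ring)
    return table.round(1)

# Toy rows (not real data)
toy = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Female"],
    "Sleep Disorder": [None, None, "Insomnia", "Sleep Apnea"],
})

shares = disorder_gender_share(toy)
print(shares)
```

In this layout the "Total" column corresponds to the outer ring and the gender columns to the inner ring. Passing `normalize="index"` to `pd.crosstab` instead would give the gender split within each disorder, which answers a slightly different question.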

Steps by BMI and Sleep

This heatmap displays the average number of daily steps for people grouped by two things:

  • BMI Category (Normal, Normal Weight, Overweight, Obese)
  • Sleep Quality (Average, Good, Excellent)

Each box shows how active people are, in steps, based on their body weight category and how well they sleep.

# Import 
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.ticker import FuncFormatter

# Load data
df = pd.read_csv("/Users/jayfreeportillo/Downloads/Sleep_health_and_lifestyle_dataset.csv")

# Defining bins and labels to categorize Quality of Sleep
bins = [0, 3, 6, 8, 10]
labels = ['Poor', 'Average', 'Good', 'Excellent']

# New categorical column for Sleep Quality based on defined bins
df['Sleep Quality Category'] = pd.cut(df['Quality of Sleep'], bins=bins, labels=labels)

# Pivot table: Average Daily Steps for each BMI & Sleep Quality combo
heatmap_df = pd.pivot_table(
    df,
    index='BMI Category',
    columns='Sleep Quality Category',
    values='Daily Steps',
    aggfunc='mean',
    observed=False  # Include all categories
)

# Adding commas to large numbers in the color bar
comma_fmt = FuncFormatter(lambda x, _: f"{int(x):,}")

# Set up the plot figure
fig, ax = plt.subplots(figsize=(10, 8))
sns.set(font_scale=1.1)  # Font scale 

# Draw the heatmap
sns.heatmap(
    heatmap_df,
    annot=True,  # Show values inside each box
    fmt=".0f",   # Format numbers
    cmap="coolwarm",  # Color palette 
    linewidths=0.5,    # Thin lines between cells
    linecolor='white',
    square=True,
    annot_kws={"size": 13, "weight": "bold"},  # Style for the annotations
    cbar_kws={"format": comma_fmt}  # Comma formatting to color bar
)

# Customize tick labels
plt.xticks(rotation=15, fontsize=12)
plt.yticks(rotation=0, fontsize=12)
# Adding chart title and axis labels
plt.title("Avg Daily Steps by BMI Category and Sleep Quality", fontsize=18, pad=15)
plt.xlabel("Sleep Quality Category", fontsize=14, labelpad=10)
plt.ylabel("BMI Category", fontsize=14, labelpad=10)

# Add color bar label
cbar = ax.collections[0].colorbar
cbar.set_label("Avg Daily Steps", fontsize=12, weight='bold')

# Adjust layout and display the plot
plt.tight_layout()
plt.show()

Key Insights

  • Obese individuals take the fewest steps overall across all levels of sleep quality, with averages as low as 3,125 steps per day.

  • People with Normal or Normal Weight tend to be more active, especially when they report better sleep. For example, those with Normal Weight and Excellent sleep take 8,750 steps, the highest in the chart.

  • Better sleep doesn’t always mean more steps: In the Normal BMI group, people with Good sleep quality actually take more steps than those with Excellent sleep, while in the Overweight group, people with Average sleep have the highest step count.

What can be learned?

  • Physical activity is strongly connected to both BMI and sleep quality, but not in a perfect pattern.

  • People with lower BMI tend to take more steps, which may help them maintain healthier weight and better sleep.

  • Good sleep may encourage more physical activity, but other factors like lifestyle and job may also affect step counts.

The heatmap shows that people with lower BMI and better sleep quality tend to take more daily steps. Obese individuals are the least active across all sleep categories. This highlights a strong link between physical activity, sleep quality, and body weight, suggesting that promoting better sleep and movement could support healthier lifestyles.
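
The sleep-quality categories in the heatmap come from binning the 1 to 10 score with `pd.cut`. Since `pd.cut` closes each interval on the right by default, a short sketch can make the boundaries explicit. The bins and labels are the ones used in the heatmap code. The sample scores are arbitrary values picked to land on and around the bin edges.

```python
import pandas as pd

# Same bins and labels as the heatmap code
bins = [0, 3, 6, 8, 10]
labels = ["Poor", "Average", "Good", "Excellent"]

# Arbitrary scores chosen to sit on and around the bin edges
scores = pd.Series([3, 4, 6, 7, 8, 9])

# Intervals are (0, 3], (3, 6], (6, 8], (8, 10]: the upper edge is included
categories = pd.cut(scores, bins=bins, labels=labels)

for score, category in zip(scores, categories):
    print(score, "->", category)
# 3 -> Poor, 4 -> Average, 6 -> Average, 7 -> Good, 8 -> Good, 9 -> Excellent
```

So a score of 6 counts as Average and a score of 8 counts as Good. Only scores of 9 or 10 are labeled Excellent.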

Sleep Duration by Occupation

This chart compares how much sleep people in different jobs get on average per night. Each bar shows the average sleep duration for that occupation.

The bars are colored:

  • Red for below average,
  • Yellow for very close to average (within 1%),
  • Green for above average.

# Import
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.patches as mpatches

# Load the dataset
file_path = "/Users/jayfreeportillo/Downloads/Sleep_health_and_lifestyle_dataset.csv"
df = pd.read_csv(file_path)

# Remove rows with missing values in key columns
df = df.dropna(subset=["Occupation", "Sleep Duration"])

# Calculate average sleep duration for each occupation and sort it
occupation_sleep = df.groupby("Occupation")["Sleep Duration"].mean().sort_values()

# Calculate the unweighted mean of the occupation averages (used as the reference line)
mean_sleep_duration = occupation_sleep.mean()

# Function to assign color based on how far value is from mean
def pick_colors(value, mean):
    if value > mean * 1.01:      # More than 1% above mean
        return "green"
    elif value < mean * 0.99:    # More than 1% below mean
        return "red"
    else:                        # Within 1% of mean
        return "yellow"

# Apply the color function to each value
bar_colors = [pick_colors(value, mean_sleep_duration) for value in occupation_sleep]

# Create the figure and axis
fig, ax = plt.subplots(figsize=(12, 8))

# Plot the bar chart with colors
bars = ax.bar(occupation_sleep.index, occupation_sleep.values, color=bar_colors, alpha=0.8)

# Add value labels on top of each bar
for bar, value in zip(bars, occupation_sleep.values):
    ax.text(bar.get_x() + bar.get_width() / 2, value + 0.1, f"{value:.1f}",
            ha="center", fontsize=12, fontweight="bold", color="black", fontfamily="Arial")

# Draw a horizontal line for the mean sleep duration
ax.axhline(mean_sleep_duration, color="black", linestyle="--", linewidth=2)
ax.text(0, mean_sleep_duration + 0.1,
        f"Mean: {mean_sleep_duration:.1f}", fontsize=12, fontweight="bold",
        color="black", fontfamily="Arial")

# Rotate x-axis labels for readability
plt.xticks(rotation=45, ha="right", fontsize=12, fontfamily="Arial")
# Label the axes and set chart title
ax.set_xlabel("Occupation", fontsize=14, fontweight="bold")
ax.set_ylabel("Average Sleep Duration (Hours)", fontsize=14, fontweight="bold")
ax.set_title("Average Sleep Duration by Occupation", fontsize=16, fontweight="bold")

# Remove top and right border lines
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)

# Create legend patches for color categories
above = mpatches.Patch(color="green", label="Above Average")
below = mpatches.Patch(color="red", label="Below Average")
close = mpatches.Patch(color="yellow", label="Within 1% of Average")

# Add legend to the plot
ax.legend(handles=[above, close, below], fontsize=12, loc="upper left")

# Display the plot
plt.show()

Key Insights

  • Engineers sleep the most, averaging 8 hours per night, the highest of all groups.
  • Sales Representatives sleep the least, averaging just 5.9 hours, well below the average.
  • Other occupations like Lawyers, Accountants, Nurses, and Doctors tend to sleep more than average.
  • Scientists, Salespeople, Teachers, and Software Engineers all get less sleep than average.

What can be learned?

Occupation appears to influence sleep duration, possibly due to job demands, stress, or work-life balance. People in technical or high-pressure roles like Sales or Science may be sacrificing sleep. On the other hand, roles like Engineer, Lawyer, and Nurse show better sleep, possibly due to more structured work or better routines.

Overall, people in different jobs get very different amounts of sleep. Engineers sleep the most, while sales-related jobs sleep the least. This shows how occupation can affect sleep habits, which is important for both personal well-being and job performance.
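
The reference line in the bar chart is the mean of the per-occupation averages, which can differ from the mean across all individuals when occupations have different numbers of people. The sketch below illustrates that difference and the 1% color rule. `classify` is a hypothetical helper that mirrors the chart's thresholds, and the toy rows with occupations "A" and "B" are made up.

```python
import pandas as pd

def classify(value: float, mean: float, tol: float = 0.01) -> str:
    """Label a value as above, below, or close to the mean (within tol)."""
    if value > mean * (1 + tol):
        return "above"
    if value < mean * (1 - tol):
        return "below"
    return "close"

# Toy rows (not real data): occupation A has three people, B has one
toy = pd.DataFrame({
    "Occupation": ["A", "A", "A", "B"],
    "Sleep Duration": [8.0, 8.0, 8.0, 6.0],
})

group_means = toy.groupby("Occupation")["Sleep Duration"].mean()
mean_of_means = group_means.mean()           # each occupation counts once
overall_mean = toy["Sleep Duration"].mean()  # each person counts once

print(mean_of_means, overall_mean)
print({job: classify(avg, mean_of_means) for job, avg in group_means.items()})
```

Which reference to use is a design choice: the mean of means treats every occupation equally, while the overall mean weights larger occupations more heavily. The color bands shift accordingly.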

Stress Levels and Physical Health

This scatterplot compares Physical Activity Level (x-axis) with Sleep Duration in hours (y-axis).

Each dot represents a person, and the color shows their stress level:

  • Red = high stress
  • Blue = low stress

# Import
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the dataset
file_path = "/Users/jayfreeportillo/Downloads/Sleep_health_and_lifestyle_dataset.csv"
df = pd.read_csv(file_path)

# Removing rows with missing values in key columns
df = df.dropna(subset=["Physical Activity Level", "Sleep Duration", "Stress Level"])

# Create a figure and axis for plotting
fig, ax = plt.subplots(figsize=(16, 10))

# Create scatter plot
scatter = ax.scatter(
    df["Physical Activity Level"],         # x-axis
    df["Sleep Duration"],                  # y-axis
    c=df["Stress Level"],                  # color each point by stress level
    cmap="coolwarm",                       # color theme
    s=100,                                 # size of points
    alpha=0.75,                            # transparency of points
    edgecolors="black"                     # black border for visibility
)

# Add color bar to show stress level scale
cbar = fig.colorbar(scatter, ax=ax)
cbar.set_label("Stress Level", fontsize=14)  # label for color bar
cbar.ax.tick_params(labelsize=12)            # font size for color bar ticks

# Add grid for easier reading
ax.grid(True, linestyle="--", alpha=0.6)

# Labeling axes and set title
ax.set_xlabel("Physical Activity Level", fontsize=16, fontweight="bold")
ax.set_ylabel("Sleep Duration (Hours)", fontsize=16, fontweight="bold")
ax.set_title("Physical Activity vs. Sleep Duration (Colored by Stress Level)", fontsize=18, fontweight="bold")

# Adjust axis tick font sizes
ax.tick_params(axis="both", labelsize=14)

# Show the final plot
plt.show()

Key Insights

  • More physical activity is often linked to more sleep: People who are more physically active (around 70–90 minutes/day) tend to sleep around 7.5 to 8.3 hours, which is relatively high.
  • Lower stress levels are common in people who sleep more: Many blue dots (low stress) appear in the upper-right part of the chart where sleep duration and activity are both higher.
  • Higher stress is often seen in those with less sleep and activity: Red dots are mostly grouped toward the bottom left, showing shorter sleep and lower activity.

What can be learned?

There seems to be a positive connection between physical activity and better sleep. Lower stress may be linked to staying active and getting more sleep. Encouraging daily movement might be a helpful way to improve sleep and reduce stress levels naturally.

This scatterplot shows that people who are more active tend to sleep longer and report less stress. It highlights how physical activity may support better sleep and lower stress, making it a valuable habit for overall well-being.
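
The visual impression from the scatterplot can be cross-checked with a correlation matrix of the three variables. This is a sketch rather than part of the original analysis: `activity_sleep_stress_corr` is a hypothetical helper, and the toy rows are made up and deliberately perfectly linear so the result is easy to predict.

```python
import pandas as pd

COLUMNS = ["Physical Activity Level", "Sleep Duration", "Stress Level"]

def activity_sleep_stress_corr(df: pd.DataFrame) -> pd.DataFrame:
    """Pairwise Pearson correlations between activity, sleep, and stress."""
    return df[COLUMNS].corr()

# Toy rows (not real data): activity and sleep rise together, stress falls
toy = pd.DataFrame({
    "Physical Activity Level": [30, 45, 60, 75, 90],
    "Sleep Duration": [6.0, 6.5, 7.0, 7.5, 8.0],
    "Stress Level": [8, 7, 6, 5, 4],
})

corr = activity_sleep_stress_corr(toy)
print(corr.round(2))
```

On real data the values will be weaker than in this toy case. Positive activity-sleep and negative activity-stress coefficients would be consistent with the pattern described above, although they would not establish cause and effect.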

Conclusion

These visualizations help us better understand how sleep, physical activity, stress, occupation, and BMI interact and why they matter for overall health and well-being.

The scatterplot revealed that higher physical activity is generally associated with longer sleep duration and lower stress levels. This reinforces the idea that regular movement is not only good for physical health but may also play an important role in improving sleep and mental well-being.

The bar chart of average sleep by occupation showed that jobs with more structure or flexibility, such as Engineer and Lawyer, are associated with longer sleep, while roles like Sales Representative and Scientist tend to be linked to shorter sleep. This highlights how work, schedules, and responsibilities could directly impact sleep quality and quantity.

In the heatmap of daily steps by BMI and sleep quality, individuals with lower BMI and better sleep quality consistently took more steps each day, suggesting that active lifestyles may both reflect and support healthier sleep patterns. In contrast, those in the Obese category were less active across all sleep quality levels, pointing to a potential cycle of inactivity and poor rest that could lead to longer-term health challenges.

The nested pie chart of sleep disorders by gender revealed that while most people do not suffer from a sleep disorder, insomnia is more common among males in this sample, and females are more likely to report sleep apnea. This highlights the importance of recognizing how different genders may experience sleep problems in unique ways.

The line plot of heart rate across sleep duration and BMI categories showed that higher BMI is linked to consistently higher heart rates, regardless of sleep duration. This may point to deeper cardiovascular stress in people with obesity, emphasizing the importance of combined interventions focused on sleep, activity, and weight management.

While these results are insightful, they also open up opportunities for further exploration.

Future research could:

  • Look into how diet, screen time, or caffeine impact sleep quality.
  • Explore age-related patterns in sleep and health behaviors.
  • Investigate causal relationships: for example, does poor sleep lead to less activity, or vice versa?
  • Include more longitudinal data to track how habits change over time.

More broadly, this kind of analysis can help inform public health policies, workplace wellness programs, and personal health goals. It could encourage both individuals and communities to prioritize sleep as a key component of a healthy lifestyle.

Figure: Improve Sleep