📘 Business Task

Bellabeat, a high-tech manufacturer of health-focused smart products for women, wants to use data to inform its marketing strategy. As a junior data analyst, you are asked to analyze smart device usage patterns in Fitbit data and turn them into actionable insights for Bellabeat’s product suite, particularly the Leaf wellness tracker.


📂 Data Source


🧹 Data Cleaning

library(tidyverse)
library(janitor)
library(lubridate)
library(skimr)

daily_activity <- read_csv("dailyActivity_merged.csv") %>% clean_names()
sleep_day <- read_csv("sleepDay_merged.csv") %>% clean_names()

daily_activity <- daily_activity %>% mutate(activity_date = mdy(activity_date))  # parse text dates as Date
sleep_day <- sleep_day %>% mutate(sleep_day = as_date(mdy_hms(sleep_day)))  # drop the time part so the key is a Date, like activity_date

daily_activity <- distinct(daily_activity)
sleep_day <- distinct(sleep_day)
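
The cleaning steps above can be sanity-checked before moving on. The following is a minimal sketch, assuming the cleaned `daily_activity` and `sleep_day` tables are in the session; `check_cleaning` is a hypothetical helper and not part of the original analysis.

```r
library(dplyr)

# Hypothetical helper (an assumption, not from the original notebook):
# summarise a cleaned table so the effect of the cleaning can be checked.
check_cleaning <- function(df, id_col = "id") {
  tibble(
    n_rows = nrow(df),                       # rows left after distinct()
    n_users = n_distinct(df[[id_col]]),      # distinct participants
    n_duplicate_rows = sum(duplicated(df)),  # should be 0 after distinct()
    n_missing_values = sum(is.na(df))        # NA cells across all columns
  )
}

# Run only when the cleaned tables from above exist in the session.
if (exists("daily_activity")) print(check_cleaning(daily_activity))
if (exists("sleep_day")) print(check_cleaning(sleep_day))
```

A nonzero duplicate count, or a user count that differs between the two tables, would be worth noting before interpreting the plots.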

🔍 Analysis & Visualizations

📊 Steps vs Calories Burned

ggplot(daily_activity, aes(x = total_steps, y = calories)) +
  geom_point(color = "#FF6F61", alpha = 0.6) +
  geom_smooth(method = "lm", se = FALSE, color = "#333333") +
  labs(title = "Steps vs Calories Burned", x = "Total Steps", y = "Calories Burned") +
  theme_minimal()

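📌 Insight: Daily step counts and calories burned are positively correlated: the fitted line slopes upward, so days with more steps tend to be days with more calories burned. This supports step goals in health-tracking apps as a reasonable way to promote physical activity, although the plot shows an association rather than a causal effect.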
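
The strength of the relationship in the plot can be quantified with a correlation coefficient. This is a minimal sketch, assuming the cleaned `daily_activity` table from above; `column_correlation` is a hypothetical helper introduced here for illustration.

```r
# Hypothetical helper (an assumption, not from the original notebook):
# Pearson correlation between two numeric columns, using only rows
# where both values are present.
column_correlation <- function(df, x, y) {
  cor(df[[x]], df[[y]], use = "complete.obs", method = "pearson")
}

# Run only when the cleaned table from above exists in the session.
if (exists("daily_activity")) {
  r <- column_correlation(daily_activity, "total_steps", "calories")
  cat("Pearson r (steps vs calories):", round(r, 2), "\n")
}
```

Reporting the coefficient next to the plot lets readers judge for themselves how strong the relationship is.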


😴 Sleep vs Very Active Minutes

activity_sleep <- inner_join(daily_activity, sleep_day, by = c("id", "activity_date" = "sleep_day"))

ggplot(activity_sleep, aes(x = total_minutes_asleep / 60, y = very_active_minutes)) +
  geom_point(color = "#6C5B7B", alpha = 0.6) +
  geom_smooth(method = "lm", color = "#333333", se = FALSE) +
  labs(title = "Sleep Hours vs Very Active Minutes", x = "Sleep Hours", y = "Very Active Minutes") +
  theme_minimal()

📌 Insight: The relationship between sleep duration and very active minutes is weak: days with more vigorous activity are not clearly associated with shorter sleep. This suggests Bellabeat can encourage both physical activity and adequate rest without trading one off against the other.
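
Because `inner_join` keeps only the user-days present in both tables, the plot covers a subset of the activity records. The sketch below, which assumes the `daily_activity` and `sleep_day` tables from above, counts how many activity rows have a matching sleep record; `join_coverage` is a hypothetical helper.

```r
library(dplyr)

# Hypothetical helper (an assumption, not from the original notebook):
# count how many rows of `x` have at least one match in `y` on the join keys.
join_coverage <- function(x, y, by) {
  matched <- semi_join(x, y, by = by)  # rows of x with a match in y
  tibble(
    rows_in_x = nrow(x),
    rows_matched = nrow(matched),
    share_matched = nrow(matched) / nrow(x)
  )
}

# Run only when both cleaned tables from above exist in the session.
if (exists("daily_activity") && exists("sleep_day")) {
  print(join_coverage(daily_activity, sleep_day,
                      by = c("id", "activity_date" = "sleep_day")))
}
```

A low matched share would mean the sleep insight rests on fewer users or days than the activity insights do.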


📈 Activity Types Comparison

daily_activity %>%
  select(very_active_minutes, fairly_active_minutes, lightly_active_minutes, sedentary_minutes) %>%
  pivot_longer(everything(), names_to = "activity_type", values_to = "minutes") %>%
  ggplot(aes(x = activity_type, y = minutes, fill = activity_type)) +
  geom_boxplot(alpha = 0.7) +
  scale_fill_brewer(palette = "Pastel1") +
  labs(title = "Comparison of Activity Types", x = "Activity Type", y = "Minutes per Day") +
  theme_minimal()

📌 Insight: Users spend most of their tracked time sedentary and very little of it in the very active category. This is a key opportunity for Bellabeat to introduce motivational features that encourage users to break up long sedentary periods with movement.
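
The boxplot comparison can be backed by a numeric summary. This is a minimal sketch, assuming the cleaned `daily_activity` table and its four activity-minute columns; `activity_share` is a hypothetical helper that reports the mean minutes per day for each activity type and its share of the total.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper (an assumption, not from the original notebook):
# mean daily minutes per activity type and each type's share of the total.
activity_share <- function(df) {
  df %>%
    select(very_active_minutes, fairly_active_minutes,
           lightly_active_minutes, sedentary_minutes) %>%
    pivot_longer(everything(), names_to = "activity_type", values_to = "minutes") %>%
    group_by(activity_type) %>%
    summarise(mean_minutes = mean(minutes, na.rm = TRUE), .groups = "drop") %>%
    mutate(share = mean_minutes / sum(mean_minutes)) %>%
    arrange(desc(mean_minutes))
}

# Run only when the cleaned table from above exists in the session.
if (exists("daily_activity")) print(activity_share(daily_activity))
```

The shares sum to 1, so the first row shows directly how much of the tracked time falls in the largest category.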


📌 Key Insights


🧠 Recommendations


✅ Conclusion

The analysis reveals consistent patterns in physical activity, sedentary time, and sleep among Fitbit users. These insights can guide Bellabeat in refining its product positioning and user engagement strategy, especially for the Leaf smart wellness tracker.