Bellabeat, founded by Urška Sršen and Sando Mur in 2013, is a high-tech company that manufactures wellness products dedicated to empower women about their activity, sleep, stress, and reproductive health. Bellabeat is an international company and more about them and their products can be found on their website.
Trends, insights, and patterns can be identified in smart device data to find correlations to Bellabeat products & users, and to help dictate appropriate marketing strategies to increase market share.
Time: The classic look of a watch that uses smart technology to track user activity, sleep, and stress. It connects with the Bellabeat app to provide user insight to daily wellness.
Summary: The datasets contain wellness information among 33 anonymous FitBit users such calories, distance, steps, & sleep. The users consented to submission of tracked wellness information which are contained in .csv for the purpose of data exploration.
Licensing and accessibility: The data has been published under the CCO 1.0 Public Domain Dedication license. The publisher dedicated the data to public domain and waived all rights of the work worldwide under copyright law. Users of this data can copy, modify, distribute, and perform work even for commercial purposes without asking permission. The security and accessibility of this data is free to the public, however my personal copy and work of the data has been secured on a local device and backed up in an online drive.
Organization & Integrity:
The .csv files used were:
Files were categorized into folders such as Daily, Hourly, and Sleep. The datasets were sorted in long format for processing.The integrity of this data is outdated being from 2016 however the user information could not be changed once it is electronically logged and stored.
Limitations: The sample size was just above the minimum with 33 users and activity was tracked only for 31 days. There are possible issues with bias in this data because not every user logged the same amount of days for activity or sleep records which can affect the results of the study if not properly accounted for.
Tools:
Cleaning: Below is a summary of techniques used and embedded in the Bellabeat Datasets: Cleaning Log are the specific changes for cleaning made in the datasets.
Renaming columns
Days of the Week Activity: To get a better understanding of the averages by the day of the week, I uploaded the cleaned datasets to BigQuery. I created a new table in SQL titled Days of the Week Activity by aggregating the columns from the Daily_Activity dataset and extracting the Day of the week from the ActivityDate column.
Then I transformed the newly created table Days of the Week Activity table by creating a CASE statement to convert and sort minutes to hour when they exceeded 60. Then I created one final CASE statement to sort and assign the Day of the Week Column from a number to string text all in order to make the table more comprehensible.
User Type: In order to discern the number of times
each unique user Id logged in over the course of data collection period,
I created a new table and added a CASE statement that counted logins and
assigned users to groups based on activity.
Sleep The final table I created for processing in BigQuery was Sleep_Avg so I created a new table from the uploaded Sleep_Data table and created columns for Id, extracted DAYOFWEEK from the Date column as done in the previous table. I also wanted to get the averages for Time_In_Bed & Time_Asleep.
SQL: To start my analysis I wanted to determine the
average activity level for each day of the week for both the group and
individual users. My metrics were going to be average calories per day
and average steps per day. I created a User_Averages table
using SQL in BigQuery from the Daily_Activity table and
aggregated the average total of steps, distance, and calories. I then
queried the newly created table to round the averages and download the
results to sort in Excel before uploading to Tableau.
Excel: I downloaded the queried results to an excel notebook and created a pivot table for Calories & Steps. For the pivot table the rows were sorted as days of the week and values were the averages of Calories & Steps. The last step was creating a chart to have visualization accompany the table.
Tableau: I downloaded the results of the previously queried table Days_of_Week_Activity to an excel notebook to be uploaded to Tableau. The table was grouped by Id, Calories, Steps, Distance, and Day. I wanted to create a Visualization in Tableau to sort each individual daily calorie and step summary by day of the week, to make it easier to analyze the data. Below are the results.
Results We can already see from both the data and visualizations that the most active days of the week based on user and group averages of calories and steps are Tuesday & Saturday. The least active days appear to be Thursday & Sunday. Let’s continue our analysis and see what other trends we can find.
Steps & Calories by Hour: SQL To find the most active time of the day for all users over the course of the data collection period, I queried the results of the Calories_Hourly & Steps_Hourly tables by selecting DISTINCT Hour and rounding the average steps and caloires.
Hourly Activity: Tableau I uploaded the query results ,that were in long format in an excel notebook, to Tableau. I wanted to compare both the average calories burned and average steps of all users against the time of day to see the most active hours of the day. I created a dual-axis line chart in Tableau, here are the results.
Results According to the aggregated data and chart it appears the most active time of day is between about 5 & 6 pm. We get a sharp increase in activity at the beginning of the day around 6 am and it keeps rising until we get a dip between 2 - 4pm, before our active peak of the day between 5 & 6 pm. After about 7pm there is a sharp decrease in activity.
Average Sleep Data: SQL & Excel Sleep is extremely important and has a major impact on our day. I wanted to find the average amount of sleep by day of the week as well as a summary of sleep over the collection period so I could compare the results to see if it had any affect on activity. I created a pivot table and chart of sleep averages with day of the week in Excel from the Sleep_Avg table which I already processed and transformed from the Sleep_Data dataset in BigQuery.(As shown above in the Process section of this Case Study.)
Sleep vs. Highly Active Minutes: Tableau To get a
comparison of average sleep and activity, I needed to create a dual axis
chart in Tableau. I made a connection with the Sleep_Data table
and Daily_Activity in the data soucre section in Tableau. I
created a dual axis line chart comparing average time asleep to very
active minutes over the dates of the data collection period.
Results: Recommended sleep for adults is between 7 - 9 hours a day. The Sleep averages chart shows that the weekend (Saturday & Sunday) is when users get the most sleep. Tuesday and Thursday users get the least amount of sleep, less than 7 hours which is the minimum amount recommended for adults. We also saw a decline in activity after Tuesday in the Average calories and steps data. This shows a trend in sleep and physical activity. The dual axis chart shows a direct correlation between sleep and highly active minutes over time. Each day the users get more sleep the level in highly active minutes increases. Every day that has a dip in sleep, there is a dip in highly active minutes. The more sleep the users get the more highly active minutes they will partake in the next day. The less sleep the user gets, the more likely of a decrease in highly active minutes the following day.
Since sleep is so important and not only does the Bellabeat Time product keep track of your activity, but sleep as well. I wanted to compare how active users were, when tracking activity versus tracking sleep. I used the same standard for both tables of data by categorizing how often a user was tracking for that specific activity.
I created a pivot chart from the aggregated data in the
User_Type table which is filtered and processed from the
Daily_Activity table.
To categorize the users for sleep tracking I queried the
Sleep_Data table in SQL by selecting DISTINCT user Id counting
each user login date. Then I categorized each user based on number of
logins.
User Tracking: Activity vs. Sleep: Tableau To
compare how users track their sleep and how they track their activity I
wanted to create a dashboard in tableau to compare the charts side by
side. I uploaded the User_Type table and made a connection by
user type with the already uploaded Sleep_Data table. Within
Tableau I created a pie chart from both tables and changed the total
login data to percentage of total. I created a dashboard to compare both
pie charts side by side.
Results: The amount of Highly Active users for physical activity reached almost 88%, while Highly Active users only accounted for roughly 46% of sleep tracking. The amount of Highly Active users for tracking sleep was almost cut in half compared to Highly Active users for physical activity. Lightly Active users (which is between 0 and 5 logins) accounted for one third of all sleep tracking, compared to about 3% of tracking physical activity.
Days of the Week
Hourly
Sleep
User Tracking
Marketing Strategy Since the Covid-19 pandemic and a shift in working from home, it is recommended that the marketing department continue and increase their investment opportunities in digital ads on:
Advertisement time would be beneficial between 4 - 7 pm when people are getting off work and starting their peak time of physical activity.