Founded by Urška Sršen and Sando Mur in 2013, Bellabeat manufactures health-focused smart products. Sršen used her background as an artist to develop beautifully designed technology that informs and inspires women around the world. Collecting data on activity, sleep, stress, and reproductive health has allowed Bellabeat to empower women with knowledge about their own health and habits. Bellabeat is a successful small company that has grown rapidly since its founding, with the potential to become a larger player in the global smart device market.
Analyze usage data from non-Bellabeat smart devices, then provide insight into how the company can improve its marketing strategy and user interaction with its products.
We will use publicly available Fitbit user data as the basis for this study.
Source - FitBit Data
I set the foundation for this project by loading the essential packages I would use in this case study.
library(tidyverse)
library(lubridate)
library(dplyr)
library(ggplot2)
library(tidyr)
After saving the data sets into a folder on my computer, I set my working directory to that folder and loaded the most important data sets into R to begin working.
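The loading step is not shown in the write-up, so here is a minimal sketch of how it could look. The file names and the `load_fitbit()` helper are assumptions for illustration, not the author's code; adjust them to match the CSVs in your folder. `read_csv()` is used because the `<dbl>` column types in the glimpses are consistent with readr.

```r
library(readr)

# Hypothetical file names (assumed, not taken from the write-up).
fitbit_files <- c(
  activity    = "dailyActivity_merged.csv",
  calories    = "hourlyCalories_merged.csv",
  intensities = "hourlyIntensities_merged.csv",
  sleep       = "sleepDay_merged.csv",
  weight      = "weightLogInfo_merged.csv"
)

# Read every file in `files` from `data_dir` into a named list of tibbles.
load_fitbit <- function(data_dir, files = fitbit_files) {
  lapply(files, function(f) {
    read_csv(file.path(data_dir, f), show_col_types = FALSE)
  })
}

# Usage, once the folder exists:
# data <- load_fitbit("path/to/fitbit_data")
# activity <- data$activity
```

Returning a named list keeps the five tables together, and each one can still be assigned to its own variable as in the rest of this study.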
Take a glimpse of the data to see what we’re working with
glimpse(activity)
## Rows: 940
## Columns: 15
## $ Id <dbl> 1503960366, 1503960366, 1503960366, 150396036…
## $ ActivityDate <chr> "4/12/2016", "4/13/2016", "4/14/2016", "4/15/…
## $ TotalSteps <dbl> 13162, 10735, 10460, 9762, 12669, 9705, 13019…
## $ TotalDistance <dbl> 8.50, 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.8…
## $ TrackerDistance <dbl> 8.50, 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.8…
## $ LoggedActivitiesDistance <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ VeryActiveDistance <dbl> 1.88, 1.57, 2.44, 2.14, 2.71, 3.19, 3.25, 3.5…
## $ ModeratelyActiveDistance <dbl> 0.55, 0.69, 0.40, 1.26, 0.41, 0.78, 0.64, 1.3…
## $ LightActiveDistance <dbl> 6.06, 4.71, 3.91, 2.83, 5.04, 2.51, 4.71, 5.0…
## $ SedentaryActiveDistance <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ VeryActiveMinutes <dbl> 25, 21, 30, 29, 36, 38, 42, 50, 28, 19, 66, 4…
## $ FairlyActiveMinutes <dbl> 13, 19, 11, 34, 10, 20, 16, 31, 12, 8, 27, 21…
## $ LightlyActiveMinutes <dbl> 328, 217, 181, 209, 221, 164, 233, 264, 205, …
## $ SedentaryMinutes <dbl> 728, 776, 1218, 726, 773, 539, 1149, 775, 818…
## $ Calories <dbl> 1985, 1797, 1776, 1745, 1863, 1728, 1921, 203…
glimpse(calories)
## Rows: 22,099
## Columns: 3
## $ Id <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 150396036…
## $ ActivityHour <chr> "4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM", "4/12/20…
## $ Calories <dbl> 81, 61, 59, 47, 48, 48, 48, 47, 68, 141, 99, 76, 73, 66, …
glimpse(intensities)
## Rows: 22,099
## Columns: 4
## $ Id <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 15039…
## $ ActivityHour <chr> "4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM", "4/1…
## $ TotalIntensity <dbl> 20, 8, 7, 0, 0, 0, 0, 0, 13, 30, 29, 12, 11, 6, 36, 5…
## $ AverageIntensity <dbl> 0.333333, 0.133333, 0.116667, 0.000000, 0.000000, 0.0…
glimpse(sleep)
## Rows: 413
## Columns: 5
## $ Id <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 150…
## $ SleepDay <chr> "4/12/2016 12:00:00 AM", "4/13/2016 12:00:00 AM", "…
## $ TotalSleepRecords <dbl> 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ TotalMinutesAsleep <dbl> 327, 384, 412, 340, 700, 304, 360, 325, 361, 430, 2…
## $ TotalTimeInBed <dbl> 346, 407, 442, 367, 712, 320, 377, 364, 384, 449, 3…
glimpse(weight)
## Rows: 67
## Columns: 8
## $ Id <dbl> 1503960366, 1503960366, 1927972279, 2873212765, 2873212…
## $ Date <chr> "5/2/2016 11:59:59 PM", "5/3/2016 11:59:59 PM", "4/13/2…
## $ WeightKg <dbl> 52.6, 52.6, 133.5, 56.7, 57.3, 72.4, 72.3, 69.7, 70.3, …
## $ WeightPounds <dbl> 115.9631, 115.9631, 294.3171, 125.0021, 126.3249, 159.6…
## $ Fat <dbl> 22, NA, NA, NA, NA, 25, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ BMI <dbl> 22.65, 22.65, 47.54, 21.45, 21.69, 27.45, 27.38, 27.25,…
## $ IsManualReport <lgl> TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, …
## $ LogId <dbl> 1.462234e+12, 1.462320e+12, 1.460510e+12, 1.461283e+12,…
Upon viewing the data sets, I found that the timestamps were stored in a format unsuited to this project: the date and time share a single text column, and the time uses a 12-hour AM/PM clock. I converted the time in every data set to 24-hour format and split the date and time into two separate columns.
To check the validity of the data, I counted the distinct Ids in each data set to see how many individuals contributed logged data.
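The conversion code is not included above, so the following is a sketch of one way to do it with lubridate, using the first two rows from the hourly calories glimpse. The new column names `date` and `time` are my own choice and may differ from what was actually used.

```r
library(dplyr)
library(lubridate)

# Two rows copied from the glimpse() output of the hourly calories table.
calories <- tibble(
  Id = 1503960366,
  ActivityHour = c("4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM"),
  Calories = c(81, 61)
)

calories <- calories %>%
  mutate(
    ActivityHour = mdy_hms(ActivityHour),    # parse month/day/year with AM/PM
    date = as_date(ActivityHour),            # calendar date only
    time = format(ActivityHour, "%H:%M:%S")  # 24-hour clock, stored as text
  )
```

The same pattern should apply to the other tables by swapping in their timestamp column (`SleepDay` for sleep, `Date` for weight). The daily activity table holds dates only, so `mdy(ActivityDate)` would be enough there.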
n_distinct(activity$Id)
## [1] 33
n_distinct(calories$Id)
## [1] 33
n_distinct(intensities$Id)
## [1] 33
n_distinct(sleep$Id)
## [1] 24
n_distinct(weight$Id)
## [1] 8
Only 8 participants logged weight data. A sample this small cannot support conclusive findings, so any conclusions drawn from the weight data would be unreliable.
The first thing I wanted to look at was the variation in the intensity levels of users' activity, to better understand our product's target audience. Knowing what kinds of activity most users of health-focused smart products engage in will shape how we advertise: whether our posters and website feature bulky weightlifters or sprinters, or whether our advertisements show people going on walks and doing low-intensity activities.
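A comparison of intensity levels like this can be produced by averaging the daily minutes in each activity category. The sketch below uses the first three rows from the activity glimpse and only the three active categories. Whether the original chart also included `SedentaryMinutes` is not stated, so that choice is an assumption, as are the names `intensity_summary`, `level` and `avg_minutes`.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# First three days copied from the glimpse() output of the activity table.
activity <- tibble(
  VeryActiveMinutes    = c(25, 21, 30),
  FairlyActiveMinutes  = c(13, 19, 11),
  LightlyActiveMinutes = c(328, 217, 181)
)

# Reshape to one row per (day, level), then average minutes within each level.
intensity_summary <- activity %>%
  pivot_longer(
    c(VeryActiveMinutes, FairlyActiveMinutes, LightlyActiveMinutes),
    names_to = "level",
    values_to = "minutes"
  ) %>%
  group_by(level) %>%
  summarise(avg_minutes = mean(minutes), .groups = "drop")

# Bar chart of average daily minutes per intensity level.
intensity_plot <- ggplot(intensity_summary, aes(x = level, y = avg_minutes)) +
  geom_col() +
  labs(x = "Intensity level", y = "Average minutes per day")
```

Printing `intensity_plot` draws the chart. On the full table the same code runs unchanged, since it only refers to the three column names.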
The graph above shows the average number of minutes logged per day by users at each intensity level. It suggests that the majority of our target audience engages in very light exercise, so our flyers and posters should feature people doing light activities such as walking or yoga.
Bellabeat helps its clients track their health through a mobile app that lets users check their activity logs and enter personal data. Our goal is to optimize user interaction with Bellabeat products, and one way to do so is to set reminders for users to work out and track their health.
Activity is generally high from 10 AM to 7 PM, peaking from 5 PM to 7 PM. With that in mind, it would be wise to send reminders to work out or be active at around 4 PM.
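The hourly pattern described above can be checked by averaging `TotalIntensity` for each hour of the day. This is a sketch under the assumption that the hourly intensities table was the source of the chart. The helper `hourly_intensity()` and the column names `hour_of_day` and `avg_intensity` are hypothetical.

```r
library(dplyr)
library(lubridate)

# Average TotalIntensity for each hour of the day (0-23) across all users and days.
hourly_intensity <- function(intensities) {
  intensities %>%
    mutate(hour_of_day = hour(mdy_hms(ActivityHour))) %>%
    group_by(hour_of_day) %>%
    summarise(avg_intensity = mean(TotalIntensity), .groups = "drop")
}

# Two rows copied from the glimpse() output of the intensities table.
intensities <- tibble(
  Id = 1503960366,
  ActivityHour = c("4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM"),
  TotalIntensity = c(20, 8)
)

hourly_intensity(intensities)
```

Run on the full table, the hours with the highest `avg_intensity` indicate when users are most active, which is the basis for choosing a reminder time.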
The final thing I wanted to look at is sleep quality. If we can show that tracking your health improves sleep quality, it can motivate people to buy our product.
There is a clear negative correlation between sedentary minutes and minutes asleep: users who spend more time sedentary tend to sleep less. People should make an effort to track their health by using our products.
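To put a number on the relationship between sedentary time and sleep, the daily activity and sleep tables can be joined on user and date and passed to `cor()`. The sketch below assumes that is how the two tables were combined. `join_activity_sleep()` is a hypothetical helper, and the two rows are taken from the glimpses only to show the join.

```r
library(dplyr)
library(lubridate)

# Join daily activity to sleep logs on user Id and calendar date.
join_activity_sleep <- function(activity, sleep) {
  activity_day <- activity %>% mutate(date = mdy(ActivityDate))
  sleep_day <- sleep %>% mutate(date = as_date(mdy_hms(SleepDay)))
  inner_join(activity_day, sleep_day, by = c("Id", "date"))
}

# Two matching days copied from the glimpse() output of each table.
activity <- tibble(
  Id = 1503960366,
  ActivityDate = c("4/12/2016", "4/13/2016"),
  SedentaryMinutes = c(728, 776)
)
sleep <- tibble(
  Id = 1503960366,
  SleepDay = c("4/12/2016 12:00:00 AM", "4/13/2016 12:00:00 AM"),
  TotalMinutesAsleep = c(327, 384)
)

joined <- join_activity_sleep(activity, sleep)

# Pearson correlation; only meaningful on the full joined table, not two rows.
cor(joined$SedentaryMinutes, joined$TotalMinutesAsleep)
```

An inner join keeps only the days on which a user logged both activity and sleep. That matters here because only 24 of the 33 users have sleep records.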