Introduction

About the Company

Founded by Urška Sršen and Sandro Mur in 2013, Bellabeat manufactures health-focused smart products. Sršen used her background as an artist to develop beautifully designed technology that informs and inspires women around the world. Collecting data on activity, sleep, stress, and reproductive health has allowed Bellabeat to empower women with knowledge about their own health and habits. Bellabeat is a successful small company that has grown rapidly since its founding and has the potential to become a larger player in the global smart device market.

Business Task

Analyze usage data from non-Bellabeat smart devices, then provide insight into how the company can improve its marketing strategy and user interaction with its products.

Data Source

We will use publicly available Fitbit user data as the basis for this study.

Source - FitBit Data

Working with the data in R

I began by loading the packages I would use throughout this case study.

library(tidyverse)
library(lubridate)
library(dplyr)
library(ggplot2)
library(tidyr)

After saving the data sets to a folder on my computer, I set my working directory to that folder and loaded the most important data sets into R to begin working.
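As a rough sketch of the loading step, the code below reads one CSV file into a data frame with `read_csv()`. The folder name `fitbit_data` and the file name `daily_activity.csv` are placeholders I made up for illustration, not the names used in the original download, and the two-row stand-in file exists only so the example runs without the real data.

```r
library(readr)

# Assumption: all CSV files live in one folder. The folder and file names
# here are hypothetical placeholders, not the real data set's names.
data_dir <- file.path(tempdir(), "fitbit_data")
dir.create(data_dir, showWarnings = FALSE)

# Write a two-row stand-in file so the sketch runs without the real download.
write_csv(
  data.frame(
    Id = c(1503960366, 1503960366),
    ActivityDate = c("4/12/2016", "4/13/2016"),
    TotalSteps = c(13162, 10735)
  ),
  file.path(data_dir, "daily_activity.csv")
)

# Reading by full path avoids relying on the current working directory.
activity <- read_csv(file.path(data_dir, "daily_activity.csv"),
                     show_col_types = FALSE)
```

Building the path with `file.path()` is an alternative to calling `setwd()`; either approach works, but full paths keep the script independent of where it is run from.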

Take a glimpse of the data to see what we’re working with.

glimpse(activity)
## Rows: 940
## Columns: 15
## $ Id                       <dbl> 1503960366, 1503960366, 1503960366, 150396036…
## $ ActivityDate             <chr> "4/12/2016", "4/13/2016", "4/14/2016", "4/15/…
## $ TotalSteps               <dbl> 13162, 10735, 10460, 9762, 12669, 9705, 13019…
## $ TotalDistance            <dbl> 8.50, 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.8…
## $ TrackerDistance          <dbl> 8.50, 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.8…
## $ LoggedActivitiesDistance <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ VeryActiveDistance       <dbl> 1.88, 1.57, 2.44, 2.14, 2.71, 3.19, 3.25, 3.5…
## $ ModeratelyActiveDistance <dbl> 0.55, 0.69, 0.40, 1.26, 0.41, 0.78, 0.64, 1.3…
## $ LightActiveDistance      <dbl> 6.06, 4.71, 3.91, 2.83, 5.04, 2.51, 4.71, 5.0…
## $ SedentaryActiveDistance  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ VeryActiveMinutes        <dbl> 25, 21, 30, 29, 36, 38, 42, 50, 28, 19, 66, 4…
## $ FairlyActiveMinutes      <dbl> 13, 19, 11, 34, 10, 20, 16, 31, 12, 8, 27, 21…
## $ LightlyActiveMinutes     <dbl> 328, 217, 181, 209, 221, 164, 233, 264, 205, …
## $ SedentaryMinutes         <dbl> 728, 776, 1218, 726, 773, 539, 1149, 775, 818…
## $ Calories                 <dbl> 1985, 1797, 1776, 1745, 1863, 1728, 1921, 203…
glimpse(calories)
## Rows: 22,099
## Columns: 3
## $ Id           <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 150396036…
## $ ActivityHour <chr> "4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM", "4/12/20…
## $ Calories     <dbl> 81, 61, 59, 47, 48, 48, 48, 47, 68, 141, 99, 76, 73, 66, …
glimpse(intensities)
## Rows: 22,099
## Columns: 4
## $ Id               <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 15039…
## $ ActivityHour     <chr> "4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM", "4/1…
## $ TotalIntensity   <dbl> 20, 8, 7, 0, 0, 0, 0, 0, 13, 30, 29, 12, 11, 6, 36, 5…
## $ AverageIntensity <dbl> 0.333333, 0.133333, 0.116667, 0.000000, 0.000000, 0.0…
glimpse(sleep)
## Rows: 413
## Columns: 5
## $ Id                 <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 150…
## $ SleepDay           <chr> "4/12/2016 12:00:00 AM", "4/13/2016 12:00:00 AM", "…
## $ TotalSleepRecords  <dbl> 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ TotalMinutesAsleep <dbl> 327, 384, 412, 340, 700, 304, 360, 325, 361, 430, 2…
## $ TotalTimeInBed     <dbl> 346, 407, 442, 367, 712, 320, 377, 364, 384, 449, 3…
glimpse(weight)
## Rows: 67
## Columns: 8
## $ Id             <dbl> 1503960366, 1503960366, 1927972279, 2873212765, 2873212…
## $ Date           <chr> "5/2/2016 11:59:59 PM", "5/3/2016 11:59:59 PM", "4/13/2…
## $ WeightKg       <dbl> 52.6, 52.6, 133.5, 56.7, 57.3, 72.4, 72.3, 69.7, 70.3, …
## $ WeightPounds   <dbl> 115.9631, 115.9631, 294.3171, 125.0021, 126.3249, 159.6…
## $ Fat            <dbl> 22, NA, NA, NA, NA, 25, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ BMI            <dbl> 22.65, 22.65, 47.54, 21.45, 21.69, 27.45, 27.38, 27.25,…
## $ IsManualReport <lgl> TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, …
## $ LogId          <dbl> 1.462234e+12, 1.462320e+12, 1.460510e+12, 1.461283e+12,…

Altering the data

Upon viewing the data sets, I found that the format in which time was stored was unsuitable for this project. I converted the timestamps in all the data sets to 24-hour time (00:00 to 23:59) and split the date and time into two separate columns.
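One way this conversion could look is sketched below, using the first two `calories` rows shown in the `glimpse()` output. The new column names `date` and `time` and the intermediate `date_time` column are my own choices for illustration, and `mdy_hms()` is just one parsing option, not necessarily the one used in the original analysis.

```r
library(dplyr)
library(lubridate)

# First two rows of the calories data, in the original 12-hour format.
calories <- tibble(
  Id = c(1503960366, 1503960366),
  ActivityHour = c("4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM"),
  Calories = c(81, 61)
)

calories_clean <- calories %>%
  mutate(
    date_time = mdy_hms(ActivityHour),    # parses month/day/year and AM/PM
    date = as_date(date_time),            # calendar date only
    time = format(date_time, "%H:%M:%S")  # 24-hour clock, stored as text
  ) %>%
  select(-ActivityHour, -date_time)
```

The same `mutate()` step would apply to `intensities$ActivityHour` and `sleep$SleepDay`, since they share the timestamp format.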

To check the validity of the data, I counted the distinct Ids in each data set to see how many individuals contributed logged data.

n_distinct(activity$Id)
## [1] 33
n_distinct(calories$Id)
## [1] 33
n_distinct(intensities$Id)
## [1] 33
n_distinct(sleep$Id)
## [1] 24
n_distinct(weight$Id) 
## [1] 8

Only 8 participants submitted weight data. That sample is too small to support conclusive findings, so any results drawn from the weight data would not be reliable.

Findings with the Data

Point 1: Target Audience

The first thing I wanted to look at was the variation in the intensity levels users engage in, to better understand our products' target audience. Knowing what kinds of activity most users of health-focused smart products do will shape how we advertise: whether our posters and website should show bulky weightlifters and people sprinting, or people going on walks and doing stationary activities.

The graph above shows the average number of minutes users logged per day at each intensity level. We can infer from the graph that the majority of our target audience engages in very light exercise, so we should market with flyers and posters showing individuals doing light activities such as walking or yoga.
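A chart of this kind could be built roughly as follows. This is a minimal sketch, not the original plotting code: it uses only the first three rows of the minute columns from the `glimpse(activity)` output, and the names `avg_minutes` and `intensity_plot` are mine. Whether the original graph included sedentary minutes is not stated, so this version simply keeps all four columns.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# First three rows of the minute columns from glimpse(activity).
activity <- tibble(
  VeryActiveMinutes    = c(25, 21, 30),
  FairlyActiveMinutes  = c(13, 19, 11),
  LightlyActiveMinutes = c(328, 217, 181),
  SedentaryMinutes     = c(728, 776, 1218)
)

# Average each minute column, then reshape to one row per intensity level.
avg_minutes <- activity %>%
  summarise(across(ends_with("Minutes"), mean)) %>%
  pivot_longer(everything(),
               names_to = "intensity",
               values_to = "avg_minutes")

# Bar chart of average daily minutes at each intensity level.
intensity_plot <- ggplot(avg_minutes, aes(x = intensity, y = avg_minutes)) +
  geom_col() +
  labs(x = "Intensity level", y = "Average minutes per day")
```

Calling `print(intensity_plot)` draws the chart; on the full data set the same code would average over all 940 daily records.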

Point 2: App Reminders

Bellabeat helps its clients track their health efficiently through its mobile app, which lets users seamlessly check their activity logs and log personal data. Our goal is to optimize user interaction with Bellabeat products, and to do so we should set reminders for users to work out and track their health.

Generally, activity is high from 10 AM to 7 PM, with the most active hours being 5 PM to 7 PM. With that in mind, it would be wise to send reminders to work out or be active at around 4:00 PM.
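The hourly pattern can be found by averaging `TotalIntensity` for each hour of the day. The sketch below shows the idea on four made-up rows; the intensity values are illustrative only and are not taken from the Fitbit data, and the names `hourly` and `mean_intensity` are my own.

```r
library(dplyr)
library(lubridate)

# Illustrative rows only: these values are invented for the example.
intensities <- tibble(
  ActivityHour = c("4/12/2016 9:00:00 AM", "4/12/2016 5:00:00 PM",
                   "4/13/2016 9:00:00 AM", "4/13/2016 5:00:00 PM"),
  TotalIntensity = c(10, 30, 14, 26)
)

# Extract the hour of day (0-23), average intensity per hour,
# and sort so the most active hours come first.
hourly <- intensities %>%
  mutate(hour = hour(mdy_hms(ActivityHour))) %>%
  group_by(hour) %>%
  summarise(mean_intensity = mean(TotalIntensity), .groups = "drop") %>%
  arrange(desc(mean_intensity))
```

Sorting by `mean_intensity` puts the peak hours at the top, which makes it easy to pick a reminder time shortly before them.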

Point 3: Sleep Quality

The final thing I wanted to look at was sleep quality. If we can show that tracking your health improves sleep quality, it can motivate people to buy our product.

There is a clear negative correlation between sedentary minutes and minutes asleep: users who spend more time sedentary tend to sleep less. This gives people a reason to track their health with our products.
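A relationship like this could be checked by joining the daily activity and sleep records on `Id` and date, then computing a Pearson correlation. The sketch below is one possible approach under stated assumptions: the rows are invented purely to make the example run, the participant `Id` of 1 is hypothetical, and the helper column `date` is my own naming.

```r
library(dplyr)
library(lubridate)

# Illustrative rows only: these values are invented, not real Fitbit records.
activity <- tibble(
  Id = rep(1, 4),
  ActivityDate = c("4/12/2016", "4/13/2016", "4/14/2016", "4/15/2016"),
  SedentaryMinutes = c(600, 700, 800, 900)
)
sleep <- tibble(
  Id = rep(1, 4),
  SleepDay = c("4/12/2016 12:00:00 AM", "4/13/2016 12:00:00 AM",
               "4/14/2016 12:00:00 AM", "4/15/2016 12:00:00 AM"),
  TotalMinutesAsleep = c(480, 450, 400, 390)
)

# Give both tables a common Date column, then keep days present in both.
daily <- activity %>%
  mutate(date = mdy(ActivityDate)) %>%
  inner_join(
    sleep %>% mutate(date = as_date(mdy_hms(SleepDay))),
    by = c("Id", "date")
  )

# Pearson correlation between sedentary time and time asleep.
r <- cor(daily$SedentaryMinutes, daily$TotalMinutesAsleep)
```

An inner join is used because only 24 of the 33 participants logged sleep, so days without a sleep record drop out of the comparison.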

Conclusion

  • We should target individuals who engage in light levels of activity with our advertisements. This includes photos of individuals doing yoga, walking, etc.
  • Optimize user interaction with our products by strategically placing application reminders to exercise right before peak hours of activity (4:00 PM).
  • Show the health benefits of tracking your fitness, such as improved sleep quality.