1 Company Summary

1.1 About the company

Bellabeat is a high-tech manufacturer of health-focused products for women, founded by Urška Sršen and Sando Mur in 2013. Their products collect data on activity, sleep, stress, and reproductive health to empower women with knowledge about their own health and habits. Ever since its founding, Bellabeat has rapidly grown and positioned itself as a tech-driven wellness company for women.

By 2016, Bellabeat had opened offices around the world and launched multiple products. Bellabeat products became available through a growing number of online retailers in addition to their own e-commerce channel on their website. The company has invested in traditional advertising media, such as radio, out-of-home billboards, print, and television, but focuses extensively on digital marketing. Bellabeat invests year-round in Google Search, maintains active Facebook and Instagram pages, and consistently engages consumers on Twitter. Additionally, Bellabeat runs video ads on YouTube and display ads on the Google Display Network to support campaigns around key marketing dates.

1.2 Company Objective

Sršen knows that an analysis of Bellabeat’s available consumer data would reveal more opportunities for growth. She has asked the marketing analytics team to focus on a Bellabeat product and analyze smart device usage data in order to gain insight into how people are already using their smart devices. Then, using this information, she would like high-level recommendations for how these trends can inform Bellabeat marketing strategy.

1.3 Company Products

Bellabeat app: The Bellabeat app provides users with health data related to their activity, sleep, stress, menstrual cycle, and mindfulness habits.
Leaf: Bellabeat’s classic wellness tracker can be worn as a bracelet, necklace, or clip.
Spring: This is a water bottle that tracks daily water intake using smart technology to ensure that you are appropriately hydrated throughout the day.
Time: This wellness watch combines the timeless look of a classic timepiece with smart technology to track user activity, sleep, and stress. The Time watch connects to the Bellabeat app to provide you with insights into your daily wellness.
Bellabeat membership: Bellabeat also offers a subscription-based membership program for users.

2 Ask Phase

2.1 Business Task

Sršen asks you to analyze smart device usage data in order to gain insight into how consumers use non-Bellabeat smart devices. She then wants you to select one Bellabeat product to apply these insights to in your presentation.

2.2 Questions guiding my analysis

What are some trends in smart device usage?
How could these trends apply to Bellabeat customers?
How could these trends help influence Bellabeat marketing strategy?

2.3 Stakeholders

Urška Sršen: Bellabeat’s cofounder and Chief Creative Officer
Sando Mur: Mathematician and Bellabeat’s cofounder; key member of the Bellabeat executive team
Bellabeat marketing analytics team: A team of data analysts responsible for collecting, analyzing, and reporting data that helps guide Bellabeat’s marketing strategy.

3 Prepare Phase

3.1 About Dataset

The dataset used for this case study is the FitBit Fitness Tracker Data set from Kaggle. It contains personal fitness tracker data from 33 Fitbit users who consented to the submission of their personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring. It includes information about daily activity, steps, and heart rate that can be used to explore users’ habits.

3.2 Data Limitations

The sample for this dataset is limited to only 33 participants, while the market for fitness trackers is much larger.

3.3 Preparing Dataset

Loading Packages

library(tidyverse)

## Warning in Sys.timezone(): unable to identify current timezone 'H':
## please set environment variable 'TZ'

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.0     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.1     ✔ tibble    3.1.8
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(dplyr)
library(readxl)

Loading Data from Excel Files

# readxl is already attached above, so it does not need to be reloaded before each file
weightLogInfo_merged <- read_excel("C:/Users/Juan Quintero/Downloads/weightLogInfo_merged.xlsx")
View(weightLogInfo_merged)
heartrate_seconds_merged <- read_excel("C:/Users/Juan Quintero/Downloads/heartrate_seconds_merged.xlsx")
View(heartrate_seconds_merged)
sleepDay_merged <- read_excel("C:/Users/Juan Quintero/Downloads/sleepDay_merged.xlsx")
View(sleepDay_merged)
hourlySteps_merged <- read_excel("C:/Users/Juan Quintero/Downloads/hourlySteps_merged.xlsx")
View(hourlySteps_merged)
hourlyCalories_merged <- read_excel("C:/Users/Juan Quintero/Downloads/hourlyCalories_merged.xlsx")
View(hourlyCalories_merged)
DailyActivites_Merged <- read_excel("C:/Users/Juan Quintero/Downloads/DailyActivites_Merged.xlsx")
View(DailyActivites_Merged)

4 Processing Phase

4.1 Data Cleaning

WeightLog <- distinct(weightLogInfo_merged) %>% select(-Fat) %>% rename(DateTime=Date) 
WeightLog$Date <- sapply(strsplit(as.character(WeightLog$DateTime), " "), "[",1)  
WeightLog$Time <- sapply(strsplit(as.character(WeightLog$DateTime), " "), "[",2)  
WeightLog <- mutate(subset(WeightLog,select= c(1,2,8,9,3,4,5,6,7)))
WeightLog <- WeightLog %>% select(-DateTime) 
View(WeightLog)

I removed duplicate rows and dropped the “Fat” column because its NA values made it unreliable. I also renamed columns so that they stay consistent when tables are combined. The datetime column was split into separate date and time columns, and the columns were rearranged for a cleaner, easier-to-read table.
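The same cleaning steps can also be written as a single pipeline. The following is a minimal sketch on hypothetical toy data: `toy_log` and its values are assumptions standing in for `weightLogInfo_merged`, and it uses `format()` instead of `strsplit()` to pull the date and time out of the datetime column.

```r
library(dplyr)

# Hypothetical toy data standing in for weightLogInfo_merged (values are made up)
toy_log <- tibble(
  Id = c(1, 1, 2),
  Date = as.POSIXct(
    c("2016-05-02 23:59:59", "2016-05-02 23:59:59", "2016-04-13 01:08:52"),
    tz = "UTC"
  ),
  WeightKg = c(52.6, 52.6, 133.5),
  Fat = c(22, 22, NA)
)

clean_log <- toy_log %>%
  distinct() %>%                          # drop exact duplicate rows
  select(-Fat) %>%                        # drop the unreliable column
  rename(DateTime = Date) %>%
  mutate(
    Date = format(DateTime, "%Y-%m-%d"),  # date part as text
    Time = format(DateTime, "%H:%M:%S")   # time part as text
  ) %>%
  select(-DateTime) %>%
  relocate(Date, Time, .after = Id)       # put the new columns next to Id

print(clean_log)
```

Selecting and relocating columns by name rather than by position keeps the pipeline working if the source file gains or loses a column.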

HeartRateSeconds <- distinct(heartrate_seconds_merged) %>% rename(RatePerSecond=Value) %>% rename(DateTime=Time) 
HeartRateSeconds$Date <- sapply(strsplit(as.character(HeartRateSeconds$DateTime), " "), "[",1)
HeartRateSeconds$Time <- sapply(strsplit(as.character(HeartRateSeconds$DateTime), " "), "[",2)
HeartRateSeconds <- mutate(subset(HeartRateSeconds,select=c(1,2,4,5,3)))
HeartRateSeconds <- HeartRateSeconds %>% select(-DateTime) 
View(HeartRateSeconds) # Viewing Clean Data Frame #

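Removed duplicates and renamed the “Value” column to “RatePerSecond”, as it gives me a better understanding of what I am reading. The readings appear to be heart rates sampled every few seconds, most likely in beats per minute given their range. I also split the datetime into two columns and rearranged the columns.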

SleepDays <- distinct(sleepDay_merged) %>% rename(Date=SleepDay) 
View(SleepDays)

Removed duplicates, and renamed columns for consistency.

HourlySteps <- distinct(hourlySteps_merged) %>% rename(DateTime=ActivityHour) %>% rename(StepsTotal=StepTotal) 
HourlySteps$Date <- sapply(strsplit(as.character(HourlySteps$DateTime), " "), "[",1) 
HourlySteps$Time <- sapply(strsplit(as.character(HourlySteps$DateTime), " "), "[",2) 
HourlySteps <- mutate(subset(HourlySteps,select=c(1,2,4,5,3))) 
HourlySteps <- HourlySteps %>% select(-DateTime)
View(HourlySteps)

Removed duplicates, renamed columns, split datetime into two columns, and rearranged columns.

HourlyCalories <- distinct(hourlyCalories_merged) %>% rename(DateTime=ActivityHour) %>% rename(CaloriesBurnt=Calories) 
HourlyCalories$Date <- sapply(strsplit(as.character(HourlyCalories$DateTime), " "), "[",1) 
HourlyCalories$Time <- sapply(strsplit(as.character(HourlyCalories$DateTime), " "), "[",2) 
HourlyCalories <- mutate(subset(HourlyCalories,select=c(1,2,4,5,3))) 
HourlyCalories <- HourlyCalories %>% select(-DateTime) 
View(HourlyCalories)

Removed duplicates, renamed columns, split datetime into two columns, and rearranged columns.

DailyActivities <- distinct(DailyActivites_Merged) %>% rename(Date=ActivityDate) 
View(DailyActivities)

Removed duplicates and renamed the date column for consistency.

4.2 Exploring Tables

First 6 rows of each table

head(WeightLog)

## # A tibble: 6 × 8
##           Id Date       Time     WeightKg WeightPounds   BMI IsManualR…¹   LogId
##        <dbl> <chr>      <chr>       <dbl>        <dbl> <dbl> <lgl>         <dbl>
## 1 1503960366 2016-05-02 23:59:59     52.6         116.  22.6 TRUE        1.46e12
## 2 1503960366 2016-05-03 23:59:59     52.6         116.  22.6 TRUE        1.46e12
## 3 1927972279 2016-04-13 01:08:52    134.          294.  47.5 FALSE       1.46e12
## 4 2873212765 2016-04-21 23:59:59     56.7         125.  21.5 TRUE        1.46e12
## 5 2873212765 2016-05-12 23:59:59     57.3         126.  21.7 TRUE        1.46e12
## 6 4319703577 2016-04-17 23:59:59     72.4         160.  27.5 TRUE        1.46e12
## # … with abbreviated variable name ¹IsManualReport

head(HeartRateSeconds)

## # A tibble: 6 × 4
##           Id Date       Time     RatePerSecond
##        <dbl> <chr>      <chr>            <dbl>
## 1 2022484408 2016-04-12 07:21:00            97
## 2 2022484408 2016-04-12 07:21:05           102
## 3 2022484408 2016-04-12 07:21:10           105
## 4 2022484408 2016-04-12 07:21:20           103
## 5 2022484408 2016-04-12 07:21:25           101
## 6 2022484408 2016-04-12 07:22:05            95

head(SleepDays)

## # A tibble: 6 × 5
##           Id Date                TotalSleepRecords TotalMinutesAsleep TotalTim…¹
##        <dbl> <dttm>                          <dbl>              <dbl>      <dbl>
## 1 1503960366 2016-04-12 00:00:00                 1                327        346
## 2 1503960366 2016-04-13 00:00:00                 2                384        407
## 3 1503960366 2016-04-15 00:00:00                 1                412        442
## 4 1503960366 2016-04-16 00:00:00                 2                340        367
## 5 1503960366 2016-04-17 00:00:00                 1                700        712
## 6 1503960366 2016-04-19 00:00:00                 1                304        320
## # … with abbreviated variable name ¹TotalTimeInBed

head(HourlySteps)

## # A tibble: 6 × 4
##           Id Date       Time     StepsTotal
##        <dbl> <chr>      <chr>         <dbl>
## 1 1503960366 2016-04-12 00:00:00        373
## 2 1503960366 2016-04-12 01:00:00        160
## 3 1503960366 2016-04-12 02:00:00        151
## 4 1503960366 2016-04-12 03:00:00          0
## 5 1503960366 2016-04-12 04:00:00          0
## 6 1503960366 2016-04-12 05:00:00          0

head(HourlyCalories)

## # A tibble: 6 × 4
##           Id Date       Time     CaloriesBurnt
##        <dbl> <chr>      <chr>            <dbl>
## 1 1503960366 2016-04-12 00:00:00            81
## 2 1503960366 2016-04-12 01:00:00            61
## 3 1503960366 2016-04-12 02:00:00            59
## 4 1503960366 2016-04-12 03:00:00            47
## 5 1503960366 2016-04-12 04:00:00            48
## 6 1503960366 2016-04-12 05:00:00            48

head(DailyActivities)

## # A tibble: 6 × 15
##           Id Date                Total…¹ Total…² Track…³ Logge…⁴ VeryA…⁵ Moder…⁶
##        <dbl> <dttm>                <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
## 1 1503960366 2016-04-12 00:00:00   13162    8.5     8.5        0    1.88   0.550
## 2 1503960366 2016-04-13 00:00:00   10735    6.97    6.97       0    1.57   0.690
## 3 1503960366 2016-04-14 00:00:00   10460    6.74    6.74       0    2.44   0.400
## 4 1503960366 2016-04-15 00:00:00    9762    6.28    6.28       0    2.14   1.26 
## 5 1503960366 2016-04-16 00:00:00   12669    8.16    8.16       0    2.71   0.410
## 6 1503960366 2016-04-17 00:00:00    9705    6.48    6.48       0    3.19   0.780
## # … with 7 more variables: LightActiveDistance <dbl>,
## #   SedentaryActiveDistance <dbl>, VeryActiveMinutes <dbl>,
## #   FairlyActiveMinutes <dbl>, LightlyActiveMinutes <dbl>,
## #   SedentaryMinutes <dbl>, Calories <dbl>, and abbreviated variable names
## #   ¹TotalSteps, ²TotalDistance, ³TrackerDistance, ⁴LoggedActivitiesDistance,
## #   ⁵VeryActiveDistance, ⁶ModeratelyActiveDistance

Identifying all the Columns

colnames(WeightLog)

## [1] "Id"             "Date"           "Time"           "WeightKg"      
## [5] "WeightPounds"   "BMI"            "IsManualReport" "LogId"

colnames(HeartRateSeconds)

## [1] "Id"            "Date"          "Time"          "RatePerSecond"

colnames(SleepDays)

## [1] "Id"                 "Date"               "TotalSleepRecords" 
## [4] "TotalMinutesAsleep" "TotalTimeInBed"

colnames(HourlySteps)

## [1] "Id"         "Date"       "Time"       "StepsTotal"

colnames(HourlyCalories)

## [1] "Id"            "Date"          "Time"          "CaloriesBurnt"

colnames(DailyActivities)

##  [1] "Id"                       "Date"                    
##  [3] "TotalSteps"               "TotalDistance"           
##  [5] "TrackerDistance"          "LoggedActivitiesDistance"
##  [7] "VeryActiveDistance"       "ModeratelyActiveDistance"
##  [9] "LightActiveDistance"      "SedentaryActiveDistance" 
## [11] "VeryActiveMinutes"        "FairlyActiveMinutes"     
## [13] "LightlyActiveMinutes"     "SedentaryMinutes"        
## [15] "Calories"

Identifying the Number of Participants

n_distinct(WeightLog$Id)

## [1] 8

n_distinct(HeartRateSeconds$Id)

## [1] 14

n_distinct(SleepDays$Id)

## [1] 24

n_distinct(HourlySteps$Id)

## [1] 33

n_distinct(HourlyCalories$Id)

## [1] 33

n_distinct(DailyActivities$Id)

## [1] 33

Identifying the Number of Rows for each Data Set

nrow(WeightLog)

## [1] 67

nrow(HeartRateSeconds)

## [1] 2483658

nrow(SleepDays)

## [1] 410

nrow(HourlySteps)

## [1] 22099

nrow(HourlyCalories)

## [1] 22099

nrow(DailyActivities)

## [1] 940
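The participant and row counts above can also be collected in one overview table instead of one call per data frame. This is a minimal sketch in base R; the three small data frames in `tables` are hypothetical stand-ins for the cleaned tables, and only their `Id` column matters here.

```r
# Hypothetical stand-ins for the cleaned tables (values are made up)
tables <- list(
  WeightLog   = data.frame(Id = c(1, 1, 2)),
  SleepDays   = data.frame(Id = c(1, 2, 2, 3)),
  HourlySteps = data.frame(Id = c(1, 2, 3, 4, 4))
)

# One row per table: distinct participants and total rows
overview <- data.frame(
  Table = names(tables),
  Participants = sapply(tables, function(df) length(unique(df$Id))),
  Rows = sapply(tables, nrow),
  row.names = NULL
)

print(overview)
```

Seeing both counts side by side makes it easier to spot tables where few participants contribute many rows.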

Brief Summary for each Data Frame

WeightLog %>% select(WeightKg, WeightPounds, BMI) %>% summary()

##     WeightKg       WeightPounds        BMI       
##  Min.   : 52.60   Min.   :116.0   Min.   :21.45  
##  1st Qu.: 61.40   1st Qu.:135.4   1st Qu.:23.96  
##  Median : 62.50   Median :137.8   Median :24.39  
##  Mean   : 72.04   Mean   :158.8   Mean   :25.19  
##  3rd Qu.: 85.05   3rd Qu.:187.5   3rd Qu.:25.56  
##  Max.   :133.50   Max.   :294.3   Max.   :47.54

HeartRateSeconds %>% select(RatePerSecond) %>% summary()

##  RatePerSecond   
##  Min.   : 36.00  
##  1st Qu.: 63.00  
##  Median : 73.00  
##  Mean   : 77.33  
##  3rd Qu.: 88.00  
##  Max.   :203.00

SleepDays %>% select(TotalSleepRecords, TotalMinutesAsleep, TotalTimeInBed) %>% summary()

##  TotalSleepRecords TotalMinutesAsleep TotalTimeInBed 
##  Min.   :1.00      Min.   : 58.0      Min.   : 61.0  
##  1st Qu.:1.00      1st Qu.:361.0      1st Qu.:403.8  
##  Median :1.00      Median :432.5      Median :463.0  
##  Mean   :1.12      Mean   :419.2      Mean   :458.5  
##  3rd Qu.:1.00      3rd Qu.:490.0      3rd Qu.:526.0  
##  Max.   :3.00      Max.   :796.0      Max.   :961.0

HourlySteps %>% select(StepsTotal) %>% summary()

##    StepsTotal     
##  Min.   :    0.0  
##  1st Qu.:    0.0  
##  Median :   40.0  
##  Mean   :  320.2  
##  3rd Qu.:  357.0  
##  Max.   :10554.0

HourlyCalories %>% select(CaloriesBurnt) %>% summary()

##  CaloriesBurnt   
##  Min.   : 42.00  
##  1st Qu.: 63.00  
##  Median : 83.00  
##  Mean   : 97.39  
##  3rd Qu.:108.00  
##  Max.   :948.00

DailyActivities %>% select(TotalSteps,TotalDistance,SedentaryMinutes) %>% summary()

##    TotalSteps    TotalDistance    SedentaryMinutes
##  Min.   :    0   Min.   : 0.000   Min.   :   0.0  
##  1st Qu.: 3790   1st Qu.: 2.620   1st Qu.: 729.8  
##  Median : 7406   Median : 5.245   Median :1057.5  
##  Mean   : 7638   Mean   : 5.490   Mean   : 991.2  
##  3rd Qu.:10727   3rd Qu.: 7.713   3rd Qu.:1229.5  
##  Max.   :36019   Max.   :28.030   Max.   :1440.0

5 Analysis Phase

Creating Health Reports based off of Participants BMI

# Conditions are checked in order, so each range starts exactly where the previous one ends
# and a value such as 24.95 can no longer fall through to 'AtRisk'
WeightLog <- WeightLog %>% select(Id,Date,Time,WeightKg,WeightPounds,BMI,IsManualReport,LogId) %>% mutate(HealthStatus = case_when(
  BMI < 18.5 ~ 'UnderWeight',
  BMI < 25 ~ 'Normal',
  BMI < 30 ~ 'OverWeight',
  BMI < 40 ~ 'Obese',
  TRUE ~ 'AtRisk'
   ))

WeightLog <- mutate(subset(WeightLog,select= c(1,2,3,4,5,6,9,7,8)))
HealthReport <- WeightLog %>% distinct(Id,HealthStatus)
View(HealthReport)

Because body mass index and weight are the only body measurements available, I created a new column using case_when() to label the health status of each weight log entry. I then copied the distinct Id and HealthStatus pairs into a new data frame showing the health status of the only 8 participants who logged their weight. For more information on BMI categories, see https://www.nhlbi.nih.gov/health/educational/lose_wt/BMI/bmicalc.htm.
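One way to check how values on or near a category boundary are labelled is to classify a few test values with contiguous intervals. This is a minimal sketch using base R's `cut()`; the `bmi` values are hypothetical, and the labels reuse the ones from the analysis above, including the author's own “AtRisk” label for a BMI of 40 or more.

```r
# Hypothetical BMI values chosen to sit on or near the category boundaries
bmi <- c(17.0, 18.5, 24.95, 25.0, 29.95, 30.0, 47.5)

# right = FALSE makes every interval closed on the left: [18.5, 25), [25, 30), ...
health_status <- cut(
  bmi,
  breaks = c(-Inf, 18.5, 25, 30, 40, Inf),
  labels = c("UnderWeight", "Normal", "OverWeight", "Obese", "AtRisk"),
  right = FALSE
)

print(data.frame(BMI = bmi, HealthStatus = health_status))
```

Because the breaks are shared between neighbouring intervals, no BMI value can land in a gap between two categories.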

Looking into steps taken by each participant throughout the day

AverageHourlySteps <- HourlySteps %>% group_by(Time) %>% summarize(AverageSteps=mean(StepsTotal))
View(AverageHourlySteps)

I wanted to know the average number of steps taken in each hour of the day across all participants.

HourlyStepsViz <- ggplot(data=AverageHourlySteps)+
  geom_col(mapping=aes(x=Time,y=AverageSteps), fill='coral1')+  # a constant fill is set outside aes()
  labs(title="Hourly Steps Throughout The Day")+
  theme(axis.text.x=element_text(angle=90))
plot(HourlyStepsViz)

Participants seem to be most active between 5pm and 7pm on a day-to-day basis.
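The busiest hour can also be read from the averaged table directly rather than from the chart. This is a minimal sketch on hypothetical toy data: `toy_steps` and its values are assumptions, while the column names follow `HourlySteps`.

```r
library(dplyr)

# Hypothetical hourly records for two participants (values are made up)
toy_steps <- tibble(
  Id = c(1, 1, 1, 2, 2, 2),
  Time = rep(c("17:00:00", "18:00:00", "19:00:00"), times = 2),
  StepsTotal = c(500, 700, 600, 400, 500, 550)
)

# Average steps per hour of day across participants
avg_steps <- toy_steps %>%
  group_by(Time) %>%
  summarise(AverageSteps = mean(StepsTotal), .groups = "drop")

# Keep only the hour with the highest average
peak_hour <- avg_steps %>% slice_max(AverageSteps, n = 1)

print(peak_hour)
```

Applied to `AverageHourlySteps`, the same `slice_max()` call would return the peak hour and its average in one row.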

Combining total steps taken and calories burnt by each participant

HourlyStepsTotal <- HourlySteps %>% group_by(Id) %>% summarise(HourlySteps=sum(StepsTotal))    
HourlyStepsTotal <- HourlyStepsTotal %>% rename(TotalSteps=HourlySteps)
head(HourlyStepsTotal)

## # A tibble: 6 × 2
##           Id TotalSteps
##        <dbl>      <dbl>
## 1 1503960366     374546
## 2 1624580081     177750
## 3 1644430081     217927
## 4 1844505072      79942
## 5 1927972279      28400
## 6 2022484408     351712

HourlyCaloriesTotal <- HourlyCalories %>% group_by(Id) %>% summarise(HourlyCalories=sum(CaloriesBurnt))    
HourlyCaloriesTotal <- HourlyCaloriesTotal %>% rename(TotalBurntCalories=HourlyCalories)
head(HourlyCaloriesTotal)

## # A tibble: 6 × 2
##           Id TotalBurntCalories
##        <dbl>              <dbl>
## 1 1503960366              56287
## 2 1624580081              45980
## 3 1644430081              84125
## 4 1844505072              48681
## 5 1927972279              67347
## 6 2022484408              77633

CaloriesBurntByStep <- merge(HourlyStepsTotal,HourlyCaloriesTotal, by="Id", all=TRUE)
head(CaloriesBurntByStep)

##           Id TotalSteps TotalBurntCalories
## 1 1503960366     374546              56287
## 2 1624580081     177750              45980
## 3 1644430081     217927              84125
## 4 1844505072      79942              48681
## 5 1927972279      28400              67347
## 6 2022484408     351712              77633

HealthEvaluation <- merge(CaloriesBurntByStep, HealthReport, by="Id")
(HealthEvaluation)

##           Id TotalSteps TotalBurntCalories HealthStatus
## 1 1503960366     374546              56287       Normal
## 2 1927972279      28400              67347       AtRisk
## 3 2873212765     234130              59059       Normal
## 4 4319703577     209464              61904   OverWeight
## 5 4558609924     237909              63046   OverWeight
## 6 5577150313     248519             100816   OverWeight
## 7 6962181067     303621              61461       Normal
## 8 8877689391     495623             105746   OverWeight

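I wanted to narrow the evaluation down to participants with a complete record. Because merge() keeps only the Ids found in both tables, only the 8 participants out of 33 who logged their weight remain alongside their total steps and calories burnt. The rest have no health status to evaluate.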
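The drop from 33 participants to 8 follows from how `merge()` behaves by default: it keeps only the Ids present in both tables. This minimal sketch uses dplyr joins on hypothetical toy data (`steps` and `health` are assumptions) to show which participants are kept and which are dropped.

```r
library(dplyr)

# Hypothetical per-participant totals and health labels (values are made up)
steps  <- tibble(Id = c(1, 2, 3), TotalSteps = c(100, 200, 300))
health <- tibble(Id = c(1, 3), HealthStatus = c("Normal", "OverWeight"))

# Same result as merge(steps, health, by = "Id"): only matching Ids survive
kept <- inner_join(steps, health, by = "Id")

# Participants with steps but no health label, i.e. the ones the merge drops
dropped <- anti_join(steps, health, by = "Id")

print(kept)
print(dropped)
```

Checking the `anti_join()` result before merging makes it explicit how much of the sample is lost at each step.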

Weekday with most sleep time

SleepDate <- SleepDays %>% distinct(Date)
SleepDate$weekday <- weekdays(SleepDate$Date)          

SleepDays <- merge(SleepDays,SleepDate, by= "Date")
SleepDays <- mutate(subset(SleepDays,select=c(2,1,6,3,4,5)))
SleepDays <- SleepDays %>% rename(Weekday=weekday)

options(width=150)
head(SleepDays)

##           Id       Date Weekday TotalSleepRecords TotalMinutesAsleep TotalTimeInBed
## 1 1503960366 2016-04-12 Tuesday                 1                327            346
## 2 8378563200 2016-04-12 Tuesday                 1                338            356
## 3 5577150313 2016-04-12 Tuesday                 1                419            438
## 4 4020332650 2016-04-12 Tuesday                 1                501            541
## 5 5553957443 2016-04-12 Tuesday                 1                441            464
## 6 4445114986 2016-04-12 Tuesday                 2                429            457

To get a better understanding of each participant’s sleep schedule, I added a new weekday column.

WeekDays <- SleepDays$Weekday %>% factor(levels=c("Sunday","Monday","Tuesday", "Wednesday", "Thursday","Friday","Saturday"))
WeeklySleepViz <- ggplot(data=SleepDays)+geom_col(mapping=aes(x=WeekDays,y=TotalMinutesAsleep), fill='coral1') +  # a constant fill is set outside aes()
  labs(title= "Total Minutes Asleep Throughout the Week") + 
  annotate("text",x="Thursday",y=30000, label = "Most Sleep seems to be in the middle of the week")
plot(WeeklySleepViz)

There seems to be early burnout, as the most total minutes asleep fall on a Wednesday. The weekend comes second, which makes sense if most of these participants have day jobs and work 5 days a week.
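Because `geom_col()` stacks every sleep record, a weekday's total grows with the number of records logged on that day as well as with how long people slept. A minimal sketch on hypothetical toy data (`toy_sleep` and its values are assumptions) compares totals with per-record means. It derives the weekday from `as.POSIXlt()` so that the labels do not depend on the system locale.

```r
library(dplyr)

day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")

# Hypothetical sleep records: three on a Tuesday, two on a Wednesday
toy_sleep <- tibble(
  Id = c(1, 2, 3, 1, 2),
  Date = as.Date(c("2016-04-12", "2016-04-12", "2016-04-12",
                   "2016-04-13", "2016-04-13")),
  TotalMinutesAsleep = c(300, 360, 420, 480, 500)
)

sleep_by_day <- toy_sleep %>%
  # $wday counts from 0 (Sunday) to 6 (Saturday)
  mutate(Weekday = factor(day_names[as.POSIXlt(Date)$wday + 1],
                          levels = day_names)) %>%
  group_by(Weekday) %>%
  summarise(
    Records = n(),
    TotalMinutes = sum(TotalMinutesAsleep),
    MeanMinutes = mean(TotalMinutesAsleep),
    .groups = "drop"
  )

print(sleep_by_day)
```

In this toy data Tuesday has the larger total only because it has more records, while Wednesday has the longer average night, so it is worth checking both before drawing conclusions from the chart.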

Correlation Between Total Time in Bed and Sleep Time

SleepTimeViz <- ggplot(data=SleepDays, aes(x=TotalMinutesAsleep,y=TotalTimeInBed))+
  geom_point(color='coral1')+geom_smooth(method=lm, se=FALSE,color = 'black', linewidth=1)+ 
  labs(title="Correlation between Minutes In Bed and Being Asleep")
plot(SleepTimeViz)

## `geom_smooth()` using formula = 'y ~ x'

There is a positive correlation between total time in bed and total minutes asleep. Since the two move together, Bellabeat can further explore tracking users’ sleep cycles to get a better understanding of how they function with or without sleep.
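The strength of this relationship can be put into a number with `cor()`. This minimal sketch uses only the six rows shown by `head(SleepDays)` in section 4.2, so it illustrates the calculation rather than the result for the full table. The `SleepEfficiency` column is an added illustration, not part of the original analysis.

```r
# The six rows printed by head(SleepDays) in section 4.2
sleep_sample <- data.frame(
  TotalMinutesAsleep = c(327, 384, 412, 340, 700, 304),
  TotalTimeInBed     = c(346, 407, 442, 367, 712, 320)
)

# Pearson correlation between minutes asleep and minutes in bed
r <- cor(sleep_sample$TotalMinutesAsleep, sleep_sample$TotalTimeInBed)

# Share of time in bed that was actually spent asleep (illustrative extra column)
sleep_sample$SleepEfficiency <-
  sleep_sample$TotalMinutesAsleep / sleep_sample$TotalTimeInBed

print(r)
print(sleep_sample)
```

A ratio such as sleep efficiency may be more informative than the correlation itself, since time asleep can never exceed time in bed and the two are bound to move together.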

Sleep Report to Health Evaluation

OverallTimeInBed <- SleepDays %>% group_by(Id) %>% summarise(OverallTimeInBed=sum(TotalTimeInBed))
OverallMinutesAsleep <- SleepDays %>% group_by(Id) %>% summarise(OverallMinutesAsleep=sum(TotalMinutesAsleep))
SleepReport <- merge(OverallMinutesAsleep, OverallTimeInBed, by="Id")
HealthEvaluation <- merge(SleepReport,HealthEvaluation, by="Id")

Identifying Daily Activities to Health Performance

n_distinct(DailyActivities$Date)

## [1] 31

The participants’ daily activities were tracked over 31 days, or about a month.

ActivityDate <- DailyActivities %>% distinct(Date)
ActivityDate$WeekDay <- weekdays(ActivityDate$Date)                             
DailyActivities <- merge(DailyActivities, ActivityDate, by= "Date")
DailyActivities <- mutate(subset(DailyActivities,select=c(2,1,16,3,4,5,6,7,8,9,10,11,12,13,14,15)))

ActiveViz <- ggplot(data=DailyActivities, aes(x=TotalSteps, y=Calories))+ 
  geom_point(color='coral1') + geom_smooth(method=lm, se=FALSE,color = 'black', linewidth=1)+     
  labs(title = 'Calories Burnt By Steps Taken')
ActiveViz

## `geom_smooth()` using formula = 'y ~ x'

There is also a positive correlation between steps taken and the amount of calories burnt by users. Overall, this can help encourage Bellabeat users to track their day-to-day steps and see their progress in calories burnt. Health should be the focal point for Bellabeat.

Heart Rate Per Second for each Participant

HeartDate <- HeartRateSeconds %>% distinct(Date)
HeartDate$Date = as.Date(strptime(HeartDate$Date, "%Y-%m-%d"))    
HeartRateSeconds$Date = as.Date(strptime(HeartRateSeconds$Date, "%Y-%m-%d"))
str(HeartRateSeconds)

## tibble [2,483,658 × 4] (S3: tbl_df/tbl/data.frame)
##  $ Id           : num [1:2483658] 2.02e+09 2.02e+09 2.02e+09 2.02e+09 2.02e+09 ...
##  $ Date         : Date[1:2483658], format: "2016-04-12" "2016-04-12" "2016-04-12" "2016-04-12" ...
##  $ Time         : chr [1:2483658] "07:21:00" "07:21:05" "07:21:10" "07:21:20" ...
##  $ RatePerSecond: num [1:2483658] 97 102 105 103 101 95 91 93 94 93 ...

HeartDate$WeekDay <- weekdays(HeartDate$Date)  # derive weekdays from the heart rate dates, not from SleepDate
HeartRateSeconds <- merge(HeartRateSeconds,HeartDate, by= "Date")
HeartRateSeconds <- mutate(subset(HeartRateSeconds, select=c(2,1,5,3,4)))

HeartRateMonitorViz <- ggplot(data=HeartRateSeconds)+
  geom_col(mapping=aes(x=factor(WeekDay, levels=c("Sunday","Monday","Tuesday", "Wednesday", "Thursday","Friday","Saturday")),
                       y=RatePerSecond), fill='coral1')+  # a constant fill is set outside aes()
  facet_wrap(~Id)+
  labs(title="Monitoring Heart Rate Per Second Throughout The Week",x="WeekDays")+
  theme(axis.text.x=element_text(angle=90))
plot(HeartRateMonitorViz)

Monitoring heart rate can help improve fitness levels. This chart shows the recorded heart rates throughout the week for every user with heart rate data. Continuing the trend of health, Bellabeat can address in its marketing the developing health problems that can be found by looking into heart rates.
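With roughly 2.5 million readings, a summary per participant and day can be easier to interpret than stacked columns. This minimal sketch uses only the six rows shown by `head(HeartRateSeconds)` in section 4.2. The summary column names (`Readings`, `MeanRate`, `MaxRate`) are hypothetical.

```r
library(dplyr)

# The six rows printed by head(HeartRateSeconds) in section 4.2
heart_sample <- tibble(
  Id = rep(2022484408, 6),
  Date = rep("2016-04-12", 6),
  Time = c("07:21:00", "07:21:05", "07:21:10", "07:21:20", "07:21:25", "07:22:05"),
  RatePerSecond = c(97, 102, 105, 103, 101, 95)
)

# One row per participant and day instead of one row per reading
heart_summary <- heart_sample %>%
  group_by(Id, Date) %>%
  summarise(
    Readings = n(),
    MeanRate = mean(RatePerSecond),
    MaxRate = max(RatePerSecond),
    .groups = "drop"
  )

print(heart_summary)
```

Plotting `MeanRate` by weekday would show a typical heart rate per day, whereas summing individual readings mostly reflects how many readings each day contains.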

Monitoring Inactivity

ggplot(data=DailyActivities)+
  geom_col(mapping=aes(x=SedentaryMinutes,y=factor(WeekDay,levels=c("Sunday","Monday","Tuesday", "Wednesday", "Thursday","Friday","Saturday"))),fill='coral1')+  # a constant fill is set outside aes()
  labs(title="Weekly Inactive Minutes",y='WeekDays')

Inactive minutes are highest from Tuesday to Thursday. These 3 days spike much higher than the rest of the week. Reminders based on inactivity patterns can go a long way toward encouraging consistent activity throughout the week. Bellabeat can send notifications when a user has been inactive for too long.
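A reminder rule of this kind could be prototyped directly on the daily table. This is a minimal sketch under stated assumptions: `toy_activity` is hypothetical data, and both the 1,200-minute threshold and the "at least half of tracked days" rule are illustrative values that do not come from the analysis.

```r
library(dplyr)

# Hypothetical daily sedentary minutes for two participants (values are made up)
toy_activity <- tibble(
  Id = c(1, 1, 1, 2, 2, 2),
  SedentaryMinutes = c(700, 1250, 1300, 900, 1000, 1210)
)

sedentary_threshold <- 1200  # assumed cut-off for a "highly sedentary" day

reminders <- toy_activity %>%
  group_by(Id) %>%
  summarise(
    DaysTracked = n(),
    HighSedentaryDays = sum(SedentaryMinutes > sedentary_threshold),
    .groups = "drop"
  ) %>%
  # Assumed rule: remind users who are highly sedentary on at least half their days
  mutate(SendReminder = HighSedentaryDays / DaysTracked >= 0.5)

print(reminders)
```

Basing the rule on a share of tracked days rather than a raw count keeps it comparable between users who wear the device for different numbers of days.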

InactiveMinutes <- DailyActivities %>% group_by(Id) %>% summarise(Inactivity=sum(SedentaryMinutes))
HealthEvaluation <- merge(HealthEvaluation, InactiveMinutes, by="Id")  
HealthEvaluation <- mutate(subset(HealthEvaluation,select=c(1,2,3,7,4,5,6)))
n_distinct(HealthEvaluation$Id)

## [1] 6

head(HealthEvaluation)

##           Id OverallMinutesAsleep OverallTimeInBed Inactivity TotalSteps TotalBurntCalories HealthStatus
## 1 1503960366                 9007             9580      26293     374546              56287       Normal
## 2 1927972279                 2085             2189      40840      28400              67347       AtRisk
## 3 4319703577                12393            13051      22810     209464              61904   OverWeight
## 4 4558609924                  638              700      33902     237909              63046   OverWeight
## 5 5577150313                11232            11976      22633     248519             100816   OverWeight
## 6 6962181067                13888            14450      20532     303621              61461       Normal

To continue the health evaluation, I added total sedentary minutes as well, which brought the table down to the 6 users who logged data in every category.

6 Share Phase

6.1 What Are Some Trends in Smart Device Usage?

User Log-Ins: Overall, there are 33 participants in this data set.

8 logged their weight, 14 logged heart rate, and 24 logged sleep time. All 33 logged total steps, calories burnt, and daily activities.

Maintaining Fitness Levels: Of the 8 users who logged their weight, BMI told me where they stand in health. Based on BMI, 3 were at a healthy weight, 4 were overweight, 1 was at risk (the danger zone), and 0 were underweight. This tells me that a variety of users rely on their smart device as a way to keep track of activeness.

When Users Are Most Active: On average, users are most active between 5pm and 7pm over a 24-hour span. This tells me they are most active after work hours, for example while walking their dog or taking a jog through the park. On average, participants take 599.20 steps at the peak at 7pm, followed by a decrease of 39% from 8pm to 9pm.

When Users Are Most Inactive: Assuming Bellabeat users are also working women, there is a clear burnout on Wednesdays. This day has the most total minutes asleep across users, with the weekend second. Inactive time is shown through sedentary minutes, which again are highest during the middle of the week. Tuesday has the most sedentary minutes, with 153,119.

Sleep Levels and Heart Rate Activity: There is a strong relationship between minutes in bed and minutes asleep. They complement one another with a positive correlation. Tracking sleep cycles gives Bellabeat a better understanding of how active users will be that day. Wednesday has the most minutes asleep, with 28,689. A lack of sleep reduces activeness, so encouraging an 8-hour sleep schedule is a must. Monitoring heart rate is a huge help when it comes to keeping track of activity. Heart rates vary between users, and some push harder than others. The harder you work, the better your heart rate. It can also help with the results of cardiac stress tests or with identifying bigger health issues.

6.2 How could these trends apply to Bellabeat Customers?

Why Fitness Level is Important to our Customers: Although terms such as “overweight” and “at risk” may be discouraging, they can be a foundation for helping users keep control of their fitness levels and their progress. Instead of saying “at risk”, give insight into what can be improved, such as daily activeness. Encourage burning calories, and monitor heart rate to see how hard they’re working.

Why Activity Levels are Important to our Customers: By keeping track of these patterns, we can help users stay consistently active. Send notifications when they are walking less than usual, and congratulate them when they hit a new record in walking distance for the day. There is a clear correlation between calories burnt and steps taken. Calories burnt is a clear motivation for users, a reason to continue pushing for a healthier and more active lifestyle.

Why Inactivity is Important to Prevent for our Customers: Consistent reminders can help reduce inactivity as well. Increase users’ daily log-ins by sending notifications when they have been away for too long, or send an evaluation of their activity levels compared with previous weeks. Keeping our users updated and engaged will reduce not only inactivity but also the number of users who abandon the Bellabeat app entirely.

Why Monitoring Sleep and Heart Rate is Important: A lack of sleep reduces activeness, so encouraging an 8-hour sleep schedule is a must. Monitoring heart rate is a huge help when it comes to keeping track of activity. Heart rates vary between users, and some push harder than others. The harder you work, the better your heart rate. It can also help with the results of cardiac stress tests or with identifying bigger health issues.

6.3 How Could These Trends Help Influence Bellabeat Marketing Strategy?

Knowing Our Audience: Bellabeat’s audience is health-focused women. Our main goal should be to market towards women in need of a healthier lifestyle. Everyone wants to change, but not everyone knows how. We already have a variety of users at different fitness levels; the idea is to maintain consistency, reduce calories, and encourage better habits. Create social media pages, and increase engagement by showing exactly what our products have to offer. Helping our audience understand how what we track benefits them will increase interest in Bellabeat memberships. I also recommend a free one-month trial to bring in users. After 31 days, they will have a health evaluation that shows how far they’ve come thanks to the service’s guidance in nutrition: how much weight they’ve lost, how many steps they took, their sleep cycles, and what their next goal will be if they continue to subscribe to the Bellabeat membership.

7 Act Phase

7.1 Conclusion

Bellabeat’s main focus should be to create more engagement with its users by providing reminders, notifications, and health evaluations. It is easy to gain a following, but the idea is to maintain one as well and prevent any decrease in monthly users. That’s why it is important to inform customers how our services will benefit them in the long run. Users need constant motivation to improve, even when results show a decline in activity. Small things, such as congratulating a customer for walking a farther distance today than yesterday, will go a long way. Giving diet plans based on their BMI or health status can help users be more knowledgeable about their health and what they eat. Reassurance prevents giving up, and that is what Bellabeat should strive for.

7.2 Recommendations

Social Media Page
Free 1 month trial
Notifications, and Reminders
Weekly or Monthly Health Evaluation
Dietary and Nutritional Advice
Adding more tracking features such as a timer, or creating schedules

BellabeatReport

Juan Quintero

3/6/2023