Bellabeat is a high-tech company, founded by Urska Srsen and Sando Mur, that manufactures health-focused smart products. Its products aim to inform and inspire women by collecting data on activity, sleep, stress, and reproductive health. Since its founding in 2013, Bellabeat has grown rapidly and positioned itself as a tech-driven wellness company for women.
The company has five main products in its lineup:
Bellabeat App: Provides users with comprehensive health data that connects to their line of smart wellness products.
Leaf: Basic wellness tracker that can be worn as a bracelet, necklace, or clip. The Leaf connects to the Bellabeat app to track activity, sleep, and stress.
Time: Wellness watch that connects to the Bellabeat app to track activity, sleep, and stress.
Spring: Water bottle that tracks daily water intake. Spring connects to the Bellabeat app to track hydration levels.
Bellabeat Membership: Subscription-based membership that gives users 24/7 access to fully personalized guidance on nutrition, activity, sleep, health, beauty, and mindfulness based on lifestyle and goals.
The task is to analyze smart device usage data to gain insight into how consumers use non-Bellabeat smart devices. These insights will be used to answer important questions about Bellabeat users.
The data for this analysis was collected through Fitbit. Thirty eligible Fitbit users consented to the submission of personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring. Because the data was submitted voluntarily, sampling bias is likely. Depending on the population being studied, a sample size of 30 may not be sufficient to test a hypothesis. The credibility of the data is assessed below:
Reliability: The data is considered reliable. It was collected through “smart” wearables that feed information directly to storage on each participant's personal device.
Original: The data is original, as it was collected directly from individuals. Each individual has a unique ID.
Comprehensive: An area of concern is that the wearables are optional to wear. There may be times when a subject does not wear the device, which will lead to incomplete data.
Current: At the time of this analysis, the data is 7 years old.
Cited: The data is cited. If needed, Fitbit would be able to locate each source of data.
The data is published and can be found here.
The following data sets were used in the study:
Daily Activity
Daily Sleep
Calories per Hour
Steps per Hour
To continue the analysis, the necessary packages need to be installed and loaded, followed by the required data sets.
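A minimal sketch of this step follows. The folder name and the file names are assumptions (they follow the naming of the public Fitbit export and may differ from the files actually used here), and `read_if_present()` is a hypothetical helper that returns `NULL` instead of failing when a file is missing.

```r
# Packages used in this analysis (install once with install.packages()).
library(dplyr)  # rename(), select(), glimpse(), %>%
library(tidyr)  # gather()
library(readr)  # read_csv()

# Hypothetical folder; adjust to wherever the CSV files are stored.
data_dir <- "data"

# Hypothetical helper: read a CSV if it exists, otherwise return NULL.
read_if_present <- function(file_name, dir = data_dir) {
  path <- file.path(dir, file_name)
  if (!file.exists(path)) {
    message("File not found, skipping: ", path)
    return(NULL)
  }
  read_csv(path, show_col_types = FALSE)
}

# Assumed file names for the four data sets listed above.
activity_daily  <- read_if_present("dailyActivity_merged.csv")
sleep_day       <- read_if_present("sleepDay_merged.csv")
calories_hourly <- read_if_present("hourlyCalories_merged.csv")
steps_hourly    <- read_if_present("hourlySteps_merged.csv")
```

Any table that comes back as `NULL` points to a path or file name that needs adjusting before the rest of the analysis is run.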
To begin, it is beneficial to have an idea of how the tables are formatted.
# A glimpse into the data.
glimpse(activity_daily)
## Rows: 940
## Columns: 8
## $ Id <dbl> 1503960366, 1624580081, 1844505072, 1927972279, 2…
## $ ActivityDate <chr> "5/12/2016", "5/12/2016", "5/12/2016", "5/12/2016…
## $ TotalSteps <dbl> 0, 2971, 0, 0, 9117, 8891, 2661, 7566, 590, 17, 3…
## $ VeryActiveMinutes <dbl> 0, 0, 0, 0, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, …
## $ FairlyActiveMinutes <dbl> 0, 0, 0, 0, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 8, …
## $ LightlyActiveMinutes <dbl> 0, 107, 0, 0, 236, 343, 128, 268, 21, 2, 108, 58,…
## $ SedentaryMinutes <dbl> 1440, 890, 711, 966, 728, 330, 830, 720, 721, 0, …
## $ Calories <dbl> 0, 1002, 665, 1383, 1853, 1364, 1125, 1431, 1120,…
glimpse(sleep_day)
## Rows: 413
## Columns: 4
## $ Id <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 150…
## $ SleepDay <chr> "4/12/2016", "4/13/2016", "4/15/2016", "4/16/2016",…
## $ TotalMinutesAsleep <dbl> 327, 384, 412, 340, 700, 304, 360, 325, 361, 430, 2…
## $ TotalTimeInBed <dbl> 346, 407, 442, 367, 712, 320, 377, 364, 384, 449, 3…
glimpse(calories_hourly)
## Rows: 22,099
## Columns: 4
## $ Id <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 150396036…
## $ ActivityDate <chr> "4/12/2016", "4/12/2016", "4/12/2016", "4/12/2016", "4/12…
## $ ActivityHour <time> 00:00:00, 01:00:00, 02:00:00, 03:00:00, 04:00:00, 05:00:…
## $ Calories <dbl> 81, 61, 59, 47, 48, 48, 48, 47, 68, 141, 99, 76, 73, 66, …
glimpse(steps_hourly)
## Rows: 22,099
## Columns: 4
## $ Id <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 150396036…
## $ ActivityDate <chr> "4/12/2016", "4/12/2016", "4/12/2016", "4/12/2016", "4/12…
## $ ActivityHour <time> 00:00:00, 01:00:00, 02:00:00, 03:00:00, 04:00:00, 05:00:…
## $ StepTotal <dbl> 373, 160, 151, 0, 0, 0, 0, 0, 250, 1864, 676, 360, 253, 2…
With an understanding of the structure, the next step is to remove any duplicates. This was completed using Excel. Three duplicates were found and removed in the Heartrate data. Redundant columns were also dropped from some of the data sets and are not included here.
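The duplicate check was done in Excel, but the same step could be sketched in R as follows. The rows below are made up for illustration, and the column names (`Id`, `Time`, `Value`) are an assumption about the layout of the Heartrate table.

```r
library(dplyr)

# Made-up heart rate rows for illustration only; rows 2 and 3 are identical.
heartrate_example <- tibble(
  Id    = c(1503960366, 1503960366, 1503960366, 1624580081),
  Time  = c("4/12/2016 7:21:00 AM", "4/12/2016 7:21:05 AM",
            "4/12/2016 7:21:05 AM", "4/12/2016 7:22:00 AM"),
  Value = c(97, 102, 102, 88)
)

# Count fully duplicated rows before removing them.
n_duplicates <- sum(duplicated(heartrate_example))

# distinct() keeps the first occurrence of each unique row.
heartrate_clean <- distinct(heartrate_example)
```

Doing this in R rather than Excel keeps the cleaning step in the same script as the analysis, so it can be rerun if the data changes.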
To continue, the column names and values need to be addressed.
# Rename the columns that will be used for analysis.
activity_daily <- activity_daily %>%
  rename(
    participant = Id,
    date = ActivityDate,
    steps = TotalSteps,
    very_active_minutes = VeryActiveMinutes,
    moderate_active_minutes = FairlyActiveMinutes,
    light_active_minutes = LightlyActiveMinutes,
    sed_active_minutes = SedentaryMinutes,
    calories = Calories
  )
sleep_day <- sleep_day %>%
  rename(
    participant = Id,
    date = SleepDay,
    sleep_minutes = TotalMinutesAsleep,
    time_in_bed = TotalTimeInBed
  )
calories_hourly <- calories_hourly %>%
  rename(
    participant = Id,
    date = ActivityDate,
    hour = ActivityHour,
    calories = Calories
  )
steps_hourly <- steps_hourly %>%
  rename(
    participant = Id,
    date = ActivityDate,
    hour = ActivityHour,
    steps = StepTotal
  )
# Change Date columns so that the values are recognized as dates.
activity_daily$date <- as.Date(activity_daily$date, format = "%m/%d/%Y")
sleep_day$date <- as.Date(sleep_day$date, format = "%m/%d/%Y")
calories_hourly$date <- as.Date(calories_hourly$date, format = "%m/%d/%Y")
steps_hourly$date <- as.Date(steps_hourly$date, format = "%m/%d/%Y")
# Change ID column to characters so they are representative of individuals as opposed to values.
activity_daily$participant <- as.character(activity_daily$participant)
sleep_day$participant <- as.character(sleep_day$participant)
calories_hourly$participant <- as.character(calories_hourly$participant)
steps_hourly$participant <- as.character(steps_hourly$participant)
# Merge daily sleep with daily activity, and hourly calories with hourly steps, to gain insight into their relationships.
# merge() keeps only the rows whose key columns match in both tables.
activity_sleep_daily <- merge(sleep_day, activity_daily, by = c("participant", "date"))
calories_step_hourly <- merge(calories_hourly, steps_hourly, by = c("participant", "date", "hour"))
# Make the merged daily table long. This will help dive deeper into the different activity levels.
activity_sleep_daily_long <- activity_sleep_daily %>%
  gather(activity_level, value, very_active_minutes:sed_active_minutes)
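The reshaping step uses `gather()`, which tidyr has superseded with `pivot_longer()`. A minimal sketch of the equivalent call is shown below on made-up rows that only borrow the column names from `activity_sleep_daily`; it is an alternative, not the method used in this analysis.

```r
library(dplyr)
library(tidyr)

# Made-up rows for illustration only.
example_wide <- tibble(
  participant = c("A", "B"),
  very_active_minutes = c(16, 0),
  moderate_active_minutes = c(16, 8),
  light_active_minutes = c(236, 343),
  sed_active_minutes = c(728, 330)
)

# One row per participant and activity level instead of one column per level.
example_long <- example_wide %>%
  pivot_longer(
    cols = very_active_minutes:sed_active_minutes,
    names_to = "activity_level",
    values_to = "value"
  )
```

Both functions return the same values. Only the row order differs: `gather()` stacks the table column by column, while `pivot_longer()` keeps the rows of each participant together.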
Begin the analysis by calculating summary statistics (the five-number summary plus the mean) for the Daily Activity data set. Daily steps, calories, and active minutes are summarized.
activity_daily %>%
  select(steps, calories, very_active_minutes, moderate_active_minutes, light_active_minutes, sed_active_minutes) %>%
  summary()
## steps calories very_active_minutes moderate_active_minutes
## Min. : 0 Min. : 0 Min. : 0.00 Min. : 0.00
## 1st Qu.: 3790 1st Qu.:1828 1st Qu.: 0.00 1st Qu.: 0.00
## Median : 7406 Median :2134 Median : 4.00 Median : 6.00
## Mean : 7638 Mean :2304 Mean : 21.16 Mean : 13.56
## 3rd Qu.:10727 3rd Qu.:2793 3rd Qu.: 32.00 3rd Qu.: 19.00
## Max. :36019 Max. :4900 Max. :210.00 Max. :143.00
## light_active_minutes sed_active_minutes
## Min. : 0.0 Min. : 0.0
## 1st Qu.:127.0 1st Qu.: 729.8
## Median :199.0 Median :1057.5
## Mean :192.8 Mean : 991.2
## 3rd Qu.:264.0 3rd Qu.:1229.5
## Max. :518.0 Max. :1440.0
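The minimum of 0 steps and the maximum of 1440 sedentary minutes (a full day) suggest that some days were logged while the device was not being worn. A minimal sketch for flagging such days is shown below on made-up rows. The rule itself (zero steps means the device was not worn) is an assumption, not something the data set defines.

```r
library(dplyr)

# Made-up daily rows for illustration only.
example_daily <- tibble(
  participant = c("A", "A", "B", "B"),
  steps = c(0, 9117, 2971, 0),
  sed_active_minutes = c(1440, 728, 890, 1440)
)

# Assumed rule: a day with zero steps is treated as a likely non-wear day.
example_flagged <- example_daily %>%
  mutate(likely_not_worn = steps == 0)

# Share of logged days per participant that look like non-wear days.
non_wear_share <- example_flagged %>%
  group_by(participant) %>%
  summarise(share_not_worn = mean(likely_not_worn), .groups = "drop")
```

Applied to `activity_daily`, a check like this would show how often the tracker was left off, which bears directly on the question of wearability.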
Next, the summary statistics of the daily sleep pattern.
sleep_day %>%
  select(sleep_minutes, time_in_bed) %>%
  summary()
## sleep_minutes time_in_bed
## Min. : 58.0 Min. : 61.0
## 1st Qu.:361.0 1st Qu.:403.0
## Median :433.0 Median :463.0
## Mean :419.5 Mean :458.6
## 3rd Qu.:490.0 3rd Qu.:526.0
## Max. :796.0 Max. :961.0
Finally, the summary statistics for calories and steps by the hour.
calories_step_hourly %>%
  select(steps, calories) %>%
  summary()
## steps calories
## Min. : 0.0 Min. : 42.00
## 1st Qu.: 0.0 1st Qu.: 63.00
## Median : 40.0 Median : 83.00
## Mean : 320.2 Mean : 97.39
## 3rd Qu.: 357.0 3rd Qu.:108.00
## Max. :10554.0 Max. :948.00
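- Daily Steps and Calories seem to be roughly normally distributed (the mean and median are close).
- Very Active and Moderately Active minutes seem to be skewed to the right (the mean is well above the median).
- Lightly Active and Sedentary minutes seem to be roughly normally distributed, although the Sedentary mean (991.2) sits below the median (1057.5), which hints at a slight left skew.
- Minutes Asleep and Time in Bed seem to be roughly normally distributed.
- Calories per Hour and Steps per Hour seem to be skewed to the right.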
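These observations rest on comparing the mean with the median. A rough way to make that comparison explicit is sketched below: the gap between mean and median, scaled by the interquartile range. The helper `mean_median_gap()` and the example values are hypothetical, and the measure is only a heuristic; it cannot confirm that a variable is normally distributed.

```r
# Made-up hourly step counts for illustration only:
# many idle hours and a few very active ones.
example_steps <- c(0, 0, 0, 0, 12, 40, 55, 120, 357, 900, 1864)

# Hypothetical helper: (mean - median) / IQR.
# Values near 0 suggest symmetry, clearly positive values suggest
# right skew, and clearly negative values suggest left skew.
mean_median_gap <- function(x) {
  (mean(x, na.rm = TRUE) - median(x, na.rm = TRUE)) / IQR(x, na.rm = TRUE)
}

gap <- mean_median_gap(example_steps)
```

Using the summary values reported above, daily steps give (7638 - 7406) / (10727 - 3790) ≈ 0.03, while hourly steps give (320.2 - 40) / (357 - 0) ≈ 0.78, which matches the pattern in the list. A histogram, for example `hist(activity_daily$steps)`, would give a more direct view of each distribution.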
Bellabeat intends to become a strong competitor in the wearable health tracker space. To do this, Bellabeat should focus on areas where Fitbit falls short. It seems that Fitbit has issues with wearability. Factors that reduce wearability include:
- Needing to be charged for long periods of time
- Lack of style in device design
- Sensitivity to damage
Bellabeat should design a wearable fitness tracker that offers:
- Minimal charge time
- A style that works as a wardrobe staple
- High durability
A reduction in charge time will allow the device to be worn more consistently throughout the day. A design that can be worn with any outfit will allow users to wear the device on any occasion without causing distraction. A highly durable device can be worn through all activities, ranging from sedentary to highly active.