Bellabeat Case Study

Bellabeat Case Study for Data Analytics Certification

Scenario: You are a junior data analyst working on the marketing analyst team at Bellabeat, a high-tech manufacturer of health-focused products for women. Bellabeat is a successful small company, but they have the potential to become a larger player in the global smart device market. Urška Sršen, co-founder and Chief Creative Officer of Bellabeat, believes that analyzing smart device fitness data could help unlock new growth opportunities for the company. You have been asked to focus on one of Bellabeat’s products and analyze smart device data to gain insight into how consumers are using their smart devices. The insights you discover will then help guide marketing strategy for the company. You will present your analysis to the Bellabeat executive team along with your high-level recommendations for Bellabeat’s marketing strategy.

Data Analysis Steps

– Ask

– Prepare

– Process

– Analyze

– Share

– Act

Ask

These are the questions our stakeholders want me to answer:

What are some trends in smart device usage?
How could these trends apply to Bellabeat customers?
How could these trends help influence Bellabeat marketing strategy?

Prepare

On the prepare stage of the analysis, I am going to be using the data provided by the company. Part of the preparation is to determine the credibility of the data provided and choosing which data sets I would like to use for the analysis.

Determining the credibility:

To determine the credibility of the data, I will be using the ROCCC concept. These are 5 criteria a data should meet to assess whether the data is good or bad. ROCCC stands for reliability, original, comprehensive, current and cited.

Reliability: The data is from 33 Fitbit users who consented to the submission of personal data tracker generated from the Amazon Mechanical Turk. It is a small and dated data set which means that the analysis is outdated may be biased.

Original: The data was collected from 33 Fitbit devices with the consent of the users. The users understood how the data was collected, stored, managed and protected.

Comprehensive: The data contains a lot of information, and it needs to be cleaned by removing the nulls, duplicates and 0 values.

Current: The data set was collected from 04/12/2016 – 05/12/2016 which means that the data is not current.

Cited: Full information in regards to the source of the data https://www.kaggle.com/datasets/arashnic/fitbit.

Upon examining the data, I decided that I will be focusing on 3 data sets out of the 18 provided. This case study will be focused on the weight log info, daily activity merged, sleep day merged.

Process

I will be using Microsoft Excel and RStudio in this case study.

1.) Upon examining the data I chose to work with, I decided to pre-format and pre-clean the data on Excel before working on it in RStudio. I decided to format all the dates into short dates because I will not be needing the hours that are in the date column.

2.) I started by installing and loading the necessary packages I need to process the data.

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.1     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(here)

## here() starts at /cloud/project

library(skimr)
library(janitor)

## 
## Attaching package: 'janitor'
## 
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test

library(dplyr)
library(lubridate)
library(ggplot2)
library(readr)
library(scales)

## 
## Attaching package: 'scales'
## 
## The following object is masked from 'package:purrr':
## 
##     discard
## 
## The following object is masked from 'package:readr':
## 
##     col_factor

3.) I imported the data sets I pre-cleaned and pre-formatted.

daily_activity <- read_csv("/cloud/project/daily_activity_merged_new_date.csv")

## Rows: 940 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): ActivityDate
## dbl (14): Id, TotalSteps, TotalDistance, TrackerDistance, LoggedActivitiesDi...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

sleep_activity <- read_csv("/cloud/project/sleep_day_merged_new_date.csv")

## Rows: 413 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): SleepDay
## dbl (4): Id, TotalSleepRecords, TotalMinutesAsleep, TotalTimeInBed
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

weight_info <- read_csv("/cloud/project/weight_log_info_merged_new_date.csv")

## Rows: 67 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Date
## dbl (6): Id, WeightKg, WeightPounds, Fat, BMI, LogId
## lgl (1): IsManualReport
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

4.) I will now start cleaning the data in RStudio.

Using the colnames(), I checked if the name formats are uniform.

colnames(daily_activity)

##  [1] "Id"                       "ActivityDate"            
##  [3] "TotalSteps"               "TotalDistance"           
##  [5] "TrackerDistance"          "LoggedActivitiesDistance"
##  [7] "VeryActiveDistance"       "ModeratelyActiveDistance"
##  [9] "LightActiveDistance"      "SedentaryActiveDistance" 
## [11] "VeryActiveMinutes"        "FairlyActiveMinutes"     
## [13] "LightlyActiveMinutes"     "SedentaryMinutes"        
## [15] "Calories"

colnames(sleep_activity)

## [1] "Id"                 "SleepDay"           "TotalSleepRecords" 
## [4] "TotalMinutesAsleep" "TotalTimeInBed"

colnames(weight_info)

## [1] "Id"             "Date"           "WeightKg"       "WeightPounds"  
## [5] "Fat"            "BMI"            "IsManualReport" "LogId"

Upon examining the data, I noticed that the name of the columns are not uniform. I used the clean_names() to replace spaces, special characters, and uppercase letters with underscores and lowercase letters, which makes the column names easier to work with in R.

sleep_activity <- clean_names(sleep_activity)
daily_activity <- clean_names(daily_activity)
weight_info <- clean_names(weight_info)

Upon further examination, I noticed that the date columns of the different data sets have different names. To make the merging of the data sets easier and cleaner, I will use the rename() to have a uniform name title for the date column for all three data sets. I decided to name the date column as “activity_date”.

sleep_activity <- sleep_activity %>% rename(activity_date = sleep_day)
daily_activity <- daily_activity %>%  rename(activity_date = activity_date)
weight_info <- weight_info %>% rename(activity_date = date)

I double checked if the all the column names are uniform by using the colnames() again.

colnames(daily_activity)

##  [1] "id"                         "activity_date"             
##  [3] "total_steps"                "total_distance"            
##  [5] "tracker_distance"           "logged_activities_distance"
##  [7] "very_active_distance"       "moderately_active_distance"
##  [9] "light_active_distance"      "sedentary_active_distance" 
## [11] "very_active_minutes"        "fairly_active_minutes"     
## [13] "lightly_active_minutes"     "sedentary_minutes"         
## [15] "calories"

colnames(sleep_activity)

## [1] "id"                   "activity_date"        "total_sleep_records" 
## [4] "total_minutes_asleep" "total_time_in_bed"

colnames(weight_info)

## [1] "id"               "activity_date"    "weight_kg"        "weight_pounds"   
## [5] "fat"              "bmi"              "is_manual_report" "log_id"

I double-checked the output summary of my data set using the str().

str(daily_activity)

## spc_tbl_ [940 × 15] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ id                        : num [1:940] 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
##  $ activity_date             : chr [1:940] "4/12/2016" "4/13/2016" "4/14/2016" "4/15/2016" ...
##  $ total_steps               : num [1:940] 13162 10735 10460 9762 12669 ...
##  $ total_distance            : num [1:940] 8.5 6.97 6.74 6.28 8.16 ...
##  $ tracker_distance          : num [1:940] 8.5 6.97 6.74 6.28 8.16 ...
##  $ logged_activities_distance: num [1:940] 0 0 0 0 0 0 0 0 0 0 ...
##  $ very_active_distance      : num [1:940] 1.88 1.57 2.44 2.14 2.71 ...
##  $ moderately_active_distance: num [1:940] 0.55 0.69 0.4 1.26 0.41 ...
##  $ light_active_distance     : num [1:940] 6.06 4.71 3.91 2.83 5.04 ...
##  $ sedentary_active_distance : num [1:940] 0 0 0 0 0 0 0 0 0 0 ...
##  $ very_active_minutes       : num [1:940] 25 21 30 29 36 38 42 50 28 19 ...
##  $ fairly_active_minutes     : num [1:940] 13 19 11 34 10 20 16 31 12 8 ...
##  $ lightly_active_minutes    : num [1:940] 328 217 181 209 221 164 233 264 205 211 ...
##  $ sedentary_minutes         : num [1:940] 728 776 1218 726 773 ...
##  $ calories                  : num [1:940] 1985 1797 1776 1745 1863 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   Id = col_double(),
##   ..   ActivityDate = col_character(),
##   ..   TotalSteps = col_double(),
##   ..   TotalDistance = col_double(),
##   ..   TrackerDistance = col_double(),
##   ..   LoggedActivitiesDistance = col_double(),
##   ..   VeryActiveDistance = col_double(),
##   ..   ModeratelyActiveDistance = col_double(),
##   ..   LightActiveDistance = col_double(),
##   ..   SedentaryActiveDistance = col_double(),
##   ..   VeryActiveMinutes = col_double(),
##   ..   FairlyActiveMinutes = col_double(),
##   ..   LightlyActiveMinutes = col_double(),
##   ..   SedentaryMinutes = col_double(),
##   ..   Calories = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>

str(sleep_activity)

## spc_tbl_ [413 × 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ id                  : num [1:413] 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
##  $ activity_date       : chr [1:413] "4/12/2016" "4/13/2016" "4/15/2016" "4/16/2016" ...
##  $ total_sleep_records : num [1:413] 1 2 1 2 1 1 1 1 1 1 ...
##  $ total_minutes_asleep: num [1:413] 327 384 412 340 700 304 360 325 361 430 ...
##  $ total_time_in_bed   : num [1:413] 346 407 442 367 712 320 377 364 384 449 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   Id = col_double(),
##   ..   SleepDay = col_character(),
##   ..   TotalSleepRecords = col_double(),
##   ..   TotalMinutesAsleep = col_double(),
##   ..   TotalTimeInBed = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>

str(weight_info)

## spc_tbl_ [67 × 8] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ id              : num [1:67] 1.50e+09 1.50e+09 1.93e+09 2.87e+09 2.87e+09 ...
##  $ activity_date   : chr [1:67] "5/2/2016" "5/3/2016" "4/13/2016" "4/21/2016" ...
##  $ weight_kg       : num [1:67] 52.6 52.6 133.5 56.7 57.3 ...
##  $ weight_pounds   : num [1:67] 116 116 294 125 126 ...
##  $ fat             : num [1:67] 22 NA NA NA NA 25 NA NA NA NA ...
##  $ bmi             : num [1:67] 22.6 22.6 47.5 21.5 21.7 ...
##  $ is_manual_report: logi [1:67] TRUE TRUE FALSE TRUE TRUE TRUE ...
##  $ log_id          : num [1:67] 1.46e+12 1.46e+12 1.46e+12 1.46e+12 1.46e+12 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   Id = col_double(),
##   ..   Date = col_character(),
##   ..   WeightKg = col_double(),
##   ..   WeightPounds = col_double(),
##   ..   Fat = col_double(),
##   ..   BMI = col_double(),
##   ..   IsManualReport = col_logical(),
##   ..   LogId = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>

Upon examination of the data, I noticed that the activity_date column of the different data sets are in character format. Therefore I need to reformat them into date formats by using the mdy() that is part of the lubridate package.I also used the head() to double-check if the character format on my activity_date column was transformed into date format.

daily_activity$activity_date <- mdy(daily_activity$activity_date)
head(daily_activity)

## # A tibble: 6 × 15
##           id activity_date total_steps total_distance tracker_distance
##        <dbl> <date>              <dbl>          <dbl>            <dbl>
## 1 1503960366 2016-04-12          13162           8.5              8.5 
## 2 1503960366 2016-04-13          10735           6.97             6.97
## 3 1503960366 2016-04-14          10460           6.74             6.74
## 4 1503960366 2016-04-15           9762           6.28             6.28
## 5 1503960366 2016-04-16          12669           8.16             8.16
## 6 1503960366 2016-04-17           9705           6.48             6.48
## # ℹ 10 more variables: logged_activities_distance <dbl>,
## #   very_active_distance <dbl>, moderately_active_distance <dbl>,
## #   light_active_distance <dbl>, sedentary_active_distance <dbl>,
## #   very_active_minutes <dbl>, fairly_active_minutes <dbl>,
## #   lightly_active_minutes <dbl>, sedentary_minutes <dbl>, calories <dbl>

sleep_activity$activity_date <- mdy(sleep_activity$activity_date)
head(sleep_activity)

## # A tibble: 6 × 5
##           id activity_date total_sleep_records total_minutes_asleep
##        <dbl> <date>                      <dbl>                <dbl>
## 1 1503960366 2016-04-12                      1                  327
## 2 1503960366 2016-04-13                      2                  384
## 3 1503960366 2016-04-15                      1                  412
## 4 1503960366 2016-04-16                      2                  340
## 5 1503960366 2016-04-17                      1                  700
## 6 1503960366 2016-04-19                      1                  304
## # ℹ 1 more variable: total_time_in_bed <dbl>

weight_info$activity_date <- mdy(weight_info$activity_date)
head(weight_info)

## # A tibble: 6 × 8
##           id activity_date weight_kg weight_pounds   fat   bmi is_manual_report
##        <dbl> <date>            <dbl>         <dbl> <dbl> <dbl> <lgl>           
## 1 1503960366 2016-05-02         52.6          116.    22  22.6 TRUE            
## 2 1503960366 2016-05-03         52.6          116.    NA  22.6 TRUE            
## 3 1927972279 2016-04-13        134.           294.    NA  47.5 FALSE           
## 4 2873212765 2016-04-21         56.7          125.    NA  21.5 TRUE            
## 5 2873212765 2016-05-12         57.3          126.    NA  21.7 TRUE            
## 6 4319703577 2016-04-17         72.4          160.    25  27.5 TRUE            
## # ℹ 1 more variable: log_id <dbl>

I identified and verified each number of data frame by id using the n_unique() using the id column of each data sets.

n_unique(daily_activity$id)

## [1] 33

n_unique(sleep_activity$id)

## [1] 24

n_unique(weight_info$id)

## [1] 8

I tried to identify and verify if there are any duplicates in the data sets being used by using the sum() and duplicated()

sum(duplicated(daily_activity))

## [1] 0

sum(duplicated(sleep_activity))

## [1] 3

sum(duplicated(weight_info))

## [1] 0

The function for the sleep_activity data frame returned that there are 3 duplicates. To verify if they are actual duplicates, meaning all the columns and rows had the same entry, I will be using the get_dupes () which is a part of the janitor package.

get_dupes(sleep_activity)

## No variable names specified - using all columns.

## # A tibble: 6 × 6
##           id activity_date total_sleep_records total_minutes_asleep
##        <dbl> <date>                      <dbl>                <dbl>
## 1 4388161847 2016-05-05                      1                  471
## 2 4388161847 2016-05-05                      1                  471
## 3 4702921684 2016-05-07                      1                  520
## 4 4702921684 2016-05-07                      1                  520
## 5 8378563200 2016-04-25                      1                  388
## 6 8378563200 2016-04-25                      1                  388
## # ℹ 2 more variables: total_time_in_bed <dbl>, dupe_count <int>

After verifying that the data is an actual duplicate, I will be using the distinct() to remove all duplicates and double-checking them by using the sum() and duplicated function.

sleep_activity <- distinct(sleep_activity, .keep_all = TRUE)
sum(duplicated(sleep_activity))

## [1] 0

Analyze and Share

It is my goal to answer the questions of the stakeholders so I can help improve the marketing strategy of the company. I will try to find the correlation of their usage by using the data we cleaned during the processing stage. Our goal is to examine how the daily activities affect the weight of the users and how we can improve our marketing strategy to sell our products more effectively. I merged all three data sets to summarize all the data provided.

summary_sleep_daily_activity <- merge(x=daily_activity, y=sleep_activity, by=c("id", "activity_date"))
view(summary_sleep_daily_activity)
 
total_activity_summary <- merge(x=summary_sleep_daily_activity, y=weight_info, by=c("id", "activity_date"))
view(total_activity_summary)

We are going to make a visualization on the difference and correlation of active minutes throughout the day and the sedentary minutes throughout the day.

average_fairly_active_minutes <- mean(daily_activity$fairly_active_minutes)
average_lightly_active_minutes <- mean(daily_activity$lightly_active_minutes)
average_very_active_minutes <- mean(daily_activity$very_active_minutes)
average_sedentary_minutes <- mean(daily_activity$sedentary_minutes)

active_minutes <- c(average_fairly_active_minutes, average_lightly_active_minutes, average_very_active_minutes, average_sedentary_minutes)

barplot(active_minutes, names.arg = c("Fairly_Active", "Lightly_Active", "Very_Active", "Sedentary"), xlab= "Activity", ylab= "Average Minutes", main= "Average Active Minutes vs Average Sedentary Minutes", col = rainbow(4))

Research shows that a very common use of any smartwatch or fitness tracker is to monitor fitness and health. There are various sensors developed by the phone manufacturers that measure heart rate, breathing rate, steps taken, and calories burned. Furthermore, research shows that people use their fitness tracker to count how many calories they are burning while active and while resting. Calories are directly linked to health, mostly body weight. When we eat and drink more calories than we use up, our bodies store the excess as body fat. If this continues over time, we may put on weight. Click here to read more about it.

The next graph shows the difference of sedentary minutes and the sleep minutes.

average_sedentary_minutes <- mean(daily_activity$sedentary_minutes)
average_sleep <- mean(sleep_activity$total_minutes_asleep)

resting_minutes <- c(average_sedentary_minutes, average_sleep)

barplot(resting_minutes, names.arg = c("average_sedentary_minutes", "average_sleep"), xlab = "Activity", ylab = "Average Minutes", main = "Sedentary Minutes vs Sleep Minutes", col = rainbow(2))

The bar graph shows that the average human being spends a lot of time in sedentary and a lot less time sleeping which should not be the case. According to the sleepfoundation.org, most adults require between seven and nine hours of nightly sleep. The bar graph shows that the average minutes of sleep a person gets is 419 (6.9 hours) which is just below the required sleep to be healthy. Sleep is an essential function that allows our body and mind to recharge, leaving us refreshed and alert when we wake up. Healthy sleep also helps the body remain healthy and stave off diseases. Sleep deprivation can impair our brain function.

The next chart shows the difference of the total time we are resting including sleep compared to the active state.

total_resting_minutes <- mean(c(summary_sleep_daily_activity$sedentary_minutes, summary_sleep_daily_activity$total_minutes_asleep))

average_active_minutes <- mean(c(summary_sleep_daily_activity$very_active_minutes, summary_sleep_daily_activity$fairly_active_minutes, summary_sleep_daily_activity$lightly_active_minutes))

resting_vs_active_minutes <- c(total_resting_minutes, average_active_minutes)

pie(resting_vs_active_minutes, labels = c("Average Total Resting Minutes", "Average Total Active Minutes"), main = "Average Resting Minutes vs Average Active Minutes", col = rainbow(2))
percentages <- percent(resting_vs_active_minutes/sum(resting_vs_active_minutes))
percentages <- paste(percentages, " ")
legend("topright", legend = percentages, cex = 0.8, fill = rainbow(2))

The previous graph shows that the average user is only active 13% of the day, while the rest of the day, they are in a sedentary state or sleeping. According to CDC, regular physical activity is one of the most important things you can do for your health. Being physically active can improve your brain health, help manage weight, reduce the risk of disease, strengthen bones and muscles, and improve your ability to do everyday activities.

We are now going to show the correlation on how being active affects the calories burned.

new_daily_activity_minutes <- daily_activity %>% select(very_active_minutes, fairly_active_minutes, lightly_active_minutes, calories) %>% mutate(total_active_minutes = very_active_minutes+fairly_active_minutes+lightly_active_minutes)

ggplot(new_daily_activity_minutes, aes(x=calories, y= total_active_minutes)) + geom_point() + labs(title = "Calories vs Active Minutes", x = "Calories Burned", y= "Active Minutes")

According to this barplot, the more active you are the more calories you burn. This might be an obvious correlation because it is natural for our body to burn calories while we move. According to CDC, there are other benefits of being physically active, it can improve your brain health, help manage weight, reduce the risk of disease, strengthen bones and muscles, and improve your ability to do everyday activities. Adults who sit less and do any amount of moderate-to-vigorous physical activity gain some health benefits.

On the next graph, we are going to see the correlation of sleep and calories burned.

ggplot(summary_sleep_daily_activity, aes(x=total_minutes_asleep, y=calories, color= total_minutes_asleep)) + geom_point() + scale_color_gradient(low="plum2", high = "orchid4") + labs(title = "Calories vs Total Sleep Minutes", x= "Total Minutes Asleep", y= "Calories Burned")

On the calories vs total minutes asleep graph, it shows that just because you slept longer, doesn’t mean you burned more calories. The quality of sleep might affect the results of the calories burned. Although, according to the insider article, the longer you are in the REM state during your sleep the more calories you burn. There are other factors that affect this claim, to know more click on link.

Upon examining the weight information data, we learned that out of the 33 users that gave consent to their data. Only 8 users recorded their weight.

n_unique(weight_info$id)

## [1] 8

n_unique(daily_activity$id)

## [1] 33

Out of the 8 users, only 5 of them manually reported their weight, and only one user reported their weight daily for 30 days.

weight_info %>% 
  filter(is_manual_report == "TRUE") %>% 
  group_by(id) %>% 
  summarise("Manual Weight Report"=n()) %>%
  distinct()

## # A tibble: 5 × 2
##           id `Manual Weight Report`
##        <dbl>                  <int>
## 1 1503960366                      2
## 2 2873212765                      2
## 3 4319703577                      2
## 4 4558609924                      5
## 5 6962181067                     30

3 out of the 8 users had reports about their weight. The report says that the 3 users were not reporting their weight manually. Even one of the users had reports for 24 days. This brings us to the question on how the weight was being recorded without them reporting it manually.

weight_info %>% 
  filter(is_manual_report == "FALSE") %>% 
  group_by(id) %>% 
  summarise("Manual Weight Report"=n())

## # A tibble: 3 × 2
##           id `Manual Weight Report`
##        <dbl>                  <int>
## 1 1927972279                      1
## 2 5577150313                      1
## 3 8877689391                     24

This makes the data unreliable for analyzation due to how small the population is. Although this data is not reliable, this encourages our company to figure out how we can improve the tracking of weight if the user’s goal is to lose, gain or maintain weight.

Now we are going to compute how often the users were wearing their Fitbits. There are 1440 minutes in a day and we are using the mutate() to get the average minutes the users were wearing their Fitbits.

avg_wear_time <- daily_activity %>% mutate(total_wear_time = very_active_minutes + fairly_active_minutes + lightly_active_minutes + sedentary_minutes) %>% group_by(id) %>% summarise(average_wear_time = mean(total_wear_time))

This graph shows how often the users were wearing their fitbits :

ggplot(avg_wear_time, aes(x=average_wear_time)) + geom_histogram(binwidth = 33, fill= "pink", color= "burlywood4") + labs(title = "Average Wear Time", x= "Time Worn in Minutes", y= "Frequency") +  scale_x_continuous(breaks = seq(0, 1440, by = 60), labels = seq(0, 1440, by = 60))

It shows that the users are using their devices at the majority of the day. The company should invest on making the devices comfortable and make sure that the devices are durable. The company should also consider the battery life of the devices.

Act (Recommendations)

How could these trends apply to Bellabeat’s customers?

After cleaning, processing, and analyzing the data that was provided, I concluded that not a lot of people understand the importance of being physically active. According to the trends that were shown, the users spent a lot of time in sedentary state. This might be because of their job or school, but people need to be reminded that they need to move their body to be healthy. These trends apply to Bellabeat’s customers because we can improve our products by setting up reminders to stand up, take a walk, and maybe even stretch. An hourly reminder from the app to remind the customers to have healthier habits can help us improve sales. Promoting a healthy lifestyle is not only a good marketing strategy, but it is also a great way to advertise self-care.

How could these trends help influence Bellabeat marketing strategy?

The trends from the analysis can help influence Bellabeat’s marketing strategy by giving us a perspective on where we can improve. Here are some recommendations:

1.) According to the trends, the users were inconsistent with tracking their weight. Depending on the user’s goals, we can set up a reminder for the user to input their weight every week if their goal is to lose or gain weight. That way they can track their improvements.

2.) According to the data provided, a lot of people do not get enough sleep. We can set up reminders for the users to start getting ready for bed. Having the option to play relaxing music from the app and even reminding them to sleep.

3.) The company can also invest in improving their app to expand its abilities such as having the option to personalize the user’s goals. Examples would be giving them the option to record what they eat, counting their calories depending on the workout they did like yoga, outdoor walk, kayaking, playing tennis etc. This encourages the users to wear their devices everywhere all the time because it will count all the calories they are burning regardless of the activity.

4.) The company could also invest in adding additional features to the devices such as being able to track the location and or the ability to call 911 if there is an emergency.

5.) The company can also invest in improving the battery life of the devices. This will encourage the users to wear it more because they would not have to worry about the battery dying or re-charging it often.

6.) The company should also invest in the quality and comfort of the products since a lot of users wear the devices on majority of the day. The users can focus more on their health and lifestyle and worry less about the devices being scratched or breaking during activities.

Recommendations for Marketing Team:

If all the recommendations were acted on, the marketing team can advertise all these improvements on our devices which could help us boost the sales of the company. We could also use user-generated content creators and bloggers to help us with marketing. According to research a lot of consumers establish greater trust between customer and brand with user-generated content creators. It is real and honest content that the potential customer can trust, as it is not the brand’s own content, but is based on the experience that other users have had with the products.

Reference List

Correlation of sleep and calories burned.

Danielle Pacheco - March 2, 2023

https://www.sleepfoundation.org/how-sleep-works/why-do-we-need-sleep

Benefits of being physically active.

Division of Nutrition, Physical Activity and Obesity, National Center for Chronic Disease Prevention and Health Promotion – June 16, 2022

https://www.cdc.gov/physicalactivity/basics/pa-health/index.htm#:~:text=Being%20physically%20active%20can%20improve,activity%20gain%20some%20health%20benefits.

Reasons people purchase fitness trackers.

Subir Biswas - January 4, 2022

https://www.vertexknowledge.com/post/how-do-smartwatches-measure-calories-tech-knowledge#:~:text=A%20very%20common%20use%20of,calories%20burned%20during%20various%20exercises.

UGC benefits

https://www.adsmurai.com/en/articles/4-benefits-of-user-generated-content#:~:text=User%2Dgenerated%20content%20establishes%20greater,have%20had%20with%20the%20products.

Bellabeat Case Study

Hosanna Bigornia

2023-04-07