Bellabeat is a wellness company focused on improving women’s health through innovative, technology-driven solutions. Bellabeat’s products combine an app with a suite of smart fitness devices to track activity and provide personalized insights that support customer health and wellness.
In order to gain deeper insights into their core audience, Bellabeat Co-founder, Urška Sršen, seeks actionable data on the fitness habits of non-Bellabeat smart device users.
Using available data from past FitBit users, this analysis addresses the following questions:
What are some trends in smart device usage?
How could these trends apply to Bellabeat customers?
How could these trends help influence Bellabeat marketing strategy?
The key stakeholders for this analysis are:
Urška Sršen: Bellabeat Co-founder and CPO
Sandro Mur: Bellabeat Co-founder and CEO
Bellabeat Marketing Analytics Team: The team of data analysts responsible for collecting, analyzing, and reporting data to help guide Bellabeat’s marketing strategy.
The initial concern for this analysis will be accessing, cleaning, and identifying relevant data from available sources.
An initial theory for this analysis is that the data will be simultaneously limited and overabundant; the pool of FitBit user data may be smaller than desired, but concurrently, contain more information per user than may be necessary for this analysis.
The source of the data used in this analysis is the CC0: Public Domain FitBit Fitness Tracker Data by Möbius. According to its description, it consists of FitBit data generated by 30 respondents to a survey distributed via Amazon Mechanical Turk, although the files analyzed here contain up to 33 unique user IDs. The respondents consented to the public availability of personal FitBit data including daily, hourly, and minute-level reporting of physical activity, heart rate, and sleep monitoring. The data covers the 31 days from April 12, 2016 through May 12, 2016.
The limited respondent pool of roughly 30 FitBit users is far too small to represent the general population.
The limited respondent pool, combined with the FitBit data source, implies a bias toward existing FitBit users or those currently interested in or working toward health improvement.
The gathered data lacks demographic information, including but not limited to age, gender, and occupation, the presence of which could reveal additional insights.
These factors combine to imply a more limited application scope for this data analysis. As such, this analysis will most likely be relevant to those currently involved in health and fitness improvement or those interested in future health and fitness improvement.
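To put the sample-size limitation in rough numbers, the sketch below computes the worst-case 95% margin of error for an estimated proportion. It assumes a simple random sample, which this self-selected survey is not, so the figures are optimistic at best. The function name `margin_of_error` and the comparison sizes of 100 and 1,000 are illustrative choices, not part of the original analysis.

```r
# Worst-case 95% margin of error for an estimated proportion.
# ASSUMPTION: simple random sampling, which the FitBit survey data does not satisfy.
margin_of_error <- function(n, p = 0.5, conf_level = 0.95) {
  z <- qnorm(1 - (1 - conf_level) / 2)  # two-sided critical value
  z * sqrt(p * (1 - p) / n)
}

sample_sizes <- c(30, 100, 1000)  # 30 matches the survey; the others are for comparison
moe <- margin_of_error(sample_sizes)
print(data.frame(n = sample_sizes, margin_of_error = round(moe, 3)))
```

With about 30 respondents, a reported share could plausibly be off by roughly 18 percentage points in either direction, which supports treating the findings as directional only.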
# Install the required packages (only needed once per R installation)
install.packages("tidyverse")
install.packages("janitor")
# Load the packages
library("tidyverse")
library("janitor")
This data contains information about users’ daily FitBit tracked activities.
daily_activity <- read.csv("dailyActivity_merged.csv")
# Use the clean_names() function to change all headers to lowercase with underscores
daily_activity <- clean_names(daily_activity)
# Gain a quick initial look at the data frame
head(daily_activity)
# Gain a structural look at the data frame
str(daily_activity)
'data.frame': 940 obs. of 15 variables:
$ id : num 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
$ activity_date : chr "4/12/2016" "4/13/2016" "4/14/2016" "4/15/2016" ...
$ total_steps : int 13162 10735 10460 9762 12669 9705 13019 15506 10544 9819 ...
$ total_distance : num 8.5 6.97 6.74 6.28 8.16 ...
$ tracker_distance : num 8.5 6.97 6.74 6.28 8.16 ...
$ logged_activities_distance: num 0 0 0 0 0 0 0 0 0 0 ...
$ very_active_distance : num 1.88 1.57 2.44 2.14 2.71 ...
$ moderately_active_distance: num 0.55 0.69 0.4 1.26 0.41 ...
$ light_active_distance : num 6.06 4.71 3.91 2.83 5.04 ...
$ sedentary_active_distance : num 0 0 0 0 0 0 0 0 0 0 ...
$ very_active_minutes : int 25 21 30 29 36 38 42 50 28 19 ...
$ fairly_active_minutes : int 13 19 11 34 10 20 16 31 12 8 ...
$ lightly_active_minutes : int 328 217 181 209 221 164 233 264 205 211 ...
$ sedentary_minutes : int 728 776 1218 726 773 539 1149 775 818 838 ...
$ calories : int 1985 1797 1776 1745 1863 1728 1921 2035 1786 1775 ...
# View the data frame
view(daily_activity)
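The structure output shows the id column read as a number and displayed in scientific notation (1.5e+09), which hides the individual identifiers. One possible alternative, sketched here on a few inline rows rather than the real file, is to read the identifier as text and parse the date on import. The Id value is a made-up placeholder and the raw column names are assumed from the cleaned names; the dates and step counts are taken from the output above.

```r
# Inline stand-in for the CSV file (the Id value is a hypothetical placeholder)
csv_text <- "Id,ActivityDate,TotalSteps
1000000001,4/12/2016,13162
1000000001,4/13/2016,10735"

# Read the identifier as character so it is never shown in scientific notation
activity <- read.csv(text = csv_text, colClasses = c(Id = "character"))

# Convert the month/day/year strings into proper Date values
activity$ActivityDate <- as.Date(activity$ActivityDate, format = "%m/%d/%Y")

str(activity)
```

Keeping the identifier as text also avoids accidental arithmetic on a value that is really a label.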
This data contains information about users’ heart rates, recorded every few seconds (the first records shown below are five to ten seconds apart).
heartrate_data <- read.csv("heartrate_seconds_merged.csv")
# Use the clean_names() function to change all headers to lowercase with underscores
heartrate_data <- clean_names(heartrate_data)
# Gain a quick initial look at the data frame
head(heartrate_data)
# Gain a structural look at the data frame
str(heartrate_data)
'data.frame': 2483658 obs. of 3 variables:
$ id : num 2.02e+09 2.02e+09 2.02e+09 2.02e+09 2.02e+09 ...
$ time : chr "4/12/2016 7:21:00 AM" "4/12/2016 7:21:05 AM" "4/12/2016 7:21:10 AM" "4/12/2016 7:21:20 AM" ...
$ value: int 97 102 105 103 101 95 91 93 94 93 ...
# View the data frame
view(heartrate_data)
This data contains information about users’ sleep duration.
sleep_data <- read.csv("sleepDay_merged.csv")
# Use the clean_names() function to change all headers to lowercase with underscores
sleep_data <- clean_names(sleep_data)
# Gain a quick initial look at the data frame
head(sleep_data)
# Gain a structural look at the data frame
str(sleep_data)
'data.frame': 413 obs. of 5 variables:
$ id : num 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
$ sleep_day : chr "4/12/2016 12:00:00 AM" "4/13/2016 12:00:00 AM" "4/15/2016 12:00:00 AM" "4/16/2016 12:00:00 AM" ...
$ total_sleep_records : int 1 2 1 2 1 1 1 1 1 1 ...
$ total_minutes_asleep: int 327 384 412 340 700 304 360 325 361 430 ...
$ total_time_in_bed : int 346 407 442 367 712 320 377 364 384 449 ...
# View the data frame
view(sleep_data)
This data contains information about users’ body weight.
weight_log_data <- read.csv("weightLogInfo_merged.csv")
# Use the clean_names() function to change all headers to lowercase with underscores
weight_log_data <- clean_names(weight_log_data)
# Gain a quick initial look at the data frame
head(weight_log_data)
# Gain a structural look at the data frame
str(weight_log_data)
'data.frame': 67 obs. of 8 variables:
$ id : num 1.50e+09 1.50e+09 1.93e+09 2.87e+09 2.87e+09 ...
$ date : chr "5/2/2016 11:59:59 PM" "5/3/2016 11:59:59 PM" "4/13/2016 1:08:52 AM" "4/21/2016 11:59:59 PM" ...
$ weight_kg : num 52.6 52.6 133.5 56.7 57.3 ...
$ weight_pounds : num 116 116 294 125 126 ...
$ fat : int 22 NA NA NA NA 25 NA NA NA NA ...
$ bmi : num 22.6 22.6 47.5 21.5 21.7 ...
$ is_manual_report: chr "True" "True" "False" "True" ...
$ log_id : num 1.46e+12 1.46e+12 1.46e+12 1.46e+12 1.46e+12 ...
# View the data frame
view(weight_log_data)
A perusal of the data sources reveals certain redundancies; consolidating, reducing, or eliminating them should make the analysis more efficient.
The file dailyActivity_merged.csv already contains the data present in dailyCalories_merged.csv, dailyIntensities_merged.csv, and dailySteps_merged.csv. To eliminate redundancy, the dailyCalories_merged, dailyIntensities_merged, and dailySteps_merged files can be removed from this analysis.
The hourly data records, hourlyCalories_merged.csv, hourlyIntensities_merged.csv, and hourlySteps_merged.csv contain Id and ActivityHour fields in common. To reduce the number of files required for analysis, hourlyCalories_merged, hourlyIntensities_merged, and hourlySteps_merged will be combined into the file combined_HourlyData.csv.
# Combine the hourly files with purrr::map() and purrr::reduce()
# library(purrr) is not needed because it was previously loaded in tidyverse
# The pattern is a regular expression matching file names that end in .csv
file_paths_hourly <- list.files(path = "hourly_data", pattern = "\\.csv$", full.names = TRUE)
list_of_hourly_dfs <- map(file_paths_hourly, read_csv)
# Join on the two key columns the hourly files share
hourly_activity <- reduce(list_of_hourly_dfs, inner_join, by = c("Id", "ActivityHour"))
write_csv(hourly_activity, "combined_HourlyData.csv")
# Use the clean_names() function to change all headers to lowercase with underscores
hourly_activity <- clean_names(hourly_activity)
# Gain a quick initial look at the data frame
head(hourly_activity)
# Gain a structural look at the data frame
str(hourly_activity)
spc_tbl_ [22,099 × 6] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ id : num [1:22099] 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
$ activity_hour : chr [1:22099] "4/12/2016 12:00:00 AM" "4/12/2016 1:00:00 AM" "4/12/2016 2:00:00 AM" "4/12/2016 3:00:00 AM" ...
$ calories : num [1:22099] 81 61 59 47 48 48 48 47 68 141 ...
$ total_intensity : num [1:22099] 20 8 7 0 0 0 0 0 13 30 ...
$ average_intensity: num [1:22099] 0.333 0.133 0.117 0 0 ...
$ step_total : num [1:22099] 373 160 151 0 0 ...
- attr(*, "spec")=
.. cols(
.. Id = col_double(),
.. ActivityHour = col_character(),
.. Calories = col_double()
.. )
- attr(*, "problems")=<externalptr>
# View the data frame
view(hourly_activity)
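The key-based join used to combine the hourly files can be illustrated on a small scale. The sketch below uses base R's `merge()` instead of `inner_join()` so that it runs without any packages. The three tiny data frames are stand-ins for the hourly files, with values taken from the first two rows of the output above, a placeholder Id of 1, and the TotalIntensity and StepTotal column names assumed from the cleaned names.

```r
# Stand-ins for the three hourly files (Id = 1 is a placeholder)
hours <- c("4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM")
calories    <- data.frame(Id = 1, ActivityHour = hours, Calories = c(81, 61))
intensities <- data.frame(Id = 1, ActivityHour = hours, TotalIntensity = c(20, 8))
steps       <- data.frame(Id = 1, ActivityHour = hours, StepTotal = c(373, 160))

# Join the tables one after another on the two shared key columns;
# merge() keeps only rows whose keys appear in both inputs (an inner join)
join_on_keys <- function(x, y) merge(x, y, by = c("Id", "ActivityHour"))
combined <- Reduce(join_on_keys, list(calories, intensities, steps))

print(combined)
```

Naming the key columns explicitly means any other column that happened to share a name across files would not silently become part of the join condition.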
The previous structural looks at the data frames show that each column was read in as a usable type (character, numeric, or integer). The date and time columns are still stored as character strings and are converted in a later step. The data frames will now be checked for any missing or duplicated values.
# Check for missing values
sum(is.na(daily_activity))
[1] 0
sum(is.na(heartrate_data))
[1] 0
sum(is.na(hourly_activity))
[1] 0
sum(is.na(sleep_data))
[1] 0
sum(is.na(weight_log_data))
[1] 65
# The missing values are concentrated in the sparsely populated fat column, so drop that column
weight_log_data_2 <- select(weight_log_data, -fat)
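A single total from `sum(is.na(...))` does not show which columns the missing values sit in. A per-column count makes that visible, as in this minimal sketch. The small data frame is a stand-in built from the first few weight and fat values shown in the structure output above, with placeholder ids.

```r
# Stand-in for the weight log (ids are placeholders)
weight_log <- data.frame(
  id        = c(1, 1, 2, 3),
  weight_kg = c(52.6, 52.6, 133.5, 56.7),
  fat       = c(22, NA, NA, NA)
)

# Count the missing values in each column rather than across the whole data frame
na_per_column <- colSums(is.na(weight_log))
print(na_per_column)
```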
# Checking daily_activity
daily_activity %>%
group_by_all() %>%
filter(n()>1)
# Checking heartrate_data
heartrate_data %>%
group_by_all() %>%
filter(n()>1)
# Checking hourly_activity
hourly_activity %>%
group_by_all() %>%
filter(n()>1)
# Checking weight_log_data
weight_log_data %>%
group_by_all() %>%
filter(n()>1)
# Checking sleep_data
sleep_data %>%
group_by_all() %>%
filter(n()>1)
# Keep only the distinct rows of sleep_data
sleep_data_clean <- distinct(sleep_data)
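The duplicate checks above print the duplicated rows themselves. When only a count is needed, base R's `duplicated()` offers a compact alternative, sketched here on a made-up data frame in which the third row deliberately repeats the second.

```r
# Made-up sleep records; row 3 is an intentional exact copy of row 2
sleep <- data.frame(
  id                   = c(1, 1, 1, 2),
  sleep_day            = c("4/12/2016", "4/13/2016", "4/13/2016", "4/12/2016"),
  total_minutes_asleep = c(327, 384, 384, 412)
)

# duplicated() flags every row that repeats an earlier row
n_duplicates <- sum(duplicated(sleep))

# Drop the flagged rows, keeping the first occurrence of each record
sleep_unique <- sleep[!duplicated(sleep), ]

cat("Duplicate rows:", n_duplicates, "\n")
cat("Rows after removal:", nrow(sleep_unique), "\n")
```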
Separating the time data will make it easier to analyze hourly activity data and heart rate data.
Separating the date data will make it easier to analyze daily activity and weight log data.
# Parse the 12-hour AM/PM strings into date-time values (displayed in 24-hour format)
hourly_activity$activity_hour <- mdy_hms(hourly_activity$activity_hour)
# Separate the date and time
hourly_activity_clean <- hourly_activity %>%
separate(activity_hour, c("date", "time"), " ")
# Midnight rows come out of the split with a missing time (NA); set them to 00:00:01, as this is measured hourly and the extra second will not make a difference
hourly_activity_clean$time[is.na(hourly_activity_clean$time)] <- "00:00:01"
# Parse the 12-hour AM/PM strings into date-time values (displayed in 24-hour format)
heartrate_data$time <- mdy_hms(heartrate_data$time)
# Separate the date and time
heartrate_data_clean <- heartrate_data %>%
separate(time, c("date", "time"), " ")
# Remove the rows left with a missing time after the split (these appear to be readings stamped exactly at midnight)
heartrate_data_clean <- na.omit(heartrate_data_clean)
# Parse the 12-hour AM/PM strings into date-time values (displayed in 24-hour format)
weight_log_data_2$date <- mdy_hms(weight_log_data_2$date)
# Separate the date and time
weight_log_data_clean <- weight_log_data_2 %>%
separate(date, c("date", "time"), " ")
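The midnight handling above is needed because splitting a date-time on a space can leave midnight values without a time part. One possible alternative, sketched below in base R, formats the date and the time separately so that midnight is kept as 00:00:00. It assumes an English or C locale for parsing the AM/PM marker and uses UTC as an arbitrary time zone; the timestamps are sample values in the same format as the data.

```r
# Sample timestamps in the same 12-hour format as the FitBit files
timestamps <- c("4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM", "5/2/2016 11:59:59 PM")

# Parse with an explicit format; %I is the 12-hour clock and %p the AM/PM marker
# ASSUMPTION: the session locale recognizes "AM"/"PM" (true for English and C locales)
parsed <- as.POSIXct(timestamps, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

# Format the two parts explicitly so midnight is never dropped
split_df <- data.frame(
  date = format(parsed, "%Y-%m-%d"),
  time = format(parsed, "%H:%M:%S")
)
print(split_df)
```

Because every row receives a time string, no midnight readings would need to be patched or removed afterwards.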
Now that the data has been cleaned and verified, an initial summary of its contents can be examined.
Determine how many unique users submitted data for each data frame.
# Count users submitting daily activity logs
n_distinct(daily_activity$id)
[1] 33
# Count users submitting heart rate logs
n_distinct(heartrate_data_clean$id)
[1] 14
# Count users submitting hourly activity logs
n_distinct(hourly_activity_clean$id)
[1] 33
# Count users submitting sleep logs
n_distinct(sleep_data_clean$id)
[1] 24
# Count users submitting weight logs
n_distinct(weight_log_data_clean$id)
[1] 8
The number of unique users submitting fitness data varies depending on the type of data:
33 unique users submitted logs for daily activity and hourly activity
24 unique users submitted logs for sleep
14 unique users submitted logs for heart rate
8 unique users submitted logs for weight
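Expressing these counts as a share of the 33 users who logged daily activity makes the drop-off between tracking types easier to compare. The short calculation below uses only the counts listed above; the variable names are illustrative.

```r
# Unique users per data frame, taken from the counts above
users_per_dataset <- c(daily_activity = 33, hourly_activity = 33,
                       sleep = 24, heart_rate = 14, weight = 8)

# Share of the full user pool (33) that submitted each type of log
total_users <- max(users_per_dataset)
participation_pct <- round(100 * users_per_dataset / total_users, 1)
print(participation_pct)
```

Roughly 73% of users logged sleep, 42% logged heart rate, and 24% logged weight.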
Some of the user data refers to terms specific to FitBit. These terms are defined as follows:
METs: Metabolic Equivalents of Task, a measurement of energy expenditure due to physical activity.
Activity levels: FitBit activity classification based on METs:
Sedentary = <1.5 METs
Lightly Active = 1.5 to 3 METs (e.g., leisurely walking or household chores)
Fairly Active = 3 to 6 METs (e.g., brisk walking or dancing)
Very Active = 6+ METs (e.g., jogging or running)
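The thresholds above can be written as a small classification function. This is a sketch rather than FitBit's actual implementation: the function name `classify_mets` is hypothetical, and because the listed ranges share their boundary values, it assumes that a value exactly on a boundary (1.5, 3, or 6) belongs to the higher level.

```r
# Map MET values to the activity levels defined above.
# ASSUMPTION: boundary values (1.5, 3, 6) fall into the higher level.
classify_mets <- function(mets) {
  cut(mets,
      breaks = c(-Inf, 1.5, 3, 6, Inf),
      labels = c("Sedentary", "Lightly Active", "Fairly Active", "Very Active"),
      right  = FALSE)  # intervals are closed on the left: [1.5, 3), [3, 6), ...
}

example_mets <- c(1.0, 2.0, 4.5, 8.0)  # illustrative values
activity_levels <- as.character(classify_mets(example_mets))
print(data.frame(mets = example_mets, level = activity_levels))
```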
daily_activity %>%
select(total_steps,
total_distance,
very_active_minutes,
fairly_active_minutes,
lightly_active_minutes,
sedentary_minutes,
calories) %>%
summary()
total_steps total_distance very_active_minutes
Min. : 0 Min. : 0.000 Min. : 0.00
1st Qu.: 3790 1st Qu.: 2.620 1st Qu.: 0.00
Median : 7406 Median : 5.245 Median : 4.00
Mean : 7638 Mean : 5.490 Mean : 21.16
3rd Qu.:10727 3rd Qu.: 7.713 3rd Qu.: 32.00
Max. :36019 Max. :28.030 Max. :210.00
fairly_active_minutes lightly_active_minutes
Min. : 0.00 Min. : 0.0
1st Qu.: 0.00 1st Qu.:127.0
Median : 6.00 Median :199.0
Mean : 13.56 Mean :192.8
3rd Qu.: 19.00 3rd Qu.:264.0
Max. :143.00 Max. :518.0
sedentary_minutes calories
Min. : 0.0 Min. : 0
1st Qu.: 729.8 1st Qu.:1828
Median :1057.5 Median :2134
Mean : 991.2 Mean :2304
3rd Qu.:1229.5 3rd Qu.:2793
Max. :1440.0 Max. :4900
Of the 33 users who submitted daily activity data:
The average user accumulates 7638 steps per day.
The average user covers 5.49 kilometers per day.
The average user is very active for 21.16 minutes per day.
The average user is fairly active for 13.56 minutes per day.
The average user is lightly active for 192.8 minutes (3.2 hours) per day.
The average user burns 2304 calories per day.
The activity observations show that users spend most activity time being lightly active, but with more time spent in very active exercise than in fairly active exercise, implying light, everyday activities punctuated by focused bursts of more intense activity.
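The balance between intensity levels can be quantified from the three means in the summary above. The calculation below expresses each as a share of total active minutes; it uses only the reported means, so the shares describe an average day rather than any individual user.

```r
# Mean minutes per day at each intensity level, from the summary output
mean_minutes <- c(very_active = 21.16, fairly_active = 13.56, lightly_active = 192.8)

# Share of total active time spent at each level
share_pct <- round(100 * mean_minutes / sum(mean_minutes), 1)
print(share_pct)
```

Light activity accounts for about 85% of active time, with very active minutes at about 9% and fairly active minutes at about 6%.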
# Generate a general heart rate summary
heartrate_data_clean %>%
select(value,
time) %>%
summary()
value time
Min. : 36.00 Length:2483546
1st Qu.: 63.00 Class :character
Median : 73.00 Mode :character
Mean : 77.33
3rd Qu.: 88.00
Max. :203.00
Of the 14 users who submitted heart rate data:
The average heart rate is 77.33 bpm (beats per minute).
The maximum heart rate of 203 bpm is very high (according to the American Heart Association, the estimated maximum heart rate is about 200 bpm at 20 years of age and decreases with age).
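The age-based figure quoted above follows the common rule of thumb of 220 minus age. The sketch below tabulates that estimate for a range of ages. It is a rough population-level approximation rather than a clinical threshold, and the function name `estimated_max_hr` is illustrative.

```r
# Rule-of-thumb estimate of maximum heart rate: 220 minus age (in years)
estimated_max_hr <- function(age) 220 - age

ages <- seq(20, 70, by = 10)
max_hr_table <- data.frame(age = ages, est_max_bpm = estimated_max_hr(ages))
print(max_hr_table)

# The highest reading in the data, compared with the estimate for a 20-year-old
observed_max <- 203
cat("Above the estimate for age 20:", observed_max > estimated_max_hr(20), "\n")
```

Since the dataset contains no ages, the 203 bpm reading cannot be judged against a user-specific maximum.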
sleep_data_clean %>%
select(total_minutes_asleep,
total_time_in_bed) %>%
summary()
total_minutes_asleep total_time_in_bed
Min. : 58.0 Min. : 61.0
1st Qu.:361.0 1st Qu.:403.8
Median :432.5 Median :463.0
Mean :419.2 Mean :458.5
3rd Qu.:490.0 3rd Qu.:526.0
Max. :796.0 Max. :961.0
Of the 24 users who submitted sleep data:
The average sleep time is 419.2 minutes (~7 hours).
The average time in bed is 458.5 minutes (7.64 hours, or about 7 hours, 38 minutes).
The difference between average time in bed and average sleep time (about 39 minutes) can be attributed to time spent lying in bed before falling asleep and/or after waking up.
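The relationship between the two sleep measures can be worked through directly from the means reported in the summary output (419.2 minutes asleep and 458.5 minutes in bed). The ratio of the two is sometimes described as sleep efficiency; that term is not used in the original data.

```r
# Means taken from the sleep summary output
mean_minutes_asleep <- 419.2
mean_minutes_in_bed <- 458.5

# Average time spent in bed but not asleep
minutes_awake_in_bed <- mean_minutes_in_bed - mean_minutes_asleep

# Share of time in bed actually spent asleep
sleep_efficiency_pct <- round(100 * mean_minutes_asleep / mean_minutes_in_bed, 1)

cat("Awake in bed (minutes):", round(minutes_awake_in_bed, 1), "\n")
cat("Sleep efficiency (%):", sleep_efficiency_pct, "\n")
```

On average, users are asleep for about 91% of the time they spend in bed.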
weight_log_data_clean %>%
select(weight_pounds,
bmi) %>%
summary()
weight_pounds bmi
Min. :116.0 Min. :21.45
1st Qu.:135.4 1st Qu.:23.96
Median :137.8 Median :24.39
Mean :158.8 Mean :25.19
3rd Qu.:187.5 3rd Qu.:25.56
Max. :294.3 Max. :47.54
Only 8 users submitted weight logs.
The average weight is 158.8 pounds.
The average BMI (Body Mass Index) is 25.19. A BMI of 25 to 29.9 is considered overweight, so the average falls just inside the overweight range. The median BMI of 24.39 is below that threshold, however, and the mean is pulled upward by a maximum of 47.54.
Because of the small sample size, conclusions drawn from the weight data cannot be considered reliable.
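To see where the BMI summary statistics fall relative to the category boundaries, the sketch below classifies each of them. The overweight range of 25 to 29.9 comes from the text above; the other cut-offs (18.5 and 30) are the commonly used standard boundaries and are an addition here, as is the function name `classify_bmi`.

```r
# Classify BMI values into standard categories.
# ASSUMPTION: cut-offs of 18.5 and 30 are added here; only 25 to 29.9 is from the analysis.
classify_bmi <- function(bmi) {
  cut(bmi,
      breaks = c(-Inf, 18.5, 25, 30, Inf),
      labels = c("Underweight", "Healthy weight", "Overweight", "Obese"),
      right  = FALSE)
}

# Summary statistics from the weight log output
bmi_summary <- c(min = 21.45, q1 = 23.96, median = 24.39,
                 mean = 25.19, q3 = 25.56, max = 47.54)
bmi_classes <- setNames(as.character(classify_bmi(bmi_summary)), names(bmi_summary))
print(bmi_classes)
```

The median falls in the healthy range while the mean does not, which illustrates how a single high value can shift the average in a sample this small.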
ggplot(data=daily_activity, aes(x=total_steps, y=calories))+
geom_point(color="darkorange")+
geom_smooth(color="red")+
labs(title="Daily Steps vs Calories Burned",
x="Total Daily Steps",
y="Daily Calories Burned")
# Prepare the data for a bar chart
activity_type_data <- daily_activity %>%
pivot_longer(cols = c(very_active_minutes, fairly_active_minutes, lightly_active_minutes), names_to="Variable", values_to="Value")
# Reorder the bars in the chart
activity_type_data$Variable <- factor(activity_type_data$Variable, levels=c("very_active_minutes", "fairly_active_minutes", "lightly_active_minutes"))
# Create the bar chart from the prepared data "activity_type_data"
ggplot(activity_type_data, aes(x=Variable, y=Value))+
geom_bar(stat="identity", fill="dodgerblue3")+
labs(title="Comparison of Activity Intensity Types",
x="Activity Intensity",
y="Activity Minutes")
A comparison of activity intensity types shows that:
The vast majority of activity time is spent performing light activities.
More time is spent being very active than fairly active.
# Convert dates to POSIXct format before converting to days of the week
daily_activity$date_posixct <- as.POSIXct(daily_activity$activity_date, format="%m/%d/%Y")
# Find days of the week from date_posixct
daily_activity$dotw <- wday(daily_activity$date_posixct, label=TRUE, abbr=FALSE)
# Create the dotw_activity data frame
dotw_activity <- daily_activity %>%
group_by(dotw) %>%
summarise(avg_steps_dotw=as.integer(mean(total_steps)),
vam_dotw=as.integer(mean(very_active_minutes)),
fam_dotw=as.integer(mean(fairly_active_minutes)),
lam_dotw=as.integer(mean(lightly_active_minutes)),
total_minutes_dotw=as.integer(sum(vam_dotw, fam_dotw, lam_dotw)))
# Plot days of the week vs average daily steps
ggplot(dotw_activity, aes(x=dotw, y=avg_steps_dotw))+
geom_bar(stat="identity", fill="dodgerblue")+
labs(title="Average Steps by Day of the Week",
x="Day of the Week",
y="Average Daily Steps")+
theme(axis.text.x=element_text(angle=30),
legend.position="none")+
geom_text(aes(label=avg_steps_dotw, vjust=1.2))
The most active step days are Saturday and Tuesday.
The least active step day is Sunday.
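The day-of-week grouping behind this chart can be reproduced on a small scale without lubridate. The sketch below derives weekday names from the numeric weekday index, which avoids depending on the session's language settings. The four rows reuse the first dates and step counts shown in the daily activity output, so the averages are purely illustrative.

```r
# First four daily records from the daily activity output
activity <- data.frame(
  activity_date = c("4/12/2016", "4/13/2016", "4/14/2016", "4/15/2016"),
  total_steps   = c(13162, 10735, 10460, 9762)
)

# as.POSIXlt()$wday counts days from 0 (Sunday) to 6 (Saturday)
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")
parsed_dates <- as.Date(activity$activity_date, format = "%m/%d/%Y")
activity$dotw <- factor(day_names[as.POSIXlt(parsed_dates)$wday + 1], levels = day_names)

# Average steps for each day of the week present in the data
avg_steps <- aggregate(total_steps ~ dotw, data = activity, FUN = mean)
print(avg_steps)
```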
ggplot(dotw_activity, aes(x=dotw, y=total_minutes_dotw))+
geom_bar(stat="identity", fill="dodgerblue")+
labs(title="Total Active Minutes Daily",
x="Day of the Week",
y="Total Active Minutes")+
theme(axis.text.x=element_text(angle=30),
legend.position="none")+
geom_text(aes(label=total_minutes_dotw, vjust=1.2))
The most active day of the week is Saturday.
The least active day of the week is Sunday.
# Pivot the data for the bar chart
dotw_intensity <- dotw_activity %>%
pivot_longer(cols = c(vam_dotw, fam_dotw), names_to="variable", values_to="value")
# Create the bar chart from the prepared data "dotw_intensity"
ggplot(dotw_intensity, aes(x=dotw, fill=variable, y=value))+
geom_bar(stat="identity", position="dodge")+
labs(title="Comparison of Fairly and Very Active Minutes Daily",
x="Day of the Week",
y="Activity Minutes")+
theme(axis.text.x=element_text(angle=30))+
scale_fill_discrete(name="Activity Type", labels=c("Fairly Active", "Very Active"))+
geom_text(aes(label=value), position=position_dodge(width=0.9), vjust=-0.3)
“Very Active” minutes occur the most on Monday and the least on Sunday.
“Fairly Active” minutes occur the most on Saturday and the least on Thursday.
# Plot days of the week vs lightly active minutes
ggplot(dotw_activity, aes(x=dotw, y=lam_dotw))+
geom_bar(stat="identity", fill="dodgerblue")+
labs(title="Lightly Active Minutes Daily",
x="Day of the Week",
y="Lightly Active Minutes")+
theme(axis.text.x=element_text(angle=30),
legend.position="none")+
geom_text(aes(label=lam_dotw, vjust=1.2))
ggplot(data=daily_activity, aes(x=very_active_minutes, y=calories))+
geom_point(color="darkorange")+
geom_smooth(color="red")+
labs(title="Very Active Minutes vs Calories Burned",
x="Very Active Minutes",
y="Calories Burned")
ggplot(data=daily_activity, aes(x=fairly_active_minutes, y=calories))+
geom_point(color="darkorange")+
geom_smooth(color="red")+
labs(title="Fairly Active Minutes vs Calories Burned",
x="Fairly Active Minutes",
y="Calories Burned")
ggplot(data=daily_activity, aes(x=lightly_active_minutes, y=calories))+
geom_point(color="darkorange")+
geom_smooth(color="red")+
labs(title="Lightly Active Minutes vs Calories Burned",
x="Lightly Active Minutes",
y="Calories Burned")
The plots of activity minutes versus calories burned demonstrate:
A strong positive correlation between very active minutes and calories burned.
An inconclusive or even slightly negative correlation between fairly active minutes and calories burned.
A slight positive correlation between lightly active minutes and calories burned.
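The visual impressions above can be backed by a number using the Pearson correlation coefficient. The sketch below applies `cor()` to the first ten step and calorie values shown in the daily activity structure output. These rows are only the start of the file, so the result illustrates the method and is not an estimate for the full dataset.

```r
# First ten values of total_steps and calories from the str() output
total_steps <- c(13162, 10735, 10460, 9762, 12669, 9705, 13019, 15506, 10544, 9819)
calories    <- c(1985, 1797, 1776, 1745, 1863, 1728, 1921, 2035, 1786, 1775)

# Pearson correlation: +1 is a perfect positive linear relationship, 0 is none
steps_calories_cor <- cor(total_steps, calories)
cat("Correlation between steps and calories:", round(steps_calories_cor, 2), "\n")
```

Running the same calculation on each activity-minutes column of the full data frame would give comparable figures for the three relationships listed above.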
ggplot(data=hourly_activity_clean, aes(x=average_intensity, y=calories))+
geom_jitter(color="darkorange")+
geom_smooth(method="gam", formula=y~s(x, bs="cs"), color="red")+
labs(title="Average Hourly Intensity vs Calories Burned",
x="Average Hourly Intensity",
y="Calories Burned")
# Prepare the hourly_activity_clean data for time analysis
hourly_activity_df <- hourly_activity_clean
hourly_activity_df$time <- hms(hourly_activity_clean$time)
hour_posix <- hourly_activity_df$time
# Group the average total intensity values by hour values
total_intensity_hourly <- hourly_activity_df %>%
group_by(hour = hour(hour_posix)) %>%
summarise(avg_intensity=mean(total_intensity))
# Limit the average total intensity values to 1 decimal place for ease of reading
avg_intensity_rd <- format(round(total_intensity_hourly$avg_intensity, 1))
# Plot the average total intensity per hour into a bar/column chart
ggplot(data=total_intensity_hourly, aes(x=hour, y=avg_intensity))+
geom_col(fill="darkorange")+
scale_x_continuous(labels=scales::date_format("%H"), breaks=unique(total_intensity_hourly$hour)) +
labs(title="Activity Intensity by Time of Day",
x="Time of Day",
y="Average Total Hourly Intensity")+
geom_text(aes(label=avg_intensity_rd, angle=60, hjust=0.5))
This chart shows some general activity trends:
The most active time of the day is evening, from 5pm through 7pm. This implies an exercise period after the work day has ended.
The second-most active time period during the day is 12pm through 2pm, implying an exercise period during lunch break hours.
The least active time period is 12am through 4am, implying minimal activity due to sleep.
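The hourly averages behind this chart come from extracting the hour from each timestamp and grouping on it. A minimal base R sketch of that step follows. The four rows and their intensity values are made up for illustration, and parsing the AM/PM marker assumes an English or C locale.

```r
# Made-up hourly records: two at 5 PM and two at 3 AM
hourly <- data.frame(
  activity_hour   = c("4/12/2016 5:00:00 PM", "4/13/2016 5:00:00 PM",
                      "4/12/2016 3:00:00 AM", "4/13/2016 3:00:00 AM"),
  total_intensity = c(30, 20, 0, 2)
)

# Parse the 12-hour timestamps, then pull out the hour on a 24-hour clock
# ASSUMPTION: the session locale recognizes "AM"/"PM"
parsed <- as.POSIXct(hourly$activity_hour, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
hourly$hour <- as.integer(format(parsed, "%H"))

# Average intensity for each hour of the day
avg_by_hour <- aggregate(total_intensity ~ hour, data = hourly, FUN = mean)
print(avg_by_hour)
```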
ggplot(data=weight_log_data_clean, aes(x=date, y=weight_pounds, group=id))+
geom_line(linewidth=1.5, color="dodgerblue3")+
labs(title="Weight (lbs) over Time",
x="Date",
y="Weight in Pounds")+
annotate("text", x=I(15), y=I(0.55), label="Each line represents a unique individual's weight data", color="darkblue")+
theme(axis.text.x = element_text(angle=90), legend.position="none")
This plot shows:
A very slight decrease in weight over time for two of the users.
A slight increase in weight for one user.
Inconclusive weight gain or loss for two users.
The data sample size is too small to generate any conclusive actionable insights.
The average body mass index (BMI) of 25.19 sits just inside the overweight range, which may suggest that people looking to get into better shape are more likely to use a FitBit to track exercise activity. As this is based on only eight users’ weight logs, the observation must be considered inconclusive at this time.
Users tend to spend most of their time engaged in light activity with short bursts of higher-intensity activity.
Most user activity occurs on Saturday and the least on Sunday, although the most “Very Active” minutes are logged on Monday. This suggests focused, high-intensity exercise at the start of the typical work week.
User activity seems to take place most often just after work hours (5pm to 7pm) or right around lunch time (12 noon to 2pm).
Daily steps and very active minutes both show a solid positive correlation with daily calories burned.
Average hourly intensity shows a strong positive correlation with hourly calories burned.
Users average roughly seven hours of sleep a night, a healthy amount for adults.
The small sample size is a limitation, especially for weight data: only eight users logged weights, and only two of those logged weights throughout the tracking period.
The smaller-than-expected sample sizes for heart rate (14 out of 33 users) and weight logs (8 out of 33 users) imply a need to better inform users of the importance of tracking those attributes. Bellabeat may consider incentivizing such tracking.
Additional user information such as age, gender, and occupational hours could help in providing more substantive data correlations and more customizable exercise plans.
One user displayed a heart rate considered very high, at 203 bpm. Bellabeat should consider adding in a visual and/or audio alert when user heart rates exceed a maximum healthy heart rate for their age.
The lack of consistent weight log data may be improved with dedicated app integration and connectivity with Bluetooth-connected scales.
Bellabeat should consider adding in periodic reminders for users to track not just physical activity, but heart rate and weight data. Perhaps kudos, achievements, or badges could be given to users as a reward for providing more comprehensive health tracking.
Data strongly suggests that higher-intensity activity directly correlates with higher caloric burn rates. Bellabeat should consider encouraging custom, high-intensity exercise plans for those looking to burn the most calories.
Bellabeat should provide an option for app users to receive periodic reminders for daily movement and activity, as even light activity correlates with an increase in caloric burn. This would be most effective if targeted for activity during lunch breaks or at the end of the work day, as these times show the highest active intensity rates.
Ultimately, further analysis of fitness tracker users requires both a larger sample size and additional data. As is, current data is enough to provide broad guidelines for actionable steps, but further data gathering and analysis is recommended.