Company Summary

Bellabeat was founded in 2013 by Urška Sršen and Sando Mur. With offices all over the world, it is a technology-based company geared towards women. It offers a wide array of products that help monitor its clients' activity, sleep, stress and reproductive health.

Stakeholders

Business Task

Data Sources

Prepare

The Excel files used are dailyActivity_merged and weightLogInfo_merged. They come from the Fitbit Fitness Tracker Data dataset, which was obtained from Kaggle. The dataset contains information on 33 participants over a date range of 2 months (03/12/16 - 05/12/16). It has 18 tables covering the participants' daily activities, including physical activity, BMI measures, heart rate, steps taken, and sleep monitoring, to name a few.

Excel

R

Dataset Limitations

Analysis

Activity Report

For the following analysis, the dailyActivity_merged.xlsx file was used. This table accounts for all activities over a 2-month period (April - May). Below is a summary of the totals for both months.

Month  Individuals  Observations  Steps      Distance.km  Active.hrs  Calories
Apr    33           611           4,772,721  3,418.81     12,622.43   14,313.57
May    30           329           2,406,915  1,728.02     6,471.367   7,340.36

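There were 33 participants in total, but only 30 recorded data in May. Notably, the totals for steps, distance, active hours and calories all fell by almost 50% in May. The number of observations fell by a similar proportion (611 to 329), which accounts for much of the drop; further analysis doesn't show why fewer observations were recorded. In total terms, participants logged more activity in April.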

Active hours is the sum of 4 categories: Very Active, Fairly Active, Lightly Active and Sedentary minutes. These were all recorded in minutes but were converted to hours to make the data easier to understand. Unfortunately, the table doesn't break down how many steps, how much distance or how many calories were attributable to each activity level, so they all had to be summed together. For the calories category, the totals were divided by 100 to make the graphs easier to read.
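The conversion described above can be sketched in a few lines. This is a Python illustration only (the original analysis was done in Excel and R); the column names are assumed to match the Kaggle file, and the sample rows are made up for the example.

```python
# Assumed column names for the four minute categories in dailyActivity_merged.
MINUTE_COLUMNS = (
    "VeryActiveMinutes",
    "FairlyActiveMinutes",
    "LightlyActiveMinutes",
    "SedentaryMinutes",
)


def active_hours(row):
    """Sum the four minute categories of one daily record and convert to hours."""
    total_minutes = sum(row[col] for col in MINUTE_COLUMNS)
    return total_minutes / 60


def scaled_calories(total_calories, divisor=100):
    """Scale a calorie total down (by 100 in the report) for easier plotting."""
    return total_calories / divisor


# Hypothetical daily records, not taken from the dataset.
sample_rows = [
    {"VeryActiveMinutes": 30, "FairlyActiveMinutes": 15,
     "LightlyActiveMinutes": 195, "SedentaryMinutes": 720, "Calories": 2000},
    {"VeryActiveMinutes": 0, "FairlyActiveMinutes": 0,
     "LightlyActiveMinutes": 180, "SedentaryMinutes": 1020, "Calories": 1800},
]

total_hours = sum(active_hours(row) for row in sample_rows)
total_calories = sum(row["Calories"] for row in sample_rows)

print(f"Active hours: {total_hours:.2f}")
print(f"Calories (divided by 100): {scaled_calories(total_calories):.2f}")
```

Keeping the divisor as a parameter makes it obvious that the Calories column in the summary table is a scaled value and not a raw total.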

Active Averages per Week
Date.Month  Weekly.cases  Steps  Very.active  Moderate.active  Sedentary.active
Apr         2016-04-10    7,870  0.37         3.5              17
Apr         2016-04-17    7,790  0.38         3.5              17
Apr         2016-04-24    7,780  0.35         3.6              17
May         2016-05-01    7,520  0.34         3.4              17
May         2016-05-08    6,990  0.31         3.1              15

According to the Centers for Disease Control and Prevention (CDC), 10,000 steps per day are recommended. For April and May this target was not met; instead, the weekly averages ranged from 6,990 to 7,870 steps per day, or roughly 70%-79% of the target.

Also, a person should be getting 75 mins/1.25 hrs of vigorous activity per week. The participants did not meet this; in fact, less than half of that amount was reached.

For moderate activity, the CDC recommends getting 150 mins/2.5 hours per week. This was not only met but exceeded for both months.

Sedentary activity was well over the recommended range of 7-10 hours per day. The average was 15-17 hours.
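The step comparison above is simple arithmetic and can be checked with a short Python sketch. The weekly averages are copied from the table in this report; the helper name share_of_target is hypothetical.

```python
DAILY_STEP_TARGET = 10_000  # daily step target cited in the report

# Average daily steps per week, taken from the weekly averages table.
weekly_average_steps = {
    "2016-04-10": 7870,
    "2016-04-17": 7790,
    "2016-04-24": 7780,
    "2016-05-01": 7520,
    "2016-05-08": 6990,
}


def share_of_target(value, target):
    """Return value as a percentage of target."""
    return value / target * 100


shares = {
    week: share_of_target(steps, DAILY_STEP_TARGET)
    for week, steps in weekly_average_steps.items()
}

for week, share in shares.items():
    print(f"Week of {week}: {share:.1f}% of the daily step target")
```

The same helper could be reused for the other guideline comparisons, provided the measured value and the target are expressed in the same unit and over the same period.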

The graph below indicates on which day of the week the participants were most active. For April it was Wednesday, and for May it was Tuesday. Note that the amount of activity dropped significantly in May; as stated in the first summary, it fell by almost half. Participants therefore logged more activity in April.



The graph below gives a clearer indication of how active the participants really were, by category, along with the percentages. As stated in the first data output, the totals for each category fell by almost 50%, and the graph shows this as well. Participants were sedentary 81% of the time. Even combined, all the other categories do not come close to the amount of time the participants were inactive.



Below is a graphical representation of the relationship between the total number of kilometers (km) traveled and calories burned. An analysis was done to see whether a positive or negative linear correlation exists between the two. In April there was a positive linear correlation of 0.627, and in May it was 0.655. Since both values are positive and between 0.5 and 0.7, the two variables are considered moderately correlated. A value of 1 means a perfect positive correlation between 2 variables; in this example, the correlation was slightly stronger in May than in April.
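For readers who want to reproduce this kind of check, here is a minimal Python sketch of the Pearson correlation coefficient (the original figures were produced in R). The sample values are invented for illustration. Only the 0.5 to 0.7 "moderate" band comes from this report; the other cut-offs in describe_strength are assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return covariance / sqrt(spread_x * spread_y)


def describe_strength(r):
    """Label the strength of a correlation; only the 'moderate' band is from the report."""
    magnitude = abs(r)
    if magnitude >= 0.7:
        return "strong"      # assumed cut-off
    if magnitude >= 0.5:
        return "moderate"    # band used in the report
    if magnitude >= 0.3:
        return "weak"        # assumed cut-off
    return "negligible"      # assumed cut-off


# Hypothetical daily values, not taken from the dataset.
distance_km = [2.0, 4.5, 6.1, 8.3, 5.0]
calories = [1800, 2100, 2300, 2900, 2000]

r = pearson_r(distance_km, calories)
print(f"r = {r:.3f} ({describe_strength(r)})")
print("April:", describe_strength(0.627), "| May:", describe_strength(0.655))
```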





BMI Report

Body Mass Index (BMI) is a measure of body fat based on height and weight for adult women and men. The following table shows all the categories.

BMI Category  BMI Range
Underweight   Below 18.5
Healthy       Between 18.5 & 24.9
Overweight    Between 25.0 & 29.9
Obese         Greater than or equal to 30
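The category table maps directly onto a small lookup function. The Python sketch below is illustrative only and assumes half-open boundaries (for example, a BMI of 24.95 is treated as Healthy), since the table leaves small gaps between the ranges.

```python
def bmi_category(bmi):
    """Return the BMI category for a BMI value, using the ranges in the table."""
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25.0:   # assumption: values between 24.9 and 25.0 count as Healthy
        return "Healthy"
    if bmi < 30.0:   # assumption: values between 29.9 and 30.0 count as Overweight
        return "Overweight"
    return "Obese"


# Maximum BMI values reported per category later in this report.
for value in (24.4, 28.0, 47.5):
    print(value, "->", bmi_category(value))
```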

For the following analysis, the weightLogInfo_merged.xlsx table was loaded into R and renamed weight loss. This was done in order to extract and manipulate the information needed to produce the following tables and graph.

The table below gives a summary for each BMI category: the max and average BMI, the max and average weight in pounds, the number of distinct individuals and the total number of observations. For April and May combined, there are 67 observations from 8 individuals. The max weight ranges from 138 lbs to 294 lbs, and the average weight ranges from 134 lbs to 294.3 lbs. The max BMI ranges from 24.4 to 47.5, and the average BMI ranges from 23.8 to 47.5. Note that since there is only 1 individual and 1 observation in the obese category, its max and average values are essentially the same across the board. With more observations, that category could have been better understood.

Max BMI Categories
Category    Max.BMI  Avg.BMI  Max.Weight  Avg.Weight  Individuals  Total.Observations
Healthy     24.4     23.8     138         134.0       3            34
Obese       47.5     47.5     294         294.3       1            1
Overweight  28.0     26.0     200         181.0       4            32
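A summary table of this shape can be built with a simple group-by. The sketch below is a Python illustration under stated assumptions, not the R code used for the report: the field names (id, category, bmi, weight_lb) and the sample records are hypothetical.

```python
from collections import defaultdict


def summarize_by_category(records):
    """Group weight-log records by BMI category and compute summary statistics."""
    groups = defaultdict(list)
    for record in records:
        groups[record["category"]].append(record)

    summary = {}
    for category, rows in groups.items():
        bmis = [row["bmi"] for row in rows]
        weights = [row["weight_lb"] for row in rows]
        summary[category] = {
            "max_bmi": max(bmis),
            "avg_bmi": sum(bmis) / len(bmis),
            "max_weight": max(weights),
            "avg_weight": sum(weights) / len(weights),
            "individuals": len({row["id"] for row in rows}),  # distinct participants
            "observations": len(rows),
        }
    return summary


# Hypothetical records, not taken from the dataset.
records = [
    {"id": "A", "category": "Healthy", "bmi": 23.0, "weight_lb": 130.0},
    {"id": "A", "category": "Healthy", "bmi": 24.0, "weight_lb": 134.0},
    {"id": "B", "category": "Overweight", "bmi": 27.0, "weight_lb": 180.0},
]

summary = summarize_by_category(records)
for category in sorted(summary):
    print(category, summary[category])
```

Counting distinct ids separately from rows is what distinguishes the Individuals column from Total.Observations: one participant can contribute many weigh-ins.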

This is a graphical representation of the BMI observations per month for each category. In April, there were more observations for people who were overweight than for those who were healthy or obese. In May this changed: healthy observations increased and overweight observations decreased. In addition, as stated in the previous analysis, the obese category had only 1 individual and 1 observation, recorded in April, so the number of obese individuals and observations dropped in May.



Conclusion

Analyzing and visualizing the Fitbit dataset gives a clear indication that measuring and recording different types of activity does reveal a person's health habits. Therefore, it would greatly benefit Bellabeat to invest in fitness tracking and pursue it more aggressively. Moreover, the company should create its own focus group and conduct its own statistical analysis. This would help it become more familiar with its core audience and stand out within the industry.

A few recommendations should be considered:

  • The number of participants should be greater than 100.
  • Participants should be categorized within age groups. For example, between the ages of 20-29, 30-39 etc.
  • Provide different fitness goals. For example, weight loss, strength training etc.
  • Indicate which exercises were done and how many calories were used.
  • Track food throughout the day and the number of calories in each meal/snack.