Bellabeat was founded in 2013 by Urška Sršen and Sandro Mur. With offices around the world, it is a technology-based company geared towards women. It offers a range of products that help clients monitor their activity, sleep, stress, and reproductive health.
The Excel files used are dailyActivity_merged and weightLogInfo_merged. The dataset is Fitbit Fitness Tracker Data, obtained from Kaggle. It contains information on 33 participants over a date range of 2 months (03/12/16 - 05/12/16) and has 18 tables covering participants' daily activities, including physical activity, BMI measures, heart rate, steps taken, and sleep monitoring, to name a few.
Excel
R
Dataset Limitations
For the following analysis, the dailyActivity_merged.xlsx file was used. This table accounts for all activities over a 2-month period (April - May). Below is a summary of the totals for both months.
| Month | Individuals | Observations | Steps | Distance.km | Active.hrs | Calories |
|---|---|---|---|---|---|---|
| Apr | 33 | 611 | 4,772,721 | 3,418.81 | 12,622.43 | 14,313.57 |
| May | 30 | 329 | 2,406,915 | 1,728.02 | 6,471.37 | 7,340.36 |
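There were 33 participants in total, but 3 of them have no records in May (30 individuals). Notably, the totals for each category fell by almost 50% in May. The number of observations also fell, from 611 to 329, so the smaller totals partly reflect fewer recorded days; further analysis doesn't show why this occurred. On the recorded totals, participants were more active in April.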
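The size of the drop can be checked directly from the monthly totals in the table above. The snippet below is a minimal Python sketch (the original analysis was done in Excel and R); the dictionary keys and the `percent_change` helper are illustrative names, not part of the dataset.

```python
# Monthly totals copied from the summary table above.
# Key names are illustrative; "calories_100" is calories already divided by 100.
april = {
    "steps": 4_772_721,
    "distance_km": 3_418.81,
    "active_hrs": 12_622.43,
    "calories_100": 14_313.57,
}
may = {
    "steps": 2_406_915,
    "distance_km": 1_728.02,
    "active_hrs": 6_471.37,
    "calories_100": 7_340.36,
}


def percent_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100


# Percentage change from April to May for every category.
changes = {key: round(percent_change(april[key], may[key]), 1) for key in april}

for key, value in changes.items():
    print(f"{key}: {value}%")
```

Every category comes out a little under a 50% decrease, which matches the "almost 50%" reading of the table.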
Active hours is the sum of 4 categories: Very Active, Fairly Active, Lightly Active, and Sedentary minutes. These were recorded in minutes and converted to hours to make the data easier to read. Unfortunately, the table doesn't break down how many steps, how much distance, or how many calories are attributable to each activity, so they all had to be summed together. For the calories category, the totals were divided by 100 to make the graphs easier to read.
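The conversion described above can be sketched as follows. This is a minimal Python illustration, not the original Excel or R workflow; the function names, parameter names, and sample values are assumptions made for the example.

```python
def active_hours(very_active_min, fairly_active_min, lightly_active_min, sedentary_min):
    """Sum the four minute categories and convert the total to hours."""
    total_minutes = (
        very_active_min + fairly_active_min + lightly_active_min + sedentary_min
    )
    return total_minutes / 60


def scale_calories(calories, divisor=100):
    """Scale calories down (by 100 here) so they plot on a similar axis."""
    return calories / divisor


# Made-up values for a single day, used only to show the arithmetic.
example_hours = active_hours(30, 15, 200, 715)  # 960 minutes in total
example_calories = scale_calories(1800)

print(example_hours, example_calories)
```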
| Date.Month | Weekly.cases | Steps | Very.active | Moderate.active | Sedentary.active |
|---|---|---|---|---|---|
| Apr | 2016-04-10 | 7,870 | 0.37 | 3.5 | 17 |
| Apr | 2016-04-17 | 7,790 | 0.38 | 3.5 | 17 |
| Apr | 2016-04-24 | 7,780 | 0.35 | 3.6 | 17 |
| May | 2016-05-01 | 7,520 | 0.34 | 3.4 | 17 |
| May | 2016-05-08 | 6,990 | 0.31 | 3.1 | 15 |
According to the Centers for Disease Control and Prevention (CDC), 10,000 steps are recommended per day. For April and May, this target was not met: the weekly averages ranged from 6,990 to 7,870 steps, or roughly 70%-79% of the daily target.
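The share of the step target reached each week follows from the averages in the table above. The sketch below is illustrative Python; `DAILY_STEP_TARGET` and the dictionary layout are names chosen for the example.

```python
# Step target cited in the text.
DAILY_STEP_TARGET = 10_000

# Average steps per week, copied from the weekly table above.
weekly_avg_steps = {
    "2016-04-10": 7_870,
    "2016-04-17": 7_790,
    "2016-04-24": 7_780,
    "2016-05-01": 7_520,
    "2016-05-08": 6_990,
}

# Percentage of the target reached in each week.
pct_of_target = {
    week: round(steps / DAILY_STEP_TARGET * 100, 1)
    for week, steps in weekly_avg_steps.items()
}

for week, pct in pct_of_target.items():
    print(f"{week}: {pct}% of target")
```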
Also, a person should get 75 minutes (1.25 hours) of vigorous activity per week. The participants did not meet this: the very-active averages in the table (0.31-0.38 hours) are less than half of that amount.
For moderate activity, the CDC recommends 150 minutes (2.5 hours) per week. This was not only met but exceeded in both months.
Sedentary activity was well over the recommended range of 7-10 hours per day. The average was 15-17 hours.
The graph below shows on which day of the week the participants were most active: Wednesday in April and Tuesday in May. Note that the amount of activity dropped significantly in May; as stated in the first summary, it fell by almost half. Therefore, participants were more active in April.
The graph below gives a clearer picture of how active the participants really were, by category and with percentages. As stated in the first data output, the totals for each category fell by almost 50%, and the graph shows this as well. Participants were sedentary 81% of the time. Even combined, all the other categories do not come close to the sedentary share.
The graph below shows whether there is a relationship between the total number of kilometers (km) traveled and calories burned. An analysis was done to see whether a positive or negative linear correlation exists between the two. The correlation coefficient was 0.627 in April and 0.655 in May. Since both values are positive and between 0.5 and 0.7, the two variables are considered moderately positively correlated. A value of 1 means a perfect positive correlation between 2 variables, so the correlation was slightly stronger in May than in April.
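For readers who want to see how such a coefficient is computed, here is a minimal Python sketch of the Pearson correlation. It is not the code used for this analysis (that was done in R), and the sample inputs are made up. Only the 0.5-0.7 "moderate" band comes from the text; the other cutoffs in `describe_strength` are an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


def describe_strength(r):
    """Label a coefficient; only the 0.5-0.7 band is taken from the text."""
    size = abs(r)
    if size >= 0.7:
        return "strong"  # assumed cutoff
    if size >= 0.5:
        return "moderate"  # band used in the text
    return "weak"  # assumed cutoff


# Made-up distance (km) and calorie values, only to exercise the function.
distance_km = [2.0, 4.5, 6.1, 8.0, 9.3]
calories = [1700, 1950, 2100, 2050, 2600]
r = pearson_r(distance_km, calories)
print(round(r, 3), describe_strength(r))

# The coefficients reported in the text both fall in the moderate band.
print(describe_strength(0.627), describe_strength(0.655))
```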
Body Mass Index (BMI) is a measure of body fat based on height and weight for adult women and men. The following table shows all the categories.
| Category | BMI range |
|---|---|
| Underweight | Below 18.5 |
| Healthy | 18.5 - 24.9 |
| Overweight | 25.0 - 29.9 |
| Obese | 30.0 or above |
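The category boundaries in the table can be expressed as a small lookup function. This is an illustrative Python sketch; `bmi_category` is a name chosen for the example. It uses `< 25.0` and `< 30.0` as cutoffs so that values between the table's one-decimal ranges (for example 24.95) still land in a category.

```python
def bmi_category(bmi):
    """Map a BMI value to the category names used in the table above."""
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25.0:
        return "Healthy"
    if bmi < 30.0:
        return "Overweight"
    return "Obese"


# A few sample BMI values spanning the four categories.
for value in (18.4, 23.8, 26.0, 47.5):
    print(value, bmi_category(value))
```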
For the following analysis, the weightLogInfo_merged.xlsx table was loaded into R and renamed weight loss, in order to extract and manipulate the information needed to produce the following tables and graph.
The table below summarizes each BMI category: the maximum and average BMI, the maximum and average weight in pounds, the number of distinct individuals, and the total observations. For April and May there are 67 observations from 8 individuals. The maximum weight ranges from 138 lbs to 294 lbs, and the average weight from 134 lbs to 294 lbs. The maximum BMI ranges from 24.4 to 47.5, and the average from 23.8 to 47.5. Note that since there is only 1 individual and 1 observation in the obese category, its maximum and average values are essentially the same. More observations would have given a better understanding of that category.
| Category | Max.BMI | Avg.BMI | Max.Weight | Avg.Weight | Individuals | Total.Observations |
|---|---|---|---|---|---|---|
| Healthy | 24.4 | 23.8 | 138 | 134.0 | 3 | 34 |
| Obese | 47.5 | 47.5 | 294 | 294.3 | 1 | 1 |
| Overweight | 28.0 | 26.0 | 200 | 181.0 | 4 | 32 |
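A per-category summary like the one above can be produced with a simple group-and-aggregate step. The sketch below is plain Python rather than the R code used for the analysis, and the rows are made-up values, not records from the dataset; all names are illustrative.

```python
from collections import defaultdict


def bmi_category(bmi):
    """Same category boundaries as the BMI table."""
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25.0:
        return "Healthy"
    if bmi < 30.0:
        return "Overweight"
    return "Obese"


def summarize(rows):
    """Group (participant_id, bmi, weight_lb) rows by BMI category."""
    groups = defaultdict(list)
    for participant_id, bmi, weight_lb in rows:
        groups[bmi_category(bmi)].append((participant_id, bmi, weight_lb))

    summary = {}
    for category, items in groups.items():
        bmis = [bmi for _, bmi, _ in items]
        weights = [weight for _, _, weight in items]
        summary[category] = {
            "max_bmi": max(bmis),
            "avg_bmi": sum(bmis) / len(bmis),
            "max_weight": max(weights),
            "avg_weight": sum(weights) / len(weights),
            "individuals": len({pid for pid, _, _ in items}),
            "observations": len(items),
        }
    return summary


# Hypothetical rows for illustration only (NOT taken from the dataset).
rows = [
    ("A", 23.5, 132.0),
    ("A", 24.0, 136.0),
    ("B", 26.0, 180.0),
    ("C", 28.0, 200.0),
]

summary = summarize(rows)
for category, stats in summary.items():
    print(category, stats)
```

Counting distinct participant IDs separately from rows is what distinguishes the Individuals column from Total.Observations.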
This graph shows the BMI observations per month, by category. In April, there were more observations for overweight participants than for healthy or obese ones. In May this changed: healthy observations increased and overweight observations decreased. In addition, as stated in the previous analysis, the obese category had only 1 individual and 1 observation, recorded in April, so the number of obese individuals and observations dropped to zero in May.
Analyzing and visualizing the Fitbit dataset gives a clear indication that measuring and recording different types of activity reveals a person's health habits. It would therefore greatly benefit Bellabeat to invest in fitness tracking and pursue it more aggressively. Moreover, the company should create its own focus group and conduct its own statistical analysis. This would help it become more familiar with its core audience and stand out within the industry.
A few recommendations should be considered: