Bellabeat - Spring (Smart Water Bottle) marketing recommendations through data analysis

Introduction: Sršen asks us to analyze smart device usage data to gain insight into how consumers use non-Bellabeat smart devices. She then wants us to select one Bellabeat product and apply those insights to it in our presentation. The product chosen here is Spring, the smart water bottle.

Stakeholders:

- Founders Urška Sršen and Sando Mur
- Online retailers
- Web channels
- Digital marketers and social networking channels such as Facebook, Instagram, Twitter, Google Search, Google Display Network, etc.
- Sales & Marketing team

Product:

Spring - water bottle https://bellabeat.com/product/spring/

The trends:

Wearables are becoming more common and more functional, especially among sporty people. According to AF Arkbar’s research (1) on people in Indonesia, most are not aware of their fluid intake, and over 47% of a sample of 1,200 were found to have dehydration issues. Adequate water intake is especially important for elderly people, as it helps build the resilience of the respiratory organs. Dehydration is also associated with a high prevalence of plasma hypertonicity among community-dwelling older adults (2), and it can lead to a decline in mood and performance. In addition, dehydration can cause heart rate issues such as palpitations, because the amount of blood circulating through the body (blood volume) decreases when one is dehydrated (3).

Spring’s app and smart technology can calculate the optimal amount of water for the user’s body and remind users to drink, based on the user’s age, height, weight, local weather, activity level, and pregnancy or breastfeeding status. As a smart bottle that helps users avoid dehydration and establish and maintain a healthy hydration habit, Spring is therefore a product worth developing in the market. More research on smart water bottles has also been carried out since 2017.

Data Sources:

Several smart bottle review websites, along with the respective products’ own websites, were reviewed first to learn more about the smart bottle players in the market:

- Best Smart Water Bottle of 2021 – 10 Great Smart Water Bottles Reviewed – Solid Guides
- 10 Smart Water Bottles to Use in 2020 | Food & Wine (foodandwine.com)
- https://hiconsumption.com/best-smart-water-bottles/

Macro data on activity, intensity, sleep, heart rate, and calories may also reflect a user’s overall health condition, which should be taken into account when estimating the need for fluid intake. Here we can reference FitBit Fitness Tracker Data | Kaggle.

The daily activity, daily intensity, and heart rate (seconds) tables would be loaded for analysis.

Index

(1) https://www.researchgate.net/publication/332516479_Smart_bottle_work_design_using_waterflow_sensor_based_on_Raspberry_Pi_and_Android/fulltext/5cb924e54585156cd7a25e6b/Smart-bottle-work-design-using-waterflow-sensor-based-on-Raspberry-Pi-and-Android.pdf?origin=publication_detail

(2) Stookey JD. “High prevalence of plasma hypertonicity among community-dwelling older adults: results from NHANES III.” J Am Diet Assoc. 2005 Aug; 105(8):1231-9. Sourced from A Container-Attachable Inertial Sensor for Real-Time Hydration Tracking (nih.gov)

(3) https://share.upmc.com/2014/09/importance-hydration-heart/

Data ROCCC:

So far I have tried to find data from renowned websites that describe the methodology behind their ratings. Some data may still be missing for various reasons, such as too few reviews or the writer’s selective input. Some reviews seem biased or reflect a lack of technical or product knowledge, and some reviewers may be advertisers themselves. For my own research, I reviewed about six smart water bottle websites and narrowed them down to three with somewhat different approaches, such as a more high-end, lifestyle-oriented site (to pair with Bellabeat’s positioning) and a more analytical one. Note that on Amazon, the number of ratings for a product as a whole differs from the number of ratings for each individual category of that product. For instance, while Bellabeat has 37 reviews overall, the ‘app’ category has just over 5 customer reviews and ‘value for money’ has only 2. For other smart bottles with apps, ‘value for money’ has around 12 to 30 reviews.

Goals:

1, Use the data sources to find trends in third parties’ market reviews and in consumer feedback. Data sources include three websites covering various smart water bottles, reviews from amazon.com, public datasets from Fitbit, etc.

2, Recommend marketing strategies based on existing market trends in price, capacity versus weight, consumer preferences, and product technology. These factors can be categorized as durability, value for money, advantages such as keeping drinks hot or cold, ease of use and cleaning, battery life, and app and app accuracy, versus self-cleaning functions such as disinfecting, anti-bacterial treatment, and purification. From there, the comparison narrows to app-based products similar to Bellabeat for product marketing, while the other tech products are kept for price comparison.

Smart_Water_Bottle_Review_for_new_plot %>% select(Products, price, mobile_app, app_accuracy, easy_use)

# A tibble: 9 x 5
Products price mobile_app app_accuracy easy_use
1 LARQ 95.00 0 NIL 4.70
2 Equa Smart Water 79.00 0 NIL NIL
3 CrazyCap Self Cleaning, UV water purifyer 70.00 0 NIL 4.70
4 Hydrate Spark Stainless Steel 21 oz 65.00 1 2.30 4.10
5 Bellabeat 60.40 1 4.00 3.00
6 UV Brite explorer self cleaning 20.3 oz 60.00 0 NIL NIL
7 Hydrate Spark 3 Tracks (not insultaed) 60.00 1 3.50 4.40
8 Philips Water GoZero UV Self-Cleaning Vacuum Insulated 57.80 1 NIL 4.50
9 Thermos 24 oz hydration bottle w smart lid 40.20 1 3.00 4.00

Here the highest-priced product, LARQ, has the joint-highest ease-of-use score (4.70, tied with CrazyCap) but no app. Among the products with a mobile app, Bellabeat has the highest app accuracy (4.00) at a mid-range price, in 5th place. Hydrate Spark 3 has the second-highest app accuracy (3.50) at a mid-to-low price, in 7th place, while Hydrate Spark Stainless Steel, with the lowest app accuracy (2.30), is priced above both, in 4th place. To a certain extent, this suggests that app technology does not necessarily command a high price.
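The comparison above can be sketched in base R. The values are copied from the table; the data frame name `bottles`, the shortened product names, and the choice of group means are assumptions made for illustration, not part of the original analysis.

```r
# Prices and app availability copied from the table above.
# `bottles` is a hypothetical name used only for this sketch.
bottles <- data.frame(
  product = c("LARQ", "Equa", "CrazyCap", "Hydrate Spark Stainless",
              "Bellabeat", "UV Brite", "Hydrate Spark 3",
              "Philips GoZero", "Thermos"),
  price = c(95.0, 79.0, 70.0, 65.0, 60.4, 60.0, 60.0, 57.8, 40.2),
  mobile_app = c(0, 0, 0, 1, 1, 0, 1, 1, 1),
  app_accuracy = c("NIL", "NIL", "NIL", "2.30", "4.00",
                   "NIL", "3.50", "NIL", "3.00"),
  stringsAsFactors = FALSE
)

# "NIL" is a text placeholder, so turn it into NA before converting to numeric.
bottles$app_accuracy[bottles$app_accuracy == "NIL"] <- NA
bottles$app_accuracy <- as.numeric(bottles$app_accuracy)

# Mean price for bottles without (0) and with (1) a mobile app.
mean_price_by_app <- tapply(bottles$price, bottles$mobile_app, mean)
print(mean_price_by_app)
```

In this small sample the app-equipped bottles average a lower price than those without an app, which is in line with the observation above. With nine products this is only a rough indication.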

Hypothesis:

1, PRICE WISE: At a glance, prices look relatively high because the products are sleek and technologically more advanced.

Analysis method: Check ‘price’ against other objective factors, skipping ‘amazon rating 5/5’ and ‘value for money’.

i, Can tech be a factor that supports a higher price? First, display price against ‘mobile app availability’, ‘app accuracy’, and ‘easy to use’ (the table above).

ii, Then filter those with mobile apps:

Smart_Water_Bottle_Review_for_new_plot %>% select(Products, price, mobile_app, app_accuracy, easy_use) %>% dplyr::filter(mobile_app == "1")

Products price mobile_app app_accuracy easy_use
1 Hydrate Spark Stainless Steel 21 oz 65.0 1 2.30 4.10
2 Bellabeat 60.4 1 4.00 3.00
3 Hydrate Spark 3 Tracks (not insultaed) 60.0 1 3.50 4.40
4 Philips Water GoZero UV Self-Cleaning Vacuum Insulated 57.8 1 NIL 4.50
5 Thermos 24 oz hydration bottle w smart lid 40.2 1 3.00 4.00

iii, Also compare the tech features keep_cold_hrs, easy_clean, and battery_life:

Smart_Water_Bottle_Review_for_new_plot %>% select(Products, price, keep_cold_hrs, easy_clean, battery_life)

Products price keep_cold_hrs easy_clean battery_life
1 LARQ 95 24.00 3.75 4.00
2 CrazyCap Self Cleaning, UV water purifyer 70.0 24.00 4.50 3.00
3 Hydrate Spark Stainless Steel 21 oz 65.0 24.00 4.50 3.00
4 Bellabeat 60.4 6.00 3.60 3.00
5 Hydrate Spark 3 Tracks (not insultaed) 60.0 9.00 4.30 2.50
6 Philips Water GoZero UV Self-Cleaning Vacuum Insulated 57.8 24.00 4.75 3.50

After removing the NIL rows, the high-priced LARQ has the best battery life (4.00) and is among the longest for keeping drinks cold (24 hours, which matters for sports use), but its easy-clean score (3.75) is the second lowest. Next is the Philips Water GoZero UV self-cleaning bottle, which has the highest easy-clean score, the second-best battery life, and a long keep-cold time, yet the lowest price of the six tech products. This again shows that tech does not necessarily support a high price, although the high-priced products do maintain a reasonable level on these features. For Bellabeat, it is noticeable that its keep-cold hours and easy-clean score are the lowest, while its price is in the mid-to-low range.

Thus, tech is not everything when it comes to commanding the highest price, but it is important for sustaining a higher price.
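One way to make the feature-by-feature comparison explicit is to rank the six products on each column. The sketch below uses base R with values copied from the table above; the names `tech`, `rank_desc`, and `tech_ranks`, and the shortened product names, are hypothetical and exist only for this example.

```r
# Six products with complete tech data, values copied from the table above.
tech <- data.frame(
  product = c("LARQ", "CrazyCap", "Hydrate Spark Stainless", "Bellabeat",
              "Hydrate Spark 3", "Philips GoZero"),
  price = c(95.0, 70.0, 65.0, 60.4, 60.0, 57.8),
  keep_cold_hrs = c(24, 24, 24, 6, 9, 24),
  easy_clean = c(3.75, 4.50, 4.50, 3.60, 4.30, 4.75),
  battery_life = c(4.0, 3.0, 3.0, 3.0, 2.5, 3.5),
  stringsAsFactors = FALSE
)

# Rank 1 = highest value; tied values share the best rank.
rank_desc <- function(x) rank(-x, ties.method = "min")

tech_ranks <- data.frame(
  product = tech$product,
  price_rank = rank_desc(tech$price),
  keep_cold_rank = rank_desc(tech$keep_cold_hrs),
  easy_clean_rank = rank_desc(tech$easy_clean),
  battery_rank = rank_desc(tech$battery_life),
  stringsAsFactors = FALSE
)
print(tech_ranks, row.names = FALSE)
```

Reading the price rank next to the feature ranks shows at a glance where a product is priced above or below what its features alone would suggest.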

2, RATING: So what affects rating? Before looking into the factors, let’s rank the products by rating.

What is the highest rating? 4.5

Smart_Water_Bottle_Review_for_plot[which.max(Smart_Water_Bottle_Review_for_plot$amazon_ratings_5), "amazon_ratings_5", drop = FALSE]

# A tibble: 1 x 1
amazon_ratings_5
1 4.5

Now we can look at price versus ratings. By rating, the highest-priced product also has the best rating, and well-rated products tend to have relatively high prices.

Smart_Water_Bottle_Review_for_new_plot %>% arrange(-amazon_ratings_5)

Products price amazon_ratings_5
1 LARQ 95 4.5
2 CrazyCap Self Cleaning, UV water purifyer 70.0 4.4
3 Hydrate Spark Stainless Steel 21 oz 65.0 4.4
4 Hydrate Spark 3 Tracks (not insultaed) 60.0 4.4
5 Philips Water GoZero UV Self-Cleaning Vacuum Insulated 57.8 4.1
6 Equa Smart Water 79 4
7 Bellabeat 60.4 3.8
8 Thermos 24 oz hydration bottle w smart lid 40.2 3.7
9 UV Brite explorer self cleaning 20.3 oz 60 NA

i, A deeper look at rating: a rating should be judged together with the number of ratings behind it, but due to limitations in data sourcing, some bias is unavoidable.

Products price amazon_ratings_5 no_of_ratings_100s
1 LARQ 95 4.5 8.29
2 CrazyCap Self Cleaning, UV water purifyer 70.0 4.4 11.86
3 Hydrate Spark Stainless Steel 21 oz 65.0 4.4 16.52
4 Hydrate Spark 3 Tracks (not insultaed) 60.0 4.4 37.26
5 Philips Water GoZero UV Self-Cleaning Vacuum Insulated 57.8 4.1 3.22
6 Equa Smart Water 79 4 0.02
7 Bellabeat 60.4 3.8 0.37
8 Thermos 24 oz hydration bottle w smart lid 40.2 3.7 11.50
9 UV Brite explorer self cleaning 20.3 oz 60 NA NIL

Here, the table shows that the highest-rated product has only a moderate number of ratings. The product with the most ratings is Hydrate Spark 3, followed by Hydrate Spark Stainless Steel. All of the Amazon ratings and rating counts were obtained from the Amazon website.

Sorting by the number of ratings helps identify the products whose high ratings are backed by many reviewers:

smart1 %>% mutate(no_of_ratings_100s = as.numeric(no_of_ratings_100s)) %>% arrange(-no_of_ratings_100s)

Products no_of_ratings_100s amazon_ratings_5
1 Hydrate Spark 3 Tracks (not insultaed) 37.3 4.4
2 Hydrate Spark Stainless Steel 21 oz 16.5 4.4
3 CrazyCap Self Cleaning, UV water purifyer 11.9 4.4
4 Thermos 24 oz hydration bottle w smart lid 11.5 3.7
5 LARQ 8.29 4.5
6 Philips Water GoZero UV Self-Cleaning Vacuum Insulated 3.22 4.1
7 Bellabeat 0.37 3.8
8 Equa Smart Water 0.02 4
9 UV Brite explorer self cleaning 20.3 oz NA NA

From here we can tell that Hydrate Spark 3, Hydrate Spark Stainless Steel, and the CrazyCap self-cleaning bottle have the highest numbers of ratings while also attaining relatively high ratings, though not the highest. Whether Bellabeat chooses to compete with them directly or to avoid direct competition, these products are good references.
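Rating and number of ratings can also be combined into a single score. The sketch below uses a simple shrinkage (Bayesian-average) adjustment, which is a technique swapped in here for illustration and not something the original analysis used. The prior weight of 100 ratings is an arbitrary assumption, and the names `ratings`, `prior_mean`, and `prior_weight` are hypothetical.

```r
# Ratings and rating counts (in hundreds) copied from the tables above.
# UV Brite is left out because it has no rating.
ratings <- data.frame(
  product = c("LARQ", "CrazyCap", "Hydrate Spark Stainless", "Hydrate Spark 3",
              "Philips GoZero", "Equa", "Bellabeat", "Thermos"),
  rating = c(4.5, 4.4, 4.4, 4.4, 4.1, 4.0, 3.8, 3.7),
  n_100s = c(8.29, 11.86, 16.52, 37.26, 3.22, 0.02, 0.37, 11.50),
  stringsAsFactors = FALSE
)

n <- ratings$n_100s * 100            # back to raw counts
prior_mean <- mean(ratings$rating)   # average rating across products
prior_weight <- 100                  # assumption: prior counts as 100 ratings

# Products with few ratings are pulled toward the overall mean;
# products with many ratings keep a score close to their own rating.
ratings$adjusted <- (n * ratings$rating + prior_weight * prior_mean) /
  (n + prior_weight)

ratings <- ratings[order(-ratings$adjusted), ]
print(ratings[, c("product", "rating", "adjusted")], row.names = FALSE)
```

A larger `prior_weight` penalizes sparsely rated products more strongly, so the ordering should be read as sensitive to that choice.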

ii, Let’s look at rating versus all other features:

Smart_Water_Bottle_Review_for_new_plot %>% arrange(-amazon_ratings_5) %>% select(Products, price, amazon_ratings_5, no_of_ratings_100s, app_accuracy, easy_clean, battery_life)

Products price amazon_ratings_5 no_of_ratings_1~ app_accuracy easy_clean battery_life
1 LARQ 95 4.5 8.29 NIL 3.75 4.00
2 CrazyCap Self Cleaning, UV water purifyer 70.0 4.4 11.86 NIL 4.50 3.00
3 Hydrate Spark Stainless Steel 21 oz 65.0 4.4 16.52 2.30 4.50 3.00
4 Hydrate Spark 3 Tracks (not insultaed) 60.0 4.4 37.26 3.50 4.30 2.50
5 Philips Water GoZero UV Self-Cleaning Vacuum Insulated 57.8 4.1 3.22 NIL 4.75 3.50
6 Equa Smart Water 79 4 0.02 NIL NIL NIL
7 Bellabeat 60.4 3.8 0.37 4.00 3.60 3.00
8 Thermos 24 oz hydration bottle w smart lid 40.2 3.7 11.50 3.00 4.20 4.00
9 UV Brite explorer self cleaning 20.3 oz 60 NA NIL NIL NIL NIL

Some of the highest-rated products do not have apps. Among the app-equipped, more all-round products, Hydrate Spark 3, Hydrate Spark Stainless Steel, and Bellabeat combine reasonable ratings with a broad feature set, and so compare relatively well regardless of price and the Amazon rating itself.

Having said that, features besides apps are clearly important, such as ease of cleaning and battery life. The self-cleaning bottles are more expensive yet relatively popular (by number of ratings, ranking, and value-for-money response). That might suggest that materials and function matter, and that the trend is to put cleanliness and health ahead of price, as long as consumers think the product is worth it.

As another marketing strategy consideration, Bellabeat might therefore put more emphasis on materials that support cleanliness.

Also, the ability to keep water cold for longer seems to be welcomed in reviews, especially by people who exercise. This seems to be one of the components that support higher price and popularity. Since Bellabeat offers relatively weak keep_cold performance, this is a strategic factor to consider too.

3, Quality: According to the Amazon ratings again, more features, better materials, or a unique design can make a product pricier, but not necessarily more popular. It is more useful to find the optimal level of features for the price. Beyond the general reviews, the quality of the features is also important in supporting a higher price.

The analysis can be based on ‘easy to use’, ‘battery life’, ‘durability’, etc., regardless of whether a product has an app, a heating/cooling function, or self-cleaning tech.

Let’s use ‘value for money’ to rank the products:

Smart_Water_Bottle_Review_for_new_plot %>% mutate(value_for_money = as.numeric(value_for_money)) %>% arrange(-value_for_money) %>% select(Products, amazon_ratings_5, battery_life, easy_use, durability, value_for_money)

Products amazon_ratings_5 battery_life easy_use durability value_for_money
1 Hydrate Spark Stainless Steel 21 oz 4.4 3.00 4.10 3.50 3.8
2 Hydrate Spark 3 Tracks (not insultaed) 4.4 2.50 4.40 3.90 3.7
3 CrazyCap Self Cleaning, UV water purifyer 4.4 3.00 4.70 4.30 3.5
4 Philips Water GoZero UV Self-Cleaning Vacuum Insulated 4.1 3.50 4.50 4.00 3.5
5 LARQ 4.5 4.00 4.70 4.40 2.75
6 Thermos 24 oz hydration bottle w smart lid 3.7 4.00 4.00 3.80 2.5
7 Bellabeat 3.8 3.00 3.00 5.00 2
8 Equa Smart Water 4 NIL NIL NIL NA
9 UV Brite explorer self cleaning 20.3 oz NA NIL NIL NIL NA

From this set, ‘battery life’ and ‘easy use’ appear to be fairly well aligned with value for money, while durability does not move in the same direction.

As such, let’s explore further which factors are more likely to drive value for money:

Smart_Water_Bottle_Review_for_new_plot %>% mutate(value_for_money = as.numeric(value_for_money)) %>% arrange(-value_for_money) %>% select(Products, amazon_ratings_5, easy_clean, easy_use, keep_hot_hrs, keep_cold_hrs, battery_life, app_accuracy, value_for_money)

Products amazon_ratings_5 easy_clean easy_use keep_hot_hrs keep_cold_hrs battery_life app_accuracy value_for_money
1 Hydrate Spark Stainless Steel~ 4.4 4.50 4.10 0.00 24.00 3.00 2.30 3.8
2 Hydrate Spark 3 Tracks (not i~ 4.4 4.30 4.40 NIL 9.00 2.50 3.50 3.7
3 CrazyCap Self Cleaning, UV wa~ 4.4 4.50 4.70 12.00 24.00 3.00 NIL 3.5
4 Philips Water GoZero UV Self-~ 4.1 4.75 4.50 12.00 24.00 3.50 NIL 3.5
5 LARQ 4.5 3.75 4.70 12.00 24.00 4.00 NIL 2.75
6 Thermos 24 oz hydration bottl~ 3.7 4.20 4.00 NIL NIL 4.00 3.00 2.5
7 Bellabeat 3.8 3.60 3.00 NIL 6.00 3.00 4.00 2
8 Equa Smart Water 4 NIL NIL 12.00 24.00 NIL NIL NA
9 UV Brite explorer self cleani~ NA NIL NIL 6.00 12.00 NIL NIL NA

Let’s look at the factors that may track value for money by plotting each one, faceted by product:

ggplot(data = Smart_Water_Bottle_Review_for_new_plot, mapping = aes(x = easy_clean, y = value_for_money)) + geom_point() + facet_wrap(~ Products, nrow = 2)

ggplot(data = Smart_Water_Bottle_Review_for_new_plot, mapping = aes(x = keep_cold_hrs, y = value_for_money)) + geom_point() + facet_wrap(~ Products, nrow = 2)

ggplot(data = Smart_Water_Bottle_Review_for_new_plot, mapping = aes(x = keep_hot_hrs, y = value_for_money)) + geom_point() + facet_wrap(~ Products, nrow = 2)

ggplot(data = Smart_Water_Bottle_Review_for_new_plot, mapping = aes(x = easy_use, y = value_for_money)) + geom_point() + facet_wrap(~ Products, nrow = 2)

ggplot(data = Smart_Water_Bottle_Review_for_new_plot, mapping = aes(x = battery_life, y = value_for_money)) + geom_point() + facet_wrap(~ Products, nrow = 2)

Looking at all products together, it is easy to spot which features are roughly linear with value for money: ‘easy_clean’ and ‘keep_cold_hrs’, followed by ‘battery_life’.
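A numeric check can complement the visual impression from the plots. The sketch below computes Spearman rank correlations between each feature and value for money in base R, using the seven products that have a value-for-money score in the table above. The data frame name `vfm` and the choice of Spearman correlation are assumptions made for this example.

```r
# Seven products with a value_for_money score, in table order
# (Hydrate Spark Stainless, Hydrate Spark 3, CrazyCap, Philips,
#  LARQ, Thermos, Bellabeat). "NIL" entries are recorded as NA.
vfm <- data.frame(
  easy_clean      = c(4.50, 4.30, 4.50, 4.75, 3.75, 4.20, 3.60),
  easy_use        = c(4.10, 4.40, 4.70, 4.50, 4.70, 4.00, 3.00),
  keep_cold_hrs   = c(24, 9, 24, 24, 24, NA, 6),
  battery_life    = c(3.00, 2.50, 3.00, 3.50, 4.00, 4.00, 3.00),
  value_for_money = c(3.8, 3.7, 3.5, 3.5, 2.75, 2.5, 2.0)
)

features <- setdiff(names(vfm), "value_for_money")

# Rank correlation of each feature with value_for_money,
# dropping rows where that feature is missing.
rho <- sapply(features, function(f) {
  cor(vfm[[f]], vfm$value_for_money,
      use = "complete.obs", method = "spearman")
})
print(round(sort(rho, decreasing = TRUE), 2))
```

With only seven rows, the coefficients are a rough guide to direction rather than evidence of a dependable relationship.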

Let’s create plots with the Products variable colored:

View(Smart_Water_Bottle_Review_for_new_plot)

library(RColorBrewer)
myColors <- brewer.pal(5, "Set1")
names(myColors) <- levels(smartqualityfactorsval$Products)
colScale <- scale_colour_manual(name = "Products", values = myColors)
p <- ggplot(smartqualityfactorsval, aes(x, y, colour = Products)) + geom_point()
p1 <- p + colScale

ggplot(data = Smart_Water_Bottle_Review_for_new_plot, mapping = aes(x = easy_clean, y = value_for_money, color = Products)) + geom_point()

ggplot(data = Smart_Water_Bottle_Review_for_new_plot, mapping = aes(x = easy_use, y = value_for_money, color = Products)) + geom_point()

ggplot(data = Smart_Water_Bottle_Review_for_new_plot, mapping = aes(x = battery_life, y = value_for_money, color = Products)) + geom_point()

ggplot(data = Smart_Water_Bottle_Review_for_new_plot, mapping = aes(x = keep_cold_hrs, y = value_for_money, color = Products)) + geom_point()

ggplot(data = Smart_Water_Bottle_Review_for_new_plot, mapping = aes(x = keep_hot_hrs, y = value_for_money, color = Products)) + geom_point()

Physical design factors can also be matched with ‘value for money’:

ggplot(data = Smart_Water_Bottle_Review_for_new_plot, mapping = aes(x = weight, y = value_for_money, color = Products)) + geom_point()

ggplot(data = Smart_Water_Bottle_Review_for_new_plot, mapping = aes(x = volumne, y = value_for_money, color = Products)) + geom_point()

The data suggest that value for money is better for volumes (the volumne column) between 16 and 23, and for weights in the low-to-medium range of 0.95 to 1.3.

Our suggestion for Bellabeat from this set of data:

From the data, Bellabeat, with a ‘value for money’ score of 2 (the lowest among the rated products), may consider strengthening overall bottle quality, such as volume versus weight, ease of cleaning or self-cleaning, and the app (which requires shaking the bottle to sync), as these are the factors users weigh most according to their responses and ratings.

Having said that, some factors such as ‘easy to use’ and ‘battery life’ show a somewhat inverse relationship with value for money. Why? Possibly because the products with longer battery life and easier usage here are also some of the pricier ones.

4, The need to carry a smart bottle depends on the intensity and frequency of a user’s activities, while personal profile factors such as age and body size may affect the need for hydration. Bellabeat might use such data to market the Spring smart bottle to people whose health data show higher intensity and activity levels.

Analysis method: Check at what times and on which days hydration is most crucial, according to ‘intensities’, ‘heartrates’, and ‘weightlogs’.

First take a look at the data, then import dailyIntensities_merged.csv, hourlyIntensities_merged.csv, heartrate_seconds_merged.csv, and weightLogInfo_merged.csv.

Daily Intensities:

dailyIntensities_merged <- read_csv("C:/Users/Helen/Desktop/Fitabase Data 4.12.16-5.12.16/dailyIntensities_merged.csv")
Rows: 940 Columns: 10
Column specification
Delimiter: ","
chr (1): ActivityDay
dbl (9): Id, SedentaryMinutes, LightlyActiveMinutes, FairlyActiveMinutes, VeryAc…

Hourly Intensities:

hourlyIntensities_merged <- read_csv("C:/Users/Helen/Desktop/Fitabase Data 4.12.16-5.12.16/hourlyIntensities_merged.csv")
Rows: 22099 Columns: 4
Column specification
Delimiter: ","
chr (1): ActivityHour
dbl (3): Id, TotalIntensity, AverageIntensity

Heart Rate:

heartrate_seconds_merged <- read_csv("C:/Users/Helen/Desktop/Fitabase Data 4.12.16-5.12.16/heartrate_seconds_merged.csv")
Rows: 2483658 Columns: 3
Column specification
Delimiter: ","
chr (1): Time
dbl (2): Id, Value

Weight Log:

weightLogInfo_merged <- read_csv("C:/Users/Helen/Desktop/Fitabase Data 4.12.16-5.12.16/weightLogInfo_merged.csv")
Rows: 67 Columns: 8
Column specification
Delimiter: ","
chr (1): Date
dbl (6): Id, WeightKg, WeightPounds, Fat, BMI, LogId
lgl (1): IsManualReport

View(weightLogInfo_merged)

BMI: Underweight (Below 18.5) · Normal (18.5 - 24.9) · Overweight (25.0 - 29.9) · Obese (30.0 and Above). BMI = weight (kg) / height (m)²

The data is then cleaned with distinct() to remove duplicate rows.
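The duplicate-removal step can be sketched as follows. The rows below are invented toy values in the shape of the weight log, not records from the dataset, and `weight_log` is a hypothetical name. The sketch uses base R’s `duplicated()` and `unique()`, which for whole-row duplicates give the same result as `dplyr::distinct()`.

```r
# Toy rows (invented for illustration) shaped like the weight log.
weight_log <- data.frame(
  Id = c(1, 1, 2, 2),
  Date = c("5/2/2016", "5/2/2016", "5/2/2016", "5/3/2016"),
  WeightKg = c(52.6, 52.6, 85.0, 85.1),
  stringsAsFactors = FALSE
)

# Count rows that exactly repeat an earlier row.
n_dupes <- sum(duplicated(weight_log))

# Keep the first occurrence of each distinct row.
weight_clean <- unique(weight_log)

cat("duplicates removed:", n_dupes, "\n")
print(weight_clean, row.names = FALSE)
```

Checking the duplicate count before dropping rows makes it easy to report how much cleaning was actually needed.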

Then summarize the Daily Intensities data.

Summary:

ActivityDay FairlyActiveMinutes VeryActiveMinutes
Length:940 Min. : 0.00 Min. : 0.00
Class :character 1st Qu.: 0.00 1st Qu.: 0.00
Mode :character Median : 6.00 Median : 4.00
Mean : 13.56 Mean : 21.16
3rd Qu.: 19.00 3rd Qu.: 32.00
Max. :143.00 Max. :210.00

length(unique(dailyIntensities_merged$Id))
[1] 33
length(unique(dailyIntensities_merged$ActivityDay))
[1] 31

On average, users log more very active minutes (mean 21.16) than fairly active minutes (mean 13.56), although the medians (4 versus 6) show that both are low on a typical day. During very active periods, it is easy to imagine that one cannot drink regularly, or even enough. In this dataset, the maximum is 210 very active minutes (3.5 hours) in a single day, the mean is about 21 minutes, and a quarter of the daily records show at least 32 very active minutes. That last group is healthily active and needs adequate fluid intake.

Then plot an exploration: on which activity days are the ‘very active minutes’ highest?

library(readxl)
library(dplyr)
library(magrittr)
library(knitr)
library(ggplot2)

# 940 data rows plus the header row
dailyIntensities_merged <- read_excel("C:/Users/Helen/Desktop/Helen 2021 backup/Google Cert of DA - Case Study Aug 2021/Fitabase Data 4.12.16-5.12.16/dailyIntensities_merged.xlsx", range = "A1:J941", na = "NA")

ggplot(dailyIntensities_merged, mapping = aes(x = ActivityDay, y = VeryActiveMinutes)) + geom_point()

From the plot, a bit less than half of the users appear to be very active most of the time.

According to the U.S. Department of Health and Human Services’ Physical Activity Guidelines for Americans, 2nd edition (https://health.gov/sites/default/files/2019-09/Physical_Activity_Guidelines_2nd_edition.pdf), “At the greatest volume of moderate-to-vigorous physical activity, the risk is low even for those who sit the most (upper right corner). The best currently available estimate of this volume is about 60 to 75 minutes per day of moderate-intensity activities, or 30 to 40 minutes per day of vigorous-intensity activities.”

In the dataset, most daily records fall between 0 and 40 very active minutes, while the rest range from about 50 to 140 minutes of intensive activity, which is more than enough. Generally speaking, most users engage in intensive activity at least every other day. This may be because the data come from Fitbit users (collected via Fitabase), who tend to be more active by nature.
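The comparison with the guideline can be made concrete by counting, for each user, the share of days that reach a vigorous-activity threshold. The sketch below uses invented toy values rather than the Fitabase records; the names `daily`, `threshold`, and `share_by_user` are hypothetical, and the 30-minute cut-off is taken from the lower end of the range quoted above.

```r
# Toy daily records (invented for illustration), two users, four days each.
daily <- data.frame(
  Id = c("A", "A", "A", "A", "B", "B", "B", "B"),
  VeryActiveMinutes = c(0, 35, 42, 10, 0, 0, 31, 5),
  stringsAsFactors = FALSE
)

threshold <- 30  # minutes per day of vigorous activity (lower guideline bound)

# TRUE when a day reaches the threshold.
daily$meets <- daily$VeryActiveMinutes >= threshold

# Share of days per user that reach the threshold.
share_by_user <- tapply(daily$meets, daily$Id, mean)
print(share_by_user)

# Overall share of days across all users.
overall_share <- mean(daily$meets)
print(overall_share)
```

Applied to the real table, the same steps would use `dailyIntensities_merged$VeryActiveMinutes` and `dailyIntensities_merged$Id` in place of the toy columns.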

According to Healthline (https://www.healthline.com/nutrition/how-much-water-should-you-drink-per-day#effects), many factors affect one’s water intake needs, such as health, activity, and environment: “The temperature or season: You may need more water in warmer months than cooler ones due to perspiration. Your environment: If you spend more time outdoors in the sun or hot temperatures or in a heated room, you might feel thirstier faster. How active you are: If you are active during the day or walk or stand a lot, you’ll need more water than someone who’s sitting at a desk. If you exercise or do any intense activity, you will need to drink more to cover water loss. Your health: If you have an infection or a fever, or if you lose fluids through vomiting or diarrhea, you will need to drink more water. If you have a health condition like diabetes you will also need more water. Some medications like diuretics can also make you lose water. Pregnant or breastfeeding: If you’re pregnant or nursing your baby, you’ll need to drink extra water to stay hydrated. Your body is doing the work for two (or more), after all.”

Then we can look at the hourly intensity data:

head(hourlyIntensities_merged)
Id ActivityHour TotalIntensity AverageIntensity
1 1503960366 4/12/2016 12:00:00 AM 20 0.333333
2 1503960366 4/12/2016 1:00:00 AM 8 0.133333
3 1503960366 4/12/2016 2:00:00 AM 7 0.116667
4 1503960366 4/12/2016 3:00:00 AM 0 0.000000
5 1503960366 4/12/2016 4:00:00 AM 0 0.000000
6 1503960366 4/12/2016 5:00:00 AM 0 0.000000

hourlyIntensities_merged %>% select(Id, ActivityHour, TotalIntensity, AverageIntensity) %>% summary()
Id ActivityHour TotalIntensity AverageIntensity
Min. :1.504e+09 Length:22099 Min. : 0.00 Min. :0.0000
1st Qu.:2.320e+09 Class :character 1st Qu.: 0.00 1st Qu.:0.0000
Median :4.445e+09 Mode :character Median : 3.00 Median :0.0500
Mean :4.848e+09 Mean : 12.04 Mean :0.2006
3rd Qu.:6.962e+09 3rd Qu.: 16.00 3rd Qu.:0.2667
Max. :8.878e+09 Max. :180.00 Max. :3.0000

length(unique(hourlyIntensities_merged$ActivityHour))
[1] 736
length(unique(hourlyIntensities_merged$Id))
[1] 33

Let’s explore the total intensity per activity hour across the 736 distinct hours recorded by the 33 users:

library(readxl)
library(dplyr)
library(magrittr)
library(knitr)
library(ggplot2)

hourlyIntensities_merged <- read_excel("C:/Users/Helen/Desktop/Helen 2021 backup/Google Cert of DA - Case Study Aug 2021/Fitabase Data 4.12.16-5.12.16/hourlyIntensities_merged.xlsx", range = "A1:D22100", na = "NA")

ggplot(hourlyIntensities_merged, mapping = aes(x = ActivityHour, y = TotalIntensity)) + geom_point()

From the plot, about half of the users appear to range from active to very active, with total intensity values from above 50 up to 150.
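Because ActivityHour is read as text, grouping by hour of day first requires parsing it into a date-time. The sketch below does this in base R using the six rows shown in the head() output above; the names `hourly`, `parsed`, and `mean_by_hour` are hypothetical, and the format string is an assumption based on how those six values look.

```r
# The six rows shown in the head() output above.
hourly <- data.frame(
  ActivityHour = c("4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM",
                   "4/12/2016 2:00:00 AM", "4/12/2016 3:00:00 AM",
                   "4/12/2016 4:00:00 AM", "4/12/2016 5:00:00 AM"),
  TotalIntensity = c(20, 8, 7, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Parse month/day/year with a 12-hour clock and AM/PM marker.
# Note: %p relies on the session locale recognising "AM"/"PM".
parsed <- as.POSIXct(hourly$ActivityHour,
                     format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

# Hour of day as an integer from 0 to 23.
hourly$hour_of_day <- as.integer(format(parsed, "%H"))

# Mean total intensity for each hour of the day.
mean_by_hour <- tapply(hourly$TotalIntensity, hourly$hour_of_day, mean)
print(mean_by_hour)
```

On the full table, the same hour-of-day averages would show when during the day intensity peaks, which is when a hydration reminder is likely to matter most.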

So in general, the need for adequate fluid intake among these active users is considerable.

Next, we can look at weight and BMI, where adequate fluid intake can also be crucial.

length(unique(weightLogInfo_merged$Id))
[1] 8
length(unique(weightLogInfo_merged$Date))
[1] 56

This dataset covers only 8 users. Even so, let’s explore their weight and BMI.

weightLogInfo_merged %>% select(Id, WeightKg, BMI) %>% summary()

Id WeightKg BMI
Min. :1.504e+09 Min. : 52.60 Min. :21.45
1st Qu.:6.962e+09 1st Qu.: 61.40 1st Qu.:23.96
Median :6.962e+09 Median : 62.50 Median :24.39
Mean :7.009e+09 Mean : 72.04 Mean :25.19
3rd Qu.:8.878e+09 3rd Qu.: 85.05 3rd Qu.:25.56
Max. :8.878e+09 Max. :133.50 Max. :47.54

Healthy BMI: weight (kg) / height (m)² = 18.5 ~ 24.9

Underweight: BMI < 18.5
Normal weight: 18.5 ≤ BMI < 24
Overweight: 24 ≤ BMI < 27
Obesity: BMI ≥ 27

In this case, the logged BMI values are relatively high: the mean is 25.19 and the 1st to 3rd quartiles run from 23.96 to 25.56, meaning that most of these users are slightly overweight by this scale.
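The BMI formula and the two sets of cut-offs mentioned in this report can be sketched as small helper functions. The function names `bmi` and `bmi_category` are hypothetical, the 70 kg and 1.75 m inputs are made-up example values, and the summary figures are the ones printed above.

```r
# BMI = weight in kilograms divided by height in metres squared.
bmi <- function(weight_kg, height_m) weight_kg / height_m^2

# Classify BMI values. The default breaks follow the 18.5 / 25 / 30 scale;
# pass breaks = c(18.5, 24, 27) for the alternative scale listed above.
bmi_category <- function(bmi_value, breaks = c(18.5, 25, 30)) {
  cut(bmi_value,
      breaks = c(-Inf, breaks, Inf),
      labels = c("Underweight", "Normal", "Overweight", "Obese"),
      right = FALSE)  # intervals are closed on the left, e.g. [18.5, 25)
}

# Example with made-up height and weight.
print(bmi(70, 1.75))

# Summary statistics of BMI copied from the output above.
summary_bmi <- c(min = 21.45, q1 = 23.96, median = 24.39,
                 mean = 25.19, q3 = 25.56, max = 47.54)

print(data.frame(
  bmi = summary_bmi,
  scale_25_30 = bmi_category(summary_bmi),
  scale_24_27 = bmi_category(summary_bmi, breaks = c(18.5, 24, 27))
))
```

The median of 24.39 is classed as normal on one scale and overweight on the other, so it is worth stating which scale a conclusion rests on.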

Let’s look at a plot of BMI against weight:

library(readxl)
library(dplyr)
library(magrittr)
library(knitr)
library(ggplot2)

weightLogInfo_merged <- read_excel("C:/Users/Helen/Desktop/Helen 2021 backup/Google Cert of DA - Case Study Aug 2021/Fitabase Data 4.12.16-5.12.16/weightLogInfo_merged.xlsx", range = "A1:H68", na = "NA")

ggplot(weightLogInfo_merged, aes(x = WeightKg, y = BMI)) + geom_point()

This shows that almost half of the logged records are above the healthy BMI threshold of 24, some far above it, indicating that overweight and obesity are not rare in this group.

Our suggestion to Bellabeat:

Based on the Fitabase data, it is essential to market to sporty and active people with their health in mind: their intense activity and hidden weight issues both raise the need for hydration, and they should be educated about this thoroughly and as a priority.


BellaBeat Marketing Analysis for Spring

Helen

8/7/2021