How Can a Wellness Technology Company Play It Smart?

Scenario

I am a junior data analyst working on the marketing analyst team at Bellabeat, a high-tech manufacturer of health-focused products for women. Bellabeat is a successful small company, but it has the potential to become a larger player in the global smart device market. Urška Sršen, cofounder and Chief Creative Officer of Bellabeat, believes that analyzing smart device fitness data could help unlock new growth opportunities for the company. I have been asked to focus on one of Bellabeat’s products and analyze smart device data to gain insight into how consumers are using their smart devices. The insights I discover will then help guide marketing strategy for the company. I will present my analysis to the Bellabeat executive team along with high-level recommendations for Bellabeat’s marketing strategy.

Key Stakeholders

  • Urška Sršen: Bellabeat’s Cofounder and Chief Creative Officer
  • Sando Mur: Mathematician and Bellabeat’s Cofounder
  • Bellabeat Marketing analytics team: A team of data analysts.

I will follow the six steps of the data analysis process: ask, prepare, process, analyze, share, and act.

1.0 Ask

Sršen asked me to analyze smart device usage data in order to gain insight into how consumers use non-Bellabeat smart devices. She then wants me to select one Bellabeat product to apply these insights to in my presentation.

1.1 Business Task

Analyze smart device usage data to gain insight into how people are already using their smart devices, then recommend how these trends can inform Bellabeat’s marketing strategy.

2.0 Prepare

Now, I will prepare data for analysis.

Sršen encourages me to use public data that explores smart device users’ daily habits. She points to a specific data set:

  • FitBit Fitness Tracker Data (CC0: Public Domain, dataset made available through Mobius): This Kaggle dataset contains personal fitness tracker data from thirty Fitbit users.

2.1 Credibility in this data

I will use the ROCCC criteria to determine the credibility of the data. A good data source is ROCCC: Reliable, Original, Comprehensive, Current, and Cited.

  • Reliable: Not reliable. The sample is limited to only 30 Fitbit users.
  • Original: Not original. The data set was generated by respondents to a survey distributed via Amazon Mechanical Turk.
  • Comprehensive: Not comprehensive. It contains only part of the information needed to find solutions.
  • Current: Not current. The data is 7 years old, and the usefulness of data decreases as time passes.
  • Cited: Not credibly cited. The data was collected by a third party, Amazon Mechanical Turk, and it is unknown when the data was last refreshed.

2.2 Load packages

if(!require(pacman))install.packages("pacman")
## Loading required package: pacman
## Warning: package 'pacman' was built under R version 4.2.3
pacman::p_load(
  tidyverse, 
  janitor,
  inspectdf,
  flextable,
  plotly,
  visdat,
  esquisse, 
  skimr,
  here
)

2.3 Load data

dailyActivity_merged <- read_csv(here("data/dailyActivity_merged.csv"))
## Rows: 940 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): ActivityDate
## dbl (14): Id, TotalSteps, TotalDistance, TrackerDistance, LoggedActivitiesDi...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
daily_activity <- dailyActivity_merged

3.0 Process

I will clean and transform the data to ensure its integrity, checking that it is complete and correct.

Explore data

head(daily_activity, n = 50)
## # A tibble: 50 × 15
##            Id ActivityDate TotalSteps TotalDistance TrackerDistance
##         <dbl> <chr>             <dbl>         <dbl>           <dbl>
##  1 1503960366 4/12/2016         13162          8.5             8.5 
##  2 1503960366 4/13/2016         10735          6.97            6.97
##  3 1503960366 4/14/2016         10460          6.74            6.74
##  4 1503960366 4/15/2016          9762          6.28            6.28
##  5 1503960366 4/16/2016         12669          8.16            8.16
##  6 1503960366 4/17/2016          9705          6.48            6.48
##  7 1503960366 4/18/2016         13019          8.59            8.59
##  8 1503960366 4/19/2016         15506          9.88            9.88
##  9 1503960366 4/20/2016         10544          6.68            6.68
## 10 1503960366 4/21/2016          9819          6.34            6.34
## # ℹ 40 more rows
## # ℹ 10 more variables: LoggedActivitiesDistance <dbl>,
## #   VeryActiveDistance <dbl>, ModeratelyActiveDistance <dbl>,
## #   LightActiveDistance <dbl>, SedentaryActiveDistance <dbl>,
## #   VeryActiveMinutes <dbl>, FairlyActiveMinutes <dbl>,
## #   LightlyActiveMinutes <dbl>, SedentaryMinutes <dbl>, Calories <dbl>
View(daily_activity)
dim(daily_activity)
## [1] 940  15
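The head() output above shows ActivityDate stored as a character column. A minimal sketch of one way to convert it to a Date during cleaning is given below. It uses base R on three rows copied from that output, and `activity_sample` and `Weekday` are illustrative names that do not appear in the original analysis.

```r
# Three rows copied from the head() output above; not the full dataset.
activity_sample <- data.frame(
  Id = c(1503960366, 1503960366, 1503960366),
  ActivityDate = c("4/12/2016", "4/13/2016", "4/14/2016"),
  TotalSteps = c(13162, 10735, 10460),
  stringsAsFactors = FALSE
)

# Parse the month/day/year strings into Date values.
activity_sample$ActivityDate <- as.Date(activity_sample$ActivityDate, format = "%m/%d/%Y")

# Add the ISO weekday number (1 = Monday), which avoids locale-dependent day names.
activity_sample$Weekday <- as.integer(format(activity_sample$ActivityDate, "%u"))

str(activity_sample)
```

With a real Date column, the records can be sorted chronologically or grouped by weekday, which a character column does not support reliably.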

Some records show zero total steps, which may indicate that the participant was not wearing the Fitbit that day. I will remove these zero-step records to avoid skewing the results.

daily_activity2 <- daily_activity %>% filter(TotalSteps != 0)

I have filtered the data to remove records with TotalSteps of zero, so that the data is clean and the results are less skewed. This removed 77 of the 940 rows, leaving 863. Days with very low non-zero counts remain (the minimum is now 4 steps), and the data is ready to analyze.
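A quick way to confirm what the filter did is to count the zero-step rows before removing them and compare row counts afterwards. The sketch below does this in base R on a small made-up data frame. The values and the names `activity_sample` and `activity_nonzero` are hypothetical and only illustrate the check.

```r
# Hypothetical toy data: two of the five rows have zero steps.
activity_sample <- data.frame(
  Id = c(1, 1, 2, 2, 3),
  TotalSteps = c(13162, 0, 10735, 4, 0)
)

n_before <- nrow(activity_sample)
n_zero <- sum(activity_sample$TotalSteps == 0)

# Keep only rows with a non-zero step count (same condition as the filter above).
activity_nonzero <- activity_sample[activity_sample$TotalSteps != 0, ]
n_after <- nrow(activity_nonzero)

# The number of rows dropped should equal the number of zero-step rows.
stopifnot(n_before - n_after == n_zero)

c(before = n_before, removed = n_zero, after = n_after)
```

Applied to the real data, the same comparison should agree with the outputs in this report: dim() reports 940 rows before filtering and summary() reports a length of 863 afterwards.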

View(daily_activity2)

The Kaggle dataset contains personal fitness tracker data from thirty-three Fitbit users, not thirty as its description states.
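The code that produced the participant count is not shown in this report. One way to obtain such a count is to count the distinct values of Id, sketched here in base R on made-up data. The Id values and the name `activity_sample` are hypothetical.

```r
# Hypothetical toy data: six daily records from three participants.
activity_sample <- data.frame(
  Id = c(101, 101, 102, 103, 103, 103),
  TotalSteps = c(8000, 9500, 4000, 12000, 11000, 10500)
)

# Number of distinct participants.
n_users <- length(unique(activity_sample$Id))

# Number of daily records per participant, useful for spotting sparse users.
days_per_user <- table(activity_sample$Id)

n_users
days_per_user
```

On the real data the equivalent call would be `length(unique(daily_activity2$Id))`.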

summary(daily_activity2)
##        Id            ActivityDate         TotalSteps    TotalDistance  
##  Min.   :1.504e+09   Length:863         Min.   :    4   Min.   : 0.00  
##  1st Qu.:2.320e+09   Class :character   1st Qu.: 4923   1st Qu.: 3.37  
##  Median :4.445e+09   Mode  :character   Median : 8053   Median : 5.59  
##  Mean   :4.858e+09                      Mean   : 8319   Mean   : 5.98  
##  3rd Qu.:6.962e+09                      3rd Qu.:11092   3rd Qu.: 7.90  
##  Max.   :8.878e+09                      Max.   :36019   Max.   :28.03  
##  TrackerDistance  LoggedActivitiesDistance VeryActiveDistance
##  Min.   : 0.000   Min.   :0.0000           Min.   : 0.000    
##  1st Qu.: 3.370   1st Qu.:0.0000           1st Qu.: 0.000    
##  Median : 5.590   Median :0.0000           Median : 0.410    
##  Mean   : 5.964   Mean   :0.1178           Mean   : 1.637    
##  3rd Qu.: 7.880   3rd Qu.:0.0000           3rd Qu.: 2.275    
##  Max.   :28.030   Max.   :4.9421           Max.   :21.920    
##  ModeratelyActiveDistance LightActiveDistance SedentaryActiveDistance
##  Min.   :0.0000           Min.   : 0.000      Min.   :0.00000        
##  1st Qu.:0.0000           1st Qu.: 2.345      1st Qu.:0.00000        
##  Median :0.3100           Median : 3.580      Median :0.00000        
##  Mean   :0.6182           Mean   : 3.639      Mean   :0.00175        
##  3rd Qu.:0.8650           3rd Qu.: 4.895      3rd Qu.:0.00000        
##  Max.   :6.4800           Max.   :10.710      Max.   :0.11000        
##  VeryActiveMinutes FairlyActiveMinutes LightlyActiveMinutes SedentaryMinutes
##  Min.   :  0.00    Min.   :  0.00      Min.   :  0.0        Min.   :   0.0  
##  1st Qu.:  0.00    1st Qu.:  0.00      1st Qu.:146.5        1st Qu.: 721.5  
##  Median :  7.00    Median :  8.00      Median :208.0        Median :1021.0  
##  Mean   : 23.02    Mean   : 14.78      Mean   :210.0        Mean   : 955.8  
##  3rd Qu.: 35.00    3rd Qu.: 21.00      3rd Qu.:272.0        3rd Qu.:1189.0  
##  Max.   :210.00    Max.   :143.00      Max.   :518.0        Max.   :1440.0  
##     Calories   
##  Min.   :  52  
##  1st Qu.:1856  
##  Median :2220  
##  Mean   :2361  
##  3rd Qu.:2832  
##  Max.   :4900

General overview of data

vis_dat(daily_activity2)

Categorical overview

## Warning in geom2trace.default(dots[[1L]][[1L]], dots[[2L]][[1L]], dots[[3L]][[1L]]): geom_GeomFitText() has yet to be implemented in plotly.
##   If you'd like to see this geom implemented,
##   Please open an issue with your example code at
##   https://github.com/ropensci/plotly/issues

The categorical overview is an interactive diagram showing the share of records that fall on each ActivityDate. Each date accounts for between 3.8% and 1.9% of the records, and the share declines steadily over the tracking period, meaning fewer records were logged as the month went on.
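The code behind the categorical overview is not shown in this report. As a rough stand-in, the share of records per date can be computed directly with base R, as in the sketch below. The dates in `activity_dates` are a made-up toy vector, not the real column.

```r
# Hypothetical toy vector of ActivityDate values: 3, 2, and 1 records per date.
activity_dates <- c("4/12/2016", "4/12/2016", "4/12/2016",
                    "4/13/2016", "4/13/2016",
                    "4/14/2016")

# Proportion of all records that fall on each date.
date_share <- prop.table(table(activity_dates))

# Express the shares as percentages rounded to one decimal place.
round(100 * date_share, 1)
```

A falling share from the first date to the last is the pattern described above: fewer participants logged data on later days.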

Numerical overview

In the interactive numerical overview, the TotalSteps values around the mean of 8,319 make up close to 16% of the whole sample. SedentaryActiveDistance is nearly zero, and SedentaryMinutes dominates total daily activity time.

4.0 Analyze

Analyzing single variables: numeric

daily_activity2$TotalSteps
##   [1] 13162 10735 10460  9762 12669  9705 13019 15506 10544  9819 12764 14371
##  [13] 10039 15355 13755 18134 13154 11181 14673 10602 14727 15103 11100 14070
##  [25] 12159 11992 10060 12022 12207 12770  8163  7007  9107  1510  5370  6175
##  [37] 10536  2916  4974  6349  4026  8538  6076  6497  2826  8367  2759  2390
##  [49]  6474 36019  7155  2100  2193  2470  1727  2104  3427  1732  2969  3134
##  [61]  2971 10694  8001 11037  5263 15300  8757  7132 11256  2436  1223  3673
##  [73]  6637  3321  3580  9919  3032  9405  3176 18213  6132  3758 12850  2309
##  [85]  4363  9787 13372  6724  6643  9167  1329  6697  4929  7937  3844  3414
##  [97]  4525  4597   197     8  8054  5372  3570     4  6907  4920  4014  2573
## [109]  4059  2080  2237    44   678   356  2163   980   244   149  2945  2090
## [121]   152  3761  1675  2704  3790  1326  1786  2091  1510 11875 12024 10690
## [133] 11034 10100 15112 14131 11548 15112 12453 12954  6001 13481 11369 10119
## [145] 10159 10140 10245 18387 10538 10379 12183 11768 11895 10227  6708  3292
## [157] 13379 12798 13272  9117  4414  4993  3335  3821  2547   838  3325  2424
## [169]  7222  2467  2915 12357  3490  6017  5933  6088  6375  7604  4729  3609
## [181]  7018  5992  6564 12167  8198  4193  5528 10685   254  8580  8891 10725
## [193]  7275  3973  5205  5057  6198  6559  5997  7192  3404  5583  5079  4165
## [205]  3588  3409  1715  1532   924  4571   772  3634  7443  1201  5202  4878
## [217]  7379  5161  3090  6227  6424  2661 10113 10352 10129 10465 22244  5472
## [229]  8247  6711 10999 10080  7804 16901  9471  9482  5980 11423  5439    42
## [241]  8796  7618  7910  8482  9685  2524  7762  7948  9202  8859  7286  9317
## [253]  6873  7373  8242  3516  7913  7365  8452  7399  7525  7412  8278  8314
## [265]  7063  4940  8168  7726  8275  6440  7566  4747  9715  8844  7451  6905
## [277]  8199  6798  7711  4880  8857  3843  7396  6731  5995  8283  7904  5512
## [289]  9135  5250  3077  8856 10035  7641  9010 13459 10415 11663 12414 11658
## [301]  6093  8911 12058 14112 11177 11388  7193  7114 10645 13238 10414 16520
## [313] 14335 13559 12312 11677 11550 13585 14687 13072   746  8539   108  1882
## [325]  1982    16    62   475  4496 10252 11728  4369  6132  5862  4556  5546
## [337]  3689   590  5394  5974  3984  7753  8204 10210  5664  4744    29  2276
## [349]  8925  8954  3702  4500  4935  4081  9259  9899 10780 10817  7990  8221
## [361]  1251  9261  9648 10429 13658  9524  7937  3672 10378  9487  9129    17
## [373] 10122 10993  8863  8758  6580  4660 11009 10181 10553 10055 12139 13236
## [385] 10243 12961  9461 11193 10074  9232 12533 10255 10096 12727 12375  9603
## [397] 13175 22770 17298 10218 10299 10201  3369  3276  2961  3974  7198  3945
## [409]  2268  6155  2064  2072  3809  6831  4363  5002  3385  6326  7243  4493
## [421]  4676  6222  5232  6910  7502  2923  3800  4514  5183  7303  5275  3915
## [433]  9105   768  5135  4978  6799  7795  7289  9634  8940  5401  4803 13743
## [445]  9601  6890  8563  8095  9148  9557  9451  7833 10319  3428  7891  5267
## [457]  5232 10611  3755  8237  6543 11451  6435  9108  6307  7213  6877  7860
## [469]  6506 11140 12692  9105  6708  8793  6530  1664 15126 15050  9167  6108
## [481]  7047  9023  9930 10144  7245  9454  8161  8614  6943 14370 12857  8232
## [493] 10613  9810  2752 11596  4832 17022 16556  5771   655  3727 15482  2713
## [505] 12346 11682  4112  1807 10946 11886 10538 11393 12764  1202  5164  9769
## [517] 12848  4249 14331  9632  1868  6083 11611 16358  4926  3121  8135  5077
## [529]  8596 12087 14269 12231  9893 12574  8330 10830  9172  7638 15764  6393
## [541]  5325  6805  9841  7924 12363 13368  7439 11045  5206  7550  4950  3421
## [553]  8869  4038 14019 14450  7150  5153 11135 10449 19542  8206 11495  7623
## [565]  9543  9411  3403  9592  6987  8915  4933  2997  9799  3365  7336  7328
## [577]  4477  4562  7142  7671  9501  8301  7851  6885  7142  6361  6238  5896
## [589]  7802  5565  5731  6744  9837  6781  6047  5832  6339  6116  5510  7706
## [601]  6277  4053  5162  1282  4732  2497  8294 10771   637  2153  6474  7091
## [613]   703  2503  2487     9  4697  1967 10199  5652  1551  5563 13217 10145
## [625] 11404 10742 13928 11835 10725 20031  5029 13239 10433 10320 12627 10762
## [637] 10081  5454 12912 12109 10147 10524  5908  6815  4188 12342 15448  6722
## [649]  3587 14172 12862 11179  5273  4631  8059 14816 14194 15566 13744 15299
## [661]  8093 11085 18229 15090 13541 15128 20067  3761  5600 13041 14510 15010
## [673] 11459 11317  5813  9123  8585    31  9827 10688 14365  9469  9753  2817
## [685]  3520 10091 10387 11107 11584  7881 14560 12390 10052 10288 10988  8564
## [697] 12461 12827 10677 13566 14433  9572  3789 18060 16433 20159 20669 14549
## [709] 18827 17076 15929 15108 16057 10520 22359 22988 20500 12685 12422 15447
## [721] 12315  7135  1170  1969 15484 14581 14990 13953 19769 22026 12465 14810
## [733] 12209  4998  9033  8053  5234  2672  9256 10204  5151  4212  6466 11268
## [745]  2824  9282  8905  6829  4562 10232  2718  6260  7626 12386 13318 14461
## [757] 11207  2132 13630 13070  9388 15148 12200  5709  3703 12405 16208  7359
## [769]  5417  6175  2946 11419  6064  8712  7875  8567  7045  4468  2943  8382
## [781]  6582  9143  4561  5014  5571  3135  3430  5319  3008  3864  5697  5273
## [793]  8538  8687  9423  8286  4503 10499 12474  6174 15168 10085  4512  8469
## [805] 12015  3588 12427  5843  6117  9217  9877  8240  8701  2564  1320  1219
## [817]  2483   244  3147   144  4068  5245   400  1321  1758  6157  8360  7174
## [829]  1619  1831  2421  2283 23186 15337 21129 13422 29326 15118 11423 18785
## [841] 19948 19377 18258 11200 16674 12986 11101 23629 14890  9733 27745 10930
## [853]  4790 10818 18193 14055 21727 12332 10686 20226 10733 21420  8064
mean(daily_activity2$TotalSteps)
## [1] 8319.393
median(daily_activity2$TotalSteps)
## [1] 8053
skim_without_charts(daily_activity2)
Data summary

Name                     daily_activity2
Number of rows           863
Number of columns        15
Column type frequency:
  character              1
  numeric                14
Group variables          None

Variable type: character

skim_variable  n_missing  complete_rate  min  max  empty  n_unique  whitespace
ActivityDate           0              1    8    9      0        31           0

Variable type: numeric

skim_variable             n_missing  complete_rate          mean            sd          p0           p25           p50           p75          p100
Id                                0              1  4.857542e+09  2.418405e+09  1503960366  2.320127e+09  4.445115e+09  6.962181e+09  8.877689e+09
TotalSteps                        0              1       8319.39       4744.97           4       4923.00       8053.00      11092.50      36019.00
TotalDistance                     0              1          5.98          3.72           0          3.37          5.59          7.90         28.03
TrackerDistance                   0              1          5.96          3.70           0          3.37          5.59          7.88         28.03
LoggedActivitiesDistance          0              1          0.12          0.65           0          0.00          0.00          0.00          4.94
VeryActiveDistance                0              1          1.64          2.74           0          0.00          0.41          2.27         21.92
ModeratelyActiveDistance          0              1          0.62          0.91           0          0.00          0.31          0.87          6.48
LightActiveDistance               0              1          3.64          1.86           0          2.34          3.58          4.89         10.71
SedentaryActiveDistance           0              1          0.00          0.01           0          0.00          0.00          0.00          0.11
VeryActiveMinutes                 0              1         23.02         33.65           0          0.00          7.00         35.00        210.00
FairlyActiveMinutes               0              1         14.78         20.43           0          0.00          8.00         21.00        143.00
LightlyActiveMinutes              0              1        210.02         96.78           0        146.50        208.00        272.00        518.00
SedentaryMinutes                  0              1        955.75        280.29           0        721.50       1021.00       1189.00       1440.00
Calories                          0              1       2361.30        702.71          52       1855.50       2220.00       2832.00       4900.00

5.0 Share

Now that I have completed my analysis, I will create supporting visualizations.

VeryActiveMinutes <- sum(daily_activity2$VeryActiveMinutes)
FairlyActiveMinutes <- sum(daily_activity2$FairlyActiveMinutes)
LightlyActiveMinutes <- sum(daily_activity2$LightlyActiveMinutes)
SedentaryMinutes <- sum(daily_activity2$SedentaryMinutes)

Over the entire month, the share of tracked time at each intensity shows a very low level of physical activity. VeryActiveMinutes accounts for 2% of tracked minutes, FairlyActiveMinutes for 1%, LightlyActiveMinutes for 17%, and SedentaryMinutes for 79%. Most of the participants’ time was spent sitting, with little physical exercise.
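The step that turns minute totals into percentages is not shown in the report. A minimal sketch of that arithmetic follows, using the mean daily minutes from the skim summary in section 4.0. Shares computed from means equal shares computed from sums, because every column covers the same 863 rows. The names `mean_minutes` and `share_pct` are illustrative.

```r
# Mean daily minutes per intensity level, taken from the skim summary in section 4.0.
mean_minutes <- c(
  VeryActiveMinutes = 23.02,
  FairlyActiveMinutes = 14.78,
  LightlyActiveMinutes = 210.02,
  SedentaryMinutes = 955.75
)

# Share of tracked time at each intensity, as a whole-number percentage.
share_pct <- round(100 * mean_minutes / sum(mean_minutes))

share_pct
# Values: 2, 1, 17, 79
```

The rounded values add up to 99 rather than 100 because each share is rounded separately.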

Visualize the relationship between steps and calories

ggplot(data = daily_activity2, aes(x = TotalSteps, y = Calories)) + geom_point(aes(color = Calories)) + geom_smooth(method = "loess")
## `geom_smooth()` using formula = 'y ~ x'

The scatter plot shows daily Calories against TotalSteps. Half of the daily calorie totals fall between 1,856 and 2,832 (the first and third quartiles in the summary above). A commonly cited target is 10,000 steps a day, and more than half of the daily step counts fall below it (the median is 8,053). Overall, there is a low level of physical activity.
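The share of days below the 10,000-step mark can be computed directly instead of being read off the plot. The sketch below uses only the first twelve TotalSteps values printed in section 4.0, so its result describes that small excerpt and not the whole sample. The names `steps_sample` and `share_below_target` are illustrative.

```r
# First twelve TotalSteps values from the vector printed in section 4.0.
steps_sample <- c(13162, 10735, 10460, 9762, 12669, 9705,
                  13019, 15506, 10544, 9819, 12764, 14371)

# Commonly cited daily step target.
step_target <- 10000

# Proportion of days below the target (the mean of a logical vector is a proportion).
share_below_target <- mean(steps_sample < step_target)

share_below_target
# 0.25
```

On the full data the equivalent call would be `mean(daily_activity2$TotalSteps < 10000)`.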

6.0 Act

Now that I have finished creating my visualization, I will act on my findings. The high-level recommendations based on my analysis are as follows:

6.1 Overview

  • My final conclusion is based on my analysis of the Fitbit data, which shows low participation in daily exercise, and on studies of self-determination theory (SDT) (Deci & Ryan, 2000).

  • Studies have shown that group fitness classes led to a decrease in perceived stress and an increase in physical, mental, and emotional quality of life compared with participation in exercise individually or not participating in regular exercise. One study found that group fitness can help you socialize and gain support, and have a positive impact on your social health.

6.2 Recommendation

  • The recommendation is to develop an online Bellabeat app that addresses identification and intrinsic motivation in terms of support for autonomy and competence. The app can facilitate group fitness, letting clients exercise in groups of up to six people, possibly in a “you go, I go” format where clients take turns with partners and cheer each other on.

  • In addition, group fitness instructors can create workouts online for clients, offer workout logging, sell workout plans online, run a variety of online workout groups, enhance engagement, elevate teaching methods, and more.

6.3 Evidence

  • According to the self-determination theory, all forms of autonomous regulation predict exercise participation. There is also increasing evidence that a motivational profile marked by high autonomous motivation is important to sustain exercise behavior over time.

  • A predominance of intrinsic motivation is especially important for longer-term exercise participation.

  • According to self-determination theory, only autonomously regulated behavior can translate into enhanced psychological wellness.

6.4 Introduction

  • Data show that only a minority of adults report engaging in physical exercise at a level compatible with most public health guidelines. For example, in the U.S., less than 50% of adults are considered regularly physically active. These findings suggest that many people lack sufficient motivation to participate in the recommended 150 minutes of moderately intense exercise or physical activity per week.

  • In a general survey, and as reflected in the Fitbit data, people were not sufficiently interested in exercise, or did not value its outcomes enough, to make it a priority in their lives. This highlights the need to look more closely at the goals and self-regulatory features associated with regular participation in exercise and physical activity. Self-determination theory (SDT) examines the differential effects of qualitatively different types of motivation that can underlie behavior.

  • SDT distinguishes between intrinsic and extrinsic types of motivation regulating one’s behavior.

  • Intrinsic motivation is defined as doing an activity because it is inherently fun or satisfying. When intrinsically motivated the person experiences feelings of enjoyment, the exercise of their skills, personal accomplishment, and excitement.

  • Extrinsic motivation is when a person engages in an activity to obtain some tangible outcome or social reward, or to avoid disapproval. Some extrinsic motives are described as controlled forms of motivation. One controlled form is introjected regulation, where behavior is driven by self-approval. SDT expects controlled motivation to sometimes regulate (or motivate) short-term behavior but not to sustain it over time.

  • Introjected regulation may be more positively associated with exercise among females, and it may be relevant for both genders in the action stage.

  • Regulation by identification with the outcome can represent a more autonomous form of behavioral regulation, and can be more important than exercising for fun and enjoyment or to challenge oneself (intrinsic regulation).

  • Studies have shown a positive association favoring autonomous regulation as a predictor of exercise outcomes. Identification and intrinsic motivation are both autonomous forms of motivation that share a common causal relation in terms of support for autonomy and competence. Only autonomous motivation was predictive of long-term moderate exercise.

  • SDT introduces the concept of basic psychological needs as central to understanding autonomous forms of motivation. Satisfaction of these basic needs results in increased feelings of vitality and well-being. Engaging in sports and exercise can be conducive to having one’s psychological needs realized.

6.5 How the team and business can apply my insight

  • Health promotion campaigns typically market exercise more in terms of health-related outcomes than in terms of its intrinsic value. Exercise promotion programs should take care not to explicitly or implicitly denigrate appearance or weight motives, or any other motive for exercising. Doing so may lead individuals to perceive that their autonomy is threatened, with consequent defiance and dropout.

6.6 Based on findings, the next step to take

  • Encourage autonomy by acknowledging the validity of individual motives in a need-supportive context which may ultimately promote movement away from controlled regulations toward more autonomous commitment to be active.

Reference

Deci EL, Ryan RM: The ‘what’ and ‘why’ of goal pursuits: Human needs and the self-determination of behavior. Psychological Inquiry. 2000, 11: 227-268. 10.1207/S15327965PLI1104_01.