## 1. Ask: Business Task

Bellabeat is a high-growth wellness technology company for women. The co-founder, Urška Sršen, believes that analyzing smart device usage data can unlock new marketing opportunities.

Business Task:
Analyze non-Bellabeat smart device usage data to identify trends in user behavior, then apply these insights to one Bellabeat product (e.g., the Bellabeat app or Leaf tracker) to guide marketing strategy.

Key Questions:

1. What are the trends in smart device usage?
2. How can these trends apply to Bellabeat customers?
3. How can they influence Bellabeat’s marketing strategy?

Stakeholders:

- Urška Sršen, Co-founder & Chief Creative Officer
- Sando Mur, Co-founder
- Bellabeat Marketing Analytics Team

## 2. Prepare: Data Source

Dataset: FitBit Fitness Tracker Data (Kaggle, CC0 Public Domain)
Link: https://www.kaggle.com/datasets/arashnic/fitbit

Description:

- 30 Fitbit users
- Daily and minute-level data on activity, heart rate, and sleep
- Period: April 12 – May 12, 2016

ROCCC Evaluation:

- Reliability: Medium (small sample)
- Originality: High (real user data)
- Comprehensiveness: Medium (missing demographics)
- Currentness: Low (2016 data)
- Citedness: Low (not peer-reviewed)

Limitations:

- Small, non-representative sample
- Outdated (2016)
- No age, gender, or location data

## 3. Process: Data Cleaning

# Load libraries
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(lubridate)
library(janitor)
## 
## Attaching package: 'janitor'
## 
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
# Load and clean data
daily <- read_csv("data/dailyActivity_merged.csv") %>% clean_names()
## Rows: 940 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): ActivityDate
## dbl (14): Id, TotalSteps, TotalDistance, TrackerDistance, LoggedActivitiesDi...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
sleep <- read_csv("data/sleepDay_merged.csv") %>% clean_names()
## Rows: 413 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): SleepDay
## dbl (4): Id, TotalSleepRecords, TotalMinutesAsleep, TotalTimeInBed
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Convert dates
daily <- daily %>% mutate(date = mdy(activity_date))
sleep <- sleep %>% mutate(date = mdy_hms(sleep_day) %>% as_date())

# Join data
combined <- inner_join(daily, sleep, by = c("id", "date"))

# Create new variables
combined <- combined %>%
  mutate(
    activity_level = case_when(
      total_steps < 5000 ~ "Low",
      total_steps < 10000 ~ "Moderate",
      TRUE ~ "High"
    ),
    hours_asleep = total_minutes_asleep / 60
  )
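
The inner join above keeps only user-days that appear in both tables, and any repeated row in either table carries through into `combined`. The following is a minimal sketch of checks that could run before the join. It uses small made-up tibbles (`daily_toy` and `sleep_toy`, hypothetical names with illustrative values, not rows from the Fitbit files) so that it runs on its own:

```r
library(dplyr)

# Toy stand-ins for the cleaned `daily` and `sleep` tables (made-up values).
daily_toy <- tibble(
  id = c(1, 1, 2, 3),
  date = as.Date(c("2016-04-12", "2016-04-13", "2016-04-12", "2016-04-12")),
  total_steps = c(4200, 11000, 7600, 9000)
)
sleep_toy <- tibble(
  id = c(1, 1, 1, 2),
  date = as.Date(c("2016-04-12", "2016-04-12", "2016-04-13", "2016-04-12")),
  total_minutes_asleep = c(410, 410, 380, 450)
)

# Count exact duplicate rows in the sleep table, then drop them.
n_dupes <- sum(duplicated(sleep_toy))
sleep_dedup <- distinct(sleep_toy)

# Compare how many users each table covers before and after the join.
users_daily <- n_distinct(daily_toy$id)
users_sleep <- n_distinct(sleep_dedup$id)

combined_toy <- inner_join(daily_toy, sleep_dedup, by = c("id", "date"))
users_combined <- n_distinct(combined_toy$id)
```

Comparing `users_daily`, `users_sleep`, and `users_combined` on the real tables would show how many users drop out because they never logged sleep, which matters when findings from `combined` are generalized to all device users.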

## 4. Analyze: Key Findings

ggplot(combined, aes(x = activity_level)) +
  geom_bar(fill = "#FF69B4") +
  labs(title = "User Activity Levels", x = "Level", y = "Count") +
  theme_minimal()
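
Because `combined` holds one row per user per day, the bar chart counts user-days rather than users. A small sketch of how the two views could be tabulated side by side is shown here, under the assumption that a per-user label based on mean daily steps is an acceptable summary. `classify_steps` is a hypothetical helper that repeats the thresholds used earlier, and the data are made up for illustration:

```r
library(dplyr)

# Made-up user-day records standing in for `combined`.
toy <- tibble(
  id = c(1, 1, 1, 2, 2, 3),
  total_steps = c(3000, 4000, 12000, 8000, 9000, 11000)
)

# Hypothetical helper: same thresholds as the activity_level variable.
classify_steps <- function(steps) {
  case_when(
    steps < 5000 ~ "Low",
    steps < 10000 ~ "Moderate",
    TRUE ~ "High"
  )
}

# Share of user-days in each level (what the bar chart counts).
day_share <- toy %>%
  mutate(activity_level = classify_steps(total_steps)) %>%
  count(activity_level) %>%
  mutate(share = n / sum(n))

# One label per user, based on that user's mean daily steps.
user_level <- toy %>%
  group_by(id) %>%
  summarise(mean_steps = mean(total_steps), .groups = "drop") %>%
  mutate(activity_level = classify_steps(mean_steps))
```

Reporting `day_share` next to `user_level` makes it clear whether a statement is about days or about people.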

ggplot(combined, aes(x = total_steps, y = hours_asleep)) +
  geom_point(alpha = 0.6, color = "#00BFFF") +
  geom_smooth(method = 'lm') +
  labs(title = "Steps vs. Sleep Duration", x = "Daily Steps", y = "Hours Asleep") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'
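
The fitted line in the scatter plot shows the direction of the relationship but not its strength. One way to quantify it, sketched here with made-up numbers in place of `combined` (the values are illustrative and not results from the Fitbit data), is a Pearson correlation test plus the same linear model that `geom_smooth(method = 'lm')` draws:

```r
# Made-up values standing in for `combined`; not taken from the Fitbit data.
toy <- data.frame(
  total_steps = c(2000, 4000, 6000, 8000, 10000, 12000),
  hours_asleep = c(6.0, 6.5, 6.4, 7.1, 7.0, 7.6)
)

# Pearson correlation with a confidence interval and p-value.
ct <- cor.test(toy$total_steps, toy$hours_asleep)

# The same simple linear fit that the smoothing layer draws.
fit <- lm(hours_asleep ~ total_steps, data = toy)

# Change in hours asleep associated with 1,000 extra steps.
slope_per_1000_steps <- coef(fit)[["total_steps"]] * 1000

# Share of variance in sleep duration explained by steps.
r_squared <- summary(fit)$r.squared
```

On the real data, `ct$estimate`, `ct$conf.int`, and `r_squared` would indicate whether the trend is strong enough to support a recommendation. Either way, a correlation of this kind describes association and does not show that more steps cause longer sleep.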

## 5. Act: Recommendations

Based on the analysis, I recommend the following marketing strategies for the Bellabeat app and Leaf tracker:

  1. Launch a “7-Day Sleep Challenge”
    Encourage users to increase daily steps to improve sleep quality. The app can send personalized notifications.

  2. Create Weekly Step Challenges
    Use social features to engage users. Reward achievements with digital badges.

  3. Personalize Content Based on Activity
    If a user is inactive, suggest short walks. If active, offer mindfulness content for recovery.

These strategies align with Bellabeat’s mission of holistic wellness and can increase user engagement and retention.

## Conclusion

The analysis indicates that on most recorded days users do not reach 10,000 steps, and that there is a positive trend between daily steps and hours asleep. By leveraging these insights, Bellabeat can enhance user engagement through personalized, data-driven marketing campaigns that promote both physical activity and better sleep.