---
title: "A Fitness Data Story: A Journey Towards Transformation"
author: "James Kumarasinha (s4092436)"
date: "`r format(Sys.Date(), '%d %b %Y')`"
output:
  flexdashboard::flex_dashboard:
    storyboard: true
    theme: cosmo
    source_code: embed
---

```{r setup, include=FALSE}

options(repos = c(CRAN = "https://cloud.r-project.org"))

knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)


pkgs <- c("readr","dplyr","tidyr","lubridate","ggplot2","plotly","scales","forcats","flexdashboard")
need <- setdiff(pkgs, rownames(installed.packages()))
if (length(need)) install.packages(need, quiet = TRUE)
invisible(lapply(pkgs, library, character.only = TRUE))

# Load the Fitbit daily activity export (expects the CSV in the working directory)
raw <- readr::read_csv("dailyActivity_merged.csv",
  col_types = readr::cols(
    Id = readr::col_double(),
    ActivityDate = readr::col_character(),
    TotalSteps = readr::col_double(),
    TotalDistance = readr::col_double(),
    TrackerDistance = readr::col_double(),
    LoggedActivitiesDistance = readr::col_double(),
    VeryActiveMinutes = readr::col_double(),
    FairlyActiveMinutes = readr::col_double(),
    LightlyActiveMinutes = readr::col_double(),
    SedentaryMinutes = readr::col_double(),
    Calories = readr::col_double()
  )
)

# Cleaning: parse the m/d/Y date strings, derive calendar fields, drop unparsed rows
data <- raw |>
  mutate(
    date  = lubridate::mdy(ActivityDate),
    dow   = lubridate::wday(date, label = TRUE, abbr = TRUE),
    week  = lubridate::isoweek(date),
    month = lubridate::floor_date(date, "month")
  ) |>
  filter(!is.na(date))

# Daily averages across all users (the LOESS smoothing itself happens in the plot)
by_day <- data |>
  group_by(date) |>
  summarise(
    avg_steps = mean(TotalSteps, na.rm = TRUE),
    avg_cals  = mean(Calories,    na.rm = TRUE),
    avg_dist  = mean(TotalDistance,na.rm = TRUE),
    .groups = "drop"
  )

activity_long <- data |>
  transmute(
    date,
    Very       = VeryActiveMinutes,
    Fairly     = FairlyActiveMinutes,
    Lightly    = LightlyActiveMinutes,
    Sedentary  = SedentaryMinutes
  ) |>
  pivot_longer(
    cols = c(Very, Fairly, Lightly, Sedentary),
    names_to = "Intensity", values_to = "Minutes"
  )

activity_share <- activity_long |>
  group_by(date, Intensity) |>
  summarise(mean_min = mean(Minutes, na.rm = TRUE), .groups = "drop") |>
  group_by(date) |>
  mutate(share = mean_min / sum(mean_min)) |>
  ungroup()

by_user <- data |>
  group_by(Id) |>
  summarise(
    days       = n(),
    mean_steps = mean(TotalSteps, na.rm = TRUE),
    mean_cals  = mean(Calories,   na.rm = TRUE),
    .groups = "drop"
  ) |>
  filter(days >= 7)
```
### Introduction
This storyboard investigates how Fitbit activity data reflects regular movement patterns.

It visualises daily steps, activity intensity, and energy expenditure across users to show how consistency and movement levels relate to general wellbeing.

The goals are to find patterns in activity and to consider how small daily decisions, consistency, and behaviour affect health outcomes, which is useful to analysts and everyday fitness users alike.


### Trends in Daily Movement
```{r}
p_steps <- ggplot(by_day, aes(x = date, y = avg_steps)) +
  geom_line(color = "#0072B2", linewidth = 0.9) +
  geom_smooth(method = "loess", se = FALSE, color = "#D55E00") +
  scale_y_continuous(labels = scales::label_comma()) +
  labs(
    x = NULL, y = "Average Steps",
    title = "Average Steps Per Day Over Time",
    caption = "Source: Fitabase Fitbit Exports (dailyActivity_merged.csv)"
  )

plotly::ggplotly(p_steps, width = NULL, height = NULL) |>
  plotly::layout(
    autosize = TRUE,
    margin = list(l = 40, r = 20, t = 40, b = 40)
  )
```

### Explanation:
Daily movement varies noticeably, with peaks in the middle of the week and sharp dips on weekends or during rest periods.

This suggests that consistency is shaped more by structured routines (such as workdays and gym days) than by motivation alone.
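
A weekday-versus-weekend comparison like the one described above can be checked with a few lines of base R. This is a minimal sketch on made-up step totals, not the dashboard's own code: the `toy` data frame and its column names are hypothetical, and the dashboard itself derives its `dow` column with lubridate.

```r
# Toy data: two weeks of made-up daily step totals (not from the Fitbit export).
# 2016-04-11 is a Monday, so each week runs Monday to Sunday.
toy <- data.frame(
  date  = seq(as.Date("2016-04-11"), by = "day", length.out = 14),
  steps = c(8200, 9100, 9800, 9400, 8600, 6100, 5400,
            8400, 9300, 9900, 9200, 8800, 6300, 5600)
)

# Weekday number via POSIXlt: 0 = Sunday ... 6 = Saturday (locale-independent).
toy$wday_num   <- as.POSIXlt(toy$date)$wday
toy$is_weekend <- toy$wday_num %in% c(0, 6)

# Mean steps for weekdays (FALSE) and weekends (TRUE), then readable names.
weekend_means <- tapply(toy$steps, toy$is_weekend, mean)
names(weekend_means) <- c("weekday", "weekend")
print(round(weekend_means))
```

On the real data the same grouping could be applied to the per-day averages, and a gap between the two means would be the numeric counterpart of the dips visible in the chart.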

A slight upward trend over several weeks points to a change in behaviour, perhaps due to habit formation or seasonal fluctuations in energy levels.

The smoothed orange line (a LOESS fit) shows steady improvement despite daily variation, a reminder that progress depends on perseverance rather than perfection.
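
To see what the smoothed line does to a noisy series, the sketch below fits `stats::loess()` to synthetic data with a built-in upward trend. The data, the variable names, and the wiggle are all assumptions for illustration; as far as I understand it, `geom_smooth(method = "loess")` relies on the same function with a default span of 0.75.

```r
# Made-up daily averages: an upward trend plus a regular wiggle (illustration only).
day   <- 1:30
steps <- 7000 + 50 * day + 400 * sin(2.1 * day)

# Local regression with loess() defaults (span = 0.75, local quadratic fit).
fit      <- loess(steps ~ day)
smoothed <- predict(fit)  # fitted values at the observed days

# Day-to-day jumps shrink after smoothing, while the overall rise remains.
raw_jump    <- sd(diff(steps))
smooth_jump <- sd(diff(smoothed))
print(c(raw = raw_jump, smoothed = smooth_jump))
```

A smaller `span` would follow the daily wiggles more closely, while a larger one would flatten them further, so the choice of span shapes how "steady" the improvement looks.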

For audiences such as students, office workers, and active adults, the chart shows how small lifestyle changes can have a cumulative, measurable effect over time.

Personally, I can relate: activity often seems erratic from week to week, but measured over time, the upward curve indicates that genuine effort is paying off.
 

### Activity Balance
```{r}
p_activity <- ggplot(activity_share, aes(x = date, y = share, fill = Intensity)) +
  geom_area(alpha = 0.9, colour = "white") +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
  scale_fill_brewer(palette = "Pastel1") +
  labs(
    x = NULL, y = "Share of Daily Minutes",
    title = "How Much Time Is Spent on Different Activity Levels Each Day",
    caption = "Source: Fitabase Fitbit Exports (dailyActivity_merged.csv)"
  )

plotly::ggplotly(p_activity, width = NULL, height = NULL) |>
  plotly::layout(
    autosize = TRUE,
    margin = list(l = 40, r = 20, t = 40, b = 40)
  )
```



### Explanation:
Sedentary time dominates most days, accounting for roughly 70–80% of tracked minutes, which is consistent with global data on physical inactivity.

The thin slice of Very Active minutes shows that vigorous exercise takes up only a very small portion of the day.
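
The shares behind a stacked chart like this come from a simple ratio: average the minutes per intensity level, then divide by the total. The sketch below works through that arithmetic on invented minutes for three hypothetical users on one day; none of the numbers come from the dataset.

```r
# Made-up minutes for three hypothetical users on a single day (illustration only).
minutes <- data.frame(
  Very      = c(40, 10, 25),
  Fairly    = c(20, 15, 10),
  Lightly   = c(220, 180, 260),
  Sedentary = c(960, 1100, 820)
)

# Average minutes per intensity level, then each level's share of the total.
mean_min <- colMeans(minutes)
share    <- mean_min / sum(mean_min)

# Shares as percentages, rounded to one decimal place.
print(round(100 * share, 1))
```

Because the shares are forced to sum to one, a large Sedentary share necessarily squeezes the other bands, which is why the Very Active band looks so thin.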

Lightly Active and Fairly Active periods, such as walking, housework, or running errands, fill the gap and show that everyday movement still makes a significant contribution.

The pattern suggests that structured exercise alone is not enough to offset inactivity; reducing sedentary time is the real challenge.

For data storytellers, the graphic shows how layered data provides context: small regions (such as the thin "Very Active" band) carry significant behavioural meaning.

For me, it is a reminder that regular "light activity," such as pacing between meetings or walking to class, matters far more than it first appears.

### Movement vs. Energy
```{r}
p_energy <- ggplot(by_user, aes(x = mean_steps, y = mean_cals)) +
  geom_point(alpha = 0.4, color = "#56B4E9") +
  geom_smooth(method = "loess", se = FALSE, color = "#E69F00") +
  labs(
    x = "Mean Daily Steps per User", y = "Mean Daily Calories Burned",
    title = "Relationship Between Steps and Calories Burned",
    caption = "Source: Fitabase Fitbit Exports (dailyActivity_merged.csv)"
  )

plotly::ggplotly(p_energy, width = NULL, height = NULL) |>
  plotly::layout(
    autosize = TRUE,
    margin = list(l = 40, r = 20, t = 40, b = 40)
  )
```

### Explanation:
Movement and calorie burn are positively correlated, although the relationship is not exactly linear.

Variability rises at higher step counts, which may reflect differences in activity intensity, stride length, and metabolism.

Some people burn more energy while taking fewer steps, likely because they move more quickly or exert more effort.
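
One way to put a number on "positive but not exactly linear" is to compare a Pearson correlation with a Spearman rank correlation. The sketch below does this on six invented per-user means; the `toy_users` data frame is hypothetical and only loosely mimics the shape of the scatter plot.

```r
# Hypothetical per-user means (illustration only, not taken from the dataset).
toy_users <- data.frame(
  mean_steps = c(3000, 5000, 7000, 9000, 11000, 13000),
  mean_cals  = c(1800, 2100, 2000, 2600, 2400, 3100)
)

# Pearson measures linear association; Spearman only asks whether the
# ordering of users by steps matches their ordering by calories.
pearson  <- cor(toy_users$mean_steps, toy_users$mean_cals, method = "pearson")
spearman <- cor(toy_users$mean_steps, toy_users$mean_cals, method = "spearman")
print(round(c(pearson = pearson, spearman = spearman), 3))

# Rows where a user burns more than the next user up the step ranking.
burns_more_than_next <- which(diff(toy_users$mean_cals) < 0)
print(toy_users[burns_more_than_next, ])
```

Both coefficients stay well below 1 here because two users out-burn someone who takes more steps, the same kind of exception the scatter plot shows.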

This is a reminder that, even in aggregated data, energy use is highly individual.

For wearable users, the visualisation makes clear why comparing users can be misleading: personal context always matters.

It also reaffirms that, regardless of starting point, more movement still goes with higher energy expenditure.

On reflection, this matches lived experience: depending on fatigue or intensity, a day with fewer steps can still feel harder.

### User Consistency
```{r}
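# Select the 20 users with the most days logged
top_ids <- by_user |>
  arrange(desc(days)) |>
  slice_head(n = 20) |>
  pull(Id)

# Keep only those users' daily rows (each with at least 10 logged days)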
df_consist <- data |>
  dplyr::filter(Id %in% top_ids) |>
  dplyr::group_by(Id) |>
  dplyr::filter(dplyr::n() >= 10) |>
  dplyr::ungroup()

p_consistency <- ggplot(
  df_consist,
  aes(
    x = forcats::fct_reorder(as.factor(Id), TotalSteps, .fun = median),
    y = TotalSteps
  )
) +
  geom_boxplot(fill = "#56B4E9", colour = "black", outlier.alpha = 0.25) +
  coord_flip() +
  labs(
    title = "Step Consistency Among Users",
    x = "User ID (Top 20 by Days Logged)",
    y = "Daily Steps",
    caption = "Source: Fitabase Fitbit Exports (dailyActivity_merged.csv)"
  )

plotly::ggplotly(p_consistency, width = NULL, height = NULL) |>
  plotly::layout(autosize = TRUE, margin = list(l = 120, r = 20, t = 40, b = 40))
```

### Explanation:
Each box represents one user and shows both their median daily steps and how much their days vary.

Wide boxes suggest erratic effort or shifting motivation, while narrow boxes suggest consistent daily behaviour.

Consistency varies considerably, even among highly engaged users, which may be due to outside factors such as mood, weather, or work schedules.

For health researchers, this highlights the need to report both averages and variability in behaviour-based insights.
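
The case for reporting both numbers can be shown with two invented users who share the same mean but behave very differently. This is a sketch under stated assumptions: the users "A" and "B", their step counts, and the choice of IQR and coefficient of variation as spread measures are mine, not part of the dashboard.

```r
# Two hypothetical users with identical mean steps (illustration only).
toy <- data.frame(
  Id    = rep(c("A", "B"), each = 6),
  steps = c(8000, 8200, 7900, 8100, 8000, 7800,    # A: steady
            3000, 14000, 5000, 12000, 2000, 12000) # B: erratic
)

# Per-user mean, interquartile range, and coefficient of variation (sd / mean).
per_user <- t(sapply(split(toy$steps, toy$Id), function(s) {
  c(mean_steps = mean(s), iqr_steps = IQR(s), cv = sd(s) / mean(s))
}))
print(round(per_user, 3))
```

The means alone would rank these two users as equal, while the IQR (the height of each box in a boxplot) separates the steady user from the erratic one.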

For the audience, the takeaway is that building consistency may matter more than striving for peak performance.

For me, this illustrates that stability, not intense bursts followed by burnout, is the key to long-term progress.

From a design perspective, this plot transforms anonymous metrics into behavioural portraits by capturing individuality in collective data.

### Reflection
Objective: Showcase how data visualisation uncovers practical trends in behaviour and health.

Audience: Designed for health analysts, data-savvy readers, and wearable technology users looking for evidence-based insights to help them better themselves.

Key insight: Even physically active people spend a large portion of their days sitting down, so long-term wellness is defined by habits rather than intense exercise.

Conclusion: Frequent moderate exercise, such as taking a few steps each day or moving lightly, has a greater lasting effect and is more sustainable than sporadic intense sessions.

Introspection: Converting actual Fitbit data into images strengthens the link between practice and outcomes. When data is connected to real-world experiences, it transcends analysis and becomes profoundly human.

### References:
Arash, N. (2016). FitBit Fitness Tracker Data [Data set]. Kaggle. https://www.kaggle.com/datasets/arashnic/fitbit

Wickham, H., François, R., Henry, L., & Müller, K. (2023). Dplyr: A grammar of data manipulation [Computer software]. R Studio PBC. https://dplyr.tidyverse.org

Chang, W., Borges Ribeiro, B., & Allaire, J. J. (2023). Flexdashboard: R Markdown format for flexible dashboards [R package]. R Studio PBC. https://pkgs.rstudio.com/flexdashboard/

World Health Organization. (2010). Global recommendations on physical activity for health. World Health Organization. https://www.who.int/publications/i/item/9789241599979

Centers for Disease Control and Prevention. (2022). Physical activity basics: Facts about physical activity. U.S. Department of Health and Human Services. https://www.cdc.gov/physicalactivity/basics

Kreuter, M. W., & McClure, S. M. (2004). The role of culture in health communication. Annual Review of Public Health, 25(1), 439–455. https://doi.org/10.1146/annurev.publhealth.25.101802.123000

R Core Team. (2025). R: A language and environment for statistical computing [Computer software]. R Foundation for Statistical Computing. https://www.R-project.org